On 10/20/2014 05:20 PM, Alex Rukletsov wrote:
> It looks like you are trying to set both command and executor. This is not
> allowed, since setting a command implies using the CommandExecutor aka
> mesos-executor. If your task is a command, do not specify the executor in
> your TaskInfo: mesos will do it for you. See
> https://github.com/apache/mesos/blob/master/include/mesos/mesos.proto line
> 579.
>
> Btw, you should observe something like "Task <id> should have either
> CommandInfo or ExecutorInfo set but not both" in your logs.

ok, thanks, I could get it to work (at least I see my job).
There is a lack of documentation on API per language. :-(

Thanks for your help

Olivier

>
> On Mon, Oct 20, 2014 at 5:13 PM, Olivier Sallou <[email protected]>
> wrote:
>
>> On 10/20/2014 08:11 AM, Olivier Sallou wrote:
>>> On 10/18/2014 12:55 PM, Alex Rukletsov wrote:
>>>> Hi Oliver,
>>>>
>>>> you can get a TASK_LOST if import directives in your executor fail. Do you
>>>> have mesos python eggs installed or available through PYTHONPATH? Could you
>>>> please also paste the output of stderr and stdout of the lost task (you can
>>>> access them via mesos webUI → sandbox)?
>>> I do not see the task at all on webUI. Python eggs are available from
>>> PYTHONPATH. My eggs are in MESOS_BUILD_DIR.
>>> If I execute directly my executor, I have no "python" error, only a
>>> MISSING SLAVE ID (but this is correct as mesos adds this env at runtime).
>>>
>>> I see that task is lost because, in my scheduler, in the statusUpdate
>>> method, I print the task status (value = 5). Message is empty.
>>>
>>> nothing in webUI, nothing in console logs.... as my executor is not
>>> executed, it means that mesos (master or slave) give me this error
>>> status, but I have no additional info about the reason.
>>>
>>> I have used and adapted the examples given with sources
>>> (src/examples/python).
>> Taking as example the python code in src/examples/python, I could
>> progress a little.
>>
>> Though there is no additional error log, I found an issue with setting
>> the "command" parameter.
>>
>> If I comment the "command" parameter, my executor is executed (it fails
>> but that's fine for the moment).
>>
>> In my task, I was setting: task.command.value = "something to execute on
>> node"
>>
>> Setting command creates a silent error.
>>
>> My TaskInfo was like:
>> .....
>> executor {
>>   executor_id {
>>     value: "default"
>>   }
>>   command {
>>     value: "....../test-executor"
>>   }
>>   name: "Test Executor (Python)"
>>   source: "python_test"
>> }
>> command {
>>   value: "ls -l"
>> }
>>
>> So I wonder:
>>
>> 1) why the error is silent on master side
>>
>> 2) how do I set the command to execute in the TaskInfo object ?
>>> Olivier
>>>> On Fri, Oct 17, 2014 at 7:31 PM, Vinod Kone <[email protected]> wrote:
>>>>> Can you grep for TASK_LOST in master and slave logs and paste the output
>>>>> here?
>>>>>
>>>>> On Fri, Oct 17, 2014 at 8:24 AM, Olivier Sallou <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> Hi,
>>>>>> I have installed mesos on a single host master/slave config (for
>>>>>> devpt/test).
>>>>>>
>>>>>> Mesos works fine with frameworks I tested (aurora...).
>>>>>>
>>>>>> I try to create my own scheduler/executor in python, based on example
>>>>>> given with sources, but I cannot get my task executed.
>>>>>>
>>>>>> Executor is not executed (I have added debug logs in a file to check,
>>>>>> and no file is created), but I see no error in master logs (console) nor
>>>>>> slave logs.
>>>>>>
>>>>>> In master I can see:
>>>>>>
>>>>>> I1017 16:50:30.601210 25794 master.cpp:3559] Sending 1 offers to
>>>>>> framework 20141017-141022-16777343-5050-25774-0047
>>>>>> I1017 16:50:30.608912 25789 master.cpp:2169] Processing reply for
>>>>>> offers: [ 20141017-141022-16777343-5050-25774-97 ] on slave
>>>>>> 20141017-141022-16777343-5050-25774-0 at slave(1)@127.0.0.1:5051
>>>>>> (localhost) for framework 20141017-141022-16777343-5050-25774-0047
>>>>>> I1017 16:50:30.609207 25789 hierarchical_allocator_process.hpp:563]
>>>>>> Recovered cpus(*):8; mem(*):6900; disk(*):215925; ports(*):[31000-32000]
>>>>>> (total allocatable: cpus(*):8; mem(*):6900; disk(*):215925;
>>>>>> ports(*):[31000-32000]) on slave 20141017-141022-16777343-5050-25774-0
>>>>>> from framework 20141017-141022-16777343-5050-25774-0047
>>>>>>
>>>>>> My reply to the offer is received, but in my scheduler I receive an
>>>>>> update status of TASK_LOST.
>>>>>>
>>>>>> I do not see how to debug this, I see no information why my task is lost
>>>>>> (there is enough cpu/mem, I ask 2 cpu, and 2024 mem), and it seems that
>>>>>> it is rejected at master level.
>>>>>>
>>>>>> Any hint on how to analyse this?
>>>>>>
>>>>>> Thanks
>>>>>>
>>>>>> --
>>>>>> gpg key id: 4096R/326D8438 (keyring.debian.org)
>>>>>> Key fingerprint = 5FB4 6F83 D3B9 5204 6335 D26D 78DC 68DB 326D 8438
>>>>>>
>>>>>>
>>>>>>
>> --
>> Olivier Sallou
>> IRISA / University of Rennes 1
>> Campus de Beaulieu, 35000 RENNES - FRANCE
>> Tel: 02.99.84.71.95
>>
>> gpg key id: 4096R/326D8438 (keyring.debian.org)
>> Key fingerprint = 5FB4 6F83 D3B9 5204 6335 D26D 78DC 68DB 326D 8438
>>
>>

--
Olivier Sallou
IRISA / University of Rennes 1
Campus de Beaulieu, 35000 RENNES - FRANCE
Tel: 02.99.84.71.95

gpg key id: 4096R/326D8438 (keyring.debian.org)
Key fingerprint = 5FB4 6F83 D3B9 5204 6335 D26D 78DC 68DB 326D 8438
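
For anyone finding this thread later: the rule Alex describes (a TaskInfo carries either a command or an executor, not both) can be checked on the framework side before launching tasks. The snippet below is only a rough sketch under stated assumptions. It uses plain dicts as stand-ins for TaskInfo, and the function name validate_task and the task id "1" are hypothetical, not part of the Mesos API. With the real protobuf bindings, the same test could probably be written with task.HasField("command") and task.HasField("executor").

```python
def validate_task(task):
    """Return an error string unless exactly one of 'command' / 'executor' is set.

    `task` is a plain dict standing in for a TaskInfo message. This is an
    illustrative client-side check (an assumption), not Mesos's own validation.
    """
    has_command = bool(task.get("command"))
    has_executor = bool(task.get("executor"))
    if has_command == has_executor:
        # Same wording as the log message quoted in the thread.
        return ("Task %s should have either CommandInfo or ExecutorInfo "
                "set but not both" % task.get("task_id", "<unknown>"))
    return None


# The TaskInfo from the thread, reduced to the two relevant fields.
broken = {
    "task_id": "1",  # hypothetical id
    "executor": {"executor_id": "default", "name": "Test Executor (Python)"},
    "command": {"value": "ls -l"},
}

# Option 1: a plain command task, with no executor specified.
command_only = {"task_id": "1", "command": {"value": "ls -l"}}

# Option 2: a custom executor, with no task-level command.
executor_only = {"task_id": "1", "executor": broken["executor"]}

for label, task in [("broken", broken),
                    ("command_only", command_only),
                    ("executor_only", executor_only)]:
    print(label, "->", validate_task(task) or "ok")
```

Running a check like this before replying to an offer would surface the problem in the scheduler itself, instead of as a TASK_LOST update with an empty message.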
