It looks like you are trying to set both command and executor. This is not allowed, since setting a command implies using the CommandExecutor, aka mesos-executor. If your task is a command, do not specify the executor in your TaskInfo: Mesos will do it for you. See https://github.com/apache/mesos/blob/master/include/mesos/mesos.proto line 579.
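To make the rule concrete, here is a minimal sketch in plain Python. It does not use the Mesos bindings: `check_task` and the dict layout are hypothetical stand-ins that only mirror the `command` and `executor` fields of the TaskInfo in this thread, and the second error message is my own wording, not something Mesos prints.

```python
from typing import Optional


def check_task(task: dict) -> Optional[str]:
    """Return an error string if the task sets both (or neither) of
    'command' and 'executor'; return None if exactly one is set.

    Hypothetical helper: it imitates the either/or rule described above,
    it is not part of any Mesos API.
    """
    has_command = "command" in task
    has_executor = "executor" in task
    if has_command and has_executor:
        return ("Task %s should have either CommandInfo or ExecutorInfo "
                "set but not both" % task["task_id"])
    if not has_command and not has_executor:
        # Wording invented for this sketch.
        return "Task %s has neither CommandInfo nor ExecutorInfo set" % task["task_id"]
    return None


# Command task: only 'command' is set, no executor is specified.
command_task = {"task_id": "t1", "command": {"value": "ls -l"}}

# Custom-executor task: only 'executor' is set; the command to launch the
# executor lives inside it (path left elided as in the original message).
executor_task = {
    "task_id": "t2",
    "executor": {
        "executor_id": "default",
        "command": {"value": "....../test-executor"},
    },
}

# The combination from the thread: both fields set on the same task.
broken_task = dict(executor_task, task_id="t3", command={"value": "ls -l"})

for t in (command_task, executor_task, broken_task):
    print(t["task_id"], check_task(t) or "ok")
```

With the real bindings this would presumably mean either setting `task.command.value` and leaving the executor unset, or filling in the executor and leaving the task-level command unset.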
Btw, you should observe something like "Task <id> should have either CommandInfo or ExecutorInfo set but not both" in your logs.

On Mon, Oct 20, 2014 at 5:13 PM, Olivier Sallou <[email protected]> wrote:

>
> On 10/20/2014 08:11 AM, Olivier Sallou wrote:
> > On 10/18/2014 12:55 PM, Alex Rukletsov wrote:
> >> Hi Oliver,
> >>
> >> you can get a TASK_LOST if import directives in your executor fail. Do you
> >> have mesos python eggs installed or available through PYTHONPATH? Could you
> >> please also paste the output of stderr and stdout of the lost task (you can
> >> access them via mesos webUI → sandbox)?
> > I do not see the task at all on webUI. Python eggs are available from
> > PYTHONPATH. My eggs are in MESOS_BUILD_DIR.
> > If I execute directly my executor, I have no "python" error, only a
> > MISSING SLAVE ID (but this is correct as mesos adds this env at runtime).
> >
> > I see that task is lost because, in my scheduler, in the statusUpdate
> > method, I print the task status (value = 5). Message is empty.
> >
> > nothing in webUI, nothing in console logs.... as my executor is not
> > executed, it means that mesos (master or slave) give me this error
> > status, but I have no additional info about the reason.
> >
> > I have used and adapted the examples given with sources
> > (src/examples/python).
> Taking as example the python code in src/examples/python, I could
> progress a little.
>
> Though there is no additional error log, I found an issue with setting
> the "command" parameter.
>
> If I comment the "command" parameter, my executor is executed (it fails
> but that's fine for the moment).
>
> In my task, I was setting: task.command.value = "something to execute on
> node"
>
> Setting command creates a silent error.
>
> My TaskInfo was like:
> .....
> executor {
>   executor_id {
>     value: "default"
>   }
>   command {
>     value: "....../test-executor"
>   }
>   name: "Test Executor (Python)"
>   source: "python_test"
> }
> command {
>   value: "ls -l"
> }
>
> So I wonder:
>
> 1) why the error is silent on master side
>
> 2) how do I set the command to execute in the TaskInfo object ?
>
> >
> > Olivier
> >> On Fri, Oct 17, 2014 at 7:31 PM, Vinod Kone <[email protected]> wrote:
> >>
> >>> Can you grep for TASK_LOST in master and slave logs and paste the output
> >>> here?
> >>>
> >>> On Fri, Oct 17, 2014 at 8:24 AM, Olivier Sallou <[email protected]> wrote:
> >>>
> >>>> Hi,
> >>>> I have installed mesos on a single host master/slave config (for
> >>>> devpt/test).
> >>>>
> >>>> Mesos works fine with frameworks I tested (aurora...).
> >>>>
> >>>> I try to create my own scheduler/executor in python, based on example
> >>>> given with sources, but I cannot get my task executed.
> >>>>
> >>>> Executor is not executed (I have added debug logs in a file to check,
> >>>> and no file is created), but I see no error in master logs (console) nor
> >>>> slave logs.
> >>>>
> >>>> In master I can see:
> >>>>
> >>>> I1017 16:50:30.601210 25794 master.cpp:3559] Sending 1 offers to
> >>>> framework 20141017-141022-16777343-5050-25774-0047
> >>>> I1017 16:50:30.608912 25789 master.cpp:2169] Processing reply for
> >>>> offers: [ 20141017-141022-16777343-5050-25774-97 ] on slave
> >>>> 20141017-141022-16777343-5050-25774-0 at slave(1)@127.0.0.1:5051
> >>>> (localhost) for framework 20141017-141022-16777343-5050-25774-0047
> >>>> I1017 16:50:30.609207 25789 hierarchical_allocator_process.hpp:563]
> >>>> Recovered cpus(*):8; mem(*):6900; disk(*):215925; ports(*):[31000-32000]
> >>>> (total allocatable: cpus(*):8; mem(*):6900; disk(*):215925;
> >>>> ports(*):[31000-32000]) on slave 20141017-141022-16777343-5050-25774-0
> >>>> from framework 20141017-141022-16777343-5050-25774-0047
> >>>>
> >>>> My reply to the offer is received, but in my scheduler I receive an
> >>>> update status of TASK_LOST.
> >>>>
> >>>> I do not see how to debug this, I see no information why my task is lost
> >>>> (there is enough cpu/mem, I ask 2 cpu, and 2024 mem), and it seems that
> >>>> it is rejected at master level.
> >>>>
> >>>> Any hint on how to analyse this?
> >>>>
> >>>> Thanks
> >>>>
> >>>> --
> >>>> gpg key id: 4096R/326D8438 (keyring.debian.org)
> >>>> Key fingerprint = 5FB4 6F83 D3B9 5204 6335 D26D 78DC 68DB 326D 8438
> >>>>
> >>>>
> >>>>
>
> --
> Olivier Sallou
> IRISA / University of Rennes 1
> Campus de Beaulieu, 35000 RENNES - FRANCE
> Tel: 02.99.84.71.95
>
> gpg key id: 4096R/326D8438 (keyring.debian.org)
> Key fingerprint = 5FB4 6F83 D3B9 5204 6335 D26D 78DC 68DB 326D 8438
>
>
