Re: Question about LOST status on custom executor

Benjamin Mahler Mon, 07 Apr 2014 13:55:38 -0700

Why is your executor "failing"? When you say failing, is your executor
crashing or simply exiting after doing the required work?

Yes, you will need to manage the task status lifecycle yourself. If your
executor exits while it still holds tasks in a non-terminal state, the
slave will report those tasks as LOST, since it cannot tell whether they
ran to completion. At the very least, your executor needs to send a
TASK_FINISHED or TASK_FAILED status update for each task before it exits.

It's also good practice to send TASK_RUNNING once a task has actually
started, to keep your scheduler well informed.
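
To make the lifecycle concrete, here is a rough Python sketch. It does not
use the Mesos bindings: RecordingDriver, run_task and
state_after_executor_exit are hypothetical names made up for illustration,
and the real executor driver takes a TaskStatus message rather than a bare
state string. Only the state names come from Mesos.

```python
# Hypothetical model of the lifecycle. The state names mirror Mesos
# TaskState values, but nothing here calls the real Mesos API.
TERMINAL = {"TASK_FINISHED", "TASK_FAILED", "TASK_KILLED", "TASK_LOST"}


class RecordingDriver:
    """Stand-in for an executor driver: it only records status updates."""

    def __init__(self):
        self.updates = []

    def send_status_update(self, task_id, state):
        self.updates.append((task_id, state))


def run_task(driver, task_id, work):
    """Report RUNNING, do the work, then always report a terminal state."""
    driver.send_status_update(task_id, "TASK_RUNNING")
    try:
        work()
    except Exception:
        driver.send_status_update(task_id, "TASK_FAILED")
    else:
        driver.send_status_update(task_id, "TASK_FINISHED")


def state_after_executor_exit(updates, task_id):
    """Model of what the scheduler sees for task_id after the executor exits."""
    states = [state for tid, state in updates if tid == task_id]
    last = states[-1] if states else None
    # A task with no terminal update is reported as LOST.
    return last if last in TERMINAL else "TASK_LOST"
```

In this model, a task whose last update is TASK_RUNNING (or that never
sent an update at all) comes back as TASK_LOST, which matches what you
are seeing if your executor exits without reporting a terminal state.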

Hope this helps,
Ben

On Mon, Apr 7, 2014 at 11:35 AM, David Greenberg <[email protected]> wrote:

> I'm working on porting my executor from the CommandExecutor to a custom
> executor, in order to take advantage of other features of Mesos. I started
> by changing the TaskInfo in the scheduler to define ExecutorInfo instead of
> CommandInfo, where the ExecutorInfo's command is the same as the original
> CommandInfo. I gave the executor a random ID.
>
> I can see that the executor successfully starts and seems to connect to
> Mesos. After a few moments (10s - 100s of ms), the executor fails with the
> LOST status.
>
> Am I responsible for explicitly managing the TaskState lifecycle of the
> executor? That is, do I need to immediately send the TASK_STARTING status
> update, and then send the TASK_RUNNING update once the task has begun? Are
> there any heartbeats that I'm responsible for?
>
> Thanks,
> David
>
