Hi Olivier,

> Can we have "non terminal" errors, from mesos point of view, where task
> should not be considered as over?

Not really. What you're seeing certainly looks like a bug: terminal updates
should be terminal. It'll probably be hard to debug without more data ;)

As a wild guess, since you seem to be using custom task IDs, maybe you
tried to start a task twice: the TASK_ERROR was generated on the master
in response to the duplicate task ID or some other validation issue, and
the TASK_FINISHED was generated on the slave when the first task finished?
That said, I'm not sure off the top of my head whether there are checks in
Mesos that would catch this.
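
To collect more data the next time this happens, a rough sketch like the
one below might help on the framework side. It keeps the source, reason and
message of every status update and flags any update that arrives after a
terminal one. StatusUpdate and TaskTracker are hypothetical names that only
mirror the fields of a Mesos TaskStatus; this is not the godocker code or a
Mesos API, and the source/reason values are left empty because they were
not logged.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# States that mesos.proto marks as terminal (not an exhaustive list).
TERMINAL_STATES = frozenset(
    {"TASK_FINISHED", "TASK_FAILED", "TASK_KILLED", "TASK_ERROR", "TASK_LOST"}
)


@dataclass
class StatusUpdate:
    """Hypothetical stand-in for the interesting fields of a TaskStatus."""
    task_id: str
    state: str
    source: str = ""   # e.g. SOURCE_MASTER or SOURCE_SLAVE, if available
    reason: str = ""
    message: str = ""


@dataclass
class TaskTracker:
    """Keeps the full update history per task ID."""
    history: Dict[str, List[StatusUpdate]] = field(default_factory=dict)

    def record(self, update: StatusUpdate) -> Optional[str]:
        """Store the update; return a warning if it follows a terminal one."""
        seen = self.history.setdefault(update.task_id, [])
        earlier = next((u for u in seen if u.state in TERMINAL_STATES), None)
        seen.append(update)
        if earlier is None:
            return None
        return (
            f"task {update.task_id}: got {update.state} "
            f"(source={update.source!r}, reason={update.reason!r}) "
            f"after terminal {earlier.state} "
            f"(source={earlier.source!r}, reason={earlier.reason!r})"
        )


# Replay the three updates from the framework log quoted below.
tracker = TaskTracker()
updates = [
    StatusUpdate("17820-0", "TASK_RUNNING"),
    StatusUpdate("17820-0", "TASK_ERROR"),
    StatusUpdate("17820-0", "TASK_FINISHED"),
]
alerts = [a for a in map(tracker.record, updates) if a is not None]
for alert in alerts:
    print(alert)
```

If the two updates turn out to come from different sources (master versus
slave), that would support the duplicate task ID guess above.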

Best regards,

On Tue, Sep 19, 2017 at 7:47 AM, Olivier Sallou <olivier.sal...@irisa.fr>
wrote:

> Hi
> I found a strange behaviour on a cluster that I do not understand. I do
> not have access to the mesos logs (not my cluster), but has anyone faced
> this before?
> My framework uses the Docker containerizer. We had a task that sent
> TASK_ERROR to the framework (why not), but in reality the Docker container
> ran correctly on the mesos slave, and then we received a TASK_FINISHED.
> So mesos detected an error with the task, but it still detected the end of
> the task and sent the finished event at the end.
>
> How can mesos detect an error but keep watching the task and detect its
> end?
>
> Here are my framework logs:
> 2017-09-17 01:06:35,447 DEBUG [godocker-scheduler][Thread-1] Task 17820-0
> is in state TASK_RUNNING
> 2017-09-17 01:06:46,286 DEBUG [godocker-scheduler][Thread-1] Task 17820-0
> is in state TASK_ERROR
> 2017-09-17 02:13:44,537 DEBUG [godocker-scheduler][Thread-1] Task 17820-0
> is in state TASK_FINISHED
>
> Unfortunately I did not log the "reason" of the ERROR, so I do not know
> what occurred, and cannot at this stage manually reproduce the use case.
>
> Can we have "non terminal" errors, from mesos point of view, where task
> should not be considered as over?
>
> Thanks
>
> Olivier
>



-- 
Benno Evers
Software Engineer, Mesosphere
