http://aurora.apache.org/documentation/latest/reference/task-lifecycle/


Unexpected Termination: LOST

If a Task stays in a transient task state for too long (such as ASSIGNED or 
STARTING), the scheduler forces it into LOST state, creating a new Task in its 
place that’s sent into PENDING state.
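
As a rough illustration of the rule quoted above, here is a minimal sketch in Python. It is not Aurora's scheduler code: the function name expected_state, the 5-minute timeout, and the set of transient states (only the two examples the documentation names) are assumptions made for illustration.

```python
from datetime import datetime, timedelta

# Hypothetical model of the documented rule, NOT Aurora's implementation.
# Only the two example states named in the quoted documentation are listed.
TRANSIENT_STATES = {"ASSIGNED", "STARTING"}


def expected_state(state, entered_at, now, timeout):
    """Return the state the quoted rule implies a task should be in.

    A task that has sat in a transient state for longer than `timeout`
    is expected to be forced into LOST; otherwise its state is unchanged.
    """
    if state in TRANSIENT_STATES and now - entered_at > timeout:
        return "LOST"
    return state


# Assumed values for the example; the real timeout is scheduler configuration.
timeout = timedelta(minutes=5)
t0 = datetime(2018, 1, 1, 12, 0, 0)

# A task stuck in STARTING for 10 minutes against a 5-minute timeout.
print(expected_state("STARTING", t0, t0 + timedelta(minutes=10), timeout))
# A task in a non-transient state is left alone.
print(expected_state("RUNNING", t0, t0 + timedelta(minutes=10), timeout))
```

Under this reading of the documentation, a task stuck in STARTING past the timeout would be expected to end up LOST.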

The behavior we observe while testing with our custom executor is as follows.
If the Mesos task is still staging, that is, the executor has not yet sent the
task starting Mesos status message, the transient timeout works and the task is
marked LOST in Aurora. However, if the executor has sent the starting status
message but then never sends a running or failed status message, the transient
timeout does not kick in and Aurora does not mark the task LOST. Across
multiple tests we waited a good 5+ minutes past the timeout without seeing any
change.
This is Aurora 0.19.
Thanks
