Vinod Kone created MESOS-644:
--------------------------------

             Summary: Slave doesn't correctly handle checkpointed terminal update whose ack doesn't reach the executor
                 Key: MESOS-644
                 URL: https://issues.apache.org/jira/browse/MESOS-644
             Project: Mesos
          Issue Type: Task
            Reporter: Vinod Kone
            Assignee: Vinod Kone
             Fix For: 0.14.0


This is the scenario.

The slave dies after checkpointing a terminal update but before the ACK reaches 
the executor.

The recovered slave/status update manager retries the update and cleans it up 
once it gets an ACK from the scheduler.

When the executor re-registers after this point, it still has a pending update, 
but the slave cannot find the executor for this update because the task has 
already completed! Currently the slave forwards this update to the status 
update manager (SUM) anyway but never ACKs the executor.
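
The sequence above can be replayed with a small toy model. This is only a
sketch for reasoning about the ordering of events, not Mesos code: the classes
`Executor` and `Slave`, their fields, and the helper `run_scenario` are all
hypothetical names, and recovery is assumed to simply replay whatever was
checkpointed.

```python
from dataclasses import dataclass, field


@dataclass
class Executor:
    """Toy executor: remembers terminal updates it sent but has not seen acked."""
    pending: set = field(default_factory=set)


@dataclass
class Slave:
    """Toy slave with an embedded status update manager (SUM). Hypothetical model."""
    live_tasks: set = field(default_factory=set)    # tasks whose executor is known
    checkpointed: set = field(default_factory=set)  # terminal updates on disk
    sum_pending: set = field(default_factory=set)   # updates the SUM is retrying

    def handle_update(self, executor, task_id, die_before_ack=False):
        """Process a terminal update; return True if the executor was acked."""
        executor_known = task_id in self.live_tasks
        self.checkpointed.add(task_id)
        self.sum_pending.add(task_id)       # forwarded to the SUM regardless
        if die_before_ack or not executor_known:
            return False                    # the ack never reaches the executor
        executor.pending.discard(task_id)
        return True

    def recover(self):
        """Return a restarted slave rebuilt from checkpointed state (assumed)."""
        return Slave(live_tasks=set(self.live_tasks),
                     checkpointed=set(self.checkpointed),
                     sum_pending=set(self.checkpointed))

    def scheduler_ack(self, task_id):
        """Scheduler acked the retried update: clean up the stream and the task."""
        self.sum_pending.discard(task_id)
        self.checkpointed.discard(task_id)
        self.live_tasks.discard(task_id)    # the task is now completed


def run_scenario():
    executor = Executor(pending={"task-1"})
    slave = Slave(live_tasks={"task-1"})

    # 1. Slave checkpoints the terminal update, then dies before acking.
    slave.handle_update(executor, "task-1", die_before_ack=True)

    # 2. Recovered slave retries the update; the scheduler acks; state is cleaned up.
    slave = slave.recover()
    slave.scheduler_ack("task-1")

    # 3. Executor re-registers and resends its still-pending update.
    acked = [slave.handle_update(executor, t) for t in sorted(executor.pending)]
    return executor, slave, acked
```

In this model, after `run_scenario()` the executor still holds its pending
update and the update sits in the SUM again, which mirrors the behavior
described in this issue.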
