Vinod Kone created MESOS-644:
--------------------------------
Summary: Slave doesn't correctly handle checkpointed terminal
update whose ack doesn't reach the executor
Key: MESOS-644
URL: https://issues.apache.org/jira/browse/MESOS-644
Project: Mesos
Issue Type: Task
Reporter: Vinod Kone
Assignee: Vinod Kone
Fix For: 0.14.0
This is the scenario.
The slave dies after checkpointing a terminal update but before the ACK reaches
the executor.
The recovered slave/status update manager retries the update and cleans it up
once it gets an ACK from the scheduler.
When the executor re-registers after this point, it still has a pending update,
but the slave cannot find the executor for this update because the task has
already completed. Currently the slave forwards this update to the status
update manager (SUM) anyway but never ACKs the executor.
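
The sequence above can be sketched with a toy model. This is not Mesos code:
ToySlave, its fields, and both handler methods are hypothetical names made up
for illustration. The second handler shows one possible remedy, which is an
assumption and not necessarily the fix that will be committed.

```python
from dataclasses import dataclass, field


@dataclass
class ToySlave:
    """Toy stand-in for the slave; every name here is hypothetical."""
    active_tasks: set = field(default_factory=set)
    completed_tasks: set = field(default_factory=set)
    forwarded: list = field(default_factory=list)      # updates handed to the SUM
    executor_acks: list = field(default_factory=list)  # ACKs sent to the executor

    def scheduler_acked_terminal(self, task_id):
        # After recovery the retried terminal update is ACKed by the
        # scheduler, so the task is cleaned up and marked completed.
        self.active_tasks.discard(task_id)
        self.completed_tasks.add(task_id)

    def status_update_current(self, task_id):
        # Behaviour described in this report: the update is forwarded anyway,
        # but the executor is ACKed only if it is found via an active task.
        self.forwarded.append(task_id)
        if task_id in self.active_tasks:
            self.executor_acks.append(task_id)

    def status_update_possible_fix(self, task_id):
        # Assumed remedy (illustrative only): also ACK the executor when the
        # update belongs to a task that is already known to be completed.
        self.forwarded.append(task_id)
        if task_id in self.active_tasks or task_id in self.completed_tasks:
            self.executor_acks.append(task_id)


slave = ToySlave(active_tasks={"task-1"})
slave.scheduler_acked_terminal("task-1")  # cleanup after the scheduler's ACK
slave.status_update_current("task-1")     # re-registered executor's pending update
assert slave.forwarded == ["task-1"]
assert slave.executor_acks == []          # the executor is never ACKed
```

In this model the pending update reaches the SUM, yet the executor's ACK list
stays empty, which matches the symptom described above.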