Vinod Kone created MESOS-646:
--------------------------------
Summary: Slave recovery doesn't properly handle checkpointed
queued tasks
Key: MESOS-646
URL: https://issues.apache.org/jira/browse/MESOS-646
Project: Mesos
Issue Type: Bug
Reporter: Vinod Kone
Assignee: Vinod Kone
Fix For: 0.14.0
If the slave dies after checkpointing a queued task but before the task is
launched on an executor, the recovered slave doesn't have enough information to
relaunch it (because we checkpoint only Task, not TaskInfo).
When the executor re-registers, the slave should simply remove these tasks from
its map.
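
A minimal sketch of that cleanup, in Python rather than the slave's C++ and
not taken from the Mesos source. RecoveredExecutor, its fields, and
reconcile_on_reregister are hypothetical names, and the sketch assumes the
re-registering executor reports the ids of the tasks it actually received:

```python
from dataclasses import dataclass, field


@dataclass
class RecoveredExecutor:
    # Hypothetical stand-in for the slave's per-executor state after recovery.
    # Both maps go from task id to the checkpointed Task summary.
    queued_tasks: dict = field(default_factory=dict)
    launched_tasks: dict = field(default_factory=dict)


def reconcile_on_reregister(executor, tasks_known_to_executor):
    """Drop recovered queued tasks that the executor never received.

    Returns the ids of the dropped tasks so the caller can report them.
    """
    known = set(tasks_known_to_executor)
    dropped = [tid for tid in executor.queued_tasks if tid not in known]
    for tid in dropped:
        # Only Task was checkpointed, so there is nothing to relaunch from.
        del executor.queued_tasks[tid]
    return dropped
```

What the slave reports for the dropped tasks is left to the caller; the
sketch only covers removing them from the map.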
Alternatively, the slave could checkpoint TaskInfo instead of Task. We don't do
this because TaskInfo.data could potentially be huge.
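
The size trade-off can be illustrated with a rough sketch. The two classes
below are simplified, hypothetical stand-ins for the real protobuf messages
(only a few fields are shown), and pickle is used here purely to get a byte
count, not because the slave checkpoints that way:

```python
import pickle
from dataclasses import dataclass


@dataclass
class TaskInfo:
    # Simplified stand-in: what the framework submits.
    task_id: str
    name: str
    data: bytes  # opaque framework payload; may be arbitrarily large


@dataclass
class Task:
    # Simplified stand-in: the summary that is checkpointed today.
    task_id: str
    name: str
    state: str


def to_task(info, state="TASK_STAGING"):
    """Build the small summary record; the payload in info.data is dropped."""
    return Task(task_id=info.task_id, name=info.name, state=state)


def checkpoint_size(record):
    """Serialized size in bytes, as a rough proxy for checkpoint cost."""
    return len(pickle.dumps(record))
```

With a 1 MB payload in data, checkpoint_size(info) is over a megabyte while
checkpoint_size(to_task(info)) stays far smaller, but the summary alone is
not enough to relaunch the task.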