Benjamin Hindman created MESOS-3545:
---------------------------------------
Summary: Investigate restoring tasks/executors after machine reboot.
Key: MESOS-3545
URL: https://issues.apache.org/jira/browse/MESOS-3545
Project: Mesos
Issue Type: Improvement
Components: slave
Reporter: Benjamin Hindman

If a task/executor is restartable (see MESOS-3544), it might make sense to force
an agent to restart these tasks/executors after a machine reboot, _before_ it has
re-registered with the master. This covers the case where the machine is network
partitioned away from the master (or the master has failed) but we'd like to get
these services running again. Assuming the agent(s) running on the machine have
not been disconnected from the master for longer than the master's agent
re-registration timeout, the agent should be able to re-register (i.e., after a
network partition is resolved) without a problem. However, in the same way that
a framework would be interested in knowing that its tasks/executors were
restarted, we'd want to send something like a TASK_RESTARTED status update.
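
A rough sketch of the decision described above, written in Python for brevity
rather than as agent code. Every name here (Task, restartable,
recover_after_reboot, the timeout parameter) is hypothetical and not part of
Mesos, and TASK_RESTARTED is only the state proposed in this issue. What the
agent should do once the timeout has been exceeded is not specified here, so
the sketch simply restarts nothing in that case.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Task:
    task_id: str
    restartable: bool  # hypothetical flag standing in for MESOS-3544


@dataclass
class StatusUpdate:
    task_id: str
    state: str


def recover_after_reboot(
    tasks: List[Task],
    seconds_disconnected: float,
    reregistration_timeout: float,
) -> List[StatusUpdate]:
    """Pick the tasks to restart after a reboot, before re-registration.

    Returns the status updates the agent would queue for delivery once it
    manages to re-register with the master.
    """
    # Assumption: past the master's agent re-registration timeout the agent
    # cannot re-register cleanly, so this sketch restarts nothing.
    if seconds_disconnected > reregistration_timeout:
        return []

    updates = []
    for task in tasks:
        if task.restartable:
            # A real agent would relaunch the task/executor here.
            # TASK_RESTARTED is the state proposed in this issue.
            updates.append(StatusUpdate(task.task_id, "TASK_RESTARTED"))
    return updates
```

For example, with one restartable and one non-restartable task and a
disconnection shorter than the timeout, only the restartable task would
produce a queued update.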