[ https://issues.apache.org/jira/browse/MESOS-3544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Artem Harutyunyan updated MESOS-3544:
-------------------------------------
Assignee: Anand Mazumdar
> Support task and/or executor restart on failure.
> ------------------------------------------------
>
> Key: MESOS-3544
> URL: https://issues.apache.org/jira/browse/MESOS-3544
> Project: Mesos
> Issue Type: Epic
> Components: HTTP API, master, slave
> Reporter: Benjamin Hindman
> Assignee: Anand Mazumdar
> Labels: mesosphere
>
> In certain instances it might be preferable to restart a task/executor after
> it fails (i.e., non-zero exit code) rather than going through an entire
> status update -> offer -> accept (launch) cycle to restart the task/executor
> on the same machine. This is especially true if the resources are reserved
> (dynamically or statically).
>
> Of course, we still want to highlight the restart to the framework, so
> introducing something like TASK_RESTARTED might be necessary (not sure what
> the analog would be for executors).
>
> Finally, if the task/executor has a bug we don't want to sit in an infinite
> loop, so we'll likely want to introduce this functionality in such a way as
> to limit the total restart attempts (or force a framework to have the proper
> authority to restart forever).
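
The bounded-restart idea in the description can be sketched as follows. This is a minimal illustration only and not Mesos code: it runs a local command with Python's subprocess module, and the names run_with_restarts, max_restarts, and notify are hypothetical. TASK_RESTARTED is the state proposed in the issue, not an existing one; passing max_restarts=None stands in for a framework that has been granted authority to restart forever.

```python
import subprocess
import sys
from typing import Callable, List, Optional

# TASK_FINISHED and TASK_FAILED mirror existing Mesos task states.
# TASK_RESTARTED is only proposed in the issue; it is an assumption here.
TASK_FINISHED = "TASK_FINISHED"
TASK_FAILED = "TASK_FAILED"
TASK_RESTARTED = "TASK_RESTARTED"


def run_with_restarts(
    command: List[str],
    max_restarts: Optional[int],
    notify: Callable[[str, int], None],
) -> int:
    """Run `command`, restarting it in place after a non-zero exit.

    max_restarts caps the number of restarts; None means no cap.
    notify(state, restarts) is a hypothetical stand-in for the status
    update that would be sent to the framework.
    Returns the exit code of the last run.
    """
    restarts = 0
    while True:
        exit_code = subprocess.call(command)
        if exit_code == 0:
            notify(TASK_FINISHED, restarts)
            return exit_code
        if max_restarts is not None and restarts >= max_restarts:
            # Restart budget exhausted: report a terminal failure.
            notify(TASK_FAILED, restarts)
            return exit_code
        restarts += 1
        # Surface the restart instead of hiding it from the framework.
        notify(TASK_RESTARTED, restarts)


if __name__ == "__main__":
    # A command that always fails, to exercise the restart limit.
    failing = [sys.executable, "-c", "import sys; sys.exit(3)"]
    run_with_restarts(failing, 2, lambda state, n: print(state, n))
```

The loop restarts on the same machine without a new offer cycle, and the cap keeps a buggy task from looping indefinitely, which are the two concerns raised in the description.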