[ https://issues.apache.org/jira/browse/MESOS-3544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Artem Harutyunyan updated MESOS-3544:
-------------------------------------
    Assignee: Anand Mazumdar

> Support task and/or executor restart on failure.
> ------------------------------------------------
>
>                 Key: MESOS-3544
>                 URL: https://issues.apache.org/jira/browse/MESOS-3544
>             Project: Mesos
>          Issue Type: Epic
>          Components: HTTP API, master, slave
>            Reporter: Benjamin Hindman
>            Assignee: Anand Mazumdar
>              Labels: mesosphere
>
> In certain instances it might be preferable to restart a task/executor in
> place after it fails (i.e., exits with a non-zero exit code) rather than going
> through an entire status update -> offer -> accept (launch) cycle just to
> relaunch it on the same machine. This is especially true if the resources are
> reserved (dynamically or statically).
> Of course, we still want to highlight the restart to the framework, so 
> introducing something like TASK_RESTARTED might be necessary (not sure what 
> the analog would be for executors).
> Finally, if the task/executor has a bug we don't want to restart it in an
> infinite loop, so we'll likely want to introduce this functionality in such a
> way as to limit the total restart attempts (or require a framework to have the
> proper authority to restart forever).
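
The bounded-restart idea described above could be sketched roughly as follows.
This is only an illustration in Python, not Mesos code: run_with_restarts, its
max_restarts parameter, and the notify callback are hypothetical names, and
TASK_RESTARTED is the state proposed in this issue rather than an existing one.
Passing max_restarts=None stands in for a framework that has been granted the
authority to restart forever.

```python
import subprocess
import sys
from typing import Callable, List, Optional


def run_with_restarts(
    command: List[str],
    max_restarts: Optional[int],
    notify: Callable[[str, int], None],
) -> int:
    """Run `command`, restarting it in place whenever it exits non-zero.

    Hypothetical sketch: `max_restarts=None` models a framework that is
    authorized to restart forever. Returns the final exit code.
    """
    restarts = 0
    while True:
        exit_code = subprocess.run(command).returncode
        if exit_code == 0:
            # Clean exit: nothing to restart.
            notify("TASK_FINISHED", restarts)
            return exit_code
        if max_restarts is not None and restarts >= max_restarts:
            # Restart budget exhausted: give up and report the failure.
            notify("TASK_FAILED", restarts)
            return exit_code
        # Restart on the same machine, but still surface it to the framework.
        restarts += 1
        notify("TASK_RESTARTED", restarts)


if __name__ == "__main__":
    # A command that always fails with exit code 3, limited to two restarts.
    failing = [sys.executable, "-c", "import sys; sys.exit(3)"]
    run_with_restarts(failing, 2, lambda state, n: print(state, n))
```

With a limit of two restarts the failing command runs three times in total
before the failure is reported, so a buggy task cannot loop indefinitely unless
the limit is explicitly lifted.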



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)