[jira] [Commented] (MESOS-106) Failover timeout should default to 0

Benjamin Hindman (Commented) (JIRA) Thu, 15 Dec 2011 13:36:05 -0800

    [ https://issues.apache.org/jira/browse/MESOS-106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13170508#comment-13170508 ]


Benjamin Hindman commented on MESOS-106:
----------------------------------------

This sounds fine to me, but it will require the failover timeout to be 
made configurable first.
                
> Failover timeout should default to 0
> ------------------------------------
>
>                 Key: MESOS-106
>                 URL: https://issues.apache.org/jira/browse/MESOS-106
>             Project: Mesos
>          Issue Type: Improvement
>            Reporter: Matei Zaharia
>
> Since the failover timeout was added, you get a lot of weird behavior in 
> clusters running frameworks that don't support failover due to its long 
> default value of 1 day. If a framework fails or just exits without calling 
> driver.stop(), all its executors stay around and consume resources on the 
> machines, causing subsequent runs to mysteriously fail to acquire resources. 
> See http://groups.google.com/group/spark-users/msg/553af12424e4ed3d for an 
> example. I know that the failover timeout is supposed to eventually become a 
> per-framework parameter anyway, but in the meantime, the easiest way to 
> prevent this is to set it to 0, because almost no users have failover-enabled 
> frameworks.
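
The behavior described above can be illustrated with a toy model. This is a
minimal sketch, not Mesos code: the ToyMaster class, its fields, and its
launch/disconnect/reap methods are all hypothetical names invented for
illustration. It assumes only what the issue states: a framework that goes
away without calling driver.stop() keeps its resources until the failover
timeout expires, and the default timeout is 1 day.

```python
from dataclasses import dataclass, field

ONE_DAY_SECS = 24 * 60 * 60  # the "1 day" default mentioned in the issue


@dataclass
class ToyMaster:
    """Hypothetical model of a master; not the real Mesos implementation."""
    failover_timeout: float  # seconds to wait for a framework to come back
    total_cpus: int
    used: dict = field(default_factory=dict)             # framework id -> cpus held
    disconnected_at: dict = field(default_factory=dict)  # framework id -> time

    def free_cpus(self):
        return self.total_cpus - sum(self.used.values())

    def launch(self, fid, cpus):
        """Grant cpus to a framework if enough are free."""
        if cpus > self.free_cpus():
            return False
        self.used[fid] = self.used.get(fid, 0) + cpus
        return True

    def disconnect(self, fid, now):
        """Framework failed or exited without calling driver.stop()."""
        self.disconnected_at[fid] = now
        self.reap(now)

    def reap(self, now):
        """Release resources of frameworks whose failover timeout has expired."""
        for fid, since in list(self.disconnected_at.items()):
            if now - since >= self.failover_timeout:
                self.used.pop(fid, None)
                del self.disconnected_at[fid]


# With a 1-day timeout, a second run cannot acquire resources for a day.
slow = ToyMaster(failover_timeout=ONE_DAY_SECS, total_cpus=4)
slow.launch("run-1", 4)
slow.disconnect("run-1", now=0)
slow.reap(now=60)
blocked = not slow.launch("run-2", 4)

# With a timeout of 0, the resources are released immediately.
fast = ToyMaster(failover_timeout=0, total_cpus=4)
fast.launch("run-1", 4)
fast.disconnect("run-1", now=0)
unblocked = fast.launch("run-2", 4)
```

In this model the second run is blocked under the 1-day timeout and succeeds
under a timeout of 0, which is the effect the proposed default is meant to
achieve for frameworks that do not support failover.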

