Vinod Kone created MESOS-1033:
---------------------------------

             Summary: Create a stat for executors timing out on registration
                 Key: MESOS-1033
                 URL: https://issues.apache.org/jira/browse/MESOS-1033
             Project: Mesos
          Issue Type: Improvement
          Components: stats
            Reporter: Vinod Kone
             Fix For: 0.19.0


At Twitter we have seen cases where a slave host went into a bad state 
(possibly a kernel bug), which blocked the isolator/containerizer and 
prevented executors from being launched.

It would be nice to have a stat that exposes the number of executors 
killed due to registration timeout, so that we can alert on this behavior.
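
As a rough illustration of the kind of stat being requested, here is a minimal 
sketch in Python. It is not Mesos code: the class name, the function name, and 
the stat key "executors_registration_timed_out" are all hypothetical, and the 
timeout values are made up for the example. The idea is simply a monotonically 
increasing counter that is bumped whenever an executor is killed for failing 
to register in time, and that can be exported alongside other stats.

```python
import threading


class ExecutorRegistrationStats:
    """Hypothetical stats holder; names are illustrative, not Mesos identifiers."""

    def __init__(self):
        self._lock = threading.Lock()
        self._timeouts = 0

    def record_timeout(self):
        # Increment under a lock so concurrent callers do not lose updates.
        with self._lock:
            self._timeouts += 1

    def snapshot(self):
        # Return the counter under an assumed (made-up) stat key.
        with self._lock:
            return {"executors_registration_timed_out": self._timeouts}


def check_registration(stats, registered, elapsed_secs, timeout_secs):
    """Return True if the executor should be killed for missing the deadline.

    The counter is bumped only for executors that never registered and
    have exceeded the (assumed) registration timeout.
    """
    if not registered and elapsed_secs >= timeout_secs:
        stats.record_timeout()
        return True
    return False


stats = ExecutorRegistrationStats()
# One executor never registered within the timeout; one registered in time.
check_registration(stats, registered=False, elapsed_secs=75, timeout_secs=60)
check_registration(stats, registered=True, elapsed_secs=75, timeout_secs=60)
print(stats.snapshot())
```

An alert could then fire when this counter grows on a single host, which is 
the symptom described above.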



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
