Alexander Rukletsov created MESOS-6170:
------------------------------------------

             Summary: Health check grace period covers failures happening after first success.
                 Key: MESOS-6170
                 URL: https://issues.apache.org/jira/browse/MESOS-6170
             Project: Mesos
          Issue Type: Improvement
    Affects Versions: 1.0.0
            Reporter: Alexander Rukletsov
             Fix For: 1.1.0


Currently, the health check library [ignores *all* 
failures|https://github.com/apache/mesos/blob/b053572bc424478cafcd60d1bce078f5132c4590/src/health-check/health_checker.cpp#L192-L197]
 from the task’s start (technically from the health check library 
initialization) [until after the grace period 
ends|https://github.com/apache/mesos/blob/b053572bc424478cafcd60d1bce078f5132c4590/include/mesos/v1/mesos.proto#L403].
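
As a rough illustration of the rule described above, the check can be modelled as a single predicate. This is a minimal Python sketch, not the code from health_checker.cpp; the function and parameter names are hypothetical, and it assumes that a grace period of zero means no grace period at all.

```python
def failure_is_ignored(elapsed_secs: float, grace_period_secs: float) -> bool:
    """Model of the behaviour described in this issue (hypothetical names).

    elapsed_secs: time since the health check library was initialized.
    grace_period_secs: configured grace period; 0 is assumed to disable it.
    """
    # Every failure inside the grace period is ignored, regardless of
    # whether the task has already reported healthy.
    return grace_period_secs > 0 and elapsed_secs <= grace_period_secs
```

Note that the predicate has no input for earlier successes, which is the gap this issue is about.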

This behaviour is misleading. Once the health check succeeds for the first 
time, the grace period rule for failures should no longer apply.

For example, suppose the grace period is set to 10 minutes, the task becomes 
healthy after 1 minute, and it then fails after 2 minutes. That failure should 
be treated as a normal failure with all the consequences.
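
The proposed rule can be sketched as a small state machine that remembers the first success. This is an illustrative Python model under stated assumptions, not a patch for the health check library: the class and method names are hypothetical, times are seconds since health check initialization, and "fails after 2 minutes" is read as 2 minutes after the start.

```python
class GracePeriodTracker:
    """Hypothetical model: the grace period ends early on first success."""

    def __init__(self, grace_period_secs: float) -> None:
        self.grace_period_secs = grace_period_secs
        self.seen_success = False

    def on_success(self) -> None:
        # After the first success the grace period no longer shields failures.
        self.seen_success = True

    def failure_counts(self, elapsed_secs: float) -> bool:
        """Return True if a failure at elapsed_secs is a normal failure."""
        if self.seen_success:
            return True
        # No success yet: failures count only once the grace period is over.
        return elapsed_secs > self.grace_period_secs


# The scenario from this issue: 10 minute grace period.
tracker = GracePeriodTracker(grace_period_secs=600)
early_failure = tracker.failure_counts(30)    # before any success: ignored
tracker.on_success()                          # task becomes healthy at 1 minute
later_failure = tracker.failure_counts(120)   # failure at 2 minutes: counted
```

In this model `early_failure` is False and `later_failure` is True, so the failure at 2 minutes is handled like any failure outside the grace period.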



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
