[
https://issues.apache.org/jira/browse/MESOS-3479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14940131#comment-14940131
]
haosdent commented on MESOS-3479:
---------------------------------
I have implemented a draft patch here; it is not tested yet.
https://reviews.apache.org/r/38932/
Let me clarify my idea for the health check here. The health check should stop
in these cases:
* The command value is invalid.
* The check fails consecutive_failures times. After killTask, the health check
should also exit.
In all other error cases, the health check should continue, and it should clean
up the current check command process before launching a new one.
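To illustrate the cleanup step, here is a minimal Python sketch of a single
check attempt. It is not the Mesos implementation or the patch under review;
run_check and its parameters are hypothetical names, and it assumes a POSIX
system. The point it shows is that on timeout the check command itself (and
anything it spawned) is killed and the attempt is reported as unhealthy.

```python
import os
import signal
import subprocess


def run_check(command, timeout_seconds):
    """Run one check attempt; return True if healthy, False otherwise.

    Hypothetical helper for illustration only. POSIX only.
    """
    # start_new_session=True puts the shell and its children into their own
    # process group, so the whole tree can be signalled at once.
    proc = subprocess.Popen(command, shell=True, start_new_session=True)
    try:
        # Healthy only if the command exits with status 0 in time.
        return proc.wait(timeout=timeout_seconds) == 0
    except subprocess.TimeoutExpired:
        # Timeout exceeded: kill the command's process group, not just the
        # checker, and treat the attempt as unhealthy.
        try:
            os.killpg(proc.pid, signal.SIGKILL)
        except ProcessLookupError:
            pass  # the group already exited on its own
        proc.wait()  # reap the child so no zombie is left behind
        return False
```

With this shape, a caller can start the next attempt knowing that no process
from the previous attempt is still running.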
We should also keep this behaviour:
* Before delay_seconds + grace_period_seconds has elapsed, health check failures
are ignored, except that the health check still exits when the command value is
invalid.
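The rules above can be sketched as a small decision function. This is only my
reading of the proposal written out in Python, not code from the patch:
Action and next_action are hypothetical names, consecutive_failures is assumed
to be positive, and resetting the failure count after a healthy attempt is an
assumption implied by the word "consecutive".

```python
from enum import Enum


class Action(Enum):
    CONTINUE = "continue"            # schedule the next check attempt
    KILL_AND_STOP = "kill_and_stop"  # kill the task, then stop checking
    STOP = "stop"                    # stop checking (invalid command)


def next_action(command_valid, healthy, elapsed_seconds, failures,
                delay_seconds, grace_period_seconds, consecutive_failures):
    """Return (action, updated failure count) after one check attempt."""
    # An invalid command value always stops the health check, even
    # before the grace period has elapsed.
    if not command_valid:
        return Action.STOP, failures
    # Assumption: a healthy attempt resets the consecutive-failure count.
    if healthy:
        return Action.CONTINUE, 0
    # Failures before delay_seconds + grace_period_seconds are ignored.
    if elapsed_seconds < delay_seconds + grace_period_seconds:
        return Action.CONTINUE, failures
    failures += 1
    # Reaching consecutive_failures kills the task and ends the check.
    if failures >= consecutive_failures:
        return Action.KILL_AND_STOP, failures
    # Any other failure (including a timeout): keep checking.
    return Action.CONTINUE, failures
```

For example, with delay_seconds=5, grace_period_seconds=10 and
consecutive_failures=3, a failure at 10 seconds is ignored, while a third
counted failure at 20 seconds yields KILL_AND_STOP.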
> COMMAND Health Checks are not executed if the timeout is exceeded
> -----------------------------------------------------------------
>
> Key: MESOS-3479
> URL: https://issues.apache.org/jira/browse/MESOS-3479
> Project: Mesos
> Issue Type: Bug
> Affects Versions: 0.23.0
> Reporter: Matthias Veit
> Assignee: haosdent
> Priority: Critical
>
> The issue first appeared as a Marathon bug; see here for reference:
> https://github.com/mesosphere/marathon/issues/2179.
> A COMMAND health check is defined with a timeout of 20 seconds.
> The command itself takes longer than 20 seconds to execute.
> Current behavior:
> - The Mesos health check process gets killed, but the defined command
> process is not (in the example, the curl command returns after 21 seconds).
> - The check attempt is considered healthy if the timeout is exceeded
> - The health check stops and is not executed any longer
> Expected behavior:
> - The defined health check command is killed when the timeout is exceeded
> - The check attempt is considered unhealthy if the timeout is exceeded
> - The health check does not stop
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)