Greg Mann created MESOS-9944:
--------------------------------
Summary: Command health check timeout begins to early
Key: MESOS-9944
URL: https://issues.apache.org/jira/browse/MESOS-9944
Project: Mesos
Issue Type: Bug
Components: agent
Affects Versions: 1.9.0
Reporter: Greg Mann
The checker process begins the timer for the command health check timeout when
the LAUNCH_NESTED_CONTAINER_SESSION request is first sent, which means any
delay in the execution of the health check command is included in the health
check timeout. This can be an issue when the agent is under heavy load, and it
may take a few seconds for the health check command to be run.
Once we have a streaming response for the ATTACH_CONTAINER_OUTPUT call which
follows the nested container launch, we can initiate the health check timeout
once the first byte of the response is received; this is a more accurate signal
that the health check command has begun running.
--
This message was sent by Atlassian Jira
(v8.3.2#803003)