Greg Mann created MESOS-9944:
--------------------------------

             Summary: Command health check timeout begins to early
                 Key: MESOS-9944
                 URL: https://issues.apache.org/jira/browse/MESOS-9944
             Project: Mesos
          Issue Type: Bug
          Components: agent
    Affects Versions: 1.9.0
            Reporter: Greg Mann


The checker process begins the timer for the command health check timeout when 
the LAUNCH_NESTED_CONTAINER_SESSION request is first sent, which means any 
delay in the execution of the health check command is included in the health 
check timeout. This can be an issue when the agent is under heavy load, and it 
may take a few seconds for the health check command to be run.

Once we have a streaming response for the ATTACH_CONTAINER_OUTPUT call which 
follows the nested container launch, we can initiate the health check timeout 
once the first byte of the response is received; this is a more accurate signal 
that the health check command has begun running.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

Reply via email to