Greg Mann created MESOS-8573:
--------------------------------

             Summary: Container stuck in PULLING when Docker daemon hangs
                 Key: MESOS-8573
                 URL: https://issues.apache.org/jira/browse/MESOS-8573
             Project: Mesos
          Issue Type: Improvement
    Affects Versions: 1.5.0
            Reporter: Greg Mann

When the {{force}} argument is not set to {{true}}, {{Docker::pull}} always performs a {{docker inspect}} call before it does a {{docker pull}}. If either of these two Docker CLI calls hangs indefinitely, the Docker container will be stuck in the PULLING state. This means that we make no further progress in the {{launch()}} call path, so the executor binary is never executed, the {{Future}} associated with the {{launch()}} call is never failed or satisfied, and {{wait()}} is never called on the container.

Thus, when the executor registration timeout elapses, the agent's call to {{containerizer->destroy()}} gets stuck waiting on the container status, and its continuation is never invoked. This leaves the task destined for that Docker executor stuck in TASK_STAGING from the framework's perspective, and attempts to kill the task will fail.
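
The failure mode above comes down to waiting on a result that never arrives. The following is a minimal sketch in standard C++ that contrasts a wait with no bound against a bounded one. It is not Mesos or libprocess code: {{WaitResult}} and {{waitWithTimeout}} are hypothetical names introduced only for illustration, and bounding the wait is an assumption about one possible mitigation, not something this issue prescribes.

```cpp
#include <cassert>
#include <chrono>
#include <future>

// Hypothetical outcome type for this sketch; not a Mesos type.
enum class WaitResult { Ready, TimedOut };

// Waits for `pending` for at most `timeout` instead of blocking forever.
// A caller could treat TimedOut as a failed step (for example, fail the
// launch) rather than leaving the container in an intermediate state.
// Anything other than "ready" is reported as TimedOut here.
template <typename T>
WaitResult waitWithTimeout(
    std::future<T>& pending,
    std::chrono::milliseconds timeout)
{
  // wait_for() has undefined behavior on a future with no shared state.
  assert(pending.valid());

  return pending.wait_for(timeout) == std::future_status::ready
    ? WaitResult::Ready
    : WaitResult::TimedOut;
}
```

A hung CLI call can be simulated with a {{std::promise}} that is never satisfied: calling {{get()}} on its future would block indefinitely, which mirrors the stuck {{launch()}} path, whereas {{waitWithTimeout}} returns {{WaitResult::TimedOut}} once the bound elapses.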