Andrei Budnik created MESOS-9887: ------------------------------------ Summary: Race condition between two terminal task status updates for Docker executor. Key: MESOS-9887 URL: https://issues.apache.org/jira/browse/MESOS-9887 Project: Mesos Issue Type: Bug Components: agent, containerization Reporter: Andrei Budnik Attachments: race_example.txt
h2. Overview Expected behavior: Task successfully finishes and sends TASK_FINISHED status update. Observed behavior: Task successfully finishes, but the agent sends TASK_FAILED with the reason "REASON_EXECUTOR_TERMINATED". In normal circumstances, Docker executor [sends|https://github.com/apache/mesos/blob/0026ea46dc35cbba1f442b8e425c6cbaf81ee8f8/src/docker/executor.cpp#L758] final status update TASK_FINISHED to the agent, which then [gets processed|https://github.com/apache/mesos/blob/0026ea46dc35cbba1f442b8e425c6cbaf81ee8f8/src/slave/slave.cpp#L5543] by the agent before termination of the executor's process. However, if the processing of the initial TASK_FINISHED gets delayed, then there is a chance that Docker executor terminates and the agent [triggers|https://github.com/apache/mesos/blob/0026ea46dc35cbba1f442b8e425c6cbaf81ee8f8/src/slave/slave.cpp#L6662] TASK_FAILED which will [be handled|https://github.com/apache/mesos/blob/0026ea46dc35cbba1f442b8e425c6cbaf81ee8f8/src/slave/slave.cpp#L5816-L5826] prior to the TASK_FINISHED status update. See attached logs which contain an example of the race condition. h2. Reproducing bug 1. Add the following code: {code:java} static int c = 0; if (++c == 3) { // to skip TASK_STARTING and TASK_RUNNING status updates. ::sleep(2); } {code} to the [`ComposingContainerizerProcess::status`|https://github.com/apache/mesos/blob/0026ea46dc35cbba1f442b8e425c6cbaf81ee8f8/src/slave/containerizer/composing.cpp#L578] and to the [`DockerContainerizerProcess::status`|https://github.com/apache/mesos/blob/0026ea46dc35cbba1f442b8e425c6cbaf81ee8f8/src/slave/containerizer/docker.cpp#L2167]. 2. Recompile mesos 3. Launch mesos master and agent locally 4. Launch a simple Docker task via `mesos-execute`: {code} # cd build ./src/mesos-execute --master="`hostname`:5050" --name="a" --containerizer=docker --docker_image=alpine --resources="cpus:1;mem:32" --command="ls" {code} h2. Race condition - description 1. Mesos agent receives TASK_FINISHED status update and then subscribes on [`containerizer->status()`|https://github.com/apache/mesos/blob/0026ea46dc35cbba1f442b8e425c6cbaf81ee8f8/src/slave/slave.cpp#L5754-L5761]. 2. `containerizer->status()` operation for TASK_FINISHED status update gets delayed in the composing containerizer (e.g. due to switch of the worker thread that executes `status` method). 3. Docker executor terminates and the agent [triggers|https://github.com/apache/mesos/blob/0026ea46dc35cbba1f442b8e425c6cbaf81ee8f8/src/slave/slave.cpp#L6662] TASK_FAILED. 4. Docker containerizer destroys the container. A registered callback for the `containerizer->wait` call in the composing containerizer dispatches [lambda function|https://github.com/apache/mesos/blob/0026ea46dc35cbba1f442b8e425c6cbaf81ee8f8/src/slave/containerizer/composing.cpp#L368-L373] that will clean up `containers_` map. 5. Composing c'zer resumes and dispatches `[status()|https://github.com/apache/mesos/blob/0026ea46dc35cbba1f442b8e425c6cbaf81ee8f8/src/slave/containerizer/composing.cpp#L579]` method to the Docker containerizer for TASK_FINISHED, which in turn hangs for a few seconds. 6. Corresponding `containerId` gets removed from the `containers_` map of the composing c'zer. 7. Mesos agent subscribes on [`containerizer->status()`|https://github.com/apache/mesos/blob/0026ea46dc35cbba1f442b8e425c6cbaf81ee8f8/src/slave/slave.cpp#L5754-L5761] for the TASK_FAILED status update. 8. Composing c'zer returns ["Container not found"|https://github.com/apache/mesos/blob/0026ea46dc35cbba1f442b8e425c6cbaf81ee8f8/src/slave/containerizer/composing.cpp#L576] for TASK_FAILED. 9. `[Slave::_statusUpdate|https://github.com/apache/mesos/blob/0026ea46dc35cbba1f442b8e425c6cbaf81ee8f8/src/slave/slave.cpp#L5826]` stores TASK_FAILED terminal status update in the executor's data structure. 10. Docker executor resumes and finishes processing of `status()` method for TASK_FINISHED. Finally, it returns control to the `Slave::_statusUpdate` continuation. This method [discovers|https://github.com/apache/mesos/blob/0026ea46dc35cbba1f442b8e425c6cbaf81ee8f8/src/slave/slave.cpp#L5808-L5814] that the executor has already been destroyed. -- This message was sent by Atlassian JIRA (v7.6.3#76005)