[jira] [Commented] (MESOS-9887) Race condition between two terminal task status updates for Docker executor.

Andrei Budnik (Jira) Mon, 26 Aug 2019 05:25:22 -0700


    [ 
https://issues.apache.org/jira/browse/MESOS-9887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16915755#comment-16915755
 ]


Andrei Budnik commented on MESOS-9887:
--------------------------------------

{code:java}
commit 8aae23ec7cd4bc50532df0b1d1ea6ec23ce078f8
Author: Andrei Budnik [email protected]
Date:   Fri Aug 23 14:36:18 2019 +0200
Added missing `return` statement in `Slave::statusUpdate`.

Previously, if `statusUpdate` was called for a pending task, it would
forward the status update and then continue executing `statusUpdate`,
which then checks if there is an executor that is aware of this task.
Given that a pending task is not known to any executor, it would always
handle it by forwarding status update one more time. This patch adds
missing `return` statement, which fixes the issue.

Review: https://reviews.apache.org/r/71361
{code}
{code:java}
commit f0be23765531b05661ed7f1b124faf96744aa80b
Author: Andrei Budnik [email protected]
Date:   Tue Aug 20 19:24:44 2019 +0200
Fixed out-of-order processing of terminal status updates in agent.

Previously, Mesos agent could send TASK_FAILED status update on
executor termination while processing of TASK_FINISHED status update
was in progress. Processing of task status updates involves sending
requests to the containerizer, which might finish processing of these
requests out-of-order, e.g. `MesosContainerizer::status`. Also,
the agent does not overwrite status of the terminal status update once
it's stored in the `terminatedTasks`. Hence, there was a race condition
between two terminal status updates.

Note that V1 Executors are not affected by this problem because they
wait for an acknowledgement of the terminal status update by the agent
before terminating.

This patch introduces a new data structure `pendingStatusUpdates`,
which holds a list of status updates that are being processed. This
data structure allows validating the order of processing of status
updates by the agent.

Review: https://reviews.apache.org/r/71343
{code}

> Race condition between two terminal task status updates for Docker executor.
> ----------------------------------------------------------------------------
>
>                 Key: MESOS-9887
>                 URL: https://issues.apache.org/jira/browse/MESOS-9887
>             Project: Mesos
>          Issue Type: Bug
>          Components: agent, containerization
>            Reporter: Andrei Budnik
>            Assignee: Andrei Budnik
>            Priority: Blocker
>              Labels: agent, containerization
>         Attachments: race_example.txt
>
>
> h2. Overview
> Expected behavior:
>  Task successfully finishes and sends TASK_FINISHED status update.
> Observed behavior:
>  Task successfully finishes, but the agent sends TASK_FAILED with the reason 
> "REASON_EXECUTOR_TERMINATED".
> In normal circumstances, Docker executor 
> [sends|https://github.com/apache/mesos/blob/0026ea46dc35cbba1f442b8e425c6cbaf81ee8f8/src/docker/executor.cpp#L758]
>  final status update TASK_FINISHED to the agent, which then [gets 
> processed|https://github.com/apache/mesos/blob/0026ea46dc35cbba1f442b8e425c6cbaf81ee8f8/src/slave/slave.cpp#L5543]
>  by the agent before termination of the executor's process.
> However, if the processing of the initial TASK_FINISHED gets delayed, then 
> there is a chance that Docker executor terminates and the agent 
> [triggers|https://github.com/apache/mesos/blob/0026ea46dc35cbba1f442b8e425c6cbaf81ee8f8/src/slave/slave.cpp#L6662]
>  TASK_FAILED which will [be 
> handled|https://github.com/apache/mesos/blob/0026ea46dc35cbba1f442b8e425c6cbaf81ee8f8/src/slave/slave.cpp#L5816-L5826]
>  prior to the TASK_FINISHED status update.
> See attached logs which contain an example of the race condition.
> h2. Reproducing bug
> 1. Add the following code:
> {code:java}
>   static int c = 0;
>   if (++c == 3) { // to skip TASK_STARTING and TASK_RUNNING status updates.
>         ::sleep(2);
>   }
> {code}
> to the 
> [`ComposingContainerizerProcess::status`|https://github.com/apache/mesos/blob/0026ea46dc35cbba1f442b8e425c6cbaf81ee8f8/src/slave/containerizer/composing.cpp#L578]
>  and to the 
> [`DockerContainerizerProcess::status`|https://github.com/apache/mesos/blob/0026ea46dc35cbba1f442b8e425c6cbaf81ee8f8/src/slave/containerizer/docker.cpp#L2167].
> 2. Recompile mesos
> 3. Launch mesos master and agent locally
> 4. Launch a simple Docker task via `mesos-execute`:
> {code}
> #  cd build
> ./src/mesos-execute --master="`hostname`:5050" --name="a" 
> --containerizer=docker --docker_image=alpine --resources="cpus:1;mem:32" 
> --command="ls"
> {code}
> h2. Race condition - description
> 1. Mesos agent receives TASK_FINISHED status update and then subscribes on 
> [`containerizer->status()`|https://github.com/apache/mesos/blob/0026ea46dc35cbba1f442b8e425c6cbaf81ee8f8/src/slave/slave.cpp#L5754-L5761].
> 2. `containerizer->status()` operation for TASK_FINISHED status update gets 
> delayed in the composing containerizer (e.g. due to switch of the worker 
> thread that executes `status` method).
> 3. Docker executor terminates and the agent 
> [triggers|https://github.com/apache/mesos/blob/0026ea46dc35cbba1f442b8e425c6cbaf81ee8f8/src/slave/slave.cpp#L6662]
>  TASK_FAILED.
> 4. Docker containerizer destroys the container. A registered callback for the 
> `containerizer->wait` call in the composing containerizer dispatches [lambda 
> function|https://github.com/apache/mesos/blob/0026ea46dc35cbba1f442b8e425c6cbaf81ee8f8/src/slave/containerizer/composing.cpp#L368-L373]
>  that will clean up `containers_` map.
> 5. Composing c'zer resumes and dispatches 
> `[status()|https://github.com/apache/mesos/blob/0026ea46dc35cbba1f442b8e425c6cbaf81ee8f8/src/slave/containerizer/composing.cpp#L579]`
>  method to the Docker containerizer for TASK_FINISHED, which in turn hangs 
> for a few seconds.
> 6. Corresponding `containerId` gets removed from the `containers_` map of the 
> composing c'zer.
> 7. Mesos agent subscribes on 
> [`containerizer->status()`|https://github.com/apache/mesos/blob/0026ea46dc35cbba1f442b8e425c6cbaf81ee8f8/src/slave/slave.cpp#L5754-L5761]
>  for the TASK_FAILED status update.
> 8. Composing c'zer returns ["Container not 
> found"|https://github.com/apache/mesos/blob/0026ea46dc35cbba1f442b8e425c6cbaf81ee8f8/src/slave/containerizer/composing.cpp#L576]
>  for TASK_FAILED.
> 9. 
> `[Slave::_statusUpdate|https://github.com/apache/mesos/blob/0026ea46dc35cbba1f442b8e425c6cbaf81ee8f8/src/slave/slave.cpp#L5826]`
>  stores TASK_FAILED terminal status update in the executor's data structure.
> 10. Docker containerizer resumes and finishes processing of `status()` method 
> for TASK_FINISHED. Finally, it returns control to the `Slave::_statusUpdate` 
> continuation. This method 
> [discovers|https://github.com/apache/mesos/blob/0026ea46dc35cbba1f442b8e425c6cbaf81ee8f8/src/slave/slave.cpp#L5808-L5814]
>  that the executor has already been destroyed.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

[jira] [Commented] (MESOS-9887) Race condition between two terminal task status updates for Docker executor.

Reply via email to