Re: Review Request 72029: Changed termination logic of the default executor.

2020-01-29 Thread Qian Zhang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72029/#review219432
---




src/launcher/default_executor.cpp
Lines 282-284 (patched)


Can you add a comment for this code?



src/launcher/default_executor.cpp
Lines 1095-1096 (original), 1099-1100 (patched)


I think this fail safe will be triggered if the agent doesn't send an ACK 
for any status updates (rather than just terminal update).


- Qian Zhang


On Jan. 30, 2020, 5:28 a.m., Andrei Budnik wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72029/
> ---
> 
> (Updated Jan. 30, 2020, 5:28 a.m.)
> 
> 
> Review request for mesos, Andrei Sekretenko, Greg Mann, Qian Zhang, and Vinod 
> Kone.
> 
> 
> Bugs: MESOS-8537
> https://issues.apache.org/jira/browse/MESOS-8537
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Previously, the default executor terminated itself after all containers
> had terminated. This could lead to termination of the executor before
> processing of a terminal status update by the agent. In order
> to mitigate this issue, the executor slept for one second to give a
> chance to send all status updates and receive all status update
> acknowledgements before terminating itself. This might have led to
> various race conditions in some circumstances (e.g., on a slow host).
> This patch terminates the default executor if all status updates have
> been acknowledged by the agent and no running containers left.
> Also, this patch increases the timeout from one second to one minute
> for fail-safety.
> 
> 
> Diffs
> -
> 
>   src/launcher/default_executor.cpp 4369fd0052b2e8496ba63606fa57e17d881ea52c 
> 
> 
> Diff: https://reviews.apache.org/r/72029/diff/3/
> 
> 
> Testing
> ---
> 
> internal CI
> 
> 
> Thanks,
> 
> Andrei Budnik
> 
>



Re: Review Request 72055: Changed termination logic of the Docker executor.

2020-01-29 Thread Qian Zhang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72055/#review219431
---




src/docker/executor.cpp
Lines 779-788 (original), 779-789 (patched)


This seems a bit redundant to me, I'd suggest to only have:
```
  void _stop()
  {
driver.get()->stop();
  }
```

And then call `delay(Seconds(60), self(), &Self::__stop);` in `_reaped()` 
and `launchTask()`.


- Qian Zhang


On Jan. 30, 2020, 12:23 a.m., Andrei Budnik wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72055/
> ---
> 
> (Updated Jan. 30, 2020, 12:23 a.m.)
> 
> 
> Review request for mesos, Andrei Sekretenko, Greg Mann, Qian Zhang, and Vinod 
> Kone.
> 
> 
> Bugs: MESOS-9847
> https://issues.apache.org/jira/browse/MESOS-9847
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Previously, the Docker executor terminated itself after a task's
> container had terminated. This could lead to termination of the
> executor before processing of a terminal status update by the agent.
> In order to mitigate this issue, the executor slept for one second to
> give a chance to send all status updates and receive all status update
> acknowledgments before terminating itself. This might have led to
> various race conditions in some circumstances (e.g., on a slow host).
> This patch terminates the Docker executor after receiving a terminal
> status update acknowledgment. Also, this patch increases the timeout
> from one second to one minute for fail-safety.
> 
> 
> Diffs
> -
> 
>   src/docker/executor.cpp 132f42bfa42c846fc5dc40f7763aa0b5d12a7798 
>   src/exec/exec.cpp 69e5e24b248c7c913421de5e42713c34fd79ad46 
> 
> 
> Diff: https://reviews.apache.org/r/72055/diff/2/
> 
> 
> Testing
> ---
> 
> internal CI
> 
> 
> Thanks,
> 
> Andrei Budnik
> 
>



Re: Review Request 72055: Changed termination logic of the Docker executor.

2020-01-29 Thread Qian Zhang


> On Jan. 29, 2020, 6:28 p.m., Qian Zhang wrote:
> > The commit message seems not accurate to me:
> > > This could lead to termination of the executor before receiving all 
> > > status update acknowledgments from the agent.
> > 
> > I think the issue that we wanted to mitigate is, executor may shutdown 
> > itself before the terminal status update (rather than the acks) is sent to 
> > agent.
> 
> Andrei Budnik wrote:
> Updated the description.

> This could lead to termination of the executor before processing of a 
> terminal status update by the agent.

What do you mean for `before processing of a terminal status update by the 
agent`? Executor processes terminal status update sent by the agent? I think it 
should be `before the terminal status update is sent to the agent`.


- Qian


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72055/#review219410
---


On Jan. 30, 2020, 12:23 a.m., Andrei Budnik wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72055/
> ---
> 
> (Updated Jan. 30, 2020, 12:23 a.m.)
> 
> 
> Review request for mesos, Andrei Sekretenko, Greg Mann, Qian Zhang, and Vinod 
> Kone.
> 
> 
> Bugs: MESOS-9847
> https://issues.apache.org/jira/browse/MESOS-9847
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Previously, the Docker executor terminated itself after a task's
> container had terminated. This could lead to termination of the
> executor before processing of a terminal status update by the agent.
> In order to mitigate this issue, the executor slept for one second to
> give a chance to send all status updates and receive all status update
> acknowledgments before terminating itself. This might have led to
> various race conditions in some circumstances (e.g., on a slow host).
> This patch terminates the Docker executor after receiving a terminal
> status update acknowledgment. Also, this patch increases the timeout
> from one second to one minute for fail-safety.
> 
> 
> Diffs
> -
> 
>   src/docker/executor.cpp 132f42bfa42c846fc5dc40f7763aa0b5d12a7798 
>   src/exec/exec.cpp 69e5e24b248c7c913421de5e42713c34fd79ad46 
> 
> 
> Diff: https://reviews.apache.org/r/72055/diff/2/
> 
> 
> Testing
> ---
> 
> internal CI
> 
> 
> Thanks,
> 
> Andrei Budnik
> 
>



Re: Review Request 72055: Changed termination logic of the Docker executor.

2020-01-29 Thread Greg Mann

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72055/#review219427
---




src/docker/executor.cpp
Lines 781-783 (original), 781-783 (patched)


Is there a reason to have a failsafe here, but not in the default executor?



src/exec/exec.cpp
Lines 412-417 (patched)


This seems like it's addressing a separate issue? If so, could we split it 
out into a separate patch?


- Greg Mann


On Jan. 29, 2020, 4:23 p.m., Andrei Budnik wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72055/
> ---
> 
> (Updated Jan. 29, 2020, 4:23 p.m.)
> 
> 
> Review request for mesos, Andrei Sekretenko, Greg Mann, Qian Zhang, and Vinod 
> Kone.
> 
> 
> Bugs: MESOS-9847
> https://issues.apache.org/jira/browse/MESOS-9847
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Previously, the Docker executor terminated itself after a task's
> container had terminated. This could lead to termination of the
> executor before processing of a terminal status update by the agent.
> In order to mitigate this issue, the executor slept for one second to
> give a chance to send all status updates and receive all status update
> acknowledgments before terminating itself. This might have led to
> various race conditions in some circumstances (e.g., on a slow host).
> This patch terminates the Docker executor after receiving a terminal
> status update acknowledgment. Also, this patch increases the timeout
> from one second to one minute for fail-safety.
> 
> 
> Diffs
> -
> 
>   src/docker/executor.cpp 132f42bfa42c846fc5dc40f7763aa0b5d12a7798 
>   src/exec/exec.cpp 69e5e24b248c7c913421de5e42713c34fd79ad46 
> 
> 
> Diff: https://reviews.apache.org/r/72055/diff/2/
> 
> 
> Testing
> ---
> 
> internal CI
> 
> 
> Thanks,
> 
> Andrei Budnik
> 
>



Re: Review Request 72029: Changed termination logic of the default executor.

2020-01-29 Thread Greg Mann

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72029/#review219426
---



The patch looks great, thanks Andrei. What about adding a test for this, would 
it be hard? I'm imagining something like:
1) kill a task under the default executor
2) intercept the ACK from agent to executor
3) verify that the executor is still running
4) send the ACK to the executor
5) verify that the executor has terminated

WDYT?

- Greg Mann


On Jan. 29, 2020, 9:28 p.m., Andrei Budnik wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72029/
> ---
> 
> (Updated Jan. 29, 2020, 9:28 p.m.)
> 
> 
> Review request for mesos, Andrei Sekretenko, Greg Mann, Qian Zhang, and Vinod 
> Kone.
> 
> 
> Bugs: MESOS-8537
> https://issues.apache.org/jira/browse/MESOS-8537
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Previously, the default executor terminated itself after all containers
> had terminated. This could lead to termination of the executor before
> processing of a terminal status update by the agent. In order
> to mitigate this issue, the executor slept for one second to give a
> chance to send all status updates and receive all status update
> acknowledgements before terminating itself. This might have led to
> various race conditions in some circumstances (e.g., on a slow host).
> This patch terminates the default executor if all status updates have
> been acknowledged by the agent and no running containers left.
> Also, this patch increases the timeout from one second to one minute
> for fail-safety.
> 
> 
> Diffs
> -
> 
>   src/launcher/default_executor.cpp 4369fd0052b2e8496ba63606fa57e17d881ea52c 
> 
> 
> Diff: https://reviews.apache.org/r/72029/diff/3/
> 
> 
> Testing
> ---
> 
> internal CI
> 
> 
> Thanks,
> 
> Andrei Budnik
> 
>



Re: Review Request 72029: Changed termination logic of the default executor.

2020-01-29 Thread Mesos Reviewbot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72029/#review219425
---



Patch looks great!

Reviews applied: [72029]

Passed command: export OS='ubuntu:16.04' BUILDTOOL='autotools' COMPILER='gcc' 
CONFIGURATION='--verbose --disable-libtool-wrappers 
--disable-parallel-test-execution' ENVIRONMENT='GLOG_v=1 MESOS_VERBOSE=1'; 
./support/jenkins/buildbot.sh

- Mesos Reviewbot


On Jan. 29, 2020, 9:28 p.m., Andrei Budnik wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72029/
> ---
> 
> (Updated Jan. 29, 2020, 9:28 p.m.)
> 
> 
> Review request for mesos, Andrei Sekretenko, Greg Mann, Qian Zhang, and Vinod 
> Kone.
> 
> 
> Bugs: MESOS-8537
> https://issues.apache.org/jira/browse/MESOS-8537
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Previously, the default executor terminated itself after all containers
> had terminated. This could lead to termination of the executor before
> processing of a terminal status update by the agent. In order
> to mitigate this issue, the executor slept for one second to give a
> chance to send all status updates and receive all status update
> acknowledgements before terminating itself. This might have led to
> various race conditions in some circumstances (e.g., on a slow host).
> This patch terminates the default executor if all status updates have
> been acknowledged by the agent and no running containers left.
> Also, this patch increases the timeout from one second to one minute
> for fail-safety.
> 
> 
> Diffs
> -
> 
>   src/launcher/default_executor.cpp 4369fd0052b2e8496ba63606fa57e17d881ea52c 
> 
> 
> Diff: https://reviews.apache.org/r/72029/diff/3/
> 
> 
> Testing
> ---
> 
> internal CI
> 
> 
> Thanks,
> 
> Andrei Budnik
> 
>



Re: Review Request 72029: Changed termination logic of the default executor.

2020-01-29 Thread Andrei Budnik

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72029/
---

(Updated Янв. 29, 2020, 9:28 п.п.)


Review request for mesos, Andrei Sekretenko, Greg Mann, Qian Zhang, and Vinod 
Kone.


Summary (updated)
-

Changed termination logic of the default executor.


Bugs: MESOS-8537
https://issues.apache.org/jira/browse/MESOS-8537


Repository: mesos


Description (updated)
---

Previously, the default executor terminated itself after all containers
had terminated. This could lead to termination of the executor before
processing of a terminal status update by the agent. In order
to mitigate this issue, the executor slept for one second to give a
chance to send all status updates and receive all status update
acknowledgements before terminating itself. This might have led to
various race conditions in some circumstances (e.g., on a slow host).
This patch terminates the default executor if all status updates have
been acknowledged by the agent and no running containers left.
Also, this patch increases the timeout from one second to one minute
for fail-safety.


Diffs (updated)
-

  src/launcher/default_executor.cpp 4369fd0052b2e8496ba63606fa57e17d881ea52c 


Diff: https://reviews.apache.org/r/72029/diff/3/

Changes: https://reviews.apache.org/r/72029/diff/2-3/


Testing
---

internal CI


Thanks,

Andrei Budnik



Re: Review Request 72055: Changed termination logic of the Docker executor.

2020-01-29 Thread Mesos Reviewbot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72055/#review219424
---



Patch looks great!

Reviews applied: [72055]

Passed command: export OS='ubuntu:16.04' BUILDTOOL='autotools' COMPILER='gcc' 
CONFIGURATION='--verbose --disable-libtool-wrappers 
--disable-parallel-test-execution' ENVIRONMENT='GLOG_v=1 MESOS_VERBOSE=1'; 
./support/jenkins/buildbot.sh

- Mesos Reviewbot


On Jan. 29, 2020, 4:23 p.m., Andrei Budnik wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72055/
> ---
> 
> (Updated Jan. 29, 2020, 4:23 p.m.)
> 
> 
> Review request for mesos, Andrei Sekretenko, Greg Mann, Qian Zhang, and Vinod 
> Kone.
> 
> 
> Bugs: MESOS-9847
> https://issues.apache.org/jira/browse/MESOS-9847
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Previously, the Docker executor terminated itself after a task's
> container had terminated. This could lead to termination of the
> executor before processing of a terminal status update by the agent.
> In order to mitigate this issue, the executor slept for one second to
> give a chance to send all status updates and receive all status update
> acknowledgments before terminating itself. This might have led to
> various race conditions in some circumstances (e.g., on a slow host).
> This patch terminates the Docker executor after receiving a terminal
> status update acknowledgment. Also, this patch increases the timeout
> from one second to one minute for fail-safety.
> 
> 
> Diffs
> -
> 
>   src/docker/executor.cpp 132f42bfa42c846fc5dc40f7763aa0b5d12a7798 
>   src/exec/exec.cpp 69e5e24b248c7c913421de5e42713c34fd79ad46 
> 
> 
> Diff: https://reviews.apache.org/r/72055/diff/2/
> 
> 
> Testing
> ---
> 
> internal CI
> 
> 
> Thanks,
> 
> Andrei Budnik
> 
>



Re: Review Request 72055: Changed termination logic of the Docker executor.

2020-01-29 Thread Andrei Budnik


> On Янв. 29, 2020, 10:28 д.п., Qian Zhang wrote:
> > The commit message seems not accurate to me:
> > > This could lead to termination of the executor before receiving all 
> > > status update acknowledgments from the agent.
> > 
> > I think the issue that we wanted to mitigate is, executor may shutdown 
> > itself before the terminal status update (rather than the acks) is sent to 
> > agent.

Updated the description.


> On Янв. 29, 2020, 10:28 д.п., Qian Zhang wrote:
> > src/docker/executor.cpp
> > Lines 786-787 (original)
> > 
> >
> > We have a fail safe in command executor: 
> > https://github.com/apache/mesos/blob/1.9.0/src/launcher/executor.cpp#L1060:L1062
> >  , do we want do the similar in Docker executor to ensure it can still self 
> > terminate in case the agent doesn't send an ACK for the terminal update for 
> > some reason?
> 
> Vinod Kone wrote:
> let's add the fail safe please.

I added a `delay` for 60 seconds before calling `driver->stop` as the fail-safe.


> On Янв. 29, 2020, 10:28 д.п., Qian Zhang wrote:
> > src/exec/exec.cpp
> > Lines 435 (patched)
> > 
> >
> > Do we want a `return;` after this code?

We don't need to. I want to make sure that corresponding objects are removed 
from `updates` and `tasks` in any case.


- Andrei


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72055/#review219410
---


On Янв. 29, 2020, 4:23 п.п., Andrei Budnik wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72055/
> ---
> 
> (Updated Янв. 29, 2020, 4:23 п.п.)
> 
> 
> Review request for mesos, Andrei Sekretenko, Greg Mann, Qian Zhang, and Vinod 
> Kone.
> 
> 
> Bugs: MESOS-9847
> https://issues.apache.org/jira/browse/MESOS-9847
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Previously, the Docker executor terminated itself after a task's
> container had terminated. This could lead to termination of the
> executor before processing of a terminal status update by the agent.
> In order to mitigate this issue, the executor slept for one second to
> give a chance to send all status updates and receive all status update
> acknowledgments before terminating itself. This might have led to
> various race conditions in some circumstances (e.g., on a slow host).
> This patch terminates the Docker executor after receiving a terminal
> status update acknowledgment. Also, this patch increases the timeout
> from one second to one minute for fail-safety.
> 
> 
> Diffs
> -
> 
>   src/docker/executor.cpp 132f42bfa42c846fc5dc40f7763aa0b5d12a7798 
>   src/exec/exec.cpp 69e5e24b248c7c913421de5e42713c34fd79ad46 
> 
> 
> Diff: https://reviews.apache.org/r/72055/diff/2/
> 
> 
> Testing
> ---
> 
> internal CI
> 
> 
> Thanks,
> 
> Andrei Budnik
> 
>



Re: Review Request 72055: Changed termination logic of the Docker executor.

2020-01-29 Thread Andrei Budnik

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72055/
---

(Updated Янв. 29, 2020, 4:23 п.п.)


Review request for mesos, Andrei Sekretenko, Greg Mann, Qian Zhang, and Vinod 
Kone.


Changes
---

addressed Qian's comments


Summary (updated)
-

Changed termination logic of the Docker executor.


Bugs: MESOS-9847
https://issues.apache.org/jira/browse/MESOS-9847


Repository: mesos


Description (updated)
---

Previously, the Docker executor terminated itself after a task's
container had terminated. This could lead to termination of the
executor before processing of a terminal status update by the agent.
In order to mitigate this issue, the executor slept for one second to
give a chance to send all status updates and receive all status update
acknowledgments before terminating itself. This might have led to
various race conditions in some circumstances (e.g., on a slow host).
This patch terminates the Docker executor after receiving a terminal
status update acknowledgment. Also, this patch increases the timeout
from one second to one minute for fail-safety.


Diffs (updated)
-

  src/docker/executor.cpp 132f42bfa42c846fc5dc40f7763aa0b5d12a7798 
  src/exec/exec.cpp 69e5e24b248c7c913421de5e42713c34fd79ad46 


Diff: https://reviews.apache.org/r/72055/diff/2/

Changes: https://reviews.apache.org/r/72055/diff/1-2/


Testing
---

internal CI


Thanks,

Andrei Budnik



Re: Review Request 72055: Changed logic of termination of the Docker executor.

2020-01-29 Thread Vinod Kone


> On Jan. 29, 2020, 10:28 a.m., Qian Zhang wrote:
> > src/docker/executor.cpp
> > Lines 786-787 (original)
> > 
> >
> > We have a fail safe in command executor: 
> > https://github.com/apache/mesos/blob/1.9.0/src/launcher/executor.cpp#L1060:L1062
> >  , do we want do the similar in Docker executor to ensure it can still self 
> > terminate in case the agent doesn't send an ACK for the terminal update for 
> > some reason?

let's add the fail safe please.


- Vinod


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72055/#review219410
---


On Jan. 28, 2020, 2:13 p.m., Andrei Budnik wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72055/
> ---
> 
> (Updated Jan. 28, 2020, 2:13 p.m.)
> 
> 
> Review request for mesos, Andrei Sekretenko, Greg Mann, Qian Zhang, and Vinod 
> Kone.
> 
> 
> Bugs: MESOS-9847
> https://issues.apache.org/jira/browse/MESOS-9847
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Previously, the Docker executor terminated itself after a task's
> container had terminated. This could lead to termination of the
> executor before receiving all status update acknowledgments from
> the agent. In order to mitigate this issue, the executor slept for
> one second to give a chance to send all status updates and receive
> all status update acknowledgments before terminating itself. This
> might have led to various race conditions in some circumstances
> (e.g., on a slow host). This patch terminates the Docker executor
> after receiving a terminal status update acknowledgment. Also,
> this patch removes the unnecessary call of sleep.
> 
> 
> Diffs
> -
> 
>   src/docker/executor.cpp 132f42bfa42c846fc5dc40f7763aa0b5d12a7798 
>   src/exec/exec.cpp 69e5e24b248c7c913421de5e42713c34fd79ad46 
> 
> 
> Diff: https://reviews.apache.org/r/72055/diff/1/
> 
> 
> Testing
> ---
> 
> internal CI
> 
> 
> Thanks,
> 
> Andrei Budnik
> 
>



Re: Review Request 72055: Changed logic of termination of the Docker executor.

2020-01-29 Thread Qian Zhang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72055/#review219410
---



The commit message seems not accurate to me:
> This could lead to termination of the executor before receiving all status 
> update acknowledgments from the agent.

I think the issue that we wanted to mitigate is, executor may shutdown itself 
before the terminal status update (rather than the acks) is sent to agent.


src/docker/executor.cpp
Lines 786-787 (original)


We have a fail safe in command executor: 
https://github.com/apache/mesos/blob/1.9.0/src/launcher/executor.cpp#L1060:L1062
 , do we want do the similar in Docker executor to ensure it can still self 
terminate in case the agent doesn't send an ACK for the terminal update for 
some reason?



src/exec/exec.cpp
Line 420 (original), 426 (patched)


Why do we remove the task only if it is a terminal status update acked? 
That is not our previous implemention where we always remove the task no matter 
it is a terminal status update or not, and it is also not consistent with what 
we have done in command executor: 
https://github.com/apache/mesos/blob/1.9.0/src/launcher/executor.cpp#L246:L248



src/exec/exec.cpp
Lines 435 (patched)


Do we want a `return;` after this code?


- Qian Zhang


On Jan. 28, 2020, 10:13 p.m., Andrei Budnik wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72055/
> ---
> 
> (Updated Jan. 28, 2020, 10:13 p.m.)
> 
> 
> Review request for mesos, Andrei Sekretenko, Greg Mann, Qian Zhang, and Vinod 
> Kone.
> 
> 
> Bugs: MESOS-9847
> https://issues.apache.org/jira/browse/MESOS-9847
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Previously, the Docker executor terminated itself after a task's
> container had terminated. This could lead to termination of the
> executor before receiving all status update acknowledgments from
> the agent. In order to mitigate this issue, the executor slept for
> one second to give a chance to send all status updates and receive
> all status update acknowledgments before terminating itself. This
> might have led to various race conditions in some circumstances
> (e.g., on a slow host). This patch terminates the Docker executor
> after receiving a terminal status update acknowledgment. Also,
> this patch removes the unnecessary call of sleep.
> 
> 
> Diffs
> -
> 
>   src/docker/executor.cpp 132f42bfa42c846fc5dc40f7763aa0b5d12a7798 
>   src/exec/exec.cpp 69e5e24b248c7c913421de5e42713c34fd79ad46 
> 
> 
> Diff: https://reviews.apache.org/r/72055/diff/1/
> 
> 
> Testing
> ---
> 
> internal CI
> 
> 
> Thanks,
> 
> Andrei Budnik
> 
>