-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72055/#review219410
-----------------------------------------------------------



The commit message does not seem accurate to me:
> This could lead to termination of the executor before receiving all status 
> update acknowledgments from the agent.

I think the issue we wanted to mitigate is that the executor may shut itself 
down before the terminal status update (rather than the ACKs) is sent to the 
agent.


src/docker/executor.cpp
Lines 786-787 (original)
<https://reviews.apache.org/r/72055/#comment307583>

    We have a fail-safe in the command executor:
    https://github.com/apache/mesos/blob/1.9.0/src/launcher/executor.cpp#L1060:L1062
    Do we want to do something similar in the Docker executor, so that it can
    still terminate itself if the agent does not send an ACK for the terminal
    update for some reason?
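
A minimal sketch of the kind of fail-safe meant here, written as a standalone model rather than as Mesos or libprocess code. The class `TerminationPolicy`, its method names, and the timeout parameter are all hypothetical and only illustrate the idea: exit as soon as the terminal update is acknowledged, but also exit once a timeout has elapsed since the terminal update was sent.

```cpp
#include <chrono>

// Hypothetical model (not Mesos code): decides when an executor may exit.
class TerminationPolicy {
public:
  using Clock = std::chrono::steady_clock;

  explicit TerminationPolicy(std::chrono::milliseconds failSafeTimeout)
    : timeout(failSafeTimeout) {}

  // Record the moment the terminal status update was sent to the agent.
  void terminalUpdateSent(Clock::time_point now) {
    sent = true;
    sentAt = now;
  }

  // Record that the agent acknowledged the terminal status update.
  void terminalUpdateAcked() { acked = true; }

  // The executor may exit once the ACK has arrived, or once the fail-safe
  // timeout has elapsed since the terminal update was sent.
  bool shouldTerminate(Clock::time_point now) const {
    if (acked) {
      return true;
    }
    return sent && (now - sentAt) >= timeout;
  }

private:
  std::chrono::milliseconds timeout;
  bool sent = false;
  bool acked = false;
  Clock::time_point sentAt{};
};
```

In the real executor the timeout would presumably be a delayed self-termination scheduled when the terminal update is sent. The sketch only captures the decision rule, not the scheduling.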



src/exec/exec.cpp
Line 420 (original), 426 (patched)
<https://reviews.apache.org/r/72055/#comment307578>

    Why do we remove the task only when the acknowledged status update is
    terminal? The previous implementation always removed the task, whether or
    not the update was terminal, and the new behavior is also inconsistent with
    what the command executor does:
    https://github.com/apache/mesos/blob/1.9.0/src/launcher/executor.cpp#L246:L248
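
To make the difference concrete, here is a small standalone sketch of the two behaviors being compared. The `Bookkeeping` struct and both function names are hypothetical stand-ins, not the actual types or handlers in `src/exec/exec.cpp`. They assume only that the driver tracks unacknowledged updates and tasks in two collections.

```cpp
#include <set>
#include <string>

// Hypothetical stand-in for the driver's bookkeeping (not the Mesos types).
struct Bookkeeping {
  std::set<std::string> updates; // UUIDs of unacknowledged status updates
  std::set<std::string> tasks;   // IDs of tasks still being tracked
};

// Behavior described above as the previous implementation: every ACK
// removes both the update and the task.
void ackAlways(Bookkeeping& b,
               const std::string& uuid,
               const std::string& taskId) {
  b.updates.erase(uuid);
  b.tasks.erase(taskId);
}

// Behavior questioned in this comment: the task is removed only when the
// acknowledged update is terminal.
void ackTerminalOnly(Bookkeeping& b,
                     const std::string& uuid,
                     const std::string& taskId,
                     bool terminal) {
  b.updates.erase(uuid);
  if (terminal) {
    b.tasks.erase(taskId);
  }
}
```

With a non-terminal ACK, `ackAlways` drops the task from `tasks` while `ackTerminalOnly` keeps it, which is the inconsistency raised above.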



src/exec/exec.cpp
Line 435 (patched)
<https://reviews.apache.org/r/72055/#comment307584>

    Do we want a `return;` after this code?


- Qian Zhang


On Jan. 28, 2020, 10:13 p.m., Andrei Budnik wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72055/
> -----------------------------------------------------------
> 
> (Updated Jan. 28, 2020, 10:13 p.m.)
> 
> 
> Review request for mesos, Andrei Sekretenko, Greg Mann, Qian Zhang, and Vinod 
> Kone.
> 
> 
> Bugs: MESOS-9847
>     https://issues.apache.org/jira/browse/MESOS-9847
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Previously, the Docker executor terminated itself after a task's
> container had terminated. This could lead to termination of the
> executor before receiving all status update acknowledgments from
> the agent. In order to mitigate this issue, the executor slept for
> one second to give a chance to send all status updates and receive
> all status update acknowledgments before terminating itself. This
> might have led to various race conditions in some circumstances
> (e.g., on a slow host). This patch terminates the Docker executor
> after receiving a terminal status update acknowledgment. Also,
> this patch removes the unnecessary call of sleep.
> 
> 
> Diffs
> -----
> 
>   src/docker/executor.cpp 132f42bfa42c846fc5dc40f7763aa0b5d12a7798 
>   src/exec/exec.cpp 69e5e24b248c7c913421de5e42713c34fd79ad46 
> 
> 
> Diff: https://reviews.apache.org/r/72055/diff/1/
> 
> 
> Testing
> -------
> 
> internal CI
> 
> 
> Thanks,
> 
> Andrei Budnik
> 
>
