-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72029/#review219426
-----------------------------------------------------------

The patch looks great, thanks Andrei.

What about adding a test for this, would it be hard? I'm imagining something like:

1) kill a task under the default executor
2) intercept the ACK from agent to executor
3) verify that the executor is still running
4) send the ACK to the executor
5) verify that the executor has terminated

WDYT?

- Greg Mann


On Jan. 29, 2020, 9:28 p.m., Andrei Budnik wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72029/
> -----------------------------------------------------------
> 
> (Updated Jan. 29, 2020, 9:28 p.m.)
> 
> 
> Review request for mesos, Andrei Sekretenko, Greg Mann, Qian Zhang, and Vinod Kone.
> 
> 
> Bugs: MESOS-8537
>     https://issues.apache.org/jira/browse/MESOS-8537
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Previously, the default executor terminated itself after all containers
> had terminated. This could lead to termination of the executor before
> processing of a terminal status update by the agent. In order
> to mitigate this issue, the executor slept for one second to give a
> chance to send all status updates and receive all status update
> acknowledgements before terminating itself. This might have led to
> various race conditions in some circumstances (e.g., on a slow host).
> This patch terminates the default executor if all status updates have
> been acknowledged by the agent and no running containers left.
> Also, this patch increases the timeout from one second to one minute
> for fail-safety.
> 
> 
> Diffs
> -----
> 
>   src/launcher/default_executor.cpp 4369fd0052b2e8496ba63606fa57e17d881ea52c
> 
> 
> Diff: https://reviews.apache.org/r/72029/diff/3/
> 
> 
> Testing
> -------
> 
> internal CI
> 
> 
> Thanks,
> 
> Andrei Budnik
> 
>
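
To make the five proposed test steps concrete, here is a rough, self-contained sketch of the termination condition from the description (no running containers and no unacknowledged status updates). It is only a toy model: `ToyExecutor`, `ScenarioResult`, `runAckScenario` and the update-id scheme are made-up names for illustration, not code from default_executor.cpp or the Mesos test harness, and the one-minute fail-safe timeout is left out.

```cpp
#include <cassert>
#include <set>
#include <string>

// Toy stand-in for the default executor (hypothetical, not Mesos code).
// It terminates only when no containers are running AND every status
// update it sent has been acknowledged.
class ToyExecutor {
public:
  void launch(const std::string& containerId) {
    running_.insert(containerId);
  }

  // A container has terminated: a terminal status update is "sent" and
  // stays pending until it is acknowledged. Returns the update id.
  std::string containerTerminated(const std::string& containerId) {
    running_.erase(containerId);
    const std::string updateId = "update-" + containerId;
    unacknowledged_.insert(updateId);
    maybeTerminate();
    return updateId;
  }

  // The agent's acknowledgement for a status update has arrived.
  void acknowledged(const std::string& updateId) {
    unacknowledged_.erase(updateId);
    maybeTerminate();
  }

  bool terminated() const { return terminated_; }

private:
  void maybeTerminate() {
    if (running_.empty() && unacknowledged_.empty()) {
      terminated_ = true;
    }
  }

  std::set<std::string> running_;
  std::set<std::string> unacknowledged_;
  bool terminated_ = false;
};

struct ScenarioResult {
  bool runningWhileAckHeld;
  bool terminatedAfterAck;
};

// Walks through the five steps from the review comment.
ScenarioResult runAckScenario() {
  ToyExecutor executor;
  executor.launch("task-1");

  // 1) Kill the task; its terminal status update is now pending.
  const std::string updateId = executor.containerTerminated("task-1");

  // 2) The ACK is "intercepted": it is simply not delivered yet.
  // 3) The executor should still be running at this point.
  const bool runningWhileAckHeld = !executor.terminated();

  // 4) Deliver the ACK.
  executor.acknowledged(updateId);

  // 5) The executor should now have terminated.
  return {runningWhileAckHeld, executor.terminated()};
}
```

Calling `runAckScenario()` and checking both fields of the result corresponds to the assertions in steps 3 and 5. A real test would presumably drive the actual executor and hold back the agent's acknowledgement instead of using this model.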
