> On April 2, 2018, 4:57 p.m., Greg Mann wrote:
> > src/slave/slave.cpp
> > Line 2233 (original), 2274 (patched)
> > <https://reviews.apache.org/r/66144/diff/7/?file=1991208#file1991208line2274>
> >
> >     We should defer this callback.

That would change the old behavior, i.e. `sendExitedExecutorMessage` is sent synchronously along with the other error handling code:
https://github.com/apache/mesos/blob/594ee20c2453dad836313769aef9f8655cd75cd5/src/slave/slave.cpp#L2226-L2231

I found that making the error handling asynchronous makes the code unnecessarily difficult to reason about. For example, it means there is a brief moment in which the first task has failed but the sequence is still alive, which contradicts our comments. Tying the sequence lifecycle and `exitedExecutorMessage` to task launch success atomically makes the code much easier to reason about.

I am not sure whether it would make a difference now, but we should stick to the old behavior unless there is a compelling reason not to.


- Meng


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66144/#review200329
-----------------------------------------------------------


On April 2, 2018, 5:36 p.m., Meng Zhu wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66144/
> -----------------------------------------------------------
>
> (Updated April 2, 2018, 5:36 p.m.)
>
>
> Review request for mesos, Chun-Hung Hsiao and Greg Mann.
>
>
> Bugs: MESOS-8617 and MESOS-8624
>     https://issues.apache.org/jira/browse/MESOS-8617
>     https://issues.apache.org/jira/browse/MESOS-8624
>
>
> Repository: mesos
>
>
> Description
> -------
>
> Up until now, Mesos has not guaranteed in-order
> task launch on the agent. There are two asynchronous
> steps (unschedule GC and task authorization) in the
> agent task launch path. Depending on the CPU scheduling
> order, a later task launch may finish these two steps earlier
> than its predecessors and get to the launch executor stage
> earlier, resulting in out-of-order task delivery.
>
> This patch enforces the task delivery order by sequencing
> task launch after the two asynchronous steps, specifically
> right before entering `__run()`.
>
>
> Diffs
> -----
>
>   src/slave/slave.hpp 37f0361251524e63d02d251e8a03901812b8affb
>   src/slave/slave.cpp a4bd4ccd3fc59c3c0e462d9b480f5424b3e52d7a
>
>
> Diff: https://reviews.apache.org/r/66144/diff/8/
>
>
> Testing
> -------
>
> make check
>
>
> Thanks,
>
> Meng Zhu
>
>
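
For readers following the quoted description, the ordering idea can be sketched in a few lines of standalone C++. This is only an illustration under stated assumptions, not the code in the patch: `LaunchSequence`, `enqueue()`, `ready()` and `simulateLaunches()` are hypothetical names made up here, and the sketch is single-threaded, assuming all callbacks run in one execution context. Each launch takes a ticket on arrival; when its asynchronous steps finish, its launch callback is held until every earlier ticket has been released.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not part of Mesos): releases callbacks strictly in
// the order their tickets were issued, regardless of the order in which
// they become ready.
class LaunchSequence {
public:
  // Called when a launch request arrives; fixes its position in the order.
  std::size_t enqueue() { return nextTicket_++; }

  // Called when the asynchronous steps for `ticket` have finished. Runs
  // `launch` now if all earlier tickets were released, else parks it.
  void ready(std::size_t ticket, std::function<void()> launch) {
    assert(ticket < nextTicket_);  // Only issued tickets may become ready.
    pending_[ticket] = std::move(launch);

    // Release every parked callback whose turn has come.
    while (true) {
      auto it = pending_.find(nextToRelease_);
      if (it == pending_.end()) {
        break;
      }
      std::function<void()> callback = std::move(it->second);
      pending_.erase(it);
      ++nextToRelease_;
      callback();
    }
  }

private:
  std::size_t nextTicket_ = 0;
  std::size_t nextToRelease_ = 0;
  std::map<std::size_t, std::function<void()>> pending_;
};

// Simulates two launches where the later one finishes its asynchronous
// steps first; returns the order in which the launches were delivered.
std::vector<std::string> simulateLaunches() {
  LaunchSequence sequence;
  std::vector<std::string> delivered;

  std::size_t first = sequence.enqueue();
  std::size_t second = sequence.enqueue();

  // The second task becomes ready first, but is parked.
  sequence.ready(second, [&delivered] { delivered.push_back("task2"); });
  // The first task becomes ready; both are now released in order.
  sequence.ready(first, [&delivered] { delivered.push_back("task1"); });

  return delivered;
}
```

Calling `simulateLaunches()` yields the tasks in arrival order even though the second one completed its asynchronous steps first. The sketch leaves out failure handling, which is the part discussed in the reply above.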
