----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/54518/#review158479 -----------------------------------------------------------
So it looks like the basic idea in this patch is to: 1) Add a `reaped()` function which we call whenever a process is reaped. From there we log why the process was reaped and (possibly) set a limitation on it if it died abnormally. We store this limitation in the switchboard's info struct. 2) Simplify the logic in `watch()` down to do nothing more than return the limitation future from the switchboard's info struct. 3) Change `cleanup()` to only send SIGTERM to the switchboard process if its status is still pending (i.e. it hasn't been reaped yet). 4) Change cleanup to properly remove the unix domain socket file in all cases of the `reaped()` future being satisifed. If so, it sounds/looks good to me overall . Just a few comments. src/slave/containerizer/mesos/io/switchboard.cpp (line 779) <https://reviews.apache.org/r/54518/#comment229244> Why do you have to do things this way? With the `await()` and the `list` of futures. Can't we just have an `.onAny()` on status? src/slave/containerizer/mesos/io/switchboard.cpp (lines 812 - 816) <https://reviews.apache.org/r/54518/#comment229245> When does this case happen? src/slave/containerizer/mesos/io/switchboard.cpp (line 828) <https://reviews.apache.org/r/54518/#comment229248> Do you mean: ``` if (WIFEXITED(status.get()) && WEXITSTATUS(status.get()) == 0) ``` src/slave/containerizer/mesos/io/switchboard.cpp (line 834) <https://reviews.apache.org/r/54518/#comment229249> s/is/if/ - Kevin Klues On Dec. 8, 2016, 1:46 a.m., Jie Yu wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/54518/ > ----------------------------------------------------------- > > (Updated Dec. 8, 2016, 1:46 a.m.) > > > Review request for mesos and Kevin Klues. > > > Bugs: MESOS-6756 > https://issues.apache.org/jira/browse/MESOS-6756 > > > Repository: mesos > > > Description > ------- > > Previously, we don't handle the case where reaping failed. This patch > refactored the reaping logic. The main idea is to register a 'reaped' > callback which gets invoked when the reaping of the I/O switchboard > server process has a result. This also changes the 'watch' logic. > > > Diffs > ----- > > src/slave/containerizer/mesos/io/switchboard.hpp > a8a1e2558432ae1e4c7e5513e5653feba4352056 > src/slave/containerizer/mesos/io/switchboard.cpp > af956951de9c605c68f6ec16a7edf3c33f4bd499 > > Diff: https://reviews.apache.org/r/54518/diff/ > > > Testing > ------- > > make check > > > Thanks, > > Jie Yu > >
