-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18597/#review36490
-----------------------------------------------------------

src/launcher/executor.cpp
<https://reviews.apache.org/r/18597/#comment67475>

    We really try to avoid blocking in libprocess. Instead of polling and sleeping, can you use a combination of the 'reaped' signal and a delay() timeout? If you get the reaped signal, you're either done or you need to clean up the remainder of the tree. If your delay() expires, you need to clean up the tree.

    Also, we need to be careful about our assumptions here: we may not be inside a 'pid' namespace, and we may have privileges to signal pids outside our tree. In the face of stale pids, we may accidentally signal other processes after the signal escalation delay, no?


- Ben Mahler


On March 3, 2014, 7:13 p.m., Niklas Nielsen wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/18597/
> -----------------------------------------------------------
>
> (Updated March 3, 2014, 7:13 p.m.)
>
>
> Review request for mesos and Ben Mahler.
>
>
> Bugs: MESOS-1031
>     https://issues.apache.org/jira/browse/MESOS-1031
>
>
> Repository: mesos-git
>
>
> Description
> -------
>
> This patch makes command executor shutdown a bit more graceful
> (than sending SIGKILL upfront) by adding signal escalation.
> Signal escalation checks for liveness of process trees after SIGTERM
> has been sent. If all processes are dead before the given gracePeriod,
> shutdown() returns. If not, SIGKILL will be sent to all pids in
> the process trees. gracePeriod is set to 3 seconds, in order to issue
> SIGKILL before shutdownTimeout in the slave is triggered
> (EXECUTOR_SHUTDOWN_GRACE_PERIOD is 5 seconds).
>
>
> Diffs
> -----
>
>   src/launcher/executor.cpp e30d77a
>
> Diff: https://reviews.apache.org/r/18597/diff/
>
>
> Testing
> -------
>
> Functional testing and make check.
>
>
> Thanks,
>
> Niklas Nielsen
>
>
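
For illustration, the event-driven escalation suggested in the review comment (react to the 'reaped' signal or to an expired delay(), whichever comes first) might be sketched as below. This is a minimal sketch in plain C++, not the patch under review. The `Escalation` class, its `remaining` and `kill` callbacks, and the method names are all hypothetical and do not come from Mesos or libprocess. It also assumes both events are delivered serially, for example by being dispatched to the same libprocess process.

```cpp
#include <functional>
#include <set>
#include <utility>
#include <sys/types.h>

// Hypothetical helper for illustration; not part of Mesos or libprocess.
class Escalation {
public:
  // `remaining` returns the pids still verified to belong to the task's
  // process tree; `kill` delivers SIGKILL to one pid. Both are injected so
  // the decision logic contains no polling or sleeping.
  Escalation(std::function<std::set<pid_t>()> remaining,
             std::function<void(pid_t)> kill)
    : remaining_(std::move(remaining)), kill_(std::move(kill)) {}

  // Invoke when the 'reaped' notification for the root process arrives.
  void reaped() { cleanup(); }

  // Invoke when the grace period timer fires.
  void expired() { cleanup(); }

  bool done() const { return done_; }

private:
  void cleanup() {
    if (done_) return;  // The event that arrives second is a no-op.
    done_ = true;
    // Re-query membership at kill time rather than reusing pids captured
    // before SIGTERM, which may be stale and reused by unrelated processes.
    for (pid_t pid : remaining_()) kill_(pid);
  }

  std::function<std::set<pid_t>()> remaining_;
  std::function<void(pid_t)> kill_;
  bool done_ = false;
};
```

Re-querying the tree just before SIGKILL narrows the stale-pid window but does not close it. A pid can still be reused between the query and the signal, so the concern raised in the review about signalling outside the tree still applies.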
