----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/19239/#review38550 -----------------------------------------------------------
Ship it! Looks great now! Just one note about ensuring we deliver the signal below when killtree encounters an error. Looks shippable, outside of having Ian's PID namespace stuff for Linux. Feel free to add a TODO for that if you'd like this committed, and a TODO to make the escalation time configurable. I don't think Ian's namespace stuff is needed for this to land, since the semantics around accidentally killing the wrong pid here have remained the same! (Only if within 1 second the process dies, and the pid is recycled, will we kill the wrong process). src/launcher/executor.cpp <https://reviews.apache.org/r/19239/#comment70803> In the error case here as well, can we send ::kill(pid, SIGTERM) with a comment? src/launcher/executor.cpp <https://reviews.apache.org/r/19239/#comment70802> In this error case, it's not clear that the 'pid' would have received the desired signal, when killtree encounters an error. How about we insert a ::kill(pid, SIGKILL) here with a comment? - Ben Mahler On March 25, 2014, 10:21 p.m., Niklas Nielsen wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/19239/ > ----------------------------------------------------------- > > (Updated March 25, 2014, 10:21 p.m.) > > > Review request for mesos, Ben Mahler and Ian Downes. > > > Bugs: MESOS-1031 > https://issues.apache.org/jira/browse/MESOS-1031 > > > Repository: mesos-git > > > Description > ------- > > This patch adds a two step signal escalation: 1) SIGTERM is sent to > the process tree 2) If the process tree root hasn't been reaped within > escalationTimeout (3 seconds), SIGKILL is sent to the process tree. > Currently, this may leave behind orphan processes if parent processes > terminates and leave non-responsive processes behind. This will be > captured when PID namespaces are in place, as the orphan processes > will end up hanging off the process tree root instead of init. > > > Diffs > ----- > > src/launcher/executor.cpp e30d77a > src/slave/constants.hpp d237383 > src/slave/constants.cpp 3fb0cde > > Diff: https://reviews.apache.org/r/19239/diff/ > > > Testing > ------- > > make check > > > Thanks, > > Niklas Nielsen > >
