On Wed, Apr 29 2015, Ole Tange <o...@tange.dk> wrote: > On Wed, Apr 29, 2015 at 2:07 PM, Rasmus Villemoes <r...@rasmusvillemoes.dk> > wrote: >> On Wed, Apr 29 2015, Ole Tange <o...@tange.dk> wrote: >> >>> This still has the risk of killing an innocent PID and its children. >> >> Killing (in the sense of sending any signal whatsoever) an >> innocent/unrelated PID is completely unacceptable, IMO. On a reasonably >> busy system, PID reuse within 10 seconds is far from unlikely. > > On my system this gives PID reuse after 3.1 secs, but that is a very > extreme case, and I will accept if GNU Parallel deals wrongly with that case: > > perl -e 'while(1) { $a=(fork()|| exit); if(not $a %1000) {print "$a\n";} } ' >
It can happen a lot faster if the system has a lot of running processes (or not running; 20000 processes doing sleep(100000) has the same effect). But this is besides the point: I do not want GNU Parallel to accidentally kill some other process; there's no limit to what kind of corruption that might cause, especially if the signal is SIGKILL. >> Mapping the tree even before signalling the immediate children is not >> enough; some of the grand^nchildren may vanish in the meantime and >> their PIDs reused before one can use the gathered information. > > I doubt that is true in practice. Mapping takes less than 100 ms, so I > would find it very unlikely that the PID will be reused that fast. I > understand that this could in theory happen, but I would like to see > this demonstrated before I consider this a real problem. I'm a mathematician and a strong believer in Murphy's law. I really prefer _not_ seeing this happen :-) > Since GNU Parallel will be sleeping (and not doing anything else) we > could simply kill 0 all the (grand*)children every second and compute > the family tree of the current children. If the child dies, remove the > child from the list to be killed later. > > @children=familiy_tree(@job_pids); > for $signal (@the_signals) { > kill $signal, @job_pids; > $sleep_time = shift @sleep_times; > $time_slept = 0; > while($time_slept < $sleep_time and @children) { > @children = family_tree(grep { kill( 0, $_) } @children); > sleep $a_while; > $time_slept += $a_while; > } > } > kill KILL, @children; > > Rasmus: Can you find a situation in which the above will fail? Yes, but the argument will just be the same: Any pid in @children which is not a direct child of yours must be considered out-of-bounds. Between the time the pid is obtained and SIGKILL is sent, that process may have exited and the pid reused. In fact, the above is worse, since any random process can sneak into the @children array at some point, causing all its descendants to be picked up by the family_tree, and then we have the potential for a real bloodbath. >> I think the only way to do this right is for GNU Parallel to make each >> immediate child a process group leader (setpgrp 0,0 immediately after >> fork). > > GNU Parallel uses open3 to spawn children. According to strace -ff > that does not do a setpgrp. One can call setpgid on behalf of another process. So GNU Parallel can put the child created by open3 in its own process group by doing setpgid(child, 0) (equivalently setpgid(child, child)). In Perl, it seems that the wrapper for the setpgid system call is setpgrp, but the semantics should be the same. On Thu, Apr 30 2015, Martin d'Anjou <martin.danjo...@gmail.com> wrote: > To me the "right thing" is to give a chance for well-written programs, > programs that obey all the rules about cleaning up their child > processes, a chance to execute without causing in them spurious errors > like "kill (...) - No such process". Send the signals to the immediate > child first, then after a delay, if that has failed, do the process > group kill loop. To those who care, it is a huge improvement, and to > those who do not care, they won't care one way or the other. On Wed, Apr 29 2015, Ole Tange <o...@tange.dk> wrote: > Yes. GNU Parallel should do the right thing in most cases and not > cause a problem in the rest. OK, so here's a concrete proposal, which shouldn't cause random processes to get killed: - Put every child in its own process group, per the above. - When it's time to begin the termination sequence, send the first signal to each immediate child. Now the children may respond by (hopefully cleaning up and) exiting. _Before_ reaping the child, we can send a signal to its entire process group; if the child was well-behaved, there are no other processes in its process group, so this is a noop, otherwise this is precisely when we want to terminate the grandchildren etc. Of course, the SIGCHLD doesn't tell us which child has died, but we can reliably check each of our children for being in the zombie state before calling wait (and then only wait for children which we know are gone). I'm not sure what the appropriate signal is to send to the grandchildren - SIGKILL is rather harsh. It may be that the child wasn't well-behaved and just died upon getting the signal without propagating it further down, but that some grandchild actually has some state it could and should clean up. Maybe the right thing to do is send all the signals in the users chosen termination sequence in reasonably rapid succession, but still giving the grandchildren some time to respond to them. In the common case, when there are no grandchildren (often simply because the child never spawned any), or when the grandchildren actually respond to the first signal in the sequence, we'd also like to avoid a bunch of useless sleeps. As an optimization, we could scan the process tree for processes belonging to the process group of the zombie child, and stop when only the child is found. Note that I still do not propose sending signals specifically to any pids found - any signals are still sent to the process group as a whole, which exists for as long as we keep the child in the zombie state. The scanning also automatically introduces a little delay, which may be enough for the grandchildren to do their cleanup. After we've sent the last signal to the process group, there's nothing more we can do, so we can reap the child and forget about it. Now, this can be repeated for each given signal, with the given delays inbetween, stopping of course if/when all children are gone. This became a lot longer than I intended - I hope the idea is obvious enough. Rasmus