On Wed, Apr 29, 2015 at 2:07 PM, Rasmus Villemoes <r...@rasmusvillemoes.dk> wrote:
> On Wed, Apr 29 2015, Ole Tange <o...@tange.dk> wrote:
>
>> This still has the risk of killing an innocent PID and its children.
>
> Killing (in the sense of sending any signal whatsoever) an
> innocent/unrelated PID is completely unacceptable, IMO. On a reasonably
> busy system, PID reuse within 10 seconds is far from unlikely.

On my system this gives PID reuse after 3.1 secs, but that is a very
extreme case, and I can accept GNU Parallel handling that case wrongly:

  perl -e 'while(1) { $a=(fork() || exit); if(not $a % 1000) { print "$a\n"; } }'

> Mapping
> the tree even before signalling the immediate children is not enough;
> some of the grand^nchildren may vanish in the meantime and their PIDs
> reused before one can use the gathered information.

I doubt that is true in practice. Mapping takes less than 100 ms, so I
would find it very unlikely that a PID will be reused that fast. I
understand that this could in theory happen, but I would like to see
it demonstrated before I consider it a real problem.

Since GNU Parallel will be sleeping (and not doing anything else) we
could simply send signal 0 (kill 0) to all the (grand*)children every
second and recompute the family tree of those still alive. If a child
dies, it is removed from the list of PIDs to be killed later.

  @children = family_tree(@job_pids);
  for $signal (@the_signals) {
    kill $signal, @job_pids;
    $sleep_time = shift @sleep_times;
    $time_slept = 0;
    while($time_slept < $sleep_time and @children) {
      # kill 0 only tests for existence: keep the PIDs that are still alive
      @children = family_tree(grep { kill(0, $_) } @children);
      sleep $a_while;
      $time_slept += $a_while;
    }
  }
  # Whatever survived all the signals gets SIGKILL
  kill 'KILL', @children;

Rasmus: Can you find a situation in which the above will fail?

> I think the only way to do this right is for GNU Parallel to make each
> immediate child a process group leader (setpgrp 0,0 immediately after
> fork).

GNU Parallel uses open3 to spawn children. According to strace -ff
that does not do a setpgrp.

> Do note that one can never clean up all descendants that may have been
> spawned: A dance consisting of double fork() and some setpgid/setsid
> yoga will create a process which cannot be tied to GNU Parallel or any
> of its immediate children. So one has to rely on the children not doing
> such things.

Yes. GNU Parallel should do the right thing in most cases and not
cause a problem in the rest.

/Ole
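
PS: For reference, a minimal sketch of the process group approach
Rasmus describes, assuming a plain fork/exec instead of the open3 call
GNU Parallel actually uses. spawn_in_group and kill_group are
hypothetical names for illustration, not GNU Parallel functions, and
the one second sleep is only a crude way to let the child reach
setpgrp before it is signalled.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not a GNU Parallel function): start @cmd in a
# new process group whose ID equals the child's PID.
sub spawn_in_group {
    my @cmd = @_;
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    if ($pid == 0) {
        setpgrp(0, 0);              # child: become process group leader
        exec { $cmd[0] } @cmd;      # only returns if exec fails
        exit 127;
    }
    return $pid;                    # parent: the PID doubles as the group ID
}

# Hypothetical helper: send $signal (a name such as 'TERM') to every
# process still in group $pgid. A negative signal name makes Perl's
# kill target the process group instead of a single PID.
sub kill_group {
    my ($signal, $pgid) = @_;
    return kill "-$signal", $pgid;
}

# Usage: the shell and both sleeps share one group, so one kill reaches all.
my $pid = spawn_in_group('sh', '-c', 'sleep 30 & sleep 30');
sleep 1;                            # crude: give the child time to call setpgrp
kill_group('TERM', $pid);
waitpid($pid, 0);
```

This still cannot reach a descendant that has moved itself into
another group or session, as noted above.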