Re: Revision of GNU Parallel's processing of SIGTERM

Martin d'Anjou Mon, 13 Apr 2015 06:44:55 -0700

On 15-04-12 07:14 AM, Ole Tange wrote:


You mean in sub Job::kill():

             # Wait up to 200 ms between TERMs - but only if any pids
are alive
             my $sleep = 1;
             for (my $sleepsum = 0; kill 0, $family_pids[0] and $sleepsum < 200;
                  $sleepsum += $sleep) {
                 $sleep = ::reap_usleep($sleep);
             }

'kill 0, pid' returns true if the process is still running.
$family_pids[0] is the immediate child (i.e. the parent of any
(grand)*children)).
There is no need to see if any (grand)*children are running: it is the
job of $family_pids[0] to kill those.

I agree: it is the job of the parent process to kill its childprocesses. But this does not match what I see in the code. Here is thekill loop:


    for my $pid (@family_pids) {
        if(kill 0, $pid) {
        # The job still running
        kill $signal, $pid;
        $alive = 1;
        }
    }

The kill loop sends the kill signal to all elements of the family_pidsarray. So IMO the code and the intent have to match. In fact, I ran intothis confusion when I wrote the test (line 19 of this file):

https://github.com/martinda/gnu-parallel/blob/sigterm-1/testsuite/tests-to-run/parallel-local-signals.sh#L19

I propose two solutions:

- Change the code to only kill the parent process (existing users willhave to change their application code to kill children processes themselves)- Do not change the default behaviour, but add an option so that onlythe parent job is killed (existing users see no change)


What solution is preferable?

Martin

Re: Revision of GNU Parallel's processing of SIGTERM

Reply via email to