Yeah, that’s a bug - we’ll have to address it.


> On Nov 28, 2016, at 9:29 AM, Noel Rycroft <> wrote:
> I'm seeing different behaviour between Open MPI 1.8.4 and 2.0.1 with regard 
> to signal propagation.
> With version 1.8.4 mpirun seems to propagate SIGTERM to the tasks it starts 
> which enables the tasks to handle SIGTERM.
> In version 2.0.1 mpirun does not seem to propagate SIGTERM and instead I 
> suspect it's sending SIGKILL immediately. Because the child tasks are not 
> given a chance to handle SIGTERM they end up orphaning their child processes.
> I have a pretty simple reproducer which consists of:
> A simple MPI application that sleeps for a number of seconds.
> A simple bash script which launches mpirun.  
> A second bash script which is used to launch a 'child' MPI application 
> 'sleep' binary
> Both scripts launch their children in the background, and 'wait' on 
> completion. They both install signal handlers for SIGTERM.
> When SIGTERM is sent to the top level script it is explicitly propagated to 
> 'mpirun' via the signal handler. 
> In Open MPI 1.8.4 SIGTERM is propagated to the child MPI tasks which in turn 
> explicitly propagate the signal to the child binary processes.
> In Open MPI 2.0.1 I see no evidence that SIGTERM is propagated to the child 
> MPI tasks. Instead those tasks are killed and their children (the application 
> binaries) are orphaned.
> Is the difference in behaviour between the different versions expected?
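
For anyone trying to reproduce this, the wrapper pattern described above (launch the child in the background, install a SIGTERM handler, 'wait' on completion) can be sketched roughly as follows. This is only an illustration, not the original scripts: the function name run_and_forward_term is hypothetical, and whatever command is passed to it stands in for mpirun or the 'sleep' binary.

```shell
#!/bin/bash
# Hypothetical sketch of the wrapper pattern from the report; this is NOT
# the poster's actual script. run_and_forward_term is an illustrative name.

run_and_forward_term() {
    "$@" &                      # launch the child in the background
    local child=$!
    local status

    # On SIGTERM, explicitly pass the signal on to the child.
    # Double quotes expand $child now, when the trap is installed.
    trap "kill -TERM $child 2>/dev/null" TERM

    # 'wait' returns early (status > 128) when a trapped signal arrives,
    # so wait a second time to let the child finish its own cleanup.
    wait "$child"
    status=$?
    if [ "$status" -gt 128 ] && kill -0 "$child" 2>/dev/null; then
        wait "$child"
        status=$?
    fi

    trap - TERM                 # restore the default SIGTERM disposition
    return "$status"
}
```

Calling it as "run_and_forward_term mpirun ..." at the top level, and with the application binary inside each task, would give the chain described in the report. Whether the inner handlers ever run depends on mpirun delivering SIGTERM to its tasks rather than killing them outright, which is the difference being asked about.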
> _______________________________________________
> users mailing list