That would break debugger attach. Sounds to me like it’s just an uninitialized 
variable for in_event_hdlr?
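
To illustrate what I mean, here is a minimal sketch of how an uninitialized flag could route the same event differently from one platform to the next. Only the field name nondefault comes from Gilles' note below. The caddy_t struct, the helper names, and the memset used to simulate leftover heap contents are all hypothetical and are not the actual OPAL/PMIx code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the notification caddy. Only the field name
 * `nondefault` is taken from the thread; everything else is assumed. */
typedef struct {
    int status;
    int nondefault;   /* nonzero: deliver to registered (non-default) handlers */
} caddy_t;

/* Simulates a caddy whose fields were never set. The memset stands in for
 * whatever bytes malloc happened to return, so the flag reads as "true"
 * purely by accident. On another allocator the bytes could be zero. */
caddy_t *caddy_new_unsafe_sim(void)
{
    caddy_t *cd = malloc(sizeof(*cd));
    assert(cd != NULL);
    memset(cd, 0xFF, sizeof(*cd));
    return cd;
}

/* Explicit initialization: zero the whole struct, then set each field, so
 * the routing decision no longer depends on previous heap contents. */
caddy_t *caddy_new(int status, int nondefault)
{
    caddy_t *cd = calloc(1, sizeof(*cd));
    assert(cd != NULL);
    cd->status = status;
    cd->nondefault = nondefault;
    return cd;
}

/* Returns 1 if the event goes to registered handlers, 0 if it falls
 * through to the default handler (which, per the report, kills the task). */
int routes_to_registered(const caddy_t *cd)
{
    return cd->nondefault != 0;
}
```

With the simulated garbage, routes_to_registered() returns 1 only by luck; with caddy_new(status, 0) it returns 0 every time. If something like this is happening, initializing the field where the caddy is constructed would make RHEL6 and RHEL7 behave the same way.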

> On Jul 15, 2016, at 1:20 AM, Gilles Gouaillardet <gil...@rist.or.jp> wrote:
> 
> Ralph,
> 
> i noticed MPI_Comm_spawn is broken on master and on RHEL7
> 
> for some reason i cannot yet explain, it works just fine on RHEL6 (!)
> 
> 
> mpirun -np 1 ./dynamic/intercomm_create
> 
> from the ibm test suite can be used to reproduce the issue.
> 
> 
> 
> i dug a bit and found that OPAL_ERR_DEBUGGER_RELEASE is fired in mpirun,
> and the tasks then receive a PMIX_ERR_DEBUGGER_RELEASE notification. it
> seems no event handler is registered, so the default handler kills the task.
> 
> 
> for the time being, a trivial workaround is not to fire 
> OPAL_ERR_DEBUGGER_RELEASE in the first place
> 
> (see patch below)
> 
> 
> could you please have a look ?
> 
> i am not sure whether clients should not be notified at all, or whether they 
> should register a dummy handler.
> 
> fwiw, in _event_hdlr, cd->nondefault is true on RHEL6, but false on RHEL7, 
> and that might indicate a race condition
> 
> 
> Cheers,
> 
> 
> Gilles
> 
> diff --git a/orte/orted/orted_submit.c b/orte/orted/orted_submit.c
> index b9d571c..0de0e79 100644
> --- a/orte/orted/orted_submit.c
> +++ b/orte/orted/orted_submit.c
> @@ -2155,6 +2155,7 @@ static bool mpir_breakpoint_fired = false;
> 
> static void _send_notification(void)
> {
> +#if 0
>     opal_buffer_t buf;
>     int status = OPAL_ERR_DEBUGGER_RELEASE;
>     orte_grpcomm_signature_t sig;
> @@ -2209,6 +2210,7 @@ static void _send_notification(void)
>     }
>     OBJ_DESTRUCT(&sig);
>     OBJ_DESTRUCT(&buf);
> +#endif
> }
> 
> static void orte_debugger_dump(void)
> 
> 
> 
> _______________________________________________
> devel mailing list
> de...@open-mpi.org
> Subscription: https://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post: 
> http://www.open-mpi.org/community/lists/devel/2016/07/19214.php
