Re: [OMPI devel] pmix: race condition in dynamic/intercomm_create from the ibm test suite

2014-08-25 Thread Ralph Castain
And that was indeed the problem - fixed, and now the trunk runs clean thru my MTT. Thanks again! Ralph On Aug 25, 2014, at 7:38 AM, Ralph Castain wrote: > Yeah, that was going to be my first place to look once I finished breakfast > :-) > > Thanks! > Ralph > > On Aug 25,

Re: [OMPI devel] pmix: race condition in dynamic/intercomm_create from the ibm test suite

2014-08-25 Thread Ralph Castain
Yeah, that was going to be my first place to look once I finished breakfast :-) Thanks! Ralph On Aug 25, 2014, at 7:32 AM, Gilles Gouaillardet wrote: > Thanks for the explanation > > In orte_dt_compare_sig(...) memcmp did not multiply value1->sz by >

Re: [OMPI devel] pmix: race condition in dynamic/intercomm_create from the ibm test suite

2014-08-25 Thread Gilles Gouaillardet
Thanks for the explanation. In orte_dt_compare_sig(...), memcmp did not multiply value1->sz by sizeof(opal_identifier_t), so only part of each signature was compared. Being afk, I could not test it, but that looks like a good suspect. Cheers, Gilles Ralph Castain wrote: > Each collective is given a "signature" that is just
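To illustrate the suspected bug, here is a minimal sketch, not the actual ORTE code. The names proc_id_t, coll_sig, compare_sig_buggy and compare_sig_fixed are hypothetical stand-ins, and the sketch assumes the identifier is a 64-bit integer. It shows how passing an element count to memcmp, rather than a byte count, can make two different signatures look identical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for opal_identifier_t (assumed to be 64 bits wide). */
typedef uint64_t proc_id_t;

/* Hypothetical stand-in for a collective signature: the array of names
 * of all procs taking part in the collective, plus the element count. */
typedef struct {
    proc_id_t *signature;
    size_t sz;              /* number of entries, NOT number of bytes */
} coll_sig;

/* Suspected bug pattern: memcmp takes a byte count, but sz is an element
 * count, so only the first sz bytes of the arrays are compared. */
int compare_sig_buggy(const coll_sig *a, const coll_sig *b)
{
    assert(a != NULL && b != NULL);
    if (a->sz != b->sz) {
        return a->sz > b->sz ? 1 : -1;
    }
    return memcmp(a->signature, b->signature, a->sz);
}

/* Corrected pattern: scale the element count by the element size so that
 * every byte of every entry is compared. */
int compare_sig_fixed(const coll_sig *a, const coll_sig *b)
{
    assert(a != NULL && b != NULL);
    if (a->sz != b->sz) {
        return a->sz > b->sz ? 1 : -1;
    }
    return memcmp(a->signature, b->signature, a->sz * sizeof(proc_id_t));
}
```

With two-entry signatures {0, 1} and {0, 2}, the buggy comparison only looks at the first two bytes of the first entry and reports a match, while the fixed comparison reports a difference. If signatures are used to keep collectives apart, a false match of this kind could let two distinct collectives be treated as the same one.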

Re: [OMPI devel] pmix: race condition in dynamic/intercomm_create from the ibm test suite

2014-08-25 Thread Ralph Castain
Each collective is given a "signature" that is just the array of names for all procs involved in the collective. Thus, even though task 0 is involved in both of the disconnect barriers, the two collectives should be running in isolation from each other. The "tags" are just receive callbacks

[OMPI devel] pmix: race condition in dynamic/intercomm_create from the ibm test suite

2014-08-25 Thread Gilles Gouaillardet
Folks, when I run mpirun -np 1 ./intercomm_create from the ibm test suite, it does one of the following:
- succeeds
- hangs
- mpirun crashes (SIGSEGV) soon after writing the following message:
ORTE_ERROR_LOG: Not found in file ../../../src/ompi-trunk/orte/orted/pmix/pmix_server.c at line 566
Here is what happens: