Thanks Ralph!

Cheers,

Gilles

On 2014/08/28 4:52, Ralph Castain wrote:
> Took me a while to track this down, but it is now fixed - combination of
> several minor errors
>
> Thanks
> Ralph
>
> On Aug 27, 2014, at 4:07 AM, Gilles Gouaillardet
> <gilles.gouaillar...@iferc.org> wrote:
>
>> Folks,
>>
>> the intercomm_create test case from the ibm test suite can hang under
>> some configurations.
>>
>> basically, it will spawn n tasks in a first communicator, and then n
>> tasks in a second communicator.
>>
>> when i run from node0 :
>> mpirun -np 1 --mca btl tcp,self --mca coll ^ml -host node1,node2
>> ./intercomm_create
>>
>> the second spawn will hang.
>> a simple workaround is to use 3 hosts :
>> mpirun -np 1 --mca btl tcp,self --mca coll ^ml -host node1,node2,node3
>> ./intercomm_create
>>
>> the second spawn creates the task on node2.
>> for some reason i cannot fully understand, pmix believes the orteds of
>> node1 and node2 are both involved in the allgather.
>> since node1 is not involved whatsoever, the program hangs
>> /* in create_dmns, orte_get_job_data_object(sig->signature[0].jobid)
>> returns jdata with jdata->map->num_nodes = 2 */
>>
>> Cheers,
>>
>> Gilles
>> _______________________________________________
>> devel mailing list
>> de...@open-mpi.org
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>> Link to this post:
>> http://www.open-mpi.org/community/lists/devel/2014/08/15732.php
> _______________________________________________
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/08/15743.php