Thanks, Ralph!

Cheers,

Gilles

On 2014/08/28 4:52, Ralph Castain wrote:
> Took me a while to track this down, but it is now fixed - it was a
> combination of several minor errors
>
> Thanks
> Ralph
>
> On Aug 27, 2014, at 4:07 AM, Gilles Gouaillardet 
> <gilles.gouaillar...@iferc.org> wrote:
>
>> Folks,
>>
>> the intercomm_create test case from the ibm test suite can hang under
>> some configurations.
>>
>> basically, it will spawn n tasks in a first communicator, and then n
>> tasks in a second communicator.
>>
>> when i run from node0 :
>> mpirun -np 1 --mca btl tcp,self --mca coll ^ml -host node1,node2
>> ./intercomm_create
>>
>> the second spawn will hang.
>> a simple workaround is to use 3 hosts :
>> mpirun -np 1 --mca btl tcp,self --mca coll ^ml -host node1,node2,node3
>> ./intercomm_create
>>
>> the second spawn creates the task on node2.
>> for reasons i cannot fully understand, pmix believes the orteds of
>> node1 and node2 are involved in the allgather.
>> since node1 is not involved whatsoever, the program hangs
>> /* in create_dmns, orte_get_job_data_object(sig->signature[0].jobid)
>> returns jdata with jdata->map->num_nodes = 2 */
>>
>> Cheers,
>>
>> Gilles
>> _______________________________________________
>> devel mailing list
>> de...@open-mpi.org
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>> Link to this post: 
>> http://www.open-mpi.org/community/lists/devel/2014/08/15732.php
> _______________________________________________
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post: 
> http://www.open-mpi.org/community/lists/devel/2014/08/15743.php
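
The hang in the original report can be pictured with a toy model. This is only a sketch, not Open MPI code: the function names are hypothetical, and it assumes that an allgather completes only once every daemon it expects has contributed. With the participant list taken from a job map that names two nodes while only node2 hosts the second spawn, the collective keeps waiting on node1.

```python
# Toy model only: this is NOT Open MPI code, and all names are hypothetical.
# Assumption: an allgather completes only when every expected daemon contributes.

def missing_participants(expected, contributing):
    """Return the daemons the collective is still waiting on, sorted by name."""
    return sorted(set(expected) - set(contributing))


def allgather_completes(expected, contributing):
    """True when no expected daemon is missing from the contributors."""
    return not missing_participants(expected, contributing)


# Reported situation: the job map lists two nodes (num_nodes = 2),
# but the second spawn only places a task on node2.
expected = ["node1", "node2"]
contributing = ["node2"]

if __name__ == "__main__":
    print("still waiting on:", missing_participants(expected, contributing))
    print("completes:", allgather_completes(expected, contributing))
```

In this model the collective can only finish if the expected list is narrowed to the daemons that actually take part, here node2 alone.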
