Okay, I see it - will fix on Fri. This is unique to master.
> On May 5, 2016, at 1:54 PM, Aurélien Bouteiller <boute...@icl.utk.edu> wrote:
>
> Ralph,
>
> I still observe these issues in the current master. (npernode is not
> respected either).
>
> Also note that the -display-allocation output seems to be wrong (slots_inuse=0
> when the slots are obviously in use).
>
> $ git show
> 4899c89 (HEAD -> master, origin/master, origin/HEAD) Fix a race condition
> when multiple threads try to create a bml en....Bouteiller 6 hours ago
>
> $ bin/mpirun -np 12 -hostfile /opt/etc/ib10g.machinefile.ompi
> -display-allocation -map-by node hostname
>
> ====================== ALLOCATED NODES ======================
> dancer00: flags=0x13 slots=8 max_slots=0 slots_inuse=0 state=UNKNOWN
> dancer01: flags=0x13 slots=8 max_slots=0 slots_inuse=0 state=UNKNOWN
> dancer02: flags=0x13 slots=8 max_slots=0 slots_inuse=0 state=UNKNOWN
> dancer03: flags=0x13 slots=8 max_slots=0 slots_inuse=0 state=UNKNOWN
> dancer04: flags=0x13 slots=8 max_slots=0 slots_inuse=0 state=UNKNOWN
> dancer05: flags=0x13 slots=8 max_slots=0 slots_inuse=0 state=UNKNOWN
> dancer06: flags=0x13 slots=8 max_slots=0 slots_inuse=0 state=UNKNOWN
> dancer07: flags=0x13 slots=8 max_slots=0 slots_inuse=0 state=UNKNOWN
> dancer08: flags=0x13 slots=8 max_slots=0 slots_inuse=0 state=UNKNOWN
> dancer09: flags=0x13 slots=8 max_slots=0 slots_inuse=0 state=UNKNOWN
> dancer10: flags=0x13 slots=8 max_slots=0 slots_inuse=0 state=UNKNOWN
> dancer11: flags=0x13 slots=8 max_slots=0 slots_inuse=0 state=UNKNOWN
> dancer12: flags=0x13 slots=8 max_slots=0 slots_inuse=0 state=UNKNOWN
> dancer13: flags=0x13 slots=8 max_slots=0 slots_inuse=0 state=UNKNOWN
> dancer14: flags=0x13 slots=8 max_slots=0 slots_inuse=0 state=UNKNOWN
> dancer15: flags=0x13 slots=8 max_slots=0 slots_inuse=0 state=UNKNOWN
> =================================================================
> dancer01
> dancer00
> dancer01
> dancer01
> dancer01
> dancer00
> dancer00
> dancer00
> dancer00
> dancer00
> dancer00
> dancer00
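>
> For reference, a minimal C reproducer (just a sketch, similar to running
> hostname but also printing the rank) is below; with a working --map-by node
> the 12 ranks should be spread round-robin, one per node across the first 12
> nodes in the hostfile (dancer00 through dancer11), instead of piling up on
> dancer00 and dancer01 as above:
>
>   #include <mpi.h>
>   #include <stdio.h>
>
>   int main(int argc, char **argv)
>   {
>       int rank, len;
>       char host[MPI_MAX_PROCESSOR_NAME];
>
>       MPI_Init(&argc, &argv);
>       MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>       MPI_Get_processor_name(host, &len);
>       /* each rank reports the node it was actually mapped to */
>       printf("rank %d runs on %s\n", rank, host);
>       MPI_Finalize();
>       return 0;
>   }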
>
>
> --
> Aurélien Bouteiller, Ph.D. ~~ https://icl.cs.utk.edu/~bouteill/
>> On Apr 13, 2016, at 1:38 PM, Ralph Castain <r...@open-mpi.org> wrote:
>>
>> The --map-by node option should now be fixed on master, and PRs are waiting
>> for 1.10 and 2.0.
>>
>> Thx!
>>
>>> On Apr 12, 2016, at 6:45 PM, Ralph Castain <r...@open-mpi.org> wrote:
>>>
>>> FWIW: speaking just to the --map-by node issue, Josh Ladd reported the
>>> problem on master as well yesterday. I’ll be looking into it on Wed.
>>>
>>>> On Apr 12, 2016, at 5:53 PM, George Bosilca <bosi...@icl.utk.edu> wrote:
>>>>
>>>>
>>>>
>>>> On Wed, Apr 13, 2016 at 1:59 AM, Gilles Gouaillardet <gil...@rist.or.jp> wrote:
>>>> George,
>>>>
>>>> About the process binding part:
>>>>
>>>> On 4/13/2016 7:32 AM, George Bosilca wrote:
>>>> Also, my processes, despite the fact that I asked for 1 per node, are not
>>>> bound to the first core. Shouldn’t we release the process binding when we
>>>> know there is a single process per node (as in the above case)?
>>>> Did you expect the tasks to be bound to the first *core* on each node?
>>>>
>>>> I would expect the tasks to be bound to the first *socket* on each node.
>>>>
>>>> In this particular instance, where it has been explicitly requested to
>>>> have a single process per node, I would have expected the process to be
>>>> unbound (we know there is only one per node). It is the responsibility of
>>>> the application to bind itself or its threads if necessary. Why are we
>>>> enforcing a particular binding policy?
>>>>
>>>> Since we do not know how many (OpenMP or other) threads will be used by
>>>> the application, --bind-to socket is a good policy imho. In this case (one
>>>> task per node), no binding at all would mean the task can migrate from one
>>>> socket to the other, and/or OpenMP threads are bound across sockets. That
>>>> would trigger some NUMA effects (better bandwidth if memory is accessed
>>>> locally, but worse performance if memory is allocated on only one socket).
>>>> So imho, --bind-to socket is still my preferred policy, even if there is
>>>> only one MPI task per node.
>>>>
>>>> Open MPI is about MPI ranks/processes. I don’t think it is our job to try
>>>> to figure out what the user does with their own threads.
>>>>
>>>> Your justification makes sense if the application only uses a single
>>>> socket. It also makes sense if one starts multiple ranks per node, and the
>>>> internal threads of each MPI process inherit the MPI process binding.
>>>> However, in the case where there is a single process per node, because
>>>> there is a mismatch between the number of resources available (hardware
>>>> threads) and the binding of the parent process, all the threads of the MPI
>>>> application are [by default] bound to a single socket.
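>>>>
>>>> (To see this concretely, here is a minimal Linux-only sketch, not part of
>>>> any Open MPI code, that uses sched_getaffinity; each rank, or any thread
>>>> it spawns, can run it, and under --bind-to socket it will only report the
>>>> cores of one socket:)
>>>>
>>>>   #define _GNU_SOURCE
>>>>   #include <sched.h>
>>>>   #include <stdio.h>
>>>>
>>>>   int main(void)
>>>>   {
>>>>       cpu_set_t mask;
>>>>       CPU_ZERO(&mask);
>>>>       /* report the cores this process (and the threads it will spawn) inherited */
>>>>       if (sched_getaffinity(0, sizeof(mask), &mask) == 0) {
>>>>           for (int c = 0; c < CPU_SETSIZE; c++) {
>>>>               if (CPU_ISSET(c, &mask)) printf("%d ", c);
>>>>           }
>>>>           printf("\n");
>>>>       }
>>>>       return 0;
>>>>   }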
>>>>
>>>> George.
>>>>
>>>> PS: That being said, I think I’ll need to implement the binding code anyway
>>>> in order to deal with the wide variety of behaviors across the different MPI
>>>> implementations.
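>>>>
>>>> (For illustration only, one possible shape of such application-side
>>>> binding on Linux, using pthread_setaffinity_np to pin the calling thread
>>>> to a core the application picks itself; this is a sketch, not the code
>>>> referred to above:)
>>>>
>>>>   #define _GNU_SOURCE
>>>>   #include <pthread.h>
>>>>   #include <sched.h>
>>>>   #include <stdio.h>
>>>>
>>>>   /* pin the calling thread to a single core chosen by the application */
>>>>   static int bind_self_to_core(int core)
>>>>   {
>>>>       cpu_set_t mask;
>>>>       CPU_ZERO(&mask);
>>>>       CPU_SET(core, &mask);
>>>>       return pthread_setaffinity_np(pthread_self(), sizeof(mask), &mask);
>>>>   }
>>>>
>>>>   int main(void)
>>>>   {
>>>>       int rc = bind_self_to_core(0);
>>>>       if (rc != 0) fprintf(stderr, "binding failed (error %d)\n", rc);
>>>>       return rc;
>>>>   }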
>>>>
>>>>
>>>>
>>>> Cheers,
>>>>
>>>> Gilles