Okay, I see it - will fix on Fri. This is unique to master.

> On May 5, 2016, at 1:54 PM, Aurélien Bouteiller <boute...@icl.utk.edu> wrote:
> 
> Ralph, 
> 
> I still observe these issues in the current master. (npernode is not 
> respected either).
> 
> Also note that the -display-allocation output seems to be wrong (slots_inuse=0 
> even though the slots are obviously in use). 
> 
> $ git show 
> 4899c89 (HEAD -> master, origin/master, origin/HEAD) Fix a race condition 
> when multiple threads try to create a bml en....Bouteiller  6 hours ago
> 
> $ bin/mpirun -np 12 -hostfile /opt/etc/ib10g.machinefile.ompi 
> -display-allocation -map-by node    hostname 
> 
> ======================   ALLOCATED NODES   ======================
>       dancer00: flags=0x13 slots=8 max_slots=0 slots_inuse=0 state=UNKNOWN
>       dancer01: flags=0x13 slots=8 max_slots=0 slots_inuse=0 state=UNKNOWN
>       dancer02: flags=0x13 slots=8 max_slots=0 slots_inuse=0 state=UNKNOWN
>       dancer03: flags=0x13 slots=8 max_slots=0 slots_inuse=0 state=UNKNOWN
>       dancer04: flags=0x13 slots=8 max_slots=0 slots_inuse=0 state=UNKNOWN
>       dancer05: flags=0x13 slots=8 max_slots=0 slots_inuse=0 state=UNKNOWN
>       dancer06: flags=0x13 slots=8 max_slots=0 slots_inuse=0 state=UNKNOWN
>       dancer07: flags=0x13 slots=8 max_slots=0 slots_inuse=0 state=UNKNOWN
>       dancer08: flags=0x13 slots=8 max_slots=0 slots_inuse=0 state=UNKNOWN
>       dancer09: flags=0x13 slots=8 max_slots=0 slots_inuse=0 state=UNKNOWN
>       dancer10: flags=0x13 slots=8 max_slots=0 slots_inuse=0 state=UNKNOWN
>       dancer11: flags=0x13 slots=8 max_slots=0 slots_inuse=0 state=UNKNOWN
>       dancer12: flags=0x13 slots=8 max_slots=0 slots_inuse=0 state=UNKNOWN
>       dancer13: flags=0x13 slots=8 max_slots=0 slots_inuse=0 state=UNKNOWN
>       dancer14: flags=0x13 slots=8 max_slots=0 slots_inuse=0 state=UNKNOWN
>       dancer15: flags=0x13 slots=8 max_slots=0 slots_inuse=0 state=UNKNOWN
> =================================================================
> dancer01
> dancer00
> dancer01
> dancer01
> dancer01
> dancer00
> dancer00
> dancer00
> dancer00
> dancer00
> dancer00
> dancer00
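
For reference, with 12 ranks over this 16-node allocation, -map-by node should 
place the processes round-robin, one per node on dancer00 through dancer11, 
rather than 8 on dancer00 and 4 on dancer01 as shown above. A quick way to 
check once this is fixed (just a sketch; hostname output order is not 
deterministic):

  $ bin/mpirun -np 12 -hostfile /opt/etc/ib10g.machinefile.ompi \
        -map-by node hostname | sort | uniq -c
  (expected: a count of 1 for each of dancer00 ... dancer11)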
> 
> 
> --
> Aurélien Bouteiller, Ph.D. ~~ https://icl.cs.utk.edu/~bouteill/
>> On Apr 13, 2016, at 1:38 PM, Ralph Castain <r...@open-mpi.org> wrote:
>> 
>> The --map-by node option should now be fixed on master, and PRs are waiting 
>> for 1.10 and 2.0.
>> 
>> Thx!
>> 
>>> On Apr 12, 2016, at 6:45 PM, Ralph Castain <r...@open-mpi.org> wrote:
>>> 
>>> FWIW: speaking just to the --map-by node issue, Josh Ladd reported the 
>>> problem on master as well yesterday. I'll be looking into it on Wed.
>>> 
>>>> On Apr 12, 2016, at 5:53 PM, George Bosilca <bosi...@icl.utk.edu> wrote:
>>>> 
>>>> 
>>>> 
>>>> On Wed, Apr 13, 2016 at 1:59 AM, Gilles Gouaillardet <gil...@rist.or.jp> wrote:
>>>> George,
>>>> 
>>>> about the process binding part
>>>> 
>>>> On 4/13/2016 7:32 AM, George Bosilca wrote:
>>>> Also my processes, despite the fact that I asked for 1 per node, are not 
>>>> bound to the first core. Shouldn’t we release the process binding when we 
>>>> know there is a single process per node (as in the above case)?
>>>> Did you expect the tasks to be bound to the first *core* on each node?
>>>> 
>>>> I would expect the tasks to be bound to the first *socket* on each node.
>>>> 
>>>> In this particular instance, where it has been explicitly requested to 
>>>> have a single process per node, I would have expected the process to be 
>>>> unbound (we know there is only one per node). It is the responsibility of 
>>>> the application to bind itself or its threads if necessary. Why are we 
>>>> enforcing a particular binding policy?
>>>> 
>>>> Since we do not know how many (OpenMP or other) threads will be used by 
>>>> the application, --bind-to socket is a good policy imho. In this case (one 
>>>> task per node), no binding at all would mean the task can migrate from one 
>>>> socket to the other, and/or the OpenMP threads are bound across sockets. 
>>>> That would trigger some NUMA effects (better bandwidth if memory is 
>>>> accessed locally, but worse performance if memory is allocated on only one 
>>>> socket). So imho, --bind-to socket is still my preferred policy, even if 
>>>> there is only one MPI task per node.
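
As a side note, the default policy can already be relaxed per job from the 
mpirun command line, e.g. (reusing the hostfile from the example above):

  $ bin/mpirun -np 12 -hostfile /opt/etc/ib10g.machinefile.ompi \
        -map-by node --bind-to none hostname

--bind-to none leaves the processes unbound, and --report-bindings can be 
added to confirm what each rank ended up with.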
>>>> 
>>>> Open MPI is about MPI ranks/processes. I don't think it is our job to try 
>>>> to figure out what the user does with their own threads.
>>>> 
>>>> Your justification makes sense if the application only uses a single 
>>>> socket. It also makes sense if one starts multiple ranks per node, and the 
>>>> internal threads of each MPI process inherit the MPI process binding. 
>>>> However, in the case where there is a single process per node, because 
>>>> there is a mismatch between the number of resources available (hardware 
>>>> threads) and the binding of the parent process, all the threads of the MPI 
>>>> application are [by default] bound to a single socket.
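
To make the point above concrete, here is a minimal Linux-specific sketch 
(just an illustration, not code from this thread) that an application can run 
to see how much of the machine the inherited binding actually leaves it. If it 
reports fewer hardware threads than are online, every thread the process 
spawns is confined to that subset (e.g. a single socket):

  #define _GNU_SOURCE
  #include <sched.h>
  #include <stdio.h>
  #include <unistd.h>

  int main(void)
  {
      cpu_set_t mask;
      CPU_ZERO(&mask);
      if (sched_getaffinity(0, sizeof(mask), &mask) != 0) {
          perror("sched_getaffinity");
          return 1;
      }
      /* Fewer CPUs in the mask than online means the launcher bound us
       * to a subset of the machine. */
      printf("bound to %d of %ld online hardware threads\n",
             CPU_COUNT(&mask), sysconf(_SC_NPROCESSORS_ONLN));
      return 0;
  }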
>>>> 
>>>>  George.
>>>> 
>>>> PS: That being said I think I'll need to implement the binding code anyway 
>>>> in order to deal with the wide variety of behaviors in the different MPI 
>>>> implementations.
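
On the PS: a minimal Linux-specific sketch of what such application-side 
binding code could start from (just an illustration, not George's actual 
implementation; a portable version would presumably go through hwloc rather 
than raw sched_setaffinity):

  #define _GNU_SOURCE
  #include <sched.h>
  #include <unistd.h>

  /* Rebind the calling thread to every online CPU; threads created
   * afterwards inherit this mask, effectively undoing the binding
   * inherited from the launcher. */
  static int unbind_self(void)
  {
      cpu_set_t mask;
      long ncpus = sysconf(_SC_NPROCESSORS_ONLN);
      CPU_ZERO(&mask);
      for (long c = 0; c < ncpus && c < CPU_SETSIZE; c++)
          CPU_SET((int)c, &mask);
      return sched_setaffinity(0, sizeof(mask), &mask);
  }

  int main(void)
  {
      return unbind_self() == 0 ? 0 : 1;
  }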
>>>> 
>>>> 
>>>> Cheers,
>>>> 
>>>> Gilles