Ralph, I still observe these issues in the current master. (npernode is not respected either).
Also note that the display_allocation seems to be wrong (slots_inuse=0 when the slot is obviously in use).

$ git show
4899c89 (HEAD -> master, origin/master, origin/HEAD) Fix a race condition when multiple threads try to create a bml en...  Bouteiller, 6 hours ago

$ bin/mpirun -np 12 -hostfile /opt/etc/ib10g.machinefile.ompi -display-allocation -map-by node hostname

======================   ALLOCATED NODES   ======================
	dancer00: flags=0x13 slots=8 max_slots=0 slots_inuse=0 state=UNKNOWN
	dancer01: flags=0x13 slots=8 max_slots=0 slots_inuse=0 state=UNKNOWN
	dancer02: flags=0x13 slots=8 max_slots=0 slots_inuse=0 state=UNKNOWN
	dancer03: flags=0x13 slots=8 max_slots=0 slots_inuse=0 state=UNKNOWN
	dancer04: flags=0x13 slots=8 max_slots=0 slots_inuse=0 state=UNKNOWN
	dancer05: flags=0x13 slots=8 max_slots=0 slots_inuse=0 state=UNKNOWN
	dancer06: flags=0x13 slots=8 max_slots=0 slots_inuse=0 state=UNKNOWN
	dancer07: flags=0x13 slots=8 max_slots=0 slots_inuse=0 state=UNKNOWN
	dancer08: flags=0x13 slots=8 max_slots=0 slots_inuse=0 state=UNKNOWN
	dancer09: flags=0x13 slots=8 max_slots=0 slots_inuse=0 state=UNKNOWN
	dancer10: flags=0x13 slots=8 max_slots=0 slots_inuse=0 state=UNKNOWN
	dancer11: flags=0x13 slots=8 max_slots=0 slots_inuse=0 state=UNKNOWN
	dancer12: flags=0x13 slots=8 max_slots=0 slots_inuse=0 state=UNKNOWN
	dancer13: flags=0x13 slots=8 max_slots=0 slots_inuse=0 state=UNKNOWN
	dancer14: flags=0x13 slots=8 max_slots=0 slots_inuse=0 state=UNKNOWN
	dancer15: flags=0x13 slots=8 max_slots=0 slots_inuse=0 state=UNKNOWN
=================================================================
dancer01
dancer00
dancer01
dancer01
dancer01
dancer00
dancer00
dancer00
dancer00
dancer00
dancer00
dancer00
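A possible cross-check (a sketch only; it reuses the hostfile above together with the standard -npernode, -display-map and --report-bindings options) is to request the per-node count explicitly and have mpirun print the map it computed:

$ mpirun -np 12 -npernode 1 -hostfile /opt/etc/ib10g.machinefile.ompi \
      -display-map --report-bindings hostname
# With round-robin mapping working, each of the first 12 nodes in the hostfile
# should host exactly one rank, and the printed map and binding report should show it.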
--
Aurélien Bouteiller, Ph.D. ~~ https://icl.cs.utk.edu/~bouteill/

> On Apr 13, 2016, at 1:38 PM, Ralph Castain <r...@open-mpi.org> wrote:
>
> The —map-by node option should now be fixed on master, and PRs waiting for 1.10 and 2.0
>
> Thx!
>
>> On Apr 12, 2016, at 6:45 PM, Ralph Castain <r...@open-mpi.org> wrote:
>>
>> FWIW: speaking just to the —map-by node issue, Josh Ladd reported the problem on master as well yesterday. I'll be looking into it on Wed.
>>
>>> On Apr 12, 2016, at 5:53 PM, George Bosilca <bosi...@icl.utk.edu> wrote:
>>>
>>> On Wed, Apr 13, 2016 at 1:59 AM, Gilles Gouaillardet <gil...@rist.or.jp> wrote:
>>> George,
>>>
>>> about the process binding part:
>>>
>>> On 4/13/2016 7:32 AM, George Bosilca wrote:
>>> Also my processes, despite the fact that I asked for 1 per node, are not bound to the first core. Shouldn't we release the process binding when we know there is a single process per node (as in the above case)?
>>>
>>> did you expect the tasks to be bound to the first *core* on each node?
>>>
>>> i would expect the tasks to be bound to the first *socket* on each node.
>>>
>>> In this particular instance, where it has been explicitly requested to have a single process per node, I would have expected the process to be unbound (we know there is only one per node). It is the responsibility of the application to bind itself or its threads if necessary. Why are we enforcing a particular binding policy?
>>>
>>> since we do not know how many (OpenMP or other) threads will be used by the application, --bind-to socket is a good policy imho. In this case (one task per node), no binding at all would mean the task can migrate from one socket to the other, and/or OpenMP threads are bound across sockets. That would trigger some NUMA effects (better bandwidth if memory is accessed locally, but worse performance if memory is allocated on only one socket). So imho, --bind-to socket is still my preferred policy, even if there is only one MPI task per node.
>>>
>>> Open MPI is about MPI ranks/processes. I don't think it is our job to try to figure out what the user does with their own threads.
>>>
>>> Your justification makes sense if the application only uses a single socket. It also makes sense if one starts multiple ranks per node, and the internal threads of each MPI process inherit the MPI process binding. However, in the case where there is a single process per node, because there is a mismatch between the number of resources available (hardware threads) and the binding of the parent process, all the threads of the MPI application are [by default] bound on a single socket.
>>>
>>> George.
>>>
>>> PS: That being said I think I'll need to implement the binding code anyway in order to deal with the wide variety of behaviors in the different MPI implementations.
>>>
>>> Cheers,
>>>
>>> Gilles
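On the binding question, a minimal way to see what each rank actually ends up bound to (a sketch only; --bind-to and --report-bindings are the usual mpirun options, and Cpus_allowed_list is Linux-specific):

$ mpirun -np 2 -npernode 1 --bind-to none --report-bindings \
      sh -c 'grep Cpus_allowed_list /proc/self/status'
# --report-bindings prints what the launcher applied to each rank;
# the grep shows the affinity mask the process itself inherited.
# With --bind-to none both should span all hardware threads on the node,
# with --bind-to socket only the cores of one socket.

Since threads spawned later (OpenMP or otherwise) inherit that mask, comparing the two runs shows whether they would be confined to a single socket by default.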