The --map-by node option should now be fixed on master, with PRs waiting for 1.10 and 2.0.
Thx!

> On Apr 12, 2016, at 6:45 PM, Ralph Castain <r...@open-mpi.org> wrote:
>
> FWIW: speaking just to the --map-by node issue, Josh Ladd reported the
> problem on master as well yesterday. I'll be looking into it on Wed.
>
>> On Apr 12, 2016, at 5:53 PM, George Bosilca <bosi...@icl.utk.edu> wrote:
>>
>> On Wed, Apr 13, 2016 at 1:59 AM, Gilles Gouaillardet <gil...@rist.or.jp> wrote:
>>
>> George,
>>
>> about the process binding part
>>
>> On 4/13/2016 7:32 AM, George Bosilca wrote:
>>
>> Also my processes, despite the fact that I asked for 1 per node, are not
>> bound to the first core. Shouldn't we release the process binding when we
>> know there is a single process per node (as in the above case)?
>>
>> did you expect the tasks to be bound to the first *core* on each node?
>>
>> i would expect the tasks to be bound to the first *socket* on each node.
>>
>> In this particular instance, where it has been explicitly requested to have
>> a single process per node, I would have expected the process to be unbound
>> (we know there is only one per node). It is the responsibility of the
>> application to bind itself or its threads if necessary. Why are we
>> enforcing a particular binding policy?
>>
>> since we do not know how many (OpenMP or other) threads will be used by the
>> application, --bind-to socket is a good policy imho. in this case (one task
>> per node), no binding at all would mean the task can migrate from one
>> socket to the other, and/or OpenMP threads are bound across sockets. That
>> would trigger some NUMA effects (better bandwidth if memory is accessed
>> locally, but worse performance if memory is allocated on only one socket).
>> so imho, --bind-to socket is still my preferred policy, even if there is
>> only one MPI task per node.
>>
>> Open MPI is about MPI ranks/processes. I don't think it is our job to try
>> to figure out what the user does with their own threads.
>>
>> Your justification makes sense if the application only uses a single
>> socket. It also makes sense if one starts multiple ranks per node, and the
>> internal threads of each MPI process inherit the MPI process binding.
>> However, in the case where there is a single process per node, because
>> there is a mismatch between the number of resources available (hardware
>> threads) and the binding of the parent process, all the threads of the MPI
>> application are [by default] bound on a single socket.
>>
>> George.
>>
>> PS: That being said, I think I'll need to implement the binding code
>> anyway in order to deal with the wide variety of behaviors in the
>> different MPI implementations.
>>
>> Cheers,
>>
>> Gilles
>>
>> _______________________________________________
>> devel mailing list
>> de...@open-mpi.org
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>> Link to this post:
>> http://www.open-mpi.org/community/lists/devel/2016/04/18758.php
>>
>> _______________________________________________
>> devel mailing list
>> de...@open-mpi.org
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>> Link to this post:
>> http://www.open-mpi.org/community/lists/devel/2016/04/18759.php
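For readers following the thread, the mapping and binding options under discussion can be sketched as the mpirun invocations below. This is an illustrative sketch only: the executable name `./a.out` and the rank count are placeholders, and whether `--bind-to socket` is the effective default for a given launch depends on the Open MPI version and node topology.

```shell
# One rank per node, relying on the default binding policy
# (the behavior George observed):
mpirun --map-by node -np 2 ./a.out

# Make the policy Gilles prefers explicit: bind each rank to a socket,
# so its threads stay on one NUMA domain:
mpirun --map-by node --bind-to socket -np 2 ./a.out

# The behavior George expected with one rank per node: leave the rank
# unbound so its threads can spread across the whole node:
mpirun --map-by node --bind-to none -np 2 ./a.out

# Print the binding actually applied to each rank, to verify either way:
mpirun --map-by node --bind-to socket --report-bindings -np 2 ./a.out
```

`--report-bindings` is the easiest way to see which policy a given build applies by default before deciding whether to override it.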