FWIW: speaking just to the --map-by node issue, Josh Ladd reported the problem 
on master as well yesterday. I'll be looking into it on Wed.

> On Apr 12, 2016, at 5:53 PM, George Bosilca <bosi...@icl.utk.edu> wrote:
> 
> On Wed, Apr 13, 2016 at 1:59 AM, Gilles Gouaillardet <gil...@rist.or.jp> wrote:
> George,
> 
> about the process binding part
> 
> On 4/13/2016 7:32 AM, George Bosilca wrote:
> Also my processes, despite the fact that I asked for one per node, are not 
> bound to the first core. Shouldn't we release the process binding when we 
> know there is a single process per node (as in the above case)?
> Did you expect the tasks to be bound to the first *core* on each node?
> 
> I would expect the tasks to be bound to the first *socket* on each node.
> 
> In this particular instance, where it has been explicitly requested to have a 
> single process per node, I would have expected the process to be unbound (we 
> know there is only one per node). It is the responsibility of the application 
> to bind itself or its threads if necessary. Why are we enforcing a particular 
> binding policy?
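> 
> For instance, a minimal sketch of what I mean by the application binding 
> itself (Linux/glibc; the helper name bind_self_to_core is just for 
> illustration):
> 
>     #define _GNU_SOURCE
>     #include <pthread.h>
>     #include <sched.h>
> 
>     /* Bind the calling thread to one core; error handling omitted. */
>     static int bind_self_to_core(int core)
>     {
>         cpu_set_t set;
>         CPU_ZERO(&set);
>         CPU_SET(core, &set);
>         return pthread_setaffinity_np(pthread_self(), sizeof(set), &set);
>     }
> 
> An application (or its OpenMP runtime) can do the equivalent for each of 
> its threads.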
> 
> Since we do not know how many (OpenMP or other) threads the application will 
> use, --bind-to socket is a good policy IMHO. In this case (one task per 
> node), no binding at all would mean the task can migrate from one socket to 
> the other, and/or the OpenMP threads are bound across sockets. That would 
> trigger NUMA effects (better bandwidth if memory is accessed locally, but 
> worse performance if memory is allocated on only one socket).
> So IMHO, --bind-to socket is still my preferred policy, even if there is 
> only one MPI task per node.
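> 
> To make that concrete, here is a sketch (assuming Linux first-touch page 
> placement): pages end up on the NUMA node of the thread that first writes 
> them, so a parallel initialization spreads the data across sockets, while a 
> serial one puts everything on one socket.
> 
>     #include <stddef.h>
> 
>     /* Parallel first touch: each page is faulted in by the thread that
>        will later use it, so it is allocated on that thread's socket. */
>     void init_first_touch(double *a, size_t n)
>     {
>         #pragma omp parallel for schedule(static)
>         for (size_t i = 0; i < n; i++)
>             a[i] = 0.0;
>     }
> 
> If the tasks are not bound at all, the threads can migrate after the first 
> touch and the locality benefit is lost.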
> 
> Open MPI is about MPI ranks/processes. I don't think it is our job to try to 
> figure out what the user does with their own threads.
> 
> Your justification makes sense if the application only uses a single socket. 
> It also makes sense if one starts multiple ranks per node, and the internal 
> threads of each MPI process inherit the MPI process binding. However, in the 
> case where there is a single process per node, because there is a mismatch 
> between the number of resources available (hardware threads) and the binding 
> of the parent process, all the threads of the MPI application are [by 
> default] bound to a single socket.
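> 
> This is easy to observe with a toy program (a sketch; compile with -fopenmp 
> and launch it with mpirun --map-by node --report-bindings): with --bind-to 
> socket, every thread reports a CPU from the same socket.
> 
>     #define _GNU_SOURCE
>     #include <sched.h>
>     #include <stdio.h>
>     #include <omp.h>
> 
>     /* Print which CPU each OpenMP thread is running on. */
>     int main(void)
>     {
>         #pragma omp parallel
>         printf("thread %d on cpu %d\n", omp_get_thread_num(), sched_getcpu());
>         return 0;
>     }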
> 
>  George.
> 
> PS: That being said, I think I'll need to implement the binding code anyway 
> in order to deal with the wide variety of behaviors in the different MPI 
> implementations.
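> 
> Probably something along these lines, via hwloc (a sketch; the helper name 
> bind_to_core is mine):
> 
>     #include <hwloc.h>
> 
>     /* Bind the current process to the idx-th core reported by hwloc. */
>     static int bind_to_core(unsigned idx)
>     {
>         hwloc_topology_t topo;
>         hwloc_obj_t core;
>         int rc = -1;
> 
>         hwloc_topology_init(&topo);
>         hwloc_topology_load(topo);
>         core = hwloc_get_obj_by_type(topo, HWLOC_OBJ_CORE, idx);
>         if (core != NULL)
>             rc = hwloc_set_cpubind(topo, core->cpuset, HWLOC_CPUBIND_PROCESS);
>         hwloc_topology_destroy(topo);
>         return rc;
>     }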
> 
> Cheers,
> 
> Gilles
