The --map-by node option should now be fixed on master, with PRs waiting for 1.10 and 2.0.
Thx!

> On Apr 12, 2016, at 6:45 PM, Ralph Castain <r...@open-mpi.org> wrote:
>
> FWIW: speaking just to the --map-by node issue, Josh Ladd reported the
> problem on master as well yesterday. I'll be looking into it on Wed.
>
>> On Apr 12, 2016, at 5:53 PM, George Bosilca <bosi...@icl.utk.edu> wrote:
>>
>> On Wed, Apr 13, 2016 at 1:59 AM, Gilles Gouaillardet <gil...@rist.or.jp> wrote:
>>
>> George,
>>
>> about the process binding part
>>
>> On 4/13/2016 7:32 AM, George Bosilca wrote:
>>
>> Also my processes, despite the fact that I asked for 1 per node, are not
>> bound to the first core. Shouldn't we release the process binding when we
>> know there is a single process per node (as in the above case)?
>>
>> did you expect the tasks to be bound to the first *core* on each node?
>>
>> i would expect the tasks to be bound to the first *socket* on each node.
>>
>> In this particular instance, where it has been explicitly requested to have
>> a single process per node, I would have expected the process to be unbound
>> (we know there is only one per node). It is the responsibility of the
>> application to bind itself or its threads if necessary. Why are we
>> enforcing a particular binding policy?
>>
>> since we do not know how many (OpenMP or other) threads will be used by the
>> application, --bind-to socket is a good policy imho. in this case (one task
>> per node), no binding at all would mean the task can migrate from one
>> socket to the other, and/or OpenMP threads are bound across sockets. That
>> would trigger some NUMA effects (better bandwidth if memory is accessed
>> locally, but worse performance if memory is allocated on only one socket).
>> so imho, --bind-to socket is still my preferred policy, even if there is
>> only one MPI task per node.
>>
>> Open MPI is about MPI ranks/processes. I don't think it is our job to try
>> to figure out what the user does with their own threads.
>>
>> Your justification makes sense if the application only uses a single
>> socket. It also makes sense if one starts multiple ranks per node, and the
>> internal threads of each MPI process inherit the MPI process binding.
>> However, in the case where there is a single process per node, because
>> there is a mismatch between the number of resources available (hardware
>> threads) and the binding of the parent process, all the threads of the MPI
>> application are [by default] bound on a single socket.
>>
>> George.
>>
>> PS: That being said, I think I'll need to implement the binding code
>> anyway in order to deal with the wide variety of behaviors in the
>> different MPI implementations.
>>
>> Cheers,
>>
>> Gilles
>>
>> _______________________________________________
>> devel mailing list
>> de...@open-mpi.org
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>> Link to this post:
>> http://www.open-mpi.org/community/lists/devel/2016/04/18758.php
>>
>> _______________________________________________
>> devel mailing list
>> de...@open-mpi.org
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>> Link to this post:
>> http://www.open-mpi.org/community/lists/devel/2016/04/18759.php
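For readers following the thread, the mapping and binding options under discussion can be sketched as the mpirun invocations below. This is an illustrative sketch only: the executable name `./a.out` and the rank count are placeholders, and whether `--bind-to socket` is the effective default for a given launch depends on the Open MPI version and node topology.

```shell
# One rank per node, relying on the default binding policy
# (the behavior George observed):
mpirun --map-by node -np 2 ./a.out

# Make the policy Gilles prefers explicit: bind each rank to a socket,
# so its threads stay on one NUMA domain:
mpirun --map-by node --bind-to socket -np 2 ./a.out

# The behavior George expected with one rank per node: leave the rank
# unbound so its threads can spread across the whole node:
mpirun --map-by node --bind-to none -np 2 ./a.out

# Print the binding actually applied to each rank, to verify either way:
mpirun --map-by node --bind-to socket --report-bindings -np 2 ./a.out
```

`--report-bindings` is the easiest way to see which policy a given build applies by default before deciding whether to override it.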