How about adding (yet another) MCA param: 
orte_<whatever>_default_hostfile_slots, which could take several different 
values:

- an integer >0
- num_cores
- num_hwthreads
- num_sockets
- ...could have others here, but I don't know if it's actually useful to be 
overly flexible here

The current default behavior would correspond to setting this value to "1".  
But it could easily be changed to "num_cores" (for example).
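To make that concrete, here's a rough sketch in Python of how such a param 
value could be resolved against hwloc-discovered topology counts. This is 
purely illustrative - the real implementation would live in ORTE's C code, 
and the function name and dict layout below are made up:

```python
def resolve_default_slots(param_value, topology):
    """Map a hypothetical orte_<whatever>_default_hostfile_slots value
    to a slot count for one node.

    topology holds the node's hwloc-discovered counts, e.g.
    {"num_cores": 8, "num_hwthreads": 16, "num_sockets": 2}.
    """
    # Symbolic values ("num_cores", etc.) resolve via the topology info.
    if param_value in topology:
        return topology[param_value]
    # Otherwise the value must be an explicit positive integer.
    slots = int(param_value)
    if slots <= 0:
        raise ValueError("slot count must be a positive integer")
    return slots
```

So today's behavior is `resolve_default_slots("1", topo)`, and the proposed 
alternative default would be `resolve_default_slots("num_cores", topo)`.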


On Sep 2, 2012, at 11:33 AM, Ralph Castain wrote:

> Guess I should have emphasized the "IF" more - I'm talking about setting the 
> slots ONLY in the case where the user didn't provide that information. If the 
> user provides it, then that is what we use. In OMPI, we *always* accept the 
> user as the ultimate decider.
> 
> As things stand, we assume slots=1. This is just as arbitrary as you can get.
> 
> My point was that now that we use hwloc to discover the number of cpus in the 
> system, we could do something more intelligent than just set it to 1 *in the 
> case where we are given no info*.
> 
> Note that there is no intention to set binding policy here. All this impacts 
> is how many procs are mapped to a given node when we use the byslot mapping 
> algo.
> 
> Does that help?
> 
> On Sep 2, 2012, at 6:20 AM, Kenneth A. Lloyd <kenneth.ll...@wattsys.com> 
> wrote:
> 
>> I should note that we only virtualize the private cloud / management nodes
>> over our HPC.  The HPC is not virtualized as such.
>> 
>> Ken
>> 
>> -----Original Message-----
>> From: devel-boun...@open-mpi.org [mailto:devel-boun...@open-mpi.org] On
>> Behalf Of Kenneth A. Lloyd
>> Sent: Sunday, September 02, 2012 7:14 AM
>> To: 'Open MPI Developers'
>> Subject: Re: [OMPI devel] RFC: hostfile setting of #slots
>> 
>> This is a tricky issue, isn't it?  With the differences between AMD & Intel,
>> and between base operating systems "touching" & heaps (betw. Linux &
>> Windows), and various virtual machine schemes - we have opted for an
>> "outside the main code path" solution to get deterministic results. But that
>> is as things are now.  Who knows how stuff like AVX2 / memory mapping - or
>> the next new thing - will affect this?  This is similar to issues we've
>> found with CPU/GPU memory & affinity mapping over IB.  The basis of the
>> decision is (as is often) how much do you trust the user to do the right
>> thing?  What happens if you are wrong?
>> 
>> Only my opinion.
>> 
>> Ken
>> 
>> -----Original Message-----
>> From: devel-boun...@open-mpi.org [mailto:devel-boun...@open-mpi.org] On
>> Behalf Of Ralph Castain
>> Sent: Saturday, September 01, 2012 3:54 AM
>> To: Open MPI Developers
>> Subject: [OMPI devel] RFC: hostfile setting of #slots
>> 
>> This is not a notice of intended change, but rather a solicitation for
>> comment.
>> 
>> We currently default the number of slots on a host provided via hostfile or
>> -host to 1. This is a historical "feature" driven by the fact that our
>> initial launch system spawned daemons on the remote nodes after we had
>> already mapped the processes to them - so we had no way of guessing a
>> reasonable value for the number of slots on any node.
>> 
>> However, the "vm" launch method starts daemons on every node prior to doing
>> the mapping, precisely so we can use the topology in the mapping algorithm.
>> This creates the possibility of setting the number of slots on a node to the
>> number of cpus (either cores or hardware threads, depending on how that flag
>> is set) IF it wasn't provided in the hostfile.
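For reference, a hostfile line can already set slots explicitly, and the 
proposal would only affect lines that omit them:

```
node01 slots=8    # explicit count: always respected as given
node02            # no slots: currently defaults to 1; would become #cpus
```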
>> 
>> This would have an impact on the default "byslot" mapping algorithm. With
>> only one slot/node, byslot essentially equates to bynode mapping. So a
>> user-provided hostfile that doesn't give any info on the number of slots on
>> a node effectively changes the default mapping algorithm to "bynode". This
>> change would alter that behavior and retain a "byslot" algorithm, with the
>> number of slots being the number of cpus.
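To illustrate the point above, here is a minimal sketch of round-robin 
"byslot" placement - hypothetical Python, not the actual ORTE mapper - 
showing how slots=1 makes byslot degenerate into bynode:

```python
def map_byslot(ranks, nodes):
    """Round-robin byslot placement: fill each node's slots before
    moving to the next node, cycling until all ranks are placed.

    nodes is a list of (name, slots) pairs; returns {rank: node name}.
    """
    placement = {}
    rank = 0
    while rank < ranks:
        for name, slots in nodes:
            for _ in range(slots):
                if rank >= ranks:
                    return placement
                placement[rank] = name
                rank += 1
    return placement
```

With slots=1 on each node, consecutive ranks alternate across nodes 
(bynode-like); with slots set to the cpu count, consecutive ranks fill each 
node before spilling to the next.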
>> 
>> I have a use-case that would benefit from making the change, but can handle
>> it outside of the main code path. However, if others would also find it of
>> use, I can add it to the main code path, either as the default or via MCA
>> param.
>> 
>> Any thoughts?
>> Ralph
>> 
>> 
>> _______________________________________________
>> devel mailing list
>> de...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>> 


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/

