Like I said, why don't we just do the following:

> I'd like to suggest an alternative solution. A BTL can exploit whatever data 
> it wants, but should first test if the data is available. If the data is 
> *required*, then the BTL gracefully disqualifies itself. If the data is 
> *desirable* for optimization, then the BTL writer (if they choose) can 
> include an alternate path that doesn't do the optimization if the data isn't 
> available.

Seems like this should resolve the disagreement in a way that meets everyone's 
needs. It is basically an attribute approach, but one that doesn't require 
modifying the BTL interface.
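
To make the two cases concrete, here is a rough sketch in C of what I have in 
mind. Everything in it is hypothetical: the job_hints_t struct, the function 
names, the return codes, and the mailbox sizes are placeholders I made up for 
illustration, not existing OPAL or BTL symbols. The sizing direction follows 
Howard's description (big mailboxes for small jobs, small mailboxes for large 
ones), but the actual numbers are arbitrary.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical container for optional job data; NOT a real OPAL structure. */
typedef struct {
    bool   have_num_procs;  /* true only if the job size was provided */
    size_t num_procs;       /* meaningful only when have_num_procs is true */
} job_hints_t;

/* Hypothetical return codes for a BTL init routine. */
#define BTL_INIT_OK            0
#define BTL_INIT_DISQUALIFIED -1

/* Arbitrary illustrative sizes and threshold (assumptions, not uGNI values). */
#define MBOX_SMALL          ((size_t)512)
#define MBOX_DEFAULT        ((size_t)1024)
#define MBOX_LARGE          ((size_t)8192)
#define LARGE_JOB_THRESHOLD ((size_t)1024)

/* "Desirable" case: use the job size if present, else take a fallback path. */
int btl_optional_init(const job_hints_t *hints, size_t *mailbox_size)
{
    if (NULL == hints || !hints->have_num_procs) {
        /* Data not available: skip the optimization, use a fixed default. */
        *mailbox_size = MBOX_DEFAULT;
        return BTL_INIT_OK;
    }
    /* Data available: small mailboxes for big jobs, big ones otherwise. */
    *mailbox_size = (hints->num_procs > LARGE_JOB_THRESHOLD) ? MBOX_SMALL
                                                             : MBOX_LARGE;
    return BTL_INIT_OK;
}

/* "Required" case: without the job size, the BTL disqualifies itself. */
int btl_required_init(const job_hints_t *hints, size_t *mailbox_size)
{
    if (NULL == hints || !hints->have_num_procs) {
        return BTL_INIT_DISQUALIFIED;
    }
    return btl_optional_init(hints, mailbox_size);
}
```

Either way the check lives inside the BTL itself, so the caller's interface 
stays the same whether or not the data is there.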


On Jul 31, 2014, at 8:26 AM, Pritchard Jr., Howard <howa...@lanl.gov> wrote:

> Hi  George,
> 
> The ompi_process_info.num_procs thing seems to have been an object
> of some contention yesterday.
> 
> The ugni use of this is cloned from the way I designed the mpich netmod.
> Leveraging the size of the job was an easy way to scale the mailbox size.
> 
> If I'd been asked to have the netmod work in a context like it appears we
> may want to be eventually using BTLs - not just within ompi but for other
> things, I'd have worked with Darius (if still in mpich world) on changing
> the netmod initialization method to allow for an optional attributes struct
> to be passed into the init method to give hints about how many connections
> may need to be established, etc.
> 
> For the GNI BTL - the way it's currently designed - if you are only expecting
> to use it for a limited number of connections, then you want to initialize
> for big mailboxes (IBers can think of many large buffers posted as RX WQEs).
> But for very large jobs, with a possibly highly connected communication
> pattern, you want very small mailboxes.
> 
> Howard
> 
> 
> -----Original Message-----
> From: devel [mailto:devel-boun...@open-mpi.org] On Behalf Of George Bosilca
> Sent: Thursday, July 31, 2014 9:09 AM
> To: Open MPI Developers
> Subject: Re: [OMPI devel] RFC: job size info in OPAL
> 
> What is your definition of "global job size"?
> 
>  George.
> 
> On Jul 31, 2014, at 11:06 , Pritchard Jr., Howard <howa...@lanl.gov> wrote:
> 
>> Hi Folks,
>> 
>> I think given the way we want to use the btl's in lower levels like 
>> opal, it is pretty disgusting for a btl to need to figure out on its 
>> own something like a "global job size".  That's not its business.  
>> Can't we add some attributes to the component's initialization method 
>> that provide hints for how to allocate the resources it needs to provide 
>> its functionality?
>> 
>> I'll see if there's something clever that can be done in ugni for now.
>> I can always add in a hack to probe the apps placement info file and 
>> scale the smsg blocks by number of nodes rather than number of ranks.
>> 
>> Howard
>> 
>> 
>> -----Original Message-----
>> From: devel [mailto:devel-boun...@open-mpi.org] On Behalf Of Nathan Hjelm
>> Sent: Thursday, July 31, 2014 8:58 AM
>> To: Open MPI Developers
>> Subject: Re: [OMPI devel] RFC: job size info in OPAL
>> 
>> 
>> +2^10000000
>> 
>> This information is absolutely necessary at this point. If someone has a 
>> better solution they can provide it as an alternative RFC. Until then this 
>> is how it should be done... Otherwise we lose uGNI support on the trunk, 
>> because we ARE NOT going to remove the mailbox size optimization.
>> 
>> -Nathan
>> 
>> On Wed, Jul 30, 2014 at 10:00:18PM +0000, Jeff Squyres (jsquyres) wrote:
>>> WHAT: Should we make the job size (i.e., initial number of procs) available 
>>> in OPAL?
>>> 
>>> WHY: At least 2 BTLs are using this info (*more below)
>>> 
>>> WHERE: usnic and ugni
>>> 
>>> TIMEOUT: there have already been some inflammatory emails about this; 
>>> let's discuss next Tuesday on the teleconf: Tue, 5 Aug 2014
>>> 
>>> MORE DETAIL:
>>> 
>>> This is an open question.  We *have* the information at the time that the 
>>> BTLs are initialized: do we allow that information to go down to OPAL?
>>> 
>>> Ralph added this info down in OPAL in r32355, but George reverted it in 
>>> r32361.
>>> 
>>> Points for: YES, WE SHOULD
>>> +++ 2 BTLs were using it (usnic, ugni)
>>> +++ Other RTE job-related info is already in OPAL (num local ranks, 
>>> local rank)
>>> 
>>> Points for: NO, WE SHOULD NOT
>>> --- What exactly is this number (e.g., num currently-connected procs?), and 
>>> when is it updated?
>>> --- We need to precisely delineate what belongs in OPAL vs. 
>>> above-OPAL
>>> 
>>> FWIW: here's how ompi_process_info.num_procs was used before the BTL move 
>>> down to OPAL:
>>> 
>>> - usnic: for a minor latency optimization / sizing of a shared 
>>> receive buffer queue length, and for the initial size of a peer 
>>> lookup hash
>>> - ugni: to determine the size of the per-peer buffers used for 
>>> send/recv communication
>>> 
>>> --
>>> Jeff Squyres
>>> jsquy...@cisco.com
>>> For corporate legal information go to: 
>>> http://www.cisco.com/web/about/doing_business/legal/cri/
>>> 
>>> _______________________________________________
>>> devel mailing list
>>> de...@open-mpi.org
>>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>> Link to this post: 
>>> http://www.open-mpi.org/community/lists/devel/2014/07/15373.php
> 
