This approach will work for now, but we need to start thinking about how
we want to support multiple simultaneous BTL users. Does each user call
add_procs with a single module (or set of modules), or does each user
call btl_component_init and get its own module? If we do the latter, it
might make sense to add a max_procs argument to btl_component_init. Keep
in mind that we need to change the btl_component_init interface anyway,
because the threading arguments no longer make sense in their current
form.
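
A minimal sketch of what a revised entry point might look like (purely
illustrative; the max_procs hint and the single flags word are
assumptions, not an agreed design - the existing function takes num_btls
plus two threading booleans):

    #include <stddef.h>
    #include <stdint.h>

    /* Hypothetical revision of the BTL component-init entry point.
     * max_procs is only a sizing hint (e.g., for mailbox allocation);
     * the old enable_progress_threads/enable_mpi_threads booleans are
     * folded into a single flags word. Names are illustrative. */
    struct mca_btl_base_module_t;

    typedef struct mca_btl_base_module_t **
        (*btl_component_init_fn_t)(int *num_btl_modules,
                                   uint32_t flags,
                                   size_t max_procs);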

-Nathan

On Thu, Jul 31, 2014 at 09:04:09AM -0700, Ralph Castain wrote:
> Like I said, why don't we just do the following:
> 
> > I'd like to suggest an alternative solution. A BTL can exploit whatever
> > data it wants, but it should first test whether the data is available.
> > If the data is *required*, then the BTL gracefully disqualifies itself.
> > If the data is *desirable* for optimization, then the BTL writer (if
> > they choose) can include an alternate path that skips the optimization
> > when the data isn't available.
> 
> Seems like this should resolve the disagreement in a way that meets 
> everyone's needs. It is basically an attribute approach, but without 
> requiring modification of the BTL interface.
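
A minimal sketch of that pattern, assuming a hypothetical
runtime_get_job_size() query; none of these names are actual Open MPI
API:

    #include <stdbool.h>
    #include <stddef.h>

    /* Stub for illustration: pretend the runtime can't say. */
    static bool runtime_get_job_size(size_t *num_procs)
    {
        (void) num_procs;
        return false;
    }

    /* required == true:  disqualify gracefully when the data is missing.
     * required == false: fall back to a conservative default instead. */
    static int btl_job_size_hint(bool required, size_t *num_procs)
    {
        if (runtime_get_job_size(num_procs)) {
            return 0;       /* data available: take the optimized path */
        }
        if (required) {
            return -1;      /* component disqualifies itself */
        }
        *num_procs = 1024;  /* invented conservative fallback */
        return 0;
    }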
> 
> 
> On Jul 31, 2014, at 8:26 AM, Pritchard Jr., Howard <howa...@lanl.gov> wrote:
> 
> > Hi George,
> > 
> > The ompi_process_info.num_procs thing seems to have been an object
> > of some contention yesterday.
> > 
> > The ugni use of this is cloned from the way I designed the MPICH netmod.
> > Leveraging the size of the job was an easy way to scale the mailbox size.
> > 
> > If I'd been asked to make the netmod work in the kind of context where
> > it appears we may eventually be using BTLs - not just within OMPI but
> > for other things - I'd have worked with Darius (were I still in the
> > MPICH world) on changing the netmod initialization method to allow an
> > optional attributes struct to be passed into the init method, giving
> > hints about how many connections may need to be established, etc.
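
For illustration, such a hints struct might look like this (an invented
type; every field here is an assumption about what hints would be
useful):

    #include <stddef.h>

    /* Hypothetical hints struct passed to a netmod/BTL init method.
     * Zero in any field means "no hint"; the component then picks its
     * own defaults. */
    struct btl_init_hints {
        size_t expected_peers;      /* likely number of connections   */
        size_t max_procs;           /* upper bound on the job size    */
        size_t per_peer_buf_bytes;  /* preferred per-peer buffer size */
    };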
> > 
> > For the GNI BTL - the way it's currently designed - if you are only
> > expecting to use it for a limited number of connections, then you want
> > to initialize with big mailboxes (IB folks can think of these as many
> > large buffers posted as RX WQEs). But for very large jobs, with a
> > possibly highly connected communication pattern, you want very small
> > mailboxes.
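
That tradeoff fits in a few lines. This sketch (invented constants and
function name, not the actual ugni code) caps total mailbox memory per
process, so the per-peer mailbox shrinks as the expected peer count
grows:

    #include <stddef.h>

    #define MBOX_MEM_BUDGET (4u * 1024 * 1024) /* invented 4 MiB cap   */
    #define MBOX_MIN_BYTES  512u               /* invented lower bound */
    #define MBOX_MAX_BYTES  (64u * 1024)       /* invented upper bound */

    /* Few peers => big mailboxes; many peers => small mailboxes. */
    static size_t mailbox_bytes(size_t expected_peers)
    {
        size_t bytes = MBOX_MEM_BUDGET / (expected_peers ? expected_peers : 1);
        if (bytes > MBOX_MAX_BYTES) bytes = MBOX_MAX_BYTES;
        if (bytes < MBOX_MIN_BYTES) bytes = MBOX_MIN_BYTES;
        return bytes;
    }

With these (invented) numbers, a 64-process job gets the full 64 KiB
mailbox, while an 8192-process job drops to 512-byte mailboxes.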
> > 
> > Howard
> > 
> > 
> > -----Original Message-----
> > From: devel [mailto:devel-boun...@open-mpi.org] On Behalf Of George Bosilca
> > Sent: Thursday, July 31, 2014 9:09 AM
> > To: Open MPI Developers
> > Subject: Re: [OMPI devel] RFC: job size info in OPAL
> > 
> > What is your definition of "global job size"?
> > 
> >  George.
> > 
> > On Jul 31, 2014, at 11:06 , Pritchard Jr., Howard <howa...@lanl.gov> wrote:
> > 
> >> Hi Folks,
> >> 
> >> I think, given the way we want to use the BTLs in lower levels like 
> >> opal, it is pretty disgusting for a BTL to need to figure out on its 
> >> own something like a "global job size". That's not its business. 
> >> Can't we add some attributes to the component's initialization method 
> >> that provide hints for how to allocate the resources it needs to 
> >> provide its functionality?
> >> 
> >> I'll see if there's something clever that can be done in ugni for now.
> >> I can always add a hack that probes the app's placement info file and 
> >> scales the smsg blocks by the number of nodes rather than the number 
> >> of ranks.
> >> 
> >> Howard
> >> 
> >> 
> >> -----Original Message-----
> >> From: devel [mailto:devel-boun...@open-mpi.org] On Behalf Of Nathan Hjelm
> >> Sent: Thursday, July 31, 2014 8:58 AM
> >> To: Open MPI Developers
> >> Subject: Re: [OMPI devel] RFC: job size info in OPAL
> >> 
> >> 
> >> +2^10000000
> >> 
> >> This information is absolutely necessary at this point. If someone has 
> >> a better solution they can provide it as an alternative RFC. Until 
> >> then, this is how it should be done... Otherwise we lose uGNI support 
> >> on the trunk, because we ARE NOT going to remove the mailbox-size 
> >> optimization.
> >> 
> >> -Nathan
> >> 
> >> On Wed, Jul 30, 2014 at 10:00:18PM +0000, Jeff Squyres (jsquyres) wrote:
> >>> WHAT: Should we make the job size (i.e., initial number of procs) 
> >>> available in OPAL?
> >>> 
> >>> WHY: At least 2 BTLs are using this info (*more below)
> >>> 
> >>> WHERE: usnic and ugni
> >>> 
> >>> TIMEOUT: there have already been some inflammatory emails about this; 
> >>> let's discuss on next Tuesday's teleconf: Tue, 5 Aug 2014
> >>> 
> >>> MORE DETAIL:
> >>> 
> >>> This is an open question.  We *have* the information at the time that the 
> >>> BTLs are initialized: do we allow that information to go down to OPAL?
> >>> 
> >>> Ralph added this info down in OPAL in r32355, but George reverted it in 
> >>> r32361.
> >>> 
> >>> Points for: YES, WE SHOULD
> >>> +++ 2 BTLs were using it (usnic, ugni)
> >>> +++ Other RTE job-related info is already in OPAL (num local ranks, 
> >>>     local rank)
> >>> 
> >>> Points for: NO, WE SHOULD NOT
> >>> --- What exactly is this number (e.g., num currently-connected 
> >>>     procs?), and when is it updated?
> >>> --- We need to precisely delineate what belongs in OPAL vs. above-OPAL
> >>> 
> >>> FWIW: here's how ompi_process_info.num_procs was used before the BTL 
> >>> move down to OPAL:
> >>> 
> >>> - usnic: for a minor latency optimization / sizing of a shared 
> >>>   receive buffer queue length, and for the initial size of a 
> >>>   peer-lookup hash
> >>> - ugni: to determine the size of the per-peer buffers used for 
> >>>   send/recv communication
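
For example, the hash-sizing case is just picking an initial capacity
from the job size so the table doesn't rehash during startup
(illustrative only, not the actual usnic code):

    #include <stddef.h>

    /* Round the job size up to a power of two for the initial
     * peer-lookup table capacity. The minimum of 16 is invented. */
    static size_t peer_table_capacity(size_t num_procs)
    {
        size_t cap = 16;
        while (cap < num_procs) {
            cap <<= 1;
        }
        return cap;
    }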
> >>> 
> >>> --
> >>> Jeff Squyres
> >>> jsquy...@cisco.com