Re: [OMPI devel] collective problems

Gleb Natapov Thu, 8 Nov 2007 01:42:23 -0500

On Wed, Nov 07, 2007 at 09:07:23PM -0700, Brian Barrett wrote:
> Personally, I'd rather just not mark MPI completion until a local  
> completion callback from the BTL.  But others don't like that idea, so  
> we came up with a way for back pressure from the BTL to say "it's not  
> on the wire yet".  This is more complicated than just not marking MPI  
> completion early, but why would we do something that helps real apps  
> at the expense of benchmarks?  That would just be silly!
> 
I fully agree with Brian here. Trying to solve the issue with the
current approach will introduce additional checks in the fast path and
will only hurt real apps.
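
To make the trade-off concrete, here is a minimal sketch in C of the
three-way return described in the quoted thread below ("ok, done!",
"doh -- error!", "not done yet!"). Every name in it (toy_btl_send,
TOY_BTL_NOT_ON_WIRE, toy_pml_send and so on) is made up for
illustration; this is not the actual Open MPI BTL or PML interface, and
the single-slot buffer is an assumption that keeps the example short.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical status codes, NOT the real Open MPI constants. */
typedef enum {
    TOY_BTL_OK = 0,          /* "ok, done!": fragment was put on the wire   */
    TOY_BTL_ERROR = -1,      /* "doh -- error!"                             */
    TOY_BTL_NOT_ON_WIRE = 1  /* "not done yet!": accepted, but only buffered */
} toy_btl_status_t;

typedef struct toy_request {
    bool mpi_complete;       /* true once the request is complete at MPI level */
} toy_request_t;

typedef void (*toy_completion_cb_t)(toy_request_t *req);

typedef struct toy_btl {
    int credits;             /* flow-control credits currently available      */
    toy_request_t *pending;  /* one buffered request (toy queue of depth one) */
    toy_completion_cb_t cb;  /* local completion callback into the "PML"      */
} toy_btl_t;

/* Local completion callback: the buffered send has finally gone out. */
void toy_pml_on_local_completion(toy_request_t *req)
{
    assert(req != NULL);
    req->mpi_complete = true;
}

/* Send if a credit is available, otherwise buffer and report it. */
toy_btl_status_t toy_btl_send(toy_btl_t *btl, toy_request_t *req)
{
    if (btl->credits > 0) {
        btl->credits--;
        return TOY_BTL_OK;
    }
    if (btl->pending != NULL) {
        return TOY_BTL_ERROR;        /* toy buffer is already full */
    }
    btl->pending = req;
    return TOY_BTL_NOT_ON_WIRE;
}

/* Called when the peer returns credits: flush the buffered request. */
void toy_btl_progress(toy_btl_t *btl, int returned_credits)
{
    btl->credits += returned_credits;
    if (btl->pending != NULL && btl->credits > 0) {
        toy_request_t *req = btl->pending;
        btl->pending = NULL;
        btl->credits--;
        btl->cb(req);                /* completion is signalled only now */
    }
}

/* "PML" side: mark completion early only when the BTL says "ok, done!". */
toy_btl_status_t toy_pml_send(toy_btl_t *btl, toy_request_t *req)
{
    toy_btl_status_t rc = toy_btl_send(btl, req);
    if (rc == TOY_BTL_OK) {
        req->mpi_complete = true;    /* the extra check sits on the fast path */
    }
    /* NOT_ON_WIRE or ERROR: leave the request incomplete. */
    return rc;
}
```

With one credit available, the first toy_pml_send() returns TOY_BTL_OK
and the request completes at once; the second returns
TOY_BTL_NOT_ON_WIRE and completes only after toy_btl_progress() hands a
credit back. The branch on rc in toy_pml_send() is the kind of extra
fast-path check I mean. The alternative Brian prefers would drop the
early completion and mark the request complete only from the callback.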


> Brian
> 
> On Nov 7, 2007, at 7:56 PM, Richard Graham wrote:
> 
> > Does this mean that we don’t have a queue to store btl level
> > descriptors that are only partially complete?  Do we do an all or
> > nothing with respect to btl level requests at this stage?
> >
> > Seems to me like we want to mark things complete at the MPI level
> > ASAP, and that this proposal is not to do that – is this correct?
> >
> > Rich
> >
> >
> > On 11/7/07 11:26 PM, "Jeff Squyres" <jsquy...@cisco.com> wrote:
> >
> >> On Nov 7, 2007, at 9:33 PM, Patrick Geoffray wrote:
> >>
> >> >> Remember that this is all in the context of Galen's proposal for
> >> >> btl_send() to be able to return NOT_ON_WIRE -- meaning that the
> >> >> send was successful, but it has not yet been sent (e.g., openib BTL
> >> >> buffered it because it ran out of credits).
> >> >
> >> > Sorry if I miss something obvious, but why does the PML have to be
> >> > aware of the flow control situation of the BTL?  If the BTL cannot
> >> > send something right away for any reason, it should be the
> >> > responsibility of the BTL to buffer it and to progress on it later.
> >>
> >>
> >> That's currently the way it is.  But the BTL currently only has the
> >> option to say two things:
> >>
> >> 1. "ok, done!" -- then the PML will think that the request is complete
> >> 2. "doh -- error!" -- then the PML thinks that Something Bad Happened(tm)
> >>
> >> What we really need is for the BTL to have a third option:
> >>
> >> 3. "not done yet!"
> >>
> >> So that the PML knows that the request is not yet done, but will
> >> allow other things to progress while we're waiting for it to complete.
> >> Without this, the openib BTL currently replies "ok, done!", even when
> >> it has only buffered a message (rather than actually sending it out).
> >> This optimization works great (yeah, I know...) except for apps that
> >> don't dip into the MPI library frequently.  :-\
> >>
> >> --
> >> Jeff Squyres
> >> Cisco Systems
> >>
> >> _______________________________________________
> >> devel mailing list
> >> de...@open-mpi.org
> >> http://www.open-mpi.org/mailman/listinfo.cgi/devel
> >>
> 

--
                        Gleb.
