Re: [OMPI devel] collective problems

Richard Graham Wed, 7 Nov 2007 22:56:51 -0500

Does this mean that we don't have a queue to store btl level descriptors
that are only partially complete?  Do we do an all or nothing with
respect to btl level requests at this stage?


Seems to me like we want to mark things complete at the MPI level ASAP,
and that this proposal is not to do that -- is this correct?

Rich


On 11/7/07 11:26 PM, "Jeff Squyres" <jsquy...@cisco.com> wrote:

> On Nov 7, 2007, at 9:33 PM, Patrick Geoffray wrote:
> 
>>> Remember that this is all in the context of Galen's proposal for
>>> btl_send() to be able to return NOT_ON_WIRE -- meaning that the send
>>> was successful, but it has not yet been sent (e.g., openib BTL
>>> buffered it because it ran out of credits).
>>
>> Sorry if I miss something obvious, but why does the PML have to be
>> aware of the flow control situation of the BTL? If the BTL cannot send
>> something right away for any reason, it should be the responsibility
>> of the BTL to buffer it and to progress on it later.
> 
> 
> That's currently the way it is.  But the BTL currently only has the
> option to say two things:
> 
> 1. "ok, done!" -- then the PML will think that the request is complete
> 2. "doh -- error!" -- then the PML thinks that Something Bad Happened(tm)
> 
> What we really need is for the BTL to have a third option:
> 
> 3. "not done yet!"
> 
> So that the PML knows that the request is not yet done, but will allow
> other things to progress while we're waiting for it to complete.
> Without this, the openib BTL currently replies "ok, done!", even when
> it has only buffered a message (rather than actually sending it out).
> This optimization works great (yeah, I know...) except for apps that
> don't dip into the MPI library frequently.  :-\
> 
> --
> Jeff Squyres
> Cisco Systems
>
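
For illustration, the three return options discussed in the quoted message
can be sketched in C roughly as follows. This is only a toy model under
stated assumptions, not Open MPI code: every name here (toy_btl,
toy_btl_send, toy_btl_progress, the TOY_SEND_* codes, the fixed-size
queue) is hypothetical, and the real BTL/PML interfaces differ. The point
is just that a buffered send reports "not done yet" and its completion
flag flips only once credits return.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical status codes; NOT the real Open MPI constants. */
typedef enum {
    TOY_SEND_DONE,        /* 1. "ok, done!": the message went out */
    TOY_SEND_ERROR,       /* 2. "doh -- error!" */
    TOY_SEND_NOT_ON_WIRE  /* 3. "not done yet!": accepted, only buffered */
} toy_send_status;

#define TOY_QUEUE_MAX 8

/* Toy BTL: a credit counter plus a queue of buffered sends. */
typedef struct {
    int    credits;                 /* flow-control credits left */
    bool  *pending[TOY_QUEUE_MAX];  /* completion flags of buffered sends */
    size_t npending;
} toy_btl;

/* Try to send. With a credit available the send completes at once;
 * without one it is buffered and the caller is told it is not on the
 * wire yet, so the upper layer must not mark the request complete. */
toy_send_status toy_btl_send(toy_btl *btl, bool *complete)
{
    *complete = false;
    if (btl->credits > 0) {
        btl->credits--;
        *complete = true;
        return TOY_SEND_DONE;
    }
    if (btl->npending == TOY_QUEUE_MAX) {
        return TOY_SEND_ERROR;      /* no room left to buffer */
    }
    btl->pending[btl->npending++] = complete;
    return TOY_SEND_NOT_ON_WIRE;
}

/* Called when credits come back: push out buffered sends in order and
 * set their completion flags. Returns how many were flushed. */
size_t toy_btl_progress(toy_btl *btl, int new_credits)
{
    size_t sent = 0;
    btl->credits += new_credits;
    while (sent < btl->npending && btl->credits > 0) {
        btl->credits--;
        *btl->pending[sent] = true;
        sent++;
    }
    /* Shift whatever is still buffered to the front of the queue. */
    for (size_t i = sent; i < btl->npending; i++) {
        btl->pending[i - sent] = btl->pending[i];
    }
    btl->npending -= sent;
    return sent;
}
```

In this sketch the caller (standing in for the PML) treats a request as
complete only when its flag is set, so a NOT_ON_WIRE send stays pending
until toy_btl_progress() runs, which is the situation that hurts apps
that rarely dip into the library.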
