Does this mean that we don't have a queue to store BTL-level descriptors that are only partially complete? Do we do an all or nothing with respect to BTL-level requests at this stage?
It seems to me that we want to mark things complete at the MPI level ASAP, and that this proposal is not to do that. Is this correct?

Rich

On 11/7/07 11:26 PM, "Jeff Squyres" <jsquy...@cisco.com> wrote:

> On Nov 7, 2007, at 9:33 PM, Patrick Geoffray wrote:
>
>>> Remember that this is all in the context of Galen's proposal for
>>> btl_send() to be able to return NOT_ON_WIRE -- meaning that the send
>>> was successful, but it has not yet been sent (e.g., openib BTL
>>> buffered it because it ran out of credits).
>>
>> Sorry if I miss something obvious, but why does the PML has to be
>> aware of the flow control situation of the BTL? If the BTL cannot
>> send something right away for any reason, it should be the
>> responsibility of the BTL to buffer it and to progress on it later.
>
> That's currently the way it is. But the BTL currently only has the
> option to say two things:
>
> 1. "ok, done!" -- then the PML will think that the request is complete
> 2. "doh -- error!" -- then the PML thinks that Something Bad
>    Happened(tm)
>
> What we really need is for the BTL to have a third option:
>
> 3. "not done yet!"
>
> So that the PML knows that the request is not yet done, but will allow
> other things to progress while we're waiting for it to complete.
> Without this, the openib BTL currently replies "ok, done!", even when
> it has only buffered a message (rather than actually sending it out).
> This optimization works great (yeah, I know...) except for apps that
> don't dip into the MPI library frequently. :-\
>
> --
> Jeff Squyres
> Cisco Systems
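
To make the three outcomes quoted above concrete, here is a rough sketch in C of how I read them. Every name in it (the `sketch_*` types and functions, the credit counter) is hypothetical and made up for illustration; this is not the actual BTL or PML interface, just a toy model of "done", "error", and "not done yet" under the assumption that a send without credits gets buffered inside the BTL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical return codes; NOT the real Open MPI constants. */
typedef enum {
    SKETCH_SEND_DONE,        /* 1. "ok, done!"     */
    SKETCH_SEND_ERROR,       /* 2. "doh -- error!" */
    SKETCH_SEND_NOT_ON_WIRE  /* 3. "not done yet!" */
} sketch_send_rc_t;

/* Hypothetical PML-level request state. */
typedef struct {
    bool complete;  /* request may be reported complete at the MPI level */
    bool failed;    /* something bad happened */
} sketch_request_t;

/* Hypothetical BTL state with simple credit-based flow control. */
typedef struct {
    int credits;   /* sends we may still put on the wire */
    int buffered;  /* fragments queued inside the BTL */
} sketch_btl_t;

/* Toy BTL send: goes out if a credit is available, otherwise it is
 * buffered and the caller is told it is not on the wire yet. */
static sketch_send_rc_t sketch_btl_send(sketch_btl_t *btl, size_t len)
{
    assert(btl != NULL);
    if (len == 0) {
        return SKETCH_SEND_ERROR;  /* stand-in for a real failure */
    }
    if (btl->credits > 0) {
        btl->credits--;
        return SKETCH_SEND_DONE;
    }
    btl->buffered++;
    return SKETCH_SEND_NOT_ON_WIRE;
}

/* Toy PML send: only marks the request complete on "done". */
static void sketch_pml_send(sketch_btl_t *btl, sketch_request_t *req,
                            size_t len)
{
    switch (sketch_btl_send(btl, len)) {
    case SKETCH_SEND_DONE:
        req->complete = true;
        break;
    case SKETCH_SEND_ERROR:
        req->failed = true;
        break;
    case SKETCH_SEND_NOT_ON_WIRE:
        /* Leave the request pending; other work can progress. */
        break;
    }
}

/* Toy progress step: a credit came back, so one buffered fragment is
 * pushed out and its request is completed. */
static void sketch_btl_credit_returned(sketch_btl_t *btl,
                                       sketch_request_t *req)
{
    if (btl->buffered > 0) {
        btl->buffered--;
        req->complete = true;
    } else {
        btl->credits++;
    }
}
```

In this toy model the second send on a BTL with a single credit stays pending until `sketch_btl_credit_returned()` runs, which is exactly the delay in MPI-level completion I am asking about.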