On 1/20/09 8:53 PM, "Jeff Squyres" <jsquy...@cisco.com> wrote:

> This all sounds really great to me.  I agree with most of what has
> been said -- e.g., benchmarks *are* important.  Improving them can
> even sometimes have the side effect of improving real applications.  ;-)
> 
> My one big concern is the shift in architectural boundaries that comes
> with making the BTL understand MPI match headers.  But even there, I'm torn:
> 
> 1. I understand why it is better -- performance-wise -- to do this.
> And the performance improvement results are hard to argue with.  We
> took a similar approach with ORTE; ORTE is now OMPI-specific, and
> many, many things have become better (from the OMPI perspective, at
> least).
> 
> 2. We all have the knee-jerk reaction that we don't want the BTLs to
> know anything about MPI semantics, because they've always been
> MPI-agnostic and that has been a useful abstraction barrier.  Now there's
> even a project afoot to move the BTLs out into a separate layer that
> cannot know about MPI (so that other things can be built upon it).
> But are we sacrificing potential MPI performance here?  I think that's
> one important question.
> 
> Eugene: you mentioned that there are alternatives to having the
> BTL understand match headers, such as a callback into the PML.  Have
> you tried this approach to see what the performance cost would be,
> perchance?

How is this different from the way matching is done today?

Rich
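
For reference, here is one rough shape the callback Jeff mentions could
take. This is only a sketch to anchor the discussion -- none of these
names exist in the tree -- but the point is that the BTL keeps treating
the match header as opaque and hands it to a function the PML registered:

    #include <stddef.h>

    struct match_hdr;                       /* opaque to the BTL */

    /* Hypothetical: registered by the PML at init time.  Returns the
     * matched receive request, or NULL if the PML queued the fragment
     * as unexpected. */
    typedef void *(*pml_match_cb_fn_t)(struct match_hdr *hdr,
                                        void *payload, size_t len);

    typedef struct btl_module {
        /* ... existing BTL entry points ... */
        pml_match_cb_fn_t pml_match_cb;
    } btl_module_t;

    /* BTL progress loop stays MPI-agnostic: it never parses the header. */
    static inline void btl_handle_frag(btl_module_t *btl,
                                       struct match_hdr *hdr,
                                       void *payload, size_t len)
    {
        (void) btl->pml_match_cb(hdr, payload, len);
    }

The cost question is the one Jeff raises: an indirect call per fragment
versus letting the BTL do the match inline.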

> 
> I'd like to see George's reaction to this RFC, and Brian's (if he has
> time).
> 
> 
> On Jan 20, 2009, at 8:04 PM, Eugene Loh wrote:
> 
>> Patrick Geoffray wrote:
>> 
>>> Eugene Loh wrote:
>>> 
>>> 
>>>>> replace the FIFOs with a single linked list per process in shared
>>>>> memory, with senders to this process adding match envelopes
>>>>> atomically, with each process reading its own linked list (multiple
>>>>> 
>>>>> 
>>>> *) Doesn't strike me as a "simple" change.
>>>> 
>>>> 
>>> Actually, it's much simpler than trying to optimize/scale the N^2
>>> implementation, IMHO.
>>> 
>>> 
>> 1) The version I talk about is already done. Check my putbacks.
>> "Already done" is easier! :^)
>> 
>> 2) The two ideas are largely orthogonal. The RFC talks about a variety
>> of things: cleaning up the sendi function, moving the sendi call up
>> higher in the PML, bypassing the PML receive-request structure (similar
>> to sendi), and streamlining the data convertors in common cases. Only
>> one part of the RFC (directed polling) overlaps with having a single
>> FIFO per receiver.
>> 
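To make the sendi piece above concrete, the fast path Eugene describes
amounts to something like the sketch below. The names are made up (this is
not the actual PML code); the idea is just that a small, contiguous send
is offered to the BTL's send-immediate entry point before any PML send
request is allocated, with the normal path as fallback:

    #include <stddef.h>

    #define SM_SUCCESS 0

    /* Hypothetical BTL with a send-immediate entry point.  It returns
     * SM_SUCCESS if it took the message inline, or a negative value if
     * the caller must fall back (message too big, no resources, ...). */
    typedef struct btl_module {
        int (*btl_sendi)(struct btl_module *btl, const void *buf,
                         size_t len, int dst, int tag);
    } btl_module_t;

    /* The ordinary eager/rendezvous path with a full send request. */
    int pml_send_request_start(btl_module_t *btl, const void *buf,
                               size_t len, int dst, int tag);

    int pml_send_fastpath(btl_module_t *btl, const void *buf, size_t len,
                          int dst, int tag)
    {
        if (btl->btl_sendi != NULL &&
            btl->btl_sendi(btl, buf, len, dst, tag) == SM_SUCCESS) {
            return SM_SUCCESS;      /* no send request ever allocated */
        }
        return pml_send_request_start(btl, buf, len, dst, tag);
    }
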
>>>> *) Not sure this addresses all-to-all well.  E.g., let's say you post
>>>> a receive for a particular source.  Do you then wade through a long
>>>> FIFO to look for your match?
>>>> 
>>>> 
>>> The tradeoff is between demultiplexing by the sender, which costs
>>> time and space, and demultiplexing by the receiver, which costs an
>>> atomic increment. ANY_TAG forces you to demultiplex on the receive
>>> side anyway. Regarding all-to-all, it won't be more expensive if the
>>> receives are pre-posted, and they should be.
>>> 
>>> 
>> Not sure I understand this paragraph. I do, however, think there are
>> great benefits to the single-receiver-queue model. It implies congestion
>> on the receiver side in the many-to-one case, but if a single receiver
>> is reading all those messages anyhow, message-processing is already
>> going to throttle the message rate. The extra "bottleneck" at the FIFO
>> might never be seen.
>> 
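Since the single-queue model keeps coming up, here is a minimal sketch of
what Patrick is describing, with invented names and with flow control and
wrap-around deliberately left out: one envelope ring per receiver in
shared memory, one atomic increment per sender to claim a slot, and all
demultiplexing (source, tag) done by the receiver when it drains the ring:

    #include <stdatomic.h>
    #include <stdint.h>

    #define RING_SIZE 4096            /* power of two, one ring per receiver */

    typedef struct {
        _Atomic uint32_t ready;       /* 0 = empty, 1 = envelope written */
        int32_t  src;
        int32_t  tag;
        uint32_t len;
        uint64_t payload_off;         /* offset of the data in the shm segment */
    } envelope_t;

    typedef struct {
        _Atomic uint64_t tail;        /* shared: next slot to claim */
        uint64_t         head;        /* private to the receiving process */
        envelope_t       slot[RING_SIZE];
    } recv_ring_t;

    /* Hypothetical PML hook: match against posted receives or queue
     * the envelope as unexpected. */
    void match_or_queue_unexpected(envelope_t *e);

    /* Sender side: one atomic increment per message, no per-pair FIFO. */
    static void post_envelope(recv_ring_t *ring, int src, int tag,
                              uint32_t len, uint64_t payload_off)
    {
        uint64_t idx = atomic_fetch_add(&ring->tail, 1) & (RING_SIZE - 1);
        envelope_t *e = &ring->slot[idx];
        e->src = src;  e->tag = tag;  e->len = len;
        e->payload_off = payload_off;
        atomic_store_explicit(&e->ready, 1, memory_order_release);
    }

    /* Receiver side: drain in slot order; demultiplexing by src/tag
     * happens here, which ANY_SOURCE/ANY_TAG require anyway. */
    static void drain_ring(recv_ring_t *ring)
    {
        for (;;) {
            envelope_t *e = &ring->slot[ring->head & (RING_SIZE - 1)];
            if (!atomic_load_explicit(&e->ready, memory_order_acquire))
                break;                /* nothing more to read right now */
            match_or_queue_unexpected(e);
            atomic_store_explicit(&e->ready, 0, memory_order_relaxed);
            ring->head++;
        }
    }

Whether the atomic increment becomes a bottleneck in the many-to-one case
is exactly the question Eugene raises above: if a single receiver has to
process every message anyway, the matching work, not the queue, is likely
to throttle the rate.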
>>>> What the RFC talks about is not the last SM development we'll ever
>>>> need.  It's only supposed to be one step forward from where we are
>>>> today.  The "single queue per receiver" approach has many advantages,
>>>> but I think it's a different topic.
>>>> 
>>>> 
>>> But is this intermediate step worth it, or should we (well, you :-) )
>>> go directly for the single-queue model?
>>> 
>> To recap:
>> 1) The work is already done.
>> 2) The single-queue model addresses only one of the RFC's issues.
>> 3) I'm a fan of the single-queue model, but it's just a separate
>> discussion.
>> _______________________________________________
>> devel mailing list
>> de...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
> 

