On Jan 20, 2009, at 8:53 PM, Jeff Squyres wrote:
This all sounds really great to me. I agree with most of what has
been said -- e.g., benchmarks *are* important. Improving them can
even sometimes have the side effect of improving real
applications. ;-)
My one big concern is moving the architectural boundary so that the
BTL understands MPI match headers. But even there, I'm torn:
1. I understand why it is better -- performance-wise -- to do this.
And the performance improvement results are hard to argue with. We
took a similar approach with ORTE; ORTE is now OMPI-specific, and
many, many things have become better (from the OMPI perspective, at
least).
2. We all have the knee-jerk reaction that we don't want to have the
BTLs know anything about MPI semantics because they've always been
that way and it has been a useful abstraction barrier. Now there's
even a project afoot to move the BTLs out into a separate layer that
cannot know about MPI (so that other things can be built upon it).
But are we sacrificing potential MPI performance here? I think
that's one important question.
Eugene: you mentioned that there are alternatives to having the BTL
understand match headers, such as a callback into the PML.
Have you tried this approach to see what the performance cost would
be, perchance?
I neglected to say: the point of asking this question is an attempt to
quantify the performance gain of having the BTL understand the match
header. Specifically: is the gain noticeable/important enough to
justify changing our age-old abstraction barrier? Or is another
approach just as good, performance-wise?
--
Jeff Squyres
Cisco Systems