On Jan 20, 2009, at 8:53 PM, Jeff Squyres wrote:

This all sounds really great to me. I agree with most of what has been said -- e.g., benchmarks *are* important. Improving them can even sometimes have the side effect of improving real applications. ;-)

My one big concern is moving an architectural boundary by making the BTL understand MPI match headers. But even there, I'm torn:

1. I understand why it is better -- performance-wise -- to do this. And the performance improvement results are hard to argue with. We took a similar approach with ORTE; ORTE is now OMPI-specific, and many, many things have become better (from the OMPI perspective, at least).

2. We all have the knee-jerk reaction that we don't want to have the BTLs know anything about MPI semantics because they've always been that way and it has been a useful abstraction barrier. Now there's even a project afoot to move the BTLs out into a separate layer that cannot know about MPI (so that other things can be built upon it). But are we sacrificing potential MPI performance here? I think that's one important question.

Eugene: you mentioned that there are other possibilities to having the BTL understand match headers, such as a callback into the PML. Have you tried this approach to see what the performance cost would be, perchance?

I neglected to say: the point of asking this question is an attempt to quantify the performance gain of having the BTL understand the match header. Specifically: is the performance gain noticeable/important enough to justify changing our age-old abstraction barrier? Or is another approach just as good, performance-wise?

--
Jeff Squyres
Cisco Systems
