Re: [OMPI devel] calling sendi earlier in the PML

Brian W. Barrett Tue, 3 Mar 2009 15:59:25 -0500

On Tue, 3 Mar 2009, Jeff Squyres wrote:

On Mar 3, 2009, at 3:31 PM, Eugene Loh wrote:
First, this behavior is basically what I was proposing and what George didn't feel comfortable with. It is arguably no compromise at all. (Uggh, why must I be so honest?) For eager messages, it favors BTLs with sendi functions, which could lead to those BTLs becoming overloaded. I think favoring BTLs with sendi for short messages is good. George thinks that load balancing BTLs is good.
Second, the implementation can be simpler than you suggest:
*) You don't need a separate list since testing for a sendi-enabled BTL is relatively cheap (I think... could verify).
*) You don't need to shuffle the list. The mechanism used by ob1 just resumes the BTL search from the last BTL used. E.g., check https://svn.open-mpi.org/source/xref/ompi_1.3/ompi/mca/pml/ob1/pml_ob1_sendreq.h#mca_pml_ob1_send_request_start. You use mca_bml_base_btl_array_get_next(&btl_eager) to round-robin over BTLs in a totally fair manner (remembering where the last loop left off), and use mca_bml_base_btl_array_get_size(&btl_eager) to make sure you don't loop endlessly.
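The round-robin search described in the second point can be sketched roughly as below. This is only a toy model under stated assumptions: toy_btl, toy_btl_array, toy_array_get_next, toy_find_sendi_btl and toy_sendi_ok are hypothetical stand-ins invented for illustration, not the real ob1/BML types or functions. The cursor plays the role of "remembering where the last loop left off", and the array size bounds the loop.

```c
#include <stddef.h>

/* Hypothetical send-immediate callback type (not the real BTL signature). */
typedef int (*toy_sendi_fn)(const void *buf, size_t len);

/* Hypothetical stand-in for a BTL: sendi is NULL when unsupported. */
typedef struct {
    const char  *name;
    toy_sendi_fn sendi;
} toy_btl;

/* Hypothetical stand-in for the eager BTL array of one peer. */
typedef struct {
    toy_btl *btls;
    size_t   size;
    size_t   next;   /* cursor: where the previous search stopped */
} toy_btl_array;

/* Dummy sendi that always reports success; for illustration only. */
static int toy_sendi_ok(const void *buf, size_t len)
{
    (void)buf;
    (void)len;
    return 0;
}

/* Return the BTL under the cursor and advance the cursor, wrapping around. */
static toy_btl *toy_array_get_next(toy_btl_array *a)
{
    toy_btl *b = &a->btls[a->next];
    a->next = (a->next + 1) % a->size;
    return b;
}

/* Visit each BTL at most once, resuming from the cursor.
 * Returns the first BTL that has a sendi, or NULL if none does. */
static toy_btl *toy_find_sendi_btl(toy_btl_array *a)
{
    for (size_t i = 0; i < a->size; i++) {
        toy_btl *b = toy_array_get_next(a);
        if (b->sendi != NULL) {
            return b;
        }
    }
    return NULL;
}
```

In this sketch, successive calls alternate fairly among the sendi-capable BTLs and skip the others, with no second list and no shuffling.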
Cool / fair enough.
How about an MCA parameter to switch between this mechanism (early sendi) and the original behavior (late sendi)?
This is the usual way that we resolve "I want to do X / I want to do Y" disputes. :-)


Of all the options presented, this is the one I dislike most :).

This is *THE* critical path of the OB1 PML. It's already horribly complex and hard to follow (as Eugene is finding out the hard way). Making it more complex as a way to settle this argument is pain and suffering just to avoid conflict.

However, one more option just occurred to me. If (AND ONLY IF) ob1/r2 detects that there are at least two BTLs to the same peer at the same priority and at least one has a sendi and at least one does not have a sendi, what about an MCA parameter to disable all sendi functions to that peer?

There's only a 1% gain in the FAIR protocol Eugene proposed, so we'd lose that 1% in the heterogeneous multi-nic case (the least common case). There would be a much bigger gain for the sendi homogeneous multi-nic / all single-nic cases (much more common), because the FAST protocol would be used.

That way, we get the FAST protocol in all cases for sm, which is what I really want ;).
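
For concreteness, the detection rule proposed above might look something like the sketch below. Everything here is an assumption made for illustration: toy_peer_btl, toy_sendi_is_mixed, toy_sendi_allowed and the disable_on_mix flag (standing in for the suggested MCA parameter) are hypothetical names, not existing ob1/r2 code. The array is assumed to hold only the BTLs to one peer at the same priority.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of one BTL reaching a given peer. */
typedef struct {
    const char *name;
    bool        has_sendi;
} toy_peer_btl;

/* True only when there are at least two BTLs and they disagree:
 * at least one has a sendi and at least one does not. */
static bool toy_sendi_is_mixed(const toy_peer_btl *btls, size_t n)
{
    size_t with_sendi = 0;
    for (size_t i = 0; i < n; i++) {
        if (btls[i].has_sendi) {
            with_sendi++;
        }
    }
    return n >= 2 && with_sendi > 0 && with_sendi < n;
}

/* Decide whether sendi may be used for this peer.
 * disable_on_mix models the hypothetical MCA parameter: when set,
 * a mixed set of BTLs turns sendi off for the whole peer. */
static bool toy_sendi_allowed(const toy_peer_btl *btls, size_t n,
                              bool disable_on_mix)
{
    if (disable_on_mix && toy_sendi_is_mixed(btls, n)) {
        return false;
    }
    for (size_t i = 0; i < n; i++) {
        if (btls[i].has_sendi) {
            return true;
        }
    }
    return false;
}
```

Under these assumptions, single-BTL peers (e.g. sm alone) and homogeneous multi-nic peers keep their sendi path untouched; only the mixed case is affected, and only when the parameter is set.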


Brian
