On Tue, 3 Mar 2009, Jeff Squyres wrote:

On Mar 3, 2009, at 3:31 PM, Eugene Loh wrote:

First, this behavior is basically what I was proposing and what George didn't feel comfortable with. It is arguably no compromise at all. (Uggh, why must I be so honest?) For eager messages, it favors BTLs with sendi functions, which could lead to those BTLs becoming overloaded. I think favoring BTLs with sendi for short messages is good. George thinks that load balancing BTLs is good.

Second, the implementation can be simpler than you suggest:

*) You don't need a separate list, since testing for a sendi-enabled BTL is relatively cheap (I think... could verify).

*) You don't need to shuffle the list. The mechanism used by ob1 just resumes the BTL search from the last BTL used. E.g., check https://svn.open-mpi.org/source/xref/ompi_1.3/ompi/mca/pml/ob1/pml_ob1_sendreq.h#mca_pml_ob1_send_request_start . You use mca_bml_base_btl_array_get_next(&btl_eager) to round-robin over BTLs in a totally fair manner (remembering where the last loop left off), and use mca_bml_base_btl_array_get_size(&btl_eager) to make sure you don't loop endlessly.
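The resume-where-you-left-off search described above can be sketched roughly as follows. This is only an illustration, not the ob1 code: the `toy_*` types and functions are hypothetical stand-ins for the real BML/BTL structures, and the only behavior assumed is the one stated above (a cursor that remembers the last BTL used, plus a bound on the array size so the loop terminates).

```c
#include <stddef.h>

/* Hypothetical stand-ins for the real BML/BTL structures; NOT the Open MPI API. */
typedef int (*toy_sendi_fn_t)(const void *buf, size_t len);

typedef struct {
    const char     *name;
    toy_sendi_fn_t  sendi;  /* NULL when the BTL has no send-immediate function */
} toy_btl_t;

typedef struct {
    toy_btl_t *btls;
    size_t     size;
    size_t     next;        /* index where the next search resumes */
} toy_btl_array_t;

/* Return the next BTL in round-robin order and advance the cursor. */
toy_btl_t *toy_array_get_next(toy_btl_array_t *a)
{
    if (a->size == 0) {
        return NULL;
    }
    toy_btl_t *btl = &a->btls[a->next];
    a->next = (a->next + 1) % a->size;
    return btl;
}

/* Visit each eager BTL at most once, starting where the previous call
 * left off. Return the first one that offers sendi, or NULL if none does. */
toy_btl_t *toy_pick_sendi_btl(toy_btl_array_t *eager)
{
    for (size_t i = 0; i < eager->size; i++) {
        toy_btl_t *btl = toy_array_get_next(eager);
        if (btl->sendi != NULL) {
            return btl;
        }
    }
    return NULL;
}

/* Stub send-immediate function, used only to mark a BTL as sendi-capable. */
int toy_sendi(const void *buf, size_t len)
{
    (void)buf;
    (void)len;
    return 0;
}
```

With three BTLs where only the second and third have sendi, successive calls alternate between those two, and BTLs without sendi are simply skipped. No second list and no shuffling are needed.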

Cool / fair enough.

How about an MCA parameter to switch between this mechanism (early sendi) and the original behavior (late sendi)?

This is the usual way that we resolve "I want to do X / I want to do Y" disputes. :-)

Of all the options presented, this is the one I dislike most :).

This is *THE* critical path of the OB1 PML. It's already horribly complex and hard to follow (as Eugene is finding out the hard way). Making it more complex as a way to settle this argument is pain and suffering just to avoid conflict.

However, one more option just occurred to me. If (AND ONLY IF) ob1/r2 detects that there are at least two BTLs to the same peer at the same priority, and at least one has a sendi and at least one does not, what about an MCA parameter to disable all sendi functions to that peer?
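To make the condition concrete, here is a minimal sketch of the check. It assumes a simplified per-peer array of BTL descriptors with an explicit priority field. The `peer_*` names and the `disable_mixed_sendi` flag are hypothetical, standing in for whatever r2 and a real MCA parameter would provide.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical simplified descriptor; NOT the real BTL module struct. */
typedef int (*peer_sendi_fn_t)(const void *buf, size_t len);

typedef struct {
    int             priority;  /* assumed: BTLs at equal priority share traffic */
    peer_sendi_fn_t sendi;     /* NULL when the BTL has no sendi function */
} peer_btl_t;

/* True when some priority level holds at least one BTL with sendi
 * and at least one BTL without it. */
bool peer_has_mixed_sendi(const peer_btl_t *btls, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        for (size_t j = i + 1; j < n; j++) {
            if (btls[i].priority == btls[j].priority &&
                (btls[i].sendi != NULL) != (btls[j].sendi != NULL)) {
                return true;
            }
        }
    }
    return false;
}

/* If the (hypothetical) flag is set and the peer is mixed, clear every
 * sendi pointer for that peer. Returns true when sendi was disabled. */
bool peer_maybe_disable_sendi(peer_btl_t *btls, size_t n,
                              bool disable_mixed_sendi)
{
    if (!disable_mixed_sendi || !peer_has_mixed_sendi(btls, n)) {
        return false;
    }
    for (size_t i = 0; i < n; i++) {
        btls[i].sendi = NULL;
    }
    return true;
}

/* Stub used only to mark a BTL as sendi-capable in examples. */
int peer_stub_sendi(const void *buf, size_t len)
{
    (void)buf;
    (void)len;
    return 0;
}
```

The check would run once per peer at setup time, so it stays off the send critical path. Peers with a single BTL, or with BTLs that all have sendi, are left untouched.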

There's only a 1% gain in the FAIR protocol Eugene proposed, so we'd lose that 1% in the heterogeneous multi-nic case (the least common case). There would be a much bigger gain for the sendi homogeneous multi-nic / all single-nic cases (much more common), because the FAST protocol would be used.

That way, we get the FAST protocol in all cases for sm, which is what I really want ;).

Brian
