Eugene Loh wrote:

Actually, there may be a more important issue here.

Currently, the PML chooses the BTL first. Once the BTL choice is established, only then does the PML choose between sendi and send.

Currently, it's also the case that we're spending a lot of time in the PML doing a bunch of stuff that's totally unnecessary if the sendi succeeds. So, we're neutralizing much of the advantage sendi is supposed to provide.

So, I'm changing the PML to invoke sendi much sooner. The way I'm doing this is to loop over BTLs, looking for a sendi that exists and succeeds. If I find one, I'm done. If I don't, I have to go with the standard send code path.

The logic, as I just described it, allows multiple sendi functions to fail, and the send that is ultimately used might belong to a different BTL than any of the failing sendi's. This suggests that I do NOT want failing sendi's to leave any side effects (like allocated descriptors).

Is my proposed logic bad? Should I implement things another way? E.g., if I find a sendi function, use that BTL even if the sendi failed and another BTL might have a sendi that could succeed? Or, does my proposed change provide the justification for my pulling descriptor allocations out of the sendi functions?

Here's another way of looking at it.

The current PML send code does this:

   set_up_expensive_send_request(&sendreq);
   for ( btl = ... ) {
       if ( SUCCESS == sendi() ) return SUCCESS;
       if ( SUCCESS == send(&sendreq) ) return SUCCESS;
   }

That is, we try one BTL after another. For each one, we try sendi first. So, each sendi() that fails is immediately followed by a send() of the same BTL. It's okay for a sendi() to do prep work for the send() of the same BTL. This scheme does a bunch of expensive send-request initialization that is unnecessary if the sendi(), which doesn't need the send request, succeeds.

My proposed PML send logic is this:

   for ( btl = ... ) {
       if ( SUCCESS == sendi() ) return SUCCESS;
   }
   set_up_expensive_send_request(&sendreq);
   for ( btl = ... ) {
       if ( SUCCESS == send(&sendreq) ) return SUCCESS;
   }

That is, I first try every sendi() function I can find. Only if none of them exists and succeeds do I set up the send request and call send() functions.

This is why I would like sendi() functions to have no side effects... e.g., no allocated descriptors.
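
For concreteness, here is a minimal C sketch of the proposed ordering. Everything in it is a hypothetical stand-in for illustration (btl_t, send_request_t, pml_send, the stub functions, and the request_setups counter); none of it is the actual Open MPI BTL or PML interface. It only shows that the send request is never set up when some sendi succeeds, and that a failing sendi must leave nothing behind because a different BTL may end up doing the send.

```c
#include <assert.h>
#include <stddef.h>

enum { SUCCESS = 0, FAILURE = -1 };

/* Hypothetical stand-in for the PML send request. */
typedef struct send_request {
    int initialized;
} send_request_t;

/* Hypothetical stand-in for a BTL: sendi is optional (may be NULL). */
typedef struct btl {
    int (*sendi)(struct btl *self);
    int (*send)(struct btl *self, send_request_t *req);
} btl_t;

/* Counts how often the expensive setup ran (for illustration only). */
int request_setups = 0;

void set_up_expensive_send_request(send_request_t *req)
{
    req->initialized = 1;
    request_setups++;
}

/* Stub BTL functions used to exercise the logic. A failing sendi here
 * allocates nothing, so it leaves no state for anyone to clean up. */
int sendi_ok(btl_t *self)   { (void)self; return SUCCESS; }
int sendi_fail(btl_t *self) { (void)self; return FAILURE; }
int send_ok(btl_t *self, send_request_t *req)
{
    (void)self;
    return req->initialized ? SUCCESS : FAILURE;
}
int send_fail(btl_t *self, send_request_t *req)
{
    (void)self; (void)req;
    return FAILURE;
}

int pml_send(btl_t *btls, size_t nbtls)
{
    send_request_t sendreq;
    size_t i;

    assert(btls != NULL || nbtls == 0);

    /* Phase 1: try every BTL's sendi before touching the send request. */
    for (i = 0; i < nbtls; i++) {
        if (btls[i].sendi != NULL && SUCCESS == btls[i].sendi(&btls[i]))
            return SUCCESS;
    }

    /* Phase 2: no sendi existed and succeeded, so pay for the request. */
    set_up_expensive_send_request(&sendreq);
    for (i = 0; i < nbtls; i++) {
        if (SUCCESS == btls[i].send(&btls[i], &sendreq))
            return SUCCESS;
    }
    return FAILURE;
}
```

With two BTLs where the first sendi fails and the second succeeds, pml_send returns in phase 1 and request_setups stays at 0. If the first BTL's sendi had allocated a descriptor before failing, nothing in this flow would ever hand that descriptor to the first BTL's send.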
