Re: [OMPI devel] RFC: eliminating "descriptor" argument from sendi function

Eugene Loh Mon, 23 Feb 2009 17:34:02 -0500

Actually, there may be a more important issue here.

Currently, the PML chooses the BTL first. Only once the BTL choice is established does the PML choose between sendi and send.

Currently, it's also the case that we're spending a lot of time in the PML doing a bunch of stuff that's totally unnecessary if the sendi succeeds. So, we're neutralizing much of the advantage sendi is supposed to provide.

So, I'm changing the PML to invoke sendi much sooner. The way I'm doing this is to loop over BTLs, looking for a sendi that exists and succeeds. If I find one, I'm done. If I don't, I have to go with the standard send code path.

The logic, as I just described it, allows that multiple sendi functions could fail and that the send that is ultimately used might be for a different BTL than for any of the failing sendi's. This would suggest that I do NOT want failing sendi's leaving any side effects (like allocated descriptors).
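To make the proposed control flow concrete, here is a minimal sketch in C. It is not the actual PML code: toy_btl_t, toy_pml_send, and the helper functions are hypothetical stand-ins, the real sendi signature takes more arguments, and falling back to the first BTL on the slow path is purely an assumption made to keep the example short.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a BTL module (not the Open MPI struct). */
typedef struct toy_btl {
    const char *name;
    /* Optional immediate-send hook; NULL if the BTL does not provide one.
     * Assumed contract: returns true on success and leaves NO side effects
     * (such as an allocated descriptor) on failure. */
    bool (*sendi)(struct toy_btl *btl, const void *buf, size_t len);
    /* Standard send path; assumed to always be present. */
    bool (*send)(struct toy_btl *btl, const void *buf, size_t len);
    int sendi_calls;   /* bookkeeping for the example only */
    int send_calls;
} toy_btl_t;

/* Example hooks: a sendi that always fails, one that always succeeds,
 * and a standard send that always succeeds. */
bool toy_sendi_fail(toy_btl_t *btl, const void *buf, size_t len)
{
    (void)buf; (void)len;
    btl->sendi_calls++;
    return false;
}

bool toy_sendi_ok(toy_btl_t *btl, const void *buf, size_t len)
{
    (void)buf; (void)len;
    btl->sendi_calls++;
    return true;
}

bool toy_send_ok(toy_btl_t *btl, const void *buf, size_t len)
{
    (void)buf; (void)len;
    btl->send_calls++;
    return true;
}

/* Loop over the BTLs looking for a sendi that exists and succeeds.
 * Returns the index of the BTL that carried the message, or -1 on failure. */
int toy_pml_send(toy_btl_t *btls, size_t nbtls, const void *buf, size_t len)
{
    for (size_t i = 0; i < nbtls; i++) {
        if (btls[i].sendi != NULL && btls[i].sendi(&btls[i], buf, len)) {
            return (int)i;   /* fast path succeeded; nothing else to do */
        }
    }
    /* No sendi succeeded: take the standard send path. The BTL used here
     * (the first one, by assumption) may differ from every BTL whose sendi
     * failed, which is why a failing sendi must not leave anything behind. */
    if (nbtls > 0 && btls[0].send(&btls[0], buf, len)) {
        return 0;
    }
    return -1;
}
```

With three BTLs where the first has no sendi, the second has a failing sendi, and the third has a working one, toy_pml_send returns index 2 and the standard send path is never entered.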

Is my proposed logic bad? Should I implement things another way? E.g., if I find a sendi function, use that BTL even if the sendi failed and another BTL might have a sendi that could succeed? Or, does my proposed change provide the justification for my pulling descriptor allocations out of the sendi functions?


Further comments (of less importance) below:

George Bosilca wrote:

On Feb 23, 2009, at 12:14, Eugene Loh wrote:
George Bosilca wrote:
It doesn't sound reasonable to me. There is a reason for this, and I think it's a good reason. The sendi function works for some devices as a fast path for sending data, when the network is not flooded. However, in the case sendi cannot do the job we expect, the fact that it returns the descriptor saves us a call (we don't have to do the alloc call later).
This does not make any sense to me. In what sense are we "saving a call"? Not in the sense of run-time performance since the BTL is now having to allocate a descriptor it did not otherwise need. The amount of work is the same (one descriptor allocation either way), but you're just pushing that work into the BTLs.
The descriptor is a BTL resource. If the sendi doesn't return one, the PML will have to call the BTL alloc function from the BTL again (in this case the calls will look like this: btl_sendi followed by btl_alloc followed by btl_send). I'm not looking only at SM, I want all of the BTLs to have the opportunity to get good performance.
If sendi returns a descriptor when it fails to send the data, the call list will be shorter: btl_sendi followed by btl_send. I'm trying to decrease the number of jumps between the layers (PML/BTL), not the number of lines of code in the BTL.

I think architectural streamlining -- even if just a little bit -- is a good thing. And, in this particular case, replicating code into each BTL sendi function just doesn't buy us anything. When the PML allocates the descriptor, it simply calls mca_bml_base_alloc(), which is an *inlined* function that immediately calls the BTL alloc function. No big deal.
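The two call sequences under discussion can be compared with a small counting sketch in C. Everything here is hypothetical (toy_stats_t, toy_desc_t, and the toy_* functions are illustrative names, not Open MPI symbols), and it only models the case where sendi exists and fails. It counts PML-to-BTL calls and descriptor allocations for each variant.

```c
#include <stdbool.h>
#include <stddef.h>

/* Counters for the example: calls that cross from PML into the BTL,
 * and descriptor allocations performed anywhere. */
typedef struct { int btl_calls; int allocs; } toy_stats_t;

/* Hypothetical descriptor; a single static slot stands in for a free list. */
typedef struct { int unused; } toy_desc_t;
toy_desc_t toy_pool[1];

/* BTL alloc entry point. */
toy_desc_t *toy_btl_alloc(toy_stats_t *s)
{
    s->btl_calls++;
    s->allocs++;
    return &toy_pool[0];
}

/* BTL standard send entry point. */
bool toy_btl_send(toy_stats_t *s, toy_desc_t *d)
{
    s->btl_calls++;
    return d != NULL;
}

/* Variant A: a failing sendi that leaves no side effects. */
bool toy_btl_sendi_plain(toy_stats_t *s)
{
    s->btl_calls++;
    return false;
}

/* Variant B: a failing sendi that allocates a descriptor internally
 * and hands it back to the caller. */
bool toy_btl_sendi_with_desc(toy_stats_t *s, toy_desc_t **d)
{
    s->btl_calls++;
    s->allocs++;          /* allocation happens inside the BTL */
    *d = &toy_pool[0];
    return false;
}

/* Sequence A: btl_sendi, then btl_alloc, then btl_send. */
toy_stats_t toy_path_pml_allocates(void)
{
    toy_stats_t s = {0, 0};
    if (!toy_btl_sendi_plain(&s)) {
        toy_desc_t *d = toy_btl_alloc(&s);
        toy_btl_send(&s, d);
    }
    return s;
}

/* Sequence B: btl_sendi (returning a descriptor), then btl_send. */
toy_stats_t toy_path_sendi_allocates(void)
{
    toy_stats_t s = {0, 0};
    toy_desc_t *d = NULL;
    if (!toy_btl_sendi_with_desc(&s, &d)) {
        toy_btl_send(&s, d);
    }
    return s;
}
```

In this model, sequence A makes three calls into the BTL and sequence B makes two, while both perform exactly one descriptor allocation. The sketch says nothing about how expensive the extra call is in practice.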

Further, having the sendi allocate the descriptor only makes a difference when the BTL has provided a sendi function *AND* when that function failed. That's an edge case. It's much more likely that the BTL doesn't have a sendi function (e.g., openib) *OR* that function sent the message successfully.

I could try comparing performance, but that's a lot of work just to measure "noise".
