I've been looking at a "fast path" for sends and receives. This is like the sendi function, which attempts to send "immediately", without creating a bulky PML send request (which would be needed if, say, the send had to be progressed over multiple user MPI calls). One can do something similar on the receive side, and I have a workspace in which each BTL has the option of defining a "recvi" (receive immediate) function. The speedups I see in the prototype are gratifying: np=2 pingpong latencies improve by anywhere from 30% to 2x, and they stay flat as np is increased. (OMPI, straight out of the box, sees pingpong latencies climb as np climbs due to the costs of polling.)

I'd like MPI_Sendrecv to see the same performance benefits, but the MPI layer performs an MPI_Sendrecv as an Irecv/Send/Wait sequence. The Irecv necessarily involves a receive request. So, the Send might be fast, but you lose most of the benefit of doing a fast path. I think the real way of doing a fast Sendrecv would be to do an immediate send (if you can) followed by an immediate receive.

It seems to me, there are two approaches here:

*) Teach the MPI layer about "fast path" sends and receives (sendi and recvi).
*) Teach the PML layer about "Sendrecv". That is, have MPI_Sendrecv call something like mca_pml_ob1_sendrecv(). (This is the approach I'd prefer.)

Either way, the MPI/PML interface would need a new function (or two).

Any suggestions/comments?

Any guidelines on how I add a new function to the MPI/PML interface?
