On Jun 2, 2006, at 5:55 PM, Jonathan Day wrote:

Hi,

I'm working on developing some components for OpenMPI,
but am a little unclear as to how to implement
efficient sends and receives. I'm wanting to do
zero-copy two-sided MPI, but as far as I can see, this
is not going to be easy. As best as I can tell, the
receive mechanism copies into a temporary user buffer
then, on actually handling the receive, copies that
into the application's buffer. Would I be correct in
this interpretation?

This is really up to the implementer. If the BTL supports "send in place", the PML will prepare a descriptor pointing at the user's memory and then use send/receive to transfer the message. The receive side is a bit trickier. The zero-copy interconnects that we support require that receive descriptors be posted to a queue, and these are consumed in order. To receive with zero copy, the user buffer would need to be posted to the receive queue, and these would have to be posted "in order". The MPI level cannot post these receives such that ordering is obeyed in all cases. To get around this, some interconnects allow you to post receives along with matching information, and the interconnect ensures MPI ordering.
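To make the ordering problem concrete, here is a rough sketch in C. It is a toy model, not Open MPI code: posted_recv_t, deliver_in_order, deliver_matched and count_misdelivered are hypothetical names, and tags are assumed to be non-negative. It compares a queue that is consumed strictly in order with one where the interconnect is given matching information.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of one posted receive: the tag the application
 * asked for, and the tag of the message that actually landed in it. */
typedef struct {
    int wanted_tag;
    int got_tag;            /* -1 while the buffer is still empty */
} posted_recv_t;

/* In-order queue: the next arrival consumes the head entry whatever its
 * tag is. Returns the slot used, or -1 if no posted receive is left. */
static int deliver_in_order(posted_recv_t *q, size_t n, size_t *head, int tag)
{
    if (*head >= n)
        return -1;
    size_t slot = *head;
    *head += 1;
    q[slot].got_tag = tag;
    return (int)slot;
}

/* Matching queue: the arrival goes to the first empty entry that asked
 * for this tag. Returns the slot used, or -1 if nothing matches. */
static int deliver_matched(posted_recv_t *q, size_t n, int tag)
{
    for (size_t i = 0; i < n; i++) {
        if (q[i].got_tag == -1 && q[i].wanted_tag == tag) {
            q[i].got_tag = tag;
            return (int)i;
        }
    }
    return -1;
}

/* Count filled buffers that hold a message the application did not
 * ask for in that buffer. */
static int count_misdelivered(const posted_recv_t *q, size_t n)
{
    int bad = 0;
    for (size_t i = 0; i < n; i++) {
        if (q[i].got_tag != -1 && q[i].got_tag != q[i].wanted_tag)
            bad++;
    }
    return bad;
}
```

If receives for tags 7 and 9 are posted in that order and the messages arrive as 9 then 7, the in-order queue puts both into the wrong user buffer, while the matching queue places both correctly.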

In addition to the above issues with receiving directly into the user's buffer, there is also a performance hit on most interconnects because the memory must be registered (pinned and made resident). These costs outweigh any benefit of zero copy for small and medium messages. Open MPI therefore uses send/receive with copy in/out for message sizes up to a configurable limit. Above this limit, RDMA is used to provide zero copy.
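The size-based choice can be sketched in a few lines of C. This is only an illustration under stated assumptions: proto_t, choose_protocol and eager_limit are hypothetical names standing in for the configurable limit mentioned above, not actual Open MPI identifiers or parameters.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names: these are not Open MPI identifiers. */
typedef enum { PROTO_COPY_IN_OUT, PROTO_RDMA } proto_t;

/* Pick a transfer protocol from the message length.
 * eager_limit stands in for the configurable size limit: messages up to
 * and including the limit are copied in/out, larger ones use RDMA so that
 * the registration cost is paid only when zero copy is worth it. */
static proto_t choose_protocol(size_t len, size_t eager_limit)
{
    assert(eager_limit > 0);
    return (len <= eager_limit) ? PROTO_COPY_IN_OUT : PROTO_RDMA;
}
```

With a limit of 4096 bytes, for example, a 1024-byte message would take the copy in/out path and a 1 MB message would go through RDMA.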



I'm also a little hazy on how to get information on
messages being passed. What information on the sending
process is visible to the receiving BTL components?

The BTLs are designed to be MPI agnostic. They are the "Byte Transfer Layer", and the PML ("Point-to-Point Messaging Layer") hides MPI from them.

Finally, I'm assuming that developers have, over time,
produced test harnesses and other useful (for
developers) tools that would have no real value to
general users. Has anyone put together a kit of
development aids for coders of new components?

There have been some unit tests developed for various areas of Open MPI. For point-to-point, however, this was not seen as a big benefit. For us, it was easier to begin testing with a simple MPI ping-pong and then graduate to the Intel-Test suite or some other more comprehensive set of point-to-point tests.

There is some information on the web that should help you in understanding the Open MPI p2p architecture:

http://www.open-mpi.org/papers/ipdps-2006

Take a look under Wednesday - Point to Point architecture. If you have problems reading the slides, let me know and I can send them one slide per page.

http://www.open-mpi.org/papers/workshop-2006/

We are also working on another point-to-point architecture for interconnects that provide matching and other MPI facilities, but we are a few weeks off from having this available.


Thanks,

Galen


Jonathan Day

