Gleb Natapov wrote:
On Wed, Dec 12, 2007 at 02:03:02PM -0500, Jeff Squyres wrote:
On Dec 9, 2007, at 10:34 AM, Gleb Natapov wrote:

Currently the BTL has a parameter, btl_min_send_size, that is no longer used.
I want to change it to be btl_rndv_eager_limit. This new parameter will
determine the size of the first fragment of the rendezvous protocol. Now we use
btl_eager_limit to set its size. btl_rndv_eager_limit will have to be
smaller than or equal to btl_eager_limit. By default it will be equal to
btl_eager_limit, so no behavior change will be observed if the default is
used.

Can you describe why it would be better to have the value less than the eager limit?

It is just one more knob to tune the OB1 algorithm. I sometimes don't want
to send any data by copy in/out at all. This is not possible right now.
With this new param I will be able to control this.

From my experience tuning RDMA-rendezvous for the GASNet communications library, I know that it was beneficial to piggyback some portion of the payload on the rendezvous request. However, the best [insert your favorite performance metric here] was not always achieved by piggybacking the maximum that could be buffered at the receiver (equivalent of btl_eager_limit). If I understand correctly, Gleb's btl_rndv_eager_limit parameter would allow tuning for this behavior in OMPI.

An artificial/simplified example would be if the eager limit is 32K and you have a 64K xfer. Is it better to send 32K copy in/out plus 32K by RDMA, or to send 8K copy in/out plus 56K by RDMA? If the memcpy() overhead for 32K of eager payload exceeds what can be overlapped with the rendezvous setup then the second may be the better choice (higher bandwidth, lower latency, and lower CPU overheads on both sender and receiver).
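
To make the arithmetic in that example concrete, here is a rough sketch of how the split might be computed. This is not OB1 or BTL code: rndv_split() and rndv_split_t are hypothetical names made up for illustration, and the sketch ignores header overhead, pipelining, and anything else the real protocol does. It only assumes what is stated above: messages that fit within the eager limit go entirely by copy in/out, larger messages piggyback at most btl_rndv_eager_limit bytes on the rendezvous request, and btl_rndv_eager_limit may not exceed btl_eager_limit.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical illustration only, not Open MPI code. */
typedef struct {
    size_t copy_bytes;  /* payload sent by copy in/out */
    size_t rdma_bytes;  /* remainder moved by RDMA */
} rndv_split_t;

static rndv_split_t rndv_split(size_t msg_size,
                               size_t eager_limit,
                               size_t rndv_eager_limit)
{
    rndv_split_t s;

    /* Constraint from the proposal: the rendezvous limit must be
     * smaller than or equal to the eager limit. */
    assert(rndv_eager_limit <= eager_limit);

    if (msg_size <= eager_limit) {
        /* Small message: sent eagerly, no rendezvous at all. */
        s.copy_bytes = msg_size;
    } else {
        /* Rendezvous: piggyback only rndv_eager_limit bytes. */
        s.copy_bytes = rndv_eager_limit;
    }
    s.rdma_bytes = msg_size - s.copy_bytes;
    return s;
}
```

With a 32K eager limit and a 64K message, a rendezvous limit of 32K gives the 32K + 32K split, a limit of 8K gives 8K + 56K, and a limit of 0 gives the "no copy in/out at all" case that Gleb mentions.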

-Paul

--
Paul H. Hargrove                          phhargr...@lbl.gov
Future Technologies Group
HPC Research Department                   Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900

