Gleb Natapov wrote:
On Wed, Dec 12, 2007 at 02:03:02PM -0500, Jeff Squyres wrote:
On Dec 9, 2007, at 10:34 AM, Gleb Natapov wrote:
Currently the BTL has a parameter, btl_min_send_size, that is no longer
used. I want to change it to btl_rndv_eager_limit. This new parameter
will determine the size of the first fragment of the rendezvous
protocol. Right now we use btl_eager_limit to set its size.
btl_rndv_eager_limit will have to be smaller than or equal to
btl_eager_limit. By default it will be equal to btl_eager_limit, so no
behavior change will be observed if the default is used.
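The relationship between the two parameters described above can be
sketched in C as follows. This is only an illustration, not Open MPI
code: the helper name effective_rndv_eager_limit and the RNDV_LIMIT_UNSET
sentinel are hypothetical, and clamping an oversized value (rather than
rejecting it) is an assumption, since the proposal only says the value
"will have to be" no larger than btl_eager_limit.

```c
#include <stddef.h>

/* Hypothetical sentinel meaning "the user did not set the parameter". */
#define RNDV_LIMIT_UNSET ((size_t)-1)

/*
 * Hypothetical helper (not part of Open MPI): resolve the size limit
 * for the first rendezvous fragment from the two tunables.
 *  - unset            -> default to eager_limit (no behavior change)
 *  - above the limit  -> clamped to eager_limit (assumed policy)
 *  - otherwise        -> used as given, including 0
 */
static size_t effective_rndv_eager_limit(size_t eager_limit,
                                         size_t rndv_eager_limit)
{
    if (rndv_eager_limit == RNDV_LIMIT_UNSET) {
        return eager_limit;
    }
    return rndv_eager_limit > eager_limit ? eager_limit : rndv_eager_limit;
}
```

With an eager limit of 32K, leaving the new parameter unset would yield
32K, a setting of 8K would be used as is, and a setting of 64K would be
reduced to 32K under the clamping assumption.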
Can you describe why it would be better to have the value less than
the eager limit?
It is just one more knob for tuning the OB1 algorithm. Sometimes I don't
want to send any data by copy in/out at all, which is not possible right
now. With this new parameter I will be able to control this.
From my experience tuning RDMA-rendezvous for the GASNet communications
library, I know that it was beneficial to piggyback some portion of the
payload on the rendezvous request. However, the best [insert your
favorite performance metric here] was not always achieved by
piggybacking the maximum that could be buffered at the receiver
(equivalent of blt_eager_limit). If I understand correctly, Gleb's
btl_rndv_eager_limit parameter would allow tuning for this behavior in OMPI.
An artificial/simplified example would be if the eager limit is 32K and
you have a 64K xfer. Is it better to send 32K copy in/out plus 32K by
RDMA, or to send 8K copy in/out plus 56K by RDMA? If the memcpy()
overhead for 32K of eager payload exceeds what can be overlapped with
the rendezvous setup then the second may be the better choice (higher
bandwidth, lower latency, and lower CPU overheads on both sender and
receiver).
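
To make the arithmetic in this example concrete, here is a minimal
sketch in C of how a transfer would be divided between the copy in/out
fragment and the RDMA portion. The struct and function names
(rndv_split, split_transfer) are hypothetical and do not come from OMPI
or GASNet; the sketch only computes sizes and says nothing about which
split actually performs better.

```c
#include <stddef.h>

/* Hypothetical result type: how many bytes go by each path. */
struct rndv_split {
    size_t copied; /* bytes piggybacked on the rendezvous request (copy in/out) */
    size_t rdma;   /* remaining bytes moved by RDMA */
};

/*
 * Hypothetical helper: split a message of msg_len bytes given the
 * limit on the first rendezvous fragment. A limit of 0 sends nothing
 * by copy in/out; a limit of msg_len or more copies the whole message.
 */
static struct rndv_split split_transfer(size_t msg_len,
                                        size_t rndv_eager_limit)
{
    struct rndv_split s;
    s.copied = msg_len < rndv_eager_limit ? msg_len : rndv_eager_limit;
    s.rdma = msg_len - s.copied;
    return s;
}
```

For the 64K transfer above, a limit of 32K gives 32768 bytes copied and
32768 bytes by RDMA, a limit of 8K gives 8192 copied and 57344 by RDMA,
and a limit of 0 gives the no-copy case Gleb mentioned, with all 65536
bytes going by RDMA.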
-Paul
--
Paul H. Hargrove phhargr...@lbl.gov
Future Technologies Group
HPC Research Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900