On Wed, Dec 12, 2007 at 01:18:10PM -0800, Paul H. Hargrove wrote:
> Gleb Natapov wrote:
> > On Wed, Dec 12, 2007 at 02:03:02PM -0500, Jeff Squyres wrote:
> >
> >> On Dec 9, 2007, at 10:34 AM, Gleb Natapov wrote:
> >>
> >>> Currently BTL has a parameter btl_min_send_size that is no longer used.
> >>> I want to change it to be btl_rndv_eager_limit. This new parameter
> >>> will determine the size of the first fragment of the rendezvous
> >>> protocol. Now we use btl_eager_limit to set its size.
> >>> btl_rndv_eager_limit will have to be smaller than or equal to
> >>> btl_eager_limit. By default it will be equal to btl_eager_limit, so
> >>> no behavior change will be observed if the default is used.
> >>>
> >> Can you describe why it would be better to have the value less than
> >> the eager limit?
> >>
> > It is just one more knob to tune the OB1 algorithm. I sometimes don't want
> > to send any data by copy in/out at all. This is not possible right now.
> > With this new param I will be able to control this.
> >
>
> From my experience tuning RDMA-rendezvous for the GASNet communications
> library, I know that it was beneficial to piggyback some portion of the
> payload on the rendezvous request. However, the best [insert your
> favorite performance metric here] was not always achieved by
> piggybacking the maximum that could be buffered at the receiver
> (equivalent of btl_eager_limit). If I understand correctly, Gleb's
> btl_rndv_eager_limit parameter would allow tuning for this behavior in OMPI.

Exactly. You explained it better than me.
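
To make the knob concrete, here is a minimal C sketch of how a message would be split under the proposed parameter. It is not actual OB1 code: plan_send() and struct send_plan are hypothetical names, header overhead is ignored, and it assumes that messages no larger than btl_eager_limit are sent entirely by copy in/out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical illustration only, not Open MPI code. */
struct send_plan {
    size_t copied_bytes; /* sent by copy in/out (eager or first rndv fragment) */
    size_t rdma_bytes;   /* left for RDMA after the rendezvous handshake */
};

/* Decide how many bytes go by copy in/out and how many by RDMA.
 * Assumption: header sizes are ignored and the remainder of a
 * rendezvous message is moved entirely by RDMA. */
static struct send_plan plan_send(size_t msg_size,
                                  size_t eager_limit,
                                  size_t rndv_eager_limit)
{
    struct send_plan plan;

    /* The proposal requires rndv_eager_limit <= eager_limit. */
    assert(rndv_eager_limit <= eager_limit);

    if (msg_size <= eager_limit) {
        /* Small message: everything is sent eagerly. */
        plan.copied_bytes = msg_size;
        plan.rdma_bytes = 0;
    } else {
        /* Rendezvous: msg_size > eager_limit >= rndv_eager_limit, so the
         * first fragment carries exactly rndv_eager_limit bytes. */
        plan.copied_bytes = rndv_eager_limit;
        plan.rdma_bytes = msg_size - rndv_eager_limit;
    }
    return plan;
}
```

With an eager limit of 32K and a 64K message, setting rndv_eager_limit to 32K gives the current behavior (32K copied, 32K by RDMA), 8K gives 8K copied and 56K by RDMA, and 0 sends no payload by copy in/out at all, which is the case I cannot get today.
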
>
> An artificial/simplified example would be if the eager limit is 32K and
> you have a 64K xfer. Is it better to send 32K copy in/out plus 32K by
> RDMA, or to send 8K copy in/out plus 56K by RDMA? If the memcpy()
> overhead for 32K of eager payload exceeds what can be overlapped with
> the rendezvous setup then the second may be the better choice (higher
> bandwidth, lower latency, and lower CPU overheads on both sender and
> receiver).
>

--
Gleb.