On Mon, May 19, 2008 at 10:12:19PM +0300, Gleb Natapov wrote:
> On Mon, May 19, 2008 at 01:52:22PM -0500, Jon Mason wrote:
> > On Mon, May 19, 2008 at 05:17:57PM +0300, Gleb Natapov wrote:
> > > On Mon, May 19, 2008 at 05:08:17PM +0300, Pavel Shamis (Pasha) wrote:
> > > > >> 5. ...?
> > > > >>
> > > > > What about moving posting of receive buffers into main thread. With
> > > > > SRQ it is easy: don't post anything in CPC thread. Main thread will
> > > > > prepost buffers automatically after first fragment received on the
> > > > > endpoint (in btl_openib_handle_incoming()).
> > > > It still doesn't guaranty that we will not see RNR (as I understand we
> > > > trying to resolve this problem for iwarp?!)
> > > >
> > > I don't think that iwarp has SRQ at all. And if it has then it should
> >
> > While Chelsio does not currently have an adapter that has SRQs, there are
> > some other iWARP vendors that do have them.
> >
> > > have HW flow control for it too. I don't see what advantage SRQ without
> > > flow control can provide over PPRQ.
> >
> > Technically, this is not flow control, it is a retransmit. iWARP can use
> > the HW TCP stack to retransmit, but it will not have the "retransmit
> > forever" ability that setting rnr_retry to 7 has for IB.
> For how long will it try to retransmit before dropping connection.
>
> >
> > > > So this solution will cost 1 buffer on each srq ... sounds acceptable
> > > > for me. But I don't see too much
> > > > difference compared to #1, as I understand we anyway will be need the
> > > > pipe for communication with main thread.
> > > > so why don't use #1 ?
> > > What communication? No communication at all. Just don't prepost buffers
> > > to SRQ during connection establishment. Problem solved (only for SRQ of
> > > cause).
> >
> > iWARP needs preposted recv buffers (or it will drop the connection). So
> > this isn't a good option.
> I was talking about SRQ only. You said above that iwarp does retransmit for
> SRQ.
> openib BTL relies on HW retransmit when using SRQ, so if iwarp doesn't do it
> reliably enough it can not be used with SRQ anyway.

How iWARP adapters behave with respect to SRQ retransmit is 100% HW
dependent. The HW can queue some of the receives internally, or it can
use the HW TCP stack to retransmit. Of course, relying on either of
these is a BAD thing to do. The SRQ "low-water mark" event is the best
way to handle these cases.

Thanks,
Jon

>
> --
> Gleb.
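
To make the low-water idea concrete, here is a minimal sketch in plain C.
It is a toy model, not openib BTL code and not a libibverbs call sequence:
struct model_srq, srq_post(), srq_arm(), srq_consume() and
simulate_underruns() are all hypothetical names made up for illustration.
It assumes the limit event is one-shot (it fires once when the number of
posted receives drops below the limit and must be re-armed afterwards),
and it assumes the refill happens immediately, which real code cannot
count on. In libibverbs the corresponding pieces would roughly be
ibv_modify_srq() with IBV_SRQ_LIMIT to arm the limit and the
IBV_EVENT_SRQ_LIMIT_REACHED async event to trigger the refill.

```c
#include <assert.h>

/* Hypothetical model of a shared receive queue; not a libibverbs type. */
struct model_srq {
    int posted; /* receive buffers currently posted */
    int limit;  /* low-water mark; 0 means the event is disarmed */
};

/* Post n more receive buffers to the queue. */
static void srq_post(struct model_srq *q, int n)
{
    q->posted += n;
}

/* Arm (or re-arm) the one-shot low-water event. */
static void srq_arm(struct model_srq *q, int limit)
{
    q->limit = limit;
}

/*
 * Consume one buffer for an incoming message.
 * Returns -1 if no buffer was posted (the case where the adapter would
 * have to queue, retransmit, or drop), 1 if the low-water event fired,
 * and 0 otherwise.
 */
static int srq_consume(struct model_srq *q)
{
    if (q->posted == 0)
        return -1;
    q->posted--;
    if (q->limit > 0 && q->posted < q->limit) {
        q->limit = 0; /* assumed one-shot: disarm until re-armed */
        return 1;
    }
    return 0;
}

/*
 * Deliver `messages` messages to a queue of `depth` buffers and count
 * how many arrive with nothing posted. On each low-water event the
 * queue is topped back up to `depth` and the event is re-armed.
 */
int simulate_underruns(int depth, int limit, int messages)
{
    struct model_srq q = { 0, 0 };
    int underruns = 0;

    assert(depth > 0 && limit >= 0 && limit <= depth && messages >= 0);
    srq_post(&q, depth);
    srq_arm(&q, limit);

    for (int i = 0; i < messages; i++) {
        int rc = srq_consume(&q);
        if (rc < 0) {
            underruns++;
        } else if (rc == 1) {
            srq_post(&q, depth - q.posted); /* refill to full depth */
            srq_arm(&q, limit);             /* re-arm the event */
        }
    }
    return underruns;
}
```

With the event armed, the model never runs dry; with the limit left at 0,
every message after the first `depth` finds the queue empty. In practice
the refill is not instantaneous, so the limit has to be large enough to
cover the messages that can arrive before new buffers are posted.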