On Mon, May 19, 2008 at 10:12:19PM +0300, Gleb Natapov wrote:
> On Mon, May 19, 2008 at 01:52:22PM -0500, Jon Mason wrote:
> > On Mon, May 19, 2008 at 05:17:57PM +0300, Gleb Natapov wrote:
> > > On Mon, May 19, 2008 at 05:08:17PM +0300, Pavel Shamis (Pasha) wrote:
> > > > >> 5. ...?
> > > > >>     
> > > > > What about moving the posting of receive buffers into the main thread?
> > > > > With SRQ it is easy: don't post anything in the CPC thread. The main
> > > > > thread will prepost buffers automatically after the first fragment is
> > > > > received on the endpoint (in btl_openib_handle_incoming()).
> > > > It still doesn't guarantee that we will not see RNR (as I understand it,
> > > > we are trying to resolve this problem for iWARP?!)
> > > > 
> > > I don't think that iwarp has SRQ at all. And if it has then it should
> > 
> > While Chelsio does not currently have an adapter that has SRQs, there are
> > some other iWARP vendors that do have them.
> > 
> > > have HW flow control for it too. I don't see what advantage SRQ without
> > > flow control can provide over PPRQ.
> > 
> > Technically, this is not flow control, it is a retransmit.  iWARP can use
> > the HW TCP stack to retransmit, but it will not have the "retransmit
> > forever" ability that setting rnr_retry to 7 has for IB.
> For how long will it try to retransmit before dropping the connection?
> 
> > 
> > > > So this solution will cost 1 buffer on each SRQ ... sounds acceptable
> > > > to me. But I don't see much difference compared to #1: as I
> > > > understand it, we will need the pipe for communication with the
> > > > main thread anyway.
> > > > So why not use #1?
> > > What communication? No communication at all. Just don't prepost buffers
> > > to SRQ during connection establishment. Problem solved (only for SRQ, of
> > > course).
> > 
> > iWARP needs preposted recv buffers (or it will drop the connection).  So
> > this isn't a good option.
> I was talking about SRQ only. You said above that iwarp does retransmit for 
> SRQ.
> openib BTL relies on HW retransmit when using SRQ, so if iwarp doesn't do it
> reliably enough it can not be used with SRQ anyway.

How an iWARP adapter behaves when the SRQ runs out of posted receives is 100%
HW dependent.  The HW can queue some of the receives internally, or use the HW
TCP stack to have the sender retransmit.  Of course, relying on either is a BAD
thing to do.  The SRQ "low-water mark" (SRQ limit) event is the best way to
handle these cases, since it lets us repost buffers before the SRQ runs dry.
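
To make the low-water idea concrete, here is a rough sketch of the logic as a
toy model in plain C.  It is not the verbs API and not what the openib BTL
does today; every toy_* name is made up for illustration, and I am assuming
the usual semantics where the event fires once when the number of posted
receives drops below the limit and then has to be re-armed.

```c
#include <assert.h>

/* Toy model of an SRQ with a low-water mark. All toy_* names are
 * hypothetical; this is not the verbs API and touches no hardware. */
struct toy_srq {
    unsigned capacity; /* maximum number of posted receive buffers */
    unsigned posted;   /* receive buffers currently posted */
    unsigned limit;    /* low-water mark; 0 means the event is disarmed */
};

enum toy_result {
    TOY_OK,          /* message consumed a buffer, nothing else happened */
    TOY_LIMIT_EVENT, /* consumed a buffer and dropped below the limit */
    TOY_EMPTY        /* no buffer posted: the case we want to avoid */
};

/* Start with a full queue and the limit event disarmed. */
static void toy_srq_init(struct toy_srq *q, unsigned capacity)
{
    q->capacity = capacity;
    q->posted = capacity;
    q->limit = 0;
}

/* Arm the one-shot limit event. */
static void toy_srq_arm(struct toy_srq *q, unsigned limit)
{
    assert(limit <= q->capacity);
    q->limit = limit;
}

/* An incoming message takes one posted buffer. */
static enum toy_result toy_srq_consume(struct toy_srq *q)
{
    if (q->posted == 0)
        return TOY_EMPTY;
    q->posted--;
    if (q->limit != 0 && q->posted < q->limit) {
        q->limit = 0; /* one-shot: stays disarmed until re-armed */
        return TOY_LIMIT_EVENT;
    }
    return TOY_OK;
}

/* Limit-event handler: top the queue back up, then re-arm. */
static void toy_srq_on_limit(struct toy_srq *q, unsigned limit)
{
    q->posted = q->capacity;
    toy_srq_arm(q, limit);
}
```

With real verbs, as far as I know, the limit is armed by calling
ibv_modify_srq() with IBV_SRQ_LIMIT and the notification arrives as an
IBV_EVENT_SRQ_LIMIT_REACHED async event, so the refill would happen in
whichever thread reads the async event channel.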

Thanks,
Jon

> 
> --
>                       Gleb.
> _______________________________________________
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
