Quoting r. Sean Hefty <[EMAIL PROTECTED]>:
> Subject: Re: [PATCH] IB/cma: add rdma_establish
>
> Michael S. Tsirkin wrote:
> >>>As a side note, reasons for frequent loss of RTU must be investigated.
> >>
> >>A lost RTU shouldn't be any more likely than a lost REQ or REP.  Is the RTU
> >>never showing up?
> >
> > Seems like that. I know for sure I do accept after REP but remote side never
> > gets ESTABLISHED.
>
> I looked at the code, then ran some tests.  The REP is retried until an RTU is
> received, or its number of retries is exhausted.  By modifying the IB CM, I was
> able to force RTU drops.  Using madeye, I could see that the REP would be
> retried, resulting in the RTU being resent.  After 4 drops, I had the code
> receive the RTU, which allowed the test to proceed.
>
> A couple things to look at in OFED would be the setting of max cm retries and
> the cm timeout.
>
> - Sean

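For reference, the retry behaviour Sean describes can be sketched as a small user-space C model. This is only an illustration, not the ib_cm code: the names simulate_rep_retries, rep_result, max_retries and rtu_drops are made up here, and it assumes the REP is sent once plus up to max_retries retransmissions, with every REP that arrives causing the peer to send one RTU.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the REP/RTU exchange; not the kernel's ib_cm code. */
struct rep_result {
    bool established; /* true if an RTU was eventually received */
    int  rep_sends;   /* REPs transmitted, including the first one */
};

/*
 * Assumption: the passive side sends the REP once and may retransmit it
 * up to max_retries times. Each REP makes the peer (re)send an RTU, and
 * the first rtu_drops of those RTUs are lost on the way back.
 */
struct rep_result simulate_rep_retries(int max_retries, int rtu_drops)
{
    assert(max_retries >= 0 && rtu_drops >= 0);

    struct rep_result r = { false, 0 };
    int allowed_sends = 1 + max_retries;

    for (int i = 0; i < allowed_sends; i++) {
        r.rep_sends++;            /* send (or resend) the REP */
        if (i >= rtu_drops) {     /* this RTU is not dropped */
            r.established = true;
            break;
        }
        /* RTU lost: the REP timer would expire and we retry */
    }
    return r;
}
```

In this model, four dropped RTUs with at least four retries allowed gives an established connection after five REPs, which corresponds to the test above. If fewer retries are allowed than RTUs are dropped, the passive side never reaches ESTABLISHED, which is why the max cm retries and cm timeout settings matter.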
OFED uses the CMA from the upstream kernel. If the default parameters there
are inappropriate, maybe we should fix them?

BTW, how about the idea of exporting max cm retries in a
transport-independent header?

-- 
MST
