Michael S. Tsirkin wrote:
>>>As a side note, reasons for frequent loss of RTU must be investigated.
>>
>>A lost RTU shouldn't be any more likely than a lost REQ or REP. Is the RTU
>>never showing up?
>
> Seems like that. I know for sure I do accept after REP but remote side never
> gets ESTABLISHED.

I looked at the code, then ran some tests. The REP is retried until an RTU is received or its retries are exhausted.

By modifying the IB CM, I was able to force RTU drops. Using madeye, I could see that the REP was retried and that each retry caused the RTU to be resent. After 4 drops, I let the code receive the RTU, which allowed the test to proceed.

A couple of things to look at in OFED would be the max cm retries setting and the cm timeout.

- Sean
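
As a rough illustration of the retry behavior described above, here is a toy model in Python. It is only a sketch and is not the IB CM code: the function name and parameters are hypothetical, and it assumes one initial REP plus up to max_cm_retries retransmissions, with the active side answering every REP it sees with an RTU.

```python
def simulate_rep_retries(max_cm_retries, rtus_to_drop):
    """Toy model of the passive side resending a REP until an RTU arrives.

    All names are hypothetical; this does not use the real IB CM API.
    Returns (established, reps_sent).
    """
    reps_sent = 0
    rtus_dropped = 0
    # Assumption: one initial REP plus up to max_cm_retries retransmissions.
    for _ in range(1 + max_cm_retries):
        reps_sent += 1  # passive side sends (or resends) the REP
        # The active side answers each REP with an RTU; drop the first few.
        if rtus_dropped < rtus_to_drop:
            rtus_dropped += 1  # this RTU is lost, so the REP will be retried
            continue
        return True, reps_sent  # RTU received: connection established
    return False, reps_sent  # retries exhausted without ever seeing an RTU


if __name__ == "__main__":
    print(simulate_rep_retries(max_cm_retries=15, rtus_to_drop=4))
    print(simulate_rep_retries(max_cm_retries=3, rtus_to_drop=4))
```

In this model, four dropped RTUs are survivable only if the retry count is at least 4, which is why the max cm retries and cm timeout settings are worth checking.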
