The timeout is 18 (~1 sec) and the retry count is 7 (the maximum). The error occurs in only about 1% of runs; I sometimes run the same hello_world code in a loop, and once caught it only after 1500 runs. So I don't think it is a cable issue (though I have not checked the port error counters).
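For reference, here is a minimal sketch of where the "~1 sec" figure comes from, assuming the usual InfiniBand encoding of the local ACK timeout as 4.096 us * 2^timeout. The function names are hypothetical and not part of the verbs API. The retry window is only a rough estimate that assumes each of the retry_cnt + 1 attempts waits one full timeout; real HCAs may wait longer per attempt.

```c
#include <stdint.h>

/* Hypothetical helper: convert the 5-bit local ACK timeout field of a QP
 * into seconds, assuming the encoding 4.096 us * 2^timeout.
 * A value of 0 means "no timeout", and values above 31 do not fit the
 * field; both are reported as -1.0 here. */
double ib_ack_timeout_seconds(uint8_t timeout)
{
    if (timeout == 0 || timeout > 31)
        return -1.0;
    return 4.096e-6 * (double)(1ULL << timeout);
}

/* Hypothetical helper: rough time before the sender gives up, ASSUMING
 * the original attempt plus retry_cnt retries each wait one full timeout.
 * retry_cnt is a 3-bit field, so 7 is the maximum. */
double ib_retry_window_seconds(uint8_t timeout, uint8_t retry_cnt)
{
    double t = ib_ack_timeout_seconds(timeout);
    if (t < 0.0 || retry_cnt > 7)
        return -1.0;
    return t * (double)(retry_cnt + 1);
}
```

With timeout = 18 and retry_cnt = 7, the first function gives about 1.07 seconds per attempt, and the second gives roughly 8.6 seconds under the stated assumption.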
--CQ

> -----Original Message-----
> From: Dotan Barak [mailto:[EMAIL PROTECTED]
> Sent: Sunday, October 28, 2007 2:48 AM
> To: Tang, Changqing
> Cc: Sean Hefty; Roland Dreier; [email protected]
> Subject: Re: [ofa-general] message is received but sender report error.
>
> Hi.
>
> Maybe you should increase your timeout/retry count for your application?
> can you check the ports error counters (using perfquery)
> maybe you have bad cables in your subnet ....
>
> Dotan
>
> Tang, Changqing wrote:
> > This is Verbs layer code, no IB CM is used.
> >
> > --CQ
> >
> >> -----Original Message-----
> >> From: Sean Hefty [mailto:[EMAIL PROTECTED]
> >> Sent: Thursday, October 25, 2007 12:38 PM
> >> To: Tang, Changqing; Roland Dreier
> >> Cc: [email protected]
> >> Subject: RE: [ofa-general] message is received but sender report error.
> >>
> >>> If this is the case, how would we fix the problem ? It's hard for us to
> >>> delay to destroy the QP, because we don't know how long to delay.
> >>> The other way is to do something from the driver, or firmware.
> >>
> >> Do you disconnect the QPs using the IB CM?
> >>
> >> - Sean
> >
> > _______________________________________________
> > general mailing list
> > [email protected]
> > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
> >
> > To unsubscribe, please visit
> > http://openib.org/mailman/listinfo/openib-general
