On Wed, 2005-06-01 at 14:49 -0400, Hal Rosenstock wrote:
> Hi Tom,
>
> On Wed, 2005-06-01 at 14:34, Tom Duffy wrote:
> > BTW, Hal, I just saw these fixes checked into Solaris Nevada (11) today:
> >
> > 6246111 IBCM confuses a retried REQ for a stale REQ
>
> This could come into play on SDP or kdapl interoperation with S10.
Further info on this:
During interoperability testing with the Voltaire SM, IBCM timeout
tests exposed a bug in IBCM.
The Voltaire SM reports a high PacketLifeTime (1 sec), which causes IBCM
to retry much more slowly. Because of the delayed retries, the receiving
IBCM treats the retried REQ as a stale REQ and fails the connection.
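
To make the retry-vs-stale distinction concrete, here is a minimal sketch of one way a passive side could classify an incoming REQ by its identifiers instead of by how late it arrives. This is not the Solaris fix; the struct, the field names, and classify_req() are all hypothetical, and the matching rule (same CA GUID and comm ID means retry, same CA GUID and QPN with a different comm ID means stale) is an assumption for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical key for a REQ the passive side has already recorded. */
struct req_key {
    uint64_t remote_ca_guid;  /* CA that sent the REQ */
    uint32_t remote_comm_id;  /* sender's communication ID */
    uint32_t remote_qpn;      /* sender's QP number */
};

enum req_class { REQ_NEW, REQ_RETRY, REQ_STALE };

/*
 * Classify an incoming REQ against the recorded ones (assumed rule):
 *   same CA GUID + same comm ID          -> retry of a REQ in progress
 *   same CA GUID + same QPN, new comm ID -> stale connection
 *   otherwise                            -> new REQ
 * Arrival time is deliberately not consulted, so a slow retry is
 * still recognized as a retry.
 */
enum req_class classify_req(const struct req_key *known, size_t n,
                            const struct req_key *req)
{
    enum req_class result = REQ_NEW;

    for (size_t i = 0; i < n; i++) {
        if (known[i].remote_ca_guid != req->remote_ca_guid)
            continue;
        if (known[i].remote_comm_id == req->remote_comm_id)
            return REQ_RETRY;       /* a retry match always wins */
        if (known[i].remote_qpn == req->remote_qpn)
            result = REQ_STALE;     /* keep scanning for a retry match */
    }
    return result;
}
```

The point of the sketch is only that a retry delayed by a large PacketLifeTime carries the same identifiers as the original REQ, so it does not have to be mistaken for a stale one.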
> > 6247310 IBCM: Third Party SM can return different error code compared to
> > IBSRM
>
> The juxtaposition of CM and SM is confusing to me so I'm not sure
> exactly what this is. Perhaps this is something to do with SA client
> interaction with CM. This is not related to RMPP.
Again, from the bug report:
The IB specification does not dictate which "Error Code" the
Subnet Manager should return if a path lookup from 'A' to 'B'
is not available. It was left to each implementor's interpretation
of the list of error codes available.
IBSRM returns IBMF_SUCCESS with zero records, while
3rd-party SMs (TopSpin, Voltaire) return
IBMF_NO_RECORDS/IBMF_REQ_INVALID.
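
A rough sketch of how a caller might fold those different replies into one "no path" result follows. The enum names are stand-ins I made up for the three IBMF_* codes mentioned in the bug report (I am not assuming their real values), and normalize_path_result() is hypothetical. Treating the "request invalid" reply as "no path" is an assumption based only on the report above.

```c
/* Stand-ins for the status codes named in the bug report (values made up). */
enum sa_status {
    SA_SUCCESS,       /* stands in for IBMF_SUCCESS */
    SA_NO_RECORDS,    /* stands in for IBMF_NO_RECORDS */
    SA_REQ_INVALID,   /* stands in for IBMF_REQ_INVALID */
    SA_OTHER_ERROR    /* any other failure */
};

enum path_result { PATH_FOUND, PATH_NOT_FOUND, PATH_QUERY_FAILED };

/*
 * Map (status, record count) to a single outcome so the caller does
 * not care which SM answered the path query.
 */
enum path_result normalize_path_result(enum sa_status st,
                                       unsigned int num_records)
{
    switch (st) {
    case SA_SUCCESS:
        /* IBSRM style: success with zero records means no path. */
        return num_records > 0 ? PATH_FOUND : PATH_NOT_FOUND;
    case SA_NO_RECORDS:
    case SA_REQ_INVALID:
        /* Third-party SM style, per the bug report. */
        return PATH_NOT_FOUND;
    default:
        return PATH_QUERY_FAILED;
    }
}
```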
> > 6265425 crload test using abort and/or recycle exposes bugs in ibcm
>
> Is crload an Oracle test ?
Yes. It was doing something along the lines of:
post recv buffer to ep
ep_connect
...
/* before connect completes */
ep_disconnect
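
For illustration only, the sequence above can be modeled as a tiny endpoint state machine. The states, events, and ep_step() are hypothetical and are not taken from ibcm or from crload; the sketch just shows the case the test exercises, a disconnect arriving while the connect is still pending, followed by a late connect completion.

```c
/* Hypothetical endpoint states and events, for illustration only. */
enum ep_state { EP_IDLE, EP_CONNECT_PENDING, EP_CONNECTED, EP_DISCONNECTED };
enum ep_event { EV_CONNECT, EV_CONNECT_DONE, EV_DISCONNECT };

/* Return the next state; combinations not listed leave the state unchanged. */
enum ep_state ep_step(enum ep_state s, enum ep_event e)
{
    switch (s) {
    case EP_IDLE:
        if (e == EV_CONNECT)
            return EP_CONNECT_PENDING;
        break;
    case EP_CONNECT_PENDING:
        if (e == EV_CONNECT_DONE)
            return EP_CONNECTED;
        if (e == EV_DISCONNECT)
            return EP_DISCONNECTED;   /* abort before connect completes */
        break;
    case EP_CONNECTED:
        if (e == EV_DISCONNECT)
            return EP_DISCONNECTED;
        break;
    case EP_DISCONNECTED:
        /* A late EV_CONNECT_DONE must not revive the endpoint. */
        break;
    }
    return s;
}
```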
> > Do you think any of these bugs could be contributing to what you are
> > seeing?
>
> Doubtful.
Good. So, I should see the same behavior on Solaris 10 or 11.
-tduffy
