Sean Hefty wrote:
So in the case of a lost DREQ etc., in cm_match_req() we will pass the check for duplicate REQs but fail the check for stale connections, and this can repeat in an endless loop? This seems like a bug to me.
This problem isn't limited to stale connections. If a client tries to connect, gets a reject for whatever reason, ignores the reject, then tries to reconnect with the same parameters, then they've put themselves into an endless loop.
I don't follow: if they don't ignore the reject but reuse the same QP for their successive connection requests, each new REQ will pass the ID check (duplicate REQs) but fail the remote QPN check, correct? So what can a client do to avoid falling into that? What does it mean to not ignore the reject? Note that even if, on getting a reject, they release the QP and allocate a new one, they can get the same QP number.
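To make the sequence I'm asking about concrete, here is a toy model in C of the two checks as I understand them. It is only a sketch: the table layout, the names (match_req, release_conn, conn_table) and the return codes are hypothetical and are not the kernel's cm_match_req() code. It assumes the passive side keeps its entry for the old connection until something releases it, which is the lost-DREQ case.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model only: all names here are hypothetical, not the kernel's. */
enum req_result { REQ_ACCEPTED, REQ_DUPLICATE, REQ_STALE_CONN, REQ_TABLE_FULL };

struct conn_entry {
    int in_use;
    uint64_t remote_ca_guid;
    uint32_t remote_comm_id;
    uint32_t remote_qpn;
};

#define MAX_CONNS 16
struct conn_table { struct conn_entry e[MAX_CONNS]; };

static enum req_result match_req(struct conn_table *t, uint64_t guid,
                                 uint32_t comm_id, uint32_t qpn)
{
    struct conn_entry *free_slot = NULL;
    size_t i;

    assert(t != NULL);

    /* Check 1: same remote comm ID means a retransmitted (duplicate) REQ. */
    for (i = 0; i < MAX_CONNS; i++) {
        struct conn_entry *c = &t->e[i];
        if (!c->in_use) {
            if (free_slot == NULL)
                free_slot = c;
            continue;
        }
        if (c->remote_ca_guid == guid && c->remote_comm_id == comm_id)
            return REQ_DUPLICATE;
    }

    /* Check 2: new comm ID, but the remote QPN is still tracked: stale. */
    for (i = 0; i < MAX_CONNS; i++) {
        struct conn_entry *c = &t->e[i];
        if (c->in_use && c->remote_ca_guid == guid && c->remote_qpn == qpn)
            return REQ_STALE_CONN;
    }

    if (free_slot == NULL)
        return REQ_TABLE_FULL;

    free_slot->in_use = 1;
    free_slot->remote_ca_guid = guid;
    free_slot->remote_comm_id = comm_id;
    free_slot->remote_qpn = qpn;
    return REQ_ACCEPTED;
}

/* Models the passive side finally dropping its state for the old QP. */
static void release_conn(struct conn_table *t, uint64_t guid, uint32_t qpn)
{
    size_t i;

    assert(t != NULL);
    for (i = 0; i < MAX_CONNS; i++) {
        struct conn_entry *c = &t->e[i];
        if (c->in_use && c->remote_ca_guid == guid && c->remote_qpn == qpn)
            c->in_use = 0;
    }
}
```

In this model, a client that retries with a fresh comm ID but the same QPN gets REQ_STALE_CONN on every attempt until release_conn() runs on the passive side, which is the loop I am worried about.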
Yes, this seems able to solve the keep-alive issue in a generic fashion for all ULPs using the IB CM. Will you be able to look into this during the next few weeks?
This method can be used by apps today. The only enhancement that I can see being made is having the CM automatically send the messages at regular intervals. But I hesitate to add this to the CM since it doesn't have knowledge of traffic occurring over the QP, and may interfere with an app that actually wants to change alternate path information.
You mean one side sends a LAP message with the current path and the peer replies with an APR message confirming this is fine? I guess this LAP sending has to be carried out by both sides, correct? And it's not supported for RDMA-CM users...
As for your comments: assuming an app must notify the CM when it no longer uses a QP (and if it does not, we declare it an RTFM bug), then as long as the QP is alive from the CM's point of view, it is perfectly fine to send these LAPs. Doing this once every few seconds or tens of seconds will not create heavy load, I think. As for the point about interfering with apps that want to use LAP/APR for an APM implementation over their protocols, we can let the CM consumer specify whether they want the CM to issue keep-alives for them, and how frequently the messages should be sent.
Or.
