> From: Or Gerlitz [mailto:[EMAIL PROTECTED]
>
> Rimmer, Todd wrote:
> > This approach will not work. If the QP is in RTS the Communication
> > established event will never be generated. Hence the lost RTU case
> > would not be properly handled and the ULP would need to take on the
> > burden. Its much better to isolate the solution to the CM and let the
> > ULP post to the send Q in RTR.
>
> I might miss you allover also is there a chance you might not read the
> patches with enough attention?
>
> Lets first agree that you don't refer to CMA consumers for which the CMA
> does the state transitions, since for them the CM will always get the
> COMM_EST async event and will emulate an RTU reception, that is will
> transition the cm id state and generate CM_USER_ESTABLISHED event for
> the CMA which will modify the qp state to RTS and generate
> RDMA_ESTABLISHED event to the ULP.
>
> So might mean to other types of CM/CMA consumers, please provide the
> details, specifically what makes you state "if the QP is in RTS".
>
> Or.

My comment was in response to Sean's comment:

> I think it would be simpler to transition the QP to RTS after sending a
> REP, with the restriction that a user may not post sends until an RTU is
> received, a communication establish event occurs, or a receive message
> completes on the QP.

Hence, this was not in the patches; it was something he was proposing
as an alternative.

My point is that if the CMA moved the QP to RTS, the CMA would not get
an HCA Communication Established Async Event, in which case the CMA
would have no vehicle to generate the communication established event
to the CMA consumers.

It seems burdensome for all CMA consumers to need to implement an
alternate Tx queue which will only be used for this one rare situation.
The result would be that few CMA consumers would implement it and it
would be difficult to test. Hence it is best for the CMA and stack to
address the race itself.

The particular rare race is the case where:

1. The active side CMA consumer completes the connection process (and
   the CMA sends the RTU).
2. The active side immediately sends a message.
3. The passive side CQ callback occurs before the CMA gets the RTU or
   the Communication Established Async event (and hence before the CMA
   has moved the QP to RTS).

While this race sounds rare, it's the kind of thing which will happen
in some large cluster under heavy stress. In that case it will be hard
to debug, so it's better to design out the race from the start.

In this rare case, the passive side needs to queue any response TX it
may want to do until it gets Communication Established. This sidebar
queue would not be required after the communication established
callback. However, to avoid CMA consumer protocol errors, the CMA
consumer would have to make sure the messages on this TX queue were
unconditionally sent before any future sends.

As it turns out, we already have such a Q: the Send Q. The Send Q was
created previously, and the only true limitation is that per IBTA the
HCA hardware may not accept send doorbells until in RTS.
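
To make the consumer-side burden concrete, here is a minimal sketch of
the sidebar TX queue described above, modelled in Python rather than
against any real verbs API. The class name, the post_send callable and
the on_established callback are all hypothetical; the sketch only shows
the ordering rule (anything queued during the race must go out before
any later send).

```python
from collections import deque


class PassiveSideConsumer:
    """Hypothetical ULP that holds responses until the connection is established."""

    def __init__(self, post_send):
        self._post_send = post_send   # assumed callable that posts to the real Send Q
        self._established = False
        self._pending = deque()       # sidebar TX queue, only used during the race

    def send(self, msg):
        # Queue behind anything still pending so message order is preserved.
        if not self._established or self._pending:
            self._pending.append(msg)
        else:
            self._post_send(msg)

    def on_established(self):
        # Communication established callback: drain the sidebar queue first.
        self._established = True
        while self._pending:
            self._post_send(self._pending.popleft())


posted = []
ulp = PassiveSideConsumer(posted.append)
ulp.send("reply-1")     # CQ callback ran before establishment: held back
ulp.on_established()    # sidebar queue drains in order
ulp.send("reply-2")     # normal path from here on
```

Every consumer would need some variant of this logic, and it would only
be exercised when the race actually occurs, which is what makes it hard
to test.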

So one possible approach internal to the stack would be to allow CMA
consumers to post to the Send Q when the QP is in RTR, but not inform
the HCA QP of these WQEs until the QP is moved to RTS. The HCA driver
could keep track of how many Send Q posts occurred while in RTR; then,
upon movement to RTS, it could issue the appropriate doorbells to the
hardware.

The above approach would solve the race in a way that is completely
transparent to CMA consumers.

Todd Rimmer
_______________________________________________
openib-general mailing list
[email protected]
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
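
The deferred-doorbell idea proposed in this message can be sketched as a
toy model. This is not driver code: the QPState enum, the
DeferredDoorbellQP class and the ring_doorbell hook are hypothetical
names chosen for illustration, and a real HCA driver would do this in
its post-send and modify-QP paths. The sketch only shows the
bookkeeping: posts are accepted in RTR, counted, and announced to the
hardware once the QP reaches RTS.

```python
from enum import Enum


class QPState(Enum):
    RTR = "RTR"
    RTS = "RTS"


class DeferredDoorbellQP:
    """Toy model of a QP whose driver delays send doorbells until RTS."""

    def __init__(self, ring_doorbell):
        self.state = QPState.RTR
        self._ring_doorbell = ring_doorbell  # assumed hook: announce n new WQEs to the HCA
        self.send_queue = []                 # WQEs in post order
        self._unrung = 0                     # WQEs posted in RTR, not yet announced

    def post_send(self, wqe):
        self.send_queue.append(wqe)
        if self.state is QPState.RTS:
            self._ring_doorbell(1)
        else:
            self._unrung += 1   # accept the post, keep the hardware uninformed

    def modify_to_rts(self):
        self.state = QPState.RTS
        if self._unrung:
            # Announce everything that was posted while in RTR.
            self._ring_doorbell(self._unrung)
            self._unrung = 0


doorbells = []
qp = DeferredDoorbellQP(doorbells.append)
qp.post_send("resp-1")   # posted in RTR: no doorbell yet
qp.post_send("resp-2")
qp.modify_to_rts()       # one doorbell covering both deferred WQEs
qp.post_send("resp-3")   # in RTS: doorbell rung immediately
```

Because the Send Q itself preserves post order, the consumer needs no
separate queue and no special case for the race.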
