On Wed, Oct 26, 2011 at 12:04 AM, Roland Dreier <rol...@purestorage.com> wrote:
> Sorry, but now I confused about what the bug is. You have a QP
> associated with an SRQ, and you transition the QP to error. At some
> point you get a "last WQE received" event for that QP (which means
> all receive requests for that QP have completed), and you drain the
> CQ after that, which means you are guaranteed to have processed
> all the receive requests for that QP. (At least this is how the verbs
> are supposed to work).
>
> I don't think there is any way to guarantee that every request posted
> to the SRQ is taken by a QP. You just have to make sure that you
> tear down every QP as above, and then you know when there are
> no more QPs associated to the SRQ that any receive requests that
> haven't completed are still on the SRQ, and you can free their
> resources after you destroy the SRQ.

I've learned something since I posted the message at the start of this
thread: all error completions are posted on the CQ, but draining the CQ
immediately after having received the last WQE event is not sufficient.
The CQ polling loop has to remain active a little longer in order to
receive all receive and send error completions. Issuing "rmmod ib_srpt"
during I/O now works without delay for SVN trunk r3900.

I've had another look at the LIO version of ib_srpt, and the way it
handles completions should ensure that all error completions get
processed (if any such completions are generated, of course).

Can I conclude from your reply that the "last WQE" event refers to the
SRQ only and that it does not provide any information about the send
queue associated with the same QP?

Bart.
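
For illustration, here is a minimal user-space sketch of the "keep the
polling loop alive a little longer" idea. It is not the ib_srpt code and
it does not use the kernel verbs API: struct fake_cq, fake_poll_cq(),
fake_hw_tick(), struct channel and the sq_outstanding counter are all
hypothetical names made up for this example. It also assumes that the
driver counts its own posted send requests, which is only one possible
way to decide when the channel is fully drained.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for a completion queue; NOT the kernel verbs API. */
enum wc_kind { WC_SEND, WC_RECV };

struct fake_cq {
    const enum wc_kind *entries; /* completions that will eventually arrive */
    size_t count;                /* total number of entries */
    size_t visible;              /* how many have been delivered so far */
    size_t head;                 /* next entry to hand to the poller */
};

/* Hypothetical per-channel bookkeeping (an assumption of this sketch). */
struct channel {
    unsigned sq_outstanding; /* send requests posted but not yet completed */
    unsigned recv_done;      /* receive completions processed so far */
    bool last_wqe_seen;      /* set when the last WQE event was received */
};

/* Return 1 and fill *kind if a completion is available, else return 0. */
static int fake_poll_cq(struct fake_cq *cq, enum wc_kind *kind)
{
    if (cq->head >= cq->visible)
        return 0;
    *kind = cq->entries[cq->head++];
    return 1;
}

/* Simulate the hardware delivering up to n more completions. */
static void fake_hw_tick(struct fake_cq *cq, size_t n)
{
    cq->visible += n;
    if (cq->visible > cq->count)
        cq->visible = cq->count;
}

/* Process every completion that is visible on the CQ right now. */
static void process_available(struct channel *ch, struct fake_cq *cq)
{
    enum wc_kind kind;

    while (fake_poll_cq(cq, &kind)) {
        if (kind == WC_SEND)
            ch->sq_outstanding--;
        else
            ch->recv_done++;
    }
}

/* Drained only if the event was seen AND no send request is pending. */
static bool channel_drained(const struct channel *ch)
{
    return ch->last_wqe_seen && ch->sq_outstanding == 0;
}

/*
 * Keep polling until the channel is drained instead of stopping after a
 * single pass. Returns the number of polling rounds that were needed;
 * the cap of 1000 rounds only guards the simulation against looping forever.
 */
static unsigned drain_channel(struct channel *ch, struct fake_cq *cq)
{
    unsigned rounds = 0;

    do {
        fake_hw_tick(cq, 1); /* one more completion trickles in per round */
        process_available(ch, cq);
        rounds++;
    } while (!channel_drained(ch) && rounds < 1000);

    return rounds;
}
```

In this simulation a single process_available() pass right after the
event sees only the completions that happen to be visible at that
moment, while drain_channel() keeps going until the send counter reaches
zero. Whether a counter like this is needed at all depends on the answer
to the question above about what the last WQE event covers.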