On Tue, Oct 25, 2011 at 11:09 AM, Bart Van Assche <bvanass...@acm.org> wrote:
> It's a little more complex than that. The original version of ib_srpt
> stops polling for completions as soon as the last WQE event has been
> received and after that the queue has been drained. So I don't know
> whether these flush errors were not delivered or whether these were
> delivered too late.

Sorry, but now I'm confused about what the bug is.  You have a QP
associated with an SRQ, and you transition the QP to error.  At some
point you get a "last WQE reached" event for that QP (which means
all receive requests for that QP have completed), and you drain the
CQ after that, which means you are guaranteed to have processed
all the receive requests for that QP.  (At least this is how the verbs
are supposed to work).
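
To make the ordering concrete, here is a rough sketch of that sequence
as a toy user-space model.  Every name in it (toy_qp, toy_cq,
toy_teardown_qp and so on) is made up for illustration; it is not the
kernel verbs API and not the ib_srpt code, and it simply assumes the
hardware queues the flush completions before raising the event.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy user-space model; every name here is hypothetical, not the verbs API. */
enum toy_qp_state { TOY_QP_RTS, TOY_QP_ERR };

struct toy_cq {
    int pending;   /* completions queued but not yet polled */
    int polled;    /* completions the consumer has processed */
};

struct toy_qp {
    enum toy_qp_state state;
    int inflight_recv;       /* receives taken from the SRQ, not yet completed */
    bool last_wqe_reached;   /* set once the "last WQE reached" event is seen */
    struct toy_cq *cq;
};

/* Step 1: transition to error.  In this model the flush completions are
 * queued on the CQ first, and only then is the event raised (step 2). */
static void toy_qp_to_error(struct toy_qp *qp)
{
    qp->state = TOY_QP_ERR;
    qp->cq->pending += qp->inflight_recv;
    qp->inflight_recv = 0;
    qp->last_wqe_reached = true;
}

/* Step 3: poll until the CQ is empty; returns how many were processed. */
static int toy_drain_cq(struct toy_cq *cq)
{
    int n = 0;

    while (cq->pending > 0) {
        cq->pending--;
        cq->polled++;
        n++;
    }
    return n;
}

/* Full teardown: returns the number of completions processed by the
 * drain, or -1 if the event has not been seen yet. */
static int toy_teardown_qp(struct toy_qp *qp)
{
    toy_qp_to_error(qp);
    if (!qp->last_wqe_reached)
        return -1;
    return toy_drain_cq(qp->cq);
}
```

The point of the model is only the order: error transition, then the
event, then the drain.  If the drain stops before the event has been
handled, flush completions can be left behind on the CQ.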

So what goes wrong with ConnectX?

I don't think there is any way to guarantee that every request posted
to the SRQ is taken by a QP.  You just have to make sure that you
tear down every QP as above.  Then, once no QPs are associated with
the SRQ any more, you know that any receive requests that haven't
completed are still on the SRQ, and you can free their resources
after you destroy the SRQ.
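
As a rough illustration of that bookkeeping, here is a toy user-space
sketch.  The names (toy_srq, toy_complete, toy_srq_destroy_and_free)
and the fixed queue size are assumptions made up for the example; this
is not the verbs API.  A buffer slot is cleared when its receive
completes, so whatever is still set once the last QP is gone never
left the SRQ and can be freed along with it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define TOY_SRQ_SIZE 8   /* hypothetical queue depth */

struct toy_srq {
    void *bufs[TOY_SRQ_SIZE];  /* posted receive buffers; NULL once completed */
    int attached_qps;          /* QPs still associated with this SRQ */
    bool destroyed;
};

/* Post TOY_SRQ_SIZE buffers of len bytes; returns 0, or -1 on allocation failure. */
static int toy_srq_init(struct toy_srq *srq, size_t len)
{
    srq->attached_qps = 0;
    srq->destroyed = false;
    for (int i = 0; i < TOY_SRQ_SIZE; i++)
        srq->bufs[i] = NULL;
    for (int i = 0; i < TOY_SRQ_SIZE; i++) {
        srq->bufs[i] = malloc(len);
        if (!srq->bufs[i]) {
            while (i-- > 0) {
                free(srq->bufs[i]);
                srq->bufs[i] = NULL;
            }
            return -1;
        }
    }
    return 0;
}

/* A receive completed (successfully or flushed): the buffer is handled
 * by the completion path and no longer belongs to the SRQ. */
static void toy_complete(struct toy_srq *srq, int idx)
{
    free(srq->bufs[idx]);
    srq->bufs[idx] = NULL;
}

/* Refuses (-1) while any QP is still attached.  Otherwise marks the SRQ
 * destroyed and frees every buffer that never completed; returns how
 * many buffers were freed. */
static int toy_srq_destroy_and_free(struct toy_srq *srq)
{
    int freed = 0;

    if (srq->attached_qps > 0)
        return -1;
    srq->destroyed = true;
    for (int i = 0; i < TOY_SRQ_SIZE; i++) {
        if (srq->bufs[i]) {
            free(srq->bufs[i]);
            srq->bufs[i] = NULL;
            freed++;
        }
    }
    return freed;
}
```

The check on attached_qps is the important part: freeing is only safe
after every QP has been torn down and the SRQ itself is gone.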

 - R.
