Re: mlx4 missing completions (was Re: [PATCH-v2] ib_srpt: Initial SRP Target merge for v3.2-rc1)
On Wed, Oct 26, 2011 at 10:06 PM, Roland Dreier <rol...@purestorage.com> wrote:
> On Wed, Oct 26, 2011 at 11:05 AM, Bart Van Assche <bvanass...@acm.org> wrote:
>> Can I conclude from your reply that the last WQE event refers to the
>> SRQ only and that it does not provide any information about the send
>> queue associated with the same QP?
>
> That is correct as I understand things. The last WQE event was added to
> handle the fact that if you have a QP associated with an SRQ, you can't
> know how many receives have been taken from the SRQ and are in flight on
> the receive queue. The IB spec is a bit vague on what "last WQE" actually
> means, but based on that background I believe it means that the last
> request on the receive queue has been processed, without saying anything
> about the send queue.

Are you sure about this? There is a paragraph in the InfiniBand
Architecture Specification that can only be correct if the last WQE event
refers to both the SRQ and the send queue, and not to the SRQ only: that
document states that it is safe to reset the QP immediately after the last
WQE event has been received. A quote:

  Note, for QPs that are associated with an SRQ, the Consumer should take
  the QP through the Error State before invoking a Destroy QP or a Modify
  QP to the Reset State. The Consumer may invoke the Destroy QP without
  first performing a Modify QP to the Error State and waiting for the
  Affiliated Asynchronous Last WQE Reached Event. However, if the Consumer
  does not wait for the Affiliated Asynchronous Last WQE Reached Event,
  then WQE and Data Segment leakage may occur. Therefore, it is good
  programming practice to tear down a QP that is associated with an SRQ by
  using the following process:
  • Put the QP in the Error State;
  • wait for the Affiliated Asynchronous Last WQE Reached Event;
  • either:
    • drain the CQ by invoking the Poll CQ verb and either wait for the CQ
      to be empty or until the number of Poll CQ operations has exceeded
      the CQ capacity size; or
    • post another WR that completes on the same CQ and wait for this WR
      to return as a WC;
  • and then invoke a Destroy QP or Reset QP.

Bart.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: mlx4 missing completions (was Re: [PATCH-v2] ib_srpt: Initial SRP Target merge for v3.2-rc1)
On Wed, Oct 26, 2011 at 12:04 AM, Roland Dreier <rol...@purestorage.com> wrote:
> Sorry, but now I'm confused about what the bug is. You have a QP
> associated with an SRQ, and you transition the QP to error. At some
> point you get a last WQE reached event for that QP (which means all
> receive requests for that QP have completed), and you drain the CQ
> after that, which means you are guaranteed to have processed all the
> receive requests for that QP. (At least this is how the verbs are
> supposed to work.)
>
> I don't think there is any way to guarantee that every request posted
> to the SRQ is taken by a QP. You just have to make sure that you tear
> down every QP as above; then, once there are no more QPs associated
> with the SRQ, you know that any receive requests that haven't completed
> are still on the SRQ, and you can free their resources after you
> destroy the SRQ.

I've learned something since I posted the message at the start of this
thread: all error completions are posted on the CQ, but draining the CQ
immediately after having received the last WQE event is not sufficient.
The CQ polling loop has to remain active a little longer in order to
receive all receive and send error completions. Issuing "rmmod ib_srpt"
during I/O now works without delay for SVN trunk r3900.

I've had another look at the LIO version of ib_srpt, and the way it
handles completions should ensure that all error completions get
processed (if any such completions are generated, of course).

Can I conclude from your reply that the last WQE event refers to the SRQ
only and that it does not provide any information about the send queue
associated with the same QP?

Bart.
mlx4 missing completions (was Re: [PATCH-v2] ib_srpt: Initial SRP Target merge for v3.2-rc1)
On Mon, Oct 24, 2011 at 11:07 PM, Bart Van Assche <bvanass...@acm.org> wrote:
>> As far as I know every HCA supported by Linux does implement this
>> correctly. Which class did you have in mind as not doing that?
>
> At least QDR ConnectX-2 HCAs with fairly recent firmware. This behavior
> can be reproduced easily with the SCST version of ib_srpt as follows:
> 1. Log in from an SRP initiator to ib_srpt.
> 2. Start a direct I/O read test, e.g. with fio.
> 3. Issue the command "rmmod ib_srpt" on the target during I/O.

OK, this is a pretty serious bug in mlx4_ib if true. Are you sure that
you really are seeing some pending work requests not generating a flush
error when the QP transitions to the error state? Keep in mind that
transitioning the QP to the reset state can cause completion entries to
disappear after they are generated.

 - R.
Re: mlx4 missing completions (was Re: [PATCH-v2] ib_srpt: Initial SRP Target merge for v3.2-rc1)
On Tue, Oct 25, 2011 at 1:17 PM, Roland Dreier <rol...@purestorage.com> wrote:
> On Mon, Oct 24, 2011 at 11:07 PM, Bart Van Assche <bvanass...@acm.org> wrote:
>>> As far as I know every HCA supported by Linux does implement this
>>> correctly. Which class did you have in mind as not doing that?
>>
>> At least QDR ConnectX-2 HCAs with fairly recent firmware. This behavior
>> can be reproduced easily with the SCST version of ib_srpt as follows:
>> 1. Log in from an SRP initiator to ib_srpt.
>> 2. Start a direct I/O read test, e.g. with fio.
>> 3. Issue the command "rmmod ib_srpt" on the target during I/O.
>
> OK, this is a pretty serious bug in mlx4_ib if true. Are you sure that
> you really are seeing some pending work requests not generating a flush
> error when the QP transitions to the error state?

It's a little more complex than that. The original version of ib_srpt
stops polling for completions as soon as the last WQE event has been
received and the CQ has subsequently been drained. So I don't know
whether these flush errors were not delivered or whether they were
delivered too late.

Bart.
Re: mlx4 missing completions (was Re: [PATCH-v2] ib_srpt: Initial SRP Target merge for v3.2-rc1)
On Tue, Oct 25, 2011 at 11:09 AM, Bart Van Assche <bvanass...@acm.org> wrote:
> It's a little more complex than that. The original version of ib_srpt
> stops polling for completions as soon as the last WQE event has been
> received and the CQ has subsequently been drained. So I don't know
> whether these flush errors were not delivered or whether they were
> delivered too late.

Sorry, but now I'm confused about what the bug is. You have a QP
associated with an SRQ, and you transition the QP to error. At some
point you get a last WQE reached event for that QP (which means all
receive requests for that QP have completed), and you drain the CQ after
that, which means you are guaranteed to have processed all the receive
requests for that QP. (At least this is how the verbs are supposed to
work.) So what goes wrong with ConnectX?

I don't think there is any way to guarantee that every request posted to
the SRQ is taken by a QP. You just have to make sure that you tear down
every QP as above; then, once there are no more QPs associated with the
SRQ, you know that any receive requests that haven't completed are still
on the SRQ, and you can free their resources after you destroy the SRQ.

 - R.