Re: mlx4 missing completions (was Re: [PATCH-v2] ib_srpt: Initial SRP Target merge for v3.2-rc1)

2011-10-27 Thread Bart Van Assche
On Wed, Oct 26, 2011 at 10:06 PM, Roland Dreier rol...@purestorage.com wrote:
 On Wed, Oct 26, 2011 at 11:05 AM, Bart Van Assche bvanass...@acm.org wrote:
  Can I conclude from your reply that the last WQE event refers to the
  SRQ only and that it does not provide any information about the send
  queue associated with the same QP?

 That is correct as I understand things.  The last WQE event was added
 to handle the fact that if you have a QP associated with an SRQ, you
 can't know how many receives have been taken from the SRQ and are in
 flight on the receive queue.  The IB spec is a bit vague on what "last
 WQE" actually means, but based on the background, I believe it means
 that the last request on the receive queue has been processed, without
 saying anything about the send queue.
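
For illustration, this is roughly how a ULP observes that event with the
kernel verbs API. A minimal sketch, not taken from ib_srpt: struct conn,
its field names and conn_qp_event() are hypothetical.

#include <linux/completion.h>
#include <rdma/ib_verbs.h>

/* Hypothetical per-connection state; only the completion matters here. */
struct conn {
	struct completion last_wqe_done;
};

/*
 * QP async event handler, registered through ib_qp_init_attr.event_handler
 * with qp_context pointing at the connection.  As described above, the
 * last WQE reached event only covers the receive side; it says nothing
 * about outstanding send work requests.
 */
static void conn_qp_event(struct ib_event *event, void *ctx)
{
	struct conn *c = ctx;

	if (event->event == IB_EVENT_QP_LAST_WQE_REACHED)
		complete(&c->last_wqe_done);
}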

Are you sure the last WQE event says nothing about the send queue?
There is a paragraph in the InfiniBand Architecture Specification that
can only be correct if the last WQE event refers to both the SRQ and
the send queue, and not to the SRQ only. That specification says it is
safe to reset the QP once the last WQE event has been received and the
CQ has been drained. A quote:

Note, for QPs that are associated with an SRQ, the Consumer should
take the QP through the Error State before invoking a Destroy QP or a
Modify QP to the Reset State. The Consumer may invoke the Destroy QP
without first performing a Modify QP to the Error State and waiting
for the Affiliated Asynchronous Last WQE Reached Event. However, if
the Consumer does not wait for the Affiliated Asynchronous Last WQE
Reached Event, then WQE and Data Segment leakage may occur. Therefore,
it is good programming practice to tear down a QP that is associated
with an SRQ by using the following process:
• Put the QP in the Error State;
• wait for the Affiliated Asynchronous Last WQE Reached Event;
• either:
  • drain the CQ by invoking the Poll CQ verb and either wait for CQ
    to be empty or the number of Poll CQ operations has exceeded CQ
    capacity size; or
  • post another WR that completes on the same CQ and wait for this
    WR to return as a WC;
• and then invoke a Destroy QP or Reset QP.
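
Translated into the kernel verbs API, that recipe looks roughly like
the sketch below. It continues the hypothetical struct conn from
earlier in the thread (now also carrying the QP and CQ); error handling
is omitted, and, as the rest of this thread shows, the single drain
pass in step 3 may not be enough on all hardware.

#include <linux/completion.h>
#include <rdma/ib_verbs.h>

struct conn {
	struct ib_qp *qp;
	struct ib_cq *cq;
	struct completion last_wqe_done;	/* signalled on last WQE event */
};

static void conn_teardown(struct conn *c)
{
	struct ib_qp_attr attr = { .qp_state = IB_QPS_ERR };
	struct ib_wc wc;

	/* 1. Put the QP in the Error State. */
	ib_modify_qp(c->qp, &attr, IB_QP_STATE);

	/* 2. Wait for the affiliated last WQE reached event. */
	wait_for_completion(&c->last_wqe_done);

	/* 3. Drain the CQ; process or discard the flushed completions. */
	while (ib_poll_cq(c->cq, 1, &wc) > 0)
		;

	/* 4. Only now destroy (or reset) the QP. */
	ib_destroy_qp(c->qp);
}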

Bart.


Re: mlx4 missing completions (was Re: [PATCH-v2] ib_srpt: Initial SRP Target merge for v3.2-rc1)

2011-10-26 Thread Bart Van Assche
On Wed, Oct 26, 2011 at 12:04 AM, Roland Dreier rol...@purestorage.com wrote:
 Sorry, but now I'm confused about what the bug is.  You have a QP
 associated with an SRQ, and you transition the QP to error.  At some
 point you get a last WQE received event for that QP (which means
 all receive requests for that QP have completed), and you drain the
 CQ after that, which means you are guaranteed to have processed
 all the receive requests for that QP.  (At least this is how the verbs
 are supposed to work).

 I don't think there is any way to guarantee that every request posted
 to the SRQ is taken by a QP.  You just have to make sure that you
 tear down every QP as above; then you know, once there are no more
 QPs associated with the SRQ, that any receive requests that haven't
 completed are still on the SRQ, and you can free their resources
 after you destroy the SRQ.

I've learned something since I posted the message at the start of this
thread: all error completions are posted on the CQ, but draining the CQ
immediately after having received the last WQE event is not sufficient.
The CQ polling loop has to remain active a little longer in order to
receive all receive and send error completions. Issuing "rmmod ib_srpt"
during I/O now works without delay for SVN trunk r3900.
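
A sketch of what "remain active a little longer" could look like in
practice: poll until every posted work request is accounted for instead
of making a single drain pass. This is not the actual ib_srpt code;
struct conn and its outstanding counter are hypothetical, and a real
implementation would bound the loop with a timeout.

#include <linux/atomic.h>
#include <linux/delay.h>
#include <rdma/ib_verbs.h>

struct conn {
	struct ib_cq *cq;
	atomic_t outstanding;	/* incremented on every posted send/receive WR */
};

static void conn_drain(struct conn *c)
{
	struct ib_wc wc;

	while (atomic_read(&c->outstanding) > 0) {
		while (ib_poll_cq(c->cq, 1, &wc) > 0)
			atomic_dec(&c->outstanding);
		/* Error completions may still trickle in after the last
		 * WQE event, so back off briefly and poll again. */
		msleep(1);
	}
}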

I've had another look at the LIO version of ib_srpt, and the way it
handles completions should ensure that all error completions get
processed (if any such completions are generated, of course).

Can I conclude from your reply that the last WQE event refers to the
SRQ only and that it does not provide any information about the send
queue associated with the same QP?

Bart.


mlx4 missing completions (was Re: [PATCH-v2] ib_srpt: Initial SRP Target merge for v3.2-rc1)

2011-10-25 Thread Roland Dreier
On Mon, Oct 24, 2011 at 11:07 PM, Bart Van Assche bvanass...@acm.org wrote:
  As far as I know every HCA supported by Linux does implement this
  correctly.  Which class did you have in mind as not doing that?
 
 At least QDR ConnectX-2 HCAs with fairly recent firmware. This
 behavior can be reproduced easily with the SCST version of ib_srpt as
 follows:
 - Log in from an SRP initiator to ib_srpt.
 - Start a direct I/O read test, e.g. with fio.
 - Issue the command "rmmod ib_srpt" on the target during I/O.

OK, this is a pretty serious bug in mlx4_ib if true.  Are
you sure that you really are seeing some pending work
requests not generating a flush error when the QP transitions
to the error state?

Keep in mind that transitioning the QP to the reset state
can cause completion entries to disappear after they
are generated.
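
The ordering that follows from this warning, as a hedged sketch (qp and
cq are assumed to exist; a full teardown would also wait for the last
WQE event, as discussed elsewhere in this thread):

#include <rdma/ib_verbs.h>

static void qp_to_reset(struct ib_qp *qp, struct ib_cq *cq)
{
	struct ib_qp_attr attr = { .qp_state = IB_QPS_RESET };
	struct ib_wc wc;

	/* Consume everything the CQ already holds for this QP first;
	 * completions still queued may vanish once the QP is reset. */
	while (ib_poll_cq(cq, 1, &wc) > 0)
		;

	ib_modify_qp(qp, &attr, IB_QP_STATE);
}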

 - R.


Re: mlx4 missing completions (was Re: [PATCH-v2] ib_srpt: Initial SRP Target merge for v3.2-rc1)

2011-10-25 Thread Bart Van Assche
On Tue, Oct 25, 2011 at 1:17 PM, Roland Dreier rol...@purestorage.com wrote:
 On Mon, Oct 24, 2011 at 11:07 PM, Bart Van Assche bvanass...@acm.org wrote:
   As far as I know every HCA supported by Linux does implement this
   correctly.  Which class did you have in mind as not doing that?
  
  At least QDR ConnectX-2 HCAs with fairly recent firmware. This
  behavior can be reproduced easily with the SCST version of ib_srpt as
  follows:
  1. Log in from an SRP initiator to ib_srpt.
  2. Start a direct I/O read test, e.g. with fio.
  3. Issue the command "rmmod ib_srpt" on the target during I/O.

 OK, this is a pretty serious bug in mlx4_ib if true.  Are
 you sure that you really are seeing some pending work
 requests not generating a flush error when the QP transitions
 to the error state?

It's a little more complex than that. The original version of ib_srpt
stops polling for completions as soon as the last WQE event has been
received and the CQ has subsequently been drained. So I don't know
whether these flush errors were not delivered or whether they were
delivered too late.

Bart.


Re: mlx4 missing completions (was Re: [PATCH-v2] ib_srpt: Initial SRP Target merge for v3.2-rc1)

2011-10-25 Thread Roland Dreier
On Tue, Oct 25, 2011 at 11:09 AM, Bart Van Assche bvanass...@acm.org wrote:
 It's a little more complex than that. The original version of ib_srpt
 stops polling for completions as soon as the last WQE event has been
 received and the CQ has subsequently been drained. So I don't know
 whether these flush errors were not delivered or whether they were
 delivered too late.

Sorry, but now I'm confused about what the bug is.  You have a QP
associated with an SRQ, and you transition the QP to error.  At some
point you get a last WQE received event for that QP (which means
all receive requests for that QP have completed), and you drain the
CQ after that, which means you are guaranteed to have processed
all the receive requests for that QP.  (At least this is how the verbs
are supposed to work).

So what goes wrong with ConnectX?

I don't think there is any way to guarantee that every request posted
to the SRQ is taken by a QP.  You just have to make sure that you
tear down every QP as above; then you know, once there are no more
QPs associated with the SRQ, that any receive requests that haven't
completed are still on the SRQ, and you can free their resources
after you destroy the SRQ.
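
Following that reasoning, SRQ teardown would come last, along the lines
of the sketch below. struct target and its fields are hypothetical; the
point is only that receive buffers still attached to uncompleted WRs on
the SRQ can be reclaimed once the SRQ itself is gone.

#include <linux/slab.h>
#include <rdma/ib_verbs.h>

struct target {
	struct ib_srq *srq;
	int nr_recv_bufs;
	void **recv_buf;	/* one buffer per receive WR ever posted */
};

static void target_teardown_srq(struct target *t)
{
	int i;

	/* All QPs have been torn down, so nothing references the SRQ. */
	ib_destroy_srq(t->srq);

	/* WRs left on the SRQ never completed; free their buffers now. */
	for (i = 0; i < t->nr_recv_bufs; i++)
		kfree(t->recv_buf[i]);
	kfree(t->recv_buf);
}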

 - R.