RE: [PATCH 1/5] xhci: Don't change the ownership of LINK TRB until all the TRB are written

2014-01-29 Thread David Laight
From: Sarah Sharp
 On Wed, Jan 22, 2014 at 03:22:53PM -0800, walt wrote:
  On 01/22/2014 12:56 PM, Sarah Sharp wrote:
   Walt, can you turn on xHCI debugging and look for whether the NEC host
   that worked with David's patch is a 1.0 host?  You'll see a line like:
  
   // @%p = 0x%x (CAPLENGTH AND HCIVERSION)
 
  Hi Sarah.  This is from the NEC host controller, *not* the ASMedia:
 
  [9.089178] xhci_hcd :04:00.0: // @c90004f88000 = 0x960020 
  (CAPLENGTH AND HCIVERSION)
 
 Ok, so that's also a 0.96 host, but it's not impacted by the no-op TRBs.
 Still, I think disabling the recently added code for everything but 1.0
 hosts makes sense.

Actually it might be best to use a urb flag to indicate that the request
isn't 'aligned', and only add the NOP for unaligned transfers.

There is already a usb device flag usb_device_no_sg_constraint() to indicate
that scatter gather transfers may be misaligned.
The only code that looks at it is the ax88179_178a ethernet driver.
There is, of course, no definition of what the 'constraint' is, but
I think all the other requests are in 4k+ blocks.

The only problem is that this requires simultaneous updates of the xhci
and usbnet code.

David



--
To unsubscribe from this list: send the line unsubscribe linux-usb in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 1/5] xhci: Don't change the ownership of LINK TRB until all the TRB are written

2014-01-29 Thread David Laight
From: Sarah Sharp 
 Hi David,
 
 I've been thinking about this some more, and I'd like to propose a much
 simpler solution.
 
 The TD fragment rules didn't go into the xHCI specification until the
 1.0 revision.  The code that follows those rules only seems to trigger
 issues for 0.96 ASMedia hosts, while Intel 1.0 hosts benefit from the
 code when USB ethernet devices are attached.  The patch that changed the
 link TRB behavior (commit 6c12db90f19727c76990e7f4801c67a148b30111) was
 submitted in 2010, and the xHCI 1.0 spec didn't come out until later
 that year, so it's unlikely that Synopsys host is a 1.0 host either.
 
 So why not just limit your code (and the sg table entry limit) to only
 trigger for 1.0 hosts?  That fix would be much less complex than this
 one.  I would still consider the other cleanup patches on to of it for
 inclusion in 3.15, although it looks like patches 4 and 5 rely on this
 patch.

My suspicion is that it is possible to lock up 0.96 ASMedia hosts (or
other hosts that need the link TRB behaviour change) without my patch
that adds NOPs to the ring.

It will happen much less often than when the NOPs are added, but it
can still happen.

If a TD starts with a LINK TRB then the LINK TRB is the last one that is
owned by the controller until the setup completes. If the controller
finishes the previous transfer while the while the setup is still in
progress then it will process the LINK TRB and we are in exactly
the same situation as before patch 6c12db90f1.
This actually means that patch 6c12db90f1 didn't completely fix the problem.

With the NOPs, the driver writes the first NOP at about the same time
as the previous version would have written the LINK. The rest of the NOPs
and the LINK are then probably written faster than the controller reads them.
So I surmise that if the patch wrote a LINK instead of the first NOP
then the controller would hang just as often.
But this is the same as the case when no NOPs are written because the
previous TD finished in the last slot.

I know this is all failing with disk transfers which I believe are all
single threaded. So maybe the LINK TRB that the controller is reading
is one written in response to a previous transfer completing.
This is a timing race which could be affected by small things (like
putting the controller into a different computer).
But I can't see how it is affected by the number of NOPs (including zero).

A patch that only adds the NOPs for transfers that are expected to be
misaligned (or are not known to be aligned - depending on which way
you want to flag things) is ok by me.
That would allow unlimited length aligned sg transfers on v1.00 hosts.

David



--
To unsubscribe from this list: send the line unsubscribe linux-usb in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/5] xhci: Don't change the ownership of LINK TRB until all the TRB are written

2014-01-27 Thread Sarah Sharp
On Thu, Jan 23, 2014 at 10:35:56AM +, David Laight wrote:
 From: Sarah Sharp
  Hi David,
  
  I've been thinking about this some more, and I'd like to propose a much
  simpler solution.
  
  The TD fragment rules didn't go into the xHCI specification until the
  1.0 revision.  The code that follows those rules only seems to trigger
  issues for 0.96 ASMedia hosts, while Intel 1.0 hosts benefit from the
  code when USB ethernet devices are attached.  The patch that changed the
  link TRB behavior (commit 6c12db90f19727c76990e7f4801c67a148b30111) was
  submitted in 2010, and the xHCI 1.0 spec didn't come out until later
  that year, so it's unlikely that Synopsys host is a 1.0 host either.
 
 Ah - so I was right in thinking that a LINK TRB mustn't point to
 a TRB that is still owned by the driver.

For that *particular* piece of hardware.  This is not an issue for other
xHCI hosts; this is not something that's spec mandated.

 The thing is, that patch has a timing bug, and I think that is what Walt
 is hitting. It is there regardless of my NOP change.

That does not make sense.  Walt confirmed that your new patch does not
help him.  Therefore either your patch doesn't close the race condition,
or Walt's ASMedia host does not suffer from it.  Either way, I have no
hard proof your patch is necessary and correct.

 In order to hit the bug I've fixed you need two things:
 1) An xhci controller that doesn't like 'dangling' LINK TRB.
(These will be the ones that made the 6c12 fix needed.)
 and:
 2a) A host system where the timings of the PCIe transfers (especially
 the latency) are such that the controller manages to read a
 LINK TRB that prepare_transfer() has updated while queueing a
 new transfer in response to a transfer completion event.
 or:
 2b) To setup a second transfer just as the last transfer is completing.
 I'm not sure this can happen for disks, but it could happen for
 network traffic.

Neither of us has deep technical knowledge of how this host hardware
works.  We can argue about theoretical race conditions and link TRB
pre-fetches and packet pending flags, but in the end, we don't know how
this buggy host is designed.  We don't know if the timings you mentioned
in #2a are actually a problem.

In fact, looking back at the discussion around this patch, I brought up
the same issue you did about link TRB activation, and the original patch
submitter confirmed it was fine:

http://marc.info/?l=linux-usbm=127325508724339w=2

 So this fix is actually just a version of the 6c12 patch that
 actually works!

The original patch works, and the theoretical race condition this
patchset was supposed to fix is not an issue, based on that link.  If
your new commit causes issues with Synopsys hosts, we'll address it when
someone actually reports the issue.

In the meantime, I'm going to go with the simplest fix, and disable the
no-op TRB code for pre-1.0 hosts.  You are welcome to resubmit this
patchset without the patches that rely on the first patch, if you like.

Sarah Sharp
--
To unsubscribe from this list: send the line unsubscribe linux-usb in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 1/5] xhci: Don't change the ownership of LINK TRB until all the TRB are written

2014-01-23 Thread David Laight
From: Sarah Sharp
 Hi David,
 
 I've been thinking about this some more, and I'd like to propose a much
 simpler solution.
 
 The TD fragment rules didn't go into the xHCI specification until the
 1.0 revision.  The code that follows those rules only seems to trigger
 issues for 0.96 ASMedia hosts, while Intel 1.0 hosts benefit from the
 code when USB ethernet devices are attached.  The patch that changed the
 link TRB behavior (commit 6c12db90f19727c76990e7f4801c67a148b30111) was
 submitted in 2010, and the xHCI 1.0 spec didn't come out until later
 that year, so it's unlikely that Synopsys host is a 1.0 host either.

Ah - so I was right in thinking that a LINK TRB mustn't point to
a TRB that is still owned by the driver.

The thing is, that patch has a timing bug, and I think that is what Walt
is hitting. It is there regardless of my NOP change.
It might be hit by doing back to back single sector transfers.

Without the NOP patch it is unlikely that the transfers will start with
the LINK TRB - since they all use a moderate number of TRB.
Even writing the NOP TRB perturbs the timing slightly - although they
get added quite quickly.

The problem is that, although the 6c12 patch stopped inc_enq() changing
the ownership of LINK TRB, it left prepare_ring() doing so.
This means that there is a finite time where the last TRB is a LINK TRB.
The more fragments a transfer has, the longer this is true.

Clearly this is only a problem if the controller is actually active.
So you might think that you need to be setting up a transfer just as an
earlier transfer is completing. However I think the timings are such
that you can be setting up the transfer in response to the earlier
transfer completing.
This is what makes the error machine dependant.

So this fix is actually just a version of the 6c12 patch that
actually works!
It would look a lot less invasive if you compared it to the code
before that patch was applied.

 So why not just limit your code (and the sg table entry limit) to only
 trigger for 1.0 hosts?  That fix would be much less complex than this
 one.  I would still consider the other cleanup patches on to of it for
 inclusion in 3.15, although it looks like patches 4 and 5 rely on this
 patch.
 
 Walt, can you turn on xHCI debugging and look for whether the NEC host
 that worked with David's patch is a 1.0 host?

In order to hit the bug I've fixed you need two things:
1) An xhci controller that doesn't like 'dangling' LINK TRB.
   (These will be the ones that made the 6c12 fix needed.)
and:
2a) A host system where the timings of the PCIe transfers (especially
the latency) are such that the controller manages to read a
LINK TRB that prepare_transfer() has updated while queueing a
new transfer in response to a transfer completion event.
or:
2b) To setup a second transfer just as the last transfer is completing.
I'm not sure this can happen for disks, but it could happen for
network traffic.

So that fact that any given controller works in a specific system
doesn't mean it will work in any system.
I suspect that any that don't like dangling LINKs will fail
under certain workloads on some systems.

I will look at a patch the mitigates the problems for disk requests.

David



--
To unsubscribe from this list: send the line unsubscribe linux-usb in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/5] xhci: Don't change the ownership of LINK TRB until all the TRB are written

2014-01-22 Thread Sarah Sharp
Hi David,

I've been thinking about this some more, and I'd like to propose a much
simpler solution.

The TD fragment rules didn't go into the xHCI specification until the
1.0 revision.  The code that follows those rules only seems to trigger
issues for 0.96 ASMedia hosts, while Intel 1.0 hosts benefit from the
code when USB ethernet devices are attached.  The patch that changed the
link TRB behavior (commit 6c12db90f19727c76990e7f4801c67a148b30111) was
submitted in 2010, and the xHCI 1.0 spec didn't come out until later
that year, so it's unlikely that Synopsys host is a 1.0 host either.

So why not just limit your code (and the sg table entry limit) to only
trigger for 1.0 hosts?  That fix would be much less complex than this
one.  I would still consider the other cleanup patches on to of it for
inclusion in 3.15, although it looks like patches 4 and 5 rely on this
patch.

Walt, can you turn on xHCI debugging and look for whether the NEC host
that worked with David's patch is a 1.0 host?  You'll see a line like:

// @%p = 0x%x (CAPLENGTH AND HCIVERSION)

That should allow me to decode the host version number.

Sarah Sharp

On Tue, Jan 21, 2014 at 05:12:54PM +, David Laight wrote:
 Some xhci controllers (eg ASMedia) don't like processing a LINK TRB and
 then finding that they can't process the next TRB.
 
 Instead of flipping the cycle bit on the first data TRB, remember the
 real first TRB in prepare_ring() and flip that in giveback_first_trb().
 
 In queue_trb() ignore the cycle bit value set by xhci_queue_bulk_tx() etc,
 and changes the ownership of all but the first TRB.
 
 If prepare_ring() adds any NOP TRB, the ownership of all but the first
 and the LINK are changed. The wmb() is no longer needed before changing
 the ownership of a LINK trb since is is preceeded by a NOP.
 
 Since the 'first trb' might be a LINK trb queue_command() must also now
 call giveback_first_trb(). Don't ring the doorbell here though.
 
 The isoc code calls prepare_ring() right at the start, this ensures that
 there is enough space for all the TRB and advances the write-ptr past
 any LINK TRB. It then calls prepare_transfer() multiple times.
 However the prepare_ring() calls must now be matched with those to
 giveback_first_trb().
 Don't call prepare_ring() from prepare_transfer() for isoc rings.
 Set 'more_trbs_coming' (it means 'advance past a LINK TRB) on all
 the queue_trb() calls except the very last one.
 
 Signed-off-by: David Laight david.lai...@aculab.com
 ---
 Note that this differs to any previous similar patches I've sent
 because it contains a fix to the isoc code.
 However I've only tsted it with USB3 ethernet.
 
  drivers/usb/host/xhci-ring.c | 84 
 +---
  drivers/usb/host/xhci.h  |  1 +
  2 files changed, 57 insertions(+), 28 deletions(-)
 
 diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
 index a0b248c..62049e5 100644
 --- a/drivers/usb/host/xhci-ring.c
 +++ b/drivers/usb/host/xhci-ring.c
 @@ -2920,6 +2920,16 @@ static void queue_trb(struct xhci_hcd *xhci, struct 
 xhci_ring *ring,
   struct xhci_generic_trb *trb;
  
   trb = ring-enqueue-generic;
 +
 + /*
 +  * Ignore the caller's CYCLE bit.
 +  * The caller doesn't know whether the real first TRB is
 +  * actually a LINK (or NOP) TRB.
 +  */
 + field4 = (field4  ~TRB_CYCLE) | ring-cycle_state;
 + if (trb == ring-enqueue_first-generic)
 + field4 ^= TRB_CYCLE;
 +
   trb-field[0] = cpu_to_le32(field1);
   trb-field[1] = cpu_to_le32(field2);
   trb-field[2] = cpu_to_le32(field3);
 @@ -2964,6 +2974,8 @@ static int prepare_ring(struct xhci_hcd *xhci, struct 
 xhci_ring *ep_ring,
   return -EINVAL;
   }
  
 + /* Save entry whose cycle bit needs flipping at the end */
 + ep_ring-enqueue_first = ep_ring-enqueue;
   while (1) {
   if (room_on_ring(xhci, ep_ring, num_trbs)) {
   union xhci_trb *trb = ep_ring-enqueue;
 @@ -3006,13 +3018,18 @@ static int prepare_ring(struct xhci_hcd *xhci, struct 
 xhci_ring *ep_ring,
   nop_cmd = cpu_to_le32(TRB_TYPE(TRB_TR_NOOP) |
   ep_ring-cycle_state);
   ep_ring-num_trbs_free -= usable;
 - do {
 + /* Leave TRB_CYCLE unchanged on first NOP */
 + trb-generic.field[3] = nop_cmd ^
 + cpu_to_le32(TRB_CYCLE);
 + for (;;) {
   trb-generic.field[0] = 0;
   trb-generic.field[1] = 0;
   trb-generic.field[2] = 0;
 - trb-generic.field[3] = nop_cmd;
   trb++;
 - } while (--usable);
 + if (!--usable)
 + break;
 + 

Re: [PATCH 1/5] xhci: Don't change the ownership of LINK TRB until all the TRB are written

2014-01-22 Thread walt
On 01/22/2014 12:56 PM, Sarah Sharp wrote:
 Walt, can you turn on xHCI debugging and look for whether the NEC host
 that worked with David's patch is a 1.0 host?  You'll see a line like:
 
 // @%p = 0x%x (CAPLENGTH AND HCIVERSION)

Hi Sarah.  This is from the NEC host controller, *not* the ASMedia:

[9.089178] xhci_hcd :04:00.0: // @c90004f88000 = 0x960020 
(CAPLENGTH AND HCIVERSION)

--
To unsubscribe from this list: send the line unsubscribe linux-usb in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/5] xhci: Don't change the ownership of LINK TRB until all the TRB are written

2014-01-22 Thread Sarah Sharp
On Wed, Jan 22, 2014 at 03:22:53PM -0800, walt wrote:
 On 01/22/2014 12:56 PM, Sarah Sharp wrote:
  Walt, can you turn on xHCI debugging and look for whether the NEC host
  that worked with David's patch is a 1.0 host?  You'll see a line like:
  
  // @%p = 0x%x (CAPLENGTH AND HCIVERSION)
 
 Hi Sarah.  This is from the NEC host controller, *not* the ASMedia:
 
 [9.089178] xhci_hcd :04:00.0: // @c90004f88000 = 0x960020 
 (CAPLENGTH AND HCIVERSION)

Ok, so that's also a 0.96 host, but it's not impacted by the no-op TRBs.
Still, I think disabling the recently added code for everything but 1.0
hosts makes sense.

Sarah Sharp
--
To unsubscribe from this list: send the line unsubscribe linux-usb in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/5] xhci: Don't change the ownership of LINK TRB until all the TRB are written

2014-01-21 Thread David Laight
Some xhci controllers (eg ASMedia) don't like processing a LINK TRB and
then finding that they can't process the next TRB.

Instead of flipping the cycle bit on the first data TRB, remember the
real first TRB in prepare_ring() and flip that in giveback_first_trb().

In queue_trb() ignore the cycle bit value set by xhci_queue_bulk_tx() etc,
and changes the ownership of all but the first TRB.

If prepare_ring() adds any NOP TRB, the ownership of all but the first
and the LINK are changed. The wmb() is no longer needed before changing
the ownership of a LINK trb since is is preceeded by a NOP.

Since the 'first trb' might be a LINK trb queue_command() must also now
call giveback_first_trb(). Don't ring the doorbell here though.

The isoc code calls prepare_ring() right at the start, this ensures that
there is enough space for all the TRB and advances the write-ptr past
any LINK TRB. It then calls prepare_transfer() multiple times.
However the prepare_ring() calls must now be matched with those to
giveback_first_trb().
Don't call prepare_ring() from prepare_transfer() for isoc rings.
Set 'more_trbs_coming' (it means 'advance past a LINK TRB) on all
the queue_trb() calls except the very last one.

Signed-off-by: David Laight david.lai...@aculab.com
---
Note that this differs to any previous similar patches I've sent
because it contains a fix to the isoc code.
However I've only tsted it with USB3 ethernet.

 drivers/usb/host/xhci-ring.c | 84 +---
 drivers/usb/host/xhci.h  |  1 +
 2 files changed, 57 insertions(+), 28 deletions(-)

diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index a0b248c..62049e5 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -2920,6 +2920,16 @@ static void queue_trb(struct xhci_hcd *xhci, struct 
xhci_ring *ring,
struct xhci_generic_trb *trb;
 
trb = ring-enqueue-generic;
+
+   /*
+* Ignore the caller's CYCLE bit.
+* The caller doesn't know whether the real first TRB is
+* actually a LINK (or NOP) TRB.
+*/
+   field4 = (field4  ~TRB_CYCLE) | ring-cycle_state;
+   if (trb == ring-enqueue_first-generic)
+   field4 ^= TRB_CYCLE;
+
trb-field[0] = cpu_to_le32(field1);
trb-field[1] = cpu_to_le32(field2);
trb-field[2] = cpu_to_le32(field3);
@@ -2964,6 +2974,8 @@ static int prepare_ring(struct xhci_hcd *xhci, struct 
xhci_ring *ep_ring,
return -EINVAL;
}
 
+   /* Save entry whose cycle bit needs flipping at the end */
+   ep_ring-enqueue_first = ep_ring-enqueue;
while (1) {
if (room_on_ring(xhci, ep_ring, num_trbs)) {
union xhci_trb *trb = ep_ring-enqueue;
@@ -3006,13 +3018,18 @@ static int prepare_ring(struct xhci_hcd *xhci, struct 
xhci_ring *ep_ring,
nop_cmd = cpu_to_le32(TRB_TYPE(TRB_TR_NOOP) |
ep_ring-cycle_state);
ep_ring-num_trbs_free -= usable;
-   do {
+   /* Leave TRB_CYCLE unchanged on first NOP */
+   trb-generic.field[3] = nop_cmd ^
+   cpu_to_le32(TRB_CYCLE);
+   for (;;) {
trb-generic.field[0] = 0;
trb-generic.field[1] = 0;
trb-generic.field[2] = 0;
-   trb-generic.field[3] = nop_cmd;
trb++;
-   } while (--usable);
+   if (!--usable)
+   break;
+   trb-generic.field[3] = nop_cmd;
+   }
ep_ring-enqueue = trb;
if (room_on_ring(xhci, ep_ring, num_trbs))
break;
@@ -3050,8 +3067,9 @@ static int prepare_ring(struct xhci_hcd *xhci, struct 
xhci_ring *ep_ring,
else
next-link.control |= cpu_to_le32(TRB_CHAIN);
 
-   wmb();
-   next-link.control ^= cpu_to_le32(TRB_CYCLE);
+   /* If LINK follows a NOP, invert cycle */
+   if (next != ep_ring-enqueue_first)
+   next-link.control ^= cpu_to_le32(TRB_CYCLE);
 
/* Toggle the cycle bit after the last ring segment. */
if (last_trb_on_last_seg(xhci, ring, ring-enq_seg, 
next)) {
@@ -3088,11 +3106,14 @@ static int prepare_transfer(struct xhci_hcd *xhci,
return -EINVAL;
}
 
-   ret = prepare_ring(xhci, ep_ring,
-  le32_to_cpu(ep_ctx-ep_info)  EP_STATE_MASK,
-  num_trbs, mem_flags);
-   if (ret)
-   return ret;
+   /*