Re: 3.13-rc1 regression: Scatter-gather list issues at SuperSpeed only

Sarah Sharp Tue, 04 Mar 2014 10:07:01 -0800

On Tue, Mar 04, 2014 at 10:00:16AM -0500, Alan Stern wrote:
> On Tue, 4 Mar 2014, David Laight wrote:
> 
> > > > Notice the request length: 1536.  That's three 512-byte sectors.  A
> > > > little unusual, since most I/O is done in units of pages, which are
> > > > 4096 bytes.
> 
> > > Ok, we can't have SuperSpeed mass storage devices broken, so it looks
> > > like we'll have to revert the last patch to add scatter-gather to the
> > > ASIX driver to avoid that breakage.  That means Mathias is going to need
> > > to revert those two commits then, since he's taking over pushing xHCI
> > > driver bug fixes this kernel.
> > > 
> > > Greg, Dave, Freddy, how do you want to handle reverting commit
> > > 3804fad45411?  Should that come through Dave's networking tree or Greg's
> > > USB tree?
> > 
> > I'm not sure what those two commits have to do with this problem.
> > 
> > In order to support a request with a non-terminal buffer that isn't
> > a multiple of 1k you need something to stop a LINK TRB being in the
> > middle of the transfer.
> > Otherwise the 1536 byte transfer (expected as a 1k block followed by
> > 512byte one) can probably get sent as a 512byte block (terminating the
> > bulk data request) followed by a spurious 1k block.  These should
> > generate horrid errors from the target disk.
> 
> David is right; this problem can't be fixed simply by reverting
> patches.  The real problem is that the block layer has handed the USB
> stack an SG list that xhci-hcd cannot handle at all, in its current
> form.


We do not know whether the driver's lack of TD fragment rule support
affects USB storage devices.  I can certainly look into that today, with the
xHCI 1.0 hosts I have on hand (Ivy Bridge and Haswell-ULT).  I can
experiment with shortening the ring segment so that almost every SCSI
transfer has a link TRB in the middle, and use a USB 3.0 analyzer to see
whether there are any short packets or abnormal traffic on the bus.
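
To make the alignment concern concrete, here is a toy model in Python (not
kernel code) of where a link TRB would land inside a TD.  It assumes one
transfer TRB per SG element and uses 1024 bytes, the SuperSpeed bulk max
packet size and the "1k" David mentions, as the granularity; the real
fragment rules in the xHCI spec may be stricter.  The function names and
parameters are made up for illustration.

```python
MAX_PACKET = 1024  # assumed granularity: SuperSpeed bulk max packet size


def link_trb_offsets(sg_lengths, trbs_left, trbs_per_segment):
    """Byte offsets inside the TD at which a link TRB would be crossed.

    Toy assumptions: one transfer TRB per SG element, and a link TRB
    whenever the current ring segment has no free slots left.
    """
    offsets = []
    sent = 0
    for length in sg_lengths:
        if trbs_left == 0:
            if sent > 0:  # a link before the first TRB is not mid-TD
                offsets.append(sent)
            trbs_left = trbs_per_segment
        sent += length
        trbs_left -= 1
    return offsets


def misaligned_links(sg_lengths, trbs_left, trbs_per_segment):
    """Link TRB offsets that do not fall on a MAX_PACKET boundary."""
    offsets = link_trb_offsets(sg_lengths, trbs_left, trbs_per_segment)
    return [off for off in offsets if off % MAX_PACKET != 0]
```

With the 3-sector element from the report first in the list and one free
slot left in the segment, misaligned_links([1536, 4096, 4096], 1, 4) flags
a link TRB at byte 1536, which is 1024 + 512 and so not on a 1k boundary.
A page-sized list such as [4096, 4096] in the same position is not flagged.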

> There are only two reasonable ways to fix this: Add appropriate TRB
> fragment handling into xhci-hcd, or use bounce buffers for non-aligned
> requests.

Or disable scatter-gather for xHCI 1.0 hosts altogether.
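
As a rough sketch of the bounce-buffer option, again a Python toy and not
kernel code: the check below flags an SG list whose non-terminal elements
are not all multiples of an assumed 1024-byte granule, and the fallback
copies the list into one contiguous buffer.  Both helper names are
hypothetical.

```python
def has_unaligned_nonterminal(sg_lengths, granule=1024):
    """True if any SG element except the last is not a granule multiple."""
    return any(n % granule != 0 for n in sg_lengths[:-1])


def bounce(sg_buffers):
    """Copy the SG elements into a single contiguous buffer."""
    return b"".join(sg_buffers)
```

The last element is exempt because a short final buffer ends the transfer
anyway; only a short buffer in the middle is the problem case.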

Alan, what do you suggest we do for the stable kernels in the meantime
before we have TD fragment rules in place?  3.12 and 3.13 already have
those two patches, and I keep getting failure reports.

From a user perspective, USB 3.0 mass storage devices used to work
before 3.14-rc1.  Theoretically, the TD fragment rules could have caused
an occasional disk glitch.  Now the devices *will* fail, instead of
theoretically failing.  From a user perspective, this looks like a
regression; the USB device obviously fails on 3.14-rc1, and may
sometimes silently fail on prior kernels.

So what would you have me do to fix stable kernels?

> In theory the block layer could be taught about the need for these
> bounce buffers, but that would be only a partial solution.  It would
> help for mass-storage transfers, but not for networking (which doesn't
> use the block layer).
> 
> An alternative is to work around this particular problem by identifying
> the code that submits the 3-sector SG element, and changing it to use
> an even number of sectors.  But obviously that doesn't solve the
> underlying issue.
> 
> I think in the end there is no real choice but to bite the bullet and 
> implement the TRB fragmentation rules.

I agree we should implement TD fragment rules.  It's too late in the -rc
cycle to get those finished before 3.14 final.  So what do we do with
stable in the meantime?

Sarah Sharp