David -- Excellent analysis, but I have a couple of comments/questions.

(1) I've never been able to get a consistent answer on whether a short or
zero-length packet terminates a series of URBs that use QUEUE_BULK. If I
could get a _guarantee_ that it works as you advertise, I'd be much happier.
But this has been the least well-supported feature of the HCDs, and I've
been reluctant to try it. 2.5 should reduce that reluctance, but I'm still
skeptical.

(2) What about the race between submitting URBs with QUEUE_BULK and having
one of them terminate? That is, what if we're doing a read and the first URB
gets terminated by a short packet _before_ we submit the 2nd URB of the data
sequence? Will the 2nd URB get retired because the HCD recognizes this based
on the QUEUE_BULK flag? Or does the status end up in the userspace buffer?
Can the driver then tell the difference between this case and a very slow
device stuck in its data-transfer phase?
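
To make the worry in (2) concrete, here is a toy model in C. It is only a
sketch under a loud assumption: that nothing in the host controller retires a
URB submitted after an earlier one ended short. Whether real HCDs behave that
way is exactly the open question. All names here (toy_endpoint, toy_read,
should_submit_next) are made up for illustration and are not kernel APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy endpoint: what the device will send next. Not a real HCD model. */
struct toy_endpoint {
    size_t data_left;    /* data-phase bytes the device still has */
    size_t status_left;  /* status bytes the device sends afterwards */
};

enum landed { LANDED_NOTHING, LANDED_DATA, LANDED_STATUS };

struct toy_result {
    enum landed what;    /* what kind of bytes ended up in the buffer */
    size_t len;          /* how many bytes were transferred */
    bool short_read;     /* fewer bytes than requested */
};

/*
 * One read of up to `want` bytes.
 * ASSUMPTION: a read submitted after a short packet is NOT retired,
 * so it simply receives whatever the device sends next.
 */
static struct toy_result toy_read(struct toy_endpoint *ep, size_t want)
{
    struct toy_result r = { LANDED_NOTHING, 0, false };

    assert(ep != NULL);
    if (ep->data_left > 0) {
        r.what = LANDED_DATA;
        r.len = ep->data_left < want ? ep->data_left : want;
        ep->data_left -= r.len;
    } else if (ep->status_left > 0) {
        r.what = LANDED_STATUS;
        r.len = ep->status_left < want ? ep->status_left : want;
        ep->status_left -= r.len;
    }
    r.short_read = r.len < want;
    return r;
}

/* Driver-side guard: only submit the next data read if the previous
 * one was not short. */
static bool should_submit_next(const struct toy_result *prev)
{
    assert(prev != NULL);
    return !prev->short_read;
}
```

In this model, if the device has 100 data bytes left, a 512-byte read comes
back short, and a second 512-byte read submitted afterwards picks up the
status bytes instead of data. The guard only helps when the driver sees the
first completion before it submits the second URB.
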
Matt
On Sun, Nov 25, 2001 at 08:49:25PM -0800, David Brownell wrote:
> Matt, what's wrong with this analysis (if anything)?
>
> Looking first at the typical case (no short-read error,
> device returns all the data that's expected), and in
> all cases ignoring the actual API used to get to this
> "best of all possible I/O patterns":
>
> - For Bulk-Only, the I/O pattern is always going to
> be WRITE (command, maybe data), IRQ (for
> completion of 1 or N+1 urbs), READ (maybe
> data, certainly status), IRQ. Since the data
> is transferred on the same endpoint used for
> either command or status transfer, one or the
> other can always be queued.
>
> - For Control/Bulk, the I/O pattern is always going
> to be CTRL-WRITE, IRQ, WRITE or READ
> (for data), IRQ ... hmm, same performance model.
> Maybe that's why there's no status phase for CB? :)
>
> - Control/Bulk/Interrupt is just like Control/Bulk
> except that there's an additional INTR-READ, IRQ
> pair at the end. A smidgeon slower; there's also a
> polling delay for that INTR-READ (goes into the
> next frame or microframe).
>
> And for those error paths, with short reads, where
> I'd contend slower behavior is reasonable/expected:
>
> - For bulk only, the short read case will involve
> an extra READ+IRQ at the end, since the
> error will have dequeued the status read.
>
> - For control/bulk, same as the non-error case.
> (I can imagine other errors, like overruns, would
> cause more trouble, given "no status phase".)
>
> - For control/bulk/interrupt, also just like non-error.
>
> Now, as for the "semantic difference" and how to
> factor the API ... call me a purist, but I don't see
> any benefit to adding intelligence to usbcore so that
> it can have logic like this, which can just as easily
> live in the device driver's completion handlers:
>
>     // in scatter/gather completion
>     if (that was a bulk-only short read)
>         set flag to remember this;
>
>     // in status read completion, bulk-only case:
>     if (we were canceled, and that flag's set)
>         reissue myself;
>
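
For reference, this is roughly how I read that pseudocode as C. It is a
sketch, not driver code: xfer_status, bulk_only_state and both handler names
are hypothetical and are not usbcore or usb-storage APIs. A real driver would
resubmit the status URB at the point where the counter is bumped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these are real usbcore types. */
enum xfer_status { XFER_OK, XFER_SHORT, XFER_CANCELED };

struct bulk_only_state {
    bool data_was_short;   /* set by the data-phase completion */
    int  status_reissues;  /* times the status read was resubmitted */
};

/* Completion for the scatter/gather data phase: remember a short read. */
static void data_complete(struct bulk_only_state *st, enum xfer_status status)
{
    assert(st != NULL);
    if (status == XFER_SHORT)
        st->data_was_short = true;
}

/*
 * Completion for the status read, bulk-only case.
 * Returns true if the status read was (notionally) reissued.
 */
static bool status_complete(struct bulk_only_state *st, enum xfer_status status)
{
    assert(st != NULL);
    if (status == XFER_CANCELED && st->data_was_short) {
        st->data_was_short = false;  /* consume the flag: reissue once */
        st->status_reissues++;       /* real code would resubmit here */
        return true;
    }
    return false;
}
```

The flag is cleared on reissue so that a second cancellation of the status
read, for some unrelated reason, does not loop forever.
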
> Seems straightforward to me. Reworking the
> drivers/usb/storage/transport.c won't be notably
> easier by avoiding that, since the big issue is that
> it's structured to demand rescheduling the request
> thread between each transfer phase (and right now,
> even between each bulk-queuable s/g segment).
>
> Of course, just making the data phase use bulk
> queuing would be the single biggest performance
> win (save ten interrupts and thread reschedules,
> in your example :) and that doesn't require that
> larger restructuring. I'd be more tempted if I
> weren't already tied up with other things.
>
> - Dave
>
>
> ----- Original Message -----
> From: "Matthew Dharm" <[EMAIL PROTECTED]>
> To: "Johannes Erdfelt" <[EMAIL PROTECTED]>
> Cc: "USB Developers" <[EMAIL PROTECTED]>
> Sent: Sunday, November 25, 2001 3:04 PM
> Subject: Re: [linux-usb-devel] URB with scatter-gather?
>
> Nono.. we're confused here.
>
> What I'm describing here is a three-stage transaction, with the middle
> stage being a scatter-gather stage -- that is, the "data" phase involves
> trying to get data in/out of multiple non-contiguous buffers.
>
> So, I'm not talking about 3 URBs... I'm talking about 12. One command, 10
> data, and 1 status.
>
> The problem here is that there is a semantic difference between the way
> command/data/status are queued together and the way the data segments are
> queued together, and we need a way to represent both.
>
> Command/data/status should be queued together so that a short packet does
> _not_ stop the next transaction.
>
> Data blocks need to be queued together so that a short packet _does_ stop a
> transaction.
>
> The problem is, for maximum efficiency, I want to be able to use _both_
> types of queueing. I want scatter-gather as well as "unrelated URB"
> (for lack of a better term) queuing.
>
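
The two queueing semantics described above could be modeled roughly like
this. It is a toy sketch of the behavior I would like, not of what any HCD
does today: toy_urb, its group field and run_queue are invented names. URBs
in the same group form one scatter-gather transfer, so a short packet stops
the rest of that group, while URBs in other groups still run.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical URB stand-in for illustration only. */
struct toy_urb {
    size_t requested;  /* bytes asked for */
    size_t actual;     /* bytes the device delivers if the URB runs */
    int    group;      /* URBs sharing a group form one s/g transfer */
    bool   ran;        /* filled in by run_queue() */
};

/*
 * Walk the queue in order. A short transfer halts the remaining URBs
 * of the SAME group only; URBs in other groups ("unrelated URB"
 * queueing) are unaffected. Returns the number of URBs that ran.
 */
static size_t run_queue(struct toy_urb *q, size_t n)
{
    bool have_stopped = false;
    int stopped_group = 0;
    size_t ran = 0;

    assert(q != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (have_stopped && q[i].group == stopped_group) {
            q[i].ran = false;  /* rest of the s/g transfer is skipped */
            continue;
        }
        q[i].ran = true;
        ran++;
        if (q[i].actual < q[i].requested) {
            have_stopped = true;
            stopped_group = q[i].group;
        }
    }
    return ran;
}
```

With one command URB, ten data URBs in a single group and one status URB, a
short read on the third data URB leaves data URBs four through ten unrun,
while the status URB still runs.
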
> Matt Dharm
>
>
--
Matthew Dharm Home: [EMAIL PROTECTED]
Maintainer, Linux USB Mass Storage Driver
How would you like this tie wrapped around your hairy round head?
-- Greg
User Friendly, 9/2/1998
