On Sun, 1 Jul 2007, David Brownell wrote:

> > No.  Just as important as the number of sg elements is the total
> > transfer size.
> 
> ... and the device speed.  For full speed devices, at max of around
> 1200 bytes/frame, when the system isn't horrendously busy it might
> suffice to have just a few 4KB URBs queued(*) and have completion
> handlers smart enough to recycle those.  That is, always keep one
> URB in the queue, and reuse the others ... a strategy that could
> be implemented today in the scatterlist code, if desired.

Such ideas have been at the back of my mind for a long time.  Somewhat
pessimistically I considered letting UHCI cap its bulk queues at about
32 KB and generate an IRQ every 12 KB or so.  Those numbers correspond
to approximately 27 ms and 10 ms respectively.
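
As a rough check on those figures, here is a minimal sketch of the
arithmetic.  It assumes the approximate full-speed bulk rate of 1200
bytes per 1 ms frame quoted above; the function and constant names are
made up for illustration and are not taken from any driver.

```python
# Approximate full-speed bulk payload per frame (figure quoted above).
FULL_SPEED_BYTES_PER_FRAME = 1200
# A full-speed USB frame lasts 1 ms.
FRAME_MS = 1.0


def drain_time_ms(num_bytes, bytes_per_frame=FULL_SPEED_BYTES_PER_FRAME):
    """Time needed to move num_bytes if every frame carries bytes_per_frame."""
    return num_bytes / bytes_per_frame * FRAME_MS


if __name__ == "__main__":
    # Proposed UHCI bulk queue cap and IRQ spacing from the paragraph above.
    print(f"32 KB queue cap : {drain_time_ms(32 * 1024):.1f} ms")
    print(f"12 KB IRQ spacing: {drain_time_ms(12 * 1024):.1f} ms")
```

A busier bus moves fewer bytes per frame, so these times are lower
bounds rather than guarantees.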

It would make life easier if the s-g library took responsibility for 
all this rather than the HCD.

>  For high
> speed, with max of around 35 KB/frame to most commercial USB disks,
> queueing lots of data (== transfer size) is essential to maximizing
> throughput.
> 
> (*) Understanding that OHCI is friendlier than UHCI in this case,
> another issue is interrupt rates.  What OHCI does right now, at
> least in the scatterlist code, is to hold off URB completion IRQs
> for up to 7 frames.  That means that a given endpoint will often
> have up to three "finished" URBs that haven't yet been handed back
> from hardware to software.  All other things being equal, it's
> better to use two IRQs to handle six 4K URBs than use six IRQs;
> that's another "throughput" factor.
> 
> 
> > Since each sg element refers to a page or more of data, 
> > the maximum number of elements is restricted by the maximum allowed
> > transfer size (which is currently 512 KB or 128 pages on i386).  The
> > default max transfer size set by usb-storage is 120 KB.  Larger values 
> > would lead to larger throughput but the benefit would be slight.
> 
> The relevant factor there being per-request overhead:  usb-storage
> must send a command, transfer data, get a response; and scatterlist
> code only improves the "transfer data" bits.  That is, it can get
> rid of wasted bus time between data packets ... but not wasted time
> surrounding the command or response.

In theory we could stream the commands and statuses as well, which
would reduce the overhead somewhat.  Doing this would complicate
usb-storage a fair amount, and it wouldn't be surprising if a lot of
devices couldn't handle the higher rates, so I'm not in favor.
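
To illustrate why larger transfers help only slightly, here is a toy
model of per-request overhead.  The 35 KB/frame data rate is the
high-speed figure quoted earlier; the 0.25 ms command-plus-status
overhead is a purely hypothetical number chosen for illustration, not
a measurement.

```python
def bus_efficiency(transfer_kb, data_rate_kb_per_ms=35.0, overhead_ms=0.25):
    """Fraction of a request's time spent moving data.

    data_rate_kb_per_ms: quoted high-speed rate (about 35 KB per 1 ms frame).
    overhead_ms: ASSUMED fixed cost of the command and status phases.
    """
    data_ms = transfer_kb / data_rate_kb_per_ms
    return data_ms / (data_ms + overhead_ms)


if __name__ == "__main__":
    # 120 KB is the usb-storage default; 512 KB is the i386 maximum.
    for size_kb in (4, 32, 120, 512):
        print(f"{size_kb:4d} KB per request: {bus_efficiency(size_kb):.1%}")
```

Under this model the efficiency curve flattens quickly, so going from
120 KB to 512 KB buys much less than going from 4 KB to 120 KB.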

> > However none of this matters if you don't insist on allocating a
> > separate URB for each sg element.
> 
> As noted above, for full speed devices we could get similar throughput
> with a slightly more clever implementation of scatterlist handling.  If
> the HCD has good hardware support for queueing, and the system has fair
> IRQ latency, recycling as few as five URBs might give similar throughput
> to the current "URB per element" approach.

I think we definitely should do this.  For high speed the advantages 
aren't so great, since the amount of data needed to fill the pipeline 
is about the same as the default max transfer size anyway.
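
The recycling idea can be sketched as a small simulation.  This is
only a model of the queueing behavior, not kernel code: the function
name, the 4 KB URB size and the pool of five are assumptions taken
from the numbers discussed above, and completions are assumed to
arrive strictly in order.

```python
from collections import deque


def transfer_with_recycled_urbs(total_bytes, urb_size=4096, pool_size=5):
    """Simulate moving total_bytes with a fixed pool of reusable URBs.

    Returns (bytes_transferred, submissions, max_urbs_queued).
    """
    remaining = total_bytes      # data not yet assigned to any URB
    in_flight = deque()          # lengths of queued URBs, oldest first
    submissions = 0

    def submit():
        nonlocal remaining, submissions
        length = min(urb_size, remaining)
        remaining -= length
        in_flight.append(length)
        submissions += 1

    # Prime the queue with as many URBs as the pool allows.
    while remaining > 0 and len(in_flight) < pool_size:
        submit()
    max_queued = len(in_flight)

    transferred = 0
    while in_flight:
        # The oldest URB completes; its "completion handler" runs here.
        transferred += in_flight.popleft()
        if remaining > 0:
            submit()             # recycle the completed URB for more data
        max_queued = max(max_queued, len(in_flight))

    return transferred, submissions, max_queued


if __name__ == "__main__":
    # One default-sized usb-storage request (120 KB).
    print(transfer_with_recycled_urbs(120 * 1024))
```

The point of the model is that the number of URBs allocated stays at
the pool size no matter how large the transfer is, while the queue
never runs dry as long as completions are handled promptly.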

Depending on how the implementation turns out, we might also want to
have a "single URB, single page"-at-a-time fallback mode.

Alan Stern

