On Sun, 1 Jul 2007, David Brownell wrote:

> > No. Just as important as the number of sg elements is the total
> > transfer size.
>
> ... and the device speed. For full speed devices, at max of around
> 1200 bytes/frame, when the system isn't horrendously busy it might
> suffice to have just a few 4KB URBs queued(*) and have completion
> handlers smart enough to recycle those. That is, always keep one
> URB in the queue, and reuse the others ... a strategy that could
> be implemented today in the scatterlist code, if desired.

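To make the recycling strategy quoted above concrete, here is a toy Python model of it. It is only a sketch, not kernel code: the function name `transfer_with_recycling`, the pool size, and the idea of measuring the queue depth seen by the completion handler are all assumptions made for illustration. Each completion immediately refills and resubmits the finished URB while data remains.

```python
from collections import deque

URB_SIZE = 4096  # 4 KB URBs, as in the quoted suggestion


def transfer_with_recycling(total_bytes, pool_size=3, urb_size=URB_SIZE):
    """Model a transfer that reuses a fixed pool of URBs.

    Returns (submissions, min_depth), where min_depth is the smallest
    number of URBs still queued at the moment a completion handler ran
    with data left to send. A value of 0 means the queue ran dry.
    """
    remaining = total_bytes  # bytes not yet assigned to any URB
    queue = deque()          # URBs currently owned by the (modeled) HCD
    submissions = 0

    # Prime the queue with up to pool_size URBs.
    while remaining > 0 and len(queue) < pool_size:
        chunk = min(urb_size, remaining)
        queue.append(chunk)
        remaining -= chunk
        submissions += 1
    min_depth = len(queue)

    while queue:
        queue.popleft()  # the oldest URB completes
        if remaining > 0:
            # Completion handler: note how much work the hardware still
            # has queued, then recycle the finished URB.
            min_depth = min(min_depth, len(queue))
            chunk = min(urb_size, remaining)
            queue.append(chunk)
            remaining -= chunk
            submissions += 1
    return submissions, min_depth


if __name__ == "__main__":
    print(transfer_with_recycling(120 * 1024, pool_size=3))
    print(transfer_with_recycling(120 * 1024, pool_size=1))
```

In this model a pool of three URBs always leaves two queued when a completion handler runs, while a pool of one leaves the queue empty on every completion. The model ignores IRQ latency, which is what decides how deep the pool has to be in practice.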
Such ideas have been at the back of my mind for a long time. Somewhat
pessimistically, I considered letting UHCI cap its bulk queues at about
32 KB and generate an IRQ every 12 KB or so. At full speed, those numbers
correspond to approximately 27 ms and 11 ms respectively. It would make
life easier if the s-g library took responsibility for all this rather
than the HCD.

> For high
> speed, with max of around 35 KB/frame to most commercial USB disks,
> queueing lots of data (== transfer size) is essential to maximizing
> throughput.
>
> (*) Understanding that OHCI is friendlier than UHCI in this case,
> another issue is interrupt rates. What OHCI does right now, at
> least in the scatterlist code, is to hold off URB completion IRQs
> for up to 7 frames. That means that a given endpoint will often
> have up to three "finished" URBs that haven't yet been handed back
> from hardware to software. All other things being equal, it's
> better to use two IRQs to handle six 4K URBs than use six IRQs;
> that's another "throughput" factor.
>
> > Since each sg element refers to a page or more of data,
> > the maximum number of elements is restricted by the maximum allowed
> > transfer size (which is currently 512 KB or 128 pages on i386). The
> > default max transfer size set by usb-storage is 120 KB. Larger values
> > would lead to larger throughput but the benefit would be slight.
>
> The relevant factor there being per-request overhead: usb-storage
> must send a command, transfer data, get a response; and scatterlist
> code only improves the "transfer data" bits. That is, it can get
> rid of wasted bus time between data packets ... but not wasted time
> surrounding the command or response.

In theory we could stream the commands and statuses as well, which would
reduce the overhead somewhat. Doing this would complicate usb-storage a
fair amount, and it wouldn't be surprising if a lot of devices couldn't
handle the higher rates, so I'm not in favor.

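As a rough check on the 32 KB cap and 12 KB IRQ interval mentioned earlier in this message, the arithmetic can be sketched in Python. It assumes the approximate 1200 bytes/frame full-speed figure quoted in this thread and 1 ms frames; the helper name `drain_time_ms` is made up for the example.

```python
# Rough timing check, assuming about 1200 bytes/frame at full speed
# (the approximate figure quoted in the thread) and 1 ms per frame.
FULL_SPEED_BYTES_PER_FRAME = 1200
FRAME_MS = 1.0

QUEUE_CAP = 32 * 1024     # proposed cap on a UHCI bulk queue, in bytes
IRQ_INTERVAL = 12 * 1024  # proposed amount of data between IRQs, in bytes


def drain_time_ms(nbytes, bytes_per_frame=FULL_SPEED_BYTES_PER_FRAME):
    """Time in ms for the bus to move nbytes at the given per-frame rate."""
    return nbytes / bytes_per_frame * FRAME_MS


if __name__ == "__main__":
    print(f"queue cap drains in about {drain_time_ms(QUEUE_CAP):.1f} ms")
    print(f"one IRQ roughly every {drain_time_ms(IRQ_INTERVAL):.1f} ms")
```

At exactly 1200 bytes/frame this works out to a little over 27 ms and 10 ms. Any lower real-world rate lengthens both figures, so these should be read as ballpark values only.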
> > However none of this matters if you don't insist on allocating a
> > separate URB for each sg element.
>
> As noted above, for full speed devices we could get similar throughput
> with a slightly more clever implementation of scatterlist handling. If
> the HCD has good hardware support for queueing, and the system has fair
> IRQ latency, recycling as few as five URBs might give similar throughput
> to the current "URB per element" approach.

I think we definitely should do this. For high speed the advantages
aren't so great, since the amount of data needed to fill the pipeline is
about the same as the default max transfer size anyway.

Depending on how the implementation turns out, we might also want to have
a "single URB, single page"-at-a-time fallback mode.

Alan Stern