On Tuesday 09 September 2003 16:38, Alan Stern wrote:
> On Tue, 9 Sep 2003, Baldrick wrote:
> > Hi Alan, since you didn't squelch those ideas how about this even wilder
> > one: it seems to me that for IN transfers (I am thinking bulk) it is
> > possible to allocate no TDs whatsoever!  The TDs are set up in the
> > transfer buffer instead!!  For example, suppose the size of a TD is 32
> > bytes and the endpoint size is 32 bytes as well.  You don't put anything
> > in the first 32 bytes of the transfer buffer. In the second 32 bytes you
> > put a TD which points to the start of the buffer. In the third 32 byte
> > block of the transfer buffer you put a TD which points to byte 32 of the
> > buffer (where the first TD is) and so forth.  The last (and maybe the
> > second last if the buffer size is not a multiple of 32) TD needs to be
> > put somewhere else, but it can live in urb_priv.  The beauty of this
> > scheme is that TDs only get overwritten when they are no longer in use by
> > the hc - after all, if one gets overwritten then that is because (by
> > definition) the hc has moved on to the following TD.
> >
> > The main problem (of course) is that some endpoints may be smaller than
> > the size of a TD.  Another problem could come from cache coherency
> > issues. Also, some users may not expect their buffers to be scribbled on
> > after the end of the transferred data (which could happen).  There would
> > also be an asymmetry with OUT transfers, but in a sense this asymmetry is
> > natural: quite often you don't know how much data needs to come in, so
> > you have to set up a much bigger buffer than is actually needed, leading to
> > massive TD allocation - this is cured by the above scheme.  For an OUT
> > transfer you always know how much to send.
>
> I'm not so sure about this, although it's a cute idea.  The driver needs
> the TDs to remain available after the transfer is completed in order to
> retrieve the final status.  If they are overwritten with IN data that
> won't be possible.

There is also a problem with short packets.  After a bit more thought I
came to the conclusion that it would be better to allocate a fixed number
of TDs per endpoint - enough to keep the hc busy - and recycle them down
the various urbs.  And then I turned on my computer and got Dave's message!

> But your earlier ideas seem much more doable.  A lot of the stuff in a
> uhci_td could be removed.
>
>       The dma_handle actually isn't needed at all, believe it or not.
> The value is needed only at the time the TD is being linked into the
> schedule, which can be done as it is created.  After that, the dma_handle
> value is available (if needed) in the link field of the previous TD or QH.

What about the first TD?  The QH's element pointer gets changed, so you can't
use that.

>       I don't see that the pointers to either the URB or the usb_device
> are needed either.  There's no need to do a get() on the device for each
> TD; one single get() for the whole URB will suffice.  And you're right
> that the pointers are always followed from the URB to the TD, never the
> reverse.  Or at least, if some do go backwards, the driver could be
> rewritten so they don't.
>
>       The TD list doesn't need to be doubly-linked.  And I suspect that
> a good deal of the iso. stuff isn't needed either, although I haven't
> looked very carefully at iso. processing.
>
>       Allocation and freeing of TD's isn't as much of a problem as you
> seem to think.  Although shrinking them certainly won't hurt, they are
> taken from a pre-allocated pool of DMA-coherent memory.  Allocation and
> freeing are thus extremely quick.

Are you sure?  Here are the worst offenders from an oprofile run I did some time ago:

vma      samples  %           symbol name             image name
...
000032e0 3638     7.58059     uhci_unlink_generic     /uhci_hcd
00001620 5731     11.9418     uhci_alloc_td           /uhci_hcd
00001740 6809     14.1881     uhci_remove_td          /uhci_hcd
00002740 12278    25.584      uhci_submit_common      /uhci_hcd

Notice the uhci_alloc_td?  Of course it may not be the memory allocation itself
that hurts, just the sheer number of TDs to be set up and linked.

> > > My personal feeling is that a massive reworking of UHCI would help. 
> > > One QH per URB alone means pretty far-reaching changes.  Another
> > > related change would be taking much of the information that's now in
> > > the QH and putting in the urbp instead.  Since the QH would now refer
> > > to all the URBs queued for an endpoint, it doesn't make sense to store
> > > per-URB data there.
> >
> > But is it worth it?
>
> I think so.  It would also move a bunch of excess data out of the
> DMA-coherent region into normal memory.
>
> Take a look at those old messages I mentioned.  Along with the list of
> problems given there, I've run across two others.  One is related to the
> unlinking race mentioned in the messages; there's an equivalent race
> involved in queueing a new URB.  (Imagine what would happen if the driver
> links a new TD to the end of an old one during the time interval after the
> old one has completed but before it has been removed from the lists.)  The
> second is something that David Brownell mentioned: when a short packet
> occurs for an URB marked USB_SHORT_NOT_OK, the queue for that endpoint
> should stop until after the completion routine for the URB returns.  Right
> now it's awkward for the driver to manipulate endpoint queues; rewriting
> to use one QH per queue would formalize the relationship and make it a lot
> easier to use.

Go for it!

Duncan.

