[linux-usb-devel] Re: More questions/comments about the UHCI driver

Alan Stern Thu, 24 Jul 2003 09:18:20 -0700

On Wed, 23 Jul 2003, Johannes Erdfelt wrote:

> >     UHCI does not initialize urb->error_count to 0 anywhere.  Again,
> > that probably ought to be done in usb_submit_urb(), anyway.
> 
> I must have missed when urb->error_count was added. What is it for? Why
> didn't the patch that added it implement something for UHCI?


It's documented in include/linux/usb.h -- it is the number of iso.  
descriptors that reported errors.  I don't know when it was added, but it
must have been a while back.  UHCI _increments_ it properly for each 
error; it just doesn't _initialize_ the field to begin with.

> >     What happens when an isochronous URB is submitted with descriptor
> > intervals that wrap all the way around the frame list (that is, if
> > uhci_insert_td_frame_list() is called with framenum > current-frame +
> > 1024)?  It looks like the TDs get placed on the list anyway and then
> > processed at the wrong time.
> 
> I don't quite understand what you mean?
> 
> It looks like it handles wraps correctly.

Here's an example.  Suppose an iso. URB is submitted that has 5 
descriptors and an interval of 300.  Let's say the driver schedules the 
first descriptor for frame #100.  Then looping through the descriptors, 
uhci_insert_td_frame_list() will be called 5 times and will store:

        td[0] in frame 100,
        td[1] in frame 400,
        td[2] in frame 700,
        td[3] in frame 1000, and
        td[4] in frame (1300 % 1024) = 276.

So the HC will transfer td[0] during frame 100, then td[4] during frame 
276, then td[1] during frame 400, and so on.  td[4] will be transferred at 
the wrong time; it shouldn't be sent until frame 276 _of the next loop_.

> >     I really don't understand what's going on in uhci_fsbr_timeout().  
> > What's it trying to accomplish?  And in the line
> > 
> >    if (tmp != head && (count % DEPTH_INTERVAL) == (DEPTH_INTERVAL - 1))
> > 
> > should that == be != ?
> 
> uhci_fsbr_timeout() is called if we turned on FSBR for the transfer, but
> it hasn't made any progress in the transfer.
> 
> This is so we don't keep FSBR for URBs that are sitting around getting
> NAK'd constantly.

Okay, that makes sense.

> When we determine that the URB is idle, as an optimization, we try to
> set the depth first flag for a group of TDs.

Does this really help?  Once the URB starts moving again, this will make
it finish up a little quicker, but not very much (unless FSBR has been
turned off completely).  Besides, if an URB has already been blocked for
so long, does it matter if it then takes a little extra time to finish up?

> >     uhci_submit_isochronous() skips over frame descriptors that have 
> > length 0; it doesn't set up 0-length TDs.  Is that appropriate?  By 
> > contrast, uhci_result_isochronous() doesn't skip over 0-length frame 
> > descriptors, so the descriptors and the TDs will get out of sync.
> 
> Nope, you're right. They will get out of sync. I think this was an
> oversight when I merged in some code from usb-uhci a long time ago.

Do you mean that uhci_submit_isochronous() isn't supposed to skip length-0 
descriptors or do you mean that uhci_result_isochronous() is?

> >     There's a nasty race in uhci_delete_queued_urb().  If the URB
> > being deleted isn't at the start of its queue and the HC finishes the
> > previous URB at about the same time as the function is called, then the
> > link pointer from the last TD in the previous URB will be copied into the
> > element pointer of the previous URB's QH.  This might happen before the
> > code manages to overwrite the TD's link pointer, but the code completely
> > ignores that possibility.
> 
> If the previous URB is finished, does it matter if QH->element is set to
> an already completed TD?
> 
> Or am I misunderstanding the problem?

You are.  Here's another example that will make it clearer.  Let's suppose 
that 3 URBs (a, b, and c) are queued for the same endpoint.  To keep 
things simple, assume each is a bulk transfer with only 1 TD.  So 
initially, the schedule looks like this:

        QHa: Link -> skel_term_qh, Element -> TDa
        TDa: Link -> QHb
        QHb: Link -> skel_term_qh, Element -> TDb
        TDb: Link -> QHc
        QHc: Link -> skel_term_qh, Element -> TDc
        TDc: Link -> null

Now suppose the HC completes TDa just before a driver unlinks URB b.  When 
TDa is completed, its Link is copied into QHa's Element pointer.  So now 
the schedule looks like:

        QHa: Link -> skel_term_qh, Element -> QHb
        TDa: Link -> QHb
         ...

The unlink request comes in, and uhci_unlink_generic() calls
uhci_delete_queued_urb().  Since URB b is queued, the routine sets purbp
to URB a and pltd to TDa.  Since nurbp points to URB c, the code sets
pltd->link to point to QHc -- that's how it attempts to unlink URB b.  
Now the schedule looks like this:

        QHa: Link -> skel_term_qh, Element -> QHb
        TDa: Link -> QHc
         ...

As you can see, URB b has not been properly removed from the schedule; the 
HC will access it immediately after reading QHa.

This race is very difficult to solve correctly without shutting down the 
entire queue.

> >     I don't understand why uhci_control_retrigger_status() is so 
> > complicated.  Why does it have to allocate a new QH instead of re-using 
> > the old one?  There's a comment about avoiding pointer overwriting 
> > problems -- what problems?  Why not simply store the DMA address of the 
> > status TD in the QH's element pointer?  Of course this would require 
> > testing to see if the status TD had already been executed, but that 
> > wouldn't be hard.  Doing this would also simplify uhci_remove_qh(), since 
> > it would only get called immediately after uhci_delete_queued_urb().
> 
> Because the HCD owns the QH still. We can't modify QH->element in that
> case.
> 
> We would have to grab ownership of the QH by unlinking it and waiting,
> before we can modify QH->element safely.
> 
> Allocating a new QH is just easier.

I don't agree.  Primarily, what do you mean by "the HCD owns the QH"?  Do 
you mean that the HC is going to read the QH at various unpredictable 
times?  That shouldn't matter; there are plenty of places where the code 
modifies QH link pointers under the same circumstances, so why should an 
element pointer be any different?

Or do you mean that the HC is going to _write_ the QH element pointer at
various unpredictable times?  That's not true in this case.  Since the
element pointer points at a TD that got a short read and is hence marked
inactive, the HC won't update the element pointer any more.

> >     Why are urb_remove_list and complete_list separate lists?  There
> > doesn't seem to be any reason for putting URBs on the urb_remove_list,
> > since all that happens is that uhci_remove_pending_qhs() moves them onto
> > the complete_list anyway.
> 
> Because complete_list is called ASAP, but urb_remove_list must wait until
> the next frame.

Darn it -- I was aware of that and then forgot it.  Trying to juggle too
many concepts at once.  Obviously, you can't give back an URB until the HC
is finished using the transfer buffer.

So okay, it just means that urb_remove_list is subject to the same race 
as qh_remove_list.

Alan Stern



-------------------------------------------------------
This SF.Net email sponsored by: Free pre-built ASP.NET sites including
Data Reports, E-commerce, Portals, and Forums are available now.
Download today and enter to win an XBOX or Visual Studio .NET.
http://aspnet.click-url.com/go/psa00100003ave/direct;at.aspnet_072303_01/01
_______________________________________________
[EMAIL PROTECTED]
To unsubscribe, use the last form field at:
https://lists.sourceforge.net/lists/listinfo/linux-usb-devel

[linux-usb-devel] Re: More questions/comments about the UHCI driver

Reply via email to