Before I reply, let me clarify something. We're talking about two
completely separate concepts in this email, which I know has been causing
some confusion:

1) Management and processing of outstanding URBs
2) Management of completions

On Fri, Jul 05, 2002, David Brownell <[EMAIL PROTECTED]> wrote:
> > If what you mean by "less work" is processor work, that can be debated. I
> > think there are cases where that will be true and cases where it won't be.
> 
> Well, I listed one (common) case where the number of things to
> check is reduced.  Another is wherever URB queueing is in use,
> and the HC hasn't finished the URB(s) at the front of the queue
> for a given endpoint.
> 
> Fewer things to check --> less work.
> 
> There shouldn't be much of a difference in the amount of work
> to do when an HCD finds a TD that the HC finished with; but
> if there is, that'd be a separate issue.

Yes, if the first URB in a queue is not finished, the others behind it
can't be finished. As a result, performing any checks to see if they're
completed is superfluous (uhci_transfer_result). That is one optimization
that could be made that uhci-hcd does not do. I'll touch on why I don't
think this optimization should be done at the end.
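
To illustrate, here's a sketch of what that optimization might look
like. The names here ("finished", "queue_node" and friends) are made up
for illustration, not real uhci-hcd identifiers:

    #include <linux/list.h>

    /* Illustrative types only; uhci-hcd's real ones differ. */
    struct my_urb_priv {
            struct list_head queue_node;    /* link in the endpoint queue */
            int finished;                   /* set once the HC retires it */
    };

    struct my_endpoint {
            struct list_head urb_queue;     /* URBs queued to this endpoint */
    };

    static void scan_endpoint(struct my_endpoint *ep)
    {
            struct my_urb_priv *up;

            /* URBs on one endpoint finish in order, so the first
             * unfinished one means the rest are unfinished too. */
            list_for_each_entry(up, &ep->urb_queue, queue_node) {
                    if (!up->finished)
                            break;          /* skip the rest of the queue */
                    /* ... collect status, give the URB back ... */
            }
    }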

> >>Or maintain a "shadow schedule" which is what EHCI does.  Much
> >>simpler and more efficient ... only OHCI needs that annoying
> >>done-list processing, which requires a bus_to_virt() analogue.
> > 
> > I saw that in the EHCI code. Can you elaborate on what it does?
> 
> It's keeping a copy of the schedule in a way that the CPU can
> make sense of directly:  host order, not bus order.  (OHCI can
> become more efficient that way too.  I've started "ohci-hcd"
> down that path, but it's not very far along yet.)
> 
> And it holds _all_ the TDs that have been issued ... the HC removes
> some TDs as it processes them, the HCD has to walk behind (shadow)
> the HC and handle those (collect status, free memory, sometimes
> unmapping buffers or reporting URB completions).

So you add all TDs to the schedule, not the QHs?

Or do you modify the piece of data that is in the "shadow schedule"
based on what work has been completed by the HC?

This is probably a worthwhile optimization so we don't start from the
first TD in the QH every single time.
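
If it's the latter, the idea might look something like this sketch (all
of these names are invented; I haven't checked how ehci-hcd actually
lays this out): cache a pointer to the first TD the HC hasn't retired,
and resume the scan there instead of at the head.

    /* Sketch only; "first_unfinished" is an invented field. */
    struct my_td {
            struct my_td *next;
            int retired;                    /* set when the HC is done */
    };

    struct my_qh {
            struct my_td *td_head;          /* all TDs queued to this QH */
            struct my_td *first_unfinished; /* where the last scan stopped */
    };

    static void scan_qh(struct my_qh *qh)
    {
            struct my_td *td = qh->first_unfinished;

            while (td && td->retired) {
                    /* ... collect status, free the TD ... */
                    td = td->next;
            }
            qh->first_unfinished = td;      /* next scan resumes here */
    }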

> > Since we need to check each URB to see if it's done, I don't see how
> > using a "shadow schedule" is any better. The current list is the most
> > efficient way of doing that.
> 
> How so?  Given "U" urbs and "N" endpoints, we know at least
> that N <= U.  How is it that checking U times would usually
> be more efficient, especially given cases when not all N would
> need to be checked?

Not all endpoints will have a URB queued. But I suspect you meant N as
in "endpoints with an outstanding URB".

Yes, again, checking per endpoint would be more efficient.

> > The only possible optimization I can see would be to not check some
> > periodic URBs, but we can't do that because we don't know definitively
> > whether the HC has processed the TDs since the last interrupt.
> 
> We know equally well with both models:  if the TD is now marked as
> done (by the HC), then it was processed.

You've lost me here. I'm not quite sure what your comment is
addressing.

> >>>However, when we drop the lock, the other CPU's are free to do pretty
> >>>much anything to the list.
> >>
> >>Like I said ... that's fine, and it needn't cause any confusion unless
> >>whatever code scans the schedule is being buggy.  Heck, it's an exact
> >>corollary of the problem of updating the hardware's view of that same
> >>schedule: if the one problem can be solved, so can the other.
> > 
> > I fail to see the corollary.
> 
> It's two views of the same data structure:  isomorphic.  The
> fact that the HC and HCD (including copy of HCD on another CPU)
> use that structure slightly differently doesn't matter; the
> concurrent update issues come up with both views.

Umm. You said it yourself later in your reply: the HC doesn't know, nor
care, about URBs.

You said removing URBs from a list was a corollary of removing QHs from
the HC's schedule. While both remove something from a linked list, the
way the removal is achieved is different.

> > There are no provisions like this in the urb_list case. If we have a
> > list like this
> > 
> > A - B - C - D - E
> > 
> > and we're processing URB B and C gets removed while we're traversing the
> > list, things can break. That's what we need to protect against.
> 
> I'm not quite clear what scenario you're talking about.  If they're
> not queued to the same endpoint, there's no issue, except for HCDs
> using that "list of all active URBs" model.  (Another area where
> avoiding that model gives simplification ... :)
>
> If they're queued to the same endpoint, URB A can't matter (it's
> been completed and dequeued already if we're now processing B),
> and in any case the endpoint had to be descheduled to prevent
> the HC from processing C's TDs as we're freeing them.  Code that's
> handling the unlink for C can't progress until it's really off
> the schedule, and won't run when B's TD processing is running
> because the schedule will necessarily be locked.  (In the "single
> lock per HCD" model I've been talking about.)

You're confusing things. I'm not talking about the schedule here. I was
talking about the management of outstanding URBs, like uhci-hcd does
with urb_list.
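
To make the hazard concrete, here's a sketch (with invented names, not
the real uhci-hcd code): if one CPU is walking the list at B while
another unlinks C without taking the lock, the walker can follow a
freed pointer. Holding the same lock across both the walk and the
unlink is what prevents that.

    #include <linux/list.h>
    #include <linux/spinlock.h>

    struct my_urb_priv {
            struct list_head urb_node;      /* link in the per-HC urb_list */
    };

    static spinlock_t urb_list_lock = SPIN_LOCK_UNLOCKED;
    static LIST_HEAD(urb_list);

    static void scan_urbs(void)
    {
            struct list_head *pos, *tmp;

            spin_lock(&urb_list_lock);
            /* The _safe walk lets us unlink entries ourselves; the lock
             * is what keeps other CPUs from unlinking them under us. */
            list_for_each_safe(pos, tmp, &urb_list) {
                    struct my_urb_priv *up =
                            list_entry(pos, struct my_urb_priv, urb_node);
                    /* ... check the URB, maybe complete and unlink it ... */
            }
            spin_unlock(&urb_list_lock);
    }

    static void unlink_urb(struct my_urb_priv *up)
    {
            spin_lock(&urb_list_lock);
            list_del(&up->urb_node);        /* no walker can be mid-list now */
            spin_unlock(&urb_list_lock);
    }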

> > If you have a list of URBs, you need a lock. Period. End of story. The
> > HC may provide this functionality (as in OHCI), but in UHCI the HC
> > doesn't help us.
> 
> No HC understands URBs, and I think I've demonstrated that having
> a list of URBs is not necessary.  (By code, also by pointing out
> that the schedule is isomorphic to a list of active URBs.)  There's
> no issue of hardware support (or lack thereof).

There is an issue of hardware support, but I think a poor choice of words
on my part is confusing you.

You're right that the HC doesn't care about URBs. I really meant it in
the logical sense of a "transfer". In uhci-hcd's case, that would be a QH
(for control, bulk), a single TD (for interrupt) or a group of TDs (for
isochronous).

You need to protect against multiple CPUs inserting into the schedule
at the same time. In the case of OHCI, the HC arbitrates this. In the
case of UHCI, the HCD needs to arbitrate it. But I think you know all of
this already.
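
For UHCI that arbitration is just the HCD's lock held across the
link-pointer update. Roughly this sketch (the skeleton-QH name and the
surrounding structs are invented, though UHCI_PTR_QH is the real
link-pointer flag, and UHCI addresses are 32-bit):

    #include <linux/types.h>
    #include <linux/spinlock.h>
    #include <asm/byteorder.h>
    #include <asm/system.h>         /* wmb() in kernels of this era */

    #define UHCI_PTR_QH     0x0002  /* link pointer refers to a QH */

    struct my_qh {
            u32 link;               /* hardware link pointer (LE) */
            dma_addr_t dma_handle;
    };

    struct my_uhci {
            spinlock_t schedule_lock;
            struct my_qh *skel_qh;  /* skeleton QH the HC always visits */
    };

    static void link_qh(struct my_uhci *uhci, struct my_qh *qh)
    {
            unsigned long flags;

            spin_lock_irqsave(&uhci->schedule_lock, flags);
            qh->link = uhci->skel_qh->link;         /* inherit old successor */
            wmb();                                  /* HC must see our link first */
            uhci->skel_qh->link =
                    cpu_to_le32(qh->dma_handle | UHCI_PTR_QH);
            spin_unlock_irqrestore(&uhci->schedule_lock, flags);
    }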

> > However, you knew that already since ehci-hcd protects against that by
> > dropping the lock before making the completion callback. However, it
> > adds extra complexity.
> 
> No complexity that I really noticed.  Such as it is, it's all
> localized in the code that scans for completed work ... rather
> than spread throughout the driver.
> 
> The UHCI drivers are (both) a _lot_ more complex in terms of
> how much state they manage and track.

That can be debated, but I don't think it's relevant.

Do you mean that the code to handle completion notifications in uhci-hcd
is too complex?

> > I based my comments on trying to interpret what you've said. Since you
> > were referring to the "schedule" in your previous email, I interpreted
> > that as the schedule the HC processes. That's where all of my comments
> > came from.
> > 
> > Thankfully, you've explained that you meant a "shadow schedule" which is
> > different and thus isn't as bad as I was led to believe.
> 
> Right.  In a sense, every HCD except OHCI keeps a "shadow schedule",
> but (at this point) only EHCI organizes it like the hardware does.

I think you're using the term "shadow schedule" loosely here.

All you're really keeping track of is whatever piece of the whole logical
transfer might actually have been finished by the HC.

uhci-hcd does this with urb_list, which is just a list of all of the
URBs submitted for that HC.
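
That is, conceptually something like this (simplified; the real struct
uhci carries much more state):

    #include <linux/list.h>
    #include <linux/spinlock.h>

    /* Simplified per-HC state, roughly what uhci-hcd keeps. */
    struct my_uhci {
            spinlock_t urb_list_lock;       /* guards urb_list */
            struct list_head urb_list;      /* every outstanding URB */
            struct list_head complete_list; /* URBs finished this scan */
    };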

> I was talking about advantages of that organization, like being able
> to do less work when scanning ... but mostly not having code to keep
> a second not-quite-parallel data structure in sync with the hardware.
> That's a "simpler is better/faster" tradeoff.

Which "not-quite parallel data structure" are you talking about?

> > ...
> > ehci_urb_dequeue
> > intr_deschedule which has this tidbit
> >         qh->qh_next.ptr = 0;
> > 
> > Now, you correctly noticed that this call chain can occur and to restart
> > it, you check to see if next->ptr is set to NULL (to denote the URB was
> > unlinked) and restart the loop from the beginning since the list has been
> > changed.
> 
> Right -- a decidedly atypical case, with a simple (and highly localized)
> recovery strategy.  Typical case is one quick pass; atypical case adds
> a (shorter) second pass.  The periodic schedule is intended/optimized for
> the case of long-term stability:  hubs and webcams stay plugged in for
> months at a time, drivers running constantly and never wanting transfers
> to stop.
> 
> What I'd call "jumping through hoops" is anything that spreads knowledge
> of such rare cases throughout the whole driver ... which is exactly what
> that UHCI "complete_list" does, making two passes become typical.  :)

You're being misleading. uhci-hcd doesn't make two passes over urb_list.
It makes one pass over urb_list, and then one pass over complete_list.

complete_list holds only the URBs that have been completed.

As far as CPU work goes, it's a couple of extra list operations, which is
a negligible amount of work.
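
Concretely, the shape of it is something like this sketch, reusing the
struct my_uhci from earlier (names simplified; error handling and the
lock the real driver also holds on complete_list are omitted):

    #include <linux/errno.h>
    #include <linux/list.h>
    #include <linux/spinlock.h>
    #include <linux/usb.h>

    struct my_urb_priv {
            struct list_head node;  /* link in urb_list or complete_list */
            struct urb *urb;
    };

    /* Invented stand-in for uhci_transfer_result()'s check. */
    static int transfer_finished(struct my_urb_priv *up)
    {
            return up->urb->status != -EINPROGRESS;
    }

    static void uhci_scan_sketch(struct my_uhci *uhci)
    {
            struct list_head *pos, *tmp;
            struct my_urb_priv *up;

            /* Pass 1: walk urb_list, moving finished URBs aside. */
            spin_lock(&uhci->urb_list_lock);
            list_for_each_safe(pos, tmp, &uhci->urb_list) {
                    up = list_entry(pos, struct my_urb_priv, node);
                    if (!transfer_finished(up))
                            continue;
                    list_del(&up->node);
                    list_add_tail(&up->node, &uhci->complete_list);
            }
            spin_unlock(&uhci->urb_list_lock);

            /* Pass 2: walk complete_list, calling handlers with no
             * scan lock held, so a handler may resubmit or unlink. */
            while (!list_empty(&uhci->complete_list)) {
                    up = list_entry(uhci->complete_list.next,
                                    struct my_urb_priv, node);
                    list_del(&up->node);
                    up->urb->complete(up->urb);     /* 2.4-era signature */
            }
    }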

ehci-hcd has some work in dropping and reacquiring the spinlock, as well
as some extra logic to see if any URBs have been unlinked. You're right
that it's atypical for URBs to be deleted.

Anyway, the performance difference is moot.

Don't underestimate how important simple code is compared to more
complicated code that buys only a small optimization.

The goal here isn't to be as efficient as possible. It's about having
efficient code that is maintainable. And that means simple.

Remember what Linus has said: "Make it as simple as possible, and no
simpler".

As for your optimization of not checking URBs that are queued behind an
unfinished one, I think it falls into this category. Yes, it is more
efficient, but the cost of skipping it is minimal (we'll just fail the
"finished" check on the first TD) and it makes the code more complicated.

That all being said, uhci-hcd isn't perfect, yet :)

JE


