>> [ re: why a separate list of URBs vs checking the schedule ]
>>
>>> Actually, you *MUST* maintain a list of outstanding URBs.  Since
>>> the HC doesn't tell the HCD which URBs have completed, we need to
>>> check all of them to determine which ones completed.
>>
>> It does say which TDs have completed though ... and as I pointed
>> out, that's quite sufficient for knowing which URBs completed.
>
> If by "says which TDs have completed" you mean that it twiddles some
> bits in the TD, you are correct.

Yes.
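Concretely, on UHCI that's the Active bit in the TD's control/status
dword:  the HC clears it when it retires the TD.  A sketch (the macro
and field names here are illustrative, not necessarily what either
uhci driver actually uses):

	#define TD_CTRL_ACTIVE	(1 << 23)  /* UHCI spec: dword 1, bit 23 */

	static inline int td_done (const struct uhci_td *td)
	{
		/* the HC writes this dword; little-endian in memory */
		return !(le32_to_cpu (td->status) & TD_CTRL_ACTIVE);
	}

Scan an endpoint's queue front-to-back and stop at the first TD that's
still active; every TD before that one has completed.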
>> It may even be less work to check the schedule than the list of
>> all URBs.  (It's possible to skip chunks of the periodic schedule,
>> and the URBs held there.)
>
> If what you mean by "less work" is processor work, that can be
> debated.  I think there are cases where that will be true and cases
> where it won't be.

Well, I listed one (common) case where the number of things to check
is reduced.  Another is wherever URB queueing is in use, and the HC
hasn't finished the URB(s) at the front of the queue for a given
endpoint.  Fewer things to check --> less work.

There shouldn't be much of a difference in the amount of work to do
when an HCD finds a TD that the HC finished with; but if there is,
that'd be a separate issue.

>>> The only other solution is to walk the schedule, but that requires
>>> complicated code to convert bus addresses into kernel addresses.
>>> Not to mention it's effectively a list anyway and it still needs
>>> a lock.
>>
>> Or maintain a "shadow schedule" which is what EHCI does.  Much
>> simpler and more efficient ... only OHCI needs that annoying
>> done-list processing, which requires a bus_to_virt() analogue.
>
> I saw that in the EHCI code.  Can you elaborate on what it does?

It's keeping a copy of the schedule in a way that the CPU can make
sense of directly:  host order, not bus order.  (OHCI can become more
efficient that way too.  I've started "ohci-hcd" down that path, but
it's not very far along yet.)  And it holds _all_ the TDs that have
been issued ... the HC removes some TDs as it processes them, and the
HCD has to walk behind (shadow) the HC and handle those (collect
status, free memory, sometimes unmapping buffers or reporting URB
completions).
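To sketch the idea (made-up names; the real thing is the shadow/qh
code in ehci-hcd):

	/* one schedule entry, hardware and software views side by side */
	struct shadow_qh {
		/* hardware's view, read by the HC over DMA:  the
		 * little-endian *bus* address of the next QH, plus
		 * the rest of the hardware-defined QH.
		 */
		u32			hw_next;
		/* ... other hardware-defined fields ... */

		/* software's view:  the same link, as a plain
		 * virtual pointer the CPU can chase directly.
		 */
		struct shadow_qh	*next;
		dma_addr_t		qh_dma;	/* this QH's bus address */
	};

Unlinks patch hw_next and next together (under the HCD's lock), and
completion scanning just follows ->next ... no bus-to-virtual address
conversion anywhere:

	for (qh = shadow_head; qh; qh = qh->next)
		/* check this QH's TDs for completed work */ ;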
> Since we need to check each URB to see if it's done, I don't see how
> using a "shadow schedule" is any better.  The current list is the
> most efficient way of doing that.

How so?  Given "U" URBs and "N" endpoints, we know at least that
N <= U.  How is it that checking U times would usually be more
efficient, especially given cases when not all N would need to be
checked?

> The only possible optimization I can see would be to not check some
> periodic URBs, but we can't do that because we don't know
> definitively whether the HC has processed the TDs since the last
> interrupt.

We know equally well with both models:  if the TD is now marked as
done (by the HC), then it was processed.

>>> The reason we don't do it that way is because it's dumb.  I'm sure
>>> if you try to implement what you're proposing you'll see just how
>>> complicated it is.
>>
>> On the contrary (and as I've already said) -- I did implement it,
>> which is how I know it's a lot simpler.  And why I thought it was
>> worth commenting on the relative complexity of _both_ UHCI drivers.
>
> Hmm, I was hoping for an actual patch... :)

Possibly after (a) we get down to just one UHCI HC, (b) that gets some
other simplifications first, and (c) I fix a few open issues in the
OHCI and EHCI code.  But I was more hoping that the maintainer of the
eventual single UHCI driver would shrink that code down smaller,
instead.

Like I said:  I was just trying to point out a way things could be
done better (smaller, simpler, faster).  There's a cost to change, and
we're at a point in the 2.5 cycle where those costs might reasonably
be paid, so I think it'd be worth exploring.

>>> However, when we drop the lock, the other CPUs are free to do
>>> pretty much anything to the list.
>>
>> Like I said ... that's fine, and it needn't cause any confusion
>> unless whatever code scans the schedule is being buggy.  Heck, it's
>> an exact corollary of the problem of updating the hardware's view
>> of that same schedule:  if the one problem can be solved, so can
>> the other.
>
> I fail to see the corollary.

It's two views of the same data structure:  isomorphic.  The fact that
the HC and HCD (including a copy of the HCD running on another CPU)
use that structure slightly differently doesn't matter; the concurrent
update issues come up with both views.

> The hardware processes the schedule once, and at the next frame it
> starts from the beginning again.

Well, only the next frame's part of the schedule.  UHCI merges the
periodic and async schedules, but that's not an issue.

> The interrupt handler needs to walk the list of URBs once and only
> once.
>
> The problem is really with removing entries from the list. ...

Yes, that's the way it should be.

> There are no provisions like this in the urb_list case.  If we have
> a list like this
>
>	A - B - C - D - E
>
> and we're processing URB B and C gets removed while we're traversing
> the list, things can break.  That's what we need to protect against.

I'm not quite clear what scenario you're talking about.  If they're
not queued to the same endpoint, there's no issue, except for HCDs
using that "list of all active URBs" model.  (Another area where
avoiding that model gives simplification ... :)

If they're queued to the same endpoint, URB A can't matter (it's been
completed and dequeued already if we're now processing B), and in any
case the endpoint had to be descheduled to prevent the HC from
processing C's TDs as we're freeing them.  Code that's handling the
unlink for C can't progress until it's really off the schedule, and
won't run while B's TD processing is running, because the schedule
will necessarily be locked.  (In the "single lock per HCD" model I've
been talking about.)

> If you have a list of URBs, you need a lock.  Period.  End of story.
> The HC may provide this functionality (as in OHCI), but in UHCI the
> HC doesn't help us.

No HC understands URBs, and I think I've demonstrated that having a
list of URBs is not necessary.  (By code, also by pointing out that
the schedule is isomorphic to a list of active URBs.)  There's no
issue of hardware support (or lack thereof).

> However, you knew that already since ehci-hcd protects against that
> by dropping the lock before making the completion callback.
> However, it adds extra complexity.

No complexity that I really noticed.  Such as it is, it's all
localized in the code that scans for completed work ... rather than
spread throughout the driver.  The UHCI drivers are (both) a _lot_
more complex in terms of how much state they manage and track.
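The whole pattern amounts to roughly this (a sketch with illustrative
names, not the literal ehci-hcd code):

	/* caller holds hcd->lock.  hand one completed URB back to
	 * its driver; the URB is already off the schedule (and off
	 * the shadow) by this point, so nothing else references it.
	 */
	static void giveback_urb (struct my_hcd *hcd, struct urb *urb)
	{
		spin_unlock (&hcd->lock);
		urb->complete (urb);	/* may submit or unlink URBs */
		spin_lock (&hcd->lock);
	}

The only code that has to care is the completion scan loop, which just
assumes the schedule may have changed across any giveback.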
>> You must have been reading what someone else wrote.  For starters,
>> the "usb-uhci" codebase does _not_ implement the approach I
>> sketched.
>
> I know that usb-uhci-hcd has many similarities to what you do in
> ehci-hcd, but there are some differences, which may or may not be
> relevant to the topic at hand.

I don't know of any intentional similarities, FWIW.  If I built on any
other driver it'd more likely be Roman's OHCI code; and that was only
at the framework level, not in the guts of HC queuing.

> I based my comments on trying to interpret what you've said.  Since
> you were referring to the "schedule" in your previous email, I
> interpreted that as the schedule the HC processes.  That's where all
> of my comments came from.
>
> Thankfully, you've explained that you meant a "shadow schedule",
> which is different and thus isn't as bad as I was led to believe.

Right.  In a sense, every HCD except OHCI keeps a "shadow schedule",
but (at this point) only EHCI organizes it like the hardware does.  I
was talking about advantages of that organization, like being able to
do less work when scanning ... but mostly not having code to keep a
second not-quite-parallel data structure in sync with the hardware.
That's a "simpler is better/faster" tradeoff.

> ...
> ehci_urb_dequeue
>     intr_deschedule, which has this tidbit
>         qh->qh_next.ptr = 0;
>
> Now, you correctly noticed that this call chain can occur, and to
> restart it you check to see if next->ptr is set to NULL (to denote
> the URB was unlinked) and restart the loop from the beginning, since
> the list has been changed.

Right -- a decidedly atypical case, with a simple (and highly
localized) recovery strategy.  Typical case is one quick pass;
atypical case adds a (shorter) second pass.

The periodic schedule is intended/optimized for the case of long-term
stability:  hubs and webcams stay plugged in for months at a time,
drivers running constantly and never wanting transfers to stop.

What I'd call "jumping through hoops" is anything that spreads
knowledge of such rare cases throughout the whole driver ... which is
exactly what that UHCI "complete_list" does, making two passes become
typical.  :)

- Dave
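P.S.  That recovery is localized to roughly this (paraphrased, not the
literal ehci-hcd source; types pared down to the one link that
matters):

	union shadow { struct qh *qh; void *ptr; };

	struct qh {
		union shadow	qh_next;
		/* ... hardware fields, TDs, etc. ... */
	};

	struct qh	*qh;
	union shadow	next;

 rescan:
	for (qh = periodic [frame]; qh; qh = next.qh) {
		next = qh->qh_next;	/* snapshot before scanning */

		/* ... collect this QH's completed TDs, maybe give
		 * back URBs (which drops and retakes the lock) ...
		 */

		if (qh->qh_next.ptr != next.ptr)
			/* an unlink zeroed the link while the lock
			 * was dropped; take one more, shorter, pass
			 */
			goto rescan;
	}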
