On Monday 11 April 2005 12:07 pm, Alan Stern wrote: > On Mon, 11 Apr 2005, David Brownell wrote: > > > > * Host Controller Drivers (HCDs) place all the URBs for a particular > > > * endpoint in a queue. Normally the queue advances as the controller > > > ... > > > + * hardware processes each request. But when an URB terminates with an > > > + * error its queue stops, at least until that URB's completion routine > > > + * returns. It is guaranteed that the queue will not restart until all > > > + * its unlinked URBs have been fully retired, with their completion > > > + * routines run, even if that's not until some time after the original > > > + * completion handler returns. Normally the same behavior and guarantees > > > + * apply when an URB terminates because it was unlinked; > > > > That "normally" bit restates the guarantee provided above. > > > > The redundancy makes things less clear (IMO) because it's > > weakening the guarantee that "when an URB terminates with > > an error" (including an unlink or kill status!!) its queue > > stops ... that says "always", this says "sometimes maybe". > > That's just it. The first part applies only when an URB completes with a > hardware error, _not_ an unlink or kill!
It was intended to cover _all_ fault reports. Unlink/kill is just a software-induced fault. > Do you think this needs to be > made more clear? The second part applies when an URB terminates because > of an unlink or kill. It doesn't weaken the first guarantee because it > refers to different circumstances. Well, lately I've taken to talking about "faults" as an umbrella covering both hardware-induced stoppage and software-induced unlink/kill. In fact that's what the original text said ... which also addresses Oliver's comment. > > > + * however if an > > > + * URB is unlinked before the hardware has started to execute it, then > > > + * its queue is not guaranteed to stop until all the preceding URBs have > > > + * completed. > > > > Well, that's because no urb before that one will have > > completed with any fault code, so there was no need > > for the queue to have stopped. > > True. And "normally" the action of an HCD for an unlink request must be to deschedule/stop that endpoint's queue... So there are three cases, depending on the queue state. Always have been, in fact; can't really change anything. (1) Queue stopped before HC touched that URB. This is the cleanest case. Any preceding requests will be processed normally ... e.g. completing with success, or possibly because of some problem the hardware reported. And that URB will get a clean unlink report, actual_length 0. (2) Queue stopped part way through that URB (0..transfer_length bytes transferred). Same thing, except that the URB being unlinked may have nonzero actual_length in the completion. (3) (Unlink RACE case) Queue didn't manage to stop before the hardware fully processed that urb, and processed the whole transfer. There are two subcases here: (a) queue stopped before that unlink completion was issued, so again the original text should be true; (b) queue still active, which is the only problem case. See below for (3b). For all other cases, the original statement is true ... and the (3b) problem has always existed. I suspect a lurking issue here may really be that UHCI still has some issues with how it handles unlinking, which may not be fixable until it actually has a solid notion of endpoint queue. (Call that leftover 2.4-ism, if so. The whole notion of even being able to unlink _just one_ request is very dubious; if you look at the USB spec, it talks about killing _all_ IRPs at once.) The downside of that would be potential loss of integrity of the usb_sg_*() queues on fault paths ... hard to test, and also fortunately rare to be real trouble. > > The relevant urb was not yet terminated. > > False. The relevant (i.e., the unlinked) URB is terminated by the time > its completion handler gets called. Well, its termination is the same thing as the completion handler being called. The request is pending until that point, and is no longer pending after that. > This may well occur before all the > preceding URBs have terminated, since completion handlers are allowed to > run out of sequence when an URB is unlinked. To safely run that out of sequence, its queue clearly stopped in some sense. If there's a hardware DMA queue (as with all the PCI based HCs), that queue must have been stopped in order to safely take that URB out of the queue. That's just because of how DMA works; there can be lots of caching involved, and the HCD must handshake with the HC to ensure that URB isn't being worked on. The only reliable way to do that is ensure that _all_ urbs in that queue aren't being worked. For PIO based HCs, there are two basic cases: (a) this URB has a pending transfer that must be completed first; or (b) it's later in the queue, and can be removed immediately. The first case is effectively just like coping with DMA I/O caching; while the second is a useful simplification, since the HCD can know (unlike with DMA queues!) that it's safe. (And FWIW, most of the current UDC drivers work more like the PIO case there. Only the net2280 driver has DMA queue support, and while that worked fine for IN/TX it was a real bear for OUT/RX ... so it's not normally used. The DMA drivers don't use DMA queues.) > According to the original wording the queue _would_ be stopped when the > unlinked URB's completion handler was called, even though the hardware > might not have reached the URB yet. Yes. How can you avoid stopping the queue in that case? If the hardware is chewing through the queue, the _only_ way to control it is to stop that endpoint queue. Otherwise the HC could be munching away on that very transfer while the HCD was handing it back to some driver... > The old text said: > > But when an URB terminates with any fault (such as an error, or > being unlinked) its queue stops... > > So the new text does describe different behavior from the old text. OK, so that new text is wrong then. The only problem case has ever been (3b), where the URB completed (at the hardware level) before the queue could be stopped. I don't see a clean way to handle (3b), but on the other hand it doesn't matter with current drivers. All code that unlinks one URB is intending to nuke the whole queue (remember that issue?): * usb_sg_cancel() does it back-to-front and goes to some effort to ensure that there's a clean abort after the first N bytes, no matter when exactly the hardware stopped working on the queue; * usbnet doesn't do back-to-front (for no good reason) but in that case it doesn't matter ... packets can be dropped or reordered by definition. So I'm not too worried about case (3b), although I think it should get fixed someday ... by removing the "unlink one URB" semantics, replacing them with a "kill this endpoint queue" request which is not "buggy-as-designed" (because of a hardware race in the API). - Dave ------------------------------------------------------- SF email is sponsored by - The IT Product Guide Read honest & candid reviews on hundreds of IT Products from real users. Discover which products truly live up to the hype. Start reading now. http://ads.osdn.com/?ad_id=6595&alloc_id=14396&op=click _______________________________________________ linux-usb-devel@lists.sourceforge.net To unsubscribe, use the last form field at: https://lists.sourceforge.net/lists/listinfo/linux-usb-devel