On Sun, 15 Oct 2006, David Brownell wrote:

> > > >         Queue B is the flow of URBs from the driver to ehci-hcd.  If this
> > > >         queue drains then the bandwidth is deallocated, something you
> > > >         desperately want to avoid.  It's up to the higher-level driver to
> > > >         keep the queue non-empty, even if that means submitting URBs with
> > > >         dummy data.  Latency has no effect here.
> > > > 
> > > >         Queue C is the flow of packets from the host controller to the
> > > >         device.  If this queue drains it is a loss-of-sync. 
> > > 
> > > And queue C would never drain unless queue B drains first... since they
> > > are coupled one-to-one.  No packet goes to/from the peripheral (C) unless
> > > it's been told to do so by the driver (B).
> > 
> > Not true at all.  Queue C drains (i.e., the device is left with a gap in
> > its data stream) whenever a slot isn't filled in time.  This can be caused
> > by excessive kernel latency, even if queue B remains full.
> 
> I guess you're defining things differently than I am then.  Neither OHCI
> nor EHCI supports the steady-state notion of an URB that's been handed to
> the HCD which has NOT been handed to the hardware.  And even the spinlocked
> code paths where the driver is adding to the HCD queue B can't realistically
> be said to complete before adding to the HC queue C.
> 
> That is, the notion of an URB that's on queue B (handed to HCD) yet is
> not on queue C (handed to HC) seems nonsensical to me.

I see where the confusion arises.  It's because an URB remains on queue B
even after it has been removed from queue C -- in fact, it effectively
stays on queue B until its completion handler returns.  So if the HCD
encounters a large IRQ latency and doesn't have a chance to call the
completion handler for several milliseconds, C can drain while B remains
non-empty.

As you say, this is largely a matter of definition.  But what's _not_ a 
matter of definition is that queue C can drain without the endpoint losing 
its bandwidth allocation.


> > > > So we only need to consider errors caused by Queue C draining.  Currently
> > > > there is no standardized way to report these errors back to the
> > > > higher-level driver.  Looking through documentation/usb/error-codes.txt,
> > > > the closest thing we see are these wonderful entries:
> > > > 
> > > > -EXDEV                  ISO transfer only partially completed
> > > >                         look at individual frame status for details
> > > 
> > > ... that can itself be the individual frame status though!
> > 
> > Well yes, that's the idea.  When a slot is missed, the frame status is set 
> > to -EXDEV.
> 
> Hence "look at individual frame status" is going to get you back to -EXDEV;
> maybe that's some of the wonderfulness you were hinting at.  :)

Indeed.  I interpret that comment as meaning "Look at the individual frame 
status values to see which frames were or were not completed".  Even then, 
it's not clear that -EXDEV in the frame status means the controller missed 
the frame.

I wouldn't mind changing either urb->status or the frame status to use 
-EL2NSYNC instead of -EXDEV.
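
To make the "look at the individual frame status values" reading concrete,
here is a rough userspace sketch of the check a driver's completion handler
might do.  The struct is a cut-down stand-in for usb_iso_packet_descriptor,
and both iso_desc_model and count_missed_frames() are names made up for this
illustration, not existing kernel API.  It also assumes that -EXDEV in a
frame's status really does mean "slot missed", which is exactly the point
that isn't settled.

```c
#include <errno.h>
#include <stddef.h>

/* Simplified model of a per-packet descriptor (hypothetical, userspace only). */
struct iso_desc_model {
    unsigned int length;        /* bytes requested for this packet */
    unsigned int actual_length; /* bytes actually transferred */
    int status;                 /* 0 on success, negative errno otherwise */
};

/*
 * Count the packets whose status is -EXDEV.  Under the assumption stated
 * above, these are the packets whose slots the controller missed.
 */
size_t count_missed_frames(const struct iso_desc_model *desc, size_t n)
{
    size_t missed = 0;

    for (size_t i = 0; i < n; i++) {
        if (desc[i].status == -EXDEV)
            missed++;
    }
    return missed;
}
```

If the per-frame code were changed to -EL2NSYNC, only the comparison in the
loop would need to change.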


> > > > Sometimes it will be apparent at submission time that U is already too
> > > > late.  For instance, slot N's microframe might already be over.  In such
> > > > cases it is possible to return a submission error.  Let's call this
> > > > option #1.
> > > 
> > > Right, and that's the intent of the current reporting of EL2NSYNC.
> > 
> > Which is undocumented and hence a new addition to the API.
> 
> Well, "new" ~= 3+ years by now.  You have to know that specific documentation
> has never been complete/accurate.

Yes, well, the history doesn't matter at this point.  We need to settle on 
a single sane approach, document it, and make sure that all the HCDs follow
it and that all the drivers are aware of it.


> > > > The only other reasonable option, #3, is to report an error upon the 
> > > > completion of U.  These two events (submission and completion) are the 
> > > > only chances ehci-hcd has to communicate with a higher-level driver.
> > > 
> > > This #3 is when -EXDEV gets reported, and/or noticing the start_frame
> > > hiccup.  I see no way around having these.
> > 
> > It seems clear that #3 is unavoidable, because there are circumstances in 
> > which the HCD is unable to use #1 or #2.  The real question is whether #1 
> > should be used at all.
> 
> No; I'd say that #1 and #3 are significantly different faults.

Strictly speaking, #1 and #3 are different options for reporting faults.  
The two different sorts of faults are:

    (a) URB submitted so late that all its slots have definitely been
        missed;

    (b) URB submitted late enough that some of its slots get missed but
        the HCD can't tell whether all of them will get missed (or can
        tell that some of them won't).

I agree that it may be worthwhile reporting them differently, since (a)
can require drastic recovery action whereas (b) generally will not.  On a 
higher plane, though, both faults come down to the same thing: The URB was 
submitted late.  It's merely a question of very late vs. a little late.
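
As an illustration of the (a)/(b) split, here is a minimal sketch of how a
submission could be classified.  It is a simplified model, not HCD code: the
enum, the function name, and the parameters are all invented for this
example, frame numbers are assumed never to wrap, and the "close call" part
of (b), where the HCD can't tell, is ignored.

```c
#include <stdint.h>

/* Hypothetical classification of an isochronous submission. */
enum iso_lateness {
    ISO_ON_TIME,     /* no slot has been missed */
    ISO_PARTLY_LATE, /* fault (b): some slots missed, others still usable */
    ISO_ALL_MISSED   /* fault (a): every slot has already gone by */
};

/*
 * first_slot: frame intended for packet 0
 * npackets:   number of packets in the URB (must be >= 1)
 * interval:   spacing between slots, in frames (must be >= 1)
 * now:        earliest frame that can still be scheduled
 *
 * Assumes frame numbers do not wrap; a real HCD would need modular
 * arithmetic here.
 */
enum iso_lateness classify_submission(uint32_t first_slot, uint32_t npackets,
                                      uint32_t interval, uint32_t now)
{
    uint32_t last_slot = first_slot + (npackets - 1) * interval;

    if (now <= first_slot)
        return ISO_ON_TIME;
    if (now > last_slot)
        return ISO_ALL_MISSED;
    return ISO_PARTLY_LATE;
}
```

With 8 packets at interval 1 starting at frame 100, a submission seen at
frame 104 comes out as ISO_PARTLY_LATE and one seen at frame 108 as
ISO_ALL_MISSED.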

> Which makes the question different:  whether two such faults should
> be combined into one report, thereby discarding information that
> some drivers would be able to use.

Then you think we should report (a) by an error at submission time and (b) 
by the individual frame status values at completion time?


> > Auditing isn't enough; the drivers have to be fixed up to handle these 
> > faults in a reasonable way.  Unless ignoring them _is_ reasonable -- in 
> > which case why bother to report them?
> 
> As Monty said:  one issue is just "no regressions".
> 
> Another is that in _some_ cases ignoring is reasonable; but that does
> not mean it's always going to be reasonable.   That's why to "bother"
> reporting them even if ALSA, or near-term tweaks to ALSA, doesn't make
> effective use of them.

Okay.


> > For example, suppose the URB contains multiple slots and some of them have
> > already been missed while the rest are still okay.  Then the driver
> > doesn't have to do anything at all to catch back up.
> 
> Erm, URBs don't contain slots.  They contain packets.  Packets get put
> into slots by the HCD.

Well, URBs _really_ contain an array of usb_iso_packet_descriptor 
structures, which describe packets.  However I take your point: The 
association between the packet descriptors and the slots doesn't exist 
until the HCD sets it up.

> I'm catching up on some of this email, but this is that case where I had
> pointed out you were assuming a new/different scheduling policy.  If the
> policy is the existing "ASAP", then there *WILL* be a gap ...
> 
> If there's any catching-up to be done, something has to do it, and "ASAP"
> policy inside an HCD does not (so far as I've ever understood it) do that.

I think the "ASAP" policy, as you understand it, should not be used in
general.  HCDs shouldn't cause a mismatch between a packet descriptor and
its intended slot.

> > The issue then becomes, what if all the slots in the URB have been missed 
> > (or might be missed, since the HCD can't tell in cases where there's a 
> > close call)?  Okay, I admit, in this situation it makes sense to reject 
> > the submission entirely.  Maybe also set urb->start_frame to the next 
> > available slot.
> 
> Hmm, again we have different interpretations of what ASAP means.  In my
> book, this XRUN case could easily be scheduled ... because ASAP would
> clearly mean (as in the case right above!!) "starting right now".

Yes, differing interpretations.  Perhaps we should introduce a new URB 
submission flag, such as URB_ISO_STRICT, which would mean to schedule each 
packet for its intended slot even if it's already too late.  That way the 
driver could choose which policy to follow.
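
Roughly, the two policies would differ as in the sketch below.  This is only
a model of the idea: URB_ISO_STRICT does not exist yet, so the flag value
(SKETCH_ISO_STRICT), the SLOT_MISSED marker, and assign_slots() are all
placeholders invented for the example, and frame numbers are assumed not to
wrap.

```c
#include <stdint.h>

#define SKETCH_ISO_STRICT 0x1u /* placeholder value for the proposed flag */
#define SLOT_MISSED       (-1) /* marker: packet's intended slot is gone */

/*
 * Fill slot_out[k] with the frame assigned to packet k.
 *
 * With SKETCH_ISO_STRICT, each packet keeps its intended slot and is marked
 * SLOT_MISSED if that slot is earlier than 'now'.  Without it, a late URB is
 * slid forward as a whole so that packet 0 starts at 'now' (the "starting
 * right now" reading of ASAP).
 */
void assign_slots(int64_t *slot_out, uint32_t npackets, uint32_t interval,
                  uint32_t intended_first, uint32_t now, unsigned int flags)
{
    uint32_t first = (intended_first >= now) ? intended_first : now;

    for (uint32_t k = 0; k < npackets; k++) {
        uint32_t intended = intended_first + k * interval;

        if (flags & SKETCH_ISO_STRICT)
            slot_out[k] = (intended >= now) ? (int64_t)intended : SLOT_MISSED;
        else
            slot_out[k] = (int64_t)first + (int64_t)k * interval;
    }
}
```

For 4 packets intended for frames 100..103 and submitted when frame 102 is
the earliest available, the strict policy yields {missed, missed, 102, 103},
while the other yields {102, 103, 104, 105}.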


> > We don't currently have any way for a driver to tell the HCD it wants to
> > skip N (u)frames.  I suppose urb->start_frame could be used for this
> > purpose.
> 
> Heck, last I looked we didn't even have drivers that looked at the USB
> frame counter; nobody was even thinking about these issues.
> 
> I hope we agree that on URB completion, start_frame indicates when that
> frame started.

When the URB started, yes.  Or was nominally scheduled to start in the
case of URB_ISO_STRICT, since it didn't actually start until some time 
later.

> My understanding of start_frame on urb submission was that originally
> the idea was:  if ISO_ASAP wasn't set, that would specify the start frame.
> Now, I know that OHCI never implemented that; and EHCI didn't either.
> Maybe one of the UHCIs did/does.

Currently uhci-hcd does.  Of course this whole area is subject to change -- 
once we all agree on what it should change to!

> If we expect those semantics -- start_frame without ISO_ASAP set -- then
> we wouldn't need a "skip" mechanism.

That's okay with me, provided it can be made portable enough to work
consistently with all the HCDs.  For instance, if the previous URB (with N
packets) completed with start_frame set to S and the interval is I
(u)frames, then the driver should be able to set the next URB's
start_frame to S + (N+x)*I for some reasonable range of non-negative x,
even if this would overflow the number of bits used by the HCD in
reporting start_frame values.  The only restriction should be that it
doesn't fall outside the HCD's scheduling horizon.  And all HCDs should
allow x*I to be at least as large as 1 second.
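
The arithmetic I have in mind could be sketched like this.  It is a userspace
model under stated assumptions, not a proposal for actual HCD code: the
function names are invented, counter_bits stands for however many bits a
given HCD uses for its frame counter (it must be between 1 and 31 here), and
the horizon is whatever the HCD chooses.

```c
#include <stdbool.h>
#include <stdint.h>

/* Driver side: start_frame for the next URB, S + (N + x) * I. */
uint32_t next_start_frame(uint32_t s, uint32_t n, uint32_t x, uint32_t interval)
{
    /* The result may exceed the range of the HCD's frame counter. */
    return s + (n + x) * interval;
}

/*
 * HCD side: how many frames ahead of 'now' the request lies, computed modulo
 * the width of the frame counter.  Unsigned wraparound makes the subtraction
 * safe even when 'requested' has overflowed the counter's range.
 */
uint32_t frames_ahead(uint32_t requested, uint32_t now, unsigned int counter_bits)
{
    uint32_t mask = (UINT32_C(1) << counter_bits) - 1;

    return (requested - now) & mask;
}

/* Accept the request only if it falls inside the scheduling horizon. */
bool within_horizon(uint32_t requested, uint32_t now,
                    unsigned int counter_bits, uint32_t horizon)
{
    return frames_ahead(requested, now, counter_bits) <= horizon;
}
```

For example, with an 11-bit counter, S = 2040, N = 8, x = 0 and I = 1 give a
requested start_frame of 2048, which is 3 frames ahead of frame 2045 even
though 2048 no longer fits in 11 bits.  Note that a request slightly in the
past shows up as almost a full counter period ahead, so it is rejected by any
horizon shorter than that.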

Alan Stern


_______________________________________________
linux-usb-devel@lists.sourceforge.net
To unsubscribe, use the last form field at:
https://lists.sourceforge.net/lists/listinfo/linux-usb-devel