Re: [linux-usb-devel] [PATCH as498] USB API modification

David Brownell Wed, 13 Apr 2005 07:30:48 -0700

On Monday 11 April 2005 8:25 pm, Alan Stern wrote:
> On Mon, 11 Apr 2005, David Brownell wrote:
> 
> > > Actually this is a new issue, not a lurking one.  The whole point of that
> > > patch was because I'm going to make the majority of the UHCI driver run in
> > > a bottom half (a tasklet).  This includes most of the work involved in
> > > submitting and unlinking URBs.
> > 
> > Erm ... why?  That'll mean many of the relevant data structures
> > will more likely be dcache-cold.
> 
> I don't see why it would.  Especially if bottom-half handler happens to be
> running a bit behind, so that it takes care of more than one IRQ
> invocation in one go.  Then the data structures would be dcache-warmer 
> than they are now.


That doesn't describe the submit paths at all (linking things into
the schedule) ... and doesn't describe the relevant bits of the unlink
paths either (unlinking them from the schedule, at least from the
host perspective; the HC will agree later).

URB completion -- either naturally, or from unlink/cancel -- does
lend itself to a bottom half, as I'd noted.  But that's distinct
from submission.



> > For unlinking, the operations should be just patching the schedule
> > (under spinlock) and requesting an IRQ.
> 
> That's right, messing around with the schedule.  And since unlinking is
> allowed to happen in_interrupt, it means _nothing_ can touch the schedule
> without disabling interrupts.  That's my main reason for doing this -- so
> that the heavy-duty work (i.e., the stuff that uses or changes the
> schedule) can take place without blocking interrupts.

There should be no problem blocking IRQs for short periods, and as
you noted, you can't really avoid doing that; you've got to maintain
schedule integrity after all.

Maybe you're just thinking that some of the stuff now done under
that lock doesn't need to be done there?  For example, allocating
and preparing all the TDs associated with a given URB, prior to
linking them in on the submit path.

Or -- I've not looked at the innards of UHCI in a long time! -- maybe
you're more focussed on cleaning up the completion paths than it
seemed, from your inclusion of "submit" paths.  Last I looked at
the UHCI code I didn't understand why it had a bazillion locks and
queues complicating that completion logic; that's not an area where
UHCI needed to be different from OHCI or EHCI, both of which are a
lot simpler.  I know you got rid of most of the locks a while back...


> >  Unlinking is both rare and,
> > when it happens, on the critical path.  Submitting is likewise on
> > the critical path ... though it happens a lot.  Neither of those
> > would seem to me like good candidates for putting into a tasklet.
> 
> I agree about submitting.  It's a definite weak spot for the projected
> change.

A tasklet for the IRQ handling code wouldn't bother me the way even
the concept of one in the submit path does!


> I'm not so sure about unlinking being on the critical path. 

Unlink can be argued, since it's "rare"; but on the other hand, I'm
thinking of how the usb_sg_cancel() stuff needs to work to maintain
datastream integrity.  What's important is that the endpoint queue
be taken off the hardware schedule ASAP ... the rest can wait, like
any other completion.


> > I could see in some cases wanting the IRQ processing to be split
> > into the classic top half (in_irq) and bottom half.  I even had
> > EHCI working that way, originally ... DaveM removed that, mostly
> > to ensure that pt_regs got delivered to keyboard IRQ handlers
> > (evidently some keyboard code really wants pt_regs), but also to
> > make it a bit simpler.
> 
> It should be possible to deliver pt_regs even with a top-half/bottom-half 
> split. 

The correct pt_regs -- current as of the IRQ?  I thought they
were specific to that IRQ context.


> > Is this because of those no-details-provided reports of high IRQ
> > latencies with USB on one board?  It'd have been interesting to
> > see some accounting for those latencies.  For example, oprofile
> > traces of where the IRQ handlers spend time...
> 
> That was the main motivation, although I'd had this in the back of my mind 
> for a long time.  Yes, it would be nice to see some real numbers for those 
> latencies.  I suppose I could add some high-precision timer stuff to the 
> driver, to see just how much time actually is spent with interrupts 
> disabled...

I've deferred doing HCD performance work, on the grounds that usbcore
has -- until relatively recently! -- not been correct enough to trust.

Over the last year, I think most of those correctness issues have
been getting resolved.  Which means that maybe it's finally time to
do some performance work...


> > > > To safely run that out of sequence, its queue clearly stopped in some
> > > > sense.
> > > 
> > > Not if the URB was never in the hardware queue to begin with. 
> > 
> > From the usbcore perspective, there's only one queue.  If you're
> > proposing that usbcore start needing to think about two separate
> > queues, that's significantly more invasive than anything that's
> > happened in usbcore over the past few years (including the almost
> > total rewrite of the hub and enumeration code)...
> 
> Now you're reading much more into this than I ever said.  All I did was 
> change the API so now it claims that when an URB is unlinked, its queue 
> isn't guaranteed to stop until all the preceding URBs have been completed.  
> Surely you can't object to that? 

Well, previously it specified that the queue was stopped when that
canceled/unlinked URB was completed -- regardless of preceding (or
in fact, following) URBs.  The only glitch was "case (3b)", which
you didn't exactly identify, and which is basically an API bug
associated with the notion of selective cancelation.

Not stopping "until all the preceding URBs have been completed"
still seems like a _needless_ change, and I don't much like those
in areass where I've put effort into coming up with as simple a
formulation as possible.  And it wasn't how I read your text ...


> Do you know of any code that relies on  
> the queue being stopped as soon as any URB is unlinked, even if that URB 
> is still 15 deep in the queue?

I gave the two "big" examples before (usbnet, usb_sg_cancel)
and they both care to some extent:  the whole goal being to
cancel the whole queue ASAP, but needing to achieve that
using a cancel-one primitive.

You still haven't explained how you could safely (== no
data stream corruption, ever) unlink even one URB without
stopping the whole queue, though, so I clearly miss the
point of your question.  :)

 
> It's not at all clear to me that this would be a visible change.  Can you 
> think of any way to detect how soon a queue stops from within a driver?

Forget about detecting; think about the consequences of NOT
stopping the whole data stream.  Driver says "write A, B, C",
then later unlinks B (plus A and C in some order).  It's
always OK to get part way through that queue before acting
on the "unlink B" ... "A", or "A, B", even "A, B, C" written
to the device is no problem.  But "A, C" can be deep trouble,
e.g. wrong data gets written to disk; how to prevent it?

 
> > Morphing (3b) into (3a) implies some sort of additional mechanism to couple
> > completion reports to the (hardware) endpoint queue.
> 
> I don't know about EHCI or OHCI.  With UHCI it would be necessary to 
> change things so that a hardware URB completion isn't reported right away 
> if the URB was unlinked.  Is that what you mean?

More like holding back all fault completions until the queue stops.
For hardware faults ("errors") that should already be happening,
at least for non-ISO endpoint queues.  Doing that for software
faults (unlink/kill/cancel) would be the new behavior.


> >  I don't remember anyone speaking "against" that, except
> > in the case of ep0 ... the discussion stopped only because of a lack
> > of energy to rewrite the HCDs to address such API design bugs.
> 
> Here's the start of the thread, if you want to refresh your memory:
> 
> http://marc.theaimsgroup.com/?l=linux-usb-devel&m=110805033904399&w=2
> 
> Looking through the messages, I see that Oliver was lukewarm at best, you
> didn't think it was a particularly high priority because all the HCDs
> currently handle single-URB unlinking okay and furthermore it would mess
> up usbtest #10, and Wolfgang Muees suggested that URBs should have
> unlimited retries (okay, that was a bit OT).  _Nobody_ was really in favor
> of doing it, except possibly me.

That summary matches mine, I think.  (The usbtest #10 example is for ep0.)
The problem is that the "cancel-one" semantics embeds hardware races in
the API (as with case 3b), and is actually not simple to implement.  But
fixing that isn't simple either, and for the moment our workarounds seem
to suffice.

- Dave


-------------------------------------------------------
SF email is sponsored by - The IT Product Guide
Read honest & candid reviews on hundreds of IT Products from real users.
Discover which products truly live up to the hype. Start reading now.
http://ads.osdn.com/?ad_id=6595&alloc_id=14396&op=click
_______________________________________________
[email protected]
To unsubscribe, use the last form field at:
https://lists.sourceforge.net/lists/listinfo/linux-usb-devel

Re: [linux-usb-devel] [PATCH as498] USB API modification

Reply via email to