Re: [linux-usb-devel] [PATCH as498] USB API modification

Alan Stern Wed, 13 Apr 2005 15:13:24 -0700

On Wed, 13 Apr 2005, David Brownell wrote:

> That doesn't describe the submit paths at all (linking things into
> the schedule) ... and doesn't describe the relevant bits of the unlink
> paths either (unlinking them from the schedule, at least from the
> host perspective; the HC will agree later).
> 
> URB completion -- either naturally, or from unlink/cancel -- does
> lend itself to a bottom half, as I'd noted.  But that's distinct
> from submission.


> There should be no problem blocking IRQs for short periods, and as
> you noted, you can't really avoid doing that; you've got to maintain
> schedule integrity after all.
> 
> Maybe you're just thinking that some of the stuff now done under
> that lock doesn't need to be done there?  For example, allocating
> and preparing all the TDs associated with a given URB, prior to
> linking them in on the submit path.

> A tasklet for the IRQ handling code wouldn't bother me the way even
> the concept of one in the submit path does!

I sense a common theme in these comments!  You're clearly concerned about 
undue delays during submission.

I don't see any way around it, though.  Yes, I would like to take a lot of 
the stuff now done with interrupts disabled (which isn't quite the same as 
under the spinlock) and let them run with interrupts enabled.  Remember, 
however, that ->enqueue() can be called with interrupts already disabled!  
The necessary conclusion is that it mustn't do very much.  Certainly it 
musn't try to allocate and prepare a whole lot of TDs.  Note that the 
controller is capable of going through up to 19 max-size bulk packets per 
frame -- that's more than I would like to prepare with interrupts 
disabled.


> > I'm not so sure about unlinking being on the critical path. 
> 
> Unlink can be argued, since it's "rare"; but on the other hand, I'm
> thinking of how the usb_sg_cancel() stuff needs to work to maintain
> datastream integrity.  What's important is that the endpoint queue
> be taken off the hardware schedule ASAP ... the rest can wait, like
> any other completion.

I don't agree.  Generally unlinks aren't critical at all; they tend to
occur after a reasonably lengthy timeout.  If you've just spent 5 seconds
waiting for transfer to complete, how does it hurt to spend 2 ms waiting
for it to be cancelled?

Datastream integrity is a separate issue.  It's not directly related to 
how quickly the entries are taken off the hardware schedule.


> > It should be possible to deliver pt_regs even with a top-half/bottom-half 
> > split. 
> 
> The correct pt_regs -- current as of the IRQ?  I thought they
> were specific to that IRQ context.

I'm not sure about this.  My understanding, which could be wrong, is that
pt_regs contains nothing more than the register contents at the time the
interrupt was fielded.  So while they are specific to an IRQ context, they
can usefully outlive that context.  Furthermore, the only purpose it's
used for is things like Alt-SysRq-R (show Registers) -- clearly a function
of the keyboard handler, which is why DaveM is interested -- and for that
it doesn't matter exactly which IRQ the pt_regs is associated with if
several of them occur in quick succession.  It's possible there are other
uses as well; I can't imagine that any of them would care if they get the
pt_regs associated with the most recent IRQ instead of the IRQ
corresponding to when the URB actually completed (which isn't necessarily
well defined to begin with!).


> I've deferred doing HCD performance work, on the grounds that usbcore
> has -- until relatively recently! -- not been correct enough to trust.
> 
> Over the last year, I think most of those correctness issues have
> been getting resolved.  Which means that maybe it's finally time to
> do some performance work...

I've been too busy with other things, but I'll try to set up some 
experiments reasonably soon.


> > Now you're reading much more into this than I ever said.  All I did was 
> > change the API so now it claims that when an URB is unlinked, its queue 
> > isn't guaranteed to stop until all the preceding URBs have been completed.  
> > Surely you can't object to that? 
> 
> Well, previously it specified that the queue was stopped when that
> canceled/unlinked URB was completed -- regardless of preceding (or
> in fact, following) URBs.  The only glitch was "case (3b)", which
> you didn't exactly identify, and which is basically an API bug
> associated with the notion of selective cancelation.
> 
> Not stopping "until all the preceding URBs have been completed"
> still seems like a _needless_ change, and I don't much like those
> in areass where I've put effort into coming up with as simple a
> formulation as possible.  And it wasn't how I read your text ...

Why not?  The exact text of the patch was: "... if an URB is unlinked
before the hardware has started to execute it, then its queue is not
guaranteed to stop until all the preceding URBs have completed".  How else 
could you read it?

In principle both forms of behavior can be covered by a single statement:
"If an URB terminates with a fault, then while its completion handler runs
the hardware will not handle any URBs from later in the queue (although it
may have previously started to handle some)" -- that parenthetical portion
referring to case 3b.

> > Do you know of any code that relies on  
> > the queue being stopped as soon as any URB is unlinked, even if that URB 
> > is still 15 deep in the queue?
> 
> I gave the two "big" examples before (usbnet, usb_sg_cancel)
> and they both care to some extent:  the whole goal being to
> cancel the whole queue ASAP, but needing to achieve that
> using a cancel-one primitive.
> 
> You still haven't explained how you could safely (== no
> data stream corruption, ever) unlink even one URB without
> stopping the whole queue, though, so I clearly miss the
> point of your question.  :)

Here's how.  You submit an URB.  uhci-hcd makes a note that the bottom
half needs to link this URB into the schedule and returns without doing
anything else.  Then before the bottom half gets a chance to run, you
unlink the URB.  Eventually the bottom half handler fires up and sees that
the URB has been unlinked and was never put on an endpoint queue.  So it
doesn't stop any queue and simply calls the completion handler.

The point of my question was this.  Suppose the queue is currently 20
deep, with the hardware working on the first URB, when a driver unlinks
the fifteenth.  If the HCD takes that URB off the schedule and calls the
completion routine while the hardware is still way back on the first or
second URB, why should the driver care if the queue is still running?  
How could it even tell?  All it cares about is that the queue should stop
before the hardware reaches URB 16.  If the HCD can make that guarantee
without stopping the queue, why shouldn't it?


> Forget about detecting; think about the consequences of NOT
> stopping the whole data stream.  Driver says "write A, B, C",
> then later unlinks B (plus A and C in some order).  It's
> always OK to get part way through that queue before acting
> on the "unlink B" ... "A", or "A, B", even "A, B, C" written
> to the device is no problem.  But "A, C" can be deep trouble,
> e.g. wrong data gets written to disk; how to prevent it?

Very simply.  Either unlink the URBs in reverse order (C, then B) or else 
unlink B and then unlink C from within B's completion routine.  Either way 
is guaranteed to avoid the troublesome "A, C" case, even with my new 
semantics.

> > > Morphing (3b) into (3a) implies some sort of additional mechanism to 
> > > couple
> > > completion reports to the (hardware) endpoint queue.
> > 
> > I don't know about EHCI or OHCI.  With UHCI it would be necessary to 
> > change things so that a hardware URB completion isn't reported right away 
> > if the URB was unlinked.  Is that what you mean?
> 
> More like holding back all fault completions until the queue stops.
> For hardware faults ("errors") that should already be happening,
> at least for non-ISO endpoint queues.  Doing that for software
> faults (unlink/kill/cancel) would be the new behavior.

That's what I said: Don't report the fault completion right away when an
URB is unlinked; hold it back until the queue stops and all the unlinked
URBs are returned.  Only the "don't report right away" part is new.  The
existing code already takes care of completion reports when the queue
stops; no additional mechanism is needed there (contrary to what you
said).

Alan Stern



-------------------------------------------------------
SF email is sponsored by - The IT Product Guide
Read honest & candid reviews on hundreds of IT Products from real users.
Discover which products truly live up to the hype. Start reading now.
http://ads.osdn.com/?ad_id=6595&alloc_id=14396&op=click
_______________________________________________
[email protected]
To unsubscribe, use the last form field at:
https://lists.sourceforge.net/lists/listinfo/linux-usb-devel

Re: [linux-usb-devel] [PATCH as498] USB API modification

Reply via email to