On Fri, 17 Sep 2004, David Brownell wrote:

> > How do we tell khubd that it should shutdown or reset a root  
> > hub?  One possible way is to use a special sentinel value in hub->nerrors.  
> > Normally one doesn't expect there to be any errors in the status URB for a 
> > root hub.  So if khubd sees that nerrors is 9999 (for example) and that 
> > the hub is a root hub, it would know to do a poweroff-suspend.
> 
> Sounds like you're thinking about generalizing this differently:
> as "hub died" plus special "is it root" case.  I like that notion
> better without the special case ... :)

Actually, I'm trying to generalize along a different dimension: How to
handle root hubs.  With other USB devices, a logical disconnect is
implemented by setting change_bits in the device's parent hub.  Resets are
implemented by telling the parent hub to reinitialize the appropriate
port.  Suspend and resume are implemented by telling the parent hub to
suspend/resume the port.  Evidently none of those mechanisms can be used
with root hubs.

So we need both a way for khubd to carry these things out, and a way to 
_tell_ khubd to do them.  Two separate problems, but they both fall under 
the same special case.  The idea of reusing nerrors for notifications is 
a solution to the second problem -- how to tell khubd that something 
needs to be done.
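
To make the notification idea concrete, here is a minimal user-space
sketch, not actual usbcore code.  Everything in it is hypothetical:
struct hub_sketch stands in for the real hub structure, and the names
HUB_NERRORS_POWEROFF, request_root_poweroff() and khubd_check() are
invented for illustration.  Only the 9999 value and the "is it a root
hub" test come from the proposal quoted above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical sentinel; 9999 is just the example value from the proposal. */
enum { HUB_NERRORS_POWEROFF = 9999 };

/* Hypothetical stand-in for the hub state, NOT the real struct usb_hub. */
struct hub_sketch {
	bool is_root;	/* true for a root hub */
	int nerrors;	/* normally counts status URB errors */
};

enum hub_action { HUB_ACT_NONE, HUB_ACT_POWEROFF_SUSPEND };

/* HCD (or glue layer) side: leave a request for khubd to find. */
void request_root_poweroff(struct hub_sketch *hub)
{
	assert(hub->is_root);	/* the sentinel only has meaning for root hubs */
	hub->nerrors = HUB_NERRORS_POWEROFF;
}

/* khubd side: decide what to do when it next examines this hub. */
enum hub_action khubd_check(struct hub_sketch *hub)
{
	if (hub->is_root && hub->nerrors == HUB_NERRORS_POWEROFF) {
		hub->nerrors = 0;	/* consume the request */
		return HUB_ACT_POWEROFF_SUSPEND;
	}
	return HUB_ACT_NONE;
}
```

Note that an external hub whose error count happened to reach the same
value would be left alone, since the check is limited to root hubs.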


> > My feeling (and I acknowledge that this may not be fully grounded in fact) 
> > is that when the HC dies or needs to be reset, it's because something we 
> > don't know about has gone badly wrong -
> 
> Right, and so the question is what to do about it.  Today this rare code
> path just tries to stop the HCD.  Then it'd be up to the sysadmin what to
> do -- maybe type "rmmod HCD; modprobe HCD" on a non-USB keyboard,
> or maybe press the reset button.
> 
> In the future usbcore might be able to just restart the controller, either
> by re-using the existing memory or:
> 
> > - and therefore we should begin  
> > with a clean slate: release all the hooks, destroy all the data 
> > structures, deallocate all the memory, etc.  To do otherwise might risk 
> > leaving in place whatever mechanism caused the original problem.
> > 
> > Maybe this is just overly cautious...
> 
> There are dozens of "how to restart" policies.  That one's much like
> the sysadmin would achieve by rmmod/modprobe, but automated,
> and applying only to the controller that died.   I was thinking of a
> lighter weight reset option, with fewer opportunities for new faults.
> 
> Maybe this just reflects me running into too many "rmmod hangs"
> or "modprobe hangs" failure modes.  Differently cautious.  :)

Yeah.  After some more thought, it seems likely that whatever caused the
HC to die either wouldn't need such a thorough reset or wouldn't be helped
by it.  With UHCI, for example, the HC will shut itself down if it
encounters invalid data in the schedule (result of a driver bug; the only
way to fix it is to remove the bug) or PCI troubles (probably transient;
if not then reinitialization won't help).


> > Obviously I haven't given this a tremendous amount of thought.  But 
> > suppose, for example, that when khubd wants to reset or resuscitate an HC 
> > that it calls the HCD's pci_driver.remove() routine, then the 
> > pci_driver.probe() routine.  (And does something equivalent for non-PCI 
> > controllers.)  The HC driver really _wouldn't_ know that the device exists 
> > during the intermediate interval.
> 
> The USB stack would still be knowing, OK, minor difference ...  It'd have
> to remember the device's bus, and call bus_rescan_devices().  Either way,
> the bus rescan might not give the device back to usbcore/khubd.

Not the way I was proposing.  The driver core and PCI core wouldn't know
anything had happened and no rescanning would be needed, since the
remove() and probe() routines would be called directly.  And the special
knowledge wouldn't really reside in the USB stack as a whole; it would
exist only in some local variable in the khubd thread.  As far as the rest
of the USB stack is concerned it would appear that the HC really did go
away and then a new one showed up.

But there are other problems with trying to do this (like how to handle 
non-PCI devices).  Overall it's probably not the best approach.


> > If people believe that my reasoning above isn't valid, and it's not
> > necessary to perform such a thorough reset, then fine -- doing the
> > equivalent of poweroff-suspend and resume would be almost as good.
> 
> It's one of many policy choices.  I just like the idea of turning this
> "rare" case into one that's more common (resume after poweroff),
> so the handling there sees reasonable amounts of testing ... and
> keeping it light weight to avoid adding new error cases.  I don't
> much like "couldn't re-allocate memory", for example!

Sticking to the more common case makes sense.


> > P.S.: While we're on the subject of root hubs...  What is an HCD supposed 
> > to do when the root hub is suspended and a remote wakeup IRQ arrives?  
> > Maybe this calls for another sentinel value in hub->nerrors.
> 
> The root hub should just wake up; how else can it handle the IRQ?
> If it were in a PM mode where the state couldn't change, it wouldn't
> have enabled wakeup before it went to sleep.

You missed my point.  The root hub _can't_ "just wake up"; someone has to
call usb_resume_device(), and that can't be done in interrupt context.  The
logical choice for that someone is again khubd -- so how does the HCD (or
the hcd glue layer) tell khubd that the root hub needs to be resumed?
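
The shape of the problem, as a user-space sketch only: the interrupt
side may do nothing more than record the request, and the thread side
does the actual work later.  All the names here (struct root_hub_sketch,
hcd_remote_wakeup_irq(), khubd_handle_resume()) are made up for
illustration, and a C11 atomic flag stands in for whatever notification
mechanism ends up being used.  Clearing "suspended" is a placeholder for
the real resume call.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical root-hub state, not a real usbcore structure. */
struct root_hub_sketch {
	atomic_bool resume_pending;	/* set from "interrupt context" */
	bool suspended;			/* touched only by the khubd side */
};

/*
 * Interrupt side: must not sleep, so it only records the request.
 * A real driver would also have to wake the khubd thread here.
 */
void hcd_remote_wakeup_irq(struct root_hub_sketch *rh)
{
	atomic_store(&rh->resume_pending, true);
}

/*
 * khubd side: runs in process context, where sleeping is allowed.
 * Returns true if a pending resume request was handled.
 */
bool khubd_handle_resume(struct root_hub_sketch *rh)
{
	/* Fetch and clear the flag in one step so no request is lost. */
	if (!atomic_exchange(&rh->resume_pending, false))
		return false;
	rh->suspended = false;	/* placeholder for the real resume */
	return true;
}
```

This leaves the question open; it only shows why some flag or message
has to pass from the IRQ handler to khubd.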

Alan Stern


