Alan Stern wrote:
I've pretty much reached the end of the changes originally lined up
for the hub driver, but there are two areas that still need
attention: unbinding and error recovery.  The problems and solutions

And suspend/resume (which I think is in hand for the moment, but surely deserves more attention) ... and power on/off, which we don't yet use.


aren't entirely well defined and some discussion might help clarify
what needs to be done.  There are several inter-related issues to
consider.


The unbinding question is: How should the hub driver handle being unbound from a hub that is not unplugged? This can happen in several different ways involving user intervention via usbfs or sysfs. We could try to disallow such things for hubs, but it seems more robust to allow them and then handle them properly.

If the hub driver is unbound, then the system won't be able to use the
hub very well.  In particular, connect changes won't be detected.  On
the other hand, devices that were plugged into the hub may still be
electronically accessible.  The cleanest way to handle this is to
disable all the in-use ports and call usb_disconnect() for their
children.  However, it's a little questionable how we can do this.

I'm not sure I see the issue. The hub disconnect() routine should shut down all (hub-managed) children before it returns, and disconnect all those devices from the usb device trees.


Once a driver's disconnect() is called, the driver isn't supposed to
communicate with the device any more.  By a quirk of the
implementation we don't enforce this restriction for endpoint 0, which
is all the hub driver needs for disabling all the ports.  Relying on
such quirks generally isn't a good idea, although this could
reasonably be considered a special case.

If hub disconnect() acts as I described above, what would use ep0 later on?


Of course, we can't disable the ports if the hub is suspended. I

No, but powering off the hub itself is usually an option. Except for some root hubs; those still don't fit in as smoothly as one might prefer.


think the only way to deal with this complication is to check
initially whether a device is suspended, and if it is, disallow
unbinding or configuration changes.  That's for all devices, not just
hubs.  As an additional complication, we _can't_ disallow unbinding
when a driver is being unloaded.  We have to let it unbind from its
interface -- even though we won't be able to set the interface back to
altsetting 0 afterwards.  (Luckily this part of the problem doesn't
affect the hub driver; it can't be unloaded unless all the usb_devices
are already gone.)

As a related matter, we also should disallow probing of interfaces on
suspended devices.  This won't be difficult, but it means a
newly-loaded driver might not be able to bind to all the interfaces it
should control.  I think that's unavoidable.

Ignoring the fault cases for the moment, I certainly agree that config changes (== change config or altsetting, driver bind) should only happen to devices that are configured and not suspended.

But disconnect/poweroff is a different case; we already know that
driver disconnect() must work with dead devices, so unbind()
ought to work for suspended devices.


Now let's discuss error recovery.  First, consider what sorts of
errors might occur.  One that the hub driver already checks for is 10
or more consecutive failures of the status interrupt URB (period is
256 ms).  Another, highly critical sort of error (not currently
handled) is failure of a hub to disable a port upon request.  Finally,
although not an error in the hub itself, is the possibility of rapid
repeated cycling of the connection status of a port -- as might happen
with a device that fails during initial probing, drops its connection,
re-establishes it, and repeats...
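
To make the first kind of check concrete, here is a minimal user-space sketch of a "consecutive failures" counter. The struct and function names (toy_hub_status, toy_status_urb_done) are invented for illustration and are not taken from the hub driver. The only figure from the text is the threshold of 10. A successful completion clears the count, so only an unbroken run of failures trips it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_CONSECUTIVE_ERRORS 10   /* threshold quoted in the text */

/* Hypothetical per-hub bookkeeping, not the real driver's structure. */
struct toy_hub_status {
    unsigned int nerrors;   /* consecutive status-URB failures so far */
};

/*
 * Feed in each status-URB completion (0 = success, nonzero = failure).
 * Returns true once MAX_CONSECUTIVE_ERRORS failures have arrived in a
 * row, i.e. when the caller should consider resetting the hub.
 */
static bool toy_status_urb_done(struct toy_hub_status *hub, int status)
{
    assert(hub != NULL);

    if (status == 0) {
        hub->nerrors = 0;           /* any success breaks the run */
        return false;
    }
    if (++hub->nerrors < MAX_CONSECUTIVE_ERRORS)
        return false;

    hub->nerrors = 0;               /* count afresh after reporting */
    return true;
}
```

What the caller does once this returns true (reset, power-off, notify userspace) is exactly the open question discussed below.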

Errors of the first two sorts are best dealt with by resetting the
hub.  Should the user be given a chance to do something first?  And if
so, what could the user do?  Unplugging the hub seems to be the only
course of action, and that's even more drastic than resetting.

We don't currently have a "let the user do something" mechanism. Which sort of renders the question moot for now ... though I agree it'd be good to report some sort of event that userspace tools could pick up. Do you know how DBUS is handling that stuff now?


The code that resets hubs for error recovery isn't working now;
usb_reset_device() won't accept a hub as an argument.  The issue, of
course, is that when the hub is reset so are all its ports.  Thus it's
necessary to call usb_disconnect() for all the child devices before
resetting the hub.

(In principle we could get around this. If there were an addition to the API for usb_drivers, a notification function for warning about impending resets, then we could simply notify all the children's drivers. But there isn't such an API, and it's not likely to arrive during 2.6.)

It would be easy enough for usb_reset_device() to disconnect all the
children when resetting a hub.  Right now the code works the other way
around: When the hub driver wants to reset a hub, it first disconnects
the children and then calls usb_reset_device() (which will fail).
Since device resets can arise from outside khubd (from usbfs, for
instance), the order of function calls should be switched.
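
The switched order might look roughly like the toy model below. It is a user-space sketch only: struct toy_dev, toy_disconnect() and toy_reset_device() are made-up stand-ins, not the kernel's struct usb_device, usb_disconnect() or usb_reset_device(), and the fixed port count is an assumption. The point is just that the reset entry point itself drops the children, so every caller gets the same behavior.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PORTS 4   /* arbitrary size for the toy model */

/* Toy device-tree node; NOT the kernel's struct usb_device. */
struct toy_dev {
    int is_hub;
    int connected;                        /* 1 while known to the "core" */
    int resets;                           /* number of resets performed */
    struct toy_dev *children[MAX_PORTS];  /* NULL for an empty port */
};

/* Recursively disconnect a device and everything below it. */
static void toy_disconnect(struct toy_dev *dev)
{
    assert(dev != NULL);
    for (size_t i = 0; i < MAX_PORTS; i++) {
        if (dev->children[i] != NULL) {
            toy_disconnect(dev->children[i]);
            dev->children[i] = NULL;
        }
    }
    dev->connected = 0;
}

/* Reset entry point: for a hub, drop the children first, then reset. */
static int toy_reset_device(struct toy_dev *dev)
{
    assert(dev != NULL);
    if (!dev->connected)
        return -1;    /* stand-in for an error code such as -ENODEV */

    if (dev->is_hub) {
        for (size_t i = 0; i < MAX_PORTS; i++) {
            if (dev->children[i] != NULL) {
                toy_disconnect(dev->children[i]);
                dev->children[i] = NULL;
            }
        }
    }
    dev->resets++;    /* the actual port reset would happen here */
    return 0;
}
```

With this shape a reset requested from outside the hub driver goes through the same child teardown as one requested by the hub driver itself.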

I've had the same thought.


(A related matter is the question of what it should mean to reset a
root hub.  It's not entirely clear what an HCD would need to do or
whether the existing HCDs support the necessary functionality.)

I think the functionality is all there, with the reset/start/stop calls, but not at the generic "usb_bus" level. Now that more people are starting to understand the requirements, it's probably time to start looking at merging that "hcd" glue into the top level. Maybe call it "usb_host", maybe still call it "usb_bus".


Then there's always the possibility that the reset will itself fail.
The only ways this can happen are if the hub is suspended or
unplugged, or if we can't determine which port it plugs into in the
parent.  That last possibility is a "This can't happen" sort of thing,
so I will ignore it.  Failing to reset a hub that is unplugged is
understandable, of course.  What about failure to reset a suspended
hub?  Fortunately I think this won't matter; if the hub is suspended
there shouldn't be any need to reset it (but there's always the
possibility of a reset request racing with usb_suspend).

Again, power-off should be an option.


The remaining problem mentioned above is rapid repeated connect change
events on a port, rather like init or inetd respawning a program too
often.  We can use the same approach they do, and stop handling

How often have you seen such problems? I suspect overcurrent notifications should be in the same category, too.


connect change events when a port has too many of them in too short a
time.  For example, if there are 10 connect changes within a 30-second
period, we could ignore that port for the next 10 minutes (and print a
warning in the system log).  This requires a certain amount of
overhead -- memory space for a counter and a jiffies value for each
port on each hub -- is it worthwhile?
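
For a sense of how small that overhead is, here is a user-space sketch using exactly one counter and one timestamp per port. All names are hypothetical, ticks are taken to be milliseconds rather than jiffies, and the window is a simple fixed one that opens at the first change (not a sliding window). The 10th change inside the window is itself refused and starts the 10-minute ignore period.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_CHANGES   10                 /* connect changes tolerated ...  */
#define WINDOW_TICKS  (30UL * 1000)      /* ... per 30 s (1 tick = 1 ms)   */
#define IGNORE_TICKS  (600UL * 1000)     /* then ignore the port for 10 min */

/* Hypothetical per-port state: one counter and one timestamp. */
struct port_throttle {
    unsigned long stamp;   /* window start, or start of the ignore period */
    unsigned int count;    /* changes in window; MAX_CHANGES = "ignoring" */
};

/* Wraparound-safe "a is at or after b", like the kernel's time_after_eq(). */
static bool tick_after_eq(unsigned long a, unsigned long b)
{
    return (long)(a - b) >= 0;
}

/* Returns true if the connect change seen at time `now` should be handled. */
static bool port_connect_change_allowed(struct port_throttle *p,
                                        unsigned long now)
{
    assert(p != NULL);

    if (p->count >= MAX_CHANGES) {       /* port is currently ignored */
        if (!tick_after_eq(now, p->stamp + IGNORE_TICKS))
            return false;
        p->count = 0;                    /* penalty served, start afresh */
    }
    if (p->count == 0 || tick_after_eq(now, p->stamp + WINDOW_TICKS)) {
        p->stamp = now;                  /* open a new counting window */
        p->count = 0;
    }
    if (++p->count >= MAX_CHANGES) {
        p->stamp = now;                  /* ignore period starts now */
        return false;                    /* caller would log a warning */
    }
    return true;
}
```

A port that changes state every 4 seconds never trips this, since at most 8 changes fit in one window; a port that flaps ten times in a second is shut out until the ignore period has elapsed.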

There aren't that many hubs in a system, and this should be hub-private data (but then, so should udev->children).

Another option is to just power such troublesome ports off.  There
should be a mechanism (not unlike "blinkenlights" deferred work)
for the hub driver to wake up and perform housekeeping, such as
checking whether ports marked "trouble type 3" are behaving yet.

- Dave


Alan Stern





