On Mon, 2 Jun 2003, Oliver Neukum wrote:

> OK, yes I see. You're right. Nevertheless I'd like a version that does
> a full reset with enumeration for error recovery.

Do you mean recovery from errors encountered by the core while trying to 
re-establish the configuration and altsettings that existed before the 
reset, or do you mean for use by drivers recovering from device-specific 
errors?  Maybe what I have in mind will be good enough.

Here's my current view.  usb_device_reset() does a port reset and then 
tries to re-establish the former config and alsettings.  If there are no 
errors it does not re-read the descriptors and it leaves all the drivers 
bound just as they were before.

usb_device_reconnect() -- a combination of device_reset() and what I 
formerly called usb_device_altered() -- also does a port reset, but then 
it unbinds the current drivers, re-reads the descriptors, and then behaves 
like usb_connect(): selects a configuration somehow and probes the 
interfaces.

For both of these:  If the port reset fails, an error is returned and
nothing else happens (or maybe we start error recovery on the hub?).  If
the reset succeeds but the device is gone, a partial-success code is
returned and khubd will handle the disconnect in its normal fashion some
time later.  If the device is still there but we can't set its address,
read the descriptors, or set the configuration, then we disable the port
and go through the recursive procedure we have already discussed below.

usb_device_reset() can be entirely synchronous.  The reset portion of
usb_device_reconnect() can be synchronous also, but the unbinding of the
current drivers and so on must be asynchronous -- it has to be done after
the original call to reconnect() has returned.  Actually, I'm not sure
exactly where the dividing line between the synchronous and async parts
should be drawn; maybe it should be even earlier.  And we might also want
to unbind the current drivers _before_ doing the reset, so that no
unwanted URBs get submitted to a partially reinitialized device.  (Of 
course, doing that would eliminate the possibility of keeping drivers 
bound to interfaces that weren't altered.)

If usb_device_reconnect() is called during the process of probing the
interfaces, the remainder of the probing is skipped, the asynchronous part
of device_reconnect() is carried out, and the probing restarts from the
beginning.  That's a simple optimization; there's no point probing drivers
for interfaces if they're just going to be unbound again in a little
while.

Sometime in the future, when we can change the USB API, we will add a
"reset" callback to struct usb_driver.  This callback will be invoked by
both device_reset() and device_reconnect() before the reset is done, and
again by device_reset() after the reset, by analogy with the
suspend-resume paradigm.  (device_reconnect() doesn't need to do the
resume part because the drivers will have been unbound.)

This leaves open the question of what device_reset() should do if it
encounters an error while trying to re-establish the original
configuration and altsettings.  I'm not sure what's best here.  Maybe it
should behave like device_reconnect().

> > Okay, so what about errors like a failure to accept firmware?  As you say,
> > the device is screwed.  I don't know what you mean by a "full reset" --
> > usb_device_reset() really does reset the device and usb_device_altered()
> > will re-read the descriptors and do reprobing.  If reading the descriptors
> > fails, altered() will disable the device, just as in the case of a
> > newly-connected device.  What else remains to be done?
> 
> Reset again. If a device has a failure somewhere during firmware load,
> it's probably really screwed.

This should be part of the error recovery code for the address-assignment
and descriptor-reading routines.  The recovery procedure could involve
retries of up to, say, 3 port resets.

> > No.  We need a usbcore-specific lock.  It's not enough to rely on the
> > driver model's locking.  Doing that would leave open the possibility, for
> 
> That's not a question of reliance. The generic lock is there. We need
> to deal with it. Furthermore we can always live with coarser locking.

I think if we use a device-specific lock (or even a bus-specific one) then 
we don't have to consider the sysfs locking at all.  The sysfs lock for an 
interface will only be held during the period that we own the 
device-specific lock, and during that period we will only probe or 
disconnect one interface at a time and in a single thread.

> Yes, but it need not be per device. Per bus would do as well.
> Per interface is wrong.

Agreed.  Maybe per bus would be simpler, since part of this processing 
always involves going through the address-0 state anyway.

> > > Sorry, I was unclear. Shouldn't robust error handling look like:
> > > disable port -> reset hub -- retry disable -> powerdown port ->
> > > reset hub -- retry power down port -> power down hub -> recursion
> >
> > Would that really work?  That is, if you reset the hub, won't that disable
> > _all_ its ports?  We don't want to do that if powering down a single port
> > is possible.
> 
> Powering down means that the port will no longer work. From disable we can
> recover.

Not quite true; a powered-down port will recover when/if the hub is reset.  
So the question is this:  If disabling a port fails, should we reset the
entire hub (and thereby reset all the other devices attached to it), or
should we power down just that one port (which will make it useless until
the entire hub is reset)?

Bear in mind that failure to disable or reset a port means the hub isn't
working right anyway; maybe it has crashed or maybe something is
electrically wrong with the port's hardware.  One way or another, either
the port will never work or we will have to reset the entire hub at some
point.  So I don't see that we lose anything by trying to power down the
port.

> > Let's be more precise.  URBs aren't submitted to interfaces, they're
> > submitted to endpoints.  Submitting an URB to a nonexistent endpoint isn't
> > very bad, it merely causes an error code to be sent to the URB's
> 
> Nonexistent is survivable. Endpoints going away while the URB is processed
> are another matter.

I see what you mean.  We want to avoid having any URBs in flight while a
configuration change occurs.  The code that David has written for
disabling endpoints might do the job, with no need for any additional
locking.

> > Anyway, how will allowing the drivers to lock out further probing prevent
> > these bad URBs from being sent?
> 
> Lock against probe -> Kick off other drivers -> reset/config change

Locking against probing is already taken care of by holding the device- or
bus-specific lock.  Kicking off the drivers before doing a config change
is fine, but not for a simple reset.  On the other hand, during a simple
reset endpoints don't disappear.

What do we do about URBs sent to endpoint 0 by a driver during the time 
that device_reset() is re-establishing the original altsettings?  We can't 
disable endpoint 0.

> > immediately beforehand.  Doing things this way would avoid having any
> > period in which the core is out-of-sync with the device.
> 
> Firmware download.

Hmm... I don't know exactly how any individual device implements firmware 
downloads.  It seems to me that the new firmware can't start executing 
right away; it must need _something_ like a reset to get going.  But if 
that's not true, if the new firmware takes effect immediately, then 
there's no way at all to keep the core fully in sync with the device 
during the brief period before we read the new descriptors.  After all, we 
can't read them _before_ the new firmware is running!

Using the combined usb_device_reconnect() function ought to solve most of
these problems.  The only other problem I can think of is that a driver
doing a firmware download might want to insure that no other drivers will
talk to the device while the download is in progress (but it might want to 
allow them to remain bound).  I'm not sure what the best way is to do 
this.

> > > > 2. I don't see how we can have a version without locking.  Setting a
> > > > new configuration necessarily involves unbinding the existing
> > > > interfaces and binding the new ones, and that requires holding the
> > > > device lock.
> > >
> > > Which is held during probe, where these cases arise.
> >
> > I don't like that idea.  It might lead to unbounded stack usage.  You're
> > saying that a driver's probe() function could tell the core to set a
> > configuration and restart probing from the beginning, without waiting for
> > the original probe() to return.  It goes against my grain.
> 
> No, by returning an error. There's no recursion. The loop is justed started
> anew.

Does my description of usb_device_reconnect() above agree with what you 
want?  usb_set_configuration() should be made to work the same way, just 
without doing the reset at the beginning or re-reading the descriptors.

> Probe() should have three possible returns.
> 1. can deal with the device
> 2. cannot deal with the device
> 3. can deal in a new form with firmware now in place

I don't like 3 -- it ignores the USB model of devices, configurations, and
interfaces.  A driver is probed for an interface, and interfaces exist
only within a single configuration -- different configurations have
disjoint sets of interfaces.  So while a driver might be able to bind to
one of the interfaces under each of two different configurations, since
these are two different interfaces there would have to be two different
corresponding calls to the driver's probe().

Okay, suppose a driver downloads new firmware during probe().  If this 
firmware doesn't change the configuration descriptor, then following a 
reset() the same configuration and interfaces will still exist, so probe() 
can return success -- it can deal with this interface.  If the firmware 
does change the configuration descriptors, or if usb_set_configuration() 
loads a new config, then the interface for which probe() was called is 
gone.  It doesn't matter what probe() returns, because there is nothing 
for it to bind to.  What will happen is the core will probe the interfaces 
for the new configuration; at that time probe() can return success and 
the driver will be bound correctly.

If it wants, a driver can invoke the second scenario above by calling 
usb_device_reconnect(), even if the firmware download did not change the 
configuration descriptor.  No harm will occur, and this could be useful if 
the driver isn't sure how big a change the new firmware will cause.

Alan Stern



-------------------------------------------------------
This SF.net email is sponsored by: eBay
Get office equipment for less on eBay!
http://adfarm.mediaplex.com/ad/ck/711-11697-6916-5
_______________________________________________
[EMAIL PROTECTED]
To unsubscribe, use the last form field at:
https://lists.sourceforge.net/lists/listinfo/linux-usb-devel

Reply via email to