On Sat, 31 May 2003, Oliver Neukum wrote:

> Am Freitag, 30. Mai 2003 16:35 schrieb Alan Stern:
> >
> > As discussed earlier, we need a way for a driver to inform the core that a
> > device's descriptors have changed (following a firmware download, for
> > example).  Let's call this new function usb_device_altered().  It will
> 
> We need two versions. One that will simply rebind the device and another
> that handles errors by resetting, etc... .

I don't follow your reasoning here.  As far as I can see, the only errors
to be handled are problems reading the descriptors.  Are you suggesting
the one version should skip reading the descriptors and just do the
rebinding?  The only reason for this would be if we had read the
descriptors previously.  When would that have happened?

> > queue a request to be handled in another thread (necessary to prevent
> > deadlock).  When the request is carried out, it will first do the
> 
> Why? The driver could very well be expected to release its locks before
> calling this function.

The problem isn't the driver's locks; it's the core's lock.  Here's the 
idea.

First, a driver is very unlikely to call usb_device_altered() anywhere but
in its probe() routine.  After all, this call will unbind the driver from
the device, so why bother to bind in the first place?  Or to put it
another way, if a firmware download is needed for the device to work
properly, why wouldn't that download be done first thing, during probe()?

Next, we have already agreed that most of the steps involved in
connect/bind/unbind/disconnect processing require a per-device lock (some
require a per-bus lock too).  So the core already owns the device lock
when it invokes the driver's probe() routine.  When the driver calls
usb_device_altered(), that function will (among other things) have to
carry out a new round of probing, which means it will have to acquire the
device lock.  But since the lock is already held in a higher stack frame,
we will deadlock.

Now it's not strictly necessary for the request to be handled in another 
thread, as I wrote previously.  It would suffice to queue a request that 
could be handled later by the calling thread.  My point is that the 
request _must_ be handled _asynchronously_ with respect to the caller, 
because it will try to acquire a lock that the caller probably already 
owns.

> > equivalent of setting config 0: remove the existing interfaces and unbind
> > the drivers.  I'm undecided about whether it should actually send this
> 
> Why all interfaces?

Are you suggesting that we compare the new configuration descriptor with
the old one and only rebind interfaces that have changed?  I suppose
that's doable, although it would be a lot more work.  And I don't think
the benefit would be very large.  For a first pass, I wouldn't bother.

> > Some of the failures of usb_device_reset() are straightforward.  If we
> > can't tell the port to reset the hub, then we just return an error code.
> > If the reset succeeds but there is no device connected afterwards, we can
> > just return an error code because khubd will learn about the disconnection
> > all by itself in due course.  If the device is still connected but we
> > can't set its address, we can disable the port and return an error code.
> > (If disabling the port fails we must power-down the port, and if that
> > fails we must reset the hub.  This could get tricky... but it's the same
> > processing that must happen for every new connection.)
> 
> Why reset after powering down? We should strive to keep ports alive.

No, no.  Reset the hub if powering down the port _fails_.

Certainly we should strive to keep ports alive.  But if the device 
attached to a port is in the DEFAULT state (address 0) and we are unable 
to assign it a new address or to disable the port, what choice is there 
but to power down the port?  The only alternative is to disable the entire 
hub -- but why go to that extreme if powering down the port will suffice?

> Secondly, you are an optimist ;-)

I won't deny that...

> A device may show errors because a hub higher in the tree is heading
> for the scrapyard. We need to be prepared to go up the tree in error
> handling in emergency.

Definitely.  The process I described above must be recursive.  If it turns 
out that we need to reset the hub, but the reset fails, then we must 
power down the port upstream of the hub.  And if that fails, we must reset 
the upstream hub, and so on...

> > If we can set the address back to what it was but can't select the
> > original configuration or the original interface settings, things get more
> > interesting.  My feeling is that we should return an error code and
> > proceed with the possibly-incorrect kernel data structures still in place.
> > The device driver, if it is working properly, will shortly call
> > usb_device_altered() and thereby fix everything up.
> 
> Yes, but locking is screwed. While the device is not in sync, probing
> must be locked against. We need to export the locking to the drivers.

I agree partially.  Probing while the device is not in sync wouldn't be a
disaster, but it would be nice to avoid it.  Perhaps the
usb_device_altered() routine could set some flag which would tell the core
to abort further probing until after the new descriptors have been read.  
This would not require exporting the locking to the drivers.

> > On a different tack...  The operation of setting a new configuration will
> > need to be handled by a separate thread (separate from the calling device
> > driver, I mean), just like usb_device_altered().  Fitting a proper API for
> 
> Again, why?
> 1. A driver for well behaved devices has no business setting configurations
> during the normal course of operations.
> 2. For crappy devices which need it, we can provide a version without locking.

1. Agreed.
2. I don't see how we can have a version without locking.  Setting a new 
configuration necessarily involves unbinding the existing interfaces and 
binding the new ones, and that requires holding the device lock.

> > this may require some effort; I notice that several drivers already call
> > usb_set_configuration() synchronously.  Some set the config to its
> > already-existing value, others blindly set it to 1.  This will have to be
> > examined closely and in some cases changed.  Trying to set a new value
> > synchronously will almost certainly result in deadlock.  Even trying to
> > reset the current value, which normally should be innocuous, might cause
> > deadlock if the call failed.
> 
> Generally these calls should be eliminated.

Yes.  There may be a reason for keeping some of them.  If a device really
needs to be in a configuration other than the default, it might be easier
to set the configuration in the driver than to write a user-space hotplug
helper to do the job.  But the existing code would probably still need to
be rewritten.  The correct way for a driver to handle this in its probe()  
routine is:  See if the proper configuration is already loaded.  If yes
then proceed normally; if no then call usb_set_configuration() and return
a failure code (i.e., don't bind).

> > Note that there operations require a separate thread or work-queue
> > equivalent, because they have to take place in process context.  I can
> > think of several other possible arrangements:
> >
> >     (A) One work-queue entry per request, created on demand.  This
> > would involve creating lots of little kernel processes and would make it
> > difficult to handle requests in the correct order.
> >
> >     (B) One persistent kernel thread per bus.  This could result in a
> > large number of largely dormant threads.  On the plus side, this is a
> > natural level for handling address-0 serialization.
> >
> >     (C) One single kernel thread to do everything -- i.e., khubd.
> > That's what we've got now.
> >
> > I don't know what's best; maybe something not listed here.  Also, bear in
> > mind that the work being discussed doesn't crop up too often.
> > Parallelization is not needed for performance reasons, only for
> > reliability.
> 
> I would prefer B but I see no clean way to to do this. This laptop has
> 3 UHCI and 1 EHCI for two ports. What's a bus here? The idea of
> controllers associated with two busses is horrible.

A bus is a root hub together with an address space.  One controller with 
two busses isn't a problem; it just means that the controller has two root 
hubs that control two different address spaces.  The fact that the busses 
share a controller doesn't really affect anything else going on here.

> And now for the additional things.
> 
> 1. Reset currently puts the burden of disconnecting the other interfaces
> on the driver. This is wrong. Usbcore should do it with proper locking.

This contradicts 2 below, and I go with 2: resets shouldn't unbind any 
interfaces.  But it's a tough problem.  For now, the safest choice is to 
do what usb-storage does: don't reset unless there's only one interface.

Ultimately, I think the best solution will be to have an additional
callback, like probe() and disconnect(), that the core could use to inform
the driver that the device has been reset.  Or maybe call once before the
reset (to warn that it's coming) and once after (to say that it's
finished).  The driver could then perform any necessary re-initialization.

> 2. Reset should not mean disconnecting the other interfaces. The properties
> internal to the drivers of the other interfaces should be retained and restored.
> Suspend/resume is the correct model, not disconnect/probe. Kicking off all drivers
> but restoring the altsettings like we do is flat out wrong.

Do we currently kick off all the drivers?  I thought we didn't.  Or more
precisely, the core doesn't -- it just recommends that when a driver does
a reset, that driver should kick off all the other drivers.  That's
definitely wrong.

On the other side, there's no way the core can retain and restore the 
properties internal to the drivers of the other interfaces.  By 
definition, if a property is internal to the driver then the core doesn't 
know about it.

> 3. Usbfs is a security problem and needs filtering of control requests. You can use
> it to set a device to an occupied address thereby crashing any device.

Certainly some of the control requests should require the appropriate 
capability.  Just something else that needs to be taken care of.

Alan Stern




-------------------------------------------------------
This SF.net email is sponsored by: eBay
Get office equipment for less on eBay!
http://adfarm.mediaplex.com/ad/ck/711-11697-6916-5
_______________________________________________
[EMAIL PROTECTED]
To unsubscribe, use the last form field at:
https://lists.sourceforge.net/lists/listinfo/linux-usb-devel

Reply via email to