I think the required changes to the core, though maybe a bit messy to 
program, will end up being straightforward.  Here are my thoughts on the 
serialization problems.

There are two main issues.  (1) We don't want connect processing
(registering interfaces in the driver model and probing device drivers,
maybe other things too) and/or disconnect processing (detaching drivers
and deregistering interfaces, etc.) to occur simultaneously in different
threads.  (2) Because drivers normally implement mutual exclusion locking,
we don't want to call a driver's disconnect() from within a synchronous
call issued by that driver.

Up to now (1) hasn't been a problem.  All connect/disconnect processing 
has been handled by khubd, which automatically makes it serial.  But now 
we face the possibility that these events may occur as a result of device 
resets and config changes, which can occur in other threads.

Let's take these one at a time.  Start with device resets.  A reset can
fail because a device has been disconnected.  In this case, there's no
need to do anything at all -- khubd will learn about the disconnect in due
course from its normal status reports and take care of everything.  Then
there's the possibility that the reset succeeds but the device itself has
changed.  Oliver writes:

> Well, no. Doing a functional reset, as opposed to a purely physical,
> which usb_reset_device() does, can mean several steps, like, as
> you mentioned, downloading a firmware. The core has not enough
> information to tell whether a driver can no longer be bound to its
> interface. It cannot tell an error from a legitimate intermediate condition.
> 
> IMHO precisely the opposite is needed. During a reset the core must
> leave a device entirely alone until a driver tells the core that it's finished
> successfully or that there's an irrecoverable error.
> The core needs to handle errors only as far as there must not be
> a device with adr 0 on the bus and that unplugging during reset
> isn't lethal, which can be done via timeouts on the control messages.
> What to do on an irrecoverable error is another question.

I agree.  Furthermore, firmware downloads occur so infrequently that
there's no point in the kernel checking for that sort of thing.  What we
need is a core routine that a device driver can call when the download is
finished, telling the core that the device needs to undergo a logical
disconnect/connect sequence.  (In principle this could be as simple as
telling the hub to set the port's C_PORT_CONNECTION status bit.  
Unfortunately, the USB spec prohibits this approach!)  Should an
irrecoverable error occur, the driver merely needs to unbind itself (or
fail to bind in the first place).

The actual processing would need to done in a different thread.  It could
be khubd or a keventd callback in the core.  It doesn't really matter, but
note that the callback would end up executing the same code as khubd in
any case.

(I don't really understand Oliver's comment about unplugging during reset
not being lethal.  Surely if the user physically unplugs the device,
whether it's undergoing a reset or not we want to delete the device
structure.)

Now consider configuration changes.  They shouldn't be a big problem
either.  They should simply repeat the same processing that
usb_disconnect() does now to remove drivers and interfaces, issue the
set-configuration request, and then probe the new interfaces.  In fact,
there's no reason why the last two steps here shouldn't be carried out by
the same code that handles new device connections, and no reason why the 
first step shouldn't be carried out by the same code that handles device 
disconnects.

The only new requirement is that this shouldn't overlap with 
connect/disconnect handling in other threads.  That's no problem; we can 
simply protect all such processing with a global semaphore.  This will 
require minimal overhead, since normally everthing is done within khubd 
anyway -- there won't be contention for the semaphore unless someone 
explicitly asks for a config change while khubd is busy.  Maybe this is 
what Oliver and David were talking about -- at cross purposes -- when they 
wrote:

> > We'd be better off with using a big semaphore for each bus.
> 
> Why would more than "address zero" state management need to be
> serialized across the whole bus?  As soon as the device gets
> into the ADDRESS state, there's no need to even lock out other
> ports in that hub before proceeding with configuration.


Now let's move on to (2): don't let drivers make synchronous calls that 
will invoke their disconnect() routines.  For device resets this isn't a 
problem; the disconnect will be issued asynchronously by khubd.  It's not 
explicitly a problem for firmware change notifications or configuration 
changes either, provided such things are only done from within the 
driver's probe() routine.  Since the probe hasn't finished, the driver 
hasn't yet been bound to the device, so the disconnect() routine won't be 
called.

But it _is_ a problem with regard to the global semaphore I mentioned
above.  During probe() the semaphore will already be locked; so the
disconnect processing would be unable to proceed.  That's why I said
earlier that firmware/config change handling would have to be done in
a different thread.

Although as a general rule, device drivers should not request config
changes -- choosing the proper configuration ought to be done by a
userspace hotplug helper -- there are times when it might be, let's say,
less inappropriate than others.  For both firmware updates and
configuration changes, a driver should determine in its probe() routine
whether a change is needed.  After doing the necessary preparation, it
should call the corresponding core routine and then return an error from
probe().  The actual work would be carried out by another thread, whether
khubd, keventd, or a work queue.  The question is, should the core provide
the service of sending the request to the other thread, or should the
driver be responsible for doing this?  The penalty for doing things
wrongly would be obvious: the driver would deadlock and end up blocking
all connect/disconnect processing!

With regard to using khubd to do a lot of the dirty work here, David 
points out:

> And <linux/workqueue.h> can be used to make keventd do things
> in parallel, so that misbehavior through one port can't break
> the others.  Dedicated thread not necessary.

That's true.  But on the other hand, if the actions being requested are
things that khubd is doing anyway, it hardly seems necessary to go to a
different thread.  Also, khubd is _already_ a big single point of failure.  
If it goes down, much of the USB subsystem's functionality will vanish.  
Losing the ability to reinitialize following a firmware download is pretty
small compared to losing the ability to recognize when devices are plugged
in or unplugged, which is a risk we face right now.

David also wrote:

> While things like marking devices as NOTATTACHED right
> when the hub status interrupt data reports it "gone" would
> cut down on false error reports ... vs waiting not only for
> khubd to be scheduled, but also to get around to handling
> that end its queue ... ;)

This is a bit confusing...  Hub status interrupt data only indicates that
a port has undergone a status change.  It doesn't say what the cause of
the change was; for that khubd has to make a port status request.  So
there doesn't seem to be any way to carry out this suggestion.

Alan Stern





-------------------------------------------------------
This SF.net email is sponsored by: ObjectStore.
If flattening out C++ or Java code to make your application fit in a
relational database is painful, don't do it! Check out ObjectStore.
Now part of Progress Software. http://www.objectstore.net/sourceforge
_______________________________________________
[EMAIL PROTECTED]
To unsubscribe, use the last form field at:
https://lists.sourceforge.net/lists/listinfo/linux-usb-devel

Reply via email to