On Tue, 11 May 2004, Oliver Neukum wrote:

> Also, two more points.
> 1. Contention is rare, hence to conserve system resources a single lock
>    _enhances_ performance

Yes.  Tree traversals are pretty rare also; I think we are okay with a
regular semaphore rather than an rwsem.  In fact, the only traversals
outside the hub driver seem to be in devio.c (/proc/bus/usb/devices) and
usb.c (usb_find_device(), used only by security/root_plug.c).

> 2. Some of these operations also take addr0_semaphore which has to be
>     at least on the bus level

The usb_address0_sem right now is local to a single routine in hub.c, 
which means it applies to all the USB buses.  That's probably okay; except 
during boot we're not likely to see devices being added to more than one 
bus at a time.  We do still need to add code to prevent new connections 
when a broken hub fails to disable a port with a broken device stuck in 
address 0 -- that shouldn't be hard to handle.


On Tue, 11 May 2004, David Brownell wrote:

> That was another good analysis, and I don't think I much
> want to disagree with any of it.  Though I'll note that
> the timescales differ ... in particular the enumeration
> part of (6) is commonly at least 200 msec, sometimes more.
> while most everything else is 1 msec or so (except for
> suspend/resume/reset, at tens of msec) assuming that the
> driver callbacks don't delay things.

Yes.  Sometimes those callbacks can take a while.

> Also, (4) suspend/resume and (5) reset are mutually exclusive
> types of signaling.  For consistency, you should group them
> together as you did for (7).

This is of course a very minor point.  I separated (4) and (5) because
they occur in different contexts: Suspend/resume originates in sysfs or PM
and involves a driver callback, whereas reset originates in a device
driver and does not involve a callback.  I kept them together in (7)
because from the hub's point of view, all of these things are simply
requests from outside khubd to send a message to a port.  (Although reset 
obviously requires additional work.)


> Alan Stern wrote:
> 
> > This means that (6)-(7) on a port can be treated as though they belong to 
> > the child device.  In other words, for any device we can imagine the 
> > possible actions are (1)-(5) plus (6) & (7) on the upstream port.  When 
> > viewed this way, it turns out for any device all the actions must be 
> > mutually exclusive.
> 
> True, though during most of (6) -- debounce! -- there is no
> struct usb_device.

Yes indeed.  This means that (6) doesn't need to be exclusive with
anything happening on the (non-existent) device!


> > There are some other considerations too.  Like, while suspending the
> > entire subtree below a hub you don't want some other process to start
> > resuming the devices you've just suspended.  To prevent this sort of thing 
> > both usb_suspend_device() and usb_resume_device() should use the locktree 
> > algorithm.  I don't think it's needed for reset or disconnect, though.
> 
> Disconnect, sure -- assuming we mark trees dead ASAP, so that
> further operations on those devices fail ASAP.

That's the only aspect of this analysis I'm not entirely pleased with.  
Let's add an item (8) for hubs:

        (8): Reading port status messages and marking the child device
             as USB_STATE_NOTATTACHED when a connect change occurs

I didn't include this before because it's not something imposed from the
outside; it's simply the normal action of a driver managing its device.  
But clearly it is important.

On a hub, (8) must be mutually exclusive with (6) & (7) for the same port.  
Clearly it must also be exclusive with (1)-(5) on that hub.  However, it
does _not_ have to be exclusive with (1)-(3) on the child device -- nor
with parts of (4) and (5) -- and for optimal operation it should be
allowed to overlap with them.

Unfortunately, the scheme relies on children[n]->serialize to provide
mutual exclusion for (6) and (7), so (8) has to acquire that same
semaphore.  This means that (8) can't take place simultaneously with
(1)-(5) on the child, since they will also hold the semaphore.

One way to address this would be to adopt my earlier idea for requiring
threads to explicitly acquire access to a hub's ports.  Maybe you can come
up with something simpler.  Contention for a port isn't very high, since
there is only khubd trying to perform (8) and at most one other thread
which has acquired children[n]->serialize and is trying to perform (7).  
On the other hand, maybe it's not worthwhile worrying about this, and we
should simply accept a delay in sometimes marking devices NOTATTACHED
later rather than sooner.


> I'm not sure I see why it wouldn't apply for reset() though;
> your group (7) was "port suspend/resume/reset", so consistency
> would argue they should all apply the same locking policy.

It's because of the way these things affect their subtrees.  Suspend and 
resume on a hub have certain effects on the devices below the hub.  But 
reset on a hub will disconnect everything beneath it -- at least, that's 
the way it works now.  So we don't have to worry about what happens to 
the devices in the subtree, since they're about to go away.

> There's also khubd to consider.  Shouldn't it also use the
> same policy?

What do you mean?  When and why should khubd use locktree?


> > Now maybe I've overlooked something here or oversimplified in some way.  
> > But altogether I think this makes a good case that using a single lock is
> > the way to go.
> 
> That is, one per device, not changing away from dev->serialize?

Yes, that's what I mean.

> And hub->khubd_sem vanishing?

Again, yes.  The only reason khubd_sem exists is so that hub_disconnect() 
won't return until khubd is no longer using the hub.  hdev->serialize can 
serve that purpose just as well.

>  I'd be fine with that.  We may need
> to address some corner cases, but the analysis I snipped suggests
> there shouldn't be any.

Good.  So apart from the questions above, it looks like we've found a
solution.

Alan Stern





-------------------------------------------------------
This SF.Net email is sponsored by Sleepycat Software
Learn developer strategies Cisco, Motorola, Ericsson & Lucent use to 
deliver higher performing products faster, at low TCO.
http://www.sleepycat.com/telcomwpreg.php?From=osdnemail3
_______________________________________________
[EMAIL PROTECTED]
To unsubscribe, use the last form field at:
https://lists.sourceforge.net/lists/listinfo/linux-usb-devel

Reply via email to