On Tue, 11 May 2004, Oliver Neukum wrote:
> Also, two more points.
> 1. Contention is rare, hence to conserve system resources a single lock
> _enhances_ performance
Yes. Tree traversals are pretty rare also; I think we are okay with a
regular semaphore rather than an rwsem. In fact, the only traversals
outside the hub driver seem to be in devio.c (/proc/bus/usb/devices) and
usb.c (usb_find_device(), used only by security/root_plug.c).
> 2. Some of these operations also take addr0_semaphore which has to be
> at least on the bus level
The usb_address0_sem right now is local to a single routine in hub.c,
which means it applies to all the USB buses. That's probably okay; except
during boot we're not likely to see devices being added to more than one
bus at a time. We do still need to add code to prevent new connections
when a broken hub fails to disable a port with a broken device stuck in
address 0 -- that shouldn't be hard to handle.
On Tue, 11 May 2004, David Brownell wrote:
> That was another good analysis, and I don't think I much
> want to disagree with any of it. Though I'll note that
> the timescales differ ... in particular the enumeration
> part of (6) is commonly at least 200 msec, sometimes more.
> while most everything else is 1 msec or so (except for
> suspend/resume/reset, at tens of msec) assuming that the
> driver callbacks don't delay things.
Yes. Sometimes those callbacks can take a while.
> Also, (4) suspend/resume and (5) reset are mutually exclusive
> types of signaling. For consistency, you should group them
> together as you did for (7).
This is of course a very minor point. I separated (4) and (5) because
they occur in different contexts: Suspend/resume originates in sysfs or PM
and involves a driver callback, whereas reset originates in a device
driver and does not involve a callback. I kept them together in (7)
because from the hub's point of view, all of these things are simply
requests from outside khubd to send a message to a port. (Although reset
obviously requires additional work.)
> Alan Stern wrote:
>
> > This means that (6)-(7) on a port can be treated as though they belong to
> > the child device. In other words, for any device we can imagine the
> > possible actions are (1)-(5) plus (6) & (7) on the upstream port. When
> > viewed this way, it turns out for any device all the actions must be
> > mutually exclusive.
>
> True, though during most of (6) -- debounce! -- there is no
> struct usb_device.
Yes indeed. This means that (6) doesn't need to be exclusive with
anything happening on the (non-existent) device!
> > There are some other considerations too. Like, while suspending the
> > entire subtree below a hub you don't want some other process to start
> > resuming the devices you've just suspended. To prevent this sort of thing
> > both usb_suspend_device() and usb_resume_device() should use the locktree
> > algorithm. I don't think it's needed for reset or disconnect, though.
>
> Disconnect, sure -- assuming we mark trees dead ASAP, so that
> further operations on those devices fail ASAP.
That's the only aspect of this analysis I'm not entirely pleased with.
Let's add an item (8) for hubs:
(8): Reading port status messages and marking the child device
as USB_STATE_NOTATTACHED when a connect change occurs
I didn't include this before because it's not something imposed from the
outside; it's simply the normal action of a driver managing its device.
But clearly it is important.
On a hub, (8) must be mutually exclusive with (6) & (7) for the same port.
Clearly it must also be exclusive with (1)-(5) on that hub. However, it
does _not_ have to be exclusive with (1)-(3) on the child device -- nor
with parts of (4) and (5) -- and for optimal operation it should be
allowed to overlap with them.
Unfortunately, the scheme relies on children[n]->serialize to provide
mutual exclusion for (6) and (7), so (8) has to acquire that same
semaphore. This means that (8) can't take place simultaneously with
(1)-(5) on the child, since they will also hold the semaphore.
One way to address this would be to adopt my earlier idea for requiring
threads to explicitly acquire access to a hub's ports. Maybe you can come
up with something simpler. Contention for a port isn't very high, since
there is only khubd trying to perform (8) and at most one other thread
which has acquired children[n]->serialize and is trying to perform (7).
On the other hand, maybe it's not worthwhile worrying about this, and we
should simply accept a delay in sometimes marking devices NOTATTACHED
later rather than sooner.
> I'm not sure I see why it wouldn't apply for reset() though;
> your group (7) was "port suspend/resume/reset", so consistency
> would argue they should all apply the same locking policy.
It's because of the way these things affect their subtrees. Suspend and
resume on a hub have certain effects on the devices below the hub. But
reset on a hub will disconnect everything beneath it -- at least, that's
the way it works now. So we don't have to worry about what happens to
the devices in the subtree, since they're about to go away.
> There's also khubd to consider. Shouldn't it also use the
> same policy?
What do you mean? When and why should khubd use locktree?
> > Now maybe I've overlooked something here or oversimplified in some way.
> > But altogether I think this makes a good case that using a single lock is
> > the way to go.
>
> That is, one per device, not changing away from dev->serialize?
Yes, that's what I mean.
> And hub->khubd_sem vanishing?
Again, yes. The only reason khubd_sem exists is so that hub_disconnect()
won't return until khubd is no longer using the hub. hdev->serialize can
serve that purpose just as well.
> I'd be fine with that. We may need
> to address some corner cases, but the analysis I snipped suggests
> there shouldn't be any.
Good. So apart from the questions above, it looks like we've found a
solution.
Alan Stern
-------------------------------------------------------
This SF.Net email is sponsored by Sleepycat Software
Learn developer strategies Cisco, Motorola, Ericsson & Lucent use to
deliver higher performing products faster, at low TCO.
http://www.sleepycat.com/telcomwpreg.php?From=osdnemail3
_______________________________________________
[EMAIL PROTECTED]
To unsubscribe, use the last form field at:
https://lists.sourceforge.net/lists/listinfo/linux-usb-devel