On Mon, 2 Aug 2004, David Brownell wrote:

> The reset code changes aren't critical, at least not yet.  I expect
> locking arguments to continue for a while yet, regardless.

Probably.  :-)

> If you don't think that issuing non-data signaling on a port
> (affecting everything downstream) shouldn't be a reason
> to lock that whole tree, how would you propose keeping
> tasks using that sort of signaling from interfering with each
> other?  Or keeping other tasks from interfering, even if
> they don't use that sort of signaling?

These are good questions.

Non-data signalling on a port _is_ a good reason to lock the whole subtree
below that port, I agree.  What I don't agree is that locktree()  
provides the correct sort of locking for general non-data signalling.  
The way I see it, locktree() is appropriate for suspend and resume, but
not for khubd or device reset.

Furthermore, even with suspend and resume, locktree() alone isn't enough.  
For example, your suspend routine uses locktree() to lock the topmost hub
being suspended, but then it _still_ has to use ->serialize locking on
each of the devices below that hub.

> The kind of answer I want is simple, like "everyone just
> use locktree() and it works".  Special cases are negatives.

I hope my answer was simple enough for you.  More details are below.

> Clearly there's no "need" to have a simple rule like that;
> an armload of special cases (or even just a handful)
> could work too.  Likewise no need to have a cheap way
> to lock multiple devices; one _could_ just as well grab
> all the device locks for a whole subtree before starting
> some complex operation.

You may not have noticed it, but that's not so different from what
suspend() actually does.  usb_disconnect() too.  In fact, you could avoid
using locktree() with suspend() if you locked the entire subtree at once.  
Make suspend() a three-pass operation; the first pass recursively locks
everything below the topmost hub being suspended, the second pass
(bottom-up) actually suspends the devices, and the third pass (top-down)
releases the locks.  I'm not claiming this is optimal...


> The "unnecessary" delays thing is at most a performance tweak,
> as I've pointed out.  I'd say the others are necessary to provide
> a simple predictable event sequencing rule (top down), so the
> locking rule can be described with fewer exceptions.

I agree that performance isn't an issue at this time.  As for exceptions 
to the locking rule -- I think suspend() and resume() _are_ exceptions and 
should be treated that way.

Consider the other cases where I've said that locktree isn't needed:  
hub_events and usb_reset_device.  When a significant port event (like a
connect-change) occurs, khubd will disconnect everything below the port.  
When a hub is reset, everything below that hub will be disconnected.  In 
short, these other cases all involve disconnecting an entire subtree.  
Suspend and resume are exceptions in that they involve a subtree which 
survives the operation!  (Or if you want to put it the other way 'round, 
hub_events and usb_reset_device are the exceptions.  There are two 
different classes of subtree operations, at any rate -- those which 
destroy the subtree and those which don't.  It doesn't seem unreasonable 
for them to use different types of locking.)


> I think I see what you're saying:  You don't see preventing one
> task from interfering with another as a goal.  So you don't
> draw the distinction between prevention that's necessary,
> and prevention that's stronger than necessary (since that
> patch changed one line vs, say, a couple dozen).

No, no.  Preventing tasks from interfering with another is absolutely
necessary.  I'm saying that even if hub_events() called
down(&hdev->serialize) instread of locktree(), we would still have that
prevention.

> > And a "minimal change" patch wouldn't have changed 
> > down(&hdev->serialize) in hub_events() to locktree(hdev)!
> 
> No, it had to do that to prevent khubd from interfering with
> (or being interfered with by) some other task manipulating
> portions of the same tree, that's how the call sequences
> work.

Here's where we disagree.  _How_ is calling locktree() in hub_events()  
necessary to prevent khubd from interfering with (or being interfered with
by) some other task manipulating portions of the same tree?  Can you give
an example where locktree() provides such prevention and
down(&hdev->serialize) doesn't?


> > I didn't say that _sequencing_ wouldn't matter; I said that the _type of
> > locking_ wouldn't matter.  And as I mentioned above, the second thread
> > is either doing usb_reset_device() or hub_events().
> 
> Since the two styles of locking produce different event sequences
> (by design/intent), I'm not sure how to interpret your comment.
> If the sequencing matters,  then the type of locking matters;
> if locking doesn't matter, then sequencing can't either.

Put it this way:  Sequencing does matter.  Both types of locking produce
correct, although possibly different, sequences.  The sequences resulting
from using locktree() in hub_events() or usb_reset_device() tend to be
inferior to the ones resulting from down(&hdev->serialize); they generate
unnecessary delays and failures that could have been avoided.

> Well, the database folk have bad things to say about event sequences
> that can't be serialized ... in fact locktree() is a serialization algorithm,
> designed to let tasks work concurrently without sacrificing several
> handy characteristics.

That's a strange way of putting it.  Doesn't it make more sense to say
that serialization algorithms _prevent_ tasks from working concurrently,
so that even if they are started at the same time they will still function
correctly?  Doesn't the very term "serial" imply non-concurrent operation?


> > First, notice that it's _not_ okay for khubd to completely drop all locks 
> > on a hub while processing its events.  There has to be some way for it to 
> > synchronize when the hub is disconnected.  Until recently this was done 
> > using the private khubd_semaphore.  I recall you being very keen on the 
> > idea that this semaphore could now go away and be replaced with 
> > hdev->serialize.  Have you changed your mind about that?
> 
> No, but I don't see your point.  If everything is fail-fast on the
> disconnect paths, anything that touches a disconnected device
> (hub or otherwise) will drop it like a hot potato, so that very
> quickly it's only disconnect() processing at work.  I think we're
> pretty much at that point now.

You seem to be saying that it's never necessary to protect against
disconnection by locking a device.  That's just not true.  As an example,
consider that usb_new_device() requires the parent hub to be locked by the
caller.

Yes, I will agree that it would be possible for khubd to unlock hdev
during hub_port_init(), say, and then relock it before calling
usb_new_device().  But why go to the trouble?  Why not just hold the lock
the whole time?  Everything will still fail-fast if the hub is
disconnected.


> > Let's talk about event sequencing too.  As an example, let's consider
> > thread A doing a suspend on an upstream hub H while thread B is trying to
> > reset a downstream device D.  The question is, does using locktree(D) in
> > thread B give better sequencing than just doing down(&D->serialize)?
> 
> Depends on what "better" means.  For example, if it means
> that a task working on a subtree is guaranteed no other task
> will interfere with it, locktree() is much better.

No, because even without using locktree() in hub_events() or 
usb_reset_device() we have such a guarantee.


> > I meant to say "suspending them", not "locking them", sorry.  Given that
> > usbcore must take responsibility for suspending USB devices and can't
> > delegate it to the driver core, what happens when the device tree
> > traversal needed for proper suspension goes outside the USB boundary?
> 
> Good question.  I think that the PM core should have a routine that's
> responsible for suspending all of a subtree.  Instead it only has code
> to do that for the degenerate subtree:  the whole darn system!

You're still missing my point.  The PM core _can't_ have such a routine.  
Or if it does, the routine won't work for locking USB devices.

Why not?  In order to lock all the devices in a subtree the PM core first
has to acquire the subsystem rwsem; otherwise the subtree could change out
from under it.  So it locks the rwsem and then does the various
suspend() calls.  Suspending a USB device requires that you lock
->serialize.  But it's too late!  The locking rules say that ->serialize
must be locked _before_ the subsystem rwsem.  There's no way to fix this 
simpy by changing the PM core.

Alan Stern




-------------------------------------------------------
This SF.Net email is sponsored by OSTG. Have you noticed the changes on
Linux.com, ITManagersJournal and NewsForge in the past few weeks? Now,
one more big change to announce. We are now OSTG- Open Source Technology
Group. Come see the changes on the new OSTG site. www.ostg.com
_______________________________________________
[EMAIL PROTECTED]
To unsubscribe, use the last form field at:
https://lists.sourceforge.net/lists/listinfo/linux-usb-devel

Reply via email to