Re: [linux-usb-devel] [PATCH 2.6.8-rc2] usb hub docs and locktree()

Alan Stern Mon, 16 Aug 2004 10:17:36 -0700

[Email can sometimes be a very frustrating medium.  It's fair to say
that you and I are both very intelligent people, and yet we've been
talking past each other in this thread to an almost comical extent.  I
bet we could have settled the whole thing very simply in less than an
hour or two of face-to-face contact.]

On Sat, 14 Aug 2004, David Brownell wrote:

> > The 
> > problem you outlined still can't happen, though.  khubd won't see the 
> > remote wakeup status until it can lock the hub sending the status report; 
> > so long as suspend() is using that hub, khubd won't interfere. 
> 
> But locktree() is all that would prevent khubd from interfering,
> unless you're moving away from the "no-parent-locking" rule.
> 
> Your "no parent-locking" rule (or the "lock only the immediate
> parent" variant) would certainly let khubd wake up a hub before
> its subtree finished suspending.
> 
> Consider a task suspending from a root hub:
> 
>   SUSPEND( root{ a{v,w}, b{x,y}, c } )
> 
> When V issues a wakeup after A's been suspended, and
> while B is being suspended, the no-locktree behavior is
> to wake A up ... whoa, khubd just interfered with the task
> that was suspending!!   Likewise with other operations.

I don't know what you mean by my "no-parent-locking rule".  Maybe it
stems from a misunderstanding of something I said earlier, leading to
this current misconception?

Regardless, in the situation you describe (and after replacing
locktree() with down(serialize)) the suspend task would still own the
root hub's lock, because it had not yet finished suspending all the
children of the root.  The remote wakeup request from V, relayed
through A, is presented to khubd as the USB_PORT_STAT_C_SUSPEND bit in
the root hub's port status.  So khubd won't see the wakeup request
until the suspend task is complete, because khubd won't read the root
hub's port status until it can lock the root hub.

There is a potentially more dangerous variant of this scenario.
Suppose instead of V issuing a remote wakeup request that some task
decides to resume A.  Under the current implementation (and after
replacing locktree() with down(serialize)) the suspend task would have
released A's lock already, so the resume would proceed and would wake
A up.  In the algorithm I presented A remains locked until its parent
(the root) is suspended; the resume would be forced to wait and then
would fail because the clear_port_feature(USB_PORT_FEAT_SUSPEND)
message to the root hub would be rejected.

The basic message of this analysis is that I'm saying tasks like
suspend should keep devices locked for as long as they want to prevent
other tasks from accessing them.  Your implementation relies on
resume() using locktree() to prevent interference such as resuming a
device after it is suspended and unlocked but before its parent is
suspended.  My algorithm relies on keeping the device locked until
after its parent is suspended to prevent this; it seems more
straightforward.  After all, this is the same mechanism we rely on to
prevent tasks from resuming devices below a suspended hub when
usb_suspend_device() isn't running!

> > Well, there is a slight difference in meaning.  Debouncing, for instance,
> > could qualify as non-data signalling but not as a subtree operation.  
> 
> Debouncing should be covered by such a lock, certainly; have
> you got something against singleton degenerate trees?  :)

Not a bit.  But during debounce processing for new connections or
disconnections there _is_ no child device structure to lock.  Of
course, that could be changed...

> You've not yet shown me a "... than necessary" case though;
> your examples always allow tasks to interfere with each other.

It only seems that way because you haven't understood them fully!

> And I don't see complication at all, except when trying to get
> potential performance optimizations by switching to fancier
> algorithms.  How is the "lock parent then child" policy at all
> complicated?

It's more complicated than "lock child" is.  Imagine, for the sake of
argument, that the __usb_suspend_device()/hub_suspend() code is
reorganized so that children aren't unlocked until after their parent
is suspended.  (It wouldn't be such a bad idea to go ahead and
actually make that change, regardless of anything else...)  Now
imagine that a year from now some programmer who hasn't read the email
archives comes along and notices this strange locktree() routine
sitting there in hub.c.  He also notices that if every call to
locktree(u) were replaced with usb_lock_device(u) then everything
would work just as correctly as before.  Who could blame him for
making the replacements and removing locktree()?

> > As for simple locking rules...  Your rule seems to go like this:
> > 
> >     Every subtree operation starts by calling locktree() on its
> >     topmost node, then uses further locking as appropriate for
> >     the kind of subtree traversal it will perform.  Proper
> 
> Missing important adjective:  "task" serialization.
> 
> >     serialization is guaranteed by the fact that _every_ subtree
> >     operation calls locktree() at the start and locks each device 
> >     before touching it.
> > 
> > The rule I'm proposing goes like this:
> > 
> >     Every subtree operation uses locking as appropriate for the kind
> >     of subtree traversal it will perform.  Proper serialization is
> 
> Analogous adjective:  "device" serialization.
> 
> >     guaranteed by the fact that _every_ subtree operation locks each
> >     device before touching it.

I don't understand the distinction you're making between "task"
serialization and "device" serialization.

> > Once you accept that my rule is correct, I think you have to agree that
> > it's simpler than your rule.
> 
> If it's correct, it's for a different problem though; that's why it's
> simpler.  It allows two tasks to interfere with each other, forcing
> the use of non-deterministic and multipass algorithms to clean
> up after such interference.  (Rather than knowing that it can't
> be un-done till later, so a single-pass algorithm will work.)

_How_ does my approach allow two tasks to interfere with each other?
Under what circumstances can such interference occur?  I've already
shown that the remote-wakeup case you mentioned avoids interference,
as does resume-during-suspend.  Can you find any other examples?

The type of interference I've been concerned with is where a second
task can alter a device's state when the first task doesn't want it
to.  This can't happen with my approach, because the first task will
continue to hold the device's lock during the entire time it wants to
prevent state alterations.  Are you worried about some other sort of
interference?

> The way locktree() prevents them is by _consistently_ implementing the
> parent-before-child locking hierarchy.  If that policy is implemented
> consistently, it's equivalent to locktree.  Inconsistent locking
> policy implementation ... let's not go there!

My approach _never_ involves locking a parent while holding a child's
lock, so at least it's opposed to parent-after-child.  Is that enough
to make my approach is equivalent to locktree?  :-)

More likely you meant "always lock a device's parent before locking
the device".  I don't see any point to this.  In general, you lock a
device to prevent other tasks from interfering while you use the
device (or conversely, because you're going to change the device and
you don't want to interfere with anyone else using it).  But if you're
going to suspend A and everything below A then you're not doing
anything at all to A's parent, so why should you lock it?

> > >   - prefix (parent first then children) ... task can drop parent's lock
> > >     as soon as it has (a) visited the parent, (b) locked the children
> > 
> > So we can add: (c) unlock the parent, (d) recurse on the children.
> 
> Yes.  Ideally one just appends to list end, and takes the next item
> off its front.  That's a classic algorithm, one I suspect the PM core suspend
> code might be better off using.

> > Now here's another problem.  The fact is, subtree operations in general
> > can't safely use your prefix-traversal locking pattern.  You would run
> > into problems in step (d), because once the parent is unlocked new
> > children can be added. 
> 
> Which by definition is perfectly OK, otherwise you couldn't have
> been safe using the postfix locking pattern ... :)

This is a more complicated issue than your description would indicate.
If the set of nodes in a subtree can change (because children are
added) during a traversal, you have to specify very carefully which
nodes you want to visit.  The only thing that makes sense is to say
that the traversal should visit all of a parent's children _as of the
time the parent was visited_.  That's good for both suspend and resume.
Given such a requirement then yes, the prefix-type locking algorithm
you described will work -- provided you take care to insure no harm
will come from trying to visit a child that didn't exist at the time
you visited the parent.

> If unlock-early is incorrect, then prefix traversal would also be
> incorrect ... likely it's a postfix algorithm, not a prefix one.
> Otherwise there'd be no need to keep those other locks,
> or care about things like siblings added after the algorithm
> had "finished" with the parent node.

Well, that's the thing.  If there were an operation which (like
resume) required visiting parents before children and also (_unlike_
resume) required visiting all the nodes in a subtree as of the time
the operation completed -- even nodes that weren't present when their
parent was visited -- then your unlock-early approach wouldn't work.
Nevertheless, wouldn't you say that this operation required a
prefix-order traversal (by definition: parents before children)?
Fortunately there aren't any such operations we have to worry about.

> > Simple locking can do this with no help from locktree.
> 
> But locktree() doesn't hurt either, to get that first lock;
> T2 just needs a multi-pass algorithm, and the locking pass
> grabs a whole bunch of locks after locktree() returns.

So you're not totally against the idea of multi-pass algorithms!  :-)
It sounded like you were, from some other things you said.

> You're the one who suggested talking about "locking device
> nodes" rather than saying exactly how it was done ... very
> specifically to move away from saying it means down(serialize).
> Have you changed your mind on that?

No.  When I wrote the preceding message I was speaking in the context
of the current kernel code, which doesn't include usb_lock_device().
When that routine gets merged back in, I would rather say that locking
device nodes means calling usb_lock_device() for the first lock a task
acquires and doing down(serialize) for subsequent nested locks.  At
any rate, it definitely does not mean calling locktree() -- in fact it
refers to the individual locking operations that comprise locktree().

> > >  something like it is needed
> > > to prevent two tasks from interfering with each other, just
> > > grabbing locks from the middle of the tree can NEVER provide
> > > task serialization.
> > 
> > You keep saying that, and I don't understand why.  
> 
> Again, see the example at the top of this message.

I've shown that the example does not give rise to interference.  As
for task serialization...  I can't speak to that until I know what you
mean by it.

> Consider when more than two tasks are active too;
> when tasks touch a node, they should always do it in
> the same order.

I agree this is generally desireable, although it's not always
necessary.  For instance, if usb_disconnect() and usb_suspend_device()
were working on overlapping sets of nodes, it wouldn't really matter
if the orders didn't always match.  Anyway, all of the algorithms I've
proposed do have this property.  (Your prefix-order locking
optimization does _not_.)

> > Since suspend() and  
> > resume() are the two most problematic subtree operations, here are some 
> > explicit algorithms to provide a counterexample to your statement:
> 
> Ones that don't use the simple down() scheme you've been
> talking about, note -- they added a "must own parent lock"
> rule.  Except (!!) when getting the first lock; ergo, you've got
> those "tasks interfere with each other" problems, which comes
> from thosee exceptions-to-rule.  :)

Hang on there!  They _do_ use the simple down() scheme -- at least,
they don't call locktree() anywhere!  As for "must own parent lock"
rules:

    (a) The algorithms have no such rule; it's not clear why you
        say they do.  And quite apart from any transparent internal
        aspects of the implementation, there's no such requirement
        placed on any code calling suspend() or resume().

    (b) What's wrong with requiring callers to hold a parent's lock?
        usb_new_device() does so.

    (c) After some more thought, I decided it would be better to
        require the caller to own the lock on the topmost device
        being suspended or resumed.  Doing this would permit the
        subroutine in my example to become the main (the only!)
        routine for suspend.  It also permits tasks to suspend
        or resume devices that are already locked, which is a Good
        Thing (suppose a driver wants to suspend a device during
        probe?).  In fact, I suggest making such a change to the
        existing code.  It would bring usb_suspend_device() and
        usb_resume_device() in line with other routines like
        usb_set_configuration() and usb_reset_device().  (Whoops --
        usb_reset_device() doesn't require this in the current
        code.  Well, it will once my patches get merged back in.)

Finally, I don't see what you mean about having "tasks interfere with
each other" problems.  I'm pretty confident that you can't find any
examples of situations where another task can interfere with my
suspend or resume algorithms.

> So long as there's a device node in hdev->children[], the prefix algorithm
> lets that node's lock be used instead of holding the hub's lock.  The node
> could be stored there earlier than we do now by someone willing to spend
> time on a performance patch.

Yes, that would work.  It would be a little awkward, because
hub_port_connect_change() sometimes deletes the child device and
creates another one, but it could be done.

It would cause synchronization problems between khubd and the hub
disconnect() routine, however.  I _like_ the idea of making sure that
a driver doesn't try to access its device after disconnect() returns.
If hdev->serialize can't be used for that purpose then something else
would have to be introduced, like that old private khubd semaphore.

Alan Stern

P.S.: Ironically, even though I object to locktree in part because of
the unnecessary delays it can cause, I wouldn't mind so much getting
rid of usbdev->serialize entirely and just using a single semaphore to
lock an entire bus.  Although it would be even less efficient in terms
of concurrent operation, it would be much simpler, more elegant, and
quite obviously correct.  And in fact, so long as khubd remains
single-threaded the loss in concurrency wouldn't even be all that
high.  Probably a number of people won't think this is a good idea,
though.

-------------------------------------------------------
SF.Net email is sponsored by Shop4tech.com-Lowest price on Blank Media
100pk Sonic DVD-R 4x for only $29 -100pk Sonic DVD+R for only $33
Save 50% off Retail on Ink & Toner - Free Shipping and Free Gift.
http://www.shop4tech.com/z/Inkjet_Cartridges/9_108_r285
_______________________________________________
[EMAIL PROTECTED]
To unsubscribe, use the last form field at:
https://lists.sourceforge.net/lists/listinfo/linux-usb-devel

Re: [linux-usb-devel] [PATCH 2.6.8-rc2] usb hub docs and locktree()

Reply via email to