Re: [linux-usb-devel] [PATCH 2.6.8-rc2] usb hub docs and locktree()

David Brownell Sat, 07 Aug 2004 10:25:06 -0700

On Friday 06 August 2004 09:39, Alan Stern wrote:
> On Wed, 4 Aug 2004, David Brownell wrote:
> 
> > > No, no.  Preventing tasks from interfering with another is absolutely
> > > necessary.  I'm saying that even if hub_events() called
> > > down(&hdev->serialize) instread of locktree(), we would still have that
> > > prevention.
> > 
> > No we wouldn't.  Consider a task suspending a tree; now khubd
> > comes in and wakes something up that's already been carefully
> > suspended.  You're allowing interference ... encouraging it. 
> 
> In fact that can't happen; khubd can't wake up suspended devices.


Except for its remote wakeup path... :)


> > Yet I've demonstrated a locking policy that works the same in
> > all those cases, without claiming that, for example, reset() is a
> > special case ...
> 
> Your policy isn't correct, or at least, it doesn't do what you seem to be 
> claiming it does.
> 
> To see why, let's go back to first principles.  You are concerned about 
> "non-data signalling" -- which I take to mean any action that, when 
> applied to a hub, can affect the entire subtree of devices below that 
> hub.  (There's not much point in considering the subtree of devices below 
> a non-hub, which is why I use this phrasing.)  A better name might be 
> "subtree operations"; if you don't mind I'll start using that name in 
> place of the other.

I started out talking about tree/subtree operations, but that seemed
to invite misunderstandings ... which is why I also started to talk about
a class of operations that required subtree operations.   Namely,
those which aren't sending differential/data signaling down the wires.


> When you think about it, lots of things qualify as subtree operations.  
> ...

That's why my thoughts have tended that way ... :)


> Faced with this analysis, you may agree and go on to say that it simply 
> shows that _all_ our locking should use locktree().

Or I might just say that I want the locking rules to be simple,
and not dominated by an armload of easy-to-misunderstand
special case exceptions ...  in fact I'm very sure I said that!

We may come up with cases where exceptions are needed,
but let's look at those cases as they appear.  So far you've
not shown me any cases where they're needed.  (Though
there's still that subsys.rwsem problem, which seems to me
not unlike a driver model design flaw.  Not USB-specific.)


> But that's not right,  
> because your procedures for suspend/resume -- which _do_ use locktree() -- 
> don't accomplish what you want.  That is, they don't provide 
> general-purpose serialization with no need to consider special cases.
> 
> 
> To understand why this is so, let's use a particular example.  Take this
> hypothetical device tree:
> 
>               [snip!]
> 
> While all that is going on, suppose task T2 decides to suspend the entire 
> tree starting at A.  It uses locktree() to lock A.  Then it locks B, then 
> it locks, suspends, and unlocks C & D.  Then it suspends and unlocks B.  
> And so on...  It doesn't block until it tries to lock G, at which point it 
> has to wait for T1 to finish up.

Right.  T1 and T2 are fully sequenced; T1 always completes before T2;
they can't interfere with each other.  They may be partially concurrent.


> Observe:  T1 and T2 are operating concurrently on overlapping subtrees.  
> At the same time the T1 is suspending H, I, and G, T2 might be suspending
> C, D, and B.  This example demonstrates that your algorithm allows
> concurrency; it does _not_ achieve complete serialization.

Not true.  T1 is always serialized with respect to T2; there will never be
a case where T2 can undo something T1 did before T1 completes, or
vice versa.  That's a fairly standard definition of task serialization;
I don't know what kind of serialization you're referring to.


> One might claim that this isn't so bad, that the intermediate stage of T2
> when it has suspended C, D, and B (for example) means that it has been
> operating on a sub-subtree disjoint from the H,I,G sub-subtree.  That sort
> of argument doesn't wash when considering resume().  Since resume() works
> on devices from the top down, during its intermediate stages the devices
> it has affected don't even form a subtree!

The approach I suggested for khubd -- in case that potential performance
issue makes trouble -- applies here too.  See below for more comments
on that; it generalizes nicely.

Again, this seems like a case of "want" a special case, not "need".

 
> Now suppose for the sake of argument there was some subtree operation
> (call it SOP) which had to prevent other tasks from working on devices
> _before_ SOP had touched them but which was okay for concurrent execution
> on devices _afterwards_....

Tree locking handles that by dropping the lock as soon as it's no longer
needed.  Consider the three classic kinds of tree walk, and their
different locking requirements:

  - prefix (parent first then children) ... task can drop parent's lock
    as soon as it has (a) visited the parent, (b) locked the children
 
  - postfix (parent after each child) ... pretty common, must hold
    lock until after visiting children.  All tree traversals can lock
    like this, modulo performance tradeoffs.

  - infix (parent after each child) ... not all that common, must
    hold lock until after last child (like postfix).

I've already described how to make khubd use a prefix walk to get
rid of that one performance issue that bothered you ... SOP could
use the same policy.  Note that suspend() is a good example of a
postfix walk, and resume() could be implemented as a prefix walk.

Key point:  so long as locktree() is used to get the first lock, the
kind of tree traversal used during the tree operation need not be
visible to other tasks.  Because locktree() established the basic
task-level serialization.

And again:  using an unlock-early policy with a prefix walk is
fundamentally a performance issue, not a correctness one.


> The code used by suspend won't work for SOP.  T2 running SOP starting at
> A, for example, would need to explicitly lock all the nodes before it
> could start doing anything.  Otherwise the possibility would exist that
> the previously-started T1 could still be doing something below G, which
> would be bad.

Oh, so there's an _additional_ constraint on SOP:  it really doesn't
like concurrent operations.  Sounds to me like SOP is just an
academic example then ... normally one wants it to not matter
whether task T1 completes before T2 starts.  Designs usually quite
work hard to avoid "one lock to rule them all" models ... :)

Is there a real-world use case behind SOP?  (It's clearly not
"Standard Operating Procedure" despite its TLA.)


> Furthermore, once T2 finished applying SOP to some nodes 
> and they were available for other work, there wouldn't be any way for
> another task to do anything to those nodes until T2 released its lock
> on A.  So legal concurrency would be ruled out.

I described prefix locking several email messages ago, as the
way to address the (potential) performance issue that bothered
you in the khubd context.  SOP could do the same; it addresses
this kind of issue.
 

>       Concurrency is allowed whenever it is _safe_ (which must be
>       determined according to the nature of the particular operations
>       acting concurrently).  Serialization is achieved simply by
>       locking device nodes.

And if "locking device nodes" uses locktree(), or requires holding
the parent node's lock, you're not saying anything different from
what I've said.

If there's nothing like locktree() to manage the basic task
serialization, I'll have to disagree on basic principles.  See
the earlier parts of this message ... something like it is needed
to prevent two tasks from interfering with each other, just
grabbing locks from the middle of the tree can NEVER provide
task serialization.


> These algorithms provide safe concurrency, exactly when and as desired.  
> Each device is locked precisely during the time that another subtree 
> operation on it cannot be allowed.

And if you accept that locks after the initial one aren't acquired
by locktree, but by the task performing the operation, I don't see
any issue either.  Some tasks (like khubd and resume) could
use a prefix tree traversal, and drop locks earlier than others
that need postfix traversal (like suspend).

- Dave


-------------------------------------------------------
This SF.Net email is sponsored by OSTG. Have you noticed the changes on
Linux.com, ITManagersJournal and NewsForge in the past few weeks? Now,
one more big change to announce. We are now OSTG- Open Source Technology
Group. Come see the changes on the new OSTG site. www.ostg.com
_______________________________________________
[EMAIL PROTECTED]
To unsubscribe, use the last form field at:
https://lists.sourceforge.net/lists/listinfo/linux-usb-devel

Re: [linux-usb-devel] [PATCH 2.6.8-rc2] usb hub docs and locktree()

Reply via email to