On Mon, 16 Aug 2004, David Brownell wrote:

> Maybe ... it's not a simple locking problem though.
> 
> Doctoral theses have been gotten in such areas,
> though I hope none recently!  (Now they just give
> spurious software patents for trivial variants of old
> schemes.)

:-)

I feel better now; we seem to be making genuine progress toward mutual 
understanding, if not a resolution...


> > I don't know what you mean by my "no-parent-locking rule".  Maybe it
> > stems from a misunderstanding of something I said earlier, leading to
> > this current misconception?
> 
> What misconception?  It's how one defines the locking
> policy in use; you certainly talked about such a policy early
> on (though now you're talking about a different one).

Yes, I have altered my ideas over time.  The "misconception" was which 
algorithm I was referring to.

>  Here
> are some policies that have come up:
> 
>  - "no-parent" -- just grab udev->serialize, and you get
>    per-device serialization.  But whole-subtree tasks, rare
>    until trees start suspending/resuming, interfere with
>    each other unless you switch to multi-pass algorithms.
>    (You refined this one a bit; see below.)
> 
> - "single parent" -- grab udev->parent->serialize before
>   grabbing udev->serialize, and you synchronize on the
>   hub in certain cases.  That means some of the task
>   interference problems in the "no-parent" policy got
>   a lot more rare (most systems don't have many hubs).
> 
> - "lock parent" -- apply "single" parent all the way up to the
>   root hub (this is what locktree does), and you get task-level
>   serialization for all operations in that tree.  Those "rare"
>   cases are now provably gone.

None of these really gives a completely accurate description of what I'm
suggesting.  How about this:

  - "explicit locks" -- grab udev->serialize for per-device
    serialization.  Subtree operations achieve task serialization
    by explicitly locking each device in their subtree as needed
    (but never locking a parent while holding a child's lock),
    even if this means using multiple locking passes.

Or is that the same as "no-parent"?

> A variant of "lock parent" would designate some lock as
> applying over the root hub locks ... one of your patches
> had something like that.

Yes.  In those circumstances one wants to prevent root hubs from being 
added or removed, so the usb_bus_list_lock is the natural thing to use.

> You've recently discussed "no-parent" variants that fill
> holes by adding extra passes to grab (or release) lots
> of locks.  I'll confess I don't like algorithms holding extra
> locks (like that!) ... that's one reason locktree() drops the
> parent lock as soon as possible, and I've talked about
> algorithms that can do the same thing.

You must be referring to the most recent algorithms I gave for suspend and 
resume.  There _is_ an extra pass in the suspend code to release some 
locks -- I don't view it as a big issue although I can understand that you 
might not like it.

However: The code doesn't hold extra locks.  That is, it doesn't lock any
more devices than the current suspend code does -- in fact it locks fewer,
since avoiding locktree() means that it doesn't lock anything above the
topmost node being suspended.

It's true that my algorithm holds some locks longer than the current
suspend code does; the current code unlocks a child as soon as it is
suspended whereas my algorithm keeps a child locked until its parent is
suspended.  In spite of appearances this isn't a disadvantage: Even though
the current code unlocks children earlier, they don't become _accessible_
any earlier -- in fact they become accessible later.  What I mean is, if
some other task wants to lock a suspended child and calls locktree() to do
so, then that other task must block until the entire suspend operation is
complete.  Whereas with my approach, if some other task wants to lock a
suspended child then it isn't forced to use locktree(), so it only has to
block until the child's parent is suspended, not until the entire suspend
operation is complete.
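
To make the shape of this concrete, here is a userspace sketch with
udev->serialize modeled as a pthread mutex.  The names (struct dev,
suspend_pass(), unlock_pass()) are invented for illustration and this is
not the actual usbcore code; also, for simplicity the unlock pass here
runs only once the entire subtree is suspended, although in principle a
child's lock could be dropped as soon as its parent is suspended:

```c
#include <pthread.h>

struct dev {
	pthread_mutex_t serialize;	/* models udev->serialize */
	struct dev *children[8];
	int nchildren;
	int suspended;
};

/* Pass 1: lock and suspend bottom-up.  Each child is locked before it
 * is suspended and stays locked; a parent is suspended only after all
 * of its children are.  The caller must already hold the lock on the
 * topmost device being suspended. */
static void suspend_pass(struct dev *d)
{
	for (int i = 0; i < d->nchildren; i++) {
		pthread_mutex_lock(&d->children[i]->serialize);
		suspend_pass(d->children[i]);
	}
	d->suspended = 1;
}

/* Pass 2: the extra pass -- release the children's locks.  Another task
 * wanting a suspended child blocks on just that child's lock; it does
 * not need a locktree()-style walk that waits for the whole operation. */
static void unlock_pass(struct dev *d)
{
	for (int i = 0; i < d->nchildren; i++) {
		unlock_pass(d->children[i]);
		pthread_mutex_unlock(&d->children[i]->serialize);
	}
}
```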


> > The remote wakeup request from V, relayed 
> > through A, is presented to khubd as the USB_PORT_STAT_C_SUSPEND bit in
> > the root hub's port status.  So khubd won't see the wakeup request
> > until the suspend task is complete, because khubd won't read the root
> > hub's port status until it can lock the root hub.
> 
> And actually the way resume (and wakeup) works, that'd apply to
> any already-suspended hub ... so long as any non-suspended
> parent is locked.   Compared to the algorithm you're now
> presenting, the locktree() based one does however eliminate the
> need for an extra "go back and drop the locks" pass.
> (And compared to the first algorithm I thought you were
> describing: that first one didn't lock non-suspended parents.)

(Forget about that first algorithm; it was wrong.)  Yes, there is that 
extra pass.  It's present only in suspend.  I think that's better than 
having to invoke locktree() every time you lock a subtree.  Maybe things 
will change in the future and more operations will end up needing multiple 
passes; then I might change my opinion.


> > There is a potentially more dangerous variant of this scenario.
> > Suppose instead of V issuing a remote wakeup request that some task
> > decides to resume A.  Under the current implementation (and after
> > replacing locktree() with down(serialize)) the suspend task would have
> > released A's lock already, so the resume would proceed and would wake
> > A up.  In the algorithm I presented A remains locked until its parent
> > (the root) is suspended; the resume would be forced to wait and then
> > would fail because the clear_port_feature(USB_PORT_FEAT_SUSPEND)
> > message to the root hub would be rejected.
> 
> What you say is incorrect with respect to locktree()-based suspend;
> see above, all non-suspended parents stay locked.

Right, but it _is_ correct if all calls to locktree() in the current code
are replaced with down(serialize).  Then resume wouldn't care that the
parent was still locked; it would go ahead anyway.  The point I was trying
to make is: Simply replacing locktree() with down(serialize) is incorrect,
but if the suspend code is changed the way I said (with the extra
unlocking pass added) then it is correct.


> Any implementation is going to rely on all tasks applying SOME
> locking policy consistently.   You're just advocating a different
> policy, which requires additional locking passes for correctness,
> because to you:
> 
> > ... My algorithm... seems more straightforward.
> 
> Whereas the algorithm I'm describing is a kind of classic
> example applying lock chaining in a lock hierarchy.  I'd be 
> more likely to call it "elegant" than anything else.  The
> lock hierarchy is straightforward, the chaining is the nice
> way to make sure concurrency is usually possible.

"Usually possible".  The locktree approach rules out certain types of
acceptable concurrency that my approach allows, and not vice versa.  The
one disadvantage is the extra pass in suspend -- none of the other subtree
operations require an extra pass -- and all it does is release some locks,
locks which would have to be released anyway (though not in a separate
pass).

> I don't want to add multi-pass algorithms unless they're unavoidable;
> so I'd argue instead "yes it WOULD be such a bad idea...".

Okay, that's a matter of taste.


> > Now 
> > imagine that a year from now some programmer who hasn't read the email
> > archives comes along and notices this strange locktree() routine 
> > sitting there in hub.c.  He also notices that if every call to
> > locktree(u) were replaced with usb_lock_device(u) then everything
> > would work just as correctly as before.  Who could blame him for
> > making the replacements and removing locktree()?
> 
> There _is_ no usb_lock_device() call now (in 2.6.8) ... though at one
> point I thought we were actively discussing having one, with its
> implementation looking like locktree().

There is no usb_lock_device() at the moment, but I'll send the patch in to
Greg when he gets back.  A year from now it _will_ exist (I hope!).


> One of the things I like most about locktree() is that the locking
> hierarchy itself is so simple:  lock parent first.  Since root hubs have
> no parent, that's where the recursion bottoms out. Everything else
> is a natural consequence of following that lock hierarchy.

My approach can also be described as "lock parent first" -- that's the
nice thing about ambiguous catch-phrases!  In your case it means "always
lock the parent before locking the child".  In my case it means "if you're 
going to lock both the parent and the child, the one to lock first is the 
parent".


> > I don't understand the distinction you're making between "task"
> > serialization and "device" serialization.
> 
> If two tasks work on possibly disjoint trees, each with a
> bunch of devices (including hubs), the difference shows
> up with all devices in the intersection of those trees.  For
> any device, one task was first ... that's a guarantee from
> device serialization.  Task serialization goes further, and
> guarantees that the _same_ task was always first. 

Okay, clear enough.  My algorithms are task-serialized.  If two tasks are
working on overlapping subtrees, then (since one subtree is necessarily a
sub-subtree of the other) the task that first locks the head of the inner
subtree will visit each device in the inner subtree first, just as with
locktree.


> > My approach _never_ involves locking a parent while holding a child's
> > lock, so at least it's opposed to parent-after-child.  Is that enough
> > to make my approach equivalent to locktree?  :-)
> 
> Not equivalent in terms of work; it adds extra passes.
> Or concurrency ... it grabs some locks earlier  than they're
> needed.

Could you explain that last statement?  My algorithm doesn't grab any lock
earlier than the current code does.  And I don't think it grabs any locks
before they are needed.


> > This is a more complicated issue than your description would indicate.
> > If the set of nodes in a subtree can change (because children are
> > added) during a traversal, you have to specify very carefully which
> > nodes you want to visit. ...
> 
> That's all part of the algorithm definition, as I said.  

As you _didn't_ say!  You left all discussion about whether to visit
newly-added children out of the algorithm definition.  In the absence of
any other evidence I would normally believe that the definition, as you
stated it, was intended to apply only in situations where the tree was
static.  For dynamic trees a more careful statement would be needed.

You mentioned that postfix-order locking can't be used with a
prefix-order algorithm.  In a narrow sense that's true.  But if you're
willing to allow multiple passes then you could construct a larger
algorithm that grabs the locks in prefix order to prevent the tree from
changing, then applies the static-tree algorithm in whatever order it
likes, then releases the locks in postfix order.  In this larger sense the 
order of locking is totally unrelated to the traversal order of the 
inner algorithm.
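
A sketch of that larger algorithm (again with invented names and pthread
mutexes standing in for the device locks; not real usbcore code):

```c
#include <pthread.h>

struct dev {
	pthread_mutex_t serialize;	/* models udev->serialize */
	struct dev *children[8];
	int nchildren;
	int visited;
};

static void lock_prefix(struct dev *d)		/* parent before child */
{
	pthread_mutex_lock(&d->serialize);
	for (int i = 0; i < d->nchildren; i++)
		lock_prefix(d->children[i]);
}

static void unlock_postfix(struct dev *d)	/* children before parent */
{
	for (int i = 0; i < d->nchildren; i++)
		unlock_postfix(d->children[i]);
	pthread_mutex_unlock(&d->serialize);
}

/* With every lock in the subtree held, no children can be added or
 * removed, so the inner (static-tree) algorithm may visit the nodes in
 * whatever order it likes. */
static void subtree_op(struct dev *root, void (*op)(struct dev *))
{
	lock_prefix(root);
	op(root);
	unlock_postfix(root);
}

/* A sample inner algorithm; it happens to run in postfix order, which
 * is fine because the locking order is independent of it. */
static void mark_postfix(struct dev *d)
{
	for (int i = 0; i < d->nchildren; i++)
		mark_postfix(d->children[i]);
	d->visited = 1;
}
```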


> > No.  When I wrote the preceding message I was speaking in the context
> > of the current kernel code, which doesn't include usb_lock_device().
> > When that routine gets merged back in, I would rather say that locking
> > device nodes means calling usb_lock_device() for the first lock a task
> > acquires and doing down(serialize) for subsequent nested locks.  At
> > any rate, it definitely does not mean calling locktree() -- in fact it
> > refers to the individual locking operations that comprise locktree().
> 
> Hmm, so #define usb_lock_device(x) locktree(x) would fit?

Actually it would.  Except that when the usb_lock_device patch appears, 
you'll see that the non-recursive part of locktree() -- locking the root 
hub -- calls usb_lock_device().  That part of locktree() would have to be 
changed to do what usb_lock_device() does, i.e., acquire the 
usb_all_devices_rwsem readlock.


> > > they added a "must own parent lock" rule.
> 
> >     (a) The algorithms have no such rule; it's not clear why you
> >     say they do. 
> 
> Your pseudocode showing the parent lock getting grabbed
> before the child lock, and other comments you've made!!

I misunderstood what you were getting at.  Yes, my suspend_subr routine
requires the caller to own u's device lock and it treats u as a parent, so
in that sense it does have a "must own parent lock" rule.  That's not how
I think of it though; I view it as a "caller must own the device lock"  
rule.  The same rule is present in several routines throughout usbcore, so
there doesn't seem to be anything remarkable about having it here.  In
fact it's a direct application of the "no-parent" rule: You're going to 
suspend u so you must own u's lock.

> Certainly in any case where you grab parent and child locks,
> you grab parent first.

That's an application of the "never lock the parent after locking the 
child" rule (or as I like to think of it, "lock parent first"!).

> > (c) After some more thought, I decided it would be better to
> >     require the caller to own the lock on the topmost device
> >     being suspended or resumed.  
> 
> That's what the usbcore-internal routine (__usb_suspend_device)
> does.  When called from sysfs, we know that the caller does NOT
> have that lock.

Sure.  My suggestion is that we eliminate usb_suspend_device, rename
__usb_suspend_device to usb_suspend_device, and move the locktree() call
into the sysfs code.

A prerequisite change is something I have already put into the
usb_lock_device patch -- make usb_suspend_device release the lock that it
acquires instead of having __usb_suspend_device release it.  Remember:
"Release a lock in the same routine that acquired it"... you violated one
of your own design rules.  :-)
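
Here is the shape I have in mind, as a userspace sketch with invented
names (do_suspend() standing in for the renamed __usb_suspend_device,
and a pthread mutex for the device lock; not the actual patch):

```c
#include <pthread.h>

struct dev {
	pthread_mutex_t serialize;	/* models udev->serialize */
	int suspended;
};

/* The inner routine: caller must already hold d->serialize. */
static int do_suspend(struct dev *d)
{
	d->suspended = 1;
	return 0;
}

/* The outward-facing wrapper acquires the lock, calls the inner
 * routine, and releases the lock itself -- so the lock is released in
 * the same routine that acquired it. */
static int usb_suspend_device_sketch(struct dev *d)
{
	int rc;

	pthread_mutex_lock(&d->serialize);
	rc = do_suspend(d);
	pthread_mutex_unlock(&d->serialize);
	return rc;
}
```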


> > P.S.: Ironically, even though I object to locktree in part because of
> > the unnecessary delays it can cause, I wouldn't mind so much getting
> > rid of usbdev->serialize entirely and just using a single semaphore to
> > lock an entire bus.  
> 
> Well, those "unnecessary" delays relate entirely to task serialization;
> and you seem to have restored a few of them by moving towards
> algorithms that use multiple locking passes!

Only one algorithm, and as I said above, although the locks are released
later the devices become accessible earlier.

> I think having a single lock is a bit far from being fully workable
> right now, but "grand simplifications" often look interesting!

It may not be as far as you think.  After usb_lock_device is in place --
and it's already written -- all that's needed is to remove all calls to
down(serialize) and up(serialize), make locktree() and usb_lock_device()
acquire the single bus-wide semaphore, and make usb_unlock_device()
release it.

I'm not seriously suggesting doing this, though.  Vendors are moving 
toward EHCI controllers with lots of ports; even without external hubs 
such systems would be penalized by this approach.

Alan Stern



