On Monday 16 August 2004 9:56 am, Alan Stern wrote:
> [Email can sometimes be a very frustrating medium.  It's fair to say
> that you and I are both very intelligent people, and yet we've been
> talking past each other in this thread to an almost comical extent.  I
> bet we could have settled the whole thing very simply in less than an
> hour or two of face-to-face contact.]

Maybe ... it's not a simple locking problem though.

Doctoral theses have been gotten in such areas,
though I hope none recently!  (Now they just give
spurious software patents for trivial variants of old
schemes.)


> > Consider a task suspending from a root hub:
> > 
> >   SUSPEND( root{ a{v,w}, b{x,y}, c } )
> > 
> > When V issues a wakeup after A's been suspended, and
> > while B is being suspended, the no-locktree behavior is
> > to wake A up ... whoa, khubd just interfered with the task
> > that was suspending!!   Likewise with other operations.
> 
> I don't know what you mean by my "no-parent-locking rule".  Maybe it
> stems from a misunderstanding of something I said earlier, leading to
> this current misconception?

What misconception?  It's how one defines the locking
policy in use; you certainly talked about such a policy early
on (though now you're talking about a different one).  Here
are some policies that have come up:

 - "no-parent" -- just grab udev->serialize, and you get
   per-device serialization.  But whole-subtree tasks, rare
   until trees start suspending/resuming, interfere with
   each other unless you switch to multi-pass algorithms.
   (You refined this one a bit; see below.)

- "single parent" -- grab udev->parent->serialize before
  grabbing udev->serialize, and you synchronize on the
  hub in certain cases.  That means some of the task
  interference problems in the "no-parent" policy got
  a lot more rare (most systems don't have many hubs).

- "lock parent" -- apply "single" parent all the way up to the
  root hub (this is what locktree does), and you get task-level
  serialization for all operations in that tree.  Those "rare"
  cases are now provably gone.

A variant of "lock parent" would designate some lock as
applying over the root hub locks ... one of your patches
had something like that.

You've recently discussed "no-parent" variants that fill
holes by adding extra passes to grab (or release) lots
of locks.  I'll confess I don't like algorithms holding extra
locks (like that!) ... that's one reason locktree() drops the
parent lock as soon as possible, and I've talked about
algorithms that can do the same thing.


> The remote wakeup request from V, relayed 
> through A, is presented to khubd as the USB_PORT_STAT_C_SUSPEND bit in
> the root hub's port status.  So khubd won't see the wakeup request
> until the suspend task is complete, because khubd won't read the root
> hub's port status until it can lock the root hub.

And actually the way resume (and wakeup) works, that'd apply to
any already-suspended hub ... so long as any non-suspended
parent is locked.   Compared to the algorithm you're now
presenting, the locktree() based one does however eliminate the
need for an extra "go back and drop the locks" pass.
(And compared to the first algorithm I thought you were
describing: that first one didn't lock non-suspended parents.)


> There is a potentially more dangerous variant of this scenario.
> Suppose instead of V issuing a remote wakeup request that some task
> decides to resume A.  Under the current implementation (and after
> replacing locktree() with down(serialize)) the suspend task would have
> released A's lock already, so the resume would proceed and would wake
> A up.  In the algorithm I presented A remains locked until its parent
> (the root) is suspended; the resume would be forced to wait and then
> would fail because the clear_port_feature(USB_PORT_FEAT_SUSPEND)
> message to the root hub would be rejected.

I wasn't quite clear that you were discussing two different algorithms
as well as the locktree() one.  One of your algorithms handles that
case better than the other (adding the extra pass).

What you say is incorrect with respect to locktree()-based suspend;
see above, all non-suspended parents stay locked.


> The basic message of this analysis is that I'm saying tasks like
> suspend should keep devices locked for as long as they want to prevent
> other tasks from accessing them.  Your implementation relies on
> resume() using locktree()

Any implementation is going to rely on all tasks applying SOME
locking policy consistently.   You're just advocating a different
policy, which requires additional locking passes for correctness,
because to you:

> ... My algorithm... seems more straightforward.

Whereas the algorithm I'm describing is a kind of classic
example applying lock chaining in a lock hierarchy.  I'd be 
more likely to call it "elegant" than anything else.  The
lock hierarchy is straightforward, the chaining is the nice
way to make sure concurrency is usually possible.



> > And I don't see complication at all, except when trying to get
> > potential performance optimizations by switching to fancier
> > algorithms.  How is the "lock parent then child" policy at all
> > complicated?
> 
> It's more complicated than "lock child" is.

The "complication" seems to be entirely because the lock
hierarchy is (a) defined, (b) followed everywhere.  

> Imagine, for the sake of 
> argument, that the __usb_suspend_device()/hub_suspend() code is
> reorganized so that children aren't unlocked until after their parent
> is suspended.  (It wouldn't be such a bad idea to go ahead and
> actually make that change, regardless of anything else...)  

I don't want to add multi-pass algorithms unless they're unavoidable;
so I'd argue instead "yes it WOULD be such a bad idea...".


> Now 
> imagine that a year from now some programmer who hasn't read the email
> archives 

Or the comments in the code explaining how all that does is implement
the locking hierarchy in a convenient way (OK maybe those still need to
be written) ...


> comes along and notices this strange locktree() routine 
> sitting there in hub.c.  He also notices that if every call to
> locktree(u) were replaced with usb_lock_device(u) then everything
> would work just as correctly as before.  Who could blame him for
> making the replacements and removing locktree()?

There _is_ no usb_lock_device() call now (in 2.6.8) ... though at one
point I thought we were actively discussing having one, with its
implementation looking like locktree().

One of the things I like most about locktree() is that the locking
hierarchy itself is so simple:  lock parent first.  Since root hubs have
no parent, that's where the recursion bottoms out. Everything else
is a natural consequence of following that lock hierarchy.


> I don't understand the distinction you're making between "task"
> serialization and "device" serialization.

If two tasks work on possibly disjoint trees, each with a
bunch of devices (including hubs), the difference shows
up with all devices in the intersection of those trees.  For
any device, one task was first ... that's a guarantee from
device serialization.  Task serialization goes further, and
guarantees that the _same_ task was always first. 

 
> _How_ does my approach allow two tasks to interfere with each other?

The latest variant you're discussing -- with extra passes to grab
and release locks -- doesn't.  The version you  first described
did allow it, though.


> > The way locktree() prevents them is by _consistently_ implementing the
> > parent-before-child locking hierarchy.  If that policy is implemented
> > consistently, it's equivalent to locktree.  Inconsistent locking
> > policy implementation ... let's not go there!
> 
> My approach _never_ involves locking a parent while holding a child's
> lock, so at least it's opposed to parent-after-child.  Is that enough
> to make my approach is equivalent to locktree?  :-)

Not equivalent in terms of work; it adds extra passes.
Or concurrency ... it grabs some locks earlier  than they're
needed.


> More likely you meant "always lock a device's parent before locking
> the device".  I don't see any point to this.  In general, you lock a
> device to prevent other tasks from interfering while you use the
> device ...

Or because that's the lock hierarchy defined for these objects,
and that's just how it always works for devices on the interior of
the device tree.  (Which include non-USB devices, like ones from
the SCSI and block subsystems, as you've noted.)


> > > Now here's another problem.  The fact is, subtree operations in general
> > > can't safely use your prefix-traversal locking pattern.  You would run
> > > into problems in step (d), because once the parent is unlocked new
> > > children can be added. 
> > 
> > Which by definition is perfectly OK, otherwise you couldn't have
> > been safe using the postfix locking pattern ... :)
> 
> This is a more complicated issue than your description would indicate.
> If the set of nodes in a subtree can change (because children are
> added) during a traversal, you have to specify very carefully which
> nodes you want to visit. ...

That's all part of the algorithm definition, as I said.  


> So you're not totally against the idea of multi-pass algorithms!  :-)
> It sounded like you were, from some other things you said.

Given that "SOP" example wasn't real-world, I wasn't keen on
applying real-world goals to solving it ... ;)


> > You're the one who suggested talking about "locking device
> > nodes" rather than saying exactly how it was done ... very
> > specifically to move away from saying it means down(serialize).
> > Have you changed your mind on that?
> 
> No.  When I wrote the preceding message I was speaking in the context
> of the current kernel code, which doesn't include usb_lock_device().
> When that routine gets merged back in, I would rather say that locking
> device nodes means calling usb_lock_device() for the first lock a task
> acquires and doing down(serialize) for subsequent nested locks.  At
> any rate, it definitely does not mean calling locktree() -- in fact it
> refers to the individual locking operations that comprise locktree().

Hmm, so #define usb_lock_device(x) locktree(x) would fit?


> > > Since suspend() and  
> > > resume() are the two most problematic subtree operations, here are some 
> > > explicit algorithms to provide a counterexample to your statement:
> > 
> > Ones that don't use the simple down() scheme you've been
> > talking about, note -- they added a "must own parent lock"
> > rule.  Except (!!) when getting the first lock; ergo, you've got
> > those "tasks interfere with each other" problems, which comes
> > from thosee exceptions-to-rule.  :)
> 
> Hang on there!  They _do_ use the simple down() scheme -- at least,
> they don't call locktree() anywhere!  As for "must own parent lock"
> rules:

Yeah, you've talked about a few different policies, it's hard to
keep them straight.  Especially in that last note, where I responded
with respect to the first policy, and you proposed a second one;
so I had to talk about both policies in one response.


>     (a) The algorithms have no such rule; it's not clear why you
>       say they do. 

Your pseudocode showing the parent lock getting grabbed
before the child lock, and other comments you've made!!
Certainly in any case where you grab parent and child locks,
you grab parent first.


>     (b) What's wrong with requiring callers to hold a parent's lock?

Nothing much, at least for code inside usbcore.


> (c)   After some more thought, I decided it would be better to
>       require the caller to own the lock on the topmost device
>       being suspended or resumed.  

That's what the usbcore-internal routine (__usb_suspend_device)
does.  When called from sysfs, we know that the caller does NOT
have that lock.


> P.S.: Ironically, even though I object to locktree in part because of
> the unnecessary delays it can cause, I wouldn't mind so much getting
> rid of usbdev->serialize entirely and just using a single semaphore to
> lock an entire bus.  

Well, those "unnecessary" delays relate entirely to task serialization;
and you seem to have restored a few of them by moving towards
algorithms that use multiple locking passes!

I think having a single lock is a bit far from being fully workable
right now, but "grand simplifications" often look interesting!

When I think about things like that at this level, I tend to think about
wanting a per-device spinlock in the driver model, used to control
movement to/from various lists.  

- Dave





-------------------------------------------------------
SF.Net email is sponsored by Shop4tech.com-Lowest price on Blank Media
100pk Sonic DVD-R 4x for only $29 -100pk Sonic DVD+R for only $33
Save 50% off Retail on Ink & Toner - Free Shipping and Free Gift.
http://www.shop4tech.com/z/Inkjet_Cartridges/9_108_r285
_______________________________________________
[EMAIL PROTECTED]
To unsubscribe, use the last form field at:
https://lists.sourceforge.net/lists/listinfo/linux-usb-devel

Reply via email to