On Monday 16 August 2004 9:56 am, Alan Stern wrote: > [Email can sometimes be a very frustrating medium. It's fair to say > that you and I are both very intelligent people, and yet we've been > talking past each other in this thread to an almost comical extent. I > bet we could have settled the whole thing very simply in less than an > hour or two of face-to-face contact.]
Maybe ... it's not a simple locking problem though. Doctoral theses have been gotten in such areas, though I hope none recently! (Now they just give spurious software patents for trivial variants of old schemes.) > > Consider a task suspending from a root hub: > > > > SUSPEND( root{ a{v,w}, b{x,y}, c } ) > > > > When V issues a wakeup after A's been suspended, and > > while B is being suspended, the no-locktree behavior is > > to wake A up ... whoa, khubd just interfered with the task > > that was suspending!! Likewise with other operations. > > I don't know what you mean by my "no-parent-locking rule". Maybe it > stems from a misunderstanding of something I said earlier, leading to > this current misconception? What misconception? It's how one defines the locking policy in use; you certainly talked about such a policy early on (though now you're talking about a different one). Here are some policies that have come up: - "no-parent" -- just grab udev->serialize, and you get per-device serialization. But whole-subtree tasks, rare until trees start suspending/resuming, interfere with each other unless you switch to multi-pass algorithms. (You refined this one a bit; see below.) - "single parent" -- grab udev->parent->serialize before grabbing udev->serialize, and you synchronize on the hub in certain cases. That means some of the task interference problems in the "no-parent" policy got a lot more rare (most systems don't have many hubs). - "lock parent" -- apply "single" parent all the way up to the root hub (this is what locktree does), and you get task-level serialization for all operations in that tree. Those "rare" cases are now provably gone. A variant of "lock parent" would designate some lock as applying over the root hub locks ... one of your patches had something like that. You've recently discussed "no-parent" variants that fill holes by adding extra passes to grab (or release) lots of locks. I'll confess I don't like algorithms holding extra locks (like that!) ... that's one reason locktree() drops the parent lock as soon as possible, and I've talked about algorithms that can do the same thing. > The remote wakeup request from V, relayed > through A, is presented to khubd as the USB_PORT_STAT_C_SUSPEND bit in > the root hub's port status. So khubd won't see the wakeup request > until the suspend task is complete, because khubd won't read the root > hub's port status until it can lock the root hub. And actually the way resume (and wakeup) works, that'd apply to any already-suspended hub ... so long as any non-suspended parent is locked. Compared to the algorithm you're now presenting, the locktree() based one does however eliminate the need for an extra "go back and drop the locks" pass. (And compared to the first algorithm I thought you were describing: that first one didn't lock non-suspended parents.) > There is a potentially more dangerous variant of this scenario. > Suppose instead of V issuing a remote wakeup request that some task > decides to resume A. Under the current implementation (and after > replacing locktree() with down(serialize)) the suspend task would have > released A's lock already, so the resume would proceed and would wake > A up. In the algorithm I presented A remains locked until its parent > (the root) is suspended; the resume would be forced to wait and then > would fail because the clear_port_feature(USB_PORT_FEAT_SUSPEND) > message to the root hub would be rejected. I wasn't quite clear that you were discussing two different algorithms as well as the locktree() one. One of your algorithms handles that case better than the other (adding the extra pass). What you say is incorrect with respect to locktree()-based suspend; see above, all non-suspended parents stay locked. > The basic message of this analysis is that I'm saying tasks like > suspend should keep devices locked for as long as they want to prevent > other tasks from accessing them. Your implementation relies on > resume() using locktree() Any implementation is going to rely on all tasks applying SOME locking policy consistently. You're just advocating a different policy, which requires additional locking passes for correctness, because to you: > ... My algorithm... seems more straightforward. Whereas the algorithm I'm describing is a kind of classic example applying lock chaining in a lock hierarchy. I'd be more likely to call it "elegant" than anything else. The lock hierarchy is straightforward, the chaining is the nice way to make sure concurrency is usually possible. > > And I don't see complication at all, except when trying to get > > potential performance optimizations by switching to fancier > > algorithms. How is the "lock parent then child" policy at all > > complicated? > > It's more complicated than "lock child" is. The "complication" seems to be entirely because the lock hierarchy is (a) defined, (b) followed everywhere. > Imagine, for the sake of > argument, that the __usb_suspend_device()/hub_suspend() code is > reorganized so that children aren't unlocked until after their parent > is suspended. (It wouldn't be such a bad idea to go ahead and > actually make that change, regardless of anything else...) I don't want to add multi-pass algorithms unless they're unavoidable; so I'd argue instead "yes it WOULD be such a bad idea...". > Now > imagine that a year from now some programmer who hasn't read the email > archives Or the comments in the code explaining how all that does is implement the locking hierarchy in a convenient way (OK maybe those still need to be written) ... > comes along and notices this strange locktree() routine > sitting there in hub.c. He also notices that if every call to > locktree(u) were replaced with usb_lock_device(u) then everything > would work just as correctly as before. Who could blame him for > making the replacements and removing locktree()? There _is_ no usb_lock_device() call now (in 2.6.8) ... though at one point I thought we were actively discussing having one, with its implementation looking like locktree(). One of the things I like most about locktree() is that the locking hierarchy itself is so simple: lock parent first. Since root hubs have no parent, that's where the recursion bottoms out. Everything else is a natural consequence of following that lock hierarchy. > I don't understand the distinction you're making between "task" > serialization and "device" serialization. If two tasks work on possibly disjoint trees, each with a bunch of devices (including hubs), the difference shows up with all devices in the intersection of those trees. For any device, one task was first ... that's a guarantee from device serialization. Task serialization goes further, and guarantees that the _same_ task was always first. > _How_ does my approach allow two tasks to interfere with each other? The latest variant you're discussing -- with extra passes to grab and release locks -- doesn't. The version you first described did allow it, though. > > The way locktree() prevents them is by _consistently_ implementing the > > parent-before-child locking hierarchy. If that policy is implemented > > consistently, it's equivalent to locktree. Inconsistent locking > > policy implementation ... let's not go there! > > My approach _never_ involves locking a parent while holding a child's > lock, so at least it's opposed to parent-after-child. Is that enough > to make my approach is equivalent to locktree? :-) Not equivalent in terms of work; it adds extra passes. Or concurrency ... it grabs some locks earlier than they're needed. > More likely you meant "always lock a device's parent before locking > the device". I don't see any point to this. In general, you lock a > device to prevent other tasks from interfering while you use the > device ... Or because that's the lock hierarchy defined for these objects, and that's just how it always works for devices on the interior of the device tree. (Which include non-USB devices, like ones from the SCSI and block subsystems, as you've noted.) > > > Now here's another problem. The fact is, subtree operations in general > > > can't safely use your prefix-traversal locking pattern. You would run > > > into problems in step (d), because once the parent is unlocked new > > > children can be added. > > > > Which by definition is perfectly OK, otherwise you couldn't have > > been safe using the postfix locking pattern ... :) > > This is a more complicated issue than your description would indicate. > If the set of nodes in a subtree can change (because children are > added) during a traversal, you have to specify very carefully which > nodes you want to visit. ... That's all part of the algorithm definition, as I said. > So you're not totally against the idea of multi-pass algorithms! :-) > It sounded like you were, from some other things you said. Given that "SOP" example wasn't real-world, I wasn't keen on applying real-world goals to solving it ... ;) > > You're the one who suggested talking about "locking device > > nodes" rather than saying exactly how it was done ... very > > specifically to move away from saying it means down(serialize). > > Have you changed your mind on that? > > No. When I wrote the preceding message I was speaking in the context > of the current kernel code, which doesn't include usb_lock_device(). > When that routine gets merged back in, I would rather say that locking > device nodes means calling usb_lock_device() for the first lock a task > acquires and doing down(serialize) for subsequent nested locks. At > any rate, it definitely does not mean calling locktree() -- in fact it > refers to the individual locking operations that comprise locktree(). Hmm, so #define usb_lock_device(x) locktree(x) would fit? > > > Since suspend() and > > > resume() are the two most problematic subtree operations, here are some > > > explicit algorithms to provide a counterexample to your statement: > > > > Ones that don't use the simple down() scheme you've been > > talking about, note -- they added a "must own parent lock" > > rule. Except (!!) when getting the first lock; ergo, you've got > > those "tasks interfere with each other" problems, which comes > > from thosee exceptions-to-rule. :) > > Hang on there! They _do_ use the simple down() scheme -- at least, > they don't call locktree() anywhere! As for "must own parent lock" > rules: Yeah, you've talked about a few different policies, it's hard to keep them straight. Especially in that last note, where I responded with respect to the first policy, and you proposed a second one; so I had to talk about both policies in one response. > (a) The algorithms have no such rule; it's not clear why you > say they do. Your pseudocode showing the parent lock getting grabbed before the child lock, and other comments you've made!! Certainly in any case where you grab parent and child locks, you grab parent first. > (b) What's wrong with requiring callers to hold a parent's lock? Nothing much, at least for code inside usbcore. > (c) After some more thought, I decided it would be better to > require the caller to own the lock on the topmost device > being suspended or resumed. That's what the usbcore-internal routine (__usb_suspend_device) does. When called from sysfs, we know that the caller does NOT have that lock. > P.S.: Ironically, even though I object to locktree in part because of > the unnecessary delays it can cause, I wouldn't mind so much getting > rid of usbdev->serialize entirely and just using a single semaphore to > lock an entire bus. Well, those "unnecessary" delays relate entirely to task serialization; and you seem to have restored a few of them by moving towards algorithms that use multiple locking passes! I think having a single lock is a bit far from being fully workable right now, but "grand simplifications" often look interesting! When I think about things like that at this level, I tend to think about wanting a per-device spinlock in the driver model, used to control movement to/from various lists. - Dave ------------------------------------------------------- SF.Net email is sponsored by Shop4tech.com-Lowest price on Blank Media 100pk Sonic DVD-R 4x for only $29 -100pk Sonic DVD+R for only $33 Save 50% off Retail on Ink & Toner - Free Shipping and Free Gift. http://www.shop4tech.com/z/Inkjet_Cartridges/9_108_r285 _______________________________________________ [EMAIL PROTECTED] To unsubscribe, use the last form field at: https://lists.sourceforge.net/lists/listinfo/linux-usb-devel