On Friday 06 August 2004 09:39, Alan Stern wrote: > On Wed, 4 Aug 2004, David Brownell wrote: > > > > No, no. Preventing tasks from interfering with another is absolutely > > > necessary. I'm saying that even if hub_events() called > > > down(&hdev->serialize) instread of locktree(), we would still have that > > > prevention. > > > > No we wouldn't. Consider a task suspending a tree; now khubd > > comes in and wakes something up that's already been carefully > > suspended. You're allowing interference ... encouraging it. > > In fact that can't happen; khubd can't wake up suspended devices.
Except for its remote wakeup path... :) > > Yet I've demonstrated a locking policy that works the same in > > all those cases, without claiming that, for example, reset() is a > > special case ... > > Your policy isn't correct, or at least, it doesn't do what you seem to be > claiming it does. > > To see why, let's go back to first principles. You are concerned about > "non-data signalling" -- which I take to mean any action that, when > applied to a hub, can affect the entire subtree of devices below that > hub. (There's not much point in considering the subtree of devices below > a non-hub, which is why I use this phrasing.) A better name might be > "subtree operations"; if you don't mind I'll start using that name in > place of the other. I started out talking about tree/subtree operations, but that seemed to invite misunderstandings ... which is why I also started to talk about a class of operations that required subtree operations. Namely, those which aren't sending differential/data signaling down the wires. > When you think about it, lots of things qualify as subtree operations. > ... That's why my thoughts have tended that way ... :) > Faced with this analysis, you may agree and go on to say that it simply > shows that _all_ our locking should use locktree(). Or I might just say that I want the locking rules to be simple, and not dominated by an armload of easy-to-misunderstand special case exceptions ... in fact I'm very sure I said that! We may come up with cases where exceptions are needed, but let's look at those cases as they appear. So far you've not shown me any cases where they're needed. (Though there's still that subsys.rwsem problem, which seems to me not unlike a driver model design flaw. Not USB-specific.) > But that's not right, > because your procedures for suspend/resume -- which _do_ use locktree() -- > don't accomplish what you want. That is, they don't provide > general-purpose serialization with no need to consider special cases. > > > To understand why this is so, let's use a particular example. Take this > hypothetical device tree: > > [snip!] > > While all that is going on, suppose task T2 decides to suspend the entire > tree starting at A. It uses locktree() to lock A. Then it locks B, then > it locks, suspends, and unlocks C & D. Then it suspends and unlocks B. > And so on... It doesn't block until it tries to lock G, at which point it > has to wait for T1 to finish up. Right. T1 and T2 are fully sequenced; T1 always completes before T2; they can't interfere with each other. They may be partially concurrent. > Observe: T1 and T2 are operating concurrently on overlapping subtrees. > At the same time the T1 is suspending H, I, and G, T2 might be suspending > C, D, and B. This example demonstrates that your algorithm allows > concurrency; it does _not_ achieve complete serialization. Not true. T1 is always serialized with respect to T2; there will never be a case where T2 can undo something T1 did before T1 completes, or vice versa. That's a fairly standard definition of task serialization; I don't know what kind of serialization you're referring to. > One might claim that this isn't so bad, that the intermediate stage of T2 > when it has suspended C, D, and B (for example) means that it has been > operating on a sub-subtree disjoint from the H,I,G sub-subtree. That sort > of argument doesn't wash when considering resume(). Since resume() works > on devices from the top down, during its intermediate stages the devices > it has affected don't even form a subtree! The approach I suggested for khubd -- in case that potential performance issue makes trouble -- applies here too. See below for more comments on that; it generalizes nicely. Again, this seems like a case of "want" a special case, not "need". > Now suppose for the sake of argument there was some subtree operation > (call it SOP) which had to prevent other tasks from working on devices > _before_ SOP had touched them but which was okay for concurrent execution > on devices _afterwards_.... Tree locking handles that by dropping the lock as soon as it's no longer needed. Consider the three classic kinds of tree walk, and their different locking requirements: - prefix (parent first then children) ... task can drop parent's lock as soon as it has (a) visited the parent, (b) locked the children - postfix (parent after each child) ... pretty common, must hold lock until after visiting children. All tree traversals can lock like this, modulo performance tradeoffs. - infix (parent after each child) ... not all that common, must hold lock until after last child (like postfix). I've already described how to make khubd use a prefix walk to get rid of that one performance issue that bothered you ... SOP could use the same policy. Note that suspend() is a good example of a postfix walk, and resume() could be implemented as a prefix walk. Key point: so long as locktree() is used to get the first lock, the kind of tree traversal used during the tree operation need not be visible to other tasks. Because locktree() established the basic task-level serialization. And again: using an unlock-early policy with a prefix walk is fundamentally a performance issue, not a correctness one. > The code used by suspend won't work for SOP. T2 running SOP starting at > A, for example, would need to explicitly lock all the nodes before it > could start doing anything. Otherwise the possibility would exist that > the previously-started T1 could still be doing something below G, which > would be bad. Oh, so there's an _additional_ constraint on SOP: it really doesn't like concurrent operations. Sounds to me like SOP is just an academic example then ... normally one wants it to not matter whether task T1 completes before T2 starts. Designs usually quite work hard to avoid "one lock to rule them all" models ... :) Is there a real-world use case behind SOP? (It's clearly not "Standard Operating Procedure" despite its TLA.) > Furthermore, once T2 finished applying SOP to some nodes > and they were available for other work, there wouldn't be any way for > another task to do anything to those nodes until T2 released its lock > on A. So legal concurrency would be ruled out. I described prefix locking several email messages ago, as the way to address the (potential) performance issue that bothered you in the khubd context. SOP could do the same; it addresses this kind of issue. > Concurrency is allowed whenever it is _safe_ (which must be > determined according to the nature of the particular operations > acting concurrently). Serialization is achieved simply by > locking device nodes. And if "locking device nodes" uses locktree(), or requires holding the parent node's lock, you're not saying anything different from what I've said. If there's nothing like locktree() to manage the basic task serialization, I'll have to disagree on basic principles. See the earlier parts of this message ... something like it is needed to prevent two tasks from interfering with each other, just grabbing locks from the middle of the tree can NEVER provide task serialization. > These algorithms provide safe concurrency, exactly when and as desired. > Each device is locked precisely during the time that another subtree > operation on it cannot be allowed. And if you accept that locks after the initial one aren't acquired by locktree, but by the task performing the operation, I don't see any issue either. Some tasks (like khubd and resume) could use a prefix tree traversal, and drop locks earlier than others that need postfix traversal (like suspend). - Dave ------------------------------------------------------- This SF.Net email is sponsored by OSTG. Have you noticed the changes on Linux.com, ITManagersJournal and NewsForge in the past few weeks? Now, one more big change to announce. We are now OSTG- Open Source Technology Group. Come see the changes on the new OSTG site. www.ostg.com _______________________________________________ [EMAIL PROTECTED] To unsubscribe, use the last form field at: https://lists.sourceforge.net/lists/listinfo/linux-usb-devel