On Wed, Jun 12, 2013 at 05:01:19PM -0700, Linus Torvalds wrote: > I'd actually suggest we do *not* remove any existing d_lock usage > outside of the particular special cases we want to optimize, which at > least from Davidlohr's profile is just dput() (which has shown up a > lot before) and dget_parent() (which I'm not sure why it happens so > much on his load, but it really seems trivially safe to optimistically > do under just the RCU lock).
Actually, dget_parent() change might be broken; the thing is, the assumptions are more subtle than "zero -> non-zero only happens under ->d_lock". It's actually "new references are grabbed by somebody who's either already holding one on the same dentry _or_ holding ->d_lock". That's what d_invalidate() check for ->d_count needs for correctness - caller holds one reference, so comparing ->d_count with 2 under ->d_lock means checking that there's no other holders _and_ there won't be any new ones appearing. Consider the following situation: X is dentry of a/b Y is dentry of a/b/c Z is dentry of d/e A holds a reference to Y and enters dget_parent(Y) B holds a reference to X and enters d_invalidate(X) A picks the value of Y->d_parent (== X) C moves Y to Z B grabs ->d_lock on X B checks X->d_count; it's 1, we deduce that no other references exist or are going to appear A does atomic_inc_not_zero(&X->d_count). And since it's not zero (it's 1, actually), we've just grabbed an extra reference on X that was not going to appear according to B... > That said, I do wonder if we could do something like > "atomic_inc_not_zero()" on the d_count, and only if it is zero (which > won't be horribly unusual, since for leaf dentries that nobody else is > using) we'd do the whole locking sequence. Same correctness issue as above, I'm afraid... > End result: I think it would be interesting to try this all out, and > it could be a noticeable win under some cases, but it *definitely* > needs a lot of timing and testing to see which ways it goes.. *nod* What's more, we need the underlying assumptions documented very clearly for any such change; it's _not_ as simple as "protect transitions from zero to non-zero and we are done" ;-/ -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [email protected] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/

