On Thu, 2011-07-21 at 21:00 -0700, Linus Torvalds wrote: > On Thu, Jul 21, 2011 at 8:52 PM, Ian Kent <ra...@themaw.net> wrote: > > > > Oh, sorry my bad, obviously I missed the significance of moving the > > inode update, I still don't get it, so I'll have a look at the > > surrounding code again. > > So it's admittedly pretty subtle. The issue is that during the RCU > pathname walk, since we (very much on purpose) do not take any d_lock > or even increment the dentry count, the d_inode field is not stable.
Thanks very much for the explanation. In hindsight, the specific case I was addressing was for a dentry that has no mount on it so I shouldn't have changed the inode assignment anyway, oops. > > So when we read from dentry->d_inode, we need to re-check the sequence > count afterwards. > > The exceptions ends up being: > > - In __d_lookup_rcu(), we carefully load d_inode (and check the > name), and verify the dentry sequence number afterwards with the > proper memory barriers, so the *inode value that __d_lookup_rcu() > returns is valid. > > - for mount-points (ie at the *end* of the loop, after we've checked > it), we can do the load, because those dentries are guaranteed to be > pinned. > > - after we've turned the speculative RCU walk into a non-speculative > walk with 'unlazy_walk()' (ie we've incremented the dentry refcount > and we've checked the sequence count), d_inode is stable again. So we > can trust dentries once we've fully done the walk and finished the > whole link_path_walk() or whatever. > > Those exceptions (especially the last one) cover *most* of the kernel. > But __follow_mount_rcu() is solidly in the speculative RCU path walk > area, which is why reading d_inode was buggy there without the extra > tests for pinning. > > Linus _______________________________________________ stable mailing list stable@linux.kernel.org http://linux.kernel.org/mailman/listinfo/stable