On Tue, Mar 11, 2014 at 11:33:37AM -0500, Andrew Deason wrote: > [Moving from openafs-info to openafs-devel.] > > On Tue, 11 Mar 2014 11:45:47 -0400 > "J. Bruce Fields" <bfie...@redhat.com> wrote: > > > - d_materialise_unique: which was originally designed which > > distributed filesystems in mind, which have the special issue > > that the directory tree can be modified by other nodes without > > their knowledge, so they need to be ble to deal with lookup > > encountering an inode which was previously known under some > > other name. > > I'm not clear on if this really helps, as I'm not completely sure what > the effects of this are, but maybe you can clarify. It seems to just > 'move' the dentry from one location to another, but I'm not sure if > that's all. Say we have the following AFS mountpoints (which appear as > directories, not e.g. vfsmounts or anything like that). These can be > anywhere in the hierarchy: > > /afs/.../parent1/dir1 > /afs/.../parent2/dir2 > > 'dir1' and 'dir2' are AFS mountpoints to the same place. So, they are > effectively the same directory. > > So if someone accesses dir1, say they have a reference to dentry > 0x1234, which has d_name "dir1" and d_parent pointing to parent1. > And say someone accesses dir2 while dir1 is still held open. With > d_materialise_unique, it looks like they also get dentry 0x1234, but it > is 'moved' so now its d_name is "dir2" and its d_parent points to > parent2. Is that correct?
That's right. (There's a yuchy corner case which makes this a little unreliable: the lookup operation already has the i_mutex on the parent directory, and we need other locks to do the move succesfully, but we would have needed to acquire them in a different order--so d_materialise_unique instead makes nonblocking attempts to get those locks (these are the mutex_trylock's in __d_unalias) and returns -EBUSY if that fails. So you can get EBUSY on the lookup if there happens to be, say, a rename going on at the same moment.) > It seems like that does not get rid of the problem, since someone (call > it P1) could lookup dir1, and get back dentry 0x1234. Then someone else > looks up dir2 and renames/moves dentry 0x1234. P1 now has a reference to > dentry 0x1234, whose d_parent does not match the parent1, the directory > where we looked up 'dir1'. As we've seen, that can cause a panic on the > box (I'm glossing over how, if it's not readily apparent I can go into > more detail.) > > I would also guess this causes an ELOOP error when trying to access a > mountpoint inside itself. That is maybe not the worst thing in the > world, but does preclude access which was previously allowed. > > And just to be clear, I wasn't really expecting there to be some other > function to call that would make all of this "okay". I am working on > something to use actual mount points (that is, not "AFS mountpoints", > but 'struct vfsmount'/'struct mount' mountpoints). It is just a little > difficult since we cannot use the builtin kernel functions to create new > mountpoints. Right, the vfs depends on the assumption that the directory tree is really a tree. d_splice_alias as currently implemented can violate that assumption by creating multiple dentries pointing to the same directory, but that really can cause deadlocks down the road, so it's a bug we intend to fix. Other filesystems, like NFS, do as you describe and create mountpoints. --b. _______________________________________________ OpenAFS-info mailing list OpenAFS-info@openafs.org https://lists.openafs.org/mailman/listinfo/openafs-info