On Wed, Mar 19, 2014 at 10:46:15AM -0500, Andrew Deason wrote: > On Tue, 18 Mar 2014 16:01:26 -0400 > "J. Bruce Fields" <[email protected]> wrote: > > > > CC'ing J. Bruce Fields, if he has anything to say about this. > > > d_materialise_unique still seems like it would be problematic to me, > > > as I mentioned earlier, since using it can result in vfs_rmdir'ing a > > > dentry where our parent inode does not match the dentry's > > > d_parent->d_inode, which can panic may_delete. But it's very > > > possible I am missing something here. > > > > Both the d_move case of d_materialise_unique and the vfs_rmdir take the > > i_mutex on the parent directories involved, so they'll be serialised > > with respect to each other and I don't think you should see that panic. > > Yes, I see. > > > The one nit is that d_materialise_unique depends on a trylock for that > > so whoever does the lookup can get a spurious -EBUSY if they're unlucky. > > Via __d_unalias, yeah, that looks like it would make it unsuitable. But > the restrictions on directory loops maybe makes this unsuitable anyway. > > Is the reasoning for the trylocks just lock ordering?
Yes, if we waited for those locks at this point in the code, we could deadlock, since the lock-holder we're waiting on might be waiting on the i_mutex of the parent directory, which we already hold. I haven't thought very hard about how to fix that yet. > It seems odd to me that lookups for anything using > d_materialise_unique can seem to just randomly fail. Agreed, I think this hasn't been fixed just because it's very rarely hit in practice (in the absence of multiply-linked directories like you have). We can reproduce the failure on NFS with artificial testing. It requires an uncached lookup to happen at the same time as a cross-directory rename from another client. > > But it sounds like you can avoid these multiply-linked directories by > > creating mountpoints at the appropriate places, as the NFS client > > does. > > We are also pursuing this approach (again) ever since the recent RHEL > "regression" was reported. This is very difficult to implement while > avoiding GPLONLY symbols, though, as we must perform all of the mounts > and umounts from userspace. It seems possible, but it just takes a lot > of work to avoid numerous issues, and will take some time. > > I don't suppose there's any way the GPLONLY restriction on those > interfaces could change? Or do you have any other suggestions or > anything as to an approach we should be taking? I don't know really know that code, apologies. My only suggestion would be to look at the in-kernel NFS and AFS code and see how it does this. As for GPL_ONLY symbols I think it's certainly worth compiling a list and asking. Obviously the kernel community (myself included!) would be much happier to see effort invested on upstream code. But obvious oversights (d_materialise_unique looks like one) do get fixed. --b. _______________________________________________ OpenAFS-devel mailing list [email protected] https://lists.openafs.org/mailman/listinfo/openafs-devel
