Tejun Heo <[EMAIL PROTECTED]> writes: > Hello, > > Tejun Heo wrote: >> Al Viro wrote: >>> On Sat, Jan 05, 2008 at 11:30:25PM +0900, Tejun Heo wrote: >>>>> Assuming that this is what we get, everything looks explainable - we >>>>> have sysfs_rename_dir() calling sysfs_get_dentry() while the parent >>>>> gets evicted. We don't have any exclusion, so while we are playing >>>>> silly buggers with lookups in sysfs_get_dentry() we have parent become >>>>> negative; the rest is obvious... >>>> That part of code is walking down the sysfs tree from the s_root of >>>> sysfs hierarchy and on each step parent is held using dget() while being >>>> referenced, so I don't think they can turn negative there. >>> Turn? Just what stops you from getting a negative (and unhashed) from >>> lookup_one_noperm() and on the next iteration being buggered on >>> mutex_lock()? >> >> Right, I haven't thought about that. When sysfs_get_dentry() is called, >> @sd is always valid so unless there was existing negative dentry, lookup >> is guaranteed to return positive dentry, but by populating dcache with >> negative dentry before a node is created, things can go wrong. I don't >> think that's what's going on here tho. If that was the case, the >> while() loop looking up the next sd to lookup (@cur) should have blown >> up as negative dentry will have NULL d_fsdata which doesn't match any sd. >> >> I guess what's needed here is d_revalidate() as other distributed >> filesystems do. I'll test whether this can be actually triggered and >> prepare a fix. Thanks a lot for pointing out the problem. > > This can't happen because lookup of non-existent entry doesn't create a > negative dentry. The new dentry is never hashed and killed after lookup > failure, the above scenario can't happen. > > That said, the mechanism is a bit too fragile. sysfs currently ensures > that dentry/inode point to the associated sysfs_dirent. This is mainly > remanent of conversion from previous VFS based implementation. I think > the right thing to do here is to make sysfs behave like other proper > distributed filesystems using d_revalidate.
Huh? We still need something like sysfs_get_dentry to find the dentries for the rename or move operation. So we can call d_move. Further we should be talking about sysfs_move_dir not sysfs_rename_dir as only the networking code calls sysfs_rename_dir. Not that it is significantly different. I think I saw a change in lock ordering recently to help distributed filesystems, and maybe that will simplify some of this. Anyway I figure if I am to understand this I better go look and see if I can find the start of this thread. I don't have a clue where we are blowing up. Eric -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/

