On Tue, 10 Sep 2002, Don Lewis wrote:

> >>> > The changes are obviously just cleanups for leaf file systems, but I
> >>> > wonder why everything wasn't always locked at the top.  Could it have
> >>> > been because locking all the way down is harmful?
> 
> >> I think this is because some places are still worrying about using
> >> link(2) on directories.  I think there aren't any significant problems
> >> when vp is not a directory.  Hard linking of directories was permitted
> >> (and perhaps even supported) until relatively recently:
> > 
> > I suspect that you are right about this.  If so, it should be able to
> > let namei() lock the first vnode since we check its type and bail out if
> > it is a directory before the second namei() call.
> 
> I looked at namei() and ufs_lookup() and found a problem with telling
> the first namei() to grab the lock.  It appears that the *_lookup() 
> routines always lock the child vnode, and namei() later releases the
> lock if LOCKLEAF is not set.  If we passed LOCKLEAF to the first
> namei(), then link("/foo/bar", "/foo/bar") would attempt to lock "bar" 
> twice.

This also violates the "directory vnodes before file vnodes" lock order.
Generally speaking, you must always lock directories before files, since
files are always leaves, and directories may not be.  The patch you have
looks fine to me, since it grabs a reference to the file but doesn't lock
it until after the directory look is held (and guarantees it's a leaf node
by checking VDIR so that it's safe to do the locking later).

> Because we now bail out early if the first vnode turns out to be a
> pointer, I believe we can dispense with the vnode inequality test and
> always grab the lock. 

Yeah.  vp != VDIR, and dvp == VDIR due to earlier constraints, so we know
that vp != dvp.

> BTW, is it safe to call ASSERT_VOP_UNLOCKED() in the SMP case after the
> reference has been dropped with vput() or vrele()? 

Well, all the VFS code currently runs under Giant so a certain class of
unsafe cases is avoided, but generally no: ASSERT_VOP_UNLOCKED() invokes a
vnode operation to determine if the vnode is locked or not, which
dereferences fields in the vnode that require a valid reference to use. 
Because none of the code there currently blocks, it's not an issue
generally right now (as neither vnode is likely to have been deleted
during the link operation, so releasing the reference is unlikely to
result in chances to the vnode).  It's worth fixing though.

> Here's a replacement for the kern_link() patch in my previous patch
> file:

Looks good to me.  I'm not familiar enough with unionfs and stacking to
provide an effective review of that code, however.  Maybe Jeff can give us
some comments?  (unionfs is actually the reason I stalled in making the
same changes).

> Index: vfs_syscalls.c
> ===================================================================
> RCS file: /home/ncvs/src/sys/kern/vfs_syscalls.c,v
> retrieving revision 1.285
> diff -u -r1.285 vfs_syscalls.c
> --- vfs_syscalls.c    1 Sep 2002 20:37:28 -0000       1.285
> +++ vfs_syscalls.c    10 Sep 2002 11:28:01 -0000
> @@ -1027,10 +1027,12 @@
>               if (nd.ni_vp != NULL) {
>                       vrele(nd.ni_vp);
>                       error = EEXIST;
> -             } else {
> +             } else if ((error = vn_lock(vp, LK_EXCLUSIVE | LK_RETRY, td))
> +                 == 0) {
>                       VOP_LEASE(nd.ni_dvp, td, td->td_ucred, LEASE_WRITE);
>                       VOP_LEASE(vp, td, td->td_ucred, LEASE_WRITE);
>                       error = VOP_LINK(nd.ni_dvp, vp, &nd.ni_cnd);
> +                     VOP_UNLOCK(vp, 0, td);
>               }
>               NDFREE(&nd, NDF_ONLY_PNBUF);
>               vput(nd.ni_dvp);
> 
> 
> 


To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-current" in the body of the message

Reply via email to