I've gotten a couple of "lockmgr: locking against myself" panics today, and
they seem to be somewhat reproducible.  I had the serial console hooked
up for the last one, so I was able to get a stack trace:


panic: lockmgr: locking against myself
Debugger("panic")
Stopped at      Debugger+0x45:  xchgl   %ebx,in_Debugger.0
db> tr
Debugger(c0449abc) at Debugger+0x45
panic(c0447760,215,0,c6cd7de0,10002) at panic+0x7c
lockmgr(c6cd7ea8,10002,c6cd7de0,c6381f00,c6cd7de0) at lockmgr+0x412
vop_sharedlock(e4b90b58,c047c440,c6cd7de0,1010002,c6381f00) at vop_sharedlock+0x6e
vn_lock(c6cd7de0,10002,c6381f00) at vn_lock+0xb4
vrele(c6cd7de0,c6cd7de0,0,c681cb00,c6381f00,1) at vrele+0x8d
nfs_inactive(e4b90bdc,c047c3c0,c6cd7de0,c6381f00,c6381f00) at nfs_inactive+0xbd
VOP_INACTIVE(c6cd7de0,c6381f00) at VOP_INACTIVE+0xa0
vrele(c6cd7de0,c6834474,c6834474,e4b90c38,c02eed4a) at vrele+0x9b
vn_close(c6cd7de0,2,c681cb00,c6381f00,e4b90c70) at vn_close+0x37
vn_closefile(c6834474,c6381f00) at vn_closefile+0x1e
fdrop_locked(c6834474,c6381f00,c04fe8ac,0,c0446020) at fdrop_locked+0x102
fdrop(c6834474,c6381f00,c028a1ec,c6381f00,c6377c8c) at fdrop+0x24
closef(c6834474,c6381f00,c6834474,c6c0d3f8,c6381f00) at closef+0x77
close(c6381f00,e4b90d14,1,b9,202) at close+0x107
syscall(2f,2f,2f,82124cc,23) at syscall+0x243
Xint0x80_syscall() at Xint0x80_syscall+0x1d
--- syscall (6, FreeBSD ELF32, close), eip = 0x4862ff3b, esp = 0xbfbff2dc, ebp = 0xbfbff358 ---

db> show lockedvnods
Locked vnodes
0xc6cd7de0: type VREG, usecount 0, writecount 0, refcount 0, flags (VV_OBJBUF)
        tag VT_NFS, fileid 112174 fsid 0x100ff02


It looks like the call to vrele() from vn_close() is executing the
following code:

        if (vp->v_usecount == 1) {
                vp->v_usecount--;
                /*
                 * We must call VOP_INACTIVE with the node locked.
                 * If we are doing a vput, the node is already locked,
                 * but, in the case of vrele, we must explicitly lock
                 * the vnode before calling VOP_INACTIVE.
                 */ 
                if (vn_lock(vp, LK_EXCLUSIVE | LK_INTERLOCK, td) == 0)
                        VOP_INACTIVE(vp, td);
                VI_LOCK(vp);
                if (VSHOULDFREE(vp))
                        vfree(vp);
                else
                        vlruvp(vp);
                VI_UNLOCK(vp);

vrele() has decremented v_usecount to 0, grabbed an exclusive lock on the
vnode, and called VOP_INACTIVE().

It looks like nfs_inactive() is executing the following code:

        if (sp) {
                /*
                 * We need a reference to keep the vnode from being
                 * recycled by getnewvnode while we do the I/O
                 * associated with discarding the buffers unless we
                 * are being forcibly unmounted in which case we already
                 * have our own reference.
                 */
                if (ap->a_vp->v_usecount > 0)
                        (void) nfs_vinvalbuf(ap->a_vp, 0, sp->s_cred, td, 1);
                else if (vget(ap->a_vp, 0, td))
                        panic("nfs_inactive: lost vnode");
                else {
                        (void) nfs_vinvalbuf(ap->a_vp, 0, sp->s_cred, td, 1);
                        vrele(ap->a_vp);
                }

Because v_usecount was 0, nfs_inactive() calls vget(), which increments
v_usecount back to 1, then calls nfs_vinvalbuf(), and finally calls
vrele() again.

Because v_usecount is 1 again, this second vrele() re-executes the same
code and blows up when it tries to grab the vnode lock that this thread
already holds exclusively.  If that didn't cause a panic, VOP_INACTIVE()
and nfs_inactive() would be called recursively, which would probably be
a bad thing ...
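
Putting the two pieces together, the whole sequence as I read it from the
trace and the code above is:

        vn_close()
          vrele()                               v_usecount: 1 -> 0
            vn_lock(LK_EXCLUSIVE | LK_INTERLOCK)   <- vnode now locked
            VOP_INACTIVE()
              nfs_inactive()
                vget(vp, 0, td)                 v_usecount: 0 -> 1
                nfs_vinvalbuf()
                vrele()                         v_usecount: 1 -> 0 again
                  vn_lock(LK_EXCLUSIVE | LK_INTERLOCK)
                    lockmgr()                   <- already held by this
                                                   thread: panic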

I suspect that nfs_inactive() needs to avoid calling vrele() here and
should just decrement the v_usecount with the appropriate locking.
Something like:
        VI_LOCK(ap->a_vp);
        ap->a_vp->v_usecount--;
        VI_UNLOCK(ap->a_vp);
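
In context, that would make the tail of the nfs_inactive() excerpt above
look something like this (untested sketch, just to show where the change
goes; the rest of the branch is unchanged):

                else {
                        (void) nfs_vinvalbuf(ap->a_vp, 0, sp->s_cred, td, 1);
                        /*
                         * Drop the reference taken by vget() by hand instead
                         * of calling vrele(), so we don't re-enter the
                         * v_usecount == 1 path in vrele() and try to re-lock
                         * a vnode this thread already holds locked.
                         */
                        VI_LOCK(ap->a_vp);
                        ap->a_vp->v_usecount--;
                        VI_UNLOCK(ap->a_vp);
                }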

