Re: recursed on non-recursive lock (sleep mutex) vnode interlock @ /var/portbuild/sparc64/src-client/sys/ufs/ufs/ufs_ihash.c:128
I got this on an alpha machine as well. Can someone track it down?

msgbufp = 0xfc0023f85fe0
magic = 63062, size = 32736, r= 59046, w = 59565, ptr = 0xfc0023f7e000, cksum= 2511626
lock order reversal
 1st 0xfc001a793d80 vnode interlock (vnode interlock) @ /a/asami/portbuild/alpha/src-client/sys/ufs/ufs/ufs_ihash.c:128
 2nd 0xfc6feda0 ufs ihash (ufs ihash) @ /a/asami/portbuild/alpha/src-client/sys/ufs/ufs/ufs_ihash.c:124
Stack backtrace:
recursed on non-recursive lock (sleep mutex) vnode interlock @ /a/asami/portbuild/alpha/src-client/sys/ufs/ufs/ufs_ihash.c:128
first acquired @ /a/asami/portbuild/alpha/src-client/sys/ufs/ufs/ufs_ihash.c:128
Debugger() at Debugger+0x38
panic() at panic+0x168
witness_lock() at witness_lock+0x408
_mtx_lock_flags() at _mtx_lock_flags+0xc8
ufs_ihashget() at ufs_ihashget+0xec
ffs_vget() at ffs_vget+0x54
ufs_lookup() at ufs_lookup+0xc9c
ufs_vnoperate() at ufs_vnoperate+0x2c
vfs_cache_lookup() at vfs_cache_lookup+0x37c
ufs_vnoperate() at ufs_vnoperate+0x2c
lookup() at lookup+0x4dc
namei() at namei+0x310
stat() at stat+0x4c
syscall() at syscall+0x39c
XentSys() at XentSys+0x64
--- syscall (188, FreeBSD ELF64, stat) ---
--- user mode ---
db>

Kris

On Mon, Nov 24, 2003 at 12:58:01PM -0800, Kris Kennaway wrote:
> One of my sparc64 package machines (running -current from Nov 21) died
> overnight with the following:
>
> recursed on non-recursive lock (sleep mutex) vnode interlock @
> /var/portbuild/sparc64/src-client/sys/ufs/ufs/ufs_ihash.c:128
> first acquired @ /var/portbuild/sparc64/src-client/sys/ufs/ufs/ufs_ihash.c:128
> panic: recurse
> cpuid = 0;
> Debugger("panic")
> Stopped at Debugger+0x1c: ta %xcc, 1
> db> trace
> panic() at panic+0x174
> witness_lock() at witness_lock+0x3b4
> _mtx_lock_flags() at _mtx_lock_flags+0x9c
> ufs_ihashget() at ufs_ihashget+0x94
> ffs_vget() at ffs_vget+0x20
> ufs_lookup() at ufs_lookup+0xb2c
> ufs_vnoperate() at ufs_vnoperate+0x1c
> vfs_cache_lookup() at vfs_cache_lookup+0x330
> ufs_vnoperate() at ufs_vnoperate+0x1c
> lookup() at lookup+0x408
> namei() at namei+0x254
> vn_open_cred() at vn_open_cred+0x208
> vn_open() at vn_open+0x18
> kern_open() at kern_open+0x84
> open() at open+0x14
> syscall() at syscall+0x308
> -- syscall (5, FreeBSD ELF64, open) %o7=0x4038c2b0 --
> userland() at 0x40395948
> user trace: trap %o7=0x4038c2b0
> pc 0x40395948, sp 0x7fddaf1
> pc 0x4038b47c, sp 0x7fddc31
> pc 0x101778, sp 0x7fddcf1
> pc 0x101378, sp 0x7fdddb1
> pc 0x100f80, sp 0x7fdde71
> pc 0x4020a234, sp 0x7fddf31
> done
Re: recursed on non-recursive lock (sleep mutex) vm page queue mutex
On Sat, Oct 04, 2003 at 11:31:33PM -0700, Kris Kennaway wrote:
> I don't think I've seen this one before (i386, kernel built Sep 17).
> Is it already fixed?
>

No, not yet.

Regards,
Alan

> recursed on non-recursive lock (sleep mutex) vm page queue mutex @
> /a/asami/portbuild/i386/src-client/sys/kern/vfs_bio.c:3630
> first acquired @ /a/asami/portbuild/i386/src-client/sys/vm/vm_pageout.c:403
> panic: recurse
> Debugger("panic")
> Stopped at Debugger+0x54: xchgl %ebx,in_Debugger.0
> db> trace
> Debugger(c043582e,c04a70e0,c0438952,d7077940,100) at Debugger+0x54
> panic(c0438952,c044c2d9,193,c043b873,e2e) at panic+0xd5
> witness_lock(c04d5900,8,c043b873,e2e,1) at witness_lock+0x3b3
> _mtx_lock_flags(c04d5900,0,c043b873,e2e,0) at _mtx_lock_flags+0xba
> vm_hold_free_pages(ce50cbc0,d0807000,d0808000,a75,c4ccfb68) at
> vm_hold_free_pages+0x142
> allocbuf(ce50cbc0,0,c043b873,74c,c449f5b4) at allocbuf+0x1b8
> getnewbuf(0,0,8000,8000,200) at getnewbuf+0x3fc
> getblk(c449f5b4,2878c80,0,8000,0) at getblk+0x38e
> breadn(c449f5b4,2878c80,0,8000,0) at breadn+0x52
> bread(c449f5b4,2878c80,0,8000,0) at bread+0x4c
> ffs_update(c4631db0,0,1,54,c0af9b88) at ffs_update+0x206
> ufs_inactive(d7077c30,d7077c4c,c02c1333,d7077c30,0) at ufs_inactive+0x1f5
> ufs_vnoperate(d7077c30,0,c043d141,8e3,c048efa0) at ufs_vnoperate+0x18
> vput(c4631db0,0,c044c2d9,3b2,c4631db0) at vput+0x143
> vm_pageout_scan(0,0,c044c2d9,5d5,1f4) at vm_pageout_scan+0x67d
> vm_pageout(0,d7077d48,c043313d,314,1a537318) at vm_pageout+0x2db
> fork_exit(c03a5fe0,0,d7077d48) at fork_exit+0xcf
> fork_trampoline() at fork_trampoline+0x8
> --- trap 0x1, eip = 0, esp = 0xd7077d7c, ebp = 0 ---
> db>

___
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "[EMAIL PROTECTED]"
Re: recursed on non-recursive lock
On Sun, 27 May 2001 17:32:15 -0700 (PDT), John Baldwin <[EMAIL PROTECTED]> said:
> Please try http://www.FreeBSD.org/~jhb/patches/vm.patch it fixes
> several places where we hold the vm lock across VOP's, etc.

Does that mean you've upgraded it? The last time I tried it (shortly
after you announced it) it didn't apply cleanly.

-- 
Michael D. Harnois [EMAIL PROTECTED]
Redeemer Lutheran Church, Washburn, Iowa
Sed quis custodiet ipsos custodes? -- Juvenal

To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-current" in the body of the message
RE: recursed on non-recursive lock
On 26-May-01 Michael Harnois wrote:
> I finally got this much. I hope it helps.
>
> lock order reversal
> 1st 0xc03af0a0 mntvnode @ ../../ufs/ffs/ffs_vnops.c:1007
> 2nd 0xc8b539cc vnode interlock @ ../../ufs/ffs/ffs_vfsops.c:1016
>
> recursed on non-recursive lock (sleep mutex) vm @

Please try http://www.FreeBSD.org/~jhb/patches/vm.patch it fixes several
places where we hold the vm lock across VOP's, etc.

-- 
John Baldwin <[EMAIL PROTECTED]> -- http://www.FreeBSD.org/~jhb/
PGP Key: http://www.baldwin.cx/~john/pgpkey.asc
"Power Users Use the Power to Serve!" - http://www.FreeBSD.org/
Re: recursed on non-recursive lock
After I'd been up a couple of hours I had a spontaneous reboot. No
idea why. Still a lot better than I'd been doing ...

-- 
Michael D. Harnois [EMAIL PROTECTED]
Redeemer Lutheran Church, Washburn, Iowa
Earth has its boundaries, but human stupidity is limitless.
-- Gustave Flaubert
Re: recursed on non-recursive lock
On Sun, 27 May 2001 03:59:20 +0200, Thomas Moestl <[EMAIL PROTECTED]> said:
> The attached patch just unlocks vm_mtx before this call and
> reacquires it when it's done. This works for me

Me, too. So far, at least ... uptime 25 minutes, swapping, X running,
none of which I could do before ... thanks!

> > lock order reversal

I still have this little booger ... is it significant?

-- 
Michael D. Harnois [EMAIL PROTECTED]
Redeemer Lutheran Church, Washburn, Iowa
Earth has its boundaries, but human stupidity is limitless.
-- Gustave Flaubert
Re: recursed on non-recursive lock
On Sat, 2001/05/26 at 11:07:36 -0500, Michael Harnois wrote:
> I finally got this much. I hope it helps.
>
> lock order reversal
> 1st 0xc03af0a0 mntvnode @ ../../ufs/ffs/ffs_vnops.c:1007
> 2nd 0xc8b539cc vnode interlock @ ../../ufs/ffs/ffs_vfsops.c:1016
>
> recursed on non-recursive lock (sleep mutex) vm @ ../../ufs/ufs/ufs_readwrite.c:420
> first acquired @ ../../vm/vnode_pager.c:912
> panic:recurse
> Debugger ("panic")
> Stopped at Debugger+0x45: pushl %ebx
> db> t
> Debugger(c0310767b) at Debugger+0x45
> panic(c0313348,c81b9cb8,a0,10,0) at panic+0x70
> witness_lock(c03b3f20,8,c03263b6,1a4) at witness_lock+0x356
> ffs_write(c81b9ca4) at ffs_write+0xba
> vnode_pager_generic_putpages(c8c31d00,c81b9ddc,1,0,c81b9d74) at
>     vnode_pager_generic_putpages+0x19c
> vop_stdputpages(c81b9d28,c81b9d0c,c02a7f9d,c81b9d28,c81b9d48) at vop_stdputpages+0x1f
> vop_defaultop(c81b9d28,c81b9d48,c02c5c3d,c81b9d28,0) at vop_defaultop+0x15
> ufs_vnoperate(c81b9d28) at ufs_vnoperate+0x15
> vnode_pager_putpages(c8c4b360,c81b9ddc,10,0,c81b9d74,c03b3f20,1,c0329ffa,91) at
>     vnode_pager_putpages+0x1ad
> [...]

I can reproduce this panic fairly reliably here...

The problem appears to be that the vm_mtx is held when VOP_WRITE is
called in vnode_pager_generic_putpages (sys/vm/vnode_pager.c:999).
VOP_WRITE may itself try to grab the vm_mtx (e.g. the ufs implementation
in sys/ufs/ufs/ufs_readwrite.c), so you end up with a recursion on the
lock. Even if it didn't recurse, VOP_WRITE can AFAIK block, so there is
a potential for other panics, too.

The attached patch just unlocks vm_mtx before this call and reacquires
it when it's done. This works for me, and I think it theoretically
should be safe because all relevant parts are under Giant again for now;
YMMV, it might cause other panics or corruption, so you've been
warned ;)

	- thomas

Index: sys/vm/vnode_pager.c
===================================================================
RCS file: /home/ncvs/src/sys/vm/vnode_pager.c,v
retrieving revision 1.130
diff -u -r1.130 vnode_pager.c
--- sys/vm/vnode_pager.c	2001/05/23 22:51:23	1.130
+++ sys/vm/vnode_pager.c	2001/05/27 01:07:19
@@ -996,7 +996,9 @@
 	auio.uio_rw = UIO_WRITE;
 	auio.uio_resid = maxsize;
 	auio.uio_procp = (struct proc *) 0;
+	mtx_unlock(&vm_mtx);
 	error = VOP_WRITE(vp, &auio, ioflags, curproc->p_ucred);
+	mtx_lock(&vm_mtx);
 	cnt.v_vnodeout++;
 	cnt.v_vnodepgsout += ncount;
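
[Editor's illustration] The recursion described above, and the shape of
the patch, can be sketched in userland with POSIX threads. This is only
an analogy, not kernel code: vm_lock, write_op, putpages_buggy and
putpages_patched are hypothetical stand-ins for vm_mtx, VOP_WRITE and
the two versions of vnode_pager_generic_putpages. The sketch assumes an
error-checking pthread mutex, which reports a relock by the owning
thread with EDEADLK, roughly where WITNESS would panic with "recursed on
non-recursive lock".

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-in for vm_mtx (a non-recursive lock). */
static pthread_mutex_t vm_lock;

/* Initialise vm_lock as an error-checking mutex; returns 0 on success. */
int vm_lock_init(void)
{
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);
    if (rc != 0)
        return rc;
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    if (rc == 0)
        rc = pthread_mutex_init(&vm_lock, &attr);
    pthread_mutexattr_destroy(&attr);
    return rc;
}

/* Stand-in for VOP_WRITE: the callee takes the lock itself. */
int write_op(void)
{
    int rc = pthread_mutex_lock(&vm_lock);
    if (rc != 0)
        return rc;              /* EDEADLK if the caller still holds it */
    rc = pthread_mutex_unlock(&vm_lock);
    assert(rc == 0);
    return 0;
}

/* Unpatched shape: the lock is held across the call, so it recurses. */
int putpages_buggy(void)
{
    int rc = pthread_mutex_lock(&vm_lock);
    assert(rc == 0);
    int error = write_op();
    rc = pthread_mutex_unlock(&vm_lock);
    assert(rc == 0);
    return error;
}

/* Patched shape: drop the lock around the call, then retake it. */
int putpages_patched(void)
{
    int rc = pthread_mutex_lock(&vm_lock);
    assert(rc == 0);
    rc = pthread_mutex_unlock(&vm_lock);    /* like mtx_unlock(&vm_mtx) */
    assert(rc == 0);
    int error = write_op();
    rc = pthread_mutex_lock(&vm_lock);      /* like mtx_lock(&vm_mtx) */
    assert(rc == 0);
    /* State guarded by the lock may have changed while it was dropped. */
    rc = pthread_mutex_unlock(&vm_lock);
    assert(rc == 0);
    return error;
}
```

After a successful vm_lock_init(), putpages_buggy() returns EDEADLK
while putpages_patched() returns 0. The caveat from the message still
applies: anything protected by the lock can change while it is dropped,
so code after the reacquire must not rely on state it read beforehand.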