Re: recursed on non-recursive lock (sleep mutex) vnode interlock @ /var/portbuild/sparc64/src-client/sys/ufs/ufs/ufs_ihash.c:128

2003-11-29 Thread Kris Kennaway
I got this on an alpha machine as well.  Can someone track it down?

msgbufp = 0xfc0023f85fe0
magic = 63062, size = 32736, r= 59046, w = 59565, ptr = 0xfc0023f7e000, cksum= 2511626
lock order reversal
 1st 0xfc001a793d80 vnode interlock (vnode interlock) @ /a/asami/portbuild/alpha/src-client/sys/ufs/ufs/ufs_ihash.c:128
 2nd 0xfc6feda0 ufs ihash (ufs ihash) @ /a/asami/portbuild/alpha/src-client/sys/ufs/ufs/ufs_ihash.c:124
Stack backtrace:
recursed on non-recursive lock (sleep mutex) vnode interlock @ /a/asami/portbuild/alpha/src-client/sys/ufs/ufs/ufs_ihash.c:128
first acquired @ /a/asami/portbuild/alpha/src-client/sys/ufs/ufs/ufs_ihash.c:128
Debugger() at Debugger+0x38
panic() at panic+0x168
witness_lock() at witness_lock+0x408
_mtx_lock_flags() at _mtx_lock_flags+0xc8
ufs_ihashget() at ufs_ihashget+0xec
ffs_vget() at ffs_vget+0x54
ufs_lookup() at ufs_lookup+0xc9c
ufs_vnoperate() at ufs_vnoperate+0x2c
vfs_cache_lookup() at vfs_cache_lookup+0x37c
ufs_vnoperate() at ufs_vnoperate+0x2c
lookup() at lookup+0x4dc
namei() at namei+0x310
stat() at stat+0x4c
syscall() at syscall+0x39c
XentSys() at XentSys+0x64
--- syscall (188, FreeBSD ELF64, stat) ---
--- user mode ---
db>

Kris

On Mon, Nov 24, 2003 at 12:58:01PM -0800, Kris Kennaway wrote:
> One of my sparc64 package machines (running -current from Nov 21) died
> overnight with the following:
> 
> recursed on non-recursive lock (sleep mutex) vnode interlock @ /var/portbuild/sparc64/src-client/sys/ufs/ufs/ufs_ihash.c:128
> first acquired @ /var/portbuild/sparc64/src-client/sys/ufs/ufs/ufs_ihash.c:128
> panic: recurse
> cpuid = 0;
> Debugger("panic")
> Stopped at  Debugger+0x1c:  ta  %xcc, 1
> db> trace
> panic() at panic+0x174
> witness_lock() at witness_lock+0x3b4
> _mtx_lock_flags() at _mtx_lock_flags+0x9c
> ufs_ihashget() at ufs_ihashget+0x94
> ffs_vget() at ffs_vget+0x20
> ufs_lookup() at ufs_lookup+0xb2c
> ufs_vnoperate() at ufs_vnoperate+0x1c
> vfs_cache_lookup() at vfs_cache_lookup+0x330
> ufs_vnoperate() at ufs_vnoperate+0x1c
> lookup() at lookup+0x408
> namei() at namei+0x254
> vn_open_cred() at vn_open_cred+0x208
> vn_open() at vn_open+0x18
> kern_open() at kern_open+0x84
> open() at open+0x14
> syscall() at syscall+0x308
> -- syscall (5, FreeBSD ELF64, open) %o7=0x4038c2b0 --
> userland() at 0x40395948
> user trace: trap %o7=0x4038c2b0
> pc 0x40395948, sp 0x7fddaf1
> pc 0x4038b47c, sp 0x7fddc31
> pc 0x101778, sp 0x7fddcf1
> pc 0x101378, sp 0x7fdddb1
> pc 0x100f80, sp 0x7fdde71
> pc 0x4020a234, sp 0x7fddf31
> done






Re: recursed on non-recursive lock (sleep mutex) vm page queue mutex

2003-10-05 Thread Alan Cox
On Sat, Oct 04, 2003 at 11:31:33PM -0700, Kris Kennaway wrote:
> I don't think I've seen this one before (i386, kernel built Sep 17).
> Is it already fixed?
> 

No, not yet.

Regards,
Alan

> 
> recursed on non-recursive lock (sleep mutex) vm page queue mutex @ /a/asami/portbuild/i386/src-client/sys/kern/vfs_bio.c:3630
> first acquired @ /a/asami/portbuild/i386/src-client/sys/vm/vm_pageout.c:403
> panic: recurse
> Debugger("panic")
> Stopped at  Debugger+0x54:  xchgl   %ebx,in_Debugger.0
> db> trace
> Debugger(c043582e,c04a70e0,c0438952,d7077940,100) at Debugger+0x54
> panic(c0438952,c044c2d9,193,c043b873,e2e) at panic+0xd5
> witness_lock(c04d5900,8,c043b873,e2e,1) at witness_lock+0x3b3
> _mtx_lock_flags(c04d5900,0,c043b873,e2e,0) at _mtx_lock_flags+0xba
> vm_hold_free_pages(ce50cbc0,d0807000,d0808000,a75,c4ccfb68) at vm_hold_free_pages+0x142
> allocbuf(ce50cbc0,0,c043b873,74c,c449f5b4) at allocbuf+0x1b8
> getnewbuf(0,0,8000,8000,200) at getnewbuf+0x3fc
> getblk(c449f5b4,2878c80,0,8000,0) at getblk+0x38e
> breadn(c449f5b4,2878c80,0,8000,0) at breadn+0x52
> bread(c449f5b4,2878c80,0,8000,0) at bread+0x4c
> ffs_update(c4631db0,0,1,54,c0af9b88) at ffs_update+0x206
> ufs_inactive(d7077c30,d7077c4c,c02c1333,d7077c30,0) at ufs_inactive+0x1f5
> ufs_vnoperate(d7077c30,0,c043d141,8e3,c048efa0) at ufs_vnoperate+0x18
> vput(c4631db0,0,c044c2d9,3b2,c4631db0) at vput+0x143
> vm_pageout_scan(0,0,c044c2d9,5d5,1f4) at vm_pageout_scan+0x67d
> vm_pageout(0,d7077d48,c043313d,314,1a537318) at vm_pageout+0x2db
> fork_exit(c03a5fe0,0,d7077d48) at fork_exit+0xcf
> fork_trampoline() at fork_trampoline+0x8
> --- trap 0x1, eip = 0, esp = 0xd7077d7c, ebp = 0 ---
> db>

___
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: recursed on non-recursive lock

2001-05-27 Thread Michael Harnois

On Sun, 27 May 2001 17:32:15 -0700 (PDT), John Baldwin <[EMAIL PROTECTED]> said:

> Please try http://www.FreeBSD.org/~jhb/patches/vm.patch it fixes
> several places where we hold the vm lock across VOP's, etc.

Does that mean you've upgraded it? The last time I tried it (shortly
after you announced it) it didn't apply cleanly.

-- 
Michael D. Harnois[EMAIL PROTECTED]
Redeemer Lutheran Church  Washburn, Iowa 
 Sed quis custodiet ipsos custodes? -- Juvenal

To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-current" in the body of the message



RE: recursed on non-recursive lock

2001-05-27 Thread John Baldwin


On 26-May-01 Michael Harnois wrote:
> I finally got this much. I hope it helps.
> 
> lock order reversal
> 1st 0xc03af0a0 mntvnode @ ../../ufs/ffs/ffs_vnops.c:1007
> 2nd 0xc8b539cc vnode interlock @ ../../ufs/ffs/ffs_vfsops.c:1016
> 
> recursed on non-recursive lock (sleep mutex) vm @

Please try http://www.FreeBSD.org/~jhb/patches/vm.patch it fixes several places
where we hold the vm lock across VOP's, etc.

-- 

John Baldwin <[EMAIL PROTECTED]> -- http://www.FreeBSD.org/~jhb/
PGP Key: http://www.baldwin.cx/~john/pgpkey.asc
"Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/




Re: recursed on non-recursive lock

2001-05-26 Thread Michael Harnois

After I'd been up a couple of hours I had a spontaneous reboot. No
idea why. Still a lot better than I'd been doing ...

-- 
Michael D. Harnois[EMAIL PROTECTED]
Redeemer Lutheran Church  Washburn, Iowa 
 Earth has its boundaries, but human stupidity is limitless.
   -- Gustave Flaubert




Re: recursed on non-recursive lock

2001-05-26 Thread Michael Harnois

On Sun, 27 May 2001 03:59:20 +0200, Thomas Moestl <[EMAIL PROTECTED]> said:

> The attached patch just unlocks vm_mtx before this call and
> reacquires it when it's done. This works for me

Me, too. So far, at least ... uptime 25 minutes, swapping, X running,
none of which I could do before ... thanks!

> >   lock order reversal 

I still have this little booger ... is it significant?
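
For context on the "lock order reversal" message: a checker of this kind remembers the order in which pairs of locks have been taken and complains when a later acquisition contradicts it. Such a report points at a potential deadlock, not necessarily one that has happened. The following is a minimal userland sketch of that idea only. It is not the kernel's WITNESS code; every name in it (toy_acquire, toy_release, toy_before, the lock ids) is made up for illustration, and it tracks only direct pairs of locks in a single thread.

```c
#include <stdio.h>

#define TOY_NLOCKS	4

/* Hypothetical lock ids, named after the locks in the report above. */
enum { TOY_MNTVNODE = 0, TOY_VNODE_INTERLOCK = 1 };

/* toy_before[a][b] != 0 means "a was once held while b was acquired". */
int toy_before[TOY_NLOCKS][TOY_NLOCKS];
int toy_held[TOY_NLOCKS];

/*
 * Record the acquisition of lock `id` and return how many order
 * reversals it causes against the orders seen so far.
 */
int
toy_acquire(int id)
{
	int reversals = 0;

	for (int h = 0; h < TOY_NLOCKS; h++) {
		if (!toy_held[h] || h == id)
			continue;
		if (toy_before[id][h]) {
			/* Earlier we saw id taken before h; now it is h before id. */
			printf("lock order reversal: 1st %d, 2nd %d\n", h, id);
			reversals++;
		} else {
			toy_before[h][id] = 1;
		}
	}
	toy_held[id] = 1;
	return (reversals);
}

void
toy_release(int id)
{
	toy_held[id] = 0;
}
```

Taking TOY_MNTVNODE and then TOY_VNODE_INTERLOCK records one order; releasing both and taking them the other way round makes the second toy_acquire() return 1, even though nothing has deadlocked in this single-threaded run.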

-- 
Michael D. Harnois[EMAIL PROTECTED]
Redeemer Lutheran Church  Washburn, Iowa 
 Earth has its boundaries, but human stupidity is limitless.
   -- Gustave Flaubert




Re: recursed on non-recursive lock

2001-05-26 Thread Thomas Moestl

On Sat, 2001/05/26 at 11:07:36 -0500, Michael Harnois wrote:
> I finally got this much. I hope it helps.
> 
> lock order reversal
> 1st 0xc03af0a0 mntvnode @ ../../ufs/ffs/ffs_vnops.c:1007
> 2nd 0xc8b539cc vnode interlock @ ../../ufs/ffs/ffs_vfsops.c:1016
> 
> recursed on non-recursive lock (sleep mutex) vm @ ../../ufs/ufs/ufs_readwrite.c:420
> first acquired @ ../../vm/vnode_pager.c:912
> panic:recurse
> Debugger ("panic")
> Stopped at Debugger+0x45: pushl %ebx
> db> t
> Debugger(c0310767b) at Debugger+0x45
> panic(c0313348,c81b9cb8,a0,10,0) at panic+0x70
> witness_lock(c03b3f20,8,c03263b6,1a4) at witness_lock+0x356
> ffs_write(c81b9ca4) at ffs_write+0xba
> vnode_pager_generic_putpages(c8c31d00,c81b9ddc,1,0,c81b9d74) at vnode_pager_generic_putpages+0x19c
> vop_stdputpages(c81b9d28,c81b9d0c,c02a7f9d,c81b9d28,c81b9d48) at vop_stdputpages+0x1f
> vop_defaultop(c81b9d28,c81b9d48,c02c5c3d,c81b9d28,0) at vop_defaultop+0x15
> ufs_vnoperate(c81b9d28) at ufs_vnoperate+0x15
> vnode_pager_putpages(c8c4b360,c81b9ddc,10,0,c81b9d74,c03b3f20,1,c0329ffa,91) at vnode_pager_putpages+0x1ad
> [...]

I can reproduce this panic relatively reliably here...
The problem appears to be that the vm_mtx is held when VOP_WRITE is
called in vnode_pager_generic_putpages
(sys/vm/vnode_pager.c:999). This may try to grab the vm_mtx (e.g. the
ufs implementation in sys/ufs/ufs/ufs_readwrite.c), so you end up with
a recursion on the lock. Even if it didn't recurse, VOP_WRITE can
AFAIK block, so there is a potential for other panics, too.
The attached patch just unlocks vm_mtx before this call and reacquires
it when it's done. This works for me, and I think it should
theoretically be safe because all relevant parts are under Giant again
for now; YMMV, it might cause other panics or corruption, so you've
been warned ;)

- thomas
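
The failure mode described above, and the shape of the workaround, can be sketched outside the kernel. This is a toy, single-threaded model under loud assumptions: struct toy_mtx, toy_lock, toy_unlock, toy_write, putpages_buggy and putpages_fixed are all hypothetical names, not the FreeBSD mtx(9) or VOP interfaces, and the "lock" only tracks whether it is already held so that a second acquisition can be reported instead of panicking.

```c
#include <stdio.h>

/* Toy stand-in for a non-recursive mutex; NOT the kernel's struct mtx. */
struct toy_mtx {
	const char *name;
	int held;		/* nonzero while the lock is owned */
	int recurse_errors;	/* count of attempted recursive acquisitions */
};

struct toy_mtx toy_vm_mtx = { "vm", 0, 0 };

/* Returns 0 on success, -1 if the lock is already held (recursion). */
int
toy_lock(struct toy_mtx *m)
{
	if (m->held) {
		m->recurse_errors++;
		fprintf(stderr, "recursed on non-recursive lock %s\n", m->name);
		return (-1);
	}
	m->held = 1;
	return (0);
}

void
toy_unlock(struct toy_mtx *m)
{
	m->held = 0;
}

/* Hypothetical callee playing the role of VOP_WRITE: it takes the lock itself. */
int
toy_write(void)
{
	if (toy_lock(&toy_vm_mtx) != 0)
		return (-1);
	/* ... the write would happen here ... */
	toy_unlock(&toy_vm_mtx);
	return (0);
}

/* Caller that holds the lock across the call: the callee recurses on it. */
int
putpages_buggy(void)
{
	int error;

	toy_lock(&toy_vm_mtx);
	error = toy_write();
	toy_unlock(&toy_vm_mtx);
	return (error);
}

/* Caller that drops the lock around the call and retakes it afterwards. */
int
putpages_fixed(void)
{
	int error;

	toy_lock(&toy_vm_mtx);
	toy_unlock(&toy_vm_mtx);	/* drop before calling out */
	error = toy_write();
	toy_lock(&toy_vm_mtx);		/* retake for the remaining work */
	toy_unlock(&toy_vm_mtx);
	return (error);
}
```

In this model putpages_buggy() fails because toy_write() finds the lock already held, while putpages_fixed() succeeds. The sketch says nothing about whether the state protected by the lock stays consistent while it is dropped, which is the caveat raised in the message.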


Index: sys/vm/vnode_pager.c
===================================================================
RCS file: /home/ncvs/src/sys/vm/vnode_pager.c,v
retrieving revision 1.130
diff -u -r1.130 vnode_pager.c
--- sys/vm/vnode_pager.c  2001/05/23 22:51:23  1.130
+++ sys/vm/vnode_pager.c  2001/05/27 01:07:19
@@ -996,7 +996,9 @@
    auio.uio_rw = UIO_WRITE;
    auio.uio_resid = maxsize;
    auio.uio_procp = (struct proc *) 0;
+   mtx_unlock(&vm_mtx);
    error = VOP_WRITE(vp, &auio, ioflags, curproc->p_ucred);
+   mtx_lock(&vm_mtx);
    cnt.v_vnodeout++;
    cnt.v_vnodepgsout += ncount;