code ordering in coredump() (was: Re: cvs commit: src/sys/tools vnode_if.awk)
I was studying the following DEBUG_VFS_LOCKS panic and noticed something bothersome about the ordering of the code in coredump(). It looked to me like it made more sense to verify that the file was something that was valid to dump to before doing the vn_start_write() stuff. Rearranging the code also allows VOP_GETATTR() to be called before dropping the initial lock. VOP_GETATTR: 0xc7445100 is not locked but should be Debugger(Lock violation. ) Stopped at Debugger+0x45: xchgl %ebx,in_Debugger.0 db tr Debugger(c041b364) at Debugger+0x45 coredump(c6a2c180) at coredump+0x470 sigexit(c6a2c180,b,0,e54f8000,c6a2c180) at sigexit+0xa4 postsig(b) at postsig+0xe9 ast(e5514d48) at ast+0x27e doreti_ast() at doreti_ast+0x1a Index: kern_sig.c === RCS file: /home/ncvs/src/sys/kern/kern_sig.c,v retrieving revision 1.174 diff -u -r1.174 kern_sig.c --- kern_sig.c 3 Jul 2002 09:15:20 - 1.174 +++ kern_sig.c 9 Jul 2002 15:17:31 - @@ -1967,8 +1967,6 @@ * then it passes on a vnode and a size limit to the process-specific * coredump routine if there is one; if there _is not_ one, it returns * ENOSYS; otherwise it returns the error from the process-specific routine. - * - * XXX: VOP_GETATTR() here requires holding the vnode lock. */ static int @@ -2021,6 +2019,14 @@ NDFREE(nd, NDF_ONLY_PNBUF); vp = nd.ni_vp; + /* Don't dump to non-regular files or files with links. */ + if (vp-v_type != VREG || + VOP_GETATTR(vp, vattr, cred, td) || vattr.va_nlink != 1) { + VOP_UNLOCK(vp, 0, td); + error = EFAULT; + goto out2; + } + VOP_UNLOCK(vp, 0, td); lf.l_whence = SEEK_SET; lf.l_start = 0; @@ -2040,12 +2046,6 @@ goto restart; } - /* Don't dump to non-regular files or files with links. */ - if (vp-v_type != VREG || - VOP_GETATTR(vp, vattr, cred, td) || vattr.va_nlink != 1) { - error = EFAULT; - goto out1; - } VATTR_NULL(vattr); vattr.va_size = 0; vn_lock(vp, LK_EXCLUSIVE | LK_RETRY, td); @@ -2060,7 +2060,6 @@ p-p_sysent-sv_coredump(td, vp, limit) : ENOSYS; -out1: lf.l_type = F_UNLCK; VOP_ADVLOCK(vp, (caddr_t)p, F_UNLCK, lf, F_FLOCK); vn_finished_write(mp); To Unsubscribe: send mail to [EMAIL PROTECTED] with unsubscribe freebsd-current in the body of the message
Re: cvs commit: src/sys/tools vnode_if.awk
On 7 Jul, Jeff Roberson wrote: On Sat, 6 Jul 2002, Jeff Roberson wrote: Log: - Use 'options DEBUG_VFS_LOCKS' instead of the DEBUG_ALL_VFS_LOCKS environment variable to enable the lock verifiction code. If you have a crash test box I would appreciate it if you would enable this kernel option. If it catches any errors you will be droped into the debugger where you can get a backtrace (type: tr) and mail it to me current@ to avoid dups. Here's another one that happened while I was building the Mozilla port. VOP_GETATTR: 0xc7445100 is not locked but should be Debugger(Lock violation. ) Stopped at Debugger+0x45: xchgl %ebx,in_Debugger.0 db tr Debugger(c041b364) at Debugger+0x45 coredump(c6a2c180) at coredump+0x470 sigexit(c6a2c180,b,0,e54f8000,c6a2c180) at sigexit+0xa4 postsig(b) at postsig+0xe9 ast(e5514d48) at ast+0x27e doreti_ast() at doreti_ast+0x1a I don't know why this particular core dump triggered the bug. I've seen lots that didn't. To Unsubscribe: send mail to [EMAIL PROTECTED] with unsubscribe freebsd-current in the body of the message
Re: cvs commit: src/sys/tools vnode_if.awk
On Sat, 6 Jul 2002, Jeff Roberson wrote: jeff2002/07/06 23:39:37 PDT Modified files: sys/toolsvnode_if.awk Log: - Use 'options DEBUG_VFS_LOCKS' instead of the DEBUG_ALL_VFS_LOCKS environment variable to enable the lock verifiction code. Revision ChangesPath 1.33 +7 -5 src/sys/tools/vnode_if.awk This was previously disabled because our locking was so bad that we could not boot with this option enabled. I can now boot, compile a kernel, and reboot without catching any locking asserts. This means that we are safe at our current level of debugging, but we are certainly not out of the woods wrt VFS locking yet. If you have a crash test box I would appreciate it if you would enable this kernel option. If it catches any errors you will be droped into the debugger where you can get a backtrace (type: tr) and mail it to me current@ to avoid dups. To disable the panic print once you've hit a bug type the following in ddb: w vfs_badlock_print 0 w vfs_badlock_panic 0 And you will not see any more errors. Thanks! Jeff To Unsubscribe: send mail to [EMAIL PROTECTED] with unsubscribe freebsd-current in the body of the message
Re: cvs commit: src/sys/tools vnode_if.awk
On 7 Jul, Jeff Roberson wrote: On Sat, 6 Jul 2002, Jeff Roberson wrote: - Use 'options DEBUG_VFS_LOCKS' instead of the DEBUG_ALL_VFS_LOCKS environment variable to enable the lock verifiction code. This was previously disabled because our locking was so bad that we could not boot with this option enabled. I can now boot, compile a kernel, and reboot without catching any locking asserts. This means that we are safe at our current level of debugging, but we are certainly not out of the woods wrt VFS locking yet. If you have a crash test box I would appreciate it if you would enable this kernel option. If it catches any errors you will be droped into the debugger where you can get a backtrace (type: tr) and mail it to me current@ to avoid dups. It wasn't able to sucessfully boot with this enabled. I'm hand transcribing this, so apologies for any typos: [fsck finishes] Doing initial network setup: host.conf hostname. VOP_READ: 0xc6737800 is not locked but should be Debugger(Lock violation. ) Debugger(c0420fe4) at Debugger+0x45 vn_rdwr(0,c6737800,c6425000,55ac,0,0,1,8,c22c7200,df241aec,c22cc0c0) at vn_rdwr+0x18d linker_hints_lookup(c04750a0,c,c62df000,5,0) at linker_hints_lookup+0x2d9 linker_search_module(c62df000,5,0,0,c0415120) at linker_search_module+0x43 linker_load_module(0,c62df000,0,0,df241cdc) at linker_load_module+0x72 kldload(c22cc0c0,df241d14,1,0,296) at kldload+0xc3 syscall(...) If I disable the panic and continue the boot process, I see the following in dmesg: da0 at ahc0 bus 0 target 0 lun 0 da0: SEAGATE ST336706LW 010A Fixed Direct Access SCSI-3 device da0: 160.000MB/s transfers (80.000MHz, offset 63, 16bit), Tagged Queueing Enable d da0: 35003MB (71687370 512 byte sectors: 255H 63S/T 4462C) /usr/src/sys/vm/uma_core.c:1332: could sleep with kernel linker locked from /u sr/src/sys/kern/kern_linker.c:1798 VOP_READ: 0xc6737800 is not locked but should be VOP_GETVOBJECT: 0xc6737800 is not locked but should be VOP_GETVOBJECT: 0xc6737800 is not locked but should be VOP_BMAP: 0xc6737800 is not locked but should be VOP_GETVOBJECT: 0xc6737800 is not locked but should be VOP_GETVOBJECT: 0xc6737800 is not locked but should be VOP_READ: 0xc6737800 is not locked but should be VOP_READ: 0xc6737800 is not locked but should be VOP_READ: 0xc6737800 is not locked but should be To Unsubscribe: send mail to [EMAIL PROTECTED] with unsubscribe freebsd-current in the body of the message
Re: cvs commit: src/sys/tools vnode_if.awk
On Sun, 7 Jul 2002, Don Lewis wrote: On 7 Jul, Jeff Roberson wrote: On Sat, 6 Jul 2002, Jeff Roberson wrote: - Use 'options DEBUG_VFS_LOCKS' instead of the DEBUG_ALL_VFS_LOCKS environment variable to enable the lock verifiction code. If you have a crash test box I would appreciate it if you would enable this kernel option. If it catches any errors you will be droped into the debugger where you can get a backtrace (type: tr) and mail it to me current@ to avoid dups. It wasn't able to sucessfully boot with this enabled. I'm hand transcribing this, so apologies for any typos: [fsck finishes] Doing initial network setup: host.conf hostname. VOP_READ: 0xc6737800 is not locked but should be Debugger(Lock violation. ) Debugger(c0420fe4) at Debugger+0x45 vn_rdwr(0,c6737800,c6425000,55ac,0,0,1,8,c22c7200,df241aec,c22cc0c0) at vn_rdwr+0x18d linker_hints_lookup(c04750a0,c,c62df000,5,0) at linker_hints_lookup+0x2d9 linker_search_module(c62df000,5,0,0,c0415120) at linker_search_module+0x43 linker_load_module(0,c62df000,0,0,df241cdc) at linker_load_module+0x72 kldload(c22cc0c0,df241d14,1,0,296) at kldload+0xc3 syscall(...) Oh, I don't use kernel modules on my main dev box. Thanks! I'll have this resolved tomorrow. More below. If I disable the panic and continue the boot process, I see the following in dmesg: da0 at ahc0 bus 0 target 0 lun 0 da0: SEAGATE ST336706LW 010A Fixed Direct Access SCSI-3 device da0: 160.000MB/s transfers (80.000MHz, offset 63, 16bit), Tagged Queueing Enable d da0: 35003MB (71687370 512 byte sectors: 255H 63S/T 4462C) /usr/src/sys/vm/uma_core.c:1332: could sleep with kernel linker locked from /u sr/src/sys/kern/kern_linker.c:1798 VOP_READ: 0xc6737800 is not locked but should be VOP_GETVOBJECT: 0xc6737800 is not locked but should be VOP_GETVOBJECT: 0xc6737800 is not locked but should be VOP_BMAP: 0xc6737800 is not locked but should be VOP_GETVOBJECT: 0xc6737800 is not locked but should be VOP_GETVOBJECT: 0xc6737800 is not locked but should be VOP_READ: 0xc6737800 is not locked but should be VOP_READ: 0xc6737800 is not locked but should be VOP_READ: 0xc6737800 is not locked but should be These are all also from the linker. I just verified by loading a module. Thanks, Jeff To Unsubscribe: send mail to [EMAIL PROTECTED] with unsubscribe freebsd-current in the body of the message
Re: cvs commit: src/sys/tools vnode_if.awk
On Sun, 7 Jul 2002, Don Lewis wrote: It wasn't able to sucessfully boot with this enabled. I'm hand transcribing this, so apologies for any typos: [snip] Debugger(c0420fe4) at Debugger+0x45 vn_rdwr(0,c6737800,c6425000,55ac,0,0,1,8,c22c7200,df241aec,c22cc0c0) at vn_rdwr+0x18d linker_hints_lookup(c04750a0,c,c62df000,5,0) at linker_hints_lookup+0x2d9 linker_search_module(c62df000,5,0,0,c0415120) at linker_search_module+0x43 linker_load_module(0,c62df000,0,0,df241cdc) at linker_load_module+0x72 kldload(c22cc0c0,df241d14,1,0,296) at kldload+0xc3 syscall(...) Revision 1.91 of kern_linker.c fixes this problem. Can you try again? Thanks! Jeff To Unsubscribe: send mail to [EMAIL PROTECTED] with unsubscribe freebsd-current in the body of the message