code ordering in coredump() (was: Re: cvs commit: src/sys/tools vnode_if.awk)

2002-07-09 Thread Don Lewis

I was studying the following DEBUG_VFS_LOCKS panic and noticed something
bothersome about the ordering of the code in coredump().  It looked to
me like it made more sense to verify that the file was something that
was valid to dump to before doing the vn_start_write() stuff.
Rearranging the code also allows VOP_GETATTR() to be called before
dropping the initial lock.

 VOP_GETATTR: 0xc7445100 is not locked but should be
 Debugger(Lock violation.
 )
 Stopped at Debugger+0x45: xchgl %ebx,in_Debugger.0
 db tr
 Debugger(c041b364) at Debugger+0x45
 coredump(c6a2c180) at coredump+0x470
 sigexit(c6a2c180,b,0,e54f8000,c6a2c180) at sigexit+0xa4
 postsig(b) at postsig+0xe9
 ast(e5514d48) at ast+0x27e
 doreti_ast() at doreti_ast+0x1a

Index: kern_sig.c
===
RCS file: /home/ncvs/src/sys/kern/kern_sig.c,v
retrieving revision 1.174
diff -u -r1.174 kern_sig.c
--- kern_sig.c  3 Jul 2002 09:15:20 -   1.174
+++ kern_sig.c  9 Jul 2002 15:17:31 -
@@ -1967,8 +1967,6 @@
  * then it passes on a vnode and a size limit to the process-specific
  * coredump routine if there is one; if there _is not_ one, it returns
  * ENOSYS; otherwise it returns the error from the process-specific routine.
- *
- * XXX: VOP_GETATTR() here requires holding the vnode lock.
  */
 
 static int
@@ -2021,6 +2019,14 @@
NDFREE(nd, NDF_ONLY_PNBUF);
vp = nd.ni_vp;
 
+   /* Don't dump to non-regular files or files with links. */
+   if (vp-v_type != VREG ||
+   VOP_GETATTR(vp, vattr, cred, td) || vattr.va_nlink != 1) {
+   VOP_UNLOCK(vp, 0, td);
+   error = EFAULT;
+   goto out2;
+   }
+
VOP_UNLOCK(vp, 0, td);
lf.l_whence = SEEK_SET;
lf.l_start = 0;
@@ -2040,12 +2046,6 @@
goto restart;
}
 
-   /* Don't dump to non-regular files or files with links. */
-   if (vp-v_type != VREG ||
-   VOP_GETATTR(vp, vattr, cred, td) || vattr.va_nlink != 1) {
-   error = EFAULT;
-   goto out1;
-   }
VATTR_NULL(vattr);
vattr.va_size = 0;
vn_lock(vp, LK_EXCLUSIVE | LK_RETRY, td);
@@ -2060,7 +2060,6 @@
  p-p_sysent-sv_coredump(td, vp, limit) :
  ENOSYS;
 
-out1:
lf.l_type = F_UNLCK;
VOP_ADVLOCK(vp, (caddr_t)p, F_UNLCK, lf, F_FLOCK);
vn_finished_write(mp);



To Unsubscribe: send mail to [EMAIL PROTECTED]
with unsubscribe freebsd-current in the body of the message



Re: cvs commit: src/sys/tools vnode_if.awk

2002-07-08 Thread Don Lewis

On  7 Jul, Jeff Roberson wrote:
 On Sat, 6 Jul 2002, Jeff Roberson wrote:

   Log:
- Use 'options DEBUG_VFS_LOCKS' instead of the DEBUG_ALL_VFS_LOCKS
  environment variable to enable the lock verifiction code.

 If you have a crash test box I would appreciate it if you would enable
 this kernel option.  If it catches any errors you will be droped into the
 debugger where you can get a backtrace (type: tr) and mail it to me 
 current@ to avoid dups.

Here's another one that happened while I was building the Mozilla port.

VOP_GETATTR: 0xc7445100 is not locked but should be
Debugger(Lock violation.
)
Stopped at Debugger+0x45: xchgl %ebx,in_Debugger.0
db tr
Debugger(c041b364) at Debugger+0x45
coredump(c6a2c180) at coredump+0x470
sigexit(c6a2c180,b,0,e54f8000,c6a2c180) at sigexit+0xa4
postsig(b) at postsig+0xe9
ast(e5514d48) at ast+0x27e
doreti_ast() at doreti_ast+0x1a

I don't know why this particular core dump triggered the bug.  I've seen
lots that didn't.



To Unsubscribe: send mail to [EMAIL PROTECTED]
with unsubscribe freebsd-current in the body of the message



Re: cvs commit: src/sys/tools vnode_if.awk

2002-07-07 Thread Jeff Roberson

On Sat, 6 Jul 2002, Jeff Roberson wrote:

 jeff2002/07/06 23:39:37 PDT

   Modified files:
 sys/toolsvnode_if.awk
   Log:
- Use 'options DEBUG_VFS_LOCKS' instead of the DEBUG_ALL_VFS_LOCKS
  environment variable to enable the lock verifiction code.

   Revision  ChangesPath
   1.33  +7 -5  src/sys/tools/vnode_if.awk


This was previously disabled because our locking was so bad that we could
not boot with this option enabled.  I can now boot, compile a kernel, and
reboot without catching any locking asserts.  This means that we are safe
at our current level of debugging, but we are certainly not out of the
woods wrt VFS locking yet.

If you have a crash test box I would appreciate it if you would enable
this kernel option.  If it catches any errors you will be droped into the
debugger where you can get a backtrace (type: tr) and mail it to me 
current@ to avoid dups.

To disable the panic  print once you've hit a bug type the following in
ddb:

w vfs_badlock_print 0
w vfs_badlock_panic 0

And you will not see any more errors.

Thanks!
Jeff


To Unsubscribe: send mail to [EMAIL PROTECTED]
with unsubscribe freebsd-current in the body of the message



Re: cvs commit: src/sys/tools vnode_if.awk

2002-07-07 Thread Don Lewis

On  7 Jul, Jeff Roberson wrote:
 On Sat, 6 Jul 2002, Jeff Roberson wrote:

- Use 'options DEBUG_VFS_LOCKS' instead of the DEBUG_ALL_VFS_LOCKS
  environment variable to enable the lock verifiction code.

 This was previously disabled because our locking was so bad that we could
 not boot with this option enabled.  I can now boot, compile a kernel, and
 reboot without catching any locking asserts.  This means that we are safe
 at our current level of debugging, but we are certainly not out of the
 woods wrt VFS locking yet.
 
 If you have a crash test box I would appreciate it if you would enable
 this kernel option.  If it catches any errors you will be droped into the
 debugger where you can get a backtrace (type: tr) and mail it to me 
 current@ to avoid dups.

It wasn't able to sucessfully boot with this enabled.  I'm hand
transcribing this, so apologies for any typos:

[fsck finishes]
Doing initial network setup: host.conf hostname.
VOP_READ: 0xc6737800 is not locked but should be
Debugger(Lock violation.
)


Debugger(c0420fe4) at Debugger+0x45
vn_rdwr(0,c6737800,c6425000,55ac,0,0,1,8,c22c7200,df241aec,c22cc0c0) at
vn_rdwr+0x18d
linker_hints_lookup(c04750a0,c,c62df000,5,0) at
linker_hints_lookup+0x2d9
linker_search_module(c62df000,5,0,0,c0415120) at
linker_search_module+0x43
linker_load_module(0,c62df000,0,0,df241cdc) at linker_load_module+0x72
kldload(c22cc0c0,df241d14,1,0,296) at kldload+0xc3
syscall(...)


If I disable the panic and continue the boot process, I see the
following in dmesg:


da0 at ahc0 bus 0 target 0 lun 0
da0: SEAGATE ST336706LW 010A Fixed Direct Access SCSI-3 device 
da0: 160.000MB/s transfers (80.000MHz, offset 63, 16bit), Tagged Queueing Enable
d
da0: 35003MB (71687370 512 byte sectors: 255H 63S/T 4462C)
/usr/src/sys/vm/uma_core.c:1332: could sleep with kernel linker locked from /u
sr/src/sys/kern/kern_linker.c:1798
VOP_READ: 0xc6737800 is not locked but should be
VOP_GETVOBJECT: 0xc6737800 is not locked but should be
VOP_GETVOBJECT: 0xc6737800 is not locked but should be
VOP_BMAP: 0xc6737800 is not locked but should be
VOP_GETVOBJECT: 0xc6737800 is not locked but should be
VOP_GETVOBJECT: 0xc6737800 is not locked but should be
VOP_READ: 0xc6737800 is not locked but should be
VOP_READ: 0xc6737800 is not locked but should be
VOP_READ: 0xc6737800 is not locked but should be





To Unsubscribe: send mail to [EMAIL PROTECTED]
with unsubscribe freebsd-current in the body of the message



Re: cvs commit: src/sys/tools vnode_if.awk

2002-07-07 Thread Jeff Roberson


On Sun, 7 Jul 2002, Don Lewis wrote:

 On  7 Jul, Jeff Roberson wrote:
  On Sat, 6 Jul 2002, Jeff Roberson wrote:

 - Use 'options DEBUG_VFS_LOCKS' instead of the DEBUG_ALL_VFS_LOCKS
   environment variable to enable the lock verifiction code.

  If you have a crash test box I would appreciate it if you would enable
  this kernel option.  If it catches any errors you will be droped into the
  debugger where you can get a backtrace (type: tr) and mail it to me 
  current@ to avoid dups.

 It wasn't able to sucessfully boot with this enabled.  I'm hand
 transcribing this, so apologies for any typos:

 [fsck finishes]
 Doing initial network setup: host.conf hostname.
 VOP_READ: 0xc6737800 is not locked but should be
 Debugger(Lock violation.
 )


 Debugger(c0420fe4) at Debugger+0x45
 vn_rdwr(0,c6737800,c6425000,55ac,0,0,1,8,c22c7200,df241aec,c22cc0c0) at
 vn_rdwr+0x18d
 linker_hints_lookup(c04750a0,c,c62df000,5,0) at
 linker_hints_lookup+0x2d9
 linker_search_module(c62df000,5,0,0,c0415120) at
 linker_search_module+0x43
 linker_load_module(0,c62df000,0,0,df241cdc) at linker_load_module+0x72
 kldload(c22cc0c0,df241d14,1,0,296) at kldload+0xc3
 syscall(...)

Oh, I don't use kernel modules on my main dev box.  Thanks!  I'll have
this resolved tomorrow.  More below.



 If I disable the panic and continue the boot process, I see the
 following in dmesg:


 da0 at ahc0 bus 0 target 0 lun 0
 da0: SEAGATE ST336706LW 010A Fixed Direct Access SCSI-3 device
 da0: 160.000MB/s transfers (80.000MHz, offset 63, 16bit), Tagged Queueing Enable
 d
 da0: 35003MB (71687370 512 byte sectors: 255H 63S/T 4462C)
 /usr/src/sys/vm/uma_core.c:1332: could sleep with kernel linker locked from /u
 sr/src/sys/kern/kern_linker.c:1798
 VOP_READ: 0xc6737800 is not locked but should be
 VOP_GETVOBJECT: 0xc6737800 is not locked but should be
 VOP_GETVOBJECT: 0xc6737800 is not locked but should be
 VOP_BMAP: 0xc6737800 is not locked but should be
 VOP_GETVOBJECT: 0xc6737800 is not locked but should be
 VOP_GETVOBJECT: 0xc6737800 is not locked but should be
 VOP_READ: 0xc6737800 is not locked but should be
 VOP_READ: 0xc6737800 is not locked but should be
 VOP_READ: 0xc6737800 is not locked but should be


These are all also from the linker.  I just verified by loading a module.

Thanks,
Jeff


To Unsubscribe: send mail to [EMAIL PROTECTED]
with unsubscribe freebsd-current in the body of the message



Re: cvs commit: src/sys/tools vnode_if.awk

2002-07-07 Thread Jeff Roberson



On Sun, 7 Jul 2002, Don Lewis wrote:


 It wasn't able to sucessfully boot with this enabled.  I'm hand
 transcribing this, so apologies for any typos:

[snip]


 Debugger(c0420fe4) at Debugger+0x45
 vn_rdwr(0,c6737800,c6425000,55ac,0,0,1,8,c22c7200,df241aec,c22cc0c0) at
 vn_rdwr+0x18d
 linker_hints_lookup(c04750a0,c,c62df000,5,0) at
 linker_hints_lookup+0x2d9
 linker_search_module(c62df000,5,0,0,c0415120) at
 linker_search_module+0x43
 linker_load_module(0,c62df000,0,0,df241cdc) at linker_load_module+0x72
 kldload(c22cc0c0,df241d14,1,0,296) at kldload+0xc3
 syscall(...)


Revision 1.91 of kern_linker.c fixes this problem.  Can you try again?

Thanks!
Jeff


To Unsubscribe: send mail to [EMAIL PROTECTED]
with unsubscribe freebsd-current in the body of the message