Re: kernel debugging assistance

1999-05-27 Thread David E. Cross
> I don't think that this dump is useful for debugging this problem. Perhaps,
> if you compile the kernel with DEBUG_LOCKS, you will get more useful info.
>
> Dima

I checked through the source for DEBUG_LOCKS; it doesn't appear to do anything
other than print out information that I already have access
to by way of this dump file.  I will turn it on regardless, so that it makes my
life a bit simpler.

In looking through this, and at the program that used to cause this problem
reliably (it no longer does, even though nothing changed on the client or
workstation; I am guessing that it is a race condition that happens 5% of
the time and I filled my quota for the next 20 years ;), I have a theory
about what is going on.  NFS service is entirely in the kernel for FreeBSD,
except for the NFSDs, which mostly sit around to give the kernel contexts to
pass requests into.  NFS uses its own namei mechanism, which requests a lock
on what it is looking up.  What if it gets 2 requests at about the same
time for the same file?  That would certainly seem a likely cause for this
problem.  I note that all the files that are causing this crash are
files that would be accessed in that way: Netscape
cache files, .Xauthority-c, and the data file for the test program, which
is accessed rapidly and repeatedly.

Does this seem like a reasonable theory to anyone?
--
David Cross   |  email: cro...@cs.rpi.edu 
Systems Administrator/Research Programmer |  Web: http://www.cs.rpi.edu/~crossd 
Rensselaer Polytechnic Institute, |  Ph: 518.276.2860
Department of Computer Science|  Fax: 518.276.4033
I speak only for myself.  |  WinNT:Linux::Linux:FreeBSD


To Unsubscribe: send mail to majord...@freebsd.org
with unsubscribe freebsd-hackers in the body of the message



Re: kernel debugging assistance

1999-05-27 Thread Don Lewis
On May 27, 10:32am, David E. Cross wrote:
} Subject: Re: kernel debugging assistance
} > I don't think that this dump is useful for debugging this problem. Perhaps,
} > if you compile the kernel with DEBUG_LOCKS, you will get more useful info.
} >
} > Dima
}
} I checked through the source for DEBUG_LOCKS; it doesn't appear to do anything
} other than print out information that I already have access
} to by way of this dump file.  I will turn it on regardless, so that it makes my
} life a bit simpler.

In some cases it sure would be nice if DEBUG_LOCKS preserved more of the
stack context.  Knowing that the current holder of the lock also called
vop_stdlock() isn't all that useful.  I'd rather know where vget() was
called.
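
To illustrate the difference between recording the innermost locker and
recording the outer caller, here is a rough user-space sketch. All of the
names (toy_vnode, toy_stdlock, toy_get_inner, toy_get_at, TOY_GET) are made
up for illustration and are not kernel interfaces. It only shows the general
idea of handing the call site down from the outermost wrapper instead of
capturing it in the innermost routine.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a vnode lock with a recorded locker location. */
struct toy_vnode {
	int         locked;     /* nonzero while the lock is held */
	const char *lock_file;  /* source file recorded at lock time */
	int         lock_line;  /* source line recorded at lock time */
};

/* Innermost layer: plays the role of the generic lock routine. */
static int
toy_stdlock(struct toy_vnode *vp, const char *file, int line)
{
	assert(vp != NULL);
	if (vp->locked)
		return (-1);
	vp->locked = 1;
	vp->lock_file = file;
	vp->lock_line = line;
	return (0);
}

/* Variant 1: the location is captured here, so every caller looks alike. */
static int
toy_get_inner(struct toy_vnode *vp)
{
	return (toy_stdlock(vp, __FILE__, __LINE__));
}

/* Variant 2: the location is handed down from the outermost caller. */
static int
toy_get_at(struct toy_vnode *vp, const char *file, int line)
{
	return (toy_stdlock(vp, file, line));
}

/* The macro expands at the caller, so __LINE__ names the caller's line. */
#define TOY_GET(vp) toy_get_at((vp), __FILE__, __LINE__)
```

With variant 1 the recorded line is always the one inside toy_get_inner(),
whoever called it. With variant 2 each TOY_GET() use records its own line,
which is the kind of information asked for above.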





Re: kernel debugging assistance

1999-05-27 Thread Dmitrij Tejblum
> > I don't think that this dump is useful for debugging this problem. Perhaps,
> > if you compile the kernel with DEBUG_LOCKS, you will get more useful info.
>
> I checked through the source for DEBUG_LOCKS; it doesn't appear to do anything
> other than print out information that I already have access
> to by way of this dump file.  I will turn it on regardless, so that it makes my
> life a bit simpler.

Hmm. As far as I can see, DEBUG_LOCKS doesn't print out anything. A kernel built
with DEBUG_LOCKS stores the file and line number of every locker in the vnode.
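
As a rough illustration of that idea (not the actual kernel code), a lock
can remember the file and line of whoever took it by wrapping the acquire
call in a macro. The names toy_lock, toy_lock_acquire, toy_lock_release and
TOY_LOCK are invented for this sketch, and it runs in user space.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical lock that remembers where it was last acquired. */
struct toy_lock {
	int         held;   /* nonzero while the lock is held */
	const char *file;   /* source file of the current or last holder */
	int         line;   /* source line of the current or last holder */
};

/* Take the lock and record the requester; returns -1 if already held. */
static int
toy_lock_acquire(struct toy_lock *lk, const char *file, int line)
{
	assert(lk != NULL);
	if (lk->held)
		return (-1);    /* the holder's file and line stay recorded */
	lk->held = 1;
	lk->file = file;
	lk->line = line;
	return (0);
}

/* Drop the lock; the recorded location is left in place for inspection. */
static void
toy_lock_release(struct toy_lock *lk)
{
	assert(lk != NULL);
	lk->held = 0;
}

/* Callers use the macro so that __FILE__ and __LINE__ name the call site. */
#define TOY_LOCK(lk) toy_lock_acquire((lk), __FILE__, __LINE__)
```

When a later attempt fails, the file and line fields of the lock still
name the holder, which is the sort of record that can then be read out of a
crash dump.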

 
> In looking through this, and at the program that used to cause this problem
> reliably (it no longer does, even though nothing changed on the client or
> workstation; I am guessing that it is a race condition that happens 5% of
> the time and I filled my quota for the next 20 years ;), I have a theory
> about what is going on.  NFS service is entirely in the kernel for FreeBSD,
> except for the NFSDs, which mostly sit around to give the kernel contexts to
> pass requests into.  NFS uses its own namei mechanism, which requests a lock
> on what it is looking up.

The standard namei mechanism requests locks on what it is looking up, too.

> What if it gets 2 requests at about the same
> time for the same file?

One nfsd services only one request at a time. When it tries to lock something
locked by another nfsd (or, in general, another program), it will wait until the
lock is released. After an nfsd has served a request, it has to release all the
locks it got.

I think your panic is caused by an nfsd that forgot to release a lock. If so,
the bug is not where it panicked, but some time before. This is why I suggested
DEBUG_LOCKS.
Unfortunately, as Don Lewis pointed out, it will not be very useful either :-(.
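
The scenario suspected above can be modelled with a toy lock in user space.
This is only a sketch under simplifying assumptions: there are no real
threads, a worker that would have to sleep just gets TOY_WOULD_WAIT back,
and all names (toy_lock, toy_acquire, toy_release, toy_serve) are invented
for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NOBODY (-1)

struct toy_lock {
	int owner;  /* id of the worker holding the lock, or TOY_NOBODY */
};

enum toy_result { TOY_OK, TOY_WOULD_WAIT, TOY_RECURSIVE };

static enum toy_result
toy_acquire(struct toy_lock *lk, int worker)
{
	assert(lk != NULL);
	if (lk->owner == worker)
		return (TOY_RECURSIVE);   /* the holder asks again */
	if (lk->owner != TOY_NOBODY)
		return (TOY_WOULD_WAIT);  /* a real worker would sleep here */
	lk->owner = worker;
	return (TOY_OK);
}

static void
toy_release(struct toy_lock *lk, int worker)
{
	assert(lk != NULL && lk->owner == worker);
	lk->owner = TOY_NOBODY;
}

/* Serve one request; 'leak' simulates a path that skips the release. */
static enum toy_result
toy_serve(struct toy_lock *lk, int worker, int leak)
{
	enum toy_result r = toy_acquire(lk, worker);

	if (r != TOY_OK)
		return (r);
	/* ... work on the file would happen here ... */
	if (!leak)
		toy_release(lk, worker);
	return (TOY_OK);
}
```

If worker 1 serves a request with leak set, that call still returns TOY_OK.
The trouble only shows up later: another worker gets TOY_WOULD_WAIT for the
same file, and the next request handled by worker 1 gets TOY_RECURSIVE, far
away from the request that actually leaked the lock.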

Dima







Re: kernel debugging assistance

1999-05-26 Thread Dmitrij Tejblum
> I am trying to trace down the cause of the recursive lock and I stumbled upon
> this:
>
> (kgdb) bt
> #0  boot (howto=256) at ../../kern/kern_shutdown.c:285
> #1  0xc014b3f4 in at_shutdown (
>     function=0xc0234aca <__set_sysuninit_set_sym_M_KTRACE_uninit_sys_uninit+154>,
>     arg=0x10002, queue=-951064448) at ../../kern/kern_shutdown.c:446
> #2  0xc01470f8 in lockmgr (lkp=0xc10d8f00, flags=16842754,
>     interlkp=0xc74fe8f0, p=0xc743eb20) at ../../kern/kern_lock.c:326
> #3  0xc016cfbc in vop_stdlock (ap=0xc7482a64) at ../../kern/vfs_default.c:209
> #4  0xc01e4fad in ufs_vnoperate (ap=0xc7482a64)
>     at ../../ufs/ufs/ufs_vnops.c:2299
> #5  0xc0175d97 in vn_lock (vp=0xc74fe880, flags=65538, p=0xc743eb20)
>     at vnode_if.h:811

[...]

> (kgdb) up 3
> #3  0xc016cfbc in vop_stdlock (ap=0xc7482a64) at ../../kern/vfs_default.c:209
> 209         return (lockmgr(l, ap->a_flags, ap->a_vp->v_interlock,
>                 ap->a_p));
> (kgdb) print ap
> $1 = (struct vop_lock_args *) 0x0

This is just a glitch in gdb. The true value of ap is here:
> #3  0xc016cfbc in vop_stdlock (ap=0xc7482a64) at ../../kern/vfs_default.c:209
                                 ^^^^^^^^^^^^^

I don't think that this dump is useful for debugging this problem. Perhaps, if 
you compile the kernel with DEBUG_LOCKS, you will get more useful info.

Dima



