Re: FreeBSD 5.4-RELEASE-p5 panic

2005-08-03 Thread dpk
On Tue, 2 Aug 2005, dpk wrote:

> (Another panic I would get would follow roughly the same path except it
> would die while trying to unlock a vnode lock that the thread didn't own.
> I'll try to get this information some time, too.)

Here's the backtrace from that panic:

#0  kdb_enter (msg=0x12 <Address 0x12 out of bounds>) at ../../../kern/subr_kdb.c:266
#1  0xc033ea1f in panic (fmt=0xc04c99ff "lockmgr: thread %p, not %s %p unlocking")
    at ../../../kern/kern_shutdown.c:550
#2  0xc0333181 in lockmgr (lkp=0xc61f5e14, flags=6, interlkp=0x100, td=0x0)
    at ../../../kern/kern_lock.c:419
#3  0xc038b08b in vop_stdunlock (ap=0x12) at ../../../kern/vfs_default.c:295
#4  0xc038af3b in vop_defaultop (ap=0x0) at ../../../kern/vfs_default.c:157
#5  0xc03010bb in spec_vnoperate (ap=0x0) at ../../../fs/specfs/spec_vnops.c:118
#6  0xc0301648 in spec_write (ap=0xeb858a94) at vnode_if.h:1044
#7  0xc03010bb in spec_vnoperate (ap=0x0) at ../../../fs/specfs/spec_vnops.c:118
#8  0xc0452ecd in vnode_pager_generic_putpages (vp=0xc61f5d68, m=0xeb858bf0, bytecount=4096,
    flags=0, rtvals=0xeb858b70) at vnode_if.h:432
#9  0xc038b7e2 in vop_stdputpages (ap=0x12) at ../../../kern/vfs_default.c:650
#10 0xc038af3b in vop_defaultop (ap=0x0) at ../../../kern/vfs_default.c:157
#11 0xc03010bb in spec_vnoperate (ap=0x0) at ../../../fs/specfs/spec_vnops.c:118
#12 0xc0452c6a in vnode_pager_putpages (object=0xc085e7bc, m=0x12, count=18, sync=0, rtvals=0x12)
    at vnode_if.h:1357
#13 0xc044a603 in vm_pageout_flush (mc=0xeb858bf0, count=1, flags=0) at vm_pager.h:147
#14 0xc044a52d in vm_pageout_clean (m=0x0) at ../../../vm/vm_pageout.c:347
#15 0xc044b3df in vm_pageout_scan (pass=0) at ../../../vm/vm_pageout.c:996
#16 0xc044c162 in vm_pageout () at ../../../vm/vm_pageout.c:1487
#17 0xc032911d in fork_exit (callout=0xc044be50 <vm_pageout>, arg=0x0, frame=0xeb858d48)
    at ../../../kern/kern_fork.c:791
#18 0xc0474fcc in fork_trampoline () at ../../../i386/i386/exception.s:209

Again, vm_pageout_clean is shown being called with a NULL argument, and
eventually spec_vnoperate is called with a NULL as well (in the other
panic, it was ufs_vnoperate that was called with a NULL).

Both of these panics are relatively easy to reproduce on demand.

Interestingly (I think), vm_pageout_flush's mc argument was the same in
both panics: 0xeb858bf0.

That is 3,951,397,872 in decimal. When you boot these servers without PAE
enabled, the real memory is reported as 3,757,965,312 bytes. I think this
indicates that the page vnode_pager_generic_putpages is dealing with lies
within the PAE range (I don't know exactly how to describe that). This
could be a total long shot, but I think it's unlikely that both panics
would have something like that in common without it being a bug of some
sort.

If there's somewhere else I should be sending these, please let me know.
___
freebsd-questions@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to [EMAIL PROTECTED]


FreeBSD 5.4-RELEASE-p5 panic

2005-08-02 Thread dpk
After much struggling (documented elsewhere) I have a backtrace showing
one of a handful of panics I am getting on a FreeBSD 5.4-RELEASE-p5
system. The server has 4GB RAM, and is running with PAE and SMP enabled.
If this is not the appropriate list for this, please let me know and I
can send it elsewhere.

(gdb) bt
#0  kdb_enter (msg=0x12 <Address 0x12 out of bounds>) at ../../../kern/subr_kdb.c:266
#1  0xc033ea1f in panic (fmt=0xc04d782d "ffs_write: dir write")
    at ../../../kern/kern_shutdown.c:550
#2  0xc04292de in ffs_write (ap=0xeb858a94) at ../../../ufs/ffs/ffs_vnops.c:614
#3  0xc0452e71 in vnode_pager_generic_putpages (vp=0xc6237630, m=0xeb858bf0, bytecount=4096,
    flags=0, rtvals=0xeb858b70) at vnode_if.h:432
#4  0xc038b7e2 in vop_stdputpages (ap=0x12) at ../../../kern/vfs_default.c:650
#5  0xc038af3b in vop_defaultop (ap=0x0) at ../../../kern/vfs_default.c:157
#6  0xc0435ebf in ufs_vnoperate (ap=0x0) at ../../../ufs/ufs/ufs_vnops.c:2821
#7  0xc0452c0e in vnode_pager_putpages (object=0xc6901a50, m=0x12, count=18, sync=0, rtvals=0x12)
    at vnode_if.h:1357
#8  0xc044a5db in vm_pageout_flush (mc=0xeb858bf0, count=1, flags=0) at vm_pager.h:147
#9  0xc044a505 in vm_pageout_clean (m=0x0) at ../../../vm/vm_pageout.c:347
#10 0xc044b386 in vm_pageout_scan (pass=1) at ../../../vm/vm_pageout.c:985
#11 0xc044c106 in vm_pageout () at ../../../vm/vm_pageout.c:1476
#12 0xc032911d in fork_exit (callout=0xc044bdf4 <vm_pageout>, arg=0x0, frame=0xeb858d48)
    at ../../../kern/kern_fork.c:791
#13 0xc0474f6c in fork_trampoline () at ../../../i386/i386/exception.s:209

(Another panic I would get would follow roughly the same path except it
would die while trying to unlock a vnode lock that the thread didn't own.
I'll try to get this information some time, too.)

This might all trace back to vm_pageout_clean() being called with a NULL
argument. Looking at vm_pageout_clean, it looks as though that should
never happen -- at least, there's nothing there that checks whether m is
NULL before it goes on to treat it as a pointer to a struct:

static int
vm_pageout_clean(m)
	vm_page_t m;
{
	vm_object_t object;
	vm_page_t mc[2*vm_pageout_page_count];
	int pageout_count;
	int ib, is, page_base;
	vm_pindex_t pindex = m->pindex;

	mtx_assert(&vm_page_queue_mtx, MA_OWNED);
	VM_OBJECT_LOCK_ASSERT(m->object, MA_OWNED);

In frame #10, vm_pageout_scan:

#10 0xc044b386 in vm_pageout_scan (pass=1) at ../../../vm/vm_pageout.c:985
985 if (vm_pageout_clean(m) != 0) {
(gdb) p m
$65 = 0xc0da66f8
(gdb) p *m
$78 = {pageq = {tqe_next = 0xeb858cb0, tqe_prev = 0xc231c840},
listq = {tqe_next = 0x0, tqe_prev = 0xc6901a88}, left = 0x0,
right = 0x0, object = 0xc6901a50, pindex = 1, phys_addr = 296792064,
md = {pv_list_count = 0, pv_list = {tqh_first = 0x0,
tqh_last = 0xc0da6728}},queue = 33, flags = 4, pc = 11,
wire_count = 0, hold_count = 0, act_count = 0 '\0', busy = 1 '\001',
valid = 255 '', dirty = 255 '', cow = 0}

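So m is valid in vm_pageout_scan (0xc0da66f8), yet the backtrace shows
vm_pageout_clean receiving m=0x0; it seems as though m is getting lost
somewhere in between. What follows that seems to be undefined behavior.
(I have slightly modified the output above: valid had a 'y'-shaped
upper-8-bit symbol between the quotes, which is omitted here, and I
reformatted the output to fit in 80 columns.)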

I'll admit I'm quite green when it comes to debugging kernels, especially
5.x kernels. It gets really tricky when some functions trace back to .h
files, and not all of the variables seem available to the debugger. The
servers appear to work fine without PAE enabled, if that's of interest.

This gdb session is still active and I hope to keep it active in case
there are other commands you'd like me to run that might help shed some
light on the situation.