On Fri, 23 Feb 2001, Warner Losh wrote:

> I've added INVARIANTS and WITNESS to my kernel.  Today I get a random
> panic on boot sometimes:
> 
> lock order reversal           (this doesn't cause the panic, but
>                               does seem to happen all the time)
>  1st vnode interlock last acquired @ ../../usr/ffs/ffs_fsops.c:396
>  2nd 0xc04837a0 mntvnode @ ../../ufs/ffs/ffs_vfsops.c:457
>  3rd 0xc80b9e8c vnode interlock @ ../../kern/vfs_subr.c:1872
> kernel trap 12 with interrupts disabled
> panic: runq-add: proc 0xc7b28ee0 (fsck_ufs) not SRUN
> Debugger("panic")
> Stopped at Debugger+0x44: pushl %ebx
> db> trace
> Debugger(c03d3c03) at Debugger+0x44
> panic(c03d4040,c7b28ee0,c7b290a5,282,c7b2b960) at panic+0x70
> runq_add(c046ae20,c7b28ee0,c8a4bcra4,c0221ee5,c7b28ee0) at runq_add+0x40
> setrunqueue(c7b28ee0) at setrunqueue+0x10
> ithread_schedule(c0f0a00,1) at ithread_schedule+0x129
> sched_ithd(e) at sched_ithd+0x3f
> Xresume14() at Xresume14+0x8
> --- interrupt, eip = 0xc03830fb, esp = 0x286, ebp = 0xc8a4bd34 ---
> trap(18,10,10,73b152,0) at trap+0x9b
> calltrap() at calltrap+0x5
> --- trap 0xc, eip = 0xc03822e9, esp = 0xc8a4bd7c, ebp = 0xc8a4bd90 ---
> sw1b(0,...) at sw1b+0x6b
> msleep(...) at msleep+0x588
> physio(...) at physio+0x30d
> spec_read(...) at spec_read+0x71
> ufsspec_read(...) at ufsspec_read+0x20
> ufs_vnoperatespec(...) at ufs_vnoperatespec+0x15
> vn_read(...) at vn_read+0x128
> dofileread(...) at dofileread+0xb0
> read(...) at read+0x36
> syscall(...) at syscall+0x551
> Xint0x80_syscall() at Xint0x80_syscall+0x23
> --- syscall 0x3, eip = 0x8054770, esp = 0xbfbfef60, ebp = 0xbfbfef9c ---
> db>
> 
> Anything that I can do to help?  I don't have a core dump of this, but
> it is happening often enough to be a pain.

It seems to be another trap while holding sched_lock.  This should be
fatal, but the problem is only detected because trap() enables
interrupts.  Then an interrupt causes bad things to happen.  Unfortunately,
the above omits the critical information: the instruction at sw1b+0x6b.
There is no instruction at that address here.  It is apparently just an
access to a swapped-out page for the new process.  I can't see how this
ever worked.  The page must be faulted in, but this can't be done while
sched_lock is held (not to mention after we have committed to switching
contexts).
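
To illustrate the ordering problem, here is a rough userland sketch, not
kernel code.  All of the toy_* names are hypothetical stand-ins invented
for this example; the real trap() and sched_lock look nothing like this.
It only models the point that a fault taken while sched_lock is held
should be reported as fatal before interrupts are enabled, instead of
being noticed later when an interrupt tries to schedule something.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Toy model only.  These types and functions are assumptions made up
 * for illustration; they are not the FreeBSD definitions.
 */
struct toy_state {
	bool sched_lock_held;		/* models holding sched_lock */
	bool interrupts_enabled;	/* models the CPU interrupt flag */
};

/* Enabling interrupts is only legal when sched_lock is not held. */
static void
toy_enable_interrupts(struct toy_state *s)
{
	assert(!s->sched_lock_held);
	s->interrupts_enabled = true;
}

/*
 * Models the entry to a trap handler.  Returns 0 if the fault may be
 * handled normally, or -1 if it must be treated as fatal because it
 * was taken while sched_lock was held (for example, in the middle of
 * a context switch).
 */
static int
toy_trap(struct toy_state *s)
{
	if (s->sched_lock_held)
		return (-1);	/* fail here, before interrupts can run */
	toy_enable_interrupts(s);
	return (0);
}
```

With sched_lock_held set, toy_trap() returns -1 and leaves interrupts
disabled; with it clear, it enables interrupts and returns 0.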

Bruce


