Re: gcc -pg causes 'kernel trap 12 with interrupts disabled' panic

2001-06-01 Thread David Taylor

On Thu, 31 May 2001, Bruce Evans wrote:
> On Wed, 30 May 2001, David Taylor wrote:
>
> > When trying to profile ircd-hybrid-7 on -CURRENT (I tried using a pre-vm
> > madness version first, then tried a version cvsuped today), I reliably get
> > lots of:
> >
> >     kernel trap 12 with interrupts disabled
> >
> > messages on the console (one every 5-10 seconds, when the ircd is reasonably
> > loaded).
>
> This is because ast() calls addupc_task() with sched_lock held.
> addupc_task() calls copyin() and copyin() sometimes traps to fault in the
> profiling buffer.
>
> This seems to be just a bug in ast().  userret() is missing the bug.
> Untested fix:
>
> ---
> Index: trap.c
> ===
> RCS file: /home/ncvs/src/sys/i386/i386/trap.c,v
> retrieving revision 1.189
> diff -u -1 -r1.189 trap.c
> --- trap.c  2001/05/23 22:58:09 1.189
> +++ trap.c  2001/05/31 13:09:02
> @@ -1285,5 +1341,6 @@
>          mtx_lock(&Giant);
> -        mtx_lock_spin(&sched_lock);
>          addupc_task(p, p->p_stats->p_prof.pr_addr,
>              p->p_stats->p_prof.pr_ticks);
> +        mtx_lock_spin(&sched_lock);
> +        /* XXX why not unlock Giant? */
>      }
> ---

I tested this, and it works!

No more `kernel trap 12 with interrupts disabled' messages, and also,
thankfully, no more panics.  (None related to this, anyway; I'm still
seeing what look like freelist corruption problems.)
 
> I think this is caused by the same bug.
>
>     kernel trap almost any with interrupts disabled
>
> should be fatal (the case of trap 12 (only) _is_ fatal in my version),
> but the kernel attempts to fix the problem and continue.  This sort
> of worked when things were locked by disabling interrupts.  Now, things
> may be locked by a spinlock as well as by disabling interrupts, and
> the corresponding fixup would be to release the spinlock.  But this
> is more obviously wrong.
>
> Bruce
 

Yeah, just trying to cover up the problem and march on usually doesn't work
out very well in computing.. or anywhere else, really..

-- 
David Taylor
[EMAIL PROTECTED]



Re: gcc -pg causes 'kernel trap 12 with interrupts disabled' panic

2001-05-31 Thread Bruce Evans

On Wed, 30 May 2001, David Taylor wrote:

> When trying to profile ircd-hybrid-7 on -CURRENT (I tried using a pre-vm
> madness version first, then tried a version cvsuped today), I reliably get
> lots of:
>
>     kernel trap 12 with interrupts disabled
>
> messages on the console (one every 5-10 seconds, when the ircd is reasonably
> loaded).

This is because ast() calls addupc_task() with sched_lock held.
addupc_task() calls copyin() and copyin() sometimes traps to fault in the
profiling buffer.

This seems to be just a bug in ast().  userret() is missing the bug.
Untested fix:

---
Index: trap.c
===
RCS file: /home/ncvs/src/sys/i386/i386/trap.c,v
retrieving revision 1.189
diff -u -1 -r1.189 trap.c
--- trap.c  2001/05/23 22:58:09 1.189
+++ trap.c  2001/05/31 13:09:02
@@ -1285,5 +1341,6 @@
         mtx_lock(&Giant);
-        mtx_lock_spin(&sched_lock);
         addupc_task(p, p->p_stats->p_prof.pr_addr,
             p->p_stats->p_prof.pr_ticks);
+        mtx_lock_spin(&sched_lock);
+        /* XXX why not unlock Giant? */
     }
---
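
For readers who want to see why the ordering matters, here is a minimal user-space sketch of the situation described above. It is only a model, not kernel code: struct model and every model_* function are hypothetical stand-ins for sched_lock, copyin() and addupc_task(), and the "fault" is just a flag flip. The only point it tries to show is that a page fault can be taken with the spin lock held in the old ordering and cannot in the patched one.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space stand-ins; none of these are real kernel types. */
struct model {
    bool spin_held;      /* models sched_lock being held */
    bool page_resident;  /* models whether the profiling buffer is paged in */
    int  bad_traps;      /* page faults taken while spin_held was true */
};

static void model_lock_spin(struct model *m)
{
    m->spin_held = true;
}

static void model_unlock_spin(struct model *m)
{
    assert(m->spin_held);
    m->spin_held = false;
}

/* Models copyin(): touching a non-resident page faults it in. */
static void model_copyin(struct model *m)
{
    if (!m->page_resident) {
        if (m->spin_held)
            m->bad_traps++;  /* the "trap 12 with interrupts disabled" case */
        m->page_resident = true;
    }
}

/* Models addupc_task(): it reaches user memory through copyin. */
static void model_addupc_task(struct model *m)
{
    model_copyin(m);
}

/* Ordering before the patch: lock first, then touch user memory. */
static int ast_before(struct model *m)
{
    model_lock_spin(m);
    model_addupc_task(m);
    model_unlock_spin(m);
    return m->bad_traps;
}

/* Ordering after the patch: touch user memory first, then lock. */
static int ast_after(struct model *m)
{
    model_addupc_task(m);
    model_lock_spin(m);
    /* work that really needs the lock would go here */
    model_unlock_spin(m);
    return m->bad_traps;
}
```

In this model ast_before() only records a bad trap when the buffer page is not resident, which would be consistent with the messages showing up every few seconds under load rather than on every profiling tick.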

> One thing I got with ircd-hybrid-7 (when very heavily loaded with lots of
> clones), which I _couldn't_ replicate with the test program (probably because
> it wasn't very heavily loaded) was a panic:
> ...
> kernel trap 12 with interrupts disabled
> panic: mutex sched lock recursed at /usr/src/sys/kern/kern_sync.c:858
> Debugger(panic)
> Stopped at  Debugger+0x45:  pushl   %ebx
> db> t
> Debugger(c02fa51b) at Debugger+0x45
> panic(c02f9684,c031d2a9,c02fad20,35a,282) at panic+0x70
> _mtx_assert(c03a7120,9,c02fad20,35a,282) at _mtx_assert+0x6c
> mi_switch(d1748420,38,d1748420,e,d17c9e90) at mi_switch+0x25
> ithread_schedule(c26e3180,1) at ithread_schedule+0x165
> sched_ithd(e) at sched_ithd+0x3d
> Xresume14() at Xresume14+0x7
> -- interrupt, eip = 0xc02c8a18, esp = 0xd17c9ed8, ebp = 0xd17c9f04 --
> trap(d1740018,d1740010,8080010,d17c9f72,85d02ec) at trap+0x94
> calltrap() at calltrap+0x5
> -- trap 0xc, eip = 0xc02c71f5, esp = 0xd17c9f4c, ebp = 0xd17c9f74 --
> generic_copyin(d1748420,809073d,1) at generic_copyin+0x39
> ast(d17c9fa8) at ast+0x318
> doreti_ast() at doreti_ast+0x6

I think this is caused by the same bug.

kernel trap almost any with interrupts disabled

should be fatal (the case of trap 12 (only) _is_ fatal in my version),
but the kernel attempts to fix the problem and continue.  This sort
of worked when things were locked by disabling interrupts.  Now, things
may be locked by a spinlock as well as by disabling interrupts, and
the corresponding fixup would be to release the spinlock.  But this
is more obviously wrong.
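
One possible reading of the quoted panic, again as a user-space sketch rather than kernel code. It assumes that the lock can be acquired recursively by its owner and that the context-switch path insists on "owned and not recursed", which is what the "sched lock recursed" message suggests. struct model_mtx and all model_* and trap_* names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a recursive spin mutex; not the kernel's struct mtx. */
struct model_mtx {
    bool owned;
    int  recurse;  /* extra acquisitions by the current owner */
};

static void model_lock(struct model_mtx *m)
{
    if (m->owned)
        m->recurse++;
    else
        m->owned = true;
}

static void model_unlock(struct model_mtx *m)
{
    assert(m->owned);
    if (m->recurse > 0)
        m->recurse--;
    else
        m->owned = false;
}

/* Models the check made before a context switch: owned, not recursed. */
static bool model_switch_ok(const struct model_mtx *m)
{
    return m->owned && m->recurse == 0;
}

/* Models an interrupt that wants to switch: lock, check, unlock. */
static bool model_interrupt(struct model_mtx *m)
{
    bool ok;

    model_lock(m);
    ok = model_switch_ok(m);
    model_unlock(m);
    return ok;
}

/* The trap happens with the lock held and the handler carries on. */
static bool trap_with_lock_held(void)
{
    struct model_mtx m = { false, 0 };
    bool ok;

    model_lock(&m);             /* the caller already holds the lock */
    ok = model_interrupt(&m);   /* an interrupt arrives during the fault */
    model_unlock(&m);
    return ok;
}

/* The same interrupt when nothing holds the lock at trap time. */
static bool trap_without_lock(void)
{
    struct model_mtx m = { false, 0 };

    return model_interrupt(&m);
}
```

In this model trap_with_lock_held() fails the switch check because the interrupt path's acquisition is a recursion, while trap_without_lock() passes it. That would fit the view that continuing after such a trap only defers the failure to a less obvious place.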

Bruce

