Re: NULL td passed to propagate_priority() when using xmms...
On Mon, 3 Nov 2003, John Baldwin wrote: On 01-Nov-2003 Soren Schmidt wrote: It seems Sean Chittenden wrote: Howdy. I'm not sure if this is a ULE bug or a KSE bug, or both, but, for those interested (this is using ule 1.67, rebuilding world now), here's my stack. I couldn't figure out where td was being set to NULL. :( Oh! Where is TD_SET_LOCK defined? egrep -r didn't turn up anything. -sc Its not ULE, I'm running 4BSD and has gotten this on boot for over a week now, rendering -current totally useless... Having a kernel panic with INVARIANTS on would really help narrow down where the bug is. I found something that causes this bug fairly reliably: - configure ddb so that db_print_backtrace() is called on panics. - break the fd driver so that the panic() in fdstrategy() is called on floppy accesses. - attempt to access a floppy so that fdstrategy() is called. - db_print_backtrace() then does bad things. It never completes here, though it works in other contexts. Usually it prints only the first line or two. Then quite often ddb is called for a null pointer panic in propagate_priority(). More details about the null pointer panic: This seems to have nothing to do with scheduling. propagate_priority() is not called with a null td of course, but it sometimes follows a null m: %%% /* * Pick up the mutex that td is blocked on. */ m = td-td_blocked; MPASS(m != NULL); /* * Check if the thread needs to be moved up on * the blocked chain */ if (td == TAILQ_FIRST(m-mtx_blocked)) { continue; } %%% I don't have invariants enabled, so MPASS(m != NULL) doesn't do anything, but m is null so attempting to load m-mtx_blocked causes a panic. For the backtrace context, propagate_priority() gets called for attempting to aquire a lock in softclock(). Tasks like the softclock task get scheduled despite the system being in panic(). ps seemed to show that the user process doing the floppy access no longer existed. I don't know how that could happen, since the panic() is done in the context of the that process. More details about bugs in db_print_backtrace(): Maybe the stack is messed up. Attempting to access invalid stack offsets can cause problems. My version of db_print_backtrace() has extra code to attempt not to access invalid offsets, but there is normally no problem since ddb's trap handler fixes up the problem. But backtrace() bogusly calls db_print_backtrace() in non-ddb context and then the longjmp in the trap handler goes to hyperspace if anywhere. Bugs tripped over while debugging this: Putting a breakpoint in fdopen() didn't work, because fd.c:fdopen() conflicts with kern_descrip.c:fdopen(). This was broken in fd.c 1.259. There are hundreds of similar conflicts in GENERIC, some for obviously broken things like the same malloc type being static in several files. Bruce ___ [EMAIL PROTECTED] mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: NULL td passed to propagate_priority() when using xmms...
On 01-Nov-2003 Soren Schmidt wrote: It seems Sean Chittenden wrote: Howdy. I'm not sure if this is a ULE bug or a KSE bug, or both, but, for those interested (this is using ule 1.67, rebuilding world now), here's my stack. I couldn't figure out where td was being set to NULL. :( Oh! Where is TD_SET_LOCK defined? egrep -r didn't turn up anything. -sc Its not ULE, I'm running 4BSD and has gotten this on boot for over a week now, rendering -current totally useless... Having a kernel panic with INVARIANTS on would really help narrow down where the bug is. -- John Baldwin [EMAIL PROTECTED]http://www.FreeBSD.org/~jhb/ Power Users Use the Power to Serve! - http://www.FreeBSD.org/ ___ [EMAIL PROTECTED] mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: NULL td passed to propagate_priority() when using xmms...
It seems John Baldwin wrote: On 01-Nov-2003 Soren Schmidt wrote: It seems Sean Chittenden wrote: Howdy. I'm not sure if this is a ULE bug or a KSE bug, or both, but, for those interested (this is using ule 1.67, rebuilding world now), here's my stack. I couldn't figure out where td was being set to NULL. :( Oh! Where is TD_SET_LOCK defined? egrep -r didn't turn up anything. -sc Its not ULE, I'm running 4BSD and has gotten this on boot for over a week now, rendering -current totally useless... Having a kernel panic with INVARIANTS on would really help narrow down where the bug is. doesn't produce anything it seems, but I do have this traceback: Fatal trap 12: page fault while in kernel mode fault virtual address = 0x24 fault code = supervisor read, page not present instruction pointer = 0x8:0xc04b8f3b stack pointer = 0x10:0xcd619bc4 frame pointer = 0x10:0xcd619bd8 code segment= base 0x0, limit 0xf, type 0x1b = DPL 0, pres 1, def32 1, gran 1 processor eflags= interrupt enabled, resume, IOPL = 0 current process = 3 (g_up) kernel: type 12 trap, code=0 Stopped at propagate_priority+0x8b:cmpl0x24(%ebx),%ecx db trace propagate_priority(c0ebfe40,c25983c0,c2512400,cd619c14,c04cc9d1) at propagate_pr iority+0x8b _mtx_lock_sleep(c069c340,0,0,0,c0698700) at _mtx_lock_sleep+0x249 bufdonebio(c7675b68,cd619c44,c048d612,c084c540,c259f990) at bufdonebio+0x47 biodone(c7675b68,c06411ba,c259f990,c7675b68,0) at biodone+0xbc g_dev_done(c259f990,c0ebfe40,1,0,4) at g_dev_done+0x8a biodone(c259f990,0,24c,c0640b45,a) at biodone+0xbc g_io_schedule_up(c0ebfe40,c0ebe1e4,cd619d34,c04acae1,0) at g_io_schedule_up+0xb8 g_up_procbody(0,cd619d48,0,0,0) at g_up_procbody+0x28 fork_exit(c048df60,0,cd619d48) at fork_exit+0xb1 fork_trampoline() at fork_trampoline+0x8 --- trap 0x1, eip = 0, esp = 0xcd619d7c, ebp = 0 --- db -Søren ___ [EMAIL PROTECTED] mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: NULL td passed to propagate_priority() when using xmms...
It seems Sean Chittenden wrote: Howdy. I'm not sure if this is a ULE bug or a KSE bug, or both, but, for those interested (this is using ule 1.67, rebuilding world now), here's my stack. I couldn't figure out where td was being set to NULL. :( Oh! Where is TD_SET_LOCK defined? egrep -r didn't turn up anything. -sc Its not ULE, I'm running 4BSD and has gotten this on boot for over a week now, rendering -current totally useless... (kgdb) bt #0 doadump () at /usr/src/sys/kern/kern_shutdown.c:240 #1 0xc0530569 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:372 #2 0xc0530948 in panic () at /usr/src/sys/kern/kern_shutdown.c:550 #3 0xc06e6bc6 in trap_fatal (frame=0xd675dc2c, eva=0) at /usr/src/sys/i386/i386/trap.c:820 #4 0xc06e6203 in trap (frame= {tf_fs = -881065960, tf_es = 16, tf_ds = -881065968, tf_edi = 0, tf_esi = -871763664, tf_ebp = -696918912, tf_isp = -696918952, tf_ebx = 24, tf_edx = 24, tf_ecx = -871763664, tf_eax = -879614080, tf_trapno = 12, tf_err = 0, tf_eip = -1068208110, tf_cs = 8, tf_eflags = 66051, tf_esp = -874093984, tf_ss = 0}) at /usr/src/sys/i386/i386/trap.c:252 #5 0xc06d6a68 in calltrap () at {standard input}:102 #6 0xc05257ac in propagate_priority (td=0x0) at /usr/src/sys/kern/kern_mutex.c:152 #7 0xc0525bf9 in _mtx_lock_sleep (m=0xc0796b40, opts=0, file=0x0, line=0) at /usr/src/sys/kern/kern_mutex.c:635 #8 0xc051a2c4 in ithread_loop (arg=0xcb7a6b00) at /usr/src/sys/kern/kern_intr.c:539 #9 0xc0518f51 in fork_exit (callout=0xc051a100 ithread_loop, arg=0x0, frame=0x0) at /usr/src/sys/kern/kern_fork.c:796 (kgdb) frame 9 #9 0xc0518f51 in fork_exit (callout=0xc051a100 ithread_loop, arg=0x0, frame=0x0) at /usr/src/sys/kern/kern_fork.c:796 796 callout(arg, frame); (kgdb) list 791 * cpu_set_fork_handler intercepts this function call to 792 * have this call a non-return function to stay in kernel mode. 793 * initproc has its own fork handler, but it does return. 794 */ 795 KASSERT(callout != NULL, (NULL callout in fork_exit)); 796 callout(arg, frame); 797 798 /* 799 * Check if a kernel thread misbehaved and returned from its main 800 * function. (kgdb) frame 8 #8 0xc051a2c4 in ithread_loop (arg=0xcb7a6b00) at /usr/src/sys/kern/kern_intr.c:539 539 mtx_lock(Giant); (kgdb) list 534 wakeup(ih); 535 mtx_unlock(ithd-it_lock); 536 goto restart; 537 } 538 if ((ih-ih_flags IH_MPSAFE) == 0) 539 mtx_lock(Giant); 540 ih-ih_handler(ih-ih_argument); 541 if ((ih-ih_flags IH_MPSAFE) == 0) 542 mtx_unlock(Giant); 543 } (kgdb) frame 7 #7 0xc0525bf9 in _mtx_lock_sleep (m=0xc0796b40, opts=0, file=0x0, line=0) at /usr/src/sys/kern/kern_mutex.c:635 635 propagate_priority(td); (kgdb) list 630 * Save who we're blocked on. 631 */ 632 td-td_blocked = m; 633 td-td_lockname = m-mtx_object.lo_name; 634 TD_SET_LOCK(td); 635 propagate_priority(td); 636 637 if (LOCK_LOG_TEST(m-mtx_object, opts)) 638 CTR3(KTR_LOCK, 639 _mtx_lock_sleep: p %p blocked on [%p] %s, td, m, (kgdb) frame 6 #6 0xc05257ac in propagate_priority (td=0x0) at /usr/src/sys/kern/kern_mutex.c:152 152 sched_prio(td, pri); (kgdb) list 147 * XXXKSE this gets a lot more complicated under threads 148 * but try anyhow. 149 */ 150 if (TD_ON_RUNQ(td)) { 151 MPASS(td-td_blocked == NULL); 152 sched_prio(td, pri); 153 return; 154 } 155 /* 156 * Adjust for any other cases. --- panic: page fault panic messages: --- Fatal trap 12: page fault while in kernel mode fault virtual address = 0x38 fault code = supervisor read, page not present instruction pointer = 0x8:0xc0547012 stack pointer = 0x10:0xd6763c6c frame pointer = 0x10:0xd6763c80 code segment= base 0x0, limit 0xf, type 0x1b = DPL 0, pres 1, def32 1, gran 1 processor eflags= interrupt enabled, resume, IOPL = 0 current process =
NULL td passed to propagate_priority() when using xmms...
Howdy. I'm not sure if this is a ULE bug or a KSE bug, or both, but, for those interested (this is using ule 1.67, rebuilding world now), here's my stack. I couldn't figure out where td was being set to NULL. :( Oh! Where is TD_SET_LOCK defined? egrep -r didn't turn up anything. -sc (kgdb) bt #0 doadump () at /usr/src/sys/kern/kern_shutdown.c:240 #1 0xc0530569 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:372 #2 0xc0530948 in panic () at /usr/src/sys/kern/kern_shutdown.c:550 #3 0xc06e6bc6 in trap_fatal (frame=0xd675dc2c, eva=0) at /usr/src/sys/i386/i386/trap.c:820 #4 0xc06e6203 in trap (frame= {tf_fs = -881065960, tf_es = 16, tf_ds = -881065968, tf_edi = 0, tf_esi = -871763664, tf_ebp = -696918912, tf_isp = -696918952, tf_ebx = 24, tf_edx = 24, tf_ecx = -871763664, tf_eax = -879614080, tf_trapno = 12, tf_err = 0, tf_eip = -1068208110, tf_cs = 8, tf_eflags = 66051, tf_esp = -874093984, tf_ss = 0}) at /usr/src/sys/i386/i386/trap.c:252 #5 0xc06d6a68 in calltrap () at {standard input}:102 #6 0xc05257ac in propagate_priority (td=0x0) at /usr/src/sys/kern/kern_mutex.c:152 #7 0xc0525bf9 in _mtx_lock_sleep (m=0xc0796b40, opts=0, file=0x0, line=0) at /usr/src/sys/kern/kern_mutex.c:635 #8 0xc051a2c4 in ithread_loop (arg=0xcb7a6b00) at /usr/src/sys/kern/kern_intr.c:539 #9 0xc0518f51 in fork_exit (callout=0xc051a100 ithread_loop, arg=0x0, frame=0x0) at /usr/src/sys/kern/kern_fork.c:796 (kgdb) frame 9 #9 0xc0518f51 in fork_exit (callout=0xc051a100 ithread_loop, arg=0x0, frame=0x0) at /usr/src/sys/kern/kern_fork.c:796 796 callout(arg, frame); (kgdb) list 791 * cpu_set_fork_handler intercepts this function call to 792 * have this call a non-return function to stay in kernel mode. 793 * initproc has its own fork handler, but it does return. 794 */ 795 KASSERT(callout != NULL, (NULL callout in fork_exit)); 796 callout(arg, frame); 797 798 /* 799 * Check if a kernel thread misbehaved and returned from its main 800 * function. (kgdb) frame 8 #8 0xc051a2c4 in ithread_loop (arg=0xcb7a6b00) at /usr/src/sys/kern/kern_intr.c:539 539 mtx_lock(Giant); (kgdb) list 534 wakeup(ih); 535 mtx_unlock(ithd-it_lock); 536 goto restart; 537 } 538 if ((ih-ih_flags IH_MPSAFE) == 0) 539 mtx_lock(Giant); 540 ih-ih_handler(ih-ih_argument); 541 if ((ih-ih_flags IH_MPSAFE) == 0) 542 mtx_unlock(Giant); 543 } (kgdb) frame 7 #7 0xc0525bf9 in _mtx_lock_sleep (m=0xc0796b40, opts=0, file=0x0, line=0) at /usr/src/sys/kern/kern_mutex.c:635 635 propagate_priority(td); (kgdb) list 630 * Save who we're blocked on. 631 */ 632 td-td_blocked = m; 633 td-td_lockname = m-mtx_object.lo_name; 634 TD_SET_LOCK(td); 635 propagate_priority(td); 636 637 if (LOCK_LOG_TEST(m-mtx_object, opts)) 638 CTR3(KTR_LOCK, 639 _mtx_lock_sleep: p %p blocked on [%p] %s, td, m, (kgdb) frame 6 #6 0xc05257ac in propagate_priority (td=0x0) at /usr/src/sys/kern/kern_mutex.c:152 152 sched_prio(td, pri); (kgdb) list 147 * XXXKSE this gets a lot more complicated under threads 148 * but try anyhow. 149 */ 150 if (TD_ON_RUNQ(td)) { 151 MPASS(td-td_blocked == NULL); 152 sched_prio(td, pri); 153 return; 154 } 155 /* 156 * Adjust for any other cases. --- panic: page fault panic messages: --- Fatal trap 12: page fault while in kernel mode fault virtual address = 0x38 fault code = supervisor read, page not present instruction pointer = 0x8:0xc0547012 stack pointer = 0x10:0xd6763c6c frame pointer = 0x10:0xd6763c80 code segment= base 0x0, limit 0xf, type 0x1b = DPL 0, pres 1, def32 1, gran 1 processor eflags= interrupt enabled, resume, IOPL = 0 current process = 33 (irq12: psm0) trap number = 12 panic: page fault syncing disks, buffers remaining... Fatal trap 12: page fault while in kernel mode fault virtual address = 0x38 fault code = supervisor read, page not present instruction
Re: NULL td passed to propagate_priority() when using xmms...
Howdy. I'm not sure if this is a ULE bug or a KSE bug, or both, but, for those interested (this is using ule 1.67, rebuilding world now), here's my stack. I couldn't figure out where td was being set to NULL. :( Oh! Where is TD_SET_LOCK defined? egrep -r didn't turn up anything. -sc sched_ule.c v1.70 fixes this. -sc -- Sean Chittenden ___ [EMAIL PROTECTED] mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to [EMAIL PROTECTED]