Re: A page fault in subr_turnstile.c:propogate_priority()
On Wed, 3 Dec 2003, John Baldwin wrote:

> > With the patch from http://www.FreeBSD.org/~jhb/patches/turnstile.patch
> > I cannot even rebuild the kernel. Can I add the MPASS to the sources from
> > 2003.11.28.00.00.00?
>
> Ok, I've just updated turnstile.patch again.  I was testing a bogus
> condition in my earlier patch. :/  This patch has survived for a while
> now on my test machine that was panic'ing with the earlier patch, so
> try giving this patch a try against stock CVS sources and see if it
> clears up all your panics.  Thanks.

I added the MPASS to the sources from 2003.11.28.00.00.00, and since then the
system has completed 3 builds of the world. I expect it to panic tomorrow or so.
I'll get a crash dump and then try your new patch.

Can I apply the new patch to the sources from 2003.11.28.00.00.00, or do I need
to cvsup to a more recent date?

Igor Sysoev
http://sysoev.ru/en/
___
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "[EMAIL PROTECTED]"
Re: A page fault in subr_turnstile.c:propogate_priority()
On Wed, 3 Dec 2003, John Baldwin wrote:

> On 03-Dec-2003 Igor Sysoev wrote:
> > On Wed, 3 Dec 2003, John Baldwin wrote:
> >
> >>
> >> On 03-Dec-2003 Igor Sysoev wrote:
> >> > On Wed, 3 Dec 2003, John Baldwin wrote:
> >> >
> >> >> On 03-Dec-2003 Brian F. Feldman wrote:
> >> >> > Igor Sysoev <[EMAIL PROTECTED]> wrote:
> >> >> >> I'd cvsup'ed 5.1-CURRENT from 2003.11.04.02.02.00 up to
> >> >> >> 2003.11.28.00.00.00 with the turnstile support, and it still
> >> >> >> sometimes causes a page fault in propagate_priority().
> >> >> >> I have a core dump and can send debug output.
>
> Both faults are a trap 12 page fault with a faulting va of 0xe5
> because td1 is NULL in priority propagation.  Try this change to
> get a better crashdump that can be analyzed:
>
> Index: subr_turnstile.c
> ===================================================================
> RCS file: /usr/cvs/src/sys/kern/subr_turnstile.c,v
> retrieving revision 1.134
> diff -u -r1.134 subr_turnstile.c
> --- subr_turnstile.c	12 Nov 2003 23:48:42 -0000	1.134
> +++ subr_turnstile.c	3 Dec 2003 17:48:02 -0000
> @@ -254,6 +254,7 @@
>                 }
>
>                 td1 = TAILQ_PREV(td, threadqueue, td_lockq);
> +               MPASS(td1 != NULL);
>                 if (td1->td_priority <= pri) {
>                         mtx_unlock_spin(&tc->tc_lock);
>                         continue;
>
> Then print 'td' and '*ts' in gdb from the propagate_priority()
> frame and mail the output.

With the patch from http://www.FreeBSD.org/~jhb/patches/turnstile.patch
I cannot even rebuild the kernel. Can I add the MPASS to the sources from
2003.11.28.00.00.00?

Igor Sysoev
http://sysoev.ru/en/
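For readers following along, here is a minimal userland sketch of why a NULL td1 would show up as a fault at va 0xe5. It assumes that td_priority sits at offset 0xe5 in struct thread, which is what the "mov %al,0xe5(%ebx)" store in the disassembly elsewhere in this thread suggests. struct toy_thread, prev_waiter() and priority_read_address() are hypothetical names for illustration only, not kernel code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/queue.h>

/* Hypothetical stand-in for the kernel's struct thread; only the layout matters. */
struct toy_thread {
	unsigned char td_pad[0xe5];         /* filler so td_priority sits at offset 0xe5 */
	unsigned char td_priority;
	TAILQ_ENTRY(toy_thread) td_lockq;   /* linkage on a lock's wait queue */
};
TAILQ_HEAD(threadqueue, toy_thread);

/* Previous waiter on the queue, or NULL when td is already the first one. */
static struct toy_thread *
prev_waiter(struct toy_thread *td)
{
	return (TAILQ_PREV(td, threadqueue, td_lockq));
}

/* Address that a read of p->td_priority would touch, without performing it. */
static uintptr_t
priority_read_address(const struct toy_thread *p)
{
	return ((uintptr_t)p + offsetof(struct toy_thread, td_priority));
}

/* Usage: the first waiter has no predecessor, so the read would hit 0xe5. */
static void
check_null_prev(void)
{
	struct threadqueue q;
	static struct toy_thread a, b;

	TAILQ_INIT(&q);
	TAILQ_INSERT_TAIL(&q, &a, td_lockq);
	TAILQ_INSERT_TAIL(&q, &b, td_lockq);

	assert(prev_waiter(&b) == &a);
	assert(prev_waiter(&a) == NULL);
	assert(priority_read_address(prev_waiter(&a)) == 0xe5);
}
```

The MPASS(td1 != NULL) in the diff above would catch the same condition one step earlier, so the panic happens at the assertion rather than at the dereference.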
Re: A page fault in subr_turnstile.c:propogate_priority()
On Wed, 3 Dec 2003, Igor Sysoev wrote:

> On Wed, 3 Dec 2003, John Baldwin wrote:
>
> >
> > On 03-Dec-2003 Igor Sysoev wrote:
> > > On Wed, 3 Dec 2003, John Baldwin wrote:
> > >
> > >> On 03-Dec-2003 Brian F. Feldman wrote:
> > >> > Igor Sysoev <[EMAIL PROTECTED]> wrote:
> > >> >> I'd cvsup'ed 5.1-CURRENT from 2003.11.04.02.02.00 up to
> > >> >> 2003.11.28.00.00.00 with the turnstile support, and it still
> > >> >> sometimes causes a page fault in propagate_priority().
> > >> >> I have a core dump and can send debug output.
> > >> >
> > >> > Go ahead and load up kernel.debug and the core dump in gdb -k, and show us
> > >> > the backtrace.  Also, do you have any idea about more specific circumstances
> > >> > that will cause this problem?  Thanks!
> > >>
> > >> Actually, please try http://www.FreeBSD.org/~jhb/patches/turnstile.patch
> > >
> > > I've applied the patch.
> > > Now it's a second fault.
> >
> > This is the same fault.  Did you just apply this patch today or did
> > you apply an earlier version of it a while ago?  I just updated it
> > two days ago.
>
> In <[EMAIL PROTECTED]> I sent the trace from before
> the patch. It seems to me that it's a single fault.
>
> After I applied the patch I saw the double fault. I sent that trace in
> <[EMAIL PROTECTED]>.

Sorry, I missed your question. I got and applied the patch today.

Igor Sysoev
http://sysoev.ru/en/
Re: A page fault in subr_turnstile.c:propogate_priority()
On Wed, 3 Dec 2003, John Baldwin wrote:

>
> On 03-Dec-2003 Igor Sysoev wrote:
> > On Wed, 3 Dec 2003, John Baldwin wrote:
> >
> >> On 03-Dec-2003 Brian F. Feldman wrote:
> >> > Igor Sysoev <[EMAIL PROTECTED]> wrote:
> >> >> I'd cvsup'ed 5.1-CURRENT from 2003.11.04.02.02.00 up to
> >> >> 2003.11.28.00.00.00 with the turnstile support, and it still
> >> >> sometimes causes a page fault in propagate_priority().
> >> >> I have a core dump and can send debug output.
> >> >
> >> > Go ahead and load up kernel.debug and the core dump in gdb -k, and show us
> >> > the backtrace.  Also, do you have any idea about more specific circumstances
> >> > that will cause this problem?  Thanks!
> >>
> >> Actually, please try http://www.FreeBSD.org/~jhb/patches/turnstile.patch
> >
> > I've applied the patch.
> > Now it's a second fault.
>
> This is the same fault.  Did you just apply this patch today or did
> you apply an earlier version of it a while ago?  I just updated it
> two days ago.

In <[EMAIL PROTECTED]> I sent the trace from before
the patch. It seems to me that it's a single fault.

After I applied the patch I saw the double fault. I sent that trace in
<[EMAIL PROTECTED]>.

Igor Sysoev
http://sysoev.ru/en/
Re: A page fault in subr_turnstile.c:propogate_priority()
On Wed, 3 Dec 2003, John Baldwin wrote:

> On 03-Dec-2003 Brian F. Feldman wrote:
> > Igor Sysoev <[EMAIL PROTECTED]> wrote:
> >> I'd cvsup'ed 5.1-CURRENT from 2003.11.04.02.02.00 up to
> >> 2003.11.28.00.00.00 with the turnstile support, and it still
> >> sometimes causes a page fault in propagate_priority().
> >> I have a core dump and can send debug output.
> >
> > Go ahead and load up kernel.debug and the core dump in gdb -k, and show us
> > the backtrace.  Also, do you have any idea about more specific circumstances
> > that will cause this problem?  Thanks!
>
> Actually, please try http://www.FreeBSD.org/~jhb/patches/turnstile.patch

With this patch the system panics after a short time when I run
"make -j 32 buildworld". I tried 2 times.

Igor Sysoev
http://sysoev.ru/en/
Re: A page fault in subr_turnstile.c:propogate_priority()
On Wed, 3 Dec 2003, Brian F. Feldman wrote:

> Igor Sysoev <[EMAIL PROTECTED]> wrote:
> > I'd cvsup'ed 5.1-CURRENT from 2003.11.04.02.02.00 up to
> > 2003.11.28.00.00.00 with the turnstile support, and it still
> > sometimes causes a page fault in propagate_priority().
> > I have a core dump and can send debug output.
>
> Go ahead and load up kernel.debug and the core dump in gdb -k, and show us
> the backtrace.  Also, do you have any idea about more specific circumstances
> that will cause this problem?  Thanks!

It is an SMP system with 2 x P4, HTT CPUs halted, 4BSD scheduler.
It sometimes panics when running "make -j 64 buildworld" in a cycle.

panic: page fault
panic messages:
---
Fatal trap 12: page fault while in kernel mode
cpuid = 2; apic id = 02
fault virtual address   = 0xe5
fault code              = supervisor read, page not present
instruction pointer     = 0x8:0xc053f197
stack pointer           = 0x10:0xe3c21c80
frame pointer           = 0x10:0xe3c21ca0
code segment            = base 0x0, limit 0xf, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = resume, IOPL = 0
current process         = 42 (irq29: ahd0)
trap number             = 12
panic: page fault
cpuid = 2; boot() called on cpu#2

syncing disks, buffers remaining...
panic: bremfree: removing a buffer not on a queue
cpuid = 2; boot() called on cpu#2
Uptime: 1d2h4m15s
Dumping 2047 MB
 16 32 48 64 80 96 112 128 144 160 176 192 208 224 240 256
 272 288 304 320 336 352 368 384 400 416 432 448 464 480 496 512
 528 544 560 576 592 608 624 640 656 672 688 704 720 736 752 768
 784 800 816 832 848 864 880 896 912 928 944 960 976 992 1008 1024
 1040 1056 1072 1088 1104 1120 1136 1152 1168 1184 1200 1216 1232 1248 1264 1280
 1296 1312 1328 1344 1360 1376 1392 1408 1424 1440 1456 1472 1488 1504 1520 1536
 1552 1568 1584 1600 1616 1632 1648 1664 1680 1696 1712 1728 1744 1760 1776 1792
 1808 1824 1840 1856 1872 1888 1904 1920 1936 1952 1968 1984 2000 2016 2032
---
#0  doadump () at ../../../kern/kern_shutdown.c:240
240             dumping++;
(kgdb) bt
#0  doadump () at ../../../kern/kern_shutdown.c:240
#1  0xc0517067 in boot (howto=260) at ../../../kern/kern_shutdown.c:372
#2  0xc0517480 in poweroff_wait (junk=0xc0666ee0, howto=-729086152)
    at ../../../kern/kern_shutdown.c:550
#3  0xc05614d1 in bremfreel (bp=0xe3c218f0) at ../../../kern/vfs_bio.c:647
#4  0xc05613db in bremfree (bp=0x0) at ../../../kern/vfs_bio.c:629
#5  0xc0565dd1 in getblk (vp=0xc8154000, blkno=131360, size=16384, slpflag=0,
    slptimeo=0, flags=0) at ../../../kern/vfs_bio.c:2468
#6  0xc05615b2 in breadn (vp=0xc8154000, blkno=0, size=0, rablkno=0x0,
    rabsize=0x0, cnt=0, cred=0x0, bpp=0x0) at ../../../kern/vfs_bio.c:700
#7  0xc056155c in bread (vp=0x0, blkno=0, size=0, cred=0x0, bpp=0x0)
    at ../../../kern/vfs_bio.c:682
#8  0xc05bba85 in ffs_update (vp=0xc815330c, waitfor=0)
    at ../../../ufs/ffs/ffs_inode.c:108
#9  0xc05d1802 in ffs_fsync (ap=0xe3c21af0)
    at ../../../ufs/ffs/ffs_vnops.c:325
#10 0xc05d06ca in ffs_sync (mp=0xc812a000, waitfor=2, cred=0xc3f00e80,
    td=0xc06a5ca0) at vnode_if.h:627
#11 0xc057ab7e in sync (td=0xc06a5ca0, uap=0x0)
    at ../../../kern/vfs_syscalls.c:141
#12 0xc0516b75 in boot (howto=256) at ../../../kern/kern_shutdown.c:281
#13 0xc0517480 in poweroff_wait (junk=0xc066a837, howto=-1066983121)
    at ../../../kern/kern_shutdown.c:550
#14 0xc0636d5c in trap_fatal (frame=0xc066a837, eva=0)
    at ../../../i386/i386/trap.c:821
#15 0xc06363c3 in trap (frame=
      {tf_fs = -473825256, tf_es = -1068498928, tf_ds = -473825264,
       tf_edi = -938141248, tf_esi = -1066743576, tf_ebp = -473817952,
       tf_isp = -473818004, tf_ebx = -941495168, tf_edx = 0,
       tf_ecx = -941553792, tf_eax = -941495136, tf_trapno = 12, tf_err = 0,
       tf_eip = -1068240489, tf_cs = 8, tf_eflags = 65667,
       tf_esp = -941551444, tf_ss = 131})
    at ../../../i386/i386/trap.c:250
#16 0xc0623228 in calltrap () at {standard input}:94
#17 0xc053f974 in turnstile_wait (ts=0xc81519c0, lock=0xc06a94a0, owner=0x0)
    at ../../../kern/subr_turnstile.c:509
#18 0xc050c655 in _mtx_lock_sleep (m=0xc06a94a0, opts=0, file=0x0, line=0)
    at ../../../kern/kern_mutex.c:476
#19 0xc0501405 in ithread_loop (arg=0xc7e05080)
    at ../../../kern/kern_intr.c:543
#20 0xc0500040 in fork_exit (callout=0xc0501240, arg=0x0, frame=0x0)
    at ../../../kern/kern_fork.c:793

(kgdb) disassemble 0xc053f197
Dump of assembler code for function propagate_priority:
0xc053f070:     push   %ebp
[ skipped ]
0xc053f0d7:     call   0xc052da60
0xc053f0dc:     jmp    0xc053f2b2
0xc053f0e1:     movzbl 0xfff0(%ebp),%eax
0xc053f0e5:     mov    %al,0xe5(%ebx)
0xc053f0eb:     mov    0x60(%ebx),%edi
0xc053f0ee:     mov    0x24(%edi),%eax
0xc053f0f1:     shr    $0x8,%eax
0xc053f0f4:     and    $0x7f,%eax
0xc053f0f7:     lea    (%eax,%eax,4),%eax
0xc053f0fa:     lea    0xc06ac820(,%eax,8),%esi
0xc053f101:     call   0x
Re: A page fault in subr_turnstile.c:propogate_priority()
On Wed, 3 Dec 2003, John Baldwin wrote:

> On 03-Dec-2003 Brian F. Feldman wrote:
> > Igor Sysoev <[EMAIL PROTECTED]> wrote:
> >> I'd cvsup'ed 5.1-CURRENT from 2003.11.04.02.02.00 up to
> >> 2003.11.28.00.00.00 with the turnstile support, and it still
> >> sometimes causes a page fault in propagate_priority().
> >> I have a core dump and can send debug output.
> >
> > Go ahead and load up kernel.debug and the core dump in gdb -k, and show us
> > the backtrace.  Also, do you have any idea about more specific circumstances
> > that will cause this problem?  Thanks!
>
> Actually, please try http://www.FreeBSD.org/~jhb/patches/turnstile.patch

I've applied the patch.
Now it's a second fault.

Fatal trap 12: page fault while in kernel mode
cpuid = 2; apic id = 02
fault virtual address   = 0xe5
fault code              = supervisor read, page not present
instruction pointer     = 0x8:0xc053f1a8
stack pointer           = 0x10:0xe62dbaac
frame pointer           = 0x10:0xe62dbacc
code segment            = base 0x0, limit 0xf, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = resume, IOPL = 0
current process         = 1851 (make)
trap number             = 12
panic: page fault
cpuid = 2; boot() called on cpu#2

syncing disks, buffers remaining...
panic: bremfree: removing a buffer not on a queue
cpuid = 2; boot() called on cpu#2
Uptime: 2m55s
Dumping 2047 MB
[ skipped ]
---
#0  doadump () at ../../../kern/kern_shutdown.c:240
240             dumping++;
(kgdb) bt
#0  doadump () at ../../../kern/kern_shutdown.c:240
#1  0xc0517067 in boot (howto=260) at ../../../kern/kern_shutdown.c:372
#2  0xc0517480 in poweroff_wait (junk=0xc0666ee0, howto=-731617912)
    at ../../../kern/kern_shutdown.c:550
#3  0xc05614e1 in bremfreel (bp=0xe62db71c) at ../../../kern/vfs_bio.c:647
#4  0xc05613eb in bremfree (bp=0x0) at ../../../kern/vfs_bio.c:629
#5  0xc0565de1 in getblk (vp=0xc814d000, blkno=131552, size=16384, slpflag=0,
    slptimeo=0, flags=0) at ../../../kern/vfs_bio.c:2468
#6  0xc05615c2 in breadn (vp=0xc814d000, blkno=0, size=0, rablkno=0x0,
    rabsize=0x0, cnt=0, cred=0x0, bpp=0x0) at ../../../kern/vfs_bio.c:700
#7  0xc056156c in bread (vp=0x0, blkno=0, size=0, cred=0x0, bpp=0x0)
    at ../../../kern/vfs_bio.c:682
#8  0xc05bba95 in ffs_update (vp=0xc896130c, waitfor=0)
    at ../../../ufs/ffs/ffs_inode.c:108
#9  0xc05d1812 in ffs_fsync (ap=0xe62db91c)
    at ../../../ufs/ffs/ffs_vnops.c:325
#10 0xc05d06da in ffs_sync (mp=0xc811a800, waitfor=2, cred=0xc3f00e80,
    td=0xc06a5ca0) at vnode_if.h:627
#11 0xc057ab8e in sync (td=0xc06a5ca0, uap=0x0)
    at ../../../kern/vfs_syscalls.c:141
#12 0xc0516b75 in boot (howto=256) at ../../../kern/kern_shutdown.c:281
#13 0xc0517480 in poweroff_wait (junk=0xc066a837, howto=-1066983121)
    at ../../../kern/kern_shutdown.c:550
#14 0xc0636d6c in trap_fatal (frame=0xc066a837, eva=0)
    at ../../../i386/i386/trap.c:821
#15 0xc06363d3 in trap (frame=
      {tf_fs = -1068105704, tf_es = -1037172720, tf_ds = -433258480,
       tf_edi = -932123520, tf_esi = -1066740096, tf_ebp = -433210676,
       tf_isp = -433210728, tf_ebx = -937236928, tf_edx = 0,
       tf_ecx = -931024576, tf_eax = -937236896, tf_trapno = 12, tf_err = 0,
       tf_eip = -1068240472, tf_cs = 8, tf_eflags = 65666,
       tf_esp = -433210676, tf_ss = -1068447965})
    at ../../../i386/i386/trap.c:250
#16 0xc0623238 in calltrap () at {standard input}:94
#17 0xc053f984 in turnstile_wait (ts=0xc870ec80, lock=0xc06a94a0, owner=0x0)
    at ../../../kern/subr_turnstile.c:510
#18 0xc050c655 in _mtx_lock_sleep (m=0xc06a94a0, opts=0, file=0x0, line=0)
    at ../../../kern/kern_mutex.c:476
#19 0xc04ef50f in cv_timedwait_sig (cvp=0xc06adc64, mp=0xc06adc40, timo=0)
    at ../../../kern/kern_condvar.c:478
#20 0xc0541972 in kern_select (td=0xc881b140, nd=20, fd_in=0xbfbfdbd0,
    fd_ou=0x0, fd_ex=0x0, tvp=0xe62dbcd8) at ../../../kern/sys_generic.c:844
#21 0xc05412c6 in select (td=0x0, uap=0xe62dbd14)
    at ../../../kern/sys_generic.c:720
#22 0xc0637120 in syscall (frame=
      {tf_fs = 47, tf_es = 47, tf_ds = 47, tf_edi = -1077945264,
       tf_esi = 134765696, tf_ebp = -1077945256, tf_isp = -433209996,
       tf_ebx = 134895872, tf_edx = 2184, tf_ecx = 0, tf_eax = 93,
       tf_trapno = 22, tf_err = 2, tf_eip = 134628119, tf_cs = 31,
       tf_eflags = 514, tf_esp = -1077945444, tf_ss = 47})
    at ../../../i386/i386/trap.c:1010
#23 0xc062328d in Xint0x80_syscall () at {standard input}:136
---Can't read userspace from dump, or kernel process---

(kgdb)

Igor Sysoev
http://sysoev.ru/en/
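A note on reading these trap frames: kgdb prints the saved i386 registers as signed 32-bit integers, so tf_eip has to be reinterpreted as unsigned before it can be compared with the "instruction pointer" line of the panic message. The sketch below does that for the two tf_eip values posted in this thread. tf_reg_to_addr() and eip_matches() are hypothetical helper names for illustration, not part of any FreeBSD tool.

```c
#include <assert.h>
#include <stdint.h>

/*
 * kgdb shows trap frame registers as signed 32-bit integers;
 * reinterpret one as the unsigned address it holds.
 */
static uint32_t
tf_reg_to_addr(int32_t reg)
{
	return ((uint32_t)reg);
}

/* True when the saved eip matches the instruction pointer the panic reported. */
static int
eip_matches(int32_t tf_eip, uint32_t reported_ip)
{
	return (tf_reg_to_addr(tf_eip) == reported_ip);
}

/* Usage: the tf_eip values from the two page faults in this thread. */
static void
check_trap_frames(void)
{
	/* Fault after applying turnstile.patch: 0x8:0xc053f1a8. */
	assert(tf_reg_to_addr(-1068240472) == 0xc053f1a8u);
	/* Fault before the patch: 0x8:0xc053f197. */
	assert(eip_matches(-1068240489, 0xc053f197u));
}
```

Both values decode to the instruction pointers in the corresponding panic messages, and the "disassemble 0xc053f197" output earlier in the thread attributes the first of them to propagate_priority, even though no propagate_priority frame appears in either backtrace.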
A page fault in subr_turnstile.c:propogate_priority()
I'd cvsup'ed 5.1-CURRENT from 2003.11.04.02.02.00 up to
2003.11.28.00.00.00 with the turnstile support, and it still
sometimes causes a page fault in propagate_priority().
I have a core dump and can send debug output.

Igor Sysoev
http://sysoev.ru/en/
panic: bad pte
I have a core dump caused by "panic: bad pte" on FreeBSD 5.1-CURRENT SMP
cvsup'ed on date=2003.11.04.02.02.00.

The system runs "make -j 64 buildworld" in a cycle and sometimes panics
with the message "bad pte".

-----
Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x24
fault code              = supervisor read, page not present
instruction pointer     = 0x8:0xc050a35b
stack pointer           = 0x10:0xe21c6c88
frame pointer           = 0x10:0xe21c6c9c
code segment            = base 0x0, limit 0xf, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = resume, IOPL = 0
current process         = 42 (irq29: ahd0)
trap number             = 12
panic: page fault
cpuid = 0; boot() called on cpu#0
-----

-----
(kgdb) where
#0  doadump () at ../../../kern/kern_shutdown.c:240
#1  0xc0515167 in boot (howto=260) at ../../../kern/kern_shutdown.c:372
#2  0xc0515580 in poweroff_wait (junk=0xc06676f0, howto=-1066995772)
    at ../../../kern/kern_shutdown.c:550
#3  0xc063359c in trap_fatal (frame=0xc06676f0, eva=0)
    at ../../../i386/i386/trap.c:821
#4  0xc0632c13 in trap (frame=
      {tf_fs = -1007615976, tf_es = -501481456, tf_ds = -1068433392,
       tf_edi = 4, tf_esi = 20, tf_ebp = -501453668, tf_isp = -501453708,
       tf_ebx = 0, tf_edx = -1067055282, tf_ecx = -920489984, tf_eax = 20,
       tf_trapno = 12, tf_err = 0, tf_eip = -1068457125, tf_cs = 8,
       tf_eflags = 65683, tf_esp = 91645925, tf_ss = -148261714})
    at ../../../i386/i386/trap.c:250
#5  0xc061fbb8 in calltrap () at {standard input}:94
#6  0xc050a7a9 in _mtx_lock_sleep (m=0x14, opts=0, file=0x0, line=0)
    at ../../../kern/kern_mutex.c:635
#7  0xc04ff295 in ithread_loop (arg=0xc7df1080)
    at ../../../kern/kern_intr.c:543
#8  0xc04fded0 in fork_exit (callout=0xc04ff0d0, arg=0x0, frame=0x0)
    at ../../../kern/kern_fork.c:793
-----

But it seems that the backtrace is incorrect, because the faulting
instruction is in kern/kern_mutex.c:propagate_priority() @c050a35b.
Here is the disassembled and commented code starting from line 150 in
kern/kern_mutex.c:propagate_priority():

c050a332   cmpl   $0x3,0xe4(%ecx)     # if (TD_ON_RUNQ(td)) {
c050a339   jne    0xc050a350
c050a33b   mov    %esi,%edx           # prio -> %edx
c050a33d   movzbl %dl,%eax            # prio -> %eax
c050a340   mov    %eax,0x4(%esp,1)    # prio
c050a344   mov    %ecx,(%esp,1)       # td
c050a347   call   0xc052bc10          # sched_prio(td, pri);
c050a34c   jmp    0xc050a3cb
c050a34e   mov    %esi,%esi           # nop
c050a350   mov    %esi,%eax           # prio -> %eax
c050a352   mov    %al,0xdd(%ecx)      # td->td_priority = pri;
c050a358   mov    0x5c(%ecx),%ebx     # m = td->td_blocked;
FAULT:
c050a35b   cmp    0x24(%ebx),%ecx     # if (td == TAILQ_FIRST(&m->mtx_blocked)) {
c050a35e   je     0xc050a2f0          #         continue;

It seems that td->td_blocked is NULL.

Igor Sysoev
http://sysoev.ru/en/
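The reading above is consistent with the trap frame: tf_ebx = 0 and the fault address is 0x24, which is what "cmp 0x24(%ebx),%ecx" touches when m is NULL. To make the failure mode concrete, here is a deliberately simplified userland sketch of priority lending along a chain of blocked threads. It is not the kernel's propagate_priority(): it ignores run queues, sleep states and locking entirely, and struct toy_thread, struct toy_mutex and toy_propagate_priority() are hypothetical names. The only point is that the walk has to stop when a thread's td_blocked is NULL rather than dereference it.

```c
#include <assert.h>
#include <stddef.h>

struct toy_mutex;

/* Hypothetical, heavily reduced thread: a priority and the lock it waits on. */
struct toy_thread {
	unsigned char	 td_priority;	/* lower value means higher priority */
	struct toy_mutex *td_blocked;	/* lock this thread waits on, or NULL */
};

/* Hypothetical, heavily reduced mutex: only its current owner. */
struct toy_mutex {
	struct toy_thread *mtx_owner;
};

/*
 * Lend td's priority to the owner of the lock it waits on, then to the
 * owner of the lock that owner waits on, and so on.  Returns the number
 * of threads whose priority was raised.
 */
static int
toy_propagate_priority(struct toy_thread *td)
{
	unsigned char pri = td->td_priority;
	struct toy_mutex *m = td->td_blocked;
	int boosted = 0;

	while (m != NULL) {		/* a NULL td_blocked ends the walk */
		struct toy_thread *owner = m->mtx_owner;

		if (owner == NULL || owner->td_priority <= pri)
			break;		/* nobody to boost, or already high enough */
		owner->td_priority = pri;
		boosted++;
		m = owner->td_blocked;	/* follow the chain to the next lock */
	}
	return (boosted);
}

/* Usage: a waits on m1 (owned by b), b waits on m2 (owned by c), c runs. */
static void
check_propagation(void)
{
	struct toy_thread c = { 80, NULL };
	struct toy_mutex m2 = { &c };
	struct toy_thread b = { 50, &m2 };
	struct toy_mutex m1 = { &b };
	struct toy_thread a = { 10, &m1 };

	assert(toy_propagate_priority(&a) == 2);
	assert(b.td_priority == 10 && c.td_priority == 10);
	/* c is not blocked on anything: nothing to do, and no NULL dereference. */
	assert(toy_propagate_priority(&c) == 0);
}
```

In the kernel code shown above there is no such check between "m = td->td_blocked" and the TAILQ_FIRST(&m->mtx_blocked) comparison, so a thread that is neither on a run queue nor blocked on a mutex would lead straight to the fault at 0x24. Whether that state is a legitimate case or a symptom of a race elsewhere is not something this sketch can answer.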