Re: Page fault in kernel when using CD, BSD 7.2
Mark Terribile wrote:

AMI BIOS. The NB heatsink is barely warm (fan cooled), the

Not necessarily a good sign; you would get this if the heat from the CPU is not getting transferred to the heatsink. Remove it, clean it, apply new heat transfer compound, and make sure the heatsink is actually seating properly and not getting lodged on something.

Chris
___
freebsd-questions@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to freebsd-questions-unsubscr...@freebsd.org
RE: Page fault in kernel when using CD, BSD 7.2
-----Original Message-----
From: Mark Terribile [mailto:materrib...@yahoo.com]
Sent: 12 August 2010 03:32
To: freebsd-questions@freebsd.org
Subject: Page fault in kernel when using CD, BSD 7.2

Hi,

In the last two days I've had two nasty problems on two machines. The first started dumping core on epiphany, apparently when the Javascript garbage collector ran. I found that the fan on the video card was running and stopping. I jury-rigged a fan over it (until I get a new one) and the problem has gone away. Probably nothing to do with the second problem, but who knows?

Second problem: on a machine (Core 2 Quad, 2.24 GHz) the CD/DVD drive has started to give me page faults in the kernel. The "press any key on the console to halt the reboot" does not work. I've been using this drive on and off for months. I've checked all the connections (PATA), blown out the machine (the temperatures reported by sysctl range from 50 to 59 degrees from core to core), and put a different power lead into the drive.

Sometimes the console gets large transfer errors (I don't want to excite the problem right now, as the fsck is finally running) before the fault. The disk transfers don't work, the drive won't open, the process can't be interrupted, etc. The error usually comes a few minutes after the drive stops working.

Yes, the processor is running a little hot, but I don't think it's dangerous and it's been like this for months. I have a compact heat sink on it and the interaction between the rotor/stator fan and the CPU speed control reduces the speed too much at low load. But again, it's been like that for months.

Does anyone have any suggestions? Is it worth trying a new PATA or SATA drive?

Mark Terribile
materrib...@yahoo.com

Might be worthwhile running memtest on the machine.

Regards
Graeme
Re: Page fault in kernel when using CD, BSD 7.2
Hi,

Second problem: on a machine (Core 2 Quad, 2.24 GHz) the CD/DVD drive has started to give me page faults in the kernel. The "press any key on the console to halt the reboot" does not work.

Okay, now it's happening with nothing but the fsck running. It takes maybe fifteen minutes.

Okay, I've got a suspect. I got it past the fsck by going into Single User and doing the file systems one disk at a time. (Two on this machine.) I suspect the power supply has gone marginal. My spare is much bigger than what I need for this machine; I'll wait on a replacement if I can. And I'll let you all know.

Thanks to those who've written.

Mark Terribile
materrib...@yahoo.com
Re: Page fault in swapper on boot
On Mon, Jun 28, 2010 at 02:16:42PM -0500, Richard Kolkovich wrote:

I rebooted today to recover a couple devices which were in use by zombie processes. Now, I'm met with the same page fault documented in this previous message:

http://lists.freebsd.org/pipermail/freebsd-questions/2010-June/217625.html

I also performed the gettext upgrade between my last reboot and this one, but I had no problems completing it. I have since booted the MFS live CD, mounted my zpools and updated world and kernel (RELENG_8 as of this morning) - still have the same page fault in the same place. I also tried GENERIC - no dice there, either. Single user, no-ACPI, etc. don't work. I'll build some debugging into the kernel and see what else I can find...

I should have done this before spamming the list, but at least I can provide a solution. With debugging enabled, I saw the problem child is the VirtualBox network module (vboxnetflt.ko). I set both enable_vboxdrv and enable_vboxnetflt to NO in loader.conf, and I was able to boot just fine.

Sorry for the noise - hope this helps someone else out.

--
Richard Kolkovich
http://www.sigil.org
PGP Key: 0x9E54EF59 (http://pgp.mit.edu)
Re: Page fault in swapper on boot
Richard Kolkovich wrote:

On Mon, Jun 28, 2010 at 02:16:42PM -0500, Richard Kolkovich wrote:

I rebooted today to recover a couple devices which were in use by zombie processes. Now, I'm met with the same page fault documented in this previous message:

http://lists.freebsd.org/pipermail/freebsd-questions/2010-June/217625.html

I also performed the gettext upgrade between my last reboot and this one, but I had no problems completing it. I have since booted the MFS live CD, mounted my zpools and updated world and kernel (RELENG_8 as of this morning) - still have the same page fault in the same place. I also tried GENERIC - no dice there, either. Single user, no-ACPI, etc. don't work. I'll build some debugging into the kernel and see what else I can find...

I should have done this before spamming the list, but at least I can provide a solution. With debugging enabled, I saw the problem child is the VirtualBox network module (vboxnetflt.ko). I set both enable_vboxdrv and enable_vboxnetflt to NO in loader.conf, and I was able to boot just fine.

Sorry for the noise - hope this helps someone else out.

The recent gettext update is a dependency for both the vbox kernel modules and the virtualbox-ose ports. Try rebuilding/reinstalling both with

    make
    make deinstall
    make reinstall

if you still wish to use them.

-Mike
Re: Page Fault.
On Monday 01 December 2008 19:32:59 Keith wrote:

Have a machine, Dell dual CPU/quad core Xeon. Runs FBSD 6.2. Custom kernel, with IPFW compiled in and using SMP.

FreeBSD hostname 6.2-RELEASE FreeBSD 6.2-RELEASE #1: Wed Jan 23 12:17:29 PST 2008

It runs Dovecot, Postfix, Mysql, Apache. Standard email stuff. Put into production in March, ran perfect until July 29th when it rebooted by itself. It rebooted 2 more times in the last few months on its own. But in the last 6 weeks it has become a weekly occurrence, with uptime no more than 6-7 days at most.

The last 2 times I have cores and have run kgdb on them. Both vmcores show the same things. Same pointers etc, the only difference is what the cpuid was at the time.

==
kernel trap 12 with interrupts disabled

Fatal trap 12: page fault while in kernel mode
cpuid = 2; apic id = 02
fault virtual address = 0x104
fault code = supervisor read, page not present
instruction pointer = 0x20:0xc066ca51
stack pointer = 0x28:0xe6ec0c90
frame pointer = 0x28:0xe6ec0c9c
code segment = base 0x0, limit 0xf, type 0x1b
= DPL 0, pres 1, def32 1, gran 1
processor eflags = resume, IOPL = 0
current process = 9 (thread taskq)
trap number = 12
panic: page fault
cpuid = 2
Uptime: 6d6h23m45s
Dumping 3327 MB (2 chunks)
chunk 0: 1MB (159 pages) ... ok
chunk 1: 3327MB (851624 pages) 3311 3295 3279 3263 3247 3231 3215 3199 3183 3167 3151 3135 3119 3103 3087 3071 3055 3039 3023 3007 2991 2975 2959 2943 2927 2911 2895 2879 2863 2847 2831 2815 2799 2783 2767 2751 2735 2719 2703 2687 2671 2655 2639 2623 2607 2591 2575 2559 2543 2527 2511 2495 2479 2463 2447 2431 2415 2399 2383 2367 2351 2335 2319 2303 2287 2271 2255 2239 2223 2207 2191 2175 2159 2143 2127 2111 2095 2079 2063 2047 2031 2015 1999 1983 1967 1951 1935 1919 1903 1887 1871 1855 1839 1823 1807 1791 1775 1759 1743 1727 1711 1695 1679 1663 1647 1631 1615 1599 1583 1567 1551 1535 1519 1503 1487 1471 1455 1439 1423 1407 1391 1375 1359 1343 1327 1311 1295 1279 1263 1247 1231 1215 1199 1183 1167 1151 1135 1119 1103 1087 1071 1055 1039 1023 1007 991 975 959 943 927 911 895 879 863 847 831 815 799 783 767 751 735 719 703 687 671 655 639 623 607 591 575 559 543 527 511 495 479 463 447 431 415 399 383 367 351 335 319 303 287 271 255 239 223 207 191 175 159 143 127 111 95 79 63 47 31 15

#0 doadump () at pcpu.h:165
165 __asm __volatile("movl %%fs:0,%0" : "=r" (td));

What might be the cause for this? It is in the same place every time. Once the machine hung and had to be powercycled. But on the screen was the same page fault error on the same process.

Frame 0 is useless. You need the frame after calltrap(). And, given:

instruction pointer = 0x20:0xc066ca51

run:

list *0xc066ca51

Generally a bt will show the needed information.
Likely cause: file system corruption, caused by background_fsck, but a backtrace should show more.

--
Mel

Problem with today's modular software: they start with the modules and never get to the software part.
Re: Page Fault.
On Mon, 1 Dec 2008, Mel wrote:

|-On Monday 01 December 2008 19:32:59 Keith wrote:
|-
|- ==
|- kernel trap 12 with interrupts disabled
|-
|- Fatal trap 12: page fault while in kernel mode
|- cpuid = 2; apic id = 02
|- fault virtual address = 0x104
|- fault code = supervisor read, page not present
|- instruction pointer = 0x20:0xc066ca51
|- stack pointer = 0x28:0xe6ec0c90
|- frame pointer = 0x28:0xe6ec0c9c
|- code segment = base 0x0, limit 0xf, type 0x1b
|- = DPL 0, pres 1, def32 1, gran 1
|- processor eflags = resume, IOPL = 0
|- current process = 9 (thread taskq)
|- trap number = 12
|- panic: page fault
|- Uptime: 6d6h23m45s
|- #0 doadump () at pcpu.h:165
|- 165 __asm __volatile("movl %%fs:0,%0" : "=r" (td));
|-
|-
|-frame 0 useless. You need the frame after calltrap().
|-And:
|-
|- instruction pointer = 0x20:0xc066ca51
|-list *0xc066ca51
|-
|-Generally a bt will show the needed information.
|-Likely cause: file system corruption, caused by background_fsck, but a
|-backtrace should show more.

Ok, so how does one fix corruption if that is the case?

Here is a backtrace, but means nothing to me unfortunately.

(kgdb) backtrace
#0 doadump () at pcpu.h:165
#1 0xc067582a in boot (howto=260) at ../../../kern/kern_shutdown.c:409
#2 0xc0675b51 in panic (fmt=0xc08f090b "%s") at ../../../kern/kern_shutdown.c:565
#3 0xc0899f1c in trap_fatal (frame=0xe6ec0c50, eva=260) at ../../../i386/i386/trap.c:837
#4 0xc089968e in trap (frame={tf_fs = 8, tf_es = -920256472, tf_ds = -420741080, tf_edi = -936184704, tf_esi = 4, tf_ebp = -420737892, tf_isp = -420737924, tf_ebx = -920236452, tf_edx = 6, tf_ecx = -936306488, tf_eax = 1, tf_trapno = 12, tf_err = 0, tf_eip = -1067005359, tf_cs = 32, tf_eflags = 65538, tf_esp = -930065784, tf_ss = 4}) at ../../../i386/i386/trap.c:270
#5 0xc08859ca in calltrap () at ../../../i386/i386/exception.s:139
#6 0xc066ca51 in _mtx_lock_sleep (m=0xc9264e5c, tid=3358782592, opts=0, file=0x0, line=0) at ../../../kern/kern_mutex.c:546
#7 0xc06bbdb6 in unp_gc (arg=0x0, pending=1) at ../../../kern/uipc_usrreq.c:1714
#8 0xc06964d3 in taskqueue_run (queue=0xc843fa80) at ../../../kern/subr_taskqueue.c:257
#9 0xc06969b6 in taskqueue_thread_loop (arg=0x1) at ../../../kern/subr_taskqueue.c:376
#10 0xc065ef6d in fork_exit (callout=0xc0696924 <taskqueue_thread_loop>, arg=0xc09f1d28, frame=0xe6ec0d38) at ../../../kern/kern_fork.c:821
#11 0xc0885a2c in fork_trampoline () at ../../../i386/i386/exception.s:208
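
A note for anyone matching up the numbers in a trace like this one: kgdb prints the trap frame registers as signed 32-bit decimals, while the panic message prints the same values in hex. The small Python sketch below (the helper name as_unsigned32 is made up for illustration, it is not part of kgdb) shows that tf_eip in frame #4 is the same address as the "instruction pointer" line and as frame #6, and that eva=260 in frame #3 is the "fault virtual address".

```python
def as_unsigned32(value: int) -> int:
    """Reinterpret a signed 32-bit integer as unsigned (hypothetical helper)."""
    return value & 0xFFFFFFFF


tf_eip = -1067005359  # tf_eip from the trap frame in frame #4
eva = 260             # eva argument of trap_fatal() in frame #3

print(hex(as_unsigned32(tf_eip)))  # 0xc066ca51, the faulting instruction
print(hex(eva))                    # 0x104, the fault virtual address
```

So the address to hand to "list *..." in kgdb is the one already shown for frame #6, inside _mtx_lock_sleep().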
Re: Page Fault.
On Monday 01 December 2008 21:34:14 Keith wrote:

On Mon, 1 Dec 2008, Mel wrote:

|-On Monday 01 December 2008 19:32:59 Keith wrote:
|-
|- ==
|- kernel trap 12 with interrupts disabled
|-
|- Fatal trap 12: page fault while in kernel mode
|- cpuid = 2; apic id = 02
|- fault virtual address = 0x104
|- fault code = supervisor read, page not present
|- instruction pointer = 0x20:0xc066ca51
|- stack pointer = 0x28:0xe6ec0c90
|- frame pointer = 0x28:0xe6ec0c9c
|- code segment = base 0x0, limit 0xf, type 0x1b
|- = DPL 0, pres 1, def32 1, gran 1
|- processor eflags = resume, IOPL = 0
|- current process = 9 (thread taskq)
|- trap number = 12
|- panic: page fault
|- Uptime: 6d6h23m45s
|- #0 doadump () at pcpu.h:165
|- 165 __asm __volatile("movl %%fs:0,%0" : "=r" (td));
|-
|-
|-frame 0 useless. You need the frame after calltrap().
|-And:
|-
|- instruction pointer = 0x20:0xc066ca51
|-list *0xc066ca51
|-
|-Generally a bt will show the needed information.
|-Likely cause: file system corruption, caused by background_fsck, but a
|-backtrace should show more.

Ok, so how does one fix corruption if that is the case?

Here is a backtrace, but means nothing to me unfortunately.

(kgdb) backtrace
#0 doadump () at pcpu.h:165
#1 0xc067582a in boot (howto=260) at ../../../kern/kern_shutdown.c:409
#2 0xc0675b51 in panic (fmt=0xc08f090b "%s") at ../../../kern/kern_shutdown.c:565
#3 0xc0899f1c in trap_fatal (frame=0xe6ec0c50, eva=260) at ../../../i386/i386/trap.c:837
#4 0xc089968e in trap (frame={tf_fs = 8, tf_es = -920256472, tf_ds = -420741080, tf_edi = -936184704, tf_esi = 4, tf_ebp = -420737892, tf_isp = -420737924, tf_ebx = -920236452, tf_edx = 6, tf_ecx = -936306488, tf_eax = 1, tf_trapno = 12, tf_err = 0, tf_eip = -1067005359, tf_cs = 32, tf_eflags = 65538, tf_esp = -930065784, tf_ss = 4}) at ../../../i386/i386/trap.c:270
#5 0xc08859ca in calltrap () at ../../../i386/i386/exception.s:139
#6 0xc066ca51 in _mtx_lock_sleep (m=0xc9264e5c, tid=3358782592, opts=0, file=0x0, line=0) at ../../../kern/kern_mutex.c:546
#7 0xc06bbdb6 in unp_gc (arg=0x0, pending=1) at ../../../kern/uipc_usrreq.c:1714

This has been fixed:
http://www.freebsd.org/cgi/query-pr.cgi?pr=113823

--
Mel

Problem with today's modular software: they start with the modules and never get to the software part.
Re: page fault while in kernel mode
On Monday 20 October 2008 08:52:07 pm Robert Fitzpatrick wrote:

On Mon, 2008-10-20 at 13:45 -0400, John Baldwin wrote:

i386 cannot address more than 4GB unless the kernel is built with PAE mode enabled. This isn't enabled in GENERIC for many (justified) reasons. If you have more than 4GB, you should be using amd64, so you made the right decision there.

If you aren't using kernel modules, then PAE should work fine. You can make kernel modules work with PAE as well, but that takes more work.

Thanks for the help, I am missing AMD Features for this CPU in dmesg, so it looks like the CPU does not support amd64. I tried to build my own kernel with PAE option and getting the following error...

/usr/src/sys/dev/advansys/advansys.c: In function 'adv_action':
/usr/src/sys/dev/advansys/advansys.c:259: warning: cast from pointer to integer of different size
*** Error code 1

Any idea what I can do for this error?

Some drivers don't work with PAE (see all the 'nodevice' lines in /sys/i386/conf/PAE). You'll need to purge those drivers from your config. If you are using the hardware those drivers support, then you can't use PAE.

--
John Baldwin
Re: page fault while in kernel mode
On Tue, 2008-10-21 at 12:03 -0400, John Baldwin wrote:

Some drivers don't work with PAE (see all the 'nodevice' lines in /sys/i386/conf/PAE). You'll need to purge those drivers from your config. If you are using the hardware those drivers support, then you can't use PAE.

Thanks for the help. Excuse the ignorance, I'm more a programmer than system guy. How do I purge a driver, or know which driver to look for, from the config and know what the driver supports? Do you mean, in this case, remove 'nodevice adv' from the PAE file? If so, I don't know what that supports :/

--
Robert
Re: page fault while in kernel mode
On Tue, Oct 21, 2008 at 02:35:22PM -0400, Robert Fitzpatrick wrote:

On Tue, 2008-10-21 at 12:03 -0400, John Baldwin wrote:

Some drivers don't work with PAE (see all the 'nodevice' lines in /sys/i386/conf/PAE). You'll need to purge those drivers from your config. If you are using the hardware those drivers support, then you can't use PAE.

Thanks for the help. Excuse the ignorance, I'm more a programmer than system guy. How do I purge a driver, or know which driver to look for, from the config and know what the driver supports? Do you mean, in this case, remove 'nodevice adv' from the PAE file? If so, I don't know what that supports :/

Yeah, I don't think anyone's really explaining this very well to you, so I'll try a different approach:

Certain FreeBSD drivers do not work in PAE mode.

The drivers which don't work are listed in the /sys/i386/conf/PAE file. They're prefixed by the word "nodevice", which tells the kernel config reader "DO NOT build this device, because it won't work".

You will need to take the "nodevice" lines from /sys/i386/conf/PAE and put them into your kernel config file. (There are alternative methods such as using "include" directives and so on, but I'm trying to keep this explanation simple.)

Make sense now? :-)

--
| Jeremy Chadwick                       jdc at parodius.com |
| Parodius Networking              http://www.parodius.com/ |
| UNIX Systems Administrator         Mountain View, CA, USA |
| Making life hard for others since 1977.     PGP: 4BD6C0CB |
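
As a rough illustration of the "take the nodevice lines" step described above, here is a minimal Python sketch that pulls the nodevice directives out of a kernel config. The function name and the sample text are assumptions made for the example: only 'nodevice adv' comes from this thread, and the excerpt is not the real contents of /sys/i386/conf/PAE.

```python
def nodevice_lines(config_text: str) -> list:
    """Return the 'nodevice' directives found in a kernel config text."""
    result = []
    for line in config_text.splitlines():
        # Drop trailing comments, then split into whitespace-separated words.
        words = line.split("#", 1)[0].split()
        if words and words[0] == "nodevice":
            result.append(" ".join(words))
    return result


# Illustrative excerpt only; not the actual PAE file.
sample = """\
include GENERIC
options PAE
nodevice adv   # does not build with PAE
device   em
"""

print(nodevice_lines(sample))  # ['nodevice adv']
```

On a real system the same function could be fed the file itself, e.g. nodevice_lines(open("/sys/i386/conf/PAE").read()), and the resulting lines appended to a custom config.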
Re: page fault while in kernel mode
On Tue, 2008-10-21 at 11:47 -0700, Jeremy Chadwick wrote:

The drivers which don't work are listed in the /sys/i386/conf/PAE file. They're prefixed by the word "nodevice", which tells the kernel config reader "DO NOT build this device, because it won't work".

You will need to take the "nodevice" lines from /sys/i386/conf/PAE and put them into your kernel config file. (There are alternative methods such as using "include" directives and so on, but I'm trying to keep this explanation simple.)

Make sense now? :-)

Perfect sense now, believe it or not I was beginning to think along these lines as I was doing some searching and found 'device adv' in my config file and there was a description of the hardware it was for, which I don't have. Thanks for the clarification, now let's see if I can get this build done...thanks to all!

--
Robert
Re: page fault while in kernel mode
On Tuesday 21 October 2008 02:47:11 pm Jeremy Chadwick wrote:

On Tue, Oct 21, 2008 at 02:35:22PM -0400, Robert Fitzpatrick wrote:

On Tue, 2008-10-21 at 12:03 -0400, John Baldwin wrote:

Some drivers don't work with PAE (see all the 'nodevice' lines in /sys/i386/conf/PAE). You'll need to purge those drivers from your config. If you are using the hardware those drivers support, then you can't use PAE.

Thanks for the help. Excuse the ignorance, I'm more a programmer than system guy. How do I purge a driver, or know which driver to look for, from the config and know what the driver supports? Do you mean, in this case, remove 'nodevice adv' from the PAE file? If so, I don't know what that supports :/

Yeah, I don't think anyone's really explaining this very well to you, so I'll try a different approach:

Certain FreeBSD drivers do not work in PAE mode.

The drivers which don't work are listed in the /sys/i386/conf/PAE file. They're prefixed by the word "nodevice", which tells the kernel config reader "DO NOT build this device, because it won't work".

You will need to take the "nodevice" lines from /sys/i386/conf/PAE and put them into your kernel config file. (There are alternative methods such as using "include" directives and so on, but I'm trying to keep this explanation simple.)

Make sense now? :-)

Alternatively, you could just remove the 'device adv' line from your kernel config rather than adding lots of 'nodevice' lines at the bottom. You can usually do 'man 4 <driver name>' to see what devices it supports. In this case, adv(4) supports mostly ancient Advansys SCSI host adapters. The manpage has a full list of the various model numbers, etc.

--
John Baldwin
Re: page fault while in kernel mode
On Tue, 2008-10-21 at 15:09 -0400, John Baldwin wrote:

Alternatively, you could just remove the 'device adv' line from your kernel config rather than adding lots of 'nodevice' lines at the bottom. You can usually do 'man 4 <driver name>' to see what devices it supports. In this case, adv(4) supports mostly ancient Advansys SCSI host adapters. The manpage has a full list of the various model numbers, etc.

Yes, that is what I thought. Right now, I am just commenting them out, now I know what people mean when they say they are running a trimmed/clean kernel. I did see one potential issue...

# USB support
device uhci # UHCI PCI-USB interface
device ohci # OHCI PCI-USB interface
device ehci # EHCI PCI-USB interface (USB 2.0)
device usb # USB Bus (required)

I see all of these with nodevice lines in the PAE file. Although I have USB ports, I don't use them, but I was concerned by the 'required' on the last one, is it OK to remove? Also, would I then need to disable USB in the BIOS to avoid errors?

--
Robert
Re: page fault while in kernel mode
On Tue, Oct 21, 2008 at 03:22:28PM -0400, Robert Fitzpatrick wrote:

On Tue, 2008-10-21 at 15:09 -0400, John Baldwin wrote:

Alternatively, you could just remove the 'device adv' line from your kernel config rather than adding lots of 'nodevice' lines at the bottom. You can usually do 'man 4 <driver name>' to see what devices it supports. In this case, adv(4) supports mostly ancient Advansys SCSI host adapters. The manpage has a full list of the various model numbers, etc.

Yes, that is what I thought. Right now, I am just commenting them out, now I know what people mean when they say they are running a trimmed/clean kernel. I did see one potential issue...

# USB support
device uhci # UHCI PCI-USB interface
device ohci # OHCI PCI-USB interface
device ehci # EHCI PCI-USB interface (USB 2.0)
device usb # USB Bus (required)

I see all of these with nodevice lines in the PAE file. Although I have USB ports, I don't use them, but I was concerned by the 'required' on the last one, is it OK to remove? Also, would I then need to disable USB in the BIOS to avoid errors?

If you remove "device usb", you will also need to remove uhci, ohci, ehci, umass, ukbd, etc. etc. etc... from your config as well.

You do not need to disable USB support in the BIOS; the kernel will simply state that it sees devices on the PCI bus but lacks a driver to attach to them. This will not harm anything.

--
| Jeremy Chadwick                       jdc at parodius.com |
| Parodius Networking              http://www.parodius.com/ |
| UNIX Systems Administrator         Mountain View, CA, USA |
| Making life hard for others since 1977.     PGP: 4BD6C0CB |
Re: page fault while in kernel mode
On Tuesday 21 October 2008 03:22:28 pm Robert Fitzpatrick wrote:

On Tue, 2008-10-21 at 15:09 -0400, John Baldwin wrote:

Alternatively, you could just remove the 'device adv' line from your kernel config rather than adding lots of 'nodevice' lines at the bottom. You can usually do 'man 4 <driver name>' to see what devices it supports. In this case, adv(4) supports mostly ancient Advansys SCSI host adapters. The manpage has a full list of the various model numbers, etc.

Yes, that is what I thought. Right now, I am just commenting them out, now I know what people mean when they say they are running a trimmed/clean kernel. I did see one potential issue...

# USB support
device uhci # UHCI PCI-USB interface
device ohci # OHCI PCI-USB interface
device ehci # EHCI PCI-USB interface (USB 2.0)
device usb # USB Bus (required)

I see all of these with nodevice lines in the PAE file. Although I have USB ports, I don't use them, but I was concerned by the 'required' on the last one, is it OK to remove? Also, would I then need to disable USB in the BIOS to avoid errors?

Actually, USB is ok with PAE. I recently updated the PAE configs to not disable USB, and at work we've run PAE kernels with USB enabled for a few years now on 6.x w/o any problems.

--
John Baldwin
Re: page fault while in kernel mode
On Sun, 2008-10-19 at 13:16 -0700, Jeremy Chadwick wrote:

On Sun, Oct 19, 2008 at 03:50:01PM -0400, Robert Fitzpatrick wrote:

I took a working 5.4-i386 server and trying to convert its RAID 5 to RAID 10 and load 7.0 amd64. I kept getting BTX halted even after flashing the latest bios and firmware for the raid card, Intel SRCZCR, in this dual Xeon 2.4GHz supermicro superserver. I have another server, bit newer, but same basic hardware makeup with Xeon 3.0 procs that runs 6.1-amd64 fine. Anyway, so I have resorted to the i386 version of 7.0 to see if the server is just incapable of running amd64, which after passing the initial boot where amd64 failed, now gives me the subject error after some reference to GEOM_LABEL. I did rebuild the RAID to RAID-10, can someone tell me what this error means?

http://columbus.webtent.org/freebsd.png

Can you please try 7.1-BETA2 instead (ISOs are now available)? There have been fixes/improvements to BTX since 7.0-RELEASE which could fix your problem.

Thanks, but that didn't work either trying 7.1-BETA2 amd64 :(

Forgot to mention I added memory to this server as well, took it from 2GB it was using under 5.4-RELEASE up to 6GB filling all slots, that is why I wanted to load amd64. I reduced down to 4GB and now am able to install 7.0-RELEASE i386. Does this mean that I may have a hardware issue or can FreeBSD produce the page fault I was getting when using over 4GB with i386?

I would love to figure out this BTX halted issue instead...any ideas on that?

--
Robert
Re: page fault while in kernel mode
On Mon, Oct 20, 2008 at 12:07:17PM -0400, Robert Fitzpatrick wrote:

On Sun, 2008-10-19 at 13:16 -0700, Jeremy Chadwick wrote:

On Sun, Oct 19, 2008 at 03:50:01PM -0400, Robert Fitzpatrick wrote:

I took a working 5.4-i386 server and trying to convert its RAID 5 to RAID 10 and load 7.0 amd64. I kept getting BTX halted even after flashing the latest bios and firmware for the raid card, Intel SRCZCR, in this dual Xeon 2.4GHz supermicro superserver. I have another server, bit newer, but same basic hardware makeup with Xeon 3.0 procs that runs 6.1-amd64 fine. Anyway, so I have resorted to the i386 version of 7.0 to see if the server is just incapable of running amd64, which after passing the initial boot where amd64 failed, now gives me the subject error after some reference to GEOM_LABEL. I did rebuild the RAID to RAID-10, can someone tell me what this error means?

http://columbus.webtent.org/freebsd.png

Can you please try 7.1-BETA2 instead (ISOs are now available)? There have been fixes/improvements to BTX since 7.0-RELEASE which could fix your problem.

Thanks, but that didn't work either trying 7.1-BETA2 amd64 :(

Forgot to mention I added memory to this server as well, took it from 2GB it was using under 5.4-RELEASE up to 6GB filling all slots, that is why I wanted to load amd64. I reduced down to 4GB and now am able to install 7.0-RELEASE i386. Does this mean that I may have a hardware issue or can FreeBSD produce the page fault I was getting when using over 4GB with i386?

i386 cannot address more than 4GB unless the kernel is built with PAE mode enabled. This isn't enabled in GENERIC for many (justified) reasons. If you have more than 4GB, you should be using amd64, so you made the right decision there.

I would love to figure out this BTX halted issue instead...any ideas on that?

Boot loader problems are difficult to figure out/debug for reasons which should be obvious. I'm CC'ing John Baldwin here, who has experience with BTX. He might be able to shed some light on this.

--
| Jeremy Chadwick                       jdc at parodius.com |
| Parodius Networking              http://www.parodius.com/ |
| UNIX Systems Administrator         Mountain View, CA, USA |
| Making life hard for others since 1977.     PGP: 4BD6C0CB |
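
The 4GB figure is just address-width arithmetic. A small Python sketch, assuming the classic case where PAE widens physical addresses from 32 to 36 bits (newer CPUs may support more; the function name is made up for illustration):

```python
GIB = 2**30  # bytes in one GiB


def addressable_gib(address_bits: int) -> int:
    """Physical memory reachable with the given address width, in GiB."""
    return 2**address_bits // GIB


print(addressable_gib(32))  # 4  -> plain i386 without PAE
print(addressable_gib(36))  # 64 -> i386 with 36-bit PAE
```

With 6GB installed, 2GB sits above what a non-PAE i386 kernel can address, which fits with the box behaving once it was cut back to 4GB.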
Re: page fault while in kernel mode
On Monday 20 October 2008 12:32:37 pm Jeremy Chadwick wrote:

Forgot to mention I added memory to this server as well, took it from 2GB it was using under 5.4-RELEASE up to 6GB filling all slots, that is why I wanted to load amd64. I reduced down to 4GB and now am able to install 7.0-RELEASE i386. Does this mean that I may have a hardware issue or can FreeBSD produce the page fault I was getting when using over 4GB with i386?

i386 cannot address more than 4GB unless the kernel is built with PAE mode enabled. This isn't enabled in GENERIC for many (justified) reasons. If you have more than 4GB, you should be using amd64, so you made the right decision there.

If you aren't using kernel modules, then PAE should work fine. You can make kernel modules work with PAE as well, but that takes more work.

I would love to figure out this BTX halted issue instead...any ideas on that?

Boot loader problems are difficult to figure out/debug for reasons which should be obvious. I'm CC'ing John Baldwin here, who has experience with BTX. He might be able to shed some light on this.

You will get a BTX fault in 7.0 if your CPU does not support 64-bit long mode (i.e., amd64). You can check to see if your CPU does support it by looking in the 'AMD features' line of 'dmesg' from an i386 kernel and seeing if you have a 'LM' feature. If you don't, your CPU only supports i386.

--
John Baldwin
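
The dmesg check described above can be sketched in a few lines of Python. This assumes the feature line looks like "AMD Features=0x...<NX,LM>"; the sample lines and the function name are made up for illustration, so compare against your own dmesg output before relying on it.

```python
import re


def has_long_mode(dmesg_text: str) -> bool:
    """True if an 'AMD Features=' line in the dmesg text lists the LM flag."""
    for line in dmesg_text.splitlines():
        if "AMD Features=" in line:
            match = re.search(r"<([^>]*)>", line)
            if match and "LM" in match.group(1).split(","):
                return True
    return False


# Hypothetical dmesg excerpts, not taken from the machines in this thread.
with_lm = "  AMD Features=0x20100000<NX,LM>"
without_lm = "  Features=0xbfebfbff<FPU,VME,DE,PSE>"

print(has_long_mode(with_lm))     # True
print(has_long_mode(without_lm))  # False
```

A dmesg with no 'AMD Features' line at all comes out as False, i.e. an i386-only CPU by the rule above.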
Re: page fault while in kernel mode
On Mon, 2008-10-20 at 13:45 -0400, John Baldwin wrote:

i386 cannot address more than 4GB unless the kernel is built with PAE mode enabled. This isn't enabled in GENERIC for many (justified) reasons. If you have more than 4GB, you should be using amd64, so you made the right decision there.

If you aren't using kernel modules, then PAE should work fine. You can make kernel modules work with PAE as well, but that takes more work.

Thanks for the help, I am missing AMD Features for this CPU in dmesg, so it looks like the CPU does not support amd64. I tried to build my own kernel with PAE option and getting the following error...

/usr/src/sys/dev/advansys/advansys.c: In function 'adv_action':
/usr/src/sys/dev/advansys/advansys.c:259: warning: cast from pointer to integer of different size
*** Error code 1

Any idea what I can do for this error?

--
Robert
Re: page fault while in kernel mode
On Sun, Oct 19, 2008 at 03:50:01PM -0400, Robert Fitzpatrick wrote:

I took a working 5.4-i386 server and am trying to convert its RAID 5 to RAID 10 and load 7.0 amd64. I kept getting BTX halted even after flashing the latest BIOS and firmware for the RAID card, an Intel SRCZCR, in this dual Xeon 2.4GHz Supermicro SuperServer. I have another server, a bit newer but with the same basic hardware makeup and Xeon 3.0 procs, that runs 6.1-amd64 fine. Anyway, I have resorted to the i386 version of 7.0 to see if the server is just incapable of running amd64. It gets past the initial boot where amd64 failed, but now gives me the subject error after some reference to GEOM_LABEL. I did rebuild the RAID to RAID-10. Can someone tell me what this error means? http://columbus.webtent.org/freebsd.png

Can you please try 7.1-BETA2 instead (ISOs are now available)? There have been fixes/improvements to BTX since 7.0-RELEASE which could fix your problem.

--
| Jeremy Chadwick                          jdc at parodius.com |
| Parodius Networking                 http://www.parodius.com/ |
| UNIX Systems Administrator            Mountain View, CA, USA |
| Making life hard for others since 1977.       PGP: 4BD6C0CB |
Re: Page Fault - Wireless problem?
Unread portion of the kernel message buffer:
ural0: could not transmit buffer: SHORT_XFER

Fatal trap 12: page fault while in kernel mode
fault virtual address   = 0x4
fault code              = supervisor read, page not present
instruction pointer     = 0x20:0xc0667091
stack pointer           = 0x28:0xd33e4c00
frame pointer           = 0x28:0xd33e4c0c
code segment            = base 0x0, limit 0xf, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 32 (irq21: uhci0 uhci1+)
trap number             = 12
panic: page fault

I am not a USB/Ralink expert, but it looks like the node can be removed mid-transmission elsewhere (ural_free_tx_list) on input errors. I don't know about threading issues that would require extra locking as well. In sys/dev/usb/:

--- if_ural.c.orig      Sun Jan 29 08:16:36 2006
+++ if_ural.c   Thu Feb 23 08:30:36 2006
@@ -881,8 +881,10 @@
        m_freem(data->m);
        data->m = NULL;
-       ieee80211_free_node(data->ni);
-       data->ni = NULL;
+       if (data->ni != NULL) {
+               ieee80211_free_node(data->ni);
+               data->ni = NULL;
+       }
        sc->tx_queued--;
        ifp->if_opackets++;
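A fault virtual address as low as 0x4 usually means the code read a structure member through a NULL pointer, which would fit the guard added in the patch. Below is a rough sketch of that rule of thumb, assuming a 4096-byte page as on i386. The helper looks_like_null_deref is a made-up name, not part of any FreeBSD tool.

```shell
# looks_like_null_deref: succeed if the fault address lies inside the
# first page, i.e. a small offset from a NULL pointer.
# Hypothetical helper; the 4096-byte page size is an assumption (i386).
looks_like_null_deref() {
    # $1 is the "fault virtual address" value, hex (0x4) or decimal
    [ "$(( $1 ))" -lt 4096 ]
}

# Example:
#   looks_like_null_deref 0x4 && echo "probably a NULL pointer dereference"
```

This only narrows things down. Which pointer was NULL still has to come from the instruction pointer and a debug kernel.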
Re: Page fault in kernel mode, 6.0-RELEASE
On Tue, Nov 15, 2005 at 11:48:15PM +0100, Guido Van Hoecke wrote:

I am fairly new at FreeBSD. I installed a 'beastie' server with FreeBSD 6.0-RELEASE #0: Thu Nov 3 09:36:13 UTC 2005 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/GENERIC but it traps immediately after the boot menu with the following output:

/boot/kernel/acpi.ko text=0x40c2c data=0x2160+0x1090 \
syms=[0x4+0x7810+0x4+0xa292]

Fatal trap 12: page fault while in kernel mode
fault virtual address   = 0x8
fault code              = supervisor read, page not present
instruction pointer     = 0x20:0xc079abb9
stack pointer           = 0x28:0xc0c20d4c
frame pointer           = 0x28:0xc0c20d58
code segment            = base 0x0, limit 0xf, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 0 ()
trap number             = 12
panic: page fault
Uptime: 1s

Note that the /boot/kernel/acpi.ko line is absent if I boot with ACPI disabled (menu option 2). But the trap info is not affected by selecting that boot option.

Are you using other modules (e.g. nvidia driver)?

Kris
Re: Page fault in kernel mode, 6.0-RELEASE
Kris Kennaway [EMAIL PROTECTED] writes:

On Tue, Nov 15, 2005 at 11:48:15PM +0100, Guido Van Hoecke wrote:

I am fairly new at FreeBSD. I installed a 'beastie' server with FreeBSD 6.0-RELEASE #0: Thu Nov 3 09:36:13 UTC 2005 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/GENERIC but it traps immediately after the boot menu with the following output:

/boot/kernel/acpi.ko text=0x40c2c data=0x2160+0x1090 \
syms=[0x4+0x7810+0x4+0xa292]

Fatal trap 12: page fault while in kernel mode
fault virtual address   = 0x8
fault code              = supervisor read, page not present
instruction pointer     = 0x20:0xc079abb9
stack pointer           = 0x28:0xc0c20d4c
frame pointer           = 0x28:0xc0c20d58
code segment            = base 0x0, limit 0xf, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 0 ()
trap number             = 12
panic: page fault
Uptime: 1s

Note that the /boot/kernel/acpi.ko line is absent if I boot with ACPI disabled (menu option 2). But the trap info is not affected by selecting that boot option.

Are you using other modules (e.g. nvidia driver)?

No Kris, this is just when booting from the generic release CDs. I had an old Hercules 3D Prophet card and just used that, without loading anything special.

Kris
Re: page fault while in kernel mode
On Mon, Oct 24, 2005 at 12:11:29PM +0100, Owen Smith wrote:

What's the best thing to do? Debugging the kernel etc., or just upgrade to 5.4?

The latter. A few hundred bugs were fixed between 5.3 and 5.4, and the former is no longer supported anyway.

Kris
Re: page fault while in kernel mode
On Mon, Oct 24, 2005 at 12:11:29PM +0100, Owen Smith wrote:

What's the best thing to do? Debugging the kernel etc., or just upgrade to 5.4?

Run a memory test on the machine. memtest86 works well.

--
U.S. Encouraged by Vietnam Vote - Officials Cite 83% Turnout Despite Vietcong Terror - New York Times 9/3/1967
Re: page fault error?
On Wed, Mar 24, 2004 at 06:57:20PM -0500, chris wrote:

Hello, I am having the following problem. About a week or so ago this started. My box running 4.9-STABLE keeps panicking and rebooting; below is the output of dmesg -a. Also, I have tried replacing the memory, and the hard drive is only 5 months old or so. I have also cvsup'ed, run make buildworld, and rebuilt my kernel. It doesn't appear to be overheating either. Any help would be greatly appreciated, thanks.

This sort of panicking that starts at an arbitrary time and is not associated with some system change (new hardware, OS update) is almost always due to a hardware fault. You've eliminated two of the most probable causes, bad memory and overheating, but unfortunately computers are complicated beasties and there's a lot more that can go wrong than that. It could be as simple as your power supply fading and not providing the required +5V and +12V within spec. Or it could be that one or more of the chips on your motherboard, or indeed your CPU, has developed a fault and is trashing data that passes through it.

The page fault error you're seeing is due to the kernel trying to access a memory page which was never previously allocated. You can get some idea about what is triggering the problem by setting your system up to save a core image and looking at what the kernel was doing at the moment it panicked. Instructions for doing that are here:

http://www.oreillynet.com/pub/a/bsd/2002/03/21/Big_Scary_Daemons.html
http://www.oreillynet.com/pub/a/bsd/2002/04/04/Big_Scary_Daemons.html

Now, grovelling in the bowels of the kernel is quite a daunting task for the uninitiated, but gathering this sort of information is precisely what will be most helpful to a FreeBSD developer who does know their way around the kernel.
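For reference, saving a core image mostly comes down to two /etc/rc.conf settings. This is a sketch, assuming a 4.x system whose swap partition is at least as large as RAM. The device name /dev/ad0s1b is only an example and must be replaced with your own swap device.

```shell
# /etc/rc.conf fragment (sh syntax)
dumpdev="/dev/ad0s1b"   # swap partition that receives the dump (example device)
dumpdir="/var/crash"    # directory where savecore(8) writes the image at next boot
```

After the next panic and reboot, the saved image in /var/crash can be examined with the kernel debugger against a kernel built with debugging symbols.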
(if anyone received this twice I apologize, I sent it earlier but it didn't appear to go through for some reason)

Yes -- occasionally the freebsd.org mailer takes an hour or two or three to get a message out to the list, rather than the usual minute or so. Sometimes even longer. As a rule of thumb, wait for at least 4 hours and preferably a day before re-sending.

Cheers, Matthew

--
Dr Matthew J Seaman MA, D.Phil.                       26 The Paddocks
                                                      Savill Way
PGP: http://www.infracaninophile.co.uk/pgpkey         Marlow
Tel: +44 1628 476614                                  Bucks., SL7 1TH UK