Re: Fresh 7.0 Install: Fatal Trap 12 panic when put under load
On Thu, Sep 11, 2008 at 11:56 AM, Jeremy Chadwick [EMAIL PROTECTED] wrote: On Thu, Sep 11, 2008 at 12:08:47PM +0200, Michael Grant wrote: On Thu, Sep 11, 2008 at 11:20 AM, Jeremy Chadwick [EMAIL PROTECTED] wrote: On Thu, Sep 11, 2008 at 10:38:36AM +0200, Michael Grant wrote: My box crashed again: panic: kmem_malloc(4096): kmem_map too small: 1073741824 total allocated cpuid = 0 Uptime: 33d11h12m58s Dumping 3327 MB (2 chunks) chunk 0: 1MB (151 pages) ... ok chunk 1: 3327MB (851568 pages) ---hung here Still no valid dump. There is 4gig of physical memory in the machine. In /boot/loader.conf, I currently have the following: vm.kmem_size=1G vm.kmem_size_max=1G vm.kmem_size_scale=2 and in my kernel conf file I have: options KVA_PAGES=512 It stayed up for 33 days this time. Is there anything else I can do? First and foremost: are you using ZFS on this machine? If so, there are many tunables you can apply to try and limit this; I'm willing to bet it's ARC which is doing it. See below. In general, it appears that you need to increase the maximum range of kmem. The kernel attempted to utilise more than 1GB, and your limit is 1G. My machines running RELENG_7 on amd64, with only 2GB of RAM installed, use the following tunables in loader.conf: vm.kmem_size=1536M vm.kmem_size_max=1536M If ZFS is in use, I recommend these as well: vfs.zfs.arc_min=16M vfs.zfs.arc_max=64M vfs.zfs.prefetch_disable=1 Do not increase kmem_size any larger than 1.5GB; the amount of RAM you have in the machine, with regards to RELENG_7, will not help. This is a known limitation which has been fixed in HEAD/CURRENT (where the limit has been increased to 512GB). See the Kernel section below; you'll see the applicable item. http://wiki.freebsd.org/JeremyChadwick/Commonly_reported_issues Your only solution may be to run HEAD/CURRENT. I am not running ZFS. My file systems are ufs. This feels like some sort of memory leak in the kernel. Giving it more and more memory just seems to delay the crash. 
Are you saying the crash is fixed in HEAD/CURRENT? It's an intentional crash, not the program tried to access NULL, which crashed the machine crash. The kernel wants more memory to accomplish a certain thing, and it's not available. kris@ can explain this in better terms than I can. First and foremost, it would be good to find out what all you are running on this machine (process-wise). A process could be tickling something in the kernel which requires a large amount of memory to be required. I can imagine something like MySQL would require this. Ideally what needs to happen is to debug the kernel or get a full map of kmem to find out what's using what. I believe vmstat -m or vmstat -z output might help. Obviously since the machine panics, you won't be able to run those commands after the fact. I would recommend you set up a cronjob that runs every 1-2 minutes and logs the output of both of those commands to a file. When the panic happens, restart the system and look at the logfile to see if you can figure out if anything suddenly starts taking up a large amount of memory, or if it's a gradual thing (indicating a memory leak). If you can figure out what might be tickling the problem, you can ultimately figure out if increasing kmem is the right thing to do, or if there's a greater problem here. I'm running 6.3 by the way. I have put your changes into my loader.conf, we'll see how long it goes this time. I'm not qute in position to update everything to 7.x at the moment. Our production webservers run RELENG_6 and RELENG_7, and we don't encounter this kind of problem. I'm not saying what you're experiencing is indicative of hardware issues or something like that -- I'm simply saying I have loaded systems which don't ever hit that condition. So figuring out what's causing it in your case would be good. 
This appears to be too high, as the machine reboots immediately after the fsck:

vm.kmem_size=1536M
vm.kmem_size_max=1536M

After I returned it to 1G, it panicked again about a month later. Here's vmstat -m and -z from roughly 1 minute before it crashed (I was logging to a file every minute via cron):

Fri Nov 21 15:15:00 EST 2008
Type InUse MemUse HighUse Requests Size(s)
pfs_vncache 2 1K - 864205 32
GEOM 168 24K - 416279 16,32,64,128,256,512,1024,2048,4096
isadev 17 2K - 17 64
CAM periph 1 1K - 1 128
cdev 26 4K - 26 128
CAM queue 3 1K - 3 16
file desc 739 474K - 284943537 16,32,64,256,512,1024,2048,4096
sigio 3 1K - 4802 32
kenv 116 8K - 118 16,32,64,4096
kqueue 246 154K - 17652506 256,1024
proc-args 153 10K - 107101480
Re: Fresh 7.0 Install: Fatal Trap 12 panic when put under load
My box crashed again:

panic: kmem_malloc(4096): kmem_map too small: 1073741824 total allocated
cpuid = 0
Uptime: 33d11h12m58s
Dumping 3327 MB (2 chunks)
chunk 0: 1MB (151 pages) ... ok
chunk 1: 3327MB (851568 pages) ---hung here

Still no valid dump.

There is 4gig of physical memory in the machine. In /boot/loader.conf, I currently have the following:

vm.kmem_size=1G
vm.kmem_size_max=1G
vm.kmem_size_scale=2

and in my kernel conf file I have:

options KVA_PAGES=512

It stayed up for 33 days this time. Is there anything else I can do?

Michael Grant
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: Fresh 7.0 Install: Fatal Trap 12 panic when put under load
On Thu, Sep 11, 2008 at 10:38:36AM +0200, Michael Grant wrote:
My box crashed again:

panic: kmem_malloc(4096): kmem_map too small: 1073741824 total allocated
cpuid = 0
Uptime: 33d11h12m58s
Dumping 3327 MB (2 chunks)
chunk 0: 1MB (151 pages) ... ok
chunk 1: 3327MB (851568 pages) ---hung here

Still no valid dump.

There is 4gig of physical memory in the machine. In /boot/loader.conf, I currently have the following:

vm.kmem_size=1G
vm.kmem_size_max=1G
vm.kmem_size_scale=2

and in my kernel conf file I have:

options KVA_PAGES=512

It stayed up for 33 days this time. Is there anything else I can do?

First and foremost: are you using ZFS on this machine? If so, there are many tunables you can apply to try and limit this; I'm willing to bet it's ARC which is doing it. See below.

In general, it appears that you need to increase the maximum range of kmem. The kernel attempted to utilise more than 1GB, and your limit is 1G.

My machines running RELENG_7 on amd64, with only 2GB of RAM installed, use the following tunables in loader.conf:

vm.kmem_size=1536M
vm.kmem_size_max=1536M

If ZFS is in use, I recommend these as well:

vfs.zfs.arc_min=16M
vfs.zfs.arc_max=64M
vfs.zfs.prefetch_disable=1

Do not increase kmem_size any larger than 1.5GB; the amount of RAM you have in the machine, with regards to RELENG_7, will not help. This is a known limitation which has been fixed in HEAD/CURRENT (where the limit has been increased to 512GB). See the Kernel section below; you'll see the applicable item.

http://wiki.freebsd.org/JeremyChadwick/Commonly_reported_issues

Your only solution may be to run HEAD/CURRENT.

--
| Jeremy Chadwick                jdc at parodius.com |
| Parodius Networking       http://www.parodius.com/ |
| UNIX Systems Administrator  Mountain View, CA, USA |
| Making life hard for others since 1977. PGP: 4BD6C0CB |
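The figure in the panic message lines up exactly with the configured limit, which a quick conversion makes visible. What follows is only a minimal sketch in POSIX sh: the helper name to_bytes is made up for illustration, is not part of FreeBSD, and understands only the K, M and G suffixes used in this thread.

```shell
#!/bin/sh
# to_bytes: hypothetical helper that converts a loader.conf-style size
# (for example 1G or 1536M) into bytes.
to_bytes() {
    value=$1
    number=${value%[KkMmGg]}    # strip one trailing suffix letter, if any
    case $value in
        *[Kk]) echo $(( number * 1024 )) ;;
        *[Mm]) echo $(( number * 1024 * 1024 )) ;;
        *[Gg]) echo $(( number * 1024 * 1024 * 1024 )) ;;
        *)     echo "$number" ;;
    esac
}

to_bytes 1G       # 1073741824, the "total allocated" value in the panic
to_bytes 1536M    # 1610612736, the 1.5GB setting suggested above
```

So the kernel had used the whole 1G map when it panicked; raising the limit moves that ceiling but says nothing about what filled it.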
Re: Fresh 7.0 Install: Fatal Trap 12 panic when put under load
On Thu, Sep 11, 2008 at 11:20 AM, Jeremy Chadwick [EMAIL PROTECTED] wrote: On Thu, Sep 11, 2008 at 10:38:36AM +0200, Michael Grant wrote: My box crashed again: panic: kmem_malloc(4096): kmem_map too small: 1073741824 total allocated cpuid = 0 Uptime: 33d11h12m58s Dumping 3327 MB (2 chunks) chunk 0: 1MB (151 pages) ... ok chunk 1: 3327MB (851568 pages) ---hung here Still no valid dump. There is 4gig of physical memory in the machine. In /boot/loader.conf, I currently have the following: vm.kmem_size=1G vm.kmem_size_max=1G vm.kmem_size_scale=2 and in my kernel conf file I have: options KVA_PAGES=512 It stayed up for 33 days this time. Is there anything else I can do? First and foremost: are you using ZFS on this machine? If so, there are many tunables you can apply to try and limit this; I'm willing to bet it's ARC which is doing it. See below. In general, it appears that you need to increase the maximum range of kmem. The kernel attempted to utilise more than 1GB, and your limit is 1G. My machines running RELENG_7 on amd64, with only 2GB of RAM installed, use the following tunables in loader.conf: vm.kmem_size=1536M vm.kmem_size_max=1536M If ZFS is in use, I recommend these as well: vfs.zfs.arc_min=16M vfs.zfs.arc_max=64M vfs.zfs.prefetch_disable=1 Do not increase kmem_size any larger than 1.5GB; the amount of RAM you have in the machine, with regards to RELENG_7, will not help. This is a known limitation which has been fixed in HEAD/CURRENT (where the limit has been increased to 512GB). See the Kernel section below; you'll see the applicable item. http://wiki.freebsd.org/JeremyChadwick/Commonly_reported_issues Your only solution may be to run HEAD/CURRENT. I am not running ZFS. My file systems are ufs. This feels like some sort of memory leak in the kernel. Giving it more and more memory just seems to delay the crash. Are you saying the crash is fixed in HEAD/CURRENT? I'm running 6.3 by the way. 
I have put your changes into my loader.conf; we'll see how long it goes this time. I'm not quite in a position to update everything to 7.x at the moment.

Michael Grant
Re: Fresh 7.0 Install: Fatal Trap 12 panic when put under load
On Thu, Sep 11, 2008 at 12:08:47PM +0200, Michael Grant wrote: On Thu, Sep 11, 2008 at 11:20 AM, Jeremy Chadwick [EMAIL PROTECTED] wrote: On Thu, Sep 11, 2008 at 10:38:36AM +0200, Michael Grant wrote: My box crashed again: panic: kmem_malloc(4096): kmem_map too small: 1073741824 total allocated cpuid = 0 Uptime: 33d11h12m58s Dumping 3327 MB (2 chunks) chunk 0: 1MB (151 pages) ... ok chunk 1: 3327MB (851568 pages) ---hung here Still no valid dump. There is 4gig of physical memory in the machine. In /boot/loader.conf, I currently have the following: vm.kmem_size=1G vm.kmem_size_max=1G vm.kmem_size_scale=2 and in my kernel conf file I have: options KVA_PAGES=512 It stayed up for 33 days this time. Is there anything else I can do? First and foremost: are you using ZFS on this machine? If so, there are many tunables you can apply to try and limit this; I'm willing to bet it's ARC which is doing it. See below. In general, it appears that you need to increase the maximum range of kmem. The kernel attempted to utilise more than 1GB, and your limit is 1G. My machines running RELENG_7 on amd64, with only 2GB of RAM installed, use the following tunables in loader.conf: vm.kmem_size=1536M vm.kmem_size_max=1536M If ZFS is in use, I recommend these as well: vfs.zfs.arc_min=16M vfs.zfs.arc_max=64M vfs.zfs.prefetch_disable=1 Do not increase kmem_size any larger than 1.5GB; the amount of RAM you have in the machine, with regards to RELENG_7, will not help. This is a known limitation which has been fixed in HEAD/CURRENT (where the limit has been increased to 512GB). See the Kernel section below; you'll see the applicable item. http://wiki.freebsd.org/JeremyChadwick/Commonly_reported_issues Your only solution may be to run HEAD/CURRENT. I am not running ZFS. My file systems are ufs. This feels like some sort of memory leak in the kernel. Giving it more and more memory just seems to delay the crash. Are you saying the crash is fixed in HEAD/CURRENT? 
It's an intentional crash, not a "the program tried to access NULL, which crashed the machine" crash. The kernel wants more memory to accomplish a certain thing, and it's not available. kris@ can explain this in better terms than I can.

First and foremost, it would be good to find out what all you are running on this machine (process-wise). A process could be tickling something in the kernel which requires a large amount of memory. I can imagine something like MySQL would require this.

Ideally what needs to happen is to debug the kernel or get a full map of kmem to find out what's using what. I believe vmstat -m or vmstat -z output might help.

Obviously since the machine panics, you won't be able to run those commands after the fact. I would recommend you set up a cronjob that runs every 1-2 minutes and logs the output of both of those commands to a file. When the panic happens, restart the system and look at the logfile to see if you can figure out if anything suddenly starts taking up a large amount of memory, or if it's a gradual thing (indicating a memory leak).

If you can figure out what might be tickling the problem, you can ultimately figure out if increasing kmem is the right thing to do, or if there's a greater problem here.

I'm running 6.3 by the way. I have put your changes into my loader.conf, we'll see how long it goes this time. I'm not quite in position to update everything to 7.x at the moment.

Our production webservers run RELENG_6 and RELENG_7, and we don't encounter this kind of problem. I'm not saying what you're experiencing is indicative of hardware issues or something like that -- I'm simply saying I have loaded systems which don't ever hit that condition. So figuring out what's causing it in your case would be good.

--
| Jeremy Chadwick                jdc at parodius.com |
| Parodius Networking       http://www.parodius.com/ |
| UNIX Systems Administrator  Mountain View, CA, USA |
| Making life hard for others since 1977. PGP: 4BD6C0CB |
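The cron-based logging suggested above could look roughly like the script below. It is a sketch, not something posted in the thread: the function name, the script name and the log path are all placeholders, and only vmstat -m and vmstat -z come from the advice itself.

```shell
#!/bin/sh
# Sketch of a kernel-memory logger meant to be run from cron.
# The function name and default log path are illustrative only.
log_kmem_snapshot() {
    logfile=$1
    {
        echo "=== $(date) ==="
        echo "--- vmstat -m ---"
        vmstat -m
        echo "--- vmstat -z ---"
        vmstat -z
    } >> "$logfile" 2>&1
}

# Only log when invoked as "kmem-snapshot.sh run [logfile]", so the file
# can also be sourced without side effects.
if [ "${1:-}" = "run" ]; then
    log_kmem_snapshot "${2:-/var/log/kmem-snapshot.log}"
fi
```

A crontab line such as "* * * * * /bin/sh /usr/local/sbin/kmem-snapshot.sh run" (path assumed) would take a snapshot every minute. Appending rather than overwriting keeps the last snapshot before a panic on disk, at the cost of a log that grows quickly and needs rotating.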
RE: Fresh 7.0 Install: Fatal Trap 12 panic when put under load
Removing KDB_UNATTENDED from your kernel will allow you to interact with the debugger and obtain backtraces etc, which is useful when dumps are not being saved.

Easier said than done, this caused a few panics - no dumps though ...g!! Still the same result ... the system seems to panic twice then hang. I will keep trying unless you have some other ideas??

Right, after trying for a number of days the system still just hung without letting me get either a dump or to interactively debug in the failed state, I reverted back to the Generic kernel, removed half the memory (2 of the 4 1GB sticks) and the system became stable. I inserted 1 of the 2 removed sticks and all was fine. I swapped that stick with the remaining stick and all was fine. I put them both back in and I started to see the crashes again - the first of which gave me this dump --

server251# kgdb /boot/kernel/kernel /var/crash/vmcore.1
[GDB will not be able to debug user-mode threads: /usr/lib/libthread_db.so: Undefined symbol ps_pglobal_lookup]
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd".

Unread portion of the kernel message buffer:

Fatal trap 12: page fault while in kernel mode
cpuid = 1; apic id = 01
fault virtual address = 0xb0
fault code = supervisor read data, page not present
instruction pointer = 0x8:0x8068d4bd
stack pointer = 0x10:0xb20738e0
frame pointer = 0x10:0x0
code segment = base 0x0, limit 0xf, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 72836 (objdump)
trap number = 12
panic: page fault
cpuid = 1
Uptime: 28m4s
Physical memory: 4082 MB
Dumping 518 MB: 503 487 471 455 439 423 407 391 375 359 343 327 311 295 279 263 247 231 215 199 183 167 151 135 119 103 87 71 55 39 23 7

#0 doadump () at pcpu.h:194
194 pcpu.h: No such file or directory.
in pcpu.h
(kgdb) backtrace
#0 doadump () at pcpu.h:194
#1 0x0004 in ?? ()
#2 0x80477699 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:409
#3 0x80477a9d in panic (fmt=0x104 <Address 0x104 out of bounds>) at /usr/src/sys/kern/kern_shutdown.c:563
#4 0x8072ed44 in trap_fatal (frame=0xff003c39c000, eva=18446742974629017808) at /usr/src/sys/amd64/amd64/trap.c:724
#5 0x8072f115 in trap_pfault (frame=0xb2073830, usermode=0) at /usr/src/sys/amd64/amd64/trap.c:641
#6 0x8072fa58 in trap (frame=0xb2073830) at /usr/src/sys/amd64/amd64/trap.c:410
#7 0x807156be in calltrap () at /usr/src/sys/amd64/amd64/exception.S:169
#8 0x8068d4bd in vm_page_cache_remove (m=0xff00da9ec3b8) at /usr/src/sys/vm/vm_page.c:896
#9 0x8068e1b5 in vm_page_alloc (object=0xff00374ffc30, pindex=14, req=64) at /usr/src/sys/vm/vm_page.c:1080
#10 0x8067fa77 in vm_fault (map=0xff0005f23d00, vaddr=34365804544, fault_type=1 '\001', fault_flags=0) at /usr/src/sys/vm/vm_fault.c:432
#11 0x8072efaf in trap_pfault (frame=0xb2073c70, usermode=1) at /usr/src/sys/amd64/amd64/trap.c:618
#12 0x8072fbf8 in trap (frame=0xb2073c70) at /usr/src/sys/amd64/amd64/trap.c:309
#13 0x807156be in calltrap () at /usr/src/sys/amd64/amd64/exception.S:169
#14 0x00080059c54f in ?? ()
Previous frame inner to this frame (corrupt stack?)

So to answer your question, "are the backtraces always the same": no, they are not. But I am still confused as to what this means?? I would appreciate any further insight anyone can give.

Thanks
John
Re: Fresh 7.0 Install: Fatal Trap 12 panic when put under load
John Sullivan wrote: Removing KDB_UNATTENDED from your kernel will allow you to interact with the debugger and obtain backtraces etc, which is useful when dumps are not being saved. Easier said than done, this cause a few panics - no dumps though ...g!! Still the same result ... the system seems to panic twice then hang. I will keep trying unless you have some other ideas?? Right, after trying for a number of days the system still just hung without letting me get either a dump or to interactively debug in the failed state, I reverted back to the Generic kernel, removed half the memory (2 of the 4 1GB sticks) and the system became stable. I inserted 1 of the 2 removed sticks and all was fine. I swapped that stick with the remaining stick and all was fine. I put them both back in and I started to see the crashes again - the first of which, gave me this dump -- server251# kgdb /boot/kernel/kernel /var/crash/vmcore.1 [GDB will not be able to debug user-mode threads: /usr/lib/libthread_db.so: Undefined symbol ps_pglobal_lookup] GNU gdb 6.1.1 [FreeBSD] Copyright 2004 Free Software Foundation, Inc. GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions. Type show copying to see the conditions. There is absolutely no warranty for GDB. Type show warranty for details. This GDB was configured as amd64-marcel-freebsd. 
Unread portion of the kernel message buffer: Fatal trap 12: page fault while in kernel mode cpuid = 1; apic id = 01 fault virtual address= 0xb0 fault code= supervisor read data, page not present instruction pointer= 0x8:0x8068d4bd stack pointer= 0x10:0xb20738e0 frame pointer= 0x10:0x0 code segment= base 0x0, limit 0xf, type 0x1b = DPL 0, pres 1, long 1, def32 0, gran 1 processor eflags= interrupt enabled, resume, IOPL = 0 current process= 72836 (objdump) trap number= 12 panic: page fault cpuid = 1 Uptime: 28m4s Physical memory: 4082 MB Dumping 518 MB: 503 487 471 455 439 423 407 391 375 359 343 327 311 295 279 263 247 231 215 199 183 167 151 135 119 103 87 71 55 39 23 7 #0 doadump () at pcpu.h:194 194pcpu.h: No such file or directory. in pcpu.h (kgdb) backtrace #0 doadump () at pcpu.h:194 #1 0x0004 in ?? () #2 0x80477699 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:409 #3 0x80477a9d in panic (fmt=0x104 Address 0x104 out of bounds) at /usr/src/sys/kern/kern_shutdown.c:563 #4 0x8072ed44 in trap_fatal (frame=0xff003c39c000, eva=18446742974629017808) at /usr/src/sys/amd64/amd64/trap.c:724 #5 0x8072f115 in trap_pfault (frame=0xb2073830, usermode=0) at /usr/src/sys/amd64/amd64/trap.c:641 #6 0x8072fa58 in trap (frame=0xb2073830) at /usr/src/sys/amd64/amd64/trap.c:410 #7 0x807156be in calltrap () at /usr/src/sys/amd64/amd64/exception.S:169 #8 0x8068d4bd in vm_page_cache_remove (m=0xff00da9ec3b8) at /usr/src/sys/vm/vm_page.c:896 #9 0x8068e1b5 in vm_page_alloc (object=0xff00374ffc30, pindex=14, req=64) at /usr/src/sys/vm/vm_page.c:1080 #10 0x8067fa77 in vm_fault (map=0xff0005f23d00, vaddr=34365804544, fault_type=1 '\001', fault_flags=0) at /usr/src/sys/vm/vm_fault.c:432 #11 0x8072efaf in trap_pfault (frame=0xb2073c70, usermode=1) at /usr/src/sys/amd64/amd64/trap.c:618 #12 0x8072fbf8 in trap (frame=0xb2073c70) at /usr/src/sys/amd64/amd64/trap.c:309 #13 0x807156be in calltrap () at /usr/src/sys/amd64/amd64/exception.S:169 #14 0x00080059c54f in ?? 
()
Previous frame inner to this frame (corrupt stack?)

So to answer your question are the backtraces always the same, no, they are not. But I am still confused as to what this means?? I would appreciate any further insight anyone can give.

That's another corrupted backtrace that doesn't point to an actual software problem. Still sounds like bad RAM, or bad hardware.

Kris
Re: Fresh 7.0 Install: Fatal Trap 12 panic when put under load
On Jul 24, 2008, at 9:15 AM, John Sullivan wrote:
Right, after trying for a number of days the system still just hung without letting me get either a dump or to interactively debug in the failed state, I reverted back to the Generic kernel, removed half the memory (2 of the 4 1GB sticks) and the system became stable. I inserted 1 of the 2 removed sticks and all was fine. I swapped that stick with the remaining stick and all was fine. I put them both back in and I started to see the crashes again - the first of which, gave me this dump --

You might want to double-check the detailed documentation about your motherboard. There are a fair number of consumer-grade motherboards that can't reliably handle 4 double-sided DIMMs at full speed. Some of them require you to downgrade the memory clock from, say, PC3200 (aka 200MHz DDR) down to PC2700 speed (aka 166MHz DDR); others may work, but only if you install the more expensive buffered type of RAM (which also tends to include ECC) rather than generic unbuffered RAM.

Regards,
--
-Chuck
Re: Fresh 7.0 Install: Fatal Trap 12 panic when put under load
I have been having what seems like similar panics. I too cannot manage to get a crash dump, neither classic style nor minidump. Nor can I get it to work with DDB; there seems to be a problem with DDB and my Geom mirror.

Kris recommended I up kmem_size, which I have done (twice now), and since the last time I upped it, the machine has not crashed again (yet?). For the moment, I'm hoping things are stable.

In /boot/loader.conf, I currently have the following:

vm.kmem_size=1G
vm.kmem_size_max=1G
vm.kmem_size_scale=2

and in my kernel conf file I have:

options KVA_PAGES=512

Here's what top says currently:

last pid: 57367; load averages: 0.56, 0.54, 0.61 up 2+10:16:57 15:50:55
407 processes: 6 running, 378 sleeping, 2 zombie, 21 waiting
CPU states: 0.1% user, 0.0% nice, 2.3% system, 0.7% interrupt, 97.0% idle
Mem: 1309M Active, 1291M Inact, 497M Wired, 155M Cache, 199M Buf, 7408K Free
Swap: 9541M Total, 1628K Used, 9540M Free

Is this a heavily loaded machine? It's using a lot of memory, but it's mostly idle.

I have 2 sticks of double-sided memory (4gig total) in the box. The SuperMicro documentation recommends using single-sided sticks for 6 or more sticks.

I feel for you John, I've lost many nights' sleep in the last couple of weeks trying to understand why this production box was crashing. I was really surprised to see this start happening; normally my FreeBSD boxes have uptimes in terms of years, not hours.

Michael Grant
Re: Fresh 7.0 Install: Fatal Trap 12 panic when put under load
Michael Grant wrote:
I have been having what seems like similar panics. I too cannot manage to get a crash dump, neither classic style nor minidump. Nor can I get it to work with DDB, there seems to be a problem with DDB and my Geom mirror.

They're not at all similar, please don't confuse the issue :)

Kris
Re: Fresh 7.0 Install: Fatal Trap 12 panic when put under load
I feel for you John, I've lost many nights' sleep in the last couple of weeks trying to understand why this production box was crashing. I was really surprised to see this start happening; normally my FreeBSD boxes have uptimes in terms of years, not hours.

Thanks for the sentiment, at last I have been able to smile about this problem - maybe we should start a support group ... I'll start ... Hi, I'm John and I'm a failing sys admin, I haven't had a panic for 2 hours now and I'm taking it just 1 tick at a time ;-)

Just to share with the group, I had an email from Kris off of the list that made a lot of sense. I'm beginning to agree with him that it is probably a hardware issue. I'll go quiet now and spend some money on different hardware.

For anyone who finds this thread on Google, I can only echo Michael's comments - the thing that makes these panics so infuriating is that even with dodgy old hardware FreeBSD has always proven to be a very stable OS for me and, as you can see, the community is always willing to help.

Thanks to all that have spent time on this issue for me.

John
RE: Fresh 7.0 Install: Fatal Trap 12 panic when put under load
Could be memory, but I'd also suggest looking at temperatures. I've had overheating systems produce lots of such errors.

Temperature is fine - it never gets that hot here in the UK ;-) Seriously, I put my hand in the box and touched a few heat sinks; it is not running hot enough to cause a problem. The BIOS reports that all is well with the temperature inside the box of just over 30 degrees C.

John
RE: Fresh 7.0 Install: Fatal Trap 12 panic when put under load
John, a question, how is swap set up on your system? I was swapping to a file (a memory disk device /dev/md0). I was doing this because for some reason lost in ancient history, this machine was not set up with a real swap partition. Hence, no crash dump.

Swap is a partition on the 1st disk.

Last night I repartitioned a second disk, set up a real swap partition and now I'm currently waiting for this to happen again so I can get a crash dump.

I will try creating a swap partition on my second drive to see if that improves things ... I am able to cause a panic on demand but a crash dump is rarely written (presumably because the system believes the device is not accessible?). I must have crashed it 10-20 times now with various corruptions of the panic screen - once it had blue text with trap 12 trap 12 all over the screen, I liked that one ;-).

I did manage to complete a make index while the background FSCK was running, once it had finished, performing the same task caused a panic locking the machine up again with no crash dump.

John
Re: Fresh 7.0 Install: Fatal Trap 12 panic when put under load
John Sullivan wrote:
John, a question, how is swap set up on your system? I was swapping to a file (a memory disk device /dev/md0). I was doing this because for some reason lost in ancient history, this machine was not set up with a real swap partition. Hence, no crash dump.

Swap is a partition on the 1st disk.

Last night I repartitioned a second disk, set up a real swap partition and now I'm currently waiting for this to happen again so I can get a crash dump.

I will try creating a swap partition on my second drive to see if that improves things ... I am able to cause a panic on demand but a crash dump is rarely written (presumably because the system believes the device is not accessible?). I must have crashed it 10-20 times now with various corruptions of the panic screen - once it had blue text with trap 12 trap 12 all over the screen, I liked that one ;-).

I did manage to complete a make index while the background FSCK was running, once it had finished, performing the same task caused a panic locking the machine up again with no crash dump.

OK, the first thing to do is disable bg fsck, then force a full fsck of all filesystems. bg fsck does a poor job of fixing arbitrary filesystem corruption (it's not designed to do so, in fact), and you can get into a situation where corrupted filesystems cause further panics.

Removing KDB_UNATTENDED from your kernel will allow you to interact with the debugger and obtain backtraces etc, which is useful when dumps are not being saved.

Kris
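As a sketch of the first step, background fsck is switched off from /etc/rc.conf. The fragment below assumes a stock FreeBSD rc.conf and shows only that one setting; the full check that follows is normally run by hand from single-user mode, and the exact flags are best taken from fsck(8) rather than from this note.

```shell
# /etc/rc.conf fragment (plain sh variable assignments)
background_fsck="NO"    # check filesystems in the foreground at boot
```

For the second step, the kernel configuration line in question is "options KDB_UNATTENDED". With that line removed and the kernel rebuilt, a kernel that still has the debugger compiled in should stop at the debugger prompt on panic instead of resetting, which is what makes an interactive backtrace possible.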
Re: Fresh 7.0 Install: Fatal Trap 12 panic when put under load
On Wed, Jul 16, 2008 at 10:38 AM, John Sullivan [EMAIL PROTECTED] wrote:
Could be memory, but I'd also suggest looking at temperatures. I've had overheating systems produce lots of such errors.

Temperature is fine - it never gets that hot here in the UK ;-) Seriously, I put my hand in the box and touched a few heat sinks; it is not running hot enough to cause a problem. The BIOS reports that all is well with the temperature inside the box of just over 30 degrees C.

John

This looks like the same panic I reported yesterday, but I'm running 6.3 patch 2. I have seen these crashes on my box since 6.3 pre-release, randomly, but under load. My box is based on a SuperMicro motherboard running Intel Xeon processors. The only commonality is that we're both using SATA drives.

John, a question, how is swap set up on your system? I was swapping to a file (a memory disk device /dev/md0). I was doing this because for some reason lost in ancient history, this machine was not set up with a real swap partition. Hence, no crash dump.

Last night I repartitioned a second disk, set up a real swap partition and now I'm currently waiting for this to happen again so I can get a crash dump.

Michael Grant
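The swap point matters because the kernel writes its crash dump to a configured dump device, which has to be a real partition. A minimal /etc/rc.conf sketch of that setup follows; the device name is a placeholder, not something taken from this thread, and should be replaced with the new swap partition.

```shell
# /etc/rc.conf fragment (sketch; the device name is a placeholder)
dumpdev="/dev/ad6s1b"   # hypothetical swap partition, large enough to hold the dump
dumpdir="/var/crash"    # where savecore(8) writes vmcore.N on the next boot
```

After the next panic and reboot, a vmcore.N file in the dump directory is what kgdb needs in order to produce a backtrace.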
Re: Fresh 7.0 Install: Fatal Trap 12 panic when put under load
Michael Grant wrote:
On Wed, Jul 16, 2008 at 10:38 AM, John Sullivan [EMAIL PROTECTED] wrote:
Could be memory, but I'd also suggest looking at temperatures. I've had overheating systems produce lots of such errors.

Temperature is fine - it never gets that hot here in the UK ;-) Seriously, I put my hand in the box and touched a few heat sinks; it is not running hot enough to cause a problem. The BIOS reports that all is well with the temperature inside the box of just over 30 degrees C.

John

This looks like the same panic I reported yesterday but I'm running 6.3 patch 2.

Unless you have information you haven't yet shared, no it doesn't :) Fatal trap 12 is an effect, not a cause. We still need your backtrace to make progress understanding the cause of your panic.

Kris
Re: Fresh 7.0 Install: Fatal Trap 12 panic when put under load
> From: John Sullivan [EMAIL PROTECTED]
> Date: Wed, 16 Jul 2008 09:38:26 +0100
>
>> Could be memory, but I'd also suggest looking at temperatures. I've had overheating systems produce lots of such errors.
>
> Temperature is fine - it never gets that hot here in the UK ;-) Seriously, I put my hand in the box and touched a few heat sinks; it is not running hot enough to cause a problem. The BIOS reports that all is well, with the temperature inside the box at just over 30 degrees C.

It's not the heat sink temperature that I am concerned with. It is the temperature of the CPU and (if it's not AMD) the north bridge. I have encountered several cases of improper heat sink installation which resulted in poor heat transfer from the chip to the heat sink. Cleaning and properly applying heat transfer grease made a huge difference.

You say that the BIOS is reporting a 30C temperature. If this is the CPU temperature when the CPU is busy, I don't believe it. I have a system where the BIOS (via ACPI) reports the temperature as 35C, regardless of how long the system has been under power or what it is doing.

I'm not at all sure that the problem is thermal, but I don't think you should dismiss the possibility too quickly.
-- 
R. Kevin Oberman, Network Engineer
Energy Sciences Network (ESnet)
Ernest O. Lawrence Berkeley National Laboratory (Berkeley Lab)
E-mail: [EMAIL PROTECTED]  Phone: +1 510 486-8634
Key fingerprint:059B 2DDF 031C 9BA3 14A4 EADA 927D EBB3 987B 3751
Re: Fresh 7.0 Install: Fatal Trap 12 panic when put under load
> OK, the first thing to do is disable bg fsck, then force a full fsck of all filesystems. bg fsck does a poor job of fixing arbitrary filesystem corruption (it's not designed to do so, in fact), and you can get into a situation where corrupted filesystems cause further panics.

Done. Nothing much was found apart from a wrong size in the superblock, which it corrected.

> Removing KDB_UNATTENDED from your kernel will allow you to interact with the debugger and obtain backtraces etc, which is useful when dumps are not being saved.

Easier said than done; this caused a few panics - no dumps though ...g!! Still the same result ... the system seems to panic twice then hang. I will keep trying unless you have some other ideas??

Thanks for your support
John

This message was sent using IMP, the Internet Messaging Program.
Fresh 7.0 Install: Fatal Trap 12 panic when put under load
I am experiencing 'random' reboots interspersed with panics whenever I put a newly installed system under load (make index in /usr/ports is enough). A sample panic is at the end of this email.

I have updated to 7.0-RELEASE-p2 using the GENERIC amd64 kernel and it is still the same.

The system is a Gigabyte GA-M56S-S3 motherboard with 4GB of RAM, an Athlon X2 6400+ and 3 x Maxtor SATA 750GB HDDs (only the first is currently in use). The first disk is all allocated to FreeBSD using UFS. There is also a Linksys 802.11a/b/g card installed. I have flashed the BIOS to the latest revision (F4e). The onboard RAID is disabled.

At the moment there is no exotic software installed.

Although I have been using FreeBSD for a number of years, this is the first time I have experienced regular panics and I am at a complete loss trying to work out what is wrong. I would be grateful for any advice anyone is willing to give to help me troubleshoot this issue.

Thanks in advance
John

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address = 0x80b0
fault code = supervisor write data, page not present
instruction pointer = 0x8:0x804db18c
stack pointer = 0x10:b1e92450
frame pointer = 0x10:ffec
code segment = base 0x0, limit 0xf, type 0x16, DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process
kernel trap 12 with interrupts disabled

# nm -n /boot/kernel/kernel | grep 804db
804dbac0 t flushbufqueues
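A note on the nm output in this report: the only symbol the grep matched, flushbufqueues, starts at 804dbac0, which is above the faulting instruction pointer 804db18c, so the fault would lie in whichever function precedes it in the symbol table. Below is a minimal Python sketch of that nearest-preceding-symbol lookup. The addresses are used exactly as they appear in the post; `hypothetical_prev_func` and its address are invented placeholders, not real kernel symbols, and `resolve` is a helper name chosen here.

```python
import bisect

# (address, name) pairs sorted by address, as `nm -n` would list them.
# Only flushbufqueues comes from the post; the first entry is a made-up
# placeholder standing in for whatever symbol really precedes it.
SYMBOLS = [
    (0x804DA000, "hypothetical_prev_func"),  # assumption, not a real symbol
    (0x804DBAC0, "flushbufqueues"),          # from the nm output in the post
]


def resolve(addr, symbols=SYMBOLS):
    """Return (name, offset) of the last symbol starting at or below addr."""
    starts = [start for start, _ in symbols]
    i = bisect.bisect_right(starts, addr) - 1
    if i < 0:
        return None  # addr is below every known symbol
    start, name = symbols[i]
    return name, addr - start


if __name__ == "__main__":
    name, offset = resolve(0x804DB18C)  # instruction pointer from the panic
    print(f"{name}+{offset:#x}")
```

With a complete `nm -n` listing in place of the two-entry table, the same lookup would name the function that actually contains the faulting instruction.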
Re: Fresh 7.0 Install: Fatal Trap 12 panic when put under load
On Tue, Jul 15, 2008 at 10:58:19AM +0100, John Sullivan wrote:
> I am experiencing 'random' reboots interspersed with panics whenever I put a newly installed system under load (make index in /usr/ports is enough). A sample panic is at the end of this email.
>
> I have updated to 7.0-RELEASE-p2 using the GENERIC amd64 kernel and it is still the same.
>
> The system is a Gigabyte GA-M56S-S3 motherboard with 4GB of RAM, an Athlon X2 6400+ and 3 x Maxtor SATA 750GB HDD's (only the first is currently in use). The first disk is all allocated to FreeBSD using UFS. There is also a Linksys 802.11a/b/g card installed. I have flashed the BIOS to the latest revision (F4e). The onboard RAID is disabled.
>
> At the moment there is no exotic software installed.
>
> Although I have been using FreeBSD for a number of years this is the first time I have experienced regular panics and am at a complete loss trying to work out what is wrong. I would be grateful for any advice anyone is willing to give to help me troubleshoot this issue.

Can the system in question run memtest86+ successfully (no errors) for an hour? It would help diminish (but not entirely rule out) hardware (memory or chipset) issues.

-- 
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |
RE: Fresh 7.0 Install: Fatal Trap 12 panic when put under load
> Can the system in question run memtest86+ successfully (no errors) for an hour? It would help diminish (but not entirely rule out) hardware (memory or chipset) issues.

Sorry, forgot to mention: I ran memtest overnight without any problem reported. I ran Fedora 9 for a month without any issue - FreeBSD 7.0 crashes within an hour.

John
Re: Fresh 7.0 Install: Fatal Trap 12 panic when put under load
John Sullivan wrote:
>> Can the system in question run memtest86+ successfully (no errors) for an hour? It would help diminish (but not entirely rule out) hardware (memory or chipset) issues.
>
> Sorry, forgot to mention, I ran memtest over night without any problem reported. I ran Fedora 9 for a month without any issue - FreeBSD 7.0 crashes within an hour.

Well, that doesn't rule out hardware failure. Different OSes may use different capabilities of the hardware, or just use it in a different way, and that can provoke failures from marginal hardware.

Please collect kgdb/ddb backtraces.

Kris
Re: Fresh 7.0 Install: Fatal Trap 12 panic when put under load
Please collect kgdb/ddb backtraces. kgdb backtrace: server251# kgdb -c /var/crash/vmcore.0 kgdb: couldn't find a suitable kernel image server251# kgdb /boot/kernel/kernel /var/crash/vmcore.0 kgdb: kvm_read: invalid address (0xff00010e5468) [GDB will not be able to debug user-mode threads: /usr/lib/libthread_db.so: Unde fined symbol ps_pglobal_lookup] GNU gdb 6.1.1 [FreeBSD] Copyright 2004 Free Software Foundation, Inc. GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions. Type show copying to see the conditions. There is absolutely no warranty for GDB. Type show warranty for details. This GDB was configured as amd64-marcel-freebsd. Unread portion of the kernel message buffer: Fatal trap 12: page fault while in kernel mode cpuid = 0; apic id = 00 fault virtual address = 0x64 fault code = supervisor read instruction, page not present instruction pointer = 0x8:0x64 stack pointer = 0x10:0xb1d7f590 frame pointer = 0x10:0xff0035d2dcc0 code segment = base 0x0, limit 0xf, type 0x1b = DPL 0, pres 1, long 1, def32 0, gran 1 processor eflags = interrupt enabled, resume, IOPL = 0 current process = 88622 (make) trap number = 12 panic: page fault cpuid = 0 Uptime: 5h57m22s Physical memory: 4082 MB Dumping 444 MB: 429 413 397 381 365 349 333 317 301 285 269 253 237 221 205 189 173 157 141 125 109 93 77 61 45 29 13 #0 doadump () at pcpu.h:194 194 pcpu.h: No such file or directory. in pcpu.h (kgdb) (kgdb) list *0x64 No source file for address 0x64. (kgdb) backtrace #0 doadump () at pcpu.h:194 #1 0xff0004742440 in ?? 
() #2 0x80477699 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:409 #3 0x80477a9d in panic (fmt=0x104 Address 0x104 out of bounds) at /usr/src/sys/kern/kern_shutdown.c:563 #4 0x8072ed44 in trap_fatal (frame=0xff00048ee000, eva=18446742974275512528) at /usr/src/sys/amd64/amd64/trap.c:724 #5 0x8072f115 in trap_pfault (frame=0xb1d7f4e0, usermode=0) at /usr/src/sys/amd64/amd64/trap.c:641 #6 0x8072fa58 in trap (frame=0xb1d7f4e0) at /usr/src/sys/amd64/amd64/trap.c:410 #7 0x807156be in calltrap () at /usr/src/sys/amd64/amd64/exception.S:169 #8 0x0064 in ?? () #9 0x8067d3ee in uma_zalloc_arg (zone=0xff00bfed07e0, udata=0x0, flags=-256) at /usr/src/sys/vm/uma_core.c:1835 #10 0x80661ecf in ffs_vget (mp=0xff00047f4978, ino=47884512, flags=2, vpp=0xb1d7f728) at uma.h:277 #11 0x8066d010 in ufs_lookup (ap=0xb1d7f780) at /usr/src/sys/ufs/ufs/ufs_lookup.c:573 #12 0x804dfa89 in vfs_cache_lookup (ap=Variable ap is not available. ) at vnode_if.h:83 #13 0x8077235f in VOP_LOOKUP_APV (vop=0x809e7de0, a=0xb1d7f840) at vnode_if.c:99 ---Type return to continue, or q return to quit--- #14 0x804e6394 in lookup (ndp=0xb1d7f950) at vnode_if.h:57 #15 0x804e7228 in namei (ndp=0xb1d7f950) at /usr/src/sys/kern/vfs_lookup.c:219 #16 0x804f4717 in kern_stat (td=0xff00048ee000, path=0x8006f7040 Address 0x8006f7040 out of bounds, pathseg=Variable path seg is not available. ) at /usr/src/sys/kern/vfs_syscalls.c:2109 #17 0x804f4987 in stat (td=Variable td is not available. ) at /usr/src/sys/kern/vfs_syscalls.c:2093 #18 0x8072f397 in syscall (frame=0xb1d7fc70) at /usr/src/sys/amd64/amd64/trap.c:852 #19 0x807158cb in Xfast_syscall () at /usr/src/sys/amd64/amd64/exception.S:290 #20 0x0043127c in ?? () Previous frame inner to this frame (corrupt stack?) I really don't understand this -any advice you can give would really be appreciated. John This message was sent using IMP, the Internet Messaging Program. 
Re: Fresh 7.0 Install: Fatal Trap 12 panic when put under load
[EMAIL PROTECTED] wrote:
> (kgdb) backtrace
> #0 doadump () at pcpu.h:194
> #1 0xff0004742440 in ?? ()
> #2 0x80477699 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:409
> #3 0x80477a9d in panic (fmt=0x104 Address 0x104 out of bounds) at /usr/src/sys/kern/kern_shutdown.c:563
> #4 0x8072ed44 in trap_fatal (frame=0xff00048ee000, eva=18446742974275512528) at /usr/src/sys/amd64/amd64/trap.c:724
> #5 0x8072f115 in trap_pfault (frame=0xb1d7f4e0, usermode=0) at /usr/src/sys/amd64/amd64/trap.c:641
> #6 0x8072fa58 in trap (frame=0xb1d7f4e0) at /usr/src/sys/amd64/amd64/trap.c:410
> #7 0x807156be in calltrap () at /usr/src/sys/amd64/amd64/exception.S:169
> #8 0x0064 in ?? ()
> #9 0x8067d3ee in uma_zalloc_arg (zone=0xff00bfed07e0, udata=0x0, flags=-256) at /usr/src/sys/vm/uma_core.c:1835

OK, that is

    if (zone->uz_ctor != NULL) {
        if (zone->uz_ctor(item, zone->uz_keg->uk_size,

uz_ctor is indeed not null, but it's got 3 bits set. Not impossible that it's bad RAM still. I didn't spot anything that could cause it otherwise, but I don't know this code in detail.

Do all of the panics have the same backtrace?

Kris

> #10 0x80661ecf in ffs_vget (mp=0xff00047f4978, ino=47884512, flags=2, vpp=0xb1d7f728) at uma.h:277
> #11 0x8066d010 in ufs_lookup (ap=0xb1d7f780) at /usr/src/sys/ufs/ufs/ufs_lookup.c:573
> #12 0x804dfa89 in vfs_cache_lookup (ap=Variable ap is not available. ) at vnode_if.h:83
> #13 0x8077235f in VOP_LOOKUP_APV (vop=0x809e7de0, a=0xb1d7f840) at vnode_if.c:99
> #14 0x804e6394 in lookup (ndp=0xb1d7f950) at vnode_if.h:57
> #15 0x804e7228 in namei (ndp=0xb1d7f950) at /usr/src/sys/kern/vfs_lookup.c:219
> #16 0x804f4717 in kern_stat (td=0xff00048ee000, path=0x8006f7040 Address 0x8006f7040 out of bounds, pathseg=Variable path seg is not available. ) at /usr/src/sys/kern/vfs_syscalls.c:2109
> #17 0x804f4987 in stat (td=Variable td is not available. ) at /usr/src/sys/kern/vfs_syscalls.c:2093
> #18 0x8072f397 in syscall (frame=0xb1d7fc70) at /usr/src/sys/amd64/amd64/trap.c:852
> #19 0x807158cb in Xfast_syscall () at /usr/src/sys/amd64/amd64/exception.S:290
> #20 0x0043127c in ?? ()
> Previous frame inner to this frame (corrupt stack?)
>
> I really don't understand this - any advice you can give would really be appreciated.
>
> John
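Kris's remark that uz_ctor has three bits set can be checked with a few lines of Python. This is only a sketch of the arithmetic: `set_bits` is a helper name chosen here, and the assumption that the uncorrupted value was 0 comes from the suggestion elsewhere in this thread that the FFS inode zone has no constructor.

```python
def set_bits(value):
    """Return the positions of the 1-bits in value, lowest bit first."""
    return [pos for pos in range(value.bit_length()) if (value >> pos) & 1]


if __name__ == "__main__":
    uz_ctor = 0x64   # faulting instruction pointer from the panic
    expected = 0x0   # assumption: the zone has no constructor
    flipped = set_bits(uz_ctor ^ expected)
    print(f"{uz_ctor:#x} = {uz_ctor:#010b}, differs from {expected:#x} in bits {flipped}")
```

Three differing bits means a single or double bit flip from zero would not produce 0x64 on its own; a stray write of the value 100 into the structure would fit equally well, so the count by itself does not separate bad RAM from a kernel bug.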
Re: Fresh 7.0 Install: Fatal Trap 12 panic when put under load
> #9 0x8067d3ee in uma_zalloc_arg (zone=0xff00bfed07e0, udata=0x0, flags=-256) at /usr/src/sys/vm/uma_core.c:1835
>
> From the frame #9, please do p *zone. I am esp. interested in the value of the uz_ctor member. It seems that it becomes corrupted; its value should be 0, as this seems to be the ffs inode zone. I suspect that gdb would show 0x64 instead.

I am afraid that you may need to spell out each step for me :-(

(kgdb) p *zone
No symbol "zone" in current context.
(kgdb) list *0x8067d3ee
0x8067d3ee is in uma_zalloc_arg (/usr/src/sys/vm/uma_core.c:1835).
1830            ("uma_zalloc: Bucket pointer mangled."));
1831        cache->uc_allocs++;
1832        critical_exit();
1833    #ifdef INVARIANTS
1834        ZONE_LOCK(zone);
1835        uma_dbg_alloc(zone, NULL, item);
1836        ZONE_UNLOCK(zone);
1837    #endif
1838        if (zone->uz_ctor != NULL) {
1839            if (zone->uz_ctor(item, zone->uz_keg->uk_size,

Is this what you were looking for?

John
Re: Fresh 7.0 Install: Fatal Trap 12 panic when put under load
On Tue, Jul 15, 2008 at 08:19:15PM +0100, [EMAIL PROTECTED] wrote: Please collect kgdb/ddb backtraces. kgdb backtrace: server251# kgdb -c /var/crash/vmcore.0 kgdb: couldn't find a suitable kernel image server251# kgdb /boot/kernel/kernel /var/crash/vmcore.0 kgdb: kvm_read: invalid address (0xff00010e5468) [GDB will not be able to debug user-mode threads: /usr/lib/libthread_db.so: Unde fined symbol ps_pglobal_lookup] GNU gdb 6.1.1 [FreeBSD] Copyright 2004 Free Software Foundation, Inc. GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions. Type show copying to see the conditions. There is absolutely no warranty for GDB. Type show warranty for details. This GDB was configured as amd64-marcel-freebsd. Unread portion of the kernel message buffer: Fatal trap 12: page fault while in kernel mode cpuid = 0; apic id = 00 fault virtual address = 0x64 fault code = supervisor read instruction, page not present instruction pointer = 0x8:0x64 stack pointer = 0x10:0xb1d7f590 frame pointer = 0x10:0xff0035d2dcc0 code segment = base 0x0, limit 0xf, type 0x1b = DPL 0, pres 1, long 1, def32 0, gran 1 processor eflags = interrupt enabled, resume, IOPL = 0 current process = 88622 (make) trap number = 12 panic: page fault cpuid = 0 Uptime: 5h57m22s Physical memory: 4082 MB Dumping 444 MB: 429 413 397 381 365 349 333 317 301 285 269 253 237 221 205 189 173 157 141 125 109 93 77 61 45 29 13 #0 doadump () at pcpu.h:194 194 pcpu.h: No such file or directory. in pcpu.h (kgdb) (kgdb) list *0x64 No source file for address 0x64. (kgdb) backtrace #0 doadump () at pcpu.h:194 #1 0xff0004742440 in ?? 
() #2 0x80477699 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:409 #3 0x80477a9d in panic (fmt=0x104 Address 0x104 out of bounds) at /usr/src/sys/kern/kern_shutdown.c:563 #4 0x8072ed44 in trap_fatal (frame=0xff00048ee000, eva=18446742974275512528) at /usr/src/sys/amd64/amd64/trap.c:724 #5 0x8072f115 in trap_pfault (frame=0xb1d7f4e0, usermode=0) at /usr/src/sys/amd64/amd64/trap.c:641 #6 0x8072fa58 in trap (frame=0xb1d7f4e0) at /usr/src/sys/amd64/amd64/trap.c:410 #7 0x807156be in calltrap () at /usr/src/sys/amd64/amd64/exception.S:169 #8 0x0064 in ?? () #9 0x8067d3ee in uma_zalloc_arg (zone=0xff00bfed07e0, udata=0x0, flags=-256) at /usr/src/sys/vm/uma_core.c:1835 From the frame #9, please do p *zone I am esp. interested in the value of the uz_ctor member. It seems that it becomes corrupted, it value should be 0, as this seems to be ffs inode zone. I suspect that gdb would show 0x64 instead. That may be kernel memory corruption, but might be a bad memory as well (double bit inversion ?). #10 0x80661ecf in ffs_vget (mp=0xff00047f4978, ino=47884512, flags=2, vpp=0xb1d7f728) at uma.h:277 #11 0x8066d010 in ufs_lookup (ap=0xb1d7f780) at /usr/src/sys/ufs/ufs/ufs_lookup.c:573 #12 0x804dfa89 in vfs_cache_lookup (ap=Variable ap is not available. ) at vnode_if.h:83 #13 0x8077235f in VOP_LOOKUP_APV (vop=0x809e7de0, a=0xb1d7f840) at vnode_if.c:99 ---Type return to continue, or q return to quit--- #14 0x804e6394 in lookup (ndp=0xb1d7f950) at vnode_if.h:57 #15 0x804e7228 in namei (ndp=0xb1d7f950) at /usr/src/sys/kern/vfs_lookup.c:219 #16 0x804f4717 in kern_stat (td=0xff00048ee000, path=0x8006f7040 Address 0x8006f7040 out of bounds, pathseg=Variable path seg is not available. ) at /usr/src/sys/kern/vfs_syscalls.c:2109 #17 0x804f4987 in stat (td=Variable td is not available. 
) at /usr/src/sys/kern/vfs_syscalls.c:2093
#18 0x8072f397 in syscall (frame=0xb1d7fc70) at /usr/src/sys/amd64/amd64/trap.c:852
#19 0x807158cb in Xfast_syscall () at /usr/src/sys/amd64/amd64/exception.S:290
#20 0x0043127c in ?? ()
Previous frame inner to this frame (corrupt stack?)

I really don't understand this - any advice you can give would really be appreciated.
Re: Fresh 7.0 Install: Fatal Trap 12 panic when put under load
On Tue, Jul 15, 2008 at 08:47:03PM +0100, [EMAIL PROTECTED] wrote:
>> #9 0x8067d3ee in uma_zalloc_arg (zone=0xff00bfed07e0, udata=0x0, flags=-256) at /usr/src/sys/vm/uma_core.c:1835
>>
>> From the frame #9, please do p *zone. I am esp. interested in the value of the uz_ctor member. It seems that it becomes corrupted; its value should be 0, as this seems to be the ffs inode zone. I suspect that gdb would show 0x64 instead.
>
> I am afraid that you may need to spell out each step for me :-(
>
> (kgdb) p *zone
> No symbol "zone" in current context.

Run "frame 9" first, then "p *zone".

> (kgdb) list *0x8067d3ee
> 0x8067d3ee is in uma_zalloc_arg (/usr/src/sys/vm/uma_core.c:1835).
> 1830            ("uma_zalloc: Bucket pointer mangled."));
> 1831        cache->uc_allocs++;
> 1832        critical_exit();
> 1833    #ifdef INVARIANTS
> 1834        ZONE_LOCK(zone);
> 1835        uma_dbg_alloc(zone, NULL, item);
> 1836        ZONE_UNLOCK(zone);
> 1837    #endif
> 1838        if (zone->uz_ctor != NULL) {
> 1839            if (zone->uz_ctor(item, zone->uz_keg->uk_size,
>
> Is this that you were looking for?

No, see above.
Re: Fresh 7.0 Install: Fatal Trap 12 panic when put under load
> Do the frame 9 before p *zone.

It's obvious now you say it ;-) You are indeed right:

(kgdb) frame 9
#9 0x8067d3ee in uma_zalloc_arg (zone=0xff00bfed07e0, udata=0x0, flags=-256) at /usr/src/sys/vm/uma_core.c:1835
1835 uma_dbg_alloc(zone, NULL, item);
(kgdb) p *zone
$1 = {uz_name = 0x808084cd "FFS inode", uz_lock = 0xff00bfecf7f0,
  uz_keg = 0xff00bfecf7e0,
  uz_link = {le_next = 0x0, le_prev = 0xff00bfecf830},
  uz_full_bucket = {lh_first = 0xffe01a74c830},
  uz_free_bucket = {lh_first = 0xff00469bf830},
  uz_ctor = 0x64, uz_dtor = 0, uz_init = 0x9a, uz_fini = 0,
  uz_allocs = 17180460407, uz_frees = 504673, uz_fails = 0, uz_fills = 0,
  uz_count = 128,
  uz_cpu = {{uc_freebucket = 0xff000e5d6830, uc_allocbucket = 0xff003a5f7000,
      uc_allocs = 97, uc_frees = 0}}}

Now what does that mean??

I just experienced another panic, but it failed to write to disk :-(. I will force another one and check that the details are the same.

John
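One more plausibility check on the p *zone output, offered as a sketch rather than a diagnosis: uz_allocs (17180460407) is several orders of magnitude larger than uz_frees (504673). The Python below splits the counter into the bits set at or above a chosen boundary and the remainder below it. The helper name `high_bits_above` and the 32-bit boundary are assumptions made for illustration only.

```python
def high_bits_above(value, bound_bits=32):
    """Split value into (bit positions set at or above bound_bits, low part)."""
    high = [pos for pos in range(bound_bits, value.bit_length())
            if (value >> pos) & 1]
    low = value & ((1 << bound_bits) - 1)
    return high, low


if __name__ == "__main__":
    uz_allocs = 17180460407  # from the p *zone output
    uz_frees = 504673        # from the p *zone output
    high, low = high_bits_above(uz_allocs)
    print(f"uz_allocs = {uz_allocs:#x}, high bits set: {high}, low part: {low}")
    print(f"low part minus uz_frees: {low - uz_frees}")
```

With these inputs only bit 34 is set above the low 32 bits, and the remainder, 591223, sits slightly above uz_frees, as a live allocation count would. That is compatible with a single flipped bit in the counter, though it does not prove one.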
Re: Fresh 7.0 Install: Fatal Trap 12 panic when put under load
[EMAIL PROTECTED] wrote:
>> #9 0x8067d3ee in uma_zalloc_arg (zone=0xff00bfed07e0, udata=0x0, flags=-256) at /usr/src/sys/vm/uma_core.c:1835
>>
>> From the frame #9, please do p *zone. I am esp. interested in the value of the uz_ctor member. It seems that it becomes corrupted; its value should be 0, as this seems to be the ffs inode zone. I suspect that gdb would show 0x64 instead.
>
> I am afraid that you may need to spell out each step for me :-(
>
> (kgdb) p *zone
> No symbol "zone" in current context.
> (kgdb) list *0x8067d3ee
> 0x8067d3ee is in uma_zalloc_arg (/usr/src/sys/vm/uma_core.c:1835).
> 1830            ("uma_zalloc: Bucket pointer mangled."));
> 1831        cache->uc_allocs++;
> 1832        critical_exit();
> 1833    #ifdef INVARIANTS
> 1834        ZONE_LOCK(zone);
> 1835        uma_dbg_alloc(zone, NULL, item);
> 1836        ZONE_UNLOCK(zone);
> 1837    #endif
> 1838        if (zone->uz_ctor != NULL) {
> 1839            if (zone->uz_ctor(item, zone->uz_keg->uk_size,
>
> Is this that you were looking for?

Are you sure that is the same source tree you are running? The 7.0-RELEASE source has the zone->uz_ctor call on line 1835, which is consistent with your backtrace; the listing above shows uma_dbg_alloc() on that line instead.

Kris
Re: Fresh 7.0 Install: Fatal Trap 12 panic when put under load
> From: John Sullivan [EMAIL PROTECTED]
> Date: Tue, 15 Jul 2008 10:58:19 +0100
> Sender: [EMAIL PROTECTED]
>
> I am experiencing 'random' reboots interspersed with panics whenever I put a newly installed system under load (make index in /usr/ports is enough). A sample panic is at the end of this email.
>
> I have updated to 7.0-RELEASE-p2 using the GENERIC amd64 kernel and it is still the same.
>
> The system is a Gigabyte GA-M56S-S3 motherboard with 4GB of RAM, an Athlon X2 6400+ and 3 x Maxtor SATA 750GB HDDs (only the first is currently in use). The first disk is all allocated to FreeBSD using UFS. There is also a Linksys 802.11a/b/g card installed. I have flashed the BIOS to the latest revision (F4e). The onboard RAID is disabled.
>
> At the moment there is no exotic software installed.
>
> Although I have been using FreeBSD for a number of years this is the first time I have experienced regular panics and am at a complete loss trying to work out what is wrong. I would be grateful for any advice anyone is willing to give to help me troubleshoot this issue.
>
> Thanks in advance
> John
>
> Fatal trap 12: page fault while in kernel mode
> cpuid = 0; apic id = 00
> fault virtual address = 0x80b0
> fault code = supervisor write data, page not present
> instruction pointer = 0x8:0x804db18c
> stack pointer = 0x10:b1e92450
> frame pointer = 0x10:ffec
> code segment = base 0x0, limit 0xf, type 0x16, DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags = interrupt enabled, resume, IOPL = 0
> current process
> kernel trap 12 with interrupts disabled
>
> # nm -n /boot/kernel/kernel | grep 804db
> 804dbac0 t flushbufqueues

Could be memory, but I'd also suggest looking at temperatures. I've had overheating systems produce lots of such errors.
-- 
R. Kevin Oberman, Network Engineer
Energy Sciences Network (ESnet)
Ernest O. Lawrence Berkeley National Laboratory (Berkeley Lab)
E-mail: [EMAIL PROTECTED]  Phone: +1 510 486-8634
Key fingerprint:059B 2DDF 031C 9BA3 14A4 EADA 927D EBB3 987B 3751