Re: 9.1 minimal ram requirements
http://www.freebsd.org/cgi/query-pr.cgi?pr=174671

--
View this message in context: http://freebsd.1045724.n5.nabble.com/9-1-minimal-ram-requirements-tp5771583p5771862.html
Sent from the freebsd-stable mailing list archive at Nabble.com.
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org
Re: [HEADSUP] zfs root pool mounting
On Sun, Dec 23, 2012 at 2:49 PM, Andriy Gapon a...@freebsd.org wrote:
> on 23/12/2012 14:34 Kimmo Paasiala said the following:
>> On Sun, Dec 23, 2012 at 2:28 PM, Andriy Gapon a...@freebsd.org wrote:
>>> I have MFCed the following change, so please double-check if you might be affected. Preferably before upgrading :-)
>>>
>>> on 28/11/2012 20:35 Andriy Gapon said the following:
>>>> Recently some changes were made to how a root pool is opened for root filesystem mounting. Previously the root pool had to be present in zpool.cache. Now it is automatically discovered by probing available GEOM providers.
>>>>
>>>> The new scheme is believed to be more flexible. For example, it allows preparing a new root pool on one system, then exporting it and booting from it on a new system without doing any extra/magical steps with zpool.cache. It could also be convenient after zpool split and in some other situations.
>>>>
>>>> The change was introduced via multiple commits; the latest relevant revision in head is r243502. The changes are partially MFC-ed, the remaining parts are scheduled to be MFC-ed soon.
>>>>
>>>> I have received a report that the change caused a problem with booting on at least one system. The problem has been identified as an issue in the local environment and has been fixed. Please read on to see if you might be affected when you upgrade, so that you can avoid any unnecessary surprises.
>>>>
>>>> You might be affected if you ever had a pool named the same as your current root pool. And you still have any disks connected to your system that belonged to that pool (in whole or via some partitions). And that pool was never properly destroyed using zpool destroy, but merely abandoned (its disks re-purposed/re-partitioned/reused).
>>>>
>>>> If all of the above are true, then I recommend that you run 'zdb -l disk' for all suspect disks and their partitions (or just all disks and partitions). If this command reports at least one valid ZFS label for a disk or a partition that does not belong to any current pool, then the problem may affect you.
>>>>
>>>> The best course is to remove the offending labels.
>>>>
>>>> If you are affected, please follow up to this email.
>>>
>>> Much appreciated!
>>
>> I have verified that my system is not affected. One question: do I have to rewrite the ZFS GPT boot loader (/boot/gptzfsboot) onto the freebsd-boot partition to make use of this change?
>
> This change is kernel-level only. There is no interaction with boot blocks.
>
> --
> Andriy Gapon

I can happily report that booting from the ZFS pool works on my 9-STABLE system without the zpool.cache file.

Thanks, merry Christmas and happy new year!

-Kimmo
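The check recommended above ('zdb -l' on every suspect disk and partition) can be scripted. The following is only a sketch: the function name scan_zfs_labels is made up for illustration, and it assumes that 'zdb -l' prints a line of the form name: 'poolname' for each label it can read.

```sh
#!/bin/sh
# Hypothetical helper, not part of FreeBSD: print "<device> <pool name>"
# for each distinct pool name found in the ZFS labels of the given devices.
# Assumption: 'zdb -l DEV' prints a line like "    name: 'zroot'" per valid label.
scan_zfs_labels() {
    for dev in "$@"; do
        zdb -l "$dev" 2>/dev/null |
            sed -n "s/^[[:space:]]*name: '\(.*\)'\$/\1/p" |
            sort -u |
            while read -r pool; do
                printf '%s %s\n' "$dev" "$pool"
            done
    done
}
```

Called as, say, scan_zfs_labels /dev/ada0 /dev/ada0p3 (device names are examples only), it lists which devices still carry a label and for which pool name. Any device that shows the root pool's name but is not part of the current root pool is a candidate for the problem described above; comparing against the output of zpool status is left to the reader.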
Kernel panic when playing games/iourbanterror
Hello,

When I play Urban Terror a lot, the system panics with ACPI-related issues:

Fatal trap 9: general protection fault while in kernel mode
cpuid = 1; apic id = 01
instruction pointer = 0x20:0x802c6f15
stack pointer = 0x28:0xff80d89ac6c0
frame pointer = 0x28:0x0
code segment = base 0x0, limit 0xf, type 0x1b
             = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 1288 (hald)
trap number = 9
panic: general protection fault
cpuid = 1
Uptime: 1h52m22s
Dumping 596 out of 3054 MB:..3%..11%..22%..33%..41%..51%..62%..73%..81%..92%

Reading symbols from /boot/modules/vboxdrv.ko...done.
Loaded symbols for /boot/modules/vboxdrv.ko
#0 doadump (textdump=Variable textdump is not available.) at pcpu.h:224
224 __asm("movq %%gs:0,%0" : "=r" (td));
(kgdb) list *0xff80d89ac6c0
No source file for address 0xff80d89ac6c0.
(kgdb) backtrace
#0 doadump (textdump=Variable textdump is not available.) at pcpu.h:224
#1 0x0004 in ?? ()
#2 0x804f3ae6 in kern_reboot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:448
#3 0x804f3fa9 in panic (fmt=0x1 <Address 0x1 out of bounds>) at /usr/src/sys/kern/kern_shutdown.c:636
#4 0x806fcfa9 in trap_fatal (frame=0x9, eva=Variable eva is not available.) at /usr/src/sys/amd64/amd64/trap.c:857
#5 0x806fd554 in trap (frame=0xff80d89ac610) at /usr/src/sys/amd64/amd64/trap.c:599
#6 0x806e81bf in calltrap () at /usr/src/sys/amd64/amd64/exception.S:228
#7 0x802c6f15 in AcpiUtUpdateObjectReference (Object=0xfe0001824a80, Action=0) at /usr/src/sys/contrib/dev/acpica/utilities/utdelete.c:563
#8 0x802b77a4 in AcpiExResolveNodeToValue (ObjectPtr=0xfe0001a2c2e0, WalkState=0xfe0001a2c000) at /usr/src/sys/contrib/dev/acpica/executer/exresnte.c:184
#9 0x802b7ad3 in AcpiExResolveToValue (StackPtr=0xfe0001a2c2e0, WalkState=0xfe0001a2c000) at /usr/src/sys/contrib/dev/acpica/executer/exresolv.c:124
#10 0x802ac433 in AcpiDsEvaluateNamePath (WalkState=0xfe0001a2c000) at /usr/src/sys/contrib/dev/acpica/dispatcher/dsutils.c:886
#11 0x802aceef in AcpiDsExecEndOp (WalkState=0xfe0001a2c000) at /usr/src/sys/contrib/dev/acpica/dispatcher/dswexec.c:436
#12 0x802c05ba in AcpiPsParseLoop (WalkState=0xfe0001a2c000) at /usr/src/sys/contrib/dev/acpica/parser/psloop.c:1249
#13 0x802c10a8 in AcpiPsParseAml (WalkState=0xfe0001a2c000) at /usr/src/sys/contrib/dev/acpica/parser/psparse.c:525
#14 0x802c1d45 in AcpiPsExecuteMethod (Info=0xfe0033df8540) at /usr/src/sys/contrib/dev/acpica/parser/psxface.c:368
#15 0x802bb784 in AcpiNsEvaluate (Info=0xfe0033df8540) at /usr/src/sys/contrib/dev/acpica/namespace/nseval.c:193
#16 0x802bec91 in AcpiEvaluateObject (Handle=0xfe00017f7b80, Pathname=0x8078229f "_BST", ExternalParams=0x0, ReturnBuffer=0xff80d89ac960) at /usr/src/sys/contrib/dev/acpica/namespace/nsxfeval.c:289
#17 0x80309802 in acpi_cmbat_get_bst (arg=Variable arg is not available.) at /usr/src/sys/dev/acpica/acpi_cmbat.c:257
#18 0x80309af8 in acpi_cmbat_bst (dev=0xfe0001936400, bstp=0xfe008b319400) at /usr/src/sys/dev/acpica/acpi_cmbat.c:418
#19 0x8045bd22 in devfs_ioctl_f (fp=0xfe001ba256e0, com=3231990289, data=Variable data is not available.) at /usr/src/sys/fs/devfs/devfs_vnops.c:757
#20 0x8053a23d in kern_ioctl (td=0xfe00039ae8e0, fd=Variable fd is not available.) at file.h:293
#21 0x8053a4ad in sys_ioctl (td=0xfe00039ae8e0, uap=0xff80d89acb70) at /usr/src/sys/kern/sys_generic.c:691
#22 0x806fc902 in amd64_syscall (td=0xfe00039ae8e0, traced=0) at subr_syscall.c:135
#23 0x806e84a7 in Xfast_syscall () at /usr/src/sys/amd64/amd64/exception.S:387
#24 0x000801d89c5c in ?? ()
Previous frame inner to this frame (corrupt stack?)

Before the panic, many ACPI errors like these appear in dmesg:

ACPI Error: Method execution failed [\_SB_.BAT0._UID] (Node 0xfe00017f7b00), AE_AML_NO_OPERAND (20110527/uteval-113)
ACPI Error: No object attached to node 0xfe00017f7b00 (20110527/exresnte-139)
ACPI Error: Method execution failed [\_SB_.BAT0._UID] (Node 0xfe00017f7b00), AE_AML_NO_OPERAND (20110527/uteval-113)
ACPI Error: No object attached to node 0xfe00017f7b00 (20110527/exresnte-139)

This happens on 9.1-RELEASE amd64.

Cheers,
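A side note on the kgdb session above: the address given to 'list *' is the value from the "stack pointer" line, which would explain why kgdb found no source file for it. The address that maps to code is the one after the colon on the "instruction pointer" line. As a small sketch (the function name panic_ip is invented here), that value can be pulled out of a saved panic message like this:

```sh
#!/bin/sh
# Hypothetical helper: read a panic message on stdin and print the
# address part of the "instruction pointer = SEG:ADDR" line.
panic_ip() {
    sed -n 's/.*instruction pointer[[:space:]]*=[[:space:]]*0x[0-9a-fA-F]*:\(0x[0-9a-fA-F]*\).*/\1/p'
}
```

Fed the panic text above, it prints the instruction pointer address, which can then be passed to 'list *ADDRESS' at the (kgdb) prompt.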
Re: PKGNG Monitoring in Zabbix
Marin Atanasov Nikolov wrote:
> Hey,
>
> Looks like the end of the World is postponed, so I thought that now I have some time to document some stuff :)
>
> The documentation is about monitoring your PKGNG package database in Zabbix.
>
> Part I explains how to monitor your database and have graphs of the number of packages and the disk space taken by packages on your FreeBSD system.
>
> Part II talks about how to perform audits of your package database for things like missing package dependencies and packages that are known to be vulnerable.
>
> You can find the documentation at the links below:
>
> * http://unix-heaven.org/monitorig-pkgng-in-zabbix-part-i
> * http://unix-heaven.org/monitorig-pkgng-in-zabbix-part-ii
>
> Hope you like them, and Happy Holidays! :)
>
> Regards,
> Marin

Thanks, always nice to see things in my Zabbix console. I will try it out when I find some time : )

Regards,
Johan Hendriks
Re: What is negative group permissions? (Re: narawntapu security run output)
On 23.12.2012 11:48, Chris Rees wrote:
> They involve a lot of thought to get right, as well as chmod g-w on something where you probably meant chmod go-w is a disastrous but (perhaps) common error.
>
> Chris

Well, in (over 20) years of dealing with Unix, I've never made a mistake like that, nor do I understand how it can be considered common ...

Got to admit, I was surprised to see it. It made me think I do not understand something -- or that FreeBSD is becoming overly paternalistic. It turned out to be the latter...

I doubt it is useful. Worse, issuing such warnings routinely only reinforces unfortunate misconceptions like the one Barney demonstrated in this thread.

When originally added, the check was meant to be off by default:

r215213 | brooks | 2010-11-12 19:40:43 -0500 (Fri, 12 Nov 2010) | 7 lines

Add an (off by default) check for negative permissions (where the group on an object has less permissions than everyone). These permissions will not work reliably over NFS if you have more than 14 supplemental groups and are usually not what you mean.

MFC after: 1 week

Perhaps it should have remained off?

Yours,
-mi
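For readers unsure what the commit message means by "negative permissions", here is a minimal sketch of the condition as described there: the group has less permission than everyone else. The function name is invented for illustration, and this is not the code of the periodic security check itself.

```sh
#!/bin/sh
# Hypothetical helper: succeeds (exit status 0) when "other" holds a
# permission bit that "group" lacks. The argument is an octal mode.
has_negative_group_perms() {
    mode=$(( 0$1 ))                 # leading 0 makes the shell read it as octal
    group=$(( (mode >> 3) & 7 ))    # middle rwx triplet
    other=$(( mode & 7 ))           # last rwx triplet
    [ $(( other & ~group & 7 )) -ne 0 ]
}
```

This also shows the typo Chris mentions: running chmod g-w on a file with mode 666 leaves mode 646 (group r--, other rw-), for which has_negative_group_perms 646 succeeds, whereas chmod go-w would have produced 644, for which it fails.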
Re: What is negative group permissions? (Re: narawntapu security run output)
Mikhail T. mi+thun at aldan.algebra.com writes:
> On 23.12.2012 11:48, Chris Rees wrote:
>> They involve a lot of thought to get right, as well as chmod g-w on something where you probably meant chmod go-w is a disastrous but (perhaps) common error.
>>
>> Chris
>
> Well, in (over 20) years of dealing with Unix, I've never made a mistake like that, nor do I understand, how it can be considered common ...
>
> Got to admit, I was surprised to see it. It made me think, I do not understand something -- or that FreeBSD is becoming overly paternalistic. It turned out to be the latter...
>
> I doubt, it is useful. Worse, issuing such warnings routinely, only reinforces the unfortunate misconceptions like the one Barney demonstrated in this thread.
>
> When originally added, the check was meant to be off by default:
> ...
> perhaps, it should have remained off?
>
> Yours,

Those security checks are there for a reason - people make mistakes (even a perfect guy like you will have a head in a brown bag time). It is better to get a heads-up, then think about it and turn it off (customize) if considered unneeded.

jb
Re: What is negative group permissions? (Re: narawntapu security run output)
On 24 December 2012 10:27, jb jb.1234a...@gmail.com wrote:
> Those security checks are for a reason - people make mistakes (even a perfect guy like you will have a head in a brown bag time). It is better to get a heads-up, then think about it and turn it off (customize) if considered unneeded.

+1. Default to helping the new user (or the user that makes mistakes).

--
Eitan Adler
Re: FreeBSD 9.1-RELEASE crashes almost daily; backtraces always list zfs routines
on 24/12/2012 00:23 Derek Kulinski said the following:
> Dumping 3701 out of 8072 MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91%

So do you have the crash dump(s)?

--
Andriy Gapon
Re: FreeBSD 9.1-RELEASE crashes almost daily; backtraces always list zfs routines
Hello Andriy,

Monday, December 24, 2012, 8:01:26 AM, you wrote:
> on 24/12/2012 00:23 Derek Kulinski said the following:
>> Dumping 3701 out of 8072 MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91%
>
> So do you have the crash dump(s)?

Yes, but they are 3.5GB each. I attached a text dump to GNATS, but I can resend it to you (I don't know if it's OK to send attachments to the mailing list). If you prefer, I could give you access to the box.

--
Best regards,
Derek mailto:tak...@takeda.tk
--
Programmer - A red-eyed, mumbling mammal capable of conversing with inanimate objects.
Re: Kernel panic when playing games/iourbanterror
Hi,

Please file a PR? We can bump it to the ACPI person who has been busily making this stuff updated and stable.

Thanks!

Adrian

On 24 December 2012 05:52, David Demelier demelier.da...@gmail.com wrote:
> [panic report, kgdb backtrace and ACPI errors quoted in full; snipped]
stable/9 i386 panic [ACPI/timer?]
I finally(!) got around to enabling crash dumps on the primary machine here at the house ... and managed to make use of it (unfortunately).

I've copied the relevant files (both those from /var/crash and dmesg.boot) so they should be visible at http://www.catwhisker.org/~david/FreeBSD/panic_24Dec2012/ (though only the dmesg.boot, core.text.0, info.0 files should be fetchable for now). [I'll make the vmcore.0 available to individuals who wish to work on the problem; please contact me to arrange this.]

Here's a bit of information excerpted from core.text.0:

Mon Dec 24 11:16:04 PST 2012

FreeBSD albert.catwhisker.org 9.1-PRERELEASE FreeBSD 9.1-PRERELEASE #434 244582M: Sat Dec 22 05:06:29 PST 2012 r...@freebeast.catwhisker.org:/usr/obj/usr/src/sys/ALBERT i386

Note that while the version string says 244582M:
* Userland was at r244608.
* The Modification was merely a change to src/sys/newvers.sh to re-factor the extraction of the version string.

panic: page fault

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address = 0x34
fault code = supervisor read, page not present
instruction pointer = 0x20:0xc0ad475c
stack pointer = 0x28:0xc6fba9d8
frame pointer = 0x28:0xc6fbaa18
code segment = base 0x0, limit 0xf, type 0x1b
             = DPL 0, pres 1, def32 1, gran 1
processor eflags = resume, IOPL = 0
current process = 11 (idle: cpu0)
trap number = 12
panic: page fault
cpuid = 0
KDB: stack backtrace:
db_trace_self_wrapper(c0ffbab8,46,1,ca931e80,0,...) at 0xc051ef76 = db_trace_self_wrapper+0x36/frame 0xc6fba740
kdb_backtrace(c1033ff1,0,c0e75cc4,c6fba7ec,c71f08d0,...) at 0xc0afc400 = kdb_backtrace+0x30/frame 0xc6fba7a0
panic(c0e75cc4,c1034ddb,c71f0a84,1,1,...) at 0xc0ac763c = panic+0x1bc/frame 0xc6fba7e0
trap_fatal(28,7fff,3,0,28,...) at 0xc0e35560 = trap_fatal+0x340/frame 0xc6fba828
trap_pfault(34,c,1,c11a68b0,c6fba940,...) at 0xc0e358cb = trap_pfault+0x35b/frame 0xc6fba8a0
trap(c6fba998) at 0xc0e34e13 = trap+0x443/frame 0xc6fba98c
calltrap() at 0xc0e1e86c = calltrap+0x6/frame 0xc6fba98c
--- trap 0xc, eip = 0xc0ad475c, esp = 0xc6fba9d8, ebp = 0xc6fbaa18 ---
tc_windup(1,0,c0ff3ba6,21c,0,...) at 0xc0ad475c = tc_windup+0x1c/frame 0xc6fbaa18
hardclock_cnt(1,0,0,3,0,...) at 0xc0a77e39 = hardclock_cnt+0x2e9/frame 0xc6fbaa68
handleevents(c6fbaaf8,2,46,c71f08d0,c6fbaae4,...) at 0xc0e3c534 = handleevents+0x184/frame 0xc6fbaac0
timercb(c7564064,0,c76a82f0,c6fbab58,c0a99a0e,...) at 0xc0e3d1a1 = timercb+0x281/frame 0xc6fbab14
hpet_intr_single(c7564064,c7569780,0,c6fbabbc,c6fbab78,...) at 0xc053a345 = hpet_intr_single+0x195/frame 0xc6fbab40
hpet_intr(c7564000,0,c71f08d0,14,c723b710,...) at 0xc053a3cf = hpet_intr+0x6f/frame 0xc6fbab58
intr_event_handle(c723c280,c6fbabbc,c6fbab94,0,c7182600,...) at 0xc0a99c5c = intr_event_handle+0x7c/frame 0xc6fbab78
intr_execute_handlers(c723b710,c6fbabbc,0) at 0xc0e4c552 = intr_execute_handlers+0x42/frame 0xc6fbab98
lapic_handle_intr(33,c6fbabbc) at 0xc0e4f50d = lapic_handle_intr+0x3d/frame 0xc6fbabac
Xapic_isr1() at 0xc0e1ec35 = Xapic_isr1+0x35/frame 0xc6fbabac
--- interrupt, eip = 0xc0e1a202, esp = 0xc6fbabfc, ebp = 0xc6fbac3c ---
acpi_cpu_c1(0,c6fbac58,c0e250a6,0,c1198018,...) at 0xc0e1a202 = acpi_cpu_c1+0x2/frame 0xc6fbac3c
cpu_idle_acpi(0,c1198018,c6fbacd0,c0aee519,0,...) at 0xc0e24fff = cpu_idle_acpi+0x2f/frame 0xc6fbac48
cpu_idle(0,2,c0ffa49a,a36,c71f08d0,...) at 0xc0e250a6 = cpu_idle+0x96/frame 0xc6fbac58
sched_idletd(0,c6fbad08,0,0,c0aee250,...) at 0xc0aee519 = sched_idletd+0x2c9/frame 0xc6fbacd0
fork_exit(c0aee250,0,c6fbad08) at 0xc0a977c7 = fork_exit+0x67/frame 0xc6fbacf4
fork_trampoline() at 0xc0e1e8e4 = fork_trampoline+0x8/frame 0xc6fbacf4
--- trap 0, eip = 0, esp = 0xc6fbad40, ebp = 0 ---
Uptime: 7h11m46s
Physical memory: 3045 MB

#0 doadump (textdump=value optimized out) at pcpu.h:249
249 pcpu.h: No such file or directory.
    in pcpu.h
(kgdb) #0 doadump (textdump=value optimized out) at pcpu.h:249
#1 0xc0ac71fa in kern_reboot (howto=Unhandled dwarf expression opcode 0xc0) at /usr/src/sys/kern/kern_shutdown.c:448
#2 0xc0ac7688 in panic (fmt=Unhandled dwarf expression opcode 0xc0) at /usr/src/sys/kern/kern_shutdown.c:636
#3 0xc0e35560 in trap_fatal (frame=value optimized out, eva=value optimized out) at /usr/src/sys/i386/i386/trap.c:1043
#4 0xc0e358cb in trap_pfault (frame=value optimized out, usermode=Unhandled dwarf expression opcode 0xc3) at /usr/src/sys/i386/i386/trap.c:858
#5 0xc0e34e13 in trap (frame=value optimized out) at /usr/src/sys/i386/i386/trap.c:555
#6 0xc0e1e86c in calltrap () at /tmp/exception-SmXQMs.s:94
#7 0xc0ad475c in tc_windup () at /usr/src/sys/kern/kern_tc.c:450
#8 0xc0a77e39 in hardclock_cnt (usermode=value optimized out) at /usr/src/sys/kern/kern_clock.c:556
#9 0xc0e3c534 in handleevents (now=value optimized out, fake=value optimized out)
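The message does not say how crash dumps were enabled on this machine. For anyone wanting to do the same, the usual route on FreeBSD 9.x is a pair of rc.conf settings; the fragment below is a typical setup and not necessarily the poster's.

```sh
# /etc/rc.conf fragment (sh syntax); values shown are the common choices
dumpdev="AUTO"        # let the system pick a swap device to dump to on panic
dumpdir="/var/crash"  # where savecore(8) stores vmcore.N and info.N at next boot
```

After a panic and reboot, the saved dump can be opened with kgdb against the matching kernel symbols, for example kgdb /boot/kernel/kernel.symbols /var/crash/vmcore.0.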
Re: FreeBSD 9.1-RELEASE crashes almost daily; backtraces always list zfs routines
On Mon, Dec 24, 2012 at 10:17:19AM -0800, Derek Kulinski wrote:
> Yes, but they are 3.5GB each. I attached text dump to GNATS but I can resend it to you

We have a limit of 500K on GNATS PRs. For something that huge, a PR database is really not the right place for it -- please post the dumps somewhere and include a URL to them in a followup to the PR.

Thanks.

Mark Linimon, on behalf of bugmeister
Re: FreeBSD 9.1-RELEASE crashes almost daily; backtraces always list zfs routines
Hello Mark,

Monday, December 24, 2012, 12:46:53 PM, you wrote:
> On Mon, Dec 24, 2012 at 10:17:19AM -0800, Derek Kulinski wrote:
>> Yes, but they are 3.5GB each. I attached text dump to GNATS but I can resend it to you
>
> We have a limit of 500K on GNATS PRs. For something that huge, a PR database is really not the right place for it -- please post the dumps somewhere and include a URL to them in a followup to the PR.
>
> Thanks.
>
> Mark Linimon, on behalf of bugmeister

I included the text dump, but I do not see it when I visit the web interface, so I don't know if it was attached there or not.

--
Best regards,
Derek mailto:tak...@takeda.tk
My new car runs at 56Kbps
Re: stable/9 i386 panic [ACPI/timer?]
on 24/12/2012 21:58 David Wolfskill said the following:
> [panic report and KDB/kgdb backtraces quoted in full; snipped]
> #7 0xc0ad475c in tc_windup () at /usr/src/sys/kern/kern_tc.c:450

I'd say that what you see
Re: stable/9 i386 panic [ACPI/timer?]
On Mon, Dec 24, 2012 at 11:04:04PM +0200, Andriy Gapon wrote:
> ...
> I'd say that what you see is impossible...

Well, I suppose it's small comfort, but that does make me feel a little better about being a bit clueless about why this happened. Thanks! :-}

> Could you please provide the following info from kgdb?
> p timehands
> p th0
> ...
> p th9
> disassemble tc_windup
> ...

Here you go, cut/pasted (though I elided the ---Type return to continue, or q return to quit--- lines):

albert(9.1-P)[3] kgdb /boot/kernel/kernel.symbols vmcore.0
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
...
Unread portion of the kernel message buffer:
kernel trap 12 with interrupts disabled

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address = 0x34
fault code = supervisor read, page not present
instruction pointer = 0x20:0xc0ad475c
stack pointer = 0x28:0xc6fba9d8
frame pointer = 0x28:0xc6fbaa18
code segment = base 0x0, limit 0xf, type 0x1b
             = DPL 0, pres 1, def32 1, gran 1
processor eflags = resume, IOPL = 0
current process = 11 (idle: cpu0)
trap number = 12
panic: page fault
...
Loaded symbols for /boot/kernel/drm.ko
#0 doadump (textdump=value optimized out) at pcpu.h:249
249 pcpu.h: No such file or directory.
    in pcpu.h
(kgdb) frame 7
#7 0xc0ad475c in tc_windup () at /usr/src/sys/kern/kern_tc.c:450
450 /usr/src/sys/kern/kern_tc.c: No such file or directory.
    in /usr/src/sys/kern/kern_tc.c
Current language: auto; currently minimal
(kgdb) p timehands
$1 = (struct timehands * volatile) 0xc11ba910
(kgdb) p th0
$2 = {th_counter = 0xc115174c, th_adjustment = 51068786373500, th_scale = 1690726758248, th_offset_count = 3989950369,
  th_offset = {sec = 25906, frac = 2057132249855343962},
  th_microtime = {tv_sec = 1356376278, tv_usec = 180944},
  th_nanotime = {tv_sec = 1356376278, tv_nsec = 180944041},
  th_generation = 669311, th_next = 0xc112a7e4}
(kgdb) p th1
$3 = {th_counter = 0xc115174c, th_adjustment = 51068786373500, th_scale = 1690726758248, th_offset_count = 3990015836,
  th_offset = {sec = 25906, frac = 2167819058537565778},
  th_microtime = {tv_sec = 1356376278, tv_usec = 186944},
  th_nanotime = {tv_sec = 1356376278, tv_nsec = 186944385},
  th_generation = 669311, th_next = 0xc112a820}
(kgdb) p th2
$4 = {th_counter = 0xc115174c, th_adjustment = 51068786373500, th_scale = 1690726758248, th_offset_count = 3990048555,
  th_offset = {sec = 25906, frac = 2223137947340682090},
  th_microtime = {tv_sec = 1356376278, tv_usec = 189943},
  th_nanotime = {tv_sec = 1356376278, tv_nsec = 189943228},
  th_generation = 669311, th_next = 0xc112a85c}
(kgdb) p th3
$5 = {th_counter = 0xc115174c, th_adjustment = 51068786373500, th_scale = 1690726758248, th_offset_count = 3990059490,
  th_offset = {sec = 25906, frac = 224162602123970},
  th_microtime = {tv_sec = 1356376278, tv_usec = 190945},
  th_nanotime = {tv_sec = 1356376278, tv_nsec = 190945470},
  th_generation = 669311, th_next = 0xc112a898}
(kgdb) p th4
$6 = {th_counter = 0xc115174c, th_adjustment = 51068786373500, th_scale = 1690726758248, th_offset_count = 3990070376,
  th_offset = {sec = 25906, frac = 2260031295932411698},
  th_microtime = {tv_sec = 1356376278, tv_usec = 191943},
  th_nanotime = {tv_sec = 1356376278, tv_nsec = 191943220},
  th_generation = 669311, th_next = 0xc112a8d4}
(kgdb) p th5
$7 = {th_counter = 0xc115174c, th_adjustment = 51068786373500, th_scale = 1690726758248, th_offset_count = 3990081323,
  th_offset = {sec = 25906, frac = 2278539681754952554},
  th_microtime = {tv_sec = 1356376278, tv_usec = 192946},
  th_nanotime = {tv_sec = 1356376278, tv_nsec = 192946562},
  th_generation = 669311, th_next = 0xc11ba910}
(kgdb) p th6
$8 = {th_counter = 0xc115174c, th_adjustment = 51068786373500, th_scale = 1690726758248, th_offset_count = 3989906722,
  th_offset = {sec = 25906, frac = 1983337099038093506},
  th_microtime = {tv_sec = 1356376278, tv_usec = 176943},
  th_nanotime = {tv_sec = 1356376278, tv_nsec = 176943598},
  th_generation = 669310, th_next = 0xc112a94c}
(kgdb) p th7
$9 = {th_counter = 0xc115174c, th_adjustment = 51068786373500, th_scale = 1690726758248, th_offset_count = 3989927028,
  th_offset = {sec = 25906, frac = 2017668996591077394},
  th_microtime = {tv_sec = 1356376278, tv_usec = 178804},
  th_nanotime = {tv_sec = 1356376278, tv_nsec = 178804734},
  th_generation = 669310, th_next = 0xc112a988}
(kgdb) p th8
$10 = {th_counter = 0xc115174c, th_adjustment = 51068786373500, th_scale = 1690726758248, th_offset_count = 3989928549,
  th_offset = {sec = 25906, frac = 2020240591990372602},
  th_microtime = {tv_sec = 1356376278, tv_usec = 178944},
  th_nanotime = {tv_sec = 1356376278, tv_nsec = 178944140},
  th_generation = 669310, th_next =
Re: stable/9 i386 panic [ACPI/timer?]
on 24/12/2012 23:16 David Wolfskill said the following:

albert(9.1-P)[3] kgdb /boot/kernel/kernel.symbols vmcore.0
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
...
Unread portion of the kernel message buffer:
kernel trap 12 with interrupts disabled

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x34
fault code              = supervisor read, page not present
instruction pointer     = 0x20:0xc0ad475c
stack pointer           = 0x28:0xc6fba9d8
frame pointer           = 0x28:0xc6fbaa18
code segment            = base 0x0, limit 0xf, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = resume, IOPL = 0
current process         = 11 (idle: cpu0)
trap number             = 12
panic: page fault
...
Loaded symbols for /boot/kernel/drm.ko
#0  doadump (textdump=<value optimized out>) at pcpu.h:249
249     pcpu.h: No such file or directory.
        in pcpu.h
(kgdb) frame 7
#7  0xc0ad475c in tc_windup () at /usr/src/sys/kern/kern_tc.c:450
450     /usr/src/sys/kern/kern_tc.c: No such file or directory.
        in /usr/src/sys/kern/kern_tc.c
Current language:  auto; currently minimal
(kgdb) p timehands
$1 = (struct timehands * volatile) 0xc11ba910
(kgdb) p th0
$2 = {th_counter = 0xc115174c, th_adjustment = 51068786373500,
  th_scale = 1690726758248, th_offset_count = 3989950369,
  th_offset = {sec = 25906, frac = 2057132249855343962},
  th_microtime = {tv_sec = 1356376278, tv_usec = 180944},
  th_nanotime = {tv_sec = 1356376278, tv_nsec = 180944041},
  th_generation = 669311, th_next = 0xc112a7e4}
(kgdb) p th1
$3 = {th_counter = 0xc115174c, th_adjustment = 51068786373500,
  th_scale = 1690726758248, th_offset_count = 3990015836,
  th_offset = {sec = 25906, frac = 2167819058537565778},
  th_microtime = {tv_sec = 1356376278, tv_usec = 186944},
  th_nanotime = {tv_sec = 1356376278, tv_nsec = 186944385},
  th_generation = 669311, th_next = 0xc112a820}
(kgdb) p th2
$4 = {th_counter = 0xc115174c, th_adjustment = 51068786373500,
  th_scale = 1690726758248, th_offset_count = 3990048555,
  th_offset = {sec = 25906, frac = 2223137947340682090},
  th_microtime = {tv_sec = 1356376278, tv_usec = 189943},
  th_nanotime = {tv_sec = 1356376278, tv_nsec = 189943228},
  th_generation = 669311, th_next = 0xc112a85c}
(kgdb) p th3
$5 = {th_counter = 0xc115174c, th_adjustment = 51068786373500,
  th_scale = 1690726758248, th_offset_count = 3990059490,
  th_offset = {sec = 25906, frac = 224162602123970},
  th_microtime = {tv_sec = 1356376278, tv_usec = 190945},
  th_nanotime = {tv_sec = 1356376278, tv_nsec = 190945470},
  th_generation = 669311, th_next = 0xc112a898}
(kgdb) p th4
$6 = {th_counter = 0xc115174c, th_adjustment = 51068786373500,
  th_scale = 1690726758248, th_offset_count = 3990070376,
  th_offset = {sec = 25906, frac = 2260031295932411698},
  th_microtime = {tv_sec = 1356376278, tv_usec = 191943},
  th_nanotime = {tv_sec = 1356376278, tv_nsec = 191943220},
  th_generation = 669311, th_next = 0xc112a8d4}
(kgdb) p th5
$7 = {th_counter = 0xc115174c, th_adjustment = 51068786373500,
  th_scale = 1690726758248, th_offset_count = 3990081323,
  th_offset = {sec = 25906, frac = 2278539681754952554},
  th_microtime = {tv_sec = 1356376278, tv_usec = 192946},
  th_nanotime = {tv_sec = 1356376278, tv_nsec = 192946562},
  th_generation = 669311, th_next = 0xc11ba910}
(kgdb) p th6
$8 = {th_counter = 0xc115174c, th_adjustment = 51068786373500,
  th_scale = 1690726758248, th_offset_count = 3989906722,
  th_offset = {sec = 25906, frac = 1983337099038093506},
  th_microtime = {tv_sec = 1356376278, tv_usec = 176943},
  th_nanotime = {tv_sec = 1356376278, tv_nsec = 176943598},
  th_generation = 669310, th_next = 0xc112a94c}
(kgdb) p th7
$9 = {th_counter = 0xc115174c, th_adjustment = 51068786373500,
  th_scale = 1690726758248, th_offset_count = 3989927028,
  th_offset = {sec = 25906, frac = 2017668996591077394},
  th_microtime = {tv_sec = 1356376278, tv_usec = 178804},
  th_nanotime = {tv_sec = 1356376278, tv_nsec = 178804734},
  th_generation = 669310, th_next = 0xc112a988}
(kgdb) p th8
$10 = {th_counter = 0xc115174c, th_adjustment = 51068786373500,
  th_scale = 1690726758248, th_offset_count = 3989928549,
  th_offset = {sec = 25906, frac = 2020240591990372602},
  th_microtime = {tv_sec = 1356376278, tv_usec = 178944},
  th_nanotime = {tv_sec = 1356376278, tv_nsec = 178944140},
  th_generation = 669310, th_next = 0xc112a9c4}
(kgdb) p th9
$11 = {th_counter = 0xc115174c, th_adjustment = 51068786373500,
  th_scale = 1690726758248, th_offset_count = 3989939440,
  th_offset = {sec = 25906, frac = 2038654297114451570},
  th_microtime = {tv_sec = 1356376278, tv_usec = 179942},
  th_nanotime = {tv_sec = 1356376278, tv_nsec =
Re: stable/9 i386 panic [ACPI/timer?]
On Tue, Dec 25, 2012 at 12:35:18AM +0200, Andriy Gapon wrote:
...
Could you please also provide from the same frame
i reg
p timehands
?
Thank you!

You're the one doing the work. :-}

I had left the kgdb session active; I also included p *timehands just in case it might be of use:

(kgdb) i reg
eax            0x1         1
ecx            0xc11ba910  -1055151856
edx            0xc72405ff  -953940481
ebx            0x0         0
esp            0x0         0x0
ebp            0xc6fbaa18  0xc6fbaa18
esi            0x1         1
edi            0xc71c8300  -954432768
eip            0xc0ad475c  0xc0ad475c
eflags         0x10086     65670
cs             0x20        32
ss             0xc6fbaa18  -956585448
ds             0x28        40
es             0xc6fb0028  -956628952
fs             0xc71f0008  -954269688
gs             0x0         0
(kgdb) p timehands
$13 = (struct timehands * volatile *) 0xc112a6c8
(kgdb) p *timehands
$14 = {th_counter = 0xc115174c, th_adjustment = 51068786373500,
  th_scale = 1690726758248, th_offset_count = 3990092176,
  th_offset = {sec = 25906, frac = 2296889139262218098},
  th_microtime = {tv_sec = 1356376278, tv_usec = 193941},
  th_nanotime = {tv_sec = 1356376278, tv_nsec = 193941288},
  th_generation = 1, th_next = 0x0}
(kgdb)

Also: the machine has been in service for about 2.5 years, and was purchased refurbished. If it turns out that there are hardware issues, my feelings won't be hurt at all -- I'd merely want to identify the (likely) failing part(s) and replace them.

Peace,
david
--
David H. Wolfskill da...@catwhisker.org
Taliban: Evil men with guns afraid of truth from a 14-year old girl.
See http://www.catwhisker.org/~david/publickey.gpg for my public key.
Re: stable/9 i386 panic [ACPI/timer?]
on 25/12/2012 00:39 David Wolfskill said the following:

I had left the kgdb session active; I also included p *timehands just in case it might be of use:

Thank you. Please also print th0 ... th9.

--
Andriy Gapon
Re: stable/9 i386 panic [ACPI/timer?]
On Tue, Dec 25, 2012 at 12:58:00AM +0200, Andriy Gapon wrote:
on 25/12/2012 00:39 David Wolfskill said the following:
I had left the kgdb session active; I also included p *timehands just in case it might be of use:

Thank you. Please also print th0 ... th9.
...

Here you go:

(kgdb) p th0
$15 = (struct timehands *) 0xc112a7a8
(kgdb) p th1
$16 = (struct timehands *) 0xc112a7e4
(kgdb) p th2
$17 = (struct timehands *) 0xc112a820
(kgdb) p th3
$18 = (struct timehands *) 0xc112a85c
(kgdb) p th4
$19 = (struct timehands *) 0xc112a898
(kgdb) p th5
$20 = (struct timehands *) 0xc112a8d4
(kgdb) p th6
$21 = (struct timehands *) 0xc112a910
(kgdb) p th7
$22 = (struct timehands *) 0xc112a94c
(kgdb) p th8
$23 = (struct timehands *) 0xc112a988
(kgdb) p th9
$24 = (struct timehands *) 0xc112a9c4
(kgdb)

I've copied /boot/kernel/kernel.symbols over, as well:

I need to head out for some errands for a while.

Peace,
david
--
David H. Wolfskill da...@catwhisker.org
Re: FreeBSD 9.1-RELEASE crashes almost daily; backtraces always list zfs routines
on 24/12/2012 20:17 Derek Kulinski said the following:

Hello Andriy,
Monday, December 24, 2012, 8:01:26 AM, you wrote:
on 24/12/2012 00:23 Derek Kulinski said the following:
Dumping 3701 out of 8072 MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91%

So do you have the crash dump(s)?

Yes, but they are 3.5GB each. I attached text dump to GNATS but I can resend it to you (I don't know if it's ok to send attachments to the mailing list). If you would prefer I could give you access to the box.

Derek,

I've looked through the cores and it does look like in all cases some sort of memory corruption is a precursor to a subsequent crash. I can't say decisively whether the corruptions are caused by the hardware, by some code overwriting random memory locations (a rogue driver), or by a simpler bug like a use after free. I am always inclined to suspect the hardware first.

You can try to reproduce the problem with some additional checks enabled in the kernel. Those should catch the problem earlier and thus make its source clearer. I recommend the following:

options INVARIANTS
options INVARIANT_SUPPORT
options WITNESS
options DEBUG_MEMGUARD
makeoptions DEBUG+=-DDEBUG

The last is really needed only for the ZFS and OpenSolaris compat code. It may result in some extra noise from unrelated subsystems. Perhaps you could just add #define DEBUG to sys/cddl/contrib/opensolaris/uts/common/sys/debug.h. I haven't tested this approach though.

Also, please put vm.memguard.desc=arc_buf_hdr_t into loader.conf.

Please note that these options will make your system significantly slower.

--
Andriy Gapon
Re: stable/9 i386 panic [ACPI/timer?]
on 25/12/2012 01:04 David Wolfskill said the following:

On Tue, Dec 25, 2012 at 12:58:00AM +0200, Andriy Gapon wrote:
on 25/12/2012 00:39 David Wolfskill said the following:
I had left the kgdb session active; I also included p *timehands just in case it might be of use:

Thank you. Please also print th0 ... th9.
...

Here you go:

(kgdb) p th0
$15 = (struct timehands *) 0xc112a7a8
(kgdb) p th1
$16 = (struct timehands *) 0xc112a7e4
(kgdb) p th2
$17 = (struct timehands *) 0xc112a820
(kgdb) p th3
$18 = (struct timehands *) 0xc112a85c
(kgdb) p th4
$19 = (struct timehands *) 0xc112a898
(kgdb) p th5
$20 = (struct timehands *) 0xc112a8d4
(kgdb) p th6
$21 = (struct timehands *) 0xc112a910

Comparing the above and the following from an earlier email:

(kgdb) p timehands
$1 = (struct timehands * volatile) 0xc11ba910

and the following:

(kgdb) p th5
$7 = {th_counter = 0xc115174c, th_adjustment = 51068786373500,
  th_scale = 1690726758248, th_offset_count = 3990081323,
  th_offset = {sec = 25906, frac = 2278539681754952554},
  th_microtime = {tv_sec = 1356376278, tv_usec = 192946},
  th_nanotime = {tv_sec = 1356376278, tv_nsec = 192946562},
  th_generation = 669311, th_next = 0xc11ba910}

I am quite sure that the impossible happened only because faulty memory made it possible.

(kgdb) p th7
$22 = (struct timehands *) 0xc112a94c
(kgdb) p th8
$23 = (struct timehands *) 0xc112a988
(kgdb) p th9
$24 = (struct timehands *) 0xc112a9c4
(kgdb)

I've copied /boot/kernel/kernel.symbols over, as well:

I need to head out for some errands for a while.

Peace,
david

--
Andriy Gapon
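The comparison above can be made concrete with a little arithmetic. The following is only a sketch, assuming (as the comparison suggests) that th5.th_next was meant to hold the address of th6; the addresses are copied from the kgdb output in this thread, while the helper names strides and differing_bits are made up for illustration. It checks that th0 ... th9 are evenly spaced and shows which bits differ between the pointer that was found and the one that was expected.

```python
# Addresses copied from the kgdb output quoted in this thread (i386 dump).
TH_ADDRS = [
    0xc112a7a8, 0xc112a7e4, 0xc112a820, 0xc112a85c, 0xc112a898,
    0xc112a8d4, 0xc112a910, 0xc112a94c, 0xc112a988, 0xc112a9c4,
]
TH5_NEXT_SEEN = 0xc11ba910  # th5.th_next as printed from the dump


def strides(addrs):
    """Return the set of distances between consecutive addresses."""
    return {b - a for a, b in zip(addrs, addrs[1:])}


def differing_bits(seen, expected):
    """Return the bit positions (0 = least significant) where two values differ."""
    diff = seen ^ expected
    return [bit for bit in range(diff.bit_length()) if (diff >> bit) & 1]


if __name__ == "__main__":
    # Assumption: th5.th_next should equal the address of th6 (TH_ADDRS[6]).
    print("strides:", sorted(hex(s) for s in strides(TH_ADDRS)))
    print("xor:", hex(TH5_NEXT_SEEN ^ TH_ADDRS[6]))
    print("differing bits:", differing_bits(TH5_NEXT_SEEN, TH_ADDRS[6]))
```

Under that assumption the stored pointer differs from the address of th6 in only two bits of a single byte. That is in line with the memory-corruption explanation, though the snippet itself does not say what caused the change.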
Re: FreeBSD 9.1-RELEASE crashes almost daily; backtraces always list zfs routines
Hello Andriy,

Monday, December 24, 2012, 3:28:00 PM, you wrote:

I've looked through the cores and it does look like in all cases some sort of memory corruption is a precursor to a subsequent crash. I can't decidedly say if the corruptions are caused by the hardware, by some code overwriting random memory locations (rogue driver) or by a simpler bug like use after free. I am always inclined to suspect the hardware first.

You can try to reproduce the problem with some additional checks enabled in the kernel. Those should catch the problem earlier and thus make its source clearer. I recommend the following:

options INVARIANTS
options INVARIANT_SUPPORT
options WITNESS
options DEBUG_MEMGUARD
makeoptions DEBUG+=-DDEBUG

The last is really needed only for the ZFS and OpenSolaris compat code. It may result in some extra noise from unrelated subsystems. Perhaps you could just add #define DEBUG to sys/cddl/contrib/opensolaris/uts/common/sys/debug.h. I haven't tested this approach though.

Also, please put vm.memguard.desc=arc_buf_hdr_t into loader.conf.

Please note that these options will make your system significantly slower.

I recompiled the kernel and am now running it with the options you specified (I enabled DEBUG in the file). Even at boot time, though, I started getting the following warnings. Do they mean anything?

Dec 24 16:06:03 chinatsu kernel: Creating and/or trimming log files
Dec 24 16:06:03 chinatsu kernel: lock order reversal:
Dec 24 16:06:03 chinatsu kernel: 1st 0x80bf5780 pf task mtx (pf task mtx) @ /usr/src/sys/contrib/pf/net/pf.c:3330
Dec 24 16:06:03 chinatsu kernel: .
Dec 24 16:06:03 chinatsu kernel: 2nd 0xfe0009211af8 radix node head (radix node head) @ /usr/src/sys/net/route.c:384
Dec 24 16:06:03 chinatsu kernel: KDB: stack backtrace:
Dec 24 16:06:03 chinatsu kernel: db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
Dec 24 16:06:03 chinatsu kernel: kdb_backtrace() at kdb_backtrace+0x37
Dec 24 16:06:03 chinatsu kernel: _witness_debugger() at _witness_debugger+0x2c
Dec 24 16:06:03 chinatsu kernel: witness_checkorder() at witness_checkorder+0x844
Dec 24 16:06:03 chinatsu kernel: _rw_rlock() at
Dec 24 16:06:03 chinatsu kernel: Starting syslogd.
Dec 24 16:06:03 chinatsu kernel: _rw_rlock+0x81
Dec 24 16:06:03 chinatsu kernel: rtalloc1_fib() at rtalloc1_fib+0x11c
Dec 24 16:06:03 chinatsu kernel: rtalloc_ign_fib() at rtalloc_ign_fib+0xc5
Dec 24 16:06:03 chinatsu kernel: pf_routable() at pf_routable+0x1fd
Dec 24 16:06:03 chinatsu kernel: pf_test_rule() at pf_test_rule+0x6cf
Dec 24 16:06:03 chinatsu kernel: pf_test() at pf_test+0xf58
Dec 24 16:06:03 chinatsu kernel: pf_check_in() at pf_check_in+0x2b
Dec 24 16:06:03 chinatsu kernel: pfil_run_hooks() at pfil_run_hooks+0xd2
Dec 24 16:06:03 chinatsu kernel: ip_input() at ip_input+0x2dc
Dec 24 16:06:03 chinatsu kernel: netisr_dispatch_src() at netisr_dispatch_src+0x170
Dec 24 16:06:03 chinatsu kernel: ether_demux() at ether_demux+0x17d
Dec 24 16:06:03 chinatsu kernel: ether_nh_input() at ether_nh_input+0x209
Dec 24 16:06:03 chinatsu kernel: netisr_dispatch_src() at netisr_dispatch_src+0x170
Dec 24 16:06:03 chinatsu kernel: alc_int_task() at alc_int_task+0x2ff
Dec 24 16:06:03 chinatsu kernel: taskqueue_run_locked() at taskqueue_run_locked+0x93
Dec 24 16:06:03 chinatsu kernel: taskqueue_thread_loop() at taskqueue_thread_loop+0x3e
Dec 24 16:06:03 chinatsu kernel: fork_exit() at fork_exit+0x133
Dec 24 16:06:03 chinatsu kernel: fork_trampoline() at fork_trampoline+0xe
Dec 24 16:06:03 chinatsu kernel: --- trap 0, rip = 0, rsp = 0xff85fb2ebbb0, rbp = 0 ---
Dec 24 16:06:03 chinatsu kernel: No core dumps found.
Dec 24 16:06:04 chinatsu kernel: lock order reversal:
Dec 24 16:06:04 chinatsu kernel: 1st 0xff85b9cb8dd8 bufwait (bufwait) @ /usr/src/sys/kern/vfs_bio.c:2677
Dec 24 16:06:04 chinatsu kernel: 2nd 0xfe00092c5c00 dirhash (dirhash) @ /usr/src/sys/ufs/ufs/ufs_dirhash.c:284
Dec 24 16:06:04 chinatsu kernel: KDB: stack backtrace:
Dec 24 16:06:04 chinatsu kernel: db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
Dec 24 16:06:04 chinatsu kernel: kdb_backtrace() at kdb_backtrace+0x37
Dec 24 16:06:04 chinatsu kernel: _witness_debugger() at _witness_debugger+0x2c
Dec 24 16:06:04 chinatsu kernel: witness_checkorder() at witness_checkorder+0x844
Dec 24 16:06:04 chinatsu kernel: _sx_xlock() at _sx_xlock+0x61
Dec 24 16:06:04 chinatsu kernel: ufsdirhash_acquire() at ufsdirhash_acquire+0x33
Dec 24 16:06:04 chinatsu kernel: ufsdirhash_remove() at
Dec 24 16:06:04 chinatsu kernel: ufsdirhash_remove+0x16
Dec 24 16:06:04 chinatsu kernel: ufs_dirremove() at ufs_dirremove+0x1bb
Dec 24 16:06:04 chinatsu kernel: ufs_remove() at ufs_remove+0x92
Dec 24 16:06:04 chinatsu kernel: VOP_REMOVE_APV() at VOP_REMOVE_APV+0xb7
Dec 24 16:06:04 chinatsu kernel: kern_unlinkat() at kern_unlinkat+0x2eb
Dec 24 16:06:04 chinatsu kernel: amd64_syscall() at amd64_syscall+0x30e
Dec 24 16:06:04 chinatsu
Re: stable/9 i386 panic [ACPI/timer?]
On Tue, Dec 25, 2012 at 01:33:15AM +0200, Andriy Gapon wrote:
...
(kgdb) p th6
$21 = (struct timehands *) 0xc112a910

Comparing the above and the following from an earlier email:

(kgdb) p timehands
$1 = (struct timehands * volatile) 0xc11ba910

and the following:

(kgdb) p th5
$7 = {th_counter = 0xc115174c, th_adjustment = 51068786373500,
  th_scale = 1690726758248, th_offset_count = 3990081323,
  th_offset = {sec = 25906, frac = 2278539681754952554},
  th_microtime = {tv_sec = 1356376278, tv_usec = 192946},
  th_nanotime = {tv_sec = 1356376278, tv_nsec = 192946562},
  th_generation = 669311, th_next = 0xc11ba910}

I am quite sure that the impossible happened only because the faulty memory made it possible.

Ah. Well, that's not unreasonable, then.

I have (2) 1GB DIMMs + (2) 512MB DIMMs in the machine presently. Since I bought the 1GB DIMMs more recently, I'll just pull the 512MB DIMMs for now, and if that causes things to settle down, I'll plan on buying a couple more 1GB DIMMs to replace the 512MB DIMMs.

Thank you very much for your help!
...

Peace,
david
--
David H. Wolfskill da...@catwhisker.org
CAM hangs in 9-STABLE? [Was: NFS/ZFS hangs after upgrading from 9.0-RELEASE to -STABLE]
Dear All

It turns out that reverting to an older version of the mps driver did not fix the ZFS hangs I've been struggling with in 9.1 and 9-STABLE after all (they just took a bit longer to occur again, possibly just by chance). I followed steps along lines suggested by Andriy to collect more information when the problem occurs. Hopefully this will help figure out what's going on.

As far as I can tell, what happens is that at some point IO operations to a bunch of drives that belong to different pools get stuck. For these drives, gstat shows no activity but 1 pending operation, as such:

L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w    d/s   kBps   ms/d   %busy Name
   1      0      0      0    0.0      0      0    0.0      0      0    0.0     0.0 da1

I've been running gstat in a loop (every 100s) to monitor the machine. Just before the hang occurs, everything seems fine (see full gstat output below). Right after the hang occurs a number of drives seem stuck (see full gstat output below). Notably, some stuck drives are seen through the mps driver and others through the mpt driver. So the problem doesn't seem to be driver-specific. I have had the problem occur (at a lower frequency) on similar machines that don't use the mpt driver (and only have 1 disk provided through mps), so the problem doesn't seem to be caused by the mpt driver (and is likely not caused by defective hardware).

Since based on the information I provided earlier Andriy thinks the problem might not originate in ZFS, perhaps that means that the problem is in the CAM layer?
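The stuck state described above (a request pending in the queue while nothing completes) could be flagged automatically from each gstat sample. This is a minimal sketch, not anything used in the thread: it assumes rows are cleanly whitespace-separated with the 13 columns of the header shown in this message, and the names GstatRow, parse_row and stuck_providers are hypothetical. Only the first sample row reflects the da1 line reported above; the other two are invented for illustration.

```python
from typing import List, NamedTuple


class GstatRow(NamedTuple):
    queue: int   # L(q): requests currently queued
    ops: int     # ops/s: operations completed per second
    busy: float  # %busy
    name: str    # provider name, e.g. "da1"


def parse_row(line: str) -> GstatRow:
    """Parse one whitespace-separated gstat data row (13 columns assumed)."""
    fields = line.split()
    if len(fields) != 13:
        raise ValueError(f"expected 13 columns, got {len(fields)}: {line!r}")
    return GstatRow(int(fields[0]), int(fields[1]), float(fields[11]), fields[12])


def stuck_providers(rows: List[GstatRow]) -> List[str]:
    """Return providers that have queued requests but complete no operations."""
    return [r.name for r in rows if r.queue > 0 and r.ops == 0]


# First row: the stuck da1 line from the message. Remaining rows: made-up examples.
SAMPLE = """\
1 0 0 0 0.0 0 0 0.0 0 0 0.0 0.0 da1
0 0 0 0 0.0 0 0 0.0 0 0 0.0 0.0 da2
2 120 60 512 1.5 60 512 1.2 0 0 0.0 15.0 da3
"""

if __name__ == "__main__":
    rows = [parse_row(line) for line in SAMPLE.splitlines()]
    print(stuck_providers(rows))
```

A single sample with L(q) > 0 and no completed operations can also occur momentarily on a healthy system, so a real monitor would presumably require the condition to persist across several consecutive samples before reporting a hang.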
camcontrol tags -v (as suggested by Andriy) in the hung state shows for example

(pass56:mpt1:0:8:20): dev_openings 254
(pass56:mpt1:0:8:20): dev_active 1
(pass56:mpt1:0:8:20): devq_openings 254
(pass56:mpt1:0:8:20): devq_queued 0
(pass56:mpt1:0:8:20): held 0
(pass56:mpt1:0:8:20): mintags 2
(pass56:mpt1:0:8:20): maxtags 255

(I'm not providing full camcontrol tags output below because I couldn't get it to run during the specific hang I documented most thoroughly; the example above is from a different occurrence of the hang).

The buses don't seem completely frozen: if I manually remove drives while the machine is hanging, that's picked up by the mpt driver, which prints out corresponding messages to the console. But camcontrol reset all or rescan all don't seem to do anything.

I've tried reducing vfs.zfs.vdev.min_pending and vfs.zfs.vdev.max_pending to 1, to no avail.

Any suggestions to resolve this problem, work around it, or further investigate it would be greatly appreciated!

Thanks a lot
Olivier

Detailed information:

Output of procstat -a -kk when the machine is hanging is available at http://pastebin.com/7D2KtT35 (not putting it here because it's pretty long)

dmesg is available at http://pastebin.com/9zJQwWJG . Note that I'm using LUN masking, so the illegal requests reported aren't really errors. Maybe one day if I get my problems sorted out I'll use geom multipathing instead.
My kernel config is

include GENERIC
ident MYKERNEL
options IPSEC
device crypto
options OFED # Infiniband protocol
device mlx4ib # ConnectX Infiniband support
device mlxen # ConnectX Ethernet support
device mthca # Infinihost cards
device ipoib # IP over IB devices
options ATA_CAM # Handle legacy controllers with CAM
options ATA_STATIC_ID # Static device numbering
options KDB
options DDB

Full output of gstat just before the hang (at most 100s before the hang):

L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w    d/s   kBps   ms/d   %busy Name
   0      0      0      0    0.0      0      0    0.0      0      0    0.0     0.0 da2
   0      0      0      0    0.0      0      0    0.0      0      0    0.0     0.0 da0
   0      0      0      0    0.0      0      0    0.0      0      0    0.0     0.0 DEV/da2/da2
   0      0      0      0    0.0      0      0    0.0      0      0    0.0     0.0 DEV/da0/da0
   1     85     48     79    4.7     35     84    0.5      0      0    0.0    24.3 da1
   0      0      0      0    0.0      0      0    0.0      0      0    0.0     0.0 DEV/da1/da1
   1     83     47     77    4.3     34     79    0.5      0      0    0.0    22.1 da4
   1   1324   1303  21433    0.6     19     42    0.7      0      0    0.0    79.8 da3
   0      0      0      0    0.0      0      0    0.0      0      0    0.0     0.0 da5
   0      0      0      0    0.0      0      0    0.0      0      0    0.0     0.0 da6
   0      0      0      0    0.0      0      0    0.0      0      0    0.0     0.0 da7
   0      0      0      0    0.0      0      0    0.0      0      0    0.0     0.0 da8
   0      0      0      0    0.0      0      0    0.0      0      0    0.0     0.0 da9
   0      0      0      0    0.0      0      0    0.0      0      0    0.0     0.0 da10
   0      0      0      0    0.0      0      0    0.0      0      0    0.0     0.0 da11
   0      0      0      0    0.0      0      0