Re: 9.1 minimal ram requirements

2012-12-24 Thread Jakub Lach
http://www.freebsd.org/cgi/query-pr.cgi?pr=174671



--
View this message in context: 
http://freebsd.1045724.n5.nabble.com/9-1-minimal-ram-requirements-tp5771583p5771862.html
Sent from the freebsd-stable mailing list archive at Nabble.com.
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org


Re: [HEADSUP] zfs root pool mounting

2012-12-24 Thread Kimmo Paasiala
On Sun, Dec 23, 2012 at 2:49 PM, Andriy Gapon a...@freebsd.org wrote:
 on 23/12/2012 14:34 Kimmo Paasiala said the following:
 On Sun, Dec 23, 2012 at 2:28 PM, Andriy Gapon a...@freebsd.org wrote:

 I have MFCed the following change, so please double-check if you might be
 affected.  Preferably before upgrading :-)

 on 28/11/2012 20:35 Andriy Gapon said the following:

 Recently some changes were made to how a root pool is opened for root
 filesystem mounting.  Previously the root pool had to be present in
 zpool.cache.  Now it is automatically discovered by probing available GEOM
 providers.
 The new scheme is believed to be more flexible.  For example, it allows one
 to prepare a new root pool on one system, then export it and then boot from
 it on a new system without doing any extra/magical steps with zpool.cache.
 It could also be convenient after zpool split and in some other situations.

 The change was introduced via multiple commits; the latest relevant revision
 in head is r243502.  The changes are partially MFC-ed; the remaining parts
 are scheduled to be MFC-ed soon.

 I have received a report that the change caused a problem with booting on at
 least one system.  The problem has been identified as an issue in the local
 environment and has been fixed.  Please read on to see if you might be
 affected when you upgrade, so that you can avoid any unnecessary surprises.

 You might be affected if you ever had a pool named the same as your current
 root pool, and you still have any disks connected to your system that
 belonged to that pool (in whole or via some partitions), and that pool was
 never properly destroyed using zpool destroy, but merely abandoned (its disks
 re-purposed/re-partitioned/reused).

 If all of the above are true, then I recommend that you run 'zdb -l disk'
 for all suspect disks and their partitions (or just all disks and
 partitions).  If this command reports at least one valid ZFS label for a
 disk or a partition that does not belong to any current pool, then the
 problem may affect you.

 The best course is to remove the offending labels.

 If you are affected, please follow up to this email.

 Much appreciated!

 I have verified that my system is not affected.

 One question, do I have to rewrite the zfs gpt boot loader
 (/boot/gptzfsboot) onto the freebsd-boot partition to make use of this
 change?

 This change is kernel-level only.  There is no interaction with boot blocks.

 --
 Andriy Gapon

I can happily report that booting from the ZFS pool works on my
9-STABLE system without the zpool.cache file.

Thanks, merry christmas and happy new year!

-Kimmo
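
The label check recommended in the quoted announcement could be scripted roughly as follows. This is only a sketch under stated assumptions: it assumes that the text printed by 'zdb -l' for a valid label contains an indented "pool_guid: <number>" line, and that you already know the GUIDs of your current pools (for example from 'zpool get guid'). The function name and the sample text are hypothetical; the sample is hand-written for illustration and is not captured from a real system. Pool GUIDs are compared rather than names because the problem described above involves an old pool with the same name as the current root pool.

```python
import re
from typing import Iterable, List

# Assumption: `zdb -l` prints each valid label as indented "key: value"
# lines that include a "pool_guid: <number>" entry.
_POOL_GUID = re.compile(r"^\s*pool_guid:\s*(\d+)\s*$", re.MULTILINE)


def foreign_pool_guids(zdb_output: str, current_guids: Iterable[int]) -> List[int]:
    """Return pool GUIDs seen in the label text that match no current pool."""
    known = set(current_guids)
    found = {int(guid) for guid in _POOL_GUID.findall(zdb_output)}
    return sorted(found - known)


# Hand-written sample for illustration only, not real zdb output.
SAMPLE = """\
LABEL 0
    name: 'zroot'
    pool_guid: 1111
LABEL 1
    name: 'zroot'
    pool_guid: 1111
"""

if __name__ == "__main__":
    # The current root pool is assumed to have GUID 2222, so the label
    # above belongs to an abandoned pool and is reported.
    print(foreign_pool_guids(SAMPLE, current_guids=[2222]))
```

A non-empty result for a disk or partition would suggest a leftover label that deserves a closer look before upgrading.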


Kernel panic when playing games/iourbanterror

2012-12-24 Thread David Demelier

Hello,

When playing Urban Terror a lot, the system panics with ACPI-related issues:

Fatal trap 9: general protection fault while in kernel mode
cpuid = 1; apic id = 01
instruction pointer = 0x20:0x802c6f15
stack pointer   = 0x28:0xff80d89ac6c0
frame pointer   = 0x28:0x0
code segment= base 0x0, limit 0xf, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags= interrupt enabled, resume, IOPL = 0
current process = 1288 (hald)
trap number = 9
panic: general protection fault
cpuid = 1
Uptime: 1h52m22s
Dumping 596 out of 3054 MB:..3%..11%..22%..33%..41%..51%..62%..73%..81%..92%

Reading symbols from /boot/modules/vboxdrv.ko...done.
Loaded symbols for /boot/modules/vboxdrv.ko
#0  doadump (textdump=Variable textdump is not available.
) at pcpu.h:224
224 __asm(movq %%gs:0,%0 : =r (td));
(kgdb) list *0xff80d89ac6c0
No source file for address 0xff80d89ac6c0.
(kgdb) backtrace
#0  doadump (textdump=Variable textdump is not available.
) at pcpu.h:224
#1  0x0004 in ?? ()
#2  0x804f3ae6 in kern_reboot (howto=260)
at /usr/src/sys/kern/kern_shutdown.c:448
#3  0x804f3fa9 in panic (fmt=0x1 Address 0x1 out of bounds)
at /usr/src/sys/kern/kern_shutdown.c:636
#4  0x806fcfa9 in trap_fatal (frame=0x9, eva=Variable eva is not available.
)
at /usr/src/sys/amd64/amd64/trap.c:857
#5  0x806fd554 in trap (frame=0xff80d89ac610)
at /usr/src/sys/amd64/amd64/trap.c:599
#6  0x806e81bf in calltrap ()
at /usr/src/sys/amd64/amd64/exception.S:228
#7  0x802c6f15 in AcpiUtUpdateObjectReference (
Object=0xfe0001824a80, Action=0)
at /usr/src/sys/contrib/dev/acpica/utilities/utdelete.c:563
#8  0x802b77a4 in AcpiExResolveNodeToValue (
ObjectPtr=0xfe0001a2c2e0, WalkState=0xfe0001a2c000)
at /usr/src/sys/contrib/dev/acpica/executer/exresnte.c:184
#9  0x802b7ad3 in AcpiExResolveToValue (StackPtr=0xfe0001a2c2e0,
WalkState=0xfe0001a2c000)
at /usr/src/sys/contrib/dev/acpica/executer/exresolv.c:124
#10 0x802ac433 in AcpiDsEvaluateNamePath (WalkState=0xfe0001a2c000)
at /usr/src/sys/contrib/dev/acpica/dispatcher/dsutils.c:886
#11 0x802aceef in AcpiDsExecEndOp (WalkState=0xfe0001a2c000)
at /usr/src/sys/contrib/dev/acpica/dispatcher/dswexec.c:436
#12 0x802c05ba in AcpiPsParseLoop (WalkState=0xfe0001a2c000)
at /usr/src/sys/contrib/dev/acpica/parser/psloop.c:1249
#13 0x802c10a8 in AcpiPsParseAml (WalkState=0xfe0001a2c000)
at /usr/src/sys/contrib/dev/acpica/parser/psparse.c:525
#14 0x802c1d45 in AcpiPsExecuteMethod (Info=0xfe0033df8540)
at /usr/src/sys/contrib/dev/acpica/parser/psxface.c:368
#15 0x802bb784 in AcpiNsEvaluate (Info=0xfe0033df8540)
at /usr/src/sys/contrib/dev/acpica/namespace/nseval.c:193
#16 0x802bec91 in AcpiEvaluateObject (Handle=0xfe00017f7b80,
Pathname=0x8078229f _BST, ExternalParams=0x0,
ReturnBuffer=0xff80d89ac960)
at /usr/src/sys/contrib/dev/acpica/namespace/nsxfeval.c:289
#17 0x80309802 in acpi_cmbat_get_bst (arg=Variable arg is not available.
)
at /usr/src/sys/dev/acpica/acpi_cmbat.c:257
#18 0x80309af8 in acpi_cmbat_bst (dev=0xfe0001936400,
bstp=0xfe008b319400) at /usr/src/sys/dev/acpica/acpi_cmbat.c:418
#19 0x8045bd22 in devfs_ioctl_f (fp=0xfe001ba256e0,
com=3231990289, data=Variable data is not available.
) at /usr/src/sys/fs/devfs/devfs_vnops.c:757
#20 0x8053a23d in kern_ioctl (td=0xfe00039ae8e0, fd=Variable fd is not available.
) at file.h:293
#21 0x8053a4ad in sys_ioctl (td=0xfe00039ae8e0,
uap=0xff80d89acb70) at /usr/src/sys/kern/sys_generic.c:691
#22 0x806fc902 in amd64_syscall (td=0xfe00039ae8e0, traced=0)
at subr_syscall.c:135
#23 0x806e84a7 in Xfast_syscall ()
at /usr/src/sys/amd64/amd64/exception.S:387
#24 0x000801d89c5c in ?? ()
Previous frame inner to this frame (corrupt stack?)

Before the panic, a lot of ACPI errors like these appear in dmesg:

ACPI Error: Method execution failed [\\_SB_.BAT0._UID] (Node 0xfe00017f7b00), AE_AML_NO_OPERAND (20110527/uteval-113)
ACPI Error: No object attached to node 0xfe00017f7b00 (20110527/exresnte-139)
ACPI Error: Method execution failed [\\_SB_.BAT0._UID] (Node 0xfe00017f7b00), AE_AML_NO_OPERAND (20110527/uteval-113)
ACPI Error: No object attached to node 0xfe00017f7b00 (20110527/exresnte-139)


This happens on 9.1-RELEASE amd64

Cheers,

Re: PKGNG Monitoring in Zabbix

2012-12-24 Thread Johan Hendriks

Marin Atanasov Nikolov wrote:

Hey,

Looks like the end of the World is postponed, so I've thought that now I
have some time to document some stuff :)

The documentation is about monitoring your PKGNG package database in
Zabbix.

Part I explains how to monitor your database and have graphs of the number
of packages and disk space taken by packages on your FreeBSD system.

Part II talks about how to perform audits of your package database for
things like missing package dependencies and packages that are known to
be vulnerable.

You can find the documentation at the links below:

* http://unix-heaven.org/monitorig-pkgng-in-zabbix-part-i
* http://unix-heaven.org/monitorig-pkgng-in-zabbix-part-ii

Hope you like them, and Happy Holidays! :)

Regards,
Marin


Thanks, always nice to see things in my Zabbix console.
I will try it out when I find some time :)

Regards,
Johan Hendriks



Re: What is negative group permissions? (Re: narawntapu security run output)

2012-12-24 Thread Mikhail T.

On 23.12.2012 11:48, Chris Rees wrote:
They involve a lot of thought to get right, as well as chmod g-w on 
something where you probably meant chmod go-w is a disastrous but 
(perhaps) common error. Chris 


Well, in (over 20) years of dealing with Unix, I've never made a mistake 
like that, nor do I understand, how it can be considered common ... 
Got to admit, I was surprised to see it. It made me think, I do not 
understand something -- or that FreeBSD is becoming overly 
paternalistic. It turned out to be the latter...


I doubt, it is useful. Worse, issuing such warnings routinely, only 
reinforces the unfortunate misconceptions like the one Barney 
demonstrated in this thread. When originally added, the check was meant 
to be off by default:


   r215213 | brooks | 2010-11-12 19:40:43 -0500 (Fri, 12 Nov 2010) | 7 lines

   Add an (off by default) check for negative permissions (where the
   group on a object has less permissions that everyone).  These
   permissions will not work reliably over NFS if you have more than
   14 supplemental groups and are usually not what you mean.

   MFC after:  1 week

perhaps, it should have remained off? Yours,

   -mi
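
For readers unfamiliar with the term, the "negative permissions" condition named in the commit log above can be sketched as a bit test on the file mode. This is a minimal illustration, assuming "negative" means that "other" holds at least one permission bit that "group" lacks; the function name is hypothetical and the actual periodic security check may well be implemented differently.

```python
import stat


def has_negative_group_perms(mode: int) -> bool:
    """Return True when 'other' holds a permission bit that 'group' lacks."""
    # Shift the group bits down so they line up with the 'other' bits.
    group_bits = (mode & stat.S_IRWXG) >> 3
    other_bits = mode & stat.S_IRWXO
    return bool(other_bits & ~group_bits)


if __name__ == "__main__":
    # Starting from mode 0666: "chmod g-w" leaves 0646, "chmod go-w" leaves 0644.
    print(has_negative_group_perms(0o646))  # group r--, other rw-
    print(has_negative_group_perms(0o644))  # group r--, other r--
```

Under that assumption, the chmod g-w slip mentioned earlier in the thread produces a mode the test flags, while the intended chmod go-w does not.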


Re: What is negative group permissions? (Re: narawntapu security run output)

2012-12-24 Thread jb
Mikhail T. mi+thun at aldan.algebra.com writes:

 
 On 23.12.2012 11:48, Chris Rees wrote:
  They involve a lot of thought to get right, as well as chmod g-w on 
  something where you probably meant chmod go-w is a disastrous but 
  (perhaps) common error. Chris 
 
 Well, in (over 20) years of dealing with Unix, I've never made a mistake 
 like that, nor do I understand, how it can be considered common ... 
 Got to admit, I was surprised to see it. It made me think, I do not 
 understand something -- or that FreeBSD is becoming overly 
 paternalistic. It turned out to be the latter...
 
 I doubt, it is useful. Worse, issuing such warnings routinely, only 
 reinforces the unfortunate misconceptions like the one Barney 
 demonstrated in this thread. When originally added, the check was meant 
 to be off by default:
 ... 
 perhaps, it should have remained off? Yours,

Those security checks are for a reason - people make mistakes (even a perfect
guy like you will have a head in a brown bag time).
It is better to get a heads-up, then think about it and turn it off (customize)
if considered unneeded.
jb
 





Re: What is negative group permissions? (Re: narawntapu security run output)

2012-12-24 Thread Eitan Adler
On 24 December 2012 10:27, jb jb.1234a...@gmail.com wrote:
 Those security checks are for a reason - people make mistakes (even a perfect
 guy like you will have a head in a brown bag time).
 It is better to get a heads-up, then think about it and turn it off 
 (customize)
 if considered unneeded.

+1.  Default to helping the new user (or the user that makes mistakes).


-- 
Eitan Adler


Re: FreeBSD 9.1-RELEASE crashes almost daily; backtraces always list zfs routines

2012-12-24 Thread Andriy Gapon
on 24/12/2012 00:23 Derek Kulinski said the following:
 Dumping 3701 out of 8072 MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91%

So do you have the crash dump(s)?

-- 
Andriy Gapon


Re: FreeBSD 9.1-RELEASE crashes almost daily; backtraces always list zfs routines

2012-12-24 Thread Derek Kulinski
Hello Andriy,

Monday, December 24, 2012, 8:01:26 AM, you wrote:

 on 24/12/2012 00:23 Derek Kulinski said the following:
 Dumping 3701 out of 8072 MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91%

 So do you have the crash dump(s)?

Yes, but they are 3.5GB each. I attached text dump to GNATS but I can
resend it to you (I don't know if it's ok to send attachments to the
mailing list). If you would prefer I could give you access to the
box.

-- 
Best regards,
 Derek  mailto:tak...@takeda.tk

-- Programmer - A red-eyed, mumbling mammal capable of conversing with 
inanimate objects.



Re: Kernel panic when playing games/iourbanterror

2012-12-24 Thread Adrian Chadd
Hi,

Please file a PR? We can bump it to the ACPI person who has been
busily making this stuff updated and stable.


Thanks!



Adrian


On 24 December 2012 05:52, David Demelier demelier.da...@gmail.com wrote:
 Hello,

 When playing Urban Terror a lot, the system panics with ACPI-related issues:

 Fatal trap 9: general protection fault while in kernel mode
 cpuid = 1; apic id = 01
 instruction pointer = 0x20:0x802c6f15
 stack pointer   = 0x28:0xff80d89ac6c0
 frame pointer   = 0x28:0x0
 code segment= base 0x0, limit 0xf, type 0x1b
 = DPL 0, pres 1, long 1, def32 0, gran 1
 processor eflags= interrupt enabled, resume, IOPL = 0
 current process = 1288 (hald)
 trap number = 9
 panic: general protection fault
 cpuid = 1
 Uptime: 1h52m22s
 Dumping 596 out of 3054 MB:..3%..11%..22%..33%..41%..51%..62%..73%..81%..92%

 Reading symbols from /boot/modules/vboxdrv.ko...done.
 Loaded symbols for /boot/modules/vboxdrv.ko
 #0  doadump (textdump=Variable textdump is not available.
 ) at pcpu.h:224
 224 __asm(movq %%gs:0,%0 : =r (td));
 (kgdb) list *0xff80d89ac6c0
 No source file for address 0xff80d89ac6c0.
 (kgdb) backtrace
 #0  doadump (textdump=Variable textdump is not available.
 ) at pcpu.h:224
 #1  0x0004 in ?? ()
 #2  0x804f3ae6 in kern_reboot (howto=260)
 at /usr/src/sys/kern/kern_shutdown.c:448
 #3  0x804f3fa9 in panic (fmt=0x1 Address 0x1 out of bounds)
 at /usr/src/sys/kern/kern_shutdown.c:636
 #4  0x806fcfa9 in trap_fatal (frame=0x9, eva=Variable eva is not
 available.
 )
 at /usr/src/sys/amd64/amd64/trap.c:857
 #5  0x806fd554 in trap (frame=0xff80d89ac610)
 at /usr/src/sys/amd64/amd64/trap.c:599
 #6  0x806e81bf in calltrap ()
 at /usr/src/sys/amd64/amd64/exception.S:228
 #7  0x802c6f15 in AcpiUtUpdateObjectReference (
 Object=0xfe0001824a80, Action=0)
 at /usr/src/sys/contrib/dev/acpica/utilities/utdelete.c:563
 #8  0x802b77a4 in AcpiExResolveNodeToValue (
 ObjectPtr=0xfe0001a2c2e0, WalkState=0xfe0001a2c000)
 at /usr/src/sys/contrib/dev/acpica/executer/exresnte.c:184
 #9  0x802b7ad3 in AcpiExResolveToValue (StackPtr=0xfe0001a2c2e0,
 WalkState=0xfe0001a2c000)
 at /usr/src/sys/contrib/dev/acpica/executer/exresolv.c:124
 #10 0x802ac433 in AcpiDsEvaluateNamePath
 (WalkState=0xfe0001a2c000)
 at /usr/src/sys/contrib/dev/acpica/dispatcher/dsutils.c:886
 #11 0x802aceef in AcpiDsExecEndOp (WalkState=0xfe0001a2c000)
 at /usr/src/sys/contrib/dev/acpica/dispatcher/dswexec.c:436
 #12 0x802c05ba in AcpiPsParseLoop (WalkState=0xfe0001a2c000)
 at /usr/src/sys/contrib/dev/acpica/parser/psloop.c:1249
 #13 0x802c10a8 in AcpiPsParseAml (WalkState=0xfe0001a2c000)
 at /usr/src/sys/contrib/dev/acpica/parser/psparse.c:525
 #14 0x802c1d45 in AcpiPsExecuteMethod (Info=0xfe0033df8540)
 at /usr/src/sys/contrib/dev/acpica/parser/psxface.c:368
 #15 0x802bb784 in AcpiNsEvaluate (Info=0xfe0033df8540)
 at /usr/src/sys/contrib/dev/acpica/namespace/nseval.c:193
 #16 0x802bec91 in AcpiEvaluateObject (Handle=0xfe00017f7b80,
 Pathname=0x8078229f _BST, ExternalParams=0x0,
 ReturnBuffer=0xff80d89ac960)
 at /usr/src/sys/contrib/dev/acpica/namespace/nsxfeval.c:289
 #17 0x80309802 in acpi_cmbat_get_bst (arg=Variable arg is not
 available.
 )
 at /usr/src/sys/dev/acpica/acpi_cmbat.c:257
 #18 0x80309af8 in acpi_cmbat_bst (dev=0xfe0001936400,
 bstp=0xfe008b319400) at /usr/src/sys/dev/acpica/acpi_cmbat.c:418
 #19 0x8045bd22 in devfs_ioctl_f (fp=0xfe001ba256e0,
 com=3231990289, data=Variable data is not available.
 ) at /usr/src/sys/fs/devfs/devfs_vnops.c:757
 #20 0x8053a23d in kern_ioctl (td=0xfe00039ae8e0, fd=Variable
 fd is not available.
 ) at file.h:293
 #21 0x8053a4ad in sys_ioctl (td=0xfe00039ae8e0,
 uap=0xff80d89acb70) at /usr/src/sys/kern/sys_generic.c:691
 #22 0x806fc902 in amd64_syscall (td=0xfe00039ae8e0, traced=0)
 at subr_syscall.c:135
 #23 0x806e84a7 in Xfast_syscall ()
 at /usr/src/sys/amd64/amd64/exception.S:387
 #24 0x000801d89c5c in ?? ()
 Previous frame inner to this frame (corrupt stack?)

 Before the panic, a lot of ACPI errors like these appear in dmesg:

 ACPI Error: Method execution failed [\\_SB_.BAT0._UID] (Node
 0xfe00017f7b00), AE_AML_NO_OPERAND (20110527/uteval-113)
 ACPI Error: No object attached to node 0xfe00017f7b00
 (20110527/exresnte-139)
 ACPI Error: Method execution failed [\\_SB_.BAT0._UID] (Node
 0xfe00017f7b00), AE_AML_NO_OPERAND (20110527/uteval-113)
 ACPI Error: No object attached to node 

stable/9 i386 panic [ACPI/timer?]

2012-12-24 Thread David Wolfskill
I finally(!) got around to enabling crash dumps on the primary machine
here at the house ... and managed to make use of it (unfortunately).

I've copied the relevant files (both those from /var/crash and
dmesg.boot) so they should be visible at
http://www.catwhisker.org/~david/FreeBSD/panic_24Dec2012/ (though
only the dmesg.boot, core.text.0,  info.0 files should be fetchable
for now).  [I'll make the vmcore.0 available to individuals who
wish to work on the problem; please contact me to arrange this.]

Here's a bit of information excerpted from core.text.0:

Mon Dec 24 11:16:04 PST 2012

FreeBSD albert.catwhisker.org 9.1-PRERELEASE FreeBSD 9.1-PRERELEASE #434 
244582M: Sat Dec 22 05:06:29 PST 2012 
r...@freebeast.catwhisker.org:/usr/obj/usr/src/sys/ALBERT  i386

Note that while the version string says 244582M:

* Userland was at r244608.

* The Modification was merely a change to src/sys/newvers.sh to re-factor
  the extraction of the version string.



panic: page fault

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x34
fault code  = supervisor read, page not present
instruction pointer = 0x20:0xc0ad475c
stack pointer   = 0x28:0xc6fba9d8
frame pointer   = 0x28:0xc6fbaa18
code segment= base 0x0, limit 0xf, type 0x1b
= DPL 0, pres 1, def32 1, gran 1
processor eflags= resume, IOPL = 0
current process = 11 (idle: cpu0)
trap number = 12
panic: page fault
cpuid = 0
KDB: stack backtrace:
db_trace_self_wrapper(c0ffbab8,46,1,ca931e80,0,...) at 0xc051ef76 = db_trace_self_wrapper+0x36/frame 0xc6fba740
kdb_backtrace(c1033ff1,0,c0e75cc4,c6fba7ec,c71f08d0,...) at 0xc0afc400 = kdb_backtrace+0x30/frame 0xc6fba7a0
panic(c0e75cc4,c1034ddb,c71f0a84,1,1,...) at 0xc0ac763c = panic+0x1bc/frame 0xc6fba7e0
trap_fatal(28,7fff,3,0,28,...) at 0xc0e35560 = trap_fatal+0x340/frame 0xc6fba828
trap_pfault(34,c,1,c11a68b0,c6fba940,...) at 0xc0e358cb = trap_pfault+0x35b/frame 0xc6fba8a0
trap(c6fba998) at 0xc0e34e13 = trap+0x443/frame 0xc6fba98c
calltrap() at 0xc0e1e86c = calltrap+0x6/frame 0xc6fba98c
--- trap 0xc, eip = 0xc0ad475c, esp = 0xc6fba9d8, ebp = 0xc6fbaa18 ---
tc_windup(1,0,c0ff3ba6,21c,0,...) at 0xc0ad475c = tc_windup+0x1c/frame 0xc6fbaa18
hardclock_cnt(1,0,0,3,0,...) at 0xc0a77e39 = hardclock_cnt+0x2e9/frame 0xc6fbaa68
handleevents(c6fbaaf8,2,46,c71f08d0,c6fbaae4,...) at 0xc0e3c534 = handleevents+0x184/frame 0xc6fbaac0
timercb(c7564064,0,c76a82f0,c6fbab58,c0a99a0e,...) at 0xc0e3d1a1 = timercb+0x281/frame 0xc6fbab14
hpet_intr_single(c7564064,c7569780,0,c6fbabbc,c6fbab78,...) at 0xc053a345 = hpet_intr_single+0x195/frame 0xc6fbab40
hpet_intr(c7564000,0,c71f08d0,14,c723b710,...) at 0xc053a3cf = hpet_intr+0x6f/frame 0xc6fbab58
intr_event_handle(c723c280,c6fbabbc,c6fbab94,0,c7182600,...) at 0xc0a99c5c = intr_event_handle+0x7c/frame 0xc6fbab78
intr_execute_handlers(c723b710,c6fbabbc,0) at 0xc0e4c552 = intr_execute_handlers+0x42/frame 0xc6fbab98
lapic_handle_intr(33,c6fbabbc) at 0xc0e4f50d = lapic_handle_intr+0x3d/frame 0xc6fbabac
Xapic_isr1() at 0xc0e1ec35 = Xapic_isr1+0x35/frame 0xc6fbabac
--- interrupt, eip = 0xc0e1a202, esp = 0xc6fbabfc, ebp = 0xc6fbac3c ---
acpi_cpu_c1(0,c6fbac58,c0e250a6,0,c1198018,...) at 0xc0e1a202 = acpi_cpu_c1+0x2/frame 0xc6fbac3c
cpu_idle_acpi(0,c1198018,c6fbacd0,c0aee519,0,...) at 0xc0e24fff = cpu_idle_acpi+0x2f/frame 0xc6fbac48
cpu_idle(0,2,c0ffa49a,a36,c71f08d0,...) at 0xc0e250a6 = cpu_idle+0x96/frame 0xc6fbac58
sched_idletd(0,c6fbad08,0,0,c0aee250,...) at 0xc0aee519 = sched_idletd+0x2c9/frame 0xc6fbacd0
fork_exit(c0aee250,0,c6fbad08) at 0xc0a977c7 = fork_exit+0x67/frame 0xc6fbacf4
fork_trampoline() at 0xc0e1e8e4 = fork_trampoline+0x8/frame 0xc6fbacf4
--- trap 0, eip = 0, esp = 0xc6fbad40, ebp = 0 ---
Uptime: 7h11m46s
Physical memory: 3045 MB


#0  doadump (textdump=value optimized out) at pcpu.h:249
249 pcpu.h: No such file or directory.
in pcpu.h
(kgdb) #0  doadump (textdump=value optimized out) at pcpu.h:249
#1  0xc0ac71fa in kern_reboot (howto=Unhandled dwarf expression opcode 0xc0
)
at /usr/src/sys/kern/kern_shutdown.c:448
#2  0xc0ac7688 in panic (fmt=Unhandled dwarf expression opcode 0xc0
) at /usr/src/sys/kern/kern_shutdown.c:636
#3  0xc0e35560 in trap_fatal (frame=value optimized out, 
eva=value optimized out) at /usr/src/sys/i386/i386/trap.c:1043
#4  0xc0e358cb in trap_pfault (frame=value optimized out, usermode=Unhandled 
dwarf expression opcode 0xc3
)
at /usr/src/sys/i386/i386/trap.c:858
#5  0xc0e34e13 in trap (frame=value optimized out)
at /usr/src/sys/i386/i386/trap.c:555
#6  0xc0e1e86c in calltrap () at /tmp/exception-SmXQMs.s:94
#7  0xc0ad475c in tc_windup () at /usr/src/sys/kern/kern_tc.c:450
#8  0xc0a77e39 in hardclock_cnt (usermode=value optimized out)
at /usr/src/sys/kern/kern_clock.c:556
#9  0xc0e3c534 in handleevents (now=value optimized out, 
fake=value optimized out) 

Re: FreeBSD 9.1-RELEASE crashes almost daily; backtraces always list zfs routines

2012-12-24 Thread Mark Linimon
On Mon, Dec 24, 2012 at 10:17:19AM -0800, Derek Kulinski wrote:
 Yes, but they are 3.5GB each. I attached text dump to GNATS but I can
 resend it to you

We have a limit of 500K on GNATS PRs.  For something that huge, a PR
database is really not the right place for it -- please post the dumps
somewhere and include a URL to them in a followup to the PR.

Thanks.

Mark Linimon, on behalf of bugmeister


Re: FreeBSD 9.1-RELEASE crashes almost daily; backtraces always list zfs routines

2012-12-24 Thread Derek Kulinski
Hello Mark,

Monday, December 24, 2012, 12:46:53 PM, you wrote:

 On Mon, Dec 24, 2012 at 10:17:19AM -0800, Derek Kulinski wrote:
 Yes, but they are 3.5GB each. I attached text dump to GNATS but I can
 resend it to you

 We have a limit of 500K on GNATS PRs.  For something that huge, a PR
 database is really not the right place for it -- please post the dumps
 somewhere and include a URL to them in a followup to the PR.

 Thanks.

 Mark Linimon, on behalf of bugmeister

I included the text dump, but I do not see it when I visit the web
interface so I don't know if it was attached there or not.

-- 
Best regards,
 Derek  mailto:tak...@takeda.tk

My new car runs at 56Kbps



Re: stable/9 i386 panic [ACPI/timer?]

2012-12-24 Thread Andriy Gapon
on 24/12/2012 21:58 David Wolfskill said the following:
 I finally(!) got around to enabling crash dumps on the primary machine
 here at the house ... and managed to make use of it (unfortunately).
 
 I've copied the relevant files (both those from /var/crash and
 dmesg.boot) so they should be visible at
 http://www.catwhisker.org/~david/FreeBSD/panic_24Dec2012/ (though
 only the dmesg.boot, core.text.0,  info.0 files should be fetchable
 for now).  [I'll make the vmcore.0 available to individuals who
 wish to work on the problem; please contact me to arrange this.]
 
 Here's a bit of information excerpted from core.text.0:
 
 Mon Dec 24 11:16:04 PST 2012
 
 FreeBSD albert.catwhisker.org 9.1-PRERELEASE FreeBSD 9.1-PRERELEASE #434 
 244582M: Sat Dec 22 05:06:29 PST 2012 
 r...@freebeast.catwhisker.org:/usr/obj/usr/src/sys/ALBERT  i386
 
 Note that while the version string says 244582M:
 
 * Userland was at r244608.
 
 * The Modification was merely a change to src/sys/newvers.sh to re-factor
   the extraction of the version string.
 
 
 
 panic: page fault
 
 Fatal trap 12: page fault while in kernel mode
 cpuid = 0; apic id = 00
 fault virtual address   = 0x34
 fault code  = supervisor read, page not present
 instruction pointer = 0x20:0xc0ad475c
 stack pointer   = 0x28:0xc6fba9d8
 frame pointer   = 0x28:0xc6fbaa18
 code segment= base 0x0, limit 0xf, type 0x1b
 = DPL 0, pres 1, def32 1, gran 1
 processor eflags= resume, IOPL = 0
 current process = 11 (idle: cpu0)
 trap number = 12
 panic: page fault
 cpuid = 0
 KDB: stack backtrace:
 db_trace_self_wrapper(c0ffbab8,46,1,ca931e80,0,...) at 0xc051ef76 = 
 db_trace_self_wrapper+0x36/frame 0xc6fba740
 kdb_backtrace(c1033ff1,0,c0e75cc4,c6fba7ec,c71f08d0,...) at 0xc0afc400 = 
 kdb_backtrace+0x30/frame 0xc6fba7a0
 panic(c0e75cc4,c1034ddb,c71f0a84,1,1,...) at 0xc0ac763c = panic+0x1bc/frame 
 0xc6fba7e0
 trap_fatal(28,7fff,3,0,28,...) at 0xc0e35560 = trap_fatal+0x340/frame 
 0xc6fba828
 trap_pfault(34,c,1,c11a68b0,c6fba940,...) at 0xc0e358cb = 
 trap_pfault+0x35b/frame 0xc6fba8a0
 trap(c6fba998) at 0xc0e34e13 = trap+0x443/frame 0xc6fba98c
 calltrap() at 0xc0e1e86c = calltrap+0x6/frame 0xc6fba98c
 --- trap 0xc, eip = 0xc0ad475c, esp = 0xc6fba9d8, ebp = 0xc6fbaa18 ---
 tc_windup(1,0,c0ff3ba6,21c,0,...) at 0xc0ad475c = tc_windup+0x1c/frame 
 0xc6fbaa18
 hardclock_cnt(1,0,0,3,0,...) at 0xc0a77e39 = hardclock_cnt+0x2e9/frame 
 0xc6fbaa68
 handleevents(c6fbaaf8,2,46,c71f08d0,c6fbaae4,...) at 0xc0e3c534 = 
 handleevents+0x184/frame 0xc6fbaac0
 timercb(c7564064,0,c76a82f0,c6fbab58,c0a99a0e,...) at 0xc0e3d1a1 = 
 timercb+0x281/frame 0xc6fbab14
 hpet_intr_single(c7564064,c7569780,0,c6fbabbc,c6fbab78,...) at 0xc053a345 = 
 hpet_intr_single+0x195/frame 0xc6fbab40
 hpet_intr(c7564000,0,c71f08d0,14,c723b710,...) at 0xc053a3cf = 
 hpet_intr+0x6f/frame 0xc6fbab58
 intr_event_handle(c723c280,c6fbabbc,c6fbab94,0,c7182600,...) at 0xc0a99c5c = 
 intr_event_handle+0x7c/frame 0xc6fbab78
 intr_execute_handlers(c723b710,c6fbabbc,0) at 0xc0e4c552 = 
 intr_execute_handlers+0x42/frame 0xc6fbab98
 lapic_handle_intr(33,c6fbabbc) at 0xc0e4f50d = lapic_handle_intr+0x3d/frame 
 0xc6fbabac
 Xapic_isr1() at 0xc0e1ec35 = Xapic_isr1+0x35/frame 0xc6fbabac
 --- interrupt, eip = 0xc0e1a202, esp = 0xc6fbabfc, ebp = 0xc6fbac3c ---
 acpi_cpu_c1(0,c6fbac58,c0e250a6,0,c1198018,...) at 0xc0e1a202 = 
 acpi_cpu_c1+0x2/frame 0xc6fbac3c
 cpu_idle_acpi(0,c1198018,c6fbacd0,c0aee519,0,...) at 0xc0e24fff = 
 cpu_idle_acpi+0x2f/frame 0xc6fbac48
 cpu_idle(0,2,c0ffa49a,a36,c71f08d0,...) at 0xc0e250a6 = cpu_idle+0x96/frame 
 0xc6fbac58
 sched_idletd(0,c6fbad08,0,0,c0aee250,...) at 0xc0aee519 = 
 sched_idletd+0x2c9/frame 0xc6fbacd0
 fork_exit(c0aee250,0,c6fbad08) at 0xc0a977c7 = fork_exit+0x67/frame 0xc6fbacf4
 fork_trampoline() at 0xc0e1e8e4 = fork_trampoline+0x8/frame 0xc6fbacf4
 --- trap 0, eip = 0, esp = 0xc6fbad40, ebp = 0 ---
 Uptime: 7h11m46s
 Physical memory: 3045 MB
 
 
 #0  doadump (textdump=value optimized out) at pcpu.h:249
 249 pcpu.h: No such file or directory.
 in pcpu.h
 (kgdb) #0  doadump (textdump=value optimized out) at pcpu.h:249
 #1  0xc0ac71fa in kern_reboot (howto=Unhandled dwarf expression opcode 0xc0
 )
 at /usr/src/sys/kern/kern_shutdown.c:448
 #2  0xc0ac7688 in panic (fmt=Unhandled dwarf expression opcode 0xc0
 ) at /usr/src/sys/kern/kern_shutdown.c:636
 #3  0xc0e35560 in trap_fatal (frame=value optimized out, 
 eva=value optimized out) at /usr/src/sys/i386/i386/trap.c:1043
 #4  0xc0e358cb in trap_pfault (frame=value optimized out, 
 usermode=Unhandled dwarf expression opcode 0xc3
 )
 at /usr/src/sys/i386/i386/trap.c:858
 #5  0xc0e34e13 in trap (frame=value optimized out)
 at /usr/src/sys/i386/i386/trap.c:555
 #6  0xc0e1e86c in calltrap () at /tmp/exception-SmXQMs.s:94
 #7  0xc0ad475c in tc_windup () at /usr/src/sys/kern/kern_tc.c:450

I'd say that what you see 

Re: stable/9 i386 panic [ACPI/timer?]

2012-12-24 Thread David Wolfskill
On Mon, Dec 24, 2012 at 11:04:04PM +0200, Andriy Gapon wrote:
 ...
 I'd say that what you see is impossible...

Well, I suppose it's small comfort, but that does make me feel a little
better about being a bit clueless about why this happened.  Thanks! :-}

 Could you please provide the following info from kgdb?
 p timehands
 p th0
 ...
 p th9
 disassemble tc_windup
 ...

Here you go, cut/pasted (though I elided the ---Type <return> to
continue, or q <return> to quit--- lines):

albert(9.1-P)[3] kgdb /boot/kernel/kernel.symbols vmcore.0
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
...
Unread portion of the kernel message buffer:
kernel trap 12 with interrupts disabled


Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x34
fault code  = supervisor read, page not present
instruction pointer = 0x20:0xc0ad475c
stack pointer   = 0x28:0xc6fba9d8
frame pointer   = 0x28:0xc6fbaa18
code segment= base 0x0, limit 0xf, type 0x1b
= DPL 0, pres 1, def32 1, gran 1
processor eflags= resume, IOPL = 0
current process = 11 (idle: cpu0)
trap number = 12
panic: page fault
...
Loaded symbols for /boot/kernel/drm.ko
#0  doadump (textdump=value optimized out) at pcpu.h:249
249 pcpu.h: No such file or directory.
in pcpu.h
(kgdb) frame 7
#7  0xc0ad475c in tc_windup () at /usr/src/sys/kern/kern_tc.c:450
450 /usr/src/sys/kern/kern_tc.c: No such file or directory.
in /usr/src/sys/kern/kern_tc.c
Current language:  auto; currently minimal
(kgdb) p timehands
$1 = (struct timehands * volatile) 0xc11ba910
(kgdb) p th0
$2 = {th_counter = 0xc115174c, th_adjustment = 51068786373500, 
  th_scale = 1690726758248, th_offset_count = 3989950369, th_offset = {
sec = 25906, frac = 2057132249855343962}, th_microtime = {
tv_sec = 1356376278, tv_usec = 180944}, th_nanotime = {
tv_sec = 1356376278, tv_nsec = 180944041}, th_generation = 669311, 
  th_next = 0xc112a7e4}
(kgdb) p th1
$3 = {th_counter = 0xc115174c, th_adjustment = 51068786373500, 
  th_scale = 1690726758248, th_offset_count = 3990015836, th_offset = {
sec = 25906, frac = 2167819058537565778}, th_microtime = {
tv_sec = 1356376278, tv_usec = 186944}, th_nanotime = {
tv_sec = 1356376278, tv_nsec = 186944385}, th_generation = 669311, 
  th_next = 0xc112a820}
(kgdb) p th2
$4 = {th_counter = 0xc115174c, th_adjustment = 51068786373500, 
  th_scale = 1690726758248, th_offset_count = 3990048555, th_offset = {
sec = 25906, frac = 2223137947340682090}, th_microtime = {
tv_sec = 1356376278, tv_usec = 189943}, th_nanotime = {
tv_sec = 1356376278, tv_nsec = 189943228}, th_generation = 669311, 
  th_next = 0xc112a85c}
(kgdb) p th3
$5 = {th_counter = 0xc115174c, th_adjustment = 51068786373500, 
  th_scale = 1690726758248, th_offset_count = 3990059490, th_offset = {
sec = 25906, frac = 224162602123970}, th_microtime = {
tv_sec = 1356376278, tv_usec = 190945}, th_nanotime = {
tv_sec = 1356376278, tv_nsec = 190945470}, th_generation = 669311, 
  th_next = 0xc112a898}
(kgdb) p th4
$6 = {th_counter = 0xc115174c, th_adjustment = 51068786373500, 
  th_scale = 1690726758248, th_offset_count = 3990070376, th_offset = {
sec = 25906, frac = 2260031295932411698}, th_microtime = {
tv_sec = 1356376278, tv_usec = 191943}, th_nanotime = {
tv_sec = 1356376278, tv_nsec = 191943220}, th_generation = 669311, 
  th_next = 0xc112a8d4}
(kgdb) p th5
$7 = {th_counter = 0xc115174c, th_adjustment = 51068786373500, 
  th_scale = 1690726758248, th_offset_count = 3990081323, th_offset = {
sec = 25906, frac = 2278539681754952554}, th_microtime = {
tv_sec = 1356376278, tv_usec = 192946}, th_nanotime = {
tv_sec = 1356376278, tv_nsec = 192946562}, th_generation = 669311, 
  th_next = 0xc11ba910}
(kgdb) p th6
$8 = {th_counter = 0xc115174c, th_adjustment = 51068786373500, 
  th_scale = 1690726758248, th_offset_count = 3989906722, th_offset = {
sec = 25906, frac = 1983337099038093506}, th_microtime = {
tv_sec = 1356376278, tv_usec = 176943}, th_nanotime = {
tv_sec = 1356376278, tv_nsec = 176943598}, th_generation = 669310, 
  th_next = 0xc112a94c}
(kgdb) p th7
$9 = {th_counter = 0xc115174c, th_adjustment = 51068786373500, 
  th_scale = 1690726758248, th_offset_count = 3989927028, th_offset = {
sec = 25906, frac = 2017668996591077394}, th_microtime = {
tv_sec = 1356376278, tv_usec = 178804}, th_nanotime = {
tv_sec = 1356376278, tv_nsec = 178804734}, th_generation = 669310, 
  th_next = 0xc112a988}
(kgdb) p th8
$10 = {th_counter = 0xc115174c, th_adjustment = 51068786373500, 
  th_scale = 1690726758248, th_offset_count = 3989928549, th_offset = {
sec = 25906, frac = 2020240591990372602}, th_microtime = {
tv_sec = 1356376278, tv_usec = 178944}, th_nanotime = {
tv_sec = 1356376278, tv_nsec = 178944140}, th_generation = 669310, 
  th_next = 

Re: stable/9 i386 panic [ACPI/timer?]

2012-12-24 Thread Andriy Gapon
on 24/12/2012 23:16 David Wolfskill said the following:
 albert(9.1-P)[3] kgdb /boot/kernel/kernel.symbols vmcore.0
 GNU gdb 6.1.1 [FreeBSD]
 Copyright 2004 Free Software Foundation, Inc.
 ...
 Unread portion of the kernel message buffer:
 kernel trap 12 with interrupts disabled
 
 
 Fatal trap 12: page fault while in kernel mode
 cpuid = 0; apic id = 00
 fault virtual address   = 0x34
 fault code  = supervisor read, page not present
 instruction pointer = 0x20:0xc0ad475c
 stack pointer   = 0x28:0xc6fba9d8
 frame pointer   = 0x28:0xc6fbaa18
 code segment= base 0x0, limit 0xf, type 0x1b
 = DPL 0, pres 1, def32 1, gran 1
 processor eflags= resume, IOPL = 0
 current process = 11 (idle: cpu0)
 trap number = 12
 panic: page fault
 ...
 Loaded symbols for /boot/kernel/drm.ko
 #0  doadump (textdump=value optimized out) at pcpu.h:249
 249 pcpu.h: No such file or directory.
 in pcpu.h
 (kgdb) frame 7
 #7  0xc0ad475c in tc_windup () at /usr/src/sys/kern/kern_tc.c:450
 450 /usr/src/sys/kern/kern_tc.c: No such file or directory.
 in /usr/src/sys/kern/kern_tc.c
 Current language:  auto; currently minimal
 (kgdb) p timehands
 $1 = (struct timehands * volatile) 0xc11ba910
 (kgdb) p th0
 $2 = {th_counter = 0xc115174c, th_adjustment = 51068786373500, 
   th_scale = 1690726758248, th_offset_count = 3989950369, th_offset = {
 sec = 25906, frac = 2057132249855343962}, th_microtime = {
 tv_sec = 1356376278, tv_usec = 180944}, th_nanotime = {
 tv_sec = 1356376278, tv_nsec = 180944041}, th_generation = 669311, 
   th_next = 0xc112a7e4}
 (kgdb) p th1
 $3 = {th_counter = 0xc115174c, th_adjustment = 51068786373500, 
   th_scale = 1690726758248, th_offset_count = 3990015836, th_offset = {
 sec = 25906, frac = 2167819058537565778}, th_microtime = {
 tv_sec = 1356376278, tv_usec = 186944}, th_nanotime = {
 tv_sec = 1356376278, tv_nsec = 186944385}, th_generation = 669311, 
   th_next = 0xc112a820}
 (kgdb) p th2
 $4 = {th_counter = 0xc115174c, th_adjustment = 51068786373500, 
   th_scale = 1690726758248, th_offset_count = 3990048555, th_offset = {
 sec = 25906, frac = 2223137947340682090}, th_microtime = {
 tv_sec = 1356376278, tv_usec = 189943}, th_nanotime = {
 tv_sec = 1356376278, tv_nsec = 189943228}, th_generation = 669311, 
   th_next = 0xc112a85c}
 (kgdb) p th3
 $5 = {th_counter = 0xc115174c, th_adjustment = 51068786373500, 
   th_scale = 1690726758248, th_offset_count = 3990059490, th_offset = {
 sec = 25906, frac = 224162602123970}, th_microtime = {
 tv_sec = 1356376278, tv_usec = 190945}, th_nanotime = {
 tv_sec = 1356376278, tv_nsec = 190945470}, th_generation = 669311, 
   th_next = 0xc112a898}
 (kgdb) p th4
 $6 = {th_counter = 0xc115174c, th_adjustment = 51068786373500, 
   th_scale = 1690726758248, th_offset_count = 3990070376, th_offset = {
 sec = 25906, frac = 2260031295932411698}, th_microtime = {
 tv_sec = 1356376278, tv_usec = 191943}, th_nanotime = {
 tv_sec = 1356376278, tv_nsec = 191943220}, th_generation = 669311, 
   th_next = 0xc112a8d4}
 (kgdb) p th5
 $7 = {th_counter = 0xc115174c, th_adjustment = 51068786373500, 
   th_scale = 1690726758248, th_offset_count = 3990081323, th_offset = {
 sec = 25906, frac = 2278539681754952554}, th_microtime = {
 tv_sec = 1356376278, tv_usec = 192946}, th_nanotime = {
 tv_sec = 1356376278, tv_nsec = 192946562}, th_generation = 669311, 
   th_next = 0xc11ba910}
 (kgdb) p th6
 $8 = {th_counter = 0xc115174c, th_adjustment = 51068786373500, 
   th_scale = 1690726758248, th_offset_count = 3989906722, th_offset = {
 sec = 25906, frac = 1983337099038093506}, th_microtime = {
 tv_sec = 1356376278, tv_usec = 176943}, th_nanotime = {
 tv_sec = 1356376278, tv_nsec = 176943598}, th_generation = 669310, 
   th_next = 0xc112a94c}
 (kgdb) p th7
 $9 = {th_counter = 0xc115174c, th_adjustment = 51068786373500, 
   th_scale = 1690726758248, th_offset_count = 3989927028, th_offset = {
 sec = 25906, frac = 2017668996591077394}, th_microtime = {
 tv_sec = 1356376278, tv_usec = 178804}, th_nanotime = {
 tv_sec = 1356376278, tv_nsec = 178804734}, th_generation = 669310, 
   th_next = 0xc112a988}
 (kgdb) p th8
 $10 = {th_counter = 0xc115174c, th_adjustment = 51068786373500, 
   th_scale = 1690726758248, th_offset_count = 3989928549, th_offset = {
 sec = 25906, frac = 2020240591990372602}, th_microtime = {
 tv_sec = 1356376278, tv_usec = 178944}, th_nanotime = {
 tv_sec = 1356376278, tv_nsec = 178944140}, th_generation = 669310, 
   th_next = 0xc112a9c4}
 (kgdb) p th9
 $11 = {th_counter = 0xc115174c, th_adjustment = 51068786373500, 
   th_scale = 1690726758248, th_offset_count = 3989939440, th_offset = {
 sec = 25906, frac = 2038654297114451570}, th_microtime = {
 tv_sec = 1356376278, tv_usec = 179942}, th_nanotime = {
 tv_sec = 1356376278, tv_nsec = 

Re: stable/9 i386 panic [ACPI/timer?]

2012-12-24 Thread David Wolfskill
On Tue, Dec 25, 2012 at 12:35:18AM +0200, Andriy Gapon wrote:
 ...
 Could you please also provide from the same frame
 i reg
 p timehands
 ?

Thank you!  You're the one doing the work. :-}

I had left the kgdb session active; I also included p *timehands just
in case it might be of use:

(kgdb) i reg
eax            0x1          1
ecx            0xc11ba910   -1055151856
edx            0xc72405ff   -953940481
ebx            0x0          0
esp            0x0          0x0
ebp            0xc6fbaa18   0xc6fbaa18
esi            0x1          1
edi            0xc71c8300   -954432768
eip            0xc0ad475c   0xc0ad475c
eflags         0x10086      65670
cs             0x20         32
ss             0xc6fbaa18   -956585448
ds             0x28         40
es             0xc6fb0028   -956628952
fs             0xc71f0008   -954269688
gs             0x0          0
(kgdb) p timehands
$13 = (struct timehands * volatile *) 0xc112a6c8
(kgdb) p *timehands
$14 = {th_counter = 0xc115174c, th_adjustment = 51068786373500, 
  th_scale = 1690726758248, th_offset_count = 3990092176, th_offset = {
sec = 25906, frac = 2296889139262218098}, th_microtime = {
tv_sec = 1356376278, tv_usec = 193941}, th_nanotime = {
tv_sec = 1356376278, tv_nsec = 193941288}, th_generation = 1, 
  th_next = 0x0}
(kgdb) 

Also: the machine has been in service for about 2.5 years, and was
purchased refurbished.  If it turns out that there are hardware
issues, my feelings won't be hurt at all -- I'd merely want to identify
the (likely) failing part(s) and replace them.

Peace,
david
-- 
David H. Wolfskill  da...@catwhisker.org
Taliban: Evil men with guns afraid of truth from a 14-year old girl.

See http://www.catwhisker.org/~david/publickey.gpg for my public key.




Re: stable/9 i386 panic [ACPI/timer?]

2012-12-24 Thread Andriy Gapon
on 25/12/2012 00:39 David Wolfskill said the following:
 I had left the kgdb session active; I also included p *timehands just in
 case it might be of use:

Thank you.
Please also print th0 ... th9.

-- 
Andriy Gapon


Re: stable/9 i386 panic [ACPI/timer?]

2012-12-24 Thread David Wolfskill
On Tue, Dec 25, 2012 at 12:58:00AM +0200, Andriy Gapon wrote:
 on 25/12/2012 00:39 David Wolfskill said the following:
  I had left the kgdb session active; I also included p *timehands just in
  case it might be of use:
 
 Thank you.
 Please also print th0 ... th9.
 ...

Here you go:

(kgdb) p th0
$15 = (struct timehands *) 0xc112a7a8
(kgdb) p th1
$16 = (struct timehands *) 0xc112a7e4
(kgdb) p th2
$17 = (struct timehands *) 0xc112a820
(kgdb) p th3
$18 = (struct timehands *) 0xc112a85c
(kgdb) p th4
$19 = (struct timehands *) 0xc112a898
(kgdb) p th5
$20 = (struct timehands *) 0xc112a8d4
(kgdb) p th6
$21 = (struct timehands *) 0xc112a910
(kgdb) p th7
$22 = (struct timehands *) 0xc112a94c
(kgdb) p th8
$23 = (struct timehands *) 0xc112a988
(kgdb) p th9
$24 = (struct timehands *) 0xc112a9c4
(kgdb) 
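
A rough cross-check of the addresses above, offered as a sketch rather than anything stated in the thread. If the fields of struct timehands are laid out in the order kgdb prints them, with sizes assumed for i386 (4-byte pointers, time_t and long; 64-bit integers aligned to 4 bytes), the structure comes out at 0x3c bytes, which matches the spacing between th0 ... th9. Under the same assumptions th_generation sits at offset 0x34, the same value as the fault virtual address; that would be consistent with a read through a null th_next (the p *timehands output shows th_next = 0x0). The field sizes are assumptions, not taken from the kernel headers.

```python
# Field order follows the kgdb printout of struct timehands in this thread.
# Sizes and alignments are ASSUMPTIONS for FreeBSD/i386 (ILP32).
FIELDS = [
    # (name, size in bytes, alignment in bytes)
    ("th_counter", 4, 4),            # pointer
    ("th_adjustment", 8, 4),         # 64-bit integer, 4-byte aligned on i386
    ("th_scale", 8, 4),
    ("th_offset_count", 4, 4),
    ("th_offset.sec", 4, 4),
    ("th_offset.frac", 8, 4),
    ("th_microtime.tv_sec", 4, 4),
    ("th_microtime.tv_usec", 4, 4),
    ("th_nanotime.tv_sec", 4, 4),
    ("th_nanotime.tv_nsec", 4, 4),
    ("th_generation", 4, 4),
    ("th_next", 4, 4),               # pointer
]


def layout(fields):
    """Return ({field name: offset}, total size) for a C-style struct."""
    offsets = {}
    offset = 0
    for name, size, align in fields:
        # Round the running offset up to the field's alignment.
        offset = (offset + align - 1) // align * align
        offsets[name] = offset
        offset += size
    # Pad the struct to a multiple of its strictest alignment.
    max_align = max(align for _, _, align in fields)
    total = (offset + max_align - 1) // max_align * max_align
    return offsets, total


offsets, size = layout(FIELDS)
stride = 0xc112a7e4 - 0xc112a7a8      # address of th1 minus address of th0

print(hex(size), hex(stride))          # 0x3c 0x3c
print(hex(offsets["th_generation"]))   # 0x34
```

If the assumed sizes were wrong, the computed size would not be expected to match the 0x3c stride seen in the addresses.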

I've copied /boot/kernel/kernel.symbols over, as well: I need to head
out for some errands for a while.

Peace,
david
-- 
David H. Wolfskill  da...@catwhisker.org
Taliban: Evil men with guns afraid of truth from a 14-year old girl.

See http://www.catwhisker.org/~david/publickey.gpg for my public key.




Re: FreeBSD 9.1-RELEASE crashes almost daily; backtraces always list zfs routines

2012-12-24 Thread Andriy Gapon
on 24/12/2012 20:17 Derek Kulinski said the following:
 Hello Andriy,
 
 Monday, December 24, 2012, 8:01:26 AM, you wrote:
 
 on 24/12/2012 00:23 Derek Kulinski said the following:
 Dumping 3701 out of 8072 
 MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91%
 
 So do you have the crash dump(s)?
 
 Yes, but they are 3.5GB each. I attached text dump to GNATS but I can
 resend it to you (I don't know if it's ok to send attachments to the
 mailing list). If you would prefer I could give you access to the
 box.

Derek,

I've looked through the cores and it does look like in all cases some sort of
memory corruption is a precursor to a subsequent crash.

I can't say definitively whether the corruption is caused by the hardware, by
some code overwriting random memory locations (a rogue driver), or by a simpler
bug such as a use-after-free.

I am always inclined to suspect the hardware first.

You can try to reproduce the problem with some additional checks enabled in the
kernel.  Those should catch the problem earlier and thus make its source 
clearer.

I recommend the following:
options INVARIANTS
options INVARIANT_SUPPORT
options WITNESS
options DEBUG_MEMGUARD
makeoptions DEBUG+=-DDEBUG

The last is really needed only for the ZFS and OpenSolaris compat code.  It may
result in some extra noise from unrelated subsystems.
Perhaps you could just add #define DEBUG to
sys/cddl/contrib/opensolaris/uts/common/sys/debug.h.  I haven't tested this
approach though.

Also, please put vm.memguard.desc=arc_buf_hdr_t into loader.conf.

Please note that these options will make your system significantly slower.
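
Before rebuilding, it can be handy to confirm that a kernel config file really contains every option in the list above. The following is a minimal sketch; the sample config text and the ident name in it are hypothetical, and only plain "options NAME" lines are recognised.

```python
# Option names are taken from the recommendation in this message.
RECOMMENDED_OPTIONS = ["INVARIANTS", "INVARIANT_SUPPORT", "WITNESS", "DEBUG_MEMGUARD"]


def missing_options(config_text, wanted=RECOMMENDED_OPTIONS):
    """Return the wanted options that have no 'options NAME' line."""
    present = set()
    for line in config_text.splitlines():
        # Drop trailing comments, then split into whitespace-separated words.
        words = line.split("#", 1)[0].split()
        if len(words) >= 2 and words[0] == "options":
            # 'options NAME=value' counts as NAME being present.
            present.add(words[1].split("=", 1)[0])
    return [name for name in wanted if name not in present]


# Hypothetical config used only to exercise the function.
sample = """
include GENERIC
ident DEBUGKERNEL
options INVARIANTS
options INVARIANT_SUPPORT
options WITNESS   # lock order checking
"""

print(missing_options(sample))   # ['DEBUG_MEMGUARD']
```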

-- 
Andriy Gapon


Re: stable/9 i386 panic [ACPI/timer?]

2012-12-24 Thread Andriy Gapon
on 25/12/2012 01:04 David Wolfskill said the following:
 On Tue, Dec 25, 2012 at 12:58:00AM +0200, Andriy Gapon wrote:
 on 25/12/2012 00:39 David Wolfskill said the following:
 I had left the kgdb session active; I also included p *timehands just in
 case it might be of use:

 Thank you.
 Please also print th0 ... th9.
 ...
 
 Here you go:
 
 (kgdb) p th0
 $15 = (struct timehands *) 0xc112a7a8
 (kgdb) p th1
 $16 = (struct timehands *) 0xc112a7e4
 (kgdb) p th2
 $17 = (struct timehands *) 0xc112a820
 (kgdb) p th3
 $18 = (struct timehands *) 0xc112a85c
 (kgdb) p th4
 $19 = (struct timehands *) 0xc112a898
 (kgdb) p th5
 $20 = (struct timehands *) 0xc112a8d4
 (kgdb) p th6
 $21 = (struct timehands *) 0xc112a910

Comparing the above and the following from an earlier email:
 (kgdb) p timehands
 $1 = (struct timehands * volatile) 0xc11ba910
and the following:
 (kgdb) p th5
 $7 = {th_counter = 0xc115174c, th_adjustment = 51068786373500, 
   th_scale = 1690726758248, th_offset_count = 3990081323, th_offset = {
 sec = 25906, frac = 2278539681754952554}, th_microtime = {
 tv_sec = 1356376278, tv_usec = 192946}, th_nanotime = {
 tv_sec = 1356376278, tv_nsec = 192946562}, th_generation = 669311, 
   th_next = 0xc11ba910}


I am quite sure that the impossible happened only because the faulty memory made
it possible.
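
The comparison can be spelled out mechanically. The sketch below assumes, as the printed values suggest, that th0 ... th9 form a ring in which each th_next holds the address of the following structure. It uses the th_next values from the earlier dump (th9's was cut off, so only th0 ... th8 are checked) and reports which links disagree and in which bits.

```python
# Addresses of th0..th9 as printed by "p th0" ... "p th9".
TH_ADDR = [0xc112a7a8, 0xc112a7e4, 0xc112a820, 0xc112a85c, 0xc112a898,
           0xc112a8d4, 0xc112a910, 0xc112a94c, 0xc112a988, 0xc112a9c4]

# th_next values of th0..th8 from the earlier structure dump.
TH_NEXT = [0xc112a7e4, 0xc112a820, 0xc112a85c, 0xc112a898, 0xc112a8d4,
           0xc11ba910, 0xc112a94c, 0xc112a988, 0xc112a9c4]


def find_bad_links(addrs, nexts):
    """Return (index, expected, observed, differing bit numbers) per bad link."""
    bad = []
    for i, observed in enumerate(nexts):
        # ASSUMPTION: thN.th_next should point at th(N+1), wrapping at the end.
        expected = addrs[(i + 1) % len(addrs)]
        if observed != expected:
            diff = observed ^ expected
            bits = [b for b in range(32) if (diff >> b) & 1]
            bad.append((i, expected, observed, bits))
    return bad


for index, expected, observed, bits in find_bad_links(TH_ADDR, TH_NEXT):
    print("th%d.th_next: expected %#x, observed %#x, differing bits %s"
          % (index, expected, observed, bits))
# th5.th_next: expected 0xc112a910, observed 0xc11ba910, differing bits [16, 19]
```

Only th5's link is off, and by two bits within a single byte; the observed value is also the address that timehands itself held. That pattern fits the faulty-memory explanation, though it does not prove it on its own.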

 (kgdb) p th7
 $22 = (struct timehands *) 0xc112a94c
 (kgdb) p th8
 $23 = (struct timehands *) 0xc112a988
 (kgdb) p th9
 $24 = (struct timehands *) 0xc112a9c4
 (kgdb) 
 
 I've copied /boot/kernel/kernel.symbols over, as well: I need to head
 out for some errands for a while.
 
 Peace,
 david
 


-- 
Andriy Gapon


Re: FreeBSD 9.1-RELEASE crashes almost daily; backtraces always list zfs routines

2012-12-24 Thread Derek Kulinski
Hello Andriy,

Monday, December 24, 2012, 3:28:00 PM, you wrote:

 I've looked through the cores and it does look like in all cases some sort of
 memory corruption is a precursor to a subsequent crash.

 I can't decidedly say if the corruptions are caused by the hardware, by some
 code overwriting random memory locations (rogue driver) or by a simpler 
 bug
 like use after free.

 I am always inclined to suspect the hardware first.

 You can try to reproduce the problem with some additional checks enabled in 
 the
 kernel.  Those should catch the problem earlier and thus make its source 
 clearer.

 I recommend the following:
 options INVARIANTS
 options INVARIANT_SUPPORT
 options WITNESS
 options DEBUG_MEMGUARD
 makeoptions DEBUG+=-DDEBUG

 The last is really needed only for the ZFS and OpenSolaris compat code.  It may
 result in some extra noise from unrelated subsystems.
 Perhaps you could just add #define DEBUG to
 sys/cddl/contrib/opensolaris/uts/common/sys/debug.h.  I haven't tested this
 approach though.

 Also, please put vm.memguard.desc=arc_buf_hdr_t into loader.conf.

 Please note that these options will make your system significantly slower.

I recompiled the kernel and am now running it with the options you specified
(I enabled DEBUG in the file).

Even at boot time, though, I started getting the following warnings.  Is this
anything to worry about?

Dec 24 16:06:03 chinatsu kernel: Creating and/or trimming log files
Dec 24 16:06:03 chinatsu kernel: lock order reversal:
Dec 24 16:06:03 chinatsu kernel: 1st 0x80bf5780 pf task mtx (pf task 
mtx) @ /usr/src/sys/contrib/pf/net/pf.c:3330
Dec 24 16:06:03 chinatsu kernel: .
Dec 24 16:06:03 chinatsu kernel: 2nd 0xfe0009211af8 radix node head (radix 
node head) @ /usr/src/sys/net/route.c:384
Dec 24 16:06:03 chinatsu kernel: KDB: stack backtrace:
Dec 24 16:06:03 chinatsu kernel: db_trace_self_wrapper() at 
db_trace_self_wrapper+0x2a
Dec 24 16:06:03 chinatsu kernel: kdb_backtrace() at kdb_backtrace+0x37
Dec 24 16:06:03 chinatsu kernel: _witness_debugger() at _witness_debugger+0x2c
Dec 24 16:06:03 chinatsu kernel: witness_checkorder() at 
witness_checkorder+0x844
Dec 24 16:06:03 chinatsu kernel: _rw_rlock() at
Dec 24 16:06:03 chinatsu kernel: Starting syslogd.
Dec 24 16:06:03 chinatsu kernel: _rw_rlock+0x81
Dec 24 16:06:03 chinatsu kernel: rtalloc1_fib() at rtalloc1_fib+0x11c
Dec 24 16:06:03 chinatsu kernel: rtalloc_ign_fib() at rtalloc_ign_fib+0xc5
Dec 24 16:06:03 chinatsu kernel: pf_routable() at pf_routable+0x1fd
Dec 24 16:06:03 chinatsu kernel: pf_test_rule() at pf_test_rule+0x6cf
Dec 24 16:06:03 chinatsu kernel: pf_test() at pf_test+0xf58
Dec 24 16:06:03 chinatsu kernel: pf_check_in() at pf_check_in+0x2b
Dec 24 16:06:03 chinatsu kernel: pfil_run_hooks() at pfil_run_hooks+0xd2
Dec 24 16:06:03 chinatsu kernel: ip_input() at ip_input+0x2dc
Dec 24 16:06:03 chinatsu kernel: netisr_dispatch_src() at 
netisr_dispatch_src+0x170
Dec 24 16:06:03 chinatsu kernel: ether_demux() at ether_demux+0x17d
Dec 24 16:06:03 chinatsu kernel: ether_nh_input() at ether_nh_input+0x209
Dec 24 16:06:03 chinatsu kernel: netisr_dispatch_src() at 
netisr_dispatch_src+0x170
Dec 24 16:06:03 chinatsu kernel: alc_int_task() at alc_int_task+0x2ff
Dec 24 16:06:03 chinatsu kernel: taskqueue_run_locked() at 
taskqueue_run_locked+0x93
Dec 24 16:06:03 chinatsu kernel: taskqueue_thread_loop() at 
taskqueue_thread_loop+0x3e
Dec 24 16:06:03 chinatsu kernel: fork_exit() at fork_exit+0x133
Dec 24 16:06:03 chinatsu kernel: fork_trampoline() at fork_trampoline+0xe
Dec 24 16:06:03 chinatsu kernel: --- trap 0, rip = 0, rsp = 0xff85fb2ebbb0, 
rbp = 0 ---
Dec 24 16:06:03 chinatsu kernel: No core dumps found.
Dec 24 16:06:04 chinatsu kernel: lock order reversal:
Dec 24 16:06:04 chinatsu kernel: 1st 0xff85b9cb8dd8 bufwait (bufwait) @ 
/usr/src/sys/kern/vfs_bio.c:2677
Dec 24 16:06:04 chinatsu kernel: 2nd 0xfe00092c5c00 dirhash (dirhash) @ 
/usr/src/sys/ufs/ufs/ufs_dirhash.c:284
Dec 24 16:06:04 chinatsu kernel: KDB: stack backtrace:
Dec 24 16:06:04 chinatsu kernel: db_trace_self_wrapper() at 
db_trace_self_wrapper+0x2a
Dec 24 16:06:04 chinatsu kernel: kdb_backtrace() at kdb_backtrace+0x37
Dec 24 16:06:04 chinatsu kernel: _witness_debugger() at _witness_debugger+0x2c
Dec 24 16:06:04 chinatsu kernel: witness_checkorder() at 
witness_checkorder+0x844
Dec 24 16:06:04 chinatsu kernel: _sx_xlock() at _sx_xlock+0x61
Dec 24 16:06:04 chinatsu kernel: ufsdirhash_acquire() at ufsdirhash_acquire+0x33
Dec 24 16:06:04 chinatsu kernel: ufsdirhash_remove() at
Dec 24 16:06:04 chinatsu kernel: ufsdirhash_remove+0x16
Dec 24 16:06:04 chinatsu kernel: ufs_dirremove() at ufs_dirremove+0x1bb
Dec 24 16:06:04 chinatsu kernel: ufs_remove() at ufs_remove+0x92
Dec 24 16:06:04 chinatsu kernel: VOP_REMOVE_APV() at VOP_REMOVE_APV+0xb7
Dec 24 16:06:04 chinatsu kernel: kern_unlinkat() at kern_unlinkat+0x2eb
Dec 24 16:06:04 chinatsu kernel: amd64_syscall() at amd64_syscall+0x30e
Dec 24 16:06:04 chinatsu 

Re: stable/9 i386 panic [ACPI/timer?]

2012-12-24 Thread David Wolfskill
On Tue, Dec 25, 2012 at 01:33:15AM +0200, Andriy Gapon wrote:
 ...
  (kgdb) p th6
  $21 = (struct timehands *) 0xc112a910
 
 Comparing the above and the following from an earlier email:
  (kgdb) p timehands
  $1 = (struct timehands * volatile) 0xc11ba910
 and the following:
  (kgdb) p th5
  $7 = {th_counter = 0xc115174c, th_adjustment = 51068786373500, 
th_scale = 1690726758248, th_offset_count = 3990081323, th_offset = {
  sec = 25906, frac = 2278539681754952554}, th_microtime = {
  tv_sec = 1356376278, tv_usec = 192946}, th_nanotime = {
  tv_sec = 1356376278, tv_nsec = 192946562}, th_generation = 669311, 
th_next = 0xc11ba910}
 
 
 I am quite sure that the impossible happened only because the faulty memory 
 made
 it possible.

Ah.  Well, that's not unreasonable, then.

I have (2) 1GB DIMMs + (2) 512MB DIMMs in the machine presently.  Since
I bought the 1GB DIMMs more recently, I'll just pull the 512MB DIMMs for
now, and if that causes things to settle down, I'll plan on buying a
couple more 1GB DIMMs to replace the 512MB DIMMs.

Thank you very much for your help!

 ...

Peace,
david
-- 
David H. Wolfskill  da...@catwhisker.org
Taliban: Evil men with guns afraid of truth from a 14-year old girl.

See http://www.catwhisker.org/~david/publickey.gpg for my public key.




CAM hangs in 9-STABLE? [Was: NFS/ZFS hangs after upgrading from 9.0-RELEASE to -STABLE]

2012-12-24 Thread olivier
Dear All
It turns out that reverting to an older version of the mps driver did not
fix the ZFS hangs I've been struggling with in 9.1 and 9-STABLE after all
(they just took a bit longer to occur again, possibly just by chance). I
followed steps along lines suggested by Andriy to collect more information
when the problem occurs. Hopefully this will help figure out what's going
on.

As far as I can tell, what happens is that at some point IO operations to a
bunch of drives that belong to different pools get stuck. For these drives,
gstat shows no activity but 1 pending operation, as such:

 L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w    d/s   kBps   ms/d  %busy Name
    1      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da1

I've been running gstat in a loop (every 100s) to monitor the machine. Just
before the hang occurs, everything seems fine (see full gstat output
below). Right after the hang occurs a number of drives seem stuck (see full
gstat output below). Notably, some stuck drives are seen through the mps
driver and others through the mpt driver. So the problem doesn't seem to be
driver-specific. I have had the problem occur (at a lower frequency) on
similar machines that don't use the mpt driver (and only have 1 disk
provided through mps), so the problem doesn't seem to be caused by the mpt
driver (and is likely not caused by defective hardware). Since based on the
information I provided earlier Andriy thinks the problem might not
originate in ZFS, perhaps that means that the problem is in the CAM layer?
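
The "no activity but one pending operation" signature described above can be picked out of captured gstat output automatically. This is a minimal sketch that assumes rows shaped like the gstat output quoted in this message (queue length first, ops/s second, device name last); the sample rows are illustrative, not a real capture.

```python
def stuck_devices(gstat_text):
    """Return names of devices with queued requests but zero ops/s."""
    stuck = []
    for row in gstat_text.splitlines():
        fields = row.split()
        if len(fields) < 3:
            continue
        try:
            queue_len = int(fields[0])       # L(q) column
            ops_per_sec = float(fields[1])   # ops/s column
        except ValueError:
            continue                         # header or other non-data line
        if queue_len > 0 and ops_per_sec == 0:
            stuck.append(fields[-1])         # device name is the last column
    return stuck


# Illustrative rows modelled on the output in this message.
sample = """\
 L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w  %busy Name
    1      0      0      0    0.0      0      0    0.0    0.0  da1
    1     85     48     79    4.7     35     84    0.5   24.3  da4
    0      0      0      0    0.0      0      0    0.0    0.0  da2
"""

print(stuck_devices(sample))   # ['da1']
```

A device that shows up in two consecutive samples taken 100 s apart is a much stronger hint of a hang than a single sample, since a busy disk can briefly show a queued request with no completed operations.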

camcontrol tags -v (as suggested by Andriy) in the hung state shows for
example

(pass56:mpt1:0:8:20): dev_openings  254
(pass56:mpt1:0:8:20): dev_active    1
(pass56:mpt1:0:8:20): devq_openings 254
(pass56:mpt1:0:8:20): devq_queued   0
(pass56:mpt1:0:8:20): held          0
(pass56:mpt1:0:8:20): mintags       2
(pass56:mpt1:0:8:20): maxtags       255
(I'm not providing full camcontrol tags output below because I couldn't get
it to run during the specific hang I documented most thoroughly; the
example above is from a different occurrence of the hang).

The buses don't seem completely frozen: if I manually remove drives while
the machine is hanging, that's picked up by the mpt driver, which prints
out corresponding messages to the console. But camcontrol reset all or
rescan all don't seem to do anything.

I've tried reducing vfs.zfs.vdev.min_pending and vfs.zfs.vdev.max_pending
to 1, to no avail.

Any suggestions to resolve this problem, work around it, or further
investigate it would be greatly appreciated!
Thanks a lot
Olivier

Detailed information:

Output of procstat -a -kk when the machine is hanging is available at
http://pastebin.com/7D2KtT35 (not putting it here because it's pretty long)

dmesg is available at http://pastebin.com/9zJQwWJG . Note that I'm using
LUN masking, so the illegal requests reported aren't really errors. Maybe
one day if I get my problems sorted out I'll use geom multipathing instead.

My kernel config is
include GENERIC
ident MYKERNEL

options IPSEC
device crypto

options OFED # Infiniband protocol

device mlx4ib # ConnectX Infiniband support
device mlxen # ConnectX Ethernet support
device mthca # Infinihost cards
device ipoib # IP over IB devices

options ATA_CAM # Handle legacy controllers with CAM
options ATA_STATIC_ID   # Static device numbering

options KDB
options DDB



Full output of gstat just before the hang (at most 100s before the hang):
 L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w    d/s   kBps   ms/d  %busy Name
    0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da2
    0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da0
    0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da2/da2
    0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da0/da0
    1     85     48     79    4.7     35     84    0.5      0      0    0.0   24.3  da1
    0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da1/da1
    1     83     47     77    4.3     34     79    0.5      0      0    0.0   22.1  da4
    1   1324   1303  21433    0.6     19     42    0.7      0      0    0.0   79.8  da3
    0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da5
    0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da6
    0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da7
    0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da8
    0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da9
    0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da10
    0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da11
    0      0      0      0    0.0      0      0