A second attempt to come up on the new kernel was successful -- so this had to be due to queued I/Os that were pending at the time of the shutdown....
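As a rough cross-check of the stack-exhaustion reading, the addresses in the quoted report below can be compared against the configured stack size. This is only a sketch: it takes `rsp` from the double-fault message and the `frame=` argument of `fork_exit` (frame #312), and it assumes that the kernel stack is page-aligned, that it ends at the next 4 KiB boundary above that trapframe, and that it spans exactly the 6 pages set via `kern.kstack_pages`. The variable names are mine, not anything from the kernel.

```python
PAGE_SIZE = 4096       # amd64 base page size
KSTACK_PAGES = 6       # kern.kstack_pages from the quoted loader.conf

rsp_at_fault = 0xfffffe066742af90    # "rsp" in the double-fault message
top_trapframe = 0xfffffe0667430c00   # "frame=" argument of fork_exit, frame #312

# Assumption: the stack ends at the first page boundary above the trapframe.
stack_top = (top_trapframe + PAGE_SIZE - 1) // PAGE_SIZE * PAGE_SIZE
stack_size = KSTACK_PAGES * PAGE_SIZE
stack_bottom = stack_top - stack_size

# Positive means rsp had already dropped below the lowest stack page.
overshoot = stack_bottom - rsp_at_fault

print(f"stack size       : {stack_size} bytes")
print(f"assumed range    : {stack_bottom:#x} .. {stack_top:#x}")
print(f"rsp below bottom : {overshoot} bytes")
```

Under those assumptions the faulting `rsp` sits just below the lowest page of the stack, which is what one would expect if the recursion walked off the end of it.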
On 1/11/2017 08:31, Karl Denninger wrote:
> During the reboot, immediately after the daemons started up on the
> machine (the boot got beyond mounting all the disks and was well into
> starting up all the background stuff it runs), I got a double-fault.
>
> ..... (there were a LOT more of this same; it pretty clearly was a
> recursive call sequence that ran the system out of stack space)
>
> #294 0xffffffff822fdcfd in zio_execute (zio=<value optimized out>)
>     at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:1666
> #295 0xffffffff8230130e in zio_vdev_io_start (zio=0xfffff8010c8f27b0)
>     at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:3127
> #296 0xffffffff822fdcfd in zio_execute (zio=<value optimized out>)
>     at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:1666
> #297 0xffffffff822e464d in vdev_queue_io_done (zio=<value optimized out>)
>     at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_queue.c:913
> #298 0xffffffff823014c9 in zio_vdev_io_done (zio=0xfffff8010cff0b88)
>     at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:3152
> #299 0xffffffff822fdcfd in zio_execute (zio=<value optimized out>)
>     at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:1666
> #300 0xffffffff8230130e in zio_vdev_io_start (zio=0xfffff8010cff0b88)
>     at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:3127
> #301 0xffffffff822fdcfd in zio_execute (zio=<value optimized out>)
>     at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:1666
> #302 0xffffffff822e464d in vdev_queue_io_done (zio=<value optimized out>)
>     at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_queue.c:913
> #303 0xffffffff823014c9 in zio_vdev_io_done (zio=0xfffff8010c962000)
>     at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:3152
> #304 0xffffffff822fdcfd in zio_execute (zio=<value optimized out>)
>     at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:1666
> #305 0xffffffff8230130e in zio_vdev_io_start (zio=0xfffff8010c962000)
>     at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:3127
> #306 0xffffffff822fdcfd in zio_execute (zio=<value optimized out>)
>     at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:1666
> #307 0xffffffff822e464d in vdev_queue_io_done (zio=<value optimized out>)
>     at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_queue.c:913
> #308 0xffffffff823014c9 in zio_vdev_io_done (zio=0xfffff80102175000)
>     at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:3152
> #309 0xffffffff822fdcfd in zio_execute (zio=<value optimized out>)
>     at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:1666
> #310 0xffffffff80b2585a in taskqueue_run_locked (queue=<value optimized out>)
>     at /usr/src/sys/kern/subr_taskqueue.c:454
> #311 0xffffffff80b26a48 in taskqueue_thread_loop (arg=<value optimized out>)
>     at /usr/src/sys/kern/subr_taskqueue.c:724
> #312 0xffffffff80a7eb05 in fork_exit (
>     callout=0xffffffff80b26960 <taskqueue_thread_loop>,
>     arg=0xfffff800b8824c30, frame=0xfffffe0667430c00)
>     at /usr/src/sys/kern/kern_fork.c:1040
> #313 0xffffffff80f87c3e in fork_trampoline ()
>     at /usr/src/sys/amd64/amd64/exception.S:611
> #314 0x0000000000000000 in ?? ()
> Current language: auto; currently minimal
> (kgdb)
>
> .....
>
>
> NewFS.denninger.net dumped core - see /var/crash/vmcore.3
>
> Wed Jan 11 08:15:33 CST 2017
>
> FreeBSD NewFS.denninger.net 11.0-STABLE FreeBSD 11.0-STABLE #14
> r311927M: Wed Jan 11 07:55:20 CST 2017
> k...@newfs.denninger.net:/usr/obj/usr/src/sys/KSD-SMP
> amd64
>
> panic: double fault
>
> GNU gdb 6.1.1 [FreeBSD]
> Copyright 2004 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and you are
> welcome to change it and/or distribute copies of it under certain
> conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB. Type "show warranty" for details.
> This GDB was configured as "amd64-marcel-freebsd"...
>
> Unread portion of the kernel message buffer:
>
> Fatal double fault
> rip = 0xffffffff822e3c5d
> rsp = 0xfffffe066742af90
> rbp = 0xfffffe066742b420
> cpuid = 15; apic id = 35
> panic: double fault
> cpuid = 15
> KDB: stack backtrace:
> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe0649ddee30
> vpanic() at vpanic+0x186/frame 0xfffffe0649ddeeb0
> panic() at panic+0x43/frame 0xfffffe0649ddef10
> dblfault_handler() at dblfault_handler+0xa2/frame 0xfffffe0649ddef30
> Xdblfault() at Xdblfault+0xac/frame 0xfffffe0649ddef30
> --- trap 0x17, rip = 0xffffffff822e3c5d, rsp = 0xfffffe066742af90, rbp = 0xfffffe066742b420 ---
>
> # Work around for this CPU from 11.x eratta
> vm.pmap.pcid_enabled=0
> #
> #
> # Try to avoid kernel stack exhaustion due to TRIM storms.
> kern.kstack_pages="6"
>
> I have kstack_pages set to "6" to try to avoid another panic that I got
> occasionally during zfs backup operations which appeared to be linked to
> "too many" TRIMs, and looks very similar to this one.
>
> I rebooted back to kernel.old, which was built in October, and the
> machine came up normally. I'll try the newer build again and see if
> this was transient and related to delayed TRIM operations on the disks
> related to the installworld/installkernel. But if it is then it remains
> a problem -- and setting stackpages didn't help!
>
> I've got the dump if anything in particular would be of help.
>
> The prompt to do this in the first place was the openssh CVE that was
> recently issued.....
>
>

--
Karl Denninger
k...@denninger.net <mailto:k...@denninger.net>
/The Market Ticker/
/[S/MIME encrypted email preferred]/