A second attempt to boot the new kernel succeeded, so the double fault
must have been triggered by queued I/Os that were still pending at the
time of the shutdown.


On 1/11/2017 08:31, Karl Denninger wrote:
> During the reboot, immediately after the daemons started up on the
> machine (the boot got beyond mounting all the disks and was well into
> starting up all the background stuff it runs), I got a double-fault.
>
> ..... (there were a LOT more of this same; it pretty clearly was a
> recursive call sequence that ran the system out of stack space)
>
> #294 0xffffffff822fdcfd in zio_execute (zio=<value optimized out>)
>     at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:1666
> #295 0xffffffff8230130e in zio_vdev_io_start (zio=0xfffff8010c8f27b0)
>     at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:3127
> #296 0xffffffff822fdcfd in zio_execute (zio=<value optimized out>)
>     at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:1666
> #297 0xffffffff822e464d in vdev_queue_io_done (zio=<value optimized out>)
>     at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_queue.c:913
> #298 0xffffffff823014c9 in zio_vdev_io_done (zio=0xfffff8010cff0b88)
>     at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:3152
> #299 0xffffffff822fdcfd in zio_execute (zio=<value optimized out>)
>     at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:1666
> #300 0xffffffff8230130e in zio_vdev_io_start (zio=0xfffff8010cff0b88)
>     at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:3127
> #301 0xffffffff822fdcfd in zio_execute (zio=<value optimized out>)
>     at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:1666
> #302 0xffffffff822e464d in vdev_queue_io_done (zio=<value optimized out>)
>     at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_queue.c:913
> #303 0xffffffff823014c9 in zio_vdev_io_done (zio=0xfffff8010c962000)
>     at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:3152
> #304 0xffffffff822fdcfd in zio_execute (zio=<value optimized out>)
>     at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:1666
> #305 0xffffffff8230130e in zio_vdev_io_start (zio=0xfffff8010c962000)
>     at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:3127
> #306 0xffffffff822fdcfd in zio_execute (zio=<value optimized out>)
>     at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:1666
> #307 0xffffffff822e464d in vdev_queue_io_done (zio=<value optimized out>)
>     at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_queue.c:913
> #308 0xffffffff823014c9 in zio_vdev_io_done (zio=0xfffff80102175000)
>     at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:3152
> #309 0xffffffff822fdcfd in zio_execute (zio=<value optimized out>)
>     at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:1666
> #310 0xffffffff80b2585a in taskqueue_run_locked (queue=<value optimized out>)
>     at /usr/src/sys/kern/subr_taskqueue.c:454
> #311 0xffffffff80b26a48 in taskqueue_thread_loop (arg=<value optimized out>)
>     at /usr/src/sys/kern/subr_taskqueue.c:724
> #312 0xffffffff80a7eb05 in fork_exit (
>     callout=0xffffffff80b26960 <taskqueue_thread_loop>,
>     arg=0xfffff800b8824c30, frame=0xfffffe0667430c00)
>     at /usr/src/sys/kern/kern_fork.c:1040
> #313 0xffffffff80f87c3e in fork_trampoline ()
>     at /usr/src/sys/amd64/amd64/exception.S:611
> #314 0x0000000000000000 in ?? ()
> Current language:  auto; currently minimal
> (kgdb)
>
> .....
>
>
> NewFS.denninger.net dumped core - see /var/crash/vmcore.3
>
> Wed Jan 11 08:15:33 CST 2017
>
> FreeBSD NewFS.denninger.net 11.0-STABLE FreeBSD 11.0-STABLE #14 r311927M:
> Wed Jan 11 07:55:20 CST 2017
> k...@newfs.denninger.net:/usr/obj/usr/src/sys/KSD-SMP amd64
>
> panic: double fault
>
> GNU gdb 6.1.1 [FreeBSD]
> Copyright 2004 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and you are
> welcome to change it and/or distribute copies of it under certain
> conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB.  Type "show warranty" for details.
> This GDB was configured as "amd64-marcel-freebsd"...
>
> Unread portion of the kernel message buffer:
>
> Fatal double fault
> rip = 0xffffffff822e3c5d
> rsp = 0xfffffe066742af90
> rbp = 0xfffffe066742b420
> cpuid = 15; apic id = 35
> panic: double fault
> cpuid = 15
> KDB: stack backtrace:
> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe0649ddee30
> vpanic() at vpanic+0x186/frame 0xfffffe0649ddeeb0
> panic() at panic+0x43/frame 0xfffffe0649ddef10
> dblfault_handler() at dblfault_handler+0xa2/frame 0xfffffe0649ddef30
> Xdblfault() at Xdblfault+0xac/frame 0xfffffe0649ddef30
> --- trap 0x17, rip = 0xffffffff822e3c5d, rsp = 0xfffffe066742af90, rbp = 0xfffffe066742b420 ---
>
> # Workaround for this CPU from the 11.x errata
> vm.pmap.pcid_enabled=0
> #
> #
> # Try to avoid kernel stack exhaustion due to TRIM storms.
> kern.kstack_pages="6"
>
> I have kstack_pages set to "6" to try to avoid another panic that I got
> occasionally during ZFS backup operations, which appeared to be linked to
> "too many" TRIMs and looked very similar to this one.
>
> I rebooted back to kernel.old, which was built in October, and the
> machine came up normally.  I'll try the newer build again and see if
> this was transient and caused by delayed TRIM operations on the disks
> triggered by the installworld/installkernel.  But if it was, then it
> remains a problem -- and setting kstack_pages didn't help!
>
> I've got the dump if anything in particular would be of help.
>
> What prompted the update in the first place was the recently issued
> OpenSSH CVE.
>
>
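
As a rough sanity check of the stack-exhaustion reading, the addresses in
the quoted output can be compared against the configured stack size. This
is only a sketch: it assumes 4 KiB pages, that kern.kstack_pages="6" was in
effect for the faulting thread, and that the top of that thread's kernel
stack is the first page boundary above the frame= value kgdb shows for
fork_exit (#312). None of those assumptions is confirmed by the dump itself.

```python
PAGE_SIZE = 4096      # assumption: 4 KiB pages on amd64
KSTACK_PAGES = 6      # from kern.kstack_pages="6" in the quoted config

rsp_at_fault = 0xfffffe066742af90     # "rsp" in the double-fault message
fork_exit_frame = 0xfffffe0667430c00  # frame= argument in backtrace #312

# Assumption: the stack top is the first page boundary above fork_exit's frame.
stack_top = (fork_exit_frame + PAGE_SIZE - 1) // PAGE_SIZE * PAGE_SIZE
stack_bottom = stack_top - KSTACK_PAGES * PAGE_SIZE

# Bytes consumed between the outermost kernel frame and the faulting rsp.
used = fork_exit_frame - rsp_at_fault
# Positive value: rsp lies this many bytes below the assumed stack bottom.
overrun = stack_bottom - rsp_at_fault

print(f"stack budget : {KSTACK_PAGES * PAGE_SIZE} bytes")
print(f"used to rsp  : {used} bytes")
print(f"stack bottom : {stack_bottom:#x}")
print(f"rsp overrun  : {overrun} bytes below the assumed bottom")
```

Under those assumptions the thread had used 23664 bytes of a 24576-byte
stack below fork_exit's frame, and rsp sits 112 bytes below the computed
bottom of the stack, which is what one would expect if the recursion ran
off the end of the stack.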

-- 
Karl Denninger
k...@denninger.net
/The Market Ticker/
/[S/MIME encrypted email preferred]/
