Fatal trap 1 [Was: Memory modified after free - by whom?]
on 22/12/2012 02:21 Garrett Cooper said the following: Fatal trap 1: privileged instruction fault while in kernel mode Fatal trap 1: privileged instruction fault while in kernel mode Unrelated to the original topic - this looks very weird. I mean all the CPUs getting this unusual trap... Could you please do 'disassemble 0x80af5099' in kgdb with the same kernel. Or if you have a different kernel now, please use instruction pointer value from a trap with that kernel. Memory modified after free 0xff800040d000(9216) val=5a5a5a5a @ 0xff800040d000 Fatal trap 1: privileged instruction fault while in kernel mode cpuid = 3; cpuid = 1; apic id = 02 cpuid = 0; apic id = 06 apic id = 00 instruction pointer = 0x20:0x80af5099 instruction pointer = 0x20:0x80af5099 instruction pointer = 0x20:0x80af5099 Fatal trap 1: privileged instruction fault while in kernel mode stack pointer = 0x28:0xff8496fff880 stack pointer = 0x28:0xff8496fe1880 cpuid = 2; frame pointer= 0x28:0xff8496fff8b0 frame pointer = 0x28:0xff8496fe18b0 stack pointer = 0x28:0xff849705d880 code segment= base 0x0, limit 0xf, type 0x1b frame pointer = 0x28:0xff849705d8b0 apic id = 04 code segment= base 0x0, limit 0xf, type 0x1b code segment= base 0x0, limit 0xf, type 0x1b = DPL 0, pres 1, long 1, def32 0, gran 1 = DPL 0, pres 1, long 1, def32 0, gran 1 instruction pointer = 0x20:0x80af5099 processor eflags= = DPL 0, pres 1, long 1, def32 0, gran 1 interrupt enabled, processor eflags = stack pointer = 0x28:0xff8497067880 interrupt enabled, resume, resume, frame pointer= 0x28:0xff84970678b0 IOPL = 0 code segment= base 0x0, limit 0xf, type 0x1b current process = = DPL 0, pres 1, long 1, def32 0, gran 1 processor eflags= 12 (irq280: ix0:que 3) ilock order reversal: (Giant after non-sleepable) 1st 0xfe0078148b38 ix0:rx(3) (ix0:rx(3)) @ /usr/src/sys/modules/ixgbe/../../dev/ixgbe/ixgbe.c:4296 2nd 0x814457b8 Giant (Giant) @ /usr/src/sys/dev/usb/input/ukbd.c:1946 KDB: stack backtrace: db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xff8496fff320 kdb_backtrace() at kdb_backtrace+0x39/frame 0xff8496fff3d0 witness_checkorder() at witness_checkorder+0xc47/frame 0xff8496fff450 __mtx_lock_flags() at __mtx_lock_flags+0x89/frame 0xff8496fff490 ukbd_poll() at ukbd_poll+0x28/frame 0xff8496fff4b0 kbdmux_poll() at kbdmux_poll+0x5b/frame 0xff8496fff4d0 cngrab() at cngrab+0x35/frame 0xff8496fff4f0 kdb_trap() at kdb_trap+0x124/frame 0xff8496fff550 trap_fatal() at trap_fatal+0x345/frame 0xff8496fff5b0 trap() at trap+0x836/frame 0xff8496fff7c0 calltrap() at calltrap+0x8/frame 0xff8496fff7c0 --- trap 0x1, rip = 0x80af5099, rsp = 0xff8496fff880, rbp = 0xff8496fff8b0 --- uma_find_refcnt() at uma_find_refcnt+0x79/frame 0xff8496fff8b0 -- Andriy Gapon ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org
Re: Fatal trap 1 [Was: Memory modified after free - by whom?]
On Sat, Dec 22, 2012 at 01:08:10PM +0200, Andriy Gapon wrote: on 22/12/2012 02:21 Garrett Cooper said the following: Fatal trap 1: privileged instruction fault while in kernel mode Fatal trap 1: privileged instruction fault while in kernel mode Unrelated to the original topic - this looks very weird. I mean all the CPUs getting this unusual trap... Could you please do 'disassemble 0x80af5099' in kgdb with the same kernel. Or if you have a different kernel now, please use instruction pointer value from a trap with that kernel. This is due to the vtoslab() returning NULL. Since slabref is dereferenced later, clang tries to be helpful as usual and converts the !(p-flags PG_SLAB) case from vtoslab() into the jump to un2 instruction if vtoslab() result is NULL. So instead of KASSERT triggering the next line, you see this improvement. Memory modified after free 0xff800040d000(9216) val=5a5a5a5a @ 0xff800040d000 Fatal trap 1: privileged instruction fault while in kernel mode cpuid = 3; cpuid = 1; apic id = 02 cpuid = 0; apic id = 06 apic id = 00 instruction pointer = 0x20:0x80af5099 instruction pointer = 0x20:0x80af5099 instruction pointer = 0x20:0x80af5099 Fatal trap 1: privileged instruction fault while in kernel mode stack pointer = 0x28:0xff8496fff880 stack pointer = 0x28:0xff8496fe1880 cpuid = 2; frame pointer= 0x28:0xff8496fff8b0 frame pointer = 0x28:0xff8496fe18b0 stack pointer = 0x28:0xff849705d880 code segment= base 0x0, limit 0xf, type 0x1b frame pointer = 0x28:0xff849705d8b0 apic id = 04 code segment= base 0x0, limit 0xf, type 0x1b code segment= base 0x0, limit 0xf, type 0x1b = DPL 0, pres 1, long 1, def32 0, gran 1 = DPL 0, pres 1, long 1, def32 0, gran 1 instruction pointer = 0x20:0x80af5099 processor eflags= = DPL 0, pres 1, long 1, def32 0, gran 1 interrupt enabled, processor eflags = stack pointer = 0x28:0xff8497067880 interrupt enabled, resume, resume, frame pointer= 0x28:0xff84970678b0 IOPL = 0 code segment= base 0x0, limit 0xf, type 0x1b current process = = DPL 0, pres 1, long 1, def32 0, gran 1 processor eflags= 12 (irq280: ix0:que 3) ilock order reversal: (Giant after non-sleepable) 1st 0xfe0078148b38 ix0:rx(3) (ix0:rx(3)) @ /usr/src/sys/modules/ixgbe/../../dev/ixgbe/ixgbe.c:4296 2nd 0x814457b8 Giant (Giant) @ /usr/src/sys/dev/usb/input/ukbd.c:1946 KDB: stack backtrace: db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xff8496fff320 kdb_backtrace() at kdb_backtrace+0x39/frame 0xff8496fff3d0 witness_checkorder() at witness_checkorder+0xc47/frame 0xff8496fff450 __mtx_lock_flags() at __mtx_lock_flags+0x89/frame 0xff8496fff490 ukbd_poll() at ukbd_poll+0x28/frame 0xff8496fff4b0 kbdmux_poll() at kbdmux_poll+0x5b/frame 0xff8496fff4d0 cngrab() at cngrab+0x35/frame 0xff8496fff4f0 kdb_trap() at kdb_trap+0x124/frame 0xff8496fff550 trap_fatal() at trap_fatal+0x345/frame 0xff8496fff5b0 trap() at trap+0x836/frame 0xff8496fff7c0 calltrap() at calltrap+0x8/frame 0xff8496fff7c0 --- trap 0x1, rip = 0x80af5099, rsp = 0xff8496fff880, rbp = 0xff8496fff8b0 --- uma_find_refcnt() at uma_find_refcnt+0x79/frame 0xff8496fff8b0 -- Andriy Gapon ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org pgps9pTddrmqz.pgp Description: PGP signature
Re: Memory modified after free - by whom?
On Mon, Dec 10, 2012 at 3:18 PM, m...@freebsd.org wrote: On Mon, Dec 10, 2012 at 3:10 PM, Adrian Chadd adr...@freebsd.org wrote: 9216 sounds like a jumbo frame mbuf. So the NIC is writing to an mbuf after it's finalised/freed. I have a similar bug showing up on ath(4) RX. :( Compile with DEBUG_MEMGUARD in the kernel configuration, and then set vm.memguard.desc to the name of the UMA zone used for the 9216 byte allocations, mbuf_jumbo_9k. This should cause a panic when the memory is touched after free. Tada (dang, that's nifty stuff)! # sysctl vm.memguard.desc=mbuf_jumbo_9k vm.memguard.descM: - mbuf_jumboem_9k # ory modified after free 0xff8000401000(9216) val=0 @ 0xff8000401000 Memory modified after free 0xff8000405000(9216) val=0 @ 0xff8000405000 Memory modified after free 0xff8000409000(9216) val=5a5a5a5a @ 0xff8000409000 Fatal trap 1: privileged instruction fault while in kernel mode Fatal trap 1: privileged instruction fault while in kernel mode Memory modified after free 0xff800040d000(9216) val=5a5a5a5a @ 0xff800040d000 Fatal trap 1: privileged instruction fault while in kernel mode cpuid = 3; cpuid = 1; apic id = 02 cpuid = 0; apic id = 06 apic id = 00 instruction pointer = 0x20:0x80af5099 instruction pointer = 0x20:0x80af5099 instruction pointer = 0x20:0x80af5099 Fatal trap 1: privileged instruction fault while in kernel mode stack pointer = 0x28:0xff8496fff880 stack pointer = 0x28:0xff8496fe1880 cpuid = 2; frame pointer= 0x28:0xff8496fff8b0 frame pointer = 0x28:0xff8496fe18b0 stack pointer = 0x28:0xff849705d880 code segment= base 0x0, limit 0xf, type 0x1b frame pointer = 0x28:0xff849705d8b0 apic id = 04 code segment= base 0x0, limit 0xf, type 0x1b code segment= base 0x0, limit 0xf, type 0x1b = DPL 0, pres 1, long 1, def32 0, gran 1 = DPL 0, pres 1, long 1, def32 0, gran 1 instruction pointer = 0x20:0x80af5099 processor eflags= = DPL 0, pres 1, long 1, def32 0, gran 1 interrupt enabled, processor eflags = stack pointer = 0x28:0xff8497067880 interrupt enabled, resume, resume, frame pointer= 0x28:0xff84970678b0 IOPL = 0 code segment= base 0x0, limit 0xf, type 0x1b current process = = DPL 0, pres 1, long 1, def32 0, gran 1 processor eflags= 12 (irq280: ix0:que 3) ilock order reversal: (Giant after non-sleepable) 1st 0xfe0078148b38 ix0:rx(3) (ix0:rx(3)) @ /usr/src/sys/modules/ixgbe/../../dev/ixgbe/ixgbe.c:4296 2nd 0x814457b8 Giant (Giant) @ /usr/src/sys/dev/usb/input/ukbd.c:1946 KDB: stack backtrace: db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xff8496fff320 kdb_backtrace() at kdb_backtrace+0x39/frame 0xff8496fff3d0 witness_checkorder() at witness_checkorder+0xc47/frame 0xff8496fff450 __mtx_lock_flags() at __mtx_lock_flags+0x89/frame 0xff8496fff490 ukbd_poll() at ukbd_poll+0x28/frame 0xff8496fff4b0 kbdmux_poll() at kbdmux_poll+0x5b/frame 0xff8496fff4d0 cngrab() at cngrab+0x35/frame 0xff8496fff4f0 kdb_trap() at kdb_trap+0x124/frame 0xff8496fff550 trap_fatal() at trap_fatal+0x345/frame 0xff8496fff5b0 trap() at trap+0x836/frame 0xff8496fff7c0 calltrap() at calltrap+0x8/frame 0xff8496fff7c0 --- trap 0x1, rip = 0x80af5099, rsp = 0xff8496fff880, rbp = 0xff8496fff8b0 --- uma_find_refcnt() at uma_find_refcnt+0x79/frame 0xff8496fff8b0 mb_ctor_clust() at mb_ctor_clust+0x8f/frame 0xff8496fff8e0 uma_zalloc_arg() at uma_zalloc_arg+0xe7/frame 0xff8496fff960 m_getjcl() at m_getjcl+0xce/frame 0xff8496fff9a0 ixgbe_refresh_mbufs() at ixgbe_refresh_mbufs+0x77/frame 0xff8496fffa20 ixgbe_rxeof() at ixgbe_rxeof+0x5ce/frame 0xff8496fffad0 ixgbe_msix_que() at ixgbe_msix_que+0x9b/frame 0xff8496fffb20 intr_event_execute_handlers() at intr_event_execute_handlers+0x90/frame 0xff8496fffb60 ithread_loop() at ithread_loop+0x161/frame 0xff8496fffbb0 fork_exit() at fork_exit+0x84/frame 0xff8496fffbf0 fork_trampoline() at fork_trampoline+0xe/frame 0xff8496fffbf0 --- trap 0, rip = 0, rsp = 0xff8496fffcb0, rbp = 0 --- [ thread pid 12 tid 100218 ] Stopped at uma_find_refcnt+0x79: db This looks interesting: $ git diff sys/dev/ixgbe/ixgbe.c diff --git a/sys/dev/ixgbe/ixgbe.c b/sys/dev/ixgbe/ixgbe.c index 40f5488..19f842b 100644 --- a/sys/dev/ixgbe/ixgbe.c +++ b/sys/dev/ixgbe/ixgbe.c @@ -897,8 +897,10 @@ ixgbe_qflush(struct ifnet *ifp) for (int i = 0; i adapter-num_queues; i++, txr++) { IXGBE_TX_LOCK(txr); - while ((m = buf_ring_dequeue_sc(txr-br)) != NULL) + while ((m = buf_ring_dequeue_sc(txr-br)) != NULL) { m_freem(m);
Re: Memory modified after free - by whom?
On Fri, Dec 21, 2012 at 4:21 PM, Garrett Cooper yaneg...@gmail.com wrote: On Mon, Dec 10, 2012 at 3:18 PM, m...@freebsd.org wrote: On Mon, Dec 10, 2012 at 3:10 PM, Adrian Chadd adr...@freebsd.org wrote: 9216 sounds like a jumbo frame mbuf. So the NIC is writing to an mbuf after it's finalised/freed. I have a similar bug showing up on ath(4) RX. :( Compile with DEBUG_MEMGUARD in the kernel configuration, and then set vm.memguard.desc to the name of the UMA zone used for the 9216 byte allocations, mbuf_jumbo_9k. This should cause a panic when the memory is touched after free. Tada (dang, that's nifty stuff)! # sysctl vm.memguard.desc=mbuf_jumbo_9k vm.memguard.descM: - mbuf_jumboem_9k # ory modified after free 0xff8000401000(9216) val=0 @ 0xff8000401000 Memory modified after free 0xff8000405000(9216) val=0 @ 0xff8000405000 Memory modified after free 0xff8000409000(9216) val=5a5a5a5a @ 0xff8000409000 Fatal trap 1: privileged instruction fault while in kernel mode Fatal trap 1: privileged instruction fault while in kernel mode Memory modified after free 0xff800040d000(9216) val=5a5a5a5a @ 0xff800040d000 Fatal trap 1: privileged instruction fault while in kernel mode cpuid = 3; cpuid = 1; apic id = 02 cpuid = 0; apic id = 06 apic id = 00 instruction pointer = 0x20:0x80af5099 instruction pointer = 0x20:0x80af5099 instruction pointer = 0x20:0x80af5099 Fatal trap 1: privileged instruction fault while in kernel mode stack pointer = 0x28:0xff8496fff880 stack pointer = 0x28:0xff8496fe1880 cpuid = 2; frame pointer= 0x28:0xff8496fff8b0 frame pointer = 0x28:0xff8496fe18b0 stack pointer = 0x28:0xff849705d880 code segment= base 0x0, limit 0xf, type 0x1b frame pointer = 0x28:0xff849705d8b0 apic id = 04 code segment= base 0x0, limit 0xf, type 0x1b code segment= base 0x0, limit 0xf, type 0x1b = DPL 0, pres 1, long 1, def32 0, gran 1 = DPL 0, pres 1, long 1, def32 0, gran 1 instruction pointer = 0x20:0x80af5099 processor eflags= = DPL 0, pres 1, long 1, def32 0, gran 1 interrupt enabled, processor eflags = stack pointer = 0x28:0xff8497067880 interrupt enabled, resume, resume, frame pointer= 0x28:0xff84970678b0 IOPL = 0 code segment= base 0x0, limit 0xf, type 0x1b current process = = DPL 0, pres 1, long 1, def32 0, gran 1 processor eflags= 12 (irq280: ix0:que 3) ilock order reversal: (Giant after non-sleepable) 1st 0xfe0078148b38 ix0:rx(3) (ix0:rx(3)) @ /usr/src/sys/modules/ixgbe/../../dev/ixgbe/ixgbe.c:4296 2nd 0x814457b8 Giant (Giant) @ /usr/src/sys/dev/usb/input/ukbd.c:1946 KDB: stack backtrace: db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xff8496fff320 kdb_backtrace() at kdb_backtrace+0x39/frame 0xff8496fff3d0 witness_checkorder() at witness_checkorder+0xc47/frame 0xff8496fff450 __mtx_lock_flags() at __mtx_lock_flags+0x89/frame 0xff8496fff490 ukbd_poll() at ukbd_poll+0x28/frame 0xff8496fff4b0 kbdmux_poll() at kbdmux_poll+0x5b/frame 0xff8496fff4d0 cngrab() at cngrab+0x35/frame 0xff8496fff4f0 kdb_trap() at kdb_trap+0x124/frame 0xff8496fff550 trap_fatal() at trap_fatal+0x345/frame 0xff8496fff5b0 trap() at trap+0x836/frame 0xff8496fff7c0 calltrap() at calltrap+0x8/frame 0xff8496fff7c0 --- trap 0x1, rip = 0x80af5099, rsp = 0xff8496fff880, rbp = 0xff8496fff8b0 --- uma_find_refcnt() at uma_find_refcnt+0x79/frame 0xff8496fff8b0 mb_ctor_clust() at mb_ctor_clust+0x8f/frame 0xff8496fff8e0 uma_zalloc_arg() at uma_zalloc_arg+0xe7/frame 0xff8496fff960 m_getjcl() at m_getjcl+0xce/frame 0xff8496fff9a0 ixgbe_refresh_mbufs() at ixgbe_refresh_mbufs+0x77/frame 0xff8496fffa20 ixgbe_rxeof() at ixgbe_rxeof+0x5ce/frame 0xff8496fffad0 ixgbe_msix_que() at ixgbe_msix_que+0x9b/frame 0xff8496fffb20 intr_event_execute_handlers() at intr_event_execute_handlers+0x90/frame 0xff8496fffb60 ithread_loop() at ithread_loop+0x161/frame 0xff8496fffbb0 fork_exit() at fork_exit+0x84/frame 0xff8496fffbf0 fork_trampoline() at fork_trampoline+0xe/frame 0xff8496fffbf0 --- trap 0, rip = 0, rsp = 0xff8496fffcb0, rbp = 0 --- [ thread pid 12 tid 100218 ] Stopped at uma_find_refcnt+0x79: db This looks interesting: $ git diff sys/dev/ixgbe/ixgbe.c diff --git a/sys/dev/ixgbe/ixgbe.c b/sys/dev/ixgbe/ixgbe.c index 40f5488..19f842b 100644 --- a/sys/dev/ixgbe/ixgbe.c +++ b/sys/dev/ixgbe/ixgbe.c @@ -897,8 +897,10 @@ ixgbe_qflush(struct ifnet *ifp) for (int i = 0; i adapter-num_queues; i++, txr++) { IXGBE_TX_LOCK(txr); -
Re: Memory modified after free - by whom?
(clipping off mdf and adrian so they don't get directly spammed :)..) Crud. Continuing the processor after panic didn't work, so it might be a case of cxgbe shot the sheriff or something else in the stack is doing something wonky: db c Memory modified after free 0xff8000405000(9216) val=0 @ 0xff8000405000 Memory modified after free 0xff800040d000(9216) val=0 @ 0xff800040d000 Fatal trap 1: privileged instruction fault while in kernel mode Memory modified after free 0xff8000409000(9216) val=0 @ 0xff8000409000 cpuid = 0; Fatal trap 1: privileged instruction fault while in kernel mode apic id = 00 cpuid = 1; Fatal trap 1: privileged instruction fault while in kernel mode instruction pointer = 0x20:0x80af5099 cpuid = 2; stack pointer= 0x28:0xff8496f41910 apic id = 04 apic id = 02 instruction pointer = 0x20:0x80af5099 instruction pointer = 0x20:0x80af5099 stack pointer = 0x28:0xff849705d880 frame pointer = 0x28:0xff8496f41940 stack pointer = 0x28:0xff8497067880 code segment= base 0x0, limit 0xf, type 0x1b frame pointer = 0x28:0xff849705d8b0 = DPL 0, pres 1, long 1, def32 0, gran 1 frame pointer = 0x28:0xff84970678b0 Fatal trap 1: privileged instruction fault while in kernel mode code segment= base 0x0, limit 0xf, type 0x1b code segment= base 0x0, limit 0xf, type 0x1b processor eflags= = DPL 0, pres 1, long 1, def32 0, gran 1 = DPL 0, pres 1, long 1, def32 0, gran 1 cpuid = 3; interrupt enabled, processor eflags = processor eflags = resume, interrupt enabled, apic id = 06 interrupt enabled, instruction pointer = 0x20:0x80af5099 resume, resume, stack pointer = 0x28:0xff8497071880 IOPL = 0 frame pointer = 0x28:0xff84970718b0 IOPL = 0 current process = code segment = base 0x0, limit 0xf, type 0x1b 12 (irq283: ix1:que 1) Ilock order reversal: (Giant after non-sleepable) 1st 0xfe009a295918 ix1:rx(1) (ix1:rx(1)) @ /usr/src/sys/modules/ixgbe/../../dev/ixgbe/ixgbe.c:4298 2nd 0x814457b8 Giant (Giant) @ /usr/src/sys/dev/usb/input/ukbd.c:1946 KDB: stack backtrace: db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xff849705d320 kdb_backtrace() at kdb_backtrace+0x39/frame 0xff849705d3d0 witness_checkorder() at witness_checkorder+0xc47/frame 0xff849705d450 __mtx_lock_flags() at __mtx_lock_flags+0x89/frame 0xff849705d490 ukbd_poll() at ukbd_poll+0x28/frame 0xff849705d4b0 kbdmux_poll() at kbdmux_poll+0x5b/frame 0xff849705d4d0 cngrab() at cngrab+0x35/frame 0xff849705d4f0 kdb_trap() at kdb_trap+0x124/frame 0xff849705d550 trap_fatal() at trap_fatal+0x345/frame 0xff849705d5b0 trap() at trap+0x836/frame 0xff849705d7c0 calltrap() at calltrap+0x8/frame 0xff849705d7c0 --- trap 0x1, rip = 0x80af5099, rsp = 0xff849705d880, rbp = 0xff849705d8b0 --- uma_find_refcnt() at uma_find_refcnt+0x79/frame 0xff849705d8b0 mb_ctor_clust() at mb_ctor_clust+0x8f/frame 0xff849705d8e0 uma_zalloc_arg() at uma_zalloc_arg+0xe7/frame 0xff849705d960 m_getjcl() at m_getjcl+0xce/frame 0xff849705d9a0 ixgbe_refresh_mbufs() at ixgbe_refresh_mbufs+0x77/frame 0xff849705da20 ixgbe_rxeof() at ixgbe_rxeof+0x4e9/frame 0xff849705dad0 ixgbe_msix_que() at ixgbe_msix_que+0x9b/frame 0xff849705db20 intr_event_execute_handlers() at intr_event_execute_handlers+0x90/frame 0xff849705db60 ithread_loop() at ithread_loop+0x161/frame 0xff849705dbb0 fork_exit() at fork_exit+0x84/frame 0xff849705dbf0 fork_trampoline() at fork_trampoline+0xe/frame 0xff849705dbf0 --- trap 0, rip = 0, rsp = 0xff849705dcb0, rbp = 0 --- [ thread pid 12 tid 100224 ] Stopped at uma_find_refcnt+0x79: I setup asymmetric MTUs to see if I can narrow down whether the issue is a jumbo frame or cxgbe/ixgbe issue and so far so good. Trying again with jumbo frames paniced on boot when loading the module (really...?): # Memory modified after free 0xff8000401000(9216) val=0 @ 0xff8000401000 Fatal trap 1: privileged instruction fault while in kernel mode cpuid = 1; apic id = 02 instruction pointer = 0x20:0x80af5099 stack pointer = 0x28:0xff8496fc9750 frame pointer = 0x28:0xff8496fc9780 code segment= base 0x0, limit 0xf, type 0x1b = DPL 0, pres 1, long 1, def32 0, gran 1 processor eflags= interrupt enabled, resume, IOPL = 0 current process = 3280 (ifconfig) lock order reversal: (Giant after non-sleepable) 1st 0xfe00462f7808 ix0:rx(0) (ix0:rx(0)) @ /usr/src/sys/modules/ixgbe/../../dev/ixgbe/ixgbe.c:3938 2nd 0x814457b8 Giant (Giant) @ /usr/src/sys/dev/usb/input/ukbd.c:1946 KDB: stack backtrace: db_trace_self_wrapper() at
Memory modified after free - by whom?
I noticed this while checking the logs on one of my test boxes after restarting the network. Any idea where I should start looking into this (has IPv6 enabled but wasn't using it, em/cxgbe/ixgbe interfaces with the ixgbe interfaces lagged previously, but now not)? It looks suspiciously like the same size as a jumbo frame, but I'm not 100% sure if that's the real problem. Thanks, -Garrett Dec 10 14:03:12 wf158 kernel: em0: link state changed to DOWN Dec 10 14:03:13 wf158 kernel: ix0: link state changed to DOWN Dec 10 14:03:13 wf158 kernel: ix0: link state changed to UP Dec 10 14:03:13 wf158 kernel: ix1: link state changed to DOWN Dec 10 14:03:13 wf158 kernel: ix1: link state changed to UP Dec 10 14:03:13 wf158 kernel: ix0: link state changed to DOWN Dec 10 14:03:13 wf158 kernel: ix0: link state changed to UP Dec 10 14:03:15 wf158 kernel: em0: link state changed to UP Dec 10 14:03:15 wf158 dhclient: New IP Address (em0): 10.7.169.89 Dec 10 14:03:15 wf158 dhclient: New Subnet Mask (em0): 255.255.240.0 Dec 10 14:03:15 wf158 dhclient: New Broadcast Address (em0): 10.7.175.255 Dec 10 14:03:15 wf158 dhclient: New Routers (em0): 10.7.160.1 Dec 10 14:03:16 wf158 kernel: ix0: link state changed to DOWN Dec 10 14:03:16 wf158 kernel: ix0: link state changed to UP Dec 10 14:03:31 wf158 kernel: in6_purgeaddr: err=65, destination address delete failed Dec 10 14:03:31 wf158 dhclient[4539]: My address (10.7.169.89) was deleted, dhclient exiting Dec 10 14:03:32 wf158 dhclient[4521]: short write: wanted 20 got 0 bytes Dec 10 14:03:32 wf158 dhclient[4521]: exiting. Dec 10 14:03:33 wf158 kernel: em0: link state changed to DOWN Dec 10 14:03:33 wf158 kernel: ix1: link state changed to DOWN Dec 10 14:03:34 wf158 kernel: ix1: link state changed to UP Dec 10 14:03:34 wf158 kernel: ix1: link state changed to DOWN Dec 10 14:03:34 wf158 kernel: ix1: link state changed to UP Dec 10 14:03:34 wf158 kernel: ix0: link state changed to DOWN Dec 10 14:03:34 wf158 kernel: ix0: link state changed to UP Dec 10 14:03:34 wf158 kernel: ix1: link state changed to DOWN Dec 10 14:03:34 wf158 kernel: ix1: link state changed to UP Dec 10 14:03:36 wf158 kernel: em0: link state changed to UP Dec 10 14:03:36 wf158 dhclient: New IP Address (em0): 10.7.169.89 Dec 10 14:03:36 wf158 dhclient: New Subnet Mask (em0): 255.255.240.0 Dec 10 14:03:36 wf158 dhclient: New Broadcast Address (em0): 10.7.175.255 Dec 10 14:03:36 wf158 dhclient: New Routers (em0): 10.7.160.1 Dec 10 14:05:34 wf158 kernel: Memory modified after free 0xff81c016d000(9216) val= @ 0xff81c016d000 Dec 10 14:05:35 wf158 kernel: Memory modified after free 0xff81b5cdc000(9216) val= @ 0xff81b5cdc000 # uname -a FreeBSD wf158.west.isilon.com 10.0-CURRENT FreeBSD 10.0-CURRENT #1 r+2760369-dirty: Mon Dec 10 08:04:46 PST 2012 r...@wf158.west.isilon.com:/usr/obj/usr/src/sys/ISI-GENERIC amd64 ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org
Re: Memory modified after free - by whom?
9216 sounds like a jumbo frame mbuf. So the NIC is writing to an mbuf after it's finalised/freed. I have a similar bug showing up on ath(4) RX. :( adrian On 10 December 2012 14:22, Garrett Cooper yaneg...@gmail.com wrote: I noticed this while checking the logs on one of my test boxes after restarting the network. Any idea where I should start looking into this (has IPv6 enabled but wasn't using it, em/cxgbe/ixgbe interfaces with the ixgbe interfaces lagged previously, but now not)? It looks suspiciously like the same size as a jumbo frame, but I'm not 100% sure if that's the real problem. Thanks, -Garrett Dec 10 14:03:12 wf158 kernel: em0: link state changed to DOWN Dec 10 14:03:13 wf158 kernel: ix0: link state changed to DOWN Dec 10 14:03:13 wf158 kernel: ix0: link state changed to UP Dec 10 14:03:13 wf158 kernel: ix1: link state changed to DOWN Dec 10 14:03:13 wf158 kernel: ix1: link state changed to UP Dec 10 14:03:13 wf158 kernel: ix0: link state changed to DOWN Dec 10 14:03:13 wf158 kernel: ix0: link state changed to UP Dec 10 14:03:15 wf158 kernel: em0: link state changed to UP Dec 10 14:03:15 wf158 dhclient: New IP Address (em0): 10.7.169.89 Dec 10 14:03:15 wf158 dhclient: New Subnet Mask (em0): 255.255.240.0 Dec 10 14:03:15 wf158 dhclient: New Broadcast Address (em0): 10.7.175.255 Dec 10 14:03:15 wf158 dhclient: New Routers (em0): 10.7.160.1 Dec 10 14:03:16 wf158 kernel: ix0: link state changed to DOWN Dec 10 14:03:16 wf158 kernel: ix0: link state changed to UP Dec 10 14:03:31 wf158 kernel: in6_purgeaddr: err=65, destination address delete failed Dec 10 14:03:31 wf158 dhclient[4539]: My address (10.7.169.89) was deleted, dhclient exiting Dec 10 14:03:32 wf158 dhclient[4521]: short write: wanted 20 got 0 bytes Dec 10 14:03:32 wf158 dhclient[4521]: exiting. Dec 10 14:03:33 wf158 kernel: em0: link state changed to DOWN Dec 10 14:03:33 wf158 kernel: ix1: link state changed to DOWN Dec 10 14:03:34 wf158 kernel: ix1: link state changed to UP Dec 10 14:03:34 wf158 kernel: ix1: link state changed to DOWN Dec 10 14:03:34 wf158 kernel: ix1: link state changed to UP Dec 10 14:03:34 wf158 kernel: ix0: link state changed to DOWN Dec 10 14:03:34 wf158 kernel: ix0: link state changed to UP Dec 10 14:03:34 wf158 kernel: ix1: link state changed to DOWN Dec 10 14:03:34 wf158 kernel: ix1: link state changed to UP Dec 10 14:03:36 wf158 kernel: em0: link state changed to UP Dec 10 14:03:36 wf158 dhclient: New IP Address (em0): 10.7.169.89 Dec 10 14:03:36 wf158 dhclient: New Subnet Mask (em0): 255.255.240.0 Dec 10 14:03:36 wf158 dhclient: New Broadcast Address (em0): 10.7.175.255 Dec 10 14:03:36 wf158 dhclient: New Routers (em0): 10.7.160.1 Dec 10 14:05:34 wf158 kernel: Memory modified after free 0xff81c016d000(9216) val= @ 0xff81c016d000 Dec 10 14:05:35 wf158 kernel: Memory modified after free 0xff81b5cdc000(9216) val= @ 0xff81b5cdc000 # uname -a FreeBSD wf158.west.isilon.com 10.0-CURRENT FreeBSD 10.0-CURRENT #1 r+2760369-dirty: Mon Dec 10 08:04:46 PST 2012 r...@wf158.west.isilon.com:/usr/obj/usr/src/sys/ISI-GENERIC amd64 ___ freebsd-...@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to freebsd-net-unsubscr...@freebsd.org ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org
Re: Memory modified after free - by whom?
On Mon, Dec 10, 2012 at 3:10 PM, Adrian Chadd adr...@freebsd.org wrote: 9216 sounds like a jumbo frame mbuf. So the NIC is writing to an mbuf after it's finalised/freed. I have a similar bug showing up on ath(4) RX. :( Compile with DEBUG_MEMGUARD in the kernel configuration, and then set vm.memguard.desc to the name of the UMA zone used for the 9216 byte allocations, mbuf_jumbo_9k. This should cause a panic when the memory is touched after free. Cheers, matthew On 10 December 2012 14:22, Garrett Cooper yaneg...@gmail.com wrote: I noticed this while checking the logs on one of my test boxes after restarting the network. Any idea where I should start looking into this (has IPv6 enabled but wasn't using it, em/cxgbe/ixgbe interfaces with the ixgbe interfaces lagged previously, but now not)? It looks suspiciously like the same size as a jumbo frame, but I'm not 100% sure if that's the real problem. Thanks, -Garrett Dec 10 14:03:12 wf158 kernel: em0: link state changed to DOWN Dec 10 14:03:13 wf158 kernel: ix0: link state changed to DOWN Dec 10 14:03:13 wf158 kernel: ix0: link state changed to UP Dec 10 14:03:13 wf158 kernel: ix1: link state changed to DOWN Dec 10 14:03:13 wf158 kernel: ix1: link state changed to UP Dec 10 14:03:13 wf158 kernel: ix0: link state changed to DOWN Dec 10 14:03:13 wf158 kernel: ix0: link state changed to UP Dec 10 14:03:15 wf158 kernel: em0: link state changed to UP Dec 10 14:03:15 wf158 dhclient: New IP Address (em0): 10.7.169.89 Dec 10 14:03:15 wf158 dhclient: New Subnet Mask (em0): 255.255.240.0 Dec 10 14:03:15 wf158 dhclient: New Broadcast Address (em0): 10.7.175.255 Dec 10 14:03:15 wf158 dhclient: New Routers (em0): 10.7.160.1 Dec 10 14:03:16 wf158 kernel: ix0: link state changed to DOWN Dec 10 14:03:16 wf158 kernel: ix0: link state changed to UP Dec 10 14:03:31 wf158 kernel: in6_purgeaddr: err=65, destination address delete failed Dec 10 14:03:31 wf158 dhclient[4539]: My address (10.7.169.89) was deleted, dhclient exiting Dec 10 14:03:32 wf158 dhclient[4521]: short write: wanted 20 got 0 bytes Dec 10 14:03:32 wf158 dhclient[4521]: exiting. Dec 10 14:03:33 wf158 kernel: em0: link state changed to DOWN Dec 10 14:03:33 wf158 kernel: ix1: link state changed to DOWN Dec 10 14:03:34 wf158 kernel: ix1: link state changed to UP Dec 10 14:03:34 wf158 kernel: ix1: link state changed to DOWN Dec 10 14:03:34 wf158 kernel: ix1: link state changed to UP Dec 10 14:03:34 wf158 kernel: ix0: link state changed to DOWN Dec 10 14:03:34 wf158 kernel: ix0: link state changed to UP Dec 10 14:03:34 wf158 kernel: ix1: link state changed to DOWN Dec 10 14:03:34 wf158 kernel: ix1: link state changed to UP Dec 10 14:03:36 wf158 kernel: em0: link state changed to UP Dec 10 14:03:36 wf158 dhclient: New IP Address (em0): 10.7.169.89 Dec 10 14:03:36 wf158 dhclient: New Subnet Mask (em0): 255.255.240.0 Dec 10 14:03:36 wf158 dhclient: New Broadcast Address (em0): 10.7.175.255 Dec 10 14:03:36 wf158 dhclient: New Routers (em0): 10.7.160.1 Dec 10 14:05:34 wf158 kernel: Memory modified after free 0xff81c016d000(9216) val= @ 0xff81c016d000 Dec 10 14:05:35 wf158 kernel: Memory modified after free 0xff81b5cdc000(9216) val= @ 0xff81b5cdc000 # uname -a FreeBSD wf158.west.isilon.com 10.0-CURRENT FreeBSD 10.0-CURRENT #1 r+2760369-dirty: Mon Dec 10 08:04:46 PST 2012 r...@wf158.west.isilon.com:/usr/obj/usr/src/sys/ISI-GENERIC amd64 ___ freebsd-...@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to freebsd-net-unsubscr...@freebsd.org ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org
Re: Memory modified after free - by whom?
On 10 December 2012 15:18, m...@freebsd.org wrote: On Mon, Dec 10, 2012 at 3:10 PM, Adrian Chadd adr...@freebsd.org wrote: 9216 sounds like a jumbo frame mbuf. So the NIC is writing to an mbuf after it's finalised/freed. I have a similar bug showing up on ath(4) RX. :( Compile with DEBUG_MEMGUARD in the kernel configuration, and then set vm.memguard.desc to the name of the UMA zone used for the 9216 byte allocations, mbuf_jumbo_9k. This should cause a panic when the memory is touched after free. Right, but I think its a _hardware_ access after the buffer has been freed.. Adrian ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org
Re: Memory modified after free - by whom?
On Mon, Dec 10, 2012 at 3:21 PM, Adrian Chadd adr...@freebsd.org wrote: On 10 December 2012 15:18, m...@freebsd.org wrote: On Mon, Dec 10, 2012 at 3:10 PM, Adrian Chadd adr...@freebsd.org wrote: 9216 sounds like a jumbo frame mbuf. So the NIC is writing to an mbuf after it's finalised/freed. I have a similar bug showing up on ath(4) RX. :( Compile with DEBUG_MEMGUARD in the kernel configuration, and then set vm.memguard.desc to the name of the UMA zone used for the 9216 byte allocations, mbuf_jumbo_9k. This should cause a panic when the memory is touched after free. Right, but I think its a _hardware_ access after the buffer has been freed.. At least that will give me an idea of who to punt the bug over to next (assuming it lists the driver) -- one of the network folks, jfv, or np :). It seems to be a recent change that's causing this (it's spewing out these warnings every couple seconds), but that might also be related to me getting lagg working on CURRENT as my last known base was 9-STABLE and a lot of networking changes haven't been MFCed :). I could probably look through the code too after compiling it, but it would take too long. Thanks! -Garrett ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org
Re: Memory modified after free - by whom?
On Mon, Dec 10, 2012 at 05:37:17PM -0800, Garrett Cooper wrote: On Mon, Dec 10, 2012 at 3:21 PM, Adrian Chadd adr...@freebsd.org wrote: On 10 December 2012 15:18, m...@freebsd.org wrote: On Mon, Dec 10, 2012 at 3:10 PM, Adrian Chadd adr...@freebsd.org wrote: 9216 sounds like a jumbo frame mbuf. So the NIC is writing to an mbuf after it's finalised/freed. I have a similar bug showing up on ath(4) RX. :( Compile with DEBUG_MEMGUARD in the kernel configuration, and then set vm.memguard.desc to the name of the UMA zone used for the 9216 byte allocations, mbuf_jumbo_9k. This should cause a panic when the memory is touched after free. Right, but I think its a _hardware_ access after the buffer has been freed.. At least that will give me an idea of who to punt the bug over to next (assuming it lists the driver) -- one of the network folks, jfv, or np :). It seems to be a recent change that's causing this (it's spewing out these warnings every couple seconds), but that might also be related to me getting lagg working on CURRENT as my last known base was 9-STABLE and a lot of networking changes haven't been MFCed :). If you suspect it's a DMA from the NIC after the 9K cluster has been freed, see if the corrupt portion looks anything like an Ethernet frame. If it does then the DMAC in the frame will tell you who to follow up with -- jfv@ or me :-) (btw, your log had val= so I think it's something else..) Regards, Navdeep I could probably look through the code too after compiling it, but it would take too long. Thanks! -Garrett ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org
Re: Memory modified after free - by whom?
On Mon, Dec 10, 2012 at 03:18:45PM -0800, m...@freebsd.org wrote: m On Mon, Dec 10, 2012 at 3:10 PM, Adrian Chadd adr...@freebsd.org wrote: m 9216 sounds like a jumbo frame mbuf. So the NIC is writing to an mbuf m after it's finalised/freed. m m I have a similar bug showing up on ath(4) RX. :( m m Compile with DEBUG_MEMGUARD in the kernel configuration, and then set m vm.memguard.desc to the name of the UMA zone used for the 9216 byte m allocations, mbuf_jumbo_9k. This should cause a panic when the memory m is touched after free. DEBUG_MEMGUARD doesn't work with cluster zone, I'm afraid it won't work with mbuf_jumbo_9k, too, but I didn't try this. The problem is documented in BUGS in memguard(9). -- Totus tuus, Glebius. ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org