Fatal trap 1 [Was: Memory modified after free - by whom?]

2012-12-22 Thread Andriy Gapon
on 22/12/2012 02:21 Garrett Cooper said the following:
 Fatal trap 1: privileged instruction fault while in kernel mode
 Fatal trap 1: privileged instruction fault while in kernel mode

Unrelated to the original topic - this looks very weird.
I mean all the CPUs getting this unusual trap...
Could you please do 'disassemble 0x80af5099' in kgdb with the same
kernel.  Or if you have a different kernel now, please use instruction pointer
value from a trap with that kernel.

 Memory modified after free 0xff800040d000(9216) val=5a5a5a5a @
 0xff800040d000
 Fatal trap 1: privileged instruction fault while in kernel mode
 cpuid = 3;
 cpuid = 1;
 apic id = 02
 cpuid = 0; apic id = 06
 apic id = 00
 instruction pointer = 0x20:0x80af5099
 instruction pointer = 0x20:0x80af5099
 instruction pointer = 0x20:0x80af5099
 Fatal trap 1: privileged instruction fault while in kernel mode
 stack pointer   = 0x28:0xff8496fff880
 stack pointer   = 0x28:0xff8496fe1880
 cpuid = 2; frame pointer= 0x28:0xff8496fff8b0
 frame pointer   = 0x28:0xff8496fe18b0
 stack pointer   = 0x28:0xff849705d880
 code segment= base 0x0, limit 0xf, type 0x1b
 frame pointer   = 0x28:0xff849705d8b0
 apic id = 04
 code segment= base 0x0, limit 0xf, type 0x1b
 code segment= base 0x0, limit 0xf, type 0x1b
 = DPL 0, pres 1, long 1, def32 0, gran 1
 = DPL 0, pres 1, long 1, def32 0, gran 1
 instruction pointer = 0x20:0x80af5099
 processor eflags=   = DPL 0, pres 1, long
 1, def32 0, gran 1
 interrupt enabled, processor eflags = stack pointer =
 0x28:0xff8497067880
 interrupt enabled, resume, resume, frame pointer=
 0x28:0xff84970678b0
 IOPL = 0
 code segment= base 0x0, limit 0xf, type 0x1b
 current process =   = DPL 0, pres 1, long
 1, def32 0, gran 1
 processor eflags= 12 (irq280: ix0:que 3)
 ilock order reversal: (Giant after non-sleepable)
  1st 0xfe0078148b38 ix0:rx(3) (ix0:rx(3)) @
 /usr/src/sys/modules/ixgbe/../../dev/ixgbe/ixgbe.c:4296
  2nd 0x814457b8 Giant (Giant) @ /usr/src/sys/dev/usb/input/ukbd.c:1946
 KDB: stack backtrace:
 db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xff8496fff320
 kdb_backtrace() at kdb_backtrace+0x39/frame 0xff8496fff3d0
 witness_checkorder() at witness_checkorder+0xc47/frame 0xff8496fff450
 __mtx_lock_flags() at __mtx_lock_flags+0x89/frame 0xff8496fff490
 ukbd_poll() at ukbd_poll+0x28/frame 0xff8496fff4b0
 kbdmux_poll() at kbdmux_poll+0x5b/frame 0xff8496fff4d0
 cngrab() at cngrab+0x35/frame 0xff8496fff4f0
 kdb_trap() at kdb_trap+0x124/frame 0xff8496fff550
 trap_fatal() at trap_fatal+0x345/frame 0xff8496fff5b0
 trap() at trap+0x836/frame 0xff8496fff7c0
 calltrap() at calltrap+0x8/frame 0xff8496fff7c0
 --- trap 0x1, rip = 0x80af5099, rsp = 0xff8496fff880, rbp
 = 0xff8496fff8b0 ---
 uma_find_refcnt() at uma_find_refcnt+0x79/frame 0xff8496fff8b0


-- 
Andriy Gapon
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: Fatal trap 1 [Was: Memory modified after free - by whom?]

2012-12-22 Thread Konstantin Belousov
On Sat, Dec 22, 2012 at 01:08:10PM +0200, Andriy Gapon wrote:
 on 22/12/2012 02:21 Garrett Cooper said the following:
  Fatal trap 1: privileged instruction fault while in kernel mode
  Fatal trap 1: privileged instruction fault while in kernel mode
 
 Unrelated to the original topic - this looks very weird.
 I mean all the CPUs getting this unusual trap...
 Could you please do 'disassemble 0x80af5099' in kgdb with the same
 kernel.  Or if you have a different kernel now, please use instruction 
 pointer
 value from a trap with that kernel.
 
This is due to the vtoslab() returning NULL. Since slabref is
dereferenced later, clang tries to be helpful as usual and converts
the !(p-flags  PG_SLAB) case from vtoslab() into the jump to un2
instruction if vtoslab() result is NULL.

So instead of KASSERT triggering the next line, you see this improvement.

  Memory modified after free 0xff800040d000(9216) val=5a5a5a5a @
  0xff800040d000
  Fatal trap 1: privileged instruction fault while in kernel mode
  cpuid = 3;
  cpuid = 1;
  apic id = 02
  cpuid = 0; apic id = 06
  apic id = 00
  instruction pointer = 0x20:0x80af5099
  instruction pointer = 0x20:0x80af5099
  instruction pointer = 0x20:0x80af5099
  Fatal trap 1: privileged instruction fault while in kernel mode
  stack pointer   = 0x28:0xff8496fff880
  stack pointer   = 0x28:0xff8496fe1880
  cpuid = 2; frame pointer= 0x28:0xff8496fff8b0
  frame pointer   = 0x28:0xff8496fe18b0
  stack pointer   = 0x28:0xff849705d880
  code segment= base 0x0, limit 0xf, type 0x1b
  frame pointer   = 0x28:0xff849705d8b0
  apic id = 04
  code segment= base 0x0, limit 0xf, type 0x1b
  code segment= base 0x0, limit 0xf, type 0x1b
  = DPL 0, pres 1, long 1, def32 0, gran 1
  = DPL 0, pres 1, long 1, def32 0, gran 1
  instruction pointer = 0x20:0x80af5099
  processor eflags=   = DPL 0, pres 1, long
  1, def32 0, gran 1
  interrupt enabled, processor eflags = stack pointer =
  0x28:0xff8497067880
  interrupt enabled, resume, resume, frame pointer=
  0x28:0xff84970678b0
  IOPL = 0
  code segment= base 0x0, limit 0xf, type 0x1b
  current process =   = DPL 0, pres 1, long
  1, def32 0, gran 1
  processor eflags= 12 (irq280: ix0:que 3)
  ilock order reversal: (Giant after non-sleepable)
   1st 0xfe0078148b38 ix0:rx(3) (ix0:rx(3)) @
  /usr/src/sys/modules/ixgbe/../../dev/ixgbe/ixgbe.c:4296
   2nd 0x814457b8 Giant (Giant) @ 
  /usr/src/sys/dev/usb/input/ukbd.c:1946
  KDB: stack backtrace:
  db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 
  0xff8496fff320
  kdb_backtrace() at kdb_backtrace+0x39/frame 0xff8496fff3d0
  witness_checkorder() at witness_checkorder+0xc47/frame 0xff8496fff450
  __mtx_lock_flags() at __mtx_lock_flags+0x89/frame 0xff8496fff490
  ukbd_poll() at ukbd_poll+0x28/frame 0xff8496fff4b0
  kbdmux_poll() at kbdmux_poll+0x5b/frame 0xff8496fff4d0
  cngrab() at cngrab+0x35/frame 0xff8496fff4f0
  kdb_trap() at kdb_trap+0x124/frame 0xff8496fff550
  trap_fatal() at trap_fatal+0x345/frame 0xff8496fff5b0
  trap() at trap+0x836/frame 0xff8496fff7c0
  calltrap() at calltrap+0x8/frame 0xff8496fff7c0
  --- trap 0x1, rip = 0x80af5099, rsp = 0xff8496fff880, rbp
  = 0xff8496fff8b0 ---
  uma_find_refcnt() at uma_find_refcnt+0x79/frame 0xff8496fff8b0
 
 
 -- 
 Andriy Gapon
 ___
 freebsd-current@freebsd.org mailing list
 http://lists.freebsd.org/mailman/listinfo/freebsd-current
 To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


pgps9pTddrmqz.pgp
Description: PGP signature


Re: Memory modified after free - by whom?

2012-12-21 Thread Garrett Cooper
On Mon, Dec 10, 2012 at 3:18 PM,  m...@freebsd.org wrote:
 On Mon, Dec 10, 2012 at 3:10 PM, Adrian Chadd adr...@freebsd.org wrote:
 9216 sounds like a jumbo frame mbuf. So the NIC is writing to an mbuf
 after it's finalised/freed.

 I have a similar bug showing up on ath(4) RX. :(

 Compile with DEBUG_MEMGUARD in the kernel configuration, and then set
 vm.memguard.desc to the name of the UMA zone used for the 9216 byte
 allocations, mbuf_jumbo_9k.  This should cause a panic when the memory
 is touched after free.

Tada (dang, that's nifty stuff)!

# sysctl vm.memguard.desc=mbuf_jumbo_9k
vm.memguard.descM:  - mbuf_jumboem_9k
# ory modified after free 0xff8000401000(9216) val=0 @ 0xff8000401000
Memory modified after free 0xff8000405000(9216) val=0 @ 0xff8000405000
Memory modified after free 0xff8000409000(9216) val=5a5a5a5a @
0xff8000409000






Fatal trap 1: privileged instruction fault while in kernel mode
Fatal trap 1: privileged instruction fault while in kernel mode
Memory modified after free 0xff800040d000(9216) val=5a5a5a5a @
0xff800040d000
Fatal trap 1: privileged instruction fault while in kernel mode
cpuid = 3;
cpuid = 1;
apic id = 02
cpuid = 0; apic id = 06
apic id = 00
instruction pointer = 0x20:0x80af5099
instruction pointer = 0x20:0x80af5099
instruction pointer = 0x20:0x80af5099
Fatal trap 1: privileged instruction fault while in kernel mode
stack pointer   = 0x28:0xff8496fff880
stack pointer   = 0x28:0xff8496fe1880
cpuid = 2; frame pointer= 0x28:0xff8496fff8b0
frame pointer   = 0x28:0xff8496fe18b0
stack pointer   = 0x28:0xff849705d880
code segment= base 0x0, limit 0xf, type 0x1b
frame pointer   = 0x28:0xff849705d8b0
apic id = 04
code segment= base 0x0, limit 0xf, type 0x1b
code segment= base 0x0, limit 0xf, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
= DPL 0, pres 1, long 1, def32 0, gran 1
instruction pointer = 0x20:0x80af5099
processor eflags=   = DPL 0, pres 1, long
1, def32 0, gran 1
interrupt enabled, processor eflags = stack pointer =
0x28:0xff8497067880
interrupt enabled, resume, resume, frame pointer=
0x28:0xff84970678b0
IOPL = 0
code segment= base 0x0, limit 0xf, type 0x1b
current process =   = DPL 0, pres 1, long
1, def32 0, gran 1
processor eflags= 12 (irq280: ix0:que 3)
ilock order reversal: (Giant after non-sleepable)
 1st 0xfe0078148b38 ix0:rx(3) (ix0:rx(3)) @
/usr/src/sys/modules/ixgbe/../../dev/ixgbe/ixgbe.c:4296
 2nd 0x814457b8 Giant (Giant) @ /usr/src/sys/dev/usb/input/ukbd.c:1946
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xff8496fff320
kdb_backtrace() at kdb_backtrace+0x39/frame 0xff8496fff3d0
witness_checkorder() at witness_checkorder+0xc47/frame 0xff8496fff450
__mtx_lock_flags() at __mtx_lock_flags+0x89/frame 0xff8496fff490
ukbd_poll() at ukbd_poll+0x28/frame 0xff8496fff4b0
kbdmux_poll() at kbdmux_poll+0x5b/frame 0xff8496fff4d0
cngrab() at cngrab+0x35/frame 0xff8496fff4f0
kdb_trap() at kdb_trap+0x124/frame 0xff8496fff550
trap_fatal() at trap_fatal+0x345/frame 0xff8496fff5b0
trap() at trap+0x836/frame 0xff8496fff7c0
calltrap() at calltrap+0x8/frame 0xff8496fff7c0
--- trap 0x1, rip = 0x80af5099, rsp = 0xff8496fff880, rbp
= 0xff8496fff8b0 ---
uma_find_refcnt() at uma_find_refcnt+0x79/frame 0xff8496fff8b0
mb_ctor_clust() at mb_ctor_clust+0x8f/frame 0xff8496fff8e0
uma_zalloc_arg() at uma_zalloc_arg+0xe7/frame 0xff8496fff960
m_getjcl() at m_getjcl+0xce/frame 0xff8496fff9a0
ixgbe_refresh_mbufs() at ixgbe_refresh_mbufs+0x77/frame 0xff8496fffa20
ixgbe_rxeof() at ixgbe_rxeof+0x5ce/frame 0xff8496fffad0
ixgbe_msix_que() at ixgbe_msix_que+0x9b/frame 0xff8496fffb20
intr_event_execute_handlers() at
intr_event_execute_handlers+0x90/frame 0xff8496fffb60
ithread_loop() at ithread_loop+0x161/frame 0xff8496fffbb0
fork_exit() at fork_exit+0x84/frame 0xff8496fffbf0
fork_trampoline() at fork_trampoline+0xe/frame 0xff8496fffbf0
--- trap 0, rip = 0, rsp = 0xff8496fffcb0, rbp = 0 ---
[ thread pid 12 tid 100218 ]
Stopped at  uma_find_refcnt+0x79:
db

This looks interesting:

$ git diff sys/dev/ixgbe/ixgbe.c
diff --git a/sys/dev/ixgbe/ixgbe.c b/sys/dev/ixgbe/ixgbe.c
index 40f5488..19f842b 100644
--- a/sys/dev/ixgbe/ixgbe.c
+++ b/sys/dev/ixgbe/ixgbe.c
@@ -897,8 +897,10 @@ ixgbe_qflush(struct ifnet *ifp)

for (int i = 0; i  adapter-num_queues; i++, txr++) {
IXGBE_TX_LOCK(txr);
-   while ((m = buf_ring_dequeue_sc(txr-br)) != NULL)
+   while ((m = buf_ring_dequeue_sc(txr-br)) != NULL) {
m_freem(m);

Re: Memory modified after free - by whom?

2012-12-21 Thread Garrett Cooper
On Fri, Dec 21, 2012 at 4:21 PM, Garrett Cooper yaneg...@gmail.com wrote:
 On Mon, Dec 10, 2012 at 3:18 PM,  m...@freebsd.org wrote:
 On Mon, Dec 10, 2012 at 3:10 PM, Adrian Chadd adr...@freebsd.org wrote:
 9216 sounds like a jumbo frame mbuf. So the NIC is writing to an mbuf
 after it's finalised/freed.

 I have a similar bug showing up on ath(4) RX. :(

 Compile with DEBUG_MEMGUARD in the kernel configuration, and then set
 vm.memguard.desc to the name of the UMA zone used for the 9216 byte
 allocations, mbuf_jumbo_9k.  This should cause a panic when the memory
 is touched after free.

 Tada (dang, that's nifty stuff)!

 # sysctl vm.memguard.desc=mbuf_jumbo_9k
 vm.memguard.descM:  - mbuf_jumboem_9k
 # ory modified after free 0xff8000401000(9216) val=0 @ 0xff8000401000
 Memory modified after free 0xff8000405000(9216) val=0 @ 0xff8000405000
 Memory modified after free 0xff8000409000(9216) val=5a5a5a5a @
 0xff8000409000






 Fatal trap 1: privileged instruction fault while in kernel mode
 Fatal trap 1: privileged instruction fault while in kernel mode
 Memory modified after free 0xff800040d000(9216) val=5a5a5a5a @
 0xff800040d000
 Fatal trap 1: privileged instruction fault while in kernel mode
 cpuid = 3;
 cpuid = 1;
 apic id = 02
 cpuid = 0; apic id = 06
 apic id = 00
 instruction pointer = 0x20:0x80af5099
 instruction pointer = 0x20:0x80af5099
 instruction pointer = 0x20:0x80af5099
 Fatal trap 1: privileged instruction fault while in kernel mode
 stack pointer   = 0x28:0xff8496fff880
 stack pointer   = 0x28:0xff8496fe1880
 cpuid = 2; frame pointer= 0x28:0xff8496fff8b0
 frame pointer   = 0x28:0xff8496fe18b0
 stack pointer   = 0x28:0xff849705d880
 code segment= base 0x0, limit 0xf, type 0x1b
 frame pointer   = 0x28:0xff849705d8b0
 apic id = 04
 code segment= base 0x0, limit 0xf, type 0x1b
 code segment= base 0x0, limit 0xf, type 0x1b
 = DPL 0, pres 1, long 1, def32 0, gran 1
 = DPL 0, pres 1, long 1, def32 0, gran 1
 instruction pointer = 0x20:0x80af5099
 processor eflags=   = DPL 0, pres 1, long
 1, def32 0, gran 1
 interrupt enabled, processor eflags = stack pointer =
 0x28:0xff8497067880
 interrupt enabled, resume, resume, frame pointer=
 0x28:0xff84970678b0
 IOPL = 0
 code segment= base 0x0, limit 0xf, type 0x1b
 current process =   = DPL 0, pres 1, long
 1, def32 0, gran 1
 processor eflags= 12 (irq280: ix0:que 3)
 ilock order reversal: (Giant after non-sleepable)
  1st 0xfe0078148b38 ix0:rx(3) (ix0:rx(3)) @
 /usr/src/sys/modules/ixgbe/../../dev/ixgbe/ixgbe.c:4296
  2nd 0x814457b8 Giant (Giant) @ /usr/src/sys/dev/usb/input/ukbd.c:1946
 KDB: stack backtrace:
 db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xff8496fff320
 kdb_backtrace() at kdb_backtrace+0x39/frame 0xff8496fff3d0
 witness_checkorder() at witness_checkorder+0xc47/frame 0xff8496fff450
 __mtx_lock_flags() at __mtx_lock_flags+0x89/frame 0xff8496fff490
 ukbd_poll() at ukbd_poll+0x28/frame 0xff8496fff4b0
 kbdmux_poll() at kbdmux_poll+0x5b/frame 0xff8496fff4d0
 cngrab() at cngrab+0x35/frame 0xff8496fff4f0
 kdb_trap() at kdb_trap+0x124/frame 0xff8496fff550
 trap_fatal() at trap_fatal+0x345/frame 0xff8496fff5b0
 trap() at trap+0x836/frame 0xff8496fff7c0
 calltrap() at calltrap+0x8/frame 0xff8496fff7c0
 --- trap 0x1, rip = 0x80af5099, rsp = 0xff8496fff880, rbp
 = 0xff8496fff8b0 ---
 uma_find_refcnt() at uma_find_refcnt+0x79/frame 0xff8496fff8b0
 mb_ctor_clust() at mb_ctor_clust+0x8f/frame 0xff8496fff8e0
 uma_zalloc_arg() at uma_zalloc_arg+0xe7/frame 0xff8496fff960
 m_getjcl() at m_getjcl+0xce/frame 0xff8496fff9a0
 ixgbe_refresh_mbufs() at ixgbe_refresh_mbufs+0x77/frame 0xff8496fffa20
 ixgbe_rxeof() at ixgbe_rxeof+0x5ce/frame 0xff8496fffad0
 ixgbe_msix_que() at ixgbe_msix_que+0x9b/frame 0xff8496fffb20
 intr_event_execute_handlers() at
 intr_event_execute_handlers+0x90/frame 0xff8496fffb60
 ithread_loop() at ithread_loop+0x161/frame 0xff8496fffbb0
 fork_exit() at fork_exit+0x84/frame 0xff8496fffbf0
 fork_trampoline() at fork_trampoline+0xe/frame 0xff8496fffbf0
 --- trap 0, rip = 0, rsp = 0xff8496fffcb0, rbp = 0 ---
 [ thread pid 12 tid 100218 ]
 Stopped at  uma_find_refcnt+0x79:
 db

 This looks interesting:

 $ git diff sys/dev/ixgbe/ixgbe.c
 diff --git a/sys/dev/ixgbe/ixgbe.c b/sys/dev/ixgbe/ixgbe.c
 index 40f5488..19f842b 100644
 --- a/sys/dev/ixgbe/ixgbe.c
 +++ b/sys/dev/ixgbe/ixgbe.c
 @@ -897,8 +897,10 @@ ixgbe_qflush(struct ifnet *ifp)

 for (int i = 0; i  adapter-num_queues; i++, txr++) {
 IXGBE_TX_LOCK(txr);
 -  

Re: Memory modified after free - by whom?

2012-12-21 Thread Garrett Cooper
(clipping off mdf and adrian so they don't get directly spammed :)..)
Crud. Continuing the processor after panic didn't work, so it
might be a case of cxgbe shot the sheriff or something else in the
stack is doing something wonky:

db c

Memory modified after free 0xff8000405000(9216) val=0 @ 0xff8000405000
Memory modified after free 0xff800040d000(9216) val=0 @ 0xff800040d000



Fatal trap 1: privileged instruction fault while in kernel mode
Memory modified after free 0xff8000409000(9216) val=0 @ 0xff8000409000

cpuid = 0;
Fatal trap 1: privileged instruction fault while in kernel mode
apic id = 00
cpuid = 1; Fatal trap 1: privileged instruction fault while in kernel mode
instruction pointer = 0x20:0x80af5099
cpuid = 2; stack pointer= 0x28:0xff8496f41910
apic id = 04
apic id = 02
instruction pointer = 0x20:0x80af5099
instruction pointer = 0x20:0x80af5099

stack pointer   = 0x28:0xff849705d880
frame pointer   = 0x28:0xff8496f41940

stack pointer   = 0x28:0xff8497067880
code segment= base 0x0, limit 0xf, type 0x1b
frame pointer   = 0x28:0xff849705d8b0
= DPL 0, pres 1, long 1, def32 0, gran 1
frame pointer   = 0x28:0xff84970678b0
Fatal trap 1: privileged instruction fault while in kernel mode
code segment= base 0x0, limit 0xf, type 0x1b
code segment= base 0x0, limit 0xf, type 0x1b
processor eflags=   = DPL 0, pres 1, long
1, def32 0, gran 1
= DPL 0, pres 1, long 1, def32 0, gran 1
cpuid = 3; interrupt enabled, processor eflags  = processor eflags
 = resume, interrupt enabled, apic id = 06
interrupt enabled, instruction pointer  = 0x20:0x80af5099
resume, resume, stack pointer   = 0x28:0xff8497071880
IOPL = 0
frame pointer   = 0x28:0xff84970718b0
IOPL = 0
current process = code segment  = base 0x0, limit
0xf, type 0x1b
12 (irq283: ix1:que 1)
Ilock order reversal: (Giant after non-sleepable)
 1st 0xfe009a295918 ix1:rx(1) (ix1:rx(1)) @
/usr/src/sys/modules/ixgbe/../../dev/ixgbe/ixgbe.c:4298
 2nd 0x814457b8 Giant (Giant) @ /usr/src/sys/dev/usb/input/ukbd.c:1946
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xff849705d320
kdb_backtrace() at kdb_backtrace+0x39/frame 0xff849705d3d0
witness_checkorder() at witness_checkorder+0xc47/frame 0xff849705d450
__mtx_lock_flags() at __mtx_lock_flags+0x89/frame 0xff849705d490
ukbd_poll() at ukbd_poll+0x28/frame 0xff849705d4b0
kbdmux_poll() at kbdmux_poll+0x5b/frame 0xff849705d4d0
cngrab() at cngrab+0x35/frame 0xff849705d4f0
kdb_trap() at kdb_trap+0x124/frame 0xff849705d550
trap_fatal() at trap_fatal+0x345/frame 0xff849705d5b0
trap() at trap+0x836/frame 0xff849705d7c0
calltrap() at calltrap+0x8/frame 0xff849705d7c0
--- trap 0x1, rip = 0x80af5099, rsp = 0xff849705d880, rbp
= 0xff849705d8b0 ---
uma_find_refcnt() at uma_find_refcnt+0x79/frame 0xff849705d8b0
mb_ctor_clust() at mb_ctor_clust+0x8f/frame 0xff849705d8e0
uma_zalloc_arg() at uma_zalloc_arg+0xe7/frame 0xff849705d960
m_getjcl() at m_getjcl+0xce/frame 0xff849705d9a0
ixgbe_refresh_mbufs() at ixgbe_refresh_mbufs+0x77/frame 0xff849705da20
ixgbe_rxeof() at ixgbe_rxeof+0x4e9/frame 0xff849705dad0
ixgbe_msix_que() at ixgbe_msix_que+0x9b/frame 0xff849705db20
intr_event_execute_handlers() at
intr_event_execute_handlers+0x90/frame 0xff849705db60
ithread_loop() at ithread_loop+0x161/frame 0xff849705dbb0
fork_exit() at fork_exit+0x84/frame 0xff849705dbf0
fork_trampoline() at fork_trampoline+0xe/frame 0xff849705dbf0
--- trap 0, rip = 0, rsp = 0xff849705dcb0, rbp = 0 ---
[ thread pid 12 tid 100224 ]
Stopped at  uma_find_refcnt+0x79:



I setup asymmetric MTUs to see if I can narrow down whether the
issue is a jumbo frame or cxgbe/ixgbe issue and so far so good. Trying
again with jumbo frames paniced on boot when loading the module
(really...?):

# Memory modified after free 0xff8000401000(9216) val=0 @ 0xff8000401000


Fatal trap 1: privileged instruction fault while in kernel mode
cpuid = 1; apic id = 02
instruction pointer = 0x20:0x80af5099
stack pointer   = 0x28:0xff8496fc9750
frame pointer   = 0x28:0xff8496fc9780
code segment= base 0x0, limit 0xf, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags= interrupt enabled, resume, IOPL = 0
current process = 3280 (ifconfig)
lock order reversal: (Giant after non-sleepable)
 1st 0xfe00462f7808 ix0:rx(0) (ix0:rx(0)) @
/usr/src/sys/modules/ixgbe/../../dev/ixgbe/ixgbe.c:3938
 2nd 0x814457b8 Giant (Giant) @ /usr/src/sys/dev/usb/input/ukbd.c:1946
KDB: stack backtrace:
db_trace_self_wrapper() at 

Memory modified after free - by whom?

2012-12-10 Thread Garrett Cooper
I noticed this while checking the logs on one of my test boxes
after restarting the network. Any idea where I should start looking
into this (has IPv6 enabled but wasn't using it, em/cxgbe/ixgbe
interfaces with the ixgbe interfaces lagged previously, but now not)?
It looks suspiciously like the same size as a jumbo frame, but I'm not
100% sure if that's the real problem.
Thanks,
-Garrett

Dec 10 14:03:12 wf158 kernel: em0: link state changed to DOWN
Dec 10 14:03:13 wf158 kernel: ix0: link state changed to DOWN
Dec 10 14:03:13 wf158 kernel: ix0: link state changed to UP
Dec 10 14:03:13 wf158 kernel: ix1: link state changed to DOWN
Dec 10 14:03:13 wf158 kernel: ix1: link state changed to UP
Dec 10 14:03:13 wf158 kernel: ix0: link state changed to DOWN
Dec 10 14:03:13 wf158 kernel: ix0: link state changed to UP
Dec 10 14:03:15 wf158 kernel: em0: link state changed to UP
Dec 10 14:03:15 wf158 dhclient: New IP Address (em0): 10.7.169.89
Dec 10 14:03:15 wf158 dhclient: New Subnet Mask (em0): 255.255.240.0
Dec 10 14:03:15 wf158 dhclient: New Broadcast Address (em0): 10.7.175.255
Dec 10 14:03:15 wf158 dhclient: New Routers (em0): 10.7.160.1
Dec 10 14:03:16 wf158 kernel: ix0: link state changed to DOWN
Dec 10 14:03:16 wf158 kernel: ix0: link state changed to UP
Dec 10 14:03:31 wf158 kernel: in6_purgeaddr: err=65, destination
address delete failed
Dec 10 14:03:31 wf158 dhclient[4539]: My address (10.7.169.89) was
deleted, dhclient exiting
Dec 10 14:03:32 wf158 dhclient[4521]: short write: wanted 20 got 0 bytes
Dec 10 14:03:32 wf158 dhclient[4521]: exiting.
Dec 10 14:03:33 wf158 kernel: em0: link state changed to DOWN
Dec 10 14:03:33 wf158 kernel: ix1: link state changed to DOWN
Dec 10 14:03:34 wf158 kernel: ix1: link state changed to UP
Dec 10 14:03:34 wf158 kernel: ix1: link state changed to DOWN
Dec 10 14:03:34 wf158 kernel: ix1: link state changed to UP
Dec 10 14:03:34 wf158 kernel: ix0: link state changed to DOWN
Dec 10 14:03:34 wf158 kernel: ix0: link state changed to UP
Dec 10 14:03:34 wf158 kernel: ix1: link state changed to DOWN
Dec 10 14:03:34 wf158 kernel: ix1: link state changed to UP
Dec 10 14:03:36 wf158 kernel: em0: link state changed to UP
Dec 10 14:03:36 wf158 dhclient: New IP Address (em0): 10.7.169.89
Dec 10 14:03:36 wf158 dhclient: New Subnet Mask (em0): 255.255.240.0
Dec 10 14:03:36 wf158 dhclient: New Broadcast Address (em0): 10.7.175.255
Dec 10 14:03:36 wf158 dhclient: New Routers (em0): 10.7.160.1
Dec 10 14:05:34 wf158 kernel: Memory modified after free
0xff81c016d000(9216) val= @ 0xff81c016d000
Dec 10 14:05:35 wf158 kernel: Memory modified after free
0xff81b5cdc000(9216) val= @ 0xff81b5cdc000

# uname -a
FreeBSD wf158.west.isilon.com 10.0-CURRENT FreeBSD 10.0-CURRENT #1
r+2760369-dirty: Mon Dec 10 08:04:46 PST 2012
r...@wf158.west.isilon.com:/usr/obj/usr/src/sys/ISI-GENERIC  amd64
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: Memory modified after free - by whom?

2012-12-10 Thread Adrian Chadd
9216 sounds like a jumbo frame mbuf. So the NIC is writing to an mbuf
after it's finalised/freed.

I have a similar bug showing up on ath(4) RX. :(


adrian


On 10 December 2012 14:22, Garrett Cooper yaneg...@gmail.com wrote:
 I noticed this while checking the logs on one of my test boxes
 after restarting the network. Any idea where I should start looking
 into this (has IPv6 enabled but wasn't using it, em/cxgbe/ixgbe
 interfaces with the ixgbe interfaces lagged previously, but now not)?
 It looks suspiciously like the same size as a jumbo frame, but I'm not
 100% sure if that's the real problem.
 Thanks,
 -Garrett

 Dec 10 14:03:12 wf158 kernel: em0: link state changed to DOWN
 Dec 10 14:03:13 wf158 kernel: ix0: link state changed to DOWN
 Dec 10 14:03:13 wf158 kernel: ix0: link state changed to UP
 Dec 10 14:03:13 wf158 kernel: ix1: link state changed to DOWN
 Dec 10 14:03:13 wf158 kernel: ix1: link state changed to UP
 Dec 10 14:03:13 wf158 kernel: ix0: link state changed to DOWN
 Dec 10 14:03:13 wf158 kernel: ix0: link state changed to UP
 Dec 10 14:03:15 wf158 kernel: em0: link state changed to UP
 Dec 10 14:03:15 wf158 dhclient: New IP Address (em0): 10.7.169.89
 Dec 10 14:03:15 wf158 dhclient: New Subnet Mask (em0): 255.255.240.0
 Dec 10 14:03:15 wf158 dhclient: New Broadcast Address (em0): 10.7.175.255
 Dec 10 14:03:15 wf158 dhclient: New Routers (em0): 10.7.160.1
 Dec 10 14:03:16 wf158 kernel: ix0: link state changed to DOWN
 Dec 10 14:03:16 wf158 kernel: ix0: link state changed to UP
 Dec 10 14:03:31 wf158 kernel: in6_purgeaddr: err=65, destination
 address delete failed
 Dec 10 14:03:31 wf158 dhclient[4539]: My address (10.7.169.89) was
 deleted, dhclient exiting
 Dec 10 14:03:32 wf158 dhclient[4521]: short write: wanted 20 got 0 bytes
 Dec 10 14:03:32 wf158 dhclient[4521]: exiting.
 Dec 10 14:03:33 wf158 kernel: em0: link state changed to DOWN
 Dec 10 14:03:33 wf158 kernel: ix1: link state changed to DOWN
 Dec 10 14:03:34 wf158 kernel: ix1: link state changed to UP
 Dec 10 14:03:34 wf158 kernel: ix1: link state changed to DOWN
 Dec 10 14:03:34 wf158 kernel: ix1: link state changed to UP
 Dec 10 14:03:34 wf158 kernel: ix0: link state changed to DOWN
 Dec 10 14:03:34 wf158 kernel: ix0: link state changed to UP
 Dec 10 14:03:34 wf158 kernel: ix1: link state changed to DOWN
 Dec 10 14:03:34 wf158 kernel: ix1: link state changed to UP
 Dec 10 14:03:36 wf158 kernel: em0: link state changed to UP
 Dec 10 14:03:36 wf158 dhclient: New IP Address (em0): 10.7.169.89
 Dec 10 14:03:36 wf158 dhclient: New Subnet Mask (em0): 255.255.240.0
 Dec 10 14:03:36 wf158 dhclient: New Broadcast Address (em0): 10.7.175.255
 Dec 10 14:03:36 wf158 dhclient: New Routers (em0): 10.7.160.1
 Dec 10 14:05:34 wf158 kernel: Memory modified after free
 0xff81c016d000(9216) val= @ 0xff81c016d000
 Dec 10 14:05:35 wf158 kernel: Memory modified after free
 0xff81b5cdc000(9216) val= @ 0xff81b5cdc000

 # uname -a
 FreeBSD wf158.west.isilon.com 10.0-CURRENT FreeBSD 10.0-CURRENT #1
 r+2760369-dirty: Mon Dec 10 08:04:46 PST 2012
 r...@wf158.west.isilon.com:/usr/obj/usr/src/sys/ISI-GENERIC  amd64
 ___
 freebsd-...@freebsd.org mailing list
 http://lists.freebsd.org/mailman/listinfo/freebsd-net
 To unsubscribe, send any mail to freebsd-net-unsubscr...@freebsd.org
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: Memory modified after free - by whom?

2012-12-10 Thread mdf
On Mon, Dec 10, 2012 at 3:10 PM, Adrian Chadd adr...@freebsd.org wrote:
 9216 sounds like a jumbo frame mbuf. So the NIC is writing to an mbuf
 after it's finalised/freed.

 I have a similar bug showing up on ath(4) RX. :(

Compile with DEBUG_MEMGUARD in the kernel configuration, and then set
vm.memguard.desc to the name of the UMA zone used for the 9216 byte
allocations, mbuf_jumbo_9k.  This should cause a panic when the memory
is touched after free.

Cheers,
matthew

 On 10 December 2012 14:22, Garrett Cooper yaneg...@gmail.com wrote:
 I noticed this while checking the logs on one of my test boxes
 after restarting the network. Any idea where I should start looking
 into this (has IPv6 enabled but wasn't using it, em/cxgbe/ixgbe
 interfaces with the ixgbe interfaces lagged previously, but now not)?
 It looks suspiciously like the same size as a jumbo frame, but I'm not
 100% sure if that's the real problem.
 Thanks,
 -Garrett

 Dec 10 14:03:12 wf158 kernel: em0: link state changed to DOWN
 Dec 10 14:03:13 wf158 kernel: ix0: link state changed to DOWN
 Dec 10 14:03:13 wf158 kernel: ix0: link state changed to UP
 Dec 10 14:03:13 wf158 kernel: ix1: link state changed to DOWN
 Dec 10 14:03:13 wf158 kernel: ix1: link state changed to UP
 Dec 10 14:03:13 wf158 kernel: ix0: link state changed to DOWN
 Dec 10 14:03:13 wf158 kernel: ix0: link state changed to UP
 Dec 10 14:03:15 wf158 kernel: em0: link state changed to UP
 Dec 10 14:03:15 wf158 dhclient: New IP Address (em0): 10.7.169.89
 Dec 10 14:03:15 wf158 dhclient: New Subnet Mask (em0): 255.255.240.0
 Dec 10 14:03:15 wf158 dhclient: New Broadcast Address (em0): 10.7.175.255
 Dec 10 14:03:15 wf158 dhclient: New Routers (em0): 10.7.160.1
 Dec 10 14:03:16 wf158 kernel: ix0: link state changed to DOWN
 Dec 10 14:03:16 wf158 kernel: ix0: link state changed to UP
 Dec 10 14:03:31 wf158 kernel: in6_purgeaddr: err=65, destination
 address delete failed
 Dec 10 14:03:31 wf158 dhclient[4539]: My address (10.7.169.89) was
 deleted, dhclient exiting
 Dec 10 14:03:32 wf158 dhclient[4521]: short write: wanted 20 got 0 bytes
 Dec 10 14:03:32 wf158 dhclient[4521]: exiting.
 Dec 10 14:03:33 wf158 kernel: em0: link state changed to DOWN
 Dec 10 14:03:33 wf158 kernel: ix1: link state changed to DOWN
 Dec 10 14:03:34 wf158 kernel: ix1: link state changed to UP
 Dec 10 14:03:34 wf158 kernel: ix1: link state changed to DOWN
 Dec 10 14:03:34 wf158 kernel: ix1: link state changed to UP
 Dec 10 14:03:34 wf158 kernel: ix0: link state changed to DOWN
 Dec 10 14:03:34 wf158 kernel: ix0: link state changed to UP
 Dec 10 14:03:34 wf158 kernel: ix1: link state changed to DOWN
 Dec 10 14:03:34 wf158 kernel: ix1: link state changed to UP
 Dec 10 14:03:36 wf158 kernel: em0: link state changed to UP
 Dec 10 14:03:36 wf158 dhclient: New IP Address (em0): 10.7.169.89
 Dec 10 14:03:36 wf158 dhclient: New Subnet Mask (em0): 255.255.240.0
 Dec 10 14:03:36 wf158 dhclient: New Broadcast Address (em0): 10.7.175.255
 Dec 10 14:03:36 wf158 dhclient: New Routers (em0): 10.7.160.1
 Dec 10 14:05:34 wf158 kernel: Memory modified after free
 0xff81c016d000(9216) val= @ 0xff81c016d000
 Dec 10 14:05:35 wf158 kernel: Memory modified after free
 0xff81b5cdc000(9216) val= @ 0xff81b5cdc000

 # uname -a
 FreeBSD wf158.west.isilon.com 10.0-CURRENT FreeBSD 10.0-CURRENT #1
 r+2760369-dirty: Mon Dec 10 08:04:46 PST 2012
 r...@wf158.west.isilon.com:/usr/obj/usr/src/sys/ISI-GENERIC  amd64
 ___
 freebsd-...@freebsd.org mailing list
 http://lists.freebsd.org/mailman/listinfo/freebsd-net
 To unsubscribe, send any mail to freebsd-net-unsubscr...@freebsd.org
 ___
 freebsd-current@freebsd.org mailing list
 http://lists.freebsd.org/mailman/listinfo/freebsd-current
 To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: Memory modified after free - by whom?

2012-12-10 Thread Adrian Chadd
On 10 December 2012 15:18,  m...@freebsd.org wrote:
 On Mon, Dec 10, 2012 at 3:10 PM, Adrian Chadd adr...@freebsd.org wrote:
 9216 sounds like a jumbo frame mbuf. So the NIC is writing to an mbuf
 after it's finalised/freed.

 I have a similar bug showing up on ath(4) RX. :(

 Compile with DEBUG_MEMGUARD in the kernel configuration, and then set
 vm.memguard.desc to the name of the UMA zone used for the 9216 byte
 allocations, mbuf_jumbo_9k.  This should cause a panic when the memory
 is touched after free.

Right, but I think its a _hardware_ access after the buffer has been freed..


Adrian
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: Memory modified after free - by whom?

2012-12-10 Thread Garrett Cooper
On Mon, Dec 10, 2012 at 3:21 PM, Adrian Chadd adr...@freebsd.org wrote:
 On 10 December 2012 15:18,  m...@freebsd.org wrote:
 On Mon, Dec 10, 2012 at 3:10 PM, Adrian Chadd adr...@freebsd.org wrote:
 9216 sounds like a jumbo frame mbuf. So the NIC is writing to an mbuf
 after it's finalised/freed.

 I have a similar bug showing up on ath(4) RX. :(

 Compile with DEBUG_MEMGUARD in the kernel configuration, and then set
 vm.memguard.desc to the name of the UMA zone used for the 9216 byte
 allocations, mbuf_jumbo_9k.  This should cause a panic when the memory
 is touched after free.

 Right, but I think its a _hardware_ access after the buffer has been freed..

At least that will give me an idea of who to punt the bug over to
next (assuming it lists the driver) -- one of the network folks, jfv,
or np :). It seems to be a recent change that's causing this (it's
spewing out these warnings every couple seconds), but that might also
be related to me getting lagg working on CURRENT as my last known base
was 9-STABLE and a lot of networking changes haven't been MFCed :).
I could probably look through the code too after compiling it, but
it would take too long.
Thanks!
-Garrett
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: Memory modified after free - by whom?

2012-12-10 Thread Navdeep Parhar
On Mon, Dec 10, 2012 at 05:37:17PM -0800, Garrett Cooper wrote:
 On Mon, Dec 10, 2012 at 3:21 PM, Adrian Chadd adr...@freebsd.org wrote:
  On 10 December 2012 15:18,  m...@freebsd.org wrote:
  On Mon, Dec 10, 2012 at 3:10 PM, Adrian Chadd adr...@freebsd.org wrote:
  9216 sounds like a jumbo frame mbuf. So the NIC is writing to an mbuf
  after it's finalised/freed.
 
  I have a similar bug showing up on ath(4) RX. :(
 
  Compile with DEBUG_MEMGUARD in the kernel configuration, and then set
  vm.memguard.desc to the name of the UMA zone used for the 9216 byte
  allocations, mbuf_jumbo_9k.  This should cause a panic when the memory
  is touched after free.
 
  Right, but I think its a _hardware_ access after the buffer has been freed..
 
 At least that will give me an idea of who to punt the bug over to
 next (assuming it lists the driver) -- one of the network folks, jfv,
 or np :). It seems to be a recent change that's causing this (it's
 spewing out these warnings every couple seconds), but that might also
 be related to me getting lagg working on CURRENT as my last known base
 was 9-STABLE and a lot of networking changes haven't been MFCed :).

If you suspect it's a DMA from the NIC after the 9K cluster has been
freed, see if the corrupt portion looks anything like an Ethernet
frame.  If it does then the DMAC in the frame will tell you who to
follow up with -- jfv@ or me :-)

(btw, your log had val= so I think it's something else..)

Regards,
Navdeep

 I could probably look through the code too after compiling it, but
 it would take too long.
 Thanks!
 -Garrett
 ___
 freebsd-current@freebsd.org mailing list
 http://lists.freebsd.org/mailman/listinfo/freebsd-current
 To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: Memory modified after free - by whom?

2012-12-10 Thread Gleb Smirnoff
On Mon, Dec 10, 2012 at 03:18:45PM -0800, m...@freebsd.org wrote:
m On Mon, Dec 10, 2012 at 3:10 PM, Adrian Chadd adr...@freebsd.org wrote:
m  9216 sounds like a jumbo frame mbuf. So the NIC is writing to an mbuf
m  after it's finalised/freed.
m 
m  I have a similar bug showing up on ath(4) RX. :(
m 
m Compile with DEBUG_MEMGUARD in the kernel configuration, and then set
m vm.memguard.desc to the name of the UMA zone used for the 9216 byte
m allocations, mbuf_jumbo_9k.  This should cause a panic when the memory
m is touched after free.

DEBUG_MEMGUARD doesn't work with cluster zone, I'm afraid it won't work
with mbuf_jumbo_9k, too, but I didn't try this.

The problem is documented in BUGS in memguard(9).


-- 
Totus tuus, Glebius.
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org