Hi,
We are seeing kernel hangs (hard reset required) under high interrupt
load (triggered by ix(4) cards). I've traced this back to an overflow
of the kernel stack. This is how the stack looks at the time of the
overflow (manual backtrace, as ddb won't show the final part):
EBP (%EBP) 4(%EBP) function
0xe13b825c: 0xe13b827c 0xd02d8b1d panic
0xe13b827c: 0xe13b82ac 0xd02d1a42 timeout_add
0xe13b82ac: 0xe13b82dc 0xd02bfd6e hardclock
0xe13b82dc: 0xe13b82fc 0xd04dddca lapic_clockintr
0xe13b82fc: 0xe13b8304 0xd0202475 Xintrltimer
/* OLD EBP at ebp+0x20 (9 words) OLD EIP at ebp+0x3c (15 words) */
EBP=0xe13b8384 EIP=0xd02d7c31 pool_do_put
0xe13b8384: 0xe13b83a4 0xd02d7b6f pool_put
0xe13b83a4: 0xe13b83d4 0xd02e7f26 m_free
0xe13b83d4: 0xe13b83f4 0xd02e7fa9 m_freem
0xe13b83f4: 0xe13b8434 0xd030a492 ether_input
0xe13b8434: 0xe13b8474 0xd043f2cf ixgbe_rxeof
0xe13b8474: 0xe13b84b4 0xd043c7ed ixgbe_legacy_irq
0xe13b84b4: 0xe13b84bc 0xd02027ee Xintr_ioapic1
EBP=0xe13b9e10 EIP=0xd0202128 Xdoreti (before iret)
0xe13b9e10: 0xe13b9e40 0xd028bd27 pf_pull_hdr
0xe13b9e40: 0xe13b9f40 0xd028d570 pf_test
0xe13b9f40: 0xe13b9f80 0xd03478b4 ipv4_input
0xe13b9f80: 0xe13b9fa0 0xd0347765 ipintr
0xe13b9fa0: 0xe13b9fa0 0xd0202182 Xsoftnet
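The distance between the Xintr_ioapic1 line and the pf_pull_hdr line can be
checked directly from the EBP values in the backtrace above. A minimal
sketch in C follows; the helper names stack_gap and max_nested_frames are
made up for illustration, and the 12 bytes per frame assume the CPU pushes
only EFLAGS, CS and EIP (interrupt taken without a privilege change):

```c
#include <assert.h>
#include <stdint.h>

/* EFLAGS, CS and EIP, 4 bytes each (assumption: no privilege change). */
#define IRET_FRAME_BYTES 12u

/* Distance in bytes between two frame pointers on a downward-growing stack. */
static uint32_t
stack_gap(uint32_t outer_ebp, uint32_t inner_ebp)
{
	assert(outer_ebp >= inner_ebp);
	return outer_ebp - inner_ebp;
}

/* Upper bound on the number of 12-byte interrupt frames that fit in gap. */
static uint32_t
max_nested_frames(uint32_t gap)
{
	return gap / IRET_FRAME_BYTES;
}
```

With the values from the backtrace, stack_gap(0xe13b9e10, 0xe13b84b4) gives
0x195c (6492) bytes, which leaves room for at most 541 such frames. This is
only an upper bound, as some of the gap is presumably ordinary trap frame
contents.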
Note that the gap between the Xintr_ioapic1 frame and the pf_pull_hdr frame
is huge (~6k). This area is filled with a repeating 12-byte pattern that
looks like a code address, the kernel code segment selector and a pushed
eflags register. The code address points to the interrupt return path in
this piece of code from i386/locore.s:
#define INTRFASTEXIT \
        popl %fs ; \
        popl %gs ; \
        popl %es ; \
        popl %ds ; \
        popl %edi ; \
        popl %esi ; \
        popl %ebp ; \
        popl %ebx ; \
        popl %edx ; \
        popl %ecx ; \
        popl %eax ; \
        sti ; \                 <===== (1)
        addl $8,%esp ; \
        iret                    <===== (2)
The return address points to the iret marked with (2), i.e. we get hit by
another interrupt immediately before the iret, and this happens repeatedly
until the kernel stack overflows. This is only possible due to the sti
instruction at the point marked (1). As iret will restore the pushed eflags
value anyway, it should be safe to remove the sti altogether. Discussed with
bluhm@ and hshoexer@. Patch follows:
Index: locore.s
===================================================================
RCS file: /cvs/src/sys/arch/i386/i386/locore.s,v
retrieving revision 1.130
diff -u -r1.130 locore.s
--- locore.s 3 Jul 2010 04:54:32 -0000 1.130
+++ locore.s 18 Apr 2011 13:52:16 -0000
@@ -128,7 +128,6 @@
popl %edx ; \
popl %ecx ; \
popl %eax ; \
- sti ; \
addl $8,%esp ; \
iret
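To illustrate why the sti makes a difference, here is a toy model in C. It
is not kernel code and only a sketch under simplifying assumptions: every
pending interrupt is assumed to arrive exactly in the window between sti and
iret, each interrupt is assumed to push a 12-byte frame, and the function
name peak_stack_use and the handler_bytes parameter are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* EFLAGS, CS and EIP pushed by the CPU (assumption: no privilege change). */
#define IRET_FRAME_BYTES 12u

/*
 * Peak stack use in bytes when `pending` interrupts are delivered back to
 * back and each handler temporarily needs `handler_bytes` of stack.
 */
static size_t
peak_stack_use(size_t pending, size_t handler_bytes, bool sti_before_iret)
{
	size_t depth = 0, peak = 0;

	/* Keep the arithmetic below from wrapping around. */
	assert(pending <= (SIZE_MAX - handler_bytes) / IRET_FRAME_BYTES);

	for (size_t i = 0; i < pending; i++) {
		depth += IRET_FRAME_BYTES;	/* CPU pushes the iret frame */
		if (depth + handler_bytes > peak)
			peak = depth + handler_bytes;
		/*
		 * Without the sti, iret pops the frame and re-enables
		 * interrupts in one step, so the next interrupt starts from
		 * the original stack level. With the sti, the next interrupt
		 * arrives while the frame is still on the stack.
		 */
		if (!sti_before_iret)
			depth -= IRET_FRAME_BYTES;
	}
	return peak;
}
```

In this model, peak_stack_use(541, 64, true) grows linearly with the number
of interrupts (6556 bytes), while peak_stack_use(541, 64, false) stays at 76
bytes no matter how many interrupts are pending. The values 541 and 64 are
illustrative only.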
The sti was introduced in revision 1.97 of locore.s in March 2006 by
mickey@. Commit message:
| prevent the faults on iret to run w/ disabled intrs and cause
| deadlocks; niklas toby tom ok
Maybe mickey or one of the people who gave oks back then wants to comment?
regards
Christian