Hi Todd, I figured out that the skb_over_panic only happens when the kernel is built with CONFIG_CMA=y. Disabling CMA in the kernel configuration fixes the problem.
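For anyone reading the archive later: the panic in the backtrace quoted below fires from the bounds check in skb_put(), which refuses to extend the data area past the end of the buffer. What follows is only a rough userspace sketch of that check, not the kernel code. The names fake_skb and fake_skb_put are made up for illustration, and any buffer size used with it is an assumption; only len=46969 comes from the backtrace.

```c
#include <stddef.h>

/* Hypothetical stand-in for the few sk_buff fields the check looks at. */
struct fake_skb {
    size_t tail; /* offset just past the data written so far */
    size_t end;  /* offset just past the allocated buffer */
    size_t len;  /* bytes of data currently in the buffer */
};

/*
 * Simplified model of the skb_put() bounds check: extend the data area
 * by len bytes, but only if the new tail stays within the buffer.
 * The kernel advances tail first and calls skb_over_panic() when
 * tail > end; this sketch checks up front and returns -1 instead,
 * leaving the buffer untouched. Returns 0 on success.
 */
int fake_skb_put(struct fake_skb *skb, size_t len)
{
    if (len > skb->end - skb->tail)
        return -1; /* the kernel would panic here */
    skb->tail += len;
    skb->len += len;
    return 0;
}
```

With an assumed 2048-byte receive buffer, a 1500-byte put succeeds, while a put of 46969 bytes (the length in frame #2) is rejected. A length that large suggests the driver read a bogus value from the receive descriptor, though that is my guess rather than something confirmed in this thread.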
Regards,

On Tue, Jan 21, 2014 at 6:17 PM, Fujinaka, Todd <todd.fujin...@intel.com> wrote:
> Looks like you're using custom hardware from Doremi Labs. There's a lot of
> questions we need to ask about the hardware setup as well as the software.
> Can you file a bug on sourceforge? Also, have you tried any other kernels
> with success?
>
> Thanks.
>
> Todd Fujinaka
> Software Application Engineer
> Networking Division (ND)
> Intel Corporation
> todd.fujin...@intel.com
> (503) 712-4565
>
>
> -----Original Message-----
> From: linux-nics-boun...@isotope.jf.intel.com [mailto:linux-nics-boun...@isotope.jf.intel.com] On Behalf Of Martin Boutin
> Sent: Tuesday, January 21, 2014 8:44 AM
> To: e1000-devel@lists.sourceforge.net
> Cc: Linux NICS
> Subject: [linux-nics] e1000e: skb_over_panic on skb_put at e1000_clean_rx_irq
>
> Hello all,
>
> Linux (stable) 3.10.27 built from the git sources crashes right at boot with
> a "skb_over_panic" error. This only happens during boot (probably related to
> the sysvinit sequence), it does not happen when I modprobe the driver. It
> seems to be some race condition because sometimes I do not hit the bug.
>
> I attached gdb to the kernel via serial, here is a meaningful backtrace:
> (gdb) bt
> #0  skb_panic (skb=skb@entry=0xe874f200, sz=<optimized out>, addr=<optimized out>, msg=msg@entry=0xc1475e3c "skb_over_panic") at net/core/skbuff.c:126
> #1  0xc13851db in skb_over_panic (addr=<optimized out>, sz=<optimized out>, skb=0xe874f200) at net/core/skbuff.c:131
> #2  skb_put (skb=skb@entry=0xe874f200, len=len@entry=46969) at net/core/skbuff.c:1283
> #3  0xc12f7e7e in e1000_clean_rx_irq (rx_ring=0xe8f11340, work_done=0xe8cfdf84, work_to_do=64) at drivers/net/ethernet/intel/e1000e/netdev.c:1011
> #4  0xc12fd5c7 in e1000e_poll (napi=0xe8e78788, weight=64) at drivers/net/ethernet/intel/e1000e/netdev.c:2654
> #5  0xc138edb9 in net_rx_action (h=<optimized out>) at net/core/dev.c:4197
> #6  0xc10302e9 in __do_softirq () at kernel/softirq.c:253
> #7  0xc10030ec in call_on_stack (stack=0xe8cbfecc, func=<optimized out>) at arch/x86/kernel/irq_32.c:71
> #8  do_softirq () at arch/x86/kernel/irq_32.c:173
> #9  0xc1030466 in invoke_softirq () at kernel/softirq.c:342
> #10 irq_exit () at kernel/softirq.c:376
> #11 0xc1002fa5 in do_IRQ (regs=<optimized out>) at arch/x86/kernel/irq.c:198
> #12 <signal handler called>
> #13 0xc1367d19 in cpuidle_enter_state (dev=dev@entry=0xeafe6600, drv=drv@entry=0xc1574420, index=index@entry=8) at drivers/cpuidle/cpuidle.c:92
> #14 0xc1367e03 in cpuidle_idle_call () at drivers/cpuidle/cpuidle.c:157
> #15 0xc1008485 in arch_cpu_idle () at arch/x86/kernel/process.c:301
> #16 0xc1054b4f in cpu_idle_loop () at kernel/cpu/idle.c:98
> #17 cpu_startup_entry (state=state@entry=CPUHP_ONLINE) at kernel/cpu/idle.c:133
> #18 0xc15c6938 in start_secondary (unused=<optimized out>) at arch/x86/kernel/smpboot.c:287
> #19 0x00000000 in ?? ()
>
> The relevant section of lspci:
> lspci -t -v
> -[0000:00]-+-1c.0-[01]----00.0  Intel Corporation 82574L Gigabit Network Connection
>
> lspci -s 01:00.0 -vvv
> 01:00.0 Ethernet controller: Intel Corporation 82574L Gigabit Network Connection
>     Subsystem: Intel Corporation Device 0000
>     Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
>     Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
>     Latency: 0, Cache Line Size: 64 bytes
>     Interrupt: pin A routed to IRQ 16
>     Region 0: Memory at b4600000 (32-bit, non-prefetchable) [size=128K]
>     Region 2: I/O ports at 7000 [size=32]
>     Region 3: Memory at b4620000 (32-bit, non-prefetchable) [size=16K]
>     Capabilities: [c8] Power Management version 2
>         Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold-)
>         Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=1 PME-
>     Capabilities: [d0] MSI: Enable- Count=1/1 Maskable- 64bit+
>         Address: 0000000000000000  Data: 0000
>     Capabilities: [e0] Express (v1) Endpoint, MSI 00
>         DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s <512ns, L1 <64us
>             ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
>         DevCtl: Report errors: Correctable+ Non-Fatal+ Fatal+ Unsupported+
>             RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
>             MaxPayload 128 bytes, MaxReadReq 512 bytes
>         DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
>         LnkCap: Port #0, Speed 2.5GT/s, Width x1, ASPM L0s L1, Latency L0 <128ns, L1 <64us
>             ClockPM- Surprise- LLActRep- BwNot-
>         LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk+
>             ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
>         LnkSta: Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
>     Capabilities: [a0] MSI-X: Enable+ Count=5 Masked-
>         Vector table: BAR=3 offset=00000000
>         PBA: BAR=3 offset=00002000
>     Capabilities: [100 v1] Advanced Error Reporting
>         UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
>         UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
>         UESvrt: DLP+ SDES- TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
>         CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
>         CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
>         AERCap: First Error Pointer: 00, GenCap- CGenEn- ChkCap- ChkEn-
>     Capabilities: [140 v1] Device Serial Number 00-11-1f-ff-ff-00-01-11
>     Kernel driver in use: e1000e
>
> The relevant section of dmesg when the kernel does not crash:
> (...)
> pci 0000:01:00.0: [8086:10d3] type 00 class 0x020000
> pci 0000:01:00.0: reg 10: [mem 0xb4600000-0xb461ffff]
> pci 0000:01:00.0: reg 18: [io 0x7000-0x701f]
> pci 0000:01:00.0: reg 1c: [mem 0xb4620000-0xb4623fff]
> pci 0000:01:00.0: PME# supported from D0 D3hot
> pci 0000:00:1c.0: PCI bridge to [bus 01]
> pci 0000:00:1c.0:   bridge window [io 0x7000-0x7fff]
> pci 0000:00:1c.0:   bridge window [mem 0xb4600000-0xb46fffff]
> (...)
> e1000e: Intel(R) PRO/1000 Network Driver - 2.3.2-k
> e1000e: Copyright(c) 1999 - 2013 Intel Corporation.
> e1000e 0000:01:00.0: Disabling ASPM L0s L1
> e1000e 0000:01:00.0: Interrupt Throttling Rate (ints/sec) set to dynamic conservative mode
> e1000e 0000:01:00.0: irq 63 for MSI/MSI-X
> e1000e 0000:01:00.0: irq 64 for MSI/MSI-X
> e1000e 0000:01:00.0: irq 65 for MSI/MSI-X
> ACPI Warning: 0x00008040-0x0000805f SystemIO conflicts with Region \_SB_.PCI0.SBUS.SMBI 1 (20130328/utaddress-251)
> ACPI: If an ACPI driver is available for this device, you should use it instead of the native driver
> e1000e 0000:01:00.0 eth0: registered PHC clock
> e1000e 0000:01:00.0 eth0: (PCI Express:2.5GT/s:Width x1) 00:11:1f:00:01:11
> e1000e 0000:01:00.0 eth0: Intel(R) PRO/1000 Network Connection
> e1000e 0000:01:00.0 eth0: MAC: 3, PHY: 8, PBA No: FFFFFF-0FF
>
> Regards,
> --
> Martin Boutin
> _______________________________________________
> Linux-nics mailing list
> linux-n...@intel.com

--
Martin Boutin

_______________________________________________
E1000-devel mailing list
E1000-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/e1000-devel
To learn more about Intel® Ethernet, visit http://communities.intel.com/community/wired