Using the 3.7.21 version of the ixgbe driver we can reliably produce a crash with this signature:

BUG: unable to handle kernel NULL pointer dereference at 000000000000006c
IP: [<ffffffffa005afef>] ixgbe_poll+0x9df/0x1710 [ixgbe]
PGD 814c7b067 PUD 8074dd067 PMD 0
Oops: 0000 [#1] SMP
last sysfs file: /sys/devices/virtual/bypass/8-9/ping_watchdog
CPU 2
Pid: 18925, comm: sport Tainted: P ---------------- 2.6.32-perf #1 To Be Filled By O.E.M.
RIP: 0010:[<ffffffffa005afef>] [<ffffffffa005afef>] ixgbe_poll+0x9df/0x1710 [ixgbe]
RSP: 0018:ffff88080750b8b0 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff88040f816f00 RCX: 0000000000000000
RDX: 0000000000000020 RSI: ffffc9000429c000 RDI: ffff88040f891d80
RBP: ffff88080750b970 R08: 0000000000000100 R09: 0000000000000000
R10: 0000000000000100 R11: ffff88080750bfd8 R12: 0000000000000000
R13: ffffc900041221b8 R14: ffff8804077580b0 R15: 000000000000000b
FS: 00007f61ccda9700(0000) GS:ffff880028280000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000000000006c CR3: 0000000814436000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process sport (pid: 18925, threadinfo ffff88080750a000, task ffff880814beeb60)
Stack:
 0000000000000000 ffff8804148b0540 000000000000000e ffff88040703d1c0
<0> ffff8804148b0598 ffff88080750b918 ffffffff815671ac 00000001000359c2
<0> ffff880410744700 0000000000000040 ffff88040f891d80 0000004011087c9c
Call Trace:
 [<ffffffff815671ac>] ? ip_finish_output+0x13c/0x310
 [<ffffffff8152b468>] net_rx_action+0xb8/0x400
 [<ffffffff81517a84>] ? sock_def_readable+0x44/0x80
 [<ffffffff81066a91>] __do_softirq+0xc1/0x1d0
 [<ffffffff8100c1ec>] call_softirq+0x1c/0x30
 [<ffffffff8100de25>] do_softirq+0x65/0xa0
 [<ffffffff8106699a>] local_bh_enable+0x9a/0xb0
 [<ffffffff815176fc>] lock_sock_nested+0xac/0xc0
 [<ffffffff81641f0b>] ? _spin_unlock_bh+0x1b/0x20
 [<ffffffff81517627>] ? release_sock+0xd7/0x100
 [<ffffffff81571838>] tcp_recvmsg+0x38/0xe80
 [<ffffffff812d4c19>] ? cpumask_next_and+0x29/0x50
 [<ffffffff8104b6f4>] ? find_busiest_group+0x244/0xb10
 [<ffffffff810544d2>] ? default_wake_function+0x12/0x20
 [<ffffffff81516cf9>] sock_common_recvmsg+0x39/0x50
 [<ffffffff81516829>] sock_aio_read+0x159/0x160
 [<ffffffff8104dbd3>] ? perf_event_task_sched_out+0x33/0x80
 [<ffffffff810097ac>] ? __switch_to+0x1ac/0x320
 [<ffffffff815166d0>] ? sock_aio_read+0x0/0x160
 [<ffffffff811533bb>] do_sync_readv_writev+0xfb/0x140
 [<ffffffff810853b0>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff811543df>] do_readv_writev+0xcf/0x1f0
 [<ffffffff8156dc0d>] ? do_tcp_getsockopt+0x3d/0x5f0
 [<ffffffff81012879>] ? read_tsc+0x9/0x20
 [<ffffffff8108fc13>] ? ktime_get+0x63/0xe0
 [<ffffffff810650c2>] ? ns_to_timeval+0x12/0x40
 [<ffffffff810896af>] ? hrtimer_get_remaining+0x3f/0x50
 [<ffffffff811546d3>] vfs_readv+0x43/0x60
 [<ffffffff811547d1>] sys_readv+0x51/0x80
 [<ffffffff8100b132>] system_call_fastpath+0x16/0x1b
Code: c1 e5 03 4c 03 6b 20 4d 8b 65 00 49 c7 45 00 00 00 00 00 0f ae e8 48 8b 53 28 31 c0 f6 c2 10 74 0a 41 f7 06 00 00 1e 00 0f 95 c0 <41> 8b 74 24 6c 49 8b 8c 24 b0 01 00 00 85 f6 0f 18 09 0f 85 c0
RIP [<ffffffffa005afef>] ixgbe_poll+0x9df/0x1710 [ixgbe]
 RSP <ffff88080750b8b0>
CR2: 000000000000006c
---[ end trace 9db4623b9591cd54 ]---

addr2line says this is happening on line 2028 below - so a NULL skb pointer is being passed to skb_is_nonlinear():

1990 static bool ixgbe_clean_rx_irq_ps(struct ixgbe_q_vector *q_vector,
1991                                   struct ixgbe_ring *rx_ring,
1992                                   int budget)
1993 {
.....
2021         rmb();
2022
2023         pkt_is_rsc = ixgbe_get_rsc_state(rx_ring, rx_desc);
2024
2025         prefetch(skb->data);
2026
2027         /* pull the header of the skb in if no data is already present */
2028         if (!skb_is_nonlinear(skb)) {
2029                 __skb_put(skb, ixgbe_get_hlen(rx_ring, rx_desc));

Anyone have a guess as to the cause? Or have you seen similar?

One good clue that we've found is that the problem disappears if we turn off irq balancing.

--
Arthur

_______________________________________________
E1000-devel mailing list
E1000-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/e1000-devel