Hi,

I have a Supermicro X10 with two I350s that has crashed twice in the last 3 weeks under v4.9.39. It had no crashes before v4.9.39:

$ /sbin/lspci | grep -i ethernet
02:00.0 Ethernet controller: Intel Corporation I350 Gigabit Network Connection (rev 01)
02:00.1 Ethernet controller: Intel Corporation I350 Gigabit Network Connection (rev 01)
04:00.0 Ethernet controller: Intel Corporation I350 Gigabit Network Connection (rev 01)
04:00.1 Ethernet controller: Intel Corporation I350 Gigabit Network Connection (rev 01)

I also have some Supermicro X9s that have not crashed, with a single I350 each, I believe:

$ /sbin/lspci | grep -i ethernet
06:00.0 Ethernet controller: Intel Corporation I350 Gigabit Network Connection (rev 01)
06:00.1 Ethernet controller: Intel Corporation I350 Gigabit Network Connection (rev 01)
06:00.2 Ethernet controller: Intel Corporation I350 Gigabit Network Connection (rev 01)
06:00.3 Ethernet controller: Intel Corporation I350 Gigabit Network Connection (rev 01)

I see in the release notes

https://downloadmirror.intel.com/22919/eng/README.txt

"Do Not Use LRO When Routing Packets."

We are bridging traffic, not routing, and the crashes are in the GRO code. Is it possible there are problems with GRO for bridging in the igb driver now? If I disable GRO, can I have some confidence it will fix the issue?
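
On the question of disabling GRO: the feature listing in this message has the shape of `ethtool -k eth0` output, and it shows generic-receive-offload as on and not [fixed], so it can be switched off at runtime. Here is a minimal sketch, in Python, of checking that state and building the matching `ethtool -K` command. The helper names (`parse_features`, `gro_off_command`) and the shortened `SAMPLE` text are illustrative assumptions, and whether turning GRO off actually avoids these crashes is not established here.

```python
# Shortened sample in the shape of `ethtool -k eth0` output (assumption:
# the full listing in this message was produced by that command).
SAMPLE = """\
Features for eth0:
rx-checksumming: on
generic-segmentation-offload: on
generic-receive-offload: on
large-receive-offload: off [fixed]
"""


def parse_features(text):
    """Map each feature name to an (enabled, fixed) pair."""
    features = {}
    for line in text.splitlines():
        name, sep, value = line.strip().partition(": ")
        if not sep:
            continue  # header line ("Features for eth0:") or blank line
        parts = value.split()
        if not parts or parts[0] not in ("on", "off"):
            continue  # not a feature line
        features[name] = (parts[0] == "on", "[fixed]" in parts)
    return features


def gro_off_command(iface, features):
    """Return the ethtool argv that turns GRO off, or None if not applicable."""
    enabled, fixed = features.get("generic-receive-offload", (False, True))
    if not enabled or fixed:
        return None  # already off, or the driver does not allow changing it
    return ["ethtool", "-K", iface, "gro", "off"]


features = parse_features(SAMPLE)
cmd = gro_off_command("eth0", features)
print(" ".join(cmd) if cmd else "GRO already off or not changeable")
```

The printed command would need to be run as root on each bridged port. A change made this way lasts only until the next reboot or driver reload, so it suits a trial period.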

Here are my offload settings:

Features for eth0:
rx-checksumming: on
tx-checksumming: on
    tx-checksum-ipv4: off [fixed]
    tx-checksum-ip-generic: on
    tx-checksum-ipv6: off [fixed]
    tx-checksum-fcoe-crc: off [fixed]
    tx-checksum-sctp: on
scatter-gather: on
    tx-scatter-gather: on
    tx-scatter-gather-fraglist: off [fixed]
tcp-segmentation-offload: on
    tx-tcp-segmentation: on
    tx-tcp-ecn-segmentation: off [fixed]
    tx-tcp-mangleid-segmentation: off
    tx-tcp6-segmentation: on
udp-fragmentation-offload: off [fixed]
generic-segmentation-offload: on
generic-receive-offload: on
large-receive-offload: off [fixed]
rx-vlan-offload: on
tx-vlan-offload: on
ntuple-filters: off
receive-hashing: on
highdma: on [fixed]
rx-vlan-filter: on [fixed]
vlan-challenged: off [fixed]
tx-lockless: off [fixed]
netns-local: off [fixed]
tx-gso-robust: off [fixed]
tx-fcoe-segmentation: off [fixed]
tx-gre-segmentation: on
tx-gre-csum-segmentation: on
tx-ipxip4-segmentation: on
tx-ipxip6-segmentation: on
tx-udp_tnl-segmentation: on
tx-udp_tnl-csum-segmentation: on
tx-gso-partial: on
tx-sctp-segmentation: off [fixed]
fcoe-mtu: off [fixed]
tx-nocache-copy: off
loopback: off [fixed]
rx-fcs: off [fixed]
rx-all: off
tx-vlan-stag-hw-insert: off [fixed]
rx-vlan-stag-hw-parse: off [fixed]
rx-vlan-stag-filter: off [fixed]
l2-fwd-offload: off [fixed]
busy-poll: off [fixed]
hw-tc-offload: off [fixed]

First crash:

[4083386.299221] ------------[ cut here ]------------
[4083386.299358] WARNING: CPU: 0 PID: 0 at net/ipv4/af_inet.c:1473 inet_gro_complete+0xbb/0xd0
[4083386.299520] Modules linked in: sb_edac edac_core 8021q mrp garp nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack xt_physdev ip6table_filter ip6_tables xen_pciback blktap xen_netback xen_gntdev xen_gntalloc xenfs xen_privcmd xen_evtchn xen_blkback tun sch_htb fuse ext2 ebt_mark ebt_ip ebt_arp ebtable_filter ebtables drbd lru_cache cls_fw br_netfilter bridge stp llc iTCO_wdt iTCO_vendor_support pcspkr raid456 async_raid6_recov async_pq async_xor xor async_memcpy async_tx raid10 raid6_pq libcrc32c joydev shpchp i2c_i801 i2c_smbus mei_me mei lpc_ich fjes ipmi_si ipmi_msghandler acpi_power_meter ioatdma igb dca raid1 mlx4_en mlx4_ib ib_core ptp pps_core mlx4_core mpt3sas scsi_transport_sas raid_class wmi ast ttm
[4083386.300888] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.9.39 #1
[4083386.301002] Hardware name: Supermicro Super Server/X10DRi-LN4+, BIOS 2.0a 09/16/2016
[4083386.301109] ffff880306603d90 ffffffff813f5935 0000000000000000 0000000000000000
[4083386.301221] ffff880306603dd0 ffffffff810a7e01 000005c18174578a ffff8802f94a9a00
[4083386.301333] ffff8802f0824450 0000000000000000 0000000000000040 0000000000000040
[4083386.301445] Call Trace:
[4083386.301483] <IRQ>
[4083386.301519] dump_stack+0x63/0x8e
[4083386.301596] __warn+0xd1/0xf0
[4083386.301665] warn_slowpath_null+0x1d/0x20
[4083386.301747] inet_gro_complete+0xbb/0xd0
[4083386.301830] napi_gro_complete+0x73/0xa0
[4083386.301911] napi_gro_flush+0x5f/0x80
[4083386.301988] napi_complete_done+0x6a/0xb0
[4083386.302075] igb_poll+0x38d/0x720 [igb]
[4083386.302156] ? igb_msix_ring+0x2e/0x40 [igb]
[4083386.302255] ? __handle_irq_event_percpu+0x4b/0x1a0
[4083386.302349] net_rx_action+0x158/0x360
[4083386.302430] __do_softirq+0xd1/0x283
[4083386.302507] irq_exit+0xe9/0x100
[4083386.302580] xen_evtchn_do_upcall+0x35/0x50
[4083386.302665] xen_do_hypervisor_callback+0x1e/0x40
[4083386.302754] <EOI>
[4083386.302787] ? xen_hypercall_sched_op+0xa/0x20
[4083386.302876] ? xen_hypercall_sched_op+0xa/0x20
[4083386.302965] ? xen_safe_halt+0x10/0x20
[4083386.303043] ? default_idle+0x1e/0xd0
[4083386.303122] ? arch_cpu_idle+0xf/0x20
[4083386.303200] ? default_idle_call+0x2c/0x40
[4083386.303284] ? cpu_startup_entry+0x1ac/0x240
[4083386.303370] ? rest_init+0x77/0x80
[4083386.303462] ? start_kernel+0x4a7/0x4b4
[4083386.303568] ? set_init_arg+0x55/0x55
[4083386.303670] ? x86_64_start_reservations+0x24/0x26
[4083386.303776] ? xen_start_kernel+0x555/0x561
[4083386.303873] ---[ end trace 8294f59ced689507 ]---
[4083386.303958] general protection fault: 0000 [#1] SMP
[4083386.304041] Modules linked in: sb_edac edac_core 8021q mrp garp nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack xt_physdev ip6table_filter ip6_tables xen_pciback blktap xen_netback xen_gntdev xen_gntalloc xenfs xen_privcmd xen_evtchn xen_blkback tun sch_htb fuse ext2 ebt_mark ebt_ip ebt_arp ebtable_filter ebtables drbd lru_cache cls_fw br_netfilter bridge stp llc iTCO_wdt iTCO_vendor_support pcspkr raid456 async_raid6_recov async_pq async_xor xor async_memcpy async_tx raid10 raid6_pq libcrc32c joydev shpchp i2c_i801 i2c_smbus mei_me mei lpc_ich fjes ipmi_si ipmi_msghandler acpi_power_meter ioatdma igb dca raid1 mlx4_en mlx4_ib ib_core ptp pps_core mlx4_core mpt3sas scsi_transport_sas raid_class wmi ast ttm
[4083386.305179] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G W 4.9.39 #1
[4083386.305307] Hardware name: Supermicro Super Server/X10DRi-LN4+, BIOS 2.0a 09/16/2016
[4083386.305414] task: ffffffff81e0e540 task.stack: ffffffff81e00000
[4083386.305498] RIP: e030: skb_release_data+0x73/0xf0
[4083386.305617] RSP: e02b:ffff880306603d90 EFLAGS: 00010206
[4083386.305692] RAX: 0000000000000030 RBX: f5b36db76bd162c7 RCX: ffffffff81e60048
[4083386.305790] RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff8802f94a9a00
[4083386.305887] RBP: ffff880306603db0 R08: 0000000000004277 R09: 0000000000000000
[4083386.305985] R10: 0000000000000005 R11: 0000000000000002 R12: 0000000000000000
[4083386.306083] R13: ffff8802f94a9a00 R14: ffff88032f527740 R15: 0000000000000040
[4083386.306186] FS: 0000000000000000(0000) GS:ffff880306600000(0000) knlGS:0000000000000000
[4083386.306296] CS: e033 DS: 0000 ES: 0000 CR0: 0000000080050033
[4083386.306407] CR2: 0000000001692ed8 CR3: 000000022b3c9000 CR4: 0000000000042660
[4083386.306505] Stack:
[4083386.306537] ffff8802f94a9a00 ffff8802f94a9a00 ffffffff8175ac3e 0000000000000040
[4083386.306649] ffff880306603dc8 ffffffff81745764 ffff8802f94a9a00 ffff880306603df0
[4083386.306762] ffffffff817457c2 ffff8802f94a9a00 ffff8802f0824450 0000000000000000
[4083386.306874] Call Trace:
[4083386.306911] <IRQ>
[4083386.306944] ? napi_gro_complete+0x5e/0xa0
[4083386.307038] skb_release_all+0x24/0x30
[4083386.307133] kfree_skb+0x32/0x90
[4083386.307206] napi_gro_complete+0x5e/0xa0
[4083386.307287] napi_gro_flush+0x5f/0x80
[4083386.307365] napi_complete_done+0x6a/0xb0
[4083386.307449] igb_poll+0x38d/0x720 [igb]
[4083386.307530] ? igb_msix_ring+0x2e/0x40 [igb]
[4083386.307617] ? __handle_irq_event_percpu+0x4b/0x1a0
[4083386.307720] net_rx_action+0x158/0x360
[4083386.307800] __do_softirq+0xd1/0x283
[4083386.307877] irq_exit+0xe9/0x100
[4083386.307949] xen_evtchn_do_upcall+0x35/0x50
[4083386.308034] xen_do_hypervisor_callback+0x1e/0x40
[4083386.308124] <EOI>
[4083386.308156] ? xen_hypercall_sched_op+0xa/0x20
[4083386.308246] ? xen_hypercall_sched_op+0xa/0x20
[4083386.308334] ? xen_safe_halt+0x10/0x20
[4083386.308413] ? default_idle+0x1e/0xd0
[4083386.308491] ? arch_cpu_idle+0xf/0x20
[4083386.308568] ? default_idle_call+0x2c/0x40
[4083386.308651] ? cpu_startup_entry+0x1ac/0x240
[4083386.308737] ? rest_init+0x77/0x80
[4083386.308811] ? start_kernel+0x4a7/0x4b4
[4083386.308890] ? set_init_arg+0x55/0x55
[4083386.308968] ? x86_64_start_reservations+0x24/0x26
[4083386.309060] ? xen_start_kernel+0x555/0x561
[4083386.309144] Code: f0 41 0f c1 46 20 39 c2 74 09 5b 41 5c 41 5d 41 5e 5d c3 45 31 e4 41 80 3e 00 74 39 49 63 c4 48 83 c0 03 48 c1 e0 04 49 8b 1c 06 <48> 8b 43 20 a8 01 75 6f f0 ff 4b 1c 74 55 48 8b 03 48 c1 e8 33
[4083386.309571] RIP skb_release_data+0x73/0xf0
[4083386.309658] RSP <ffff880306603d90>
[4083386.313000] ---[ end trace 8294f59ced689508 ]---
[4083386.389667] Kernel panic - not syncing: Fatal exception in interrupt
[4083386.389791] Kernel Offset: disabled
(XEN) Hardware Dom0 crashed: rebooting machine in 5 seconds.

Second crash:

[1838269.012349] general protection fault: 0000 [#1] SMP
[1838269.012452] Modules linked in: ebtable_nat sb_edac edac_core 8021q mrp garp nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack xt_physdev ip6table_filter ip6_tables xen_pciback blktap xen_netback xen_gntdev xen_gntalloc xenfs xen_privcmd xen_evtchn xen_blkback tun sch_htb fuse ext2 ebt_mark ebt_ip ebt_arp ebtable_filter ebtables drbd lru_cache cls_fw br_netfilter bridge stp llc iTCO_wdt iTCO_vendor_support pcspkr raid456 async_raid6_recov async_pq async_xor xor async_memcpy async_tx raid10 raid6_pq libcrc32c joydev i2c_i801 i2c_smbus lpc_ich shpchp mei_me mei fjes ipmi_si ipmi_msghandler acpi_power_meter ioatdma igb dca raid1 mlx4_en mlx4_ib ib_core ptp pps_core mlx4_core mpt3sas scsi_transport_sas raid_class wmi ast ttm
[1838269.013521] CPU: 1 PID: 18 Comm: ksoftirqd/1 Not tainted 4.9.39 #1
[1838269.013637] Hardware name: Supermicro Super Server/X10DRi-LN4+, BIOS 2.0a 09/16/2016
[1838269.013743] task: ffff88030008c4c0 task.stack: ffffc90041978000
[1838269.013826] RIP: e030: memcpy_erms+0x6/0x10
[1838269.013952] RSP: e02b:ffffc9004197bac0 EFLAGS: 00010202
[1838269.014026] RAX: ffff88032fcafe16 RBX: 0000000000000004 RCX: 0000000000000004
[1838269.014124] RDX: 0000000000000004 RSI: 62a16ddedc6dbcb3 RDI: ffff88032fcafe16
[1838269.014222] RBP: ffffc9004197bb20 R08: 0000000000000004 R09: 0000000000000004
[1838269.014320] R10: ffff88026ae89500 R11: 0000000044639632 R12: 0000000000000048
[1838269.014417] R13: 0000000000000000 R14: 0000000044639632 R15: 0000000000000048
[1838269.014519] FS: 0000000000000000(0000) GS:ffff880306640000(0000) knlGS:ffff880306640000
[1838269.014629] CS: e033 DS: 0000 ES: 0000 CR0: 0000000080050033
[1838269.014709] CR2: ffffffffff600400 CR3: 0000000051939000 CR4: 0000000000042660
[1838269.014808] Stack:
[1838269.014840] ffffffff81744c17 ffff88026ae89500 0000000044639632 ffff88030008c4c0
[1838269.014952] ffffffff00000004 0000000000000004 ffff88032fcafe16 ffff88026ae89500
[1838269.015064] 0000000000000004 0000000000000004 000000000000004c 0000000000000028
[1838269.015176] Call Trace:
[1838269.015217] ? skb_copy_bits+0x137/0x2c0
[1838269.015299] __pskb_pull_tail+0x7f/0x3b0
[1838269.015382] tcp_gro_receive+0x2c5/0x300
[1838269.015465] tcp6_gro_receive+0x13a/0x1a0
[1838269.015547] ipv6_gro_receive+0x1c6/0x380
[1838269.015630] dev_gro_receive+0x269/0x3b0
[1838269.015712] napi_gro_receive+0x38/0xf0
[1838269.015796] igb_clean_rx_irq+0x38e/0x690 [igb]
[1838269.015886] igb_poll+0x362/0x720 [igb]
[1838269.015968] ? dequeue_entity+0x26e/0xa90
[1838269.016051] ? xen_mc_flush+0x17b/0x1b0
[1838269.016131] net_rx_action+0x158/0x360
[1838269.016212] __do_softirq+0xd1/0x283
[1838269.016290] ? sort_range+0x30/0x30
[1838269.016366] run_ksoftirqd+0x29/0x50
[1838269.016443] smpboot_thread_fn+0x110/0x160
[1838269.016525] kthread+0xd7/0xf0
[1838269.016595] ? kthread_park+0x60/0x60
[1838269.016673] ret_from_fork+0x25/0x30
[1838269.016758] Code: ff 90 90 90 90 eb 1e 0f 1f 00 48 89 f8 48 89 d1 48 c1 e9 03 83 e2 07 f3 48 a5 89 d1 f3 a4 c3 66 0f 1f 44 00 00 48 89 f8 48 89 d1 <f3> a4 c3 0f 1f 80 00 00 00 00 48 89 f8 48 83 fa 20 72 7e 40 38
[1838269.017183] RIP memcpy_erms+0x6/0x10
[1838269.017264] RSP <ffffc9004197bac0>
[1838269.020618] ---[ end trace 3506ce1d7200529a ]---
[1838269.079891] Kernel panic - not syncing: Fatal exception in interrupt
[1838269.080014] Kernel Offset: disabled
(XEN) Hardware Dom0 crashed: rebooting machine in 5 seconds.

Thanks,
Sarah

_______________________________________________
E1000-devel mailing list
E1000-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/e1000-devel