Hello list,
A machine we recently put into service is showing what we presume to be
Ethernet-related problems. The host is a Supermicro SYS-1028U-TNRT+
barebone with 256GB of ECC RDIMM and 2x Intel Xeon E5-2660 v4 CPUs (24
cores, HT disabled, BIOS dated 08/09/2016). It is connected to a 1GBit
switchport via one of its on-board X540-AT2 ports (negotiated PCIe link
properties: Speed 5GT/s, Width x8). The machine's CPU-normalized load
is about 1, so it is quite busy.
Additional software/firmware info and NIC stats:
# uname -a
Linux inject 4.9.0-0.bpo.1-amd64 #1 SMP Debian 4.9.2-2~bpo8+1 (2017-01-26) x86_64 GNU/Linux
# ethtool -i eth0
driver: ixgbe
version: 4.4.0-k
firmware-version: 0x800003e2
bus-info: 0000:01:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: no
# ethtool -S eth0 | grep -v ' 0$'
NIC statistics:
rx_packets: 273710944
tx_packets: 398971152
rx_bytes: 313480861463
tx_bytes: 470304591176
rx_pkts_nic: 273710875
tx_pkts_nic: 398971010
rx_bytes_nic: 314575702117
tx_bytes_nic: 471900519485
lsc_int: 5
rx_dropped: 56473
multicast: 58774
broadcast: 195115
fdir_match: 273920501
fdir_miss: 139668
fdir_overflow: 22
tx_timeout_count: 4
tx_restart_queue: 3
[omitting lines that merely detail [rt]x_queue_\d+_{bytes,packets} counters]
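For a sense of scale, the rx_dropped counter can be put in relation to
rx_packets. The following is only a rough sketch: the function name
rx_drop_percent is made up for illustration, and it assumes the
"name: value" layout that ethtool -S prints above.

```shell
# Sketch only: print rx_dropped as a percentage of rx_packets.
# Reads `ethtool -S`-style "name: value" lines on stdin.
# rx_drop_percent is a hypothetical helper name, not part of ethtool.
rx_drop_percent() {
    awk -F': *' '
        { gsub(/^[ \t]+/, "", $1) }              # strip leading indentation from the counter name
        $1 == "rx_packets" { pkts = $2 + 0 }
        $1 == "rx_dropped" { drop = $2 + 0 }
        END {
            if (pkts > 0) {
                printf "%.4f\n", 100 * drop / pkts
            } else {
                print "n/a"                      # rx_packets missing or zero
            }
        }'
}

# Counter values copied from the statistics above.
printf 'rx_packets: 273710944\nrx_dropped: 56473\n' | rx_drop_percent
```

On the live host this would be fed from `ethtool -S eth0 | rx_drop_percent`.
With the counters above it works out to roughly 0.02% of received
packets being dropped.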
Relevant kernel ring buffer (dmesg) contents:
[40807.952873] ------------[ cut here ]------------
[40807.952921] WARNING: CPU: 18 PID: 15921 at /home/zumbi/linux-4.9.2/net/sched/sch_generic.c:316 dev_watchdog+0x220/0x230
[40807.952959] NETDEV WATCHDOG: eth0 (ixgbe): transmit queue 0 timed out
[40807.952983] Modules linked in: tcp_diag inet_diag netconsole configfs
ipmi_watchdog ast ttm drm_kms_helper drm i2c_algo_bit iTCO_wdt
iTCO_vendor_support intel_rapl sb_edac edac_core x86_pkg_temp_thermal
intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul
crc32_pclmul ghash_clmulni_intel intel_cstate intel_uncore pcspkr evdev
joydev mei_me i2c_i801 lpc_ich intel_rapl_perf i2c_smbus mei ioatdma
mfd_core shpchp wmi tpm_tis tpm_tis_core tpm acpi_power_meter acpi_pad
button ipmi_si ipmi_poweroff ipmi_devintf ipmi_msghandler autofs4 ext4
crc16 jbd2 fscrypto mbcache raid0 raid1 md_mod hid_generic usbhid hid sg
sd_mod crc32c_intel aesni_intel ahci aes_x86_64 glue_helper lrw libahci
gf128mul ablk_helper cryptd xhci_pci libata xhci_hcd ehci_pci ehci_hcd
ixgbe usbcore scsi_mod dca nvme ptp usb_common
[40807.953445] pps_core nvme_core mdio fjes
[40807.953467] CPU: 18 PID: 15921 Comm: inject Not tainted 4.9.0-0.bpo.1-amd64 #1 Debian 4.9.2-2~bpo8+1
[40807.953501] Hardware name: Supermicro SYS-1028U-TNRT+/X10DRU-i+, BIOS 2.0b 08/09/2016
[40807.953529] 0000000000000000 ffffffff8fd2a1f5 ffff9f977f303e38 0000000000000000
[40807.953564] ffffffff8fa77884 0000000000000000 ffff9f977f303e90 ffff9f776ec00000
[40807.953599] 0000000000000012 ffff9f776ec24f40 0000000000000040 ffffffff8fa778ff
[40807.953633] Call Trace:
[40807.953646] <IRQ>
[40807.953665] [<ffffffff8fd2a1f5>] ? dump_stack+0x5c/0x77
[40807.953688] [<ffffffff8fa77884>] ? __warn+0xc4/0xe0
[40807.953708] [<ffffffff8fa778ff>] ? warn_slowpath_fmt+0x5f/0x80
[40807.953731] [<ffffffff8ff1fc30>] ? dev_watchdog+0x220/0x230
[40807.953753] [<ffffffff8ff1fa10>] ? dev_deactivate_queue.constprop.27+0x60/0x60
[40807.953784] [<ffffffff8fae6210>] ? call_timer_fn+0x30/0x130
[40807.953807] [<ffffffff8fae7085>] ? run_timer_softirq+0x215/0x4b0
[40807.953832] [<ffffffff8fd33434>] ? timerqueue_add+0x54/0xa0
[40807.953853] [<ffffffff8fae82c8>] ? enqueue_hrtimer+0x38/0x80
[40807.953878] [<ffffffff8fffcce6>] ? __do_softirq+0x106/0x292
[40807.953902] [<ffffffff8fb8eae0>] ? trace_event_raw_event_mm_lru_insertion+0x170/0x170
[40807.953931] [<ffffffff8fa7db08>] ? irq_exit+0x98/0xa0
[40807.953951] [<ffffffff8fffcaee>] ? smp_apic_timer_interrupt+0x3e/0x50
[40807.953977] [<ffffffff8fffbe02>] ? apic_timer_interrupt+0x82/0x90
[40807.953999] <EOI>
[40807.954011] [<ffffffff8fb8eae0>] ? trace_event_raw_event_mm_lru_insertion+0x170/0x170
[40807.954039] [<ffffffff8fff9d91>] ? _raw_spin_unlock_irqrestore+0x11/0x20
[40807.954064] [<ffffffff8fb8fb4d>] ? pagevec_lru_move_fn+0xad/0xe0
[40807.954934] [<ffffffff8fb8fc6c>] ? __lru_cache_add+0x6c/0x90
[40807.955761] [<ffffffff8fbb769e>] ? handle_mm_fault+0x156e/0x1650
[40807.956582] [<ffffffff8fa5fe43>] ? __do_page_fault+0x253/0x510
[40807.957392] [<ffffffff8fffb598>] ? page_fault+0x28/0x30
[40807.958201] ---[ end trace faa12d1c7fa20cc5 ]---
[40807.959003] ixgbe 0000:01:00.0 eth0: initiating reset due to tx timeout
[40807.959849] ixgbe 0000:01:00.0 eth0: Reset adapter
[40811.710212] ixgbe 0000:01:00.0 eth0: NIC Link is Up 1 Gbps, Flow Control: None
[42470.998496] ixgbe 0000:01:00.0 eth0: initiating reset due to tx timeout
[42470.999465] ixgbe 0000:01:00.0 eth0: Reset adapter
[42479.497773] ixgbe 0000:01:00.0 eth0: NIC Link is Up 1 Gbps, Flow Control: None
[48475.363991] ixgbe 0000:01:00.0 eth0: initiating reset due to tx timeout
[48475.365060] ixgbe 0000:01:00.0 eth0: Reset adapter
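To show how irregularly the resets occur, the gaps between the
"initiating reset due to tx timeout" messages above can be computed
with a sketch like the following. The function name tx_timeout_gaps is
hypothetical, and the sketch assumes dmesg's default
"[seconds.microseconds]" timestamp prefix.

```shell
# Sketch only: print the gap in whole seconds between consecutive
# "initiating reset due to tx timeout" messages read from stdin.
# tx_timeout_gaps is a hypothetical helper name.
tx_timeout_gaps() {
    awk '
        /initiating reset due to tx timeout/ {
            # Assumption: the first decimal number on the line is the
            # dmesg timestamp, e.g. "[40807.959003]".
            if (!match($0, /[0-9]+\.[0-9]+/)) next
            ts = substr($0, RSTART, RLENGTH) + 0
            if (seen) printf "%.0f\n", ts - prev
            prev = ts
            seen = 1
        }'
}

# The three reset messages from the log above.
tx_timeout_gaps <<'EOF'
[40807.959003] ixgbe 0000:01:00.0 eth0: initiating reset due to tx timeout
[42470.998496] ixgbe 0000:01:00.0 eth0: initiating reset due to tx timeout
[48475.363991] ixgbe 0000:01:00.0 eth0: initiating reset due to tx timeout
EOF
```

On the live host the input would come from `dmesg | tx_timeout_gaps`.
For the three resets logged above, the gaps come out at about 1663 and
6004 seconds, so there is no obvious fixed period.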
Could you please help me determine the cause of this behaviour? Are
there any ixgbe- or NIC-specific tunables I should be looking into to
fix it? If I need to supply additional data, please let me know.
Please also keep me CC'd, as I'm not subscribed to this list.
Thanks!
--
Kind regards
Johannes Truschnigg
Engineering / Senior System Administrator
Geizhals (R) - Price Comparison
Preisvergleich Internet Services AG
Obere Donaustraße 63/2
A-1020 Wien
Tel: +43 1 5811609/87
Fax: +43 1 5811609/55
http://www.geizhals.at | http://www.geizhals.de | http://www.geizhals.eu
Commercial Court of Vienna | FN 197241K | Registered office: Vienna