There's really not enough information here. Ideally you would send us the dmesg of when it fails, and a register dump before and after.
I would suggest opening a bug on SourceForge and attaching the dmesg and register dumps to it. Don't just paste them into the bug text, because that is much harder to read.

We haven't heard of many issues like this with the 82576, so you may also want to ask Supermicro for help, but it also looks like your hardware is EOL.

Todd Fujinaka
Software Application Engineer
Datacenter Engineering Group
Intel Corporation
todd.fujin...@intel.com

-----Original Message-----
From: Kojedzinszky Richárd [mailto:kojedzinszky.rich...@euronetrt.hu]
Sent: Wednesday, January 24, 2018 1:44 AM
To: e1000-devel@lists.sourceforge.net
Subject: [E1000-devel] igb transmit queue timeout

Dear maintainers,

We have a Xen virtualization environment with 6 nearly identical nodes, all Supermicro X8DTU boards. We run Debian stretch on them; the Xen hypervisor and Linux kernel are from Debian stretch, the latest at the time of writing.

Unfortunately, we are facing an issue where our igb devices randomly stop working, with the error message:

NETDEV WATCHDOG: enp1s0f0 (igb): transmit queue 0 timed out

The driver then tries to recover/reset the adapter, but it does not succeed. Even shutting the interface down and bringing it back up does not help; a reboot is required to restore normal operation.

The servers are connected to our switch with two interfaces, and the problem happens randomly on either one. We have tried disabling MSI interrupts, but that did not help.

Unfortunately, we cannot reproduce the problem: it happens randomly and frequently, but we cannot trigger it explicitly. It has happened on nearly all of our nodes, so I assume it is not a hardware problem.
Our kernel/xen versions:

# uname -a
Linux node-3.cloud-b.dravanet.net 4.9.0-5-amd64 #1 SMP Debian 4.9.65-3+deb9u2 (2018-01-04) x86_64 GNU/Linux

# xl info
host                   : x
release                : 4.9.0-5-amd64
version                : #1 SMP Debian 4.9.65-3+deb9u2 (2018-01-04)
machine                : x86_64
nr_cpus                : 8
max_cpu_id             : 23
nr_nodes               : 2
cores_per_socket       : 4
threads_per_core       : 1
cpu_mhz                : 3066
hw_caps                : b7ebfbff:029ee3ff:2c100800:00000001:00000000:00000000:00000000:00000100
virt_caps              : hvm hvm_directio
total_memory           : 196599
free_memory            : 94364
sharing_freed_memory   : 0
sharing_used_memory    : 0
outstanding_claims     : 0
free_cpus              : 0
xen_major              : 4
xen_minor              : 8
xen_extra              : .3-pre
xen_version            : 4.8.3-pre
xen_caps               : xen-3.0-x86_64 xen-3.0-x86_32p hvm-3.0-x86_32 hvm-3.0-x86_32p hvm-3.0-x86_64
xen_scheduler          : credit
xen_pagesize           : 4096
platform_params        : virt_start=0xffff800000000000
xen_changeset          :
xen_commandline        : placeholder dom0_mem=4096M gnttab_max_frames=256
cc_compiler            : gcc (Debian 6.3.0-18) 6.3.0 20170516
cc_compile_by          : ijackson
cc_compile_domain      : chiark.greenend.org.uk
cc_compile_date        : Sat Nov 25 11:30:34 UTC 2017
build_id               : 23ac95af74d2e3f84c90068ae674c34e764649e7
xend_config_format     : 4

What else could we try to resolve this issue?

Thanks in advance,
Kojedzinszky Richárd
Euronet Magyarorszag Informatika Zrt.

------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
E1000-devel mailing list
E1000-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/e1000-devel
To learn more about Intel® Ethernet, visit http://communities.intel.com/community/wired
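
The reply in this thread asks for the dmesg output from the time of the failure and a register dump taken before and after it. One possible way to gather those into files that can be attached to a bug is sketched below. This is only an illustration, not a procedure given in the thread: the function name collect_diag, the output file names, and the default interface (taken from the error message in the report) are assumptions, and it presumes that dmesg and ethtool are installed and that the commands are run as root.

```shell
#!/bin/sh
# Sketch: save the kernel log and a NIC register dump as attachable files.
# collect_diag, the file names and the defaults are illustrative assumptions.
collect_diag() {
    iface="${1:-enp1s0f0}"   # interface name as it appears in the report
    tag="${2:-before}"       # label for the snapshot, e.g. "before" or "after"
    outdir="${3:-.}"         # directory that receives the files

    mkdir -p "$outdir" || return 1

    # Kernel ring buffer; after a failure it should contain the
    # "NETDEV WATCHDOG ... transmit queue 0 timed out" message.
    dmesg > "$outdir/dmesg-$tag.txt" 2>&1 \
        || echo "warning: dmesg failed (root may be required)" >&2

    # ethtool -d prints the device register dump for the interface.
    ethtool -d "$iface" > "$outdir/regs-$tag.txt" 2>&1 \
        || echo "warning: ethtool -d failed for $iface" >&2

    # ethtool -i records the driver name, driver version and firmware version.
    ethtool -i "$iface" > "$outdir/driver-$tag.txt" 2>&1 \
        || echo "warning: ethtool -i failed for $iface" >&2

    echo "$outdir"
}
```

With the function loaded in a shell, running `collect_diag enp1s0f0 before` during normal operation and `collect_diag enp1s0f0 after` once the timeout has appeared would leave two sets of files that can be attached to the bug rather than pasted into it.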