Greetings all,

I'm experiencing Tx queue hangs with the ixgbe driver.

The scenario that I am running is the following:

1. Take two X540-AT2 NICs and connect them to each other with a cable.

2. On the host OS, create a virtual function (VF) for each of the two NICs.

Ex: echo 1 > /sys/class/net/enp5s0f1/device/sriov_numvfs

3. Create two guest virtual machines and assign the PFs to them (not the
virtual functions).

4. Boot the guests and try to send some packets through the interfaces
managed by the ixgbe driver.

Ex: ping 1.1.1.1 -I eth1

5. Notice that the adapter keeps resetting because of a Tx queue hang.

The dmesg log shows the following information:

ixgbe 0000:00:08.0: eth1: Detected Tx Unit Hang
  Tx Queue             <0>
  TDH, TDT             <2>, <3>
  next_to_use          <3>
  next_to_clean        <0>
tx_buffer_info[next_to_clean]
  time_stamp           <1000d00d6>
  jiffies              <1000d0a95>
ixgbe 0000:00:08.0: eth1: tx hang 4 detected on queue 0, resetting adapter
ixgbe 0000:00:08.0: eth1: Reset adapter
ixgbe 0000:00:08.0: eth1: NIC Link is Up 10 Gbps, Flow Control: RX/TX
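
To make the numbers in the report above easier to read, here is a minimal Python sketch that parses the hang report and derives how long the oldest descriptor has been pending. It assumes the bracketed values are hexadecimal (time_stamp and jiffies clearly are) and that the ring indices have not wrapped. The function names and the HZ value of 1000 are illustrative assumptions, since the guest kernel's actual HZ is not stated here.

```python
import re

REPORT = """\
ixgbe 0000:00:08.0: eth1: Detected Tx Unit Hang
  Tx Queue             <0>
  TDH, TDT             <2>, <3>
  next_to_use          <3>
  next_to_clean        <0>
tx_buffer_info[next_to_clean]
  time_stamp           <1000d00d6>
  jiffies              <1000d0a95>
"""


def parse_hang_report(text):
    """Map each field name in the report to its value (parsed as hex)."""
    fields = {}
    for line in text.splitlines():
        values = re.findall(r"<([0-9a-fA-F]+)>", line)
        if not values:
            continue  # header lines carry no <...> values
        # "TDH, TDT  <2>, <3>" carries two names and two values on one line.
        names = line.split("<", 1)[0].strip().split(",")
        for name, value in zip(names, values):
            fields[name.strip()] = int(value, 16)
    return fields


def summarize(fields, hz=1000):
    """Derive a few quantities; hz is an assumed kernel tick rate."""
    pending = fields["jiffies"] - fields["time_stamp"]
    return {
        "pending_jiffies": pending,
        "pending_seconds": pending / hz,
        # Descriptors queued by the driver but not yet cleaned (no wrap assumed).
        "unreclaimed": fields["next_to_use"] - fields["next_to_clean"],
        # Descriptors between the hardware head and tail (no wrap assumed).
        "hw_outstanding": fields["TDT"] - fields["TDH"],
    }


fields = parse_hang_report(REPORT)
summary = summarize(fields)
print(summary)
```

Under those assumptions, the oldest buffer had been waiting 2495 jiffies when the hang was reported, and none of the three queued descriptors had been cleaned.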

Also, running the ethtool self-test on the NIC shows that it is in an
inconsistent state (the loopback test fails):

[root@localhost ~]# ethtool -t eth1
The test result is FAIL
The test extra info:
Register test  (offline)         0
Eeprom test    (offline)         0
Interrupt test (offline)         0
Loopback test  (offline)         13
Link test   (on/offline)         0
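
For anyone comparing results across runs, a small Python sketch that picks the failing sub-tests out of the output above. It assumes that a non-zero value marks a failed sub-test and that every result line has the "name (offline) value" layout shown here. The function name is hypothetical.

```python
import re

ETHTOOL_OUTPUT = """\
The test result is FAIL
The test extra info:
Register test  (offline)         0
Eeprom test    (offline)         0
Interrupt test (offline)         0
Loopback test  (offline)         13
Link test   (on/offline)         0
"""


def failing_tests(text):
    """Return {test name: result code} for sub-tests with a non-zero result."""
    failures = {}
    for line in text.splitlines():
        m = re.match(r"^(.+?)\s+\((?:on/)?offline\)\s+(\d+)\s*$", line)
        if m and int(m.group(2)) != 0:
            failures[m.group(1)] = int(m.group(2))
    return failures


failed = failing_tests(ETHTOOL_OUTPUT)
print(failed)
```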

I would add that if I remove the VFs before assigning the PFs to the guests,
the problem no longer occurs. This makes me suspect that the cause is related
to how the transmit queues are configured when VFs are enabled.

My setup is the following:


1. NIC: Intel Corporation Ethernet Controller 10-Gigabit X540-AT2

2. Hypervisor:
Type: KVM
Distro: CentOS 7
Kernel: 3.10.0-229.4.2.el7.x86_64

3. Guest:
Distro: CentOS 6.3
Kernel: Customized 3.10 kernel, but I saw the problem with the stock CentOS
kernel as well (2.6.32-279.el6.x86_64)
ixgbe version: 3.6.7-k

For now, as a workaround, I can disable the virtual functions before using
pass-through on the PF.
I would like to better understand why this problem happens, and whether there
are any patches that attempt to fix the issue.
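
A minimal Python sketch of that workaround, run on the host before the guests are started. It relies only on the sriov_numvfs sysfs file already used in step 2. The function name and the sysfs_root parameter are hypothetical and exist so that the sketch can be tried against a fake directory tree without root access or real hardware.

```python
import tempfile
from pathlib import Path


def disable_vfs(ifname, sysfs_root="/sys"):
    """Write 0 to sriov_numvfs for ifname; return the previous VF count."""
    path = Path(sysfs_root) / "class" / "net" / ifname / "device" / "sriov_numvfs"
    current = int(path.read_text().strip())
    if current != 0:
        path.write_text("0\n")
    return current


# Dry run against a throwaway directory that mimics the sysfs layout.
with tempfile.TemporaryDirectory() as fake_sysfs:
    dev = Path(fake_sysfs) / "class" / "net" / "enp5s0f1" / "device"
    dev.mkdir(parents=True)
    (dev / "sriov_numvfs").write_text("1\n")

    previous = disable_vfs("enp5s0f1", sysfs_root=fake_sysfs)
    remaining = (dev / "sriov_numvfs").read_text().strip()
    print(previous, remaining)
```

On a real host the call would be disable_vfs("enp5s0f1") with the default sysfs_root, run as root.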

I know there are a few problems when SR-IOV and DCB are enabled
simultaneously [1], but I don't think this is the case in my setup.
My customized guest kernel has CONFIG_DCB and CONFIG_PCI_IOV disabled, while
the stock CentOS kernel has them enabled, yet I see the problem in both cases.

[1] http://www.spinics.net/lists/netdev/msg203427.html

Thanks,
Tudor


_______________________________________________
E1000-devel mailing list
E1000-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/e1000-devel
To learn more about Intel® Ethernet, visit
http://communities.intel.com/community/wired