Hello. We are experiencing link flapping when connecting two Supermicro servers with 82599ES adapters:
[Wed Jan 6 07:10:14 2016] ixgbe 0000:81:00.1 enp129s0f1: NIC Link is Down
[Wed Jan 6 07:10:15 2016] ixgbe 0000:81:00.1 enp129s0f1: NIC Link is Up 10 Gbps, Flow Control: RX/TX
[Wed Jan 6 07:10:15 2016] ixgbe 0000:81:00.1 enp129s0f1: NIC Link is Down
[Wed Jan 6 07:10:16 2016] ixgbe 0000:81:00.1 enp129s0f1: NIC Link is Up 10 Gbps, Flow Control: RX/TX
[Wed Jan 6 07:10:16 2016] ixgbe 0000:81:00.1 enp129s0f1: NIC Link is Down
[Wed Jan 6 07:10:16 2016] ixgbe 0000:81:00.1 enp129s0f1: NIC Link is Up 10 Gbps, Flow Control: RX/TX
[Wed Jan 6 07:10:16 2016] ixgbe 0000:81:00.1 enp129s0f1: NIC Link is Down
[Wed Jan 6 07:10:16 2016] ixgbe 0000:81:00.1 enp129s0f1: NIC Link is Up 10 Gbps, Flow Control: RX/TX
[Wed Jan 6 07:10:17 2016] ixgbe 0000:81:00.1 enp129s0f1: NIC Link is Down
[Wed Jan 6 07:10:17 2016] ixgbe 0000:81:00.1 enp129s0f1: NIC Link is Up 10 Gbps, Flow Control: RX/TX

We are trying to connect two Supermicro servers, each with an E10G42BTDA X520-DA2 controller:

81:00.0 Ethernet controller: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (rev 01)

One controller has the following SFP+ module installed:

# ethtool --module-info enp129s0f1
Identifier : 0x03 (SFP)
Extended identifier : 0x04 (GBIC/SFP defined by 2-wire interface ID)
Connector : 0x07 (LC)
Transceiver codes : 0x20 0x00 0x00 0x00 0x00 0x00 0x00 0x00
Transceiver type : 10G Ethernet: 10G Base-LR
Encoding : 0x06 (64B/66B)
BR, Nominal : 10300MBd
Rate identifier : 0x00 (unspecified)
Length (SMF,km) : 20km
Length (SMF) : 20000m
Length (50um) : 0m
Length (62.5um) : 0m
Length (Copper) : 0m
Length (OM3) : 0m
Laser wavelength : 1330nm
Vendor name : GIGALINK
Vendor OUI : 00:90:65
Vendor PN : GL-OT-ST12LC1-13
Vendor rev : A
Optical diagnostics support : Yes
Laser bias current : 37.020 mA
Laser output power : 0.9654 mW / -0.15 dBm
Receiver signal average optical power : 0.4904 mW / -3.09 dBm
Module temperature : 44.23 degrees C / 111.61 degrees F
Module voltage : 3.2836 V
Alarm/warning flags implemented : Yes
Laser bias current high alarm : Off
Laser bias current low alarm : Off
Laser bias current high warning : Off
Laser bias current low warning : Off
Laser output power high alarm : Off
Laser output power low alarm : Off
Laser output power high warning : Off
Laser output power low warning : Off
Module temperature high alarm : Off
Module temperature low alarm : Off
Module temperature high warning : Off
Module temperature low warning : Off
Module voltage high alarm : Off
Module voltage low alarm : Off
Module voltage high warning : Off
Module voltage low warning : Off
Laser rx power high alarm : Off
Laser rx power low alarm : Off
Laser rx power high warning : Off
Laser rx power low warning : Off
Laser bias current high alarm threshold : 85.000 mA
Laser bias current low alarm threshold : 10.000 mA
Laser bias current high warning threshold : 80.000 mA
Laser bias current low warning threshold : 12.000 mA
Laser output power high alarm threshold : 3.1623 mW / 5.00 dBm
Laser output power low alarm threshold : 0.3162 mW / -5.00 dBm
Laser output power high warning threshold : 2.5119 mW / 4.00 dBm
Laser output power low warning threshold : 0.3981 mW / -4.00 dBm
Module temperature high alarm threshold : 85.00 degrees C / 185.00 degrees F
Module temperature low alarm threshold : -10.00 degrees C / 14.00 degrees F
Module temperature high warning threshold : 80.00 degrees C / 176.00 degrees F
Module temperature low warning threshold : -5.00 degrees C / 23.00 degrees F
Module voltage high alarm threshold : 3.7000 V
Module voltage low alarm threshold : 2.9000 V
Module voltage high warning threshold : 3.6000 V
Module voltage low warning threshold : 3.0000 V
Laser rx power high alarm threshold : 1.0000 mW / 0.00 dBm
Laser rx power low alarm threshold : 0.0200 mW / -16.99 dBm
Laser rx power high warning threshold : 0.7943 mW / -1.00 dBm
Laser rx power low warning threshold : 0.0251 mW / -16.00 dBm

In the other controller:

# ethtool --module-info enp129s0f1
Identifier : 0x03 (SFP)
Extended identifier : 0x04 (GBIC/SFP defined by 2-wire interface ID)
Connector : 0x07 (LC)
Transceiver codes : 0x20 0x00 0x00 0x00 0x00 0x00 0x00 0x00
Transceiver type : 10G Ethernet: 10G Base-LR
Encoding : 0x06 (64B/66B)
BR, Nominal : 10300MBd
Rate identifier : 0x00 (unspecified)
Length (SMF,km) : 20km
Length (SMF) : 20000m
Length (50um) : 0m
Length (62.5um) : 0m
Length (Copper) : 0m
Length (OM3) : 0m
Laser wavelength : 1270nm
Vendor name : GIGALINK
Vendor OUI : 00:90:65
Vendor PN : GL-OT-ST12LC1-12
Vendor rev : A
Option values : 0x00 0x1a
Option : RX_LOS implemented
Option : TX_FAULT implemented
Option : TX_DISABLE implemented
BR margin, max : 0%
BR margin, min : 0%
Vendor SN : G201511300616
Date code : 151118
Optical diagnostics support : Yes
Laser bias current : 35.120 mA
Laser output power : 0.7842 mW / -1.06 dBm
Receiver signal average optical power : 0.5695 mW / -2.45 dBm
Module temperature : 45.68 degrees C / 114.22 degrees F
Module voltage : 3.2440 V
Alarm/warning flags implemented : Yes
Laser bias current high alarm : Off
Laser bias current low alarm : Off
Laser bias current high warning : Off
Laser bias current low warning : Off
Laser output power high alarm : Off
Laser output power low alarm : Off
Laser output power high warning : Off
Laser output power low warning : Off
Module temperature high alarm : Off
Module temperature low alarm : Off
Module temperature high warning : Off
Module temperature low warning : Off
Module voltage high alarm : Off
Module voltage low alarm : Off
Module voltage high warning : Off
Module voltage low warning : Off
Laser rx power high alarm : Off
Laser rx power low alarm : Off
Laser rx power high warning : Off
Laser rx power low warning : Off
Laser bias current high alarm threshold : 85.000 mA
Laser bias current low alarm threshold : 10.000 mA
Laser bias current high warning threshold : 80.000 mA
Laser bias current low warning threshold : 12.000 mA
Laser output power high alarm threshold : 3.1623 mW / 5.00 dBm
Laser output power low alarm threshold : 0.3162 mW / -5.00 dBm
Laser output power high warning threshold : 2.5119 mW / 4.00 dBm
Laser output power low warning threshold : 0.3981 mW / -4.00 dBm
Module temperature high alarm threshold : 85.00 degrees C / 185.00 degrees F
Module temperature low alarm threshold : -10.00 degrees C / 14.00 degrees F
Module temperature high warning threshold : 80.00 degrees C / 176.00 degrees F
Module temperature low warning threshold : -5.00 degrees C / 23.00 degrees F
Module voltage high alarm threshold : 3.7000 V
Module voltage low alarm threshold : 2.9000 V
Module voltage high warning threshold : 3.6000 V
Module voltage low warning threshold : 3.0000 V
Laser rx power high alarm threshold : 1.0000 mW / 0.00 dBm
Laser rx power low alarm threshold : 0.0200 mW / -16.99 dBm
Laser rx power high warning threshold : 0.7943 mW / -1.00 dBm
Laser rx power low warning threshold : 0.0251 mW / -16.00 dBm

We've checked both SFP+ modules and the patch cord in other hardware, and the link worked fine there. As for the driver module version, we've tried both 4.1.5 and 4.3.13 and see the problem with both.
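For anyone who wants to sanity-check the optical levels, here is a minimal sketch that pulls the dBm values out of `ethtool --module-info` text and reports how far the received power sits from the warning thresholds. The function names and the regular expression are my own (hypothetical) helpers, and the parsing assumes the "key : X mW / Y dBm" line format shown in the output above.

```python
import re

# Matches lines such as "Laser output power : 0.9654 mW / -0.15 dBm".
# Assumption: power fields always use the "<mW> mW / <dBm> dBm" layout.
POWER_LINE = re.compile(
    r"^\s*(?P<key>[^:]+?)\s*:\s*[-\d.]+ mW / (?P<dbm>-?[\d.]+) dBm\s*$"
)


def parse_dbm_fields(module_info: str) -> dict:
    """Return {field name: dBm value} for every optical power line."""
    fields = {}
    for line in module_info.splitlines():
        match = POWER_LINE.match(line)
        if match:
            fields[match.group("key")] = float(match.group("dbm"))
    return fields


def rx_power_margins(fields: dict) -> tuple:
    """Return (dB above the low warning, dB below the high warning)."""
    rx = fields["Receiver signal average optical power"]
    low = fields["Laser rx power low warning threshold"]
    high = fields["Laser rx power high warning threshold"]
    return rx - low, high - rx


# Three lines copied from the first module's output.
SAMPLE = """\
Receiver signal average optical power : 0.4904 mW / -3.09 dBm
Laser rx power high warning threshold : 0.7943 mW / -1.00 dBm
Laser rx power low warning threshold : 0.0251 mW / -16.00 dBm
"""

if __name__ == "__main__":
    low_margin, high_margin = rx_power_margins(parse_dbm_fields(SAMPLE))
    print(f"{low_margin:.2f} dB above low warning, "
          f"{high_margin:.2f} dB below high warning")
```

For the first module this gives 12.91 dB above the low warning and 2.09 dB below the high warning, which agrees with all alarm and warning flags being Off.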
I've built the v4.3.13 module with printk enabled and found the following messages in dmesg:

[Wed Jan 6 06:04:02 2016] ixgbe 0000:81:00.1 enp129s0f1: NIC Link is Up 10 Gbps, Flow Control: RX/TX
[Wed Jan 6 06:04:02 2016] ixgbe_get_media_type_82599
[Wed Jan 6 06:04:02 2016] ixgbe_check_mac_link_generic
[Wed Jan 6 06:04:02 2016] ixgbe_get_media_type_82599
[Wed Jan 6 06:04:02 2016] ixgbe 0000:81:00.1 enp129s0f1: NIC Link is Down
[Wed Jan 6 06:04:02 2016] ixgbe_check_mac_link_genericixgbe_fc_enable_genericixgbe_fc_autonegixgbe_check_mac_link_generic
[Wed Jan 6 06:04:02 2016] ixgbe 0000:81:00.1 enp129s0f1: NIC Link is Up 10 Gbps, Flow Control: RX/TX
[Wed Jan 6 06:04:02 2016] ixgbe_get_media_type_82599
[Wed Jan 6 06:04:02 2016] ixgbe_check_mac_link_generic
[Wed Jan 6 06:04:02 2016] ixgbe_get_media_type_82599
[Wed Jan 6 06:04:02 2016] ixgbe 0000:81:00.1 enp129s0f1: NIC Link is Down

Does this help explain the behaviour?

One thing puzzles me. In Documentation/networking/ixgbe.txt I found the following:

    82599-BASED ADAPTERS

    NOTES: If your 82599-based Intel(R) Network Adapter came with Intel optics, or
    is an Intel(R) Ethernet Server Adapter X520-2, then it only supports Intel
    optics and/or the direct attach cables listed below.
    When 82599-based SFP+ devices are connected back to back, they should be set to
    the same Speed setting via ethtool. Results may vary if you mix speed settings.
    82598-based adapters support all passive direct attach cables that comply
    with SFF-8431 v4.1 and SFF-8472 v10.4 specifications. Active direct attach
    cables are not supported.

Yet any attempt to set the speed fails:

# ethtool -s enp129s0f1 speed 10000
Cannot set new settings: Invalid argument
  not setting speed

Why is that? I did find a suggestion to set the advertising mode instead:

ethtool -s enp129s0f1 advertise 0x1000

This command was accepted, but the link flapping continues.
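To put a number on the flapping, here is a minimal sketch that counts "NIC Link is Down" events per interface and per second from dmesg text. It assumes human-readable timestamps in the format shown in the logs above; the function name and the regular expression are hypothetical helpers of mine, not part of the driver or of ethtool.

```python
import re
from collections import Counter

# Matches ixgbe link messages such as
# "[Wed Jan 6 07:10:14 2016] ixgbe 0000:81:00.1 enp129s0f1: NIC Link is Down"
LINK_EVENT = re.compile(
    r"^\[(?P<ts>[^\]]+)\] ixgbe \S+ (?P<iface>\S+): "
    r"NIC Link is (?P<state>Up|Down)"
)


def count_link_downs(log_text: str) -> Counter:
    """Count link-down events keyed by (interface, timestamp string)."""
    downs = Counter()
    for line in log_text.splitlines():
        match = LINK_EVENT.match(line.strip())
        if match and match.group("state") == "Down":
            downs[(match.group("iface"), match.group("ts"))] += 1
    return downs


# The first seven messages from the log at the top of this mail.
SAMPLE = """\
[Wed Jan 6 07:10:14 2016] ixgbe 0000:81:00.1 enp129s0f1: NIC Link is Down
[Wed Jan 6 07:10:15 2016] ixgbe 0000:81:00.1 enp129s0f1: NIC Link is Up 10 Gbps, Flow Control: RX/TX
[Wed Jan 6 07:10:15 2016] ixgbe 0000:81:00.1 enp129s0f1: NIC Link is Down
[Wed Jan 6 07:10:16 2016] ixgbe 0000:81:00.1 enp129s0f1: NIC Link is Up 10 Gbps, Flow Control: RX/TX
[Wed Jan 6 07:10:16 2016] ixgbe 0000:81:00.1 enp129s0f1: NIC Link is Down
[Wed Jan 6 07:10:16 2016] ixgbe 0000:81:00.1 enp129s0f1: NIC Link is Up 10 Gbps, Flow Control: RX/TX
[Wed Jan 6 07:10:16 2016] ixgbe 0000:81:00.1 enp129s0f1: NIC Link is Down
"""

if __name__ == "__main__":
    for (iface, ts), n in sorted(count_link_downs(SAMPLE).items()):
        print(f"{iface} {ts}: {n} link-down event(s)")
```

Debug lines that carry only a function name do not match the pattern and are skipped, so the same script can be run over the printk-enabled log as well.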
Yes, I realize that these modules are unsupported, and to make them work I had to load the module with allow_unsupported_sfp=1,1. But the main problem is that I could not find any Intel-recommended WDM SFP+ module. We have to use a single-fiber cable, since in our final configuration both servers must connect to our provider, who strongly recommends WDM connections. If these modules are not going to work, is there any WDM module that will?

# ethtool -i enp129s0f1
driver: ixgbe
version: 4.1.5
firmware-version: 0x61c10001
bus-info: 0000:81:00.1
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: no

# ethtool enp129s0f1
Settings for enp129s0f1:
    Supported ports: [ FIBRE ]
    Supported link modes: 10000baseT/Full
    Supported pause frame use: No
    Supports auto-negotiation: No
    Advertised link modes: 10000baseT/Full
    Advertised pause frame use: Symmetric
    Advertised auto-negotiation: No
    Speed: 10000Mb/s
    Duplex: Full
    Port: FIBRE
    PHYAD: 0
    Transceiver: external
    Auto-negotiation: off
    Supports Wake-on: d
    Wake-on: d
    Current message level: 0x00000007 (7)
                           drv probe link
    Link detected: yes

We've been struggling with this problem for a few days already, so any suggestions are more than welcome.

TIA,
--
Peter.
------------------------------------------------------------------------------
_______________________________________________
E1000-devel mailing list
E1000-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/e1000-devel
To learn more about Intel® Ethernet, visit http://communities.intel.com/community/wired