Hi,

We have a fleet of Dell PowerEdge R640 servers, all with very similar
configurations. The important piece here is that they run Intel 10Gb
Ethernet cards, as shown below:

lspci | grep Ether
19:00.0 Ethernet controller: Intel Corporation Ethernet Controller 10G X550T (rev 01)
19:00.1 Ethernet controller: Intel Corporation Ethernet Controller 10G X550T (rev 01)
1a:00.0 Ethernet controller: Intel Corporation I350 Gigabit Network Connection (rev 01)
1a:00.1 Ethernet controller: Intel Corporation I350 Gigabit Network Connection (rev 01)


Only two of them fail to auto-negotiate the correct link speed. Both ports
advertise 10000baseT/Full but come up at 1000Mb/s:

ethtool eno1
Settings for eno1:
        Supported ports: [ TP ]
        Supported link modes:   100baseT/Full
                                1000baseT/Full
                                10000baseT/Full
        Supported pause frame use: Symmetric
        Supports auto-negotiation: Yes
        Advertised link modes:  100baseT/Full
                                1000baseT/Full
                                10000baseT/Full
        Advertised pause frame use: Symmetric
        Advertised auto-negotiation: Yes
        Speed: 1000Mb/s
        Duplex: Full
        Port: Twisted Pair
        PHYAD: 0
        Transceiver: internal
        Auto-negotiation: on
        MDI-X: Unknown
        Supports Wake-on: umbg
        Wake-on: g
        Current message level: 0x00000007 (7)
                               drv probe link
        Link detected: yes

ethtool eno2
Settings for eno2:
        Supported ports: [ TP ]
        Supported link modes:   100baseT/Full
                                1000baseT/Full
                                10000baseT/Full
        Supported pause frame use: Symmetric
        Supports auto-negotiation: Yes
        Advertised link modes:  100baseT/Full
                                1000baseT/Full
                                10000baseT/Full
        Advertised pause frame use: Symmetric
        Advertised auto-negotiation: Yes
        Speed: 1000Mb/s
        Duplex: Full
        Port: Twisted Pair
        PHYAD: 0
        Transceiver: internal
        Auto-negotiation: on
        MDI-X: Unknown
        Supports Wake-on: umbg
        Wake-on: g
        Current message level: 0x00000007 (7)
                               drv probe link
        Link detected: yes
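
For checking the rest of the fleet, the comparison above can be automated. The following is a minimal sketch, assuming the `ethtool` output has the same layout as shown here; the function names `parse_ethtool` and `is_degraded` and the `SAMPLE` string are hypothetical, introduced only for illustration. It flags an interface whose negotiated speed is below the fastest link mode it advertises.

```python
import re

# Abbreviated copy of the eno1 output above, used as sample input.
SAMPLE = """\
Settings for eno1:
        Supported link modes:   100baseT/Full
                                1000baseT/Full
                                10000baseT/Full
        Advertised link modes:  100baseT/Full
                                1000baseT/Full
                                10000baseT/Full
        Advertised pause frame use: Symmetric
        Speed: 1000Mb/s
        Duplex: Full
        Link detected: yes
"""


def parse_ethtool(text):
    """Return (advertised speeds in Mb/s, negotiated speed in Mb/s or None)."""
    advertised = []
    speed = None
    in_advertised = False
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("Advertised link modes:"):
            # First mode sits on the same line as the field name.
            in_advertised = True
            stripped = stripped.split(":", 1)[1]
        elif ":" in stripped:
            # Any other "Field: value" line ends the continuation block.
            in_advertised = False
        if in_advertised:
            advertised += [int(s) for s in re.findall(r"(\d+)base", stripped)]
        match = re.match(r"Speed:\s*(\d+)Mb/s", stripped)
        if match:
            speed = int(match.group(1))
    return advertised, speed


def is_degraded(text):
    """True if the link negotiated below its fastest advertised mode."""
    advertised, speed = parse_ethtool(text)
    return bool(advertised) and speed is not None and speed < max(advertised)
```

On a host, the text would come from running `ethtool <iface>` (for example via `subprocess.run`), and `is_degraded` could be called once per interface.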



They also sometimes lose connectivity entirely for extended periods of up
to 4 hours. Here are our switch logs, which usually indicate the problem:

lacpd[20416]: %DAEMON-5-LACPD_TIMEOUT: xe-10/0/16: lacp current while timer expired current Receive State: CURRENT
/kernel: %KERN-5-KERN_LACP_INTF_STATE_CHANGE: lacp_update_state_userspace: cifd xe-10/0/16 - ATTACHED state - acting as standby link
rpd[1866]: %DAEMON-6: Decode ifd xe-10/0/16 index 2406: ifdm_flags 0xc000
mcsnoopd[94056]: %DAEMON-6: Decode ifd xe-10/0/16 index 2406: ifdm_flags 0xc000
mcsnoopd[94056]: %DAEMON-6: krt_decode_iflogical: xe-10/0/16.0 has got color 0
lacpd[20416]: %DAEMON-5-LACPD_TIMEOUT: xe-11/0/16: lacp current while timer expired current Receive State: CURRENT
/kernel: %KERN-5-KERN_LACP_INTF_STATE_CHANGE: lacp_update_state_userspace: cifd xe-11/0/16 - ATTACHED state - acting as standby link
lacpd[20416]: %DAEMON-5-LACP_INTF_DOWN: ae125: Interface marked down due to lacp timeout on member xe-11/0/16
rpd[1866]: %DAEMON-6: Decode ifd xe-11/0/16 index 2456: ifdm_flags 0xc000
mcsnoopd[94056]: %DAEMON-6: Decode ifd xe-11/0/16 index 2456: ifdm_flags 0xc000
mcsnoopd[94056]: %DAEMON-6: krt_decode_iflogical: xe-11/0/16.0 has got color 0
/kernel: %KERN-5-KERN_LACP_INTF_STATE_CHANGE: lacp_update_state_userspace: cifd xe-11/0/16 - CD state - ready to carry traffic
/kernel: %KERN-5-KERN_LACP_INTF_STATE_CHANGE: lacp_update_state_userspace: cifd xe-10/0/16 - CD state - ready to carry traffic
rpd[1866]: %DAEMON-6: Decode ifd xe-11/0/16 index 2456: ifdm_flags 0xc000
rpd[1866]: %DAEMON-6: Decode ifd xe-10/0/16 index 2406: ifdm_flags 0xc000
mcsnoopd[94056]: %DAEMON-6: Decode ifd xe-11/0/16 index 2456: ifdm_flags 0xc000
mcsnoopd[94056]: %DAEMON-6: krt_decode_iflogical: xe-11/0/16.0 has got color 0
mcsnoopd[94056]: %DAEMON-6: received iff message xe-11/0/16.0 ifl 8c6fcf0 op 2 flag 0
mcsnoopd[94056]: %DAEMON-6: KRT Ifstate: Decode iff message - ifl(xe-11/0/16.0) without mesh-group tlv
mcsnoopd[94056]: %DAEMON-6: Decode ifd xe-10/0/16 index 2406: ifdm_flags 0xc000
mcsnoopd[94056]: %DAEMON-6: krt_decode_iflogical: xe-10/0/16.0 has got color 0
mcsnoopd[94056]: %DAEMON-6: received iff message xe-10/0/16.0 ifl 8be35a0 op 2 flag 0
mcsnoopd[94056]: %DAEMON-6: KRT Ifstate: Decode iff message - ifl(xe-10/0/16.0) without mesh-group tlv
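
To make the LACP sequence easier to follow among the rpd and mcsnoopd noise, the log can be reduced to a per-member list of LACP events. This is only a rough sketch: it assumes each syslog message sits on a single line (the wrapping added by the mail client has to be undone first), and the name `lacp_events`, the two regular expressions, and the `SAMPLE_LOG` list are hypothetical helpers written against the message formats shown in this post, not against any documented log grammar.

```python
import re
from collections import defaultdict

# Matches e.g. "%DAEMON-5-LACPD_TIMEOUT: xe-10/0/16: lacp current while ..."
TIMEOUT_RE = re.compile(r"LACPD_TIMEOUT: (\S+?):")
# Matches e.g. "cifd xe-10/0/16 - ATTACHED state - acting as standby link"
STATE_RE = re.compile(r"cifd (\S+) - (\w+) state")

# Three messages for xe-10/0/16 taken from the log above, one per line.
SAMPLE_LOG = [
    "lacpd[20416]: %DAEMON-5-LACPD_TIMEOUT: xe-10/0/16: lacp current while "
    "timer expired current Receive State: CURRENT",
    "/kernel: %KERN-5-KERN_LACP_INTF_STATE_CHANGE: lacp_update_state_userspace: "
    "cifd xe-10/0/16 - ATTACHED state - acting as standby link",
    "rpd[1866]: %DAEMON-6: Decode ifd xe-10/0/16 index 2406: ifdm_flags 0xc000",
    "/kernel: %KERN-5-KERN_LACP_INTF_STATE_CHANGE: lacp_update_state_userspace: "
    "cifd xe-10/0/16 - CD state - ready to carry traffic",
]


def lacp_events(lines):
    """Map each member interface to its LACP events, in log order."""
    events = defaultdict(list)
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            events[match.group(1)].append("TIMEOUT")
            continue
        match = STATE_RE.search(line)
        if match:
            events[match.group(1)].append(match.group(2))
    return dict(events)
```

For the sample above this yields a timeout, a move to ATTACHED (standby), and a return to CD (carrying traffic) for xe-10/0/16, which matches the sequence visible in the full log for both members.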


We have upgraded to the 4.14.52 kernel, hoping it might include an ixgbe
patch that fixes this, but the problem persists.

I am posting here to seek advice on how to diagnose and, hopefully, fix this
problem.

Thanks!

Abejide Ayodele
It always seems impossible until it's done. --Nelson Mandela
_______________________________________________
E1000-devel mailing list
E1000-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/e1000-devel