Hello intel-wired-lan,
I might be posting to the wrong place / mailing list, as this issue may
not be caused by the driver / kernel module at all.
But I already tried my luck with Intel support (ticket# 06047421).
Support was active at first, but now the conversation there seems to
have died off unfortunately.
Please let me explain the observed issue and kindly point me to the
right channel (NIC firmware, NVM, ... ) if this is the wrong place after
all:
1)
We purchased a bunch of Intel E810-XXVDA2 adapters and hooked them up
using 100G->4x25G breakout cables (fs.com) to Arista switches.
Unfortunately we cannot get a link up with 25G at boot. Looping the NIC
with a simple SFP28 DAC (fs.com) works fine though.
2)
Of course, we updated the NVM to 4.40 (latest) and power cycled the servers.
3)
We forced / set the correct speed on the Arista switches and we tried
different FEC settings (none or reed-solomon), but no luck there.
4)
The issue seems to be that the advertised speeds of the NIC don't
contain 25G by default!
Right after boot it looks like this:
# ethtool eth3
Settings for eth3:
Supported ports: [ FIBRE ]
Supported link modes: 1000baseT/Full
10000baseT/Full
25000baseCR/Full
25000baseSR/Full
1000baseX/Full
10000baseSR/Full
10000baseLR/Full
Supported pause frame use: Symmetric
Supports auto-negotiation: No
Supported FEC modes: None
Advertised link modes: 10000baseT/Full
Advertised pause frame use: No
Advertised auto-negotiation: No
Advertised FEC modes: None
Speed: Unknown!
Duplex: Unknown! (255)
Auto-negotiation: off
Port: Direct Attach Copper
PHYAD: 0
Transceiver: internal
Supports Wake-on: d
Wake-on: d
Current message level: 0x00000007 (7)
drv probe link
Link detected: no
Notice the list with advertised speeds contains only "10000baseT/Full".
When explicitly setting this to 25G via:
# ethtool -s eth3 advertise 0x80000000
the link comes right up at 25G and ethtool reports:
# ethtool eth3
Settings for eth3:
Supported ports: [ FIBRE ]
Supported link modes: 1000baseT/Full
10000baseT/Full
25000baseCR/Full
25000baseSR/Full
1000baseX/Full
10000baseSR/Full
10000baseLR/Full
Supported pause frame use: Symmetric
Supports auto-negotiation: No
Supported FEC modes: None
Advertised link modes: 25000baseCR/Full
25000baseSR/Full
Advertised pause frame use: No
Advertised auto-negotiation: No
Advertised FEC modes: None
Speed: 25000Mb/s
Duplex: Full
Auto-negotiation: off
Port: FIBRE
PHYAD: 0
Transceiver: internal
Supports Wake-on: d
Wake-on: d
Current message level: 0x00000007 (7)
drv probe link
Link detected: yes
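For reference, the value 0x80000000 is bit 31 of the kernel's ethtool link-mode bitmap, which is ETHTOOL_LINK_MODE_25000baseCR_Full_BIT in include/uapi/linux/ethtool.h. A minimal sketch of how the mask can be derived (the variable names are only for illustration):

```shell
# Bit index of 25000baseCR/Full in the ethtool link-mode bitmap
# (ETHTOOL_LINK_MODE_25000baseCR_Full_BIT in include/uapi/linux/ethtool.h).
BIT_25000_CR=31

# Shift a single 1 into that position and print it as hex for "ethtool -s ... advertise".
mask_25g=$(printf '0x%x' $(( 1 << BIT_25000_CR )))
echo "$mask_25g"
```

The mask itself only carries the CR bit; the additional 25000baseSR/Full entry in the output above is reported by the driver and is not part of the value passed on the command line.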
I can also set both speeds via:
# ethtool -s eth3 advertise 0x80001000 (10G AND 25G)
so the ethtool output changes from:
Advertised link modes: 10000baseT/Full
to
Advertised link modes: 10000baseT/Full
25000baseCR/Full
25000baseSR/Full
10000baseSR/Full
10000baseLR/Full
and the link still comes right up with 25G!
I can even play with the FEC setting to be either none, RS or auto. All
of them work fine - so FEC seems to not be related to the issue.
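To double-check which link modes a given advertise mask actually requests, the mask can be decoded bit by bit. This is a minimal sketch, assuming the bit indices from include/uapi/linux/ethtool.h; `decode_mask` is a hypothetical helper name and the table only lists the modes that appear in this report:

```shell
# Decode an ethtool "advertise" mask into link-mode names.
# Each table entry is "<bit index>:<mode name>".
decode_mask() {
    mask=$(( $1 ))
    for entry in 5:1000baseT/Full 12:10000baseT/Full 31:25000baseCR/Full \
                 33:25000baseSR/Full 41:1000baseX/Full 42:10000baseCR/Full \
                 43:10000baseSR/Full 44:10000baseLR/Full; do
        bit=${entry%%:*}     # part before the colon
        name=${entry#*:}     # part after the colon
        if [ $(( (mask >> bit) & 1 )) -eq 1 ]; then
            echo "$name"
        fi
    done
}

# 10G (bit 12) OR 25G (bit 31), as used in the second ethtool -s call.
both=$(printf '0x%x' $(( (1 << 12) | (1 << 31) )))
echo "$both"

decoded=$(decode_mask "$both")
echo "$decoded"
```

The decoded list contains only 10000baseT/Full and 25000baseCR/Full, so the SR / LR entries that ethtool shows afterwards are apparently added by the driver rather than requested by the mask.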
5)
The servers are Supermicro machines of different models (
On a different machine the reported speeds after bootup look like this,
with the supported link modes even reduced to one entry: "10000baseCR/Full"
# ethtool ens2f0np0
Settings for ens2f0np0:
Supported ports: [ FIBRE ]
Supported link modes: 10000baseCR/Full
Supported pause frame use: Symmetric
Supports auto-negotiation: No
Supported FEC modes: None
Advertised link modes: 10000baseCR/Full
Advertised pause frame use: No
Advertised auto-negotiation: No
Advertised FEC modes: None
Speed: 10000Mb/s
Duplex: Full
Auto-negotiation: off
Port: Direct Attach Copper
PHYAD: 0
Transceiver: internal
Supports Wake-on: d
Wake-on: d
Current message level: 0x00000007 (7)
drv probe link
Link detected: yes
6) While I can now get a link-up at 25G, this is NOT a solution for me
(or for this issue), as it
a) is not reboot safe
b) does not work for PXE boot
So could this still be a Linux driver issue that keeps the NIC from
offering all of its capabilities?
Why is the NIC not advertising 25G? Or 10G AND 25G if possible? Could
this be the server BIOS not correctly initializing the NIC?
Is there any way to set this permanently in / via NVM?
Is there any other debugging I could enable at the driver level to help
find the cause of this?
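A minimal sketch of the state that could be captured right after boot, before any manual `ethtool -s` call, so that output from different machines is comparable. `collect_link_info` is a hypothetical helper name; it only wraps standard ethtool and dmesg invocations, and whether `ethtool -m` can read the EEPROM of a given DAC / breakout cable is an assumption:

```shell
# Print driver, link and module information for one interface.
# Needs root for dmesg on most systems; nothing is changed on the NIC.
collect_link_info() {
    iface=$1
    ethtool -i "$iface"            # driver name, driver version, firmware / NVM version
    ethtool "$iface"               # supported / advertised link modes, link state
    ethtool --show-fec "$iface"    # configured and active FEC encoding
    ethtool -m "$iface"            # module / cable EEPROM, if readable
    dmesg | grep -w ice            # kernel messages from the ice driver
}
```

It could be called as `collect_link_info eth3 > eth3-after-boot.txt 2>&1` on each affected machine.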
Regards and thanks for your time,
Christian