Hi Team,

 

I would like to report my experience so that somebody may have a real debugging
shortcut next time. I also have a practical request for the developers.

 

Two weeks ago we moved our storage to local 10G based on the X710 DA-2.

Right from the start we have been seeing lots of these events:

 

kernel: [772391.372876] i40e 0000:07:00.1: TX driver issue detected, PF
reset issued

kernel: [772391.594984] i40e 0000:07:00.1: i40e_ptp_init: added PHC on
enp7s0f1

kernel: [772391.741089] i40e 0000:07:00.1 enp7s0f1: NIC Link is Up 10 Gbps
Full Duplex, Flow Control: None

kernel: [772391.752696] i40e 0000:07:00.1 enp7s0f1: NIC Link is Down

kernel: [772392.557943] i40e 0000:07:00.1 enp7s0f1: NIC Link is Up 10 Gbps
Full Duplex, Flow Control: None

 

The server works normally, but there can be 500+ of these events in a row,
giving a 5-seconds-working / 5-seconds-not-working experience, while in the
next hour there may be none at all. As a rule, though, we see 100+ events daily.
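For anyone hitting the same thing, a quick way to bucket the resets per hour from the kernel log; the log excerpt below is fabricated for illustration, on a live box pipe in `dmesg -T` or `journalctl -k` instead:

```shell
#!/bin/sh
# Count "TX driver issue detected" PF resets per hour.
# Sample log lines are made up; timestamps follow the usual syslog format.
log='Mar 10 02:15:01 l31 kernel: i40e 0000:07:00.1: TX driver issue detected, PF reset issued
Mar 10 02:15:06 l31 kernel: i40e 0000:07:00.1: TX driver issue detected, PF reset issued
Mar 10 03:40:12 l31 kernel: i40e 0000:07:00.1: TX driver issue detected, PF reset issued'

printf '%s\n' "$log" |
  grep 'TX driver issue detected' |
  awk '{ print $1, $2, substr($3, 1, 2) ":00" }' |  # keep date + hour bucket
  sort | uniq -c
```

This prints one line per hour with the event count in front, which makes the bursty 5-on/5-off pattern easy to see.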

I should note that the driver resets just fine: no crashes, no visibly stuck
links or anything of the sort. The reset mechanism itself seems to be OK.

The problem is that network performance in such a mode is, of course,
unacceptable.

 

The initial setup was i40e
<http://sourceforge.net/projects/e1000/files/i40e%20stable/1.1.23/> 1.1.23
from the official Intel downloads.

Our next try was your
<http://sourceforge.net/projects/e1000/files/i40e%20stable/1.2.37/> 1.2.37
released a few weeks ago, because it was said on the list that "something of
that sort was fixed". Zero behavior change, as far as we can tell.

 

The problem, however, WAS solved with

 

ethtool -K enp7s0f1 tso off

 

That attempt was based on the http://sourceforge.net/p/e1000/bugs/407/ thread,
which concerns a completely different setup with a different card and a
different driver, but the problem looks the same (for years?) and the
workaround still applies.
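Note for anyone applying the same workaround: ethtool offload changes are runtime-only and are lost on reboot. One way to make it stick on a systemd distro is a one-shot unit; the unit name and paths below are my own invention, not anything shipped with the driver, so adjust to your system:

```shell
#!/bin/sh
# Hypothetical one-shot unit that re-applies "tso off" at boot.
# Interface name, unit name, and ethtool path are assumptions for this host.
cat > /etc/systemd/system/i40e-tso-off.service <<'EOF'
[Unit]
Description=Disable TSO on enp7s0f1 (i40e PF-reset workaround)
After=network.target

[Service]
Type=oneshot
ExecStart=/usr/sbin/ethtool -K enp7s0f1 tso off

[Install]
WantedBy=multi-user.target
EOF

systemctl daemon-reload
systemctl enable --now i40e-tso-off.service
```

On non-systemd setups the same one-liner can go into whatever post-up hook the distro's network scripts provide.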

 

Working setup is here:

 

l31 ~ # ethtool -k enp7s0f1

Offload parameters for enp7s0f1:

rx-checksumming: on

tx-checksumming: on

scatter-gather: on

tcp-segmentation-offload: off

udp-fragmentation-offload: off

generic-segmentation-offload: on

generic-receive-offload: on

large-receive-offload: off

rx-vlan-offload: on

tx-vlan-offload: on

ntuple-filters: on

receive-hashing: on

 

The non-working setup differs only in this:

tcp-segmentation-offload: on

 

I would really suggest that this problem either be fixed at last (less
important) OR at least noted in some README. It is a real show stopper, and
somebody without our debugging karma and try-and-see luck may end up with a
non-working adapter.

Frankly, I think that in flat local networks with no segmentation this feature
is irrelevant. But if someone faces this 10G card towards the internet, where
real segmentation is expected, disabling the feature may cause real pain on
the performance side.

Maybe changing the card's defaults would be appropriate, although I am not
sure about that, as it is bad to change defaults in the middle of a driver's
life cycle. But the X710 is at the very start of its life for now, so maybe?...

 

 

Other auxiliary info that may help:

 

We are using DA SFP+. Only one port of the card is attached.

Traffic floats somewhere between 1 and ~7 Gbps, and the hang is NOT related to
traffic volume, just to peer TCP behavior (as it seems after finding the
solution).

We have only a local network (used for iSCSI), jumbo frames enabled with MTU
9000; MTUs are fine, and we see only 0.0001% packet fragmentation. It seems a
packet gets fragmented only on some really odd occasion.
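For reference, that fragmentation figure is simple counter arithmetic; a sketch with illustrative counter values (on a real box take the FragCreates and OutRequests counters from /proc/net/snmp or `netstat -s`):

```shell
#!/bin/sh
# Fragmentation ratio = fragments created / packets sent, as a percentage.
# The counter values below are illustrative, not taken from our host.
frag_creates=1450
out_requests=1450598509

awk -v f="$frag_creates" -v t="$out_requests" \
  'BEGIN { printf "%.4f%%\n", 100 * f / t }'   # prints 0.0001%
```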

No VLANs. No teaming. A very simple, plain setup with two 10G switches, 10
hosts (5 here and 5 there) and SFP+ DA all the way (some errors and drops in
the links "as usual", but nothing perceptible to end applications).
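To keep an eye on those link errors and drops, the per-queue NIC statistics are more telling than ifconfig; a quick filter (the exact stat names vary between i40e versions, so treat the grep pattern as approximate):

```shell
# Dump only error/drop-related NIC counters; i40e exposes dozens of stats
# and their names differ between driver versions.
ethtool -S enp7s0f1 | grep -iE 'err|drop|discard'
```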

 

The exact card under debug was in an Intel R2312GZ4GCSAS server system.

The card is installed on a PCI riser, together with an LSI-based Intel RAID
controller on the same riser.

 

enp7s0f1  Link encap:Ethernet  HWaddr 68:05:ca:30:5b:c9

          inet addr:10.21.0.87  Bcast:10.21.0.255  Mask:255.255.255.0

          UP BROADCAST RUNNING MULTICAST  MTU:9000  Metric:1

          RX packets:520305857 errors:0 dropped:0 overruns:0 frame:0

          TX packets:1450598509 errors:0 dropped:0 overruns:0 carrier:0

          collisions:0 txqueuelen:1000

          RX bytes:1797022578878 (1.6 TiB)  TX bytes:8995312721942 (8.1 TiB)

 

l31 ~ # ethtool enp7s0f1

Settings for enp7s0f1:

        Supported ports: [ FIBRE ]

        Supported link modes:   10000baseT/Full

        Supported pause frame use: Symmetric

        Supports auto-negotiation: No

        Advertised link modes:  Not reported

        Advertised pause frame use: No

        Advertised auto-negotiation: No

        Speed: 10000Mb/s

        Duplex: Full

        Port: Direct Attach Copper

        PHYAD: 0

        Transceiver: external

        Auto-negotiation: off

        Supports Wake-on: d

        Wake-on: d

        Current message level: 0x0000000f (15)

                               drv probe link timer

        Link detected: yes

 

l31 ~ # ethtool -i enp7s0f1

driver: i40e

version: 1.2.37

firmware-version: f4.22.27454 a1.2 n4.25 e143f

bus-info: 0000:07:00.1

supports-statistics: yes

supports-test: yes

supports-eeprom-access: yes

supports-register-dump: yes

 

l31 ~ # ethtool -k enp7s0f1

Offload parameters for enp7s0f1:

rx-checksumming: on

tx-checksumming: on

scatter-gather: on

tcp-segmentation-offload: off

udp-fragmentation-offload: off

generic-segmentation-offload: on

generic-receive-offload: on

large-receive-offload: off

rx-vlan-offload: on

tx-vlan-offload: on

ntuple-filters: on

receive-hashing: on

 

I will gladly add more info if it would help you understand things better.

 

Dmitry.
