I'm seeing some strange packet reordering problems with Intel 82599-based
NICs using the ixgbe driver. It seems hard to believe that I'm the only
one running into this, but it has shown up to some extent with every card
of this type that I have tried, on multiple systems and with multiple
kernels. By contrast, Broadcom-based 10G NICs did not show the problem in
any of these environments.
To collect information I used the simplest, most common setup I could: two
machines running RHEL 6.1, connected back-to-back with SFP+ direct-attach
cables. I changed nothing after installation except installing netperf,
then ran a basic test with no other traffic on the link:
# netstat -s
Ip:
    412 total packets received
    10 with invalid addresses
    0 forwarded
    0 incoming packets discarded
    286 incoming packets delivered
    179 requests sent out
Icmp:
    1 ICMP messages received
    0 input ICMP message failed.
    ICMP input histogram:
        echo requests: 1
    1 ICMP messages sent
    0 ICMP messages failed
    ICMP output histogram:
        echo replies: 1
IcmpMsg:
        InType8: 1
        OutType0: 1
Tcp:
    0 active connections openings
    1 passive connection openings
    0 failed connection attempts
    0 connection resets received
    1 connections established
    266 segments received
    159 segments send out
    0 segments retransmited
    0 bad segments received.
    0 resets sent
Udp:
    19 packets received
    0 packets to unknown port received.
    0 packet receive errors
    19 packets sent
UdpLite:
TcpExt:
    2 delayed acks sent
    2 packets directly queued to recvmsg prequeue.
    88 packets header predicted
    10 acknowledgments not containing data received
    137 predicted acknowledgments
    0 TCP data loss events
IpExt:
    InMcastPkts: 2
    InBcastPkts: 114
    InOctets: 48307
    OutOctets: 23984
    InMcastOctets: 64
    InBcastOctets: 21197
# netperf -H 10.0.0.2 -l 60
TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.0.0.2 (10.0.0.2) port 0 AF_INET
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

 87380  16384  16384    60.00    9407.86
# netstat -s
Ip:
    889749 total packets received
    10 with invalid addresses
    0 forwarded
    0 incoming packets discarded
    889552 incoming packets delivered
    1771143 requests sent out
Icmp:
    1 ICMP messages received
    0 input ICMP message failed.
    ICMP input histogram:
        echo requests: 1
    1 ICMP messages sent
    0 ICMP messages failed
    ICMP output histogram:
        echo replies: 1
IcmpMsg:
        InType8: 1
        OutType0: 1
Tcp:
    2 active connections openings
    1 passive connection openings
    0 failed connection attempts
    0 connection resets received
    1 connections established
    889532 segments received
    1768958 segments send out
    2165 segments retransmited
    0 bad segments received.
    0 resets sent
Udp:
    19 packets received
    0 packets to unknown port received.
    0 packet receive errors
    19 packets sent
UdpLite:
TcpExt:
    2 delayed acks sent
    23 packets directly queued to recvmsg prequeue.
    114 packets header predicted
    745471 acknowledgments not containing data received
    143904 predicted acknowledgments
    3 times recovered from packet loss due to SACK data
    Detected reordering 319 times using SACK
    Detected reordering 1124 times using time stamp
    3 congestion windows fully recovered
    1124 congestion windows partially recovered using Hoe heuristic
    0 TCP data loss events
    2165 fast retransmits
    181 DSACKs received
    TCPDSACKIgnoredOld: 3
    TCPDSACKIgnoredNoUndo: 178
    TCPSackShifted: 1167
    TCPSackMerged: 99
    TCPSackShiftFallback: 34
IpExt:
    InMcastPkts: 3
    InBcastPkts: 184
    InOctets: 46340890
    OutOctets: 70657357156
    InMcastOctets: 96
    InBcastOctets: 35592
As you can see, although performance is not impacted here, there is quite
a bit of reordering occurring between these two machines. I originally
noticed this problem with multiple machines connected in series performing
bridging, where it causes a significant drop in throughput.
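In case it helps anyone reproduce the measurement, here is a small shell
sketch (the `reorder_counts` helper is mine, not part of any tool) that
pulls the two TcpExt reordering counters out of `netstat -s` output and
double-checks the throughput arithmetic against the octet counters above:

```shell
# Pull the SACK/timestamp reordering counters out of `netstat -s` text on
# stdin; the wording matches the TcpExt section above.
reorder_counts() {
    awk '/Detected reordering .* using SACK/       { sack = $3 }
         /Detected reordering .* using time stamp/ { ts   = $3 }
         END { printf "SACK=%d timestamp=%d\n", sack + 0, ts + 0 }'
}

# Demonstrated on the two TcpExt lines from the run above:
printf '%s\n' \
    'Detected reordering 319 times using SACK' \
    'Detected reordering 1124 times using time stamp' | reorder_counts
# -> SACK=319 timestamp=1124

# Sanity check that throughput really is unaffected: the OutOctets delta
# between the two snapshots, over the 60 s run, works out to roughly the
# 9408 Mbit/s netperf reported (netstat counts headers, hence slightly more):
awk 'BEGIN { printf "%.0f Mbit/s\n", (70657357156 - 23984) * 8 / 60 / 1e6 }'
# -> 9421 Mbit/s
```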
The problem seems to be primarily related to TSO and LRO. Disabling both
of them makes the reordering mostly, but not completely, go away; disabling
either one alone is not sufficient. In addition, tcpdump on both the
sender and the receiver shows a very interesting result: on the transmit
side everything is in the correct order, but on the receive side, packets
that were part of a single TSO frame on the transmitter arrive out of
order.
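For reference, this is how I toggled the offloads; these are ordinary
`ethtool -K` commands on the p1p1 interface from the version output below
(note the exact feature names printed by `ethtool -k` vary a little
between ethtool versions):

```shell
# Disable TSO and LRO on the 82599 port; both must be off before the
# reordering mostly goes away -- either one alone is not sufficient.
ethtool -K p1p1 tso off
ethtool -K p1p1 lro off
# Confirm the current offload state:
ethtool -k p1p1 | grep -iE 'segmentation|large receive'
```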
The problem is somewhat bursty, so it doesn't necessarily show up on every
run of netperf, but it is easy to reproduce within a few runs. These
machines are running Red Hat's 2.6.32 kernel, but the problem also occurs
on 2.6.38 and current net-next (where, in fact, it seems somewhat worse).
The version information from these cards is:
# ethtool -i p1p1
driver: ixgbe
version: 3.0.12-k2
firmware-version: 0.9-3
bus-info: 0000:03:00.0
_______________________________________________
E1000-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/e1000-devel
To learn more about Intel® Ethernet, visit
http://communities.intel.com/community/wired