I'm seeing some strange packet reordering problems with Intel
82599-based NICs using the ixgbe driver.  It seems difficult to
believe that I'm the only one running into this, but it has shown up
to some extent with every card of this type that I have tried, on
multiple systems, using multiple kernels.  By contrast, I also tried
Broadcom-based 10G NICs and did not see the problem in any of these
environments.

I tried to collect information on the simplest, most common setup - I
have two machines running RHEL 6.1 connected back-to-back using SFP+
direct attach cables.  I changed nothing after installation except for
installing netperf.  I then ran a basic test without any other
traffic:

# netstat -s
Ip:
    412 total packets received
    10 with invalid addresses
    0 forwarded
    0 incoming packets discarded
    286 incoming packets delivered
    179 requests sent out
Icmp:
    1 ICMP messages received
    0 input ICMP message failed.
    ICMP input histogram:
        echo requests: 1
    1 ICMP messages sent
    0 ICMP messages failed
    ICMP output histogram:
        echo replies: 1
IcmpMsg:
        InType8: 1
        OutType0: 1
Tcp:
    0 active connections openings
    1 passive connection openings
    0 failed connection attempts
    0 connection resets received
    1 connections established
    266 segments received
    159 segments send out
    0 segments retransmited
    0 bad segments received.
    0 resets sent
Udp:
    19 packets received
    0 packets to unknown port received.
    0 packet receive errors
    19 packets sent
UdpLite:
TcpExt:
    2 delayed acks sent
    2 packets directly queued to recvmsg prequeue.
    88 packets header predicted
    10 acknowledgments not containing data received
    137 predicted acknowledgments
    0 TCP data loss events
IpExt:
    InMcastPkts: 2
    InBcastPkts: 114
    InOctets: 48307
    OutOctets: 23984
    InMcastOctets: 64
    InBcastOctets: 21197

# netperf -H 10.0.0.2 -l 60
TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.0.0.2
(10.0.0.2) port 0 AF_INET
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

 87380  16384  16384    60.00    9407.86

# netstat -s
Ip:
    889749 total packets received
    10 with invalid addresses
    0 forwarded
    0 incoming packets discarded
    889552 incoming packets delivered
    1771143 requests sent out
Icmp:
    1 ICMP messages received
    0 input ICMP message failed.
    ICMP input histogram:
        echo requests: 1
    1 ICMP messages sent
    0 ICMP messages failed
    ICMP output histogram:
        echo replies: 1
IcmpMsg:
        InType8: 1
        OutType0: 1
Tcp:
    2 active connections openings
    1 passive connection openings
    0 failed connection attempts
    0 connection resets received
    1 connections established
    889532 segments received
    1768958 segments send out
    2165 segments retransmited
    0 bad segments received.
    0 resets sent
Udp:
    19 packets received
    0 packets to unknown port received.
    0 packet receive errors
    19 packets sent
UdpLite:
TcpExt:
    2 delayed acks sent
    23 packets directly queued to recvmsg prequeue.
    114 packets header predicted
    745471 acknowledgments not containing data received
    143904 predicted acknowledgments
    3 times recovered from packet loss due to SACK data
    Detected reordering 319 times using SACK
    Detected reordering 1124 times using time stamp
    3 congestion windows fully recovered
    1124 congestion windows partially recovered using Hoe heuristic
    0 TCP data loss events
    2165 fast retransmits
    181 DSACKs received
    TCPDSACKIgnoredOld: 3
    TCPDSACKIgnoredNoUndo: 178
    TCPSackShifted: 1167
    TCPSackMerged: 99
    TCPSackShiftFallback: 34
IpExt:
    InMcastPkts: 3
    InBcastPkts: 184
    InOctets: 46340890
    OutOctets: 70657357156
    InMcastOctets: 96
    InBcastOctets: 35592

As you can see, although performance is not impacted, there is quite a
bit of reordering occurring between these two machines.  I originally
noticed this problem when I had multiple machines connected in series
performing bridging, where it caused a significant drop in throughput.
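To compare runs, it helps to diff just the reordering-related lines of
two saved "netstat -s" dumps.  A small helper along these lines (a
sketch; reorder_counters is a hypothetical name, and the grep pattern
simply matches the counter lines shown above):

```shell
# Print only the reordering/retransmit/SACK counter lines from a saved
# "netstat -s" dump, so before/after runs can be diffed easily.
reorder_counters() {
    grep -Ei 'reordering|retransmit|SACK' "$1"
}
```

Typical use would be `netstat -s > before.txt`, run netperf, then
`netstat -s > after.txt` and diff the filtered output of both files.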

The problem seems to be primarily related to TSO and LRO.  Disabling
both of these causes the reordering to mostly, but not completely, go
away; disabling either one alone is not sufficient.  In addition,
running tcpdump on both the sender and receiver shows a very
interesting result: on the transmit side, everything is in the correct
order, but on the receive side, packets that were part of a single TSO
frame on the transmitter arrive out of order.
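For concreteness, disabling the two offloads with ethtool looks like
this (a configuration sketch; the interface name p1p1 matches the
"ethtool -i" output below and will differ on other setups):

```shell
# Turn off TSO and LRO on the test interface; both had to be disabled
# together before the reordering mostly went away.
ethtool -K p1p1 tso off
ethtool -K p1p1 lro off
# Verify the current offload settings.
ethtool -k p1p1
```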

The problem is somewhat bursty, so it doesn't necessarily show up on
every run of netperf, but it is easy to reproduce within a few runs.
These machines are running Red Hat's 2.6.32 kernel, but the problem
also occurs on 2.6.38 and current net-next (where it in fact seems
somewhat worse).
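A reproduction loop along these lines makes the burstiness visible by
reporting how much the SACK-based reordering counter grows per run (a
sketch; get_reorder and run_trials are hypothetical helper names, and
it assumes netperf is installed and the peer answers at 10.0.0.2 as in
the test above):

```shell
# Extract the current value of the SACK-based reordering counter from
# "netstat -s" output.
get_reorder() {
    netstat -s | awk '/Detected reordering .* using SACK/ {print $3}'
}

# Run netperf several times and report the growth of the reordering
# counter after each run; some runs may show no growth at all.
run_trials() {
    for i in 1 2 3 4 5; do
        before=$(get_reorder)
        netperf -H 10.0.0.2 -l 60 > /dev/null
        after=$(get_reorder)
        echo "run $i: SACK reordering events +$(( ${after:-0} - ${before:-0} ))"
    done
}
```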

The version information from these cards is:
# ethtool -i p1p1
driver: ixgbe
version: 3.0.12-k2
firmware-version: 0.9-3
bus-info: 0000:03:00.0

_______________________________________________
E1000-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/e1000-devel
To learn more about Intel® Ethernet, visit 
http://communities.intel.com/community/wired
