On Thu, 6 Oct 2011, Jesse Gross wrote:

> I'm seeing some strange packet reordering problems with Intel 82599
> based NICs using the ixgbe driver.  It seems difficult to believe that
> I'm the only one running into this but it has shown up to some extent
> with every card of this type that I have tried, on multiple systems,
> using multiple kernels.  By contrast, I also tried Broadcom-based 10G
> NICs and did not see the problem in any of these environments.
> 
> I tried to collect information on the simplest, most common setup - I
> have two machines running RHEL 6.1 connected back-to-back using SFP+
> direct attach cables.  I changed nothing after installation except for
> installing netperf.  I then ran a basic test without any other
> traffic:

With that statement I think I may see your problem.  In the default 
setup, irqbalance is enabled and may be migrating your interrupts 
frequently, which causes out-of-order reception.

Could you try killall irqbalance and run the test again?  If that works, I 
recommend (as our README does/should) running the set_irq_affinity.sh 
script from the sourceforge ixgbe tarball to lay out your interrupts in 
an efficient manner.  Plus, see below.


> TcpExt:
>     2 delayed acks sent
>     23 packets directly queued to recvmsg prequeue.
>     114 packets header predicted
>     745471 acknowledgments not containing data received
>     143904 predicted acknowledgments
>     3 times recovered from packet loss due to SACK data
>     Detected reordering 319 times using SACK
>     Detected reordering 1124 times using time stamp
>     3 congestion windows fully recovered
>     1124 congestion windows partially recovered using Hoe heuristic
>     0 TCP data loss events
>     2165 fast retransmits

The retransmits here might also be because you're overwhelming the 
receiver, causing dropped packets either in the driver or at the socket 
layer.
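A quick way to check whether that's happening (a sketch; the exact 
counter names vary between driver versions, so grep loosely):

```shell
# Sketch: look for receive-side drops on the box under test.
#
#   ethtool -S p1p1 | egrep -i 'drop|miss|no_buff'    # NIC/driver level
#   netstat -s | egrep -i 'prune|collaps|overflow'    # socket level

# Tiny helper that flags nonzero drop/miss counters in "name: value" output:
flag_drops() {
    awk -F: '/drop|miss/ { gsub(/ /, "", $2); if ($2 + 0 > 0) print $1 }'
}

printf 'rx_dropped: 0\nrx_missed_errors: 12\n' | flag_drops   # -> rx_missed_errors
```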


>     181 DSACKs received
>     TCPDSACKIgnoredOld: 3
>     TCPDSACKIgnoredNoUndo: 178
>     TCPSackShifted: 1167
>     TCPSackMerged: 99
>     TCPSackShiftFallback: 34
> IpExt:
>     InMcastPkts: 3
>     InBcastPkts: 184
>     InOctets: 46340890
>     OutOctets: 70657357156
>     InMcastOctets: 96
>     InBcastOctets: 35592
> 
> As you can see, although performance is not impacted there is quite a
> bit of reordering occurring between these two machines.  I originally
> noticed this problem when I had multiple machines connected in series
> performing bridging, where there is a significant drop in throughput.
> 
> The problem seems to be primarily related to TSO and LRO.  Disabling
> both of these causes reordering to mostly, but not completely, go
> away.  Either one of them alone is not sufficient.  In addition,
> tcpdump on both the sender and receiver shows a very interesting
> result: On the transmit side, everything is in the correct order.
> However, on the receive side packets that were part of a single TSO frame
> on the transmitter are now out of order.
> 
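The reordering within a TSO frame should also be countable mechanically 
from the capture.  A rough filter over tcpdump's text output (a sanity 
check, not a proper analyzer -- it assumes a single flow, which is 
roughly true for the back-to-back netperf test, and retransmits will 
also be counted):

```shell
# Sketch: count backwards-going TCP sequence numbers in "tcpdump -nn"
# text output.  Assumes one flow; retransmitted segments count too.
ooo_count() {
    awk '{
        if (match($0, /seq [0-9]+/)) {
            s = substr($0, RSTART + 4, RLENGTH - 4) + 0
            if (seen && s < prev) ooo++   # sequence number went backwards
            prev = s; seen = 1
        }
    } END { print ooo + 0 }'
}

# e.g.:  tcpdump -nn -r capture.pcap tcp | ooo_count
printf 'seq 100\nseq 200\nseq 150\nseq 300\n' | ooo_count   # -> 1
```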
> The problem seems somewhat bursty, so it doesn't necessarily show up
> on every run of netperf but it is easy to reproduce within a few runs.
>  These machines are running Red Hat's 2.6.32 kernel but it also
> happens on 2.6.38 and current net-next (in fact it seems even somewhat
> worse).
> 
> The version information from these cards is:
> # ethtool -i p1p1
> driver: ixgbe
> version: 3.0.12-k2
> firmware-version: 0.9-3
> bus-info: 0000:03:00.0
> 
> ------------------------------------------------------------------------------
> All of the data generated in your IT infrastructure is seriously valuable.
> Why? It contains a definitive record of application performance, security
> threats, fraudulent activity, and more. Splunk takes this data and makes
> sense of it. IT sense. And common sense.
> http://p.sf.net/sfu/splunk-d2dcopy2
> _______________________________________________
> E1000-devel mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/e1000-devel
> To learn more about Intel® Ethernet, visit 
> http://communities.intel.com/community/wired
> 
