On Fri, Oct 7, 2011 at 5:23 PM, Brandeburg, Jesse <[email protected]> wrote:
>
>
> On Thu, 6 Oct 2011, Jesse Gross wrote:
>
>> I'm seeing some strange packet reordering problems with Intel 82599
>> based NICs using the ixgbe driver. It seems difficult to believe that
>> I'm the only one running into this but it has shown up to some extent
>> with every card of this type that I have tried, on multiple systems,
>> using multiple kernels. By contrast, I also tried Broadcom-based 10G
>> NICs and did not see the problem in any of these environments.
>>
>> I tried to collect information on the simplest, most common setup - I
>> have two machines running RHEL 6.1 connected back-to-back using SFP+
>> direct attach cables. I changed nothing after installation except for
>> installing netperf. I then ran a basic test without any other
>> traffic:
>
> with that statement I think I might see your problem. With the default
> setup irqbalance is enabled and might be migrating your interrupts
> frequently, which causes out of order reception.
>
> could you try killall irqbalance and run the test again? If that works I
> recommend (as our readme does/should) that you run the set_irq_affinity.sh
> script from the sourceforge ixgbe tarball to lay your interrupts out in
> an efficient manner. plus see below.
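For anyone following along who doesn't have the tarball handy: pinning each queue interrupt to its own CPU by hand looks roughly like the sketch below. This is only an approximation of what set_irq_affinity.sh does, not the script itself, and the interface name eth2 is an assumption - substitute your own.

```shell
#!/bin/sh
# Rough sketch of manual IRQ pinning for a multiqueue NIC; an
# approximation of what the ixgbe tarball's set_irq_affinity.sh does.
# "eth2" is an assumed interface name - adjust for your system.
IFACE=${1:-eth2}

# Hex one-hot affinity mask for CPU $1 (fine for CPUs 0-31; larger
# systems need the comma-separated mask format in smp_affinity).
cpu_mask() { printf "%x" $((1 << $1)); }

# Stop irqbalance first so it cannot migrate the interrupts back.
killall irqbalance 2>/dev/null || true

cpu=0
grep "$IFACE" /proc/interrupts | cut -d: -f1 | while read -r irq; do
    echo "$(cpu_mask $cpu)" > "/proc/irq/$irq/smp_affinity"
    echo "IRQ $irq -> CPU $cpu (mask $(cpu_mask $cpu))"
    cpu=$((cpu + 1))
done
```

Needs root, obviously, since it writes /proc/irq/*/smp_affinity, and it lays queues out sequentially from CPU 0 rather than doing anything NUMA-aware.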
irqbalance was running but after killing it I didn't see any appreciable
difference. I also tried running set_irq_affinity.sh for the fun of it
with the same results.

>> TcpExt:
>>     2 delayed acks sent
>>     23 packets directly queued to recvmsg prequeue.
>>     114 packets header predicted
>>     745471 acknowledgments not containing data received
>>     143904 predicted acknowledgments
>>     3 times recovered from packet loss due to SACK data
>>     Detected reordering 319 times using SACK
>>     Detected reordering 1124 times using time stamp
>>     3 congestion windows fully recovered
>>     1124 congestion windows partially recovered using Hoe heuristic
>>     0 TCP data loss events
>>     2165 fast retransmits
>
> the retransmits here might also be because you're overwhelming the
> receiver and causing dropped packets either in the driver or the socket
> layer.

It seems somewhat unlikely to me that the machine is too slow since I
don't see this problem with 10G Broadcom NICs. Looking at mpstat, the
CPU that traffic is being directed to is about 60% idle (and everything
else is completely idle). These particular machines have an Intel E5520
in them, although I have also seen results like this with slightly
faster machines as well.

_______________________________________________
E1000-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/e1000-devel
To learn more about Intel® Ethernet, visit http://communities.intel.com/community/wired
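For anyone trying to reproduce the measurement: the reordering counters come from the TcpExt section of `netstat -s`, and a small filter makes it easy to snapshot them before and after a netperf run. A minimal sh sketch - the sample lines are copied from the stats quoted above, and assuming real output uses the same "Detected reordering N times using ..." wording:

```shell
#!/bin/sh
# Pull the reordering counts out of `netstat -s` style output.
# Assumes the "Detected reordering N times using <method>" wording
# shown in the stats quoted above; field positions follow from that.
reorder_counts() {
    awk '/Detected reordering/ { n = $3; sub(/.*using /, ""); print $0 ": " n }'
}

# Example against the counters quoted earlier in this thread:
printf '%s\n' \
    "    Detected reordering 319 times using SACK" \
    "    Detected reordering 1124 times using time stamp" | reorder_counts
# prints:
#   SACK: 319
#   time stamp: 1124
```

Running `netstat -s | reorder_counts` before and after a `netperf -t TCP_STREAM` run and diffing the two shows how much reordering a single test adds.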
