What did you try to disable Flow Director? This may be a bug.

Todd Fujinaka
Software Application Engineer
Networking Division (ND)
Intel Corporation
todd.fujin...@intel.com
(503) 712-4565

-----Original Message-----
From: Hu, Tan Chang [mailto:t...@sandia.gov]
Sent: Wednesday, July 29, 2015 6:55 AM
To: Fujinaka, Todd
Subject: RE: [EXTERNAL] RE: TCP stack is receiving out of order packet delivery even when connected back-to-back

The new driver version, 1.2.48, and kernel, 2.6.32-504.30.3.el6.x86_64, do not allow disabling of Flow Director. The combination returns "Could not change any device features". We will look at XPS.
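For reference, Flow Director is normally toggled through ethtool, and "Could not change any device features" is the generic error ethtool prints when a feature change is rejected. A minimal sketch of the usual knobs, assuming the interface name i40e1 used later in the thread; exact support varies by driver and kernel version, and the private-flag name shown is an assumption:

    # Flow Director's sideband (ntuple) filters are usually gated by
    # the ntuple feature flag:
    ethtool -K i40e1 ntuple off
    # Confirm the resulting feature state:
    ethtool -k i40e1 | grep ntuple
    # Some driver versions expose the ATR half of Flow Director as a
    # private flag; list what is actually available before setting anything:
    ethtool --show-priv-flags i40e1
    # e.g. (flag name varies by driver release):
    # ethtool --set-priv-flags i40e1 flow-director-atr off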
-----Original Message-----
From: Fujinaka, Todd [mailto:todd.fujin...@intel.com]
Sent: Thursday, July 23, 2015 2:45 PM
To: Hu, Tan Chang; e1000-devel@lists.sourceforge.net
Cc: Naegle, John H
Subject: RE: [EXTERNAL] RE: TCP stack is receiving out of order packet delivery even when connected back-to-back

You only sent me the last email and, as I stated before, I do NOT have access to the customer service database. I can only go by what you tell me in this thread.

I was waiting for you to try the three tests I described in this thread and let me know the results. We think this is a known issue. You've told me that one of the tests is invalid since rngd is not running. Please try the other two.

Todd Fujinaka
Software Application Engineer
Networking Division (ND)
Intel Corporation
todd.fujin...@intel.com
(503) 712-4565

-----Original Message-----
From: Hu, Tan Chang [mailto:t...@sandia.gov]
Sent: Thursday, July 23, 2015 1:41 PM
To: Fujinaka, Todd; e1000-devel@lists.sourceforge.net
Cc: Naegle, John H
Subject: Re: [EXTERNAL] RE: TCP stack is receiving out of order packet delivery even when connected back-to-back

Todd,

We already stated in the email that rngd was not running. Second, if you had access to the ticket system you would have seen that we stated the same in a follow-up response, and that email attachment wasn't the last email from customer support.

________________________________________
From: Fujinaka, Todd <todd.fujin...@intel.com>
Sent: Thursday, July 23, 2015 2:36 PM
To: Hu, Tan Chang; e1000-devel@lists.sourceforge.net
Cc: Naegle, John H
Subject: RE: [EXTERNAL] RE: TCP stack is receiving out of order packet delivery even when connected back-to-back

I got the attachments directly, but keep in mind that sourceforge strips the attachments and no one else has seen them. It appears you have sent me their last email and a bunch of ethtool output.

As far as I can see, they asked you to kill rngd and you never responded with whether that worked or not. So we're back to them asking you to test and us waiting for those results? I gave you two more possible tests as well.

Todd Fujinaka
Software Application Engineer
Networking Division (ND)
Intel Corporation
todd.fujin...@intel.com
(503) 712-4565

-----Original Message-----
From: Hu, Tan Chang [mailto:t...@sandia.gov]
Sent: Thursday, July 23, 2015 12:21 PM
To: Fujinaka, Todd; e1000-devel@lists.sourceforge.net
Cc: Naegle, John H
Subject: RE: [EXTERNAL] RE: TCP stack is receiving out of order packet delivery even when connected back-to-back

Todd,

Attached is the email we received from customer support and the ethtool output pre- and post-run. Disregard the reference to Mellanox cards, as they have not been in the test setup since June 15, 2015. The rngd process was not running on either system. There were no other tests we were asked to run except for watching dropped packets. We can run the suggested fixes if you can provide the instructions to do so. Third paragraph, last sentence, is a dangling sentence.

As John mentioned in the last email, we are seeing performance problems with real applications and not just iperf. How is Intel setting the NIC for best general-purpose throughput, and what are those numbers?
-----Original Message-----
From: Fujinaka, Todd [mailto:todd.fujin...@intel.com]
Sent: Thursday, July 23, 2015 11:16 AM
To: Naegle, John H; Hu, Tan Chang; e1000-devel@lists.sourceforge.net
Subject: RE: [EXTERNAL] RE: TCP stack is receiving out of order packet delivery even when connected back-to-back

OK, so I went directly to the developers and they told me what they told customer service. I'm going to assume you got the whole message.

First, what are the results of the tests that you were asked to try? The problem was clearly explained to me, again: the CPU scheduler is kicking us off the CPU, and Flow Director takes 20 SKBs to catch up. In that time you can get out-of-order packets. There are things you should have been asked to try to fix this:

1. Turn off XPS in the kernel.
2. Turn off Flow Director.
3. Kill the rngd process (we found that the rngd process is what usually runs to kick us off the CPU).

If you don't know how to do that, google is your friend. We have ideas of how to mitigate this without doing

Why are you seeing problems? We set up the NIC for the best general-purpose throughput. When people start running tests on corner cases or artificial tests, we see degradation. Turn off Flow Director, and you'll be happier with your iperf numbers on both i40e and ixgbe, but I'm not sure if that's going to help for a real-world server. Some of our competitors don't have Flow Director, so they won't see this problem.

Todd Fujinaka
Software Application Engineer
Networking Division (ND)
Intel Corporation
todd.fujin...@intel.com
(503) 712-4565
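The three mitigations above map to ordinary Linux knobs. A minimal sketch, assuming the interface name i40e1 from elsewhere in the thread and the RHEL 6-style SysV init used on these hosts:

    # 1. Disable XPS by clearing the CPU mask on every TX queue:
    for q in /sys/class/net/i40e1/queues/tx-*/xps_cpus; do
        echo 0 > "$q"
    done
    # 2. Disable Flow Director (see the ethtool sketch earlier in the thread).
    # 3. Stop rngd and keep it from starting at boot:
    service rngd stop
    chkconfig rngd off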
-----Original Message-----
From: Fujinaka, Todd
Sent: Thursday, July 23, 2015 9:08 AM
To: 'Naegle, John H'; Hu, Tan Chang; e1000-devel@lists.sourceforge.net
Subject: RE: [EXTERNAL] RE: TCP stack is receiving out of order packet delivery even when connected back-to-back

If you would like a phone conversation, please contact your FAE to arrange this. Right now we don't see how this is a widespread or critical issue, but the FAE is the one who can determine the urgency.

Todd Fujinaka
Software Application Engineer
Networking Division (ND)
Intel Corporation
todd.fujin...@intel.com
(503) 712-4565

-----Original Message-----
From: Naegle, John H [mailto:jhna...@sandia.gov]
Sent: Thursday, July 23, 2015 8:31 AM
To: Fujinaka, Todd; Hu, Tan Chang; e1000-devel@lists.sourceforge.net
Subject: RE: [EXTERNAL] RE: TCP stack is receiving out of order packet delivery even when connected back-to-back

While it is true that iperf is an artificial test, it has been very effective for us at demonstrating TCP/IP issues that also impact our data-movement application performance. For example, when we discovered this same issue on your 10G NICs, we took a guess and turned off LRO. This greatly reduced the number of out-of-order events (as reported by Wireshark), and our user performance jumped by almost 10X! Users had been noticing and complaining about the performance specific to Intel NICs for some time. When we swapped in a Chelsio NIC, the application performance went up even more. This was in the WAN, where latency dominates performance due to the large round-trip time.

We have been trading emails about these issues for several weeks now. I'd suggest that we have a live phone discussion. I find it is much easier to come to a common understanding of complex problems in live discussions, where questions, ideas, and misunderstandings can be quickly clarified. Richard and I are available today from 11:00 a.m. through 3:00 p.m. MT, or tomorrow. We can use my telecon bridge number or just a direct call if that is easier. Please let us know if you can support such a conversation.

Thanks,
John Naegle

-----Original Message-----
From: Fujinaka, Todd [mailto:todd.fujin...@intel.com]
Sent: Thursday, July 23, 2015 9:11 AM
To: Hu, Tan Chang; e1000-devel@lists.sourceforge.net
Cc: Naegle, John H
Subject: [EXTERNAL] RE: TCP stack is receiving out of order packet delivery even when connected back-to-back

Iperf changes queues in the middle of the test. I remember this issue (not the ticket - I don't have direct access to this), and some of the answers you received late in the thread were from the development team. I don't know if it was edited by customer support (as I said, I don't have access to the ticket), and my opinion in hallway conversations about this was that artificial tests (such as iperf) generate artificial results. We can tune for good iperf performance, but the tuning is counterproductive for real traffic. I can't imagine that you're buying this hardware just to perform iperf tests. If there is a real issue, please let us know.

The "go-to guy" in our division has already responded to customer service on this issue.

Todd Fujinaka
Software Application Engineer
Networking Division (ND)
Intel Corporation
todd.fujin...@intel.com
(503) 712-4565

-----Original Message-----
From: Hu, Tan Chang [mailto:t...@sandia.gov]
Sent: Thursday, July 23, 2015 7:58 AM
To: Fujinaka, Todd; e1000-devel@lists.sourceforge.net
Cc: Naegle, John H
Subject: TCP stack is receiving out of order packet delivery even when connected back-to-back

Todd,

The history of ticket 8001197134 gives the problem statement. The Linux TCP/IP stack is receiving out-of-order packet delivery even when two hosts with Intel XL710 40GbE NICs are connected directly back-to-back. The out-of-order packets trigger receiver-side TCP DUP ACKs and cause the sender to go into fast retransmit, which kills throughput performance (especially since the end application of the 40GbE is data transfer over 100Gbps WAN connections). The Dell R620 servers are connected via two Intel E40GQSFPSR transceivers with a Tripp Lite N844-10M-12-P40GBase-SR4 MPO cable back-to-back.

Intel customer support doesn't think retransmits dramatically impact throughput, alluding to improper socket buffering, and stated that out-of-order packets are counted in different statistics. Customer support did not provide any numbers for what "dramatically impact" means, nor say how to check socket buffering or which out-of-order statistics to look at. What does Intel expect in terms of XL710 throughput performance, how do you check the Intel NIC socket buffering settings, and which out-of-order statistics should we examine?

Comparing "ethtool -S i40e1" before and after the iperf3 run does indicate multiple tx changes. Since iperf3 was configured to run as a single thread, we expected the Intel NIC to handle a single flow in a single queue and not spread it around over multiple queues. Can you explain why, by design, scheduling would impact a single flow?

We are also seeing similar behavior from Intel 10GbE NICs in production across WAN links, while other vendors' hardware does not exhibit this out-of-order packet delivery.
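A minimal sketch of the comparison described above, plus counting the reordering events the way Wireshark flags them; the peer address 192.0.2.1 and the file capture.pcap are placeholders:

    # Snapshot NIC counters before and after a single-stream iperf3 run:
    ethtool -S i40e1 > stats.before
    iperf3 -c 192.0.2.1 -t 30 -P 1        # -P 1: one stream
    ethtool -S i40e1 > stats.after
    diff stats.before stats.after | grep -i tx
    # Ring sizes, relevant to the buffering question above:
    ethtool -g i40e1
    # John's LRO mitigation on the 10G NICs used the same feature
    # mechanism: ethtool -K ethX lro off
    # Count the events Wireshark reports as out-of-order or DUP ACK:
    tshark -r capture.pcap \
        -Y 'tcp.analysis.out_of_order || tcp.analysis.duplicate_ack' | wc -l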
-----Original Message-----
From: Fujinaka, Todd [mailto:todd.fujin...@intel.com]
Sent: Wednesday, July 22, 2015 11:25 AM
To: Hu, Tan Chang; e1000-devel@lists.sourceforge.net
Cc: Naegle, John H
Subject: [EXTERNAL] RE: [E1000-devel] Intel Customer Support Service Request Management request # 8001197134

Can you restate your problem so we can start over on this issue? You may want to just start with a new subject line as well. From what I've heard from customer service, there was no issue and the case was closed, but I haven't heard all the details at this time.

Another way to get help is to escalate this through your factory contact (whoever sold you whatever it is you're having problems with) and they can file a ticket in IPS.

My suggestion is to start over here with a problem statement and we can ask you for more details as we go. I'd also suggest filing a bug on sourceforge if you have attachments to send (inline in email is hard to read), but sourceforge is having issues at the moment.

Todd Fujinaka
Software Application Engineer
Networking Division (ND)
Intel Corporation
todd.fujin...@intel.com
(503) 712-4565

-----Original Message-----
From: Hu, Tan Chang [mailto:t...@sandia.gov]
Sent: Wednesday, July 22, 2015 8:27 AM
To: e1000-devel@lists.sourceforge.net
Cc: Naegle, John H
Subject: [E1000-devel] Intel Customer Support Service Request Management request # 8001197134

Dear sir,

Per our understanding of the last email from Intel support, dated July 22, 2015, ticket 8001197134 has been passed on to the Intel driver development team. Please let us know if any additional information, beyond what was previously submitted, is needed to resolve this issue.

Thanks