Thanks Stefan, I'll set up a test to replicate your traffic profile as closely as possible and let it run overnight to see if I can repro and then update you tomorrow.
It does seem that it has nothing to do with load so that makes it even more curious.

- Greg

> -----Original Message-----
> From: Stefan Priebe [mailto:s.pri...@profihost.ag]
> Sent: Thursday, August 13, 2015 11:53 AM
> To: Rose, Gregory V; e1000-devel@lists.sourceforge.net
> Subject: Re: [E1000-devel] dropped rx with i40e
>
> Hi,
>
> sorry for top posting.
>
> I will try to describe the workload as well as i can.
>
> Application is ceph storage (http://ceph.com/).
>
> Workload is TCP only, Active/Active bond on both ports of the XL710 card
> and jumbo frames (MTU 9000). Traffic peak was 400MBit/s - so overall speed
> does not seem to matter. Also i can use iperf and get a constant speed of
> 9.8Gb/s in both directions without any rx drops.
>
> The drops don't occur regularly; they just happen at a time X and then stop.
> After some hours it happens again.
> Stefan
>
> On 13.08.2015 at 17:58, Rose, Gregory V wrote:
> > My apologies but I've been unable to get back to this issue.
> >
> > After reviewing the thread I don't see anything about steps to reproduce
> > the problem. I understand that you're seeing dropped packets with the
> > XL710 with various versions of the i40e driver while the X520 with the
> > ixgbe driver does not drop packets under the same load.
> >
> > I don't see any description of the type of traffic load that is causing
> > the problem. That would help me to reproduce the issue.
> >
> > Keep in mind that dropped packets in and of themselves are not a bug. It may
> > mean that the X520 and the ixgbe driver are more mature and have had more
> > "tuning" and thus work better under the type of traffic load you have on
> > your network. Thus it is important that we understand the type of traffic
> > you're seeing on your network so that we can work on making the XL710 and
> > i40e driver performance on par with the X520 and the ixgbe driver.
> >
> > One other thing.
> > Below I notice this:
> >
> >> I tested this one:
> >> ethtool -C eth3 adaptive-rx off adaptive-tx off rx-usecs 2 tx-usecs 0
> >
> > I believe that you would be better off using higher values. Really low
> > values mean the HW interrupt will fire more often - instead you should
> > allow the soft IRQ polling to keep processing packets.
> >
> > - Greg
> >
> >> -----Original Message-----
> >> From: Stefan Priebe - Profihost AG [mailto:s.pri...@profihost.ag]
> >> Sent: Thursday, August 13, 2015 5:41 AM
> >> To: Rose, Gregory V; e1000-devel@lists.sourceforge.net
> >> Subject: Re: [E1000-devel] dropped rx with i40e
> >>
> >> 1.3.12-k from net-next devel does not help either ;-(
> >>
> >> Should we open an Intel support ticket? We really need a solution.
> >>
> >> Stefan
> >>
> >> On 12.08.2015 at 10:29, Stefan Priebe - Profihost AG wrote:
> >>> Might this be a memory allocation problem? It happens only after
> >>> some hours running and when the whole memory is filled with linux fs
> >>> cache.
> >>>
> >>> Is the i40e driver using kmalloc or vmalloc?
> >>>
> >>> Stefan
> >>> On 11.08.2015 at 06:03, Stefan Priebe wrote:
> >>>> One more thing to note. It mostly happens after around 8-24 hours
> >>>> and i could stop it again by rebooting the system/server. (can't
> >>>> prove it)
> >>>>
> >>>> Stefan
> >>>> On 06.08.2015 at 22:59, Rose, Gregory V wrote:
> >>>>> Thanks Stefan. I think for now you've given us enough data to go
> >>>>> on - I've got some research to do and then I'll get back to you.
> >>>>>
> >>>>> - Greg
> >>>>>
> >>>>>> -----Original Message-----
> >>>>>> From: Stefan Priebe - Profihost AG [mailto:s.pri...@profihost.ag]
> >>>>>> Sent: Wednesday, August 05, 2015 11:32 PM
> >>>>>> To: Rose, Gregory V; e1000-devel@lists.sourceforge.net
> >>>>>> Subject: Re: [E1000-devel] dropped rx with i40e
> >>>>>>
> >>>>>> On 06.08.2015 at 00:22, Rose, Gregory V wrote:
> >>>>>>> Stefan,
> >>>>>>>
> >>>>>>> Could you please send me the output of 'ethtool' and 'ethtool -i'
> >>>>>>> for each i40e interface that is experiencing the dropped packets issue?
> >>>>>>
> >>>>>> These are around 100 cards. So i won't post the output for all of
> >>>>>> them. As they're all using the same driver and the same firmware - we
> >>>>>> updated all of them - i hope it's ok to post the output only from
> >>>>>> one of them.
> >>>>>>
> >>>>>> # ethtool eth2
> >>>>>> Settings for eth2:
> >>>>>>     Supported ports: [ FIBRE ]
> >>>>>>     Supported link modes: 10000baseT/Full
> >>>>>>     Supported pause frame use: Symmetric
> >>>>>>     Supports auto-negotiation: No
> >>>>>>     Advertised link modes: Not reported
> >>>>>>     Advertised pause frame use: No
> >>>>>>     Advertised auto-negotiation: No
> >>>>>>     Speed: 10000Mb/s
> >>>>>>     Duplex: Full
> >>>>>>     Port: Direct Attach Copper
> >>>>>>     PHYAD: 0
> >>>>>>     Transceiver: external
> >>>>>>     Auto-negotiation: off
> >>>>>>     Supports Wake-on: g
> >>>>>>     Wake-on: d
> >>>>>>     Current message level: 0x0000000f (15)
> >>>>>>                            drv probe link timer
> >>>>>>     Link detected: yes
> >>>>>> # ethtool -i eth2
> >>>>>> driver: i40e
> >>>>>> version: 1.3.4-k
> >>>>>> firmware-version: f4.33.31377 a1.2 n4.42 e191b
> >>>>>> bus-info: 0000:03:00.0
> >>>>>> supports-statistics: yes
> >>>>>> supports-test: yes
> >>>>>> supports-eeprom-access: yes
> >>>>>> supports-register-dump: yes
> >>>>>> supports-priv-flags: yes
> >>>>>>
> >>>>>>> The system log might also help - dmesg can get that.
> >>>>>>> That'll give me something to look at.
> >>>>>>
> >>>>>> As this one is pretty long, i pasted dmesg to pastebin:
> >>>>>> http://pastebin.com/raw.php?i=7Tjp3eDT
> >>>>>>
> >>>>>>> By the way, have you tried using ethtool to turn adaptive RX and
> >>>>>>> TX off to see if that has any impact on the dropped packets?
> >>>>>>
> >>>>>> I tested this one:
> >>>>>> ethtool -C eth3 adaptive-rx off adaptive-tx off rx-usecs 2 tx-usecs 0
> >>>>>>
> >>>>>> but it has not helped. Still dropped rx packets, while a 2nd
> >>>>>> system receiving the same load using ixgbe has no dropped packets.
> >>>>>>
> >>>>>>> That might be an easy test to run.
> >>>>>>
> >>>>>> Thanks!
> >>>>>>
> >>>>>> Greets,
> >>>>>> Stefan
> >>>>>>
> >>>>>>
> >>>>>>> Thanks,
> >>>>>>>
> >>>>>>> - Greg
> >>>>>>>
> >>>>>>>> -----Original Message-----
> >>>>>>>> From: Stefan Priebe [mailto:s.pri...@profihost.ag]
> >>>>>>>> Sent: Wednesday, August 05, 2015 11:14 AM
> >>>>>>>> To: e1000-devel@lists.sourceforge.net
> >>>>>>>> Subject: Re: [E1000-devel] dropped rx with i40e
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Something i've noticed:
> >>>>>>>> ixgbe:
> >>>>>>>> Adaptive RX: off TX: off
> >>>>>>>> rx-usecs: 1
> >>>>>>>> tx-usecs: 0
> >>>>>>>>
> >>>>>>>> i40e:
> >>>>>>>> Adaptive RX: on TX: on
> >>>>>>>> rx-usecs: 62
> >>>>>>>> tx-usecs: 122
> >>>>>>>>
> >>>>>>>> Stefan
> >>>>>>>>
> >>>>>>>> On 05.08.2015 at 09:02, Stefan Priebe - Profihost AG wrote:
> >>>>>>>>> Hello list,
> >>>>>>>>>
> >>>>>>>>> we've been using the Intel X520 cards with the ixgbe driver for a
> >>>>>>>>> long time for our cloud infrastructure. We never had a problem
> >>>>>>>>> with dropped packets and everything was always fine.
> >>>>>>>>>
> >>>>>>>>> A year ago we started switching to the X710 cards as they're
> >>>>>>>>> better regarding their specs (lower power consumption, lower
> >>>>>>>>> latency, better price).
> >>>>>>>>>
> >>>>>>>>> We've around 100 X710 cards running now and we had a lot of
> >>>>>>>>> trouble with them.
> >>>>>>>>> Back in 2014 there was a firmware bug,
> >>>>>>>>> then there were driver problems with bonding and so on.
> >>>>>>>>>
> >>>>>>>>> Now we have detected a new problem! We're seeing a lot of
> >>>>>>>>> rx_dropped packets on all X710 cards while all ixgbe based
> >>>>>>>>> cards are working fine.
> >>>>>>>>>
> >>>>>>>>> I've tested the 1.2.48 driver as well as the latest 1.3.4-k
> >>>>>>>>> driver from 4.2-rc5.
> >>>>>>>>>
> >>>>>>>>> Can anybody help?
> >>>>>>>>>
> >>>>>>>>> Greets,
> >>>>>>>>> Stefan
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>> ------------------------------------------------------------------------------
> >>>>>>>> _______________________________________________
> >>>>>>>> E1000-devel mailing list
> >>>>>>>> E1000-devel@lists.sourceforge.net
> >>>>>>>> https://lists.sourceforge.net/lists/listinfo/e1000-devel
> >>>>>>>> To learn more about Intel® Ethernet, visit
> >>>>>>>> http://communities.intel.com/community/wired
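
Since the drops in this thread show up only at some time X after hours of running, one way to pin down when they start is to compare counter snapshots taken a fixed interval apart. The following is a minimal sketch, not anything taken from the i40e driver or from Intel tooling. It assumes the snapshots are text in the "name: value" layout that `ethtool -S <iface>` prints, and that the counter of interest is called rx_dropped as mentioned in the thread. The function names (parse_ethtool_stats, counter_growth, max_irq_rate) are hypothetical. The last helper illustrates the coalescing remark above under the simplifying assumption that rx-usecs is the minimum spacing between interrupts on one queue vector, so it gives only a rough upper bound.

```python
import re

# Matches lines such as "     rx_dropped: 12"; the "NIC statistics:" header
# has no number after the colon and is therefore skipped.
_STAT_LINE = re.compile(r"^\s*([^:\s][^:]*?):\s*(-?\d+)\s*$")


def parse_ethtool_stats(text):
    """Parse `ethtool -S`-style text into a dict of counter name -> int."""
    stats = {}
    for line in text.splitlines():
        match = _STAT_LINE.match(line)
        if match:
            stats[match.group(1)] = int(match.group(2))
    return stats


def counter_growth(before, after, names=("rx_dropped",)):
    """Return how much each named counter grew between two snapshots.

    Counters missing from either snapshot are left out of the result.
    """
    return {
        name: after[name] - before[name]
        for name in names
        if name in before and name in after
    }


def max_irq_rate(rx_usecs):
    """Rough upper bound on interrupts per second for one queue vector.

    Assumption: rx_usecs is the minimum gap between two interrupts.
    """
    if rx_usecs <= 0:
        raise ValueError("rx_usecs must be positive for this estimate")
    return 1_000_000 / rx_usecs


if __name__ == "__main__":
    # Made-up sample snapshots, only to show the call pattern.
    first = parse_ethtool_stats("NIC statistics:\n     rx_dropped: 5\n")
    second = parse_ethtool_stats("NIC statistics:\n     rx_dropped: 12\n")
    print(counter_growth(first, second))
    print(max_irq_rate(2), max_irq_rate(62))
```

Run from cron or a loop with a timestamp per snapshot, the growth values would show whether the drops line up with something else on the host, such as the page cache filling up. Under the stated assumption, rx-usecs 2 allows up to about 500,000 interrupts per second per vector, against roughly 16,000 at the i40e default of 62 reported earlier in the thread.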