On Sat, Sep 24, 2016 at 4:40 PM, Michał Purzyński
<michalpurzyns...@gmail.com> wrote:
> Thank you for being persistent with answers.
>
> So right after sending the previous email I noticed that I had left over
> some careless IRQ assignments after experimenting with IRQ and process CPU
> affinity. Both cards were hitting the same core, which (for the second
> card) was on a different NUMA node, plus that core was saturated.
>
> The result was around 38% packets lost, calculated by comparing packets
> received with rx_missed. It's interesting that no other counter was
> increasing.
>
> Right now I have moved card 0 to core 0 and card 1 to the first core of
> the second CPU.
>
> Now rx_missed is around 6-7% for each card. Still way too much.
>
> I send a total of 8-11 Gbit/sec to both cards, so each receives around
> half of that. Packet rate is 1.2 Mpps tops (also total). All kinds of
> packet sizes.
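
On the affinity point, it is worth double checking that each port's vector
really landed where you think it did. A minimal sketch, assuming single-queue
operation; the interface names, IRQ numbers, and core numbers below are
placeholders rather than values from your box:

# Find the IRQ each port's queue vector is using
grep -E 'p1p1|p3p1' /proc/interrupts

# Pin card 0's vector to core 0 and card 1's vector to the first core of
# the second socket (14 is only an example; check the NUMA layout with
# lscpu and /sys/class/net/<iface>/device/numa_node)
echo 0  > /proc/irq/123/smp_affinity_list
echo 14 > /proc/irq/124/smp_affinity_list

# Note that irqbalance may move these again later, so stop it if it is
# running.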
So if you are doing packet analysis I assume you don't need LRO or GRO. If
you don't, you may want to look into disabling them via "ethtool -K". I know
RSC can sometimes cause packet drops due to aggregating a number of frames
before finally submitting them to the device, although that usually required
ASPM to be enabled as well.

> I'll lower rings to 512 as the next step. Good to know about the card's
> limitations.
>
> Given that InterruptThrottleRate has to be given in a 'number of
> interrupts / second', what would you recommend I set it to, for a start at
> least? I have a 2.6GHz Xeon E5 v3.

So I would recommend a value no less than 12500 for InterruptThrottleRate.
Assuming a reasonable packet rate that should give you a decent trade-off in
terms of performance versus latency.
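
Concretely, the two changes above could look something like the following.
p1p1/p3p1 are placeholders for your capture ports, and reloading the module
will of course drop the link and anything captured during the reload:

# Disable GRO and LRO on the capture ports (for ixgbe the lro flag should
# also cover hardware RSC)
ethtool -K p1p1 gro off lro off
ethtool -K p3p1 gro off lro off

# Reload the out-of-tree driver with the suggested throttle rate, keeping
# one value per port as you already do (add the other options you pass,
# e.g. DCA, in the same way)
rmmod ixgbe
modprobe ixgbe MQ=0,0 RSS=1,1 VMDQ=0,0 FCoE=0,0 LRO=0,0 InterruptThrottleRate=12500,12500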
> I'll buy a pair of X710 for a test as well. It will be an interesting
> comparison. Who knows, maybe the RSS implementation and MQ there is good
> enough for IDS to be used.
>
> Fortunately I don't run it inline, this server receives a copy of the
> traffic.

I'm not sure if it will get you much more throughput or not. I still find it
odd that you're dropping packets even though the device isn't complaining
about not having ring buffer resources. Usually that points to a bottleneck
somewhere in the PCIe bus. You might want to double check and verify that
the devices are connected directly to the root complex and not to some
secondary bus on a PCIe switch that is actually downgrading the link between
the device and the CPU socket.
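
If it helps, one way to check that from software; the bus address below is a
placeholder for whatever lspci shows for your 82599 ports:

# Show the PCIe tree; ideally the NICs sit directly under a root port,
# not behind an intermediate switch
lspci -tv

# Check the negotiated link on each port; LnkSta should report
# Speed 5GT/s, Width x8 (newer lspci versions also flag a downgraded link)
lspci -s 04:00.0 -vvv | grep -E 'LnkCap|LnkSta'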
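
It may also be worth watching the relevant counters side by side while the
test traffic is running, so a FIFO overrun is easy to tell apart from ring
or buffer exhaustion; exact counter names can vary a little between driver
versions:

# rx_missed_errors                        -> packet buffer (FIFO) overflow
# rx_no_dma_resources / rx_no_buffer_count -> host side not keeping up
watch -n 1 "ethtool -S p1p1 | grep -E 'rx_missed|rx_no_dma|rx_no_buffer'"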

> On Sat, Sep 24, 2016 at 3:20 AM, Alexander Duyck
> <alexander.du...@gmail.com> wrote:
>>
>> Well as a general rule anything over about 80usecs for
>> InterruptThrottleRate is a waste. One advantage to reducing the
>> interrupt throttle rate is you can reduce the ring size and you might
>> see a slight performance improvement. One problem with using 4096
>> descriptors is that it greatly increases the cache footprint and leads
>> to more buffer-bloat and cache thrash as you have to evict old
>> descriptors to pull in new ones. I'm also sure that if you are doing an
>> intrusion detection system (I'm assuming that is what IDS is in
>> reference to), then the users would appreciate it if you didn't add up
>> to a half dozen extra milliseconds of latency to their network (worst
>> case with an elephant flow of 1514 byte frames).
>>
>> What size packets are you working with? One limitation of the 82599 is
>> that it can only handle an upper limit of somewhere around 12Mpps if
>> you are using something like 6 queues, and only a little over 2Mpps for
>> a single queue. If you exceed that then the part will start reporting
>> rx_missed because the PCIe overhead for moving 64 byte packets is great
>> enough that it actually causes us to exceed the limits of the x8 gen2
>> link. If the memcpy is what I think it is then it allows us to avoid
>> having to do two different atomic operations that would have been more
>> expensive otherwise.
>>
>> On Fri, Sep 23, 2016 at 12:46 PM, Michał Purzyński
>> <michalpurzyns...@gmail.com> wrote:
>> > Here's what I did
>> >
>> > ethtool -A p1p1 rx off tx off
>> > ethtool -A p3p1 rx off tx off
>> >
>> > Both ethtool -a <interface> and the Arista that's pumping data show
>> > that RX/TX pause are disabled.
>> >
>> > I have two cards, each connected to a separate NUMA node, threads
>> > pinned, etc.
>> >
>> > One non-standard thing is that I use a single queue only, because any
>> > form of multiqueue leads to packet reordering and confuses the IDS. An
>> > issue that's been hidden for a while in the NSM community.
>> >
>> > The driver (from sourceforge) was loaded with MQ=0 DCA=2 RSS=1 VMDQ=0
>> > InterruptThrottleRate=956 FCoE=0 LRO=0 vxlan_rx=0 (each option's value
>> > given enough times so it applies to all cards in this system).
>> >
>> > I could see the same issue sending traffic to just one card.
>> >
>> > Of course a single core is swamped with ACK-ing the hardware IRQ and
>> > then doing softIRQ (which seems to be mostly memcpy?). But then again,
>> > I don't see errors about lacking buffers (I run with 4096 descriptors).
>> >
>> > On Fri, Sep 23, 2016 at 9:22 PM, Alexander Duyck
>> > <alexander.du...@gmail.com> wrote:
>> >>
>> >> When you say you disabled flow control, did you disable it on the
>> >> interface that is dropping packets or the other end? You might try
>> >> explicitly disabling it on the interface that is dropping packets;
>> >> that in turn should enable per-queue drop instead of putting
>> >> back-pressure onto the Rx FIFO.
>> >>
>> >> With flow control disabled on the local port you should see
>> >> rx_no_dma_resources start incrementing if the issue is that one of
>> >> the Rx rings is not keeping up.
>> >>
>> >> - Alex
>> >>
>> >> On Fri, Sep 23, 2016 at 11:09 AM, Michał Purzyński
>> >> <michalpurzyns...@gmail.com> wrote:
>> >> > xoff was increasing so I disabled flow control.
>> >> >
>> >> > That's a HP DL360 Gen9 and lspci -vvv tells me the cards are
>> >> > connected to an x8 link, speed is 5GT/s and ASPM is disabled.
>> >> >
>> >> > Other error counters are still zero. When I compared rx_packets and
>> >> > rx_missed_errors it looks like 38% (!!) of packets are getting lost.
>> >> >
>> >> > Unfortunately HP documentation is a scam and they actively avoid
>> >> > publishing the motherboard layout.
>> >> >
>> >> > Any other place I could look for hints?
>> >> >
>> >> > On Fri, Sep 23, 2016 at 7:01 PM, Alexander Duyck
>> >> > <alexander.du...@gmail.com> wrote:
>> >> >>
>> >> >> On Fri, Sep 23, 2016 at 1:10 AM, Michał Purzyński
>> >> >> <michalpurzyns...@gmail.com> wrote:
>> >> >> > Hello.
>> >> >> >
>> >> >> > On my IDS workload with af_packet I can see rx_missed_errors
>> >> >> > growing while rx_no_buffer_count does not. Basically every other
>> >> >> > kind of rx_ error counter is 0, including rx_no_dma_resources.
>> >> >> > It's an 82599 based card.
>> >> >> >
>> >> >> > I don't know what to think about that. I went through the ixgbe
>> >> >> > source code and the 82599 datasheet, and it seems like
>> >> >> > rx_missed_errors means a new packet overwrote something already
>> >> >> > in the packet buffer (FIFO queue on the card) because there was
>> >> >> > no more space in it.
>> >> >> >
>> >> >> > Now, that would happen if there is no place to DMA packets into -
>> >> >> > but that counter does not grow.
>> >> >> >
>> >> >> > Could you point me to where I should be looking for a problem?
>> >> >> >
>> >> >> > --
>> >> >> > Michal Purzynski
>> >> >>
>> >> >> The Rx missed count will increment if you are not able to receive
>> >> >> a packet because the Rx FIFO is full. If you are not seeing any
>> >> >> rx_no_dma_resources problems it might indicate that the problem is
>> >> >> not with providing the DMA resources, but a problem on the bus
>> >> >> itself. You might want to double check the slot the device is
>> >> >> connected to in order to guarantee that there is a x8 link that
>> >> >> supports 5GT/s all the way through to the root complex.
>> >> >>
>> >> >> - Alex

------------------------------------------------------------------------------
_______________________________________________
E1000-devel mailing list
E1000-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/e1000-devel
To learn more about Intel® Ethernet, visit
http://communities.intel.com/community/wired