On Wed, 30 Oct 2024 16:40:10 +0100
Lukáš Šišmiš <sis...@cesnet.cz> wrote:

> On 30. 10. 24 16:20, Stephen Hemminger wrote:
> > On Wed, 30 Oct 2024 14:58:40 +0100
> > Lukáš Šišmiš <sis...@cesnet.cz> wrote:
> >  
> >> On 29. 10. 24 15:37, Morten Brørup wrote:  
> >>>> From: Lukas Sismis [mailto:sis...@cesnet.cz]
> >>>> Sent: Tuesday, 29 October 2024 13.49
> >>>>
> >>>> Intel PMDs are capped by default to only 4096 RX/TX descriptors.
> >>>> This can be limiting for applications requiring larger buffer
> >>>> capacity. The cap prevented applications from configuring
> >>>> more descriptors. By buffering more packets in the RX/TX
> >>>> descriptor rings, applications can better absorb processing
> >>>> peaks.
> >>>>
> >>>> Signed-off-by: Lukas Sismis <sis...@cesnet.cz>
> >>>> ---  
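
For context, a minimal sketch (not part of the patch) of where the cap bites: the PMD advertises its limit via rte_eth_dev_info_get(), and a queue setup request above rx_desc_lim.nb_max is rejected, so applications typically clamp against it. The function name, port/queue ids, and the requested size here are illustrative:

#include <rte_ethdev.h>

static int
setup_rx_ring(uint16_t port_id, uint16_t queue_id,
	      uint16_t wanted_desc, struct rte_mempool *mb_pool)
{
	struct rte_eth_dev_info info;
	int ret = rte_eth_dev_info_get(port_id, &info);

	if (ret != 0)
		return ret;

	/* Clamp to the PMD's advertised maximum; with the Intel PMDs
	 * discussed here that maximum is currently 4096. */
	uint16_t nb_desc = RTE_MIN(wanted_desc, info.rx_desc_lim.nb_max);

	return rte_eth_rx_queue_setup(port_id, queue_id, nb_desc,
				      rte_eth_dev_socket_id(port_id),
				      NULL, mb_pool);
}
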
> >>> Seems like a good idea.
> >>>
> >>> Has the max number of descriptors been checked against the datasheets
> >>> for all the affected NIC chips?
> >>>     
> >> I was hoping to get some feedback on this from the Intel folks.
> >>
> >> But it seems I can raise it only for ixgbe (82599), to 32k
> >> (possibly to 64k - 8); the others - ice (E810) and i40e (X710) - are
> >> capped at 8k - 32.
> >>
> >> I have neither experience with the other drivers nor hardware to test
> >> them on, so I will leave them as they are in the follow-up version of
> >> this patch.
> >>
> >> Lukas
> >>  
> > Having a large number of descriptors, especially at lower speeds, will
> > increase buffer bloat. Real-life applications do not want to increase
> > latency by more than 1 ms.
> >
> > 10 Gbps has 7.62 Gbps of effective bandwidth due to overhead.
> > The rate at 1500 MTU is 7.62 Gbps / (1500 * 8) = 635 Kpps (i.e. ~1.6 us
> > per packet).
> > A ring of 4096 descriptors can take over 6 ms to drain with full-size
> > packets.
> >
> > Be careful, optimizing for 64 byte benchmarks can be a disaster in the real world.
> >  
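
As a quick sanity check of the arithmetic above, a minimal standalone snippet; the 7.62 Gbps effective rate, the 1500-byte packet size, and the 4096-descriptor ring are the example values from Stephen's message:

#include <stdio.h>

int main(void)
{
	/* Effective rate after Ethernet overhead, per the estimate above. */
	double eff_bps = 7.62e9;
	double pkt_bits = 1500.0 * 8.0;       /* full-size packets */
	double pps = eff_bps / pkt_bits;      /* ~635 Kpps */
	double drain_ms = 4096.0 / pps * 1e3; /* time to drain a full ring */

	printf("%.0f pps, 4096-descriptor ring drains in %.2f ms\n",
	       pps, drain_ms);
	return 0;
}

This prints 635000 pps and about 6.45 ms, matching the figures above.
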
> Thanks for the info, Stephen; however, I am not trying to optimize for 
> 64 byte benchmarks. The work was initiated by an IO problem with Intel 
> NICs. A Suricata IDS worker (1 core per queue) receives a burst of 
> packets and then processes them sequentially, one by one. Having 4k 
> descriptors seems not to be enough. NVIDIA NICs allow e.g. 32k 
> descriptors and work fine there, and in the end ixgbe worked fine as 
> well once its descriptor limit was increased. I am not sure why 
> AF_PACKET handles this so much better than DPDK; AF_PACKET is not 
> configured with a crazy-high number of descriptors (<= 4096), yet it 
> works better. At the moment I assume there is internal buffering in 
> the kernel which absorbs the processing spikes.
> 
> To give more context here is the forum discussion - 
> https://forum.suricata.io/t/high-packet-drop-rate-with-dpdk-compared-to-af-packet-in-suricata-7-0-7/4896
> 
> 
> 

I suspect AF_PACKET provides an intermediate step which can buffer more
or spread out the work.
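
One way an application could emulate that intermediate step in DPDK is a software ring between the RX core and the worker, so bursts beyond the hardware ring are parked in memory instead of dropped at the NIC. A minimal sketch, not something taken from Suricata; the ring name, size, and drop policy are illustrative:

#include <rte_ring.h>
#include <rte_mbuf.h>
#include <rte_ethdev.h>

#define BURST 32

/* Created once at init, e.g.:
 * spill = rte_ring_create("spill", 65536, rte_socket_id(),
 *                         RING_F_SP_ENQ | RING_F_SC_DEQ); */
static struct rte_ring *spill;

/* RX core: drain the NIC ring quickly and park the backlog in the
 * software ring; the worker dequeues from it at its own pace. */
static void
rx_loop(uint16_t port_id, uint16_t queue_id)
{
	struct rte_mbuf *pkts[BURST];

	for (;;) {
		uint16_t nb = rte_eth_rx_burst(port_id, queue_id,
					       pkts, BURST);
		if (nb == 0)
			continue;

		unsigned int nq = rte_ring_enqueue_burst(spill,
				(void **)pkts, nb, NULL);

		/* If the software ring is also full, drop here rather
		 * than let the hardware ring back up. */
		for (unsigned int i = nq; i < nb; i++)
			rte_pktmbuf_free(pkts[i]);
	}
}

The worker side would call rte_ring_dequeue_burst() symmetrically; whether this matches what the kernel actually does for AF_PACKET is an assumption.
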
