On 30. 10. 24 16:20, Stephen Hemminger wrote:
On Wed, 30 Oct 2024 14:58:40 +0100
Lukáš Šišmiš <sis...@cesnet.cz> wrote:

On 29. 10. 24 15:37, Morten Brørup wrote:
From: Lukas Sismis [mailto:sis...@cesnet.cz]
Sent: Tuesday, 29 October 2024 13.49

Intel PMDs cap the number of RX/TX descriptors to 4096 by default.
This can be limiting for applications that require larger buffering
capabilities, as the cap prevents them from configuring more
descriptors. By buffering more packets in the RX/TX descriptor
rings, applications can better absorb processing peaks.

Signed-off-by: Lukas Sismis <sis...@cesnet.cz>
---
Seems like a good idea.

Has the maximum number of descriptors been checked against the datasheets for all the
affected NIC chips?
I was hoping to get some feedback on this from the Intel folks.

But it seems like I can only change it for ixgbe (82599), to 32k
(possibly to 64k - 8); the others, ice (E810) and i40e (X710), are capped at
8k - 32.
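(For reference, the limit a given PMD actually enforces can be read back at
runtime through the generic ethdev API. A minimal sketch, assuming port 0 is
already probed; the function name and error handling are illustrative only:

#include <stdio.h>
#include <rte_ethdev.h>

/* Print the RX/TX descriptor limits the PMD advertises for a port.
 * These are the same per-driver caps discussed above. */
static int print_desc_limits(uint16_t port_id)
{
    struct rte_eth_dev_info dev_info;
    int ret = rte_eth_dev_info_get(port_id, &dev_info);

    if (ret != 0)
        return ret;

    printf("port %u: rx_desc nb_max=%u nb_min=%u nb_align=%u\n",
           port_id, dev_info.rx_desc_lim.nb_max,
           dev_info.rx_desc_lim.nb_min, dev_info.rx_desc_lim.nb_align);
    printf("port %u: tx_desc nb_max=%u nb_min=%u nb_align=%u\n",
           port_id, dev_info.tx_desc_lim.nb_max,
           dev_info.tx_desc_lim.nb_min, dev_info.tx_desc_lim.nb_align);
    return 0;
}

Applications typically clamp their requested ring sizes against these limits
with rte_eth_dev_adjust_nb_rx_tx_desc() before calling
rte_eth_rx_queue_setup() / rte_eth_tx_queue_setup().)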

I have neither experience with the other drivers nor hardware available to
test them, so I will leave them as they are in the follow-up version of this
patch.

Lukas

Having a large number of descriptors, especially at lower speeds, will
increase bufferbloat. For real-life applications, you do not want to increase
latency by more than 1 ms.

10 Gbps has about 7.62 Gbps of effective bandwidth due to overhead.
The rate at 1500-byte MTU is 7.62 Gbps / (1500 * 8) = 635 Kpps (i.e. ~1.57 us per packet).
A ring of 4096 descriptors can therefore take about 6 ms to drain with full-size packets.
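(The same estimate can be rerun for other link speeds and ring sizes; a
minimal sketch of the arithmetic above, where the 76.2% effective-rate figure
and the helper name are only illustrative:

#include <stdio.h>

/* Worst-case buffering delay of a full RX ring of full-size frames:
 * effective_bps / (mtu_bytes * 8) = pps; ring_size / pps = delay. */
static double ring_drain_seconds(double link_gbps, double efficiency,
                                 unsigned mtu_bytes, unsigned ring_size)
{
    double effective_bps = link_gbps * 1e9 * efficiency;
    double pps = effective_bps / (mtu_bytes * 8.0);
    return ring_size / pps;
}

int main(void)
{
    /* 10G link, ~76.2% effective bandwidth, 1500-byte frames. */
    printf("4096 descriptors:  %.1f ms\n",
           ring_drain_seconds(10.0, 0.762, 1500, 4096) * 1e3);
    printf("32768 descriptors: %.1f ms\n",
           ring_drain_seconds(10.0, 0.762, 1500, 32768) * 1e3);
    return 0;
}

This prints roughly 6.4 ms for a 4096-entry ring and over 50 ms for a
32k-entry ring, which is the bufferbloat concern above.)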

Be careful: optimizing for 64-byte benchmarks can be a disaster in the real world.

Thanks for the info, Stephen, but I am not trying to optimize for 64-byte benchmarks. The work was initiated by an IO problem with Intel NICs: a Suricata IDS worker (1 core per queue) receives a burst of packets and then processes them sequentially, one by one, and 4k descriptors do not seem to be enough. NVIDIA NICs allow e.g. 32k descriptors and work fine, and in the end ixgbe worked fine as well once its descriptor limit was increased.

I am not sure why AF_PACKET handles this so much better than DPDK. AF_PACKET is not configured with a crazy-high number of descriptors (<= 4096), yet it works better. At the moment I assume there is internal buffering in the kernel that allows it to absorb processing spikes.

To give more context, here is the forum discussion: https://forum.suricata.io/t/high-packet-drop-rate-with-dpdk-compared-to-af-packet-in-suricata-7-0-7/4896
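(To illustrate the worker model being described: each DPDK worker essentially
runs a loop like the sketch below. process_packet() is a hypothetical
placeholder for Suricata's per-packet work; while it runs, newly arriving
packets can only accumulate in the NIC's RX descriptor ring, so a long
processing spike overflows a 4k ring and the NIC starts dropping:

#include <rte_ethdev.h>
#include <rte_mbuf.h>

#define BURST_SIZE 32

/* Placeholder for Suricata's per-packet processing (decode, detect, ...). */
static void process_packet(struct rte_mbuf *m) { (void)m; }

/* Simplified worker loop, one core per RX queue. */
static void worker_loop(uint16_t port_id, uint16_t queue_id)
{
    struct rte_mbuf *bufs[BURST_SIZE];

    for (;;) {
        uint16_t nb_rx = rte_eth_rx_burst(port_id, queue_id,
                                          bufs, BURST_SIZE);

        for (uint16_t i = 0; i < nb_rx; i++) {
            process_packet(bufs[i]);  /* may be slow during a spike */
            rte_pktmbuf_free(bufs[i]);
        }
    }
}

With AF_PACKET the kernel's socket buffering sits in front of the same loop,
which may be why it rides out the spikes better.)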


