Hello, a CSIT committer here.

> some DPI like extracting SNI after decrypting the QUIC traffic

Here “some” means that most packets are processed fast (VPP looks at headers only), and only a small percentage of packets need the expensive decryption and inspection, right?

> we can not do the decryption on fast thread as it will hold the packets for
> few time

Are you talking about throughput or latency? I suspect that the throughput would remain fine, but you might see a “heavier tail” in latency (of the otherwise fast packets).

> we can not make the child thread just for decryption part

I guess you are pointing out a “missing feature” in VPP? In contrast, ipsec traffic already has ways to “offload” crypto processing away from “dataplane” workers, increasing latency for the decryption-needed flows but not affecting the latency of no-decryption flows much.

If you want to look at sources, start with the following API messages: ipsec_set_async_mode [0], crypto_set_async_dispatch [1], crypto_sw_scheduler_set_worker [2]. Here [3] (scroll down past the verbose debug info to see the API calls) is an example of the configuration we run in periodic tests.

Another framework is available for ipsec if you have a corresponding hardware accelerator.

I believe there is an ongoing effort of “modularizing” crypto-related code so every protocol (ipsec, wireguard, quic, tls, …) will have access to the same capabilities via a unified framework, but you can imagine that a proper implementation takes time.

I am not a maintainer, so I should not tell you the usual “contributions are welcome” mantra, but perhaps AI coding agents are good enough nowadays to “vibe code” you a similar offloading scheme for QUIC?

Vratko.
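To make the throughput-versus-latency point a bit more concrete, here is a minimal toy model in Python. It is not VPP code and does not use any VPP API. Every name and number in it (packet spacing, processing costs, the handoff and dispatch delays, the 1-in-100 ratio) is an assumption picked for illustration only. It compares one worker doing everything inline with a dataplane worker that hands the expensive packets to a separate crypto worker.

```python
from dataclasses import dataclass

# All numbers are illustrative assumptions, not VPP measurements.
GAP_US = 1.0             # one packet arrives every microsecond
FAST_US = 0.5            # header-only processing cost
CRYPTO_US = 30.0         # decryption + inspection cost
HANDOFF_US = 0.5         # dataplane cost of enqueueing work for another worker
DISPATCH_DELAY_US = 2.0  # wait until the crypto worker picks the work up
CRYPTO_EVERY = 100       # one packet in a hundred needs decryption


@dataclass
class Packet:
    arrival: float       # arrival time in microseconds
    needs_crypto: bool   # True if the packet must be decrypted


def make_traffic(count):
    """Evenly spaced packets; every CRYPTO_EVERY-th one needs decryption."""
    return [Packet(i * GAP_US, i % CRYPTO_EVERY == CRYPTO_EVERY - 1)
            for i in range(count)]


def fifo_latencies(packets, cost_of):
    """Serve packets in order on a single worker; return latency per packet."""
    free_at = 0.0
    latencies = []
    for pkt in packets:
        start = max(free_at, pkt.arrival)
        free_at = start + cost_of(pkt)
        latencies.append(free_at - pkt.arrival)
    return latencies


def simulate(packets, offload):
    """Return worst-case latency of fast and of crypto packets."""
    if offload:
        # Dataplane worker only pays a small handoff cost for crypto packets.
        dp = fifo_latencies(
            packets, lambda p: HANDOFF_US if p.needs_crypto else FAST_US)
        slow = [(p, lat) for p, lat in zip(packets, dp) if p.needs_crypto]
        # Crypto worker sees the packet after handoff plus dispatch delay.
        queued = [Packet(p.arrival + lat + DISPATCH_DELAY_US, True)
                  for p, lat in slow]
        served = fifo_latencies(queued, lambda p: CRYPTO_US)
        crypto_lat = [q.arrival + s - p.arrival
                      for (p, _), q, s in zip(slow, queued, served)]
        fast_lat = [lat for p, lat in zip(packets, dp) if not p.needs_crypto]
    else:
        # Same worker does header processing and decryption.
        lat = fifo_latencies(
            packets,
            lambda p: FAST_US + (CRYPTO_US if p.needs_crypto else 0.0))
        fast_lat = [l for p, l in zip(packets, lat) if not p.needs_crypto]
        crypto_lat = [l for p, l in zip(packets, lat) if p.needs_crypto]
    return {"max_fast_us": max(fast_lat), "max_crypto_us": max(crypto_lat)}


if __name__ == "__main__":
    traffic = make_traffic(1000)
    print("inline :", simulate(traffic, offload=False))
    print("offload:", simulate(traffic, offload=True))
```

With these assumed numbers both setups keep up with the offered load, but inline decryption makes the fast packets queued behind a crypto packet wait for it, while offloading keeps their latency flat and makes the crypto packets slightly slower. Real behaviour depends on queue depths, batching and the number of crypto workers, none of which this sketch models.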
[0] https://github.com/FDio/vpp/blob/bde4b86f724a686453df679618cae6359f530cd7/src/vnet/crypto/crypto.api#L38
[1] https://github.com/FDio/vpp/blob/bde4b86f724a686453df679618cae6359f530cd7/src/plugins/crypto_sw_scheduler/crypto_sw_scheduler.api#L31
[2] https://github.com/FDio/vpp/blob/bde4b86f724a686453df679618cae6359f530cd7/src/plugins/crypto_sw_scheduler/crypto_sw_scheduler.api#L31
[3] https://logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-mrr-daily-master-3n-icx/21228297698/log.html.gz#s1-s1-s1-s1-s24-t3-k3-k2

From: [email protected] <[email protected]> On Behalf Of Gulshan via lists.fd.io
Sent: Thursday, 29 January, 2026 10:09
To: [email protected]
Subject: [vpp-dev] Doing DPI on a slow thread/child thread. #dpdk #mellanox #plugin #vpp #vppctl #vpp-dev #vppinfra

Hello Community

I am processing 200Gbps traffic on a VPP framework, mainly for a passive firewall. It is currently running fine, but when I do some DPI like extracting SNI after decrypting the QUIC traffic, the performance degrades from 200Gbps to 20-30Gbps.

I'm using 32 fast threads and 1 main thread (slow thread) for printing stats or adding new policies. I'm getting the rx on a Mellanox CX-6 card.

Since the decryption part is necessary and we can not do the decryption on fast thread as it will hold the packets for few time, and we can not make the child thread just for decryption part.

So, how should I proceed? What should be the ideal setup for a condition like this?
View/Reply Online (#26747): https://lists.fd.io/g/vpp-dev/message/26747
