Hello, a CSIT committer here.

> some DPI like extracting SNI after decrypting the QUIC traffic

Here “some” means that most packets are processed fast (VPP looks at headers 
only),
and only a small percentage of packets need the expensive decryption and 
investigation, right?

> we can not do the decryption on fast thread as it will hold the packets for 
> few time

Are you talking about throughput or latency?
I suspect that the throughput would remain fine, but you might see a “heavier 
tail” in the latency of the otherwise fast packets.

> we can not make the child thread just for decryption part

I guess you are pointing out a “missing feature” in VPP?
In contrast, ipsec traffic already has ways to “offload” crypto processing away 
from the “dataplane” workers,
increasing latency for the flows that need decryption, while barely affecting 
the latency of the flows that do not.

If you want to look at sources, start with the following API messages:
ipsec_set_async_mode [0], crypto_set_async_dispatch [1], 
crypto_sw_scheduler_set_worker [2]
and here [3] is an example of the configuration we run in periodic tests 
(scroll down past the verbose debug info to see the API calls).
Another framework is available for ipsec if you have a corresponding hardware 
accelerator.
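
To give a rough idea of the shape of that configuration, below is a vppctl-style 
sketch of what those API calls translate to. I am writing it from memory, so 
treat the exact CLI wording and the worker indices as assumptions and 
double-check them against the .api files above (or “vppctl help”) on your VPP 
version:

    # enable asynchronous crypto processing for ipsec,
    # so dataplane workers only enqueue crypto jobs instead of doing them inline
    set ipsec async mode on

    # choose how finished crypto jobs are dispatched back to the graph
    # (the exact subcommand and options differ between VPP versions)
    set crypto async dispatch mode interrupt

    # with the crypto_sw_scheduler engine, mark which workers do the crypto work,
    # e.g. keep worker 0 as a pure dataplane worker and let worker 31 handle crypto
    set sw_scheduler worker 0 crypto off
    set sw_scheduler worker 31 crypto on

The net effect is the split you are describing: the fast workers keep 
forwarding, while the decryption happens on the worker(s) dedicated to crypto.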

I believe there is an ongoing effort to “modularize” crypto-related code
so that every protocol (ipsec, wireguard, quic, tls, …) will have access to the 
same capabilities
via a unified framework, but you can imagine that proper implementation takes 
time.

I am not a maintainer, so I should not tell you the usual “contributions are 
welcome” mantra,
but perhaps AI coding agents are good enough nowadays
to “vibe code” you a similar offloading scheme for QUIC?

Vratko.

[0] https://github.com/FDio/vpp/blob/bde4b86f724a686453df679618cae6359f530cd7/src/vnet/crypto/crypto.api#L38
[1] https://github.com/FDio/vpp/blob/bde4b86f724a686453df679618cae6359f530cd7/src/plugins/crypto_sw_scheduler/crypto_sw_scheduler.api#L31
[2] https://github.com/FDio/vpp/blob/bde4b86f724a686453df679618cae6359f530cd7/src/plugins/crypto_sw_scheduler/crypto_sw_scheduler.api#L31
[3] https://logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-mrr-daily-master-3n-icx/21228297698/log.html.gz#s1-s1-s1-s1-s24-t3-k3-k2

From: [email protected] <[email protected]> On Behalf Of Gulshan via 
lists.fd.io
Sent: Thursday, 29 January, 2026 10:09
To: [email protected]
Subject: [vpp-dev] Doing DPI on a slow thread/child thread. #dpdk #mellanox 
#plugin #vpp #vppctl #vpp-dev #vppinfra

Hello Community
I am processing 200Gbps traffic on a VPP framework, mainly for a passive 
firewall. It is currently running fine, but when doing some DPI like extracting 
SNI after decrypting the QUIC traffic, the performance degrades from 200Gbps to 
20-30Gbps. I'm using 32 fast threads and 1 main thread (slow thread) for 
printing stats or adding new policies. I'm getting the rx on a Mellanox CX-6 card.

Since the decryption part is necessary, we can not do the decryption on fast 
thread as it will hold the packets for few time, and we can not make the child 
thread just for decryption part. So, how should I proceed? What should be the 
ideal setup for a condition like this?