Thanks for your response.
Currently, I’m using DDP, and traffic is distributed correctly; everything
works fine in that setup.

However, I now want to test the same workload without DDP, distributing
traffic with handoff instead. In this scenario, the total traffic rate is
about 60 Gbps, and each of the 20 workers processes roughly 3 Gbps.

I’m aware that the next node (my plugin) can be further optimized for
performance, but that is not my concern at the moment. What I want to
understand is the best approach for distributing traffic using handoff
without introducing packet drops.
Any guidance on best practices for handling this type of distribution at
high rates would be greatly appreciated.

At the moment, I have increased frame_queue_nelts to 2048. Increasing it
further to 4096 actually makes the situation dramatically worse, with drops
rising into the millions of packets. I am currently investigating this
behavior and testing with frame_queue_nelts = 2048 while increasing the
number of frame queues (by creating more than one queue with
vlib_frame_queue_main_init()).
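While testing this, it helps to confirm whether the drops are handoff congestion drops or losses further downstream. A few vppctl checks I would run between traffic bursts (the exact error and node names depend on your plugin, so treat these as a sketch):

```shell
# Reset counters, let traffic run for a few seconds, then inspect.
vppctl clear run
vppctl clear errors

# Per-worker vector rates: the handoff node and the plugin's node
# should show similar activity on every worker if traffic is balanced.
vppctl show run

# Error counters: congestion drops on the handoff path show up here.
vppctl show errors

# rx-queue-to-worker placement for the NIC.
vppctl show hardware-interfaces
```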

On Sat, 20 Dec 2025, 14:55 Vladimir Zhigulin via lists.fd.io, <scripath96=
[email protected]> wrote:

> Hi,
>
> From your results it seems that traffic is not balanced between
> workers; you may have a situation where a single worker receives all the
> traffic and tries to distribute it to the others.
>
> Make sure to look at rx placement and hashing in `vppctl show
> hardware-interfaces`. Each worker should have its own rx queue on the
> involved hardware interface, and the distribution across those queues
> should be balanced, so that traffic is balanced as well, e.g. 3 workers =
> 3 interface rx queues. To check that hashing is working, run `vppctl show
> run | grep node-name-which-distributes-packets` - the node's counters
> should increase similarly on each worker, i.e. balanced.
>
> 1. 2048 should be more than enough. In my testing I had zero drops with
> the default value of 64 for up to 4 workers. For more workers it can be
> increased.
> 2. If packets are balanced and loss still persists, then instead of
> increasing the queue size, figure out why packets are not processed fast
> enough on the other end. The queue exists only to hold packets while the
> target worker finishes its current graph loop, plus possible lags on both
> sides. A larger queue will absorb more lag on the target worker, but I
> suggest figuring out why the node lags in the first place, using perf.
> 3. Having a single per-worker queue per node/plugin is optimal for most
> cases.
>
> Can you include more information: configuration, worker count, speeds,
> and how much traffic is handled by a single worker?
>
-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.
View/Reply Online (#26672): https://lists.fd.io/g/vpp-dev/message/26672
Mute This Topic: https://lists.fd.io/mt/116856661/21656
Group Owner: [email protected]
Unsubscribe: https://lists.fd.io/g/vpp-dev/leave/14379924/21656/631435203/xyzzy 
[[email protected]]
-=-=-=-=-=-=-=-=-=-=-=-