-----Original Message-----
From: Stephen Hemminger <[email protected]>
Sent: Tuesday, March 24, 2026 9:17 PM
To: A V, KavyaX <[email protected]>
Cc: [email protected]; Richardson, Bruce <[email protected]>; Singh, Aman
Deep <[email protected]>; Medvedkin, Vladimir
<[email protected]>; Wani, Shaiq <[email protected]>;
[email protected]
Subject: Re: [PATCH v3] app/testpmd: fix DCB queue allocation for VMDq devices
On Tue, 24 Mar 2026 10:05:00 +0000
KAVYA AV <[email protected]> wrote:
> When using DCB mode with VT disabled and requesting more queues than
> traffic classes (e.g., rxq=64 with 8 TCs), testpmd crashes with null
> pointer errors because it artificially limits queue allocation to
> num_tcs.
>
> For VMDq devices, use actual VMDq queue layout (vmdq_queue_num)
> instead of limiting to num_tcs. This allows VMDq devices to utilize
> their full queue capacity while maintaining compatibility with non-VMDq
> devices.
>
> Fixes null pointer dereference when queue structures are accessed
> beyond the allocated range.
>
> Fixes: 2169699b15fc ("app/testpmd: add queue restriction in DCB
> command")
> Cc: [email protected]
>
> Signed-off-by: KAVYA AV <[email protected]>
> ---
I can't follow all the details here; it's a rather complex and esoteric case,
so I had an AI review it.
The feedback from the AI raised some questions (the "I" below is the AI, not me):
The basic idea of using dynamic VMDq queue info instead of limiting to num_tcs
is right -- the current code clearly crashes when more queues are requested
than traffic classes.
However, I'm not convinced vmdq_queue_num is the correct value here.
This code path is DCB-only (VT disabled) on a device with vmdq_pool_base > 0,
which in practice means i40e. In that case vmdq_queue_num is the total VMDq
pool queue count (vmdq_nb_qps * max_nb_vmdq_vsi), but with VT disabled the PF
queues are what's used for DCB, not the VMDq pool queues. The PF queue count
would be max_rx_queues - vmdq_queue_num. Using the VMDq count here could
over-allocate or misconfigure queues in DCB-only mode.
Can you explain why vmdq_queue_num is the right value rather than the PF queue
count? Or test what happens when this value exceeds what the hardware supports
in DCB-only mode?
After testing with values beyond the hardware-supported limits, I found that
the application could still configure up to max_rx_queues (320 for i40e)
without any errors or crashes. The fwd_stream allocation and queue
configuration also completed successfully.
Current number of RX queues: 320
Max possible RX queues: 320
nb_ports= 1
max_q= 320
Created 320 fwd_stream(s)
Although show port dcb_tc 0 indicates a limit of 8 queues per TC, the
I40E_MAX_Q_PER_TC constant is 64.
testpmd> show port dcb_tc 0
================ DCB infos for port 0 ================
TC NUMBER: 8
TC : 0 1 2 3 4 5 6 7
Prio2TC : 0 1 2 3 4 5 6 7
BW percent : 13% 13% 13% 13% 12% 12% 12% 12%
RXQ base : 0 8 16 24 32 40 48 56
RXQ number : 8 8 8 8 8 8 8 8
TXQ base : 0 8 16 24 32 40 48 56
TXQ number : 8 8 8 8 8 8 8 8
As suggested, I switched from the vmdq_queue_num-based layout to the PF-based
layout, and testing showed that the scenario works correctly without errors:

	/* Use PF queue count for DCB-only mode with VMDq devices */
	nb_rxq = rte_port->dev_info.max_rx_queues -
		 rte_port->dev_info.vmdq_queue_num;
	nb_txq = rte_port->dev_info.max_tx_queues -
		 rte_port->dev_info.vmdq_queue_num;
Please confirm whether it is appropriate to proceed with this fix.
Minor: the prose line "Fixes null pointer dereference when queue structures are
accessed beyond the allocated range." reads as a sentence fragment. Fold it
into the preceding paragraph or drop it since the Fixes tag already identifies
what's being fixed.