On Sun, 25 Jan 2026 23:58:12 -0800
Dimon Zhao <[email protected]> wrote:

> diff --git a/drivers/net/nbl/nbl_hw/nbl_txrx.c b/drivers/net/nbl/nbl_hw/nbl_txrx.c
> index 650790b4fc..563f011cd3 100644
> --- a/drivers/net/nbl/nbl_hw/nbl_txrx.c
> +++ b/drivers/net/nbl/nbl_hw/nbl_txrx.c
> @@ -69,9 +69,10 @@ static void nbl_res_txrx_release_tx_ring(void *priv, u16 queue_idx)
>       struct nbl_resource_mgt *res_mgt = (struct nbl_resource_mgt *)priv;
>       struct nbl_txrx_mgt *txrx_mgt = NBL_RES_MGT_TO_TXRX_MGT(res_mgt);
>       struct nbl_res_tx_ring *tx_ring = NBL_RES_MGT_TO_TX_RING(res_mgt, queue_idx);
> -     if (!tx_ring)
> -             return;
> +

Automated patch review flagged this removal as a possible problem.

Code Review:

ERROR: NULL pointer dereference introduced

The NULL check for tx_ring was removed but the code immediately dereferences it:

static void nbl_res_txrx_release_tx_ring(void *priv, u16 queue_idx)
{
        struct nbl_resource_mgt *res_mgt = (struct nbl_resource_mgt *)priv;
        struct nbl_txrx_mgt *txrx_mgt = NBL_RES_MGT_TO_TXRX_MGT(res_mgt);
        struct nbl_res_tx_ring *tx_ring = NBL_RES_MGT_TO_TX_RING(res_mgt, queue_idx);

        rte_free(tx_ring->tx_entry);  /* <-- crash if tx_ring is NULL */

The old code had:

        if (!tx_ring)
                return;

This guard must be restored, or the callers must guarantee tx_ring is never 
NULL.

Similarly, nbl_res_txrx_release_rx_ring() performs no NULL check before
dereferencing rx_ring->rx_entry and rx_ring->desc_mz.



Further AI-led analysis shows scenarios where it could be a problem.

Complete Call Chain Analysis for Patch 1/4
Call Flow

DPDK Application
    ↓
rte_eth_dev_tx_queue_release() or eth_dev_ops->tx_queue_release
    ↓
nbl_tx_queues_release() [nbl_dev.c:420]
    ↓
disp_ops->release_tx_ring() → nbl_disp_release_tx_ring() [nbl_dispatch.c:163]
    ↓
res_ops->release_tx_ring() → nbl_res_txrx_release_tx_ring() [nbl_txrx.c]
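
No layer in this chain validates the ring pointer. A reconstructed sketch of
the dispatch hop (the NBL_DISP_MGT_TO_* macro names are assumptions modeled on
the driver's other accessor macros, not verbatim code):

static void nbl_disp_release_tx_ring(void *priv, u16 queue_idx)
{
    struct nbl_dispatch_mgt *disp_mgt = (struct nbl_dispatch_mgt *)priv;

    /* straight pass-through: nothing between the ethdev layer and
     * nbl_res_txrx_release_tx_ring() checks for a NULL ring */
    NBL_DISP_MGT_TO_RES_OPS(disp_mgt)->release_tx_ring(
        NBL_DISP_MGT_TO_RES_PRIV(disp_mgt), queue_idx);
}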

Where tx_ring Can Be NULL

The macro definition:

#define NBL_RES_MGT_TO_TX_RING(res_mgt, index)  \
    (NBL_RES_MGT_TO_TXRX_MGT(res_mgt)->tx_rings[(index)])

Where tx_rings is:

struct nbl_res_tx_ring **tx_rings;  // Array of pointers
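
Never-setup slots in this array read as NULL presumably because the array is
zero-initialized at allocation time; a plausible setup (an assumption, not
verified against the driver source) would be:

txrx_mgt->tx_rings = rte_calloc("nbl_tx_rings", nb_tx_queues,
                                sizeof(*txrx_mgt->tx_rings), 0);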

Scenario 1: Normal close without all queues set up

In nbl_dev_port_close() → nbl_release_queues() (lines 337-349):

for (i = 0; i < eth_dev->data->nb_tx_queues; i++)
    disp_ops->release_tx_ring(NBL_DEV_MGT_TO_DISP_PRIV(dev_mgt), i);

If the application configured 4 queues but only called tx_queue_setup() on 2 of 
them, tx_rings[2] and tx_rings[3] will be NULL.
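
A minimal application-side sequence that produces this state (hypothetical
example using standard rte_ethdev calls):

rte_eth_dev_configure(port_id, 1, 4, &port_conf);  /* declare 4 TX queues */
for (i = 0; i < 2; i++)                            /* but set up only 2 */
    rte_eth_tx_queue_setup(port_id, i, 512, rte_socket_id(), NULL);
rte_eth_dev_close(port_id);  /* release runs for all 4; tx_rings[2..3] are NULL */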

Scenario 2: DPDK ethdev layer calling release on never-setup queue

DPDK's rte_eth_dev_close() can call tx_queue_release for all configured queues 
without validating they were set up.
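
In simplified form (a sketch of the effective behavior, not verbatim ethdev
code), the close path does:

for (qid = 0; qid < dev->data->nb_tx_queues; qid++)
    if (dev->dev_ops->tx_queue_release != NULL)
        dev->dev_ops->tx_queue_release(dev, qid);  /* no was-it-set-up check */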

Scenario 3: Re-setup path mismatch

In the patched nbl_res_txrx_start_tx_ring():

if (eth_dev->data->tx_queues[param->queue_idx] != NULL) {
    NBL_LOG(WARNING, "re-setup an already allocated tx queue");
    nbl_res_txrx_release_tx_ring(priv, param->queue_idx);  // <-- Called

The check is on eth_dev->data->tx_queues[] (ethdev layer), but tx_rings[] 
(resource layer) could differ if a previous setup partially failed.
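
One hypothetical sequence that creates the mismatch (illustration only;
alloc_desc_memzone() is an invented placeholder for whichever later setup step
fails):

tx_ring = rte_zmalloc("nbl_tx_ring", sizeof(*tx_ring), 0);
txrx_mgt->tx_rings[queue_idx] = tx_ring;
eth_dev->data->tx_queues[queue_idx] = tx_ring;  /* ethdev slot now non-NULL */
if (alloc_desc_memzone(tx_ring) < 0) {          /* a later step fails */
    rte_free(tx_ring);
    txrx_mgt->tx_rings[queue_idx] = NULL;       /* resource slot cleared ... */
    return -ENOMEM;                             /* ... but ethdev slot is stale */
}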

Verdict: Confirmed Bug in Patch 1/4

The removal of the NULL checks causes a NULL pointer dereference in both
nbl_res_txrx_release_tx_ring() and nbl_res_txrx_release_rx_ring().

Required Fix

static void nbl_res_txrx_release_tx_ring(void *priv, u16 queue_idx)
{
    struct nbl_resource_mgt *res_mgt = (struct nbl_resource_mgt *)priv;
    struct nbl_txrx_mgt *txrx_mgt = NBL_RES_MGT_TO_TXRX_MGT(res_mgt);
    struct nbl_res_tx_ring *tx_ring = NBL_RES_MGT_TO_TX_RING(res_mgt, queue_idx);

+   if (!tx_ring)
+       return;
+
    rte_free(tx_ring->tx_entry);
    rte_memzone_free(tx_ring->net_hdr_mz);
    rte_memzone_free(tx_ring->desc_mz);
    rte_free(tx_ring);
    txrx_mgt->tx_rings[queue_idx] = NULL;
}

static void nbl_res_txrx_release_rx_ring(void *priv, u16 queue_idx)
{
    struct nbl_resource_mgt *res_mgt = (struct nbl_resource_mgt *)priv;
    struct nbl_txrx_mgt *txrx_mgt = NBL_RES_MGT_TO_TXRX_MGT(res_mgt);
    struct nbl_res_rx_ring *rx_ring = NBL_RES_MGT_TO_RX_RING(res_mgt, queue_idx);

+   if (!rx_ring)
+       return;
+
    rte_free(rx_ring->rx_entry);
    rte_memzone_free(rx_ring->desc_mz);
    rte_free(rx_ring);
    txrx_mgt->rx_rings[queue_idx] = NULL;
}
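
With the guards restored, release also becomes idempotent: each slot is set to
NULL after freeing, so a second call for the same queue index hits the guard
and returns harmlessly.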

The memzone leak fix itself is correct and needed, but the defensive NULL
checks must be preserved.

Conclusion: put the NULL checks back into both the RX and TX release paths,
then retest and resubmit a new version.


Full chat is here if you are interested: 
https://claude.ai/share/d543bddd-7406-4f00-b5cb-4a92e8718476
