On Sun, 25 Jan 2026 23:58:12 -0800
Dimon Zhao <[email protected]> wrote:
> diff --git a/drivers/net/nbl/nbl_hw/nbl_txrx.c
> b/drivers/net/nbl/nbl_hw/nbl_txrx.c
> index 650790b4fc..563f011cd3 100644
> --- a/drivers/net/nbl/nbl_hw/nbl_txrx.c
> +++ b/drivers/net/nbl/nbl_hw/nbl_txrx.c
> @@ -69,9 +69,10 @@ static void nbl_res_txrx_release_tx_ring(void *priv, u16
> queue_idx)
> struct nbl_resource_mgt *res_mgt = (struct nbl_resource_mgt *)priv;
> struct nbl_txrx_mgt *txrx_mgt = NBL_RES_MGT_TO_TXRX_MGT(res_mgt);
> struct nbl_res_tx_ring *tx_ring = NBL_RES_MGT_TO_TX_RING(res_mgt,
> queue_idx);
> - if (!tx_ring)
> - return;
> +
Automated patch review flagged this removal as a possible problem.
Code Review:

ERROR: NULL pointer dereference introduced

The NULL check for tx_ring was removed, but the code immediately dereferences it:

    static void nbl_res_txrx_release_tx_ring(void *priv, u16 queue_idx)
    {
            struct nbl_resource_mgt *res_mgt = (struct nbl_resource_mgt *)priv;
            struct nbl_txrx_mgt *txrx_mgt = NBL_RES_MGT_TO_TXRX_MGT(res_mgt);
            struct nbl_res_tx_ring *tx_ring = NBL_RES_MGT_TO_TX_RING(res_mgt, queue_idx);

            rte_free(tx_ring->tx_entry); /* <-- crash if tx_ring is NULL */
The old code had:

    if (!tx_ring)
            return;
This guard must be restored, or the callers must guarantee tx_ring is never
NULL.
The same applies to nbl_res_txrx_release_rx_ring(), which has no NULL check
before dereferencing rx_ring->rx_entry and rx_ring->desc_mz.
Further AI-assisted analysis of the call chain shows concrete scenarios where this is a real problem.
Complete Call Chain Analysis for Patch 1/4
Call Flow
DPDK Application
↓
rte_eth_dev_tx_queue_release() or eth_dev_ops->tx_queue_release
↓
nbl_tx_queues_release() [nbl_dev.c:420]
↓
disp_ops->release_tx_ring() → nbl_disp_release_tx_ring() [nbl_dispatch.c:163]
↓
res_ops->release_tx_ring() → nbl_res_txrx_release_tx_ring() [nbl_txrx.c]
Where tx_ring Can Be NULL

The macro definition:

    #define NBL_RES_MGT_TO_TX_RING(res_mgt, index) \
            (NBL_RES_MGT_TO_TXRX_MGT(res_mgt)->tx_rings[(index)])

where tx_rings is an array of pointers:

    struct nbl_res_tx_ring **tx_rings;
Scenario 1: Normal close without all queues set up

In nbl_dev_port_close() → nbl_release_queues() (line 337-349):

    for (i = 0; i < eth_dev->data->nb_tx_queues; i++)
            disp_ops->release_tx_ring(NBL_DEV_MGT_TO_DISP_PRIV(dev_mgt), i);

If the application configured 4 queues but only called tx_queue_setup() on 2 of
them, tx_rings[2] and tx_rings[3] will be NULL.
Scenario 2: DPDK ethdev layer calling release on never-setup queue
DPDK's rte_eth_dev_close() can call tx_queue_release for all configured queues
without validating they were set up.
Scenario 3: Re-setup path mismatch

In the patched nbl_res_txrx_start_tx_ring():

    if (eth_dev->data->tx_queues[param->queue_idx] != NULL) {
            NBL_LOG(WARNING, "re-setup an already allocated tx queue");
            nbl_res_txrx_release_tx_ring(priv, param->queue_idx); /* <-- called */

The check is on eth_dev->data->tx_queues[] (ethdev layer), but tx_rings[]
(resource layer) could differ if a previous setup partially failed.
Verdict: Confirmed Bug in Patch 1/4
The removal of NULL checks causes NULL pointer dereference in both
nbl_res_txrx_release_tx_ring() and nbl_res_txrx_release_rx_ring().
Required Fix

    static void nbl_res_txrx_release_tx_ring(void *priv, u16 queue_idx)
    {
            struct nbl_resource_mgt *res_mgt = (struct nbl_resource_mgt *)priv;
            struct nbl_txrx_mgt *txrx_mgt = NBL_RES_MGT_TO_TXRX_MGT(res_mgt);
            struct nbl_res_tx_ring *tx_ring = NBL_RES_MGT_TO_TX_RING(res_mgt, queue_idx);

    +       if (!tx_ring)
    +               return;
    +
            rte_free(tx_ring->tx_entry);
            rte_memzone_free(tx_ring->net_hdr_mz);
            rte_memzone_free(tx_ring->desc_mz);
            rte_free(tx_ring);
            txrx_mgt->tx_rings[queue_idx] = NULL;
    }

    static void nbl_res_txrx_release_rx_ring(void *priv, u16 queue_idx)
    {
            struct nbl_resource_mgt *res_mgt = (struct nbl_resource_mgt *)priv;
            struct nbl_txrx_mgt *txrx_mgt = NBL_RES_MGT_TO_TXRX_MGT(res_mgt);
            struct nbl_res_rx_ring *rx_ring = NBL_RES_MGT_TO_RX_RING(res_mgt, queue_idx);

    +       if (!rx_ring)
    +               return;
    +
            rte_free(rx_ring->rx_entry);
            rte_memzone_free(rx_ring->desc_mz);
            rte_free(rx_ring);
            txrx_mgt->rx_rings[queue_idx] = NULL;
    }
The memzone leak fix itself is correct and needed, but the defensive NULL
checks must be preserved.

Conclusion: restore the NULL checks in both the tx and rx release paths, then
retest and resubmit a new version.
Full chat is here if you are interested:
https://claude.ai/share/d543bddd-7406-4f00-b5cb-4a92e8718476