On Tue, 16 Jun 2026 at 11:23, Maxime Leroy <[email protected]> wrote:
> On Mon, Jun 15, 2026 at 11:26 AM David Marchand
> <[email protected]> wrote:
> >
> > On Thu, 11 Jun 2026 at 17:51, Maxime Leroy <[email protected]> wrote:
> > >
> > > eth_dev_fp_ops_reset() restores a port's fast-path ops on stop/release
> > > via a compound literal, so every field it omits is zeroed to NULL. It
> > > sets only rx_pkt_burst/tx_pkt_burst (and the rxq/txq data), leaving
> > > rx_queue_count, tx_queue_count, rx/tx_descriptor_status, tx_pkt_prepare
> > > and the recycle callbacks NULL.
> > >
> > > In non-debug builds these ops are reached through an unguarded indirect
> > > call (the NULL check exists only under RTE_ETHDEV_DEBUG_RX/TX). So a
> > > thread calling e.g. rte_eth_rx_queue_count() on a port being stopped
> > > dereferences NULL and crashes, while the same race on rte_eth_rx_burst()
> > > is harmless because the burst ops are reset to dummies. A poll-mode
> > > worker re-checking rx_queue_count before arming the Rx interrupt and
> > > sleeping hits exactly this.
> > >
> > > Reset these ops to the same dummies eth_dev_set_dummy_fops() installs,
> > > so a stopped port behaves like a freshly allocated one: every fast-path
> > > op is a safe no-op, none is NULL.
> > >
> > > Fixes: 066f3d9cc21c ("ethdev: remove callback checks from fast path")
> > > Cc: [email protected]
> > > Signed-off-by: Maxime Leroy <[email protected]>
> > > ---
> > >  lib/ethdev/ethdev_private.c | 7 +++++++
> > >  1 file changed, 7 insertions(+)
> > >
> > > diff --git a/lib/ethdev/ethdev_private.c b/lib/ethdev/ethdev_private.c
> > > index 72a0723846..75ea3eedff 100644
> > > --- a/lib/ethdev/ethdev_private.c
> > > +++ b/lib/ethdev/ethdev_private.c
> > > @@ -263,6 +263,13 @@ eth_dev_fp_ops_reset(struct rte_eth_fp_ops *fpo)
> > >         *fpo = (struct rte_eth_fp_ops) {
> > >                 .rx_pkt_burst = dummy_eth_rx_burst,
> > >                 .tx_pkt_burst = dummy_eth_tx_burst,
> > > +               .tx_pkt_prepare = rte_eth_tx_pkt_prepare_dummy,
> > > +               .rx_queue_count = rte_eth_queue_count_dummy,
> > > +               .tx_queue_count = rte_eth_queue_count_dummy,
> > > +               .rx_descriptor_status = rte_eth_descriptor_status_dummy,
> > > +               .tx_descriptor_status = rte_eth_descriptor_status_dummy,
> > > +               .recycle_tx_mbufs_reuse = 
> > > rte_eth_recycle_tx_mbufs_reuse_dummy,
> > > +               .recycle_rx_descriptors_refill = 
> > > rte_eth_recycle_rx_descriptors_refill_dummy,
> > >                 .rxq = {
> > >                         .data = (void **)&dummy_queues_array[port_id],
> > >                         .clbk = dummy_data,
> >
> > Could we replace eth_dev_set_dummy_fops() with a call to
> > eth_dev_fp_ops_reset() in rte_eth_dev_allocate?
> > I don't like keeping two separate helpers.
>
> Thanks for the review.
>
> Avoiding the duplication is a good idea, but I could not find a clean
> way to do it: eth_dev_set_dummy_fops() and eth_dev_fp_ops_reset()
> cannot be unified without making things worse.
>
> - They write two different structures. eth_dev_set_dummy_fops() sets
> struct rte_eth_dev (the source ops); rte_eth_dev_allocate() fills
> eth_dev->*, and eth_dev_fp_ops_setup() then copies eth_dev->* into the
> per-port struct rte_eth_fp_ops. eth_dev_fp_ops_reset() writes that
> fast-path table entry directly. So calling fp_ops_reset() from
> rte_eth_dev_allocate() would populate the wrong structure.
>
> - The burst dummies are intentionally different. set_dummy_fops() uses
> the silent rte_eth_pkt_burst_dummy (a freshly allocated,
> not-yet-started port is benign), while fp_ops_reset() uses
> dummy_eth_rx_burst/dummy_eth_tx_burst, which log an error and dump the
> stack because hitting the data path on a stopped port is a misuse
> worth flagging.
>
> - fp_ops_reset() also wires fpo->rxq/txq to the
> dummy_queues_array/dummy_data used by the fast-path table, with no
> equivalent on rte_eth_dev.
>
> The only shared part is the non-burst dummy assignments, but factoring
> those across the two different struct types would require a token
> macro, and I don't have a clean solution for it. So for now I have
> kept the two helpers as they are. Suggestions welcome.

Ok, I had missed the separate structs.

Let's keep it simple, and avoid adding macros.
Can you add a comment that those helpers should be kept in sync?


-- 
David Marchand

Reply via email to