From: Simon Horman <[email protected]> Date: Fri, 8 May 2026 15:08:00 +0100
> From: 'Simon Horman' <[email protected]> > > This is an AI-generated review of your patch. The human sending this > email has considered the AI review valid, or at least plausible. > Full review at: https://sashiko.dev > --- > ice: implement Rx queue management ops > > This commit wires the ice driver into the netdev queue management ops > framework, reusing the existing per-queue disable/reconfig/enable paths > originally added for XSk. With it, ice reportedly passes the io_uring > zcrx selftests. > >> diff --git a/drivers/net/ethernet/intel/ice/ice_base.c >> b/drivers/net/ethernet/intel/ice/ice_base.c >> index 1add82d894bb..4e0b8895c303 100644 >> --- a/drivers/net/ethernet/intel/ice/ice_base.c >> +++ b/drivers/net/ethernet/intel/ice/ice_base.c > > [ ... ] > >> @@ -1577,3 +1593,81 @@ int ice_qp_ena(struct ice_vsi *vsi, u16 q_idx) >> >> return fail; >> } >> + >> +/** >> + * ice_qp_ena - Enables a queue pair >> + * @vsi: VSI of interest >> + * @q_idx: ring index in array >> + * >> + * Returns 0 on success, negative on failure. >> + */ >> +int ice_qp_ena(struct ice_vsi *vsi, u16 q_idx) >> +{ >> + return __ice_qp_ena(vsi, q_idx, ice_vsi_cfg_single_rxq(vsi, q_idx)); >> +} >> + >> +static int ice_queue_start(struct net_device *dev, >> + struct netdev_queue_config *qcfg, >> + void *per_queue_mem, int idx) >> +{ >> + const struct ice_netdev_priv *priv = netdev_priv(dev); >> + struct ice_rx_ring *real = priv->vsi->rx_rings[idx]; >> + struct ice_rx_ring *new = per_queue_mem; >> + struct napi_struct *napi; >> + int ret; >> + >> + real->pp = new->pp; >> + real->rx_fqes = new->rx_fqes; >> + real->hdr_fqes = new->hdr_fqes; >> + real->hdr_pp = new->hdr_pp; >> + >> + real->hdr_truesize = new->hdr_truesize; >> + real->truesize = new->truesize; >> + real->rx_hdr_len = new->rx_hdr_len; >> + real->rx_buf_len = new->rx_buf_len; >> + >> + memcpy(&real->xdp_rxq, &new->xdp_rxq, sizeof(new->xdp_rxq)); >> + >> + ret = ice_setup_rx_ctx(real); >> + if (ret) >> + return ret; > > Simon says: I believe that the issue flagged immediately below is > a false positive; the cleanup logic unwinds things correctly. Yup, our Sashiko also had concerns about this piece, but I rechecked and seems like it's a FP. > > However, the two other issues flagged further down do seem to be > of concern. > > ice_queue_start() eagerly copies new->pp, new->rx_fqes, new->hdr_pp, > new->hdr_fqes and the entire new->xdp_rxq into real before ice_setup_rx_ctx() > has succeeded. [...] > >> + >> + napi = &real->q_vector->napi; >> + >> + page_pool_enable_direct_recycling(real->pp, napi); >> + if (real->hdr_pp) >> + page_pool_enable_direct_recycling(real->hdr_pp, napi); >> + >> + ret = ice_alloc_rx_bufs(real, ICE_DESC_UNUSED(real)); >> + >> + return __ice_qp_ena(priv->vsi, idx, ret); > > ice_alloc_rx_bufs() is declared bool and returns !!cleaned_count, so it > yields 0 on full success and 1 on partial/failed allocation: > > bool ice_alloc_rx_bufs(struct ice_rx_ring *rx_ring, unsigned int > cleaned_count) > { > ... > return !!cleaned_count; > } > > That bool is then passed straight into __ice_qp_ena()'s fail parameter, > which only overwrites fail via 'if (!fail) fail = err;', so a value of 1 > propagates unchanged to ice_queue_start()'s return value. > > netdev_rx_queue_reconfig() treats any non-zero return as failure and > propagates it back to userspace (netlink, ethtool, io_uring zcrx install), > where the callers expect a negative errno. > > Before this patch, ice_qp_ena() obtained its fail value from > ice_vsi_cfg_single_rxq(), which returns a signed errno. > > Should this be converted to a proper negative errno before being handed > to __ice_qp_ena(), for example: > > ret = ice_alloc_rx_bufs(real, ICE_DESC_UNUSED(real)) ? -ENOMEM : 0; > >> +} >> + >> +static int ice_queue_stop(struct net_device *dev, void *per_queue_mem, >> + int idx) >> +{ >> + const struct ice_netdev_priv *priv = netdev_priv(dev); >> + struct ice_rx_ring *real = priv->vsi->rx_rings[idx]; >> + int ret; >> + >> + ret = __ice_qp_dis(priv->vsi, idx); >> + if (ret) >> + return ret; >> + >> + page_pool_disable_direct_recycling(real->pp); >> + if (real->hdr_pp) >> + page_pool_disable_direct_recycling(real->hdr_pp); >> + >> + ice_zero_rx_ring(real); >> + memcpy(per_queue_mem, real, sizeof(*real)); >> + >> + return 0; >> +} > > Can this leak inflight page pool buffers? > > ice_zero_rx_ring(real) is called before the memcpy to per_queue_mem, and > ice_zero_rx_ring() resets both indices: > > void ice_zero_rx_ring(struct ice_rx_ring *rx_ring) > { > ... > rx_ring->next_to_clean = 0; > rx_ring->next_to_use = 0; > } > > So per_queue_mem captures a ring where next_to_clean == next_to_use == 0. > > The core then invokes ndo_queue_mem_free(dev, old_mem), and the recycle > loop in ice_queue_mem_free() is guarded by exactly those two indices: > > void ice_queue_mem_free(struct net_device *dev, void *per_queue_mem) > { > ... > for (u32 i = rx_ring->next_to_clean; i != rx_ring->next_to_use; ) { > libeth_rx_recycle_slow(rx_ring->rx_fqes[i].netmem); > > if (rx_ring->hdr_pp) > libeth_rx_recycle_slow(rx_ring->hdr_fqes[i].netmem); > > if (unlikely(++i == rx_ring->count)) > i = 0; > } > ... > ice_rxq_pp_destroy(rx_ring); > } > > With 0 != 0 false on entry, the loop never runs, so any buffers that were > in rx_fqes[old_ntc..old_ntu) (and hdr_fqes[] when header split is on) are > never returned via libeth_rx_recycle_slow(). ice_rxq_pp_destroy() then > kvfree()s the fqes arrays and calls page_pool_destroy() with outstanding > inflight references that are no longer tracked anywhere. > > The new ice_clean_rx_ring() wrapper keeps the original ordering: > > void ice_clean_rx_ring(struct ice_rx_ring *rx_ring) > { > ice_queue_mem_free(rx_ring->netdev, rx_ring); > ice_zero_rx_ring(rx_ring); > } > > Should ice_queue_stop() follow the same ordering and perform the memcpy > before ice_zero_rx_ring(real), so the recycle loop in ice_queue_mem_free() > sees the real ntc/ntu values? > >> + >> +const struct netdev_queue_mgmt_ops ice_queue_mgmt_ops = { >> + .ndo_queue_mem_alloc = ice_queue_mem_alloc, >> + .ndo_queue_mem_free = ice_queue_mem_free, >> + .ndo_queue_mem_size = sizeof(struct ice_rx_ring), >> + .ndo_queue_start = ice_queue_start, >> + .ndo_queue_stop = ice_queue_stop, >> +}; Those two are new to me, I'll double check. Thanks, Olek
