On Mon, May 12, 2025 at 01:54:34PM +0100, Anatoly Burakov wrote:
> The i40e driver has an implementation of vectorized mbuf rearm code that
> is identical to the one in the common code, so just use that.
>
> In addition, the i40e has an implementation of Rx queue rearm for Neon
> instruction set, so create a common header for Neon implementations too,
> and use that in i40e Neon code.
>
> Signed-off-by: Anatoly Burakov <anatoly.bura...@intel.com>
> ---
>
> Notes:
>     v2:
>     - Fix compile issues on Arm64
>
>  drivers/net/intel/common/rx_vec_neon.h        | 131 +++++++++++
>  drivers/net/intel/i40e/i40e_rxtx.h            |   2 +-
>  drivers/net/intel/i40e/i40e_rxtx_common_avx.h | 215 ------------------
>  drivers/net/intel/i40e/i40e_rxtx_vec_avx2.c   |   5 +-
>  drivers/net/intel/i40e/i40e_rxtx_vec_avx512.c |   5 +-
>  drivers/net/intel/i40e/i40e_rxtx_vec_neon.c   |  59 +----
>  drivers/net/intel/i40e/i40e_rxtx_vec_sse.c    |  70 +-----
>  7 files changed, 144 insertions(+), 343 deletions(-)
>  create mode 100644 drivers/net/intel/common/rx_vec_neon.h
>  delete mode 100644 drivers/net/intel/i40e/i40e_rxtx_common_avx.h
>
> diff --git a/drivers/net/intel/common/rx_vec_neon.h b/drivers/net/intel/common/rx_vec_neon.h
> new file mode 100644
> index 0000000000..d79802b1c0
> --- /dev/null
> +++ b/drivers/net/intel/common/rx_vec_neon.h
> @@ -0,0 +1,131 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2024 Intel Corporation
> + */
> +
> +#ifndef _COMMON_INTEL_RX_VEC_NEON_H_
> +#define _COMMON_INTEL_RX_VEC_NEON_H_
> +
> +#include <stdint.h>
> +
> +#include <ethdev_driver.h>
> +#include <rte_io.h>
> +#include <rte_vect.h>
> +
> +#include "rx.h"
> +
> +static inline int
> +_ci_rxq_rearm_get_bufs(struct ci_rx_queue *rxq, const size_t desc_len)
> +{
> +	struct ci_rx_entry *rxp = &rxq->sw_ring[rxq->rxrearm_start];
> +	const uint16_t rearm_thresh = CI_VPMD_RX_REARM_THRESH;
> +	volatile void *rxdp;
> +	int i;
> +
> +	rxdp = RTE_PTR_ADD(rxq->rx_ring, rxq->rxrearm_start * desc_len);
> +
> +	if (rte_mempool_get_bulk(rxq->mp,
> +				 (void **)rxp,
> +				 rearm_thresh) < 0) {
> +		if (rxq->rxrearm_nb + rearm_thresh >= rxq->nb_rx_desc) {
> +			uint64x2_t zero = vdupq_n_u64(0);
> +
> +			for (i = 0; i < CI_VPMD_DESCS_PER_LOOP; i++) {
> +				rxp[i].mbuf = &rxq->fake_mbuf;
> +				const void *ptr = RTE_PTR_ADD(rxdp, i * desc_len);
> +				vst1q_u64(RTE_CAST_PTR(uint64_t *, ptr), zero);

I suspect many of the comments on the previous patch about the SSE code,
e.g. on unnecessary casting, apply to this patch as well.

/Bruce
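
To make the casting remark concrete, here is a minimal portable C sketch of the allocation-failure fallback quoted above. It is not the driver code and not necessarily what the earlier review asked for: the `sketch_*` names, the simplified queue struct, the value 4 for descriptors per loop, and the use of `memset` in place of the Neon store are all assumptions made for illustration. The point is only that if the destination is kept as a writable byte pointer from the start, there is no `const` qualifier to cast away before the store.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for CI_VPMD_DESCS_PER_LOOP; the value is assumed. */
#define SKETCH_DESCS_PER_LOOP 4

/* Hypothetical, simplified stand-ins for the driver structures. */
struct sketch_entry {
	void *mbuf;
};

struct sketch_queue {
	struct sketch_entry *sw_ring; /* software ring of buffer pointers */
	uint8_t *hw_ring;             /* descriptor ring, desc_len bytes per slot */
	uint16_t rearm_start;         /* first slot to rearm */
	char fake_mbuf;               /* placeholder buffer used on failure */
};

/*
 * Fallback for the case where no buffers could be allocated: point the
 * next few software ring entries at the placeholder buffer and zero the
 * first 16 bytes of each descriptor (the width of one 128-bit store).
 * The destination stays a writable byte pointer throughout, so no
 * qualifier has to be cast away. The real ring is volatile; that is
 * dropped here to keep the sketch portable.
 */
static void
sketch_rearm_fallback(struct sketch_queue *q, size_t desc_len)
{
	struct sketch_entry *rxp = &q->sw_ring[q->rearm_start];
	uint8_t *rxdp = q->hw_ring + (size_t)q->rearm_start * desc_len;
	int i;

	assert(desc_len >= 2 * sizeof(uint64_t));

	for (i = 0; i < SKETCH_DESCS_PER_LOOP; i++) {
		rxp[i].mbuf = &q->fake_mbuf;
		memset(rxdp + (size_t)i * desc_len, 0, 2 * sizeof(uint64_t));
	}
}
```

In the Neon version the `memset` would become a `vst1q_u64` on a pointer that is declared as `uint64_t *` where it is computed, which is where the cast could be dropped.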