On Mon, May 12, 2025 at 01:54:34PM +0100, Anatoly Burakov wrote:
> The i40e driver has an implementation of vectorized mbuf rearm code that
> is identical to the one in the common code, so just use that.
> 
> In addition, the i40e has an implementation of Rx queue rearm for Neon
> instruction set, so create a common header for Neon implementations too,
> and use that in i40e Neon code.
> 
> Signed-off-by: Anatoly Burakov <anatoly.bura...@intel.com>
> ---
> 
> Notes:
>     v2:
>     - Fix compile issues on Arm64
> 
>  drivers/net/intel/common/rx_vec_neon.h        | 131 +++++++++++
>  drivers/net/intel/i40e/i40e_rxtx.h            |   2 +-
>  drivers/net/intel/i40e/i40e_rxtx_common_avx.h | 215 ------------------
>  drivers/net/intel/i40e/i40e_rxtx_vec_avx2.c   |   5 +-
>  drivers/net/intel/i40e/i40e_rxtx_vec_avx512.c |   5 +-
>  drivers/net/intel/i40e/i40e_rxtx_vec_neon.c   |  59 +----
>  drivers/net/intel/i40e/i40e_rxtx_vec_sse.c    |  70 +-----
>  7 files changed, 144 insertions(+), 343 deletions(-)
>  create mode 100644 drivers/net/intel/common/rx_vec_neon.h
>  delete mode 100644 drivers/net/intel/i40e/i40e_rxtx_common_avx.h
> 
> diff --git a/drivers/net/intel/common/rx_vec_neon.h b/drivers/net/intel/common/rx_vec_neon.h
> new file mode 100644
> index 0000000000..d79802b1c0
> --- /dev/null
> +++ b/drivers/net/intel/common/rx_vec_neon.h
> @@ -0,0 +1,131 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2024 Intel Corporation
> + */
> +
> +#ifndef _COMMON_INTEL_RX_VEC_NEON_H_
> +#define _COMMON_INTEL_RX_VEC_NEON_H_
> +
> +#include <stdint.h>
> +
> +#include <ethdev_driver.h>
> +#include <rte_io.h>
> +#include <rte_vect.h>
> +
> +#include "rx.h"
> +
> +static inline int
> +_ci_rxq_rearm_get_bufs(struct ci_rx_queue *rxq, const size_t desc_len)
> +{
> +     struct ci_rx_entry *rxp = &rxq->sw_ring[rxq->rxrearm_start];
> +     const uint16_t rearm_thresh = CI_VPMD_RX_REARM_THRESH;
> +     volatile void *rxdp;
> +     int i;
> +
> +     rxdp = RTE_PTR_ADD(rxq->rx_ring, rxq->rxrearm_start * desc_len);
> +
> +     if (rte_mempool_get_bulk(rxq->mp,
> +                              (void **)rxp,
> +                              rearm_thresh) < 0) {
> +             if (rxq->rxrearm_nb + rearm_thresh >= rxq->nb_rx_desc) {
> +                     uint64x2_t zero = vdupq_n_u64(0);
> +
> +                     for (i = 0; i < CI_VPMD_DESCS_PER_LOOP; i++) {
> +                             rxp[i].mbuf = &rxq->fake_mbuf;
> +                             const void *ptr = RTE_PTR_ADD(rxdp, i * desc_len);
> +                             vst1q_u64(RTE_CAST_PTR(uint64_t *, ptr), zero);

I suspect many of the comments on the SSE code in the previous patch, e.g.
about unnecessary casting, are relevant to this patch as well.
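
For illustration, in the fallback path quoted above `ptr` is declared
`const void *` and then cast back to a writable `uint64_t *` for the store.
A minimal sketch of a cast-free shape follows, in portable C rather than
Neon. The types `fake_mbuf` and `sw_entry`, the function `rearm_fallback`,
and the value of `DESCS_PER_LOOP` are hypothetical stand-ins, not the real
driver definitions, and the sketch ignores the `volatile` qualifier on the
real ring pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the driver types; not the real DPDK structs. */
struct fake_mbuf { int unused; };
struct sw_entry { struct fake_mbuf *mbuf; };

/* Assumed value, standing in for CI_VPMD_DESCS_PER_LOOP. */
#define DESCS_PER_LOOP 4

/*
 * Fallback when buffer allocation fails: point the first DESCS_PER_LOOP
 * software-ring entries at the fake mbuf and zero the first 16 bytes of
 * each descriptor (the same width as one 128-bit vector store).
 */
static void
rearm_fallback(struct sw_entry *rxp, void *ring, size_t desc_len,
	       struct fake_mbuf *fake)
{
	/* One implicit conversion to a byte pointer; no casts at the store. */
	uint8_t *desc = ring;
	const uint64_t zero[2] = { 0, 0 };
	int i;

	assert(desc_len >= sizeof(zero));

	for (i = 0; i < DESCS_PER_LOOP; i++) {
		rxp[i].mbuf = fake;
		memcpy(desc + i * desc_len, zero, sizeof(zero));
	}
}
```

Keeping the descriptor pointer writable and byte-typed from the start means
the offset arithmetic and the store need no const or type casts.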

/Bruce
