Thanks, they have been updated in v5.

> -----Original Message-----
> From: "Medvedkin, Vladimir" <vladimir.medved...@intel.com>
> Sent: 2025-06-11 19:52:57 (Wednesday)
> To: u...@foxmail.com, dev@dpdk.org
> Cc: "Sun Yuechi" <sunyue...@iscas.ac.cn>, "Thomas Monjalon" <tho...@monjalon.net>, "Bruce Richardson" <bruce.richard...@intel.com>, "Stanislaw Kardach" <stanislaw.kard...@gmail.com>
> Subject: Re: [PATCH v4 2/3] lib/lpm: R-V V rte_lpm_lookupx4
>
> Hi Sun,
>
> You did not address my previous comments regarding the commit message. You
> can put everything you've written in this commit as a note and add a
> meaningful description of what the commit generally does, like (please
> correct if needed):
>
> "Implement LPM lookupx4 routine for RISC-V architecture using RISC-V
> Vector Extension instruction set"
>
> Everything else (performance tests, implementation thoughts and
> considerations, etc.) should be in the patch notes. For more information
> on what "patch notes" are, you may want to refer to the Git documentation [1].
>
> [1] https://git-scm.com/docs/git-notes
>
> On 05/06/2025 11:58, u...@foxmail.com wrote:
>
> > From: Sun Yuechi <sunyue...@iscas.ac.cn>
> >
> > The initialization of vtbl_entry is not fully vectorized here because
> > doing so would require __riscv_vluxei32_v_u32m1, which is slower
> > than the scalar approach in this small-scale scenario.
> >
> > - Test: app/test/lpm_perf_autotest
> > - Platform: Banana Pi(BPI-F3)
> > - SoC: Spacemit X60 (8 cores with Vector extension)
> > - CPU Frequency: up to 1.6 GHz
> > - Cache: 256 KiB L1d ×8, 256 KiB L1i ×8, 1 MiB L2 ×2
> > - Memory: 16 GiB
> > - Kernel: Linux 6.6.36
> > - Compiler: GCC 14.2.0 (with RVV intrinsic support)
> >
> > Test results(LPM LookupX4):
> >      scalar: 5.7 cycles
> >      rvv:    4.6 cycles
> >
> > Signed-off-by: Sun Yuechi <sunyue...@iscas.ac.cn>
> > ---
> >   MAINTAINERS           |  2 ++
> >   lib/lpm/meson.build   |  1 +
> >   lib/lpm/rte_lpm.h     |  2 ++
> >   lib/lpm/rte_lpm_rvv.h | 62 +++++++++++++++++++++++++++++++++++++++++++
> >   4 files changed, 67 insertions(+)
> >   create mode 100644 lib/lpm/rte_lpm_rvv.h
> >
> > diff --git a/MAINTAINERS b/MAINTAINERS
> > index 3e16789250..0f207ac129 100644
> > --- a/MAINTAINERS
> > +++ b/MAINTAINERS
> > @@ -340,6 +340,8 @@ M: Stanislaw Kardach <stanislaw.kard...@gmail.com>
> >   F: config/riscv/
> >   F: doc/guides/linux_gsg/cross_build_dpdk_for_riscv.rst
> >   F: lib/eal/riscv/
> > +M: sunyuechi <sunyue...@iscas.ac.cn>
> > +F: lib/**/*rvv*
> >
> >   Intel x86
> >   M: Bruce Richardson <bruce.richard...@intel.com>
> > diff --git a/lib/lpm/meson.build b/lib/lpm/meson.build
> > index fae4f79fb9..09133061e5 100644
> > --- a/lib/lpm/meson.build
> > +++ b/lib/lpm/meson.build
> > @@ -17,6 +17,7 @@ indirect_headers += files(
> >           'rte_lpm_scalar.h',
> >           'rte_lpm_sse.h',
> >           'rte_lpm_sve.h',
> > +        'rte_lpm_rvv.h',
> >   )
> >   deps += ['hash']
> >   deps += ['rcu']
> > diff --git a/lib/lpm/rte_lpm.h b/lib/lpm/rte_lpm.h
> > index 7df64f06b1..b06517206f 100644
> > --- a/lib/lpm/rte_lpm.h
> > +++ b/lib/lpm/rte_lpm.h
> > @@ -408,6 +408,8 @@ rte_lpm_lookupx4(const struct rte_lpm *lpm, xmm_t ip, uint32_t hop[4],
> >   #include "rte_lpm_altivec.h"
> >   #elif defined(RTE_ARCH_X86)
> >   #include "rte_lpm_sse.h"
> > +#elif defined(RTE_ARCH_RISCV) && defined(RTE_RISCV_FEATURE_V)
> > +#include "rte_lpm_rvv.h"
> >   #else
> >   #include "rte_lpm_scalar.h"
> >   #endif
> > diff --git a/lib/lpm/rte_lpm_rvv.h b/lib/lpm/rte_lpm_rvv.h
> > new file mode 100644
> > index 0000000000..5f48fb2b32
> > --- /dev/null
> > +++ b/lib/lpm/rte_lpm_rvv.h
> > @@ -0,0 +1,62 @@
> > +/* SPDX-License-Identifier: BSD-3-Clause
> > + * Copyright (c) 2025 Institute of Software Chinese Academy of Sciences (ISCAS).
> > + */
> > +
> > +#ifndef _RTE_LPM_RVV_H_
> > +#define _RTE_LPM_RVV_H_
> > +
> > +#include <rte_vect.h>
> > +
> > +#include <rte_cpuflags.h>
> > +#include <riscv_vector.h>
> > +
> > +#ifdef __cplusplus
> > +extern "C" {
> > +#endif
> > +
> > +#define RTE_LPM_LOOKUP_SUCCESS 0x01000000
> > +#define RTE_LPM_VALID_EXT_ENTRY_BITMASK 0x03000000
> > +
> > +static inline void rte_lpm_lookupx4(
> > +     const struct rte_lpm *lpm, xmm_t ip, uint32_t hop[4], uint32_t defv)
> > +{
> > +     size_t vl = 4;
> > +
> > +     const uint32_t *tbl24_p = (const uint32_t *)lpm->tbl24;
> > +     uint32_t tbl_entries[4] = {
> > +             tbl24_p[((uint32_t)ip[0]) >> 8],
> > +             tbl24_p[((uint32_t)ip[1]) >> 8],
> > +             tbl24_p[((uint32_t)ip[2]) >> 8],
> > +             tbl24_p[((uint32_t)ip[3]) >> 8],
> > +     };
> > +     vuint32m1_t vtbl_entry = __riscv_vle32_v_u32m1(tbl_entries, vl);
> > +
> > +     vbool32_t mask = __riscv_vmseq_vx_u32m1_b32(
> > +         __riscv_vand_vx_u32m1(vtbl_entry, RTE_LPM_VALID_EXT_ENTRY_BITMASK, vl),
> > +         RTE_LPM_VALID_EXT_ENTRY_BITMASK, vl);
> > +
> > +     vuint32m1_t vtbl8_index = __riscv_vsll_vx_u32m1(
> > +         __riscv_vadd_vv_u32m1(
> > +             __riscv_vsll_vx_u32m1(__riscv_vand_vx_u32m1(vtbl_entry, 0x00FFFFFF, vl), 8, vl),
> > +             __riscv_vand_vx_u32m1(
> > +                 __riscv_vle32_v_u32m1((const uint32_t *)&ip, vl), 0x000000FF, vl),
> > +             vl),
> > +         2, vl);
> > +
> > +     vtbl_entry = __riscv_vluxei32_v_u32m1_mu(
> > +         mask, vtbl_entry, (const uint32_t *)(lpm->tbl8), vtbl8_index, vl);
> > +
> > +     vuint32m1_t vnext_hop = __riscv_vand_vx_u32m1(vtbl_entry, 0x00FFFFFF, vl);
> > +     mask = __riscv_vmseq_vx_u32m1_b32(
> > +         __riscv_vand_vx_u32m1(vtbl_entry, RTE_LPM_LOOKUP_SUCCESS, vl), 0, vl);
> > +
> > +     vnext_hop = __riscv_vmerge_vxm_u32m1(vnext_hop, defv, mask, vl);
> > +
> > +     __riscv_vse32_v_u32m1(hop, vnext_hop, vl);
> > +}
> > +
> > +#ifdef __cplusplus
> > +}
> > +#endif
> > +
> > +#endif /* _RTE_LPM_RVV_H_ */
>
> --
> Regards,
> Vladimir
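
For anyone reading the intrinsics in the quoted patch, a scalar model of the same lookup may make the data flow easier to follow. This is only a sketch under stated assumptions: `struct toy_lpm`, `toy_lookupx4` and the shortened macro names are hypothetical stand-ins and not part of DPDK. The bit masks are the ones that appear in the quoted patch, and the tables are treated as flat `uint32_t` arrays.

```c
#include <stddef.h>
#include <stdint.h>

/* Bit layout taken from the constants used in the quoted patch. */
#define LOOKUP_SUCCESS       0x01000000u /* entry is valid */
#define VALID_EXT_ENTRY_MASK 0x03000000u /* valid and pointing into tbl8 */
#define NEXT_HOP_MASK        0x00FFFFFFu /* low 24 bits: next hop or tbl8 group */

/* Hypothetical flat tables standing in for lpm->tbl24 and lpm->tbl8. */
struct toy_lpm {
	const uint32_t *tbl24; /* indexed by the top 24 bits of the address */
	const uint32_t *tbl8;  /* groups of 256 entries, indexed by the low 8 bits */
};

static inline void
toy_lookupx4(const struct toy_lpm *lpm, const uint32_t ip[4],
	     uint32_t hop[4], uint32_t defv)
{
	for (size_t i = 0; i < 4; i++) {
		/* First level: the scalar tbl24 loads in the patch. */
		uint32_t entry = lpm->tbl24[ip[i] >> 8];

		/* Extended entry: follow it into tbl8 (the masked gather). */
		if ((entry & VALID_EXT_ENTRY_MASK) == VALID_EXT_ENTRY_MASK) {
			uint32_t idx = ((entry & NEXT_HOP_MASK) << 8) + (ip[i] & 0xFFu);
			entry = lpm->tbl8[idx];
		}

		/* Invalid entry: fall back to the default (the vmerge step). */
		hop[i] = (entry & LOOKUP_SUCCESS) ? (entry & NEXT_HOP_MASK) : defv;
	}
}
```

The vector version shifts the tbl8 index left by a further 2 bits because the indexed load takes byte offsets, whereas the array indexing here scales by the element size implicitly.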