RE: [PATCH] aarch64: Improve svdupq_lane expension for big-endian [PR121293]

Tamar Christina Tue, 29 Jul 2025 23:15:52 -0700

> -----Original Message-----
> From: Richard Sandiford <richard.sandif...@arm.com>
> Sent: Tuesday, July 29, 2025 1:43 PM
> To: gcc-patches@gcc.gnu.org
> Cc: Alex Coplan <alex.cop...@arm.com>; Alice Carlotti 
> <alice.carlo...@arm.com>;
> pins...@gmail.com; ktkac...@nvidia.com; Richard Earnshaw
> <richard.earns...@arm.com>; Tamar Christina <tamar.christ...@arm.com>;
> Wilco Dijkstra <wilco.dijks...@arm.com>
> Subject: [PATCH] aarch64: Improve svdupq_lane expension for big-endian
> [PR121293]
> 
> If the index to svdupq_lane is variable, or is outside the range of DUP,


Minor nit, perhaps this should be explicitly DUPQ explicitly? DUP had me 
confused
for a bit.

> the fallback expansion is to convert to VNx2DI and use TBL.  The problem
> in this PR was that the conversion used subregs, and on big-endian
> targets, a bitcast from VNx2DI to another element size requires a
> REV[BHW] in the best case or a spill and reload in the worst case.
> (See the comment at the head of aarch64-sve.md for details.)
> 
> Here we want the conversion to act like svreinterpret, so it should
> use aarch64_sve_reinterpret instead of subregs.
> 
> Tested on aarch64-linux-gnu.  OK to install?
> 

OK thanks,
Tamar

> Richard
> 
> 
> gcc/
>       PR target/121293
>       * config/aarch64/aarch64-sve-builtins-base.cc (svdupq_lane::expand):
>       Use aarch64_sve_reinterpret instead of subregs.  Explicitly
>       reinterpret the result back to the required mode, rather than
>       leaving the caller to take a subreg.
> 
> gcc/testsuite/
>       PR target/121293
>       * gcc.target/aarch64/sve/acle/general/dupq_lane_9.c: New test.
> ---
>  gcc/config/aarch64/aarch64-sve-builtins-base.cc           | 5 +++--
>  .../gcc.target/aarch64/sve/acle/general/dupq_lane_9.c     | 8 ++++++++
>  2 files changed, 11 insertions(+), 2 deletions(-)
>  create mode 100644
> gcc/testsuite/gcc.target/aarch64/sve/acle/general/dupq_lane_9.c
> 
> diff --git a/gcc/config/aarch64/aarch64-sve-builtins-base.cc
> b/gcc/config/aarch64/aarch64-sve-builtins-base.cc
> index b4396837c24..32cce97b220 100644
> --- a/gcc/config/aarch64/aarch64-sve-builtins-base.cc
> +++ b/gcc/config/aarch64/aarch64-sve-builtins-base.cc
> @@ -1259,9 +1259,10 @@ public:
>       index = target;
>        }
> 
> -    e.args[0] = gen_lowpart (VNx2DImode, e.args[0]);
> +    e.args[0] = aarch64_sve_reinterpret (VNx2DImode, e.args[0]);
>      e.args[1] = index;
> -    return e.use_exact_insn (CODE_FOR_aarch64_sve_tblvnx2di);
> +    rtx res = e.use_exact_insn (CODE_FOR_aarch64_sve_tblvnx2di);
> +    return aarch64_sve_reinterpret (mode, res);
>    }
>  };
> 
> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general/dupq_lane_9.c
> b/gcc/testsuite/gcc.target/aarch64/sve/acle/general/dupq_lane_9.c
> new file mode 100644
> index 00000000000..e3f352bb8c2
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general/dupq_lane_9.c
> @@ -0,0 +1,8 @@
> +/* { dg-options "-O2 -mbig-endian" } */
> +
> +#pragma GCC aarch64 "arm_sve.h"
> +
> +svint32_t f(svint32_t x) { return svdupq_lane (x, 17); }
> +void g(svint32_t *a, svint32_t *b) { *a = svdupq_lane (*b, 17); }
> +
> +/* { dg-final { scan-assembler-not {\trevw\t} } } */
> --
> 2.43.0

RE: [PATCH] aarch64: Improve svdupq_lane expension for big-endian [PR121293]

Reply via email to