> On 29 Apr 2025, at 18:21, Richard Sandiford <richard.sandif...@arm.com> wrote:
>
> Jennifer Schmitz <jschm...@nvidia.com> writes:
>> If -msve-vector-bits=128, SVE loads and stores (LD1 and ST1) with a
>> ptrue predicate can be replaced by neon instructions (LDR and STR),
>> thus avoiding the predicate altogether. This also enables formation of
>> LDP/STP pairs.
>>
>> For example, the test cases
>>
>> svfloat64_t
>> ptrue_load (float64_t *x)
>> {
>>   svbool_t pg = svptrue_b64 ();
>>   return svld1_f64 (pg, x);
>> }
>> void
>> ptrue_store (float64_t *x, svfloat64_t data)
>> {
>>   svbool_t pg = svptrue_b64 ();
>>   svst1_f64 (pg, x, data);
>> }
>>
>> were previously compiled to
>> (with -O2 -march=armv8.2-a+sve -msve-vector-bits=128):
>>
>> ptrue_load:
>>         ptrue   p3.b, vl16
>>         ld1d    z0.d, p3/z, [x0]
>>         ret
>> ptrue_store:
>>         ptrue   p3.b, vl16
>>         st1d    z0.d, p3, [x0]
>>         ret
>>
>> Now they are compiled to:
>>
>> ptrue_load:
>>         ldr     q0, [x0]
>>         ret
>> ptrue_store:
>>         str     q0, [x0]
>>         ret
>>
>> The implementation includes the if-statement
>> if (known_eq (GET_MODE_SIZE (mode), 16)
>>     && aarch64_classify_vector_mode (mode) == VEC_SVE_DATA)
>> which checks for 128-bit VLS and excludes partial modes with a
>> mode size < 128 bits (e.g. VNx2QI).
>>
>> The patch was bootstrapped and tested on aarch64-linux-gnu, no regression.
>> OK for mainline?
>>
>> Signed-off-by: Jennifer Schmitz <jschm...@nvidia.com>
>>
>> gcc/
>>      * config/aarch64/aarch64.cc (aarch64_emit_sve_pred_move):
>>      Fold LD1/ST1 with ptrue to LDR/STR for 128-bit VLS.
>>
>> gcc/testsuite/
>>      * gcc.target/aarch64/sve/ldst_ptrue_128_to_neon.c: New test.
>>      * gcc.target/aarch64/sve/cond_arith_6.c: Adjust expected outcome.
>>      * gcc.target/aarch64/sve/pcs/return_4_128.c: Likewise.
>>      * gcc.target/aarch64/sve/pcs/return_5_128.c: Likewise.
>>      * gcc.target/aarch64/sve/pcs/struct_3_128.c: Likewise.
>> ---
>>  gcc/config/aarch64/aarch64.cc                 | 29 ++++++++--
>>  .../gcc.target/aarch64/sve/cond_arith_6.c     |  3 +-
>>  .../aarch64/sve/ldst_ptrue_128_to_neon.c      | 48 ++++++++++++++++
>>  .../gcc.target/aarch64/sve/pcs/return_4_128.c | 39 +++++--------
>>  .../gcc.target/aarch64/sve/pcs/return_5_128.c | 39 +++++--------
>>  .../gcc.target/aarch64/sve/pcs/struct_3_128.c | 56 +++++++------------
>>  6 files changed, 118 insertions(+), 96 deletions(-)
>>  create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/ldst_ptrue_128_to_neon.c
>
> OK, thanks.

Thanks, pushed to trunk: 83bb288faa39a0bf5ce2d62e21a090a130d8dda4
Jennifer

>
> Richard
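
For readers who want to see how the two-part condition quoted in the patch description behaves, here is a minimal standalone sketch in plain C. It is a toy model, not GCC code: the `toy_` names, the flag values, and the struct layout are all hypothetical stand-ins, and the sizes assume -msve-vector-bits=128. It only mirrors the shape of the check (mode size equal to 16 bytes, and classification exactly equal to "SVE data").

```c
#include <stdbool.h>

/* Hypothetical classification flags for this sketch only; they are
   not taken from the GCC sources.  */
enum toy_vec_class
{
  TOY_ADVSIMD = 1,   /* fixed-length Neon mode */
  TOY_SVE_DATA = 2,  /* SVE data mode */
  TOY_PARTIAL = 16   /* extra flag for partial SVE modes in this model */
};

/* A toy description of a vector mode, with its size in bytes under the
   assumption of a 128-bit SVE vector length.  */
struct toy_mode
{
  const char *name;
  unsigned size_bytes;
  unsigned vec_class;
};

/* Model of the quoted condition: fold only when the mode is exactly
   16 bytes and is classified as plain SVE data (no extra flags).  */
static bool
toy_can_fold_to_ldr_str (const struct toy_mode *mode)
{
  return mode->size_bytes == 16 && mode->vec_class == TOY_SVE_DATA;
}
```

In this model a full mode such as VNx2DF (16 bytes, SVE data) passes the check, while a partial mode such as VNx2QI (2 bytes at a 128-bit vector length) and a Neon mode such as V16QI (16 bytes, but not SVE data) are both rejected.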