Claudio Bantaloukas (rdfm) <[email protected]> requested changes to the code: This is a great start and I've very excited to see the patch series land. I have some comments and hopefully more people will chime in.
> +++ gcc/testsuite/gcc.target/aarch64/lane-bound-3.c > @@ -15,3 +15,1 @@ > - /* Use vgetq_lane_u64 to get a > - __builtin_aarch64_im_lane_boundsi */ > - vgetq_lane_u64(c, __b); > + __builtin_aarch64_im_lane_boundsi (sizeof (c), sizeof (c[0]), __b); @pinskia will this still trigger the case you fixed for https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117665 > +++ gcc/testsuite/gcc.target/aarch64/sha3_1.c > @@ -1,5 +1,5 @@ > /* { dg-do compile } */ > -/* { dg-options "-march=armv8.2-a+sha3" } */ > +/* { dg-options "-O1 -march=armv8.2-a+sha3" } */ What kind of issues does the absence of -O1 cause? > +++ gcc/testsuite/gcc.target/aarch64/sme/inlining_10.c > @@ -51,7 +45,6 @@ void > sc_caller () [[arm::inout("za"), arm::streaming_compatible]] > { > call_vadd (); > - call_vbsl (); bsl and all the other AdvSIMD functions are not compatible with straming mode. See CheckFPAdvSIMDEnabled in https://developer.arm.com/documentation/ddi0602/2026-03/Shared-Pseudocode/aarch64-exceptions-traps?lang=en#func_AArch64_CheckFPAdvSIMDEnabled_0 I think the patch should be amended to maintain the existing behaviour. > +++ gcc/config.gcc > @@ -365,0 +365,4 @@ > + 'arm_fp16.h' > + 'arm_neon.h' > + 'arm_bf16.h' > + 'arm_acle.h' This is not portable to /bin/sh (sorry!) you could do something like `extra_objs="${extra_objs} aarch64-sve-builtins.o"` instead > +++ gcc/config.gcc > @@ -362,3 +362,3 @@ > aarch64*-*-*) > cpu_type=aarch64 > - extra_headers="arm_fp16.h arm_neon.h arm_bf16.h arm_acle.h arm_sve.h > arm_sme.h arm_neon_sve_bridge.h arm_private_fp8.h arm_private_neon_types.h" > + extra_headers=( See comment on previous patch. These should be /bin/sh compliant. > +++ .editorconfig Seems unrelated :) > +++ gcc/config/aarch64/aarch64-sve-builtins.cc > @@ -3333,3 +2529,1 @@ > - /* The type and range are unsigned, so read the argument as an > - unsigned rather than signed HWI. */ > - if (!tree_fits_uhwi_p (arg)) > + if (tree_fits_shwi_p (arg)) should this function be moved in a more generic file than sve? > +++ gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/bf16_dup.c > @@ -42,3 +42,3 @@ > return vdupq_lane_bf16 (a, 1); > } > -/* { dg-final { scan-assembler-times "dup\\tv\[0-9\]+\.8h, > v\[0-9\]+.h\\\[0\\\]" 2 } } */ > +/* { dg-final { scan-assembler-times "dup\\tv\[0-9\]+\.8h, > v\[0-9\]+.h\\\[0\\\]" 1 } } */ This is surprising, if this was a bug in the previous implementation or a wanted change, please document it in the patch description. -- https://forge.sourceware.org/gcc/gcc-TEST/pulls/158#issuecomment-5903
