On Fri, 21 Mar 2025 at 11:18, Richard Earnshaw (lists) <richard.earns...@arm.com> wrote: > > On 20/03/2025 16:15, Christophe Lyon wrote: > > Depending on if/how the testing flags are overridden, the first value > > we try("") might not do what we want. > > > > For instance, if the whole testsuite is executed with > > (A) -mthumb -march=armv7-m -mtune=cortex-m3 -mfloat-abi=softfp > > > > bf16_neon_ok is first compiled with > > (A) (B) > > where B = -mcpu=unset -march=armv8.2-a+bf16 > > > > which is accepted, so a testcase like vld2q_lane_bf16_indices_1.c > > is compiled with: > > (A) (C) (B) > > where C = -mfpu=neon -mfloat-abi=softfp -mcpu=unset -march=armv7-a > > -mfpu=neon-fp16 -mfp16-format=ieee > > > > because advsimd-intrinsics.exp has set additional_flags to (C) > > via arm_neon_fp16_ok > > > > So the testcase is compiled with > > [...] -mfpu=neon-fp16 -mcpu=unset -march=armv8.2-a+bf16 > > (thus -mfpu=neon-fp16) and bf16 support is disabled. > > > > The patch replaces "" with -mfpu=auto which matches the intended > > effect of -march=armv8.2-a+bf16 as added by bf16_neon_ok, and the > > testcase is now compiled with > > (A) (C) -mfpu=auto (B) > > > > However, since this effective-target is also used on aarch64 (which > > does not support -mfpu=auto), we do this only on arm. > > > > This patch improves coverage, and makes > > v{ld,st}[234]q_lane_bf16_indices_1.c pass when testsuite flags are > > overridden as described above (e.g. for M-profile). > > > > gcc/testsuite/ > > * lib/target-supports.exp > > (check_effective_target_arm_v8_2a_bf16_neon_ok_nocache): > > Conditionally use -mfpu=auto. > > --- > > gcc/testsuite/lib/target-supports.exp | 9 ++++++++- > > 1 file changed, 8 insertions(+), 1 deletion(-) > > > > diff --git a/gcc/testsuite/lib/target-supports.exp > > b/gcc/testsuite/lib/target-supports.exp > > index e2622a445c5..09b16a14024 100644 > > --- a/gcc/testsuite/lib/target-supports.exp > > +++ b/gcc/testsuite/lib/target-supports.exp > > @@ -6871,12 +6871,19 @@ proc add_options_for_arm_fp16fml_neon { flags } { > > proc check_effective_target_arm_v8_2a_bf16_neon_ok_nocache { } { > > global et_arm_v8_2a_bf16_neon_flags > > set et_arm_v8_2a_bf16_neon_flags "" > > + set fpu_auto "" > > > > if { ![istarget arm*-*-*] && ![istarget aarch64*-*-*] } { > > return 0; > > } > > > > - foreach flags {"" "-mfloat-abi=softfp -mfpu=neon-fp-armv8" > > "-mfloat-abi=hard -mfpu=neon-fp-armv8" } { > > + if { [istarget arm*-*-*] } { > > + set fpu_auto "-mfpu=auto" > > + } > > + > > + foreach flags [list "$fpu_auto" \ > > Shouldn't we try first with "", even on Arm? Thus > foreach flags [list "" "$fpu_auto" \ > ... > I don't think so, that's why I tried to explain above. "" is acceptable / accepted in arm_v8_2a_bf16_neon_ok (this is (A) (B) above, where the important parts are: -march=armv7-m -mcpu=unset -march=armv8.2-a+bf16 (so -mfpu is set to the toolchain's default)
but then the actual testcase is compiled with additional flags (C) defined by the test driver using arm_neon_fp16_ok C = -mfpu=neon -mfloat-abi=softfp -mcpu=unset -march=armv7-a -mfpu=neon-fp16 -mfp16-format=ieee so the relevant parts of (A) (C) (B) are: -march=armv7-m -mfpu=neon -mcpu=unset -march=armv7-a -mfpu=neon-fp16 -mcpu=unset -march=armv8.2-a+bf16 which can be simplified into -mfpu=neon-fp16 -march=armv8.2-a+bf16 I think we need to start with -mfpu=auto instead of "", so that when -march=armv8.2-a+bf16 takes effect, we have cancelled any other -mfpu. > > > + "-mfloat-abi=softfp -mfpu=neon-fp-armv8" \ > > + "-mfloat-abi=hard -mfpu=neon-fp-armv8" ] { > > if { [check_no_compiler_messages_nocache arm_v8_2a_bf16_neon_ok > > object { > > #include <arm_neon.h> > > #if !defined (__ARM_FEATURE_BF16_VECTOR_ARITHMETIC) > > In fact, since aarch64 doesn't support -mfpu at all, only "" is ever going to > work here, so perhaps we can recast all this code, perhaps along the lines of > that in check_effective_target_arm_neon_h_ok_nocache so that we don't need to > actually try to include arm_neon.h at all while figuring out the options. > Maybe. OTOH, actually including arm_neon.h is guaranteed to give the right answer to the question "which flags do we need to be able to include arm_neon.h?", rather than using several conditions that are supposed to match what arm_neon.h does? Thanks, Christophe > R.