Re: [PATCH 06/10] testsuite: aarch64: arm: Add -mfpu=auto to arm_v8_2a_bf16_neon_ok

Christophe Lyon Fri, 04 Apr 2025 11:23:42 -0700

On Fri, 21 Mar 2025 at 16:15, Christophe Lyon
<christophe.l...@linaro.org> wrote:
>
> On Fri, 21 Mar 2025 at 15:25, Richard Earnshaw (lists)
> <richard.earns...@arm.com> wrote:
> >
> > On 21/03/2025 14:05, Christophe Lyon wrote:
> > > On Fri, 21 Mar 2025 at 11:18, Richard Earnshaw (lists)
> > > <richard.earns...@arm.com> wrote:
> > >>
> > >> On 20/03/2025 16:15, Christophe Lyon wrote:
> > >>> Depending on if/how the testing flags are overridden, the first value
> > >>> we try("") might not do what we want.
> > >>>
> > >>> For instance, if the whole testsuite is executed with
> > >>> (A) -mthumb -march=armv7-m -mtune=cortex-m3 -mfloat-abi=softfp
> > >>>
> > >>> bf16_neon_ok is first compiled with
> > >>> (A) (B)
> > >>> where B = -mcpu=unset -march=armv8.2-a+bf16
> > >>>
> > >>> which is accepted, so a testcase like vld2q_lane_bf16_indices_1.c
> > >>> is compiled with:
> > >>> (A) (C) (B)
> > >>> where C = -mfpu=neon -mfloat-abi=softfp -mcpu=unset -march=armv7-a 
> > >>> -mfpu=neon-fp16 -mfp16-format=ieee
> > >>>
> > >>> because advsimd-intrinsics.exp has set additional_flags to (C)
> > >>> via arm_neon_fp16_ok
> > >>>
> > >>> So the testcase is compiled with
> > >>> [...] -mfpu=neon-fp16 -mcpu=unset -march=armv8.2-a+bf16
> > >>> (thus -mfpu=neon-fp16) and bf16 support is disabled.
> > >>>
> > >>> The patch replaces "" with -mfpu=auto which matches the intended
> > >>> effect of -march=armv8.2-a+bf16 as added by bf16_neon_ok, and the
> > >>> testcase is now compiled with
> > >>> (A) (C) -mfpu=auto (B)
> > >>>
> > >>> However, since this effective-target is also used on aarch64 (which
> > >>> does not support -mfpu=auto), we do this only on arm.
> > >>>
> > >>> This patch improves coverage, and makes
> > >>> v{ld,st}[234]q_lane_bf16_indices_1.c pass when testsuite flags are
> > >>> overridden as described above (e.g. for M-profile).
> > >>>
> > >>>       gcc/testsuite/
> > >>>       * lib/target-supports.exp
> > >>>       (check_effective_target_arm_v8_2a_bf16_neon_ok_nocache):
> > >>>       Conditionally use -mfpu=auto.
> > >>> ---
> > >>>  gcc/testsuite/lib/target-supports.exp | 9 ++++++++-
> > >>>  1 file changed, 8 insertions(+), 1 deletion(-)
> > >>>
> > >>> diff --git a/gcc/testsuite/lib/target-supports.exp 
> > >>> b/gcc/testsuite/lib/target-supports.exp
> > >>> index e2622a445c5..09b16a14024 100644
> > >>> --- a/gcc/testsuite/lib/target-supports.exp
> > >>> +++ b/gcc/testsuite/lib/target-supports.exp
> > >>> @@ -6871,12 +6871,19 @@ proc add_options_for_arm_fp16fml_neon { flags } 
> > >>> {
> > >>>  proc check_effective_target_arm_v8_2a_bf16_neon_ok_nocache { } {
> > >>>      global et_arm_v8_2a_bf16_neon_flags
> > >>>      set et_arm_v8_2a_bf16_neon_flags ""
> > >>> +    set fpu_auto ""
> > >>>
> > >>>      if { ![istarget arm*-*-*] && ![istarget aarch64*-*-*] } {
> > >>>       return 0;
> > >>>      }
> > >>>
> > >>> -    foreach flags {"" "-mfloat-abi=softfp -mfpu=neon-fp-armv8" 
> > >>> "-mfloat-abi=hard -mfpu=neon-fp-armv8" } {
> > >>> +    if { [istarget arm*-*-*] } {
> > >>> +     set fpu_auto "-mfpu=auto"
> > >>> +    }
> > >>> +
> > >>> +    foreach flags [list "$fpu_auto" \
> > >>
> > >> Shouldn't we try first with "", even on Arm?  Thus
> > >>        foreach flags [list "" "$fpu_auto" \
> > >> ...
> > >>
> > > I don't think so, that's why I tried to explain above.
> > > "" is acceptable / accepted in arm_v8_2a_bf16_neon_ok
> > > (this is (A) (B) above, where the important parts are:
> > > -march=armv7-m -mcpu=unset -march=armv8.2-a+bf16
> > > (so -mfpu is set to the toolchain's default)
> >
> > That's never going to work reliably.  We need to check, somewhere, the full 
> > set of options we intend to pass to the compilation.  We can't assume that 
> > separately testing if A is ok and B is ok => A + B is ok.
> >
>
> Hmmm I think I raised that problem years ago, because of the way the
> test system is designed...
>
> > >
> > > but then the actual testcase is compiled with additional flags (C)
> > > defined by the test driver using arm_neon_fp16_ok
> > > C = -mfpu=neon -mfloat-abi=softfp -mcpu=unset -march=armv7-a
> > > -mfpu=neon-fp16 -mfp16-format=ieee
> > >
> > > so the relevant parts of (A) (C) (B) are:
> > > -march=armv7-m  -mfpu=neon -mcpu=unset -march=armv7-a -mfpu=neon-fp16
> > > -mcpu=unset -march=armv8.2-a+bf16
> > > which can be simplified into
> > > -mfpu=neon-fp16 -march=armv8.2-a+bf16
> > >
> > > I think we need to start with -mfpu=auto instead of "", so that when
> > > -march=armv8.2-a+bf16 takes effect, we have cancelled any other -mfpu.
> > >
> >
> > Ideally a test shouldn't add any options in some test runs; that way we add 
> > some 'base configuration' coverage.  We obviously can't do that everywhere 
> > and there may still be cases where the system can't test anything useful at 
> > all (hopefully only when linking, or more is needed).
> >
>
> Not sure to follow? If a test wants to check that a given feature
> works as expected, it has to use the appropriate options, doesn't it?
>
>
> > >
> > >
> > >>
> > >>> +                    "-mfloat-abi=softfp -mfpu=neon-fp-armv8" \
> > >>> +                    "-mfloat-abi=hard -mfpu=neon-fp-armv8" ] {
> > >>>       if { [check_no_compiler_messages_nocache arm_v8_2a_bf16_neon_ok 
> > >>> object {
> > >>>           #include <arm_neon.h>
> > >>>           #if !defined (__ARM_FEATURE_BF16_VECTOR_ARITHMETIC)
> > >>
> > >> In fact, since aarch64 doesn't support -mfpu at all, only "" is ever 
> > >> going to work here, so perhaps we can recast all this code, perhaps 
> > >> along the lines of that in check_effective_target_arm_neon_h_ok_nocache 
> > >> so that we don't need to actually try to include arm_neon.h at all while 
> > >> figuring out the options.
> > >>
> > >
> > > Maybe.
> > >
> > > OTOH, actually including arm_neon.h is guaranteed to give the right
> > > answer to the question "which flags do we need to be able to include
> > > arm_neon.h?", rather than using several conditions that are supposed
> > > to match what arm_neon.h does?
> > >
> >
> > Working out whether arm_neon.h can be included can be done with 
> > arm_neon_h_ok.  That's currently Arm only, but it's trivial to extend it to 
> > aarch64.  Once we have that, we can write some rules that build on that 
> > base to get other features that we might desire.  But we must, at some 
> > point check, rather than assume, that combining options from different 
> > rules will result in something sane.
> >
>
> I suspect most tests which currently use arm*neon* effective targets
> actually include arm_neon.h, they should arm_neon_h_ok.
>
> But I'm not sure what we can do to make sure, rather than assume that
> A + C + B does what we want, given that:
> - A is defined by the user who runs the testsuite (via RUNTESTFLAGS /
> target_board)
> - C is defined by the test harness (advsimd-intrinsics.exp)
> - B is defined as needed by individual tests
>
> At least we have no control on A.


Another example is default toolchain configuration.  The Linaro CI
found regressions on arm with this patch series although I had tested
on arm-linux-gnueaibhf. The reason is the CI configures with a
different --with-fpu than the one I used for testing.... and reminded
me of another bug for which I sent a patch a few weeks ago (but needed
some re-work): at least arm_v8_neon and arm_v8_3a_complex_neon_ok lack
the +simd option in -march.  I'll send a patch for that.

Thanks,

Christophe


>
> We can create a new directory, where the .exp test harness would not
> use additional options, and move tests which need
> effective-target-arm-arch* there.
> (that is 189 files out of 541 under advsimd-intrinsics/)
>
> Thanks,
>
> Christophe
>
>
> > R.
> >
> > > Thanks,
> > >
> > > Christophe
> > >
> > >> R.
> >

Re: [PATCH 06/10] testsuite: aarch64: arm: Add -mfpu=auto to arm_v8_2a_bf16_neon_ok

Reply via email to