On 2026-02-06 12:21, Richard Earnshaw (foss) wrote:
On 06/02/2026 10:11, Torbjorn SVENSSON wrote:


On 2026-01-27 11:03, Richard Earnshaw wrote:
On 27/01/2026 06:17, Alexandre Oliva wrote:
On Jan 23, 2026, "Richard Earnshaw (foss)" <[email protected]> wrote:

On 19/01/2026 19:23, Alexandre Oliva wrote:

-/* { dg-additional-options "-march=armv7-a -mthumb" { target { arm_arch_v7a_ok && arm_thumb2_ok } } } */
+/* { dg-additional-options "-mcpu=unset -march=armv7-a -mthumb" { target { arm_arch_v7a_ok && arm_thumb2_ok } } } */

This will fail if other options set, or config settings imply,
-mfloat-abi=hard and -mfpu=auto.

So we should use -march=armv7-a+fp

Oh, good catch, thanks.

Here's the patch with this fix, currently under retesting.
I'll take your response above as approval with changes.


Reset the cpu selection to the default on tests that set -march
explicitly instead of using dg-add-options.  The latter would reset
the cpu selection to avoid interference from TOOL_OPTIONS.

Also add +fp to -march in tests that don't override float-abi and fpu,
so that -mfloat-abi=hard -mfpu=auto in TOOL_OPTIONS won't cause a
failure.


for  gcc/testsuite/ChangeLog

     * gcc.target/arm/bfloat16_simd_1_2.c: Add -mcpu=unset.
     * gcc.target/arm/bfloat16_simd_2_2.c: Likewise.
     * gcc.target/arm/bfloat16_simd_3_2.c: Likewise.
     * gcc.dg/torture/pr120347.c: Likewise.  Add +fp to -march.


This is OK, thanks.

Can this patch be picked for releases/gcc-15 too?

Yes, as long as you've tested it properly.

I built r15-10798-gae573c9d0e7f1c and ran the testsuite with the following
combinations of flags:

thumb/arch=armv6s-m/cpu=cortex-m0/float-abi=soft
thumb/arch=armv6s-m/tune=cortex-m0/float-abi=soft/fpu=auto
thumb/arch=armv7-m/cpu=cortex-m3/float-abi=soft
thumb/arch=armv7-m/tune=cortex-m3/float-abi=soft/fpu=auto
thumb/arch=armv7e-m+fp.dp/tune=cortex-m7/float-abi=hard/fpu=auto
thumb/arch=armv7e-m+fp/tune=cortex-m4/float-abi=hard/fpu=auto
thumb/arch=armv7e-m+nofp/tune=cortex-m4/float-abi=soft/fpu=auto
thumb/arch=armv7e-m+nofp/tune=cortex-m7/float-abi=soft/fpu=auto
thumb/arch=armv7e-m/cpu=cortex-m4/float-abi=hard/fpu=fpv4-sp-d16
thumb/arch=armv7e-m/cpu=cortex-m4/float-abi=soft
thumb/arch=armv7e-m/cpu=cortex-m7/float-abi=hard/fpu=fpv5-d16
thumb/arch=armv7e-m/cpu=cortex-m7/float-abi=soft
thumb/arch=armv7ve+neon/tune=cortex-a7/float-abi=hard/fpu=auto
thumb/arch=armv7ve+nofp/tune=cortex-a7/float-abi=soft/fpu=auto
thumb/arch=armv7ve/cpu=cortex-a7/float-abi=hard/fpu=neon
thumb/arch=armv7ve/cpu=cortex-a7/float-abi=soft
thumb/arch=armv8-m.main+dsp+fp/tune=cortex-m33/float-abi=hard/fpu=auto
thumb/arch=armv8-m.main+dsp+nofp/tune=cortex-m33/float-abi=soft/fpu=auto
thumb/arch=armv8-m.main+dsp/cpu=cortex-m33/float-abi=hard/fpu=fpv5-sp-d16
thumb/arch=armv8-m.main+dsp/cpu=cortex-m33/float-abi=soft
thumb/arch=armv8.1-m.main+mve+nofp/tune=cortex-m55/float-abi=soft/fpu=auto
thumb/arch=armv8.1-m.main+mve+pacbti+nofp/tune=cortex-m85/float-abi=soft/fpu=auto
thumb/arch=armv8.1-m.main+mve+pacbti/cpu=cortex-m85/float-abi=hard/fpu=fpv5-d16
thumb/arch=armv8.1-m.main+mve+pacbti/cpu=cortex-m85/float-abi=soft
thumb/arch=armv8.1-m.main+mve.fp+fp.dp/tune=cortex-m55/float-abi=hard/fpu=auto
thumb/arch=armv8.1-m.main+mve.fp+pacbti+fp.dp/tune=cortex-m85/float-abi=hard/fpu=auto
thumb/arch=armv8.1-m.main+mve/cpu=cortex-m55/float-abi=hard/fpu=fpv5-d16
thumb/arch=armv8.1-m.main+mve/cpu=cortex-m55/float-abi=soft
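
For readers who want to reproduce one of the combinations above, here is a
minimal sketch that expands such a string into compiler options. It assumes
that each "/"-separated component corresponds to a -m option of the same name
(thumb -> -mthumb, arch=... -> -march=..., and so on). The helper name
to_gcc_flags is hypothetical and is not part of the testsuite.

```python
def to_gcc_flags(combo):
    """Expand a 'thumb/arch=.../cpu=...' string into a list of -m options.

    Assumption: every non-empty component maps to '-m' + component.
    """
    return ["-m" + part for part in combo.split("/") if part]


if __name__ == "__main__":
    combo = "thumb/arch=armv7-m/cpu=cortex-m3/float-abi=soft"
    # Join with spaces to get something that can be pasted onto a command line.
    print(" ".join(to_gcc_flags(combo)))
```

How these options are passed to the test run (target board, TOOL_OPTIONS) is
not stated in this message, so the sketch stops at the option list.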

Out of these, the following permutations have the 4 test cases going from FAIL 
to PASS with the patch applied:

thumb/arch=armv7-m/cpu=cortex-m3/float-abi=soft
thumb/arch=armv7e-m+fp.dp/tune=cortex-m7/float-abi=hard/fpu=auto
thumb/arch=armv7e-m+fp/tune=cortex-m4/float-abi=hard/fpu=auto
thumb/arch=armv7e-m/cpu=cortex-m4/float-abi=hard/fpu=fpv4-sp-d16
thumb/arch=armv7e-m/cpu=cortex-m4/float-abi=soft
thumb/arch=armv7e-m/cpu=cortex-m7/float-abi=hard/fpu=fpv5-d16
thumb/arch=armv7e-m/cpu=cortex-m7/float-abi=soft
thumb/arch=armv7ve+neon/tune=cortex-a7/float-abi=hard/fpu=auto
thumb/arch=armv7ve/cpu=cortex-a7/float-abi=hard/fpu=neon
thumb/arch=armv7ve/cpu=cortex-a7/float-abi=soft
thumb/arch=armv8-m.main+dsp+fp/tune=cortex-m33/float-abi=hard/fpu=auto
thumb/arch=armv8-m.main+dsp/cpu=cortex-m33/float-abi=hard/fpu=fpv5-sp-d16
thumb/arch=armv8-m.main+dsp/cpu=cortex-m33/float-abi=soft
thumb/arch=armv8.1-m.main+mve+pacbti/cpu=cortex-m85/float-abi=hard/fpu=fpv5-d16
thumb/arch=armv8.1-m.main+mve+pacbti/cpu=cortex-m85/float-abi=soft
thumb/arch=armv8.1-m.main+mve.fp+fp.dp/tune=cortex-m55/float-abi=hard/fpu=auto
thumb/arch=armv8.1-m.main+mve.fp+pacbti+fp.dp/tune=cortex-m85/float-abi=hard/fpu=auto
thumb/arch=armv8.1-m.main+mve/cpu=cortex-m55/float-abi=hard/fpu=fpv5-d16
thumb/arch=armv8.1-m.main+mve/cpu=cortex-m55/float-abi=soft

No test cases regress.


However, the following permutations still fail `check-function-bodies
stacktest1` in bfloat16_simd_[123]_2.c, even after applying the patch (the
assembly below is from bfloat16_simd_1_2.c, but the other two bfloat16 tests
produce similar output):

thumb/arch=armv7-m/tune=cortex-m3/float-abi=soft/fpu=auto
        sub     sp, sp, #8
        strh    r0, [sp, #6]    @ __bf16
        ldrh    r0, [sp, #6]    @ __bf16
        add     r3, sp, #6
        add     sp, sp, #8
        bx      lr

thumb/arch=armv7e-m+fp.dp/tune=cortex-m7/float-abi=hard/fpu=auto
        sub     sp, sp, #8
        strh    r0, [sp, #6]    @ __bf16
        add     r3, sp, #6
        ldrh    r0, [sp, #6]    @ __bf16
        add     sp, sp, #8
        bx      lr

thumb/arch=armv7e-m+nofp/tune=cortex-m7/float-abi=soft/fpu=auto
        sub     sp, sp, #8
        strh    r0, [sp, #6]    @ __bf16
        add     r3, sp, #6
        ldrh    r0, [sp, #6]    @ __bf16
        add     sp, sp, #8
        bx      lr

thumb/arch=armv8-m.main+dsp+fp/tune=cortex-m33/float-abi=hard/fpu=auto
        sub     sp, sp, #8
        strh    r0, [sp, #6]    @ __bf16
        ldrh    r0, [sp, #6]    @ __bf16
        add     r3, sp, #6
        add     sp, sp, #8
        bx      lr

thumb/arch=armv8-m.main+dsp+nofp/tune=cortex-m33/float-abi=soft/fpu=auto
        sub     sp, sp, #8
        strh    r0, [sp, #6]    @ __bf16
        ldrh    r0, [sp, #6]    @ __bf16
        add     r3, sp, #6
        add     sp, sp, #8
        bx      lr

thumb/arch=armv8.1-m.main+mve+nofp/tune=cortex-m55/float-abi=soft/fpu=auto
        sub     sp, sp, #8
        strh    r0, [sp, #6]    @ __bf16
        ldrh    r0, [sp, #6]    @ __bf16
        add     r3, sp, #6
        add     sp, sp, #8
        bx      lr

thumb/arch=armv8.1-m.main+mve+pacbti+nofp/tune=cortex-m85/float-abi=soft/fpu=auto
        sub     sp, sp, #8
        strh    r0, [sp, #6]    @ __bf16
        ldrh    r0, [sp, #6]    @ __bf16
        add     r3, sp, #6
        add     sp, sp, #8
        bx      lr

thumb/arch=armv8.1-m.main+mve.fp+fp.dp/tune=cortex-m55/float-abi=hard/fpu=auto
        sub     sp, sp, #8
        strh    r0, [sp, #6]    @ __bf16
        ldrh    r0, [sp, #6]    @ __bf16
        add     r3, sp, #6
        add     sp, sp, #8
        bx      lr

thumb/arch=armv8.1-m.main+mve.fp+pacbti+fp.dp/tune=cortex-m85/float-abi=hard/fpu=auto
        sub     sp, sp, #8
        strh    r0, [sp, #6]    @ __bf16
        ldrh    r0, [sp, #6]    @ __bf16
        add     r3, sp, #6
        add     sp, sp, #8
        bx      lr


I've also checked r16-6992-gd7e5113e592c54, and it shows patterns similar to
those above for the failing tests.

The function body should match:
/*
**stacktest1:
**      ...
**      strh    r[0-9]+, \[r[0-9]+\]    @ __bf16
**      ldrh    r[0-9]+, \[sp, #[0-9]+\]        @ __bf16
**      ...
**      bx      lr
*/
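
To make the mismatch concrete, the following is a rough sketch that applies
the two expected lines to one of the failing outputs above. It is not how
check-function-bodies is implemented. It assumes each expected line can be
treated as a regular expression matched against a whole line, and it matches
runs of whitespace loosely with \s+.

```python
import re

# Expected lines from the stacktest1 body above, as regexes.
# Assumption: whitespace is matched loosely; the real harness may differ.
STRH = re.compile(r"strh\s+r[0-9]+, \[r[0-9]+\]\s+@ __bf16")
LDRH = re.compile(r"ldrh\s+r[0-9]+, \[sp, #[0-9]+\]\s+@ __bf16")

# Output copied from the armv7-m/tune=cortex-m3 case above.
emitted = [
    "sub     sp, sp, #8",
    "strh    r0, [sp, #6]    @ __bf16",
    "ldrh    r0, [sp, #6]    @ __bf16",
    "add     r3, sp, #6",
    "add     sp, sp, #8",
    "bx      lr",
]


def matching_lines(pattern, lines):
    """Return the lines that the pattern matches in full."""
    return [line for line in lines if pattern.fullmatch(line.strip())]


strh_hits = matching_lines(STRH, emitted)
ldrh_hits = matching_lines(LDRH, emitted)

if __name__ == "__main__":
    print("strh matches:", strh_hits)
    print("ldrh matches:", ldrh_hits)
```

Under these assumptions the ldrh line matches, but the strh line does not:
the expected pattern wants a plain register base such as [r3], while the
compiler stores through [sp, #6].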

Is it okay for both the strh and the ldrh to address the stack through sp, or
is this a real bug in the compiler?


Unless someone objects before the end of the week, I'll do the cherry-pick
for releases/gcc-15 so that the tests at least run with the correct flags.

Kind regards,
Torbjörn



R.

