> > Thanks! Any thoughts on the other patch in the thread, about flipping > > -ffp-contract from =fast to =on? > > I can't find this mail, not in spam either, but I'm OK with such change if it > comes with test coverage.
Ouch, let me reproduce it below. About test coverage, I'm not exactly sure what you mean, so I'll try to explain my perspective. We definitely have existing test coverage for production of FMAs, especially on targets where FMA is in baseline ISA (aarch64, ia64, rs6000, maybe others), but even x86 has a number of tests where fma is enabled with -m flags. Some of those tests will regress and the question is what to do about them. On x86, I've provided a fix for one test (the patch you just approved), but I can't fix everything and in the patch I'm changing the remaining tests to pass -ffp-contract=fast. But most of the tests are actually highlighting missed optimizations, like SLP not able to form addsub FMAs from a pair of fmadd+fmsub. And the question of what to do on all the other targets remains. --- 8< --- >From 478d51bc46c925028f6beeee1782fcadb3a92a58 Mon Sep 17 00:00:00 2001 From: Alexander Monakov <amona...@ispras.ru> Date: Mon, 12 May 2025 23:23:42 +0300 Subject: [PATCH] c-family: switch away from -ffp-contract=fast Unrestricted FMA contraction, combined with optimizations that create copies of expressions, is causing hard-to-debug issues such as PR106902. Since we implement conformant contraction now, switch C and C++ from -ffp-contract=fast to either =off (under -std=c[++]NN like before, also for C++ now), or =on (under -std=gnu[++]NN). Keep -ffp-contract=fast when -funsafe-math-optimizations (or -ffast-math, -Ofast) is active. In other words, - -ffast-math: no change, unrestricted contraction like before; - standards compliant mode for C: no change, no contraction; - ditto, for C++: align with C (no contraction); - otherwise, switch C and C++ from -ffp-contract=fast to =on. gcc/c-family/ChangeLog: * c-opts.cc (c_common_post_options): Adjust handling of flag_fp_contract_mode. gcc/ChangeLog: * doc/invoke.texi (-ffp-contract): Describe new defaults. (-funsafe-math-optimizations): Add -ffp-contract=fast. gcc/testsuite/ChangeLog: * gcc.target/i386/pr81904.c: Add -ffp-contract=fast to flags. * gcc.target/i386/pr116979.c: Ditto. * gcc.target/i386/intrinsics_opt-1.c: Ditto. * gcc.target/i386/part-vect-fmaddsubhf-1.c: Ditto. * gcc.target/i386/avx512f-vect-fmaddsubXXXps.c: Ditto. * gcc.target/i386/avx512f-vect-fmaddsubXXXpd.c: Ditto. * gcc.target/i386/avx512f-vect-fmsubaddXXXps.c: Ditto. * gcc.target/i386/avx512f-vect-fmsubaddXXXpd.c: Ditto. --- gcc/c-family/c-opts.cc | 11 ++--------- gcc/doc/invoke.texi | 10 ++++++---- .../gcc.target/i386/avx512f-vect-fmaddsubXXXpd.c | 2 +- .../gcc.target/i386/avx512f-vect-fmaddsubXXXps.c | 2 +- .../gcc.target/i386/avx512f-vect-fmsubaddXXXpd.c | 2 +- .../gcc.target/i386/avx512f-vect-fmsubaddXXXps.c | 2 +- gcc/testsuite/gcc.target/i386/intrinsics_opt-1.c | 2 +- .../gcc.target/i386/part-vect-fmaddsubhf-1.c | 2 +- gcc/testsuite/gcc.target/i386/pr116979.c | 2 +- gcc/testsuite/gcc.target/i386/pr81904.c | 2 +- 10 files changed, 16 insertions(+), 21 deletions(-) diff --git a/gcc/c-family/c-opts.cc b/gcc/c-family/c-opts.cc index 697518637d..1ae45e2b9a 100644 --- a/gcc/c-family/c-opts.cc +++ b/gcc/c-family/c-opts.cc @@ -877,15 +877,8 @@ c_common_post_options (const char **pfilename) flag_excess_precision = (flag_iso ? EXCESS_PRECISION_STANDARD : EXCESS_PRECISION_FAST); - /* ISO C restricts floating-point expression contraction to within - source-language expressions (-ffp-contract=on, currently an alias - for -ffp-contract=off). */ - if (flag_iso - && !c_dialect_cxx () - && (OPTION_SET_P (flag_fp_contract_mode) - == (enum fp_contract_mode) 0) - && flag_unsafe_math_optimizations == 0) - flag_fp_contract_mode = FP_CONTRACT_OFF; + if (!flag_unsafe_math_optimizations && !OPTION_SET_P (flag_fp_contract_mode)) + flag_fp_contract_mode = flag_iso ? FP_CONTRACT_OFF : FP_CONTRACT_ON; /* C language modes before C99 enable -fpermissive by default, but only if -pedantic-errors is not specified. Also treat diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index 91b0a201e1..b3a77320ee 100644 --- a/gcc/doc/invoke.texi +++ b/gcc/doc/invoke.texi @@ -13070,16 +13070,17 @@ Disabled by default. @opindex ffp-contract @item -ffp-contract=@var{style} @option{-ffp-contract=off} disables floating-point expression contraction. +This is the default for C and C++ in a standards compliant mode +(@option{-std=c11}, @option{-std=c++11} or similar). @option{-ffp-contract=fast} enables floating-point expression contraction such as forming of fused multiply-add operations if the target has native support for them. +This is the default for languages other than C and C++. @option{-ffp-contract=on} enables floating-point expression contraction if allowed by the language standard. This is implemented for C and C++, where it enables contraction within one expression, but not across different statements. - -The default is @option{-ffp-contract=off} for C in a standards compliant mode -(@option{-std=c11} or similar), @option{-ffp-contract=fast} otherwise. +This is the default when compiling GNU dialects of C or C++. @opindex fomit-frame-pointer @item -fomit-frame-pointer @@ -15567,7 +15568,8 @@ or ISO rules/specifications for math functions. It may, however, yield faster code for programs that do not require the guarantees of these specifications. Enables @option{-fno-signed-zeros}, @option{-fno-trapping-math}, -@option{-fassociative-math} and @option{-freciprocal-math}. +@option{-fassociative-math}, @option{-freciprocal-math} and +@option{-ffp-contract=fast}. The default is @option{-fno-unsafe-math-optimizations}. diff --git a/gcc/testsuite/gcc.target/i386/avx512f-vect-fmaddsubXXXpd.c b/gcc/testsuite/gcc.target/i386/avx512f-vect-fmaddsubXXXpd.c index 734f9e0144..2510d96937 100644 --- a/gcc/testsuite/gcc.target/i386/avx512f-vect-fmaddsubXXXpd.c +++ b/gcc/testsuite/gcc.target/i386/avx512f-vect-fmaddsubXXXpd.c @@ -1,6 +1,6 @@ /* { dg-do run } */ /* { dg-require-effective-target avx512f } */ -/* { dg-options "-O3 -mfma -save-temps -mavx512f -mprefer-vector-width=512" } */ +/* { dg-options "-O3 -mfma -save-temps -mavx512f -mprefer-vector-width=512 -ffp-contract=fast" } */ #include "fma-check.h" void __attribute__((noipa)) diff --git a/gcc/testsuite/gcc.target/i386/avx512f-vect-fmaddsubXXXps.c b/gcc/testsuite/gcc.target/i386/avx512f-vect-fmaddsubXXXps.c index ae196c5ef4..6c3577ff49 100644 --- a/gcc/testsuite/gcc.target/i386/avx512f-vect-fmaddsubXXXps.c +++ b/gcc/testsuite/gcc.target/i386/avx512f-vect-fmaddsubXXXps.c @@ -1,6 +1,6 @@ /* { dg-do run } */ /* { dg-require-effective-target avx512f } */ -/* { dg-options "-O3 -mavx512f -mprefer-vector-width=512 -save-temps" } */ +/* { dg-options "-O3 -mavx512f -mprefer-vector-width=512 -save-temps -ffp-contract=fast" } */ #include "fma-check.h" void __attribute__((noipa)) diff --git a/gcc/testsuite/gcc.target/i386/avx512f-vect-fmsubaddXXXpd.c b/gcc/testsuite/gcc.target/i386/avx512f-vect-fmsubaddXXXpd.c index cde76db175..bfea44e3e7 100644 --- a/gcc/testsuite/gcc.target/i386/avx512f-vect-fmsubaddXXXpd.c +++ b/gcc/testsuite/gcc.target/i386/avx512f-vect-fmsubaddXXXpd.c @@ -1,6 +1,6 @@ /* { dg-do run } */ /* { dg-require-effective-target avx512f } */ -/* { dg-options "-O3 -mavx512f -mprefer-vector-width=512 -save-temps" } */ +/* { dg-options "-O3 -mavx512f -mprefer-vector-width=512 -save-temps -ffp-contract=fast" } */ #include "fma-check.h" void __attribute__((noipa)) diff --git a/gcc/testsuite/gcc.target/i386/avx512f-vect-fmsubaddXXXps.c b/gcc/testsuite/gcc.target/i386/avx512f-vect-fmsubaddXXXps.c index 59de39f411..ff4c475aa4 100644 --- a/gcc/testsuite/gcc.target/i386/avx512f-vect-fmsubaddXXXps.c +++ b/gcc/testsuite/gcc.target/i386/avx512f-vect-fmsubaddXXXps.c @@ -1,6 +1,6 @@ /* { dg-do run } */ /* { dg-require-effective-target avx512f } */ -/* { dg-options "-O3 -mavx512f -mprefer-vector-width=512 -save-temps" } */ +/* { dg-options "-O3 -mavx512f -mprefer-vector-width=512 -save-temps -ffp-contract=fast" } */ #include "fma-check.h" void __attribute__((noipa)) diff --git a/gcc/testsuite/gcc.target/i386/intrinsics_opt-1.c b/gcc/testsuite/gcc.target/i386/intrinsics_opt-1.c index a75bf4e9ca..edd8a7d964 100644 --- a/gcc/testsuite/gcc.target/i386/intrinsics_opt-1.c +++ b/gcc/testsuite/gcc.target/i386/intrinsics_opt-1.c @@ -1,5 +1,5 @@ /* { dg-do compile } */ -/* { dg-options "-O2 -mfma" } */ +/* { dg-options "-O2 -mfma -ffp-contract=fast" } */ #include <emmintrin.h> diff --git a/gcc/testsuite/gcc.target/i386/part-vect-fmaddsubhf-1.c b/gcc/testsuite/gcc.target/i386/part-vect-fmaddsubhf-1.c index 051f992f66..2e7ca250c7 100644 --- a/gcc/testsuite/gcc.target/i386/part-vect-fmaddsubhf-1.c +++ b/gcc/testsuite/gcc.target/i386/part-vect-fmaddsubhf-1.c @@ -1,5 +1,5 @@ /* { dg-do compile } */ -/* { dg-options "-mavx512fp16 -mavx512vl -O2" } */ +/* { dg-options "-mavx512fp16 -mavx512vl -O2 -ffp-contract=fast" } */ /* { dg-final { scan-assembler-times "vfmaddsub...ph\[ \t\]+\[^\n\]*%xmm\[0-9\]" 1 { target { ! ia32 } } } } */ /* { dg-final { scan-assembler-times "vfmsubadd...ph\[ \t\]+\[^\n\]*%xmm\[0-9\]" 1 { target { ! ia32 } } } } */ diff --git a/gcc/testsuite/gcc.target/i386/pr116979.c b/gcc/testsuite/gcc.target/i386/pr116979.c index 0d2a958af4..fa2022b8a9 100644 --- a/gcc/testsuite/gcc.target/i386/pr116979.c +++ b/gcc/testsuite/gcc.target/i386/pr116979.c @@ -1,6 +1,6 @@ /* PR target/116979 */ /* { dg-do compile } */ -/* { dg-options "-O2 -mfma -fvect-cost-model=unlimited" } */ +/* { dg-options "-O2 -mfma -fvect-cost-model=unlimited -ffp-contract=fast" } */ /* { dg-final { scan-assembler "vfmaddsub(?:132|213|231)pd" } } */ /* { dg-final { scan-assembler "vfmaddsub(?:132|213|231)ps" { target { ! ia32 } } } } */ diff --git a/gcc/testsuite/gcc.target/i386/pr81904.c b/gcc/testsuite/gcc.target/i386/pr81904.c index 9f5ad0bd95..ef07c8b08c 100644 --- a/gcc/testsuite/gcc.target/i386/pr81904.c +++ b/gcc/testsuite/gcc.target/i386/pr81904.c @@ -1,5 +1,5 @@ /* { dg-do compile } */ -/* { dg-options "-mavx512fp16 -mavx512vl -O2 -mprefer-vector-width=512" } */ +/* { dg-options "-mavx512fp16 -mavx512vl -O2 -mprefer-vector-width=512 -ffp-contract=fast" } */ /* { dg-final { scan-assembler-times "vfmaddsub...ph\[ \t\]+\[^\n\]*%zmm\[0-9\]" 1 } } */ /* { dg-final { scan-assembler-times "vfmsubadd...ph\[ \t\]+\[^\n\]*%zmm\[0-9\]" 1 } } */ -- 2.49.0