On Tue, Jun 24, 2025 at 4:28 PM Alexander Monakov <amona...@ispras.ru> wrote: > > > > Thanks! Any thoughts on the other patch in the thread, about flipping > > > -ffp-contract from =fast to =on? > > > > I can't find this mail, not in spam either, but I'm OK with such change if > > it > > comes with test coverage. > > Ouch, let me reproduce it below. About test coverage, I'm not exactly sure > what > you mean, so I'll try to explain my perspective. > > We definitely have existing test coverage for production of FMAs, especially > on targets where FMA is in baseline ISA (aarch64, ia64, rs6000, maybe others), > but even x86 has a number of tests where fma is enabled with -m flags. Some > of those tests will regress and the question is what to do about them. > > On x86, I've provided a fix for one test (the patch you just approved), but > I can't fix everything and in the patch I'm changing the remaining tests to > pass -ffp-contract=fast. But most of the tests are actually highlighting > missed optimizations, like SLP not able to form addsub FMAs from a pair of > fmadd+fmsub.
I'd say we want to fix these kind of things before switching the default. Can you file bugreports for the distinct issues you noticed when adjusting the testcases? I suppose they are reproducible as well when using the C fma() function directly? Thanks, Richard. > > And the question of what to do on all the other targets remains. > > --- 8< --- > > From 478d51bc46c925028f6beeee1782fcadb3a92a58 Mon Sep 17 00:00:00 2001 > From: Alexander Monakov <amona...@ispras.ru> > Date: Mon, 12 May 2025 23:23:42 +0300 > Subject: [PATCH] c-family: switch away from -ffp-contract=fast > > Unrestricted FMA contraction, combined with optimizations that create > copies of expressions, is causing hard-to-debug issues such as PR106902. > Since we implement conformant contraction now, switch C and C++ from > -ffp-contract=fast to either =off (under -std=c[++]NN like before, also > for C++ now), or =on (under -std=gnu[++]NN). Keep -ffp-contract=fast > when -funsafe-math-optimizations (or -ffast-math, -Ofast) is active. > > In other words, > > - -ffast-math: no change, unrestricted contraction like before; > - standards compliant mode for C: no change, no contraction; > - ditto, for C++: align with C (no contraction); > - otherwise, switch C and C++ from -ffp-contract=fast to =on. > > gcc/c-family/ChangeLog: > > * c-opts.cc (c_common_post_options): Adjust handling of > flag_fp_contract_mode. > > gcc/ChangeLog: > > * doc/invoke.texi (-ffp-contract): Describe new defaults. > (-funsafe-math-optimizations): Add -ffp-contract=fast. > > gcc/testsuite/ChangeLog: > > * gcc.target/i386/pr81904.c: Add -ffp-contract=fast to flags. > * gcc.target/i386/pr116979.c: Ditto. > * gcc.target/i386/intrinsics_opt-1.c: Ditto. > * gcc.target/i386/part-vect-fmaddsubhf-1.c: Ditto. > * gcc.target/i386/avx512f-vect-fmaddsubXXXps.c: Ditto. > * gcc.target/i386/avx512f-vect-fmaddsubXXXpd.c: Ditto. > * gcc.target/i386/avx512f-vect-fmsubaddXXXps.c: Ditto. > * gcc.target/i386/avx512f-vect-fmsubaddXXXpd.c: Ditto. > --- > gcc/c-family/c-opts.cc | 11 ++--------- > gcc/doc/invoke.texi | 10 ++++++---- > .../gcc.target/i386/avx512f-vect-fmaddsubXXXpd.c | 2 +- > .../gcc.target/i386/avx512f-vect-fmaddsubXXXps.c | 2 +- > .../gcc.target/i386/avx512f-vect-fmsubaddXXXpd.c | 2 +- > .../gcc.target/i386/avx512f-vect-fmsubaddXXXps.c | 2 +- > gcc/testsuite/gcc.target/i386/intrinsics_opt-1.c | 2 +- > .../gcc.target/i386/part-vect-fmaddsubhf-1.c | 2 +- > gcc/testsuite/gcc.target/i386/pr116979.c | 2 +- > gcc/testsuite/gcc.target/i386/pr81904.c | 2 +- > 10 files changed, 16 insertions(+), 21 deletions(-) > > diff --git a/gcc/c-family/c-opts.cc b/gcc/c-family/c-opts.cc > index 697518637d..1ae45e2b9a 100644 > --- a/gcc/c-family/c-opts.cc > +++ b/gcc/c-family/c-opts.cc > @@ -877,15 +877,8 @@ c_common_post_options (const char **pfilename) > flag_excess_precision = (flag_iso ? EXCESS_PRECISION_STANDARD > : EXCESS_PRECISION_FAST); > > - /* ISO C restricts floating-point expression contraction to within > - source-language expressions (-ffp-contract=on, currently an alias > - for -ffp-contract=off). */ > - if (flag_iso > - && !c_dialect_cxx () > - && (OPTION_SET_P (flag_fp_contract_mode) > - == (enum fp_contract_mode) 0) > - && flag_unsafe_math_optimizations == 0) > - flag_fp_contract_mode = FP_CONTRACT_OFF; > + if (!flag_unsafe_math_optimizations && !OPTION_SET_P > (flag_fp_contract_mode)) > + flag_fp_contract_mode = flag_iso ? FP_CONTRACT_OFF : FP_CONTRACT_ON; > > /* C language modes before C99 enable -fpermissive by default, but > only if -pedantic-errors is not specified. Also treat > diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi > index 91b0a201e1..b3a77320ee 100644 > --- a/gcc/doc/invoke.texi > +++ b/gcc/doc/invoke.texi > @@ -13070,16 +13070,17 @@ Disabled by default. > @opindex ffp-contract > @item -ffp-contract=@var{style} > @option{-ffp-contract=off} disables floating-point expression contraction. > +This is the default for C and C++ in a standards compliant mode > +(@option{-std=c11}, @option{-std=c++11} or similar). > @option{-ffp-contract=fast} enables floating-point expression contraction > such as forming of fused multiply-add operations if the target has > native support for them. > +This is the default for languages other than C and C++. > @option{-ffp-contract=on} enables floating-point expression contraction > if allowed by the language standard. This is implemented for C and C++, > where it enables contraction within one expression, but not across > different statements. > - > -The default is @option{-ffp-contract=off} for C in a standards compliant mode > -(@option{-std=c11} or similar), @option{-ffp-contract=fast} otherwise. > +This is the default when compiling GNU dialects of C or C++. > > @opindex fomit-frame-pointer > @item -fomit-frame-pointer > @@ -15567,7 +15568,8 @@ or ISO rules/specifications for math functions. It > may, however, > yield faster code for programs that do not require the guarantees > of these specifications. > Enables @option{-fno-signed-zeros}, @option{-fno-trapping-math}, > -@option{-fassociative-math} and @option{-freciprocal-math}. > +@option{-fassociative-math}, @option{-freciprocal-math} and > +@option{-ffp-contract=fast}. > > The default is @option{-fno-unsafe-math-optimizations}. > > diff --git a/gcc/testsuite/gcc.target/i386/avx512f-vect-fmaddsubXXXpd.c > b/gcc/testsuite/gcc.target/i386/avx512f-vect-fmaddsubXXXpd.c > index 734f9e0144..2510d96937 100644 > --- a/gcc/testsuite/gcc.target/i386/avx512f-vect-fmaddsubXXXpd.c > +++ b/gcc/testsuite/gcc.target/i386/avx512f-vect-fmaddsubXXXpd.c > @@ -1,6 +1,6 @@ > /* { dg-do run } */ > /* { dg-require-effective-target avx512f } */ > -/* { dg-options "-O3 -mfma -save-temps -mavx512f -mprefer-vector-width=512" > } */ > +/* { dg-options "-O3 -mfma -save-temps -mavx512f -mprefer-vector-width=512 > -ffp-contract=fast" } */ > > #include "fma-check.h" > void __attribute__((noipa)) > diff --git a/gcc/testsuite/gcc.target/i386/avx512f-vect-fmaddsubXXXps.c > b/gcc/testsuite/gcc.target/i386/avx512f-vect-fmaddsubXXXps.c > index ae196c5ef4..6c3577ff49 100644 > --- a/gcc/testsuite/gcc.target/i386/avx512f-vect-fmaddsubXXXps.c > +++ b/gcc/testsuite/gcc.target/i386/avx512f-vect-fmaddsubXXXps.c > @@ -1,6 +1,6 @@ > /* { dg-do run } */ > /* { dg-require-effective-target avx512f } */ > -/* { dg-options "-O3 -mavx512f -mprefer-vector-width=512 -save-temps" } */ > +/* { dg-options "-O3 -mavx512f -mprefer-vector-width=512 -save-temps > -ffp-contract=fast" } */ > > #include "fma-check.h" > void __attribute__((noipa)) > diff --git a/gcc/testsuite/gcc.target/i386/avx512f-vect-fmsubaddXXXpd.c > b/gcc/testsuite/gcc.target/i386/avx512f-vect-fmsubaddXXXpd.c > index cde76db175..bfea44e3e7 100644 > --- a/gcc/testsuite/gcc.target/i386/avx512f-vect-fmsubaddXXXpd.c > +++ b/gcc/testsuite/gcc.target/i386/avx512f-vect-fmsubaddXXXpd.c > @@ -1,6 +1,6 @@ > /* { dg-do run } */ > /* { dg-require-effective-target avx512f } */ > -/* { dg-options "-O3 -mavx512f -mprefer-vector-width=512 -save-temps" } */ > +/* { dg-options "-O3 -mavx512f -mprefer-vector-width=512 -save-temps > -ffp-contract=fast" } */ > > #include "fma-check.h" > void __attribute__((noipa)) > diff --git a/gcc/testsuite/gcc.target/i386/avx512f-vect-fmsubaddXXXps.c > b/gcc/testsuite/gcc.target/i386/avx512f-vect-fmsubaddXXXps.c > index 59de39f411..ff4c475aa4 100644 > --- a/gcc/testsuite/gcc.target/i386/avx512f-vect-fmsubaddXXXps.c > +++ b/gcc/testsuite/gcc.target/i386/avx512f-vect-fmsubaddXXXps.c > @@ -1,6 +1,6 @@ > /* { dg-do run } */ > /* { dg-require-effective-target avx512f } */ > -/* { dg-options "-O3 -mavx512f -mprefer-vector-width=512 -save-temps" } */ > +/* { dg-options "-O3 -mavx512f -mprefer-vector-width=512 -save-temps > -ffp-contract=fast" } */ > > #include "fma-check.h" > void __attribute__((noipa)) > diff --git a/gcc/testsuite/gcc.target/i386/intrinsics_opt-1.c > b/gcc/testsuite/gcc.target/i386/intrinsics_opt-1.c > index a75bf4e9ca..edd8a7d964 100644 > --- a/gcc/testsuite/gcc.target/i386/intrinsics_opt-1.c > +++ b/gcc/testsuite/gcc.target/i386/intrinsics_opt-1.c > @@ -1,5 +1,5 @@ > /* { dg-do compile } */ > -/* { dg-options "-O2 -mfma" } */ > +/* { dg-options "-O2 -mfma -ffp-contract=fast" } */ > > #include <emmintrin.h> > > diff --git a/gcc/testsuite/gcc.target/i386/part-vect-fmaddsubhf-1.c > b/gcc/testsuite/gcc.target/i386/part-vect-fmaddsubhf-1.c > index 051f992f66..2e7ca250c7 100644 > --- a/gcc/testsuite/gcc.target/i386/part-vect-fmaddsubhf-1.c > +++ b/gcc/testsuite/gcc.target/i386/part-vect-fmaddsubhf-1.c > @@ -1,5 +1,5 @@ > /* { dg-do compile } */ > -/* { dg-options "-mavx512fp16 -mavx512vl -O2" } */ > +/* { dg-options "-mavx512fp16 -mavx512vl -O2 -ffp-contract=fast" } */ > /* { dg-final { scan-assembler-times "vfmaddsub...ph\[ > \t\]+\[^\n\]*%xmm\[0-9\]" 1 { target { ! ia32 } } } } */ > /* { dg-final { scan-assembler-times "vfmsubadd...ph\[ > \t\]+\[^\n\]*%xmm\[0-9\]" 1 { target { ! ia32 } } } } */ > > diff --git a/gcc/testsuite/gcc.target/i386/pr116979.c > b/gcc/testsuite/gcc.target/i386/pr116979.c > index 0d2a958af4..fa2022b8a9 100644 > --- a/gcc/testsuite/gcc.target/i386/pr116979.c > +++ b/gcc/testsuite/gcc.target/i386/pr116979.c > @@ -1,6 +1,6 @@ > /* PR target/116979 */ > /* { dg-do compile } */ > -/* { dg-options "-O2 -mfma -fvect-cost-model=unlimited" } */ > +/* { dg-options "-O2 -mfma -fvect-cost-model=unlimited -ffp-contract=fast" } > */ > /* { dg-final { scan-assembler "vfmaddsub(?:132|213|231)pd" } } */ > /* { dg-final { scan-assembler "vfmaddsub(?:132|213|231)ps" { target { ! > ia32 } } } } */ > > diff --git a/gcc/testsuite/gcc.target/i386/pr81904.c > b/gcc/testsuite/gcc.target/i386/pr81904.c > index 9f5ad0bd95..ef07c8b08c 100644 > --- a/gcc/testsuite/gcc.target/i386/pr81904.c > +++ b/gcc/testsuite/gcc.target/i386/pr81904.c > @@ -1,5 +1,5 @@ > /* { dg-do compile } */ > -/* { dg-options "-mavx512fp16 -mavx512vl -O2 -mprefer-vector-width=512" } */ > +/* { dg-options "-mavx512fp16 -mavx512vl -O2 -mprefer-vector-width=512 > -ffp-contract=fast" } */ > /* { dg-final { scan-assembler-times "vfmaddsub...ph\[ > \t\]+\[^\n\]*%zmm\[0-9\]" 1 } } */ > /* { dg-final { scan-assembler-times "vfmsubadd...ph\[ > \t\]+\[^\n\]*%zmm\[0-9\]" 1 } } */ > > -- > 2.49.0 >