Whoops, I forgot to attach the patch.

On Mon, Oct 02, 2017 at 07:51:00PM -0400, Michael Meissner wrote:
> On Thu, Sep 28, 2017 at 12:40:24AM +0000, Joseph Myers wrote:
> > On Wed, 27 Sep 2017, Michael Meissner wrote:
> >
> > > The glibc team has requested we define the standard macro
> > > (__FP_FAST_FMAF128) for PowerPC code when we have the IEEE 128-bit
> > > floating point hardware instructions enabled.
> >
> > It's not a standard macro. TS 18661-3 has FP_FAST_FMAF128 as an optional
> > math.h macro (but glibc doesn't define it anywhere at present).
> >
> > > This patch does this in the PowerPC backend. As I look at the whole
> > > issue, at some point we should do this more in the machine independent
> > > portion of the compiler. I have some initial patches to do this in the
> > > c-family files, but at the present time, the patches are not complete,
> > > and I need to think about it more.
> >
> > I think a machine-independent definition (for _FloatN / _FloatNx types in
> > general) should go along with machine-independent fmafN / fmafNx built-in
> > functions; when the built-in function is machine-specific, it's natural
> > for the macro to be as well.
> >
> > But in any case, the new macro should be documented in cpp.texi alongside
> > the existing __FP_FAST_FMA* macros (probably in the generic
> > __FP_FAST_FMAF@var{n} and __FP_FAST_FMAF@var{n}X form).
>
> This patch adds support for adding the built-in __builtin_fmaf<N> and
> __builtin_fmaf<N>x functions if the target machine supports an appropriate
> fused multiply-add (FMA) instruction. This patch replaces the original
> PowerPC specific patch.
>
> Because it involves changes in the built-in support, both the c and c-family
> subdirectories, as well as PowerPC changes, I added the global/release
> maintainers to the To: list.
>
> I have done a bootstrap and make check on a little endian Power8 with no
> regressions in the tests. I have verified that the changed and new tests
> both ran fine.
>
> I have also bootstrapped the changes on an x86-64 compiler, and it
> bootstrapped fine. I am currently running the unmodified build, but I'm not
> expecting any changes in the test suite.
>
> Assuming the x86-64 tests also have no regressions, can I check these changes
> into the trunk?
>
> [gcc]
> 2017-10-02 Michael Meissner <meiss...@linux.vnet.ibm.com>
>
> 	* builtins.def (BUILT_IN_FMAF16): Add support for fused
> 	multiply-add built-in functions for _Float<N> and _Float<N>x
> 	types.
> 	(BUILT_IN_FMAF32): Likewise.
> 	(BUILT_IN_FMAF64): Likewise.
> 	(BUILT_IN_FMAF128): Likewise.
> 	(BUILT_IN_FMAF32X): Likewise.
> 	(BUILT_IN_FMAF64X): Likewise.
> 	(BUILT_IN_FMAF128X): Likewise.
> 	* builtin-types.def (BT_FN_FLOAT16_FLOAT16_FLOAT16_FLOAT16):
> 	Likewise.
> 	(BT_FN_FLOAT32_FLOAT32_FLOAT32_FLOAT32): Likewise.
> 	(BT_FN_FLOAT64_FLOAT64_FLOAT64_FLOAT64): Likewise.
> 	(BT_FN_FLOAT128_FLOAT128_FLOAT128_FLOAT128): Likewise.
> 	(BT_FN_FLOAT32X_FLOAT32X_FLOAT32X_FLOAT32X): Likewise.
> 	(BT_FN_FLOAT64X_FLOAT64X_FLOAT64X_FLOAT64X): Likewise.
> 	(BT_FN_FLOAT128X_FLOAT128X_FLOAT128X_FLOAT128X): Likewise.
> 	* builtins.c (expand_builtin_mathfn_ternary): Likewise.
> 	(expand_builtin): Add fused multiply-add builtin support for
> 	_Float<N> and _Float<N>X types. Issue a warning if the machine
> 	does not provide an appropriate FMA insn.
> 	(fold_builtin_3): Add support for fused multiply-add built-in
> 	functions for _Float<N> and _Float<N>x types.
> 	* config/rs6000/rs6000-builtin.def (FMAF128): Delete creating
> 	__builtin_fmaf128, since this is now done in machine independent
> 	code.
> 	* doc/cpp.texi (__FP_FAST_FMAF16): Document macros set to declare
> 	that the appropriate fused multiply-add on _Float<N> and
> 	_Float<N>X types is implemented.
> 	(__FP_FAST_FMAF32): Likewise.
> 	(__FP_FAST_FMAF64): Likewise.
> 	(__FP_FAST_FMAF128): Likewise.
> 	(__FP_FAST_FMAF32X): Likewise.
> 	(__FP_FAST_FMAF64X): Likewise.
> 	(__FP_FAST_FMAF128X): Likewise.
>
> [gcc/c]
> 2017-10-02 Michael Meissner <meiss...@linux.vnet.ibm.com>
>
> 	* c-decl.c (header_for_builtin_fn): Add support for fused
> 	multiply-add built-in functions for _Float<N> and _Float<N>x
> 	types.
>
> [gcc/c-family]
> 2017-10-02 Michael Meissner <meiss...@linux.vnet.ibm.com>
>
> 	* c-cppbuiltin.c (mode_has_fma): Add support for PowerPC __float128
> 	FMA (KFmode) if long double != __float128.
> 	(c_cpp_builtins): Define __FP_FAST_FMAF<N> if _Float<N> fused
> 	multiply-add is supported. Define __FP_FAST_FMAF<N>X if
> 	_Float<N>x fused multiply-add is supported.
>
> [gcc/testsuite]
> 2017-10-02 Michael Meissner <meiss...@linux.vnet.ibm.com>
>
> 	* gcc.target/powerpc/float128-fma2.c: Change error to new
> 	warning.
> 	* gcc.target/powerpc/float128-fma3.c: New test.
>
>
> --
> Michael Meissner, IBM
> IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
> email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797

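For anyone wondering how user code might consume the new macros, here is a minimal sketch. It is not part of the patch. It assumes the compiler defines __FP_FAST_FMAF64 only when __builtin_fmaf64 expands to a hardware FMA, as the cpp.texi text describes. The names f64_t, mul_add and MUL_ADD_IS_FUSED are made up for illustration, and the fallback branch is an ordinary multiply followed by an add, so it is not a fused operation and may round twice.

```c
/* Sketch only: f64_t, mul_add and MUL_ADD_IS_FUSED are hypothetical names
   chosen for this example; they do not come from GCC or glibc.  */

#ifdef __FP_FAST_FMAF64
/* The compiler reports a fast fused multiply-add for _Float64.  */
typedef _Float64 f64_t;
#define MUL_ADD_IS_FUSED 1

static f64_t
mul_add (f64_t a, f64_t b, f64_t c)
{
  return __builtin_fmaf64 (a, b, c);
}
#else
/* No fast FMA reported: use a separate multiply and add.
   This is NOT fused; the product is rounded before the addition.  */
typedef double f64_t;
#define MUL_ADD_IS_FUSED 0

static f64_t
mul_add (f64_t a, f64_t b, f64_t c)
{
  return a * b + c;
}
#endif
```

A caller can then write mul_add (x, y, z) and check MUL_ADD_IS_FUSED where the single rounding matters. The same pattern would apply to the other widths (__FP_FAST_FMAF32, __FP_FAST_FMAF128, and so on).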
Index: gcc/builtins.def
===================================================================
--- gcc/builtins.def (revision 253358)
+++ gcc/builtins.def (working copy)
@@ -382,6 +382,9 @@ DEF_C99_C90RES_BUILTIN (BUILT_IN_FLOORL,
 DEF_C99_BUILTIN (BUILT_IN_FMA, "fma", BT_FN_DOUBLE_DOUBLE_DOUBLE_DOUBLE, ATTR_MATHFN_FPROUNDING)
 DEF_C99_BUILTIN (BUILT_IN_FMAF, "fmaf", BT_FN_FLOAT_FLOAT_FLOAT_FLOAT, ATTR_MATHFN_FPROUNDING)
 DEF_C99_BUILTIN (BUILT_IN_FMAL, "fmal", BT_FN_LONGDOUBLE_LONGDOUBLE_LONGDOUBLE_LONGDOUBLE, ATTR_MATHFN_FPROUNDING)
+#define FMA_TYPE(F) BT_FN_##F##_##F##_##F##_##F
+DEF_GCC_FLOATN_NX_BUILTINS (BUILT_IN_FMA, "fma", FMA_TYPE, ATTR_MATHFN_FPROUNDING)
+#undef FMA_TYPE
 DEF_C99_BUILTIN (BUILT_IN_FMAX, "fmax", BT_FN_DOUBLE_DOUBLE_DOUBLE, ATTR_CONST_NOTHROW_LEAF_LIST)
 DEF_C99_BUILTIN (BUILT_IN_FMAXF, "fmaxf", BT_FN_FLOAT_FLOAT_FLOAT, ATTR_CONST_NOTHROW_LEAF_LIST)
 DEF_C99_BUILTIN (BUILT_IN_FMAXL, "fmaxl", BT_FN_LONGDOUBLE_LONGDOUBLE_LONGDOUBLE, ATTR_CONST_NOTHROW_LEAF_LIST)
Index: gcc/builtin-types.def
===================================================================
--- gcc/builtin-types.def (revision 253358)
+++ gcc/builtin-types.def (working copy)
@@ -544,6 +544,20 @@ DEF_FUNCTION_TYPE_3 (BT_FN_DOUBLE_DOUBLE
                      BT_DOUBLE, BT_DOUBLE, BT_DOUBLE, BT_DOUBLE)
 DEF_FUNCTION_TYPE_3 (BT_FN_LONGDOUBLE_LONGDOUBLE_LONGDOUBLE_LONGDOUBLE,
                      BT_LONGDOUBLE, BT_LONGDOUBLE, BT_LONGDOUBLE, BT_LONGDOUBLE)
+DEF_FUNCTION_TYPE_3 (BT_FN_FLOAT16_FLOAT16_FLOAT16_FLOAT16,
+                     BT_FLOAT16, BT_FLOAT16, BT_FLOAT16, BT_FLOAT16)
+DEF_FUNCTION_TYPE_3 (BT_FN_FLOAT32_FLOAT32_FLOAT32_FLOAT32,
+                     BT_FLOAT32, BT_FLOAT32, BT_FLOAT32, BT_FLOAT32)
+DEF_FUNCTION_TYPE_3 (BT_FN_FLOAT64_FLOAT64_FLOAT64_FLOAT64,
+                     BT_FLOAT64, BT_FLOAT64, BT_FLOAT64, BT_FLOAT64)
+DEF_FUNCTION_TYPE_3 (BT_FN_FLOAT128_FLOAT128_FLOAT128_FLOAT128,
+                     BT_FLOAT128, BT_FLOAT128, BT_FLOAT128, BT_FLOAT128)
+DEF_FUNCTION_TYPE_3 (BT_FN_FLOAT32X_FLOAT32X_FLOAT32X_FLOAT32X,
+                     BT_FLOAT32X, BT_FLOAT32X, BT_FLOAT32X, BT_FLOAT32X)
+DEF_FUNCTION_TYPE_3 (BT_FN_FLOAT64X_FLOAT64X_FLOAT64X_FLOAT64X,
+                     BT_FLOAT64X, BT_FLOAT64X, BT_FLOAT64X, BT_FLOAT64X)
+DEF_FUNCTION_TYPE_3 (BT_FN_FLOAT128X_FLOAT128X_FLOAT128X_FLOAT128X,
+                     BT_FLOAT128X, BT_FLOAT128X, BT_FLOAT128X, BT_FLOAT128X)
 DEF_FUNCTION_TYPE_3 (BT_FN_FLOAT_FLOAT_FLOAT_INTPTR,
                      BT_FLOAT, BT_FLOAT, BT_FLOAT, BT_INT_PTR)
 DEF_FUNCTION_TYPE_3 (BT_FN_DOUBLE_DOUBLE_DOUBLE_INTPTR,
Index: gcc/builtins.c
===================================================================
--- gcc/builtins.c (revision 253358)
+++ gcc/builtins.c (working copy)
@@ -2067,6 +2067,7 @@ expand_builtin_mathfn_ternary (tree exp,
   switch (DECL_FUNCTION_CODE (fndecl))
     {
     CASE_FLT_FN (BUILT_IN_FMA):
+    CASE_FLT_FN_FLOATN_NX (BUILT_IN_FMA):
       builtin_optab = fma_optab; break;
     default:
       gcc_unreachable ();
@@ -6563,6 +6564,18 @@ expand_builtin (tree exp, rtx target, rt
         return target;
       break;
 
+      /* Warn if the user called __builtin_fmaf{32,64,128} and there is no fast
+         insn to support it. */
+    CASE_FLT_FN_FLOATN_NX (BUILT_IN_FMA):
+      target = expand_builtin_mathfn_ternary (exp, target, subtarget);
+      if (target)
+        return target;
+
+      warning_at (tree_nonartificial_location (exp), 0,
+                  "%KThe built-in function %<__builtin_fmafN ()%> may not be "
+                  "supported", exp);
+      break;
+
     CASE_FLT_FN (BUILT_IN_ILOGB):
       if (! flag_unsafe_math_optimizations)
         break;
@@ -8988,6 +9001,7 @@ fold_builtin_3 (location_t loc, tree fnd
       return fold_builtin_sincos (loc, arg0, arg1, arg2);
 
     CASE_FLT_FN (BUILT_IN_FMA):
+    CASE_FLT_FN_FLOATN_NX (BUILT_IN_FMA):
       return fold_builtin_fma (loc, arg0, arg1, arg2, type);
 
     CASE_FLT_FN (BUILT_IN_REMQUO):
Index: gcc/config/rs6000/rs6000-builtin.def
===================================================================
--- gcc/config/rs6000/rs6000-builtin.def (revision 253358)
+++ gcc/config/rs6000/rs6000-builtin.def (working copy)
@@ -2369,7 +2369,6 @@ BU_FLOAT128_2 (COPYSIGNQ, "copysignq",
    hardware. These functions use the new 'f128' suffix. Eventually these
    should be folded into the common built-in function handling. */
 BU_FLOAT128_1_HW (SQRTF128, "sqrtf128", CONST, sqrtkf2)
-BU_FLOAT128_3_HW (FMAF128, "fmaf128", CONST, fmakf4_hw)
 
 /* 1 argument crypto functions. */
 BU_CRYPTO_1 (VSBOX, "vsbox", CONST, crypto_vsbox)
Index: gcc/doc/cpp.texi
===================================================================
--- gcc/doc/cpp.texi (revision 253358)
+++ gcc/doc/cpp.texi (working copy)
@@ -2400,6 +2400,20 @@ was used). If 1 or more, it indicates t
 those requirements; this does not mean that all relevant language
 features are supported by GCC.
 
+@item __FP_FAST_FMAF16
+@itemx __FP_FAST_FMAF32
+@itemx __FP_FAST_FMAF64
+@itemx __FP_FAST_FMAF128
+@itemx __FP_FAST_FMAF32X
+@itemx __FP_FAST_FMAF64X
+@itemx __FP_FAST_FMAF128X
+This macro is defined with value 1 if the backend supports the
+@code{__builtin_fmaf16}, @code{__builtin_fmaf32},
+@code{__builtin_fmaf64}, @code{__builtin_fmaf128},
+@code{__builtin_fmaf32x}, @code{__builtin_fmaf64x}, or
+@code{__builtin_fmaf128x} builtin functions that do fused multiply-add
+on the types defined in IEEE 754 (IEC 60559).
+
 @item __NO_MATH_ERRNO__
 This macro is defined if @option{-fno-math-errno} is used, or enabled
 by another option such as @option{-ffast-math} or by default.
Index: gcc/c/c-decl.c
===================================================================
--- gcc/c/c-decl.c (revision 253358)
+++ gcc/c/c-decl.c (working copy)
@@ -3171,6 +3171,7 @@ header_for_builtin_fn (enum built_in_fun
     CASE_FLT_FN (BUILT_IN_FDIM):
     CASE_FLT_FN (BUILT_IN_FLOOR):
     CASE_FLT_FN (BUILT_IN_FMA):
+    CASE_FLT_FN_FLOATN_NX (BUILT_IN_FMA):
     CASE_FLT_FN (BUILT_IN_FMAX):
     CASE_FLT_FN (BUILT_IN_FMIN):
     CASE_FLT_FN (BUILT_IN_FMOD):
Index: gcc/c-family/c-cppbuiltin.c
===================================================================
--- gcc/c-family/c-cppbuiltin.c (revision 253358)
+++ gcc/c-family/c-cppbuiltin.c (working copy)
@@ -82,6 +82,11 @@ mode_has_fma (machine_mode mode)
       return !!HAVE_fmadf4;
 #endif
 
+#ifdef HAVE_fmakf4 /* PowerPC if long double != __float128. */
+    case E_KFmode:
+      return !!HAVE_fmakf4;
+#endif
+
 #ifdef HAVE_fmaxf4
     case E_XFmode:
       return !!HAVE_fmaxf4;
@@ -1119,7 +1124,7 @@ c_cpp_builtins (cpp_reader *pfile)
                floatn_nx_types[i].extended ? "X" : "");
       sprintf (csuffix, "F%d%s", floatn_nx_types[i].n,
                floatn_nx_types[i].extended ? "x" : "");
-      builtin_define_float_constants (prefix, csuffix, "%s", NULL,
+      builtin_define_float_constants (prefix, csuffix, "%s", csuffix,
                                       FLOATN_NX_TYPE_NODE (i));
     }
 
Index: gcc/testsuite/gcc.target/powerpc/float128-fma2.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/float128-fma2.c (revision 253358)
+++ gcc/testsuite/gcc.target/powerpc/float128-fma2.c (working copy)
@@ -5,5 +5,5 @@
 __float128
 xfma (__float128 a, __float128 b, __float128 c)
 {
-  return __builtin_fmaf128 (a, b, c); /* { dg-error "ISA 3.0 IEEE 128-bit" } */
+  return __builtin_fmaf128 (a, b, c); /* { dg-warning "__builtin_fmafN" } */
 }
Index: gcc/testsuite/gcc.target/powerpc/float128-fma3.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/float128-fma3.c (nonexistent)
+++ gcc/testsuite/gcc.target/powerpc/float128-fma3.c (working copy)
@@ -0,0 +1,63 @@
+/* { dg-do compile { target { powerpc*-*-* && lp64 } } } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-options "-mpower9-vector -O2" } */
+
+/* Make sure the appropriate FMA fast macros are defined. */
+
+#include <math.h>
+
+#ifdef __FP_FAST_FMAF
+float
+do_fmaf (float a, float b, float c)
+{
+  return __builtin_fmaf (a, b, c);
+}
+#else
+#error "__FP_FAST_FMAF should be defined"
+#endif
+
+#ifdef __FP_FAST_FMAF32
+_Float32
+do_fmaf32 (_Float32 a, _Float32 b, _Float32 c)
+{
+  return __builtin_fmaf32 (a, b, -c);
+}
+#else
+#error "__FP_FAST_FMAF32 should be defined"
+#endif
+
+#ifdef __FP_FAST_FMA
+double
+do_fma (double a, double b, double c)
+{
+  return __builtin_fma (a, b, c);
+}
+#else
+#error "__FP_FAST_FMA should be defined"
+#endif
+
+#ifdef __FP_FAST_FMAF64
+_Float64
+do_fmaf64 (_Float64 a, _Float64 b, _Float64 c)
+{
+  return __builtin_fmaf64 (a, b, -c);
+}
+#else
+#error "__FP_FAST_FMAF64 should be defined"
+#endif
+
+#ifdef __FP_FAST_FMAF128
+_Float128
+do_fmaf128 (_Float128 a, _Float128 b, _Float128 c)
+{
+  return __builtin_fmaf128 (a, b, c);
+}
+#else
+#error "__FP_FAST_FMAF128 should be defined"
+#endif
+
+/* { dg-final { scan-assembler {\mfmadds\M|\mxsmadd.sp\M} } } */
+/* { dg-final { scan-assembler {\mfmsubs\M|\mxsmsub.sp\M} } } */
+/* { dg-final { scan-assembler {\mfmadd\M|\mxsmadd.dp\M} } } */
+/* { dg-final { scan-assembler {\mfmsub\M|\mxsmsub.dp\M} } } */
+/* { dg-final { scan-assembler {\mxsmaddqp\M} } } */
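
As a side note on the builtins.def hunk: the FMA_TYPE helper builds each function-type name by token pasting the type tag four times. The standalone sketch below reproduces only that macro and stringifies the result so the expansion can be inspected. STR, XSTR and the two accessor functions are illustrative additions, and the bare FLOAT128 / FLOAT32X arguments are my assumption about the tokens the DEF_GCC_FLOATN_NX_BUILTINS machinery passes in; that machinery itself is not reproduced here.

```c
/* Same definition as in the builtins.def hunk.  */
#define FMA_TYPE(F) BT_FN_##F##_##F##_##F##_##F

/* Illustrative helpers (not from GCC): XSTR expands its argument first,
   then STR turns the resulting token into a string literal.  */
#define STR(x) #x
#define XSTR(x) STR (x)

/* Name FMA_TYPE would produce for the _Float128 variant, assuming the
   tag passed in is FLOAT128.  */
static const char *
fma_type_name_float128 (void)
{
  return XSTR (FMA_TYPE (FLOAT128));
}

/* Likewise for the _Float32x variant, assuming the tag is FLOAT32X.  */
static const char *
fma_type_name_float32x (void)
{
  return XSTR (FMA_TYPE (FLOAT32X));
}
```

The strings this yields correspond to the BT_FN_FLOAT128_FLOAT128_FLOAT128_FLOAT128 and BT_FN_FLOAT32X_FLOAT32X_FLOAT32X_FLOAT32X entries that the builtin-types.def hunk adds, which is why both files have to change together.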