Re: [PATCH #2], Define __FP_FAST_FMAF128 on PowerPC ISA 3.0

2017-10-04 Thread Segher Boessenkool
Hi!

On Mon, Oct 02, 2017 at 07:52:50PM -0400, Michael Meissner wrote:
> Whoops, I forgot to attach the patch.

Heh.  The rs6000 parts are of course okay (trivial / obvious, but maybe
you are waiting for an ack).

Thanks,


Segher


Re: [PATCH #2], Define __FP_FAST_FMAF128 on PowerPC ISA 3.0

2017-10-02 Thread Joseph Myers
On Mon, 2 Oct 2017, Michael Meissner wrote:

> > > But in any case, the new macro should be documented in cpp.texi alongside 
> > > the existing __FP_FAST_FMA* macros (probably in the generic 
> > > __FP_FAST_FMAF@var{n} and __FP_FAST_FMAF@var{n}X form).
> > 
> > This patch adds support for adding the built-in __builtin_fmaf and
> > __builtin_fmafx functions if the target machine supports an appropriate
> > fused multiply-add (FMA) instruction.  This patch replaces the original 
> > PowerPC
> > specific patch.

Certainly the  FP_FAST_FMA* macros are supposed to relate to 
whether the public functions such as fmaf128 are fast rather than to 
__builtin_* names.

I think there's a strong case that you should provide built-in functions 
under the public names when defining __FP_FAST_FMA*.  I.e., add a variant 
of DEF_GCC_FLOATN_NX_BUILTINS that uses DEF_EXT_LIB_BUILTIN instead of 
DEF_GCC_BUILTIN, and use that for the new built-in functions.

Then, the built-in functions, in whatever form they are provided, should 
be documented in extend.texi, alongside the documentation of 
__builtin_fabsf@var{n} etc. (with, of course, the caveats about 
availability when appropriate instruction support isn't available - the 
__builtin_fabsfN, __builtin_copysignfN functions are always inlined, the 
fma ones aren't, and people may well lack C library support for the 
underlying functions).

Given that, I don't think the warning about lack of instruction support is 
appropriate; a call to __builtin_fmaf128, if the type is supported but 
there is no corresponding instruction (on x86_64, say), would just fall 
back to calling an external fmaf128 function (which in that case would 
work with glibc 2.26 or later, though calls to fmaf64 etc. wouldn't), much 
like any other such built-in function (we don't warn about e.g. calling 
__builtin_clog10 on systems whose C library may not have clog10).

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH #2], Define __FP_FAST_FMAF128 on PowerPC ISA 3.0

2017-10-02 Thread Michael Meissner
Whoops, I forgot to attach the patch.

On Mon, Oct 02, 2017 at 07:51:00PM -0400, Michael Meissner wrote:
> On Thu, Sep 28, 2017 at 12:40:24AM +, Joseph Myers wrote:
> > On Wed, 27 Sep 2017, Michael Meissner wrote:
> > 
> > > The glibc team has requested we define the standard macro 
> > > (__FP_FAST_FMAF128)
> > > for PowerPC code when we have the IEEE 128-bit floating point hardware
> > > instructions enabled.
> > 
> > It's not a standard macro.  TS 18661-3 has FP_FAST_FMAF128 as an optional 
> > math.h macro (but glibc doesn't define it anywhere at present).
> > 
> > > This patch does this in the PowerPC backend.  As I look at the whole 
> > > issue, at
> > > some point we should do this more in the machine independent portion of 
> > > the
> > > compiler.  I have some initial patches to do this in the c-family files, 
> > > but at
> > > the present time, the patches are not complete, and I need to think about 
> > > it
> > > more.
> > 
> > I think a machine-independent definition (for _FloatN / _FloatNx types in 
> > general) should go along with machine-independent fmafN / fmafNx built-in 
> > functions; when the built-in function is machine-specific, it's natural 
> > for the macro to be as well.
> > 
> > But in any case, the new macro should be documented in cpp.texi alongside 
> > the existing __FP_FAST_FMA* macros (probably in the generic 
> > __FP_FAST_FMAF@var{n} and __FP_FAST_FMAF@var{n}X form).
> 
> This patch adds support for adding the built-in __builtin_fmaf and
> __builtin_fmafx functions if the target machine supports an appropriate
> fused multiply-add (FMA) instruction.  This patch replaces the original 
> PowerPC
> specific patch.
> 
> Because it involves changes in the built-in support, both the c and c-family
> subdirectories, as well as PowerPC changes, I added the global/release
> maintainers to the To: list.
> 
> I have done a bootstrap and make check on a little endian Power8 with no
> regresions in the tests.  I have verified that the changed and new tests both
> ran fine.
> 
> I have also bootstrapped the changes on an x86-64 compiler, and it 
> bootstrapped
> fine.  I am currently running the unmodified build, but I'm not expecting any
> changes in the test suite.
> 
> Assuming the x86-64 tests also have no regressions, can I check these changes
> into the trunk?
> 
> [gcc]
> 2017-10-02  Michael Meissner  
> 
>   * builtins.def (BUILT_IN_FMAF16): Add support for fused
>   multiply-add built-in functions for _Float and _Floatx
>   types.
>   (BUILT_IN_FMAF32): Likewise.
>   (BUILT_IN_FMAF64): Likewise.
>   (BUILT_IN_FMAF128): Likewise.
>   (BUILT_IN_FMAF32X): Likewise.
>   (BUILT_IN_FMAF64X): Likewise.
>   (BUILT_IN_FMAF128X): Likewise.
>   * builtin-types.def (BT_FN_FLOAT16_FLOAT16_FLOAT16_FLOAT16):
>   Likewise.
>   (BT_FN_FLOAT32_FLOAT32_FLOAT32_FLOAT32): Likewise.
>   (BT_FN_FLOAT64_FLOAT64_FLOAT64_FLOAT64): Likewise.
>   (BT_FN_FLOAT128_FLOAT128_FLOAT128_FLOAT128): Likewise.
>   (BT_FN_FLOAT32X_FLOAT32X_FLOAT32X_FLOAT32X): Likewise.
>   (BT_FN_FLOAT64X_FLOAT64X_FLOAT64X_FLOAT64X): Likewise.
>   (BT_FN_FLOAT128X_FLOAT128X_FLOAT128X_FLOAT128X): Likewise.
>   * builtins.c (expand_builtin_mathfn_ternary): Likewise.
>   (expand_builtin): Add fused multiply-add builtin support for
>   _Float and _FloatX types.  Issue a warning if the machine
>   does not provide an appropriate FMA insn.
>   (fold_builtin_3): Add support for fused multiply-add built-in
>   functions for _Float and _Floatx types.
>   * config/rs6000/rs6000-builtins.def (FMAF128): Delete creating
>   __builtin_fmaf128, since this is now done in machine independent
>   code.
>   * doc/cpp.texi (__FP_FAST_FMAF16): Document macros set to declare
>   that the appropriate fused multiply-add on _Float and
>   _FloatX types is implemented.
>   (__FP_FAST_FMAF32): Likewise.
>   (__FP_FAST_FMAF64): Likewise.
>   (__FP_FAST_FMAF128): Likewise.
>   (__FP_FAST_FMAF32X): Likewise.
>   (__FP_FAST_FMAF64X): Likewise.
>   (__FP_FAST_FMAF128X): Likewise.
> 
> [gcc/c]
> 2017-10-02  Michael Meissner  
> 
>   * c-decl.c (header_for_builtin_fn): Add support for fused
>   multiply-add built-in functions for _Float and _Floatx
>   types.
> 
> [gcc/c-family]
> 2017-10-02  Michael Meissner  
> 
>   * c-cppbuiltin.c (mode_has_fma): Add support for PowerPC _float128
>   FMA (KFmode) if long double != __float128.
>   (c_cpp_builtins): Define __FP_FAST_FMAF if _Float fused
>   multiply-add is supported.  Define __FP_FAST_FMAFX if
>   _Floatx fused multiply-add is supported.
> 
> [gcc/testsuite]
> 2017-10-02  Michael Meissner  
> 
>   * gcc.target/powerpc/float128-fma2.c: Change error to new
>   warning.
>   * gcc.target/powerpc/float128-fma3.c: 

Re: [PATCH #2], Define __FP_FAST_FMAF128 on PowerPC ISA 3.0

2017-10-02 Thread Michael Meissner
On Thu, Sep 28, 2017 at 12:40:24AM +, Joseph Myers wrote:
> On Wed, 27 Sep 2017, Michael Meissner wrote:
> 
> > The glibc team has requested we define the standard macro 
> > (__FP_FAST_FMAF128)
> > for PowerPC code when we have the IEEE 128-bit floating point hardware
> > instructions enabled.
> 
> It's not a standard macro.  TS 18661-3 has FP_FAST_FMAF128 as an optional 
> math.h macro (but glibc doesn't define it anywhere at present).
> 
> > This patch does this in the PowerPC backend.  As I look at the whole issue, 
> > at
> > some point we should do this more in the machine independent portion of the
> > compiler.  I have some initial patches to do this in the c-family files, 
> > but at
> > the present time, the patches are not complete, and I need to think about it
> > more.
> 
> I think a machine-independent definition (for _FloatN / _FloatNx types in 
> general) should go along with machine-independent fmafN / fmafNx built-in 
> functions; when the built-in function is machine-specific, it's natural 
> for the macro to be as well.
> 
> But in any case, the new macro should be documented in cpp.texi alongside 
> the existing __FP_FAST_FMA* macros (probably in the generic 
> __FP_FAST_FMAF@var{n} and __FP_FAST_FMAF@var{n}X form).

This patch adds support for adding the built-in __builtin_fmaf and
__builtin_fmafx functions if the target machine supports an appropriate
fused multiply-add (FMA) instruction.  This patch replaces the original PowerPC
specific patch.

Because it involves changes in the built-in support, both the c and c-family
subdirectories, as well as PowerPC changes, I added the global/release
maintainers to the To: list.

I have done a bootstrap and make check on a little endian Power8 with no
regresions in the tests.  I have verified that the changed and new tests both
ran fine.

I have also bootstrapped the changes on an x86-64 compiler, and it bootstrapped
fine.  I am currently running the unmodified build, but I'm not expecting any
changes in the test suite.

Assuming the x86-64 tests also have no regressions, can I check these changes
into the trunk?

[gcc]
2017-10-02  Michael Meissner  

* builtins.def (BUILT_IN_FMAF16): Add support for fused
multiply-add built-in functions for _Float and _Floatx
types.
(BUILT_IN_FMAF32): Likewise.
(BUILT_IN_FMAF64): Likewise.
(BUILT_IN_FMAF128): Likewise.
(BUILT_IN_FMAF32X): Likewise.
(BUILT_IN_FMAF64X): Likewise.
(BUILT_IN_FMAF128X): Likewise.
* builtin-types.def (BT_FN_FLOAT16_FLOAT16_FLOAT16_FLOAT16):
Likewise.
(BT_FN_FLOAT32_FLOAT32_FLOAT32_FLOAT32): Likewise.
(BT_FN_FLOAT64_FLOAT64_FLOAT64_FLOAT64): Likewise.
(BT_FN_FLOAT128_FLOAT128_FLOAT128_FLOAT128): Likewise.
(BT_FN_FLOAT32X_FLOAT32X_FLOAT32X_FLOAT32X): Likewise.
(BT_FN_FLOAT64X_FLOAT64X_FLOAT64X_FLOAT64X): Likewise.
(BT_FN_FLOAT128X_FLOAT128X_FLOAT128X_FLOAT128X): Likewise.
* builtins.c (expand_builtin_mathfn_ternary): Likewise.
(expand_builtin): Add fused multiply-add builtin support for
_Float and _FloatX types.  Issue a warning if the machine
does not provide an appropriate FMA insn.
(fold_builtin_3): Add support for fused multiply-add built-in
functions for _Float and _Floatx types.
* config/rs6000/rs6000-builtins.def (FMAF128): Delete creating
__builtin_fmaf128, since this is now done in machine independent
code.
* doc/cpp.texi (__FP_FAST_FMAF16): Document macros set to declare
that the appropriate fused multiply-add on _Float and
_FloatX types is implemented.
(__FP_FAST_FMAF32): Likewise.
(__FP_FAST_FMAF64): Likewise.
(__FP_FAST_FMAF128): Likewise.
(__FP_FAST_FMAF32X): Likewise.
(__FP_FAST_FMAF64X): Likewise.
(__FP_FAST_FMAF128X): Likewise.

[gcc/c]
2017-10-02  Michael Meissner  

* c-decl.c (header_for_builtin_fn): Add support for fused
multiply-add built-in functions for _Float and _Floatx
types.

[gcc/c-family]
2017-10-02  Michael Meissner  

* c-cppbuiltin.c (mode_has_fma): Add support for PowerPC _float128
FMA (KFmode) if long double != __float128.
(c_cpp_builtins): Define __FP_FAST_FMAF if _Float fused
multiply-add is supported.  Define __FP_FAST_FMAFX if
_Floatx fused multiply-add is supported.

[gcc/testsuite]
2017-10-02  Michael Meissner  

* gcc.target/powerpc/float128-fma2.c: Change error to new
warning.
* gcc.target/powerpc/float128-fma3.c: New test.


-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797