Re: [PATCH], Add support for __builtin_{sqrt,fma}f128 on PowerPC ISA 3.0

2017-09-14 Thread Michael Meissner
On Thu, Sep 14, 2017 at 09:54:14AM -0500, Segher Boessenkool wrote:
> On Wed, Sep 13, 2017 at 05:46:00PM -0400, Michael Meissner wrote:
> > This patch adds support on PowerPC ISA 3.0 for the built-in function
> > __builtin_sqrtf128 generating the XSSQRTQP hardware square root instruction 
> > and
> > the built-in function __builtin_fmaf128 generating XSMADDQP, XSMSUBQP,
> > XSNMADDQP, and XSNMSUBQP fused multiply-add instructions.
> > 
> > While I was at it, I changed the documentation so that it no longer 
> > documents
> > the 'q' built-in functions (to mirror libquadmath) but instead just 
> > documented
> > the 'f128' functions that matches glibc 2.26 and the technical report that
> > added the _FloatF128 date.
> > 
> > I changed the tests that used __fabsq to use __fabsf128 instead.
> > 
> > I also added && lp64 to float128-5.c so that it doesn't cause errors when 
> > doing
> > the test for a 32-bit target.  This is due to the fact that if you enable
> > hardware IEEE 128-bit floating point, you eventually will need TImode
> > supported, and that is not supported on 32-bit targets.
> > 
> > I did a bootstrap and make check with subversion id 252033 on a little 
> > endian
> > power8 system.  The subversion id 252033 is one of the last svn ids that
> > bootstrap without additional patches on the PowerPC.  There were no 
> > regressions
> > in this patch, and I verified the 4 new tests were run.  Can I check this 
> > patch
> > into the trunk?
> 
> Yes please.  A few trivial things:
> 
> > * doc/extend.texi (RS/6000 built-in functions): Document the
> > 'f128' IEEE 128-bit floating point built-in functions.  Don't
> > document the older 'q' versions of the functions. Document the
> > built-in IEEE 128-bit floating point square root and fused
> > multiply-add built-ins.
> 
> Dot space space.
> 
> > +/* 1 argument IEEE 128-bit floating point functions that require ISA 3.0
> > +   hardware.  We define both a 'q' version for libquadmath compatibility, 
> > and a
> > +   'f128' for glibc 2.26.  We didn't need this for FABS/COPYSIGN, since the
> > +   machine independent built-in support already defines the F128 versions, 
> >  */
> 
> Dot instead of comma?
> 
> > --- gcc/testsuite/gcc.target/powerpc/float128-5.c   (revision 252730)
> > +++ gcc/testsuite/gcc.target/powerpc/float128-5.c   (working copy)
> > @@ -1,4 +1,4 @@
> > -/* { dg-do compile { target { powerpc*-*-linux* } } } */
> > +/* { dg-do compile { target { powerpc*-*-linux* && lp64 } } } */
> 
> Maybe add a comment why this is -m64 only?

I removed the changes to the documentation that didn't pertain to the functions
I was adding (we can clean that up some other time), and added a -m64 comment.

I checked this into the trunk:

[gcc]
2017-09-14  Michael Meissner  

* config/rs6000/rs6000-builtin.def (BU_FLOAT128_1_HW): New macros
to support float128 built-in functions that require the ISA 3.0
hardware.
(BU_FLOAT128_3_HW): Likewise.
(SQRTF128): Add support for the IEEE 128-bit square root and fma
built-in functions.
(FMAF128): Likewise.
(FMAQ): Likewise.
* config/rs6000/rs6000.c (rs6000_builtin_mask_calculate): Add
support for built-in functions that need the ISA 3.0 IEEE 128-bit
floating point instructions.
(rs6000_invalid_builtin): Likewise.
(rs6000_builtin_mask_names): Likewise.
* config/rs6000/rs6000.h (MASK_FLOAT128_HW): Likewise.
(RS6000_BTM_FLOAT128_HW): Likewise.
(RS6000_BTM_COMMON): Likewise.
* config/rs6000/rs6000.md (fma4_hw): Add a generator
function.
* doc/extend.texi (RS/6000 built-in functions): Document the
'f128' IEEE 128-bit floating point built-in functions. Don't
document the older 'q' versions of the functions. Document the
built-in IEEE 128-bit floating point square root and fused
multiply-add built-ins.

[gcc/testsuite]
2017-09-14  Michael Meissner  

* gcc.target/powerpc/abs128-1.c: Use __builtin_fabsf128 instead of
__builtin_fabsq.
* gcc.target/powerpc/float128-5.c: Use __builtin_fabsf128 instead
of __builtin_fabsq.  Prevent the test from running on 32-bit.
* gcc.target/powerpc/float128-fma1.c: New test.
* gcc.target/powerpc/float128-fma2.c: Likewise.
* gcc.target/powerpc/float128-sqrt1.c: Likewise.
* gcc.target/powerpc/float128-sqrt2.c: Likewise.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797
Index: gcc/config/rs6000/rs6000-builtin.def
===
--- gcc/config/rs6000/rs6000-builtin.def(revision 252768)
+++ gcc/config/rs6000/rs6000-builtin.def(working copy)
@@ -667,6 +667,23 @@
 | RS6000_BTC_UNARY),

Re: [PATCH], Add support for __builtin_{sqrt,fma}f128 on PowerPC ISA 3.0

2017-09-14 Thread Michael Meissner
On Wed, Sep 13, 2017 at 10:49:43PM +, Joseph Myers wrote:
> On Wed, 13 Sep 2017, Michael Meissner wrote:
> 
> > This patch adds support on PowerPC ISA 3.0 for the built-in function
> > __builtin_sqrtf128 generating the XSSQRTQP hardware square root instruction 
> > and
> > the built-in function __builtin_fmaf128 generating XSMADDQP, XSMSUBQP,
> > XSNMADDQP, and XSNMSUBQP fused multiply-add instructions.
> 
> Is there a reason for these to be architecture-specific rather than 
> generic everywhere _Float128 is supported?  (With the fmaf128 / sqrtf128 
> names available as well as the __builtin_* variants of those.)

I wanted to get in the PowerPC stuff ASAP, so the library people can start
using it.

I do think for at least some of the built-ins (sqrt, fma, lrint, etc.) it makes
sense, and at some point I was going to look at it.

In the grand scheme of things, it is only a temporary measure, and the real end
goal is to enable switching long double to be float128.  However, that will
take multiple releases to get there.

> Full support for _FloatN/_FloatNx variants of all the existing built-in 
> functions might be complicated, and run into potential issues with startup 
> cost of creating large numbers of extra built-in functions (it's 
> desirable, but possibly hard, which is why I excluded it from the initial 
> _FloatN / _FloatNx support patches).  But adding just these two functions 
> to builtins.def and making them fold / expand appropriately ought to be 
> much simpler.  (I realise sqrt goes through internal-fn.def and 
> DEF_INTERNAL_FLT_FN expects a particular set of functions for standard 
> types, so maybe some duplication would be involved to get the built-in 
> function expanded appropriately, i.e. using an insn pattern or a call to 
> an external sqrtf128 function according to whether such an insn pattern is 
> available.  fma ought not to involve much more than adding an extra case 
> where CASE_FLT_FN (BUILT_IN_FMA) is used.)

Yeah, but I wanted to get the easy stuff in there right now before looking at
the machine independent support.

> > While I was at it, I changed the documentation so that it no longer 
> > documents
> > the 'q' built-in functions (to mirror libquadmath) but instead just 
> > documented
> > the 'f128' functions that matches glibc 2.26 and the technical report that
> > added the _FloatF128 date.
> 
> Those *f128 built-in functions (inf / huge_val / nan / nans / fabs / 
> copysign) are not target-specific; they exist for all _FloatN / _FloatNx 
> types for all targets with such types.  So it doesn't seem appropriate to 
> document them in a target-specific section of the manual, beyond a brief 
> cross-reference to the documentation of the functions as 
> target-independent.

Right now we just document a few of the 'q' functions that were added before
you added the f128 versions.  I was trying to harmonize things.  Originally, I
was going to make both __builtin_sqrtq and __builtin_sqrtf128, but Segher and
David didn't want that.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797



Re: [PATCH], Add support for __builtin_{sqrt,fma}f128 on PowerPC ISA 3.0

2017-09-14 Thread Segher Boessenkool
On Wed, Sep 13, 2017 at 05:46:00PM -0400, Michael Meissner wrote:
> This patch adds support on PowerPC ISA 3.0 for the built-in function
> __builtin_sqrtf128 generating the XSSQRTQP hardware square root instruction 
> and
> the built-in function __builtin_fmaf128 generating XSMADDQP, XSMSUBQP,
> XSNMADDQP, and XSNMSUBQP fused multiply-add instructions.
> 
> While I was at it, I changed the documentation so that it no longer documents
> the 'q' built-in functions (to mirror libquadmath) but instead just documented
> the 'f128' functions that matches glibc 2.26 and the technical report that
> added the _FloatF128 date.
> 
> I changed the tests that used __fabsq to use __fabsf128 instead.
> 
> I also added && lp64 to float128-5.c so that it doesn't cause errors when 
> doing
> the test for a 32-bit target.  This is due to the fact that if you enable
> hardware IEEE 128-bit floating point, you eventually will need TImode
> supported, and that is not supported on 32-bit targets.
> 
> I did a bootstrap and make check with subversion id 252033 on a little endian
> power8 system.  The subversion id 252033 is one of the last svn ids that
> bootstrap without additional patches on the PowerPC.  There were no 
> regressions
> in this patch, and I verified the 4 new tests were run.  Can I check this 
> patch
> into the trunk?

Yes please.  A few trivial things:

>   * doc/extend.texi (RS/6000 built-in functions): Document the
>   'f128' IEEE 128-bit floating point built-in functions.  Don't
>   document the older 'q' versions of the functions. Document the
>   built-in IEEE 128-bit floating point square root and fused
>   multiply-add built-ins.

Dot space space.

> +/* 1 argument IEEE 128-bit floating point functions that require ISA 3.0
> +   hardware.  We define both a 'q' version for libquadmath compatibility, 
> and a
> +   'f128' for glibc 2.26.  We didn't need this for FABS/COPYSIGN, since the
> +   machine independent built-in support already defines the F128 versions,  
> */

Dot instead of comma?

> --- gcc/testsuite/gcc.target/powerpc/float128-5.c (revision 252730)
> +++ gcc/testsuite/gcc.target/powerpc/float128-5.c (working copy)
> @@ -1,4 +1,4 @@
> -/* { dg-do compile { target { powerpc*-*-linux* } } } */
> +/* { dg-do compile { target { powerpc*-*-linux* && lp64 } } } */

Maybe add a comment why this is -m64 only?

Thanks,


Segher


Re: [PATCH], Add support for __builtin_{sqrt,fma}f128 on PowerPC ISA 3.0

2017-09-13 Thread Joseph Myers
On Wed, 13 Sep 2017, Michael Meissner wrote:

> This patch adds support on PowerPC ISA 3.0 for the built-in function
> __builtin_sqrtf128 generating the XSSQRTQP hardware square root instruction 
> and
> the built-in function __builtin_fmaf128 generating XSMADDQP, XSMSUBQP,
> XSNMADDQP, and XSNMSUBQP fused multiply-add instructions.

Is there a reason for these to be architecture-specific rather than 
generic everywhere _Float128 is supported?  (With the fmaf128 / sqrtf128 
names available as well as the __builtin_* variants of those.)

Full support for _FloatN/_FloatNx variants of all the existing built-in 
functions might be complicated, and run into potential issues with startup 
cost of creating large numbers of extra built-in functions (it's 
desirable, but possibly hard, which is why I excluded it from the initial 
_FloatN / _FloatNx support patches).  But adding just these two functions 
to builtins.def and making them fold / expand appropriately ought to be 
much simpler.  (I realise sqrt goes through internal-fn.def and 
DEF_INTERNAL_FLT_FN expects a particular set of functions for standard 
types, so maybe some duplication would be involved to get the built-in 
function expanded appropriately, i.e. using an insn pattern or a call to 
an external sqrtf128 function according to whether such an insn pattern is 
available.  fma ought not to involve much more than adding an extra case 
where CASE_FLT_FN (BUILT_IN_FMA) is used.)

> While I was at it, I changed the documentation so that it no longer documents
> the 'q' built-in functions (to mirror libquadmath) but instead just documented
> the 'f128' functions that matches glibc 2.26 and the technical report that
> added the _FloatF128 date.

Those *f128 built-in functions (inf / huge_val / nan / nans / fabs / 
copysign) are not target-specific; they exist for all _FloatN / _FloatNx 
types for all targets with such types.  So it doesn't seem appropriate to 
document them in a target-specific section of the manual, beyond a brief 
cross-reference to the documentation of the functions as 
target-independent.

-- 
Joseph S. Myers
jos...@codesourcery.com


[PATCH], Add support for __builtin_{sqrt,fma}f128 on PowerPC ISA 3.0

2017-09-13 Thread Michael Meissner
This patch adds support on PowerPC ISA 3.0 for the built-in function
__builtin_sqrtf128 generating the XSSQRTQP hardware square root instruction and
the built-in function __builtin_fmaf128 generating XSMADDQP, XSMSUBQP,
XSNMADDQP, and XSNMSUBQP fused multiply-add instructions.

While I was at it, I changed the documentation so that it no longer documents
the 'q' built-in functions (to mirror libquadmath) but instead just documented
the 'f128' functions that matches glibc 2.26 and the technical report that
added the _FloatF128 date.

I changed the tests that used __fabsq to use __fabsf128 instead.

I also added && lp64 to float128-5.c so that it doesn't cause errors when doing
the test for a 32-bit target.  This is due to the fact that if you enable
hardware IEEE 128-bit floating point, you eventually will need TImode
supported, and that is not supported on 32-bit targets.

I did a bootstrap and make check with subversion id 252033 on a little endian
power8 system.  The subversion id 252033 is one of the last svn ids that
bootstrap without additional patches on the PowerPC.  There were no regressions
in this patch, and I verified the 4 new tests were run.  Can I check this patch
into the trunk?

[gcc]
2017-09-13  Michael Meissner  

* config/rs6000/rs6000-builtin.def (BU_FLOAT128_1_HW): New macros
to support float128 built-in functions that require the ISA 3.0
hardware.
(BU_FLOAT128_3_HW): Likewise.
(SQRTF128): Add support for the IEEE 128-bit square root and fma
built-in functions.
(FMAF128): Likewise.
(FMAQ): Likewise.
* config/rs6000/rs6000.c (rs6000_builtin_mask_calculate): Add
support for built-in functions that need the ISA 3.0 IEEE 128-bit
floating point instructions.
(rs6000_invalid_builtin): Likewise.
(rs6000_builtin_mask_names): Likewise.
* config/rs6000/rs6000.h (MASK_FLOAT128_HW): Likewise.
(RS6000_BTM_FLOAT128_HW): Likewise.
(RS6000_BTM_COMMON): Likewise.
* config/rs6000/rs6000.md (fma4_hw): Add a generator
function.
* doc/extend.texi (RS/6000 built-in functions): Document the
'f128' IEEE 128-bit floating point built-in functions.  Don't
document the older 'q' versions of the functions. Document the
built-in IEEE 128-bit floating point square root and fused
multiply-add built-ins.

[gcc/testsuite]
2017-09-13  Michael Meissner  

* gcc.target/powerpc/abs128-1.c: Use __builtin_fabsf128 instead of
__builtin_fabsq.
* gcc.target/powerpc/float128-5.c: Use __builtin_fabsf128 instead
of __builtin_fabsq.  Prevent the test from running on 32-bit.
* gcc.target/powerpc/float128-fma1.c: New test.
* gcc.target/powerpc/float128-fma2.c: Likewise.
* gcc.target/powerpc/float128-sqrt1.c: Likewise.
* gcc.target/powerpc/float128-sqrt2.c: Likewise.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797
Index: gcc/config/rs6000/rs6000-builtin.def
===
--- gcc/config/rs6000/rs6000-builtin.def(revision 252730)
+++ gcc/config/rs6000/rs6000-builtin.def(working copy)
@@ -667,6 +667,23 @@
 | RS6000_BTC_UNARY),   \
CODE_FOR_ ## ICODE) /* ICODE */
 
+/* IEEE 128-bit floating-point builtins that need the ISA 3.0 hardware.  */
+#define BU_FLOAT128_1_HW(ENUM, NAME, ATTR, ICODE)   \
+  RS6000_BUILTIN_1 (MISC_BUILTIN_ ## ENUM,  /* ENUM */  \
+   "__builtin_" NAME,  /* NAME */  \
+   RS6000_BTM_FLOAT128_HW, /* MASK */  \
+   (RS6000_BTC_ ## ATTR/* ATTR */  \
+| RS6000_BTC_UNARY),   \
+   CODE_FOR_ ## ICODE) /* ICODE */
+
+#define BU_FLOAT128_3_HW(ENUM, NAME, ATTR, ICODE)   \
+  RS6000_BUILTIN_3 (MISC_BUILTIN_ ## ENUM,  /* ENUM */  \
+   "__builtin_" NAME,  /* NAME */  \
+   RS6000_BTM_FLOAT128_HW, /* MASK */  \
+   (RS6000_BTC_ ## ATTR/* ATTR */  \
+| RS6000_BTC_TERNARY), \
+   CODE_FOR_ ## ICODE) /* ICODE */
+
 /* Miscellaneous builtins for instructions added in ISA 3.0.  These
instructions don't require either the DFP or VSX options, just the basic
ISA 3.0 enablement since they operate on general purpose registers.  */
@@ -2328,6 +2345,16 @@ BU_FLOAT128_1 (FABSQ,"fabsq",   CO
 
 /* 2 argument IEEE 128-bit