Re: [PATCH v2] Simplify pow with constant

2017-08-25 Thread Wilco Dijkstra
Jeff Law wrote:
> Right.  exp is painful in glibc, but pow is *dramatically* more painful
> and likely always will be.
>
> Siddhesh did some great work in bringing those costs down in glibc but
> the more code we can reasonably shunt into exp instead of pow, the better.
>
> It's likely pow will always be significantly more expensive than exp.
> It's also likely that predicting when these functions are going to go
> off the fast paths is painful.

With a modern implementation there won't be any slow path - it's completely
unnecessary, and you can get 100x speedup by simply doing things in a
sane way.

Szabolc's version of powf is almost literally doing exp(log(x) * y), so exp is
about twice as fast as pow.

Wilco

Re: [PATCH v2] Simplify pow with constant

2017-08-24 Thread Jeff Law
On 08/17/2017 09:43 AM, Alexander Monakov wrote:
> On Thu, 17 Aug 2017, Wilco Dijkstra wrote:
> 
>> This patch simplifies pow (C, x) into exp (x * C1) if C > 0, C1 = log (C).
> 
> Note this changes the outcome for C == +Inf, x == 0 (pow is specified to
> return 1.0 in that case, but x * C1 == NaN).  There's another existing
> transform with the same issue, 'pow(expN(x), y) -> expN(x*y)', so this is
> not a new problem.
That may be part of the reason why they're guarded with -ffast-math.  In
addition to the changes in accuracy.  I haven't followed those
transformations on the GCC side to know with any certainty.

jeff


Re: [PATCH v2] Simplify pow with constant

2017-08-24 Thread Jeff Law
On 08/17/2017 07:56 AM, Wilco Dijkstra wrote:
> This patch simplifies pow (C, x) into exp (x * C1) if C > 0, C1 = log (C).
> Do this only for fast-math as accuracy is reduced.  This is much faster
> since pow is more complex than exp - with current GLIBC the speedup is
> more than 7 times for this transformation.
Right.  exp is painful in glibc, but pow is *dramatically* more painful
and likely always will be.

Siddhesh did some great work in bringing those costs down in glibc but
the more code we can reasonably shunt into exp instead of pow, the better.

It's likely pow will always be significantly more expensive than exp.
It's also likely that predicting when these functions are going to go
off the fast paths is painful.

I think Richi already ack'd the V2 patch.  Just wanted to chime in given
I've been fairly deep on this in the past with customers on the glibc side.

jeff


Re: [PATCH v2] Simplify pow with constant

2017-08-18 Thread Richard Biener
On Fri, Aug 18, 2017 at 2:47 PM, Wilco Dijkstra  wrote:
> Alexander Monakov wrote:
>>
>> Note this changes the outcome for C == +Inf, x == 0 (pow is specified to
>> return 1.0 in that case, but x * C1 == NaN).  There's another existing
>> transform with the same issue, 'pow(expN(x), y) -> expN(x*y)', so this is
>> not a new problem.
>>
>> The whole set of these match.pd transforms is guarded by
>> flag_unsafe_math_optimizations, which is a bit strange, on the one hand
>> it does not include -ffinite-math-only, but on the other hand it's
>> defined broadly enough to imply that.
>
> Yes I was assuming that unsafe_math_optimizations was enough for these
> transformations to be... safe. I've added an isfinite check so that case 
> works fine.
> It looks we need to go through the more complex transformations (especially
> given pow has so many special cases) and add more finite_math checks.
> Here's the new version:
>
>
> This patch simplifies pow (C, x) into exp (x * C1) if C > 0, C1 = log (C).
> Do this only for fast-math as accuracy is reduced.  This is much faster
> since pow is more complex than exp - with current GLIBC the speedup is
> more than 7 times for this transformation.
>
> The worst-case ULP error of the transformation for powf (10.0, x) in SPEC
> was 2.5.  If we allow use of exp10 in match.pd, the ULP error would be lower.

You can use exp10/pow10 in match.pd but it will only match if the programmer
already uses those functions as they are not C standard.  There's also exp2
which is a C99 function.  Is there any good reason to convert exp (C * x)
to exp2 (C' * x)?  (besides if C' becomes 1)

So eventually we can match pow (10., x) -> exp10 and pow (2., x) -> exp2
(though I believe in canonicalization as well -- all exp calls could
be canonicalized
to pow calls to simplify the cases we need to handle...).

The patch is ok.

Thanks,
Richard.

> ChangeLog:
> 2017-08-18  Wilco Dijkstra  
>
> * match.pd: Add pow (C, x) simplification.
> --
> diff --git a/gcc/match.pd b/gcc/match.pd
> index 
> 0e36f46b914bc63c257cef47152ab1aa507963e5..a5552c5096de5100a882d52add6b620ba87c1f72
>  100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -3622,6 +3622,16 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> (logs (pows @0 @1))
> (mult @1 (logs @0
>
> + /* pow(C,x) -> exp(log(C)*x) if C > 0.  */
> + (for pows (POW)
> +  exps (EXP)
> +  logs (LOG)
> +  (simplify
> +   (pows REAL_CST@0 @1)
> +(if (real_compare (GT_EXPR, TREE_REAL_CST_PTR (@0), )
> +&& real_isfinite (TREE_REAL_CST_PTR (@0)))
> + (exps (mult (logs @0) @1)
> +
>   (for sqrts (SQRT)
>cbrts (CBRT)
>pows (POW)
>
>


Re: [PATCH v2] Simplify pow with constant

2017-08-18 Thread Wilco Dijkstra
Alexander Monakov wrote:
>
> Note this changes the outcome for C == +Inf, x == 0 (pow is specified to
> return 1.0 in that case, but x * C1 == NaN).  There's another existing
> transform with the same issue, 'pow(expN(x), y) -> expN(x*y)', so this is
> not a new problem.
>
> The whole set of these match.pd transforms is guarded by
> flag_unsafe_math_optimizations, which is a bit strange, on the one hand
> it does not include -ffinite-math-only, but on the other hand it's
> defined broadly enough to imply that.

Yes I was assuming that unsafe_math_optimizations was enough for these
transformations to be... safe. I've added an isfinite check so that case works 
fine.
It looks we need to go through the more complex transformations (especially
given pow has so many special cases) and add more finite_math checks.
Here's the new version:


This patch simplifies pow (C, x) into exp (x * C1) if C > 0, C1 = log (C).
Do this only for fast-math as accuracy is reduced.  This is much faster
since pow is more complex than exp - with current GLIBC the speedup is
more than 7 times for this transformation.

The worst-case ULP error of the transformation for powf (10.0, x) in SPEC
was 2.5.  If we allow use of exp10 in match.pd, the ULP error would be lower.

ChangeLog:
2017-08-18  Wilco Dijkstra  

* match.pd: Add pow (C, x) simplification.
--
diff --git a/gcc/match.pd b/gcc/match.pd
index 
0e36f46b914bc63c257cef47152ab1aa507963e5..a5552c5096de5100a882d52add6b620ba87c1f72
 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -3622,6 +3622,16 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
(logs (pows @0 @1))
(mult @1 (logs @0
 
+ /* pow(C,x) -> exp(log(C)*x) if C > 0.  */
+ (for pows (POW)
+  exps (EXP)
+  logs (LOG)
+  (simplify
+   (pows REAL_CST@0 @1)
+(if (real_compare (GT_EXPR, TREE_REAL_CST_PTR (@0), )
+&& real_isfinite (TREE_REAL_CST_PTR (@0)))
+ (exps (mult (logs @0) @1)
+
  (for sqrts (SQRT)
   cbrts (CBRT)
   pows (POW)



Re: [PATCH v2] Simplify pow with constant

2017-08-18 Thread Richard Biener
On Thu, Aug 17, 2017 at 5:43 PM, Alexander Monakov  wrote:
> On Thu, 17 Aug 2017, Wilco Dijkstra wrote:
>
>> This patch simplifies pow (C, x) into exp (x * C1) if C > 0, C1 = log (C).
>
> Note this changes the outcome for C == +Inf, x == 0 (pow is specified to
> return 1.0 in that case, but x * C1 == NaN).  There's another existing
> transform with the same issue, 'pow(expN(x), y) -> expN(x*y)', so this is
> not a new problem.
>
> The whole set of these match.pd transforms is guarded by
> flag_unsafe_math_optimizations, which is a bit strange, on the one hand
> it does not include -ffinite-math-only, but on the other hand it's
> defined broadly enough to imply that.

No, flag_unsafe_math_optimization doesn't imply -ffinite-math-only.
-funsafe-math-optimization these days should guard transforms that
affect the value of the result (but not its classification).  There are
more specific variants like -fassociative-math which also affects
results but in a more specific way.

> (to be clear, I'm not objecting to the patch, just pointing out an
> existing inconsistency in case someone can offer clarifications)

We shouldn't add such issue knowingly, I guess for this case it's easy
to check that C is not +Inf.

For the existing transforms with the same issue guarding with
-ffinite-math-only is an appropriate fix.

Richard.

> Alexander


Re: [PATCH v2] Simplify pow with constant

2017-08-17 Thread Alexander Monakov
On Thu, 17 Aug 2017, Wilco Dijkstra wrote:

> This patch simplifies pow (C, x) into exp (x * C1) if C > 0, C1 = log (C).

Note this changes the outcome for C == +Inf, x == 0 (pow is specified to
return 1.0 in that case, but x * C1 == NaN).  There's another existing
transform with the same issue, 'pow(expN(x), y) -> expN(x*y)', so this is
not a new problem.

The whole set of these match.pd transforms is guarded by
flag_unsafe_math_optimizations, which is a bit strange, on the one hand
it does not include -ffinite-math-only, but on the other hand it's
defined broadly enough to imply that.

(to be clear, I'm not objecting to the patch, just pointing out an
existing inconsistency in case someone can offer clarifications)

Alexander


Re: [PATCH v2] Simplify pow with constant

2017-08-17 Thread Wilco Dijkstra
This patch simplifies pow (C, x) into exp (x * C1) if C > 0, C1 = log (C).
Do this only for fast-math as accuracy is reduced.  This is much faster
since pow is more complex than exp - with current GLIBC the speedup is
more than 7 times for this transformation.

The worst-case ULP error of the transformation for powf (10.0, x) in SPEC
was 2.5.  If we allow use of exp10 in match.pd, the ULP error would be lower.

OK for commit?

ChangeLog:
2017-08-17  Wilco Dijkstra  

* match.pd: Add pow (C, x) simplification.
--

diff --git a/gcc/match.pd b/gcc/match.pd
index 
0e36f46b914bc63c257cef47152ab1aa507963e5..a614917c8c9f24bba20c521e9b5f558be4886813
 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -3622,6 +3622,15 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
(logs (pows @0 @1))
(mult @1 (logs @0
 
+ /* pow(C,x) -> exp(log(C)*x) if C > 0.  */
+ (for pows (POW)
+  exps (EXP)
+  logs (LOG)
+  (simplify
+   (pows REAL_CST@0 @1)
+(if (real_compare (GT_EXPR, TREE_REAL_CST_PTR (@0), ))
+ (exps (mult (logs @0) @1)
+
  (for sqrts (SQRT)
   cbrts (CBRT)
   pows (POW)