Re: [PATCH] Defer pow (C, x) folding until after vectorization always (PR middle-end/82004)
On February 19, 2018 11:02:50 PM GMT+01:00, Jakub Jelinek wrote:
>Hi!
>
>While I've over-simplified the testcase and so this patch doesn't help
>the 628.pop2_s miscompare, I still believe it is beneficial to defer this
>folding until late for these reasons:
>1) if we propagate a constant into the second pow argument too, it will
>   be likely more precise than going through the exp (cst * x) way
>2) except when C is M_E, pow is fewer operations and thus smaller IL
>
>Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK.

Richard.

>2018-02-19  Jakub Jelinek
>
>    PR middle-end/82004
>    * match.pd (pow(C,x) -> exp(log(C)*x)): Delay all folding until
>    after vectorization.
>
>    * gfortran.dg/pr82004.f90: New test.
>
>--- gcc/match.pd.jj 2018-02-15 12:15:51.655780636 +0100
>+++ gcc/match.pd 2018-02-19 17:38:06.390763194 +0100
>@@ -4006,7 +4006,14 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>  (simplify
>   (pows REAL_CST@0 @1)
>   (if (real_compare (GT_EXPR, TREE_REAL_CST_PTR (@0), )
>-       && real_isfinite (TREE_REAL_CST_PTR (@0)))
>+       && real_isfinite (TREE_REAL_CST_PTR (@0))
>+       /* As libmvec doesn't have a vectorized exp2, defer optimizing
>+          the use_exp2 case until after vectorization.  It seems actually
>+          beneficial for all constants to postpone this until later,
>+          because exp(log(C)*x), while faster, will have worse precision
>+          and if x folds into a constant too, that is unnecessary
>+          pessimization.  */
>+       && canonicalize_math_after_vectorization_p ())
>    (with {
>      const REAL_VALUE_TYPE *const value = TREE_REAL_CST_PTR (@0);
>      bool use_exp2 = false;
>@@ -4021,10 +4028,7 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>     }
>     (if (!use_exp2)
>      (exps (mult (logs @0) @1))
>-     /* As libmvec doesn't have a vectorized exp2, defer optimizing
>-        this until after vectorization.  */
>-     (if (canonicalize_math_after_vectorization_p ())
>-      (exp2s (mult (log2s @0) @1
>+     (exp2s (mult (log2s @0) @1)))
>
> (for sqrts (SQRT)
>      cbrts (CBRT)
>--- gcc/testsuite/gfortran.dg/pr82004.f90.jj 2018-02-19 17:58:57.435682156 +0100
>+++ gcc/testsuite/gfortran.dg/pr82004.f90 2018-02-19 17:58:34.127684892 +0100
>@@ -0,0 +1,18 @@
>+! PR middle-end/82004
>+! { dg-do run }
>+! { dg-options "-Ofast" }
>+
>+  integer, parameter :: r8 = selected_real_kind(13), i4 = kind(1)
>+  integer (i4), parameter :: a = 400, b = 2
>+  real (r8), parameter, dimension(b) :: c = (/ .001_r8, 10.00_r8 /)
>+  real (r8) :: d, e, f, g, h
>+  real (r8), parameter :: j &
>+    = 10**(log10(c(1))-(log10(c(b))-log10(c(1)))/real(a))
>+
>+  d = c(1)
>+  e = c(b)
>+  f = (log10(e)-log10(d))/real(a)
>+  g = log10(d) - f
>+  h = 10**(g)
>+  if (h.ne.j) stop 1
>+end
>
>    Jakub
[PATCH] Defer pow (C, x) folding until after vectorization always (PR middle-end/82004)
Hi!

While I've over-simplified the testcase and so this patch doesn't help
the 628.pop2_s miscompare, I still believe it is beneficial to defer this
folding until late for these reasons:
1) if we propagate a constant into the second pow argument too, it will
   be likely more precise than going through the exp (cst * x) way
2) except when C is M_E, pow is fewer operations and thus smaller IL

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2018-02-19  Jakub Jelinek

    PR middle-end/82004
    * match.pd (pow(C,x) -> exp(log(C)*x)): Delay all folding until
    after vectorization.

    * gfortran.dg/pr82004.f90: New test.

--- gcc/match.pd.jj 2018-02-15 12:15:51.655780636 +0100
+++ gcc/match.pd 2018-02-19 17:38:06.390763194 +0100
@@ -4006,7 +4006,14 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
  (simplify
   (pows REAL_CST@0 @1)
   (if (real_compare (GT_EXPR, TREE_REAL_CST_PTR (@0), )
-       && real_isfinite (TREE_REAL_CST_PTR (@0)))
+       && real_isfinite (TREE_REAL_CST_PTR (@0))
+       /* As libmvec doesn't have a vectorized exp2, defer optimizing
+          the use_exp2 case until after vectorization.  It seems actually
+          beneficial for all constants to postpone this until later,
+          because exp(log(C)*x), while faster, will have worse precision
+          and if x folds into a constant too, that is unnecessary
+          pessimization.  */
+       && canonicalize_math_after_vectorization_p ())
    (with {
      const REAL_VALUE_TYPE *const value = TREE_REAL_CST_PTR (@0);
      bool use_exp2 = false;
@@ -4021,10 +4028,7 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
     }
     (if (!use_exp2)
      (exps (mult (logs @0) @1))
-     /* As libmvec doesn't have a vectorized exp2, defer optimizing
-        this until after vectorization.  */
-     (if (canonicalize_math_after_vectorization_p ())
-      (exp2s (mult (log2s @0) @1
+     (exp2s (mult (log2s @0) @1)))

 (for sqrts (SQRT)
      cbrts (CBRT)
--- gcc/testsuite/gfortran.dg/pr82004.f90.jj 2018-02-19 17:58:57.435682156 +0100
+++ gcc/testsuite/gfortran.dg/pr82004.f90 2018-02-19 17:58:34.127684892 +0100
@@ -0,0 +1,18 @@
+! PR middle-end/82004
+! { dg-do run }
+! { dg-options "-Ofast" }
+
+  integer, parameter :: r8 = selected_real_kind(13), i4 = kind(1)
+  integer (i4), parameter :: a = 400, b = 2
+  real (r8), parameter, dimension(b) :: c = (/ .001_r8, 10.00_r8 /)
+  real (r8) :: d, e, f, g, h
+  real (r8), parameter :: j &
+    = 10**(log10(c(1))-(log10(c(b))-log10(c(1)))/real(a))
+
+  d = c(1)
+  e = c(b)
+  f = (log10(e)-log10(d))/real(a)
+  g = log10(d) - f
+  h = 10**(g)
+  if (h.ne.j) stop 1
+end

    Jakub
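
For readers who want to see reason 1) in numbers, here is a rough sketch that is not part of the patch. It repeats the arithmetic of the pr82004.f90 test in Python and evaluates 10**g both as a single pow call and as exp(log(C)*x). Python's math.pow, math.exp and math.log only stand in for the libm calls GCC would emit, and the helper names pow_direct, pow_via_exp and relative_difference are made up for this illustration. Whether the two results differ, and by how much, depends on the platform's libm.

```python
import math


def pow_direct(c: float, x: float) -> float:
    """c**x as a single pow call (the form kept while folding is deferred)."""
    return math.pow(c, x)


def pow_via_exp(c: float, x: float) -> float:
    """c**x rewritten as exp(log(c) * x), as in the non-exp2 branch of the rule."""
    return math.exp(math.log(c) * x)


def relative_difference(reference: float, value: float) -> float:
    """Absolute difference between the two values, scaled by the reference."""
    return abs(value - reference) / abs(reference)


# Same arithmetic as the pr82004.f90 test, in double precision.
d, e, a = 0.001, 10.0, 400
f = (math.log10(e) - math.log10(d)) / a
g = math.log10(d) - f

direct = pow_direct(10.0, g)
folded = pow_via_exp(10.0, g)

print(f"g                   = {g!r}")
print(f"pow(10, g)          = {direct!r}")
print(f"exp(log(10) * g)    = {folded!r}")
print(f"relative difference = {relative_difference(direct, folded):.3e}")
```

The test in the patch compares h and j for exact equality, so even a difference in the last bit between the two evaluation paths is enough to make it fail.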