Re: [match.pd] PR83750 - CSE erf/erfc pair

2021-10-25 Thread Richard Biener via Gcc-patches
On Fri, 22 Oct 2021, Prathamesh Kulkarni wrote:

> On Fri, 22 Oct 2021 at 14:56, Richard Biener  wrote:
> >
> > On Fri, 22 Oct 2021, Prathamesh Kulkarni wrote:
> >
> > > On Wed, 20 Oct 2021 at 18:21, Richard Biener  wrote:
> > > >
> > > > On Wed, 20 Oct 2021, Prathamesh Kulkarni wrote:
> > > >
> > > > > On Tue, 19 Oct 2021 at 16:55, Richard Biener  
> > > > > wrote:
> > > > > >
> > > > > > On Tue, 19 Oct 2021, Prathamesh Kulkarni wrote:
> > > > > >
> > > > > > > On Tue, 19 Oct 2021 at 13:02, Richard Biener 
> > > > > > >  wrote:
> > > > > > > >
> > > > > > > > On Tue, Oct 19, 2021 at 9:03 AM Prathamesh Kulkarni via 
> > > > > > > > Gcc-patches
> > > > > > > >  wrote:
> > > > > > > > >
> > > > > > > > > On Mon, 18 Oct 2021 at 17:23, Richard Biener 
> > > > > > > > >  wrote:
> > > > > > > > > >
> > > > > > > > > > On Mon, 18 Oct 2021, Prathamesh Kulkarni wrote:
> > > > > > > > > >
> > > > > > > > > > > On Mon, 18 Oct 2021 at 17:10, Richard Biener 
> > > > > > > > > > >  wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > On Mon, 18 Oct 2021, Prathamesh Kulkarni wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > On Mon, 18 Oct 2021 at 16:18, Richard Biener 
> > > > > > > > > > > > >  wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Mon, 18 Oct 2021, Prathamesh Kulkarni wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Hi Richard,
> > > > > > > > > > > > > > > As suggested in PR, I have attached WIP patch 
> > > > > > > > > > > > > > > that adds two patterns
> > > > > > > > > > > > > > > to match.pd:
> > > > > > > > > > > > > > > erfc(x) --> 1 - erf(x) if canonicalize_math_p() 
> > > > > > > > > > > > > > > and,
> > > > > > > > > > > > > > > 1 - erf(x) --> erfc(x) if !canonicalize_math_p().
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > This works to remove call to erfc for the 
> > > > > > > > > > > > > > > following test:
> > > > > > > > > > > > > > > double f(double x)
> > > > > > > > > > > > > > > {
> > > > > > > > > > > > > > >   double g(double, double);
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >   double t1 = __builtin_erf (x);
> > > > > > > > > > > > > > >   double t2 = __builtin_erfc (x);
> > > > > > > > > > > > > > >   return g(t1, t2);
> > > > > > > > > > > > > > > }
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > with .optimized dump shows:
> > > > > > > > > > > > > > >   t1_2 = __builtin_erf (x_1(D));
> > > > > > > > > > > > > > >   t2_3 = 1.0e+0 - t1_2;
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > However, for the following test:
> > > > > > > > > > > > > > > double f(double x)
> > > > > > > > > > > > > > > {
> > > > > > > > > > > > > > >   double g(double, double);
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >   double t1 = __builtin_erfc (x);
> > > > > > > > > > > > > > >   return t1;
> > > > > > > > > > > > > > > }
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > It canonicalizes erfc(x) to 1 - erf(x), but does 
> > > > > > > > > > > > > > > not transform 1 -
> > > > > > > > > > > > > > > erf(x) to erfc(x) again
> > > > > > > > > > > > > > > post canonicalization.
> > > > > > > > > > > > > > > -fdump-tree-folding shows that 1 - erf(x) --> 
> > > > > > > > > > > > > > > erfc(x) gets applied,
> > > > > > > > > > > > > > > but then it tries to
> > > > > > > > > > > > > > > resimplify erfc(x), which fails post 
> > > > > > > > > > > > > > > canonicalization. So we end up
> > > > > > > > > > > > > > > with erfc(x) transformed to
> > > > > > > > > > > > > > > 1 - erf(x) in .optimized dump, which I suppose 
> > > > > > > > > > > > > > > isn't ideal.
> > > > > > > > > > > > > > > Could you suggest how to proceed ?
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I applied your patch manually and it does the 
> > > > > > > > > > > > > > intended
> > > > > > > > > > > > > > simplifications so I wonder what I am missing?
> > > > > > > > > > > > > Would it be OK to always fold erfc(x) -> 1 - erf(x) 
> > > > > > > > > > > > > even when there's
> > > > > > > > > > > > > no erf(x) in the source ?
> > > > > > > > > > > >
> > > > > > > > > > > > I do think it's reasonable to expect erfc to be 
> > > > > > > > > > > > available when erf
> > > > > > > > > > > > is and vice versa but note both are C99 specified 
> > > > > > > > > > > > functions (either
> > > > > > > > > > > > requires -lm).
> > > > > > > > > > > OK, thanks. Would it be OK to commit the patch after 
> > > > > > > > > > > bootstrap+test ?
> > > > > > > > > >
> > > > > > > > > > Yes, but I'm confused because you say the patch doesn't 
> > > > > > > > > > work for you?
> > > > > > > > > The patch works for me to CSE erf/erfc pair.
> > > > > > > > > However when there's only erfc in the source, it 
> > > > > > > > > canonicalizes erfc(x)
> > > > > > > > > to 1 - erf(x) but later fails to uncanonicalize 1 - erf(x) 
> > > > > > > > > back to
> > > > > > > > > erfc(x)
> > > > > > > > > with -O3 -funsafe-math-optimizations.
> 

Re: [match.pd] PR83750 - CSE erf/erfc pair

2021-10-22 Thread Prathamesh Kulkarni via Gcc-patches
On Fri, 22 Oct 2021 at 14:56, Richard Biener  wrote:
>
> On Fri, 22 Oct 2021, Prathamesh Kulkarni wrote:
>
> > On Wed, 20 Oct 2021 at 18:21, Richard Biener  wrote:
> > >
> > > On Wed, 20 Oct 2021, Prathamesh Kulkarni wrote:
> > >
> > > > On Tue, 19 Oct 2021 at 16:55, Richard Biener  wrote:
> > > > >
> > > > > On Tue, 19 Oct 2021, Prathamesh Kulkarni wrote:
> > > > >
> > > > > > On Tue, 19 Oct 2021 at 13:02, Richard Biener 
> > > > > >  wrote:
> > > > > > >
> > > > > > > On Tue, Oct 19, 2021 at 9:03 AM Prathamesh Kulkarni via 
> > > > > > > Gcc-patches
> > > > > > >  wrote:
> > > > > > > >
> > > > > > > > On Mon, 18 Oct 2021 at 17:23, Richard Biener 
> > > > > > > >  wrote:
> > > > > > > > >
> > > > > > > > > On Mon, 18 Oct 2021, Prathamesh Kulkarni wrote:
> > > > > > > > >
> > > > > > > > > > On Mon, 18 Oct 2021 at 17:10, Richard Biener 
> > > > > > > > > >  wrote:
> > > > > > > > > > >
> > > > > > > > > > > On Mon, 18 Oct 2021, Prathamesh Kulkarni wrote:
> > > > > > > > > > >
> > > > > > > > > > > > On Mon, 18 Oct 2021 at 16:18, Richard Biener 
> > > > > > > > > > > >  wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Mon, 18 Oct 2021, Prathamesh Kulkarni wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > Hi Richard,
> > > > > > > > > > > > > > As suggested in PR, I have attached WIP patch that 
> > > > > > > > > > > > > > adds two patterns
> > > > > > > > > > > > > > to match.pd:
> > > > > > > > > > > > > > erfc(x) --> 1 - erf(x) if canonicalize_math_p() and,
> > > > > > > > > > > > > > 1 - erf(x) --> erfc(x) if !canonicalize_math_p().
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > This works to remove call to erfc for the following 
> > > > > > > > > > > > > > test:
> > > > > > > > > > > > > > double f(double x)
> > > > > > > > > > > > > > {
> > > > > > > > > > > > > >   double g(double, double);
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >   double t1 = __builtin_erf (x);
> > > > > > > > > > > > > >   double t2 = __builtin_erfc (x);
> > > > > > > > > > > > > >   return g(t1, t2);
> > > > > > > > > > > > > > }
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > with .optimized dump shows:
> > > > > > > > > > > > > >   t1_2 = __builtin_erf (x_1(D));
> > > > > > > > > > > > > >   t2_3 = 1.0e+0 - t1_2;
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > However, for the following test:
> > > > > > > > > > > > > > double f(double x)
> > > > > > > > > > > > > > {
> > > > > > > > > > > > > >   double g(double, double);
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >   double t1 = __builtin_erfc (x);
> > > > > > > > > > > > > >   return t1;
> > > > > > > > > > > > > > }
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > It canonicalizes erfc(x) to 1 - erf(x), but does 
> > > > > > > > > > > > > > not transform 1 -
> > > > > > > > > > > > > > erf(x) to erfc(x) again
> > > > > > > > > > > > > > post canonicalization.
> > > > > > > > > > > > > > -fdump-tree-folding shows that 1 - erf(x) --> 
> > > > > > > > > > > > > > erfc(x) gets applied,
> > > > > > > > > > > > > > but then it tries to
> > > > > > > > > > > > > > resimplify erfc(x), which fails post 
> > > > > > > > > > > > > > canonicalization. So we end up
> > > > > > > > > > > > > > with erfc(x) transformed to
> > > > > > > > > > > > > > 1 - erf(x) in .optimized dump, which I suppose 
> > > > > > > > > > > > > > isn't ideal.
> > > > > > > > > > > > > > Could you suggest how to proceed ?
> > > > > > > > > > > > >
> > > > > > > > > > > > > I applied your patch manually and it does the intended
> > > > > > > > > > > > > simplifications so I wonder what I am missing?
> > > > > > > > > > > > Would it be OK to always fold erfc(x) -> 1 - erf(x) 
> > > > > > > > > > > > even when there's
> > > > > > > > > > > > no erf(x) in the source ?
> > > > > > > > > > >
> > > > > > > > > > > I do think it's reasonable to expect erfc to be available 
> > > > > > > > > > > when erf
> > > > > > > > > > > is and vice versa but note both are C99 specified 
> > > > > > > > > > > functions (either
> > > > > > > > > > > requires -lm).
> > > > > > > > > > OK, thanks. Would it be OK to commit the patch after 
> > > > > > > > > > bootstrap+test ?
> > > > > > > > >
> > > > > > > > > Yes, but I'm confused because you say the patch doesn't work 
> > > > > > > > > for you?
> > > > > > > > The patch works for me to CSE erf/erfc pair.
> > > > > > > > However when there's only erfc in the source, it canonicalizes 
> > > > > > > > erfc(x)
> > > > > > > > to 1 - erf(x) but later fails to uncanonicalize 1 - erf(x) back 
> > > > > > > > to
> > > > > > > > erfc(x)
> > > > > > > > with -O3 -funsafe-math-optimizations.
> > > > > > > >
> > > > > > > > For,
> > > > > > > > t1 = __builtin_erfc(x),
> > > > > > > >
> > > > > > > > .optimized dump shows:
> > > > > > > >   _2 = __builtin_erf (x_1(D));
> > > > > > > >   t1_3 = 1.0e+0 - _2;
> > > > > > > >
> > > > > > > > and for,
> > > > > > > > double t1 = x + __builtin_erfc(x);
> > > > > > > >

Re: [match.pd] PR83750 - CSE erf/erfc pair

2021-10-22 Thread Richard Biener via Gcc-patches
On Fri, 22 Oct 2021, Prathamesh Kulkarni wrote:

> On Wed, 20 Oct 2021 at 18:21, Richard Biener  wrote:
> >
> > On Wed, 20 Oct 2021, Prathamesh Kulkarni wrote:
> >
> > > On Tue, 19 Oct 2021 at 16:55, Richard Biener  wrote:
> > > >
> > > > On Tue, 19 Oct 2021, Prathamesh Kulkarni wrote:
> > > >
> > > > > On Tue, 19 Oct 2021 at 13:02, Richard Biener 
> > > > >  wrote:
> > > > > >
> > > > > > On Tue, Oct 19, 2021 at 9:03 AM Prathamesh Kulkarni via Gcc-patches
> > > > > >  wrote:
> > > > > > >
> > > > > > > On Mon, 18 Oct 2021 at 17:23, Richard Biener  
> > > > > > > wrote:
> > > > > > > >
> > > > > > > > On Mon, 18 Oct 2021, Prathamesh Kulkarni wrote:
> > > > > > > >
> > > > > > > > > On Mon, 18 Oct 2021 at 17:10, Richard Biener 
> > > > > > > > >  wrote:
> > > > > > > > > >
> > > > > > > > > > On Mon, 18 Oct 2021, Prathamesh Kulkarni wrote:
> > > > > > > > > >
> > > > > > > > > > > On Mon, 18 Oct 2021 at 16:18, Richard Biener 
> > > > > > > > > > >  wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > On Mon, 18 Oct 2021, Prathamesh Kulkarni wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > Hi Richard,
> > > > > > > > > > > > > As suggested in PR, I have attached WIP patch that 
> > > > > > > > > > > > > adds two patterns
> > > > > > > > > > > > > to match.pd:
> > > > > > > > > > > > > erfc(x) --> 1 - erf(x) if canonicalize_math_p() and,
> > > > > > > > > > > > > 1 - erf(x) --> erfc(x) if !canonicalize_math_p().
> > > > > > > > > > > > >
> > > > > > > > > > > > > This works to remove call to erfc for the following 
> > > > > > > > > > > > > test:
> > > > > > > > > > > > > double f(double x)
> > > > > > > > > > > > > {
> > > > > > > > > > > > >   double g(double, double);
> > > > > > > > > > > > >
> > > > > > > > > > > > >   double t1 = __builtin_erf (x);
> > > > > > > > > > > > >   double t2 = __builtin_erfc (x);
> > > > > > > > > > > > >   return g(t1, t2);
> > > > > > > > > > > > > }
> > > > > > > > > > > > >
> > > > > > > > > > > > > with .optimized dump shows:
> > > > > > > > > > > > >   t1_2 = __builtin_erf (x_1(D));
> > > > > > > > > > > > >   t2_3 = 1.0e+0 - t1_2;
> > > > > > > > > > > > >
> > > > > > > > > > > > > However, for the following test:
> > > > > > > > > > > > > double f(double x)
> > > > > > > > > > > > > {
> > > > > > > > > > > > >   double g(double, double);
> > > > > > > > > > > > >
> > > > > > > > > > > > >   double t1 = __builtin_erfc (x);
> > > > > > > > > > > > >   return t1;
> > > > > > > > > > > > > }
> > > > > > > > > > > > >
> > > > > > > > > > > > > It canonicalizes erfc(x) to 1 - erf(x), but does not 
> > > > > > > > > > > > > transform 1 -
> > > > > > > > > > > > > erf(x) to erfc(x) again
> > > > > > > > > > > > > post canonicalization.
> > > > > > > > > > > > > -fdump-tree-folding shows that 1 - erf(x) --> erfc(x) 
> > > > > > > > > > > > > gets applied,
> > > > > > > > > > > > > but then it tries to
> > > > > > > > > > > > > resimplify erfc(x), which fails post 
> > > > > > > > > > > > > canonicalization. So we end up
> > > > > > > > > > > > > with erfc(x) transformed to
> > > > > > > > > > > > > 1 - erf(x) in .optimized dump, which I suppose isn't 
> > > > > > > > > > > > > ideal.
> > > > > > > > > > > > > Could you suggest how to proceed ?
> > > > > > > > > > > >
> > > > > > > > > > > > I applied your patch manually and it does the intended
> > > > > > > > > > > > simplifications so I wonder what I am missing?
> > > > > > > > > > > Would it be OK to always fold erfc(x) -> 1 - erf(x) even 
> > > > > > > > > > > when there's
> > > > > > > > > > > no erf(x) in the source ?
> > > > > > > > > >
> > > > > > > > > > I do think it's reasonable to expect erfc to be available 
> > > > > > > > > > when erf
> > > > > > > > > > is and vice versa but note both are C99 specified functions 
> > > > > > > > > > (either
> > > > > > > > > > requires -lm).
> > > > > > > > > OK, thanks. Would it be OK to commit the patch after 
> > > > > > > > > bootstrap+test ?
> > > > > > > >
> > > > > > > > Yes, but I'm confused because you say the patch doesn't work 
> > > > > > > > for you?
> > > > > > > The patch works for me to CSE erf/erfc pair.
> > > > > > > However when there's only erfc in the source, it canonicalizes 
> > > > > > > erfc(x)
> > > > > > > to 1 - erf(x) but later fails to uncanonicalize 1 - erf(x) back to
> > > > > > > erfc(x)
> > > > > > > with -O3 -funsafe-math-optimizations.
> > > > > > >
> > > > > > > For,
> > > > > > > t1 = __builtin_erfc(x),
> > > > > > >
> > > > > > > .optimized dump shows:
> > > > > > >   _2 = __builtin_erf (x_1(D));
> > > > > > >   t1_3 = 1.0e+0 - _2;
> > > > > > >
> > > > > > > and for,
> > > > > > > double t1 = x + __builtin_erfc(x);
> > > > > > >
> > > > > > > .optimized dump shows:
> > > > > > >   _3 = __builtin_erf (x_2(D));
> > > > > > >   _7 = x_2(D) + 1.0e+0;
> > > > > > >   t1_4 = _7 - _3;
> > > > > > >
> > > > > > > I assume in both cases, we want erfc in the code-gen instead ?
> > > > > > > I think the reason uncaonicalization fails is 

Re: [match.pd] PR83750 - CSE erf/erfc pair

2021-10-22 Thread Prathamesh Kulkarni via Gcc-patches
On Wed, 20 Oct 2021 at 18:21, Richard Biener  wrote:
>
> On Wed, 20 Oct 2021, Prathamesh Kulkarni wrote:
>
> > On Tue, 19 Oct 2021 at 16:55, Richard Biener  wrote:
> > >
> > > On Tue, 19 Oct 2021, Prathamesh Kulkarni wrote:
> > >
> > > > On Tue, 19 Oct 2021 at 13:02, Richard Biener 
> > > >  wrote:
> > > > >
> > > > > On Tue, Oct 19, 2021 at 9:03 AM Prathamesh Kulkarni via Gcc-patches
> > > > >  wrote:
> > > > > >
> > > > > > On Mon, 18 Oct 2021 at 17:23, Richard Biener  
> > > > > > wrote:
> > > > > > >
> > > > > > > On Mon, 18 Oct 2021, Prathamesh Kulkarni wrote:
> > > > > > >
> > > > > > > > On Mon, 18 Oct 2021 at 17:10, Richard Biener 
> > > > > > > >  wrote:
> > > > > > > > >
> > > > > > > > > On Mon, 18 Oct 2021, Prathamesh Kulkarni wrote:
> > > > > > > > >
> > > > > > > > > > On Mon, 18 Oct 2021 at 16:18, Richard Biener 
> > > > > > > > > >  wrote:
> > > > > > > > > > >
> > > > > > > > > > > On Mon, 18 Oct 2021, Prathamesh Kulkarni wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Hi Richard,
> > > > > > > > > > > > As suggested in PR, I have attached WIP patch that adds 
> > > > > > > > > > > > two patterns
> > > > > > > > > > > > to match.pd:
> > > > > > > > > > > > erfc(x) --> 1 - erf(x) if canonicalize_math_p() and,
> > > > > > > > > > > > 1 - erf(x) --> erfc(x) if !canonicalize_math_p().
> > > > > > > > > > > >
> > > > > > > > > > > > This works to remove call to erfc for the following 
> > > > > > > > > > > > test:
> > > > > > > > > > > > double f(double x)
> > > > > > > > > > > > {
> > > > > > > > > > > >   double g(double, double);
> > > > > > > > > > > >
> > > > > > > > > > > >   double t1 = __builtin_erf (x);
> > > > > > > > > > > >   double t2 = __builtin_erfc (x);
> > > > > > > > > > > >   return g(t1, t2);
> > > > > > > > > > > > }
> > > > > > > > > > > >
> > > > > > > > > > > > with .optimized dump shows:
> > > > > > > > > > > >   t1_2 = __builtin_erf (x_1(D));
> > > > > > > > > > > >   t2_3 = 1.0e+0 - t1_2;
> > > > > > > > > > > >
> > > > > > > > > > > > However, for the following test:
> > > > > > > > > > > > double f(double x)
> > > > > > > > > > > > {
> > > > > > > > > > > >   double g(double, double);
> > > > > > > > > > > >
> > > > > > > > > > > >   double t1 = __builtin_erfc (x);
> > > > > > > > > > > >   return t1;
> > > > > > > > > > > > }
> > > > > > > > > > > >
> > > > > > > > > > > > It canonicalizes erfc(x) to 1 - erf(x), but does not 
> > > > > > > > > > > > transform 1 -
> > > > > > > > > > > > erf(x) to erfc(x) again
> > > > > > > > > > > > post canonicalization.
> > > > > > > > > > > > -fdump-tree-folding shows that 1 - erf(x) --> erfc(x) 
> > > > > > > > > > > > gets applied,
> > > > > > > > > > > > but then it tries to
> > > > > > > > > > > > resimplify erfc(x), which fails post canonicalization. 
> > > > > > > > > > > > So we end up
> > > > > > > > > > > > with erfc(x) transformed to
> > > > > > > > > > > > 1 - erf(x) in .optimized dump, which I suppose isn't 
> > > > > > > > > > > > ideal.
> > > > > > > > > > > > Could you suggest how to proceed ?
> > > > > > > > > > >
> > > > > > > > > > > I applied your patch manually and it does the intended
> > > > > > > > > > > simplifications so I wonder what I am missing?
> > > > > > > > > > Would it be OK to always fold erfc(x) -> 1 - erf(x) even 
> > > > > > > > > > when there's
> > > > > > > > > > no erf(x) in the source ?
> > > > > > > > >
> > > > > > > > > I do think it's reasonable to expect erfc to be available 
> > > > > > > > > when erf
> > > > > > > > > is and vice versa but note both are C99 specified functions 
> > > > > > > > > (either
> > > > > > > > > requires -lm).
> > > > > > > > OK, thanks. Would it be OK to commit the patch after 
> > > > > > > > bootstrap+test ?
> > > > > > >
> > > > > > > Yes, but I'm confused because you say the patch doesn't work for 
> > > > > > > you?
> > > > > > The patch works for me to CSE erf/erfc pair.
> > > > > > However when there's only erfc in the source, it canonicalizes 
> > > > > > erfc(x)
> > > > > > to 1 - erf(x) but later fails to uncanonicalize 1 - erf(x) back to
> > > > > > erfc(x)
> > > > > > with -O3 -funsafe-math-optimizations.
> > > > > >
> > > > > > For,
> > > > > > t1 = __builtin_erfc(x),
> > > > > >
> > > > > > .optimized dump shows:
> > > > > >   _2 = __builtin_erf (x_1(D));
> > > > > >   t1_3 = 1.0e+0 - _2;
> > > > > >
> > > > > > and for,
> > > > > > double t1 = x + __builtin_erfc(x);
> > > > > >
> > > > > > .optimized dump shows:
> > > > > >   _3 = __builtin_erf (x_2(D));
> > > > > >   _7 = x_2(D) + 1.0e+0;
> > > > > >   t1_4 = _7 - _3;
> > > > > >
> > > > > > I assume in both cases, we want erfc in the code-gen instead ?
> > > > > > I think the reason uncaonicalization fails is because the pattern 1 
> > > > > > -
> > > > > > erf(x) to erfc(x)
> > > > > > gets applied, but then it fails in resimplifying erfc(x), and we end
> > > > > > up with 1 - erf(x) in code-gen.
> > > > > >
> > > > > > From gimple-match.c, it hits the simplification:
> > > > > >
> 

Re: [match.pd] PR83750 - CSE erf/erfc pair

2021-10-20 Thread Richard Biener via Gcc-patches
On Wed, 20 Oct 2021, Prathamesh Kulkarni wrote:

> On Tue, 19 Oct 2021 at 16:55, Richard Biener  wrote:
> >
> > On Tue, 19 Oct 2021, Prathamesh Kulkarni wrote:
> >
> > > On Tue, 19 Oct 2021 at 13:02, Richard Biener  
> > > wrote:
> > > >
> > > > On Tue, Oct 19, 2021 at 9:03 AM Prathamesh Kulkarni via Gcc-patches
> > > >  wrote:
> > > > >
> > > > > On Mon, 18 Oct 2021 at 17:23, Richard Biener  
> > > > > wrote:
> > > > > >
> > > > > > On Mon, 18 Oct 2021, Prathamesh Kulkarni wrote:
> > > > > >
> > > > > > > On Mon, 18 Oct 2021 at 17:10, Richard Biener  
> > > > > > > wrote:
> > > > > > > >
> > > > > > > > On Mon, 18 Oct 2021, Prathamesh Kulkarni wrote:
> > > > > > > >
> > > > > > > > > On Mon, 18 Oct 2021 at 16:18, Richard Biener 
> > > > > > > > >  wrote:
> > > > > > > > > >
> > > > > > > > > > On Mon, 18 Oct 2021, Prathamesh Kulkarni wrote:
> > > > > > > > > >
> > > > > > > > > > > Hi Richard,
> > > > > > > > > > > As suggested in PR, I have attached WIP patch that adds 
> > > > > > > > > > > two patterns
> > > > > > > > > > > to match.pd:
> > > > > > > > > > > erfc(x) --> 1 - erf(x) if canonicalize_math_p() and,
> > > > > > > > > > > 1 - erf(x) --> erfc(x) if !canonicalize_math_p().
> > > > > > > > > > >
> > > > > > > > > > > This works to remove call to erfc for the following test:
> > > > > > > > > > > double f(double x)
> > > > > > > > > > > {
> > > > > > > > > > >   double g(double, double);
> > > > > > > > > > >
> > > > > > > > > > >   double t1 = __builtin_erf (x);
> > > > > > > > > > >   double t2 = __builtin_erfc (x);
> > > > > > > > > > >   return g(t1, t2);
> > > > > > > > > > > }
> > > > > > > > > > >
> > > > > > > > > > > with .optimized dump shows:
> > > > > > > > > > >   t1_2 = __builtin_erf (x_1(D));
> > > > > > > > > > >   t2_3 = 1.0e+0 - t1_2;
> > > > > > > > > > >
> > > > > > > > > > > However, for the following test:
> > > > > > > > > > > double f(double x)
> > > > > > > > > > > {
> > > > > > > > > > >   double g(double, double);
> > > > > > > > > > >
> > > > > > > > > > >   double t1 = __builtin_erfc (x);
> > > > > > > > > > >   return t1;
> > > > > > > > > > > }
> > > > > > > > > > >
> > > > > > > > > > > It canonicalizes erfc(x) to 1 - erf(x), but does not 
> > > > > > > > > > > transform 1 -
> > > > > > > > > > > erf(x) to erfc(x) again
> > > > > > > > > > > post canonicalization.
> > > > > > > > > > > -fdump-tree-folding shows that 1 - erf(x) --> erfc(x) 
> > > > > > > > > > > gets applied,
> > > > > > > > > > > but then it tries to
> > > > > > > > > > > resimplify erfc(x), which fails post canonicalization. So 
> > > > > > > > > > > we end up
> > > > > > > > > > > with erfc(x) transformed to
> > > > > > > > > > > 1 - erf(x) in .optimized dump, which I suppose isn't 
> > > > > > > > > > > ideal.
> > > > > > > > > > > Could you suggest how to proceed ?
> > > > > > > > > >
> > > > > > > > > > I applied your patch manually and it does the intended
> > > > > > > > > > simplifications so I wonder what I am missing?
> > > > > > > > > Would it be OK to always fold erfc(x) -> 1 - erf(x) even when 
> > > > > > > > > there's
> > > > > > > > > no erf(x) in the source ?
> > > > > > > >
> > > > > > > > I do think it's reasonable to expect erfc to be available when 
> > > > > > > > erf
> > > > > > > > is and vice versa but note both are C99 specified functions 
> > > > > > > > (either
> > > > > > > > requires -lm).
> > > > > > > OK, thanks. Would it be OK to commit the patch after 
> > > > > > > bootstrap+test ?
> > > > > >
> > > > > > Yes, but I'm confused because you say the patch doesn't work for 
> > > > > > you?
> > > > > The patch works for me to CSE erf/erfc pair.
> > > > > However when there's only erfc in the source, it canonicalizes erfc(x)
> > > > > to 1 - erf(x) but later fails to uncanonicalize 1 - erf(x) back to
> > > > > erfc(x)
> > > > > with -O3 -funsafe-math-optimizations.
> > > > >
> > > > > For,
> > > > > t1 = __builtin_erfc(x),
> > > > >
> > > > > .optimized dump shows:
> > > > >   _2 = __builtin_erf (x_1(D));
> > > > >   t1_3 = 1.0e+0 - _2;
> > > > >
> > > > > and for,
> > > > > double t1 = x + __builtin_erfc(x);
> > > > >
> > > > > .optimized dump shows:
> > > > >   _3 = __builtin_erf (x_2(D));
> > > > >   _7 = x_2(D) + 1.0e+0;
> > > > >   t1_4 = _7 - _3;
> > > > >
> > > > > I assume in both cases, we want erfc in the code-gen instead ?
> > > > > I think the reason uncaonicalization fails is because the pattern 1 -
> > > > > erf(x) to erfc(x)
> > > > > gets applied, but then it fails in resimplifying erfc(x), and we end
> > > > > up with 1 - erf(x) in code-gen.
> > > > >
> > > > > From gimple-match.c, it hits the simplification:
> > > > >
> > > > > gimple_seq *lseq = seq;
> > > > > if (__builtin_expect (!dbg_cnt
> > > > > (match), 0)) goto next_after_fail1172;
> > > > > if (__builtin_expect (dump_file &&
> > > > > (dump_flags & TDF_FOLDING), 0)) fprintf (dump_file, "Applying 

Re: [match.pd] PR83750 - CSE erf/erfc pair

2021-10-20 Thread Prathamesh Kulkarni via Gcc-patches
On Tue, 19 Oct 2021 at 16:55, Richard Biener  wrote:
>
> On Tue, 19 Oct 2021, Prathamesh Kulkarni wrote:
>
> > On Tue, 19 Oct 2021 at 13:02, Richard Biener  
> > wrote:
> > >
> > > On Tue, Oct 19, 2021 at 9:03 AM Prathamesh Kulkarni via Gcc-patches
> > >  wrote:
> > > >
> > > > On Mon, 18 Oct 2021 at 17:23, Richard Biener  wrote:
> > > > >
> > > > > On Mon, 18 Oct 2021, Prathamesh Kulkarni wrote:
> > > > >
> > > > > > On Mon, 18 Oct 2021 at 17:10, Richard Biener  
> > > > > > wrote:
> > > > > > >
> > > > > > > On Mon, 18 Oct 2021, Prathamesh Kulkarni wrote:
> > > > > > >
> > > > > > > > On Mon, 18 Oct 2021 at 16:18, Richard Biener 
> > > > > > > >  wrote:
> > > > > > > > >
> > > > > > > > > On Mon, 18 Oct 2021, Prathamesh Kulkarni wrote:
> > > > > > > > >
> > > > > > > > > > Hi Richard,
> > > > > > > > > > As suggested in PR, I have attached WIP patch that adds two 
> > > > > > > > > > patterns
> > > > > > > > > > to match.pd:
> > > > > > > > > > erfc(x) --> 1 - erf(x) if canonicalize_math_p() and,
> > > > > > > > > > 1 - erf(x) --> erfc(x) if !canonicalize_math_p().
> > > > > > > > > >
> > > > > > > > > > This works to remove call to erfc for the following test:
> > > > > > > > > > double f(double x)
> > > > > > > > > > {
> > > > > > > > > >   double g(double, double);
> > > > > > > > > >
> > > > > > > > > >   double t1 = __builtin_erf (x);
> > > > > > > > > >   double t2 = __builtin_erfc (x);
> > > > > > > > > >   return g(t1, t2);
> > > > > > > > > > }
> > > > > > > > > >
> > > > > > > > > > with .optimized dump shows:
> > > > > > > > > >   t1_2 = __builtin_erf (x_1(D));
> > > > > > > > > >   t2_3 = 1.0e+0 - t1_2;
> > > > > > > > > >
> > > > > > > > > > However, for the following test:
> > > > > > > > > > double f(double x)
> > > > > > > > > > {
> > > > > > > > > >   double g(double, double);
> > > > > > > > > >
> > > > > > > > > >   double t1 = __builtin_erfc (x);
> > > > > > > > > >   return t1;
> > > > > > > > > > }
> > > > > > > > > >
> > > > > > > > > > It canonicalizes erfc(x) to 1 - erf(x), but does not 
> > > > > > > > > > transform 1 -
> > > > > > > > > > erf(x) to erfc(x) again
> > > > > > > > > > post canonicalization.
> > > > > > > > > > -fdump-tree-folding shows that 1 - erf(x) --> erfc(x) gets 
> > > > > > > > > > applied,
> > > > > > > > > > but then it tries to
> > > > > > > > > > resimplify erfc(x), which fails post canonicalization. So 
> > > > > > > > > > we end up
> > > > > > > > > > with erfc(x) transformed to
> > > > > > > > > > 1 - erf(x) in .optimized dump, which I suppose isn't ideal.
> > > > > > > > > > Could you suggest how to proceed ?
> > > > > > > > >
> > > > > > > > > I applied your patch manually and it does the intended
> > > > > > > > > simplifications so I wonder what I am missing?
> > > > > > > > Would it be OK to always fold erfc(x) -> 1 - erf(x) even when 
> > > > > > > > there's
> > > > > > > > no erf(x) in the source ?
> > > > > > >
> > > > > > > I do think it's reasonable to expect erfc to be available when erf
> > > > > > > is and vice versa but note both are C99 specified functions 
> > > > > > > (either
> > > > > > > requires -lm).
> > > > > > OK, thanks. Would it be OK to commit the patch after bootstrap+test 
> > > > > > ?
> > > > >
> > > > > Yes, but I'm confused because you say the patch doesn't work for you?
> > > > The patch works for me to CSE erf/erfc pair.
> > > > However when there's only erfc in the source, it canonicalizes erfc(x)
> > > > to 1 - erf(x) but later fails to uncanonicalize 1 - erf(x) back to
> > > > erfc(x)
> > > > with -O3 -funsafe-math-optimizations.
> > > >
> > > > For,
> > > > t1 = __builtin_erfc(x),
> > > >
> > > > .optimized dump shows:
> > > >   _2 = __builtin_erf (x_1(D));
> > > >   t1_3 = 1.0e+0 - _2;
> > > >
> > > > and for,
> > > > double t1 = x + __builtin_erfc(x);
> > > >
> > > > .optimized dump shows:
> > > >   _3 = __builtin_erf (x_2(D));
> > > >   _7 = x_2(D) + 1.0e+0;
> > > >   t1_4 = _7 - _3;
> > > >
> > > > I assume in both cases, we want erfc in the code-gen instead ?
> > > > I think the reason uncaonicalization fails is because the pattern 1 -
> > > > erf(x) to erfc(x)
> > > > gets applied, but then it fails in resimplifying erfc(x), and we end
> > > > up with 1 - erf(x) in code-gen.
> > > >
> > > > From gimple-match.c, it hits the simplification:
> > > >
> > > > gimple_seq *lseq = seq;
> > > > if (__builtin_expect (!dbg_cnt
> > > > (match), 0)) goto next_after_fail1172;
> > > > if (__builtin_expect (dump_file &&
> > > > (dump_flags & TDF_FOLDING), 0)) fprintf (dump_file, "Applying pattern
> > > > %s:%d, %s:%d\n", "match.pd", 6162, __FILE__, __LINE__);
> > > > {
> > > >   res_op->set_op (CFN_BUILT_IN_ERFC, 
> > > > type, 1);
> > > >   res_op->ops[0] = captures[0];
> > > >   res_op->resimplify 

Re: [match.pd] PR83750 - CSE erf/erfc pair

2021-10-19 Thread Richard Biener via Gcc-patches
On Tue, 19 Oct 2021, Prathamesh Kulkarni wrote:

> On Tue, 19 Oct 2021 at 13:02, Richard Biener  
> wrote:
> >
> > On Tue, Oct 19, 2021 at 9:03 AM Prathamesh Kulkarni via Gcc-patches
> >  wrote:
> > >
> > > On Mon, 18 Oct 2021 at 17:23, Richard Biener  wrote:
> > > >
> > > > On Mon, 18 Oct 2021, Prathamesh Kulkarni wrote:
> > > >
> > > > > On Mon, 18 Oct 2021 at 17:10, Richard Biener  
> > > > > wrote:
> > > > > >
> > > > > > On Mon, 18 Oct 2021, Prathamesh Kulkarni wrote:
> > > > > >
> > > > > > > On Mon, 18 Oct 2021 at 16:18, Richard Biener  
> > > > > > > wrote:
> > > > > > > >
> > > > > > > > On Mon, 18 Oct 2021, Prathamesh Kulkarni wrote:
> > > > > > > >
> > > > > > > > > Hi Richard,
> > > > > > > > > As suggested in PR, I have attached WIP patch that adds two 
> > > > > > > > > patterns
> > > > > > > > > to match.pd:
> > > > > > > > > erfc(x) --> 1 - erf(x) if canonicalize_math_p() and,
> > > > > > > > > 1 - erf(x) --> erfc(x) if !canonicalize_math_p().
> > > > > > > > >
> > > > > > > > > This works to remove call to erfc for the following test:
> > > > > > > > > double f(double x)
> > > > > > > > > {
> > > > > > > > >   double g(double, double);
> > > > > > > > >
> > > > > > > > >   double t1 = __builtin_erf (x);
> > > > > > > > >   double t2 = __builtin_erfc (x);
> > > > > > > > >   return g(t1, t2);
> > > > > > > > > }
> > > > > > > > >
> > > > > > > > > with .optimized dump shows:
> > > > > > > > >   t1_2 = __builtin_erf (x_1(D));
> > > > > > > > >   t2_3 = 1.0e+0 - t1_2;
> > > > > > > > >
> > > > > > > > > However, for the following test:
> > > > > > > > > double f(double x)
> > > > > > > > > {
> > > > > > > > >   double g(double, double);
> > > > > > > > >
> > > > > > > > >   double t1 = __builtin_erfc (x);
> > > > > > > > >   return t1;
> > > > > > > > > }
> > > > > > > > >
> > > > > > > > > It canonicalizes erfc(x) to 1 - erf(x), but does not 
> > > > > > > > > transform 1 -
> > > > > > > > > erf(x) to erfc(x) again
> > > > > > > > > post canonicalization.
> > > > > > > > > -fdump-tree-folding shows that 1 - erf(x) --> erfc(x) gets 
> > > > > > > > > applied,
> > > > > > > > > but then it tries to
> > > > > > > > > resimplify erfc(x), which fails post canonicalization. So we 
> > > > > > > > > end up
> > > > > > > > > with erfc(x) transformed to
> > > > > > > > > 1 - erf(x) in .optimized dump, which I suppose isn't ideal.
> > > > > > > > > Could you suggest how to proceed ?
> > > > > > > >
> > > > > > > > I applied your patch manually and it does the intended
> > > > > > > > simplifications so I wonder what I am missing?
> > > > > > > Would it be OK to always fold erfc(x) -> 1 - erf(x) even when 
> > > > > > > there's
> > > > > > > no erf(x) in the source ?
> > > > > >
> > > > > > I do think it's reasonable to expect erfc to be available when erf
> > > > > > is and vice versa but note both are C99 specified functions (either
> > > > > > requires -lm).
> > > > > OK, thanks. Would it be OK to commit the patch after bootstrap+test ?
> > > >
> > > > Yes, but I'm confused because you say the patch doesn't work for you?
> > > The patch works for me to CSE erf/erfc pair.
> > > However when there's only erfc in the source, it canonicalizes erfc(x)
> > > to 1 - erf(x) but later fails to uncanonicalize 1 - erf(x) back to
> > > erfc(x)
> > > with -O3 -funsafe-math-optimizations.
> > >
> > > For,
> > > t1 = __builtin_erfc(x),
> > >
> > > .optimized dump shows:
> > >   _2 = __builtin_erf (x_1(D));
> > >   t1_3 = 1.0e+0 - _2;
> > >
> > > and for,
> > > double t1 = x + __builtin_erfc(x);
> > >
> > > .optimized dump shows:
> > >   _3 = __builtin_erf (x_2(D));
> > >   _7 = x_2(D) + 1.0e+0;
> > >   t1_4 = _7 - _3;
> > >
> > > I assume in both cases, we want erfc in the code-gen instead ?
> > > I think the reason uncaonicalization fails is because the pattern 1 -
> > > erf(x) to erfc(x)
> > > gets applied, but then it fails in resimplifying erfc(x), and we end
> > > up with 1 - erf(x) in code-gen.
> > >
> > > From gimple-match.c, it hits the simplification:
> > >
> > > gimple_seq *lseq = seq;
> > > if (__builtin_expect (!dbg_cnt
> > > (match), 0)) goto next_after_fail1172;
> > > if (__builtin_expect (dump_file &&
> > > (dump_flags & TDF_FOLDING), 0)) fprintf (dump_file, "Applying pattern
> > > %s:%d, %s:%d\n", "match.pd", 6162, __FILE__, __LINE__);
> > > {
> > >   res_op->set_op (CFN_BUILT_IN_ERFC, 
> > > type, 1);
> > >   res_op->ops[0] = captures[0];
> > >   res_op->resimplify (lseq, valueize);
> > >   return true;
> > > }
> > >
> > > But res_op->resimplify returns false, and doesn't end up adding to lseq.
> >
> > There's nothing to add to lseq since there's also nothing to resimplify.
> > The only thing that could happen is 

Re: [match.pd] PR83750 - CSE erf/erfc pair

2021-10-19 Thread Prathamesh Kulkarni via Gcc-patches
On Tue, 19 Oct 2021 at 13:02, Richard Biener  wrote:
>
> On Tue, Oct 19, 2021 at 9:03 AM Prathamesh Kulkarni via Gcc-patches
>  wrote:
> >
> > On Mon, 18 Oct 2021 at 17:23, Richard Biener  wrote:
> > >
> > > On Mon, 18 Oct 2021, Prathamesh Kulkarni wrote:
> > >
> > > > On Mon, 18 Oct 2021 at 17:10, Richard Biener  wrote:
> > > > >
> > > > > On Mon, 18 Oct 2021, Prathamesh Kulkarni wrote:
> > > > >
> > > > > > On Mon, 18 Oct 2021 at 16:18, Richard Biener  
> > > > > > wrote:
> > > > > > >
> > > > > > > On Mon, 18 Oct 2021, Prathamesh Kulkarni wrote:
> > > > > > >
> > > > > > > > Hi Richard,
> > > > > > > > As suggested in PR, I have attached WIP patch that adds two 
> > > > > > > > patterns
> > > > > > > > to match.pd:
> > > > > > > > erfc(x) --> 1 - erf(x) if canonicalize_math_p() and,
> > > > > > > > 1 - erf(x) --> erfc(x) if !canonicalize_math_p().
> > > > > > > >
> > > > > > > > This works to remove call to erfc for the following test:
> > > > > > > > double f(double x)
> > > > > > > > {
> > > > > > > >   double g(double, double);
> > > > > > > >
> > > > > > > >   double t1 = __builtin_erf (x);
> > > > > > > >   double t2 = __builtin_erfc (x);
> > > > > > > >   return g(t1, t2);
> > > > > > > > }
> > > > > > > >
> > > > > > > > with .optimized dump shows:
> > > > > > > >   t1_2 = __builtin_erf (x_1(D));
> > > > > > > >   t2_3 = 1.0e+0 - t1_2;
> > > > > > > >
> > > > > > > > However, for the following test:
> > > > > > > > double f(double x)
> > > > > > > > {
> > > > > > > >   double g(double, double);
> > > > > > > >
> > > > > > > >   double t1 = __builtin_erfc (x);
> > > > > > > >   return t1;
> > > > > > > > }
> > > > > > > >
> > > > > > > > It canonicalizes erfc(x) to 1 - erf(x), but does not transform 
> > > > > > > > 1 -
> > > > > > > > erf(x) to erfc(x) again
> > > > > > > > post canonicalization.
> > > > > > > > -fdump-tree-folding shows that 1 - erf(x) --> erfc(x) gets 
> > > > > > > > applied,
> > > > > > > > but then it tries to
> > > > > > > > resimplify erfc(x), which fails post canonicalization. So we 
> > > > > > > > end up
> > > > > > > > with erfc(x) transformed to
> > > > > > > > 1 - erf(x) in .optimized dump, which I suppose isn't ideal.
> > > > > > > > Could you suggest how to proceed ?
> > > > > > >
> > > > > > > I applied your patch manually and it does the intended
> > > > > > > simplifications so I wonder what I am missing?
> > > > > > Would it be OK to always fold erfc(x) -> 1 - erf(x) even when 
> > > > > > there's
> > > > > > no erf(x) in the source ?
> > > > >
> > > > > I do think it's reasonable to expect erfc to be available when erf
> > > > > is and vice versa but note both are C99 specified functions (either
> > > > > requires -lm).
> > > > OK, thanks. Would it be OK to commit the patch after bootstrap+test ?
> > >
> > > Yes, but I'm confused because you say the patch doesn't work for you?
> > The patch works for me to CSE erf/erfc pair.
> > However when there's only erfc in the source, it canonicalizes erfc(x)
> > to 1 - erf(x) but later fails to uncanonicalize 1 - erf(x) back to
> > erfc(x)
> > with -O3 -funsafe-math-optimizations.
> >
> > For,
> > t1 = __builtin_erfc(x),
> >
> > .optimized dump shows:
> >   _2 = __builtin_erf (x_1(D));
> >   t1_3 = 1.0e+0 - _2;
> >
> > and for,
> > double t1 = x + __builtin_erfc(x);
> >
> > .optimized dump shows:
> >   _3 = __builtin_erf (x_2(D));
> >   _7 = x_2(D) + 1.0e+0;
> >   t1_4 = _7 - _3;
> >
> > I assume in both cases, we want erfc in the code-gen instead ?
> > I think the reason uncaonicalization fails is because the pattern 1 -
> > erf(x) to erfc(x)
> > gets applied, but then it fails in resimplifying erfc(x), and we end
> > up with 1 - erf(x) in code-gen.
> >
> > From gimple-match.c, it hits the simplification:
> >
> > gimple_seq *lseq = seq;
> > if (__builtin_expect (!dbg_cnt
> > (match), 0)) goto next_after_fail1172;
> > if (__builtin_expect (dump_file &&
> > (dump_flags & TDF_FOLDING), 0)) fprintf (dump_file, "Applying pattern
> > %s:%d, %s:%d\n", "match.pd", 6162, __FILE__, __LINE__);
> > {
> >   res_op->set_op (CFN_BUILT_IN_ERFC, type, 
> > 1);
> >   res_op->ops[0] = captures[0];
> >   res_op->resimplify (lseq, valueize);
> >   return true;
> > }
> >
> > But res_op->resimplify returns false, and doesn't end up adding to lseq.
>
> There's nothing to add to lseq since there's also nothing to resimplify.
> The only thing that could happen is that the replacement is not done
> because replace_stmt_with_simplification via maybe_push_res_to_seq
> doesn't pass the builtin_decl_implicit test:
>
>   /* Find the function we want to call.  */
>   tree decl = builtin_decl_implicit (as_builtin_fn (fn));
>   if (!decl)
>   

Re: [match.pd] PR83750 - CSE erf/erfc pair

2021-10-19 Thread Richard Biener via Gcc-patches
On Tue, Oct 19, 2021 at 9:03 AM Prathamesh Kulkarni via Gcc-patches
 wrote:
>
> On Mon, 18 Oct 2021 at 17:23, Richard Biener  wrote:
> >
> > On Mon, 18 Oct 2021, Prathamesh Kulkarni wrote:
> >
> > > On Mon, 18 Oct 2021 at 17:10, Richard Biener  wrote:
> > > >
> > > > On Mon, 18 Oct 2021, Prathamesh Kulkarni wrote:
> > > >
> > > > > On Mon, 18 Oct 2021 at 16:18, Richard Biener  
> > > > > wrote:
> > > > > >
> > > > > > On Mon, 18 Oct 2021, Prathamesh Kulkarni wrote:
> > > > > >
> > > > > > > Hi Richard,
> > > > > > > As suggested in PR, I have attached WIP patch that adds two 
> > > > > > > patterns
> > > > > > > to match.pd:
> > > > > > > erfc(x) --> 1 - erf(x) if canonicalize_math_p() and,
> > > > > > > 1 - erf(x) --> erfc(x) if !canonicalize_math_p().
> > > > > > >
> > > > > > > This works to remove call to erfc for the following test:
> > > > > > > double f(double x)
> > > > > > > {
> > > > > > >   double g(double, double);
> > > > > > >
> > > > > > >   double t1 = __builtin_erf (x);
> > > > > > >   double t2 = __builtin_erfc (x);
> > > > > > >   return g(t1, t2);
> > > > > > > }
> > > > > > >
> > > > > > > with .optimized dump shows:
> > > > > > >   t1_2 = __builtin_erf (x_1(D));
> > > > > > >   t2_3 = 1.0e+0 - t1_2;
> > > > > > >
> > > > > > > However, for the following test:
> > > > > > > double f(double x)
> > > > > > > {
> > > > > > >   double g(double, double);
> > > > > > >
> > > > > > >   double t1 = __builtin_erfc (x);
> > > > > > >   return t1;
> > > > > > > }
> > > > > > >
> > > > > > > It canonicalizes erfc(x) to 1 - erf(x), but does not transform 1 -
> > > > > > > erf(x) to erfc(x) again
> > > > > > > post canonicalization.
> > > > > > > -fdump-tree-folding shows that 1 - erf(x) --> erfc(x) gets 
> > > > > > > applied,
> > > > > > > but then it tries to
> > > > > > > resimplify erfc(x), which fails post canonicalization. So we end 
> > > > > > > up
> > > > > > > with erfc(x) transformed to
> > > > > > > 1 - erf(x) in .optimized dump, which I suppose isn't ideal.
> > > > > > > Could you suggest how to proceed ?
> > > > > >
> > > > > > I applied your patch manually and it does the intended
> > > > > > simplifications so I wonder what I am missing?
> > > > > Would it be OK to always fold erfc(x) -> 1 - erf(x) even when there's
> > > > > no erf(x) in the source ?
> > > >
> > > > I do think it's reasonable to expect erfc to be available when erf
> > > > is and vice versa but note both are C99 specified functions (either
> > > > requires -lm).
> > > OK, thanks. Would it be OK to commit the patch after bootstrap+test ?
> >
> > Yes, but I'm confused because you say the patch doesn't work for you?
> The patch works for me to CSE erf/erfc pair.
> However when there's only erfc in the source, it canonicalizes erfc(x)
> to 1 - erf(x) but later fails to uncanonicalize 1 - erf(x) back to
> erfc(x)
> with -O3 -funsafe-math-optimizations.
>
> For,
> t1 = __builtin_erfc(x),
>
> .optimized dump shows:
>   _2 = __builtin_erf (x_1(D));
>   t1_3 = 1.0e+0 - _2;
>
> and for,
> double t1 = x + __builtin_erfc(x);
>
> .optimized dump shows:
>   _3 = __builtin_erf (x_2(D));
>   _7 = x_2(D) + 1.0e+0;
>   t1_4 = _7 - _3;
>
> I assume in both cases, we want erfc in the code-gen instead ?
> I think the reason uncaonicalization fails is because the pattern 1 -
> erf(x) to erfc(x)
> gets applied, but then it fails in resimplifying erfc(x), and we end
> up with 1 - erf(x) in code-gen.
>
> From gimple-match.c, it hits the simplification:
>
> gimple_seq *lseq = seq;
> if (__builtin_expect (!dbg_cnt
> (match), 0)) goto next_after_fail1172;
> if (__builtin_expect (dump_file &&
> (dump_flags & TDF_FOLDING), 0)) fprintf (dump_file, "Applying pattern
> %s:%d, %s:%d\n", "match.pd", 6162, __FILE__, __LINE__);
> {
>   res_op->set_op (CFN_BUILT_IN_ERFC, type, 1);
>   res_op->ops[0] = captures[0];
>   res_op->resimplify (lseq, valueize);
>   return true;
> }
>
> But res_op->resimplify returns false, and doesn't end up adding to lseq.

There's nothing to add to lseq since there's also nothing to resimplify.
The only thing that could happen is that the replacement is not done
because replace_stmt_with_simplification via maybe_push_res_to_seq
doesn't pass the builtin_decl_implicit test:

  /* Find the function we want to call.  */
  tree decl = builtin_decl_implicit (as_builtin_fn (fn));
  if (!decl)
return NULL;

btw, it did work for me since the call was present before and gimplification
should then mark the function eligible for implicit generation.

> As you suggest, should we instead handle this in fre to transform
> erfc(x) to 1 - erf(x), only when
> there's a matching erf(x) in the source ?

Note that's strictly 

Re: [match.pd] PR83750 - CSE erf/erfc pair

2021-10-19 Thread Prathamesh Kulkarni via Gcc-patches
On Mon, 18 Oct 2021 at 17:23, Richard Biener  wrote:
>
> On Mon, 18 Oct 2021, Prathamesh Kulkarni wrote:
>
> > On Mon, 18 Oct 2021 at 17:10, Richard Biener  wrote:
> > >
> > > On Mon, 18 Oct 2021, Prathamesh Kulkarni wrote:
> > >
> > > > On Mon, 18 Oct 2021 at 16:18, Richard Biener  wrote:
> > > > >
> > > > > On Mon, 18 Oct 2021, Prathamesh Kulkarni wrote:
> > > > >
> > > > > > Hi Richard,
> > > > > > As suggested in PR, I have attached WIP patch that adds two patterns
> > > > > > to match.pd:
> > > > > > erfc(x) --> 1 - erf(x) if canonicalize_math_p() and,
> > > > > > 1 - erf(x) --> erfc(x) if !canonicalize_math_p().
> > > > > >
> > > > > > This works to remove call to erfc for the following test:
> > > > > > double f(double x)
> > > > > > {
> > > > > >   double g(double, double);
> > > > > >
> > > > > >   double t1 = __builtin_erf (x);
> > > > > >   double t2 = __builtin_erfc (x);
> > > > > >   return g(t1, t2);
> > > > > > }
> > > > > >
> > > > > > with .optimized dump shows:
> > > > > >   t1_2 = __builtin_erf (x_1(D));
> > > > > >   t2_3 = 1.0e+0 - t1_2;
> > > > > >
> > > > > > However, for the following test:
> > > > > > double f(double x)
> > > > > > {
> > > > > >   double g(double, double);
> > > > > >
> > > > > >   double t1 = __builtin_erfc (x);
> > > > > >   return t1;
> > > > > > }
> > > > > >
> > > > > > It canonicalizes erfc(x) to 1 - erf(x), but does not transform 1 -
> > > > > > erf(x) to erfc(x) again
> > > > > > post canonicalization.
> > > > > > -fdump-tree-folding shows that 1 - erf(x) --> erfc(x) gets applied,
> > > > > > but then it tries to
> > > > > > resimplify erfc(x), which fails post canonicalization. So we end up
> > > > > > with erfc(x) transformed to
> > > > > > 1 - erf(x) in .optimized dump, which I suppose isn't ideal.
> > > > > > Could you suggest how to proceed ?
> > > > >
> > > > > I applied your patch manually and it does the intended
> > > > > simplifications so I wonder what I am missing?
> > > > Would it be OK to always fold erfc(x) -> 1 - erf(x) even when there's
> > > > no erf(x) in the source ?
> > >
> > > I do think it's reasonable to expect erfc to be available when erf
> > > is and vice versa but note both are C99 specified functions (either
> > > requires -lm).
> > OK, thanks. Would it be OK to commit the patch after bootstrap+test ?
>
> Yes, but I'm confused because you say the patch doesn't work for you?
The patch works for me to CSE erf/erfc pair.
However when there's only erfc in the source, it canonicalizes erfc(x)
to 1 - erf(x) but later fails to uncanonicalize 1 - erf(x) back to
erfc(x)
with -O3 -funsafe-math-optimizations.

For,
t1 = __builtin_erfc(x),

.optimized dump shows:
  _2 = __builtin_erf (x_1(D));
  t1_3 = 1.0e+0 - _2;

and for,
double t1 = x + __builtin_erfc(x);

.optimized dump shows:
  _3 = __builtin_erf (x_2(D));
  _7 = x_2(D) + 1.0e+0;
  t1_4 = _7 - _3;

I assume in both cases, we want erfc in the code-gen instead ?
I think the reason uncaonicalization fails is because the pattern 1 -
erf(x) to erfc(x)
gets applied, but then it fails in resimplifying erfc(x), and we end
up with 1 - erf(x) in code-gen.

>From gimple-match.c, it hits the simplification:

gimple_seq *lseq = seq;
if (__builtin_expect (!dbg_cnt
(match), 0)) goto next_after_fail1172;
if (__builtin_expect (dump_file &&
(dump_flags & TDF_FOLDING), 0)) fprintf (dump_file, "Applying pattern
%s:%d, %s:%d\n", "match.pd", 6162, __FILE__, __LINE__);
{
  res_op->set_op (CFN_BUILT_IN_ERFC, type, 1);
  res_op->ops[0] = captures[0];
  res_op->resimplify (lseq, valueize);
  return true;
}

But res_op->resimplify returns false, and doesn't end up adding to lseq.

As you suggest, should we instead handle this in fre to transform
erfc(x) to 1 - erf(x), only when
there's a matching erf(x) in the source ?

Thanks,
Prathamesh
>
> Btw, please add the testcase from the PR and also a testcase that shows
> the canonicalization is undone.  Maybe you can also double-check that
> we handle x + erfc (x) because I see we associate that as
> (x + 1) - erf (x) which is then not recognized back to erfc.
>
> The less surprising (as to preserve the function called in the source)
> variant for the PR would be to teach CSE to lookup erf(x) when
> visiting erfc(x) and when found synthesize 1 - erf(x).
>
> That said, a mathematician should chime in on how important it is
> to preserve erfc vs. erf (precision or even speed).
>
> Thanks,
> Richard.
>
> > Thanks,
> > Prathamesh
> >
> > >
> > > Richard.
> > >
> > > > So for the following test:
> > > > double f(double x)
> > > > {
> > > >   t1 = __builtin_erfc(x)
> > > >   return t1;
> > > > }
> > > >
> > > > .optimized dump shows:
> > > > double f (double x)
> > > > {
> > > >   double 

Re: [match.pd] PR83750 - CSE erf/erfc pair

2021-10-18 Thread Richard Biener via Gcc-patches
On Mon, 18 Oct 2021, Prathamesh Kulkarni wrote:

> On Mon, 18 Oct 2021 at 17:10, Richard Biener  wrote:
> >
> > On Mon, 18 Oct 2021, Prathamesh Kulkarni wrote:
> >
> > > On Mon, 18 Oct 2021 at 16:18, Richard Biener  wrote:
> > > >
> > > > On Mon, 18 Oct 2021, Prathamesh Kulkarni wrote:
> > > >
> > > > > Hi Richard,
> > > > > As suggested in PR, I have attached WIP patch that adds two patterns
> > > > > to match.pd:
> > > > > erfc(x) --> 1 - erf(x) if canonicalize_math_p() and,
> > > > > 1 - erf(x) --> erfc(x) if !canonicalize_math_p().
> > > > >
> > > > > This works to remove call to erfc for the following test:
> > > > > double f(double x)
> > > > > {
> > > > >   double g(double, double);
> > > > >
> > > > >   double t1 = __builtin_erf (x);
> > > > >   double t2 = __builtin_erfc (x);
> > > > >   return g(t1, t2);
> > > > > }
> > > > >
> > > > > with .optimized dump shows:
> > > > >   t1_2 = __builtin_erf (x_1(D));
> > > > >   t2_3 = 1.0e+0 - t1_2;
> > > > >
> > > > > However, for the following test:
> > > > > double f(double x)
> > > > > {
> > > > >   double g(double, double);
> > > > >
> > > > >   double t1 = __builtin_erfc (x);
> > > > >   return t1;
> > > > > }
> > > > >
> > > > > It canonicalizes erfc(x) to 1 - erf(x), but does not transform 1 -
> > > > > erf(x) to erfc(x) again
> > > > > post canonicalization.
> > > > > -fdump-tree-folding shows that 1 - erf(x) --> erfc(x) gets applied,
> > > > > but then it tries to
> > > > > resimplify erfc(x), which fails post canonicalization. So we end up
> > > > > with erfc(x) transformed to
> > > > > 1 - erf(x) in .optimized dump, which I suppose isn't ideal.
> > > > > Could you suggest how to proceed ?
> > > >
> > > > I applied your patch manually and it does the intended
> > > > simplifications so I wonder what I am missing?
> > > Would it be OK to always fold erfc(x) -> 1 - erf(x) even when there's
> > > no erf(x) in the source ?
> >
> > I do think it's reasonable to expect erfc to be available when erf
> > is and vice versa but note both are C99 specified functions (either
> > requires -lm).
> OK, thanks. Would it be OK to commit the patch after bootstrap+test ?

Yes, but I'm confused because you say the patch doesn't work for you?

Btw, please add the testcase from the PR and also a testcase that shows
the canonicalization is undone.  Maybe you can also double-check that
we handle x + erfc (x) because I see we associate that as
(x + 1) - erf (x) which is then not recognized back to erfc.

The less surprising (as to preserve the function called in the source)
variant for the PR would be to teach CSE to lookup erf(x) when
visiting erfc(x) and when found synthesize 1 - erf(x).

That said, a mathematician should chime in on how important it is
to preserve erfc vs. erf (precision or even speed).

Thanks,
Richard.

> Thanks,
> Prathamesh
> 
> >
> > Richard.
> >
> > > So for the following test:
> > > double f(double x)
> > > {
> > >   t1 = __builtin_erfc(x)
> > >   return t1;
> > > }
> > >
> > > .optimized dump shows:
> > > double f (double x)
> > > {
> > >   double t1;
> > >   double _2;
> > >
> > >[local count: 1073741824]:
> > >   _2 = __builtin_erf (x_1(D));
> > >   t1_3 = 1.0e+0 - _2;
> > >   return t1_3;
> > > }
> > >
> > > while before patch, it has:
> > >   t1_4 = __builtin_erfc (x_2(D)); [tail call]
> > >   return t1_4;
> > >
> > > Thanks,
> > > Prathamesh
> > >
> > > >
> > > > Richard.
> > > >
> > > > > Thanks,
> > > > > Prathamesh
> > > > >
> > > >
> > > > --
> > > > Richard Biener 
> > > > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> > > > Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
> > >
> >
> > --
> > Richard Biener 
> > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> > Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)


Re: [match.pd] PR83750 - CSE erf/erfc pair

2021-10-18 Thread Prathamesh Kulkarni via Gcc-patches
On Mon, 18 Oct 2021 at 17:10, Richard Biener  wrote:
>
> On Mon, 18 Oct 2021, Prathamesh Kulkarni wrote:
>
> > On Mon, 18 Oct 2021 at 16:18, Richard Biener  wrote:
> > >
> > > On Mon, 18 Oct 2021, Prathamesh Kulkarni wrote:
> > >
> > > > Hi Richard,
> > > > As suggested in PR, I have attached WIP patch that adds two patterns
> > > > to match.pd:
> > > > erfc(x) --> 1 - erf(x) if canonicalize_math_p() and,
> > > > 1 - erf(x) --> erfc(x) if !canonicalize_math_p().
> > > >
> > > > This works to remove call to erfc for the following test:
> > > > double f(double x)
> > > > {
> > > >   double g(double, double);
> > > >
> > > >   double t1 = __builtin_erf (x);
> > > >   double t2 = __builtin_erfc (x);
> > > >   return g(t1, t2);
> > > > }
> > > >
> > > > with .optimized dump shows:
> > > >   t1_2 = __builtin_erf (x_1(D));
> > > >   t2_3 = 1.0e+0 - t1_2;
> > > >
> > > > However, for the following test:
> > > > double f(double x)
> > > > {
> > > >   double g(double, double);
> > > >
> > > >   double t1 = __builtin_erfc (x);
> > > >   return t1;
> > > > }
> > > >
> > > > It canonicalizes erfc(x) to 1 - erf(x), but does not transform 1 -
> > > > erf(x) to erfc(x) again
> > > > post canonicalization.
> > > > -fdump-tree-folding shows that 1 - erf(x) --> erfc(x) gets applied,
> > > > but then it tries to
> > > > resimplify erfc(x), which fails post canonicalization. So we end up
> > > > with erfc(x) transformed to
> > > > 1 - erf(x) in .optimized dump, which I suppose isn't ideal.
> > > > Could you suggest how to proceed ?
> > >
> > > I applied your patch manually and it does the intended
> > > simplifications so I wonder what I am missing?
> > Would it be OK to always fold erfc(x) -> 1 - erf(x) even when there's
> > no erf(x) in the source ?
>
> I do think it's reasonable to expect erfc to be available when erf
> is and vice versa but note both are C99 specified functions (either
> requires -lm).
OK, thanks. Would it be OK to commit the patch after bootstrap+test ?

Thanks,
Prathamesh

>
> Richard.
>
> > So for the following test:
> > double f(double x)
> > {
> >   t1 = __builtin_erfc(x)
> >   return t1;
> > }
> >
> > .optimized dump shows:
> > double f (double x)
> > {
> >   double t1;
> >   double _2;
> >
> >[local count: 1073741824]:
> >   _2 = __builtin_erf (x_1(D));
> >   t1_3 = 1.0e+0 - _2;
> >   return t1_3;
> > }
> >
> > while before patch, it has:
> >   t1_4 = __builtin_erfc (x_2(D)); [tail call]
> >   return t1_4;
> >
> > Thanks,
> > Prathamesh
> >
> > >
> > > Richard.
> > >
> > > > Thanks,
> > > > Prathamesh
> > > >
> > >
> > > --
> > > Richard Biener 
> > > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> > > Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
> >
>
> --
> Richard Biener 
> SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)


Re: [match.pd] PR83750 - CSE erf/erfc pair

2021-10-18 Thread Richard Biener via Gcc-patches
On Mon, 18 Oct 2021, Prathamesh Kulkarni wrote:

> On Mon, 18 Oct 2021 at 16:18, Richard Biener  wrote:
> >
> > On Mon, 18 Oct 2021, Prathamesh Kulkarni wrote:
> >
> > > Hi Richard,
> > > As suggested in PR, I have attached WIP patch that adds two patterns
> > > to match.pd:
> > > erfc(x) --> 1 - erf(x) if canonicalize_math_p() and,
> > > 1 - erf(x) --> erfc(x) if !canonicalize_math_p().
> > >
> > > This works to remove call to erfc for the following test:
> > > double f(double x)
> > > {
> > >   double g(double, double);
> > >
> > >   double t1 = __builtin_erf (x);
> > >   double t2 = __builtin_erfc (x);
> > >   return g(t1, t2);
> > > }
> > >
> > > with .optimized dump shows:
> > >   t1_2 = __builtin_erf (x_1(D));
> > >   t2_3 = 1.0e+0 - t1_2;
> > >
> > > However, for the following test:
> > > double f(double x)
> > > {
> > >   double g(double, double);
> > >
> > >   double t1 = __builtin_erfc (x);
> > >   return t1;
> > > }
> > >
> > > It canonicalizes erfc(x) to 1 - erf(x), but does not transform 1 -
> > > erf(x) to erfc(x) again
> > > post canonicalization.
> > > -fdump-tree-folding shows that 1 - erf(x) --> erfc(x) gets applied,
> > > but then it tries to
> > > resimplify erfc(x), which fails post canonicalization. So we end up
> > > with erfc(x) transformed to
> > > 1 - erf(x) in .optimized dump, which I suppose isn't ideal.
> > > Could you suggest how to proceed ?
> >
> > I applied your patch manually and it does the intended
> > simplifications so I wonder what I am missing?
> Would it be OK to always fold erfc(x) -> 1 - erf(x) even when there's
> no erf(x) in the source ?

I do think it's reasonable to expect erfc to be available when erf
is and vice versa but note both are C99 specified functions (either
requires -lm).

Richard.

> So for the following test:
> double f(double x)
> {
>   t1 = __builtin_erfc(x)
>   return t1;
> }
> 
> .optimized dump shows:
> double f (double x)
> {
>   double t1;
>   double _2;
> 
>[local count: 1073741824]:
>   _2 = __builtin_erf (x_1(D));
>   t1_3 = 1.0e+0 - _2;
>   return t1_3;
> }
> 
> while before patch, it has:
>   t1_4 = __builtin_erfc (x_2(D)); [tail call]
>   return t1_4;
> 
> Thanks,
> Prathamesh
> 
> >
> > Richard.
> >
> > > Thanks,
> > > Prathamesh
> > >
> >
> > --
> > Richard Biener 
> > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> > Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)


Re: [match.pd] PR83750 - CSE erf/erfc pair

2021-10-18 Thread Prathamesh Kulkarni via Gcc-patches
On Mon, 18 Oct 2021 at 16:18, Richard Biener  wrote:
>
> On Mon, 18 Oct 2021, Prathamesh Kulkarni wrote:
>
> > Hi Richard,
> > As suggested in PR, I have attached WIP patch that adds two patterns
> > to match.pd:
> > erfc(x) --> 1 - erf(x) if canonicalize_math_p() and,
> > 1 - erf(x) --> erfc(x) if !canonicalize_math_p().
> >
> > This works to remove call to erfc for the following test:
> > double f(double x)
> > {
> >   double g(double, double);
> >
> >   double t1 = __builtin_erf (x);
> >   double t2 = __builtin_erfc (x);
> >   return g(t1, t2);
> > }
> >
> > with .optimized dump shows:
> >   t1_2 = __builtin_erf (x_1(D));
> >   t2_3 = 1.0e+0 - t1_2;
> >
> > However, for the following test:
> > double f(double x)
> > {
> >   double g(double, double);
> >
> >   double t1 = __builtin_erfc (x);
> >   return t1;
> > }
> >
> > It canonicalizes erfc(x) to 1 - erf(x), but does not transform 1 -
> > erf(x) to erfc(x) again
> > post canonicalization.
> > -fdump-tree-folding shows that 1 - erf(x) --> erfc(x) gets applied,
> > but then it tries to
> > resimplify erfc(x), which fails post canonicalization. So we end up
> > with erfc(x) transformed to
> > 1 - erf(x) in .optimized dump, which I suppose isn't ideal.
> > Could you suggest how to proceed ?
>
> I applied your patch manually and it does the intended
> simplifications so I wonder what I am missing?
Would it be OK to always fold erfc(x) -> 1 - erf(x) even when there's
no erf(x) in the source ?
So for the following test:
double f(double x)
{
  t1 = __builtin_erfc(x)
  return t1;
}

.optimized dump shows:
double f (double x)
{
  double t1;
  double _2;

   [local count: 1073741824]:
  _2 = __builtin_erf (x_1(D));
  t1_3 = 1.0e+0 - _2;
  return t1_3;
}

while before patch, it has:
  t1_4 = __builtin_erfc (x_2(D)); [tail call]
  return t1_4;

Thanks,
Prathamesh

>
> Richard.
>
> > Thanks,
> > Prathamesh
> >
>
> --
> Richard Biener 
> SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)


Re: [match.pd] PR83750 - CSE erf/erfc pair

2021-10-18 Thread Richard Biener via Gcc-patches
On Mon, 18 Oct 2021, Prathamesh Kulkarni wrote:

> Hi Richard,
> As suggested in PR, I have attached WIP patch that adds two patterns
> to match.pd:
> erfc(x) --> 1 - erf(x) if canonicalize_math_p() and,
> 1 - erf(x) --> erfc(x) if !canonicalize_math_p().
> 
> This works to remove call to erfc for the following test:
> double f(double x)
> {
>   double g(double, double);
> 
>   double t1 = __builtin_erf (x);
>   double t2 = __builtin_erfc (x);
>   return g(t1, t2);
> }
> 
> with .optimized dump shows:
>   t1_2 = __builtin_erf (x_1(D));
>   t2_3 = 1.0e+0 - t1_2;
> 
> However, for the following test:
> double f(double x)
> {
>   double g(double, double);
> 
>   double t1 = __builtin_erfc (x);
>   return t1;
> }
> 
> It canonicalizes erfc(x) to 1 - erf(x), but does not transform 1 -
> erf(x) to erfc(x) again
> post canonicalization.
> -fdump-tree-folding shows that 1 - erf(x) --> erfc(x) gets applied,
> but then it tries to
> resimplify erfc(x), which fails post canonicalization. So we end up
> with erfc(x) transformed to
> 1 - erf(x) in .optimized dump, which I suppose isn't ideal.
> Could you suggest how to proceed ?

I applied your patch manually and it does the intended
simplifications so I wonder what I am missing?

Richard.

> Thanks,
> Prathamesh
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)


[match.pd] PR83750 - CSE erf/erfc pair

2021-10-18 Thread Prathamesh Kulkarni via Gcc-patches
Hi Richard,
As suggested in PR, I have attached WIP patch that adds two patterns
to match.pd:
erfc(x) --> 1 - erf(x) if canonicalize_math_p() and,
1 - erf(x) --> erfc(x) if !canonicalize_math_p().

This works to remove call to erfc for the following test:
double f(double x)
{
  double g(double, double);

  double t1 = __builtin_erf (x);
  double t2 = __builtin_erfc (x);
  return g(t1, t2);
}

with .optimized dump shows:
  t1_2 = __builtin_erf (x_1(D));
  t2_3 = 1.0e+0 - t1_2;

However, for the following test:
double f(double x)
{
  double g(double, double);

  double t1 = __builtin_erfc (x);
  return t1;
}

It canonicalizes erfc(x) to 1 - erf(x), but does not transform 1 -
erf(x) to erfc(x) again
post canonicalization.
-fdump-tree-folding shows that 1 - erf(x) --> erfc(x) gets applied,
but then it tries to
resimplify erfc(x), which fails post canonicalization. So we end up
with erfc(x) transformed to
1 - erf(x) in .optimized dump, which I suppose isn't ideal.
Could you suggest how to proceed ?

Thanks,
Prathamesh
diff --git a/gcc/match.pd b/gcc/match.pd
index a9791ceb74a..217e46ff12c 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -6147,6 +6147,20 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
(floors tree_expr_nonnegative_p@0)
(truncs @0
 
+/* Simplify,
+   erfc(x) -> 1 - erf(x) if canonicalize_math_p().
+   1 - erf(x) -> erfc(x) if !canonicalize_math_p().  */
+
+(if (flag_unsafe_math_optimizations)
+ (simplify
+  (ERFC @0)
+   (if (canonicalize_math_p ())
+(minus { build_one_cst (TREE_TYPE (@0)); } (ERF @0
+ (simplify
+  (minus real_onep (ERF @0))
+  (if (!canonicalize_math_p ())
+   (ERFC @0
+
 (match double_value_p
  @0
  (if (TYPE_MAIN_VARIANT (TREE_TYPE (@0)) == double_type_node)))


Re: PR83750: CSE erf/erfc pair

2018-11-09 Thread Richard Biener
On Fri, Nov 9, 2018 at 8:11 AM Prathamesh Kulkarni
 wrote:
>
> On Tue, 6 Nov 2018 at 16:04, Richard Biener  
> wrote:
> >
> > On Mon, Nov 5, 2018 at 3:11 PM Prathamesh Kulkarni
> >  wrote:
> > >
> > > On Mon, 5 Nov 2018 at 18:14, Richard Biener  
> > > wrote:
> > > >
> > > > On Mon, Nov 5, 2018 at 1:11 PM Prathamesh Kulkarni
> > > >  wrote:
> > > > >
> > > > > On Mon, 5 Nov 2018 at 15:10, Richard Biener 
> > > > >  wrote:
> > > > > >
> > > > > > On Fri, Nov 2, 2018 at 10:37 AM Prathamesh Kulkarni
> > > > > >  wrote:
> > > > > > >
> > > > > > > Hi,
> > > > > > > This patch adds two transforms to match.pd to CSE erf/erfc pair.
> > > > > > > erfc(x) is canonicalized to 1 - erf(x) and is then reversed to 1 -
> > > > > > > erf(x) when canonicalization is disabled and result of erf(x) has
> > > > > > > single use within 1 - erf(x).
> > > > > > >
> > > > > > > The patch regressed builtin-nonneg-1.c. The following test-case
> > > > > > > reproduces the issue with patch:
> > > > > > >
> > > > > > > void test(double d1) {
> > > > > > >   if (signbit(erfc(d1)))
> > > > > > > link_failure_erfc();
> > > > > > > }
> > > > > > >
> > > > > > > ssa dump:
> > > > > > >
> > > > > > >:
> > > > > > >   _5 = __builtin_erf (d1_4(D));
> > > > > > >   _1 = 1.0e+0 - _5;
> > > > > > >   _6 = _1 < 0.0;
> > > > > > >   _2 = (int) _6;
> > > > > > >   if (_2 != 0)
> > > > > > > goto ; [INV]
> > > > > > >   else
> > > > > > > goto ; [INV]
> > > > > > >
> > > > > > >:
> > > > > > >   link_failure_erfc ();
> > > > > > >
> > > > > > >:
> > > > > > >   return;
> > > > > > >
> > > > > > > As can be seen, erfc(d1) is folded to 1 - erf(d1).
> > > > > > > forwprop then transforms the if condition from _2 != 0
> > > > > > > to _5 > 1.0e+0 and that defeats DCE thus resulting in link failure
> > > > > > > in undefined reference to link_failure_erfc().
> > > > > > >
> > > > > > > So, the patch adds another transform erf(x) > 1 -> 0.
> > > > > >
> > > > > > Ick.
> > > > > >
> > > > > > Why not canonicalize erf (x) to 1-erfc(x) instead?
> > > > > Sorry I didn't quite follow, won't this cause similar issue with erf ?
> > > > > I changed the pattern to canonicalize erf(x) -> 1 - erfc(x)
> > > > > and 1 - erfc(x) -> erf(x) after canonicalization is disabled.
> > > > >
> > > > > This caused undefined reference to link_failure_erf() in following 
> > > > > test-case:
> > > > >
> > > > > extern int signbit(double);
> > > > > extern void link_failure_erf(void);
> > > > > extern double erf(double);
> > > > >
> > > > > void test(double d1) {
> > > > >   if (signbit(erf(d1)))
> > > > > link_failure_erf();
> > > > > }
> > > >
> > > > But that's already not optimized without any canonicalization
> > > > because erf returns sth in range [-1, 1].
> > > >
> > > > I suggested the change because we have limited support for FP
> > > > value-ranges and nonnegative is one thing we can compute
> > > > (and erfc as opposed to erf is nonnegative).
> > > Ah right, thanks for the explanation.
> > > Unfortunately this still regresses builtin-nonneg-1.c, which can be
> > > reproduced with following test-case:
> > >
> > > extern int signbit(double);
> > > extern void link_failure_erf(void);
> > > extern double erf(double);
> > > extern double fabs(double);
> > >
> > > void test(double d1) {
> > >   if (signbit(erf(fabs(d1
> > > link_failure_erf();
> > > }
> > >
> > > signbit(erf(fabs(d1)) is transformed to 0 without patch but with patch
> > > it gets canonicalized to signbit(1 - erfc(fabs(d1))) which similarly
> > > defeats DCE.
> > >
> > > forwprop1 shows:
> > >  :
> > >   _1 = ABS_EXPR ;
> > >   _6 = __builtin_erfc (_1);
> > >   _2 = 1.0e+0 - _6;
> > >   _7 = _6 > 1.0e+0;
> > >   _3 = (int) _7;
> > >   if (_6 > 1.0e+0)
> > > goto ; [INV]
> > >   else
> > > goto ; [INV]
> > >
> > >:
> > >   link_failure_erf ();
> > >
> > >:
> > >   return;
> > >
> > > I assume we would need to somehow tell gcc that the canonicalized
> > > expression 1 - erfc(x) would not exceed 1.0 ?
> > > Is there a better way to do that apart from defining pattern (1 -
> > > erfc(x)) > 1.0 -> 0
> > > which I agree doesn't look ideal to add in match.pd ?
> >
> > You could handle a MINUS_EXPR of 1 and erfc() in
> > gimple_assign_nonnegative_warnv_p (I wouldn't bother
> > to do it for tree_binary_nonnegative_warnv_p)
> >
> Um won't the value range of 1 - erfc(x) be [-1.0, 1.0] (same as
> erf(x)), so would adding it to gimple_assign_nonnegative_warnv_p be
> correct ?

Under the same condition as erf, when the argument is positive.
This is what the testcase is testing.

> erf(fabs(x)) is non-negative and between [0.0, 1.0] although I am not
> sure if 1 - erfc(fabs(x)) would be always non-negative too if
> erfc(fabs(x)) can exceed 1.0 for some value of x ?
> AFAIU for the above case, gcc doesn't know that 1 - erfc(x) cannot
> exceed 1.0, and that's because we don't propagate value range for
> floats as you pointed out.
>
> Thanks,
> Prathamesh
> > This is of 

Re: PR83750: CSE erf/erfc pair

2018-11-08 Thread Prathamesh Kulkarni
On Tue, 6 Nov 2018 at 16:04, Richard Biener  wrote:
>
> On Mon, Nov 5, 2018 at 3:11 PM Prathamesh Kulkarni
>  wrote:
> >
> > On Mon, 5 Nov 2018 at 18:14, Richard Biener  
> > wrote:
> > >
> > > On Mon, Nov 5, 2018 at 1:11 PM Prathamesh Kulkarni
> > >  wrote:
> > > >
> > > > On Mon, 5 Nov 2018 at 15:10, Richard Biener 
> > > >  wrote:
> > > > >
> > > > > On Fri, Nov 2, 2018 at 10:37 AM Prathamesh Kulkarni
> > > > >  wrote:
> > > > > >
> > > > > > Hi,
> > > > > > This patch adds two transforms to match.pd to CSE erf/erfc pair.
> > > > > > erfc(x) is canonicalized to 1 - erf(x) and is then reversed to 1 -
> > > > > > erf(x) when canonicalization is disabled and result of erf(x) has
> > > > > > single use within 1 - erf(x).
> > > > > >
> > > > > > The patch regressed builtin-nonneg-1.c. The following test-case
> > > > > > reproduces the issue with patch:
> > > > > >
> > > > > > void test(double d1) {
> > > > > >   if (signbit(erfc(d1)))
> > > > > > link_failure_erfc();
> > > > > > }
> > > > > >
> > > > > > ssa dump:
> > > > > >
> > > > > >:
> > > > > >   _5 = __builtin_erf (d1_4(D));
> > > > > >   _1 = 1.0e+0 - _5;
> > > > > >   _6 = _1 < 0.0;
> > > > > >   _2 = (int) _6;
> > > > > >   if (_2 != 0)
> > > > > > goto ; [INV]
> > > > > >   else
> > > > > > goto ; [INV]
> > > > > >
> > > > > >:
> > > > > >   link_failure_erfc ();
> > > > > >
> > > > > >:
> > > > > >   return;
> > > > > >
> > > > > > As can be seen, erfc(d1) is folded to 1 - erf(d1).
> > > > > > forwprop then transforms the if condition from _2 != 0
> > > > > > to _5 > 1.0e+0 and that defeats DCE thus resulting in link failure
> > > > > > in undefined reference to link_failure_erfc().
> > > > > >
> > > > > > So, the patch adds another transform erf(x) > 1 -> 0.
> > > > >
> > > > > Ick.
> > > > >
> > > > > Why not canonicalize erf (x) to 1-erfc(x) instead?
> > > > Sorry I didn't quite follow, won't this cause similar issue with erf ?
> > > > I changed the pattern to canonicalize erf(x) -> 1 - erfc(x)
> > > > and 1 - erfc(x) -> erf(x) after canonicalization is disabled.
> > > >
> > > > This caused undefined reference to link_failure_erf() in following 
> > > > test-case:
> > > >
> > > > extern int signbit(double);
> > > > extern void link_failure_erf(void);
> > > > extern double erf(double);
> > > >
> > > > void test(double d1) {
> > > >   if (signbit(erf(d1)))
> > > > link_failure_erf();
> > > > }
> > >
> > > But that's already not optimized without any canonicalization
> > > because erf returns sth in range [-1, 1].
> > >
> > > I suggested the change because we have limited support for FP
> > > value-ranges and nonnegative is one thing we can compute
> > > (and erfc as opposed to erf is nonnegative).
> > Ah right, thanks for the explanation.
> > Unfortunately this still regresses builtin-nonneg-1.c, which can be
> > reproduced with following test-case:
> >
> > extern int signbit(double);
> > extern void link_failure_erf(void);
> > extern double erf(double);
> > extern double fabs(double);
> >
> > void test(double d1) {
> >   if (signbit(erf(fabs(d1
> > link_failure_erf();
> > }
> >
> > signbit(erf(fabs(d1)) is transformed to 0 without patch but with patch
> > it gets canonicalized to signbit(1 - erfc(fabs(d1))) which similarly
> > defeats DCE.
> >
> > forwprop1 shows:
> >  :
> >   _1 = ABS_EXPR ;
> >   _6 = __builtin_erfc (_1);
> >   _2 = 1.0e+0 - _6;
> >   _7 = _6 > 1.0e+0;
> >   _3 = (int) _7;
> >   if (_6 > 1.0e+0)
> > goto ; [INV]
> >   else
> > goto ; [INV]
> >
> >:
> >   link_failure_erf ();
> >
> >:
> >   return;
> >
> > I assume we would need to somehow tell gcc that the canonicalized
> > expression 1 - erfc(x) would not exceed 1.0 ?
> > Is there a better way to do that apart from defining pattern (1 -
> > erfc(x)) > 1.0 -> 0
> > which I agree doesn't look ideal to add in match.pd ?
>
> You could handle a MINUS_EXPR of 1 and erfc() in
> gimple_assign_nonnegative_warnv_p (I wouldn't bother
> to do it for tree_binary_nonnegative_warnv_p)
>
Um won't the value range of 1 - erfc(x) be [-1.0, 1.0] (same as
erf(x)), so would adding it to gimple_assign_nonnegative_warnv_p be
correct ?
erf(fabs(x)) is non-negative and between [0.0, 1.0] although I am not
sure if 1 - erfc(fabs(x)) would be always non-negative too if
erfc(fabs(x)) can exceed 1.0 for some value of x ?
AFAIU for the above case, gcc doesn't know that 1 - erfc(x) cannot
exceed 1.0, and that's because we don't propagate value range for
floats as you pointed out.

Thanks,
Prathamesh
> This is of course similarly "lame" but a bit cleaner than
> a match.pd pattern that just works for the comparison.
>
> In reality the proper long-term fix is to add basic range
> propagation to floats.
>
> Richard.
>
> >
> > Thanks
> > Prathamesh
> > >
> > > > forwprop1 shows:
> > > >
> > > > :
> > > >   _5 = __builtin_erfc (d1_4(D));
> > > >   _1 = 1.0e+0 - _5;
> > > >   _6 = _5 > 1.0e+0;
> > > >   _2 = (int) _6;
> > > >   if (_5 > 1.0e+0)
> > > > 

Re: PR83750: CSE erf/erfc pair

2018-11-06 Thread Richard Biener
On Mon, Nov 5, 2018 at 3:11 PM Prathamesh Kulkarni
 wrote:
>
> On Mon, 5 Nov 2018 at 18:14, Richard Biener  
> wrote:
> >
> > On Mon, Nov 5, 2018 at 1:11 PM Prathamesh Kulkarni
> >  wrote:
> > >
> > > On Mon, 5 Nov 2018 at 15:10, Richard Biener  
> > > wrote:
> > > >
> > > > On Fri, Nov 2, 2018 at 10:37 AM Prathamesh Kulkarni
> > > >  wrote:
> > > > >
> > > > > Hi,
> > > > > This patch adds two transforms to match.pd to CSE erf/erfc pair.
> > > > > erfc(x) is canonicalized to 1 - erf(x) and is then reversed to 1 -
> > > > > erf(x) when canonicalization is disabled and result of erf(x) has
> > > > > single use within 1 - erf(x).
> > > > >
> > > > > The patch regressed builtin-nonneg-1.c. The following test-case
> > > > > reproduces the issue with patch:
> > > > >
> > > > > void test(double d1) {
> > > > >   if (signbit(erfc(d1)))
> > > > > link_failure_erfc();
> > > > > }
> > > > >
> > > > > ssa dump:
> > > > >
> > > > >:
> > > > >   _5 = __builtin_erf (d1_4(D));
> > > > >   _1 = 1.0e+0 - _5;
> > > > >   _6 = _1 < 0.0;
> > > > >   _2 = (int) _6;
> > > > >   if (_2 != 0)
> > > > > goto ; [INV]
> > > > >   else
> > > > > goto ; [INV]
> > > > >
> > > > >:
> > > > >   link_failure_erfc ();
> > > > >
> > > > >:
> > > > >   return;
> > > > >
> > > > > As can be seen, erfc(d1) is folded to 1 - erf(d1).
> > > > > forwprop then transforms the if condition from _2 != 0
> > > > > to _5 > 1.0e+0 and that defeats DCE thus resulting in link failure
> > > > > in undefined reference to link_failure_erfc().
> > > > >
> > > > > So, the patch adds another transform erf(x) > 1 -> 0.
> > > >
> > > > Ick.
> > > >
> > > > Why not canonicalize erf (x) to 1-erfc(x) instead?
> > > Sorry I didn't quite follow, won't this cause similar issue with erf ?
> > > I changed the pattern to canonicalize erf(x) -> 1 - erfc(x)
> > > and 1 - erfc(x) -> erf(x) after canonicalization is disabled.
> > >
> > > This caused undefined reference to link_failure_erf() in following 
> > > test-case:
> > >
> > > extern int signbit(double);
> > > extern void link_failure_erf(void);
> > > extern double erf(double);
> > >
> > > void test(double d1) {
> > >   if (signbit(erf(d1)))
> > > link_failure_erf();
> > > }
> >
> > But that's already not optimized without any canonicalization
> > because erf returns sth in range [-1, 1].
> >
> > I suggested the change because we have limited support for FP
> > value-ranges and nonnegative is one thing we can compute
> > (and erfc as opposed to erf is nonnegative).
> Ah right, thanks for the explanation.
> Unfortunately this still regresses builtin-nonneg-1.c, which can be
> reproduced with following test-case:
>
> extern int signbit(double);
> extern void link_failure_erf(void);
> extern double erf(double);
> extern double fabs(double);
>
> void test(double d1) {
>   if (signbit(erf(fabs(d1
> link_failure_erf();
> }
>
> signbit(erf(fabs(d1)) is transformed to 0 without patch but with patch
> it gets canonicalized to signbit(1 - erfc(fabs(d1))) which similarly
> defeats DCE.
>
> forwprop1 shows:
>  :
>   _1 = ABS_EXPR ;
>   _6 = __builtin_erfc (_1);
>   _2 = 1.0e+0 - _6;
>   _7 = _6 > 1.0e+0;
>   _3 = (int) _7;
>   if (_6 > 1.0e+0)
> goto ; [INV]
>   else
> goto ; [INV]
>
>:
>   link_failure_erf ();
>
>:
>   return;
>
> I assume we would need to somehow tell gcc that the canonicalized
> expression 1 - erfc(x) would not exceed 1.0 ?
> Is there a better way to do that apart from defining pattern (1 -
> erfc(x)) > 1.0 -> 0
> which I agree doesn't look ideal to add in match.pd ?

You could handle a MINUS_EXPR of 1 and erfc() in
gimple_assign_nonnegative_warnv_p (I wouldn't bother
to do it for tree_binary_nonnegative_warnv_p)

This is of course similarly "lame" but a bit cleaner than
a match.pd pattern that just works for the comparison.

In reality the proper long-term fix is to add basic range
propagation to floats.

Richard.

>
> Thanks
> Prathamesh
> >
> > > forwprop1 shows:
> > >
> > > :
> > >   _5 = __builtin_erfc (d1_4(D));
> > >   _1 = 1.0e+0 - _5;
> > >   _6 = _5 > 1.0e+0;
> > >   _2 = (int) _6;
> > >   if (_5 > 1.0e+0)
> > > goto ; [INV]
> > >   else
> > > goto ; [INV]
> > >
> > >:
> > >   link_failure_erf ();
> > >
> > >:
> > >   return;
> > >
> > > which defeats DCE to remove call to link_failure_erf.
> > >
> > > Thanks,
> > > Prathamesh
> > > >
> > > > > which resolves the regression.
> > > > >
> > > > > Bootstrapped+tested on x86_64-unknown-linux-gnu.
> > > > > Cross-testing on arm and aarch64 variants in progress.
> > > > > OK for trunk if passes ?
> > > > >
> > > > > Thanks,
> > > > > Prathamesh


Re: PR83750: CSE erf/erfc pair

2018-11-05 Thread Paul Koning



> On Nov 5, 2018, at 1:48 PM, Michael Matz  wrote:
> 
> Hi,
> 
> On Mon, 5 Nov 2018, Jeff Law wrote:
> 
 Don't we have a flag specific to honoring nans?  Would that be better 
 to use than flag_unsafe_math_optimizations?  As Uli mentioned, 
 there's
>>> 
>>> That's only relevant for the comparison optimization, of course.
>>> 
>>> Converting erfc to 1-erf is dubious, since the whole point of erfc is 
>>> for cases where 1-erf is inaccurate.  (Conversion in the other 
>>> direction also needs flag_unsafe_math_optimizations.)
>>> 
>> Understood.  Thanks for clarifying.  It seems like 
>> unsafe-math-optimization is a better fit than the nan specific flag.
> 
> But still we should consider general usefullness, even with unsafe-math.  
> In this case we would remove a usage of a slow function that the user 
> specifically used to deal with inaccuracies with an equally slow function 
> (plus a little arithmetic that is shadows by the functions slowness) that 
> now exacly produces the inaccuracies the user wanted to avoid.  I.e. the 
> speed gain is zero.  The "canonicalization gain" referred to in the PR 
> might be real, but it comes at the cost of introducing definite 
> catastrophic cancellation.
> 
> IMHO that's not a sensible transformation to do, under any flags.

That seems right.  The same goes for log vs. logp1, and exp vs. expm1.

paul



Re: PR83750: CSE erf/erfc pair

2018-11-05 Thread Michael Matz
Hi,

On Mon, 5 Nov 2018, Jeff Law wrote:

> >> Don't we have a flag specific to honoring nans?  Would that be better 
> >> to use than flag_unsafe_math_optimizations?  As Uli mentioned, 
> >> there's
> > 
> > That's only relevant for the comparison optimization, of course.
> > 
> > Converting erfc to 1-erf is dubious, since the whole point of erfc is 
> > for cases where 1-erf is inaccurate.  (Conversion in the other 
> > direction also needs flag_unsafe_math_optimizations.)
> > 
> Understood.  Thanks for clarifying.  It seems like 
> unsafe-math-optimization is a better fit than the nan specific flag.

But still we should consider general usefullness, even with unsafe-math.  
In this case we would remove a usage of a slow function that the user 
specifically used to deal with inaccuracies with an equally slow function 
(plus a little arithmetic that is shadows by the functions slowness) that 
now exacly produces the inaccuracies the user wanted to avoid.  I.e. the 
speed gain is zero.  The "canonicalization gain" referred to in the PR 
might be real, but it comes at the cost of introducing definite 
catastrophic cancellation.

IMHO that's not a sensible transformation to do, under any flags.


Ciao,
Michael.


Re: PR83750: CSE erf/erfc pair

2018-11-05 Thread Jeff Law
On 11/5/18 11:27 AM, Joseph Myers wrote:
> On Sun, 4 Nov 2018, Jeff Law wrote:
> 
>> Don't we have a flag specific to honoring nans?  Would that be better to
>> use than flag_unsafe_math_optimizations?  As Uli mentioned, there's
> 
> That's only relevant for the comparison optimization, of course.
> 
> Converting erfc to 1-erf is dubious, since the whole point of erfc is for 
> cases where 1-erf is inaccurate.  (Conversion in the other direction also 
> needs flag_unsafe_math_optimizations.)
> 
Understood.  Thanks for clarifying.  It seems like
unsafe-math-optimization is a better fit than the nan specific flag.

jeff


Re: PR83750: CSE erf/erfc pair

2018-11-05 Thread Joseph Myers
On Sun, 4 Nov 2018, Jeff Law wrote:

> Don't we have a flag specific to honoring nans?  Would that be better to
> use than flag_unsafe_math_optimizations?  As Uli mentioned, there's

That's only relevant for the comparison optimization, of course.

Converting erfc to 1-erf is dubious, since the whole point of erfc is for 
cases where 1-erf is inaccurate.  (Conversion in the other direction also 
needs flag_unsafe_math_optimizations.)

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: PR83750: CSE erf/erfc pair

2018-11-05 Thread Prathamesh Kulkarni
On Mon, 5 Nov 2018 at 18:14, Richard Biener  wrote:
>
> On Mon, Nov 5, 2018 at 1:11 PM Prathamesh Kulkarni
>  wrote:
> >
> > On Mon, 5 Nov 2018 at 15:10, Richard Biener  
> > wrote:
> > >
> > > On Fri, Nov 2, 2018 at 10:37 AM Prathamesh Kulkarni
> > >  wrote:
> > > >
> > > > Hi,
> > > > This patch adds two transforms to match.pd to CSE erf/erfc pair.
> > > > erfc(x) is canonicalized to 1 - erf(x) and is then reversed to 1 -
> > > > erf(x) when canonicalization is disabled and result of erf(x) has
> > > > single use within 1 - erf(x).
> > > >
> > > > The patch regressed builtin-nonneg-1.c. The following test-case
> > > > reproduces the issue with patch:
> > > >
> > > > void test(double d1) {
> > > >   if (signbit(erfc(d1)))
> > > > link_failure_erfc();
> > > > }
> > > >
> > > > ssa dump:
> > > >
> > > >:
> > > >   _5 = __builtin_erf (d1_4(D));
> > > >   _1 = 1.0e+0 - _5;
> > > >   _6 = _1 < 0.0;
> > > >   _2 = (int) _6;
> > > >   if (_2 != 0)
> > > > goto ; [INV]
> > > >   else
> > > > goto ; [INV]
> > > >
> > > >:
> > > >   link_failure_erfc ();
> > > >
> > > >:
> > > >   return;
> > > >
> > > > As can be seen, erfc(d1) is folded to 1 - erf(d1).
> > > > forwprop then transforms the if condition from _2 != 0
> > > > to _5 > 1.0e+0 and that defeats DCE thus resulting in link failure
> > > > in undefined reference to link_failure_erfc().
> > > >
> > > > So, the patch adds another transform erf(x) > 1 -> 0.
> > >
> > > Ick.
> > >
> > > Why not canonicalize erf (x) to 1-erfc(x) instead?
> > Sorry I didn't quite follow, won't this cause similar issue with erf ?
> > I changed the pattern to canonicalize erf(x) -> 1 - erfc(x)
> > and 1 - erfc(x) -> erf(x) after canonicalization is disabled.
> >
> > This caused undefined reference to link_failure_erf() in following 
> > test-case:
> >
> > extern int signbit(double);
> > extern void link_failure_erf(void);
> > extern double erf(double);
> >
> > void test(double d1) {
> >   if (signbit(erf(d1)))
> > link_failure_erf();
> > }
>
> But that's already not optimized without any canonicalization
> because erf returns sth in range [-1, 1].
>
> I suggested the change because we have limited support for FP
> value-ranges and nonnegative is one thing we can compute
> (and erfc as opposed to erf is nonnegative).
Ah right, thanks for the explanation.
Unfortunately this still regresses builtin-nonneg-1.c, which can be
reproduced with following test-case:

extern int signbit(double);
extern void link_failure_erf(void);
extern double erf(double);
extern double fabs(double);

void test(double d1) {
  if (signbit(erf(fabs(d1
link_failure_erf();
}

signbit(erf(fabs(d1)) is transformed to 0 without patch but with patch
it gets canonicalized to signbit(1 - erfc(fabs(d1))) which similarly
defeats DCE.

forwprop1 shows:
 :
  _1 = ABS_EXPR ;
  _6 = __builtin_erfc (_1);
  _2 = 1.0e+0 - _6;
  _7 = _6 > 1.0e+0;
  _3 = (int) _7;
  if (_6 > 1.0e+0)
goto ; [INV]
  else
goto ; [INV]

   :
  link_failure_erf ();

   :
  return;

I assume we would need to somehow tell gcc that the canonicalized
expression 1 - erfc(x) would not exceed 1.0 ?
Is there a better way to do that apart from defining pattern (1 -
erfc(x)) > 1.0 -> 0
which I agree doesn't look ideal to add in match.pd ?

Thanks
Prathamesh
>
> > forwprop1 shows:
> >
> > :
> >   _5 = __builtin_erfc (d1_4(D));
> >   _1 = 1.0e+0 - _5;
> >   _6 = _5 > 1.0e+0;
> >   _2 = (int) _6;
> >   if (_5 > 1.0e+0)
> > goto ; [INV]
> >   else
> > goto ; [INV]
> >
> >:
> >   link_failure_erf ();
> >
> >:
> >   return;
> >
> > which defeats DCE to remove call to link_failure_erf.
> >
> > Thanks,
> > Prathamesh
> > >
> > > > which resolves the regression.
> > > >
> > > > Bootstrapped+tested on x86_64-unknown-linux-gnu.
> > > > Cross-testing on arm and aarch64 variants in progress.
> > > > OK for trunk if passes ?
> > > >
> > > > Thanks,
> > > > Prathamesh


Re: PR83750: CSE erf/erfc pair

2018-11-05 Thread Richard Biener
On Mon, Nov 5, 2018 at 1:11 PM Prathamesh Kulkarni
 wrote:
>
> On Mon, 5 Nov 2018 at 15:10, Richard Biener  
> wrote:
> >
> > On Fri, Nov 2, 2018 at 10:37 AM Prathamesh Kulkarni
> >  wrote:
> > >
> > > Hi,
> > > This patch adds two transforms to match.pd to CSE erf/erfc pair.
> > > erfc(x) is canonicalized to 1 - erf(x) and is then reversed to 1 -
> > > erf(x) when canonicalization is disabled and result of erf(x) has
> > > single use within 1 - erf(x).
> > >
> > > The patch regressed builtin-nonneg-1.c. The following test-case
> > > reproduces the issue with patch:
> > >
> > > void test(double d1) {
> > >   if (signbit(erfc(d1)))
> > > link_failure_erfc();
> > > }
> > >
> > > ssa dump:
> > >
> > >:
> > >   _5 = __builtin_erf (d1_4(D));
> > >   _1 = 1.0e+0 - _5;
> > >   _6 = _1 < 0.0;
> > >   _2 = (int) _6;
> > >   if (_2 != 0)
> > > goto ; [INV]
> > >   else
> > > goto ; [INV]
> > >
> > >:
> > >   link_failure_erfc ();
> > >
> > >:
> > >   return;
> > >
> > > As can be seen, erfc(d1) is folded to 1 - erf(d1).
> > > forwprop then transforms the if condition from _2 != 0
> > > to _5 > 1.0e+0 and that defeats DCE thus resulting in link failure
> > > in undefined reference to link_failure_erfc().
> > >
> > > So, the patch adds another transform erf(x) > 1 -> 0.
> >
> > Ick.
> >
> > Why not canonicalize erf (x) to 1-erfc(x) instead?
> Sorry I didn't quite follow, won't this cause similar issue with erf ?
> I changed the pattern to canonicalize erf(x) -> 1 - erfc(x)
> and 1 - erfc(x) -> erf(x) after canonicalization is disabled.
>
> This caused undefined reference to link_failure_erf() in following test-case:
>
> extern int signbit(double);
> extern void link_failure_erf(void);
> extern double erf(double);
>
> void test(double d1) {
>   if (signbit(erf(d1)))
> link_failure_erf();
> }

But that's already not optimized without any canonicalization
because erf returns sth in range [-1, 1].

I suggested the change because we have limited support for FP
value-ranges and nonnegative is one thing we can compute
(and erfc as opposed to erf is nonnegative).

> forwprop1 shows:
>
> :
>   _5 = __builtin_erfc (d1_4(D));
>   _1 = 1.0e+0 - _5;
>   _6 = _5 > 1.0e+0;
>   _2 = (int) _6;
>   if (_5 > 1.0e+0)
> goto ; [INV]
>   else
> goto ; [INV]
>
>:
>   link_failure_erf ();
>
>:
>   return;
>
> which defeats DCE to remove call to link_failure_erf.
>
> Thanks,
> Prathamesh
> >
> > > which resolves the regression.
> > >
> > > Bootstrapped+tested on x86_64-unknown-linux-gnu.
> > > Cross-testing on arm and aarch64 variants in progress.
> > > OK for trunk if passes ?
> > >
> > > Thanks,
> > > Prathamesh


Re: PR83750: CSE erf/erfc pair

2018-11-05 Thread Prathamesh Kulkarni
On Mon, 5 Nov 2018 at 15:10, Richard Biener  wrote:
>
> On Fri, Nov 2, 2018 at 10:37 AM Prathamesh Kulkarni
>  wrote:
> >
> > Hi,
> > This patch adds two transforms to match.pd to CSE erf/erfc pair.
> > erfc(x) is canonicalized to 1 - erf(x) and is then reversed to 1 -
> > erf(x) when canonicalization is disabled and result of erf(x) has
> > single use within 1 - erf(x).
> >
> > The patch regressed builtin-nonneg-1.c. The following test-case
> > reproduces the issue with patch:
> >
> > void test(double d1) {
> >   if (signbit(erfc(d1)))
> > link_failure_erfc();
> > }
> >
> > ssa dump:
> >
> >:
> >   _5 = __builtin_erf (d1_4(D));
> >   _1 = 1.0e+0 - _5;
> >   _6 = _1 < 0.0;
> >   _2 = (int) _6;
> >   if (_2 != 0)
> > goto ; [INV]
> >   else
> > goto ; [INV]
> >
> >:
> >   link_failure_erfc ();
> >
> >:
> >   return;
> >
> > As can be seen, erfc(d1) is folded to 1 - erf(d1).
> > forwprop then transforms the if condition from _2 != 0
> > to _5 > 1.0e+0 and that defeats DCE thus resulting in link failure
> > in undefined reference to link_failure_erfc().
> >
> > So, the patch adds another transform erf(x) > 1 -> 0.
>
> Ick.
>
> Why not canonicalize erf (x) to 1-erfc(x) instead?
Sorry I didn't quite follow, won't this cause similar issue with erf ?
I changed the pattern to canonicalize erf(x) -> 1 - erfc(x)
and 1 - erfc(x) -> erf(x) after canonicalization is disabled.

This caused undefined reference to link_failure_erf() in following test-case:

extern int signbit(double);
extern void link_failure_erf(void);
extern double erf(double);

void test(double d1) {
  if (signbit(erf(d1)))
link_failure_erf();
}

forwprop1 shows:

:
  _5 = __builtin_erfc (d1_4(D));
  _1 = 1.0e+0 - _5;
  _6 = _5 > 1.0e+0;
  _2 = (int) _6;
  if (_5 > 1.0e+0)
goto ; [INV]
  else
goto ; [INV]

   :
  link_failure_erf ();

   :
  return;

which defeats DCE to remove call to link_failure_erf.

Thanks,
Prathamesh
>
> > which resolves the regression.
> >
> > Bootstrapped+tested on x86_64-unknown-linux-gnu.
> > Cross-testing on arm and aarch64 variants in progress.
> > OK for trunk if passes ?
> >
> > Thanks,
> > Prathamesh


Re: PR83750: CSE erf/erfc pair

2018-11-05 Thread Richard Biener
On Fri, Nov 2, 2018 at 10:37 AM Prathamesh Kulkarni
 wrote:
>
> Hi,
> This patch adds two transforms to match.pd to CSE erf/erfc pair.
> erfc(x) is canonicalized to 1 - erf(x) and is then reversed to 1 -
> erf(x) when canonicalization is disabled and result of erf(x) has
> single use within 1 - erf(x).
>
> The patch regressed builtin-nonneg-1.c. The following test-case
> reproduces the issue with patch:
>
> void test(double d1) {
>   if (signbit(erfc(d1)))
> link_failure_erfc();
> }
>
> ssa dump:
>
>:
>   _5 = __builtin_erf (d1_4(D));
>   _1 = 1.0e+0 - _5;
>   _6 = _1 < 0.0;
>   _2 = (int) _6;
>   if (_2 != 0)
> goto ; [INV]
>   else
> goto ; [INV]
>
>:
>   link_failure_erfc ();
>
>:
>   return;
>
> As can be seen, erfc(d1) is folded to 1 - erf(d1).
> forwprop then transforms the if condition from _2 != 0
> to _5 > 1.0e+0 and that defeats DCE thus resulting in link failure
> in undefined reference to link_failure_erfc().
>
> So, the patch adds another transform erf(x) > 1 -> 0.

Ick.

Why not canonicalize erf (x) to 1-erfc(x) instead?

> which resolves the regression.
>
> Bootstrapped+tested on x86_64-unknown-linux-gnu.
> Cross-testing on arm and aarch64 variants in progress.
> OK for trunk if passes ?
>
> Thanks,
> Prathamesh


Re: PR83750: CSE erf/erfc pair

2018-11-04 Thread Jeff Law
On 11/2/18 3:36 AM, Prathamesh Kulkarni wrote:
> Hi,
> This patch adds two transforms to match.pd to CSE erf/erfc pair.
> erfc(x) is canonicalized to 1 - erf(x) and is then reversed to 1 -
> erf(x) when canonicalization is disabled and result of erf(x) has
> single use within 1 - erf(x).
> 
> The patch regressed builtin-nonneg-1.c. The following test-case
> reproduces the issue with patch:
> 
> void test(double d1) {
>   if (signbit(erfc(d1)))
> link_failure_erfc();
> }
> 
> ssa dump:
> 
>:
>   _5 = __builtin_erf (d1_4(D));
>   _1 = 1.0e+0 - _5;
>   _6 = _1 < 0.0;
>   _2 = (int) _6;
>   if (_2 != 0)
> goto ; [INV]
>   else
> goto ; [INV]
> 
>:
>   link_failure_erfc ();
> 
>:
>   return;
> 
> As can be seen, erfc(d1) is folded to 1 - erf(d1).
> forwprop then transforms the if condition from _2 != 0
> to _5 > 1.0e+0 and that defeats DCE thus resulting in link failure
> in undefined reference to link_failure_erfc().
> 
> So, the patch adds another transform erf(x) > 1 -> 0
> which resolves the regression.
> 
> Bootstrapped+tested on x86_64-unknown-linux-gnu.
> Cross-testing on arm and aarch64 variants in progress.
> OK for trunk if passes ?
> 
> Thanks,
> Prathamesh
> 
> 
> pr83750-4.txt
> 
> 2018-11-02  Prathamesh Kulkarni  
> 
>   * match.pd (erfc(x) -> 1 - erf(x)): New pattern.
>   (1 - erf(x) -> erfc(x)): Likewise.
>   (erf(x) > 1 -> 0): Likewise.
> 
> testsuite/
>   * gcc.dg/tree-ssa/pr83750-1.c: New test
>   * gcc.dg/tree-ssa/pr83750-2.c: Likewise.
Don't we have a flag specific to honoring nans?  Would that be better to
use than flag_unsafe_math_optimizations?  As Uli mentioned, there's
other cases (where ABS (const) >= 1.0.).

jeff


Re: PR83750: CSE erf/erfc pair

2018-11-02 Thread Ulrich Drepper
On Fri, Nov 2, 2018 at 10:36 AM Prathamesh Kulkarni
 wrote:
> So, the patch adds another transform erf(x) > 1 -> 0
> which resolves the regression.

Why don't you match for any constant with absolute value >= 1.0
instead of just 1.0?


PR83750: CSE erf/erfc pair

2018-11-02 Thread Prathamesh Kulkarni
Hi,
This patch adds two transforms to match.pd to CSE erf/erfc pair.
erfc(x) is canonicalized to 1 - erf(x) and is then reversed to 1 -
erf(x) when canonicalization is disabled and result of erf(x) has
single use within 1 - erf(x).

The patch regressed builtin-nonneg-1.c. The following test-case
reproduces the issue with patch:

void test(double d1) {
  if (signbit(erfc(d1)))
link_failure_erfc();
}

ssa dump:

   :
  _5 = __builtin_erf (d1_4(D));
  _1 = 1.0e+0 - _5;
  _6 = _1 < 0.0;
  _2 = (int) _6;
  if (_2 != 0)
goto ; [INV]
  else
goto ; [INV]

   :
  link_failure_erfc ();

   :
  return;

As can be seen, erfc(d1) is folded to 1 - erf(d1).
forwprop then transforms the if condition from _2 != 0
to _5 > 1.0e+0 and that defeats DCE thus resulting in link failure
in undefined reference to link_failure_erfc().

So, the patch adds another transform erf(x) > 1 -> 0
which resolves the regression.

Bootstrapped+tested on x86_64-unknown-linux-gnu.
Cross-testing on arm and aarch64 variants in progress.
OK for trunk if passes ?

Thanks,
Prathamesh
2018-11-02  Prathamesh Kulkarni  

* match.pd (erfc(x) -> 1 - erf(x)): New pattern.
(1 - erf(x) -> erfc(x)): Likewise.
(erf(x) > 1 -> 0): Likewise.

testsuite/
* gcc.dg/tree-ssa/pr83750-1.c: New test
* gcc.dg/tree-ssa/pr83750-2.c: Likewise.

diff --git a/gcc/match.pd b/gcc/match.pd
index d07ceb7d087..03e9230a579 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -4490,7 +4490,28 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
(if (targetm.libc_has_function (function_c99_math_complex))
 (complex
  (mult (exps@1 (realpart @0)) (realpart (cexpis:type@2 (imagpart @0
- (mult @1 (imagpart @2)))
+ (mult @1 (imagpart @2))
+
+
+ /* Canonicalize erfc(x) -> 1 - erf(x) */
+ (simplify
+  (ERFC @0)
+   (minus { build_one_cst (TREE_TYPE (@0)); } (ERF @0
+
+(if (flag_unsafe_math_optimizations
+ && !canonicalize_math_p())
+
+ /* 1 - erf(x) -> erfc(x)
+This is only done if result of erf() has single use in 1 - erf(x). */
+ (simplify
+  (minus real_onep (ERF@1 @0))
+   (if (single_use (@1))
+(ERFC @0)))
+
+ /* erf(x) > 1 -> 0 */
+ (simplify
+  (gt (ERF @0) real_onep)
+  { integer_zero_node; }))
 
 (if (canonicalize_math_p ())
  /* floor(x) -> trunc(x) if x is nonnegative.  */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr83750-1.c 
b/gcc/testsuite/gcc.dg/tree-ssa/pr83750-1.c
new file mode 100644
index 000..c4d3e428f15
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr83750-1.c
@@ -0,0 +1,32 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target c99_runtime } */
+/* { dg-options "-O2 -ffast-math -fdump-tree-optimized" } */
+
+float f1(float x)
+{
+  float g1(float, float);
+
+  float r = __builtin_erff (x);
+  float t = __builtin_erfcf (x);
+  return g1 (r, t);
+}
+
+double f2(double x)
+{
+  double g2(double, double);
+
+  double r = __builtin_erf (x);
+  double t = __builtin_erfc (x);
+  return g2 (r, t);
+}
+
+long double f3(long double x)
+{
+  long double g3(long double, long double);
+
+  long double r = __builtin_erfl (x);
+  long double t = __builtin_erfcl (x);
+  return g3(r, t);
+}
+
+/* { dg-final { scan-tree-dump-not "erfc" "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr83750-2.c 
b/gcc/testsuite/gcc.dg/tree-ssa/pr83750-2.c
new file mode 100644
index 000..60417b38681
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr83750-2.c
@@ -0,0 +1,22 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target c99_runtime } */
+/* { dg-options "-O2 -ffast-math -fdump-tree-optimized" } */
+
+/* Check that the canonicalized form 1 - erf(x) is folded to erfc(x). */
+
+float f1(float x)
+{
+  return __builtin_erfcf (x);
+}
+
+double f2(double x)
+{
+  return __builtin_erfc (x);
+}
+
+long double f3(long double x)
+{
+  return __builtin_erfcl (x); 
+}
+
+/* { dg-final { scan-tree-dump-times "erfc" 3 "optimized" } } */