Re: [PATCH][RFC] Early phiopt pass

2018-10-23 Thread Richard Biener
On Mon, 22 Oct 2018, Richard Biener wrote:

> On Wed, 29 Aug 2018, Richard Biener wrote:
> 
> > On Wed, 29 Aug 2018, Jeff Law wrote:
> > 
> > > On 08/29/2018 04:56 AM, Richard Biener wrote:
> > > > 
> > > > In response to PR87105 I dusted off an old patch that adds an early
> > > > phiopt pass.  Recognizing we run phiopt twice with not many passes
> > > > in between early in post-IPA optimizations this patch moves the
> > > > first of said to the early pipeline.
> > > > 
> > > > The main motivation is to do things like MIN/MAX_EXPR early to
> > > > avoid jump threading mess up the CFG (the case with PR87105).
> > > > I realize theres early backward threading before the new early
> > > > phiopt pass but that doesn't seem to do anything useful there (yet).
> > > > I think it makes sense to push that later anyways.
> > > > 
> > > > Now, early phiopt is quite confused about predict stmts still
> > > > being present and turning conditional BBs into diamonds which it
> > > > cannot handle.  I've fixed at least stray such stmts in the BBs
> > > > that are interesting.  Note this may hide fallout which would otherwise
> > > > be visible in the testsuite (there's no flag to avoid
> > > > generating the predictors - they are emitted directly by the frontends,
> > > > maybe we could drop them with -fno[-guess]-branch-probabilities at
> > > > gimplification time?).
> > > > 
> > > > There's also an effect on ifcombine which, when preceeded by phiopt,
> > > > can miss cases because phiopt may value-replace some condition.
> > > > 
> > > > The patch contains adjustments to testcases where there's no harm done
> > > > in the end and leaves those FAILing where we would need to do sth.
> > > > 
> > > > In the end it's regular pass-ordering issues but our testsuite very
> > > > often ties our hands when re-ordering passes because of them.
> > > > 
> > > > One option would be to distinguish early from late phiopt and for
> > > > example avoid value-replacement - like just do MIN/MAX recognition
> > > > for the vectorizer.
> > > > 
> > > > Any comments?
> > > > 
> > > > Some detailed notes on the remaining FAILs below.
> > > [ ... ]
> > > I didn't see anything in the testsuite fallout that gave me significant
> > > concern.  If your judgment is that we're better off running it earlier,
> > > then let's do it.
> > 
> > I guess so.  I'll add an early variant anyway since we don't want to
> > do adjacent load hoisting early.  I'll see how difficult it is to handle
> > ifcombine merging with straight-line code or catch those cases elsewhere.
> 
> So I'm finally returning to this...
> 
> I failed to find a sequence of VRP (jump threading), ifcombine and phiopt
> that avoids regressions (well, the current order works of course).  So
> instead of moving the first phiopt pass I am now _adding_ a phiopt
> pass early doing _only_ min/max/abs replacement.  Those are wrecked
> by jump-threading if they happen in sequences with some common operands.
> 
> Thus the following patch is now in bootstrap & regtest on 
> x86_64-unknown-linux-gnu.
> 
> If that goes well I plan to install the patch after digging out
> the relevant two-to-three PRs and adding testcases for them.

The following is what I have applied.

Bootstrapped and tested on x86_64-unknown-linux-gnu.

Richard.

>From 83f89c82650bdfa057bde3ef1a236338a8cdd848 Mon Sep 17 00:00:00 2001
From: Richard Guenther 
Date: Tue, 28 Aug 2018 12:53:53 +0200
Subject: [PATCH] early-phiopt

FAIL: g++.dg/predict-loop-exit-2.C -std=gnu++98  scan-tree-dump-times profile_es
timate "loop exit heuristics:" 2

phiopt value-replaces the last "exit" test in

  :
  _5 = foo ();
  if (_5 != 0)
goto ;
  else
goto ;

  :
  g.0_7 = g;
  if (g.0_7 <= 9)
goto ;
  else
goto ;

  :

  :
  # iftmp.3_1 = PHI <1(5), 0(6), 1(4)>
  if (iftmp.3_1 != 0)
goto ;
  else
goto ;

  :
  return;

to look like

  :
  g.0_7 = g;
  _6 = g.0_7 <= 9;

  :
  # iftmp.3_1 = PHI <_6(5), 1(4)>
  if (iftmp.3_1 != 0)
goto ;

which confuses whatever the testcase was supposed to check (there is only
one loop exit).

FAIL: gcc.dg/tree-ssa/ssa-pre-32.c scan-tree-dump pre "# prephitmp_[0-9]+ = PHI 
<[xy]_[0-9]+(D)[^,]*, [xy]_[0-9]+(D)"

phiopt optimizes

:
   if (b_5(D) != 0)
 goto ; [INV]
   else
 goto ; [INV]

:

:
   # iftmp.0_3 = PHI <4294967295(2), 0(3)>
   _1 = iftmp.0_3 & x_8(D);

to

   _9 = (unsigned int) b_5(D);
   _14 = -_9;
   _1 = x_8(D) & _14;

but in the sequence with two same conditions but opposite bits
nothing optimizes this further back to b_5(D) ? x_6(D) : y_7(D).

FAIL: gcc.dg/tree-ssa/ssa-ifcombine-7.c scan-tree-dump ifcombine " > "
FAIL: gcc.dg/tree-ssa/ssa-ifcombine-ccmp-1.c scan-tree-dump optimized "&"
FAIL: gcc.dg/tree-ssa/ssa-ifcombine-ccmp-4.c scan-tree-dump optimized "&"
FAIL: gcc.dg/tree-ssa/ssa-ifcombine-ccmp-5.c scan-tree-dump-times optimized "&" 
2
  \|" 2

early phiopt value-replaces the inner condition which leaves ifcombine
with no work.


Re: [PATCH][RFC] Early phiopt pass

2018-10-22 Thread Richard Biener
On Wed, 29 Aug 2018, Richard Biener wrote:

> On Wed, 29 Aug 2018, Jeff Law wrote:
> 
> > On 08/29/2018 04:56 AM, Richard Biener wrote:
> > > 
> > > In response to PR87105 I dusted off an old patch that adds an early
> > > phiopt pass.  Recognizing we run phiopt twice with not many passes
> > > in between early in post-IPA optimizations this patch moves the
> > > first of said to the early pipeline.
> > > 
> > > The main motivation is to do things like MIN/MAX_EXPR early to
> > > avoid jump threading mess up the CFG (the case with PR87105).
> > > I realize theres early backward threading before the new early
> > > phiopt pass but that doesn't seem to do anything useful there (yet).
> > > I think it makes sense to push that later anyways.
> > > 
> > > Now, early phiopt is quite confused about predict stmts still
> > > being present and turning conditional BBs into diamonds which it
> > > cannot handle.  I've fixed at least stray such stmts in the BBs
> > > that are interesting.  Note this may hide fallout which would otherwise
> > > be visible in the testsuite (there's no flag to avoid
> > > generating the predictors - they are emitted directly by the frontends,
> > > maybe we could drop them with -fno[-guess]-branch-probabilities at
> > > gimplification time?).
> > > 
> > > There's also an effect on ifcombine which, when preceeded by phiopt,
> > > can miss cases because phiopt may value-replace some condition.
> > > 
> > > The patch contains adjustments to testcases where there's no harm done
> > > in the end and leaves those FAILing where we would need to do sth.
> > > 
> > > In the end it's regular pass-ordering issues but our testsuite very
> > > often ties our hands when re-ordering passes because of them.
> > > 
> > > One option would be to distinguish early from late phiopt and for
> > > example avoid value-replacement - like just do MIN/MAX recognition
> > > for the vectorizer.
> > > 
> > > Any comments?
> > > 
> > > Some detailed notes on the remaining FAILs below.
> > [ ... ]
> > I didn't see anything in the testsuite fallout that gave me significant
> > concern.  If your judgment is that we're better off running it earlier,
> > then let's do it.
> 
> I guess so.  I'll add an early variant anyway since we don't want to
> do adjacent load hoisting early.  I'll see how difficult it is to handle
> ifcombine merging with straight-line code or catch those cases elsewhere.

So I'm finally returning to this...

I failed to find a sequence of VRP (jump threading), ifcombine and phiopt
that avoids regressions (well, the current order works of course).  So
instead of moving the first phiopt pass I am now _adding_ a phiopt
pass early doing _only_ min/max/abs replacement.  Those are wrecked
by jump-threading if they happen in sequences with some common operands.

Thus the following patch is now in bootstrap & regtest on 
x86_64-unknown-linux-gnu.

If that goes well I plan to install the patch after digging out
the relevant two-to-three PRs and adding testcases for them.

Richard.

2018-10-22  Richard Biener  

* passes.def (pass_all_early_optimizations): Add early phi-opt
after dce.
* tree-ssa-phiopt.c (value_replacement): Ignore NOPs and predicts in
addition to debug stmts.
(tree_ssa_phiopt_worker): Add early_p argument, do only min/max
and abs replacement early.
* tree-cfg.c (gimple_empty_block_p): Likewise.

* g++.dg/tree-ssa/pr21463.C: Scan phiopt2 because this testcase
relies on phiprop run before.
* g++.dg/tree-ssa/pr30738.C: Likewise.
* g++.dg/tree-ssa/pr57380.C: Likewise.
* gcc.dg/tree-ssa/pr84859.c: Likewise.
* gcc.dg/tree-ssa/pr45397.c: Scan phiopt2 because phiopt1 is
confused by copies in the IL left by EVRP.
* gcc.dg/tree-ssa/phi-opt-5.c: Likewise, this time confused
by predictors.
* gcc.dg/tree-ssa/phi-opt-12.c: Scan phiopt2.

diff --git a/gcc/passes.def b/gcc/passes.def
index 7f4b3479a35..24f212c8e31 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -88,6 +88,7 @@ along with GCC; see the file COPYING3.  If not see
  NEXT_PASS (pass_merge_phi);
   NEXT_PASS (pass_dse);
  NEXT_PASS (pass_cd_dce);
+ NEXT_PASS (pass_phiopt, true /* early_p */);
  NEXT_PASS (pass_early_ipa_sra);
  NEXT_PASS (pass_tail_recursion);
  NEXT_PASS (pass_convert_switch);
@@ -208,7 +209,7 @@ along with GCC; see the file COPYING3.  If not see
   NEXT_PASS (pass_copy_prop);
   NEXT_PASS (pass_tree_ifcombine);
   NEXT_PASS (pass_merge_phi);
-  NEXT_PASS (pass_phiopt);
+  NEXT_PASS (pass_phiopt, false /* early_p */);
   NEXT_PASS (pass_tail_recursion);
   NEXT_PASS (pass_ch);
   NEXT_PASS (pass_lower_complex);
@@ -237,7 +238,7 @@ along with GCC; see the file COPYING3.  If not see
   NEXT_PASS (pass_reassoc, true /* insert_powi_p */);
   NEXT_PASS (pass_dce);
   NEXT_PASS (pass_forwprop);
-  

Re: [PATCH][RFC] Early phiopt pass

2018-08-29 Thread Richard Biener
On Wed, 29 Aug 2018, Jeff Law wrote:

> On 08/29/2018 04:56 AM, Richard Biener wrote:
> > 
> > In response to PR87105 I dusted off an old patch that adds an early
> > phiopt pass.  Recognizing we run phiopt twice with not many passes
> > in between early in post-IPA optimizations this patch moves the
> > first of said to the early pipeline.
> > 
> > The main motivation is to do things like MIN/MAX_EXPR early to
> > avoid jump threading mess up the CFG (the case with PR87105).
> > I realize theres early backward threading before the new early
> > phiopt pass but that doesn't seem to do anything useful there (yet).
> > I think it makes sense to push that later anyways.
> > 
> > Now, early phiopt is quite confused about predict stmts still
> > being present and turning conditional BBs into diamonds which it
> > cannot handle.  I've fixed at least stray such stmts in the BBs
> > that are interesting.  Note this may hide fallout which would otherwise
> > be visible in the testsuite (there's no flag to avoid
> > generating the predictors - they are emitted directly by the frontends,
> > maybe we could drop them with -fno[-guess]-branch-probabilities at
> > gimplification time?).
> > 
> > There's also an effect on ifcombine which, when preceeded by phiopt,
> > can miss cases because phiopt may value-replace some condition.
> > 
> > The patch contains adjustments to testcases where there's no harm done
> > in the end and leaves those FAILing where we would need to do sth.
> > 
> > In the end it's regular pass-ordering issues but our testsuite very
> > often ties our hands when re-ordering passes because of them.
> > 
> > One option would be to distinguish early from late phiopt and for
> > example avoid value-replacement - like just do MIN/MAX recognition
> > for the vectorizer.
> > 
> > Any comments?
> > 
> > Some detailed notes on the remaining FAILs below.
> [ ... ]
> I didn't see anything in the testsuite fallout that gave me significant
> concern.  If your judgment is that we're better off running it earlier,
> then let's do it.

I guess so.  I'll add an early variant anyway since we don't want to
do adjacent load hoisting early.  I'll see how difficult it is to handle
ifcombine merging with straight-line code or catch those cases elsewhere.

Richard.


Re: [PATCH][RFC] Early phiopt pass

2018-08-29 Thread Jeff Law
On 08/29/2018 04:56 AM, Richard Biener wrote:
> 
> In response to PR87105 I dusted off an old patch that adds an early
> phiopt pass.  Recognizing we run phiopt twice with not many passes
> in between early in post-IPA optimizations this patch moves the
> first of said to the early pipeline.
> 
> The main motivation is to do things like MIN/MAX_EXPR early to
> avoid jump threading mess up the CFG (the case with PR87105).
> I realize theres early backward threading before the new early
> phiopt pass but that doesn't seem to do anything useful there (yet).
> I think it makes sense to push that later anyways.
> 
> Now, early phiopt is quite confused about predict stmts still
> being present and turning conditional BBs into diamonds which it
> cannot handle.  I've fixed at least stray such stmts in the BBs
> that are interesting.  Note this may hide fallout which would otherwise
> be visible in the testsuite (there's no flag to avoid
> generating the predictors - they are emitted directly by the frontends,
> maybe we could drop them with -fno[-guess]-branch-probabilities at
> gimplification time?).
> 
> There's also an effect on ifcombine which, when preceeded by phiopt,
> can miss cases because phiopt may value-replace some condition.
> 
> The patch contains adjustments to testcases where there's no harm done
> in the end and leaves those FAILing where we would need to do sth.
> 
> In the end it's regular pass-ordering issues but our testsuite very
> often ties our hands when re-ordering passes because of them.
> 
> One option would be to distinguish early from late phiopt and for
> example avoid value-replacement - like just do MIN/MAX recognition
> for the vectorizer.
> 
> Any comments?
> 
> Some detailed notes on the remaining FAILs below.
[ ... ]
I didn't see anything in the testsuite fallout that gave me significant
concern.  If your judgment is that we're better off running it earlier,
then let's do it.

Jeff


Re: [PATCH][RFC] Early phiopt pass

2018-08-29 Thread Jan Hubicka
> 
> In response to PR87105 I dusted off an old patch that adds an early
> phiopt pass.  Recognizing we run phiopt twice with not many passes
> in between early in post-IPA optimizations this patch moves the
> first of said to the early pipeline.
> 
> The main motivation is to do things like MIN/MAX_EXPR early to
> avoid jump threading mess up the CFG (the case with PR87105).
> I realize theres early backward threading before the new early
> phiopt pass but that doesn't seem to do anything useful there (yet).
> I think it makes sense to push that later anyways.

This looks like good plan overall.
> 
> Now, early phiopt is quite confused about predict stmts still
> being present and turning conditional BBs into diamonds which it
> cannot handle.  I've fixed at least stray such stmts in the BBs
> that are interesting.  Note this may hide fallout which would otherwise
> be visible in the testsuite (there's no flag to avoid
> generating the predictors - they are emitted directly by the frontends,
> maybe we could drop them with -fno[-guess]-branch-probabilities at
> gimplification time?).

That would be fine with me.  We could also just add guess-branch-probaiblities
guards and not produce them from FE.
> diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c
> index cf12cb1f391..938ad7d9a44 100644
> --- a/gcc/tree-cfg.c
> +++ b/gcc/tree-cfg.c
> @@ -6106,11 +6106,19 @@ gimple_empty_block_p (basic_block bb)
>gimple_stmt_iterator gsi = gsi_after_labels (bb);
>if (phi_nodes (bb))
>  return false;
> -  if (gsi_end_p (gsi))
> -return true;
> -  if (is_gimple_debug (gsi_stmt (gsi)))
> -gsi_next_nondebug ();
> -  return gsi_end_p (gsi);
> +  while (!gsi_end_p (gsi))
> +{
> +  gimple *stmt = gsi_stmt (gsi);
> +  if (is_gimple_debug (stmt))
> + ;
> +  else if (gimple_code (stmt) == GIMPLE_NOP
> +|| gimple_code (stmt) == GIMPLE_PREDICT)
> + ;
> +  else
> + return false;
> +  gsi_next ();
> +}
> +  return true;

I think considering GIMPLE_PREDICT as empty basic blocks is going to make 
cfgcleanup
to drop them in cases it should not (it will simply forward edges across them).

I guess we could add new flag to gimple_empty_block_p whether to skip them or 
not
defaulting to false.
Once you ifconvert diamond you want to drop all predict statemetns inside of 
the diamond
because they used to control the condtional of the diamond and now they control 
something
else.

Honza
>  }
>  
>  
> diff --git a/gcc/tree-ssa-phiopt.c b/gcc/tree-ssa-phiopt.c
> index 1667bad873b..bbefc21dc13 100644
> --- a/gcc/tree-ssa-phiopt.c
> +++ b/gcc/tree-ssa-phiopt.c
> @@ -913,7 +913,9 @@ value_replacement (basic_block cond_bb, basic_block 
> middle_bb,
>gsi_next_nondebug ();
>if (!is_gimple_assign (stmt))
>   {
> -   emtpy_or_with_defined_p = false;
> +   if (gimple_code (stmt) != GIMPLE_PREDICT
> +   && gimple_code (stmt) != GIMPLE_NOP)
> + emtpy_or_with_defined_p = false;
> continue;
>   }
>/* Now try to adjust arg0 or arg1 according to the computation


[PATCH][RFC] Early phiopt pass

2018-08-29 Thread Richard Biener


In response to PR87105 I dusted off an old patch that adds an early
phiopt pass.  Recognizing we run phiopt twice with not many passes
in between early in post-IPA optimizations this patch moves the
first of said to the early pipeline.

The main motivation is to do things like MIN/MAX_EXPR early to
avoid jump threading mess up the CFG (the case with PR87105).
I realize theres early backward threading before the new early
phiopt pass but that doesn't seem to do anything useful there (yet).
I think it makes sense to push that later anyways.

Now, early phiopt is quite confused about predict stmts still
being present and turning conditional BBs into diamonds which it
cannot handle.  I've fixed at least stray such stmts in the BBs
that are interesting.  Note this may hide fallout which would otherwise
be visible in the testsuite (there's no flag to avoid
generating the predictors - they are emitted directly by the frontends,
maybe we could drop them with -fno[-guess]-branch-probabilities at
gimplification time?).

There's also an effect on ifcombine which, when preceeded by phiopt,
can miss cases because phiopt may value-replace some condition.

The patch contains adjustments to testcases where there's no harm done
in the end and leaves those FAILing where we would need to do sth.

In the end it's regular pass-ordering issues but our testsuite very
often ties our hands when re-ordering passes because of them.

One option would be to distinguish early from late phiopt and for
example avoid value-replacement - like just do MIN/MAX recognition
for the vectorizer.

Any comments?

Some detailed notes on the remaining FAILs below.

Bootstrapped and tested on x86_64-unknown-linux-gnu.

>From e4f06ed39d12e7c17ff57aa17f17f54013d7869f Mon Sep 17 00:00:00 2001
From: Richard Guenther 
Date: Tue, 28 Aug 2018 12:53:53 +0200
Subject: [PATCH] early-phiopt

FAIL: g++.dg/predict-loop-exit-2.C -std=gnu++98  scan-tree-dump-times profile_es
timate "loop exit heuristics:" 2

phiopt value-replaces the last "exit" test in

  :
  _5 = foo ();
  if (_5 != 0)
goto ;
  else
goto ;

  :
  g.0_7 = g;
  if (g.0_7 <= 9)
goto ;
  else
goto ;

  :

  :
  # iftmp.3_1 = PHI <1(5), 0(6), 1(4)>
  if (iftmp.3_1 != 0)
goto ;
  else
goto ;

  :
  return;

to look like

  :
  g.0_7 = g;
  _6 = g.0_7 <= 9;

  :
  # iftmp.3_1 = PHI <_6(5), 1(4)>
  if (iftmp.3_1 != 0)
goto ;

which confuses whatever the testcase was supposed to check (there is only
one loop exit).

FAIL: gcc.dg/tree-ssa/ssa-pre-32.c scan-tree-dump pre "# prephitmp_[0-9]+ = PHI 
<[xy]_[0-9]+(D)[^,]*, [xy]_[0-9]+(D)"

phiopt optimizes

:
   if (b_5(D) != 0)
 goto ; [INV]
   else
 goto ; [INV]

:

:
   # iftmp.0_3 = PHI <4294967295(2), 0(3)>
   _1 = iftmp.0_3 & x_8(D);

to

   _9 = (unsigned int) b_5(D);
   _14 = -_9;
   _1 = x_8(D) & _14;

but in the sequence with two same conditions but opposite bits
nothing optimizes this further back to b_5(D) ? x_6(D) : y_7(D).

FAIL: gcc.dg/tree-ssa/ssa-ifcombine-7.c scan-tree-dump ifcombine " > "
FAIL: gcc.dg/tree-ssa/ssa-ifcombine-ccmp-1.c scan-tree-dump optimized "&"
FAIL: gcc.dg/tree-ssa/ssa-ifcombine-ccmp-4.c scan-tree-dump optimized "&"
FAIL: gcc.dg/tree-ssa/ssa-ifcombine-ccmp-5.c scan-tree-dump-times optimized "&" 
2
  \|" 2

early phiopt value-replaces the inner condition which leaves ifcombine
with no work.

2018-08-28  Richard Biener  

* passes.def (pass_all_early_optimizations): Add phi-opt
after dce.
(pass_all_optimizations): Remove first phi-opt pass.
* tree-ssa-phiopt.c (value_replacement): Ignore NOPs and predicts in
addition to debug stmts.
* tree-cfg.c (gimple_empty_block_p): Likewise.

* g++.dg/tree-ssa/pr21463.C: Scan phiopt2 because this testcase
relies on phiprop run before.
* g++.dg/tree-ssa/pr30738.C: Likewise.
* g++.dg/tree-ssa/pr57380.C: Likewise.
* gcc.dg/tree-ssa/pr84859.c: Likewise.
* gcc.dg/tree-ssa/vrp33.c: Disable phiopt.
* gcc.dg/tree-ssa/pr68198.c: Likewise.
* gcc.dg/tree-ssa/pr20701.c: Likewise.
* gcc.dg/tree-ssa/pr21001.c: Likewise.
* gcc.dg/tree-ssa/pr21563.c: Likewise.
* gcc.dg/tree-ssa/pr37508.c: Likewise.
* gcc.dg/tree-ssa/vrp19.c: Likewise.
* gcc.dg/tree-ssa/vrp20.c: Likewise.
* gcc.dg/tree-ssa/pr45397.c: Scan phiopt2 because phiopt1 is
confused by copies in the IL left by EVRP.
* gcc.dg/tree-ssa/phi-opt-12.c: Likewise, this time confused
by predictors.
* gcc.dg/tree-ssa/phi-opt-5.c: Likewise.
* gcc.dg/pr24564.c: Likewise.
* gcc.dg/uninit-15.c: Switch to testing other OK alternative.

diff --git a/gcc/passes.def b/gcc/passes.def
index 7f4b3479a35..401d1307125 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -88,6 +88,7 @@ along with GCC; see the file COPYING3.  If not see
  NEXT_PASS (pass_merge_phi);
   NEXT_PASS