Re: [PATCH] Fix PR78343
On Mon, 28 Nov 2016, Christophe Lyon wrote:

> Hi Richard,
>
>
> On 25 November 2016 at 11:20, Richard Biener wrote:
> > On Thu, 24 Nov 2016, Richard Biener wrote:
> >
> >>
> >> I am testing the following patch for an optimization regression where
> >> a loop made dead by final value replacement was made used again by
> >> DOM 20 passes later. The real issue here is that we do not get rid
> >> of dead loops until very late so this patch makes sure to do that.
> >> We could schedule it later (but better no later than unrolling
> >> as that might expose a pretty inefficient way of removing a dead loop).
> >>
> >> Bootstrap and regtest running on x86_64-unknown-linux-gnu.
> >
> > As expected some testcases need adjustment, thus applied as follows.
> >
> > Richard.
> >
> > 2016-11-24  Richard Biener
> >
> >         PR tree-optimization/78343
> >         * passes.def: Add CD-DCE pass after loop splitting.
> >         * tree-ssa-dce.c (find_obviously_necessary_stmts): Move
> >         SCEV init/finalize ...
> >         (perform_tree_ssa_dce): ... here. Deal with being
> >         executed inside the loop pipeline in aggressive mode.
> >
> >         * gcc.dg/tree-ssa/sccp-2.c: New testcase.
> >         * gcc.dg/autopar/uns-outer-6.c: Adjust.
> >         * gcc.dg/tree-ssa/20030808-1.c: Likewise.
> >         * gcc.dg/tree-ssa/20040305-1.c: Likewise.
> >         * gcc.dg/vect/pr38529.c: Likewise.
> >
>
> But now, I am seeing failures on:
> gcc.dg/tree-ssa/20030808-1.c scan-tree-dump-times cddce3 "->code" 0
> gcc.dg/tree-ssa/20030808-1.c scan-tree-dump-times cddce3 "if " 0
> gcc.dg/tree-ssa/20040305-1.c scan-tree-dump-times cddce3 "if " 2
> because the dump file does not exist.

Bah. Fixed as follows.

2016-11-28  Richard Biener

        PR tree-optimization/78343
        * gcc.dg/tree-ssa/20030808-1.c: Fix dump to generate.
        * gcc.dg/tree-ssa/20040305-1.c: Likewise.

Index: gcc/testsuite/gcc.dg/tree-ssa/20030808-1.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/20030808-1.c (revision 242908)
+++ gcc/testsuite/gcc.dg/tree-ssa/20030808-1.c (working copy)
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O1 -fdump-tree-cddce2" } */
+/* { dg-options "-O1 -fdump-tree-cddce3" } */
 
 extern void abort (void);
 
Index: gcc/testsuite/gcc.dg/tree-ssa/20040305-1.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/20040305-1.c (revision 242908)
+++ gcc/testsuite/gcc.dg/tree-ssa/20040305-1.c (working copy)
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-cddce2 -fdump-tree-forwprop1-details" } */
+/* { dg-options "-O2 -fdump-tree-cddce3 -fdump-tree-forwprop1-details" } */
 
 int abarney[2];
 int afred[1];

> (on aarch64 and arm targets)
>
> Christophe
>
>
> > diff --git a/gcc/passes.def b/gcc/passes.def
> > index 2a470a7..2fa682b 100644
> > --- a/gcc/passes.def
> > +++ b/gcc/passes.def
> > @@ -271,6 +271,9 @@ along with GCC; see the file COPYING3. If not see
> >           NEXT_PASS (pass_tree_unswitch);
> >           NEXT_PASS (pass_scev_cprop);
> >           NEXT_PASS (pass_loop_split);
> > +         /* All unswitching, final value replacement and splitting can expose
> > +            empty loops. Remove them now. */
> > +         NEXT_PASS (pass_cd_dce);
> >           NEXT_PASS (pass_record_bounds);
> >           NEXT_PASS (pass_loop_distribution);
> >           NEXT_PASS (pass_copy_prop);
> > diff --git a/gcc/testsuite/gcc.dg/autopar/uns-outer-6.c b/gcc/testsuite/gcc.dg/autopar/uns-outer-6.c
> > index dc2870b..5af60b0 100644
> > --- a/gcc/testsuite/gcc.dg/autopar/uns-outer-6.c
> > +++ b/gcc/testsuite/gcc.dg/autopar/uns-outer-6.c
> > @@ -25,7 +25,7 @@ parloop (int N)
> >    for (i = 0; i < N; i++)
> >    {
> >      for (j = 0; j < N; j++)
> > -      y[i]=x[i][j];
> > +      y[i] += x[i][j];
> >      sum += y[i];
> >    }
> >    g_sum = sum;
> > @@ -46,6 +46,10 @@ main (void)
> >
> >
> >  /* Check that outer loop is parallelized. */
> > +/* This fails because we have
> > +   FAILED: data dependencies exist across iterations
> > +
> > +
> >  /* { dg-final { scan-tree-dump-times "parallelizing outer loop" 1 "parloops2" } } */
> >  /* { dg-final { scan-tree-dump-times "parallelizing inner loop" 0 "parloops2" } } */
> >  /* { dg-final { scan-tree-dump-times "loopfn" 4 "optimized" } } */
> > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/20030808-1.c b/gcc/testsuite/gcc.dg/tree-ssa/20030808-1.c
> > index 7cc5404..cda86a7 100644
> > --- a/gcc/testsuite/gcc.dg/tree-ssa/20030808-1.c
> > +++ b/gcc/testsuite/gcc.dg/tree-ssa/20030808-1.c
> > @@ -33,8 +33,8 @@ delete_dead_jumptables ()
> >  /* There should be no loads of ->code. If any exist, then we failed to
> >     optimize away all the IF statements and the statements feeding
> >     their conditions. */
> >
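
The failure reported above comes down to a name mismatch inside the test files: the dg-final lines scan a dump named cddce3, while dg-options still asked only for -fdump-tree-cddce2, so the cddce3 dump was never written. A minimal sketch of a consistent test follows; the function `always_zero` is hypothetical and not part of the patch, and whether this particular scan count would pass has not been checked. Only the pairing of the dump name in dg-options with the one in dg-final is the point.

```c
/* { dg-do compile } */
/* The dump requested here ...  */
/* { dg-options "-O1 -fdump-tree-cddce3" } */

/* Hypothetical function, not from the patch: the condition is always
   false, so the branch is a candidate for removal.  Unsigned arithmetic
   is used so that wraparound is well defined.  */
unsigned int
always_zero (unsigned int x)
{
  unsigned int twice = x * 2u;
  if (twice != x + x)
    return 1;
  return 0;
}

/* ... must carry the same name as the dump scanned here, otherwise
   the scan has no file to read.  */
/* { dg-final { scan-tree-dump-times "if " 0 "cddce3" } } */
```

If the two names differ, as they did after the first commit, the scan cannot find its input regardless of what the optimizers did.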
Re: [PATCH] Fix PR78343
Hi Richard,


On 25 November 2016 at 11:20, Richard Biener wrote:
> On Thu, 24 Nov 2016, Richard Biener wrote:
>
>>
>> I am testing the following patch for an optimization regression where
>> a loop made dead by final value replacement was made used again by
>> DOM 20 passes later. The real issue here is that we do not get rid
>> of dead loops until very late so this patch makes sure to do that.
>> We could schedule it later (but better no later than unrolling
>> as that might expose a pretty inefficient way of removing a dead loop).
>>
>> Bootstrap and regtest running on x86_64-unknown-linux-gnu.
>
> As expected some testcases need adjustment, thus applied as follows.
>
> Richard.
>
> 2016-11-24  Richard Biener
>
>         PR tree-optimization/78343
>         * passes.def: Add CD-DCE pass after loop splitting.
>         * tree-ssa-dce.c (find_obviously_necessary_stmts): Move
>         SCEV init/finalize ...
>         (perform_tree_ssa_dce): ... here. Deal with being
>         executed inside the loop pipeline in aggressive mode.
>
>         * gcc.dg/tree-ssa/sccp-2.c: New testcase.
>         * gcc.dg/autopar/uns-outer-6.c: Adjust.
>         * gcc.dg/tree-ssa/20030808-1.c: Likewise.
>         * gcc.dg/tree-ssa/20040305-1.c: Likewise.
>         * gcc.dg/vect/pr38529.c: Likewise.
>

But now, I am seeing failures on:
gcc.dg/tree-ssa/20030808-1.c scan-tree-dump-times cddce3 "->code" 0
gcc.dg/tree-ssa/20030808-1.c scan-tree-dump-times cddce3 "if " 0
gcc.dg/tree-ssa/20040305-1.c scan-tree-dump-times cddce3 "if " 2
because the dump file does not exist.
(on aarch64 and arm targets)

Christophe


> diff --git a/gcc/passes.def b/gcc/passes.def
> index 2a470a7..2fa682b 100644
> --- a/gcc/passes.def
> +++ b/gcc/passes.def
> @@ -271,6 +271,9 @@ along with GCC; see the file COPYING3. If not see
>           NEXT_PASS (pass_tree_unswitch);
>           NEXT_PASS (pass_scev_cprop);
>           NEXT_PASS (pass_loop_split);
> +         /* All unswitching, final value replacement and splitting can expose
> +            empty loops. Remove them now. */
> +         NEXT_PASS (pass_cd_dce);
>           NEXT_PASS (pass_record_bounds);
>           NEXT_PASS (pass_loop_distribution);
>           NEXT_PASS (pass_copy_prop);
> diff --git a/gcc/testsuite/gcc.dg/autopar/uns-outer-6.c b/gcc/testsuite/gcc.dg/autopar/uns-outer-6.c
> index dc2870b..5af60b0 100644
> --- a/gcc/testsuite/gcc.dg/autopar/uns-outer-6.c
> +++ b/gcc/testsuite/gcc.dg/autopar/uns-outer-6.c
> @@ -25,7 +25,7 @@ parloop (int N)
>    for (i = 0; i < N; i++)
>    {
>      for (j = 0; j < N; j++)
> -      y[i]=x[i][j];
> +      y[i] += x[i][j];
>      sum += y[i];
>    }
>    g_sum = sum;
> @@ -46,6 +46,10 @@ main (void)
>
>
>  /* Check that outer loop is parallelized. */
> +/* This fails because we have
> +   FAILED: data dependencies exist across iterations
> +
> +
>  /* { dg-final { scan-tree-dump-times "parallelizing outer loop" 1 "parloops2" } } */
>  /* { dg-final { scan-tree-dump-times "parallelizing inner loop" 0 "parloops2" } } */
>  /* { dg-final { scan-tree-dump-times "loopfn" 4 "optimized" } } */
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/20030808-1.c b/gcc/testsuite/gcc.dg/tree-ssa/20030808-1.c
> index 7cc5404..cda86a7 100644
> --- a/gcc/testsuite/gcc.dg/tree-ssa/20030808-1.c
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/20030808-1.c
> @@ -33,8 +33,8 @@ delete_dead_jumptables ()
>  /* There should be no loads of ->code. If any exist, then we failed to
>     optimize away all the IF statements and the statements feeding
>     their conditions. */
> -/* { dg-final { scan-tree-dump-times "->code" 0 "cddce2"} } */
> +/* { dg-final { scan-tree-dump-times "->code" 0 "cddce3"} } */
>
>  /* There should be no IF statements. */
> -/* { dg-final { scan-tree-dump-times "if " 0 "cddce2"} } */
> +/* { dg-final { scan-tree-dump-times "if " 0 "cddce3"} } */
>
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/20040305-1.c b/gcc/testsuite/gcc.dg/tree-ssa/20040305-1.c
> index 501e28c..d1a9af8 100644
> --- a/gcc/testsuite/gcc.dg/tree-ssa/20040305-1.c
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/20040305-1.c
> @@ -27,4 +27,4 @@ void foo(int edx, int eax)
>
>  /* After cddce we should have two IF statements remaining as the other
>     two tests can be threaded. */
> -/* { dg-final { scan-tree-dump-times "if " 2 "cddce2"} } */
> +/* { dg-final { scan-tree-dump-times "if " 2 "cddce3"} } */
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/sccp-2.c b/gcc/testsuite/gcc.dg/tree-ssa/sccp-2.c
> new file mode 100644
> index 000..099b281
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/sccp-2.c
> @@ -0,0 +1,15 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-optimized" } */
> +
> +unsigned int
> +test(unsigned int quant)
> +{
> +  unsigned int sum = 0;
> +  for (unsigned int i = 0; i < quant; ++i)
> +    sum += quant;
> +  return sum;
> +}
> +
> +/* A single basic-block should remain (computing and
Re: [PATCH] Fix PR78343
On Thu, 24 Nov 2016, Richard Biener wrote:

>
> I am testing the following patch for an optimization regression where
> a loop made dead by final value replacement was made used again by
> DOM 20 passes later. The real issue here is that we do not get rid
> of dead loops until very late so this patch makes sure to do that.
> We could schedule it later (but better no later than unrolling
> as that might expose a pretty inefficient way of removing a dead loop).
>
> Bootstrap and regtest running on x86_64-unknown-linux-gnu.

As expected some testcases need adjustment, thus applied as follows.

Richard.

2016-11-24  Richard Biener

        PR tree-optimization/78343
        * passes.def: Add CD-DCE pass after loop splitting.
        * tree-ssa-dce.c (find_obviously_necessary_stmts): Move
        SCEV init/finalize ...
        (perform_tree_ssa_dce): ... here. Deal with being
        executed inside the loop pipeline in aggressive mode.

        * gcc.dg/tree-ssa/sccp-2.c: New testcase.
        * gcc.dg/autopar/uns-outer-6.c: Adjust.
        * gcc.dg/tree-ssa/20030808-1.c: Likewise.
        * gcc.dg/tree-ssa/20040305-1.c: Likewise.
        * gcc.dg/vect/pr38529.c: Likewise.

diff --git a/gcc/passes.def b/gcc/passes.def
index 2a470a7..2fa682b 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -271,6 +271,9 @@ along with GCC; see the file COPYING3. If not see
          NEXT_PASS (pass_tree_unswitch);
          NEXT_PASS (pass_scev_cprop);
          NEXT_PASS (pass_loop_split);
+         /* All unswitching, final value replacement and splitting can expose
+            empty loops. Remove them now. */
+         NEXT_PASS (pass_cd_dce);
          NEXT_PASS (pass_record_bounds);
          NEXT_PASS (pass_loop_distribution);
          NEXT_PASS (pass_copy_prop);
diff --git a/gcc/testsuite/gcc.dg/autopar/uns-outer-6.c b/gcc/testsuite/gcc.dg/autopar/uns-outer-6.c
index dc2870b..5af60b0 100644
--- a/gcc/testsuite/gcc.dg/autopar/uns-outer-6.c
+++ b/gcc/testsuite/gcc.dg/autopar/uns-outer-6.c
@@ -25,7 +25,7 @@ parloop (int N)
   for (i = 0; i < N; i++)
   {
     for (j = 0; j < N; j++)
-      y[i]=x[i][j];
+      y[i] += x[i][j];
     sum += y[i];
   }
   g_sum = sum;
@@ -46,6 +46,10 @@ main (void)


 /* Check that outer loop is parallelized. */
+/* This fails because we have
+   FAILED: data dependencies exist across iterations
+
+
 /* { dg-final { scan-tree-dump-times "parallelizing outer loop" 1 "parloops2" } } */
 /* { dg-final { scan-tree-dump-times "parallelizing inner loop" 0 "parloops2" } } */
 /* { dg-final { scan-tree-dump-times "loopfn" 4 "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/20030808-1.c b/gcc/testsuite/gcc.dg/tree-ssa/20030808-1.c
index 7cc5404..cda86a7 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/20030808-1.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/20030808-1.c
@@ -33,8 +33,8 @@ delete_dead_jumptables ()
 /* There should be no loads of ->code. If any exist, then we failed to
    optimize away all the IF statements and the statements feeding
    their conditions. */
-/* { dg-final { scan-tree-dump-times "->code" 0 "cddce2"} } */
+/* { dg-final { scan-tree-dump-times "->code" 0 "cddce3"} } */

 /* There should be no IF statements. */
-/* { dg-final { scan-tree-dump-times "if " 0 "cddce2"} } */
+/* { dg-final { scan-tree-dump-times "if " 0 "cddce3"} } */

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/20040305-1.c b/gcc/testsuite/gcc.dg/tree-ssa/20040305-1.c
index 501e28c..d1a9af8 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/20040305-1.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/20040305-1.c
@@ -27,4 +27,4 @@ void foo(int edx, int eax)

 /* After cddce we should have two IF statements remaining as the other
    two tests can be threaded. */
-/* { dg-final { scan-tree-dump-times "if " 2 "cddce2"} } */
+/* { dg-final { scan-tree-dump-times "if " 2 "cddce3"} } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/sccp-2.c b/gcc/testsuite/gcc.dg/tree-ssa/sccp-2.c
new file mode 100644
index 000..099b281
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/sccp-2.c
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+unsigned int
+test(unsigned int quant)
+{
+  unsigned int sum = 0;
+  for (unsigned int i = 0; i < quant; ++i)
+    sum += quant;
+  return sum;
+}
+
+/* A single basic-block should remain (computing and
+   returning quant * quant). */
+/* { dg-final { scan-tree-dump-times "bb" 1 "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/pr38529.c b/gcc/testsuite/gcc.dg/vect/pr38529.c
index 171adeb..9b5919d 100644
--- a/gcc/testsuite/gcc.dg/vect/pr38529.c
+++ b/gcc/testsuite/gcc.dg/vect/pr38529.c
@@ -11,7 +11,3 @@ void foo()
     for (j = 0; j < 17; ++j)
       a[i] = 0;
 }
-
-/* { dg-final { scan-tree-dump-times "OUTER LOOP VECTORIZED" 1 "vect" } } */
-
-
diff --git a/gcc/tree-ssa-dce.c b/gcc/tree-ssa-dce.c
index 7b9814e..50b5eef 100644
--- a/gcc/tree-ssa-dce.c
+++ b/gcc/tree-ssa-dce.c
@@ -400,7
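
For readers following the thread, the source-level effect that the sccp-2.c testcase checks for can be sketched outside the compiler. The snippet below is only an illustration under stated assumptions: `test_loop` mirrors the function from the testcase, while `test_closed_form` is a hypothetical name introduced here for the value the testcase comment says should remain (quant * quant). Once final value replacement has computed the result of the loop directly, the loop no longer feeds any use and is dead, which is the state the added CD-DCE pass is meant to clean up.

```c
/* Loop form, as written in the sccp-2.c testcase.  */
unsigned int
test_loop (unsigned int quant)
{
  unsigned int sum = 0;
  for (unsigned int i = 0; i < quant; ++i)
    sum += quant;
  return sum;
}

/* Hypothetical closed form (name not from the patch): adding quant
   to itself quant times is quant * quant.  Unsigned arithmetic wraps
   modulo 2^N in both forms, so they agree for every input.  */
unsigned int
test_closed_form (unsigned int quant)
{
  return quant * quant;
}
```

Both functions return the same value for any argument; the testcase expresses this by expecting a single basic block in the optimized dump.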