[Bug tree-optimization/51005] -ftree-tail-merge slows down compilation of 20001226-1.c

2011-11-14 Thread vries at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51005

--- Comment #4 from vries at gcc dot gnu.org 2011-11-14 12:05:58 UTC ---
Created attachment 25816
  -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=25816
Patch that also delegates updating the vops, but keeps dominator info
up-to-date

bootstrapped and reg-tested on x64_64 and i686, build and reg-tested on ARM and
MIPS (with an additional verify_dominators after each update).

This patch updates dominator info after each cluster application.
It does this roughly in the following way:
- for a cluster, determine the nearest-common-dominator dom
- for the dom, determine the set directly dominated by the dom, called
  fix_dom_bbs
- for each of fix_dom_bbs, determine whether there is a path from dom to it, 
  which does not pass through cluster.
  If for a member of fix_dom_bb such a path exists, it is still directly
  dominated by dom. If such a path doesn't exist, it's now dominated by
cluster.
  The patch implement this path search by a walk marking all bbs reachable from
  dom but not through cluster. This walk is done once for each cluster.

For the example, there are 2 clusters of 4096 bbs in 1 iteration. Using this
patch, the compilation time for this example is comparable to recalculating the
dominator info each iteration from scratch.

The time-complexity of this approach per iteration is O(n_clusters * (n_bbs +
n_edges)). Best case, this algorithm is faster than recomputing dominator info
from scratch each iteration. But worst case, it's slower.


[Bug tree-optimization/51005] -ftree-tail-merge slows down compilation of 20001226-1.c

2011-11-14 Thread vries at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51005

--- Comment #5 from vries at gcc dot gnu.org 2011-11-15 00:12:51 UTC ---
Author: vries
Date: Tue Nov 15 00:12:45 2011
New Revision: 181372

URL: http://gcc.gnu.org/viewcvs?root=gccview=revrev=181372
Log:
2011-11-15  Tom de Vries  t...@codesourcery.com

PR tree-optimization/51005
* tree-ssa-tail-merge.c (delete_basic_block_same_succ): Rename to
mark_basic_block_deleted.
(update_worklist): Inline purge_bbs.
(purge_bbs, unlink_virtual_phi, update_vuses, vop_at_entry)
(delete_block_update_dominator_info): Remove.
(replace_block_by): Remove update_vops parameter.  Partially evaluate
for update_vops == false.
(apply_clusters): Remove update_vops parameter.  Remove update_vops
argument in replace_block_by call.
(update_debug_stmts): Remove MAY_HAVE_DEBUG_STMTS test.
(tail_merge_optimize): Remove update_vops argument to apply_clusters.
Remove call to purge_bbs.  Add calls to calculate_dominance_info and
free_dominance_info.  Add MAY_HAVE_DEBUG_STMTSbefore calling
update_debug_stmts.  Mark vop var for renaming, if necessary.

Modified:
trunk/gcc/ChangeLog
trunk/gcc/tree-ssa-tail-merge.c


[Bug tree-optimization/51005] -ftree-tail-merge slows down compilation of 20001226-1.c

2011-11-14 Thread vries at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51005

vries at gcc dot gnu.org changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution||FIXED

--- Comment #6 from vries at gcc dot gnu.org 2011-11-15 00:42:15 UTC ---
Patch checked in, compile time back to normal.

Resolving PR as fixed.


[Bug tree-optimization/51005] -ftree-tail-merge slows down compilation of 20001226-1.c

2011-11-07 Thread vries at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51005

--- Comment #1 from vries at gcc dot gnu.org 2011-11-07 09:53:15 UTC ---
-ftree-tail-merge optimises about half of the basic blocks away:
...
$ egrep -c '^bb.*:' 20001226-1.c.091t.crited 
16385
$ egrep -c '^bb.*:' 20001226-1.c.092t.pre 
8195
...

but it does so with a slowdown of factor 3.5:
...
$ time gcc src/gcc/testsuite/gcc.c-torture/compile/20001226-1.c -S -O2
-fno-tree-tail-merge

real0m34.691s
user0m34.350s
sys0m0.220s
$ time gcc src/gcc/testsuite/gcc.c-torture/compile/20001226-1.c -S -O2
-ftree-tail-merge

real2m7.072s
user2m6.220s
sys0m0.190s
...


[Bug tree-optimization/51005] -ftree-tail-merge slows down compilation of 20001226-1.c

2011-11-07 Thread vries at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51005

--- Comment #2 from vries at gcc dot gnu.org 2011-11-07 10:05:03 UTC ---
Created attachment 25733
  -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=25733
possible patch

This patch brings the computation time back down:
...
$ time gcc src/gcc/testsuite/gcc.c-torture/compile/20001226-1.c -S -O2
-ftree-tail-merge

real0m35.600s
user0m35.210s
sys0m0.270s
...

The patch delegates the updating of the vops to TODO_update_ssa_only_virtuals.
That way, the dominators don't need to be updated after every replacement, just
after every iteration.


[Bug tree-optimization/51005] -ftree-tail-merge slows down compilation of 20001226-1.c

2011-11-07 Thread vries at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51005

vries at gcc dot gnu.org changed:

   What|Removed |Added

 Status|UNCONFIRMED |ASSIGNED
   Last reconfirmed||2011-11-07
 AssignedTo|unassigned at gcc dot   |vries at gcc dot gnu.org
   |gnu.org |
 Ever Confirmed|0   |1


[Bug tree-optimization/51005] -ftree-tail-merge slows down compilation of 20001226-1.c

2011-11-07 Thread rguenth at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51005

--- Comment #3 from Richard Guenther rguenth at gcc dot gnu.org 2011-11-07 
11:00:39 UTC ---
Note that PRE already performs TODO_update_ssa_only_virtuals (well, of course
only if something marked the VOP for renaming which happens whenever it
inserts a load somewhere).  So it might indeed be not too bad to do this
(given the amount of code we can delete ...)