[Bug tree-optimization/51005] -ftree-tail-merge slows down compilation of 20001226-1.c
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51005 --- Comment #4 from vries at gcc dot gnu.org 2011-11-14 12:05:58 UTC --- Created attachment 25816 -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=25816 Patch that also delegates updating the vops, but keeps dominator info up-to-date bootstrapped and reg-tested on x64_64 and i686, build and reg-tested on ARM and MIPS (with an additional verify_dominators after each update). This patch updates dominator info after each cluster application. It does this roughly in the following way: - for a cluster, determine the nearest-common-dominator dom - for the dom, determine the set directly dominated by the dom, called fix_dom_bbs - for each of fix_dom_bbs, determine whether there is a path from dom to it, which does not pass through cluster. If for a member of fix_dom_bb such a path exists, it is still directly dominated by dom. If such a path doesn't exist, it's now dominated by cluster. The patch implement this path search by a walk marking all bbs reachable from dom but not through cluster. This walk is done once for each cluster. For the example, there are 2 clusters of 4096 bbs in 1 iteration. Using this patch, the compilation time for this example is comparable to recalculating the dominator info each iteration from scratch. The time-complexity of this approach per iteration is O(n_clusters * (n_bbs + n_edges)). Best case, this algorithm is faster than recomputing dominator info from scratch each iteration. But worst case, it's slower.
[Bug tree-optimization/51005] -ftree-tail-merge slows down compilation of 20001226-1.c
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51005 --- Comment #5 from vries at gcc dot gnu.org 2011-11-15 00:12:51 UTC --- Author: vries Date: Tue Nov 15 00:12:45 2011 New Revision: 181372 URL: http://gcc.gnu.org/viewcvs?root=gccview=revrev=181372 Log: 2011-11-15 Tom de Vries t...@codesourcery.com PR tree-optimization/51005 * tree-ssa-tail-merge.c (delete_basic_block_same_succ): Rename to mark_basic_block_deleted. (update_worklist): Inline purge_bbs. (purge_bbs, unlink_virtual_phi, update_vuses, vop_at_entry) (delete_block_update_dominator_info): Remove. (replace_block_by): Remove update_vops parameter. Partially evaluate for update_vops == false. (apply_clusters): Remove update_vops parameter. Remove update_vops argument in replace_block_by call. (update_debug_stmts): Remove MAY_HAVE_DEBUG_STMTS test. (tail_merge_optimize): Remove update_vops argument to apply_clusters. Remove call to purge_bbs. Add calls to calculate_dominance_info and free_dominance_info. Add MAY_HAVE_DEBUG_STMTSbefore calling update_debug_stmts. Mark vop var for renaming, if necessary. Modified: trunk/gcc/ChangeLog trunk/gcc/tree-ssa-tail-merge.c
[Bug tree-optimization/51005] -ftree-tail-merge slows down compilation of 20001226-1.c
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51005 vries at gcc dot gnu.org changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution||FIXED --- Comment #6 from vries at gcc dot gnu.org 2011-11-15 00:42:15 UTC --- Patch checked in, compile time back to normal. Resolving PR as fixed.
[Bug tree-optimization/51005] -ftree-tail-merge slows down compilation of 20001226-1.c
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51005 --- Comment #1 from vries at gcc dot gnu.org 2011-11-07 09:53:15 UTC --- -ftree-tail-merge optimises about half of the basic blocks away: ... $ egrep -c '^bb.*:' 20001226-1.c.091t.crited 16385 $ egrep -c '^bb.*:' 20001226-1.c.092t.pre 8195 ... but it does so with a slowdown of factor 3.5: ... $ time gcc src/gcc/testsuite/gcc.c-torture/compile/20001226-1.c -S -O2 -fno-tree-tail-merge real0m34.691s user0m34.350s sys0m0.220s $ time gcc src/gcc/testsuite/gcc.c-torture/compile/20001226-1.c -S -O2 -ftree-tail-merge real2m7.072s user2m6.220s sys0m0.190s ...
[Bug tree-optimization/51005] -ftree-tail-merge slows down compilation of 20001226-1.c
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51005 --- Comment #2 from vries at gcc dot gnu.org 2011-11-07 10:05:03 UTC --- Created attachment 25733 -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=25733 possible patch This patch brings the computation time back down: ... $ time gcc src/gcc/testsuite/gcc.c-torture/compile/20001226-1.c -S -O2 -ftree-tail-merge real0m35.600s user0m35.210s sys0m0.270s ... The patch delegates the updating of the vops to TODO_update_ssa_only_virtuals. That way, the dominators don't need to be updated after every replacement, just after every iteration.
[Bug tree-optimization/51005] -ftree-tail-merge slows down compilation of 20001226-1.c
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51005 vries at gcc dot gnu.org changed: What|Removed |Added Status|UNCONFIRMED |ASSIGNED Last reconfirmed||2011-11-07 AssignedTo|unassigned at gcc dot |vries at gcc dot gnu.org |gnu.org | Ever Confirmed|0 |1
[Bug tree-optimization/51005] -ftree-tail-merge slows down compilation of 20001226-1.c
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51005 --- Comment #3 from Richard Guenther rguenth at gcc dot gnu.org 2011-11-07 11:00:39 UTC --- Note that PRE already performs TODO_update_ssa_only_virtuals (well, of course only if something marked the VOP for renaming which happens whenever it inserts a load somewhere). So it might indeed be not too bad to do this (given the amount of code we can delete ...)