[Bug tree-optimization/101555] Compile slowdown in tree PRE
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101555

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #8 from Richard Biener ---
Fixed in GCC 12.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101555

--- Comment #7 from Richard Biener ---
The committed change improves compile time to less than 50s.  In principle it
also applies to the GCC 11 and 10 branches, where the related issue was fixed.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101555

--- Comment #6 from CVS Commits ---
The master branch has been updated by Richard Biener:

https://gcc.gnu.org/g:f387ff788f63c1974479644edae728047f843ec4

commit r12-3378-gf387ff788f63c1974479644edae728047f843ec4
Author: Richard Biener
Date:   Tue Sep 7 10:35:42 2021 +0200

    tree-optimization/101555 - avoid redundant alias queries in PRE

    This avoids doing redundant work during PHI translation to invalidate
    mems when translating their corresponding VUSE through the blocks
    virtual PHI node.  All the invalidation work is already done by
    prune_clobbered_mems.

    This speeds up the compile of the testcase from 275s with PRE
    taking 91% of the compile-time down to 43s with PRE taking 16%
    of the compile-time.

    2021-09-07  Richard Biener

            PR tree-optimization/101555
            * tree-ssa-pre.c (translate_vuse_through_block): Do not
            perform an alias walk to determine the validity of the
            mem at the start of the block which is already guaranteed
            by means of prune_clobbered_mems.
            (phi_translate_1): Pass edge to translate_vuse_through_block.
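To illustrate why the second check is redundant, here is a minimal toy model
in C.  It is not GCC code: every name in it (toy_block, toy_block_clobbers,
toy_prune_clobbered, toy_translate, toy_run) is hypothetical, and "alias
query" is reduced to comparing integer location ids.  It only shows the
general shape of the argument, assuming that pruning has already removed every
expression the block clobbers, so re-querying during translation cannot change
the result and only adds queries.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: a "block" is a list of stores, each writing one
   location id.  None of this mirrors GCC's real data structures.  */
struct toy_block
{
  const int *stores;   /* location ids written in the block */
  size_t n_stores;
};

/* Counts the simulated alias queries performed so far.  */
static unsigned long toy_alias_queries;

/* Hypothetical stand-in for an alias walk: does BB clobber LOC?
   Each store that is inspected counts as one query.  */
static bool
toy_block_clobbers (const struct toy_block *bb, int loc)
{
  for (size_t i = 0; i < bb->n_stores; i++)
    {
      toy_alias_queries++;
      if (bb->stores[i] == loc)
        return true;
    }
  return false;
}

/* Loose analogue of pruning: drop the expressions (here just location
   ids) that BB clobbers, compacting EXPRS in place.  Returns the number
   of expressions kept.  */
static size_t
toy_prune_clobbered (const struct toy_block *bb, int *exprs, size_t n)
{
  size_t kept = 0;
  for (size_t i = 0; i < n; i++)
    if (!toy_block_clobbers (bb, exprs[i]))
      exprs[kept++] = exprs[i];
  return kept;
}

/* Loose analogue of translation over N_EDGES predecessor edges.  With
   RECHECK set, the clobber query is repeated for every expression on
   every edge.  Returns the number of (edge, expression) pairs that were
   translated.  */
static size_t
toy_translate (const struct toy_block *bb, const int *exprs, size_t n,
               size_t n_edges, bool recheck)
{
  size_t translated = 0;
  for (size_t e = 0; e < n_edges; e++)
    for (size_t i = 0; i < n; i++)
      if (!recheck || !toy_block_clobbers (bb, exprs[i]))
        translated++;
  return translated;
}

/* Prune, then translate over two edges.  Stores the number of
   translated pairs in *TRANSLATED and returns the query count.  */
unsigned long
toy_run (bool recheck, size_t *translated)
{
  static const int stores[] = { 1, 2, 3 };
  const struct toy_block bb = { stores, 3 };
  int exprs[] = { 1, 4, 5, 6 };

  assert (translated != NULL);
  toy_alias_queries = 0;
  size_t kept = toy_prune_clobbered (&bb, exprs, 4);
  *translated = toy_translate (&bb, exprs, kept, 2, recheck);
  return toy_alias_queries;
}
```

Calling toy_run with recheck set to false and then to true yields the same
number of translated pairs in both cases, while the query counter is larger
in the rechecking variant.  The real speedup reported above depends on the
actual alias walk cost in GCC, which this sketch does not attempt to model.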
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101555

--- Comment #5 from Richard Biener ---
OK, so most of the time is spent in ANTIC compute, specifically PHI
translation, and there translate_vuse_through_block doing the (rate limited)
stmt_may_clobber_ref_p_1 query.  It's a bit fishy that we're doing these
things "twice"; in particular we are supposed to translate ANTIC_IN to the
predecessor ANTIC_OUT, thus all VUSEs should be already top-of-block.  But
then in compute_antic_aux we're doing

      /* Prune expressions that are clobbered in block and thus become
         invalid if translated from ANTIC_OUT to ANTIC_IN.  */
      prune_clobbered_mems (ANTIC_OUT, block);

which is supposed to do the "translation" through the block but that does
not adjust VUSEs of expressions.  That uses value_dies_in_block_x, something
with a cache but also following a somewhat different logic with respect to
the VUSEs in the expression.

The main complication here is that the expression set we start from is
taken from the VN tables which may put "pre-translated" VUSEs in rather than
starting with the VUSEs as they were.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101555

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|[12 Regression] Compile     |Compile slowdown in tree
                   |slowdown in tree PRE        |PRE
   Target Milestone|12.0                        |---
           Keywords|needs-bisection             |

--- Comment #4 from Richard Biener ---
Meanwhile GCC 12 seems to be as fast as GCC 11 (again).  Still PRE is slow;
in this case the profile seems to be mostly alias walk related.