[Bug tree-optimization/101555] Compile slowdown in tree PRE

2022-07-22 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101555

Richard Biener  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #8 from Richard Biener  ---
Fixed in GCC 12.

[Bug tree-optimization/101555] Compile slowdown in tree PRE

2021-09-07 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101555

--- Comment #7 from Richard Biener  ---
The committed change improves compile time to less than 50s. In principle it
also applies to the GCC 11 and 10 branches, where the related issue was fixed.

[Bug tree-optimization/101555] Compile slowdown in tree PRE

2021-09-07 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101555

--- Comment #6 from CVS Commits  ---
The master branch has been updated by Richard Biener :

https://gcc.gnu.org/g:f387ff788f63c1974479644edae728047f843ec4

commit r12-3378-gf387ff788f63c1974479644edae728047f843ec4
Author: Richard Biener 
Date:   Tue Sep 7 10:35:42 2021 +0200

tree-optimization/101555 - avoid redundant alias queries in PRE

This avoids doing redundant work during PHI translation to invalidate
mems when translating their corresponding VUSE through the block's
virtual PHI node.  All the invalidation work is already done by
prune_clobbered_mems.

This speeds up the compile of the testcase from 275s with PRE
taking 91% of the compile-time down to 43s with PRE taking 16%
of the compile-time.

2021-09-07  Richard Biener  

PR tree-optimization/101555
* tree-ssa-pre.c (translate_vuse_through_block): Do not
perform an alias walk to determine the validity of the
mem at the start of the block which is already guaranteed
by means of prune_clobbered_mems.
(phi_translate_1): Pass edge to translate_vuse_through_block.

[Bug tree-optimization/101555] Compile slowdown in tree PRE

2021-09-07 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101555

--- Comment #5 from Richard Biener  ---
OK, so most of the time is spent in the ANTIC computation, specifically in PHI
translation, where translate_vuse_through_block performs the (rate-limited)
stmt_may_clobber_ref_p_1 query.

It's a bit fishy that we're doing these things "twice". In particular,
we are supposed to translate ANTIC_IN to the predecessor's ANTIC_OUT, so
all VUSEs should already be top-of-block.  But then in compute_antic_aux
we're doing

  /* Prune expressions that are clobbered in block and thus become
     invalid if translated from ANTIC_OUT to ANTIC_IN.  */
  prune_clobbered_mems (ANTIC_OUT, block);

which is supposed to do the "translation" through the block, but it does
not adjust the VUSEs of expressions.  It uses value_dies_in_block_x, which
has a cache but follows somewhat different logic with respect to the VUSEs
in the expression.

The main complication here is that the expression set we start from is
taken from the VN tables which may put "pre-translated" VUSEs in rather
than starting with the VUSEs as they were.

[Bug tree-optimization/101555] Compile slowdown in tree PRE

2021-09-06 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101555

Richard Biener  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|[12 Regression] Compile     |Compile slowdown in tree
                   |slowdown in tree PRE        |PRE
   Target Milestone|12.0                        |---
           Keywords|needs-bisection             |

--- Comment #4 from Richard Biener  ---
Meanwhile GCC 12 seems to be as fast as GCC 11 (again).

PRE is still slow, though; in this case the profile seems to be mostly
alias-walk related.