[Bug tree-optimization/43627] [4.5 Regression] slow compilation (tree canonical iv takes 75%)
--- Comment #21 from rguenth at gcc dot gnu dot org 2010-04-15 13:47 ---
Subject: Bug 43627

Author: rguenth
Date: Thu Apr 15 13:46:42 2010
New Revision: 158377

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=158377
Log:
2010-04-15  Richard Guenther  rguent...@suse.de

        PR tree-optimization/43627
        * tree-vrp.c (extract_range_from_unary_expr): Widenings
        of [1, +INF(OVF)] go to [1, +INF(OVF)] of the wider type,
        not varying.

        * gcc.dg/tree-ssa/vrp49.c: New testcase.

Added:
    branches/gcc-4_5-branch/gcc/testsuite/gcc.dg/tree-ssa/vrp49.c
Modified:
    branches/gcc-4_5-branch/gcc/ChangeLog
    branches/gcc-4_5-branch/gcc/testsuite/ChangeLog
    branches/gcc-4_5-branch/gcc/tree-vrp.c

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43627
[Bug tree-optimization/43627] [4.5 Regression] slow compilation (tree canonical iv takes 75%)
--- Comment #22 from rguenth at gcc dot gnu dot org 2010-04-15 13:47 ---
Fixed.

--
rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
             Status|ASSIGNED                    |RESOLVED
      Known to work|4.4.3 4.6.0                 |4.4.3 4.5.1 4.6.0
         Resolution|                            |FIXED

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43627
[Bug tree-optimization/43627] [4.5 Regression] slow compilation (tree canonical iv takes 75%)
--- Comment #18 from rguenth at gcc dot gnu dot org 2010-04-06 11:21 ---
GCC 4.5.0 is being released.  Deferring to 4.5.1.

--
rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
   Target Milestone|4.5.0                       |4.5.1

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43627
[Bug tree-optimization/43627] [4.5 Regression] slow compilation (tree canonical iv takes 75%)
--- Comment #19 from rguenth at gcc dot gnu dot org 2010-04-06 12:32 ---
Subject: Bug 43627

Author: rguenth
Date: Tue Apr  6 12:32:25 2010
New Revision: 157992

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=157992
Log:
2010-04-06  Richard Guenther  rguent...@suse.de

        PR tree-optimization/43627
        * tree-vrp.c (extract_range_from_unary_expr): Widenings
        of [1, +INF(OVF)] go to [1, +INF(OVF)] of the wider type,
        not varying.

        * gcc.dg/tree-ssa/vrp49.c: New testcase.

Added:
    trunk/gcc/testsuite/gcc.dg/tree-ssa/vrp49.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/tree-vrp.c

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43627
[Bug tree-optimization/43627] [4.5 Regression] slow compilation (tree canonical iv takes 75%)
--- Comment #20 from rguenth at gcc dot gnu dot org 2010-04-06 12:33 ---
Fixed on trunk so far.  Queued for 4.5.1.

--
rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
      Known to fail|                            |4.5.0
      Known to work|                            |4.4.3 4.6.0

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43627
[Bug tree-optimization/43627] [4.5 Regression] slow compilation (tree canonical iv takes 75%)
--
rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
           Priority|P3                          |P2

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43627
[Bug tree-optimization/43627] [4.5 Regression] slow compilation
--- Comment #1 from jv244 at cam dot ac dot uk 2010-04-02 08:16 ---
Created an attachment (id=20287)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=20287&action=view)
testcase

reproduce with

gfortran -fbounds-check -g -O3 -ffast-math -funroll-loops -ftree-vectorize -march=native -c hog.f90

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43627
[Bug tree-optimization/43627] [4.5 Regression] slow compilation
--
jv244 at cam dot ac dot uk changed:

           What    |Removed                     |Added
   Target Milestone|---                         |4.5.0

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43627
[Bug tree-optimization/43627] [4.5 Regression] slow compilation (tree canonical iv )
--- Comment #2 from jv244 at cam dot ac dot uk 2010-04-02 08:27 ---
And a timing report as well (notice the machine is not fully idle). The major consumer is tree canonical.

Execution times (seconds)
 garbage collection          :  7.71 ( 2%) usr  0.07 ( 4%) sys 14.12 ( 2%) wall     0 kB ( 0%) ggc
 callgraph construction      :  0.18 ( 0%) usr  0.01 ( 1%) sys  0.24 ( 0%) wall  6675 kB ( 1%) ggc
 callgraph optimization      :  0.61 ( 0%) usr  0.03 ( 2%) sys  0.61 ( 0%) wall  1655 kB ( 0%) ggc
 ipa cp                      :  0.19 ( 0%) usr  0.00 ( 0%) sys  0.19 ( 0%) wall   539 kB ( 0%) ggc
 ipa reference               :  0.15 ( 0%) usr  0.00 ( 0%) sys  0.15 ( 0%) wall     0 kB ( 0%) ggc
 ipa pure const              :  0.17 ( 0%) usr  0.00 ( 0%) sys  0.17 ( 0%) wall     0 kB ( 0%) ggc
 ipa SRA                     :  0.00 ( 0%) usr  0.00 ( 0%) sys  0.01 ( 0%) wall     0 kB ( 0%) ggc
 cfg cleanup                 :  0.78 ( 0%) usr  0.01 ( 1%) sys  1.27 ( 0%) wall  3661 kB ( 0%) ggc
 CFG verifier                :  2.10 ( 1%) usr  0.00 ( 0%) sys  3.40 ( 1%) wall     0 kB ( 0%) ggc
 trivially dead code         :  0.38 ( 0%) usr  0.00 ( 0%) sys  0.40 ( 0%) wall     0 kB ( 0%) ggc
 df multiple defs            :  0.59 ( 0%) usr  0.00 ( 0%) sys  0.92 ( 0%) wall     0 kB ( 0%) ggc
 df reaching defs            :  0.86 ( 0%) usr  0.00 ( 0%) sys  1.83 ( 0%) wall     0 kB ( 0%) ggc
 df live regs                :  4.92 ( 1%) usr  0.01 ( 1%) sys  8.23 ( 1%) wall     0 kB ( 0%) ggc
 df live&initialized regs    :  1.48 ( 0%) usr  0.01 ( 1%) sys  3.37 ( 1%) wall     0 kB ( 0%) ggc
 df use-def / def-use chains :  0.71 ( 0%) usr  0.00 ( 0%) sys  1.39 ( 0%) wall     0 kB ( 0%) ggc
 df reg dead/unused notes    :  4.15 ( 1%) usr  0.01 ( 1%) sys  7.47 ( 1%) wall  9314 kB ( 1%) ggc
 register information        :  1.29 ( 0%) usr  0.01 ( 1%) sys  3.00 ( 0%) wall     0 kB ( 0%) ggc
 alias analysis              :  0.64 ( 0%) usr  0.00 ( 0%) sys  0.74 ( 0%) wall 21770 kB ( 3%) ggc
 alias stmt walking          :  1.94 ( 1%) usr  0.06 ( 4%) sys  3.50 ( 1%) wall     0 kB ( 0%) ggc
 register scan               :  0.18 ( 0%) usr  0.00 ( 0%) sys  0.11 ( 0%) wall     0 kB ( 0%) ggc
 rebuild jump labels         :  0.23 ( 0%) usr  0.00 ( 0%) sys  0.26 ( 0%) wall     0 kB ( 0%) ggc
 parser                      :  1.27 ( 0%) usr  0.12 ( 7%) sys  1.50 ( 0%) wall 42200 kB ( 5%) ggc
 inline heuristics           :  0.43 ( 0%) usr  0.02 ( 1%) sys  0.34 ( 0%) wall     0 kB ( 0%) ggc
 tree gimplify               :  0.69 ( 0%) usr  0.03 ( 2%) sys  0.79 ( 0%) wall 52375 kB ( 6%) ggc
 tree eh                     :  0.03 ( 0%) usr  0.00 ( 0%) sys  0.06 ( 0%) wall     0 kB ( 0%) ggc
 tree CFG construction       :  0.08 ( 0%) usr  0.00 ( 0%) sys  0.02 ( 0%) wall  9418 kB ( 1%) ggc
 tree CFG cleanup            :  0.49 ( 0%) usr  0.00 ( 0%) sys  0.80 ( 0%) wall   418 kB ( 0%) ggc
 tree VRP                    :  2.08 ( 1%) usr  0.05 ( 3%) sys  3.67 ( 1%) wall 54923 kB ( 7%) ggc
 tree copy propagation       :  0.37 ( 0%) usr  0.00 ( 0%) sys  0.59 ( 0%) wall   237 kB ( 0%) ggc
 tree find ref. vars         :  0.07 ( 0%) usr  0.02 ( 1%) sys  0.09 ( 0%) wall  3774 kB ( 0%) ggc
 tree PTA                    :  0.19 ( 0%) usr  0.00 ( 0%) sys  0.19 ( 0%) wall   425 kB ( 0%) ggc
 tree PHI insertion          :  0.02 ( 0%) usr  0.00 ( 0%) sys  0.03 ( 0%) wall   315 kB ( 0%) ggc
 tree SSA rewrite            :  0.44 ( 0%) usr  0.03 ( 2%) sys  0.80 ( 0%) wall 20682 kB ( 3%) ggc
 tree SSA other              :  0.22 ( 0%) usr  0.02 ( 1%) sys  0.23 ( 0%) wall   434 kB ( 0%) ggc
 tree SSA incremental        :  0.62 ( 0%) usr  0.04 ( 2%) sys  0.91 ( 0%) wall   438 kB ( 0%) ggc
 tree operand scan           :  0.27 ( 0%) usr  0.14 ( 8%) sys  0.53 ( 0%) wall 21791 kB ( 3%) ggc
 dominator optimization      :  0.42 ( 0%) usr  0.00 ( 0%) sys  0.72 ( 0%) wall  4190 kB ( 1%) ggc
 tree CCP                    :  0.56 ( 0%) usr  0.01 ( 1%) sys  0.70 ( 0%) wall  3081 kB ( 0%) ggc
 tree PHI const/copy prop    :  0.05 ( 0%) usr  0.00 ( 0%) sys  0.01 ( 0%) wall    22 kB ( 0%) ggc
 tree split crit edges       :  0.03 ( 0%) usr  0.00 ( 0%) sys  0.10 ( 0%) wall  3268 kB ( 0%) ggc
 tree reassociation          :  0.17 ( 0%) usr  0.00 ( 0%) sys  0.36 ( 0%) wall   161 kB ( 0%) ggc
 tree PRE                    :  6.54 ( 2%) usr  0.02 ( 1%) sys 11.71 ( 2%) wall 25200 kB ( 3%) ggc
 tree FRE                    :  0.76 ( 0%) usr  0.03 ( 2%) sys  1.15 ( 0%) wall  8100 kB ( 1%) ggc
 tree code sinking           :  0.23 ( 0%) usr  0.04 ( 2%) sys  0.44 ( 0%) wall 12275 kB ( 2%) ggc
 tree linearize phis         :  0.03 ( 0%) usr  0.00 ( 0%) sys  0.05 ( 0%) wall     0 kB ( 0%) ggc
 tree forward propagate      :  0.19 ( 0%) usr  0.01 ( 1%) sys  0.25 ( 0%) wall  9572 kB ( 1%) ggc
 tree phiprop                :  0.00 ( 0%) usr  0.00 ( 0%) sys  0.01 ( 0%) wall     0 kB ( 0%) ggc
 tree conservative DCE       :  0.19 ( 0%) usr  0.02 ( 1%) sys  0.51 ( 0%) wall    17 kB ( 0%) ggc
 tree aggressive DCE         :  0.49 ( 0%) usr  0.01 ( 1%) sys  0.74 ( 0%)
[Bug tree-optimization/43627] [4.5 Regression] slow compilation (tree canonical iv )
--- Comment #3 from steven at gcc dot gnu dot org 2010-04-02 09:18 ---
This tells me you are comparing apples and cows:

Extra diagnostic checks enabled; compiler may run slowly.

Could you try again with a compiler configured with --enable-checking=release?

--
steven at gcc dot gnu dot org changed:

           What    |Removed                     |Added
             Status|UNCONFIRMED                 |WAITING

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43627
[Bug tree-optimization/43627] [4.5 Regression] slow compilation (tree canonical iv )
--- Comment #4 from jv244 at cam dot ac dot uk 2010-04-02 09:26 ---
(In reply to comment #3)
> This tells me you are comparing apples and cows:
>
> Extra diagnostic checks enabled; compiler may run slowly.
>
> Could you try again with a compiler configured with --enable-checking=release?

I'll do that now... for reference, 4.4 has:

gfortran -ftime-report -fbounds-check -g -O3 -ffast-math -funroll-loops -ftree-vectorize -march=native hog.f90

Execution times (seconds)
 garbage collection          :  0.15 ( 1%) usr  0.00 ( 0%) sys  0.14 ( 1%) wall     0 kB ( 0%) ggc
 callgraph construction      :  0.33 ( 1%) usr  0.03 ( 4%) sys  0.33 ( 1%) wall  9447 kB ( 2%) ggc
 callgraph optimization      :  0.46 ( 2%) usr  0.01 ( 1%) sys  0.50 ( 2%) wall   239 kB ( 0%) ggc
 ipa cp                      :  0.22 ( 1%) usr  0.00 ( 0%) sys  0.24 ( 1%) wall     0 kB ( 0%) ggc
 ipa reference               :  0.02 ( 0%) usr  0.00 ( 0%) sys  0.01 ( 0%) wall     0 kB ( 0%) ggc
 cfg cleanup                 :  0.08 ( 0%) usr  0.00 ( 0%) sys  0.12 ( 0%) wall   914 kB ( 0%) ggc
 trivially dead code         :  0.10 ( 0%) usr  0.00 ( 0%) sys  0.04 ( 0%) wall     0 kB ( 0%) ggc
 df reaching defs            :  0.06 ( 0%) usr  0.00 ( 0%) sys  0.12 ( 0%) wall     0 kB ( 0%) ggc
 df live regs                :  0.22 ( 1%) usr  0.00 ( 0%) sys  0.18 ( 1%) wall     0 kB ( 0%) ggc
 df live&initialized regs    :  0.10 ( 0%) usr  0.00 ( 0%) sys  0.02 ( 0%) wall     0 kB ( 0%) ggc
 df use-def / def-use chains :  0.11 ( 0%) usr  0.00 ( 0%) sys  0.05 ( 0%) wall     0 kB ( 0%) ggc
 df reg dead/unused notes    :  0.17 ( 1%) usr  0.00 ( 0%) sys  0.09 ( 0%) wall  3443 kB ( 1%) ggc
 register information        :  0.07 ( 0%) usr  0.00 ( 0%) sys  0.06 ( 0%) wall     0 kB ( 0%) ggc
 alias analysis              :  0.14 ( 1%) usr  0.00 ( 0%) sys  0.12 ( 0%) wall  6273 kB ( 1%) ggc
 register scan               :  0.04 ( 0%) usr  0.00 ( 0%) sys  0.03 ( 0%) wall     0 kB ( 0%) ggc
 rebuild jump labels         :  0.08 ( 0%) usr  0.00 ( 0%) sys  0.01 ( 0%) wall     0 kB ( 0%) ggc
 parser                      :  1.27 ( 5%) usr  0.12 (16%) sys  1.31 ( 5%) wall 50936 kB ( 9%) ggc
 inline heuristics           :  0.13 ( 1%) usr  0.05 ( 6%) sys  0.25 ( 1%) wall     0 kB ( 0%) ggc
 tree gimplify               :  0.44 ( 2%) usr  0.04 ( 5%) sys  0.54 ( 2%) wall 61550 kB (11%) ggc
 tree eh                     :  0.00 ( 0%) usr  0.00 ( 0%) sys  0.03 ( 0%) wall     0 kB ( 0%) ggc
 tree CFG construction       :  0.05 ( 0%) usr  0.00 ( 0%) sys  0.02 ( 0%) wall  9734 kB ( 2%) ggc
 tree CFG cleanup            :  0.28 ( 1%) usr  0.00 ( 0%) sys  0.18 ( 1%) wall   668 kB ( 0%) ggc
 tree VRP                    :  1.21 ( 5%) usr  0.03 ( 4%) sys  1.26 ( 5%) wall 42193 kB ( 8%) ggc
 tree copy propagation       :  0.21 ( 1%) usr  0.00 ( 0%) sys  0.24 ( 1%) wall   315 kB ( 0%) ggc
 tree find ref. vars         :  0.07 ( 0%) usr  0.00 ( 0%) sys  0.04 ( 0%) wall  8937 kB ( 2%) ggc
 tree PTA                    :  0.10 ( 0%) usr  0.00 ( 0%) sys  0.10 ( 0%) wall   758 kB ( 0%) ggc
 tree alias analysis         :  0.12 ( 0%) usr  0.05 ( 6%) sys  0.12 ( 0%) wall    77 kB ( 0%) ggc
 tree call clobbering        :  0.01 ( 0%) usr  0.00 ( 0%) sys  0.01 ( 0%) wall    18 kB ( 0%) ggc
 tree flow sensitive alias   :  0.02 ( 0%) usr  0.00 ( 0%) sys  0.04 ( 0%) wall   121 kB ( 0%) ggc
 tree flow insensitive alias :  0.02 ( 0%) usr  0.00 ( 0%) sys  0.00 ( 0%) wall     0 kB ( 0%) ggc
 tree memory partitioning    :  0.00 ( 0%) usr  0.00 ( 0%) sys  0.02 ( 0%) wall    21 kB ( 0%) ggc
 tree PHI insertion          :  0.02 ( 0%) usr  0.00 ( 0%) sys  0.02 ( 0%) wall   201 kB ( 0%) ggc
 tree SSA rewrite            :  0.17 ( 1%) usr  0.01 ( 1%) sys  0.13 ( 1%) wall 19668 kB ( 4%) ggc
 tree SSA other              :  0.11 ( 0%) usr  0.03 ( 4%) sys  0.18 ( 1%) wall   360 kB ( 0%) ggc
 tree SSA incremental        :  0.24 ( 1%) usr  0.02 ( 3%) sys  0.25 ( 1%) wall    40 kB ( 0%) ggc
 tree operand scan           :  0.36 ( 1%) usr  0.15 (19%) sys  0.58 ( 2%) wall 27070 kB ( 5%) ggc
 dominator optimization      :  0.26 ( 1%) usr  0.00 ( 0%) sys  0.14 ( 1%) wall  2270 kB ( 0%) ggc
 tree SRA                    :  0.01 ( 0%) usr  0.00 ( 0%) sys  0.01 ( 0%) wall     0 kB ( 0%) ggc
 tree CCP                    :  0.33 ( 1%) usr  0.01 ( 1%) sys  0.24 ( 1%) wall  4060 kB ( 1%) ggc
 tree reassociation          :  0.06 ( 0%) usr  0.00 ( 0%) sys  0.10 ( 0%) wall   124 kB ( 0%) ggc
 tree PRE                    :  5.18 (21%) usr  0.05 ( 6%) sys  5.07 (20%) wall 87699 kB (16%) ggc
 tree FRE                    :  0.51 ( 2%) usr  0.00 ( 0%) sys  0.55 ( 2%) wall  7664 kB ( 1%) ggc
 tree code sinking           :  0.04 ( 0%) usr  0.00 ( 0%) sys  0.03 ( 0%) wall    75 kB ( 0%) ggc
 tree linearize phis         :  0.00 ( 0%) usr  0.00 ( 0%) sys  0.01 ( 0%) wall     0 kB ( 0%) ggc
 tree forward propagate      :  0.11 ( 0%) usr  0.02 ( 3%) sys  0.11 ( 0%) wall 11274 kB ( 2%) ggc
 tree phiprop                :  0.00 ( 0%) usr  0.00 ( 0%) sys  0.01 ( 0%) wall     0 kB (
[Bug tree-optimization/43627] [4.5 Regression] slow compilation (tree canonical iv )
--- Comment #5 from jv244 at cam dot ac dot uk 2010-04-02 09:47 ---
(In reply to comment #3)

Cows with cows now (i.e. --enable-checking=release), on an idle machine.

Execution times (seconds)
 garbage collection          :  0.29 ( 0%) usr  0.00 ( 0%) sys  0.31 ( 0%) wall     0 kB ( 0%) ggc
 callgraph construction      :  0.11 ( 0%) usr  0.01 ( 1%) sys  0.12 ( 0%) wall  5939 kB ( 1%) ggc
 callgraph optimization      :  0.29 ( 0%) usr  0.00 ( 0%) sys  0.25 ( 0%) wall   184 kB ( 0%) ggc
 ipa cp                      :  0.10 ( 0%) usr  0.00 ( 0%) sys  0.10 ( 0%) wall   539 kB ( 0%) ggc
 ipa reference               :  0.07 ( 0%) usr  0.00 ( 0%) sys  0.08 ( 0%) wall     0 kB ( 0%) ggc
 ipa pure const              :  0.09 ( 0%) usr  0.00 ( 0%) sys  0.14 ( 0%) wall     0 kB ( 0%) ggc
 cfg cleanup                 :  0.67 ( 0%) usr  0.00 ( 0%) sys  0.83 ( 0%) wall  3661 kB ( 1%) ggc
 trivially dead code         :  0.21 ( 0%) usr  0.00 ( 0%) sys  0.17 ( 0%) wall     0 kB ( 0%) ggc
 df multiple defs            :  0.35 ( 0%) usr  0.00 ( 0%) sys  0.36 ( 0%) wall     0 kB ( 0%) ggc
 df reaching defs            :  0.69 ( 0%) usr  0.00 ( 0%) sys  0.65 ( 0%) wall     0 kB ( 0%) ggc
 df live regs                :  3.08 ( 1%) usr  0.00 ( 0%) sys  3.07 ( 1%) wall     0 kB ( 0%) ggc
 df live&initialized regs    :  1.17 ( 0%) usr  0.00 ( 0%) sys  1.07 ( 0%) wall     0 kB ( 0%) ggc
 df use-def / def-use chains :  0.53 ( 0%) usr  0.00 ( 0%) sys  0.35 ( 0%) wall     0 kB ( 0%) ggc
 df reg dead/unused notes    :  2.50 ( 1%) usr  0.00 ( 0%) sys  2.73 ( 1%) wall  9314 kB ( 1%) ggc
 register information        :  1.05 ( 0%) usr  0.00 ( 0%) sys  0.84 ( 0%) wall     0 kB ( 0%) ggc
 alias analysis              :  0.58 ( 0%) usr  0.00 ( 0%) sys  0.61 ( 0%) wall 21770 kB ( 3%) ggc
 alias stmt walking          :  1.29 ( 0%) usr  0.04 ( 4%) sys  1.36 ( 0%) wall     0 kB ( 0%) ggc
 register scan               :  0.09 ( 0%) usr  0.00 ( 0%) sys  0.10 ( 0%) wall     0 kB ( 0%) ggc
 rebuild jump labels         :  0.21 ( 0%) usr  0.00 ( 0%) sys  0.25 ( 0%) wall     0 kB ( 0%) ggc
 parser                      :  1.15 ( 0%) usr  0.12 (11%) sys  1.26 ( 0%) wall 42200 kB ( 6%) ggc
 inline heuristics           :  0.24 ( 0%) usr  0.01 ( 1%) sys  0.24 ( 0%) wall     0 kB ( 0%) ggc
 tree gimplify               :  0.43 ( 0%) usr  0.05 ( 4%) sys  0.47 ( 0%) wall 52375 kB ( 8%) ggc
 tree eh                     :  0.02 ( 0%) usr  0.00 ( 0%) sys  0.02 ( 0%) wall     0 kB ( 0%) ggc
 tree CFG construction       :  0.06 ( 0%) usr  0.00 ( 0%) sys  0.07 ( 0%) wall  9418 kB ( 1%) ggc
 tree CFG cleanup            :  0.27 ( 0%) usr  0.00 ( 0%) sys  0.46 ( 0%) wall   418 kB ( 0%) ggc
 tree VRP                    :  1.57 ( 1%) usr  0.06 ( 5%) sys  1.60 ( 1%) wall 54731 kB ( 8%) ggc
 tree copy propagation       :  0.20 ( 0%) usr  0.00 ( 0%) sys  0.29 ( 0%) wall   237 kB ( 0%) ggc
 tree find ref. vars         :  0.03 ( 0%) usr  0.01 ( 1%) sys  0.10 ( 0%) wall  3774 kB ( 1%) ggc
 tree PTA                    :  0.16 ( 0%) usr  0.00 ( 0%) sys  0.08 ( 0%) wall   423 kB ( 0%) ggc
 tree PHI insertion          :  0.01 ( 0%) usr  0.00 ( 0%) sys  0.02 ( 0%) wall   315 kB ( 0%) ggc
 tree SSA rewrite            :  0.24 ( 0%) usr  0.02 ( 2%) sys  0.19 ( 0%) wall 20682 kB ( 3%) ggc
 tree SSA other              :  0.10 ( 0%) usr  0.04 ( 4%) sys  0.19 ( 0%) wall   434 kB ( 0%) ggc
 tree SSA incremental        :  0.56 ( 0%) usr  0.02 ( 2%) sys  0.66 ( 0%) wall   438 kB ( 0%) ggc
 tree operand scan           :  0.21 ( 0%) usr  0.20 (18%) sys  0.42 ( 0%) wall 21791 kB ( 3%) ggc
 dominator optimization      :  0.35 ( 0%) usr  0.01 ( 1%) sys  0.36 ( 0%) wall  4189 kB ( 1%) ggc
 tree SRA                    :  0.01 ( 0%) usr  0.00 ( 0%) sys  0.00 ( 0%) wall     0 kB ( 0%) ggc
 tree CCP                    :  0.49 ( 0%) usr  0.00 ( 0%) sys  0.34 ( 0%) wall  3081 kB ( 0%) ggc
 tree PHI const/copy prop    :  0.02 ( 0%) usr  0.00 ( 0%) sys  0.02 ( 0%) wall    22 kB ( 0%) ggc
 tree split crit edges       :  0.04 ( 0%) usr  0.00 ( 0%) sys  0.00 ( 0%) wall  3265 kB ( 0%) ggc
 tree reassociation          :  0.12 ( 0%) usr  0.01 ( 1%) sys  0.11 ( 0%) wall   161 kB ( 0%) ggc
 tree PRE                    :  4.88 ( 2%) usr  0.00 ( 0%) sys  4.89 ( 2%) wall 25200 kB ( 4%) ggc
 tree FRE                    :  0.65 ( 0%) usr  0.02 ( 2%) sys  0.67 ( 0%) wall  8099 kB ( 1%) ggc
 tree code sinking           :  0.16 ( 0%) usr  0.05 ( 4%) sys  0.17 ( 0%) wall 12275 kB ( 2%) ggc
 tree linearize phis         :  0.00 ( 0%) usr  0.00 ( 0%) sys  0.03 ( 0%) wall     0 kB ( 0%) ggc
 tree forward propagate      :  0.14 ( 0%) usr  0.00 ( 0%) sys  0.17 ( 0%) wall  9572 kB ( 1%) ggc
 tree phiprop                :  0.01 ( 0%) usr  0.00 ( 0%) sys  0.00 ( 0%) wall     0 kB ( 0%) ggc
 tree conservative DCE       :  0.21 ( 0%) usr  0.03 ( 3%) sys  0.15 ( 0%) wall    17 kB ( 0%) ggc
 tree aggressive DCE         :  0.16 ( 0%) usr  0.00 ( 0%) sys  0.16 ( 0%) wall  2998 kB ( 0%) ggc
 tree DSE                    :  0.02 ( 0%) usr  0.00 ( 0%) sys  0.06 ( 0%) wall
[Bug tree-optimization/43627] [4.5 Regression] slow compilation (tree canonical iv takes 75%)
--
steven at gcc dot gnu dot org changed:

           What    |Removed                     |Added
                 CC|                            |rguenth at gcc dot gnu dot
                   |                            |org
             Status|WAITING                     |NEW
     Ever Confirmed|0                           |1
   Last reconfirmed|-00-00 00:00:00             |2010-04-02 10:25:15
               date|                            |
            Summary|[4.5 Regression] slow       |[4.5 Regression] slow
                   |compilation (tree canonical |compilation (tree canonical
                   |iv )                        |iv takes 75%)

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43627
[Bug tree-optimization/43627] [4.5 Regression] slow compilation (tree canonical iv takes 75%)
--- Comment #6 from rguenth at gcc dot gnu dot org 2010-04-02 12:19 ---
The issue is for certain the many manually unrolled loops and possibly the new autoinc code.

What's your native arch?  I can't reproduce this on a core i?86.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43627
[Bug tree-optimization/43627] [4.5 Regression] slow compilation (tree canonical iv takes 75%)
--- Comment #7 from jv244 at cam dot ac dot uk 2010-04-02 12:28 ---
(In reply to comment #6)
> What's your native arch?  I can't reproduce this on a core i?86.

-v output:

/data03/vondele/gcc_trunk/build/libexec/gcc/x86_64-unknown-linux-gnu/4.5.0/f951 hog.f90 -march=k8-sse3 -mcx16 -msahf --param l1-cache-size=64 --param l1-cache-line-size=64 --param l2-cache-size=1024 -mtune=k8 -quiet -dumpbase hog.f90 -auxbase hog -g -O3 -version -fbounds-check -ffast-math -funroll-loops -ftree-vectorize -fintrinsic-modules-path /data03/vondele/gcc_trunk/build/lib/gcc/x86_64-unknown-linux-gnu/4.5.0/finclude -o /tmp/ccA2YvFn.s

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43627
[Bug tree-optimization/43627] [4.5 Regression] slow compilation (tree canonical iv takes 75%)
--- Comment #8 from rguenth at gcc dot gnu dot org 2010-04-02 14:07 ---
Confirmed on x86_64-linux with -O2 -fbounds-check.  find_loop_niter_by_eval takes a lot of time in each of the ints2bits_* routines because the loops have a lot of exits (due to -fbounds-check).

--
rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
 GCC target triplet|                            |x86_64-*-*

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43627
[Bug tree-optimization/43627] [4.5 Regression] slow compilation (tree canonical iv takes 75%)
--- Comment #9 from jv244 at cam dot ac dot uk 2010-04-02 14:07 ---
Created an attachment (id=20290)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=20290&action=view)
smaller testcase (needs 3s, 80% in tree canonical iv)

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43627
[Bug tree-optimization/43627] [4.5 Regression] slow compilation (tree canonical iv takes 75%)
--- Comment #10 from rguenth at gcc dot gnu dot org 2010-04-02 14:08 ---
Created an attachment (id=20291)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=20291&action=view)
reduced testcase

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43627
[Bug tree-optimization/43627] [4.5 Regression] slow compilation (tree canonical iv takes 75%)
--- Comment #11 from rguenth at gcc dot gnu dot org 2010-04-02 14:13 ---
Compared to 4.4 we no longer eliminate most of the bound checks in 4.5.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43627
[Bug tree-optimization/43627] [4.5 Regression] slow compilation (tree canonical iv takes 75%)
--- Comment #12 from jv244 at cam dot ac dot uk 2010-04-02 14:17 ---
(In reply to comment #9)
> Created an attachment (id=20290)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=20290&action=view) [edit]
> smaller testcase (needs 3s, 80% in tree canonical iv)

From valgrind, I see some 1300 calls to get_val_for / fold_binary_loc for the small testcase.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43627
[Bug tree-optimization/43627] [4.5 Regression] slow compilation (tree canonical iv takes 75%)
--- Comment #13 from rguenth at gcc dot gnu dot org 2010-04-02 14:23 ---
Testcase for that:

MODULE hfx_compression_core_methods
  IMPLICIT NONE
  INTEGER, PARAMETER :: int_8=8
CONTAINS
  SUBROUTINE ints2bits_3(Ndata,packed_data,full_data)
    INTEGER, INTENT(IN) :: Ndata
    INTEGER(KIND=int_8), INTENT(OUT) :: packed_data(*)
    INTEGER(KIND=int_8), INTENT(IN) :: full_data(*)
    INTEGER, PARAMETER :: Nbits = 3
    INTEGER :: idata, ipack, kdata, Ndata_rep
    INTEGER(KIND=int_8) :: data_tmp, pack_tmp
    idata=0
    ipack=0
    Ndata_rep=(Ndata/2)*2
    DO kdata=1,Ndata_rep,2
      pack_tmp=0
      idata=idata+1
      data_tmp = full_data(idata)
      data_tmp = ISHFT(data_tmp,61)
      pack_tmp = IOR(pack_tmp,data_tmp)
      pack_tmp = ISHFT(pack_tmp,-3)
      idata=idata+1
      data_tmp = full_data(idata)
      data_tmp = ISHFT(data_tmp,61)
      pack_tmp = IOR(pack_tmp,data_tmp)
      pack_tmp = ISHFT(pack_tmp,0)
      pack_tmp = ISHFT(pack_tmp,0)
      ipack = ipack + 1
      packed_data(ipack) = pack_tmp
    ENDDO
  END SUBROUTINE ints2bits_3
END MODULE hfx_compression_core_methods

likely caused by

2010-02-16  Richard Guenther  rguent...@suse.de

        PR tree-optimization/41043
        * tree-vrp.c (vrp_var_may_overflow): Only ask SCEV for real loops.
        (vrp_visit_assignment_or_call): Do not ask SCEV for regular
        statements ...
        (vrp_visit_phi_node): ... but only for loop PHI nodes.

--
rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
         AssignedTo|unassigned at gcc dot gnu   |rguenth at gcc dot gnu dot
                   |dot org                     |org
             Status|NEW                         |ASSIGNED
   Last reconfirmed|2010-04-02 10:25:15         |2010-04-02 14:23:22
               date|                            |

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43627
[Bug tree-optimization/43627] [4.5 Regression] slow compilation (tree canonical iv takes 75%)
--- Comment #14 from rguenth at gcc dot gnu dot org 2010-04-02 14:26 ---
Interestingly it works on i?86 ...

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43627
[Bug tree-optimization/43627] [4.5 Regression] slow compilation (tree canonical iv takes 75%)
--- Comment #15 from rguenth at gcc dot gnu dot org 2010-04-02 14:39 ---
C testcase for the missed VRP, fails with long on x86_64 only, with long long also on i?86:

extern void link_error (void) __attribute__((noreturn));
int n;
float *x;
int main()
{
  if (n > 0)
    {
      int i = 0;
      do
        {
          long index;
          i = i + 1;
          index = i;
          if (index <= 0)
            link_error ();
          x[index] = 0;
          i = i + 1;
          index = i;
          if (index <= 0)
            link_error ();
          x[index] = 0;
        }
      while (i < n);
    }
}

--
rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
           Keywords|                            |missed-optimization

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43627
[Bug tree-optimization/43627] [4.5 Regression] slow compilation (tree canonical iv takes 75%)
--- Comment #16 from rguenth at gcc dot gnu dot org 2010-04-02 14:53 ---
It's the strict-overflow stuff that cripples VRP again here.  I have a kludge.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43627
[Bug tree-optimization/43627] [4.5 Regression] slow compilation (tree canonical iv takes 75%)
--- Comment #17 from rguenth at gcc dot gnu dot org 2010-04-02 15:10 ---
Created an attachment (id=20292)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=20292&action=view)
minimal patch

I'm testing this minimal patch.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43627