[Bug tree-optimization/43627] [4.5 Regression] slow compilation (tree canonical iv takes 75%)

2010-04-15 Thread rguenth at gcc dot gnu dot org


--- Comment #21 from rguenth at gcc dot gnu dot org  2010-04-15 13:47 ---
Subject: Bug 43627

Author: rguenth
Date: Thu Apr 15 13:46:42 2010
New Revision: 158377

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=158377
Log:
2010-04-15  Richard Guenther  rguent...@suse.de

PR tree-optimization/43627
* tree-vrp.c (extract_range_from_unary_expr): Widenings
of [1, +INF(OVF)] go to [1, +INF(OVF)] of the wider type,
not varying.

* gcc.dg/tree-ssa/vrp49.c: New testcase.

Added:
branches/gcc-4_5-branch/gcc/testsuite/gcc.dg/tree-ssa/vrp49.c
Modified:
branches/gcc-4_5-branch/gcc/ChangeLog
branches/gcc-4_5-branch/gcc/testsuite/ChangeLog
branches/gcc-4_5-branch/gcc/tree-vrp.c
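
The rule in the ChangeLog entry above can be pictured with a toy model. This is only a sketch, not the code in tree-vrp.c: the struct and all function names are hypothetical, and the "old" function models just the behaviour the entry describes (a range whose upper bound is +INF(OVF) becoming varying when converted to a wider type). It uses long long as the wider type so that the conversion is a real widening on both x86_64 and i?86.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Toy value range; hypothetical, not GCC's internal representation. */
struct toy_range {
    bool varying;         /* true: nothing is known about the value */
    long long min, max;   /* bounds, meaningful only if !varying */
    bool max_is_ovf_inf;  /* max stands for +INF(OVF) */
};

/* Modelled old behaviour: an overflow infinity makes the widened
   range varying, so the known lower bound is lost. */
struct toy_range widen_int_range_old(struct toy_range r)
{
    struct toy_range varying = { true, 0, 0, false };
    if (r.varying || r.max_is_ovf_inf)
        return varying;
    return r;             /* every int value fits in long long */
}

/* Modelled new behaviour: [1, +INF(OVF)] of the narrow type becomes
   [1, +INF(OVF)] of the wider type. */
struct toy_range widen_int_range_new(struct toy_range r)
{
    if (r.varying)
        return r;
    if (r.max_is_ovf_inf)
        r.max = LLONG_MAX; /* +INF of the wider type, still flagged OVF */
    return r;
}

/* A check such as "index <= 0" can be folded away only if this holds. */
bool toy_range_is_positive(struct toy_range r)
{
    return !r.varying && r.min > 0;
}
```

With the input range { false, 1, INT_MAX, true }, the old model yields a varying range and the positivity query fails, while the new model keeps the lower bound of 1 and the query succeeds.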


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43627



[Bug tree-optimization/43627] [4.5 Regression] slow compilation (tree canonical iv takes 75%)

2010-04-15 Thread rguenth at gcc dot gnu dot org


--- Comment #22 from rguenth at gcc dot gnu dot org  2010-04-15 13:47 ---
Fixed.


-- 

rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
      Known to work|4.4.3 4.6.0                 |4.4.3 4.5.1 4.6.0
         Resolution|                            |FIXED


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43627



[Bug tree-optimization/43627] [4.5 Regression] slow compilation (tree canonical iv takes 75%)

2010-04-06 Thread rguenth at gcc dot gnu dot org


--- Comment #18 from rguenth at gcc dot gnu dot org  2010-04-06 11:21 ---
GCC 4.5.0 is being released.  Deferring to 4.5.1.


-- 

rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|4.5.0                       |4.5.1


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43627



[Bug tree-optimization/43627] [4.5 Regression] slow compilation (tree canonical iv takes 75%)

2010-04-06 Thread rguenth at gcc dot gnu dot org


--- Comment #19 from rguenth at gcc dot gnu dot org  2010-04-06 12:32 ---
Subject: Bug 43627

Author: rguenth
Date: Tue Apr  6 12:32:25 2010
New Revision: 157992

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=157992
Log:
2010-04-06  Richard Guenther  rguent...@suse.de

PR tree-optimization/43627
* tree-vrp.c (extract_range_from_unary_expr): Widenings
of [1, +INF(OVF)] go to [1, +INF(OVF)] of the wider type,
not varying.

* gcc.dg/tree-ssa/vrp49.c: New testcase.

Added:
trunk/gcc/testsuite/gcc.dg/tree-ssa/vrp49.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/testsuite/ChangeLog
trunk/gcc/tree-vrp.c


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43627



[Bug tree-optimization/43627] [4.5 Regression] slow compilation (tree canonical iv takes 75%)

2010-04-06 Thread rguenth at gcc dot gnu dot org


--- Comment #20 from rguenth at gcc dot gnu dot org  2010-04-06 12:33 ---
Fixed on trunk so far.  Queued for 4.5.1.


-- 

rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
      Known to fail|                            |4.5.0
      Known to work|                            |4.4.3 4.6.0


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43627



[Bug tree-optimization/43627] [4.5 Regression] slow compilation (tree canonical iv takes 75%)

2010-04-03 Thread rguenth at gcc dot gnu dot org


-- 

rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P2


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43627



[Bug tree-optimization/43627] [4.5 Regression] slow compilation

2010-04-02 Thread jv244 at cam dot ac dot uk


--- Comment #1 from jv244 at cam dot ac dot uk  2010-04-02 08:16 ---
Created an attachment (id=20287)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=20287&action=view)
testcase

reproduce with 

gfortran -fbounds-check -g -O3 -ffast-math -funroll-loops -ftree-vectorize
-march=native -c hog.f90


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43627



[Bug tree-optimization/43627] [4.5 Regression] slow compilation

2010-04-02 Thread jv244 at cam dot ac dot uk


-- 

jv244 at cam dot ac dot uk changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |4.5.0


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43627



[Bug tree-optimization/43627] [4.5 Regression] slow compilation (tree canonical iv )

2010-04-02 Thread jv244 at cam dot ac dot uk


--- Comment #2 from jv244 at cam dot ac dot uk  2010-04-02 08:27 ---
And a timing report as well (note that the machine is not fully idle). The
major consumer is tree canonical iv.

Execution times (seconds)
 garbage collection    :   7.71 ( 2%) usr   0.07 ( 4%) sys  14.12 ( 2%) wall       0 kB ( 0%) ggc
 callgraph construction:   0.18 ( 0%) usr   0.01 ( 1%) sys   0.24 ( 0%) wall    6675 kB ( 1%) ggc
 callgraph optimization:   0.61 ( 0%) usr   0.03 ( 2%) sys   0.61 ( 0%) wall    1655 kB ( 0%) ggc
 ipa cp                :   0.19 ( 0%) usr   0.00 ( 0%) sys   0.19 ( 0%) wall     539 kB ( 0%) ggc
 ipa reference         :   0.15 ( 0%) usr   0.00 ( 0%) sys   0.15 ( 0%) wall       0 kB ( 0%) ggc
 ipa pure const        :   0.17 ( 0%) usr   0.00 ( 0%) sys   0.17 ( 0%) wall       0 kB ( 0%) ggc
 ipa SRA               :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       0 kB ( 0%) ggc
 cfg cleanup           :   0.78 ( 0%) usr   0.01 ( 1%) sys   1.27 ( 0%) wall    3661 kB ( 0%) ggc
 CFG verifier          :   2.10 ( 1%) usr   0.00 ( 0%) sys   3.40 ( 1%) wall       0 kB ( 0%) ggc
 trivially dead code   :   0.38 ( 0%) usr   0.00 ( 0%) sys   0.40 ( 0%) wall       0 kB ( 0%) ggc
 df multiple defs      :   0.59 ( 0%) usr   0.00 ( 0%) sys   0.92 ( 0%) wall       0 kB ( 0%) ggc
 df reaching defs      :   0.86 ( 0%) usr   0.00 ( 0%) sys   1.83 ( 0%) wall       0 kB ( 0%) ggc
 df live regs          :   4.92 ( 1%) usr   0.01 ( 1%) sys   8.23 ( 1%) wall       0 kB ( 0%) ggc
 df live&initialized regs:   1.48 ( 0%) usr   0.01 ( 1%) sys   3.37 ( 1%) wall       0 kB ( 0%) ggc
 df use-def / def-use chains:   0.71 ( 0%) usr   0.00 ( 0%) sys   1.39 ( 0%) wall       0 kB ( 0%) ggc
 df reg dead/unused notes:   4.15 ( 1%) usr   0.01 ( 1%) sys   7.47 ( 1%) wall    9314 kB ( 1%) ggc
 register information  :   1.29 ( 0%) usr   0.01 ( 1%) sys   3.00 ( 0%) wall       0 kB ( 0%) ggc
 alias analysis        :   0.64 ( 0%) usr   0.00 ( 0%) sys   0.74 ( 0%) wall   21770 kB ( 3%) ggc
 alias stmt walking    :   1.94 ( 1%) usr   0.06 ( 4%) sys   3.50 ( 1%) wall       0 kB ( 0%) ggc
 register scan         :   0.18 ( 0%) usr   0.00 ( 0%) sys   0.11 ( 0%) wall       0 kB ( 0%) ggc
 rebuild jump labels   :   0.23 ( 0%) usr   0.00 ( 0%) sys   0.26 ( 0%) wall       0 kB ( 0%) ggc
 parser                :   1.27 ( 0%) usr   0.12 ( 7%) sys   1.50 ( 0%) wall   42200 kB ( 5%) ggc
 inline heuristics     :   0.43 ( 0%) usr   0.02 ( 1%) sys   0.34 ( 0%) wall       0 kB ( 0%) ggc
 tree gimplify         :   0.69 ( 0%) usr   0.03 ( 2%) sys   0.79 ( 0%) wall   52375 kB ( 6%) ggc
 tree eh               :   0.03 ( 0%) usr   0.00 ( 0%) sys   0.06 ( 0%) wall       0 kB ( 0%) ggc
 tree CFG construction :   0.08 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall    9418 kB ( 1%) ggc
 tree CFG cleanup      :   0.49 ( 0%) usr   0.00 ( 0%) sys   0.80 ( 0%) wall     418 kB ( 0%) ggc
 tree VRP              :   2.08 ( 1%) usr   0.05 ( 3%) sys   3.67 ( 1%) wall   54923 kB ( 7%) ggc
 tree copy propagation :   0.37 ( 0%) usr   0.00 ( 0%) sys   0.59 ( 0%) wall     237 kB ( 0%) ggc
 tree find ref. vars   :   0.07 ( 0%) usr   0.02 ( 1%) sys   0.09 ( 0%) wall    3774 kB ( 0%) ggc
 tree PTA              :   0.19 ( 0%) usr   0.00 ( 0%) sys   0.19 ( 0%) wall     425 kB ( 0%) ggc
 tree PHI insertion    :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 0%) wall     315 kB ( 0%) ggc
 tree SSA rewrite      :   0.44 ( 0%) usr   0.03 ( 2%) sys   0.80 ( 0%) wall   20682 kB ( 3%) ggc
 tree SSA other        :   0.22 ( 0%) usr   0.02 ( 1%) sys   0.23 ( 0%) wall     434 kB ( 0%) ggc
 tree SSA incremental  :   0.62 ( 0%) usr   0.04 ( 2%) sys   0.91 ( 0%) wall     438 kB ( 0%) ggc
 tree operand scan     :   0.27 ( 0%) usr   0.14 ( 8%) sys   0.53 ( 0%) wall   21791 kB ( 3%) ggc
 dominator optimization:   0.42 ( 0%) usr   0.00 ( 0%) sys   0.72 ( 0%) wall    4190 kB ( 1%) ggc
 tree CCP              :   0.56 ( 0%) usr   0.01 ( 1%) sys   0.70 ( 0%) wall    3081 kB ( 0%) ggc
 tree PHI const/copy prop:   0.05 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall      22 kB ( 0%) ggc
 tree split crit edges :   0.03 ( 0%) usr   0.00 ( 0%) sys   0.10 ( 0%) wall    3268 kB ( 0%) ggc
 tree reassociation    :   0.17 ( 0%) usr   0.00 ( 0%) sys   0.36 ( 0%) wall     161 kB ( 0%) ggc
 tree PRE              :   6.54 ( 2%) usr   0.02 ( 1%) sys  11.71 ( 2%) wall   25200 kB ( 3%) ggc
 tree FRE              :   0.76 ( 0%) usr   0.03 ( 2%) sys   1.15 ( 0%) wall    8100 kB ( 1%) ggc
 tree code sinking     :   0.23 ( 0%) usr   0.04 ( 2%) sys   0.44 ( 0%) wall   12275 kB ( 2%) ggc
 tree linearize phis   :   0.03 ( 0%) usr   0.00 ( 0%) sys   0.05 ( 0%) wall       0 kB ( 0%) ggc
 tree forward propagate:   0.19 ( 0%) usr   0.01 ( 1%) sys   0.25 ( 0%) wall    9572 kB ( 1%) ggc
 tree phiprop          :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       0 kB ( 0%) ggc
 tree conservative DCE :   0.19 ( 0%) usr   0.02 ( 1%) sys   0.51 ( 0%) wall      17 kB ( 0%) ggc
 tree aggressive DCE   :   0.49 ( 0%) usr   0.01 ( 1%) sys   0.74 ( 0%) 

[Bug tree-optimization/43627] [4.5 Regression] slow compilation (tree canonical iv )

2010-04-02 Thread steven at gcc dot gnu dot org


--- Comment #3 from steven at gcc dot gnu dot org  2010-04-02 09:18 ---
This tells me you are comparing apples and cows: Extra diagnostic checks
enabled; compiler may run slowly.

Could you try again with a compiler configured with --enable-checking=release?


-- 

steven at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |WAITING


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43627



[Bug tree-optimization/43627] [4.5 Regression] slow compilation (tree canonical iv )

2010-04-02 Thread jv244 at cam dot ac dot uk


--- Comment #4 from jv244 at cam dot ac dot uk  2010-04-02 09:26 ---
(In reply to comment #3)
> This tells me you are comparing apples and cows: Extra diagnostic checks
> enabled; compiler may run slowly.
>
> Could you try again with a compiler configured with --enable-checking=release?
>

I'll do now...

for reference, 4.4 has:

 gfortran -ftime-report -fbounds-check -g -O3 -ffast-math -funroll-loops 
 -ftree-vectorize -march=native hog.f90

Execution times (seconds)
 garbage collection    :   0.15 ( 1%) usr   0.00 ( 0%) sys   0.14 ( 1%) wall       0 kB ( 0%) ggc
 callgraph construction:   0.33 ( 1%) usr   0.03 ( 4%) sys   0.33 ( 1%) wall    9447 kB ( 2%) ggc
 callgraph optimization:   0.46 ( 2%) usr   0.01 ( 1%) sys   0.50 ( 2%) wall     239 kB ( 0%) ggc
 ipa cp                :   0.22 ( 1%) usr   0.00 ( 0%) sys   0.24 ( 1%) wall       0 kB ( 0%) ggc
 ipa reference         :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       0 kB ( 0%) ggc
 cfg cleanup           :   0.08 ( 0%) usr   0.00 ( 0%) sys   0.12 ( 0%) wall     914 kB ( 0%) ggc
 trivially dead code   :   0.10 ( 0%) usr   0.00 ( 0%) sys   0.04 ( 0%) wall       0 kB ( 0%) ggc
 df reaching defs      :   0.06 ( 0%) usr   0.00 ( 0%) sys   0.12 ( 0%) wall       0 kB ( 0%) ggc
 df live regs          :   0.22 ( 1%) usr   0.00 ( 0%) sys   0.18 ( 1%) wall       0 kB ( 0%) ggc
 df live&initialized regs:   0.10 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall       0 kB ( 0%) ggc
 df use-def / def-use chains:   0.11 ( 0%) usr   0.00 ( 0%) sys   0.05 ( 0%) wall       0 kB ( 0%) ggc
 df reg dead/unused notes:   0.17 ( 1%) usr   0.00 ( 0%) sys   0.09 ( 0%) wall    3443 kB ( 1%) ggc
 register information  :   0.07 ( 0%) usr   0.00 ( 0%) sys   0.06 ( 0%) wall       0 kB ( 0%) ggc
 alias analysis        :   0.14 ( 1%) usr   0.00 ( 0%) sys   0.12 ( 0%) wall    6273 kB ( 1%) ggc
 register scan         :   0.04 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 0%) wall       0 kB ( 0%) ggc
 rebuild jump labels   :   0.08 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       0 kB ( 0%) ggc
 parser                :   1.27 ( 5%) usr   0.12 (16%) sys   1.31 ( 5%) wall   50936 kB ( 9%) ggc
 inline heuristics     :   0.13 ( 1%) usr   0.05 ( 6%) sys   0.25 ( 1%) wall       0 kB ( 0%) ggc
 tree gimplify         :   0.44 ( 2%) usr   0.04 ( 5%) sys   0.54 ( 2%) wall   61550 kB (11%) ggc
 tree eh               :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 0%) wall       0 kB ( 0%) ggc
 tree CFG construction :   0.05 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall    9734 kB ( 2%) ggc
 tree CFG cleanup      :   0.28 ( 1%) usr   0.00 ( 0%) sys   0.18 ( 1%) wall     668 kB ( 0%) ggc
 tree VRP              :   1.21 ( 5%) usr   0.03 ( 4%) sys   1.26 ( 5%) wall   42193 kB ( 8%) ggc
 tree copy propagation :   0.21 ( 1%) usr   0.00 ( 0%) sys   0.24 ( 1%) wall     315 kB ( 0%) ggc
 tree find ref. vars   :   0.07 ( 0%) usr   0.00 ( 0%) sys   0.04 ( 0%) wall    8937 kB ( 2%) ggc
 tree PTA              :   0.10 ( 0%) usr   0.00 ( 0%) sys   0.10 ( 0%) wall     758 kB ( 0%) ggc
 tree alias analysis   :   0.12 ( 0%) usr   0.05 ( 6%) sys   0.12 ( 0%) wall      77 kB ( 0%) ggc
 tree call clobbering  :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall      18 kB ( 0%) ggc
 tree flow sensitive alias:   0.02 ( 0%) usr   0.00 ( 0%) sys   0.04 ( 0%) wall     121 kB ( 0%) ggc
 tree flow insensitive alias:   0.02 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall       0 kB ( 0%) ggc
 tree memory partitioning:   0.00 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall      21 kB ( 0%) ggc
 tree PHI insertion    :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall     201 kB ( 0%) ggc
 tree SSA rewrite      :   0.17 ( 1%) usr   0.01 ( 1%) sys   0.13 ( 1%) wall   19668 kB ( 4%) ggc
 tree SSA other        :   0.11 ( 0%) usr   0.03 ( 4%) sys   0.18 ( 1%) wall     360 kB ( 0%) ggc
 tree SSA incremental  :   0.24 ( 1%) usr   0.02 ( 3%) sys   0.25 ( 1%) wall      40 kB ( 0%) ggc
 tree operand scan     :   0.36 ( 1%) usr   0.15 (19%) sys   0.58 ( 2%) wall   27070 kB ( 5%) ggc
 dominator optimization:   0.26 ( 1%) usr   0.00 ( 0%) sys   0.14 ( 1%) wall    2270 kB ( 0%) ggc
 tree SRA              :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       0 kB ( 0%) ggc
 tree CCP              :   0.33 ( 1%) usr   0.01 ( 1%) sys   0.24 ( 1%) wall    4060 kB ( 1%) ggc
 tree reassociation    :   0.06 ( 0%) usr   0.00 ( 0%) sys   0.10 ( 0%) wall     124 kB ( 0%) ggc
 tree PRE              :   5.18 (21%) usr   0.05 ( 6%) sys   5.07 (20%) wall   87699 kB (16%) ggc
 tree FRE              :   0.51 ( 2%) usr   0.00 ( 0%) sys   0.55 ( 2%) wall    7664 kB ( 1%) ggc
 tree code sinking     :   0.04 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 0%) wall      75 kB ( 0%) ggc
 tree linearize phis   :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       0 kB ( 0%) ggc
 tree forward propagate:   0.11 ( 0%) usr   0.02 ( 3%) sys   0.11 ( 0%) wall   11274 kB ( 2%) ggc
 tree phiprop          :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       0 kB ( 

[Bug tree-optimization/43627] [4.5 Regression] slow compilation (tree canonical iv )

2010-04-02 Thread jv244 at cam dot ac dot uk


--- Comment #5 from jv244 at cam dot ac dot uk  2010-04-02 09:47 ---
(In reply to comment #3)

cows with cows now (i.e. --enable-checking=release), on an idle machine.

Execution times (seconds)
 garbage collection    :   0.29 ( 0%) usr   0.00 ( 0%) sys   0.31 ( 0%) wall       0 kB ( 0%) ggc
 callgraph construction:   0.11 ( 0%) usr   0.01 ( 1%) sys   0.12 ( 0%) wall    5939 kB ( 1%) ggc
 callgraph optimization:   0.29 ( 0%) usr   0.00 ( 0%) sys   0.25 ( 0%) wall     184 kB ( 0%) ggc
 ipa cp                :   0.10 ( 0%) usr   0.00 ( 0%) sys   0.10 ( 0%) wall     539 kB ( 0%) ggc
 ipa reference         :   0.07 ( 0%) usr   0.00 ( 0%) sys   0.08 ( 0%) wall       0 kB ( 0%) ggc
 ipa pure const        :   0.09 ( 0%) usr   0.00 ( 0%) sys   0.14 ( 0%) wall       0 kB ( 0%) ggc
 cfg cleanup           :   0.67 ( 0%) usr   0.00 ( 0%) sys   0.83 ( 0%) wall    3661 kB ( 1%) ggc
 trivially dead code   :   0.21 ( 0%) usr   0.00 ( 0%) sys   0.17 ( 0%) wall       0 kB ( 0%) ggc
 df multiple defs      :   0.35 ( 0%) usr   0.00 ( 0%) sys   0.36 ( 0%) wall       0 kB ( 0%) ggc
 df reaching defs      :   0.69 ( 0%) usr   0.00 ( 0%) sys   0.65 ( 0%) wall       0 kB ( 0%) ggc
 df live regs          :   3.08 ( 1%) usr   0.00 ( 0%) sys   3.07 ( 1%) wall       0 kB ( 0%) ggc
 df live&initialized regs:   1.17 ( 0%) usr   0.00 ( 0%) sys   1.07 ( 0%) wall       0 kB ( 0%) ggc
 df use-def / def-use chains:   0.53 ( 0%) usr   0.00 ( 0%) sys   0.35 ( 0%) wall       0 kB ( 0%) ggc
 df reg dead/unused notes:   2.50 ( 1%) usr   0.00 ( 0%) sys   2.73 ( 1%) wall    9314 kB ( 1%) ggc
 register information  :   1.05 ( 0%) usr   0.00 ( 0%) sys   0.84 ( 0%) wall       0 kB ( 0%) ggc
 alias analysis        :   0.58 ( 0%) usr   0.00 ( 0%) sys   0.61 ( 0%) wall   21770 kB ( 3%) ggc
 alias stmt walking    :   1.29 ( 0%) usr   0.04 ( 4%) sys   1.36 ( 0%) wall       0 kB ( 0%) ggc
 register scan         :   0.09 ( 0%) usr   0.00 ( 0%) sys   0.10 ( 0%) wall       0 kB ( 0%) ggc
 rebuild jump labels   :   0.21 ( 0%) usr   0.00 ( 0%) sys   0.25 ( 0%) wall       0 kB ( 0%) ggc
 parser                :   1.15 ( 0%) usr   0.12 (11%) sys   1.26 ( 0%) wall   42200 kB ( 6%) ggc
 inline heuristics     :   0.24 ( 0%) usr   0.01 ( 1%) sys   0.24 ( 0%) wall       0 kB ( 0%) ggc
 tree gimplify         :   0.43 ( 0%) usr   0.05 ( 4%) sys   0.47 ( 0%) wall   52375 kB ( 8%) ggc
 tree eh               :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall       0 kB ( 0%) ggc
 tree CFG construction :   0.06 ( 0%) usr   0.00 ( 0%) sys   0.07 ( 0%) wall    9418 kB ( 1%) ggc
 tree CFG cleanup      :   0.27 ( 0%) usr   0.00 ( 0%) sys   0.46 ( 0%) wall     418 kB ( 0%) ggc
 tree VRP              :   1.57 ( 1%) usr   0.06 ( 5%) sys   1.60 ( 1%) wall   54731 kB ( 8%) ggc
 tree copy propagation :   0.20 ( 0%) usr   0.00 ( 0%) sys   0.29 ( 0%) wall     237 kB ( 0%) ggc
 tree find ref. vars   :   0.03 ( 0%) usr   0.01 ( 1%) sys   0.10 ( 0%) wall    3774 kB ( 1%) ggc
 tree PTA              :   0.16 ( 0%) usr   0.00 ( 0%) sys   0.08 ( 0%) wall     423 kB ( 0%) ggc
 tree PHI insertion    :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall     315 kB ( 0%) ggc
 tree SSA rewrite      :   0.24 ( 0%) usr   0.02 ( 2%) sys   0.19 ( 0%) wall   20682 kB ( 3%) ggc
 tree SSA other        :   0.10 ( 0%) usr   0.04 ( 4%) sys   0.19 ( 0%) wall     434 kB ( 0%) ggc
 tree SSA incremental  :   0.56 ( 0%) usr   0.02 ( 2%) sys   0.66 ( 0%) wall     438 kB ( 0%) ggc
 tree operand scan     :   0.21 ( 0%) usr   0.20 (18%) sys   0.42 ( 0%) wall   21791 kB ( 3%) ggc
 dominator optimization:   0.35 ( 0%) usr   0.01 ( 1%) sys   0.36 ( 0%) wall    4189 kB ( 1%) ggc
 tree SRA              :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall       0 kB ( 0%) ggc
 tree CCP              :   0.49 ( 0%) usr   0.00 ( 0%) sys   0.34 ( 0%) wall    3081 kB ( 0%) ggc
 tree PHI const/copy prop:   0.02 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall      22 kB ( 0%) ggc
 tree split crit edges :   0.04 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall    3265 kB ( 0%) ggc
 tree reassociation    :   0.12 ( 0%) usr   0.01 ( 1%) sys   0.11 ( 0%) wall     161 kB ( 0%) ggc
 tree PRE              :   4.88 ( 2%) usr   0.00 ( 0%) sys   4.89 ( 2%) wall   25200 kB ( 4%) ggc
 tree FRE              :   0.65 ( 0%) usr   0.02 ( 2%) sys   0.67 ( 0%) wall    8099 kB ( 1%) ggc
 tree code sinking     :   0.16 ( 0%) usr   0.05 ( 4%) sys   0.17 ( 0%) wall   12275 kB ( 2%) ggc
 tree linearize phis   :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 0%) wall       0 kB ( 0%) ggc
 tree forward propagate:   0.14 ( 0%) usr   0.00 ( 0%) sys   0.17 ( 0%) wall    9572 kB ( 1%) ggc
 tree phiprop          :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall       0 kB ( 0%) ggc
 tree conservative DCE :   0.21 ( 0%) usr   0.03 ( 3%) sys   0.15 ( 0%) wall      17 kB ( 0%) ggc
 tree aggressive DCE   :   0.16 ( 0%) usr   0.00 ( 0%) sys   0.16 ( 0%) wall    2998 kB ( 0%) ggc
 tree DSE              :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.06 ( 0%) wall   

[Bug tree-optimization/43627] [4.5 Regression] slow compilation (tree canonical iv takes 75%)

2010-04-02 Thread steven at gcc dot gnu dot org


-- 

steven at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rguenth at gcc dot gnu dot
                   |                            |org
             Status|WAITING                     |NEW
     Ever Confirmed|0                           |1
   Last reconfirmed|-00-00 00:00:00             |2010-04-02 10:25:15
               date|                            |
            Summary|[4.5 Regression] slow       |[4.5 Regression] slow
                   |compilation (tree canonical |compilation (tree canonical
                   |iv  )                       |iv  takes 75%)


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43627



[Bug tree-optimization/43627] [4.5 Regression] slow compilation (tree canonical iv takes 75%)

2010-04-02 Thread rguenth at gcc dot gnu dot org


--- Comment #6 from rguenth at gcc dot gnu dot org  2010-04-02 12:19 ---
The issue is for certain the many manually unrolled loops and possibly the
new autoinc code.

What's your native arch?  I can't reproduce this on a core i?86.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43627



[Bug tree-optimization/43627] [4.5 Regression] slow compilation (tree canonical iv takes 75%)

2010-04-02 Thread jv244 at cam dot ac dot uk


--- Comment #7 from jv244 at cam dot ac dot uk  2010-04-02 12:28 ---
(In reply to comment #6)
> What's your native arch?  I can't reproduce this on a core i?86.

-v output:


/data03/vondele/gcc_trunk/build/libexec/gcc/x86_64-unknown-linux-gnu/4.5.0/f951
hog.f90 -march=k8-sse3 -mcx16 -msahf --param l1-cache-size=64 --param
l1-cache-line-size=64 --param l2-cache-size=1024 -mtune=k8 -quiet -dumpbase
hog.f90 -auxbase hog -g -O3 -version -fbounds-check -ffast-math -funroll-loops
-ftree-vectorize -fintrinsic-modules-path
/data03/vondele/gcc_trunk/build/lib/gcc/x86_64-unknown-linux-gnu/4.5.0/finclude
-o /tmp/ccA2YvFn.s


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43627



[Bug tree-optimization/43627] [4.5 Regression] slow compilation (tree canonical iv takes 75%)

2010-04-02 Thread rguenth at gcc dot gnu dot org


--- Comment #8 from rguenth at gcc dot gnu dot org  2010-04-02 14:07 ---
Confirmed on x86_64-linux with -O2 -fbounds-check.

find_loop_niter_by_eval takes a lot of time in each of the ints2bits_*
routines because the loops have a lot of exits (due to -fbounds-check).
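
To picture the "lot of exits", here is a hand-written C analogue of a two-way unrolled, bounds-checked packing loop. It is only a sketch under assumptions: it is not what gfortran emits for -fbounds-check, the function name and return codes are hypothetical, and the packing arithmetic is simplified rather than copied from the ints2bits_* routines. The point is only that each guarded array access adds one more conditional branch out of the loop.

```c
#include <assert.h>

/* Hypothetical return codes; a real bounds check would call a runtime
   error routine instead of returning. */
enum { PACK_OK = 0, PACK_BOUNDS_ERROR = -1 };

/* Pack pairs of 3-bit values, using 1-based indices as Fortran does.
   The loop has one normal exit plus one exit per bounds check. */
int pack_pairs(int ndata, long long *packed, const long long *full)
{
    int idata = 0, ipack = 0;
    int ndata_rep = (ndata / 2) * 2;

    assert(ndata >= 0);
    for (int kdata = 1; kdata <= ndata_rep; kdata += 2) {
        long long pack_tmp = 0;

        idata = idata + 1;
        if (idata < 1)                      /* extra exit 1 */
            return PACK_BOUNDS_ERROR;
        pack_tmp |= full[idata - 1] & 7;

        idata = idata + 1;
        if (idata < 1)                      /* extra exit 2 */
            return PACK_BOUNDS_ERROR;
        pack_tmp |= (full[idata - 1] & 7) << 3;

        ipack = ipack + 1;
        if (ipack < 1)                      /* extra exit 3 */
            return PACK_BOUNDS_ERROR;
        packed[ipack - 1] = pack_tmp;
    }
    return PACK_OK;                         /* normal exit */
}
```

With more manual unrolling the number of guarded accesses, and so the number of exits per loop, grows accordingly; if the compiler can prove the indices are always at least 1, the guards and their exits disappear.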


-- 

rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
 GCC target triplet|                            |x86_64-*-*


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43627



[Bug tree-optimization/43627] [4.5 Regression] slow compilation (tree canonical iv takes 75%)

2010-04-02 Thread jv244 at cam dot ac dot uk


--- Comment #9 from jv244 at cam dot ac dot uk  2010-04-02 14:07 ---
Created an attachment (id=20290)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=20290&action=view)
smaller testcase (needs 3s, 80% in tree canonical iv)


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43627



[Bug tree-optimization/43627] [4.5 Regression] slow compilation (tree canonical iv takes 75%)

2010-04-02 Thread rguenth at gcc dot gnu dot org


--- Comment #10 from rguenth at gcc dot gnu dot org  2010-04-02 14:08 ---
Created an attachment (id=20291)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=20291&action=view)
reduced testcase


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43627



[Bug tree-optimization/43627] [4.5 Regression] slow compilation (tree canonical iv takes 75%)

2010-04-02 Thread rguenth at gcc dot gnu dot org


--- Comment #11 from rguenth at gcc dot gnu dot org  2010-04-02 14:13 ---
Compared to 4.4 we no longer eliminate most of the bound checks in 4.5.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43627



[Bug tree-optimization/43627] [4.5 Regression] slow compilation (tree canonical iv takes 75%)

2010-04-02 Thread jv244 at cam dot ac dot uk


--- Comment #12 from jv244 at cam dot ac dot uk  2010-04-02 14:17 ---
(In reply to comment #9)
> Created an attachment (id=20290)
>  --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=20290&action=view)
> smaller testcase (needs 3s, 80% in tree canonical iv)

from valgrind, I see some 1300 calls to get_val_for / fold_binary_loc, for
the small testcase


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43627



[Bug tree-optimization/43627] [4.5 Regression] slow compilation (tree canonical iv takes 75%)

2010-04-02 Thread rguenth at gcc dot gnu dot org


--- Comment #13 from rguenth at gcc dot gnu dot org  2010-04-02 14:23 ---
Testcase for that:

MODULE hfx_compression_core_methods

  IMPLICIT NONE

  INTEGER, PARAMETER :: int_8=8

CONTAINS

  SUBROUTINE ints2bits_3(Ndata,packed_data,full_data)
    INTEGER, INTENT(IN)              :: Ndata
    INTEGER(KIND=int_8), INTENT(OUT) :: packed_data(*)
    INTEGER(KIND=int_8), INTENT(IN)  :: full_data(*)

    INTEGER, PARAMETER               :: Nbits = 3

    INTEGER                          :: idata, ipack, kdata, Ndata_rep
    INTEGER(KIND=int_8)              :: data_tmp, pack_tmp

    idata=0
    ipack=0
    Ndata_rep=(Ndata/2)*2
    DO kdata=1,Ndata_rep,2
       pack_tmp=0
       idata=idata+1
       data_tmp = full_data(idata)
       data_tmp = ISHFT(data_tmp,61)
       pack_tmp = IOR(pack_tmp,data_tmp)
       pack_tmp = ISHFT(pack_tmp,-3)
       idata=idata+1
       data_tmp = full_data(idata)
       data_tmp = ISHFT(data_tmp,61)
       pack_tmp = IOR(pack_tmp,data_tmp)
       pack_tmp = ISHFT(pack_tmp,0)
       pack_tmp = ISHFT(pack_tmp,0)
       ipack = ipack + 1
       packed_data(ipack) = pack_tmp
    ENDDO
  END SUBROUTINE ints2bits_3

END MODULE hfx_compression_core_methods


likely caused by

2010-02-16  Richard Guenther  rguent...@suse.de

PR tree-optimization/41043
* tree-vrp.c  (vrp_var_may_overflow): Only ask SCEV for real loops.
(vrp_visit_assignment_or_call): Do not ask SCEV for regular
statements ...
(vrp_visit_phi_node): ... but only for loop PHI nodes.


-- 

rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         AssignedTo|unassigned at gcc dot gnu   |rguenth at gcc dot gnu dot
                   |dot org                     |org
             Status|NEW                         |ASSIGNED
   Last reconfirmed|2010-04-02 10:25:15         |2010-04-02 14:23:22
               date|                            |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43627



[Bug tree-optimization/43627] [4.5 Regression] slow compilation (tree canonical iv takes 75%)

2010-04-02 Thread rguenth at gcc dot gnu dot org


--- Comment #14 from rguenth at gcc dot gnu dot org  2010-04-02 14:26 ---
Interestingly it works on i?86 ...


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43627



[Bug tree-optimization/43627] [4.5 Regression] slow compilation (tree canonical iv takes 75%)

2010-04-02 Thread rguenth at gcc dot gnu dot org


--- Comment #15 from rguenth at gcc dot gnu dot org  2010-04-02 14:39 ---
C testcase for the missed VRP, fails with long on x86_64 only, with
long long also on i?86:

extern void link_error (void) __attribute__((noreturn));
int n;
float *x;
int main()
{
  if (n > 0)
    {
      int i = 0;
      do
        {
          long index;
          i = i + 1;
          index = i;
          if (index <= 0)
            link_error ();
          x[index] = 0;
          i = i + 1;
          index = i;
          if (index <= 0)
            link_error ();
          x[index] = 0;
        }
      while (i < n);
    }
}
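
The testcase relies on the usual testsuite idiom: link_error is declared but never defined, so the program only links if the compiler proves both calls unreachable and removes them. A runnable variant, as a sketch only, replaces link_error with a counting stub to show that the checks never fire at run time for n > 0, which is the fact VRP is expected to derive. The stub, the counter and the function name are hypothetical, and the sketch assumes n is small enough that i does not overflow.

```c
#include <assert.h>

static int checks_fired;   /* hypothetical counter, for illustration only */

/* Stand-in for link_error(): records the call instead of breaking the link. */
static void link_error_stub(void)
{
    checks_fired++;
}

/* Same loop shape as the testcase.  The caller must provide room for
   index n + 1, because the body advances i twice per iteration. */
int fill_checked(float *x, int n)
{
    int i = 0;

    assert(n > 0);
    do {
        long index;
        i = i + 1;
        index = i;          /* widening int -> long on x86_64 */
        if (index <= 0)
            link_error_stub();
        x[index] = 0;
        i = i + 1;
        index = i;
        if (index <= 0)
            link_error_stub();
        x[index] = 0;
    } while (i < n);
    return i;
}
```

Calling fill_checked on a buffer of eight floats with n = 5 writes indices 1 through 6 and leaves checks_fired at zero, since i starts at 0 and is only ever incremented.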


-- 

rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43627



[Bug tree-optimization/43627] [4.5 Regression] slow compilation (tree canonical iv takes 75%)

2010-04-02 Thread rguenth at gcc dot gnu dot org


--- Comment #16 from rguenth at gcc dot gnu dot org  2010-04-02 14:53 ---
It's the strict-overflow stuff that cripples VRP again here.  I have a kludge.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43627



[Bug tree-optimization/43627] [4.5 Regression] slow compilation (tree canonical iv takes 75%)

2010-04-02 Thread rguenth at gcc dot gnu dot org


--- Comment #17 from rguenth at gcc dot gnu dot org  2010-04-02 15:10 ---
Created an attachment (id=20292)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=20292&action=view)
minimal patch

I'm testing this minimal patch.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43627