[Bug tree-optimization/33761] non-optimal inlining heuristics pessimizes gzip SPEC score at -O3

2008-09-06 Thread hubicka at gcc dot gnu dot org
--- Comment #26 from hubicka at gcc dot gnu dot org 2008-09-06 12:00 --- IRA seems to fix the remaining problem with spill in internal loop on 32bit nicely, so we produce good scores for gzip compared to older GCC versions.

[Bug tree-optimization/33761] non-optimal inlining heuristics pessimizes gzip SPEC score at -O3

2008-09-06 Thread hubicka at gcc dot gnu dot org
--- Comment #27 from hubicka at gcc dot gnu dot org 2008-09-06 12:02 --- Also just noticed that the offline copy of longest_match gets an extra move:
.L15:
        movzbl  2(%eax), %edi   #, tmp87
        leal    2(%eax), %ecx   #, scan.158
        movl    %edi, %edx      # tmp87,

[Bug tree-optimization/33761] non-optimal inlining heuristics pessimizes gzip SPEC score at -O3

2008-02-08 Thread hubicka at gcc dot gnu dot org
--- Comment #25 from hubicka at gcc dot gnu dot org 2008-02-08 15:39 --- -fno-tree-dominator-opts -fno-tree-copyrename solves the coalescing problem (the name is introduced by the second pass, the actual problematic pattern by the first), saving roughly 1s at both -O2 and 2s at -O3, -O3 is still

[Bug tree-optimization/33761] non-optimal inlining heuristics pessimizes gzip SPEC score at -O3

2008-02-08 Thread hubicka at gcc dot gnu dot org
--- Comment #24 from hubicka at gcc dot gnu dot org 2008-02-08 15:11 --- Hi, tonight's runs with the continue heuristics again show improvements on 64bit scores, but degradation on 32bit scores. Looking into the loop, the real trouble seems to be that the main loop has 6 loop carried

[Bug tree-optimization/33761] non-optimal inlining heuristics pessimizes gzip SPEC score at -O3

2008-02-07 Thread hubicka at gcc dot gnu dot org
--- Comment #23 from hubicka at gcc dot gnu dot org 2008-02-07 12:30 --- Created an attachment (id=15115) -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=15115&action=view) Annotated profile I am attaching dump with profile read in. It shows the hot spots in longest_match at least:

[Bug tree-optimization/33761] non-optimal inlining heuristics pessimizes gzip SPEC score at -O3

2008-02-06 Thread hubicka at gcc dot gnu dot org
--- Comment #17 from hubicka at gcc dot gnu dot org 2008-02-06 13:28 --- One problem is the following:
do {
    ;
    match = window + cur_match;
    if (match[best_len] != scan_end ||
        match[best_len-1] != scan_end1 ||
        *match != *scan ||
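
For readers without the gzip source at hand, here is a minimal sketch of a loop with the same shape as the one quoted above. It is not the gzip code: the function name longest_match_sketch, the candidates[] array (standing in for however the real code picks the next cur_match), the max_len parameter, the starting best_len of 2 and the skipped counter are all assumptions made for illustration. Only the three comparisons visible in the quote are used, since the quote is cut off before the condition ends. The point is to show how many values stay live across iterations (best_len, scan_end, scan_end1, scan, window, the candidate index) and that the rejecting path ends in a continue.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a longest-match search.
   candidates[] lists window offsets to try, in order; this replaces
   the way the real code finds the next cur_match (an assumption).
   The caller must ensure window holds at least max_len bytes after
   scan_pos and after every candidate offset. */
static size_t longest_match_sketch(const unsigned char *window,
                                   size_t scan_pos, size_t max_len,
                                   const size_t *candidates,
                                   size_t n_candidates,
                                   size_t *skipped)
{
    const unsigned char *scan = window + scan_pos;
    size_t best_len = 2;                 /* assumed starting length to beat */
    unsigned char scan_end1 = scan[best_len - 1];
    unsigned char scan_end = scan[best_len];
    size_t i = 0;

    assert(max_len > best_len);
    *skipped = 0;
    if (n_candidates == 0)
        return best_len;

    do {
        const unsigned char *match = window + candidates[i];

        /* Cheap rejection test with the three comparisons visible in
           the quoted loop: a candidate that cannot beat best_len is
           dropped before the byte-by-byte comparison. */
        if (match[best_len] != scan_end ||
            match[best_len - 1] != scan_end1 ||
            *match != *scan) {
            ++*skipped;
            continue;                    /* jumps to the loop condition */
        }

        /* First byte is known to match; extend the match. */
        size_t len = 1;
        while (len < max_len && match[len] == scan[len])
            ++len;

        if (len > best_len) {
            best_len = len;
            if (len >= max_len)
                break;                   /* cannot do better */
            /* Loop-carried values updated for the next iteration. */
            scan_end1 = scan[best_len - 1];
            scan_end = scan[best_len];
        }
    } while (++i < n_candidates);

    return best_len;
}
```

In a sketch like this, every candidate that fails the cheap test takes the continue path, so how the compiler estimates the frequency of that path affects which of the loop-carried values it prefers to keep in registers.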

[Bug tree-optimization/33761] non-optimal inlining heuristics pessimizes gzip SPEC score at -O3

2008-02-06 Thread hubicka at gcc dot gnu dot org
--- Comment #18 from hubicka at gcc dot gnu dot org 2008-02-06 16:44 --- Created an attachment (id=15107) -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=15107&action=view) Path to predict_paths_leading_to Hi, I've revived the continue heuristic patch. By itself it does not help

[Bug tree-optimization/33761] non-optimal inlining heuristics pessimizes gzip SPEC score at -O3

2008-02-06 Thread hubicka at gcc dot gnu dot org
--- Comment #19 from hubicka at gcc dot gnu dot org 2008-02-06 16:56 --- Created an attachment (id=15108) -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=15108&action=view) Complete continue heuristic patch Hi, this is the complete patch. With this patch we produce profile sane

[Bug tree-optimization/33761] non-optimal inlining heuristics pessimizes gzip SPEC score at -O3

2008-02-06 Thread ubizjak at gmail dot com
--- Comment #20 from ubizjak at gmail dot com 2008-02-06 18:42 --- Whoa, adding -fomit-frame-pointer brings us from (gcc -O3 -m32) user 0m41.031s to (gcc -O3 -m32 -fomit-frame-pointer) user 0m30.006s. Since -fo-f-p adds another free reg, it looks that since inlining increases

[Bug tree-optimization/33761] non-optimal inlining heuristics pessimizes gzip SPEC score at -O3

2008-02-06 Thread ubizjak at gmail dot com
--- Comment #21 from ubizjak at gmail dot com 2008-02-06 19:10 --- (In reply to comment #20) Since -fo-f-p adds another free reg, it looks that since inlining increases register pressure some unlucky heavy-used variable gets allocated to the stack slot. It is best_len (and

[Bug tree-optimization/33761] non-optimal inlining heuristics pessimizes gzip SPEC score at -O3

2008-02-06 Thread hubicka at gcc dot gnu dot org
--- Comment #22 from hubicka at gcc dot gnu dot org 2008-02-06 19:22 --- Yes, there are a number of unlucky variables. However, the real source here seems to be an always wrong profile guiding regalloc to optimize for cold portions of the function rather than real increase of register

[Bug tree-optimization/33761] non-optimal inlining heuristics pessimizes gzip SPEC score at -O3

2008-02-05 Thread hubicka at gcc dot gnu dot org
--- Comment #15 from hubicka at gcc dot gnu dot org 2008-02-05 13:36 --- Thanks, looks comparable to K8 scores, except that -O3 is not actually that worse there. So it looks there is more than just random effect of code layout involved, I will try to look into the assembly produced

[Bug tree-optimization/33761] non-optimal inlining heuristics pessimizes gzip SPEC score at -O3

2008-02-05 Thread hubicka at gcc dot gnu dot org
--- Comment #16 from hubicka at gcc dot gnu dot org 2008-02-05 13:55 --- Thanks, looks comparable to K8 scores, except that -O3 is not actually that worse there. So it looks there is more than just random effect of code layout involved, I will try to look into the assembly produced

[Bug tree-optimization/33761] non-optimal inlining heuristics pessimizes gzip SPEC score at -O3

2008-02-03 Thread hubicka at gcc dot gnu dot org
--- Comment #13 from hubicka at gcc dot gnu dot org 2008-02-03 13:39 --- Tonight's runs on haydn with the patch in show a regression on gzip: 950-901 in 32bit. FDO 64bit runs are not affected. This is the same score as we had in December, we improved a bit since then but not enough to match

[Bug tree-optimization/33761] non-optimal inlining heuristics pessimizes gzip SPEC score at -O3

2008-02-03 Thread ubizjak at gmail dot com
--- Comment #14 from ubizjak at gmail dot com 2008-02-03 17:35 --- (In reply to comment #13) Uros, would it be possible to give it a try on Core? That would help to figure out if it is a code layout problem of K8. Hm, the patch doesn't seem to help: -m32 -O2: 32.434 -m32 -O2 (patched):

[Bug tree-optimization/33761] non-optimal inlining heuristics pessimizes gzip SPEC score at -O3

2008-02-02 Thread hubicka at gcc dot gnu dot org
--- Comment #12 from hubicka at gcc dot gnu dot org 2008-02-02 16:22 --- Created an attachment (id=15079) -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=15079&action=view) address accumulation patch While working on PR17863 I wrote the attached patch to make fwprop to combine code

[Bug tree-optimization/33761] non-optimal inlining heuristics pessimizes gzip SPEC score at -O3

2008-01-16 Thread hubicka at gcc dot gnu dot org
--- Comment #11 from hubicka at gcc dot gnu dot org 2008-01-16 16:46 --- Last time I looked into it, it was code alignment affected by inlining in the string matching loop (longest_match). This code is very atypical, since the internal loop

[Bug tree-optimization/33761] non-optimal inlining heuristics pessimizes gzip SPEC score at -O3

2007-12-10 Thread rguenth at gcc dot gnu dot org
--- Comment #3 from rguenth at gcc dot gnu dot org 2007-12-10 10:52 --- I don't think this qualifies as a 4.3 regression - http://www.suse.de/~gcctest/SPEC/CINT/sb-haydn-head-64-32o-32bit/index.html shows that while there were jumps, the numbers close to the 4.2 release are actually

[Bug tree-optimization/33761] non-optimal inlining heuristics pessimizes gzip SPEC score at -O3

2007-12-10 Thread ubizjak at gmail dot com
--- Comment #4 from ubizjak at gmail dot com 2007-12-10 12:31 --- (In reply to comment #3) I don't think this qualifies as a 4.3 regression - Fair enough. It looks that this problem is specific to Core2. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33761

[Bug tree-optimization/33761] non-optimal inlining heuristics pessimizes gzip SPEC score at -O3

2007-12-10 Thread ubizjak at gmail dot com
--- Comment #5 from ubizjak at gmail dot com 2007-12-10 17:12 --- (In reply to comment #4) Fair enough. It looks that this problem is specific to Core2. Here are timings with 'gcc version 4.3.0 20071201 (experimental) [trunk revision 130554] (GCC)' on vendor_id : GenuineIntel

[Bug tree-optimization/33761] non-optimal inlining heuristics pessimizes gzip SPEC score at -O3

2007-12-10 Thread rguenther at suse dot de
--- Comment #6 from rguenther at suse dot de 2007-12-10 17:13 --- Subject: Re: non-optimal inlining heuristics pessimizes gzip SPEC score at -O3 On Mon, 10 Dec 2007, ubizjak at gmail dot com wrote: (In reply to comment #4) Fair enough. It looks that this problem is specific to

[Bug tree-optimization/33761] non-optimal inlining heuristics pessimizes gzip SPEC score at -O3

2007-12-10 Thread ubizjak at gmail dot com
--- Comment #7 from ubizjak at gmail dot com 2007-12-10 17:26 --- (In reply to comment #6) FSF GCC 4.1 does not have -mtune=generic. OK, OK. Now with 'gcc version 4.1.3 20070716 (prerelease)': -m32 -O2: 29.306s -m32 -O3: 29.582s I don't have 4.2 here. --