[Bug target/33604] [4.3/4.4 Regression] Revision 119502 causes significantly slower results with 4.3/4.4 compared to 4.2
--
rguenth at gcc dot gnu dot org changed:

What            |Removed |Added
Target Milestone|4.3.4   |4.4.0

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33604
[Bug target/33604] [4.3/4.4 Regression] Revision 119502 causes significantly slower results with 4.3/4.4 compared to 4.2
--- Comment #48 from bonzini at gnu dot org 2009-02-01 08:14 ---
Fixed on the trunk with the original testcase:

4.2 -O2        0m13.897s
4.2 -O3        miscompiled
4.4 -O2/-O3    0m8.714s

--
bonzini at gnu dot org changed:

What      |Removed |Added
Status    |NEW     |RESOLVED
Resolution|        |FIXED

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33604
[Bug target/33604] [4.3/4.4 Regression] Revision 119502 causes significantly slower results with 4.3/4.4 compared to 4.2
--- Comment #47 from rguenth at gcc dot gnu dot org 2009-01-24 10:19 ---
GCC 4.3.3 is being released, adjusting target milestone.

--
rguenth at gcc dot gnu dot org changed:

What            |Removed |Added
Target Milestone|4.3.3   |4.3.4

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33604
[Bug target/33604] [4.3/4.4 Regression] Revision 119502 causes significantly slower results with 4.3/4.4 compared to 4.2
--- Comment #46 from bonzini at gnu dot org 2008-12-30 08:02 ---
What benchmark.cpp was that? And did you test -O2 or -O3? Thanks!

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33604
[Bug target/33604] [4.3/4.4 Regression] Revision 119502 causes significantly slower results with 4.3/4.4 compared to 4.2
--- Comment #45 from Joey dot ye at intel dot com 2008-12-30 01:49 ---
(In reply to comment #44)
> Does anyone have new numbers?

Fixed on both i386/x86_64:

x86_64:
4.4 (trunk 142847):  5.4s
4.3.2 release:       5.4s
4.2.4 release:       5.4s

i386:
4.4 (trunk 142847):  2.7s
4.3.2 release:       2.8s
4.2.4 release:       2.7s

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33604
[Bug target/33604] [4.3/4.4 Regression] Revision 119502 causes significantly slower results with 4.3/4.4 compared to 4.2
--- Comment #44 from pinskia at gcc dot gnu dot org 2008-12-25 18:13 ---
Does anyone have new numbers?

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33604
[Bug target/33604] [4.3/4.4 Regression] Revision 119502 causes significantly slower results with 4.3/4.4 compared to 4.2
--- Comment #43 from amonakov at gcc dot gnu dot org 2008-08-29 13:12 ---
Checking original testcase times on x86_64 prescott with gentoo 4.2, 4.3 and today's trunk:

2.960s  g++-4.2.4 (GCC) 4.2.4 (Gentoo 4.2.4 p1.0)
2.916s  g++-4.3.1 (Gentoo 4.3.1-r1 p1.1) 4.3.1
3.993s  g++ (GCC) 4.4.0 20080829 (experimental)
2.796s  g++ (GCC) 4.4.0 20080829 (experimental) with --param max-inline-insns-auto=126

So I believe lack of inlining is 4.4's biggest problem. We do not inline the 3x3 matrix multiplication in the benchmark loop.

While looking at it I found that the einline2 dump does not always show the reason for not inlining. I would like to propose the following patch:

--- a/gcc/ipa-inline.c
+++ b/gcc/ipa-inline.c
@@ -1494,6 +1494,8 @@ cgraph_decide_inlining_incrementally (struct cgraph_node *node,
             }
           if (cgraph_default_inline_p (e->callee, &failed_reason))
             inlined |= try_inline (e, mode, depth);
+          else if (dump_file)
+            fprintf (dump_file, "Not inlining: %s.\n", failed_reason);
         }
   node->aux = (void *)(size_t) old_mode;
   return inlined;

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33604
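As an illustration of the pattern comment #43 describes, here is a minimal sketch of a small 3x3 matrix multiplication called from a hot loop. It is not the testcase attached to this bug: `Mat3`, `identity`, `multiply`, and `power` are hypothetical names chosen for this example, and the loop body is only assumed to resemble the benchmark's.

```cpp
#include <cassert>

// Hypothetical 3x3 matrix type; the real benchmark's types are not shown in the thread.
struct Mat3 {
    double v[3][3];
};

// Returns the 3x3 identity matrix.
Mat3 identity() {
    Mat3 m{};  // value-initialization zeroes every element
    for (int i = 0; i < 3; ++i) {
        m.v[i][i] = 1.0;
    }
    return m;
}

// Small helper that an inliner would ideally fold into its caller.
Mat3 multiply(const Mat3& a, const Mat3& b) {
    Mat3 r{};
    for (int i = 0; i < 3; ++i) {
        for (int j = 0; j < 3; ++j) {
            for (int k = 0; k < 3; ++k) {
                r.v[i][j] += a.v[i][k] * b.v[k][j];
            }
        }
    }
    return r;
}

// Hot loop: if multiply() stays out of line, every iteration pays for a call
// and the optimizer cannot work across the call boundary.
Mat3 power(const Mat3& m, int n) {
    assert(n >= 0);
    Mat3 acc = identity();
    for (int i = 0; i < n; ++i) {
        acc = multiply(acc, m);
    }
    return acc;
}
```

Building a file like this with g++ at -O2, once with default settings and once with the `--param max-inline-insns-auto=126` value quoted in the comment, is one way to compare whether `multiply` ends up inlined into the loop. Whether it does will depend on the compiler version and target.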
[Bug target/33604] [4.3/4.4 Regression] Revision 119502 causes significantly slower results with 4.3/4.4 compared to 4.2
--- Comment #42 from jsm28 at gcc dot gnu dot org 2008-08-27 22:02 ---
4.3.2 is released, changing milestones to 4.3.3.

--
jsm28 at gcc dot gnu dot org changed:

What            |Removed |Added
Target Milestone|4.3.2   |4.3.3

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33604
[Bug target/33604] [4.3/4.4 Regression] Revision 119502 causes significantly slower results with 4.3/4.4 compared to 4.2
--
rguenth at gcc dot gnu dot org changed:

What            |Removed |Added
Known to fail   |        |4.3.0
Target Milestone|4.3.0   |4.3.1

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33604
[Bug target/33604] [4.3/4.4 Regression] Revision 119502 causes significantly slower results with 4.3/4.4 compared to 4.2
--- Comment #40 from rguenth at gcc dot gnu dot org 2008-03-02 14:00 ---
I think new analysis is necessary first -- what exactly is causing the speed difference?

--
rguenth at gcc dot gnu dot org changed:

What              |Removed |Added
GCC target triplet|        |i?86-*-* x86_64-*-*
Keywords          |        |missed-optimization

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33604
[Bug target/33604] [4.3/4.4 Regression] Revision 119502 causes significantly slower results with 4.3/4.4 compared to 4.2
--- Comment #39 from bonzini at gnu dot org 2008-03-02 12:26 ---
Subject: Re: [4.3/4.4 Regression] Revision 119502 causes significantly slower results with 4.3/4.4 compared to 4.2

> The problem still exists for the first two test cases.
> As I noted in comment #8 there is a significant speedup from -O2 to -O3 for
> g++-4.2 (18s -> 5s)
> With the current g++-4.3 there is no difference between -O2 and -O3 (both 14s)
> "-fforce-addr" which produced significant speedup does not exist anymore.

So maybe we need to restore part of -fforce-addr's behavior, but not the one that caused regressions.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33604
[Bug target/33604] [4.3/4.4 Regression] Revision 119502 causes significantly slower results with 4.3/4.4 compared to 4.2
--- Comment #38 from michael dot olbrich at gmx dot net 2008-03-02 12:14 ---
I tried again with

g++-4.2 (GCC) 4.2.3 (Debian 4.2.3-2)
g++-4.3 (Debian 4.3-20080227-1) 4.3.0 20080227 (prerelease) [gcc-4_3-branch revision 132730]

The problem still exists for the first two test cases.
As I noted in comment #8 there is a significant speedup from -O2 to -O3 for g++-4.2 (18s -> 5s)
With the current g++-4.3 there is no difference between -O2 and -O3 (both 14s)
"-fforce-addr" which produced significant speedup does not exist anymore.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33604
[Bug target/33604] [4.3/4.4 Regression] Revision 119502 causes significantly slower results with 4.3/4.4 compared to 4.2
--- Comment #37 from bonzini at gnu dot org 2008-02-27 17:05 ---
Subject: Re: [4.3/4.4 Regression] Revision 119502 causes significantly slower results with 4.3/4.4 compared to 4.2

jacob at math dot jussieu dot fr wrote:
> --- Comment #36 from jacob at math dot jussieu dot fr 2008-02-27 16:58 ---
> That's great; from the assembly code I take it that you are referring to the
> last benchmark.cpp; I was referring to the first one. Again, my 4.3 is one
> month old so maybe things have further improved since.

No, I doubt it. The last benchmark.cpp is now fully optimized, but we might be missing something.

Paolo

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33604
[Bug target/33604] [4.3/4.4 Regression] Revision 119502 causes significantly slower results with 4.3/4.4 compared to 4.2
--- Comment #36 from jacob at math dot jussieu dot fr 2008-02-27 16:58 ---
That's great; from the assembly code I take it that you are referring to the last benchmark.cpp; I was referring to the first one. Again, my 4.3 is one month old so maybe things have further improved since.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33604
[Bug target/33604] [4.3/4.4 Regression] Revision 119502 causes significantly slower results with 4.3/4.4 compared to 4.2
--- Comment #35 from pinskia at gcc dot gnu dot org 2008-02-27 16:43 ---
We get:

:
  m__valuem_I_lsm.28 = 1.0e+0 - m__valuem_I_lsm.28;
  ivtmp.30 = ivtmp.30 + 1;
  if (ivtmp.30 != 1) goto ; else goto ;

or:

L2:
        addl    $1, %eax
        movapd  %xmm1, %xmm2
        subsd   %xmm0, %xmm2
        cmpl    $1, %eax
        movapd  %xmm2, %xmm0
        jne     L2

or:

L2:
        addl    $1, %eax
        cmpl    $1, %eax
        fsub    %st, %st(1)
        jne     L2

All are fast.

--
pinskia at gcc dot gnu dot org changed:

What     |Removed           |Added
Component|tree-optimization |target

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33604
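For orientation, the loop body in comment #35 (`x = 1.0 - x` on each iteration, with a counter compared at the bottom) could come from source shaped roughly like the sketch below. This is an assumption, not the benchmark.cpp under discussion: `Toggle`, `value`, and `run` are hypothetical names. The reading that the `_lsm` suffix in the dump marks a member kept in a temporary across the loop is also an assumption.

```cpp
#include <cassert>

// Hypothetical object holding the value updated in the loop.
struct Toggle {
    double value;
};

// Repeatedly replaces value with 1.0 - value. A compiler may keep the member
// in a register for the whole loop and store it back once on exit, which
// would correspond to the single floating-point subtraction per iteration in
// the dumps quoted in the comment.
void run(Toggle& t, int iterations) {
    assert(iterations >= 0);
    for (int i = 0; i < iterations; ++i) {
        t.value = 1.0 - t.value;
    }
}
```

The iteration count and exit condition in the quoted dump are not reproduced here; only the shape of the loop body is.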