[Bug target/33604] [4.3/4.4 Regression] Revision 119502 causes significantly slower results with 4.3/4.4 compared to 4.2

2009-02-01 Thread rguenth at gcc dot gnu dot org


-- 

rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|4.3.4                       |4.4.0


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33604



[Bug target/33604] [4.3/4.4 Regression] Revision 119502 causes significantly slower results with 4.3/4.4 compared to 4.2

2009-02-01 Thread bonzini at gnu dot org


--- Comment #48 from bonzini at gnu dot org  2009-02-01 08:14 ---
Fixed on the trunk with the original testcase:

4.2 -O2   0m13.897s
4.2 -O3   miscompiled
4.4 -O2/-O3   0m8.714s


-- 

bonzini at gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33604



[Bug target/33604] [4.3/4.4 Regression] Revision 119502 causes significantly slower results with 4.3/4.4 compared to 4.2

2009-01-24 Thread rguenth at gcc dot gnu dot org


--- Comment #47 from rguenth at gcc dot gnu dot org  2009-01-24 10:19 ---
GCC 4.3.3 is being released, adjusting target milestone.


-- 

rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|4.3.3                       |4.3.4


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33604



[Bug target/33604] [4.3/4.4 Regression] Revision 119502 causes significantly slower results with 4.3/4.4 compared to 4.2

2008-12-30 Thread bonzini at gnu dot org


--- Comment #46 from bonzini at gnu dot org  2008-12-30 08:02 ---
Which benchmark.cpp was that? And did you test -O2 or -O3?

Thanks!


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33604



[Bug target/33604] [4.3/4.4 Regression] Revision 119502 causes significantly slower results with 4.3/4.4 compared to 4.2

2008-12-29 Thread Joey dot ye at intel dot com


--- Comment #45 from Joey dot ye at intel dot com  2008-12-30 01:49 ---
(In reply to comment #44)
> Does anyone have new numbers?
Fixed on both i386/x86_64:
x86_64:
4.4 (trunk 142847): 5.4s
4.3.2 release:  5.4s
4.2.4 release:  5.4s

i386:
4.4 (trunk 142847): 2.7s
4.3.2 release:  2.8s
4.2.4 release:  2.7s


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33604



[Bug target/33604] [4.3/4.4 Regression] Revision 119502 causes significantly slower results with 4.3/4.4 compared to 4.2

2008-12-25 Thread pinskia at gcc dot gnu dot org


--- Comment #44 from pinskia at gcc dot gnu dot org  2008-12-25 18:13 ---
Does anyone have new numbers?


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33604



[Bug target/33604] [4.3/4.4 Regression] Revision 119502 causes significantly slower results with 4.3/4.4 compared to 4.2

2008-08-29 Thread amonakov at gcc dot gnu dot org


--- Comment #43 from amonakov at gcc dot gnu dot org  2008-08-29 13:12 ---
Checking original testcase times on an x86_64 Prescott with Gentoo's 4.2 and
4.3 and today's trunk:
2.960s  g++-4.2.4 (GCC) 4.2.4 (Gentoo 4.2.4 p1.0)
2.916s  g++-4.3.1 (Gentoo 4.3.1-r1 p1.1) 4.3.1
3.993s  g++ (GCC) 4.4.0 20080829 (experimental)
2.796s  g++ (GCC) 4.4.0 20080829 (experimental) with --param max-inline-insns-auto=126

So I believe the lack of inlining is 4.4's biggest problem. We do not inline
the 3x3 matrix multiplication in the benchmark loop.
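
As an illustration of the kind of code involved, here is a minimal sketch of a small 3x3 multiply called from a hot loop. The names `Mat3`, `multiply` and `power` are hypothetical and are not taken from the attached testcase; the function is deliberately not declared `inline`, so that it is a candidate for automatic inlining only.

```cpp
#include <cassert>  // only needed if you assert on the results
#include <cstddef>

// Hypothetical stand-in for the testcase's matrix type (an assumption,
// not the code attached to the bug report).
struct Mat3 {
    double m[3][3];
};

// A small function that is NOT declared inline. Whether the compiler
// inlines it automatically depends on its size estimate and on limits
// such as --param max-inline-insns-auto.
Mat3 multiply(const Mat3& a, const Mat3& b) {
    Mat3 r = {};  // zero-initialise all nine entries
    for (std::size_t i = 0; i < 3; ++i)
        for (std::size_t j = 0; j < 3; ++j)
            for (std::size_t k = 0; k < 3; ++k)
                r.m[i][j] += a.m[i][k] * b.m[k][j];
    return r;
}

// Benchmark-style loop: multiplies the accumulator by `base` n times,
// starting from the identity. If multiply() is not inlined here, each
// iteration pays for a call and for passing the matrices through memory.
Mat3 power(const Mat3& base, unsigned n) {
    Mat3 acc = {{{1.0, 0.0, 0.0}, {0.0, 1.0, 0.0}, {0.0, 0.0, 1.0}}};
    for (unsigned i = 0; i < n; ++i)
        acc = multiply(acc, base);
    return acc;
}
```

Building such a file at -O2 with and without `--param max-inline-insns-auto=126` and comparing the `-fdump-ipa-inline` output or the assembly should show whether `multiply` ends up inside the loop; the outcome depends on the compiler version.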

While looking at it I found that einline2 dump does not always show the reason
for not inlining. I would like to propose the following patch:

--- a/gcc/ipa-inline.c
+++ b/gcc/ipa-inline.c
@@ -1494,6 +1494,8 @@ cgraph_decide_inlining_incrementally (struct cgraph_node *node,
             }
           if (cgraph_default_inline_p (e->callee, &failed_reason))
             inlined |= try_inline (e, mode, depth);
+          else if (dump_file)
+            fprintf (dump_file, "Not inlining: %s.\n", failed_reason);
         }
   node->aux = (void *)(size_t) old_mode;
   return inlined;
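
The pattern in the patch (a check that reports why it failed through an out-parameter, and a caller that logs that reason only when a dump file is open) can be sketched in isolation. This is a simplified illustration, not GCC code: `size_allows_inline`, `decide_inline`, the instruction counts and the message text are all hypothetical.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical, much simplified stand-in for an inlining size check.
// On rejection it stores a human-readable reason in *failed_reason.
bool size_allows_inline(int insns, int limit, const char** failed_reason) {
    if (insns > limit) {
        if (failed_reason)
            *failed_reason = "function body too large for automatic inlining";
        return false;
    }
    return true;
}

// Returns true if the callee is accepted. When it is rejected and a dump
// stream is available, the reason is written there, so the dump always
// explains why inlining did not happen.
bool decide_inline(int insns, int limit, std::FILE* dump) {
    const char* failed_reason = nullptr;
    if (size_allows_inline(insns, limit, &failed_reason))
        return true;
    assert(failed_reason != nullptr);  // every rejection must carry a reason
    if (dump)
        std::fprintf(dump, "Not inlining: %s.\n", failed_reason);
    return false;
}
```

Calling `decide_inline(200, 126, stderr)` in this sketch prints the rejection reason, while passing a null stream keeps the decision silent, which mirrors the `dump_file` test in the patch.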


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33604



[Bug target/33604] [4.3/4.4 Regression] Revision 119502 causes significantly slower results with 4.3/4.4 compared to 4.2

2008-08-27 Thread jsm28 at gcc dot gnu dot org


--- Comment #42 from jsm28 at gcc dot gnu dot org  2008-08-27 22:02 ---
4.3.2 is released, changing milestones to 4.3.3.


-- 

jsm28 at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|4.3.2                       |4.3.3


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33604



[Bug target/33604] [4.3/4.4 Regression] Revision 119502 causes significantly slower results with 4.3/4.4 compared to 4.2

2008-03-14 Thread rguenth at gcc dot gnu dot org


-- 

rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
      Known to fail|                            |4.3.0
   Target Milestone|4.3.0                       |4.3.1


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33604



[Bug target/33604] [4.3/4.4 Regression] Revision 119502 causes significantly slower results with 4.3/4.4 compared to 4.2

2008-03-02 Thread rguenth at gcc dot gnu dot org


--- Comment #40 from rguenth at gcc dot gnu dot org  2008-03-02 14:00 ---
I think new analysis is necessary first: what exactly is causing the speed
difference?


-- 

rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
 GCC target triplet|                            |i?86-*-* x86_64-*-*
           Keywords|                            |missed-optimization


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33604



[Bug target/33604] [4.3/4.4 Regression] Revision 119502 causes significantly slower results with 4.3/4.4 compared to 4.2

2008-03-02 Thread bonzini at gnu dot org


--- Comment #39 from bonzini at gnu dot org  2008-03-02 12:26 ---
Subject: Re:  [4.3/4.4 Regression] Revision 119502 causes
 significantly slower results with 4.3/4.4 compared to 4.2


> The problem still exists for the first two test cases.
> As I noted in comment #8 there is a significant speedup from -O2 to -O3 for
> g++-4.2 (18s -> 5s)
> With the current g++-4.3 there is no difference between -O2 and -O3 (both 14s)
> "-fforce-addr" which produced significant speedup does not exist anymore.

So maybe we need to restore part of -fforce-addr's behavior, but not the 
one that caused regressions.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33604



[Bug target/33604] [4.3/4.4 Regression] Revision 119502 causes significantly slower results with 4.3/4.4 compared to 4.2

2008-03-02 Thread michael dot olbrich at gmx dot net


--- Comment #38 from michael dot olbrich at gmx dot net  2008-03-02 12:14 ---
I tried again with:
g++-4.2 (GCC) 4.2.3 (Debian 4.2.3-2)
g++-4.3 (Debian 4.3-20080227-1) 4.3.0 20080227 (prerelease) [gcc-4_3-branch revision 132730]

The problem still exists for the first two test cases.
As I noted in comment #8 there is a significant speedup from -O2 to -O3 for
g++-4.2 (18s -> 5s)
With the current g++-4.3 there is no difference between -O2 and -O3 (both 14s)
"-fforce-addr" which produced significant speedup does not exist anymore.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33604



[Bug target/33604] [4.3/4.4 Regression] Revision 119502 causes significantly slower results with 4.3/4.4 compared to 4.2

2008-02-27 Thread bonzini at gnu dot org


--- Comment #37 from bonzini at gnu dot org  2008-02-27 17:05 ---
Subject: Re:  [4.3/4.4 Regression] Revision 119502 causes
 significantly slower results with 4.3/4.4 compared to 4.2

jacob at math dot jussieu dot fr wrote:
> --- Comment #36 from jacob at math dot jussieu dot fr  2008-02-27 16:58 ---
> That's great; from the assembly code I take it that you are referring to the
> last benchmark.cpp; I was referring to the first one. Again, my 4.3 is one
> month old so maybe things have further improved since.

No, I doubt it.  The last benchmark.cpp is now fully optimized, but we 
might be missing something.

Paolo


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33604



[Bug target/33604] [4.3/4.4 Regression] Revision 119502 causes significantly slower results with 4.3/4.4 compared to 4.2

2008-02-27 Thread jacob at math dot jussieu dot fr


--- Comment #36 from jacob at math dot jussieu dot fr  2008-02-27 16:58 ---
That's great; from the assembly code I take it that you are referring to the
last benchmark.cpp; I was referring to the first one. Again, my 4.3 is one
month old so maybe things have further improved since.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33604



[Bug target/33604] [4.3/4.4 Regression] Revision 119502 causes significantly slower results with 4.3/4.4 compared to 4.2

2008-02-27 Thread pinskia at gcc dot gnu dot org


--- Comment #35 from pinskia at gcc dot gnu dot org  2008-02-27 16:43 ---
We get:
:
  m__valuem_I_lsm.28 = 1.0e+0 - m__valuem_I_lsm.28;
  ivtmp.30 = ivtmp.30 + 1;
  if (ivtmp.30 != 1)
goto ;
  else
goto ;

or:
L2:
        addl    $1, %eax
        movapd  %xmm1, %xmm2
        subsd   %xmm0, %xmm2
        cmpl    $1, %eax
        movapd  %xmm2, %xmm0
        jne     L2

or:
L2:
        addl    $1, %eax
        cmpl    $1, %eax
        fsub    %st, %st(1)
        jne     L2

All are fast.
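
For readers without the testcase at hand, the loop body in the dump above amounts to `x = 1.0 - x` plus a counter update. A source loop of roughly the following shape could reduce to that once the wrapper is inlined and the member is kept in a register. This is only a sketch: `Value`, `m_value` and `flip` are hypothetical names, not the contents of benchmark.cpp.

```cpp
#include <cassert>  // only needed if you assert on the results

// Hypothetical wrapper around a double (an assumption for illustration).
struct Value {
    double m_value;
    explicit Value(double v) : m_value(v) {}
    // Subtraction on the wrapper; trivially inlinable.
    Value operator-(const Value& rhs) const {
        return Value(m_value - rhs.m_value);
    }
};

// Replaces v with (1.0 - v) on every iteration and returns the final value.
// With the operator inlined and m_value held in a register, each iteration
// is one floating-point subtract, one increment and one compare.
double flip(double start, unsigned iterations) {
    const Value one(1.0);
    Value v(start);
    for (unsigned i = 0; i < iterations; ++i)
        v = one - v;
    return v.m_value;
}
```

Two flips cancel each other for values where the subtraction is exact, so `flip(0.25, 2)` returns the starting value; that makes the sketch easy to sanity-check before looking at the generated code.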


-- 

pinskia at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|tree-optimization           |target


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33604