[Bug middle-end/39976] [4.5 Regression] Big sixtrack degradation on powerpc 32/64 after revision r146817
--- Comment #28 from rguenth at gcc dot gnu dot org 2010-04-06 11:20 ---
GCC 4.5.0 is being released. Deferring to 4.5.1.

--
rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|4.5.0                       |4.5.1

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39976
[Bug middle-end/39976] [4.5 Regression] Big sixtrack degradation on powerpc 32/64 after revision r146817
--
rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|WAITING                     |NEW
   Last reconfirmed|2010-03-29 17:28:22         |2010-03-31 11:56:08
               date|                            |

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39976
[Bug middle-end/39976] [4.5 Regression] Big sixtrack degradation on powerpc 32/64 after revision r146817
--- Comment #26 from rguenth at gcc dot gnu dot org 2010-03-28 15:50 ---
ppc folks, can you re-confirm this bug again? There have been some register allocation changes meanwhile.

--
rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |WAITING

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39976
[Bug middle-end/39976] [4.5 Regression] Big sixtrack degradation on powerpc 32/64 after revision r146817
--- Comment #24 from rguenth at gcc dot gnu dot org 2009-11-30 13:14 --- ppc folks, can you re-confirm this bug? -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39976
[Bug middle-end/39976] [4.5 Regression] Big sixtrack degradation on powerpc 32/64 after revision r146817
--- Comment #25 from pthaugen at gcc dot gnu dot org 2009-11-30 21:29 --- I am still seeing the 2-block loop using revision 154838, both 32 and 64 bit, compile options -O3 -mcpu=power6 -funroll-loops. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39976
[Bug middle-end/39976] [4.5 Regression] Big sixtrack degradation on powerpc 32/64 after revision r146817
--- Comment #23 from pthaugen at gcc dot gnu dot org 2009-07-23 23:48 --- I opened a new bugzilla, 40482, for the Load-hit-store RA issue discussed in comments 17-20 since that's a separate problem. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39976
[Bug middle-end/39976] [4.5 Regression] Big sixtrack degradation on powerpc 32/64 after revision r146817
--- Comment #22 from pthaugen at gcc dot gnu dot org 2009-07-14 21:15 ---
The original problem, multi-block loop preventing movement of loads, was reintroduced with revision 149206, Jan's CD-DCE patch to remove empty loops.

--
pthaugen at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jh at suse dot cz

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39976
[Bug middle-end/39976] [4.5 Regression] Big sixtrack degradation on powerpc 32/64 after revision r146817
--- Comment #21 from matz at gcc dot gnu dot org 2009-07-09 10:43 ---
I fear this is no expand-from-SSA problem anymore, but rather an IRA problem. Unassigning and CCing Vlad.

--
matz at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |vmakarov at gcc dot gnu dot
                   |                            |org
         AssignedTo|matz at gcc dot gnu dot org |unassigned at gcc dot gnu
                   |                            |dot org
             Status|ASSIGNED                    |NEW

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39976
[Bug middle-end/39976] [4.5 Regression] Big sixtrack degradation on powerpc 32/64 after revision r146817
--- Comment #19 from luisgpm at linux dot vnet dot ibm dot com 2009-06-03 03:01 --- A little bit of information about the problem. On 32-bit code, the loads are being pushed up, from a different BB to the BB where we have the stfd instruction, during global scheduling. I suspect the 64-bit case is the same, with small variations. I'll update with more details soon. Luis -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39976
[Bug middle-end/39976] [4.5 Regression] Big sixtrack degradation on powerpc 32/64 after revision r146817
--
rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
           Priority|P3                          |P2

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39976
[Bug middle-end/39976] [4.5 Regression] Big sixtrack degradation on powerpc 32/64 after revision r146817
--
pthaugen at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |pthaugen at gcc dot gnu dot
                   |                            |org
           Priority|P2                          |P3

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39976
[Bug middle-end/39976] [4.5 Regression] Big sixtrack degradation on powerpc 32/64 after revision r146817
--- Comment #17 from luisgpm at linux dot vnet dot ibm dot com 2009-05-15 02:16 ---
Actually, 64-bit is affected too, but not with the power6x tuning I was using. With -mcpu=power6 I can reproduce the problem.

The problem seems to be a couple of load instructions that are being pushed up from a different basic block, and this results in a Load-hit-store hazard, since we've pushed the load too close to a store to the same address.

I'm not sure this is a direct consequence of changes in the gimple code. Would you know, Matz?

I'll continue digging...

Luis

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39976
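A rough way to spot candidate Load-hit-store pairs in a listing like the ones posted in this bug is to look for a load that follows a store to the same `displacement(base)` operand within a few instructions. The sketch below is only an illustration: the function name `find_load_after_store`, the opcode sets, and the `window` parameter are assumptions made for this example, not part of GCC or any existing tool. It compares operands textually, so it ignores partially overlapping accesses and any change to the base register in between.

```python
import re

# Hypothetical opcode sets for this sketch; extend as needed.
STORE_OPS = {"stfd", "std", "stw"}
LOAD_OPS = {"lfd", "ld", "lwz"}

# Matches e.g. "stfd 9,472(1)" -> opcode, displacement, base register.
MEM_RE = re.compile(r'^(\w+)\s+\d+,(-?\d+)\((\d+)\)$')


def find_load_after_store(asm_lines, window=4):
    """Return (store, load) pairs where a load reads the same
    displacement(base) operand at most `window` lines after a store."""
    recent_stores = []  # (line index, displacement, base register)
    hits = []
    for i, raw in enumerate(asm_lines):
        m = MEM_RE.match(raw.strip())
        if not m:
            continue  # labels, register moves, arithmetic, ...
        op, disp, base = m.group(1), int(m.group(2)), m.group(3)
        if op in STORE_OPS:
            recent_stores.append((i, disp, base))
        elif op in LOAD_OPS:
            for j, sdisp, sbase in recent_stores:
                if i - j <= window and (sdisp, sbase) == (disp, base):
                    hits.append((asm_lines[j].strip(), raw.strip()))
    return hits


if __name__ == "__main__":
    tail = ["stfd 9,472(1)", "mr 7,8", "lwz 3,472(1)", "lwz 4,476(1)"]
    for store, load in find_load_after_store(tail):
        print(store, "->", load)
```

Run on the tail of the 32-bit loop from comment #1, this flags the `stfd 9,472(1)` / `lwz 3,472(1)` pair; the second `lwz` reads the other half of the stored double and is missed by the exact-match comparison.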
[Bug middle-end/39976] [4.5 Regression] Big sixtrack degradation on powerpc 32/64 after revision r146817
--- Comment #18 from luisgpm at linux dot vnet dot ibm dot com 2009-05-15 02:19 ---
64-bit with -mcpu=power6

.L93:
        fmul 20,11,13
        fmul 19,11,0
        addis 12,11,0xffe5
        lfd 3,0(11)
        addi 5,11,8
        lfd 2,9472(12)
        addis 14,5,0xffe5
        fmadd 1,12,0,20
        fmsub 12,12,13,19
        lfd 20,9472(14)
        lfd 19,8(11)
        addi 11,5,8
        fmul 11,1,13
        fmul 21,1,2
        fmul 22,1,3
        fmul 8,1,0
        fmadd 11,12,0,11
        fmadd 5,12,3,21
        fmsub 4,12,2,22
        fmsub 12,12,13,8
        fmul 1,11,19
        fmul 22,11,20
        fadd 9,9,5
        fadd 10,10,4
        fmsub 21,12,20,1
        fmadd 2,12,19,22
        fadd 10,10,21
        fadd 9,9,2
        bdnz .L93
.L265:
        stfd 12,736(1)  ---
        mr 11,3           |
        ld 5,736(1)     --

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39976
[Bug middle-end/39976] [4.5 Regression] Big sixtrack degradation on powerpc 32/64 after revision r146817
--- Comment #14 from matz at gcc dot gnu dot org 2009-05-13 18:16 --- http://gcc.gnu.org/ml/gcc-patches/2009-05/msg00753.html should fix it. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39976
[Bug middle-end/39976] [4.5 Regression] Big sixtrack degradation on powerpc 32/64 after revision r146817
--- Comment #15 from matz at gcc dot gnu dot org 2009-05-13 20:14 ---
Subject: Bug 39976

Author: matz
Date: Wed May 13 20:14:44 2009
New Revision: 147494

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=147494
Log:
        PR middle-end/39976
        * tree-outof-ssa.c (maybe_renumber_stmts_bb): New function.
        (trivially_conflicts_p): New function.
        (insert_backedge_copies): Use it.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/tree-outof-ssa.c

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39976
[Bug middle-end/39976] [4.5 Regression] Big sixtrack degradation on powerpc 32/64 after revision r146817
--- Comment #16 from luisgpm at linux dot vnet dot ibm dot com 2009-05-14 04:12 --- Just for the record... The 64-bit case is fixed. There are still performance issues in the 32-bit case. I'll attach more information soon. Luis -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39976
[Bug middle-end/39976] [4.5 Regression] Big sixtrack degradation on powerpc 32/64 after revision r146817
--- Comment #11 from luisgpm at linux dot vnet dot ibm dot com 2009-05-12 12:55 --- Any updates? -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39976
[Bug middle-end/39976] [4.5 Regression] Big sixtrack degradation on powerpc 32/64 after revision r146817
--- Comment #12 from matz at gcc dot gnu dot org 2009-05-12 13:37 ---
The problem is that for PHI node expansion something has to be inserted on the backedge of a single BB loop, splitting it into two BBs (where one just contains one instruction). Something in the RTL passes then moves stuff into that first BB, which somehow limits the scheduler.

Earlier compilers contained a hack for this (because swing modulo scheduling can only deal with single BB loops), which I removed as part of SSA expand, see PR34263 and PR19038. I'm working on a fix: an alternative solution that places the copy somewhere more intelligently than now.

--
matz at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         AssignedTo|unassigned at gcc dot gnu   |matz at gcc dot gnu dot org
                   |dot org                     |
             Status|UNCONFIRMED                 |ASSIGNED
     Ever Confirmed|0                           |1
   Last reconfirmed|0000-00-00 00:00:00         |2009-05-12 13:37:22
               date|                            |

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39976
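The effect described in comment #12 can be pictured with a toy control-flow graph: once a copy has to live on the backedge of a single-block loop, that edge must be split, and the loop then spans two blocks. The sketch below is a simplified model only; the dictionary layout, the block names, and the helpers `split_edge` and `loop_body` are made up for illustration and do not reflect how tree-outof-ssa.c represents or transforms the CFG.

```python
def split_edge(cfg, src, dst, new_name, insns):
    """Insert a new block holding `insns` on the edge src -> dst."""
    cfg[src]["succs"] = [new_name if s == dst else s
                         for s in cfg[src]["succs"]]
    cfg[new_name] = {"insns": list(insns), "succs": [dst]}


def reachable(cfg, start):
    """All blocks reachable from `start` via one or more edges."""
    seen, stack = set(), [start]
    while stack:
        block = stack.pop()
        for succ in cfg[block]["succs"]:
            if succ not in seen:
                seen.add(succ)
                stack.append(succ)
    return seen


def loop_body(cfg, header):
    """Blocks on some cycle through `header` (the header included)."""
    candidates = reachable(cfg, header) | {header}
    return sorted(b for b in candidates if header in reachable(cfg, b))


if __name__ == "__main__":
    # A single-block loop: "loop" branches back to itself or exits.
    cfg = {
        "pre": {"insns": ["x = x0"], "succs": ["loop"]},
        "loop": {"insns": ["x_next = f(x)"], "succs": ["loop", "exit"]},
        "exit": {"insns": [], "succs": []},
    }
    print(loop_body(cfg, "loop"))
    # Placing the PHI copy on the backedge forces an extra block.
    split_edge(cfg, "loop", "loop", "latch", ["x = x_next"])
    print(loop_body(cfg, "loop"))
```

Before the split the loop consists of the single block `loop`; afterwards it consists of `latch` and `loop`, which is the two-block shape that passes restricted to single-BB loops can no longer handle.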
[Bug middle-end/39976] [4.5 Regression] Big sixtrack degradation on powerpc 32/64 after revision r146817
--- Comment #13 from pinskia at gcc dot gnu dot org 2009-05-12 15:29 ---
(In reply to comment #12)
> I'm working on a fix. Earlier compilers contained a hack for this (because
> swing modulo scheduling can only deal with single BB loops),

It was not just SMS, which can only deal with single BB loops; that change improved other loops in general because the register allocator was not able to get rid of the extra basic block (SMS just showed it more). Since my name is on that change: my motivation at the time that part of the patch was written was not SMS (SMS was an afterthought to me).

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39976
[Bug middle-end/39976] [4.5 Regression] Big sixtrack degradation on powerpc 32/64 after revision r146817
--
mmitchel at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P2

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39976
[Bug middle-end/39976] [4.5 Regression] Big sixtrack degradation on powerpc 32/64 after revision r146817
--- Comment #8 from luisgpm at linux dot vnet dot ibm dot com 2009-05-04 15:41 ---
Oops... forgot about that bit, sorry.

Compile flags:

  -m32 -O3 -mcpu=power[4|5|6] -ffast-math -ftree-loop-linear -funroll-loops -fpeel-loops

This reproduces on both 32-bit and 64-bit.

Luis

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39976
[Bug middle-end/39976] [4.5 Regression] Big sixtrack degradation on powerpc 32/64 after revision r146817
--- Comment #9 from luisgpm at linux dot vnet dot ibm dot com 2009-05-04 15:50 ---
The configure options follow, although I think you'll be able to reproduce it with the flags I've mentioned.

  /gcc/HEAD/configure --target=powerpc64-linux --host=powerpc64-linux
    --build=powerpc64-linux --with-cpu=default32 --enable-threads=posix
    --enable-shared --enable-__cxa_atexit --with-gmp=/gmp --with-mpfr=mpfr
    --with-long-double-128 --enable-decimal-float --enable-secure-plt
    --disable-bootstrap --disable-alsa --prefix=/install/gcc/HEAD
    build_alias=powerpc64-linux host_alias=powerpc64-linux
    target_alias=powerpc64-linux --enable-languages=c,c++,fortran
    --no-create --no-recursion

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39976
[Bug middle-end/39976] [4.5 Regression] Big sixtrack degradation on powerpc 32/64 after revision r146817
--- Comment #7 from matz at gcc dot gnu dot org 2009-05-04 14:37 ---
Compile options please. I can't reproduce it with a powerpc64 compiler with -O2, neither with -m32 nor -m64, -ffast-math or no -ffast-math. Also 'gcc -v' can't hurt to make sure our compilers are configured the same.

Hint: I use this command to quickly skim over the situation with labels and bdnz:

  % egrep '^.L|bdnz' thin6d.s

If the bdnz lines always mention the label from a line above it's a single basic block loop.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39976
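The check behind the egrep hint in comment #7 can also be automated. The following is a minimal sketch under stated assumptions: labels look like `.Lnnn:` at the start of a line, every loop of interest ends in `bdnz`, and the function name `classify_bdnz_loops` is invented for this example. A `bdnz` whose target is the most recently seen label is reported as a single basic block loop, mirroring the heuristic from the comment.

```python
import re

LABEL_RE = re.compile(r'^(\.L\w+):')
BDNZ_RE = re.compile(r'^bdnz\s+(\.L\w+)')


def classify_bdnz_loops(asm_text):
    """Return (target_label, is_single_block) for every bdnz found.

    A loop counts as single-block when bdnz branches to the label
    that was seen most recently above it.
    """
    last_label = None
    results = []
    for raw in asm_text.splitlines():
        line = raw.strip()
        m = LABEL_RE.match(line)
        if m:
            last_label = m.group(1)
            continue
        m = BDNZ_RE.match(line)
        if m:
            target = m.group(1)
            results.append((target, target == last_label))
    return results


if __name__ == "__main__":
    bad = ".L145:\n fmul 10,8,13\n.L93:\n fmul 8,10,13\n bdnz .L145\n"
    good = ".L93:\n fmul 20,11,13\n bdnz .L93\n"
    print(classify_bdnz_loops(bad))
    print(classify_bdnz_loops(good))
```

Applied to the 32-bit listing from comment #1, the `bdnz .L145` follows the label `.L93`, so the loop is reported as spanning more than one block.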
[Bug middle-end/39976] [4.5 Regression] Big sixtrack degradation on powerpc 32/64 after revision r146817
--- Comment #10 from matz at gcc dot gnu dot org 2009-05-04 16:10 --- Yes, I can now, thanks. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39976
[Bug middle-end/39976] [4.5 Regression] Big sixtrack degradation on powerpc 32/64 after revision r146817
--- Comment #6 from luisgpm at linux dot vnet dot ibm dot com 2009-05-04 13:50 --- Just for the sake of information, -fselective-scheduling doesn't help. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39976
[Bug middle-end/39976] [4.5 Regression] Big sixtrack degradation on powerpc 32/64 after revision r146817
--- Comment #5 from rguenth at gcc dot gnu dot org 2009-05-01 20:07 ---
> <bb 45>:
>   # crkveuk_lsm.686_3635 = PHI <crkveuk_lsm.686_517(44)>
>   # cikve_lsm.685_3640 = PHI <cikve_lsm.685_528(44)>
>   # crkveuk_lsm.686_3564 = PHI <crkveuk_lsm.686_517(44)>
>
> Interesting, I wonder if that causes expand SSA to go crazy or do we go into
> closed loop form on purpose now.

Huh. We should definitely get rid of these degenerate PHIs before expanding (and call cfgcleanup which then may merge the blocks).

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39976
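A degenerate PHI, in the sense used in comment #5, is one with a single incoming argument, so it is effectively a plain copy. The sketch below scans a textual tree dump for such nodes. It assumes one PHI per line in the usual `# name = PHI <arg(bb), ...>` layout (the angle brackets are treated as optional); the function name `degenerate_phis` is hypothetical and this is not how GCC itself detects or removes them.

```python
import re

# "# lhs = PHI <arg1(bb), arg2(bb)>"; the angle brackets are optional here.
PHI_RE = re.compile(r'^\s*#\s*(\S+)\s*=\s*PHI\s*<?([^>]*)>?\s*$')


def degenerate_phis(dump_text):
    """Return (result, argument) for every PHI with exactly one argument."""
    found = []
    for line in dump_text.splitlines():
        m = PHI_RE.match(line)
        if not m:
            continue
        args = [a.strip() for a in m.group(2).split(",") if a.strip()]
        if len(args) == 1:
            found.append((m.group(1), args[0]))
    return found


if __name__ == "__main__":
    dump = (
        "  # crkveuk_lsm.686_3635 = PHI <crkveuk_lsm.686_517(44)>\n"
        "  # cikve_lsm.685_3640 = PHI <cikve_lsm.685_528(44)>\n"
        "  # x_3 = PHI <x_1(2), x_2(3)>\n"
    )
    for result, arg in degenerate_phis(dump):
        print(result, "is a copy of", arg)
```

On the three PHIs quoted from bb 45, all of which have one argument coming from block 44, every node would be reported.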
[Bug middle-end/39976] [4.5 Regression] Big sixtrack degradation on powerpc 32/64 after revision r146817
--
pinskia at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |pinskia at gcc dot gnu dot
                   |                            |org
          Component|regression                  |middle-end
   Target Milestone|---                         |4.5.0

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39976
[Bug middle-end/39976] [4.5 Regression] Big sixtrack degradation on powerpc 32/64 after revision r146817
--- Comment #1 from luisgpm at linux dot vnet dot ibm dot com 2009-04-30 19:29 ---
ASM code for the bad loop

.L145:
        fmul 10,8,13
        fmul 5,8,0
        addis 3,4,0xffe5
        lfd 22,8(7)
        addi 7,4,8
        lfd 6,9472(3)
        fmadd 10,9,0,10
        fmsub 23,9,13,5
        fmul 2,10,22
        fmul 9,10,6
        fmr 7,23
        fmsub 25,23,6,2
        fmadd 26,23,22,9
        fadd 12,12,25
        fadd 11,11,26
.L93:
        fmul 8,10,13
        fmul 22,10,0
        addis 3,7,0xffe5
        lfd 21,0(7)
        addi 4,7,8
        lfd 25,9472(3)
        fmadd 8,7,0,8
        fmsub 9,7,13,22
        fmul 23,8,21
        fmul 26,8,25
        fmsub 24,9,25,23
        fmadd 7,9,21,26
        fadd 12,12,24
        fadd 11,11,7
        bdnz .L145
        stfd 9,472(1)
        mr 7,8
        lwz 3,472(1)
        lwz 4,476(1)

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39976
[Bug middle-end/39976] [4.5 Regression] Big sixtrack degradation on powerpc 32/64 after revision r146817
--- Comment #2 from luisgpm at linux dot vnet dot ibm dot com 2009-04-30 19:38 ---
Created an attachment (id=17786)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=17786&action=view)
Last tree pass for the bad code

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39976
[Bug middle-end/39976] [4.5 Regression] Big sixtrack degradation on powerpc 32/64 after revision r146817
--- Comment #3 from pinskia at gcc dot gnu dot org 2009-04-30 19:51 ---
<bb 45>:
  # crkveuk_lsm.686_3635 = PHI <crkveuk_lsm.686_517(44)>
  # cikve_lsm.685_3640 = PHI <cikve_lsm.685_528(44)>
  # crkveuk_lsm.686_3564 = PHI <crkveuk_lsm.686_517(44)>

Interesting, I wonder if that causes expand SSA to go crazy or do we go into closed loop form on purpose now.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39976
[Bug middle-end/39976] [4.5 Regression] Big sixtrack degradation on powerpc 32/64 after revision r146817
--- Comment #4 from hjl dot tools at gmail dot com 2009-04-30 20:17 ---
I am not sure if it is related. Revision 146817 miscompiled 465.tonto in SPEC CPU 2006 on Linux/ia32 with

  -O3 -msse2 -mfpmath=sse -ffast-math -funroll-loops -m32

The resulting tonto binary generated the wrong result and ran VERY VERY slow.

--
hjl dot tools at gmail dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hjl dot tools at gmail dot
                   |                            |com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39976