[Bug rtl-optimization/38373] 32-bit Vortex degradation on PPC due to bad RTL aliasing
--- Comment #6 from luisgpm at linux dot vnet dot ibm dot com 2009-06-29 02:25 --- Fixed -- luisgpm at linux dot vnet dot ibm dot com changed: What|Removed |Added Status|RESOLVED|VERIFIED http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38373
[Bug rtl-optimization/38373] 32-bit Vortex degradation on PPC due to bad RTL aliasing
--- Comment #5 from luisgpm at linux dot vnet dot ibm dot com 2009-06-29 02:24 --- Already committed on 4.5. Closing... -- luisgpm at linux dot vnet dot ibm dot com changed: What|Removed |Added Status|WAITING |RESOLVED Resolution||FIXED http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38373
[Bug middle-end/39976] [4.5 Regression] Big sixtrack degradation on powerpc 32/64 after revision r146817
--- Comment #19 from luisgpm at linux dot vnet dot ibm dot com 2009-06-03 03:01 --- A little bit of information about the problem. On 32-bit code, the loads are being pushed up, from a different BB to the BB where we have the stfd instruction, during global scheduling. I suspect the 64-bit case is the same, with small variations. I'll update with more details soon. Luis -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39976
[Bug middle-end/40029] [4.5 Regression] Big degradation on swim/mgrid on powerpc 32/64 after alias improvement merge (gcc r145494)
--- Comment #5 from luisgpm at linux dot vnet dot ibm dot com 2009-05-29 19:52 --- From predictive commoning we gain a bit more performance, probably due to the bigger unrolling factor. Any chance of the unrolling taking place while still using PRE? Thanks, Luis -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40029
[Bug middle-end/39976] [4.5 Regression] Big sixtrack degradation on powerpc 32/64 after revision r146817
--- Comment #18 from luisgpm at linux dot vnet dot ibm dot com 2009-05-15 02:19 ---
64-bit with -mcpu=power6

.L93:
        fmul 20,11,13
        fmul 19,11,0
        addis 12,11,0xffe5
        lfd 3,0(11)
        addi 5,11,8
        lfd 2,9472(12)
        addis 14,5,0xffe5
        fmadd 1,12,0,20
        fmsub 12,12,13,19
        lfd 20,9472(14)
        lfd 19,8(11)
        addi 11,5,8
        fmul 11,1,13
        fmul 21,1,2
        fmul 22,1,3
        fmul 8,1,0
        fmadd 11,12,0,11
        fmadd 5,12,3,21
        fmsub 4,12,2,22
        fmsub 12,12,13,8
        fmul 1,11,19
        fmul 22,11,20
        fadd 9,9,5
        fadd 10,10,4
        fmsub 21,12,20,1
        fmadd 2,12,19,22
        fadd 10,10,21
        fadd 9,9,2
        bdnz .L93
.L265:
        stfd 12,736(1)
---
        mr 11,3 | ld 5,736(1)
--
-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39976
[Bug middle-end/39976] [4.5 Regression] Big sixtrack degradation on powerpc 32/64 after revision r146817
--- Comment #17 from luisgpm at linux dot vnet dot ibm dot com 2009-05-15 02:16 --- Actually, 64-bit is affected too, but not with the "power6x" tuning I was using. With "-mcpu=power6" I can reproduce the problem. The problem seems to be a couple of load instructions that are being pushed up from a different basic block. This results in a load-hit-store hazard, since the load ends up too close to a store to the same address. I'm not sure this is a direct consequence of changes in the gimple code. Would you know, Matz? I'll continue digging... Luis -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39976
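For readers unfamiliar with the hazard, here is a minimal C sketch of a construct that tends to produce a store followed shortly by a load from the same address. It is hypothetical and not taken from sixtrack; the name double_bits is made up for illustration. The assumption is that on a target without a direct FPR-to-GPR move, the value travels through a stack slot (a stfd followed by an ld or lwz of the same slot), which is the shape visible at the end of the listings in this bug. The source construct behind those listings is not identified here.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: reinterpret the bits of a double as a 64-bit
 * integer.  Assumption: where FP and integer registers cannot be moved
 * directly, a compiler may implement this as a store of the FP register
 * to memory followed by an integer load of the same address.  If the
 * scheduler places that load right behind the store, the load has to
 * wait for the store to complete (load-hit-store). */
uint64_t double_bits(double d)
{
    uint64_t bits;
    memcpy(&bits, &d, sizeof bits);  /* well-defined type punning in C */
    return bits;
}
```

For example, double_bits(1.0) yields 0x3FF0000000000000 when double is an IEEE 754 binary64 value. The function result is the same wherever the load is scheduled; only the latency changes.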
[Bug middle-end/39976] [4.5 Regression] Big sixtrack degradation on powerpc 32/64 after revision r146817
--- Comment #16 from luisgpm at linux dot vnet dot ibm dot com 2009-05-14 04:12 --- Just for the record... The 64-bit case is fixed. There are still performance issues in the 32-bit case. I'll attach more information soon. Luis -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39976
[Bug middle-end/39976] [4.5 Regression] Big sixtrack degradation on powerpc 32/64 after revision r146817
--- Comment #11 from luisgpm at linux dot vnet dot ibm dot com 2009-05-12 12:55 --- Any updates? -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39976
[Bug middle-end/40029] [4.5 Regression] Big degradation on swim/mgrid on powerpc 32/64 after alias improvement merge (gcc r145494)
--- Comment #1 from luisgpm at linux dot vnet dot ibm dot com 2009-05-11 18:04 ---
Good asm code for a hot loop in swim's "calc1" function

10001e10: lfd f12,-10672(r11)
10001e14: lfd f9,-10672(r9)
10001e18: addi r21,r21,16
10001e1c: lfd f7,-10680(r11)
10001e20: lfd f6,-10672(r6)
10001e24: fmul f3,f9,f9
10001e28: cmpw r21,r0
10001e2c: fadd f4,f7,f12
10001e30: lfd f22,-10680(r9)
10001e34: lfd f10,-10664(r9)
10001e38: addi r9,r9,16
10001e3c: lfd f23,-10672(r5)
10001e40: lfd f13,-10664(r5)
10001e44: addi r5,r5,16
10001e48: lfd f5,-10664(r11)
10001e4c: fsub f28,f23,f9
10001e50: fsub f25,f13,f10
10001e54: lfd f13,-10672(r4)
10001e58: addi r11,r11,16
10001e5c: fadd f5,f12,f5
10001e60: fsub f20,f13,f0
10001e64: fmul f9,f11,f9
10001e68: fmadd f27,f22,f22,f3
10001e6c: fmadd f30,f10,f10,f3
10001e70: lfd f3,-10680(r8)
10001e74: fadd f26,f4,f6
10001e78: fmul f10,f11,f10
10001e7c: fmul f24,f28,f2
10001e80: fmul f21,f25,f2
10001e84: fmul f4,f9,f4
10001e88: fmadd f22,f0,f0,f27
10001e8c: fadd f27,f8,f7
10001e90: fadd f23,f26,f8
10001e94: fmul f26,f0,f11
10001e98: lfd f8,-10664(r6)
10001e9c: lfd f0,-10664(r4)
10001ea0: addi r6,r6,16
10001ea4: fadd f29,f5,f8
10001ea8: fsub f25,f0,f13
10001eac: addi r4,r4,16
10001eb0: fmsub f28,f20,f1,f24
10001eb4: lfd f20,-10672(r8)
10001eb8: fmul f5,f10,f5
10001ebc: addi r8,r8,16
10001ec0: stfd f4,-10672(r22)
10001ec4: stfd f5,-10664(r22)
10001ec8: addi r22,r22,16
10001ecc: fmul f27,f26,f27
10001ed0: fadd f24,f6,f29
10001ed4: fmsub f29,f25,f1,f21
10001ed8: fdiv f28,f28,f23
10001edc: fmadd f25,f13,f13,f30
10001ee0: fadd f6,f6,f12
10001ee4: fmadd f30,f3,f3,f22
10001ee8: stfd f27,-10680(r3)
10001eec: fdiv f29,f29,f24
10001ef0: fmadd f3,f20,f20,f25
10001ef4: fmul f20,f13,f11
10001ef8: fmadd f7,f30,f31,f7
10001efc: stfd f7,-10680(r10)
10001f00: fmadd f12,f3,f31,f12
10001f04: fmul f13,f20,f6
10001f08: stfd f12,-10672(r10)
10001f0c: stfd f13,-10672(r3)
10001f10: addi r10,r10,16
10001f14: addi r3,r3,16
10001f18: stfd f28,-10672(r7)
10001f1c: stfd f29,-10664(r7)
10001f20: addi r7,r7,16
10001f24: bne 10001e10

--

Bad asm code for the same loop

10001a60: addis r27,r9,-435
10001a64: addis r12,r11,-2176
10001a68: lfd f13,-7440(r27)
10001a6c: lfd f10,28344(r12)
10001a70: addis r8,r11,-1958
10001a74: addis r10,r11,-1740
10001a78: fsub f7,f10,f13
10001a7c: lfd f8,-704(r8)
10001a80: lfd f10,0(r9)
10001a84: addis r7,r9,-218
10001a88: addis r28,r9,1523
10001a8c: lfd f9,-29752(r10)
10001a90: fadd f6,f12,f10
10001a94: fsub f2,f8,f0
10001a98: addis r12,r11,218
10001a9c: addis r27,r9,2176
10001aa0: fadd f5,f11,f9
10001aa4: fadd f11,f11,f12
10001aa8: addi r9,r9,8
10001aac: cmpw r6,r9
10001ab0: fmul f1,f7,f30
10001ab4: fmul f7,f13,f13
10001ab8: fmul f13,f13,f3
10001abc: fadd f31,f5,f6
10001ac0: lfd f5,29040(r7)
10001ac4: fmsub f2,f2,f29,f1
10001ac8: fmadd f1,f0,f0,f7
10001acc: fmul f0,f0,f3
10001ad0: fmul f6,f13,f6
10001ad4: stfd f6,-6728(r28)
10001ad8: fdiv f2,f2,f31
10001adc: fmadd f5,f5,f5,f1
10001ae0: fmul f31,f0,f11
10001ae4: fmr f0,f8
10001ae8: stfd f31,0(r11)
10001aec: fmr f11,f9
10001af0: addi r11,r11,8
10001af4: fadd f1,f5,f4
10001af8: fmr f4,f7
10001afc: fmadd f5,f1,f28,f12
10001b00: fmr f12,f10
10001b04: stfd f5,-28344(r27)
10001b08: stfd f2,-29040(r12)
10001b0c: bne+ 10001a60

--

Comparing the two cases, the good code seems to traverse the loop differently from the bad code: it uses smaller displacements for each load/store, while the bad code uses bigger displacements. The good case also appears to have a bigger unrolling factor (longer code, more loads) than the bad case.

-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40029
[Bug middle-end/40029] New: [4.5 Regression] Big degradation on swim/mgrid on powerpc 32/64 after alias improvement merge (gcc r145494)
CPU2000's swim and mgrid had a ~10% slowdown after the merge of the alias improvement branch.

GCC was configured with the following:

/gcc/HEAD/configure --target=powerpc64-linux --host=powerpc64-linux --build=powerpc64-linux --with-cpu=default32 --enable-threads=posix --enable-shared --enable-__cxa_atexit --with-gmp=/gmp --with-mpfr=mpfr --with-long-double-128 --enable-decimal-float --enable-secure-plt --disable-bootstrap --disable-alsa --prefix=/install/gcc/HEAD build_alias=powerpc64-linux host_alias=powerpc64-linux target_alias=powerpc64-linux --enable-languages=c,c++,fortran --no-create --no-recursion

Compile flags used: -m[32|64] -O3 -mcpu=power[4|5|6] -ffast-math -ftree-loop-linear -funroll-loops -fpeel-loops

Will provide more details soon.

--
Summary: [4.5 Regression] Big degradation on swim/mgrid on powerpc 32/64 after alias improvement merge (gcc r145494)
Product: gcc
Version: 4.5.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: middle-end
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: luisgpm at linux dot vnet dot ibm dot com
GCC build triplet: powerpc*-*-*
GCC host triplet: powerpc*-*-*
GCC target triplet: powerpc*-*-*

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40029
[Bug middle-end/39976] [4.5 Regression] Big sixtrack degradation on powerpc 32/64 after revision r146817
--- Comment #9 from luisgpm at linux dot vnet dot ibm dot com 2009-05-04 15:50 --- The configure options follow, although I think you'll be able to reproduce it with the flags I've mentioned. /gcc/HEAD/configure --target=powerpc64-linux --host=powerpc64-linux --build=powerpc64-linux --with-cpu=default32 --enable-threads=posix --enable-shared --enable-__cxa_atexit --with-gmp=/gmp --with-mpfr=mpfr --with-long-double-128 --enable-decimal-float --enable-secure-plt --disable-bootstrap --disable-alsa --prefix=/install/gcc/HEAD build_alias=powerpc64-linux host_alias=powerpc64-linux target_alias=powerpc64-linux --enable-languages=c,c++,fortran --no-create --no-recursion -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39976
[Bug middle-end/39976] [4.5 Regression] Big sixtrack degradation on powerpc 32/64 after revision r146817
--- Comment #8 from luisgpm at linux dot vnet dot ibm dot com 2009-05-04 15:41 --- Oops... forgot about that bit, sorry. Compile flags: -m32 -O3 -mcpu=power[4|5|6] -ffast-math -ftree-loop-linear -funroll-loops -fpeel-loops This reproduces on both 32-bit and 64-bit. Luis -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39976
[Bug middle-end/39976] [4.5 Regression] Big sixtrack degradation on powerpc 32/64 after revision r146817
--- Comment #6 from luisgpm at linux dot vnet dot ibm dot com 2009-05-04 13:50 --- Just for the sake of information, -fselective-scheduling doesn't help. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39976
[Bug middle-end/39976] [4.5 Regression] Big sixtrack degradation on powerpc 32/64 after revision r146817
--- Comment #2 from luisgpm at linux dot vnet dot ibm dot com 2009-04-30 19:38 --- Created an attachment (id=17786) --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=17786&action=view) Last tree pass for the bad code -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39976
[Bug middle-end/39976] [4.5 Regression] Big sixtrack degradation on powerpc 32/64 after revision r146817
--- Comment #1 from luisgpm at linux dot vnet dot ibm dot com 2009-04-30 19:29 ---
ASM code for the bad loop

.L145:
        fmul 10,8,13
        fmul 5,8,0
        addis 3,4,0xffe5
        lfd 22,8(7)
        addi 7,4,8
        lfd 6,9472(3)
        fmadd 10,9,0,10
        fmsub 23,9,13,5
        fmul 2,10,22
        fmul 9,10,6
        fmr 7,23
        fmsub 25,23,6,2
        fmadd 26,23,22,9
        fadd 12,12,25
        fadd 11,11,26
.L93:
        fmul 8,10,13
        fmul 22,10,0
        addis 3,7,0xffe5
        lfd 21,0(7)
        addi 4,7,8
        lfd 25,9472(3)
        fmadd 8,7,0,8
        fmsub 9,7,13,22
        fmul 23,8,21
        fmul 26,8,25
        fmsub 24,9,25,23
        fmadd 7,9,21,26
        fadd 12,12,24
        fadd 11,11,7
        bdnz .L145
        stfd 9,472(1)
        mr 7,8
        lwz 3,472(1)
        lwz 4,476(1)

-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39976
[Bug rtl-optimization/38373] 32-bit Vortex degradation on PPC due to bad RTL aliasing
--- Comment #3 from luisgpm at linux dot vnet dot ibm dot com 2009-04-30 16:33 --- This is already in 4.4, but we would like to add additional checks on 4.5 that would be risky to have on 4.4 (since it was almost being released). I have the additional patch and will attach it soon. Sorry it took so long to reply. Luis -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38373
[Bug regression/39976] New: [4.5 Regression] Big sixtrack degradation on powerpc 32/64 after revision r146817
We have a hot spot on sixtrack in a loop in a function called thin6d. The old (pre-146817) GCC generates this loop as a single BB, so the only way into the loop is to fall through into it from the preceding code. The post-146817 GCC breaks the loop into two BBs, so that the first iteration can branch into the middle of the loop; after that, the loop runs just as in pre-146817. The degradation comes from the fact that the creation of two BBs prevents good scheduling of the instructions inside the loop, like this:

Good code: All the fp load instructions are grouped in the upper portion of the code.

        fmul f22,f11,f13
        fmul f23,f11,f0
        addis r12,r6,-27
        lfd f3,0(r6)
        addi r4,r6,8
        lfd f1,9472(r12)
        addis r12,r4,-27
        fmadd f8,f12,f0,f22
        fmsub f4,f12,f13,f23
        lfd f22,9472(r12)
        lfd f23,8(r6)
        addi r6,r4,8
        fmul f11,f8,f13
        fmul f24,f8,f1
        fmul f25,f8,f3
        fmul f5,f8,f0
        fmadd f11,f4,f0,f11
        fmadd f21,f4,f3,f24
        fmsub f2,f4,f1,f25
        fmsub f12,f4,f13,f5
        fmul f1,f11,f23
        fmul f8,f11,f22
        fadd f9,f9,f21
        fadd f10,f10,f2
        fmsub f24,f12,f22,f1
        fmadd f25,f12,f23,f8
        fadd f10,f10,f24
        fadd f9,f9,f25
        bdnz 100ca878

Bad code: The second pair of loads is pushed down into the second BB, causing slowdowns.

        fmul f5,f8,f0
        addis r3,r4,-27
        lfd f22,8(r7)
        addi r7,r4,8
        lfd f6,9472(r3)
        fmadd f10,f9,f0,f10
        fmsub f23,f9,f13,f5
        fmul f2,f10,f22
        fmul f9,f10,f6
        fmr f7,f23
        fmsub f25,f23,f6,f2
        fmadd f26,f23,f22,f9
        fadd f12,f12,f25
        fadd f11,f11,f26
        fmul f8,f10,f13 >> BB mark
        fmul f22,f10,f0
        addis r3,r7,-27
        lfd f21,0(r7)
        addi r4,r7,8
        lfd f25,9472(r3)
        fmadd f8,f7,f0,f8
        fmsub f9,f7,f13,f22
        fmul f23,f8,f21
        fmul f26,f8,f25
        fmsub f24,f9,f25,f23
        fmadd f7,f9,f21,f26
        fadd f12,f12,f24
        fadd f11,f11,f7
        bdnz 100c9fe0

--
Summary: [4.5 Regression] Big sixtrack degradation on powerpc 32/64 after revision r146817
Product: gcc
Version: 4.5.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: regression
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: luisgpm at linux dot vnet dot ibm dot com
GCC build triplet: powerpc*-*-*
GCC host triplet: powerpc*-*-*
GCC target triplet: powerpc*-*-*

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39976
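The two loop shapes described in this report can be sketched in C. This is a hypothetical illustration on a simple array sum, not the thin6d code; the names sum_top_entry and sum_mid_entry are made up. Both functions unroll the body by two. The first handles an odd trip count ahead of the loop, so the loop body stays a single block entered only from the top. The second jumps into the middle of the body on the first iteration, which splits the body into two blocks.

```c
#include <stddef.h>

/* Loop body is one block: an odd element is peeled off before the loop,
 * so control only ever enters the body from the top. */
double sum_top_entry(const double *a, size_t n)
{
    double s = 0.0;
    size_t i = 0;
    if (n % 2 != 0)
        s += a[i++];          /* peeled odd element */
    while (i < n) {
        s += a[i];
        s += a[i + 1];
        i += 2;
    }
    return s;
}

/* Loop body is two blocks: for an odd trip count the first iteration
 * branches to the label in the middle of the body. */
double sum_mid_entry(const double *a, size_t n)
{
    double s = 0.0;
    size_t i = 0;
    if (n == 0)
        return s;
    if (n % 2 != 0)
        goto second_half;     /* enter the loop in the middle */
    do {
        s += a[i++];
second_half:
        s += a[i++];
    } while (i < n);
    return s;
}
```

Both functions compute the same sum; they differ only in control flow. In the second form, the label is a block boundary inside the loop, so a scheduler that works one block at a time cannot freely move the loads of the second half above it.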
[Bug rtl-optimization/38373] 32-bit Vortex degradation on PPC due to bad RTL aliasing
--- Comment #1 from luisgpm at linux dot vnet dot ibm dot com 2009-01-09 18:00 --- Created an attachment (id=17065) --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=17065&action=view) Second part of the combined patch Additional check to avoid returning a NULL base. This is a placeholder for a 4.5 fix. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38373
[Bug rtl-optimization/38373] New: 32-bit Vortex degradation on PPC due to bad RTL aliasing
The handling of LO_SUM by alias.c:find_base_term causes a degradation on 32-bit vortex on PPC when used with the new REG_POINTER attribute. Making "find_base_term" handle LO_SUM the same way as alias.c:find_base_value fixes the problem. Preventing "find_base_term" from returning NULL so easily also fixes the problem. A fix for 4.5 will most probably be a combination of these two approaches.

--
Summary: 32-bit Vortex degradation on PPC due to bad RTL aliasing
Product: gcc
Version: 4.4.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: rtl-optimization
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: luisgpm at linux dot vnet dot ibm dot com
GCC build triplet: powerpc*-*-*
GCC host triplet: powerpc*-*-*
GCC target triplet: powerpc*-*-*

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38373
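To make the idea concrete, here is a toy model in C of a base-address search over a small expression tree. It is not GCC's alias.c code: the types and names (toy_expr, toy_code, toy_find_base) are invented for illustration, and the LO_SUM rule shown, looking only at the second operand because the usual form is (lo_sum reg sym), is an assumption about what "the same way as find_base_value" means.

```c
#include <stddef.h>

/* Toy expression codes; these are illustrative, not GCC's rtx codes. */
enum toy_code { TOY_REG, TOY_SYMBOL, TOY_CONST, TOY_PLUS, TOY_LO_SUM };

struct toy_expr {
    enum toy_code code;
    const struct toy_expr *op0;   /* first operand, if any */
    const struct toy_expr *op1;   /* second operand, if any */
    const char *name;             /* symbol name, TOY_SYMBOL only */
};

/* Return the symbol an address is based on, or NULL if unknown. */
const struct toy_expr *toy_find_base(const struct toy_expr *x)
{
    if (x == NULL)
        return NULL;
    switch (x->code) {
    case TOY_SYMBOL:
        return x;
    case TOY_LO_SUM:
        /* Assumed rule: (lo_sum reg sym) carries its base in operand 1,
         * so the register in operand 0 is not consulted. */
        return toy_find_base(x->op1);
    case TOY_PLUS: {
        /* Try either side of an addition before giving up. */
        const struct toy_expr *base = toy_find_base(x->op0);
        return base != NULL ? base : toy_find_base(x->op1);
    }
    default:
        return NULL;              /* plain registers and constants */
    }
}
```

With a tree built as lo_sum(reg, symbol), toy_find_base returns the symbol node, whereas a search that only followed the register operand would return NULL and leave the alias analysis with no base to compare.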
[Bug tree-optimization/37686] [4.4 Regression] Building of CPU2000's bzip2 with peak flags with -mcpu=power4 fails with an ICE.
--- Comment #21 from luisgpm at linux dot vnet dot ibm dot com 2008-10-03 20:59 --- It fixes the bzip2 ICE. Thanks, Luis -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37686
[Bug middle-end/37447] [4.4 Regression] test pr28982b.c fails execution on power4 or later with ira change
--- Comment #5 from luisgpm at linux dot vnet dot ibm dot com 2008-10-02 01:43 --- This problem also showed up as a CPU2000 regression in the Sixtrack benchmark for PPC64, causing problems in the ordering of ld/st instructions. A GCC patched with Richard's fix produced the right code and the regression is gone. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37447
[Bug regression/37686] [4.4 Regression] Building of CPU2000's bzip2 with peak flags with -mcpu=power4 fails with an ICE.
--- Comment #7 from luisgpm at linux dot vnet dot ibm dot com 2008-10-01 17:44 --- I still can ICE it with the same flags in a native system. Any other info you'd like to have available? I have a more reduced source, will post it soon. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37686
[Bug regression/37686] [4.4 Regression] Building of CPU2000's bzip2 with peak flags with -mcpu=power4 fails with an ICE.
--- Comment #5 from luisgpm at linux dot vnet dot ibm dot com 2008-10-01 13:19 --- I'm still trying to minimize the source even further. Will attach when I have something better. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37686
[Bug regression/37686] [4.4 Regression] Building of CPU2000's bzip2 with peak flags with -mcpu=power4 fails with an ICE.
--- Comment #4 from luisgpm at linux dot vnet dot ibm dot com 2008-10-01 13:13 --- Created an attachment (id=16441) --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=16441&action=view) Reduced source for bzip2.c Indented reduced source. -- luisgpm at linux dot vnet dot ibm dot com changed: What|Removed |Added Attachment #16440|0 |1 is obsolete|| http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37686
[Bug regression/37686] [4.4 Regression] Building of CPU2000's bzip2 with peak flags with -mcpu=power4 fails with an ICE.
--- Comment #3 from luisgpm at linux dot vnet dot ibm dot com 2008-10-01 13:10 --- Created an attachment (id=16440) --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=16440&action=view) Reduced source for bzip2.c Source reduced with delta -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37686
[Bug regression/37686] [4.4 Regression] Building of CPU2000's bzip2 with peak flags with -mcpu=power4 fails with an ICE.
--- Comment #2 from luisgpm at linux dot vnet dot ibm dot com 2008-10-01 13:10 --- Created an attachment (id=16439) --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=16439&action=view) Preprocessed source for reduced bzip2.c Preprocessed source. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37686
[Bug regression/37686] New: Building of CPU2000's bzip2 with peak flags with -mcpu=power4 fails with an ICE.
After this patch:

http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=140660

building bzip2 with peak flags AND -mcpu=power4 for ppc (-O3 -mcpu=power4 -ffast-math -funroll-loops -ftree-loop-linear -fpeel-loops) started ICE'ing.

Here is the backtrace:

#0 0x103a2c3c in gimple_bb (g=0x4) at /home/luis/src/gcc/HEAD/gcc/gimple.h:1071
#1 0x103a2b80 in gsi_start (seq=0xf76882b0) at /home/luis/src/gcc/HEAD/gcc/gimple.h:4298
#2 0x103a5a48 in gsi_start_phis (bb=0xf7664b40) at /home/luis/src/gcc/HEAD/gcc/gimple-iterator.c:770
#3 0x105e601c in verify_stmts () at /home/luis/src/gcc/HEAD/gcc/tree-cfg.c:4177
#4 0x107e80cc in verify_ssa (check_modified_stmt=1 '\001') at /home/luis/src/gcc/HEAD/gcc/tree-ssa.c:750
#5 0x10476c30 in execute_function_todo (data=0x4043) at /home/luis/src/gcc/HEAD/gcc/passes.c:999
#6 0x1047657c in do_per_function (callback=0x10476898, data=0x4043) at /home/luis/src/gcc/HEAD/gcc/passes.c:841
#7 0x10476d38 in execute_todo (flags=16451) at /home/luis/src/gcc/HEAD/gcc/passes.c:1025
#8 0x10477c54 in execute_one_pass (pass=0x10eeabd4) at /home/luis/src/gcc/HEAD/gcc/passes.c:1301
#9 0x10477ec4 in execute_pass_list (pass=0x10eeabd4) at /home/luis/src/gcc/HEAD/gcc/passes.c:1327
#10 0x10477ef0 in execute_pass_list (pass=0x10eeaa9c) at /home/luis/src/gcc/HEAD/gcc/passes.c:1328
#11 0x10477ef0 in execute_pass_list (pass=0x10eea494) at /home/luis/src/gcc/HEAD/gcc/passes.c:1328
#12 0x106656a8 in tree_rest_of_compilation (fndecl=0xf7e74180) at /home/luis/src/gcc/HEAD/gcc/tree-optimize.c:418
#13 0x10930d78 in cgraph_expand_function (node=0xf7e78300) at /home/luis/src/gcc/HEAD/gcc/cgraphunit.c:1038
#14 0x10930fe0 in cgraph_expand_all_functions () at /home/luis/src/gcc/HEAD/gcc/cgraphunit.c:1097
#15 0x109317a8 in cgraph_optimize () at /home/luis/src/gcc/HEAD/gcc/cgraphunit.c:1302
#16 0x10033998 in c_write_global_declarations () at /home/luis/src/gcc/HEAD/gcc/c-decl.c:8073
#17 0x105c9624 in compile_file () at /home/luis/src/gcc/HEAD/gcc/toplev.c:979
#18 0x105cc214 in do_compile () at /home/luis/src/gcc/HEAD/gcc/toplev.c:2190
#19 0x105cc2a8 in toplev_main (argc=30, argv=0xff880e44) at /home/luis/src/gcc/HEAD/gcc/toplev.c:
#20 0x100f90fc in main (argc=30, argv=0xff880e44) at /home/luis/src/gcc/HEAD/gcc/main.c:35

The failing line is "return g->gsbase.bb;", because "g" isn't really a valid pointer:

(gdb) p g
$1 = (const_gimple) 0x4

--
Summary: Building of CPU2000's bzip2 with peak flags with -mcpu=power4 fails with an ICE.
Product: gcc
Version: 4.4.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: regression
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: luisgpm at linux dot vnet dot ibm dot com
GCC build triplet: powerpc*-*-*
GCC host triplet: powerpc*-*-*
GCC target triplet: powerpc*-*-*

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37686
[Bug fortran/37536] [4.3/4.4 Regression] a mfcr is produced instead of branches for DO loops
--- Comment #9 from luisgpm at linux dot vnet dot ibm dot com 2008-09-17 15:22 --- Gathered some PPC 32/64 performance numbers with the patch (based on 140409). No noticeable performance regressions were found. 32-bit swim and 64-bit art got a small speed boost (7.8% and 3.4% respectively). -- luisgpm at linux dot vnet dot ibm dot com changed: What|Removed |Added CC||luisgpm at linux dot vnet ||dot ibm dot com http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37536
[Bug tree-optimization/14703] [4.4 regression] Inadequate optimization of inline templated functions, infinite loop in ipa-reference and memory hog
--- Comment #12 from luisgpm at linux dot vnet dot ibm dot com 2008-09-11 05:18 --- This patch (revision 140068) breaks SPEC2000's 200.sixtrack benchmark for POWER6 due to miscompares. Reverting this patch solves the problem. Not sure what specific problem was introduced. I can isolate a testcase if need be. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=14703
[Bug target/34856] [4.2/4.3 Regression] ICE with some constant vectors
--- Comment #31 from luisgpm at linux dot vnet dot ibm dot com 2008-09-09 13:51 --- I have the fix for PPC. Any special reason why this doesn't get reproduced there? Would it still be worthwhile to include the rs6000-specific fix for this bug ticket? Thanks, Luis -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34856
[Bug c/37171] [4.4 Regression] Canonical spelling optimization dependency
--- Comment #9 from luisgpm at linux dot vnet dot ibm dot com 2008-08-20 16:09 --- With revision 139317, the numbers for 197.parser are back to normal and the generated ASM code carries only a single call to __ctype_toupper_loc. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37171
[Bug c/37171] [4.4 Regression] Canonical spelling optimization dependency
--- Comment #7 from luisgpm at linux dot vnet dot ibm dot com 2008-08-20 14:21 --- The preprocessed sources for strncasecmp.c are exactly the same for both cases. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37171
[Bug c/37171] [4.4 Regression] Canonical spelling optimization dependency
--- Comment #6 from luisgpm at linux dot vnet dot ibm dot com 2008-08-20 14:07 --- Created an attachment (id=16112) --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=16112&action=view) Generated ASM code for the good case Unlike in the bad case, the __ctype_toupper_loc function is called only ONCE in this code. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37171
[Bug c/37171] [4.4 Regression] Canonical spelling optimization dependency
--- Comment #5 from luisgpm at linux dot vnet dot ibm dot com 2008-08-20 14:06 --- Created an attachment (id=16111) --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=16111&action=view) Generated ASM code for the bad case Notice that __ctype_toupper_loc is called 6 times in this code. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37171
[Bug c/37171] [4.4 Regression] Canonical spelling optimization dependency
--- Comment #4 from luisgpm at linux dot vnet dot ibm dot com 2008-08-20 14:05 --- Created an attachment (id=16110) --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=16110&action=view) Preprocessed source for the good case -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37171
[Bug c/37171] [4.4 Regression] Canonical spelling optimization dependency
--- Comment #3 from luisgpm at linux dot vnet dot ibm dot com 2008-08-20 14:04 --- Created an attachment (id=16109) --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=16109&action=view) Preprocessed source for the bad case -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37171