[Bug rtl-optimization/38373] 32-bit Vortex degradation on PPC due to bad RTL aliasing

2009-06-28 Thread luisgpm at linux dot vnet dot ibm dot com


--- Comment #6 from luisgpm at linux dot vnet dot ibm dot com  2009-06-29 
02:25 ---
Fixed


-- 

luisgpm at linux dot vnet dot ibm dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |VERIFIED


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38373



[Bug rtl-optimization/38373] 32-bit Vortex degradation on PPC due to bad RTL aliasing

2009-06-28 Thread luisgpm at linux dot vnet dot ibm dot com


--- Comment #5 from luisgpm at linux dot vnet dot ibm dot com  2009-06-29 
02:24 ---
Already committed to 4.5. Closing...


-- 

luisgpm at linux dot vnet dot ibm dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|WAITING                     |RESOLVED
         Resolution|                            |FIXED


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38373



[Bug middle-end/39976] [4.5 Regression] Big sixtrack degradation on powerpc 32/64 after revision r146817

2009-06-02 Thread luisgpm at linux dot vnet dot ibm dot com


--- Comment #19 from luisgpm at linux dot vnet dot ibm dot com  2009-06-03 
03:01 ---
A little bit of information about the problem.

On 32-bit code, the loads are being pushed up, from a different BB to the BB
where we have the stfd instruction, during global scheduling. I suspect the
64-bit case is the same, with small variations.

I'll update with more details soon.

Luis


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39976



[Bug middle-end/40029] [4.5 Regression] Big degradation on swim/mgrid on powerpc 32/64 after alias improvement merge (gcc r145494)

2009-05-29 Thread luisgpm at linux dot vnet dot ibm dot com


--- Comment #5 from luisgpm at linux dot vnet dot ibm dot com  2009-05-29 
19:52 ---
From predictive commoning we gain a bit more performance, probably due to the
bigger unrolling factor.

Any chance of the unrolling taking place while still using PRE?

Thanks,
Luis


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40029



[Bug middle-end/39976] [4.5 Regression] Big sixtrack degradation on powerpc 32/64 after revision r146817

2009-05-14 Thread luisgpm at linux dot vnet dot ibm dot com


--- Comment #18 from luisgpm at linux dot vnet dot ibm dot com  2009-05-15 
02:19 ---
64-bit with -mcpu=power6

.L93:
fmul 20,11,13
fmul 19,11,0
addis 12,11,0xffe5
lfd 3,0(11)
addi 5,11,8
lfd 2,9472(12)
addis 14,5,0xffe5
fmadd 1,12,0,20
fmsub 12,12,13,19
lfd 20,9472(14)
lfd 19,8(11)
addi 11,5,8
fmul 11,1,13
fmul 21,1,2
fmul 22,1,3
fmul 8,1,0
fmadd 11,12,0,11
fmadd 5,12,3,21
fmsub 4,12,2,22
fmsub 12,12,13,8
fmul 1,11,19
fmul 22,11,20
fadd 9,9,5
fadd 10,10,4
fmsub 21,12,20,1
fmadd 2,12,19,22
fadd 10,10,21
fadd 9,9,2
bdnz .L93
.L265:
stfd 12,736(1) ---+
mr 11,3           |
ld 5,736(1)    ---+


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39976



[Bug middle-end/39976] [4.5 Regression] Big sixtrack degradation on powerpc 32/64 after revision r146817

2009-05-14 Thread luisgpm at linux dot vnet dot ibm dot com


--- Comment #17 from luisgpm at linux dot vnet dot ibm dot com  2009-05-15 
02:16 ---
Actually, 64-bit is affected too, but not with the "power6x" tuning I was
using. With "-mcpu=power6" I can reproduce the problem.

The problem seems to be a couple of load instructions that are being pushed up
from a different basic block, which results in a load-hit-store hazard, since
the loads end up too close to a store to the same address.
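
For readers unfamiliar with the hazard, here is a minimal C sketch of one
pattern that can produce a store followed closely by a load from the same
address: copying the bits of a double into an integer. On a target with no
direct move between floating-point and general-purpose registers, a compiler
may implement this through a stack slot (a floating-point store, then an
integer load). That this is the pattern behind the sixtrack slowdown is an
assumption, not something this report confirms, and "double_bits" is a
hypothetical name.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: return the raw bit pattern of a double.
   Assumption: the target may lower this to "store the FP register to a
   stack slot, then load that slot into an integer register", i.e. a
   store immediately followed by a load from the same address. */
uint64_t double_bits(double d)
{
    uint64_t bits;
    memcpy(&bits, &d, sizeof bits);   /* well-defined way to reinterpret */
    return bits;
}
```

If the load is scheduled right behind the store, it has to wait for the store
data, which is the kind of stall described above.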

I'm not sure this is a direct consequence of changes in the gimple code. Would
you know, Matz?

I'll continue digging...

Luis


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39976



[Bug middle-end/39976] [4.5 Regression] Big sixtrack degradation on powerpc 32/64 after revision r146817

2009-05-13 Thread luisgpm at linux dot vnet dot ibm dot com


--- Comment #16 from luisgpm at linux dot vnet dot ibm dot com  2009-05-14 
04:12 ---
Just for the record... The 64-bit case is fixed. There are still performance
issues in the 32-bit case.

I'll attach more information soon.

Luis


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39976



[Bug middle-end/39976] [4.5 Regression] Big sixtrack degradation on powerpc 32/64 after revision r146817

2009-05-12 Thread luisgpm at linux dot vnet dot ibm dot com


--- Comment #11 from luisgpm at linux dot vnet dot ibm dot com  2009-05-12 
12:55 ---
Any updates?


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39976



[Bug middle-end/40029] [4.5 Regression] Big degradation on swim/mgrid on powerpc 32/64 after alias improvement merge (gcc r145494)

2009-05-11 Thread luisgpm at linux dot vnet dot ibm dot com


--- Comment #1 from luisgpm at linux dot vnet dot ibm dot com  2009-05-11 
18:04 ---
Good asm code for a hot loop in swim's "calc1" function

10001e10:   lfd     f12,-10672(r11)
10001e14:   lfd     f9,-10672(r9)
10001e18:   addi    r21,r21,16
10001e1c:   lfd     f7,-10680(r11)
10001e20:   lfd     f6,-10672(r6)
10001e24:   fmul    f3,f9,f9
10001e28:   cmpw    r21,r0
10001e2c:   fadd    f4,f7,f12
10001e30:   lfd     f22,-10680(r9)
10001e34:   lfd     f10,-10664(r9)
10001e38:   addi    r9,r9,16
10001e3c:   lfd     f23,-10672(r5)
10001e40:   lfd     f13,-10664(r5)
10001e44:   addi    r5,r5,16
10001e48:   lfd     f5,-10664(r11)
10001e4c:   fsub    f28,f23,f9
10001e50:   fsub    f25,f13,f10
10001e54:   lfd     f13,-10672(r4)
10001e58:   addi    r11,r11,16
10001e5c:   fadd    f5,f12,f5
10001e60:   fsub    f20,f13,f0
10001e64:   fmul    f9,f11,f9
10001e68:   fmadd   f27,f22,f22,f3
10001e6c:   fmadd   f30,f10,f10,f3
10001e70:   lfd     f3,-10680(r8)
10001e74:   fadd    f26,f4,f6
10001e78:   fmul    f10,f11,f10
10001e7c:   fmul    f24,f28,f2
10001e80:   fmul    f21,f25,f2
10001e84:   fmul    f4,f9,f4
10001e88:   fmadd   f22,f0,f0,f27
10001e8c:   fadd    f27,f8,f7
10001e90:   fadd    f23,f26,f8
10001e94:   fmul    f26,f0,f11
10001e98:   lfd     f8,-10664(r6)
10001e9c:   lfd     f0,-10664(r4)
10001ea0:   addi    r6,r6,16
10001ea4:   fadd    f29,f5,f8
10001ea8:   fsub    f25,f0,f13
10001eac:   addi    r4,r4,16
10001eb0:   fmsub   f28,f20,f1,f24
10001eb4:   lfd     f20,-10672(r8)
10001eb8:   fmul    f5,f10,f5
10001ebc:   addi    r8,r8,16
10001ec0:   stfd    f4,-10672(r22)
10001ec4:   stfd    f5,-10664(r22)
10001ec8:   addi    r22,r22,16
10001ecc:   fmul    f27,f26,f27
10001ed0:   fadd    f24,f6,f29
10001ed4:   fmsub   f29,f25,f1,f21
10001ed8:   fdiv    f28,f28,f23
10001edc:   fmadd   f25,f13,f13,f30
10001ee0:   fadd    f6,f6,f12
10001ee4:   fmadd   f30,f3,f3,f22
10001ee8:   stfd    f27,-10680(r3)
10001eec:   fdiv    f29,f29,f24
10001ef0:   fmadd   f3,f20,f20,f25
10001ef4:   fmul    f20,f13,f11
10001ef8:   fmadd   f7,f30,f31,f7
10001efc:   stfd    f7,-10680(r10)
10001f00:   fmadd   f12,f3,f31,f12
10001f04:   fmul    f13,f20,f6
10001f08:   stfd    f12,-10672(r10)
10001f0c:   stfd    f13,-10672(r3)
10001f10:   addi    r10,r10,16
10001f14:   addi    r3,r3,16
10001f18:   stfd    f28,-10672(r7)
10001f1c:   stfd    f29,-10664(r7)
10001f20:   addi    r7,r7,16
10001f24:   bne     10001e10

--
Bad asm code for the same loop

10001a60:   addis   r27,r9,-435
10001a64:   addis   r12,r11,-2176
10001a68:   lfd     f13,-7440(r27)
10001a6c:   lfd     f10,28344(r12)
10001a70:   addis   r8,r11,-1958
10001a74:   addis   r10,r11,-1740
10001a78:   fsub    f7,f10,f13
10001a7c:   lfd     f8,-704(r8)
10001a80:   lfd     f10,0(r9)
10001a84:   addis   r7,r9,-218
10001a88:   addis   r28,r9,1523
10001a8c:   lfd     f9,-29752(r10)
10001a90:   fadd    f6,f12,f10
10001a94:   fsub    f2,f8,f0
10001a98:   addis   r12,r11,218
10001a9c:   addis   r27,r9,2176
10001aa0:   fadd    f5,f11,f9
10001aa4:   fadd    f11,f11,f12
10001aa8:   addi    r9,r9,8
10001aac:   cmpw    r6,r9
10001ab0:   fmul    f1,f7,f30
10001ab4:   fmul    f7,f13,f13
10001ab8:   fmul    f13,f13,f3
10001abc:   fadd    f31,f5,f6
10001ac0:   lfd     f5,29040(r7)
10001ac4:   fmsub   f2,f2,f29,f1
10001ac8:   fmadd   f1,f0,f0,f7
10001acc:   fmul    f0,f0,f3
10001ad0:   fmul    f6,f13,f6
10001ad4:   stfd    f6,-6728(r28)
10001ad8:   fdiv    f2,f2,f31
10001adc:   fmadd   f5,f5,f5,f1
10001ae0:   fmul    f31,f0,f11
10001ae4:   fmr     f0,f8
10001ae8:   stfd    f31,0(r11)
10001aec:   fmr     f11,f9
10001af0:   addi    r11,r11,8
10001af4:   fadd    f1,f5,f4
10001af8:   fmr     f4,f7
10001afc:   fmadd   f5,f1,f28,f12
10001b00:   fmr     f12,f10
10001b04:   stfd    f5,-28344(r27)
10001b08:   stfd    f2,-29040(r12)
10001b0c:   bne+    10001a60

--

Looking into the differences for both cases, the good code seems to be
traversing the loop in a different way than the bad one, using smaller
displacements for each load/store. The bad case uses bigger displacements.

Also, it looks like we have a bigger unrolling factor on the good case (longer
code, more loads) compared to the bad case.
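
To make the two observations concrete, here is a small C sketch (not taken
from swim) contrasting the two code shapes: one element per iteration with
every access expressed relative to a single base, versus two elements per
iteration with one running pointer per array. The struct layout, the array
length N and both function names are hypothetical, and an optimizing compiler
is free to turn either form into the other.

```c
#include <stddef.h>

#define N 1024   /* hypothetical array length */

/* Hypothetical layout: several arrays packed into one block, much like a
   Fortran COMMON block. */
struct fields {
    double u[N];
    double v[N];
    double out[N];
};

/* One element per iteration; each access is "base + large constant + i". */
void sum_single_base(struct fields *f, size_t n)
{
    for (size_t i = 0; i < n; i++)
        f->out[i] = f->u[i] + f->v[i];
}

/* Two elements per iteration (unroll factor 2); each array is walked by
   its own pointer, so every access is a small offset from that pointer. */
void sum_unrolled_pointers(struct fields *f, size_t n)
{
    const double *u = f->u, *v = f->v;
    double *out = f->out;
    size_t i = 0;

    for (; i + 1 < n; i += 2) {
        out[0] = u[0] + v[0];
        out[1] = u[1] + v[1];
        u += 2; v += 2; out += 2;
    }
    if (i < n)                  /* leftover element when n is odd */
        out[0] = u[0] + v[0];
}
```

Both functions compute the same result; only the addressing pattern and the
amount of work per iteration differ.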


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40029



[Bug middle-end/40029] New: [4.5 Regression] Big degradation on swim/mgrid on powerpc 32/64 after alias improvement merge (gcc r145494)

2009-05-05 Thread luisgpm at linux dot vnet dot ibm dot com
CPU2000's swim and mgrid had ~10% slowdown after the merge of the alias
improvement branch.

GCC was configured with the following:

/gcc/HEAD/configure --target=powerpc64-linux --host=powerpc64-linux
--build=powerpc64-linux --with-cpu=default32 --enable-threads=posix
--enable-shared --enable-__cxa_atexit --with-gmp=/gmp --with-mpfr=mpfr
--with-long-double-128 --enable-decimal-float --enable-secure-plt
--disable-bootstrap --disable-alsa --prefix=/install/gcc/HEAD
build_alias=powerpc64-linux host_alias=powerpc64-linux
target_alias=powerpc64-linux --enable-languages=c,c++,fortran --no-create
--no-recursion

Compile flags used: -m[32|64] -O3 -mcpu=power[4|5|6] -ffast-math
-ftree-loop-linear -funroll-loops -fpeel-loops

Will provide more details soon.


-- 
           Summary: [4.5 Regression] Big degradation on swim/mgrid on
                    powerpc 32/64 after alias improvement merge (gcc
                    r145494)
           Product: gcc
           Version: 4.5.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: middle-end
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: luisgpm at linux dot vnet dot ibm dot com
 GCC build triplet: powerpc*-*-*
  GCC host triplet: powerpc*-*-*
GCC target triplet: powerpc*-*-*


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40029



[Bug middle-end/39976] [4.5 Regression] Big sixtrack degradation on powerpc 32/64 after revision r146817

2009-05-04 Thread luisgpm at linux dot vnet dot ibm dot com


--- Comment #9 from luisgpm at linux dot vnet dot ibm dot com  2009-05-04 
15:50 ---
Here are the configure options, although I think you'll be able to reproduce
it with the flags I've mentioned.

/gcc/HEAD/configure --target=powerpc64-linux --host=powerpc64-linux
--build=powerpc64-linux --with-cpu=default32 --enable-threads=posix
--enable-shared --enable-__cxa_atexit --with-gmp=/gmp --with-mpfr=mpfr
--with-long-double-128 --enable-decimal-float --enable-secure-plt
--disable-bootstrap --disable-alsa --prefix=/install/gcc/HEAD
build_alias=powerpc64-linux host_alias=powerpc64-linux
target_alias=powerpc64-linux --enable-languages=c,c++,fortran --no-create
--no-recursion


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39976



[Bug middle-end/39976] [4.5 Regression] Big sixtrack degradation on powerpc 32/64 after revision r146817

2009-05-04 Thread luisgpm at linux dot vnet dot ibm dot com


--- Comment #8 from luisgpm at linux dot vnet dot ibm dot com  2009-05-04 
15:41 ---
Oops... forgot about that bit, sorry.

Compile flags: -m32 -O3 -mcpu=power[4|5|6] -ffast-math -ftree-loop-linear
-funroll-loops -fpeel-loops

This reproduces on both 32-bit and 64-bit.

Luis


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39976



[Bug middle-end/39976] [4.5 Regression] Big sixtrack degradation on powerpc 32/64 after revision r146817

2009-05-04 Thread luisgpm at linux dot vnet dot ibm dot com


--- Comment #6 from luisgpm at linux dot vnet dot ibm dot com  2009-05-04 
13:50 ---
Just for the sake of information, -fselective-scheduling doesn't help.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39976



[Bug middle-end/39976] [4.5 Regression] Big sixtrack degradation on powerpc 32/64 after revision r146817

2009-04-30 Thread luisgpm at linux dot vnet dot ibm dot com


--- Comment #2 from luisgpm at linux dot vnet dot ibm dot com  2009-04-30 
19:38 ---
Created an attachment (id=17786)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=17786&action=view)
Last tree pass for the bad code


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39976



[Bug middle-end/39976] [4.5 Regression] Big sixtrack degradation on powerpc 32/64 after revision r146817

2009-04-30 Thread luisgpm at linux dot vnet dot ibm dot com


--- Comment #1 from luisgpm at linux dot vnet dot ibm dot com  2009-04-30 
19:29 ---
ASM code for the bad loop

.L145:
fmul 10,8,13
fmul 5,8,0
addis 3,4,0xffe5
lfd 22,8(7)
addi 7,4,8
lfd 6,9472(3)
fmadd 10,9,0,10
fmsub 23,9,13,5
fmul 2,10,22
fmul 9,10,6
fmr 7,23
fmsub 25,23,6,2
fmadd 26,23,22,9
fadd 12,12,25
fadd 11,11,26
.L93:
fmul 8,10,13
fmul 22,10,0
addis 3,7,0xffe5
lfd 21,0(7)
addi 4,7,8
lfd 25,9472(3)
fmadd 8,7,0,8
fmsub 9,7,13,22
fmul 23,8,21
fmul 26,8,25
fmsub 24,9,25,23
fmadd 7,9,21,26
fadd 12,12,24
fadd 11,11,7
bdnz .L145
stfd 9,472(1)
mr 7,8
lwz 3,472(1)
lwz 4,476(1)


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39976



[Bug rtl-optimization/38373] 32-bit Vortex degradation on PPC due to bad RTL aliasing

2009-04-30 Thread luisgpm at linux dot vnet dot ibm dot com


--- Comment #3 from luisgpm at linux dot vnet dot ibm dot com  2009-04-30 
16:33 ---
This is already in 4.4, but we would like to add additional checks to 4.5 that
would have been risky to put in 4.4 (since it was about to be released). I
have the additional patch and will attach it soon.

Sorry it took so long to reply.

Luis


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38373



[Bug regression/39976] New: [4.5 Regression] Big sixtrack degradation on powerpc 32/64 after revision r146817

2009-04-30 Thread luisgpm at linux dot vnet dot ibm dot com
We have a hot spot on sixtrack in a function called thin6d.

The old (pre-146817) gcc generates this loop as a single BB, so the only way
into the loop is to fall through into it from the preceding code.

The post-146817 gcc splits that loop into two BBs, so that we can actually
branch into the middle of the loop on the first iteration; after that, the
loop runs just like in pre-146817.
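
As a rough illustration of that control-flow shape (not the thin6d source), a
loop body unrolled by two can be entered at its second half when the trip
count is odd. The sketch below writes this by hand in C; the function name is
hypothetical, and whether this is how the two BBs actually arise in sixtrack
is an assumption.

```c
#include <stddef.h>

/* Hypothetical example: sum n doubles with a body unrolled by two.
   For odd n the first pass enters at "second_half", i.e. in the middle
   of the loop; afterwards the loop runs normally. */
double sum_mid_entry(const double *x, size_t n)
{
    double s = 0.0;
    size_t i = 0;

    if (n == 0)
        return s;
    if (n % 2 != 0)
        goto second_half;       /* odd count: skip the first half once */
    do {
        s += x[i++];            /* first half of the loop body */
second_half:
        s += x[i++];            /* second half of the loop body */
    } while (i < n);
    return s;
}
```

Because "second_half" has two predecessors, the loop body can no longer be a
single basic block, which limits how freely instructions can be scheduled
across the label.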

The degradation comes from the fact that the creation of two BB's breaks good
scheduling of instructions inside that loop, like this:

Good code: All the fp load instructions are grouped in the upper portion of the
code.

fmul    f22,f11,f13
fmul    f23,f11,f0
addis   r12,r6,-27
lfd     f3,0(r6)
addi    r4,r6,8
lfd     f1,9472(r12)
addis   r12,r4,-27
fmadd   f8,f12,f0,f22
fmsub   f4,f12,f13,f23
lfd     f22,9472(r12)
lfd     f23,8(r6)
addi    r6,r4,8
fmul    f11,f8,f13
fmul    f24,f8,f1
fmul    f25,f8,f3
fmul    f5,f8,f0
fmadd   f11,f4,f0,f11
fmadd   f21,f4,f3,f24
fmsub   f2,f4,f1,f25
fmsub   f12,f4,f13,f5
fmul    f1,f11,f23
fmul    f8,f11,f22
fadd    f9,f9,f21
fadd    f10,f10,f2
fmsub   f24,f12,f22,f1
fmadd   f25,f12,f23,f8
fadd    f10,f10,f24
fadd    f9,f9,f25
bdnz    100ca878

Bad code: The second pair of loads is pushed down into the second BB, causing
slowdowns.

fmul    f5,f8,f0
addis   r3,r4,-27
lfd     f22,8(r7)
addi    r7,r4,8
lfd     f6,9472(r3)
fmadd   f10,f9,f0,f10
fmsub   f23,f9,f13,f5
fmul    f2,f10,f22
fmul    f9,f10,f6
fmr     f7,f23
fmsub   f25,f23,f6,f2
fmadd   f26,f23,f22,f9
fadd    f12,f12,f25
fadd    f11,f11,f26
fmul    f8,f10,f13
>> BB mark
fmul    f22,f10,f0
addis   r3,r7,-27
lfd     f21,0(r7)
addi    r4,r7,8
lfd     f25,9472(r3)
fmadd   f8,f7,f0,f8
fmsub   f9,f7,f13,f22
fmul    f23,f8,f21
fmul    f26,f8,f25
fmsub   f24,f9,f25,f23
fmadd   f7,f9,f21,f26
fadd    f12,f12,f24
fadd    f11,f11,f7
bdnz    100c9fe0


-- 
           Summary: [4.5 Regression] Big sixtrack degradation on powerpc
                    32/64 after revision r146817
           Product: gcc
           Version: 4.5.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: regression
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: luisgpm at linux dot vnet dot ibm dot com
 GCC build triplet: powerpc*-*-*
  GCC host triplet: powerpc*-*-*
GCC target triplet: powerpc*-*-*


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39976



[Bug rtl-optimization/38373] 32-bit Vortex degradation on PPC due to bad RTL aliasing

2009-01-09 Thread luisgpm at linux dot vnet dot ibm dot com


--- Comment #1 from luisgpm at linux dot vnet dot ibm dot com  2009-01-09 
18:00 ---
Created an attachment (id=17065)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=17065&action=view)
Second part of the combined patch

Additional check to avoid returning a NULL base. This is a placeholder for a
4.5 fix.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38373



[Bug rtl-optimization/38373] New: 32-bit Vortex degradation on PPC due to bad RTL aliasing

2008-12-02 Thread luisgpm at linux dot vnet dot ibm dot com
The handling of LO_SUM by alias.c:find_base_term causes a degradation on 32-bit
vortex on PPC when used with the new REG_POINTER attribute. 

Making "find_base_term" handle LO_SUM the same way as alias.c:find_base_value
fixes the problem. Preventing "find_base_term" from returning NULL so easily
also fixes the problem.

A fix for 4.5 will most probably be a combination of these two approaches.
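
The difference between the two treatments of LO_SUM can be pictured with a toy
model. The code below is NOT GCC's alias.c and does not use GCC's rtx type;
the node type, the "is_pointer" flag and the "find_base" function are
hypothetical stand-ins, and the sketch assumes an address of the form
(lo_sum reg sym). How the real find_base_term treated LO_SUM before the fix is
not shown in this report.

```c
#include <stddef.h>

/* Toy expression node; a hypothetical stand-in, not GCC's rtx. */
enum kind { REG, SYMBOL_REF, PLUS, LO_SUM };

struct expr {
    enum kind kind;
    int is_pointer;               /* toy analogue of a pointer-register flag */
    const struct expr *op0, *op1; /* operands of PLUS / LO_SUM, else NULL */
};

/* Return the base of an address expression, or NULL if none is found.
   When lo_sum_uses_op1 is nonzero, LO_SUM looks only at its second
   operand (the symbol); otherwise it is treated like PLUS. */
const struct expr *find_base(const struct expr *x, int lo_sum_uses_op1)
{
    if (x == NULL)
        return NULL;
    switch (x->kind) {
    case SYMBOL_REF:
        return x;
    case REG:
        return x->is_pointer ? x : NULL;
    case LO_SUM:
        if (lo_sum_uses_op1)
            return find_base(x->op1, lo_sum_uses_op1);
        /* fall through */
    case PLUS:
        return find_base(x->op0, lo_sum_uses_op1);
    }
    return NULL;
}
```

In this toy, a (lo_sum reg sym) whose register is not flagged as a pointer
yields NULL under the PLUS-like treatment but yields the symbol when LO_SUM
looks at its second operand, which is the flavor of "returning NULL so easily"
described above.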


-- 
           Summary: 32-bit Vortex degradation on PPC due to bad RTL aliasing
           Product: gcc
           Version: 4.4.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: rtl-optimization
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: luisgpm at linux dot vnet dot ibm dot com
 GCC build triplet: powerpc*-*-*
  GCC host triplet: powerpc*-*-*
GCC target triplet: powerpc*-*-*


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38373



[Bug tree-optimization/37686] [4.4 Regression] Building of CPU2000's bzip2 with peak flags with -mcpu=power4 fails with an ICE.

2008-10-03 Thread luisgpm at linux dot vnet dot ibm dot com


--- Comment #21 from luisgpm at linux dot vnet dot ibm dot com  2008-10-03 
20:59 ---
It fixes the bzip2 ICE.

Thanks,
Luis


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37686



[Bug middle-end/37447] [4.4 Regression] test pr28982b.c fails execution on power4 or later with ira change

2008-10-01 Thread luisgpm at linux dot vnet dot ibm dot com


--- Comment #5 from luisgpm at linux dot vnet dot ibm dot com  2008-10-02 
01:43 ---
This problem also showed up as a CPU2000 regression in the Sixtrack benchmark
for PPC64, causing problems in the ordering of ld/st instructions.

A GCC patched with Richard's fix produced the right code and the regression is
gone.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37447



[Bug regression/37686] [4.4 Regression] Building of CPU2000's bzip2 with peak flags with -mcpu=power4 fails with an ICE.

2008-10-01 Thread luisgpm at linux dot vnet dot ibm dot com


--- Comment #7 from luisgpm at linux dot vnet dot ibm dot com  2008-10-01 
17:44 ---
I can still get it to ICE with the same flags on a native system. Is there any
other info you'd like to have available?

I have a more reduced source, will post it soon.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37686



[Bug regression/37686] [4.4 Regression] Building of CPU2000's bzip2 with peak flags with -mcpu=power4 fails with an ICE.

2008-10-01 Thread luisgpm at linux dot vnet dot ibm dot com


--- Comment #5 from luisgpm at linux dot vnet dot ibm dot com  2008-10-01 
13:19 ---
I'm still trying to minimize the source even further. I will attach it when I
have something better.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37686



[Bug regression/37686] [4.4 Regression] Building of CPU2000's bzip2 with peak flags with -mcpu=power4 fails with an ICE.

2008-10-01 Thread luisgpm at linux dot vnet dot ibm dot com


--- Comment #4 from luisgpm at linux dot vnet dot ibm dot com  2008-10-01 
13:13 ---
Created an attachment (id=16441)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=16441&action=view)
Reduced source for bzip2.c

Indented reduced source.


-- 

luisgpm at linux dot vnet dot ibm dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #16440|0                           |1
        is obsolete|                            |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37686



[Bug regression/37686] [4.4 Regression] Building of CPU2000's bzip2 with peak flags with -mcpu=power4 fails with an ICE.

2008-10-01 Thread luisgpm at linux dot vnet dot ibm dot com


--- Comment #3 from luisgpm at linux dot vnet dot ibm dot com  2008-10-01 
13:10 ---
Created an attachment (id=16440)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=16440&action=view)
Reduced source for bzip2.c

Source reduced with delta


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37686



[Bug regression/37686] [4.4 Regression] Building of CPU2000's bzip2 with peak flags with -mcpu=power4 fails with an ICE.

2008-10-01 Thread luisgpm at linux dot vnet dot ibm dot com


--- Comment #2 from luisgpm at linux dot vnet dot ibm dot com  2008-10-01 
13:10 ---
Created an attachment (id=16439)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=16439&action=view)
Preprocessed source for reduced bzip2.c

Preprocessed source.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37686



[Bug regression/37686] New: Building of CPU2000's bzip2 with peak flags with -mcpu=power4 fails with an ICE.

2008-09-30 Thread luisgpm at linux dot vnet dot ibm dot com
After this patch:

http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=140660

Building bzip2 with peak flags AND -mcpu=power4 for ppc (-O3 -mcpu=power4
-ffast-math -funroll-loops -ftree-loop-linear -fpeel-loops) started ICE'ing.

Here is the backtrace:

#0  0x103a2c3c in gimple_bb (g=0x4) at
/home/luis/src/gcc/HEAD/gcc/gimple.h:1071
#1  0x103a2b80 in gsi_start (seq=0xf76882b0) at
/home/luis/src/gcc/HEAD/gcc/gimple.h:4298
#2  0x103a5a48 in gsi_start_phis (bb=0xf7664b40) at
/home/luis/src/gcc/HEAD/gcc/gimple-iterator.c:770
#3  0x105e601c in verify_stmts () at
/home/luis/src/gcc/HEAD/gcc/tree-cfg.c:4177
#4  0x107e80cc in verify_ssa (check_modified_stmt=1 '\001') at
/home/luis/src/gcc/HEAD/gcc/tree-ssa.c:750
#5  0x10476c30 in execute_function_todo (data=0x4043) at
/home/luis/src/gcc/HEAD/gcc/passes.c:999
#6  0x1047657c in do_per_function (callback=0x10476898 ,
data=0x4043) at /home/luis/src/gcc/HEAD/gcc/passes.c:841
#7  0x10476d38 in execute_todo (flags=16451) at
/home/luis/src/gcc/HEAD/gcc/passes.c:1025
#8  0x10477c54 in execute_one_pass (pass=0x10eeabd4) at
/home/luis/src/gcc/HEAD/gcc/passes.c:1301
#9  0x10477ec4 in execute_pass_list (pass=0x10eeabd4) at
/home/luis/src/gcc/HEAD/gcc/passes.c:1327
#10 0x10477ef0 in execute_pass_list (pass=0x10eeaa9c) at
/home/luis/src/gcc/HEAD/gcc/passes.c:1328
#11 0x10477ef0 in execute_pass_list (pass=0x10eea494) at
/home/luis/src/gcc/HEAD/gcc/passes.c:1328
#12 0x106656a8 in tree_rest_of_compilation (fndecl=0xf7e74180) at
/home/luis/src/gcc/HEAD/gcc/tree-optimize.c:418
#13 0x10930d78 in cgraph_expand_function (node=0xf7e78300) at
/home/luis/src/gcc/HEAD/gcc/cgraphunit.c:1038
#14 0x10930fe0 in cgraph_expand_all_functions () at
/home/luis/src/gcc/HEAD/gcc/cgraphunit.c:1097
#15 0x109317a8 in cgraph_optimize () at
/home/luis/src/gcc/HEAD/gcc/cgraphunit.c:1302
#16 0x10033998 in c_write_global_declarations () at
/home/luis/src/gcc/HEAD/gcc/c-decl.c:8073
#17 0x105c9624 in compile_file () at /home/luis/src/gcc/HEAD/gcc/toplev.c:979
#18 0x105cc214 in do_compile () at /home/luis/src/gcc/HEAD/gcc/toplev.c:2190
#19 0x105cc2a8 in toplev_main (argc=30, argv=0xff880e44) at
/home/luis/src/gcc/HEAD/gcc/toplev.c:
#20 0x100f90fc in main (argc=30, argv=0xff880e44) at
/home/luis/src/gcc/HEAD/gcc/main.c:35

The failing line is "return g->gsbase.bb;", because "g" isn't really a valid
pointer:

(gdb) p g
$1 = (const_gimple) 0x4


-- 
           Summary: Building of CPU2000's bzip2 with peak flags with
                    -mcpu=power4 fails with an ICE.
           Product: gcc
           Version: 4.4.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: regression
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: luisgpm at linux dot vnet dot ibm dot com
 GCC build triplet: powerpc*-*-*
  GCC host triplet: powerpc*-*-*
GCC target triplet: powerpc*-*-*


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37686



[Bug fortran/37536] [4.3/4.4 Regression] a mfcr is produced instead of branches for DO loops

2008-09-17 Thread luisgpm at linux dot vnet dot ibm dot com


--- Comment #9 from luisgpm at linux dot vnet dot ibm dot com  2008-09-17 
15:22 ---
Gathered some PPC 32/64 performance numbers with the patch (based on 140409).
No noticeable performance regressions were found. 32-bit swim and 64-bit art
got a small speed boost (7.8% and 3.4% respectively).


-- 

luisgpm at linux dot vnet dot ibm dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |luisgpm at linux dot vnet
                   |                            |dot ibm dot com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37536



[Bug tree-optimization/14703] [4.4 regression] Inadequate optimization of inline templated functions, infinite loop in ipa-reference and memory hog

2008-09-10 Thread luisgpm at linux dot vnet dot ibm dot com


--- Comment #12 from luisgpm at linux dot vnet dot ibm dot com  2008-09-11 
05:18 ---
This patch (revision 140068) breaks SPEC2000's 200.sixtrack benchmark for
POWER6 due to miscompares. Reverting this patch solves the problem. Not sure
what specific problem was introduced. I can isolate a testcase if need be.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=14703



[Bug target/34856] [4.2/4.3 Regression] ICE with some constant vectors

2008-09-09 Thread luisgpm at linux dot vnet dot ibm dot com


--- Comment #31 from luisgpm at linux dot vnet dot ibm dot com  2008-09-09 
13:51 ---
I have the fix for PPC. Is there any special reason why this doesn't get
reproduced there? Would it still be worthwhile to include the rs6000-specific
fix for this bug ticket?

Thanks,
Luis


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34856



[Bug c/37171] [4.4 Regression] Canonical spelling optimization dependency

2008-08-20 Thread luisgpm at linux dot vnet dot ibm dot com


--- Comment #9 from luisgpm at linux dot vnet dot ibm dot com  2008-08-20 
16:09 ---
With revision 139317, the numbers for 197.parser are back to normal and the
generated ASM code carries only a single call to __ctype_toupper_loc.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37171



[Bug c/37171] [4.4 Regression] Canonical spelling optimization dependency

2008-08-20 Thread luisgpm at linux dot vnet dot ibm dot com


--- Comment #7 from luisgpm at linux dot vnet dot ibm dot com  2008-08-20 
14:21 ---
The preprocessed sources for strncasecmp.c are exactly the same for both cases.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37171



[Bug c/37171] [4.4 Regression] Canonical spelling optimization dependency

2008-08-20 Thread luisgpm at linux dot vnet dot ibm dot com


--- Comment #6 from luisgpm at linux dot vnet dot ibm dot com  2008-08-20 
14:07 ---
Created an attachment (id=16112)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=16112&action=view)
Generated ASM code for the good case

Unlike in the bad case, the __ctype_toupper_loc function is called only ONCE
in this code.
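
The effect of merging those calls can be mimicked by hand. In the sketch
below, "table_loc" is a hypothetical stand-in for an accessor such as
__ctype_toupper_loc, and the "loc_calls" counter exists only to make the
number of calls visible; neither the names nor the comparison routines are
taken from the attached sources.

```c
#include <stddef.h>

static int table[256];
static int table_ready = 0;
int loc_calls = 0;               /* counts accessor calls, for illustration */

/* Hypothetical accessor returning a pointer to an upper-casing table. */
const int *table_loc(void)
{
    if (!table_ready) {
        for (int c = 0; c < 256; c++)
            table[c] = (c >= 'a' && c <= 'z') ? c - 'a' + 'A' : c;
        table_ready = 1;
    }
    loc_calls++;
    return table;
}

/* Accessor called for every character looked up. */
int compare_each_time(const char *a, const char *b, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        int ca = table_loc()[(unsigned char)a[i]];
        int cb = table_loc()[(unsigned char)b[i]];
        if (ca != cb)
            return ca - cb;
        if (ca == 0)
            break;
    }
    return 0;
}

/* Accessor called once; the returned pointer is reused in the loop. */
int compare_hoisted(const char *a, const char *b, size_t n)
{
    const int *t = table_loc();
    for (size_t i = 0; i < n; i++) {
        int ca = t[(unsigned char)a[i]];
        int cb = t[(unsigned char)b[i]];
        if (ca != cb)
            return ca - cb;
        if (ca == 0)
            break;
    }
    return 0;
}
```

Both routines return the same result; the hoisted form simply does by hand
what the compiler does in the good case when it recognizes that repeated
calls return the same pointer.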


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37171



[Bug c/37171] [4.4 Regression] Canonical spelling optimization dependency

2008-08-20 Thread luisgpm at linux dot vnet dot ibm dot com


--- Comment #5 from luisgpm at linux dot vnet dot ibm dot com  2008-08-20 
14:06 ---
Created an attachment (id=16111)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=16111&action=view)
Generated ASM code for the bad case

Notice that __ctype_toupper_loc is called 6 times in this code.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37171



[Bug c/37171] [4.4 Regression] Canonical spelling optimization dependency

2008-08-20 Thread luisgpm at linux dot vnet dot ibm dot com


--- Comment #4 from luisgpm at linux dot vnet dot ibm dot com  2008-08-20 
14:05 ---
Created an attachment (id=16110)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=16110&action=view)
Preprocessed source for the good case


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37171



[Bug c/37171] [4.4 Regression] Canonical spelling optimization dependency

2008-08-20 Thread luisgpm at linux dot vnet dot ibm dot com


--- Comment #3 from luisgpm at linux dot vnet dot ibm dot com  2008-08-20 
14:04 ---
Created an attachment (id=16109)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=16109&action=view)
Preprocessed source for the bad case


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37171