[Bug target/32523] disastrous scheduling for POWER5

2010-11-03 Thread ccorn at cs dot tu-berlin.de
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32523

Christian Cornelssen ccorn at cs dot tu-berlin.de changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ccorn at cs dot
                   |                            |tu-berlin.de

--- Comment #14 from Christian Cornelssen ccorn at cs dot tu-berlin.de 
2010-11-03 19:32:24 UTC ---
Reproduced the problem on a PowerMac G5 with 2 PPC970MP (4 cores) under MacOS X
10.4.11 (Darwin 8.11.0).

Using the attachment of the original bug report, I compared

a) Apple's version of GCC-4.0 as provided by Xcode 2.5 as /usr/bin/gcc:

  powerpc-apple-darwin8-gcc-4.0.1 (GCC) 4.0.1 (Apple Computer, Inc. build 5370)

b) GCC-4.4.5 as provided by MacPorts:

  gcc-mp-4.4 (GCC) 4.4.5

simply by issuing the command

  make double GCC3=gcc-4.0 GCC4=gcc-mp-4.4

Performance drops by about one third with GCC-4.4.5 compared to Apple's version
of GCC-4.0.1, but is almost fully restored when using -fno-schedule-insns
-fno-rerun-loop-opt with GCC-4.4.5.


[Bug target/32523] disastrous scheduling for POWER5

2010-09-29 Thread fang at csl dot cornell.edu
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32523

--- Comment #9 from David Fang fang at csl dot cornell.edu 2010-09-29 
21:36:02 UTC ---
Out of curiosity, any benchmark updates on more recent releases?


[Bug target/32523] disastrous scheduling for POWER5

2010-09-29 Thread whaley at cs dot utsa.edu
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32523

--- Comment #10 from R. Clint Whaley whaley at cs dot utsa.edu 2010-09-29 
22:22:22 UTC ---
> Out of curiosity, any benchmark updates on more recent releases?

Nope, after several rough experiences I've stopped reporting gcc bugs and
problems.  It usually takes weeks of my time, and I think only once or twice
has the problem been fixed because of my report, which is typically reported as
invalid by Pinski right up until it is fixed.  Usually the problem gets fixed
accidentally by other updates if it is ever fixed at all.

I've started to just rewrite things to ameliorate gcc problems.  I'll only
report problems if I can't get anything workable with this approach, since
rewriting whole code generators is faster than getting anyone here to confirm,
much less fix gcc problems.  I've largely insulated myself from all the gcc
performance regressions that used to cripple my library by extensive use of
assembly, which allows me to help my users even while gcc remains terribly
slow.

I don't think I'm the only developer who has been forced to take this path.

Cheers,
Clint


[Bug target/32523] disastrous scheduling for POWER5

2010-09-29 Thread pinskia at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32523

--- Comment #11 from Andrew Pinski pinskia at gcc dot gnu.org 2010-09-29 
22:39:14 UTC ---
(In reply to comment #10)
> which is typically reported as invalid by Pinski right up until it is fixed.

I just looked into the bugs which you have filed and saw a different pattern. 
I think you are putting too much blame on me.  This is ok as I am the one who
normally touches almost every bug.  In the bugs you filed, I noticed one where
I made a comment which was supposed to be interpreted as an internal developer
comment rather than one about your code.

In another one (PR 30599), the problem was in your code as you were requesting
a truncation to happen; yes we went back and forth on that one but you
requested the truncation and GCC actually did it in that case.  In another it
was about a warning generated because of glibc marking a function to be warned
about.  In another one, GCC did not build because of an older version of Xcode
in Mac OS X.  In another the bug was marked as won't fix in the end but not by
me.

So please be more careful when you say I close bugs as invalid right up until
they are fixed.  Yes, it has happened to one bug in the past (though I
still think that bug was invalid; I cannot remember the number right now).


Really, I should have ignored this trolling.


[Bug target/32523] disastrous scheduling for POWER5

2010-09-29 Thread whaley at cs dot utsa.edu
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32523

--- Comment #12 from R. Clint Whaley whaley at cs dot utsa.edu 2010-09-29 
23:10:50 UTC ---
Andrew,

I'm certainly unsurprised that you disagree with me, since I don't think we
have ever agreed on anything in something like 5 years.  To get an idea of what
I'm talking about, scope:
   http://gcc.gnu.org/bugzilla/show_bug.cgi?id=27827
where I have to show the problem affects almost every x86 architecture in use
at that time until someone admits it is a problem (somewhere around comment
#25, I think). I don't believe you ever said it was a problem.

How about this bug, still unconfirmed 3 years after I posted the benchmark
showing it?  

How about this beauty:
   http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38496

Obviously, I disagree with your summary of our interactions on the x87 gcc
arbitrary rounding bug, but at least people can scope the link to see if they
agree with your description. Unfortunately, several similar things I reported
years ago have aged out of the system.

If you could point out any report that I sent in where you agreed that it was a
bug or a problem before someone else did, maybe we can dispel my feeling that
you are someone who just routinely marks things as unimportant regardless of
the facts.

Regards,
Clint


[Bug target/32523] disastrous scheduling for POWER5

2010-09-29 Thread pinskia at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32523

--- Comment #13 from Andrew Pinski pinskia at gcc dot gnu.org 2010-09-29 
23:20:29 UTC ---
(In reply to comment #12)
> Andrew,
>
> I'm certainly unsurprised that you disagree with me, since I don't think we
> have ever agreed on anything in something like 5 years.  To get an idea of
> what I'm talking about, scope:
>    http://gcc.gnu.org/bugzilla/show_bug.cgi?id=27827

And if you look at the history of those two bugs, you will notice I did not
close them as invalid at all.  I might have suggested they were but I never
closed them as such.  I had left them for people who would analyze them
better.  So you first said I marked it as invalid which was not true as the
history on the bug report does not lie.

For this bug, the problem is that the first pass of the scheduler increases the
live ranges of variables, which causes the register allocator not to do a good
job.  There are other bugs which already record that fact too (I don't know them
offhand but you can find them by searching for -fno-schedule-insns).  It is
a well-known issue, which is exactly what I mentioned in comment #2, and it has
been improved.  Nobody might have tested your testcase again, which is why
someone finally decided to ask you if you want to test it.  As I mentioned in
this bug report, you were testing a heavily modified 3.3.3 (I know because
unit-at-a-time was included in SUSE's 3.3).
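
The interaction described in this comment (scheduling before register
allocation stretching live ranges) can be illustrated with a toy model.  The
sketch below is not GCC's scheduler or register allocator; max_pressure and
dot_product are hypothetical names, and the instruction lists model a
simplified unrolled dot product rather than the attached testcase.  It only
counts how many values are live at once under two instruction orders.

```python
# Toy model (an illustration, NOT GCC's actual algorithm): each instruction is
# a pair (destination, [source operands]).  A value is live from its
# definition until its last use; register pressure is the number of values
# that are live at the same time.

def max_pressure(instructions):
    """Return the peak number of simultaneously live values."""
    last_use = {}
    for index, (_dest, sources) in enumerate(instructions):
        for name in sources:
            last_use[name] = index
    live = set()
    peak = 0
    for index, (dest, sources) in enumerate(instructions):
        # Operands read for the last time here die, freeing their registers.
        for name in sources:
            if last_use[name] == index:
                live.discard(name)
        # A result only occupies a register if something reads it later.
        if dest in last_use:
            live.add(dest)
        peak = max(peak, len(live))
    return peak

def dot_product(pairs, hoist_loads):
    """Build a hypothetical unrolled dot product with `pairs` load pairs."""
    loads = [[(f"a{i}", []), (f"b{i}", [])] for i in range(pairs)]
    fmas = [(f"s{i}", ([f"s{i - 1}"] if i else []) + [f"a{i}", f"b{i}"])
            for i in range(pairs)]
    store = (None, [f"s{pairs - 1}"])
    if hoist_loads:
        # All loads first, as a latency-hiding scheduler might arrange them.
        body = [ins for pair in loads for ins in pair] + fmas
    else:
        # Each pair of loads is consumed immediately by its multiply-add.
        body = [ins for pair, fma in zip(loads, fmas) for ins in pair + [fma]]
    return body + [store]

print("interleaved:", max_pressure(dot_product(4, hoist_loads=False)))  # 3
print("hoisted:    ", max_pressure(dot_product(4, hoist_loads=True)))   # 8
```

When the peak exceeds the number of available registers, an allocator has to
spill, which is one plausible way the first scheduling pass could cost
performance here.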


[Bug target/32523] disastrous scheduling for POWER5

2007-06-28 Thread whaley at cs dot utsa dot edu


--- Comment #8 from whaley at cs dot utsa dot edu  2007-06-28 14:18 ---
I've been doing further testing on the g5 (the only machine where I have local
and root access), and this problem does not occur with stock gcc 4.1.1 either. 
Therefore, whatever problem is avoided by throwing -fno-schedule-insns was not
in 4.1.1.

BTW, as on the Power5, the best kernel does not get all its performance back
by throwing this flag, even though the simplified example does.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32523



[Bug target/32523] disastrous scheduling for POWER5

2007-06-27 Thread pinskia at gcc dot gnu dot org


--- Comment #2 from pinskia at gcc dot gnu dot org  2007-06-27 16:25 ---
PowerPC970FX is not a direct descendent of Power5.  It is a descendent of the
970 which is a heavily modified Power4.  Power5 is the direct descendent of the
Power4 though, at least in terms of scheduling (I don't know if in terms of the
hardware itself).  So at best they are siblings rather than descendents of one
another.

The main thing is that you turned off the first scheduling pass, which runs
before the register allocator, so I think the register allocator is messing
up (which is already known).  The other question is: what options are you using
to invoke GCC?  Power5 support inside GCC was not added until at least 3.4
(maybe it was 4.0).


-- 

pinskia at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|c                           |target


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32523



[Bug target/32523] disastrous scheduling for POWER5

2007-06-27 Thread pinskia at gcc dot gnu dot org


--- Comment #3 from pinskia at gcc dot gnu dot org  2007-06-27 16:27 ---
> I have been trying to install gcc 4.2 on PowerPC970FX, but so far no luck (it
> doesn't seem to like MacOSX).

I have no problems installing GCC on Mac OS X 10.4.8/9/10.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32523



[Bug target/32523] disastrous scheduling for POWER5

2007-06-27 Thread pinskia at gcc dot gnu dot org


--- Comment #5 from pinskia at gcc dot gnu dot org  2007-06-27 17:05 ---
Well, the 3.3.3 you are using is a heavily modified 3.3.3 which has the power5
support backported and much other stuff.


-- 

pinskia at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|c                           |target


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32523



[Bug target/32523] disastrous scheduling for POWER5

2007-06-27 Thread whaley at cs dot utsa dot edu


--- Comment #6 from whaley at cs dot utsa dot edu  2007-06-27 19:09 ---
Andrew,

OK, I installed stock gnu gcc 3.4.6:
   78n04 TEST/MMBENCH_PPC ~/local/gcc-3.4.6/bin/gcc -v
Reading specs from /u/noibm122/local/gcc-3.4.6/lib/gcc/powerpc64-unknown-linux-gnu/3.4.6/specs
Configured with: ../configure --prefix=/u/noibm122/local/gcc-3.4.6 --enable-languages=c
Thread model: posix
gcc version 3.4.6

and I get the exact same behavior as with the modified gcc 3 (it accepts the
power5 flags and everything).  So, it would seem something that used to work in
the stock gcc is now broken . . .

Thanks,
Clint


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32523



[Bug target/32523] disastrous scheduling for POWER5

2007-06-27 Thread whaley at cs dot utsa dot edu


--- Comment #7 from whaley at cs dot utsa dot edu  2007-06-28 05:25 ---
This problem affects the g5/970 as well:

Darwin. uname -a
Darwin etl-g52.cs.utsa.edu 8.10.0 Darwin Kernel Version 8.10.0: Wed May 23
16:50:59 PDT 2007; root:xnu-792.21.3~1/RELEASE_PPC Power Macintosh powerpc

Darwin. make all
/usr/bin/gcc-3.3 -DREPS=1000 -DWALL -O3 -c mmbench.c
/usr/bin/gcc-3.3 -DREPS=1000 -DWALL -O3 -c dgemm_atlas.c
/usr/bin/gcc-3.3 -DREPS=1000 -DWALL -O3 -o xdmm_gcc3 mmbench.o dgemm_atlas.o
rm -f *.o
/Users/whaley/local/gcc-4.2/bin/gcc -DREPS=1000 -DWALL -mcpu=970 -mtune=970 -O3 -m64 -c mmbench.c
/Users/whaley/local/gcc-4.2/bin/gcc -DREPS=1000 -DWALL -mcpu=970 -mtune=970 -O3 -m64 -c dgemm_atlas.c
/Users/whaley/local/gcc-4.2/bin/gcc -DREPS=1000 -DWALL -mcpu=970 -mtune=970 -O3 -m64 -o xdmm_gcc4 mmbench.o dgemm_atlas.o
rm -f *.o
/Users/whaley/local/gcc-4.2/bin/gcc -DREPS=1000 -DWALL -mcpu=970 -mtune=970 -O3 -m64 -c mmbench.c
/Users/whaley/local/gcc-4.2/bin/gcc -DREPS=1000 -DWALL -mcpu=970 -mtune=970 -O3 -m64 -fno-schedule-insns -fno-rerun-loop-opt -c \
dgemm_atlas.c
/Users/whaley/local/gcc-4.2/bin/gcc -DREPS=1000 -DWALL -mcpu=970 -mtune=970 -O3 -m64 -o xdmm_gcc4_nosched mmbench.o dgemm_atlas.o
rm -f *.o
echo GCC 3.x performance:

GCC 3.x performance:
./xdmm_gcc3
ALGORITHM     NB   REPS        TIME      MFLOPS
=========  =====  =====  ==========  ==========

atlasmm       40   1000       0.021     6212.39

echo GCC 4.2 performance:
GCC 4.2 performance:
./xdmm_gcc4
ALGORITHM     NB   REPS        TIME      MFLOPS
=========  =====  =====  ==========  ==========

atlasmm       40   1000       0.026     4905.34

echo GCC 4.2 w/o scheduling performance:
GCC 4.2 w/o scheduling performance:
./xdmm_gcc4_nosched
ALGORITHM     NB   REPS        TIME      MFLOPS
=========  =====  =====  ==========  ==========

atlasmm       40   1000       0.020     6291.78
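
The figures above can be turned into relative numbers with a few lines of
arithmetic.  The MFLOPS values are copied from the output; the flop count of
2*NB^3 per repetition is an assumption about how mmbench.c counts work (it is
the usual count for a matrix multiply), checked here only against the rounded
TIME column.  The function names are made up for this sketch.

```python
# MFLOPS figures copied from the benchmark output above.
MFLOPS = {"gcc3": 6212.39, "gcc4": 4905.34, "gcc4_nosched": 6291.78}
NB, REPS = 40, 1000

def relative_change(new, baseline):
    """Percentage change of `new` relative to `baseline`."""
    return 100.0 * (new - baseline) / baseline

def implied_seconds(mflops, nb=NB, reps=REPS):
    """Run time implied by an MFLOPS figure, ASSUMING 2*nb**3 flops per rep."""
    return 2 * nb**3 * reps / (mflops * 1e6)

for name in ("gcc4", "gcc4_nosched"):
    change = relative_change(MFLOPS[name], MFLOPS["gcc3"])
    print(f"{name}: {change:+.1f}% relative to gcc3")
# prints:
#   gcc4: -21.0% relative to gcc3
#   gcc4_nosched: +1.3% relative to gcc3
```

On this G5 the 4.2 build is roughly 21% slower than the 3.3 build, and
disabling the first scheduling pass closes that gap for this kernel.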


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32523