[Bug tree-optimization/38328] Massive performance regression for jpeg_idct_islow

2008-11-30 Thread hjl dot tools at gmail dot com


--- Comment #1 from hjl dot tools at gmail dot com  2008-11-30 15:00 ---
Can you try -fno-ira to see if it fixes the problem?


-- 

hjl dot tools at gmail dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hjl dot tools at gmail dot com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38328




2008-11-30 Thread sgunderson at bigfoot dot com


--- Comment #2 from sgunderson at bigfoot dot com  2008-11-30 15:06 ---
OK, I looked at the source. The issue here seems to be that 4.4 likes to
compile this:

z3 = ((z3) * (- ((INT32) 16069)));

into this:

10  0.0403 : 805cc87:   lea    (%ecx,%ecx,4),%ebx
           : 805cc8a:   lea    (%ebx,%ebx,4),%ebx
20  0.0805 : 805cc8d:   lea    (%ebx,%ebx,4),%ebx
 7  0.0282 : 805cc90:   lea    (%ecx,%ebx,2),%ebx
 3  0.0121 : 805cc93:   shl    $0x4,%ebx
38  0.1530 : 805cc96:   add    %ecx,%ebx
 8  0.0322 : 805cc98:   lea    (%ecx,%ebx,4),%esi

4.3 uses imul here, which is a lot faster.



2008-11-30 Thread rguenth at gcc dot gnu dot org


--- Comment #3 from rguenth at gcc dot gnu dot org  2008-11-30 16:23 ---
Which tuning are you using?  Try enabling -mtune=generic (possibly by default).



2008-11-30 Thread sgunderson at bigfoot dot com


--- Comment #4 from sgunderson at bigfoot dot com  2008-11-30 20:32 ---
Subject: Re:  Massive performance regression for jpeg_idct_islow

On Sun, Nov 30, 2008 at 04:23:31PM -, rguenth at gcc dot gnu dot org wrote:
> Which tuning are you using?  Try enabling -mtune=generic (possibly by
> default).

The compile flags are -g -O2 -D_REENTRANT, IIRC. No weird compile options.

/* Steinar */



2008-11-30 Thread rguenth at gcc dot gnu dot org


--- Comment #5 from rguenth at gcc dot gnu dot org  2008-11-30 20:37 ---
What is the gcc output if you append -v?



2008-11-30 Thread sgunderson at bigfoot dot com


--- Comment #6 from sgunderson at bigfoot dot com  2008-11-30 20:40 ---
Subject: Re:  Massive performance regression for jpeg_idct_islow

On Sun, Nov 30, 2008 at 08:37:31PM -, rguenth at gcc dot gnu dot org wrote:
> --- Comment #5 from rguenth at gcc dot gnu dot org  2008-11-30 20:37 ---
> What is the gcc output if you append -v?

fugl:~> /usr/lib/gcc-snapshot/bin/gcc -v
Using built-in specs.
Target: i486-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Debian 20081117-1'
--with-bugurl=file:///usr/share/doc/gcc-snapshot/README.Bugs
--enable-languages=c,c++,java,fortran,objc,obj-c++,ada
--prefix=/usr/lib/gcc-snapshot --enable-shared --with-system-zlib --disable-nls
--enable-clocale=gnu --enable-libstdcxx-debug --enable-java-awt=gtk
--enable-gtk-cairo --disable-plugin
--with-java-home=/usr/lib/gcc-snapshot/java-1.5.0-gcj-4.4-1.5.0.0/jre
--enable-java-home --with-jvm-root-dir=/usr/lib/gcc-snapshot/jvm
--with-jvm-jar-dir=/usr/lib/gcc-snapshot/jvm-exports
--with-ecj-jar=/usr/share/java/eclipse-ecj.jar --enable-objc-gc --enable-mpfr
--enable-targets=all --enable-cld --disable-werror --build=i486-linux-gnu
--host=i486-linux-gnu --target=i486-linux-gnu
Thread model: posix
gcc version 4.4.0 20081117 (experimental) [trunk revision 141948] (Debian
20081117-1) 

/* Steinar */



2008-11-30 Thread rguenth at gcc dot gnu dot org


--- Comment #7 from rguenth at gcc dot gnu dot org  2008-11-30 21:04 ---
Append -v to the command-line you use for compiling ;)  Seriously, if using
-mtune=generic works then this is a Debian packaging issue of their
gcc-snapshot compiler.



2008-11-30 Thread sgunderson at bigfoot dot com


--- Comment #8 from sgunderson at bigfoot dot com  2008-11-30 21:19 ---
Subject: Re:  Massive performance regression for jpeg_idct_islow

On Sun, Nov 30, 2008 at 09:04:07PM -, rguenth at gcc dot gnu dot org wrote:
> Append -v to the command-line you use for compiling ;)  Seriously, if using
> -mtune=generic works then this is a Debian packaging issue of their
> gcc-snapshot compiler.

fugl:~/nmu/libjpeg6b-6b> /usr/lib/gcc-snapshot/bin/gcc -D_REENTRANT -g -Wall
-O2 -g -I. -c ./jidctint.c  -fPIC -DPIC -o .libs/jidctint.o -v
Using built-in specs.
Target: i486-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Debian 20081117-1'
--with-bugurl=file:///usr/share/doc/gcc-snapshot/README.Bugs
--enable-languages=c,c++,java,fortran,objc,obj-c++,ada
--prefix=/usr/lib/gcc-snapshot --enable-shared --with-system-zlib --disable-nls
--enable-clocale=gnu --enable-libstdcxx-debug --enable-java-awt=gtk
--enable-gtk-cairo --disable-plugin
--with-java-home=/usr/lib/gcc-snapshot/java-1.5.0-gcj-4.4-1.5.0.0/jre
--enable-java-home --with-jvm-root-dir=/usr/lib/gcc-snapshot/jvm
--with-jvm-jar-dir=/usr/lib/gcc-snapshot/jvm-exports
--with-ecj-jar=/usr/share/java/eclipse-ecj.jar --enable-objc-gc --enable-mpfr
--enable-targets=all --enable-cld --disable-werror --build=i486-linux-gnu
--host=i486-linux-gnu --target=i486-linux-gnu
Thread model: posix
gcc version 4.4.0 20081117 (experimental) [trunk revision 141948] (Debian
20081117-1) 
COLLECT_GCC_OPTIONS='-D_REENTRANT' '-g' '-Wall' '-O2' '-g' '-I.' '-c' '-fPIC'
'-DPIC' '-o' '.libs/jidctint.o' '-v' '-mtune=i486'
 /usr/lib/gcc-snapshot/libexec/gcc/i486-linux-gnu/4.4.0/cc1 -quiet -v -I.
-D_REENTRANT -DPIC ./jidctint.c -quiet -dumpbase jidctint.c -mtune=i486
-auxbase-strip .libs/jidctint.o -g -g -O2 -Wall -version -fPIC -o
/tmp/cc5hqg0m.s
ignoring nonexistent directory /usr/local/include/i486-linux-gnu
ignoring nonexistent directory
/usr/lib/gcc-snapshot/lib/gcc/i486-linux-gnu/4.4.0/../../../../i486-linux-gnu/include
ignoring nonexistent directory /usr/include/i486-linux-gnu
#include "..." search starts here:
#include <...> search starts here:
 .
 /usr/local/include
 /usr/lib/gcc-snapshot/include
 /usr/lib/gcc-snapshot/lib/gcc/i486-linux-gnu/4.4.0/include
 /usr/lib/gcc-snapshot/lib/gcc/i486-linux-gnu/4.4.0/include-fixed
 /usr/include
End of search list.
GNU C (Debian 20081117-1) version 4.4.0 20081117 (experimental) [trunk revision
141948] (i486-linux-gnu)
compiled by GNU C version 4.4.0 20081117 (experimental) [trunk revision
141948], GMP version 4.2.2, MPFR version 2.3.2.
GGC heuristics: --param ggc-min-expand=30 --param ggc-min-heapsize=4096
Compiler executable checksum: 445209552aa2d93e7e967b7473e83cd6
COLLECT_GCC_OPTIONS='-D_REENTRANT' '-g' '-Wall' '-O2' '-g' '-I.' '-c' '-fPIC'
'-DPIC' '-o' '.libs/jidctint.o' '-v' '-mtune=i486'
 as -V -Qy -o .libs/jidctint.o /tmp/cc5hqg0m.s
GNU assembler version 2.18.0 (i486-linux-gnu) using BFD version (GNU Binutils
for Debian) 2.18.0.20080103
COMPILER_PATH=/usr/lib/gcc-snapshot/libexec/gcc/i486-linux-gnu/4.4.0/:/usr/lib/gcc-snapshot/libexec/gcc/i486-linux-gnu/4.4.0/:/usr/lib/gcc-snapshot/libexec/gcc/i486-linux-gnu/:/usr/lib/gcc-snapshot/lib/gcc/i486-linux-gnu/4.4.0/:/usr/lib/gcc-snapshot/lib/gcc/i486-linux-gnu/:/usr/lib/gcc/i486-linux-gnu/
LIBRARY_PATH=/usr/lib/gcc-snapshot/lib/gcc/i486-linux-gnu/4.4.0/:/usr/lib/gcc-snapshot/lib/gcc/i486-linux-gnu/4.4.0/../../../../lib/:/lib/../lib/:/usr/lib/../lib/:/usr/lib/gcc-snapshot/lib/gcc/i486-linux-gnu/4.4.0/../../../:/lib/:/usr/lib/
COLLECT_GCC_OPTIONS='-D_REENTRANT' '-g' '-Wall' '-O2' '-g' '-I.' '-c' '-fPIC'
'-DPIC' '-o' '.libs/jidctint.o' '-v' '-mtune=i486'

-mtune=generic still produces these long series of leas.

/* Steinar */



2008-11-30 Thread sgunderson at bigfoot dot com


--- Comment #9 from sgunderson at bigfoot dot com  2008-11-30 21:22 ---
Subject: Re:  Massive performance regression for jpeg_idct_islow

On Sun, Nov 30, 2008 at 09:19:08PM -, sgunderson at bigfoot dot com wrote:
> -mtune=generic still produces these long series of leas.

Sorry, I objdumped the wrong file. -mtune=generic appears to fix it (although
I haven't checked the performance).

/* Steinar */



2008-11-30 Thread rguenth at gcc dot gnu dot org


--- Comment #10 from rguenth at gcc dot gnu dot org  2008-11-30 21:29 ---

/usr/lib/gcc-snapshot/libexec/gcc/i486-linux-gnu/4.4.0/cc1 -quiet -v -I.
-D_REENTRANT -DPIC ./jidctint.c -quiet -dumpbase jidctint.c -mtune=i486
-auxbase-strip .libs/jidctint.o -g -g -O2 -Wall -version -fPIC -o
/tmp/cc5hqg0m.s


so it uses -mtune=i486 - this optimizes the multiplication for i486 where imul
is slow.  The difference to 4.3 is a packaging issue in Debian.


-- 

rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|                            |INVALID



2008-11-30 Thread sgunderson at bigfoot dot com


--- Comment #11 from sgunderson at bigfoot dot com  2008-11-30 22:48 ---
Subject: Re:  Massive performance regression for jpeg_idct_islow

On Sun, Nov 30, 2008 at 09:29:29PM -, rguenth at gcc dot gnu dot org wrote:
> so it uses -mtune=i486 - this optimizes the multiplication for i486 where imul
> is slow.  The difference to 4.3 is a packaging issue in Debian.

Thanks! I'll file a bug against the package.

/* Steinar */

