I have noticed a big performance decrease in one of my numerical codes when switching from gcc 4.4 to gcc 4.5. A small test case is attached. When this test case is compiled with "gcc -O3 perf.c -lm -std=c99" and the resulting binary is executed, the CPU time is about 1.1s with the head of the 4.4 branch, but 2.1s with the head of the trunk.
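For anyone who wants to take such CPU-time readings from inside a C99 program rather than with an external timer, a minimal sketch using the standard clock() function could look like the following. This is not the attached perf.c: busy_work is a hypothetical stand-in workload and cpu_seconds is a made-up helper name, both introduced only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical stand-in workload (NOT the attached perf.c):
   sums the first n terms of the harmonic series. */
static double busy_work(long n)
{
    double sum = 0.0;
    for (long i = 1; i <= n; ++i)
        sum += 1.0 / (double)i;
    return sum;
}

/* Hypothetical helper: runs fn(n) once and returns the CPU time it
   consumed, in seconds, as reported by the standard clock() function.
   The workload's return value is stored in *result (if non-NULL) so
   that the optimizer cannot discard the computation as unused. */
static double cpu_seconds(double (*fn)(long), long n, double *result)
{
    assert(fn != NULL);
    clock_t start = clock();
    double r = fn(n);
    clock_t stop = clock();
    if (result != NULL)
        *result = r;
    return (double)(stop - start) / CLOCKS_PER_SEC;
}
```

A caller would invoke something like cpu_seconds(busy_work, 100000000L, &r) from main and print both the returned time and r. Building the same source with both compilers and comparing the reported times gives a comparison of the kind quoted above; the resolution of clock() is implementation-dependent, so the workload should run long enough to dominate it.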
This is on a Pentium D CPU. I have verified that both binaries produce identical results.

Verbose output of gcc-4.4:

~/tmp/wigner3j>gcc -O3 perf.c -lm -std=c99 -save_temps -v
Using built-in specs.
gcc: unrecognized option '-save_temps'
Target: i686-pc-linux-gnu
Configured with: /scratch/martin/gcc44/configure --prefix=/scratch/martin/ugcc44 --enable-languages=c++,fortran --enable-target=all --disable-bootstrap --enable-checking=release
Thread model: posix
gcc version 4.4.3 20091130 (prerelease) [gcc-4_4-branch revision 154765] (GCC)
COLLECT_GCC_OPTIONS='-O3' '-std=c99' '-save_temps' '-v' '-mtune=generic'
 /scratch/martin/ugcc44/libexec/gcc/i686-pc-linux-gnu/4.4.3/cc1 -quiet -v perf.c -quiet -dumpbase perf.c -mtune=generic -auxbase perf -O3 -std=c99 -version -o /tmp/cc3D10Yi.s
ignoring nonexistent directory "/scratch/martin/ugcc44/lib/gcc/i686-pc-linux-gnu/4.4.3/../../../../i686-pc-linux-gnu/include"
#include "..." search starts here:
#include <...> search starts here:
 /usr/local/include
 /scratch/martin/ugcc44/include
 /scratch/martin/ugcc44/lib/gcc/i686-pc-linux-gnu/4.4.3/include
 /scratch/martin/ugcc44/lib/gcc/i686-pc-linux-gnu/4.4.3/include-fixed
 /usr/include
End of search list.
GNU C (GCC) version 4.4.3 20091130 (prerelease) [gcc-4_4-branch revision 154765] (i686-pc-linux-gnu)
 compiled by GNU C version 4.2.3, GMP version 4.2.4, MPFR version 2.3.2.
GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
Compiler executable checksum: 0428a618e74de3f947d92ab031f86f8a
COLLECT_GCC_OPTIONS='-O3' '-std=c99' '-save_temps' '-v' '-mtune=generic'
 as -V -Qy -o /tmp/cc6AnZqy.o /tmp/cc3D10Yi.s
GNU assembler version 2.18 (i686-pc-linux-gnu) using BFD version (GNU Binutils) 2.18
COMPILER_PATH=/scratch/martin/ugcc44/libexec/gcc/i686-pc-linux-gnu/4.4.3/:/scratch/martin/ugcc44/libexec/gcc/i686-pc-linux-gnu/4.4.3/:/scratch/martin/ugcc44/libexec/gcc/i686-pc-linux-gnu/:/scratch/martin/ugcc44/lib/gcc/i686-pc-linux-gnu/4.4.3/:/scratch/martin/ugcc44/lib/gcc/i686-pc-linux-gnu/:/usr/libexec/gcc/i686-pc-linux-gnu/:/usr/lib/gcc/i686-pc-linux-gnu/
LIBRARY_PATH=/scratch/martin/ugcc44/lib/gcc/i686-pc-linux-gnu/4.4.3/:/scratch/martin/ugcc44/lib/gcc/i686-pc-linux-gnu/4.4.3/../../../:/lib/:/usr/lib/
COLLECT_GCC_OPTIONS='-O3' '-std=c99' '-save_temps' '-v' '-mtune=generic'
 /scratch/martin/ugcc44/libexec/gcc/i686-pc-linux-gnu/4.4.3/collect2 --eh-frame-hdr -m elf_i386 -dynamic-linker /lib/ld-linux.so.2 /usr/lib/crt1.o /usr/lib/crti.o /scratch/martin/ugcc44/lib/gcc/i686-pc-linux-gnu/4.4.3/crtbegin.o -L/scratch/martin/ugcc44/lib/gcc/i686-pc-linux-gnu/4.4.3 -L/scratch/martin/ugcc44/lib/gcc/i686-pc-linux-gnu/4.4.3/../../.. /tmp/cc6AnZqy.o -lm -lgcc --as-needed -lgcc_s --no-as-needed -lc -lgcc --as-needed -lgcc_s --no-as-needed /scratch/martin/ugcc44/lib/gcc/i686-pc-linux-gnu/4.4.3/crtend.o /usr/lib/crtn.o

Verbose output of gcc-4.5:

~/tmp/wigner3j>gcc -O3 perf.c -lm -std=c99 -save-temps -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/afs/mpa/data/martin/ugcc/libexec/gcc/i686-pc-linux-gnu/4.5.0/lto-wrapper
Target: i686-pc-linux-gnu
Configured with: /scratch/martin/gcc/configure --enable-gold --prefix=/afs/mpa/data/martin/ugcc --with-mpfr=/afs/mpa/data/martin/numlibs --with-gmp=/afs/mpa/data/martin/numlibs --with-mpc=/afs/mpa/data/martin/numlibs --enable-languages=c++,fortran --enable-target=all --enable-checking=release
Thread model: posix
gcc version 4.5.0 20091214 (experimental) [trunk revision 155208] (GCC)
COLLECT_GCC_OPTIONS='-O3' '-std=c99' '-save-temps' '-v' '-mtune=generic'
 /afs/mpa/data/martin/ugcc/libexec/gcc/i686-pc-linux-gnu/4.5.0/cc1 -E -quiet -v perf.c -mtune=generic -std=c99 -O3 -fpch-preprocess -o perf.i
ignoring nonexistent directory "/afs/mpa/data/martin/ugcc/lib/gcc/i686-pc-linux-gnu/4.5.0/../../../../i686-pc-linux-gnu/include"
#include "..." search starts here:
#include <...> search starts here:
 /usr/local/include
 /afs/mpa/data/martin/ugcc/include
 /afs/mpa/data/martin/ugcc/lib/gcc/i686-pc-linux-gnu/4.5.0/include
 /afs/mpa/data/martin/ugcc/lib/gcc/i686-pc-linux-gnu/4.5.0/include-fixed
 /usr/include
End of search list.
COLLECT_GCC_OPTIONS='-O3' '-std=c99' '-save-temps' '-v' '-mtune=generic'
 /afs/mpa/data/martin/ugcc/libexec/gcc/i686-pc-linux-gnu/4.5.0/cc1 -fpreprocessed perf.i -quiet -dumpbase perf.c -mtune=generic -auxbase perf -O3 -std=c99 -version -o perf.s
GNU C (GCC) version 4.5.0 20091214 (experimental) [trunk revision 155208] (i686-pc-linux-gnu)
 compiled by GNU C version 4.5.0 20091214 (experimental) [trunk revision 155208], GMP version 4.3.1, MPFR version 2.4.2, MPC version 0.8
GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
GNU C (GCC) version 4.5.0 20091214 (experimental) [trunk revision 155208] (i686-pc-linux-gnu)
 compiled by GNU C version 4.5.0 20091214 (experimental) [trunk revision 155208], GMP version 4.3.1, MPFR version 2.4.2, MPC version 0.8
GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
Compiler executable checksum: 9df7fe822ccb89478c9ff357db9be45e
COLLECT_GCC_OPTIONS='-O3' '-std=c99' '-save-temps' '-v' '-mtune=generic'
 as -V -Qy --32 -o perf.o perf.s
GNU assembler version 2.18 (i686-pc-linux-gnu) using BFD version (GNU Binutils) 2.18
COMPILER_PATH=/afs/mpa/data/martin/ugcc/libexec/gcc/i686-pc-linux-gnu/4.5.0/:/afs/mpa/data/martin/ugcc/libexec/gcc/i686-pc-linux-gnu/4.5.0/:/afs/mpa/data/martin/ugcc/libexec/gcc/i686-pc-linux-gnu/:/afs/mpa/data/martin/ugcc/lib/gcc/i686-pc-linux-gnu/4.5.0/:/afs/mpa/data/martin/ugcc/lib/gcc/i686-pc-linux-gnu/
LIBRARY_PATH=/afs/mpa/data/martin/ugcc/lib/gcc/i686-pc-linux-gnu/4.5.0/:/afs/mpa/data/martin/ugcc/lib/gcc/i686-pc-linux-gnu/4.5.0/../../../:/lib/:/usr/lib/
COLLECT_GCC_OPTIONS='-O3' '-std=c99' '-save-temps' '-v' '-mtune=generic'
 /afs/mpa/data/martin/ugcc/libexec/gcc/i686-pc-linux-gnu/4.5.0/collect2 --eh-frame-hdr -m elf_i386 -dynamic-linker /lib/ld-linux.so.2 /usr/lib/crt1.o /usr/lib/crti.o /afs/mpa/data/martin/ugcc/lib/gcc/i686-pc-linux-gnu/4.5.0/crtbegin.o -L/afs/mpa/data/martin/ugcc/lib/gcc/i686-pc-linux-gnu/4.5.0 -L/afs/mpa/data/martin/ugcc/lib/gcc/i686-pc-linux-gnu/4.5.0/../../.. perf.o -lm -lgcc --as-needed -lgcc_s --no-as-needed -lc -lgcc --as-needed -lgcc_s --no-as-needed /afs/mpa/data/martin/ugcc/lib/gcc/i686-pc-linux-gnu/4.5.0/crtend.o /usr/lib/crtn.o

I attach the test case and the two generated assembler files.

--
Summary: [4.5] Performance regression of generated code
Product: gcc
Version: 4.5.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: martin at mpa-garching dot mpg dot de
GCC build triplet: i686-pc-linux-gnu
GCC host triplet: i686-pc-linux-gnu
GCC target triplet: i686-pc-linux-gnu

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42376