Hello,
I tested out Richard Guenther's inlining patch on a numerical C++ code that I have developed. The patch is included and described here:
http://gcc.gnu.org/ml/gcc-patches/2005-02/msg01571.html
The C++ code that I am timing (especially the first two examples) makes heavy use of uBLAS, the C++ matrix library in Boost. Here are some timings for gcc-3.4 and for 4.0-CVS, both unpatched ("4.0") and patched ("4.0+p"). The CVS snapshot is from Feb 22.

Data set #1:
  3.4:     3.7s
  4.0:    10.6s  = [3.4] * 2.86
  4.0+p:   4.0s  = [3.4] * 1.08  = [4.0] / 2.65

Data set #2:
  3.4:    10.9s
  4.0:    29.9s  = [3.4] * 2.74
  4.0+p:  11.4s  = [3.4] * 1.04  = [4.0] / 2.62

Data set #3 [+different cmd-line options]:
  3.4:    24.8s
  4.0:    35.3s  = [3.4] * 1.42
  4.0+p:  23.0s  = [3.4] * 0.93  = [4.0] / 1.49

So, with Richard's patch, 4.0 gets a large speedup of 1.5 - 2.6 times over unpatched 4.0, which brings it roughly level with 3.4! If it could be included in 4.0, then that would be very helpful :)
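
For anyone wanting to check the arithmetic, here is a minimal sketch of how the ratios for data set #1 are formed. The function and variable names are made up for illustration and are not part of the code being timed; only the three timings come from the numbers above.

```cpp
// Ratios for data set #1, using the timings quoted above (in seconds).
// All names below are illustrative assumptions, not from the benchmarked code.

// How many times slower `t` is than the reference time `t_ref`.
constexpr double slowdown(double t, double t_ref) { return t / t_ref; }

// How many times faster `t` is than the reference time `t_ref`.
constexpr double speedup(double t, double t_ref) { return t_ref / t; }

constexpr double t_34  = 3.7;   // gcc-3.4
constexpr double t_40  = 10.6;  // 4.0-CVS, unpatched
constexpr double t_40p = 4.0;   // 4.0-CVS, patched

// "4.0 = [3.4] * 2.86": unpatched 4.0 relative to 3.4
constexpr double unpatched_vs_34 = slowdown(t_40, t_34);

// "4.0+p = [3.4] * 1.08": patched 4.0 relative to 3.4
constexpr double patched_vs_34 = slowdown(t_40p, t_34);

// "4.0+p = [4.0] / 2.65": speedup of patched over unpatched 4.0
constexpr double patched_vs_40 = speedup(t_40p, t_40);
```

A value above 1 in the "[3.4] *" columns means slower than 3.4, and the "[4.0] /" column is the speedup attributable to the patch alone.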
The compilation flags that I used were:
-ffast-math -fomit-frame-pointer -march=pentium4 -mtune=pentium4 -O3 -DNDEBUG_UBLAS -DNDEBUG_DP -DNDEBUG
I can test this code with the -mfpmath=sse and -mfpmath=sse,387 options if someone is interested (ideally sooner rather than later).
Thanks!
-BenRI
P.S. Additionally, this patch fixes SOME of the non-reduced test cases (not sent to the list) for the problem described in the message below, although the exact test case mentioned in that message still doesn't work:
http://gcc.gnu.org/ml/gcc/2005-02/msg00910.html