Last week I asked...
On Thu, Feb 01, 2001 at 01:51:56PM +0000, Tim Bunce wrote:
> Can anyone recommend extra gcc options to squeeze the last ounce of
> performance out of code (perl and apache in this case) on Intel?
>
> I don't mind tying the code down to one cpu type or losing the ability
> to debug etc. We're already doing -O6 and are looking for more.
>
> I recall Malcolm Beattie (CC'd, Hi Malcolm!) experimenting in this area,
> something about not wasting a register for the frame pointer.
>
> I'm using gcc 2.95.2, is that the latest/best?
> It's on FreeBSD 4.1 and 4.2.
I've appended a summary of the replies. My own notes, added after reading
the GCC 2.95.2 docs, are in square brackets.
Many thanks to all who contributed. I'm off to play with these options
now. I'll report back later.
Tim.
From: Greg Cope <[EMAIL PROTECTED]>
I've used this, but have had a few unresolved segfaults on busy machines:
-O6 -mcpu=pentium -march=pentium -fomit-frame-pointer
[-march=pentium implies -mcpu=pentium]
From: Owen Williams <[EMAIL PROTECTED]>
I saw these on a site somewhere for compiling the linux kernel:
-mcpu=pentiumpro -mpentium -ffast-math -O5 -fthread-jumps
[-mpentium is a deprecated synonym for -mcpu=pentium. -O enables -fthread-jumps]
Use them on anything that is pentiumpro and above. I get a good speed
increase.
From: Vivek Khera <[EMAIL PROTECTED]>
There were some important compiler fixes in FreeBSD 4.x that went in
early in January. If you can, I'd recommend updating to the latest
4.2-STABLE version for the most stable compiler environment. Most
important if you're compiling threaded apps in C++ (eg, MySQL).
Personally, I use these options with good effect:
-O2 -pipe -march=i586 -ffast-math -mfancy-math-387
Anything beyond that is bound to tickle gcc bugs.
From: Steve Fink <[EMAIL PROTECTED]>
> I recall Malcolm Beattie (CC'd, Hi Malcolm!) experimenting in this area,
> something about not wasting a register for the frame pointer.
That particular option would be gcc -fomit-frame-pointer.
You might try -ffast-math -fexpensive-optimizations (never played with
the latter, though, and it's probably on with -O6 anyway).
If you really want to go crazy, you could try -fbranch-probabilities
(requires more than just turning it on; read the gcc man page.) I doubt
it's worth the trouble.
And you'd probably want -march=i686 (or whatever CPU you're using).
I don't know the state of pentium-specific optimizations, but does
Cygnus's Code Fusion still have a gcc with Pentium-specific
optimizations that aren't in the main tree? I just remember the numbers
saying that they'd slightly overtaken Intel's compiler, but that was a
year and a half ago.
From: nick <[EMAIL PROTECTED]>
>
>And you'd probably want -march=i686 (or whatever CPU you're using).
Not necessarily. gcc and ia32 are weird that way. I would use whatever
Linus & co. decided to use for the kernel on the arch in question.
From: James W Walden <[EMAIL PROTECTED]>
I use '-march=i686 -mcpu=i686' to improve performance with gcc. The
percentage improvement varies greatly between applications but is often
around 10%. If you're willing to use a commercial compiler instead of
gcc, I get a 20-40% improvement with Intel's proton C compiler (which I
think is only available commercially for Windows so far) over gcc and
have found other commercial compilers to produce similar gains.
From: Mark Mielke <[EMAIL PROTECTED]>
Try the pgcc patch.
I don't even think -O6 does anything for gcc 2.95.x, although my
memory is faint. I think it only goes to -O3.
To re-order the instructions for a pentium:
gcc -O3 -mpentium -march=pentium ...
If you apply the pgcc patch, it will actually use the new instructions
available only on the pentium, and not on the 386/486, where desirable.
From: "Redford, John" <[EMAIL PROTECTED]>
The reason, for me, is that -O3 (and presumably -O6) performs optimizations that are
"unsafe". I have had critical bugs caused by compiling Perl with -O3 (which
used to be my habit). Now I only use -O2.
(Or possibly the optimizations were simply buggy in GCC; this was definitely
with a GCC from years ago, and I haven't tried to push my luck again.)
From: Perrin Harkins <[EMAIL PROTECTED]>
It's a bit old, but there's this page:
http://www.google.com/search?q=cache:members.nbci.com/Alex_Maranda/gnuintel/GNUintel.htm&hl=en&lr=lang_en
He comes out in favor of using PGCC.
[Summary:
http://gcc.gnu.org/onlinedocs/gcc-2.95.2/gcc_2.html#SEC10
http://gcc.gnu.org/onlinedocs/gcc-2.95.2/gcc_2.html#SEC31
http://members.nbci.com/Alex_Maranda/gnuintel/GNUintel.htm
gcc -O3 -malign-double -ffast-math -funroll-all-loops -fno-rtti -fno-exceptions
pgcc -O6 -malign-double -ffast-math -funroll-all-loops -fno-rtti -mcpu=pentiumpro
Using -mcpu=pentiumpro doesn't stop the code from running on an old 386,
so it is probably a good default for Perl & Apache on Intel.
To use pentiumpro specific instructions (won't run on i386) use:
-march=pentiumpro (which also implies -mcpu=pentiumpro)
-fomit-frame-pointer makes an extra register available but disables debugging
]