Re: [gentoo-user] gcc optimizations

Javier Villavicencio Tue, 28 Oct 2003 15:18:58 -0800

On Tue, 28 Oct 2003 22:59:46 +0100
Redeeman <[EMAIL PROTECTED]> wrote:


> this is interresting, so i emerged povray, and did like you, but i
> couldnt find the benchmark.ini you talk about, so i just did the command
> u used, in the dir with  the file u use, and this is result:
> 
> real    0m2.568s
> user    0m2.220s
> sys     0m0.030s
> 
> i've got an athlon xp 1800+, and a geforce2 intergrated GPU :-)
> 
> and btw, you are saying it isnt recommended to compile gentoo with those
> flags, i have compiled everything with this:
> 
> -march=athlon-xp -O3 -pipe -mmmx -msse -m3dnow -mfpmath=sse,387
> -fexpensive-optimizations -fstack-protector -fomit-frame-pointer
> -funroll-loops -fforce-addr -falign-functions=4 -frerun-loop-opt
> -frerun-cse-after-loop -maccumulate-outgoing-args -fprefetch-loop-arrays
> 
> and its stable and really fast ;)
> the thing with fprofile-arcs is interresting, i will give it a shot!
> i wonder too, what is your system specs?
> 
benchmark.ini is available on the povray web page. It is the configuration file that 
makes the program run under the same conditions on every machine, with quality and 
similar settings at their maximum. There are also two .pov files for benchmarking, 
benchmark.pov and skyvase.pov. benchmark.pov takes almost 27 minutes to render, which 
is why I didn't use it in my tests.
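
A minimal sketch of how repeated timings could be collected, assuming Python 3 is 
available. The helper name best_wall_time is hypothetical, and the command below is only 
a placeholder; substitute the povray invocation you are benchmarking, run with the same 
benchmark.ini on every machine.

```python
import subprocess
import sys
import time


def best_wall_time(argv, runs=3):
    """Run argv `runs` times and return the fastest wall-clock time in seconds."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        # check=True raises if the command fails, so a broken run is not timed silently.
        subprocess.run(argv, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        timings.append(time.perf_counter() - start)
    return min(timings)


if __name__ == "__main__":
    # Placeholder command (an assumption): replace with the real render command.
    command = [sys.executable, "-c", "pass"]
    print("best of 3: %.3fs" % best_wall_time(command))
```

Taking the best of several runs reduces noise from other processes; unlike `time`, this 
only measures wall-clock time, not user and sys separately.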

About your CFLAGS, here is a list of the flags each -Ox level enables and what some of 
the -m flags imply:

-O2 enables:
 -fdefer-pop
 -fmerge-constants
 -fthread-jumps
 -floop-optimize
 -fcrossjumping
 -fif-conversion
 -fif-conversion2
 -fdelayed-branch
 -fguess-branch-probability
 -fcprop-registers
 -fforce-mem
 -foptimize-sibling-calls
 -fstrength-reduce
 -fcse-follow-jumps  
 -fcse-skip-blocks
 -frerun-cse-after-loop  
 -frerun-loop-opt
 -fgcse   
 -fgcse-lm   
 -fgcse-sm
 -fdelete-null-pointer-checks
 -fexpensive-optimizations
 -fregmove
 -fschedule-insns  
 -fschedule-insns2
 -fsched-interblock 
 -fsched-spec
 -fcaller-saves
 -fpeephole2
 -freorder-blocks  
 -freorder-functions
 -fstrict-aliasing
 -falign-functions  
 -falign-jumps
 -falign-loops  
 -falign-labels

-O3 enables:
 -finline-functions
 -frename-registers

Machine defaults when using -march=xxx:
 -mpreferred-stack-boundary=4 (4=ok for ia32, 8=ok for ia64, it's safe to experiment 
with 8 on p4 and athlons)
 -m96bit-long-double (implied by ia32 architecture, it's safe to experiment with 
-m128bit-long-double on p4 and athlons)
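
For reference, the GCC manual describes the argument of -mpreferred-stack-boundary as an 
exponent: the stack is kept aligned to 2 raised to that number of bytes. A small sketch 
of the arithmetic (the function name is hypothetical, used only for illustration):

```python
def stack_alignment_bytes(exponent):
    """Byte alignment requested by -mpreferred-stack-boundary=<exponent>."""
    return 2 ** exponent


for n in (4, 8):
    print("-mpreferred-stack-boundary=%d -> %d-byte alignment"
          % (n, stack_alignment_bytes(n)))
```

So the default of 4 means 16-byte alignment, while 8 asks for 256-byte alignment.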

Various:
 -maccumulate-outgoing-args implies -mno-push-args
 -fomit-frame-pointer implies -momit-leaf-frame-pointer

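So a few of your flags are already enabled by -O3: -fexpensive-optimizations, 
-frerun-loop-opt and -frerun-cse-after-loop are all in the -O2 list above, and 
-falign-functions is enabled there too (you only override its value).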
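
A minimal sketch of how such a check could be automated, assuming Python 3. The flag 
sets are copied from the lists in this mail (so they reflect the GCC version discussed 
here, not necessarily yours), and the function name redundant_flags is hypothetical.

```python
# Flags listed above as enabled by -O2 (and therefore also by -O3).
O2_FLAGS = {
    "-fdefer-pop", "-fmerge-constants", "-fthread-jumps", "-floop-optimize",
    "-fcrossjumping", "-fif-conversion", "-fif-conversion2", "-fdelayed-branch",
    "-fguess-branch-probability", "-fcprop-registers", "-fforce-mem",
    "-foptimize-sibling-calls", "-fstrength-reduce", "-fcse-follow-jumps",
    "-fcse-skip-blocks", "-frerun-cse-after-loop", "-frerun-loop-opt",
    "-fgcse", "-fgcse-lm", "-fgcse-sm", "-fdelete-null-pointer-checks",
    "-fexpensive-optimizations", "-fregmove", "-fschedule-insns",
    "-fschedule-insns2", "-fsched-interblock", "-fsched-spec",
    "-fcaller-saves", "-fpeephole2", "-freorder-blocks", "-freorder-functions",
    "-fstrict-aliasing", "-falign-functions", "-falign-jumps",
    "-falign-loops", "-falign-labels",
}
# Flags listed above as added by -O3 on top of -O2.
O3_EXTRA = {"-finline-functions", "-frename-registers"}


def redundant_flags(cflags):
    """Return the flags in cflags that the chosen -O level already enables.

    Only exact matches count: a flag with an explicit value, such as
    -falign-functions=4, changes a setting and is not reported.
    """
    tokens = cflags.split()
    if "-O3" in tokens:
        implied = O2_FLAGS | O3_EXTRA
    elif "-O2" in tokens:
        implied = O2_FLAGS
    else:
        implied = set()
    return [flag for flag in tokens if flag in implied]


CFLAGS = (
    "-march=athlon-xp -O3 -pipe -mmmx -msse -m3dnow -mfpmath=sse,387 "
    "-fexpensive-optimizations -fstack-protector -fomit-frame-pointer "
    "-funroll-loops -fforce-addr -falign-functions=4 -frerun-loop-opt "
    "-frerun-cse-after-loop -maccumulate-outgoing-args -fprefetch-loop-arrays"
)

for flag in redundant_flags(CFLAGS):
    print(flag)
```

For the CFLAGS quoted above this reports -fexpensive-optimizations, -frerun-loop-opt 
and -frerun-cse-after-loop.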

My system is an Athlon XP 2500+ (Barton core) running at 200 MHz x 9 at default core 
voltage, 2 OCZ EL DDR modules in dual-channel mode at 400 MHz, an ATI Radeon 9600 Pro 
and an ASUS A7N8X Deluxe, on Linux 2.6.0-test9 with some strange patches from lkml, 
without local APIC.

You ran your test without benchmark.ini. povray's default .ini values are only meant 
for previews, so our times are not directly comparable.

--
[EMAIL PROTECTED] mailing list
