Re: [gentoo-user] CFLAGS CPU optimization question.

2005-05-30 Thread Colin

Andreas Fredriksson wrote:

> On 5/29/05, Digby Tarvin [EMAIL PROTECTED] wrote:
>
>> On the subject of CPU flags, anyone tried optimizing gentoo for a
>> Toshiba Libretto (110CT)?
>>    model name  : Mobile Pentium MMX
>>    flags       : fpu vme de pse tsc msr mce cx8 mmx
>
> This is indeed a classic pentium chip with mmx added. You can use
> -mcpu=pentium (or -march=pentium), optionally adding the mmx USE flag
> for those packages that support it.

Actually, since it has MMX, use {-mcpu/-mtune/-march}=pentium-mmx.
Worked for me.  Additionally, add the CPU flags to your USE flags
(especially mmx).
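
The advice above can be sketched roughly as follows.  This is only an
illustration: the helper names (parse_cpuinfo, suggest_march,
suggest_use_flags) and the short list of CPU-related USE flags are
assumptions, not part of any Gentoo tool, and only the family-5 case
discussed here is handled.

```python
# Hypothetical helpers: map /proc/cpuinfo fields to a GCC -march value
# and to CPU-related USE flags.  Only family 5 (Pentium class) is covered.

CPU_USE_FLAGS = ("mmx", "sse", "sse2", "3dnow")  # assumed subset, for illustration


def parse_cpuinfo(text):
    """Return the 'key : value' pairs of the first processor entry as a dict."""
    info = {}
    for line in text.splitlines():
        if not line.strip():
            if info:
                break  # a blank line ends the first processor block
            continue
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


def suggest_march(info):
    """Suggest -march for a family-5 CPU (assumed Intel); None for anything else."""
    if info.get("cpu family") != "5":
        return None
    flags = info.get("flags", "").split()
    return "pentium-mmx" if "mmx" in flags else "pentium"


def suggest_use_flags(info):
    """Return the CPU-related USE flags that the processor actually reports."""
    flags = set(info.get("flags", "").split())
    return [flag for flag in CPU_USE_FLAGS if flag in flags]


if __name__ == "__main__":
    sample = (
        "model name : Mobile Pentium MMX\n"
        "cpu family : 5\n"
        "flags : fpu vme de pse tsc msr mce cx8 mmx\n"
    )
    info = parse_cpuinfo(sample)
    print("-march=" + suggest_march(info))
    print("USE: " + " ".join(suggest_use_flags(info)))
```

On a real system the text would come from reading /proc/cpuinfo rather
than from the sample string.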


--
Colin

--
gentoo-user@gentoo.org mailing list



Re: [gentoo-user] CFLAGS CPU optimization question.

2005-05-29 Thread Digby Tarvin
On the subject of CPU flags, anyone tried optimizing gentoo for a
Toshiba Libretto (110CT)?

How do I determine which of the stage3 installation files:
stage3-athlon-xp-2005.0.tar.bz2
stage3-i686-2005.0.tar.bz2
stage3-pentium3-2005.0.tar.bz2
stage3-pentium4-2005.0.tar.bz2
stage3-x86-2005.0.tar.bz2
is appropriate? 'uname -m' (as suggested in the docs) reports i586 so I
assume it is pentium, but where is the boundary between P3 and P4?

Also, what would be the best CPUFLAGS where /proc/cpuinfo (under SuSE 7.3)
reports the following CPU information:
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 5
model           : 8
model name      : Mobile Pentium MMX
stepping        : 1
cpu MHz         : 233.292
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : yes
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 1
wp              : yes
flags           : fpu vme de pse tsc msr mce cx8 mmx
bogomips        : 465.30

Is there anything else I need to check to work out my optimum settings?

Thanks,
DigbyT
-- 
Digby R. S. Tarvin [EMAIL PROTECTED]
http://www.digbyt.com
-- 
gentoo-user@gentoo.org mailing list



Re: [gentoo-user] CFLAGS CPU optimization question.

2005-05-29 Thread Julien Cayzac
On 5/29/05, Digby Tarvin [EMAIL PROTECTED] wrote:
> How do I determine which of the stage3 installation files:
> stage3-athlon-xp-2005.0.tar.bz2
> stage3-i686-2005.0.tar.bz2
> stage3-pentium3-2005.0.tar.bz2
> stage3-pentium4-2005.0.tar.bz2
> stage3-x86-2005.0.tar.bz2
> is appropriate? 'uname -m' (as suggested in the docs) reports i586 so I
> assume it is pentium, but where is the boundary between P3 and P4?

P3 and P4 are not i586.
stage3-x86-2005.0.tar.bz2 is the only stage3 tarball suitable for
i586. If you try another one, your system will simply not run at all.
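
As a rough sketch of how the choice could be made: the function name
pick_stage3 and its selection rules are assumptions for illustration,
not Gentoo's official logic.  It relies on the fact that SSE arrived
with the Pentium III and SSE2 with the Pentium 4, which is also where
the P3/P4 boundary lies.  Only 32-bit x86 values of 'uname -m' are
considered.

```python
# Hypothetical helper: choose among the 2005.0 stage3 tarballs listed in
# the question.  The rules below are illustrative assumptions only.

def pick_stage3(machine, vendor_id="", flags=()):
    """Return a stage3 tarball name for a `uname -m` value and cpuinfo flags."""
    flags = set(flags)
    if machine != "i686":
        # i386/i486/i586 machines can only run the generic x86 stage.
        return "stage3-x86-2005.0.tar.bz2"
    if vendor_id == "AuthenticAMD" and {"sse", "3dnow"} <= flags:
        return "stage3-athlon-xp-2005.0.tar.bz2"
    if vendor_id == "GenuineIntel" and "sse2" in flags:
        return "stage3-pentium4-2005.0.tar.bz2"  # SSE2 arrived with the Pentium 4
    if vendor_id == "GenuineIntel" and "sse" in flags:
        return "stage3-pentium3-2005.0.tar.bz2"  # SSE arrived with the Pentium III
    return "stage3-i686-2005.0.tar.bz2"


if __name__ == "__main__":
    # The Libretto from the question: uname -m reports i586.
    libretto_flags = "fpu vme de pse tsc msr mce cx8 mmx".split()
    print(pick_stage3("i586", "GenuineIntel", libretto_flags))
```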

Julien.

-- 
gentoo-user@gentoo.org mailing list



Re: [gentoo-user] CFLAGS CPU optimization question.

2005-05-29 Thread Andreas Fredriksson
On 5/29/05, Digby Tarvin [EMAIL PROTECTED] wrote:
> On the subject of CPU flags, anyone tried optimizing gentoo for a
> Toshiba Libretto (110CT)?
> model name  : Mobile Pentium MMX
> flags       : fpu vme de pse tsc msr mce cx8 mmx

This is indeed a classic pentium chip with mmx added. You can use
-mcpu=pentium (or -march=pentium), optionally adding the mmx USE flag
for those packages that support it.

// Andreas

-- 
gentoo-user@gentoo.org mailing list



Re: [gentoo-user] CFLAGS CPU optimization question.

2005-05-28 Thread Boyd Stephen Smith Jr.
On Monday 23 May 2005 05:09 pm, Colin [EMAIL PROTECTED] wrote:
> -O3:  The highest performance optimization level before code starts to
> break.  It goes up to -O9 if you're daring.  (Use -Os to compile for
> size.)  Implies a lot of stuff.

Ack! What?  It does *not* go up to -O9 and never has.  Currently, the code
doesn't change anything after -O3.  There used to be an -O4, and I think up
to -O6 in private builds, but anything higher than -O3 won't help anymore.
(It didn't really /help/ before; more often than not it simply produced
segfaulting executables.)

You can specify -O9, but I don't think that's actually a limit.  I think
you can put any value there that's recognized by atoi (or maybe one of the
strto{,u}{l,ll} family).  Use -O69 and see how fast your bugs get marked
INVALID at bugs.gentoo.org!

Also, I *think* -O3 is still broken on some architectures.  x86 should
support it fine, but -O3 is in that group of compiler flags that has
produced broken executables.  (That said, I run with -O3 on a pentium2
and am quite happy.)
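
The point about levels above -O3 can be modelled in a few lines.  This
is a toy model of the behaviour described above, not GCC's actual
option-parsing code, and the function name effective_opt_level is made
up for the example.

```python
import re


def effective_opt_level(flag):
    """Model a -O flag: any numeric level above 3 behaves like -O3."""
    if flag == "-O":
        return "1"  # plain -O is the same as -O1
    if flag == "-Os":
        return "s"  # optimize for size
    match = re.fullmatch(r"-O(\d+)", flag)
    if match is None:
        raise ValueError("not an optimization-level flag: " + flag)
    # The number is parsed as an integer; nothing extra is enabled past 3.
    return str(min(int(match.group(1)), 3))


if __name__ == "__main__":
    for flag in ("-O2", "-O3", "-O9", "-O69"):
        print(flag, "behaves like -O" + effective_opt_level(flag))
```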

-- 
Boyd Stephen Smith Jr.
[EMAIL PROTECTED]
ICQ: 514984 YM/AIM: DaTwinkDaddy
-- 
gentoo-user@gentoo.org mailing list



Re: [gentoo-user] CFLAGS CPU optimization question.

2005-05-28 Thread Ryan Lynch
Optimization level 9 (-O9)?  That's a laugh.  Read the GCC man page: the
optimization levels are just groupings of other optimization flags (-O1, -O2,
-O3, -O0, -Os), with optimization level 3 (-O3) containing the most
optimization flags.  The numbers don't correlate to any kind of optimization
level (i.e. -O99 wouldn't be 99% optimized or some equivalent malarkey).  The
numbers might as well be letters like A, B, C.  In fact, if you know
anything about programming, most of the -O3 flags and beyond become very code
specific, and sometimes only work with programs written a certain way or run on
certain processors.  This just goes towards my general opinion that Gentoo
users in general see gcc cflags as some kind of magic incantation and know little
about their purpose, potential, or meaning.  -funroll-loops anyone?

-Ryan Lynch

On Monday 23 May 2005 05:09 pm, Colin [EMAIL PROTECTED] wrote:
> On Monday 23 May 2005 05:09 pm, Colin [EMAIL PROTECTED] wrote:
>> -O3:  The highest performance optimization level before code starts to
>> break.  It goes up to -O9 if you're daring.  (Use -Os to compile for
>> size.)  Implies a lot of stuff.
-- 
gentoo-user@gentoo.org mailing list



Re: [gentoo-user] CFLAGS CPU optimization question.

2005-05-28 Thread Richard Fish
Ok already, we hear you.  No need to post the same message 5 times.

And BTW, it is a feature of GMail that you don't see your own posts.

Cheers,

-Richard

Ryan Lynch wrote:

> Optimization level 9 (-O9)?  That's a laugh.  Read the GCC man page: the
> optimization levels are just groupings of other optimization flags (-O1, -O2,
> -O3, -O0, -Os), with optimization level 3 (-O3) containing the most
> optimization flags.  The numbers don't correlate to any kind of optimization
> level (i.e. -O99 wouldn't be 99% optimized or some equivalent malarkey).  The
> numbers might as well be letters like A, B, C.  In fact, if you know
> anything about programming, most of the -O3 flags and beyond become very code
> specific, and sometimes only work with programs written a certain way or run
> on certain processors.  This just goes towards my general opinion that
> Gentoo users in general see gcc cflags as some kind of magic incantation and
> know little about their purpose, potential, or meaning.  -funroll-loops anyone?
>
> -Ryan Lynch
>
> On Monday 23 May 2005 05:09 pm, Colin [EMAIL PROTECTED] wrote:
>> On Monday 23 May 2005 05:09 pm, Colin [EMAIL PROTECTED] wrote:
>>> -O3:  The highest performance optimization level before code starts to
>>> break.  It goes up to -O9 if you're daring.  (Use -Os to compile for
>>> size.)  Implies a lot of stuff.

-- 
gentoo-user@gentoo.org mailing list



Re: [gentoo-user] CFLAGS CPU optimization question.

2005-05-24 Thread Sami Samhuri
* On Tue May-24-2005 at 01:08:51 AM +0200, Julien Cayzac said:
> On 5/24/05, Richard Fish [EMAIL PROTECTED] wrote:
>> [ recommandations about performance cflags ]
>
> While we're at optimizing stuff, here are my CFLAGS (athlon-xp mobile,
> barton core):
>
> CFLAGS="-O2 -march=athlon-xp -msse -mfpmath=sse -pipe
> -finline-functions -fsched2-use-superblocks -fsched2-use-traces
> -fmove-all-movables -frename-registers -fweb -ffast-math
> -funsafe-math-optimizations -fprefetch-loop-arrays -fforce-addr
> -momit-leaf-frame-pointer -ftracer -funit-at-a-time
> -maccumulate-outgoing-args"
>
> 2.6.10-gentoo-r6 kernel, everything is stable... and far faster than
> when I had only -O2 -march=athlon-xp -pipe :-)

I recommand [sic] that you slow down.

-- 
Sami Samhuri




Re: [gentoo-user] CFLAGS CPU optimization question.

2005-05-23 Thread Colin

Walter Dnes wrote:


> Currently, I use -march=i686 for my 3 machines, a P4, a PIII, and a
> PII (and a partridge in a pear tree).
>
> According to the gcc docs at...
> http://gcc.gnu.org/onlinedocs/gcc-3.3.5/gcc/i386-and-x86_002d64-Options.html#i386-and-x86_002d64-Options
>
> i586 is equivalent to pentium and i686 is equivalent to pentiumpro.
>
> Does this mean that I would get better optimization if I use pentium2,
> pentium3 or pentium4, as appropriate?  I am using the available flags
> (-mmmx, -msse, -msse2, -mfpmath=sse, etc) as appropriate.


Yes, it would.

My CFLAGS (Pentium II, 504 MHz, 224 MB RAM):  -O3 -march=pentium2 -mmmx 
-fomit-frame-pointer -pipe -ftracer -fno-rename-registers -funroll-loops


I have a stable install of kernel 2.6.11-gentoo-r9 and GNOME 2.8.3.  I
haven't had a kernel panic yet, and I compiled and now run the system with
an overclocked 112 MHz front side bus.  It was worth sitting around
watching endless lines of text scroll by.  My secrets?


 -O3:  The highest performance optimization level before code starts to 
break.  It goes up to -O9 if you're daring.  (Use -Os to compile for 
size.)  Implies a lot of stuff.
 -march=pentium2:  Implies -mmmx and writes code specifically for the 
P2 processor.
 -mmmx:  Build code with MMX instructions wherever possible.
 -fomit-frame-pointer:  Don't keep the frame pointer in a register.  
You get an extra register at the cost of losing debugging ability.
 -pipe:  Use pipes instead of temporary files.  Not recommended on a 
RAM-limited system.
 -ftracer:  Performs tail duplication to enlarge superblocks, which 
simplifies control flow so that other optimization passes can do a 
better job.
 -fno-rename-registers:  Turns off the register-renaming pass, which 
mainly benefits processors with lots of registers.  x86 has few, so 
little is lost.
 -funroll-loops:  If you can tell how many times a loop will loop 
(mainly for loops), then unroll it.  Does it increase performance?  If 
it does, it's unnoticeable.  Don't tell anyone you use it though.  It 
spreads the whole Gentoo ricer myth that's been going around the Internet.


If your Pentium 4 supports Hyper-Threading, adjust MAKEOPTS 
accordingly.  My P4 compiles faster at -j3 than -j2.  (Haven't tried -j4 
though.)


--
Colin

--
gentoo-user@gentoo.org mailing list



Re: [gentoo-user] CFLAGS CPU optimization question.

2005-05-23 Thread Richard Fish
Colin wrote:

> -funroll-loops:  If you can tell how many times a loop will loop
> (mainly for loops), then unroll it.  Does it increase performance?  If
> it does, it's unnoticeable.  Don't tell anyone you use it though.  It
> spreads the whole Gentoo ricer myth that's been going around the
> Internet.


Just a quick word of warning...-O3 can be slower or faster than -O2 or
-Os depending upon what code you are running.  The same is true of
-funroll-loops...it can actually hurt performance in many cases.  This
is because the larger code these flags generate is more likely to miss
the cache, forcing the processor to fetch from RAM, which takes a dozen
or more clock cycles on a modern x86 computer.  Those running CPUs that
have big disparities between the internal clock and the memory bus are
well advised to test the effects of these flags on their own systems.

For me, compression and encryption (when I make backups) are my big CPU
hogs.  So those are what I tested, and I found -Os to be about 5% faster
on average than -O2, depending upon whether it was gzip or bzip2, and what
level (-1 thru -9) of compression I chose.  There were some cases that
were about 5% slower, but not ones I am likely to use.  -O3 was either
5% faster or 20% slower than -O2.  Compilation time was about 10% faster
with -Os compared to -O2, and I don't really remember how much more time
-O3 took.  That is on a P4 3GHz with HT.  The encryption code didn't show
any performance boost or hit with any of the optimization levels,
probably because it includes p4-optimized assembly code.
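
For anyone wanting to repeat this kind of test, here is a minimal timing
sketch.  The helper names best_time and percent_faster are made up for
the example, and the workload in the demo is only a placeholder; in
practice it would be replaced by something like two differently compiled
gzip binaries run on the same input file.

```python
import subprocess
import sys
import time


def best_time(command, runs=3):
    """Run `command` several times; return the fastest wall-clock time in seconds."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(command, check=True, stdout=subprocess.DEVNULL)
        timings.append(time.perf_counter() - start)
    return min(timings)


def percent_faster(baseline, candidate):
    """Positive result: candidate beats baseline by that many percent."""
    return (baseline - candidate) / baseline * 100.0


if __name__ == "__main__":
    # Placeholder workload: start the Python interpreter and exit again.
    elapsed = best_time([sys.executable, "-c", "pass"])
    print("best of 3 runs: %.4f s" % elapsed)
```

Taking the best of several runs reduces the influence of disk caching
and other background noise on the comparison.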

> If your Pentium 4 supports Hyper-Threading, adjust MAKEOPTS
> accordingly.  My P4 compiles faster at -j3 than -j2.  (Haven't tried
> -j4 though.)


Another word of warning...beware of how much memory compilation takes.
Large C++ packages (like X11 and KDE) can require over one hundred
megabytes *per module* for the compilation at -O2.  -O3 will require
even more memory.  I'd recommend only using -j2 if you have at least
512MB of memory, and -j3 at 1GB or more.
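
Put together with the Hyper-Threading remark quoted above, the rule of
thumb could look something like this.  The function name suggest_jobs is
made up, "logical CPUs + 1" is an assumed generalisation of the -j3
observation on a Hyper-Threading P4, and the cap at -j3 simply reflects
that nothing higher is discussed here.

```python
def suggest_jobs(mem_mb, logical_cpus=1):
    """Return a MAKEOPTS value based on installed RAM and logical CPU count."""
    wanted = logical_cpus + 1  # assumption: one more job than logical CPUs
    if mem_mb >= 1024:
        allowed = 3  # -j3 at 1GB or more
    elif mem_mb >= 512:
        allowed = 2  # -j2 needs at least 512MB
    else:
        allowed = 1  # below that, stay with a single job
    return "-j%d" % min(wanted, allowed)


if __name__ == "__main__":
    print('MAKEOPTS="%s"' % suggest_jobs(mem_mb=1024, logical_cpus=2))
```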

-Richard

-- 
gentoo-user@gentoo.org mailing list



Re: [gentoo-user] CFLAGS CPU optimization question.

2005-05-23 Thread Julien Cayzac
On 5/24/05, Richard Fish [EMAIL PROTECTED] wrote:
> [ recommandations about performance cflags ]

While we're at optimizing stuff, here are my CFLAGS (athlon-xp mobile,
barton core):

CFLAGS="-O2 -march=athlon-xp -msse -mfpmath=sse -pipe
-finline-functions -fsched2-use-superblocks -fsched2-use-traces
-fmove-all-movables -frename-registers -fweb -ffast-math
-funsafe-math-optimizations -fprefetch-loop-arrays -fforce-addr
-momit-leaf-frame-pointer -ftracer -funit-at-a-time
-maccumulate-outgoing-args"

2.6.10-gentoo-r6 kernel, everything is stable... and far faster than
when I had only -O2 -march=athlon-xp -pipe :-)
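
Long flag lists like this tend to contain flags that another flag
already turns on.  A small sketch for spotting them, assuming a
deliberately tiny hand-written table (the names IMPLIES and
redundant_flags are made up, and the table is nowhere near complete):

```python
# Assumed, incomplete table of "flag -> flags it already turns on".
IMPLIES = {
    "-march=athlon-xp": {"-msse"},  # athlon-xp includes SSE support
    "-march=pentium2": {"-mmmx"},   # pentium2 includes MMX support
    "-ffast-math": {"-funsafe-math-optimizations"},
}


def redundant_flags(cflags):
    """Return flags that are repeated or implied by another flag in `cflags`."""
    flags = cflags.split()
    implied = set()
    for flag in flags:
        implied |= IMPLIES.get(flag, set())
    seen = set()
    redundant = []
    for flag in flags:
        if (flag in seen or flag in implied) and flag not in redundant:
            redundant.append(flag)
        seen.add(flag)
    return redundant


if __name__ == "__main__":
    example = "-O2 -march=athlon-xp -msse -mfpmath=sse -pipe -ffast-math -funsafe-math-optimizations"
    print(redundant_flags(example))
```

Dropping a redundant flag should not change the generated code, so this
is a tidiness check rather than an optimization.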

Julien.

-- 
gentoo-user@gentoo.org mailing list