I'm in need of some expert advice about CFLAGS. I'm going to install
Gentoo on two systems, an Atom 330 and a P4 Prescott. Since compiling
lots of stuff on an Atom is less than joyful, I intend to do almost
all compilation on the Prescott and produce binary packages to use on
the Atom. So something like
CFLAGS="-march=native ..." is out. I know I could set
CFLAGS="-march=i686 -O2 -pipe -fomit-frame-pointer", but I would like
to extract a bit more juice from the hardware while still keeping the
systems stable. So I thought of setting CFLAGS for both systems to
-march=i686, plus the flags that -march=native pulls in on both
systems, plus some other flags:

-march=i686
-mno-aes
-mno-pclmul
-mno-popcnt
-mno-abm
-mno-lwp
-mno-fma
-mno-fma4
-mno-xop
-mno-bmi
-mno-bmi2
-mno-tbm
-mno-avx
-mno-avx2
-mno-sse4.2
-mno-sse4.1
-mno-lzcnt
-mno-rdrnd
-mno-f16c
-mno-fsgsbase
--param l1-cache-size=16
--param l1-cache-line-size=64
--param l2-cache-size=1024
-pni
-mtrr

This is what is pulled by -march=native, as reported by

gcc -march=native -E -v - </dev/null 2>&1 | grep cc1

Prescott:
-march=prescott -mno-cx16 -mno-sahf -mno-movbe -mno-aes -mno-pclmul
-mno-popcnt -mno-abm -mno-lwp -mno-fma -mno-fma4 -mno-xop -mno-bmi
-mno-bmi2 -mno-tbm -mno-avx -mno-avx2 -mno-sse4.2 -mno-sse4.1
-mno-lzcnt -mno-rdrnd -mno-f16c -mno-fsgsbase --param l1-cache-size=16
--param l1-cache-line-size=64 --param l2-cache-size=1024
-mtune=prescott

Atom:
-march=atom -mcx16 -msahf -mmovbe -mno-aes -mno-pclmul -mno-popcnt
-mno-abm -mno-lwp -mno-fma -mno-fma4 -mno-xop -mno-bmi -mno-bmi2
-mno-tbm -mno-avx -mno-avx2 -mno-sse4.2 -mno-sse4.1 -mno-lzcnt
-mno-rdrnd -mno-f16c -mno-fsgsbase --param l1-cache-size=24 --param
l1-cache-line-size=64 --param l2-cache-size=512 -mtune=atom
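
The two cc1 lines above can be compared mechanically rather than by eye. The following is only a sketch in Python: the function name parse_cc1_options and the variable names are made up for illustration, and the option strings are simply the two lines pasted from above. It keeps each "--param name=value" pair together so that the cache parameters are compared as units.

```python
def parse_cc1_options(line):
    """Split a cc1 option string into a set of options.

    A '--param' token is joined with the 'name=value' token that
    follows it, so cache parameters are compared as whole options.
    """
    tokens = line.split()
    options = set()
    i = 0
    while i < len(tokens):
        if tokens[i] == "--param" and i + 1 < len(tokens):
            options.add("--param " + tokens[i + 1])
            i += 2
        else:
            options.add(tokens[i])
            i += 1
    return options


# Pasted from the gcc output quoted above.
PRESCOTT = """
-march=prescott -mno-cx16 -mno-sahf -mno-movbe -mno-aes -mno-pclmul
-mno-popcnt -mno-abm -mno-lwp -mno-fma -mno-fma4 -mno-xop -mno-bmi
-mno-bmi2 -mno-tbm -mno-avx -mno-avx2 -mno-sse4.2 -mno-sse4.1
-mno-lzcnt -mno-rdrnd -mno-f16c -mno-fsgsbase --param l1-cache-size=16
--param l1-cache-line-size=64 --param l2-cache-size=1024
-mtune=prescott
"""

ATOM = """
-march=atom -mcx16 -msahf -mmovbe -mno-aes -mno-pclmul -mno-popcnt
-mno-abm -mno-lwp -mno-fma -mno-fma4 -mno-xop -mno-bmi -mno-bmi2
-mno-tbm -mno-avx -mno-avx2 -mno-sse4.2 -mno-sse4.1 -mno-lzcnt
-mno-rdrnd -mno-f16c -mno-fsgsbase --param l1-cache-size=24 --param
l1-cache-line-size=64 --param l2-cache-size=512 -mtune=atom
"""

prescott = parse_cc1_options(PRESCOTT)
atom = parse_cc1_options(ATOM)

common = sorted(prescott & atom)          # options both machines agree on
only_prescott = sorted(prescott - atom)   # options specific to the Prescott
only_atom = sorted(atom - prescott)       # options specific to the Atom

print("common:       ", " ".join(common))
print("Prescott only:", " ".join(only_prescott))
print("Atom only:    ", " ".join(only_atom))
```

The options that differ are the ones that matter for the shared binary packages: -march, -mtune, the cx16/sahf/movbe settings, and the L1 and L2 cache sizes.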

Flags supported (from cat /proc/cpuinfo|grep flags):

Prescott:
fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat
pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe constant_tsc
pebs bts pni dtes64 monitor ds_cpl cid xtpr

Atom:
fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat
clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx lm constant_tsc
arch_perfmon pebs bts aperfmperf pni dtes64 monitor ds_cpl tm2 ssse3
cx16 xtpr pdcm movbe lahf_lm dtherm
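
The same kind of comparison works for the /proc/cpuinfo flags. Again just a sketch, in Python: cpu_flags is a hypothetical helper name, and the two strings below are the flag lists from above written in the "flags : ..." form that /proc/cpuinfo uses. On the real machines one would feed it the contents of /proc/cpuinfo saved from each box.

```python
def cpu_flags(cpuinfo_text):
    """Return the set of flags from the first 'flags' line of
    /proc/cpuinfo-style text, or an empty set if there is none."""
    for line in cpuinfo_text.splitlines():
        name, sep, value = line.partition(":")
        if sep and name.strip() == "flags":
            return set(value.split())
    return set()


# The flag lists quoted above, in /proc/cpuinfo layout.
PRESCOTT_CPUINFO = (
    "flags\t\t: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat "
    "pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe constant_tsc "
    "pebs bts pni dtes64 monitor ds_cpl cid xtpr\n"
)
ATOM_CPUINFO = (
    "flags\t\t: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat "
    "clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx lm constant_tsc "
    "arch_perfmon pebs bts aperfmperf pni dtes64 monitor ds_cpl tm2 ssse3 "
    "cx16 xtpr pdcm movbe lahf_lm dtherm\n"
)

prescott_flags = cpu_flags(PRESCOTT_CPUINFO)
atom_flags = cpu_flags(ATOM_CPUINFO)

shared = prescott_flags & atom_flags
print("shared:       ", " ".join(sorted(shared)))
print("Prescott only:", " ".join(sorted(prescott_flags - atom_flags)))
print("Atom only:    ", " ".join(sorted(atom_flags - prescott_flags)))
```

Only the flags in the shared set are candidates for code that has to run on both machines.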


Some questions:

1) Is this strategy right? If so, any other flags to add? (or any
flags to remove from the list?)

2) The --param flags are those of the computer that will do the
compiling. I'm guessing the binaries it produces will still run on a CPU
whose --param values differ. Is this right?

TIA

Jorge Almeida
