Hi everyone,

Over the past few days I have been experimenting with some nice GCC features.
Current CPUs (both x86 and AMD64) have great potential for SIMD
instructions and loop optimizations, but the current make.conf options
(-O2 optimization level) do little to exploit it.

I recompiled GCC 4.6.2 with graphite support enabled and tested a few
make.conf configurations.
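
For anyone who wants to check whether their own GCC has graphite support before trying the flags below, here is a minimal sketch. It assumes that a GCC built without graphite rejects -floop-block with an error instead of silently ignoring it; the function name has_graphite is only illustrative.

```shell
#!/bin/sh
# Probe whether a compiler accepts a graphite flag.
# Assumption: a GCC built without graphite exits non-zero on -floop-block.
has_graphite() {
    cc="${1:-gcc}"
    # Compile a trivial program from stdin to assembly and discard the output.
    echo 'int main(void) { return 0; }' |
        "$cc" -O2 -floop-block -x c -S -o - - >/dev/null 2>&1
}

if has_graphite; then
    echo "graphite: yes"
else
    echo "graphite: no"
fi
```

A different compiler can be probed by passing its name as the first argument, for example has_graphite gcc-4.6.2.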

My first try was to enable LTO, graphite and -ftree-vectorize together;
CFLAGS and LDFLAGS were as follows:

GRAPHITE="-floop-interchange -ftree-loop-distribution -floop-strip-mine -floop-block"
CFLAGS="-O2 -ftree-vectorize -march=native -pipe -flto ${GRAPHITE}"
LDFLAGS="-Wl,-O1 -Wl,--as-needed,-flto"

LTO caused a lot of trouble with many packages. Admittedly it gives
some speedup when it works, and it worked well with firefox and many
other packages, but it isn't well suited for inclusion in make.conf,
given that it makes libav, geant and possibly even glibc fail. It could
be considered for use with specific packages, for example the
above-mentioned firefox.
GCC 4.6.2 is built with LTO support by default (unlike previous GCC
versions).
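
As a rough sketch of what per-package LTO could look like, assuming Portage's package.env mechanism is available on the build host: the file name lto.conf is hypothetical, and -flto is appended directly to LDFLAGS here on the assumption that the GCC driver should also see it at link time.

```shell
# /etc/portage/env/lto.conf (hypothetical file name)
# Append -flto to whatever make.conf already sets.
CFLAGS="${CFLAGS} -flto"
CXXFLAGS="${CXXFLAGS} -flto"
LDFLAGS="${LDFLAGS} -flto"

# Then, in /etc/portage/package.env, one line per package that should use it:
#   www-client/firefox lto.conf
```

Packages not listed in package.env keep the global flags, so libav, geant and glibc would be unaffected.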

My second try was therefore to disable LTO, leaving graphite and
-ftree-vectorize on:

GRAPHITE="-floop-interchange -ftree-loop-distribution -floop-strip-mine -floop-block"
CFLAGS="-O2 -ftree-vectorize -march=native -pipe ${GRAPHITE}"
LDFLAGS="-Wl,-O1 -Wl,--as-needed"

So far it has worked very well with every ebuild I tried. The resulting
binaries are not very different size-wise, and performance looks better
on my system (especially with codecs and compression tools). I'd be
inclined to enable those flags by default and to use LTO for special
optimization purposes in Entropy packages.
Does this look like a sane proposal?

-- 
Lorenzo Cogotti

