I wrote this a few weeks ago...

# The following is a study of possible ways to optimize the GNU toolchain.
#
# First of all, I made the assumption that using the Top Level Makefile
# system, included with the recent GCC and Binutils versions, has a greater
# potential for optimizing the applications than building them each
# separately. I also did not study non-bootstrapped builds because a
# non-bootstrapped toolchain has no reason to perform better than a
# bootstrapped toolchain.
# I performed this study with all the tricks I know of to optimize the
# toolchain, including intermodule builds, profiled builds, and GNU hash
# style.
# The top level makefile system also adds the potential of profiling a pile of
# other packages, like flex and bison and gettext and bash and findutils and
# much more. But I have yet to get that to work for me.
# The --enable-intermodule option will make GCC use the -combine option
# to compile all sources in the same command line. The idea is that the
# compiler can optimize all the sources together and make better judgments.
# This option should also take better advantage of the -O3 option. The only
# other package I know of that uses -combine (and also uses -fwhole-program)
# is Busybox, because it not only increases performance but also decreases
# program size (about 1%).
# My system is a Pentium 4 Prescott, 3GHz, with 1024MB of physical memory,
# running kernel 2.6.19.2 and Glibc-2.5. The toolchain versions built here
# are the same as those on the chrooted host system (chapter 6 LFS). I have
# a 1024MB swap partition, and added a 512MB swap file, giving a total of
# 2.5GB of system memory.
# I only performed each build and test once, so your results may vary, but
# mine should be reasonably valid.
tar xf gcc-g++-4.1-20070108.tar.bz2
tar xf gcc-core-4.1-20070108.tar.bz2
tar xf gcc-testsuite-4.1-20070108.tar.bz2
mv gcc-4.1-20070108/ butterfly-toolchain
cd butterfly-toolchain/
# This patch will cause GCC to link with --hash-style=gnu. This will cause
# programs to have better run times. Get details from the web or manual pages.
patch -Np0 -i ../gcc41-hash-style-gnu.patch
# This patch has nothing to do with performance, but is needed in order to
# pass Glibc-2.5's testsuite:
patch -Np0 -i ../gcc-DW_CFA_val.patch
# This Sed command is a workaround for some sort of bug. The GCOV_VERSION
# variable, "0x34303170" here, depends on your GCC version and might be
# different for you. I suggest you skip this command, continue, and you'll
# get an error during the build. Then find and read "gcov-iov.h" to get the
# GCOV_VERSION version, and then start over and use it here:
sed -e \
'[EMAIL PROTECTED] \"gcov-iov.h\"@\
#define GCOV_VERSION \(\(gcov_unsigned_t\)0x34303170\)@' -i gcc/gcov-io.h
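# A minimal sketch of the "find and read gcov-iov.h" step described above.
# The gcov_version helper is hypothetical (not part of GCC), and it assumes
# the generated header contains a "#define GCOV_VERSION" line holding a hex
# constant. Run it from the top of the tree where the build failed:

```shell
# Print the hex constant from the GCOV_VERSION define in the given header.
gcov_version() {
    sed -n 's/^#define GCOV_VERSION .*\(0x[0-9a-fA-F]*\).*/\1/p' "$1"
}

# Search the tree for the generated header and print each value found.
# Prints nothing if the header has not been generated yet.
find . -name gcov-iov.h 2>/dev/null | while read -r header; do
    echo "$header: $(gcov_version "$header")"
done
```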
sed -i 's@\./fixinc\.sh@-c true@' gcc/Makefile.in
sed -i 's/@have_mktemp_command@/yes/' gcc/gccbug.in
tar xf ../binutils-2.17.50.0.9.tar.bz2
ln -s binutils-2.17.50.0.9/{bfd,binutils,gas,gprof,ld,opcodes} .
# Which CFLAGS you should use is another story; I'm using these:
export CFLAGS="-march=prescott -mtune=prescott -O3 -fomit-frame-pointer -pipe"
export CFLAGS="$CFLAGS -fexpensive-optimizations -DNDEBUG"
export CXXFLAGS="$CFLAGS"
# Using 'BOOT_CFLAGS="$CFLAGS"' with the 'make' command won't pass BOOT_CFLAGS
# down to ld, bfd, and friends. We need to adjust it in the Makefiles.
# First wipe out mh-x86omitfp so its BOOT_CFLAGS doesn't play any role:
dd if=/dev/null of=config/mh-x86omitfp count=1
# Then set the BOOT_CFLAGS in the Makefiles:
sed "s/^BOOT_CFLAGS.*/BOOT_CFLAGS = $CFLAGS/" \
-i Makefile.{in,tpl} gcc/Makefile.in
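# To see what the substitution above does without touching the real tree,
# here is a small sketch run against a scratch file. The file contents and
# the DEMO_FLAGS value are made up for illustration. Note that the flags must
# not contain a "/" (for example a -I/some/path option), because "/" is the
# sed delimiter here:

```shell
demo_dir=$(mktemp -d)
printf 'BOOT_CFLAGS = -g -O2\nSTAGE1_CFLAGS = -g\n' > "$demo_dir/Makefile.in"

DEMO_FLAGS="-O3 -fomit-frame-pointer -pipe"
sed "s/^BOOT_CFLAGS.*/BOOT_CFLAGS = $DEMO_FLAGS/" -i "$demo_dir/Makefile.in"

# Capture the rewritten file, then clean up the scratch directory.
demo_result=$(cat "$demo_dir/Makefile.in")
rm -rf "$demo_dir"
echo "$demo_result"
```

# Only the BOOT_CFLAGS line is rewritten; other variables are left alone.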
mkdir obj
cd obj
# Beware: the combination of --enable-intermodule, profiledbootstrap, and -O3
# will eat up about 2.2GB of memory. Make sure you add enough swap
# space/files for this. I suggest 2.5GB total (including physical RAM). It
# will also take a really long time (see below). I suggest you start this
# about 20 minutes before going to sleep, watch TV etc. for 20 minutes, check
# that the build is going okay, then go to sleep. Hopefully it will be
# finished when you wake.
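# A rough pre-flight check for the memory requirement above, as a sketch.
# The mem_plus_swap_mb helper is hypothetical; it assumes a Linux
# /proc/meminfo with MemTotal and SwapTotal lines reported in kB:

```shell
# Print physical RAM plus swap in MB, read from a meminfo-style file.
mem_plus_swap_mb() {
    awk '/^(MemTotal|SwapTotal):/ { kb += $2 } END { printf "%d\n", kb / 1024 }' \
        "${1:-/proc/meminfo}"
}

required_mb=2560   # the 2.5GB total suggested above
if [ -r /proc/meminfo ]; then
    total_mb=$(mem_plus_swap_mb)
    if [ "$total_mb" -lt "$required_mb" ]; then
        echo "Only ${total_mb}MB of RAM+swap; add swap before building." >&2
    fi
fi
```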
# The -DNDEBUG option in CFLAGS compiles out assert(3), which also increases
# performance. Variables used only in assertions then trigger unused-variable
# warnings, so --disable-werror is required.
# For reasons I didn't explore I couldn't get --enable-shared to work with
# --enable-intermodule. libgcc.so will still be built, but not the Binutils
# shared libraries. This is unfortunate, but will have no adverse performance
# effects, and I'm pretty sure you won't notice it. It means libbfd.a will be
# statically linked into each Binutils application, making them slightly
# larger, upwards of 480KB larger depending on how much of the library goes
# unused.
../configure --prefix=/usr \
--libexecdir=/usr/lib --enable-clocale=gnu \
--enable-threads=posix --enable-__cxa_atexit \
--disable-werror --disable-checking \
--with-cpu=prescott \
--enable-bootstrap --enable-intermodule
# Use nice(1) so your system isn't a snail while you're using 2GB of swap:
time { nice make tooldir=/usr profiledbootstrap 2>&1 | tee make.log ; }
# The following are build and test suite times of various configurations.
# The test suite times hopefully represent run-time performance. They depend
# highly on the host system, but the host system components don't change
# between runs, so the comparison should be fair. I had zero "unexpected
# failures" from all tests, but 'make CFLAGS="" CXXFLAGS="" -k check' needs to
# be used to reset the CFLAGS because -O3 causes some failures in Binutils.
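# One way to tally "unexpected failures" after 'make -k check', as a sketch.
# It assumes the test suites leave DejaGnu *.sum files in the build tree with
# "# of unexpected failures" summary lines. The count_unexpected helper is
# hypothetical:

```shell
# Sum the "# of unexpected failures" counts from all *.sum files under a dir.
count_unexpected() {
    find "${1:-.}" -name '*.sum' -exec grep -h '^# of unexpected failures' {} + |
        awk '{ n += $NF } END { print n + 0 }'
}

count_unexpected .
```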
# My SBU, for Binutils-alone without any set CFLAGS, is:
# real 2m53.189s (173 seconds)
# user 2m8.530s
# sys 0m30.220s
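# The SBU figures in this post are wall-clock seconds divided by this 173
# second baseline. A small sketch of that arithmetic; the helper names are
# hypothetical, and the result is truncated (not rounded) to one decimal,
# which matches the figures quoted here:

```shell
# Convert a "real" time such as 2m53.189s to whole seconds (fraction dropped).
to_seconds() {
    echo "$1" | awk -F'[ms]' '{ printf "%d\n", $1 * 60 + $2 }'
}

# Convert seconds to SBU, given the baseline in seconds (default 173).
to_sbu() {
    awk -v s="$1" -v u="${2:-173}" 'BEGIN { printf "%.1f\n", int(s / u * 10) / 10 }'
}

to_seconds 2m53.189s
to_sbu "$(to_seconds 298m37.739s)"
```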
# Build 1:
# Build time of Butterfly is 103.5 SBU:
# real 298m37.739s (17917 seconds)
# user 58m41.080s
# sys 5m6.360s
#
# Time to run the test suite is 19.6 SBU:
# real 56m41.453s (3401 seconds)
# user 46m28.930s
# sys 9m37.490s
# Build 2:
# Build time of Butterfly without '--enable-intermodule' is 16.5 SBU:
# real 47m49.584s (2869 seconds)
# user 42m40.570s
# sys 4m0.770s
#
# Time to run the test suite is 20.0 SBU:
# real 57m55.046s (3475 seconds)
# user 47m13.770s
# sys 9m54.990s
# Build 3:
# Build time of Butterfly without '--enable-intermodule' and with 'bootstrap'
# instead of 'profiledbootstrap' is 12.0 SBU:
# real 34m38.662s (2078 seconds)
# user 30m22.580s
# sys 3m30.010s
#
# Time to run the test suite is 20.6 SBU:
# real 59m33.658s (3573 seconds)
# user 48m56.940s
# sys 9m58.580s
# The performance results of a normal bootstrap and a non-bootstrap build
# should be exactly the same, so the latter is not noted here.
# Results:
# Build number 3 is the vanilla base-line time.
#
# The formula, applied to the test suite times, is:
# 100 minus (comparison-time divided by baseline-time, multiplied by 100)
#
# Build number 2 performs 3% better compared to build number 3.
# Build number 1 performs 5% better compared to build number 3.
#
# So '--enable-intermodule --enable-bootstrap && make profiledbootstrap' wins,
# but only by 5%.
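# The percentages above can be reproduced from the test suite times (in
# seconds) with a one-line helper. percent_better is a hypothetical name; it
# reports how much less time the comparison build took than the baseline:

```shell
percent_better() {
    awk -v c="$1" -v b="$2" 'BEGIN { printf "%.1f\n", 100 - (c / b * 100) }'
}

percent_better 3475 3573   # build 2 vs build 3
percent_better 3401 3573   # build 1 vs build 3
```

# These print 2.7 and 4.8, which round to the 3% and 5% quoted above.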
#
# Install with:
make tooldir=/usr install
#
robert
