-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/1171/
-----------------------------------------------------------

(Updated July 2, 2012, 5:56 a.m.)


Review request for Default.


Description (updated)
-------

Changeset 9086:faeddb9fb678
---------------------------
gcc: Enable Link-Time Optimization for gcc >= 4.6

This patch adds a scons flag to indicate that compilation and linking
should be done using LTO. No check is performed to verify that the
linker supports LTO and the linker plugin, so the user has to ensure
that GNU ld (binutils) >= 2.21 or the gold linker is available.
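
For anyone who wants to do the linker check by hand, here is a rough
sketch of what it could look like. It is not part of this patch. The
function name linker_supports_lto is made up for illustration, and it
assumes the first line of the "ld --version" output starts with
"GNU ld" or "GNU gold" and that the version follows any parenthesised
distribution text. Any gold linker is accepted, in line with the
requirement stated above.

```python
import re


def linker_supports_lto(version_output):
    """Guess from `ld --version` output whether the linker is new enough.

    Hypothetical helper, not part of the patch. Accepts any gold linker,
    or GNU ld with a version of at least 2.21.
    """
    lines = version_output.splitlines()
    first_line = lines[0] if lines else ""
    if first_line.startswith("GNU gold"):
        return True
    if not first_line.startswith("GNU ld"):
        return False
    # Drop parenthesised distribution text such as "(GNU Binutils) "
    # so that version numbers inside it are not picked up by mistake.
    stripped = re.sub(r"\([^)]*\)", "", first_line)
    match = re.search(r"(\d+)\.(\d+)", stripped)
    if match is None:
        return False
    return (int(match.group(1)), int(match.group(2))) >= (2, 21)
```

The string to pass in would typically come from running the linker
with --version, e.g. through subprocess, which is left out here to
keep the sketch free of side effects.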

The parallel phase of LTO uses the same number of jobs as specified
on the scons command line, through the -flto=n flag that was
introduced with gcc 4.6. Supposedly the gold linker also supports
concurrent and incremental linking, but this is not used at this
point.
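
As an illustration of the job-count handling, a minimal sketch follows.
It is not the code in the diff. The function lto_flags and its
parameters are hypothetical, and passing -fuse-linker-plugin at link
time is an assumption based on the linker plugin mentioned above.

```python
def lto_flags(gcc_version, num_jobs):
    """Return (compile_flags, link_flags) for an LTO build.

    Hypothetical helper: gcc_version is a tuple such as (4, 6) and
    num_jobs is the job count given to scons with -j.
    """
    if gcc_version < (4, 6):
        # -flto=n is only available from gcc 4.6, so add nothing.
        return [], []
    flto = "-flto=%d" % max(1, num_jobs)
    # The same flag is given at compile and link time; the linker
    # plugin flag (an assumption here) is only needed when linking.
    return [flto], [flto, "-fuse-linker-plugin"]


compile_flags, link_flags = lto_flags((4, 6), 8)
print(compile_flags, link_flags)
```

In an SConstruct the two lists would typically be appended to CCFLAGS
and LINKFLAGS of the build environment.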

Currently the LTO option is only useful for gcc >= 4.6, due to the
limited support on clang and earlier versions of gcc. The intention is
to also add support for clang once the LTO integration matures. The
use of LTO is independent of the target, i.e. debug, opt, fast and
prof, although opt and fast are the most likely candidates.

The compilation and linking time is increased by almost 50% on
average, although ARM seems to be particularly demanding with an
increase of almost 100%. Also beware that gcc uses a tremendous
amount of memory and temp space in the process. You have been warned.

When it comes to the return on investment, the regressions seem to
run roughly 15% faster with LTO. For a bit more detail, I ran twolf
on ARM.fast three times, and the runs all finish in about 42 minutes
(+- 25 seconds) without LTO and about 31 minutes (+- 25 seconds) with
LTO, i.e. LTO reduces the run time by an impressive >25% for this
case.

Without LTO (ARM.fast twolf)

real    42m37.632s
user    42m34.448s
sys     0m0.390s

real    41m51.793s
user    41m50.384s
sys     0m0.131s

real    41m45.491s
user    41m39.791s
sys     0m0.139s

With LTO (ARM.fast twolf)

real    30m33.588s
user    30m5.701s
sys     0m0.141s

real    31m27.791s
user    31m24.674s
sys     0m0.111s

real    31m25.500s
user    31m16.731s
sys     0m0.106s
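
To make the arithmetic behind the >25% figure explicit, the "real"
times above can be averaged with a few lines of Python. This is only
a sketch for checking the numbers; the helper to_seconds is made up
for this purpose.

```python
def to_seconds(stamp):
    """Convert a `time` style stamp such as '42m37.632s' to seconds."""
    minutes, seconds = stamp.rstrip("s").split("m")
    return int(minutes) * 60 + float(seconds)


# "real" times of the three ARM.fast twolf runs listed above.
without_lto = [to_seconds(t) for t in ("42m37.632s", "41m51.793s", "41m45.491s")]
with_lto = [to_seconds(t) for t in ("30m33.588s", "31m27.791s", "31m25.500s")]

mean_without = sum(without_lto) / len(without_lto)
mean_with = sum(with_lto) / len(with_lto)

# Fraction of run time saved, and the equivalent speed-up factor.
reduction = 1.0 - mean_with / mean_without
speedup = mean_without / mean_with

print("mean without LTO: %.1f s" % mean_without)
print("mean with LTO:    %.1f s" % mean_with)
print("run time reduction: %.1f%%" % (100.0 * reduction))
print("speed-up factor:    %.2fx" % speedup)
```

With these six samples the mean run time drops by about 26%, which
corresponds to a speed-up factor of about 1.35.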


Diffs (updated)
-----

  SConstruct 5f0321c03a26 
  src/SConscript 5f0321c03a26 

Diff: http://reviews.gem5.org/r/1171/diff/


Testing
-------

util/regress all passing (disregarding t1000 and eio)


Thanks,

Andreas Hansson

_______________________________________________
gem5-dev mailing list
[email protected]
http://m5sim.org/mailman/listinfo/gem5-dev
