-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/1171/#review3002
-----------------------------------------------------------


Thanks... this is a pretty impressive speedup.

You should mention in the commit message that you reworked how flags are 
handled in order to make it easier to create this option.  At first I was 
thinking "gee, all this cleanup of the arg lists should go in a separate 
patch", then I realized that you actually needed to do that to have an 
independent flag.


SConstruct
<http://reviews.gem5.org/r/1171/#comment3224>

    I don't think I want to be nagged about this.  Would it make sense to
    turn it on by default for opt, fast, and prof (or maybe just fast), and
    then have a --no-lto or --lto=false option to disable it?
    
    


- Steve Reinhardt


On July 2, 2012, 5:56 a.m., Andreas Hansson wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> http://reviews.gem5.org/r/1171/
> -----------------------------------------------------------
> 
> (Updated July 2, 2012, 5:56 a.m.)
> 
> 
> Review request for Default.
> 
> 
> Description
> -------
> 
> Changeset 9086:faeddb9fb678
> ---------------------------
> gcc: Enable Link-Time Optimization for gcc >= 4.6
> 
> This patch adds a scons flag to indicate that compilation and linking
> should be done using LTO. No check is performed to guarantee that the
> linker supports LTO and use of the linker plugin, so the user has to
> ensure that binutils GNU ld >= 2.21 or the gold linker is available.
> 
> The same number of jobs is used for the parallel phase of LTO as the
> jobs specified on the scons command line, using the -flto=n flag that
> was introduced with gcc 4.6. Supposedly the gold linker also supports
> concurrent and incremental linking, but this is not used at this
> point.
> 
> Currently the LTO option is only useful for gcc >= 4.6, due to the
> limited support on clang and earlier versions of gcc. The intention is
> to also add support for clang once the LTO integration matures. The
> use of LTO is independent of the target, i.e. debug, opt, fast and
> prof, although opt and fast are the most likely candidates.
> 
> The compilation and linking time is increased by almost 50% on
> average, although ARM seems to be particularly demanding with an
> increase of almost 100%. Also beware when using this as gcc uses a
> tremendous amount of memory and temp space in the process. You have
> been warned.
> 
> When it comes to the return on investment, the regressions seem to run
> roughly 15% faster with LTO. For a bit more detail, I ran twolf on
> ARM.fast three times: every run finishes in about 42 minutes (+- 25
> seconds) without LTO and about 31 minutes (+- 25 seconds) with LTO,
> i.e. LTO gives an impressive >25% speed-up for this case.
> 
> Without LTO (ARM.fast twolf)
> 
> real  42m37.632s
> user  42m34.448s
> sys   0m0.390s
> 
> real  41m51.793s
> user  41m50.384s
> sys   0m0.131s
> 
> real  41m45.491s
> user  41m39.791s
> sys   0m0.139s
> 
> With LTO (ARM.fast twolf)
> 
> real  30m33.588s
> user  30m5.701s
> sys   0m0.141s
> 
> real  31m27.791s
> user  31m24.674s
> sys   0m0.111s
> 
> real  31m25.500s
> user  31m16.731s
> sys   0m0.106s
> 
> 
> Diffs
> -----
> 
>   SConstruct 5f0321c03a26 
>   src/SConscript 5f0321c03a26 
> 
> Diff: http://reviews.gem5.org/r/1171/diff/
> 
> 
> Testing
> -------
> 
> util/regress all passing (disregarding t1000 and eio)
> 
> 
> Thanks,
> 
> Andreas Hansson
> 
>
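
The >25% figure quoted above can be checked against the six "real" times in
the request. The following is a small sketch, not part of the patch; the
helper name to_seconds is made up for this example, and the input strings
are copied from the timings listed above.

```python
import re


def to_seconds(stamp):
    """Convert a `time`-style stamp such as '42m37.632s' to seconds."""
    match = re.fullmatch(r"(\d+)m([\d.]+)s", stamp)
    if match is None:
        raise ValueError("unexpected time format: %r" % stamp)
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


# "real" times for ARM.fast twolf, copied from the review request.
without_lto = ["42m37.632s", "41m51.793s", "41m45.491s"]
with_lto = ["30m33.588s", "31m27.791s", "31m25.500s"]

mean_without = sum(to_seconds(t) for t in without_lto) / len(without_lto)
mean_with = sum(to_seconds(t) for t in with_lto) / len(with_lto)

# Fraction of wall-clock time saved, and the equivalent speed-up factor.
reduction = 1.0 - mean_with / mean_without
speedup = mean_without / mean_with

if __name__ == "__main__":
    print("mean without LTO: %.1f s" % mean_without)
    print("mean with LTO:    %.1f s" % mean_with)
    print("time reduction:   %.1f%%" % (100 * reduction))
    print("speed-up factor:  %.2fx" % speedup)
```

The mean run time drops by roughly 26%, which is consistent with the ">25%"
claim; expressed as a speed-up factor it is about 1.35x.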

_______________________________________________
gem5-dev mailing list
[email protected]
http://m5sim.org/mailman/listinfo/gem5-dev
