> On July 2, 2012, 7:30 a.m., Steve Reinhardt wrote:
> > SConstruct, line 521
> > <http://reviews.gem5.org/r/1171/diff/3/?file=27538#file27538line521>
> >
> >     I don't think I want to be nagged about this.  Does it make sense just
> >     to turn it on by default for opt, fast, and prof (or maybe just fast), then
> >     have a --no-lto or --lto=false option to disable it?
>
> Andreas Hansson wrote:
>     Given the extra time needed to link, I'm inclined to leave it as a
>     default-off option. An intermediate step would be to base the choice on an
>     environment variable if present...

That's a reasonable concern. I'm mostly concerned about the nag message; it will be annoying to see it every time I compile if I've already decided not to use LTO, particularly in a situation like debug where I almost certainly don't want it. But that leaves the question of getting people to use LTO if it's not on by default and there's no message.

One counter-argument for you is that people compiling gem5.fast have already indicated that they want as-fast-as-possible execution at the expense of everything else, so it seems reasonable to me to enable LTO there by default when it applies. By extension, opt has so far been seen as "fast with seatbelts" (in my mind anyway), so that argues that LTO should be on there too. And if you're profiling, you really want to get profile information for the fastest binary, otherwise you could be looking at artificial bottlenecks. So it's a slippery slope, I admit.

I'm not convinced one way or another myself, just trying to reason through the issue. There are hybrid answers possible too, like turning it on by default only in fast and prof, reminding people that it's an option in opt, and saying nothing in debug. I'm not sure how easy it is to code that up in the SConscript though.

Or if you feel strongly about how you're doing it now, you can tell me I'm a wimp for getting annoyed at repetitive output messages...

- Steve

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/1171/#review3002
-----------------------------------------------------------

On July 2, 2012, 5:56 a.m., Andreas Hansson wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> http://reviews.gem5.org/r/1171/
> -----------------------------------------------------------
>
> (Updated July 2, 2012, 5:56 a.m.)
>
>
> Review request for Default.
>
>
> Description
> -------
>
> Changeset 9086:faeddb9fb678
> ---------------------------
> gcc: Enable Link-Time Optimization for gcc >= 4.6
>
> This patch adds a scons flag to indicate that compilation and linking
> should be done using LTO. No check is performed to guarantee that the
> linker supports LTO and use of the linker plugin, so the user has to
> ensure that binutils GNU ld >= 2.21 or the gold linker is available.
>
> The same number of jobs is used for the parallel phase of LTO as the
> jobs specified on the scons command line, using the -flto=n flag that
> was introduced with gcc 4.6. Supposedly the gold linker also supports
> concurrent and incremental linking, but this is not used at this
> point.
>
> Currently the LTO option is only useful for gcc >= 4.6, due to the
> limited support on clang and earlier versions of gcc. The intention is
> to also add support for clang once the LTO integration matures. The
> use of LTO is independent of the target, i.e. debug, opt, fast and
> prof, although opt and fast are the most likely candidates.
>
> The compilation and linking time is increased by almost 50% on
> average, although ARM seems to be particularly demanding with an
> increase of almost 100%. Also beware when using this as gcc uses a
> tremendous amount of memory and temp space in the process. You have
> been warned.
>
> When it comes to the return on investment, the regression seems to be
> roughly 15% faster with LTO. For a bit more detail, I ran twolf on
> ARM.fast, with three repeated runs, and they all finish within 42
> minutes (+- 25 seconds) without LTO and 31 minutes (+- 25 seconds)
> with LTO, i.e. LTO gives an impressive >25% speed-up for this case.
>
> Without LTO (ARM.fast twolf)
>
> real    42m37.632s
> user    42m34.448s
> sys     0m0.390s
>
> real    41m51.793s
> user    41m50.384s
> sys     0m0.131s
>
> real    41m45.491s
> user    41m39.791s
> sys     0m0.139s
>
> With LTO (ARM.fast twolf)
>
> real    30m33.588s
> user    30m5.701s
> sys     0m0.141s
>
> real    31m27.791s
> user    31m24.674s
> sys     0m0.111s
>
> real    31m25.500s
> user    31m16.731s
> sys     0m0.106s
>
>
> Diffs
> -----
>
>   SConstruct 5f0321c03a26
>   src/SConscript 5f0321c03a26
>
> Diff: http://reviews.gem5.org/r/1171/diff/
>
>
> Testing
> -------
>
> util/regress all passing (disregarding t1000 and eio)
>
>
> Thanks,
>
> Andreas Hansson
>

_______________________________________________
gem5-dev mailing list
http://m5sim.org/mailman/listinfo/gem5-dev
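
For anyone who wants to check the ">25% speed-up" figure against the quoted twolf runs, here is a minimal sketch of the arithmetic. The "real" times are copied from the listings in the review request; the variable and function names are only illustrative, and "speed-up" is computed two ways because the description does not say which one is meant.

```python
# Wall-clock ("real") times from the quoted ARM.fast twolf runs,
# written as (minutes, seconds).
WITHOUT_LTO = [(42, 37.632), (41, 51.793), (41, 45.491)]
WITH_LTO = [(30, 33.588), (31, 27.791), (31, 25.500)]


def mean_seconds(runs):
    """Average a list of (minutes, seconds) timings, in seconds."""
    return sum(m * 60 + s for m, s in runs) / len(runs)


base = mean_seconds(WITHOUT_LTO)
lto = mean_seconds(WITH_LTO)

# Two common readings of "speed-up":
time_saved = 1 - lto / base   # fraction of the run time that disappears
speedup = base / lto          # how many times faster the LTO binary runs

print(f"mean without LTO: {base:.1f} s")
print(f"mean with LTO:    {lto:.1f} s")
print(f"run time saved:   {time_saved:.1%}")
print(f"speed-up factor:  {speedup:.2f}x")
```

With these six runs the time saved comes out at roughly 26%, which matches the ">25%" claim if it is read as a reduction in run time; read as a throughput ratio, the same data gives about 1.35x.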

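
On the question of how hard the hybrid answer is to code up: a rough sketch in plain Python (the language the SConstruct is written in) might look like the following. This is not the patch under review. The names `DEFAULT_POLICY`, `resolve_lto` and `lto_flags` are hypothetical, `user_choice` stands in for whatever an explicit `--no-lto` / `--lto=false` style option would yield, and the only compiler flag used is the `-flto=n` form mentioned in the description.

```python
# Hypothetical per-variant defaults, following the hybrid idea from the
# thread: on for fast and prof, a reminder for opt, silence for debug.
DEFAULT_POLICY = {
    "debug": "off",   # LTO off, print nothing
    "opt": "hint",    # LTO off, but remind once that it is available
    "fast": "on",     # LTO on by default
    "prof": "on",     # LTO on by default
}


def resolve_lto(variant, user_choice=None):
    """Return (enabled, remind) for a build variant.

    user_choice is True or False when the user gave an explicit option,
    and None when nothing was specified. An explicit choice always wins
    and suppresses the reminder.
    """
    if user_choice is not None:
        return user_choice, False
    policy = DEFAULT_POLICY.get(variant, "off")
    return policy == "on", policy == "hint"


def lto_flags(enabled, jobs):
    """Build the gcc LTO flag, reusing the scons job count for -flto=n."""
    return ["-flto=%d" % jobs] if enabled else []


for variant in ("debug", "opt", "fast", "prof"):
    enabled, remind = resolve_lto(variant)
    print(variant, lto_flags(enabled, jobs=4), "(reminder)" if remind else "")
```

The point of routing everything through one table is that the default for any variant can be changed in one place, and an explicit user choice silences the message entirely, which addresses the nagging concern without dropping the hint for opt.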