On Wed, 2014-10-01 at 21:54 -0400, Michael Goulish wrote:
> Link-time optimization can be turned on by adding the
> -flto flag to the proton library build, in both compilation
> and linking steps. It offers the possibility of optimizations
> using deeper knowledge of the whole program than is available
> before linking.

Now that you've got LTO going you could try PGO (profile-guided
optimisation); it's supposed to be good! It does take a significant
amount of work to set up though, because you need to profile a
representative workload and then optimise the code again based on the
results of that run.

So the flow is more like:

1. Build with appropriate options (probably still LTO).
2. Run a profiling run.
3. Build again, feeding the results of the profile run to the optimiser.

> ...So! The LTO technology really works, but it's not as
> good as manual inlining based on profiling. In fact
> it slows that down a little, probably because it is choosing
> some inlining candidates that don't help enough to offset
> cache thrash due to code size increase.

You could perhaps test this by using "-Os -flto" rather than
"-O2 -flto"; that should attempt to keep the size of the code down.

Also, the gcc 4.8 versions apparently have better inlining heuristics
than the 4.7 versions (and the 4.9 versions will be better still).

Andrew
