The topic of the usefulness of various -Ox optimizations comes up every so often. This will likely be old news to Gentoo old hands, but it should be useful for newbies, at least. Anyway, I found this commentary by Linus interesting:
http://permalink.gmane.org/gmane.linux.kernel/344440

The thread wanders quite a bit, but this subthread started with someone displaying a listing of their kernels ordered by size back thru 2.2.x, noting that the size had increased from ~ half a MB back then to ~ 1.5 MB today. Somebody then asked (among other things) if the compiler used was the same, listing the size of the /same/ kernel compiled with gcc-2.95.x against the gcc-4.1.0-snapshot they were running. The 4.1 compiled kernel was MUCH larger.

Someone replied that the comparison wasn't fair -- if you don't ask gcc to optimize for size, don't blame it for not doing so -- and proceeded to compare kernels compiled with 4.1 and normal options against those compiled with the kernel's "embedded" option turned on, answering the resulting "make oldconfig" prompts appropriately. Among other things, he added -Os to the kernel's gcc command line, and he probably dropped symbols as well. What else he chose from the options he didn't say, but the resulting core kernel (allnoconfig) ended up /very/ similar in size to the old 2.5 kernel he compared against (I believe the first one to have allnoconfig as an option).

A few posts down, Linus then posted this comment, from the bottom of the linked post above:

<quote>
And we should probably make -Os the default. Apparently Fedora already does that by just forcibly hacking the Kconfig files.

With modern CPU's, instructions are almost "free". The real cost is in cache misses, and that tends to be doubly true of system software that tends to have a lot more cache misses than "normal" programs (because people try hard to batch up system calls like write etc, so by the time the kernel is called, the L1 cache is mostly flushed already - possibly the L2 too. And interrupts may be in the "fast path", but they'd sure as hell better not happen so often that they stay cached very well etc etc).

So -Os probably performs better in real life, and likely only performs worse on micro-benchmarks. Sadly, micro-benchmarks are often very instructive in many other ways.
</quote>

(Dave Jones later confirms that Fedora does indeed normally run -Os.)

(If you want to view the entire thread, use the "Go to the topic" link near the top left of the page.)

So... certainly for kernel and probably for glibc stuff (tho I believe Gentoo kills -Os on glibc compiles unless you hack out that portion of the ebuild, in your own overlay or whatever), -Os is likely to be the best choice. For most of userland, -Os may be best, but to a smaller degree, and performance should be similar with -O2, trading size for medium speed optimizations in what amounts to a wash. The exceptions would likely be media encoders and the like, where the working set is large and in a data streaming environment, and -O3 may make sense. In the general case, however, -O3 likely does NOT make sense, because it's almost always so expensive in size that the gains in speed over -O2 are far outweighed. IMO (and for the kernel anyway, Linus's as well)...

-- 
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman in
http://www.linuxdevcenter.com/pub/a/linux/2004/12/22/rms_interview.html

-- 
[email protected] mailing list
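
P.S. For anyone who wants to check the size half of the -Os / -O2 / -O3 tradeoff on their own code, here is a rough sketch. It is not from the thread. It assumes gcc and the binutils `size` tool are on the PATH, and the helper names (`parse_text_size`, `text_sizes`, `format_report`) are made up for illustration. It only measures code size; it says nothing about cache misses or actual speed, which is the part Linus is really arguing about.

```python
import subprocess
import tempfile
from pathlib import Path

# The three optimization levels discussed above.
LEVELS = ["-Os", "-O2", "-O3"]


def parse_text_size(size_output: str) -> int:
    """Pull the 'text' (code) column out of default Berkeley-format `size` output."""
    lines = [ln for ln in size_output.splitlines() if ln.strip()]
    header = lines[0].split()
    values = lines[1].split()
    return int(values[header.index("text")])


def text_sizes(source: Path, cc: str = "gcc") -> dict:
    """Compile one C file at each level and return {level: text bytes}.

    Assumes `cc` and `size` are installed; raises CalledProcessError
    if either command fails.
    """
    sizes = {}
    with tempfile.TemporaryDirectory() as tmp:
        for level in LEVELS:
            obj = Path(tmp) / f"out{level}.o"
            subprocess.run(
                [cc, level, "-c", str(source), "-o", str(obj)], check=True
            )
            result = subprocess.run(
                ["size", str(obj)], check=True, capture_output=True, text=True
            )
            sizes[level] = parse_text_size(result.stdout)
    return sizes


def format_report(sizes: dict) -> list:
    """Express each level's code size relative to -O2 (assumes it is nonzero)."""
    base = sizes["-O2"]
    return [
        f"{level}: {n} bytes of text ({n / base:.2f}x -O2)"
        for level, n in sizes.items()
    ]
```

To try it, something like `print("\n".join(format_report(text_sizes(Path("foo.c")))))` should do, where foo.c is whatever source file you care about. A single object file is a crude proxy for a whole kernel or package, so treat the ratios as a hint, not a benchmark.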
