Re: Old compiler versions (was Re: v4.10: kernel stack frame pointer .. has bad value (null))
On Thu, Mar 9, 2017 at 2:49 AM, Pavel Machek wrote:
>
> (On thinkpad X220, compiling bzip2)

You really shouldn't assume that the zlib build tracks the kernel build.

At least at some point, a noticeable part of the build cost for the
kernel was just parsing the fairly big source code. We have honking
big include files and deep nesting, and there is a lot of preprocessor
and just general parsing overhead for stuff that in most files doesn't
even generate code. All those inline functions and type declarations
for things that then aren't actually used in most files means that you
spend relatively more time just parsing files than you spend on
generating and optimizing code.

So the trade-offs between different projects can be very different.
Some projects have huge tables with static initializers that gcc at
some point had serious quadratic-time issues with, and other code has
big functions where the actual optimization phase is the bulk of it.

And some projects have a lot of big and nested include files. It's not
nearly as bad as some C++ projects (where the header file mess can
often _easily_ be the dominant factor by far), but it's still
potentially completely different from something like building zlib.

Oh, and don't even bother looking at -O0 times. That's almost purely
parsing, but more importantly, the kernel has never in its lifetime
built without optimizations. We basically rely on the compiler not
being moronic crap. Always have, always will.

> Unfortunately, 4.11-rc1 fails to compile on gcc 3.3.5.
>
>> 1. None (CC_STACKPROTECTOR_NONE) (NEW)
>
> is needed. Easy. But then I get
>
>   AS      arch/x86/entry/entry_32.o
> arch/x86/entry/entry_32.S: Assembler messages:
> arch/x86/entry/entry_32.S:440: Error: invalid character '"' in
> operand 1
>
> from the ALTERNATIVE macro. It seems 3.3 just does not like " in macro
> arguments.

Ok. Clearly our checks in are outdated, and we "allow" compilers that
don't actually work.

> But that looks fixable. But when I force the compilation, it is
> actually _slower_ than recent gcc (23 minutes vs. 13
> minutes). Interesting.

I forget when gcc got the "integrated preprocessor". It's a long time
ago. But that actually sped things up, because it basically halves (or
more) the overhead of parsing. With an external preprocessor you
obviously first have cpp doing its parsing, then writing the
preprocessed results out, and then you have cc1 doing parsing again.

So yes, gcc has gotten a lot slower over time, but some things have
actually improved.

              Linus
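
A minimal sketch of the kind of version gate discussed in this thread, written in Python only for illustration. The kernel's real checks live in its own headers and build scripts and are not reproduced here. The threshold `MIN_GCC` is an assumption taken from the gcc-4.6.0 figure quoted in the original message as the first release with the fentry model; `parse_gcc_version` and `is_supported` are hypothetical helper names.

```python
import re

# Assumed minimum, taken from the "fentry model is gcc-4.6.0 and up"
# remark quoted in this thread. Not the kernel's actual requirement.
MIN_GCC = (4, 6, 0)


def parse_gcc_version(text):
    """Return (major, minor, patch) from a string such as '4.9.2-10'.

    A missing patch level is treated as 0; any distro suffix after the
    numeric part (for example '-10') is ignored.
    """
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", text.strip())
    if match is None:
        raise ValueError(f"unrecognised gcc version: {text!r}")
    major, minor, patch = match.groups()
    return (int(major), int(minor), int(patch or 0))


def is_supported(text, minimum=MIN_GCC):
    """True if the version string is at least the assumed minimum."""
    return parse_gcc_version(text) >= minimum


# The two compiler versions that appear in the timing table of the
# original message.
for version in ("3.3.5-13", "4.9.2-10"):
    verdict = "ok" if is_supported(version) else "too old"
    print(f"gcc {version}: {verdict}")
```

Comparing tuples rather than strings avoids the classic mistake of ordering "4.10" before "4.9".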
Old compiler versions (was Re: v4.10: kernel stack frame pointer .. has bad value (null))
Hi!

> > >  - CONFIG_FUNCTION_GRAPH_TRACER sets it on x86-32 because of a gcc bug
> > >    where the stack gets aligned before the mcount call.  This issue
> > >    should be mostly obsolete as most modern compilers now have -mfentry.
> > >    We could make it dependent on CC_USING_FENTRY.
> >
> > Yeah. At some point we might even upgrade the compiler requirements to
> > no longer accept the mcount model.
> >
> > I think the fentry model is gcc-4.6.0 and up. Currently I guess we
> > support gcc-3.2+, which is fairly ridiculous considering that 4.6.0 is
> > from March, 2011. So it's over five years ago already.
> >
> > gcc-3.2.0 is from 2002, I think. At some point you just have to say
> > "caring about a 15 year old compiler is ridiculous"
> >
> > The main reason we have fairly aggressively supported old compilers
> > tends to be some odder architectures that don't have good support, so
> > people use various random "this works for me" versions.
> >
> > We could easily make the gcc version checks much more strict on x86,
> > I suspect.
>
> Well, I have fast CPUs, but most of the time they just compile
> stuff. Especially bisect is compile-heavy. I suspect going back to
> gcc-3.2 would bring me bigger advantages than CPU upgrade...

Okay, wouldn't it be nice if we supported gcc-3.3? It compiles at about
twice the speed of gcc-4.9, across the board:

(If we could compile at -O1, we'd get 4 times the speed. At -O0, we'd
be at circa 9 times the speed; that would be useful for a bisect!)

Good news is that -Os is quite significantly faster than -O2 (and
already supported), so that should be a simple way to optimize bisect
performance.
(On thinkpad X220, compiling bzip2)

| mach | gcc      | opt | file           | real  | user  | sys   |
|------+----------+-----+----------------+-------+-------+-------|
| x220 | 4.9.2-10 | -O0 | bzip2.c caf036 | 0.644 | 0.54  | 0.03  |
|      |          | -O1 |                | 1.501 |       |       |
|      |          | -O2 |                | 2.607 |       |       |
|      |          | -O3 |                | 3.052 |       |       |
|      |          | -Os |                | 1.839 |       |       |
|      | 3.3.5-13 | -O0 |                | 0.343 | 0.300 | 0.028 |
|      |          | -O1 |                | 0.721 |       |       |
|      |          | -O2 |                | 1.238 |       |       |
|      |          | -O3 |                | 1.598 | 1.508 | 0.032 |

Unfortunately, 4.11-rc1 fails to compile on gcc 3.3.5.

> 1. None (CC_STACKPROTECTOR_NONE) (NEW)

is needed. Easy. But then I get

  AS      arch/x86/entry/entry_32.o
arch/x86/entry/entry_32.S: Assembler messages:
arch/x86/entry/entry_32.S:440: Error: invalid character '"' in operand 1

from the ALTERNATIVE macro. It seems 3.3 just does not like " in macro
arguments.

arch/x86/boot/bioscall.S: Assembler messages:
arch/x86/boot/bioscall.S:68: Error: `68(%esp)' is not a valid 16 bit base/index expression

Plus I get about a million of

                 from fs/fs-writeback.c:23:
include/linux/irq.h:419: warning: parameter has incomplete type
include/linux/irq.h:420: warning: parameter has incomplete type

... and a problem with builtin_ffs in drm_blend.c, and others with
function alignment in drm. lzo1x_compress needs __builtin_ctz.

In the end, compilation fails with

mm/built-in.o(.text+0x2b714): In function `do_set_pmd':
: undefined reference to `__compiletime_assert_3034'
mm/built-in.o(.text+0x2c09a): In function `create_huge_pmd':
: undefined reference to `do_huge_pmd_anonymous_page'
mm/built-in.o(.text+0x2c0ca): In function `wp_huge_pmd':
: undefined reference to `do_huge_pmd_wp_page'
drivers/built-in.o(.text+0xe5a2b): In function `cea_mode_alternate_timings':
: undefined reference to `__compiletime_assert_2638'
drivers/built-in.o(.text+0x3c969f): In function `sg_ioctl':
: undefined reference to `__divdi3'

But that looks fixable. But when I force the compilation, it is
actually _slower_ than recent gcc (23 minutes vs. 13
minutes). Interesting.
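
To make the speed comparison concrete, here is a small sketch that recomputes the ratios from the "real" column of the bzip2 timing table in this message. It assumes the figures are wall-clock seconds and that the baseline for the "times the speed" estimates is gcc 4.9.2 at -O2; neither is stated explicitly in the message. The names `REAL_SECONDS` and `speedup` are made up for the example.

```python
# "real" times for compiling bzip2.c, copied from the table in this
# message. Assumed to be seconds of wall-clock time.
REAL_SECONDS = {
    "4.9.2": {"-O0": 0.644, "-O1": 1.501, "-O2": 2.607, "-O3": 3.052, "-Os": 1.839},
    "3.3.5": {"-O0": 0.343, "-O1": 0.721, "-O2": 1.238, "-O3": 1.598},
}


def speedup(baseline_seconds, candidate_seconds):
    """How many times faster the candidate is than the baseline."""
    return baseline_seconds / candidate_seconds


# Assumed baseline: the newer compiler at the usual -O2 setting.
baseline = REAL_SECONDS["4.9.2"]["-O2"]

for gcc, flag in [("3.3.5", "-O2"), ("3.3.5", "-O1"), ("3.3.5", "-O0"), ("4.9.2", "-Os")]:
    ratio = speedup(baseline, REAL_SECONDS[gcc][flag])
    print(f"gcc {gcc} {flag}: {ratio:.1f}x the speed of gcc 4.9.2 -O2")
```

Under those assumptions the ratios come out at roughly 2.1x, 3.6x, 7.6x and 1.4x, which is in the same ballpark as the "twice", "4 times" and "circa 9 times" estimates. These are single-file bzip2 numbers, so they say nothing about a full kernel build.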
If someone knows what old gcc versions actually compile recent kernels,
I'd like to know.

Best regards,
									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html