On Mon, Dec 31, 2001 at 01:46:29AM +0100, Thierry Godefroy wrote:
> > > If you select two of them (e.g. 68040 & 68060
> > > for the Qx0), then no optimization flag is used at all (!) for the
> > > compilation (thus defaulting to a 68000-compatible code and also
> > > probably avoiding any 68000 instruction that would not work on higher
> > > processor, e.g. MOVE.W SR,Dn which is privileged on 68020+).
> >
> > not correct, we use gcc configured for linux-m68k which has DEFAULT_TARGET
> > set to 68020.
>
> I was not aware of this fact...

gcc/config/m68k/linux* - sometimes it is worth reading the mess; for
example, it is the only place to get information about calling
conventions.

> > So the effect of actually using -m68040 will be next to zero for non-FPU code.
>
> But even 68020, 68040 and 68060 have got significant differences between them
> (cache size and organization, MMU functionalities, a few instructions differ
> also, some others running more or less faster depending on the processor model,
> etc...) for a GOOD optimizing compiler (or a good assembler programmer ;-) to
> take advantage of a targeted optimization...

MMU and cache code is way beyond what any reasonable C compiler should
generate, and so is generating code for specific cache footprints. The
most striking difference is that some div.l variants are missing on the
68060 (they have to be emulated in the kernel), along with similar FPU
differences. Choosing CPU instructions depending on the CPU model made a
lot of difference for 68000-30, but 040-60 do pretty much every
reasonable instruction in the pipeline; you would have to be
extraordinarily clever to optimise anything there. Since there is no FPU
code in the kernel and 64-bit division is also pretty rare, the
resulting difference should be rather small. OTOH, code compiled with
-m68060 will run on 020-60, but in some cases much slower than generic
code.

> > The reason why selecting a single CPU variant makes the kernel faster is
> > that the differences between the m68k CPU's wrt cache, MMU, FPU and exception
> > handling are rather big, so each CPU needs its own cache and MMU handling
> > code.
>
> Sure ! BTW don't even think (or try) to run a 68040 optimized kernel on a Q60:
> it won't even initialize (because of the MMU and registers initialization at
> the very start of the kernel code) !

It's not because of the optimisation: the 68060 requires a few pieces of
code of its own.
The bus error handler is completely different, and the ilsp/fpsp package
(support for missing instructions) would be missing on a pure 040 system.

> > This code is mostly inlined and if support for more CPU variants is
> > compiled in the distinction has to happen at runtime.
>
> This is a very valid point, although a well optimized code should test for the
> processor just once and then be able to just test a flag (or even self-modify
> itself !) IOT choose the good inlined part of the cache/MMU/exception processing
> routines to run.

Indeed, we have a flag that is set by the booter code even before the
kernel boot sequence. There is some movement towards self-modifying
code, with quite a few clever ideas, but I am not in favor of it: it
will always be inferior to optimised gcc code compiled for a single
variant, so why the trouble?

Bye
Richard
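
For illustration, the difference between testing a flag at runtime and
building for a single variant can be sketched roughly as below. This is
only a sketch under assumptions, not the kernel source: BUILD_CPUS,
cpu_type, cpu_is_060() and select_cache_path() are made-up names, and the
flag is assumed to be a simple bitmask filled in once at startup.

```c
/* Hypothetical sketch; all names here are illustrative only. */

#define CPU_68020 (1u << 0)
#define CPU_68030 (1u << 1)
#define CPU_68040 (1u << 2)
#define CPU_68060 (1u << 3)

/* Set of CPU variants compiled into this build (assumed to come from
 * the build configuration; defaults to 040 + 060 here). */
#ifndef BUILD_CPUS
#define BUILD_CPUS (CPU_68040 | CPU_68060)
#endif

/* Flag assumed to be filled in once, early, from what the booter found. */
unsigned int cpu_type;

/* With a single variant built in, the test is a compile-time constant
 * and the compiler can drop the unused branch. With several variants,
 * the flag has to be tested at runtime. */
#if BUILD_CPUS == CPU_68060
#define cpu_is_060() (1)
#elif (BUILD_CPUS & CPU_68060) == 0
#define cpu_is_060() (0)
#else
#define cpu_is_060() ((cpu_type & CPU_68060) != 0)
#endif

enum cache_path { PATH_GENERIC = 0, PATH_060 = 1 };

/* Stand-in for an inlined cache/MMU routine that differs per CPU. */
enum cache_path select_cache_path(void)
{
    if (cpu_is_060())
        return PATH_060;
    return PATH_GENERIC;
}
```

Building with -DBUILD_CPUS=CPU_68060 turns the check into a constant, so
only the 060 path remains in the generated code; with the default of two
variants the check reads cpu_type on every call.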
