Hi everyone,

I'm still focused on x86 for the moment, and still looking for ways to both speed up the compiler and find new optimisations. Currently I'm building a few more principles of my "Deep Optimizer" into OptPass1MOV, and they are showing promise - I should have a patch ready tomorrow. If it's approved, I can build a few extra things on top of it, as well as remove some other optimisations that have become redundant as a result.

As for things that are ready, I discovered an extra pass that occurs after the Post-Peephole Optimization stage and is a little hidden (PostPeepHoleOpts is overridden and calls OptReferences after the 'inherited' call).  What this pass does is optimise all of the references in the instructions so that they take on a standardised form.  Because of the nature of the Post-Peephole Optimization stage (generally only converting individual instructions into more compact forms), it is very easy to optimise the current instruction's references as part of that stage, thereby removing the need for a separate OptReferences pass.  Details can be found here: https://bugs.freepascal.org/view.php?id=36583 - initial experiments show a 10% speed increase.
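To illustrate the idea of folding the reference pass into the post-peephole loop, here is a minimal Python model, not the compiler's actual Pascal code. Everything in it is an assumption made for illustration: the `Ref` and `Ins` types, the `shrink` rule (turning "add 1" into "inc"), and the particular "standardised form" used by `normalise_ref` (a lone index with scale 1 is moved into the base slot). It only shows that a per-instruction normalisation gives the same result whether it runs as a second traversal or inside the first one.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Ref:
    """Hypothetical memory reference: offset(base, index, scale)."""
    base: Optional[str]
    index: Optional[str]
    scale: int = 1
    offset: int = 0


@dataclass(frozen=True)
class Ins:
    """Hypothetical instruction with an optional immediate and reference."""
    op: str
    imm: Optional[int] = None
    ref: Optional[Ref] = None


def normalise_ref(ref: Ref) -> Ref:
    # Assumed canonical form: a lone index with scale 1 becomes the base.
    if ref.base is None and ref.index is not None and ref.scale == 1:
        return Ref(base=ref.index, index=None, scale=1, offset=ref.offset)
    return ref


def shrink(ins: Ins) -> Ins:
    # Stand-in for a post-peephole rewrite that looks at one instruction only.
    if ins.op == "add" and ins.imm == 1:
        return Ins("inc", None, ins.ref)
    return ins


def two_passes(code: List[Ins]) -> List[Ins]:
    # First traversal: compact each instruction.
    code = [shrink(i) for i in code]
    # Second traversal: normalise every reference.
    return [Ins(i.op, i.imm, normalise_ref(i.ref)) if i.ref else i for i in code]


def fused(code: List[Ins]) -> List[Ins]:
    # Single traversal: compact, then normalise the same instruction.
    out = []
    for i in code:
        i = shrink(i)
        if i.ref is not None:
            i = Ins(i.op, i.imm, normalise_ref(i.ref))
        out.append(i)
    return out
```

Because both steps only ever look at the current instruction, the fused loop cannot change the outcome; it just saves one walk over the instruction list.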

One other thing that has been on my mind for a while, but would need some discussion... I would like to move Pass 1 so it runs before all of the imaginary registers are changed to real registers.  Some optimisations are able to reduce the number of registers required by a routine, but since Pass 1 currently runs after register allocation, overhead such as preserving and restoring non-volatile registers has already been generated by then.  In addition, new optimisations can be programmed to help the allocator.  For example, if you come across "mov %immreg1, %immreg2" where %immreg1 gets deallocated at this point (i.e. is not used afterwards) and it's the first appearance of %immreg2, then all references to %immreg2 after that point can be changed to %immreg1 and the mov instruction removed.  If that's too much work, but the allocator can be trusted to give %immreg1 and %immreg2 the same "colour" in such instances, then the extraneous mov instruction would have to be removed in Pass 2, say.  The only difficulty is working out how to track the register usage, since TAllUsedRegs can only handle real registers.  I figure a new descendant class might be the answer to that one.  Something new to research!
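The "mov %immreg1, %immreg2" case can be sketched as follows. This is a small Python model under stated assumptions, not compiler code: the `Instr` type and `coalesce_moves` function are hypothetical, operands are bare register names or immediates (no memory references), imaginary registers are recognised by an "%immreg" prefix, and the code is a single straight-line block with no jumps, so "not used afterwards" can be checked by scanning the remaining instructions.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Instr:
    """Hypothetical instruction; operands are in AT&T order (destination last)."""
    op: str
    operands: Tuple[str, ...]


def coalesce_moves(code: List[Instr]) -> List[Instr]:
    """Drop 'mov %a, %b' when %a is dead afterwards and %b first appears here."""
    code = list(code)
    i = 0
    while i < len(code):
        ins = code[i]
        if ins.op == "mov" and len(ins.operands) == 2:
            src, dst = ins.operands
            both_imaginary = src.startswith("%immreg") and dst.startswith("%immreg")
            # Source must not be referenced by any later instruction.
            src_dead = all(src not in later.operands for later in code[i + 1:])
            # Destination must not have appeared before this mov.
            dst_new = all(dst not in earlier.operands for earlier in code[:i])
            if both_imaginary and src != dst and src_dead and dst_new:
                # Rename the destination to the source in everything that follows.
                rest = [
                    Instr(x.op, tuple(src if o == dst else o for o in x.operands))
                    for x in code[i + 1:]
                ]
                code = code[:i] + rest
                continue  # re-examine the instruction now at position i
        i += 1
    return code


example = [
    Instr("mov", ("$5", "%immreg1")),
    Instr("add", ("$1", "%immreg1")),
    Instr("mov", ("%immreg1", "%immreg2")),
    Instr("add", ("%immreg2", "%immreg3")),
]
result = coalesce_moves(example)
```

In this example the third instruction is removed and the final add reads %immreg1 instead of %immreg2. A real implementation would need proper liveness information across jumps and labels, which is where the register-usage tracking mentioned above comes in.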

Gareth aka. Kit

_______________________________________________
fpc-devel maillist  -  [email protected]
https://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-devel
