So far, I'm researching the optimisation listed below: tracking registers that hold identical values and swapping one for the other to minimise pipeline stalls. Because I don't need to keep track of their actual values, just whether they've changed since a particular MOV instruction, I've managed to move this into the peephole optimiser as an extension to TX86AsmOptimizer.PostPeepholeOptMov().
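To illustrate the idea, here is a minimal sketch in Python rather than actual compiler code. It assumes a made-up instruction model (tuples of mnemonic, source, destination, in the usual AT&T source-then-destination order) and a hypothetical function forward_copies; none of these names exist in FPC, and the real optimiser works on the compiler's own instruction lists. It only remembers which register is still an unchanged copy of which other register, and it forgets everything as soon as it meets an instruction it doesn't understand.

```python
# Hypothetical mini-model, not FPC code: an instruction is
# (mnemonic, source, destination); operands are plain strings
# such as "%rax" or "8(%rsp)".

KNOWN = {"mov", "lea"}  # the only mnemonics this sketch claims to understand


def is_reg(operand):
    """True for a bare register operand, False for memory or anything else."""
    return operand.startswith("%")


def forward_copies(instructions):
    """Read the original register instead of its copy where that is safe."""
    copy_of = {}  # copy register -> register it was copied from (both unchanged since)
    result = []
    for mnemonic, src, dst in instructions:
        if mnemonic not in KNOWN:
            # Conservative rule: unknown instruction, drop every assumption.
            copy_of.clear()
            result.append((mnemonic, src, dst))
            continue

        # A MOV that reads a known copy can read the original register instead,
        # so it no longer has to wait for the earlier MOV to complete.
        if mnemonic == "mov" and is_reg(src) and src in copy_of:
            src = copy_of[src]
        result.append((mnemonic, src, dst))

        if is_reg(dst):
            # dst has been overwritten: forget every pairing that involves it.
            copy_of = {c: o for c, o in copy_of.items() if c != dst and o != dst}
            if mnemonic == "mov" and is_reg(src) and src != dst:
                copy_of[dst] = src
    return result


if __name__ == "__main__":
    code = [
        ("mov", "%rax", "%rbx"),
        ("lea", "-8(%rsp)", "%rcx"),
        ("mov", "%rbx", "8(%rsp)"),
    ]
    for instruction in forward_copies(code):
        print(instruction)
```

In this model the last instruction ends up reading %rax instead of %rbx. Registers used inside memory operands are deliberately left alone, which keeps the sketch simple but also means it says nothing about the harder memory cases.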
It's a bit more difficult than it looks though. I've had a lot of crashes so far when it changes a register when it shouldn't, but I'm ironing out the bugs one by one. To truly see the gains, one would need to perform some kind of rigorous timing comparison. This would be the first step in the step-by-step implementation.

More in-depth data-flow optimisation, like successfully merging div and mod instructions that share the same numerator and denominator, will require some more care and thought, especially as the two division operations may not use the same registers. If successful though, it will improve the compiler itself, since it has "x div 1000" and "x mod 1000" side-by-side in a couple of places, a common pair of expressions for producing a human-readable time metric, e.g. seconds and milliseconds.

Gareth aka. Kit

On Sun 03/06/18 14:12, Florian Klämpfl flor...@freepascal.org sent:

On 21.05.2018 at 21:05, J. Gareth Moreton wrote:
> Would you object to me trying anyway, Florian?

No, feel free to go ahead, but it needs to be done step by step.

> It might be that I run into the same problems you had and it's too
> unsafe, but I'm going by a conservative philosophy in that if it spots
> something that it can't work out (e.g. an instruction that it's not
> programmed to handle) or is potentially unsafe (e.g. reading and writing
> to a block of memory that it doesn't have control over, due to
> multi-threading issues), then it just stops optimising and drops all
> assumptions that it has made at that point.
>
> As a small test case, I'm attempting to see if I can spot and optimise,
> for example, "mov %rax, %rbx; lea %rcx, -8(%rsp); mov %rbx, 8(%rsp)",
> where a pipeline stall occurs due to a read-after-write penalty (with
> %rbx in this case).

Things like this are fine, it gets hairy though as soon as memory locations are involved.
_______________________________________________
fpc-devel maillist - fpc-devel@lists.freepascal.org
http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-devel