Hi everyone,

A merge request of mine was just approved that gives the peephole optimizer access to more registers when it needs one for temporary storage.  This enables an optimisation on x86_64-win64 that wasn't possible before because too few volatile registers were available.  An example from packages\numlib\src\dsl.pas, before:

.Lj184:
    ...
    cmpl    $1,%ecx
    jng    .Lj188
    subl    $1,%ecx
.Lj188:
    ...

After:

.Lj184:
    ...
    cmpl    $1,%ecx
    setg    %bl
    movzbl    %bl,%ebx
    subl    %ebx,%ecx
    ...

%ebx is a non-volatile register, but the current subroutine already preserves it and it isn't in use at this point, so the peephole optimizer can borrow it for a few instructions.
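
For anyone who wants to convince themselves that the two sequences compute the same thing, here is a minimal sketch in C rather than Pascal.  The function names are made up for illustration and nothing here is taken from the compiler itself; it only models the signed "greater than 1" test and the conditional decrement shown in the listings above.

```c
#include <assert.h>
#include <stdint.h>

/* Branching form: models cmpl $1,%ecx / jng / subl $1,%ecx.
   (Hypothetical name, for illustration only.) */
static int32_t dec_if_gt1_branch(int32_t x)
{
    if (x > 1) {        /* jng skips the subtraction when x <= 1 */
        x -= 1;
    }
    return x;
}

/* Branchless form: models cmpl $1,%ecx / setg / movzbl / subl.
   (Hypothetical name, for illustration only.) */
static int32_t dec_if_gt1_setcc(int32_t x)
{
    int32_t flag = (x > 1);  /* 0 or 1, like setg followed by movzbl */
    return x - flag;         /* subtracts 1 only when the test held */
}
```

Both functions return the same value for every signed 32-bit input: when x > 1 the flag is 1 and the subtraction matches the branch-taken path, otherwise the flag is 0 and x is left unchanged.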

I need to double-check though... is this actually a good optimisation for speed?  It removes a jump and a label, which might permit other long-range optimisations, but the replacement is 3 instructions (setg, movzbl, subl) that form a dependency chain.

Gareth aka. Kit


_______________________________________________
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
https://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-devel
