Hi everyone,

I've been playing around with the peephole optimizer on x86_64 a lot lately, and I'm starting to notice that a lot of procedures, both in the RTL and in the compiler itself, produce the same assembly language when fully optimized (or sometimes even before that point).  As an example, here is the assembly for three TStream.ReadData overloads in the classes unit:

.section .text.n_classes$_$tstream_$__$$_readdata$char$$nativeint,"ax"
    .balign 16,0x90
.globl    CLASSES$_$TSTREAM_$__$$_READDATA$CHAR$$NATIVEINT
CLASSES$_$TSTREAM_$__$$_READDATA$CHAR$$NATIVEINT:
.seh_proc CLASSES$_$TSTREAM_$__$$_READDATA$CHAR$$NATIVEINT
    leaq    -40(%rsp),%rsp
.seh_stackalloc 40
.seh_endprologue
# Peephole Optimization: Mov2Nop 3b done
    movl    $1,%r8d
# Peephole Optimization: %rcx = %rax; removed unnecessary instruction (MovMov2MovNop 6b} # Peephole Optimization: %rax = %rcx; changed to minimise pipeline stall (MovXXX2MovXXX)
    movq    (%rcx),%rax
    call    *256(%rax)
    movslq  %eax,%rax
    nop
    leaq    40(%rsp),%rsp
    ret
.seh_endproc

.section .text.n_classes$_$tstream_$__$$_readdata$shortint$$nativeint,"ax"
    .balign 16,0x90
.globl    CLASSES$_$TSTREAM_$__$$_READDATA$SHORTINT$$NATIVEINT
CLASSES$_$TSTREAM_$__$$_READDATA$SHORTINT$$NATIVEINT:
.seh_proc CLASSES$_$TSTREAM_$__$$_READDATA$SHORTINT$$NATIVEINT
    leaq    -40(%rsp),%rsp
.seh_stackalloc 40
.seh_endprologue
# Peephole Optimization: Mov2Nop 3b done
    movl    $1,%r8d
# Peephole Optimization: %rcx = %rax; removed unnecessary instruction (MovMov2MovNop 6b} # Peephole Optimization: %rax = %rcx; changed to minimise pipeline stall (MovXXX2MovXXX)
    movq    (%rcx),%rax
    call    *256(%rax)
    movslq  %eax,%rax
    nop
    leaq    40(%rsp),%rsp
    ret
.seh_endproc

.section .text.n_classes$_$tstream_$__$$_readdata$byte$$nativeint,"ax"
    .balign 16,0x90
.globl    CLASSES$_$TSTREAM_$__$$_READDATA$BYTE$$NATIVEINT
CLASSES$_$TSTREAM_$__$$_READDATA$BYTE$$NATIVEINT:
.seh_proc CLASSES$_$TSTREAM_$__$$_READDATA$BYTE$$NATIVEINT
    leaq    -40(%rsp),%rsp
.seh_stackalloc 40
.seh_endprologue
# Peephole Optimization: Mov2Nop 3b done
    movl    $1,%r8d
# Peephole Optimization: %rcx = %rax; removed unnecessary instruction (MovMov2MovNop 6b} # Peephole Optimization: %rax = %rcx; changed to minimise pipeline stall (MovXXX2MovXXX)
    movq    (%rcx),%rax
    call    *256(%rax)
    movslq  %eax,%rax
    nop
    leaq    40(%rsp),%rsp
    ret
.seh_endproc

The final assembly language of each method is identical.

(Note that trunk is not this efficient just yet: it still leaves a "movq %rcx,%rax" instruction before "movl $1,%r8d" and then emits "movq (%rax),%rax" instead of "movq (%rcx),%rax". The three procedures are still identical to one another, though.)

Would it be plausible to calculate and store a form of message digest (hash) of the final form of the tai entries or machine code, and identify collisions as potential duplicate procedures for whole-program optimization?  Granted, I don't know anything about WPO yet, so I don't know how plausible this is.  This wouldn't be something done on quick or debug builds, because you need to be able to do proper stack traces, and having identical procedures merged into one might cause confusion.
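
To make the idea a little more concrete, here is a rough sketch in Python rather than compiler code. It works on assembler text instead of tai entries, and the names normalise and find_duplicates are made up for illustration. It assumes that comments, labels and assembler directives can be ignored when comparing bodies, which a real implementation would have to check (unwind information and relocations, for instance). Procedures are bucketed by digest, and a matching digest is only treated as a candidate: the bodies are compared in full before being reported, so a genuine hash collision cannot merge two different procedures.

```python
import hashlib
from collections import defaultdict


def normalise(lines):
    """Reduce a procedure's assembler text to its bare instructions.

    Assumption for this sketch: '#' starts a comment, and lines that are
    directives (leading '.') or labels (trailing ':') do not affect
    whether two bodies are equivalent.
    """
    out = []
    for line in lines:
        line = line.split('#', 1)[0].strip()
        if not line or line.startswith('.') or line.endswith(':'):
            continue
        # Collapse runs of whitespace so formatting differences are ignored.
        out.append(' '.join(line.split()))
    return out


def find_duplicates(procs):
    """Group procedure names whose normalised bodies are identical.

    procs maps a procedure name to its list of assembler lines.
    Returns a sorted list of sorted name groups with at least two members.
    """
    buckets = defaultdict(list)
    for name, lines in procs.items():
        body = tuple(normalise(lines))
        digest = hashlib.sha256('\n'.join(body).encode('utf-8')).hexdigest()
        buckets[digest].append((name, body))

    groups = []
    for members in buckets.values():
        # Same digest is only a hint; confirm by comparing the full bodies.
        by_body = defaultdict(list)
        for name, body in members:
            by_body[body].append(name)
        for names in by_body.values():
            if len(names) > 1:
                groups.append(sorted(names))
    return sorted(groups)
```

Fed the three listings above, keyed by their mangled names, this would report them as one group, since they differ only in section names, labels and .seh_proc operands.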

Gareth aka. Kit

_______________________________________________
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
https://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-devel