[Bug target/70359] [6/7/8 Regression] Code size increase for ARM compared to gcc-5.3.0

aldyh at gcc dot gnu.org Fri, 02 Mar 2018 03:06:07 -0800

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70359


--- Comment #23 from Aldy Hernandez <aldyh at gcc dot gnu.org> ---
For the curious, on x86 with -ftree-forwprop we get an additional jump:

inttostr:
.LFB0:
        .cfi_startproc
        movl    %edi, %eax
        movslq  %edx, %rdx
        movl    $-858993459, %r9d
        sarl    $31, %eax
        leaq    -1(%rsi,%rdx), %rsi
        movl    %eax, %ecx
        movb    $0, (%rsi)
        xorl    %edi, %ecx
        subl    %eax, %ecx
        jmp     .L2                    ;; boo!
        .p2align 4,,10
        .p2align 3
.L4:
        movq    %r8, %rsi
        movl    %edx, %ecx
.L2:
...
...
...
        movb    $45, -1(%r8)
        leaq    -2(%rsi), %r8

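Not to mention that the last two instructions are larger (and so possibly
slower) than their counterparts without forwprop, probably because rsi encodes
better than r8 (the byte store through r8 needs a REX prefix).  Without
forwprop we get: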

        movb    $45, -1(%rsi)
        subq    $1, %rsi

Also, the inequality comparison on x86 is shorter than comparing with > 9.  But
that's probably irrelevant.

All in all:

x86 -ftree-forwprop:    139 bytes
x86 -fno-tree-forwprop: 126 bytes
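
For readers without the testcase at hand, here is a rough C sketch of what
inttostr appears to do, reconstructed only from the assembly above (absolute
value, a terminating NUL at buf + len - 1, division by 10 via the magic
constant, and a final '-' store).  The parameter names, the unsigned negation
and the self-check helper are my assumptions, not the source attached to the
PR.

```c
#include <assert.h>
#include <string.h>

/* Sketch reconstructed from the listing; names are assumed, not from the PR. */
char *inttostr(int i, char *buf, int len)
{
    /* Absolute value, done in unsigned arithmetic so INT_MIN is well defined. */
    unsigned int ui = (i < 0) ? 0u - (unsigned int)i : (unsigned int)i;
    char *p = buf + len - 1;

    *p = '\0';                      /* movb $0, (%rsi) */
    do {
        *--p = (char)('0' + ui % 10);
    } while ((ui /= 10) != 0);      /* the loop test discussed above */

    if (i < 0)
        *--p = '-';                 /* movb $45, -1(...) */
    return p;
}

/* Hypothetical self-check; call it from main() or a test harness. */
void inttostr_selfcheck(void)
{
    char buf[16];

    assert(strcmp(inttostr(-123, buf, (int)sizeof buf), "-123") == 0);
    assert(strcmp(inttostr(0, buf, (int)sizeof buf), "0") == 0);
    assert(strcmp(inttostr(2018, buf, (int)sizeof buf), "2018") == 0);
}
```

Compiling a function like this with and without -fno-tree-forwprop and
comparing the object sizes is one way to repeat the measurement; the exact
byte counts will depend on the compiler revision and the other options used.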
