https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70359
--- Comment #23 from Aldy Hernandez <aldyh at gcc dot gnu.org> ---
For the curious, on x86 with -ftree-forwprop we get an additional jump:

inttostr:
.LFB0:
        .cfi_startproc
        movl    %edi, %eax
        movslq  %edx, %rdx
        movl    $-858993459, %r9d
        sarl    $31, %eax
        leaq    -1(%rsi,%rdx), %rsi
        movl    %eax, %ecx
        movb    $0, (%rsi)
        xorl    %edi, %ecx
        subl    %eax, %ecx
        jmp     .L2                     ;; boo!
        .p2align 4,,10
        .p2align 3
.L4:
        movq    %r8, %rsi
        movl    %edx, %ecx
.L2:
        ...
        ...
        ...
        movb    $45, -1(%r8)
        leaq    -2(%rsi), %r8

Not to mention that the last two instructions are slower (ok, larger) than
without forwprop, probably because rsi encodes better than r8.

        movb    $45, -1(%rsi)
        subq    $1, %rsi

Also, the inequality comparison on x86 is shorter than comparing with > 9.
But that's probably irrelevant.

All in all:

x86 -ftree-forwprop:    139 bytes
x86 -fno-tree-forwprop: 126 bytes
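
For readers without the testcase at hand, here is a minimal C sketch of a function that would plausibly compile to a listing like the one above. It is a reconstruction from the assembly, not the testcase attached to the PR: the parameter names, the body, and the loop shape are all assumptions. Only the name inttostr, the constants, and the register usage come from the listing.

```c
/* Hypothetical reconstruction inferred from the listing; not the PR's
   actual testcase.  Writes the decimal form of i at the end of buf
   (len bytes, assumed large enough) and returns a pointer to the first
   character. */
char *inttostr(int i, char *buf, int len)
{
    /* Absolute value computed in unsigned arithmetic so that INT_MIN is
       well defined.  The sarl/xorl/subl sequence in the listing is the
       usual branchless abs idiom. */
    unsigned int ui = (i < 0) ? 0u - (unsigned int)i : (unsigned int)i;

    char *p = buf + len - 1;    /* leaq -1(%rsi,%rdx), %rsi */
    *p = '\0';                  /* movb $0, (%rsi) */

    do {
        /* -858993459 is 0xCCCCCCCD when read as an unsigned 32-bit
           value, the usual multiply-by-reciprocal constant for
           unsigned division by 10. */
        *--p = (char)('0' + ui % 10);
        ui /= 10;
    } while (ui != 0);          /* the comment suggests forwprop turns this
                                   into a "> 9" test on the old value */

    if (i < 0)
        *--p = '-';             /* movb $45, ...: 45 is ASCII '-' */

    return p;
}
```

Called with a 16-byte buffer, inttostr(-123, buf, 16) would return a pointer to the string "-123" inside buf. Compiling a function of this shape at the same optimization level with and without -fno-tree-forwprop, then comparing the sizes, is one way to check the byte counts on a given GCC version.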