https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79173

Thomas Koenig <tkoenig at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |tkoenig at gcc dot gnu.org
   Last reconfirmed|2017-01-23 00:00:00         |2021-5-28

--- Comment #10 from Thomas Koenig <tkoenig at gcc dot gnu.org> ---
Just had a look at trunk.

It currently produces

adc:
        leaq    800(%rsi), %rcx
        xorl    %edx, %edx
.L2:
        movq    (%rdi), %rax
        addb    $-1, %dl
        adcq    (%rsi), %rax
        setc    %dl
        addq    $8, %rsi
        movq    %rax, (%rdi)
        addq    $8, %rdi
        cmpq    %rcx, %rsi
        jne     .L2
        ret

Clang does

adc:                                    # @adc
        movl    $4, %eax
        xorl    %ecx, %ecx
.LBB0_1:                                # =>This Inner Loop Header: Depth=1
        movq    -32(%rsi,%rax,8), %rdx
        addb    $-1, %cl
        adcq    %rdx, -32(%rdi,%rax,8)
        movq    -24(%rsi,%rax,8), %rcx
        adcq    %rcx, -24(%rdi,%rax,8)
        movq    -16(%rsi,%rax,8), %rcx
        adcq    %rcx, -16(%rdi,%rax,8)
        movq    -8(%rsi,%rax,8), %rcx
        adcq    %rcx, -8(%rdi,%rax,8)
        movq    (%rsi,%rax,8), %rcx
        adcq    %rcx, (%rdi,%rax,8)
        setb    %cl
        addq    $5, %rax
        cmpq    $104, %rax
        jne     .LBB0_1
        retq

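so it actually unrolls the loop (five times) and does the ideal sequence
of add with carry: the carry stays in the flags across the whole adcq
chain and is saved/restored only once per iteration, whereas GCC does
the setc / addb $-1 round trip for every element.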
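
For reference, the source is not quoted in this comment. The following is only a sketch of the kind of loop that would lead to code like the above, not the actual test case from the PR. The limb count of 100 is inferred from the 800-byte bound in the GCC output, and the use of __builtin_add_overflow (rather than, say, the _addcarry_u64 intrinsic) is an assumption made to keep the example portable across GCC and Clang.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed: 800 bytes / 8 bytes per limb, as suggested by the asm above. */
#define N 100

/* Hypothetical reconstruction: a += b over N 64-bit limbs,
   propagating the carry from one limb to the next.
   The carry out of the last limb is discarded. */
void adc(uint64_t *a, const uint64_t *b)
{
    unsigned char carry = 0;

    for (size_t i = 0; i < N; i++) {
        uint64_t t;
        /* c1: carry out of a[i] + b[i] */
        unsigned char c1 = __builtin_add_overflow(a[i], b[i], &t);
        /* c2: carry out of adding the incoming carry */
        unsigned char c2 = __builtin_add_overflow(t, carry, &a[i]);
        /* At most one of c1 and c2 can be set. */
        carry = c1 | c2;
    }
}
```

Ideally each loop iteration would compile to a load plus a single adcq, with no setc / addb pair inside the unrolled body.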
