https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79173
Thomas Koenig <tkoenig at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |tkoenig at gcc dot gnu.org
   Last reconfirmed|2017-01-23 00:00:00         |2021-5-28

--- Comment #10 from Thomas Koenig <tkoenig at gcc dot gnu.org> ---
Just had a look at trunk.

It currently produces

adc:
        leaq    800(%rsi), %rcx
        xorl    %edx, %edx
.L2:
        movq    (%rdi), %rax
        addb    $-1, %dl
        adcq    (%rsi), %rax
        setc    %dl
        addq    $8, %rsi
        movq    %rax, (%rdi)
        addq    $8, %rdi
        cmpq    %rcx, %rsi
        jne     .L2
        ret

Clang does

adc:                                    # @adc
        movl    $4, %eax
        xorl    %ecx, %ecx
.LBB0_1:                                # =>This Inner Loop Header: Depth=1
        movq    -32(%rsi,%rax,8), %rdx
        addb    $-1, %cl
        adcq    %rdx, -32(%rdi,%rax,8)
        movq    -24(%rsi,%rax,8), %rcx
        adcq    %rcx, -24(%rdi,%rax,8)
        movq    -16(%rsi,%rax,8), %rcx
        adcq    %rcx, -16(%rdi,%rax,8)
        movq    -8(%rsi,%rax,8), %rcx
        adcq    %rcx, -8(%rdi,%rax,8)
        movq    (%rsi,%rax,8), %rcx
        adcq    %rcx, (%rdi,%rax,8)
        setb    %cl
        addq    $5, %rax
        cmpq    $104, %rax
        jne     .LBB0_1
        retq

so it actually unrolls the loop and does the ideal sequence of add with
carry.
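
For readers without the test case at hand: the comment does not quote the source that was compiled, so the following is only a sketch of the kind of loop both listings implement, a limb-by-limb addition of one multi-word integer into another with the carry passed from each limb to the next. The name `add_limbs` and the portable carry detection via unsigned comparisons are assumptions made for illustration; the actual test case may well use an intrinsic or builtin instead, and no claim is made that any particular compiler turns this exact form into an `adc` chain. The fixed bound visible in the GCC listing (`800(%rsi)`, i.e. 100 eight-byte limbs) would correspond to calling it with `n = 100`.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not taken from the bug report): add the n-limb
 * number b into a, least significant limb first, and return the final
 * carry (0 or 1).
 */
uint64_t add_limbs(uint64_t *a, const uint64_t *b, size_t n)
{
    uint64_t carry = 0;

    for (size_t i = 0; i < n; i++) {
        uint64_t sum = a[i] + b[i];      /* may wrap around */
        uint64_t c1  = sum < a[i];       /* 1 if a[i] + b[i] wrapped */
        uint64_t res = sum + carry;      /* add the incoming carry */
        uint64_t c2  = res < sum;        /* 1 if adding the carry wrapped */

        a[i]  = res;
        carry = c1 | c2;                 /* at most one of the two can be set */
    }
    return carry;
}
```

In terms of the listings above, the GCC loop saves the carry flag into `%dl` with `setc` after every limb and restores it with `addb $-1, %dl` before the next `adcq`, whereas the Clang loop does that save and restore only once per five limbs and lets the flag flow directly from one `adcq` to the next in between.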