Issue 97909
Summary Suboptimal codegen for `sdiv` by constants
Labels backend:AArch64, backend:X86, missed-optimization
Assignees
Reporter Kmeakin
GCC is able to generate shorter code than clang for signed division by non-power-of-2 constants:
https://godbolt.org/z/axeTh4qb8
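
For reference, here is a rough C sketch of what the two instruction sequences below appear to compute. The source behind the godbolt link is not reproduced in this report, so the assumption is that `sdiv3` is simply `x / 3` on a 32-bit `int`. The names `sdiv3_ref`, `sdiv3_clang_style` and `sdiv3_gcc_style` are made up for illustration. Both variants multiply by the same magic constant 0x55555556 (1431655766) and take the high 32 bits of the product; they differ only in how the rounding correction for negative inputs is derived.

```c
#include <stdint.h>

/* Magic multiplier seen in all four listings: 0x55555556 == 1431655766. */
#define SDIV3_MAGIC INT64_C(0x55555556)

/* Assumed original source: plain truncating division by 3. */
int32_t sdiv3_ref(int32_t x) { return x / 3; }

/* clang-style: the correction is the sign bit of the 64-bit product,
 * extracted with a second shift and then added to the high half. */
int32_t sdiv3_clang_style(int32_t x) {
    uint64_t prod = (uint64_t)((int64_t)x * SDIV3_MAGIC);
    uint32_t hi   = (uint32_t)(prod >> 32);   /* lsr #32 / shr 32 */
    uint32_t sign = (uint32_t)(prod >> 63);   /* lsr #63 / shr 63 */
    /* Conversion back to int32_t assumes the usual two's-complement wrap. */
    return (int32_t)(hi + sign);
}

/* GCC-style: the correction is x >> 31 (arithmetic), i.e. 0 or -1,
 * subtracted from the high half. On AArch64 the shift folds into `sub`. */
int32_t sdiv3_gcc_style(int32_t x) {
    uint64_t prod  = (uint64_t)((int64_t)x * SDIV3_MAGIC);
    uint32_t hi    = (uint32_t)(prod >> 32);
    /* Written as a select to avoid relying on signed right-shift behavior. */
    uint32_t xsign = x < 0 ? UINT32_C(0xFFFFFFFF) : UINT32_C(0);
    return (int32_t)(hi - xsign);
}
```

The product has the same sign as `x` whenever `x` is nonzero, which is why the two corrections agree: adding the product's sign bit is the same as subtracting the sign-extension of `x`.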

AArch64, clang:
```asm
sdiv3:                                  // @sdiv3
        mov     w8, #21846                      // =0x5556
        movk    w8, #21845, lsl #16
        smull   x8, w0, w8
        lsr     x9, x8, #63
        lsr     x8, x8, #32
        add     w0, w8, w9
        ret
```

AArch64, GCC:
```asm
sdiv3:
        mov     w1, 21846
        movk    w1, 0x5555, lsl 16
        smull   x1, w0, w1
        lsr     x1, x1, 32
        sub     w0, w1, w0, asr 31
        ret
```

x86-64, clang:
```asm
sdiv3: # @sdiv3
        movsxd  rax, edi
        imul    rax, rax, 1431655766
        mov     rcx, rax
        shr     rcx, 63
        shr     rax, 32
        add     eax, ecx
        ret
```

x86-64, GCC:
```asm
sdiv3:
        movsx   rax, edi
        sar     edi, 31
        imul    rax, rax, 1431655766
        shr     rax, 32
        sub     eax, edi
        ret
```
_______________________________________________
llvm-bugs mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs