| Field | Value |
| --- | --- |
| Issue | 179678 |
| Summary | [X64] Missed optimization for 64/32-bit division (with a 64-bit or a 32-bit result) |
| Labels | new issue |
| Assignees | |
| Reporter | rrrola |

[https://godbolt.org/z/xfoM5bqon](https://godbolt.org/z/xfoM5bqon)
When compiling a 64/32-bit division, clang 21 with -O2 checks whether the dividend also fits in 32 bits.
Depending on this check, it emits either a full 64/64-bit division or a faster 32/32-bit division (implemented as a 128/64-bit or a 64/32-bit DIV, respectively, with the high word of the dividend zeroed):
```
div_rem:
mov rax, rdx
mov ecx, esi
shr rdx, 32
je .LBB0_1
xor edx, edx
div rcx
mov dword ptr [rdi], edx
ret
.LBB0_1:
xor edx, edx
div ecx
mov dword ptr [rdi], edx
ret
```
The 64/32-bit division path could be used more often.
If the high word of the dividend is smaller than the divisor, the dividend is less than divisor * 2^32, so the quotient of the 64/32-bit division fits in 32 bits and a single 64/32-bit DIV cannot overflow.
Expected assembly:
```
div_rem:
mov rax, rdx
mov ecx, esi
shr rdx, 32
cmp edx, ecx ; high word of dividend < divisor?
jb .LBB0_1
xor edx, edx ; no - need to use 128/64-bit division
div rcx
mov dword ptr [rdi], edx
ret
.LBB0_1: ; yes - can use 64/32-bit division (don't zero the high part of the dividend)
div ecx
mov dword ptr [rdi], edx
ret
```
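The condition tested by the proposed `cmp edx, ecx` / `jb` pair can be modelled in C. This is only a sketch: the names `can_use_div32` and `div_rem_model` are hypothetical, and the argument order is guessed from the registers in the listing (`rdi` = remainder pointer, `esi` = divisor, `rdx` = dividend), not taken from the linked source.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical predicate: true when the high word of the dividend is below
   the divisor, i.e. the case where the proposed code jumps to .LBB0_1. */
static int can_use_div32(uint64_t dividend, uint32_t divisor) {
    return (uint32_t)(dividend >> 32) < divisor;
}

/* Hypothetical C model of div_rem (signature is an assumption).
   The divisor must be non-zero. */
static uint64_t div_rem_model(uint32_t *rem, uint32_t divisor, uint64_t dividend) {
    uint64_t q = dividend / divisor;
    *rem = (uint32_t)(dividend % divisor);
    if (can_use_div32(dividend, divisor)) {
        /* Fast path: the quotient fits in 32 bits, so a 64/32-bit
           `div ecx` on edx:eax would not raise a divide error. */
        assert(q <= UINT32_MAX);
    }
    return q;
}
```

For example, with dividend `0x500000007` and divisor `10` the high word is 5, which is below 10, so the fast path applies even though the dividend itself does not fit in 32 bits.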
A similar optimization applies to a 64/64-bit division: if the divisor fits in 32 bits and the high word of the dividend is smaller than the divisor, a 64/32-bit division is sufficient.
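As a rough illustration of the 64/64-bit case, the two conditions might be combined into a single predicate. The name `can_narrow_div64` is hypothetical, and the code is a sketch of the check, not of what clang emits.

```c
#include <stdint.h>

/* Hypothetical predicate for a 64/64-bit division: true when the divisor
   fits in 32 bits and the high word of the dividend is below it, so the
   quotient fits in 32 bits and a 64/32-bit DIV could be used. */
static int can_narrow_div64(uint64_t dividend, uint64_t divisor) {
    return (divisor >> 32) == 0 && (dividend >> 32) < divisor;
}
```

A zero divisor never satisfies the predicate, so that case would still reach the wide division and fault there as before.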
_______________________________________________
llvm-bugs mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs