| Issue | 166126 |
| Summary | [X86] LLVM -Oz generates 8-byte move for small constants |
| Labels | new issue |
| Assignees | |
| Reporter | ONE-RANDOM-HUMAN |

In certain cases, Clang at -Oz loads a constant that fits in a single byte using a move with a full 8-byte immediate. This does not seem to affect -Os or any other optimisation level.
Test case (https://godbolt.org/z/G3EfdKxMs):
```
#include <stdlib.h>
#include <stdint.h>
#include <string.h>

size_t test(uint64_t p[6], size_t index) {
    uint64_t p2[6];
    memcpy(p2, p, sizeof(p2));
    for (size_t i = 0; i < 6; i++) {
        if ((p2[i] & (1ULL << index)) != 0) {
            return i;
        }
    }
    return 6;
}
```
With Clang 21.1.0, this generates:
```
test:
        mov     rcx, rsi
        movups  xmm0, xmmword ptr [rdi]
        movups  xmm1, xmmword ptr [rdi + 16]
        movups  xmm2, xmmword ptr [rdi + 32]
        movaps  xmmword ptr [rsp - 24], xmm2
        movaps  xmmword ptr [rsp - 40], xmm1
        movaps  xmmword ptr [rsp - 56], xmm0
        movabs  rdx, 1
        shl     rdx, cl
        xor     eax, eax
.LBB0_1:
        cmp     rax, 6
        je      .LBB0_4
        test    qword ptr [rsp + 8*rax - 56], rdx
        jne     .LBB0_4
        inc     rax
        jmp     .LBB0_1
.LBB0_4:
        ret
```
Here `movabs rdx, 1` (10 bytes) is used instead of `mov edx, 1` (5 bytes) or `push 1`/`pop rdx` (3 bytes). The issue does not appear if the `memcpy` is removed.
There are also other missed optimisations in this case: the `memcpy` can be eliminated (which happens at -O2), and `i` can be demoted to 32 bits.
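For comparison, here is a rough source-level sketch of the two transformations mentioned above, applied by hand: the local copy is dropped and the loop counter is narrowed to 32 bits. The name `test_manual` and the `const` qualifier are illustrative assumptions, not part of the original test case, and the code generated for this variant was not checked here.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical hand-optimised variant of test(): reads p directly
 * instead of copying it, and uses a 32-bit loop counter. */
size_t test_manual(const uint64_t p[6], size_t index) {
    /* As in the original, the shift is only defined for index < 64. */
    const uint64_t mask = 1ULL << index;
    for (uint32_t i = 0; i < 6; i++) {
        if ((p[i] & mask) != 0) {
            return i; /* first element with bit `index` set */
        }
    }
    return 6; /* no element has the bit set */
}
```

It should return the same values as `test` for any valid input, so it may be useful as a baseline when comparing -Oz output.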