| Field | Value |
| --- | --- |
| Issue | 124714 |
| Summary | Missed optimization: inline functions, when operations can be done with smaller bit width |
| Labels | new issue |
| Assignees | |
| Reporter | Explorer09 |

```c
#include <stdbool.h>
#include <stdint.h>
static inline uint64_t saturating_sub_u64(uint64_t a, uint64_t b) {
    return a > b ? a - b : 0;
}

uint32_t test1a(uint32_t a, uint32_t b) {
    return a > b ? a - b : 0;
}

uint32_t test1b(uint32_t a, uint32_t b) {
    return (uint64_t)a > (uint64_t)b ? (uint64_t)a - (uint64_t)b : (uint64_t)0;
}

uint32_t test1c(uint32_t a, uint32_t b) {
    return saturating_sub_u64(a, b);
}
```
Expected result: `test1a`, `test1b`, and `test1c` compile to the same code.

Actual result: `test1a` and `test1b` compile to the same code, but `test1c` produces slightly larger code, with unnecessary zero-extension operations and 64-bit arithmetic.

This can be reproduced in Compiler Explorer. x86-64 clang 19.1.0 with the `-Os` option produces:
```x86asm
test1b:
    xorl    %eax, %eax
    subl    %esi, %edi
    cmovael %edi, %eax
    retq
test1c:
    movl    %edi, %ecx
    movl    %esi, %edx
    xorl    %eax, %eax
    subq    %rdx, %rcx
    cmovaeq %rcx, %rax
    retq
```
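
The narrowing is expected to be legal because both operands are zero-extended from 32 bits: the 64-bit comparison then agrees with the 32-bit one, and whenever `a > b` the difference already fits in 32 bits. The sketch below spot-checks this at boundary values. It is not part of the original report; `sat_sub_narrow`, `sat_sub_wide`, and `count_mismatches` are hypothetical names introduced only for illustration, mirroring `test1a` and `test1c`.

```c
#include <stddef.h>
#include <stdint.h>

/* Helper copied from the report. */
static inline uint64_t saturating_sub_u64(uint64_t a, uint64_t b) {
    return a > b ? a - b : 0;
}

/* Hypothetical name: 32-bit form, same body as test1a. */
static uint32_t sat_sub_narrow(uint32_t a, uint32_t b) {
    return a > b ? a - b : 0;
}

/* Hypothetical name: zero-extend, call the 64-bit helper, truncate
 * (same shape as test1c). */
static uint32_t sat_sub_wide(uint32_t a, uint32_t b) {
    return (uint32_t)saturating_sub_u64(a, b);
}

/* Hypothetical checker: counts boundary-value pairs on which the
 * narrow and wide forms disagree. */
int count_mismatches(void) {
    static const uint32_t samples[] = {
        0u, 1u, 2u, 0x7FFFFFFFu, 0x80000000u, 0xFFFFFFFEu, 0xFFFFFFFFu
    };
    const size_t n = sizeof samples / sizeof samples[0];
    int mismatches = 0;

    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < n; j++) {
            if (sat_sub_narrow(samples[i], samples[j]) !=
                sat_sub_wide(samples[i], samples[j])) {
                mismatches++;
            }
        }
    }
    return mismatches;
}
```

A spot check like this only samples the input space, so it supports the equivalence rather than proving it; the argument about zero-extended operands is what justifies the transformation in general.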
Related: [GCC bug 118679](https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118679)
When I reported the bug to GCC, I included another example for which GCC misses the optimization but Clang optimizes correctly (see the `max_u64` and `test2` functions in that bug report). The `saturating_sub_u64` case is the one Clang misses.