https://gcc.gnu.org/bugzilla/show_bug.cgi?id=122240
Bug ID: 122240
Summary: LIM missed opportunity in loop
Product: gcc
Version: 16.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c++
Assignee: unassigned at gcc dot gnu.org
Reporter: matt at godbolt dot org
Target Milestone: ---
In code similar to 122226, I have this "strlen" type thing for ints:
```
#include <cstddef>
// [[gnu::noinline]]
// [[gnu::const]]
// [[gnu::noinline, gnu::const]]
static std::size_t count_ints(const int *ints) {
    std::size_t num{};
    while (*ints) {
        num++;
        ints++;
    }
    return num;
}

std::size_t num_compares{};

// Returns if a zero-termed list of ints has 1234
bool has_1234(const int *ints) {
    for (std::size_t index = 0; index < count_ints(ints); ++index) {
        ++num_compares;
        if (ints[index] == 1234) {
            return true;
        }
    }
    return false;
}
```
That is, we have a "strlen" type function (count_ints), and another that
foolishly calls it for each loop iteration, looking for "1234".
CE link: https://godbolt.org/z/a8o8doob9
Even though gcc can see the body of the "count_ints" function, and the `int`
does not alias with the `size_t`, and there's no aggregate being returned, gcc
generates the code for `count_ints` inside the loop:
```
.L3:
xor eax, eax ; length = 0
.L6:
add rax, 1
mov r8d, DWORD PTR [rdi+rax*4]
test r8d, r8d
jne .L6 ; loop looking for a zero
cmp rdx, rax ; is current "1234 index" same as length
jnb .L15 ; exit loop
add rcx, 1 ; inc count
cmp DWORD PTR [rdi+rdx*4], 1234 ; check this entry for 1234
je .L16 ; exit if found
add rdx, 1 ; ++loop
mov esi, 1
jmp .L3 ; back to main loop at L3 (which re-scans the loop from
        ; the beginning again)
```
If we mark it noinline (comment in and out the various attributes), we see it
call the function each time.
Manually marking it `gnu::const` does not help here. Marking it _both_ noinline
_and_ const gives the result I'd expect, with the code counting the length once
at the top of the loop.
Initially I thought gcc was being clever when inlining and walking along the
array looking for _either_ zero or 1234, but it does seem like it's
unnecessarily looping each time, having not recognised it can LIM out the
count_ints.
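To make the expectation concrete, here is a hand-written sketch of roughly what I'd hope the optimiser could produce for the loop. The name `has_1234_hoisted` is hypothetical and not part of the test case above, and the sketch assumes nothing in the loop body modifies the array, so the length is the same on every iteration:

```cpp
#include <cstddef>

// Same "strlen for ints" helper as in the report.
static std::size_t count_ints(const int *ints) {
    std::size_t num{};
    while (*ints) {
        num++;
        ints++;
    }
    return num;
}

std::size_t num_compares{};

// Hypothetical hand-hoisted variant: the length is computed once,
// before the loop, instead of in the loop condition.
bool has_1234_hoisted(const int *ints) {
    const std::size_t length = count_ints(ints);
    for (std::size_t index = 0; index < length; ++index) {
        ++num_compares;
        if (ints[index] == 1234) {
            return true;
        }
    }
    return false;
}
```

This should behave the same as `has_1234` for any zero-terminated list, but scans for the terminator once rather than once per iteration.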
I may be missing something, but I didn't see anything in the optimisation
report. But it seems:
- GCC with visible body, no attributes: Re-counts on every iteration
(conservative but slow)
- GCC with visible body + [[gnu::const]]: Hoists correctly, though should be
unnecessary, unless I misunderstand `const` here
- GCC with just declaration + [[gnu::const]]: Doesn't hoist (a bug? - doesn't
respect the attribute)
For reference, clang 21.1 also fails to hoist when the body is visible without
attributes (though it rewrites count_ints to `wcslen`). However, clang does
respect [[gnu::const]] on just the declaration (without visible body), and
successfully hoists in that case - see https://godbolt.org/z/57b4Y3KPG. Maybe
the bug here is that GCC doesn't respect [[gnu::const]] on declarations when it
can't see the function body, even if I've missed a reason why the body of the
function isn't already `const` when it can be seen.
Thanks in advance and apologies if I've missed an obvious reason why this can't
be LIMed.