https://gcc.gnu.org/bugzilla/show_bug.cgi?id=122240

            Bug ID: 122240
           Summary: LIM missed opportunity in loop
           Product: gcc
           Version: 16.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: matt at godbolt dot org
  Target Milestone: ---

In code similar to 122226, I have this "strlen" type thing for ints:

```
#include <cstddef>

// [[gnu::noinline]]
// [[gnu::const]]
// [[gnu::noinline, gnu::const]]
static std::size_t count_ints(const int *ints) {
  std::size_t num{};
  while (*ints) {
    num++;
    ints++;
  }
  return num;
}

std::size_t num_compares{};

// Returns true if a zero-terminated list of ints contains 1234
bool has_1234(const int *ints) {
  for (std::size_t index = 0; index < count_ints(ints); ++index) {
    ++num_compares;
    if (ints[index] == 1234) {
      return true;
    }
  }
  return false;
}
```

That is, we have a "strlen" type function (count_ints), and another that
foolishly calls it for each loop iteration, looking for "1234".

CE link: https://godbolt.org/z/a8o8doob9

Even though gcc can see the body of the "count_ints" function, and the `int`
does not alias with the `size_t`, and there's no aggregate being returned, gcc
generates the code for `count_ints` inside the loop:

```
.L3:
        xor     eax, eax ; length = 0
.L6:
        add     rax, 1
        mov     r8d, DWORD PTR [rdi+rax*4]
        test    r8d, r8d
        jne     .L6   ; loop looking for a zero
        cmp     rdx, rax ; is current "1234 index" same as length
        jnb     .L15   ; exit loop
        add     rcx, 1 ; inc count
        cmp     DWORD PTR [rdi+rdx*4], 1234  ; check this entry for 1234
        je      .L16  ; exit if found
        add     rdx, 1 ; ++loop
        mov     esi, 1
        jmp     .L3   ; back to main loop at L3 (which re-scans the loop from
                      ; the beginning again)
```

If we mark it noinline (comment in and out the various attributes), we see it
call the function each time.

Manually marking it `gnu::const` does not help here. Marking it _both_ noinline
_and_ const gives the result I'd expect, with the code counting the length once
at the top of the loop.
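
For anyone reproducing this, the configuration that hoists for me is simply the test case with the third commented-out attribute line enabled. A minimal sketch of that variant (nothing new here, only the attribute line is uncommented):

```cpp
#include <cstddef>

// Third attribute line from the test case, uncommented: both noinline and const.
[[gnu::noinline, gnu::const]]
static std::size_t count_ints(const int *ints) {
  std::size_t num{};
  while (*ints) {
    num++;
    ints++;
  }
  return num;
}

std::size_t num_compares{};

// Same loop as in the report; count_ints(ints) is still written in the
// loop condition, so any hoisting is left entirely to the compiler.
bool has_1234(const int *ints) {
  for (std::size_t index = 0; index < count_ints(ints); ++index) {
    ++num_compares;
    if (ints[index] == 1234) {
      return true;
    }
  }
  return false;
}
```

As far as I understand GCC's documentation, `const` is intended for functions that do not examine data behind pointer arguments that might change between calls, and `pure` is the weaker attribute that permits reading memory. The sketch keeps `const` only because that is what I tested.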

Initially I thought gcc was being clever when inlining and walking along the
array looking for _either_ zero or 1234, but it does seem like it's
unnecessarily looping each time, having not recognised it can LIM out the
count_ints.
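
To make the expectation concrete, this is roughly the hand-hoisted form I would expect LIM to arrive at. It is only a sketch: `has_1234_hoisted` and `length` are names I made up for illustration, and the rewrite is equivalent only on the assumption that nothing in the loop body writes to the `ints` array (here the only store is to `num_compares`, a `std::size_t`).

```cpp
#include <cstddef>

static std::size_t count_ints(const int *ints) {
  std::size_t num{};
  while (*ints) {
    num++;
    ints++;
  }
  return num;
}

std::size_t num_compares{};

// Hypothetical hand-hoisted variant: the length is computed once, before
// the loop, instead of being recomputed in the loop condition.
bool has_1234_hoisted(const int *ints) {
  const std::size_t length = count_ints(ints);  // loop-invariant, evaluated once
  for (std::size_t index = 0; index < length; ++index) {
    ++num_compares;
    if (ints[index] == 1234) {
      return true;
    }
  }
  return false;
}
```

With this form the scan for the terminating zero happens once per call rather than once per iteration, which is the difference between linear and quadratic work on a list without 1234.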

I may be missing something, but I didn't see anything relevant in the
optimisation report. To summarise what I observe:

  - GCC with visible body, no attributes: Re-counts on every iteration
    (conservative but slow)
  - GCC with visible body + [[gnu::const]]: Hoists correctly, though should be
    unnecessary, unless I misunderstand `const` here
  - GCC with just declaration + [[gnu::const]]: Doesn't hoist (a bug? - doesn't
    respect the attribute)


For reference, clang 21.1 also fails to hoist when the body is visible without
attributes (though it rewrites count_ints to `wcslen`). However, clang does
respect [[gnu::const]] on just the declaration (without visible body), and
successfully hoists in that case - see https://godbolt.org/z/57b4Y3KPG. Maybe
the bug here is that GCC doesn't respect [[gnu::const]] on declarations when it
can't see the function body, even if I've missed a reason why the body of the
function isn't already `const` when it can be seen.

Thanks in advance and apologies if I've missed an obvious reason why this can't
be LIMed.
