https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85747

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2021-12-27
           Severity|normal                      |enhancement
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW

--- Comment #9 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
size: 13-3, last_iteration: 13-3
  Loop size: 13
  Estimated size after unrolling: 40
Not unrolling loop 1: it is not innermost and code would grow.

There are a few others like this one.
Note that LLVM is even able to handle:
template <class It>
constexpr void sort(It first, It last) {
    for (;first != last; ++first) {
        auto it = first;
        ++it;
        for (; it != last; ++it) {
            if (*it < *first) {
                auto tmp = *it;
                *it = *first;
                *first = tmp;
            }
        }
    }
}

static int generate() {
    int a[] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21};
    a[5] = 55;
    sort(a + 0, a + 21);
    return a[0] + a[6] + a[1] + a[2] + a[3] + a[4];
}

I suspect its cost estimate for the loop accounts for the removal of the loads
of a[i], since it knows that a is fully written to.
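
For reference, here is a sketch of the constant that full folding of the
testcase above should produce. It assumes C++14 or later (relaxed constexpr);
generate_constexpr is a hypothetical name for a constexpr copy of generate(),
not something from the original testcase. The value 17 follows from sorting
the first 21 elements, which moves 55 to a[20] and shifts 6 and 7 into a[5]
and a[6].

```cpp
// Same exchange sort as in the testcase above.
template <class It>
constexpr void sort(It first, It last) {
    for (; first != last; ++first) {
        auto it = first;
        ++it;
        for (; it != last; ++it) {
            if (*it < *first) {
                auto tmp = *it;
                *it = *first;
                *first = tmp;
            }
        }
    }
}

// Hypothetical constexpr variant of generate(), so the front end is
// forced to evaluate the whole sort at compile time.
constexpr int generate_constexpr() {
    int a[] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21};
    a[5] = 55;
    sort(a + 0, a + 21);  // sorts a[0..20]; a[21] is left untouched
    return a[0] + a[6] + a[1] + a[2] + a[3] + a[4];  // 0 + 7 + 1 + 2 + 3 + 4
}

// If the optimizer fully unrolled and folded the non-constexpr version,
// generate() would reduce to returning this constant.
static_assert(generate_constexpr() == 17, "expected folded result");
```

This only shows the expected result via constant evaluation in the front end;
it says nothing about whether the cunroll heuristics will get there.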
