https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121949

            Bug ID: 121949
           Summary: Missed shift vectorization when IV value has a
                    different datatype
           Product: gcc
           Version: 16.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: tnfchris at gcc dot gnu.org
            Blocks: 53947
  Target Milestone: ---

I think the solution to this is probably the same as in
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119860#c1 but I'm filing it as a
separate ticket to have something to test with.

The following example:

void f1(long long word, long long* acc)
{
    for (long long row = 0; row < 64; ++row)
    {
        if (word & (1ull << row)) {
            acc[row] += row;
        }
    }
}

void f2(long long word, long long* acc)
{
    for (int row = 0; row < 64; ++row)
    {
        if (word & (1ull << row)) {
            acc[row] += row;
        }
    }
}

compiled with -O3 -march=armv8-a+sve is vectorized for f1 but not for f2.

This is because the shift amount "row" is 32 bits but the datatype of the shift
is 64 bits.
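
The width mismatch can be seen at the source level with the sketch below. It
only illustrates the C typing rule (the result of a shift has the type of the
promoted left operand); it says nothing about GCC internals. The names
SHIFTED_BITS, AMOUNT_BITS and f2_shift_is_mixed_width are hypothetical and not
part of the testcase.

```c
#include <limits.h>

/* Width in bits of the shifted value and of the shift amount as written
   in f2.  In C the result type of a shift is the promoted left operand's
   type, so the int shift amount does not change the type of the result.  */
enum {
    SHIFTED_BITS = sizeof(1ull << (int)0) * CHAR_BIT,
    AMOUNT_BITS  = sizeof(int) * CHAR_BIT
};

/* Returns 1 when the shifted value is wider than the shift amount,
   which is the situation f2 is in on targets with a 32-bit int.  */
int f2_shift_is_mixed_width(void)
{
    return SHIFTED_BITS > AMOUNT_BITS;
}
```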

It seems the vectorizer doesn't support increasing the VF and simply extending
the shift amount to 64 bits in this case, and instead refuses to vectorize.

While the optimal solution may be to just extend row to a 64-bit IV, it's
unclear why we don't support unpacking in this case.
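
Since this ticket is meant as something to test with, a runtime check along
the following lines may be useful next to the vectorization check. It is a
minimal sketch, not an existing testsuite file: f1 and f2 are repeated from
above so the block is self-contained, and the helper f1_f2_agree is a
hypothetical name.

```c
#include <string.h>

/* The two loops from the report, repeated unchanged.  */
void f1(long long word, long long* acc)
{
    for (long long row = 0; row < 64; ++row)
    {
        if (word & (1ull << row)) {
            acc[row] += row;
        }
    }
}

void f2(long long word, long long* acc)
{
    for (int row = 0; row < 64; ++row)
    {
        if (word & (1ull << row)) {
            acc[row] += row;
        }
    }
}

/* Hypothetical helper: returns 1 if f1 and f2 leave identical contents
   in acc for the given word, 0 otherwise.  */
int f1_f2_agree(long long word)
{
    long long a[64], b[64];
    for (int i = 0; i < 64; ++i)
        a[i] = b[i] = 100 + i;  /* arbitrary starting values */
    f1(word, a);
    f2(word, b);
    return memcmp(a, b, sizeof a) == 0;
}
```

Whichever way f2 ends up being vectorized, f1_f2_agree should keep returning 1
for any value of word.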


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
[Bug 53947] [meta-bug] vectorizer missed-optimizations
