Bug ID: 36448
           Summary: Vectorization improvement opportunity for loops with
           Product: libraries
           Version: trunk
          Hardware: PC
                OS: Windows NT
            Status: NEW
          Severity: enhancement
          Priority: P
         Component: Backend: X86

Consider the following loop:
void testStride(int a[], int b[], int N) {
  for (int i = 0; i < N; i+=2)
    a[i] = b[i];
}

If AVX-512 support is enabled (-march=skylake-avx512), LLVM is able to
vectorize this loop using gather/scatter.

However, without AVX-512 support LLVM does not vectorize this loop: its cost
model considers vectorization inefficient because the memory accesses would
have to be scalarized.

At the same time, the LLVM vectorizer supports masked load/store, but does not
use it for loops with strided access. It is only used for loops with conditions.

Specifically, if I rewrite the loop as
void testCond(int a[], int b[], int N) {
  for (int i = 0; i < N; i++)
    if ((i % 2) == 0)
      a[i] = b[i];
}

LLVM vectorizes this loop and uses masked load/store. However, it fails to
recognize the simple stride pattern in the mask and recomputes the mask on
each iteration.

So I guess there are two opportunities here:
1) Support masked load/store for strided memory access.
2) Be smarter about detecting that the mask is loop-invariant and hoisting it
   out of the loop.
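
To make the two points concrete, here is a rough scalar sketch in plain C of
the code shape I have in mind. It is only an illustration, not what LLVM
emits: VF, masked_copy and testStrideMasked are hypothetical names, the vector
width of 8 lanes is an assumption, and the masked vector load/store is
emulated lane by lane. The mask is built once before the loop, which is valid
here because every block starts at an even index.

```c
#include <stdbool.h>

enum { VF = 8 }; /* assumed vector width in int lanes (hypothetical) */

/* Emulates one masked vector load followed by a masked vector store:
   only lanes whose mask entry is true are read and written. */
static void masked_copy(int *dst, const int *src, const bool mask[VF]) {
  for (int lane = 0; lane < VF; ++lane)
    if (mask[lane])
      dst[lane] = src[lane];
}

void testStrideMasked(int a[], int b[], int N) {
  /* Opportunity 2: the mask is loop-invariant, so compute it once.
     Even lanes are enabled; this matches i += 2 because each block
     starts at a multiple of VF, which is even. */
  bool mask[VF];
  for (int lane = 0; lane < VF; ++lane)
    mask[lane] = (lane % 2) == 0;

  /* Opportunity 1: cover the strided access with contiguous masked
     accesses, one block of VF elements per iteration. */
  int i = 0;
  for (; i + VF <= N; i += VF)
    masked_copy(&a[i], &b[i], mask);

  /* Scalar remainder, identical to the original loop body. */
  for (; i < N; i += 2)
    a[i] = b[i];
}
```

The result should match testStride for any N: disabled lanes are never read or
written, so odd elements of a[] keep their previous values.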
