https://bugs.llvm.org/show_bug.cgi?id=41732

            Bug ID: 41732
           Summary: Illegal vectorization of a loop with a recurrence with
                    Skylake, KNL
           Product: new-bugs
           Version: trunk
          Hardware: PC
                OS: Linux
            Status: NEW
          Severity: normal
          Priority: P
         Component: new bugs
          Assignee: hideki.sa...@intel.com
          Reporter: rscottman...@gmail.com
                CC: craig.top...@gmail.com, hfin...@anl.gov,
                    htmldevelo...@gmail.com, llvm-bugs@lists.llvm.org

The test case below is derived from an HPC benchmark. LLVM is illegally
vectorizing the loop in function one() despite an obvious recurrence.
Curiously, when function two() is commented out, LLVM correctly determines it
is unsafe to vectorize. Inhibiting inlining of one() also prevents
vectorization.

I did some trace debugging and found that DepChecker->areDepsSafe() returns
false in the good version but true in the bad version.


// bad version

clang -S -O2 -Rpass=loop-vectorize test.c  -march=skylake-avx512
test.c:6:3: remark: vectorized loop (vectorization width: 16, interleaved
count: 1) [-Rpass=loop-vectorize]
  do {
  ^

// good version

clang -S -O2 -Rpass=loop-vectorize -Rpass-missed=loop-vectorize test.c 
-march=skylake-avx512 -DNO_TWO
test.c:6:3: remark: loop not vectorized [-Rpass-missed=loop-vectorize]
  do {


Note: The problem exists on other targets, but the generated code there
happens to be correct because vectorization is not considered profitable. For
instance, with -march=broadwell (which does not have hardware scatter):

LV: Loop cost is 8
LV: Interleaving to reduce branch cost.
LV: Vectorization is possible but not beneficial.
LV: Interleaving is not beneficial.

With -march=knl, however, the loop is vectorized as well.



Test case:

static void  __attribute__ ((always_inline)) one(
  const int *restrict in, const int *const end,
  const unsigned shift, int *const restrict index,
  int *const restrict out)
{
  do {
    int a_idx = *in>>shift;
    int b_idx = index[a_idx];   
    out[b_idx] = *in;         // <-- recurrence: a_idx can repeat within one
    index[a_idx]++;           //     vector, so index[a_idx] is incremented
  } while(++in!=end);         //     between lanes; vectorizing breaks this
}

#ifndef NO_TWO
static void  __attribute__ ((noinline)) two(
  const int *restrict in, const int *const end,
  const unsigned shift, int *const restrict index,
  int *const restrict out)
{
  do out[index[(*in>>shift)]++]=*in; while(++in!=end);
}
#endif

void parent(
  int digits, int n, int *restrict work, int * restrict idx,
  int *restrict shift, int **restrict indices)
{
  int *in = work;
  int *dst = work+n;
  int d;
  for(d=1;d!=digits-1;++d) {
    int *t;
    one(in,in+n,shift[d],indices[d],dst);
    t=in,in=dst,dst=t;
  }
#ifndef NO_TWO
  two(in,in+n,shift[d],indices[d],idx);
#endif
}
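
To make the recurrence concrete, here is a minimal sketch that compares the
scalar loop from one() with a simplified model of a vectorized version. The
names scatter_scalar, scatter_blocked, demo and the width VF are hypothetical
and chosen only for illustration; the blocked model (gather all index[]
values, then scatter) is an assumption about how a gather/scatter
vectorization behaves, not the code LLVM actually emits.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { VF = 4 };  /* hypothetical vector width, for illustration only */

/* Scalar semantics of the loop in one(). */
static void scatter_scalar(const int *in, size_t n, unsigned shift,
                           int *index, int *out)
{
  for (size_t i = 0; i < n; ++i) {
    int a_idx = in[i] >> shift;
    out[index[a_idx]] = in[i];
    index[a_idx]++;
  }
}

/* Simplified model of a vectorized loop: for each block of VF iterations,
   gather every index[a_idx] first, then scatter to out[], then scatter the
   incremented values back to index[]. */
static void scatter_blocked(const int *in, size_t n, unsigned shift,
                            int *index, int *out)
{
  for (size_t i = 0; i < n; i += VF) {
    size_t w = (n - i < VF) ? n - i : VF;
    int a_idx[VF], b_idx[VF];
    for (size_t l = 0; l < w; ++l) {      /* gather */
      a_idx[l] = in[i + l] >> shift;
      b_idx[l] = index[a_idx[l]];
    }
    for (size_t l = 0; l < w; ++l)        /* scatter to out[] */
      out[b_idx[l]] = in[i + l];
    for (size_t l = 0; l < w; ++l)        /* scatter incremented index[] */
      index[a_idx[l]] = b_idx[l] + 1;
  }
}

/* Runs both versions on an input where a_idx repeats within one block.
   Returns 1 if the results agree, 0 if they differ. */
static int demo(void)
{
  const int in[4] = {0, 1, 2, 3};         /* shift = 1 gives a_idx 0,0,1,1 */
  const int expect[4] = {0, 1, 2, 3};
  int idx_s[2] = {0, 2}, idx_v[2] = {0, 2};
  int out_s[4] = {-1, -1, -1, -1}, out_v[4] = {-1, -1, -1, -1};

  scatter_scalar(in, 4, 1, idx_s, out_s);
  assert(memcmp(out_s, expect, sizeof expect) == 0);

  scatter_blocked(in, 4, 1, idx_v, out_v);
  return memcmp(out_s, out_v, sizeof out_s) == 0 &&
         memcmp(idx_s, idx_v, sizeof idx_s) == 0;
}
```

With this input demo() returns 0: in the blocked model the two lanes that
share a_idx both read the same index[a_idx], so they store to the same out[]
slot and the counter advances by one instead of two.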
