https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109011

            Bug ID: 109011
           Summary: missed optimization in presence of __builtin_ctz
           Product: gcc
           Version: 12.2.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: vincenzo.innocente at cern dot ch
  Target Milestone: ---

In the following code, foo does not vectorize but bar does.
Clang vectorizes foo using a pattern that invokes vplzcntd.

(The code is made slightly complex so that vectorization is "relevant".)

See https://godbolt.org/z/5fa1zbPeG

#include <cstdint>
uint32_t x[256];
uint32_t y[256];
uint32_t w[256];
uint32_t z[256];



void foo() {
  for (int i = 0; i < 256; i++) {
    auto p = x[i] ? __builtin_ctz(x[i]) : y[i];
    z[i] = w[i] * p;
  }
}


void bar() {
  for (int j = 0; j < 256; j += 8)
    for (int i = j; i < j + 8; i++) {
      // auto p = x[i] ? x[i] : y[i];
      auto p = x[i] ? __builtin_ctz(x[i]) : y[i];
      z[i] = w[i] * p;
    }
}
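
For reference, a trailing-zero count can be expressed through a leading-zero count (the per-lane operation that vplzcntd provides) using the identity ctz(v) = 31 - clz(v & -v) for a nonzero 32-bit v. The scalar sketch below only illustrates that identity. The helper name ctz_via_clz is hypothetical, and this is not necessarily the exact sequence clang emits.

```cpp
#include <cstdint>

// Hypothetical helper (not part of the original report): count trailing
// zeros of a NONZERO 32-bit value using only a leading-zero count.
// Like __builtin_ctz, the result is undefined for v == 0, which matches
// the x[i] != 0 guard used in foo and bar.
inline uint32_t ctz_via_clz(uint32_t v) {
  // v & -v isolates the lowest set bit; unsigned negation wraps modulo 2^32.
  uint32_t lowest = v & (0u - v);
  // A single set bit at position k has clz == 31 - k, so k == 31 - clz.
  return 31u - static_cast<uint32_t>(__builtin_clz(lowest));
}
```

Under this assumption, the expression in foo could equivalently be written as x[i] ? ctz_via_clz(x[i]) : y[i]. Whether GCC vectorizes that form would need to be checked separately.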
