http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51499

             Bug #: 51499
           Summary: vectorizer missing simple case
    Classification: Unclassified
           Product: gcc
           Version: 4.6.2
            Status: UNCONFIRMED
          Severity: enhancement
          Priority: P3
         Component: tree-optimization
        AssignedTo: unassig...@gcc.gnu.org
        ReportedBy: fb.programm...@gmail.com


The SSE vectorizer seems to miss one of the simplest cases:

#include <cstdio>
#include <cstdlib>

double loop(double a, size_t n){
   // initialise differently so compiler doesn't simplify
   double sum1=0.1, sum2=0.2, sum3=0.3, sum4=0.4, sum5=0.5, sum6=0.6;
   for(size_t i=0; i<n; i++){
      sum1+=a; sum2+=a; sum3+=a; sum4+=a; sum5+=a; sum6+=a;
   }
   return sum1+sum2+sum3+sum4+sum5+sum6-2.1-6.0*a*n;
}

int main(int argc, char** argv) {
   size_t n=1000000;
   double a=1.1;
   printf("res=%f\n", loop(a,n));
   return EXIT_SUCCESS;
}

g++-4.6.2 -Wall -O2 -ftree-vectorize -ftree-vectorizer-verbose=2 test.cpp

test.cpp:7: note: not vectorized: unsupported use in stmt.
test.cpp:4: note: vectorized 0 loops in function.

The loop body compiles to six scalar addsd instructions, whereas
vectorization should have packed the six independent accumulators into
three addpd instructions.

.L3:
    addq    $1, %rax
    addsd    %xmm0, %xmm6
    cmpq    %rdi, %rax
    addsd    %xmm0, %xmm5
    addsd    %xmm0, %xmm4
    addsd    %xmm0, %xmm3
    addsd    %xmm0, %xmm2
    addsd    %xmm0, %xmm1
    jne    .L3
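
For comparison, the packing asked for above can be written by hand. This
is only a sketch, assuming the GCC vector_size extension (also accepted by
Clang); the names v2df, loop_packed, s12, s34, s56 and va are hypothetical
and not part of the report.

```cpp
#include <cstddef>
#include <cstdio>

// Assumption: GCC/Clang vector extension, two doubles in one 16-byte value.
typedef double v2df __attribute__((vector_size(16)));

// Hypothetical hand-packed variant of loop() from the report:
// the six accumulators are grouped into three pairs.
double loop_packed(double a, std::size_t n) {
    v2df s12 = {0.1, 0.2};   // sum1, sum2
    v2df s34 = {0.3, 0.4};   // sum3, sum4
    v2df s56 = {0.5, 0.6};   // sum5, sum6
    const v2df va = {a, a};  // a broadcast to both lanes

    for (std::size_t i = 0; i < n; i++) {
        // Three packed additions replace the six scalar ones.
        s12 += va;
        s34 += va;
        s56 += va;
    }

    // Same reduction order as the scalar version in the report.
    return s12[0] + s12[1] + s34[0] + s34[1] + s56[0] + s56[1]
           - 2.1 - 6.0 * a * n;
}
```

On x86-64 each `+=` on a v2df would be expected to become a single addpd.
Each lane performs the same sequence of additions as its scalar
counterpart, so the result should agree with loop() up to the usual
floating-point rounding.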
