http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51499
             Bug #: 51499
           Summary: vectorizer missing simple case
    Classification: Unclassified
           Product: gcc
           Version: 4.6.2
            Status: UNCONFIRMED
          Severity: enhancement
          Priority: P3
         Component: tree-optimization
        AssignedTo: unassig...@gcc.gnu.org
        ReportedBy: fb.programm...@gmail.com

The SSE vectorizer seems to miss one of the simplest cases:

#include <cstdio>
#include <cstdlib>

double loop(double a, size_t n){
   // initialise differently so compiler doesn't simplify
   double sum1=0.1, sum2=0.2, sum3=0.3, sum4=0.4, sum5=0.5, sum6=0.6;
   for(size_t i=0; i<n; i++){
      sum1+=a; sum2+=a; sum3+=a; sum4+=a; sum5+=a; sum6+=a;
   }
   return sum1+sum2+sum3+sum4+sum5+sum6-2.1-6.0*a*n;
}

int main(int argc, char** argv) {
   size_t n=1000000;
   double a=1.1;
   printf("res=%f\n", loop(a,n));
   return EXIT_SUCCESS;
}

g++-4.6.2 -Wall -O2 -ftree-vectorize -ftree-vectorizer-verbose=2 test.cpp

test.cpp:7: note: not vectorized: unsupported use in stmt.
test.cpp:4: note: vectorized 0 loops in function.

We get six addsd operations, whereas an optimisation should have given us
three addpd operations.

.L3:
        addq    $1, %rax
        addsd   %xmm0, %xmm6
        cmpq    %rdi, %rax
        addsd   %xmm0, %xmm5
        addsd   %xmm0, %xmm4
        addsd   %xmm0, %xmm3
        addsd   %xmm0, %xmm2
        addsd   %xmm0, %xmm1
        jne     .L3