https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85143
            Bug ID: 85143
           Summary: Loop limit prevents (auto)vectorization
           Product: gcc
           Version: 7.3.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: robertw89 at googlemail dot com
  Target Milestone: ---

I expected GCC to generate a vectorized version of the loop below, potentially
specialized for the hardcoded boundary. LLVM produces strange-looking
(vectorized) code, so I guess it's a bug this time :)

Vectorization works fine if the hardcoded boundary (i < 1337) is removed.

void boxIntersectionSimdNative(
    bool*__restrict__ res,
    double*__restrict__ a,
    double*__restrict__ b,
    int n
) {
    for( int i = 0; i < n && i < 1337; i++) {
        res[i] = a[i] > b[i];
    }
}

Output:

boxIntersectionSimdNative(bool*, double*, double*, int):
        test    ecx, ecx
        jle     .L34
        mov     eax, 1
        jmp     .L30
.L35:
        cmp     r8d, 1336
        jg      .L34
.L30:
        vmovsd  xmm0, QWORD PTR [rsi-8+rax*8]
        mov     r8d, eax
        vcomisd xmm0, QWORD PTR [rdx-8+rax*8]
        seta    BYTE PTR [rdi-1+rax]
        add     rax, 1
        cmp     ecx, r8d
        jg      .L35
.L34:
        rep ret
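The report notes that the loop vectorizes once the hardcoded boundary is removed, which suggests the second exit condition is what blocks the vectorizer. A minimal sketch of a possible source-level workaround is to fold both limits into a single trip count before the loop. The names boxIntersectionHoisted and sameResults are hypothetical and not part of the report, and whether GCC 7.3.0 actually vectorizes this variant has not been checked here; compiling with -fopt-info-vec is one way to see what the vectorizer decided.

```cpp
#include <algorithm>  // std::min
#include <memory>     // std::unique_ptr
#include <vector>

// Loop as written in the report: two exit conditions on every iteration.
void boxIntersectionSimdNative(bool* __restrict__ res, double* __restrict__ a,
                               double* __restrict__ b, int n) {
    for (int i = 0; i < n && i < 1337; i++) {
        res[i] = a[i] > b[i];
    }
}

// Hypothetical variant: both limits are combined once, up front,
// so the loop itself has a single exit condition.
void boxIntersectionHoisted(bool* __restrict__ res, double* __restrict__ a,
                            double* __restrict__ b, int n) {
    const int limit = std::min(n, 1337);
    for (int i = 0; i < limit; i++) {
        res[i] = a[i] > b[i];
    }
}

// Hypothetical helper: returns true if both variants write the same
// results for n input elements (n must be non-negative).
bool sameResults(int n) {
    std::vector<double> a(n), b(n);
    for (int i = 0; i < n; i++) {
        a[i] = i % 7;
        b[i] = i % 5;
    }
    // std::vector<bool> is bit-packed, so plain bool arrays are used instead.
    std::unique_ptr<bool[]> r1(new bool[n]());
    std::unique_ptr<bool[]> r2(new bool[n]());
    boxIntersectionSimdNative(r1.get(), a.data(), b.data(), n);
    boxIntersectionHoisted(r2.get(), a.data(), b.data(), n);
    return std::equal(r1.get(), r1.get() + n, r2.get());
}
```

Both functions write exactly the first min(n, 1337) elements of res, so the rewrite should not change behavior; it only changes the shape of the loop the optimizer sees.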