https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84485
            Bug ID: 84485
           Summary: [6/7 Regression] Vectorising zero-stride rmw operation
           Product: gcc
           Version: 6.0
            Status: UNCONFIRMED
          Keywords: wrong-code
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: rsandifo at gcc dot gnu.org
  Target Milestone: ---

r223486 meant that we could vectorise:

void
f (unsigned long incx, unsigned long incy,
   float *restrict dx, float *restrict dy)
{
  unsigned long ix = 0, iy = 0;
  for (unsigned long i = 0; i < 512; ++i)
    {
      dy[iy] += dx[ix];
      ix += incx;
      iy += incy;
    }
}

without first proving that incy is nonzero.  At the time this particular
testcase needed -fno-vect-cost-model on both x86_64 and aarch64, but by
the time of the GCC 6 and 7 releases, the aarch64 vectorisation costs
allowed it to be vectorised at -O3.

This was fixed for GCC 8 with r256644, but that's obviously much too
invasive to backport.
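
As a rough illustration of why a zero incy matters, the sketch below compares the scalar loop with a hand-written model of a vectorised read-modify-write. The model is an assumption made for illustration, not the code GCC actually emits: f_scalar, f_vector_model and the vectorisation factor VF = 4 are hypothetical names and values. The model does all VF loads of dy before any of the VF stores, so when incy is 0 every lane reads the same stale dy[0] and only one addition per group survives.

```c
#include <stddef.h>

#define N 512   /* trip count taken from the testcase */
#define VF 4    /* hypothetical vectorisation factor, chosen for illustration */

/* Scalar reference: the loop from the report, unchanged in behaviour.  */
static void
f_scalar (unsigned long incx, unsigned long incy,
          const float *restrict dx, float *restrict dy)
{
  unsigned long ix = 0, iy = 0;
  for (unsigned long i = 0; i < N; ++i)
    {
      dy[iy] += dx[ix];
      ix += incx;
      iy += incy;
    }
}

/* Hand-written model of a naive vectorisation (an assumption, not GCC
   output): each group of VF iterations performs all of its loads and
   additions first, then all of its stores.  */
static void
f_vector_model (unsigned long incx, unsigned long incy,
                const float *restrict dx, float *restrict dy)
{
  unsigned long ix = 0, iy = 0;
  for (unsigned long i = 0; i < N; i += VF)
    {
      float tmp[VF];

      /* Strided loads and adds for all lanes.  */
      for (int l = 0; l < VF; ++l)
        tmp[l] = dy[iy + l * incy] + dx[ix + l * incx];

      /* Strided stores for all lanes.  With incy == 0 they all hit the
         same element, so only the last lane's value is kept.  */
      for (int l = 0; l < VF; ++l)
        dy[iy + l * incy] = tmp[l];

      ix += VF * incx;
      iy += VF * incy;
    }
}
```

With incx = 1, incy = 0, dx filled with 1.0f and dy[0] starting at 0, the scalar loop leaves dy[0] at 512, while the model leaves it at 128 (one addition per group of four).  With incy = 1 the two agree.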