https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99395
--- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
So in the BB SLP attempt from loop vectorization (or in the BB SLP pass with
-fno-predictive-commoning), DR group building gets confused by a duplicate
access, and the fixup then splits the candidates at odd points.
For the reduced testcase we see:

  <bb 3> [local count: 1063004409]:
  # i_16 = PHI <_5(5), 0(2)>
  # ivtmp_18 = PHI <ivtmp_15(5), 511(2)>
  _1 = i_16 + 1;
  _2 = a[_1];
  _3 = a[i_16];
  _4 = _2 * _3;
  a[i_16] = _4;
  _5 = i_16 + 2;
  _6 = a[_5];
  _7 = a[_1];
  _8 = _6 * _7;
  a[_1] = _8;
  ivtmp_15 = ivtmp_18 - 1;
  if (ivtmp_15 != 0)
    goto <bb 5>; [99.00%]
  else
    goto <bb 4>; [1.00%]

So a[_1] is loaded twice because CSE doesn't figure out that the store to
a[i_16] cannot alias it. That duplicate load causes us to split the load group.
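
For illustration only, here is a C sketch of a loop that would plausibly
lower to the GIMPLE above. It is reconstructed from the dump, not taken from
the PR's testcase: the array size N, the int element type, and the function
names kernel, kernel_single_load and variants_agree are all assumptions. The
second variant loads a[i + 1] once by hand, which is what CSE could do if it
knew the store to a[i] cannot alias that load.

```c
#include <string.h>

#define N 1024   /* assumed size; the dump does not show the declaration */

int a[N];        /* element type int is an assumption */

/* Shape suggested by the dump: 511 iterations, i advancing by 2.
   a[i + 1] is read once per statement, hence twice per iteration.  */
void kernel(void)
{
  for (int i = 0; i < 1022; i += 2)
    {
      a[i] = a[i + 1] * a[i];
      a[i + 1] = a[i + 2] * a[i + 1];
    }
}

/* Same computation with the duplicate load removed by hand.  This is
   valid because the store to a[i] never overlaps a[i + 1].  */
void kernel_single_load(void)
{
  for (int i = 0; i < 1022; i += 2)
    {
      int t = a[i + 1];
      a[i] = t * a[i];
      a[i + 1] = a[i + 2] * t;
    }
}

/* Small values keep every product far from int overflow.  */
static void fill(void)
{
  for (int i = 0; i < N; i++)
    a[i] = i % 7 + 1;
}

/* Returns 1 when both variants leave the array in the same state.  */
int variants_agree(void)
{
  static int expected[N];

  fill();
  kernel();
  memcpy(expected, a, sizeof a);

  fill();
  kernel_single_load();
  return memcmp(expected, a, sizeof a) == 0;
}
```

Whether GCC produces exactly the dump above from kernel depends on the
version and flags in use; the sketch is only meant to show where the second
a[_1] load comes from.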