Richard Biener <richard.guent...@gmail.com> writes:
> On Thu, Feb 3, 2022 at 11:52 AM Richard Sandiford via Gcc-patches
> <gcc-patches@gcc.gnu.org> wrote:
>>
>> After the fix for PR102659, the vectoriser can no longer group
>> conditional accesses of the form:
>>
>>   for (int i = 0; i < n; ++i)
>>     if (...)
>>       ...a[i * 2] + a[i * 2 + 1]...;
>>
>> on LP64 targets. It has to treat them as two independent
>> gathers instead.
>
> Hmm, that's unfortunate. Can you file an enhancement bugreport?

OK, filed as PR104368.

> How does using intptr_t help? i * 2 can still overflow with large n,
> so can it with 'int' on ILP32. So I guess this is the old issue
> of transforming (uint64)(i * 2 + 1) to (uint64)(i*2) + 1UL?

That does happen, but I'm not sure that it's the main problem.
SCEV analysis seems to fail for the a[i * 2] access too.

With ints the &a[i * 2] calculation is:

  _45 = (unsigned int) i_26;
  _46 = _45 * 2;
  _5 = (int) _46;
  _6 = (long unsigned int) _5;
  _7 = _6 * 4;
  _48 = _47 + _7;

and the &a[i * 2 + 1] calculation is:

  _10 = _6 + 1;
  _11 = _10 * 4;
  _51 = _11 + _47;

With intptr_ts the &a[i * 2] calculation is:

  i.0_1 = (long unsigned int) i_23;
  _5 = i.0_1 * 8;
  _40 = _39 + _5;

and the &a[i * 2 + 1] calculation is:

  _8 = _5 + 4;
  _43 = _8 + _39;

which looks correct. If the intptr_t i * 2 wraps then a
&a[(uintptr_t)i * 2] IV will still behave correctly, so the
{a, +, 8} SCEV still seems accurate. The int i * 2 would instead
wrap at 32 bits, so &a[(unsigned)i * 2] isn't linear in any
meaningful sense.

I don't know if the wrapping intptr_t SCEV leads to well-formed
gimple though. Are pointer IVs assumed not to overflow? If so,
I guess we might still be introducing UB for some intptr_t cases
(although not this one AFAICT, since any wrapping cases would be
UB in the source too).

Thanks,
Richard
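
For reference, the wrapping argument above can be checked with a small C sketch. It is only an illustration: the helper names offset_int_index and offset_wide_index are made up here, and it assumes 4-byte elements (as the "* 4" in the gimple suggests) and 64-bit addresses. The first helper mirrors the int sequence (multiply in 32 bits, sign-extend, then scale) and the second mirrors the intptr_t sequence (a single 64-bit multiply by 8).

```c
#include <stdint.h>

/* Byte offset of a[i * 2], following the "int" gimple:
   _46 = _45 * 2 (32-bit wrap), _5/_6 = sign-extend to 64 bits,
   _7 = _6 * 4.  The sign extension is spelled out with a mask so
   that every step is fully defined C. */
static uint64_t offset_int_index(uint32_t i)
{
    uint32_t twice = i * 2u;            /* wraps modulo 2^32 */
    uint64_t ext = twice;
    if (twice & 0x80000000u)            /* value is negative as a 32-bit int */
        ext |= 0xFFFFFFFF00000000ull;   /* sign-extend to 64 bits */
    return ext * 4u;                    /* scale by the assumed element size */
}

/* Byte offset of a[i * 2], following the "intptr_t" gimple:
   _5 = i.0_1 * 8, computed modulo 2^64. */
static uint64_t offset_wide_index(uint64_t i)
{
    return i * 8u;
}
```

Stepping i from 0x3FFFFFFF to 0x40000000, the wide version advances by exactly 8 bytes (and does so for every i, modulo 2^64), which matches the {a, +, 8} SCEV. The int version advances by 8 only while i * 2 fits in 32 bits; at that step the 32-bit product wraps and the offset jumps, so no single constant step describes it.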