https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121393
--- Comment #1 from Andrew Stubbs <ams at gcc dot gnu.org> ---

The underlying issue here is not new. What is new is that the code is being vectorized using gather_loads, where before the vectorizer was either failing entirely, or using another kind of load. The testcase reverts to the old behaviour with "--foffload-options=--param=vect-partial-vector-usage=0".

At least for "for-11", the fault is caused by a gather_load with negative offsets being marked as unsigned. The offsets are 32-bit, so the compiler uses a zero-extend to calculate the 64-bit address. This leads directly to the invalid addresses at runtime.

However, the unsigned-ness is right there in the testcase:

  __attribute__((noinline, noclone)) void
  N(f27) (void)
  {
    SC unsigned int v1, v3;
    SC unsigned long long v2;
    OMPTGT
    #pragma omp F S collapse(3)
    for (v1 = 0; v1 < 20; v1 += 2)
      for (v2 = __LONG_LONG_MAX__ + 11ULL;
           v2 != __LONG_LONG_MAX__ - 4ULL; -- v2)
        for (v3 = 10; v3 != 0; v3--)
          b[v1 >> 1][v2 - __LONG_LONG_MAX__ + 3][v3 - 1] += 5.5;
  }

If v3 is made signed then the testcase passes. Likewise, if v3 is made unsigned long long then the testcase passes. (v1 can remain unsigned int.)

There are so many boundary cases in this testcase I'm really not sure if it can be considered well formed? However, my expectation would be that v3 gets zero-extended to 64-bit, and then everything else works. What is actually happening is that the middle-end is truncating the offsets to 32-bit, and this I do not expect.
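
For illustration only, the effect of zero-extending a negative 32-bit offset can be reproduced in a few lines of plain C. This is a minimal sketch and not the code GCC generates: the function names addr_zero_extend and addr_sign_extend and the base address used are hypothetical, chosen just to show the arithmetic.

```c
#include <stdint.h>

/* Hypothetical helper: the 32-bit offset is treated as unsigned, so it is
   zero-extended to 64 bits before being added to the base address.  */
uint64_t
addr_zero_extend (uint64_t base, int32_t offset)
{
  return base + (uint64_t) (uint32_t) offset;
}

/* Hypothetical helper: the same 32 bits are treated as signed, so the
   offset is sign-extended to 64 bits before being added to the base.  */
uint64_t
addr_sign_extend (uint64_t base, int32_t offset)
{
  return base + (uint64_t) (int64_t) offset;
}
```

With a base of 0x1000 and an offset of -1, the sign-extended form gives 0xFFF, one byte below the base, while the zero-extended form gives 0x100000FFF, nearly 4 GiB above it. For non-negative offsets the two forms agree, which would be consistent with the fault only showing up when an offset is negative.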