https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121393
--- Comment #2 from Andrew Stubbs <ams at gcc dot gnu.org> ---
Here's preprocessed code:

    #pragma omp for collapse(3)
    for (v1 = 0; v1 < 20; v1 += 2)
      for (v2 = 0x7fffffffffffffffLL + 11ULL;
           v2 != 0x7fffffffffffffffLL - 4ULL; -- v2)
        for (v3 = 10; v3 != 0; v3--)
          b[v1 >> 1][v2 - 0x7fffffffffffffffLL + 3][v3 - 1] += 5.5;

But the "original" dump has this:

    #pragma omp for collapse(3)
    for (v1 = 0; v1 < 20; v1 = v1 + 2)
      for (v2 = 9223372036854775818; v2 != 9223372036854775803; --v2)
        for (v3 = 10; v3 > 0; v3-- )
          {
            b[v1 >> 1][v2 + 9223372036854775812][v3 + 4294967295] = b[v1 >> 1][v2 + 9223372036854775812][v3 + 4294967295] + 5.5e+0
          }

So the +4294967295 for -1 is there right from the front end (which is in the
x86_64 compiler, BTW), and this does not work when the offset gets
zero-extended to 64-bit in the AMDGCN back end.
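To illustrate the failure mode, here is a minimal sketch in plain C. It is not the testcase and not what the back end actually emits. It assumes v3 is a 32-bit unsigned type (the comment does not state the type), and the function names are made up for illustration. Adding 4294967295 is the same as subtracting 1 only as long as the addition wraps at 32 bits; once the constant is zero-extended and added at 64 bits, the index is off by 2^32.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: index computed in 32-bit unsigned arithmetic.
   The addition wraps modulo 2^32, so +4294967295 behaves like -1.  */
static uint64_t
index_wrap32 (uint32_t v3)
{
  uint32_t idx = (uint32_t) (v3 + UINT32_C (4294967295));
  return (uint64_t) idx;
}

/* Hypothetical helper: the offset is zero-extended to 64 bits first.
   No wraparound happens, so the result is v3 + 2^32 - 1.  */
static uint64_t
index_zext (uint32_t v3)
{
  return (uint64_t) v3 + (uint64_t) UINT32_C (4294967295);
}

/* Hypothetical helper: the offset is treated as the signed value -1 and
   sign-extended, which gives v3 - 1 again in 64-bit arithmetic.  */
static uint64_t
index_sext (uint32_t v3)
{
  const int32_t off = -1;
  return (uint64_t) v3 + (uint64_t) (int64_t) off;
}

/* Checks for the first iteration of the inner loop (v3 == 10).  */
static void
check_examples (void)
{
  assert (index_wrap32 (10) == 9);
  assert (index_zext (10) == UINT64_C (4294967305));
  assert (index_sext (10) == 9);
}
```

With v3 == 10 the wrapping and sign-extended forms both give the intended index 9, while the zero-extended form gives 4294967305, far outside the array.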