https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102160
Bug ID: 102160
Summary: Too many runtime alias checks when vectorizing
Product: gcc
Version: 12.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: rguenth at gcc dot gnu.org
Target Milestone: ---

The following is reduced from (or rather "inspired by") 507.cactuBSSN_r
ML_BSSN_Advect_Body where, when one works around other issues by editing the
source, the vectorizer intends to create > 8000 runtime alias checks (and
refuses).

void foo (double *a, double *b, int off, int n, int m)
{
  for (int j = 0; j < m; ++j)
    for (int i = 0; i < n; ++i)
      a[j*n+i] = b[j*n+i] + b[(j+1)*n+i] + b[(j-1)*n+i];
}

This small example iterates over a 2d array in a linearized way (and in a way
that, as written, does not actually guarantee that each a[j*n + i] is written
only once, that is, that the two dimensions do not "overlap").

The interesting bit is that the kernel offsets the accesses in the outer loop
iteration direction, and thus when analyzing the refs in the innermost loop we
have three unknown non-constant offsets to b[] and we will create three runtime
alias checks that fail to merge (obviously).

We need to do better by formulating the alias checks with respect to the
outermost [interesting] iteration, where we should be able to merge the checks
into one, obviously making it less precise by computing the access extent of
the whole loop nest. As an additional benefit the runtime alias check can be
hoisted and thus versioning applied to the outer loop. That might even already
work magically.
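
To illustrate the merged check proposed above, here is a hand-written sketch of what versioning the outer loop on a single whole-nest alias check could look like at the source level. This is not what GCC emits. The names nest_no_alias and foo_versioned are hypothetical, the unused off parameter is dropped, and the sketch assumes that b points far enough into its array that b[-n] is a valid element. Over the whole nest, a is written in [a, a + m*n) and b is read in [b - n, b + (m+1)*n), so one overlap test on those two extents conservatively replaces the three per-inner-loop checks.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: return 1 if the region written through A over the
   whole loop nest cannot overlap the region read through B.
   Extents (in elements): A covers [0, m*n), B covers [-n, (m+1)*n).
   Addresses are compared as integers so no out-of-bounds pointer is formed.  */
static int
nest_no_alias (const double *a, const double *b, int n, int m)
{
  if (n <= 0 || m <= 0)
    return 1;  /* The nest executes no statement, nothing can conflict.  */

  uintptr_t row = (uintptr_t) n * sizeof (double);
  uintptr_t a_lo = (uintptr_t) a;
  uintptr_t a_hi = a_lo + (uintptr_t) m * row;
  uintptr_t b_lo = (uintptr_t) b - row;
  uintptr_t b_hi = (uintptr_t) b + ((uintptr_t) m + 1) * row;

  return a_hi <= b_lo || b_hi <= a_lo;
}

/* Hypothetical versioned form of foo: the check is evaluated once, outside
   the outer loop, and selects between two copies of the whole nest.  */
void
foo_versioned (double *a, double *b, int n, int m)
{
  if (nest_no_alias (a, b, n, m))
    {
      /* Independent accesses: restrict states the no-alias fact, so the
	 inner loop needs no further runtime checks to be vectorized.  */
      double *restrict ra = a;
      const double *restrict rb = b;
      for (int j = 0; j < m; ++j)
	for (int i = 0; i < n; ++i)
	  ra[j*n+i] = rb[j*n+i] + rb[(j+1)*n+i] + rb[(j-1)*n+i];
    }
  else
    {
      /* Possible overlap: keep the original scalar semantics.  */
      for (int j = 0; j < m; ++j)
	for (int i = 0; i < n; ++i)
	  a[j*n+i] = b[j*n+i] + b[(j+1)*n+i] + b[(j-1)*n+i];
    }
}
```

The merged check is less precise than per-row checks: it can send the nest to the scalar copy even when no individual inner-loop iteration would conflict, which is the trade-off the report accepts in exchange for a single hoistable test.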