For:

  void
  f1 (int *x, int *y)
  {
    for (int i = 0; i < 32; ++i)
      x[i] += y[i];
  }

we check at runtime whether one vector at x would overlap one vector
at y.  But in cases like this, the vector code would handle x <= y
just fine, since any write to address A still happens after any read
from address A.  The only problem is if x is ahead of y by less than
a vector.

The same is true for two writes:

  void
  f2 (int *x, int *y)
  {
    for (int i = 0; i < 32; ++i)
      {
        x[i] = i;
        y[i] = 2;
      }
  }

If y <= x then a vector write at y after a vector write at x would
have the same net effect as the original scalar writes.

Read-after-write cases like:

  int
  f3 (int *x, int *y)
  {
    int res = 0;
    for (int i = 0; i < 32; ++i)
      {
        x[i] = i;
        res += y[i];
      }
    return res;
  }

can cope with x == y, but otherwise don't allow overlap in either
direction.  Since checking for x == y at runtime would require extra
code, we're probably better off sticking with the current overlap
test.

An overlap test is also needed if the scalar or vector accesses
covered by the alias check are mixed together, rather than all
statements for the second access following all statements for the
first access.

This patch series tracks whether accesses in an alias check are
well-ordered, and also tracks which combination of RAW, WAR and WAW
dependencies the alias check covers.  It then uses a more relaxed
condition for well-ordered WAR and WAW alias checks.

The most important case this allows is equal addresses/indices for
WAR dependencies.  E.g. more realistic instances of functions like f1
can support both "constructive" and "destructive" operations, where
the destination pointer is explicitly allowed to be the same as a
source pointer or point to an independent object.

The series probably doesn't help other cases that much in practice.
However, the checks involved are just as simple as (and sometimes
slightly simpler than) the corresponding overlap tests, so there
should be no downside to using the more relaxed tests whenever we can.
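
As a rough illustration of the difference for f1 (this is only a
sketch, not the code the vectorizer emits), the following models one
vector iteration as a load of VF elements followed by a store of VF
elements, and compares the current overlap test with a relaxed WAR
test.  VF, N, PAD and every function name here are assumptions made
up for the example, and the address comparisons are done on uintptr_t
values taken from a single buffer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical vectorization factor, trip count and buffer padding.  */
enum { VF = 4, N = 32, PAD = 8 };

/* Overlap test: true if one vector at X and one vector at Y are
   completely disjoint.  */
static bool
no_overlap_p (const int *x, const int *y)
{
  uintptr_t a = (uintptr_t) x, b = (uintptr_t) y;
  size_t bytes = VF * sizeof (int);
  return a + bytes <= b || b + bytes <= a;
}

/* Relaxed test for a well-ordered WAR dependence (read Y, then write
   X): it only fails if X is ahead of Y by less than one vector.  */
static bool
war_safe_p (const int *x, const int *y)
{
  uintptr_t a = (uintptr_t) x, b = (uintptr_t) y;
  return a <= b || a - b >= VF * sizeof (int);
}

/* The scalar loop from f1.  */
static void
f1_scalar (int *x, int *y)
{
  for (int i = 0; i < N; ++i)
    x[i] += y[i];
}

/* Model of the vector loop: load VF elements of X and Y, add them,
   then store VF elements to X.  */
static void
f1_vector_model (int *x, int *y)
{
  for (int i = 0; i < N; i += VF)
    {
      int vx[VF], vy[VF];
      memcpy (vx, x + i, sizeof vx);
      memcpy (vy, y + i, sizeof vy);
      for (int j = 0; j < VF; ++j)
        vx[j] += vy[j];
      memcpy (x + i, vx, sizeof vx);
    }
}

/* Return true if the vector model and the scalar loop produce the
   same memory contents when X is D elements ahead of Y.  */
static bool
matches_scalar (int d)
{
  int a[N + 2 * PAD], b[N + 2 * PAD];
  for (int i = 0; i < N + 2 * PAD; ++i)
    a[i] = b[i] = i + 1;
  f1_scalar (a + PAD + d, a + PAD);
  f1_vector_model (b + PAD + d, b + PAD);
  return memcmp (a, b, sizeof a) == 0;
}

/* Check that the relaxed test never accepts an offset for which the
   vector model goes wrong, and that it accepts everything the
   overlap test accepts, plus x == y.  */
static void
check_offsets (void)
{
  int buf[N + 2 * PAD];   /* Only the addresses are used.  */
  for (int d = -PAD; d <= PAD; ++d)
    {
      int *y = buf + PAD, *x = y + d;
      if (war_safe_p (x, y))
        assert (matches_scalar (d));
      if (no_overlap_p (x, y))
        assert (war_safe_p (x, y));
    }
  assert (war_safe_p (buf, buf) && !no_overlap_p (buf, buf));
}
```

Calling check_offsets () from a test driver should pass silently.  In
this model the relaxed test accepts x == y and every x < y, which the
overlap test rejects whenever the two vectors touch, while still
rejecting x ahead of y by 1 to VF - 1 elements.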

The main reason for doing this now is that SVE2 has instructions that
detect RAW and WAR/WAW hazards between two vector pointers.  A
follow-on patch adds support for them.

Each patch tested individually on aarch64-linux-gnu and the series as
a whole on x86_64-linux-gnu.  OK to install?

Richard