On Mon, 26 Sep 2011, Jakub Jelinek wrote:

> Hi!
>
> Adding Joseph and Jason to CC.
>
> On Mon, Sep 26, 2011 at 04:56:20PM +0200, Richard Guenther wrote:
> > Let's see what kind of fallout we get ;)  For example, if the
> > following is valid C code I expect we will vectorize the second
> > loop (disambiguating p[i] and q[i]) bogously:
> >
> > void foo (int *p)
> > {
> >   int * __restrict p1 = p;
> >   int * __restrict p2 = p + 32;
> >   int *q;
> >   int i;
> >   for (i = 0; i < 32; ++i)
> >     p1[i] = p2[i];
> >   p = p1;
> >   q = p2 - 31;
> >   for (i = 0; i < 32; ++i)
> >     p[i] = q[i];
> > }
> >
> > because p and q base on different restrict qualified pointers
> > (p1 and p2 respective).  At the moment we are safe from this
> > because of the TYPE_RESTRICT checks.
> >
> > Any opinion on the above?  Is it valid to base non-restrict
> > pointers on restrict ones?  It would be sort-of weird at least,
> > but at least I don't think the first loop use is bogus (even
> > though the pointed-to objects are the same).
>
> If the last loop was
>   for (i = 0; i < 32; i++)
>     q[i] = p[i];
> then I believe the above would be clearly invalid C99, because
> an object X (say incoming p[4]) would be modified in the same block

That can be fixed by adding some {}s and using a different decl for
the 2nd p/q.

> using a pointer based on p1 and using a pointer not based on p1
> (q), which would violate the requirements that if the object is
> modified through lvalue whose address is based on p1, all modifications
> to B in that block should be done through lvalues whose address is
> based on p1.  In the above testcase all modifications are made through
> lvalues whose addresses are p1 based though, so it is less clear.
> Joseph?
>
> Anyway, GCC currently makes p1 and p2 use the same restrict
> base actually, and does vectorize both loops before the patch,
> after the patch as well as with -D__restrict=
> (the vectorization of the second loop is guarded with non-overlap
> check).  It looks like a gimplification is wrong:
>   int * restrict p1 = p;
>
>   int * restrict p2 = p + 128;
>
> is gimplified into:
>   p1 = (int * restrict) p;
>
>   p.0 = (int * restrict) p;
>
>   p2 = p.0 + 128;
>
> which IMHO is incorrect, I'd say p.0 shouldn't be int * restrict,
> but plain int *, only the final value should be TYPE_RESTRICT.

Yeah, but maybe it's already fold that transforms
(int * restrict)(p + 128) to (int * restrict)p + 128.  I'm pretty
sure it is.

Richard.
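
For reference, a minimal sketch of one possible reading of the "{}s and
a different decl" remark above; it is an assumption about what was
meant, not a testcase from the thread.  Each loop gets its own block,
the restrict-qualified pointers are confined to the first block, and the
second loop uses fresh non-restrict declarations.  The names p3 and demo
are illustrative only.  Written this way the program is well-defined
however the restrict question is settled, so it only pins down the
result a correct compiler must produce for the overlapping second loop;
it does not reproduce the suspected miscompilation.

```c
#include <assert.h>

/* The thread's testcase with each loop in its own block.  The names p3
   and demo are illustrative and do not appear in the thread. */
void foo (int *p)
{
  {
    /* Restrict-qualified pointers live only in this block; they access
       disjoint halves of a 64-element buffer. */
    int * __restrict p1 = p;
    int * __restrict p2 = p + 32;
    int i;
    for (i = 0; i < 32; ++i)
      p1[i] = p2[i];
  }
  {
    /* Separate, non-restrict declarations for the second loop.
       q equals the original "p2 - 31", i.e. p + 1, so the two
       pointers overlap and the copy must behave as if run in order. */
    int *p3 = p;
    int *q = p + 1;
    int i;
    for (i = 0; i < 32; ++i)
      p3[i] = q[i];
  }
}

/* Fill a buffer with 0..63, run foo, and check the in-order result. */
void demo (void)
{
  int a[64];
  int i;
  for (i = 0; i < 64; ++i)
    a[i] = i;
  foo (a);
  /* After the first loop a[0..31] holds 32..63; the second loop then
     shifts a[1..32] down by one element. */
  for (i = 0; i < 31; ++i)
    assert (a[i] == 33 + i);
  assert (a[31] == 32);
  for (i = 32; i < 64; ++i)
    assert (a[i] == i);
}
```

Calling demo () from main at -O3 and again with -D__restrict= should
give the same result; a difference would point at the second loop being
disambiguated when it should not be.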