On Thu, Jun 26, 2014 at 9:11 AM, Arthur O'Dwyer <[email protected]> wrote:
> On Thu, Jun 26, 2014 at 4:10 AM, Richard Smith <[email protected]> wrote:
> > On Wed, Jun 25, 2014 at 7:34 PM, Arthur O'Dwyer <[email protected]> wrote:
> >> On Wed, Jun 25, 2014 at 3:26 PM, Sanjin Sijaric <[email protected]> wrote:
> >> >
> >> >> int *p;
> >> >> typedef struct {
> >> >>   char a;
> >> >>   char b[100];
> >> >>   char c;
> >> >> } S;
> >> >>
> >> >> S x;
> >> >>
> >> >> void func1 (char d) {
> >> >>   for (int i = 0; i < 100; i++) {
> >> >>     x.b[i] += 1;
> >> >>     d = *p;
> >> >>     x.a += d;
> >> >>   }
> >> >> }
> >> >>
> >> >> It seems like you want the compiler to hoist the read of `*p` above the
> >> >> write to `x.b[i]`.
> >> >> But that isn't generally possible, is it? because the caller might have
> >> >> executed
> >> >>
> >> >> p = &x.b[3];
> >> >>
> >> >> before the call to func1.
> […]
> > I think that's backwards from the intent: if you swap over 'int' and 'char'
> > in the example, we cannot do the reordering, because p could point to (some
> > byte of) one of the ints.
> >
> > With the test as-is, we *can* reorder the *p load (and even move it out of
> > the loop):
> > -- *p cannot alias x.b[i], because if 'x.b[i] += 1' has defined behavior,
> > then x is an object of type S and x.b is an object of type char[100] and 0
> > <= i < 100, and therefore there is no int object aliased by that store
>
> You left out one step there, which needs to be made explicit IMO: the
> fact that we *store* into a *single byte* of x.b is important.
> Obviously we could not do the same optimization on
>
>     size_t nbytes = sizeof(int);
>     for (int i = 0; i < 100; i++) {
>         __builtin_memcpy(x.b, &i, nbytes);
>         d = *p;
>         x.a += d;
>     }
>
> because that would basically nerf "omnipotent char" (and "placement
> new" in general).

Even in this case, you can in principle hoist out the load. The expression
'x.b' would have undefined behavior if there weren't an S object at that
address, and if there's an S object, there's not an int. This is probably
more than we want to optimize -- it'll break too much real world code --
even with -fstruct-path-tbaa.

> If the compiler can actually detect that the size
> of the last access was incommensurate with the size of the load being
> reordered, then I withdraw my objection, but otherwise it really seems
> like this is a super dangerous optimization.
>
>
> > -- *p cannot alias x.a, because if 'x.a += d' has defined behavior, then x
> > is an object of type S, so a store to S::a cannot alias any int object.
>
> This one I agree with your logic, btw. A named object of type 'char'
> (being only one byte in size) clearly cannot alias with type 'int'
> (being four bytes in size). My objections apply only to cases
> analogous to
>
>     template<class T>
>     struct Container {
>         char data[N + sizeof T];
>     };
>
> –Arthur
>
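
For concreteness, the hoisting being discussed can be written out by hand
roughly as below. This is only a sketch, not compiler output: func1_hoisted
and versions_agree are hypothetical names that do not appear in the thread,
and the comparison only covers the case where p points at an int object that
is separate from x, which is the case the aliasing argument says the compiler
may assume.

```c
#include <assert.h>
#include <string.h>

typedef struct {
    char a;
    char b[100];
    char c;
} S;

S x;      /* the global object from the test case */
int *p;   /* the global pointer from the test case */

/* The loop as posted: *p is reloaded on every iteration. */
void func1(char d) {
    for (int i = 0; i < 100; i++) {
        x.b[i] += 1;
        d = *p;
        x.a += d;
    }
}

/* Hypothetical hand-hoisted form: *p is loaded once, before the loop. */
void func1_hoisted(char d) {
    d = *p;
    for (int i = 0; i < 100; i++) {
        x.b[i] += 1;
        x.a += d;
    }
}

/* Returns 1 if both versions leave x in the same state when p points at
   an int that is not part of x (hypothetical helper for illustration). */
int versions_agree(int value) {
    /* Keep 100 * value representable even where char is signed. */
    assert(value >= -1 && value <= 1);

    int target = value;   /* a genuine int object, distinct from x */
    S first, second;

    p = &target;

    memset(&x, 0, sizeof x);
    func1(0);
    first = x;

    memset(&x, 0, sizeof x);
    func1_hoisted(0);
    second = x;

    /* S consists only of char members, so it has no padding to compare. */
    return memcmp(&first, &second, sizeof(S)) == 0;
}
```

With p pointing at a separate int, versions_agree(1) returns 1, since neither
store in the loop can change *p. The sketch deliberately does not exercise
p = (int *)&x.b[3]; by the argument above, reading *p in that case already
has undefined behavior, which is what would license the transformation.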
_______________________________________________
cfe-commits mailing list
[email protected]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
