[Bug middle-end/85720] bad codegen for looped assignment of primitives at -O2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85720

--- Comment #4 from Mathias Stearn ---
(In reply to Marc Glisse from comment #3)
> Again, you are ignoring aliasing issues (just like in your other PR the
> function copy isn't equivalent to memmove). Does adding __restrict change
> the result? Also, B[i]=B[i]+1 doesn't look like a memset...

Sorry, I typoed. It was supposed to be B[i] = A[i] + 1. That still does
basically the same thing though: https://godbolt.org/g/dtmU5t.

Good point about aliasing though. I guess the right codegen in that case
would actually be something that detected the overlap and did the right
calls to memset to only set each byte once. Or just do the simple thing:

    if (b > a && b < a + n) {
        memset(b, 1, n);
        memset(a, 0, n);
    } else {
        memset(a, 0, n);
        memset(b, 1, n);
    }

Yes, __restrict helps, but that isn't part of standard C++, and it seems
like it never will be.
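
The overlap check in comment #4 can be sketched in C++ and compared against the plain loop. This is only an illustration under assumptions: the element type (unsigned char) and the names loop_version, memset_version and versions_agree are hypothetical, since the thread gives only the loop body and the if/else. Note that b > a only gives a meaningful result when both pointers point into the same array, which the helper arranges by placing both ranges in one buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// The loop from the thread, written for byte arrays that may overlap.
void loop_version(unsigned char* a, unsigned char* b, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        a[i] = 0;
        b[i] = static_cast<unsigned char>(a[i] + 1);
    }
}

// The "simple thing" from comment #4: pick the memset order by overlap.
void memset_version(unsigned char* a, unsigned char* b, std::size_t n) {
    if (b > a && b < a + n) {
        // b starts inside a: write b first so a's zeros win in the overlap.
        std::memset(b, 1, n);
        std::memset(a, 0, n);
    } else {
        // Disjoint, identical, or a starts inside b: b's ones win.
        std::memset(a, 0, n);
        std::memset(b, 1, n);
    }
}

// Hypothetical helper: run both versions on identical buffers, with a and b
// placed at the given offsets, and report whether the results match.
bool versions_agree(std::size_t a_off, std::size_t b_off, std::size_t n) {
    assert(n > 0);
    const std::size_t size = (a_off > b_off ? a_off : b_off) + n;
    std::vector<unsigned char> x(size, 0xAA), y(size, 0xAA);
    loop_version(x.data() + a_off, x.data() + b_off, n);
    memset_version(y.data() + a_off, y.data() + b_off, n);
    return x == y;
}
```

Calling versions_agree over a range of offsets covers the disjoint, identical, and partially overlapping cases in both directions.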
[Bug middle-end/85720] bad codegen for looped assignment of primitives at -O2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85720

--- Comment #3 from Marc Glisse ---
(In reply to Mathias Stearn from comment #2)
> Hmm. Taking the example from the -ftree-loop-distribute-patterns
> documentation, it still seems to generate poor code, this time at both -O2
> and -O3: https://godbolt.org/g/EsQDj8
>
> Why isn't that transformed to memset(A, 0, N); memset(B, 1, N); ? This feels
> similar to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85721. Should I make
> a new ticket with this example?

Again, you are ignoring aliasing issues (just like in your other PR the
function copy isn't equivalent to memmove). Does adding __restrict change
the result? Also, B[i]=B[i]+1 doesn't look like a memset...
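
For reference, the __restrict variant asked about here might look like the sketch below. The function name fill_restrict, the helper demo_fill, and the unsigned char element type are assumptions for illustration, not the code behind the Godbolt links. __restrict is a compiler extension rather than standard C++; it promises the compiler that the two ranges do not overlap, so calling the function with overlapping ranges is undefined. Whether a given GCC version then emits two memset calls is best checked on the actual compiler.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Same loop, with a no-overlap promise on both pointers.
// __restrict is a GCC/Clang/MSVC extension, not standard C++.
void fill_restrict(unsigned char* __restrict a,
                   unsigned char* __restrict b,
                   std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        a[i] = 0;
        b[i] = static_cast<unsigned char>(a[i] + 1);
    }
}

// Usage: two separate vectors never overlap, so the promise holds.
void demo_fill(std::size_t n) {
    std::vector<unsigned char> a(n, 0xAA), b(n, 0xAA);
    fill_restrict(a.data(), b.data(), n);
    for (std::size_t i = 0; i < n; ++i) {
        assert(a[i] == 0);  // every element of a is zeroed
        assert(b[i] == 1);  // every element of b is a[i] + 1
    }
}
```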
[Bug middle-end/85720] bad codegen for looped assignment of primitives at -O2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85720

--- Comment #2 from Mathias Stearn ---
Hmm. Taking the example from the -ftree-loop-distribute-patterns
documentation, it still seems to generate poor code, this time at both -O2
and -O3: https://godbolt.org/g/EsQDj8

Why isn't that transformed to memset(A, 0, N); memset(B, 1, N); ? This feels
similar to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85721. Should I make
a new ticket with this example?