[Bug tree-optimization/68953] [6 Regression] [graphite] Wrong code w/ -O[12] -floop-nest-optimize
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68953

vries at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #11 from vries at gcc dot gnu.org ---
Patch with test-case committed, follow-up question answered
(https://gcc.gnu.org/ml/gcc-patches/2016-04/msg00473.html).
Marking resolved-fixed.
[Bug tree-optimization/68953] [6 Regression] [graphite] Wrong code w/ -O[12] -floop-nest-optimize
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68953

--- Comment #10 from vries at gcc dot gnu.org ---
Asked follow-up question
(https://gcc.gnu.org/ml/gcc-patches/2016-04/msg00444.html), waiting for
answer before marking resolved-fixed.
[Bug tree-optimization/68953] [6 Regression] [graphite] Wrong code w/ -O[12] -floop-nest-optimize
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68953

--- Comment #9 from vries at gcc dot gnu.org ---
Author: vries
Date: Sat Apr  9 15:28:24 2016
New Revision: 234851

URL: https://gcc.gnu.org/viewcvs?rev=234851&root=gcc&view=rev
Log:
Fix pdr accesses order

2016-04-09  Tom de Vries

	PR tree-optimization/68953
	* graphite-sese-to-poly.c (pdr_add_memory_accesses): Order accesses
	from first to last subscript.

	* gcc.dg/graphite/pr68953.c: New test.

Added:
    trunk/gcc/testsuite/gcc.dg/graphite/pr68953.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/graphite-sese-to-poly.c
    trunk/gcc/testsuite/ChangeLog
[Bug tree-optimization/68953] [6 Regression] [graphite] Wrong code w/ -O[12] -floop-nest-optimize
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68953

vries at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|unassigned at gcc dot gnu.org      |vries at gcc dot gnu.org

--- Comment #8 from vries at gcc dot gnu.org ---
Stage1 approved: https://gcc.gnu.org/ml/gcc-patches/2016-04/msg00430.html
[Bug tree-optimization/68953] [6 Regression] [graphite] Wrong code w/ -O[12] -floop-nest-optimize
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68953

vries at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |patch

--- Comment #7 from vries at gcc dot gnu.org ---
https://gcc.gnu.org/ml/gcc-patches/2016-04/msg00373.html
[Bug tree-optimization/68953] [6 Regression] [graphite] Wrong code w/ -O[12] -floop-nest-optimize
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68953

--- Comment #6 from vries at gcc dot gnu.org ---
(In reply to vries from comment #5)
> The patch changes the order of the subscript functions

Oops, that's accesses, actually.

> (that was the easiest
> for me to implement) to:
>   [alias set, first subscript, last subscript]
[Bug tree-optimization/68953] [6 Regression] [graphite] Wrong code w/ -O[12] -floop-nest-optimize
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68953

--- Comment #5 from vries at gcc dot gnu.org ---
Created attachment 38207
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=38207&action=edit
demonstrator patch

In add_pdr_constraints, for the EXTRADIM=0 case we have:
...
accesses: { S_4[i1, i2] -> [1, 1 + i1] }
subscript_sizes: { [1, i1] : i1 >= 0 and i1 <= 3 }
intersection: { S_4[i1, i2] -> [1, 1 + i1] : i1 >= -1 and i1 <= 2 }
...

but for the EXTRADIM=1 case, we have:
...
accesses: { S_4[i1, i2] -> [1, 0, 1 + i1] }
subscript_sizes: { [1, i1, 0] : i1 >= 0 and i1 <= 3 }
intersection: { S_4[-1, i2] -> [1, 0, 0] }
...

Actually, the accesses are ordered:
  [alias set, last subscript, first subscript]
and the subscript sizes are ordered:
  [alias set range, first subscript range, last subscript range]
and that explains why the intersection gives unintended results.

The patch changes the order of the subscript functions (that was the easiest
for me to implement) to:
  [alias set, first subscript, last subscript]
and we get a more reasonable intersection (similar to the EXTRADIM=0 case):
...
accesses: { S_4[i1, i2] -> [1, 1 + i1, 0] }
subscript_sizes: { [1, i1, 0] : i1 >= 0 and i1 <= 3 }
intersection: { S_4[i1, i2] -> [1, 1 + i1, 0] : i1 >= -1 and i1 <= 2 }
...
and consequently, correct dependences, and the wrong-code issue is fixed.

Atm though I've got no clue about the overall effect of this change, or what
the actual fix should look like.
[Bug tree-optimization/68953] [6 Regression] [graphite] Wrong code w/ -O[12] -floop-nest-optimize
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68953

--- Comment #4 from vries at gcc dot gnu.org ---
Created attachment 38178
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=38178&action=edit
UDIFF
[Bug tree-optimization/68953] [6 Regression] [graphite] Wrong code w/ -O[12] -floop-nest-optimize
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68953

--- Comment #3 from vries at gcc dot gnu.org ---
A way to look at the problem is to compare against the dump info for the
variant without the extra (redundant) dimension.

So, compare dump-info for -DEXTRADIM={0,1} for this source:
...
#if EXTRADIM
int yu[4][1] = { { 1 }, { 2 }, { 3 }, { 4 } };

int
main (void)
{
  int zh, ro;

  for (zh = 0; zh < 2; ++zh)
    for (ro = 0; ro < 3; ++ro)
      yu[ro][0] = yu[zh + 1][0];

  return yu[0][0]; /* Should be 2, but returns 3.  */
}
#else
int yu[4] = { 1, 2, 3, 4 };

int
main (void)
{
  int zh, ro;

  for (zh = 0; zh < 2; ++zh)
    for (ro = 0; ro < 3; ++ro)
      yu[ro] = yu[zh + 1];

  return yu[0]; /* Returns 2.  */
}
#endif
...

This shows a bit of the unified diff of the dump info (EXTRADIM=0 lines
marked '-', EXTRADIM=1 lines marked '+'). The fact that the data references
show no reads is probably already indicative of a problem:
...
 data references (
-  reads: { S_4[i1, i2] -> [1, 1 + i1] : i1 >= 0 and i1 <= 1 and i2 <= 2 and i2 >= 0 }
-  must_writes: { S_4[i1, i2] -> [1, i2] : i2 >= 0 and i2 <= 2 and i1 >= 0 and i1 <= 1 }
+  reads: { }
+  must_writes: { S_4[i1, 0] -> [1, 0, 0] : i1 >= 0 and i1 <= 1 }
   may_writes: { }
 )
 data dependences (
-{ S_4[i1, i2] -> S_4[i1, i2'] : i2' >= 1 + i1 and i2' <= 1 + i1 + i2 and i2' >= 1 + i2 and i2' <= 2;
-  S_4[0, i2] -> S_4[1, i2'] : i2' >= 2 - i2 and \
-    i2' <= i2 and i2 <= 2;
-  S_4[0, 0] -> S_4[1, 0] }
+{ S_4[0, 0] -> S_4[1, 0] }
 )
...
[Bug tree-optimization/68953] [6 Regression] [graphite] Wrong code w/ -O[12] -floop-nest-optimize
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68953

--- Comment #2 from vries at gcc dot gnu.org ---
(In reply to vries from comment #1)
> Created attachment 38141 [details]
> updated test-case
>
> Fails with -O1, passes with -O2.
>
> The problem is that this loop nest:
> ...
>   for (zh = 0; zh < 2; ++zh)
>     for (ro = 0; ro < 3; ++ro)
>       yu[ro][0] = yu[zh + 1][0];
> ...

Unrolled, that looks like:
  /* yu == { { 1 }, { 2 }, { 3 }, { 4 } }.  */
  yu[0][0] = yu[1][0]; /* zh == 0, ro == 0.  */
  /* yu == { { 2 }, { 2 }, { 3 }, { 4 } }.  */
  yu[1][0] = yu[1][0]; /* zh == 0, ro == 1.  */
  /* yu == { { 2 }, { 2 }, { 3 }, { 4 } }.  */
  yu[2][0] = yu[1][0]; /* zh == 0, ro == 2.  */
  /* yu == { { 2 }, { 2 }, { 2 }, { 4 } }.  */
  yu[0][0] = yu[2][0]; /* zh == 1, ro == 0.  */
  /* yu == { { 2 }, { 2 }, { 2 }, { 4 } }.  */
  yu[1][0] = yu[2][0]; /* zh == 1, ro == 1.  */
  /* yu == { { 2 }, { 2 }, { 2 }, { 4 } }.  */
  yu[2][0] = yu[2][0]; /* zh == 1, ro == 2.  */
  /* yu == { { 2 }, { 2 }, { 2 }, { 4 } }.  */

> is rewritten into this loop nest, which has different semantics:
> ...
>   for (ro = 0; ro < 3; ++ro)
>     for (zh = 0; zh < 2; ++zh)
>       yu[ro][0] = yu[zh + 1][0];
> ...

and unrolled, this looks like:
  /* yu == { { 1 }, { 2 }, { 3 }, { 4 } }.  */
  yu[0][0] = yu[1][0]; /* zh == 0, ro == 0.  */
  /* yu == { { 2 }, { 2 }, { 3 }, { 4 } }.  */
  yu[0][0] = yu[2][0]; /* zh == 1, ro == 0.  */
  /* yu == { { 3 }, { 2 }, { 3 }, { 4 } }.  */
  yu[1][0] = yu[1][0]; /* zh == 0, ro == 1.  */
  /* yu == { { 3 }, { 2 }, { 3 }, { 4 } }.  */
  yu[1][0] = yu[2][0]; /* zh == 1, ro == 1.  */
  /* yu == { { 3 }, { 3 }, { 3 }, { 4 } }.  */
  yu[2][0] = yu[1][0]; /* zh == 0, ro == 2.  */
  /* yu == { { 3 }, { 3 }, { 3 }, { 4 } }.  */
  yu[2][0] = yu[2][0]; /* zh == 1, ro == 2.  */
  /* yu == { { 3 }, { 3 }, { 3 }, { 4 } }.  */
[Bug tree-optimization/68953] [6 Regression] [graphite] Wrong code w/ -O[12] -floop-nest-optimize
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68953

--- Comment #1 from vries at gcc dot gnu.org ---
Created attachment 38141
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=38141&action=edit
updated test-case

Fails with -O1, passes with -O2.

The problem is that this loop nest:
...
  for (zh = 0; zh < 2; ++zh)
    for (ro = 0; ro < 3; ++ro)
      yu[ro][0] = yu[zh + 1][0];
...
is rewritten into this loop nest, which has different semantics:
...
  for (ro = 0; ro < 3; ++ro)
    for (zh = 0; zh < 2; ++zh)
      yu[ro][0] = yu[zh + 1][0];
...
[Bug tree-optimization/68953] [6 Regression] [graphite] Wrong code w/ -O[12] -floop-nest-optimize
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68953

vries at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2016-03-08
                 CC|                            |vries at gcc dot gnu.org
     Ever confirmed|0                           |1
[Bug tree-optimization/68953] [6 Regression] [graphite] Wrong code w/ -O[12] -floop-nest-optimize
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68953

Jeffrey A. Law changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P4
                 CC|                            |law at redhat dot com
[Bug tree-optimization/68953] [6 Regression] [graphite] Wrong code w/ -O[12] -floop-nest-optimize
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68953

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |spop at gcc dot gnu.org
   Target Milestone|---                         |6.0