[Bug rtl-optimization/44838] [4.6 regression] RTL loop unrolling causes FAIL: gcc.dg/pr39794.c
--- Comment #31 from rguenth at gcc dot gnu dot org 2010-07-08 09:09 ---
Subject: Bug 44838

Author: rguenth
Date: Thu Jul 8 09:09:15 2010
New Revision: 161945

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=161945
Log:
2010-07-08  Richard Guenther  rguent...@suse.de

	PR rtl-optimization/44838
	* tree-ssa-alias.c (indirect_refs_may_alias_p): When not in SSA form
	do not use pointer equivalence.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/tree-ssa-alias.c

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44838
[Bug rtl-optimization/44838] [4.6 regression] RTL loop unrolling causes FAIL: gcc.dg/pr39794.c
--- Comment #32 from rguenth at gcc dot gnu dot org 2010-07-08 09:16 ---
Fixed.

--
rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|                            |FIXED

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44838
[Bug rtl-optimization/44838] [4.6 regression] RTL loop unrolling causes FAIL: gcc.dg/pr39794.c
--- Comment #33 from hjl at gcc dot gnu dot org 2010-07-08 13:40 ---
Subject: Bug 44838

Author: hjl
Date: Thu Jul 8 13:40:24 2010
New Revision: 161953

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=161953
Log:
Add gcc.dg/pr44838.c.

2010-07-08  H.J. Lu  hongjiu...@intel.com

	PR rtl-optimization/44838
	* gcc.dg/pr44838.c: New.

Added:
    trunk/gcc/testsuite/gcc.dg/pr44838.c
Modified:
    trunk/gcc/testsuite/ChangeLog

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44838
[Bug rtl-optimization/44838] [4.6 regression] RTL loop unrolling causes FAIL: gcc.dg/pr39794.c
--- Comment #14 from rguenth at gcc dot gnu dot org 2010-07-07 09:01 ---
Huh. Unrolling preserves MEM_ATTRs even though it re-writes the RTXen.
That causes scheduling to see just a bunch of repeated

(insn 218 309 219 18 t.c:9 (parallel [
            (set (mem:SI (reg/v/f:DI 1 dx [orig:100 a ] [100]) [2 *a_22+0 S4 A32])
                (ashift:SI (mem:SI (reg/v/f:DI 1 dx [orig:100 a ] [100]) [2 *a_22+0 S4 A32])
                    (const_int 1 [0x1])))
            (clobber (reg:CC 17 flags))
        ]) 490 {*ashlsi3_1} (expr_list:REG_UNUSED (reg:CC 17 flags)
        (expr_list:REG_EQUAL (ashift:SI (mem:SI (plus:DI (reg/v/f:DI 0 ax [orig:84 a ] [84])
                        (const_int 16 [0x10])) [2 *a_22+0 S4 A32])
                (const_int 1 [0x1]))
            (nil

(insn 220 219 221 18 t.c:10 (set (reg:SI 4 si [103])
        (mem:SI (plus:DI (reg/v/f:DI 1 dx [orig:100 a ] [100])
                (const_int -8 [0xfff8])) [2 MEM[(int *)a_22 + -8B]+0 S4 A32])) 63 {*movsi_internal} (nil))

(insn 221 220 222 18 t.c:10 (parallel [
            (set (reg:SI 4 si [103])
                (plus:SI (reg:SI 4 si [103])
                    (mem:SI (plus:DI (reg/v/f:DI 1 dx [orig:100 a ] [100])
                            (const_int -4 [0xfffc])) [2 MEM[(int *)a_22 + -4B]+0 S4 A32])))
            (clobber (reg:CC 17 flags))
        ]) 251 {*addsi_1} (expr_list:REG_UNUSED (reg:CC 17 flags)
        (nil)))

(insn 222 221 310 18 t.c:10 (set (mem:SI (plus:DI (reg/v/f:DI 1 dx [orig:100 a ] [100])
                (const_int 4 [0x4])) [2 MEM[(int *)a_22 + 4B]+0 S4 A32])
        (reg:SI 4 si [103])) 63 {*movsi_internal} (expr_list:REG_DEAD (reg:SI 4 si [103])
        (expr_list:REG_DEAD (reg/v/f:DI 1 dx [orig:100 a ] [100])
            (expr_list:REG_EQUAL (plus:SI (mem:SI (plus:DI (reg/v/f:DI 1 dx [orig:100 a ] [100])
                            (const_int -8 [0xfff8])) [2 MEM[(int *)a_22 + -8B]+0 S4 A32])
                    (mem:SI (plus:DI (reg/v/f:DI 1 dx [orig:100 a ] [100])
                            (const_int -4 [0xfffc])) [2 MEM[(int *)a_22 + -4B]+0 S4 A32]))
                (nil)

where there is obviously no conflicts between the above patterns
during different unrolled copies.  Who is supposed to magically
deal with that?  (or what is supposed to prevent this from happening?)

--
rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rakdver at gcc dot gnu dot org
            Summary|[4.6 regression] RTL loop   |[4.6 regression] RTL loop
                   |unrolling and scheduling    |unrolling causes FAIL:
                   |cause FAIL: gcc.dg/pr39794.c|gcc.dg/pr39794.c

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44838
[Bug rtl-optimization/44838] [4.6 regression] RTL loop unrolling causes FAIL: gcc.dg/pr39794.c
--- Comment #15 from rakdver at kam dot mff dot cuni dot cz 2010-07-07 09:37 ---
Subject: Re:  [4.6 regression] RTL loop unrolling causes FAIL: gcc.dg/pr39794.c

> --- Comment #14 from rguenth at gcc dot gnu dot org 2010-07-07 09:01 ---
> Huh. Unrolling preserves MEM_ATTRs even though it re-writes the RTXen.
> That causes scheduling to see just a bunch of repeated
>
> where there is obviously no conflicts between the above patterns
> during different unrolled copies.  Who is supposed to magically
> deal with that?  (or what is supposed to prevent this from happening?)

I am not sure what you mean -- I may be misunderstanding how rtl alias
analysis works, but as far as I can tell, what unroller does (just preserving
the MEM_ATTRs) is conservatively correct (so, potentially it may make us
believe that there are dependences that are not really present, but it should
not cause a wrong-code bug).

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44838
[Bug rtl-optimization/44838] [4.6 regression] RTL loop unrolling causes FAIL: gcc.dg/pr39794.c
--- Comment #16 from amonakov at gcc dot gnu dot org 2010-07-07 09:54 ---
(In reply to comment #15)
> Subject: Re:  [4.6 regression] RTL loop unrolling causes FAIL: gcc.dg/pr39794.c
>
> I am not sure what you mean -- I may be misunderstanding how rtl alias
> analysis works, but as far as I can tell, what unroller does (just preserving
> the MEM_ATTRs) is conservatively correct (so, potentially it may make us
> believe that there are dependences that are not really present, but it should
> not cause a wrong-code bug).

Consider this simplified example:

for (i ...)
  {
    /*A*/ t = a[i];
    /*B*/ a[i+1] = t;
  }

MEM_ATTRS would indicate that memory references in A and B do not alias.
Unrolling by 2 produces:

for (i ...)
  {
    /*A */ t = a[i];
    /*B */ a[i+1] = t;
    /*A'*/ t = a[i+1];
    /*B'*/ a[i+2] = t;
  }

Preserving MEM_ATTRS wrongly indicates that memory references in B and A' do
not alias, and the scheduler then may happen to lift A' above B.

--
amonakov at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |amonakov at gcc dot gnu dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44838
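The hazard described in comment #16 can be reproduced in plain C. This is only an illustrative sketch, not GCC output: the function names unrolled_in_order and unrolled_hoisted are made up for this example, and the hoisting of A' above B is done by hand to mimic what a scheduler might do if it trusted the stale "B and A' do not alias" information.

```c
#include <stddef.h>

/* Loop from comment #16, unrolled by 2, statements in program order. */
static void unrolled_in_order(int *a, size_t n)
{
    for (size_t i = 0; i + 2 < n; i += 2) {
        int t;
        /*A */ t = a[i];
        /*B */ a[i + 1] = t;
        /*A'*/ t = a[i + 1];      /* must see the value B just stored */
        /*B'*/ a[i + 2] = t;
    }
}

/* Same body with A' moved above B (hypothetical bad schedule).
   A' now reads the old a[i+1], so the result changes. */
static void unrolled_hoisted(int *a, size_t n)
{
    for (size_t i = 0; i + 2 < n; i += 2) {
        int t, t2;
        /*A */ t = a[i];
        /*A'*/ t2 = a[i + 1];     /* stale read: B has not executed yet */
        /*B */ a[i + 1] = t;
        /*B'*/ a[i + 2] = t2;
    }
}
```

Running both functions on copies of the same array (say 1, 2, 3, ...) gives different contents, which is the wrong-code symptom: B and A' touch the same location a[i+1], so they cannot be reordered.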
[Bug rtl-optimization/44838] [4.6 regression] RTL loop unrolling causes FAIL: gcc.dg/pr39794.c
--- Comment #17 from rakdver at kam dot mff dot cuni dot cz 2010-07-07 10:09 ---
Subject: Re:  [4.6 regression] RTL loop unrolling causes FAIL: gcc.dg/pr39794.c

> (In reply to comment #15)
> > Subject: Re:  [4.6 regression] RTL loop unrolling causes FAIL: gcc.dg/pr39794.c
> >
> > I am not sure what you mean -- I may be misunderstanding how rtl alias
> > analysis works, but as far as I can tell, what unroller does (just
> > preserving the MEM_ATTRs) is conservatively correct (so, potentially it may
> > make us believe that there are dependences that are not really present, but
> > it should not cause a wrong-code bug).
>
> Consider this simplified example:
>
> for (i ...)
>   {
>     /*A*/ t = a[i];
>     /*B*/ a[i+1] = t;
>   }
>
> MEM_ATTRS would indicate that memory references in A and B do not alias.

but this is clearly wrong, since B in iteration X aliases with A in iteration
X+1.  So, not a problem in unroller.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44838
[Bug rtl-optimization/44838] [4.6 regression] RTL loop unrolling causes FAIL: gcc.dg/pr39794.c
--- Comment #18 from rguenth at gcc dot gnu dot org 2010-07-07 10:30 ---
(In reply to comment #17)
> Subject: Re:  [4.6 regression] RTL loop unrolling causes FAIL: gcc.dg/pr39794.c
>
> > (In reply to comment #15)
> > > Subject: Re:  [4.6 regression] RTL loop unrolling causes FAIL: gcc.dg/pr39794.c
> > >
> > > I am not sure what you mean -- I may be misunderstanding how rtl alias
> > > analysis works, but as far as I can tell, what unroller does (just
> > > preserving the MEM_ATTRs) is conservatively correct (so, potentially it
> > > may make us believe that there are dependences that are not really
> > > present, but it should not cause a wrong-code bug).
> >
> > Consider this simplified example:
> >
> > for (i ...)
> >   {
> >     /*A*/ t = a[i];
> >     /*B*/ a[i+1] = t;
> >   }
> >
> > MEM_ATTRS would indicate that memory references in A and B do not alias.
>
> but this is clearly wrong, since B in iteration X aliases with A in iteration
> X+1.  So, not a problem in unroller.

It is not wrong.  You have the two identical pointers p = &a[i] and q = p + 1.
*p and *q do not alias.  Never.

Now unrolling rewrites p and q but does not adjust MEM_ATTRs.  So alias
information still claims the same pointer bases are used for every unrolled
load/store, which is certainly not true.

(In the past we didn't preserve pointer bases at all, which is why we didn't
hit this before.  Starting with 4.5.0 and export of points-to information we
do, so passes need to fix MEM_ATTRs accordingly)

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44838
[Bug rtl-optimization/44838] [4.6 regression] RTL loop unrolling causes FAIL: gcc.dg/pr39794.c
--- Comment #19 from rakdver at kam dot mff dot cuni dot cz 2010-07-07 10:35 ---
Subject: Re:  [4.6 regression] RTL loop unrolling causes FAIL: gcc.dg/pr39794.c

> > > > I am not sure what you mean -- I may be misunderstanding how rtl alias
> > > > analysis works, but as far as I can tell, what unroller does (just
> > > > preserving the MEM_ATTRs) is conservatively correct (so, potentially it
> > > > may make us believe that there are dependences that are not really
> > > > present, but it should not cause a wrong-code bug).
> > >
> > > Consider this simplified example:
> > >
> > > for (i ...)
> > >   {
> > >     /*A*/ t = a[i];
> > >     /*B*/ a[i+1] = t;
> > >   }
> > >
> > > MEM_ATTRS would indicate that memory references in A and B do not alias.
> >
> > but this is clearly wrong, since B in iteration X aliases with A in
> > iteration X+1.  So, not a problem in unroller.
>
> It is not wrong.  You have the two identical pointers p = &a[i] and q = p + 1.
> *p and *q do not alias.  Never.

Well, then you have some other definition of aliasing than me.  For me, two
memory references M1 and M2 do not alias, if on every code path, the locations
accessed by M1 and M2 are different.  With this definition, *p and *q may
alias, as the example above shows.  What is your definition?

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44838
[Bug rtl-optimization/44838] [4.6 regression] RTL loop unrolling causes FAIL: gcc.dg/pr39794.c
--- Comment #20 from rguenth at gcc dot gnu dot org 2010-07-07 10:43 ---
(In reply to comment #19)
> Subject: Re:  [4.6 regression] RTL loop unrolling causes FAIL: gcc.dg/pr39794.c
>
> > > > > I am not sure what you mean -- I may be misunderstanding how rtl
> > > > > alias analysis works, but as far as I can tell, what unroller does
> > > > > (just preserving the MEM_ATTRs) is conservatively correct (so,
> > > > > potentially it may make us believe that there are dependences that
> > > > > are not really present, but it should not cause a wrong-code bug).
> > > >
> > > > Consider this simplified example:
> > > >
> > > > for (i ...)
> > > >   {
> > > >     /*A*/ t = a[i];
> > > >     /*B*/ a[i+1] = t;
> > > >   }
> > > >
> > > > MEM_ATTRS would indicate that memory references in A and B do not alias.
> > >
> > > but this is clearly wrong, since B in iteration X aliases with A in
> > > iteration X+1.  So, not a problem in unroller.
> >
> > It is not wrong.  You have the two identical pointers p = &a[i] and
> > q = p + 1.  *p and *q do not alias.  Never.
>
> Well, then you have some other definition of aliasing than me.  For me, two
> memory references M1 and M2 do not alias, if on every code path, the locations
> accessed by M1 and M2 are different.  With this definition, *p and *q may
> alias, as the example above shows.  What is your definition?

My definition is that the two statements in sequence A, B have a true
dependence if stmt B accesses memory written to by A.  Thus, in this context
*p and *q do not alias (and this is what the scheduler and every other
optimization pass queries).

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44838
[Bug rtl-optimization/44838] [4.6 regression] RTL loop unrolling causes FAIL: gcc.dg/pr39794.c
--- Comment #21 from rakdver at kam dot mff dot cuni dot cz 2010-07-07 10:47 ---
Subject: Re:  [4.6 regression] RTL loop unrolling causes FAIL: gcc.dg/pr39794.c

> > > > > Consider this simplified example:
> > > > >
> > > > > for (i ...)
> > > > >   {
> > > > >     /*A*/ t = a[i];
> > > > >     /*B*/ a[i+1] = t;
> > > > >   }
> > > > >
> > > > > MEM_ATTRS would indicate that memory references in A and B do not
> > > > > alias.
> > > >
> > > > but this is clearly wrong, since B in iteration X aliases with A in
> > > > iteration X+1.  So, not a problem in unroller.
> > >
> > > It is not wrong.  You have the two identical pointers p = &a[i] and
> > > q = p + 1.  *p and *q do not alias.  Never.
> >
> > Well, then you have some other definition of aliasing than me.  For me, two
> > memory references M1 and M2 do not alias, if on every code path, the
> > locations accessed by M1 and M2 are different.  With this definition, *p
> > and *q may alias, as the example above shows.  What is your definition?
>
> My definition is that the two statements in sequence A, B have a true
> dependence if stmt B accesses memory written to by A.  Thus, in this context
> *p and *q do not alias (and this is what the scheduler and every other
> optimization pass queries).

what do you mean by statements in sequence?

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44838
[Bug rtl-optimization/44838] [4.6 regression] RTL loop unrolling causes FAIL: gcc.dg/pr39794.c
--- Comment #22 from rguenth at gcc dot gnu dot org 2010-07-07 10:48 ---
(In reply to comment #21)
> Subject: Re:  [4.6 regression] RTL loop unrolling causes FAIL: gcc.dg/pr39794.c
>
> > > > > > Consider this simplified example:
> > > > > >
> > > > > > for (i ...)
> > > > > >   {
> > > > > >     /*A*/ t = a[i];
> > > > > >     /*B*/ a[i+1] = t;
> > > > > >   }
> > > > > >
> > > > > > MEM_ATTRS would indicate that memory references in A and B do not
> > > > > > alias.
> > > > >
> > > > > but this is clearly wrong, since B in iteration X aliases with A in
> > > > > iteration X+1.  So, not a problem in unroller.
> > > >
> > > > It is not wrong.  You have the two identical pointers p = &a[i] and
> > > > q = p + 1.  *p and *q do not alias.  Never.
> > >
> > > Well, then you have some other definition of aliasing than me.  For me,
> > > two memory references M1 and M2 do not alias, if on every code path, the
> > > locations accessed by M1 and M2 are different.  With this definition, *p
> > > and *q may alias, as the example above shows.  What is your definition?
> >
> > My definition is that the two statements in sequence A, B have a true
> > dependence if stmt B accesses memory written to by A.  Thus, in this
> > context *p and *q do not alias (and this is what the scheduler and every
> > other optimization pass queries).
>
> what do you mean by statements in sequence?

statement B executes after A.

Note that the issue we run into here is partly (or completely?) due to the
fact that the pointer variables in MEM_ATTRs are SSA names and that we still
honor their single-definition (and thus trivial equality) property on RTL.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44838
[Bug rtl-optimization/44838] [4.6 regression] RTL loop unrolling causes FAIL: gcc.dg/pr39794.c
--- Comment #23 from rakdver at kam dot mff dot cuni dot cz 2010-07-07 10:51 ---
Subject: Re:  [4.6 regression] RTL loop unrolling causes FAIL: gcc.dg/pr39794.c

> > > > > > > Consider this simplified example:
> > > > > > >
> > > > > > > for (i ...)
> > > > > > >   {
> > > > > > >     /*A*/ t = a[i];
> > > > > > >     /*B*/ a[i+1] = t;
> > > > > > >   }
> > > > > > >
> > > > > > > MEM_ATTRS would indicate that memory references in A and B do not
> > > > > > > alias.
> > > > > >
> > > > > > but this is clearly wrong, since B in iteration X aliases with A in
> > > > > > iteration X+1.  So, not a problem in unroller.
> > > > >
> > > > > It is not wrong.  You have the two identical pointers p = &a[i] and
> > > > > q = p + 1.  *p and *q do not alias.  Never.
> > > >
> > > > Well, then you have some other definition of aliasing than me.  For me,
> > > > two memory references M1 and M2 do not alias, if on every code path,
> > > > the locations accessed by M1 and M2 are different.  With this
> > > > definition, *p and *q may alias, as the example above shows.  What is
> > > > your definition?
> > >
> > > My definition is that the two statements in sequence A, B have a true
> > > dependence if stmt B accesses memory written to by A.  Thus, in this
> > > context *p and *q do not alias (and this is what the scheduler and every
> > > other optimization pass queries).
> >
> > what do you mean by statements in sequence?
>
> statement B executes after A.

which means what?  In the example above, due to the loop, you cannot say which
statement executes after which.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44838
[Bug rtl-optimization/44838] [4.6 regression] RTL loop unrolling causes FAIL: gcc.dg/pr39794.c
--- Comment #24 from rguenth at gcc dot gnu dot org 2010-07-07 11:06 ---
In

  ...
  *p_1 = x;
  y = *(p_1 + 1);
  ...

I can say that *p_1 does not alias *(p_1 + 1) independent of what code
is around.  If it would be

BB3:
  # p_1 = PHI <p_0, p_2(3)>
  *p_1 = x;
  y = *(p_1 + 1);
  p_2 = p_1 + 1;
  goto BB3;

that would be still correct (I can exchange those two statements).  For
cross loop-iteration dependence after unrolling you would see accesses
based on different pointer SSA name bases.

Now on RTL we are not in SSA form and so yes, this change might be a bit
fishy (I, too, just discovered this side-effect and I assumed passes would
already do something here).  A way around this is to either adjust or clear
MEM_OFFSET.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44838
[Bug rtl-optimization/44838] [4.6 regression] RTL loop unrolling causes FAIL: gcc.dg/pr39794.c
--- Comment #25 from matz at gcc dot gnu dot org 2010-07-07 11:15 ---
Due to SSA form the alias information reflects dependencies only between
accesses as if it ignores back edges.

Hence any transformation that transforms a back edge into a forward edge, or
moves code over back edges needs to do adjustment to the alias info
(effectively doing something like PHI translation, or making the alias info
simply more imprecise).  Hmpf.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44838
[Bug rtl-optimization/44838] [4.6 regression] RTL loop unrolling causes FAIL: gcc.dg/pr39794.c
--- Comment #26 from rakdver at kam dot mff dot cuni dot cz 2010-07-07 11:19 ---
Subject: Re:  [4.6 regression] RTL loop unrolling causes FAIL: gcc.dg/pr39794.c

> In
>
>   ...
>   *p_1 = x;
>   y = *(p_1 + 1);
>   ...
>
> I can say that *p_1 does not alias *(p_1 + 1) independent of what code
> is around.  If it would be
>
> BB3:
>   # p_1 = PHI <p_0, p_2(3)>
>   *p_1 = x;
>   y = *(p_1 + 1);
>   p_2 = p_1 + 1;
>   goto BB3;
>
> that would be still correct (I can exchange those two statements).

Well, yes.  Still, I would like to hear your formal definition of what it
means for two memory references (not to) alias.  We certainly can modify the
code to ensure such a property, but just toying around without knowing
precisely what this property is definitely is not a good idea.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44838
[Bug rtl-optimization/44838] [4.6 regression] RTL loop unrolling causes FAIL: gcc.dg/pr39794.c
--- Comment #27 from rakdver at kam dot mff dot cuni dot cz 2010-07-07 11:31 ---
Subject: Re:  [4.6 regression] RTL loop unrolling causes FAIL: gcc.dg/pr39794.c

> Due to SSA form the alias information reflects dependencies only between
> accesses as if it ignores back edges.

Well, this is closer to what I was asking for; so, the actual definition that
we use is:

Two memory references M1 and M2 (appearing in statements S1 and S2) do not
alias if for every code execution path P, and every appearance A1 of S1 and
A2 of S2 in P such that no backedge is taken between A1 and A2, the memory
locations accessed in A1 and A2 are different.

Still, this is somewhat ambiguous (in the presence of irreducible loops, it is
not quite clear what is a backedge).

> Hence any transformation that transforms a back edge into a forward edge, or
> moves code over back edges needs to do adjustment to the alias info
> (effectively doing something like PHI translation, or making the alias info
> simply more imprecise).  Hmpf.

It is kind of unpleasant that this affects optimizations like loop unrolling,
which should make scheduling better (but likely won't do as well if we have
to just throw away the results of alias analysis).

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44838
[Bug rtl-optimization/44838] [4.6 regression] RTL loop unrolling causes FAIL: gcc.dg/pr39794.c
--- Comment #28 from rguenth at gcc dot gnu dot org 2010-07-07 11:59 ---
The following is a fix (or workaround) for the problem.

Index: gcc/tree-ssa-alias.c
===================================================================
--- gcc/tree-ssa-alias.c	(revision 161869)
+++ gcc/tree-ssa-alias.c	(working copy)
@@ -801,7 +780,8 @@ indirect_refs_may_alias_p (tree ref1 ATT
   /* If both bases are based on pointers they cannot alias if they may not
      point to the same memory object or if they point to the same object
      and the accesses do not overlap.  */
-  if (operand_equal_p (ptr1, ptr2, 0))
+  if (gimple_in_ssa_p (cfun)
+      && operand_equal_p (ptr1, ptr2, 0))
     {
       if (TREE_CODE (base1) == MEM_REF)
 	offset1 += mem_ref_offset (base1).low * BITS_PER_UNIT;

In SSA form we are sure that if two SSA names are equal their (same)
definition dominates them.  So if you ask whether the two memory references
do alias if they are still in loopy form they do not.  For every iteration
they have a strict ordering with respect to the definition of their name.

Now if you unroll the loop and re-instantiate SSA form you can't use the
previous alias query result to determine cross-loop-iteration dependences.

The above patch disables offset-based disambiguation for accesses via pointers
(technically a nice thing to have).

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44838
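The idea behind the patch in comment #28 can be sketched outside of GCC. The following is a simplified model only: struct mem_access, ranges_overlap and accesses_may_alias are hypothetical names invented for this illustration and are not part of GCC's alias oracle, which performs further checks after this one. It shows the one decision the patch changes: offset ranges off a syntactically equal base pointer are compared only when the base name is known to denote a single value (SSA form); otherwise the answer falls back to "may alias".

```c
#include <stdbool.h>

/* Hypothetical model of a memory reference: an access of `size` bytes
   at `base + offset`, where base_id names the base pointer. */
struct mem_access {
    int  base_id;
    long offset;
    long size;
};

/* True if [off1, off1+size1) and [off2, off2+size2) intersect. */
static bool ranges_overlap(long off1, long size1, long off2, long size2)
{
    return off1 < off2 + size2 && off2 < off1 + size1;
}

/* Offset-based disambiguation, trusted only in SSA form.  Outside SSA
   form the same name may hold different values at the two accesses,
   so equal base_ids prove nothing and we answer conservatively. */
static bool accesses_may_alias(const struct mem_access *a,
                               const struct mem_access *b,
                               bool in_ssa_form)
{
    if (in_ssa_form && a->base_id == b->base_id)
        return ranges_overlap(a->offset, a->size, b->offset, b->size);
    return true; /* conservative: assume a dependence */
}
```

With the accesses from the RTL dump in comment #14 (a 4-byte store at a_22+0 and a 4-byte store at a_22+4), this model answers "no alias" when in_ssa_form is true and "may alias" when it is false, which is the behaviour change the patch makes for RTL.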
[Bug rtl-optimization/44838] [4.6 regression] RTL loop unrolling causes FAIL: gcc.dg/pr39794.c
--- Comment #29 from matz at gcc dot gnu dot org 2010-07-07 12:10 ---
[just for completeness to not lose the thought:]

Thinking about this some more (triggered by the problem of not having nice
back edges in irreducible loops), it's not really the back edges that are
interesting but the underlying property of SSA, namely the correspondence
between static single assignments and dynamic single assignments:

The alias oracle will give correct answers only for memory references when it
can infer runtime equality of values from syntactic equality, which it can for
a correct SSA program.

So, if M1 and M2 (two memrefs) contain mentions of syntactically the same
values, then A1/A2 (two accesses to M1/M2) have to be dominated by the
dynamically same definitions of those values.  For SSA form that's trivially
true, for RTL of course it isn't.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44838
[Bug rtl-optimization/44838] [4.6 regression] RTL loop unrolling causes FAIL: gcc.dg/pr39794.c
--- Comment #30 from rguenth at gcc dot gnu dot org 2010-07-07 13:58 ---
I'm going to test the patch in comment #28.

--
rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         AssignedTo|unassigned at gcc dot gnu   |rguenth at gcc dot gnu dot
                   |dot org                     |org
             Status|NEW                         |ASSIGNED
   Last reconfirmed|2010-07-06 21:40:24         |2010-07-07 13:58:10
               date|                            |

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44838