https://gcc.gnu.org/bugzilla/show_bug.cgi?id=122308
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |ASSIGNED
Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org
--- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
So the issue with unroll-and-jam is that when performing jam we put outer loop
stmts into the inner loop:
<bb 3> [local count: 9663674]:
# i_22 = PHI <0(17), i_55(26)>
# ivtmp_56 = PHI <1(17), ivtmp_57(26)>
index_14 = c[i_22];
<bb 4> [local count: 956703966]:
# j_23 = PHI <0(3), j_53(23)>
# ivtmp_18 = PHI <1024(3), ivtmp_27(23)>
_1 = a[j_23];
_2 = (unsigned short) _1;
_3 = index_14 + j_23;
_4 = b[_3];
_5 = (unsigned short) _4;
_6 = _2 + _5;
_7 = (short int) _6;
a[j_23] = _7;
i_15 = i_22 + 1;
===> index_11 = c[i_15];
_47 = index_11 + j_23;
_48 = b[_47];
_49 = (unsigned short) _48;
_50 = _6 + _49;
_51 = (short int) _50;
a[j_23] = _51;
j_53 = j_23 + 1;
ivtmp_27 = ivtmp_18 - 1;
if (ivtmp_27 != 0)
and so the b[_47] access becomes a gather. The most reasonable short-term
solution would be to not perform unroll-and-jam when there's a load
in the outer loop. Alternatively, do it like loop interchange and perform
poor man's hoisting of invariant refs, i.e. something like
diff --git a/gcc/gimple-loop-jam.cc b/gcc/gimple-loop-jam.cc
index 5e6c04a7d7f..5c74f80af4c 100644
--- a/gcc/gimple-loop-jam.cc
+++ b/gcc/gimple-loop-jam.cc
@@ -641,6 +641,7 @@ tree_loop_unroll_and_jam (void)
{
cleanup_tree_cfg ();
todo &= ~TODO_cleanup_cfg;
+ todo |= loop_invariant_motion_in_fun (cfun, false);
}
rewrite_into_loop_closed_ssa (NULL, 0);
scev_reset ();
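
For illustration, here is a minimal C sketch of the situation in the dump above. It assumes the source nest has roughly the shape shown in nest_original. The outer trip count, the size of b, the fill values, and all function names are hypothetical; only a, b, c, the index computation, and the inner trip count of 1024 are taken from the dump. nest_jammed mirrors the jammed GIMPLE, where the load of c[i + 1] ends up inside the inner loop. nest_jammed_hoisted shows the form one would hope to get back once invariant motion has run on the jammed loop.

```c
#include <string.h>

/* Hypothetical sizes: only the inner trip count (1024) comes from the dump. */
enum { N_OUTER = 8, N_INNER = 1024, B_LEN = 2048 };

/* Assumed shape of the original nest: c[i] is loaded once per outer iteration. */
static void nest_original(short *a, const short *b, const int *c)
{
    for (int i = 0; i < N_OUTER; i++) {
        int index = c[i];
        for (int j = 0; j < N_INNER; j++)
            a[j] = (short)(unsigned short)((unsigned short)a[j]
                                           + (unsigned short)b[index + j]);
    }
}

/* Unroll-and-jam by 2 as in the dump: the second copy's load of c[i + 1]
   now sits in the inner loop body and is executed for every j. */
static void nest_jammed(short *a, const short *b, const int *c)
{
    for (int i = 0; i < N_OUTER; i += 2) {
        int index0 = c[i];
        for (int j = 0; j < N_INNER; j++) {
            unsigned short sum = (unsigned short)((unsigned short)a[j]
                                                  + (unsigned short)b[index0 + j]);
            int index1 = c[i + 1];  /* outer-loop load inside the inner loop */
            sum = (unsigned short)(sum + (unsigned short)b[index1 + j]);
            a[j] = (short)sum;
        }
    }
}

/* Same jammed loop with the invariant load hoisted in front of the inner
   loop, so both b accesses use an index that is invariant in j. */
static void nest_jammed_hoisted(short *a, const short *b, const int *c)
{
    for (int i = 0; i < N_OUTER; i += 2) {
        int index0 = c[i];
        int index1 = c[i + 1];
        for (int j = 0; j < N_INNER; j++) {
            unsigned short sum = (unsigned short)((unsigned short)a[j]
                                                  + (unsigned short)b[index0 + j]);
            sum = (unsigned short)(sum + (unsigned short)b[index1 + j]);
            a[j] = (short)sum;
        }
    }
}

/* Runs all three variants on the same small inputs and returns 1 if the
   results are identical, 0 otherwise. */
static int variants_agree(void)
{
    static short b[B_LEN];
    static short a0[N_INNER], a1[N_INNER], a2[N_INNER];
    int c[N_OUTER];

    for (int k = 0; k < B_LEN; k++)
        b[k] = (short)(k % 7);
    for (int i = 0; i < N_OUTER; i++)
        c[i] = 3 * i;           /* keeps index + j well inside b */
    memset(a0, 0, sizeof a0);
    memset(a1, 0, sizeof a1);
    memset(a2, 0, sizeof a2);

    nest_original(a0, b, c);
    nest_jammed(a1, b, c);
    nest_jammed_hoisted(a2, b, c);

    return memcmp(a0, a1, sizeof a0) == 0 && memcmp(a0, a2, sizeof a0) == 0;
}
```

All three variants compute the same values; they differ only in where the c load sits relative to the inner loop. In the hoisted form both index0 and index1 are invariant in j, so b[index1 + j] is a plain contiguous access again, which is presumably what running invariant motion after the jam is meant to recover.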