https://gcc.gnu.org/bugzilla/show_bug.cgi?id=122308

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org|rguenth at gcc dot gnu.org

--- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
So the issue with unroll-and-jam is that when performing jam we put outer loop
stmts into the inner loop:

  <bb 3> [local count: 9663674]:
  # i_22 = PHI <0(17), i_55(26)>
  # ivtmp_56 = PHI <1(17), ivtmp_57(26)>
  index_14 = c[i_22];

  <bb 4> [local count: 956703966]:
  # j_23 = PHI <0(3), j_53(23)>
  # ivtmp_18 = PHI <1024(3), ivtmp_27(23)>
  _1 = a[j_23];
  _2 = (unsigned short) _1;
  _3 = index_14 + j_23;
  _4 = b[_3];
  _5 = (unsigned short) _4;
  _6 = _2 + _5;
  _7 = (short int) _6;
  a[j_23] = _7;
  i_15 = i_22 + 1;
 ===>  index_11 = c[i_15];
  _47 = index_11 + j_23;
  _48 = b[_47];
  _49 = (unsigned short) _48;
  _50 = _6 + _49;
  _51 = (short int) _50;
  a[j_23] = _51;
  j_53 = j_23 + 1;
  ivtmp_27 = ivtmp_18 - 1;
  if (ivtmp_27 != 0)

and so the b[_47] access becomes a gather, because the load of index_11
now sits inside the inner loop.  The most reasonable short-term
solution would be to not perform unroll-and-jam when there's a load
in the outer loop.  Alternatively, do it like loop interchange and perform
poor man's hoisting of invariant refs, i.e. something like

diff --git a/gcc/gimple-loop-jam.cc b/gcc/gimple-loop-jam.cc
index 5e6c04a7d7f..5c74f80af4c 100644
--- a/gcc/gimple-loop-jam.cc
+++ b/gcc/gimple-loop-jam.cc
@@ -641,6 +641,7 @@ tree_loop_unroll_and_jam (void)
        {
          cleanup_tree_cfg ();
          todo &= ~TODO_cleanup_cfg;
+         todo |= loop_invariant_motion_in_fun (cfun, false);
        }
       rewrite_into_loop_closed_ssa (NULL, 0);
       scev_reset ();
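
For illustration only, here is a hand-written C sketch of the situation
described above.  The loop nest is an assumption reconstructed from the
GIMPLE dump, not the testcase from the bug, and the function names
(nest_original, nest_jammed, nest_hoisted) and the n/m parameters are
hypothetical.  It models what the transforms do to the source, not how
GCC implements them: nest_jammed keeps the c[i + 1] load inside the inner
loop as in the dump, nest_hoisted shows the form invariant motion would be
expected to produce.

```c
#include <assert.h>

/* Original nest: c[i] is loaded once per outer iteration.  */
static void
nest_original (short *a, const short *b, const int *c, int n, int m)
{
  for (int i = 0; i < n; i++)
    {
      int index = c[i];
      for (int j = 0; j < m; j++)
        a[j] = (short) (a[j] + b[index + j]);
    }
}

/* Outer loop unrolled by 2 and jammed: the second copy of the outer-loop
   load, c[i + 1], ends up inside the inner loop (the ===> line above).  */
static void
nest_jammed (short *a, const short *b, const int *c, int n, int m)
{
  assert (n % 2 == 0);  /* sketch handles even trip counts only */
  for (int i = 0; i < n; i += 2)
    {
      int index0 = c[i];
      for (int j = 0; j < m; j++)
        {
          short t = (short) (a[j] + b[index0 + j]);
          int index1 = c[i + 1];  /* invariant in j, reloaded each time */
          a[j] = (short) (t + b[index1 + j]);
        }
    }
}

/* Same as nest_jammed, with the invariant load hoisted by hand.  */
static void
nest_hoisted (short *a, const short *b, const int *c, int n, int m)
{
  assert (n % 2 == 0);
  for (int i = 0; i < n; i += 2)
    {
      int index0 = c[i];
      int index1 = c[i + 1];  /* loaded once per outer iteration */
      for (int j = 0; j < m; j++)
        {
          short t = (short) (a[j] + b[index0 + j]);
          a[j] = (short) (t + b[index1 + j]);
        }
    }
}
```

All three functions compute the same values in a[]; they differ only in
where the c[] loads are placed relative to the inner loop.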
