https://gcc.gnu.org/bugzilla/show_bug.cgi?id=123672

--- Comment #9 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
forwprop1 changes:
   a_7 = *x_6(D);
   b_9 = *y_8(D);
-  c_10 = VEC_PERM_EXPR <a_7, a_7, { 0, 2, 0, 2 }>;
-  d_11 = VEC_PERM_EXPR <a_7, a_7, { 1, 3, 1, 3 }>;
+  c_10 = VEC_PERM_EXPR <a_7, b_9, { 0, 2, 4, 4 }>;
+  d_11 = VEC_PERM_EXPR <a_7, b_9, { 1, 3, 5, 5 }>;
   e_12 = VEC_PERM_EXPR <b_9, b_9, { 0, 2, 0, 2 }>;
   f_13 = VEC_PERM_EXPR <b_9, b_9, { 1, 3, 1, 3 }>;
   _1 = c_10 + d_11;
   _2 = c_10 - d_11;
   g_14 = VEC_PERM_EXPR <_1, _2, { 0, 4, 1, 5 }>;
   _3 = e_12 + f_13;
   _4 = e_12 - f_13;
-  h_15 = VEC_PERM_EXPR <_3, _4, { 0, 4, 1, 5 }>;
+  h_15 = VEC_PERM_EXPR <_1, _2, { 2, 6, 3, 7 }>;
   *x_6(D) = g_14;
   *y_8(D) = h_15;
   return;

What is wrong are the selectors on the two new VEC_PERM_EXPRs: they should
have been { 0, 2, 4, 6 } and { 1, 3, 5, 7 }.
By using 4 twice and 5 twice, only 2 lanes from b are actually used, whereas
previously all 4 were used.
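
To make the difference concrete, here is a small sketch that models the dumps
above on plain Python lists. It is only an illustration, not GCC code: the
helper names (vec_perm, original, transformed) and the sample inputs are made
up for this example, and vec_perm assumes the usual VEC_PERM_EXPR behaviour
where lane i of the result is element sel[i] of the two inputs concatenated.

```python
def vec_perm(v0, v1, sel):
    # Model of VEC_PERM_EXPR <v0, v1, sel>: index into v0 followed by v1.
    # All selector values used here are already in range 0..7.
    both = list(v0) + list(v1)
    return [both[i] for i in sel]

def add(u, v):
    return [p + q for p, q in zip(u, v)]

def sub(u, v):
    return [p - q for p, q in zip(u, v)]

def original(a, b):
    # The sequence before forwprop1.
    c = vec_perm(a, a, [0, 2, 0, 2])
    d = vec_perm(a, a, [1, 3, 1, 3])
    e = vec_perm(b, b, [0, 2, 0, 2])
    f = vec_perm(b, b, [1, 3, 1, 3])
    g = vec_perm(add(c, d), sub(c, d), [0, 4, 1, 5])
    h = vec_perm(add(e, f), sub(e, f), [0, 4, 1, 5])
    return g, h

def transformed(a, b, sel_c, sel_d):
    # The sequence after forwprop1, with the c/d selectors as parameters.
    c = vec_perm(a, b, sel_c)
    d = vec_perm(a, b, sel_d)
    s, t = add(c, d), sub(c, d)
    g = vec_perm(s, t, [0, 4, 1, 5])
    h = vec_perm(s, t, [2, 6, 3, 7])
    return g, h

# Arbitrary sample inputs, chosen so every lane is distinguishable.
a = [1, 2, 3, 4]
b = [10, 20, 30, 40]

expected = original(a, b)
buggy = transformed(a, b, [0, 2, 4, 4], [1, 3, 5, 5])
fixed = transformed(a, b, [0, 2, 4, 6], [1, 3, 5, 7])

print("original:", expected)
print("buggy:   ", buggy)
print("fixed:   ", fixed)
```

With these inputs g is the same in all three cases, while h from the buggy
selectors repeats the b[0]/b[1] results in its upper two lanes instead of
using b[2] and b[3]. The selectors { 0, 2, 4, 6 } and { 1, 3, 5, 7 } reproduce
the original result.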
