https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92266

            Bug ID: 92266
           Summary: Duplicate code generation for live stmts from SLP
           Product: gcc
           Version: 10.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: rguenth at gcc dot gnu.org
  Target Milestone: ---

int foo (int * __restrict__ a, int * b, int n)
{
  int tem1, tem2;
  for (int i = 0; i < n; ++i)
    {
      tem1 = a[i*2 + 0] * 2;
      tem2 = a[i*2 + 1] * 2;
      b[i*4 + 0] = tem1;
      b[i*4 + 1] = tem1;
      b[i*4 + 2] = tem2;
      b[i*4 + 3] = tem2;
    }
  return tem1 + tem2;
}

shows four lane extracts after vectorization, although only two of them
(_64 and _62) are used:

  <bb 10> [local count: 105119324]:
  # tem1_55 = PHI <tem1_35(3)>
  # tem2_54 = PHI <tem2_36(3)>
  # vect_tem1_35.9_53 = PHI <vect_tem1_35.9_65(3)>
  _61 = BIT_FIELD_REF <vect_tem1_35.9_53, 32, 96>;
  _62 = BIT_FIELD_REF <vect_tem1_35.9_53, 32, 64>;
  _63 = BIT_FIELD_REF <vect_tem1_35.9_53, 32, 32>;
  _64 = BIT_FIELD_REF <vect_tem1_35.9_53, 32, 0>;
  goto <bb 4>; [100.00%]

  <bb 6> [local count: 850510900]:
  goto <bb 3>; [100.00%]

  <bb 4> [local count: 118111601]:
  # tem1_45 = PHI <tem1_29(D)(2), _64(10)>
  # tem2_46 = PHI <tem2_30(D)(2), _62(10)>
  _33 = tem1_45 + tem2_46;

which is because we iterate like

      FOR_EACH_VEC_ELT (SLP_TREE_SCALAR_STMTS (slp_node), i, slp_stmt_info)
        {
          if (STMT_VINFO_LIVE_P (slp_stmt_info)
              && !vectorizable_live_operation (slp_stmt_info, gsi, slp_node,
                                               slp_node_instance, i,
                                               vec_stmt_p, cost_vec))
            return false;
        }

so for stmts appearing multiple times in an SLP node we code-gen the live
operation once per occurrence.  This is even worse for stmts appearing in
multiple SLP nodes.  Luckily the redundant code is all dead in the end.
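
To illustrate the effect, here is a minimal standalone sketch.  It is not GCC
code and not a proposed patch: struct stmt, count_extracts_naive,
count_extracts_dedup and demo are hypothetical names made up for this example.
It models the four lanes of the SLP node from the testcase (tem1, tem1, tem2,
tem2) and counts how many live-lane extracts a walk over the lanes would emit,
with and without skipping stmts already seen in an earlier lane.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a scalar stmt; not GCC's stmt_vec_info.  */
struct stmt
{
  int id;
  bool live;
};

/* Mirror the loop quoted above: one extract per live lane, even when
   the same stmt occupies several lanes.  */
static size_t
count_extracts_naive (struct stmt *const *lanes, size_t n)
{
  assert (lanes != NULL);
  size_t emitted = 0;
  for (size_t i = 0; i < n; ++i)
    if (lanes[i]->live)
      ++emitted;
  return emitted;
}

/* Same walk, but skip a stmt that already appeared in an earlier lane.
   The quadratic scan is fine for the handful of lanes in one node.  */
static size_t
count_extracts_dedup (struct stmt *const *lanes, size_t n)
{
  assert (lanes != NULL);
  size_t emitted = 0;
  for (size_t i = 0; i < n; ++i)
    {
      if (!lanes[i]->live)
        continue;
      bool seen = false;
      for (size_t j = 0; j < i && !seen; ++j)
        seen = (lanes[j] == lanes[i]);
      if (!seen)
        ++emitted;
    }
  return emitted;
}

/* Lanes as in the testcase: tem1, tem1, tem2, tem2, all live.  */
static size_t
demo (bool dedup)
{
  struct stmt tem1 = { 1, true };
  struct stmt tem2 = { 2, true };
  struct stmt *lanes[4] = { &tem1, &tem1, &tem2, &tem2 };
  return dedup ? count_extracts_dedup (lanes, 4)
               : count_extracts_naive (lanes, 4);
}
```

With these lanes the naive walk emits four extracts, matching the four
BIT_FIELD_REFs in the dump, while the deduplicating walk emits two.  The sketch
only covers duplicates within one node; stmts shared between several SLP nodes
would presumably need a "seen" set that outlives a single node.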
