https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92266
Bug ID: 92266
Summary: Duplicate code generation for live stmts from SLP
Product: gcc
Version: 10.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: rguenth at gcc dot gnu.org
Target Milestone: ---

int foo (int * __restrict__ a, int * b, int n)
{
  int tem1, tem2;
  for (int i = 0; i < n; ++i)
    {
      tem1 = a[i*2 + 0] * 2;
      tem2 = a[i*2 + 1] * 2;
      b[i*4 + 0] = tem1;
      b[i*4 + 1] = tem1;
      b[i*4 + 2] = tem2;
      b[i*4 + 3] = tem2;
    }
  return tem1 + tem2;
}

shows

  <bb 10> [local count: 105119324]:
  # tem1_55 = PHI <tem1_35(3)>
  # tem2_54 = PHI <tem2_36(3)>
  # vect_tem1_35.9_53 = PHI <vect_tem1_35.9_65(3)>
  _61 = BIT_FIELD_REF <vect_tem1_35.9_53, 32, 96>;
  _62 = BIT_FIELD_REF <vect_tem1_35.9_53, 32, 64>;
  _63 = BIT_FIELD_REF <vect_tem1_35.9_53, 32, 32>;
  _64 = BIT_FIELD_REF <vect_tem1_35.9_53, 32, 0>;
  goto <bb 4>; [100.00%]

  <bb 6> [local count: 850510900]:
  goto <bb 3>; [100.00%]

  <bb 4> [local count: 118111601]:
  # tem1_45 = PHI <tem1_29(D)(2), _64(10)>
  # tem2_46 = PHI <tem2_30(D)(2), _62(10)>
  _33 = tem1_45 + tem2_46;

which is because we iterate like

  FOR_EACH_VEC_ELT (SLP_TREE_SCALAR_STMTS (slp_node), i, slp_stmt_info)
    {
      if (STMT_VINFO_LIVE_P (slp_stmt_info)
          && !vectorizable_live_operation (slp_stmt_info, gsi, slp_node,
                                           slp_node_instance, i,
                                           vec_stmt_p, cost_vec))
        return false;
    }

so for stmts appearing multiple times we code-gen the live operation
multiple times.  This is even worse for stmts appearing in multiple
SLP nodes.  Luckily the code is all dead in the end.
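
For illustration only, here is a toy model of the iteration described above. It is standalone C, not GCC code: `toy_stmt`, `count_live_ops_naive` and `count_live_ops_dedup` are hypothetical names, and the sketch assumes the SLP node's scalar stmts are { tem1, tem1, tem2, tem2 }, as the four stores in the testcase suggest. The naive loop counts one live operation per lane, which matches the four BIT_FIELD_REFs in the dump. The second loop shows one possible way to visit each distinct stmt only once. It is not a proposed patch, and it does not cover stmts shared between different SLP nodes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a scalar stmt in an SLP node; not a GCC type.  */
struct toy_stmt
{
  int id;
  bool live;
};

/* Mirrors the quoted loop: one live operation per lane, so a stmt that
   occupies several lanes is handled several times.  */
size_t
count_live_ops_naive (const struct toy_stmt *const *lanes, size_t n)
{
  assert (lanes != NULL);
  size_t emitted = 0;
  for (size_t i = 0; i < n; ++i)
    if (lanes[i]->live)
      ++emitted;
  return emitted;
}

/* Handles each distinct live stmt once, comparing lanes by pointer
   identity.  Quadratic in the lane count, which is fine for a sketch.  */
size_t
count_live_ops_dedup (const struct toy_stmt *const *lanes, size_t n)
{
  assert (lanes != NULL);
  size_t emitted = 0;
  for (size_t i = 0; i < n; ++i)
    {
      if (!lanes[i]->live)
        continue;
      bool seen = false;
      for (size_t j = 0; j < i; ++j)
        if (lanes[j] == lanes[i])
          {
            seen = true;
            break;
          }
      if (!seen)
        ++emitted;
    }
  return emitted;
}
```

With lanes { tem1, tem1, tem2, tem2 }, all live, the naive count is 4 and the deduplicated count is 2, which is the number of extracts that bb 4 actually uses.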