Instead of going via the PHI node accessible through the reduc-def link, use the scalar def of the reduction SLP node. Compute this in vectorize_fold_left_reduction itself.
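
For illustration only, the lookup described above can be sketched with a toy model. ToyNode and toy_reduc_var are hypothetical names invented for this sketch and are not GCC's slp_tree or any vectorizer API. The sketch only assumes that a node keeps one scalar def per lane and one child per operand.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for an SLP node (NOT GCC's real slp_tree):
// one scalar def name per lane, plus one child node per operand.
struct ToyNode {
  std::vector<std::string> scalar_defs;
  std::vector<ToyNode *> children;
};

// Sketch of the idea: select the operand child at reduc_index and
// return the scalar def of its lane 0, instead of following a
// separate link to the PHI statement.
std::string toy_reduc_var(const ToyNode &node, int reduc_index) {
  const ToyNode *reduc_node = node.children.at(reduc_index);
  return reduc_node->scalar_defs.at(0);
}
```

With an add node whose children are a load and the cycle PHI, toy_reduc_var (add, 1) would return the PHI's scalar def, which is the role reduc_var plays in the patch.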

v2 avoids using the ops[] taken from the scalar stmt in the stmt-info
since that no longer works for epilogue loops after I nuked the
pattern stmt operand massaging.  Instead go to the reduction SLP tree
and use its scalar def.  I added a FIXME as to how I think this all
should work, but as usual, I'll leave that for a followup, not
necessarily by me ...

Bootstrap and regtest ongoing on x86_64-unknown-linux-gnu.

Richard.

	* tree-vect-loop.cc (vectorize_fold_left_reduction): Do not get
	reduc_var as argument, instead compute it here.
	(vect_transform_reduction): Adjust.
---
 gcc/tree-vect-loop.cc | 13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
index f22613e5a3a..24cff6fa870 100644
--- a/gcc/tree-vect-loop.cc
+++ b/gcc/tree-vect-loop.cc
@@ -6501,7 +6501,6 @@ vectorize_fold_left_reduction (loop_vec_info loop_vinfo,
                                stmt_vec_info stmt_info,
                                gimple_stmt_iterator *gsi,
                                slp_tree slp_node,
-                               gimple *reduc_def_stmt,
                                code_helper code, internal_fn reduc_fn,
                                int num_ops, tree vectype_in,
                                int reduc_index, vec_loop_masks *masks,
@@ -6526,6 +6525,13 @@ vectorize_fold_left_reduction (loop_vec_info loop_vinfo,
   gcc_assert (known_eq (TYPE_VECTOR_SUBPARTS (vectype_out),
                         TYPE_VECTOR_SUBPARTS (vectype_in)));
 
+  /* ??? We should, when transforming the cycle PHI, record the existing
+     scalar def as vector def so looking up the vector def works.  This
+     would also allow generalizing this for reduction paths of length > 1
+     and/or SLP reductions.  */
+  slp_tree reduc_node = SLP_TREE_CHILDREN (slp_node)[reduc_index];
+  tree reduc_var = vect_get_slp_scalar_def (reduc_node, 0);
+
   /* The operands either come from a binary operation or an IFN_COND operation.
      The former is a gimple assign with binary rhs and the latter is a
      gimple call with four arguments.  */
@@ -6546,7 +6552,6 @@ vectorize_fold_left_reduction (loop_vec_info loop_vinfo,
   gimple *sdef = vect_orig_stmt (scalar_dest_def_info)->stmt;
   tree scalar_dest = gimple_get_lhs (sdef);
   tree scalar_type = TREE_TYPE (scalar_dest);
-  tree reduc_var = gimple_phi_result (reduc_def_stmt);
 
   int vec_num = vec_oprnds0.length ();
   tree vec_elem_type = TREE_TYPE (vectype_out);
@@ -8016,8 +8021,6 @@ vect_transform_reduction (loop_vec_info loop_vinfo,
      The last use is the reduction variable.  In case of nested cycle this
      assumption is not true: we use reduc_index to record the index of the
      reduction variable.  */
-  stmt_vec_info phi_info = STMT_VINFO_REDUC_DEF (vect_orig_stmt (stmt_info));
-  gphi *reduc_def_phi = as_a <gphi *> (phi_info->stmt);
   int reduc_index = STMT_VINFO_REDUC_IDX (stmt_info);
   tree vectype_in = SLP_TREE_VECTYPE (SLP_TREE_CHILDREN (slp_node)[0]);
 
@@ -8060,7 +8063,7 @@ vect_transform_reduction (loop_vec_info loop_vinfo,
       internal_fn reduc_fn = STMT_VINFO_REDUC_FN (reduc_info);
       gcc_assert (code.is_tree_code () || cond_fn_p);
       return vectorize_fold_left_reduction
-          (loop_vinfo, stmt_info, gsi, slp_node, reduc_def_phi,
+          (loop_vinfo, stmt_info, gsi, slp_node,
            code, reduc_fn, op.num_ops, vectype_in,
            reduc_index, masks, lens);
     }
-- 
2.43.0