Re: [2/6] Don't assign a cost to vectorizable_assignment
On Thu, Nov 7, 2019 at 5:40 PM Richard Sandiford wrote: > > Richard Biener writes: > > On Wed, Nov 6, 2019 at 4:58 PM Richard Sandiford > > wrote: > >> > >> Richard Biener writes: > >> > On Tue, Nov 5, 2019 at 3:27 PM Richard Sandiford > >> > wrote: > >> >> > >> >> vectorizable_assignment handles true SSA-to-SSA copies (which hopefully > >> >> we don't see in practice) and no-op conversions that are required > >> >> to maintain correct gimple, such as changes between signed and > >> >> unsigned types. These cases shouldn't generate any code and so > >> >> shouldn't count against either the scalar or vector costs. > >> >> > >> >> Later patches test this, but it seemed worth splitting out. > >> > > >> > Hmm, but you have to adjust vect_compute_single_scalar_iteration_cost and > >> > possibly the SLP cost walk as well, otherwise we're artificially making > >> > those copies cheaper when vectorized. > >> > >> Ah, yeah. It looks complicated to reproduce the conditions exactly > >> there, so how about just costing 1 copy in vectorizable_assignment > >> to counteract it, and ignore ncopies? > > > > I guess costing a single scalar_stmt ought to make it exactly offset > > the scalar cost? > > To summarise what we said on IRC: the problem with that is that we > need to count VF scalar stmts, where VF might be a runtime value. > The follow-on loop costing code copes with variable VF without > relying on vect_vf_for_cost. > > Calling vectorizable_assignment from the scalar costing code > seemed like too much of a hack. And it turns out that we can't > delay the scalar costing until after vect_analyze_stmts because > vect_enhance_data_refs_alignment needs it before then. Reworking > this whole thing is too much work for GCC 10 at this stage. > > So this patch goes with your suggestion of using a test based on > tree_nop_conversion. To make sure that the scalar and vector costs > stay somewhat consistent, vectorizable_assignment continues to cost > stmts for which the new predicate is false. > > Tested as before. OK. thanks, Richard. > Thanks, > Richard > > > 2019-11-07 Richard Sandiford > > gcc/ > * tree-vectorizer.h (vect_nop_conversion_p): Declare. > * tree-vect-stmts.c (vect_nop_conversion_p): New function. > (vectorizable_assignment): Don't add a cost for nop conversions. > * tree-vect-loop.c (vect_compute_single_scalar_iteration_cost): > Likewise. > * tree-vect-slp.c (vect_bb_slp_scalar_cost): Likewise. > > Index: gcc/tree-vectorizer.h > === > --- gcc/tree-vectorizer.h 2019-11-07 15:11:22.290972236 + > +++ gcc/tree-vectorizer.h 2019-11-07 16:32:14.817523866 + > @@ -1654,6 +1654,7 @@ extern tree vect_get_vec_def_for_stmt_co > extern bool vect_transform_stmt (stmt_vec_info, gimple_stmt_iterator *, > slp_tree, slp_instance); > extern void vect_remove_stores (stmt_vec_info); > +extern bool vect_nop_conversion_p (stmt_vec_info); > extern opt_result vect_analyze_stmt (stmt_vec_info, bool *, slp_tree, > slp_instance, stmt_vector_for_cost *); > extern void vect_get_load_cost (stmt_vec_info, int, bool, > Index: gcc/tree-vect-stmts.c > === > --- gcc/tree-vect-stmts.c 2019-11-07 15:11:50.134775028 + > +++ gcc/tree-vect-stmts.c 2019-11-07 16:32:14.817523866 + > @@ -5284,6 +5284,29 @@ vectorizable_conversion (stmt_vec_info s >return true; > } > > +/* Return true if we can assume from the scalar form of STMT_INFO that > + neither the scalar nor the vector forms will generate code. STMT_INFO > + is known not to involve a data reference. */ > + > +bool > +vect_nop_conversion_p (stmt_vec_info stmt_info) > +{ > + gassign *stmt = dyn_cast (stmt_info->stmt); > + if (!stmt) > +return false; > + > + tree lhs = gimple_assign_lhs (stmt); > + tree_code code = gimple_assign_rhs_code (stmt); > + tree rhs = gimple_assign_rhs1 (stmt); > + > + if (code == SSA_NAME || code == VIEW_CONVERT_EXPR) > +return true; > + > + if (CONVERT_EXPR_CODE_P (code)) > +return tree_nop_conversion_p (TREE_TYPE (lhs), TREE_TYPE (rhs)); > + > + return false; > +} > > /* Function vectorizable_assignment. > > @@ -5399,7 +5422,9 @@ vectorizable_assignment (stmt_vec_info s > { >STMT_VINFO_TYPE (stmt_info) = assignment_vec_info_type; >DUMP_VECT_SCOPE ("vectorizable_assignment"); > - vect_model_simple_cost (stmt_info, ncopies, dt, ndts, slp_node, > cost_vec); > + if (!vect_nop_conversion_p (stmt_info)) > + vect_model_simple_cost (stmt_info, ncopies, dt, ndts, slp_node, > + cost_vec); >return true; > } > > Index: gcc/tree-vect-loop.c > === > --- gcc/tree-vect-loop.c2019-11-07
Re: [2/6] Don't assign a cost to vectorizable_assignment
Richard Biener writes: > On Wed, Nov 6, 2019 at 4:58 PM Richard Sandiford > wrote: >> >> Richard Biener writes: >> > On Tue, Nov 5, 2019 at 3:27 PM Richard Sandiford >> > wrote: >> >> >> >> vectorizable_assignment handles true SSA-to-SSA copies (which hopefully >> >> we don't see in practice) and no-op conversions that are required >> >> to maintain correct gimple, such as changes between signed and >> >> unsigned types. These cases shouldn't generate any code and so >> >> shouldn't count against either the scalar or vector costs. >> >> >> >> Later patches test this, but it seemed worth splitting out. >> > >> > Hmm, but you have to adjust vect_compute_single_scalar_iteration_cost and >> > possibly the SLP cost walk as well, otherwise we're artificially making >> > those copies cheaper when vectorized. >> >> Ah, yeah. It looks complicated to reproduce the conditions exactly >> there, so how about just costing 1 copy in vectorizable_assignment >> to counteract it, and ignore ncopies? > > I guess costing a single scalar_stmt ought to make it exactly offset > the scalar cost? To summarise what we said on IRC: the problem with that is that we need to count VF scalar stmts, where VF might be a runtime value. The follow-on loop costing code copes with variable VF without relying on vect_vf_for_cost. Calling vectorizable_assignment from the scalar costing code seemed like too much of a hack. And it turns out that we can't delay the scalar costing until after vect_analyze_stmts because vect_enhance_data_refs_alignment needs it before then. Reworking this whole thing is too much work for GCC 10 at this stage. So this patch goes with your suggestion of using a test based on tree_nop_conversion. To make sure that the scalar and vector costs stay somewhat consistent, vectorizable_assignment continues to cost stmts for which the new predicate is false. Tested as before. Thanks, Richard 2019-11-07 Richard Sandiford gcc/ * tree-vectorizer.h (vect_nop_conversion_p): Declare. * tree-vect-stmts.c (vect_nop_conversion_p): New function. (vectorizable_assignment): Don't add a cost for nop conversions. * tree-vect-loop.c (vect_compute_single_scalar_iteration_cost): Likewise. * tree-vect-slp.c (vect_bb_slp_scalar_cost): Likewise. Index: gcc/tree-vectorizer.h === --- gcc/tree-vectorizer.h 2019-11-07 15:11:22.290972236 + +++ gcc/tree-vectorizer.h 2019-11-07 16:32:14.817523866 + @@ -1654,6 +1654,7 @@ extern tree vect_get_vec_def_for_stmt_co extern bool vect_transform_stmt (stmt_vec_info, gimple_stmt_iterator *, slp_tree, slp_instance); extern void vect_remove_stores (stmt_vec_info); +extern bool vect_nop_conversion_p (stmt_vec_info); extern opt_result vect_analyze_stmt (stmt_vec_info, bool *, slp_tree, slp_instance, stmt_vector_for_cost *); extern void vect_get_load_cost (stmt_vec_info, int, bool, Index: gcc/tree-vect-stmts.c === --- gcc/tree-vect-stmts.c 2019-11-07 15:11:50.134775028 + +++ gcc/tree-vect-stmts.c 2019-11-07 16:32:14.817523866 + @@ -5284,6 +5284,29 @@ vectorizable_conversion (stmt_vec_info s return true; } +/* Return true if we can assume from the scalar form of STMT_INFO that + neither the scalar nor the vector forms will generate code. STMT_INFO + is known not to involve a data reference. */ + +bool +vect_nop_conversion_p (stmt_vec_info stmt_info) +{ + gassign *stmt = dyn_cast (stmt_info->stmt); + if (!stmt) +return false; + + tree lhs = gimple_assign_lhs (stmt); + tree_code code = gimple_assign_rhs_code (stmt); + tree rhs = gimple_assign_rhs1 (stmt); + + if (code == SSA_NAME || code == VIEW_CONVERT_EXPR) +return true; + + if (CONVERT_EXPR_CODE_P (code)) +return tree_nop_conversion_p (TREE_TYPE (lhs), TREE_TYPE (rhs)); + + return false; +} /* Function vectorizable_assignment. @@ -5399,7 +5422,9 @@ vectorizable_assignment (stmt_vec_info s { STMT_VINFO_TYPE (stmt_info) = assignment_vec_info_type; DUMP_VECT_SCOPE ("vectorizable_assignment"); - vect_model_simple_cost (stmt_info, ncopies, dt, ndts, slp_node, cost_vec); + if (!vect_nop_conversion_p (stmt_info)) + vect_model_simple_cost (stmt_info, ncopies, dt, ndts, slp_node, + cost_vec); return true; } Index: gcc/tree-vect-loop.c === --- gcc/tree-vect-loop.c2019-11-07 15:11:22.290972236 + +++ gcc/tree-vect-loop.c2019-11-07 16:32:14.813523891 + @@ -1126,7 +1126,9 @@ vect_compute_single_scalar_iteration_cos else kind = scalar_store; } - else + else if (vect_nop_conversion_p (stmt_info)) +
Re: [2/6] Don't assign a cost to vectorizable_assignment
On Wed, Nov 6, 2019 at 4:58 PM Richard Sandiford wrote: > > Richard Biener writes: > > On Tue, Nov 5, 2019 at 3:27 PM Richard Sandiford > > wrote: > >> > >> vectorizable_assignment handles true SSA-to-SSA copies (which hopefully > >> we don't see in practice) and no-op conversions that are required > >> to maintain correct gimple, such as changes between signed and > >> unsigned types. These cases shouldn't generate any code and so > >> shouldn't count against either the scalar or vector costs. > >> > >> Later patches test this, but it seemed worth splitting out. > > > > Hmm, but you have to adjust vect_compute_single_scalar_iteration_cost and > > possibly the SLP cost walk as well, otherwise we're artificially making > > those copies cheaper when vectorized. > > Ah, yeah. It looks complicated to reproduce the conditions exactly > there, so how about just costing 1 copy in vectorizable_assignment > to counteract it, and ignore ncopies? I guess costing a single scalar_stmt ought to make it exactly offset the scalar cost? > Seems like vectorizable_* ought to be costing the scalar code as > well as the vector code, but that's too much for GCC 10 at this stage. Yeah. Richard. > Thanks, > Richard > > > > > >> > >> 2019-11-04 Richard Sandiford > >> > >> gcc/ > >> * tree-vect-stmts.c (vectorizable_assignment): Don't add a cost. > >> > >> Index: gcc/tree-vect-stmts.c > >> === > >> --- gcc/tree-vect-stmts.c 2019-11-05 14:17:43.330141911 + > >> +++ gcc/tree-vect-stmts.c 2019-11-05 14:18:39.169752725 + > >> @@ -5305,7 +5305,7 @@ vectorizable_conversion (stmt_vec_info s > >> static bool > >> vectorizable_assignment (stmt_vec_info stmt_info, gimple_stmt_iterator > >> *gsi, > >> stmt_vec_info *vec_stmt, slp_tree slp_node, > >> -stmt_vector_for_cost *cost_vec) > >> +stmt_vector_for_cost *) > >> { > >>tree vec_dest; > >>tree scalar_dest; > >> @@ -5313,7 +5313,6 @@ vectorizable_assignment (stmt_vec_info s > >>loop_vec_info loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info); > >>tree new_temp; > >>enum vect_def_type dt[1] = {vect_unknown_def_type}; > >> - int ndts = 1; > >>int ncopies; > >>int i, j; > >>vec vec_oprnds = vNULL; > >> @@ -5409,7 +5408,8 @@ vectorizable_assignment (stmt_vec_info s > >> { > >>STMT_VINFO_TYPE (stmt_info) = assignment_vec_info_type; > >>DUMP_VECT_SCOPE ("vectorizable_assignment"); > >> - vect_model_simple_cost (stmt_info, ncopies, dt, ndts, slp_node, > >> cost_vec); > >> + /* Don't add a cost here. SSA copies and no-op conversions > >> +shouldn't generate any code in either scalar or vector form. */ > >>return true; > >> } > >>
Re: [2/6] Don't assign a cost to vectorizable_assignment
Richard Biener writes: > On Tue, Nov 5, 2019 at 3:27 PM Richard Sandiford > wrote: >> >> vectorizable_assignment handles true SSA-to-SSA copies (which hopefully >> we don't see in practice) and no-op conversions that are required >> to maintain correct gimple, such as changes between signed and >> unsigned types. These cases shouldn't generate any code and so >> shouldn't count against either the scalar or vector costs. >> >> Later patches test this, but it seemed worth splitting out. > > Hmm, but you have to adjust vect_compute_single_scalar_iteration_cost and > possibly the SLP cost walk as well, otherwise we're artificially making > those copies cheaper when vectorized. Ah, yeah. It looks complicated to reproduce the conditions exactly there, so how about just costing 1 copy in vectorizable_assignment to counteract it, and ignore ncopies? Seems like vectorizable_* ought to be costing the scalar code as well as the vector code, but that's too much for GCC 10 at this stage. Thanks, Richard > >> >> 2019-11-04 Richard Sandiford >> >> gcc/ >> * tree-vect-stmts.c (vectorizable_assignment): Don't add a cost. >> >> Index: gcc/tree-vect-stmts.c >> === >> --- gcc/tree-vect-stmts.c 2019-11-05 14:17:43.330141911 + >> +++ gcc/tree-vect-stmts.c 2019-11-05 14:18:39.169752725 + >> @@ -5305,7 +5305,7 @@ vectorizable_conversion (stmt_vec_info s >> static bool >> vectorizable_assignment (stmt_vec_info stmt_info, gimple_stmt_iterator *gsi, >> stmt_vec_info *vec_stmt, slp_tree slp_node, >> -stmt_vector_for_cost *cost_vec) >> +stmt_vector_for_cost *) >> { >>tree vec_dest; >>tree scalar_dest; >> @@ -5313,7 +5313,6 @@ vectorizable_assignment (stmt_vec_info s >>loop_vec_info loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info); >>tree new_temp; >>enum vect_def_type dt[1] = {vect_unknown_def_type}; >> - int ndts = 1; >>int ncopies; >>int i, j; >>vec vec_oprnds = vNULL; >> @@ -5409,7 +5408,8 @@ vectorizable_assignment (stmt_vec_info s >> { >>STMT_VINFO_TYPE (stmt_info) = assignment_vec_info_type; >>DUMP_VECT_SCOPE ("vectorizable_assignment"); >> - vect_model_simple_cost (stmt_info, ncopies, dt, ndts, slp_node, >> cost_vec); >> + /* Don't add a cost here. SSA copies and no-op conversions >> +shouldn't generate any code in either scalar or vector form. */ >>return true; >> } >>
Re: [2/6] Don't assign a cost to vectorizable_assignment
On Tue, Nov 5, 2019 at 3:27 PM Richard Sandiford wrote: > > vectorizable_assignment handles true SSA-to-SSA copies (which hopefully > we don't see in practice) and no-op conversions that are required > to maintain correct gimple, such as changes between signed and > unsigned types. These cases shouldn't generate any code and so > shouldn't count against either the scalar or vector costs. > > Later patches test this, but it seemed worth splitting out. Hmm, but you have to adjust vect_compute_single_scalar_iteration_cost and possibly the SLP cost walk as well, otherwise we're artificially making those copies cheaper when vectorized. > > 2019-11-04 Richard Sandiford > > gcc/ > * tree-vect-stmts.c (vectorizable_assignment): Don't add a cost. > > Index: gcc/tree-vect-stmts.c > === > --- gcc/tree-vect-stmts.c 2019-11-05 14:17:43.330141911 + > +++ gcc/tree-vect-stmts.c 2019-11-05 14:18:39.169752725 + > @@ -5305,7 +5305,7 @@ vectorizable_conversion (stmt_vec_info s > static bool > vectorizable_assignment (stmt_vec_info stmt_info, gimple_stmt_iterator *gsi, > stmt_vec_info *vec_stmt, slp_tree slp_node, > -stmt_vector_for_cost *cost_vec) > +stmt_vector_for_cost *) > { >tree vec_dest; >tree scalar_dest; > @@ -5313,7 +5313,6 @@ vectorizable_assignment (stmt_vec_info s >loop_vec_info loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info); >tree new_temp; >enum vect_def_type dt[1] = {vect_unknown_def_type}; > - int ndts = 1; >int ncopies; >int i, j; >vec vec_oprnds = vNULL; > @@ -5409,7 +5408,8 @@ vectorizable_assignment (stmt_vec_info s > { >STMT_VINFO_TYPE (stmt_info) = assignment_vec_info_type; >DUMP_VECT_SCOPE ("vectorizable_assignment"); > - vect_model_simple_cost (stmt_info, ncopies, dt, ndts, slp_node, > cost_vec); > + /* Don't add a cost here. SSA copies and no-op conversions > +shouldn't generate any code in either scalar or vector form. */ >return true; > } >
[2/6] Don't assign a cost to vectorizable_assignment
vectorizable_assignment handles true SSA-to-SSA copies (which hopefully we don't see in practice) and no-op conversions that are required to maintain correct gimple, such as changes between signed and unsigned types. These cases shouldn't generate any code and so shouldn't count against either the scalar or vector costs. Later patches test this, but it seemed worth splitting out. 2019-11-04 Richard Sandiford gcc/ * tree-vect-stmts.c (vectorizable_assignment): Don't add a cost. Index: gcc/tree-vect-stmts.c === --- gcc/tree-vect-stmts.c 2019-11-05 14:17:43.330141911 + +++ gcc/tree-vect-stmts.c 2019-11-05 14:18:39.169752725 + @@ -5305,7 +5305,7 @@ vectorizable_conversion (stmt_vec_info s static bool vectorizable_assignment (stmt_vec_info stmt_info, gimple_stmt_iterator *gsi, stmt_vec_info *vec_stmt, slp_tree slp_node, -stmt_vector_for_cost *cost_vec) +stmt_vector_for_cost *) { tree vec_dest; tree scalar_dest; @@ -5313,7 +5313,6 @@ vectorizable_assignment (stmt_vec_info s loop_vec_info loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info); tree new_temp; enum vect_def_type dt[1] = {vect_unknown_def_type}; - int ndts = 1; int ncopies; int i, j; vec vec_oprnds = vNULL; @@ -5409,7 +5408,8 @@ vectorizable_assignment (stmt_vec_info s { STMT_VINFO_TYPE (stmt_info) = assignment_vec_info_type; DUMP_VECT_SCOPE ("vectorizable_assignment"); - vect_model_simple_cost (stmt_info, ncopies, dt, ndts, slp_node, cost_vec); + /* Don't add a cost here. SSA copies and no-op conversions +shouldn't generate any code in either scalar or vector form. */ return true; }