Re: [2/6] Don't assign a cost to vectorizable_assignment

2019-11-08 Thread Richard Biener
On Thu, Nov 7, 2019 at 5:40 PM Richard Sandiford
 wrote:
>
> Richard Biener  writes:
> > On Wed, Nov 6, 2019 at 4:58 PM Richard Sandiford
> >  wrote:
> >>
> >> Richard Biener  writes:
> >> > On Tue, Nov 5, 2019 at 3:27 PM Richard Sandiford
> >> >  wrote:
> >> >>
> >> >> vectorizable_assignment handles true SSA-to-SSA copies (which hopefully
> >> >> we don't see in practice) and no-op conversions that are required
> >> >> to maintain correct gimple, such as changes between signed and
> >> >> unsigned types.  These cases shouldn't generate any code and so
> >> >> shouldn't count against either the scalar or vector costs.
> >> >>
> >> >> Later patches test this, but it seemed worth splitting out.
> >> >
> >> > Hmm, but you have to adjust vect_compute_single_scalar_iteration_cost and
> >> > possibly the SLP cost walk as well, otherwise we're artificially making
> >> > those copies cheaper when vectorized.
> >>
> >> Ah, yeah.  It looks complicated to reproduce the conditions exactly
> >> there, so how about just costing 1 copy in vectorizable_assignment
> >> to counteract it, and ignore ncopies?
> >
> > I guess costing a single scalar_stmt ought to make it exactly offset
> > the scalar cost?
>
> To summarise what we said on IRC: the problem with that is that we
> need to count VF scalar stmts, where VF might be a runtime value.
> The follow-on loop costing code copes with variable VF without
> relying on vect_vf_for_cost.
>
> Calling vectorizable_assignment from the scalar costing code
> seemed like too much of a hack.  And it turns out that we can't
> delay the scalar costing until after vect_analyze_stmts because
> vect_enhance_data_refs_alignment needs it before then.  Reworking
> this whole thing is too much work for GCC 10 at this stage.
>
> So this patch goes with your suggestion of using a test based on
> tree_nop_conversion.  To make sure that the scalar and vector costs
> stay somewhat consistent, vectorizable_assignment continues to cost
> stmts for which the new predicate is false.
>
> Tested as before.

OK.

thanks,
Richard.

> Thanks,
> Richard
>
>
> 2019-11-07  Richard Sandiford  
>
> gcc/
> * tree-vectorizer.h (vect_nop_conversion_p): Declare.
> * tree-vect-stmts.c (vect_nop_conversion_p): New function.
> (vectorizable_assignment): Don't add a cost for nop conversions.
> * tree-vect-loop.c (vect_compute_single_scalar_iteration_cost):
> Likewise.
> * tree-vect-slp.c (vect_bb_slp_scalar_cost): Likewise.
>
> Index: gcc/tree-vectorizer.h
> ===
> --- gcc/tree-vectorizer.h   2019-11-07 15:11:22.290972236 +
> +++ gcc/tree-vectorizer.h   2019-11-07 16:32:14.817523866 +
> @@ -1654,6 +1654,7 @@ extern tree vect_get_vec_def_for_stmt_co
>  extern bool vect_transform_stmt (stmt_vec_info, gimple_stmt_iterator *,
>  slp_tree, slp_instance);
>  extern void vect_remove_stores (stmt_vec_info);
> +extern bool vect_nop_conversion_p (stmt_vec_info);
>  extern opt_result vect_analyze_stmt (stmt_vec_info, bool *, slp_tree,
>  slp_instance, stmt_vector_for_cost *);
>  extern void vect_get_load_cost (stmt_vec_info, int, bool,
> Index: gcc/tree-vect-stmts.c
> ===
> --- gcc/tree-vect-stmts.c   2019-11-07 15:11:50.134775028 +
> +++ gcc/tree-vect-stmts.c   2019-11-07 16:32:14.817523866 +
> @@ -5284,6 +5284,29 @@ vectorizable_conversion (stmt_vec_info s
>return true;
>  }
>
> +/* Return true if we can assume from the scalar form of STMT_INFO that
> +   neither the scalar nor the vector forms will generate code.  STMT_INFO
> +   is known not to involve a data reference.  */
> +
> +bool
> +vect_nop_conversion_p (stmt_vec_info stmt_info)
> +{
> +  gassign *stmt = dyn_cast  (stmt_info->stmt);
> +  if (!stmt)
> +return false;
> +
> +  tree lhs = gimple_assign_lhs (stmt);
> +  tree_code code = gimple_assign_rhs_code (stmt);
> +  tree rhs = gimple_assign_rhs1 (stmt);
> +
> +  if (code == SSA_NAME || code == VIEW_CONVERT_EXPR)
> +return true;
> +
> +  if (CONVERT_EXPR_CODE_P (code))
> +return tree_nop_conversion_p (TREE_TYPE (lhs), TREE_TYPE (rhs));
> +
> +  return false;
> +}
>
>  /* Function vectorizable_assignment.
>
> @@ -5399,7 +5422,9 @@ vectorizable_assignment (stmt_vec_info s
>  {
>STMT_VINFO_TYPE (stmt_info) = assignment_vec_info_type;
>DUMP_VECT_SCOPE ("vectorizable_assignment");
> -  vect_model_simple_cost (stmt_info, ncopies, dt, ndts, slp_node, 
> cost_vec);
> +  if (!vect_nop_conversion_p (stmt_info))
> +   vect_model_simple_cost (stmt_info, ncopies, dt, ndts, slp_node,
> +   cost_vec);
>return true;
>  }
>
> Index: gcc/tree-vect-loop.c
> ===
> --- gcc/tree-vect-loop.c2019-11-07 

Re: [2/6] Don't assign a cost to vectorizable_assignment

2019-11-07 Thread Richard Sandiford
Richard Biener  writes:
> On Wed, Nov 6, 2019 at 4:58 PM Richard Sandiford
>  wrote:
>>
>> Richard Biener  writes:
>> > On Tue, Nov 5, 2019 at 3:27 PM Richard Sandiford
>> >  wrote:
>> >>
>> >> vectorizable_assignment handles true SSA-to-SSA copies (which hopefully
>> >> we don't see in practice) and no-op conversions that are required
>> >> to maintain correct gimple, such as changes between signed and
>> >> unsigned types.  These cases shouldn't generate any code and so
>> >> shouldn't count against either the scalar or vector costs.
>> >>
>> >> Later patches test this, but it seemed worth splitting out.
>> >
>> > Hmm, but you have to adjust vect_compute_single_scalar_iteration_cost and
>> > possibly the SLP cost walk as well, otherwise we're artificially making
>> > those copies cheaper when vectorized.
>>
>> Ah, yeah.  It looks complicated to reproduce the conditions exactly
>> there, so how about just costing 1 copy in vectorizable_assignment
>> to counteract it, and ignore ncopies?
>
> I guess costing a single scalar_stmt ought to make it exactly offset
> the scalar cost?

To summarise what we said on IRC: the problem with that is that we
need to count VF scalar stmts, where VF might be a runtime value.
The follow-on loop costing code copes with variable VF without
relying on vect_vf_for_cost.

Calling vectorizable_assignment from the scalar costing code
seemed like too much of a hack.  And it turns out that we can't
delay the scalar costing until after vect_analyze_stmts because
vect_enhance_data_refs_alignment needs it before then.  Reworking
this whole thing is too much work for GCC 10 at this stage.

So this patch goes with your suggestion of using a test based on
tree_nop_conversion.  To make sure that the scalar and vector costs
stay somewhat consistent, vectorizable_assignment continues to cost
stmts for which the new predicate is false.

Tested as before.

Thanks,
Richard


2019-11-07  Richard Sandiford  

gcc/
* tree-vectorizer.h (vect_nop_conversion_p): Declare.
* tree-vect-stmts.c (vect_nop_conversion_p): New function.
(vectorizable_assignment): Don't add a cost for nop conversions.
* tree-vect-loop.c (vect_compute_single_scalar_iteration_cost):
Likewise.
* tree-vect-slp.c (vect_bb_slp_scalar_cost): Likewise.

Index: gcc/tree-vectorizer.h
===
--- gcc/tree-vectorizer.h   2019-11-07 15:11:22.290972236 +
+++ gcc/tree-vectorizer.h   2019-11-07 16:32:14.817523866 +
@@ -1654,6 +1654,7 @@ extern tree vect_get_vec_def_for_stmt_co
 extern bool vect_transform_stmt (stmt_vec_info, gimple_stmt_iterator *,
 slp_tree, slp_instance);
 extern void vect_remove_stores (stmt_vec_info);
+extern bool vect_nop_conversion_p (stmt_vec_info);
 extern opt_result vect_analyze_stmt (stmt_vec_info, bool *, slp_tree,
 slp_instance, stmt_vector_for_cost *);
 extern void vect_get_load_cost (stmt_vec_info, int, bool,
Index: gcc/tree-vect-stmts.c
===
--- gcc/tree-vect-stmts.c   2019-11-07 15:11:50.134775028 +
+++ gcc/tree-vect-stmts.c   2019-11-07 16:32:14.817523866 +
@@ -5284,6 +5284,29 @@ vectorizable_conversion (stmt_vec_info s
   return true;
 }
 
+/* Return true if we can assume from the scalar form of STMT_INFO that
+   neither the scalar nor the vector forms will generate code.  STMT_INFO
+   is known not to involve a data reference.  */
+
+bool
+vect_nop_conversion_p (stmt_vec_info stmt_info)
+{
+  gassign *stmt = dyn_cast  (stmt_info->stmt);
+  if (!stmt)
+return false;
+
+  tree lhs = gimple_assign_lhs (stmt);
+  tree_code code = gimple_assign_rhs_code (stmt);
+  tree rhs = gimple_assign_rhs1 (stmt);
+
+  if (code == SSA_NAME || code == VIEW_CONVERT_EXPR)
+return true;
+
+  if (CONVERT_EXPR_CODE_P (code))
+return tree_nop_conversion_p (TREE_TYPE (lhs), TREE_TYPE (rhs));
+
+  return false;
+}
 
 /* Function vectorizable_assignment.
 
@@ -5399,7 +5422,9 @@ vectorizable_assignment (stmt_vec_info s
 {
   STMT_VINFO_TYPE (stmt_info) = assignment_vec_info_type;
   DUMP_VECT_SCOPE ("vectorizable_assignment");
-  vect_model_simple_cost (stmt_info, ncopies, dt, ndts, slp_node, 
cost_vec);
+  if (!vect_nop_conversion_p (stmt_info))
+   vect_model_simple_cost (stmt_info, ncopies, dt, ndts, slp_node,
+   cost_vec);
   return true;
 }
 
Index: gcc/tree-vect-loop.c
===
--- gcc/tree-vect-loop.c2019-11-07 15:11:22.290972236 +
+++ gcc/tree-vect-loop.c2019-11-07 16:32:14.813523891 +
@@ -1126,7 +1126,9 @@ vect_compute_single_scalar_iteration_cos
  else
kind = scalar_store;
 }
-  else
+ else if (vect_nop_conversion_p (stmt_info))
+   

Re: [2/6] Don't assign a cost to vectorizable_assignment

2019-11-07 Thread Richard Biener
On Wed, Nov 6, 2019 at 4:58 PM Richard Sandiford
 wrote:
>
> Richard Biener  writes:
> > On Tue, Nov 5, 2019 at 3:27 PM Richard Sandiford
> >  wrote:
> >>
> >> vectorizable_assignment handles true SSA-to-SSA copies (which hopefully
> >> we don't see in practice) and no-op conversions that are required
> >> to maintain correct gimple, such as changes between signed and
> >> unsigned types.  These cases shouldn't generate any code and so
> >> shouldn't count against either the scalar or vector costs.
> >>
> >> Later patches test this, but it seemed worth splitting out.
> >
> > Hmm, but you have to adjust vect_compute_single_scalar_iteration_cost and
> > possibly the SLP cost walk as well, otherwise we're artificially making
> > those copies cheaper when vectorized.
>
> Ah, yeah.  It looks complicated to reproduce the conditions exactly
> there, so how about just costing 1 copy in vectorizable_assignment
> to counteract it, and ignore ncopies?

I guess costing a single scalar_stmt ought to make it exactly offset
the scalar cost?

> Seems like vectorizable_* ought to be costing the scalar code as
> well as the vector code, but that's too much for GCC 10 at this stage.

Yeah.

Richard.

> Thanks,
> Richard
>
>
> >
> >>
> >> 2019-11-04  Richard Sandiford  
> >>
> >> gcc/
> >> * tree-vect-stmts.c (vectorizable_assignment): Don't add a cost.
> >>
> >> Index: gcc/tree-vect-stmts.c
> >> ===
> >> --- gcc/tree-vect-stmts.c   2019-11-05 14:17:43.330141911 +
> >> +++ gcc/tree-vect-stmts.c   2019-11-05 14:18:39.169752725 +
> >> @@ -5305,7 +5305,7 @@ vectorizable_conversion (stmt_vec_info s
> >>  static bool
> >>  vectorizable_assignment (stmt_vec_info stmt_info, gimple_stmt_iterator 
> >> *gsi,
> >>  stmt_vec_info *vec_stmt, slp_tree slp_node,
> >> -stmt_vector_for_cost *cost_vec)
> >> +stmt_vector_for_cost *)
> >>  {
> >>tree vec_dest;
> >>tree scalar_dest;
> >> @@ -5313,7 +5313,6 @@ vectorizable_assignment (stmt_vec_info s
> >>loop_vec_info loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info);
> >>tree new_temp;
> >>enum vect_def_type dt[1] = {vect_unknown_def_type};
> >> -  int ndts = 1;
> >>int ncopies;
> >>int i, j;
> >>vec vec_oprnds = vNULL;
> >> @@ -5409,7 +5408,8 @@ vectorizable_assignment (stmt_vec_info s
> >>  {
> >>STMT_VINFO_TYPE (stmt_info) = assignment_vec_info_type;
> >>DUMP_VECT_SCOPE ("vectorizable_assignment");
> >> -  vect_model_simple_cost (stmt_info, ncopies, dt, ndts, slp_node, 
> >> cost_vec);
> >> +  /* Don't add a cost here.  SSA copies and no-op conversions
> >> +shouldn't generate any code in either scalar or vector form.  */
> >>return true;
> >>  }
> >>


Re: [2/6] Don't assign a cost to vectorizable_assignment

2019-11-06 Thread Richard Sandiford
Richard Biener  writes:
> On Tue, Nov 5, 2019 at 3:27 PM Richard Sandiford
>  wrote:
>>
>> vectorizable_assignment handles true SSA-to-SSA copies (which hopefully
>> we don't see in practice) and no-op conversions that are required
>> to maintain correct gimple, such as changes between signed and
>> unsigned types.  These cases shouldn't generate any code and so
>> shouldn't count against either the scalar or vector costs.
>>
>> Later patches test this, but it seemed worth splitting out.
>
> Hmm, but you have to adjust vect_compute_single_scalar_iteration_cost and
> possibly the SLP cost walk as well, otherwise we're artificially making
> those copies cheaper when vectorized.

Ah, yeah.  It looks complicated to reproduce the conditions exactly
there, so how about just costing 1 copy in vectorizable_assignment
to counteract it, and ignore ncopies?

Seems like vectorizable_* ought to be costing the scalar code as
well as the vector code, but that's too much for GCC 10 at this stage.

Thanks,
Richard


>
>>
>> 2019-11-04  Richard Sandiford  
>>
>> gcc/
>> * tree-vect-stmts.c (vectorizable_assignment): Don't add a cost.
>>
>> Index: gcc/tree-vect-stmts.c
>> ===
>> --- gcc/tree-vect-stmts.c   2019-11-05 14:17:43.330141911 +
>> +++ gcc/tree-vect-stmts.c   2019-11-05 14:18:39.169752725 +
>> @@ -5305,7 +5305,7 @@ vectorizable_conversion (stmt_vec_info s
>>  static bool
>>  vectorizable_assignment (stmt_vec_info stmt_info, gimple_stmt_iterator *gsi,
>>  stmt_vec_info *vec_stmt, slp_tree slp_node,
>> -stmt_vector_for_cost *cost_vec)
>> +stmt_vector_for_cost *)
>>  {
>>tree vec_dest;
>>tree scalar_dest;
>> @@ -5313,7 +5313,6 @@ vectorizable_assignment (stmt_vec_info s
>>loop_vec_info loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info);
>>tree new_temp;
>>enum vect_def_type dt[1] = {vect_unknown_def_type};
>> -  int ndts = 1;
>>int ncopies;
>>int i, j;
>>vec vec_oprnds = vNULL;
>> @@ -5409,7 +5408,8 @@ vectorizable_assignment (stmt_vec_info s
>>  {
>>STMT_VINFO_TYPE (stmt_info) = assignment_vec_info_type;
>>DUMP_VECT_SCOPE ("vectorizable_assignment");
>> -  vect_model_simple_cost (stmt_info, ncopies, dt, ndts, slp_node, 
>> cost_vec);
>> +  /* Don't add a cost here.  SSA copies and no-op conversions
>> +shouldn't generate any code in either scalar or vector form.  */
>>return true;
>>  }
>>


Re: [2/6] Don't assign a cost to vectorizable_assignment

2019-11-06 Thread Richard Biener
On Tue, Nov 5, 2019 at 3:27 PM Richard Sandiford
 wrote:
>
> vectorizable_assignment handles true SSA-to-SSA copies (which hopefully
> we don't see in practice) and no-op conversions that are required
> to maintain correct gimple, such as changes between signed and
> unsigned types.  These cases shouldn't generate any code and so
> shouldn't count against either the scalar or vector costs.
>
> Later patches test this, but it seemed worth splitting out.

Hmm, but you have to adjust vect_compute_single_scalar_iteration_cost and
possibly the SLP cost walk as well, otherwise we're artificially making
those copies cheaper when vectorized.

>
> 2019-11-04  Richard Sandiford  
>
> gcc/
> * tree-vect-stmts.c (vectorizable_assignment): Don't add a cost.
>
> Index: gcc/tree-vect-stmts.c
> ===
> --- gcc/tree-vect-stmts.c   2019-11-05 14:17:43.330141911 +
> +++ gcc/tree-vect-stmts.c   2019-11-05 14:18:39.169752725 +
> @@ -5305,7 +5305,7 @@ vectorizable_conversion (stmt_vec_info s
>  static bool
>  vectorizable_assignment (stmt_vec_info stmt_info, gimple_stmt_iterator *gsi,
>  stmt_vec_info *vec_stmt, slp_tree slp_node,
> -stmt_vector_for_cost *cost_vec)
> +stmt_vector_for_cost *)
>  {
>tree vec_dest;
>tree scalar_dest;
> @@ -5313,7 +5313,6 @@ vectorizable_assignment (stmt_vec_info s
>loop_vec_info loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info);
>tree new_temp;
>enum vect_def_type dt[1] = {vect_unknown_def_type};
> -  int ndts = 1;
>int ncopies;
>int i, j;
>vec vec_oprnds = vNULL;
> @@ -5409,7 +5408,8 @@ vectorizable_assignment (stmt_vec_info s
>  {
>STMT_VINFO_TYPE (stmt_info) = assignment_vec_info_type;
>DUMP_VECT_SCOPE ("vectorizable_assignment");
> -  vect_model_simple_cost (stmt_info, ncopies, dt, ndts, slp_node, 
> cost_vec);
> +  /* Don't add a cost here.  SSA copies and no-op conversions
> +shouldn't generate any code in either scalar or vector form.  */
>return true;
>  }
>


[2/6] Don't assign a cost to vectorizable_assignment

2019-11-05 Thread Richard Sandiford
vectorizable_assignment handles true SSA-to-SSA copies (which hopefully
we don't see in practice) and no-op conversions that are required
to maintain correct gimple, such as changes between signed and
unsigned types.  These cases shouldn't generate any code and so
shouldn't count against either the scalar or vector costs.

Later patches test this, but it seemed worth splitting out.


2019-11-04  Richard Sandiford  

gcc/
* tree-vect-stmts.c (vectorizable_assignment): Don't add a cost.

Index: gcc/tree-vect-stmts.c
===
--- gcc/tree-vect-stmts.c   2019-11-05 14:17:43.330141911 +
+++ gcc/tree-vect-stmts.c   2019-11-05 14:18:39.169752725 +
@@ -5305,7 +5305,7 @@ vectorizable_conversion (stmt_vec_info s
 static bool
 vectorizable_assignment (stmt_vec_info stmt_info, gimple_stmt_iterator *gsi,
 stmt_vec_info *vec_stmt, slp_tree slp_node,
-stmt_vector_for_cost *cost_vec)
+stmt_vector_for_cost *)
 {
   tree vec_dest;
   tree scalar_dest;
@@ -5313,7 +5313,6 @@ vectorizable_assignment (stmt_vec_info s
   loop_vec_info loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info);
   tree new_temp;
   enum vect_def_type dt[1] = {vect_unknown_def_type};
-  int ndts = 1;
   int ncopies;
   int i, j;
   vec vec_oprnds = vNULL;
@@ -5409,7 +5408,8 @@ vectorizable_assignment (stmt_vec_info s
 {
   STMT_VINFO_TYPE (stmt_info) = assignment_vec_info_type;
   DUMP_VECT_SCOPE ("vectorizable_assignment");
-  vect_model_simple_cost (stmt_info, ncopies, dt, ndts, slp_node, 
cost_vec);
+  /* Don't add a cost here.  SSA copies and no-op conversions
+shouldn't generate any code in either scalar or vector form.  */
   return true;
 }