Re: [073/nnn] poly_int: vectorizable_load/store

2017-12-05 Thread Jeff Law
On 10/23/2017 11:29 AM, Richard Sandiford wrote:
> This patch makes vectorizable_load and vectorizable_store cope with
> variable-length vectors.  The reverse and permute cases will be
> excluded by the code that checks the permutation mask (although a
> patch after the main SVE submission adds support for the reversed
> case).  Here we also need to exclude VMAT_ELEMENTWISE and
> VMAT_STRIDED_SLP, which split the operation up into a constant
> number of constant-sized operations.  We also don't try to extend
> the current widening gather/scatter support to variable-length
> vectors, since SVE uses a different approach.
> 
> 
> 2017-10-23  Richard Sandiford  
>   Alan Hayward  
>   David Sherwood  
> 
> gcc/
>   * tree-vect-stmts.c (get_load_store_type): Treat the number of
>   units as polynomial.  Reject VMAT_ELEMENTWISE and VMAT_STRIDED_SLP
>   for variable-length vectors.
>   (vectorizable_mask_load_store): Treat the number of units as
>   polynomial, asserting that it is constant if the condition has
>   already been enforced.
>   (vectorizable_store, vectorizable_load): Likewise.

OK.
jeff


[073/nnn] poly_int: vectorizable_load/store

2017-10-23 Thread Richard Sandiford
This patch makes vectorizable_load and vectorizable_store cope with
variable-length vectors.  The reverse and permute cases will be
excluded by the code that checks the permutation mask (although a
patch after the main SVE submission adds support for the reversed
case).  Here we also need to exclude VMAT_ELEMENTWISE and
VMAT_STRIDED_SLP, which split the operation up into a constant
number of constant-sized operations.  We also don't try to extend
the current widening gather/scatter support to variable-length
vectors, since SVE uses a different approach.
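
As background on the idiom used throughout the patch: a poly_uint64
represents a value of the form A + B * X, where X is known only at run
time, so plain == can no longer prove two sizes equal and the code
switches to must_eq, is_constant and to_constant.  The standalone sketch
below models this with a simplified two-coefficient type; it is an
illustration of the idea only, not GCC's actual poly-int.h.

/* Simplified stand-in for GCC's poly_uint64: value = coeff0 + coeff1 * X,
   where X is a runtime-only vector length multiplier.  Illustration only;
   the real implementation lives in gcc/poly-int.h.  */
#include <cassert>
#include <cstdint>
#include <cstdio>

struct poly_u64
{
  uint64_t coeff0;  /* compile-time constant part */
  uint64_t coeff1;  /* multiple of the runtime X; 0 means fully constant */
};

/* True only if the two values are equal for every possible X.  */
static bool
must_eq (poly_u64 a, poly_u64 b)
{
  return a.coeff0 == b.coeff0 && a.coeff1 == b.coeff1;
}

static bool
is_constant (poly_u64 a)
{
  return a.coeff1 == 0;
}

/* Only valid after an is_constant check, as in the patch.  */
static uint64_t
to_constant (poly_u64 a)
{
  assert (is_constant (a));
  return a.coeff0;
}

static poly_u64
mul (poly_u64 a, uint64_t s)
{
  poly_u64 r = { a.coeff0 * s, a.coeff1 * s };
  return r;
}

int
main ()
{
  poly_u64 fixed = { 4, 0 };     /* e.g. V4SI: always 4 units */
  poly_u64 scalable = { 4, 4 };  /* e.g. SVE VNx4SI: 4 + 4 * X units */

  /* The diff replaces tests like "nunits == gather_off_nunits * 2"
     with "must_eq (nunits, gather_off_nunits * 2)".  */
  printf ("must_eq (fixed, scalable)   = %d\n", must_eq (fixed, scalable));
  printf ("is_constant (scalable)      = %d\n", is_constant (scalable));
  printf ("to_constant (fixed)         = %llu\n",
          (unsigned long long) to_constant (fixed));
  printf ("must_eq (scalable, fixed*1) = %d\n",
          must_eq (scalable, mul (fixed, 1)));
  return 0;
}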


2017-10-23  Richard Sandiford  
Alan Hayward  
David Sherwood  

gcc/
* tree-vect-stmts.c (get_load_store_type): Treat the number of
units as polynomial.  Reject VMAT_ELEMENTWISE and VMAT_STRIDED_SLP
for variable-length vectors.
(vectorizable_mask_load_store): Treat the number of units as
polynomial, asserting that it is constant if the condition has
already been enforced.
(vectorizable_store, vectorizable_load): Likewise.
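
For reference, the WIDEN and NARROW branches in the diff below build
their permutation selectors with simple index arithmetic.  This
standalone program (illustration only, not GCC code) reproduces the two
selector formulas for a fixed element count:

#include <cstdio>
#include <vector>

int
main ()
{
  const int count = 8;  /* stands in for the (fixed) element count */

  /* WIDEN: sel[i] = i | (count / 2) selects the upper half of the
     offset vector twice: 4 5 6 7 4 5 6 7 for count == 8.  */
  std::vector<int> widen_sel;
  for (int i = 0; i < count; ++i)
    widen_sel.push_back (i | (count / 2));

  /* NARROW: i < count / 2 ? i : i + count / 2 keeps the low half of
     each of two concatenated vectors: 0 1 2 3 8 9 10 11 for count == 8.  */
  std::vector<int> narrow_sel (count);
  for (int i = 0; i < count; ++i)
    narrow_sel[i] = i < count / 2 ? i : i + count / 2;

  printf ("WIDEN  selector:");
  for (int v : widen_sel)
    printf (" %d", v);
  printf ("\nNARROW selector:");
  for (int v : narrow_sel)
    printf (" %d", v);
  printf ("\n");
  return 0;
}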

Index: gcc/tree-vect-stmts.c
===================================================================
--- gcc/tree-vect-stmts.c   2017-10-23 17:22:32.730226813 +0100
+++ gcc/tree-vect-stmts.c   2017-10-23 17:22:38.938582823 +0100
@@ -1955,6 +1955,7 @@ get_load_store_type (gimple *stmt, tree
   stmt_vec_info stmt_info = vinfo_for_stmt (stmt);
   vec_info *vinfo = stmt_info->vinfo;
   loop_vec_info loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info);
+  poly_uint64 nunits = TYPE_VECTOR_SUBPARTS (vectype);
   if (STMT_VINFO_GATHER_SCATTER_P (stmt_info))
 {
   *memory_access_type = VMAT_GATHER_SCATTER;
@@ -1998,6 +1999,17 @@ get_load_store_type (gimple *stmt, tree
*memory_access_type = VMAT_CONTIGUOUS;
 }
 
+  if ((*memory_access_type == VMAT_ELEMENTWISE
+   || *memory_access_type == VMAT_STRIDED_SLP)
+  && !nunits.is_constant ())
+{
+  if (dump_enabled_p ())
+   dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
+                    "Not using elementwise accesses due to variable "
+                    "vectorization factor.\n");
+  return false;
+}
+
   /* FIXME: At the moment the cost model seems to underestimate the
  cost of using elementwise accesses.  This check preserves the
  traditional behavior until that can be fixed.  */
@@ -2038,7 +2050,7 @@ vectorizable_mask_load_store (gimple *st
   tree dummy;
   tree dataref_ptr = NULL_TREE;
   gimple *ptr_incr;
-  int nunits = TYPE_VECTOR_SUBPARTS (vectype);
+  poly_uint64 nunits = TYPE_VECTOR_SUBPARTS (vectype);
   int ncopies;
   int i, j;
   bool inv_p;
@@ -2168,7 +2180,8 @@ vectorizable_mask_load_store (gimple *st
   gimple_seq seq;
   basic_block new_bb;
   enum { NARROW, NONE, WIDEN } modifier;
-  int gather_off_nunits = TYPE_VECTOR_SUBPARTS (gs_info.offset_vectype);
+  poly_uint64 gather_off_nunits
+   = TYPE_VECTOR_SUBPARTS (gs_info.offset_vectype);
 
   rettype = TREE_TYPE (TREE_TYPE (gs_info.decl));
   srctype = TREE_VALUE (arglist); arglist = TREE_CHAIN (arglist);
@@ -2179,32 +2192,37 @@ vectorizable_mask_load_store (gimple *st
   gcc_checking_assert (types_compatible_p (srctype, rettype)
   && types_compatible_p (srctype, masktype));
 
-  if (nunits == gather_off_nunits)
+  if (must_eq (nunits, gather_off_nunits))
modifier = NONE;
-  else if (nunits == gather_off_nunits / 2)
+  else if (must_eq (nunits * 2, gather_off_nunits))
{
  modifier = WIDEN;
 
- auto_vec_perm_indices sel (gather_off_nunits);
- for (i = 0; i < gather_off_nunits; ++i)
-   sel.quick_push (i | nunits);
+ /* Currently widening gathers and scatters are only supported for
+    fixed-length vectors.  */
+ int count = gather_off_nunits.to_constant ();
+ auto_vec_perm_indices sel (count);
+ for (i = 0; i < count; ++i)
+   sel.quick_push (i | (count / 2));
 
  perm_mask = vect_gen_perm_mask_checked (gs_info.offset_vectype, sel);
}
-  else if (nunits == gather_off_nunits * 2)
+  else if (must_eq (nunits, gather_off_nunits * 2))
{
  modifier = NARROW;
 
- auto_vec_perm_indices sel (nunits);
- sel.quick_grow (nunits);
- for (i = 0; i < nunits; ++i)
-   sel[i] = i < gather_off_nunits
-            ? i : i + nunits - gather_off_nunits;
+ /* Currently narrowing gathers and scatters are only supported for
+    fixed-length vectors.  */
+ int count = nunits.to_constant ();
+ auto_vec_perm_indices sel (count);
+ sel.quick_grow (count);
+ for (i = 0; i < count; ++i)
+   sel[i] = i < count / 2 ? i : i + count / 2;
 
  perm_mask = vect_gen_perm_mask_checked (vectype, sel);
  ncopies *= 2;