Hi,

I'm trying to enforce SLP_TREE_VECTYPE being set (correctly) on
external and invariant SLP nodes to avoid (re-)computing that
in other places.  The responsible entity for specifying the
desired vector type is vectorizable_* - vectorization analysis
of the user (when we start to share those nodes between multiple
users a user can then also unshare such a node if an existing
vector type conflicts with its own requirements).

The previous patch 
https://gcc.gnu.org/pipermail/gcc-patches/2020-May/545938.html
set up things so that vect_slp_analyze_node_operations would
use such type information during costing and also pick up
unshared nodes.

Now as you can see below where I started to enforce SLP_TREE_VECTYPE
for the special boolean case in vect_get_constant_vectors I seemingly
only had to fixup two places, vectorizable_operation (saw bool & bool)
and vectorizable_comparison doing testing on x86_64 and aarch64
(cross cc1 on aarch64-sve.exp and vect.exp).  There obviously will
be more places and I was thinking of less ugly (and more correct?)
ways than doing

+         FOR_EACH_VEC_ELT (SLP_TREE_CHILDREN (slp_node), i, child)
+           if (SLP_TREE_DEF_TYPE (child) != vect_internal_def)
+             SLP_TREE_VECTYPE (child) = vectype;

in each vectorizable_*.  In principle it is the vect_is_simple_use
that would need adjustment to associate the correct SLP child with
a vectype so I can't think of sth better than refactoring things
to sth like

  vect_simple_uses (vinfo, stmt_info, slp_node, &dts, &vectypes)

and process all operans at once transparently for stmt_info/slp_node
(where the actual vect_is_simple_use check is redundant for SLP
nodes anyway).  The index into the array would then match up
with the SLP child (if SLP) or the stmt operand (if not SLP - with
SLP the order can be different on the stmt).

Any better ideas or comments on the general direction?  Richard,
which vectorizable_* do you remember being the "worst" with respect
to vector type determining for invariants?

Thanks,
Richard.


---
 gcc/tree-vect-slp.c   | 17 +++++++++++++++--
 gcc/tree-vect-stmts.c | 22 ++++++++++++++++++++++
 2 files changed, 37 insertions(+), 2 deletions(-)

diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c
index 69a2002717f..32e0b6677b3 100644
--- a/gcc/tree-vect-slp.c
+++ b/gcc/tree-vect-slp.c
@@ -3640,9 +3640,22 @@ vect_get_constant_vectors (vec_info *vinfo,
   tree stmt_vectype = STMT_VINFO_VECTYPE (stmt_vinfo);
   if (VECT_SCALAR_BOOLEAN_TYPE_P (TREE_TYPE (op))
       && vect_mask_constant_operand_p (vinfo, stmt_vinfo, op_num))
-    vector_type = truth_type_for (stmt_vectype);
+    {
+      /* We always want SLP_TREE_VECTYPE (op_node) here correctly set
+        as first step.  */
+      vector_type = SLP_TREE_VECTYPE (op_node);
+      gcc_assert (vector_type
+                 && types_compatible_p (vector_type,
+                                        truth_type_for (stmt_vectype)));
+    }
   else
-    vector_type = get_vectype_for_scalar_type (vinfo, TREE_TYPE (op), op_node);
+    {
+      vector_type = get_vectype_for_scalar_type (vinfo,
+                                                TREE_TYPE (op), op_node);
+      gcc_assert (!SLP_TREE_VECTYPE (op_node)
+                 || types_compatible_p (SLP_TREE_VECTYPE (op_node),
+                                        vector_type));
+    }
 
   poly_uint64 vf = 1;
   if (loop_vec_info loop_vinfo = dyn_cast <loop_vec_info> (vinfo))
diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
index 94f233e2e27..c21fba84ec1 100644
--- a/gcc/tree-vect-stmts.c
+++ b/gcc/tree-vect-stmts.c
@@ -6169,6 +6169,15 @@ vectorizable_operation (vec_info *vinfo,
                                   vectype, NULL);
        }
 
+      /* Put types on constant and invariant SLP children.  */
+      if (slp_node)
+       {
+         slp_tree child;
+         FOR_EACH_VEC_ELT (SLP_TREE_CHILDREN (slp_node), i, child)
+           if (SLP_TREE_DEF_TYPE (child) != vect_internal_def)
+             SLP_TREE_VECTYPE (child) = vectype;
+       }
+
       STMT_VINFO_TYPE (stmt_info) = op_vec_info_type;
       DUMP_VECT_SCOPE ("vectorizable_operation");
       vect_model_simple_cost (vinfo, stmt_info,
@@ -10654,6 +10663,19 @@ vectorizable_comparison (vec_info *vinfo,
            }
        }
 
+      /* Ideally vect_is_simple_use would return the corresponding SLP
+        nodes as well.  But a reorg of that is for when we make SLP
+        only reality.  */
+      if ((!vectype1 || !vectype2) && slp_node)
+       for (unsigned i = 0; i < 2; ++i)
+         {
+           slp_tree child = SLP_TREE_CHILDREN (slp_node)[i];
+           if (SLP_TREE_DEF_TYPE (child) != vect_internal_def)
+             /* ???  If we share invariant nodes look for existing
+                vectype and on mismatch unshare the child node.  */
+             SLP_TREE_VECTYPE (child) = vectype;
+         }
+
       STMT_VINFO_TYPE (stmt_info) = comparison_vec_info_type;
       vect_model_simple_cost (vinfo, stmt_info,
                              ncopies * (1 + (bitop2 != NOP_EXPR)),
-- 
2.26.1

Reply via email to