On Wed, Jun 3, 2026 at 5:20 PM Christopher Bazley <[email protected]> wrote:
>
> Add two new fields to SLP tree nodes, which are accessed as
> SLP_TREE_CAN_USE_PARTIAL_VECTORS_P and SLP_TREE_PARTIAL_VECTORS_STYLE.
>
> SLP_TREE_CAN_USE_PARTIAL_VECTORS_P is analogous to the existing
> predicate LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P. It is initialized to
> true. This flag just records whether the target could vectorize a
> node using a partial vector; it does not say anything about
> whether the vector actually is partial, or how the target would support
> use of a partial vector. Some kinds of node require mask/length for
> partial vectors; others don't. In the latter case (e.g., for add
> operations), SLP_TREE_CAN_USE_PARTIAL_VECTORS_P will remain true.
>
> SLP_TREE_PARTIAL_VECTORS_STYLE is analogous to the existing field
> LOOP_VINFO_PARTIAL_VECTORS_STYLE. Both are initialized to 'none'.
> The vect_partial_vectors_avx512 enumerator is not used for BB SLP.
> Unlike loop vectorization, a different style of partial vectors can be
> chosen for each node during analysis of that node.
>
> Implement the recently-introduced wrapper functions,
> vect_record_(len|mask), for BB SLP by setting
> SLP_TREE_PARTIAL_VECTORS_STYLE to indicate that a mask or length should
> be used for a given SLP node. The passed-in vec_info is ignored.
>
> Implement the vect_fully_(masked|with_length)_p wrapper functions for
> BB SLP by checking the SLP_TREE_PARTIAL_VECTORS_STYLE. This should be
> sufficient because at most one of vect_record_(len|mask) and
> vect_cannot_use_partial_vectors is expected to be called for any
> given SLP node. SLP_TREE_CAN_USE_PARTIAL_VECTORS_P should be true if
> the style is not 'none', but its value isn't used beyond the analysis
> phase.
>
> The implementations of vect_get_mask and vect_get_len for BB SLP are
> non-trivial (albeit simpler than for loop vectorization), therefore they
> are delegated to SLP-specific functions defined in tree-vect-slp.cc.
>
> Implement the vect_cannot_use_partial_vectors wrapper function by
> setting the SLP_TREE_CAN_USE_PARTIAL_VECTORS_P flag to false.
> To prevent regressions, vect_can_use_partial_vectors_p still returns
> false for BB SLP regardless (for now). This prevents vect_record_mask
> or vect_record_len from being called.
>
> gcc/ChangeLog:
>
>         * tree-vect-slp.cc (_slp_tree::_slp_tree): Initialize new
>         partial_vector_style, can_use_partial_vectors and
>         num_partial_vectors members.
>         (vect_slp_analyze_node_operations): Account for worst-case
>         prologue costs of per-node partial-vector mask or length
>         materialisation.
>         (vect_slp_record_bb_style): Set the partial vector style of an
>         SLP node, checking that the style does not flip-flop between mask
>         and length.
>         (vect_slp_record_bb_mask): Use vect_slp_record_bb_style to set
>         the partial vector style of the SLP tree node to
>         vect_partial_vectors_while_ult.
>         (vect_slp_get_bb_mask): New function to materialize a mask for
>         basic block SLP vectorization.
>         (vect_slp_record_bb_len): Use vect_slp_record_bb_style to set
>         the partial vector style of the SLP tree node to
>         vect_partial_vectors_len.
>         (vect_slp_get_bb_len): New function to materialize a length for
>         basic block SLP vectorization.
>         * tree-vect-stmts.cc (vectorizable_internal_function):
>         (vect_record_mask): Handle the basic block SLP use case by
>         delegating to vect_slp_record_bb_mask.
>         (vect_get_mask): Handle the basic block SLP use case by
>         delegating to vect_slp_get_bb_mask.
>         (vect_record_len): Handle the basic block SLP use case by
>         delegating to vect_slp_record_bb_len.
>         (vect_get_len): Handle the basic block SLP use case by
>         delegating to vect_slp_get_bb_len.
>         (vect_gen_while_ssa_name): New function containing code
>         refactored out of vect_gen_while for reuse by
>         vect_slp_get_bb_mask.
>         (vect_gen_while): Use vect_gen_while_ssa_name instead of custom
>         code for some of the implementation.
>         * tree-vectorizer.h (enum vect_partial_vector_style): Move this
>         definition earlier to allow reuse by struct _slp_tree.
>         (struct _slp_tree): Add a partial_vector_style member to record
>         whether to use a length or mask for the SLP tree node, if
>         partial vectors are required and supported.
>         Add a can_use_partial_vectors member to record whether partial
>         vectors are supported for the SLP tree node.
>         Add a num_partial_vectors member for costing.
>         (SLP_TREE_PARTIAL_VECTORS_STYLE): New member accessor macro.
>         (SLP_TREE_CAN_USE_PARTIAL_VECTORS_P): New member accessor macro.
>         (SLP_TREE_NUM_PARTIAL_VECTORS): New member accessor macro.
>         (vect_gen_while_ssa_name): Declaration of a new function.
>         (vect_slp_get_bb_mask): As above.
>         (vect_slp_get_bb_len): As above.
>         (vect_cannot_use_partial_vectors): Handle the basic block SLP
>         use-case by setting SLP_TREE_CAN_USE_PARTIAL_VECTORS_P to
>         false.
>         (vect_fully_with_length_p): Handle the basic block SLP use
>         case by checking whether the SLP_TREE_PARTIAL_VECTORS_STYLE is
>         vect_partial_vectors_len.
>         (vect_fully_masked_p): Handle the basic block SLP use case by
>         checking whether the SLP_TREE_PARTIAL_VECTORS_STYLE is
>         vect_partial_vectors_while_ult.
> ---
>  gcc/tree-vect-slp.cc   | 182 +++++++++++++++++++++++++++++++++++++++++
>  gcc/tree-vect-stmts.cc |  52 +++++++-----
>  gcc/tree-vectorizer.h  |  52 ++++++++----
>  3 files changed, 247 insertions(+), 39 deletions(-)
>
> diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
> index 075e93f04a9..4dd7e6e1e21 100644
> --- a/gcc/tree-vect-slp.cc
> +++ b/gcc/tree-vect-slp.cc
> @@ -125,6 +125,9 @@ _slp_tree::_slp_tree ()
>    SLP_TREE_GS_BASE (this) = NULL_TREE;
>    this->ldst_lanes = false;
>    this->avoid_stlf_fail = false;
> +  SLP_TREE_PARTIAL_VECTORS_STYLE (this) = vect_partial_vectors_none;
> +  SLP_TREE_CAN_USE_PARTIAL_VECTORS_P (this) = true;
> +  SLP_TREE_NUM_PARTIAL_VECTORS (this) = 0;
>    SLP_TREE_VECTYPE (this) = NULL_TREE;
>    SLP_TREE_REPRESENTATIVE (this) = NULL;
>    this->cycle_info.id = -1;
> @@ -8958,6 +8961,40 @@ vect_slp_analyze_node_operations (vec_info *vinfo, slp_tree node,
>           vect_prologue_cost_for_slp (vinfo, child, cost_vec);
>         }
>
> +  if (res)
> +    {
> +      /* Take care of special costs for partial vectors.
> +        Costing each partial vector is excessive for many SLP instances,
> +        because it is common to materialise identical masks/lengths for related
> +        operations (e.g., for vector loads and stores of the same length).
> +        Masks/lengths can also be shared between SLP subgraphs or eliminated by
> +        pattern-based lowering during instruction selection.  However, it's
> +        simpler and safer to use the worst-case cost; if this ends up being the
> +        tie-breaker between vectorizing or not, then it's probably better not
> +        to vectorize.  */

I'd prefer to do this per SLP subgraph, based on recorded requirements,
similar to how loop masking is set up.
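
To make that concrete, here is a rough standalone sketch of what I mean by
"recorded requirements".  It is not GCC code: `style`, `requirement` and
`subgraph_controls` are made-up names, and keying on element width and lane
count is only an assumption about what makes two masks/lengths identical.

```cpp
#include <cassert>
#include <set>
#include <tuple>

// Hypothetical stand-ins; none of these names exist in GCC.
enum class style { none, mask, len };

struct requirement
{
  style kind;         // mask or length
  unsigned elt_bits;  // element width of the controlled vector type
  unsigned lanes;     // number of active lanes in the partial vector

  bool operator< (const requirement &o) const
  {
    return std::tie (kind, elt_bits, lanes)
           < std::tie (o.kind, o.elt_bits, o.lanes);
  }
};

struct subgraph_controls
{
  std::set<requirement> reqs;

  // Called while analyzing each node, instead of bumping a per-node count.
  void record (style kind, unsigned elt_bits, unsigned lanes)
  {
    assert (kind != style::none);
    reqs.insert ({kind, elt_bits, lanes});
  }

  // Number of prologue statements to cost once the subgraph is analyzed.
  unsigned prologue_stmts () const { return reqs.size (); }
};
```

With this shape, a load and a store that need the same 3-lane mask record
the same requirement twice but are costed once.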

> +      const int num_partial_vectors = SLP_TREE_NUM_PARTIAL_VECTORS (node);
> +
> +      if (SLP_TREE_PARTIAL_VECTORS_STYLE (node)
> +         == vect_partial_vectors_while_ult)
> +       {
> +         gcc_assert (num_partial_vectors > 0);
> +         record_stmt_cost (cost_vec, num_partial_vectors, vector_stmt, NULL,
> +                           NULL, NULL_TREE, 0, vect_prologue);
> +       }
> +      else if (SLP_TREE_PARTIAL_VECTORS_STYLE (node)
> +              == vect_partial_vectors_len)
> +       {
> +         /* Need to set up a length in the prologue.  */
> +         gcc_assert (num_partial_vectors > 0);
> +         record_stmt_cost (cost_vec, num_partial_vectors, scalar_stmt, NULL,
> +                           NULL, NULL_TREE, 0, vect_prologue);
> +       }
> +      else
> +       {
> +         gcc_assert (num_partial_vectors == 0);
> +       }
> +    }
> +
>    /* If this node or any of its children can't be vectorized, try pruning
>       the tree here rather than felling the whole thing.  */
>    if (!res && vect_slp_convert_to_external (vinfo, node, node_instance))
> @@ -12441,3 +12478,148 @@ vect_schedule_slp (vec_info *vinfo, const vec<slp_instance> &slp_instances)
>          }
>      }
>  }
> +
> +/* Record that a specific partial vector style could be used to vectorize
> +   SLP_NODE if required.  */
> +
> +static void
> +vect_slp_record_bb_style (slp_tree slp_node, vect_partial_vector_style style)
> +{
> +  gcc_assert (style != vect_partial_vectors_none);
> +  gcc_assert (style != vect_partial_vectors_avx512);
> +
> +  if (SLP_TREE_PARTIAL_VECTORS_STYLE (slp_node) == vect_partial_vectors_none)
> +    SLP_TREE_PARTIAL_VECTORS_STYLE (slp_node) = style;
> +  else
> +    gcc_assert (SLP_TREE_PARTIAL_VECTORS_STYLE (slp_node) == style);
> +}
> +
> +/* Record that a complete set of masks associated with SLP_NODE would need to
> +   contain a sequence of NVECTORS masks that each control a vector of type
> +   VECTYPE.  If SCALAR_MASK is nonnull, the fully-masked loop would AND
> +   these vector masks with the vector version of SCALAR_MASK.  */
> +void
> +vect_slp_record_bb_mask (slp_tree slp_node, unsigned int /* nvectors */,
> +                        tree /* vectype */, tree /* scalar_mask */)
> +{
> +  vect_slp_record_bb_style (slp_node, vect_partial_vectors_while_ult);
> +
> +  /* FORNOW: this often overestimates the number of masks for costing purposes
> +     because, after lowering, masks have often been eliminated, shared between
> +     SLP nodes, or even shared between SLP subgraphs.  */
> +  SLP_TREE_NUM_PARTIAL_VECTORS(slp_node) ++;
> +}
> +
> +/* Materialize mask number INDEX for a group of scalar stmts in SLP_NODE that
> +   operate on NVECTORS vectors of type VECTYPE, where 0 <= INDEX < NVECTORS.
> +   Insert any set-up statements before GSI.  */
> +
> +tree
> +vect_slp_get_bb_mask (slp_tree slp_node, gimple_stmt_iterator *gsi,
> +                     unsigned int nvectors, tree vectype, unsigned int index)
> +{
> +  gcc_assert (SLP_TREE_PARTIAL_VECTORS_STYLE (slp_node)
> +             == vect_partial_vectors_while_ult);
> +  gcc_assert (nvectors >= 1);
> +  gcc_assert (index < nvectors);
> +
> +  const poly_uint64 nunits = TYPE_VECTOR_SUBPARTS (vectype);
> +  const unsigned int group_size = SLP_TREE_LANES (slp_node);
> +  unsigned int mask_size = group_size;
> +  const tree masktype = truth_type_for (vectype);
> +
> +  if (nunits.is_constant ())
> +    {
> +      /* Only the last vector can be a partial vector.  */
> +      if (index + 1 < nvectors)
> +       return build_minus_one_cst (masktype);
> +
> +      /* Return a mask for a possibly-partial tail vector. */
> +      const unsigned int const_nunits = nunits.to_constant ();
> +      const unsigned int head_size = (nvectors - 1) * const_nunits;
> +      gcc_assert (head_size <= group_size);
> +      mask_size = group_size - head_size;
> +
> +      if (mask_size == const_nunits)
> +       return build_minus_one_cst (masktype);
> +    }
> +  else
> +    {
> +      /* Return a mask for a single variable-length vector. */
> +      gcc_assert (nvectors == 1);
> +      gcc_assert (known_le (mask_size, nunits));
> +    }
> +
> +  /* FORNOW: don't bother maintaining a set of mask constants to allow
> +     sharing between nodes belonging to the same instance of bb_vec_info
> +     or even within the same SLP subgraph.  */

See above.  The loop code should already have everything set up for
caching.  Why not reuse that?
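
For illustration only, the kind of lookup I have in mind looks roughly like
the toy model below.  It does not use the loop vectorizer's data structures;
`mask_cache` is a hypothetical name, a "mask" is represented by the name of
the SSA value that would hold it, and the key (lanes per vector, active
lanes) is an assumption.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical model, not GCC code.  EMITTED counts how many mask-generating
// statements would have been inserted.
struct mask_cache
{
  std::map<std::pair<unsigned, unsigned>, std::string> masks;
  unsigned emitted = 0;

  // Return the mask for ACTIVE_LANES out of NUNITS, creating it on first use.
  const std::string &get (unsigned nunits, unsigned active_lanes)
  {
    assert (active_lanes <= nunits);
    auto key = std::make_pair (nunits, active_lanes);
    auto it = masks.find (key);
    if (it == masks.end ())
      {
        ++emitted;
        it = masks.emplace (key, "slp_mask_" + std::to_string (emitted)).first;
      }
    return it->second;
  }
};
```

Two nodes asking for the same (nunits, active lanes) pair get the same value
back and only one statement is emitted.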

> +  gimple_seq stmts = NULL;
> +  const tree cmp_type = size_type_node;
> +  const tree start_index = build_zero_cst (cmp_type);
> +  const tree end_index = build_int_cst (cmp_type, mask_size);
> +  const tree mask = make_temp_ssa_name (masktype, NULL, "slp_mask");
> +  vect_gen_while_ssa_name (&stmts, masktype, start_index, end_index, mask);
> +  gsi_insert_seq_before (gsi, stmts, GSI_SAME_STMT);
> +  return mask;
> +}
> +
> +/* Record that a complete set of lengths associated with SLP_NODE would need to
> +   contain a sequence of NVECTORS lengths for controlling an operation on
> +   VECTYPE.  The operation splits each element of VECTYPE into FACTOR separate
> +   subelements, measuring the length as a number of these subelements.  */
> +
> +void
> +vect_slp_record_bb_len (slp_tree slp_node, unsigned int /* nvectors */,
> +                       tree /* vectype */, unsigned int /* factor */)
> +{
> +  vect_slp_record_bb_style (slp_node, vect_partial_vectors_len);
> +
> +  /* FORNOW: this probably overestimates the number of lengths for costing
> +     purposes because, after lowering, lengths might have been eliminated,
> +     shared between SLP nodes, or even shared between SLP subgraphs.  */
> +  SLP_TREE_NUM_PARTIAL_VECTORS (slp_node)++;
> +}
> +
> +/* Materialize length number INDEX for a group of scalar stmts in SLP_NODE that
> +   operate on NVECTORS vectors of type VECTYPE, where 0 <= INDEX < NVECTORS.
> +   Return a value that contains FACTOR multiplied by the number of elements that
> +   should be processed.  */
> +
> +tree
> +vect_slp_get_bb_len (slp_tree slp_node, unsigned int nvectors, tree vectype,
> +                    unsigned int index, unsigned int factor, bool adjusted)
> +{
> +  gcc_checking_assert (SLP_TREE_PARTIAL_VECTORS_STYLE (slp_node)
> +                      == vect_partial_vectors_len);
> +  gcc_assert (nvectors >= 1);
> +  gcc_assert (index < nvectors);
> +  (void) adjusted;
> +
> +  const poly_uint64 nunits = TYPE_VECTOR_SUBPARTS (vectype);
> +  const unsigned int group_size = SLP_TREE_LANES (slp_node);
> +  unsigned int len = group_size;
> +
> +  if (nunits.is_constant ())
> +    {
> +      const unsigned int const_nunits = nunits.to_constant ();
> +
> +      /* Only the last vector can be a partial vector.  */
> +      if (index + 1 < nvectors)
> +       len = const_nunits;
> +      else
> +       {
> +         /* Return a length for a possibly-partial tail vector. */
> +         const unsigned int head_size = (nvectors - 1) * const_nunits;
> +         gcc_assert (head_size <= group_size);
> +         len = group_size - head_size;
> +       }
> +    }
> +  else
> +    {
> +      /* Return a length for a single variable-length vector. */
> +      gcc_assert (nvectors == 1);
> +      gcc_assert (known_le (len, nunits));
> +    }
> +
> +  return size_int (len * factor);
> +}
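
As a sanity check on the tail arithmetic shared by the mask and length
paths above (constant NUNITS case), here is a small standalone model.  The
function names are mine, not GCC's, and `while_ult` only models the
documented lane semantics (lane I is active iff START + I < END, ignoring
overflow).

```cpp
#include <cassert>
#include <vector>

// Number of active lanes in vector INDEX of NVECTORS, each with NUNITS
// lanes, when GROUP_SIZE scalar lanes are packed front to back.
unsigned
active_lanes (unsigned group_size, unsigned nvectors, unsigned nunits,
              unsigned index)
{
  assert (nvectors >= 1 && index < nvectors);
  if (index + 1 < nvectors)
    return nunits;  // Only the last vector can be partial.
  unsigned head_size = (nvectors - 1) * nunits;
  assert (head_size <= group_size);
  unsigned tail = group_size - head_size;
  assert (tail <= nunits);
  return tail;
}

// Model of a WHILE_ULT-style mask: lane I is active iff START + I < END.
std::vector<bool>
while_ult (unsigned start, unsigned end, unsigned nunits)
{
  std::vector<bool> mask (nunits);
  for (unsigned i = 0; i < nunits; ++i)
    mask[i] = start + i < end;
  return mask;
}
```

For a group of 7 lanes split over two 4-lane vectors, vector 0 is full and
vector 1 has 3 active lanes; with FACTOR 2 the length for vector 1 would
be 6.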
> diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
> index 15fca17a407..ecad74e7cbf 100644
> --- a/gcc/tree-vect-stmts.cc
> +++ b/gcc/tree-vect-stmts.cc
> @@ -1385,7 +1385,9 @@ vectorizable_internal_function (combined_fn cfn, tree fndecl,
>  /* Record that a complete set of masks associated with VINFO would need to
>     contain a sequence of NVECTORS masks that each control a vector of type
>     VECTYPE.  If SCALAR_MASK is nonnull, the fully-masked loop would AND
> -   these vector masks with the vector version of SCALAR_MASK.  */
> +   these vector masks with the vector version of SCALAR_MASK.  Alternatively,
> +   if doing basic block vectorization, record that a mask could be used to
> +   vectorize SLP_NODE if required.  */
>  static void
>  vect_record_mask (vec_info *vinfo, slp_tree slp_node, unsigned int nvectors,
>                   tree vectype, tree scalar_mask)
> @@ -1395,7 +1397,7 @@ vect_record_mask (vec_info *vinfo, slp_tree slp_node, unsigned int nvectors,
>      vect_record_loop_mask (loop_vinfo, &LOOP_VINFO_MASKS (loop_vinfo), nvectors,
>                            vectype, scalar_mask);
>    else
> -    (void) slp_node; /* FORNOW */
> +    vect_slp_record_bb_mask (slp_node, nvectors, vectype, scalar_mask);
>  }
>
>  /* Given a complete set of masks associated with VINFO, extract mask number
> @@ -1413,16 +1415,15 @@ vect_get_mask (vec_info *vinfo, slp_tree slp_node, gimple_stmt_iterator *gsi,
>      return vect_get_loop_mask (loop_vinfo, gsi, &LOOP_VINFO_MASKS (loop_vinfo),
>                                nvectors, vectype, index);
>    else
> -    {
> -      (void) slp_node; /* FORNOW */
> -      return NULL_TREE;
> -    }
> +    return vect_slp_get_bb_mask (slp_node, gsi, nvectors, vectype, index);
>  }
>
>  /* Record that a complete set of lengths associated with VINFO would need to
>     contain a sequence of NVECTORS lengths for controlling an operation on
>     VECTYPE.  The operation splits each element of VECTYPE into FACTOR separate
> -   subelements, measuring the length as a number of these subelements.  */
> +   subelements, measuring the length as a number of these subelements.
> +   Alternatively, if doing basic block vectorization, record that a length limit
> +   could be used to vectorize SLP_NODE if required.  */
>  static void
>  vect_record_len (vec_info *vinfo, slp_tree slp_node, unsigned int nvectors,
>                  tree vectype, unsigned int factor)
> @@ -1432,7 +1433,7 @@ vect_record_len (vec_info *vinfo, slp_tree slp_node, unsigned int nvectors,
>      vect_record_loop_len (loop_vinfo, &LOOP_VINFO_LENS (loop_vinfo), nvectors,
>                           vectype, factor);
>    else
> -    (void) slp_node; /* FORNOW */
> +    vect_slp_record_bb_len (slp_node, nvectors, vectype, factor);
>  }
>
>  /* Given a complete set of lengths associated with VINFO, extract length number
> @@ -1453,10 +1454,8 @@ vect_get_len (vec_info *vinfo, slp_tree slp_node, gimple_stmt_iterator *gsi,
>      return vect_get_loop_len (loop_vinfo, gsi, &LOOP_VINFO_LENS (loop_vinfo),
>                               nvectors, vectype, index, factor, adjusted);
>    else
> -    {
> -      (void) slp_node; /* FORNOW */
> -      return NULL_TREE;
> -    }
> +    return vect_slp_get_bb_len (slp_node, nvectors, vectype, index, factor,
> +                               adjusted);
>  }
>
>  static tree permute_vec_elements (vec_info *, tree, tree, tree, stmt_vec_info,
> @@ -14710,24 +14709,35 @@ supportable_indirect_convert_operation (code_helper code,
>     mask[I] is true iff J + START_INDEX < END_INDEX for all J <= I.
>     Add the statements to SEQ.  */
>
> +void
> +vect_gen_while_ssa_name (gimple_seq *seq, tree mask_type, tree start_index,
> +                        tree end_index, tree ssa_name)
> +{
> +  tree cmp_type = TREE_TYPE (start_index);
> +  gcc_checking_assert (direct_internal_fn_supported_p (IFN_WHILE_ULT, cmp_type,
> +                                                      mask_type,
> +                                                      OPTIMIZE_FOR_SPEED));
> +  gcall *call
> +    = gimple_build_call_internal (IFN_WHILE_ULT, 3, start_index, end_index,
> +                                 build_zero_cst (mask_type));
> +  gimple_call_set_lhs (call, ssa_name);
> +  gimple_seq_add_stmt (seq, call);
> +}
> +
> +/*  Like vect_gen_while_ssa_name except that it creates a new SSA_NAME node
> +    for type MASK_TYPE defined in the created GIMPLE_CALL statement.  If NAME
> +    is not a null pointer then it is used for the SSA_NAME in dumps.  */
> +
>  tree
>  vect_gen_while (gimple_seq *seq, tree mask_type, tree start_index,
>                 tree end_index, const char *name)
>  {
> -  tree cmp_type = TREE_TYPE (start_index);
> -  gcc_checking_assert (direct_internal_fn_supported_p (IFN_WHILE_ULT,
> -                                                      cmp_type, mask_type,
> -                                                      OPTIMIZE_FOR_SPEED));
> -  gcall *call = gimple_build_call_internal (IFN_WHILE_ULT, 3,
> -                                           start_index, end_index,
> -                                           build_zero_cst (mask_type));
>    tree tmp;
>    if (name)
>      tmp = make_temp_ssa_name (mask_type, NULL, name);
>    else
>      tmp = make_ssa_name (mask_type);
> -  gimple_call_set_lhs (call, tmp);
> -  gimple_seq_add_stmt (seq, call);
> +  vect_gen_while_ssa_name (seq, mask_type, start_index, end_index, tmp);
>    return tmp;
>  }
>
> diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h
> index a3855568b09..f79f04ff8ac 100644
> --- a/gcc/tree-vectorizer.h
> +++ b/gcc/tree-vectorizer.h
> @@ -312,6 +312,13 @@ struct vect_load_store_data : vect_data {
>    bool subchain_p; // VMAT_STRIDED_SLP and VMAT_GATHER_SCATTER
>  };
>
> +enum vect_partial_vector_style {
> +  vect_partial_vectors_none,
> +  vect_partial_vectors_while_ult,
> +  vect_partial_vectors_avx512,
> +  vect_partial_vectors_len
> +};
> +
>  /* A computation tree of an SLP instance.  Each node corresponds to a group of
>     stmts to be packed in a SIMD stmt.  */
>  struct _slp_tree {
> @@ -377,7 +384,16 @@ struct _slp_tree {
>    /* For BB vect, flag to indicate this load node should be vectorized
>       as to avoid STLF fails because of related stores.  */
>    bool avoid_stlf_fail;
> -
> +  /* The style used for implementing partial vectors if LANES is less than
> +     the minimum number of lanes implied by the VECTYPE.  */
> +  vect_partial_vector_style partial_vector_style;

I wonder whether we want to (or need to) mix styles across an SLP subgraph,
and likewise whether we really need to track can_use_partial_vectors per SLP
node as opposed to per subgraph.  I also wonder whether we want to handle the
case where parts of the graph are unsupported for lack of masking support,
which we could fix by promoting that part to extern (not covered) rather
than failing the whole subgraph.

That is, I'm questioning (maybe again?) the overall tracking/analysis phase.
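
To illustrate the two policies I'm contrasting, here is a toy model.  It is
not GCC's _slp_tree; `node`, `subgraph_ok` and `promote_unsupported` are
hypothetical names, and "external" stands for a node built from scalars
instead of being vectorized.

```cpp
#include <cassert>
#include <vector>

// Hypothetical model of an SLP node; not GCC's _slp_tree.
struct node
{
  bool needs_partial = false;   // fewer lanes than the vector type provides
  bool can_use_partial = true;  // target can mask/length-limit this operation
  bool external = false;        // built from scalars rather than vectorized
  std::vector<node *> children;
};

// Policy A: any unsupported node fails the whole subgraph.
bool
subgraph_ok (const node *n)
{
  if (n->needs_partial && !n->can_use_partial)
    return false;
  for (const node *c : n->children)
    if (!subgraph_ok (c))
      return false;
  return true;
}

// Policy B: demote unsupported nodes to external and keep the rest.
// Returns the number of nodes demoted.
unsigned
promote_unsupported (node *n)
{
  if (n->needs_partial && !n->can_use_partial)
    {
      n->external = true;
      return 1;
    }
  unsigned count = 0;
  for (node *c : n->children)
    count += promote_unsupported (c);
  return count;
}
```

Under policy A a single unsupported leaf rejects the subgraph; under policy B
only that leaf is given up and the remaining nodes stay candidates.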

> +  /* Flag to indicate whether we still have the option of vectorizing this node
> +     using partial vectors (i.e.  using lengths or masks to prevent use of
> +     inactive scalar lanes).  */
> +  bool can_use_partial_vectors;
> +  /* Number of partial vectors, for costing purposes. Should be 0 unless a
> +     partial vector style has been set.  */
> +  int num_partial_vectors;
>    int vertex;
>
>    /* The kind of operation as determined by analysis and optional
> @@ -476,6 +492,9 @@ public:
>  #define SLP_TREE_GS_BASE(S)                     (S)->gs_base
>  #define SLP_TREE_REDUC_IDX(S)                   (S)->cycle_info.reduc_idx
>  #define SLP_TREE_PERMUTE_P(S)                   ((S)->code == VEC_PERM_EXPR)
> +#define SLP_TREE_PARTIAL_VECTORS_STYLE(S)       (S)->partial_vector_style
> +#define SLP_TREE_CAN_USE_PARTIAL_VECTORS_P(S)   (S)->can_use_partial_vectors
> +#define SLP_TREE_NUM_PARTIAL_VECTORS(S)         (S)->num_partial_vectors
>
>  inline vect_memory_access_type
>  SLP_TREE_MEMORY_ACCESS_TYPE (slp_tree node)
> @@ -486,13 +505,6 @@ SLP_TREE_MEMORY_ACCESS_TYPE (slp_tree node)
>    return VMAT_UNINITIALIZED;
>  }
>
> -enum vect_partial_vector_style {
> -    vect_partial_vectors_none,
> -    vect_partial_vectors_while_ult,
> -    vect_partial_vectors_avx512,
> -    vect_partial_vectors_len
> -};
> -
>  /* Key for map that records association between
>     scalar conditions and corresponding loop mask, and
>     is populated by vect_record_loop_mask.  */
> @@ -2607,6 +2619,7 @@ extern tree vect_gen_perm_mask_checked (tree, const vec_perm_indices &);
>  extern void optimize_mask_stores (class loop*);
>  extern tree vect_gen_while (gimple_seq *, tree, tree, tree,
>                             const char * = nullptr);
> +extern void vect_gen_while_ssa_name (gimple_seq *, tree, tree, tree, tree);
>  extern tree vect_gen_while_not (gimple_seq *, tree, tree, tree);
>  extern opt_result vect_get_vector_types_for_stmt (vec_info *,
>                                                   stmt_vec_info, tree *,
> @@ -2788,7 +2801,14 @@ extern slp_tree vect_create_new_slp_node (unsigned, tree_code);
>  extern void vect_free_slp_tree (slp_tree);
>  extern bool compatible_calls_p (gcall *, gcall *, bool);
>  extern int vect_slp_child_index_for_operand (const stmt_vec_info, int op);
> -
> +extern void vect_slp_record_bb_mask (slp_tree slp_node, unsigned int nvectors,
> +                                    tree vectype, tree scalar_mask);
> +extern tree vect_slp_get_bb_mask (slp_tree, gimple_stmt_iterator *,
> +                                 unsigned int, tree, unsigned int);
> +extern void vect_slp_record_bb_len (slp_tree slp_node, unsigned int nvectors,
> +                                   tree vectype, unsigned int factor);
> +extern tree vect_slp_get_bb_len (slp_tree, unsigned int, tree, unsigned int,
> +                                unsigned int, bool);
>  extern tree prepare_vec_mask (vec_info *, tree, tree, tree,
>                               gimple_stmt_iterator *);
>  extern tree vect_get_mask_load_else (int, tree);
> @@ -2953,7 +2973,7 @@ vect_cannot_use_partial_vectors (vec_info *vinfo, slp_tree slp_node)
>    if (loop_vinfo)
>      LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P (loop_vinfo) = false;
>    else
> -    (void) slp_node; /* FORNOW */
> +    SLP_TREE_CAN_USE_PARTIAL_VECTORS_P (slp_node) = false;
>  }
>
>  /* Return true if VINFO is vectorizer state for loop vectorization, we've
> @@ -2967,10 +2987,8 @@ vect_fully_with_length_p (vec_info *vinfo, slp_tree slp_node)
>    if (loop_vec_info loop_vinfo = dyn_cast<loop_vec_info> (vinfo))
>      return LOOP_VINFO_FULLY_WITH_LENGTH_P (loop_vinfo);
>    else
> -    {
> -      (void) slp_node; /* FORNOW */
> -      return false;
> -    }
> +    return SLP_TREE_PARTIAL_VECTORS_STYLE (slp_node)
> +          == vect_partial_vectors_len;
>  }
>
>  /* Return true if VINFO is vectorizer state for loop vectorization, we've
> @@ -2984,10 +3002,8 @@ vect_fully_masked_p (vec_info *vinfo, slp_tree slp_node)
>    if (loop_vec_info loop_vinfo = dyn_cast<loop_vec_info> (vinfo))
>      return LOOP_VINFO_FULLY_MASKED_P (loop_vinfo);
>    else
> -    {
> -      (void) slp_node; /* FORNOW */
> -      return false;
> -    }
> +    return SLP_TREE_PARTIAL_VECTORS_STYLE (slp_node)
> +          == vect_partial_vectors_while_ult;
>  }
>
>  /* If STMT_INFO describes a reduction, return the vect_reduction_type
> --
> 2.43.0
>
