On Tue, Jun 9, 2026 at 11:24 AM Richard Biener
<[email protected]> wrote:
>
> On Wed, Jun 3, 2026 at 5:20 PM Christopher Bazley <[email protected]>
> wrote:
> >
> > Add two new fields to SLP tree nodes, which are accessed as
> > SLP_TREE_CAN_USE_PARTIAL_VECTORS_P and SLP_TREE_PARTIAL_VECTORS_STYLE.
> >
> > SLP_TREE_CAN_USE_PARTIAL_VECTORS_P is analogous to the existing
> > predicate LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P. It is initialized to
> > true. This flag just records whether the target could vectorize a
> > node using a partial vector; it does not say anything about
> > whether the vector actually is partial, or how the target would support
> > use of a partial vector. Some kinds of node require mask/length for
> > partial vectors; others don't. In the latter case (e.g., for add
> > operations), SLP_TREE_CAN_USE_PARTIAL_VECTORS_P will remain true.
> >
> > SLP_TREE_PARTIAL_VECTORS_STYLE is analogous to the existing field
> > LOOP_VINFO_PARTIAL_VECTORS_STYLE. Both are initialized to 'none'.
> > The vect_partial_vectors_avx512 enumerator is not used for BB SLP.
> > Unlike loop vectorization, a different style of partial vectors can be
> > chosen for each node during analysis of that node.

To add: SLP_TREE_PARTIAL_VECTORS_STYLE should not be per SLP node but
per 'vinfo'. It doesn't make much sense to have differing styles
active at the same time, so please just move the partial vectors style
from loop_vinfo to the vinfo base class.

The commit message also does not mention SLP_TREE_NUM_PARTIAL_VECTORS,
which you add per node, or why it is needed. It definitely shouldn't
live there; it's an odd counter that's going to be 0 or 1 exactly when
we decide to (possibly) use partial vectors?

As said, I think we want to re-use the loop mask/len tracking here.
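
Roughly what I have in mind for the style, as a standalone sketch rather
than actual GCC code. The *_sketch types and functions are hypothetical
stand-ins for vec_info and its subclasses, and the predicates ignore the
other conditions that the real LOOP_VINFO_FULLY_MASKED_P and
LOOP_VINFO_FULLY_WITH_LENGTH_P checks apply:

```cpp
#include <cassert>

// Same enumerators as in tree-vectorizer.h.
enum vect_partial_vector_style {
  vect_partial_vectors_none,
  vect_partial_vectors_while_ult,
  vect_partial_vectors_avx512,
  vect_partial_vectors_len
};

// Hypothetical stand-in for the vec_info base class: the style lives here,
// so loop and BB vectorization share one field and one set of queries.
struct vec_info_sketch {
  vect_partial_vector_style partial_vector_style = vect_partial_vectors_none;
  virtual ~vec_info_sketch () = default;
};

struct loop_vec_info_sketch : vec_info_sketch {};
struct bb_vec_info_sketch : vec_info_sketch {};

// Record the style once per vinfo; a later request for a different style
// is treated as a bug, mirroring the no-flip-flop check in the patch.
inline void
record_style_sketch (vec_info_sketch &vinfo, vect_partial_vector_style style)
{
  assert (style != vect_partial_vectors_none);
  if (vinfo.partial_vector_style == vect_partial_vectors_none)
    vinfo.partial_vector_style = style;
  else
    assert (vinfo.partial_vector_style == style);
}

inline bool
fully_masked_p_sketch (const vec_info_sketch &vinfo)
{
  return vinfo.partial_vector_style == vect_partial_vectors_while_ult;
}

inline bool
fully_with_length_p_sketch (const vec_info_sketch &vinfo)
{
  return vinfo.partial_vector_style == vect_partial_vectors_len;
}
```

With the field on the base class, the wrappers would no longer need to
branch on loop vs. BB just to find out which style is in use.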
> >
> > Implement the recently-introduced wrapper functions,
> > vect_record_(len|mask), for BB SLP by setting
> > SLP_TREE_PARTIAL_VECTORS_STYLE to indicate that a mask or length should
> > be used for a given SLP node. The passed-in vec_info is ignored.
> >
> > Implement the vect_fully_(masked|with_length)_p wrapper functions for
> > BB SLP by checking the SLP_TREE_PARTIAL_VECTORS_STYLE. This should be
> > sufficient because at most one of vect_record_(len|mask) and
> > vect_cannot_use_partial_vectors is expected to be called for any
> > given SLP node. SLP_TREE_CAN_USE_PARTIAL_VECTORS_P should be true if
> > the style is not 'none', but its value isn't used beyond the analysis
> > phase.
> >
> > The implementations of vect_get_mask and vect_get_len for BB SLP are
> > non-trivial (albeit simpler than for loop vectorization), therefore they
> > are delegated to SLP-specific functions defined in tree-vect-slp.cc.
> >
> > Implement the vect_cannot_use_partial_vectors wrapper function by
> > setting the SLP_TREE_CAN_USE_PARTIAL_VECTORS_P flag to false.
> > To prevent regressions, vect_can_use_partial_vectors_p still returns
> > false for BB SLP regardless (for now). This prevents vect_record_mask
> > or vect_record_len from being called.
> >
> > gcc/ChangeLog:
> >
> > * tree-vect-slp.cc (_slp_tree::_slp_tree): Initialize new
> > partial_vector_style, can_use_partial_vectors and
> > num_partial_vectors members.
> > (vect_slp_analyze_node_operations): Account for worst-case
> > prologue costs of per-node partial-vector mask or length
> > materialisation.
> > (vect_slp_record_bb_style): Set the partial vector style of an
> > SLP node, checking that the style does not flip-flop between mask
> > and length.
> > (vect_slp_record_bb_mask): Use vect_slp_record_bb_style to set
> > the partial vector style of the SLP tree node to
> > vect_partial_vectors_while_ult.
> > (vect_slp_get_bb_mask): New function to materialize a mask for
> > basic block SLP vectorization.
> > (vect_slp_record_bb_len): Use vect_slp_record_bb_style to set
> > the partial vector style of the SLP tree node to
> > vect_partial_vectors_len.
> > (vect_slp_get_bb_len): New function to materialize a length for
> > basic block SLP vectorization.
> > * tree-vect-stmts.cc (vectorizable_internal_function):
> > (vect_record_mask): Handle the basic block SLP use case by
> > delegating to vect_slp_record_bb_mask.
> > (vect_get_mask): Handle the basic block SLP use case by
> > delegating to vect_slp_get_bb_mask.
> > (vect_record_len): Handle the basic block SLP use case by
> > delegating to vect_slp_record_bb_len.
> > (vect_get_len): Handle the basic block SLP use case by
> > delegating to vect_slp_get_bb_len.
> > (vect_gen_while_ssa_name): New function containing code
> > refactored out of vect_gen_while for reuse by
> > vect_slp_get_bb_mask.
> > (vect_gen_while): Use vect_gen_while_ssa_name instead of custom
> > code for some of the implementation.
> > * tree-vectorizer.h (enum vect_partial_vector_style): Move this
> > definition earlier to allow reuse by struct _slp_tree.
> > (struct _slp_tree): Add a partial_vector_style member to record
> > whether to use a length or mask for the SLP tree node, if
> > partial vectors are required and supported.
> > Add a can_use_partial_vectors member to record whether partial
> > vectors are supported for the SLP tree node.
> > Add a num_partial_vectors member for costing.
> > (SLP_TREE_PARTIAL_VECTORS_STYLE): New member accessor macro.
> > (SLP_TREE_CAN_USE_PARTIAL_VECTORS_P): New member accessor macro.
> > (SLP_TREE_NUM_PARTIAL_VECTORS): New member accessor macro.
> > (vect_gen_while_ssa_name): Declaration of a new function.
> > (vect_slp_get_bb_mask): As above.
> > (vect_slp_get_bb_len): As above.
> > (vect_cannot_use_partial_vectors): Handle the basic block SLP
> > use case by setting SLP_TREE_CAN_USE_PARTIAL_VECTORS_P to
> > false.
> > (vect_fully_with_length_p): Handle the basic block SLP use
> > case by checking whether the SLP_TREE_PARTIAL_VECTORS_STYLE is
> > vect_partial_vectors_len.
> > (vect_fully_masked_p): Handle the basic block SLP use case by
> > checking whether the SLP_TREE_PARTIAL_VECTORS_STYLE is
> > vect_partial_vectors_while_ult.
> > ---
> > gcc/tree-vect-slp.cc | 182 +++++++++++++++++++++++++++++++++++++++++
> > gcc/tree-vect-stmts.cc | 52 +++++++-----
> > gcc/tree-vectorizer.h | 52 ++++++++----
> > 3 files changed, 247 insertions(+), 39 deletions(-)
> >
> > diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
> > index 075e93f04a9..4dd7e6e1e21 100644
> > --- a/gcc/tree-vect-slp.cc
> > +++ b/gcc/tree-vect-slp.cc
> > @@ -125,6 +125,9 @@ _slp_tree::_slp_tree ()
> > SLP_TREE_GS_BASE (this) = NULL_TREE;
> > this->ldst_lanes = false;
> > this->avoid_stlf_fail = false;
> > + SLP_TREE_PARTIAL_VECTORS_STYLE (this) = vect_partial_vectors_none;
> > + SLP_TREE_CAN_USE_PARTIAL_VECTORS_P (this) = true;
> > + SLP_TREE_NUM_PARTIAL_VECTORS (this) = 0;
> > SLP_TREE_VECTYPE (this) = NULL_TREE;
> > SLP_TREE_REPRESENTATIVE (this) = NULL;
> > this->cycle_info.id = -1;
> > @@ -8958,6 +8961,40 @@ vect_slp_analyze_node_operations (vec_info *vinfo,
> > slp_tree node,
> > vect_prologue_cost_for_slp (vinfo, child, cost_vec);
> > }
> >
> > + if (res)
> > + {
> > + /* Take care of special costs for partial vectors.
> > + Costing each partial vector is excessive for many SLP instances,
> > + because it is common to materialise identical masks/lengths for related
> > + operations (e.g., for vector loads and stores of the same length).
> > + Masks/lengths can also be shared between SLP subgraphs or eliminated by
> > + pattern-based lowering during instruction selection. However, it's
> > + simpler and safer to use the worst-case cost; if this ends up being the
> > + tie-breaker between vectorizing or not, then it's probably better not
> > + to vectorize. */
>
> I'd prefer to do this per SLP subgraph group, based on recorded
> requirements, similar to how loop masking is set up.
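
To illustrate the kind of per-subgraph costing I mean, here is a
standalone sketch, not GCC code. Everything with a _sketch suffix is
hypothetical, and treating (style, active lanes, nunits) as the key for
a requirement is an assumption made for the example. Nodes only record
what they need; the prologue cost is then charged once per distinct
requirement:

```cpp
#include <cassert>
#include <set>
#include <tuple>
#include <vector>

// Hypothetical, simplified view of what a node asks for.
enum class control_style_sketch { none, mask, len };

struct node_requirement_sketch {
  control_style_sketch style;  // mask, length, or nothing needed
  unsigned active_lanes;       // lanes that must stay active
  unsigned nunits;             // lanes in the full vector type
};

// Count the masks/lengths a subgraph needs if identical requirements are
// materialized once and shared, instead of once per node.
inline unsigned
count_distinct_controls_sketch (const std::vector<node_requirement_sketch> &reqs)
{
  std::set<std::tuple<control_style_sketch, unsigned, unsigned>> seen;
  for (const node_requirement_sketch &r : reqs)
    if (r.style != control_style_sketch::none)
      seen.emplace (r.style, r.active_lanes, r.nunits);
  return static_cast<unsigned> (seen.size ());
}
```

A load and a store of the same three lanes would then contribute a
single prologue statement instead of two.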
>
> > + const int num_partial_vectors = SLP_TREE_NUM_PARTIAL_VECTORS (node);
> > +
> > + if (SLP_TREE_PARTIAL_VECTORS_STYLE (node)
> > + == vect_partial_vectors_while_ult)
> > + {
> > + gcc_assert (num_partial_vectors > 0);
> > + record_stmt_cost (cost_vec, num_partial_vectors, vector_stmt,
> > NULL,
> > + NULL, NULL_TREE, 0, vect_prologue);
> > + }
> > + else if (SLP_TREE_PARTIAL_VECTORS_STYLE (node)
> > + == vect_partial_vectors_len)
> > + {
> > + /* Need to set up a length in the prologue. */
> > + gcc_assert (num_partial_vectors > 0);
> > + record_stmt_cost (cost_vec, num_partial_vectors, scalar_stmt,
> > NULL,
> > + NULL, NULL_TREE, 0, vect_prologue);
> > + }
> > + else
> > + {
> > + gcc_assert (num_partial_vectors == 0);
> > + }
> > + }
> > +
> > /* If this node or any of its children can't be vectorized, try pruning
> > the tree here rather than felling the whole thing. */
> > if (!res && vect_slp_convert_to_external (vinfo, node, node_instance))
> > @@ -12441,3 +12478,148 @@ vect_schedule_slp (vec_info *vinfo, const
> > vec<slp_instance> &slp_instances)
> > }
> > }
> > }
> > +
> > +/* Record that a specific partial vector style could be used to vectorize
> > + SLP_NODE if required. */
> > +
> > +static void
> > +vect_slp_record_bb_style (slp_tree slp_node, vect_partial_vector_style
> > style)
> > +{
> > + gcc_assert (style != vect_partial_vectors_none);
> > + gcc_assert (style != vect_partial_vectors_avx512);
> > +
> > + if (SLP_TREE_PARTIAL_VECTORS_STYLE (slp_node) ==
> > vect_partial_vectors_none)
> > + SLP_TREE_PARTIAL_VECTORS_STYLE (slp_node) = style;
> > + else
> > + gcc_assert (SLP_TREE_PARTIAL_VECTORS_STYLE (slp_node) == style);
> > +}
> > +
> > +/* Record that a complete set of masks associated with SLP_NODE would need
> > to
> > + contain a sequence of NVECTORS masks that each control a vector of type
> > + VECTYPE. If SCALAR_MASK is nonnull, the fully-masked loop would AND
> > + these vector masks with the vector version of SCALAR_MASK. */
> > +void
> > +vect_slp_record_bb_mask (slp_tree slp_node, unsigned int /* nvectors */,
> > + tree /* vectype */, tree /* scalar_mask */)
> > +{
> > + vect_slp_record_bb_style (slp_node, vect_partial_vectors_while_ult);
> > +
> > + /* FORNOW: this often overestimates the number of masks for costing purposes
> > + because, after lowering, masks have often been eliminated, shared between
> > + SLP nodes, or even shared between SLP subgraphs. */
> > + SLP_TREE_NUM_PARTIAL_VECTORS (slp_node)++;
> > +}
> > +
> > +/* Materialize mask number INDEX for a group of scalar stmts in SLP_NODE
> > that
> > + operate on NVECTORS vectors of type VECTYPE, where 0 <= INDEX <
> > NVECTORS.
> > + Insert any set-up statements before GSI. */
> > +
> > +tree
> > +vect_slp_get_bb_mask (slp_tree slp_node, gimple_stmt_iterator *gsi,
> > + unsigned int nvectors, tree vectype, unsigned int
> > index)
> > +{
> > + gcc_assert (SLP_TREE_PARTIAL_VECTORS_STYLE (slp_node)
> > + == vect_partial_vectors_while_ult);
> > + gcc_assert (nvectors >= 1);
> > + gcc_assert (index < nvectors);
> > +
> > + const poly_uint64 nunits = TYPE_VECTOR_SUBPARTS (vectype);
> > + const unsigned int group_size = SLP_TREE_LANES (slp_node);
> > + unsigned int mask_size = group_size;
> > + const tree masktype = truth_type_for (vectype);
> > +
> > + if (nunits.is_constant ())
> > + {
> > + /* Only the last vector can be a partial vector. */
> > + if (index + 1 < nvectors)
> > + return build_minus_one_cst (masktype);
> > +
> > + /* Return a mask for a possibly-partial tail vector. */
> > + const unsigned int const_nunits = nunits.to_constant ();
> > + const unsigned int head_size = (nvectors - 1) * const_nunits;
> > + gcc_assert (head_size <= group_size);
> > + mask_size = group_size - head_size;
> > +
> > + if (mask_size == const_nunits)
> > + return build_minus_one_cst (masktype);
> > + }
> > + else
> > + {
> > + /* Return a mask for a single variable-length vector. */
> > + gcc_assert (nvectors == 1);
> > + gcc_assert (known_le (mask_size, nunits));
> > + }
> > +
> > + /* FORNOW: don't bother maintaining a set of mask constants to allow
> > + sharing between nodes belonging to the same instance of bb_vec_info
> > + or even within the same SLP subgraph. */
>
> See above. The loop code should already have everything set up for
> caching. Why not reuse that?
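
For illustration only, a get-or-create cache in the spirit of the above.
This is a standalone sketch and not the loop-mask API: the type name,
the key (mask type name plus active lane count), and the string standing
in for an SSA name are all assumptions made for the example.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical cache that would hang off the BB vinfo (or the subgraph).
struct bb_mask_cache_sketch {
  // Key: (mask type name, number of active lanes).  Value: mask name.
  std::map<std::pair<std::string, unsigned>, std::string> masks;
  unsigned created = 0;

  const std::string &
  get_or_create (const std::string &masktype, unsigned active_lanes)
  {
    const auto key = std::make_pair (masktype, active_lanes);
    auto it = masks.find (key);
    if (it == masks.end ())
      {
        // This is where the mask-generating statement would be emitted,
        // once per distinct key.
        std::string name = "slp_mask_" + std::to_string (created++);
        it = masks.emplace (key, name).first;
      }
    return it->second;
  }
};
```

In the real thing the cached definition would also have to dominate all
of its uses, which constrains where the statement can be inserted.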
>
> > + gimple_seq stmts = NULL;
> > + const tree cmp_type = size_type_node;
> > + const tree start_index = build_zero_cst (cmp_type);
> > + const tree end_index = build_int_cst (cmp_type, mask_size);
> > + const tree mask = make_temp_ssa_name (masktype, NULL, "slp_mask");
> > + vect_gen_while_ssa_name (&stmts, masktype, start_index, end_index, mask);
> > + gsi_insert_seq_before (gsi, stmts, GSI_SAME_STMT);
> > + return mask;
> > +}
> > +
> > +/* Record that a complete set of lengths associated with SLP_NODE would
> > need to
> > + contain a sequence of NVECTORS lengths for controlling an operation on
> > + VECTYPE. The operation splits each element of VECTYPE into FACTOR
> > separate
> > + subelements, measuring the length as a number of these subelements. */
> > +
> > +void
> > +vect_slp_record_bb_len (slp_tree slp_node, unsigned int /* nvectors */,
> > + tree /* vectype */, unsigned int /* factor */)
> > +{
> > + vect_slp_record_bb_style (slp_node, vect_partial_vectors_len);
> > +
> > + /* FORNOW: this probably overestimates the number of lengths for costing
> > + purposes because, after lowering, lengths might have been eliminated,
> > + shared between SLP nodes, or even shared between SLP subgraphs. */
> > + SLP_TREE_NUM_PARTIAL_VECTORS (slp_node)++;
> > +}
> > +
> > +/* Materialize length number INDEX for a group of scalar stmts in SLP_NODE
> > that
> > + operate on NVECTORS vectors of type VECTYPE, where 0 <= INDEX <
> > NVECTORS.
> > + Return a value that contains FACTOR multiplied by the number of
> > elements that
> > + should be processed. */
> > +
> > +tree
> > +vect_slp_get_bb_len (slp_tree slp_node, unsigned int nvectors, tree
> > vectype,
> > + unsigned int index, unsigned int factor, bool adjusted)
> > +{
> > + gcc_checking_assert (SLP_TREE_PARTIAL_VECTORS_STYLE (slp_node)
> > + == vect_partial_vectors_len);
> > + gcc_assert (nvectors >= 1);
> > + gcc_assert (index < nvectors);
> > + (void) adjusted;
> > +
> > + const poly_uint64 nunits = TYPE_VECTOR_SUBPARTS (vectype);
> > + const unsigned int group_size = SLP_TREE_LANES (slp_node);
> > + unsigned int len = group_size;
> > +
> > + if (nunits.is_constant ())
> > + {
> > + const unsigned int const_nunits = nunits.to_constant ();
> > +
> > + /* Only the last vector can be a partial vector. */
> > + if (index + 1 < nvectors)
> > + len = const_nunits;
> > + else
> > + {
> > + /* Return a length for a possibly-partial tail vector. */
> > + const unsigned int head_size = (nvectors - 1) * const_nunits;
> > + gcc_assert (head_size <= group_size);
> > + len = group_size - head_size;
> > + }
> > + }
> > + else
> > + {
> > + /* Return a length for a single variable-length vector. */
> > + gcc_assert (nvectors == 1);
> > + gcc_assert (known_le (len, nunits));
> > + }
> > +
> > + return size_int (len * factor);
> > +}
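
For reference, the constant-nunits arithmetic shared by the two getters
above, pulled out as a standalone sketch. The function name is
hypothetical, and FACTOR and the variable-length case are left out:

```cpp
#include <cassert>

// Number of active lanes in vector number INDEX when GROUP_SIZE scalar
// lanes are spread over NVECTORS vectors of CONST_NUNITS lanes each.
// Only the last vector can be partial.
inline unsigned
active_lanes_sketch (unsigned group_size, unsigned nvectors,
                     unsigned const_nunits, unsigned index)
{
  assert (nvectors >= 1 && index < nvectors);
  if (index + 1 < nvectors)
    return const_nunits;  // full head vector

  const unsigned head_size = (nvectors - 1) * const_nunits;
  assert (head_size <= group_size);
  return group_size - head_size;  // possibly-partial tail vector
}
```

For example, six lanes over two four-lane vectors gives four active
lanes in the first vector and two in the second.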
> > diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
> > index 15fca17a407..ecad74e7cbf 100644
> > --- a/gcc/tree-vect-stmts.cc
> > +++ b/gcc/tree-vect-stmts.cc
> > @@ -1385,7 +1385,9 @@ vectorizable_internal_function (combined_fn cfn, tree
> > fndecl,
> > /* Record that a complete set of masks associated with VINFO would need to
> > contain a sequence of NVECTORS masks that each control a vector of type
> > VECTYPE. If SCALAR_MASK is nonnull, the fully-masked loop would AND
> > - these vector masks with the vector version of SCALAR_MASK. */
> > + these vector masks with the vector version of SCALAR_MASK.
> > Alternatively,
> > + if doing basic block vectorization, record that a mask could be used to
> > + vectorize SLP_NODE if required. */
> > static void
> > vect_record_mask (vec_info *vinfo, slp_tree slp_node, unsigned int
> > nvectors,
> > tree vectype, tree scalar_mask)
> > @@ -1395,7 +1397,7 @@ vect_record_mask (vec_info *vinfo, slp_tree slp_node,
> > unsigned int nvectors,
> > vect_record_loop_mask (loop_vinfo, &LOOP_VINFO_MASKS (loop_vinfo),
> > nvectors,
> > vectype, scalar_mask);
> > else
> > - (void) slp_node; /* FORNOW */
> > + vect_slp_record_bb_mask (slp_node, nvectors, vectype, scalar_mask);
> > }
> >
> > /* Given a complete set of masks associated with VINFO, extract mask number
> > @@ -1413,16 +1415,15 @@ vect_get_mask (vec_info *vinfo, slp_tree slp_node,
> > gimple_stmt_iterator *gsi,
> > return vect_get_loop_mask (loop_vinfo, gsi, &LOOP_VINFO_MASKS
> > (loop_vinfo),
> > nvectors, vectype, index);
> > else
> > - {
> > - (void) slp_node; /* FORNOW */
> > - return NULL_TREE;
> > - }
> > + return vect_slp_get_bb_mask (slp_node, gsi, nvectors, vectype, index);
> > }
> >
> > /* Record that a complete set of lengths associated with VINFO would need
> > to
> > contain a sequence of NVECTORS lengths for controlling an operation on
> > VECTYPE. The operation splits each element of VECTYPE into FACTOR
> > separate
> > - subelements, measuring the length as a number of these subelements. */
> > + subelements, measuring the length as a number of these subelements.
> > + Alternatively, if doing basic block vectorization, record that a length
> > limit
> > + could be used to vectorize SLP_NODE if required. */
> > static void
> > vect_record_len (vec_info *vinfo, slp_tree slp_node, unsigned int nvectors,
> > tree vectype, unsigned int factor)
> > @@ -1432,7 +1433,7 @@ vect_record_len (vec_info *vinfo, slp_tree slp_node,
> > unsigned int nvectors,
> > vect_record_loop_len (loop_vinfo, &LOOP_VINFO_LENS (loop_vinfo),
> > nvectors,
> > vectype, factor);
> > else
> > - (void) slp_node; /* FORNOW */
> > + vect_slp_record_bb_len (slp_node, nvectors, vectype, factor);
> > }
> >
> > /* Given a complete set of lengths associated with VINFO, extract length
> > number
> > @@ -1453,10 +1454,8 @@ vect_get_len (vec_info *vinfo, slp_tree slp_node,
> > gimple_stmt_iterator *gsi,
> > return vect_get_loop_len (loop_vinfo, gsi, &LOOP_VINFO_LENS
> > (loop_vinfo),
> > nvectors, vectype, index, factor, adjusted);
> > else
> > - {
> > - (void) slp_node; /* FORNOW */
> > - return NULL_TREE;
> > - }
> > + return vect_slp_get_bb_len (slp_node, nvectors, vectype, index, factor,
> > + adjusted);
> > }
> >
> > static tree permute_vec_elements (vec_info *, tree, tree, tree,
> > stmt_vec_info,
> > @@ -14710,24 +14709,35 @@ supportable_indirect_convert_operation
> > (code_helper code,
> > mask[I] is true iff J + START_INDEX < END_INDEX for all J <= I.
> > Add the statements to SEQ. */
> >
> > +void
> > +vect_gen_while_ssa_name (gimple_seq *seq, tree mask_type, tree start_index,
> > + tree end_index, tree ssa_name)
> > +{
> > + tree cmp_type = TREE_TYPE (start_index);
> > + gcc_checking_assert (direct_internal_fn_supported_p (IFN_WHILE_ULT,
> > cmp_type,
> > + mask_type,
> > + OPTIMIZE_FOR_SPEED));
> > + gcall *call
> > + = gimple_build_call_internal (IFN_WHILE_ULT, 3, start_index, end_index,
> > + build_zero_cst (mask_type));
> > + gimple_call_set_lhs (call, ssa_name);
> > + gimple_seq_add_stmt (seq, call);
> > +}
> > +
> > +/* Like vect_gen_while_ssa_name except that it creates a new SSA_NAME node
> > + for type MASK_TYPE defined in the created GIMPLE_CALL statement. If
> > NAME
> > + is not a null pointer then it is used for the SSA_NAME in dumps. */
> > +
> > tree
> > vect_gen_while (gimple_seq *seq, tree mask_type, tree start_index,
> > tree end_index, const char *name)
> > {
> > - tree cmp_type = TREE_TYPE (start_index);
> > - gcc_checking_assert (direct_internal_fn_supported_p (IFN_WHILE_ULT,
> > - cmp_type, mask_type,
> > - OPTIMIZE_FOR_SPEED));
> > - gcall *call = gimple_build_call_internal (IFN_WHILE_ULT, 3,
> > - start_index, end_index,
> > - build_zero_cst (mask_type));
> > tree tmp;
> > if (name)
> > tmp = make_temp_ssa_name (mask_type, NULL, name);
> > else
> > tmp = make_ssa_name (mask_type);
> > - gimple_call_set_lhs (call, tmp);
> > - gimple_seq_add_stmt (seq, call);
> > + vect_gen_while_ssa_name (seq, mask_type, start_index, end_index, tmp);
> > return tmp;
> > }
> >
> > diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h
> > index a3855568b09..f79f04ff8ac 100644
> > --- a/gcc/tree-vectorizer.h
> > +++ b/gcc/tree-vectorizer.h
> > @@ -312,6 +312,13 @@ struct vect_load_store_data : vect_data {
> > bool subchain_p; // VMAT_STRIDED_SLP and VMAT_GATHER_SCATTER
> > };
> >
> > +enum vect_partial_vector_style {
> > + vect_partial_vectors_none,
> > + vect_partial_vectors_while_ult,
> > + vect_partial_vectors_avx512,
> > + vect_partial_vectors_len
> > +};
> > +
> > /* A computation tree of an SLP instance. Each node corresponds to a
> > group of
> > stmts to be packed in a SIMD stmt. */
> > struct _slp_tree {
> > @@ -377,7 +384,16 @@ struct _slp_tree {
> > /* For BB vect, flag to indicate this load node should be vectorized
> > as to avoid STLF fails because of related stores. */
> > bool avoid_stlf_fail;
> > -
> > + /* The style used for implementing partial vectors if LANES is less than
> > + the minimum number of lanes implied by the VECTYPE. */
> > + vect_partial_vector_style partial_vector_style;
>
> I wonder if we want to / need to mix styles across the SLP subgraph, and
> likewise whether we really need to track can_use_partial_vectors per SLP
> node as opposed to per subgraph. I also wonder if we want to deal with
> the case of parts of the graph being unsupported because of a lack of
> masking support, which we could fix by promoting that part to extern
> (not covered) rather than failing the whole subgraph.
>
> That is, I'm questioning (maybe again?) the overall tracking/analysis phase?
>
> > + /* Flag to indicate whether we still have the option of vectorizing this
> > node
> > + using partial vectors (i.e. using lengths or masks to prevent use of
> > + inactive scalar lanes). */
> > + bool can_use_partial_vectors;
> > + /* Number of partial vectors, for costing purposes. Should be 0 unless a
> > + partial vector style has been set. */
> > + int num_partial_vectors;
> > int vertex;
> >
> > /* The kind of operation as determined by analysis and optional
> > @@ -476,6 +492,9 @@ public:
> > #define SLP_TREE_GS_BASE(S) (S)->gs_base
> > #define SLP_TREE_REDUC_IDX(S) (S)->cycle_info.reduc_idx
> > #define SLP_TREE_PERMUTE_P(S) ((S)->code == VEC_PERM_EXPR)
> > +#define SLP_TREE_PARTIAL_VECTORS_STYLE(S) (S)->partial_vector_style
> > +#define SLP_TREE_CAN_USE_PARTIAL_VECTORS_P(S) (S)->can_use_partial_vectors
> > +#define SLP_TREE_NUM_PARTIAL_VECTORS(S) (S)->num_partial_vectors
> >
> > inline vect_memory_access_type
> > SLP_TREE_MEMORY_ACCESS_TYPE (slp_tree node)
> > @@ -486,13 +505,6 @@ SLP_TREE_MEMORY_ACCESS_TYPE (slp_tree node)
> > return VMAT_UNINITIALIZED;
> > }
> >
> > -enum vect_partial_vector_style {
> > - vect_partial_vectors_none,
> > - vect_partial_vectors_while_ult,
> > - vect_partial_vectors_avx512,
> > - vect_partial_vectors_len
> > -};
> > -
> > /* Key for map that records association between
> > scalar conditions and corresponding loop mask, and
> > is populated by vect_record_loop_mask. */
> > @@ -2607,6 +2619,7 @@ extern tree vect_gen_perm_mask_checked (tree, const
> > vec_perm_indices &);
> > extern void optimize_mask_stores (class loop*);
> > extern tree vect_gen_while (gimple_seq *, tree, tree, tree,
> > const char * = nullptr);
> > +extern void vect_gen_while_ssa_name (gimple_seq *, tree, tree, tree, tree);
> > extern tree vect_gen_while_not (gimple_seq *, tree, tree, tree);
> > extern opt_result vect_get_vector_types_for_stmt (vec_info *,
> > stmt_vec_info, tree *,
> > @@ -2788,7 +2801,14 @@ extern slp_tree vect_create_new_slp_node (unsigned,
> > tree_code);
> > extern void vect_free_slp_tree (slp_tree);
> > extern bool compatible_calls_p (gcall *, gcall *, bool);
> > extern int vect_slp_child_index_for_operand (const stmt_vec_info, int op);
> > -
> > +extern void vect_slp_record_bb_mask (slp_tree slp_node, unsigned int
> > nvectors,
> > + tree vectype, tree scalar_mask);
> > +extern tree vect_slp_get_bb_mask (slp_tree, gimple_stmt_iterator *,
> > + unsigned int, tree, unsigned int);
> > +extern void vect_slp_record_bb_len (slp_tree slp_node, unsigned int
> > nvectors,
> > + tree vectype, unsigned int factor);
> > +extern tree vect_slp_get_bb_len (slp_tree, unsigned int, tree, unsigned
> > int,
> > + unsigned int, bool);
> > extern tree prepare_vec_mask (vec_info *, tree, tree, tree,
> > gimple_stmt_iterator *);
> > extern tree vect_get_mask_load_else (int, tree);
> > @@ -2953,7 +2973,7 @@ vect_cannot_use_partial_vectors (vec_info *vinfo,
> > slp_tree slp_node)
> > if (loop_vinfo)
> > LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P (loop_vinfo) = false;
> > else
> > - (void) slp_node; /* FORNOW */
> > + SLP_TREE_CAN_USE_PARTIAL_VECTORS_P (slp_node) = false;
> > }
> >
> > /* Return true if VINFO is vectorizer state for loop vectorization, we've
> > @@ -2967,10 +2987,8 @@ vect_fully_with_length_p (vec_info *vinfo, slp_tree
> > slp_node)
> > if (loop_vec_info loop_vinfo = dyn_cast<loop_vec_info> (vinfo))
> > return LOOP_VINFO_FULLY_WITH_LENGTH_P (loop_vinfo);
> > else
> > - {
> > - (void) slp_node; /* FORNOW */
> > - return false;
> > - }
> > + return SLP_TREE_PARTIAL_VECTORS_STYLE (slp_node)
> > + == vect_partial_vectors_len;
> > }
> >
> > /* Return true if VINFO is vectorizer state for loop vectorization, we've
> > @@ -2984,10 +3002,8 @@ vect_fully_masked_p (vec_info *vinfo, slp_tree
> > slp_node)
> > if (loop_vec_info loop_vinfo = dyn_cast<loop_vec_info> (vinfo))
> > return LOOP_VINFO_FULLY_MASKED_P (loop_vinfo);
> > else
> > - {
> > - (void) slp_node; /* FORNOW */
> > - return false;
> > - }
> > + return SLP_TREE_PARTIAL_VECTORS_STYLE (slp_node)
> > + == vect_partial_vectors_while_ult;
> > }
> >
> > /* If STMT_INFO describes a reduction, return the vect_reduction_type
> > --
> > 2.43.0
> >