On Thu, 31 Jul 2025, Richard Biener wrote:

> On Thu, 31 Jul 2025, Tamar Christina wrote:
> 
> > > -----Original Message-----
> > > From: Richard Biener <rguent...@suse.de>
> > > Sent: Wednesday, July 30, 2025 12:17 PM
> > > To: gcc-patches@gcc.gnu.org
> > > Cc: Tamar Christina <tamar.christ...@arm.com>
> > > Subject: [PATCH 2/2] Put SLP_TREE_SIMD_CLONE_INFO into type specifc data
> > > 
> > > The following adds vect_simd_clone_data as a container for vect
> > > type specific data for vectorizable_simd_clone_call and moves
> > > SLP_TREE_SIMD_CLONE_INFO there.
> > > 
> > > Bootstrapped and tested on x86_64-unknown-linux-gnu.
> > > 
> > > This time for a simpler candidate (and more immediately useful
> > > since it shrinks the SLP node size)
> > > 
> > > I suppose this is somewhat better - do you have a better suggestion
> > > for the dances around the data object creation upon analysis success?
> > 
> > I've only checked that this compiles, but how about something like
> > 
> > > 
> > > Thanks,
> > > Richard.
> > > 
> > >   * tree-vectorizer.h (vect_simd_clone_data): New.
> > >   (_slp_tree::simd_clone_info): Remove.
> > >   (SLP_TREE_SIMD_CLONE_INFO): Likewise.
> > >   * tree-vect-slp.cc (_slp_tree::_slp_tree): Adjust.
> > >   (_slp_tree::~_slp_tree): Likewise.
> > >   * tree-vect-stmts.cc (vectorizable_simd_clone_call): Use
> > >   tyupe specific data to store SLP_TREE_SIMD_CLONE_INFO.
> > > ---
> > >  gcc/tree-vect-slp.cc   |  2 --
> > >  gcc/tree-vect-stmts.cc |  8 +++++---
> > >  gcc/tree-vectorizer.h  | 22 ++++++++++++++++------
> > >  3 files changed, 21 insertions(+), 11 deletions(-)
> > > 
> > > diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
> > > index 233543214fa..4038fe84e87 100644
> > > --- a/gcc/tree-vect-slp.cc
> > > +++ b/gcc/tree-vect-slp.cc
> > > @@ -118,7 +118,6 @@ _slp_tree::_slp_tree ()
> > >    SLP_TREE_CHILDREN (this) = vNULL;
> > >    SLP_TREE_LOAD_PERMUTATION (this) = vNULL;
> > >    SLP_TREE_LANE_PERMUTATION (this) = vNULL;
> > > -  SLP_TREE_SIMD_CLONE_INFO (this) = vNULL;
> > >    SLP_TREE_DEF_TYPE (this) = vect_uninitialized_def;
> > >    SLP_TREE_CODE (this) = ERROR_MARK;
> > >    this->ldst_lanes = false;
> > > @@ -150,7 +149,6 @@ _slp_tree::~_slp_tree ()
> > >    SLP_TREE_VEC_DEFS (this).release ();
> > >    SLP_TREE_LOAD_PERMUTATION (this).release ();
> > >    SLP_TREE_LANE_PERMUTATION (this).release ();
> > > -  SLP_TREE_SIMD_CLONE_INFO (this).release ();
> > >    if (this->failed)
> > >      free (failed);
> > >    if (this->data)
> > > diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
> > > index 5d34fcd193b..d9ff7fe7960 100644
> > > --- a/gcc/tree-vect-stmts.cc
> > > +++ b/gcc/tree-vect-stmts.cc
> > > @@ -3873,9 +3873,10 @@ vectorizable_simd_clone_call (vec_info *vinfo,
> > > stmt_vec_info stmt_info,
> > >    if (nargs == 0)
> > >      return false;
> > > 
> > > -  vec<tree>& simd_clone_info = SLP_TREE_SIMD_CLONE_INFO (slp_node);
> > > -  if (cost_vec)
> > > -    simd_clone_info.truncate (0);
> > > +  vect_simd_clone_data _data;
> > > +  vec<tree>& simd_clone_info
> > > +    = cost_vec ? _data.simd_clone_info
> > > +      : static_cast <vect_simd_clone_data *> 
> > > (slp_node->data)->simd_clone_info;
> > 
> > What if we put the conditional extraction on the creation of _data.  So 
> > something like:
> > 
> >   vect_simd_clone_data *_data;
> >   slp_get_vect_data (cost_vec, slp_node, _data);
> >   vec<tree>& simd_clone_info = _data->simd_clone_info;
> 
> ah, interesting, even better make it a template member function (is there
> sth like that!?  I'll figure that out).

So the following works.  Adding _slp_tree::get_data like

  template <class T>
  T& get_data (T& else_) { return data ? *static_cast <T *> (data) : 
else_; }

and using it like

  vect_simd_clone_data _data;
  vect_simd_clone_data &data = slp_node->get_data (_data);
  vec<tree>& simd_clone_info = data.simd_clone_info;

ideally we could magically write-protect 'data' during transform time.

At allocation time the defaulted move CTOR works and I have to write

      slp_node->data = new vect_simd_clone_data (std::move (_data));

I suppose we could place the actual storage into _slp_tree itself
in a char[] array and construct the actual objects in-place.  It'll
be then much like a union (so we could actually use a union, right?
but not sure about C++14).

It does look nicer, but still somewhat ugly due to the automatic
_data.

Richard.

> > PS. Could we do the extraction based on whether slp_node->data is null or 
> > not?
> >        then we don't need the cost_vec parameter?
> > 
> > The helper in tree-vectorizer.h is then
> > 
> > template <typename T>
> > inline void slp_get_vect_data (bool costing_p, slp_tree slp_node, T& 
> > refvalue)
> > {
> >   if (costing_p && slp_node->data)
> >     {
> >       refvalue = static_cast<T>(slp_node->data);
> >       return;
> >     }
> > 
> >   refvalue = T{};
> > }
> > So we move the instantiation into the helper instead.
> 
> Hmm, but above _data is a pointer, so you pass *_data instead even
> though that is uninitialized?  I don't quite see how this should work ...
> 
> > >    arginfo.reserve (nargs, true);
> > >    auto_vec<slp_tree> slp_op;
> > >    slp_op.safe_grow_cleared (nargs);
> > > @@ -4272,6 +4273,7 @@ vectorizable_simd_clone_call (vec_info *vinfo,
> > > stmt_vec_info stmt_info,
> > >   }
> > > 
> > >        SLP_TREE_TYPE (slp_node) = call_simd_clone_vec_info_type;
> > > +      slp_node->data = new vect_simd_clone_data (_data);
> > 
> > And then this becomes 
> > 
> > slp_node->data = _data;
> > 
> > but we have to delete _data on failure.  That or have  the default case 
> > assign
> > it to slp_tree and have its destructor take care of it?
> 
> ... so I _did_ think about doing, in vect_analyze_stmt,
> 
>  slp->data = new vect_simd_clone_data ();
>  if (vectorizable_simd_clone_call (...))
>    ;
>  else
>   {
>     delete slp->data;
>     if (vectorizable_...)
>        ...
> 
> but apart from being ugly we have a lot of allocation/deallocation which
> I wanted to aovid.
> 
> > so something like
> > 
> > template <typename T>
> > inline void slp_get_vect_data (bool costing_p, slp_tree slp_node, T& 
> > refvalue)
> > {
> >   if (costing_p && slp_node->data)
> >     {
> >       refvalue = static_cast<T>(slp_node->data);
> >       return;
> >     }
> > 
> >   refvalue = T{};
> >   slp_node->data = refvalue;
> > }
> > So we don't have to worry about the delete?
> 
> I'll for now see to use a template helper to avoid the ugly casting.
> 
> > >        DUMP_VECT_SCOPE ("vectorizable_simd_clone_call");
> > >  /*      vect_model_simple_cost (vinfo, 1, slp_node, cost_vec); */
> > >        return true;
> > > diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h
> > > index ad27577cc55..56ba705fa39 100644
> > > --- a/gcc/tree-vectorizer.h
> > > +++ b/gcc/tree-vectorizer.h
> > > @@ -243,6 +243,22 @@ struct vect_data {
> > >    virtual ~vect_data () = default;
> > >  };
> > > 
> > > +/* Analysis data from vectorizable_simd_clone_call for
> > > +   call_simd_clone_vec_info_type.  */
> > > +struct vect_simd_clone_data : vect_data {
> > > +  virtual ~vect_simd_clone_data () = default;
> > > +  vect_simd_clone_data () = default;
> > > +  vect_simd_clone_data (vect_simd_clone_data &other)
> > > +  {
> > > +    simd_clone_info = std::move (other.simd_clone_info);
> > > +  }
> > > +
> > 
> > I think we can have the compiler generate this right? Using
> > 
> > vect_simd_clone_data(vect_simd_clone_data&& other) = default;
> > 
> > so we don't have to move the members individually.
> > 
> > Would something like this work?
> 
> Ah, that's something easy to try (and nice if it works).
> 
> Richard.
> 
> > Thanks,
> > Tamar
> > 
> > > +  /* Selected SIMD clone's function info.  First vector element
> > > +     is SIMD clone's function decl, followed by a pair of trees (base + 
> > > step)
> > > +     for linear arguments (pair of NULLs for other arguments).  */
> > > +  auto_vec<tree> simd_clone_info;
> > > +};
> > > +
> > >  /* A computation tree of an SLP instance.  Each node corresponds to a 
> > > group of
> > >     stmts to be packed in a SIMD stmt.  */
> > >  struct _slp_tree {
> > > @@ -271,11 +287,6 @@ struct _slp_tree {
> > >       denotes the number of output lanes.  */
> > >    lane_permutation_t lane_permutation;
> > > 
> > > -  /* Selected SIMD clone's function info.  First vector element
> > > -     is SIMD clone's function decl, followed by a pair of trees (base + 
> > > step)
> > > -     for linear arguments (pair of NULLs for other arguments).  */
> > > -  vec<tree> simd_clone_info;
> > > -
> > >    tree vectype;
> > >    /* Vectorized defs.  */
> > >    vec<tree> vec_defs;
> > > @@ -392,7 +403,6 @@ public:
> > >  #define SLP_TREE_NUMBER_OF_VEC_STMTS(S)          (S)->vec_stmts_size
> > >  #define SLP_TREE_LOAD_PERMUTATION(S)             (S)->load_permutation
> > >  #define SLP_TREE_LANE_PERMUTATION(S)             (S)->lane_permutation
> > > -#define SLP_TREE_SIMD_CLONE_INFO(S)              (S)->simd_clone_info
> > >  #define SLP_TREE_DEF_TYPE(S)                      (S)->def_type
> > >  #define SLP_TREE_VECTYPE(S)                       (S)->vectype
> > >  #define SLP_TREE_REPRESENTATIVE(S)                (S)->representative
> > > --
> > > 2.43.0
> > 
> 
> 

-- 
Richard Biener <rguent...@suse.de>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)

Reply via email to