RE: [PATCH v2 3/16]middle-end Add basic SLP pattern matching scaffolding.

2020-11-04 Thread Richard Biener
On Wed, 4 Nov 2020, Tamar Christina wrote:

> > -Original Message-
> > From: rguent...@c653.arch.suse.de  On
> > Behalf Of Richard Biener
> > Sent: Wednesday, November 4, 2020 2:04 PM
> > To: Tamar Christina 
> > Cc: Richard Sandiford ; nd ;
> > gcc-patches@gcc.gnu.org
> > Subject: RE: [PATCH v2 3/16]middle-end Add basic SLP pattern matching
> > scaffolding.
> > 
> > On Wed, 4 Nov 2020, Tamar Christina wrote:
> > 
> > > > -Original Message-
> > > > From: rguent...@c653.arch.suse.de  On
> > > > Behalf Of Richard Biener
> > > > Sent: Wednesday, November 4, 2020 12:41 PM
> > > > To: Tamar Christina 
> > > > Cc: Richard Sandiford ; nd ;
> > > > gcc-patches@gcc.gnu.org
> > > > Subject: RE: [PATCH v2 3/16]middle-end Add basic SLP pattern
> > > > matching scaffolding.
> > > >
> > > > On Tue, 3 Nov 2020, Tamar Christina wrote:
> > > >
> > > > > Hi Richi,
> > > > >
> > > > > This is a respin which includes the changes you requested.
> > > >
> > > > Comments randomly ordered, I'm pasting in pieces of the patch -
> > > > sending it inline would help to get pieces properly quoted and in-order.
> > > >
> > > > diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h index
> > > >
> > 4bd454cfb185d7036843fc7140b073f525b2ec6a..b813508d3ceaf4c54f612bc10f
> > > > 9
> > > > aa42ffe0ce0dd
> > > > 100644
> > > > --- a/gcc/tree-vectorizer.h
> > > > +++ b/gcc/tree-vectorizer.h
> > > > ...
> > > >
> > > > I miss comments in this file, see tree-vectorizer.h where we try to
> > > > document purpose of classes and fields.
> > > >
> > > > Things that sticks out to me:
> > > >
> > > > +uint8_t m_arity;
> > > > +uint8_t m_num_args;
> > > >
> > > > why uint8_t and not simply unsigned int?  Not knowing what arity /
> > > > num_args should be here ;)
> > >
> > > I think I can remove arity, but num_args is how many operands the
> > > created internal function call should take.  Since we can't vectorize
> > > calls with more than
> > > 4 arguments at the moment it seemed like 255 would be a safe limit :).
> > >
> > > >
> > > > +vec_info *m_vinfo;
> > > > ...
> > > > +vect_pattern (slp_tree *node, vec_info *vinfo)
> > > >
> > > > so this looks like something I freed stmt_vec_info of -
> > > > back-pointers in the "wrong" direction of the logical hierarchy.  I
> > > > suppose it's just to avoid passing down vinfo where we need it?
> > > > Please do that instead - pass down vinfo as everything else does.
> > > >
> > > > The class seems to expose both very high-level (build () it!) and
> > > > very low level details (get_ifn).  The high-level one suggests that
> > > > a pattern _not_ being represented by an ifn is possible but there's
> > > > too much implementation detail already in the vect_pattern class to
> > > > make that impossible.  I guess the IFN details could be pushed down
> > > > to the simple matching class (and that be called vect_ifn_pattern or 
> > > > so).
> > > >
> > > > +static bool
> > > > +vect_match_slp_patterns (slp_tree *ref_node, vec_info *vinfo) {
> > > > +  DUMP_VECT_SCOPE ("vect_match_slp_patterns");
> > > > +  bool found_p = false;
> > > > +
> > > > +  if (dump_enabled_p ())
> > > > +{
> > > > +  dump_printf_loc (MSG_NOTE, vect_location, "-- before patt
> > > > + match
> > > > --\n");
> > > > +  vect_print_slp_graph (MSG_NOTE, vect_location, *ref_node);
> > > > +  dump_printf_loc (MSG_NOTE, vect_location, "-- end patt --\n");
> > > > +}
> > > >
> > > > we dumped all instances after their analysis.  Maybe just refer to
> > > > the instance with its address (dump_print %p) so lookup in the
> > > > (already large) dump file is easy.
> > > >
> > > > +  hash_set *visited = new hash_set ();  for
> > > > + (unsigned x = 0; x < num__slp_patterns; x++)
> > > > +{
> > > > +  visited->empty ();
> > > > +  found_p |= vect_match_slp_patterns_2 (

RE: [PATCH v2 3/16]middle-end Add basic SLP pattern matching scaffolding.

2020-11-04 Thread Tamar Christina via Gcc-patches
> -Original Message-
> From: rguent...@c653.arch.suse.de  On
> Behalf Of Richard Biener
> Sent: Wednesday, November 4, 2020 2:04 PM
> To: Tamar Christina 
> Cc: Richard Sandiford ; nd ;
> gcc-patches@gcc.gnu.org
> Subject: RE: [PATCH v2 3/16]middle-end Add basic SLP pattern matching
> scaffolding.
> 
> On Wed, 4 Nov 2020, Tamar Christina wrote:
> 
> > > -Original Message-
> > > From: rguent...@c653.arch.suse.de  On
> > > Behalf Of Richard Biener
> > > Sent: Wednesday, November 4, 2020 12:41 PM
> > > To: Tamar Christina 
> > > Cc: Richard Sandiford ; nd ;
> > > gcc-patches@gcc.gnu.org
> > > Subject: RE: [PATCH v2 3/16]middle-end Add basic SLP pattern
> > > matching scaffolding.
> > >
> > > On Tue, 3 Nov 2020, Tamar Christina wrote:
> > >
> > > > Hi Richi,
> > > >
> > > > This is a respin which includes the changes you requested.
> > >
> > > Comments randomly ordered, I'm pasting in pieces of the patch -
> > > sending it inline would help to get pieces properly quoted and in-order.
> > >
> > > diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h index
> > >
> 4bd454cfb185d7036843fc7140b073f525b2ec6a..b813508d3ceaf4c54f612bc10f
> > > 9
> > > aa42ffe0ce0dd
> > > 100644
> > > --- a/gcc/tree-vectorizer.h
> > > +++ b/gcc/tree-vectorizer.h
> > > ...
> > >
> > > I miss comments in this file, see tree-vectorizer.h where we try to
> > > document purpose of classes and fields.
> > >
> > > Things that sticks out to me:
> > >
> > > +uint8_t m_arity;
> > > +uint8_t m_num_args;
> > >
> > > why uint8_t and not simply unsigned int?  Not knowing what arity /
> > > num_args should be here ;)
> >
> > I think I can remove arity, but num_args is how many operands the
> > created internal function call should take.  Since we can't vectorize
> > calls with more than
> > 4 arguments at the moment it seemed like 255 would be a safe limit :).
> >
> > >
> > > +vec_info *m_vinfo;
> > > ...
> > > +vect_pattern (slp_tree *node, vec_info *vinfo)
> > >
> > > so this looks like something I freed stmt_vec_info of -
> > > back-pointers in the "wrong" direction of the logical hierarchy.  I
> > > suppose it's just to avoid passing down vinfo where we need it?
> > > Please do that instead - pass down vinfo as everything else does.
> > >
> > > The class seems to expose both very high-level (build () it!) and
> > > very low level details (get_ifn).  The high-level one suggests that
> > > a pattern _not_ being represented by an ifn is possible but there's
> > > too much implementation detail already in the vect_pattern class to
> > > make that impossible.  I guess the IFN details could be pushed down
> > > to the simple matching class (and that be called vect_ifn_pattern or so).
> > >
> > > +static bool
> > > +vect_match_slp_patterns (slp_tree *ref_node, vec_info *vinfo) {
> > > +  DUMP_VECT_SCOPE ("vect_match_slp_patterns");
> > > +  bool found_p = false;
> > > +
> > > +  if (dump_enabled_p ())
> > > +{
> > > +  dump_printf_loc (MSG_NOTE, vect_location, "-- before patt
> > > + match
> > > --\n");
> > > +  vect_print_slp_graph (MSG_NOTE, vect_location, *ref_node);
> > > +  dump_printf_loc (MSG_NOTE, vect_location, "-- end patt --\n");
> > > +}
> > >
> > > we dumped all instances after their analysis.  Maybe just refer to
> > > the instance with its address (dump_print %p) so lookup in the
> > > (already large) dump file is easy.
> > >
> > > +  hash_set *visited = new hash_set ();  for
> > > + (unsigned x = 0; x < num__slp_patterns; x++)
> > > +{
> > > +  visited->empty ();
> > > +  found_p |= vect_match_slp_patterns_2 (ref_node, vinfo,
> > > slp_patterns[x],
> > > +   visited);
> > > +}
> > > +
> > > +  delete visited;
> > >
> > > no need to new / delete, just do
> > >
> > >   has_set visited;
> > >
> > > like everyone else.  Btw, do you really want to scan pieces of the
> > > SLP graph (with instances being graph entries) multiple times?  If
> > > not then you should move the visited s

RE: [PATCH v2 3/16]middle-end Add basic SLP pattern matching scaffolding.

2020-11-04 Thread Richard Biener
On Wed, 4 Nov 2020, Tamar Christina wrote:

> > -Original Message-
> > From: rguent...@c653.arch.suse.de  On
> > Behalf Of Richard Biener
> > Sent: Wednesday, November 4, 2020 12:41 PM
> > To: Tamar Christina 
> > Cc: Richard Sandiford ; nd ;
> > gcc-patches@gcc.gnu.org
> > Subject: RE: [PATCH v2 3/16]middle-end Add basic SLP pattern matching
> > scaffolding.
> > 
> > On Tue, 3 Nov 2020, Tamar Christina wrote:
> > 
> > > Hi Richi,
> > >
> > > This is a respin which includes the changes you requested.
> > 
> > Comments randomly ordered, I'm pasting in pieces of the patch - sending it
> > inline would help to get pieces properly quoted and in-order.
> > 
> > diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h index
> > 4bd454cfb185d7036843fc7140b073f525b2ec6a..b813508d3ceaf4c54f612bc10f9
> > aa42ffe0ce0dd
> > 100644
> > --- a/gcc/tree-vectorizer.h
> > +++ b/gcc/tree-vectorizer.h
> > ...
> > 
> > I miss comments in this file, see tree-vectorizer.h where we try to document
> > purpose of classes and fields.
> > 
> > Things that sticks out to me:
> > 
> > +uint8_t m_arity;
> > +uint8_t m_num_args;
> > 
> > why uint8_t and not simply unsigned int?  Not knowing what arity /
> > num_args should be here ;)
> 
> I think I can remove arity, but num_args is how many operands the created
> internal function call should take.  Since we can't vectorize calls with more 
> than
> 4 arguments at the moment it seemed like 255 would be a safe limit :).
> 
> > 
> > +vec_info *m_vinfo;
> > ...
> > +vect_pattern (slp_tree *node, vec_info *vinfo)
> > 
> > so this looks like something I freed stmt_vec_info of - back-pointers in the
> > "wrong" direction of the logical hierarchy.  I suppose it's just to avoid 
> > passing
> > down vinfo where we need it?  Please do that instead - pass down vinfo as
> > everything else does.
> > 
> > The class seems to expose both very high-level (build () it!) and very low
> > level details (get_ifn).  The high-level one suggests that a pattern _not_
> > being represented by an ifn is possible but there's too much implementation
> > detail already in the vect_pattern class to make that impossible.  I guess 
> > the
> > IFN details could be pushed down to the simple matching class (and that be
> > called vect_ifn_pattern or so).
> > 
> > +static bool
> > +vect_match_slp_patterns (slp_tree *ref_node, vec_info *vinfo) {
> > +  DUMP_VECT_SCOPE ("vect_match_slp_patterns");
> > +  bool found_p = false;
> > +
> > +  if (dump_enabled_p ())
> > +{
> > +  dump_printf_loc (MSG_NOTE, vect_location, "-- before patt match
> > --\n");
> > +  vect_print_slp_graph (MSG_NOTE, vect_location, *ref_node);
> > +  dump_printf_loc (MSG_NOTE, vect_location, "-- end patt --\n");
> > +}
> > 
> > we dumped all instances after their analysis.  Maybe just refer to the
> > instance with its address (dump_print %p) so lookup in the (already large)
> > dump file is easy.
> > 
> > +  hash_set *visited = new hash_set ();  for
> > + (unsigned x = 0; x < num__slp_patterns; x++)
> > +{
> > +  visited->empty ();
> > +  found_p |= vect_match_slp_patterns_2 (ref_node, vinfo,
> > slp_patterns[x],
> > +   visited);
> > +}
> > +
> > +  delete visited;
> > 
> > no need to new / delete, just do
> > 
> >   has_set visited;
> > 
> > like everyone else.  Btw, do you really want to scan pieces of the SLP graph
> > (with instances being graph entries) multiple times?  If not then you should
> > move the visited set to the caller instead.
> > 
> > +  /* TODO: Remove in final version, only here for generating debug dot
> > graphs
> > +  from SLP tree.  */
> > +
> > +  if (dump_enabled_p ())
> > +{
> > +  dump_printf_loc (MSG_NOTE, vect_location, "-- start dot --\n");
> > +  vect_print_slp_graph (MSG_NOTE, vect_location, *ref_node);
> > +  dump_printf_loc (MSG_NOTE, vect_location, "-- end dot --\n");
> > +}
> > 
> > now, if there was some pattern matched it is probably useful to dump the
> > graph (entry) again.  But only conditional on that I think.  So can you 
> > instead
> > make the dump conditional on found_p and remove the start dot/end dot
> > markers as said in the comment?
> &g

RE: [PATCH v2 3/16]middle-end Add basic SLP pattern matching scaffolding.

2020-11-04 Thread Tamar Christina via Gcc-patches
> -Original Message-
> From: rguent...@c653.arch.suse.de  On
> Behalf Of Richard Biener
> Sent: Wednesday, November 4, 2020 12:41 PM
> To: Tamar Christina 
> Cc: Richard Sandiford ; nd ;
> gcc-patches@gcc.gnu.org
> Subject: RE: [PATCH v2 3/16]middle-end Add basic SLP pattern matching
> scaffolding.
> 
> On Tue, 3 Nov 2020, Tamar Christina wrote:
> 
> > Hi Richi,
> >
> > This is a respin which includes the changes you requested.
> 
> Comments randomly ordered, I'm pasting in pieces of the patch - sending it
> inline would help to get pieces properly quoted and in-order.
> 
> diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h index
> 4bd454cfb185d7036843fc7140b073f525b2ec6a..b813508d3ceaf4c54f612bc10f9
> aa42ffe0ce0dd
> 100644
> --- a/gcc/tree-vectorizer.h
> +++ b/gcc/tree-vectorizer.h
> ...
> 
> I miss comments in this file, see tree-vectorizer.h where we try to document
> purpose of classes and fields.
> 
> Things that sticks out to me:
> 
> +uint8_t m_arity;
> +uint8_t m_num_args;
> 
> why uint8_t and not simply unsigned int?  Not knowing what arity /
> num_args should be here ;)

I think I can remove arity, but num_args is how many operands the created
internal function call should take.  Since we can't vectorize calls with more 
than
4 arguments at the moment it seemed like 255 would be a safe limit :).

> 
> +vec_info *m_vinfo;
> ...
> +vect_pattern (slp_tree *node, vec_info *vinfo)
> 
> so this looks like something I freed stmt_vec_info of - back-pointers in the
> "wrong" direction of the logical hierarchy.  I suppose it's just to avoid 
> passing
> down vinfo where we need it?  Please do that instead - pass down vinfo as
> everything else does.
> 
> The class seems to expose both very high-level (build () it!) and very low
> level details (get_ifn).  The high-level one suggests that a pattern _not_
> being represented by an ifn is possible but there's too much implementation
> detail already in the vect_pattern class to make that impossible.  I guess the
> IFN details could be pushed down to the simple matching class (and that be
> called vect_ifn_pattern or so).
> 
> +static bool
> +vect_match_slp_patterns (slp_tree *ref_node, vec_info *vinfo) {
> +  DUMP_VECT_SCOPE ("vect_match_slp_patterns");
> +  bool found_p = false;
> +
> +  if (dump_enabled_p ())
> +{
> +  dump_printf_loc (MSG_NOTE, vect_location, "-- before patt match
> --\n");
> +  vect_print_slp_graph (MSG_NOTE, vect_location, *ref_node);
> +  dump_printf_loc (MSG_NOTE, vect_location, "-- end patt --\n");
> +}
> 
> we dumped all instances after their analysis.  Maybe just refer to the
> instance with its address (dump_print %p) so lookup in the (already large)
> dump file is easy.
> 
> +  hash_set *visited = new hash_set ();  for
> + (unsigned x = 0; x < num__slp_patterns; x++)
> +{
> +  visited->empty ();
> +  found_p |= vect_match_slp_patterns_2 (ref_node, vinfo,
> slp_patterns[x],
> +   visited);
> +}
> +
> +  delete visited;
> 
> no need to new / delete, just do
> 
>   has_set visited;
> 
> like everyone else.  Btw, do you really want to scan pieces of the SLP graph
> (with instances being graph entries) multiple times?  If not then you should
> move the visited set to the caller instead.
> 
> +  /* TODO: Remove in final version, only here for generating debug dot
> graphs
> +  from SLP tree.  */
> +
> +  if (dump_enabled_p ())
> +{
> +  dump_printf_loc (MSG_NOTE, vect_location, "-- start dot --\n");
> +  vect_print_slp_graph (MSG_NOTE, vect_location, *ref_node);
> +  dump_printf_loc (MSG_NOTE, vect_location, "-- end dot --\n");
> +}
> 
> now, if there was some pattern matched it is probably useful to dump the
> graph (entry) again.  But only conditional on that I think.  So can you 
> instead
> make the dump conditional on found_p and remove the start dot/end dot
> markers as said in the comment?
> 
> + if (dump_enabled_p ())
> +   dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
> +"transformation for %s not valid due to
> + post
> "
> +"condition\n",
> 
> not really a MSG_MISSED_OPTIMIZATION, use MSG_NOTE.
> MSG_MISSED_OPTIMIZATION should be used for things (likely) making
> vectorization fail.
> 
> +  /* Perform recursive matching, it's important to do this after
> + matching
> things
> 
> before matching things?
> 
> + in the current node as the ma

RE: [PATCH v2 3/16]middle-end Add basic SLP pattern matching scaffolding.

2020-11-04 Thread Richard Biener
ntation
right now (guess splitting the patch up doesn't really help much ...)

Noting elsewhere:

+static void
+vect_dissolve_slp_only_patterns (loop_vec_info loop_vinfo,
+hash_set *visited, slp_tree 
root)
+{
+  if (!root || visited->contains (root))
+return;
+
+  unsigned int i;
+  slp_tree node;
+  stmt_vec_info related_stmt_info;
+  stmt_vec_info stmt_info = SLP_TREE_REPRESENTATIVE (root);
+
+  visited->add (root);

visited->add () returns whether the key was already there so
please combine the contains and the add calls here and elsewhere.

if (stmt_info && STMT_VINFO_SLP_VECT_ONLY (stmt_info)
+&& (related_stmt_info = STMT_VINFO_RELATED_STMT (stmt_info)) != 
NULL)
+  {
+   if (dump_enabled_p ())

I think STMT_VINFO_SLP_VECT_ONLY is a thing already, so I wonder
if the above is a sufficient test.  There's is_pattern_stmt_p ()
(which obviously also applies to non-SLP only patterns).

Note that I also see another issue, namely with hybrid SLP the
original non-pattern stmt would need to stay relevant.  That
probably asks for hybrid SLP discovery and relevance marking
to be combined somehow.  You can probably create a simple
testcase by storing a lane of a complex op via a non-grouped
store.

+   STMT_VINFO_RELEVANT (stmt_info) = vect_unused_in_scope;
+   STMT_VINFO_RELEVANT (related_stmt_info) = vect_used_in_scope;

as said previously the relevance should be copied, in this case
back to the original stmt info from the pattern stmt info.

+   STMT_VINFO_IN_PATTERN_P (related_stmt_info) = false;
+   STMT_SLP_TYPE (related_stmt_info) = loop_vect;

one option might be to delay pattern recognition (for the loop case)
to after vect_detect_hybrid_slp and simply not make any hybrid
stmt containing nodes part of a SLP pattern.  It would be a workaround
of course, not the best solution.  Another option is to
add a new field to stmt_info to put a "saved" relevance to and
simply go over all stmts restoring relevance - we already restore
SLP_TYPE to loop_vect that way at

  /* Reset SLP type to loop_vect on all stmts.  */
  for (i = 0; i < LOOP_VINFO_LOOP (loop_vinfo)->num_nodes; ++i)
{

so simply restore the saved relevance would work there (guess I
like that more than what vect_dissolve_slp_only_patterns does,
in any case do what vect_dissolve_slp_only_patterns does in the
above loop please).

Thanks,
Richard.



> Thanks,
> Tamar
> 
> gcc/ChangeLog:
> 
>   * Makefile.in (tree-vect-slp-patterns.o): New.
>   * doc/passes.texi: Update documentation.
>   * tree-vect-slp.c (vect_print_slp_tree): Add new state.
>   (vect_match_slp_patterns_2, vect_match_slp_patterns): New.
>   (vect_analyze_slp): Call pattern matcher.
>   * tree-vectorizer.h (enum _complex_operation):
>   (class vect_pattern_match, class vect_pattern): New.
>   * tree-vect-slp-patterns.c: New file.
> 
> > -Original Message-----
> > From: rguent...@c653.arch.suse.de  On
> > Behalf Of Richard Biener
> > Sent: Tuesday, September 29, 2020 10:42 AM
> > To: Richard Sandiford 
> > Cc: Tamar Christina ; nd ; gcc-
> > patc...@gcc.gnu.org
> > Subject: Re: [PATCH v2 3/16]middle-end Add basic SLP pattern matching
> > scaffolding.
> > 
> > On Tue, 29 Sep 2020, Richard Sandiford wrote:
> > 
> > > Richard Biener  writes:
> > > >> > > @@ -2192,6 +2378,17 @@ vect_analyze_slp_instance (vec_info
> > *vinfo,
> > > >> > >   _size, bst_map);
> > > >> > >if (node != NULL)
> > > >> > >  {
> > > >> > > +  /* Temporarily allow add_stmt calls again.  */
> > > >> > > +  vinfo->stmt_vec_info_ro = false;
> > > >> > > +
> > > >> > > +  /* See if any patterns can be found in the constructed SLP 
> > > >> > > tree
> > > >> > > +before we do any analysis on it.  */
> > > >> > > +  vect_match_slp_patterns (node, vinfo, group_size,
> > _nunits,
> > > >> > > +  matches, , _size,
> > > >> > > + bst_map);
> > > >> > > +
> > > >> > > +  /* After this no more add_stmt calls are allowed.  */
> > > >> > > +  vinfo->stmt_vec_info_ro = true;
> > > >> > > +
> > > >> > >
> > > >> > > I think this is a bit early to match patterns - I'd defer it to
> > > >> > > the point where all entries into the same SLP subgraph are
> > > >> > > analyzed, thus somewhere at the end of vect_analyze_slp l

RE: [PATCH v2 3/16]middle-end Add basic SLP pattern matching scaffolding.

2020-11-03 Thread Tamar Christina via Gcc-patches
Hi Richi,

This is a respin which includes the changes you requested.

Thanks,
Tamar

gcc/ChangeLog:

* Makefile.in (tree-vect-slp-patterns.o): New.
* doc/passes.texi: Update documentation.
* tree-vect-slp.c (vect_print_slp_tree): Add new state.
(vect_match_slp_patterns_2, vect_match_slp_patterns): New.
(vect_analyze_slp): Call pattern matcher.
* tree-vectorizer.h (enum _complex_operation):
(class vect_pattern_match, class vect_pattern): New.
* tree-vect-slp-patterns.c: New file.

> -Original Message-
> From: rguent...@c653.arch.suse.de  On
> Behalf Of Richard Biener
> Sent: Tuesday, September 29, 2020 10:42 AM
> To: Richard Sandiford 
> Cc: Tamar Christina ; nd ; gcc-
> patc...@gcc.gnu.org
> Subject: Re: [PATCH v2 3/16]middle-end Add basic SLP pattern matching
> scaffolding.
> 
> On Tue, 29 Sep 2020, Richard Sandiford wrote:
> 
> > Richard Biener  writes:
> > >> > > @@ -2192,6 +2378,17 @@ vect_analyze_slp_instance (vec_info
> *vinfo,
> > >> > >   _size, bst_map);
> > >> > >if (node != NULL)
> > >> > >  {
> > >> > > +  /* Temporarily allow add_stmt calls again.  */
> > >> > > +  vinfo->stmt_vec_info_ro = false;
> > >> > > +
> > >> > > +  /* See if any patterns can be found in the constructed SLP 
> > >> > > tree
> > >> > > +before we do any analysis on it.  */
> > >> > > +  vect_match_slp_patterns (node, vinfo, group_size,
> _nunits,
> > >> > > +  matches, , _size,
> > >> > > + bst_map);
> > >> > > +
> > >> > > +  /* After this no more add_stmt calls are allowed.  */
> > >> > > +  vinfo->stmt_vec_info_ro = true;
> > >> > > +
> > >> > >
> > >> > > I think this is a bit early to match patterns - I'd defer it to
> > >> > > the point where all entries into the same SLP subgraph are
> > >> > > analyzed, thus somewhere at the end of vect_analyze_slp loop
> > >> > > over all instances and match patterns?  That way phases are more
> clearly separated.
> > >> >
> > >> > That would probably work, my only worry is that the SLP analysis
> > >> > itself may fail and bail out at
> > >> >
> > >> >  /* If the loads and stores can be handled with load/store-lane
> > >> > instructions do not generate this SLP instance.  */
> > >> >  if (is_a  (vinfo)
> > >> >  && loads_permuted
> > >> >  && dr && vect_store_lanes_supported (vectype, group_size,
> > >> > false))
> > >
> > > Ah, that piece of code.  Yeah, I'm repeatedly running into it as
> > > well - it's a bad hack that stands in the way all the time :/
> >
> > At one point I was wondering about trying to drop the above, vectorise
> > with and without SLP, and then compare their costs, like for
> VECT_COMPARE_COSTS.
> > But that seemed like a dead end with the move to doing everything on
> > the SLP representation.
> 
> Yeah ... though even moving everything to the SLP representation will retain
> the issue since there it will be N group-size 1 SLP instances vs. 1 
> group-size N
> SLP instance.
> 
> > > I guess we should try moving this upward like to
> > > vect_analyze_loop_2 right before
> > >
> > >   /* Check the SLP opportunities in the loop, analyze and build SLP trees.
> > > */
> > >   ok = vect_analyze_slp (loop_vinfo, *n_stmts);
> > >   if (!ok)
> > > return ok;
> > >
> > > and there check whether all grouped loads and stores can be handled
> > > with store-/load-lanes (and there are no SLP reduction chains?) in
> > > which case do not try to attempt SLP at all.  Because the testcases
> > > this check was supposed to change were all-load/store-lane or all
> > > SLP so the mixed case is probably not worth special casing.
> > >
> > > Since load-/store-lanes is an arm speciality I tried to only touch
> > > this fragile part with a ten-foot pole ;)  CCing Richard, if he acks
> > > the above I can produce a patch.
> >
> > Yeah, sounds good to me.  Probably also sorth checking whether the
> > likely_max iteration count is high enough to support group_size
> > vectors, if we 

Re: [PATCH v2 3/16]middle-end Add basic SLP pattern matching scaffolding.

2020-09-29 Thread Richard Biener
On Tue, 29 Sep 2020, Richard Sandiford wrote:

> Richard Biener  writes:
> >> > > @@ -2192,6 +2378,17 @@ vect_analyze_slp_instance (vec_info *vinfo,
> >> > >   _size, bst_map);
> >> > >if (node != NULL)
> >> > >  {
> >> > > +  /* Temporarily allow add_stmt calls again.  */
> >> > > +  vinfo->stmt_vec_info_ro = false;
> >> > > +
> >> > > +  /* See if any patterns can be found in the constructed SLP tree
> >> > > +before we do any analysis on it.  */
> >> > > +  vect_match_slp_patterns (node, vinfo, group_size, _nunits,
> >> > > +  matches, , _size,
> >> > > + bst_map);
> >> > > +
> >> > > +  /* After this no more add_stmt calls are allowed.  */
> >> > > +  vinfo->stmt_vec_info_ro = true;
> >> > > +
> >> > >
> >> > > I think this is a bit early to match patterns - I'd defer it to the
> >> > > point where all entries into the same SLP subgraph are analyzed, thus
> >> > > somewhere at the end of vect_analyze_slp loop over all instances and
> >> > > match patterns?  That way phases are more clearly separated.
> >> > 
> >> > That would probably work, my only worry is that the SLP analysis itself 
> >> > may
> >> > fail and bail out at
> >> > 
> >> >/* If the loads and stores can be handled with load/store-lane
> >> >   instructions do not generate this SLP instance.  */
> >> >if (is_a  (vinfo)
> >> >&& loads_permuted
> >> >&& dr && vect_store_lanes_supported (vectype, group_size,
> >> > false))
> >
> > Ah, that piece of code.  Yeah, I'm repeatedly running into it as well - 
> > it's a bad hack that stands in the way all the time :/
> 
> At one point I was wondering about trying to drop the above, vectorise with
> and without SLP, and then compare their costs, like for VECT_COMPARE_COSTS.
> But that seemed like a dead end with the move to doing everything on the
> SLP representation.

Yeah ... though even moving everything to the SLP representation will
retain the issue since there it will be N group-size 1 SLP instances
vs. 1 group-size N SLP instance.

> > I guess we should try moving this upward like to
> > vect_analyze_loop_2 right before
> >
> >   /* Check the SLP opportunities in the loop, analyze and build SLP trees.  
> > */
> >   ok = vect_analyze_slp (loop_vinfo, *n_stmts);
> >   if (!ok)
> > return ok;
> >
> > and there check whether all grouped loads and stores can be handled
> > with store-/load-lanes (and there are no SLP reduction chains?) in
> > which case do not try to attempt SLP at all.  Because the testcases
> > this check was supposed to change were all-load/store-lane or
> > all SLP so the mixed case is probably not worth special casing.
> >
> > Since load-/store-lanes is an arm speciality I tried to only touch
> > this fragile part with a ten-foot pole ;)  CCing Richard, if he
> > acks the above I can produce a patch.
> 
> Yeah, sounds good to me.  Probably also sorth checking whether the
> likely_max iteration count is high enough to support group_size
> vectors, if we have enough information to guess that.
> 
> We could also get the gen* machinery to emit a macro that is true if at
> least one load/store-lane pattern is defined, so that we can skip the
> code for non-Arm targets.  I can do that as a follow-up.

I've had a second look and one complication is that we only elide the
SLP node if any of the loads are permuted.  So if all loads/stores
are unpermuted but load/store-lanes would work we'd keep the SLP node.

Of course without actually building the SLP node we don't know whether
the loads will be permuted or not ...

But surely the current place for the check will cause some testcases
to become hybrid vectorizations which is likely undesirable.

So we could move the check after all SLP discovery is completed
and throw it all away if we can and should use load/store-lanes?
But that might then not solve Tamars issue.

Richard.


Re: [PATCH v2 3/16]middle-end Add basic SLP pattern matching scaffolding.

2020-09-29 Thread Richard Sandiford
Richard Biener  writes:
>> > > @@ -2192,6 +2378,17 @@ vect_analyze_slp_instance (vec_info *vinfo,
>> > >   _size, bst_map);
>> > >if (node != NULL)
>> > >  {
>> > > +  /* Temporarily allow add_stmt calls again.  */
>> > > +  vinfo->stmt_vec_info_ro = false;
>> > > +
>> > > +  /* See if any patterns can be found in the constructed SLP tree
>> > > +before we do any analysis on it.  */
>> > > +  vect_match_slp_patterns (node, vinfo, group_size, _nunits,
>> > > +  matches, , _size,
>> > > + bst_map);
>> > > +
>> > > +  /* After this no more add_stmt calls are allowed.  */
>> > > +  vinfo->stmt_vec_info_ro = true;
>> > > +
>> > >
>> > > I think this is a bit early to match patterns - I'd defer it to the
>> > > point where all entries into the same SLP subgraph are analyzed, thus
>> > > somewhere at the end of vect_analyze_slp loop over all instances and
>> > > match patterns?  That way phases are more clearly separated.
>> > 
>> > That would probably work, my only worry is that the SLP analysis itself may
>> > fail and bail out at
>> > 
>> >  /* If the loads and stores can be handled with load/store-lane
>> > instructions do not generate this SLP instance.  */
>> >  if (is_a  (vinfo)
>> >  && loads_permuted
>> >  && dr && vect_store_lanes_supported (vectype, group_size,
>> > false))
>
> Ah, that piece of code.  Yeah, I'm repeatedly running into it as well - 
> it's a bad hack that stands in the way all the time :/

At one point I was wondering about trying to drop the above, vectorise with
and without SLP, and then compare their costs, like for VECT_COMPARE_COSTS.
But that seemed like a dead end with the move to doing everything on the
SLP representation.

> I guess we should try moving this upward like to
> vect_analyze_loop_2 right before
>
>   /* Check the SLP opportunities in the loop, analyze and build SLP trees.  
> */
>   ok = vect_analyze_slp (loop_vinfo, *n_stmts);
>   if (!ok)
> return ok;
>
> and there check whether all grouped loads and stores can be handled
> with store-/load-lanes (and there are no SLP reduction chains?) in
> which case do not try to attempt SLP at all.  Because the testcases
> this check was supposed to change were all-load/store-lane or
> all SLP so the mixed case is probably not worth special casing.
>
> Since load-/store-lanes is an arm speciality I tried to only touch
> this fragile part with a ten-foot pole ;)  CCing Richard, if he
> acks the above I can produce a patch.

Yeah, sounds good to me.  Probably also sorth checking whether the
likely_max iteration count is high enough to support group_size
vectors, if we have enough information to guess that.

We could also get the gen* machinery to emit a macro that is true if at
least one load/store-lane pattern is defined, so that we can skip the
code for non-Arm targets.  I can do that as a follow-up.

Thanks,
Richard


RE: [PATCH v2 3/16]middle-end Add basic SLP pattern matching scaffolding.

2020-09-29 Thread Richard Biener
On Mon, 28 Sep 2020, Tamar Christina wrote:

> 
> 
> > -Original Message-
> > From: Gcc-patches  On Behalf Of Tamar
> > Christina
> > Sent: Monday, September 28, 2020 3:56 PM
> > To: Richard Biener 
> > Cc: nd ; gcc-patches@gcc.gnu.org; o...@ucw.cz
> > Subject: RE: [PATCH v2 3/16]middle-end Add basic SLP pattern matching
> > scaffolding.
> > 
> > Hi Richi,
> > 
> > Thanks for the review!
> > 
> > Just some answers to your questions:
> > 
> > > -Original Message-
> > > From: rguent...@c653.arch.suse.de  On
> > > Behalf Of Richard Biener
> > > Sent: Monday, September 28, 2020 1:37 PM
> > > To: Tamar Christina 
> > > Cc: gcc-patches@gcc.gnu.org; nd ; o...@ucw.cz
> > > Subject: Re: [PATCH v2 3/16]middle-end Add basic SLP pattern matching
> > > scaffolding.
> > >
> > > On Fri, 25 Sep 2020, Tamar Christina wrote:
> > >
> > > > Hi All,
> > > >
> > > > This patch adds the basic infrastructure for doing pattern matching
> > > > on SLP
> > > trees.
> > > > This is done immediately after the SLP tree creation because it can
> > > > change the shape of the tree in radical ways and so we would like to
> > > > do it before any analysis is performed on the tree.
> > > >
> > > > A new file tree-vect-slp-patterns.c is added which contains all the
> > > > code for pattern matching on SLP trees.
> > > >
> > > > This cover letter is short because the changes are heavily commented.
> > > >
> > > > All pattern matchers need to implement the abstract type
> > > VectPatternMatch.
> > > > The VectSimplePatternMatch abstract class provides some default
> > > > functionality for pattern matchers that need to rebuild nodes.
> > > >
> > > > The pattern matcher requires if replacing a statement in a node,
> > > > that ALL statements be replaced.
> > > >
> > > > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> > > >
> > > > Ok for master?
> > >
> > > +gcall *build ()
> > > +{
> > > +  stmt_vec_info stmt_info;
> > > +
> > >
> > > please define functions out-of-line (apart from the 1-liners)
> > >
> > > +  /* We have to explicitly mark the old statement as unused
> > > + because
> > > during
> > > +statement analysis the original and new pattern statement may
> > > require
> > > +different level of unrolling.  As an example add/sub when
> > > vectorized
> > > +without a pattern requires 4 copies, whereas with a
> > > + COMPLEX_ADD
> > > pattern
> > > +this only requires 2 copies and the two statement will be
> > > + treated
> > > as
> > > +hand unrolled.  That means that the analysis won't happen as
> > > it'll find
> > > +a mismatch.  So we don't analyze the old statement and if we
> > > + end
> > > up
> > > +needing it, e.g. SLP fails then we have to quickly re-analyze it.
> > > */
> > > +  STMT_VINFO_RELEVANT (stmt_info) = vect_unused_in_scope;
> > > +  STMT_VINFO_SLP_VECT_ONLY (call_stmt_info) = true;
> > > +  STMT_VINFO_RELATED_STMT (call_stmt_info) = stmt_info;
> > >
> > > so this means all uses have to be inside the pattern as otherwise
> > > there may be even non-SLP uses.  vect_mark_pattern_stmts supports
> > > detecting patterns of patterns, I suppose the two-phase analysis for
> > > SLP patterns does not support this right now?
> > >
> > > +  SLP_TREE_CODE (this->m_node) = gimple_expr_code (call_stmt);;
> > >
> > > double ;, just make it CALL_EXPR literally (or leave it ERROR_MARK)
> > >
> > > You seem to do in-place changing of the SLP node you match off?
> > 
> > Yes since this would allow me to change the root node as well, though
> > thinking about it I can probably do it by passing it as a reference which 
> > then
> > would allow me to re-use vect_create_new_slp_node which is probably
> > preferable.
> > 
> > >
> > > @@ -2192,6 +2378,17 @@ vect_analyze_slp_instance (vec_info *vinfo,
> > >   _size, bst_map);
> > >if (node != NULL)
> > >  {
> > > +  /* Temporarily allow add_stmt calls again.  */
> > > + 

RE: [PATCH v2 3/16]middle-end Add basic SLP pattern matching scaffolding.

2020-09-28 Thread Tamar Christina



> -Original Message-
> From: Gcc-patches  On Behalf Of Tamar
> Christina
> Sent: Monday, September 28, 2020 3:56 PM
> To: Richard Biener 
> Cc: nd ; gcc-patches@gcc.gnu.org; o...@ucw.cz
> Subject: RE: [PATCH v2 3/16]middle-end Add basic SLP pattern matching
> scaffolding.
> 
> Hi Richi,
> 
> Thanks for the review!
> 
> Just some answers to your questions:
> 
> > -Original Message-
> > From: rguent...@c653.arch.suse.de  On
> > Behalf Of Richard Biener
> > Sent: Monday, September 28, 2020 1:37 PM
> > To: Tamar Christina 
> > Cc: gcc-patches@gcc.gnu.org; nd ; o...@ucw.cz
> > Subject: Re: [PATCH v2 3/16]middle-end Add basic SLP pattern matching
> > scaffolding.
> >
> > On Fri, 25 Sep 2020, Tamar Christina wrote:
> >
> > > Hi All,
> > >
> > > This patch adds the basic infrastructure for doing pattern matching
> > > on SLP
> > trees.
> > > This is done immediately after the SLP tree creation because it can
> > > change the shape of the tree in radical ways and so we would like to
> > > do it before any analysis is performed on the tree.
> > >
> > > A new file tree-vect-slp-patterns.c is added which contains all the
> > > code for pattern matching on SLP trees.
> > >
> > > This cover letter is short because the changes are heavily commented.
> > >
> > > All pattern matchers need to implement the abstract type
> > VectPatternMatch.
> > > The VectSimplePatternMatch abstract class provides some default
> > > functionality for pattern matchers that need to rebuild nodes.
> > >
> > > The pattern matcher requires if replacing a statement in a node,
> > > that ALL statements be replaced.
> > >
> > > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> > >
> > > Ok for master?
> >
> > +gcall *build ()
> > +{
> > +  stmt_vec_info stmt_info;
> > +
> >
> > please define functions out-of-line (apart from the 1-liners)
> >
> > +  /* We have to explicitly mark the old statement as unused
> > + because
> > during
> > +statement analysis the original and new pattern statement may
> > require
> > +different level of unrolling.  As an example add/sub when
> > vectorized
> > +without a pattern requires 4 copies, whereas with a
> > + COMPLEX_ADD
> > pattern
> > +this only requires 2 copies and the two statement will be
> > + treated
> > as
> > +hand unrolled.  That means that the analysis won't happen as
> > it'll find
> > +a mismatch.  So we don't analyze the old statement and if we
> > + end
> > up
> > +needing it, e.g. SLP fails then we have to quickly re-analyze it.
> > */
> > +  STMT_VINFO_RELEVANT (stmt_info) = vect_unused_in_scope;
> > +  STMT_VINFO_SLP_VECT_ONLY (call_stmt_info) = true;
> > +  STMT_VINFO_RELATED_STMT (call_stmt_info) = stmt_info;
> >
> > so this means all uses have to be inside the pattern as otherwise
> > there may be even non-SLP uses.  vect_mark_pattern_stmts supports
> > detecting patterns of patterns, I suppose the two-phase analysis for
> > SLP patterns does not support this right now?
> >
> > +  SLP_TREE_CODE (this->m_node) = gimple_expr_code (call_stmt);;
> >
> > double ;, just make it CALL_EXPR literally (or leave it ERROR_MARK)
> >
> > You seem to do in-place changing of the SLP node you match off?
> 
> Yes since this would allow me to change the root node as well, though
> thinking about it I can probably do it by passing it as a reference which then
> would allow me to re-use vect_create_new_slp_node which is probably
> preferable.
> 
> >
> > @@ -2192,6 +2378,17 @@ vect_analyze_slp_instance (vec_info *vinfo,
> >   _size, bst_map);
> >if (node != NULL)
> >  {
> > +  /* Temporarily allow add_stmt calls again.  */
> > +  vinfo->stmt_vec_info_ro = false;
> > +
> > +  /* See if any patterns can be found in the constructed SLP tree
> > +before we do any analysis on it.  */
> > +  vect_match_slp_patterns (node, vinfo, group_size, _nunits,
> > +  matches, , _size,
> > + bst_map);
> > +
> > +  /* After this no more add_stmt calls are allowed.  */
> > +  vinfo->stmt_vec_info_ro = true;
> > +
> >
> > I think this is a bit early to match patterns - I'd defer it to the
> > point w

RE: [PATCH v2 3/16]middle-end Add basic SLP pattern matching scaffolding.

2020-09-28 Thread Tamar Christina
Hi Richi,

Thanks for the review! 

Just some answers to your questions:

> -Original Message-
> From: rguent...@c653.arch.suse.de  On
> Behalf Of Richard Biener
> Sent: Monday, September 28, 2020 1:37 PM
> To: Tamar Christina 
> Cc: gcc-patches@gcc.gnu.org; nd ; o...@ucw.cz
> Subject: Re: [PATCH v2 3/16]middle-end Add basic SLP pattern matching
> scaffolding.
> 
> On Fri, 25 Sep 2020, Tamar Christina wrote:
> 
> > Hi All,
> >
> > This patch adds the basic infrastructure for doing pattern matching on SLP
> trees.
> > This is done immediately after the SLP tree creation because it can
> > change the shape of the tree in radical ways and so we would like to
> > do it before any analysis is performed on the tree.
> >
> > A new file tree-vect-slp-patterns.c is added which contains all the
> > code for pattern matching on SLP trees.
> >
> > This cover letter is short because the changes are heavily commented.
> >
> > All pattern matchers need to implement the abstract type
> VectPatternMatch.
> > The VectSimplePatternMatch abstract class provides some default
> > functionality for pattern matchers that need to rebuild nodes.
> >
> > The pattern matcher requires if replacing a statement in a node, that
> > ALL statements be replaced.
> >
> > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> >
> > Ok for master?
> 
> +gcall *build ()
> +{
> +  stmt_vec_info stmt_info;
> +
> 
> please define functions out-of-line (apart from the 1-liners)
> 
> +  /* We have to explicitly mark the old statement as unused because
> during
> +statement analysis the original and new pattern statement may
> require
> +different level of unrolling.  As an example add/sub when
> vectorized
> +without a pattern requires 4 copies, whereas with a COMPLEX_ADD
> pattern
> +this only requires 2 copies and the two statement will be
> + treated
> as
> +hand unrolled.  That means that the analysis won't happen as
> it'll find
> +a mismatch.  So we don't analyze the old statement and if we
> + end
> up
> +needing it, e.g. SLP fails then we have to quickly re-analyze it.
> */
> +  STMT_VINFO_RELEVANT (stmt_info) = vect_unused_in_scope;
> +  STMT_VINFO_SLP_VECT_ONLY (call_stmt_info) = true;
> +  STMT_VINFO_RELATED_STMT (call_stmt_info) = stmt_info;
> 
> so this means all uses have to be inside the pattern as otherwise there may
> be even non-SLP uses.  vect_mark_pattern_stmts supports detecting
> patterns of patterns, I suppose the two-phase analysis for SLP patterns does
> not support this right now?
> 
> +  SLP_TREE_CODE (this->m_node) = gimple_expr_code (call_stmt);;
> 
> double ;, just make it CALL_EXPR literally (or leave it ERROR_MARK)
> 
> You seem to do in-place changing of the SLP node you match off?

Yes since this would allow me to change the root node as well, though
thinking about it I can probably do it by passing it as a reference which
then would allow me to re-use vect_create_new_slp_node which is
probably preferable. 

> 
> @@ -2192,6 +2378,17 @@ vect_analyze_slp_instance (vec_info *vinfo,
>   _size, bst_map);
>if (node != NULL)
>  {
> +  /* Temporarily allow add_stmt calls again.  */
> +  vinfo->stmt_vec_info_ro = false;
> +
> +  /* See if any patterns can be found in the constructed SLP tree
> +before we do any analysis on it.  */
> +  vect_match_slp_patterns (node, vinfo, group_size, _nunits,
> +  matches, , _size,
> + bst_map);
> +
> +  /* After this no more add_stmt calls are allowed.  */
> +  vinfo->stmt_vec_info_ro = true;
> +
> 
> I think this is a bit early to match patterns - I'd defer it to the point 
> where all
> entries into the same SLP subgraph are analyzed, thus somewhere at the
> end of vect_analyze_slp loop over all instances and match patterns?  That
> way phases are more clearly separated.

That would probably work, my only worry is that the SLP analysis itself may 
fail and
bail out at 

  /* If the loads and stores can be handled with load/store-lane
 instructions do not generate this SLP instance.  */
  if (is_a  (vinfo)
  && loads_permuted
  && dr && vect_store_lanes_supported (vectype, group_size, false))

Which in the initial tree may be true, but in the patterned tree may not be.  
In the previous
revision of the patch you had suggested I return a boolean which can be used to 
cancel such
checks.  Would that be the preferred approach?

> 
> Note that f

Re: [PATCH v2 3/16]middle-end Add basic SLP pattern matching scaffolding.

2020-09-28 Thread Richard Biener
On Fri, 25 Sep 2020, Tamar Christina wrote:

> Hi All,
> 
> This patch adds the basic infrastructure for doing pattern matching on SLP 
> trees.
> This is done immediately after the SLP tree creation because it can change the
> shape of the tree in radical ways and so we would like to do it before any
> analysis is performed on the tree.
> 
> A new file tree-vect-slp-patterns.c is added which contains all the code for
> pattern matching on SLP trees.
> 
> This cover letter is short because the changes are heavily commented.
> 
> All pattern matchers need to implement the abstract type VectPatternMatch.
> The VectSimplePatternMatch abstract class provides some default functionality
> for pattern matchers that need to rebuild nodes.
> 
> The pattern matcher requires if replacing a statement in a node, that ALL
> statements be replaced.
> 
> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> 
> Ok for master?

+gcall *build ()
+{
+  stmt_vec_info stmt_info;
+

please define functions out-of-line (apart from the 1-liners)

+  /* We have to explicitly mark the old statement as unused because 
during
+statement analysis the original and new pattern statement may 
require
+different level of unrolling.  As an example add/sub when 
vectorized
+without a pattern requires 4 copies, whereas with a COMPLEX_ADD 
pattern
+this only requires 2 copies and the two statement will be treated 
as
+hand unrolled.  That means that the analysis won't happen as 
it'll find
+a mismatch.  So we don't analyze the old statement and if we end 
up
+needing it, e.g. SLP fails then we have to quickly re-analyze it.  
*/
+  STMT_VINFO_RELEVANT (stmt_info) = vect_unused_in_scope;
+  STMT_VINFO_SLP_VECT_ONLY (call_stmt_info) = true;
+  STMT_VINFO_RELATED_STMT (call_stmt_info) = stmt_info;

so this means all uses have to be inside the pattern as otherwise
there may be even non-SLP uses.  vect_mark_pattern_stmts supports
detecting patterns of patterns, I suppose the two-phase analysis
for SLP patterns does not support this right now?

+  SLP_TREE_CODE (this->m_node) = gimple_expr_code (call_stmt);;

double ;, just make it CALL_EXPR literally (or leave it ERROR_MARK)

You seem to do in-place changing of the SLP node you match off?

@@ -2192,6 +2378,17 @@ vect_analyze_slp_instance (vec_info *vinfo,
  _size, bst_map);
   if (node != NULL)
 {
+  /* Temporarily allow add_stmt calls again.  */
+  vinfo->stmt_vec_info_ro = false;
+
+  /* See if any patterns can be found in the constructed SLP tree
+before we do any analysis on it.  */
+  vect_match_slp_patterns (node, vinfo, group_size, _nunits,
+  matches, , _size, bst_map);
+
+  /* After this no more add_stmt calls are allowed.  */
+  vinfo->stmt_vec_info_ro = true;
+

I think this is a bit early to match patterns - I'd defer it to the
point where all entries into the same SLP subgraph are analyzed,
thus somewhere at the end of vect_analyze_slp loop over all
instances and match patterns?  That way phases are more clearly
separated.

Note that fiddling with vinfo->stmt_vec_info_ro is a bit ugly,
maybe add a ->add_pattern_stmt (gimple *pattern_stmt, stmt_vec_info 
orig_stmt) variant that also sets STMT_VINFO_RELATED_STMT
but doesn't check !stmt_vec_info_ro.  That could be used from
tree-vect-patterns.c as well and we could set stmt_vec_info_ro
earlier.

+  VectPattern *pattern = patt_fn (node, vinfo);
+  uint8_t n = pattern->get_arity ();
+
+  if (group_size % n != 0)
+{
+  delete pattern;

seems to require VectPattern allocation even upon failure, I suggest
to return NULL then to avoid excessive allocations.

+  if (!pattern->matches (stmt_infos, i))
+   {
+ /* We can only do replacements for entire groups, we must 
replace all
+statements in a node as the argument list/children may not 
have
+equal height then.  Operations that don't rewrite the 
arguments
+may be safe to do, so perhaps paramatrise it.  */
+
+ found_p = false;

I find it a bit ugly to iterate over "unrolls" in the machinery
rather than the individual pattern matcher which might have an
easier and in particular cheaper job here.  Since you require
all lanes to match the same pattern anyway.   Not sure if your
later patches support say, mixing complex add with different
rotate in the same SLP node.  Note the ultimate idea in the end
is that a SLP node can, of course, be split into two [but at
this point the vector type / unroll factor is not final so
general splitting at vector boundary is not desired yet].
The split can be undone for consumers by inserting a
VEC_PERM node (which should semantically be a concat + select)

+  tree type = gimple_expr_type (STMT_VINFO_STMT (stmt_info));
+  tree vectype = get_vectype_for_scalar_type (vinfo, type, node);

use 

  

[PATCH v2 3/16]middle-end Add basic SLP pattern matching scaffolding.

2020-09-25 Thread Tamar Christina
Hi All,

This patch adds the basic infrastructure for doing pattern matching on SLP 
trees.
This is done immediately after the SLP tree creation because it can change the
shape of the tree in radical ways and so we would like to do it before any
analysis is performed on the tree.

A new file tree-vect-slp-patterns.c is added which contains all the code for
pattern matching on SLP trees.

This cover letter is short because the changes are heavily commented.

All pattern matchers need to implement the abstract type VectPatternMatch.
The VectSimplePatternMatch abstract class provides some default functionality
for pattern matchers that need to rebuild nodes.

The pattern matcher requires if replacing a statement in a node, that ALL
statements be replaced.

Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

* Makefile.in (tree-vect-slp-patterns.o): New.
* doc/passes.texi: Update documentation.
* tree-vect-slp.c (vect_match_slp_patterns_2, vect_match_slp_patterns):
New.
(vect_analyze_slp_instance): Call pattern matcher.
* tree-vectorizer.h (class VectPatternMatch, class VectPattern): New.
* tree-vect-slp-patterns.c: New file.

-- 
diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 9c6c1c93b976aaf350cc1f9b3bdc538308fdf08b..936202b73696c8529b32c05b2356c7316fabc542 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1638,6 +1638,7 @@ OBJS = \
 	tree-vect-loop.o \
 	tree-vect-loop-manip.o \
 	tree-vect-slp.o \
+	tree-vect-slp-patterns.o \
 	tree-vectorizer.o \
 	tree-vector-builder.o \
 	tree-vrp.o \
diff --git a/gcc/doc/passes.texi b/gcc/doc/passes.texi
index a5ae4143a8c1293e674b499120372ee5fe5c412b..c86df5cd843084a5b7933ef99a23386891a7b0c1 100644
--- a/gcc/doc/passes.texi
+++ b/gcc/doc/passes.texi
@@ -709,7 +709,8 @@ loop.
 The pass is implemented in @file{tree-vectorizer.c} (the main driver),
 @file{tree-vect-loop.c} and @file{tree-vect-loop-manip.c} (loop specific parts
 and general loop utilities), @file{tree-vect-slp} (loop-aware SLP
-functionality), @file{tree-vect-stmts.c} and @file{tree-vect-data-refs.c}.
+functionality), @file{tree-vect-stmts.c}, @file{tree-vect-data-refs.c} and
+@file{tree-vect-slp-patterns.c} containing the SLP pattern matcher.
 Analysis of data references is in @file{tree-data-ref.c}.
 
 SLP Vectorization.  This pass performs vectorization of straight-line code. The
diff --git a/gcc/tree-vect-slp-patterns.c b/gcc/tree-vect-slp-patterns.c
new file mode 100644
index ..f605f68d2a14c4bf4941f97b7c1d57f6acb5ffb1
--- /dev/null
+++ b/gcc/tree-vect-slp-patterns.c
@@ -0,0 +1,310 @@
+/* SLP - Pattern matcher on SLP trees
+   Copyright (C) 2020 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+.  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "backend.h"
+#include "target.h"
+#include "rtl.h"
+#include "tree.h"
+#include "gimple.h"
+#include "tree-pass.h"
+#include "ssa.h"
+#include "optabs-tree.h"
+#include "insn-config.h"
+#include "recog.h"		/* FIXME: for insn_data */
+#include "fold-const.h"
+#include "stor-layout.h"
+#include "gimple-iterator.h"
+#include "cfgloop.h"
+#include "tree-vectorizer.h"
+#include "langhooks.h"
+#include "gimple-walk.h"
+#include "dbgcnt.h"
+#include "tree-vector-builder.h"
+#include "vec-perm-indices.h"
+#include "gimple-fold.h"
+#include "internal-fn.h"
+
+/* SLP Pattern matching mechanism.
+
+  This extension to the SLP vectorizer allows one to transform the generated SLP
+  tree based on any pattern.  The difference between this and the normal vect
+  pattern matcher is that unlike the former, this matcher allows you to match
+  with instructions that do not belong to the same SSA dominator graph.
+
+  The only requirement that this pattern matcher has is that you are only
+  only allowed to either match an entire group or none.
+
+  As an example, the following simple loop:
+
+double a[restrict N]; double b[restrict N]; double c[restrict N];
+
+for (int i=0; i < N; i+=2)
+{
+  c[i] = a[i] - b[i+1];
+  c[i+1] = a[i+1] + b[i];
+}
+
+  which represents a complex addition on with a rotation of 90* around the
+  argand plane. i.e. if `a` and `b` were complex numbers then this would be the
+  same as `a + (b * I)`.
+
+  Here the expressions for