Re: [PATCH] reassoc: Propagate PHI_LOOP_BIAS along single uses

2020-06-30 Thread Richard Biener via Gcc-patches
On Mon, Jun 29, 2020 at 4:58 PM Ilya Leoshkevich  wrote:
>
> On Thu, 2020-06-25 at 14:34 +0200, Richard Biener wrote:
> > On Wed, Jun 24, 2020 at 1:31 AM Ilya Leoshkevich via Gcc-patches
> >  wrote:
> > > Bootstrapped and regtested x86_64-redhat-linux, ppc64le-redhat-
> > > linux and
> > > s390x-redhat-linux.  I also ran SPEC 2006 and 2017 on these
> > > platforms,
> > > and the only measurable regression was 3% in 520.omnetpp_r on ppc,
> > > which
> > > went away after inserting a single nop at the beginning of
> > > cDynamicExpression::evaluate.
> > >
> > > OK for master?
> >
> > As you might know this is incredibly fragile so I'd prefer if you
> > submit
> > and push the change to disable PHI biasing in the early pass instance
> > separately.  At first glance that change looks reasonable (but we'll
> > watch for fallout).
>
> Will do.
>
> >
> > Comments on the other changes inline
> >
> > > ---
> > >
> > > PR tree-optimization/49749 introduced code that shortens dependency
> > > chains containing loop accumulators by placing them last on operand
> > > lists of associative operations.
> > >
> > > 456.hmmer benchmark on s390 could benefit from this, however, the
> > > code
> > > that needs it modifies loop accumulator before using it, and since
> > > only
> > > so-called loop-carried phis are are treated as loop accumulators,
> > > the
> > > code in the present form doesn't really help.   According to Bill
> > > Schmidt - the original author - such a conservative approach was
> > > chosen
> > > so as to avoid unnecessarily swapping operands, which might cause
> > > unpredictable effects.  However, giving special treatment to forms
> > > of
> > > loop accumulators is acceptable.
> > >
> > > The definition of loop-carried phi is: it's a single-use phi, which
> > > is
> > > used in the same innermost loop it's defined in, at least one
> > > argument
> > > of which is defined in the same innermost loop as the phi itself.
> > > Given this, it seems natural to treat single uses of such phis as
> > > phis
> > > themselves.
> > >
> > > gcc/ChangeLog:
> > >
> > > 2020-05-06  Ilya Leoshkevich  
> > >
> > > * passes.def (pass_reassoc): Rename parameter to early_p.
> > > * tree-ssa-reassoc.c
> > > (reassoc_bias_loop_carried_phi_ranks_p):
> > > New variable.
> > > (phi_rank): Don't bias loop-carried phi ranks
> > > before vectorization pass.
> > > (loop_carried_phi): Remove (superseded by
> > > operand_rank::biased_p).
> > > (propagate_rank): Propagate bias along single uses.
> > > (get_rank): Pass stmt to propagate_rank.
> > > (execute_reassoc): Add bias_loop_carried_phi_ranks_p
> > > parameter.
> > > (pass_reassoc::pass_reassoc): Add
> > > bias_loop_carried_phi_ranks_p
> > > initializer.
> > > (pass_reassoc::set_param): Set
> > > bias_loop_carried_phi_ranks_p
> > > value.
> > > (pass_reassoc::execute): Pass bias_loop_carried_phi_ranks_p
> > > to
> > > execute_reassoc.
> > > (pass_reassoc::bias_loop_carried_phi_ranks_p): New member.
> > >
> > > gcc/testsuite/ChangeLog:
> > >
> > > 2020-05-06  Ilya Leoshkevich  
> > >
> > > * gcc.target/s390/reassoc-1.c: New test.
> > > * gcc.target/s390/reassoc-2.c: New test.
> > > * gcc.target/s390/reassoc-3.c: New test.
> > > * gcc.target/s390/reassoc.h: New test.
> > > ---
> > >  gcc/passes.def|  4 +-
> > >  gcc/testsuite/gcc.target/s390/reassoc-1.c |  6 ++
> > >  gcc/testsuite/gcc.target/s390/reassoc-2.c |  7 ++
> > >  gcc/testsuite/gcc.target/s390/reassoc-3.c |  8 ++
> > >  gcc/testsuite/gcc.target/s390/reassoc.h   | 22 +
> > >  gcc/tree-ssa-reassoc.c| 97 ++-
> > > 
> > >  6 files changed, 105 insertions(+), 39 deletions(-)
> > >  create mode 100644 gcc/testsuite/gcc.target/s390/reassoc-1.c
> > >  create mode 100644 gcc/testsuite/gcc.target/s390/reassoc-2.c
> > >  create mode 100644 gcc/testsuite/gcc.target/s390/reassoc-3.c
> > >  create mode 100644 gcc/testsuite/gcc.target/s390/reassoc.h
> > >
> > > diff --git a/gcc/passes.def b/gcc/passes.def
> > > index 2b1e09fdda3..6864f583f20 100644
> > > --- a/gcc/passes.def
> > > +++ b/gcc/passes.def
> > > @@ -235,7 +235,7 @@ along with GCC; see the file COPYING3.  If not
> > > see
> > >  program and isolate those paths.  */
> > >NEXT_PASS (pass_isolate_erroneous_paths);
> > >NEXT_PASS (pass_dse);
> > > -  NEXT_PASS (pass_reassoc, true /* insert_powi_p */);
> > > +  NEXT_PASS (pass_reassoc, true /* early_p */);
> > >NEXT_PASS (pass_dce);
> > >NEXT_PASS (pass_forwprop);
> > >NEXT_PASS (pass_phiopt, false /* early_p */);
> > > @@ -312,7 +312,7 @@ along with GCC; see the file COPYING3.  If not
> > > see
> > >NEXT_PASS (pass_lower_vector_ssa);
> > >NEXT_PASS (pass_lower_switch);
> > >NEXT_PASS (pass_cse_reciprocals);
> > > -

Re: [PATCH] reassoc: Propagate PHI_LOOP_BIAS along single uses

2020-06-29 Thread Ilya Leoshkevich via Gcc-patches
On Thu, 2020-06-25 at 14:34 +0200, Richard Biener wrote:
> On Wed, Jun 24, 2020 at 1:31 AM Ilya Leoshkevich via Gcc-patches
>  wrote:
> > Bootstrapped and regtested x86_64-redhat-linux, ppc64le-redhat-
> > linux and
> > s390x-redhat-linux.  I also ran SPEC 2006 and 2017 on these
> > platforms,
> > and the only measurable regression was 3% in 520.omnetpp_r on ppc,
> > which
> > went away after inserting a single nop at the beginning of
> > cDynamicExpression::evaluate.
> > 
> > OK for master?
> 
> As you might know this is incredibly fragile so I'd prefer if you
> submit
> and push the change to disable PHI biasing in the early pass instance
> separately.  At first glance that change looks reasonable (but we'll
> watch for fallout).

Will do.

> 
> Comments on the other changes inline
> 
> > ---
> > 
> > PR tree-optimization/49749 introduced code that shortens dependency
> > chains containing loop accumulators by placing them last on operand
> > lists of associative operations.
> > 
> > 456.hmmer benchmark on s390 could benefit from this, however, the
> > code
> > that needs it modifies loop accumulator before using it, and since
> > only
> > so-called loop-carried phis are are treated as loop accumulators,
> > the
> > code in the present form doesn't really help.   According to Bill
> > Schmidt - the original author - such a conservative approach was
> > chosen
> > so as to avoid unnecessarily swapping operands, which might cause
> > unpredictable effects.  However, giving special treatment to forms
> > of
> > loop accumulators is acceptable.
> > 
> > The definition of loop-carried phi is: it's a single-use phi, which
> > is
> > used in the same innermost loop it's defined in, at least one
> > argument
> > of which is defined in the same innermost loop as the phi itself.
> > Given this, it seems natural to treat single uses of such phis as
> > phis
> > themselves.
> > 
> > gcc/ChangeLog:
> > 
> > 2020-05-06  Ilya Leoshkevich  
> > 
> > * passes.def (pass_reassoc): Rename parameter to early_p.
> > * tree-ssa-reassoc.c
> > (reassoc_bias_loop_carried_phi_ranks_p):
> > New variable.
> > (phi_rank): Don't bias loop-carried phi ranks
> > before vectorization pass.
> > (loop_carried_phi): Remove (superseded by
> > operand_rank::biased_p).
> > (propagate_rank): Propagate bias along single uses.
> > (get_rank): Pass stmt to propagate_rank.
> > (execute_reassoc): Add bias_loop_carried_phi_ranks_p
> > parameter.
> > (pass_reassoc::pass_reassoc): Add
> > bias_loop_carried_phi_ranks_p
> > initializer.
> > (pass_reassoc::set_param): Set
> > bias_loop_carried_phi_ranks_p
> > value.
> > (pass_reassoc::execute): Pass bias_loop_carried_phi_ranks_p
> > to
> > execute_reassoc.
> > (pass_reassoc::bias_loop_carried_phi_ranks_p): New member.
> > 
> > gcc/testsuite/ChangeLog:
> > 
> > 2020-05-06  Ilya Leoshkevich  
> > 
> > * gcc.target/s390/reassoc-1.c: New test.
> > * gcc.target/s390/reassoc-2.c: New test.
> > * gcc.target/s390/reassoc-3.c: New test.
> > * gcc.target/s390/reassoc.h: New test.
> > ---
> >  gcc/passes.def|  4 +-
> >  gcc/testsuite/gcc.target/s390/reassoc-1.c |  6 ++
> >  gcc/testsuite/gcc.target/s390/reassoc-2.c |  7 ++
> >  gcc/testsuite/gcc.target/s390/reassoc-3.c |  8 ++
> >  gcc/testsuite/gcc.target/s390/reassoc.h   | 22 +
> >  gcc/tree-ssa-reassoc.c| 97 ++-
> > 
> >  6 files changed, 105 insertions(+), 39 deletions(-)
> >  create mode 100644 gcc/testsuite/gcc.target/s390/reassoc-1.c
> >  create mode 100644 gcc/testsuite/gcc.target/s390/reassoc-2.c
> >  create mode 100644 gcc/testsuite/gcc.target/s390/reassoc-3.c
> >  create mode 100644 gcc/testsuite/gcc.target/s390/reassoc.h
> > 
> > diff --git a/gcc/passes.def b/gcc/passes.def
> > index 2b1e09fdda3..6864f583f20 100644
> > --- a/gcc/passes.def
> > +++ b/gcc/passes.def
> > @@ -235,7 +235,7 @@ along with GCC; see the file COPYING3.  If not
> > see
> >  program and isolate those paths.  */
> >NEXT_PASS (pass_isolate_erroneous_paths);
> >NEXT_PASS (pass_dse);
> > -  NEXT_PASS (pass_reassoc, true /* insert_powi_p */);
> > +  NEXT_PASS (pass_reassoc, true /* early_p */);
> >NEXT_PASS (pass_dce);
> >NEXT_PASS (pass_forwprop);
> >NEXT_PASS (pass_phiopt, false /* early_p */);
> > @@ -312,7 +312,7 @@ along with GCC; see the file COPYING3.  If not
> > see
> >NEXT_PASS (pass_lower_vector_ssa);
> >NEXT_PASS (pass_lower_switch);
> >NEXT_PASS (pass_cse_reciprocals);
> > -  NEXT_PASS (pass_reassoc, false /* insert_powi_p */);
> > +  NEXT_PASS (pass_reassoc, false /* early_p */);
> >NEXT_PASS (pass_strength_reduction);
> >NEXT_PASS (pass_split_paths);
> >NEXT_PASS (pass_tracer);
> > diff --git 

Re: [PATCH] reassoc: Propagate PHI_LOOP_BIAS along single uses

2020-06-25 Thread Richard Biener via Gcc-patches
On Wed, Jun 24, 2020 at 1:31 AM Ilya Leoshkevich via Gcc-patches
 wrote:
>
> Bootstrapped and regtested x86_64-redhat-linux, ppc64le-redhat-linux and
> s390x-redhat-linux.  I also ran SPEC 2006 and 2017 on these platforms,
> and the only measurable regression was 3% in 520.omnetpp_r on ppc, which
> went away after inserting a single nop at the beginning of
> cDynamicExpression::evaluate.
>
> OK for master?

As you might know this is incredibly fragile so I'd prefer if you submit
and push the change to disable PHI biasing in the early pass instance
separately.  At first glance that change looks reasonable (but we'll
watch for fallout).

Comments on the other changes inline

>
> ---
>
> PR tree-optimization/49749 introduced code that shortens dependency
> chains containing loop accumulators by placing them last on operand
> lists of associative operations.
>
> 456.hmmer benchmark on s390 could benefit from this, however, the code
> that needs it modifies loop accumulator before using it, and since only
> so-called loop-carried phis are are treated as loop accumulators, the
> code in the present form doesn't really help.   According to Bill
> Schmidt - the original author - such a conservative approach was chosen
> so as to avoid unnecessarily swapping operands, which might cause
> unpredictable effects.  However, giving special treatment to forms of
> loop accumulators is acceptable.
>
> The definition of loop-carried phi is: it's a single-use phi, which is
> used in the same innermost loop it's defined in, at least one argument
> of which is defined in the same innermost loop as the phi itself.
> Given this, it seems natural to treat single uses of such phis as phis
> themselves.
>
> gcc/ChangeLog:
>
> 2020-05-06  Ilya Leoshkevich  
>
> * passes.def (pass_reassoc): Rename parameter to early_p.
> * tree-ssa-reassoc.c (reassoc_bias_loop_carried_phi_ranks_p):
> New variable.
> (phi_rank): Don't bias loop-carried phi ranks
> before vectorization pass.
> (loop_carried_phi): Remove (superseded by
> operand_rank::biased_p).
> (propagate_rank): Propagate bias along single uses.
> (get_rank): Pass stmt to propagate_rank.
> (execute_reassoc): Add bias_loop_carried_phi_ranks_p parameter.
> (pass_reassoc::pass_reassoc): Add bias_loop_carried_phi_ranks_p
> initializer.
> (pass_reassoc::set_param): Set bias_loop_carried_phi_ranks_p
> value.
> (pass_reassoc::execute): Pass bias_loop_carried_phi_ranks_p to
> execute_reassoc.
> (pass_reassoc::bias_loop_carried_phi_ranks_p): New member.
>
> gcc/testsuite/ChangeLog:
>
> 2020-05-06  Ilya Leoshkevich  
>
> * gcc.target/s390/reassoc-1.c: New test.
> * gcc.target/s390/reassoc-2.c: New test.
> * gcc.target/s390/reassoc-3.c: New test.
> * gcc.target/s390/reassoc.h: New test.
> ---
>  gcc/passes.def|  4 +-
>  gcc/testsuite/gcc.target/s390/reassoc-1.c |  6 ++
>  gcc/testsuite/gcc.target/s390/reassoc-2.c |  7 ++
>  gcc/testsuite/gcc.target/s390/reassoc-3.c |  8 ++
>  gcc/testsuite/gcc.target/s390/reassoc.h   | 22 +
>  gcc/tree-ssa-reassoc.c| 97 ++-
>  6 files changed, 105 insertions(+), 39 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/s390/reassoc-1.c
>  create mode 100644 gcc/testsuite/gcc.target/s390/reassoc-2.c
>  create mode 100644 gcc/testsuite/gcc.target/s390/reassoc-3.c
>  create mode 100644 gcc/testsuite/gcc.target/s390/reassoc.h
>
> diff --git a/gcc/passes.def b/gcc/passes.def
> index 2b1e09fdda3..6864f583f20 100644
> --- a/gcc/passes.def
> +++ b/gcc/passes.def
> @@ -235,7 +235,7 @@ along with GCC; see the file COPYING3.  If not see
>  program and isolate those paths.  */
>NEXT_PASS (pass_isolate_erroneous_paths);
>NEXT_PASS (pass_dse);
> -  NEXT_PASS (pass_reassoc, true /* insert_powi_p */);
> +  NEXT_PASS (pass_reassoc, true /* early_p */);
>NEXT_PASS (pass_dce);
>NEXT_PASS (pass_forwprop);
>NEXT_PASS (pass_phiopt, false /* early_p */);
> @@ -312,7 +312,7 @@ along with GCC; see the file COPYING3.  If not see
>NEXT_PASS (pass_lower_vector_ssa);
>NEXT_PASS (pass_lower_switch);
>NEXT_PASS (pass_cse_reciprocals);
> -  NEXT_PASS (pass_reassoc, false /* insert_powi_p */);
> +  NEXT_PASS (pass_reassoc, false /* early_p */);
>NEXT_PASS (pass_strength_reduction);
>NEXT_PASS (pass_split_paths);
>NEXT_PASS (pass_tracer);
> diff --git a/gcc/testsuite/gcc.target/s390/reassoc-1.c 
> b/gcc/testsuite/gcc.target/s390/reassoc-1.c
> new file mode 100644
> index 000..8343f1cd4b7
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/s390/reassoc-1.c
> @@ -0,0 +1,6 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2" } */
> +
> +#include "reassoc.h"
> +
> +/* { dg-final { scan-assembler 
> 

[PATCH] reassoc: Propagate PHI_LOOP_BIAS along single uses

2020-06-23 Thread Ilya Leoshkevich via Gcc-patches
Bootstrapped and regtested x86_64-redhat-linux, ppc64le-redhat-linux and
s390x-redhat-linux.  I also ran SPEC 2006 and 2017 on these platforms,
and the only measurable regression was 3% in 520.omnetpp_r on ppc, which
went away after inserting a single nop at the beginning of
cDynamicExpression::evaluate.

OK for master?

---

PR tree-optimization/49749 introduced code that shortens dependency
chains containing loop accumulators by placing them last on operand
lists of associative operations.

456.hmmer benchmark on s390 could benefit from this, however, the code
that needs it modifies loop accumulator before using it, and since only
so-called loop-carried phis are are treated as loop accumulators, the
code in the present form doesn't really help.   According to Bill
Schmidt - the original author - such a conservative approach was chosen
so as to avoid unnecessarily swapping operands, which might cause
unpredictable effects.  However, giving special treatment to forms of
loop accumulators is acceptable.

The definition of loop-carried phi is: it's a single-use phi, which is
used in the same innermost loop it's defined in, at least one argument
of which is defined in the same innermost loop as the phi itself.
Given this, it seems natural to treat single uses of such phis as phis
themselves.

gcc/ChangeLog:

2020-05-06  Ilya Leoshkevich  

* passes.def (pass_reassoc): Rename parameter to early_p.
* tree-ssa-reassoc.c (reassoc_bias_loop_carried_phi_ranks_p):
New variable.
(phi_rank): Don't bias loop-carried phi ranks
before vectorization pass.
(loop_carried_phi): Remove (superseded by
operand_rank::biased_p).
(propagate_rank): Propagate bias along single uses.
(get_rank): Pass stmt to propagate_rank.
(execute_reassoc): Add bias_loop_carried_phi_ranks_p parameter.
(pass_reassoc::pass_reassoc): Add bias_loop_carried_phi_ranks_p
initializer.
(pass_reassoc::set_param): Set bias_loop_carried_phi_ranks_p
value.
(pass_reassoc::execute): Pass bias_loop_carried_phi_ranks_p to
execute_reassoc.
(pass_reassoc::bias_loop_carried_phi_ranks_p): New member.

gcc/testsuite/ChangeLog:

2020-05-06  Ilya Leoshkevich  

* gcc.target/s390/reassoc-1.c: New test.
* gcc.target/s390/reassoc-2.c: New test.
* gcc.target/s390/reassoc-3.c: New test.
* gcc.target/s390/reassoc.h: New test.
---
 gcc/passes.def|  4 +-
 gcc/testsuite/gcc.target/s390/reassoc-1.c |  6 ++
 gcc/testsuite/gcc.target/s390/reassoc-2.c |  7 ++
 gcc/testsuite/gcc.target/s390/reassoc-3.c |  8 ++
 gcc/testsuite/gcc.target/s390/reassoc.h   | 22 +
 gcc/tree-ssa-reassoc.c| 97 ++-
 6 files changed, 105 insertions(+), 39 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/s390/reassoc-1.c
 create mode 100644 gcc/testsuite/gcc.target/s390/reassoc-2.c
 create mode 100644 gcc/testsuite/gcc.target/s390/reassoc-3.c
 create mode 100644 gcc/testsuite/gcc.target/s390/reassoc.h

diff --git a/gcc/passes.def b/gcc/passes.def
index 2b1e09fdda3..6864f583f20 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -235,7 +235,7 @@ along with GCC; see the file COPYING3.  If not see
 program and isolate those paths.  */
   NEXT_PASS (pass_isolate_erroneous_paths);
   NEXT_PASS (pass_dse);
-  NEXT_PASS (pass_reassoc, true /* insert_powi_p */);
+  NEXT_PASS (pass_reassoc, true /* early_p */);
   NEXT_PASS (pass_dce);
   NEXT_PASS (pass_forwprop);
   NEXT_PASS (pass_phiopt, false /* early_p */);
@@ -312,7 +312,7 @@ along with GCC; see the file COPYING3.  If not see
   NEXT_PASS (pass_lower_vector_ssa);
   NEXT_PASS (pass_lower_switch);
   NEXT_PASS (pass_cse_reciprocals);
-  NEXT_PASS (pass_reassoc, false /* insert_powi_p */);
+  NEXT_PASS (pass_reassoc, false /* early_p */);
   NEXT_PASS (pass_strength_reduction);
   NEXT_PASS (pass_split_paths);
   NEXT_PASS (pass_tracer);
diff --git a/gcc/testsuite/gcc.target/s390/reassoc-1.c 
b/gcc/testsuite/gcc.target/s390/reassoc-1.c
new file mode 100644
index 000..8343f1cd4b7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/reassoc-1.c
@@ -0,0 +1,6 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+#include "reassoc.h"
+
+/* { dg-final { scan-assembler 
{(?n)\n\tl\t(%r\d+),.+(\n.*)*\n\ta\t(\1),.+(\n.*)*\n\tar\t(%r\d+),(\1)} } } */
diff --git a/gcc/testsuite/gcc.target/s390/reassoc-2.c 
b/gcc/testsuite/gcc.target/s390/reassoc-2.c
new file mode 100644
index 000..5e393ed4937
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/reassoc-2.c
@@ -0,0 +1,7 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+#define MODIFY
+#include "reassoc.h"
+
+/* { dg-final { scan-assembler 
{(?n)\n\tl\t(%r\d+),.+(\n.*)*\n\ta\t(\1),.+(\n.*)*\n\tar\t(%r\d+),(\1)} } } */
diff --git a/gcc/testsuite/gcc.target/s390/reassoc-3.c