Re: [PATCH] reassoc: Propagate PHI_LOOP_BIAS along single uses
On Mon, Jun 29, 2020 at 4:58 PM Ilya Leoshkevich wrote: > > On Thu, 2020-06-25 at 14:34 +0200, Richard Biener wrote: > > On Wed, Jun 24, 2020 at 1:31 AM Ilya Leoshkevich via Gcc-patches > > wrote: > > > Bootstrapped and regtested x86_64-redhat-linux, ppc64le-redhat- > > > linux and > > > s390x-redhat-linux. I also ran SPEC 2006 and 2017 on these > > > platforms, > > > and the only measurable regression was 3% in 520.omnetpp_r on ppc, > > > which > > > went away after inserting a single nop at the beginning of > > > cDynamicExpression::evaluate. > > > > > > OK for master? > > > > As you might know this is incredibly fragile so I'd prefer if you > > submit > > and push the change to disable PHI biasing in the early pass instance > > separately. At first glance that change looks reasonable (but we'll > > watch for fallout). > > Will do. > > > > > Comments on the other changes inline > > > > > --- > > > > > > PR tree-optimization/49749 introduced code that shortens dependency > > > chains containing loop accumulators by placing them last on operand > > > lists of associative operations. > > > > > > 456.hmmer benchmark on s390 could benefit from this, however, the > > > code > > > that needs it modifies loop accumulator before using it, and since > > > only > > > so-called loop-carried phis are are treated as loop accumulators, > > > the > > > code in the present form doesn't really help. According to Bill > > > Schmidt - the original author - such a conservative approach was > > > chosen > > > so as to avoid unnecessarily swapping operands, which might cause > > > unpredictable effects. However, giving special treatment to forms > > > of > > > loop accumulators is acceptable. > > > > > > The definition of loop-carried phi is: it's a single-use phi, which > > > is > > > used in the same innermost loop it's defined in, at least one > > > argument > > > of which is defined in the same innermost loop as the phi itself. > > > Given this, it seems natural to treat single uses of such phis as > > > phis > > > themselves. > > > > > > gcc/ChangeLog: > > > > > > 2020-05-06 Ilya Leoshkevich > > > > > > * passes.def (pass_reassoc): Rename parameter to early_p. > > > * tree-ssa-reassoc.c > > > (reassoc_bias_loop_carried_phi_ranks_p): > > > New variable. > > > (phi_rank): Don't bias loop-carried phi ranks > > > before vectorization pass. > > > (loop_carried_phi): Remove (superseded by > > > operand_rank::biased_p). > > > (propagate_rank): Propagate bias along single uses. > > > (get_rank): Pass stmt to propagate_rank. > > > (execute_reassoc): Add bias_loop_carried_phi_ranks_p > > > parameter. > > > (pass_reassoc::pass_reassoc): Add > > > bias_loop_carried_phi_ranks_p > > > initializer. > > > (pass_reassoc::set_param): Set > > > bias_loop_carried_phi_ranks_p > > > value. > > > (pass_reassoc::execute): Pass bias_loop_carried_phi_ranks_p > > > to > > > execute_reassoc. > > > (pass_reassoc::bias_loop_carried_phi_ranks_p): New member. > > > > > > gcc/testsuite/ChangeLog: > > > > > > 2020-05-06 Ilya Leoshkevich > > > > > > * gcc.target/s390/reassoc-1.c: New test. > > > * gcc.target/s390/reassoc-2.c: New test. > > > * gcc.target/s390/reassoc-3.c: New test. > > > * gcc.target/s390/reassoc.h: New test. > > > --- > > > gcc/passes.def| 4 +- > > > gcc/testsuite/gcc.target/s390/reassoc-1.c | 6 ++ > > > gcc/testsuite/gcc.target/s390/reassoc-2.c | 7 ++ > > > gcc/testsuite/gcc.target/s390/reassoc-3.c | 8 ++ > > > gcc/testsuite/gcc.target/s390/reassoc.h | 22 + > > > gcc/tree-ssa-reassoc.c| 97 ++- > > > > > > 6 files changed, 105 insertions(+), 39 deletions(-) > > > create mode 100644 gcc/testsuite/gcc.target/s390/reassoc-1.c > > > create mode 100644 gcc/testsuite/gcc.target/s390/reassoc-2.c > > > create mode 100644 gcc/testsuite/gcc.target/s390/reassoc-3.c > > > create mode 100644 gcc/testsuite/gcc.target/s390/reassoc.h > > > > > > diff --git a/gcc/passes.def b/gcc/passes.def > > > index 2b1e09fdda3..6864f583f20 100644 > > > --- a/gcc/passes.def > > > +++ b/gcc/passes.def > > > @@ -235,7 +235,7 @@ along with GCC; see the file COPYING3. If not > > > see > > > program and isolate those paths. */ > > >NEXT_PASS (pass_isolate_erroneous_paths); > > >NEXT_PASS (pass_dse); > > > - NEXT_PASS (pass_reassoc, true /* insert_powi_p */); > > > + NEXT_PASS (pass_reassoc, true /* early_p */); > > >NEXT_PASS (pass_dce); > > >NEXT_PASS (pass_forwprop); > > >NEXT_PASS (pass_phiopt, false /* early_p */); > > > @@ -312,7 +312,7 @@ along with GCC; see the file COPYING3. If not > > > see > > >NEXT_PASS (pass_lower_vector_ssa); > > >NEXT_PASS (pass_lower_switch); > > >NEXT_PASS (pass_cse_reciprocals); > > > -
Re: [PATCH] reassoc: Propagate PHI_LOOP_BIAS along single uses
On Thu, 2020-06-25 at 14:34 +0200, Richard Biener wrote: > On Wed, Jun 24, 2020 at 1:31 AM Ilya Leoshkevich via Gcc-patches > wrote: > > Bootstrapped and regtested x86_64-redhat-linux, ppc64le-redhat- > > linux and > > s390x-redhat-linux. I also ran SPEC 2006 and 2017 on these > > platforms, > > and the only measurable regression was 3% in 520.omnetpp_r on ppc, > > which > > went away after inserting a single nop at the beginning of > > cDynamicExpression::evaluate. > > > > OK for master? > > As you might know this is incredibly fragile so I'd prefer if you > submit > and push the change to disable PHI biasing in the early pass instance > separately. At first glance that change looks reasonable (but we'll > watch for fallout). Will do. > > Comments on the other changes inline > > > --- > > > > PR tree-optimization/49749 introduced code that shortens dependency > > chains containing loop accumulators by placing them last on operand > > lists of associative operations. > > > > 456.hmmer benchmark on s390 could benefit from this, however, the > > code > > that needs it modifies loop accumulator before using it, and since > > only > > so-called loop-carried phis are are treated as loop accumulators, > > the > > code in the present form doesn't really help. According to Bill > > Schmidt - the original author - such a conservative approach was > > chosen > > so as to avoid unnecessarily swapping operands, which might cause > > unpredictable effects. However, giving special treatment to forms > > of > > loop accumulators is acceptable. > > > > The definition of loop-carried phi is: it's a single-use phi, which > > is > > used in the same innermost loop it's defined in, at least one > > argument > > of which is defined in the same innermost loop as the phi itself. > > Given this, it seems natural to treat single uses of such phis as > > phis > > themselves. > > > > gcc/ChangeLog: > > > > 2020-05-06 Ilya Leoshkevich > > > > * passes.def (pass_reassoc): Rename parameter to early_p. > > * tree-ssa-reassoc.c > > (reassoc_bias_loop_carried_phi_ranks_p): > > New variable. > > (phi_rank): Don't bias loop-carried phi ranks > > before vectorization pass. > > (loop_carried_phi): Remove (superseded by > > operand_rank::biased_p). > > (propagate_rank): Propagate bias along single uses. > > (get_rank): Pass stmt to propagate_rank. > > (execute_reassoc): Add bias_loop_carried_phi_ranks_p > > parameter. > > (pass_reassoc::pass_reassoc): Add > > bias_loop_carried_phi_ranks_p > > initializer. > > (pass_reassoc::set_param): Set > > bias_loop_carried_phi_ranks_p > > value. > > (pass_reassoc::execute): Pass bias_loop_carried_phi_ranks_p > > to > > execute_reassoc. > > (pass_reassoc::bias_loop_carried_phi_ranks_p): New member. > > > > gcc/testsuite/ChangeLog: > > > > 2020-05-06 Ilya Leoshkevich > > > > * gcc.target/s390/reassoc-1.c: New test. > > * gcc.target/s390/reassoc-2.c: New test. > > * gcc.target/s390/reassoc-3.c: New test. > > * gcc.target/s390/reassoc.h: New test. > > --- > > gcc/passes.def| 4 +- > > gcc/testsuite/gcc.target/s390/reassoc-1.c | 6 ++ > > gcc/testsuite/gcc.target/s390/reassoc-2.c | 7 ++ > > gcc/testsuite/gcc.target/s390/reassoc-3.c | 8 ++ > > gcc/testsuite/gcc.target/s390/reassoc.h | 22 + > > gcc/tree-ssa-reassoc.c| 97 ++- > > > > 6 files changed, 105 insertions(+), 39 deletions(-) > > create mode 100644 gcc/testsuite/gcc.target/s390/reassoc-1.c > > create mode 100644 gcc/testsuite/gcc.target/s390/reassoc-2.c > > create mode 100644 gcc/testsuite/gcc.target/s390/reassoc-3.c > > create mode 100644 gcc/testsuite/gcc.target/s390/reassoc.h > > > > diff --git a/gcc/passes.def b/gcc/passes.def > > index 2b1e09fdda3..6864f583f20 100644 > > --- a/gcc/passes.def > > +++ b/gcc/passes.def > > @@ -235,7 +235,7 @@ along with GCC; see the file COPYING3. If not > > see > > program and isolate those paths. */ > >NEXT_PASS (pass_isolate_erroneous_paths); > >NEXT_PASS (pass_dse); > > - NEXT_PASS (pass_reassoc, true /* insert_powi_p */); > > + NEXT_PASS (pass_reassoc, true /* early_p */); > >NEXT_PASS (pass_dce); > >NEXT_PASS (pass_forwprop); > >NEXT_PASS (pass_phiopt, false /* early_p */); > > @@ -312,7 +312,7 @@ along with GCC; see the file COPYING3. If not > > see > >NEXT_PASS (pass_lower_vector_ssa); > >NEXT_PASS (pass_lower_switch); > >NEXT_PASS (pass_cse_reciprocals); > > - NEXT_PASS (pass_reassoc, false /* insert_powi_p */); > > + NEXT_PASS (pass_reassoc, false /* early_p */); > >NEXT_PASS (pass_strength_reduction); > >NEXT_PASS (pass_split_paths); > >NEXT_PASS (pass_tracer); > > diff --git
Re: [PATCH] reassoc: Propagate PHI_LOOP_BIAS along single uses
On Wed, Jun 24, 2020 at 1:31 AM Ilya Leoshkevich via Gcc-patches wrote: > > Bootstrapped and regtested x86_64-redhat-linux, ppc64le-redhat-linux and > s390x-redhat-linux. I also ran SPEC 2006 and 2017 on these platforms, > and the only measurable regression was 3% in 520.omnetpp_r on ppc, which > went away after inserting a single nop at the beginning of > cDynamicExpression::evaluate. > > OK for master? As you might know this is incredibly fragile so I'd prefer if you submit and push the change to disable PHI biasing in the early pass instance separately. At first glance that change looks reasonable (but we'll watch for fallout). Comments on the other changes inline > > --- > > PR tree-optimization/49749 introduced code that shortens dependency > chains containing loop accumulators by placing them last on operand > lists of associative operations. > > 456.hmmer benchmark on s390 could benefit from this, however, the code > that needs it modifies loop accumulator before using it, and since only > so-called loop-carried phis are are treated as loop accumulators, the > code in the present form doesn't really help. According to Bill > Schmidt - the original author - such a conservative approach was chosen > so as to avoid unnecessarily swapping operands, which might cause > unpredictable effects. However, giving special treatment to forms of > loop accumulators is acceptable. > > The definition of loop-carried phi is: it's a single-use phi, which is > used in the same innermost loop it's defined in, at least one argument > of which is defined in the same innermost loop as the phi itself. > Given this, it seems natural to treat single uses of such phis as phis > themselves. > > gcc/ChangeLog: > > 2020-05-06 Ilya Leoshkevich > > * passes.def (pass_reassoc): Rename parameter to early_p. > * tree-ssa-reassoc.c (reassoc_bias_loop_carried_phi_ranks_p): > New variable. > (phi_rank): Don't bias loop-carried phi ranks > before vectorization pass. > (loop_carried_phi): Remove (superseded by > operand_rank::biased_p). > (propagate_rank): Propagate bias along single uses. > (get_rank): Pass stmt to propagate_rank. > (execute_reassoc): Add bias_loop_carried_phi_ranks_p parameter. > (pass_reassoc::pass_reassoc): Add bias_loop_carried_phi_ranks_p > initializer. > (pass_reassoc::set_param): Set bias_loop_carried_phi_ranks_p > value. > (pass_reassoc::execute): Pass bias_loop_carried_phi_ranks_p to > execute_reassoc. > (pass_reassoc::bias_loop_carried_phi_ranks_p): New member. > > gcc/testsuite/ChangeLog: > > 2020-05-06 Ilya Leoshkevich > > * gcc.target/s390/reassoc-1.c: New test. > * gcc.target/s390/reassoc-2.c: New test. > * gcc.target/s390/reassoc-3.c: New test. > * gcc.target/s390/reassoc.h: New test. > --- > gcc/passes.def| 4 +- > gcc/testsuite/gcc.target/s390/reassoc-1.c | 6 ++ > gcc/testsuite/gcc.target/s390/reassoc-2.c | 7 ++ > gcc/testsuite/gcc.target/s390/reassoc-3.c | 8 ++ > gcc/testsuite/gcc.target/s390/reassoc.h | 22 + > gcc/tree-ssa-reassoc.c| 97 ++- > 6 files changed, 105 insertions(+), 39 deletions(-) > create mode 100644 gcc/testsuite/gcc.target/s390/reassoc-1.c > create mode 100644 gcc/testsuite/gcc.target/s390/reassoc-2.c > create mode 100644 gcc/testsuite/gcc.target/s390/reassoc-3.c > create mode 100644 gcc/testsuite/gcc.target/s390/reassoc.h > > diff --git a/gcc/passes.def b/gcc/passes.def > index 2b1e09fdda3..6864f583f20 100644 > --- a/gcc/passes.def > +++ b/gcc/passes.def > @@ -235,7 +235,7 @@ along with GCC; see the file COPYING3. If not see > program and isolate those paths. */ >NEXT_PASS (pass_isolate_erroneous_paths); >NEXT_PASS (pass_dse); > - NEXT_PASS (pass_reassoc, true /* insert_powi_p */); > + NEXT_PASS (pass_reassoc, true /* early_p */); >NEXT_PASS (pass_dce); >NEXT_PASS (pass_forwprop); >NEXT_PASS (pass_phiopt, false /* early_p */); > @@ -312,7 +312,7 @@ along with GCC; see the file COPYING3. If not see >NEXT_PASS (pass_lower_vector_ssa); >NEXT_PASS (pass_lower_switch); >NEXT_PASS (pass_cse_reciprocals); > - NEXT_PASS (pass_reassoc, false /* insert_powi_p */); > + NEXT_PASS (pass_reassoc, false /* early_p */); >NEXT_PASS (pass_strength_reduction); >NEXT_PASS (pass_split_paths); >NEXT_PASS (pass_tracer); > diff --git a/gcc/testsuite/gcc.target/s390/reassoc-1.c > b/gcc/testsuite/gcc.target/s390/reassoc-1.c > new file mode 100644 > index 000..8343f1cd4b7 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/s390/reassoc-1.c > @@ -0,0 +1,6 @@ > +/* { dg-do compile } */ > +/* { dg-options "-O2" } */ > + > +#include "reassoc.h" > + > +/* { dg-final { scan-assembler >
[PATCH] reassoc: Propagate PHI_LOOP_BIAS along single uses
Bootstrapped and regtested x86_64-redhat-linux, ppc64le-redhat-linux and s390x-redhat-linux. I also ran SPEC 2006 and 2017 on these platforms, and the only measurable regression was 3% in 520.omnetpp_r on ppc, which went away after inserting a single nop at the beginning of cDynamicExpression::evaluate. OK for master? --- PR tree-optimization/49749 introduced code that shortens dependency chains containing loop accumulators by placing them last on operand lists of associative operations. 456.hmmer benchmark on s390 could benefit from this, however, the code that needs it modifies loop accumulator before using it, and since only so-called loop-carried phis are are treated as loop accumulators, the code in the present form doesn't really help. According to Bill Schmidt - the original author - such a conservative approach was chosen so as to avoid unnecessarily swapping operands, which might cause unpredictable effects. However, giving special treatment to forms of loop accumulators is acceptable. The definition of loop-carried phi is: it's a single-use phi, which is used in the same innermost loop it's defined in, at least one argument of which is defined in the same innermost loop as the phi itself. Given this, it seems natural to treat single uses of such phis as phis themselves. gcc/ChangeLog: 2020-05-06 Ilya Leoshkevich * passes.def (pass_reassoc): Rename parameter to early_p. * tree-ssa-reassoc.c (reassoc_bias_loop_carried_phi_ranks_p): New variable. (phi_rank): Don't bias loop-carried phi ranks before vectorization pass. (loop_carried_phi): Remove (superseded by operand_rank::biased_p). (propagate_rank): Propagate bias along single uses. (get_rank): Pass stmt to propagate_rank. (execute_reassoc): Add bias_loop_carried_phi_ranks_p parameter. (pass_reassoc::pass_reassoc): Add bias_loop_carried_phi_ranks_p initializer. (pass_reassoc::set_param): Set bias_loop_carried_phi_ranks_p value. (pass_reassoc::execute): Pass bias_loop_carried_phi_ranks_p to execute_reassoc. (pass_reassoc::bias_loop_carried_phi_ranks_p): New member. gcc/testsuite/ChangeLog: 2020-05-06 Ilya Leoshkevich * gcc.target/s390/reassoc-1.c: New test. * gcc.target/s390/reassoc-2.c: New test. * gcc.target/s390/reassoc-3.c: New test. * gcc.target/s390/reassoc.h: New test. --- gcc/passes.def| 4 +- gcc/testsuite/gcc.target/s390/reassoc-1.c | 6 ++ gcc/testsuite/gcc.target/s390/reassoc-2.c | 7 ++ gcc/testsuite/gcc.target/s390/reassoc-3.c | 8 ++ gcc/testsuite/gcc.target/s390/reassoc.h | 22 + gcc/tree-ssa-reassoc.c| 97 ++- 6 files changed, 105 insertions(+), 39 deletions(-) create mode 100644 gcc/testsuite/gcc.target/s390/reassoc-1.c create mode 100644 gcc/testsuite/gcc.target/s390/reassoc-2.c create mode 100644 gcc/testsuite/gcc.target/s390/reassoc-3.c create mode 100644 gcc/testsuite/gcc.target/s390/reassoc.h diff --git a/gcc/passes.def b/gcc/passes.def index 2b1e09fdda3..6864f583f20 100644 --- a/gcc/passes.def +++ b/gcc/passes.def @@ -235,7 +235,7 @@ along with GCC; see the file COPYING3. If not see program and isolate those paths. */ NEXT_PASS (pass_isolate_erroneous_paths); NEXT_PASS (pass_dse); - NEXT_PASS (pass_reassoc, true /* insert_powi_p */); + NEXT_PASS (pass_reassoc, true /* early_p */); NEXT_PASS (pass_dce); NEXT_PASS (pass_forwprop); NEXT_PASS (pass_phiopt, false /* early_p */); @@ -312,7 +312,7 @@ along with GCC; see the file COPYING3. If not see NEXT_PASS (pass_lower_vector_ssa); NEXT_PASS (pass_lower_switch); NEXT_PASS (pass_cse_reciprocals); - NEXT_PASS (pass_reassoc, false /* insert_powi_p */); + NEXT_PASS (pass_reassoc, false /* early_p */); NEXT_PASS (pass_strength_reduction); NEXT_PASS (pass_split_paths); NEXT_PASS (pass_tracer); diff --git a/gcc/testsuite/gcc.target/s390/reassoc-1.c b/gcc/testsuite/gcc.target/s390/reassoc-1.c new file mode 100644 index 000..8343f1cd4b7 --- /dev/null +++ b/gcc/testsuite/gcc.target/s390/reassoc-1.c @@ -0,0 +1,6 @@ +/* { dg-do compile } */ +/* { dg-options "-O2" } */ + +#include "reassoc.h" + +/* { dg-final { scan-assembler {(?n)\n\tl\t(%r\d+),.+(\n.*)*\n\ta\t(\1),.+(\n.*)*\n\tar\t(%r\d+),(\1)} } } */ diff --git a/gcc/testsuite/gcc.target/s390/reassoc-2.c b/gcc/testsuite/gcc.target/s390/reassoc-2.c new file mode 100644 index 000..5e393ed4937 --- /dev/null +++ b/gcc/testsuite/gcc.target/s390/reassoc-2.c @@ -0,0 +1,7 @@ +/* { dg-do compile } */ +/* { dg-options "-O2" } */ + +#define MODIFY +#include "reassoc.h" + +/* { dg-final { scan-assembler {(?n)\n\tl\t(%r\d+),.+(\n.*)*\n\ta\t(\1),.+(\n.*)*\n\tar\t(%r\d+),(\1)} } } */ diff --git a/gcc/testsuite/gcc.target/s390/reassoc-3.c