On Thu, Oct 26, 2023 at 2:29 AM Richard Biener
<richard.guent...@gmail.com> wrote:
>
> On Wed, Oct 25, 2023 at 5:51 AM Andrew Pinski <pins...@gmail.com> wrote:
> >
> > I noticed we were missing optimizing `a / (1 << b)` when
> > we know that a is nonnegative but only due to ranger information.
> > This adds the use of the global ranger to tree_single_nonnegative_warnv_p
> > for SSA_NAME.
> > I didn't extend tree_single_nonnegative_warnv_p to use the ranger for 
> > floating
> > point nor to use the local ranger since I am not 100% sure it is safe where
> > all of the uses tree_expr_nonnegative_p would be safe.
> >
> > Note pr80776-1.c testcase fails again due to vrp's bad handling of setting
> > global ranges from __builtin_unreachable. It just happened to be optimized
> > before due to global ranges not being used as much.
> >
> > Bootstrapped and tested on x86_64-linux-gnu with no regressions.
> >
> >         PR tree-optimization/111959
> >
> > gcc/ChangeLog:
> >
> >         * fold-const.cc (tree_single_nonnegative_warnv_p): Use
> >         the global range to see if the SSA_NAME was nonnegative.
> >
> > gcc/testsuite/ChangeLog:
> >
> >         * gcc.dg/tree-ssa/forwprop-42.c: New test.
> >         * gcc.dg/pr80776-1.c: xfail and update comment.
> > ---
> >  gcc/fold-const.cc                           | 36 +++++++++++++++------
> >  gcc/testsuite/gcc.dg/pr80776-1.c            |  8 ++---
> >  gcc/testsuite/gcc.dg/tree-ssa/forwprop-42.c | 15 +++++++++
> >  3 files changed, 46 insertions(+), 13 deletions(-)
> >  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/forwprop-42.c
> >
> > diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc
> > index 40767736389..2a2a90230f5 100644
> > --- a/gcc/fold-const.cc
> > +++ b/gcc/fold-const.cc
> > @@ -15047,15 +15047,33 @@ tree_single_nonnegative_warnv_p (tree t, bool 
> > *strict_overflow_p, int depth)
> >        return RECURSE (TREE_OPERAND (t, 1)) && RECURSE (TREE_OPERAND (t, 
> > 2));
> >
> >      case SSA_NAME:
> > -      /* Limit the depth of recursion to avoid quadratic behavior.
> > -        This is expected to catch almost all occurrences in practice.
> > -        If this code misses important cases that unbounded recursion
> > -        would not, passes that need this information could be revised
> > -        to provide it through dataflow propagation.  */
> > -      return (!name_registered_for_update_p (t)
> > -             && depth < param_max_ssa_name_query_depth
> > -             && gimple_stmt_nonnegative_warnv_p (SSA_NAME_DEF_STMT (t),
> > -                                                 strict_overflow_p, 
> > depth));
> > +      {
> > +       /* For integral types, querry the global range if possible. */
>
> query
>
> > +       if (INTEGRAL_TYPE_P (TREE_TYPE (t)))
> > +         {
> > +           value_range vr;
> > +           if (get_global_range_query ()->range_of_expr (vr, t)
> > +               && !vr.varying_p () && !vr.undefined_p ())
> > +             {
> > +               /* If the range is nonnegative, return true. */
> > +               if (vr.nonnegative_p ())
> > +                 return true;
> > +
> > +               /* If the range is non-positive, then return false. */
> > +               if (vr.nonpositive_p ())
> > +                 return false;
>
> That's testing for <= 0, nonnegative for >= 0.  This means when
> vr.nonpositive_p () the value could still be zero (and nonnegative),
> possibly be figured out by the recursion below.
>
> Since we don't have negative_p () do we want to test
> nonpositive_p () && nonzero_p () instead?

I was thinking about that when I was writing the patch.
If the ranger figured out the value was zero, nonnegative_p would have
returned true.
So while yes nonpositive_p() would return true but we already checked
nonnegative_p beforehand and the nonzero_p would not matter.
Now the question is if after nonnegative_p we check if the range could
contain 0 still is that worth the recursion. The whole idea of
returning false was to remove the need from recursion as much.

Thanks,
Andrew


>
> OK with that change.
>
> Richard.
>
> > +             }
> > +         }
> > +       /* Limit the depth of recursion to avoid quadratic behavior.
> > +          This is expected to catch almost all occurrences in practice.
> > +          If this code misses important cases that unbounded recursion
> > +          would not, passes that need this information could be revised
> > +          to provide it through dataflow propagation.  */
> > +       return (!name_registered_for_update_p (t)
> > +               && depth < param_max_ssa_name_query_depth
> > +               && gimple_stmt_nonnegative_warnv_p (SSA_NAME_DEF_STMT (t),
> > +                                                   strict_overflow_p, 
> > depth));
> > +      }
> >
> >      default:
> >        return tree_simple_nonnegative_warnv_p (TREE_CODE (t), TREE_TYPE 
> > (t));
> > diff --git a/gcc/testsuite/gcc.dg/pr80776-1.c 
> > b/gcc/testsuite/gcc.dg/pr80776-1.c
> > index b9bce62d982..f3d47aeda36 100644
> > --- a/gcc/testsuite/gcc.dg/pr80776-1.c
> > +++ b/gcc/testsuite/gcc.dg/pr80776-1.c
> > @@ -18,14 +18,14 @@ Foo (void)
> >    if (! (0 <= i && i <= 999999))
> >      __builtin_unreachable ();
> >
> > -  /* Legacy evrp sets the range of i to [0, MAX] *before* the first 
> > conditional,
> > +  /* vrp1 sets the range of i to [0, MAX] *before* the first conditional,
> >       and to [0,999999] *before* the second conditional.  This is because 
> > both
> > -     evrp and VRP use trickery to set global ranges when this particular 
> > use of
> > +     vrp use trickery to set global ranges when this particular use of
> >       a __builtin_unreachable is in play (see uses of
> >       assert_unreachable_fallthru_edge_p).
> >
> > -     Setting these ranges at the definition site, causes VRP to remove the
> > +     Setting these ranges at the definition site, causes other passes to 
> > remove the
> >       unreachable code altogether, leaving the following sprintf unguarded. 
> >  This
> >       causes the bogus warning below.  */
> > -  sprintf (number, "%d", i); /* { dg-bogus "writing" "" } */
> > +  sprintf (number, "%d", i); /* { dg-bogus "writing" "" { xfail *-*-* } } 
> > */
> >  }
> > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/forwprop-42.c 
> > b/gcc/testsuite/gcc.dg/tree-ssa/forwprop-42.c
> > new file mode 100644
> > index 00000000000..4e5421ed4d4
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/tree-ssa/forwprop-42.c
> > @@ -0,0 +1,15 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-O2 -fdump-tree-optimized" } */
> > +/* PR tree-optimization/111959 */
> > +
> > +int divbypow2(int a, int b)
> > +{
> > +  if (a & ~0xff) __builtin_unreachable();
> > +  return a / (1<<b);
> > +}
> > +
> > +/* divbypow2 should be able to optimize to just a/b as a is known to be 
> > always positive. */
> > +/* { dg-final { scan-tree-dump-not " / " "optimized" } } */
> > +/* { dg-final { scan-tree-dump-not " << " "optimized" } } */
> > +/* { dg-final { scan-tree-dump-times " >> " 1 "optimized" } } */
> > +
> > --
> > 2.39.3
> >

Reply via email to