Re: [PATCH] Improve tree_expr_nonnegative_p by using the ranger [PR111959]

2023-10-27 Thread Richard Biener
On Thu, Oct 26, 2023 at 8:30 PM Andrew Pinski  wrote:
>
> On Thu, Oct 26, 2023 at 2:29 AM Richard Biener
>  wrote:
> >
> > On Wed, Oct 25, 2023 at 5:51 AM Andrew Pinski  wrote:
> > >
> > > I noticed we were missing optimizing `a / (1 << b)` when
> > > we know that a is nonnegative but only due to ranger information.
> > > This adds the use of the global ranger to tree_single_nonnegative_warnv_p
> > > for SSA_NAME.
> > > I didn't extend tree_single_nonnegative_warnv_p to use the ranger for 
> > > floating
> > > point nor to use the local ranger since I am not 100% sure it is safe 
> > > where
> > > all of the uses tree_expr_nonnegative_p would be safe.
> > >
> > > Note pr80776-1.c testcase fails again due to vrp's bad handling of setting
> > > global ranges from __builtin_unreachable. It just happened to be optimized
> > > before due to global ranges not being used as much.
> > >
> > > Bootstrapped and tested on x86_64-linux-gnu with no regressions.
> > >
> > > PR tree-optimization/111959
> > >
> > > gcc/ChangeLog:
> > >
> > > * fold-const.cc (tree_single_nonnegative_warnv_p): Use
> > > the global range to see if the SSA_NAME was nonnegative.
> > >
> > > gcc/testsuite/ChangeLog:
> > >
> > > * gcc.dg/tree-ssa/forwprop-42.c: New test.
> > > * gcc.dg/pr80776-1.c: xfail and update comment.
> > > ---
> > >  gcc/fold-const.cc   | 36 +++--
> > >  gcc/testsuite/gcc.dg/pr80776-1.c|  8 ++---
> > >  gcc/testsuite/gcc.dg/tree-ssa/forwprop-42.c | 15 +
> > >  3 files changed, 46 insertions(+), 13 deletions(-)
> > >  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/forwprop-42.c
> > >
> > > diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc
> > > index 40767736389..2a2a90230f5 100644
> > > --- a/gcc/fold-const.cc
> > > +++ b/gcc/fold-const.cc
> > > @@ -15047,15 +15047,33 @@ tree_single_nonnegative_warnv_p (tree t, bool 
> > > *strict_overflow_p, int depth)
> > >return RECURSE (TREE_OPERAND (t, 1)) && RECURSE (TREE_OPERAND (t, 
> > > 2));
> > >
> > >  case SSA_NAME:
> > > -  /* Limit the depth of recursion to avoid quadratic behavior.
> > > -This is expected to catch almost all occurrences in practice.
> > > -If this code misses important cases that unbounded recursion
> > > -would not, passes that need this information could be revised
> > > -to provide it through dataflow propagation.  */
> > > -  return (!name_registered_for_update_p (t)
> > > - && depth < param_max_ssa_name_query_depth
> > > - && gimple_stmt_nonnegative_warnv_p (SSA_NAME_DEF_STMT (t),
> > > - strict_overflow_p, 
> > > depth));
> > > +  {
> > > +   /* For integral types, querry the global range if possible. */
> >
> > query
> >
> > > +   if (INTEGRAL_TYPE_P (TREE_TYPE (t)))
> > > + {
> > > +   value_range vr;
> > > +   if (get_global_range_query ()->range_of_expr (vr, t)
> > > +   && !vr.varying_p () && !vr.undefined_p ())
> > > + {
> > > +   /* If the range is nonnegative, return true. */
> > > +   if (vr.nonnegative_p ())
> > > + return true;
> > > +
> > > +   /* If the range is non-positive, then return false. */
> > > +   if (vr.nonpositive_p ())
> > > + return false;
> >
> > That's testing for <= 0, nonnegative for >= 0.  This means when
> > vr.nonpositive_p () the value could still be zero (and nonnegative),
> > possibly be figured out by the recursion below.
> >
> > Since we don't have negative_p () do we want to test
> > nonpositive_p () && nonzero_p () instead?
>
> I was thinking about that when I was writing the patch.
> If the ranger figured out the value was zero, nonnegative_p would have
> returned true.
> So while yes nonpositive_p() would return true but we already checked
> nonnegative_p beforehand and the nonzero_p would not matter.
> Now the question is if after nonnegative_p we check if the range could
> contain 0 still is that worth the recursion.

Yes, that was the point.

> The whole idea of
> returning false was to remove the need from recursion as much.

Well, specifically the exact zero case is something the code is reasonably
good at.  If we want to remove recursion as much as possible we'd
remove it completely.

Maybe you can do some statistics how many hits the recursion yields
after the vr.nonnegative_p () out (but without the nonpositive_p one)?

Richard.

> Thanks,
> Andrew
>
>
> >
> > OK with that change.
> >
> > Richard.
> >
> > > + }
> > > + }
> > > +   /* Limit the depth of recursion to avoid quadratic behavior.
> > > +  This is expected to catch almost all occurrences in practice.
> > > +  If this code misses important cases that unbounded recursion
> > > +  would not, passes that need this information could be 

Re: [PATCH] Improve tree_expr_nonnegative_p by using the ranger [PR111959]

2023-10-26 Thread Andrew Pinski
On Thu, Oct 26, 2023 at 2:29 AM Richard Biener
 wrote:
>
> On Wed, Oct 25, 2023 at 5:51 AM Andrew Pinski  wrote:
> >
> > I noticed we were missing optimizing `a / (1 << b)` when
> > we know that a is nonnegative but only due to ranger information.
> > This adds the use of the global ranger to tree_single_nonnegative_warnv_p
> > for SSA_NAME.
> > I didn't extend tree_single_nonnegative_warnv_p to use the ranger for 
> > floating
> > point nor to use the local ranger since I am not 100% sure it is safe where
> > all of the uses tree_expr_nonnegative_p would be safe.
> >
> > Note pr80776-1.c testcase fails again due to vrp's bad handling of setting
> > global ranges from __builtin_unreachable. It just happened to be optimized
> > before due to global ranges not being used as much.
> >
> > Bootstrapped and tested on x86_64-linux-gnu with no regressions.
> >
> > PR tree-optimization/111959
> >
> > gcc/ChangeLog:
> >
> > * fold-const.cc (tree_single_nonnegative_warnv_p): Use
> > the global range to see if the SSA_NAME was nonnegative.
> >
> > gcc/testsuite/ChangeLog:
> >
> > * gcc.dg/tree-ssa/forwprop-42.c: New test.
> > * gcc.dg/pr80776-1.c: xfail and update comment.
> > ---
> >  gcc/fold-const.cc   | 36 +++--
> >  gcc/testsuite/gcc.dg/pr80776-1.c|  8 ++---
> >  gcc/testsuite/gcc.dg/tree-ssa/forwprop-42.c | 15 +
> >  3 files changed, 46 insertions(+), 13 deletions(-)
> >  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/forwprop-42.c
> >
> > diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc
> > index 40767736389..2a2a90230f5 100644
> > --- a/gcc/fold-const.cc
> > +++ b/gcc/fold-const.cc
> > @@ -15047,15 +15047,33 @@ tree_single_nonnegative_warnv_p (tree t, bool 
> > *strict_overflow_p, int depth)
> >return RECURSE (TREE_OPERAND (t, 1)) && RECURSE (TREE_OPERAND (t, 
> > 2));
> >
> >  case SSA_NAME:
> > -  /* Limit the depth of recursion to avoid quadratic behavior.
> > -This is expected to catch almost all occurrences in practice.
> > -If this code misses important cases that unbounded recursion
> > -would not, passes that need this information could be revised
> > -to provide it through dataflow propagation.  */
> > -  return (!name_registered_for_update_p (t)
> > - && depth < param_max_ssa_name_query_depth
> > - && gimple_stmt_nonnegative_warnv_p (SSA_NAME_DEF_STMT (t),
> > - strict_overflow_p, 
> > depth));
> > +  {
> > +   /* For integral types, querry the global range if possible. */
>
> query
>
> > +   if (INTEGRAL_TYPE_P (TREE_TYPE (t)))
> > + {
> > +   value_range vr;
> > +   if (get_global_range_query ()->range_of_expr (vr, t)
> > +   && !vr.varying_p () && !vr.undefined_p ())
> > + {
> > +   /* If the range is nonnegative, return true. */
> > +   if (vr.nonnegative_p ())
> > + return true;
> > +
> > +   /* If the range is non-positive, then return false. */
> > +   if (vr.nonpositive_p ())
> > + return false;
>
> That's testing for <= 0, nonnegative for >= 0.  This means when
> vr.nonpositive_p () the value could still be zero (and nonnegative),
> possibly be figured out by the recursion below.
>
> Since we don't have negative_p () do we want to test
> nonpositive_p () && nonzero_p () instead?

I was thinking about that when I was writing the patch.
If the ranger figured out the value was zero, nonnegative_p would have
returned true.
So while yes nonpositive_p() would return true but we already checked
nonnegative_p beforehand and the nonzero_p would not matter.
Now the question is if after nonnegative_p we check if the range could
contain 0 still is that worth the recursion. The whole idea of
returning false was to remove the need from recursion as much.

Thanks,
Andrew


>
> OK with that change.
>
> Richard.
>
> > + }
> > + }
> > +   /* Limit the depth of recursion to avoid quadratic behavior.
> > +  This is expected to catch almost all occurrences in practice.
> > +  If this code misses important cases that unbounded recursion
> > +  would not, passes that need this information could be revised
> > +  to provide it through dataflow propagation.  */
> > +   return (!name_registered_for_update_p (t)
> > +   && depth < param_max_ssa_name_query_depth
> > +   && gimple_stmt_nonnegative_warnv_p (SSA_NAME_DEF_STMT (t),
> > +   strict_overflow_p, 
> > depth));
> > +  }
> >
> >  default:
> >return tree_simple_nonnegative_warnv_p (TREE_CODE (t), TREE_TYPE 
> > (t));
> > diff --git a/gcc/testsuite/gcc.dg/pr80776-1.c 
> > b/gcc/testsuite/gcc.dg/pr80776-1.c
> > index b9bce62d982..f3d47aeda36 100644
> > --- 

Re: [PATCH] Improve tree_expr_nonnegative_p by using the ranger [PR111959]

2023-10-26 Thread Mikael Morin

Le 26/10/2023 à 11:29, Richard Biener a écrit :

On Wed, Oct 25, 2023 at 5:51 AM Andrew Pinski  wrote:

diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc
index 40767736389..2a2a90230f5 100644
--- a/gcc/fold-const.cc
+++ b/gcc/fold-const.cc
@@ -15047,15 +15047,33 @@ tree_single_nonnegative_warnv_p (tree t, bool 
*strict_overflow_p, int depth)
return RECURSE (TREE_OPERAND (t, 1)) && RECURSE (TREE_OPERAND (t, 2));

  case SSA_NAME:
-  /* Limit the depth of recursion to avoid quadratic behavior.
-This is expected to catch almost all occurrences in practice.
-If this code misses important cases that unbounded recursion
-would not, passes that need this information could be revised
-to provide it through dataflow propagation.  */
-  return (!name_registered_for_update_p (t)
- && depth < param_max_ssa_name_query_depth
- && gimple_stmt_nonnegative_warnv_p (SSA_NAME_DEF_STMT (t),
- strict_overflow_p, depth));
+  {
+   /* For integral types, querry the global range if possible. */


query


+   if (INTEGRAL_TYPE_P (TREE_TYPE (t)))
+ {
+   value_range vr;
+   if (get_global_range_query ()->range_of_expr (vr, t)
+   && !vr.varying_p () && !vr.undefined_p ())
+ {
+   /* If the range is nonnegative, return true. */
+   if (vr.nonnegative_p ())
+ return true;
+
+   /* If the range is non-positive, then return false. */
+   if (vr.nonpositive_p ())
+ return false;


That's testing for <= 0, nonnegative for >= 0.  This means when
vr.nonpositive_p () the value could still be zero (and nonnegative),
possibly be figured out by the recursion below.

Since we don't have negative_p () do we want to test
nonpositive_p () && nonzero_p () instead?



Maybe !contains_zero_p () instead of nonzero_p () ?

nonzero_p seems to check that the range is exactly the "all but zero" 
range as visible in the implementation:


  inline bool
  irange::nonzero_p () const
  {
if (undefined_p ())
  return false;

wide_int zero = wi::zero (TYPE_PRECISION (type ()));
return *this == int_range<2> (type (), zero, zero, VR_ANTI_RANGE);
  }



Re: [PATCH] Improve tree_expr_nonnegative_p by using the ranger [PR111959]

2023-10-26 Thread Richard Biener
On Wed, Oct 25, 2023 at 5:51 AM Andrew Pinski  wrote:
>
> I noticed we were missing optimizing `a / (1 << b)` when
> we know that a is nonnegative but only due to ranger information.
> This adds the use of the global ranger to tree_single_nonnegative_warnv_p
> for SSA_NAME.
> I didn't extend tree_single_nonnegative_warnv_p to use the ranger for floating
> point nor to use the local ranger since I am not 100% sure it is safe where
> all of the uses tree_expr_nonnegative_p would be safe.
>
> Note pr80776-1.c testcase fails again due to vrp's bad handling of setting
> global ranges from __builtin_unreachable. It just happened to be optimized
> before due to global ranges not being used as much.
>
> Bootstrapped and tested on x86_64-linux-gnu with no regressions.
>
> PR tree-optimization/111959
>
> gcc/ChangeLog:
>
> * fold-const.cc (tree_single_nonnegative_warnv_p): Use
> the global range to see if the SSA_NAME was nonnegative.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.dg/tree-ssa/forwprop-42.c: New test.
> * gcc.dg/pr80776-1.c: xfail and update comment.
> ---
>  gcc/fold-const.cc   | 36 +++--
>  gcc/testsuite/gcc.dg/pr80776-1.c|  8 ++---
>  gcc/testsuite/gcc.dg/tree-ssa/forwprop-42.c | 15 +
>  3 files changed, 46 insertions(+), 13 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/forwprop-42.c
>
> diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc
> index 40767736389..2a2a90230f5 100644
> --- a/gcc/fold-const.cc
> +++ b/gcc/fold-const.cc
> @@ -15047,15 +15047,33 @@ tree_single_nonnegative_warnv_p (tree t, bool 
> *strict_overflow_p, int depth)
>return RECURSE (TREE_OPERAND (t, 1)) && RECURSE (TREE_OPERAND (t, 2));
>
>  case SSA_NAME:
> -  /* Limit the depth of recursion to avoid quadratic behavior.
> -This is expected to catch almost all occurrences in practice.
> -If this code misses important cases that unbounded recursion
> -would not, passes that need this information could be revised
> -to provide it through dataflow propagation.  */
> -  return (!name_registered_for_update_p (t)
> - && depth < param_max_ssa_name_query_depth
> - && gimple_stmt_nonnegative_warnv_p (SSA_NAME_DEF_STMT (t),
> - strict_overflow_p, depth));
> +  {
> +   /* For integral types, querry the global range if possible. */

query

> +   if (INTEGRAL_TYPE_P (TREE_TYPE (t)))
> + {
> +   value_range vr;
> +   if (get_global_range_query ()->range_of_expr (vr, t)
> +   && !vr.varying_p () && !vr.undefined_p ())
> + {
> +   /* If the range is nonnegative, return true. */
> +   if (vr.nonnegative_p ())
> + return true;
> +
> +   /* If the range is non-positive, then return false. */
> +   if (vr.nonpositive_p ())
> + return false;

That's testing for <= 0, nonnegative for >= 0.  This means when
vr.nonpositive_p () the value could still be zero (and nonnegative),
possibly be figured out by the recursion below.

Since we don't have negative_p () do we want to test
nonpositive_p () && nonzero_p () instead?

OK with that change.

Richard.

> + }
> + }
> +   /* Limit the depth of recursion to avoid quadratic behavior.
> +  This is expected to catch almost all occurrences in practice.
> +  If this code misses important cases that unbounded recursion
> +  would not, passes that need this information could be revised
> +  to provide it through dataflow propagation.  */
> +   return (!name_registered_for_update_p (t)
> +   && depth < param_max_ssa_name_query_depth
> +   && gimple_stmt_nonnegative_warnv_p (SSA_NAME_DEF_STMT (t),
> +   strict_overflow_p, 
> depth));
> +  }
>
>  default:
>return tree_simple_nonnegative_warnv_p (TREE_CODE (t), TREE_TYPE (t));
> diff --git a/gcc/testsuite/gcc.dg/pr80776-1.c 
> b/gcc/testsuite/gcc.dg/pr80776-1.c
> index b9bce62d982..f3d47aeda36 100644
> --- a/gcc/testsuite/gcc.dg/pr80776-1.c
> +++ b/gcc/testsuite/gcc.dg/pr80776-1.c
> @@ -18,14 +18,14 @@ Foo (void)
>if (! (0 <= i && i <= 99))
>  __builtin_unreachable ();
>
> -  /* Legacy evrp sets the range of i to [0, MAX] *before* the first 
> conditional,
> +  /* vrp1 sets the range of i to [0, MAX] *before* the first conditional,
>   and to [0,99] *before* the second conditional.  This is because both
> - evrp and VRP use trickery to set global ranges when this particular use 
> of
> + vrp use trickery to set global ranges when this particular use of
>   a __builtin_unreachable is in play (see uses of
>   assert_unreachable_fallthru_edge_p).
>
> - Setting these ranges at the definition site, causes VRP to remove 

[PATCH] Improve tree_expr_nonnegative_p by using the ranger [PR111959]

2023-10-24 Thread Andrew Pinski
I noticed we were missing optimizing `a / (1 << b)` when
we know that a is nonnegative but only due to ranger information.
This adds the use of the global ranger to tree_single_nonnegative_warnv_p
for SSA_NAME.
I didn't extend tree_single_nonnegative_warnv_p to use the ranger for floating
point nor to use the local ranger since I am not 100% sure it is safe where
all of the uses tree_expr_nonnegative_p would be safe.

Note pr80776-1.c testcase fails again due to vrp's bad handling of setting
global ranges from __builtin_unreachable. It just happened to be optimized
before due to global ranges not being used as much.

Bootstrapped and tested on x86_64-linux-gnu with no regressions.

PR tree-optimization/111959

gcc/ChangeLog:

* fold-const.cc (tree_single_nonnegative_warnv_p): Use
the global range to see if the SSA_NAME was nonnegative.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/forwprop-42.c: New test.
* gcc.dg/pr80776-1.c: xfail and update comment.
---
 gcc/fold-const.cc   | 36 +++--
 gcc/testsuite/gcc.dg/pr80776-1.c|  8 ++---
 gcc/testsuite/gcc.dg/tree-ssa/forwprop-42.c | 15 +
 3 files changed, 46 insertions(+), 13 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/forwprop-42.c

diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc
index 40767736389..2a2a90230f5 100644
--- a/gcc/fold-const.cc
+++ b/gcc/fold-const.cc
@@ -15047,15 +15047,33 @@ tree_single_nonnegative_warnv_p (tree t, bool 
*strict_overflow_p, int depth)
   return RECURSE (TREE_OPERAND (t, 1)) && RECURSE (TREE_OPERAND (t, 2));
 
 case SSA_NAME:
-  /* Limit the depth of recursion to avoid quadratic behavior.
-This is expected to catch almost all occurrences in practice.
-If this code misses important cases that unbounded recursion
-would not, passes that need this information could be revised
-to provide it through dataflow propagation.  */
-  return (!name_registered_for_update_p (t)
- && depth < param_max_ssa_name_query_depth
- && gimple_stmt_nonnegative_warnv_p (SSA_NAME_DEF_STMT (t),
- strict_overflow_p, depth));
+  {
+   /* For integral types, querry the global range if possible. */
+   if (INTEGRAL_TYPE_P (TREE_TYPE (t)))
+ {
+   value_range vr;
+   if (get_global_range_query ()->range_of_expr (vr, t)
+   && !vr.varying_p () && !vr.undefined_p ())
+ {
+   /* If the range is nonnegative, return true. */
+   if (vr.nonnegative_p ())
+ return true;
+
+   /* If the range is non-positive, then return false. */
+   if (vr.nonpositive_p ())
+ return false;
+ }
+ }
+   /* Limit the depth of recursion to avoid quadratic behavior.
+  This is expected to catch almost all occurrences in practice.
+  If this code misses important cases that unbounded recursion
+  would not, passes that need this information could be revised
+  to provide it through dataflow propagation.  */
+   return (!name_registered_for_update_p (t)
+   && depth < param_max_ssa_name_query_depth
+   && gimple_stmt_nonnegative_warnv_p (SSA_NAME_DEF_STMT (t),
+   strict_overflow_p, depth));
+  }
 
 default:
   return tree_simple_nonnegative_warnv_p (TREE_CODE (t), TREE_TYPE (t));
diff --git a/gcc/testsuite/gcc.dg/pr80776-1.c b/gcc/testsuite/gcc.dg/pr80776-1.c
index b9bce62d982..f3d47aeda36 100644
--- a/gcc/testsuite/gcc.dg/pr80776-1.c
+++ b/gcc/testsuite/gcc.dg/pr80776-1.c
@@ -18,14 +18,14 @@ Foo (void)
   if (! (0 <= i && i <= 99))
 __builtin_unreachable ();
 
-  /* Legacy evrp sets the range of i to [0, MAX] *before* the first 
conditional,
+  /* vrp1 sets the range of i to [0, MAX] *before* the first conditional,
  and to [0,99] *before* the second conditional.  This is because both
- evrp and VRP use trickery to set global ranges when this particular use of
+ vrp use trickery to set global ranges when this particular use of
  a __builtin_unreachable is in play (see uses of
  assert_unreachable_fallthru_edge_p).
 
- Setting these ranges at the definition site, causes VRP to remove the
+ Setting these ranges at the definition site, causes other passes to 
remove the
  unreachable code altogether, leaving the following sprintf unguarded.  
This
  causes the bogus warning below.  */
-  sprintf (number, "%d", i); /* { dg-bogus "writing" "" } */
+  sprintf (number, "%d", i); /* { dg-bogus "writing" "" { xfail *-*-* } } */
 }
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/forwprop-42.c 
b/gcc/testsuite/gcc.dg/tree-ssa/forwprop-42.c
new file mode 100644
index 000..4e5421ed4d4
--- /dev/null
+++