[Bug tree-optimization/110875] [14 Regression] Dead Code Elimination Regression since r14-2501-g285c9d042e9

2023-09-07 Thread amacleod at redhat dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110875

Andrew Macleod  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|NEW                         |RESOLVED

--- Comment #4 from Andrew Macleod  ---
When range_of_stmt invokes prefill_name to evaluate unvisited dependencies it
should not mark visited names as always_current.

When ranger_cache::get_global_range () is invoked with the optional "current_p"
flag, it triggers additional functionality.  This call is meant to be made from
within ranger, and it is understood that if the cached value is not current,
set_global_range will always be called later with a value.  Thus it sets the
always_current flag in the temporal cache to avoid computation cycles.

The prefill_stmt_dependencies () mechanism within ranger is intended to emulate
the behaviour of range_of_stmt on an arbitrarily long series of unresolved
dependencies without triggering the overhead of huge call chains from the
range_of_expr/range_on_entry/range_on_exit routines.  Instead, it creates a
stack of unvisited names and invokes range_of_stmt on them directly, in order
to get initial cache values for each ssa-name.
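
The stack-based approach described above can be sketched roughly as follows.
This is a toy model only: DepGraph and fill_dependencies are hypothetical names
for illustration, not GCC code, and "evaluating" a name is reduced to recording
the order in which names would be processed.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical dependency graph: each name maps to the names it is computed from.
using DepGraph = std::map<std::string, std::vector<std::string>>;

// Resolve `root` and its unresolved dependencies with an explicit stack
// instead of recursion.  Returns the order in which names get evaluated.
std::vector<std::string> fill_dependencies(const DepGraph &g,
                                           const std::string &root) {
    std::vector<std::string> order;
    std::set<std::string> in_progress;  // names whose dependencies were queued
    std::set<std::string> done;         // names that already have a value
    std::vector<std::string> stack{root};

    while (!stack.empty()) {
        const std::string name = stack.back();  // copy: the stack may grow
        if (done.count(name)) {                  // duplicate entry, already handled
            stack.pop_back();
            continue;
        }
        in_progress.insert(name);

        // Queue every dependency that has neither a value nor a pending visit.
        bool pushed = false;
        auto it = g.find(name);
        if (it != g.end()) {
            for (const std::string &dep : it->second) {
                if (!done.count(dep) && !in_progress.count(dep)) {
                    stack.push_back(dep);
                    pushed = true;
                }
            }
        }
        if (pushed)
            continue;  // handle the newly queued dependencies first

        // All dependencies are resolved (or part of a cycle): evaluate now.
        stack.pop_back();
        done.insert(name);
        order.push_back(name);
    }
    return order;
}
```

For a chain such as c -> b -> a, the sketch evaluates a, then b, then c, with a
stack depth that grows with the chain instead of the call depth.  A cycle is
broken by skipping names that are already in progress.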

The issue in this PR was that the routine was incorrectly invoking the
current_p form of get_global_range merely to determine whether there was a
global value.  If there was, it would move on to the next dependency without
invoking set_global_range to clear the always_current flag.

What it should have been doing was simply checking whether there was a global
value and, if there was not, adding the name for processing and THEN invoking
get_global_range with current_p to do all the special processing.
Fixed.
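
A minimal sketch of the difference between the two call patterns, under loud
assumptions: ToyCache, lookup_and_mark, prefill_buggy, prefill_fixed and drain
are hypothetical names, and the toy treats every existing value as "not
current".  Only the shape of the fix (check first, then do the side-effecting
lookup) is taken from the description above.

```cpp
#include <map>
#include <string>
#include <vector>

// Toy stand-in for a global cache with a sticky per-name flag.
// Hypothetical: not GCC's ranger_cache.
struct ToyCache {
    struct Entry { int value; bool always_current; };
    std::map<std::string, Entry> entries;

    // Plain existence check, no side effects.
    bool has_value(const std::string &n) const { return entries.count(n) != 0; }

    // Side-effecting lookup: seeds a placeholder when nothing is cached and
    // marks the name always_current, on the promise that set_value() follows.
    bool lookup_and_mark(const std::string &n) {
        bool had = has_value(n);
        entries[n].always_current = true;
        return had;
    }

    // Storing a value clears the flag again.
    void set_value(const std::string &n, int v) {
        entries[n].value = v;
        entries[n].always_current = false;
    }
};

// Buggy pattern: the side-effecting lookup doubles as the existence test, so
// an already-visited name is flagged but never queued for set_value().
void prefill_buggy(ToyCache &c, const std::string &n,
                   std::vector<std::string> &work) {
    if (!c.lookup_and_mark(n))
        work.push_back(n);
}

// Fixed pattern: check first; only names that will be processed get flagged.
void prefill_fixed(ToyCache &c, const std::string &n,
                   std::vector<std::string> &work) {
    if (!c.has_value(n)) {
        c.lookup_and_mark(n);
        work.push_back(n);
    }
}

// Every queued name receives a value, which clears its flag.
void drain(ToyCache &c, std::vector<std::string> &work) {
    while (!work.empty()) {
        std::string n = work.back();
        work.pop_back();
        c.set_value(n, 42);  // arbitrary stand-in for a computed range
    }
}
```

With a name that already has a value, prefill_buggy followed by drain leaves
always_current set, while prefill_fixed leaves it clear.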

[Bug tree-optimization/110875] [14 Regression] Dead Code Elimination Regression since r14-2501-g285c9d042e9

2023-09-07 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110875

--- Comment #3 from CVS Commits  ---
The master branch has been updated by Andrew Macleod :

https://gcc.gnu.org/g:cf2ae3fff4ee9bf884b122ee6cd83bffd791a16f

commit r14-3792-gcf2ae3fff4ee9bf884b122ee6cd83bffd791a16f
Author: Andrew MacLeod 
Date:   Thu Sep 7 11:15:50 2023 -0400

Some ssa-names get incorrectly marked as always_current.

When range_of_stmt invokes prefill_name to evaluate unvisited dependencies
it should not mark already visited names as always_current.

PR tree-optimization/110875
gcc/
* gimple-range.cc (gimple_ranger::prefill_name): Only invoke
cache-prefilling routine when the ssa-name has no global value.

gcc/testsuite/
* gcc.dg/pr110875.c: New.

[Bug tree-optimization/110875] [14 Regression] Dead Code Elimination Regression since r14-2501-g285c9d042e9

2023-08-21 Thread aldyh at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110875

Aldy Hernandez  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |amacleod at redhat dot com

--- Comment #2 from Aldy Hernandez  ---
(In reply to Andrew Pinski from comment #1)

> I am super confused about VRP's ranges:
> We have the following ranges that get exported, and their relationships:
> Global Exported: a.8_105 = [irange] int [-2, 0]
>   _10 = a.8_105 + -1;
> Global Exported: _10 = [irange] int [-INF, -6][-3, -1][1, 2147483645]
>   _103 = (unsigned int) _10;
> Global Exported: _103 = [irange] unsigned int [1, 2147483645][2147483648, 4294967290][4294967294, +INF]
> Simplified relational if (_103 > 1)
>  into if (_103 != 1)
> 
> 
> Shouldn't the range of _10 just be [-3, -1]?
> If so, _103 can't be 0 or 1.  And if VRP gets that right, then the call
> to foo will go away.

[It looks like a caching issue of some kind.  Looping Andrew.]

Yes, that is indeed confusing.  _10 should have a more refined range.

Note that there's a dependency between a.8_105 and _10:

 [local count: 327784168]:
# f_lsm.17_26 = PHI 
# a.8_105 = PHI <0(3), _10(13)>
# b_lsm.19_33 = PHI 
# b_lsm_flag.20_53 = PHI <0(3), 1(13)>
# a_lsm.21_49 = PHI <_54(D)(3), _10(13)>
_9 = e.10_39 + 4294967061;
_10 = a.8_105 + -1;
if (_10 != -3(OVF))
  goto ; [94.50%]
else
  goto ; [5.50%]

This is what I see with --param=ranger-debug=tracegori in VRP2...

We first calculate a.8_105 to [-INF, -5][-2, 0][2, 2147483646]:

1140 range_of_stmt (a.8_105) at stmt a.8_105 = PHI <0(3), _10(13)>
1141   ROS dependence fill
 ROS dep fill (a.8_105) at stmt a.8_105 = PHI <0(3), _10(13)>
 ROS dep fill (_10) at stmt _10 = a.8_105 + -1;
1142 range_of_expr(a.8_105) at stmt _10 = a.8_105 + -1;
 TRUE : (1142) range_of_expr (a.8_105) [irange] int [-INF, -5][-2, 0][2, 2147483646]

Which we later refine with SCEV:

Statement _10 = a.8_105 + -1;
 is executed at most 2147483647 (bounded by 2147483647) + 1 times in loop 4.
   Loops range found for a.8_105: [irange] int [-2, 0] and calculated range :[irange] int [-INF, -6][-2, 0][2, 2147483645]
 TRUE : (1140) range_of_stmt (a.8_105) [irange] int [-2, 0]
Global Exported: a.8_105 = [irange] int [-2, 0]

I have verified that range_of_expr after this point returns [-2, 0], so this
refined range is known both globally and locally.

However, when we try to fold _10 later on, we use the cached value instead of
recalculating with the new range for a.8_105:

Folding statement: _10 = a.8_105 + -1;
872  range_of_stmt (_10) at stmt _10 = a.8_105 + -1;
 TRUE : (872)  cached (_10) [irange] int [-INF, -6][-3, -1][1, 2147483645]
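
One way a stale value like this can survive is sketched below as a toy model.
The assumption, stated loudly: a timestamp-based cache in which a sticky
always_current flag bypasses the freshness check.  ToyTemporalCache and
range_of_minus_one are hypothetical names, ranges are reduced to one interval,
and the numbers are made up for illustration; none of this is GCC code.

```cpp
#include <map>
#include <string>

// Single closed interval [lo, hi]; real iranges hold several sub-ranges.
struct Range { int lo, hi; };

// Toy temporal cache (hypothetical, far simpler than the real one).
struct ToyTemporalCache {
    struct Slot { Range value; int stamp; bool always_current; };
    std::map<std::string, Slot> slots;
    int clock = 0;

    // Store a value with a fresh timestamp and clear the sticky flag.
    void set(const std::string &name, Range r) {
        slots[name] = Slot{r, ++clock, false};
    }

    // A cached value is usable if it is flagged always_current, or if it is
    // at least as new as the value it was computed from.
    bool is_current(const std::string &name, const std::string &dep) const {
        const Slot &s = slots.at(name);
        return s.always_current || s.stamp >= slots.at(dep).stamp;
    }
};

// Range of "name = dep + -1": reuse the cache when current, else recompute.
Range range_of_minus_one(ToyTemporalCache &c, const std::string &name,
                         const std::string &dep) {
    if (c.slots.count(name) && c.is_current(name, dep))
        return c.slots.at(name).value;
    Range a = c.slots.at(dep).value;
    Range r{a.lo - 1, a.hi - 1};
    c.set(name, r);
    return r;
}
```

In this sketch, refining the dependency after the first query makes the next
query recompute the result.  If always_current is left set on the dependent
name, the old cached interval is returned instead.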

[Bug tree-optimization/110875] [14 Regression] Dead Code Elimination Regression since r14-2501-g285c9d042e9

2023-08-02 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110875

Andrew Pinski  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2023-08-03
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW

--- Comment #1 from Andrew Pinski  ---
Confirmed, though I have no real idea how to fix this.
The first major change to the IR happens in thread2, where we now decide to do
a jump thread that we did not do before.

In GCC 13 we had:
```
   [local count: 282631250]:
  # a.8_39 = PHI <_12(23), 0(3)>
  # f_lsm.17_20 = PHI 
  # f_lsm_flag.18_22 = PHI 
  # b_lsm.19_45 = PHI <0(23), b_lsm.19_53(3)>
  # b_lsm_flag.20_47 = PHI <1(23), 0(3)>
  # a_lsm.21_49 = PHI <_12(23), _55(D)(3)>
  _1 = a.8_39 != 0;
  _2 = (int) _1;
  if (_2 != a.8_39)
    goto ; [41.79%]
```

On the trunk we get:
```
   [local count: 339987332]:
  # a.8_38 = PHI <_10(24), 0(3)>
  # f_lsm.17_18 = PHI 
  # f_lsm_flag.18_20 = PHI 
  # b_lsm.19_44 = PHI <0(24), b_lsm.19_52(3)>
  # b_lsm_flag.20_46 = PHI <1(24), 0(3)>
  # a_lsm.21_48 = PHI <_10(24), _54(D)(3)>
  _13 = (unsigned int) a.8_38;
  if (_13 > 1)
    goto ; [34.74%]
  else
    goto ; [65.26%]
```
We duplicate bb4 for bb3 as we can figure that _13>1 will be false. This was
not done for the IR in GCC 13.

I am super confused about VRP's ranges:
We have the following ranges that get exported, and their relationships:
Global Exported: a.8_105 = [irange] int [-2, 0]
  _10 = a.8_105 + -1;
Global Exported: _10 = [irange] int [-INF, -6][-3, -1][1, 2147483645]
  _103 = (unsigned int) _10;
Global Exported: _103 = [irange] unsigned int [1, 2147483645][2147483648, 4294967290][4294967294, +INF]
Simplified relational if (_103 > 1)
 into if (_103 != 1)


Shouldn't the range of _10 just be [-3, -1]?
If so, _103 can't be 0 or 1.  And if VRP gets that right, then the call to
foo will go away.
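
The arithmetic behind that expectation, as a small sketch.  IntRange,
UIntRange, add_const, to_unsigned and always_greater are hypothetical helpers
written for this illustration, not part of GCC's irange API, and they only
handle a single interval.

```cpp
#include <cstdint>
#include <optional>

// Closed intervals [lo, hi] over 32-bit signed and unsigned values.
struct IntRange { std::int32_t lo, hi; };
struct UIntRange { std::uint32_t lo, hi; };

// Range of "x + c" for x in r; assumes the sum does not overflow.
IntRange add_const(IntRange r, std::int32_t c) {
    return {static_cast<std::int32_t>(r.lo + c),
            static_cast<std::int32_t>(r.hi + c)};
}

// Range of "(unsigned int) x" for x in r.  An interval that straddles zero
// would map to two unsigned sub-ranges, which this toy does not represent.
std::optional<UIntRange> to_unsigned(IntRange r) {
    if (r.lo < 0 && r.hi >= 0)
        return std::nullopt;
    return UIntRange{static_cast<std::uint32_t>(r.lo),
                     static_cast<std::uint32_t>(r.hi)};
}

// True when every value in r is greater than k.
bool always_greater(UIntRange r, std::uint32_t k) { return r.lo > k; }
```

Starting from a.8_105 in [-2, 0], add_const with -1 gives [-3, -1], and
to_unsigned maps that to [4294967293, 4294967295].  Every value in that
interval is greater than 1, so a test like _103 > 1 could be folded to true.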

[Bug tree-optimization/110875] [14 Regression] Dead Code Elimination Regression since r14-2501-g285c9d042e9

2023-08-02 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110875

Andrew Pinski  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
   Target Milestone|---                         |14.0