[Bug tree-optimization/114322] [14 Regression] SCEV analysis failed for bases like A[(i+x)*stride] since r14-9193-ga0b1798042d033

2024-03-20 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114322

--- Comment #4 from GCC Commits ---
The master branch has been updated by Hao Liu:

https://gcc.gnu.org/g:4c276896d646c2dbc8047fd81d6e65f8c5ecf01d

commit r14-9569-g4c276896d646c2dbc8047fd81d6e65f8c5ecf01d
Author: Hao Liu 
Date:   Wed Mar 20 17:37:01 2024 +0800

testsuite: add the case to cover the vectorization of A[(i+x)*stride] [PR114322]

This issue has been fixed by r14-9540-ge0e9499a in PR114151. Tested on
aarch64-linux-gnu.

gcc/testsuite/ChangeLog:

PR tree-optimization/114322
* gcc.dg/vect/pr114322.c: New testcase.


2024-03-19 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114322

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #3 from Richard Biener ---
Fixed by reverting the offending change.  Feel free to submit a testcase for
the testsuite covering your case.


2024-03-19 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114322

--- Comment #2 from GCC Commits ---
The master branch has been updated by Richard Biener:

https://gcc.gnu.org/g:e0e9499aeffdaca88f0f29334384aa5f710a81a4

commit r14-9540-ge0e9499aeffdaca88f0f29334384aa5f710a81a4
Author: Richard Biener 
Date:   Tue Mar 19 12:24:08 2024 +0100

tree-optimization/114151 - revert PR114074 fix

The following reverts the chrec_fold_multiply fix and only keeps
handling of constant overflow which keeps the original testcase
fixed.  A better solution might involve ranger improvements or
tracking of assumptions during SCEV analysis similar to what niter
analysis does.

PR tree-optimization/114151
PR tree-optimization/114269
PR tree-optimization/114322
PR tree-optimization/114074
* tree-chrec.cc (chrec_fold_multiply): Restrict the use of
unsigned arithmetic when actual overflow on constant operands
is observed.

* gcc.dg/pr68317.c: Revert last change.


2024-03-13 Thread law at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114322

Jeffrey A. Law changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P2
                 CC|                            |law at gcc dot gnu.org


2024-03-13 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114322

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |14.0
   Last reconfirmed|                            |2024-03-13
             Status|UNCONFIRMED                 |NEW
     Ever confirmed|0                           |1

--- Comment #1 from Richard Biener ---
Confirmed.  The issue is we have

 { x_12(D), +, 1 } * stride_11(D)

which doesn't behave the same with respect to overflow as

 { x_12(D) * stride_11(D), +, stride_11(D) }

and because of that we analyze it as


 (int) {(unsigned) x_12(D) * (unsigned) stride_11(D), +, (unsigned) stride_11(D) }

as it might wrap.  But then the sign-extension to long unsigned int is
no longer affine.
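
A small numeric illustration of that last point, as a sketch rather than GCC
code.  The helper names sext32 and widened_is_affine are hypothetical, and a
32-bit int with a 64-bit long unsigned int (as on LP64 targets) is assumed.
It walks the wrapping 32-bit evolution { base, +, step } and checks whether
the sign-extended 64-bit values still advance by one constant step.

```c
#include <stdint.h>

/* Sign-extend a 32-bit pattern to 64 bits without relying on
   implementation-defined conversions.  */
static int64_t sext32(uint32_t v)
{
    return v < 0x80000000u ? (int64_t)v : (int64_t)v - 4294967296LL;
}

/* Walk the unsigned 32-bit evolution { base, +, step } for n iterations
   and report whether the sign-extended values advance by a single
   constant step modulo 2^64, i.e. whether the widened sequence is
   still affine.  */
static int widened_is_affine(uint32_t base, uint32_t step, unsigned n)
{
    if (n < 2)
        return 1;
    uint64_t first = (uint64_t)sext32(base + step) - (uint64_t)sext32(base);
    for (unsigned i = 1; i + 1 < n; i++) {
        uint32_t cur = base + i * step;   /* wraps modulo 2^32 */
        uint32_t next = cur + step;
        uint64_t delta = (uint64_t)sext32(next) - (uint64_t)sext32(cur);
        if (delta != first)
            return 0;
    }
    return 1;
}
```

With base = 0x7ffffff0 and step = 8 the third value crosses the signed
boundary, so the widened step jumps by 8 - 2^32 and the function returns 0.
A sequence that never crosses that boundary, such as base = 0, step = 8,
stays affine after widening.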

  _1 = x_12(D) + i_20;
  _2 = _1 * stride_11(D);
  _3 = (long unsigned int) _2;
  _4 = _3 * 2;
  _5 = A_13(D) + _4;
  _6 = *_5;

The problematic case is x == N < 0, where the last - N might now
overflow with the new SCEV.

The correctness fix means that we'll now run into these issues more often
for IVs narrower than pointer width.  With -m32 we can analyze the DR as

Creating dr for *_5
offset from base address: 0
constant offset from base address: 0
step: (ssizetype) ((unsigned int) stride_11(D) * 2)
base alignment: 2
base misalignment: 0
offset alignment: 256
step alignment: 2
base_object: *A_13(D) + (sizetype) ((unsigned int) stride_11(D) * (unsigned int) x_12(D)) * 2
Access function 0: {0B, +, (unsigned int) stride_11(D) * 2}_1
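
To see why the 32-bit case stays analyzable, here is a minimal model of the
byte offset when the IV and the pointer are both 32 bits wide, so no widening
happens and everything is computed modulo 2^32.  This is only a sketch: the
function name offsets32_are_affine is hypothetical, and the factor 2 is the
element size taken from the dump above.

```c
#include <stdint.h>

/* Model the byte offset (x + i) * stride * 2 entirely in 32-bit unsigned
   arithmetic and check that consecutive offsets differ by the constant
   step stride * 2 (modulo 2^32), matching the "step" line of the dump.  */
static int offsets32_are_affine(uint32_t x, uint32_t stride, unsigned n)
{
    uint32_t step = stride * 2u;
    for (unsigned i = 0; i + 1 < n; i++) {
        uint32_t cur = (x + i) * stride * 2u;
        uint32_t next = (x + i + 1u) * stride * 2u;
        if ((uint32_t)(next - cur) != step)
            return 0;
    }
    return 1;
}
```

Because modular arithmetic distributes, this returns 1 for any inputs,
including values of x and stride that make the products wrap.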

If you had written

   sum += A[i*stride + x*stride];

it might have worked but unfortunately EVRP transforms this back to
(i+x)*stride because it knows stride isn't zero.
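
For reference, the two source forms being compared look roughly like this.
This is a hypothetical reduction, not the testcase from the PR: the function
names, the trip count parameter n, and the short element type (inferred from
the multiplication by 2 in the GIMPLE above) are all assumptions.

```c
/* Index written as (i + x) * stride, the form from the bug title.  */
int sum_fused(const short *A, int x, int stride, int n)
{
    int sum = 0;
    for (int i = 0; i < n; i++)
        sum += A[(i + x) * stride];
    return sum;
}

/* Index written with the multiplication distributed by hand.  */
int sum_split(const short *A, int x, int stride, int n)
{
    int sum = 0;
    for (int i = 0; i < n; i++)
        sum += A[i * stride + x * stride];
    return sum;
}
```

Both functions read the same elements whenever none of the int computations
overflow; they differ only in which intermediate products could overflow.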

In the end this means the failure is ours: we do not handle

  2 * (unsigned long)({ x_12(D), +, 1 } * stride_11(D))

as valid evolution for further analysis - of course the multiplication
by two in an unsigned type might overflow as well.