[Bug tree-optimization/114192] scalar code left around following early break vectorization of reduction

2024-03-04 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114192

--- Comment #5 from GCC Commits ---
The master branch has been updated by Richard Biener:

https://gcc.gnu.org/g:324d2907c86f05e40dc52d226940308f53a956c2

commit r14-9292-g324d2907c86f05e40dc52d226940308f53a956c2
Author: Richard Biener 
Date:   Mon Mar 4 09:46:13 2024 +0100

tree-optimization/114192 - scalar reduction kept live with early break vect

The following fixes a missing replacement of the reduction value
used in the epilog, causing the scalar reduction to be kept live
across the early break exit path.

	PR tree-optimization/114192
	* tree-vect-loop.cc (vect_create_epilog_for_reduction): Use the
	appropriate def for the live out stmt in case of an alternate
	exit.

[Bug tree-optimization/114192] scalar code left around following early break vectorization of reduction

2024-03-04 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114192

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #6 from Richard Biener ---
Fixed.

[Bug tree-optimization/114192] scalar code left around following early break vectorization of reduction

2024-03-04 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114192

--- Comment #4 from Richard Biener ---
vect-early-break_104-pr113373.c might be such a case, which then ICEs.

[Bug tree-optimization/114192] scalar code left around following early break vectorization of reduction

2024-03-04 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114192

--- Comment #3 from Richard Biener ---
Created attachment 57600
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57600&action=edit
patch

Ah, so the issue is that we only replace the LHS.  In this case we pass
the reduction stmt for the early exit, but the live value is defined by
the PHI (we re-start the iteration), which confuses the replacement process.
It looks like it might also be wrong for the peeled case on the main edge?

The following fixes it for me.  I didn't check what happens for the peeled
case with a reduction.
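
To make the "live value is defined by the PHI" point concrete, here is a
hand-written C model of an early-break vectorized sum.  It is only a sketch
under stated assumptions, not the code GCC generates: the function name
sum_until_42, the vector factor VF = 4 and the lane array acc are all
hypothetical.  When the break condition hits inside a vector iteration, the
body of that iteration is not applied, so the value carried to the scalar
epilog is the accumulator as it stood at the start of the iteration (the PHI
value), and the epilog re-runs that iteration one element at a time.

```c
enum { VF = 4 };  /* hypothetical vector factor, chosen for illustration */

/* Sum a[0..n) up to and including the first element equal to 42. */
int sum_until_42(const int *a, int n)
{
  int acc[VF] = { 0 };  /* models the vector accumulator, one slot per lane */
  int i = 0;

  /* "Vector" loop: handles VF elements per iteration. */
  for (; i + VF <= n; i += VF)
    {
      int hit = 0;
      for (int l = 0; l < VF; l++)
        hit |= (a[i + l] == 42);
      if (hit)
        break;  /* early exit: acc still holds the value from iteration start */
      for (int l = 0; l < VF; l++)
        acc[l] += a[i + l];
    }

  /* Reduction epilog: horizontal add, used on both exit paths. */
  int sum = 0;
  for (int l = 0; l < VF; l++)
    sum += acc[l];

  /* Scalar epilog: re-starts the interrupted iteration or handles the tail. */
  for (; i < n; i++)
    {
      sum += a[i];
      if (a[i] == 42)
        break;
    }
  return sum;
}
```

In this model both exits feed the scalar epilog from the same reduced value,
so no separate scalar accumulation has to stay live alongside the vector one.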

[Bug tree-optimization/114192] scalar code left around following early break vectorization of reduction

2024-03-04 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114192

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|unassigned at gcc dot gnu.org  |rguenth at gcc dot gnu.org
             Status|NEW                         |ASSIGNED

--- Comment #2 from Richard Biener ---
The issue is that the scalar reduction value is live from the main loop to
the epilog.  We don't seem to use the vector .REDUC_PLUS value on both paths.
Likely a failure of reduction epilog generation.  Let me have a look.

[Bug tree-optimization/114192] scalar code left around following early break vectorization of reduction

2024-03-01 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114192

Tamar Christina changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2024-03-01

--- Comment #1 from Tamar Christina ---
Confirmed.

It looks like DCE6 no longer thinks:

  # sum_10 = PHI 

  _1 = aD.4432[i_12];
  sum_7 = _1 + sum_11;

is dead after vectorization.

It removes the only dead consumer of sum_7, a PHI node left over in the
guard block that becomes unused after the reduction is vectorized.

DCE says:

marking necessary through sum_11 stmt sum_11 = PHI 
processing: sum_11 = PHI 

marking necessary through sum_7 stmt sum_7 = _1 + sum_11;
processing: sum_7 = _1 + sum_11;

marking necessary through _1 stmt _1 = a[i_12];
processing: _1 = a[i_12];

So it thinks the closed definition is needed?
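
The trace above reads like the mark phase of a mark-and-sweep DCE: once one
statement is considered necessary, everything its operands depend on becomes
necessary too.  The following is a toy sketch of that propagation, not GCC's
implementation; struct stmt, toy_stmts, mark_necessary and the dependency
table are hypothetical, with the PHI assumed to use sum_7 as the trace
suggests.

```c
#include <stdbool.h>

enum { NSTMT = 4, MAXOPS = 2 };

/* Hypothetical toy IR: ops[] holds the indices of the statements that
   define this statement's operands, or -1 for "no operand".  */
struct stmt
{
  const char *text;
  int ops[MAXOPS];
};

/* Assumed dependencies, loosely following the DCE dump.  */
static const struct stmt toy_stmts[NSTMT] = {
  { "sum_11 = PHI",          {  2, -1 } },  /* uses sum_7 (assumption)   */
  { "_1 = a[i_12]",          { -1, -1 } },
  { "sum_7 = _1 + sum_11",   {  1,  0 } },
  { "unrelated = 0",         { -1, -1 } },  /* never reached from root   */
};

/* Mark ROOT and every statement it transitively depends on.  Each
   statement is pushed at most once, so NSTMT worklist slots suffice.  */
void mark_necessary(const struct stmt *s, int root, bool *necessary)
{
  int worklist[NSTMT];
  int top = 0;

  necessary[root] = true;
  worklist[top++] = root;
  while (top > 0)
    {
      int cur = worklist[--top];
      for (int k = 0; k < MAXOPS; k++)
        {
          int def = s[cur].ops[k];
          if (def >= 0 && !necessary[def])
            {
              necessary[def] = true;
              worklist[top++] = def;
            }
        }
    }
}
```

Starting from the PHI, the sketch marks the add and the load as necessary as
well, which mirrors the order of the "marking necessary through" lines.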

This seems to happen only with reductions; other live operations look fine:

extern int a[1024];
int f4(int *x, int n)
{
  int sum = 0;
  for (int i = 0; i < n; i++)
    {
      sum = a[i];
      if (a[i] == 42)
        break;
    }
  return sum;
}
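
For contrast with f4, which only copies a[i] into sum, the reduction form
implied by the GIMPLE earlier in this comment (sum_7 = _1 + sum_11) would
presumably accumulate instead.  A minimal sketch, assuming the failing
testcase differs only in using +=; the name f4_reduction is hypothetical, and
a is defined here rather than declared extern so the snippet stands alone.

```c
int a[1024];  /* defined locally for the sketch; the original is extern */

/* Hypothetical reduction variant of f4: sum is accumulated, not overwritten. */
int f4_reduction(int *x, int n)
{
  (void) x;  /* unused, kept only to mirror f4's signature */
  int sum = 0;
  for (int i = 0; i < n; i++)
    {
      sum += a[i];
      if (a[i] == 42)
        break;
    }
  return sum;
}
```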