[Bug tree-optimization/112303] [14 Regression] ICE on valid code at -O3 on x86_64-linux-gnu: verify_flow_info failed since r14-3459-g0c78240fd7d519

2024-04-03 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112303

Jakub Jelinek  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #17 from Jakub Jelinek  ---
Fixed.

[Bug tree-optimization/112303] [14 Regression] ICE on valid code at -O3 on x86_64-linux-gnu: verify_flow_info failed since r14-3459-g0c78240fd7d519

2024-03-29 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112303

--- Comment #16 from GCC Commits  ---
The releases/gcc-13 branch has been updated by Jakub Jelinek:

https://gcc.gnu.org/g:b7b4ef2ff20c5023a41ed663dd8f4724b4ff0f9c

commit r13-8525-gb7b4ef2ff20c5023a41ed663dd8f4724b4ff0f9c
Author: Jakub Jelinek 
Date:   Thu Mar 28 15:00:44 2024 +0100

profile-count: Avoid overflows into uninitialized [PR112303]

The testcase in the patch ICEs with
--- gcc/tree-scalar-evolution.cc
+++ gcc/tree-scalar-evolution.cc
@@ -3881,7 +3881,7 @@ final_value_replacement_loop (class loop *loop)

   /* Propagate constants immediately, but leave an unused initialization
      around to avoid invalidating the SCEV cache.  */
-  if (CONSTANT_CLASS_P (def) && !SSA_NAME_OCCURS_IN_ABNORMAL_PHI (rslt))
+  if (0 && CONSTANT_CLASS_P (def) && !SSA_NAME_OCCURS_IN_ABNORMAL_PHI (rslt))
     replace_uses_by (rslt, def);

   /* Create the replacement statements.  */
(the addition of the above made the ICE latent), because profile_count
addition doesn't check for overflows and if unlucky, we can even overflow
into the uninitialized value.
Getting really huge profile counts is very easy even when not using
recursive inlining in loops, e.g.
__attribute__((noipa)) void
bar (void)
{
  __builtin_exit (0);
}

__attribute__((noipa)) void
foo (void)
{
  for (int i = 0; i < 1000; ++i)
  for (int j = 0; j < 1000; ++j)
  for (int k = 0; k < 1000; ++k)
  for (int l = 0; l < 1000; ++l)
  for (int m = 0; m < 1000; ++m)
  for (int n = 0; n < 1000; ++n)
  for (int o = 0; o < 1000; ++o)
  for (int p = 0; p < 1000; ++p)
  for (int q = 0; q < 1000; ++q)
  for (int r = 0; r < 1000; ++r)
  for (int s = 0; s < 1000; ++s)
  for (int t = 0; t < 1000; ++t)
  for (int u = 0; u < 1000; ++u)
  for (int v = 0; v < 1000; ++v)
  for (int w = 0; w < 1000; ++w)
  for (int x = 0; x < 1000; ++x)
  for (int y = 0; y < 1000; ++y)
  for (int z = 0; z < 1000; ++z)
  for (int a = 0; a < 1000; ++a)
  for (int b = 0; b < 1000; ++b)
bar ();
}

int
main ()
{
  foo ();
}
reaches the maximum count already on the 11th loop.

Some other methods of profile_count like apply_scale already
do use MIN (val, max_count) before assignment to m_val, this patch
just extends that to operator{+,+=} methods.
Furthermore, one overload of apply_probability wasn't using
safe_scale_64bit and so could very easily overflow as well
- prob is required to be [0, 1] and if m_val is near the max_count,
it can overflow even with multiplications by 8.

2024-03-28  Jakub Jelinek  

PR tree-optimization/112303
* profile-count.h (profile_count::operator+): Perform
addition in uint64_t variable and set m_val to MIN of that
val and max_count.
(profile_count::operator+=): Likewise.
(profile_count::operator-=): Formatting fix.
(profile_count::apply_probability): Use safe_scale_64bit
even in the int overload.

* gcc.c-torture/compile/pr112303.c: New test.

(cherry picked from commit d5a3b4afcdf4d517334a2717dbb65ae0d2c26507)
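
As a rough illustration of the operator+ change described in the ChangeLog
above (this is not the actual profile-count.h code), a saturating addition
could look like the sketch below. The names uninitialized_sketch,
max_count_sketch and saturating_add are hypothetical, and the 61-bit width is
an assumption taken from the values quoted elsewhere in this PR.

```cpp
#include <cstdint>

// Assumed layout: a 61-bit counter whose all-ones value is reserved as the
// "uninitialized" marker; the largest real count is one below it.
constexpr std::uint64_t uninitialized_sketch = (std::uint64_t{1} << 61) - 1;
constexpr std::uint64_t max_count_sketch = uninitialized_sketch - 1;

// Add in a full 64-bit variable, then clamp to the largest real count.
// Both inputs are at most max_count_sketch (< 2^61), so the 64-bit sum
// itself cannot wrap around.
constexpr std::uint64_t saturating_add(std::uint64_t a, std::uint64_t b)
{
  return a + b < max_count_sketch ? a + b : max_count_sketch;
}

// Small sums are unchanged; large sums stop at the maximum instead of
// landing on the uninitialized marker.
static_assert(saturating_add(1, 2) == 3, "small values add normally");
static_assert(saturating_add(max_count_sketch, 1) == max_count_sketch,
              "large values saturate");
```

With this kind of clamping, the sum of the two counts from comment #12 would
end up at the maximum count rather than at the uninitialized marker.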

[Bug tree-optimization/112303] [14 Regression] ICE on valid code at -O3 on x86_64-linux-gnu: verify_flow_info failed since r14-3459-g0c78240fd7d519

2024-03-28 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112303

--- Comment #15 from GCC Commits  ---
The master branch has been updated by Jakub Jelinek :

https://gcc.gnu.org/g:d5a3b4afcdf4d517334a2717dbb65ae0d2c26507

commit r14-9707-gd5a3b4afcdf4d517334a2717dbb65ae0d2c26507
Author: Jakub Jelinek 
Date:   Thu Mar 28 15:00:44 2024 +0100

profile-count: Avoid overflows into uninitialized [PR112303]

The testcase in the patch ICEs with
--- gcc/tree-scalar-evolution.cc
+++ gcc/tree-scalar-evolution.cc
@@ -3881,7 +3881,7 @@ final_value_replacement_loop (class loop *loop)

   /* Propagate constants immediately, but leave an unused initialization
      around to avoid invalidating the SCEV cache.  */
-  if (CONSTANT_CLASS_P (def) && !SSA_NAME_OCCURS_IN_ABNORMAL_PHI (rslt))
+  if (0 && CONSTANT_CLASS_P (def) && !SSA_NAME_OCCURS_IN_ABNORMAL_PHI (rslt))
     replace_uses_by (rslt, def);

   /* Create the replacement statements.  */
(the addition of the above made the ICE latent), because profile_count
addition doesn't check for overflows and if unlucky, we can even overflow
into the uninitialized value.
Getting really huge profile counts is very easy even when not using
recursive inlining in loops, e.g.
__attribute__((noipa)) void
bar (void)
{
  __builtin_exit (0);
}

__attribute__((noipa)) void
foo (void)
{
  for (int i = 0; i < 1000; ++i)
  for (int j = 0; j < 1000; ++j)
  for (int k = 0; k < 1000; ++k)
  for (int l = 0; l < 1000; ++l)
  for (int m = 0; m < 1000; ++m)
  for (int n = 0; n < 1000; ++n)
  for (int o = 0; o < 1000; ++o)
  for (int p = 0; p < 1000; ++p)
  for (int q = 0; q < 1000; ++q)
  for (int r = 0; r < 1000; ++r)
  for (int s = 0; s < 1000; ++s)
  for (int t = 0; t < 1000; ++t)
  for (int u = 0; u < 1000; ++u)
  for (int v = 0; v < 1000; ++v)
  for (int w = 0; w < 1000; ++w)
  for (int x = 0; x < 1000; ++x)
  for (int y = 0; y < 1000; ++y)
  for (int z = 0; z < 1000; ++z)
  for (int a = 0; a < 1000; ++a)
  for (int b = 0; b < 1000; ++b)
bar ();
}

int
main ()
{
  foo ();
}
reaches the maximum count already on the 11th loop.

Some other methods of profile_count like apply_scale already
do use MIN (val, max_count) before assignment to m_val, this patch
just extends that to operator{+,+=} methods.
Furthermore, one overload of apply_probability wasn't using
safe_scale_64bit and so could very easily overflow as well
- prob is required to be [0, 1] and if m_val is near the max_count,
it can overflow even with multiplications by 8.

2024-03-28  Jakub Jelinek  

PR tree-optimization/112303
* profile-count.h (profile_count::operator+): Perform
addition in uint64_t variable and set m_val to MIN of that
val and max_count.
(profile_count::operator+=): Likewise.
(profile_count::operator-=): Formatting fix.
(profile_count::apply_probability): Use safe_scale_64bit
even in the int overload.

* gcc.c-torture/compile/pr112303.c: New test.
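
To make the apply_probability remark in the commit message more concrete, here
is a small sketch of how a plain 64-bit multiply wraps for counts near the
maximum, next to one way of avoiding it. This is not safe_scale_64bit; it is a
substitute technique (splitting the count into quotient and remainder) chosen
only because it is easy to check by hand. The names prob_base, naive_apply and
split_apply are hypothetical, and the base of 10000 and the 61-bit maximum are
assumptions for illustration.

```cpp
#include <cstdint>

// Assumed for illustration: largest real count and integer probability base.
constexpr std::uint64_t max_count_sketch = (std::uint64_t{1} << 61) - 2;
constexpr std::uint32_t prob_base = 10000;

// Plain multiply then divide with rounding. val * prob is computed modulo
// 2^64, so it silently wraps when val is close to max_count_sketch.
constexpr std::uint64_t naive_apply(std::uint64_t val, std::uint32_t prob)
{
  return (val * prob + prob_base / 2) / prob_base;
}

// Requires prob <= prob_base. Write val = q * prob_base + r; then
// val * prob / prob_base = q * prob + r * prob / prob_base.
// q * prob is at most val, and r * prob is below prob_base squared, so
// neither product can overflow 64 bits.
constexpr std::uint64_t split_apply(std::uint64_t val, std::uint32_t prob)
{
  return (val / prob_base) * prob
         + ((val % prob_base) * prob + prob_base / 2) / prob_base;
}

static_assert(split_apply(max_count_sketch, prob_base) == max_count_sketch,
              "probability 1 keeps the count");
static_assert(naive_apply(max_count_sketch, prob_base) != max_count_sketch,
              "the plain multiply wrapped");
```

For small counts both functions agree; they only diverge once val * prob no
longer fits in 64 bits.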

[Bug tree-optimization/112303] [14 Regression] ICE on valid code at -O3 on x86_64-linux-gnu: verify_flow_info failed since r14-3459-g0c78240fd7d519

2024-03-27 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112303

--- Comment #14 from Jan Hubicka  ---
> This patch fixes the ICE for me.
> Seems we already did something like that in other spots (e.g. in apply_scale).

In general, if the overflow happens, some pass must have misbehaved and
done something crazy when updating the profile.  But indeed we probably ought
to cap here instead of randomly getting to uninitialized.  It may make
sense to make these ICEs in checking-enabled builds only.  I will look into
why the overflow happens.

Honza

[Bug tree-optimization/112303] [14 Regression] ICE on valid code at -O3 on x86_64-linux-gnu: verify_flow_info failed since r14-3459-g0c78240fd7d519

2024-03-27 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112303

--- Comment #13 from Jakub Jelinek  ---
Created attachment 57821
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57821&action=edit
gcc14-pr112303.patch

This patch fixes the ICE for me.
Seems we already did something like that in other spots (e.g. in apply_scale).

[Bug tree-optimization/112303] [14 Regression] ICE on valid code at -O3 on x86_64-linux-gnu: verify_flow_info failed since r14-3459-g0c78240fd7d519

2024-03-26 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112303

--- Comment #12 from Jakub Jelinek  ---
(In reply to Jakub Jelinek from comment #11)
> (In reply to Richard Biener from comment #10)
> > Looks like so, can you test that?  I think !(bb->count >= new_count) is 
> > good,
> > we're using this kind of compare regularly.
> 
> Sure, I'll test that.

Actually no, that doesn't help, nor the IMO better
  if (!new_count.initialized_p () || bb->count < new_count)
    new_count = bb->count;
because if say bb->count is not initialized but e->count is, we don't want to
overwrite it.
The thing is that new_count is actually not used unless e is non-NULL.

The actual problem is different: bb->count of one of the duplicated blocks is
set to the largest possible initialized m_val (0x1ffffffffffffffe aka
2305843009213693950 (estimated locally, freq 144115188075855872.)) and then
scaled to uninitialized.
This is because in the second duplicate_loop_body_to_header_edge on the
testcase (with the #c9 patch to reproduce it even on the trunk) we have
(gdb) p count_le.debug ()
1729382256910270463 (estimated locally, freq 108086391056891904.)
(gdb) p count_out_orig.debug ()
576460752303423488 (estimated locally, freq 36028797018963968.)
but
1264  profile_count new_count_le = count_le + count_out_orig;
is
(gdb) p new_count_le.debug ()
uninitialized
because 0x17ffffffffffffff + 0x800000000000000 yields the largest possible
value.

If profile_count wants to use the 0x1fffffffffffffff value as uninitialized,
shouldn't it perform saturating arithmetic such that the counts will never be
larger than 0x1ffffffffffffffe unless it is really meant to be uninitialized?
I mean in all those spots like operator+ which just do m_val + other.m_val and
similar without checking for overflow?  What about apply_scale etc.?

Honza?
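
The arithmetic in the gdb session above can be checked with a few lines of
C++. This is only a sketch: the constant and function names are hypothetical,
and the 61-bit field width is inferred from the values printed by gdb, not
copied from profile-count.h.

```cpp
#include <cstdint>

// Assumed: counts live in a 61-bit field and the all-ones pattern means
// "uninitialized".
constexpr std::uint64_t uninitialized_sketch = (std::uint64_t{1} << 61) - 1;

// The two counts printed by gdb, in decimal as shown in the session.
constexpr std::uint64_t count_le = 1729382256910270463ULL;
constexpr std::uint64_t count_out_orig = 576460752303423488ULL;

// Unchecked addition truncated to the 61-bit field.
constexpr std::uint64_t unchecked_add(std::uint64_t a, std::uint64_t b)
{
  return (a + b) & uninitialized_sketch;
}

// The decimal values correspond to the hex values discussed in the comment.
static_assert(count_le == 0x17ffffffffffffffULL, "count_le in hex");
static_assert(count_out_orig == 0x800000000000000ULL, "count_out_orig in hex");

// Two valid counts add up to exactly the uninitialized marker.
static_assert(unchecked_add(count_le, count_out_orig) == uninitialized_sketch,
              "sum collides with the uninitialized marker");
```

Both operands are valid counts on their own; only their sum collides with the
reserved value.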

[Bug tree-optimization/112303] [14 Regression] ICE on valid code at -O3 on x86_64-linux-gnu: verify_flow_info failed since r14-3459-g0c78240fd7d519

2024-03-26 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112303

--- Comment #11 from Jakub Jelinek  ---
(In reply to Richard Biener from comment #10)
> Looks like so, can you test that?  I think !(bb->count >= new_count) is good,
> we're using this kind of compare regularly.

Sure, I'll test that.

[Bug tree-optimization/112303] [14 Regression] ICE on valid code at -O3 on x86_64-linux-gnu: verify_flow_info failed since r14-3459-g0c78240fd7d519

2024-03-26 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112303

--- Comment #10 from Richard Biener  ---
Looks like so, can you test that?  I think !(bb->count >= new_count) is good,
we're using this kind of compare regularly.

[Bug tree-optimization/112303] [14 Regression] ICE on valid code at -O3 on x86_64-linux-gnu: verify_flow_info failed since r14-3459-g0c78240fd7d519

2024-03-05 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112303

--- Comment #9 from Jakub Jelinek  ---
Still reproducible with
--- gcc/tree-scalar-evolution.cc
+++ gcc/tree-scalar-evolution.cc
@@ -3881,7 +3881,7 @@ final_value_replacement_loop (class loop *loop)

   /* Propagate constants immediately, but leave an unused initialization
      around to avoid invalidating the SCEV cache.  */
-  if (CONSTANT_CLASS_P (def) && !SSA_NAME_OCCURS_IN_ABNORMAL_PHI (rslt))
+  if (0 && CONSTANT_CLASS_P (def) && !SSA_NAME_OCCURS_IN_ABNORMAL_PHI (rslt))
     replace_uses_by (rslt, def);

   /* Create the replacement statements.  */
The bb with uninitialized count is created by
#7  0x0069060b in create_empty_bb (after=) at ../../gcc/cfghooks.cc:773
#8  0x00e2c995 in gimple_duplicate_bb (bb=, id=0x7fffc610) at ../../gcc/tree-cfg.cc:6513
#9  0x00691158 in duplicate_block (bb=, e=, after=, id=0x7fffc610)
    at ../../gcc/cfghooks.cc:1119
#10 0x006918f5 in copy_bbs (bbs=0x3bfa670, n=3, new_bbs=0x3bce9c0,
    edges=0x7fffc790, num_edges=2, new_edges=0x7fffc780, base=0x7fffe9f1f7d0,
    after=, update_dominance=true) at ../../gcc/cfghooks.cc:1384
#11 0x006a19c6 in duplicate_loop_body_to_header_edge (loop=0x7fffe9f1f7d0,
    e= 62)>, ndupl=2, wont_exit=0x3ac78f0,
    orig= 66)>, Python Exception : There is no member or method named m_vecpfx.
    to_remove=0x39ba7b0, flags=5) at ../../gcc/cfgloopmanip.cc:1403
#12 0x00fc8fd9 in gimple_duplicate_loop_body_to_header_edge
    (loop=0x7fffe9f1f7d0, e= 225)>, ndupl=2, wont_exit=0x3ac78f0,
    orig= 66)>, Python Exception : There is no member or method named m_vecpfx.
    to_remove=0x39ba7b0, flags=5) at ../../gcc/tree-ssa-loop-manip.cc:860
#13 0x00fa53f6 in try_unroll_loop_completely (loop=0x7fffe9f1f7d0,
    exit= 66)>, niter=, may_be_zero=false, ul=UL_ALL, maxiter=2, locus=...,
    allow_peel=true) at ../../gcc/tree-ssa-loop-ivcanon.cc:960

Seems in the above backtrace it is duplicate_block which does the new_bb->count
updates.

It does:
1107  profile_count new_count = e ? e->count (): profile_count::uninitialized ();
but e is NULL, so here new_count is uninitialized, and then
1114  if (bb->count < new_count)
1115    new_count = bb->count;
here
p bb->count.debug ()
2305843009213693950 (estimated locally, freq 144115188075855872.)
p new_count.debug ()
uninitialized
but bb->count < new_count is false due to
  bool operator< (const profile_probability &other) const
    {
      return initialized_p () && other.initialized_p () && m_val < other.m_val;
    }

Shouldn't that be if (!(bb->count >= new_count)) or if (bb->count < new_count
|| !new_count.initialized_p ()) ?
Honza?
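
To spell out why the two spellings of the test differ, here is a small
self-contained sketch. It is not GCC's class: count_sketch is a hypothetical
stand-in that copies the convention of the operator< quoted above (any
comparison involving an uninitialized value is false), and it assumes
operator>= follows the same convention.

```cpp
#include <cstdint>

// Hypothetical stand-in for a count that may be uninitialized.
struct count_sketch
{
  std::uint64_t val;
  bool init;

  constexpr bool initialized_p () const { return init; }

  // Same shape as the operator< quoted above: false unless both sides
  // are initialized.
  constexpr bool operator< (const count_sketch &other) const
  {
    return init && other.init && val < other.val;
  }

  // Assumed to follow the same convention.
  constexpr bool operator>= (const count_sketch &other) const
  {
    return init && other.init && val >= other.val;
  }
};

constexpr count_sketch bb_count = { 2305843009213693950ULL, true };
constexpr count_sketch new_count = { 0, false };  // e is NULL, so uninitialized

// The existing test does not fire, so new_count stays uninitialized.
static_assert (!(bb_count < new_count), "operator< is false here");

// Both proposed alternatives do fire in this situation.
static_assert (!(bb_count >= new_count), "negated >= is true here");
static_assert (bb_count < new_count || !new_count.initialized_p (),
               "explicit initialized_p check is true here");
```

So a < b and !(a >= b) are not interchangeable once either side can be
uninitialized.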

[Bug tree-optimization/112303] [14 Regression] ICE on valid code at -O3 on x86_64-linux-gnu: verify_flow_info failed since r14-3459-g0c78240fd7d519

2023-12-05 Thread rguenther at suse dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112303

--- Comment #8 from rguenther at suse dot de  ---
On Tue, 5 Dec 2023, jakub at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112303
> 
> Jakub Jelinek  changed:
> 
>What|Removed |Added
> 
>  CC||jakub at gcc dot gnu.org,
>||rguenth at gcc dot gnu.org
>Keywords|needs-bisection |
> 
> --- Comment #7 from Jakub Jelinek  ---
> Doesn't ICE since r14-6010-g2dde9f326ded84814a78c3044294b535c1f97b41
> No idea whether that was the fix for this or just something that made it
> latent.

I'm quite sure it just made it latent.

[Bug tree-optimization/112303] [14 Regression] ICE on valid code at -O3 on x86_64-linux-gnu: verify_flow_info failed since r14-3459-g0c78240fd7d519

2023-12-05 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112303

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org,
   ||rguenth at gcc dot gnu.org
   Keywords|needs-bisection |

--- Comment #7 from Jakub Jelinek  ---
Doesn't ICE since r14-6010-g2dde9f326ded84814a78c3044294b535c1f97b41
No idea whether that was the fix for this or just something that made it
latent.

[Bug tree-optimization/112303] [14 Regression] ICE on valid code at -O3 on x86_64-linux-gnu: verify_flow_info failed since r14-3459-g0c78240fd7d519

2023-12-05 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112303

Andrew Pinski  changed:

   What|Removed |Added

   Keywords||needs-bisection

--- Comment #6 from Andrew Pinski  ---
This seems to have been fixed recently.

[Bug tree-optimization/112303] [14 Regression] ICE on valid code at -O3 on x86_64-linux-gnu: verify_flow_info failed since r14-3459-g0c78240fd7d519

2023-10-31 Thread zhendong.su at inf dot ethz.ch via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112303

--- Comment #5 from Zhendong Su  ---
(In reply to Sam James from comment #3)
> (In reply to Zhendong Su from comment #0)
> > This appears to be a recent regression.
> > 
> 
> Out of interest, when you say this, do you have a rough range in mind? It'd
> make bisecting easier. Or do you just mean you surely would've hit it by now
> with your testing if it had been there a while?

By "This appears to be a recent regression", I typically mean, according to
Compiler Explorer, the bug is only reproduced with its current trunk build.

[Bug tree-optimization/112303] [14 Regression] ICE on valid code at -O3 on x86_64-linux-gnu: verify_flow_info failed since r14-3459-g0c78240fd7d519

2023-10-31 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112303

Richard Biener  changed:

   What|Removed |Added

   Priority|P3  |P1
Version|unknown |14.0

[Bug tree-optimization/112303] [14 Regression] ICE on valid code at -O3 on x86_64-linux-gnu: verify_flow_info failed since r14-3459-g0c78240fd7d519

2023-10-30 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112303

Sam James  changed:

   What|Removed |Added

   Keywords|needs-bisection |
 CC||hubicka at gcc dot gnu.org
Summary|[14 Regression] ICE on  |[14 Regression] ICE on
   |valid code at -O3 on|valid code at -O3 on
   |x86_64-linux-gnu:   |x86_64-linux-gnu:
   |verify_flow_info failed |verify_flow_info failed
   ||since
   ||r14-3459-g0c78240fd7d519

--- Comment #4 from Sam James  ---
bisect says:

commit 0c78240fd7d519fc27ca822f66a92f85edf43f70
Author: Jan Hubicka 
Date:   Thu Aug 24 15:10:46 2023 +0200

Check that passes do not forget to define profile

in r14-3459-g0c78240fd7d519. It's probably been there for a while. This has
popped up in a bunch of places naturally.