[Bug middle-end/113907] [14 regression] ICU miscompiled since on x86 since r14-5109-ga291237b628f41

2024-02-16 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113907

--- Comment #40 from Jakub Jelinek  ---
(In reply to Jan Hubicka from comment #39)
> This testcase
> #include 
> int data[100];
> 
> __attribute__((noinline))
> int bar (int d, unsigned int d2)
> {
>   if (d2 > 10)
> printf ("Bingo\n");
>   return d + d2;
> }

So, while d2 should have [17, 27] range when called from test2 and [17, 117]
from test,
I don't see anything that would limit ranges of data[?], it can be anything
(say some global constructor modifying the data array).  So bar has to return
VARYING.

[Bug middle-end/113907] [14 regression] ICU miscompiled since on x86 since r14-5109-ga291237b628f41

2024-02-16 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113907

--- Comment #39 from Jan Hubicka  ---
This testcase
#include 
int data[100];

__attribute__((noinline))
int bar (int d, unsigned int d2)
{
  if (d2 > 10)
printf ("Bingo\n");
  return d + d2;
}

int
test2 (unsigned int i)
{
  if (i > 10)
__builtin_unreachable ();
  if (__builtin_expect (data[i] != 0, 1))
return data[i];
  printf ("%i\n",i);
  for (int j = 0; j < 100; j++)
data[i] += bar (data[j], i+17);
  return data[i];
}
int
test (unsigned int i)
{
  if (i > 100)
__builtin_unreachable ();
  if (__builtin_expect (data[i] != 0, 1))
return data[i];
  printf ("%i\n",i);
  for (int j = 0; j < 100; j++)
data[i] += bar (data[j], i+17);
  return data[i];
}
int
main ()
{
  test (1);
  test (2);
  test (3);
  test2 (4);
  test2 (100);
  return 0;
}

gets me most of what I want to reproduce ipa-prop problem. Functions test and
test2 are split with different value ranges visible in the fnsplit dump. 
However curiously enough ipa-prop analysis seems to ignore the value ranges and
does not attach them to the jump function, which is odd...

[Bug middle-end/113907] [14 regression] ICU miscompiled since on x86 since r14-5109-ga291237b628f41

2024-02-15 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113907

--- Comment #38 from Jakub Jelinek  ---
(In reply to Jan Hubicka from comment #36)
> > > Having a testcase is great. I was just playing with crafting one.
> > > I am still concerned about value ranges in ipa-prop's jump functions.
> > 
> > Maybe my imagination is too limited, but if the ipa-prop's jump functions 
> > are
> > computed before ICF is performed, then if you call function a which makes 
> > some
> > assumptions about its arguments and SSA_NAMEs used within the function and
> > results in some return range
> > and another one which makes different assumptions and results in a different
> > range,
> > then even if the two functions are merged and either the range info is 
> > thrown
> > away
> > or unioned, I think if some functions call one or the other functions then
> > still the original range assumptions still apply, depending on which one
> > actually called.
> 
> I think the tescase should have functions A1 and A2 calling function B.
> We need to arrange A1 to ICF into A2 and make A2 to have narrower value
> range of parameter of B visible to ipa-prop. Since ICF happens first,
> ipa-prop will not see A1's value range and specialize B for A2's range.
> Which sould make it possible to trigger wrong code.

If you manage to get wrong ranges in such case, then reusing part of the above
testcase could help make it exploitable, with -minline-all-stringops (at least
in some tunings?) memcpy emits if (len < 8) rep movsb; else { if (dst & 1)
movsb; if (dst & 2) movsb; etc. } and so if incorrect value range results in
the len < 8 check to be optimized away, the rest misbehaves if destination is
aligned to 8 with misalignment 1 and length is smaller than 8.

[Bug middle-end/113907] [14 regression] ICU miscompiled since on x86 since r14-5109-ga291237b628f41

2024-02-15 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113907

--- Comment #37 from Jan Hubicka  ---
> Also remember we like to have a fix that's easily backportable, and
> that's probably going to be resetting the info.  We can do something
> more fancy for GCC 15

Rejecting to merge function with different vlaue ranges is also easy,
but so is reseting.  I only need to check the ipa-prop.
Jakub, also I can take over the patch - and sorry for beging slow. I had
visitor doing some research last two weeks and I am trying to catch up
now.

[Bug middle-end/113907] [14 regression] ICU miscompiled since on x86 since r14-5109-ga291237b628f41

2024-02-15 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113907

--- Comment #36 from Jan Hubicka  ---
> > Having a testcase is great. I was just playing with crafting one.
> > I am still concerned about value ranges in ipa-prop's jump functions.
> 
> Maybe my imagination is too limited, but if the ipa-prop's jump functions are
> computed before ICF is performed, then if you call function a which makes some
> assumptions about its arguments and SSA_NAMEs used within the function and
> results in some return range
> and another one which makes different assumptions and results in a different
> range,
> then even if the two functions are merged and either the range info is thrown
> away
> or unioned, I think if some functions call one or the other functions then
> still the original range assumptions still apply, depending on which one
> actually called.

I think the tescase should have functions A1 and A2 calling function B.
We need to arrange A1 to ICF into A2 and make A2 to have narrower value
range of parameter of B visible to ipa-prop. Since ICF happens first,
ipa-prop will not see A1's value range and specialize B for A2's range.
Which sould make it possible to trigger wrong code.
> 
> > Let me see if I can modify the testcase to also trigger problem with value
> > ranges in ipa-prop jump functions.
> > 
> > Not streaming value ranges is an omission on my side (I mistakely assumed we
> > do stream them).  We ought to stream them, since otherwise we will lose
> > propagated return value ranges in partitioned programs, which is pity.
> 
> Not sure if it is a good idea to start streaming them now in stage4, but sure,
> if we
> stream them (and I think we should mostly have code to be able to stream that

It probably makes sense to postpone VR stremaing for stage1.
(Even thouh it is not hard to do.)  I will add that to my TODO.

> because we can stream the ranges in the jump functions, just unsure about
> points-to stuff),
> then I still think best approach would be to merge regardless of range info,
> but in that case preferrably union instead of throw away.  The only question 
> is
> where to do the merging for the LTO case (where to stream the union of the
> ranges and where to read it in and update for the SSA_NAMEs).

With LTO ICF streams in all function bodies for comparsion to WPA and
keeps the winning one, so it is about extending the comparator to keep
track of difference of value ranges for all compared pairs.

There is code to merge profiles, so at that place one can also do other
kind of metadata merging.  ICF needs a lot of love on this - value
range merging is just one example.  We also want to merge TBAA (so we
can merge different template instatiations) and other stuff.

At the moment we handle all other metadat conservatively (i.e. do not
attempt to merge, just refuse merging if it differs) so value ranges are
first that are handled aggressively with your patch.

I think it is fine, since most value range will be recomputed in late
optimization - the only exceptions are the return value ranges for now.

I will try to work on making this more systematic next stage1.

[Bug middle-end/113907] [14 regression] ICU miscompiled since on x86 since r14-5109-ga291237b628f41

2024-02-15 Thread rguenther at suse dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113907

--- Comment #35 from rguenther at suse dot de  ---
On Thu, 15 Feb 2024, rguenther at suse dot de wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113907
> 
> --- Comment #34 from rguenther at suse dot de  ---
> On Thu, 15 Feb 2024, jakub at gcc dot gnu.org wrote:
> 
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113907
> > 
> > --- Comment #32 from Jakub Jelinek  ---
> > (In reply to Jan Hubicka from comment #31)
> [...]
> > > Not streaming value ranges is an omission on my side (I mistakely assumed 
> > > we
> > > do stream them).  We ought to stream them, since otherwise we will lose
> > > propagated return value ranges in partitioned programs, which is pity.
> > 
> > Not sure if it is a good idea to start streaming them now in stage4, but 
> > sure,
> 
> Please don't do that now ;)
> 
> (we've not had ranges early originally, and even nonzero bits were not
> computed)
> 
> But yes, IPA CP jump functions with value-ranges might be a problem.
> I think it does instantiate them as local ranges, does it?  We
> have streaming support for ranges, we just don't stream the auxiliary
> data for all SSA names (like also not points-to info).

Also remember we like to have a fix that's easily backportable, and
that's probably going to be resetting the info.  We can do something
more fancy for GCC 15

[Bug middle-end/113907] [14 regression] ICU miscompiled since on x86 since r14-5109-ga291237b628f41

2024-02-15 Thread rguenther at suse dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113907

--- Comment #34 from rguenther at suse dot de  ---
On Thu, 15 Feb 2024, jakub at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113907
> 
> --- Comment #32 from Jakub Jelinek  ---
> (In reply to Jan Hubicka from comment #31)
[...]
> > Not streaming value ranges is an omission on my side (I mistakely assumed we
> > do stream them).  We ought to stream them, since otherwise we will lose
> > propagated return value ranges in partitioned programs, which is pity.
> 
> Not sure if it is a good idea to start streaming them now in stage4, but sure,

Please don't do that now ;)

(we've not had ranges early originally, and even nonzero bits were not
computed)

But yes, IPA CP jump functions with value-ranges might be a problem.
I think it does instantiate them as local ranges, does it?  We
have streaming support for ranges, we just don't stream the auxiliary
data for all SSA names (like also not points-to info).

[Bug middle-end/113907] [14 regression] ICU miscompiled since on x86 since r14-5109-ga291237b628f41

2024-02-15 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113907

--- Comment #33 from Jakub Jelinek  ---
Anyway, should I work on the union variant or do you want to take this over?

[Bug middle-end/113907] [14 regression] ICU miscompiled since on x86 since r14-5109-ga291237b628f41

2024-02-15 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113907

--- Comment #32 from Jakub Jelinek  ---
(In reply to Jan Hubicka from comment #31)
> Having a testcase is great. I was just playing with crafting one.
> I am still concerned about value ranges in ipa-prop's jump functions.

Maybe my imagination is too limited, but if the ipa-prop's jump functions are
computed before ICF is performed, then if you call function a which makes some
assumptions about its arguments and SSA_NAMEs used within the function and
results in some return range
and another one which makes different assumptions and results in a different
range,
then even if the two functions are merged and either the range info is thrown
away
or unioned, I think if some functions call one or the other functions then
still the original range assumptions still apply, depending on which one
actually called.

> Let me see if I can modify the testcase to also trigger problem with value
> ranges in ipa-prop jump functions.
> 
> Not streaming value ranges is an omission on my side (I mistakely assumed we
> do stream them).  We ought to stream them, since otherwise we will lose
> propagated return value ranges in partitioned programs, which is pity.

Not sure if it is a good idea to start streaming them now in stage4, but sure,
if we
stream them (and I think we should mostly have code to be able to stream that
because we can stream the ranges in the jump functions, just unsure about
points-to stuff),
then I still think best approach would be to merge regardless of range info,
but in that case preferrably union instead of throw away.  The only question is
where to do the merging for the LTO case (where to stream the union of the
ranges and where to read it in and update for the SSA_NAMEs).

[Bug middle-end/113907] [14 regression] ICU miscompiled since on x86 since r14-5109-ga291237b628f41

2024-02-15 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113907

--- Comment #31 from Jan Hubicka  ---
Having a testcase is great. I was just playing with crafting one.
I am still concerned about value ranges in ipa-prop's jump functions.
Let me see if I can modify the testcase to also trigger problem with value
ranges in ipa-prop jump functions.

Not streaming value ranges is an omission on my side (I mistakely assumed we do
stream them).  We ought to stream them, since otherwise we will lose propagated
return value ranges in partitioned programs, which is pity.

[Bug middle-end/113907] [14 regression] ICU miscompiled since on x86 since r14-5109-ga291237b628f41

2024-02-15 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113907

Sam James  changed:

   What|Removed |Added

 Ever confirmed|0   |1
   Last reconfirmed||2024-02-15
 Status|UNCONFIRMED |NEW

[Bug middle-end/113907] [14 regression] ICU miscompiled since on x86 since r14-5109-ga291237b628f41

2024-02-14 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113907

--- Comment #30 from Jakub Jelinek  ---
Created attachment 57426
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57426=edit
gcc14-pr113907.patch

I've managed to come up with a small runtime testcase.
Now with a patch which does the resetting of SSA_NAME_{PTR,RANGE}_INFO for
!flag_wpa in ICF merged functions (in the hope that we regenerate it during
vrp1).

[Bug middle-end/113907] [14 regression] ICU miscompiled since on x86 since r14-5109-ga291237b628f41

2024-02-14 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113907

Jan Hubicka  changed:

   What|Removed |Added

 CC||hubicka at gcc dot gnu.org

--- Comment #29 from Jan Hubicka  ---
Safest fix is to make equals_p to reject merging functions with different value
ranges assigned to corresponding SSA names.  I would hope that, since early
opts are still mostly local, that does not lead to very large degradation. This
is lame of course.

If we go for smarter merging, we need to also handle ipa-prop jump functions. 
In that case I think equals_p needs to check if value range sin SSA_NAMES and
jump functions differs and if so, keep that noted so the merging code can do
corresponding update.  I will check how hard it is to implement this. 
(Equality handling is Martin Liska's code, but if I recall right, each
equivalence class has a leader, and we can keep track if there are some
differences WRT that leader, but I do not recall how subdivision of equivalence
classes is handled).

[Bug middle-end/113907] [14 regression] ICU miscompiled since on x86 since r14-5109-ga291237b628f41

2024-02-14 Thread rguenther at suse dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113907

--- Comment #28 from rguenther at suse dot de  ---
On Wed, 14 Feb 2024, jakub at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113907
> 
> --- Comment #27 from Jakub Jelinek  ---
> So:
> --- gcc/ipa-icf.cc.jj   2024-02-10 11:25:09.645478952 +0100
> +++ gcc/ipa-icf.cc  2024-02-14 10:44:27.906216458 +0100
> @@ -1244,6 +1244,29 @@ sem_function::merge (sem_item *alias_ite
>else
>  create_alias = true;
> 
> +  unsigned i;
> +  tree name;
> +  FOR_EACH_SSA_NAME (i, name, original->get_fun ())
> +{
> +  /* We need to either merge or reset SSA_NAME_*_INFO.
> +For merging we don't preserve the mapping between
> +original and alias SSA_NAMEs from successful equals
> +calls.  */
> +  if (POINTER_TYPE_P (TREE_TYPE (name)))
> +{
> + if (SSA_NAME_PTR_INFO (name))
> +   {
> + gcc_assert (!flag_wpa);
> + SSA_NAME_PTR_INFO (name) = NULL;
> +   }
> +}
> +  else if (SSA_NAME_RANGE_INFO (name))
> +   {
> + gcc_assert (!flag_wpa);
> + SSA_NAME_RANGE_INFO (name) = NULL;
> +   }
> +}
> +
>if (redirect_callers)
>  {
>int nredirected = redirect_all_callers (alias, local_original);
> then?

Yeah, though can't we do this in the caller and thus only once for the
target when we merge more than 2 nodes?

  For the merging, I guess we'd need to move one of the 2 vec vectors
> from m_checker to the sem_function instead of throwing it away in
> sem_function::equals
> if returning true.

It might require quite some amount of memory though when N is big,
I'm not sure it's worth it?

[Bug middle-end/113907] [14 regression] ICU miscompiled since on x86 since r14-5109-ga291237b628f41

2024-02-14 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113907

--- Comment #27 from Jakub Jelinek  ---
So:
--- gcc/ipa-icf.cc.jj   2024-02-10 11:25:09.645478952 +0100
+++ gcc/ipa-icf.cc  2024-02-14 10:44:27.906216458 +0100
@@ -1244,6 +1244,29 @@ sem_function::merge (sem_item *alias_ite
   else
 create_alias = true;

+  unsigned i;
+  tree name;
+  FOR_EACH_SSA_NAME (i, name, original->get_fun ())
+{
+  /* We need to either merge or reset SSA_NAME_*_INFO.
+For merging we don't preserve the mapping between
+original and alias SSA_NAMEs from successful equals
+calls.  */
+  if (POINTER_TYPE_P (TREE_TYPE (name)))
+{
+ if (SSA_NAME_PTR_INFO (name))
+   {
+ gcc_assert (!flag_wpa);
+ SSA_NAME_PTR_INFO (name) = NULL;
+   }
+}
+  else if (SSA_NAME_RANGE_INFO (name))
+   {
+ gcc_assert (!flag_wpa);
+ SSA_NAME_RANGE_INFO (name) = NULL;
+   }
+}
+
   if (redirect_callers)
 {
   int nredirected = redirect_all_callers (alias, local_original);
then?  For the merging, I guess we'd need to move one of the 2 vec vectors
from m_checker to the sem_function instead of throwing it away in
sem_function::equals
if returning true.

[Bug middle-end/113907] [14 regression] ICU miscompiled since on x86 since r14-5109-ga291237b628f41

2024-02-14 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113907

--- Comment #26 from Richard Biener  ---
(In reply to Jakub Jelinek from comment #25)
> So, to sum up what has been discussed on IRC, LTO streaming doesn't seem to
> stream SSA_NAME_INFO, only the final IPA phases of say IPA-VRP can set
> SSA_NAME_INFO.
> Thus, this bug is most likely solely about -fno-lto behavior of IPA-ICF.
> 
> Supposedly there we can change the SSA_NAME_INFO upon final decision to
> merge two functions.  I'd say it is in sem_function::merge above if
> (redirect_callers).
> And guard with !flag_wpa (or for flag_wpa assert all SSA_NAME_INFO is NULL).
> 
> Except that by the time sem_function::merge is called, m_checker with its
> mapping between SSA_NAME_VERSION is unfortunately gone, so we'd need to
> preserve it somewhere.

Or go and throw away all ranges from the final merge target?

[Bug middle-end/113907] [14 regression] ICU miscompiled since on x86 since r14-5109-ga291237b628f41

2024-02-14 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113907

Richard Biener  changed:

   What|Removed |Added

   Priority|P3  |P1

[Bug middle-end/113907] [14 regression] ICU miscompiled since on x86 since r14-5109-ga291237b628f41

2024-02-14 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113907

--- Comment #25 from Jakub Jelinek  ---
So, to sum up what has been discussed on IRC, LTO streaming doesn't seem to
stream SSA_NAME_INFO, only the final IPA phases of say IPA-VRP can set
SSA_NAME_INFO.
Thus, this bug is most likely solely about -fno-lto behavior of IPA-ICF.

Supposedly there we can change the SSA_NAME_INFO upon final decision to merge
two functions.  I'd say it is in sem_function::merge above if
(redirect_callers).
And guard with !flag_wpa (or for flag_wpa assert all SSA_NAME_INFO is NULL).

Except that by the time sem_function::merge is called, m_checker with its
mapping between SSA_NAME_VERSION is unfortunately gone, so we'd need to
preserve it somewhere.

[Bug middle-end/113907] [14 regression] ICU miscompiled since on x86 since r14-5109-ga291237b628f41

2024-02-13 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113907

--- Comment #24 from Jakub Jelinek  ---
static inline int
foo (int len, void *indata, void *outdata)
{
  if (len < 0 || (len & 7) != 0)
return 0;
  if (len != 0 && indata != outdata)
__builtin_memcpy (outdata, indata, len);
  return len;
}

static inline int
bar (int len, void *indata, void *outdata)
{
  if (len < 0 || (len & 1) != 0)
return 0;
  if (len != 0 && indata != outdata)
__builtin_memcpy (outdata, indata, len);
  return len;
}

int (*volatile p1) (int, void *, void *) = foo;
int (*volatile p2) (int, void *, void *) = bar;

__attribute__((noipa)) int
baz (int len, void *indata, void *outdata)
{
  if ((len & 6) != 0)
bar (len, indata, outdata);
  else
foo (len, indata, outdata);
}

int
main ()
{
  char buf[13] = "abcdefghijkl";
  p2 (6, buf, buf + 6);
  if (__builtin_strcmp (buf, "abcdefabcdef"))
__builtin_abort ();
}

Reproduces the wrong range in bar, but still doesn't crash nor abort.  So I
probably need some different size of the copying.

[Bug middle-end/113907] [14 regression] ICU miscompiled since on x86 since r14-5109-ga291237b628f41

2024-02-13 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113907

--- Comment #23 from Jakub Jelinek  ---
(In reply to Andrew Pinski from comment #22)
> (In reply to Andrew Pinski from comment #21)
> > Yep that is the same descriptions as what is happening in PR 113665.
> 
> Specifically https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113665#c5 .

Indeed.  But this bug is a P1 while the other one P2.

[Bug middle-end/113907] [14 regression] ICU miscompiled since on x86 since r14-5109-ga291237b628f41

2024-02-13 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113907

--- Comment #22 from Andrew Pinski  ---
(In reply to Andrew Pinski from comment #21)
> Yep that is the same descriptions as what is happening in PR 113665.

Specifically https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113665#c5 .

[Bug middle-end/113907] [14 regression] ICU miscompiled since on x86 since r14-5109-ga291237b628f41

2024-02-13 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113907

--- Comment #21 from Andrew Pinski  ---
(In reply to Jakub Jelinek from comment #20)
> Ah, I think it is IPA-ICF.
> What happens is that fnsplit splits the uprv_copyArray{16,32,64} functions,
> where the
> inline part does what is actually different among the functions,
...
> Now IPA-ICF comes, doesn't care about ranges, merges all 3 into one and
> picks there
> the most unfortunate case, the case from the copyData64 part with N 8.
> Later on we inline this back into the functions, the function splitting was
> useless.

Yep that is the same descriptions as what is happening in PR 113665.

[Bug middle-end/113907] [14 regression] ICU miscompiled since on x86 since r14-5109-ga291237b628f41

2024-02-13 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113907

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jamborm at gcc dot gnu.org,
   ||rguenth at gcc dot gnu.org

--- Comment #20 from Jakub Jelinek  ---
Ah, I think it is IPA-ICF.
What happens is that fnsplit splits the uprv_copyArray{16,32,64} functions,
where the
inline part does what is actually different among the functions, i.e. the
tests which among other do the early out for
length<0 || (length&1)!=0
or
length<0 || (length&3)!=0
or
length<0 || (length&7)!=0
among other things, and then the *.part.0 parts which are exactly the same for
the 3 functions IL wise, except for the unfortunately differences in the
ranges.
So, essentially just
   [local count: 132713219]:
  _2 = length_1(D) != 0;
  _5 = inData_3(D) != outData_4(D);
  _6 = _2 & _5;
  if (_6 != 0)
goto ; [33.00%]
  else
goto ; [67.00%]

   [local count: 43795362]:
  # RANGE [irange] unsigned int [N, 2147483647] MASK 0x7ffe VALUE 0x0
  length.43_7 = (unsigned int) length_1(D);
  # USE = anything
  # CLB = anything
  memcpy (outData_4(D), inData_3(D), length.43_7);

   [local count: 132713219]:

   [local count: 132713219]:
  # RANGE [irange] int [0, 0][N, +INF] MASK 0x7fff VALUE 0x0
  # _8 = PHI 
  return _8;
where N is 2, 4 or 8.
Now IPA-ICF comes, doesn't care about ranges, merges all 3 into one and picks
there
the most unfortunate case, the case from the copyData64 part with N 8.
Later on we inline this back into the functions, the function splitting was
useless.

So, the question is what to do in IPA-ICF.  Should we punt on range
differences, reset all ranges or try to merge the ranges from all the functions
merged together?
I think the last would be best.

[Bug middle-end/113907] [14 regression] ICU miscompiled since on x86 since r14-5109-ga291237b628f41

2024-02-13 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113907

--- Comment #19 from Jakub Jelinek  ---
Diffing the -fdump-{tree,ipa}-all-alias dumps from r14-5108 and r14-5109,
085i.fnsummary still looks good (the 2/4/8 in the ranges corresponds to whether
it is in 16/32/64 suffixed/infixed function).
But the inline dump already contains presumably bogus
@@ -2426,7 +2426,7 @@ int32_t uprv_copyArray16 (const struct U
 goto ; [67.00%]

[local count: 43795362]:
-  # RANGE [irange] unsigned int [1, 2147483647] MASK 0x7ff8 VALUE 0x0
+  # RANGE [irange] unsigned int [8, 2147483647] MASK 0x7ff8 VALUE 0x0
   length.37_21 = (unsigned int) length_14(D);
   # USE = anything 
   # CLB = anything 
@@ -2437,7 +2437,7 @@ int32_t uprv_copyArray16 (const struct U
   _7 = _25;

[local count: 1073741824]:
-  # RANGE [irange] int32_t [0, +INF] MASK 0x7fff VALUE 0x0
+  # RANGE [irange] int32_t [0, 0][2, +INF] MASK 0x7fff VALUE 0x0
   # _6 = PHI <0(4), _7(12)>
   return _6;

and
@@ -2660,7 +2660,7 @@ int32_t uprv_copyArray32 (const struct U
 goto ; [67.00%]

[local count: 43795362]:
-  # RANGE [irange] unsigned int [1, 2147483647] MASK 0x7ff8 VALUE 0x0
+  # RANGE [irange] unsigned int [8, 2147483647] MASK 0x7ff8 VALUE 0x0
   length.37_21 = (unsigned int) length_14(D);
   # USE = anything 
   # CLB = anything 
@@ -2671,7 +2671,7 @@ int32_t uprv_copyArray32 (const struct U
   _7 = _25;

[local count: 1073741824]:
-  # RANGE [irange] int32_t [0, +INF] MASK 0x7fff VALUE 0x0
+  # RANGE [irange] int32_t [0, 0][4, +INF] MASK 0x7fff VALUE 0x0
   # _6 = PHI <0(4), _7(12)>
   return _6;
hunks, the return in those cases looks still right, but the length before
memcpy looks wrong.

[Bug middle-end/113907] [14 regression] ICU miscompiled since on x86 since r14-5109-ga291237b628f41

2024-02-13 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113907

Andrew Pinski  changed:

   What|Removed |Added

   See Also||https://gcc.gnu.org/bugzill
   ||a/show_bug.cgi?id=113665

--- Comment #18 from Andrew Pinski  ---
(In reply to Jakub Jelinek from comment #17)
> What seems wrong is that the 8 in the ranges somehow leaks into copyData16
> during IPA.
> IPA-ICF?

If IPA-ICF, then PR 113665 is similar in nature.

[Bug middle-end/113907] [14 regression] ICU miscompiled since on x86 since r14-5109-ga291237b628f41

2024-02-13 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113907

--- Comment #17 from Jakub Jelinek  ---
One set of range changes in evrp is more precise ranges in return values of
uprv_swapArray{16,32,64}, those contain early return 0 if
length<0 || (length&1)!=0
or
length<0 || (length&3)!=0
or
length<0 || (length&7)!=0
so changing those return ranges from [0, +INF] to [0, 0][4, +INF] (for the
second, 2 or 8 instead of 4 for the first/last) seems completely reasonable.
And uprv_copyArray{16,32,64} has something very similar,
length<0 || (length&{1,3,7})!=0 early exit and length > 0 guarded memcpy.
So, even there it is completely reasonable to change
-  # RANGE [irange] unsigned int [1, 2147483647] MASK 0x7ff8 VALUE 0x0
+  # RANGE [irange] unsigned int [8, 2147483647] MASK 0x7ff8 VALUE 0x0
   length.37_10 = (unsigned int) length_25(D);
   memcpy (outData_26(D), inData_24(D), length.37_10);
for the copy and
-  # RANGE [irange] int32_t [0, +INF] MASK 0x7fff VALUE 0x0
+  # RANGE [irange] int32_t [0, 0][8, +INF] MASK 0x7fff VALUE 0x0
   # _12 = PHI <0(4), 0(9), length_25(D)(12)>
   return _12;
for the return, but naturally copyData16 uses 2 instead of 8 and copyData32 4
instead of 8.

What seems wrong is that the 8 in the ranges somehow leaks into copyData16
during IPA.
IPA-ICF?

[Bug middle-end/113907] [14 regression] ICU miscompiled since on x86 since r14-5109-ga291237b628f41

2024-02-13 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113907

--- Comment #16 from Sam James  ---
(In reply to Jakub Jelinek from comment #14
> So it certainly doesn't surprise me some length < 8 check is optimized away
> given the above range info.  The question is if it is correct and what
> values the length actually get at runtime if you e.g. compile with -O0.

(gdb) b uprv_copyArray16
Breakpoint 1 at 0xf770e71b: file udataswp.cpp, line 147.
(gdb) b uprv_copyArray64
Breakpoint 2 at 0xf770e7ba: file udataswp.cpp, line 164.
(gdb) r

Breakpoint 1, uprv_copyArray16 (ds=0x5655e4f0, inData=0xf32a4010, length=2,
outData=0xf7f24094, pErrorCode=0xc47c) at udataswp.cpp:147
147 if(pErrorCode==nullptr || U_FAILURE(*pErrorCode)) {
(gdb) c
Continuing.

Breakpoint 1, uprv_copyArray16 (ds=0x5655e4f0, inData=0xf32a4014, length=4,
outData=0xf7f24098, pErrorCode=0xc47c) at udataswp.cpp:147
147 if(pErrorCode==nullptr || U_FAILURE(*pErrorCode)) {

(gdb) c
Continuing.
[Inferior 1 (process 2861598) exited normally]
(gdb)

Oops.

[Bug middle-end/113907] [14 regression] ICU miscompiled since on x86 since r14-5109-ga291237b628f41

2024-02-13 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113907

--- Comment #15 from Andrew Pinski  ---
Note the range part looks correct when taking the mask into account so I am
suspecting the mask is messed up. Let me see if I could spot it later today.

[Bug middle-end/113907] [14 regression] ICU miscompiled since on x86 since r14-5109-ga291237b628f41

2024-02-13 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113907

--- Comment #14 from Jakub Jelinek  ---
There are significant differences in the ranges starting with evrp.
Even optimized has:
--- pr113907.ii.261t.optimized_ 2024-02-13 09:52:13.090609512 -0500
+++ pr113907.ii.261t.optimized  2024-02-13 09:53:35.691479080 -0500
@@ -576,7 +576,7 @@ int32_t uprv_copyArray16 (const struct U
 goto ; [67.00%]

[local count: 43795362]:
-  # RANGE [irange] unsigned int [2, 2147483647] MASK 0x7ff8 VALUE 0x0
+  # RANGE [irange] unsigned int [8, 2147483647] MASK 0x7ff8 VALUE 0x0
   length.37_21 = (unsigned int) length_14(D);
   memcpy (outData_15(D), inData_13(D), length.37_21);

@@ -676,14 +676,14 @@ int32_t uprv_copyArray64 (const struct U
 goto ; [67.00%]

[local count: 43795362]:
-  # RANGE [irange] unsigned int [1, 2147483647] MASK 0x7ff8 VALUE 0x0
+  # RANGE [irange] unsigned int [8, 2147483647] MASK 0x7ff8 VALUE 0x0
   length.37_21 = (unsigned int) length_14(D);
   memcpy (outData_15(D), inData_13(D), length.37_21);

[local count: 88917857]:

[local count: 1073741824]:
-  # RANGE [irange] int32_t [0, +INF] MASK 0x7fff VALUE 0x0
+  # RANGE [irange] int32_t [0, 0][8, +INF] MASK 0x7fff VALUE 0x0
   # _6 = PHI <0(7), length_14(D)(10)>
   return _6;

@@ -776,14 +776,14 @@ int32_t uprv_copyArray32 (const struct U
 goto ; [67.00%]

[local count: 43795362]:
-  # RANGE [irange] unsigned int [1, 2147483647] MASK 0x7ff8 VALUE 0x0
+  # RANGE [irange] unsigned int [8, 2147483647] MASK 0x7ff8 VALUE 0x0
   length.37_21 = (unsigned int) length_14(D);
   memcpy (outData_15(D), inData_13(D), length.37_21);

[local count: 88917857]:

[local count: 1073741824]:
-  # RANGE [irange] int32_t [0, +INF] MASK 0x7fff VALUE 0x0
+  # RANGE [irange] int32_t [0, 0][4, +INF] MASK 0x7fff VALUE 0x0
   # _6 = PHI <0(7), length_14(D)(10)>
   return _6;

So it certainly doesn't surprise me some length < 8 check is optimized away
given the above range info.  The question is if it is correct and what values
the length actually get at runtime if you e.g. compile with -O0.

[Bug middle-end/113907] [14 regression] ICU miscompiled since on x86 since r14-5109-ga291237b628f41

2024-02-13 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113907

--- Comment #13 from Sam James  ---
(In reply to Andrew Pinski from comment #11)
> Does -fwrapv help?

no (and ubsan suppresses the crash, everything works fine w/ no ubsan output)

[Bug middle-end/113907] [14 regression] ICU miscompiled since on x86 since r14-5109-ga291237b628f41

2024-02-13 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113907

--- Comment #12 from Sam James  ---
-O2 -march=i686 -std=c++11 -m32 -fPIC

[Bug middle-end/113907] [14 regression] ICU miscompiled since on x86 since r14-5109-ga291237b628f41

2024-02-13 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113907

--- Comment #11 from Andrew Pinski  ---
Does -fwrapv help?

If so does -fsanitize=undefined say anything?

[Bug middle-end/113907] [14 regression] ICU miscompiled since on x86 since r14-5109-ga291237b628f41

2024-02-13 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113907

--- Comment #10 from Jakub Jelinek  ---
g++ command line for that TU?
-O2 -march=i686 -std=c++14 -m32
?

[Bug middle-end/113907] [14 regression] ICU miscompiled since on x86 since r14-5109-ga291237b628f41

2024-02-13 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113907

--- Comment #9 from Sam James  ---
(In reply to Sam James from comment #4)
> Created attachment 57409 [details]
> udataswp.ii.xz
> 
> It goes wrong in common/udataswp.cpp's uprv_copyArray16 and uprv_copyArray64.
> 

ah, uprv_copyArray64 -> uprv_copyArray16 as the memcpy impl

[Bug middle-end/113907] [14 regression] ICU miscompiled since on x86 since r14-5109-ga291237b628f41

2024-02-13 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113907

--- Comment #8 from Sam James  ---
Created attachment 57413
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57413=edit
udataswp.o (bad, r14-5109-ga291237b628f41)

[Bug middle-end/113907] [14 regression] ICU miscompiled since on x86 since r14-5109-ga291237b628f41

2024-02-13 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113907

--- Comment #7 from Sam James  ---
Created attachment 57412
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57412=edit
udataswp.o (good, r14-5108-g751fc7bcdcdf25)

[Bug middle-end/113907] [14 regression] ICU miscompiled since on x86 since r14-5109-ga291237b628f41

2024-02-13 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113907

--- Comment #6 from Sam James  ---
Created attachment 57411
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57411=edit
udataswp.cpp.262r.expand (bad)

[Bug middle-end/113907] [14 regression] ICU miscompiled since on x86 since r14-5109-ga291237b628f41

2024-02-13 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113907

--- Comment #5 from Sam James  ---
Created attachment 57410
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57410=edit
udataswp.cpp.262r.expand (good)

[Bug middle-end/113907] [14 regression] ICU miscompiled since on x86 since r14-5109-ga291237b628f41

2024-02-13 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113907

--- Comment #4 from Sam James  ---
Created attachment 57409
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57409=edit
udataswp.ii.xz

It goes wrong in common/udataswp.cpp's uprv_copyArray16 and uprv_copyArray64.

(Seemingly both of them need to be pragma'd but I'll check again later as it's
a bit weird.)

The first differing pass is 262r.expand.

```
[...]
-(insn # # # (set (reg:CC 17 flags)
-(compare:CC (reg:SI 133)
-(const_int 8 [0x8]))) "udataswp.cpp":173:9#
- (nil))
-
-(jump_insn # # # (set (pc)
-(if_then_else (ltu (reg:CC 17 flags)
-(const_int 0 [0]))
-(label_ref #)
-(pc))) "udataswp.cpp":173:9#
- (int_list:REG_BR_PROB 644245100 (nil)))
[...]
@@ -6876,15 +6862,11 @@ Merged 2 and 3 without moving.
 Edge 7->9 redirected to 10
 Forwarding edge 8->9 to 10 failed.
 Deleted label in block 9.
-Redirecting jump 51 from 24 to 25.
-Edge 19->21 redirected to 22
-Merging block 21 into block 20...
-Merged blocks 20 and 21.
-Merged 20 and 21 without moving.
-Merging block 24 into block 23...
-Merged blocks 23 and 24.
-Merged 23 and 24 without moving.
-Removing jump 128.
+Redirecting jump 51 from 22 to 23.
+Merging block 22 into block 21...
+Merged blocks 21 and 22.
+Merged 21 and 22 without moving.
+Removing jump 123.
[...]
```

[Bug middle-end/113907] [14 regression] ICU miscompiled since on x86 since r14-5109-ga291237b628f41

2024-02-13 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113907

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org

--- Comment #3 from Jakub Jelinek  ---
(In reply to Sam James from comment #1)
> I will do the usual bisection of objects and also narrow it down with
> pragmas. I won't be able to get much further than that though, I suspect.

That will be definitely useful + preprocessed source for the corresponding TU.
LTO isn't involved and hopefully it won't change IL everywhere in the TU.

[Bug middle-end/113907] [14 regression] ICU miscompiled since on x86 since r14-5109-ga291237b628f41

2024-02-13 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113907

Richard Biener  changed:

   What|Removed |Added

   Target Milestone|--- |14.0

[Bug middle-end/113907] [14 regression] ICU miscompiled since on x86 since r14-5109-ga291237b628f41

2024-02-13 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113907

--- Comment #2 from Sam James  ---
Program received signal SIGSEGV, Segmentation fault.
0xf770e5c0 in memcpy (__dest=, __src=,
__len=) at /usr/include/bits/string_fortified.h:29
29return __builtin___memcpy_chk (__dest, __src, __len,
(gdb) bt
#0  0xf770e5c0 in memcpy (__dest=, __src=,
__len=) at /usr/include/bits/string_fortified.h:29
#1  uprv_copyArray64 (ds=, pErrorCode=,
inData=, length=, outData=0xf7f21094) at
udataswp.cpp:172
#2  uprv_copyArray16 (ds=, inData=,
length=, outData=, pErrorCode=) at
udataswp.cpp:160
#3  0xf770eac5 in udata_swapDataHeader (ds=, inData=, length=, outData=, pErrorCode=) at udataswp.cpp:342
#4  0xf7f65cb2 in icu::Package::readPackage (this=this@entry=0xf7f21010,
filename=filename@entry=0xcc82 "./in/icudt74l.dat") at package.cpp:483
#5  0x56556850 in main (argc=2, argv=0xca14) at icupkg.cpp:335
(gdb)

[Bug middle-end/113907] [14 regression] ICU miscompiled since on x86 since r14-5109-ga291237b628f41

2024-02-13 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113907

--- Comment #1 from Sam James  ---
I will do the usual bisection of objects and also narrow it down with pragmas.
I won't be able to get much further than that though, I suspect.

-fsanitize=address,undefined builds fine (i.e. it doesn't even segfault).