[Bug tree-optimization/109791] -Wstringop-overflow warning with -O3 and _GLIBCXX_USE_CXX11_ABI=0

2023-06-07 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109791

--- Comment #17 from Andrew Pinski  ---
(In reply to Andrew Pinski from comment #16)
> (In reply to Richard Biener from comment #15)
> > Created attachment 55155 [details]
> > patch unfolding such PHIs
> > 
> > Updated PHI unfolding patch.  Tests fine besides mentioned diagnostic
> > regressions.
> 
> I was looking into doing the opposite in forwprop but maybe I can skip
> addresses.

Oh yes I see it was mentioned before in PR 102138.

[Bug tree-optimization/109791] -Wstringop-overflow warning with -O3 and _GLIBCXX_USE_CXX11_ABI=0

2023-05-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109791

--- Comment #16 from Andrew Pinski  ---
(In reply to Richard Biener from comment #15)
> Created attachment 55155 [details]
> patch unfolding such PHIs
> 
> Updated PHI unfolding patch.  Tests fine besides mentioned diagnostic
> regressions.

I was looking into doing the opposite in forwprop but maybe I can skip
addresses.

[Bug tree-optimization/109791] -Wstringop-overflow warning with -O3 and _GLIBCXX_USE_CXX11_ABI=0

2023-05-25 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109791

Richard Biener  changed:

   What|Removed |Added

   Assignee|rguenth at gcc dot gnu.org |unassigned at gcc dot gnu.org
     Status|ASSIGNED |NEW

[Bug tree-optimization/109791] -Wstringop-overflow warning with -O3 and _GLIBCXX_USE_CXX11_ABI=0

2023-05-25 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109791

Richard Biener  changed:

   What|Removed |Added

  Attachment #55047 is obsolete|0 |1

--- Comment #15 from Richard Biener  ---
Created attachment 55155
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55155&action=edit
patch unfolding such PHIs

Updated PHI unfolding patch.  Tests fine besides mentioned diagnostic
regressions.

[Bug tree-optimization/109791] -Wstringop-overflow warning with -O3 and _GLIBCXX_USE_CXX11_ABI=0

2023-05-25 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109791

--- Comment #14 from Richard Biener  ---
So one issue with the unfolding of PHIs is that for example
gcc.dg/warn-sprintf-no-nul.c has

const char a2[][3] = {
  "", "1", "12", "123", "123\000"
};

and for

 # str_1 = PHI <[2], [3]>

we can determine bounds on the string length of str_1 by unioning the
string lengths of [2] and [3].  But with

 # off_2 = PHI <6, 9>
 str_1 =  + off_2;

this isn't possible.  In fact, get_range_strlen doesn't handle POINTER_PLUS_EXPR,
and while it might be possible to handle "foo" + off_2 by looking at the
range of off_2, for example, the above case of referring to two different
strings rather than offsetting within one string isn't distinguishable.

I've also figured that when one PHI argument has zero offset (aka plain )
then PRE tends to undo the transform since  + 0 is readily available
on that edge and thus it inserts pointer adjustments on the other edges.

So while it looked like the easy way out of the ranger limitation, it's
not a viable solution (because it regresses testcases).
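
A minimal sketch of the precision loss described above, under stated assumptions: `rows`, `strlen_bounds` and `strlen_bounds_over_offsets` are hypothetical names, the table uses 4-byte rows so that every row is NUL-terminated (unlike a2 in the testcase), and the second helper models an analysis that knows only a range for the offset. It is not how get_range_strlen is implemented.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <utility>

// Hypothetical table; each row is 4 bytes and always NUL-terminated.
static const char rows[][4] = { "", "1", "12", "123" };

// PHI of two row addresses: the bounds are the union of both row lengths.
inline std::pair<std::size_t, std::size_t>
strlen_bounds(const char *a, const char *b) {
    const std::size_t la = std::strlen(a), lb = std::strlen(b);
    return { std::min(la, lb), std::max(la, lb) };
}

// Base plus an offset known only as the range [lo, hi]: every offset in
// the range has to be considered, including ones inside a row.
inline std::pair<std::size_t, std::size_t>
strlen_bounds_over_offsets(std::size_t lo, std::size_t hi) {
    const char *flat = reinterpret_cast<const char *>(&rows);
    std::size_t mn = std::strlen(flat + lo), mx = mn;
    for (std::size_t off = lo + 1; off <= hi; ++off) {
        const std::size_t len = std::strlen(flat + off);
        mn = std::min(mn, len);
        mx = std::max(mx, len);
    }
    return { mn, mx };
}
```

With these rows, `strlen_bounds(rows[2], rows[3])` gives the bounds 2 and 3, while `strlen_bounds_over_offsets(8, 12)` covers the same two rows but widens the lower bound to 0, because offsets 10 and 11 point at NUL bytes inside row 2.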

[Bug tree-optimization/109791] -Wstringop-overflow warning with -O3 and _GLIBCXX_USE_CXX11_ABI=0

2023-05-12 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109791

--- Comment #13 from CVS Commits  ---
The master branch has been updated by Richard Biener :

https://gcc.gnu.org/g:560a3e35fe01c499bd5b1e95ddc4c3e958cf5abd

commit r14-785-g560a3e35fe01c499bd5b1e95ddc4c3e958cf5abd
Author: Richard Biener 
Date:   Thu May 11 14:28:11 2023 +0200

tree-optimization/109791 - simplify (unsigned) - (unsigned)( + o)

The following adds another variant of address difference simplification.
The utility ptr_difference_const only handles constant differences
(we also cannot code generate anything else), so exposing a possible
POINTER_PLUS_EXPR in the match and computing the difference on the
base only makes it possible to handle one case of a variable offset.
This simplifies

(unsigned long)   [(void *) + 2B] - (unsigned long) (
+ (_69 + 1))

down to (1 - (unsigned long) _69) during niter analysis, allowing
ranger to eliminate a condition later and avoiding a bogus
-Wstringop-overflow diagnostic for the testcase in the PR.

PR tree-optimization/109791
* match.pd (minus (convert ADDR_EXPR@0) (convert (pointer_plus @1 @2))):
New pattern.
(minus (convert (pointer_plus @1 @2)) (convert ADDR_EXPR@0)):
Likewise.

[Bug tree-optimization/109791] -Wstringop-overflow warning with -O3 and _GLIBCXX_USE_CXX11_ABI=0

2023-05-11 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109791

Richard Biener  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |rguenth at gcc dot gnu.org

--- Comment #12 from Richard Biener  ---
So let me see if this all works out reasonably.

[Bug tree-optimization/109791] -Wstringop-overflow warning with -O3 and _GLIBCXX_USE_CXX11_ABI=0

2023-05-11 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109791

--- Comment #11 from Richard Biener  ---
Created attachment 55049
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55049&action=edit
patch for extra pointer difference patterns

[Bug tree-optimization/109791] -Wstringop-overflow warning with -O3 and _GLIBCXX_USE_CXX11_ABI=0

2023-05-11 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109791

--- Comment #10 from Richard Biener  ---
Created attachment 55048
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55048&action=edit
patch for niter expression expansion

[Bug tree-optimization/109791] -Wstringop-overflow warning with -O3 and _GLIBCXX_USE_CXX11_ABI=0

2023-05-11 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109791

--- Comment #9 from Richard Biener  ---
So the first pass that makes things difficult is reassoc which transforms

  _51 = (unsigned long)   [(void *) + 2B];
  _4 = (unsigned long) __i_44;
  _65 = _51 - _4;
  _48 = _65 + 18446744073709551615;
  _46 = _48 > 13;
  if (_46 != 0)

to

  _51 = (unsigned long)   [(void *) + 2B];
  _4 = (unsigned long) __i_44;
  _12 = -_4;
  _119 = _51 + 18446744073709551615;
  _48 = _119 - _4;
  _46 = _48 > 13;
  if (_46 != 0)

(also leaving garbage around).  That's probably because of the PHI biasing
and __i_44 being defined by a PHI.  niter analysis produces

# of iterations (((unsigned long)   [(void *) + 2B] -
(unsigned long) __i_44) + 18446744073709551615) / 2, bounded by
9223372036854775807

but doesn't expand __i_44 (not sure if we'd fold the thing then).  The
gimplifier folds stmts w/o following SSA edges when we eventually
re-gimplify those expressions for insertion.  If we expand offsetting of
invariant bases we get instead

# of iterations ((unsigned long)   [(void *) + 2B] - (unsigned
long) ( + (_69 + 1))) / 2, bounded by 9223372036854775807

but that's still not simplified.  I think ptr_difference_const should
handle this - of course the difference here isn't const...

/* Try folding difference of addresses.  */
(simplify
 (minus (convert ADDR_EXPR@0) (convert @1))
 (if (tree_nop_conversion_p (type, TREE_TYPE (@0)))
  (with { poly_int64 diff; }
   (if (ptr_difference_const (@0, @1, &diff))
    { build_int_cst_type (type, diff); }))))
(simplify
 (minus (convert @0) (convert ADDR_EXPR@1))
 (if (tree_nop_conversion_p (type, TREE_TYPE (@0)))
  (with { poly_int64 diff; }
   (if (ptr_difference_const (@0, @1, &diff))
    { build_int_cst_type (type, diff); }))))

it works fine when adding

(simplify
 (minus (convert ADDR_EXPR@0) (convert (pointer_plus @1 @2)))
 (if (tree_nop_conversion_p (type, TREE_TYPE (@0)))
  (with { poly_int64 diff; }
   (if (ptr_difference_const (@0, @1, &diff))
    (minus { build_int_cst_type (type, diff); } (convert @2))))))

then we get

# of iterations (1 - (unsigned long) _69) / 2, bounded by
9223372036854775807

and the diagnostic is gone.
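
The arithmetic behind that simplification can be checked with a small sketch. It assumes `base` stands in for the address whose name is elided in the dumps above and that `o` plays the role of _69; the function names are hypothetical. The constant difference between base + 2 and base is 2, and subtracting the converted offset (o + 1) leaves 1 - o in wrapping unsigned arithmetic.

```cpp
#include <cassert>
#include <cstdint>

using u64 = std::uint64_t;

// (unsigned long)(base + 2) - (unsigned long)(base + (o + 1))
constexpr u64 difference_unsimplified(u64 base, u64 o) {
    return (base + 2) - (base + (o + 1));
}

// Form after the added pattern: constant difference minus the offset.
constexpr u64 difference_simplified(u64 o) {
    return u64{1} - o;
}

// Compile-time spot checks; unsigned wrap-around is well defined.
static_assert(difference_unsimplified(0x1000, 1) == difference_simplified(1), "");
static_assert(difference_unsimplified(0x1000, 2) == difference_simplified(2), "");
static_assert(difference_unsimplified(UINT64_MAX - 3, 5) == difference_simplified(5), "");

// Run-time check over a few bases and offsets.
inline void check_identity() {
    const u64 bases[] = { 0, 0x1000, 0x7ffff000, UINT64_MAX - 3 };
    for (u64 base : bases)
        for (u64 o = 0; o < 16; ++o)
            assert(difference_unsimplified(base, o) == difference_simplified(o));
}
```

The base cancels for every value, which is why the result no longer mentions the address at all.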

[Bug tree-optimization/109791] -Wstringop-overflow warning with -O3 and _GLIBCXX_USE_CXX11_ABI=0

2023-05-11 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109791

--- Comment #8 from Richard Biener  ---
Created attachment 55047
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55047&action=edit
patch unfolding such PHIs

[Bug tree-optimization/109791] -Wstringop-overflow warning with -O3 and _GLIBCXX_USE_CXX11_ABI=0

2023-05-11 Thread rguenther at suse dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109791

--- Comment #7 from rguenther at suse dot de  ---
On Thu, 11 May 2023, aldyh at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109791
> 
> --- Comment #6 from Aldy Hernandez  ---
> 
> > but the issue with the PHI node remains unless we sink the  part
> > (but there's many uses of __i_14).  I guess it's still the "easiest"
> > way to get ranger's help.  Aka make
> > 
> >  # __i_14' = PHI <1(10), 2(9)>
> >  __i_14 =  + __i_14'; // would be a POINTER_PLUS_EXPR
> > 
> > it's probably still not a complete fix but maybe a good start.  Of course
> > it increases the number of stmts - [ + 1B] was an 'invariant'
> > (of course the PHI result isn't).  There's not a good place for this
> > transform - we never "fold" PHIs (and this would be an un-folding).
> 
> Ughh, that sucks.  Let's see if Andrew has any ideas, but on my end I won't be
> able to work on prange until much later this cycle-- assuming I finish what I
> have on my plate.

So the idea with the above is of course that via regular folding
and value-numbering we can simplify the compare to a compare of
just the offsets and for those ranger already works.  The expression
is quite obfuscated of course and as said the strlen pass placement
doesn't help (it's before forwprop and VRP).

That said, the place to transform the PHI node is probably the same
where degenerate PHIs are removed.

For the testcase the PHI is created quite early by cunrolli.

I have a patch splitting the PHI.  We then still have

  # _69 = PHI <1(9), 2(8)>
  __i_44 =  + _69;
...
   [local count: 402445658]:
  _51 = (unsigned long)   [(void *) + 2B];
  _4 = (unsigned long) __i_44;
  _12 = -_4;
  _119 = _51 + 18446744073709551615;
  _48 = _119 - _4;
  _46 = _48 > 13;
  if (_46 != 0)
goto ; [64.00%]

and also

  _31 =   [(void *) + 3B] <= __i_44;

so at least for the latter we are missing a simplification
pattern - it should simplify the compare to

  _31 = 3 <= _69

possibly (unsigned)(p p+ offset) should be changed to
(unsigned)p + offset and thus likewise (unsigned) [ + 4B] into
(unsigned) + 4.
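
A minimal sketch of why the compare can be reduced to the offsets, assuming a hypothetical 16-byte buffer `buf` in place of the object whose name is elided above, with `off` standing in for _69. Both pointers are derived from the same object, so ordering them is the same as ordering their offsets.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the object referenced in the dump.
static char buf[16];

// Compare as written in the dump: address of byte 3 <= base + off.
inline bool compare_pointers(std::size_t off) {
    const char *i = buf + off;  // __i_44 = base + _69
    return &buf[3] <= i;
}

// Compare after the proposed simplification: 3 <= off.
inline bool compare_offsets(std::size_t off) {
    return 3 <= off;
}

// Both forms agree for every offset up to one past the end of buf.
inline bool compares_agree() {
    for (std::size_t off = 0; off <= sizeof buf; ++off)
        if (compare_pointers(off) != compare_offsets(off))
            return false;
    return true;
}
```

Once the compare is on plain integers such as `off`, an integer range like [1, 2] for the PHI result is enough to decide it.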

[Bug tree-optimization/109791] -Wstringop-overflow warning with -O3 and _GLIBCXX_USE_CXX11_ABI=0

2023-05-11 Thread aldyh at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109791

--- Comment #6 from Aldy Hernandez  ---

> but the issue with the PHI node remains unless we sink the  part
> (but there's many uses of __i_14).  I guess it's still the "easiest"
> way to get ranger's help.  Aka make
> 
>  # __i_14' = PHI <1(10), 2(9)>
>  __i_14 =  + __i_14'; // would be a POINTER_PLUS_EXPR
> 
> it's probably still not a complete fix but maybe a good start.  Of course
> it increases the number of stmts - [ + 1B] was an 'invariant'
> (of course the PHI result isn't).  There's not a good place for this
> transform - we never "fold" PHIs (and this would be an un-folding).

Ughh, that sucks.  Let's see if Andrew has any ideas, but on my end I won't be
able to work on prange until much later this cycle-- assuming I finish what I
have on my plate.

[Bug tree-optimization/109791] -Wstringop-overflow warning with -O3 and _GLIBCXX_USE_CXX11_ABI=0

2023-05-11 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109791

--- Comment #5 from Richard Biener  ---
DOM's scoped tables do not help.  The crux is really the PHI:

 __i_14 = PHI <  [(void *) + 1B](10),   [(void *) + 2B](9)>

there's no single value that exposes  + offset.

For

  _4 = (unsigned long)   [(void *) + 2B];

we might want to go and express it as

  _4' = (unsigned long) 
  _4  = _4' + 2;

but the issue with the PHI node remains unless we sink the  part
(but there's many uses of __i_14).  I guess it's still the "easiest"
way to get ranger's help.  Aka make

 # __i_14' = PHI <1(10), 2(9)>
 __i_14 =  + __i_14'; // would be a POINTER_PLUS_EXPR

it's probably still not a complete fix but maybe a good start.  Of course
it increases the number of stmts - [ + 1B] was an 'invariant'
(of course the PHI result isn't).  There's not a good place for this
transform - we never "fold" PHIs (and this would be an un-folding).
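
The un-folding can be pictured in source terms with a minimal sketch. It assumes a hypothetical 16-byte buffer `buf` for the elided object and models each PHI as a conditional on the incoming edge; the function names are illustrative only.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the object whose address feeds the PHI.
static char buf[16];

// Folded form: the PHI merges two invariant addresses.
inline const char *phi_of_addresses(bool from_bb10) {
    return from_bb10 ? &buf[1] : &buf[2];
}

// Un-folded form: the PHI merges only the offsets, and a single
// pointer addition (the POINTER_PLUS_EXPR) applies the base afterwards.
inline const char *phi_of_offsets(bool from_bb10) {
    const std::size_t off = from_bb10 ? 1 : 2;  // __i_14' = PHI <1(10), 2(9)>
    return buf + off;                           // __i_14 = base + __i_14'
}
```

Both functions return the same pointer on each edge. The second form costs one extra statement but exposes an integer with the range [1, 2], which is the kind of value ranger already handles.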

[Bug tree-optimization/109791] -Wstringop-overflow warning with -O3 and _GLIBCXX_USE_CXX11_ABI=0

2023-05-11 Thread aldyh at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109791

--- Comment #4 from Aldy Hernandez  ---
BTW, another reason I had to drop the prange work was because IPA was doing
their own thing with ranges outside of the irange API, so it was harder to
separate things out.  So really, all this stuff was related to legacy, which is
mostly gone, and my upcoming work on IPA later this cycle should clean up the
rest of IPA.

[Bug tree-optimization/109791] -Wstringop-overflow warning with -O3 and _GLIBCXX_USE_CXX11_ABI=0

2023-05-11 Thread aldyh at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109791

--- Comment #3 from Aldy Hernandez  ---
(In reply to Richard Biener from comment #2)
> Confirmed.  This is a missed optimization, we fail to optimize the loop guard
> 
>  [local count: 329643239]:
> _4 = (unsigned long)   [(void *) + 2B];
> _6 = (unsigned long) __i_14;
> _50 = -_6;
> _100 = _4 + 18446744073709551615;
> _40 = _100 - _6;
> _41 = _40 > 13;
> if (_41 != 0)

Do we even get a non-zero range for _4?  I'm assuming that even if we get that,
we can't see that +2B minus 1 is also nonzero?

> 
> with __i_14 being
> 
>  [local count: 452186132]:
> # __i_14 = PHI <  [(void *) + 1B](10),  
> [(void *) + 2B](9)>
> 
> I'll note that the strlen pass runs before VRP (but after DOM), but I'll
> also note that ranger likely isn't very good with these kinds of
> "symbolic" ranges?  How would we handle this?  Using two
> relations, __i_14 >=  + 1 && __i_14 <=  + 2?

Yeah, we don't do much with that.  Although the pointer equivalency class
should help in VRP's case.  We do some simple pointer tracking and even call
into gimple fold to simplify statements, but it's far from perfect and as you
say, strlen is running before VRP, so it wouldn't help in this case.

> 
> DOM has
> 
> Optimizing block #16
> 
> 1>>> STMT 1 =   [(void *) + 2B] ge_expr __i_14
> 1>>> STMT 1 =   [(void *) + 2B] ne_expr __i_14
> 1>>> STMT 0 =   [(void *) + 2B] eq_expr __i_14
> 1>>> STMT 1 =   [(void *) + 2B] gt_expr __i_14
> 1>>> STMT 0 =   [(void *) + 2B] le_expr __i_14
> Optimizing statement _4 = (unsigned long)   [(void *) + 2B];
> LKUP STMT _4 = nop_expr   [(void *) + 2B]
> 2>>> STMT _4 = nop_expr   [(void *) + 2B]
> Optimizing statement _6 = (unsigned long) __i_14;
> LKUP STMT _6 = nop_expr __i_14
> 2>>> STMT _6 = nop_expr __i_14
> Optimizing statement _50 = -_6;
>  Registering value_relation (_6 pe64 __i_14) (bb16) at _6 = (unsigned long)
> __i_14;
> LKUP STMT _50 = negate_expr _6
> 2>>> STMT _50 = negate_expr _6
> Optimizing statement _100 = _4 + 18446744073709551615;
> LKUP STMT _100 = _4 plus_expr 18446744073709551615
> 2>>> STMT _100 = _4 plus_expr 18446744073709551615
> Optimizing statement _40 = _100 - _6;
>  Registering value_relation (_100 < _4) (bb16) at _100 = _4 +
> 18446744073709551615;
> LKUP STMT _40 = _100 minus_expr _6
> 2>>> STMT _40 = _100 minus_expr _6
> Optimizing statement _41 = _40 > 13;
> LKUP STMT _41 = _40 gt_expr 13
> 2>>> STMT _41 = _40 gt_expr 13
> LKUP STMT _40 le_expr 14
> Optimizing statement if (_41 != 0)
> 
> Visiting conditional with predicate: if (_41 != 0)
> 
> With known ranges
> _41: [irange] bool VARYING

Ranger won't do anything, but can DOM's scoped tables do anything here?  The
hybrid threader in DOM first asks DOM's internal mechanism before asking ranger
(which seems useless in this case):

dom_jt_simplifier::simplify (gimple *stmt, gimple *within_stmt,
                             basic_block bb, jt_state *state)
{
  /* First see if the conditional is in the hash table.  */
  tree cached_lhs = m_avails->lookup_avail_expr (stmt, false, true);
  if (cached_lhs)
    return cached_lhs;

  /* Otherwise call the ranger if possible.  */
  if (state)
    return hybrid_jt_simplifier::simplify (stmt, within_stmt, bb, state);

  return NULL;
}

Our long term plan for pointers was providing a prange class and separating
pointers from irange.  This class would track zero/nonzero mostly, but it would
also do some pointer equivalence tracking, thus subsuming what we do with
pointer_equiv_analyzer which is currently restricted to VRP and is an
on-the-side hack.

The prototype I had for it last release tracked the equivalence plus an offset,
so we should be able to do arithmetic on a prange of say [ + X].  I had to
put it aside because the frange work took way longer than expected, plus legacy
got in the way.  Neither of these is an issue now, so we'll see, but that's the
plan.  I should dust off those patches :-/.

[Bug tree-optimization/109791] -Wstringop-overflow warning with -O3 and _GLIBCXX_USE_CXX11_ABI=0

2023-05-10 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109791

Richard Biener  changed:

   What|Removed |Added

 Ever confirmed|0   |1
 CC||aldyh at gcc dot gnu.org,
   ||amacleod at redhat dot com,
   ||rguenth at gcc dot gnu.org
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2023-05-10

--- Comment #2 from Richard Biener  ---
Confirmed.  This is a missed optimization, we fail to optimize the loop guard

 [local count: 329643239]:
_4 = (unsigned long)   [(void *) + 2B];
_6 = (unsigned long) __i_14;
_50 = -_6;
_100 = _4 + 18446744073709551615;
_40 = _100 - _6;
_41 = _40 > 13;
if (_41 != 0)

with __i_14 being

 [local count: 452186132]:
# __i_14 = PHI <  [(void *) + 1B](10),   [(void
*) + 2B](9)>

I'll note that the strlen pass runs before VRP (but after DOM), but I'll
also note that ranger likely isn't very good with these kinds of
"symbolic" ranges?  How would we handle this?  Using two
relations, __i_14 >=  + 1 && __i_14 <=  + 2?

DOM has

Optimizing block #16

1>>> STMT 1 =   [(void *) + 2B] ge_expr __i_14
1>>> STMT 1 =   [(void *) + 2B] ne_expr __i_14
1>>> STMT 0 =   [(void *) + 2B] eq_expr __i_14
1>>> STMT 1 =   [(void *) + 2B] gt_expr __i_14
1>>> STMT 0 =   [(void *) + 2B] le_expr __i_14
Optimizing statement _4 = (unsigned long)   [(void *) + 2B];
LKUP STMT _4 = nop_expr   [(void *) + 2B]
2>>> STMT _4 = nop_expr   [(void *) + 2B]
Optimizing statement _6 = (unsigned long) __i_14;
LKUP STMT _6 = nop_expr __i_14
2>>> STMT _6 = nop_expr __i_14
Optimizing statement _50 = -_6;
 Registering value_relation (_6 pe64 __i_14) (bb16) at _6 = (unsigned long)
__i_14;
LKUP STMT _50 = negate_expr _6
2>>> STMT _50 = negate_expr _6
Optimizing statement _100 = _4 + 18446744073709551615;
LKUP STMT _100 = _4 plus_expr 18446744073709551615
2>>> STMT _100 = _4 plus_expr 18446744073709551615
Optimizing statement _40 = _100 - _6;
 Registering value_relation (_100 < _4) (bb16) at _100 = _4 +
18446744073709551615;
LKUP STMT _40 = _100 minus_expr _6
2>>> STMT _40 = _100 minus_expr _6
Optimizing statement _41 = _40 > 13;
LKUP STMT _41 = _40 gt_expr 13
2>>> STMT _41 = _40 gt_expr 13
LKUP STMT _40 le_expr 14
Optimizing statement if (_41 != 0)

Visiting conditional with predicate: if (_41 != 0)

With known ranges
_41: [irange] bool VARYING

[Bug tree-optimization/109791] -Wstringop-overflow warning with -O3 and _GLIBCXX_USE_CXX11_ABI=0

2023-05-09 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109791

--- Comment #1 from Andrew Pinski  ---
The exact command line for a generic x86_64-linux-gnu compiler:
-O2 -fvect-cost-model=dynamic  -Wstringop-overflow -D_GLIBCXX_USE_CXX11_ABI=0
-march=x86-64-v2