[Bug tree-optimization/109791] -Wstringop-overflow warning with -O3 and _GLIBCXX_USE_CXX11_ABI=0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109791

--- Comment #17 from Andrew Pinski ---
(In reply to Andrew Pinski from comment #16)
> (In reply to Richard Biener from comment #15)
> > Created attachment 55155 [details]
> > patch unfolding such PHIs
> >
> > Updated PHI unfolding patch.  Tests fine besides mentioned diagnostic
> > regressions.
>
> I was looking into doing the opposite in forwprop but maybe I can skip
> addresses.

Oh yes I see it was mentioned before in PR 102138.
[Bug tree-optimization/109791] -Wstringop-overflow warning with -O3 and _GLIBCXX_USE_CXX11_ABI=0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109791

--- Comment #16 from Andrew Pinski ---
(In reply to Richard Biener from comment #15)
> Created attachment 55155 [details]
> patch unfolding such PHIs
>
> Updated PHI unfolding patch.  Tests fine besides mentioned diagnostic
> regressions.

I was looking into doing the opposite in forwprop but maybe I can skip
addresses.
[Bug tree-optimization/109791] -Wstringop-overflow warning with -O3 and _GLIBCXX_USE_CXX11_ABI=0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109791

Richard Biener changed:

           What    |Removed                       |Added
----------------------------------------------------------------------------
           Assignee|rguenth at gcc dot gnu.org    |unassigned at gcc dot gnu.org
             Status|ASSIGNED                      |NEW
[Bug tree-optimization/109791] -Wstringop-overflow warning with -O3 and _GLIBCXX_USE_CXX11_ABI=0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109791

Richard Biener changed:

           What    |Removed                       |Added
----------------------------------------------------------------------------
  Attachment #55047|0                             |1
        is obsolete|                              |

--- Comment #15 from Richard Biener ---
Created attachment 55155
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55155&action=edit
patch unfolding such PHIs

Updated PHI unfolding patch.  Tests fine besides mentioned diagnostic
regressions.
[Bug tree-optimization/109791] -Wstringop-overflow warning with -O3 and _GLIBCXX_USE_CXX11_ABI=0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109791

--- Comment #14 from Richard Biener ---
So one issue with the unfolding of PHIs is that, for example,
gcc.dg/warn-sprintf-no-nul.c has

  const char a2[][3] = { "", "1", "12", "123", "123\000" };

and for

  # str_1 = PHI <&a2[2], &a2[3]>

we can determine bounds on the string length of str_1 by unioning the
string lengths of &a2[2] and &a2[3].  But with

  # off_2 = PHI <6, 9>
  str_1 = &a2 + off_2;

this isn't possible.  In fact get_range_strlen doesn't handle
POINTER_PLUS_EXPR, and while it might be possible to handle "foo" + off_2
by looking at the range of off_2, the above case of referring to two
different strings rather than offsetting within one string isn't
distinguishable.

I've also figured that when one PHI argument has zero offset (aka the plain
base address) then PRE tends to undo the transform, since base + 0 is
readily available on that edge and thus it inserts pointer adjustments on
the other edges.

So while it looked like the easy way out of the ranger limitation, it's not
a viable solution (because it regresses testcases).
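The distinction drawn in comment #14 can be illustrated with a small C sketch. This is not GCC code: `table`, `row_strlen`, `len_range`, `union_of_rows` and `offset_is_row_start` are hypothetical names, and the table is a simplified stand-in for the a2 array in the testcase. The sketch assumes 3-byte rows, so byte offsets 6 and 9 are the starts of rows 2 and 3.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { ROW = 3, NROWS = 4 };

/* Hypothetical stand-in for the testcase array: rows are 3 bytes wide,
   so the last row holds "123" without a terminating NUL.  */
const char table[NROWS][ROW] = {
    { 0 }, { '1' }, { '1', '2' }, { '1', '2', '3' }
};

struct len_range { size_t min, max; };

/* Length of the string stored in one row, capped at ROW if no NUL.  */
size_t row_strlen(const char *row)
{
    const char *nul = memchr(row, '\0', ROW);
    return nul ? (size_t)(nul - row) : (size_t)ROW;
}

/* Pointer form: the value is one of two known rows, so the length
   range is simply the union of the two row lengths.  */
struct len_range union_of_rows(size_t r1, size_t r2)
{
    assert(r1 < NROWS && r2 < NROWS);
    size_t a = row_strlen(table[r1]), b = row_strlen(table[r2]);
    struct len_range r = { a < b ? a : b, a < b ? b : a };
    return r;
}

/* Offset form: a byte offset into the whole table.  Only multiples of
   ROW denote the start of a row; anything else points mid-string.  */
int offset_is_row_start(size_t off)
{
    return off % ROW == 0;
}
```

With `union_of_rows(2, 3)` the length range is [2, 3]. An integer range of [6, 9] for the offset would also admit 7 and 8, which point into the middle of row 2, and that is the ambiguity the comment describes.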
[Bug tree-optimization/109791] -Wstringop-overflow warning with -O3 and _GLIBCXX_USE_CXX11_ABI=0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109791

--- Comment #13 from CVS Commits ---
The master branch has been updated by Richard Biener:

https://gcc.gnu.org/g:560a3e35fe01c499bd5b1e95ddc4c3e958cf5abd

commit r14-785-g560a3e35fe01c499bd5b1e95ddc4c3e958cf5abd
Author: Richard Biener
Date:   Thu May 11 14:28:11 2023 +0200

    tree-optimization/109791 - simplify (unsigned) - (unsigned)( + o)

    The following adds another variant of address difference
    simplification.  The utility ptr_difference_const only handles
    constant differences (we also cannot code generate anything else),
    so exposing a possible POINTER_PLUS_EXPR in the match and computing
    the difference on the base only makes it possible to handle one
    case of a variable offset.

    This simplifies

      (unsigned long) [(void *) + 2B] - (unsigned long) ( + (_69 + 1))

    down to (1 - (unsigned long) _69) during niter analysis, allowing
    ranger to eliminate a condition later and avoiding a bogus
    -Wstringop-overflow diagnostic for the testcase in the PR.

            PR tree-optimization/109791
            * match.pd (minus (convert ADDR_EXPR@0) (convert (pointer_plus @1 @2))):
            New pattern.
            (minus (convert (pointer_plus @1 @2)) (convert ADDR_EXPR@0)):
            Likewise.
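The arithmetic behind this simplification can be checked with a short C sketch. It is only an illustration, not the match.pd pattern itself: `buf`, `BUF_SIZE` and the function names are hypothetical, and it assumes a flat address space where converting a pointer to `uintptr_t` yields its byte address.

```c
#include <assert.h>
#include <stdint.h>

enum { BUF_SIZE = 16 };

/* Unsimplified form, mirroring
   (unsigned long) &buf[2] - (unsigned long) (buf + (x + 1)).
   x must keep buf + (x + 1) inside the buffer or one past its end.  */
uintptr_t diff_via_addresses(const char *buf, uintptr_t x)
{
    assert(x + 1 <= BUF_SIZE);
    return (uintptr_t)(buf + 2) - (uintptr_t)(buf + (x + 1));
}

/* Simplified form: the base cancels, leaving 2 - (x + 1) == 1 - x,
   computed with unsigned wrap-around.  */
uintptr_t diff_via_offsets(uintptr_t x)
{
    return (uintptr_t)1 - x;
}

/* Returns 1 if both forms agree for every valid x, else 0.  */
int forms_agree(void)
{
    static char buf[BUF_SIZE];
    for (uintptr_t x = 0; x + 1 <= BUF_SIZE; x++)
        if (diff_via_addresses(buf, x) != diff_via_offsets(x))
            return 0;
    return 1;
}
```

Once the base address has cancelled, only an integer expression in the offset remains, which is the kind of expression the commit message says ranger can then reason about.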
[Bug tree-optimization/109791] -Wstringop-overflow warning with -O3 and _GLIBCXX_USE_CXX11_ABI=0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109791

Richard Biener changed:

           What    |Removed                       |Added
----------------------------------------------------------------------------
             Status|NEW                           |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org

--- Comment #12 from Richard Biener ---
So let me see if this all works out reasonably.
[Bug tree-optimization/109791] -Wstringop-overflow warning with -O3 and _GLIBCXX_USE_CXX11_ABI=0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109791

--- Comment #11 from Richard Biener ---
Created attachment 55049
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55049&action=edit
patch for extra pointer difference patterns
[Bug tree-optimization/109791] -Wstringop-overflow warning with -O3 and _GLIBCXX_USE_CXX11_ABI=0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109791

--- Comment #10 from Richard Biener ---
Created attachment 55048
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55048&action=edit
patch for niter expression expansion
[Bug tree-optimization/109791] -Wstringop-overflow warning with -O3 and _GLIBCXX_USE_CXX11_ABI=0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109791

--- Comment #9 from Richard Biener ---
So the first pass that makes things difficult is reassoc, which transforms

  _51 = (unsigned long) [(void *) + 2B];
  _4 = (unsigned long) __i_44;
  _65 = _51 - _4;
  _48 = _65 + 18446744073709551615;
  _46 = _48 > 13;
  if (_46 != 0)

to

  _51 = (unsigned long) [(void *) + 2B];
  _4 = (unsigned long) __i_44;
  _12 = -_4;
  _119 = _51 + 18446744073709551615;
  _48 = _119 - _4;
  _46 = _48 > 13;
  if (_46 != 0)

(also leaving garbage around).  That's probably because of the PHI biasing
and __i_44 being defined by a PHI.

niter analysis produces

  # of iterations (((unsigned long) [(void *) + 2B] - (unsigned long)
  __i_44) + 18446744073709551615) / 2, bounded by 9223372036854775807

but doesn't expand __i_44 (not sure if we'd fold the thing then).  The
gimplifier folds stmts w/o following SSA edges when we eventually
re-gimplify those expressions for insertion.

If we expand offsetting of invariant bases we get instead

  # of iterations ((unsigned long) [(void *) + 2B] - (unsigned long)
  ( + (_69 + 1))) / 2, bounded by 9223372036854775807

but that's still not simplified.  I think ptr_difference_const should
handle this - of course the difference here isn't const...

/* Try folding difference of addresses.  */
(simplify
 (minus (convert ADDR_EXPR@0) (convert @1))
 (if (tree_nop_conversion_p (type, TREE_TYPE (@0)))
  (with { poly_int64 diff; }
   (if (ptr_difference_const (@0, @1, &diff))
    { build_int_cst_type (type, diff); }))))
(simplify
 (minus (convert @0) (convert ADDR_EXPR@1))
 (if (tree_nop_conversion_p (type, TREE_TYPE (@0)))
  (with { poly_int64 diff; }
   (if (ptr_difference_const (@0, @1, &diff))
    { build_int_cst_type (type, diff); }))))

it works fine when adding

(simplify
 (minus (convert ADDR_EXPR@0) (convert (pointer_plus @1 @2)))
 (if (tree_nop_conversion_p (type, TREE_TYPE (@0)))
  (with { poly_int64 diff; }
   (if (ptr_difference_const (@0, @1, &diff))
    (minus { build_int_cst_type (type, diff); } (convert @2))))))

then we get

  # of iterations (1 - (unsigned long) _69) / 2, bounded by
  9223372036854775807

and the diagnostic is gone.
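As a side note on the reassoc transform shown in comment #9, the two statement sequences compute the same guard, because unsigned arithmetic is associative modulo 2^64. A minimal C sketch of that, with hypothetical function names and plain integers standing in for the converted addresses:

```c
#include <assert.h>
#include <stdint.h>

/* Guard as computed before reassoc:
   _65 = _51 - _4;  _48 = _65 + 18446744073709551615;  _46 = _48 > 13  */
int guard_before(uint64_t end, uint64_t i)
{
    uint64_t diff = end - i;
    uint64_t biased = diff + UINT64_MAX;   /* adding 2^64 - 1 is "minus 1" */
    return biased > 13;
}

/* Guard as computed after reassoc:
   _119 = _51 + 18446744073709551615;  _48 = _119 - _4;  _46 = _48 > 13  */
int guard_after(uint64_t end, uint64_t i)
{
    uint64_t biased_end = end + UINT64_MAX;
    uint64_t diff = biased_end - i;
    return diff > 13;
}

/* Iteration count in the shape niter analysis reports:
   ((end - i) + 18446744073709551615) / 2.  */
uint64_t niter(uint64_t end, uint64_t i)
{
    return ((end - i) + UINT64_MAX) / 2;
}

/* Both guard forms must agree for any pair of inputs.  */
void check_guards_agree(uint64_t end, uint64_t i)
{
    assert(guard_before(end, i) == guard_after(end, i));
}
```

The value is unchanged by the reordering; what changes is which subexpressions are visible to later folding, since the difference of the two addresses no longer appears as a single operation.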
[Bug tree-optimization/109791] -Wstringop-overflow warning with -O3 and _GLIBCXX_USE_CXX11_ABI=0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109791

--- Comment #8 from Richard Biener ---
Created attachment 55047
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55047&action=edit
patch unfolding such PHIs
[Bug tree-optimization/109791] -Wstringop-overflow warning with -O3 and _GLIBCXX_USE_CXX11_ABI=0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109791

--- Comment #7 from rguenther at suse dot de ---
On Thu, 11 May 2023, aldyh at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109791
>
> --- Comment #6 from Aldy Hernandez ---
> > but the issue with the PHI node remains unless we sink the part
> > (but there's many uses of __i_14).  I guess it's still the "easiest"
> > way to get ranger's help.  Aka make
> >
> >  # __i_14' = PHI <1(10), 2(9)>
> >  __i_14 = + __i_14'; // would be a POINTER_PLUS_EXPR
> >
> > it's probably still not a complete fix but maybe a good start.  Of course
> > it increases the number of stmts - [ + 1B] was an 'invariant'
> > (of course the PHI result isn't).  There's not a good place for this
> > transform - we never "fold" PHIs (and this would be an un-folding).
>
> Ughh, that sucks.  Let's see if Andrew has any ideas, but on my end I won't be
> able to work on prange until much later this cycle-- assuming I finish what I
> have on my plate.

So the idea with the above is of course that via regular folding and
value-numbering we can simplify the compare to a compare of just the
offsets, and for those ranger already works.  The expression is quite
obfuscated of course and, as said, the strlen pass placement doesn't help
(it's before forwprop and VRP).

That said, the place to transform the PHI node is probably the same where
degenerate PHIs are removed.  For the testcase the PHI is created quite
early by cunrolli.

I have a patch splitting the PHI.  We then still have

  # _69 = PHI <1(9), 2(8)>
  __i_44 = + _69;
  ...
  [local count: 402445658]:
  _51 = (unsigned long) [(void *) + 2B];
  _4 = (unsigned long) __i_44;
  _12 = -_4;
  _119 = _51 + 18446744073709551615;
  _48 = _119 - _4;
  _46 = _48 > 13;
  if (_46 != 0)
    goto ; [64.00%]

and also

  _31 = [(void *) + 3B] <= __i_44;

so at least for the latter we are missing a simplification pattern - it
should simplify the compare to

  _31 = 3 <= _69

possibly (unsigned)(p p+ offset) should be changed to (unsigned)p + offset
and thus likewise (unsigned) [ + 4B] into (unsigned) + 4.
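The compare simplification asked for at the end of comment #7 can be sketched in C as follows. This is an illustration only, with hypothetical names (`buf`, `BUF_SIZE`, `compare_addresses`, `compare_offsets`); it assumes both pointers are derived from the same object, which is what makes comparing the offsets alone valid.

```c
#include <assert.h>
#include <stddef.h>

enum { BUF_SIZE = 8 };

/* Address form of the compare:  &buf[3] <= buf + off.
   off must not exceed BUF_SIZE so both pointers stay within the same
   object (or one past its end).  */
int compare_addresses(const char *buf, size_t off)
{
    assert(off <= BUF_SIZE);
    return buf + 3 <= buf + off;
}

/* Offset form the comment suggests it should simplify to:  3 <= off.  */
int compare_offsets(size_t off)
{
    return 3 <= off;
}

/* Returns 1 if both forms agree for every valid offset, else 0.  */
int compare_forms_agree(void)
{
    static const char buf[BUF_SIZE] = { 0 };
    for (size_t off = 0; off <= BUF_SIZE; off++)
        if (compare_addresses(buf, off) != compare_offsets(off))
            return 0;
    return 1;
}
```

With the offset coming from PHI <1, 2>, `compare_offsets` is false for both possible values, so an integer range on the offset is enough to decide the condition.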
[Bug tree-optimization/109791] -Wstringop-overflow warning with -O3 and _GLIBCXX_USE_CXX11_ABI=0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109791

--- Comment #6 from Aldy Hernandez ---
> but the issue with the PHI node remains unless we sink the part
> (but there's many uses of __i_14).  I guess it's still the "easiest"
> way to get ranger's help.  Aka make
>
>  # __i_14' = PHI <1(10), 2(9)>
>  __i_14 = + __i_14'; // would be a POINTER_PLUS_EXPR
>
> it's probably still not a complete fix but maybe a good start.  Of course
> it increases the number of stmts - [ + 1B] was an 'invariant'
> (of course the PHI result isn't).  There's not a good place for this
> transform - we never "fold" PHIs (and this would be an un-folding).

Ughh, that sucks.  Let's see if Andrew has any ideas, but on my end I won't be
able to work on prange until much later this cycle-- assuming I finish what I
have on my plate.
[Bug tree-optimization/109791] -Wstringop-overflow warning with -O3 and _GLIBCXX_USE_CXX11_ABI=0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109791

--- Comment #5 from Richard Biener ---
DOM's scoped tables do not help.  The crux is really the PHI:

  __i_14 = PHI < [(void *) + 1B](10), [(void *) + 2B](9)>

there's no single value that exposes + offset.  For

  _4 = (unsigned long) [(void *) + 2B];

we might want to go and express it as

  _4' = (unsigned long)
  _4 = _4' + 2;

but the issue with the PHI node remains unless we sink the part
(but there's many uses of __i_14).  I guess it's still the "easiest"
way to get ranger's help.  Aka make

  # __i_14' = PHI <1(10), 2(9)>
  __i_14 = + __i_14'; // would be a POINTER_PLUS_EXPR

it's probably still not a complete fix but maybe a good start.  Of course
it increases the number of stmts - [ + 1B] was an 'invariant'
(of course the PHI result isn't).  There's not a good place for this
transform - we never "fold" PHIs (and this would be an un-folding).
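In source-level terms, the PHI "un-folding" proposed in comment #5 corresponds roughly to the following C sketch. The function names and the `first_edge` flag are hypothetical and merely model which incoming edge of the PHI was taken; this is not how the transform would be implemented in GCC.

```c
#include <assert.h>
#include <stddef.h>

/* Folded form: each arm of the selection is a complete address, as in
   PHI <base + 1, base + 2>.  */
const char *select_folded(const char *base, int first_edge)
{
    return first_edge ? base + 1 : base + 2;
}

/* Unfolded form, step 1: select only the integer offset,
   as in PHI <1, 2>.  */
size_t select_offset(int first_edge)
{
    return first_edge ? 1 : 2;
}

/* Unfolded form, step 2: one pointer addition after the merge,
   corresponding to the POINTER_PLUS_EXPR.  */
const char *select_unfolded(const char *base, int first_edge)
{
    assert(base != NULL);
    return base + select_offset(first_edge);
}
```

Both forms yield the same pointer. The unfolded one costs an extra statement but leaves a plain integer with the small range [1, 2] for a range analysis to work with.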
[Bug tree-optimization/109791] -Wstringop-overflow warning with -O3 and _GLIBCXX_USE_CXX11_ABI=0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109791

--- Comment #4 from Aldy Hernandez ---
BTW, another reason I had to drop the prange work was because IPA was doing
their own thing with ranges outside of the irange API, so it was harder to
separate things out.  So really, all this stuff was related to legacy, which
is mostly gone, and my upcoming work on IPA later this cycle should clean up
the rest of IPA.
[Bug tree-optimization/109791] -Wstringop-overflow warning with -O3 and _GLIBCXX_USE_CXX11_ABI=0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109791

--- Comment #3 from Aldy Hernandez ---
(In reply to Richard Biener from comment #2)
> Confirmed.  This is a missed optimization, we fail to optimize the loop guard
>
>  [local count: 329643239]:
>  _4 = (unsigned long) [(void *) + 2B];
>  _6 = (unsigned long) __i_14;
>  _50 = -_6;
>  _100 = _4 + 18446744073709551615;
>  _40 = _100 - _6;
>  _41 = _40 > 13;
>  if (_41 != 0)

Do we even get a non-zero range for _4?  I'm assuming that even if we get
that, we can't see that +2B minus 1 is also nonzero?

>
> with __i_14 being
>
>  [local count: 452186132]:
>  # __i_14 = PHI < [(void *) + 1B](10),
> [(void *) + 2B](9)>
>
> I'll note that the strlen pass runs before VRP (but after DOM), but I'll
> also note that likely ranger isn't very good with these kind of
> "symbolic" ranges?  How would we handle this?  Using two
> relations, __i_14 >= + 1 && __i_14 <= + 2?

Yeah, we don't do much with that.  Although the pointer equivalency class
should help in VRP's case.  We do some simple pointer tracking and even call
into gimple fold to simplify statements, but it's far from perfect and as you
say, strlen is running before VRP, so it wouldn't help in this case.

>
> DOM has
>
> Optimizing block #16
>
> 1>>> STMT 1 = [(void *) + 2B] ge_expr __i_14
> 1>>> STMT 1 = [(void *) + 2B] ne_expr __i_14
> 1>>> STMT 0 = [(void *) + 2B] eq_expr __i_14
> 1>>> STMT 1 = [(void *) + 2B] gt_expr __i_14
> 1>>> STMT 0 = [(void *) + 2B] le_expr __i_14
> Optimizing statement _4 = (unsigned long) [(void *) + 2B];
> LKUP STMT _4 = nop_expr [(void *) + 2B]
> 2>>> STMT _4 = nop_expr [(void *) + 2B]
> Optimizing statement _6 = (unsigned long) __i_14;
> LKUP STMT _6 = nop_expr __i_14
> 2>>> STMT _6 = nop_expr __i_14
> Optimizing statement _50 = -_6;
> Registering value_relation (_6 pe64 __i_14) (bb16) at _6 = (unsigned long)
> __i_14;
> LKUP STMT _50 = negate_expr _6
> 2>>> STMT _50 = negate_expr _6
> Optimizing statement _100 = _4 + 18446744073709551615;
> LKUP STMT _100 = _4 plus_expr 18446744073709551615
> 2>>> STMT _100 = _4 plus_expr 18446744073709551615
> Optimizing statement _40 = _100 - _6;
> Registering value_relation (_100 < _4) (bb16) at _100 = _4 +
> 18446744073709551615;
> LKUP STMT _40 = _100 minus_expr _6
> 2>>> STMT _40 = _100 minus_expr _6
> Optimizing statement _41 = _40 > 13;
> LKUP STMT _41 = _40 gt_expr 13
> 2>>> STMT _41 = _40 gt_expr 13
> LKUP STMT _40 le_expr 14
> Optimizing statement if (_41 != 0)
>
> Visiting conditional with predicate: if (_41 != 0)
>
> With known ranges
> _41: [irange] bool VARYING

Ranger won't do anything, but can DOM's scoped tables do anything here?  The
hybrid threader in DOM first asks DOM's internal mechanism before asking
ranger (which seems useless in this case):

dom_jt_simplifier::simplify (gimple *stmt, gimple *within_stmt,
                             basic_block bb, jt_state *state)
{
  /* First see if the conditional is in the hash table.  */
  tree cached_lhs = m_avails->lookup_avail_expr (stmt, false, true);
  if (cached_lhs)
    return cached_lhs;

  /* Otherwise call the ranger if possible.  */
  if (state)
    return hybrid_jt_simplifier::simplify (stmt, within_stmt, bb, state);

  return NULL;
}

Our long term plan for pointers was providing a prange class and separating
pointers from irange.  This class would track zero/nonzero mostly, but it
would also do some pointer equivalence tracking, thus subsuming what we do
with pointer_equiv_analyzer, which is currently restricted to VRP and is an
on-the-side hack.  The prototype I had for it last release tracked the
equivalence plus an offset, so we should be able to do arithmetic on a prange
of say [ + X].

I had to put it aside because the frange work took way longer than expected,
plus legacy got in the way.  Neither of these is an issue now, so we'll see,
but that's the plan.

I should dust off those patches :-/.
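To make the idea of "an equivalence plus an offset" concrete, here is a purely hypothetical C sketch. It is not GCC's prange and does not reflect the prototype mentioned above; `ptr_range` and all of its helpers are invented names, used only to show how a shared base with an offset interval could answer the kind of compare discussed in this PR.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical "base plus offset range" for a pointer: the pointer is
   known to equal base + k for some lo <= k <= hi.  */
struct ptr_range {
    const void *base;
    long lo, hi;
};

/* A single known address base + off.  */
struct ptr_range ptr_range_single(const void *base, long off)
{
    assert(base != NULL);
    struct ptr_range r = { base, off, off };
    return r;
}

/* Merge two ranges at a PHI.  Only meaningful when both share a base;
   returns 1 on success and 0 if the bases differ.  */
int ptr_range_union(struct ptr_range a, struct ptr_range b,
                    struct ptr_range *out)
{
    if (a.base != b.base)
        return 0;
    out->base = a.base;
    out->lo = a.lo < b.lo ? a.lo : b.lo;
    out->hi = a.hi > b.hi ? a.hi : b.hi;
    return 1;
}

/* Pointer arithmetic on the range: add a constant to both bounds.  */
struct ptr_range ptr_range_add(struct ptr_range r, long k)
{
    r.lo += k;
    r.hi += k;
    return r;
}

/* Three-valued a >= b: 1 if always true, 0 if always false,
   -1 if unknown (different bases or overlapping offsets).  */
int ptr_range_ge(struct ptr_range a, struct ptr_range b)
{
    if (a.base != b.base)
        return -1;
    if (a.lo >= b.hi)
        return 1;
    if (a.hi < b.lo)
        return 0;
    return -1;
}
```

Under these assumptions, merging base + 1 and base + 2 gives the range [base + 1, base + 2], and comparing base + 2 against it with `ptr_range_ge` is known to be true without ever looking at the numeric address.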
[Bug tree-optimization/109791] -Wstringop-overflow warning with -O3 and _GLIBCXX_USE_CXX11_ABI=0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109791

Richard Biener changed:

           What    |Removed                       |Added
----------------------------------------------------------------------------
     Ever confirmed|0                             |1
                 CC|                              |aldyh at gcc dot gnu.org,
                   |                              |amacleod at redhat dot com,
                   |                              |rguenth at gcc dot gnu.org
             Status|UNCONFIRMED                   |NEW
   Last reconfirmed|                              |2023-05-10

--- Comment #2 from Richard Biener ---
Confirmed.  This is a missed optimization, we fail to optimize the loop guard

  [local count: 329643239]:
  _4 = (unsigned long) [(void *) + 2B];
  _6 = (unsigned long) __i_14;
  _50 = -_6;
  _100 = _4 + 18446744073709551615;
  _40 = _100 - _6;
  _41 = _40 > 13;
  if (_41 != 0)

with __i_14 being

  [local count: 452186132]:
  # __i_14 = PHI < [(void *) + 1B](10), [(void *) + 2B](9)>

I'll note that the strlen pass runs before VRP (but after DOM), but I'll
also note that likely ranger isn't very good with these kind of
"symbolic" ranges?  How would we handle this?  Using two
relations, __i_14 >= + 1 && __i_14 <= + 2?

DOM has

Optimizing block #16

1>>> STMT 1 = [(void *) + 2B] ge_expr __i_14
1>>> STMT 1 = [(void *) + 2B] ne_expr __i_14
1>>> STMT 0 = [(void *) + 2B] eq_expr __i_14
1>>> STMT 1 = [(void *) + 2B] gt_expr __i_14
1>>> STMT 0 = [(void *) + 2B] le_expr __i_14
Optimizing statement _4 = (unsigned long) [(void *) + 2B];
LKUP STMT _4 = nop_expr [(void *) + 2B]
2>>> STMT _4 = nop_expr [(void *) + 2B]
Optimizing statement _6 = (unsigned long) __i_14;
LKUP STMT _6 = nop_expr __i_14
2>>> STMT _6 = nop_expr __i_14
Optimizing statement _50 = -_6;
Registering value_relation (_6 pe64 __i_14) (bb16) at _6 = (unsigned long) __i_14;
LKUP STMT _50 = negate_expr _6
2>>> STMT _50 = negate_expr _6
Optimizing statement _100 = _4 + 18446744073709551615;
LKUP STMT _100 = _4 plus_expr 18446744073709551615
2>>> STMT _100 = _4 plus_expr 18446744073709551615
Optimizing statement _40 = _100 - _6;
Registering value_relation (_100 < _4) (bb16) at _100 = _4 + 18446744073709551615;
LKUP STMT _40 = _100 minus_expr _6
2>>> STMT _40 = _100 minus_expr _6
Optimizing statement _41 = _40 > 13;
LKUP STMT _41 = _40 gt_expr 13
2>>> STMT _41 = _40 gt_expr 13
LKUP STMT _40 le_expr 14
Optimizing statement if (_41 != 0)

Visiting conditional with predicate: if (_41 != 0)

With known ranges
        _41: [irange] bool VARYING
[Bug tree-optimization/109791] -Wstringop-overflow warning with -O3 and _GLIBCXX_USE_CXX11_ABI=0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109791

--- Comment #1 from Andrew Pinski ---
The exact command line for a generic x86_64-linux-gnu compiler:
-O2 -fvect-cost-model=dynamic -Wstringop-overflow -D_GLIBCXX_USE_CXX11_ABI=0 -march=x86-64-v2