Re: [patch] canonicalize unsigned [1,MAX] ranges into ~[0,0]
On Thu, Oct 17, 2019 at 03:15:28AM -0400, Aldy Hernandez wrote: > On 10/16/19 3:46 AM, Jakub Jelinek wrote: > > On Wed, Oct 16, 2019 at 03:38:38AM -0400, Aldy Hernandez wrote: > > > Would you take care of this, or shall I? > > > > Will defer to you, I have quite a lot of stuff on my plate ATM. > > > > Jakub > > > > No problem. Thanks for your analysis though. > > The attached patch fixes the regression. > > OK pending tests? > gcc/ > > PR tree-optimization/92131 > * tree-vrp.c (value_range_base::dump): Display +INF for both > pointers and integers when appropriate. > > gcc/testsuite/ > > * gcc.dg/tree-ssa/evrp4.c: Check for +INF instead of -1. LGTM. Jakub
Re: [patch] canonicalize unsigned [1,MAX] ranges into ~[0,0]
On 10/16/19 3:46 AM, Jakub Jelinek wrote: On Wed, Oct 16, 2019 at 03:38:38AM -0400, Aldy Hernandez wrote: Would you take care of this, or shall I? Will defer to you, I have quite a lot of stuff on my plate ATM. Jakub No problem. Thanks for your analysis though. The attached patch fixes the regression. OK pending tests? gcc/ PR tree-optimization/92131 * tree-vrp.c (value_range_base::dump): Display +INF for both pointers and integers when appropriate. gcc/testsuite/ * gcc.dg/tree-ssa/evrp4.c: Check for +INF instead of -1. diff --git a/gcc/testsuite/gcc.dg/tree-ssa/evrp4.c b/gcc/testsuite/gcc.dg/tree-ssa/evrp4.c index ba2f6b9b430..6710e6b5eff 100644 --- a/gcc/testsuite/gcc.dg/tree-ssa/evrp4.c +++ b/gcc/testsuite/gcc.dg/tree-ssa/evrp4.c @@ -17,4 +17,4 @@ int bar (struct st *s) foo (>a); } -/* { dg-final { scan-tree-dump "\\\[1B, -1B\\\]" "evrp" } } */ +/* { dg-final { scan-tree-dump "\\\[1B, \\+INF\\\]" "evrp" } } */ diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c index 21910b36518..8d4f16e9e1f 100644 --- a/gcc/tree-vrp.c +++ b/gcc/tree-vrp.c @@ -428,8 +428,8 @@ value_range_base::dump (FILE *file) const fprintf (file, ", "); - if (INTEGRAL_TYPE_P (ttype) - && vrp_val_is_max (max ()) + if (supports_type_p (ttype) + && vrp_val_is_max (max (), true) && TYPE_PRECISION (ttype) != 1) fprintf (file, "+INF"); else
Re: [patch] canonicalize unsigned [1,MAX] ranges into ~[0,0]
On Wed, Oct 16, 2019 at 03:38:38AM -0400, Aldy Hernandez wrote: > Would you take care of this, or shall I? Will defer to you, I have quite a lot of stuff on my plate ATM. Jakub
Re: [patch] canonicalize unsigned [1,MAX] ranges into ~[0,0]
On 10/15/19 2:16 PM, Jakub Jelinek wrote: On Tue, Oct 15, 2019 at 08:35:07AM -0400, Aldy Hernandez wrote: I'm seeing this on 32-bit i386-pc-solaris2.11 and sparc-sun-solaris2.11, with more reports for armv8l, pru, and s390x. Comparing the dumps between 64 and 32-bit, I see -_1: int * [1B, -1B] +_1: int * [1B, 4294967295B] I wonder why 32-bit targets at displaying 4294967295 instead of -1. Or are pointers 64-bits here? Because the dump method does: if (INTEGRAL_TYPE_P (ttype) && vrp_val_is_max (max ()) && TYPE_PRECISION (ttype) != 1) fprintf (file, "+INF"); else print_generic_expr (file, max ()); so for integral types and maximum value, it prints +INF, but not for pointers. Ah, I see. Perhaps we want to print +INF also for pointers, if ((INTEGRAL_TYPE_P (ttype) || POINTER_TYPE_P (ttype)) && vrp_val_is_max (max (), true) && TYPE_PRECISION (ttype) != 1) fprintf (file, "+INF"); else print_generic_expr (file, max ()); That sounds reasonable, though I would use supports_type_p() instead of open-coding the check for INTEGRAL and POINTER. Would you take care of this, or shall I? but maybe vrp_val_is_{min,max} should be rewritten for pointer types to be more efficient, don't create trees, for min just use integer_zerop and for max just compare wide_int? That sounds like a separate issue, but sure. No complaints. Note, that I highly dislike the whole handle_pointers=bool argument in vrp_*val*, even though I added it. I think it should default to the handle_pointers behavior, though I was unsure what would break, so I kept existing behavior gated by a bool (yuck). Thanks. Aldy
Re: [patch] canonicalize unsigned [1,MAX] ranges into ~[0,0]
On Tue, Oct 15, 2019 at 08:35:07AM -0400, Aldy Hernandez wrote: > > I'm seeing this on 32-bit i386-pc-solaris2.11 and sparc-sun-solaris2.11, > > with more reports for armv8l, pru, and s390x. > > > > Comparing the dumps between 64 and 32-bit, I see > > > > -_1: int * [1B, -1B] > > +_1: int * [1B, 4294967295B] > > I wonder why 32-bit targets at displaying 4294967295 instead of -1. Or are > pointers 64-bits here? Because the dump method does: if (INTEGRAL_TYPE_P (ttype) && vrp_val_is_max (max ()) && TYPE_PRECISION (ttype) != 1) fprintf (file, "+INF"); else print_generic_expr (file, max ()); so for integral types and maximum value, it prints +INF, but not for pointers. Perhaps we want to print +INF also for pointers, if ((INTEGRAL_TYPE_P (ttype) || POINTER_TYPE_P (ttype)) && vrp_val_is_max (max (), true) && TYPE_PRECISION (ttype) != 1) fprintf (file, "+INF"); else print_generic_expr (file, max ()); but maybe vrp_val_is_{min,max} should be rewritten for pointer types to be more efficient, don't create trees, for min just use integer_zerop and for max just compare wide_int? Jakub
Re: [patch] canonicalize unsigned [1,MAX] ranges into ~[0,0]
Hi Aldy, Rainer Orth wrote: >> >> On 10/15/19 7:58 AM, Rainer Orth wrote: >>> Hi Aldy, >>> >> ~[0,0] has been the accepted way for a long time, I'd really prefer to >> keep that (for now). > It has. Very true. But I don't necessarily think that means we should > be introducing even more of 'em. >>> [...] Happily, normalizing into ~0 for signed and [1,MAX] for unsigned, simplified the patch because it on longer needs tweaks to ranges_from_anti_range. OK for trunk? >>> >>> the new testcase FAILs on several (all?) 32-bit targets: >>> >>> +FAIL: gcc.dg/tree-ssa/evrp4.c scan-tree-dump evrp "[1B, -1B]" >> >> That's unfortunate. >> >> Is this the only test that is failing? > > it's the only on on Solaris/SPARC and Solaris/x86. Haven't checked > other affected targets, though. > >>> I'm seeing this on 32-bit i386-pc-solaris2.11 and sparc-sun-solaris2.11, >>> with more reports for armv8l, pru, and s390x. >>> >>> Comparing the dumps between 64 and 32-bit, I see >>> >>> -_1: int * [1B, -1B] >>> +_1: int * [1B, 4294967295B] >> >> I wonder why 32-bit targets at displaying 4294967295 instead of -1. Or are >> pointers 64-bits here? > > No, it's a pure 32-bit target. The compiler is 32-bit, too, but > bi-arch (32 and 64-bit). identical result on i686-darwin9, also a pure32b target (with a 64b multilb). Iain
Re: [patch] canonicalize unsigned [1,MAX] ranges into ~[0,0]
Hi Aldy, > On 10/15/19 7:58 AM, Rainer Orth wrote: >> Hi Aldy, >> > ~[0,0] has been the accepted way for a long time, I'd really prefer to > keep that (for now). It has. Very true. But I don't necessarily think that means we should be introducing even more of 'em. >> [...] >>> Happily, normalizing into ~0 for signed and [1,MAX] for unsigned, >>> simplified the patch because it on longer needs tweaks to >>> ranges_from_anti_range. >>> >>> OK for trunk? >> >> the new testcase FAILs on several (all?) 32-bit targets: >> >> +FAIL: gcc.dg/tree-ssa/evrp4.c scan-tree-dump evrp "[1B, -1B]" > > That's unfortunate. > > Is this the only test that is failing? it's the only on on Solaris/SPARC and Solaris/x86. Haven't checked other affected targets, though. >> I'm seeing this on 32-bit i386-pc-solaris2.11 and sparc-sun-solaris2.11, >> with more reports for armv8l, pru, and s390x. >> >> Comparing the dumps between 64 and 32-bit, I see >> >> -_1: int * [1B, -1B] >> +_1: int * [1B, 4294967295B] > > I wonder why 32-bit targets at displaying 4294967295 instead of -1. Or are > pointers 64-bits here? No, it's a pure 32-bit target. The compiler is 32-bit, too, but bi-arch (32 and 64-bit). Rainer -- - Rainer Orth, Center for Biotechnology, Bielefeld University
Re: [patch] canonicalize unsigned [1,MAX] ranges into ~[0,0]
On 10/15/19 7:58 AM, Rainer Orth wrote: Hi Aldy, ~[0,0] has been the accepted way for a long time, I'd really prefer to keep that (for now). It has. Very true. But I don't necessarily think that means we should be introducing even more of 'em. [...] Happily, normalizing into ~0 for signed and [1,MAX] for unsigned, simplified the patch because it on longer needs tweaks to ranges_from_anti_range. OK for trunk? the new testcase FAILs on several (all?) 32-bit targets: +FAIL: gcc.dg/tree-ssa/evrp4.c scan-tree-dump evrp "[1B, -1B]" That's unfortunate. Is this the only test that is failing? I'm seeing this on 32-bit i386-pc-solaris2.11 and sparc-sun-solaris2.11, with more reports for armv8l, pru, and s390x. Comparing the dumps between 64 and 32-bit, I see -_1: int * [1B, -1B] +_1: int * [1B, 4294967295B] I wonder why 32-bit targets at displaying 4294967295 instead of -1. Or are pointers 64-bits here? I wonder if we should just change value_range_base::dump() to dump ~[0,0] for ::nonzero_p(), that way we have a consistent way of *displaying* non-zeroness for tests. Aldy
Re: [patch] canonicalize unsigned [1,MAX] ranges into ~[0,0]
Hi Aldy, >>> ~[0,0] has been the accepted way for a long time, I'd really prefer to >>> keep that (for now). >> It has. Very true. But I don't necessarily think that means we should >> be introducing even more of 'em. [...] > Happily, normalizing into ~0 for signed and [1,MAX] for unsigned, > simplified the patch because it on longer needs tweaks to > ranges_from_anti_range. > > OK for trunk? the new testcase FAILs on several (all?) 32-bit targets: +FAIL: gcc.dg/tree-ssa/evrp4.c scan-tree-dump evrp "[1B, -1B]" I'm seeing this on 32-bit i386-pc-solaris2.11 and sparc-sun-solaris2.11, with more reports for armv8l, pru, and s390x. Comparing the dumps between 64 and 32-bit, I see -_1: int * [1B, -1B] +_1: int * [1B, 4294967295B] Rainer -- - Rainer Orth, Center for Biotechnology, Bielefeld University
Re: [patch] canonicalize unsigned [1,MAX] ranges into ~[0,0]
On 10/7/19 6:28 AM, Aldy Hernandez wrote: > > Fair enough. I guess I don't care what we settle on, inasmuch as we > canonicalize into one true value. For some reason, I thought the above > nonzero would cause you to cringe, I guess not :). :-) Takes more than that these days.. And just to reiterate for the larger audience what we discussed in IRC -- I'm absolutely for canonicalization. We're just debating what the canonical form should be and how to get there. > > Happily, normalizing into ~0 for signed and [1,MAX] for unsigned, > simplified the patch because it on longer needs tweaks to > ranges_from_anti_range. Looks like it was ultimately far smaller than I expected. Unless we haven't audited existing code looking for open-coded ~[0,0] tests. > > OK for trunk? With a ChangeLog, yes. > > Aldy > > p.s. This still leaves open the issue with ipa_vr's handling of > nonzero_p. As per my last message, I've adjusted it to check for either > ~[0,0] or the [1,MAX] variant for unsigned, since before this patch > there were two ways of representing the same thing. However, ipa-prop > has no API (it open codes everything), as it has rolled its own version > of "value ranges" with wide-ints. > > class GTY(()) ipa_vr > { > public: > /* The data fields below are valid only if known is true. */ > bool known; > enum value_range_kind type; > wide_int min; > wide_int max; > bool nonzero_p (tree) const; > }; > > I'm tempted to leave the nonzero_p which checks for both ~0 and [1,MAX]: > > bool > ipa_vr::nonzero_p (tree expr_type) const > { > if (type == VR_ANTI_RANGE && wi::eq_p (min, 0) && wi::eq_p (max, 0)) > return true; > > unsigned prec = TYPE_PRECISION (expr_type); > return (type == VR_RANGE > && TYPE_UNSIGNED (expr_type) > && wi::eq_p (min, wi::one (prec)) > && wi::eq_p (max, wi::max_value (prec, TYPE_SIGN (expr_type; > } > > I don't care either way. Sigh. I can live with either here as well. I'm hesitant to start mucking around with it simply because it's well outside of where we've got expertise. jeff
Re: [patch] canonicalize unsigned [1,MAX] ranges into ~[0,0]
On 10/4/19 1:17 PM, Jeff Law wrote: On 10/4/19 10:14 AM, Aldy Hernandez wrote: On 10/4/19 12:02 PM, Jeff Law wrote: On 10/4/19 9:49 AM, Aldy Hernandez wrote: On 10/4/19 11:38 AM, Jeff Law wrote: On 10/4/19 6:59 AM, Aldy Hernandez wrote: When I did the value_range canonicalization work, I noticed that an unsigned [1,MAX] and an ~[0,0] could be two different representations for the same thing. I didn't address the problem then because callers to ranges_from_anti_range() would go into an infinite loop trying to extract ~[0,0] into [1,MAX] and []. We had a lot of callers to ranges_from_anti_range, and it smelled like a rat's nest, so I bailed. Now that we have one main caller (from the symbolic PLUS/MINUS handling), it's a lot easier to contain. Well, singleton_p also calls it, but it's already handling nonzero specially, so it wouldn't be affected. With some upcoming cleanups I'm about to post, the fact that [1,MAX] and ~[0,0] are equal_p(), but not nonzero_p(), matters. Plus, it's just good form to have one representation, giving us the ability to pick at nonzero_p ranges with ease. The code in extract_range_from_plus_minus_expr() continues to be a mess (as it has always been), but at least it's contained, and with this patch, it's slightly smaller. Note, I'm avoiding adding a comment header for functions with highly descriptive obvious names. OK? Aldy canonicalize-nonzero-ranges.patch commit 1c333730deeb4ddadc46ad6d12d5344f92c0352c Author: Aldy Hernandez Date: Fri Oct 4 08:51:25 2019 +0200 Canonicalize UNSIGNED [1,MAX] into ~[0,0]. Adapt PLUS/MINUS symbolic handling, so it doesn't call ranges_from_anti_range with a VR_ANTI_RANGE containing one sub-range. diff --git a/gcc/ChangeLog b/gcc/ChangeLog index 6e4f145af46..3934b41fdf9 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,3 +1,18 @@ +2019-10-04 Aldy Hernandez + + * tree-vrp.c (value_range_base::singleton_p): Use num_pairs + instead of calling vrp_val_is_*. + (value_range_base::set): Canonicalize unsigned [1,MAX] into + non-zero. + (range_has_numeric_bounds_p): New. + (range_int_cst_p): Use range_has_numeric_bounds_p. + (ranges_from_anti_range): Assert that we won't recurse + indefinitely. + (extract_extremes_from_range): New. + (extract_range_from_plus_minus_expr): Adapt so we don't call + ranges_from_anti_range with an anti-range containing only one + sub-range. So no problem with the implementation, but I do have a higher level question. One of the goals of the representation side of the Ranger project is to drop anti-ranges. Canonicalizing [1, MAX] to ~[0,0] seems to be going in the opposite direction. So do we really want to canonicalize to ~[0,0]? Hmmm, Andrew had the same question. It really doesn't matter what we canonicalize too, as long as we're consistent, but there are a bunch of non-zero tests throughout that were checking for the ~[0,0] construct, and I didn't want to rock the boat too much. Although in all honesty, most of those should already be converted to the ::nonzero_p() API. However, if we canonicalize into [1,MAX] for unsigned, we have the problem that a signed non-zero will still be ~[0,0], so our ::nonzero_p code will have to check two different representations, not to mention it will now have to check TYPE_UNSIGNED(type). ISTM that the right thing to do is to first move to the ::nonzero_p API, which should be a behavior preserving change. It'd probably be a medium sized change, but highly mechanical and behavior preserving, so easy to review. Ughh, please no? This was just a means to get the general range_fold* cleanups which are important and make everything easier to read. I'd rather not get distracted by having to audit all the nonzero checking throughout. Doesn't the audit just have to look for an open-coded check for ~[0,0] and convert any to use the API? I don't think we have *that* many anti-range bits I wouldn't think this would be terrible. What am I missing here that makes this so painful? I think I'm just suffering from PTSD from anything VRP related. If you fix something correctly, 20 things are bound to break because they were expecting incorrect behavior. Besides, we can't get away from anti-ranges inasmuch as we have value_range_base, which hopefully we can move away from in a future release. So this will eventually become a non-issue. At that point, we'll loose ALL anti-ranges once and for all. Even if we can't get away from them, introducing more, particularly canonicalizing on a form using anti-ranges just seems like we're going backwards. If we funnel everything through the established API, then changing the canonical form becomes trivial because it's isolated to just two places. The test ::nonzero_p method and the canonicalization point. ~[0,0] has been the accepted way for a long time, I'd really prefer to keep that (for now). It has. Very true. But I don't
Re: [patch] canonicalize unsigned [1,MAX] ranges into ~[0,0]
On 10/4/19 10:14 AM, Aldy Hernandez wrote: > > > On 10/4/19 12:02 PM, Jeff Law wrote: >> On 10/4/19 9:49 AM, Aldy Hernandez wrote: >>> >>> >>> On 10/4/19 11:38 AM, Jeff Law wrote: On 10/4/19 6:59 AM, Aldy Hernandez wrote: > When I did the value_range canonicalization work, I noticed that an > unsigned [1,MAX] and an ~[0,0] could be two different representations > for the same thing. I didn't address the problem then because callers > to ranges_from_anti_range() would go into an infinite loop trying to > extract ~[0,0] into [1,MAX] and []. We had a lot of callers to > ranges_from_anti_range, and it smelled like a rat's nest, so I bailed. > > Now that we have one main caller (from the symbolic PLUS/MINUS > handling), it's a lot easier to contain. Well, singleton_p also calls > it, but it's already handling nonzero specially, so it wouldn't be > affected. > > > > > With some upcoming cleanups I'm about to post, the fact that > [1,MAX] and > ~[0,0] are equal_p(), but not nonzero_p(), matters. Plus, it's just > good form to have one representation, giving us the ability to pick at > nonzero_p ranges with ease. > > The code in extract_range_from_plus_minus_expr() continues to be a > mess > (as it has always been), but at least it's contained, and with this > patch, it's slightly smaller. > > Note, I'm avoiding adding a comment header for functions with highly > descriptive obvious names. > > OK? > > Aldy > > canonicalize-nonzero-ranges.patch > > commit 1c333730deeb4ddadc46ad6d12d5344f92c0352c > Author: Aldy Hernandez > Date: Fri Oct 4 08:51:25 2019 +0200 > > Canonicalize UNSIGNED [1,MAX] into ~[0,0]. > Adapt PLUS/MINUS symbolic handling, so it doesn't call > ranges_from_anti_range with a VR_ANTI_RANGE containing one > sub-range. > > diff --git a/gcc/ChangeLog b/gcc/ChangeLog > index 6e4f145af46..3934b41fdf9 100644 > --- a/gcc/ChangeLog > +++ b/gcc/ChangeLog > @@ -1,3 +1,18 @@ > +2019-10-04 Aldy Hernandez > + > + * tree-vrp.c (value_range_base::singleton_p): Use num_pairs > + instead of calling vrp_val_is_*. > + (value_range_base::set): Canonicalize unsigned [1,MAX] into > + non-zero. > + (range_has_numeric_bounds_p): New. > + (range_int_cst_p): Use range_has_numeric_bounds_p. > + (ranges_from_anti_range): Assert that we won't recurse > + indefinitely. > + (extract_extremes_from_range): New. > + (extract_range_from_plus_minus_expr): Adapt so we don't call > + ranges_from_anti_range with an anti-range containing only one > + sub-range. So no problem with the implementation, but I do have a higher level question. One of the goals of the representation side of the Ranger project is to drop anti-ranges. Canonicalizing [1, MAX] to ~[0,0] seems to be going in the opposite direction. So do we really want to canonicalize to ~[0,0]? >>> >>> Hmmm, Andrew had the same question. >>> >>> It really doesn't matter what we canonicalize too, as long as we're >>> consistent, but there are a bunch of non-zero tests throughout that were >>> checking for the ~[0,0] construct, and I didn't want to rock the boat >>> too much. Although in all honesty, most of those should already be >>> converted to the ::nonzero_p() API. >>> >>> However, if we canonicalize into [1,MAX] for unsigned, we have the >>> problem that a signed non-zero will still be ~[0,0], so our ::nonzero_p >>> code will have to check two different representations, not to mention it >>> will now have to check TYPE_UNSIGNED(type). >> ISTM that the right thing to do is to first move to the ::nonzero_p API, >> which should be a behavior preserving change. It'd probably be a medium >> sized change, but highly mechanical and behavior preserving, so easy to >> review. > > Ughh, please no? This was just a means to get the general range_fold* > cleanups which are important and make everything easier to read. I'd > rather not get distracted by having to audit all the nonzero checking > throughout. Doesn't the audit just have to look for an open-coded check for ~[0,0] and convert any to use the API? I don't think we have *that* many anti-range bits I wouldn't think this would be terrible. What am I missing here that makes this so painful? > > Besides, we can't get away from anti-ranges inasmuch as we have > value_range_base, which hopefully we can move away from in a future > release. So this will eventually become a non-issue. At that point, > we'll loose ALL anti-ranges once and for all. Even if we can't get away from them, introducing more, particularly canonicalizing on a form using anti-ranges just seems like we're going backwards. If we funnel everything through the established API, then changing the
Re: [patch] canonicalize unsigned [1,MAX] ranges into ~[0,0]
On October 4, 2019 5:38:09 PM GMT+02:00, Jeff Law wrote: >On 10/4/19 6:59 AM, Aldy Hernandez wrote: >> When I did the value_range canonicalization work, I noticed that an >> unsigned [1,MAX] and an ~[0,0] could be two different representations >> for the same thing. I didn't address the problem then because >callers >> to ranges_from_anti_range() would go into an infinite loop trying to >> extract ~[0,0] into [1,MAX] and []. We had a lot of callers to >> >ranges_from_anti_range, and it smelled like a rat's nest, so I bailed. >> >> Now that we have one main caller (from the symbolic PLUS/MINUS >> handling), it's a lot easier to contain. Well, singleton_p also >calls >> >it, but it's already handling nonzero specially, so it wouldn't be affected. >> >> >> With some upcoming cleanups I'm about to post, the fact that [1,MAX] >and >> ~[0,0] are equal_p(), but not nonzero_p(), matters. Plus, it's just >> good form to have one representation, giving us the ability to pick >at >> nonzero_p ranges with ease. >> >> The code in extract_range_from_plus_minus_expr() continues to be a >mess >> (as it has always been), but at least it's contained, and with this >> patch, it's slightly smaller. >> >> Note, I'm avoiding adding a comment header for functions with highly >> descriptive obvious names. >> >> OK? >> >> Aldy >> >> canonicalize-nonzero-ranges.patch >> >> commit 1c333730deeb4ddadc46ad6d12d5344f92c0352c >> Author: Aldy Hernandez >> Date: Fri Oct 4 08:51:25 2019 +0200 >> >> Canonicalize UNSIGNED [1,MAX] into ~[0,0]. >> >> Adapt PLUS/MINUS symbolic handling, so it doesn't call >> ranges_from_anti_range with a VR_ANTI_RANGE containing one >sub-range. >> >> diff --git a/gcc/ChangeLog b/gcc/ChangeLog >> index 6e4f145af46..3934b41fdf9 100644 >> --- a/gcc/ChangeLog >> +++ b/gcc/ChangeLog >> @@ -1,3 +1,18 @@ >> +2019-10-04 Aldy Hernandez >> + >> +* tree-vrp.c (value_range_base::singleton_p): Use num_pairs >> +instead of calling vrp_val_is_*. >> +(value_range_base::set): Canonicalize unsigned [1,MAX] into >> +non-zero. >> +(range_has_numeric_bounds_p): New. >> +(range_int_cst_p): Use range_has_numeric_bounds_p. >> +(ranges_from_anti_range): Assert that we won't recurse >> +indefinitely. >> +(extract_extremes_from_range): New. >> +(extract_range_from_plus_minus_expr): Adapt so we don't call >> +ranges_from_anti_range with an anti-range containing only one >> +sub-range. >So no problem with the implementation, but I do have a higher level >question. > >One of the goals of the representation side of the Ranger project is to >drop anti-ranges. Canonicalizing [1, MAX] to ~[0,0] seems to be going >in the opposite direction. So do we really want to canonicalize to >~[0,0]? No, we don't. Richard. >jeff
Re: [patch] canonicalize unsigned [1,MAX] ranges into ~[0,0]
On 10/4/19 12:02 PM, Jeff Law wrote: On 10/4/19 9:49 AM, Aldy Hernandez wrote: On 10/4/19 11:38 AM, Jeff Law wrote: On 10/4/19 6:59 AM, Aldy Hernandez wrote: When I did the value_range canonicalization work, I noticed that an unsigned [1,MAX] and an ~[0,0] could be two different representations for the same thing. I didn't address the problem then because callers to ranges_from_anti_range() would go into an infinite loop trying to extract ~[0,0] into [1,MAX] and []. We had a lot of callers to ranges_from_anti_range, and it smelled like a rat's nest, so I bailed. Now that we have one main caller (from the symbolic PLUS/MINUS handling), it's a lot easier to contain. Well, singleton_p also calls it, but it's already handling nonzero specially, so it wouldn't be affected. With some upcoming cleanups I'm about to post, the fact that [1,MAX] and ~[0,0] are equal_p(), but not nonzero_p(), matters. Plus, it's just good form to have one representation, giving us the ability to pick at nonzero_p ranges with ease. The code in extract_range_from_plus_minus_expr() continues to be a mess (as it has always been), but at least it's contained, and with this patch, it's slightly smaller. Note, I'm avoiding adding a comment header for functions with highly descriptive obvious names. OK? Aldy canonicalize-nonzero-ranges.patch commit 1c333730deeb4ddadc46ad6d12d5344f92c0352c Author: Aldy Hernandez Date: Fri Oct 4 08:51:25 2019 +0200 Canonicalize UNSIGNED [1,MAX] into ~[0,0]. Adapt PLUS/MINUS symbolic handling, so it doesn't call ranges_from_anti_range with a VR_ANTI_RANGE containing one sub-range. diff --git a/gcc/ChangeLog b/gcc/ChangeLog index 6e4f145af46..3934b41fdf9 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,3 +1,18 @@ +2019-10-04 Aldy Hernandez + + * tree-vrp.c (value_range_base::singleton_p): Use num_pairs + instead of calling vrp_val_is_*. + (value_range_base::set): Canonicalize unsigned [1,MAX] into + non-zero. + (range_has_numeric_bounds_p): New. + (range_int_cst_p): Use range_has_numeric_bounds_p. + (ranges_from_anti_range): Assert that we won't recurse + indefinitely. + (extract_extremes_from_range): New. + (extract_range_from_plus_minus_expr): Adapt so we don't call + ranges_from_anti_range with an anti-range containing only one + sub-range. So no problem with the implementation, but I do have a higher level question. One of the goals of the representation side of the Ranger project is to drop anti-ranges. Canonicalizing [1, MAX] to ~[0,0] seems to be going in the opposite direction. So do we really want to canonicalize to ~[0,0]? Hmmm, Andrew had the same question. It really doesn't matter what we canonicalize too, as long as we're consistent, but there are a bunch of non-zero tests throughout that were checking for the ~[0,0] construct, and I didn't want to rock the boat too much. Although in all honesty, most of those should already be converted to the ::nonzero_p() API. However, if we canonicalize into [1,MAX] for unsigned, we have the problem that a signed non-zero will still be ~[0,0], so our ::nonzero_p code will have to check two different representations, not to mention it will now have to check TYPE_UNSIGNED(type). ISTM that the right thing to do is to first move to the ::nonzero_p API, which should be a behavior preserving change. It'd probably be a medium sized change, but highly mechanical and behavior preserving, so easy to review. Ughh, please no? This was just a means to get the general range_fold* cleanups which are important and make everything easier to read. I'd rather not get distracted by having to audit all the nonzero checking throughout. Besides, we can't get away from anti-ranges inasmuch as we have value_range_base, which hopefully we can move away from in a future release. So this will eventually become a non-issue. At that point, we'll loose ALL anti-ranges once and for all. ~[0,0] has been the accepted way for a long time, I'd really prefer to keep that (for now). And really, this is just plain ugly: bool value_range_base::nonzero_p () { if (m_kind == VR_ANTI_RANGE && !TYPE_UNSIGNED (type ()) && integer_zerop (m_min) && integer_zerop (m_max)) return true; return (m_kind == VR_RANGE && TYPE_UNSIGNED (type ()) && integer_onep (m_min) && vrp_val_is_max (m_max)); } Aldy Once that's in place, then I'd seriously look at [1,MAX] as the canonical form for unsigned types and [MIN, -1] [1, MAX] as the canonical form for signed types. If we're trying to get away from ANTI ranges, canonicalizing on ~[0,0] just seems wrong. Where things may get painful is things like [MIN, -3] [3, MAX] which occur due to arithmetic and pointer alignments. I think we have a hack somewhere which special cases this into [MIN, -1], [1, MAX] even though it's technically less precise. jeff
Re: [patch] canonicalize unsigned [1,MAX] ranges into ~[0,0]
On 10/4/19 9:49 AM, Aldy Hernandez wrote: > > > On 10/4/19 11:38 AM, Jeff Law wrote: >> On 10/4/19 6:59 AM, Aldy Hernandez wrote: >>> When I did the value_range canonicalization work, I noticed that an >>> unsigned [1,MAX] and an ~[0,0] could be two different representations >>> for the same thing. I didn't address the problem then because callers >>> to ranges_from_anti_range() would go into an infinite loop trying to >>> extract ~[0,0] into [1,MAX] and []. We had a lot of callers to >>> ranges_from_anti_range, and it smelled like a rat's nest, so I bailed. >>> >>> Now that we have one main caller (from the symbolic PLUS/MINUS >>> handling), it's a lot easier to contain. Well, singleton_p also calls >>> it, but it's already handling nonzero specially, so it wouldn't be affected. >>> >>> >>> >>> With some upcoming cleanups I'm about to post, the fact that [1,MAX] and >>> ~[0,0] are equal_p(), but not nonzero_p(), matters. Plus, it's just >>> good form to have one representation, giving us the ability to pick at >>> nonzero_p ranges with ease. >>> >>> The code in extract_range_from_plus_minus_expr() continues to be a mess >>> (as it has always been), but at least it's contained, and with this >>> patch, it's slightly smaller. >>> >>> Note, I'm avoiding adding a comment header for functions with highly >>> descriptive obvious names. >>> >>> OK? >>> >>> Aldy >>> >>> canonicalize-nonzero-ranges.patch >>> >>> commit 1c333730deeb4ddadc46ad6d12d5344f92c0352c >>> Author: Aldy Hernandez >>> Date: Fri Oct 4 08:51:25 2019 +0200 >>> >>> Canonicalize UNSIGNED [1,MAX] into ~[0,0]. >>> Adapt PLUS/MINUS symbolic handling, so it doesn't call >>> ranges_from_anti_range with a VR_ANTI_RANGE containing one >>> sub-range. >>> >>> diff --git a/gcc/ChangeLog b/gcc/ChangeLog >>> index 6e4f145af46..3934b41fdf9 100644 >>> --- a/gcc/ChangeLog >>> +++ b/gcc/ChangeLog >>> @@ -1,3 +1,18 @@ >>> +2019-10-04 Aldy Hernandez >>> + >>> + * tree-vrp.c (value_range_base::singleton_p): Use num_pairs >>> + instead of calling vrp_val_is_*. >>> + (value_range_base::set): Canonicalize unsigned [1,MAX] into >>> + non-zero. >>> + (range_has_numeric_bounds_p): New. >>> + (range_int_cst_p): Use range_has_numeric_bounds_p. >>> + (ranges_from_anti_range): Assert that we won't recurse >>> + indefinitely. >>> + (extract_extremes_from_range): New. >>> + (extract_range_from_plus_minus_expr): Adapt so we don't call >>> + ranges_from_anti_range with an anti-range containing only one >>> + sub-range. >> So no problem with the implementation, but I do have a higher level >> question. >> >> One of the goals of the representation side of the Ranger project is to >> drop anti-ranges. Canonicalizing [1, MAX] to ~[0,0] seems to be going >> in the opposite direction. So do we really want to canonicalize to >> ~[0,0]? > > Hmmm, Andrew had the same question. > > It really doesn't matter what we canonicalize too, as long as we're > consistent, but there are a bunch of non-zero tests throughout that were > checking for the ~[0,0] construct, and I didn't want to rock the boat > too much. Although in all honesty, most of those should already be > converted to the ::nonzero_p() API. > > However, if we canonicalize into [1,MAX] for unsigned, we have the > problem that a signed non-zero will still be ~[0,0], so our ::nonzero_p > code will have to check two different representations, not to mention it > will now have to check TYPE_UNSIGNED(type). ISTM that the right thing to do is to first move to the ::nonzero_p API, which should be a behavior preserving change. It'd probably be a medium sized change, but highly mechanical and behavior preserving, so easy to review. Once that's in place, then I'd seriously look at [1,MAX] as the canonical form for unsigned types and [MIN, -1] [1, MAX] as the canonical form for signed types. If we're trying to get away from ANTI ranges, canonicalizing on ~[0,0] just seems wrong. Where things may get painful is things like [MIN, -3] [3, MAX] which occur due to arithmetic and pointer alignments. I think we have a hack somewhere which special cases this into [MIN, -1], [1, MAX] even though it's technically less precise. jeff
Re: [patch] canonicalize unsigned [1,MAX] ranges into ~[0,0]
On 10/4/19 11:38 AM, Jeff Law wrote: On 10/4/19 6:59 AM, Aldy Hernandez wrote: When I did the value_range canonicalization work, I noticed that an unsigned [1,MAX] and an ~[0,0] could be two different representations for the same thing. I didn't address the problem then because callers to ranges_from_anti_range() would go into an infinite loop trying to extract ~[0,0] into [1,MAX] and []. We had a lot of callers to ranges_from_anti_range, and it smelled like a rat's nest, so I bailed. Now that we have one main caller (from the symbolic PLUS/MINUS handling), it's a lot easier to contain. Well, singleton_p also calls it, but it's already handling nonzero specially, so it wouldn't be affected. With some upcoming cleanups I'm about to post, the fact that [1,MAX] and ~[0,0] are equal_p(), but not nonzero_p(), matters. Plus, it's just good form to have one representation, giving us the ability to pick at nonzero_p ranges with ease. The code in extract_range_from_plus_minus_expr() continues to be a mess (as it has always been), but at least it's contained, and with this patch, it's slightly smaller. Note, I'm avoiding adding a comment header for functions with highly descriptive obvious names. OK? Aldy canonicalize-nonzero-ranges.patch commit 1c333730deeb4ddadc46ad6d12d5344f92c0352c Author: Aldy Hernandez Date: Fri Oct 4 08:51:25 2019 +0200 Canonicalize UNSIGNED [1,MAX] into ~[0,0]. Adapt PLUS/MINUS symbolic handling, so it doesn't call ranges_from_anti_range with a VR_ANTI_RANGE containing one sub-range. diff --git a/gcc/ChangeLog b/gcc/ChangeLog index 6e4f145af46..3934b41fdf9 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,3 +1,18 @@ +2019-10-04 Aldy Hernandez + + * tree-vrp.c (value_range_base::singleton_p): Use num_pairs + instead of calling vrp_val_is_*. + (value_range_base::set): Canonicalize unsigned [1,MAX] into + non-zero. + (range_has_numeric_bounds_p): New. + (range_int_cst_p): Use range_has_numeric_bounds_p. + (ranges_from_anti_range): Assert that we won't recurse + indefinitely. + (extract_extremes_from_range): New. + (extract_range_from_plus_minus_expr): Adapt so we don't call + ranges_from_anti_range with an anti-range containing only one + sub-range. So no problem with the implementation, but I do have a higher level question. One of the goals of the representation side of the Ranger project is to drop anti-ranges. Canonicalizing [1, MAX] to ~[0,0] seems to be going in the opposite direction. So do we really want to canonicalize to ~[0,0]? Hmmm, Andrew had the same question. It really doesn't matter what we canonicalize too, as long as we're consistent, but there are a bunch of non-zero tests throughout that were checking for the ~[0,0] construct, and I didn't want to rock the boat too much. Although in all honesty, most of those should already be converted to the ::nonzero_p() API. However, if we canonicalize into [1,MAX] for unsigned, we have the problem that a signed non-zero will still be ~[0,0], so our ::nonzero_p code will have to check two different representations, not to mention it will now have to check TYPE_UNSIGNED(type). Aldy
Re: [patch] canonicalize unsigned [1,MAX] ranges into ~[0,0]
On 10/4/19 6:59 AM, Aldy Hernandez wrote: > When I did the value_range canonicalization work, I noticed that an > unsigned [1,MAX] and an ~[0,0] could be two different representations > for the same thing. I didn't address the problem then because callers > to ranges_from_anti_range() would go into an infinite loop trying to > extract ~[0,0] into [1,MAX] and []. We had a lot of callers to > ranges_from_anti_range, and it smelled like a rat's nest, so I bailed. > > Now that we have one main caller (from the symbolic PLUS/MINUS > handling), it's a lot easier to contain. Well, singleton_p also calls > it, but it's already handling nonzero specially, so it wouldn't be affected. > > > With some upcoming cleanups I'm about to post, the fact that [1,MAX] and > ~[0,0] are equal_p(), but not nonzero_p(), matters. Plus, it's just > good form to have one representation, giving us the ability to pick at > nonzero_p ranges with ease. > > The code in extract_range_from_plus_minus_expr() continues to be a mess > (as it has always been), but at least it's contained, and with this > patch, it's slightly smaller. > > Note, I'm avoiding adding a comment header for functions with highly > descriptive obvious names. > > OK? > > Aldy > > canonicalize-nonzero-ranges.patch > > commit 1c333730deeb4ddadc46ad6d12d5344f92c0352c > Author: Aldy Hernandez > Date: Fri Oct 4 08:51:25 2019 +0200 > > Canonicalize UNSIGNED [1,MAX] into ~[0,0]. > > Adapt PLUS/MINUS symbolic handling, so it doesn't call > ranges_from_anti_range with a VR_ANTI_RANGE containing one sub-range. > > diff --git a/gcc/ChangeLog b/gcc/ChangeLog > index 6e4f145af46..3934b41fdf9 100644 > --- a/gcc/ChangeLog > +++ b/gcc/ChangeLog > @@ -1,3 +1,18 @@ > +2019-10-04 Aldy Hernandez > + > + * tree-vrp.c (value_range_base::singleton_p): Use num_pairs > + instead of calling vrp_val_is_*. > + (value_range_base::set): Canonicalize unsigned [1,MAX] into > + non-zero. > + (range_has_numeric_bounds_p): New. > + (range_int_cst_p): Use range_has_numeric_bounds_p. > + (ranges_from_anti_range): Assert that we won't recurse > + indefinitely. > + (extract_extremes_from_range): New. > + (extract_range_from_plus_minus_expr): Adapt so we don't call > + ranges_from_anti_range with an anti-range containing only one > + sub-range. So no problem with the implementation, but I do have a higher level question. One of the goals of the representation side of the Ranger project is to drop anti-ranges. Canonicalizing [1, MAX] to ~[0,0] seems to be going in the opposite direction. So do we really want to canonicalize to ~[0,0]? jeff
[patch] canonicalize unsigned [1,MAX] ranges into ~[0,0]
When I did the value_range canonicalization work, I noticed that an unsigned [1,MAX] and an ~[0,0] could be two different representations for the same thing. I didn't address the problem then because callers to ranges_from_anti_range() would go into an infinite loop trying to extract ~[0,0] into [1,MAX] and []. We had a lot of callers to ranges_from_anti_range, and it smelled like a rat's nest, so I bailed. Now that we have one main caller (from the symbolic PLUS/MINUS handling), it's a lot easier to contain. Well, singleton_p also calls it, but it's already handling nonzero specially, so it wouldn't be affected. With some upcoming cleanups I'm about to post, the fact that [1,MAX] and ~[0,0] are equal_p(), but not nonzero_p(), matters. Plus, it's just good form to have one representation, giving us the ability to pick at nonzero_p ranges with ease. The code in extract_range_from_plus_minus_expr() continues to be a mess (as it has always been), but at least it's contained, and with this patch, it's slightly smaller. Note, I'm avoiding adding a comment header for functions with highly descriptive obvious names. OK? Aldy commit 1c333730deeb4ddadc46ad6d12d5344f92c0352c Author: Aldy Hernandez Date: Fri Oct 4 08:51:25 2019 +0200 Canonicalize UNSIGNED [1,MAX] into ~[0,0]. Adapt PLUS/MINUS symbolic handling, so it doesn't call ranges_from_anti_range with a VR_ANTI_RANGE containing one sub-range. diff --git a/gcc/ChangeLog b/gcc/ChangeLog index 6e4f145af46..3934b41fdf9 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,3 +1,18 @@ +2019-10-04 Aldy Hernandez + + * tree-vrp.c (value_range_base::singleton_p): Use num_pairs + instead of calling vrp_val_is_*. + (value_range_base::set): Canonicalize unsigned [1,MAX] into + non-zero. + (range_has_numeric_bounds_p): New. + (range_int_cst_p): Use range_has_numeric_bounds_p. + (ranges_from_anti_range): Assert that we won't recurse + indefinitely. + (extract_extremes_from_range): New. + (extract_range_from_plus_minus_expr): Adapt so we don't call + ranges_from_anti_range with an anti-range containing only one + sub-range. + 2019-10-04 Aldy Hernandez (value_range_from_overflowed_bounds): Rename from diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c index a2ab4a21925..97dd2b7a734 100644 --- a/gcc/tree-vrp.c +++ b/gcc/tree-vrp.c @@ -379,10 +379,7 @@ value_range_base::singleton_p (tree *result) const } return false; } - - /* An anti-range that includes an extreme, is just a range with - one sub-range. Use the one sub-range. */ - if (vrp_val_is_min (m_min, true) || vrp_val_is_max (m_max, true)) + if (num_pairs () == 1) { value_range_base vr0, vr1; ranges_from_anti_range (this, , , true); @@ -843,6 +840,17 @@ value_range_base::set (enum value_range_kind kind, tree min, tree max) return; } + /* Unsigned [1, MAX] is canonicalized as ~[0, 0]. */ + if (kind == VR_RANGE && TYPE_UNSIGNED (type) + && prec > 1 + && wi::eq_p (wi::to_wide (min), wi::one (prec)) + && wi::eq_p (wi::to_wide (max), wi::max_value (prec, sign))) +{ + m_kind = VR_ANTI_RANGE; + m_min = m_max = build_int_cst (type, 0); + return; +} + /* Do not drop [-INF(OVF), +INF(OVF)] to varying. (OVF) has to be sticky to make sure VRP iteration terminates, otherwise we can get into oscillations. */ @@ -913,15 +921,21 @@ vrp_bitmap_equal_p (const_bitmap b1, const_bitmap b2) && bitmap_equal_p (b1, b2))); } +static bool +range_has_numeric_bounds_p (const value_range_base *vr) +{ + return (vr->min () + && TREE_CODE (vr->min ()) == INTEGER_CST + && TREE_CODE (vr->max ()) == INTEGER_CST); +} + /* Return true if max and min of VR are INTEGER_CST. It's not necessary a singleton. */ bool range_int_cst_p (const value_range_base *vr) { - return (vr->kind () == VR_RANGE - && TREE_CODE (vr->min ()) == INTEGER_CST - && TREE_CODE (vr->max ()) == INTEGER_CST); + return (vr->kind () == VR_RANGE && range_has_numeric_bounds_p (vr)); } /* Return true if VR is a INTEGER_CST singleton. */ @@ -1327,6 +1341,10 @@ ranges_from_anti_range (const value_range_base *ar, value_range_base *vr0, value_range_base *vr1, bool handle_pointers) { + /* An unsigned ~[0,0] cannot be split into [1,MAX] because it gets + normalized back into ~[0,0]. Avoid infinite loop. */ + gcc_checking_assert (!ar->nonzero_p () || !TYPE_UNSIGNED (ar->type ())); + tree type = ar->type (); vr0->set_undefined (); @@ -1582,6 +1600,22 @@ extract_range_from_pointer_plus_expr (value_range_base *vr, vr->set_varying (expr_type); } +static void +extract_extremes_from_range (const value_range_base *vr, tree *min, tree *max) +{ + if (range_has_numeric_bounds_p (vr)) +{ + *min = wide_int_to_tree (vr->type (), vr->lower_bound ()); + *max = wide_int_to_tree (vr->type (), vr->upper_bound ()); +} + else +{ + gcc_checking_assert (vr->kind