Re: [PATCH] Tree-level fix for PR 69526
> Hmm, won't "(uint32_t + uint32_t-CST) doesn't overflow" be a sufficient
> condition for such a transformation?

Yes, in principle this should suffice.  What we're actually looking for is
something like a "proper" (or no) overflow, i.e. an overflow in both the min
and the max of the value range.  In (a + cst1) + cst2, an overflow of
(a + cst1) will still produce a valid range as long as
overflow(a_min + cst1) == overflow(a_max + cst1).  When the max overflows
but the min does not, we must not simplify.  Currently, I'm checking whether
the range of (a + cst1) is still a VR_RANGE, for now disregarding
ANTI_RANGEs, which could most likely be dealt with as well.

A major discussion point about my last try was that it wrongly always used
sign extension when extending cst1 to the outer type.  This is now changed
to use sign extension when (a + cst1) "properly" overflows, as in

  ASSUME (a > 0);
  (unsigned long)(a + UINT_MAX) + 1;

resulting in (unsigned long)(a) + (unsigned long)0, or to use the type sign
of the constant, as in

  ASSUME (a < 2);
  (unsigned long)(a + 4294967294u) + 10;

resulting in
(unsigned long)(a) + (unsigned long)((unsigned long)4294967294
				     + (unsigned long)10).

The additional flag from the last patch is not necessary.  The test suite
is clean on s390x and x86, and bootstrap is successful.

Regards
 Robin
Re: [PATCH] Tree-level fix for PR 69526
On Tue, Jan 17, 2017 at 9:48 AM, Richard Biener wrote:
> On Tue, Jan 10, 2017 at 2:32 PM, Robin Dapp wrote:
>> Perhaps I'm still missing how some cases are handled or not handled,
>> sorry for the noise.
>>
>>> I'm not sure there is anything to "interpret" -- the operation is
>>> unsigned and overflow is when the operation may wrap around zero.
>>> There might be clever ways of re-writing the expression to
>>> (uint64_t)((uint32_t)((int32_t)uint32 + -1) + 1)
>>> avoiding the overflow and thus allowing the transform but I'm not sure
>>> that's good.
>>
>> The extra work I introduced was to discern between
>>
>> (uint64_t)(a + UINT_MAX) + 1 -> (uint64_t)(a),
>> (uint64_t)(a + UINT_MAX) + 1 -> (uint64_t)(a) + (uint64_t)(UINT_MAX + 1),
>>
>> For a's range of [1,1] there is an overflow in both cases.
>> We still want to simplify the first case by combining UINT_MAX + 1 -> 0.
>
> So there's the case where a == 0 where (uint64_t)(0 + UINT_MAX) + 1 is
> not zero.
>
> I think we're still searching for the proper condition on when it is
> allowed to re-write
>
> (uint64_t)(uint32_t + uint32_t-CST) + uint64_t-CST
>
> to
>
> (uint64_t)(uint32_t) + (uint64_t)uint32_t-CST + uint64_t-CST

Hmm, won't "(uint32_t + uint32_t-CST) doesn't overflow" be a sufficient
condition for such a transformation?

> or to
>
> (uint64_t)(uint32_t) + (uint64_t)(uint32_t-CST + (uint32_t)uint64_t-CST)

We don't actually need to prove this?  What complicates this is when it's
safe to transform

(uint64_t)(uint32_t + uint32_t-CST) + uint64_t-CST

into

(uint64_t)(uint32_t + uint32_t-CSTx)

where uint32_t-CSTx = uint32_t-CST + (uint32_t)uint64_t-CST.

Am I misunderstanding the whole thing?

Thanks,
bin

>> If "interpreting" UINT_MAX as -1 is not the correct thing to do, perhaps
>> (uint64_t)((uint32_t)(UINT_MAX + 1)) is?  This fails, however, if the
>> outer constant is larger than UINT_MAX.  What else can we do here?
>> Do we see cases like the second one at all?  If it's not needed, the
>> extra work is likely not needed.
>
> We do have the need to associate and simplify across conversions but of
> course we have to do it conservatively correct.
>
>>> A related thing would be canonicalizing unsigned X plus CST to
>>> unsigned X minus CST' if CST' has a smaller absolute value than CST.
>>> I think currently we simply canonicalize to 'plus CST' but also only in
>>> fold-const.c, not in match.pd (ah we do, but only in a simplified
>>> manner).
>>
>> I can imagine this could simplify the treatment of some cases, yet I'm
>> already a bit lost with the current cases :)
>
> Yes, but I am lost a bit as well.  I don't know the correct conditions to
> test off-hand -- I know we may not introduce new undefined overflow ops
> and of course we need to not compute wrong numbers either.
>
>>> That said, can we leave that "trick" out of the patch?  I think your
>>> more complicated "overflows" result from
>>> extract_range_from_binary_expr_1 doesn't apply to all ops (like
>>> MULT_EXPR where more complicated cases can arise).
>>
>> There is certainly additional work to be done for MULT_EXPR, I
>> disregarded it so far.  For this patch, I'd rather conservatively assume
>> overflow.
>
> Ok...
>
> Richard.
>
>> Regards
>> Robin
Re: [PATCH] Tree-level fix for PR 69526
Ping.
Re: [PATCH] Tree-level fix for PR 69526
I skimmed through the code to see where transformations like
(a - 1) -> (a + UINT_MAX) are performed.  It seems there are only two
places: match.pd (/* A - B -> A + (-B) if B is easily negatable.  */) and
fold-const.c.

In order to reliably know whether to zero-extend or to sign-extend the
constant (1) in

  (ulong)(a - 1) + 1 -> (ulong)(a) + 0, or
                     -> (ulong)(a) + (ulong)(UINT_MAX + 1),

we'd have to know if the constant was converted by a transformation like
the above.  I first tried removing the two transformations altogether, but
this causes around 20 new regressions on s390x and I didn't go through all
of them to see whether they can be fixed.  Is there a rationale for
applying the transformations other than canonicalization?  I saw some
optimizations that only apply to PLUS_EXPRs, but those can easily be
changed to also include MINUS_EXPR.

My other attempt entails an additional flag, TREE_WAS_SIGNED, for lack of a
better naming idea.  It is set when the above transformation takes place.
I used (NODE)->base.deprecated_flag but there may be better choices.
Tentative/example patch attached for reference.  Using this, the type
extension in my original patch can be changed to

  w1 = w1.from (w1, TYPE_PRECISION (type),
		TREE_WAS_SIGNED (@1) ? SIGNED : TYPE_SIGN (inner_type));

which works for me and does not introduce regressions on s390x.  Is this a
viable approach?
Regards
 Robin

diff --git a/gcc/fold-const.c b/gcc/fold-const.c
index c649e54..cbb7469 100644
--- a/gcc/fold-const.c
+++ b/gcc/fold-const.c
@@ -9648,9 +9648,15 @@ fold_binary_loc (location_t loc,
 	  && (TREE_CODE (op1) != REAL_CST
 	      || REAL_VALUE_NEGATIVE (TREE_REAL_CST (op1))
 	      || INTEGRAL_TYPE_P (type)))
-	return fold_build2_loc (loc, PLUS_EXPR, type,
-				fold_convert_loc (loc, type, arg0),
-				negate_expr (op1));
+	{
+	  tree negtem = negate_expr (op1);
+	  if (CONSTANT_CLASS_P (negtem))
+	    TREE_WAS_SIGNED (negtem) = 1;
+	  tree tem = fold_build2_loc (loc, PLUS_EXPR, type,
+				      fold_convert_loc (loc, type, arg0),
+				      negtem);
+	  return tem;
+	}
 
       /* Fold &a[i] - &a[j] to i-j.  */
       if (TREE_CODE (arg0) == ADDR_EXPR
@@ -13620,6 +13626,7 @@ fold_negate_const (tree arg0, tree type)
 	t = force_fit_type (type, val, 1,
 			    (overflow | TREE_OVERFLOW (arg0))
 			    && !TYPE_UNSIGNED (type));
+	TREE_WAS_SIGNED (t) = TREE_WAS_SIGNED (arg0);
 	break;
       }
 
diff --git a/gcc/match.pd b/gcc/match.pd
index b3f2063..e791942 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -870,7 +870,11 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 (simplify
  (minus @0 negate_expr_p@1)
  (if (!FIXED_POINT_TYPE_P (type))
-  (plus @0 (negate @1))))
+  (with {
+     if (CONSTANT_CLASS_P (@1))
+       TREE_WAS_SIGNED (@1) = 1;
+   }
+   (plus @0 (negate @1)))))
 
 /* Try to fold (type) X op CST -> (type) (X op ((type-x) CST))
    when profitable.
@@ -1223,8 +1227,8 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 	/* Extend @1 to TYPE.  Perform sign extension if the range
 	   overflowed but did not split.  */
-	w1 = w1.from (w1, TYPE_PRECISION (type), ovf.ovf ? SIGNED :
-		      TYPE_SIGN (inner_type));
+	w1 = w1.from (w1, TYPE_PRECISION (type), TREE_WAS_SIGNED (@1)
+		      ? SIGNED : TYPE_SIGN (inner_type));
 	/* w1 = w1.from (w1, TYPE_PRECISION (type), TYPE_SIGN (inner_type)); */
 
diff --git a/gcc/tree.h b/gcc/tree.h
index 62cd7bb..1e7efd9 100644
--- a/gcc/tree.h
+++ b/gcc/tree.h
@@ -735,11 +735,18 @@ extern void omp_clause_range_check_failed (const_tree, const char *, int,
 #define TREE_OVERFLOW(NODE) (CST_CHECK (NODE)->base.public_flag)
 
+/* Foo.  */
+
+#define TREE_WAS_SIGNED(NODE) (CST_CHECK (NODE)->base.deprecated_flag)
+
 /* TREE_OVERFLOW can only be true for EXPR of CONSTANT_CLASS_P.  */
 #define TREE_OVERFLOW_P(EXPR) \
   (CONSTANT_CLASS_P (EXPR) && TREE_OVERFLOW (EXPR))
 
+#define TREE_WAS_SIGNED_P(EXPR) \
+  (CONSTANT_CLASS_P (EXPR) && TREE_WAS_SIGNED (EXPR))
+
 /* In a VAR_DECL, FUNCTION_DECL, NAMESPACE_DECL or TYPE_DECL, nonzero
    means name is to be accessible from outside this translation unit.
    In an IDENTIFIER_NODE, nonzero means an external declaration
Re: [PATCH] Tree-level fix for PR 69526
On Tue, Jan 10, 2017 at 2:32 PM, Robin Dapp wrote:
> Perhaps I'm still missing how some cases are handled or not handled,
> sorry for the noise.
>
>> I'm not sure there is anything to "interpret" -- the operation is
>> unsigned and overflow is when the operation may wrap around zero.
>> There might be clever ways of re-writing the expression to
>> (uint64_t)((uint32_t)((int32_t)uint32 + -1) + 1)
>> avoiding the overflow and thus allowing the transform but I'm not sure
>> that's good.
>
> The extra work I introduced was to discern between
>
> (uint64_t)(a + UINT_MAX) + 1 -> (uint64_t)(a),
> (uint64_t)(a + UINT_MAX) + 1 -> (uint64_t)(a) + (uint64_t)(UINT_MAX + 1),
>
> For a's range of [1,1] there is an overflow in both cases.
> We still want to simplify the first case by combining UINT_MAX + 1 -> 0.

So there's the case where a == 0 where (uint64_t)(0 + UINT_MAX) + 1 is not
zero.

I think we're still searching for the proper condition on when it is
allowed to re-write

  (uint64_t)(uint32_t + uint32_t-CST) + uint64_t-CST

to

  (uint64_t)(uint32_t) + (uint64_t)uint32_t-CST + uint64_t-CST

or to

  (uint64_t)(uint32_t) + (uint64_t)(uint32_t-CST + (uint32_t)uint64_t-CST)

> If "interpreting" UINT_MAX as -1 is not the correct thing to do, perhaps
> (uint64_t)((uint32_t)(UINT_MAX + 1)) is?  This fails, however, if the
> outer constant is larger than UINT_MAX.  What else can we do here?
> Do we see cases like the second one at all?  If it's not needed, the
> extra work is likely not needed.

We do have the need to associate and simplify across conversions but of
course we have to do it conservatively correct.

>> A related thing would be canonicalizing unsigned X plus CST to
>> unsigned X minus CST' if CST' has a smaller absolute value than CST.
>> I think currently we simply canonicalize to 'plus CST' but also only in
>> fold-const.c, not in match.pd (ah we do, but only in a simplified
>> manner).
>
> I can imagine this could simplify the treatment of some cases, yet I'm
> already a bit lost with the current cases :)

Yes, but I am lost a bit as well.  I don't know the correct conditions to
test off-hand -- I know we may not introduce new undefined overflow ops
and of course we need to not compute wrong numbers either.

>> That said, can we leave that "trick" out of the patch?  I think your
>> more complicated "overflows" result from
>> extract_range_from_binary_expr_1 doesn't apply to all ops (like
>> MULT_EXPR where more complicated cases can arise).
>
> There is certainly additional work to be done for MULT_EXPR, I
> disregarded it so far.  For this patch, I'd rather conservatively assume
> overflow.

Ok...

Richard.

> Regards
> Robin
Re: [PATCH] Tree-level fix for PR 69526
Ping.

To put it shortly, I'm not sure how to differentiate between:

  example range of a: [3,3]
  (ulong)(a + UINT_MAX) + 1 --> (ulong)(a) + (ulong)(-1 + 1), sign extend

  example range of a: [0,0]
  (ulong)(a + UINT_MAX) + 1 --> (ulong)(a) + (ulong)(UINT_MAX + 1),
				no sign extend

In this case, there is an "ok" overflow in the first example's range and no
overflow in the second, but I don't think this suffices as a criterion.  As
said, your proposed canonicalization ("unsigned X plus CST to unsigned X
minus CST' if CST' has a smaller absolute value than CST") might help here,
but I'm unsure how to do it currently.
Re: [PATCH] Tree-level fix for PR 69526
Perhaps I'm still missing how some cases are handled or not handled, sorry
for the noise.

> I'm not sure there is anything to "interpret" -- the operation is
> unsigned and overflow is when the operation may wrap around zero.  There
> might be clever ways of re-writing the expression to
> (uint64_t)((uint32_t)((int32_t)uint32 + -1) + 1)
> avoiding the overflow and thus allowing the transform but I'm not sure
> that's good.

The extra work I introduced was to discern between

  (uint64_t)(a + UINT_MAX) + 1 -> (uint64_t)(a),
  (uint64_t)(a + UINT_MAX) + 1 -> (uint64_t)(a) + (uint64_t)(UINT_MAX + 1),

For a's range of [1,1] there is an overflow in both cases.  We still want
to simplify the first case by combining UINT_MAX + 1 -> 0.

If "interpreting" UINT_MAX as -1 is not the correct thing to do, perhaps
(uint64_t)((uint32_t)(UINT_MAX + 1)) is?  This fails, however, if the outer
constant is larger than UINT_MAX.  What else can we do here?  Do we see
cases like the second one at all?  If it's not needed, the extra work is
likely not needed.

> A related thing would be canonicalizing unsigned X plus CST to
> unsigned X minus CST' if CST' has a smaller absolute value than CST.
> I think currently we simply canonicalize to 'plus CST' but also only in
> fold-const.c, not in match.pd (ah we do, but only in a simplified
> manner).

I can imagine this could simplify the treatment of some cases, yet I'm
already a bit lost with the current cases :)

> That said, can we leave that "trick" out of the patch?  I think your more
> complicated "overflows" result from extract_range_from_binary_expr_1
> doesn't apply to all ops (like MULT_EXPR where more complicated cases
> can arise).

There is certainly additional work to be done for MULT_EXPR, I disregarded
it so far.  For this patch, I'd rather conservatively assume overflow.

Regards
 Robin
Re: [PATCH] Tree-level fix for PR 69526
On Wed, Dec 7, 2016 at 5:14 PM, Robin Dapp wrote:
>> So we have (uint64_t)(uint32 + -1U) + 1 and using TYPE_SIGN (inner_type)
>> produces (uint64_t)uint32 + -1U + 1.  This simply means that we cannot
>> ignore overflow of the inner operation and for some reason your change
>> to extract_range_from_binary_expr didn't catch this.  That is
>> _8 + 4294967295 overflows but we ignored that.
>
> In this case the range of _8 was [1,1] so no overflow.
> I think the problem is rather about the interpretation of the inner
> constant.  I tried discerning two cases now, a range split (i.e. when
> the single range of a binop variable becomes an anti range or two ranges
> after combining it with the binop's other range) and an overflow of the
> range's min and max.
> If the range didn't split, we can perform the simplification.  If there
> was a min and max overflow, we have to interpret the inner constant as
> signed and perform a sign extension before converting it to the outer
> type.  Without overflow we can use TYPE_SIGN (inner_type).
> Does this make sense?

I'm not sure there is anything to "interpret" -- the operation is unsigned
and overflow is when the operation may wrap around zero.  There might be
clever ways of re-writing the expression to

  (uint64_t)((uint32_t)((int32_t)uint32 + -1) + 1)

avoiding the overflow and thus allowing the transform but I'm not sure
that's good.

A related thing would be canonicalizing unsigned X plus CST to unsigned X
minus CST' if CST' has a smaller absolute value than CST.  I think
currently we simply canonicalize to 'plus CST' but also only in
fold-const.c, not in match.pd (ah we do, but only in a simplified manner).

That said, can we leave that "trick" out of the patch?  I think your more
complicated "overflows" result from extract_range_from_binary_expr_1
doesn't apply to all ops (like MULT_EXPR where more complicated cases can
arise).

Richard.

> Included the remarks and attached the new version.
>
> Regards
> Robin
Re: [PATCH] Tree-level fix for PR 69526
> So we have (uint64_t)(uint32 + -1U) + 1 and using TYPE_SIGN (inner_type)
> produces (uint64_t)uint32 + -1U + 1.  This simply means that we cannot
> ignore overflow of the inner operation and for some reason your change
> to extract_range_from_binary_expr didn't catch this.  That is
> _8 + 4294967295 overflows but we ignored that.

In this case the range of _8 was [1,1] so no overflow.  I think the problem
is rather about the interpretation of the inner constant.  I tried
discerning two cases now: a range split (i.e. when the single range of a
binop variable becomes an anti-range or two ranges after combining it with
the binop's other range) and an overflow of the range's min and max.

If the range didn't split, we can perform the simplification.  If there was
a min and max overflow, we have to interpret the inner constant as signed
and perform a sign extension before converting it to the outer type.
Without overflow we can use TYPE_SIGN (inner_type).  Does this make sense?

Included the remarks and attached the new version.

Regards
 Robin

diff --git a/gcc/match.pd b/gcc/match.pd
index feaa4a1..19df142 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -1195,6 +1195,81 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
   (minus @0 (minus @0 @1))
   @1)
 
+ /* ((T)(A +- CST)) +- CST -> (T)(A) +- CST) */
+#if GIMPLE
+ (for outer_op (plus minus)
+  (for inner_op (plus minus)
+   (simplify
+    (outer_op (convert (inner_op@3 @0 INTEGER_CST@1)) INTEGER_CST@2)
+    (if (TREE_CODE (type) == INTEGER_TYPE
+	 && TYPE_PRECISION (type) > TYPE_PRECISION (TREE_TYPE (@3)))
+     (with
+      {
+	struct vr_binop_overflow ovf;
+	tree cst = NULL_TREE;
+	tree inner_type = TREE_TYPE (@3);
+	value_range vr = VR_INITIALIZER;
+
+	/* Convert combined constant to tree of outer type if
+	   there was no value range split in the original operation.  */
+	if (TYPE_OVERFLOW_UNDEFINED (inner_type)
+	    || (extract_range_from_binary_expr (&vr, inner_op,
+		inner_type, @0, @1, &ovf), !ovf.range_split))
+	  {
+	    wide_int w1 = @1;
+	    wide_int w2 = @2;
+
+	    wide_int combined_cst;
+
+	    /* Extend @1 to TYPE.  Perform sign extension if the range
+	       overflowed but did not split.  */
+	    w1 = w1.from (w1, TYPE_PRECISION (type), ovf.ovf ? SIGNED :
+			  TYPE_SIGN (inner_type));
+
+	    if (inner_op == MINUS_EXPR)
+	      w1 = wi::neg (w1);
+
+	    if (outer_op == MINUS_EXPR)
+	      w2 = wi::neg (w2);
+
+	    /* Combine in outer, larger type.  */
+	    bool combine_ovf = true;
+	    combined_cst = wi::add (w1, w2);
+
+	    cst = wide_int_to_tree (type, combined_cst);
+	  }
+      }
+      (if (cst)
+       (outer_op (convert @0) { cst; })))))))
+#endif
+
+ /* ((T)(A)) +- CST -> (T)(A +- CST) */
+#if GIMPLE
+ (for outer_op (plus minus)
+  (simplify
+   (outer_op (convert @0) INTEGER_CST@2)
+   (if (TYPE_PRECISION (type) > TYPE_PRECISION (TREE_TYPE (@0))
+	&& TREE_CODE (TREE_TYPE (@0)) == INTEGER_TYPE
+	&& TREE_CODE (type) == INTEGER_TYPE)
+    /* Perform binary operation inside the cast if the constant fits
+       and there is no overflow.  */
+    (with
+     {
+       struct vr_binop_overflow ovf;
+
+       int_fits_type_p (@2, TREE_TYPE (@0));
+       tree cst_inner = fold_convert (TREE_TYPE (@0), @2);
+
+       value_range vr = VR_INITIALIZER;
+       extract_range_from_binary_expr (&vr, outer_op, TREE_TYPE (@0),
+				       @0, cst_inner, &ovf);
+     }
+     (if (!ovf.range_split)
+      (convert (outer_op @0 { cst_inner; })))))))
+#endif
+
  /* (A +- CST1) +- CST2 -> A + CST3  */
  (for outer_op (plus minus)
   (for inner_op (plus minus)

diff --git a/gcc/testsuite/gcc.dg/wrapped-binop-simplify-signed-1.c b/gcc/testsuite/gcc.dg/wrapped-binop-simplify-signed-1.c
new file mode 100644
index 000..19b787b
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/wrapped-binop-simplify-signed-1.c
@@ -0,0 +1,60 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-ccp1-details" } */
+/* { dg-final { scan-tree-dump-times "gimple_simplified to" 11 "ccp1" } } */
+
+#include <limits.h>
+
+long foo(int a)
+{
+  return (long)(a - 2) + 1;
+}
+
+long bar(int a)
+{
+  return (long)(a + 3) - 1;
+}
+
+long baz(int a)
+{
+  return (long)(a - 1) + 2;
+}
+
+long baf(int a)
+{
+  return (long)(a + 1) - 2;
+}
+
+long bak(int a)
+{
+  return (long)(a + 1) + 3;
+}
+
+long bal(int a)
+{
+  return (long)(a - 7) - 4;
+}
+
+long bam(int a)
+{
+  return (long)(a - 1) - INT_MAX;
+}
+
+long bam2(int a)
+{
+  return (long)(a + 1) + INT_MAX;
+}
+
+long ban(int a)
+{
+  return (long)(a - 1) + INT_MIN;
+}
+
+long ban2(int a)
+{
+  return (long)(a + 1) - INT_MIN;
+}
+
+unsigned long baq(int a)
+{
+  return (unsigned long)(a + 1) - 1;
+}

diff --git a/gcc/testsuite/gcc.dg/wrapped-binop-simplify-unsigned-1.c b/gcc/testsuite/gcc.dg/wrapped-binop-simplify-unsigned-1.c
new file mode 100644
index 000..8befc96
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/wrapped-binop-simplify-unsigned-1.c
@@ -0,0 +1,51 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-ccp1-details -fdump-tree-evrp-details
Re: [PATCH] Tree-level fix for PR 69526
On Mon, Nov 28, 2016 at 2:26 PM, Robin Dapp wrote:
>>> + /* Sign-extend @1 to TYPE. */
>>> + w1 = w1.from (w1, TYPE_PRECISION (type), SIGNED);
>>>
>>> not sure why you do always sign-extend.  If the inner op is unsigned
>>> and we widen then that's certainly bogus considering your UINT_MAX
>>> example above.  Does
>>>
>>>   w1 = w1.from (w1, TYPE_PRECISION (type), TYPE_SIGN (inner_type));
>>>
>>> not work for some reason?
>
> With TYPE_SIGN (inner_type), I encountered situations like this in the
> testsuite (mostly Fortran but also 2422-1.c):
>
> Folding statement: _1 = _9 + 1;
> Applying pattern match.pd:1214, gimple-match.c:8719
> gimple_simplified to _93 = (sizetype) _8;
> _1 = _93 + 4294967296;
> Folded into: _1 = _93 + 4294967296;
>
> in
>
> _8 = (unsigned int) i_73;
> _5 = _8 + 4294967295;
> _9 = (sizetype) _5;
> _93 = (sizetype) _8;
> _1 = _93 + 4294967296;
>
> TYPE_SIGN (inner_type) is PLUS here, although it should be MINUS in
> order to combine -1 and +1 to 0.  Perhaps this can be handled
> differently with keeping TYPE_SIGN (inner_type)?

So we have (uint64_t)(uint32 + -1U) + 1 and using TYPE_SIGN (inner_type)
produces (uint64_t)uint32 + -1U + 1.  This simply means that we cannot
ignore overflow of the inner operation and for some reason your change to
extract_range_from_binary_expr didn't catch this.  That is _8 + 4294967295
overflows but we ignored that.

> On the other hand, using SIGNED instead of TYPE_SIGN (inner_type) worked
> for all cases I looked at, like
>
>   if (a > 1 && a < 10)
>     return (unsigned long)(a + UINT_MAX) + 1;
>
> For
>
>   if (a > 0 && a < 10)
>
> extract_range...() would not find a non-VR_VARYING range although we
> should probably still be able to simplify this.
>
>   if (a > 0)
>
> Here, vrp figures out [0,4294967294] and the simplification takes place.
>
> For
>
>   if (a <= 10)
>     return (unsigned long)(a + (UINT_MAX - 10)) + 1;
>
> 003t.original already reads
>
>   return (long unsigned int) a + 4294967286
>
> From this I hand-wavingly deduced we'd only see instances where a
> sign-extension of the constant is fine (test suites on s390x and x86
> affirm this for what it's worth) but I don't have a cogent reason hence
> my doubts :)

Well, even if sign-extending you can probably construct a testcase where
that's still wrong with respect to overflow.

Richard.

> I'm ok with omitting the sign-changing case (I hadn't thought of it
> anyway) and adapted the patch.  I haven't attached the updated version
> though, because I'm still unsure about the issue above (despite the
> clean test suite).
>
> Regards
> Robin
Re: [PATCH] Tree-level fix for PR 69526
Ping. Any idea how to tackle this?
Re: [PATCH] Tree-level fix for PR 69526
>> + /* Sign-extend @1 to TYPE. */
>> + w1 = w1.from (w1, TYPE_PRECISION (type), SIGNED);
>>
>> not sure why you do always sign-extend.  If the inner op is unsigned
>> and we widen then that's certainly bogus considering your UINT_MAX
>> example above.  Does
>>
>>   w1 = w1.from (w1, TYPE_PRECISION (type), TYPE_SIGN (inner_type));
>>
>> not work for some reason?

With TYPE_SIGN (inner_type), I encountered situations like this in the
testsuite (mostly Fortran but also 2422-1.c):

  Folding statement: _1 = _9 + 1;
  Applying pattern match.pd:1214, gimple-match.c:8719
  gimple_simplified to _93 = (sizetype) _8;
  _1 = _93 + 4294967296;
  Folded into: _1 = _93 + 4294967296;

in

  _8 = (unsigned int) i_73;
  _5 = _8 + 4294967295;
  _9 = (sizetype) _5;
  _93 = (sizetype) _8;
  _1 = _93 + 4294967296;

TYPE_SIGN (inner_type) is PLUS here, although it should be MINUS in order
to combine -1 and +1 to 0.  Perhaps this can be handled differently with
keeping TYPE_SIGN (inner_type)?

On the other hand, using SIGNED instead of TYPE_SIGN (inner_type) worked
for all cases I looked at, like

  if (a > 1 && a < 10)
    return (unsigned long)(a + UINT_MAX) + 1;

For

  if (a > 0 && a < 10)

extract_range...() would not find a non-VR_VARYING range although we
should probably still be able to simplify this.

  if (a > 0)

Here, vrp figures out [0,4294967294] and the simplification takes place.

For

  if (a <= 10)
    return (unsigned long)(a + (UINT_MAX - 10)) + 1;

003t.original already reads

  return (long unsigned int) a + 4294967286

From this I hand-wavingly deduced we'd only see instances where a
sign-extension of the constant is fine (test suites on s390x and x86
affirm this for what it's worth) but I don't have a cogent reason hence
my doubts :)

I'm ok with omitting the sign-changing case (I hadn't thought of it anyway)
and adapted the patch.  I haven't attached the updated version though,
because I'm still unsure about the issue above (despite the clean test
suite).

Regards
 Robin
Re: [PATCH] Tree-level fix for PR 69526
On Fri, Nov 25, 2016 at 2:46 PM, Richard Biener wrote:
> On Wed, Nov 16, 2016 at 4:54 PM, Robin Dapp wrote:
>> Found some time to look into this again.
>>
>>> Index: tree-ssa-propagate.c
>>> ===================================================================
>>> --- tree-ssa-propagate.c	(revision 240133)
>>> +++ tree-ssa-propagate.c	(working copy)
>>> @@ -1105,10 +1105,10 @@ substitute_and_fold_dom_walker::before_d
>>>        /* Replace real uses in the statement.  */
>>>        did_replace |= replace_uses_in (stmt, get_value_fn);
>>>
>>> -      /* If we made a replacement, fold the statement.  */
>>> -      if (did_replace)
>>> +      /* Fold the statement.  */
>>> +      if (fold_stmt (&i, follow_single_use_edges))
>>> 	{
>>> -	  fold_stmt (&i, follow_single_use_edges);
>>> +	  did_replace = true;
>>> 	  stmt = gsi_stmt (i);
>>> 	}
>>>
>>> this would need compile-time cost evaluation (and avoid the tree-vrp.c
>>> folding part of your patch).
>>
>> This causes an ICE and bootstrap errors with newer revisions.  I tested
>> my patch on r240691 where it still works.  How can this be done
>> properly now?
>
> Not sure, I'd have to investigate.  It should still just work (throwing
> off a bootstrap with just that change over the weekend).

Index: gcc/tree-ssa-propagate.c
===================================================================
--- gcc/tree-ssa-propagate.c	(revision 242875)
+++ gcc/tree-ssa-propagate.c	(working copy)
@@ -1065,13 +1065,15 @@ substitute_and_fold_dom_walker::before_d
       /* Replace real uses in the statement.  */
       did_replace |= replace_uses_in (stmt, get_value_fn);
 
+      if (did_replace)
+	gimple_set_modified (stmt, true);
 
       /* If we made a replacement, fold the statement.  */
-      if (did_replace)
+      if (fold_stmt (&i, follow_single_use_edges))
 	{
-	  fold_stmt (&i, follow_single_use_edges);
 	  stmt = gsi_stmt (i);
 	  gimple_set_modified (stmt, true);
+	  did_replace = true;
 	}
 
       /* Some statements may be simplified using propagator

works fine for me.

>>> OTOH given that you use this to decide whether you can use a combined
>>> constant that doesn't look necessary (if the constant is correct, that
>>> is) -- what you need to make sure is that the final operation,
>>> (T)(A) +- CST, does not overflow if ! TYPE_OVERFLOW_WRAPS and there
>>> wasn't any overflow in the original operation.  I think that always
>>> holds, thus the combine_ovf check looks superfluous to me.
>>
>> Removed the check and addressed the other remarks.
>>
>>> So now we know that for (T)(X + CST1) + CST2, (T)CST1 + CST2
>>> does not overflow.  But we do not really care for that, we want to
>>> know whether (T)X + CST' might invoke undefined behavior when the
>>> original expression did not.  This involves range information on X.
>>> I don't see how we ensure this here.
>>
>> I guess I'm still missing an undefined behavior case.  In which case
>> can (T)X + CST' trigger undefined behavior where the original statement
>> did not?  I see the need for checking in the second pattern
>> ((T)(X) + CST' -> (T)(X + CST')), of course.
>
> Looking at
>
> + /* ((T)(A +- CST)) +- CST -> (T)(A) +- CST) */
> +#if GIMPLE
> + (for outer_op (plus minus)
> +  (for inner_op (plus minus)
> +   (simplify
> +    (outer_op (convert (inner_op@3 @0 INTEGER_CST@1)) INTEGER_CST@2)
> +     (if (TREE_CODE (type) == INTEGER_TYPE &&
> +	  TYPE_PRECISION (type) >= TYPE_PRECISION (TREE_TYPE (@3)))
>
> so the conversion (T) is widening or sign-changing.  (&& go to the next
> line)
>
> If A + CST overflows we cannot do the transform (you check that with
> extract_range_from_binary_expr setting 'range_split').
>
> If A + CST does not overflow but is unsigned and we are just changing
> sign (precision ==) then (T)A + (CST + CST) might overflow.  Consider
> (int)(INT_MAX + 1) + 1 -> INT_MAX + 2.  I think here the important part
> is whether A + CST fits in T for the case where we just change the type
> to a type with !TYPE_OVERFLOW_WRAPS.  Certainly restricting to
> widenings would avoid the issue.
>
>>> But that's "easily fixable" by computing it in unsigned arithmetic,
>>> that is doing
>>>
>>> (long)(a + 2) + LONG_MAX -> (long)((unsigned long)a + (LONG_MAX + 2))
>>
>> Does this also work if (unsigned long)a + (LONG_MAX + 2) does not fit
>> into [0,LONG_MAX]?  IIRC (unsigned long)(LONG_MAX + 2) is
>> implementation-defined and not undefined so it should work?
>
> Yes, implementation-defined behavior is fine.
>
>> Revised patch version attached.  One thing I'm still not sure about is
>> the handling of sth. like (unsigned long)(a + UINT_MAX) + 1 for a = 0.
>> In the current patch version I always do a sign-extend of the first
>> constant (UINT_MAX here) which seems to cause no problems in the
>> testsuite and some custom tests still worked.
>
> + /* Sign-extend @1 to TY
Re: [PATCH] Tree-level fix for PR 69526
On Wed, Nov 16, 2016 at 4:54 PM, Robin Dapp wrote:
> Found some time to look into this again.
>
>> Index: tree-ssa-propagate.c
>> ===================================================================
>> --- tree-ssa-propagate.c	(revision 240133)
>> +++ tree-ssa-propagate.c	(working copy)
>> @@ -1105,10 +1105,10 @@ substitute_and_fold_dom_walker::before_d
>>        /* Replace real uses in the statement.  */
>>        did_replace |= replace_uses_in (stmt, get_value_fn);
>>
>> -      /* If we made a replacement, fold the statement.  */
>> -      if (did_replace)
>> +      /* Fold the statement.  */
>> +      if (fold_stmt (&i, follow_single_use_edges))
>> 	{
>> -	  fold_stmt (&i, follow_single_use_edges);
>> +	  did_replace = true;
>> 	  stmt = gsi_stmt (i);
>> 	}
>>
>> this would need compile-time cost evaluation (and avoid the tree-vrp.c
>> folding part of your patch).
>
> This causes an ICE and bootstrap errors with newer revisions.  I tested
> my patch on r240691 where it still works.  How can this be done properly
> now?

Not sure, I'd have to investigate.  It should still just work (throwing off
a bootstrap with just that change over the weekend).

>> OTOH given that you use this to decide whether you can use a combined
>> constant that doesn't look necessary (if the constant is correct, that
>> is) -- what you need to make sure is that the final operation,
>> (T)(A) +- CST, does not overflow if ! TYPE_OVERFLOW_WRAPS and there
>> wasn't any overflow in the original operation.  I think that always
>> holds, thus the combine_ovf check looks superfluous to me.
>
> Removed the check and addressed the other remarks.
>
>> So now we know that for (T)(X + CST1) + CST2, (T)CST1 + CST2
>> does not overflow.  But we do not really care for that, we want to know
>> whether (T)X + CST' might invoke undefined behavior when the original
>> expression did not.  This involves range information on X.  I don't
>> see how we ensure this here.
>
> I guess I'm still missing an undefined behavior case.  In which case can
> (T)X + CST' trigger undefined behavior where the original statement did
> not?  I see the need for checking in the second pattern
> ((T)(X) + CST' -> (T)(X + CST')), of course.

Looking at

+ /* ((T)(A +- CST)) +- CST -> (T)(A) +- CST) */
+#if GIMPLE
+ (for outer_op (plus minus)
+  (for inner_op (plus minus)
+   (simplify
+    (outer_op (convert (inner_op@3 @0 INTEGER_CST@1)) INTEGER_CST@2)
+     (if (TREE_CODE (type) == INTEGER_TYPE &&
+	  TYPE_PRECISION (type) >= TYPE_PRECISION (TREE_TYPE (@3)))

so the conversion (T) is widening or sign-changing.  (&& go to the next
line)

If A + CST overflows we cannot do the transform (you check that with
extract_range_from_binary_expr setting 'range_split').

If A + CST does not overflow but is unsigned and we are just changing sign
(precision ==) then (T)A + (CST + CST) might overflow.  Consider
(int)(INT_MAX + 1) + 1 -> INT_MAX + 2.  I think here the important part is
whether A + CST fits in T for the case where we just change the type to a
type with !TYPE_OVERFLOW_WRAPS.  Certainly restricting to widenings would
avoid the issue.

>> But that's "easily fixable" by computing it in unsigned arithmetic,
>> that is doing
>>
>> (long)(a + 2) + LONG_MAX -> (long)((unsigned long)a + (LONG_MAX + 2))
>
> Does this also work if (unsigned long)a + (LONG_MAX + 2) does not fit
> into [0,LONG_MAX]?  IIRC (unsigned long)(LONG_MAX + 2) is
> implementation-defined and not undefined so it should work?

Yes, implementation-defined behavior is fine.

> Revised patch version attached.  One thing I'm still not sure about is
> the handling of sth. like (unsigned long)(a + UINT_MAX) + 1 for a = 0.
> In the current patch version I always do a sign-extend of the first
> constant (UINT_MAX here) which seems to cause no problems in the
> testsuite and some custom tests still worked.

+ /* Sign-extend @1 to TYPE. */
+ w1 = w1.from (w1, TYPE_PRECISION (type), SIGNED);

not sure why you do always sign-extend.  If the inner op is unsigned and we
widen then that's certainly bogus considering your UINT_MAX example above.
Does

  w1 = w1.from (w1, TYPE_PRECISION (type), TYPE_SIGN (inner_type));

not work for some reason?

+ /* Combine in outer, larger type. */
+ bool combine_ovf = true;
+ combined_cst = wi::add (w1, w2, SIGNED, &combine_ovf);

as you ignore combine_ovf you can simply use

  combined_cst = wi::add (w1, w2);

+ /* Convert combined constant to tree of outer type if
+    there was no value range split in the original operation. */
+ if (!range_split)
+   {

I'd say you want to condition on range_split early, like with

  bool range_split;
  if (TYPE_OVERFLOW_UNDEFINED (inner_type)
      || ! (extract_range_from_binary_expr (, &range_split), range_split))
    {
    ...
Re: [PATCH] Tree-level fix for PR 69526
Ping.
Re: [PATCH] Tree-level fix for PR 69526
Found some time to look into this again. > Index: tree-ssa-propagate.c > === > --- tree-ssa-propagate.c(revision 240133) > +++ tree-ssa-propagate.c(working copy) > @@ -1105,10 +1105,10 @@ substitute_and_fold_dom_walker::before_d >/* Replace real uses in the statement. */ >did_replace |= replace_uses_in (stmt, get_value_fn); > > - /* If we made a replacement, fold the statement. */ > - if (did_replace) > + /* Fold the statement. */ > + if (fold_stmt (&i, follow_single_use_edges)) > { > - fold_stmt (&i, follow_single_use_edges); > + did_replace = true; > stmt = gsi_stmt (i); > } > > this would need compile-time cost evaluation (and avoid the tree-vrp.c > folding part > of your patch). This causes an ICE and bootstrap errors with newer revisions. I tested my patch on r240691 where it still works. How can this be done properly now? > OTOH given that you use this to decide whether you can use a combined constant > that doesn't look necessary (if the constant is correct, that is) -- > what you need > to make sure is that the final operation, (T)(A) +- CST, does not overflow > if ! TYPE_OVERFLOW_WRAPS and there wasn't any overflow in the > original operation. I think that always holds, thus the combine_ovf check > looks > superfluous to me. Removed the check and addressed the other remarks. > So now we know that for (T)(X + CST1) + CST2, (T)CST1 + CST2 > does not overflow. But we do not really care for that, we want to know > whether (T)X + CST' might invoke undefined behavior when the original > expression did not. This involves range information on X. I don't > see how we ensure this here. I guess I'm still missing an undefined behavior case. In which case can (T)X + CST' trigger undefined behavior where the original statement did not? I see the need for checking in the second pattern ((T)(X) + CST' -> (T)(X + CST')), of course. 
> But that's "easily fixable" by computing it in unsigned arithmetic, that is
> doing
>
> (long)(a + 2) + LONG_MAX -> (long)((unsigned long)a + (LONG_MAX + 2))

Does this also work if (unsigned long)a + (LONG_MAX + 2) does not fit
into [0,LONG_MAX]? IIRC (unsigned long)(LONG_MAX + 2) is
implementation-defined and not undefined so it should work?

Revised patch version attached. One thing I'm still not sure about is
the handling of sth. like (unsigned long)(a + UINT_MAX) + 1 for a = 0.
In the current patch version I always do a sign-extend of the first
constant (UINT_MAX here) which seems to cause no problems in the
testsuite and some custom tests still worked. Can UINT_MAX, -1 and
similar cases be disambiguated (and correctly converted to the outer
type) when done in unsigned arithmetic?

Thinking about the second pattern, on s390x it introduces more casts
than just using the first one (e.g. in cases where the value will be
sign-extended after the operation, which wouldn't have happened when
performing the operation in the larger type). Can we catch this with a
cost function?

On a side note: Can/should VRP infer ranges assuming no undefined
behavior will take place when -fstrict-overflow is in use? I.e.
inferring ~[INT_MIN,INT_MIN] for (long)(a - 1)? Would this even make
sense? 
Regards Robin diff --git a/gcc/match.pd b/gcc/match.pd index ba7e013..3d3f5bb 100644 --- a/gcc/match.pd +++ b/gcc/match.pd @@ -1150,6 +1150,86 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) (minus @0 (minus @0 @1)) @1) + /* ((T)(A +- CST)) +- CST -> (T)(A) +- CST) */ +#if GIMPLE + (for outer_op (plus minus) + (for inner_op (plus minus) + (simplify + (outer_op (convert (inner_op@3 @0 INTEGER_CST@1)) INTEGER_CST@2) + (if (TREE_CODE (type) == INTEGER_TYPE && + TYPE_PRECISION (type) >= TYPE_PRECISION (TREE_TYPE (@3))) + (with + { + bool range_split = false; + tree cst = NULL_TREE; + tree inner_type = TREE_TYPE (@3); + value_range vr = VR_INITIALIZER; + + if (TYPE_OVERFLOW_UNDEFINED (inner_type)) + range_split = false; + else + { + /* Check the inner binop using VRP information. */ + extract_range_from_binary_expr (&vr, inner_op, + inner_type, @0, @1, &range_split); + } + + wide_int w1 = @1; + wide_int w2 = @2; + + wide_int combined_cst; + + /* Sign-extend @1 to TYPE. */ + w1 = w1.from (w1, TYPE_PRECISION (type), SIGNED); + + if (inner_op == MINUS_EXPR) + w1 = wi::neg (w1); + + if (outer_op == MINUS_EXPR) + w2 = wi::neg (w2); + + /* Combine in outer, larger type. */ + bool combine_ovf = true; + combined_cst = wi::add (w1, w2, SIGNED, &combine_ovf); + + /* Convert combined constant to tree of outer type if + there was no value range split in the original operation. */ + if (!range_split) + { + cst = wide_int_to_tree (type, combined_cst); + } + } + (if (cst) + (outer_op (convert
Re: [PATCH] Tree-level fix for PR 69526
On Tue, Sep 20, 2016 at 2:31 PM, Robin Dapp wrote: > Hi, > >> I meant to do sth like >> >> Index: tree-ssa-propagate.c >> === >> --- tree-ssa-propagate.c(revision 240133) >> +++ tree-ssa-propagate.c(working copy) >> @@ -1105,10 +1105,10 @@ substitute_and_fold_dom_walker::before_d >>/* Replace real uses in the statement. */ >>did_replace |= replace_uses_in (stmt, get_value_fn); >> >> - /* If we made a replacement, fold the statement. */ >> - if (did_replace) >> + /* Fold the statement. */ >> + if (fold_stmt (&i, follow_single_use_edges)) >> { >> - fold_stmt (&i, follow_single_use_edges); >> + did_replace = true; >> stmt = gsi_stmt (i); >> } >> >> this would need compile-time cost evaluation (and avoid the tree-vrp.c >> folding part >> of your patch). > > Using this causes the simplifications to be performed in ccp1 instead of > fwprop1. I noticed two tests failing that rely on propagation being > performed in fwprop. Should these be adapted or rather the patch be changed? > >> Heh, but I think duplicating code is even worse. > > Ok, I changed extract_range_... after all, it's not so bad as I had > thought. Imho it should be simplified nevertheless, perhaps I'll do it > in the future if I find some time. > >> + tree cst_inner = wide_int_to_tree (inner_type, cst); >> >> What prevents CST being truncated here? It looks like >> (long)intvar + 0xL will be mishandled that way. >> > > Right, it may be truncated here, but the following TYPE_PRECISION () > guard prevents the value from being used in that case. I pulled the > line into the condition for clarity. > >> OTOH given that you use this to decide whether you can use a combined >> constant >> that doesn't look necessary (if the constant is correct, that is) -- >> what you need >> to make sure is that the final operation, (T)(A) +- CST, does not overflow >> if ! TYPE_OVERFLOW_WRAPS and there wasn't any overflow in the >> original operation. 
I think that always holds, thus the combine_ovf check
>> looks
>> superfluous to me.
>>
>
> I included this to account for cases like
> return (long)(a + 2) + LONG_MAX
> which should not simplify as opposed to
> return (unsigned long)(a + 2) + LONG_MAX
> where the constant is usable despite the overflow. Do you think it
> should be handled differently?

Well, (long)(a + 2) + LONG_MAX -> (long)a + (LONG_MAX + 2) is indeed
invalid because the resulting expression may overflow. But that's
"easily fixable" by computing it in unsigned arithmetic, that is doing

(long)(a + 2) + LONG_MAX -> (long)((unsigned long)a + (LONG_MAX + 2))

which I believe is still desired? Anyway, since the desired transform
in the end is (long)a + CST -> (long)(a + CST) this constant will not
fit here anyway...

Few comments on patch details:

+ if (TYPE_PRECISION (type) >= TYPE_PRECISION (inner_type))

please put the precision tests as a pattern if, thus

 (for outer_op (plus minus)
+ (for inner_op (plus minus)
+ (simplify
+(outer_op (convert (inner_op@3 @0 INTEGER_CST@1)) INTEGER_CST@2)
 (if (TYPE_PRECISION (type) >= TYPE_PRECISION (TREE_TYPE (@3)))
 (with
 {
 ...

that makes it more obvious you are matching a widening conversion.

+ if (@0 == NULL_TREE || @1 == NULL_TREE
+ || inner_type == NULL_TREE

captures cannot be NULL_TREE nor can their type be NULL_TREE. If you
run into such cases sth is broken elsewhere.

You have tests against INTEGER_TYPE - please put those alongside (even
before) the precision test. Note that you probably want to use
INTEGRAL_TYPE_P there to also catch ENUM_TYPE and BOOLEAN_TYPE.

+(outer_op (convert (inner_op@3 @0 INTEGER_CST@1)) INTEGER_CST@2)

you can use @0 to get at the type of inner_op.

For my taste there are too many temporaries, like 'check_inner_ovf'
where at the single point of use a !TYPE_OVERFLOW_UNDEFINED (inner_type);
would be much more descriptive.

+ /* Convert to TYPE keeping possible signedness even when
+ dealing with unsigned types. 
*/ + wide_int v1; + v1 = v1.from (w1, w2.get_precision(), tree_int_cst_sgn (@1) ? + SIGNED : UNSIGNED); better to comment it as /* Extend @1 to TYPE according to its sign. */ I also think you want to use TYPE_SIGN (inner_type) instead of tree_int_cst_sgn (@1) just to simplify the code. You also want to use TYPE_PRECISION (inner_type) instead of w2.get_precision (). + if (inner_op == MINUS_EXPR) + w1 = wi::neg (w1); I think you want to negate v1 here? (better not introduce v1 but use w1 for the extension?) + /* Combine in outer, larger type. */ + combined_cst = wi::add (v1, w2 + TYPE_SIGN (t
Re: [PATCH] Tree-level fix for PR 69526
Ping :)
Re: [PATCH] Tree-level fix for PR 69526
Ping.
Re: [PATCH] Tree-level fix for PR 69526
On 09/20/2016 06:31 AM, Robin Dapp wrote: Hi, I meant to do sth like Index: tree-ssa-propagate.c === --- tree-ssa-propagate.c(revision 240133) +++ tree-ssa-propagate.c(working copy) @@ -1105,10 +1105,10 @@ substitute_and_fold_dom_walker::before_d /* Replace real uses in the statement. */ did_replace |= replace_uses_in (stmt, get_value_fn); - /* If we made a replacement, fold the statement. */ - if (did_replace) + /* Fold the statement. */ + if (fold_stmt (&i, follow_single_use_edges)) { - fold_stmt (&i, follow_single_use_edges); + did_replace = true; stmt = gsi_stmt (i); } this would need compile-time cost evaluation (and avoid the tree-vrp.c folding part of your patch). Using this causes the simplifications to be performed in ccp1 instead of fwprop1. I noticed two tests failing that rely on propagation being performed in fwprop. Should these be adapted or rather the patch be changed? ISTM this is an indication that something changed the IL without folding prior to ccp & forwprop. That's not in and of itself bad, but may be worth investigating to see if whatever prior pass that made this change ought to be adjusted. That investigation would likely guide you with what to do with the testcase. If you find that the earlier pass should have folded, then you fix it and change the testcase to verify the earlier pass folded properly. Else you change the testcase to verify ccp folds the statement. I'm going to let Richi own the review on this. Just thought I'd chime in on that one topic. jeff
Re: [PATCH] Tree-level fix for PR 69526
Hi, > I meant to do sth like > > Index: tree-ssa-propagate.c > === > --- tree-ssa-propagate.c(revision 240133) > +++ tree-ssa-propagate.c(working copy) > @@ -1105,10 +1105,10 @@ substitute_and_fold_dom_walker::before_d >/* Replace real uses in the statement. */ >did_replace |= replace_uses_in (stmt, get_value_fn); > > - /* If we made a replacement, fold the statement. */ > - if (did_replace) > + /* Fold the statement. */ > + if (fold_stmt (&i, follow_single_use_edges)) > { > - fold_stmt (&i, follow_single_use_edges); > + did_replace = true; > stmt = gsi_stmt (i); > } > > this would need compile-time cost evaluation (and avoid the tree-vrp.c > folding part > of your patch). Using this causes the simplifications to be performed in ccp1 instead of fwprop1. I noticed two tests failing that rely on propagation being performed in fwprop. Should these be adapted or rather the patch be changed? > Heh, but I think duplicating code is even worse. Ok, I changed extract_range_... after all, it's not so bad as I had thought. Imho it should be simplified nevertheless, perhaps I'll do it in the future if I find some time. > + tree cst_inner = wide_int_to_tree (inner_type, cst); > > What prevents CST being truncated here? It looks like > (long)intvar + 0xL will be mishandled that way. > Right, it may be truncated here, but the following TYPE_PRECISION () guard prevents the value from being used in that case. I pulled the line into the condition for clarity. > OTOH given that you use this to decide whether you can use a combined constant > that doesn't look necessary (if the constant is correct, that is) -- > what you need > to make sure is that the final operation, (T)(A) +- CST, does not overflow > if ! TYPE_OVERFLOW_WRAPS and there wasn't any overflow in the > original operation. I think that always holds, thus the combine_ovf check > looks > superfluous to me. 
> I included this to account for cases like return (long)(a + 2) + LONG_MAX which should not simplify as opposed to return (unsigned long)(a + 2) + LONG_MAX where the constant is usable despite the overflow. Do you think it should be handled differently? Revised version attached. Regards Robin -- gcc/ChangeLog: 2016-09-20 Robin Dapp PR middle-end/69526 This enables combining of wrapped binary operations and fixes the tree level part of the PR. * match.pd: Introduce patterns to simplify binary operations wrapped inside a cast. * tree-vrp.h: Export extract_range_from_binary_expr (). * tree-ssa-propagate.c (substitute_and_fold_dom_walker::before_dom_children): Try to fold regardless of prior use propagations. * tree-vrp.c (extract_range_from_binary_expr_1): Add overflow check arguments and wrapper function without them. (extract_range_from_binary_expr): Add wrapper function. gcc/testsuite/ChangeLog: 2016-09-20 Robin Dapp * gcc.dg/wrapped-binop-simplify-signed-1.c: New test. * gcc.dg/wrapped-binop-simplify-signed-2.c: New test. * gcc.dg/wrapped-binop-simplify-unsigned-1.c: New test. * gcc.dg/wrapped-binop-simplify-unsigned-2.c: New test. diff --git a/gcc/match.pd b/gcc/match.pd index 123e754..f624e7e 100644 --- a/gcc/match.pd +++ b/gcc/match.pd @@ -1138,6 +1138,130 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) (minus @0 (minus @0 @1)) @1) + /* ((T)(A +- CST)) +- CST -> (T)(A) +- CST) */ +#if GIMPLE + (for outer_op (plus minus) + (for inner_op (plus minus) + (simplify + (outer_op (convert (inner_op@3 @0 INTEGER_CST@1)) INTEGER_CST@2) + (with + { + /* If the inner operation is wrapped inside a conversion, we have + to check it overflows/underflows and can only then perform the + simplification, i.e. add the second constant to the first + (wrapped) and convert afterwards. 
*/
+ bool inner_ovf = false;
+
+ bool combine_ovf = true;
+ tree cst = NULL_TREE;
+ bool combine_constants = false;
+ bool types_ok = false;
+
+ tree inner_type = TREE_TYPE (@3);
+ /* Check overflow in wrapped binop? */
+ bool check_inner_ovf = !TYPE_OVERFLOW_UNDEFINED (inner_type);
+ bool check_combine_ovf = !TYPE_OVERFLOW_WRAPS (type);
+
+ signop s1 = TYPE_SIGN (type);
+
+ if (TYPE_PRECISION (type) >= TYPE_PRECISION (inner_type))
+ {
+ types_ok = true;
+
+ /* Check if the inner binop does not overflow i.e. we have VRP
+ information and can prove it. */
+ if (check_inner_ovf)
+ {
+ value_range vr = VR_INITIALIZER;
+ if (@0 == NULL_TREE || @1 == NULL_TREE
+ || inner_type == NULL_TREE
+ || TREE_CODE (type) != INTEGER_TYPE)
+ {
+ inner_ovf = true;
+ }
+ else
+ {
+ extract_range_from_binary_expr (&vr, inner
Re: [PATCH] Tree-level fix for PR 69526
[ Another patch I'd started working through, but hadn't completed... ]

On 09/14/2016 07:05 AM, Richard Biener wrote:

On Mon, Aug 22, 2016 at 4:58 PM, Robin Dapp wrote:

if !inner_ovf (just set that to false if !check_inner_ovf to simplify
checks please). you know it's valid to transform the op to
(T)@0 innerop (T)@1 outerop @2 (if T is wider than the inner type which
I think you should check and which should simplify things).

I simplified the control flow a little and split both parts of the
combination into separate patterns.

You can then combine (T)@1 and @2 where I think you fail to handle mixed
MINUS_EXPR/PLUS_EXPR (practically you'll see only PLUS_EXPRs for
integers).

Is it ok to do something like this (see patch) in order to deal with
MINUS_EXPR and then perform a wi::add?

if (inner_op == MINUS_EXPR)
w1 = wi::neg (w1);

if (outer_op == MINUS_EXPR)
w2 = wi::neg (w2);

Yes.

I was concerned about the case where w1 or w2 might be the minimum
(negative) integer for its type. That creates a constant that can't be
represented. I'm not familiar enough with the rest of the code yet to
know if that's problematical.

tree.c doesn't look like the best fit. I think putting it into
tree-vrp.c is better and I think that extract_range_from_binary_expr_1
itself should compute what we want here as additional output.
Conservative handling for all but plus/minus is ok with me.

I kept the extra function for now because I find
extract_range_from_binary_expr_1 somewhat lengthy and hard to follow
already :)

Heh, but I think duplicating code is even worse.

I was going to suggest a pre-patch that just refactored the various
cases in extract_range_from_binary_expr_1 into their own functions.
That might make it easier to keep things manageable.

[ ... ]

Now looking at binop_overflow (that I don't like very much, but ...)
Note that the naked "return true" in there makes 95% of that function
unreachable code. That's where I stopped without looking deeply at the
rest of the code.

Jeff
Re: [PATCH] Tree-level fix for PR 69526
On Mon, Aug 22, 2016 at 4:58 PM, Robin Dapp wrote: >> if !inner_ovf (just set that to false if !check_inner_ovf to simplify >> checks please). >> you know it's valid to transform the op to (T)@0 innerop (T)@1 outerop @2 >> (if T is wider than the inner type which I think you should check and >> which should >> simplify things). > > I simplified the control flow a little and split both parts of the > combination into separate patterns. > >> You can then combine (T)@1 and @2 where I think you fail to handle mixed >> MINUS_EXPR/PLUS_EXPR (practically you'll see only PLUS_EXPRs for > integers). > > Is it ok to do something like this (see patch) in order to deal with > MINUS_EXPR and then perform a wi::add? > > if (inner_op == MINUS_EXPR) > w1 = wi::neg (w1); > > if (outer_op == MINUS_EXPR) > w2 = wi::neg (w2); Yes. >> The testcase is rather unspecific as to what testcases shoudl fold and > what not >> given your very simple scan and mixing should/should-not simplify > cases. Please >> consider splitting it up and make it a run test that verifies the >> bogus transforms >> do not take place. > > I divided the testcases into signed/unsigned and should simplify/should > not simplify. The bogus unsigned transforms are now checked via assert(). > >> Doing > [...] >> causes such stmts to be folded twice as substitute_and_fold does > [...] >> which is less than ideal. I think that given we have fold_stmt following >> SSA edges and thus not only stmts we propagated into are possibly >> interesting to fold (but also their single-uses, recursively), we should >> evaluate the compile-time cost of doing the fold_stmt unconditionally. > > Only using update_stmt seems to avoid the double call to fold_stmt but > is obviously the wrong thing to do as this then tries to update > everything that matches the pattern in tree-vrp.c. Copying some checks > from match.pd to tree-vrp.c would help but isn't there a proper way to > prevent duplicating the checking logic? 
I think what triggers it is that you return 'true', not that you call update_stmt. > In addition, is the compile-time cost checking something I could do for > this patch or a general thing to be done later? I meant to do sth like Index: tree-ssa-propagate.c === --- tree-ssa-propagate.c(revision 240133) +++ tree-ssa-propagate.c(working copy) @@ -1105,10 +1105,10 @@ substitute_and_fold_dom_walker::before_d /* Replace real uses in the statement. */ did_replace |= replace_uses_in (stmt, get_value_fn); - /* If we made a replacement, fold the statement. */ - if (did_replace) + /* Fold the statement. */ + if (fold_stmt (&i, follow_single_use_edges)) { - fold_stmt (&i, follow_single_use_edges); + did_replace = true; stmt = gsi_stmt (i); } this would need compile-time cost evaluation (and avoid the tree-vrp.c folding part of your patch). >> tree.c doesn't look like the best fit. I think putting it into >> tree-vrp.c is better >> and I think that extract_range_from_binary_expr_1 itself should > compute what we >> want here as additional output. Conservative handling for all but > plus/minus is >> ok with me. > > I kept the extra function for now because I find > extract_range_from_binary_expr_1 somewhat lengthy and hard to follow > already :) Heh, but I think duplicating code is even worse. Just looking at the simpler pattern right now: + /* ((T)(A)) +- CST -> (T)(A +- CST) */ +#if GIMPLE + (for outer_op (plus minus) +(simplify + (outer_op (convert @0) INTEGER_CST@2) + /* Perform binary operation inside the cast if the constant fits + and there is no overflow. */ + (with + { + bool simplify = false; + + wide_int cst = @2; + tree inner_type = TREE_TYPE (@0); + tree cst_inner = wide_int_to_tree (inner_type, cst); What prevents CST being truncated here? It looks like (long)intvar + 0xL will be mishandled that way. 
+ bool inner_ovf = true; + + if (TYPE_PRECISION (inner_type) >= TYPE_PRECISION (type) + || TREE_CODE (inner_type) != INTEGER_TYPE) + simplify = false; + else + { + inner_ovf = binop_overflow (outer_op, @0, cst_inner, inner_type); + if (!inner_ovf) + simplify = true; + } + } + (if (simplify && @0) @0 is always "true" +(convert (outer_op @0 (convert { @2; } You already have @2 converted -- cst_inner. Thus just use (convert (outer_op @0 { cst_inner; })) + ))) +#endif Now onto the other pattern. + /* ((T)(A +- CST)) +- CST -> (T)(A) +- CST) */ +#if GIMPLE + (for outer_op (plus minus) + (for inner_op (plus minus) + (simplify +(outer_op (convert (inner_op@3 @0 INTEGER_CST@1)) INTEGER_CST@2) + (with + { +
Re: [PATCH] Tree-level fix for PR 69526
Ping. diff --git a/gcc/gimple-match-head.c b/gcc/gimple-match-head.c index 2beadbc..d66fcb1 100644 --- a/gcc/gimple-match-head.c +++ b/gcc/gimple-match-head.c @@ -39,6 +39,7 @@ along with GCC; see the file COPYING3. If not see #include "internal-fn.h" #include "case-cfn-macros.h" #include "gimplify.h" +#include "tree-vrp.h" /* Forward declarations of the private auto-generated matchers. diff --git a/gcc/match.pd b/gcc/match.pd index b24bfb4..f54b734 100644 --- a/gcc/match.pd +++ b/gcc/match.pd @@ -1119,6 +1119,113 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) (minus @0 (minus @0 @1)) @1) + /* ((T)(A +- CST)) +- CST -> (T)(A) +- CST) */ +#if GIMPLE + (for outer_op (plus minus) + (for inner_op (plus minus) + (simplify + (outer_op (convert (inner_op@3 @0 INTEGER_CST@1)) INTEGER_CST@2) + (with + { + /* If the inner operation is wrapped inside a conversion, we have to + check it overflows/underflows and can only then perform the + simplification, i.e. add the second constant to the first + (wrapped) and convert afterwards. This fixes + https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69526 */ + bool inner_ovf = false; + + bool combine_ovf = true; + tree cst = NULL_TREE; + bool combine_constants = false; + bool types_ok = false; + + tree inner_type = TREE_TYPE (@3); + /* Check overflow in wrapped binop? */ + bool check_inner_ovf = !TYPE_OVERFLOW_UNDEFINED (inner_type); + /* Check overflow when combining constants? */ + bool check_combine_ovf = TYPE_OVERFLOW_UNDEFINED (type); + + signop s1 = TYPE_SIGN (type); + + if (TYPE_PRECISION (type) >= TYPE_PRECISION (inner_type)) + { + types_ok = true; + + /* Check if the inner binop does not overflow i.e. we have VRP + information and can prove prove it. */ + if (check_inner_ovf) + inner_ovf = binop_overflow (inner_op, @0, @1, inner_type); + else + inner_ovf = false; + + wide_int w1 = @1; + wide_int w2 = @2; + + wide_int combined_cst; + + /* Convert to TYPE keeping possible signedness even when + dealing with unsigned types. 
*/ + wide_int v1; + v1 = v1.from (w1, w2.get_precision(), tree_int_cst_sign_bit + (@1) ? SIGNED : UNSIGNED); + + if (inner_op == MINUS_EXPR) + w1 = wi::neg (w1); + + if (outer_op == MINUS_EXPR) + w2 = wi::neg (w2); + + /* Combine in outer, larger type */ + combined_cst = wi::add (v1, w2, TYPE_SIGN (type), &combine_ovf); + + /* The combined constant is ok if its type does not exhibit + undefined overflow or there was no overflow when + combining. */ + bool combined_cst_ok = !check_combine_ovf || !combine_ovf; + if (!inner_ovf && combined_cst_ok) + { + /* convert to tree of outer type */ + cst = wide_int_to_tree (type, combined_cst); + combine_constants = true; + } + } + } + (if (types_ok && combine_constants && cst) + (outer_op (convert { @0; }) { cst; })) + +#endif + + /* ((T)(A)) +- CST -> (T)(A +- CST) */ +#if GIMPLE + (for outer_op (plus minus) +(simplify + (outer_op (convert @0) INTEGER_CST@2) + /* Perform binary operation inside the cast if the constant fits + and there is no overflow. 
*/ + (with + { + bool simplify = false; + + wide_int cst = @2; + tree inner_type = TREE_TYPE (@0); + tree cst_inner = wide_int_to_tree (inner_type, cst); + bool inner_ovf = true; + + if (TYPE_PRECISION (inner_type) >= TYPE_PRECISION (type) + || TREE_CODE (inner_type) != INTEGER_TYPE) + simplify = false; + else + { + inner_ovf = binop_overflow (outer_op, @0, cst_inner, inner_type); + if (!inner_ovf) + simplify = true; + } + } + (if (simplify && @0) + (convert (outer_op @0 (convert { @2; } + ))) +#endif + /* (A +- CST) +- CST -> A + CST */ (for outer_op (plus minus) (for inner_op (plus minus) @@ -1132,6 +1239,7 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) (if (cst && !TREE_OVERFLOW (cst)) (inner_op @0 { cst; } )) + /* (CST - A) +- CST -> CST - A */ (for outer_op (plus minus) (simplify diff --git a/gcc/testsuite/gcc.dg/wrapped-binop-simplify-signed-1.c b/gcc/testsuite/gcc.dg/wrapped-binop-simplify-signed-1.c new file mode 100644 index 000..0afe99a --- /dev/null +++ b/gcc/testsuite/gcc.dg/wrapped-binop-simplify-signed-1.c @@ -0,0 +1,76 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-forwprop1-details" } */ +/* { dg-final { scan-tree-dump-times "gimple_simplified to" 11 "forwprop1" } } */ + +#include + +// should simplify, ignore possible signed overflow in (a - 2) and combine +// constants +long foo(int a) +{ + return (long)(a - 2) + 1; +} + +// should simplify +long bar(int a) +{ + return (long)(a + 3) - 1; +} + +// should simplify with combined constant in outer type +long baz(int a) +{ + return (long)(a - 1) + 2; +} + +// should simplify +long baf(int a) +{ + return (long)(a + 1) - 2; +} + +// should simplify (same as above) +long bak(int
Re: [PATCH] Tree-level fix for PR 69526
gah, this

+ return true;
+ if (TREE_CODE (t1) != SSA_NAME)

should of course be like this

+ if (TREE_CODE (t1) != SSA_NAME)
+ return true;

in the last patch.
Re: [PATCH] Tree-level fix for PR 69526
> if !inner_ovf (just set that to false if !check_inner_ovf to simplify > checks please). > you know it's valid to transform the op to (T)@0 innerop (T)@1 outerop @2 > (if T is wider than the inner type which I think you should check and > which should > simplify things). I simplified the control flow a little and split both parts of the combination into separate patterns. > You can then combine (T)@1 and @2 where I think you fail to handle mixed > MINUS_EXPR/PLUS_EXPR (practically you'll see only PLUS_EXPRs for integers). Is it ok to do something like this (see patch) in order to deal with MINUS_EXPR and then perform a wi::add? if (inner_op == MINUS_EXPR) w1 = wi::neg (w1); if (outer_op == MINUS_EXPR) w2 = wi::neg (w2); > The testcase is rather unspecific as to what testcases shoudl fold and what not > given your very simple scan and mixing should/should-not simplify cases. Please > consider splitting it up and make it a run test that verifies the > bogus transforms > do not take place. I divided the testcases into signed/unsigned and should simplify/should not simplify. The bogus unsigned transforms are now checked via assert(). > Doing [...] > causes such stmts to be folded twice as substitute_and_fold does [...] > which is less than ideal. I think that given we have fold_stmt following > SSA edges and thus not only stmts we propagated into are possibly > interesting to fold (but also their single-uses, recursively), we should > evaluate the compile-time cost of doing the fold_stmt unconditionally. Only using update_stmt seems to avoid the double call to fold_stmt but is obviously the wrong thing to do as this then tries to update everything that matches the pattern in tree-vrp.c. Copying some checks from match.pd to tree-vrp.c would help but isn't there a proper way to prevent duplicating the checking logic? In addition, is the compile-time cost checking something I could do for this patch or a general thing to be done later? 
> tree.c doesn't look like the best fit. I think putting it into > tree-vrp.c is better > and I think that extract_range_from_binary_expr_1 itself should compute what we > want here as additional output. Conservative handling for all but plus/minus is > ok with me. I kept the extra function for now because I find extract_range_from_binary_expr_1 somewhat lengthy and hard to follow already :) Wouldn't it be better to "separate concerns"/split it up in the long run and merge the functionality needed here at some time? Bootstrapped and reg-tested on s390x, bootstrap on x86 running. Regards Robin gcc/ChangeLog: 2016-08-22 Robin Dapp * match.pd: Introduce patterns to simplify binary operations wrapped inside a cast. * tree-vrp.c (bool binop_overflow): (simplify_plus_or_minus_using_ranges): Make use of newly introduced patterns. (simplify_stmt_using_ranges): Call simplify_plus_or_minus_using_ranges * tree-vrp.h: New file. * gimple-match-head.c: Include tree-vrp.h gcc/testsuite/ChangeLog: 2016-08-22 Robin Dapp * gcc.dg/wrapped-binop-simplify-signed-1.c: New test. * gcc.dg/wrapped-binop-simplify-signed-2.c: New test. * gcc.dg/wrapped-binop-simplify-unsigned-1.c: New test. * gcc.dg/wrapped-binop-simplify-unsigned-2.c: New test. diff --git a/gcc/gimple-match-head.c b/gcc/gimple-match-head.c index 2beadbc..d66fcb1 100644 --- a/gcc/gimple-match-head.c +++ b/gcc/gimple-match-head.c @@ -39,6 +39,7 @@ along with GCC; see the file COPYING3. If not see #include "internal-fn.h" #include "case-cfn-macros.h" #include "gimplify.h" +#include "tree-vrp.h" /* Forward declarations of the private auto-generated matchers. 
diff --git a/gcc/match.pd b/gcc/match.pd index b24bfb4..f54b734 100644 --- a/gcc/match.pd +++ b/gcc/match.pd @@ -1119,6 +1119,113 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) (minus @0 (minus @0 @1)) @1) + /* ((T)(A +- CST)) +- CST -> (T)(A) +- CST) */ +#if GIMPLE + (for outer_op (plus minus) + (for inner_op (plus minus) + (simplify + (outer_op (convert (inner_op@3 @0 INTEGER_CST@1)) INTEGER_CST@2) + (with + { + /* If the inner operation is wrapped inside a conversion, we have to + check it overflows/underflows and can only then perform the + simplification, i.e. add the second constant to the first + (wrapped) and convert afterwards. This fixes + https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69526 */ + bool inner_ovf = false; + + bool combine_ovf = true; + tree cst = NULL_TREE; + bool combine_constants = false; + bool types_ok = false; + + tree inner_type = TREE_TYPE (@3); + /* Check overflow in wrapped binop? */ + bool check_inner_ovf = !TYPE_OVERFLOW_UNDEFINED (inner_type); + /* Check overflow when combining constants? */ + bool check_combine_ovf = TYPE_OVERFLOW_UNDEFINED (type); + + signop s1 = TYPE_SIGN (type); + + if
Re: [PATCH] Tree-level fix for PR 69526
On Thu, Jul 21, 2016 at 12:42 PM, Robin Dapp wrote:
> As described in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69526, we
> currently fail to simplify cases like
>
>  (unsigned long)(a - 1) + 1
>
> to
>
>  (unsigned long)a
>
> when VRP knows that (a - 1) does not overflow.
>
> This patch introduces a match.pd pattern as well as a helper function
> that checks for overflow in a binary operation using VRP information
> and simplifies when no overflow is present.
>
> Some effort was put in to stay in the inner type in cases like this
>
>  (unsigned long)(a + CST1) - CST2
>  ->
>  (unsigned long)(a + CST3) rather than
>  (unsigned long)a + CST3
>
> where abs(CST3) = abs(CST1 - CST2) <= abs(CST1).  I wonder if this is
> warranted, i.e. if it is always advantageous or if the evaluation
> should rather involve a cost estimation - e.g. distinguish between
> costs for operations in int vs. in long.
>
> Absence of signed overflow is also exploited:
>
>  (long)(a + 2) - 1
>  ->
>  (long)(a + 1)
>
> Bootstrapped and regression tested on s390x and x86_64.

I find this a bit hard to follow (looking at the match.pd pattern).

+    if (check_inner_ovf)
+      {
+	// check if the inner binop does not overflow i.e. we have VRP
+	// information and can prove it
+	inner_ovf = binop_overflow (inner_op, @0, @1, inner_type);
+      }

if !inner_ovf (just set that to false if !check_inner_ovf to simplify
checks please), you know it's valid to transform the op to

  (T)@0 innerop (T)@1 outerop @2

(if T is wider than the inner type, which I think you should check and
which should simplify things).

You can then combine (T)@1 and @2 where I think you fail to handle
mixed MINUS_EXPR/PLUS_EXPR (practically you'll see only PLUS_EXPRs for
integers).

So you have (T)@0 + combined-in-T which you then can either emit or
check whether combined-in-T fits in the inner type and
@0 + combined-in-T does not overflow, in which case
(T)(@0 + combined-in-T) is safe.
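The hazard being guarded against can be shown concretely. A minimal standalone C sketch (function names invented here, not from the patch) of the PR 69526 shape: the naive fold is only valid when VRP can prove the inner 32-bit subtraction does not wrap.

```c
#include <stdint.h>

/* The original expression: the inner subtraction is performed in
   32 bits and the wrapped result is then widened.  For a == 0 the
   inner op wraps to UINT32_MAX, so the sum is 1ull << 32.  */
uint64_t widened_then_add (uint32_t a)
{
  return (uint64_t) (uint32_t) (a - 1) + 1;
}

/* The naively simplified form (unsigned long)a.  Equal to the above
   only when a - 1 does not wrap, i.e. a != 0 - exactly the condition
   VRP must establish before the match.pd pattern may fire.  */
uint64_t naively_simplified (uint32_t a)
{
  return (uint64_t) a;
}
```

For any `a` whose value range excludes 0 the two functions agree, which is why the patch consults range information rather than folding unconditionally.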
I believe that for simplicity we want to split this transform into two -
one doing (T)(a + CST) - CST -> (T)a + CST' and one doing
(T)a + CST -> (T)(a + CST).

The testcase is rather unspecific as to which cases should fold and
which should not, given your very simple scan and the mixing of
should/should-not simplify cases.  Please consider splitting it up and
making it a run test that verifies the bogus transforms do not take
place.

Doing

+static bool
+simplify_plus_or_minus_using_ranges (gimple_stmt_iterator *gsi, gimple *stmt)
+{
+  enum tree_code code = gimple_assign_rhs_code (stmt);
+  tree op0 = gimple_assign_rhs1 (stmt);
+  tree op1 = gimple_assign_rhs2 (stmt);
+
+  if ((code == PLUS_EXPR || code == MINUS_EXPR)
+      && op0 != NULL && op1 != NULL)
+    {
+      gimple *stmt1 = SSA_NAME_DEF_STMT (op0);
+      if (gassign *def = dyn_cast <gassign *> (stmt1))
+	{
+	  if (CONVERT_EXPR_CODE_P (gimple_assign_rhs_code (def))
+	      && TREE_CODE (op1) == INTEGER_CST)
+	    {
+	      if (fold_stmt (gsi, follow_single_use_edges))
+		return true;

causes such stmts to be folded twice as substitute_and_fold does

      /* Some statements may be simplified using propagator
	 specific information.  Do this before propagating
	 into the stmt to not disturb pass specific information.  */
      if (fold_fn
	  && (*fold_fn) (&i))
	{
	  did_replace = true;
	  prop_stats.num_stmts_folded++;
	  stmt = gsi_stmt (i);
	  update_stmt (stmt);
	}

      /* Replace real uses in the statement.  */
      did_replace |= replace_uses_in (stmt, get_value_fn);

      /* If we made a replacement, fold the statement.  */
      if (did_replace)
	{
	  fold_stmt (&i, follow_single_use_edges);
	  stmt = gsi_stmt (i);

which is less than ideal.  I think that given we have fold_stmt
following SSA edges and thus not only stmts we propagated into are
possibly interesting to fold (but also their single-uses, recursively),
we should evaluate the compile-time cost of doing the fold_stmt
unconditionally.
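The constant-combining step described above ((T)@1 and @2 folded into combined-in-T, then optionally narrowed back) can be sketched standalone. This is a hypothetical C analogue for the PLUS/PLUS case with T = uint64_t and inner type uint32_t, not the actual GCC code; the final VRP no-overflow check on the narrow add is not modelled here.

```c
#include <stdint.h>
#include <stdbool.h>

/* Combine the inner 32-bit constant with the outer 64-bit constant in
   the wide type (combined-in-T) and report whether the result also
   fits back in the inner type.  If it fits, (T)(a + narrow_cst) is a
   candidate form - still subject to proving a + narrow_cst does not
   wrap; otherwise the transform must emit (T)a + combined in the wide
   type.  */
static bool
combine_csts (uint32_t inner_cst, uint64_t outer_cst,
              uint64_t *combined, uint32_t *narrow_cst)
{
  *combined = (uint64_t) inner_cst + outer_cst;  /* combined-in-T */
  if (*combined > UINT32_MAX)
    return false;
  *narrow_cst = (uint32_t) *combined;
  return true;
}
```

For instance, with the constants 4294967294u and 10 from earlier in the thread, combined-in-T is 4294967304, which does not fit in uint32_t, so only the wide form is available.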
diff --git a/gcc/tree.c b/gcc/tree.c
index 2295111..bc477fa 100644
--- a/gcc/tree.c
+++ b/gcc/tree.c
@@ -1358,6 +1358,108 @@ force_fit_type (tree type, const wide_int_ref &cst,
   return wide_int_to_tree (type, cst);
 }
 
+bool binop_overflow (enum tree_code op, tree t1, tree t2, tree type)
+{

tree.c doesn't look like the best fit.  I think putting it into
tree-vrp.c is better and I think that extract_range_from_binary_expr_1
itself should compute what we want here as additional output.
Conservative handling for all but plus/minus is ok with me.

+  if (t1 == NULL_TREE || t2 == NULL_TREE || type == NULL_TREE)
+    return true;
+
+  if (TYPE_OVERFLOW_UNDEFINED (type))
+    return false;
+
+  if (!INTEGRAL_TYPE_P (type))
+    return true;

note that we'll ICE if you call TYPE_OVERFLOW_UNDEFINED on a type that
is not ANY_INTEGRAL_TYPE_P, so better re-order the checks.
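The range-based criterion binop_overflow is meant to implement - simplification is safe when the addition wraps at both ends of the operand's value range or at neither, but not when the range is split - can be sketched as a standalone analogue. This is a hypothetical helper written in plain C for uint32_t plus, not the GCC function itself, which works on trees and VRP data.

```c
#include <stdint.h>
#include <stdbool.h>

/* Given a's known value range [min, max], decide whether a + cst is
   safe to simplify: min and max must either both wrap around zero or
   both stay in range.  A "split" range (only one end wraps) would make
   the folded form wrong for part of the range, so it must block the
   transform.  */
static bool
range_plus_is_simplifiable (uint32_t min, uint32_t max, uint32_t cst)
{
  bool min_wraps = (uint32_t) (min + cst) < min;  /* unsigned wrap test */
  bool max_wraps = (uint32_t) (max + cst) < max;
  return min_wraps == max_wraps;
}
```

E.g. for a in [1,1] and cst = UINT32_MAX both ends wrap, so the fold is fine; for a in [0,1] only the max wraps and the transform must be rejected - the case the thread identifies as "max overflows but min does not".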