On Tue, Sep 20, 2016 at 2:31 PM, Robin Dapp <rd...@linux.vnet.ibm.com> wrote: > Hi, > >> I meant to do sth like >> >> Index: tree-ssa-propagate.c >> =================================================================== >> --- tree-ssa-propagate.c (revision 240133) >> +++ tree-ssa-propagate.c (working copy) >> @@ -1105,10 +1105,10 @@ substitute_and_fold_dom_walker::before_d >> /* Replace real uses in the statement. */ >> did_replace |= replace_uses_in (stmt, get_value_fn); >> >> - /* If we made a replacement, fold the statement. */ >> - if (did_replace) >> + /* Fold the statement. */ >> + if (fold_stmt (&i, follow_single_use_edges)) >> { >> - fold_stmt (&i, follow_single_use_edges); >> + did_replace = true; >> stmt = gsi_stmt (i); >> } >> >> this would need compile-time cost evaluation (and avoid the tree-vrp.c >> folding part >> of your patch). > > Using this causes the simplifications to be performed in ccp1 instead of > fwprop1. I noticed two tests failing that rely on propagation being > performed in fwprop. Should these be adapted or rather the patch be changed? > >> Heh, but I think duplicating code is even worse. > > Ok, I changed extract_range_... after all, it's not so bad as I had > thought. Imho it should be simplified nevertheless, perhaps I'll do it > in the future if I find some time. > >> + tree cst_inner = wide_int_to_tree (inner_type, cst); >> >> What prevents CST being truncated here? It looks like >> (long)intvar + 0xffff00000000L will be mishandled that way. >> > > Right, it may be truncated here, but the following TYPE_PRECISION () > guard prevents the value from being used in that case. I pulled the > line into the condition for clarity. > >> OTOH given that you use this to decide whether you can use a combined >> constant >> that doesn't look necessary (if the constant is correct, that is) -- >> what you need >> to make sure is that the final operation, (T)(A) +- CST, does not overflow >> if ! TYPE_OVERFLOW_WRAPS and there wasn't any overflow in the >> original operation. I think that always holds, thus the combine_ovf check >> looks >> superfluous to me. >> > > I included this to account for cases like > return (long)(a + 2) + LONG_MAX > which should not simplify as opposed to > return (unsigned long)(a + 2) + LONG_MAX > where the constant is usable despite the overflow. Do you think it > should be handled differently?

## Advertising

Well, (long)(a + 2) + LONG_MAX -> (long)a + (LONG_MAX + 2) is indeed invalid because the resulting expression may overflow. But that's "easily fixable" by computing it in unsigned arithmetic, that is doing (long)(a + 2) + LONG_MAX -> (long)((unsigned long)a + (LONG_MAX + 2)) which I believe is still desired? Anyway, since the desired transform in the end is (long)a + CST -> (long)(a + CST) this constant will not fit here anyway... Few comments on patch details: + if (TYPE_PRECISION (type) >= TYPE_PRECISION (inner_type)) please put the precision tests as a pattern if, thus (for outer_op (plus minus) + (for inner_op (plus minus) + (simplify + (outer_op (convert (inner_op@3 @0 INTEGER_CST@1)) INTEGER_CST@2) (if (TYPE_PRECISION (type) >= TYPE_PRECISION (TREE_TYPE (@3))) (with { ... that makes it more obvious you are matching a widening conversion. + if (@0 == NULL_TREE || @1 == NULL_TREE + || inner_type == NULL_TREE captures cannot be NULL_TREE nor can their type be NULL_TREE. If you run into such cases sth is broken elsewhere. You have tests against INTEGER_TYPE - please put those alongsize (even before) the precision test. Note that you probably want to use INTEGRAL_TYPE_P there to also catch ENUM_TYPE and BOOLEAN_TYPE. + (outer_op (convert (inner_op@3 @0 INTEGER_CST@1)) INTEGER_CST@2) you can use @0 to get at the type of inner_op. For my taste there is too many temporaries, like 'check_inner_ovf' where at the single point of use a !TYPE_OVERFLOW_UNDEFINED (inner_type); would be much more descriptive. + /* Convert to TYPE keeping possible signedness even when + dealing with unsigned types. */ + wide_int v1; + v1 = v1.from (w1, w2.get_precision(), tree_int_cst_sgn (@1) ? + SIGNED : UNSIGNED); better to comment it as /* Extend @1 to TYPE according to its sign. */ I also think you want to use TYPE_SIGN (inner_type) instead of tree_int_cst_sgn (@1) just to simplify the code. You also want to use TYPE_PRECISION (inner_type) instead of w2.get_precision (). + if (inner_op == MINUS_EXPR) + w1 = wi::neg (w1); I think you want to negate v1 here? (better not introduce v1 but use w1 for the extension?) + /* Combine in outer, larger type. */ + combined_cst = wi::add (v1, w2 + TYPE_SIGN (type), &combine_ovf); So now we know that for (T)(X + CST1) + CST2, (T)CST1 + CST2 does not overflow. But we do not really care for that, we want to know whether (T)X + CST' might invoke undefined behavior when the original expression did not. This involves range information on X. I don't see how we ensure this here. Richard. > > Revised version attached. > > Regards > Robin > > -- > > gcc/ChangeLog: > > 2016-09-20 Robin Dapp <rd...@linux.vnet.ibm.com> > > PR middle-end/69526 > This enables combining of wrapped binary operations and fixes > the tree level part of the PR. > > * match.pd: Introduce patterns to simplify binary operations > wrapped inside a cast. > * tree-vrp.h: Export extract_range_from_binary_expr (). > * tree-ssa-propagate.c > (substitute_and_fold_dom_walker::before_dom_children): > Try to fold regardless of prior use propagations. > * tree-vrp.c (extract_range_from_binary_expr_1): > Add overflow check arguments and wrapper function without them. > (extract_range_from_binary_expr): > Add wrapper function. > > > gcc/testsuite/ChangeLog: > > 2016-09-20 Robin Dapp <rd...@linux.vnet.ibm.com> > > * gcc.dg/wrapped-binop-simplify-signed-1.c: New test. > * gcc.dg/wrapped-binop-simplify-signed-2.c: New test. > * gcc.dg/wrapped-binop-simplify-unsigned-1.c: New test. > * gcc.dg/wrapped-binop-simplify-unsigned-2.c: New test.