I skimmed through the code to see where transformation like (a - 1) -> (a + UINT_MAX) are performed. It seems there are only two places, match.pd (/* A - B -> A + (-B) if B is easily negatable. */) and fold-const.c.
In order to be able to reliably know whether to zero-extend or to sign-extend the constant (1) in (ulong)(a - 1) + 1 -> (ulong)(a) + 0 or -> (ulong)(a) + (ulong)(UINT_MAX + 1) we'd have to know if the constant was converted by a transformation like above. I first tried removing the two transformations altogether but this causes around 20 new regressions on s390x and I didn't go through all of them to see whether they can be fixed. Is there a rationale for applying the transformations other than canonicalization? I saw some optimizations that only apply to PLUS_EXPRs but that can easily be changed to also include MINUS_EXPR. My other attempt entails an additional flag TREE_WAS_SIGNED for the lack of a better naming idea. It is set when the above transformation takes place. I used (NODE)->base.deprecated_flag but there may be better choices. Tentative/example patch attached for reference. Using this, the type extension in my original patch can be changed to w1 = w1.from (w1, TYPE_PRECISION (type), TREE_WAS_SIGNED (@1) ? SIGNED : TYPE_SIGN (inner_type)); which works for me and does not introduce regressions on s390x. Is this a viable approach? Regards Robin
diff --git a/gcc/fold-const.c b/gcc/fold-const.c index c649e54..cbb7469 100644 --- a/gcc/fold-const.c +++ b/gcc/fold-const.c @@ -9648,9 +9648,15 @@ fold_binary_loc (location_t loc, && (TREE_CODE (op1) != REAL_CST || REAL_VALUE_NEGATIVE (TREE_REAL_CST (op1)))) || INTEGRAL_TYPE_P (type))) - return fold_build2_loc (loc, PLUS_EXPR, type, - fold_convert_loc (loc, type, arg0), - negate_expr (op1)); + { + tree negtem = negate_expr (op1); + if (CONSTANT_CLASS_P (negtem)) + TREE_WAS_SIGNED (negtem) = 1; + tree tem = fold_build2_loc (loc, PLUS_EXPR, type, + fold_convert_loc (loc, type, arg0), + negtem); + return tem; + } /* Fold &a[i] - &a[j] to i-j. */ if (TREE_CODE (arg0) == ADDR_EXPR @@ -13620,6 +13626,7 @@ fold_negate_const (tree arg0, tree type) t = force_fit_type (type, val, 1, (overflow | TREE_OVERFLOW (arg0)) && !TYPE_UNSIGNED (type)); + TREE_WAS_SIGNED (t) = TREE_WAS_SIGNED (arg0); break; } diff --git a/gcc/match.pd b/gcc/match.pd index b3f2063..e791942 100644 --- a/gcc/match.pd +++ b/gcc/match.pd @@ -870,7 +870,11 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) (simplify (minus @0 negate_expr_p@1) (if (!FIXED_POINT_TYPE_P (type)) - (plus @0 (negate @1)))) + (with { + if (CONSTANT_CLASS_P (@1)) + TREE_WAS_SIGNED (@1) = 1; + } + (plus @0 (negate @1))))) /* Try to fold (type) X op CST -> (type) (X op ((type-x) CST)) when profitable. @@ -1223,8 +1227,8 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) /* Extend @1 to TYPE. Perform sign extension if the range overflowed but did not split. */ - w1 = w1.from (w1, TYPE_PRECISION (type), ovf.ovf ? SIGNED : - TYPE_SIGN (inner_type)); + w1 = w1.from (w1, TYPE_PRECISION (type), TREE_WAS_SIGNED (@1) + ? SIGNED : TYPE_SIGN (inner_type)); /* w1 = w1.from (w1, TYPE_PRECISION (type), TYPE_SIGN (inner_type)); */ diff --git a/gcc/tree.h b/gcc/tree.h index 62cd7bb..1e7efd9 100644 --- a/gcc/tree.h +++ b/gcc/tree.h @@ -735,11 +735,18 @@ extern void omp_clause_range_check_failed (const_tree, const char *, int, #define TREE_OVERFLOW(NODE) (CST_CHECK (NODE)->base.public_flag) +/* Foo. */ + +#define TREE_WAS_SIGNED(NODE) (CST_CHECK (NODE)->base.deprecated_flag) + /* TREE_OVERFLOW can only be true for EXPR of CONSTANT_CLASS_P. */ #define TREE_OVERFLOW_P(EXPR) \ (CONSTANT_CLASS_P (EXPR) && TREE_OVERFLOW (EXPR)) +#define TREE_WAS_SIGNED_P(EXPR) \ + (CONSTANT_CLASS_P (EXPR) && TREE_WAS_SIGNED (EXPR)) + /* In a VAR_DECL, FUNCTION_DECL, NAMESPACE_DECL or TYPE_DECL, nonzero means name is to be accessible from outside this translation unit. In an IDENTIFIER_NODE, nonzero means an external declaration