On Mon, 23 Nov 2020, Jakub Jelinek wrote: > Hi! > > The PR38359 change made the -1 >> x to -1 optimization less useful by > requiring that the x must be non-negative. > Shifts by negative amount are UB, but we for historic reasons had in some > (but not all) places some hack to treat shifts by negative value as the > other direction shifts by the negated amount. > > The following patch just removes that special handling, instead we punt on > optimizing those (and ideally path isolation should catch that up and turn > those into __builtin_unreachable, perhaps with __builtin_warning next to > it). Folding the shifts in some places as if they were rotates and in other > as if they were saturating just leads to inconsistencies. > > For C++ constexpr diagnostics and -fpermissive, I've added code to pretend > fold-const.c has not changed, without -fpermissive it will be an error > anyway and I think it is better not to change all the diagnostics. > > During x86_64-linux and i686-linux bootstrap/regtest, my statistics > gathering patch noted 185 unique -m32/-m64 x TU x function_name x shift_kind > x fold-const/tree-ssa-ccp cases. I have investigated the > 64 ../../gcc/config/i386/i386.c x86_output_aligned_bss LSHIFT_EXPR > wide_int_bitop > 64 ../../gcc/config/i386/i386-expand.c emit_memmov LSHIFT_EXPR wide_int_bitop > 64 ../../gcc/config/i386/i386-expand.c ix86_expand_carry_flag_compare > LSHIFT_EXPR wide_int_bitop > 64 ../../gcc/expmed.c expand_divmod LSHIFT_EXPR wide_int_bitop > 64 ../../gcc/lra-lives.c process_bb_lives LSHIFT_EXPR wide_int_bitop > 64 ../../gcc/rtlanal.c nonzero_bits1 LSHIFT_EXPR wide_int_bitop > 64 ../../gcc/varasm.c optimize_constant_pool.isra LSHIFT_EXPR wide_int_bitop > cases and all of them are either during jump threading (dom) or during PRE. > For jump threading, the most common case is 1 << floor_log2 (whatever) where > floor_log2 is return HOST_BITS_PER_WIDE_INT - 1 - clz_hwi (x); > and clz_hwi is if (x == 0) return HOST_BITS_PER_WIDE_INT; return > __builtin_clz* (x); > and so has range [-1, 63] and a comparison against == 0 which makes the > threader think it might be nice to jump thread the case leading to 1 << -1. > I think it is better to keep the 1 << -1 s in the IL for this and let path > isolation turn that into __builtin_unreachable () if the user wishes so. > > Earlier version of this patch has been bootstrapped/regtested on > x86_64-linux and i686-linux, is this one ok for trunk if it passes the same > testing?
Yes. Thanks, Richard. > 2020-11-23 Jakub Jelinek <ja...@redhat.com> > > PR tree-optimization/96929 > * fold-const.c (wide_int_binop) <case LSHIFT_EXPR, case RSHIFT_EXPR>: > Return false on negative second argument rather than trying to handle > it as shift in the other direction. > * tree-ssa-ccp.c (bit_value_binop) <case LSHIFT_EXPR, > case RSHIFT_EXPR>: Punt on negative shift count rather than trying > to handle it as shift in the other direction. > * match.pd (-1 >> x to -1): Remove tree_expr_nonnegative_p check. > > * constexpr.c (cxx_eval_binary_expression): For shifts by constant > with MSB set, emulate older wide_int_binop behavior to preserve > diagnostics and -fpermissive behavior. > > * gcc.dg/tree-ssa/pr96929.c: New test. > > --- gcc/fold-const.c.jj 2020-11-20 12:26:16.921936188 +0100 > +++ gcc/fold-const.c 2020-11-23 12:42:13.784315448 +0100 > @@ -992,26 +992,19 @@ wide_int_binop (wide_int &res, > res = wi::bit_and (arg1, arg2); > break; > > - case RSHIFT_EXPR: > case LSHIFT_EXPR: > if (wi::neg_p (arg2)) > - { > - tmp = -arg2; > - if (code == RSHIFT_EXPR) > - code = LSHIFT_EXPR; > - else > - code = RSHIFT_EXPR; > - } > - else > - tmp = arg2; > + return false; > + res = wi::lshift (arg1, arg2); > + break; > > - if (code == RSHIFT_EXPR) > - /* It's unclear from the C standard whether shifts can overflow. > - The following code ignores overflow; perhaps a C standard > - interpretation ruling is needed. */ > - res = wi::rshift (arg1, tmp, sign); > - else > - res = wi::lshift (arg1, tmp); > + case RSHIFT_EXPR: > + if (wi::neg_p (arg2)) > + return false; > + /* It's unclear from the C standard whether shifts can overflow. > + The following code ignores overflow; perhaps a C standard > + interpretation ruling is needed. */ > + res = wi::rshift (arg1, arg2, sign); > break; > > case RROTATE_EXPR: > --- gcc/tree-ssa-ccp.c.jj 2020-11-20 12:26:16.946935910 +0100 > +++ gcc/tree-ssa-ccp.c 2020-11-23 12:42:13.785315436 +0100 > @@ -1432,13 +1432,7 @@ bit_value_binop (enum tree_code code, si > else > { > if (wi::neg_p (shift)) > - { > - shift = -shift; > - if (code == RSHIFT_EXPR) > - code = LSHIFT_EXPR; > - else > - code = RSHIFT_EXPR; > - } > + break; > if (code == RSHIFT_EXPR) > { > *mask = wi::rshift (wi::ext (r1mask, width, sgn), shift, sgn); > --- gcc/match.pd.jj 2020-11-20 12:26:16.941935966 +0100 > +++ gcc/match.pd 2020-11-23 12:42:13.809315167 +0100 > @@ -2900,8 +2900,7 @@ (define_operator_list COND_TERNARY > /* Optimize -1 >> x for arithmetic right shifts. */ > (simplify > (rshift integer_all_onesp@0 @1) > - (if (!TYPE_UNSIGNED (type) > - && tree_expr_nonnegative_p (@1)) > + (if (!TYPE_UNSIGNED (type)) > @0)) > > /* Optimize (x >> c) << c into x & (-1<<c). */ > --- gcc/cp/constexpr.c.jj 2020-11-22 19:11:44.053589081 +0100 > +++ gcc/cp/constexpr.c 2020-11-23 13:08:16.815759331 +0100 > @@ -3162,6 +3162,21 @@ cxx_eval_binary_expression (const conste > if (r == NULL_TREE) > r = fold_binary_loc (loc, code, type, lhs, rhs); > > + if (r == NULL_TREE > + && (code == LSHIFT_EXPR || code == RSHIFT_EXPR) > + && TREE_CODE (lhs) == INTEGER_CST > + && TREE_CODE (rhs) == INTEGER_CST > + && wi::neg_p (wi::to_wide (rhs))) > + { > + /* For diagnostics and -fpermissive emulate previous behavior of > + handling shifts by negative amount. */ > + tree nrhs = const_unop (NEGATE_EXPR, TREE_TYPE (rhs), rhs); > + if (nrhs) > + r = fold_binary_loc (loc, > + code == LSHIFT_EXPR ? RSHIFT_EXPR : LSHIFT_EXPR, > + type, lhs, nrhs); > + } > + > if (r == NULL_TREE) > { > if (lhs == orig_lhs && rhs == orig_rhs) > --- gcc/testsuite/gcc.dg/tree-ssa/pr96929.c.jj 2020-11-23 > 12:42:13.809315167 +0100 > +++ gcc/testsuite/gcc.dg/tree-ssa/pr96929.c 2020-11-23 12:42:13.809315167 > +0100 > @@ -0,0 +1,21 @@ > +/* PR tree-optimization/96929 */ > +/* { dg-do compile } */ > +/* { dg-options "-O2 -fdump-tree-optimized" } */ > +/* { dg-final { scan-tree-dump "baz \\\(\\\);" "optimized" } } */ > +/* { dg-final { scan-tree-dump-times "return -1;" 2 "optimized" } } */ > +/* { dg-final { scan-tree-dump-not " >> " "optimized" } } */ > + > +int baz (void); > + > +int > +foo (void) > +{ > + return -1 >> baz (); > +} > + > +int > +bar (int y) > +{ > + int z = -1; > + return z >> y; > +} > > Jakub > > -- Richard Biener <rguent...@suse.de> SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg, Germany; GF: Felix Imend