Re: [patch optimization]: Add some basic folding for ==/!= comparison

Kai Tietz Mon, 16 Apr 2012 06:58:28 -0700

2012/4/12 Richard Guenther <richard.guent...@gmail.com>:
> On Thu, Apr 5, 2012 at 6:15 PM, Kai Tietz <ktiet...@googlemail.com> wrote:
>> Hello,
>>
>> this patch adds some basic folding capabilities to fold-const for
>> equal and none-equal comparisons
>> on integer typed argument.
>>
>> ChangeLog
>>
>> 2012-04-05  Kai Tietz  <kti...@redhat.com>
>>
>>        * fold-const.c (fold_comparison_1): New
>>        function.
>>        (fold_comparison): Use fold_comparison_1.
>>
>> 2012-04-05  Kai Tietz  <kti...@redhat.com>
>>
>>        * gcc.dg/fold-compare-1.c: Adjust matching rule
>>        for a == b without argument swap.
>>        * gcc.dg/fold-compare-7.c: New test.
>>
>> Regession tested for x86_64-unknown-linux-gnu for all languages
>> (including Ada and Obj-C++).  Ok for apply?
>>
>> Regards,
>> Kai
>>
>> Index: gcc/gcc/fold-const.c
>> ===================================================================
>> --- gcc.orig/gcc/fold-const.c
>> +++ gcc/gcc/fold-const.c
>> @@ -8739,6 +8739,241 @@ pointer_may_wrap_p (tree base, tree offs
>>   return total_low > (unsigned HOST_WIDE_INT) size;
>>  }
>>
>> +/* Sub-routine of fold_comparison.  Folding of EQ_EXPR/NE_EXPR
>> +   comparisons with integral typed arguments.  */
>> +
>> +static tree
>> +fold_comparison_1 (location_t loc, enum tree_code code, tree type,
>> +                  tree arg0, tree arg1)
>
> Please name it more specific, like for example
> fold_integral_equality_comparison.
>
>> +{
>> +  enum tree_code c1, c2;
>> +  tree optype, op0, op1, opr0, opr1, tem;
>> +
>> +  if (code != NE_EXPR && code != EQ_EXPR)
>> +    return NULL_TREE;
>> +
>> +  optype = TREE_TYPE (arg0);
>> +  if (!INTEGRAL_TYPE_P (optype))
>> +    return NULL_TREE;
>> +
>> +  c1 = TREE_CODE (arg0);
>> +  c2 = TREE_CODE (arg1);
>> +
>> +  /* Integer constant is on right-hand side.  */
>> +  if (c1 == INTEGER_CST
>> +      && c2 != c1)
>> +    return fold_build2_loc (loc, code, type, arg1, arg0);
>
>  /* If one arg is a real or integer constant, put it last.  */
>  if (tree_swap_operands_p (arg0, arg1, true))
>    return fold_build2_loc (loc, swap_tree_comparison (code), type, op1, op0);
>
> in fold_comparison already ensures this.
>
>> +  if (!TREE_SIDE_EFFECTS (arg0)
>> +      && operand_equal_p (arg0, arg1, 0))
>> +    {
>> +      if (code == EQ_EXPR)
>> +        return build_one_cst (type);
>> +      return build_zero_cst (type);
>> +    }
>
> This is already done in a more general way in fold_comparison:


Yes, was a duplicate like ...

>  /* Simplify comparison of something with itself.  (For IEEE
>     floating-point, we can only do some of these simplifications.)  */
>  if (operand_equal_p (arg0, arg1, 0))
>    {
>      switch (code)
>        {
>        case EQ_EXPR:
> ...
>
> which also shows how to fold to true/false - using constant_boolean_node.

like this one. So I removed from patch.

>> +
>> +  if (c1 == NEGATE_EXPR)
>> +    {
>> +      op0 = TREE_OPERAND (arg0, 0);
>> +      /* -X ==/!= -Y -> X ==/!= Y.  */
>> +      if (c2 == c1)
>> +        return fold_build2_loc (loc, code, type,
>> +                               op0,
>> +                               TREE_OPERAND (arg1, 0));
>
> This is already done, in a more general way but only for float types,
> in fold_comparison.  It's beyond me why it is conditional on float types
> there and does not check for trapping math and NaNs (maybe that's
> well-defined - one would need to double-check).  For integral types
> you'd have to care for undefined overflow (or issue a warning), and ...

You miss here explicit a point about ==/!= comparisons.  The negate
can be removed for such comparisons uncoditionally, as there can't
happen an overflow, which changes result of compare.  It would be even
a flaw for checking here for those cases about overflow.

>> +      /* -X ==/!= CST -> X ==/!= CST' with CST'= -CST.  */
>> +      else if (c2 == INTEGER_CST)
>> +        return fold_build2_loc (loc, code, type, op0,
>> +                               fold_build1_loc (loc, NEGATE_EXPR,
>> +                                                optype, arg1));
>
> ... generalizing this the code should use negate_expr_p / negate_expr
> to for example handle folding -b != b - a to b != a - b (of course you'd
> require at least one explicit NEGATE_EXPR - similar foldings elsewhere
> will tell you what to do).

See, above. No, it would be a failure to use negate_expr_p here, as
the overflow simply doesn't matter and there is also no need to warn
about it.

>> +     }
>> +  else if (c1 == BIT_NOT_EXPR)
>> +    {
>> +      op0 = TREE_OPERAND (arg0, 0);
>> +      /* ~X ==/!= ~Y -> X ==/!= Y.  */
>> +      if (c2 == c1)
>> +        return fold_build2_loc (loc, code, type, op0,
>> +                               TREE_OPERAND (arg1, 0));
>
> This can be generalized to relational comparisons as well I think.  It's also
> already done in fold_comparison here:

No it isn't.  As again for ==/!= comparison the overflow simply
doesn't matter.  Therefore I added this function to special-case
(non-)equal-comparison.  The overflow cases are already handled for
general comparison, no need to do it twice.

>  /* Fold ~X op ~Y as Y op X.  */
>  if (TREE_CODE (arg0) == BIT_NOT_EXPR
>      && TREE_CODE (arg1) == BIT_NOT_EXPR)
>    {
>
>
>> +      /* ~X ==/!= CST -> X ==/!= CST' with CST'= ~CST.  */
>> +      else if (c2 == INTEGER_CST)
>> +        return fold_build2_loc (loc, code, type, op0,
>> +                               fold_build1_loc (loc, BIT_NOT_EXPR,
>> +                                                optype, arg1));
>
> Possibly unified with having a new predicate/worker invert_expr_p / 
> invert_expr.

Well, there is no need for an invert_expr_p (see above).  Also in this
case we don't need and have to warn.

>> +    }
>> +
>> +  /* See if we can simplify case X == (Y +|-|^ Z).  */
>> +  if (c1 != PLUS_EXPR && c1 != MINUS_EXPR && c1 != BIT_XOR_EXPR)
>> +    {
>> +      if ((c2 != PLUS_EXPR && c2 != MINUS_EXPR
>> +           && c2 != BIT_XOR_EXPR)
>> +          || TREE_SIDE_EFFECTS (arg0))
>> +        return NULL_TREE;
>
> (I'm not sure why you are testing for side-effects - if you omit sth use
> omit_*_operand ())

Actual the use of omit_*_operand () introduces for none-volative cases
NON_LVALUE_EXPR expressions, which are within comparisons vain.  Also
it wasn't quite clear if the following reduction of volatiles within a
comparison is valid.  At least for substractions we don't do this
optimization, so I would assume that it would be wrong for
comparisons, too.


>> +
>> +      op0 = TREE_OPERAND (arg1, 0);
>> +      op1 = TREE_OPERAND (arg1, 1);
>
> Please use names like arg10 and arg11 as elsewhere in folding.
>
>> +      /* Convert temporary X - Y to X + (-Y).  */
>> +      if (c2 == MINUS_EXPR)
>> +        op1 = fold_build1_loc (loc, NEGATE_EXPR, optype, op1);
>
> That's not a good idea - in general we avoid building scratch trees
> during folding.

Well, this patterns can be of course written out explicit.  But by
doing this transformation simplifies later on used patterns a bit.
Also it can be later on used for futher optimizations about value
preditions for patterns like 'a == 1 - a' being always false for all
integer values, etc.

>> +
>> +      /* Check if we can simplify X ==/!= (X ^ Y) to Y ==/!= 0,
>> +         or X ==/!= (X +|- Y) to Y ==/!= 0.  */
>> +      tem = fold_build2_loc (loc, (c2 == BIT_XOR_EXPR ? c2 : MINUS_EXPR),
>> +                            optype, arg0, op0);
>
> Similar - also this code and the code below duplicates things four times.
> That's both expensive and hard to read.  It asks for some factorization
> and use of explicit pattern matching instead of recursing into folding.

Not quite sure what you mean here by recursion.  We have actual here
three cases (with op being PLUS/MINUX or XOR expression):

1) A !=/== B (which is already convered in general comparison-code)

2) A !=/== (B op C) or (A op B) !=/== C (this I duplicated in code,
but could be of course factored out into a helper-routine)

and

3 (A op B) !=/== (C op D)

>> +      if (TREE_CODE (tem) == INTEGER_CST
>> +         && (integer_zerop (tem) || TYPE_UNSIGNED (optype)
>> +             || c2 == BIT_XOR_EXPR))
>> +       return fold_build2_loc (loc, code, type, op1, tem);
>> +
>> +      /* Check if we can simplify X ==/!= (Y ^ X) to Y ==/!= 0,
>> +         or X ==/!= (Y + X) to Y ==/!= 0.  */
>> +      tem = fold_build2_loc (loc, (c2 == BIT_XOR_EXPR ? c2 : MINUS_EXPR),
>> +                            optype, arg0, op1);
>> +      if (TREE_CODE (tem) == INTEGER_CST
>> +         && (integer_zerop (tem) || TYPE_UNSIGNED (optype)
>> +             || c2 == BIT_XOR_EXPR))
>> +       return fold_build2_loc (loc, code, type, op0, tem);
>> +    }
>> +  else if (c2 != PLUS_EXPR && c2 != MINUS_EXPR && c2 != BIT_XOR_EXPR)
>> +    {
>> +      if ((c1 != PLUS_EXPR && c1 != MINUS_EXPR
>> +           && c1 != BIT_XOR_EXPR)
>> +          || TREE_SIDE_EFFECTS (arg1))
>> +        return NULL_TREE;
>> +
>> +      op0 = TREE_OPERAND (arg0, 0);
>> +      op1 = TREE_OPERAND (arg0, 1);
>> +
>> +      /* Convert temporary X - Y to X + (-Y).  */
>> +      if (c1 == MINUS_EXPR)
>> +        op1 = fold_build1_loc (loc, NEGATE_EXPR, optype, op1);
>> +
>> +      /* Check if we can simplify X ==/!= (X ^ Y) to Y ==/!= 0,
>> +         or X ==/!= (X +|- Y) to Y ==/!= 0.  */
>> +      tem = fold_build2_loc (loc, (c1 == BIT_XOR_EXPR ? c1 : MINUS_EXPR),
>> +                            optype, arg1, op0);
>> +      if (TREE_CODE (tem) == INTEGER_CST
>> +         && (integer_zerop (tem) || TYPE_UNSIGNED (optype)
>> +             || c1 == BIT_XOR_EXPR))
>> +       return fold_build2_loc (loc, code, type, op1, tem);
>> +
>> +      /* Check if we can simplify X ==/!= (Y ^ X) to Y ==/!= 0,
>> +         or X ==/!= (Y + X) to Y ==/!= 0.  */
>> +      tem = fold_build2_loc (loc, (c1 == BIT_XOR_EXPR ? c1 : MINUS_EXPR),
>> +                            optype, arg1, op1);
>> +      if (TREE_CODE (tem) == INTEGER_CST
>> +         && (integer_zerop (tem) || TYPE_UNSIGNED (optype)
>> +             || c1 == BIT_XOR_EXPR))
>> +       return fold_build2_loc (loc, code, type, op0, tem);
>> +    }
>> +
>> +  /* We check if arg1 and arg2 are matching one of the patterns
>> +     (V + W) ==/!= (X + Y), (V + W) ==/!= (X - Y), (V - W) ==/!= (X + Y),
>> +     (V - W) ==/!= (X - Y), or (V ^ W) ==/!= (X ^ Y).  */
>
> I stopped reading here.  Please try to double check what we already do,
> don't produce new code for everything you can think of.  This patch could
> have needed splitting, too.
>
> Richard.
>
>> +  if ((c1 != PLUS_EXPR && c1 != MINUS_EXPR && c1 != BIT_XOR_EXPR)
>> +      || (c2 != PLUS_EXPR && c2 != MINUS_EXPR && c2 != BIT_XOR_EXPR))
>> +    return NULL_TREE;
>> +  if (c1 != c2 && (c1 == BIT_XOR_EXPR || c2 == BIT_XOR_EXPR))
>> +    return NULL_TREE;
>> +
>> +  op0 = TREE_OPERAND (arg0, 0);
>> +  op1 = TREE_OPERAND (arg0, 1);
>> +  opr0 = TREE_OPERAND (arg1, 0);
>> +  opr1 = TREE_OPERAND (arg1, 1);
>> +
>> +  /* Convert temporary (X - Y) to (X + (-Y)).  */
>> +  if (c1 == MINUS_EXPR)
>> +    {
>> +      op1 = fold_build1_loc (loc, NEGATE_EXPR, optype, op1);
>> +      c1 = PLUS_EXPR;
>> +    }
>> +
>> +  /* Convert temporary (X - Y) to (X + (-Y)).  */
>> +  if (c2 == MINUS_EXPR)
>> +    {
>> +      opr1 = fold_build1_loc (loc, NEGATE_EXPR, optype, opr1);
>> +      c2 = PLUS_EXPR;
>> +    }
>> +
>> +  if (c1 != c2)
>> +    return NULL_TREE;
>> +
>> +  /* If OP0 has no side-effects, we might be able to optimize
>> +     (OP0 + OP1) ==/!= (OP0 + OPR1) to OP1 ==/!= OPR1,
>> +     (OP0 + OP1) ==/!= (OPR0 + OP0) to OP1 ==/!= OPR0,
>> +     (OP0 ^ OP1) ==/!= (OP0 ^ OPR1) to OP1 ==/!= OPR1,
>> +     or (OP0 ^ OP1) ==/!= (OPR0 ^ OP0) to OP1 ==/!= OPR0..  */
>> +  if (!TREE_SIDE_EFFECTS (op0))
>> +    {
>> +      tem = fold_build2_loc (loc, (c1 == PLUS_EXPR ? MINUS_EXPR : c1),
>> +                            optype, op0, opr0);
>> +      if (TREE_CODE (tem) == INTEGER_CST
>> +         && !TREE_SIDE_EFFECTS (opr0)
>> +         && (integer_zerop (tem) || TYPE_UNSIGNED (optype)
>> +             || c1 == BIT_XOR_EXPR))
>> +       {
>> +         if (!integer_zerop (tem))
>> +           tem = fold_build2_loc (loc, c1, optype, op1, tem);
>> +         else
>> +           tem = op1;
>> +
>> +         return fold_build2_loc (loc, code, type, tem, opr1);
>> +       }
>> +      tem = fold_build2_loc (loc, (c1 == PLUS_EXPR ? MINUS_EXPR : c1),
>> +                            optype, op0, opr1);
>> +      if (TREE_CODE (tem) == INTEGER_CST
>> +         && !TREE_SIDE_EFFECTS (opr1)
>> +         && (integer_zerop (tem) || TYPE_UNSIGNED (optype)
>> +             || c1 == BIT_XOR_EXPR))
>> +       {
>> +         if (!integer_zerop (tem))
>> +           tem = fold_build2_loc (loc, c1, optype, op1, tem);
>> +         else
>> +           tem = op1;
>> +          return fold_build2_loc (loc, code, type, tem, opr0);
>> +       }
>> +    }
>> +
>> +  /* If OP1 has no side-effects, we might be able to optimize
>> +     (OP0 + OP1) ==/!= (OP1 + OPR1) to OP0 ==/!= OPR1,
>> +     (OP0 + OP1) ==/!= (OPR0 + OP1) to OP0 ==/!= OPR0,
>> +     (OP0 ^ OP1) ==/!= (OP1 ^ OPR1) to OP0 ==/!= OPR1,
>> +     or (OP0 ^ OP1) ==/!= (OPR0 ^ OP1) to OP0 ==/!= OPR0..  */
>> +  if (!TREE_SIDE_EFFECTS (op1))
>> +    {
>> +      tem = fold_build2_loc (loc, (c1 == PLUS_EXPR ? MINUS_EXPR : c1),
>> +                            optype, op1, opr0);
>> +      if (TREE_CODE (tem) == INTEGER_CST
>> +         && !TREE_SIDE_EFFECTS (opr0)
>> +         && (integer_zerop (tem) || TYPE_UNSIGNED (optype)
>> +             || c1 == BIT_XOR_EXPR))
>> +       {
>> +         if (!integer_zerop (tem))
>> +           tem = fold_build2_loc (loc, c1, optype, op0, tem);
>> +         else
>> +           tem = op0;
>> +         return fold_build2_loc (loc, code, type, tem, opr1);
>> +       }
>> +
>> +      tem = fold_build2_loc (loc, (c1 == PLUS_EXPR ? MINUS_EXPR : c1),
>> +                            optype, op1, opr1);
>> +      if (TREE_CODE (tem) == INTEGER_CST
>> +         && !TREE_SIDE_EFFECTS (opr1)
>> +         && (integer_zerop (tem) || TYPE_UNSIGNED (optype)
>> +             || c1 == BIT_XOR_EXPR))
>> +       {
>> +         if (!integer_zerop (tem))
>> +           tem = fold_build2_loc (loc, c1, optype, op0, tem);
>> +         else
>> +           tem = op0;
>> +         return fold_build2_loc (loc, code, type, tem, opr0);
>> +       }
>> +    }
>> +
>> +  return NULL_TREE;
>> +}
>> +
>>  /* Subroutine of fold_binary.  This routine performs all of the
>>    transformations that are common to the equality/inequality
>>    operators (EQ_EXPR and NE_EXPR) and the ordering operators
>> @@ -8767,6 +9002,10 @@ fold_comparison (location_t loc, enum tr
>>   if (tree_swap_operands_p (arg0, arg1, true))
>>     return fold_build2_loc (loc, swap_tree_comparison (code), type, op1, 
>> op0);
>>
>> +  tem = fold_comparison_1 (loc, code, type, arg0, arg1);
>> +  if (tem != NULL_TREE)
>> +    return tem;
>> +
>>   /* Transform comparisons of the form X +- C1 CMP C2 to X CMP C2 +- C1.  */
>>   if ((TREE_CODE (arg0) == PLUS_EXPR || TREE_CODE (arg0) == MINUS_EXPR)
>>       && (TREE_CODE (TREE_OPERAND (arg0, 1)) == INTEGER_CST
>> Index: gcc/gcc/testsuite/gcc.dg/fold-compare-1.c
>> ===================================================================
>> --- gcc.orig/gcc/testsuite/gcc.dg/fold-compare-1.c
>> +++ gcc/gcc/testsuite/gcc.dg/fold-compare-1.c
>> @@ -41,7 +41,7 @@ int test8(int l)
>>   return ~l >= 2;
>>  }
>>
>> -/* { dg-final { scan-tree-dump-times "b == a" 1 "original" } } */
>> +/* { dg-final { scan-tree-dump-times "b == a|a == b" 1 "original" } } */
>>  /* { dg-final { scan-tree-dump-times "c == d" 1 "original" } } */
>>  /* { dg-final { scan-tree-dump-times "e == -5" 1 "original" } } */
>>  /* { dg-final { scan-tree-dump-times "f == -6" 1 "original" } } */
>> Index: gcc/gcc/testsuite/gcc.dg/fold-compare-7.c
>> ===================================================================
>> --- /dev/null
>> +++ gcc/gcc/testsuite/gcc.dg/fold-compare-7.c
>> @@ -0,0 +1,36 @@
>> +/* { dg-do compile } */
>> +/* { dg-options "-O2 -fdump-tree-original" } */
>> +
>> +int test1(int a, int elim)
>> +{
>> +  return ~elim == (elim ^ a);
>> +}
>> +
>> +int test2(int elim, int b)
>> +{
>> +  return -elim == (b - elim);
>> +}
>> +
>> +int test3(int c, int elim, int d)
>> +{
>> +  return (c + elim) != (elim + d);
>> +}
>> +
>> +int test4(int e, int f, int elim)
>> +{
>> +  return (e - elim) != (-elim + f);
>> +}
>> +
>> +int test5(int g, int h, int elim)
>> +{
>> +  return (elim ^ g) == (h ^ elim);
>> +}
>> +
>> +int test6(int i, int j, int elim)
>> +{
>> +  return (elim ^ i) == (j ^ ~elim);
>> +}
>> +
>> +/* { dg-final { scan-tree-dump-times "elim" 0 "original" } } */
>> +/* { dg-final { cleanup-tree-dump "original" } } */
>> +


Regards,
Kai

Re: [patch optimization]: Add some basic folding for ==/!= comparison

Reply via email to