Instead of converting the XOR or PLUS of two values, each ANDed with a constant where the two constants have no bits in common, to an IOR expression, convert the IOR or XOR of the two ANDed values to a PLUS expression.
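The transformation relies on IOR, XOR and PLUS giving the same result when the two masks are disjoint, since no bit position can then produce a carry. A minimal sketch of that identity follows; the helper names are hypothetical and are not part of GCC, and the check only covers the low 10 bits of each input:

```c
#include <stdint.h>

/* Hypothetical helpers for illustration only; not GCC code.  */
static uint32_t combine_ior (uint32_t a, uint32_t b, uint32_t c1, uint32_t c2)
{
  return (a & c1) | (b & c2);
}

static uint32_t combine_xor (uint32_t a, uint32_t b, uint32_t c1, uint32_t c2)
{
  return (a & c1) ^ (b & c2);
}

static uint32_t combine_plus (uint32_t a, uint32_t b, uint32_t c1, uint32_t c2)
{
  return (a & c1) + (b & c2);
}

/* Count the (a, b) pairs with a, b in [0, 1024) for which the three
   forms do not all agree.  */
static unsigned int count_mismatches (uint32_t c1, uint32_t c2)
{
  unsigned int n = 0;

  for (uint32_t a = 0; a < 1024; a++)
    for (uint32_t b = 0; b < 1024; b++)
      {
        uint32_t i = combine_ior (a, b, c1, c2);
        uint32_t x = combine_xor (a, b, c1, c2);
        uint32_t p = combine_plus (a, b, c1, c2);

        if (i != x || i != p)
          n++;
      }

  return n;
}
```

With the masks from the testcase, count_mismatches (0x1, ~0x3u) returns 0. Overlapping masks such as 0x3 and 0x1 give a nonzero count (a = 1, b = 1 already yields 1, 0 and 2 for the three forms), which is why the pattern checks that the constants have no bits in common.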

If we consider the following testcase:

--cut here--
unsigned int foo (unsigned int a, unsigned int b)
{
  unsigned int r = a & 0x1;
  unsigned int p = b & ~0x3;

  return r + p + 2;
}

unsigned int bar (unsigned int a, unsigned int b)
{
  unsigned int r = a & 0x1;
  unsigned int p = b & ~0x3;

  return r | p | 2;
}
--cut here--

the above testcase compiles (x86_64 -O2) to:

foo:
        andl    $1, %edi
        andl    $-4, %esi
        orl     %esi, %edi
        leal    2(%rdi), %eax
        ret

bar:
        andl    $1, %edi
        andl    $-4, %esi
        orl     %esi, %edi
        movl    %edi, %eax
        orl     $2, %eax
        ret

No further simplification is possible in either case: we can't combine the OR with the PLUS in the first case, and there is no OR instruction with multiple inputs in the second case.

If we switch around the logic in the conversion and convert from IOR/XOR to PLUS, then the resulting assembly reads:

foo:
        andl    $-4, %esi
        andl    $1, %edi
        leal    2(%rsi,%rdi), %eax
        ret

bar:
        andl    $1, %edi
        andl    $-4, %esi
        leal    (%rdi,%rsi), %eax
        orl     $2, %eax
        ret

On x86, the conversion can now use the LEA instruction, which is much more usable than the OR instruction. In the first case, LEA implements a three-input ADD, while in the second case, even though the instruction can't be combined with the follow-up OR, the non-destructive LEA avoids a move.

	PR target/108477

gcc/ChangeLog:

	* match.pd (A & CST1 | B & CST2 -> A & CST1 + B & CST2):
	Do not convert PLUS of two values, ANDed with two constants
	that have no bits in common, to IOR expression; convert
	IOR or XOR of said two ANDed values to PLUS expression.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/pr108477.c: New test.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

OK for mainline?

Uros.

diff --git a/gcc/match.pd b/gcc/match.pd
index 7b4b15acc41..deac18a7635 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -1830,18 +1830,18 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
        && element_precision (type) <= element_precision (TREE_TYPE (@1)))
    (bit_not (rop (convert @0) (convert @1))))))
 
-/* If we are XORing or adding two BIT_AND_EXPR's, both of which are and'ing
+/* If we are ORing or XORing two BIT_AND_EXPR's, both of which are and'ing
    with a constant, and the two constants have no bits in common,
-   we should treat this as a BIT_IOR_EXPR since this may produce more
+   we should treat this as a PLUS_EXPR since this may produce more
    simplifications.  */
-(for op (bit_xor plus)
+(for op (bit_ior bit_xor)
  (simplify
   (op (convert1? (bit_and@4 @0 INTEGER_CST@1))
       (convert2? (bit_and@5 @2 INTEGER_CST@3)))
   (if (tree_nop_conversion_p (type, TREE_TYPE (@0))
        && tree_nop_conversion_p (type, TREE_TYPE (@2))
        && (wi::to_wide (@1) & wi::to_wide (@3)) == 0)
-   (bit_ior (convert @4) (convert @5)))))
+   (plus (convert @4) (convert @5)))))
 
 /* (X | Y) ^ X -> Y & ~ X*/
 (simplify
diff --git a/gcc/testsuite/gcc.target/i386/pr108477.c b/gcc/testsuite/gcc.target/i386/pr108477.c
new file mode 100644
index 00000000000..fb320a84c6d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr108477.c
@@ -0,0 +1,13 @@
+/* PR target/108477 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -masm=att" } */
+
+unsigned int foo (unsigned int a, unsigned int b)
+{
+  unsigned int r = a & 0x1;
+  unsigned int p = b & ~0x3;
+
+  return r + p + 2;
+}
+
+/* { dg-final { scan-assembler-not "orl" } } */