On 11/3/2021 2:15 AM, Richard Biener via Gcc-patches wrote:
On Tue, Nov 2, 2021 at 4:53 PM Jeff Law <j...@tachyum.com> wrote:

I was wandering through SPEC chasing down instances where we should be
generating bit-test, bit-set and bit-clear types of instructions for our
target when I ran across a generic missed optimization in this space.


(((1 << N) & C) != 0)  -> (N == C')
(((1 << N) & C) == 0)  -> (N != C')

Where C is a constant power of 2 and C' is log2 (C).



That obviously avoids the shift by a variable amount and the bit masking
which is the primary effect.  I did see cases where we were able to
constant propagate into uses of N, but those were only in PHI nodes and
never triggered any real secondary effects in the cases I looked at.


Anyway, it's a fairly minor optimization, but with the analysis done and
patch in hand, it's silly not to take the easy win.


Bootstrapped and regression tested on x86_64 and verified that the
affected spec benchmark (gcc itself) still passes on our target.

OK for the trunk?  Note I added the patterns at the end of match.pd.
Certainly open to moving them elsewhere.
There are related patterns like

/* (CST1 << A) == CST2 -> A == ctz (CST2) - ctz (CST1)
    (CST1 << A) != CST2 -> A != ctz (CST2) - ctz (CST1)

please move the new patterns next to those.

+/* ((1 << n) & M) != 0  -> n == log2 (M) */
+(simplify
+ (ne
+  (bit_and
+   (nop_convert? (lshift integer_onep@0 @1)) integer_pow2p@2) integer_zerop@3)
+ (eq @1 { build_int_cst (integer_type_node,
+                         wi::exact_log2 (wi::to_wide (@2))); }))
+
+/* ((1 << n) & M) == 0  -> n != log2 (M) */
+(simplify
+ (eq
+  (bit_and
+   (nop_convert? (lshift integer_onep@0 @1)) integer_pow2p@2) integer_zerop@3)
+ (ne @1 { build_int_cst (integer_type_node,
+                         wi::exact_log2 (wi::to_wide (@2))); }))

you don't need @3 or @0 so no need to specify them.  You can merge the
patterns with

(for cmp (ne eq)
     icmp (eq ne)
 (simplify
  (cmp
   (bit_and
    (nop_convert? (lshift integer_onep @1)) integer_pow2p@2) integer_zerop)
  (icmp @1 { wide_int_to_tree (TREE_TYPE (@1),
                               wi::exact_log2 (wi::to_wide (@2))); }))

I believe the integer constant you build should be of the type of @1 (I
fixed that above, also using wide_int_to_tree).  The pattern is written
in a way that _could_ match vector operations and a vector-by-vector
shift, in which case the wi::to_wide would ICE - integer_pow2p currently
does not match vector constants.  But maybe be defensive and add

   (if (INTEGRAL_TYPE_P (TREE_TYPE (@1)))

I think the patch is OK with those changes.
FTR, here's what I committed.  Bootstrapped and regression tested on our internal target.

jeff
commit 2abd924f91e54a5229efb2c3bb5fb247059a5b37
Author: Jeff Law <jeffreya...@gmail.com>
Date:   Mon Nov 8 23:23:34 2021 -0500

    Minor optimization of variable bit testing
    
    gcc/
            * match.pd: New pattern to simplify (1 << n) & M ==/!= 0 for M
            being a power of 2.
    
    gcc/testsuite/
            * gcc.dg/tree-ssa/bittest.c: New test.

diff --git a/gcc/match.pd b/gcc/match.pd
index 71cf6f9df0a..cdab5e59f4e 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -3441,6 +3441,17 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
             (bit_and (convert (shift:shift_type (convert @3) @1)) { newmaskt; })
             (bit_and @4 { newmaskt; })))))))))))))
 
+/* ((1 << n) & M) != 0  -> n == log2 (M) */
+(for cmp (ne eq)
+       icmp (eq ne)
+ (simplify
+  (cmp
+   (bit_and
+    (nop_convert? (lshift integer_onep @0)) integer_pow2p@1) integer_zerop)
+  (if (INTEGRAL_TYPE_P (TREE_TYPE (@0)))
+   (icmp @0 { wide_int_to_tree (TREE_TYPE (@0),
+                               wi::exact_log2 (wi::to_wide (@1))); }))))
+
 /* Fold (X {&,^,|} C2) << C1 into (X << C1) {&,^,|} (C2 << C1)
    (X {&,^,|} C2) >> C1 into (X >> C1) & (C2 >> C1).  */
 (for shift (lshift rshift)
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/bittest.c b/gcc/testsuite/gcc.dg/tree-ssa/bittest.c
new file mode 100644
index 00000000000..7d712cad1ee
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/bittest.c
@@ -0,0 +1,27 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+
+void bar (void);
+
+void
+foo(unsigned int abc123)
+{
+  unsigned int xyzpdq = (1 << abc123);
+  if ((xyzpdq & 0x800) != 0)
+    bar();
+}
+
+void
+baz(unsigned int abc123)
+{
+  unsigned int xyzpdq = (1 << abc123);
+  if ((xyzpdq & 0x800) == 0)
+    bar();
+}
+
+/* What we want to verify is that the bit test against xyzpdq is
+   replaced with a test against abc123 which avoids the shifting
+   and bit ops.  */
+/* { dg-final { scan-tree-dump-not "xyzpdq" "optimized"} } */
+/* { dg-final { scan-tree-dump-times "if .abc123" 2 "optimized"} } */
