(I am not a reviewer, just commenting)

On Fri, 9 Feb 2018, Roger Sayle wrote:

The following patch implements a number of __builtin_popcount related
optimizations.
(i) popcount(x) == 0 can be simplified to x==0, and popcount(x) != 0 to
x!=0.
(ii) popcount(x&1) can be simplified to x&1, and for unsigned x,
popcount(x>>31) to x>>31.
(iii) popcount (x&6) + popcount(y&16) can be simplified to
popcount((x&6)|(y&16))

These may seem obscure transformations, but performing these types of POPCOUNT operations are often the performance critical steps in some cheminformatics applications.

To implement the above transformations I've introduced the tree_nonzero_bits function, which is a tree-level version of rtlanal's nonzero_bits used by the RTL optimizers.

I am wondering if this brings much, compared to just using get_nonzero_bits (works on INTEGER_CST / SSA_NAME, matched by with_possible_nonzero_bits). If we do decide to introduce this function, we probably want to use it in other places that currently use get_nonzero_bits as well...

+    case NOP_EXPR:
+    case CONVERT_EXPR:

We have CASE_CONVERT: for this pair.

+    case LSHIFT_EXPR:
+      if (TREE_CODE (TREE_OPERAND (t, 1)) == INTEGER_CST)

Maybe check INTEGRAL_TYPE_P as well, like you did for PLUS_EXPR? Or was that also unnecessary for PLUS_EXPR?

+         return wi::neg_p (arg1)
+                ? wi::rshift (nzbits, -arg1, TYPE_SIGN (type))
+                : wi::lshift (nzbits, arg1);

I can see that fold-const.c already does something like that. I am surprised the sanitizer guys haven't asked that we just punt on negative values instead.

--- gcc/match.pd        (revision 257227)
+++ gcc/match.pd        (working copy)
@@ -4648,3 +4648,24 @@
        || wi::geu_p (wi::to_wide (@rpos),
                      wi::to_wide (@ipos) + isize))
     (BIT_FIELD_REF @0 @rsize @rpos)))))
+
+/* POPCOUNT simplifications.  */
+(for popcount (BUILT_IN_POPCOUNT BUILT_IN_POPCOUNTL BUILT_IN_POPCOUNTLL
+               BUILT_IN_POPCOUNTIMAX)
+  /* popcount(X&1) is nop_expr(X&1).  */
+  (simplify
+    (popcount @0)
+    (if (tree_nonzero_bits (@0) == 1)
+      (convert @0)))

Good thing we can't call popcount on signed 1-bit types ;-)

+  /* popcount(X) + popcount(Y) is popcount(X|Y) when X&Y must be zero.  */
+  (simplify
+    (plus (popcount @0) (popcount @1))

We probably want :s on both popcount: if they are used in other places than just this addition, it is likely cheaper not to introduce a new call to popcount.

+    (if (wi::bit_and (tree_nonzero_bits (@0), tree_nonzero_bits (@1)) == 0)
+      (popcount (bit_ior @0 @1))))

We'll have to be careful if we ever turn popcount into something generic, but the current situation indeed should safely guarantee that @0 and @1 have the same type.

+  /* pocount(X) == 0 is X == 0, and related (in)equalities.  */

po+p+count

+  (for cmp (le eq ne gt)
+       rep (eq eq ne ne)
+    (simplify
+      (cmp (popcount @0) zerop)

Might as well use integer_zerop when we know we are dealing with integers.

+      (rep @0 { build_zero_cst (TREE_TYPE (@0)); }))))


Nice patch :-)

--
Marc Glisse

Reply via email to