(I am not a reviewer, just commenting)
On Fri, 9 Feb 2018, Roger Sayle wrote:
The following patch implements a number of __builtin_popcount related
optimizations.
(i) popcount(x) == 0 can be simplified to x==0, and popcount(x) != 0 to
x!=0.
(ii) popcount(x&1) can be simplified to x&1, and for unsigned x,
popcount(x>>31) to x>>31.
(iii) popcount (x&6) + popcount(y&16) can be simplified to
popcount((x&6)|(y&16))
These may seem obscure transformations, but performing these types of
POPCOUNT operations are often the performance critical steps in some
cheminformatics applications.
To implement the above transformations I've introduced the
tree_nonzero_bits function, which is a tree-level version of rtlanal's
nonzero_bits used by the RTL optimizers.
I am wondering if this brings much, compared to just using
get_nonzero_bits (works on INTEGER_CST / SSA_NAME, matched by
with_possible_nonzero_bits). If we do decide to introduce this function,
we probably want to use it in other places that currently use
get_nonzero_bits as well...
+ case NOP_EXPR:
+ case CONVERT_EXPR:
We have CASE_CONVERT: for this pair.
+ case LSHIFT_EXPR:
+ if (TREE_CODE (TREE_OPERAND (t, 1)) == INTEGER_CST)
Maybe check INTEGRAL_TYPE_P as well, like you did for PLUS_EXPR? Or was
that also unnecessary for PLUS_EXPR?
+ return wi::neg_p (arg1)
+ ? wi::rshift (nzbits, -arg1, TYPE_SIGN (type))
+ : wi::lshift (nzbits, arg1);
I can see that fold-const.c already does something like that. I am
surprised the sanitizer guys haven't asked that we just punt on
negative values instead.
--- gcc/match.pd (revision 257227)
+++ gcc/match.pd (working copy)
@@ -4648,3 +4648,24 @@
|| wi::geu_p (wi::to_wide (@rpos),
wi::to_wide (@ipos) + isize))
(BIT_FIELD_REF @0 @rsize @rpos)))))
+
+/* POPCOUNT simplifications. */
+(for popcount (BUILT_IN_POPCOUNT BUILT_IN_POPCOUNTL BUILT_IN_POPCOUNTLL
+ BUILT_IN_POPCOUNTIMAX)
+ /* popcount(X&1) is nop_expr(X&1). */
+ (simplify
+ (popcount @0)
+ (if (tree_nonzero_bits (@0) == 1)
+ (convert @0)))
Good thing we can't call popcount on signed 1-bit types ;-)
+ /* popcount(X) + popcount(Y) is popcount(X|Y) when X&Y must be zero. */
+ (simplify
+ (plus (popcount @0) (popcount @1))
We probably want :s on both popcount: if they are used in other places
than just this addition, it is likely cheaper not to introduce a new call
to popcount.
+ (if (wi::bit_and (tree_nonzero_bits (@0), tree_nonzero_bits (@1)) == 0)
+ (popcount (bit_ior @0 @1))))
We'll have to be careful if we ever turn popcount into something generic,
but the current situation indeed should safely guarantee that @0 and @1
have the same type.
+ /* pocount(X) == 0 is X == 0, and related (in)equalities. */
po+p+count
+ (for cmp (le eq ne gt)
+ rep (eq eq ne ne)
+ (simplify
+ (cmp (popcount @0) zerop)
Might as well use integer_zerop when we know we are dealing with integers.
+ (rep @0 { build_zero_cst (TREE_TYPE (@0)); }))))
Nice patch :-)
--
Marc Glisse