On 05/01/2018 02:48 AM, Marc Glisse wrote:
> On Mon, 30 Apr 2018, Jeff Law wrote:
>
>> On 02/09/2018 05:42 AM, Roger Sayle wrote:
>>> The following patch implements a number of __builtin_popcount related
>>> optimizations.
>>> (i) popcount(x) == 0 can be simplified to x==0, and popcount(x) != 0
>>> to x!=0.
>>> (ii) popcount(x&1) can be simplified to x&1, and for unsigned x,
>>> popcount(x>>31) to x>>31.
>>> (iii) popcount(x&6) + popcount(y&16) can be simplified to
>>> popcount((x&6)|(y&16)).
>>>
>>> These may seem obscure transformations, but these types of POPCOUNT
>>> operations are often the performance-critical steps in some
>>> cheminformatics applications.
>>>
>>> To implement the above transformations I've introduced the function
>>> tree_nonzero_bits, which is a tree-level version of rtlanal's
>>> nonzero_bits used by the RTL optimizers.
>>>
>>> The following patch has been tested on x86_64-pc-linux-gnu with a
>>> "make bootstrap" and "make check" with no regressions, and passes the
>>> four new gcc.dg test cases.
>>>
>>> Many thanks in advance.  Best regards,
>>>
>>> Roger
>>> --
>>> Roger Sayle, PhD.
>>> NextMove Software Limited
>>> Innovation Centre (Unit 23), Cambridge Science Park, Cambridge, CB4 0EY
>>>
>>> 2018-02-09  Roger Sayle  <ro...@nextmovesoftware.com>
>>>
>>>         * fold-const.c (tree_nonzero_bits): New function.
>>>         * fold-const.h (tree_nonzero_bits): Likewise.
>>>         * match.pd (POPCOUNT): New patterns to fold BUILTIN_POPCOUNT
>>>         and friends.  POPCOUNT(x&1) => x&1, POPCOUNT(x)==0 => x==0, etc.
>>>
>>> 2018-02-09  Roger Sayle  <ro...@nextmovesoftware.com>
>>>
>>>         * gcc.dg/fold-popcount-1.c: New testcase.
>>>         * gcc.dg/fold-popcount-2.c: New testcase.
>>>         * gcc.dg/fold-popcount-3.c: New testcase.
>>>         * gcc.dg/fold-popcount-4.c: New testcase.
>>>
>>> Index: gcc/fold-const.c
>>> ===================================================================
>>> --- gcc/fold-const.c	(revision 257227)
>>> +++ gcc/fold-const.c	(working copy)
>>> @@ -14580,6 +14580,75 @@
>>>      return string + offset;
>>>    }
>>>
>>> +/* Given a tree T, compute which bits in T may be nonzero.  */
>>> +
>>> +wide_int
>>> +tree_nonzero_bits (const_tree t)
>>> +{
>>> +  switch (TREE_CODE (t))
>>> +    {
>>> +    case BIT_IOR_EXPR:
>>> +    case BIT_XOR_EXPR:
>>> +      return wi::bit_or (tree_nonzero_bits (TREE_OPERAND (t, 0)),
>>> +			 tree_nonzero_bits (TREE_OPERAND (t, 1)));
>>
>> Hmm.  I think this will potentially have too many bits set in the
>> BIT_XOR case.  Is there some reason you didn't use wi::bit_xor for
>> that case?
>
> You cannot use bit_xor because the bits are only *possibly* set (same as
> you can't say anything about BIT_NOT_EXPR).  We would need to also track
> the bits *certainly* set to start doing clever stuff, and for
> BIT_XOR_EXPR that would still be inconvenient.

Yea, I realized it was a may vs must issue after I signed off for the night.
jeff