Re: [PATCH] middle-end: Fold popcount(x&4) to (x>>2)&1 and friends.

2020-07-23 Thread Richard Biener via Gcc-patches
On Thu, Jul 23, 2020 at 2:01 PM Roger Sayle  wrote:
>
>
> On Thu, Jul 23, 2020 at 10:02 Richard Biener wrote:
> > Likewise for the existing
> >
> >+/* popcount(X) == 0 is X == 0, and related (in)equalities.  */ (for
> >+(for popcount (POPCOUNT)
> >   (for cmp (le eq ne gt)
> >   rep (eq eq ne ne)
> >   (simplify
> >   (cmp (popcount @0) integer_zerop)
> >   (rep @0 { build_zero_cst (TREE_TYPE (@0)); }
> >
> > you can elide the (for ...)
>
> Unfortunately, I tried that myself when preparing this patch:
>
> Attempt #1:
>
>(for cmp (le eq ne gt)
> rep (eq eq ne ne)
>  (simplify
>(cmp (POPCOUNT @0) integer_zerop)
>(rep @0 { build_zero_cst (TREE_TYPE (@0)); }
>
> results in:
> build/genmatch --gimple ../../patcha3/gcc/match.pd \
> > tmp-gimple-match.c
> ../../patcha3/gcc/match.pd:5977:11 error: operator-list POPCOUNT cannot be 
> expanded inside 'for'
> (cmp (POPCOUNT @0) integer_zerop)
>   ^
> Attempt #2:
>
> (for popcount (POPCOUNT)
>  cmp (le eq ne gt)
>  rep (eq eq ne ne)
>   (simplify
> (cmp (popcount @0) integer_zerop)
> (rep @0 { build_zero_cst (TREE_TYPE (@0)); })))
>
> results in:
> > tmp-gimple-match.c
> ../../patcha3/gcc/match.pd:5975:22 error: All user-defined identifiers must 
> have a multiple number of operator substitutions of the smallest number of 
> substitutions
>  cmp (le eq ne gt)
>  ^

Oh yeah - I failed to spot the outer (for ..), indeed it doesn't work
then.  We're not magically
creating a nest from the first vairant which we could make work - I
erred on the side of
"too much implicitness" caution here.

> I'll leave it the way it is with the nested FORs, and investigate your other 
> suggestions.

Yeah, fine.

> > OK with those changes.
> >
> > Richard.
>
>
> Roger
> --
>
>


RE: [PATCH] middle-end: Fold popcount(x&4) to (x>>2)&1 and friends.

2020-07-23 Thread Roger Sayle


On Thu, Jul 23, 2020 at 10:02 Richard Biener wrote:
> Likewise for the existing
>
>+/* popcount(X) == 0 is X == 0, and related (in)equalities.  */ (for 
>+(for popcount (POPCOUNT)
>   (for cmp (le eq ne gt)
>   rep (eq eq ne ne)
>   (simplify
>   (cmp (popcount @0) integer_zerop)
>   (rep @0 { build_zero_cst (TREE_TYPE (@0)); }
>
> you can elide the (for ...)

Unfortunately, I tried that myself when preparing this patch:

Attempt #1:

   (for cmp (le eq ne gt)
rep (eq eq ne ne)
 (simplify
   (cmp (POPCOUNT @0) integer_zerop)
   (rep @0 { build_zero_cst (TREE_TYPE (@0)); }

results in:
build/genmatch --gimple ../../patcha3/gcc/match.pd \
> tmp-gimple-match.c
../../patcha3/gcc/match.pd:5977:11 error: operator-list POPCOUNT cannot be 
expanded inside 'for'
(cmp (POPCOUNT @0) integer_zerop)
  ^
Attempt #2:

(for popcount (POPCOUNT)
 cmp (le eq ne gt)
 rep (eq eq ne ne)
  (simplify
(cmp (popcount @0) integer_zerop)
(rep @0 { build_zero_cst (TREE_TYPE (@0)); })))

results in:
> tmp-gimple-match.c
../../patcha3/gcc/match.pd:5975:22 error: All user-defined identifiers must 
have a multiple number of operator substitutions of the smallest number of 
substitutions
 cmp (le eq ne gt)
 ^

I'll leave it the way it is with the nested FORs, and investigate your other 
suggestions.

> OK with those changes.
>
> Richard.


Roger
--




Re: [PATCH] middle-end: Fold popcount(x&4) to (x>>2)&1 and friends.

2020-07-23 Thread Richard Biener via Gcc-patches
On Thu, Jul 23, 2020 at 12:58 PM Richard Sandiford
 wrote:
>
> Richard Biener via Gcc-patches  writes:
> > On Thu, Jul 23, 2020 at 11:11 AM Marc Glisse  wrote:
> >>
> >> On Thu, 23 Jul 2020, Marc Glisse wrote:
> >>
> >> > On Wed, 22 Jul 2020, Roger Sayle wrote:
> >> >
> >> >> Many thanks for the peer review and feedback.  I completely agree that
> >> >> POPCOUNT
> >> >> and PARITY iterators simplifies things and handle the IFN_ variants.
> >> >
> >> > Is there a reason why the iterators cannot be used for this one?
> >> >
> >> > +(for popcount (BUILT_IN_POPCOUNT BUILT_IN_POPCOUNTL BUILT_IN_POPCOUNTLL
> >> > +BUILT_IN_POPCOUNTIMAX)
> >> > + parity (BUILT_IN_PARITY BUILT_IN_PARITYL BUILT_IN_PARITYLL
> >> > +  BUILT_IN_PARITYIMAX)
> >> > +  (simplify
> >> > +(bit_and (popcount @0) integer_onep)
> >> > +(parity @0)))
> >>
> >> Ah, maybe because we may have platforms/modes where the optab for popcount
> >> is supported but not the one for parity, and we are not allowed to create
> >> internal calls if the optab is not supported? I think expand_parity tries
> >> to expand parity as popcount&1, so it should be fine, but I didn't
> >> actually try it...
> >
> > Hmm, that might indeed be a problem.  An explicit
> > direct_internal_fn_supported_p guard would help here, like
> > maybe
> >
> >   (if (PARITY != IFN_PARITY
> >|| direct_internal_fn_supported_p (IFN_PARITY, type, 
> > OPTIMIZE_FOR_BOTH))
> >   (PARITY @0)))
> >
> > magic that eventually could be auto-generated by genmatch ... (but only if
> > iterating, so it would be a bit iffy)
>
> The matching code should already reject folds to IFN_PARITY calls that
> aren't supported by the target (gimple-match-head.c:build_call_internal).

Ah, indeed, so it should work unchanged even.

Richard.

> Thanks,
> Richard


Re: [PATCH] middle-end: Fold popcount(x&4) to (x>>2)&1 and friends.

2020-07-23 Thread Richard Sandiford
Richard Biener via Gcc-patches  writes:
> On Thu, Jul 23, 2020 at 11:11 AM Marc Glisse  wrote:
>>
>> On Thu, 23 Jul 2020, Marc Glisse wrote:
>>
>> > On Wed, 22 Jul 2020, Roger Sayle wrote:
>> >
>> >> Many thanks for the peer review and feedback.  I completely agree that
>> >> POPCOUNT
>> >> and PARITY iterators simplifies things and handle the IFN_ variants.
>> >
>> > Is there a reason why the iterators cannot be used for this one?
>> >
>> > +(for popcount (BUILT_IN_POPCOUNT BUILT_IN_POPCOUNTL BUILT_IN_POPCOUNTLL
>> > +BUILT_IN_POPCOUNTIMAX)
>> > + parity (BUILT_IN_PARITY BUILT_IN_PARITYL BUILT_IN_PARITYLL
>> > +  BUILT_IN_PARITYIMAX)
>> > +  (simplify
>> > +(bit_and (popcount @0) integer_onep)
>> > +(parity @0)))
>>
>> Ah, maybe because we may have platforms/modes where the optab for popcount
>> is supported but not the one for parity, and we are not allowed to create
>> internal calls if the optab is not supported? I think expand_parity tries
>> to expand parity as popcount&1, so it should be fine, but I didn't
>> actually try it...
>
> Hmm, that might indeed be a problem.  An explicit
> direct_internal_fn_supported_p guard would help here, like
> maybe
>
>   (if (PARITY != IFN_PARITY
>|| direct_internal_fn_supported_p (IFN_PARITY, type, 
> OPTIMIZE_FOR_BOTH))
>   (PARITY @0)))
>
> magic that eventually could be auto-generated by genmatch ... (but only if
> iterating, so it would be a bit iffy)

The matching code should already reject folds to IFN_PARITY calls that
aren't supported by the target (gimple-match-head.c:build_call_internal).

Thanks,
Richard


Re: [PATCH] middle-end: Fold popcount(x&4) to (x>>2)&1 and friends.

2020-07-23 Thread Richard Biener via Gcc-patches
On Thu, Jul 23, 2020 at 11:11 AM Marc Glisse  wrote:
>
> On Thu, 23 Jul 2020, Marc Glisse wrote:
>
> > On Wed, 22 Jul 2020, Roger Sayle wrote:
> >
> >> Many thanks for the peer review and feedback.  I completely agree that
> >> POPCOUNT
> >> and PARITY iterators simplifies things and handle the IFN_ variants.
> >
> > Is there a reason why the iterators cannot be used for this one?
> >
> > +(for popcount (BUILT_IN_POPCOUNT BUILT_IN_POPCOUNTL BUILT_IN_POPCOUNTLL
> > +BUILT_IN_POPCOUNTIMAX)
> > + parity (BUILT_IN_PARITY BUILT_IN_PARITYL BUILT_IN_PARITYLL
> > +  BUILT_IN_PARITYIMAX)
> > +  (simplify
> > +(bit_and (popcount @0) integer_onep)
> > +(parity @0)))
>
> Ah, maybe because we may have platforms/modes where the optab for popcount
> is supported but not the one for parity, and we are not allowed to create
> internal calls if the optab is not supported? I think expand_parity tries
> to expand parity as popcount&1, so it should be fine, but I didn't
> actually try it...

Hmm, that might indeed be a problem.  An explicit
direct_internal_fn_supported_p guard would help here, like
maybe

  (if (PARITY != IFN_PARITY
   || direct_internal_fn_supported_p (IFN_PARITY, type, OPTIMIZE_FOR_BOTH))
  (PARITY @0)))

magic that eventually could be auto-generated by genmatch ... (but only if
iterating, so it would be a bit iffy)

Richard.

> --
> Marc Glisse


RE: [PATCH] middle-end: Fold popcount(x&4) to (x>>2)&1 and friends.

2020-07-23 Thread Marc Glisse

On Thu, 23 Jul 2020, Marc Glisse wrote:


On Wed, 22 Jul 2020, Roger Sayle wrote:

Many thanks for the peer review and feedback.  I completely agree that 
POPCOUNT

and PARITY iterators simplifies things and handle the IFN_ variants.


Is there a reason why the iterators cannot be used for this one?

+(for popcount (BUILT_IN_POPCOUNT BUILT_IN_POPCOUNTL BUILT_IN_POPCOUNTLL
+  BUILT_IN_POPCOUNTIMAX)
+ parity (BUILT_IN_PARITY BUILT_IN_PARITYL BUILT_IN_PARITYLL
+BUILT_IN_PARITYIMAX)
+  (simplify
+(bit_and (popcount @0) integer_onep)
+(parity @0)))


Ah, maybe because we may have platforms/modes where the optab for popcount 
is supported but not the one for parity, and we are not allowed to create 
internal calls if the optab is not supported? I think expand_parity tries 
to expand parity as popcount&1, so it should be fine, but I didn't 
actually try it...


--
Marc Glisse


Re: [PATCH] middle-end: Fold popcount(x&4) to (x>>2)&1 and friends.

2020-07-23 Thread Richard Biener via Gcc-patches
On Thu, Jul 23, 2020 at 10:15 AM Marc Glisse  wrote:
>
> On Wed, 22 Jul 2020, Roger Sayle wrote:
>
> > Many thanks for the peer review and feedback.  I completely agree that 
> > POPCOUNT
> > and PARITY iterators simplifies things and handle the IFN_ variants.
>
> Is there a reason why the iterators cannot be used for this one?
>
> +(for popcount (BUILT_IN_POPCOUNT BUILT_IN_POPCOUNTL BUILT_IN_POPCOUNTLL
> +  BUILT_IN_POPCOUNTIMAX)
> + parity (BUILT_IN_PARITY BUILT_IN_PARITYL BUILT_IN_PARITYLL
> +BUILT_IN_PARITYIMAX)
> +  (simplify
> +(bit_and (popcount @0) integer_onep)
> +(parity @0)))

I should indeed work fine to even do

 (simplify
   (bit_and (POPCOUNT @0) integer_onep)
   (PARITY @0))

Likewise for the existing

+/* popcount(X) == 0 is X == 0, and related (in)equalities.  */
+(for popcount (POPCOUNT)
   (for cmp (le eq ne gt)
rep (eq eq ne ne)
 (simplify
   (cmp (popcount @0) integer_zerop)
   (rep @0 { build_zero_cst (TREE_TYPE (@0)); }

you can elide the (for ...)

OK with those changes.

Richard.

>
> --
> Marc Glisse


RE: [PATCH] middle-end: Fold popcount(x&4) to (x>>2)&1 and friends.

2020-07-23 Thread Marc Glisse

On Wed, 22 Jul 2020, Roger Sayle wrote:


Many thanks for the peer review and feedback.  I completely agree that POPCOUNT
and PARITY iterators simplifies things and handle the IFN_ variants.


Is there a reason why the iterators cannot be used for this one?

+(for popcount (BUILT_IN_POPCOUNT BUILT_IN_POPCOUNTL BUILT_IN_POPCOUNTLL
+  BUILT_IN_POPCOUNTIMAX)
+ parity (BUILT_IN_PARITY BUILT_IN_PARITYL BUILT_IN_PARITYLL
+BUILT_IN_PARITYIMAX)
+  (simplify
+(bit_and (popcount @0) integer_onep)
+(parity @0)))


--
Marc Glisse


RE: [PATCH] middle-end: Fold popcount(x&4) to (x>>2)&1 and friends.

2020-07-22 Thread Roger Sayle

Hi Richard,

Many thanks for the peer review and feedback.  I completely agree that POPCOUNT
and PARITY iterators simplifies things and handle the IFN_ variants.  Likewise, 
using
integer_type_node as the type of shift constants also matches the idioms used 
elsewhere in match.pd and fold.  The following (combined) patch implements those
suggestions, for both submitted patches and the existing POPCOUNT 
simplifications,
cleaning up this logic and making this part of match.pd more consistent.
[I hadn't appreciated using POPCOUNT/PARITY avoids the need for an explicit 
for].

I've kept the shiftrt unsigned which is both a requirement for the 
transformation (when
the single bit is the sign bit), but this also matches the (canonicalization) 
preference in
the middle-end that unsigned logical shifts are preferred over arithmetic 
shifts when
the distinction isn't important [lshrdi3 is sometimes cheaper than ashrdi3].

This revised patch has been tested on x86_64-pc-linux-gnu with a "make 
bootstrap"
and "make -k check" with no new failures.
Ok for mainline?


2020-07-22  Roger Sayle  
Richard Biener  

gcc/ChangeLog
* match.pd (popcount(x)&1 -> parity(x)): New simplification.
(parity(~x) -> parity(x)): New simplification.
(parity(x)^parity(y) -> parity(x^y)): New simplification.
(parity(x&1) -> x&1): New simplification.
(popcount(x) -> x>>C): New simplification.

gcc/testsuite/ChangeLog
* gcc.dg/fold-popcount-5.c: New test.
* gcc.dg/fold-parity-1.c: Likewise.
* gcc.dg/fold-parity-2.c: Likewise.
* gcc.dg/fold-parity-3.c: Likewise.
* gcc.dg/fold-parity-4.c: Likewise.
* gcc.dg/fold-parity-5.c: Likewise.


Thanks in advance,
Roger
--
Roger Sayle
NextMove Software
Cambridge, UK

-Original Message-
From: Richard Biener  
Sent: 20 July 2020 14:40
To: Roger Sayle 
Cc: GCC Patches 
Subject: Re: [PATCH] middle-end: Fold popcount(x&4) to (x>>2)&1 and friends.

On Mon, Jul 20, 2020 at 3:06 PM Roger Sayle  wrote:
> This patch complements one from June 12th which is still awaiting review: 
> https://gcc.gnu.org/pipermail/gcc-patches/2020-June/547937.html
>
> This patch optimizes popcount and parity of an argument known to have 
> at most a single bit set, to be that single bit.  Hence, popcount(x&8)
> is simplified to (x>>3)&1.   This generalizes the existing optimization
> of popcount(x&1) being simplified to x&1, which is moved with this 
> patch to avoid a duplicate pattern warning in match.pd.
>
> This patch has been tested on x86_64-pc-linux-gnu with a "make bootstrap"
> and "make -k check" with no new failures.  If this is approved after 
> (or at the same time) as the patch above, I'm happy to resolve the 
> conflicts and retest before committing.

Given you know the constant bit position of the possibly nonzero bit you can 
elide the conversion to unsigned for all but the case of a possibly negative 
input (IIRC GCC doesn't yet take advantage of negative right shift 
undefinedness - but maybe sanitizers complain).
Also the shift amount doesn't need to be in the same type as the shifted amount 
so using either size_int() or integer_type_node for that argument should reduce 
INTEGER_CST waste.

Any reason you are not tackling IFN_POPCOUNT/PARITY?
You could use

(for pfun (POPCOUNT PARITY)
 ...

and automagically get all builtins and the IFN.

Thanks,
Richard.

diff --git a/gcc/match.pd b/gcc/match.pd
index c6ae7a7..a096a17 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -5964,25 +5964,55 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
(IFN_FMA @0 @1 @2
 
 /* POPCOUNT simplifications.  */
-(for popcount (BUILT_IN_POPCOUNT BUILT_IN_POPCOUNTL BUILT_IN_POPCOUNTLL
-  BUILT_IN_POPCOUNTIMAX)
-  /* popcount(X&1) is nop_expr(X&1).  */
-  (simplify
-(popcount @0)
-(if (tree_nonzero_bits (@0) == 1)
-  (convert @0)))
-  /* popcount(X) + popcount(Y) is popcount(X|Y) when X must be zero.  */
-  (simplify
-(plus (popcount:s @0) (popcount:s @1))
-(if (wi::bit_and (tree_nonzero_bits (@0), tree_nonzero_bits (@1)) == 0)
-  (popcount (bit_ior @0 @1
-  /* popcount(X) == 0 is X == 0, and related (in)equalities.  */
+/* popcount(X) + popcount(Y) is popcount(X|Y) when X must be zero.  */
+(simplify
+  (plus (POPCOUNT:s @0) (POPCOUNT:s @1))
+  (if (wi::bit_and (tree_nonzero_bits (@0), tree_nonzero_bits (@1)) == 0)
+(POPCOUNT (bit_ior @0 @1
+
+/* popcount(X) == 0 is X == 0, and related (in)equalities.  */
+(for popcount (POPCOUNT)
   (for cmp (le eq ne gt)
rep (eq eq ne ne)
 (simplify
   (cmp (popcount @0) integer_zerop)
   (rep @0 { build_zero_cst (TREE_TYPE (@0)); }
 
+/* Canonicalize POPCOUNT(x)&1 as PARITY(X).  */
+(for popcount (BUILT_IN_POPCOUNT BUILT_IN_POPCOUNTL BUILT_IN_POPCOUNTLL
+  B

Re: [PATCH] middle-end: Fold popcount(x&4) to (x>>2)&1 and friends.

2020-07-20 Thread Florian Weimer via Gcc-patches
* Richard Biener via Gcc-patches:

> Given you know the constant bit position of the possibly nonzero
> bit you can elide the conversion to unsigned for all but the case
> of a possibly negative input (IIRC GCC doesn't yet take advantage
> of negative right shift undefinedness - but maybe sanitizers complain).

It's not undefined: “Signed '>>' acts on negative numbers by sign
extension.”

Thanks,
Florian



Re: [PATCH] middle-end: Fold popcount(x&4) to (x>>2)&1 and friends.

2020-07-20 Thread Richard Biener via Gcc-patches
On Mon, Jul 20, 2020 at 3:06 PM Roger Sayle  wrote:
>
>
> This patch complements one from June 12th which is still awaiting
> review: https://gcc.gnu.org/pipermail/gcc-patches/2020-June/547937.html
>
> This patch optimizes popcount and parity of an argument known to have
> at most a single bit set, to be that single bit.  Hence, popcount(x&8)
> is simplified to (x>>3)&1.   This generalizes the existing optimization
> of popcount(x&1) being simplified to x&1, which is moved with this
> patch to avoid a duplicate pattern warning in match.pd.
>
> This patch has been tested on x86_64-pc-linux-gnu with a "make bootstrap"
> and "make -k check" with no new failures.  If this is approved after
> (or at the same time) as the patch above, I'm happy to resolve the
> conflicts and retest before committing.

Given you know the constant bit position of the possibly nonzero
bit you can elide the conversion to unsigned for all but the case
of a possibly negative input (IIRC GCC doesn't yet take advantage
of negative right shift undefinedness - but maybe sanitizers complain).
Also the shift amount doesn't need to be in the same type as
the shifted amount so using either size_int() or integer_type_node
for that argument should reduce INTEGER_CST waste.

Any reason you are not tackling IFN_POPCOUNT/PARITY?
You could use

(for pfun (POPCOUNT PARITY)
 ...

and automagically get all builtins and the IFN.

Thanks,
Richard.



>
> 2020-07-20  Roger Sayle  
>
> gcc/ChangeLog
> * match.pd (popcount(x) -> x>>C): New simplification.
>
> gcc/testsuite
> * gcc.dg/fold-popcount-5.c: New test.
> * gcc.dg/fold-parity-5.c: Likewise.
>
>
> Ok for mainline?
> Thanks in advance,
> Roger
> --
> Roger Sayle
> NextMove Software
> Cambridge, UK
>