On 13 October 2016 at 13:22, Marc Glisse <marc.gli...@inria.fr> wrote: > On Thu, 13 Oct 2016, Prathamesh Kulkarni wrote: > >> On 12 October 2016 at 14:43, Richard Biener <rguent...@suse.de> wrote: >>> >>> On Wed, 12 Oct 2016, Marc Glisse wrote: >>> >>>> On Wed, 12 Oct 2016, Prathamesh Kulkarni wrote: >>>> >>>>> I was having a look at PR71636 and added the following pattern to >>>>> match.pd: >>>>> x & ((1U << b) - 1) -> x & ~(~0U << b) >>>>> However the transform is useful only if the target supports "andnot" >>>>> instruction. >>>> >>>> >>>> rth was selling the transformation as a canonicalization, which is >>>> beneficial >>>> when there is an andnot instruction, and neutral otherwise, so it could >>>> be >>>> done always. >>> >>> >>> Well, its three instructions to three instructions and a more expensive >>> constant(?). ~0U might not be available as immediate for the shift >>> instruction and 1U << b might be available as a bit-set instruction ... >>> (vs. the andnot). > > > True, I hadn't thought of bit-set. > >>> So yes, we might decide to canonicalize to andnot (and decide that >>> three binary to two binary and one unary op is "better"). >>> >>> So no excuse to explore the target specific .pd fragment idea ... :/ >> >> Hi, >> I have attached patch that adds the transform. >> Does that look OK ? > > > Why bit_not of build_zero_cst instead of build_all_ones_cst, as suggested in > the PR? If we only do the transformation when (1<<b)-1 is fed to a bit_and, > then we probably want to require that it has a single use (maybe even the > shift). > >> I am not sure how to write test-cases for it though. >> For the test-case: >> unsigned f(unsigned x, unsigned b) >> { >> unsigned t1 = 1U << b; >> unsigned t2 = t1 - 1; >> unsigned t3 = x & t2; >> return t3; >> } >> >> forwprop dump shows: >> Applying pattern match.pd:523, gimple-match.c:47419 >> gimple_simplified to _6 = 4294967295 << b_1(D); >> _8 = ~_6; >> t3_5 = x_4(D) & _8; >> >> I could scan for "_6 = 4294967295 << b_1(D);" however I suppose >> ~0 would depend on width of int and not always be 4294967295 ? >> Or should I scan for "_6 = 4294967295 << b_1(D);" >> and add /* { dg-require-effective int32 } */ to the test-case ? > > > You could check that you have ~, or that you don't have " 1 << ". Thanks for the suggestions. Does the attached patch look OK ?
For test-cases, scan-tree-dump-not "1 <<" works well for pr71636-1.c which tests GENERIC folding, however for GIMPLE folding, "1 << " still remains in the forwprop dump because dce isn't run to remove unused values. For the test-case: unsigned f(unsigned x, unsigned b) { unsigned t1 = 1U << b; unsigned t2 = t1 - 1; unsigned t3 = x & t2; return t3; } forwprop dump shows: Applying pattern match.pd:523, gimple-match.c:47418 gimple_simplified to _6 = 4294967295 << b_1(D); _8 = ~_6; t3_5 = x_4(D) & _8; f (unsigned int x, unsigned int b) { unsigned int t3; unsigned int t2; unsigned int t1; unsigned int _6; unsigned int _8; <bb 2>: t1_2 = 1 << b_1(D); t2_3 = t1_2 + 4294967295; _6 = 4294967295 << b_1(D); _8 = ~_6; t3_5 = x_4(D) & _8; return t3_5; } Instead I scanned for _8 = ~_6 with: /* { dg-final { scan-tree-dump "_\[0-9\] = ~_\[0-9\]" "forwprop1" } } */ because rhs has bit_not and lhs doesn't. Is that OK ? Bootstrap+tested on x86_64-unknown-linux-gnu. Cross-tested on arm*-*-*, aarch64*-*-* Thanks, Prathamesh > > -- > Marc Glisse
2016-10-15 Prathamesh Kulkarni <prathamesh.kulka...@linaro.org> * match.pd (x & ((1 << b) - 1) -> x & ~(~0 << b)): New pattern. testsuite/ * gcc.dg/pr71636-1.c: New test-case. * gcc.dg/pr71636-2.c: Likewise. diff --git a/gcc/match.pd b/gcc/match.pd index 1d80613..24ebf38 100644 --- a/gcc/match.pd +++ b/gcc/match.pd @@ -516,6 +516,12 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) (bit_and:c (convert? @0) (convert? (bit_not @0))) { build_zero_cst (type); }) +/* PR71636: Transform x & ((1U << b) - 1) -> x & ~(~0U << b); */ +(simplify + (bit_and:c @0 (plus:s (lshift:s integer_onep @1) integer_minus_onep)) + (if (TYPE_UNSIGNED (type)) + (bit_and @0 (bit_not (lshift { build_all_ones_cst (type); } @1))))) + /* Fold (A & ~B) - (A & B) into (A ^ B) - B. */ (simplify (minus (bit_and:cs @0 (bit_not @1)) (bit_and:cs @0 @1)) diff --git a/gcc/testsuite/gcc.dg/pr71636-1.c b/gcc/testsuite/gcc.dg/pr71636-1.c new file mode 100644 index 0000000..2df5f96 --- /dev/null +++ b/gcc/testsuite/gcc.dg/pr71636-1.c @@ -0,0 +1,9 @@ +/* { dg-do compile } */ +/* { dg-options "-fdump-tree-gimple" } */ + +unsigned f(unsigned x, unsigned b) +{ + return x & ((1U << b) - 1); +} + +/* { dg-final { scan-tree-dump-not "1 <<" "gimple" } } */ diff --git a/gcc/testsuite/gcc.dg/pr71636-2.c b/gcc/testsuite/gcc.dg/pr71636-2.c new file mode 100644 index 0000000..9e9297d --- /dev/null +++ b/gcc/testsuite/gcc.dg/pr71636-2.c @@ -0,0 +1,12 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-forwprop-details" } */ + +unsigned f(unsigned x, unsigned b) +{ + unsigned t1 = 1U << b; + unsigned t2 = t1 - 1; + unsigned t3 = x & t2; + return t3; +} + +/* { dg-final { scan-tree-dump "_\[0-9\] = ~_\[0-9\]" "forwprop1" } } */