subject:"\[PATCH\] improve folding of expressions that move a single bit around"

Re: [PATCH] improve folding of expressions that move a single bit around

2016-11-29 Thread Jeff Law


On 11/29/2016 03:16 AM, Richard Biener wrote:

On Mon, Nov 28, 2016 at 7:41 PM, Jeff Law  wrote:

On 11/28/2016 06:10 AM, Paolo Bonzini wrote:




On 27/11/2016 00:28, Marc Glisse wrote:


On Sat, 26 Nov 2016, Paolo Bonzini wrote:


--- match.pd(revision 242742)
+++ match.pd(working copy)
@@ -2554,6 +2554,19 @@
  (cmp (bit_and@2 @0 integer_pow2p@1) @1)
  (icmp @2 { build_zero_cst (TREE_TYPE (@0)); })))

+/* If we have (A & C) != 0 ? D : 0 where C and D are powers of 2,
+   convert this into a shift of (A & C).  */
+(simplify
+ (cond
+  (ne (bit_and@2 @0 integer_pow2p@1) integer_zerop)
+  integer_pow2p@3 integer_zerop)
+ (with {
+int shift = wi::exact_log2 (@3) - wi::exact_log2 (@1);
+  }
+  (if (shift > 0)
+   (lshift (convert @2) { build_int_cst (integer_type_node, shift); })
+   (convert (rshift @2 { build_int_cst (integer_type_node, -shift);
})



What happens if @1 is the sign bit, in a signed type? Do we get an
arithmetic shift right?



It shouldn't happen because the canonical form of a sign bit test is A <
0 (that's the pattern immediately after).  However I can add an "if" if
preferred, or change the pattern to do the AND after the shift.


But are we absolutely sure it'll be in canonical form every time?


No, of course not (though it would be a bug).  If the pattern generates wrong
code when the non-canonical form is met that would be bad, if it merely
does not optimize (or optimize non-optimally) then that's not too bad.
Agreed.  I managed to convince myself that for a signed type with the 
sign bit on that we'd generate incorrect code.  But that was from a 
quick review of the pattern.


Jeff

Re: [PATCH] improve folding of expressions that move a single bit around

2016-11-29 Thread Richard Biener

On Tue, Nov 29, 2016 at 12:50 PM, Paolo Bonzini  wrote:
>
>
> On 29/11/2016 11:16, Richard Biener wrote:
 >>  (bit_and
 >>   (if (shift > 0)
 >>(lshift (convert @0) { build_int_cst (integer_type_node, shift); })
 >>(convert (rshift @0 { build_int_cst (integer_type_node, -shift); })))
 >>   @3)
 >>
 >> What do you think?
>>> >
>>> > Shouldn't be necessary if we're working on unsigned types.  May be 
>>> > necessary
>>> > on signed types unless we're 100% sure sign bit tests will be 
>>> > canonicalized.
>> Instead of the bit_and better put a if() condition around this.
>
> Note that the bit_and is not duplicated.  While v1 of the patch did
> and-then-shift (the source BIT_AND_EXPR was @2), here I'm doing
> shift-then-and so I would stop capturing the source BIT_AND_EXPR.

Ah, ok.

> It also matches what I'm doing for the A < 0 ? D : C pattern, so I think
> it's nicer this way.
>
> (BTW the above of course doesn't work because (if...) is only allowed at
> the top level, but I've bootstrapped and regtested already a version
> that works).
>
> Paolo

Re: [PATCH] improve folding of expressions that move a single bit around

2016-11-29 Thread Paolo Bonzini

On 29/11/2016 11:16, Richard Biener wrote:
>>> >>  (bit_and
>>> >>   (if (shift > 0)
>>> >>(lshift (convert @0) { build_int_cst (integer_type_node, shift); })
>>> >>(convert (rshift @0 { build_int_cst (integer_type_node, -shift); })))
>>> >>   @3)
>>> >>
>>> >> What do you think?
>> >
>> > Shouldn't be necessary if we're working on unsigned types.  May be 
>> > necessary
>> > on signed types unless we're 100% sure sign bit tests will be 
>> > canonicalized.
> Instead of the bit_and better put a if() condition around this.

Note that the bit_and is not duplicated.  While v1 of the patch did
and-then-shift (the source BIT_AND_EXPR was @2), here I'm doing
shift-then-and so I would stop capturing the source BIT_AND_EXPR.

It also matches what I'm doing for the A < 0 ? D : C pattern, so I think
it's nicer this way.

(BTW the above of course doesn't work because (if...) is only allowed at
the top level, but I've bootstrapped and regtested already a version
that works).

Paolo

Re: [PATCH] improve folding of expressions that move a single bit around

2016-11-29 Thread Richard Biener

On Mon, Nov 28, 2016 at 7:41 PM, Jeff Law  wrote:
> On 11/28/2016 06:10 AM, Paolo Bonzini wrote:
>>
>>
>>
>> On 27/11/2016 00:28, Marc Glisse wrote:
>>>
>>> On Sat, 26 Nov 2016, Paolo Bonzini wrote:
>>>
 --- match.pd(revision 242742)
 +++ match.pd(working copy)
 @@ -2554,6 +2554,19 @@
   (cmp (bit_and@2 @0 integer_pow2p@1) @1)
   (icmp @2 { build_zero_cst (TREE_TYPE (@0)); })))

 +/* If we have (A & C) != 0 ? D : 0 where C and D are powers of 2,
 +   convert this into a shift of (A & C).  */
 +(simplify
 + (cond
 +  (ne (bit_and@2 @0 integer_pow2p@1) integer_zerop)
 +  integer_pow2p@3 integer_zerop)
 + (with {
 +int shift = wi::exact_log2 (@3) - wi::exact_log2 (@1);
 +  }
 +  (if (shift > 0)
 +   (lshift (convert @2) { build_int_cst (integer_type_node, shift); })
 +   (convert (rshift @2 { build_int_cst (integer_type_node, -shift);
 })
>>>
>>>
>>> What happens if @1 is the sign bit, in a signed type? Do we get an
>>> arithmetic shift right?
>>
>>
>> It shouldn't happen because the canonical form of a sign bit test is A <
>> 0 (that's the pattern immediately after).  However I can add an "if" if
>> preferred, or change the pattern to do the AND after the shift.
>
> But are we absolutely sure it'll be in canonical form every time?

No, of course not (though it would be a bug).  If the pattern generates wrong
code when the non-canonical form is met that would be bad, if it merely
does not optimize (or optimize non-optimally) then that's not too bad.

>   Is there
> a pattern in in match.pd which turns an (x & SIGN_BIT) tests into canonical
> form?
>
>
>>
>>  (bit_and
>>   (if (shift > 0)
>>(lshift (convert @0) { build_int_cst (integer_type_node, shift); })
>>(convert (rshift @0 { build_int_cst (integer_type_node, -shift); })))
>>   @3)
>>
>> What do you think?
>
> Shouldn't be necessary if we're working on unsigned types.  May be necessary
> on signed types unless we're 100% sure sign bit tests will be canonicalized.

Instead of the bit_and better put a if() condition around this.

Richard.


> jeff

Re: [PATCH] improve folding of expressions that move a single bit around

2016-11-28 Thread Jeff Law


On 11/28/2016 06:10 AM, Paolo Bonzini wrote:



On 27/11/2016 00:28, Marc Glisse wrote:

On Sat, 26 Nov 2016, Paolo Bonzini wrote:


--- match.pd(revision 242742)
+++ match.pd(working copy)
@@ -2554,6 +2554,19 @@
  (cmp (bit_and@2 @0 integer_pow2p@1) @1)
  (icmp @2 { build_zero_cst (TREE_TYPE (@0)); })))

+/* If we have (A & C) != 0 ? D : 0 where C and D are powers of 2,
+   convert this into a shift of (A & C).  */
+(simplify
+ (cond
+  (ne (bit_and@2 @0 integer_pow2p@1) integer_zerop)
+  integer_pow2p@3 integer_zerop)
+ (with {
+int shift = wi::exact_log2 (@3) - wi::exact_log2 (@1);
+  }
+  (if (shift > 0)
+   (lshift (convert @2) { build_int_cst (integer_type_node, shift); })
+   (convert (rshift @2 { build_int_cst (integer_type_node, -shift);
})


What happens if @1 is the sign bit, in a signed type? Do we get an
arithmetic shift right?


It shouldn't happen because the canonical form of a sign bit test is A <
0 (that's the pattern immediately after).  However I can add an "if" if
preferred, or change the pattern to do the AND after the shift.
But are we absolutely sure it'll be in canonical form every time?   Is 
there a pattern in in match.pd which turns an (x & SIGN_BIT) tests into 
canonical form?





 (bit_and
  (if (shift > 0)
   (lshift (convert @0) { build_int_cst (integer_type_node, shift); })
   (convert (rshift @0 { build_int_cst (integer_type_node, -shift); })))
  @3)

What do you think?
Shouldn't be necessary if we're working on unsigned types.  May be 
necessary on signed types unless we're 100% sure sign bit tests will be 
canonicalized.


jeff

Re: [PATCH] improve folding of expressions that move a single bit around

2016-11-28 Thread Paolo Bonzini



On 27/11/2016 00:28, Marc Glisse wrote:
> On Sat, 26 Nov 2016, Paolo Bonzini wrote:
> 
>> --- match.pd(revision 242742)
>> +++ match.pd(working copy)
>> @@ -2554,6 +2554,19 @@
>>   (cmp (bit_and@2 @0 integer_pow2p@1) @1)
>>   (icmp @2 { build_zero_cst (TREE_TYPE (@0)); })))
>>
>> +/* If we have (A & C) != 0 ? D : 0 where C and D are powers of 2,
>> +   convert this into a shift of (A & C).  */
>> +(simplify
>> + (cond
>> +  (ne (bit_and@2 @0 integer_pow2p@1) integer_zerop)
>> +  integer_pow2p@3 integer_zerop)
>> + (with {
>> +int shift = wi::exact_log2 (@3) - wi::exact_log2 (@1);
>> +  }
>> +  (if (shift > 0)
>> +   (lshift (convert @2) { build_int_cst (integer_type_node, shift); })
>> +   (convert (rshift @2 { build_int_cst (integer_type_node, -shift);
>> })
> 
> What happens if @1 is the sign bit, in a signed type? Do we get an
> arithmetic shift right?

It shouldn't happen because the canonical form of a sign bit test is A <
0 (that's the pattern immediately after).  However I can add an "if" if
preferred, or change the pattern to do the AND after the shift.

 (bit_and
  (if (shift > 0)
   (lshift (convert @0) { build_int_cst (integer_type_node, shift); })
   (convert (rshift @0 { build_int_cst (integer_type_node, -shift); })))
  @3)

What do you think?

Paolo

Re: [PATCH] improve folding of expressions that move a single bit around

2016-11-26 Thread Segher Boessenkool

On Sat, Nov 26, 2016 at 11:22:44PM +0100, Paolo Bonzini wrote:
> The combine.c hunk instead is needed to simplify cases that do not use the
> ternary operator (the "h" and "i" functions in the testcases) like this:
> 
>   return ((x >> 9) & 1) << 7;
> 
> Normally this is simplified just fine to a single shift and an AND.
> Here, however, the bit to preserve after (x >> 9 << 7) is the QImode
> sign bit, and if_then_else_cond produces a complicated concoction
> involving (ne:SI (subreg:QI ...)).  simplify_if_then_else cannot then
> reduce it back to the original.  In fact, simplify_if_then_else does
> have a similar pattern, but it cannot deal with the subreg.  This is
> easily done by ZERO_EXTENDing the result from QImode back to the
> comparison's mode.  The shift/shift/and or shift/and/shift combination
> can then be reduced to shift+and just like for any other bit position.

The combine part is fine, thanks for the patch.


Segher


> 2016-11-26  Paolo Bonzini  
> 
>   * combine.c (simplify_if_then_else): Simplify IF_THEN_ELSE
>   that isolates a single bit, even if the condition involves
>   subregs.
>   * match.pd: Simplify X ? C : 0 where C is a power of 2 and
>   X tests a single bit.
> 
> 2016-11-26  Paolo Bonzini  
> 
>   * gcc.dg/fold-and-lshift.c, gcc.dg/fold-and-rshift-1.c,
>   gcc.dg/fold-and-rshift-2.c: New testcases.

Re: [PATCH] improve folding of expressions that move a single bit around

2016-11-26 Thread Marc Glisse


On Sat, 26 Nov 2016, Paolo Bonzini wrote:


--- match.pd(revision 242742)
+++ match.pd(working copy)
@@ -2554,6 +2554,19 @@
  (cmp (bit_and@2 @0 integer_pow2p@1) @1)
  (icmp @2 { build_zero_cst (TREE_TYPE (@0)); })))

+/* If we have (A & C) != 0 ? D : 0 where C and D are powers of 2,
+   convert this into a shift of (A & C).  */
+(simplify
+ (cond
+  (ne (bit_and@2 @0 integer_pow2p@1) integer_zerop)
+  integer_pow2p@3 integer_zerop)
+ (with {
+int shift = wi::exact_log2 (@3) - wi::exact_log2 (@1);
+  }
+  (if (shift > 0)
+   (lshift (convert @2) { build_int_cst (integer_type_node, shift); })
+   (convert (rshift @2 { build_int_cst (integer_type_node, -shift); })


What happens if @1 is the sign bit, in a signed type? Do we get an 
arithmetic shift right?


--
Marc Glisse

[PATCH] improve folding of expressions that move a single bit around

2016-11-26 Thread Paolo Bonzini

In code like the following from KVM:

/* it is a read fault? */
error_code = (exit_qualification << 2) & PFERR_FETCH_MASK;

it would be nicer to write

/* it is a read fault? */
error_code = (exit_qualification & VMX_EPT_READ_FAULT_MASK) ? 
PFERR_FETCH_MASK : 0;

instead of having to know the difference between the positions of the
source and destination bits.  LLVM catches the latter just fine (which
is why I am sending this in stage 3...), but GCC does not, so this
patch adds two patterns to catch it.

The combine.c hunk instead is needed to simplify cases that do not use the
ternary operator (the "h" and "i" functions in the testcases) like this:

return ((x >> 9) & 1) << 7;

Normally this is simplified just fine to a single shift and an AND.
Here, however, the bit to preserve after (x >> 9 << 7) is the QImode
sign bit, and if_then_else_cond produces a complicated concoction
involving (ne:SI (subreg:QI ...)).  simplify_if_then_else cannot then
reduce it back to the original.  In fact, simplify_if_then_else does
have a similar pattern, but it cannot deal with the subreg.  This is
easily done by ZERO_EXTENDing the result from QImode back to the
comparison's mode.  The shift/shift/and or shift/and/shift combination
can then be reduced to shift+and just like for any other bit position.

These forms are not included in the fold-and-rshift-2.c testcase,
because in this case a shift+shift (without the following AND) is a
valid alternative too; and at least on x86 it has the same cost as
shift+and.  Compare:

movl%edi, %eax
sarl$24, %eax
andl$128, %eax
ret

and

movl%edi, %eax
shrl$31, %eax
sall$7, %eax

Bootstrapped/regtested x86_64-pc-linux-gnu, ok?

Paolo

2016-11-26  Paolo Bonzini  

* combine.c (simplify_if_then_else): Simplify IF_THEN_ELSE
that isolates a single bit, even if the condition involves
subregs.
* match.pd: Simplify X ? C : 0 where C is a power of 2 and
X tests a single bit.

2016-11-26  Paolo Bonzini  

* gcc.dg/fold-and-lshift.c, gcc.dg/fold-and-rshift-1.c,
gcc.dg/fold-and-rshift-2.c: New testcases.

Index: combine.c
===
--- combine.c   (revision 242742)
+++ combine.c   (working copy)
@@ -6522,14 +6522,22 @@
   simplify_shift_const (NULL_RTX, ASHIFT, mode,
gen_lowpart (mode, XEXP (cond, 0)), i);
 
-  /* (IF_THEN_ELSE (NE REG 0) (0) (8)) is REG for nonzero_bits (REG) == 8.  */
+  /* (IF_THEN_ELSE (NE A 0) C1 0) is A or a zero-extend of A if the only
+ non-zero bit in A is C1.  */
   if (true_code == NE && XEXP (cond, 1) == const0_rtx
   && false_rtx == const0_rtx && CONST_INT_P (true_rtx)
-  && GET_MODE (XEXP (cond, 0)) == mode
+  && INTEGRAL_MODE_P (GET_MODE (XEXP (cond, 0)))
   && (UINTVAL (true_rtx) & GET_MODE_MASK (mode))
- == nonzero_bits (XEXP (cond, 0), mode)
+ == nonzero_bits (XEXP (cond, 0), GET_MODE (XEXP (cond, 0)))
   && (i = exact_log2 (UINTVAL (true_rtx) & GET_MODE_MASK (mode))) >= 0)
-return XEXP (cond, 0);
+{
+  rtx val = XEXP (cond, 0);
+  enum machine_mode val_mode = GET_MODE (val);
+  if (val_mode == mode)
+return val;
+  else if (GET_MODE_PRECISION (val_mode) < GET_MODE_PRECISION (mode))
+return simplify_gen_unary (ZERO_EXTEND, mode, val, val_mode);
+}
 
   return x;
 }
Index: match.pd
===
--- match.pd(revision 242742)
+++ match.pd(working copy)
@@ -2554,6 +2554,19 @@
   (cmp (bit_and@2 @0 integer_pow2p@1) @1)
   (icmp @2 { build_zero_cst (TREE_TYPE (@0)); })))
  
+/* If we have (A & C) != 0 ? D : 0 where C and D are powers of 2,
+   convert this into a shift of (A & C).  */
+(simplify
+ (cond
+  (ne (bit_and@2 @0 integer_pow2p@1) integer_zerop)
+  integer_pow2p@3 integer_zerop)
+ (with {
+int shift = wi::exact_log2 (@3) - wi::exact_log2 (@1);
+  }
+  (if (shift > 0)
+   (lshift (convert @2) { build_int_cst (integer_type_node, shift); })
+   (convert (rshift @2 { build_int_cst (integer_type_node, -shift); })
+
 /* If we have (A & C) != 0 where C is the sign bit of A, convert
this into A < 0.  Similarly for (A & C) == 0 into A >= 0.  */
 (for cmp (eq ne)
@@ -2568,6 +2581,19 @@
(with { tree stype = signed_type_for (TREE_TYPE (@0)); }
 (ncmp (convert:stype @0) { build_zero_cst (stype); })
 
+/* If we have A < 0 ? C : 0 where C and D are powers of 2,
+   convert this into a right shift and AND.  */
+(simplify
+ (cond
+  (lt @0 integer_zerop)
+  integer_pow2p@1 integer_zerop)
+ (with {
+int shift = element_precision (@0) - wi::exact_log2 (@1) - 1;
+  }
+  (bit_and
+   (convert (rshift @0 { build_int_cst (integer_type_node, shift); }))
+   @1)))
+
 /* When the addresses are not directly of decls

Re: [PATCH] improve folding of expressions that move a single bit around

Re: [PATCH] improve folding of expressions that move a single bit around

Re: [PATCH] improve folding of expressions that move a single bit around

Re: [PATCH] improve folding of expressions that move a single bit around

Re: [PATCH] improve folding of expressions that move a single bit around

Re: [PATCH] improve folding of expressions that move a single bit around

Re: [PATCH] improve folding of expressions that move a single bit around

Re: [PATCH] improve folding of expressions that move a single bit around

[PATCH] improve folding of expressions that move a single bit around

9 matches

Site Navigation

Mail list logo

Footer information