Re: [PATCH] improve folding of expressions that move a single bit around
On 11/29/2016 03:16 AM, Richard Biener wrote: On Mon, Nov 28, 2016 at 7:41 PM, Jeff Lawwrote: On 11/28/2016 06:10 AM, Paolo Bonzini wrote: On 27/11/2016 00:28, Marc Glisse wrote: On Sat, 26 Nov 2016, Paolo Bonzini wrote: --- match.pd(revision 242742) +++ match.pd(working copy) @@ -2554,6 +2554,19 @@ (cmp (bit_and@2 @0 integer_pow2p@1) @1) (icmp @2 { build_zero_cst (TREE_TYPE (@0)); }))) +/* If we have (A & C) != 0 ? D : 0 where C and D are powers of 2, + convert this into a shift of (A & C). */ +(simplify + (cond + (ne (bit_and@2 @0 integer_pow2p@1) integer_zerop) + integer_pow2p@3 integer_zerop) + (with { +int shift = wi::exact_log2 (@3) - wi::exact_log2 (@1); + } + (if (shift > 0) + (lshift (convert @2) { build_int_cst (integer_type_node, shift); }) + (convert (rshift @2 { build_int_cst (integer_type_node, -shift); }) What happens if @1 is the sign bit, in a signed type? Do we get an arithmetic shift right? It shouldn't happen because the canonical form of a sign bit test is A < 0 (that's the pattern immediately after). However I can add an "if" if preferred, or change the pattern to do the AND after the shift. But are we absolutely sure it'll be in canonical form every time? No, of course not (though it would be a bug). If the pattern generates wrong code when the non-canonical form is met that would be bad, if it merely does not optimize (or optimize non-optimally) then that's not too bad. Agreed. I managed to convince myself that for a signed type with the sign bit on that we'd generate incorrect code. But that was from a quick review of the pattern. Jeff
Re: [PATCH] improve folding of expressions that move a single bit around
On Tue, Nov 29, 2016 at 12:50 PM, Paolo Bonziniwrote: > > > On 29/11/2016 11:16, Richard Biener wrote: >> (bit_and >> (if (shift > 0) >>(lshift (convert @0) { build_int_cst (integer_type_node, shift); }) >>(convert (rshift @0 { build_int_cst (integer_type_node, -shift); }))) >> @3) >> >> What do you think? >>> > >>> > Shouldn't be necessary if we're working on unsigned types. May be >>> > necessary >>> > on signed types unless we're 100% sure sign bit tests will be >>> > canonicalized. >> Instead of the bit_and better put a if() condition around this. > > Note that the bit_and is not duplicated. While v1 of the patch did > and-then-shift (the source BIT_AND_EXPR was @2), here I'm doing > shift-then-and so I would stop capturing the source BIT_AND_EXPR. Ah, ok. > It also matches what I'm doing for the A < 0 ? D : C pattern, so I think > it's nicer this way. > > (BTW the above of course doesn't work because (if...) is only allowed at > the top level, but I've bootstrapped and regtested already a version > that works). > > Paolo
Re: [PATCH] improve folding of expressions that move a single bit around
On 29/11/2016 11:16, Richard Biener wrote: >>> >> (bit_and >>> >> (if (shift > 0) >>> >>(lshift (convert @0) { build_int_cst (integer_type_node, shift); }) >>> >>(convert (rshift @0 { build_int_cst (integer_type_node, -shift); }))) >>> >> @3) >>> >> >>> >> What do you think? >> > >> > Shouldn't be necessary if we're working on unsigned types. May be >> > necessary >> > on signed types unless we're 100% sure sign bit tests will be >> > canonicalized. > Instead of the bit_and better put a if() condition around this. Note that the bit_and is not duplicated. While v1 of the patch did and-then-shift (the source BIT_AND_EXPR was @2), here I'm doing shift-then-and so I would stop capturing the source BIT_AND_EXPR. It also matches what I'm doing for the A < 0 ? D : C pattern, so I think it's nicer this way. (BTW the above of course doesn't work because (if...) is only allowed at the top level, but I've bootstrapped and regtested already a version that works). Paolo
Re: [PATCH] improve folding of expressions that move a single bit around
On Mon, Nov 28, 2016 at 7:41 PM, Jeff Lawwrote: > On 11/28/2016 06:10 AM, Paolo Bonzini wrote: >> >> >> >> On 27/11/2016 00:28, Marc Glisse wrote: >>> >>> On Sat, 26 Nov 2016, Paolo Bonzini wrote: >>> --- match.pd(revision 242742) +++ match.pd(working copy) @@ -2554,6 +2554,19 @@ (cmp (bit_and@2 @0 integer_pow2p@1) @1) (icmp @2 { build_zero_cst (TREE_TYPE (@0)); }))) +/* If we have (A & C) != 0 ? D : 0 where C and D are powers of 2, + convert this into a shift of (A & C). */ +(simplify + (cond + (ne (bit_and@2 @0 integer_pow2p@1) integer_zerop) + integer_pow2p@3 integer_zerop) + (with { +int shift = wi::exact_log2 (@3) - wi::exact_log2 (@1); + } + (if (shift > 0) + (lshift (convert @2) { build_int_cst (integer_type_node, shift); }) + (convert (rshift @2 { build_int_cst (integer_type_node, -shift); }) >>> >>> >>> What happens if @1 is the sign bit, in a signed type? Do we get an >>> arithmetic shift right? >> >> >> It shouldn't happen because the canonical form of a sign bit test is A < >> 0 (that's the pattern immediately after). However I can add an "if" if >> preferred, or change the pattern to do the AND after the shift. > > But are we absolutely sure it'll be in canonical form every time? No, of course not (though it would be a bug). If the pattern generates wrong code when the non-canonical form is met that would be bad, if it merely does not optimize (or optimize non-optimally) then that's not too bad. > Is there > a pattern in in match.pd which turns an (x & SIGN_BIT) tests into canonical > form? > > >> >> (bit_and >> (if (shift > 0) >>(lshift (convert @0) { build_int_cst (integer_type_node, shift); }) >>(convert (rshift @0 { build_int_cst (integer_type_node, -shift); }))) >> @3) >> >> What do you think? > > Shouldn't be necessary if we're working on unsigned types. May be necessary > on signed types unless we're 100% sure sign bit tests will be canonicalized. Instead of the bit_and better put a if() condition around this. Richard. > jeff
Re: [PATCH] improve folding of expressions that move a single bit around
On 11/28/2016 06:10 AM, Paolo Bonzini wrote: On 27/11/2016 00:28, Marc Glisse wrote: On Sat, 26 Nov 2016, Paolo Bonzini wrote: --- match.pd(revision 242742) +++ match.pd(working copy) @@ -2554,6 +2554,19 @@ (cmp (bit_and@2 @0 integer_pow2p@1) @1) (icmp @2 { build_zero_cst (TREE_TYPE (@0)); }))) +/* If we have (A & C) != 0 ? D : 0 where C and D are powers of 2, + convert this into a shift of (A & C). */ +(simplify + (cond + (ne (bit_and@2 @0 integer_pow2p@1) integer_zerop) + integer_pow2p@3 integer_zerop) + (with { +int shift = wi::exact_log2 (@3) - wi::exact_log2 (@1); + } + (if (shift > 0) + (lshift (convert @2) { build_int_cst (integer_type_node, shift); }) + (convert (rshift @2 { build_int_cst (integer_type_node, -shift); }) What happens if @1 is the sign bit, in a signed type? Do we get an arithmetic shift right? It shouldn't happen because the canonical form of a sign bit test is A < 0 (that's the pattern immediately after). However I can add an "if" if preferred, or change the pattern to do the AND after the shift. But are we absolutely sure it'll be in canonical form every time? Is there a pattern in in match.pd which turns an (x & SIGN_BIT) tests into canonical form? (bit_and (if (shift > 0) (lshift (convert @0) { build_int_cst (integer_type_node, shift); }) (convert (rshift @0 { build_int_cst (integer_type_node, -shift); }))) @3) What do you think? Shouldn't be necessary if we're working on unsigned types. May be necessary on signed types unless we're 100% sure sign bit tests will be canonicalized. jeff
Re: [PATCH] improve folding of expressions that move a single bit around
On 27/11/2016 00:28, Marc Glisse wrote: > On Sat, 26 Nov 2016, Paolo Bonzini wrote: > >> --- match.pd(revision 242742) >> +++ match.pd(working copy) >> @@ -2554,6 +2554,19 @@ >> (cmp (bit_and@2 @0 integer_pow2p@1) @1) >> (icmp @2 { build_zero_cst (TREE_TYPE (@0)); }))) >> >> +/* If we have (A & C) != 0 ? D : 0 where C and D are powers of 2, >> + convert this into a shift of (A & C). */ >> +(simplify >> + (cond >> + (ne (bit_and@2 @0 integer_pow2p@1) integer_zerop) >> + integer_pow2p@3 integer_zerop) >> + (with { >> +int shift = wi::exact_log2 (@3) - wi::exact_log2 (@1); >> + } >> + (if (shift > 0) >> + (lshift (convert @2) { build_int_cst (integer_type_node, shift); }) >> + (convert (rshift @2 { build_int_cst (integer_type_node, -shift); >> }) > > What happens if @1 is the sign bit, in a signed type? Do we get an > arithmetic shift right? It shouldn't happen because the canonical form of a sign bit test is A < 0 (that's the pattern immediately after). However I can add an "if" if preferred, or change the pattern to do the AND after the shift. (bit_and (if (shift > 0) (lshift (convert @0) { build_int_cst (integer_type_node, shift); }) (convert (rshift @0 { build_int_cst (integer_type_node, -shift); }))) @3) What do you think? Paolo
Re: [PATCH] improve folding of expressions that move a single bit around
On Sat, Nov 26, 2016 at 11:22:44PM +0100, Paolo Bonzini wrote: > The combine.c hunk instead is needed to simplify cases that do not use the > ternary operator (the "h" and "i" functions in the testcases) like this: > > return ((x >> 9) & 1) << 7; > > Normally this is simplified just fine to a single shift and an AND. > Here, however, the bit to preserve after (x >> 9 << 7) is the QImode > sign bit, and if_then_else_cond produces a complicated concoction > involving (ne:SI (subreg:QI ...)). simplify_if_then_else cannot then > reduce it back to the original. In fact, simplify_if_then_else does > have a similar pattern, but it cannot deal with the subreg. This is > easily done by ZERO_EXTENDing the result from QImode back to the > comparison's mode. The shift/shift/and or shift/and/shift combination > can then be reduced to shift+and just like for any other bit position. The combine part is fine, thanks for the patch. Segher > 2016-11-26 Paolo Bonzini> > * combine.c (simplify_if_then_else): Simplify IF_THEN_ELSE > that isolates a single bit, even if the condition involves > subregs. > * match.pd: Simplify X ? C : 0 where C is a power of 2 and > X tests a single bit. > > 2016-11-26 Paolo Bonzini > > * gcc.dg/fold-and-lshift.c, gcc.dg/fold-and-rshift-1.c, > gcc.dg/fold-and-rshift-2.c: New testcases.
Re: [PATCH] improve folding of expressions that move a single bit around
On Sat, 26 Nov 2016, Paolo Bonzini wrote: --- match.pd(revision 242742) +++ match.pd(working copy) @@ -2554,6 +2554,19 @@ (cmp (bit_and@2 @0 integer_pow2p@1) @1) (icmp @2 { build_zero_cst (TREE_TYPE (@0)); }))) +/* If we have (A & C) != 0 ? D : 0 where C and D are powers of 2, + convert this into a shift of (A & C). */ +(simplify + (cond + (ne (bit_and@2 @0 integer_pow2p@1) integer_zerop) + integer_pow2p@3 integer_zerop) + (with { +int shift = wi::exact_log2 (@3) - wi::exact_log2 (@1); + } + (if (shift > 0) + (lshift (convert @2) { build_int_cst (integer_type_node, shift); }) + (convert (rshift @2 { build_int_cst (integer_type_node, -shift); }) What happens if @1 is the sign bit, in a signed type? Do we get an arithmetic shift right? -- Marc Glisse
[PATCH] improve folding of expressions that move a single bit around
In code like the following from KVM: /* it is a read fault? */ error_code = (exit_qualification << 2) & PFERR_FETCH_MASK; it would be nicer to write /* it is a read fault? */ error_code = (exit_qualification & VMX_EPT_READ_FAULT_MASK) ? PFERR_FETCH_MASK : 0; instead of having to know the difference between the positions of the source and destination bits. LLVM catches the latter just fine (which is why I am sending this in stage 3...), but GCC does not, so this patch adds two patterns to catch it. The combine.c hunk instead is needed to simplify cases that do not use the ternary operator (the "h" and "i" functions in the testcases) like this: return ((x >> 9) & 1) << 7; Normally this is simplified just fine to a single shift and an AND. Here, however, the bit to preserve after (x >> 9 << 7) is the QImode sign bit, and if_then_else_cond produces a complicated concoction involving (ne:SI (subreg:QI ...)). simplify_if_then_else cannot then reduce it back to the original. In fact, simplify_if_then_else does have a similar pattern, but it cannot deal with the subreg. This is easily done by ZERO_EXTENDing the result from QImode back to the comparison's mode. The shift/shift/and or shift/and/shift combination can then be reduced to shift+and just like for any other bit position. These forms are not included in the fold-and-rshift-2.c testcase, because in this case a shift+shift (without the following AND) is a valid alternative too; and at least on x86 it has the same cost as shift+and. Compare: movl%edi, %eax sarl$24, %eax andl$128, %eax ret and movl%edi, %eax shrl$31, %eax sall$7, %eax Bootstrapped/regtested x86_64-pc-linux-gnu, ok? Paolo 2016-11-26 Paolo Bonzini* combine.c (simplify_if_then_else): Simplify IF_THEN_ELSE that isolates a single bit, even if the condition involves subregs. * match.pd: Simplify X ? C : 0 where C is a power of 2 and X tests a single bit. 2016-11-26 Paolo Bonzini * gcc.dg/fold-and-lshift.c, gcc.dg/fold-and-rshift-1.c, gcc.dg/fold-and-rshift-2.c: New testcases. Index: combine.c === --- combine.c (revision 242742) +++ combine.c (working copy) @@ -6522,14 +6522,22 @@ simplify_shift_const (NULL_RTX, ASHIFT, mode, gen_lowpart (mode, XEXP (cond, 0)), i); - /* (IF_THEN_ELSE (NE REG 0) (0) (8)) is REG for nonzero_bits (REG) == 8. */ + /* (IF_THEN_ELSE (NE A 0) C1 0) is A or a zero-extend of A if the only + non-zero bit in A is C1. */ if (true_code == NE && XEXP (cond, 1) == const0_rtx && false_rtx == const0_rtx && CONST_INT_P (true_rtx) - && GET_MODE (XEXP (cond, 0)) == mode + && INTEGRAL_MODE_P (GET_MODE (XEXP (cond, 0))) && (UINTVAL (true_rtx) & GET_MODE_MASK (mode)) - == nonzero_bits (XEXP (cond, 0), mode) + == nonzero_bits (XEXP (cond, 0), GET_MODE (XEXP (cond, 0))) && (i = exact_log2 (UINTVAL (true_rtx) & GET_MODE_MASK (mode))) >= 0) -return XEXP (cond, 0); +{ + rtx val = XEXP (cond, 0); + enum machine_mode val_mode = GET_MODE (val); + if (val_mode == mode) +return val; + else if (GET_MODE_PRECISION (val_mode) < GET_MODE_PRECISION (mode)) +return simplify_gen_unary (ZERO_EXTEND, mode, val, val_mode); +} return x; } Index: match.pd === --- match.pd(revision 242742) +++ match.pd(working copy) @@ -2554,6 +2554,19 @@ (cmp (bit_and@2 @0 integer_pow2p@1) @1) (icmp @2 { build_zero_cst (TREE_TYPE (@0)); }))) +/* If we have (A & C) != 0 ? D : 0 where C and D are powers of 2, + convert this into a shift of (A & C). */ +(simplify + (cond + (ne (bit_and@2 @0 integer_pow2p@1) integer_zerop) + integer_pow2p@3 integer_zerop) + (with { +int shift = wi::exact_log2 (@3) - wi::exact_log2 (@1); + } + (if (shift > 0) + (lshift (convert @2) { build_int_cst (integer_type_node, shift); }) + (convert (rshift @2 { build_int_cst (integer_type_node, -shift); }) + /* If we have (A & C) != 0 where C is the sign bit of A, convert this into A < 0. Similarly for (A & C) == 0 into A >= 0. */ (for cmp (eq ne) @@ -2568,6 +2581,19 @@ (with { tree stype = signed_type_for (TREE_TYPE (@0)); } (ncmp (convert:stype @0) { build_zero_cst (stype); }) +/* If we have A < 0 ? C : 0 where C and D are powers of 2, + convert this into a right shift and AND. */ +(simplify + (cond + (lt @0 integer_zerop) + integer_pow2p@1 integer_zerop) + (with { +int shift = element_precision (@0) - wi::exact_log2 (@1) - 1; + } + (bit_and + (convert (rshift @0 { build_int_cst (integer_type_node, shift); })) + @1))) + /* When the addresses are not directly of decls