[Bug tree-optimization/96921] Failure to optimize combined boolean not patterns
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96921 Andrew Pinski changed: What|Removed |Added Status|ASSIGNED|RESOLVED Target Milestone|--- |13.0 Resolution|--- |FIXED --- Comment #11 from Andrew Pinski --- Fixed.
[Bug tree-optimization/96921] Failure to optimize combined boolean not patterns
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96921 --- Comment #10 from CVS Commits --- The trunk branch has been updated by Andrew Pinski : https://gcc.gnu.org/g:a42ed1d9181d21d5cb02f131f641c0cf375eca9d commit r13-5988-ga42ed1d9181d21d5cb02f131f641c0cf375eca9d Author: Andrew Pinski Date: Tue Jan 31 05:03:21 2023 + Simplify "1 - bool_val" to "bool_val ^ 1" For bool values, it is easier to deal with xor 1 rather than having 1 - a. This is because we are more likely to simplify the xor further in many cases. This is a special case for (MASK - b) where MASK is a powerof2 - 1 and b <= MASK but only for bool ranges ([0,1]) as that is the main case where the difference comes into play. Note this is enabled for gimple folding only as the ranges are only know while doing gimple folding and cfun is not always set when fold is called. OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions. gcc/ChangeLog: PR tree-optimization/108355 PR tree-optimization/96921 * match.pd: Add pattern for "1 - bool_val". gcc/testsuite/ChangeLog: PR tree-optimization/108355 PR tree-optimization/96921 * gcc.dg/tree-ssa/bool-minus-1.c: New test. * gcc.dg/tree-ssa/bool-minus-2.c: New test. * gcc.dg/tree-ssa/pr108354-1.c: New test.
[Bug tree-optimization/96921] Failure to optimize combined boolean not patterns
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96921 --- Comment #9 from Andrew Pinski --- (In reply to Andrew Pinski from comment #3) > So something like: > (simplify > (minus integer_one@0 SSA_NAME@1) > (if (TREE_CODE (@0) == SSA_NAME > && ssa_name_has_boolean_range (@0)) > (bit_xor @1 @0))) I ended up with this only in the end (I actually came up with this the second time too) as it fixes a regression The expanded one should wait and might be worse.
[Bug tree-optimization/96921] Failure to optimize combined boolean not patterns
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96921 --- Comment #8 from Andrew Pinski --- (In reply to Andrew Pinski from comment #7) > (In reply to Andrew Pinski from comment #4) > > Hmm, thinking about expanding this further: > > I am going to handle the non-special (bool) case as PR 101610. Actually PR 91213.
[Bug tree-optimization/96921] Failure to optimize combined boolean not patterns
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96921 --- Comment #7 from Andrew Pinski --- (In reply to Andrew Pinski from comment #4) > Hmm, thinking about expanding this further: I am going to handle the non-special (bool) case as PR 101610.
[Bug tree-optimization/96921] Failure to optimize combined boolean not patterns
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96921 --- Comment #6 from Andrew Pinski --- The final conversion should happen at the RTL level (or during expansion). simplify-rtx.c has this: /* If STORE_FLAG_VALUE is 1, (minus 1 (comparison foo bar)) can be done by reversing the comparison code if valid. */ I will implement the rtl level in simplify-rtx.c first and then add the others.
[Bug tree-optimization/96921] Failure to optimize combined boolean not patterns
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96921 --- Comment #5 from Andrew Pinski --- (In reply to Andrew Pinski from comment #4) > Hmm, thinking about expanding this further: Even further. int f1(int n) { if (n&~8) __builtin_unreachable(); return 63 - n; } CUT So the nonzero bits just need to be make sure they no non-overlapping bits.
[Bug tree-optimization/96921] Failure to optimize combined boolean not patterns
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96921 --- Comment #4 from Andrew Pinski --- Hmm, thinking about expanding this further: int f1(int n) { if (n&~63) __builtin_unreachable(); return 63 - n; } int f2(int n) { if (n&~63) __builtin_unreachable(); return 63 ^ n; } These two should be able to get the same code which happens on clang already. Something like (but with the expansion of the +!, etc.) (simplify (minus INTEGER_CST@0 SSA_NAME@1) (if (exact_power2(@0 + 1) && get_nonzero_bits(@1) == @0 (bit_xor @1 @0)))
[Bug tree-optimization/96921] Failure to optimize combined boolean not patterns
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96921 Andrew Pinski changed: What|Removed |Added Status|NEW |ASSIGNED Assignee|unassigned at gcc dot gnu.org |pinskia at gcc dot gnu.org --- Comment #3 from Andrew Pinski --- I was thinking if we see: _2 = 1 - _1; and _1 has a range of [0,1] aka boolean turn it into _1 ^ 1 There is already a pattern which turns ((int)a)^1 into (int)(~a). You can see the other patterns in action if you write the code as: int foo1 (_Bool a, _Bool b) { int c = a; c ^= 1; int d = b; d = d ^ 1; int e = c & d; return e ^ 1; } So something like: (simplify (minus integer_one@0 SSA_NAME@1) (if (TREE_CODE (@0) == SSA_NAME && ssa_name_has_boolean_range (@0)) (bit_xor @1 @0))) That is for boolean types/ranges, we always use a ^ 1 (or ~a) instead of 1 - a Take: int fooneg (_Bool a) { return 1 - a; } int fooxor (_Bool a) { return a^1; }
[Bug tree-optimization/96921] Failure to optimize combined boolean not patterns
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96921 Andrew Pinski changed: What|Removed |Added Severity|normal |enhancement Ever confirmed|0 |1 Last reconfirmed||2021-07-28 Status|UNCONFIRMED |NEW CC||pinskia at gcc dot gnu.org --- Comment #2 from Andrew Pinski --- Confirmed.
[Bug tree-optimization/96921] Failure to optimize combined boolean not patterns
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96921 Jakub Jelinek changed: What|Removed |Added CC||jakub at gcc dot gnu.org --- Comment #1 from Jakub Jelinek --- On: _Bool foo (_Bool a, _Bool b) { int c = 1 - a; int d = 1 - b; int e = c & d; return 1 - e; } _Bool bar (_Bool a, _Bool b) { int c = 1 - a; int d = 1 - b; _Bool e = c & d; return 1 - e; } _Bool baz (_Bool a, _Bool b) { _Bool c = 1 - a; _Bool d = 1 - b; _Bool e = c & d; return 1 - e; } we are able to optimize just baz. Rather than duplicating the bit_not vs. bit_and/bit_ior etc. simplifications, I wonder if it wouldn't be better to perform type demotion here, see that 1 - x is used in a boolean context and that x is ssa_name_has_boolean_range and turn that into bit_not on _Bool. Similarly for the bit_and/ior/xor whose result is used in a boolean context and where both operands are ssa_name_has_boolean_range. Thoughts on that?