[Bug rtl-optimization/115092] [14/15 Regression] wrong code at -O1 with "-fgcse -ftree-pre -fno-tree-dominator-opts -fno-tree-fre -fno-guess-branch-probability" on x86_64-linux-gnu since r14-4810
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115092

Bug 115092 depends on bug 114902, which changed state.

Bug 114902 Summary: [14 Regression] wrong code at -O3 with "-fno-tree-vrp -fno-expensive-optimizations -fno-tree-dominator-opts" on x86_64-linux-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114902

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED
[Bug rtl-optimization/115092] [14/15 Regression] wrong code at -O1 with "-fgcse -ftree-pre -fno-tree-dominator-opts -fno-tree-fre -fno-guess-branch-probability" on x86_64-linux-gnu since r14-4810
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115092

--- Comment #13 from GCC Commits ---
The releases/gcc-14 branch has been updated by Jakub Jelinek:

https://gcc.gnu.org/g:14a7296d04474055bfe1d7f130dceac6dabf390d

commit r14-10276-g14a7296d04474055bfe1d7f130dceac6dabf390d
Author: Jakub Jelinek
Date:   Wed May 15 18:37:17 2024 +0200

    combine: Fix up simplify_compare_const [PR115092]

    The following testcases are miscompiled (with tons of GIMPLE
    optimization disabled) because combine sees GE comparison of
    1-bit sign_extract (i.e. something with [-1, 0] value range)
    with (const_int -1) (which is always true) and optimizes it into
    NE comparison of 1-bit zero_extract ([0, 1] value range) against
    (const_int 0).
    The reason is that simplify_compare_const first (correctly)
    simplifies the comparison to
    GE (ashift:SI something (const_int 31)) (const_int -2147483648)
    and then an optimization for when the second operand is power of 2
    triggers.  That optimization is fine for power of 2s which aren't
    the signed minimum of the mode, or if it is NE, EQ, GEU or LTU
    against the signed minimum of the mode, but for GE or LT optimizing
    it into NE (or EQ) against const0_rtx is wrong, those cases
    are always true or always false (but the function doesn't have
    a standardized way to tell callers the comparison is now
    unconditional).

    The following patch just disables the optimization in that case.

    2024-05-15  Jakub Jelinek

            PR rtl-optimization/114902
            PR rtl-optimization/115092
            * combine.cc (simplify_compare_const): Don't optimize
            GE op0 SIGNED_MIN or LT op0 SIGNED_MIN into NE op0 const0_rtx or
            EQ op0 const0_rtx.

            * gcc.dg/pr114902.c: New test.
            * gcc.dg/pr115092.c: New test.

    (cherry picked from commit 0b93a0ae153ef70a82ff63e67926a01fdab9956b)
[Bug rtl-optimization/115092] [14/15 Regression] wrong code at -O1 with "-fgcse -ftree-pre -fno-tree-dominator-opts -fno-tree-fre -fno-guess-branch-probability" on x86_64-linux-gnu since r14-4810
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115092

--- Comment #12 from GCC Commits ---
The master branch has been updated by Jakub Jelinek:

https://gcc.gnu.org/g:0b93a0ae153ef70a82ff63e67926a01fdab9956b

commit r15-520-g0b93a0ae153ef70a82ff63e67926a01fdab9956b
Author: Jakub Jelinek
Date:   Wed May 15 18:37:17 2024 +0200

    combine: Fix up simplify_compare_const [PR115092]

    The following testcases are miscompiled (with tons of GIMPLE
    optimization disabled) because combine sees GE comparison of
    1-bit sign_extract (i.e. something with [-1, 0] value range)
    with (const_int -1) (which is always true) and optimizes it into
    NE comparison of 1-bit zero_extract ([0, 1] value range) against
    (const_int 0).
    The reason is that simplify_compare_const first (correctly)
    simplifies the comparison to
    GE (ashift:SI something (const_int 31)) (const_int -2147483648)
    and then an optimization for when the second operand is power of 2
    triggers.  That optimization is fine for power of 2s which aren't
    the signed minimum of the mode, or if it is NE, EQ, GEU or LTU
    against the signed minimum of the mode, but for GE or LT optimizing
    it into NE (or EQ) against const0_rtx is wrong, those cases
    are always true or always false (but the function doesn't have
    a standardized way to tell callers the comparison is now
    unconditional).

    The following patch just disables the optimization in that case.

    2024-05-15  Jakub Jelinek

            PR rtl-optimization/114902
            PR rtl-optimization/115092
            * combine.cc (simplify_compare_const): Don't optimize
            GE op0 SIGNED_MIN or LT op0 SIGNED_MIN into NE op0 const0_rtx or
            EQ op0 const0_rtx.

            * gcc.dg/pr114902.c: New test.
            * gcc.dg/pr115092.c: New test.
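
The claim in the commit message can be checked by brute force on a small model. The following is only a sketch under stated assumptions: it uses an 8-bit "mode" instead of SImode, and the helper names are hypothetical and do not exist in GCC. It asks whether "op0 >= C" agrees with "op0 != 0" when op0 can only be 0 or the single-bit constant C.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper, not GCC code.  op0 is an 8-bit value in which only
   bit BIT may be nonzero, so it is either 0 or the constant C that has
   just that bit set.  Returns true if signed "op0 >= C" agrees with
   "op0 != 0" for both possible values.  */
static bool
signed_ge_matches_ne_zero (unsigned bit)
{
  assert (bit < 8);
  int pattern = 1 << bit;
  /* Interpret the bit pattern as a signed 8-bit constant by hand.  */
  int c = pattern >= 0x80 ? pattern - 256 : pattern;
  int candidates[2] = { 0, c };

  for (int i = 0; i < 2; i++)
    {
      int op0 = candidates[i];
      if ((op0 >= c) != (op0 != 0))
        return false;
    }
  return true;
}

/* Same question for the unsigned comparison (GEU in RTL terms).  */
static bool
unsigned_geu_matches_ne_zero (unsigned bit)
{
  assert (bit < 8);
  unsigned c = 1u << bit;
  unsigned candidates[2] = { 0u, c };

  for (int i = 0; i < 2; i++)
    {
      unsigned op0 = candidates[i];
      if ((op0 >= c) != (op0 != 0))
        return false;
    }
  return true;
}
```

In this model the signed rewrite holds for bits 0 through 6 and fails only for bit 7, the signed minimum, where "op0 >= -128" is true even for op0 == 0. The unsigned variant holds for every bit, which matches the commit message's list of safe codes.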
[Bug rtl-optimization/115092] [14/15 Regression] wrong code at -O1 with "-fgcse -ftree-pre -fno-tree-dominator-opts -fno-tree-fre -fno-guess-branch-probability" on x86_64-linux-gnu since r14-4810
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115092 --- Comment #11 from Segher Boessenkool --- Still okay :-)
[Bug rtl-optimization/115092] [14/15 Regression] wrong code at -O1 with "-fgcse -ftree-pre -fno-tree-dominator-opts -fno-tree-fre -fno-guess-branch-probability" on x86_64-linux-gnu since r14-4810
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115092

--- Comment #10 from Jakub Jelinek ---
Created attachment 58213
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=58213&action=edit
gcc15-pr115092.patch

Full patch I'm going to test.
[Bug rtl-optimization/115092] [14/15 Regression] wrong code at -O1 with "-fgcse -ftree-pre -fno-tree-dominator-opts -fno-tree-fre -fno-guess-branch-probability" on x86_64-linux-gnu since r14-4810
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115092

--- Comment #9 from Segher Boessenkool ---
(In reply to Jakub Jelinek from comment #8)
> > Yeah, that look like it is missing some test.
>
> I'd go with
> --- gcc/combine.cc.jj   2024-05-07 18:10:10.415874636 +0200
> +++ gcc/combine.cc      2024-05-15 13:33:26.555081215 +0200
> @@ -11852,8 +11852,10 @@ simplify_compare_const (enum rtx_code co
>       `and'ed with that bit), we can replace this with a comparison
>       with zero.  */
>    if (const_op
> -      && (code == EQ || code == NE || code == GE || code == GEU
> -         || code == LT || code == LTU)
> +      && (code == EQ || code == NE || code == GEU || code == LTU
> +         /* This optimization is incorrect for signed >= INT_MIN or
> +            < INT_MIN, those are always true or always false.  */
> +         || ((code == GE || code == LT) && const_op > 0))
>        && is_a <scalar_int_mode> (mode, &int_mode)
>        && GET_MODE_PRECISION (int_mode) - 1 < HOST_BITS_PER_WIDE_INT
>        && pow2p_hwi (const_op & GET_MODE_MASK (int_mode))

Pre-approved.  Thanks!

> Seems there is no canonical way to return this is always true or this is
> always false,
> sure, we could make up something like NE 1 0 or EQ 1 0 or similar, but it
> wouldn't likely match and the question is if it would simplify.

Later code will likely pick this up.  More likely than with the GE anyway :-)

> The const_op == -1 handling below this looks correct to me.

Yup.

> > That needs to be fixed of course, but independent of that, this should
> > really
> > have been completely folded away earlier already?
>
> It would if one wouldn't carefully disable tons of optimizations (say -O1,
> so no (significant) VRP, dom* disabled, fre disabled).

Ha :-)
[Bug rtl-optimization/115092] [14/15 Regression] wrong code at -O1 with "-fgcse -ftree-pre -fno-tree-dominator-opts -fno-tree-fre -fno-guess-branch-probability" on x86_64-linux-gnu since r14-4810
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115092

--- Comment #8 from Jakub Jelinek ---
(In reply to Segher Boessenkool from comment #7)
> (In reply to Jakub Jelinek from comment #5)
> > I think the bug is in simplify_comparison.
> > We have there
> > GE (sign_extract:SI (reg/v:SI 101 [ g ]) (const_int 1 [0x1]) (const_int 0
> > [0])) (const_int -1 [0xffffffffffffffff])
> > That is first changed into
> > GE (ashiftrt:SI (ashift:SI (reg/v:SI 101 [ g ]) (const_int 31 [0x1f]))
> > (const_int 31 [0x1f])) (const_int -1 [0xffffffffffffffff])
> > Both are always true.
> > But then the
> >           /* FALLTHROUGH */
> >         case LSHIFTRT:
> >           /* If we have (compare (xshiftrt FOO N) (const_int C)) and
> >              the low order N bits of FOO are known to be zero, we can do this
> >              by comparing FOO with C shifted left N bits so long as no
> >              overflow occurs.  Even if the low order N bits of FOO aren't known
> >              to be zero, if the comparison is >= or < we can use the same
> >              optimization and for > or <= by setting all the low
> >              order N bits in the comparison constant.  */
> > optimization triggers and optimizes it into
> > GE (ashift:SI (reg/v:SI 101 [ g ]) (const_int 31 [0x1f])) (const_int
> > -2147483648 [0xffffffff80000000])
> > I think that is ok too.
> > But then
> >       code = simplify_compare_const (code, raw_mode, &op0, &op1);
> > simplifies that to NE and I think that step is wrong, because GE of anything
> > >= INT_MIN
> > is true.
> >
> > So, I think
> >   /* If we are comparing against a constant power of two and the value
> >      being compared can only have that single bit nonzero (e.g., it was
> >      `and'ed with that bit), we can replace this with a comparison
> >      with zero.  */
> >   if (const_op
> >       && (code == EQ || code == NE || code == GE || code == GEU
> >           || code == LT || code == LTU)
> >       && is_a <scalar_int_mode> (mode, &int_mode)
> >       && GET_MODE_PRECISION (int_mode) - 1 < HOST_BITS_PER_WIDE_INT
> >       && pow2p_hwi (const_op & GET_MODE_MASK (int_mode))
> >       && (nonzero_bits (op0, int_mode)
> >           == (unsigned HOST_WIDE_INT) (const_op & GET_MODE_MASK (int_mode))))
> >     {
> >       code = (code == EQ || code == GE || code == GEU ? NE : EQ);
> >       const_op = 0;
> >     }
> > in simplify_compare_const is wrong if const_op is the most significant bit
> > of int_mode.
>
> Yeah, that look like it is missing some test.

I'd go with
--- gcc/combine.cc.jj   2024-05-07 18:10:10.415874636 +0200
+++ gcc/combine.cc      2024-05-15 13:33:26.555081215 +0200
@@ -11852,8 +11852,10 @@ simplify_compare_const (enum rtx_code co
      `and'ed with that bit), we can replace this with a comparison
      with zero.  */
   if (const_op
-      && (code == EQ || code == NE || code == GE || code == GEU
-         || code == LT || code == LTU)
+      && (code == EQ || code == NE || code == GEU || code == LTU
+         /* This optimization is incorrect for signed >= INT_MIN or
+            < INT_MIN, those are always true or always false.  */
+         || ((code == GE || code == LT) && const_op > 0))
       && is_a <scalar_int_mode> (mode, &int_mode)
       && GET_MODE_PRECISION (int_mode) - 1 < HOST_BITS_PER_WIDE_INT
       && pow2p_hwi (const_op & GET_MODE_MASK (int_mode))

Seems there is no canonical way to return this is always true or this is
always false,
sure, we could make up something like NE 1 0 or EQ 1 0 or similar, but it
wouldn't likely match and the question is if it would simplify.

The const_op == -1 handling below this looks correct to me.

> That needs to be fixed of course, but independent of that, this should really
> have been completely folded away earlier already?

It would if one wouldn't carefully disable tons of optimizations (say -O1,
so no (significant) VRP, dom* disabled, fre disabled).
Furthermore, at least in the optimized dump it is obfuscated through:
  _22 = _21 & 1;
  # RANGE [irange] int [0, 1] MASK 0x1 VALUE 0x0
  _24 = 1 >> _22;
  _26 = -_24;

  :
  # prephitmp_27 = PHI <_26(3), -1(2)>
  if (prephitmp_27 < -1)
Sure, VRP could see that _26 has [-1, 0] range, unioned with [-1, -1] and that
is never < -1.
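
To make the effect of the proposed guard concrete, here is a small stand-alone model of just its code/const_op part. It is a sketch, not GCC code: the enum and the function name are hypothetical stand-ins, the other conditions (mode, pow2p_hwi, nonzero_bits) are assumed to hold already, and const_op is assumed to be a sign-extended 64-bit value, so that the SImode sign bit shows up as a negative number.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for the few comparison codes involved; GCC's real
   enum rtx_code is much larger.  */
enum cmp_code { CMP_EQ, CMP_NE, CMP_GE, CMP_GEU, CMP_LT, CMP_LTU, CMP_GT };

/* Models only the code/const_op part of the patched condition.
   Assumption: const_op is sign-extended, so the signed minimum of
   the mode is negative here.  */
static bool
pow2_rewrite_allowed (enum cmp_code code, int64_t const_op)
{
  if (const_op == 0)
    return false;
  return (code == CMP_EQ || code == CMP_NE
          || code == CMP_GEU || code == CMP_LTU
          /* Signed >= or < is only rewritten for positive constants.  */
          || ((code == CMP_GE || code == CMP_LT) && const_op > 0));
}

/* A few spot checks of the model.  */
static void
self_check (void)
{
  assert (!pow2_rewrite_allowed (CMP_GE, INT32_MIN));
  assert (!pow2_rewrite_allowed (CMP_LT, INT32_MIN));
  assert (pow2_rewrite_allowed (CMP_GE, 64));
  assert (pow2_rewrite_allowed (CMP_GEU, INT32_MIN));
  assert (pow2_rewrite_allowed (CMP_NE, INT32_MIN));
}
```

Under these assumptions, the only change relative to the old condition is that GE and LT against the signed minimum no longer qualify. Every other combination that was accepted before is still accepted.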
[Bug rtl-optimization/115092] [14/15 Regression] wrong code at -O1 with "-fgcse -ftree-pre -fno-tree-dominator-opts -fno-tree-fre -fno-guess-branch-probability" on x86_64-linux-gnu since r14-4810
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115092

--- Comment #7 from Segher Boessenkool ---
(In reply to Jakub Jelinek from comment #5)
> I think the bug is in simplify_comparison.
> We have there
> GE (sign_extract:SI (reg/v:SI 101 [ g ]) (const_int 1 [0x1]) (const_int 0
> [0])) (const_int -1 [0xffffffffffffffff])
> That is first changed into
> GE (ashiftrt:SI (ashift:SI (reg/v:SI 101 [ g ]) (const_int 31 [0x1f]))
> (const_int 31 [0x1f])) (const_int -1 [0xffffffffffffffff])
> Both are always true.
> But then the
>           /* FALLTHROUGH */
>         case LSHIFTRT:
>           /* If we have (compare (xshiftrt FOO N) (const_int C)) and
>              the low order N bits of FOO are known to be zero, we can do this
>              by comparing FOO with C shifted left N bits so long as no
>              overflow occurs.  Even if the low order N bits of FOO aren't known
>              to be zero, if the comparison is >= or < we can use the same
>              optimization and for > or <= by setting all the low
>              order N bits in the comparison constant.  */
> optimization triggers and optimizes it into
> GE (ashift:SI (reg/v:SI 101 [ g ]) (const_int 31 [0x1f])) (const_int
> -2147483648 [0xffffffff80000000])
> I think that is ok too.
> But then
>       code = simplify_compare_const (code, raw_mode, &op0, &op1);
> simplifies that to NE and I think that step is wrong, because GE of anything
> >= INT_MIN
> is true.
>
> So, I think
>   /* If we are comparing against a constant power of two and the value
>      being compared can only have that single bit nonzero (e.g., it was
>      `and'ed with that bit), we can replace this with a comparison
>      with zero.  */
>   if (const_op
>       && (code == EQ || code == NE || code == GE || code == GEU
>           || code == LT || code == LTU)
>       && is_a <scalar_int_mode> (mode, &int_mode)
>       && GET_MODE_PRECISION (int_mode) - 1 < HOST_BITS_PER_WIDE_INT
>       && pow2p_hwi (const_op & GET_MODE_MASK (int_mode))
>       && (nonzero_bits (op0, int_mode)
>           == (unsigned HOST_WIDE_INT) (const_op & GET_MODE_MASK (int_mode))))
>     {
>       code = (code == EQ || code == GE || code == GEU ? NE : EQ);
>       const_op = 0;
>     }
> in simplify_compare_const is wrong if const_op is the most significant bit
> of int_mode.

Yeah, that look like it is missing some test.

That needs to be fixed of course, but independent of that, this should really
have been completely folded away earlier already?
[Bug rtl-optimization/115092] [14/15 Regression] wrong code at -O1 with "-fgcse -ftree-pre -fno-tree-dominator-opts -fno-tree-fre -fno-guess-branch-probability" on x86_64-linux-gnu since r14-4810
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115092

--- Comment #6 from Segher Boessenkool ---
(In reply to Jakub Jelinek from comment #4)
> Indeed, combine_simplify_rtx on
> (set (reg:CCGC 17 flags)
>     (compare:CCGC (sign_extract:SI (reg/v:SI 101 [ g ])
>             (const_int 1 [0x1])
>             (const_int 0 [0]))
>         (const_int -1 [0xffffffffffffffff])))
> with VOIDmode, false, false remaining arguments is optimizing it to
> (set (reg:CCZ 17 flags)
>     (compare:CCZ (zero_extract:SI (reg/v:SI 101 [ g ])
>             (const_int 1 [0x1])
>             (const_int 0 [0]))
>         (const_int 0 [0])))
> which is ok if it would be used solely in equality/non-equality comparisons,
> but is not ok when it is used in other comparisons.  1-bit sign_extract has
> range [-1,0] and
> [-1,0] < -1 is always false.

It is some target code that decided what to do with the CCGC thing.  It
decided to use CCZ instead, which of course is wrong if other conditions are
used (and should ICE if you try to use it for non-equality actually).
[Bug rtl-optimization/115092] [14/15 Regression] wrong code at -O1 with "-fgcse -ftree-pre -fno-tree-dominator-opts -fno-tree-fre -fno-guess-branch-probability" on x86_64-linux-gnu since r14-4810
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115092

--- Comment #5 from Jakub Jelinek ---
I think the bug is in simplify_comparison.
We have there
GE (sign_extract:SI (reg/v:SI 101 [ g ]) (const_int 1 [0x1]) (const_int 0 [0])) (const_int -1 [0xffffffffffffffff])
That is first changed into
GE (ashiftrt:SI (ashift:SI (reg/v:SI 101 [ g ]) (const_int 31 [0x1f])) (const_int 31 [0x1f])) (const_int -1 [0xffffffffffffffff])
Both are always true.
But then the
          /* FALLTHROUGH */
        case LSHIFTRT:
          /* If we have (compare (xshiftrt FOO N) (const_int C)) and
             the low order N bits of FOO are known to be zero, we can do this
             by comparing FOO with C shifted left N bits so long as no
             overflow occurs.  Even if the low order N bits of FOO aren't known
             to be zero, if the comparison is >= or < we can use the same
             optimization and for > or <= by setting all the low
             order N bits in the comparison constant.  */
optimization triggers and optimizes it into
GE (ashift:SI (reg/v:SI 101 [ g ]) (const_int 31 [0x1f])) (const_int -2147483648 [0xffffffff80000000])
I think that is ok too.
But then
      code = simplify_compare_const (code, raw_mode, &op0, &op1);
simplifies that to NE and I think that step is wrong, because GE of anything >= INT_MIN
is true.

So, I think
  /* If we are comparing against a constant power of two and the value
     being compared can only have that single bit nonzero (e.g., it was
     `and'ed with that bit), we can replace this with a comparison
     with zero.  */
  if (const_op
      && (code == EQ || code == NE || code == GE || code == GEU
          || code == LT || code == LTU)
      && is_a <scalar_int_mode> (mode, &int_mode)
      && GET_MODE_PRECISION (int_mode) - 1 < HOST_BITS_PER_WIDE_INT
      && pow2p_hwi (const_op & GET_MODE_MASK (int_mode))
      && (nonzero_bits (op0, int_mode)
          == (unsigned HOST_WIDE_INT) (const_op & GET_MODE_MASK (int_mode))))
    {
      code = (code == EQ || code == GE || code == GEU ? NE : EQ);
      const_op = 0;
    }
in simplify_compare_const is wrong if const_op is the most significant bit of
int_mode.
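
The chain of rewrites described in this comment can be replayed on plain integers. This is a sketch only: the four form_* functions are hypothetical names for the four RTL shapes quoted above, g stands for the value of pseudo 101, and the shifts are done on unsigned values with a hand-written conversion so that the example does not depend on implementation-defined behaviour.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Reinterpret 32 bits as a two's complement value without relying on
   implementation-defined conversions.  */
static int32_t
as_signed (uint32_t u)
{
  return u <= INT32_MAX ? (int32_t) u : -(int32_t) (UINT32_MAX - u) - 1;
}

/* GE (sign_extract g 1 0) -1: the 1-bit signed field is -1 or 0.  */
static bool
form_sign_extract (uint32_t g)
{
  int32_t field = (g & 1u) ? -1 : 0;
  return field >= -1;
}

/* GE (ashiftrt (ashift g 31) 31) -1.  */
static bool
form_shift_pair (uint32_t g)
{
  int32_t shifted = as_signed (g << 31);
  int32_t back = shifted < 0 ? -1 : 0;  /* arithmetic shift right by 31 */
  return back >= -1;
}

/* GE (ashift g 31) INT_MIN.  */
static bool
form_shift_only (uint32_t g)
{
  return as_signed (g << 31) >= INT32_MIN;
}

/* NE (ashift g 31) 0: the result of the questionable last step.  */
static bool
form_rewritten (uint32_t g)
{
  return as_signed (g << 31) != 0;
}

static void
self_check (void)
{
  for (uint32_t g = 0; g < 4; g++)
    {
      assert (form_sign_extract (g));
      assert (form_shift_pair (g));
      assert (form_shift_only (g));
      /* The last form depends on the low bit of g instead.  */
      assert (form_rewritten (g) == ((g & 1u) != 0));
    }
}
```

The first three forms are true for every g, which is consistent with "Both are always true" and "I think that is ok too". The last form is false for even g, so it is not equivalent to the comparison it replaced.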
[Bug rtl-optimization/115092] [14/15 Regression] wrong code at -O1 with "-fgcse -ftree-pre -fno-tree-dominator-opts -fno-tree-fre -fno-guess-branch-probability" on x86_64-linux-gnu since r14-4810
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115092

Jakub Jelinek changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |segher at gcc dot gnu.org

--- Comment #4 from Jakub Jelinek ---
Indeed, combine_simplify_rtx on
(set (reg:CCGC 17 flags)
    (compare:CCGC (sign_extract:SI (reg/v:SI 101 [ g ])
            (const_int 1 [0x1])
            (const_int 0 [0]))
        (const_int -1 [0xffffffffffffffff])))
with VOIDmode, false, false remaining arguments is optimizing it to
(set (reg:CCZ 17 flags)
    (compare:CCZ (zero_extract:SI (reg/v:SI 101 [ g ])
            (const_int 1 [0x1])
            (const_int 0 [0]))
        (const_int 0 [0])))
which is ok if it would be used solely in equality/non-equality comparisons,
but is not ok when it is used in other comparisons.  1-bit sign_extract has
range [-1,0] and
[-1,0] < -1 is always false.
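
One way to read the distinction drawn here, sketched on plain integers: an equality test of the 1-bit signed field against -1 carries the same information as a test of the 1-bit unsigned field against 0 (with the condition flipped accordingly), while an ordering test such as "< -1" has no such counterpart. The helper names below are hypothetical and the flipped-condition mapping is this sketch's assumption, not something spelled out in the comment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* 1-bit signed field at bit 0: value range [-1, 0].  */
static int32_t
sign_extract1 (uint32_t g)
{
  return (g & 1u) ? -1 : 0;
}

/* 1-bit unsigned field at bit 0: value range [0, 1].  */
static uint32_t
zero_extract1 (uint32_t g)
{
  return g & 1u;
}

/* Equality against -1 on the signed field corresponds to inequality
   against 0 on the unsigned field, and vice versa.  */
static bool
eq_mapping_holds (uint32_t g)
{
  return ((sign_extract1 (g) == -1) == (zero_extract1 (g) != 0)
          && (sign_extract1 (g) != -1) == (zero_extract1 (g) == 0));
}

/* The ordering comparison from the comment: never true on [-1, 0].  */
static bool
lt_minus_one (uint32_t g)
{
  return sign_extract1 (g) < -1;
}

static void
self_check (void)
{
  for (uint32_t g = 0; g < 8; g++)
    {
      assert (eq_mapping_holds (g));
      assert (!lt_minus_one (g));
    }
}
```

Since lt_minus_one is false for every input, any replacement whose result depends on bit 0 of g cannot be equivalent to it.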
[Bug rtl-optimization/115092] [14/15 Regression] wrong code at -O1 with "-fgcse -ftree-pre -fno-tree-dominator-opts -fno-tree-fre -fno-guess-branch-probability" on x86_64-linux-gnu since r14-4810
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115092

Andrew Pinski changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |DUPLICATE
             Status|NEW                         |RESOLVED

--- Comment #3 from Andrew Pinski ---
Trying 15 -> 20:
   15: {r105:SI=sign_extract(r101:SI,0x1,0);clobber flags:CC;}
      REG_UNUSED flags:CC
   20: flags:CCGC=cmp(r105:SI,0xffffffffffffffff)
      REG_DEAD r105:SI
Successfully matched this instruction:
(set (reg:CCZ 17 flags)
    (compare:CCZ (zero_extract:SI (reg/v:SI 101 [ gD.2776 ])
            (const_int 1 [0x1])
            (const_int 0 [0]))
        (const_int 0 [0])))
Successfully matched this instruction:
(set (pc)
    (if_then_else (ne (reg:CCZ 17 flags)
            (const_int 0 [0]))
        (label_ref 25)
        (pc)))
allowing combination of insns 15 and 20
original costs 4 + 4 = 20
replacement cost 16
deferring deletion of insn with uid = 15.
modifying other_insn    21: pc={(flags:CCZ!=0)?L25:pc}
      REG_DEAD flags:CCGC
deferring rescan insn with uid = 21.
modifying insn i3    20: flags:CCZ=cmp(zero_extract(r101:SI,0x1,0),0)
deferring rescan insn with uid = 20.

We go from `-(r101&1) < -1` into `(r101 & 1) != 0` which is totally wrong.
So yes, it is a dup.

*** This bug has been marked as a duplicate of bug 114902 ***
[Bug rtl-optimization/115092] [14/15 Regression] wrong code at -O1 with "-fgcse -ftree-pre -fno-tree-dominator-opts -fno-tree-fre -fno-guess-branch-probability" on x86_64-linux-gnu since r14-4810
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115092

Andrew Pinski changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Depends on|                            |114902

--- Comment #2 from Andrew Pinski ---
(In reply to Jakub Jelinek from comment #1)
> Started with r14-4810-ge28869670c9879fe7c67caf6cc11af202509ef78

Then I am 99% sure this is a dup of bug 114902.


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114902
[Bug 114902] [14/15 Regression] wrong code at -O3 with "-fno-tree-vrp -fno-expensive-optimizations -fno-tree-dominator-opts" on x86_64-linux-gnu