[Bug rtl-optimization/106590] [12/13 Regression] x86-64 miscompilation starting with r12-8233-g1ceddd7497e15d w/ mtune=skylake
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106590 Jakub Jelinek changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED|RESOLVED --- Comment #12 from Jakub Jelinek --- Fixed for 12.3+ too.
[Bug rtl-optimization/106590] [12/13 Regression] x86-64 miscompilation starting with r12-8233-g1ceddd7497e15d w/ mtune=skylake
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106590 --- Comment #11 from CVS Commits --- The releases/gcc-12 branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:030063c43f30a2335d3c03182df0beb82d003816 commit r12-8720-g030063c43f30a2335d3c03182df0beb82d003816 Author: Jakub Jelinek Date: Mon Aug 15 13:56:57 2022 +0200 ifcvt: Fix up noce_convert_multiple_sets [PR106590] The following testcase is miscompiled on x86_64-linux. The problem is in the noce_convert_multiple_sets optimization. We essentially have: if (g == 1) { g = 1; f = 23; } else { g = 2; f = 20; } and for each insn try to create a conditional move sequence. There is code to detect overlap with the regs used in the condition and the destinations, so we actually try to construct: tmp_g = g == 1 ? 1 : 2; f = g == 1 ? 23 : 20; g = tmp_g; which is fine. But, we actually try to create two different conditional move sequences in each case, seq1 with the whole (eq (reg/v:HI 82 [ g ]) (const_int 1 [0x1])) condition and seq2 with cc_cmp (eq (reg:CCZ 17 flags) (const_int 0 [0])) to rely on the earlier present comparison. In each case, we compare the rtx costs and choose the cheaper sequence (seq1 if both have the same cost). The problem is that with the skylake tuning, tmp_g = g == 1 ? 1 : 2; is actually expanded as tmp_g = (g == 1) + 1; in seq1 (which clobbers (reg 17 flags)) and as a cmov in seq2 (which doesn't). The tuning says both have the same cost, so we pick seq1. Next we check sequences for f = g == 1 ? 23 : 20; and here the seq2 cmov is cheaper, but it uses (reg 17 flags) which has been clobbered earlier. The following patch fixes that by detecting if we in the chosen sequence clobber some register mentioned in cc_cmp or rev_cc_cmp, and if yes, arranges for only seq1 (i.e. sequences that emit the comparison itself) to be used after that. 2022-08-15 Jakub Jelinek PR rtl-optimization/106590 * ifcvt.cc (check_for_cc_cmp_clobbers): New function. (noce_convert_multiple_sets_1): If SEQ sets or clobbers any regs mentioned in cc_cmp or rev_cc_cmp, don't consider seq2 for any further conditional moves. * gcc.dg/torture/pr106590.c: New test. (cherry picked from commit 3a74a7bf62f47ed0d19866576378724be932ee17)
[Bug rtl-optimization/106590] [12/13 Regression] x86-64 miscompilation starting with r12-8233-g1ceddd7497e15d w/ mtune=skylake
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106590 Richard Biener changed: What|Removed |Added Target Milestone|12.2|12.3 --- Comment #10 from Richard Biener --- GCC 12.2 is being released, retargeting bugs to GCC 12.3.
[Bug rtl-optimization/106590] [12/13 Regression] x86-64 miscompilation starting with r12-8233-g1ceddd7497e15d w/ mtune=skylake
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106590 --- Comment #9 from Jakub Jelinek --- Fixed on the trunk so far, 12.2 is frozen now and I think the fix needs some time on the trunk before backporting, so probably only will be fixed for 12.3.
[Bug rtl-optimization/106590] [12/13 Regression] x86-64 miscompilation starting with r12-8233-g1ceddd7497e15d w/ mtune=skylake
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106590 --- Comment #8 from CVS Commits --- The master branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:3a74a7bf62f47ed0d19866576378724be932ee17 commit r13-2042-g3a74a7bf62f47ed0d19866576378724be932ee17 Author: Jakub Jelinek Date: Mon Aug 15 13:56:57 2022 +0200 ifcvt: Fix up noce_convert_multiple_sets [PR106590] The following testcase is miscompiled on x86_64-linux. The problem is in the noce_convert_multiple_sets optimization. We essentially have: if (g == 1) { g = 1; f = 23; } else { g = 2; f = 20; } and for each insn try to create a conditional move sequence. There is code to detect overlap with the regs used in the condition and the destinations, so we actually try to construct: tmp_g = g == 1 ? 1 : 2; f = g == 1 ? 23 : 20; g = tmp_g; which is fine. But, we actually try to create two different conditional move sequences in each case, seq1 with the whole (eq (reg/v:HI 82 [ g ]) (const_int 1 [0x1])) condition and seq2 with cc_cmp (eq (reg:CCZ 17 flags) (const_int 0 [0])) to rely on the earlier present comparison. In each case, we compare the rtx costs and choose the cheaper sequence (seq1 if both have the same cost). The problem is that with the skylake tuning, tmp_g = g == 1 ? 1 : 2; is actually expanded as tmp_g = (g == 1) + 1; in seq1 (which clobbers (reg 17 flags)) and as a cmov in seq2 (which doesn't). The tuning says both have the same cost, so we pick seq1. Next we check sequences for f = g == 1 ? 23 : 20; and here the seq2 cmov is cheaper, but it uses (reg 17 flags) which has been clobbered earlier. The following patch fixes that by detecting if we in the chosen sequence clobber some register mentioned in cc_cmp or rev_cc_cmp, and if yes, arranges for only seq1 (i.e. sequences that emit the comparison itself) to be used after that. 2022-08-15 Jakub Jelinek PR rtl-optimization/106590 * ifcvt.cc (check_for_cc_cmp_clobbers): New function. (noce_convert_multiple_sets_1): If SEQ sets or clobbers any regs mentioned in cc_cmp or rev_cc_cmp, don't consider seq2 for any further conditional moves. * gcc.dg/torture/pr106590.c: New test.
[Bug rtl-optimization/106590] [12/13 Regression] x86-64 miscompilation starting with r12-8233-g1ceddd7497e15d w/ mtune=skylake
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106590 Jakub Jelinek changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |jakub at gcc dot gnu.org Status|NEW |ASSIGNED --- Comment #7 from Jakub Jelinek --- Created attachment 53447 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53447&action=edit gcc13-pr106590.patch Untested fix.
[Bug rtl-optimization/106590] [12/13 Regression] x86-64 miscompilation starting with r12-8233-g1ceddd7497e15d w/ mtune=skylake
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106590 --- Comment #6 from Jakub Jelinek --- Cleaned up testcase: /* { dg-additional-options "-mtune=skylake" { target { i?86-*-* x86_64-*-* } } } */ typedef struct A { short a; } A; typedef A *B; typedef struct C { int c, d; } C; typedef C *D; B foo (void) { static A r = { .a = 1 }; return &r; } D bar (void) { static C r = { .c = 1, .d = 23 }; return &r; } static inline int __attribute__((always_inline)) baz (short a) { int e = 1, f; short g; D h; switch (a) { case 1: f = 23; g = 1; break; case 2: f = 20; g = 2; break; } h = bar (); if (h->d != f || h->c != g) __builtin_abort (); return e; } int qux (void) { B i = foo (); int e = 1; switch (i->a) { case 1: case 2: e = baz (i->a); break; case 3: e = 0; break; } return e; } int main () { qux (); return 0; } I think this is a latent bug in noce_convert_multiple_sets_1. For cond (eq (reg/v:HI 82 [ g ]) (const_int 1 [0x1])) and cc_cmp (eq (reg:CCZ 17 flags) (const_int 0 [0])) when we try_emit_cmove_seq for (insn 3 35 4 7 (set (reg/v:HI 82 [ g ]) (const_int 2 [0x2])) "pr106590.c":37:9 84 {*movhi_internal} (nil)) insn, we get seq1 (insn 65 0 66 (set (reg:CCZ 17 flags) (compare:CCZ (reg/v:HI 82 [ g ]) (const_int 1 [0x1]))) -1 (nil)) (insn 66 65 67 (set (reg:QI 96) (ne:QI (reg:CCZ 17 flags) (const_int 0 [0]))) -1 (nil)) (insn 67 66 68 (set (reg:HI 95) (zero_extend:HI (reg:QI 96))) -1 (nil)) (insn 68 67 0 (parallel [ (set (reg:HI 95) (plus:HI (reg:HI 95) (const_int 1 [0x1]))) (clobber (reg:CC 17 flags)) ]) -1 (nil)) and seq2 (insn 69 0 70 (set (reg:HI 97) (const_int 2 [0x2])) -1 (nil)) (insn 70 69 0 (set (reg:HI 95) (if_then_else:HI (eq (reg:CCZ 17 flags) (const_int 0 [0])) (reg/v:HI 82 [ g ]) (reg:HI 97))) -1 (nil)) and later for (insn 4 3 36 7 (set (reg/v:SI 84 [ f ]) (const_int 20 [0x14])) "pr106590.c":36:9 83 {*movsi_internal} (nil)) we get seq1 (insn 72 0 71 (set (reg:SI 98) (const_int 20 [0x14])) -1 (nil)) (insn 71 72 73 (set (reg:CCZ 17 flags) (compare:CCZ (reg/v:HI 82 [ g ]) (const_int 1 [0x1]))) -1 (nil)) (insn 73 71 0 (set (reg/v:SI 84 [ f ]) (if_then_else:SI (eq (reg:CCZ 17 flags) (const_int 0 [0])) (reg/v:SI 84 [ f ]) (reg:SI 98))) -1 (nil)) and seq2 (insn 74 0 75 (set (reg:SI 99) (const_int 20 [0x14])) -1 (nil)) (insn 75 74 0 (set (reg/v:SI 84 [ f ]) (if_then_else:SI (eq (reg:CCZ 17 flags) (const_int 0 [0])) (reg/v:SI 84 [ f ]) (reg:SI 99))) -1 (nil)) In the former case, cost1 == cost2, in the latter case cost1 > cost2. So, we for the first insn choose seq1 and for the second seq2. That is wrong, because the first seq1 clobbers the (reg 17 flags) register which is used in cc_cmp/rev_cc_cmp. So, if we pick that seq1, we may not use any seq2 afterwards for the latter instructions, need to always select seq1 since then.
[Bug rtl-optimization/106590] [12/13 Regression] x86-64 miscompilation starting with r12-8233-g1ceddd7497e15d w/ mtune=skylake
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106590 Richard Biener changed: What|Removed |Added Priority|P3 |P2 Known to fail||12.1.0 CC||jakub at gcc dot gnu.org
[Bug rtl-optimization/106590] [12/13 Regression] x86-64 miscompilation starting with r12-8233-g1ceddd7497e15d w/ mtune=skylake
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106590 --- Comment #5 from Andrew Pinski --- Created attachment 53442 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53442&action=edit testcase that should be ready for gcc.c-torture/execute
[Bug rtl-optimization/106590] [12/13 Regression] x86-64 miscompilation starting with r12-8233-g1ceddd7497e15d w/ mtune=skylake
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106590 --- Comment #4 from Andrew Pinski --- So the IR is slightly different entering CE1. The two BB for the sides of the if are swapped. But that is the only difference. This is definitely a latent bug that got exposed, it is not the first latent bug in CE1 recently either.
[Bug rtl-optimization/106590] [12/13 Regression] x86-64 miscompilation starting with r12-8233-g1ceddd7497e15d w/ mtune=skylake
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106590 Andrew Pinski changed: What|Removed |Added Last reconfirmed||2022-08-12 Ever confirmed|0 |1 Status|UNCONFIRMED |NEW Component|target |rtl-optimization --- Comment #3 from Andrew Pinski --- CE goes bad: IF-CASE-2 found, start 6, else 8 verify found no changes in insn with uid = 35. changing bb of uid 5 from 8 to 6 deleting block 8 Conversion succeeded on pass 1.