[Bug rtl-optimization/84780] [8 Regression] wrong code aarch64 with -O3 --param=tree-reassoc-width=32
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84780

Segher Boessenkool changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #12 from Segher Boessenkool ---
Patch for the original problem went in as r258452:
(I accidentally deleted the changelog from the commit message, so BZ
didn't pick this up).

    combine: Fix PR84780 (more LOG_LINKS trouble)

    There still are situations where we have stale LOG_LINKS.  This
    causes combine to try two-insn combinations I2->I3 where the register
    set by I2 is used before I3 as well.  Not good.

    This patch fixes it by checking for this situation in can_combine_p
    (similar to what we already do for three and four insn combinations).

Patch for #c10 went in as r258523.

    combine: Don't make log_links for pc_rtx (PR84780 #c10)

    distribute_links tries to place a log_link for whatever the destination
    of the modified instruction is.  It shouldn't do that when that dest is
    pc_rtx, which isn't actually a register.

            * combine.c (distribute_links): Don't make a link based on pc_rtx.

Closing as fixed.
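For readers who do not know combine, the r258452 check described above can be pictured with a toy model. This is only a sketch: `toy_insn`, `toy_reg_used_between` and `toy_can_combine` are hypothetical names invented for illustration, registers are plain integers, and none of this is GCC's actual code (the real check lives in can_combine_p and works on RTL). It only shows the idea that a link from I3 back to I2 must be rejected when something between the two reads the register I2 sets.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy instruction (hypothetical, for illustration only): it writes at
   most one register and reads at most two.  */
struct toy_insn {
  int sets;     /* register number written, or -1 for none */
  int uses[2];  /* register numbers read, or -1 for an unused slot */
};

/* Return true if REG is read by any insn strictly between FROM and TO.  */
bool
toy_reg_used_between (const struct toy_insn *stream, size_t from, size_t to,
                      int reg)
{
  for (size_t i = from + 1; i < to; i++)
    for (size_t j = 0; j < 2; j++)
      if (stream[i].uses[j] == reg)
        return true;
  return false;
}

/* Return true if, in this toy model, a two-insn combination I2 -> I3 may
   be attempted.  A link is treated as stale when the register set by I2
   is read by an insn between I2 and I3, because merging I2 into I3 would
   then remove a value that the intermediate insn still needs.  */
bool
toy_can_combine (const struct toy_insn *stream, size_t i2, size_t i3)
{
  assert (i2 < i3);
  int reg = stream[i2].sets;
  if (reg < 0)
    return false;  /* I2 sets no register, so there is nothing to link.  */
  return !toy_reg_used_between (stream, i2, i3, reg);
}
```

In the terms of this PR, with register 66 standing in for cc: a stream where the first insn sets 66, the second reads 66 and the third sets 66 again must not allow a first -> third combination, which is the 55 / 71 / 100 situation analysed in the earlier comments.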
[Bug rtl-optimization/84780] [8 Regression] wrong code aarch64 with -O3 --param=tree-reassoc-width=32
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84780

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P1
[Bug rtl-optimization/84780] [8 Regression] wrong code aarch64 with -O3 --param=tree-reassoc-width=32
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84780 --- Comment #11 from Segher Boessenkool --- That is a separate issue, not caused by the previous patch. I have a patch for this, too.
[Bug rtl-optimization/84780] [8 Regression] wrong code aarch64 with -O3 --param=tree-reassoc-width=32
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84780

--- Comment #10 from Zdenek Sojka ---
(In reply to Segher Boessenkool from comment #8)
> Created attachment 43631 [details]
> proposed patch
>
> I cannot reproduce that exact generated code; maybe it needs tuning for some
> particular CPU?
>
> Could you try the attached patch?  Thanks!

This is causing segfault:

$ cat testcase.c
int a, d;
long b;
__int128 c;
void foo(void)
{
  a &= 0 < b;
  do {
    c = b >>= 63;
    c -= __builtin_add_overflow_p(0, b, d);
  } while (a >= 255);
}

$ aarch64-unknown-linux-gnu-gcc -Og testcase.c
during RTL pass: combine
testcase.c: In function 'foo':
testcase.c:11:1: internal compiler error: Segmentation fault
 }
 ^
0xc927bf crash_signal
        /repo/gcc-trunk/gcc/toplev.c:325
0xc11391 reg_used_between_p(rtx_def const*, rtx_insn const*, rtx_insn const*)
        /repo/gcc-trunk/gcc/rtlanal.c:1128
0x13c1486 can_combine_p
        /repo/gcc-trunk/gcc/combine.c:1993
...
[Bug rtl-optimization/84780] [8 Regression] wrong code aarch64 with -O3 --param=tree-reassoc-width=32
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84780

--- Comment #9 from Zdenek Sojka ---
(In reply to Segher Boessenkool from comment #8)
> Created attachment 43631 [details]
> proposed patch
>
> I cannot reproduce that exact generated code; maybe it needs tuning for some
> particular CPU?
>
> Could you try the attached patch?  Thanks!

I can confirm that r258444 FAILs, but r258449+patch PASSes the testcases (both
original and reduced).
[Bug rtl-optimization/84780] [8 Regression] wrong code aarch64 with -O3 --param=tree-reassoc-width=32
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84780

Segher Boessenkool changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|unassigned at gcc dot gnu.org |segher at gcc dot gnu.org

--- Comment #8 from Segher Boessenkool ---
Created attachment 43631
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43631&action=edit
proposed patch

I cannot reproduce that exact generated code; maybe it needs tuning for some
particular CPU?

Could you try the attached patch?  Thanks!
[Bug rtl-optimization/84780] [8 Regression] wrong code aarch64 with -O3 --param=tree-reassoc-width=32
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84780 --- Comment #7 from Segher Boessenkool --- I have a patch.
[Bug rtl-optimization/84780] [8 Regression] wrong code aarch64 with -O3 --param=tree-reassoc-width=32
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84780 --- Comment #6 from Segher Boessenkool --- And the actual problem happens earlier: the earlier 63, 70 -> 71 combination links the much later insn 100 to 70, for cc, but there are plenty other setters and users of cc earlier.
[Bug rtl-optimization/84780] [8 Regression] wrong code aarch64 with -O3 --param=tree-reassoc-width=32
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84780

--- Comment #5 from Segher Boessenkool ---
Insn 55 is a parallel, and that is split into two insns i1 and i2, both
numbered as 55.  The i1 will never become part of the insn stream.  It
is this insn that is deleted.

Later on insn 55 is combined into insn 100:

   55: cc:CC_C=zero_extend(r165:DI)+zero_extend(x2:DI)!=zero_extend(r165:DI+x2:DI)
  100: {cc:CC_C=zero_extend(r178:DI)+zero_extend(r198:DI)!=zero_extend(r178:DI+r198:DI);r200:DI=r178:DI+r198:DI;}
      REG_DEAD r198:DI
      REG_DEAD r178:DI

becomes

  100: {cc:CC_C=zero_extend(r178:DI)+zero_extend(r198:DI)!=zero_extend(r178:DI+r198:DI);r200:DI=r178:DI+r198:DI;}
      REG_DEAD r178:DI
      REG_DEAD r198:DI

and that seems fine, too?  Or does something in between use cc?  Ah yes,
insn 71 does.

Somehow insn 100 has a LOG_LINK to 55 though (for cc).  This happens at
the 55 -> 70 combination.
[Bug rtl-optimization/84780] [8 Regression] wrong code aarch64 with -O3 --param=tree-reassoc-width=32
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84780

--- Comment #4 from ktkachov at gcc dot gnu.org ---
A carry-setting instruction gets deleted.
Among the disassembly the non-failing assembly has this:

        cmp     x13, 0
        asr     w0, w0, w4
        cset    w4, ne
        sxtw    x0, w0
        adrp    x12, d
        sub     w4, w7, w4
        sxtw    x9, w7
        add     w4, w4, w6
        asr     x6, x0, 63
        cmn     x4, x2  // sets the carry flag
        asr     x1, x9, 63
        cinc    x3, x3, cs  // Uses the carry flag

The bad disassembly has this:

        cmp     x4, 0
        adrp    x13, d
        cset    w4, ne
        asr     x7, x0, 63
        sxtw    x6, w8
        cinc    x1, x3, cs  // use of carry flag, but the setter was eliminated

Combine ends up eliminating the carry-setting instruction:

(insn 55 52 56 2 (set (reg:CC_C 66 cc)
        (ne:CC_C (plus:TI (zero_extend:TI (reg:DI 165))
                (zero_extend:TI (reg:DI 2 x2 [ h ])))
            (zero_extend:TI (plus:DI (reg:DI 165)
                    (reg:DI 2 x2 [ h ]) "bad.c":23 104 {*adddi3_compareC_cconly}
     (nil))

From what I can see in the combine logs:

Trying 55 -> 70:
   55: {cc:CC_C=zero_extend(r165:DI)+zero_extend(x2:DI)!=zero_extend(r165:DI+x2:DI);r167:DI=r165:DI+x2:DI;}
      REG_DEAD x2:DI
      REG_DEAD r165:DI
   70: r178:DI=r167:DI
      REG_DEAD r167:DI

After a few failed PARALLEL formation it succeeds twice with:

Successfully matched this instruction:
(set (reg:CC_C 66 cc)
    (ne:CC_C (plus:TI (zero_extend:TI (reg:DI 165))
            (zero_extend:TI (reg:DI 2 x2 [ h ])))
        (zero_extend:TI (plus:DI (reg:DI 165)
                (reg:DI 2 x2 [ h ])
Successfully matched this instruction:
(set (reg:DI 178)
    (plus:DI (reg:DI 165)
        (reg:DI 2 x2 [ h ])))
allowing combination of insns 55 and 70
original costs 0 + 0 = 0
replacement costs 24 + 4 = 28
deferring deletion of insn with uid = 55.
modifying insn i2    55: cc:CC_C=zero_extend(r165:DI)+zero_extend(x2:DI)!=zero_extend(r165:DI+x2:DI)
deferring rescan insn with uid = 55.
modifying insn i3    70: r178:DI=r165:DI+x2:DI
      REG_DEAD r165:DI
      REG_DEAD x2:DI
deferring rescan insn with uid = 70.

so it seems like it deletes insn 55 but then also modifies it?
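As a reading aid for the CC_C pattern in insn 55, here is a small C sketch of the condition it encodes, namely that the sum of the zero-extended operands differs from the zero-extended 64-bit sum, which is the carry-out of the addition. The function names `carry_wide`, `carry_wrap` and `cinc_on_carry` are made up for this illustration, the code assumes a compiler with the `unsigned __int128` extension (as the testcases in this PR already do), and it models only the arithmetic, not anything combine does.

```c
#include <stdbool.h>
#include <stdint.h>

/* Carry-out of a 64-bit add, written the way the CC_C pattern reads:
   zero_extend (a) + zero_extend (b) != zero_extend (a + b).  */
bool
carry_wide (uint64_t a, uint64_t b)
{
  unsigned __int128 wide = (unsigned __int128) a + (unsigned __int128) b;
  unsigned __int128 narrow = (unsigned __int128) (uint64_t) (a + b);
  return wide != narrow;
}

/* The same condition without a wider type: the 64-bit sum wrapped.  */
bool
carry_wrap (uint64_t a, uint64_t b)
{
  return (uint64_t) (a + b) < a;
}

/* Rough model of a "cmn a, b" followed by "cinc x, x, cs": X is
   incremented only when the addition a + b produced a carry.  */
uint64_t
cinc_on_carry (uint64_t x, uint64_t a, uint64_t b)
{
  return x + (carry_wide (a, b) ? 1 : 0);
}
```

Seen this way, the bad disassembly corresponds to evaluating the increment in `cinc_on_carry` against whatever carry an earlier, unrelated instruction left behind, because the instruction computing the condition for these particular operands is gone.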
[Bug rtl-optimization/84780] [8 Regression] wrong code aarch64 with -O3 --param=tree-reassoc-width=32
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84780

Jakub Jelinek changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org,
                   |                            |segher at gcc dot gnu.org

--- Comment #3 from Jakub Jelinek ---
Started with r257644, but I haven't analyzed whether things go wrong during
combine or whether it just made some latent bug reproducible.
[Bug rtl-optimization/84780] [8 Regression] wrong code aarch64 with -O3 --param=tree-reassoc-width=32
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84780 --- Comment #2 from ktkachov at gcc dot gnu.org --- Fails for me with -O2 --param=tree-reassoc-width=4. With -fno-if-conversion it doesn't fail but I don't see what the if-conversion passes do wrong, if anything
[Bug rtl-optimization/84780] [8 Regression] wrong code aarch64 with -O3 --param=tree-reassoc-width=32
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84780

ktkachov at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2018-03-09
                 CC|                            |ktkachov at gcc dot gnu.org
      Known to work|                            |7.3.1
     Ever confirmed|0                           |1

--- Comment #1 from ktkachov at gcc dot gnu.org ---
Confirmed
[Bug rtl-optimization/84780] [8 Regression] wrong code aarch64 with -O3 --param=tree-reassoc-width=32
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84780

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |8.0