[Bug tree-optimization/94846] Failure to optimize jnc+inc into adc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94846

Andrew Pinski changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|normal                      |enhancement

--- Comment #5 from Andrew Pinski ---
After r12-897 (which added a late sink pass), we get the following in
.optimized:

  if (_10 != 0)
    goto ; [50.00%]
  else
    goto ; [50.00%]

  [local count: 536870913]:
  _2 = _1 + 1;

  [local count: 1073741824]:
  # prephitmp_11 = PHI <_1(2), _2(3)>
  # _13 = PHI <_1(2), _2(3)>
  *p_5(D) = _13;
  return prephitmp_11;

Notice how prephitmp_11 and _13 are the same, but no RTL optimizer handles
that.
[Bug tree-optimization/94846] Failure to optimize jnc+inc into adc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94846

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org

--- Comment #4 from Richard Biener ---
The sinking opportunity is a secondary one, sink does

Sinking # .MEM_8 = VDEF <.MEM_4(D)>
*p_5(D) = _1;
 from bb 2 to bb 5

but then not sinking further with the store commoning I implemented for
GCC 11.

  [local count: 1073741824]:
  u_6 = *p_5(D);
  _1 = u_6 + x_7(D);
  if (_1 < u_6)
    goto ; [50.00%]
  else
    goto ; [50.00%]

  [local count: 536870912]:
  *p_5(D) = _1;
  goto ; [100.00%]

  [local count: 536870913]:
  _2 = _1 + 1;
  *p_5(D) = _2;

  [local count: 1073741824]:
  # prephitmp_11 = PHI <_1(5), _2(3)>
  return prephitmp_11;

regular sinking works up the postdom tree.  In reality we'd have to
iterate sinking and commoning as can be seen here given we're lazy and not
computing a combined dataflow.
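
For readers following the dump above, the two shapes can be sketched at the C level. This is a hypothetical reconstruction from the GIMPLE in this comment, not the test case attached to the bug; the function names are invented for illustration. The first function keeps a store in each arm, as in the dump; the second has the store commoned into the join block, which is roughly what iterating sinking and commoning would be expected to reach.

```c
#include <stdint.h>

/* Hypothetical: a store in each arm, mirroring the dump above. */
uint32_t store_in_both_arms(uint32_t *p, uint32_t x)
{
    uint32_t u = *p;
    uint32_t sum = u + x;      /* _1 = u_6 + x_7(D) */
    uint32_t result;

    if (sum < u) {             /* unsigned wraparound: the add carried */
        result = sum + 1;      /* _2 = _1 + 1 */
        *p = result;
    } else {
        result = sum;
        *p = result;
    }
    return result;
}

/* Hypothetical: the same computation with a single store at the join. */
uint32_t store_commoned(uint32_t *p, uint32_t x)
{
    uint32_t u = *p;
    uint32_t sum = u + x;

    if (sum < u)
        sum += 1;
    *p = sum;                  /* one store, fed by the merged value */
    return sum;
}
```

Both functions compute the same result and leave the same value in *p; only the placement of the store differs.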
[Bug tree-optimization/94846] Failure to optimize jnc+inc into adc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94846

Uroš Bizjak changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2020-11-25
             Status|UNCONFIRMED                 |NEW
                 CC|                            |rguenth at gcc dot gnu.org
          Component|rtl-optimization            |tree-optimization

--- Comment #3 from Uroš Bizjak ---
This looks like a tree-optimization problem. A store to *p_5(D) could sink
all the way to bb5.

RTL gets expanded from:

...
  if (_10 != 0)
    goto ; [50.00%]
  else
    goto ; [50.00%]

  [local count: 536870912]:
  *p_5(D) = _1;
  goto ; [100.00%]

  [local count: 536870913]:
  _2 = _1 + 1;
  *p_5(D) = _2;

  [local count: 1073741824]:
  # prephitmp_11 = PHI <_1(3), _2(4)>
  return prephitmp_11;

ifcvt RTL pass (_.ce1) is unable to convert:

...
   17: r86:SI=r89:SI
      REG_DEAD r89:SI
   18: flags:CCZ=cmp(r85:SI,0)
      REG_DEAD r85:SI
   19: pc={(flags:CCZ!=0)?L24:pc}
      REG_DEAD flags:CCZ
      REG_BR_PROB 536870916
   20: NOTE_INSN_BASIC_BLOCK 5
   21: [r87:DI]=r89:SI
      REG_DEAD r87:DI
      ; pc falls through to BB 7
   24: L24:
   25: NOTE_INSN_BASIC_BLOCK 6
   26: {r86:SI=r89:SI+0x1;clobber flags:CC;}
      REG_UNUSED flags:CC
   27: [r87:DI]=r86:SI
      REG_DEAD r87:DI
   32: L32:
   35: NOTE_INSN_BASIC_BLOCK 7
...

IF-THEN-ELSE-JOIN block found, pass 2, test 2, then 5, else 6, join 7

===

In case p is not a pointer, RTL optimizers start with:

...
  if (_8 != 0)
    goto ; [50.00%]
  else
    goto ; [50.00%]

  [local count: 536870913]:
  p_5 = p_4 + 1;

  [local count: 1073741824]:
  # p_1 = PHI
  return p_1;

and RTL ifcvt pass is able to convert this form to addcc:

IF-THEN-JOIN block found, pass 2, test 2, then 5, join 6
if-conversion succeeded through noce_try_addcc
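
As a rough illustration of the non-pointer case, the IF-THEN-JOIN shape amounts to a conditional increment with no store in either arm. The sketch below is an assumption drawn from the GIMPLE above, not code from the bug report; both function names and the `cond` parameter (standing in for `_8`) are hypothetical. The second function is the branch-free form that a conversion of this kind corresponds to at the source level.

```c
#include <stdint.h>

/* Hypothetical: branchy form, matching "if (_8 != 0) p_5 = p_4 + 1". */
uint32_t cond_inc_branchy(uint32_t p, uint32_t cond)
{
    if (cond != 0)
        p = p + 1;
    return p;
}

/* Hypothetical: branch-free equivalent; (cond != 0) evaluates to 0 or 1. */
uint32_t cond_inc_branchless(uint32_t p, uint32_t cond)
{
    return p + (uint32_t)(cond != 0);
}
```

With a store in each arm, as in the pointer case, the block is an IF-THEN-ELSE-JOIN instead, which is the form the pass above did not convert.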