This patch is the middle-end piece of a series of patches for PR target/43892. It improves combine's ability to optimize instructions with multiple side effects, such as those that update explicit carry (flag) registers.

In RTL, an instruction that updates multiple registers is represented as a PARALLEL of several SETs, such as PowerPC's subfc instruction:

(insn 80 79 81 4 (parallel [
            (set (reg:SI 143)
                (minus:SI (reg/v:SI 142 [ <retval> ])
                    (reg:SI 141 [ _16 ])))
            (set (reg:SI 98 ca)
                (leu:SI (reg:SI 141 [ _16 ])
                    (reg/v:SI 142 [ <retval> ])))
        ]) "../pr43892.c":8:6 104 {subfsi3_carry}
     (expr_list:REG_DEAD (reg:SI 141 [ _16 ])
        (expr_list:REG_UNUSED (reg:SI 143)
            (nil))))

As shown above, it's relatively common for only one of the results of these instructions to be used, and for the other destination register(s) to be ignored and annotated with a REG_UNUSED note.

This patch teaches combine to take advantage of these REG_UNUSED annotations when trying to simplify instruction sequences. Currently, these annotations are ignored and the useless SETs are preserved in try_combine's combination attempts:

Trying 79 -> 80:
   79: r142:SI=r139:SI+r141:SI
      REG_DEAD r139:SI
   80: {r143:SI=r142:SI-r141:SI;ca:SI=leu(r141:SI,r142:SI);}
      REG_DEAD r141:SI
      REG_UNUSED r143:SI
Failed to match this instruction:
(parallel [
        (set (reg:SI 143)
            (reg/v:SI 139 [ <retval> ]))
        (set (reg:SI 98 ca)
            (geu:SI (plus:SI (reg/v:SI 139 [ <retval> ])
                    (reg:SI 141 [ _16 ]))
                (reg/v:SI 139 [ <retval> ])))
        (set (reg/v:SI 142 [ <retval> ])
            (plus:SI (reg/v:SI 139 [ <retval> ])
                (reg:SI 141 [ _16 ])))
    ])

Notice that the combined/fused instruction passed to recog contains a (set (reg:SI 143) (reg:139)), even though r143 was marked as unused in the input sequence.

Fortunately, it's trivial to prune these vestigial SETs, using the logic in single_set to determine that only one of the SETs in a PARALLEL is useful, or, expressed another way, that the PARALLEL can be simplified to the single_set.

This patch has been tested on x86_64-pc-linux-gnu with a make bootstrap and make -k check with no new failures, and in combination with other patches on powerpc64-unknown-linux-gnu (cf.
https://gcc.gnu.org/pipermail/gcc-patches/2021-December/585977.html).
I'll include a testcase for this functionality with the final rs6000 backend patch in the series. Ok for mainline?


2021-12-10  Roger Sayle  <ro...@nextmovesoftware.com>

gcc/ChangeLog
        * combine.c (try_combine): When I2 or I3 is PARALLEL without
        clobbers that is effectively just a single_set, just use that
        SET during the recombination/fusion attempt.

Thanks in advance,
Roger
--

diff --git a/gcc/combine.c b/gcc/combine.c
index 03e9a78..07f70b3 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -2901,6 +2901,17 @@ try_combine (rtx_insn *i3, rtx_insn *i2, rtx_insn *i1, rtx_insn *i0,
                       alloc_insn_link (i1, regno, LOG_LINKS (i2)));
     }
 
+  /* If I2 is a PARALLEL with only one useful SET and without clobbers,
+     transform I2 into that SET.  */
+  if (GET_CODE (PATTERN (i2)) == PARALLEL
+      && GET_CODE (XVECEXP (PATTERN (i2), 0, XVECLEN (PATTERN (i2), 0) - 1))
+         != CLOBBER)
+    {
+      rtx tmp = single_set (i2);
+      if (tmp)
+        SUBST (PATTERN (i2), tmp);
+    }
+
   /* If I2 is a PARALLEL of two SETs of REGs (and perhaps some CLOBBERs),
      make those two SETs separate I1 and I2 insns, and make an I0 that is
      the original I1.  */
@@ -3389,6 +3400,18 @@ try_combine (rtx_insn *i3, rtx_insn *i2, rtx_insn *i1, rtx_insn *i0,
       int extra_sets = added_sets_0 + added_sets_1 + added_sets_2;
       combine_extras++;
 
+      /* If I3 was a PARALLEL with only one useful SET, we can discard
+         the other SETs now before constructing the new PARALLEL.  */
+      if (GET_CODE (newpat) == PARALLEL
+          && newpat == PATTERN (i3)
+          && GET_CODE (XVECEXP (newpat, 0, XVECLEN (newpat, 0) - 1))
+             != CLOBBER)
+        {
+          rtx tmp = single_set (i3);
+          if (tmp)
+            newpat = tmp;
+        }
+
       if (GET_CODE (newpat) == PARALLEL)
         {
           rtvec old = XVEC (newpat, 0);