[Bug rtl-optimization/92656] The zero_extend insn can't be eliminated in the combine pass
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92656 --- Comment #6 from Segher Boessenkool --- There is no simple solution, yeah. It may be possible to have a simple change that results in better code on average, but that will be marginal :-/
[Bug rtl-optimization/92656] The zero_extend insn can't be eliminated in the combine pass
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92656 --- Comment #5 from Jim Wilson --- A rewrite using dataflow would be better of course. I'm just trying to understand the problem with this testcase better, and maybe find a simple solution, but I don't think that there is one. The workarounds I see just make the code more complicated and add more risk of something else going wrong.
[Bug rtl-optimization/92656] The zero_extend insn can't be eliminated in the combine pass
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92656 --- Comment #4 from Segher Boessenkool --- The whole reg_stat thing cannot ever reliably track known bits. We need some other mechanism to do this, something that *is* reliable, and does not give different results if you try combinations in a different order. Something quite like dataflow. This then could also be used in other passes, of course.
[Bug rtl-optimization/92656] The zero_extend insn can't be eliminated in the combine pass
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92656 --- Comment #3 from Jim Wilson ---
Looking at this, I see that the problem occurs in record_value_for_reg where it does

      if (!insn
          || (value && rsp->last_set_table_tick >= label_tick_ebb_start))
        rsp->last_set_invalid = 1;

last_set_table_tick is 2 and label_tick_ebb_start is 1 because this is the first block of the function. This actually causes a lot of variables set in the first block to be marked invalid if used in a successful combination two or more times, which then prevents the nonzero bits info from being used for any of them.

There seems to be a problem with how label_tick is used. In the very first block in the body of the function, label_tick is 2 and label_tick_ebb_start is 1. This is because it is considered to be the second block in the ebb after the entry block. In the second block in the body of the function, label_tick is 3 and label_tick_ebb_start is 3. This means that every variable set in the first block gets treated differently than in every block after the first.

If I add a little bit of code before the loop to force it to be the second block, then I get correct output from combine. I just added this before the loop

      static int j = 0;
      if (val)
        j++;

This also explains why the problem only occurs with -mtune=sifive-7-series because this enables the conditional move support that turns the loop into a single block, and then the -funroll-loops option fully unrolls the loop, turning the entire function into one block, which prevents combine from handling many of the register sets correctly because everything is in the first block now.

This also explains why the problem started when the 2->2 combination support was added, as that causes more successful combinations, and hence more registers getting invalidated in the first block.

So the question is why we need label_tick > label_tick_ebb_start for the first block of the function. There is nothing set in the entry block other than hard registers, and those could always be handled specially by just marking them as invalid somehow before processing instructions. Or alternatively, in record_value_for_reg, maybe we can add a check for a pseudo reg only set once and not live in the prologue, and avoid marking it as invalid when we process it a second time. There are already a lot of checks like this scattered around the code.
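To make the condition quoted above easier to reason about, here is a minimal standalone sketch of just that predicate. It is not code from combine.c: the function name and the flattening of insn, value and the rsp fields into plain integers are assumptions made for illustration only.

```c
/* Hypothetical model of the test quoted from record_value_for_reg.
   have_insn and have_value stand in for "insn != NULL" and
   "value != NULL"; the two ticks are plain ints here.  */
static int
marks_last_set_invalid (int have_insn, int have_value,
                        int last_set_table_tick, int label_tick_ebb_start)
{
  return !have_insn
         || (have_value && last_set_table_tick >= label_tick_ebb_start);
}
```

With the first-block numbers from this comment (last_set_table_tick 2, label_tick_ebb_start 1), marks_last_set_invalid (1, 1, 2, 1) returns 1, so the register is marked invalid. The value is only kept when the recorded tick is below the ebb start, for example with a made-up tick of 0 against an ebb start of 1.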
[Bug rtl-optimization/92656] The zero_extend insn can't be eliminated in the combine pass
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92656

Martin Liška changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2020-01-30
                 CC|                            |marxin at gcc dot gnu.org
     Ever confirmed|0                           |1
[Bug rtl-optimization/92656] The zero_extend insn can't be eliminated in the combine pass
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92656 --- Comment #2 from Segher Boessenkool ---
Trying 104 -> 105:
  104: r125:SI=zero_extend(r101:SI#0)
      REG_DEAD r101:SI
  105: r127:SI={(r100:SI!=0)?r125:SI:r79:SI}
      REG_DEAD r125:SI
      REG_DEAD r100:SI
      REG_DEAD r79:SI
Failed to match this instruction:
(set (reg/v:SI 127 [ result ])
    (if_then_else:SI (ne (reg:SI 100)
            (const_int 0 [0]))
        (zero_extend:SI (subreg:HI (reg:SI 101) 0))
        (reg/v:SI 79 [ result ])))
Failed to match this instruction:
(set (reg/v:SI 127 [ result ])
    (if_then_else:SI (ne (reg:SI 100)
            (const_int 0 [0]))
        (and:SI (reg:SI 101)
            (const_int 65535 [0xffff]))
        (reg/v:SI 79 [ result ])))

Combine does not know r101 has all the high bits clear, apparently. r101 is formed via

insn_cost 4 for    77: r94:SI=0x4000
insn_cost 4 for    78: r93:SI=r94:SI+0x2
      REG_DEAD r94:SI
      REG_EQUAL 0x4002
insn_cost 8 for    24: r79:SI=zero_extend(r95:SI#0)
      REG_DEAD r95:SI
insn_cost 4 for   103: r101:SI=r79:SI^r93:SI

(insn 24 is a HImode subreg), so it could have seen that. But the way nonzero bits are tracked is not very predictable or dependable (it depends on the order that combine looks at insns, which changes if insns combine where they didn't before, etc.)

This whole nonzero_bits thing should be handled by dataflow.
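The reasoning that combine "could have seen" can be worked through by hand with "possibly nonzero" bit masks. The sketch below is only an illustration of that arithmetic, not GCC's nonzero_bits implementation. All function names are hypothetical, and it assumes nothing is known about r95.

```c
#include <stdint.h>

/* Each mask has a 1 for every bit that might be nonzero.  */

/* Zero-extending from HImode leaves only the low 16 bits possibly set.  */
static uint32_t
nz_zero_extend_hi (uint32_t nz_src)
{
  return nz_src & 0xffffu;
}

/* A bit of a ^ b can be set only if it can be set in a or in b.  */
static uint32_t
nz_xor (uint32_t nz_a, uint32_t nz_b)
{
  return nz_a | nz_b;
}

/* (and x MASK) changes nothing if x has no possible bit outside MASK.  */
static int
and_is_redundant (uint32_t nz_src, uint32_t mask)
{
  return (nz_src & ~mask) == 0;
}

/* Possibly-nonzero bits of r101, following insns 24, 78 and 103.  */
static uint32_t
nz_r101 (void)
{
  uint32_t nz_r95 = 0xffffffffu;                 /* assumed: nothing known */
  uint32_t nz_r79 = nz_zero_extend_hi (nz_r95);  /* insn 24 */
  uint32_t nz_r93 = 0x4002u;                     /* insn 78, REG_EQUAL 0x4002 */
  return nz_xor (nz_r79, nz_r93);                /* insn 103 */
}
```

Under these assumptions nz_r101 () is 0xffff, so and_is_redundant (nz_r101 (), 0xffff) holds and the (and:SI (reg:SI 101) (const_int 65535)) in the second failed pattern could in principle be dropped.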
[Bug rtl-optimization/92656] The zero_extend insn can't be eliminated in the combine pass
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92656 --- Comment #1 from Andrew Pinski --- Hmm, this comes from coremarks (what a bad benchmark).
[Bug rtl-optimization/92656] The zero_extend insn can't be eliminated in the combine pass
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92656

Andrew Pinski changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
           Severity|normal                      |enhancement