[Bug rtl-optimization/92656] The zero_extend insn can't be eliminated in the combine pass

2020-02-28 Thread segher at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92656

--- Comment #6 from Segher Boessenkool  ---
There is no simple solution, yeah.  It may be possible to have a simple change
that results in better code on average, but that will be marginal :-/

[Bug rtl-optimization/92656] The zero_extend insn can't be eliminated in the combine pass

2020-02-27 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92656

--- Comment #5 from Jim Wilson  ---
A rewrite using dataflow would be better of course.  I'm just trying to
understand the problem with this testcase better, and maybe find a simple
solution, but I don't think that there is one.  The workarounds I see just make
the code more complicated and add more risk of something else going wrong.

[Bug rtl-optimization/92656] The zero_extend insn can't be eliminated in the combine pass

2020-02-27 Thread segher at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92656

--- Comment #4 from Segher Boessenkool  ---
The whole reg_stat thing cannot ever reliably track known bits.  We need
some other mechanism to do this, something that *is* reliable, and does
not give different results if you try combinations in a different order.
Something quite like dataflow.  This then could also be used in other
passes, of course.

[Bug rtl-optimization/92656] The zero_extend insn can't be eliminated in the combine pass

2020-02-27 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92656

--- Comment #3 from Jim Wilson  ---
Looking at this, I see that the problem occurs in record_value_for_reg where it
does
  if (!insn
      || (value && rsp->last_set_table_tick >= label_tick_ebb_start))
    rsp->last_set_invalid = 1;
last_set_table_tick is 2 and label_tick_ebb_start is 1 because this is the
first block of the function.  This actually causes a lot of variables set in
the first block to be marked invalid if used in a successful combination two or
more times, which then prevents the nonzero bits info from being used for any
of them.

There seems to be a problem with how label_tick is used.  In the very first
block in the body of the function, label_tick is 2 and label_tick_ebb_start is
1.  This is because it is considered to be the second block in the ebb after
the entry block.  In the second block in the body of the function, label_tick
is 3 and label_tick_ebb_start is 3.  This means that every variable set in the
first block gets treated differently than in every block after the first.

If I add a little bit of code before the loop to force it to be the second
block, then I get correct output from combine.  I just added this before the
loop
  static int j = 0;
  if (val)
    j++;

This also explains why the problem only occurs with -mtune=sifive-7-series:
that option enables the conditional move support that turns the loop into a
single block, and the -funroll-loops option then fully unrolls the loop,
turning the entire function into one block.  Combine then fails to handle many
of the register sets correctly because everything is now in the first block.

This also explains why the problem started when the 2->2 combination support
was added, as that causes more successful combinations, and hence more
registers getting invalidated in the first block.

So the question is why we need label_tick > label_tick_ebb_start for the first
block of the function.  There is nothing set in the entry block other than hard
registers, and those could always be handled specially by just marking them as
invalid somehow before processing instructions.

Or alternatively, in record_value_for_reg, maybe we can add a check for a
pseudo reg only set once and not live in the prologue, and avoid marking it as
invalid when we process it a second time.  There are already a lot of checks
like this scattered around the code.
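
To make the second idea concrete, here is a minimal standalone sketch of such
a guard.  It is a model only, not a patch against combine: the struct, its
fields other than last_set_table_tick and last_set_invalid, and both function
names are hypothetical, and clearing the flag in the final else branch is a
choice made for this sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-register summary; not the real reg_stat layout.  */
struct reg_info_sketch {
  bool is_pseudo;            /* false for a hard register */
  int n_sets;                /* number of insns that set the register */
  bool live_in_prologue;     /* true if live on entry to the function body */
  int last_set_table_tick;
  bool last_set_invalid;
};

/* The proposed extra condition: a pseudo that is set exactly once and is
   not live in the prologue has only one life to track.  */
static bool
set_once_pseudo_p (const struct reg_info_sketch *r)
{
  return r->is_pseudo && r->n_sets == 1 && !r->live_in_prologue;
}

/* The quoted test from record_value_for_reg with the guard added.
   HAVE_INSN and HAVE_VALUE model the insn and value pointers being
   non-null.  */
static void
record_value_with_guard (struct reg_info_sketch *r, bool have_insn,
                         bool have_value, int label_tick_ebb_start)
{
  assert (r->n_sets >= 0);
  if (!have_insn)
    r->last_set_invalid = true;
  else if (have_value
           && r->last_set_table_tick >= label_tick_ebb_start
           && !set_once_pseudo_p (r))
    r->last_set_invalid = true;
  else
    r->last_set_invalid = false;   /* sketch's choice, see lead-in */
}
```

With the tick values from this testcase (last_set_table_tick 2,
label_tick_ebb_start 1), a pseudo set once stays valid under this model, while
a hard register or a pseudo with two sets is still marked invalid.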

[Bug rtl-optimization/92656] The zero_extend insn can't be eliminated in the combine pass

2020-01-30 Thread marxin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92656

Martin Liška  changed:

               What|Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2020-01-30
                 CC|                            |marxin at gcc dot gnu.org
     Ever confirmed|0                           |1

[Bug rtl-optimization/92656] The zero_extend insn can't be eliminated in the combine pass

2019-12-19 Thread segher at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92656

--- Comment #2 from Segher Boessenkool  ---
Trying 104 -> 105:
  104: r125:SI=zero_extend(r101:SI#0)
  REG_DEAD r101:SI
  105: r127:SI={(r100:SI!=0)?r125:SI:r79:SI}
  REG_DEAD r125:SI
  REG_DEAD r100:SI
  REG_DEAD r79:SI
Failed to match this instruction:
(set (reg/v:SI 127 [ result ])
    (if_then_else:SI (ne (reg:SI 100)
            (const_int 0 [0]))
        (zero_extend:SI (subreg:HI (reg:SI 101) 0))
        (reg/v:SI 79 [ result ])))
Failed to match this instruction:
(set (reg/v:SI 127 [ result ])
    (if_then_else:SI (ne (reg:SI 100)
            (const_int 0 [0]))
        (and:SI (reg:SI 101)
            (const_int 65535 [0xffff]))
        (reg/v:SI 79 [ result ])))


Combine does not know r101 has all the high bits clear, apparently.  r101 is
formed via

insn_cost 4 for    77: r94:SI=0x4000
insn_cost 4 for    78: r93:SI=r94:SI+0x2
  REG_DEAD r94:SI
  REG_EQUAL 0x4002
insn_cost 8 for    24: r79:SI=zero_extend(r95:SI#0)
  REG_DEAD r95:SI
insn_cost 4 for   103: r101:SI=r79:SI^r93:SI

(insn 24 is a HImode subreg), so it could have seen that.  But the way nonzero
bits are tracked is not very predictable or dependable (it depends on the order
that combine looks at insns, which changes if insns combine where they didn't
before, etc.)

This whole nonzero_bits thing should be handled by dataflow.
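
For illustration, the reasoning that would let the zero_extend go away can be
written out as a small standalone model.  This is only a sketch of "bits that
may be nonzero" propagation over the four insns above; the helper names are
made up for the example and are not combine's API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Each value is described by a mask of the bits that may be nonzero.  */

/* A constant: exactly its own bits.  */
static uint32_t
nz_const (uint32_t c)
{
  return c;
}

/* zero_extend from the low WIDTH bits: everything above is known zero.  */
static uint32_t
nz_zero_extend (uint32_t src_mask, unsigned width)
{
  assert (width > 0 && width <= 32);
  uint32_t low = width == 32 ? UINT32_MAX : (UINT32_C (1) << width) - 1;
  return src_mask & low;
}

/* XOR: a result bit can be nonzero only if it may be nonzero in at least
   one operand.  */
static uint32_t
nz_xor (uint32_t a, uint32_t b)
{
  return a | b;
}

/* A zero_extend from WIDTH bits is a no-op if no higher bit may be set.  */
static bool
zero_extend_redundant_p (uint32_t src_mask, unsigned width)
{
  return nz_zero_extend (src_mask, width) == src_mask;
}

/* Walk insns 77/78, 24 and 103 from the dump.  */
static uint32_t
nz_r101 (void)
{
  uint32_t r95 = UINT32_MAX;               /* nothing known about r95 */
  uint32_t r79 = nz_zero_extend (r95, 16); /* insn 24 */
  uint32_t r93 = nz_const (0x4002);        /* insns 77/78, REG_EQUAL 0x4002 */
  return nz_xor (r79, r93);                /* insn 103 */
}
```

Under this model nz_r101 () is 0xffff, so zero_extend_redundant_p (nz_r101 (),
16) holds, and both the zero_extend in insn 104 and the (and ... 65535) form
are no-ops.  The result depends only on the definitions, not on the order in
which they are visited.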

[Bug rtl-optimization/92656] The zero_extend insn can't be eliminated in the combine pass

2019-11-25 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92656

--- Comment #1 from Andrew Pinski  ---
Hmm, this comes from CoreMark (what a bad benchmark).

[Bug rtl-optimization/92656] The zero_extend insn can't be eliminated in the combine pass

2019-11-25 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92656

Andrew Pinski  changed:

               What|Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
           Severity|normal                      |enhancement