Re: [PATCH] Maintain a validity flag for REG_UNUSED notes [PR112760] (was Re: [PATCH] pro_and_epilogue: Call df_note_add_problem () if SHRINK_WRAPPING_ENABLED [PR112760])

2023-12-04 Thread Jakub Jelinek
On Mon, Dec 04, 2023 at 05:30:45PM +, Richard Sandiford wrote:
> > I don't think it's worth adding the note problem to shrink-wrapping
> > just for the regcprop code.  If we're prepared to take that compile-time
> > hit, we might as well run a proper (fast) DCE.
> 
> Here's a patch that tries to do that.  Boostrapped & regression tested
> on aarch64-linux-gnu.  Also tested on x86_64-linux-gnu for the testcase.
> (I'll run full x86_64-linux-gnu testing overnight.)
> 
> OK to install if that passes?  Not an elegant fix, but it's probably
> too much to hope for one of those.

Isn't this way too conservative though, basically limiting single_set
to ~ 15 out of the ~ 65 RTL passes (sure, it will still DTRT for
non-PARALLEL or just PARALLEL with clobbers/uses)?
Do we know about passes other than postreload which may invalidate
REG_UNUSED notes while not purging them altogether?
Given what postreload does, I bet cse/gcse might too.

If we add a RTL checking verification of the notes, we could know
immediately what other passes invalidate it.

So, couldn't we just set such flag at the end of such passes (or only if
they actually remove any redundant insns and thus potentially invalidate
them, perhaps during doing so)?

And on the x86 side, the question from the PR still stands, why is
vzeroupper pass placed exactly after reload and not postreload which
cleans stuff up after reload.

Jakub



[PATCH] Maintain a validity flag for REG_UNUSED notes [PR112760] (was Re: [PATCH] pro_and_epilogue: Call df_note_add_problem () if SHRINK_WRAPPING_ENABLED [PR112760])

2023-12-04 Thread Richard Sandiford
Richard Sandiford  writes:
> Jakub Jelinek  writes:
>> On Sat, Dec 02, 2023 at 11:04:04AM +, Richard Sandiford wrote:
>>> I still maintain that so much stuff relies on the lack of false-positive
>>> REG_UNUSED notes that (whatever the intention might have been) we need
>>> to prevent the false positive.  Like Andrew says, any use of single_set
>>> is suspect if there's a REG_UNUSED note for something that is in fact used.
>>
>> The false positive REG_UNUSED in that case comes from
>> (insn 15 14 35 2 (set (reg:CCZ 17 flags)
>> (compare:CCZ (reg:DI 0 ax [111])
>> (reg:DI 1 dx [112]))) "pr112760.c":11:22 12 {*cmpdi_1}
>>  (expr_list:REG_UNUSED (reg:CCZ 17 flags)
>> (nil)))
>> (insn 35 15 36 2 (set (reg:CCZ 17 flags)
>> (compare:CCZ (reg:DI 0 ax [111])
>> (reg:DI 1 dx [112]))) "pr112760.c":11:22 12 {*cmpdi_1}
>>  (expr_list:REG_DEAD (reg:DI 1 dx [112])
>> (expr_list:REG_DEAD (reg:DI 0 ax [111])
>> (nil
>> ...
>> use of flags
>> Haven't verified what causes the redundant comparison, but postreload cse
>> then does:
>> 110if (!count && cselib_redundant_set_p (body))
>> 111  {
>> 112if (check_for_inc_dec (insn))
>> 113  delete_insn_and_edges (insn);
>> 114/* We're done with this insn.  */
>> 115goto done;
>> 116  }
>> So, we'd in such cases need to look up what instruction was the earlier
>> setter and if it has REG_UNUSED note, drop it.
>
> Hmm, OK.  I guess it's not as simple as I'd imagined.  cselib does have
> some code to track which instruction established which equivalence,
> but it doesn't currently record what we want, and it would be difficult
> to reuse that information here anyway.  Something "simple" like a map of
> register numbers to instructions, populated only for REG_UNUSED sets,
> would be enough, and low overhead.  But it's not very natural.
>
> Perhaps DF should maintain a flag to say "the current pass keeps
> notes up-to-date", with the assumption being that any pass that
> uses the notes problem does that.  Then single_set and the
> regcprop.cc uses can check that flag.
>
> I don't think it's worth adding the note problem to shrink-wrapping
> just for the regcprop code.  If we're prepared to take that compile-time
> hit, we might as well run a proper (fast) DCE.

Here's a patch that tries to do that.  Boostrapped & regression tested
on aarch64-linux-gnu.  Also tested on x86_64-linux-gnu for the testcase.
(I'll run full x86_64-linux-gnu testing overnight.)

OK to install if that passes?  Not an elegant fix, but it's probably
too much to hope for one of those.

Richard



PR112760 is a miscompilation caused by a stale, false-positive
REG_UNUSED note.  There were originally two consecutive,
identical instructions that set the CC flags.  The first
originally had a REG_UNUSED note, but postreload later deleted
the second in favour of the first, based on cselib_redundant_set_p.

Although in principle it would be possible to remove the note
when making the optimisation, the required bookkeeping wouldn't
fit naturally into what cselib already does.  Doing that would also
arguably be a change of policy.

This patch instead adds a global flag that says whether REG_UNUSED
notes are trustworthy.  The assumption is that any pass that calls
df_note_add_problem cares about REG_UNUSED notes and will keep them
sufficiently up-to-date to support the pass's use of things like
single_set.

gcc/
PR rtl-optimization/112760
* df.h (df_d::can_trust_reg_unused_notes): New member variable.
* df-problems.cc (df_note_add_problem): Set can_trust_reg_unused_notes
to true.
* passes.cc (execute_one_pass): Clear can_trust_reg_unused_notes
after each pass.
* rtlanal.cc (single_set_2): Check can_trust_reg_unused_notes.
* regcprop.cc (copyprop_hardreg_forward_1): Likewise.

gcc/testsuite/
* gcc.dg/pr112760.c: New test.
---
 gcc/df-problems.cc  |  1 +
 gcc/df.h|  4 
 gcc/passes.cc   |  3 +++
 gcc/regcprop.cc |  4 +++-
 gcc/rtlanal.cc  |  8 ++--
 gcc/testsuite/gcc.dg/pr112760.c | 22 ++
 6 files changed, 39 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/pr112760.c

diff --git a/gcc/df-problems.cc b/gcc/df-problems.cc
index d2cfaf7f50f..d2eb95d35ad 100644
--- a/gcc/df-problems.cc
+++ b/gcc/df-problems.cc
@@ -3782,6 +3782,7 @@ void
 df_note_add_problem (void)
 {
   df_add_problem (&problem_NOTE);
+  df->can_trust_reg_unused_notes = true;
 }
 
 
diff --git a/gcc/df.h b/gcc/df.h
index 402657a7076..a405c000235 100644
--- a/gcc/df.h
+++ b/gcc/df.h
@@ -614,6 +614,10 @@ public:
   /* True if someone added or deleted something from regs_ever_live so
  that the entry and exit blocks need be reprocessed.  */
   bool redo_entry_and_exit;
+
+  /* True if REG_UNUSED notes