[Bug rtl-optimization/101188] [postreload] Uses content of a clobbered register

2023-08-09 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101188

Georg-Johann Lay  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #18 from Georg-Johann Lay  ---
Fixed in v14.

[Bug rtl-optimization/101188] [postreload] Uses content of a clobbered register

2023-06-13 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101188

--- Comment #17 from Georg-Johann Lay  ---
Created attachment 55318
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55318&action=edit
Reduced patch for v8 (handles only CLOBBERs, hence should be good enough for
AVR).

[Bug rtl-optimization/101188] [postreload] Uses content of a clobbered register

2023-06-13 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101188

--- Comment #16 from Georg-Johann Lay  ---
Created attachment 55317
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55317&action=edit
Backport to v13 of Jeff's patch.

[Bug rtl-optimization/101188] [postreload] Uses content of a clobbered register

2023-06-12 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101188

--- Comment #15 from CVS Commits  ---
The master branch has been updated by Jeff Law :

https://gcc.gnu.org/g:ae193f9008e02683e27f3c87f3b06f38e103b1d0

commit r14-1738-gae193f9008e02683e27f3c87f3b06f38e103b1d0
Author: Jeff Law 
Date:   Mon Jun 12 12:52:10 2023 -0600

[committed] [PR rtl-optimization/101188] Fix reload_cse_move2add ignoring
clobbers

So as Georg-Johann discusses in the BZ, reload_cse_move2add can generate
 incorrect code when optimizing code with clobbers.  Specifically in the
case where we try to optimize a sequence of 4 operations down to 3
operations we can reset INSN to the next instruction and continue the loop.

That skips the code to invalidate objects based on things like REG_INC
nodes, stack pushes and most importantly clobbers attached to the current
insn.

This patch factors all of the invalidation code used by reload_cse_move2add
into a new function and calls it at the appropriate time.

Georg-Johann has confirmed this patch fixes his avr bug and I've had it in
my tester over the weekend.  It's bootstrapped and regression tested on
aarch64, m68k, sh4, alpha and hppa.  It's also regression tested successfully
on a wide variety of other targets.

gcc/
PR rtl-optimization/101188
* postreload.cc (reload_cse_move2add_invalidate): New function,
extracted from...
(reload_cse_move2add): Call reload_cse_move2add_invalidate.

gcc/testsuite
PR rtl-optimization/101188
* gcc.c-torture/execute/pr101188.c: New test
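
The control-flow shape of that fix can be illustrated with a small, self-contained
C++ toy program (this is not the GCC source; the types and the bookkeeping table
are invented for illustration, and only the "call the factored-out invalidation
helper on the early-continue path too" idea mirrors the patch):

#include <cstdio>
#include <set>
#include <vector>

// Toy "insn": may clobber at most one register (-1 = clobbers nothing).
struct Insn { int clobbered_reg; };

// Registers whose contents the pass still believes it knows.
static std::set<int> regs_with_known_value;

// Factored-out invalidation step, playing the role that
// reload_cse_move2add_invalidate plays in the real patch.
static void invalidate (const Insn &insn)
{
  if (insn.clobbered_reg >= 0)
    regs_with_known_value.erase (insn.clobbered_reg);
}

int main ()
{
  std::vector<Insn> insns = { { -1 }, { 31 }, { -1 } };
  regs_with_known_value = { 14, 31 };

  for (size_t i = 0; i < insns.size (); )
    {
      bool combined_with_next = (i == 0);   // pretend the 4->3 case fired here
      if (combined_with_next)
        {
          // The buggy variant skipped this call, so the clobber of reg 31
          // in the skipped insn was never recorded.
          invalidate (insns[i + 1]);
          i += 2;                           // step over the combined insn
          continue;
        }
      invalidate (insns[i]);
      ++i;
    }

  std::printf ("reg 31 still trusted: %s\n",
               regs_with_known_value.count (31) ? "yes (bug)" : "no (correct)");
  return 0;
}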

[Bug rtl-optimization/101188] [postreload] Uses content of a clobbered register

2023-06-02 Thread uweigand at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101188

--- Comment #14 from Ulrich Weigand  ---
(In reply to Georg-Johann Lay from comment #13)
> Also I don't have a test case for your scenario.  I can reproduce the bug
> back to v5 on avr and maybe it is even older.  As it appears, this PR led
> to no hiccups on any other target, so for now I'd like to keep the fix
> restricted to what I can test.

I agree that your patch looks correct and unlikely to cause any new problems,
so I won't object to it being committed.  I just wanted to point out that it
might not be a complete fix.

[Bug rtl-optimization/101188] [postreload] Uses content of a clobbered register

2023-06-02 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101188

--- Comment #13 from Georg-Johann Lay  ---
(In reply to Ulrich Weigand from comment #12)
> I think your root cause analysis is correct.  In this part of the code:
> 
>   if (success)
>     delete_insn (insn);
>   changed |= success;
>   insn = next;
>   move2add_record_mode (reg);
>   reg_offset[regno]
>     = trunc_int_for_mode (added_offset + base_offset,
>                           mode);
>   continue;
> 
> the intent seems to be to manually update the move2add data structures to
> account for the effects of "next", because the default logic is now skipped
> for the "next" insn.  That's why in particular the reg mode and offset are
> manually calculated.
> 
> This manual logic however is really only correct if "next" is actually just
> a simple SET.  Reading the comment before the whole loop:
>   /* For simplicity, we only perform this optimization on
>  straightforward SETs.  */
> makes me suspect the original author assumed that "next" is in fact a
> straightforward SET here as well.

That would render the optimization far less likely, e..g in the case of clobber
of CCmode regs.  I understodd the comment as only referring to "insn", not
necessarily to "next".

> This is however not true due to behavior
> of the "single_set" extractor.  (I'm wondering if "single_set" used to be
> defined differently back in the days?)
> Your fix does look correct to me as far as handling parallel CLOBBERs goes.
> However, looking at "single_set", it seems there is yet another case: the
> extractor also accepts a parallel of two or more SETs, as long as all except
> one of those SETs have destinations that are dead.  These cases would still
> not be handled correctly with your patch, I think.
> 
> I'm wondering whether it is even worthwhile to attempt to cover those cases.
> Maybe a more straightforward fix would be to keep in line with the
> above-mentioned comment about "straightforward SETs" and just check for a
> single SET directly instead of using "single_set" here.  Do you think this
> would miss any important optimizations?

I'm not sure how many optimizations this would kill.  Many insns are
parallels that also set CCmode regs which don't interfere with this
optimization, but only considering plain SETs would skip all such optimizations
on targets that can have CCmode during reload (avr is not one of them).

Also I don't have a test case for your scenario.  I can reproduce the bug back
to v5 on avr and maybe it is even older.  As it appears, this PR led to no
hiccups on any other target, so for now I'd like to keep the fix restricted to
what I can test.

I already started a review this morning:
https://gcc.gnu.org/pipermail/gcc-patches/2023-June/620446.html

[Bug rtl-optimization/101188] [postreload] Uses content of a clobbered register

2023-06-02 Thread uweigand at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101188

Ulrich Weigand  changed:

   What|Removed |Added

 CC||uweigand at gcc dot gnu.org

--- Comment #12 from Ulrich Weigand  ---
Sorry for not responding earlier; I've been out on vacation.

I think your root cause analysis is correct.  In this part of the code:

  if (success)
    delete_insn (insn);
  changed |= success;
  insn = next;
  move2add_record_mode (reg);
  reg_offset[regno]
    = trunc_int_for_mode (added_offset + base_offset,
                          mode);
  continue;

the intent seems to be to manually update the move2add data structures to
account for the effects of "next", because the default logic is now skipped for
the "next" insn.  That's why in particular the reg mode and offset are manually
calculated.

This manual logic, however, is really only correct if "next" is actually just a
simple SET.  Reading the comment before the whole loop:
  /* For simplicity, we only perform this optimization on
     straightforward SETs.  */
makes me suspect the original author assumed that "next" is in fact a
straightforward SET here as well.  This is however not true due to the behavior
of the "single_set" extractor.  (I'm wondering if "single_set" used to be
defined differently back in the day?)

Your fix does look correct to me as far as handling parallel CLOBBERs goes.
However, looking at "single_set", it seems there is yet another case: the
extractor also accepts a parallel of two or more SETs, as long as all except
one of those SETs have destinations that are dead.  These cases would still not
be handled correctly with your patch, I think.

I'm wondering whether it is even worthwhile to attempt to cover those cases. 
Maybe a more straightforward fix would be to keep in line with the
above-mentioned comment about "straightforward SETs" and just check for a
single SET directly instead of using "single_set" here.  Do you think this
would miss any important optimizations?
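
For concreteness, the stricter test suggested above would look roughly like the
fragment below.  This is only a sketch (it assumes GCC's usual RTL accessors and
would live inside postreload.cc, so it is not standalone code), not a proposed
patch:

  /* Strict variant: only accept a bare SET pattern for NEXT.  */
  rtx set = NULL_RTX;
  if (next && GET_CODE (PATTERN (next)) == SET)
    set = PATTERN (next);

  /* By contrast, the current code does
       set = single_set (next);
     which also returns the one live SET out of a PARALLEL, silently
     ignoring CLOBBERs and SETs whose destinations are dead.  */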

[Bug rtl-optimization/101188] [postreload] Uses content of a clobbered register

2023-05-30 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101188

--- Comment #11 from Georg-Johann Lay  ---
Created attachment 55217
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55217&action=edit
Proposed patch for postreload.cc to record clobbers of next insn + test case.

This patch solves the problem for avr and shows no additional regressions in
testing for avr.

--

rtl-optimization/101188: Don't bypass clobbers of some insns that are
optimized or are optimization candidates.

gcc/
PR rtl-optimization/101188
* postreload.cc (reload_cse_move2add): Record clobbers of next
insn using move2add_note_store.

gcc/testsuite/
PR rtl-optimization/101188
* gcc.c-torture/execute/pr101188.c: New test.
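
The recording step this ChangeLog entry refers to boils down to walking the next
insn's pattern and forgetting every register it clobbers.  A rough fragment
(assuming GCC's RTL macros; "forget_register_value" is a hypothetical placeholder
for the pass's own invalidation machinery, which the attachment drives via
move2add_note_store):

  rtx pat = PATTERN (next);
  if (GET_CODE (pat) == PARALLEL)
    for (int i = 0; i < XVECLEN (pat, 0); i++)
      {
        rtx x = XVECEXP (pat, 0, i);
        if (GET_CODE (x) == CLOBBER && REG_P (XEXP (x, 0)))
          /* Hypothetical helper; the real patch routes this through
             move2add_note_store.  */
          forget_register_value (XEXP (x, 0));
      }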

[Bug rtl-optimization/101188] [postreload] Uses content of a clobbered register

2023-05-29 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101188

--- Comment #10 from Georg-Johann Lay  ---
Created attachment 55189
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55189&action=edit
Proposed patch for postreload.cc so the analysis does not bypass the "next" insn.

(In reply to Georg-Johann Lay from comment #9)
> postreload.cc::reload_cse_move2add() loops over all insns, and at some point
> it encounters
> 
> (insn 44 14 15 2 (set (reg/f:HI 14 r14 [58])
>         (reg/v/f:HI 16 r16 [orig:51 self ] [51])) "fail1.c":28:5 101 {*movhi_split}
>      (nil))
> 
> During the analysis for that insn, it executes
> 
> rtx_insn *next = next_nonnote_nondebug_insn (insn);
> rtx set = NULL_RTX;
> if (next)
>   set = single_set (next);
> 
> where next is
> 
> (insn 15 44 16 2 (parallel [
>             (set (reg/f:HI 14 r14 [58])
>                 (plus:HI (reg/f:HI 14 r14 [58])
>                     (const_int 68 [0x44])))
>             (clobber (reg:QI 31 r31))
>         ]) "fail1.c":28:5 175 {addhi3_clobber}
>      (nil))
> 
> Further down, it continues with success = 0:
> 
>   if (success)
>   delete_insn (insn);
>   changed |= success;
>   insn = next;
>   [...]
>   continue;
> 
> The scan then continues with NEXT_INSN (insn), which is the insn AFTER insn
> 15, so the CLOBBER of QI:31 in insn 15 is bypassed, and note_stores or
> similar is never executed on insn 15.

A simple solution is not to do "insn = next;" and instead just to pseudo-delete
insn, so that NEXT_INSN (insn) in the for loop still works:

diff --git a/gcc/postreload.cc b/gcc/postreload.cc
index fb392651e1b..388b8c0a506 100644
--- a/gcc/postreload.cc
+++ b/gcc/postreload.cc
@@ -2031,9 +2031,8 @@ reload_cse_move2add (rtx_insn *first)
}
}
  if (success)
-   delete_insn (insn);
+   SET_INSN_DELETED (insn);
  changed |= success;
- insn = next;
  move2add_record_mode (reg);
  reg_offset[regno]
= trunc_int_for_mode (added_offset + base_offset,

[Bug rtl-optimization/101188] [postreload] Uses content of a clobbered register

2023-05-28 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101188

--- Comment #9 from Georg-Johann Lay  ---
The bug works as follows:

postreload.cc::reload_cse_move2add() loops over all insns, and at some point it
encounters

(insn 44 14 15 2 (set (reg/f:HI 14 r14 [58])
        (reg/v/f:HI 16 r16 [orig:51 self ] [51])) "fail1.c":28:5 101 {*movhi_split}
     (nil))

During the analysis for that insn, it executes

  rtx_insn *next = next_nonnote_nondebug_insn (insn);
  rtx set = NULL_RTX;
  if (next)
    set = single_set (next);

where next is

(insn 15 44 16 2 (parallel [
            (set (reg/f:HI 14 r14 [58])
                (plus:HI (reg/f:HI 14 r14 [58])
                    (const_int 68 [0x44])))
            (clobber (reg:QI 31 r31))
        ]) "fail1.c":28:5 175 {addhi3_clobber}
     (nil))

Further down, it continues with success = 0:

  if (success)
    delete_insn (insn);
  changed |= success;
  insn = next;
  [...]
  continue;

The scan then continues with NEXT_INSN (insn), which is the insn AFTER insn 15,
so the CLOBBER of QI:31 in insn 15 is bypassed, and note_stores or similar is
never executed on insn 15.  The "set = single_set (next)" also hides the fact
that insn 15 is a PARALLEL with a CLOBBER of a general purpose register.

It appears this code has been in postreload since 2003, when postreload.c was
split out of reload1.c.
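
Spelled out, the skip described above comes from the manual advance stacking on
top of the loop increment.  Schematically (a simplified fragment of the control
flow, not a verbatim excerpt from postreload.cc):

  for (rtx_insn *insn = first; insn; insn = NEXT_INSN (insn))
    {
      /* ... the optimization path for insn 44 ... */
      insn = next;   /* manually advance to insn 15 */
      continue;      /* the loop header then does insn = NEXT_INSN (insn),
                        i.e. resumes at insn 16, so insn 15's CLOBBER of r31
                        is never fed to the invalidation code.  */
    }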

[Bug rtl-optimization/101188] [postreload] Uses content of a clobbered register

2023-05-28 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101188

Georg-Johann Lay  changed:

   What|Removed |Added

            Summary|[AVR] Miscompilation and    |[postreload] Uses content
                   |function pointers           |of a clobbered register
           See Also|                            |https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56833

--- Comment #8 from Georg-Johann Lay  ---
Changing the title to something that better reflects what is going wrong.

Also, there is PR56833, which was fixed around v4.9, so maybe that fix was
incomplete.  There is also PR56442, which is still open, and it's unclear
whether that one is a duplicate.