[Bug target/113255] [11/12/13 Regression] wrong code with -O2 -mtune=k8

2024-02-04 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113255

--- Comment #17 from GCC Commits ---
The master branch has been updated by Richard Biener:

https://gcc.gnu.org/g:5b281946c4b51132caf5e5b64c730fef92dd6123

commit r14-8796-g5b281946c4b51132caf5e5b64c730fef92dd6123
Author: Richard Biener 
Date:   Thu Feb 1 13:54:11 2024 +0100

target/113255 - avoid REG_POINTER on a pointer difference

The following avoids re-using a register holding a pointer (and
thus might be REG_POINTER) for the result of a pointer difference
computation.  That might confuse heuristics in (broken) RTL alias
analysis which relies on REG_POINTER indicating that we're
dealing with one.

This alone doesn't fix anything.

PR target/113255
* config/i386/i386-expand.cc
(expand_set_or_cpymem_prologue_epilogue_by_misaligned_moves):
Use a new pseudo for the skipped number of bytes.

2024-02-01 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113255

--- Comment #16 from Richard Biener  ---
The "interesting" part is that the i386 + simplify_rtx parts fix the issue, but
if you add the alias.cc part on top it fails again at -O1 (the alias.cc part
alone also "fixes" it).  This all of course shows that RTL alias analysis
is fundamentally broken w/o r14-8346.

2024-02-01 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113255

--- Comment #15 from Richard Biener  ---
The issue is also that via CSELIB we go from the good

(minus:DI (reg/f:DI 119)
(reg:DI 115))

to

(minus:DI (value:DI 11:11 @0x41fca00/0x41ec410)
(value:DI 10:15448 @0x41fc9e8/0x41ec3e0))

and later when DSE does cselib_expand_value_rtx on the value it produces

(minus:DI (reg/f:DI 119)
(minus:DI (reg/f:DI 120)
(reg/f:DI 114)))

which simplify_rtx then turns into

(minus:DI (plus:DI (reg/f:DI 114)
(reg/f:DI 119))
(reg/f:DI 120))

Note how that associates things in a way that confuses us later.  In particular
the loc for (value:DI 10:15448) (aka the inner minus) isn't REG_POINTER
(after you fix i386 RTL expansion), but after the re-association only
the wrong REG_POINTER is immediately visible.
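
To make the re-association problem concrete, here is a minimal toy model, not
GCC code.  Expr, eval and guess_base are hypothetical names, the register
values are made up, and guess_base is only a naive stand-in for a
REG_POINTER-driven base heuristic (it is not find_base_term).  It assumes
reg 115 holds reg 120 minus reg 114, as the cselib expansion above suggests.

```cpp
#include <cstdint>
#include <map>
#include <memory>

// Toy stand-in for RTL: a register leaf or a binary PLUS/MINUS node.
struct Expr {
    enum Kind { REG, PLUS, MINUS } kind = REG;
    int regno = 0;            // meaningful only for REG
    bool is_pointer = false;  // toy analogue of the REG_POINTER flag
    std::shared_ptr<Expr> op0, op1;
};
using ExprP = std::shared_ptr<Expr>;

ExprP reg(int regno, bool is_pointer) {
    auto e = std::make_shared<Expr>();
    e->kind = Expr::REG;
    e->regno = regno;
    e->is_pointer = is_pointer;
    return e;
}

ExprP bin(Expr::Kind kind, ExprP a, ExprP b) {
    auto e = std::make_shared<Expr>();
    e->kind = kind;
    e->op0 = a;
    e->op1 = b;
    return e;
}

// Evaluate an expression given concrete register contents.
std::int64_t eval(const ExprP &e, const std::map<int, std::int64_t> &regs) {
    switch (e->kind) {
    case Expr::REG:   return regs.at(e->regno);
    case Expr::PLUS:  return eval(e->op0, regs) + eval(e->op1, regs);
    case Expr::MINUS: return eval(e->op0, regs) - eval(e->op1, regs);
    }
    return 0;
}

// Naive heuristic (an assumption for illustration): walk operand 0 first
// and report the first pointer-flagged register seen, or -1 if none.
int guess_base(const ExprP &e) {
    if (e->kind == Expr::REG)
        return e->is_pointer ? e->regno : -1;
    int base = guess_base(e->op0);
    return base != -1 ? base : guess_base(e->op1);
}

// (minus (reg/f 119) (reg 115))
ExprP good_form() {
    return bin(Expr::MINUS, reg(119, true), reg(115, false));
}

// (minus (reg/f 119) (minus (reg/f 120) (reg/f 114)))
ExprP expanded_form() {
    return bin(Expr::MINUS, reg(119, true),
               bin(Expr::MINUS, reg(120, true), reg(114, true)));
}

// (minus (plus (reg/f 114) (reg/f 119)) (reg/f 120))
ExprP reassociated_form() {
    return bin(Expr::MINUS,
               bin(Expr::PLUS, reg(114, true), reg(119, true)),
               reg(120, true));
}
```

All three forms evaluate to the same number, but in this toy the heuristic
picks reg 119 for the first two and reg 114 for the re-associated one.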

DSE moves all of this back and forth into and out of CSELIB; it feels like a
bit of a mess.  It obviously relies on the expansion to discover base values.

First the x86 backend should avoid having a REG_POINTER as the pointer
difference:

diff --git a/gcc/config/i386/i386-expand.cc b/gcc/config/i386/i386-expand.cc
index 0d817fc3f3b..26c48e8b0c8 100644
--- a/gcc/config/i386/i386-expand.cc
+++ b/gcc/config/i386/i386-expand.cc
@@ -8090,7 +8090,7 @@ expand_set_or_cpymem_prologue_epilogue_by_misaligned_moves (rtx destmem, rtx src
   /* See how many bytes we skipped.  */
   saveddest = expand_simple_binop (GET_MODE (*destptr), MINUS, saveddest,
   *destptr,
-  saveddest, 1, OPTAB_DIRECT);
+  NULL_RTX, 1, OPTAB_DIRECT);
   /* Adjust srcptr and count.  */
   if (!issetmem)
*srcptr = expand_simple_binop (GET_MODE (*srcptr), MINUS, *srcptr,
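
As a rough illustration of why the target argument matters, here is a toy
sketch, not GCC's expander.  RegFile, new_pseudo, emit_minus and NO_TARGET are
hypothetical names; the only behaviour modelled is that a result placed in an
existing register inherits whatever pointer flag that register already has,
while a fresh pseudo starts without one.

```cpp
#include <vector>

// Toy pseudo register carrying an analogue of the REG_POINTER flag.
struct Pseudo {
    int regno;
    bool reg_pointer;
};

// Hypothetical sentinel playing the role of "no suggested target".
const int NO_TARGET = -1;

class RegFile {
    std::vector<Pseudo> regs_;

public:
    // Allocate a new pseudo and return its number.
    int new_pseudo(bool reg_pointer) {
        int regno = static_cast<int>(regs_.size());
        regs_.push_back(Pseudo{regno, reg_pointer});
        return regno;
    }

    const Pseudo &get(int regno) const { return regs_.at(regno); }

    // Toy "emit a - b": the result lands in `target` when one is given,
    // otherwise in a fresh pseudo that is not flagged as a pointer.
    // The operands are unused here because only the flag is modelled.
    int emit_minus(int a, int b, int target) {
        (void)a;
        (void)b;
        if (target != NO_TARGET)
            return target;          // reuses the register and its flag
        return new_pseudo(false);   // a difference of pointers is not a pointer
    }
};
```

In this model, passing the saved destination register as the target leaves the
difference in a pointer-flagged register, whereas passing NO_TARGET yields an
unflagged one.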

We can avoid the issue by avoiding re-association of pointer MINUS:

diff --git a/gcc/simplify-rtx.cc b/gcc/simplify-rtx.cc
index ee75079917f..0108d0aa3bd 100644
--- a/gcc/simplify-rtx.cc
+++ b/gcc/simplify-rtx.cc
@@ -3195,11 +3195,15 @@ simplify_context::simplify_binary_operation_1 (rtx_code code,
  canonicalize (minus A (plus B C)) to (minus (minus A B) C).
 Don't use the associative law for floating point.
 The inaccuracy makes it nonassociative,
-and subtle programs can break if operations are associated.  */
+and subtle programs can break if operations are associated.
+Don't use the associative law when subtracting a MINUS from
+a REG_POINTER as that can trick find_base_term into discovering
+the wrong base.  */

   if (INTEGRAL_MODE_P (mode)
  && (plus_minus_operand_p (op0)
- || plus_minus_operand_p (op1))
+ || ((!REG_P (op0) || !REG_POINTER (op0))
+ && plus_minus_operand_p (op1)))
  && (tem = simplify_plus_minus (code, mode, op0, op1)) != 0)
return tem;
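
The effect of the changed condition can be sketched in isolation.  This is a
simplified model with hypothetical names (Operand, old_guard, new_guard); it
ignores the INTEGRAL_MODE_P check and whether simplify_plus_minus actually
succeeds, and only compares when re-association would be attempted.

```cpp
// Hypothetical, simplified operand description; not GCC's rtx.
struct Operand {
    bool is_reg;         // toy analogue of REG_P
    bool reg_pointer;    // toy analogue of REG_POINTER, meaningful for registers
    bool is_plus_minus;  // toy analogue of plus_minus_operand_p
};

// Condition before the patch: either operand being a PLUS/MINUS is enough.
bool old_guard(const Operand &op0, const Operand &op1) {
    return op0.is_plus_minus || op1.is_plus_minus;
}

// Condition after the patch: a PLUS/MINUS in op1 only counts when op0 is
// not a pointer-flagged register.
bool new_guard(const Operand &op0, const Operand &op1) {
    return op0.is_plus_minus
           || ((!op0.is_reg || !op0.reg_pointer) && op1.is_plus_minus);
}
```

For a pointer-flagged register minus an inner MINUS, old_guard is true and
new_guard is false, so that one shape is left alone while other combinations
behave as before.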


or we can avoid it with a more dangerous (IMHO) "fix" like the following,
which looks good on the surface but isn't reliable and might instead
trick find_base_term into deflecting to another invalid base:

diff --git a/gcc/alias.cc b/gcc/alias.cc
index 3672bf277b9..f589a1fa47a 100644
--- a/gcc/alias.cc
+++ b/gcc/alias.cc
@@ -2094,7 +2101,14 @@ find_base_term (rtx x, vec

2024-01-23 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113255

--- Comment #14 from GCC Commits ---
The master branch has been updated by H.J. Lu:

https://gcc.gnu.org/g:3936c8709c25c8bc72be0c1b2cc3ae7a25dc90ec

commit r14-8363-g3936c8709c25c8bc72be0c1b2cc3ae7a25dc90ec
Author: H.J. Lu
Date:   Tue Jan 23 06:34:43 2024 -0800

gcc.dg/torture/pr113255.c: Fix ia32 test failure

Fix ia32 test failure:

FAIL: gcc.dg/torture/pr113255.c   -O1  (test for excess errors)
Excess errors:
cc1: error: '-mstringop-strategy=rep_8byte' not supported for 32-bit code

PR rtl-optimization/113255
* gcc.dg/torture/pr113255.c (dg-additional-options): Add only
if not ia32.

2024-01-22 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113255

Richard Biener  changed:

               What|Removed                       |Added
----------------------------------------------------------------------------
           Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org
      Known to work|                              |14.0
             Status|NEW                           |ASSIGNED
            Summary|[11/12/13/14 Regression]      |[11/12/13 Regression] wrong
                   |wrong code with -O2           |code with -O2 -mtune=k8
                   |-mtune=k8                     |

--- Comment #13 from Richard Biener  ---
Fixed on trunk so far.