[Bug rtl-optimization/114674] [aarch64] ldp_fusion fails to merge 2 strs due to imprecise alignment info

2024-05-07 Thread acoplan at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114674

Alex Coplan  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #7 from Alex Coplan  ---
Fixed for GCC 15, thanks for the report.

[Bug rtl-optimization/114674] [aarch64] ldp_fusion fails to merge 2 strs due to imprecise alignment info

2024-05-07 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114674

--- Comment #6 from GCC Commits  ---
The master branch has been updated by Alex Coplan :

https://gcc.gnu.org/g:74690ff96b263b8609639b7fbc5d6afd3f19cb98

commit r15-282-g74690ff96b263b8609639b7fbc5d6afd3f19cb98
Author: Alex Coplan 
Date:   Wed Apr 10 16:30:36 2024 +0100

aarch64: Preserve mem info on change of base for ldp/stp [PR114674]

The ldp/stp fusion pass can change the base of an access so that the two
accesses end up using a common base register.  So far we have been using
adjust_address_nv to do this, but this means that we don't preserve
other properties of the mem we're replacing.  It seems better to use
replace_equiv_address_nv, as this will preserve e.g. the MEM_ALIGN of the
mem whose address we're changing.

The PR shows that by adjusting the other mem we lose alignment
information about the original access and therefore end up rejecting an
otherwise viable pair when --param=aarch64-stp-policy=aligned is passed.
This patch fixes that by using replace_equiv_address_nv instead.

Notably this is the same approach as taken by
aarch64_check_consecutive_mems when a change of base is required, so
this at least makes things more consistent between the ldp fusion pass
and the peepholes.

gcc/ChangeLog:

PR target/114674
* config/aarch64/aarch64-ldp-fusion.cc (ldp_bb_info::fuse_pair):
Use replace_equiv_address_nv on a change of base instead of
adjust_address_nv on the other access.

gcc/testsuite/ChangeLog:

PR target/114674
* gcc.target/aarch64/pr114674.c: New test.

[Bug rtl-optimization/114674] [aarch64] ldp_fusion fails to merge 2 strs due to imprecise alignment info

2024-04-10 Thread dizhao at os dot amperecomputing.com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114674

--- Comment #5 from Di Zhao  ---
Yes it's better to take a maximum MEM_ALIGN. I've missed that. Thanks.

[Bug rtl-optimization/114674] [aarch64] ldp_fusion fails to merge 2 strs due to imprecise alignment info

2024-04-10 Thread acoplan at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114674

Alex Coplan  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |acoplan at gcc dot 
gnu.org
 Status|NEW |ASSIGNED

--- Comment #4 from Alex Coplan  ---
Discussing offline with Richard S an alternative approach would be to use
replace_equiv_address[_nv] instead of adjust_address[_nv]; that way we preserve
most properties of the original mem and just take the address from the other.

In fact this is what aarch64_check_consecutive_mems already does so I think we
should follow that.

I'll try a patch along those lines for stage 1.

[Bug rtl-optimization/114674] [aarch64] ldp_fusion fails to merge 2 strs due to imprecise alignment info

2024-04-10 Thread acoplan at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114674

Alex Coplan  changed:

   What|Removed |Added

 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2024-04-10

--- Comment #3 from Alex Coplan  ---
Confirmed.

I think it might be best to take the maximum MEM_ALIGN between the adjusted mem
(new_mem) and the original mem (change_mem).  In this case it happens that the
original mem (change_mem) has a stronger alignment guarantee, but in general it
could be the case that the adjusted mem gives a better alignment guarantee. 
Since both locations are known to point to the same address, it seems best to
me to take the largest alignment of the two.

[Bug rtl-optimization/114674] [aarch64] ldp_fusion fails to merge 2 strs due to imprecise alignment info

2024-04-10 Thread acoplan at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114674

Alex Coplan  changed:

   What|Removed |Added

 CC||acoplan at gcc dot gnu.org
   Keywords||missed-optimization

--- Comment #2 from Alex Coplan  ---
Thanks for the report (and patch), I'll look into this.

[Bug rtl-optimization/114674] [aarch64] ldp_fusion fails to merge 2 strs due to imprecise alignment info

2024-04-10 Thread dizhao at os dot amperecomputing.com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114674

--- Comment #1 from Di Zhao  ---
Here's a quick fix I tried, that works on the small test case above:

diff --git a/gcc/config/aarch64/aarch64-ldp-fusion.cc
b/gcc/config/aarch64/aarch64-ldp-fusion.cc
index 365dcf48b22..43478ede72b 100644
--- a/gcc/config/aarch64/aarch64-ldp-fusion.cc
+++ b/gcc/config/aarch64/aarch64-ldp-fusion.cc
@@ -1735,6 +1735,9 @@ ldp_bb_info::fuse_pair (bool load_p,
   rtx new_mem = adjust_address_nv (effective_base,
   mode_for_mem,
   adjust_amt);
+  // Keep original alignment info for ldp/stp policy.
+  set_mem_align (new_mem, MEM_ALIGN (change_mem));
+
   rtx new_set = load_p
? gen_rtx_SET (change_reg, new_mem)
: gen_rtx_SET (new_mem, change_reg);


Is is OK?