Re: [Patch, aarch64, middle-end] v3: Move pair_fusion pass from aarch64 to middle-end

2024-05-22 Thread Alex Coplan
Hi Ajit, You need to remove the header dependencies that are no longer required for aarch64-ldp-fusion.o in t-aarch64 (not forgetting to update the ChangeLog). A few other minor nits below. LGTM with those changes, but you'll need Richard S to approve. Thanks a lot for doing this. On

Re: [Patch, aarch64, middle-end] v2: Move pair_fusion pass from aarch64 to middle-end

2024-05-21 Thread Alex Coplan
Hi Ajit, I've left some more comments below. It's getting there now, thanks for your patience. On 21/05/2024 20:32, Ajit Agarwal wrote: > Hello Alex/Richard: > > All comments are addressed. > > Move pair fusion pass from aarch64-ldp-fusion.cc to middle-end > to support multiple targets. > >

Re: [Patch, aarch64, middle-end] Move pair_fusion pass from aarch64 to middle-end

2024-05-21 Thread Alex Coplan
On 20/05/2024 21:50, Ajit Agarwal wrote: > Hello Alex/Richard: > > Move pair fusion pass from aarch64-ldp-fusion.cc to middle-end > to support multiple targets. > > Common infrastructure of load store pair fusion is divided into > target independent and target dependent code. > > Target

Re: [Patch, aarch64, middle-end] Move pair_fusion pass from aarch64 to middle-end

2024-05-21 Thread Alex Coplan
On 21/05/2024 16:02, Ajit Agarwal wrote: > Hello Alex: > > On 21/05/24 1:16 am, Alex Coplan wrote: > > On 20/05/2024 18:44, Alex Coplan wrote: > >> Hi Ajit, > >> > >> On 20/05/2024 21:50, Ajit Agarwal wrote: > >>> Hello Alex/Richard: >

Re: [Patch, aarch64, middle-end] Move pair_fusion pass from aarch64 to middle-end

2024-05-20 Thread Alex Coplan
On 20/05/2024 18:44, Alex Coplan wrote: > Hi Ajit, > > On 20/05/2024 21:50, Ajit Agarwal wrote: > > Hello Alex/Richard: > > > > Move pair fusion pass from aarch64-ldp-fusion.cc to middle-end > > to support multiple targets. > > > > Common infras

Re: [Patch, aarch64] v6: Preparatory patch to place target independent and,dependent changed code in one file

2024-05-17 Thread Alex Coplan
Hi Ajit, On 17/05/2024 18:05, Ajit Agarwal wrote: > Hello Alex: > > On 16/05/24 10:21 pm, Alex Coplan wrote: > > Hi Ajit, > > > > Thanks a lot for working through the review feedback. > > > > Thanks a lot for reviewing the code and approving the patch.

Re: [Patch, aarch64] v6: Preparatory patch to place target independent and,dependent changed code in one file

2024-05-16 Thread Alex Coplan
Hi Ajit, Thanks a lot for working through the review feedback. The patch LGTM with the two minor suggested changes below. I can't approve the patch, though, so you'll need an OK from Richard S. Also, I'm not sure if it makes sense to apply the patch in isolation, it might make more sense to

Re: [Patch, aarch64] v4: Preparatory patch to place target independent and,dependent changed code in one file

2024-05-14 Thread Alex Coplan
Hi Ajit, Please can you pay careful attention to the review comments? In particular, you have ignored my comment about changing the access of member functions in ldp_bb_info several times now (on at least three patch reviews). Likewise on multiple occasions you've only partially implemented a

Re: [Patch, aarch64] v3: Preparatory patch to place target independent and,dependent changed code in one file

2024-05-13 Thread Alex Coplan
Hi Ajit, Why did you send three mails for this revision of the patch? If you're going to send a new revision of the patch you should increment the version number and outline the changes / reasons for the new revision. Mostly the comments below are just style nits and things you missed from the

Re: [PATCH, aarch64] v2: Preparatory patch to place target independent and,dependent changed code in one file

2024-05-08 Thread Alex Coplan
Hi Ajit, Sorry for the long delay in reviewing this. This is really getting there now. I've left a few more comments below. Apart from minor style things, the main remaining issues are mostly around comments. It's important to have good clear comments for functions with the parameters (and

Re: [PATCH v2] aarch64: Preserve mem info on change of base for ldp/stp [PR114674]

2024-05-07 Thread Alex Coplan
On 12/04/2024 12:13, Richard Sandiford wrote: > Alex Coplan writes: > > This is a v2 because I accidentally sent a WIP version of the patch last > > time round which used replace_equiv_address instead of > > replace_equiv_address_nv; that caused some ICEs (pointed out by the

[PATCH] aarch64: Fix typo in aarch64-ldp-fusion.cc:combine_reg_notes [PR114936]

2024-05-03 Thread Alex Coplan
This fixes a typo in combine_reg_notes in the load/store pair fusion pass. As it stands, the calls to filter_notes store any REG_FRAME_RELATED_EXPR to fr_expr with the following association: - i2 -> fr_expr[0] - i1 -> fr_expr[1] but then the checks inside the following if statement expect the

Re: [PATCH V4 1/3] aarch64: Place target independent and dependent changed code in one file

2024-05-03 Thread Alex Coplan
On 22/04/2024 13:01, Ajit Agarwal wrote: > Hello Alex: > > On 14/04/24 10:29 pm, Ajit Agarwal wrote: > > Hello Alex: > > > > On 12/04/24 11:02 pm, Ajit Agarwal wrote: > >> Hello Alex: > >> > >> On 12/04/24 8:15 pm, Alex Coplan wrote: >

cfgrtl: Fix MEM_EXPR update in duplicate_insn_chain [PR114924]

2024-05-02 Thread Alex Coplan
Hi, The PR shows that when cfgrtl.cc:duplicate_insn_chain attempts to update the MR_DEPENDENCE_CLIQUE information for a MEM_EXPR we can end up accidentally dropping (e.g.) an ARRAY_REF from the MEM_EXPR and end up replacing it with the underlying MEM_REF. This leads to an inconsistency in the

Re: [PATCH] wwwdocs: Add note to changes.html for __has_{feature,extension}

2024-04-26 Thread Alex Coplan
On 26/04/2024 09:14, Marek Polacek wrote: > On Fri, Apr 26, 2024 at 11:12:54AM +0100, Alex Coplan wrote: > > On 17/04/2024 11:41, Marek Polacek wrote: > > > On Mon, Apr 15, 2024 at 11:13:27AM +0100, Alex Coplan wrote: > > > > On 04/04/2024 11:00, A

Re: [PATCH] wwwdocs: Add note to changes.html for __has_{feature,extension}

2024-04-26 Thread Alex Coplan
On 17/04/2024 11:41, Marek Polacek wrote: > On Mon, Apr 15, 2024 at 11:13:27AM +0100, Alex Coplan wrote: > > On 04/04/2024 11:00, Alex Coplan wrote: > > > Hi, > > > > > > This adds a note to the GCC 14 release notes mentioning support for > > > __has_{

Re: [PATCH] wwwdocs: Add note to changes.html for __has_{feature,extension}

2024-04-15 Thread Alex Coplan
On 04/04/2024 11:00, Alex Coplan wrote: > Hi, > > This adds a note to the GCC 14 release notes mentioning support for > __has_{feature,extension} (PR60512). > > OK to commit? Ping. Is this changes.html patch OK? I guess it needs a review from C++ maintainers since it adds

Re: [PATCH V4 1/3] aarch64: Place target independent and dependent changed code in one file

2024-04-12 Thread Alex Coplan
On 12/04/2024 20:02, Ajit Agarwal wrote: > Hello Alex: > > On 11/04/24 7:55 pm, Alex Coplan wrote: > > On 10/04/2024 23:48, Ajit Agarwal wrote: > >> Hello Alex: > >> > >> On 10/04/24 7:52 pm, Alex Coplan wrote: > >>> Hi Ajit, > >>>

[PATCH v2] aarch64: Preserve mem info on change of base for ldp/stp [PR114674]

2024-04-12 Thread Alex Coplan
This is a v2 because I accidentally sent a WIP version of the patch last time round which used replace_equiv_address instead of replace_equiv_address_nv; that caused some ICEs (pointed out by the Linaro CI) since pair addressing modes aren't a subset of the addresses that are accepted by

Re: [PATCH V4 1/3] aarch64: Place target independent and dependent changed code in one file

2024-04-11 Thread Alex Coplan
On 10/04/2024 23:48, Ajit Agarwal wrote: > Hello Alex: > > On 10/04/24 7:52 pm, Alex Coplan wrote: > > Hi Ajit, > > > > On 10/04/2024 15:31, Ajit Agarwal wrote: > >> Hello Alex: > >> > >> On 10/04/24 1:42 pm, Alex Coplan wrote: > >>

[PATCH] aarch64: Preserve mem info on change of base for ldp/stp [PR114674]

2024-04-11 Thread Alex Coplan
Hi, The ldp/stp fusion pass can change the base of an access so that the two accesses end up using a common base register. So far we have been using adjust_address_nv to do this, but this means that we don't preserve other properties of the mem we're replacing. It seems better to use

Re: [PATCH V4 1/3] aarch64: Place target independent and dependent changed code in one file

2024-04-10 Thread Alex Coplan
Hi Ajit, On 10/04/2024 15:31, Ajit Agarwal wrote: > Hello Alex: > > On 10/04/24 1:42 pm, Alex Coplan wrote: > > Hi Ajit, > > > > On 09/04/2024 20:59, Ajit Agarwal wrote: > >> Hello Alex: > >> > >> On 09/04/24 8:39 pm, Alex Copla

Re: [PATCH V4 1/3] aarch64: Place target independent and dependent changed code in one file

2024-04-10 Thread Alex Coplan
Hi Ajit, On 09/04/2024 20:59, Ajit Agarwal wrote: > Hello Alex: > > On 09/04/24 8:39 pm, Alex Coplan wrote: > > On 09/04/2024 20:01, Ajit Agarwal wrote: > >> Hello Alex: > >> > >> On 09/04/24 7:29 pm, Alex Coplan wrote: > >>> On 09/04/2024 17

Re: [PATCH V4 1/3] aarch64: Place target independent and dependent changed code in one file

2024-04-09 Thread Alex Coplan
On 09/04/2024 20:01, Ajit Agarwal wrote: > Hello Alex: > > On 09/04/24 7:29 pm, Alex Coplan wrote: > > On 09/04/2024 17:30, Ajit Agarwal wrote: > >> > >> > >> On 05/04/24 10:03 pm, Alex Coplan wrote: > >>> On 05/04/2024 13:53, Ajit Agarw

Re: [PATCH V4 1/3] aarch64: Place target independent and dependent changed code in one file

2024-04-09 Thread Alex Coplan
On 09/04/2024 17:30, Ajit Agarwal wrote: > > > On 05/04/24 10:03 pm, Alex Coplan wrote: > > On 05/04/2024 13:53, Ajit Agarwal wrote: > >> Hello Alex/Richard: > >> > >> All review comments are incorporated. > > > > Thanks, I was

[PATCH][committed] aarch64: Fix whitespace in aarch64-ldp-fusion.cc:alias_walker

2024-04-05 Thread Alex Coplan
I spotted this whitespace error during the review of https://gcc.gnu.org/pipermail/gcc-patches/2024-April/648846.html. Pushing as obvious after testing on aarch64-linux-gnu. Thanks, Alex gcc/ChangeLog: * config/aarch64/aarch64-ldp-fusion.cc (struct alias_walker): Fix double

Re: [PATCH V4 1/3] aarch64: Place target independent and dependent changed code in one file

2024-04-05 Thread Alex Coplan
On 05/04/2024 13:53, Ajit Agarwal wrote: > Hello Alex/Richard: > > All review comments are incorporated. Thanks, I was kind-of expecting you to also send the renaming patch as a preparatory patch as we discussed. Sorry for another meta comment, but: I think the reason that the Linaro CI isn't

[PATCH] wwwdocs: Add note to changes.html for __has_{feature,extension}

2024-04-04 Thread Alex Coplan
Hi, This adds a note to the GCC 14 release notes mentioning support for __has_{feature,extension} (PR60512). OK to commit? Thanks, Alex diff --git a/htdocs/gcc-14/changes.html b/htdocs/gcc-14/changes.html index 9fd224c1..facead8d 100644 --- a/htdocs/gcc-14/changes.html +++

Re: [PATCH V3 0/2] aarch64: Place target independent and dependent changed code in one file.

2024-04-03 Thread Alex Coplan
On 23/02/2024 16:41, Ajit Agarwal wrote: > Hello Richard/Alex/Segher: Hi Ajit, Sorry for the delay and thanks for working on this. Generally this looks like the right sort of approach (IMO) but I've left some comments below. I'll start with a meta comment: in the subject line you have marked

Re: [PATCH 0/1 V2] Target independent code for common infrastructure of load,store fusion for rs6000 and aarch64 target.

2024-02-15 Thread Alex Coplan
On 15/02/2024 22:38, Ajit Agarwal wrote: > Hello Alex: > > On 15/02/24 10:12 pm, Alex Coplan wrote: > > On 15/02/2024 21:24, Ajit Agarwal wrote: > >> Hello Richard: > >> > >> As per your suggestion I have divided the patch into target independent >

Re: [PATCH 0/1 V2] Target independent code for common infrastructure of load,store fusion for rs6000 and aarch64 target.

2024-02-15 Thread Alex Coplan
On 15/02/2024 21:24, Ajit Agarwal wrote: > Hello Richard: > > As per your suggestion I have divided the patch into target independent > and target dependent for aarch64 target. I kept aarch64-ldp-fusion same > and did not change that. I'm not sure this was what Richard suggested doing, though.

Re: [PATCH][GCC 12] aarch64: Avoid out-of-range shrink-wrapped saves [PR111677]

2024-02-15 Thread Alex Coplan
On 14/02/2024 11:18, Richard Sandiford wrote: > Alex Coplan writes: > > This is a backport of the GCC 13 fix for PR111677 to the GCC 12 branch. > > The only part of the patch that isn't a straight cherry-pick is due to > > the TX iterator lacking TDmode for GCC 12, so

[PATCH][GCC 12] aarch64: Avoid out-of-range shrink-wrapped saves [PR111677]

2024-02-12 Thread Alex Coplan
This is a backport of the GCC 13 fix for PR111677 to the GCC 12 branch. The only part of the patch that isn't a straight cherry-pick is due to the TX iterator lacking TDmode for GCC 12, so this version adjusts TX_V16QI accordingly. Bootstrapped/regtested on aarch64-linux-gnu, the only changes in

Re: [PATCH][PUSHED] hwasan: support new dg-output format.

2024-02-09 Thread Alex Coplan
Hi, On 04/05/2022 09:59, Martin Liška wrote: > Supports change in libsanitizer where it newly reports: > READ of size 4 at 0xc3d4 tags: 02/01(00) (ptr/mem) in thread T0 > > So the 'tags' contains now 3 entries compared to 2 entries. > > gcc/testsuite/ChangeLog: > > *

Re: [PATCH] c++: Don't advertise cxx_constexpr_string_builtins [PR113658]

2024-02-02 Thread Alex Coplan
On 02/02/2024 09:34, Marek Polacek wrote: > On Fri, Feb 02, 2024 at 10:27:23AM +0000, Alex Coplan wrote: > > Bootstrapped/regtested on x86_64-apple-darwin, OK for trunk? > > > > Thanks, > > Alex > > > > -- >8 -- > > > > When __has_fea

[PATCH] c++: Don't advertise cxx_constexpr_string_builtins [PR113658]

2024-02-02 Thread Alex Coplan
Bootstrapped/regtested on x86_64-apple-darwin, OK for trunk? Thanks, Alex -- >8 -- When __has_feature was introduced for GCC 14, I included the feature cxx_constexpr_string_builtins, since of the relevant string builtins that GCC implements, it seems to support constexpr evaluation of those

Re: [PATCH v2] c++: avoid -Wdangling-reference for std::span-like classes [PR110358]

2024-02-01 Thread Alex Coplan
On 31/01/2024 15:53, Marek Polacek wrote: > On Wed, Jan 31, 2024 at 07:44:41PM +0000, Alex Coplan wrote: > > Hi Marek, > > > > On 30/01/2024 13:15, Marek Polacek wrote: > > > On Thu, Jan 25, 2024 at 10:13:10PM -0500, Jason Merrill wrote: > > >

Re: [PATCH v2] c++: avoid -Wdangling-reference for std::span-like classes [PR110358]

2024-01-31 Thread Alex Coplan
Hi Marek, On 30/01/2024 13:15, Marek Polacek wrote: > On Thu, Jan 25, 2024 at 10:13:10PM -0500, Jason Merrill wrote: > > On 1/25/24 20:36, Marek Polacek wrote: > > > Better version: > > > > > > Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk? > > > > > > -- >8 -- > > > Real-world

[PATCH][GCC 13] aarch64: Avoid out-of-range shrink-wrapped saves [PR111677]

2024-01-30 Thread Alex Coplan
Bootstrapped/regtested on aarch64-linux-gnu, OK for the 13 branch after a week of the trunk fix being in? OK for the other active branches if the same changes test cleanly there? GCC 14 patch for reference: https://gcc.gnu.org/pipermail/gcc-patches/2024-January/61.html Thanks, Alex -- >8

[PATCH] aarch64: Avoid out-of-range shrink-wrapped saves [PR111677]

2024-01-30 Thread Alex Coplan
Hi, The PR shows us ICEing due to an unrecognizable TFmode save emitted by aarch64_process_components. The problem is that for T{I,F,D}mode we conservatively require mems to be in range for x-register ldp/stp. That is because (at least for TImode) it can be allocated to both GPRs and FPRs, and

[PATCH] aarch64: Ensure iterator validity when updating debug uses [PR113616]

2024-01-29 Thread Alex Coplan
Hi, The fix for PR113089 introduced range-based for loops over the debug_insn_uses of an RTL-SSA set_info, but in the case that we reset a debug insn, the use would get removed from the use list, and thus we would end up using an invalidated iterator in the next iteration of the loop. In
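
The hazard described above (removing an element from a list while a range-based for loop is walking it) can be sketched generically as follows. This is only an illustration under stated assumptions: std::list<int> and the helper name remove_odd are hypothetical stand-ins, not the RTL-SSA use list or the change made by the patch, whose details are cut off in the snippet. The sketch saves the successor iterator before erasing, so the loop never advances an invalidated iterator.

```cpp
#include <iterator>
#include <list>

// Hypothetical stand-in for a use list; this is NOT the RTL-SSA data structure.
// Removes odd values and returns how many were removed.
int remove_odd(std::list<int> &uses)
{
  int removed = 0;
  for (auto it = uses.begin(); it != uses.end(); )
    {
      // Grab the successor first: erasing *it invalidates it, but not next.
      auto next = std::next(it);
      if (*it % 2 != 0)
        {
          uses.erase(it);
          ++removed;
        }
      it = next;
    }
  return removed;
}
```

A range-based for loop over the same list would advance the erased iterator on the following iteration, which is undefined behaviour.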

Re: [PATCH] aarch64: Fix undefinedness while testing the J constraint [PR100204]

2024-01-26 Thread Alex Coplan
On 25/01/2024 11:57, Andrew Pinski wrote: > The J constraint can invoke undefined behavior due to it taking the > negative of the ival if ival was HWI_MIN. The fix is simple as casting > to `unsigned HOST_WIDE_INT` before doing the negative of it. This > does that. Thanks for doing this. > >
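
The cast-before-negate idea quoted above can be sketched in isolation. This is a minimal illustration, assuming std::int64_t and std::uint64_t as stand-ins for HOST_WIDE_INT and unsigned HOST_WIDE_INT; the helper name negate_as_unsigned is hypothetical and is not taken from the patch.

```cpp
#include <cstdint>

// Hypothetical helper, not GCC code. Negating the most negative signed value
// overflows, which is undefined behaviour. Converting to the unsigned type
// first makes the negation well-defined modular arithmetic.
std::uint64_t negate_as_unsigned(std::int64_t ival)
{
  return std::uint64_t{0} - static_cast<std::uint64_t>(ival);
}
```

For the minimum value the result is 2^63, which the signed type cannot represent; for other negative inputs it is the ordinary magnitude.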

Re: [PATCH V2] rs6000: New pass for replacement of adjacent loads fusion (lxv).

2024-01-24 Thread Alex Coplan
Hi Ajit, On 21/01/2024 19:57, Ajit Agarwal wrote: > > Hello All: > > New pass to replace adjacent memory addresses lxv with lxvp. > Added common infrastructure for load store fusion for > different targets. Thanks for this, it would be nice to see the load/store pair pass generalized to

Re: [PATCH] aarch64: Re-enable ldp/stp fusion pass

2024-01-24 Thread Alex Coplan
On 24/01/2024 09:15, Kyrylo Tkachov wrote: > Hi Alex, > > > -Original Message- > > From: Alex Coplan > > Sent: Wednesday, January 24, 2024 8:34 AM > > To: gcc-patches@gcc.gnu.org > > Cc: Richard Earnshaw ; Richard Sandiford > > ; Kyrylo Tkacho

[PATCH] aarch64: Re-enable ldp/stp fusion pass

2024-01-24 Thread Alex Coplan
Hi, Since, to the best of my knowledge, all reported regressions related to the ldp/stp fusion pass have now been fixed, and PGO+LTO bootstrap with --enable-languages=all is working again with the passes enabled, this patch turns the passes back on by default, as agreed with Jakub here:

Re: [PATCH 4/4] aarch64: Fix up uses of mem following stp insert [PR113070]

2024-01-23 Thread Alex Coplan
On 22/01/2024 21:50, Alex Coplan wrote: > On 22/01/2024 15:59, Richard Sandiford wrote: > > Alex Coplan writes: > > > As the PR shows (specifically #c7) we are missing updating uses of mem > > > when inserting an stp in the aarch64 load/store pair fusion pa

Re: [PATCH 3/3] aarch64: Fix up debug uses in ldp/stp pass [PR113089]

2024-01-22 Thread Alex Coplan
On 22/01/2024 17:09, Richard Sandiford wrote: > Sorry for the earlier review comment about debug insns. I hadn't > looked far enough into the queue to see this patch. > > Alex Coplan writes: > > As the PR shows, we were missing code to update debug uses in the > > loa

Re: [PATCH 4/4] aarch64: Fix up uses of mem following stp insert [PR113070]

2024-01-22 Thread Alex Coplan
On 22/01/2024 15:59, Richard Sandiford wrote: > Alex Coplan writes: > > As the PR shows (specifically #c7) we are missing updating uses of mem > > when inserting an stp in the aarch64 load/store pair fusion pass. This > > patch fixes that. > > > > RT

Re: [PATCH 3/4] rtl-ssa: Ensure new defs get inserted [PR113070]

2024-01-22 Thread Alex Coplan
On 22/01/2024 13:49, Richard Sandiford wrote: > Alex Coplan writes: > > In r14-5820-ga49befbd2c783e751dc2110b544fe540eb7e33eb I added support to > > RTL-SSA for inserting new insns, which included support for users > > creating new defs. > > > > However, I missed

Re: [PATCH 2/4] rtl-ssa: Support for creating new uses [PR113070]

2024-01-22 Thread Alex Coplan
On 22/01/2024 13:45, Richard Sandiford wrote: > Alex Coplan writes: > > This exposes an interface for users to create new uses in RTL-SSA. > > This is needed for updating uses after inserting a new store pair insn > > in the aarch64 load/store pair fusion pass.

[PATCH] aarch64: Don't assert recog success in ldp/stp pass [PR113114]

2024-01-19 Thread Alex Coplan
Hi, The PR shows two different cases where try_promote_writeback produces an RTL pattern which isn't recognized. Currently this leads to an ICE, as we assert recog success, but I think it's better just to back out of the changes gracefully if recog fails (as we do in the main fuse_pair case).

[PATCH 3/3] aarch64: Fix up debug uses in ldp/stp pass [PR113089]

2024-01-19 Thread Alex Coplan
As the PR shows, we were missing code to update debug uses in the load/store pair fusion pass. This patch fixes that. Note that this patch depends on the following patch to create new uses in RTL-SSA, submitted as part of the fixes for PR113070:

[PATCH 2/3] aarch64: Re-parent trailing nondebug base reg uses [PR113089]

2024-01-19 Thread Alex Coplan
While working on PR113089, I realised we were missing code to re-parent trailing nondebug uses of the base register in the case of cancelling writeback in the load/store pair pass. This patch fixes that. Bootstrapped/regtested as a series on aarch64-linux-gnu (with/without the pass enabled), OK

[PATCH 1/3] rtl-ssa: Provide easier access to debug uses [PR113089]

2024-01-19 Thread Alex Coplan
This patch adds some accessors to set_info and use_info to make it easier to get at and iterate through uses in debug insns. It is used by the aarch64 load/store pair fusion pass in a subsequent patch to fix PR113089, i.e. to update debug uses in the pass. Bootstrapped/regtested as a series on

Re: [PATCH 1/4] rtl-ssa: Run finalize_new_accesses forwards [PR113070]

2024-01-17 Thread Alex Coplan
On 17/01/2024 07:42, Jeff Law wrote: > > > On 1/13/24 08:43, Alex Coplan wrote: > > The next patch in this series exposes an interface for creating new uses > > in RTL-SSA. The intent is that new user-created uses can consume new > > user-created defs in the sam

Re: [PATCH] aarch64: Fix aarch64_ldp_reg_operand predicate not to allow all subreg [PR113221]

2024-01-17 Thread Alex Coplan
Hi Andrew, On 16/01/2024 19:29, Andrew Pinski wrote: > So the problem here is that aarch64_ldp_reg_operand will allow all subreg even > subreg of lo_sum. > When LRA tries to fix that up, all things break. So the fix is to change the > check to only > allow reg and subreg of regs. Thanks a lot for

Re: [PATCH 4/4] aarch64: Fix up uses of mem following stp insert [PR113070]

2024-01-15 Thread Alex Coplan
On 13/01/2024 15:46, Alex Coplan wrote: > As the PR shows (specifically #c7) we are missing updating uses of mem > when inserting an stp in the aarch64 load/store pair fusion pass. This > patch fixes that. > > RTL-SSA has a simple view of memory and by default doesn't allow st

[PATCH] aarch64: Don't record hazards against paired insns [PR113356]

2024-01-15 Thread Alex Coplan
Hi, For the testcase in the PR, we try to pair insns where the first has writeback and the second uses the updated base register. This causes us to record a hazard against the second insn, thus narrowing the move range away from the end of the BB. However, it isn't meaningful to record hazards

[PATCH 4/4] aarch64: Fix up uses of mem following stp insert [PR113070]

2024-01-13 Thread Alex Coplan
As the PR shows (specifically #c7) we are missing updating uses of mem when inserting an stp in the aarch64 load/store pair fusion pass. This patch fixes that. RTL-SSA has a simple view of memory and by default doesn't allow stores to be re-ordered w.r.t. other stores. In the ldp fusion pass,

[PATCH 3/4] rtl-ssa: Ensure new defs get inserted [PR113070]

2024-01-13 Thread Alex Coplan
In r14-5820-ga49befbd2c783e751dc2110b544fe540eb7e33eb I added support to RTL-SSA for inserting new insns, which included support for users creating new defs. However, I missed that apply_changes_to_insn needed updating to ensure that the new defs actually got inserted into the main def chain.

[PATCH 2/4] rtl-ssa: Support for creating new uses [PR113070]

2024-01-13 Thread Alex Coplan
This exposes an interface for users to create new uses in RTL-SSA. This is needed for updating uses after inserting a new store pair insn in the aarch64 load/store pair fusion pass. gcc/ChangeLog: PR target/113070 * rtl-ssa/accesses.cc (function_info::create_use): New. *

[PATCH 1/4] rtl-ssa: Run finalize_new_accesses forwards [PR113070]

2024-01-13 Thread Alex Coplan
The next patch in this series exposes an interface for creating new uses in RTL-SSA. The intent is that new user-created uses can consume new user-created defs in the same change group. This is so that we can correctly update uses of memory when inserting a new store pair insn in the aarch64

[PATCH 0/4] aarch64, rtl-ssa: Fix wrong code in ldp fusion pass [PR113070]

2024-01-13 Thread Alex Coplan
wrong as it ends up incorrectly skipping over the stp insn when analysing subsequent load pair candidates. Bootstrapped/regtested as a series with/without the passes enabled on aarch64-linux-gnu (1/4 also tested independently and no regressions). OK for trunk? Thanks, Alex Alex Coplan (4): rtl-ssa

[PATCH v3] aarch64: Fix dwarf2cfi ICEs due to recent CFI note changes [PR113077]

2024-01-10 Thread Alex Coplan
This is a v3 which addresses shortcomings of the v2 patch. v2 was posted here: https://gcc.gnu.org/pipermail/gcc-patches/2024-January/642448.html The main issue in v2 is that we were using the final (transformed) patterns in combine_reg_notes instead of the initial patterns (thanks Richard S for

[PATCH] aarch64: Make ldp/stp pass off by default

2024-01-10 Thread Alex Coplan
As discussed on IRC, this makes the aarch64 ldp/stp pass off by default. This should stabilize the trunk and give some time to address the P1 regressions. Sorry for the breakage. Bootstrapped/regtested on aarch64-linux-gnu, OK for trunk? Alex gcc/ChangeLog: *

[PATCH v2] aarch64: Fix dwarf2cfi ICEs due to recent CFI note changes [PR113077]

2024-01-10 Thread Alex Coplan
This is a v2 which addresses feedback from v1, posted here: https://gcc.gnu.org/pipermail/gcc-patches/2024-January/642313.html Bootstrapped/regtested on aarch64-linux-gnu, OK for trunk? Thanks, Alex -- >8 -- In r14-6604-gd7ee988c491cde43d04fe25f2b3dbad9d85ded45 we changed the CFI notes

[PATCH] aarch64: Fix dwarf2cfi ICEs due to recent CFI note changes [PR113077]

2024-01-09 Thread Alex Coplan
Hi, In r14-6604-gd7ee988c491cde43d04fe25f2b3dbad9d85ded45 we changed the CFI notes attached to callee saves (in aarch64_save_callee_saves). That patch changed the ldp/stp representation to use unspecs instead of PARALLEL moves. This meant that we needed to attach CFI notes to all frame-related

[PATCH] aarch64: Further fix for throwing insns in ldp/stp pass [PR113217]

2024-01-05 Thread Alex Coplan
As the PR shows, the fix in r14-6916-g057dc349021660c40699fb5c98fd9cac8e168653 was not complete. That fix was enough to stop us trying to move throwing accesses above nondebug insns, but due to this code in try_fuse_pair: // Placement strategy: push loads down and pull stores up, this should

[PATCH] aarch64: Prevent moving throwing accesses in ldp/stp pass [PR113093]

2023-12-20 Thread Alex Coplan
As the PR shows, there was nothing to prevent the ldp/stp pass from trying to move throwing insns, which led to an RTL verification failure. This patch fixes that. Bootstrapped/regtested on aarch64-linux-gnu, OK for trunk? Thanks, Alex gcc/ChangeLog: PR target/113093 *

[PATCH v2] aarch64: Validate register operands early in ldp fusion pass [PR113062]

2023-12-20 Thread Alex Coplan
This is a v2 addressing Richard's feedback, v1 was posted here: https://gcc.gnu.org/pipermail/gcc-patches/2023-December/640957.html Bootstrapped/regtested on aarch64-linux-gnu, OK for trunk? Thanks, Alex -- >8 -- We were missing validation of the candidate register operands in the ldp/stp

Re: [PATCH] aarch64: Validate register operands early in ldp fusion pass [PR113062]

2023-12-19 Thread Alex Coplan
On 19/12/2023 13:38, Richard Sandiford wrote: > Alex Coplan writes: > > On 19/12/2023 10:15, Richard Sandiford wrote: > >> Alex Coplan writes: > >> > We were missing validation of the candidate register operands in the > >> > ldp/stp pass. I was rel

Re: [PATCH] aarch64: Validate register operands early in ldp fusion pass [PR113062]

2023-12-19 Thread Alex Coplan
On 19/12/2023 10:15, Richard Sandiford wrote: > Alex Coplan writes: > > We were missing validation of the candidate register operands in the > > ldp/stp pass. I was relying on recog rejecting such cases when we > > formed the final pair insn, but the testcase shows t

[PATCH] aarch64: Validate register operands early in ldp fusion pass [PR113062]

2023-12-19 Thread Alex Coplan
We were missing validation of the candidate register operands in the ldp/stp pass. I was relying on recog rejecting such cases when we formed the final pair insn, but the testcase shows that with -fharden-conditionals we attempt to combine two insns with asm_operands, both containing mem rtxes.

[PATCH] aarch64: Fix parens in aarch64_stp_reg_operand [PR113061]

2023-12-18 Thread Alex Coplan
In r14-6603-gfcdd2757c76bf925115b8e1ba4318d6366dd6f09 I messed up the parentheses in aarch64_stp_reg_operand, the indentation shows the intended nesting of the conditions. This patch fixes that. This fixes PR113061 which shows IRA substituting (const_int 1) into a writeback stp pattern as a

Re: [PATCH v4 10/11] aarch64: Add new load/store pair fusion pass

2023-12-15 Thread Alex Coplan
On 15/12/2023 15:34, Richard Sandiford wrote: > Alex Coplan writes: > > This is a v6 of the aarch64 load/store pair fusion pass, which > > addresses the feedback from Richard's last review here: > > > > https://gcc.gnu.org/pipermail/gcc-patches/2023-December/640539

[PATCH v4 10/11] aarch64: Add new load/store pair fusion pass

2023-12-15 Thread Alex Coplan
This is a v6 of the aarch64 load/store pair fusion pass, which addresses the feedback from Richard's last review here: https://gcc.gnu.org/pipermail/gcc-patches/2023-December/640539.html In particular this version implements the suggested changes which greatly simplify the double list walk.

[PATCH] doc: Document AArch64-specific asm operand modifiers

2023-12-14 Thread Alex Coplan
Hi, As it stands, GCC doesn't document any public AArch64-specific operand modifiers for use in inline asm. This patch fixes that by documenting an initial set of public AArch64-specific operand modifiers. Tested with make html and checking the output looks OK in a browser. OK for trunk?

[PATCH 2/2] aarch64: Handle autoinc addresses in ld1rq splitter [PR112906]

2023-12-13 Thread Alex Coplan
This patch uses the new force_reload_address routine added by the previous patch to fix PR112906. Bootstrapped/regtested on aarch64-linux-gnu, OK for trunk? Thanks, Alex gcc/ChangeLog: PR target/112906 * config/aarch64/aarch64-sve.md (@aarch64_vec_duplicate_vq_le): Use

[PATCH 1/2] emit-rtl, lra: Move lra's emit_inc to emit-rtl.cc

2023-12-13 Thread Alex Coplan
Hi, In PR112906 we ICE because we try to use force_reg to reload an auto-increment address, but force_reg can't do this. With the aim of fixing the PR by supporting reloading arbitrary addresses in pre-RA splitters, this patch generalizes lra-constraints.cc:emit_inc and makes it available to the

Re: [PATCH v2 09/11] aarch64: Rewrite non-writeback ldp/stp patterns

2023-12-13 Thread Alex Coplan
On 12/12/2023 15:58, Richard Sandiford wrote: > Alex Coplan writes: > > Hi, > > > > This is a v2 version which addresses feedback from Richard's review > > here: > > > > https://gcc.gnu.org/pipermail/gcc-patches/2023-November/637648.html > > >

[PATCH v3 10/11] aarch64: Add new load/store pair fusion pass

2023-12-07 Thread Alex Coplan
Hi, This is a v5 of the aarch64 load/store pair fusion pass, rebased on top of the SME changes. v4 is here: https://gcc.gnu.org/pipermail/gcc-patches/2023-December/639404.html There are no changes to the pass itself since v4, this is just a rebase. Bootstrapped/regtested as a series on

[PATCH v3 09/11] aarch64: Rewrite non-writeback ldp/stp patterns

2023-12-07 Thread Alex Coplan
Hi, This is a v3, rebased on top of the SME changes. v2 is here: https://gcc.gnu.org/pipermail/gcc-patches/2023-December/639361.html Bootstrapped/regtested as a series on aarch64-linux-gnu, OK for trunk? Thanks, Alex -- >8 -- This patch overhauls the load/store pair patterns with two main

[PATCH v3 08/11] aarch64: Generalize writeback ldp/stp patterns

2023-12-07 Thread Alex Coplan
Hi, This is a v3 patch which is rebased on top of the SME changes. Otherwise it is the same as v2, posted here: https://gcc.gnu.org/pipermail/gcc-patches/2023-December/639367.html Bootstrapped/regtested as a series on aarch64-linux-gnu, OK for trunk? Thanks, Alex -- >8 -- Thus far the

[PATCH v2 10/11] aarch64: Add new load/store pair fusion pass.

2023-12-05 Thread Alex Coplan
Hi, This is a v4 of the aarch64 load/store pair fusion pass. This addresses feedback from the review of v3 here: https://gcc.gnu.org/pipermail/gcc-patches/2023-November/637756.html I've attached the incremental change in reply to the review above. Bootstrapped/regtested as a series on

[PATCH v2 08/11] aarch64: Generalize writeback ldp/stp patterns

2023-12-05 Thread Alex Coplan
Hi, This is a v2 patch which implements the requested changes from the previous review here: https://gcc.gnu.org/pipermail/gcc-patches/2023-November/637642.html the patch was pre-approved with those changes, but this patch additionally makes use of the new aarch64_const_zero_rtx_p predicate in

Re: [PATCH 09/11] aarch64: Rewrite non-writeback ldp/stp patterns

2023-12-05 Thread Alex Coplan
Thanks for the review, I've posted a v2 here which addresses this feedback: https://gcc.gnu.org/pipermail/gcc-patches/2023-December/639361.html On 21/11/2023 16:04, Richard Sandiford wrote: > Alex Coplan writes: > > This patch overhauls the load/store pair patterns with two main goals:

[PATCH v2 09/11] aarch64: Rewrite non-writeback ldp/stp patterns

2023-12-05 Thread Alex Coplan
Hi, This is a v2 version which addresses feedback from Richard's review here: https://gcc.gnu.org/pipermail/gcc-patches/2023-November/637648.html I'll reply inline to address specific comments. Bootstrapped/regtested on aarch64-linux-gnu, OK for trunk? Thanks, Alex -- >8 -- This patch

[PATCH v2 06/11] aarch64: Fix up aarch64_print_operand xzr/wzr case

2023-12-05 Thread Alex Coplan
Hi, This is a v2 of: https://gcc.gnu.org/pipermail/gcc-patches/2023-November/637612.html v1 was approved as-is, but this version pulls out the test into a helper function which is used by later patches in the series. Bootstrapped/regtested as a series on aarch64-linux-gnu, OK for trunk?

Re: [PATCH v5] c-family: Implement __has_feature and __has_extension [PR60512]

2023-11-28 Thread Alex Coplan
On 28/11/2023 17:03, Thomas Schwinge wrote: > Hi! > > On 2023-11-17T14:50:45+0000, Alex Coplan wrote: > > --- a/gcc/cp/cp-objcp-common.cc > > +++ b/gcc/cp/cp-objcp-common.cc > > > +/* Table of features for __has_{feature,extension}. */ > > + &

Re: [PATCH] c++: Fix up __has_extension (cxx_init_captures)

2023-11-28 Thread Alex Coplan
On 28/11/2023 09:22, Jakub Jelinek wrote: > On Mon, Nov 27, 2023 at 10:58:04AM +0000, Alex Coplan wrote: > > Many thanks both for the reviews, this is now pushed (with Jason's > > above changes implemented) as g:06280a906cb3dc80cf5e07cf3335b758848d488d. > > The new

Re: [PATCH v5] c-family: Implement __has_feature and __has_extension [PR60512]

2023-11-27 Thread Alex Coplan
On 23/11/2023 12:41, Marek Polacek wrote: > On Mon, Nov 20, 2023 at 05:29:58PM -0500, Jason Merrill wrote: > > On 11/17/23 09:50, Alex Coplan wrote: > > > Hi, > > > > > > This is a v5 patch to address Marek's feedback here: > > > https://gcc.gnu.org/pi

Re: [PATCH 02/11] rtl-ssa: Add some helpers for removing accesses

2023-11-23 Thread Alex Coplan
On 21/11/2023 16:49, Richard Sandiford wrote: > Richard Sandiford writes: > > Alex Coplan writes: > >> This adds some helpers to access-utils.h for removing accesses from an > >> access_array. This is needed by the upcoming aarch64 load/store pair > >>

[PATCH v2 1/11] rtl-ssa: Support for inserting new insns

2023-11-23 Thread Alex Coplan
Hi, This is a v2, original patch is here: https://gcc.gnu.org/pipermail/gcc-patches/2023-November/637606.html This addresses review feedback and: - Fixes a bug in the previous version in function_info::finalize_new_accesses; we should now correctly handle the case where properties.refs ()

Re: [PATCH 01/11] rtl-ssa: Support for inserting new insns

2023-11-22 Thread Alex Coplan
On 21/11/2023 11:51, Richard Sandiford wrote: > Alex Coplan writes: > > N.B. this is just a rebased (but otherwise unchanged) version of the > > same patch already posted here: > > > > https://gcc.gnu.org/pipermail/gcc-patches/2023-October/633348.html > >

Re: [PATCH v4] c-family: Implement __has_feature and __has_extension [PR60512]

2023-11-17 Thread Alex Coplan
On 03/11/2023 12:19, Marek Polacek wrote: > On Wed, Sep 27, 2023 at 03:27:30PM +0100, Alex Coplan wrote: > > Hi, > > > > This is a v4 patch to address Jason's feedback here: > > https://gcc.gnu.org/pipermail/gcc-patches/2023-September/630911.html > > > >

[PATCH v5] c-family: Implement __has_feature and __has_extension [PR60512]

2023-11-17 Thread Alex Coplan
Hi, This is a v5 patch to address Marek's feedback here: https://gcc.gnu.org/pipermail/gcc-patches/2023-November/635157.html I also implemented Jason's suggestion to use constexpr for the tables from this review: https://gcc.gnu.org/pipermail/gcc-patches/2023-October/634484.html I'll attach the

[PATCH 11/11] aarch64: Use individual loads/stores for mem{cpy,set} expansion

2023-11-16 Thread Alex Coplan
This patch adjusts the mem{cpy,set} expansion in the aarch64 backend to use individual loads/stores instead of ldp/stp at expand time. The idea is to rely on the ldp fusion pass to fuse the accesses together later in the RTL pipeline. The earlier parts of the RTL pipeline should be able to do a

[PATCH 10/11] aarch64: Add new load/store pair fusion pass.

2023-11-16 Thread Alex Coplan
This is a v3 of the aarch64 load/store pair fusion pass. v2 was posted here: - https://gcc.gnu.org/pipermail/gcc-patches/2023-October/633601.html The main changes since v2 are as follows: We now handle writeback opportunities as well. E.g. for this testcase: void foo (long *p, long *q, long

[PATCH 09/11] aarch64: Rewrite non-writeback ldp/stp patterns

2023-11-16 Thread Alex Coplan
This patch overhauls the load/store pair patterns with two main goals: 1. Fixing a correctness issue (the current patterns are not RA-friendly). 2. Allowing more flexibility in which operand modes are supported, and which combinations of modes are allowed in the two arms of the load/store

[PATCH 08/11] aarch64: Generalize writeback ldp/stp patterns

2023-11-16 Thread Alex Coplan
Thus far the writeback forms of ldp/stp have been exclusively used in prologue and epilogue code for saving/restoring of registers to/from the stack. As such, forms of ldp/stp that weren't needed for prologue/epilogue code weren't supported by the aarch64 backend. This patch generalizes the
