[gcc r15-820] vect: Fix access size alignment assumption [PR115192]

2024-05-24 Thread Richard Sandiford via Gcc-cvs
https://gcc.gnu.org/g:a0fe4fb1c8d7804515845dd5d2a814b3c7a1ccba commit r15-820-ga0fe4fb1c8d7804515845dd5d2a814b3c7a1ccba Author: Richard Sandiford Date: Fri May 24 13:47:21 2024 +0100 vect: Fix access size alignment assumption [PR115192] create_intersect_range_checks checks

[PATCH] vect: Fix access size alignment assumption [PR115192]

2024-05-24 Thread Richard Sandiford
create_intersect_range_checks checks whether two access ranges a and b are alias-free using something equivalent to: end_a <= start_b || end_b <= start_a It has two ways of doing this: a "vanilla" way that calculates the exact exclusive end pointers, and another way that uses the last
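
The range test quoted in this message can be sketched in C++ as follows. This is only an illustration of the end_a <= start_b || end_b <= start_a condition with exclusive end pointers, not the code in create_intersect_range_checks; the AccessRange type and the alias_free and alias_free_self_check functions are hypothetical names introduced here.

```cpp
#include <cassert>
#include <cstdint>

// Half-open byte range [start, end) covered by one sequence of accesses.
// Hypothetical type, used only for this sketch.
struct AccessRange {
  std::uintptr_t start;
  std::uintptr_t end;  // exclusive end pointer
};

// True when the two ranges cannot overlap:
//   end_a <= start_b || end_b <= start_a
constexpr bool alias_free(AccessRange a, AccessRange b) {
  return a.end <= b.start || b.end <= a.start;
}

// Exercise the boundary cases: touching ranges are alias-free,
// ranges that share even one byte are not.
inline void alias_free_self_check() {
  assert(alias_free(AccessRange{0, 16}, AccessRange{16, 32}));
  assert(!alias_free(AccessRange{0, 17}, AccessRange{16, 32}));
}
```

Because the end pointers are exclusive, two ranges that merely touch count as alias-free.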

Re: [PATCH] aarch64: Fold vget_high_* intrinsics to BIT_FIELD_REF [PR102171]

2024-05-22 Thread Richard Sandiford
Pengxuan Zheng writes: > This patch is a follow-up of r15-697-ga2e4fe5a53cf75 to also fold vget_high_* > intrinsics to BIT_FIELD_REF and remove the vget_high_* definitions from > arm_neon.h to use the new intrinsics framework. > > PR target/102171 > > gcc/ChangeLog: > > *

Re: [PATCH v1 5/6] Adjust DLL import/export implementation for AArch64

2024-05-22 Thread Richard Sandiford
Evgeny Karpov writes: > The DLL import/export mingw implementation, originally from ix86, requires > minor adjustments to be compatible with AArch64. > > gcc/ChangeLog: > > * config/mingw/mingw32.h (defined): Use the correct DllMainCRTStartup > entry function. > *

Re: [PATCH v1 4/6] aarch64: Add selectany attribute handling

2024-05-22 Thread Richard Sandiford
Evgeny Karpov writes: > This patch extends the aarch64 attributes list with the selectany > attribute for the aarch64-w64-mingw32 target and reuses the mingw > implementation to handle it. > > * config/aarch64/aarch64.cc: > Extend the aarch64 attributes list. > *

Re: [PATCH v1 3/6] Rename functions for reuse in AArch64

2024-05-22 Thread Richard Sandiford
Evgeny Karpov writes: > This patch renames functions related to dllimport/dllexport > and selectany functionality. These functions will be reused > in the aarch64-w64-mingw32 target. > > gcc/ChangeLog: > > * config/i386/cygming.h (mingw_pe_record_stub): > Rename functions in mingw

Re: [PATCH 4/4] Testsuite updates

2024-05-22 Thread Richard Sandiford
Richard Biener writes: > On Tue, 21 May 2024, Richard Biener wrote: > >> The gcc.dg/vect/slp-12a.c case is interesting as we currently split >> the 8 store group into lanes 0-5 which we SLP with an unroll factor >> of two (on x86-64 with SSE) and the remaining two lanes are using >> interleaving

Re: [PATCH 3/4]AArch64: add new alternative with early clobber to patterns

2024-05-22 Thread Richard Sandiford
Tamar Christina writes: >> -Original Message- >> From: Richard Sandiford >> Sent: Wednesday, May 22, 2024 10:48 AM >> To: Tamar Christina >> Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw >> ; Marcus Shawcroft >> ; ktkac...@gcc.gnu.org &

Re: [PATCH v1 2/6] Extract ix86 dllimport implementation to mingw

2024-05-22 Thread Richard Sandiford
Evgeny Karpov writes: > This patch extracts the ix86 implementation for expanding a SYMBOL > into its corresponding dllimport, far-address, or refptr symbol. > It will be reused in the aarch64-w64-mingw32 target. > The implementation is copied as is from i386/i386.cc with > minor changes to

Re: [PATCH 3/4]AArch64: add new alternative with early clobber to patterns

2024-05-22 Thread Richard Sandiford
Tamar Christina writes: > Hi All, > > This patch adds new alternatives to the patterns which are affected. The new > alternatives with the conditional early clobbers are added before the normal > ones in order for LRA to prefer them in the event that we have enough free > registers to

Re: [PATCH] Fix mixed input kind permute optimization

2024-05-22 Thread Richard Sandiford
Richard Sandiford writes: > Richard Biener writes: >> When change_vec_perm_layout runs into a permute combining two >> nodes where one is invariant and one internal the partition of >> one input can be -1 but the other might not be. The following >> supports this cas

Re: [PATCH 3/4] Avoid splitting store dataref groups during SLP discovery

2024-05-21 Thread Richard Sandiford
Richard Biener writes: > The following avoids splitting store dataref groups during SLP > discovery but instead forces (eventually single-lane) consecutive > lane SLP discovery for all lanes of the group, creating VEC_PERM > SLP nodes merging them so the store will always cover the whole group. >

Re: [PATCH] Fix mixed input kind permute optimization

2024-05-21 Thread Richard Sandiford
Richard Biener writes: > When change_vec_perm_layout runs into a permute combining two > nodes where one is invariant and one internal the partition of > one input can be -1 but the other might not be. The following > supports this case by simply ignoring inputs with input partition -1. > > I'm

Re: [PATCH v3] aarch64: Fix normal returns inside functions which use eh_returns [PR114843]

2024-05-21 Thread Richard Sandiford
Wilco Dijkstra writes: > Hi Andrew, > > A few comments on the implementation, I think it can be simplified a lot: FWIW, I agree with Wilco's comments, except: >> +++ b/gcc/config/aarch64/aarch64.h >> @@ -700,8 +700,9 @@ constexpr auto AARCH64_FL_DEFAULT_ISA_MODE = >> AARCH64_FL_SM_OFF; >>

[gcc r15-752] Cache the set of EH_RETURN_DATA_REGNOs

2024-05-21 Thread Richard Sandiford via Gcc-cvs
https://gcc.gnu.org/g:7f35863ebbf7ba63e2f075edfbec105de272578a commit r15-752-g7f35863ebbf7ba63e2f075edfbec105de272578a Author: Richard Sandiford Date: Tue May 21 10:21:16 2024 +0100 Cache the set of EH_RETURN_DATA_REGNOs While reviewing Andrew's fix for PR114843, it seemed like

[PATCH] Cache the set of EH_RETURN_DATA_REGNOs

2024-05-21 Thread Richard Sandiford
While reviewing Andrew's fix for PR114843, it seemed like it would be convenient to have a HARD_REG_SET of EH_RETURN_DATA_REGNOs. This patch adds one and uses it to simplify a couple of use sites. Tested on aarch64-linux-gnu & x86_64-linux-gnu. OK to install? Richard gcc/ *

Re: [PATCH] middle-end: Expand {u|s}dot product support in autovectorizer

2024-05-20 Thread Richard Sandiford
Richard Biener writes: > On Fri, May 17, 2024 at 11:56 AM Tamar Christina > wrote: >> >> > -Original Message- >> > From: Richard Biener >> > Sent: Friday, May 17, 2024 10:46 AM >> > To: Tamar Christina >> > Cc: Victor Do Nasc

Re: [PATCH 00/12] aarch64: Extend aarch64_feature_flags to 128 bits

2024-05-20 Thread Richard Sandiford
Andrew Carlotti writes: > On Fri, May 17, 2024 at 04:45:05PM +0100, Richard Sandiford wrote: >> Andrew Carlotti writes: >> > The end goal of the series is to change the definition of >> > aarch64_feature_flags >> > from a uint64_t typedef to a class w

Re: [Patch, aarch64] Further renaming of generic code

2024-05-20 Thread Richard Sandiford
Ajit Agarwal writes: > Hello Alex/Richard: > > Renaming of generic code is done to make target independent > and target dependent code to support multiple targets. > > Target independent code is the Generic code with pure virtual function > to interface between target independent and dependent

Re: [PATCH] AArch64: Improve costing of ctz

2024-05-20 Thread Richard Sandiford
Wilco Dijkstra writes: > Improve costing of ctz - both TARGET_CSSC and vector cases were not handled > yet. > > Passes regress & bootstrap - OK for commit? > > gcc: > * config/aarch64/aarch64.cc (aarch64_rtx_costs): Improve CTZ costing. Ok, thanks. Richard > diff --git

Re: [PATCH] AArch64: Fix printing of 2-instruction alternatives

2024-05-20 Thread Richard Sandiford
Wilco Dijkstra writes: > Add missing '\' in 2-instruction movsi/di alternatives so that they are > printed on separate lines. > > Passes bootstrap and regress, OK for commit once stage 1 reopens? > > gcc: > * config/aarch64/aarch64.md (movsi_aarch64): Use '\;' to force > newline

Re: [PATCH] aarch64: Fold vget_low_* intrinsics to BIT_FIELD_REF [PR102171]

2024-05-20 Thread Richard Sandiford
Pengxuan Zheng writes: > This patch folds vget_low_* intrinsics to BIT_FIELD_REF to open up more > optimization opportunities for gimple optimizers. > > While we are here, we also remove the vget_low_* definitions from arm_neon.h > and > use the new intrinsics framework. > > PR

Re: [Patch, aarch64] v7: Preparatory patch to place target independent and dependent changed code in one file

2024-05-20 Thread Richard Sandiford
Ajit Agarwal writes: > Hello Alex/Richard: > > All comments are addressed. > > Common infrastructure of load store pair fusion is divided into target > independent and target dependent changed code. > > Target independent code is the Generic code with pure virtual function > to interface between

Re: [Patch, aarch64] v6: Preparatory patch to place target independent and, dependent changed code in one file

2024-05-17 Thread Richard Sandiford
Ajit Agarwal writes: > Hello Alex/Richard: > > All review comments are addressed. > > Common infrastructure of load store pair fusion is divided into target > independent and target dependent changed code. > > Target independent code is the Generic code with pure virtual function > to interface

Re: [PATCH 00/12] aarch64: Extend aarch64_feature_flags to 128 bits

2024-05-17 Thread Richard Sandiford
Andrew Carlotti writes: > The end goal of the series is to change the definition of > aarch64_feature_flags > from a uint64_t typedef to a class with 128 bits of storage. This class uses > operator overloading to mimic the existing integer interface as much as > possible, but with added
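
The idea described in this message, a class with 128 bits of storage that mimics the integer interface through operator overloading, can be sketched roughly as below. This is not the class from the patch series; FeatureFlags128, its bit helper and the self-check function are hypothetical names, and only a handful of operators are shown.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical 128-bit flag set built from two 64-bit halves.
class FeatureFlags128 {
 public:
  // Implicit construction from a 64-bit value keeps integer-style code working.
  constexpr FeatureFlags128(std::uint64_t lo = 0, std::uint64_t hi = 0)
      : lo_(lo), hi_(hi) {}

  // Flag set with only bit n set; n must be below 128.
  static constexpr FeatureFlags128 bit(unsigned n) {
    return n < 64 ? FeatureFlags128(std::uint64_t{1} << n, 0)
                  : FeatureFlags128(0, std::uint64_t{1} << (n - 64));
  }

  constexpr FeatureFlags128 operator|(FeatureFlags128 o) const {
    return FeatureFlags128(lo_ | o.lo_, hi_ | o.hi_);
  }
  constexpr FeatureFlags128 operator&(FeatureFlags128 o) const {
    return FeatureFlags128(lo_ & o.lo_, hi_ & o.hi_);
  }
  constexpr FeatureFlags128 operator~() const {
    return FeatureFlags128(~lo_, ~hi_);
  }
  constexpr bool operator==(FeatureFlags128 o) const {
    return lo_ == o.lo_ && hi_ == o.hi_;
  }
  constexpr bool operator!=(FeatureFlags128 o) const { return !(*this == o); }

  // Explicit so that a flag set is usable in conditions but never
  // silently narrowed to an integer.
  constexpr explicit operator bool() const { return (lo_ | hi_) != 0; }

 private:
  std::uint64_t lo_;
  std::uint64_t hi_;
};

// Bits above 63 behave like any other bit.
inline void feature_flags_self_check() {
  const FeatureFlags128 f = FeatureFlags128::bit(3) | FeatureFlags128::bit(100);
  assert((f & FeatureFlags128::bit(100)) == FeatureFlags128::bit(100));
  assert(!(f & FeatureFlags128::bit(64)));
}
```

Making the boolean conversion explicit is one possible design choice: tests such as "if (flags & mask)" still compile, while accidental truncation to a 64-bit integer does not.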

Re: [RFC] Merge strathegy for all-SLP vectorizer

2024-05-17 Thread Richard Sandiford via Gcc
Richard Biener via Gcc writes: > Hi, > > I'd like to discuss how to go forward with getting the vectorizer to > all-SLP for this stage1. While there is a personal branch with my > ongoing work (users/rguenth/vect-force-slp) branches haven't proved > themselves working well for collaboration.

Re: [PATCH] AArch64: Use LDP/STP for large struct types

2024-05-16 Thread Richard Sandiford
Richard Sandiford writes: > Wilco Dijkstra writes: >> Use LDP/STP for large struct types as they have useful immediate offsets and >> are typically faster. >> This removes differences between little and big endian and allows use of >> LDP/STP without UNSPEC. >>

Re: [PATCH] AArch64: Use LDP/STP for large struct types

2024-05-16 Thread Richard Sandiford
Wilco Dijkstra writes: > Use LDP/STP for large struct types as they have useful immediate offsets and > are typically faster. > This removes differences between little and big endian and allows use of > LDP/STP without UNSPEC. > > Passes regress and bootstrap, OK for commit? > > gcc: >

Re: [PATCH 0/4]AArch64: support conditional early clobbers on certain operations.

2024-05-15 Thread Richard Sandiford
Tamar Christina writes: >> >> On Wed, May 15, 2024 at 12:29 PM Tamar Christina >> >> wrote: >> >> > >> >> > Hi All, >> >> > >> >> > Some Neoverse Software Optimization Guides (SWoG) have a clause that >> >> > state >> >> > that for predicated operations that also produce a predicate it is >>

Re: [PATCH 0/4]AArch64: support conditional early clobbers on certain operations.

2024-05-15 Thread Richard Sandiford
Tamar Christina writes: >> -Original Message- >> From: Richard Biener >> Sent: Wednesday, May 15, 2024 12:20 PM >> To: Tamar Christina >> Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw >> ; Marcus Shawcroft >> ; ktkac...@gcc.gnu.org; Richar

Re: [PATCH 1/4]AArch64: convert several predicate patterns to new compact syntax

2024-05-15 Thread Richard Sandiford
Thanks for doing this as a pre-patch. Minor request below: Tamar Christina writes: > ;; Perform a logical operation on operands 2 and 3, using operand 1 as > @@ -6676,38 +6690,42 @@ (define_insn "@aarch64_pred__z" > (define_insn "*3_cc" >[(set (reg:CC_NZC CC_REGNUM) > (unspec:CC_NZC >

Re: [PATCH 2/4]AArch64: add new tuning param and attribute for enabling conditional early clobber

2024-05-15 Thread Richard Sandiford
Tamar Christina writes: > Hi All, > > This adds a new tuning parameter EARLY_CLOBBER_SVE_PRED_DEST for AArch64 to > allow us to conditionally enable the early clobber alternatives based on the > tuning models. > > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues. > > Ok for master?

Re: [PATCH] AArch64: Use UZP1 instead of INS

2024-05-15 Thread Richard Sandiford
Wilco Dijkstra writes: > Use UZP1 instead of INS when combining low and high halves of vectors. > UZP1 has 3 operands which improves register allocation, and is faster on > some microarchitectures. > > Passes regress & bootstrap, OK for commit? OK, thanks. We can add core-specific tuning later

[pushed] aarch64: Avoid using mismatched ZERO ZA sizes

2024-04-12 Thread Richard Sandiford
The svzero_mask_za intrinsic tried to use the shortest combination of .b, .h, .s and .d tiles, allowing mixtures of sizes where necessary. However, Iain S pointed out that LLVM instead requires the tiles to have the same suffix. GAS supports both versions, so this patch generates the

Re: [PATCH] docs: Update function multiversioning documentation

2024-04-12 Thread Richard Sandiford
Hi Andrew, Thanks for doing this. I think it improves the organisation of the FMV documentation and adds some details that were previously missing. I've made some suggestions below, but documentation is subjective and I realise that not everyone will agree with them. I've also added Sandra to

Re: [PATCH] aarch64: Add rcpc3 dependency on rcpc2 and rcpc

2024-04-12 Thread Richard Sandiford
Andrew Carlotti writes: > We don't yet have a separate feature flag for FEAT_LRCPC2 (and adding > one will require extending the feature bitmask). Instead, make the > FEAT_LRCPC patterns available when either armv8.4-a or +rcpc3 is > specified. On the other hand, we already have a +rcpc flag,

Re: [PATCH] aarch64: Enable +cssc for armv8.9-a

2024-04-12 Thread Richard Sandiford
Andrew Carlotti writes: > FEAT_CSSC is mandatory in the architecture from Armv8.9. > > gcc/ChangeLog: > > * config/aarch64/aarch64-arches.def: Add CSSC to V8_9A > dependencies. OK, thanks. Richard > > --- > > Bootstrapped and regression tested on aarch64. Ok for master? > > > diff

Re: [PATCH]middle-end: adjust loop upper bounds when peeling for gaps and early break [PR114403].

2024-04-12 Thread Richard Sandiford
Richard Biener writes: > On Fri, 12 Apr 2024, Tamar Christina wrote: > >> Hi All, >> >> This is a story all about how the peeling for gaps introduces a bug in the >> upper >> bounds. >> >> Before I go further, I'll first explain how I understand this to work for >> loops >> with a single

Re: [PATCH v2] aarch64: Preserve mem info on change of base for ldp/stp [PR114674]

2024-04-12 Thread Richard Sandiford
Alex Coplan writes: > This is a v2 because I accidentally sent a WIP version of the patch last > time round which used replace_equiv_address instead of > replace_equiv_address_nv; that caused some ICEs (pointed out by the > Linaro CI) since pair addressing modes aren't a subset of the addresses >

Re: [PATCH] aarch64: Fix _BitInt testcases

2024-04-11 Thread Richard Sandiford
"Andre Vieira (lists)" writes: > This patch fixes some testisms introduced by: > > commit 5aa3fec38cc6f52285168b161bab1a869d864b44 > Author: Andre Vieira > Date: Wed Apr 10 16:29:46 2024 +0100 > > aarch64: Add support for _BitInt > > The testcases were relying on an unnecessary

[gcc r14-9925] aarch64: Fix _BitInt testcases

2024-04-11 Thread Richard Sandiford via Gcc-cvs
https://gcc.gnu.org/g:b87ba79200f2a727aa5c523abcc5c03fa11fc007 commit r14-9925-gb87ba79200f2a727aa5c523abcc5c03fa11fc007 Author: Andre Vieira (lists) Date: Thu Apr 11 17:54:37 2024 +0100 aarch64: Fix _BitInt testcases This patch fixes some testisms introduced by: commit

Re: [PATCH v2 00/13] Add aarch64-w64-mingw32 target

2024-04-11 Thread Richard Sandiford
Evgeny Karpov writes: > Wednesday, April 10, 2024 8:40 PM > Richard Sandiford wrote: > >> Thanks for the updates and sorry again for the slow review. >> I've replied to some of the patches in the series but otherwise it looks >> good to >> me. >> >

Re: [PATCH] aarch64: Preserve mem info on change of base for ldp/stp [PR114674]

2024-04-11 Thread Richard Sandiford
Alex Coplan writes: > Hi, > > The ldp/stp fusion pass can change the base of an access so that the two > accesses end up using a common base register. So far we have been using > adjust_address_nv to do this, but this means that we don't preserve > other properties of the mem we're replacing.

Re: [PATCH 0/5] aarch64: FMV feature list fixes

2024-04-10 Thread Richard Sandiford
Andrew Carlotti writes: > On Wed, Apr 10, 2024 at 05:42:05PM +0100, Richard Sandiford wrote: >> Andrew Carlotti writes: >> > On Tue, Apr 09, 2024 at 04:43:16PM +0100, Richard Sandiford wrote: >> >> Andrew Carlotti writes: >> >> > The first three pa

Re: [PATCH v2 00/13] Add aarch64-w64-mingw32 target

2024-04-10 Thread Richard Sandiford
Evgeny Karpov writes: > Hello, > > v2 is ready for the review! > Based on the v1 review: > https://gcc.gnu.org/pipermail/gcc-patches/2024-February/thread.html#646203 > > Testing for the x86_64-w64-mingw32 target is in progress to avoid > regression due to refactoring. Thanks for the updates and

Re: [PATCH v2 10/13] Rename "x86 Windows Options" to "Cygwin and MinGW Options"

2024-04-10 Thread Richard Sandiford
Evgeny Karpov writes: > From: Zac Walker > Date: Fri, 1 Mar 2024 02:17:39 +0100 > Subject: [PATCH v2 10/13] Rename "x86 Windows Options" to "Cygwin and MinGW > Options" > > Rename "x86 Windows Options" to "Cygwin and MinGW Options". > It will be used also for AArch64. > > gcc/ChangeLog: > >

Re: [PATCH v2 08/13] aarch64: Add Cygwin and MinGW environments for AArch64

2024-04-10 Thread Richard Sandiford
Evgeny Karpov writes: > From: Zac Walker > Date: Fri, 1 Mar 2024 10:49:28 +0100 > Subject: [PATCH v2 08/13] aarch64: Add Cygwin and MinGW environments for > AArch64 > > Define Cygwin and MinGW environment such as types, SEH definitions, > shared libraries, etc. > > gcc/ChangeLog: > > *

Re: [PATCH v2 04/13] aarch64: Add aarch64-w64-mingw32 COFF

2024-04-10 Thread Richard Sandiford
Evgeny Karpov writes: > From: Zac Walker > Date: Fri, 1 Mar 2024 01:55:47 +0100 > Subject: [PATCH v2 04/13] aarch64: Add aarch64-w64-mingw32 COFF > > Define ASM specific for COFF format on AArch64. > > gcc/ChangeLog: > > * config.gcc: Add COFF format support definitions. > *

Re: [PATCH v2 03/13] aarch64: Mark x18 register as a fixed register for MS ABI

2024-04-10 Thread Richard Sandiford
Evgeny Karpov writes: > From: Zac Walker > Date: Fri, 1 Mar 2024 09:56:59 +0100 > Subject: [PATCH v2 03/13] aarch64: Mark x18 register as a fixed register for > MS ABI > > Define the MS ABI for aarch64-w64-mingw32. > Adjust FIXED_REGISTERS, CALL_REALLY_USED_REGISTERS and > STATIC_CHAIN_REGNUM

Re: [PATCH v2 02/13] aarch64: The aarch64-w64-mingw32 target implements

2024-04-10 Thread Richard Sandiford
Sorry for the slow reply. Evgeny Karpov writes: > From: Zac Walker > Date: Fri, 1 Mar 2024 01:45:13 +0100 > Subject: [PATCH v2 02/13] aarch64: The aarch64-w64-mingw32 target implements > the MS ABI > > Two ABIs for aarch64 have been defined for different platforms. > > gcc/ChangeLog: > >

Re: [PATCH 0/5] aarch64: FMV feature list fixes

2024-04-10 Thread Richard Sandiford
Andrew Carlotti writes: > On Tue, Apr 09, 2024 at 04:43:16PM +0100, Richard Sandiford wrote: >> Andrew Carlotti writes: >> > The first three patches are trivial changes to the feature list to reflect >> > recent changes in the ACLE. Patch 4 removes most of th

Re: [PATCHv3 2/2] aarch64: Add support for _BitInt

2024-04-10 Thread Richard Sandiford
I've not gone through the tests very thoroughly this time around, and just gone by the internal diff between this version and the previous one. But we can adjust them as necessary based on any reports that come in. Richard > > On 28/03/2024 15:21, Richard Sandiford wrote: >> Jakub

Re: [PATCHv2 1/2] aarch64: Do not give ABI change diagnostics for _BitInt(N)

2024-04-10 Thread Richard Sandiford
"Andre Vieira (lists)" writes: > @@ -6907,6 +6938,11 @@ aarch64_layout_arg (cumulative_args_t pcum_v, const > function_arg_info ) > && (!alignment || abi_break_gcc_9 < alignment) > && (!abi_break_gcc_13 || alignment < abi_break_gcc_13)); > > + /* _BitInt(N) was only

Re: [PATCH]AArch64: Do not allow SIMD clones with simdlen 1 [PR113552][GCC 13/12/11 backport]

2024-04-09 Thread Richard Sandiford
Tamar Christina writes: > Hi All, > > This is a backport of g:306713c953d509720dc394c43c0890548bb0ae07. > > The AArch64 vector PCS does not allow simd calls with simdlen 1, > however due to a bug we currently do allow it for num == 0. > > This causes us to emit a symbol that doesn't exist and we

Re: [PATCH 0/5] aarch64: FMV feature list fixes

2024-04-09 Thread Richard Sandiford
Andrew Carlotti writes: > The first three patches are trivial changes to the feature list to reflect > recent changes in the ACLE. Patch 4 removes most of the FMV multiversioning > features that don't work at the moment, and should be entirely > uncontroversial. > > Patch 5 handles the

Re: [PATCH 2/5] aarch64: Don't use FEAT_MAX as array length

2024-04-09 Thread Richard Sandiford
Andrew Carlotti writes: > There was an assumption in some places that the aarch64_fmv_feature_data > array contained FEAT_MAX elements. While this assumption held up till > now, it is safer and more flexible to use the array size directly. > > gcc/ChangeLog: > > * config/aarch64/aarch64.cc
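
The point made in this message, iterating over the array's own size rather than a separately maintained maximum, can be illustrated with a short sketch. Nothing here is taken from aarch64.cc: the FeatureEntry type, the feature_table contents, and the array_length and find_feature helpers are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical table entry; the real array holds different data.
struct FeatureEntry {
  const char *name;
  unsigned priority;
};

static const FeatureEntry feature_table[] = {
  {"alpha", 1},
  {"beta", 2},
  {"gamma", 3},
};

// Number of elements in a built-in array, deduced by the compiler.
template <typename T, std::size_t N>
constexpr std::size_t array_length(const T (&)[N]) {
  return N;
}

// Look up a feature by name. The loop bound follows the table itself,
// so adding or removing entries cannot desynchronise it from a constant.
inline int find_feature(const char *name) {
  for (std::size_t i = 0; i < array_length(feature_table); ++i)
    if (std::strcmp(feature_table[i].name, name) == 0)
      return static_cast<int>(i);
  return -1;
}

inline void find_feature_self_check() {
  assert(find_feature("beta") == 1);
  assert(find_feature("missing") == -1);
}
```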

Re: [PATCH v2] aarch64: Fix ACLE SME streaming mode error in neon-sve-bridge

2024-04-09 Thread Richard Sandiford
Richard Ball writes: > When using LTO, handling the pragma for sme before the pragma > for the neon-sve-bridge caused the following error on svset_neonq, > in the neon-sve-bridge.c test. > > error: ACLE function '0' can only be called when SME streaming mode is > enabled. > > This has been

[pushed] aarch64: Fix expansion of svsudot [PR114607]

2024-04-08 Thread Richard Sandiford
Not sure how this happened, but: svsudot is supposed to be expanded as USDOT with the operands swapped. However, a thinko in the expansion of svsudot meant that the arguments weren't in fact swapped; the attempted swap was just a no-op. And the testcases blithely accepted that. Tested on

[gcc r14-9836] aarch64: Fix expansion of svsudot [PR114607]

2024-04-08 Thread Richard Sandiford via Gcc-cvs
https://gcc.gnu.org/g:2c1c2485a4b1aca746ac693041e51ea6da5c64ca commit r14-9836-g2c1c2485a4b1aca746ac693041e51ea6da5c64ca Author: Richard Sandiford Date: Mon Apr 8 16:53:32 2024 +0100 aarch64: Fix expansion of svsudot [PR114607] Not sure how this happened, but: svsudot is supposed

Re: [PATCH][wwwdocs] Add NEON-SVE bridge intrinsics to changes.html

2024-04-08 Thread Richard Sandiford
Richard Ball writes: > Hi all, > > Adding the NEON-SVE bridge intrinsics that were missed > in the last patch. > > Thanks, > Richard OK, thanks. Richard > diff --git a/htdocs/gcc-14/changes.html b/htdocs/gcc-14/changes.html > index >

[gcc r14-9833] aarch64: Fix vld1/st1_x4 intrinsic test

2024-04-08 Thread Richard Sandiford via Gcc-cvs
https://gcc.gnu.org/g:278cad85077509b73b1faf32d36f3889c2a5524b commit r14-9833-g278cad85077509b73b1faf32d36f3889c2a5524b Author: Swinney, Jonathan Date: Mon Apr 8 14:02:33 2024 +0100 aarch64: Fix vld1/st1_x4 intrinsic test The test for this intrinsic was failing silently and so

Re: [PATCH] aarch64: Fix vld1/st1_x4 intrinsic test

2024-04-08 Thread Richard Sandiford
"Swinney, Jonathan" writes: > The test for this intrinsic was failing silently and so it failed to > report the bug reported in 114521. This patch modifes the test to > report the result. > > Bug report: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114521 > > Signed-off-by: Jonathan Swinney >

Re: [PATCH] rtl-optimization/101523 - avoid re-combine after noop 2->2 combination

2024-04-08 Thread Richard Sandiford
Segher Boessenkool writes: > Hi! > > On Wed, Apr 03, 2024 at 01:07:41PM +0200, Richard Biener wrote: >> The following avoids re-walking and re-combining the instructions >> between i2 and i3 when the pattern of i2 doesn't change. >> >> Bootstrap and regtest running ontop of a reversal of >>

Re: [pushed] aarch64: Fix bogus cnot optimisation [PR114603]

2024-04-08 Thread Richard Sandiford
Richard Biener writes: > On Fri, Apr 5, 2024 at 3:52 PM Richard Sandiford >> This isn't a regression on a known testcase. However, it's a nasty >> wrong code bug that could conceivably trigger for autovec code (although >> I've not been able to construct a reproducer so fa

[pushed] aarch64: Fix bogus cnot optimisation [PR114603]

2024-04-05 Thread Richard Sandiford
aarch64-sve.md had a pattern that combined: cmpeq pb.T, pa/z, zc.T, #0 mov zd.T, pb/z, #1 into: cnot zd.T, pa/m, zc.T But this is only valid if pa.T is a ptrue. In other cases, the original would set inactive elements of zd.T to 0, whereas the combined form
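
A scalar model may help show why the combination is only safe for an all-true predicate. The sketch below assumes the usual SVE meaning of the suffixes (/z zeroes inactive lanes, /m leaves them unchanged); the lane count, type aliases and function names are hypothetical and chosen only for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

constexpr std::size_t kLanes = 4;  // arbitrary lane count for the model
using Vec = std::array<int, kLanes>;
using Pred = std::array<bool, kLanes>;

// Models: cmpeq pb, pa/z, zc, #0  followed by  mov zd, pb/z, #1
// Lanes that are inactive in pa end up as 0.
inline Vec cmpeq_then_mov(const Pred &pa, const Vec &zc) {
  Vec zd{};
  for (std::size_t i = 0; i < kLanes; ++i) {
    const bool pb = pa[i] && zc[i] == 0;
    zd[i] = pb ? 1 : 0;
  }
  return zd;
}

// Models: cnot zd, pa/m, zc
// Lanes that are inactive in pa keep the previous contents of zd.
inline Vec cnot_merging(const Pred &pa, const Vec &zc, Vec zd) {
  for (std::size_t i = 0; i < kLanes; ++i)
    if (pa[i])
      zd[i] = (zc[i] == 0) ? 1 : 0;
  return zd;
}

// The two forms agree for an all-true predicate and can differ otherwise.
inline void cnot_self_check() {
  const Vec zc{0, 5, 0, 5};
  const Vec old_zd{7, 7, 7, 7};
  const Pred all{true, true, true, true};
  const Pred some{true, true, false, false};
  assert(cmpeq_then_mov(all, zc) == cnot_merging(all, zc, old_zd));
  assert(cmpeq_then_mov(some, zc) != cnot_merging(some, zc, old_zd));
}
```

With a partial predicate, the two-instruction form writes 0 to the inactive lanes while the merging form preserves whatever zd held before, which is the discrepancy the message describes.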

[gcc r14-9811] aarch64: Fix bogus cnot optimisation [PR114603]

2024-04-05 Thread Richard Sandiford via Gcc-cvs
https://gcc.gnu.org/g:67cbb1c638d6ab3a9cb77e674541e2b291fb67df commit r14-9811-g67cbb1c638d6ab3a9cb77e674541e2b291fb67df Author: Richard Sandiford Date: Fri Apr 5 14:47:15 2024 +0100 aarch64: Fix bogus cnot optimisation [PR114603] aarch64-sve.md had a pattern that combined

[pushed] aarch64: Recognise svundef idiom [PR114577]

2024-04-04 Thread Richard Sandiford
GCC 14 adds the header file arm_neon_sve_bridge.h to help interface SVE and Advanced SIMD code. One of the defined idioms is: svset_neonq (svundef_TYPE (), advsimd_vector) which simply reinterprets advsimd_vector as an SVE vector without regard for what's in the upper bits. GCC was failing

[gcc r14-9787] aarch64: Recognise svundef idiom [PR114577]

2024-04-04 Thread Richard Sandiford via Gcc-cvs
https://gcc.gnu.org/g:86dce005a1d440154dbf585dde5a2dd4cfac7a05 commit r14-9787-g86dce005a1d440154dbf585dde5a2dd4cfac7a05 Author: Richard Sandiford Date: Thu Apr 4 14:15:49 2024 +0100 aarch64: Recognise svundef idiom [PR114577] GCC 14 adds the header file arm_neon_sve_bridge.h

Re: [PATCH] libatomic: Cleanup macros in atomic_16.S

2024-04-04 Thread Richard Sandiford
Wilco Dijkstra writes: > As mentioned in > https://gcc.gnu.org/pipermail/gcc-patches/2024-March/648397.html , > do some additional cleanup of the macros and aliases: > > Cleanup the macros to add the libat_ prefixes in atomic_16.S. Emit the > alias to __atomic_ when ifuncs are not enabled in

Re: [PATCH] libatomic: Fix build for --disable-gnu-indirect-function [PR113986]

2024-04-04 Thread Richard Sandiford
Wilco Dijkstra writes: > v2: > > Fix libatomic build to support --disable-gnu-indirect-function on AArch64. > Always build atomic_16.S, add aliases to the __atomic_ functions if > !HAVE_IFUNC. > Include auto-config.h in atomic_16.S to avoid having to pass defines via > makefiles. > Fix build

Re: [PATCH V3 0/2] aarch64: Place target independent and dependent changed code in one file.

2024-04-03 Thread Richard Sandiford
Alex Coplan writes: > On 23/02/2024 16:41, Ajit Agarwal wrote: >> Hello Richard/Alex/Segher: > > Hi Ajit, > > Sorry for the delay and thanks for working on this. > > Generally this looks like the right sort of approach (IMO) but I've left > some comments below. > > I'll start with a meta comment:

Re: [PATCH v2 2/3] aarch64: Add support for aarch64-gnu (GNU/Hurd on AArch64)

2024-04-02 Thread Richard Sandiford
Sergey Bugaev writes: > Coupled with a corresponding binutils patch, this produces a toolchain that > can > successfully build working binaries targeting aarch64-gnu. > > gcc/Changelog: > > * config.gcc: Recognize aarch64*-*-gnu* targets. > * config/aarch64/aarch64-gnu.h: New file. >

Re: [PATCH] aarch64: Fix typo in comment about FEATURE_STRING

2024-04-02 Thread Richard Sandiford
Christophe Lyon writes: > Fix the comment to document FEATURE_STRING instead of FEAT_STRING. > > 2024-03-29 Christophe Lyon > > gcc/ > * config/aarch64/aarch64-option-extensions.def: Fix comment. OK, thanks. Richard > --- > gcc/config/aarch64/aarch64-option-extensions.def | 16

Re: [PATCH] libgcc: Add missing HWCAP entries to aarch64/cpuinfo.c

2024-04-02 Thread Richard Sandiford
Wilco Dijkstra writes: > A few HWCAP entries are missing from aarch64/cpuinfo.c. This results in > build errors > on older machines. > > This counts a trivial build fix, but since it's late in stage 4 I'll let > maintainers chip in. > OK for commit? > > libgcc/ > *

[oops pushed] aarch64: Fix vld1/st1_x4 intrinsic definitions

2024-03-28 Thread Richard Sandiford
Gah. As mentioned on irc, I'd written this patch to fix PR114521. The bug was fixed properly by Jonathan's struct rework in GCC 12, but that's much too invasive to backport. The attached patch therefore deals with the bug directly. Since it's new work, and since there's only one GCC 11 release

Re: [PATCHv2 2/2] aarch64: Add support for _BitInt

2024-03-28 Thread Richard Sandiford
Jakub Jelinek writes: > On Thu, Mar 28, 2024 at 03:00:46PM +0000, Richard Sandiford wrote: >> >* gcc.target/aarch64/bitint-alignments.c: New test. >> >* gcc.target/aarch64/bitint-args.c: New test. >> >* gcc.target/aarch64/bitint-sizes.c: New test. >>

Re: [PATCHv2 2/2] aarch64: Add support for _BitInt

2024-03-28 Thread Richard Sandiford
"Andre Vieira (lists)" writes: > This patch adds support for C23's _BitInt for the AArch64 port when > compiling for little endianness. Big Endianness requires further > target-agnostic support and we therefor disable it for now. > > The tests expose some suboptimal codegen for which I'll

Re: [PATCHv2 1/2] aarch64: Do not give ABI change diagnostics for _BitInt(N)

2024-03-28 Thread Richard Sandiford
"Andre Vieira (lists)" writes: > This patch makes sure we do not give ABI change diagnostics for the ABI > breaks of GCC 9, 13 and 14 for any type involving _BitInt(N), since that > type did not exist before this GCC version. > > ChangeLog: > > * config/aarch64/aarch64.cc

[gcc r11-11296] asan: Handle poly-int sizes in ASAN_MARK [PR97696]

2024-03-27 Thread Richard Sandiford via Gcc-cvs
https://gcc.gnu.org/g:d98467091bfc23522fefd32f1253e1c9e80331d3 commit r11-11296-gd98467091bfc23522fefd32f1253e1c9e80331d3 Author: Richard Sandiford Date: Wed Mar 27 19:26:57 2024 + asan: Handle poly-int sizes in ASAN_MARK [PR97696] This patch makes the expansion

[gcc r11-11295] aarch64: Fix vld1/st1_x4 intrinsic definitions

2024-03-27 Thread Richard Sandiford via Gcc-cvs
https://gcc.gnu.org/g:daee0409d195d346562e423da783d5d1cf8ea175 commit r11-11295-gdaee0409d195d346562e423da783d5d1cf8ea175 Author: Richard Sandiford Date: Wed Mar 27 19:26:56 2024 + aarch64: Fix vld1/st1_x4 intrinsic definitions The vld1_x4 and vst1_x4 patterns use XI

[gcc r12-10296] asan: Handle poly-int sizes in ASAN_MARK [PR97696]

2024-03-27 Thread Richard Sandiford via Gcc-cvs
https://gcc.gnu.org/g:51e1629bc11f0ae4b8050712b26521036ed360aa commit r12-10296-g51e1629bc11f0ae4b8050712b26521036ed360aa Author: Richard Sandiford Date: Wed Mar 27 17:38:09 2024 + asan: Handle poly-int sizes in ASAN_MARK [PR97696] This patch makes the expansion

[gcc r13-8501] asan: Handle poly-int sizes in ASAN_MARK [PR97696]

2024-03-27 Thread Richard Sandiford via Gcc-cvs
https://gcc.gnu.org/g:86b80b049167d28a9ef43aebdfbb80ae5deb0888 commit r13-8501-g86b80b049167d28a9ef43aebdfbb80ae5deb0888 Author: Richard Sandiford Date: Wed Mar 27 15:30:19 2024 + asan: Handle poly-int sizes in ASAN_MARK [PR97696] This patch makes the expansion

Re: [PATCH] libstdc++: add ARM SVE support to std::experimental::simd

2024-03-27 Thread Richard Sandiford
Matthias Kretz writes: > Hi Richard, > > sorry for not answering sooner. I took action on your mail but failed to also > give feedback. Now in light of your veto of Srinivas patch I wanted to use > the > opportunity to pick this up again. > > On Dienstag, 23. Januar 2

Re: [PATCH v2] libstdc++: add ARM SVE support to std::experimental::simd

2024-03-27 Thread Richard Sandiford
Matthias Kretz writes: > On Wednesday, 27 March 2024 11:07:14 CET Richard Sandiford wrote: >> I'm still worried about: >> >> #if _GLIBCXX_SIMD_HAVE_SVE >> constexpr inline int __sve_vectorized_size_bytes = __ARM_FEATURE_SVE_BITS >> / 8

Re: [PATCH v2] libstdc++: add ARM SVE support to std::experimental::simd

2024-03-27 Thread Richard Sandiford
Jonathan Wakely writes: > On Fri, 8 Mar 2024 at 09:58, Matthias Kretz wrote: >> >> Hi, >> >> I applied and did extended testing on x86_64 (no regressions) and aarch64 >> using qemu testing SVE 256, 512, and 1024. Looks good! >> >> While going through the applied patch I noticed a few style issues

Re: [pushed] aarch64: Define out-of-class static constants

2024-03-26 Thread Richard Sandiford
Vaseeharan Vinayagamoorthy writes: > Hi Richard, > > I think this patch is breaking the build of aarch64-none-elf and > aarch64-none-linux-gnu targets, when building with GCC 4.8. > This is not an issue when building with GCC 7.5. > > Kind regards, > Vasee Thanks. I pushed the attached patch

[gcc r14-9678] aarch64: Use constexpr for out-of-line statics

2024-03-26 Thread Richard Sandiford via Gcc-cvs
https://gcc.gnu.org/g:5be2313bceea7b482c17ee730efe604b910800bd commit r14-9678-g5be2313bceea7b482c17ee730efe604b910800bd Author: Richard Sandiford Date: Tue Mar 26 17:27:56 2024 + aarch64: Use constexpr for out-of-line statics GCC 4.8 complained about the use of const rather

Re: [PATCH] aarch64: Add +lse128 architectural extension command-line flag

2024-03-26 Thread Richard Sandiford
Victor Do Nascimento writes: > Given how, at present, the choice of using LSE128 atomic instructions > by the toolchain is delegated to run-time selection in the form of > Libatomic ifuncs, responsible for querying target support, the > `+lse128' target architecture compile-time flag is absent

Re: [PATCH] gomp: testsuite: improve compatibility of bad-array-section-3.c [PR113428]

2024-03-08 Thread Richard Sandiford
Richard Earnshaw writes: > This test generates different warnings on ilp32 targets because the size > of an integer matches the size of a pointer. Avoid this by using > signed char. > > gcc/testsuite: > > PR testsuite/113428 > * gcc.dg/gomp/bad-array-section-c-3.c: Use signed char

Re: [r14-9173 Regression] FAIL: gcc.dg/tree-ssa/andnot-2.c scan-tree-dump-not forwprop3 "_expr" on Linux/x86_64

2024-03-07 Thread Richard Sandiford
Sorry, still catching up on email, but: Richard Biener writes: > We have optimize_vectors_before_lowering_p but we shouldn't even there > turn supported into not supported ops and as said, what's supported or > not cannot be finally decided (if it's only vcond and not vcond_mask > that is

Re: [PATCH] libatomic: Fix build for --disable-gnu-indirect-function [PR113986]

2024-03-07 Thread Richard Sandiford
Wilco Dijkstra writes: > Fix libatomic build to support --disable-gnu-indirect-function on AArch64. > Always build atomic_16.S and add aliases to the __atomic_* functions if > !HAVE_IFUNC. This description is too brief for me. Could you say in detail how the new scheme works? E.g. the

Re: [PATCH 2/2] aarch64: Add support for _BitInt

2024-03-07 Thread Richard Sandiford
"Andre Vieira (lists)" writes: > Hey, > > Dropped the first patch and dealt with the comments above, hopefully I > didn't miss any this time. > > -- > > This patch adds support for C23's _BitInt for the AArch64 port when > compiling > for little endianness. Big

Re: [PATCH] AArch64: memcpy/memset expansions should not emit LDP/STP [PR113618]

2024-03-07 Thread Richard Sandiford
Wilco Dijkstra writes: > Hi Richard, > >> It looks like this is really doing two things at once: disabling the >> direct emission of LDP/STP Qs, and switching the GPR handling from using >> pairs of DImode moves to single TImode moves.  At least, that seems to be >> the effect of... > > No it

Re: [PATCH] aarch64: Fix costing of manual bfi instructions

2024-03-07 Thread Richard Sandiford
Andrew Pinski writes: > This fixes the cost model for BFI instructions which don't > use directly zero_extract on the LHS. > aarch64_bfi_rtx_p does the heavy lifting by matching of > the patterns. > > Note this alone does not fix PR 107270, it is a step in the right > direction. There we get z

Re: [PATCH 2/2] aarch64: Support `{1.0f, 1.0f, 0.0, 0.0}` CST forming with fmov with a smaller vector type.

2024-03-07 Thread Richard Sandiford
Andrew Pinski writes: > This enables construction of V4SF CST like `{1.0f, 1.0f, 0.0f, 0.0f}` > (and other fp enabled CSTs) by using `fmov v0.2s, 1.0` as the instruction > is designed to zero out the other bits. > This is a small extension on top of the code that creates fmov for the case > where

Re: [PATCH 1/2] aarch64: Use fmov s/d/hN, FP_CST for some vector CST [PR113856]

2024-03-07 Thread Richard Sandiford
Richard Sandiford writes: > Andrew Pinski writes: >> Aarch64 has a way to form some floating point CSTs via the fmov instructions, >> these instructions also zero out the upper parts of the registers so they can >> be used for vector CSTs that have one non-zer

Re: [PATCH 1/2] aarch64: Use fmov s/d/hN, FP_CST for some vector CST [PR113856]

2024-03-07 Thread Richard Sandiford
Andrew Pinski writes: > Aarch64 has a way to form some floating point CSTs via the fmov instructions, > these instructions also zero out the upper parts of the registers so they can > be used for vector CSTs that have one non-zero constant that would be > able > to be formed via the fmov in

[pushed] aarch64: Define out-of-class static constants

2024-03-06 Thread Richard Sandiford
While reworking the aarch64 feature descriptions, I forgot to add out-of-class definitions of some static constants. This could lead to a build failure with some compilers. This was seen with some WIP to increase the number of extensions beyond 64. It's latent on trunk though, and a regression
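For readers unfamiliar with the language rule behind this kind of failure: before C++17, a static data member initialized inside the class is only declared there, and it needs a separate namespace-scope definition if it is odr-used (for example, bound to a const reference). The sketch below illustrates that rule only. The names `feature_flags`, `SIMD`, `SVE`, `larger_flag` and `self_check` are hypothetical and are not taken from the aarch64 sources or from the patch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a class holding feature bits; not the GCC code.
struct feature_flags
{
  // In-class declarations with initializers. Before C++17 these are
  // declarations only; from C++17 on they are implicitly inline definitions.
  static constexpr std::uint64_t SIMD = 1u << 0;
  static constexpr std::uint64_t SVE = 1u << 1;
};

// Out-of-class definitions. Needed before C++17 whenever the members are
// odr-used; redundant (and deprecated) but still accepted from C++17 on.
constexpr std::uint64_t feature_flags::SIMD;
constexpr std::uint64_t feature_flags::SVE;

// std::max takes its arguments by const reference, which odr-uses both
// members. Without the definitions above, a pre-C++17 build may fail to link.
inline std::uint64_t larger_flag ()
{
  return std::max (feature_flags::SIMD, feature_flags::SVE);
}

// Small runtime check that the odr-use above resolves to the expected value.
inline void self_check ()
{
  assert (larger_flag () == feature_flags::SVE);
}
```

Whether a missing definition actually breaks the build depends on the compiler, language mode and optimization level, since the odr-use is often folded away, which fits the report that the failure appears only "with some compilers".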

[gcc r14-9333] aarch64: Define out-of-class static constants

2024-03-06 Thread Richard Sandiford via Gcc-cvs
https://gcc.gnu.org/g:c7a9883663a888617b6e3584233aa756b30519f8 commit r14-9333-gc7a9883663a888617b6e3584233aa756b30519f8 Author: Richard Sandiford Date: Wed Mar 6 10:04:56 2024 + aarch64: Define out-of-class static constants While reworking the aarch64 feature descriptions, I

Re: [PATCHv2] fwprop: Avoid volatile defines to be propagated

2024-03-05 Thread Richard Sandiford
HAO CHEN GUI writes: > Hi, > This patch tries to fix a potential problem which is raised by the patch > for PR111267. The volatile asm operand tries to be propagated to a single > set insn with the patch for PR111267. The volatile asm operand might be > executed for multiple times if the define
