https://gcc.gnu.org/g:a0fe4fb1c8d7804515845dd5d2a814b3c7a1ccba
commit r15-820-ga0fe4fb1c8d7804515845dd5d2a814b3c7a1ccba
Author: Richard Sandiford
Date: Fri May 24 13:47:21 2024 +0100
vect: Fix access size alignment assumption [PR115192]
create_intersect_range_checks checks whether two access ranges
a and b are alias-free using something equivalent to:
end_a <= start_b || end_b <= start_a
It has two ways of doing this: a "vanilla" way that calculates
the exact exclusive end pointers, and another way that uses the
last
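The disjointness condition quoted above can be sketched in C++ as follows. This is only an illustration of the "vanilla" form that uses exact exclusive end pointers; the helper name `ranges_alias_free` and its parameters are hypothetical and are not taken from GCC's create_intersect_range_checks.

```cpp
#include <cstdint>

// Hypothetical helper, not GCC code: each access covers the half-open
// byte range [start, start + len).
static bool ranges_alias_free(std::uintptr_t start_a, std::uintptr_t len_a,
                              std::uintptr_t start_b, std::uintptr_t len_b)
{
  // Exact exclusive end pointers of both ranges.
  std::uintptr_t end_a = start_a + len_a;
  std::uintptr_t end_b = start_b + len_b;
  // The ranges are disjoint when one ends before the other begins.
  return end_a <= start_b || end_b <= start_a;
}
```

Two ranges that merely touch, such as [0, 16) and [16, 32), count as alias-free under this test because the end pointers are exclusive.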
Pengxuan Zheng writes:
> This patch is a follow-up of r15-697-ga2e4fe5a53cf75 to also fold vget_high_*
> intrinsics to BIT_FIELD_REF and remove the vget_high_* definitions from
> arm_neon.h to use the new intrinsics framework.
>
> PR target/102171
>
> gcc/ChangeLog:
>
> *
Evgeny Karpov writes:
> The DLL import/export mingw implementation, originally from ix86, requires
> minor adjustments to be compatible with AArch64.
>
> gcc/ChangeLog:
>
> * config/mingw/mingw32.h (defined): Use the correct DllMainCRTStartup
> entry function.
> *
Evgeny Karpov writes:
> This patch extends the aarch64 attributes list with the selectany
> attribute for the aarch64-w64-mingw32 target and reuses the mingw
> implementation to handle it.
>
> * config/aarch64/aarch64.cc:
> Extend the aarch64 attributes list.
> *
Evgeny Karpov writes:
> This patch renames functions related to dllimport/dllexport
> and selectany functionality. These functions will be reused
> in the aarch64-w64-mingw32 target.
>
> gcc/ChangeLog:
>
> * config/i386/cygming.h (mingw_pe_record_stub):
> Rename functions in mingw
Richard Biener writes:
> On Tue, 21 May 2024, Richard Biener wrote:
>
>> The gcc.dg/vect/slp-12a.c case is interesting as we currently split
>> the 8 store group into lanes 0-5 which we SLP with an unroll factor
>> of two (on x86-64 with SSE) and the remaining two lanes are using
>> interleaving
Tamar Christina writes:
>> -Original Message-
>> From: Richard Sandiford
>> Sent: Wednesday, May 22, 2024 10:48 AM
>> To: Tamar Christina
>> Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw
>> ; Marcus Shawcroft
>> ; ktkac...@gcc.gnu.org
Evgeny Karpov writes:
> This patch extracts the ix86 implementation for expanding a SYMBOL
> into its corresponding dllimport, far-address, or refptr symbol.
> It will be reused in the aarch64-w64-mingw32 target.
> The implementation is copied as is from i386/i386.cc with
> minor changes to
Tamar Christina writes:
> Hi All,
>
> This patch adds new alternatives to the patterns which are affected. The new
> alternatives with the conditional early clobbers are added before the normal
> ones in order for LRA to prefer them in the event that we have enough free
> registers to
Richard Sandiford writes:
> Richard Biener writes:
>> When change_vec_perm_layout runs into a permute combining two
>> nodes where one is invariant and one internal the partition of
>> one input can be -1 but the other might not be. The following
>> supports this cas
Richard Biener writes:
> The following avoids splitting store dataref groups during SLP
> discovery but instead forces (eventually single-lane) consecutive
> lane SLP discovery for all lanes of the group, creating VEC_PERM
> SLP nodes merging them so the store will always cover the whole group.
>
Richard Biener writes:
> When change_vec_perm_layout runs into a permute combining two
> nodes where one is invariant and one internal the partition of
> one input can be -1 but the other might not be. The following
> supports this case by simply ignoring inputs with input partition -1.
>
> I'm
Wilco Dijkstra writes:
> Hi Andrew,
>
> A few comments on the implementation, I think it can be simplified a lot:
FWIW, I agree with Wilco's comments, except:
>> +++ b/gcc/config/aarch64/aarch64.h
>> @@ -700,8 +700,9 @@ constexpr auto AARCH64_FL_DEFAULT_ISA_MODE =
>> AARCH64_FL_SM_OFF;
>>
https://gcc.gnu.org/g:7f35863ebbf7ba63e2f075edfbec105de272578a
commit r15-752-g7f35863ebbf7ba63e2f075edfbec105de272578a
Author: Richard Sandiford
Date: Tue May 21 10:21:16 2024 +0100
Cache the set of EH_RETURN_DATA_REGNOs
While reviewing Andrew's fix for PR114843, it seemed like it would
be convenient to have a HARD_REG_SET of EH_RETURN_DATA_REGNOs.
This patch adds one and uses it to simplify a couple of use sites.
Tested on aarch64-linux-gnu & x86_64-linux-gnu. OK to install?
Richard
gcc/
*
Richard Biener writes:
> On Fri, May 17, 2024 at 11:56 AM Tamar Christina
> wrote:
>>
>> > -Original Message-
>> > From: Richard Biener
>> > Sent: Friday, May 17, 2024 10:46 AM
>> > To: Tamar Christina
>> > Cc: Victor Do Nasc
Andrew Carlotti writes:
> On Fri, May 17, 2024 at 04:45:05PM +0100, Richard Sandiford wrote:
>> Andrew Carlotti writes:
>> > The end goal of the series is to change the definition of
>> > aarch64_feature_flags
>> > from a uint64_t typedef to a class w
Ajit Agarwal writes:
> Hello Alex/Richard:
>
> Renaming of generic code is done to make target independent
> and target dependent code to support multiple targets.
>
> Target independent code is the Generic code with pure virtual function
> to interface between target independent and dependent
Wilco Dijkstra writes:
> Improve costing of ctz - both TARGET_CSSC and vector cases were not handled
> yet.
>
> Passes regress & bootstrap - OK for commit?
>
> gcc:
> * config/aarch64/aarch64.cc (aarch64_rtx_costs): Improve CTZ costing.
Ok, thanks.
Richard
> diff --git
Wilco Dijkstra writes:
> Add missing '\' in 2-instruction movsi/di alternatives so that they are
> printed on separate lines.
>
> Passes bootstrap and regress, OK for commit once stage 1 reopens?
>
> gcc:
> * config/aarch64/aarch64.md (movsi_aarch64): Use '\;' to force
> newline
Pengxuan Zheng writes:
> This patch folds vget_low_* intrinsics to BIT_FIELD_REF to open up more
> optimization opportunities for gimple optimizers.
>
> While we are here, we also remove the vget_low_* definitions from arm_neon.h
> and
> use the new intrinsics framework.
>
> PR
Ajit Agarwal writes:
> Hello Alex/Richard:
>
> All comments are addressed.
>
> Common infrastructure of load store pair fusion is divided into target
> independent and target dependent changed code.
>
> Target independent code is the Generic code with pure virtual function
> to interface between
Ajit Agarwal writes:
> Hello Alex/Richard:
>
> All review comments are addressed.
>
> Common infrastructure of load store pair fusion is divided into target
> independent and target dependent changed code.
>
> Target independent code is the Generic code with pure virtual function
> to interface
Andrew Carlotti writes:
> The end goal of the series is to change the definition of
> aarch64_feature_flags
> from a uint64_t typedef to a class with 128 bits of storage. This class uses
> operator overloading to mimic the existing integer interface as much as
> possible, but with added
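As a rough illustration of the idea described above, a 128-bit flag set that mimics an integer interface through operator overloading might look like the sketch below. The type name `flags128`, its members, and the `bit` helper are all hypothetical; this is not the aarch64_feature_flags class from the patch series.

```cpp
#include <cstdint>

// Hypothetical 128-bit flag set stored as two 64-bit halves.
struct flags128 {
  std::uint64_t lo = 0;
  std::uint64_t hi = 0;

  constexpr flags128() = default;
  // Implicit construction from an integer keeps existing
  // integer-style code compiling.
  constexpr flags128(std::uint64_t l, std::uint64_t h = 0) : lo(l), hi(h) {}

  // Return a value with only bit N set (N < 128).
  static constexpr flags128 bit(unsigned n)
  {
    return n < 64 ? flags128(std::uint64_t(1) << n, 0)
                  : flags128(0, std::uint64_t(1) << (n - 64));
  }

  // True if any bit is set, as with "if (flags & mask)".
  constexpr explicit operator bool() const { return (lo | hi) != 0; }
};

constexpr flags128 operator|(flags128 a, flags128 b)
{
  return flags128(a.lo | b.lo, a.hi | b.hi);
}

constexpr flags128 operator&(flags128 a, flags128 b)
{
  return flags128(a.lo & b.lo, a.hi & b.hi);
}

constexpr flags128 operator~(flags128 a)
{
  return flags128(~a.lo, ~a.hi);
}

constexpr bool operator==(flags128 a, flags128 b)
{
  return a.lo == b.lo && a.hi == b.hi;
}
```

With this shape, expressions such as `flags & ~mask` or `a | b` read the same as they would for a uint64_t, while bits 64 and above land in the second half.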
Richard Biener via Gcc writes:
> Hi,
>
> I'd like to discuss how to go forward with getting the vectorizer to
> all-SLP for this stage1. While there is a personal branch with my
> ongoing work (users/rguenth/vect-force-slp) branches haven't proved
> themselves working well for collaboration.
Richard Sandiford writes:
> Wilco Dijkstra writes:
>> Use LDP/STP for large struct types as they have useful immediate offsets and
>> are typically faster.
>> This removes differences between little and big endian and allows use of
>> LDP/STP without UNSPEC.
>>
Wilco Dijkstra writes:
> Use LDP/STP for large struct types as they have useful immediate offsets and
> are typically faster.
> This removes differences between little and big endian and allows use of
> LDP/STP without UNSPEC.
>
> Passes regress and bootstrap, OK for commit?
>
> gcc:
>
Tamar Christina writes:
>> >> On Wed, May 15, 2024 at 12:29 PM Tamar Christina
>> >> wrote:
>> >> >
>> >> > Hi All,
>> >> >
>> >> > Some Neoverse Software Optimization Guides (SWoG) have a clause that
>> >> > state
>> >> > that for predicated operations that also produce a predicate it is
>>
Tamar Christina writes:
>> -Original Message-
>> From: Richard Biener
>> Sent: Wednesday, May 15, 2024 12:20 PM
>> To: Tamar Christina
>> Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw
>> ; Marcus Shawcroft
>> ; ktkac...@gcc.gnu.org; Richar
Thanks for doing this as a pre-patch. Minor request below:
Tamar Christina writes:
> ;; Perform a logical operation on operands 2 and 3, using operand 1 as
> @@ -6676,38 +6690,42 @@ (define_insn "@aarch64_pred__z"
> (define_insn "*3_cc"
>[(set (reg:CC_NZC CC_REGNUM)
> (unspec:CC_NZC
>
Tamar Christina writes:
> Hi All,
>
> This adds a new tuning parameter EARLY_CLOBBER_SVE_PRED_DEST for AArch64 to
> allow us to conditionally enable the early clobber alternatives based on the
> tuning models.
>
> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
>
> Ok for master?
Wilco Dijkstra writes:
> Use UZP1 instead of INS when combining low and high halves of vectors.
> UZP1 has 3 operands which improves register allocation, and is faster on
> some microarchitectures.
>
> Passes regress & bootstrap, OK for commit?
OK, thanks. We can add core-specific tuning later
The svzero_mask_za intrinsic tried to use the shortest combination
of .b, .h, .s and .d tiles, allowing mixtures of sizes where necessary.
However, Iain S pointed out that LLVM instead requires the tiles to
have the same suffix. GAS supports both versions, so this patch
generates the
Hi Andrew,
Thanks for doing this. I think it improves the organisation of the
FMV documentation and adds some details that were previously missing.
I've made some suggestions below, but documentation is subjective
and I realise that not everyone will agree with them.
I've also added Sandra to
Andrew Carlotti writes:
> We don't yet have a separate feature flag for FEAT_LRCPC2 (and adding
> one will require extending the feature bitmask). Instead, make the
> FEAT_LRCPC patterns available when either armv8.4-a or +rcpc3 is
> specified. On the other hand, we already have a +rcpc flag,
Andrew Carlotti writes:
> FEAT_CSSC is mandatory in the architecture from Armv8.9.
>
> gcc/ChangeLog:
>
> * config/aarch64/aarch64-arches.def: Add CSSC to V8_9A
> dependencies.
OK, thanks.
Richard
>
> ---
>
> Bootstrapped and regression tested on aarch64. Ok for master?
>
>
> diff
Richard Biener writes:
> On Fri, 12 Apr 2024, Tamar Christina wrote:
>
>> Hi All,
>>
>> This is a story all about how the peeling for gaps introduces a bug in the
>> upper
>> bounds.
>>
>> Before I go further, I'll first explain how I understand this to work for
>> loops
>> with a single
Alex Coplan writes:
> This is a v2 because I accidentally sent a WIP version of the patch last
> time round which used replace_equiv_address instead of
> replace_equiv_address_nv; that caused some ICEs (pointed out by the
> Linaro CI) since pair addressing modes aren't a subset of the addresses
>
"Andre Vieira (lists)" writes:
> This patch fixes some testisms introduced by:
>
> commit 5aa3fec38cc6f52285168b161bab1a869d864b44
> Author: Andre Vieira
> Date: Wed Apr 10 16:29:46 2024 +0100
>
> aarch64: Add support for _BitInt
>
> The testcases were relying on an unnecessary
https://gcc.gnu.org/g:b87ba79200f2a727aa5c523abcc5c03fa11fc007
commit r14-9925-gb87ba79200f2a727aa5c523abcc5c03fa11fc007
Author: Andre Vieira (lists)
Date: Thu Apr 11 17:54:37 2024 +0100
aarch64: Fix _BitInt testcases
This patch fixes some testisms introduced by:
commit
Evgeny Karpov writes:
> Wednesday, April 10, 2024 8:40 PM
> Richard Sandiford wrote:
>
>> Thanks for the updates and sorry again for the slow review.
>> I've replied to some of the patches in the series but otherwise it looks
>> good to
>> me.
>>
>
Alex Coplan writes:
> Hi,
>
> The ldp/stp fusion pass can change the base of an access so that the two
> accesses end up using a common base register. So far we have been using
> adjust_address_nv to do this, but this means that we don't preserve
> other properties of the mem we're replacing.
Andrew Carlotti writes:
> On Wed, Apr 10, 2024 at 05:42:05PM +0100, Richard Sandiford wrote:
>> Andrew Carlotti writes:
>> > On Tue, Apr 09, 2024 at 04:43:16PM +0100, Richard Sandiford wrote:
>> >> Andrew Carlotti writes:
>> >> > The first three pa
Evgeny Karpov writes:
> Hello,
>
> v2 is ready for the review!
> Based on the v1 review:
> https://gcc.gnu.org/pipermail/gcc-patches/2024-February/thread.html#646203
>
> Testing for the x86_64-w64-mingw32 target is in progress to avoid
> regression due to refactoring.
Thanks for the updates and
Evgeny Karpov writes:
> From: Zac Walker
> Date: Fri, 1 Mar 2024 02:17:39 +0100
> Subject: [PATCH v2 10/13] Rename "x86 Windows Options" to "Cygwin and MinGW
> Options"
>
> Rename "x86 Windows Options" to "Cygwin and MinGW Options".
> It will be used also for AArch64.
>
> gcc/ChangeLog:
>
>
Evgeny Karpov writes:
> From: Zac Walker
> Date: Fri, 1 Mar 2024 10:49:28 +0100
> Subject: [PATCH v2 08/13] aarch64: Add Cygwin and MinGW environments for
> AArch64
>
> Define Cygwin and MinGW environment such as types, SEH definitions,
> shared libraries, etc.
>
> gcc/ChangeLog:
>
> *
Evgeny Karpov writes:
> From: Zac Walker
> Date: Fri, 1 Mar 2024 01:55:47 +0100
> Subject: [PATCH v2 04/13] aarch64: Add aarch64-w64-mingw32 COFF
>
> Define ASM specific for COFF format on AArch64.
>
> gcc/ChangeLog:
>
> * config.gcc: Add COFF format support definitions.
> *
Evgeny Karpov writes:
> From: Zac Walker
> Date: Fri, 1 Mar 2024 09:56:59 +0100
> Subject: [PATCH v2 03/13] aarch64: Mark x18 register as a fixed register for
> MS ABI
>
> Define the MS ABI for aarch64-w64-mingw32.
> Adjust FIXED_REGISTERS, CALL_REALLY_USED_REGISTERS and
> STATIC_CHAIN_REGNUM
Sorry for the slow reply.
Evgeny Karpov writes:
> From: Zac Walker
> Date: Fri, 1 Mar 2024 01:45:13 +0100
> Subject: [PATCH v2 02/13] aarch64: The aarch64-w64-mingw32 target implements
> the MS ABI
>
> Two ABIs for aarch64 have been defined for different platforms.
>
> gcc/ChangeLog:
>
>
Andrew Carlotti writes:
> On Tue, Apr 09, 2024 at 04:43:16PM +0100, Richard Sandiford wrote:
>> Andrew Carlotti writes:
>> > The first three patches are trivial changes to the feature list to reflect
>> > recent changes in the ACLE. Patch 4 removes most of th
h I've not gone through the tests very thoroughly
this time around, and just gone by the internal diff between this
version and the previous one. But we can adjust them as necessary
based on any reports that come in.
Richard
>
> On 28/03/2024 15:21, Richard Sandiford wrote:
>> Jakub
"Andre Vieira (lists)" writes:
> @@ -6907,6 +6938,11 @@ aarch64_layout_arg (cumulative_args_t pcum_v, const
> function_arg_info )
> && (!alignment || abi_break_gcc_9 < alignment)
> && (!abi_break_gcc_13 || alignment < abi_break_gcc_13));
>
> + /* _BitInt(N) was only
Tamar Christina writes:
> Hi All,
>
> This is a backport of g:306713c953d509720dc394c43c0890548bb0ae07.
>
> The AArch64 vector PCS does not allow simd calls with simdlen 1,
> however due to a bug we currently do allow it for num == 0.
>
> This causes us to emit a symbol that doesn't exist and we
Andrew Carlotti writes:
> The first three patches are trivial changes to the feature list to reflect
> recent changes in the ACLE. Patch 4 removes most of the FMV multiversioning
> features that don't work at the moment, and should be entirely
> uncontroversial.
>
> Patch 5 handles the
Andrew Carlotti writes:
> There was an assumption in some places that the aarch64_fmv_feature_data
> array contained FEAT_MAX elements. While this assumption held up till
> now, it is safer and more flexible to use the array size directly.
>
> gcc/ChangeLog:
>
> * config/aarch64/aarch64.cc
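The point about using the array size directly, rather than assuming a fixed element count, can be sketched as below. The table, its contents, and the names `fmv_entry`, `fmv_table`, `array_size` and `lookup_feature` are hypothetical stand-ins, not the real aarch64_fmv_feature_data array or GCC's own helpers.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical table entry and table; the contents are illustrative only.
struct fmv_entry {
  const char *name;
  unsigned priority;
};

static const fmv_entry fmv_table[] = {
  {"rng", 1},
  {"sve", 2},
  {"sme", 3},
};

// Deduce the element count from the array type itself, so the loop
// bound cannot drift out of sync with the table.
template <typename T, std::size_t N>
constexpr std::size_t array_size(const T (&)[N])
{
  return N;
}

// Look up an entry by name; returns nullptr if it is not in the table.
const fmv_entry *lookup_feature(const char *name)
{
  for (std::size_t i = 0; i < array_size(fmv_table); ++i)
    if (std::strcmp(fmv_table[i].name, name) == 0)
      return &fmv_table[i];
  return nullptr;
}
```

Adding or removing a table row then needs no change to the loop, which is the flexibility the message describes.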
Richard Ball writes:
> When using LTO, handling the pragma for sme before the pragma
> for the neon-sve-bridge caused the following error on svset_neonq,
> in the neon-sve-bridge.c test.
>
> error: ACLE function '0' can only be called when SME streaming mode is
> enabled.
>
> This has been
Not sure how this happened, but: svsudot is supposed to be expanded
as USDOT with the operands swapped. However, a thinko in the
expansion of svsudot meant that the arguments weren't in fact
swapped; the attempted swap was just a no-op. And the testcases
blithely accepted that.
Tested on
https://gcc.gnu.org/g:2c1c2485a4b1aca746ac693041e51ea6da5c64ca
commit r14-9836-g2c1c2485a4b1aca746ac693041e51ea6da5c64ca
Author: Richard Sandiford
Date: Mon Apr 8 16:53:32 2024 +0100
aarch64: Fix expansion of svsudot [PR114607]
Not sure how this happened, but: svsudot is supposed
Richard Ball writes:
> Hi all,
>
> Adding the NEON-SVE bridge intrinsics that were missed
> in the last patch.
>
> Thanks,
> Richard
OK, thanks.
Richard
> diff --git a/htdocs/gcc-14/changes.html b/htdocs/gcc-14/changes.html
> index
>
https://gcc.gnu.org/g:278cad85077509b73b1faf32d36f3889c2a5524b
commit r14-9833-g278cad85077509b73b1faf32d36f3889c2a5524b
Author: Swinney, Jonathan
Date: Mon Apr 8 14:02:33 2024 +0100
aarch64: Fix vld1/st1_x4 intrinsic test
The test for this intrinsic was failing silently and so
"Swinney, Jonathan" writes:
> The test for this intrinsic was failing silently and so it failed to
> report the bug reported in 114521. This patch modifies the test to
> report the result.
>
> Bug report: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114521
>
> Signed-off-by: Jonathan Swinney
>
Segher Boessenkool writes:
> Hi!
>
> On Wed, Apr 03, 2024 at 01:07:41PM +0200, Richard Biener wrote:
>> The following avoids re-walking and re-combining the instructions
>> between i2 and i3 when the pattern of i2 doesn't change.
>>
>> Bootstrap and regtest running on top of a reversal of
>>
Richard Biener writes:
> On Fri, Apr 5, 2024 at 3:52 PM Richard Sandiford
>> This isn't a regression on a known testcase. However, it's a nasty
>> wrong code bug that could conceivably trigger for autovec code (although
>> I've not been able to construct a reproducer so fa
aarch64-sve.md had a pattern that combined:
cmpeq pb.T, pa/z, zc.T, #0
mov zd.T, pb/z, #1
into:
cnot zd.T, pa/m, zc.T
But this is only valid if pa.T is a ptrue. In other cases, the
original would set inactive elements of zd.T to 0, whereas the
combined form
https://gcc.gnu.org/g:67cbb1c638d6ab3a9cb77e674541e2b291fb67df
commit r14-9811-g67cbb1c638d6ab3a9cb77e674541e2b291fb67df
Author: Richard Sandiford
Date: Fri Apr 5 14:47:15 2024 +0100
aarch64: Fix bogus cnot optimisation [PR114603]
aarch64-sve.md had a pattern that combined
GCC 14 adds the header file arm_neon_sve_bridge.h to help interface
SVE and Advanced SIMD code. One of the defined idioms is:
svset_neonq (svundef_TYPE (), advsimd_vector)
which simply reinterprets advsimd_vector as an SVE vector without
regard for what's in the upper bits.
GCC was failing
https://gcc.gnu.org/g:86dce005a1d440154dbf585dde5a2dd4cfac7a05
commit r14-9787-g86dce005a1d440154dbf585dde5a2dd4cfac7a05
Author: Richard Sandiford
Date: Thu Apr 4 14:15:49 2024 +0100
aarch64: Recognise svundef idiom [PR114577]
GCC 14 adds the header file arm_neon_sve_bridge.h
Wilco Dijkstra writes:
> As mentioned in
> https://gcc.gnu.org/pipermail/gcc-patches/2024-March/648397.html ,
> do some additional cleanup of the macros and aliases:
>
> Cleanup the macros to add the libat_ prefixes in atomic_16.S. Emit the
> alias to __atomic_ when ifuncs are not enabled in
Wilco Dijkstra writes:
> v2:
>
> Fix libatomic build to support --disable-gnu-indirect-function on AArch64.
> Always build atomic_16.S, add aliases to the __atomic_ functions if
> !HAVE_IFUNC.
> Include auto-config.h in atomic_16.S to avoid having to pass defines via
> makefiles.
> Fix build
Alex Coplan writes:
> On 23/02/2024 16:41, Ajit Agarwal wrote:
>> Hello Richard/Alex/Segher:
>
> Hi Ajit,
>
> Sorry for the delay and thanks for working on this.
>
> Generally this looks like the right sort of approach (IMO) but I've left
> some comments below.
>
> I'll start with a meta comment:
Sergey Bugaev writes:
> Coupled with a corresponding binutils patch, this produces a toolchain that
> can
> successfully build working binaries targeting aarch64-gnu.
>
> gcc/Changelog:
>
> * config.gcc: Recognize aarch64*-*-gnu* targets.
> * config/aarch64/aarch64-gnu.h: New file.
>
Christophe Lyon writes:
> Fix the comment to document FEATURE_STRING instead of FEAT_STRING.
>
> 2024-03-29 Christophe Lyon
>
> gcc/
> * config/aarch64/aarch64-option-extensions.def: Fix comment.
OK, thanks.
Richard
> ---
> gcc/config/aarch64/aarch64-option-extensions.def | 16
Wilco Dijkstra writes:
> A few HWCAP entries are missing from aarch64/cpuinfo.c. This results in
> build errors
> on older machines.
>
> This counts as a trivial build fix, but since it's late in stage 4 I'll let
> maintainers chip in.
> OK for commit?
>
> libgcc/
> *
Gah. As mentioned on irc, I'd written this patch to fix PR114521.
The bug was fixed properly by Jonathan's struct rework in GCC 12,
but that's much too invasive to backport. The attached patch therefore
deals with the bug directly.
Since it's new work, and since there's only one GCC 11 release
Jakub Jelinek writes:
> On Thu, Mar 28, 2024 at 03:00:46PM +0000, Richard Sandiford wrote:
>> >* gcc.target/aarch64/bitint-alignments.c: New test.
>> >* gcc.target/aarch64/bitint-args.c: New test.
>> >* gcc.target/aarch64/bitint-sizes.c: New test.
>>
"Andre Vieira (lists)" writes:
> This patch adds support for C23's _BitInt for the AArch64 port when
> compiling for little endianness. Big Endianness requires further
> target-agnostic support and we therefore disable it for now.
>
> The tests expose some suboptimal codegen for which I'll
"Andre Vieira (lists)" writes:
> This patch makes sure we do not give ABI change diagnostics for the ABI
> breaks of GCC 9, 13 and 14 for any type involving _BitInt(N), since that
> type did not exist before this GCC version.
>
> ChangeLog:
>
> * config/aarch64/aarch64.cc
https://gcc.gnu.org/g:d98467091bfc23522fefd32f1253e1c9e80331d3
commit r11-11296-gd98467091bfc23522fefd32f1253e1c9e80331d3
Author: Richard Sandiford
Date: Wed Mar 27 19:26:57 2024 +
asan: Handle poly-int sizes in ASAN_MARK [PR97696]
This patch makes the expansion
https://gcc.gnu.org/g:daee0409d195d346562e423da783d5d1cf8ea175
commit r11-11295-gdaee0409d195d346562e423da783d5d1cf8ea175
Author: Richard Sandiford
Date: Wed Mar 27 19:26:56 2024 +
aarch64: Fix vld1/st1_x4 intrinsic definitions
The vld1_x4 and vst1_x4 patterns use XI
https://gcc.gnu.org/g:51e1629bc11f0ae4b8050712b26521036ed360aa
commit r12-10296-g51e1629bc11f0ae4b8050712b26521036ed360aa
Author: Richard Sandiford
Date: Wed Mar 27 17:38:09 2024 +
asan: Handle poly-int sizes in ASAN_MARK [PR97696]
This patch makes the expansion
https://gcc.gnu.org/g:86b80b049167d28a9ef43aebdfbb80ae5deb0888
commit r13-8501-g86b80b049167d28a9ef43aebdfbb80ae5deb0888
Author: Richard Sandiford
Date: Wed Mar 27 15:30:19 2024 +
asan: Handle poly-int sizes in ASAN_MARK [PR97696]
This patch makes the expansion
Matthias Kretz writes:
> Hi Richard,
>
> sorry for not answering sooner. I took action on your mail but failed to also
> give feedback. Now in light of your veto of Srinivas' patch I wanted to use
> the
> opportunity to pick this up again.
>
> On Dienstag, 23. Januar 2
Matthias Kretz writes:
> On Wednesday, 27 March 2024 11:07:14 CET Richard Sandiford wrote:
>> I'm still worried about:
>>
>> #if _GLIBCXX_SIMD_HAVE_SVE
>> constexpr inline int __sve_vectorized_size_bytes = __ARM_FEATURE_SVE_BITS
>> / 8
Jonathan Wakely writes:
> On Fri, 8 Mar 2024 at 09:58, Matthias Kretz wrote:
>>
>> Hi,
>>
>> I applied and did extended testing on x86_64 (no regressions) and aarch64
>> using qemu testing SVE 256, 512, and 1024. Looks good!
>>
>> While going through the applied patch I noticed a few style issues
Vaseeharan Vinayagamoorthy writes:
> Hi Richard,
>
> I think this patch is breaking the build of aarch64-none-elf and
> aarch64-none-linux-gnu targets, when building with GCC 4.8.
> This is not an issue when building with GCC 7.5.
>
> Kind regards,
> Vasee
Thanks. I pushed the attached patch
https://gcc.gnu.org/g:5be2313bceea7b482c17ee730efe604b910800bd
commit r14-9678-g5be2313bceea7b482c17ee730efe604b910800bd
Author: Richard Sandiford
Date: Tue Mar 26 17:27:56 2024 +
aarch64: Use constexpr for out-of-line statics
GCC 4.8 complained about the use of const rather
Victor Do Nascimento writes:
> Given how, at present, the choice of using LSE128 atomic instructions
> by the toolchain is delegated to run-time selection in the form of
> Libatomic ifuncs, responsible for querying target support, the
> `+lse128' target architecture compile-time flag is absent
Richard Earnshaw writes:
> This test generates different warnings on ilp32 targets because the size
> of an integer matches the size of a pointer. Avoid this by using
> signed char.
>
> gcc/testsuite:
>
> PR testsuite/113428
> * gcc.dg/gomp/bad-array-section-c-3.c: Use signed char
Sorry, still catching up on email, but:
Richard Biener writes:
> We have optimize_vectors_before_lowering_p but we shouldn't even there
> turn supported into not supported ops and as said, what's supported or
> not cannot be finally decided (if it's only vcond and not vcond_mask
> that is
Wilco Dijkstra writes:
> Fix libatomic build to support --disable-gnu-indirect-function on AArch64.
> Always build atomic_16.S and add aliases to the __atomic_* functions if
> !HAVE_IFUNC.
This description is too brief for me. Could you say in detail how the
new scheme works? E.g. the
"Andre Vieira (lists)" writes:
> Hey,
>
> Dropped the first patch and dealt with the comments above, hopefully I
> didn't miss any this time.
>
> --
>
> This patch adds support for C23's _BitInt for the AArch64 port when
> compiling
> for little endianness. Big
Wilco Dijkstra writes:
> Hi Richard,
>
>> It looks like this is really doing two things at once: disabling the
>> direct emission of LDP/STP Qs, and switching the GPR handling from using
>> pairs of DImode moves to single TImode moves. At least, that seems to be
>> the effect of...
>
> No it
Andrew Pinski writes:
> This fixes the cost model for BFI instructions which don't
> use directly zero_extract on the LHS.
> aarch64_bfi_rtx_p does the heavy lifting by matching of
> the patterns.
>
> Note this alone does not fix PR 107270, it is a step in the right
> direction. There we get z
Andrew Pinski writes:
> This enables construction of V4SF CST like `{1.0f, 1.0f, 0.0f, 0.0f}`
> (and other fp enabled CSTs) by using `fmov v0.2s, 1.0` as the instruction
> is designed to zero out the other bits.
> This is a small extension on top of the code that creates fmov for the case
> where
Richard Sandiford writes:
> Andrew Pinski writes:
>> Aarch64 has a way to form some floating point CSTs via the fmov instructions,
>> these instructions also zero out the upper parts of the registers so they can
>> be used for vector CSTs that have one non-zer
Andrew Pinski writes:
> Aarch64 has a way to form some floating point CSTs via the fmov instructions,
> these instructions also zero out the upper parts of the registers so they can
> be used for vector CSTs that have one non-zero constant that would be
> able
> to formed via the fmov in
While reworking the aarch64 feature descriptions, I forgot
to add out-of-class definitions of some static constants.
This could lead to a build failure with some compilers.
This was seen with some WIP to increase the number of extensions
beyond 64. It's latent on trunk though, and a regression
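For readers unfamiliar with the issue, the kind of out-of-class definition being described can be sketched as follows. The struct `feature_info`, its member `flag`, and `max_with_flag` are hypothetical names chosen for illustration; they are not the aarch64 feature descriptions themselves.

```cpp
#include <algorithm>

// Hypothetical example, not GCC code.
struct feature_info {
  // In-class declaration with an initializer.  On its own this is only
  // a declaration of the static member.
  static const unsigned int flag = 1u << 3;
};

// Out-of-class definition.  Without it, code that odr-uses the member
// (for example by binding it to a reference) can fail to link with
// some compilers.
const unsigned int feature_info::flag;

unsigned int max_with_flag(unsigned int x)
{
  // std::max takes its arguments by const reference, which odr-uses
  // feature_info::flag and so requires the definition above.
  return std::max(x, feature_info::flag);
}
```

Whether a missing definition actually causes a failure tends to depend on the compiler and optimization level, since uses that are folded to a constant never reference the symbol.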
https://gcc.gnu.org/g:c7a9883663a888617b6e3584233aa756b30519f8
commit r14-9333-gc7a9883663a888617b6e3584233aa756b30519f8
Author: Richard Sandiford
Date: Wed Mar 6 10:04:56 2024 +
aarch64: Define out-of-class static constants
While reworking the aarch64 feature descriptions, I
HAO CHEN GUI writes:
> Hi,
> This patch tries to fix a potential problem which is raised by the patch
> for PR111267. The volatile asm operand tries to be propagated to a single
> set insn with the patch for PR111267. The volatile asm operand might be
> executed for multiple times if the define