Require equal shift amounts for IFN_DIV_POW2

2019-12-19 Thread Richard Sandiford
ector/ vector-scalar split that we already have for normal shifts. Tested on aarch64-linux-gnu and x86_64-linux-gnu. OK to install? Richard 2019-12-19 Richard Sandiford gcc/ * tree-vect-slp.c (vect_build_slp_tree_1): Require all shifts in an IFN_DIV_POW2 node to be equal.

Ping: Fix tree-nrv.c ICE for direct internal functions

2019-12-19 Thread Richard Sandiford
Ping Richard Sandiford writes: > pass_return_slot::execute has: > > /* Ignore internal functions without direct optabs, > those are expanded specially and aggregate_value_p > on their result might result in undesirable warnings >

[C++] Fix ICE for binding lax vector conversions to references (PR 93014)

2019-12-19 Thread Richard Sandiford
e don't get the error we expected for that case. The patch doesn't attempt to fix this though. Tested on aarch64-linux-gnu and x86_64-linux-gnu. OK to install? Richard 2019-12-19 Richard Sandiford gcc/cp/ PR c++/93014 * cvt.c (ocp_convert): Appl

Re: [C++ PATCH] Make same_type_p return false for gnu_vector_type_p differences (PR 92789)

2019-12-19 Thread Richard Sandiford
Jason Merrill writes: > On 12/12/19 10:16 AM, Richard Sandiford wrote: >> As Jason pointed out in the review of the C++ gnu_vector_type_p patch: >> >> https://gcc.gnu.org/ml/gcc-patches/2019-12/msg00173.html >> >> the real fix for the XFAILs in acle/

Add a generic lhd_simulate_enum_decl

2019-12-19 Thread Richard Sandiford
patch adds a simple default implementation that it can use. Tested on aarch64-linux-gnu and x86_64-linux-gnu. OK for the generic bits? Richard 2019-12-19 Richard Sandiford gcc/ * langhooks-def.h (lhd_simulate_enum_decl): Declare. (LANG_HOOKS_SIMULATE_ENUM_DECL): U

Re: [C++ PATCH] PR c++/92576 - redeclaration of variable template.

2019-12-19 Thread Richard Sandiford
ve that error was from 2004, predating alias > templates; I can't see anything in the standard to prohibit redefining an > alias template to refer to the same type, as you definitely can with a > non-template alias or typedef, and other compilers allow it. So I think > the tests shou

Re: [committed, amdgcn] Allow constants in vector extends and truncates

2019-12-19 Thread Richard Sandiford
Andrew Stubbs writes: > This patch changes the operand predicates such that vector constants are > permitted during compilation. This prevents ICEs caused by the compiler > trying to emit such instructions without checking. That sounds like a target-independent bug though. Why didn't we apply

Re: [GCC][PATCH][AArch64]Add ACLE intrinsics for dot product (usdot - vector, dot - by element) for AArch64 AdvSIMD ARMv8.6 Extension

2019-12-20 Thread Richard Sandiford
Stam Markianos-Wright writes: > diff --git a/gcc/config/aarch64/aarch64-simd.md > b/gcc/config/aarch64/aarch64-simd.md > index > ad4676bc167f08951e693916c7ef796e3501762a..eba71f004ef67af654f9c512b720aa6cfdd1d7fc > 100644 > --- a/gcc/config/aarch64/aarch64-simd.md > +++ b/gcc/config/aarch64/aarc

Re: [GCC][PATCH][AArch64]Add ACLE intrinsics for bfdot for ARMv8.6 Extension

2019-12-20 Thread Richard Sandiford
Stam Markianos-Wright writes: > Hi all, > > This patch adds the ARMv8.6 Extension ACLE intrinsics for the bfloat bfdot > operation. > > The functions are declared in arm_neon.h with the armv8.2-a+bf16 target > option > as required. > > RTL patterns are defined to generate assembler. > > Tests a

Check mask argument's type when vectorising conditional functions

2019-12-23 Thread Richard Sandiford
e. Unlike vectorizable_load and vectorizable_store, vectorizable_call wasn't checking whether the mask had a suitable type, leading to an ICE on the testcases. Tested on aarch64-linux-gnu and x86_64-linux-gnu. OK to install? Richard 2019-12-23 Richard Sandiford gcc/ * tr

Re: [GCC][PATCH][Aarch64] Add Bfloat16_t scalar type, vector types and machine modes to Aarch64 back-end [1/2]

2019-12-23 Thread Richard Sandiford
Stam Markianos-Wright writes: > On 12/19/19 10:01 AM, Richard Sandiford wrote: >>> + >>> +#pragma GCC push_options >>> +#pragma GCC target ("arch=armv8.2-a+bf16") >>> +#ifdef __ARM_FEATURE_BF16_SCALAR_ARITHMETIC >>> + >>> +

Re: [GCC][PATCH][AArch64] ACLE intrinsics bfmmla and bfmlal for AArch64 AdvSIMD

2019-12-23 Thread Richard Sandiford
Thanks for the patch, looks good. Delia Burduv writes: > This patch adds the ARMv8.6 ACLE intrinsics for bfmmla, bfmlalb and bfmlalt > as part of the BFloat16 extension. > (https://developer.arm.com/architectures/instruction-sets/simd-isas/neon/intrinsics) > The intrinsics are declared in arm_ne

Re: [GCC][PATCH][AArch64] ACLE intrinsics for BFCVTN, BFCVTN2 (AArch64 AdvSIMD) and BFCVT (AArch64 FP)

2019-12-23 Thread Richard Sandiford
Some of the comments on the BFMMLA/BFMLA[LT] patch apply here too. Delia Burduv writes: > This patch adds the Armv8.6-a ACLE intrinsics for bfmmla, bfmlalb and > bfmlalt as part of the BFloat16 extension. That's the other patch :-) > [...] > diff --git a/gcc/config/aarch64/aarch64-simd.md > b

[C++ PATCH v2] Don't mangle attributes that have a space in their name

2019-12-27 Thread Richard Sandiford
Jason Merrill writes: > On 12/18/19 1:24 PM, Richard Sandiford wrote: >> The SVE port needs to maintain a different type identity for >> GNU vectors and "SVE vectors" even during LTO, since the types >> use different ABIs. The easiest way of doing that seemed

Add missing target check for fully-masked fold-left reductions

2019-12-27 Thread Richard Sandiford
is supported. Tested on aarch64-linux-gnu and x86_64-linux-gnu. OK to install? Richard 2019-12-27 Richard Sandiford gcc/ * tree-vect-loop.c (vectorizable_reduction): Check whether the target supports the required VEC_COND_EXPR operation before allowing the fallback

[committed][AArch64] Fix typo in V_INT_CONTAINER

2019-12-27 Thread Richard Sandiford
All VNx2 V_INT_CONTAINER entries should map to VNx2DI. The lower-case version was already correct. Tested on aarch64-linux-gnu and applied as r279743. Richard 2019-12-27 Richard Sandiford gcc/ * config/aarch64/iterators.md (V_INT_CONTAINER): Fix VNx2SF entry. gcc/testsuite

Check for a supported comparison when using EXTRACT_LAST_REDUCTION

2019-12-28 Thread Richard Sandiford
x-gnu. OK to install? Richard 2019-12-28 Richard Sandiford gcc/ * tree-vect-stmts.c (vectorizable_condition): For extract-last reductions, check that the target supports the required comparison operation. gcc/testsuite/ * gcc.dg/vect/vect-cond-12.c: New

Unshare DR_STEP before gimplifying it

2019-12-28 Thread Richard Sandiford
d be other instances of this too, but this patch just deals with the gather/scatter case. Tested on aarch64-linux-gnu and x86_64-linux-gnu. OK to install? Richard 2019-12-28 Richard Sandiford gcc/ * tree-vect-stmts.c (vect_get_strided_load_store_ops): Copy DR_STEP before g

Re: [GCC][PATCH][AArch64]Add ACLE intrinsics for dot product (usdot - vector, dot - by element) for AArch64 AdvSIMD ARMv8.6 Extension

2019-12-30 Thread Richard Sandiford
Stam Markianos-Wright writes: > On 12/20/19 2:13 PM, Richard Sandiford wrote: >> Stam Markianos-Wright writes: >>> +**... >>> +**ret >>> +*/ >>> +int32x2_t ufoo (int32x2_t r, uint8x8_t x, int8x8_t y) >>> +{ >>> + return vusdot_

Re: [GCC][PATCH][AArch64]Add ACLE intrinsics for bfdot for ARMv8.6 Extension

2019-12-30 Thread Richard Sandiford
Stam Markianos-Wright writes: > diff --git a/gcc/config/aarch64/aarch64-simd.md > b/gcc/config/aarch64/aarch64-simd.md > index > adfda96f077075ad53d4bea2919c4d3b326e49f5..7587bc46ba1c80389ea49fa83a0e6f8a489711e9 > 100644 > --- a/gcc/config/aarch64/aarch64-simd.md > +++ b/gcc/config/aarch64/aarc

Fix SSA update when vectorisation adds a vdef to a read-only loop

2019-12-30 Thread Richard Sandiford
has no vdefs, the definition that applies on exit is the same as the one that applies on entry.) This patch therefore adds a third case: the scalar loop and to-be-vectorised epilogue have no virtual defs, but the main loop does. Tested on aarch64-linux-gnu and x86_64-linux-gnu. OK to install? R

[committed] Fix EXTRACT_LAST_REDUCTION segfault

2019-12-31 Thread Richard Sandiford
ed as obvious. Richard 2019-12-31 Richard Sandiford gcc/ * tree-vect-stmts.c (vectorizable_condition): Only nullify cond_expr if we've created a new condition. Don't nullify it if we've decided to keep it and then invert the result. gcc/testsuite/

[AArch64] Fix constraints for CPY /M

2020-01-06 Thread Richard Sandiford
The constraints for CPY /M allowed p0-p15 instead of the intended p0-p7. This looks like a pasto from the preceding constant pattern, where p0-p15 is allowed. Tested on aarch64-linux-gnu and applied as 279899. Richard 2020-01-06 Richard Sandiford gcc/ * config/aarch64/aarch64

[committed][AArch64] Use move-if-change for aarch64-tune.md

2020-01-06 Thread Richard Sandiford
ed by changing aarch64-cores.def and making sure that the file was updated appropriately. Richard 2020-01-06 Richard Sandiford gcc/ * config/aarch64/t-aarch64 ($(srcdir)/config/aarch64/aarch64-tune.md): Depend on... (s-aarch64-tune-md): ...this new stamp file. Pipe th

[committed][AArch64] Use type attributes to mark types that use the SVE PCS

2020-01-07 Thread Richard Sandiford
types and shouldn't be user-visible. The patch tries to ensure this by including a space in the attribute name, like we already do for things like "fn spec" and "omp declare simd". Tested on aarch64-linux-gnu, applied as r279953. Richard 2020-01-07 Richard Sandi

Re: Add a compatible_vector_types_p target hook

2020-01-07 Thread Richard Sandiford
Richard Sandiford writes: > Richard Biener writes: >> On December 14, 2019 11:43:48 AM GMT+01:00, Richard Sandiford >> wrote: >>>Richard Biener writes: >>>> On December 13, 2019 10:12:40 AM GMT+01:00, Richard Sandiford >>> wrote: >>>>&g

Re: [PATCH][RFC] Introduce BIT_FIELD_INSERT

2020-01-07 Thread Richard Sandiford
Richard Biener writes: > On Tue, 7 Jan 2020, Andrew Pinski wrote: > >> On Mon, Jan 6, 2020 at 11:36 PM Richard Biener wrote: >> > >> > On Mon, 16 Dec 2019, Andrew Pinski wrote: >> > >> > > On Thu, Nov 15, 2018 at 12:31 AM Richard Biener >> > > wrote: >> > > > >> > > > On Thu, 15 Nov 2018, Richa

Re: [PATCH][RFC] Introduce BIT_FIELD_INSERT

2020-01-07 Thread Richard Sandiford
Richard Biener writes: > On Tue, 7 Jan 2020, Richard Sandiford wrote: > >> Richard Biener writes: >> > On Tue, 7 Jan 2020, Andrew Pinski wrote: >> > >> >> On Mon, Jan 6, 2020 at 11:36 PM Richard Biener wrote: >> >> > >> >>

Re: [GCC][PATCH][Aarch64] Add Bfloat16_t scalar type, vector types and machine modes to Aarch64 back-end [2/2]

2020-01-07 Thread Richard Sandiford
Stam Markianos-Wright writes: > On 12/19/19 10:08 AM, Richard Sandiford wrote: >> Stam Markianos-Wright writes: >>> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c >>> index f57469b6e23..f40f6432fd4 100644 >>> --- a/gcc/config/aa

Re: [GCC][PATCH][Aarch64] Add Bfloat16_t scalar type, vector types and machine modes to Aarch64 back-end [1/2]

2020-01-07 Thread Richard Sandiford
Thanks for the update. The new patch looks really good, just some minor comments. Stam Markianos-Wright writes: > [...] > Also I've update the filenames of all our tests to make them a bit clearer: > > C tests: > > __ bfloat16_scalar_compile_1.c to bfloat16_scalar_compile_3.c: Compilation of >

Re: [patch] relax aarch64 stack-clash tests depedence on alloca.h

2020-01-07 Thread Richard Sandiford
Olivier Hainque writes: > Hi Andrew, > >> On 6 Jan 2020, at 23:24, Andrew Pinski wrote: >> Just one small suggestion: > > Sure > >> Instead of: >> - char* pStr = alloca(SIZE); >> + char* pStr = __builtin_alloca(SIZE); >> >> Why not just do: >> -#include >> +#define alloca __builtin_alloca > >

[pushed] aarch64: Remove SME2.1 forms of LUTI2/4

2024-03-05 Thread Richard Sandiford
I was over-eager when adding support for strided SME2 instructions and accidentally included forms of LUTI2 and LUTI4 that are only available with SME2.1, not SME2. This patch removes them for now. We're planning to add proper support for SME2.1 in the GCC 15 timeframe. Sorry for the blunder :(

[PATCH] asan: Handle poly-int sizes in ASAN_MARK [PR97696]

2024-03-05 Thread Richard Sandiford
This patch makes the expansion of IFN_ASAN_MARK let through poly-int-sized objects. The expansion itself was already generic enough, but the tests for the fast path were too strict. Bootstrapped & regression tested on aarch64-linux-gnu. Is this OK for trunk now, or should it wait for GCC 15? I'

Re: [PATCH] asan: Handle poly-int sizes in ASAN_MARK [PR97696]

2024-03-05 Thread Richard Sandiford
Jakub Jelinek writes: > On Tue, Mar 05, 2024 at 06:03:41PM +0000, Richard Sandiford wrote: >> This patch makes the expansion of IFN_ASAN_MARK let through >> poly-int-sized objects. The expansion itself was already generic >> enough, but the tests for the fast

Re: [PATCH] asan: Handle poly-int sizes in ASAN_MARK [PR97696]

2024-03-05 Thread Richard Sandiford
Jakub Jelinek writes: > On Tue, Mar 05, 2024 at 06:30:40PM +0000, Richard Sandiford wrote: >> (1) Keep the test where it is, taking advantage of the current SVE >> handling in aarch64-sve.exp, and add: >> >> /* { dg-skip-if "" { no_fsanitize_addre

Re: [PATCHv2] fwprop: Avoid volatile defines to be propagated

2024-03-05 Thread Richard Sandiford
HAO CHEN GUI writes: > Hi, > This patch tries to fix a potential problem which is raised by the patch > for PR111267. The volatile asm operand tries to be propagated to a single > set insn with the patch for PR111267. The volatile asm operand might be > executed multiple times if the define

[pushed] aarch64: Define out-of-class static constants

2024-03-06 Thread Richard Sandiford
While reworking the aarch64 feature descriptions, I forgot to add out-of-class definitions of some static constants. This could lead to a build failure with some compilers. This was seen with some WIP to increase the number of extensions beyond 64. It's latent on trunk though, and a regression fr
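
A minimal sketch of the kind of problem this message describes, not the actual aarch64 code: a static const data member that is initialised inside the class body still needs a namespace-scope definition if it is odr-used, for example when it is bound to a const reference. Some compilers fold the value and never notice the missing definition, while others fail at link time. The names `feature_flags`, `SIMD`, `SVE` and `largest_flag` are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a set of feature bits.
struct feature_flags
{
  // In-class initialisers make these usable in constant expressions,
  // but they are declarations only, not definitions.
  static const std::uint64_t SIMD = 1ULL << 0;
  static const std::uint64_t SVE = 1ULL << 1;
};

// Out-of-class definitions.  Without these, odr-using the members
// (std::max below takes its arguments by const reference) may produce
// an undefined-symbol error, depending on the compiler and options.
const std::uint64_t feature_flags::SIMD;
const std::uint64_t feature_flags::SVE;

// Returns the numerically largest of the two flags.
std::uint64_t
largest_flag ()
{
  return std::max (feature_flags::SIMD, feature_flags::SVE);
}
```

The definitions repeat the type and qualified name but not the initialiser, which stays in the class body.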

Re: [PATCH 1/2] aarch64: Use fmov s/d/hN, FP_CST for some vector CST [PR113856]

2024-03-07 Thread Richard Sandiford
Andrew Pinski writes: > Aarch64 has a way to form some floating point CSTs via the fmov instructions, > these instructions also zero out the upper parts of the registers so they can > be used for vector CSTs that have one non-zero constant that would be > able to be formed via the fmov in the

Re: [PATCH 1/2] aarch64: Use fmov s/d/hN, FP_CST for some vector CST [PR113856]

2024-03-07 Thread Richard Sandiford
Richard Sandiford writes: > Andrew Pinski writes: >> Aarch64 has a way to form some floating point CSTs via the fmov instructions, >> these instructions also zero out the upper parts of the registers so they can >> be used for vector CSTs that have one non-zero c

Re: [PATCH 2/2] aarch64: Support `{1.0f, 1.0f, 0.0, 0.0}` CST forming with fmov with a smaller vector type.

2024-03-07 Thread Richard Sandiford
Andrew Pinski writes: > This enables construction of V4SF CST like `{1.0f, 1.0f, 0.0f, 0.0f}` > (and other fp enabled CSTs) by using `fmov v0.2s, 1.0` as the instruction > is designed to zero out the other bits. > This is a small extension on top of the code that creates fmov for the case > where

Re: [PATCH] aarch64: Fix costing of manual bfi instructions

2024-03-07 Thread Richard Sandiford
Andrew Pinski writes: > This fixes the cost model for BFI instructions which don't > use directly zero_extract on the LHS. > aarch64_bfi_rtx_p does the heavy lifting by matching of > the patterns. > > Note this alone does not fix PR 107270, it is a step in the right > direction. There we get z zer

Re: [PATCH] AArch64: memcpy/memset expansions should not emit LDP/STP [PR113618]

2024-03-07 Thread Richard Sandiford
Wilco Dijkstra writes: > Hi Richard, > >> It looks like this is really doing two things at once: disabling the >> direct emission of LDP/STP Qs, and switching the GPR handling from using >> pairs of DImode moves to single TImode moves.  At least, that seems to be >> the effect of... > > No it stil

Re: [PATCH 2/2] aarch64: Add support for _BitInt

2024-03-07 Thread Richard Sandiford
"Andre Vieira (lists)" writes: > Hey, > > Dropped the first patch and dealt with the comments above, hopefully I > didn't miss any this time. > > -- > > This patch adds support for C23's _BitInt for the AArch64 port when > compiling > for little endianness. Big E

Re: [PATCH] libatomic: Fix build for --disable-gnu-indirect-function [PR113986]

2024-03-07 Thread Richard Sandiford
Wilco Dijkstra writes: > Fix libatomic build to support --disable-gnu-indirect-function on AArch64. > Always build atomic_16.S and add aliases to the __atomic_* functions if > !HAVE_IFUNC. This description is too brief for me. Could you say in detail how the new scheme works? E.g. the descripti

Re: [r14-9173 Regression] FAIL: gcc.dg/tree-ssa/andnot-2.c scan-tree-dump-not forwprop3 "_expr" on Linux/x86_64

2024-03-07 Thread Richard Sandiford
Sorry, still catching up on email, but: Richard Biener writes: > We have optimize_vectors_before_lowering_p but we shouldn't even there > turn supported into not supported ops and as said, what's supported or > not cannot be finally decided (if it's only vcond and not vcond_mask > that is suppor

Re: [PATCH] gomp: testsuite: improve compatibility of bad-array-section-3.c [PR113428]

2024-03-08 Thread Richard Sandiford
Richard Earnshaw writes: > This test generates different warnings on ilp32 targets because the size > of an integer matches the size of a pointer. Avoid this by using > signed char. > > gcc/testsuite: > > PR testsuite/113428 > * gcc.dg/gomp/bad-array-section-c-3.c: Use signed char ins

Re: [PATCH] aarch64: Add +lse128 architectural extension command-line flag

2024-03-26 Thread Richard Sandiford
Victor Do Nascimento writes: > Given how, at present, the choice of using LSE128 atomic instructions > by the toolchain is delegated to run-time selection in the form of > Libatomic ifuncs, responsible for querying target support, the > `+lse128' target architecture compile-time flag is absent fro

Re: [pushed] aarch64: Define out-of-class static constants

2024-03-26 Thread Richard Sandiford
Vaseeharan Vinayagamoorthy writes: > Hi Richard, > > I think this patch is breaking the build of aarch64-none-elf and > aarch64-none-linux-gnu targets, when building with GCC 4.8. > This is not an issue when building with GCC 7.5. > > Kind regards, > Vasee Thanks. I pushed the attached patch to

Re: [PATCH v2] libstdc++: add ARM SVE support to std::experimental::simd

2024-03-27 Thread Richard Sandiford
Jonathan Wakely writes: > On Fri, 8 Mar 2024 at 09:58, Matthias Kretz wrote: >> >> Hi, >> >> I applied and did extended testing on x86_64 (no regressions) and aarch64 >> using qemu testing SVE 256, 512, and 1024. Looks good! >> >> While going through the applied patch I noticed a few style issues

Re: [PATCH v2] libstdc++: add ARM SVE support to std::experimental::simd

2024-03-27 Thread Richard Sandiford
Matthias Kretz writes: > On Wednesday, 27 March 2024 11:07:14 CET Richard Sandiford wrote: >> I'm still worried about: >> >> #if _GLIBCXX_SIMD_HAVE_SVE >> constexpr inline int __sve_vectorized_size_bytes = __ARM_FEATURE_SVE_BITS >>

Re: [PATCH] libstdc++: add ARM SVE support to std::experimental::simd

2024-03-27 Thread Richard Sandiford
Matthias Kretz writes: > Hi Richard, > > sorry for not answering sooner. I took action on your mail but failed to also > give feedback. Now in light of your veto of Srinivas patch I wanted to use > the > opportunity to pick this up again. > > On Dienstag, 23. Januar 2

Re: [PATCHv2 1/2] aarch64: Do not give ABI change diagnostics for _BitInt(N)

2024-03-28 Thread Richard Sandiford
"Andre Vieira (lists)" writes: > This patch makes sure we do not give ABI change diagnostics for the ABI > breaks of GCC 9, 13 and 14 for any type involving _BitInt(N), since that > type did not exist before this GCC version. > > ChangeLog: > > * config/aarch64/aarch64.cc (bitint_or_aggr_o

Re: [PATCHv2 2/2] aarch64: Add support for _BitInt

2024-03-28 Thread Richard Sandiford
"Andre Vieira (lists)" writes: > This patch adds support for C23's _BitInt for the AArch64 port when > compiling for little endianness. Big Endianness requires further > target-agnostic support and we therefore disable it for now. > > The tests expose some suboptimal codegen for which I'll creat

Re: [PATCHv2 2/2] aarch64: Add support for _BitInt

2024-03-28 Thread Richard Sandiford
Jakub Jelinek writes: > On Thu, Mar 28, 2024 at 03:00:46PM +0000, Richard Sandiford wrote: >> >* gcc.target/aarch64/bitint-alignments.c: New test. >> >* gcc.target/aarch64/bitint-args.c: New test. >> >* gcc.target/aarch64/bitint-sizes.c: New test. >>

[oops pushed] aarch64: Fix vld1/st1_x4 intrinsic definitions

2024-03-28 Thread Richard Sandiford
Gah. As mentioned on irc, I'd written this patch to fix PR114521. The bug was fixed properly by Jonathan's struct rework in GCC 12, but that's much too invasive to backport. The attached patch therefore deals with the bug directly. Since it's new work, and since there's only one GCC 11 release t

Re: [PATCH] libgcc: Add missing HWCAP entries to aarch64/cpuinfo.c

2024-04-02 Thread Richard Sandiford
Wilco Dijkstra writes: > A few HWCAP entries are missing from aarch64/cpuinfo.c. This results in > build errors > on older machines. > > This counts as a trivial build fix, but since it's late in stage 4 I'll let > maintainers chip in. > OK for commit? > > libgcc/ > * config/aarch64/cpuinf

Re: [PATCH] aarch64: Fix typo in comment about FEATURE_STRING

2024-04-02 Thread Richard Sandiford
Christophe Lyon writes: > Fix the comment to document FEATURE_STRING instead of FEAT_STRING. > > 2024-03-29 Christophe Lyon > > gcc/ > * config/aarch64/aarch64-option-extensions.def: Fix comment. OK, thanks. Richard > --- > gcc/config/aarch64/aarch64-option-extensions.def | 16 +

Re: [PATCH v2 2/3] aarch64: Add support for aarch64-gnu (GNU/Hurd on AArch64)

2024-04-02 Thread Richard Sandiford
Sergey Bugaev writes: > Coupled with a corresponding binutils patch, this produces a toolchain that > can > successfully build working binaries targeting aarch64-gnu. > > gcc/Changelog: > > * config.gcc: Recognize aarch64*-*-gnu* targets. > * config/aarch64/aarch64-gnu.h: New file. > >

Re: [PATCH V3 0/2] aarch64: Place target independent and dependent changed code in one file.

2024-04-03 Thread Richard Sandiford
Alex Coplan writes: > On 23/02/2024 16:41, Ajit Agarwal wrote: >> Hello Richard/Alex/Segher: > > Hi Ajit, > > Sorry for the delay and thanks for working on this. > > Generally this looks like the right sort of approach (IMO) but I've left > some comments below. > > I'll start with a meta comment:

Re: [PATCH] libatomic: Fix build for --disable-gnu-indirect-function [PR113986]

2024-04-04 Thread Richard Sandiford
Wilco Dijkstra writes: > v2: > > Fix libatomic build to support --disable-gnu-indirect-function on AArch64. > Always build atomic_16.S, add aliases to the __atomic_ functions if > !HAVE_IFUNC. > Include auto-config.h in atomic_16.S to avoid having to pass defines via > makefiles. > Fix build i

Re: [PATCH] libatomic: Cleanup macros in atomic_16.S

2024-04-04 Thread Richard Sandiford
Wilco Dijkstra writes: > As mentioned in > https://gcc.gnu.org/pipermail/gcc-patches/2024-March/648397.html , > do some additional cleanup of the macros and aliases: > > Cleanup the macros to add the libat_ prefixes in atomic_16.S. Emit the > alias to __atomic_ when ifuncs are not enabled in the

[pushed] aarch64: Recognise svundef idiom [PR114577]

2024-04-04 Thread Richard Sandiford
GCC 14 adds the header file arm_neon_sve_bridge.h to help interface SVE and Advanced SIMD code. One of the defined idioms is: svset_neonq (svundef_TYPE (), advsimd_vector) which simply reinterprets advsimd_vector as an SVE vector without regard for what's in the upper bits. GCC was failing to

[pushed] aarch64: Fix bogus cnot optimisation [PR114603]

2024-04-05 Thread Richard Sandiford
aarch64-sve.md had a pattern that combined: cmpeq pb.T, pa/z, zc.T, #0 mov zd.T, pb/z, #1 into: cnot zd.T, pa/m, zc.T But this is only valid if pa.T is a ptrue. In other cases, the original would set inactive elements of zd.T to 0, whereas the combined form wou
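
The difference between the two sequences can be modelled with a small scalar simulation. This is only a sketch: it is not GCC or ACLE code, the snippet above is cut off, and the lane-by-lane behaviour is assumed from the usual meaning of the `/z` (zeroing) and `/m` (merging) predicate suffixes. The names `vec`, `pred`, `cmpeq_then_mov` and `cnot_merging` are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t N = 4;
using vec = std::array<std::int32_t, N>;
using pred = std::array<bool, N>;

// Models:  cmpeq pb.T, pa/z, zc.T, #0
//          mov   zd.T, pb/z, #1
// A lane of pb is set only if the lane is active in pa and zc is zero
// there.  The zeroing move then writes 1 to lanes set in pb and 0 to
// all other lanes, including those that are inactive in pa.
vec
cmpeq_then_mov (const pred &pa, const vec &zc)
{
  vec zd{};
  for (std::size_t i = 0; i < N; ++i)
    {
      bool pb = pa[i] && zc[i] == 0;
      zd[i] = pb ? 1 : 0;
    }
  return zd;
}

// Models:  cnot zd.T, pa/m, zc.T
// Merging predication: lanes that are inactive in pa keep the value
// that zd held before the instruction.
vec
cnot_merging (vec zd, const pred &pa, const vec &zc)
{
  for (std::size_t i = 0; i < N; ++i)
    if (pa[i])
      zd[i] = zc[i] == 0 ? 1 : 0;
  return zd;
}
```

With an all-true predicate the two functions agree for any previous contents of `zd`. With a partial predicate they differ in exactly the inactive lanes, which is why the combination is only safe for a ptrue.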

Re: [pushed] aarch64: Fix bogus cnot optimisation [PR114603]

2024-04-08 Thread Richard Sandiford
Richard Biener writes: > On Fri, Apr 5, 2024 at 3:52 PM Richard Sandiford >> This isn't a regression on a known testcase. However, it's a nasty >> wrong code bug that could conceivably trigger for autovec code (although >> I've not been able to construct a r

Re: [PATCH] rtl-optimization/101523 - avoid re-combine after noop 2->2 combination

2024-04-08 Thread Richard Sandiford
Segher Boessenkool writes: > Hi! > > On Wed, Apr 03, 2024 at 01:07:41PM +0200, Richard Biener wrote: >> The following avoids re-walking and re-combining the instructions >> between i2 and i3 when the pattern of i2 doesn't change. >> >> Bootstrap and regtest running ontop of a reversal of >> r14-

Re: [PATCH] aarch64: Fix vld1/st1_x4 intrinsic test

2024-04-08 Thread Richard Sandiford
"Swinney, Jonathan" writes: > The test for this intrinsic was failing silently and so it failed to > report the bug reported in 114521. This patch modifies the test to > report the result. > > Bug report: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114521 > > Signed-off-by: Jonathan Swinney > ---

Re: [PATCH][wwwdocs] Add NEON-SVE bridge intrinsics to changes.html

2024-04-08 Thread Richard Sandiford
Richard Ball writes: > Hi all, > > Adding the NEON-SVE bridge intrinsics that were missed > in the last patch. > > Thanks, > Richard OK, thanks. Richard > diff --git a/htdocs/gcc-14/changes.html b/htdocs/gcc-14/changes.html > index > 9fd224c1df3f05eadcedaaa41c0859e712b93b78..df63af48298564de9c

[pushed] aarch64: Fix expansion of svsudot [PR114607]

2024-04-08 Thread Richard Sandiford
Not sure how this happened, but: svsudot is supposed to be expanded as USDOT with the operands swapped. However, a thinko in the expansion of svsudot meant that the arguments weren't in fact swapped; the attempted swap was just a no-op. And the testcases blithely accepted that. Tested on aarch64-

Re: [PATCH v2] aarch64: Fix ACLE SME streaming mode error in neon-sve-bridge

2024-04-09 Thread Richard Sandiford
Richard Ball writes: > When using LTO, handling the pragma for sme before the pragma > for the neon-sve-bridge caused the following error on svset_neonq, > in the neon-sve-bridge.c test. > > error: ACLE function '0' can only be called when SME streaming mode is > enabled. > > This has been resolv

Re: [PATCH 2/5] aarch64: Don't use FEAT_MAX as array length

2024-04-09 Thread Richard Sandiford
Andrew Carlotti writes: > There was an assumption in some places that the aarch64_fmv_feature_data > array contained FEAT_MAX elements. While this assumption held up till > now, it is safer and more flexible to use the array size directly. > > gcc/ChangeLog: > > * config/aarch64/aarch64.cc
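
The idea in this message can be illustrated with a small sketch, which is not the actual aarch64.cc code: loops over a table take their bound from the array itself, so they stay correct if the table ever has fewer entries than the enumeration. Only `FEAT_MAX` comes from the message; `feature`, `feature_data`, `feature_table`, `table_length` and `find_feature` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

enum feature { FEAT_A, FEAT_B, FEAT_C, FEAT_MAX };

struct feature_data
{
  const char *name;
  feature feat;
};

// Deliberately shorter than FEAT_MAX: FEAT_B has no entry here.
static const feature_data feature_table[] = {
  { "a", FEAT_A },
  { "c", FEAT_C },
};

// The number of entries is derived from the array, not from FEAT_MAX.
constexpr std::size_t
table_length ()
{
  return sizeof (feature_table) / sizeof (feature_table[0]);
}

// Returns the feature with the given name, or FEAT_MAX if there is none.
// The range-based for loop uses the array bound implicitly.
feature
find_feature (const char *name)
{
  for (const feature_data &entry : feature_table)
    if (std::strcmp (entry.name, name) == 0)
      return entry.feat;
  return FEAT_MAX;
}
```

A loop written as `for (int i = 0; i < FEAT_MAX; ++i)` over this table would read past its end, since the table has two entries and `FEAT_MAX` is 3.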

Re: [PATCH 0/5] aarch64: FMV feature list fixes

2024-04-09 Thread Richard Sandiford
Andrew Carlotti writes: > The first three patches are trivial changes to the feature list to reflect > recent changes in the ACLE. Patch 4 removes most of the FMV multiversioning > features that don't work at the moment, and should be entirely > uncontroversial. > > Patch 5 handles the remaining

Re: [PATCH]AArch64: Do not allow SIMD clones with simdlen 1 [PR113552][GCC 13/12/11 backport]

2024-04-09 Thread Richard Sandiford
Tamar Christina writes: > Hi All, > > This is a backport of g:306713c953d509720dc394c43c0890548bb0ae07. > > The AArch64 vector PCS does not allow simd calls with simdlen 1, > however due to a bug we currently do allow it for num == 0. > > This causes us to emit a symbol that doesn't exist and we f

Re: [PATCHv2 1/2] aarch64: Do not give ABI change diagnostics for _BitInt(N)

2024-04-10 Thread Richard Sandiford
"Andre Vieira (lists)" writes: > @@ -6907,6 +6938,11 @@ aarch64_layout_arg (cumulative_args_t pcum_v, const > function_arg_info &arg) > && (!alignment || abi_break_gcc_9 < alignment) > && (!abi_break_gcc_13 || alignment < abi_break_gcc_13)); > > + /* _BitInt(N) was only

Re: [PATCHv3 2/2] aarch64: Add support for _BitInt

2024-04-10 Thread Richard Sandiford
hanks. In truth I've not gone through the tests very thoroughly this time around, and just gone by the internal diff between this version and the previous one. But we can adjust them as necessary based on any reports that come in. Richard > > On 28/03/2024 15:21, Richard Sandiford wr

Re: [PATCH 0/5] aarch64: FMV feature list fixes

2024-04-10 Thread Richard Sandiford
Andrew Carlotti writes: > On Tue, Apr 09, 2024 at 04:43:16PM +0100, Richard Sandiford wrote: >> Andrew Carlotti writes: >> > The first three patches are trivial changes to the feature list to reflect >> > recent changes in the ACLE. Patch 4 removes most of th

Re: [PATCH v2 02/13] aarch64: The aarch64-w64-mingw32 target implements

2024-04-10 Thread Richard Sandiford
Sorry for the slow reply. Evgeny Karpov writes: > From: Zac Walker > Date: Fri, 1 Mar 2024 01:45:13 +0100 > Subject: [PATCH v2 02/13] aarch64: The aarch64-w64-mingw32 target implements > the MS ABI > > Two ABIs for aarch64 have been defined for different platforms. > > gcc/ChangeLog: > >

Re: [PATCH v2 03/13] aarch64: Mark x18 register as a fixed register for MS ABI

2024-04-10 Thread Richard Sandiford
Evgeny Karpov writes: > From: Zac Walker > Date: Fri, 1 Mar 2024 09:56:59 +0100 > Subject: [PATCH v2 03/13] aarch64: Mark x18 register as a fixed register for > MS ABI > > Define the MS ABI for aarch64-w64-mingw32. > Adjust FIXED_REGISTERS, CALL_REALLY_USED_REGISTERS and > STATIC_CHAIN_REGNUM fo

Re: [PATCH v2 04/13] aarch64: Add aarch64-w64-mingw32 COFF

2024-04-10 Thread Richard Sandiford
Evgeny Karpov writes: > From: Zac Walker > Date: Fri, 1 Mar 2024 01:55:47 +0100 > Subject: [PATCH v2 04/13] aarch64: Add aarch64-w64-mingw32 COFF > > Define ASM specific for COFF format on AArch64. > > gcc/ChangeLog: > > * config.gcc: Add COFF format support definitions. > * config/aa

Re: [PATCH v2 08/13] aarch64: Add Cygwin and MinGW environments for AArch64

2024-04-10 Thread Richard Sandiford
Evgeny Karpov writes: > From: Zac Walker > Date: Fri, 1 Mar 2024 10:49:28 +0100 > Subject: [PATCH v2 08/13] aarch64: Add Cygwin and MinGW environments for > AArch64 > > Define Cygwin and MinGW environment such as types, SEH definitions, > shared libraries, etc. > > gcc/ChangeLog: > > * con

Re: [PATCH v2 10/13] Rename "x86 Windows Options" to "Cygwin and MinGW Options"

2024-04-10 Thread Richard Sandiford
Evgeny Karpov writes: > From: Zac Walker > Date: Fri, 1 Mar 2024 02:17:39 +0100 > Subject: [PATCH v2 10/13] Rename "x86 Windows Options" to "Cygwin and MinGW > Options" > > Rename "x86 Windows Options" to "Cygwin and MinGW Options". > It will be used also for AArch64. > > gcc/ChangeLog: > >

Re: [PATCH v2 00/13] Add aarch64-w64-mingw32 target

2024-04-10 Thread Richard Sandiford
Evgeny Karpov writes: > Hello, > > v2 is ready for the review! > Based on the v1 review: > https://gcc.gnu.org/pipermail/gcc-patches/2024-February/thread.html#646203 > > Testing for the x86_64-w64-mingw32 target is in progress to avoid > regression due to refactoring. Thanks for the updates and

Re: [PATCH 0/5] aarch64: FMV feature list fixes

2024-04-10 Thread Richard Sandiford
Andrew Carlotti writes: > On Wed, Apr 10, 2024 at 05:42:05PM +0100, Richard Sandiford wrote: >> Andrew Carlotti writes: >> > On Tue, Apr 09, 2024 at 04:43:16PM +0100, Richard Sandiford wrote: >> >> Andrew Carlotti writes: >> >> > The first three pa

Re: [PATCH] aarch64: Preserve mem info on change of base for ldp/stp [PR114674]

2024-04-11 Thread Richard Sandiford
Alex Coplan writes: > Hi, > > The ldp/stp fusion pass can change the base of an access so that the two > accesses end up using a common base register. So far we have been using > adjust_address_nv to do this, but this means that we don't preserve > other properties of the mem we're replacing. It

Re: [PATCH v2 00/13] Add aarch64-w64-mingw32 target

2024-04-11 Thread Richard Sandiford
Evgeny Karpov writes: > Wednesday, April 10, 2024 8:40 PM > Richard Sandiford wrote: > >> Thanks for the updates and sorry again for the slow review. >> I've replied to some of the patches in the series but otherwise it looks >> good to >> me. >> >

Re: [PATCH] aarch64: Fix _BitInt testcases

2024-04-11 Thread Richard Sandiford
"Andre Vieira (lists)" writes: > This patch fixes some testisms introduced by: > > commit 5aa3fec38cc6f52285168b161bab1a869d864b44 > Author: Andre Vieira > Date: Wed Apr 10 16:29:46 2024 +0100 > > aarch64: Add support for _BitInt > > The testcases were relying on an unnecessary sign-extend

Re: [PATCH] aarch64: libgcc: Cleanup ELF marking in asm

2024-01-31 Thread Richard Sandiford
Szabolcs Nagy writes: > Use aarch64-asm.h in asm code consistently, this was started in > > commit c608ada288ced0268c1fd4136f56c34b24d4 > Author: Zac Walker > CommitDate: 2024-01-23 15:32:30 + > > Ifdef `.hidden`, `.type`, and `.size` pseudo-ops for `aarch64-w64-mingw32` > ta

Re: [PATCH] aarch64: -mstrict-align vs __arm_data512_t [PR113657]

2024-01-31 Thread Richard Sandiford
Andrew Pinski writes: > After r14-1187-gd6b756447cd58b, simplify_gen_subreg can return > NULL for "unaligned" memory subreg. Since V8DI has an alignment of 8 bytes, > using TImode causes simplify_gen_subreg to return NULL. > This fixes the issue by using DImode instead for the loop. And then we wi

Re: [PATCH] aarch64: Fix ICE in poly-int.h due to SLP.

2024-01-31 Thread Richard Sandiford
> From: Prathamesh Kulkarni > Sent: 30 January 2024 17:36 > To: Richard Ball > Cc: gcc-patches@gcc.gnu.org ; Richard Sandiford > ; Kyrylo Tkachov ; Richard > Earnshaw ; Marcus Shawcroft > > Subject: Re: [PATCH] aarch64: Fix ICE in poly-int.h due to SLP. > > On

Re: [PATCH 1/3] vect: Pass stmt_vec_info to TARGET_SIMD_CLONE_USABLE

2024-01-31 Thread Richard Sandiford
"Andre Vieira (lists)" writes: > [...] The question at hand > here is, what can the vectorizer use for a specific loop. If we are > using Advanced SIMD modes then it needs to call an Advanced SIMD clone, > and if we are using SVE modes then it needs to call an SVE clone. At > least until we su

Re: [PATCH]AArch64: update vget_set_lane_1.c test output

2024-02-01 Thread Richard Sandiford
Andrew Pinski writes: > On Thu, Feb 1, 2024 at 1:26 AM Tamar Christina > wrote: >> >> Hi All, >> >> In the vget_set_lane_1.c test the following entries now generate a zip1 >> instead of an INS >> >> BUILD_TEST (float32x2_t, float32x2_t, , , f32, 1, 0) >> BUILD_TEST (int32x2_t, int32x2_t, ,

Re: [PATCH]AArch64: update vget_set_lane_1.c test output

2024-02-01 Thread Richard Sandiford
Tamar Christina writes: >> -Original Message- >> From: Richard Sandiford >> Sent: Thursday, February 1, 2024 2:24 PM >> To: Andrew Pinski >> Cc: Tamar Christina ; gcc-patches@gcc.gnu.org; nd >> ; Richard Earnshaw ; Marcus >> Shawcroft ; Kyrylo

Re: [PATCH v2] middle-end: Fix ICE in poly-int.h due to SLP.

2024-02-01 Thread Richard Sandiford
Richard Ball writes: > Adds a check to ensure that the input vector arguments > to a function are not variable length. Previously, only the > output vector of a function was checked. > > The ICE in question is within the neon-sve-bridge.c test, > and is related to https://gcc.gnu.org/bugzilla/show

Re: [PATCH] aarch64: Fix ACLE SME streaming mode error in neon-sve-bridge

2024-02-01 Thread Richard Sandiford
Richard Ball writes: > When using LTO, handling the pragma for sme before the pragma > for the neon-sve-bridge caused the following error on svset_neonq, > in the neon-sve-bridge.c test. > > error: ACLE function '0' can only be called when SME streaming mode is > enabled. > > Handling the pragmas

Re: [PATCH] AArch64: Remove AARCH64_EXTRA_TUNE_NO_LDP_STP_QREGS

2024-02-01 Thread Richard Sandiford
Wilco Dijkstra writes: > (follow-on based on review comments on > https://gcc.gnu.org/pipermail/gcc-patches/2024-January/641913.html) > > > Remove the tune AARCH64_EXTRA_TUNE_NO_LDP_STP_QREGS since it is only > used by an old core and doesn't properly support -Os. SPECINT_2017 > shows that removi

Re: [PATCH v4] AArch64: Cleanup memset expansion

2024-02-01 Thread Richard Sandiford
Wilco Dijkstra writes: > Hi Richard, > >>> That tune is only used by an obsolete core. I ran the memcpy and memset >>> benchmarks from Optimized Routines on xgene-1 with and without LDP/STP. >>> There is no measurable penalty for using LDP/STP. I'm not sure why it was >>> ever added given it does

Re: [PATCH] AArch64: memcpy/memset expansions should not emit LDP/STP [PR113618]

2024-02-01 Thread Richard Sandiford
Wilco Dijkstra writes: > The new RTL introduced for LDP/STP results in regressions due to use of > UNSPEC. > Given the new LDP fusion pass is good at finding LDP opportunities, change the > memcpy, memmove and memset expansions to emit single vector loads/stores. > This fixes the regression and e

Re: [PATCH 3/3] aarch64: Add SVE support for simd clones [PR 96342]

2024-02-01 Thread Richard Sandiford
Andre Vieira writes: > This patch finalizes adding support for the generation of SVE simd clones when > no simdlen is provided, following the ABI rules where the widest data type > determines the minimum amount of elements in a length agnostic vector. > > gcc/ChangeLog: > > * config/aarch64/

Re: [PATCH V1] Common infrastructure for load-store fusion for aarch64 and rs6000 target

2024-02-14 Thread Richard Sandiford
Hi, Thanks for working on this. You posted a version of this patch on Sunday too. If you need to repost to fix bugs or make other improvements, could you describe the changes that you've made since the previous version? It makes things easier to follow. Also, sorry for starting with a meta dis

Re: [PATCH] middle-end/113576 - zero padding of vector bools when expanding compares

2024-02-14 Thread Richard Sandiford
Richard Biener writes: > The following zeros paddings of vector bools when expanding compares > and the mode used for the compare is an integer mode. In that case > targets cannot distinguish between a 4 element and 8 element vector > compare (both get to the QImode compare optab) so we have to d

Re: [PATCH][GCC 12] aarch64: Avoid out-of-range shrink-wrapped saves [PR111677]

2024-02-14 Thread Richard Sandiford
Alex Coplan writes: > This is a backport of the GCC 13 fix for PR111677 to the GCC 12 branch. > The only part of the patch that isn't a straight cherry-pick is due to > the TX iterator lacking TDmode for GCC 12, so this version adjusts > TX_V16QI accordingly. > > Bootstrapped/regtested on aarch64-
