On 13/01/2022 14:56, Christophe Lyon via Gcc-patches wrote:
The vmvnq_n* intrinsics have [u]int[16|32]_t arguments, so use
iterator instead of HI in mve_vmvnq_n_.
2022-01-13 Christophe Lyon
gcc/
* config/arm/mve.md (mve_vmvnq_n_): Use V_elem mode
for operand
On 13/01/2022 14:56, Christophe Lyon via Gcc-patches wrote:
VPR_REG is the only register in its class, so it should be handled by
TARGET_CLASS_LIKELY_SPILLED_P, which is achieved by calling
default_class_likely_spilled_p. No test fails without this patch, but
it seems it should be
On 19/01/2022 11:04, Richard Biener wrote:
On Tue, 18 Jan 2022, Andre Vieira (lists) wrote:
On 14/01/2022 09:57, Richard Biener wrote:
The 'used_vector_modes' is also a heuristic by itself since it registers
every vector type we query, not only those that are used in the end ...
So it's
On 20/01/2022 09:14, Christophe Lyon wrote:
On Wed, Jan 19, 2022 at 7:18 PM Andre Vieira (lists) via Gcc-patches
wrote:
Hi Christophe,
On 13/01/2022 14:56, Christophe Lyon via Gcc-patches wrote:
> At some point during the development of this patch series, it
appea
On 20/01/2022 10:40, Richard Sandiford wrote:
"Andre Vieira (lists)" writes:
On 20/01/2022 09:14, Christophe Lyon wrote:
On Wed, Jan 19, 2022 at 7:18 PM Andre Vieira (lists) via Gcc-patches
wrote:
Hi Christophe,
On 13/01/2022 14:56, Christophe Lyon via Gcc-pat
On 20/01/2022 10:45, Richard Sandiford wrote:
"Andre Vieira (lists)" writes:
On 13/01/2022 14:56, Christophe Lyon via Gcc-patches wrote:
The vmvnq_n* intrinsics have [u]int[16|32]_t arguments, so use
iterator instead of HI in mve_vmvnq_n_.
2022-01-13 Christophe Lyon
This time to the list too (sorry for double email)
Hi,
The original patch '[vect] Re-analyze all modes for epilogues', skipped
modes that should not be skipped since it used the vector mode provided
by autovectorize_vector_modes to derive the minimum VF required for it.
However, those modes
Hi,
This addresses the compile-time increase seen in the PR target/105157.
This was being caused by selecting the wrong core tuning, as when we
added the latest AArch64 CPUs the TARGET_CPU_generic tuning was pushed beyond
the 0x3f mask we used to encode both target cpu and attributes into
On 08/04/2022 08:04, Richard Sandiford wrote:
I think this would be better as a static assert at the top level:
static_assert (TARGET_CPU_generic < TARGET_CPU_MASK,
"TARGET_CPU_NBITS is big enough");
The motivation being that you want this to be checked regardless of
This patch adds a machine_mode parameter to the TARGET_SIMD_CLONE_USABLE
hook to enable rejecting SVE modes when the target architecture does not
support SVE.
gcc/ChangeLog:
* config/aarch64/aarch64.cc (aarch64_simd_clone_usable): Add mode
parameter and use it to reject SVE
Teach parloops how to handle a poly nit and bound ahead of the changes
to enable non-constant simdlen.
gcc/ChangeLog:
* tree-parloops.cc (try_to_transform_to_exit_first_loop_alt): Accept
poly NIT and ALT_BOUND.
diff --git a/gcc/tree-parloops.cc b/gcc/tree-parloops.cc
index
When analyzing a loop and choosing a simdclone to use it is possible to
choose a simdclone that cannot be used 'inbranch' for a loop that can
use partial vectors. This may lead to the vectorizer deciding to use
partial vectors which are not supported for notinbranch simd clones.
This patch
This patch enables the compiler to use inbranch simdclones when
generating masked loops in autovectorization.
gcc/ChangeLog:
* omp-simd-clone.cc (simd_clone_adjust_argument_types): Make function
compatible with mask parameters in clone.
* tree-vect-stmts.cc
Hi,
This patch series aims to implement support for SVE simd clones when not
specifying a 'simdlen' clause for AArch64. This patch depends on my
earlier patch: '[PATCH] aarch64: enable mixed-types for aarch64 simdclones'.
Bootstrapped and regression tested the series on
SVE simd clones must be compiled with an SVE target enabled, or the
argument types will not be created properly. To achieve this we need to
copy DECL_FUNCTION_SPECIFIC_TARGET from the original function
declaration to the clones. I decided it was probably also a good idea
to copy
The vect_get_smallest_scalar_type helper function was using any argument
to a simd clone call when trying to determine the smallest scalar type
that would be vectorized. This included the function pointer type in a
MASK_CALL for instance, and would result in the wrong type being
selected.
This patch finalizes adding support for the generation of SVE simd
clones when no simdlen is provided, following the ABI rules where the
widest data type determines the minimum amount of elements in a length
agnostic vector.
gcc/ChangeLog:
* config/aarch64/aarch64-protos.h
This patch adds a new target hook to enable us to adapt the types of
return and parameters of simd clones. We use this in two ways, the
first one is to make sure we can create valid SVE types, including the
SVE type attribute, when creating a SVE simd clone, even when the target
options do
Forgot to CC this one to maintainers...
On 30/08/2023 10:14, Andre Vieira (lists) via Gcc-patches wrote:
This patch adds a machine_mode parameter to the TARGET_SIMD_CLONE_USABLE
hook to enable rejecting SVE modes when the target architecture does not
support SVE.
gcc/ChangeLog
On 30/08/2023 14:01, Richard Biener wrote:
On Wed, Aug 30, 2023 at 11:15 AM Andre Vieira (lists) via Gcc-patches
wrote:
This patch adds a machine_mode parameter to the TARGET_SIMD_CLONE_USABLE
hook to enable rejecting SVE modes when the target architecture does not
support SVE.
How does
Hi,
This patch enables the use of mixed-types for simd clones for AArch64,
adds aarch64 as a target_vect_simd_clones target and corrects the way the
simdlen is chosen for non-specified simdlen clauses according to the
'Vector Function Application Binary Interface Specification for AArch64'.
On 26/04/2022 16:12, Jakub Jelinek wrote:
On Tue, Apr 26, 2022 at 03:43:13PM +0100, Richard Sandiford via Gcc-patches
wrote:
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/pr105219-2.c
@@ -0,0 +1,29 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -march=armv8.2-a -mtune=thunderx
On 26/04/2022 15:43, Richard Sandiford wrote:
"Andre Vieira (lists)" writes:
Hi,
This patch disables epilogue vectorization when we are peeling for
alignment in the prologue and we can't guarantee the main vectorized
loop is entered. This is to prevent executing vecto
Hi,
This patch disables epilogue vectorization when we are peeling for
alignment in the prologue and we can't guarantee the main vectorized
loop is entered. This is to prevent executing vectorized code with an
unaligned access if the target has indicated it wants to peel for
alignment. We
Hi,
This patch teaches the aarch64 backend to improve codegen when using dup
with NEON vectors with repeating patterns. It will attempt to use a
smaller NEON vector (or element) to limit the number of instructions
needed to construct the input vector.
Bootstrapped and regression tested
Hi Prathamesh,
I am just looking at this as it interacts with a change I am trying to
make, but I'm not a reviewer so take my comments with a pinch of salt ;)
I copied in bits of your patch below to comment.
> -@deftypefn {Target Hook} bool TARGET_VECTORIZE_VEC_PERM_CONST
(machine_mode
On 29/06/2022 08:18, Richard Sandiford wrote:
+ break;
+case AARCH64_RBIT:
+case AARCH64_RBITL:
+case AARCH64_RBITLL:
+ if (mode == SImode)
+ icode = CODE_FOR_aarch64_rbitsi;
+ else
+ icode = CODE_FOR_aarch64_rbitdi;
+ break;
+default:
+
On 17/06/2022 11:54, Richard Sandiford wrote:
"Andre Vieira (lists)" writes:
Hi,
This patch adds support for the ACLE Data Intrinsics to the AArch64 port.
Bootstrapped and regression tested on aarch64-none-linux.
OK for trunk?
Sorry for the slow review.
Hi,
This patch adds support for the ACLE Data Intrinsics to the AArch64 port.
Bootstrapped and regression tested on aarch64-none-linux.
OK for trunk?
gcc/ChangeLog:
2022-06-10 Andre Vieira
* config/aarch64/aarch64.md (rbit2): Rename this ...
(@aarch64_rbit): ... this and
Hi,
This is a RFC for my prototype for bitfield read vectorization. This
patch enables bit-field read vectorization by removing the rejection of
bit-field read's during DR analysis and by adding two vect patterns. The
first one transforms COMPONENT_REFs with DECL_BIT_FIELD fields into
Hi Richard,
Thanks for the review, I don't completely understand all of the below,
so I added some extra questions to help me understand :)
On 27/07/2022 12:37, Richard Biener wrote:
On Tue, 26 Jul 2022, Andre Vieira (lists) wrote:
I don't think this is a good approach for what you gain
On 27/04/2022 07:35, Richard Biener wrote:
On Tue, 26 Apr 2022, Richard Sandiford wrote:
"Andre Vieira (lists)" writes:
Hi,
This patch disables epilogue vectorization when we are peeling for
alignment in the prologue and we can't guarantee the main vectorized
loop
On 27/04/2022 15:03, Richard Biener wrote:
On Wed, 27 Apr 2022, Richard Biener wrote:
The following makes sure to take into account prologue peeling
when trying to narrow down the maximum number of iterations
computed for the epilogue of a vectorized epilogue.
Bootstrap & regtest running on
On 17/08/2022 13:49, Richard Biener wrote:
Yes, of course. What you need to do is subtract DECL_FIELD_BIT_OFFSET
of the representative from DECL_FIELD_BIT_OFFSET of the original bitfield
access - that's the offset within the representative (by construction
both fields share DECL_FIELD_OFFSET).
On 27/09/2022 13:34, Richard Biener wrote:
On Mon, 26 Sep 2022, Andre Vieira (lists) wrote:
On 08/09/2022 12:51, Richard Biener wrote:
I'm curious, why the push to redundant_ssa_names? That could use
a comment ...
So I purposefully left a #if 0 #else #endif in there so you can see the two
.c: New test.
* gcc.dg/vect/vect-bitfield-write-4.c: New test.
* gcc.dg/vect/vect-bitfield-write-5.c: New test.
On 28/09/2022 10:43, Andre Vieira (lists) via Gcc-patches wrote:
On 27/09/2022 13:34, Richard Biener wrote:
On Mon, 26 Sep 2022, Andre Vieira (lists) wrote:
On 08/09
The ifcvt dead code elimination code was not built to deal with inline
assembly, as loops with such would never be if-converted in the past since
we can't do data-reference analysis on them and vectorization would
eventually fail.
For this reason we now also do not lower bitfields if the
On 24/10/2022 13:46, Richard Biener wrote:
On Mon, 24 Oct 2022, Andre Vieira (lists) wrote:
On 24/10/2022 08:17, Richard Biener wrote:
Can you check why vect_find_stmt_data_reference doesn't trip on the
if (TREE_CODE (DR_REF (dr)) == COMPONENT_REF
&& DECL_B
On 24/10/2022 14:29, Richard Biener wrote:
On Mon, 24 Oct 2022, Andre Vieira (lists) wrote:
Changing if-convert would merely change this testcase but we could still
trigger using a different structure type, changing the size of Int24 to 32
bits rather than 24:
package Loop_Optimization23_Pkg
Hi,
The ada failure reported in the PR was being caused by
vect_check_gather_scatter failing to deal with bit offsets that weren't
multiples of BITS_PER_UNIT. This patch makes vect_check_gather_scatter
reject memory accesses with such offsets.
Bootstrapped and regression tested on aarch64
On 24/10/2022 08:17, Richard Biener wrote:
Can you check why vect_find_stmt_data_reference doesn't trip on the
if (TREE_CODE (DR_REF (dr)) == COMPONENT_REF
&& DECL_BIT_FIELD (TREE_OPERAND (DR_REF (dr), 1)))
{
free_data_ref (dr);
return opt_result::failure_at
Hi,
The 'vect_recog_bitfield_ref_pattern' was not correctly adapting the
vectype when widening the container.
I thought the original tests covered that code-path but they didn't, so
I added a new run-test that covers it too.
Bootstrapped and regression tested on x86_64 and aarch64.
Ping.
On 25/08/2022 10:09, Andre Vieira (lists) via Gcc-patches wrote:
On 17/08/2022 13:49, Richard Biener wrote:
Yes, of course. What you need to do is subtract DECL_FIELD_BIT_OFFSET
of the representative from DECL_FIELD_BIT_OFFSET of the original
bitfield
access - that's the offset
On 08/09/2022 12:51, Richard Biener wrote:
I'm curious, why the push to redundant_ssa_names? That could use
a comment ...
So I purposefully left a #if 0 #else #endif in there so you can see the
two options. But the reason I used redundant_ssa_names is because ifcvt
seems to use that as a
Hi all,
Can I backport this to gcc-11 branch? Also applies cleanly (with the
exception of the file extensions being different: 'aarch64-builtins.cc
vs aarch64-builtins.c').
Bootstrapped and regression tested on aarch64-linux-gnu.
Kind regards,
Andre Vieira
Hi,
The original patch supported matching the
vect_recog_bitfield_ref_pattern for
BIT_FIELD_REFs where the first operand didn't have an INTEGRAL_TYPE_P type.
That means it would also match vectors, leading to regressions in
targets that
supported vectorization of those.
Bootstrapped and
Hi,
The bitposition calculation for the bitfield lowering in loop if
conversion was not
taking DECL_FIELD_OFFSET into account, which meant that it would result in
wrong bitpositions for bitfields that did not end up having representations
starting at the beginning of the struct.
Bootstrapped
Added some extra comments to describe what is going on there.
On 13/10/2022 09:14, Richard Biener wrote:
On Wed, 12 Oct 2022, Andre Vieira (lists) wrote:
Hi,
The bitposition calculation for the bitfield lowering in loop if conversion
was not
taking DECL_FIELD_OFFSET into account, which meant
On 13/10/2022 15:15, Richard Biener wrote:
On Thu, 13 Oct 2022, Andre Vieira (lists) wrote:
Hi Rainer,
Thanks for reporting, I was actually expecting these! I thought about
pre-empting them by using a positive filter on the tests for aarch64 and
x86_64 as I knew those would pass, but I
Hi Rainer,
Thanks for reporting, I was actually expecting these! I thought about
pre-empting them by using a positive filter on the tests for aarch64 and
x86_64 as I knew those would pass, but I thought it would be better to
let other targets report failures since then you get a testsuite
to
commit on Friday in case something breaks over the weekend, so I'll
leave it until Monday.
Thanks,
Andre
On 29/09/2022 08:54, Richard Biener wrote:
On Wed, Sep 28, 2022 at 7:32 PM Andre Vieira (lists) via Gcc-patches
wrote:
Made the change and also created the ChangeLogs.
OK if bootstrap
Hi,
This patch series is a work in progress towards getting the compiler to
generate better code for constructors and dups in both NEON and SVE
targets. It first changes the backend to use rtx_vector_builder for
vector_init's. Then it is followed by some preparation passes to better
handle
Hi,
This patch changes aarch64_expand_vector_init to use rtx_vector_builder,
exploiting its internal pattern detection to find 'dup' patterns.
Bootstrapped and regression tested on aarch64-none-linux-gnu.
Is this OK for trunk or should we wait for the rest of the series?
gcc/ChangeLog:
Hi,
This enables and makes it more likely the compiler is able to use GPR
input for SIMD inserts. I believe this is some outdated hack we used to
prevent costly GPR<->SIMD register file swaps. This patch is required
for better codegen in situations like the test case 'int8_3' in the next
This isn't really a 'PATCH' yet, it's something I was working on but had
to put on hold. Feel free to re-use any bits or trash all of it if you'd
like.
diff --git a/gcc/config/aarch64/aarch64-sve-builtins-base.cc
b/gcc/config/aarch64/aarch64-sve-builtins-base.cc
index
Hi,
This patch is part of the WIP patch that follows in this series. Its
goal is to teach forwprop to handle VLA VEC_PERM_EXPRs with VLS
CONSTRUCTORs as arguments as preparation for the 'VLA constructor' hook
approach.
Kind Regards,
Andre
diff --git a/gcc/match.pd
index
OK to backport this to gcc-12? It applies cleanly and I did a bootstrap and
regression test on aarch64-linux-gnu.
Regards,
Andre
On 01/07/2022 12:26, Richard Sandiford wrote:
"Andre Vieira (lists)" writes:
On 29/06/2022 08:18, Richard Sandiford wrote:
+ break;
+case AA
Hi,
New version of the patch attached, but haven't recreated the ChangeLog
yet, just waiting to see if this is what you had in mind. See also some
replies to your comments in-line below:
On 09/08/2022 15:34, Richard Biener wrote:
@@ -2998,7 +3013,7 @@ ifcvt_split_critical_edges (class loop
Hi,
So I've changed the approach from the RFC as suggested, moving the
bitfield lowering to the if-convert pass.
So to reiterate, ifcvt will lower COMPONENT_REF's with DECL_BIT_FIELD
field's to either BIT_FIELD_REF if they are reads or BIT_INSERT_EXPR if
they are writes, using loads and
On 29/07/2022 11:31, Jakub Jelinek wrote:
On Fri, Jul 29, 2022 at 09:57:29AM +0100, Andre Vieira (lists) via Gcc-patches
wrote:
The 'only on the vectorized code path' remains the same though as vect_recog
also only happens on the vectorized code path right?
if conversion (in some cases
On 29/07/2022 11:52, Richard Biener wrote:
On Fri, 29 Jul 2022, Jakub Jelinek wrote:
On Fri, Jul 29, 2022 at 09:57:29AM +0100, Andre Vieira (lists) via Gcc-patches
wrote:
The 'only on the vectorized code path' remains the same though as vect_recog
also only happens on the vectorized code
wrote:
-Original Message-
From: Richard Sandiford
Sent: Tuesday, November 15, 2022 6:05 PM
To: Andre Simoes Dias Vieira
Cc: gcc-patches@gcc.gnu.org; Kyrylo Tkachov ;
Richard Earnshaw
Subject: Re: [PATCH 2/2] aarch64: Add support for widening LDAPR
instructions
"Andre Vieira (lists)"
On 11/11/2022 17:40, Stam Markianos-Wright via Gcc-patches wrote:
Hi all,
This is the 2/2 patch that contains the functional changes needed
for MVE Tail Predicated Low Overhead Loops. See my previous email
for a general introduction of MVE LOLs.
This support is added through the already
ping. (reattaching patch in the hopes patchwork picks it up).
On 13/01/2023 16:05, Andre Simoes Dias Vieira via Gcc-patches wrote:
Hi,
This patch adds the memory operand of MVE masked stores as input operands to
mimic the 'partial' writes, to prevent erroneous write-after-write
optimizations
Hi,
This patch teaches GCC that zero-extending a MVE predicate from 16-bits
to 32-bits and then only using 16-bits is a no-op.
It does so in two steps:
- it lets gcc know that it can access any MVE predicate mode using any
other MVE predicate mode without needing to copy it, using the
Hi,
This patch fixes the way we synthesize MVE predicate immediates and
fixes some other inconsistencies around predicates. For instance this
patch fixes the modes used in the vctp intrinsics, to couple them with
predicate modes with the appropriate lane numbers. For this V2QI is
added to
Hi,
The ACLE defines mve_pred16_t as an unsigned short. This patch makes
sure GCC treats the predicate as an unsigned type, rather than signed.
Bootstrapped on aarch64-none-eabi and regression tested on arm-none-eabi
and armeb-none-eabi for armv8.1-m.main+mve.fp.
OK for trunk?
Hi all,
This patch series aims to fix two or three (depends on how you look at
it) regressions that came about in gcc 11. The first and third patch
address wrong-codegen regressions and the second a performance
regression. Patch two makes a change to the mid-end so I can understand
if there
I meant bootstrapped on aarch64-none-linux-gnu and not none-eabi.
On 24/01/2023 13:40, Andre Vieira (lists) via Gcc-patches wrote:
Hi,
The ACLE defines mve_pred16_t as an unsigned short. This patch makes
sure GCC treats the predicate as an unsigned type, rather than signed.
Bootstrapped
Hi,
This patch adds aarch64 to the list of vect_long_long targets.
Regression tested on aarch64-none-linux-gnu.
OK for trunk?
gcc/testsuite/ChangeLog:
* lib/target-supports.exp (check_effective_target_vect_long_long): Add
aarch64 to list of targets supporting long long
Looks like the first patch was missing a change I had made to prevent
mve_bool_vec_to_const ICEing if called with a non-vector immediate. Now
included.
On 24/01/2023 13:56, Andre Vieira (lists) via Gcc-patches wrote:
Hi,
This patch fixes the way we synthesize MVE predicate immediates
On 26/01/2023 15:06, Kyrylo Tkachov wrote:
Hi Andre,
-Original Message-
From: Andre Vieira (lists)
Sent: Tuesday, January 24, 2023 1:54 PM
To: gcc-patches@gcc.gnu.org
Cc: Richard Sandiford ; Richard Earnshaw
; Richard Biener ;
Kyrylo Tkachov
Subject: [PATCH 2/3] arm: Remove
On 26/01/2023 15:02, Kyrylo Tkachov wrote:
Hi Andre,
-Original Message-
From: Andre Vieira (lists)
Sent: Tuesday, January 24, 2023 1:41 PM
To: gcc-patches@gcc.gnu.org
Cc: Kyrylo Tkachov ; Richard Earnshaw
Subject: [PATCH 1/3] arm: Fix sign of MVE predicate mve_pred16_t [PR
107674
Here's a new version with a more robust test.
OK for trunk?
On 27/01/2023 09:56, Kyrylo Tkachov wrote:
-Original Message-
From: Andre Vieira (lists)
Sent: Friday, January 27, 2023 9:54 AM
To: Kyrylo Tkachov ; gcc-patches@gcc.gnu.org
Cc: Richard Earnshaw
Subject: Re: [PATCH 1/3
Changed the testcase to be more robust (as per the discussion for the
first patch).
Still need the OK for the mid-end (simplify-rtx) part.
Kind regards,
Andre
On 27/01/2023 09:59, Kyrylo Tkachov wrote:
-Original Message-
From: Andre Vieira (lists)
Sent: Friday, January 27, 2023 9
This applies cleanly to gcc-12 and regressions for arm-none-eabi look clean.
OK to apply to gcc-12?
On 06/12/2022 11:23, Kyrylo Tkachov wrote:
-Original Message-
From: Andre Simoes Dias Vieira
Sent: Tuesday, December 6, 2022 11:19 AM
To: 'gcc-patches@gcc.gnu.org'
Cc: Kyrylo
Sorry for the delay, just been reminded I still had this patch
outstanding from last stage 1. Hopefully since it has been mostly
reviewed it could go in for this stage 1?
I addressed the comments and gave the slp-part of vectorizable_call some
TLC to make it work.
I also changed
Hi,
With Tamar's patch
(https://gcc.gnu.org/pipermail/gcc-patches/2022-November/604880.html)
enabling the vectorization of early-breaks, I'd like to allow bitfield
lowering in such loops, which requires the relaxation of allowing
multiple exits when doing so. In order to avoid a similar
Hi,
This patch adds support for the widening LDAPR instructions.
Bootstrapped and regression tested on aarch64-none-linux-gnu.
OK for trunk?
2022-11-09 Andre Vieira
Kyrylo Tkachov
gcc/ChangeLog:
* config/aarch64/atomics.md
(*aarch64_atomic_load_rcpc_zext): New
Hello,
This patch enables the use of LDAPR for load-acquire semantics. After
some internal investigation based on the work published by Podkopaev et
al. (https://dl.acm.org/doi/10.1145/3290382) we can confirm that using
LDAPR for the C++ load-acquire semantics is a correct relaxation.
On 07/11/2022 11:05, Richard Biener wrote:
On Fri, 4 Nov 2022, Andre Vieira (lists) wrote:
Sorry for the delay, just been reminded I still had this patch outstanding
from last stage 1. Hopefully since it has been mostly reviewed it could go in
for this stage 1?
I addressed the comments
On 07/11/2022 14:56, Richard Biener wrote:
On Mon, 7 Nov 2022, Andre Vieira (lists) wrote:
On 07/11/2022 11:05, Richard Biener wrote:
On Fri, 4 Nov 2022, Andre Vieira (lists) wrote:
Sorry for the delay, just been reminded I still had this patch outstanding
from last stage 1. Hopefully
on code generation.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/ldapr.c: New test.
On 10/11/2022 15:55, Kyrylo Tkachov wrote:
Hi Andre,
-Original Message-
From: Andre Vieira (lists)
Sent: Thursday, November 10, 2022 11:17 AM
To: gcc-patches@gcc.gnu.org
Cc: Kyrylo
On 14/11/2022 14:12, Kyrylo Tkachov wrote:
-Original Message-
From: Andre Vieira (lists)
Sent: Monday, November 14, 2022 2:09 PM
To: Kyrylo Tkachov ; gcc-patches@gcc.gnu.org
Cc: Richard Earnshaw ; Richard Sandiford
Subject: Re: [PATCH 1/2] aarch64: Enable the use of LDAPR for load
Updated version of the patch to account for the testsuite changes in the
first patch.
On 10/11/2022 11:20, Andre Vieira (lists) via Gcc-patches wrote:
Hi,
This patch adds support for the widening LDAPR instructions.
Bootstrapped and regression tested on aarch64-none-linux-gnu.
OK for trunk
Yeah, that shouldn't be there; it's from an earlier version of the patch
I wrote where I was experimenting with changing the existing modes. I'll
remove it from the ChangeLog.
On 31/01/2023 09:53, Kyrylo Tkachov wrote:
gcc/testsuite/ChangeLog:
* gcc.dg/rtl/arm/mve-vxbi.c: Use new
Hi all,
This is a series of patches/RFCs to implement support in GCC to be able
to target AArch64's libmvec functions that will be/are being added to glibc.
We have chosen to use the omp pragma '#pragma omp declare variant ...'
with a simd construct as the way for glibc to inform GCC what
Hi,
This patch modifies this function in parloops to allow it to handle
loops with poly iteration counts.
gcc/ChangeLog:
* tree-parloops.cc (try_transform_to_exit_first_loop_alt):
Handle poly nits.
Is this OK for Stage 1?
diff --git a/gcc/tree-parloops.cc b/gcc/tree-parloops.cc
Hi,
This patch makes sure we copy over
DECL_FUNCTION_SPECIFIC_{TARGET,OPTIMIZATION} in parloops when creating
function clones. This is required for SVE clones as we will need to
enable +sve for them, regardless of the current target options.
I don't actually need the 'OPTIMIZATION' for this
Hi,
This RFC extends the omp-simd-clone pass to create simd clones for
functions with 'omp declare variant' pragmas that contain simd
constructs. This patch also implements AArch64's use for this functionality.
This requires two extra pieces of information be kept for each
simd-clone, a
Hi,
This patch adds SVE support for simd clone generation when using 'omp
declare simd'. The design is based on what was discussed in PR 96342,
but I did not look at YangYang's patch as I wasn't sure of whether that
code's copyright had been assigned to FSF.
This patch also is not in
Hi,
This RFC is to propose relaxing the flag needed to allow the creation of
simd clones from omp declare variants, such that we can use
-fopenmp-simd rather than -fopenmp.
This should only change the behaviour of omp simd clones and should not
enable any other openmp functionality, though I
Hi,
This patch replaces the uses of simd_clone_subparts with
TYPE_VECTOR_SUBPARTS and removes the definition of the first.
gcc/ChangeLog:
* omp-simd-clone.cc (simd_clone_subparts): Remove.
(simd_clone_init_simd_arrays): Replace simd_clone_subparts with
TYPE_VECTOR_SUBPARTS.
Hi Richard,
I'm only picking this up now. Just going through your earlier comments
and stuff and I noticed we didn't address the situation with the
gimple::build. Do you want me to add overloaded static member functions
to cover all gimple_build_* functions, or just create one to replace
This patch fixes the condition check for eligibility of lowering bitfields.
Where before we would check for non-BLKmode types, in the hope of excluding
unsuitable aggregate types, we now check directly that the representative is
not an aggregate type, i.e. that it is suitable for a scalar register.
I tried
Hey both,
Sorry about that, don't know how I missed those. Just running a test on
that now and will commit when it's done. I assume the comment and 0 ->
byte change can be seen as obvious, especially since it was supposed to
be in my original patch...
On 27/02/2023 15:46, Richard Sandiford
Committed attached patch.
On 02/03/2023 10:13, Richard Sandiford wrote:
"Andre Vieira (lists)" writes:
Hey both,
Sorry about that, don't know how I missed those. Just running a test on
that now and will commit when it's done. I assume the comment and 0 ->
byte change can be se
On 01/03/2023 10:01, Andrew Stubbs wrote:
> On 28/02/2023 23:01, Kwok Cheung Yeung wrote:
>> Hello
>>
>> This patch implements the TARGET_VECTORIZE_BUILTIN_VECTORIZED_FUNCTION
>> target hook for the AMD GCN architecture, such that when vectorized,
>> calls to builtin standard math functions
Rebased all three patches and made some small changes to the second one:
- removed sub and abd optabs from commutative_optab_p, I suspect this
was a copy paste mistake,
- removed what I believe to be a superfluous switch case in vectorizable
conversion, the one that was here:
+ if
On 20/04/2023 15:51, Richard Sandiford wrote:
"Andre Vieira (lists)" writes:
Hi all,
This is a series of patches/RFCs to implement support in GCC to be able
to target AArch64's libmvec functions that will be/are being added to glibc.
We have chosen to use the omp pragma '#