Hi,
Bootstrapped and tested the gcc-13 backport of this on gcc-12 for
aarch64-unknown-linux-gnu and x86_64-pc-linux-gnu and no regressions.
OK to push to gcc-12 branch?
Kind regards,
Andre Vieira
On 10/11/2023 13:16, Richard Biener wrote:
The following fixes the issue that when SLP stmts
Hi Thiago,
Thanks for this, LGTM but I can't approve this, CC'ing Richard.
Do have a nitpick, in the gcc/testsuite/ChangeLog: remove
'gcc/testsuite' from bullet points 2-4.
Kind regards,
Andre
On 13/01/2024 00:55, Thiago Jung Bauermann wrote:
Since commits 2c3db94d9fd ("c: Turn
On 27/02/2024 08:47, Richard Biener wrote:
On Mon, 26 Feb 2024, Andre Vieira (lists) wrote:
On 05/02/2024 09:56, Richard Biener wrote:
On Thu, 1 Feb 2024, Andre Vieira (lists) wrote:
On 01/02/2024 07:19, Richard Biener wrote:
On Wed, 31 Jan 2024, Andre Vieira (lists) wrote
Hi,
After the backport of PR target/112787 a failure was reported against
x86_64, this would be fixed by backporting:
* tree-optimization/91838 - fix FAIL of g++.dg/opt/pr91838.C
(d1c072a1c3411a6fe29900750b38210af8451eeb)
* tree-optimization/110838 - less aggressively fold out-of-bound shifts
Hi,
Introduced a new patch to disable diagnostics for ABI breaks involving
_BitInt(N) given the type didn't exist, let me know what you think of that.
Also added further testing to replicate the ABI diagnostic tests to use
_BitInt(N).
Andre Vieira (2)
aarch64: Do not give ABI change
This patch makes sure we do not give ABI change diagnostics for the ABI
breaks of GCC 9, 13 and 14 for any type involving _BitInt(N), since that
type did not exist before this GCC version.
ChangeLog:
* config/aarch64/aarch64.cc (bitint_or_aggr_of_bitint_p): New function.
This patch adds support for C23's _BitInt for the AArch64 port when
compiling for little endianness. Big Endianness requires further
target-agnostic support and we therefore disable it for now.
The tests expose some suboptimal codegen for which I'll create PR's for
optimizations after this
This patch fixes some testisms introduced by:
commit 5aa3fec38cc6f52285168b161bab1a869d864b44
Author: Andre Vieira
Date: Wed Apr 10 16:29:46 2024 +0100
aarch64: Add support for _BitInt
The testcases were relying on an unnecessary sign-extend that is no longer
generated.
The tested
regards,
Andre
On 28/03/2024 12:54, Richard Sandiford wrote:
"Andre Vieira (lists)" writes:
This patch makes sure we do not give ABI change diagnostics for the ABI
breaks of GCC 9, 13 and 14 for any type involving _BitInt(N), since that
type did not exist before this GCC version.
Added the target check, also had to change some of the assembly checking
due to changes upstream, the assembly is still valid, but we do extend
where not necessary, I do believe that's a general issue though.
The _BitInt(N > 64) codegen for non-powers of 2 did get worse, we see
similar
Hi,
Patch to add AArch64 to the list of supported _BitInt(N) in
gcc-14/changes.html.
OK?
diff --git a/htdocs/gcc-14/changes.html b/htdocs/gcc-14/changes.html
index
a7ba957110183f906938d935bfa17aaed2ba20c8..55ab8c14c6d0b54e05a5f266f25c8ef1a4f959bf
100644
--- a/htdocs/gcc-14/changes.html
+++
Hi Christophe,
On 26/11/2020 15:31, Christophe Lyon wrote:
Hi Andre,
Thanks for the quick feedback.
On Wed, 25 Nov 2020 at 18:17, Andre Simoes Dias Vieira
wrote:
Hi Christophe,
Thanks for these! See some inline comments.
On 25/11/2020 13:54, Christophe Lyon via Gcc-patches wrote:
This
LGTM, but please wait for maintainer review.
On 25/11/2020 13:54, Christophe Lyon via Gcc-patches wrote:
This patch enables MVE veorq instructions for auto-vectorization. MVE
veorq insns in mve.md are modified to use xor instead of unspec
expression to support xor3. The xor3 expander is
On 08/12/2020 13:50, Christophe Lyon via Gcc-patches wrote:
Hi,
My 'vand' patch changes the definition of VDQ so that the relevant
modes are enabled only when !TARGET_HAVE_MVE (V8QI, ...), and this
helps writing a simpler expander.
However, vneg is used by vshr (right-shifts by register are
This patch introduces a vect.mul RTX cost and decouples the vector
multiplication costing from the scalar one.
After Wilco's "AArch64: Add cost table for Cortex-A76" patch we saw a
regression in vector codegen. Reproducible with the small test added in
this patch.
Upon further investigation
Same patch applies cleanly on gcc-8, bootstrapped
arm-none-linux-gnueabihf and ran regressions also clean.
Can I also commit it to gcc-8?
Thanks,
Andre
On 02/02/2021 17:36, Kyrylo Tkachov wrote:
-----Original Message-----
From: Andre Vieira (lists)
Sent: 02 February 2021 17:27
To: gcc
Hi,
This is a gcc-9 backport of the PR97528 fix that has been applied to
trunk and gcc-10.
Bootstrapped on arm-linux-gnueabihf and regression tested.
OK for gcc-9 branch?
2021-02-02 Andre Vieira
Backport from mainline
2020-11-20 Jakub Jelinek
PR target/97528
*
On 14/06/2021 11:57, Richard Biener wrote:
On Mon, 14 Jun 2021, Richard Biener wrote:
Indeed. For example a simple
int a[1024], b[1024], c[1024];
void foo(int n)
{
for (int i = 0; i < n; ++i)
a[i+1] += c[i+i] ? b[i+1] : 0;
}
should usually see peeling for alignment (though on x86
On 08/06/2021 16:00, Andre Simoes Dias Vieira via Gcc-patches wrote:
Hi Bin,
Thank you for the reply, I have some questions, see below.
On 07/06/2021 12:28, Bin.Cheng wrote:
On Fri, Jun 4, 2021 at 12:35 AM Andre Vieira (lists) via Gcc-patches
wrote:
Hi Andre,
I didn't look
Hi,
On 20/05/2021 11:22, Richard Biener wrote:
On Mon, 17 May 2021, Andre Vieira (lists) wrote:
Hi,
So this is my second attempt at finding a way to improve how we generate the
vector IV's and teach the vectorizer to share them between main loop and
epilogues. On IRC we discussed my idea
Hi,
Using aarch64_pred_mov for these was tricky as it did both store and
load. Furthermore there was some concern it might allow for a predicated
mov to end up as a mem -> mem and a predicated load being wrongfully
reloaded to a full-load to register. So instead we decided to let the
Hi,
So this is my second attempt at finding a way to improve how we generate
the vector IV's and teach the vectorizer to share them between main loop
and epilogues. On IRC we discussed my idea to use the loop's control_iv,
but that was a terrible idea and I quickly threw it in the bin. The
Hi,
I noticed we were missing out on LD1 + UXT combinations in some cases
and found it was because of inconsistent use of the unspec enum
UNSPEC_LD1_SVE. The combine pattern for LD1[S][BHWD] uses UNSPEC_LD1_SVE
whereas one of the LD1 expanders was using UNSPEC_PRED_X. I wasn't sure
whether
Hi,
When vectorizing with --param vect-partial-vector-usage=1 the vectorizer
uses an unpredicated (all-true predicate for SVE) main loop and a
predicated tail loop. The way this was implemented seems to mean it
re-uses the same vector-mode for both loops, which means the tail loop
isn't an
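A minimal sketch of the loop structure described above, written as scalar C that emulates vector lanes. The vectorization factor `VF`, the function name `add_arrays` and the per-lane `active` flag are assumptions made for illustration; this is not the vectorizer's output.

```c
#include <stddef.h>

enum { VF = 4 };  /* hypothetical vectorization factor */

/* dst[i] = a[i] + b[i] for i in [0, n).  */
void add_arrays (int *dst, const int *a, const int *b, size_t n)
{
  size_t i = 0;

  /* Main loop: every lane is active, so no predicate is needed.  */
  for (; i + VF <= n; i += VF)
    for (size_t lane = 0; lane < VF; lane++)
      dst[i + lane] = a[i + lane] + b[i + lane];

  /* Tail: one predicated iteration; lanes at or past n are masked off.  */
  if (i < n)
    for (size_t lane = 0; lane < VF; lane++)
      {
        int active = i + lane < n;  /* the predicate for this lane */
        if (active)
          dst[i + lane] = a[i + lane] + b[i + lane];
      }
}
```

In this sketch the tail uses the same lane count as the main loop, which mirrors the situation the message describes.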
Hi,
This RFC is motivated by the IV sharing RFC in
https://gcc.gnu.org/pipermail/gcc-patches/2021-May/569502.html and the
need to have the IVOPTS pass be able to clean up IV's shared between
multiple loops. When creating a similar problem with C code I noticed
IVOPTs treated IV's with uses
Thank you Kewen!!
I will apply this now.
BR,
Andre
On 25/05/2021 09:42, Kewen.Lin wrote:
on 2021/5/24 下午3:21, Kewen.Lin via Gcc-patches wrote:
Hi Andre,
on 2021/5/24 下午2:17, Andre Vieira (lists) via Gcc-patches wrote:
Hi,
When vectorizing with --param vect-partial-vector-usage=1
Streams got crossed there and used the wrong subject ...
On 03/06/2021 17:34, Andre Vieira (lists) via Gcc-patches wrote:
Hi,
This RFC is motivated by the IV sharing RFC in
https://gcc.gnu.org/pipermail/gcc-patches/2021-May/569502.html and the
need to have the IVOPTS pass be able to clean up
Hi,
The aim of this RFC is to explore a way of cleaning up the codegen
around data_references. To be specific, I'd like to reuse the
main-loop's updated data_reference as the base_address for the
epilogue's corresponding data_reference, rather than use the niters. We
have found this leads
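To illustrate the difference being discussed, here is a rough source-level sketch. Both functions compute the same sum; the first recomputes the epilogue's base address from the iteration count, the second carries on from the pointer the main loop already updated. `VF` and both function names are hypothetical, and the real change concerns the vectorizer's data_reference handling rather than source code like this.

```c
#include <stddef.h>

enum { VF = 4 };  /* hypothetical vectorization factor */

/* Epilogue base recomputed from the number of main-loop iterations.  */
long sum_recompute_base (const int *base, size_t n)
{
  long sum = 0;
  size_t niters_vector = n / VF;
  const int *p = base;
  for (size_t i = 0; i < niters_vector; i++, p += VF)
    for (size_t lane = 0; lane < VF; lane++)
      sum += p[lane];

  const int *q = base + niters_vector * VF;  /* derived from niters */
  for (size_t i = niters_vector * VF; i < n; i++)
    sum += *q++;
  return sum;
}

/* Epilogue reuses the pointer the main loop left behind.  */
long sum_reuse_pointer (const int *base, size_t n)
{
  long sum = 0;
  const int *p = base;
  for (size_t i = 0; i < n / VF; i++, p += VF)
    for (size_t lane = 0; lane < VF; lane++)
      sum += p[lane];

  for (const int *end = base + n; p < end; p++)
    sum += *p;
  return sum;
}
```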
?
On 04/05/2021 10:56, Richard Biener wrote:
On Fri, 30 Apr 2021, Andre Vieira (lists) wrote:
Hi,
The aim of this RFC is to explore a way of cleaning up the codegen around
data_references. To be specific, I'd like to reuse the main-loop's updated
data_reference as the base_address for the e
On 05/05/2021 13:34, Richard Biener wrote:
On Wed, 5 May 2021, Andre Vieira (lists) wrote:
I tried to see what IVOPTs would make of this and it is able to analyze the
IVs but it doesn't realize (not even sure it tries) that one IV's end (loop 1)
could be used as the base for the other (loop
Hi Christophe,
On 30/04/2021 15:09, Christophe Lyon via Gcc-patches wrote:
Since MVE has a different set of vector comparison operators from
Neon, we have to update the expansion to take into account the new
ones, for instance 'NE' for which MVE does not require to use 'EQ'
with the inverted
It would be good to also add tests for NEON as you also enable auto-vec
for it. I checked and I do think the necessary 'neon_vc' patterns exist
for 'VH', so we should be OK there.
On 30/04/2021 15:09, Christophe Lyon via Gcc-patches wrote:
This patch adds __fp16 support to the previous patch
Hi Christophe,
The series LGTM but you'll need the approval of an arm port maintainer
before committing. I only did code-review, did not try to build/run tests.
Kind regards,
Andre
On 30/04/2021 15:09, Christophe Lyon via Gcc-patches wrote:
This patch enables MVE vld4/vst4 instructions for
Hi,
As mentioned in the PR, this patch fixes up the nvectors parameter passed to
vect_get_loop_mask in vectorizable_condition.
Before the STMT_VINFO_VEC_STMTS rework we used to handle each ncopy separately,
now we gather them all at the same time and don't need to multiply vec_num with
On 05/02/2021 12:47, Richard Sandiford wrote:
"Andre Vieira (lists)" writes:
Hi,
As mentioned in the PR, this patch fixes up the nvectors parameter passed to
vect_get_loop_mask in vectorizable_condition.
Before the STMT_VINFO_VEC_STMTS rework we used to handle each ncopy separa
On 19/02/2021 15:05, Vladimir Makarov wrote:
On 2021-02-19 5:53 a.m., Andre Vieira (lists) wrote:
Hi,
This patch makes sure that allocno copies are not created for
unordered modes. The testcases in the PR highlighted a case where an
allocno copy was being created for:
(insn 121 120 123
Hi,
This patch prevents generating a vec_duplicate with an illegal predicate.
Regression tested on aarch64-linux-gnu.
OK for trunk?
gcc/ChangeLog:
2021-02-17 Andre Vieira
PR target/98657
* config/aarch64/aarch64-sve.md: Use 'expand_vector_broadcast'
to emit vec_duplicate's
Hi,
This patch makes sure that allocno copies are not created for unordered
modes. The testcases in the PR highlighted a case where an allocno copy
was being created for:
(insn 121 120 123 11 (parallel [
(set (reg:VNx2QI 217)
(vec_duplicate:VNx2QI (subreg/s/v:QI
Hi Alex,
On 22/02/2021 10:20, Alex Coplan wrote:
For the testcase, you might want to use the one I posted most recently:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98791#c3
which avoids the dependency on the aarch64-autovec-preference param
(which is in GCC 11 only) as this will simplify
Hi all,
This patch adds the ability to define a target hook to unroll the main
vectorized loop. It also introduces --param's vect-unroll and
vect-unroll-reductions to control this through a command-line. I found
this useful to experiment and believe it can help when tuning, so I decided
to
Hi all,
This patch series enables unrolling of an unpredicated main vectorized
loop based on a target hook. The epilogue loop will have (at least) half
the VF of the main loop and can be predicated.
Andre Vieira (3):
[vect] Add main vectorized loop unrolling
[vect] Consider outside costs
Hi,
This patch changes the order in which we check outside and inside costs
for epilogue loops, this is to ensure that a predicated epilogue is more
likely to be picked over an unpredicated one, since it saves having to
enter a scalar epilogue loop.
gcc/ChangeLog:
*
On 13/10/2021 13:37, Kyrylo Tkachov wrote:
Hi Andre,
@@ -24276,7 +24271,7 @@ arm_print_operand (FILE *stream, rtx x, int code)
else if (code == POST_MODIFY || code == PRE_MODIFY)
{
asm_fprintf (stream, "[%r", REGNO (XEXP (addr, 0)));
- postinc_reg =
.
* tree-vect-slp.c (vect_bb_vectorization_profitable_p): Adjust
call to finish_cost.
* tree-vectorizer.h (finish_cost): Change to pass new class
vec_info parameter.
On 01/10/2021 09:19, Richard Biener wrote:
On Thu, 30 Sep 2021, Andre Vieira (lists) wrote:
Hi,
That just forces
Hi,
The way we were previously dealing with addressing modes for MVE was
preventing
the use of pre, post and offset addressing modes for the normal loads and
stores, including widening and narrowing. This patch fixes that and
adds tests to ensure we are capable of using all the available
Hi,
I completely forgot I still had this patch out as well, I grouped it
together with the unrolling because it was what motivated the change,
but it is actually more widely applicable and can be reviewed separately.
On 17/09/2021 16:32, Andre Vieira (lists) via Gcc-patches wrote:
Hi,
This patch
On 19/10/2021 00:22, Joseph Myers wrote:
On Fri, 15 Oct 2021, Richard Biener via Gcc-patches wrote:
On Fri, Sep 24, 2021 at 2:59 PM Jirui Wu via Gcc-patches
wrote:
Hi,
Ping: https://gcc.gnu.org/pipermail/gcc-patches/2021-August/577846.html
The patch is attached as text for ease of use. Is
On 27/09/2021 12:54, Richard Biener via Gcc-patches wrote:
On Mon, 27 Sep 2021, Jirui Wu wrote:
Hi all,
I now use the type based on the specification of the intrinsic
instead of type based on formal argument.
I use signed Int vector types because the outputs of the neon builtins
that I am
On 15/10/2021 09:48, Richard Biener wrote:
On Tue, 12 Oct 2021, Andre Vieira (lists) wrote:
Hi Richi,
I think this is what you meant, I now hide all the unrolling cost calculations
in the existing target hooks for costs. I did need to adjust 'finish_cost' to
take the loop_vinfo so
Hi,
That just forces trying the vector modes we've tried before. Though I might
need to revisit this now I think about it. I'm afraid it might be possible for
this to generate an epilogue with a vf that is not lower than that of the main
loop, but I'd need to think about this again.
Either
Hi Richi,
Thanks for the review, see below some questions.
On 21/09/2021 13:30, Richard Biener wrote:
On Fri, 17 Sep 2021, Andre Vieira (lists) wrote:
Hi all,
This patch adds the ability to define a target hook to unroll the main
vectorized loop. It also introduces --param's vect-unroll
Made the suggested changes.
Regarding the name change to partial vectors, I agree with the name change
since that is the terminology we are using in the loop_vinfo members
too, but is there an actual difference between predication/masking and
partial vectors that I am missing?
OK for trunk?
Hi Richard,
Thank you for the review, I've adopted all above suggestions downstream,
I am still surprised how many style things I still miss after years of
gcc development :(
On 17/12/2021 12:44, Richard Sandiford wrote:
@@ -3252,16 +3257,31 @@ vectorizable_call (vec_info *vinfo,
On 16/11/2021 12:10, Richard Biener wrote:
On Fri, 12 Nov 2021, Andre Simoes Dias Vieira wrote:
On 12/11/2021 10:56, Richard Biener wrote:
On Thu, 11 Nov 2021, Andre Vieira (lists) wrote:
Hi,
This patch introduces two IFN's FTRUNC32 and FTRUNC64, the corresponding
optabs and mappings
Hi,
This is the rebased and reworked version of the unroll patch. I wasn't
entirely sure whether I should compare the costs of the unrolled
loop_vinfo with the original loop_vinfo it was unrolled from. I did now,
but I wasn't too sure whether it was a good idea to... Any thoughts on
this?
Hi,
This patch introduces two IFN's FTRUNC32 and FTRUNC64, the corresponding
optabs and mappings. It also creates a backend pattern to implement them
for aarch64 and a match.pd pattern to idiom recognize these.
These IFN's (and optabs) represent a truncation towards zero, as if
performed by
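As a rough illustration of the idiom involved (not code from the patch): a floating-point value converted to an integer and back is truncated towards zero, which is the source-level shape of a float-of-fix_trunc expression. The helper names below are hypothetical, and the result can differ from a true floating-point truncation in the sign of a zero result.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: truncate towards zero via a round trip through a
   32-bit integer.  Only meaningful when x fits in int32_t.  */
double trunc_via_int32 (double x)
{
  assert (x > -2147483649.0 && x < 2147483648.0);
  return (double) (int32_t) x;
}

/* Same idea through a 64-bit integer.  */
double trunc_via_int64 (double x)
{
  assert (x >= -9223372036854775808.0 && x < 9223372036854775808.0);
  return (double) (int64_t) x;
}
```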
Hi,
Committed this as obvious. My earlier patch removed the need for the GSI
to be used.
gcc/ChangeLog:
* config/aarch64/aarch64-builtins.c
(aarch64_general_gimple_fold_builtin): Mark argument as unused.
diff --git a/gcc/config/aarch64/aarch64-builtins.c
On 22/11/2021 12:39, Richard Biener wrote:
+ if (first_loop_vinfo->suggested_unroll_factor > 1)
+{
+ if (LOOP_VINFO_EPIL_USING_PARTIAL_VECTORS_P (first_loop_vinfo))
+ {
+ if (dump_enabled_p ())
+ dump_printf_loc (MSG_NOTE, vect_location,
+
On 18/11/2021 11:05, Richard Biener wrote:
+ (if (!flag_trapping_math
+ && direct_internal_fn_supported_p (IFN_TRUNC, type,
+OPTIMIZE_FOR_BOTH))
+ (IFN_TRUNC @0)
#endif
does IFN_FTRUNC_INT preserve the same exceptions as
On 24/11/2021 11:00, Richard Biener wrote:
On Wed, 24 Nov 2021, Andre Vieira (lists) wrote:
On 22/11/2021 12:39, Richard Biener wrote:
+ if (first_loop_vinfo->suggested_unroll_factor > 1)
+{
+ if (LOOP_VINFO_EPIL_USING_PARTIAL_VECTORS_P (first_loop
On 22/11/2021 11:41, Richard Biener wrote:
On 18/11/2021 11:05, Richard Biener wrote:
This is a good shout and made me think about something I hadn't before... I
thought I could handle the vector forms later, but the problem is if I add
support for the scalar, it will stop the vectorizer. It
On 12/11/2021 13:12, Richard Biener wrote:
On Thu, 11 Nov 2021, Andre Vieira (lists) wrote:
Hi,
This is the rebased and reworked version of the unroll patch. I wasn't
entirely sure whether I should compare the costs of the unrolled loop_vinfo
with the original loop_vinfo it was unrolled
On 18/11/2021 11:05, Richard Biener wrote:
@@ -3713,12 +3713,21 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
trapping behaviour, so require !flag_trapping_math. */
#if GIMPLE
(simplify
- (float (fix_trunc @0))
- (if (!flag_trapping_math
- && types_match (type, TREE_TYPE (@0))
-
Hi,
This fixes the alignment on the memory access type for neon loads &
stores in the gimple lowering. Bootstrap ubsan on aarch64 builds again
with this change.
2021-10-25 Andre Vieira
gcc/ChangeLog:
* config/aarch64/aarch64-builtins.c
(aarch64_general_gimple_fold_builtin):
Thank you both!
Here is a reworked version, this OK for trunk?
diff --git a/gcc/config/aarch64/aarch64-builtins.c b/gcc/config/aarch64/aarch64-builtins.c
index
a815e4cfbccab692ca688ba87c71b06c304abbfb..e06131a7c61d31c1be3278dcdccc49c3053c78cb
100644
--- a/gcc/config/aarch64/aarch64-builtins.c
Decided to split the patches up to make it clear that the testisms fixes
had nothing to do with the TBAA fix. I'll be committing these two separately
First:
[AArch64] Fix big-endian testisms introduced by NEON gimple lowering patch
This patch reverts the tests for big-endian after the NEON
And second (also added a test):
[AArch64] Fix TBAA information when lowering NEON loads and stores to gimple
This patch fixes the wrong TBAA information when lowering NEON loads and
stores
to gimple that showed up when bootstrapping with UBSAN.
gcc/ChangeLog:
*
Hi,
This should address the ubsan bootstrap build and big-endian testisms
reported against the last NEON load/store gimple lowering patch. I also
fixed a follow-up issue where the alias information was leading to a bad
codegen transformation. The NEON intrinsics specifications do not forbid
On 25/11/2021 12:46, Richard Biener wrote:
Oops, my fault, yes, it does. I would suggest to refactor things so
that the mode_i = first_loop_i case is there only once. I also wonder
if all the argument about starting at 0 doesn't apply to the
not unrolled
On 07/12/2021 11:45, Richard Biener wrote:
Can you check whether, given we know the main VF, the epilogue analysis
does not start with an autodetected vector mode that needs a too large VF?
Hmm struggling to see how we could check this here. AFAIU before we
analyze the loop for a given
Hi,
Added an extra step to skip unusable epilogue modes when we know the
target does not support predication. This uses a new function
'support_predication_p' that is generated at build time and checks
whether the target supports at least one optab that can be used for
predicated
(vect_better_loop_vinfo_p): Round factors up
for epilogue costing.
(vect_analyze_loop): Re-analyze all modes for epilogues.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/masked_epilogue.c: New test.
On 30/11/2021 13:56, Richard Biener wrote:
On Tue, 30 Nov 2021, Andre Vieira (lists
): Add new member
m_suggested_unroll_factor.
(vector_costs::suggested_unroll_factor): New getter function.
(finish_cost): Set return argument suggested_unroll_factor.
Regards,
Andre
On 07/12/2021 11:27, Andre Vieira (lists) via Gcc-patches wrote:
Hi,
I've split this particular
ping
On 25/11/2021 13:53, Andre Vieira (lists) via Gcc-patches wrote:
On 22/11/2021 11:41, Richard Biener wrote:
On 18/11/2021 11:05, Richard Biener wrote:
This is a good shout and made me think about something I hadn't
before... I
thought I could handle the vector forms later
On 12/01/2022 11:59, Richard Biener wrote:
On Wed, 12 Jan 2022, Andre Vieira (lists) wrote:
On 12/01/2022 11:44, Richard Sandiford wrote:
Another alternative would be to push autodetected_vector_mode when the
length is 1 and keep 1 as the starting point.
Richard
I'm guessing we would still
Hi,
This is a fix for the regression caused by '[vect] Re-analyze all modes for
epilogues'. The earlier patch assumed there was always at least one
other mode than VOIDmode, but that does not need to be the case.
If we are dealing with a target that does not define more modes for
On 12/01/2022 11:44, Richard Sandiford wrote:
Another alternative would be to push autodetected_vector_mode when the
length is 1 and keep 1 as the starting point.
Richard
I'm guessing we would still want to skip epilogue vectorization if
!VECTOR_MODE_P (autodetected_vector_mode) in that
On 13/01/2022 12:36, Richard Biener wrote:
On Thu, 13 Jan 2022, Andre Vieira (lists) wrote:
This time to the list too (sorry for double email)
Hi,
The original patch '[vect] Re-analyze all modes for epilogues', skipped modes
that should not be skipped since it used the vector mode provided
On 13/01/2022 14:25, Richard Biener wrote:
On Thu, 13 Jan 2022, Andre Vieira (lists) wrote:
On 13/01/2022 12:36, Richard Biener wrote:
On Thu, 13 Jan 2022, Andre Vieira (lists) wrote:
This time to the list too (sorry for double email)
Hi,
The original patch '[vect] Re-analyze all modes
On 14/01/2022 07:08, Richard Biener wrote:
On Thu, 13 Jan 2022, Andre Vieira (lists) wrote:
On 13/01/2022 14:25, Richard Biener wrote:
On Thu, 13 Jan 2022, Andre Vieira (lists) wrote:
On 13/01/2022 12:36, Richard Biener wrote:
On Thu, 13 Jan 2022, Andre Vieira (lists) wrote:
This time
On 12/01/2022 12:57, Richard Biener wrote:
On Wed, 12 Jan 2022, Andre Vieira (lists) wrote:
On 12/01/2022 11:59, Richard Biener wrote:
On Wed, 12 Jan 2022, Andre Vieira (lists) wrote:
On 12/01/2022 11:44, Richard Sandiford wrote:
Another alternative would be to push
.
* gcc.target/aarch64/frintnz_vec.c: New test.
On 03/01/2022 12:18, Richard Biener wrote:
On Wed, 29 Dec 2021, Andre Vieira (lists) wrote:
Hi Richard,
Thank you for the review, I've adopted all above suggestions downstream, I am
still surprised how many style things I still miss after years of gcc
.
(finish_cost): Set return argument suggested_unroll_factor.
Regards,
Andre
On 30/11/2021 13:56, Richard Biener wrote:
On Tue, 30 Nov 2021, Andre Vieira (lists) wrote:
On 25/11/2021 12:46, Richard Biener wrote:
Oops, my fault, yes, it does. I would suggest to refactor things so
that the mode_i
On 17/01/2022 07:48, Christophe Lyon wrote:
Hi André,
On Fri, Jan 14, 2022 at 6:03 PM Andre Vieira (lists) via Gcc-patches
wrote:
Hi Christophe,
This patch relaxes the addressing modes for the mve full load and
stores
(by full loads and stores I mean non-widening
Hi all,
This patch adds the feature bit for FP16 to the feature set for Armv9
since Armv9 requires SVE to be implemented and SVE requires FP16 to be
implemented.
2022-03-04 Andre Vieira
* config/aarch64/aarch64.h (AARCH64_FL_FOR_ARCH9): Add FP16
feature bit.
diff --git
This patch introduces a struct to differentiate between different
memmove costs to enable a better modeling of memory operations. These
have been modelled for
-mcpu/-mtune=neoverse-v1/neoverse-n1/neoverse-n2/neoverse-512tvb, for
all other tunings all entries are equal to the old single memmove
Hi,
This patch adds tuning structures for Neoverse N2.
2022-03-16 Tamar Christina
Andre Vieira
* config/aarch64/aarch64.cc (neoversen2_addrcost_table,
neoversen2_regmove_cost,
neoversen2_advsimd_vector_cost, neoversen2_sve_vector_cost,
Hi,
As requested, I updated the Neoverse N2 entry to use the
AARCH64_FL_FOR_ARCH9 feature set, removed duplicate entries, updated the
ARCH_IDENT to 9A and moved it under the Armv9 cores.
gcc/ChangeLog:
* config/aarch64/aarch64-cores.def: Update Neoverse N2 core entry.
diff --git
Hi,
This patch adds tuning structs for -mcpu/-mtune=demeter.
2022-03-16 Tamar Christina
Andre Vieira
* config/aarch64/aarch64.cc (demeter_addrcost_table,
demeter_regmove_cost,
demeter_advsimd_vector_cost, demeter_sve_vector_cost,
Hi,
This patch implements the costing function
determine_suggested_unroll_factor for aarch64.
It determines the unrolling factor by dividing the number of X
operations we can do per cycle by the number of X operations in the loop
body, taking this information from the vec_ops analysis during
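A minimal sketch of the division described above. The `struct op_kind` type and the function names are hypothetical. How the per-kind ratios are combined is not visible in this excerpt, so rounding up and taking the largest ratio is an assumption here, not a statement of what the patch does.

```c
#include <assert.h>

/* Hypothetical description of one kind of operation ("X" in the text).  */
struct op_kind
{
  unsigned per_cycle;  /* how many the core can do per cycle */
  unsigned in_body;    /* how many the loop body contains */
};

/* Round-up division; b must be non-zero.  */
static unsigned ceil_div (unsigned a, unsigned b)
{
  assert (b != 0);
  return (a + b - 1) / b;
}

/* Assumed combination rule: the largest per-kind ratio, at least 1.  */
unsigned suggest_unroll_factor (const struct op_kind *kinds, unsigned n)
{
  unsigned factor = 1;
  for (unsigned i = 0; i < n; i++)
    {
      if (kinds[i].in_body == 0)
        continue;  /* this kind does not appear in the loop */
      unsigned f = ceil_div (kinds[i].per_cycle, kinds[i].in_body);
      if (f > factor)
        factor = f;
    }
  return factor;
}
```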
Hi,
This patch updates the register move tunings for
-mcpu/-mtune={neoverse-v1,neoverse-512tvb}.
2022-03-16 Tamar Christina
Andre Vieira
* config/aarch64/aarch64.cc (neoversev1_regmove_cost): New
tuning struct.
(neoversev1_tunings): Use
Hi,
This patch adds predicate move costs to several SVE enabled cores.
2022-02-25 Tamar Christina
Andre Vieira
gcc/ChangeLog:
* config/aarch64/aarch64-protos.h (struct cpu_regmove_cost):
Add PR2PR member.
* config/aarch64/aarch64.cc
Hi,
As reported on PR104498, the issue here is that
compare_base_symbol_refs swaps x and y but doesn't take that into
account when computing the distance.
This patch makes sure that if x and y are swapped, we correct the
distance computation by multiplying it by -1 to end up with the
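The sign fix can be sketched in isolation as follows. The `struct ref` type, its `rank` field and the function name are hypothetical stand-ins; the sketch only shows why a distance computed after an internal swap has to be negated before it is returned to the caller.

```c
#include <stdbool.h>

/* Hypothetical reference: 'rank' decides the canonical order,
   'addr' is the position used for the distance.  */
struct ref
{
  int rank;
  long addr;
};

/* Distance from x to y as the caller sees it, i.e. y.addr - x.addr.  */
long distance_x_to_y (struct ref x, struct ref y)
{
  bool swapped = false;

  /* Canonicalise the operand order, remembering whether we swapped.  */
  if (x.rank > y.rank)
    {
      struct ref tmp = x;
      x = y;
      y = tmp;
      swapped = true;
    }

  long distance = y.addr - x.addr;  /* computed in canonical order */

  /* Undo the effect of the swap on the sign.  */
  if (swapped)
    distance *= -1;
  return distance;
}
```

Without the final negation the function would return the same value for both argument orders, which is the kind of inconsistency the message describes.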
"Andre Vieira (lists)" writes:
Hi,
This patch implements the costing function
determine_suggested_unroll_factor for aarch64.
It determines the unrolling factor by dividing the number of X
operations we can do per cycle by the number of X operations in the loop
body, taking this inform
Ping.
On 16/03/2022 15:00, Andre Vieira (lists) via Gcc-patches wrote:
Hi,
As requested, I updated the Neoverse N2 entry to use the
AARCH64_FL_FOR_ARCH9 feature set, removed duplicate entries, updated
the ARCH_IDENT to 9A and moved it under the Armv9 cores.
gcc/ChangeLog
On 28/03/2022 15:59, Richard Sandiford wrote:
"Andre Vieira (lists)" writes:
Hi,
Addressed all of your comments bar the pred ops one.
Is this OK?
gcc/ChangeLog:
* config/aarch64/aarch64.cc (aarch64_vector_costs): Define
determine_suggested_unroll_factor and m_nos
Hi Christophe,
This patch relaxes the addressing modes for the mve full load and stores
(by full loads and stores I mean non-widening or narrowing loads and
stores resp). The code before was requiring a LO_REGNUM for these, where
this is only a requirement if the load is widening or the store
On 14/01/2022 09:57, Richard Biener wrote:
The 'used_vector_modes' is also a heuristic by itself since it registers
every vector type we query, not only those that are used in the end ...
So it's really all heuristics that can eventually go bad.
IMHO remembering the VF that we ended up with
Hi Christophe,
On 13/01/2022 14:56, Christophe Lyon via Gcc-patches wrote:
diff --git a/gcc/config/arm/arm-simd-builtin-types.def
b/gcc/config/arm/arm-simd-builtin-types.def
index 6ba6f211531..920c2a68e4c 100644
--- a/gcc/config/arm/arm-simd-builtin-types.def
+++
Hi Christophe,
On 13/01/2022 14:56, Christophe Lyon via Gcc-patches wrote:
At some point during the development of this patch series, it appeared
that in some cases the register allocator wants “VPR or general”
rather than “VPR or general or FP” (which is the same thing as
ALL_REGS). The