Re: [PATCH v3 06/15] arm: Fix mve_vmvnq_n_ argument mode

2022-01-19 Thread Andre Vieira (lists) via Gcc-patches
On 13/01/2022 14:56, Christophe Lyon via Gcc-patches wrote: The vmvnq_n* intrinsics have [u]int[16|32]_t arguments, so use the V_elem iterator instead of HI in mve_vmvnq_n_. 2022-01-13 Christophe Lyon gcc/ * config/arm/mve.md (mve_vmvnq_n_): Use V_elem mode for operand

Re: [PATCH v3 05/15] arm: Add support for VPR_REG in arm_class_likely_spilled_p

2022-01-19 Thread Andre Vieira (lists) via Gcc-patches
On 13/01/2022 14:56, Christophe Lyon via Gcc-patches wrote: VPR_REG is the only register in its class, so it should be handled by TARGET_CLASS_LIKELY_SPILLED_P, which is achieved by calling default_class_likely_spilled_p. No test fails without this patch, but it seems it should be

Re: [vect] PR103997: Fix epilogue mode skipping

2022-01-19 Thread Andre Vieira (lists) via Gcc-patches
On 19/01/2022 11:04, Richard Biener wrote: On Tue, 18 Jan 2022, Andre Vieira (lists) wrote: On 14/01/2022 09:57, Richard Biener wrote: The 'used_vector_modes' is also a heuristic by itself since it registers every vector type we query, not only those that are used in the end ... So it's

Re: [PATCH v3 04/15] arm: Add GENERAL_AND_VPR_REGS regclass

2022-01-20 Thread Andre Vieira (lists) via Gcc-patches
On 20/01/2022 09:14, Christophe Lyon wrote: On Wed, Jan 19, 2022 at 7:18 PM Andre Vieira (lists) via Gcc-patches wrote: Hi Christophe, On 13/01/2022 14:56, Christophe Lyon via Gcc-patches wrote: > At some point during the development of this patch series, it appea

Re: [PATCH v3 04/15] arm: Add GENERAL_AND_VPR_REGS regclass

2022-01-20 Thread Andre Vieira (lists) via Gcc-patches
On 20/01/2022 10:40, Richard Sandiford wrote: "Andre Vieira (lists)" writes: On 20/01/2022 09:14, Christophe Lyon wrote: On Wed, Jan 19, 2022 at 7:18 PM Andre Vieira (lists) via Gcc-patches wrote: Hi Christophe, On 13/01/2022 14:56, Christophe Lyon via Gcc-pat

Re: [PATCH v3 06/15] arm: Fix mve_vmvnq_n_ argument mode

2022-01-20 Thread Andre Vieira (lists) via Gcc-patches
On 20/01/2022 10:45, Richard Sandiford wrote: "Andre Vieira (lists)" writes: On 13/01/2022 14:56, Christophe Lyon via Gcc-patches wrote: The vmvnq_n* intrinsics have [u]int[16|32]_t arguments, so use the V_elem iterator instead of HI in mve_vmvnq_n_. 2022-01-13 Christophe Lyon

[vect] PR103997: Fix epilogue mode skipping

2022-01-13 Thread Andre Vieira (lists) via Gcc-patches
This time to the list too (sorry for double email) Hi, The original patch '[vect] Re-analyze all modes for epilogues' skipped modes that should not be skipped, since it used the vector mode provided by autovectorize_vector_modes to derive the minimum VF required for it. However, those modes

[AArch64] PR target/105157 Increase number of cores TARGET_CPU_DEFAULT can encode

2022-04-07 Thread Andre Vieira (lists) via Gcc-patches
Hi, This addresses the compile-time increase seen in PR target/105157. This was being caused by selecting the wrong core tuning, as when we added the latest AArch64 cores the TARGET_CPU_generic tuning was pushed beyond the 0x3f mask we used to encode both target cpu and attributes into

Re: [AArch64] PR target/105157 Increase number of cores TARGET_CPU_DEFAULT can encode

2022-04-08 Thread Andre Vieira (lists) via Gcc-patches
On 08/04/2022 08:04, Richard Sandiford wrote: I think this would be better as a static assert at the top level: static_assert (TARGET_CPU_generic < TARGET_CPU_MASK, "TARGET_CPU_NBITS is big enough"); The motivation being that you want this to be checked regardless of
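
The masking problem discussed in this thread can be illustrated with a small standalone sketch. This is not the aarch64 back-end code: the enum, the `EXAMPLE_*` constants, and the `encode_default`/`decode_*` helpers are hypothetical names made up for illustration. Only the idea of packing a core id and attribute bits into one value, and of guarding the width with a top-level static_assert, is taken from the messages above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical core list; the last entry no longer fits in 6 bits (0x3f).
enum example_cpu : unsigned {
  EXAMPLE_CPU_a = 0,
  EXAMPLE_CPU_b = 1,
  EXAMPLE_CPU_generic = 70
};

// Assumed widened field: 8 bits for the core id, the rest for attributes.
constexpr unsigned EXAMPLE_CPU_NBITS = 8;
constexpr unsigned EXAMPLE_CPU_MASK = (1u << EXAMPLE_CPU_NBITS) - 1;

// Checked at compile time, whether or not any particular code path is built.
static_assert (EXAMPLE_CPU_generic < EXAMPLE_CPU_MASK,
               "EXAMPLE_CPU_NBITS is big enough");

// Pack the attribute bits above the core id.
inline std::uint64_t encode_default (unsigned cpu, std::uint64_t attrs)
{
  assert (cpu <= EXAMPLE_CPU_MASK);
  return (attrs << EXAMPLE_CPU_NBITS) | cpu;
}

// Recover the core id from the low bits.
inline unsigned decode_cpu (std::uint64_t enc)
{
  return static_cast<unsigned> (enc & EXAMPLE_CPU_MASK);
}

// Recover the attribute bits.
inline std::uint64_t decode_attrs (std::uint64_t enc)
{
  return enc >> EXAMPLE_CPU_NBITS;
}
```

With the old 0x3f mask, a core id of 70 would decode as a different core, which is the kind of silent mis-selection the static_assert is meant to catch early.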

[PATCH 6/8] vect: Add vector_mode paramater to simd_clone_usable

2023-08-30 Thread Andre Vieira (lists) via Gcc-patches
This patch adds a machine_mode parameter to the TARGET_SIMD_CLONE_USABLE hook to enable rejecting SVE modes when the target architecture does not support SVE. gcc/ChangeLog: * config/aarch64/aarch64.cc (aarch64_simd_clone_usable): Add mode parameter and use it to reject SVE

[Patch 2/8] parloops: Allow poly nit and bound

2023-08-30 Thread Andre Vieira (lists) via Gcc-patches
Teach parloops how to handle a poly nit and bound ahead of the changes to enable non-constant simdlen. gcc/ChangeLog: * tree-parloops.cc (try_to_transform_to_exit_first_loop_alt): Accept poly NIT and ALT_BOUND. diff --git a/gcc/tree-parloops.cc b/gcc/tree-parloops.cc index

[PATCH 4/8] vect: don't allow fully masked loops with non-masked simd clones [PR 110485]

2023-08-30 Thread Andre Vieira (lists) via Gcc-patches
When analyzing a loop and choosing a simdclone to use it is possible to choose a simdclone that cannot be used 'inbranch' for a loop that can use partial vectors. This may lead to the vectorizer deciding to use partial vectors which are not supported for notinbranch simd clones. This patch

[PATCH 5/8] vect: Use inbranch simdclones in masked loops

2023-08-30 Thread Andre Vieira (lists) via Gcc-patches
This patch enables the compiler to use inbranch simdclones when generating masked loops in autovectorization. gcc/ChangeLog: * omp-simd-clone.cc (simd_clone_adjust_argument_types): Make function compatible with mask parameters in clone. * tree-vect-stmts.cc

aarch64, vect, omp: Add SVE support for simd clones [PR 96342]

2023-08-30 Thread Andre Vieira (lists) via Gcc-patches
Hi, This patch series aims to implement support for SVE simd clones when not specifying a 'simdlen' clause for AArch64. This patch depends on my earlier patch: '[PATCH] aarch64: enable mixed-types for aarch64 simdclones'. Bootstrapped and regression tested the series on

[PATCH 1/8] parloops: Copy target and optimizations when creating a function clone

2023-08-30 Thread Andre Vieira (lists) via Gcc-patches
SVE simd clones need to be compiled with an SVE target enabled or the argument types will not be created properly. To achieve this we need to copy DECL_FUNCTION_SPECIFIC_TARGET from the original function declaration to the clones. I decided it was probably also a good idea to copy

[Patch 3/8] vect: Fix vect_get_smallest_scalar_type for simd clones

2023-08-30 Thread Andre Vieira (lists) via Gcc-patches
The vect_get_smallest_scalar_type helper function was using any argument to a simd clone call when trying to determine the smallest scalar type that would be vectorized. This included the function pointer type in a MASK_CALL for instance, and would result in the wrong type being selected.

[PATCH 8/8] aarch64: Add SVE support for simd clones [PR 96342]

2023-08-30 Thread Andre Vieira (lists) via Gcc-patches
This patch finalizes adding support for the generation of SVE simd clones when no simdlen is provided, following the ABI rules where the widest data type determines the minimum number of elements in a length agnostic vector. gcc/ChangeLog: * config/aarch64/aarch64-protos.h

[PATCH7/8] vect: Add TARGET_SIMD_CLONE_ADJUST_RET_OR_PARAM

2023-08-30 Thread Andre Vieira (lists) via Gcc-patches
This patch adds a new target hook to enable us to adapt the types of return and parameters of simd clones. We use this in two ways, the first one is to make sure we can create valid SVE types, including the SVE type attribute, when creating a SVE simd clone, even when the target options do

Re: [PATCH 6/8] vect: Add vector_mode paramater to simd_clone_usable

2023-08-30 Thread Andre Vieira (lists) via Gcc-patches
Forgot to CC this one to maintainers... On 30/08/2023 10:14, Andre Vieira (lists) via Gcc-patches wrote: This patch adds a machine_mode parameter to the TARGET_SIMD_CLONE_USABLE hook to enable rejecting SVE modes when the target architecture does not support SVE. gcc/ChangeLog

Re: [PATCH 6/8] vect: Add vector_mode paramater to simd_clone_usable

2023-08-30 Thread Andre Vieira (lists) via Gcc-patches
On 30/08/2023 14:01, Richard Biener wrote: On Wed, Aug 30, 2023 at 11:15 AM Andre Vieira (lists) via Gcc-patches wrote: This patch adds a machine_mode parameter to the TARGET_SIMD_CLONE_USABLE hook to enable rejecting SVE modes when the target architecture does not support SVE. How does

Re: [PATCH] aarch64: enable mixed-types for aarch64 simdclones

2023-08-29 Thread Andre Vieira (lists) via Gcc-patches
Hi, This patch enables the use of mixed-types for simd clones for AArch64, adds aarch64 as a target_vect_simd_clones and corrects the way the simdlen is chosen for non-specified simdlen clauses according to the 'Vector Function Application Binary Interface Specification for AArch64'.

Re: [PATCH] vect, tree-optimization/105219: Disable epilogue vectorization when peeling for alignment

2022-04-26 Thread Andre Vieira (lists) via Gcc-patches
On 26/04/2022 16:12, Jakub Jelinek wrote: On Tue, Apr 26, 2022 at 03:43:13PM +0100, Richard Sandiford via Gcc-patches wrote: --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/pr105219-2.c @@ -0,0 +1,29 @@ +/* { dg-do run } */ +/* { dg-options "-O3 -march=armv8.2-a -mtune=thunderx

Re: [PATCH] vect, tree-optimization/105219: Disable epilogue vectorization when peeling for alignment

2022-04-26 Thread Andre Vieira (lists) via Gcc-patches
On 26/04/2022 15:43, Richard Sandiford wrote: "Andre Vieira (lists)" writes: Hi, This patch disables epilogue vectorization when we are peeling for alignment in the prologue and we can't guarantee the main vectorized loop is entered.  This is to prevent executing vecto

[PATCH] vect, tree-optimization/105219: Disable epilogue vectorization when peeling for alignment

2022-04-26 Thread Andre Vieira (lists) via Gcc-patches
Hi, This patch disables epilogue vectorization when we are peeling for alignment in the prologue and we can't guarantee the main vectorized loop is entered.  This is to prevent executing vectorized code with an unaligned access if the target has indicated it wants to peel for alignment. We

[AArch64] Improve SVE dup intrinsics codegen

2022-05-17 Thread Andre Vieira (lists) via Gcc-patches
Hi, This patch teaches the aarch64 backend to improve codegen when using dup with NEON vectors with repeating patterns. It will attempt to use a smaller NEON vector (or element) to limit the number of instructions needed to construct the input vector. Bootstrapped and regression tested 

Re: [0/9] [middle-end] Add param to vec_perm_const hook to specify mode of input operand

2022-05-18 Thread Andre Vieira (lists) via Gcc-patches
Hi Prathamesh, I am just looking at this as it interacts with a change I am trying to make, but I'm not a reviewer so take my comments with a pinch of salt ;) I copied in bits of your patch below to comment. > -@deftypefn {Target Hook} bool TARGET_VECTORIZE_VEC_PERM_CONST (machine_mode

Re: [PATCH][AArch64] Implement ACLE Data Intrinsics

2022-07-01 Thread Andre Vieira (lists) via Gcc-patches
On 29/06/2022 08:18, Richard Sandiford wrote: + break; +case AARCH64_RBIT: +case AARCH64_RBITL: +case AARCH64_RBITLL: + if (mode == SImode) + icode = CODE_FOR_aarch64_rbitsi; + else + icode = CODE_FOR_aarch64_rbitdi; + break; +default: +

Re: [PATCH][AArch64] Implement ACLE Data Intrinsics

2022-06-28 Thread Andre Vieira (lists) via Gcc-patches
On 17/06/2022 11:54, Richard Sandiford wrote: "Andre Vieira (lists)" writes: Hi, This patch adds support for the ACLE Data Intrinsics to the AArch64 port. Bootstrapped and regression tested on aarch64-none-linux. OK for trunk? Sorry for the slow review.

[PATCH][AArch64] Implement ACLE Data Intrinsics

2022-06-10 Thread Andre Vieira (lists) via Gcc-patches
Hi, This patch adds support for the ACLE Data Intrinsics to the AArch64 port. Bootstrapped and regression tested on aarch64-none-linux. OK for trunk? gcc/ChangeLog: 2022-06-10  Andre Vieira      * config/aarch64/aarch64.md (rbit2): Rename this ...     (@aarch64_rbit): ... this and

[RFC] Teach vectorizer to deal with bitfield reads

2022-07-26 Thread Andre Vieira (lists) via Gcc-patches
Hi, This is an RFC for my prototype for bitfield read vectorization. This patch enables bit-field read vectorization by removing the rejection of bit-field reads during DR analysis and by adding two vect patterns. The first one transforms TREE_COMPONENT's with BIT_FIELD_DECL's into

Re: [RFC] Teach vectorizer to deal with bitfield reads

2022-07-29 Thread Andre Vieira (lists) via Gcc-patches
Hi Richard, Thanks for the review, I don't completely understand all of the below, so I added some extra questions to help me understand :) On 27/07/2022 12:37, Richard Biener wrote: On Tue, 26 Jul 2022, Andre Vieira (lists) wrote: I don't think this is a good approach for what you gain

Re: [PATCH] vect, tree-optimization/105219: Disable epilogue vectorization when peeling for alignment

2022-04-27 Thread Andre Vieira (lists) via Gcc-patches
On 27/04/2022 07:35, Richard Biener wrote: On Tue, 26 Apr 2022, Richard Sandiford wrote: "Andre Vieira (lists)" writes: Hi, This patch disables epilogue vectorization when we are peeling for alignment in the prologue and we can't guarantee the main vectorized loop

Re: [PATCH] tree-optimization/105219 - bogus max iters for vectorized epilogue

2022-04-28 Thread Andre Vieira (lists) via Gcc-patches
On 27/04/2022 15:03, Richard Biener wrote: On Wed, 27 Apr 2022, Richard Biener wrote: The following makes sure to take into account prologue peeling when trying to narrow down the maximum number of iterations computed for the epilogue of a vectorized epilogue. Bootstrap & regtest running on

Re: [PATCH] Teach vectorizer to deal with bitfield accesses (was: [RFC] Teach vectorizer to deal with bitfield reads)

2022-08-25 Thread Andre Vieira (lists) via Gcc-patches
On 17/08/2022 13:49, Richard Biener wrote: Yes, of course. What you need to do is subtract DECL_FIELD_BIT_OFFSET of the representative from DECL_FIELD_BIT_OFFSET of the original bitfield access - that's the offset within the representative (by construction both fields share DECL_FIELD_OFFSET).

Re: [PATCH] Teach vectorizer to deal with bitfield accesses (was: [RFC] Teach vectorizer to deal with bitfield reads)

2022-09-28 Thread Andre Vieira (lists) via Gcc-patches
On 27/09/2022 13:34, Richard Biener wrote: On Mon, 26 Sep 2022, Andre Vieira (lists) wrote: On 08/09/2022 12:51, Richard Biener wrote: I'm curious, why the push to redundant_ssa_names? That could use a comment ... So I purposefully left a #if 0 #else #endif in there so you can see the two

Re: [PATCH] Teach vectorizer to deal with bitfield accesses (was: [RFC] Teach vectorizer to deal with bitfield reads)

2022-09-28 Thread Andre Vieira (lists) via Gcc-patches
.c: New test.     * gcc.dg/vect/vect-bitfield-write-4.c: New test.     * gcc.dg/vect/vect-bitfield-write-5.c: New test. On 28/09/2022 10:43, Andre Vieira (lists) via Gcc-patches wrote: On 27/09/2022 13:34, Richard Biener wrote: On Mon, 26 Sep 2022, Andre Vieira (lists) wrote: On 08/09

[PATCH] ifcvt: Do not lower bitfields if we can't analyze dr's [PR107275]

2022-10-18 Thread Andre Vieira (lists) via Gcc-patches
The ifcvt dead code elimination code was not built to deal with inline assembly, as loops with such would never be if-converted in the past since we can't do data-reference analysis on them and vectorization would eventually fail. For this reason we now also do not lower bitfields if the

Re: vect: Make vect_check_gather_scatter reject offsets that aren't multiples of BITS_PER_UNIT [PR107346]

2022-10-24 Thread Andre Vieira (lists) via Gcc-patches
On 24/10/2022 13:46, Richard Biener wrote: On Mon, 24 Oct 2022, Andre Vieira (lists) wrote: On 24/10/2022 08:17, Richard Biener wrote: Can you check why vect_find_stmt_data_reference doesn't trip on the if (TREE_CODE (DR_REF (dr)) == COMPONENT_REF && DECL_B

Re: vect: Make vect_check_gather_scatter reject offsets that aren't multiples of BITS_PER_UNIT [PR107346]

2022-10-28 Thread Andre Vieira (lists) via Gcc-patches
On 24/10/2022 14:29, Richard Biener wrote: On Mon, 24 Oct 2022, Andre Vieira (lists) wrote: Changing if-convert would merely change this testcase but we could still trigger using a different structure type, changing the size of Int24 to 32 bits rather than 24: package Loop_Optimization23_Pkg

vect: Make vect_check_gather_scatter reject offsets that aren't multiples of BITS_PER_UNIT [PR107346]

2022-10-21 Thread Andre Vieira (lists) via Gcc-patches
Hi, The ada failure reported in the PR was being caused by vect_check_gather_scatter failing to deal with bit offsets that weren't multiples of BITS_PER_UNIT. This patch makes vect_check_gather_scatter reject memory accesses with such offsets. Bootstrapped and regression tested on aarch64
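
The rejection rule described here boils down to a divisibility check on the bit offset. A minimal sketch, using plain integers rather than GCC trees: `EXAMPLE_BITS_PER_UNIT`, `is_byte_aligned`, and `to_byte_offset` are hypothetical names, and the value 8 is an assumption about the target.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: 8 bits per addressable unit, as on most targets.
constexpr std::int64_t EXAMPLE_BITS_PER_UNIT = 8;

// A bit offset can only be expressed as a byte offset when it is a
// whole multiple of the unit size.
inline bool is_byte_aligned (std::int64_t bit_offset)
{
  return bit_offset % EXAMPLE_BITS_PER_UNIT == 0;
}

// Convert to a byte offset; callers must reject unaligned offsets first.
inline std::int64_t to_byte_offset (std::int64_t bit_offset)
{
  assert (is_byte_aligned (bit_offset));
  return bit_offset / EXAMPLE_BITS_PER_UNIT;
}
```

An access at bit offset 24 maps to byte offset 3, while one at bit offset 20 has no byte-granular equivalent and would be rejected.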

Re: vect: Make vect_check_gather_scatter reject offsets that aren't multiples of BITS_PER_UNIT [PR107346]

2022-10-24 Thread Andre Vieira (lists) via Gcc-patches
On 24/10/2022 08:17, Richard Biener wrote: Can you check why vect_find_stmt_data_reference doesn't trip on the if (TREE_CODE (DR_REF (dr)) == COMPONENT_REF && DECL_BIT_FIELD (TREE_OPERAND (DR_REF (dr), 1))) { free_data_ref (dr); return opt_result::failure_at

[PATCH]vect: Fix vectype when widening container type in bitfield pattern [PR107326]

2022-10-20 Thread Andre Vieira (lists) via Gcc-patches
Hi, The 'vect_recog_bitfield_ref_pattern' was not correctly adapting the vectype when widening the container. I thought the original tests covered that code-path but they didn't, so I added a new run-test that covers it too. Bootstrapped and regression tested on x86_64 and aarch64.

Re: [PATCH] Teach vectorizer to deal with bitfield accesses (was: [RFC] Teach vectorizer to deal with bitfield reads)

2022-09-08 Thread Andre Vieira (lists) via Gcc-patches
Ping. On 25/08/2022 10:09, Andre Vieira (lists) via Gcc-patches wrote: On 17/08/2022 13:49, Richard Biener wrote: Yes, of course.  What you need to do is subtract DECL_FIELD_BIT_OFFSET of the representative from DECL_FIELD_BIT_OFFSET of the original bitfield access - that's the offset

Re: [PATCH] Teach vectorizer to deal with bitfield accesses (was: [RFC] Teach vectorizer to deal with bitfield reads)

2022-09-26 Thread Andre Vieira (lists) via Gcc-patches
On 08/09/2022 12:51, Richard Biener wrote: I'm curious, why the push to redundant_ssa_names? That could use a comment ... So I purposefully left a #if 0 #else #endif in there so you can see the two options. But the reason I used redundant_ssa_names is because ifcvt seems to use that as a

Re: [PATCH][AArch64] Implement ACLE Data Intrinsics

2022-10-04 Thread Andre Vieira (lists) via Gcc-patches
Hi all, Can I backport this to gcc-11 branch? Also applies cleanly (with the exception of the file extensions being different: 'aarch64-builtins.cc vs aarch64-builtins.c'). Bootstrapped and regression tested on aarch64-linux-gnu. Kind regards, Andre Vieira

vect: Don't pattern match BITFIELD_REF's of non-integrals [PR107226]

2022-10-12 Thread Andre Vieira (lists) via Gcc-patches
Hi, The original patch supported matching the vect_recog_bitfield_ref_pattern for BITFIELD_REF's where the first operand didn't have an INTEGRAL_TYPE_P type. That means it would also match vectors, leading to regressions in targets that supported vectorization of those. Bootstrapped and

ifcvt: Fix bitpos calculation in bitfield lowering [PR107229]

2022-10-12 Thread Andre Vieira (lists) via Gcc-patches
Hi, The bitposition calculation for the bitfield lowering in loop if-conversion was not taking DECL_FIELD_OFFSET into account, which meant that it would result in wrong bitpositions for bitfields that did not end up having representations starting at the beginning of the struct. Bootstrapped
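
The arithmetic behind this fix can be sketched with plain integers. This is an illustration only, not the if-conversion code: `field_pos` is a hypothetical stand-in for the pair DECL_FIELD_OFFSET (bytes) and DECL_FIELD_BIT_OFFSET (bits), and 8 bits per unit is assumed.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: 8 bits per addressable unit.
constexpr std::uint64_t EXAMPLE_BITS_PER_UNIT = 8;

// Hypothetical model of where a field lives: a byte offset plus a bit
// offset on top of it.
struct field_pos {
  std::uint64_t byte_offset;
  std::uint64_t bit_offset;
};

// Position of the field in bits from the start of the struct.
inline std::uint64_t absolute_bitpos (const field_pos &f)
{
  return f.byte_offset * EXAMPLE_BITS_PER_UNIT + f.bit_offset;
}

// Position of the bitfield inside its representative.  Both offsets are
// included on each side, so the result stays correct even when the two
// byte offsets differ.
inline std::uint64_t bitpos_in_representative (const field_pos &field,
                                               const field_pos &rep)
{
  std::uint64_t f = absolute_bitpos (field);
  std::uint64_t r = absolute_bitpos (rep);
  assert (f >= r);
  return f - r;
}
```

When field and representative share the same byte offset, this reduces to subtracting the two bit offsets; ignoring the byte offsets only goes wrong when they differ.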

Re: ifcvt: Fix bitpos calculation in bitfield lowering [PR107229]

2022-10-13 Thread Andre Vieira (lists) via Gcc-patches
Added some extra comments to describe what is going on there. On 13/10/2022 09:14, Richard Biener wrote: On Wed, 12 Oct 2022, Andre Vieira (lists) wrote: Hi, The bitposition calculation for the bitfield lowering in loop if conversion was not taking DECL_FIELD_OFFSET into account, which meant

Re: ifcvt: Fix bitpos calculation in bitfield lowering [PR107229]

2022-10-13 Thread Andre Vieira (lists) via Gcc-patches
On 13/10/2022 15:15, Richard Biener wrote: On Thu, 13 Oct 2022, Andre Vieira (lists) wrote: Hi Rainer, Thanks for reporting, I was actually expecting these! I thought about pre-empting them by using a positive filter on the tests for aarch64 and x86_64 as I knew those would pass, but I

Re: ifcvt: Fix bitpos calculation in bitfield lowering [PR107229]

2022-10-13 Thread Andre Vieira (lists) via Gcc-patches
Hi Rainer, Thanks for reporting, I was actually expecting these! I thought about pre-empting them by using a positive filter on the tests for aarch64 and x86_64 as I knew those would pass, but I thought it would be better to let other targets report failures since then you get a testsuite

Re: [PATCH] Teach vectorizer to deal with bitfield accesses (was: [RFC] Teach vectorizer to deal with bitfield reads)

2022-10-07 Thread Andre Vieira (lists) via Gcc-patches
to commit on Friday in case something breaks over the weekend, so I'll leave it until Monday. Thanks, Andre On 29/09/2022 08:54, Richard Biener wrote: On Wed, Sep 28, 2022 at 7:32 PM Andre Vieira (lists) via Gcc-patches wrote: Made the change and also created the ChangeLogs. OK if bootstrap

[PATCH 0/4] aarch64: Improve codegen for dups and constructors

2022-08-05 Thread Andre Vieira (lists) via Gcc-patches
Hi, This patch series is a work in progress towards getting the compiler to generate better code for constructors and dups in both NEON and SVE targets.  It first changes the backend to use rtx_vector_builder for vector_init's. Then it is followed by some preparation passes to better handle

[PATCH 2/4]aarch64: Change aarch64_expand_vector_init to use rtx_vector_builder

2022-08-05 Thread Andre Vieira (lists) via Gcc-patches
Hi, This patch changes aarch64_expand_vector_init to use rtx_vector_builder, exploiting its internal pattern detection to find 'dup' patterns. Bootstrapped and regression tested on aarch64-none-linux-gnu. Is this OK for trunk or should we wait for the rest of the series? gcc/ChangeLog:

[PATCH 1/4] aarch64: encourage use of GPR input for SIMD inserts

2022-08-05 Thread Andre Vieira (lists) via Gcc-patches
Hi, This patch makes it possible, and more likely, for the compiler to use GPR input for SIMD inserts. I believe this is some outdated hack we used to prevent costly GPR<->SIMD register file swaps. This patch is required for better codegen in situations like the test case 'int8_3' in the next

[PATCH 4/4][RFC] VLA Constructor

2022-08-05 Thread Andre Vieira (lists) via Gcc-patches
This isn't really a 'PATCH' yet, it's something I was working on but had to put on hold. Feel free to re-use any bits or trash all of it if you'd like. diff --git a/gcc/config/aarch64/aarch64-sve-builtins-base.cc b/gcc/config/aarch64/aarch64-sve-builtins-base.cc index

[PATCH 3/4] match.pd: Teach forwprop to handle VLA VEC_PERM_EXPRs with VLS CONSTRUCTORs as arguments

2022-08-05 Thread Andre Vieira (lists) via Gcc-patches
Hi, This patch is part of the WIP patch that follows in this series. Its goal is to teach forwprop to handle VLA VEC_PERM_EXPRs with VLS CONSTRUCTORs as arguments as preparation for the 'VLA constructor' hook approach. Kind Regards, Andre diff --git a/gcc/match.pd b/gcc/match.pd index

Re: [PATCH][AArch64] Implement ACLE Data Intrinsics

2022-08-11 Thread Andre Vieira (lists) via Gcc-patches
OK to backport this to gcc-12? It applies cleanly, and I did a bootstrap and regression test on aarch64-linux-gnu. Regards, Andre On 01/07/2022 12:26, Richard Sandiford wrote: "Andre Vieira (lists)" writes: On 29/06/2022 08:18, Richard Sandiford wrote: + break; +case AA

Re: [PATCH] Teach vectorizer to deal with bitfield accesses (was: [RFC] Teach vectorizer to deal with bitfield reads)

2022-08-16 Thread Andre Vieira (lists) via Gcc-patches
Hi, New version of the patch attached, but haven't recreated the ChangeLog yet, just waiting to see if this is what you had in mind. See also some replies to your comments in-line below: On 09/08/2022 15:34, Richard Biener wrote: @@ -2998,7 +3013,7 @@ ifcvt_split_critical_edges (class loop

[PATCH] Teach vectorizer to deal with bitfield accesses (was: [RFC] Teach vectorizer to deal with bitfield reads)

2022-08-08 Thread Andre Vieira (lists) via Gcc-patches
Hi, So I've changed the approach from the RFC as suggested, moving the bitfield lowering to the if-convert pass. So to reiterate, ifcvt will lower COMPONENT_REF's with DECL_BIT_FIELD field's to either BIT_FIELD_REF if they are reads or BIT_INSERT_EXPR if they are writes, using loads and
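
The lowering described in this entry can be pictured in scalar terms: a bitfield read becomes a load of a wider container followed by a shift and mask, and a bitfield write becomes a masked insert into that container. The following is a sketch under assumptions, not GCC's implementation: the function names are hypothetical, the container is fixed at 32 bits, and bits are numbered from the least significant end.

```cpp
#include <cassert>
#include <cstdint>

// Read side (in the spirit of BIT_FIELD_REF): extract `bitsize` bits
// starting at `bitpos` from an already loaded 32-bit container.
inline std::uint32_t example_bit_field_ref (std::uint32_t container,
                                            unsigned bitpos, unsigned bitsize)
{
  assert (bitsize > 0 && bitsize < 32 && bitpos + bitsize <= 32);
  std::uint32_t mask = (1u << bitsize) - 1;
  return (container >> bitpos) & mask;
}

// Write side (in the spirit of BIT_INSERT_EXPR): return the container
// with the field replaced by `value`; the other bits are left untouched.
inline std::uint32_t example_bit_insert (std::uint32_t container,
                                         std::uint32_t value,
                                         unsigned bitpos, unsigned bitsize)
{
  assert (bitsize > 0 && bitsize < 32 && bitpos + bitsize <= 32);
  std::uint32_t mask = ((1u << bitsize) - 1) << bitpos;
  return (container & ~mask) | ((value << bitpos) & mask);
}
```

Because both helpers are plain shifts and masks on whole words, the same operations can be applied lane by lane on vectors, which is what makes this form convenient for the vectorizer.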

Re: [RFC] Teach vectorizer to deal with bitfield reads

2022-08-01 Thread Andre Vieira (lists) via Gcc-patches
On 29/07/2022 11:31, Jakub Jelinek wrote: On Fri, Jul 29, 2022 at 09:57:29AM +0100, Andre Vieira (lists) via Gcc-patches wrote: The 'only on the vectorized code path' remains the same though as vect_recog also only happens on the vectorized code path right? if conversion (in some cases

Re: [RFC] Teach vectorizer to deal with bitfield reads

2022-08-01 Thread Andre Vieira (lists) via Gcc-patches
On 29/07/2022 11:52, Richard Biener wrote: On Fri, 29 Jul 2022, Jakub Jelinek wrote: On Fri, Jul 29, 2022 at 09:57:29AM +0100, Andre Vieira (lists) via Gcc-patches wrote: The 'only on the vectorized code path' remains the same though as vect_recog also only happens on the vectorized code

Re: [PATCH 2/2] aarch64: Add support for widening LDAPR instructions

2022-11-18 Thread Andre Vieira (lists) via Gcc-patches
ote: -Original Message- From: Richard Sandiford Sent: Tuesday, November 15, 2022 6:05 PM To: Andre Simoes Dias Vieira Cc: gcc-patches@gcc.gnu.org; Kyrylo Tkachov ; Richard Earnshaw Subject: Re: [PATCH 2/2] aarch64: Add support for widening LDAPR instructions "Andre Vieira (lists)"

Re: [PATCH 2/2] arm: Add support for MVE Tail-Predicated Low Overhead Loops

2022-11-15 Thread Andre Vieira (lists) via Gcc-patches
On 11/11/2022 17:40, Stam Markianos-Wright via Gcc-patches wrote: Hi all, This is the 2/2 patch that contains the functional changes needed for MVE Tail Predicated Low Overhead Loops.  See my previous email for a general introduction of MVE LOLs. This support is added through the already

Re: [PATCH] arm: Make MVE masked stores read memory operand [PR 108177]

2023-01-24 Thread Andre Vieira (lists) via Gcc-patches
ping. (reattaching patch in the hopes patchwork picks it up). On 13/01/2023 16:05, Andre Simoes Dias Vieira via Gcc-patches wrote: Hi, This patch adds the memory operand of MVE masked stores as input operands to mimic the 'partial' writes, to prevent erroneous write-after-write optimizations

[PATCH 2/3] arm: Remove unnecessary zero-extending of MVE predicates before use [PR 107674]

2023-01-24 Thread Andre Vieira (lists) via Gcc-patches
Hi, This patch teaches GCC that zero-extending an MVE predicate from 16 bits to 32 bits and then only using 16 bits is a no-op. It does so in two steps: - it lets GCC know that it can access any MVE predicate mode using any other MVE predicate mode without needing to copy it, using the

[PATCH 3/3] arm: Fix MVE predicates synthesis [PR 108443]

2023-01-24 Thread Andre Vieira (lists) via Gcc-patches
Hi, This patch fixes the way we synthesize MVE predicate immediates and fixes some other inconsistencies around predicates. For instance this patch fixes the modes used in the vctp intrinsics, to couple them with predicate modes with the appropriate lane numbers. For this V2QI is added to

[PATCH 1/3] arm: Fix sign of MVE predicate mve_pred16_t [PR 107674]

2023-01-24 Thread Andre Vieira (lists) via Gcc-patches
Hi, The ACLE defines mve_pred16_t as an unsigned short. This patch makes sure GCC treats the predicate as an unsigned type, rather than signed. Bootstrapped on aarch64-none-eabi and regression tested on arm-none-eabi and armeb-none-eabi for armv8.1-m.main+mve.fp. OK for trunk?
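
Why the signedness of a 16-bit predicate matters can be seen from how it widens to 32 bits. This is a generic illustration, not code from the patch: the helper names are hypothetical and the predicate is modelled as a raw 16-bit pattern.

```cpp
#include <cassert>
#include <cstdint>

// Widening through a signed 16-bit type replicates bit 15 into the
// upper half.  (The unsigned-to-signed cast is modular on GCC and
// Clang; it is only guaranteed by the standard from C++20 on.)
inline std::uint32_t widen_as_signed (std::uint16_t pred_bits)
{
  std::int16_t s = static_cast<std::int16_t> (pred_bits);
  return static_cast<std::uint32_t> (static_cast<std::int32_t> (s));
}

// Widening through an unsigned 16-bit type leaves the upper half zero.
inline std::uint32_t widen_as_unsigned (std::uint16_t pred_bits)
{
  return static_cast<std::uint32_t> (pred_bits);
}
```

The two agree whenever bit 15 is clear and differ as soon as it is set, so treating the type as signed changes the upper half of the widened value.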

[PATCH 0/3] arm: Fix regressions around MVE predicate codegen

2023-01-24 Thread Andre Vieira (lists) via Gcc-patches
Hi all, This patch series aims to fix two or three (depends on how you look at it) regressions that came about in gcc 11. The first and third patch address wrong-codegen regressions and the second a performance regression. Patch two makes a change to the mid-end so I can understand if there

Re: [PATCH 1/3] arm: Fix sign of MVE predicate mve_pred16_t [PR 107674]

2023-01-24 Thread Andre Vieira (lists) via Gcc-patches
I meant bootstrapped on aarch64-none-linux-gnu and not none-eabi. On 24/01/2023 13:40, Andre Vieira (lists) via Gcc-patches wrote: Hi, The ACLE defines mve_pred16_t as an unsigned short.  This patch makes sure GCC treats the predicate as an unsigned type, rather than signed. Bootstrapped

[PATCH] aarch64: Add aarch64*-*-* to the list of vect_long_long targets

2023-01-24 Thread Andre Vieira (lists) via Gcc-patches
Hi, This patch adds aarch64 to the list of vect_long_long targets. Regression tested on aarch64-none-linux-gnu. OK for trunk? gcc/testsuite/ChangeLog: * lib/target-supports.exp (check_effective_target_vect_long_long): Add aarch64 to list of targets supporting long long

Re: [PATCH 3/3] arm: Fix MVE predicates synthesis [PR 108443]

2023-01-25 Thread Andre Vieira (lists) via Gcc-patches
Looks like the first patch was missing a change I had made to prevent mve_bool_vec_to_const ICEing if called with a non-vector immediate. Now included. On 24/01/2023 13:56, Andre Vieira (lists) via Gcc-patches wrote: Hi, This patch fixes the way we synthesize MVE predicate immediates

Re: [PATCH 2/3] arm: Remove unnecessary zero-extending of MVE predicates before use [PR 107674]

2023-01-27 Thread Andre Vieira (lists) via Gcc-patches
On 26/01/2023 15:06, Kyrylo Tkachov wrote: Hi Andre, -Original Message- From: Andre Vieira (lists) Sent: Tuesday, January 24, 2023 1:54 PM To: gcc-patches@gcc.gnu.org Cc: Richard Sandiford ; Richard Earnshaw ; Richard Biener ; Kyrylo Tkachov Subject: [PATCH 2/3] arm: Remove

Re: [PATCH 1/3] arm: Fix sign of MVE predicate mve_pred16_t [PR 107674]

2023-01-27 Thread Andre Vieira (lists) via Gcc-patches
On 26/01/2023 15:02, Kyrylo Tkachov wrote: Hi Andre, -Original Message- From: Andre Vieira (lists) Sent: Tuesday, January 24, 2023 1:41 PM To: gcc-patches@gcc.gnu.org Cc: Kyrylo Tkachov ; Richard Earnshaw Subject: [PATCH 1/3] arm: Fix sign of MVE predicate mve_pred16_t [PR 107674

Re: [PATCH 1/3] arm: Fix sign of MVE predicate mve_pred16_t [PR 107674]

2023-01-30 Thread Andre Vieira (lists) via Gcc-patches
Here's a new version with a more robust test. OK for trunk? On 27/01/2023 09:56, Kyrylo Tkachov wrote: -Original Message- From: Andre Vieira (lists) Sent: Friday, January 27, 2023 9:54 AM To: Kyrylo Tkachov ; gcc-patches@gcc.gnu.org Cc: Richard Earnshaw Subject: Re: [PATCH 1/3

Re: [PATCH 2/3] arm: Remove unnecessary zero-extending of MVE predicates before use [PR 107674]

2023-01-30 Thread Andre Vieira (lists) via Gcc-patches
Changed the testcase to be more robust (as per the discussion for the first patch). Still need the OK for the mid-end (simplify-rtx) part. Kind regards, Andre On 27/01/2023 09:59, Kyrylo Tkachov wrote: -Original Message- From: Andre Vieira (lists) Sent: Friday, January 27, 2023 9

Re: [PATCH] arm: Fix MVE's vcmp vector-scalar patterns [PR107987]

2023-01-27 Thread Andre Vieira (lists) via Gcc-patches
This applies cleanly to gcc-12 and regressions for arm-none-eabi look clean. OK to apply to gcc-12? On 06/12/2022 11:23, Kyrylo Tkachov wrote: -Original Message- From: Andre Simoes Dias Vieira Sent: Tuesday, December 6, 2022 11:19 AM To: 'gcc-patches@gcc.gnu.org' Cc: Kyrylo

Re: [AArch64] Enable generation of FRINTNZ instructions

2022-11-04 Thread Andre Vieira (lists) via Gcc-patches
Sorry for the delay, just been reminded I still had this patch outstanding from last stage 1. Hopefully since it has been mostly reviewed it could go in for this stage 1? I addressed the comments and gave the slp-part of vectorizable_call some TLC to make it work. I also changed

[PATCH] ifcvt: Support bitfield lowering of multiple-exit loops

2022-11-03 Thread Andre Vieira (lists) via Gcc-patches
Hi, With Tamar's patch (https://gcc.gnu.org/pipermail/gcc-patches/2022-November/604880.html) enabling the vectorization of early-breaks, I'd like to allow bitfield lowering in such loops, which requires the relaxation of allowing multiple exits when doing so.  In order to avoid a similar

[PATCH 2/2] aarch64: Add support for widening LDAPR instructions

2022-11-10 Thread Andre Vieira (lists) via Gcc-patches
Hi, This patch adds support for the widening LDAPR instructions. Bootstrapped and regression tested on aarch64-none-linux-gnu. OK for trunk? 2022-11-09  Andre Vieira      Kyrylo Tkachov  gcc/ChangeLog:     * config/aarch64/atomics.md (*aarch64_atomic_load_rcpc_zext): New

[PATCH 1/2] aarch64: Enable the use of LDAPR for load-acquire semantics

2022-11-10 Thread Andre Vieira (lists) via Gcc-patches
Hello, This patch enables the use of LDAPR for load-acquire semantics. After some internal investigation based on the work published by Podkopaev et al. (https://dl.acm.org/doi/10.1145/3290382) we can confirm that using LDAPR for the C++ load-acquire semantics is a correct relaxation.
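As context for the message above, a minimal C11 sketch of a load-acquire paired with a store-release. The names `ready`, `payload`, `publish` and `consume` are hypothetical and do not come from the patch. Which instruction the compiler selects for the acquire load (LDAPR or LDAR) depends on the target options and the GCC version, and this sketch makes no assumption about that.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical flag and data, for illustration only. */
static atomic_int ready;
static int payload;

/* Writer: store the payload, then set the flag with release semantics.
   Values must be non-negative because consume() uses -1 as "not ready". */
void publish(int value)
{
    assert(value >= 0);
    payload = value;
    atomic_store_explicit(&ready, 1, memory_order_release);
}

/* Reader: the acquire load of the flag is the kind of operation the
   patch is about.  Returns the payload once published, -1 otherwise. */
int consume(void)
{
    if (atomic_load_explicit(&ready, memory_order_acquire))
        return payload;
    return -1;
}
```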

Re: [AArch64] Enable generation of FRINTNZ instructions

2022-11-07 Thread Andre Vieira (lists) via Gcc-patches
On 07/11/2022 11:05, Richard Biener wrote: On Fri, 4 Nov 2022, Andre Vieira (lists) wrote: Sorry for the delay, just been reminded I still had this patch outstanding from last stage 1. Hopefully since it has been mostly reviewed it could go in for this stage 1? I addressed the comments

Re: [AArch64] Enable generation of FRINTNZ instructions

2022-11-09 Thread Andre Vieira (lists) via Gcc-patches
On 07/11/2022 14:56, Richard Biener wrote: On Mon, 7 Nov 2022, Andre Vieira (lists) wrote: On 07/11/2022 11:05, Richard Biener wrote: On Fri, 4 Nov 2022, Andre Vieira (lists) wrote: Sorry for the delay, just been reminded I still had this patch outstanding from last stage 1. Hopefully

Re: [PATCH 1/2] aarch64: Enable the use of LDAPR for load-acquire semantics

2022-11-14 Thread Andre Vieira (lists) via Gcc-patches
    on code generation. gcc/testsuite/ChangeLog:     * gcc.target/aarch64/ldapr.c: New test. On 10/11/2022 15:55, Kyrylo Tkachov wrote: Hi Andre, -Original Message- From: Andre Vieira (lists) Sent: Thursday, November 10, 2022 11:17 AM To: gcc-patches@gcc.gnu.org Cc: Kyrylo

Re: [PATCH 1/2] aarch64: Enable the use of LDAPR for load-acquire semantics

2022-11-14 Thread Andre Vieira (lists) via Gcc-patches
On 14/11/2022 14:12, Kyrylo Tkachov wrote: -Original Message- From: Andre Vieira (lists) Sent: Monday, November 14, 2022 2:09 PM To: Kyrylo Tkachov ; gcc-patches@gcc.gnu.org Cc: Richard Earnshaw ; Richard Sandiford Subject: Re: [PATCH 1/2] aarch64: Enable the use of LDAPR for load

Re: [PATCH 2/2] aarch64: Add support for widening LDAPR instructions

2022-11-14 Thread Andre Vieira (lists) via Gcc-patches
Updated version of the patch to account for the testsuite changes in the first patch. On 10/11/2022 11:20, Andre Vieira (lists) via Gcc-patches wrote: Hi, This patch adds support for the widening LDAPR instructions. Bootstrapped and regression tested on aarch64-none-linux-gnu. OK for trunk

Re: [PATCH 3/3] arm: Fix MVE predicates synthesis [PR 108443]

2023-01-31 Thread Andre Vieira (lists) via Gcc-patches
Yeah that shouldn't be there, it's from an earlier version of the patch I wrote where I was experimenting changing the existing modes, I'll remove it from the ChangeLog. On 31/01/2023 09:53, Kyrylo Tkachov wrote: gcc/testsuite/ChangeLog:     * gcc.dg/rtl/arm/mve-vxbi.c: Use new

[RFC 0/X] Implement GCC support for AArch64 libmvec

2023-03-08 Thread Andre Vieira (lists) via Gcc-patches
Hi all, This is a series of patches/RFCs to implement support in GCC to be able to target AArch64's libmvec functions that will be/are being added to glibc. We have chosen to use the omp pragma '#pragma omp declare variant ...' with a simd construct as the way for glibc to inform GCC what

[PATCH 3/X] parloops: Allow poly number of iterations

2023-03-08 Thread Andre Vieira (lists) via Gcc-patches
Hi, This patch modifies this function in parloops to allow it to handle loops with poly iteration counts. gcc/ChangeLog: * tree-parloops.cc (try_transform_to_exit_first_loop_alt): Handle poly nits. Is this OK for Stage 1? diff --git a/gcc/tree-parloops.cc b/gcc/tree-parloops.cc

[PATCH 2/X] parloops: Copy target and optimizations when creating a function clone

2023-03-08 Thread Andre Vieira (lists) via Gcc-patches
Hi, This patch makes sure we copy over DECL_FUNCTION_SPECIFIC_{TARGET,OPTIMIZATION} in parloops when creating function clones. This is required for SVE clones as we will need to enable +sve for them, regardless of the current target options. I don't actually need the 'OPTIMIZATION' for this

[RFC 5/X] omp: Create simd clones from 'omp declare variant's

2023-03-08 Thread Andre Vieira (lists) via Gcc-patches
Hi, This RFC extends the omp-simd-clone pass to create simd clones for functions with 'omp declare variant' pragmas that contain simd constructs. This patch also implements AArch64's use for this functionality. This requires two extra pieces of information be kept for each simd-clone, a

[RFC 4/X] omp, aarch64: Add SVE support for 'omp declare simd' [PR 96342]

2023-03-08 Thread Andre Vieira (lists) via Gcc-patches
Hi, This patch adds SVE support for simd clone generation when using 'omp declare simd'. The design is based on what was discussed in PR 96342, but I did not look at YangYang's patch as I wasn't sure of whether that code's copyright had been assigned to FSF. This patch also is not in
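To illustrate the pragma this message refers to, a minimal sketch of `omp declare simd` on a scalar function that is then called from a `simd` loop. The functions `scale_add` and `scale_add_array` are hypothetical and are not part of the patch. The pragmas only take effect when OpenMP SIMD support is enabled (for example with `-fopenmp-simd`); without it they are ignored and the code behaves as plain scalar C. Which clones GCC generates for a given target is not assumed here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical element-wise function.  The pragma asks the compiler
   to also provide vector versions (simd clones) of it.  */
#pragma omp declare simd notinbranch
float scale_add(float x, float a, float b)
{
    return a * x + b;
}

/* Apply scale_add to every element; the simd loop is where a
   vectorizer could call a simd clone instead of the scalar function. */
void scale_add_array(float *out, const float *in, size_t n, float a, float b)
{
    assert(out != NULL && in != NULL);
#pragma omp simd
    for (size_t i = 0; i < n; i++)
        out[i] = scale_add(in[i], a, b);
}
```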

[RFC 6/X] omp: Allow creation of simd clones from omp declare variant with -fopenmp-simd flag

2023-03-08 Thread Andre Vieira (lists) via Gcc-patches
Hi, This RFC is to propose relaxing the flag needed to allow the creation of simd clones from omp declare variants, such that we can use -fopenmp-simd rather than -fopenmp. This should only change the behaviour of omp simd clones and should not enable any other openmp functionality, though I

[PATCH 1/X] omp: Replace simd_clone_subparts with TYPE_VECTOR_SUBPARTS

2023-03-08 Thread Andre Vieira (lists) via Gcc-patches
Hi, This patch replaces the uses of simd_clone_subparts with TYPE_VECTOR_SUBPARTS and removes the definition of the former. gcc/ChangeLog: * omp-simd-clone.cc (simd_clone_subparts): Remove. (simd_clone_init_simd_arrays): Replace simd_clone_subparts with TYPE_VECTOR_SUBPARTS.

Re: [ping][vect-patterns] Refactor widen_plus/widen_minus as internal_fns

2023-03-17 Thread Andre Vieira (lists) via Gcc-patches
Hi Richard, I'm only picking this up now. Just going through your earlier comments and stuff and I noticed we didn't address the situation with the gimple::build. Do you want me to add overloaded static member functions to cover all gimple_build_* functions, or just create one to replace

[PATCH] ifcvt: Lower bitfields only if suitable for scalar register [PR tree/109005]

2023-03-13 Thread Andre Vieira (lists) via Gcc-patches
This patch fixes the eligibility check for lowering bitfields. Where before we would check for non-BLKmode types, in the hope of excluding unsuitable aggregate types, we now check directly that the representative is not an aggregate type, i.e. that it is suitable for a scalar register. I tried

Re: [PATCH] simplify-rtx: Fix VOIDmode operand handling in simplify_subreg [PR108805]

2023-03-02 Thread Andre Vieira (lists) via Gcc-patches
Hey both, Sorry about that, don't know how I missed those. Just running a test on that now and will commit when it's done. I assume the comment and 0 -> byte change can be seen as obvious, especially since it was supposed to be in my original patch... On 27/02/2023 15:46, Richard Sandiford

Re: [PATCH] simplify-rtx: Fix VOIDmode operand handling in simplify_subreg [PR108805]

2023-03-02 Thread Andre Vieira (lists) via Gcc-patches
Committed attached patch. On 02/03/2023 10:13, Richard Sandiford wrote: "Andre Vieira (lists)" writes: Hey both, Sorry about that, don't know how I missed those. Just running a test on that now and will commit when it's done. I assume the comment and 0 -> byte change can be se

Re: [PATCH] amdgcn: Enable SIMD vectorization of math functions

2023-03-01 Thread Andre Vieira (lists) via Gcc-patches
On 01/03/2023 10:01, Andrew Stubbs wrote: > On 28/02/2023 23:01, Kwok Cheung Yeung wrote: >> Hello >> >> This patch implements the TARGET_VECTORIZE_BUILTIN_VECTORIZED_FUNCTION >> target hook for the AMD GCN architecture, such that when vectorized, >> calls to builtin standard math functions

Re: [ping][vect-patterns] Refactor widen_plus/widen_minus as internal_fns

2023-04-20 Thread Andre Vieira (lists) via Gcc-patches
Rebased all three patches and made some small changes to the second one: - removed sub and abd optabs from commutative_optab_p, I suspect this was a copy paste mistake, - removed what I believe to be a superfluous switch case in vectorizable conversion, the one that was here: + if

Re: [RFC 0/X] Implement GCC support for AArch64 libmvec

2023-04-20 Thread Andre Vieira (lists) via Gcc-patches
On 20/04/2023 15:51, Richard Sandiford wrote: "Andre Vieira (lists)" writes: Hi all, This is a series of patches/RFCs to implement support in GCC to be able to target AArch64's libmvec functions that will be/are being added to glibc. We have chosen to use the omp pragma '#
