Re: [PATCH 6/8] vect: Add vector_mode paramater to simd_clone_usable

2023-08-30 Thread Andre Vieira (lists) via Gcc-patches
On 30/08/2023 14:01, Richard Biener wrote: On Wed, Aug 30, 2023 at 11:15 AM Andre Vieira (lists) via Gcc-patches wrote: This patch adds a machine_mode parameter to the TARGET_SIMD_CLONE_USABLE hook to enable rejecting SVE modes when the target architecture does not support SVE. How does

[PATCH 8/8] aarch64: Add SVE support for simd clones [PR 96342]

2023-08-30 Thread Andre Vieira (lists) via Gcc-patches
This patch finalizes adding support for the generation of SVE simd clones when no simdlen is provided, following the ABI rules where the widest data type determines the minimum amount of elements in a length agnostic vector. gcc/ChangeLog: * config/aarch64/aarch64-protos.h
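As a rough illustration of the rule mentioned in this message (not code from the patch): assuming the smallest SVE vector length is 128 bits, the minimum lane count of a length-agnostic vector follows from the widest data type. The helper name `sve_min_lanes` is hypothetical.

```c
#include <assert.h>

/* Hypothetical helper, not part of GCC: minimum number of lanes in a
   length-agnostic vector, assuming a 128-bit minimum SVE vector length.  */
static unsigned
sve_min_lanes (unsigned widest_type_bits)
{
  /* Assumed precondition: the widest type evenly divides 128 bits.  */
  assert (widest_type_bits != 0 && 128 % widest_type_bits == 0);
  return 128 / widest_type_bits;
}
```

Under that assumption, a clone whose widest type is a 64-bit double would have at least two lanes, and one whose widest type is a 32-bit float at least four.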

[PATCH7/8] vect: Add TARGET_SIMD_CLONE_ADJUST_RET_OR_PARAM

2023-08-30 Thread Andre Vieira (lists) via Gcc-patches
This patch adds a new target hook to enable us to adapt the return and parameter types of simd clones. We use this in two ways, the first one is to make sure we can create valid SVE types, including the SVE type attribute, when creating an SVE simd clone, even when the target options do

Re: [PATCH 6/8] vect: Add vector_mode paramater to simd_clone_usable

2023-08-30 Thread Andre Vieira (lists) via Gcc-patches
Forgot to CC this one to maintainers... On 30/08/2023 10:14, Andre Vieira (lists) via Gcc-patches wrote: This patch adds a machine_mode parameter to the TARGET_SIMD_CLONE_USABLE hook to enable rejecting SVE modes when the target architecture does not support SVE. gcc/ChangeLog

[PATCH 6/8] vect: Add vector_mode paramater to simd_clone_usable

2023-08-30 Thread Andre Vieira (lists) via Gcc-patches
This patch adds a machine_mode parameter to the TARGET_SIMD_CLONE_USABLE hook to enable rejecting SVE modes when the target architecture does not support SVE. gcc/ChangeLog: * config/aarch64/aarch64.cc (aarch64_simd_clone_usable): Add mode parameter and use it to reject SVE

[PATCH 5/8] vect: Use inbranch simdclones in masked loops

2023-08-30 Thread Andre Vieira (lists) via Gcc-patches
This patch enables the compiler to use inbranch simdclones when generating masked loops in autovectorization. gcc/ChangeLog: * omp-simd-clone.cc (simd_clone_adjust_argument_types): Make function compatible with mask parameters in clone. * tree-vect-stmts.cc

[PATCH 4/8] vect: don't allow fully masked loops with non-masked simd clones [PR 110485]

2023-08-30 Thread Andre Vieira (lists) via Gcc-patches
When analyzing a loop and choosing a simdclone to use it is possible to choose a simdclone that cannot be used 'inbranch' for a loop that can use partial vectors. This may lead to the vectorizer deciding to use partial vectors which are not supported for notinbranch simd clones. This patch

[Patch 3/8] vect: Fix vect_get_smallest_scalar_type for simd clones

2023-08-30 Thread Andre Vieira (lists) via Gcc-patches
The vect_get_smallest_scalar_type helper function was using any argument to a simd clone call when trying to determine the smallest scalar type that would be vectorized. This included the function pointer type in a MASK_CALL for instance, and would result in the wrong type being selected.

[Patch 2/8] parloops: Allow poly nit and bound

2023-08-30 Thread Andre Vieira (lists) via Gcc-patches
Teach parloops how to handle a poly nit and bound ahead of the changes to enable non-constant simdlen. gcc/ChangeLog: * tree-parloops.cc (try_transform_to_exit_first_loop_alt): Accept poly NIT and ALT_BOUND. diff --git a/gcc/tree-parloops.cc b/gcc/tree-parloops.cc index

[PATCH 1/8] parloops: Copy target and optimizations when creating a function clone

2023-08-30 Thread Andre Vieira (lists) via Gcc-patches
SVE simd clones need to be compiled with an SVE target enabled or the argument types will not be created properly. To achieve this we need to copy DECL_FUNCTION_SPECIFIC_TARGET from the original function declaration to the clones. I decided it was probably also a good idea to copy

aarch64, vect, omp: Add SVE support for simd clones [PR 96342]

2023-08-30 Thread Andre Vieira (lists) via Gcc-patches
Hi, This patch series aims to implement support for SVE simd clones when not specifying a 'simdlen' clause for AArch64. This patch depends on my earlier patch: '[PATCH] aarch64: enable mixed-types for aarch64 simdclones'. Bootstrapped and regression tested the series on

Re: [PATCH] aarch64: enable mixed-types for aarch64 simdclones

2023-08-29 Thread Andre Vieira (lists) via Gcc-patches
Hi, This patch enables the use of mixed-types for simd clones for AArch64, adds aarch64 as a target_vect_simd_clones and corrects the way the simdlen is chosen for non-specified simdlen clauses according to the 'Vector Function Application Binary Interface Specification for AArch64'.

Re: [PATCH] aarch64: enable mixed-types for aarch64 simdclones

2023-08-09 Thread Andre Vieira (lists) via Gcc-patches
On 09/08/2023 17:55, Richard Sandiford wrote: "Andre Vieira (lists)" writes: On 08/08/2023 11:51, Richard Sandiford wrote: "Andre Vieira (lists)" writes: warning_at (DECL_SOURCE_LOCATION (node->decl), 0, - "unsupported return type %qT for % functions", +

Re: [PATCH] aarch64: enable mixed-types for aarch64 simdclones

2023-08-09 Thread Andre Vieira (lists) via Gcc-patches
Here is my new version, see inline response to your comments. New cover letter: This patch enables the use of mixed-types for simd clones for AArch64, adds aarch64 as a target_vect_simd_clones and corrects the way the simdlen is chosen for non-specified simdlen clauses according to the

[PATCH] aarch64: enable mixed-types for aarch64 simdclones

2023-07-26 Thread Andre Vieira (lists) via Gcc-patches
Hi, This patch enables the use of mixed-types for simd clones for AArch64 and adds aarch64 as a target_vect_simd_clones. Bootstrapped and regression tested on aarch64-unknown-linux-gnu gcc/ChangeLog: * config/aarch64/aarch64.cc (currently_supported_simd_type): Remove.

Re: [PATCH] Include insn-opinit.h in PLUGIN_H [PR110610]

2023-07-17 Thread Andre Vieira (lists) via Gcc-patches
On 11/07/2023 23:28, Jeff Law wrote: On 7/11/23 04:37, Andre Vieira (lists) via Gcc-patches wrote: Hi, This patch fixes PR110610 by including OPTABS_H in the INTERNAL_FN_H list, as insn-opinit.h is now required by internal-fn.h. This will lead to insn-opinit.h, among the other OPTABS_H

[PATCH] Include insn-opinit.h in PLUGIN_H [PR110610]

2023-07-11 Thread Andre Vieira (lists) via Gcc-patches
Hi, This patch fixes PR110610 by including OPTABS_H in the INTERNAL_FN_H list, as insn-opinit.h is now required by internal-fn.h. This will lead to insn-opinit.h, among the other OPTABS_H header files, being installed in the plugin directory. Bootstrapped aarch64-unknown-linux-gnu. @Jakub:

[PATCH] vect: Treat vector widening IFN calls as 'simple' [PR110436]

2023-07-03 Thread Andre Vieira (lists) via Gcc-patches
Hi, This patch makes the vectorizer treat any vector widening IFN as simple, like it did with the tree codes VEC_WIDEN_*. I wasn't sure whether I should make all IFN's simple and then exclude some (like GOMP_ ones), or include more than just the new widening IFNs. But since this is the only

Re: [PATCH 2/2] arm: Add support for MVE Tail-Predicated Low Overhead Loops

2023-06-23 Thread Andre Vieira (lists) via Gcc-patches
+ /* In order to find out if the loop is of type A or B above look for the + loop counter: it will either be incrementing by one per iteration or + it will be decrementing by num_of_lanes. We can find the loop counter + in the condition at the end of the loop. */ + rtx_insn
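The two counter styles described in this comment can be sketched in plain C. This is only an illustration of the distinction, not code from the patch; the "type A or B" definitions are not shown in the excerpt, and the function names below are hypothetical.

```c
#include <assert.h>

/* Hypothetical sketch: the counter increments by one per iteration and
   the trip count is computed up front.  */
static unsigned
trips_incrementing (unsigned num_elems, unsigned num_of_lanes)
{
  assert (num_of_lanes != 0);
  unsigned trips = (num_elems + num_of_lanes - 1) / num_of_lanes;
  unsigned count = 0;
  for (unsigned i = 0; i < trips; i++)
    count++;
  return count;
}

/* Hypothetical sketch: the counter holds the remaining element count and
   decrements by num_of_lanes per iteration.  */
static unsigned
trips_decrementing (unsigned num_elems, unsigned num_of_lanes)
{
  assert (num_of_lanes != 0);
  unsigned count = 0;
  unsigned remaining = num_elems;
  while (remaining > 0)
    {
      count++;
      /* Clamp at zero so a partial final iteration ends the loop.  */
      remaining = remaining > num_of_lanes ? remaining - num_of_lanes : 0;
    }
  return count;
}
```

Both forms run the same number of iterations; they differ only in what the counter tracks, which is what the comment says can be read off the condition at the end of the loop.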

Re: [PATCH 2/2] arm: Add support for MVE Tail-Predicated Low Overhead Loops

2023-06-23 Thread Andre Vieira (lists) via Gcc-patches
+ if (insn != arm_mve_get_loop_vctp (body)) +{ probably a good idea to invert the condition here and return false, helps reducing the indenting in this function. + /* Starting from the current insn, scan backwards through the insn + chain until BB_HEAD: "for each insn in

Re: [PATCH 2/2] arm: Add support for MVE Tail-Predicated Low Overhead Loops

2023-06-22 Thread Andre Vieira (lists) via Gcc-patches
Some comments below, all quite minor. I'll continue to review tomorrow, I need a fresher brain for arm_mve_check_df_chain_back_for_implic_predic ;) +static int +arm_mve_get_vctp_lanes (rtx x) +{ + if (GET_CODE (x) == SET && GET_CODE (XEXP (x, 1)) == UNSPEC + && (XINT (XEXP (x, 1), 1) ==

Re: [PATCH] inline: improve internal function costs

2023-06-12 Thread Andre Vieira (lists) via Gcc-patches
On 05/06/2023 04:04, Jan Hubicka wrote: On Thu, 1 Jun 2023, Andre Vieira (lists) wrote: Hi, This is a follow-up of the internal function patch to add widening and narrowing patterns. This patch improves the inliner cost estimation for internal functions. I have no idea why calls are

vect: Don't pass subtype to vect_widened_op_tree where not needed [PR 110142]

2023-06-07 Thread Andre Vieira (lists) via Gcc-patches
Hi, This patch fixes an issue introduced by g:2f482a07365d9f4a94a56edd13b7f01b8f78b5a0, where a subtype was being passed to vect_widened_op_tree, when no subtype was to be used. This led to an erroneous use of IFN_VEC_WIDEN_MINUS. gcc/ChangeLog: * tree-vect-patterns.cc

Re: [PATCH] modula2: Fix bootstrap

2023-06-07 Thread Andre Vieira (lists) via Gcc-patches
Thanks Jakub! I do need those includes, and sorry I broke your bootstrap; it didn't show up on my aarch64-unknown-linux-gnu bootstrap, I'm guessing the rules there were just run in a different order. Glad you were able to fix it :) On 06/06/2023 22:28, Jakub Jelinek wrote: Hi! internal-fn.h

Re: [PATCH] inline: improve internal function costs

2023-06-02 Thread Andre Vieira (lists) via Gcc-patches
On 02/06/2023 10:13, Richard Biener wrote: On Thu, 1 Jun 2023, Andre Vieira (lists) wrote: Hi, This is a follow-up of the internal function patch to add widening and narrowing patterns. This patch improves the inliner cost estimation for internal functions. I have no idea why calls are

[PATCH] gimple-range: implement widen plus range

2023-06-01 Thread Andre Vieira (lists) via Gcc-patches
Hi, This patch adds gimple-range information for the new IFN_VEC_WIDEN_PLUS* internal functions, identical to what VEC_WIDEN_PLUS did. Bootstrapped and regression tested on aarch64-unknown-linux-gnu. gcc/ChangeLog: * gimple-range-op.cc (gimple_range_op_handler::maybe_non_standard):

[PATCH] inline: improve internal function costs

2023-06-01 Thread Andre Vieira (lists) via Gcc-patches
Hi, This is a follow-up of the internal function patch to add widening and narrowing patterns. This patch improves the inliner cost estimation for internal functions. Bootstrapped and regression tested on aarch64-unknown-linux-gnu. gcc/ChangeLog: * ipa-fnsummary.cc

Re: [PATCH 2/3] Refactor widen_plus as internal_fn

2023-06-01 Thread Andre Vieira (lists) via Gcc-patches
Hi, This is the updated patch and cover letter. Patches for inline and gimple-op changes will follow soon. DEF_INTERNAL_WIDENING_OPTAB_FN and DEF_INTERNAL_NARROWING_OPTAB_FN are like DEF_INTERNAL_SIGNED_OPTAB_FN and DEF_INTERNAL_OPTAB_FN respectively. With the exception that they

Re: [PATCH 2/3] Refactor widen_plus as internal_fn

2023-05-18 Thread Andre Vieira (lists) via Gcc-patches
How about this? Not sure about the DEF_INTERNAL documentation I rewrote in internal-fn.def, was struggling to word these, so improvements welcome! gcc/ChangeLog: 2023-04-25 Andre Vieira Joel Hutton Tamar Christina * config/aarch64/aarch64-simd.md

Re: [PATCH 2/3] Refactor widen_plus as internal_fn

2023-05-15 Thread Andre Vieira (lists) via Gcc-patches
On 15/05/2023 12:01, Richard Biener wrote: On Mon, 15 May 2023, Richard Sandiford wrote: Richard Biener writes: On Fri, 12 May 2023, Richard Sandiford wrote: Richard Biener writes: On Fri, 12 May 2023, Andre Vieira (lists) wrote: I have dealt with, I think..., most of your comments.

Re: [PATCH 2/3] Refactor widen_plus as internal_fn

2023-05-12 Thread Andre Vieira (lists) via Gcc-patches
On 12/05/2023 14:28, Richard Biener wrote: On Fri, 12 May 2023, Andre Vieira (lists) wrote: I have dealt with, I think..., most of your comments. There's quite a few changes, I think it's all a bit simpler now. I made some other changes to the costing in tree-inline.cc and

Re: [PATCH 3/3] Remove widen_plus/minus_expr tree codes

2023-05-12 Thread Andre Vieira (lists) via Gcc-patches
Moved the 'changes' from this patch back to the second so it's all just about removing code that we no longer use. I don't really know why Joel formatted the patches this way, but I thought I'd keep it as is for now. cover letter: This patch removes the old widen plus/minus tree codes which

Re: [PATCH 2/3] Refactor widen_plus as internal_fn

2023-05-12 Thread Andre Vieira (lists) via Gcc-patches
I have dealt with, I think..., most of your comments. There's quite a few changes, I think it's all a bit simpler now. I made some other changes to the costing in tree-inline.cc and gimple-range-op.cc in which I try to preserve the same behaviour as we had with the tree codes before. Also

Re: [PATCH 1/3] Refactor to allow internal_fn's

2023-05-12 Thread Andre Vieira (lists) via Gcc-patches
Hi, I think I tackled all of your comments, let me know if I missed something. gcc/ChangeLog: 2023-05-12 Andre Vieira Joel Hutton * tree-vect-patterns.cc (vect_gimple_build): New Function. (vect_recog_widen_op_pattern): Refactor to use code_helper. *

Re: [PATCH 3/3] Remove widen_plus/minus_expr tree codes

2023-05-10 Thread Andre Vieira (lists) via Gcc-patches
On 03/05/2023 13:29, Richard Biener wrote: On Fri, 28 Apr 2023, Andre Vieira (lists) wrote: This is a rebase of Joel's previous patch. This patch removes the old widen plus/minus tree codes which have been replaced by internal functions. I guess that's obvious then. I wonder what we do

Re: [PATCH 1/3] Refactor to allow internal_fn's

2023-05-04 Thread Andre Vieira (lists) via Gcc-patches
On 03/05/2023 12:55, Richard Biener wrote: On Fri, 28 Apr 2023, Andre Vieira (lists) wrote: Hi, I'm posting the patches separately now with ChangeLogs. I made the suggested changes and tried to simplify the code a bit further. Where internal to tree-vect-stmts I changed most functions to

Re: [ping][vect-patterns] Refactor widen_plus/widen_minus as internal_fns

2023-04-28 Thread Andre Vieira (lists) via Gcc-patches
On 25/04/2023 13:30, Richard Biener wrote: On Mon, 24 Apr 2023, Richard Sandiford wrote: Richard Biener writes: On Thu, Apr 20, 2023 at 3:24 PM Andre Vieira (lists) via Gcc-patches wrote: Rebased all three patches and made some small changes to the second one: - removed sub and abd

[PATCH 2/3] Refactor widen_plus as internal_fn

2023-04-28 Thread Andre Vieira (lists) via Gcc-patches
This patch replaces the existing tree_code widen_plus and widen_minus patterns with internal_fn versions. DEF_INTERNAL_OPTAB_HILO_FN is like DEF_INTERNAL_OPTAB_FN except it provides convenience wrappers for defining conversions that require a hi/lo split, like widening and narrowing

[PATCH 1/3] Refactor to allow internal_fn's

2023-04-28 Thread Andre Vieira (lists) via Gcc-patches
Hi, I'm posting the patches separately now with ChangeLogs. I made the suggested changes and tried to simplify the code a bit further. Where internal to tree-vect-stmts I changed most functions to use code_helper to avoid having to check at places we didn't need to. I was trying to simplify

[PATCH 3/3] Remove widen_plus/minus_expr tree codes

2023-04-28 Thread Andre Vieira (lists) via Gcc-patches
This is a rebase of Joel's previous patch. This patch removes the old widen plus/minus tree codes which have been replaced by internal functions. gcc/ChangeLog: 2023-04-28 Andre Vieira Joel Hutton * doc/generic.texi: Remove old tree codes. * expr.cc

Re: [ping][vect-patterns] Refactor widen_plus/widen_minus as internal_fns

2023-04-25 Thread Andre Vieira (lists) via Gcc-patches
On 24/04/2023 12:57, Richard Biener wrote: On Thu, Apr 20, 2023 at 3:24 PM Andre Vieira (lists) via Gcc-patches wrote: Rebased all three patches and made some small changes to the second one: - removed sub and abd optabs from commutative_optab_p, I suspect this was a copy paste mistake

Re: [RFC 0/X] Implement GCC support for AArch64 libmvec

2023-04-21 Thread Andre Vieira (lists) via Gcc-patches
On 20/04/2023 17:13, Richard Sandiford wrote: "Andre Vieira (lists)" writes: On 20/04/2023 15:51, Richard Sandiford wrote: "Andre Vieira (lists)" writes: Hi all, This is a series of patches/RFCs to implement support in GCC to be able to target AArch64's libmvec functions that will

Re: [RFC 0/X] Implement GCC support for AArch64 libmvec

2023-04-20 Thread Andre Vieira (lists) via Gcc-patches
On 20/04/2023 15:51, Richard Sandiford wrote: "Andre Vieira (lists)" writes: Hi all, This is a series of patches/RFCs to implement support in GCC to be able to target AArch64's libmvec functions that will be/are being added to glibc. We have chosen to use the omp pragma '#pragma omp

Re: [ping][vect-patterns] Refactor widen_plus/widen_minus as internal_fns

2023-04-20 Thread Andre Vieira (lists) via Gcc-patches
Rebased all three patches and made some small changes to the second one: - removed sub and abd optabs from commutative_optab_p, I suspect this was a copy paste mistake, - removed what I believe to be a superfluous switch case in vectorizable conversion, the one that was here: + if

Re: [PATCH] aarch64: Add -mveclibabi=sleefgnu

2023-04-14 Thread Andre Vieira (lists) via Gcc-patches
No problem Lou and testing will be appreciated. I strongly recommend against rebasing the version that is on the mailing list now, the conflicts with Andrew's patches aren't simple to resolve ;) I'll do my best to get you revised versions next week :) On 14/04/2023 16:07, Lou Knauer wrote:

Re: [PATCH] Fix vect-simd-clone testcase dump scanning

2023-04-14 Thread Andre Vieira (lists) via Gcc-patches
SGTM On 14/04/2023 12:00, Jakub Jelinek wrote: On Fri, Apr 14, 2023 at 11:59:02AM +0100, Andre Vieira (lists) wrote: On the other thread I commented that inbranch simdclones are failing for AVX512F because it sets the mask_mode, for which inbranch hasn't been implemented, and so it is

Re: [r13-7135 Regression] FAIL: gcc.dg/vect/vect-simd-clone-18f.c scan-tree-dump-times vect "[\\n\\r] [^\\n]* = foo\\.simdclone" 2 on Linux/x86_64

2023-04-14 Thread Andre Vieira (lists) via Gcc-patches
On 14/04/2023 12:47, Richard Biener wrote: On Fri, Apr 14, 2023 at 11:42 AM Andre Vieira (lists) wrote: Ah, but then vect_get_smallest_scalar_type should simply ignore that pointer in MASK_CALL. It should only look at the arguments relevant for vectorization. So diff --git

Re: [PATCH] Fix vect-simd-clone testcase dump scanning

2023-04-14 Thread Andre Vieira (lists) via Gcc-patches
On the other thread I commented that inbranch simdclones are failing for AVX512F because it sets the mask_mode, for which inbranch hasn't been implemented, and so it is rejected. On 14/04/2023 11:25, Jakub Jelinek via Gcc-patches wrote: On Fri, Apr 14, 2023 at 10:15:06AM +, Richard Biener

Re: [PATCH] aarch64: Add -mveclibabi=sleefgnu

2023-04-14 Thread Andre Vieira (lists) via Gcc-patches
I have (outdated) RFCs here: https://gcc.gnu.org/pipermail/gcc-patches/2023-March/613593.html I am working on this patch series for stage 1. The features I am working on are: * SVE support for #pragma omp declare simd * Support for simdclone usage in autovec from #pragma omp declare

Re: [r13-7135 Regression] FAIL: gcc.dg/vect/vect-simd-clone-18f.c scan-tree-dump-times vect "[\\n\\r] [^\\n]* = foo\\.simdclone" 2 on Linux/x86_64

2023-04-14 Thread Andre Vieira (lists) via Gcc-patches
On Thu, Apr 13, 2023 at 4:25 PM Andre Vieira (lists) wrote: On 13/04/2023 15:00, Richard Biener wrote: On Thu, Apr 13, 2023 at 3:00 PM Andre Vieira (lists) via Gcc-patches

Re: [r13-7135 Regression] FAIL: gcc.dg/vect/vect-simd-clone-18f.c scan-tree-dump-times vect "[\\n\\r] [^\\n]* = foo\\.simdclone" 2 on Linux/x86_64

2023-04-14 Thread Andre Vieira (lists) via Gcc-patches
On 13/04/2023 15:00, Richard Biener wrote: On Thu, Apr 13, 2023 at 3:00 PM Andre Vieira (lists) via Gcc-patches wrote: But that's not it, I've been looking at it, and

Re: [r13-7135 Regression] FAIL: gcc.dg/vect/vect-simd-clone-18f.c scan-tree-dump-times vect "[\\n\\r] [^\\n]* = foo\\.simdclone" 2 on Linux/x86_64

2023-04-13 Thread Andre Vieira (lists) via Gcc-patches
On 13/04/2023 15:00, Richard Biener wrote: On Thu, Apr 13, 2023 at 3:00 PM Andre Vieira (lists) via Gcc-patches wrote: On 13/04/2023 11:01, Andrew Stubbs wrote: Hi Andre, I don't have a cascadelake device to test on, nor any knowledge about what makes it different from regular x86_64

Re: [r13-7135 Regression] FAIL: gcc.dg/vect/vect-simd-clone-18f.c scan-tree-dump-times vect "[\\n\\r] [^\\n]* = foo\\.simdclone" 2 on Linux/x86_64

2023-04-13 Thread Andre Vieira (lists) via Gcc-patches
On 13/04/2023 11:01, Andrew Stubbs wrote: Hi Andre, I don't have a cascadelake device to test on, nor any knowledge about what makes it different from regular x86_64. Not sure you need one, but yeah I don't know either, it looks like it fails because: in-branch vector clones are not yet

Re: [PATCH] tree-optimization/108888 - call if-conversion

2023-04-05 Thread Andre Vieira (lists) via Gcc-patches
Hi, The original patch to fix this PR broke the if-conversion of calls into IFN_MASK_CALL. This patch restores that original behaviour and makes sure the tests added earlier specifically test inbranch SIMD clones. Bootstrapped and regression tested on aarch64-none-linux-gnu and

Re: [ping][vect-patterns] Refactor widen_plus/widen_minus as internal_fns

2023-03-17 Thread Andre Vieira (lists) via Gcc-patches
Hi Richard, I'm only picking this up now. Just going through your earlier comments and stuff and I noticed we didn't address the situation with the gimple::build. Do you want me to add overloaded static member functions to cover all gimple_build_* functions, or just create one to replace

[PATCH] ifcvt: Lower bitfields only if suitable for scalar register [PR tree/109005]

2023-03-13 Thread Andre Vieira (lists) via Gcc-patches
This patch fixes the condition check for eligibility of lowering bitfields, where before we would check for non-BLKmode types, in the hope of excluding unsuitable aggregate types, we now check directly the representative is not an aggregate type, i.e. suitable for a scalar register. I tried

[RFC 6/X] omp: Allow creation of simd clones from omp declare variant with -fopenmp-simd flag

2023-03-08 Thread Andre Vieira (lists) via Gcc-patches
Hi, This RFC is to propose relaxing the flag needed to allow the creation of simd clones from omp declare variants, such that we can use -fopenmp-simd rather than -fopenmp. This should only change the behaviour of omp simd clones and should not enable any other openmp functionality, though I

[RFC 5/X] omp: Create simd clones from 'omp declare variant's

2023-03-08 Thread Andre Vieira (lists) via Gcc-patches
Hi, This RFC extends the omp-simd-clone pass to create simd clones for functions with 'omp declare variant' pragmas that contain simd constructs. This patch also implements AArch64's use for this functionality. This requires two extra pieces of information be kept for each simd-clone, a

[RFC 4/X] omp, aarch64: Add SVE support for 'omp declare simd' [PR 96342]

2023-03-08 Thread Andre Vieira (lists) via Gcc-patches
Hi, This patch adds SVE support for simd clone generation when using 'omp declare simd'. The design is based on what was discussed in PR 96342, but I did not look at YangYang's patch as I wasn't sure of whether that code's copyright had been assigned to FSF. This patch also is not in

[PATCH 3/X] parloops: Allow poly number of iterations

2023-03-08 Thread Andre Vieira (lists) via Gcc-patches
Hi, This patch modifies this function in parloops to allow it to handle loops with poly iteration counts. gcc/ChangeLog: * tree-parloops.cc (try_transform_to_exit_first_loop_alt): Handle poly nits. Is this OK for Stage 1? diff --git a/gcc/tree-parloops.cc b/gcc/tree-parloops.cc

[PATCH 2/X] parloops: Copy target and optimizations when creating a function clone

2023-03-08 Thread Andre Vieira (lists) via Gcc-patches
Hi, This patch makes sure we copy over DECL_FUNCTION_SPECIFIC_{TARGET,OPTIMIZATION} in parloops when creating function clones. This is required for SVE clones as we will need to enable +sve for them, regardless of the current target options. I don't actually need the 'OPTIMIZATION' for this

[PATCH 1/X] omp: Replace simd_clone_subparts with TYPE_VECTOR_SUBPARTS

2023-03-08 Thread Andre Vieira (lists) via Gcc-patches
Hi, This patch replaces the uses of simd_clone_subparts with TYPE_VECTOR_SUBPARTS and removes the definition of the former. gcc/ChangeLog: * omp-simd-clone.cc (simd_clone_subparts): Remove. (simd_clone_init_simd_arrays): Replace simd_clone_subparts with TYPE_VECTOR_SUBPARTS.

[RFC 0/X] Implement GCC support for AArch64 libmvec

2023-03-08 Thread Andre Vieira (lists) via Gcc-patches
Hi all, This is a series of patches/RFCs to implement support in GCC to be able to target AArch64's libmvec functions that will be/are being added to glibc. We have chosen to use the omp pragma '#pragma omp declare variant ...' with a simd construct as the way for glibc to inform GCC what

Re: [PATCH] simplify-rtx: Fix VOIDmode operand handling in simplify_subreg [PR108805]

2023-03-02 Thread Andre Vieira (lists) via Gcc-patches
Committed attached patch. On 02/03/2023 10:13, Richard Sandiford wrote: "Andre Vieira (lists)" writes: Hey both, Sorry about that, don't know how I missed those. Just running a test on that now and will commit when it's done. I assume the comment and 0 -> byte change can be seen as obvious,

Re: [PATCH] simplify-rtx: Fix VOIDmode operand handling in simplify_subreg [PR108805]

2023-03-02 Thread Andre Vieira (lists) via Gcc-patches
Hey both, Sorry about that, don't know how I missed those. Just running a test on that now and will commit when it's done. I assume the comment and 0 -> byte change can be seen as obvious, especially since it was supposed to be in my original patch... On 27/02/2023 15:46, Richard Sandiford

Re: [PATCH] amdgcn: Enable SIMD vectorization of math functions

2023-03-01 Thread Andre Vieira (lists) via Gcc-patches
On 01/03/2023 10:01, Andrew Stubbs wrote: > On 28/02/2023 23:01, Kwok Cheung Yeung wrote: >> Hello >> >> This patch implements the TARGET_VECTORIZE_BUILTIN_VECTORIZED_FUNCTION >> target hook for the AMD GCN architecture, such that when vectorized, >> calls to builtin standard math functions

Re: [PATCH 3/3] arm: Fix MVE predicates synthesis [PR 108443]

2023-01-31 Thread Andre Vieira (lists) via Gcc-patches
Yeah that shouldn't be there, it's from an earlier version of the patch I wrote where I was experimenting changing the existing modes, I'll remove it from the ChangeLog. On 31/01/2023 09:53, Kyrylo Tkachov wrote: gcc/testsuite/ChangeLog: * gcc.dg/rtl/arm/mve-vxbi.c: Use new

Re: [PATCH 2/3] arm: Remove unnecessary zero-extending of MVE predicates before use [PR 107674]

2023-01-30 Thread Andre Vieira (lists) via Gcc-patches
Changed the testcase to be more robust (as per the discussion for the first patch). Still need the OK for the mid-end (simplify-rtx) part. Kind regards, Andre On 27/01/2023 09:59, Kyrylo Tkachov wrote: -Original Message- From: Andre Vieira (lists) Sent: Friday, January 27, 2023

Re: [PATCH 1/3] arm: Fix sign of MVE predicate mve_pred16_t [PR 107674]

2023-01-30 Thread Andre Vieira (lists) via Gcc-patches
Here's a new version with a more robust test. OK for trunk? On 27/01/2023 09:56, Kyrylo Tkachov wrote: -Original Message- From: Andre Vieira (lists) Sent: Friday, January 27, 2023 9:54 AM To: Kyrylo Tkachov ; gcc-patches@gcc.gnu.org Cc: Richard Earnshaw Subject: Re: [PATCH 1/3]

Re: [PATCH] arm: Fix MVE's vcmp vector-scalar patterns [PR107987]

2023-01-27 Thread Andre Vieira (lists) via Gcc-patches
This applies cleanly to gcc-12 and regressions for arm-none-eabi look clean. OK to apply to gcc-12? On 06/12/2022 11:23, Kyrylo Tkachov wrote: -Original Message- From: Andre Simoes Dias Vieira Sent: Tuesday, December 6, 2022 11:19 AM To: 'gcc-patches@gcc.gnu.org' Cc: Kyrylo

Re: [PATCH 2/3] arm: Remove unnecessary zero-extending of MVE predicates before use [PR 107674]

2023-01-27 Thread Andre Vieira (lists) via Gcc-patches
On 26/01/2023 15:06, Kyrylo Tkachov wrote: Hi Andre, -Original Message- From: Andre Vieira (lists) Sent: Tuesday, January 24, 2023 1:54 PM To: gcc-patches@gcc.gnu.org Cc: Richard Sandiford ; Richard Earnshaw ; Richard Biener ; Kyrylo Tkachov Subject: [PATCH 2/3] arm: Remove

Re: [PATCH 1/3] arm: Fix sign of MVE predicate mve_pred16_t [PR 107674]

2023-01-27 Thread Andre Vieira (lists) via Gcc-patches
On 26/01/2023 15:02, Kyrylo Tkachov wrote: Hi Andre, -Original Message- From: Andre Vieira (lists) Sent: Tuesday, January 24, 2023 1:41 PM To: gcc-patches@gcc.gnu.org Cc: Kyrylo Tkachov ; Richard Earnshaw Subject: [PATCH 1/3] arm: Fix sign of MVE predicate mve_pred16_t [PR 107674]

Re: [PATCH 3/3] arm: Fix MVE predicates synthesis [PR 108443]

2023-01-25 Thread Andre Vieira (lists) via Gcc-patches
Looks like the first patch was missing a change I had made to prevent mve_bool_vec_to_const ICEing if called with a non-vector immediate. Now included. On 24/01/2023 13:56, Andre Vieira (lists) via Gcc-patches wrote: Hi, This patch fixes the way we synthesize MVE predicate immediates

[PATCH] aarch64: Add aarch64*-*-* to the list of vect_long_long targets

2023-01-24 Thread Andre Vieira (lists) via Gcc-patches
Hi, This patch adds aarch64 to the list of vect_long_long targets. Regression tested on aarch64-none-linux-gnu. OK for trunk? gcc/testsuite/ChangeLog: * lib/target-supports.exp (check_effective_target_vect_long_long): Add aarch64 to list of targets supporting long long

Re: [PATCH] arm: Make MVE masked stores read memory operand [PR 108177]

2023-01-24 Thread Andre Vieira (lists) via Gcc-patches
ping. (reattaching patch in the hopes patchwork picks it up). On 13/01/2023 16:05, Andre Simoes Dias Vieira via Gcc-patches wrote: Hi, This patch adds the memory operand of MVE masked stores as input operands to mimic the 'partial' writes, to prevent erroneous write-after-write optimizations

[PATCH 3/3] arm: Fix MVE predicates synthesis [PR 108443]

2023-01-24 Thread Andre Vieira (lists) via Gcc-patches
Hi, This patch fixes the way we synthesize MVE predicate immediates and fixes some other inconsistencies around predicates. For instance this patch fixes the modes used in the vctp intrinsics, to couple them with predicate modes with the appropriate lane numbers. For this V2QI is added to

[PATCH 2/3] arm: Remove unnecessary zero-extending of MVE predicates before use [PR 107674]

2023-01-24 Thread Andre Vieira (lists) via Gcc-patches
Hi, This patch teaches GCC that zero-extending an MVE predicate from 16 bits to 32 bits and then only using 16 bits is a no-op. It does so in two steps: - it lets gcc know that it can access any MVE predicate mode using any other MVE predicate mode without needing to copy it, using the
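The no-op property this message relies on can be checked in plain C. The sketch below is only an arithmetic illustration using ordinary integer types, not the MVE predicate modes or the simplify-rtx change itself; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical sketch: zero-extend a 16-bit value to 32 bits, then use
   only the low 16 bits again.  */
static uint16_t
widen_then_narrow (uint16_t pred)
{
  uint32_t widened = (uint32_t) pred;  /* zero-extension to 32 bits */
  return (uint16_t) widened;           /* only the low 16 bits are used */
}

/* Exhaustively confirm the round trip leaves every 16-bit value
   unchanged, i.e. the extension contributes nothing.  */
static void
check_all_predicates (void)
{
  for (uint32_t v = 0; v <= UINT16_MAX; v++)
    assert (widen_then_narrow ((uint16_t) v) == (uint16_t) v);
}
```

Because the round trip is the identity for every 16-bit input, a compiler that can prove only the low 16 bits are consumed is free to drop the extension.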

Re: [PATCH 1/3] arm: Fix sign of MVE predicate mve_pred16_t [PR 107674]

2023-01-24 Thread Andre Vieira (lists) via Gcc-patches
I meant bootstrapped on aarch64-none-linux-gnu and not none-eabi. On 24/01/2023 13:40, Andre Vieira (lists) via Gcc-patches wrote: Hi, The ACLE defines mve_pred16_t as an unsigned short. This patch makes sure GCC treats the predicate as an unsigned type, rather than signed. Bootstrapped

[PATCH 1/3] arm: Fix sign of MVE predicate mve_pred16_t [PR 107674]

2023-01-24 Thread Andre Vieira (lists) via Gcc-patches
Hi, The ACLE defines mve_pred16_t as an unsigned short. This patch makes sure GCC treats the predicate as an unsigned type, rather than signed. Bootstrapped on aarch64-none-eabi and regression tested on arm-none-eabi and armeb-none-eabi for armv8.1-m.main+mve.fp. OK for trunk?

[PATCH 0/3] arm: Fix regressions around MVE predicate codegen

2023-01-24 Thread Andre Vieira (lists) via Gcc-patches
Hi all, This patch series aims to fix two or three (depends on how you look at it) regressions that came about in gcc 11. The first and third patch address wrong-codegen regressions and the second a performance regression. Patch two makes a change to the mid-end so I can understand if there

Re: [PATCH 2/2] aarch64: Add support for widening LDAPR instructions

2022-11-18 Thread Andre Vieira (lists) via Gcc-patches
Updated version of the patch to account for the testsuite changes in the first patch. On 10/11/2022 11:20, Andre Vieira (lists) via Gcc-patches wrote: Hi, This patch adds support for the widening LDAPR instructions. Bootstrapped and regression tested on aarch64-none-linux-gnu. OK for

Re: [PATCH 2/2] arm: Add support for MVE Tail-Predicated Low Overhead Loops

2022-11-15 Thread Andre Vieira (lists) via Gcc-patches
On 11/11/2022 17:40, Stam Markianos-Wright via Gcc-patches wrote: Hi all, This is the 2/2 patch that contains the functional changes needed for MVE Tail Predicated Low Overhead Loops.  See my previous email for a general introduction of MVE LOLs. This support is added through the already

Re: [PATCH 1/2] aarch64: Enable the use of LDAPR for load-acquire semantics

2022-11-14 Thread Andre Vieira (lists) via Gcc-patches
On 14/11/2022 14:12, Kyrylo Tkachov wrote: -Original Message- From: Andre Vieira (lists) Sent: Monday, November 14, 2022 2:09 PM To: Kyrylo Tkachov ; gcc-patches@gcc.gnu.org Cc: Richard Earnshaw ; Richard Sandiford Subject: Re: [PATCH 1/2] aarch64: Enable the use of LDAPR for

Re: [PATCH 2/2] aarch64: Add support for widening LDAPR instructions

2022-11-14 Thread Andre Vieira (lists) via Gcc-patches
Updated version of the patch to account for the testsuite changes in the first patch. On 10/11/2022 11:20, Andre Vieira (lists) via Gcc-patches wrote: Hi, This patch adds support for the widening LDAPR instructions. Bootstrapped and regression tested on aarch64-none-linux-gnu. OK for trunk

Re: [PATCH 1/2] aarch64: Enable the use of LDAPR for load-acquire semantics

2022-11-14 Thread Andre Vieira (lists) via Gcc-patches
Here is the latest version and an updated ChangeLog: 2022-11-14 Andre Vieira Kyrylo Tkachov gcc/ChangeLog: * config/aarch64/aarch64.h (AARCH64_ISA_RCPC): New Macro. (TARGET_RCPC): New Macro. * config/aarch64/atomics.md (atomic_load): Change

[PATCH 2/2] aarch64: Add support for widening LDAPR instructions

2022-11-10 Thread Andre Vieira (lists) via Gcc-patches
Hi, This patch adds support for the widening LDAPR instructions. Bootstrapped and regression tested on aarch64-none-linux-gnu. OK for trunk? 2022-11-09  Andre Vieira      Kyrylo Tkachov  gcc/ChangeLog:     * config/aarch64/atomics.md (*aarch64_atomic_load_rcpc_zext): New

[PATCH 1/2] aarch64: Enable the use of LDAPR for load-acquire semantics

2022-11-10 Thread Andre Vieira (lists) via Gcc-patches
Hello, This patch enables the use of LDAPR for load-acquire semantics. After some internal investigation based on the work published by Podkopaev et al. (https://dl.acm.org/doi/10.1145/3290382) we can confirm that using LDAPR for the C++ load-acquire semantics is a correct relaxation.
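
For readers unfamiliar with the source-level construct in question, the following is a minimal sketch of a C++ load-acquire paired with a store-release. The names `ready`, `payload`, `publish` and `try_consume` are hypothetical and exist only for this example. Which instruction a compiler selects for the acquire load (for instance LDAR or LDAPR on AArch64) depends on the compiler version and target options, so no particular code generation is claimed here.

```cpp
#include <atomic>

// Hypothetical shared state, for illustration only.
std::atomic<int> ready{0};
int payload = 0;

// Writer: store the data, then publish it with a release store.
void publish(int value) {
    payload = value;
    ready.store(1, std::memory_order_release);
}

// Reader: the acquire load is the kind of operation the patch is about.
// If it observes the flag set, the earlier write to payload is visible.
bool try_consume(int &out) {
    if (ready.load(std::memory_order_acquire) == 1) {
        out = payload;
        return true;
    }
    return false;
}
```

The acquire load only needs to order itself against later accesses in the same thread and synchronize with the matching release store, which is the property the relaxation discussed in the message relies on.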

Re: [AArch64] Enable generation of FRINTNZ instructions

2022-11-09 Thread Andre Vieira (lists) via Gcc-patches
On 07/11/2022 14:56, Richard Biener wrote: On Mon, 7 Nov 2022, Andre Vieira (lists) wrote: On 07/11/2022 11:05, Richard Biener wrote: On Fri, 4 Nov 2022, Andre Vieira (lists) wrote: Sorry for the delay, just been reminded I still had this patch outstanding from last stage 1. Hopefully

Re: [AArch64] Enable generation of FRINTNZ instructions

2022-11-07 Thread Andre Vieira (lists) via Gcc-patches
On 07/11/2022 11:05, Richard Biener wrote: On Fri, 4 Nov 2022, Andre Vieira (lists) wrote: Sorry for the delay, just been reminded I still had this patch outstanding from last stage 1. Hopefully since it has been mostly reviewed it could go in for this stage 1? I addressed the comments and

Re: [AArch64] Enable generation of FRINTNZ instructions

2022-11-04 Thread Andre Vieira (lists) via Gcc-patches
Sorry for the delay, just been reminded I still had this patch outstanding from last stage 1. Hopefully since it has been mostly reviewed it could go in for this stage 1? I addressed the comments and gave the slp-part of vectorizable_call some TLC to make it work. I also changed

[PATCH] ifcvt: Support bitfield lowering of multiple-exit loops

2022-11-03 Thread Andre Vieira (lists) via Gcc-patches
Hi, With Tamar's patch (https://gcc.gnu.org/pipermail/gcc-patches/2022-November/604880.html) enabling the vectorization of early-breaks, I'd like to allow bitfield lowering in such loops, which requires the relaxation of allowing multiple exits when doing so.  In order to avoid a similar

Re: vect: Make vect_check_gather_scatter reject offsets that aren't multiples of BITS_PER_UNIT [PR107346]

2022-10-28 Thread Andre Vieira (lists) via Gcc-patches
On 24/10/2022 14:29, Richard Biener wrote: On Mon, 24 Oct 2022, Andre Vieira (lists) wrote: Changing if-convert would merely change this testcase but we could still trigger using a different structure type, changing the size of Int24 to 32 bits rather than 24: package Loop_Optimization23_Pkg

Re: vect: Make vect_check_gather_scatter reject offsets that aren't multiples of BITS_PER_UNIT [PR107346]

2022-10-24 Thread Andre Vieira (lists) via Gcc-patches
On 24/10/2022 13:46, Richard Biener wrote: On Mon, 24 Oct 2022, Andre Vieira (lists) wrote: On 24/10/2022 08:17, Richard Biener wrote: Can you check why vect_find_stmt_data_reference doesn't trip on the if (TREE_CODE (DR_REF (dr)) == COMPONENT_REF && DECL_BIT_FIELD

Re: vect: Make vect_check_gather_scatter reject offsets that aren't multiples of BITS_PER_UNIT [PR107346]

2022-10-24 Thread Andre Vieira (lists) via Gcc-patches
On 24/10/2022 08:17, Richard Biener wrote: Can you check why vect_find_stmt_data_reference doesn't trip on the if (TREE_CODE (DR_REF (dr)) == COMPONENT_REF && DECL_BIT_FIELD (TREE_OPERAND (DR_REF (dr), 1))) { free_data_ref (dr); return opt_result::failure_at

vect: Make vect_check_gather_scatter reject offsets that aren't multiples of BITS_PER_UNIT [PR107346]

2022-10-21 Thread Andre Vieira (lists) via Gcc-patches
Hi, The Ada failure reported in the PR was being caused by vect_check_gather_scatter failing to deal with bit offsets that weren't multiples of BITS_PER_UNIT. This patch makes vect_check_gather_scatter reject memory accesses with such offsets. Bootstrapped and regression tested on aarch64

[PATCH]vect: Fix vectype when widening container type in bitfield pattern [PR107326]

2022-10-20 Thread Andre Vieira (lists) via Gcc-patches
Hi, The 'vect_recog_bitfield_ref_pattern' was not correctly adapting the vectype when widening the container. I thought the original tests covered that code-path but they didn't, so I added a new run-test that covers it too. Bootstrapped and regression tested on x86_64 and aarch64.

[PATCH] ifcvt: Do not lower bitfields if we can't analyze dr's [PR107275]

2022-10-18 Thread Andre Vieira (lists) via Gcc-patches
The ifcvt dead code elimination code was not built to deal with inline assembly, as loops with such would never be if-converted in the past since we can't do data-reference analysis on them and vectorization would eventually fail. For this reason we now also do not lower bitfields if the

Re: ifcvt: Fix bitpos calculation in bitfield lowering [PR107229]

2022-10-13 Thread Andre Vieira (lists) via Gcc-patches
On 13/10/2022 15:15, Richard Biener wrote: On Thu, 13 Oct 2022, Andre Vieira (lists) wrote: Hi Rainer, Thanks for reporting, I was actually expecting these! I thought about pre-empting them by using a positive filter on the tests for aarch64 and x86_64 as I knew those would pass, but I

Re: ifcvt: Fix bitpos calculation in bitfield lowering [PR107229]

2022-10-13 Thread Andre Vieira (lists) via Gcc-patches
Hi Rainer, Thanks for reporting, I was actually expecting these! I thought about pre-empting them by using a positive filter on the tests for aarch64 and x86_64 as I knew those would pass, but I thought it would be better to let other targets report failures since then you get a testsuite

Re: ifcvt: Fix bitpos calculation in bitfield lowering [PR107229]

2022-10-13 Thread Andre Vieira (lists) via Gcc-patches
Added some extra comments to describe what is going on there. On 13/10/2022 09:14, Richard Biener wrote: On Wed, 12 Oct 2022, Andre Vieira (lists) wrote: Hi, The bit position calculation for the bitfield lowering in loop if-conversion was not taking DECL_FIELD_OFFSET into account, which meant
