Re: [RFC] [v2] Extend fold_vec_perm to handle VLA vectors

2023-08-14 Thread Richard Sandiford via Gcc-patches
Prathamesh Kulkarni writes: > On Thu, 10 Aug 2023 at 21:27, Richard Sandiford > wrote: >> >> Prathamesh Kulkarni writes: >> >> static bool >> >> is_simple_vla_size (poly_uint64 size) >> >> { >> >> if (size.is_constant ()) >> >> return false; >> >> for (int i = 1; i < ARRAY_SIZE (size.coe

Re: [PATCH] vect: Move VMAT_GATHER_SCATTER handlings from final loop nest

2023-08-14 Thread Richard Sandiford via Gcc-patches
Thanks for the clean-ups. But... "Kewen.Lin" writes: > Hi, > > Following Richi's suggestion [1], this patch is to move the > handlings on VMAT_GATHER_SCATTER in the final loop nest > of function vectorizable_load to its own loop. Basically > it duplicates the final loop nest, clean up some usel

Re: [PATCH] genrecog: Add SUBREG_BYTE.to_constant check to the genrecog

2023-08-14 Thread Richard Sandiford via Gcc-patches
Juzhe-Zhong writes: > Hi, there is genrecog issue happens in RISC-V backend. > > This is the ICE info: > > 0xfa3ba4 poly_int_pod<2u, unsigned short>::to_constant() const > ../../../riscv-gcc/gcc/poly-int.h:504 > 0x28eaa91 recog_5 > ../../../riscv-gcc/gcc/config/riscv/bitmanip.md:31

Re: [PATCH V3] VECT: Support loop len control on EXTRACT_LAST vectorization

2023-08-11 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: > On Fri, 11 Aug 2023, juzhe.zh...@rivai.ai wrote: > >> Hi, Richi. >> >> > 1. Target is using loop MASK as the partial vector loop control. >> >> I don't think it checks for this? >> >> I am not sure whether I understand EXTRACT_LAST correctly. >> But if target doesn't use

Re: [PATCH] tree-optimization/110979 - fold-left reduction and partial vectors

2023-08-11 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: > When we vectorize fold-left reductions with partial vectors but > no target operation available we use a vector conditional to force > excess elements to zero. But that doesn't correctly preserve > the sign of zero. The following patch disables partial vector > support i
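The sign-of-zero problem described in this snippet can be reproduced in scalar code. The following is a minimal sketch, not the patch itself: `fold_left_sum` and `sum_is_negative_zero` are hypothetical names, and the inactive lanes of a partial vector are modelled by substituting a fill value. It assumes IEEE 754 arithmetic in the default round-to-nearest mode and no `-ffast-math`.

```c
#include <math.h>
#include <stddef.h>

/* In-order (fold-left) sum over N_LANES lanes.  Lanes at or beyond
   N_ACTIVE are replaced by FILL, mimicking a vector conditional that
   forces excess elements to a fixed value.  */
static double
fold_left_sum (double init, const double *x, size_t n_active,
               size_t n_lanes, double fill)
{
  double acc = init;
  for (size_t i = 0; i < n_lanes; ++i)
    acc += (i < n_active ? x[i] : fill);
  return acc;
}

/* The scalar loop would compute -0.0 + -0.0 + -0.0 = -0.0.
   Return nonzero if the masked version with the given FILL value
   also produces negative zero.  */
static int
sum_is_negative_zero (double fill)
{
  const double x[4] = { -0.0, -0.0, 1.0, 1.0 }; /* only 2 lanes active */
  double r = fold_left_sum (-0.0, x, 2, 4, fill);
  return r == 0.0 && signbit (r) != 0;
}
```

With a fill of `+0.0` the result becomes `+0.0`, because `-0.0 + 0.0` is `+0.0`; with a fill of `-0.0` the sign survives. Which remedy the patch actually takes is cut off in the snippet above, so this only illustrates the failure mode.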

Re: [RFC] GCC Security policy

2023-08-10 Thread Richard Sandiford via Gcc-patches
Siddhesh Poyarekar writes: > On 2023-08-08 10:30, Siddhesh Poyarekar wrote: >>> Do you have a suggestion for the language to address libgcc, >>> libstdc++, etc. and libiberty, libbacktrace, etc.? >> >> I'll work on this a bit and share a draft. > > Hi David, > > Here's what I came up with for di

Re: [RFC] [v2] Extend fold_vec_perm to handle VLA vectors

2023-08-10 Thread Richard Sandiford via Gcc-patches
Prathamesh Kulkarni writes: >> static bool >> is_simple_vla_size (poly_uint64 size) >> { >> if (size.is_constant ()) >> return false; >> for (int i = 1; i < ARRAY_SIZE (size.coeffs); ++i) >> if (size[i] != (i <= 1 ? size[0] : 0)) > Just wondering is this should be (i == 1 ? size[0] : 0
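The question about `i <= 1` versus `i == 1` can be checked with a stand-alone model. This is a sketch under stated assumptions: a plain array replaces `poly_uint64`, `NUM_COEFFS` is a hypothetical stand-in for `ARRAY_SIZE (size.coeffs)`, and the final `return` statements are guesses, since the quoted function is truncated. Because the loop starts at `i = 1`, the two conditions select the same coefficient.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_COEFFS 3 /* hypothetical coefficient count */

/* Model of the quoted check: the size must be non-constant, its first
   indeterminate coefficient must equal the constant term, and any
   further coefficients must be zero.  USE_LE selects between the
   `i <= 1` and `i == 1` spellings of the test.  */
static bool
is_simple_sketch (const uint64_t coeffs[NUM_COEFFS], bool use_le)
{
  bool constant = true;
  for (size_t i = 1; i < NUM_COEFFS; ++i)
    if (coeffs[i] != 0)
      constant = false;
  if (constant)
    return false;

  for (size_t i = 1; i < NUM_COEFFS; ++i)
    {
      bool first = use_le ? i <= 1 : i == 1;
      if (coeffs[i] != (first ? coeffs[0] : 0))
        return false;
    }
  return true;
}

/* Return true if both spellings agree on the given coefficients.  */
static bool
spellings_agree (const uint64_t coeffs[NUM_COEFFS])
{
  return is_simple_sketch (coeffs, true) == is_simple_sketch (coeffs, false);
}
```

In this model `{4, 4, 0}` is accepted, while `{4, 8, 0}`, `{4, 0, 0}` and `{4, 4, 4}` are rejected, whichever spelling is used.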

Re: [PATCH] VR-VALUES: Simplify comparison using range pairs

2023-08-10 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: > On Thu, Aug 10, 2023 at 3:44 PM Richard Sandiford > wrote: >> >> Richard Biener via Gcc-patches writes: >> > On Wed, Aug 9, 2023 at 6:16 PM Andrew Pinski via Gcc-patches >> > wrote: >> >> >> >> If `A` has a range of `[0,0][100,INF]` and the comparison >> >> of `A < 50`.

Re: [PATCH] VR-VALUES: Simplify comparison using range pairs

2023-08-10 Thread Richard Sandiford via Gcc-patches
Richard Biener via Gcc-patches writes: > On Wed, Aug 9, 2023 at 6:16 PM Andrew Pinski via Gcc-patches > wrote: >> >> If `A` has a range of `[0,0][100,INF]` and the comparison >> of `A < 50`. This should be optimized to `A <= 0` (which then >> will be optimized to just `A == 0`). >> This patch imp
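The simplification quoted here can be sanity-checked by brute force. The sketch below is only an illustration of the arithmetic, not of the VR-VALUES code: `in_range` and `comparison_simplifies` are hypothetical names, and `A` is assumed to be unsigned so that `A <= 0` and `A == 0` coincide.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical model of the two-pair range [0,0] U [100,INF].  */
static bool
in_range (unsigned a)
{
  return a == 0 || a >= 100;
}

/* Return true if `a < 50` and `a == 0` give the same answer for every
   sampled value that lies inside the range.  */
static bool
comparison_simplifies (void)
{
  for (unsigned a = 0; a < 100000; ++a)
    if (in_range (a) && ((a < 50) != (a == 0)))
      return false;
  /* Also sample the upper end of the type.  */
  return (UINT_MAX < 50) == (UINT_MAX == 0);
}
```

Values such as 25 would distinguish the two comparisons, but they are excluded by the range, which is why the rewrite is valid.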

Re: [PATCH] aarch64: enable mixed-types for aarch64 simdclones

2023-08-10 Thread Richard Sandiford via Gcc-patches
Jakub Jelinek writes: > On Wed, Aug 09, 2023 at 06:27:20PM +0100, Richard Sandiford wrote: >> Jakub Jelinek writes: >> > On Wed, Aug 09, 2023 at 05:55:28PM +0100, Richard Sandiford wrote: >> >> Jakub: do you remember what the reason was? I don't mind dropping >> >> "function", but it feels weird

Re: [PATCH] aarch64: enable mixed-types for aarch64 simdclones

2023-08-09 Thread Richard Sandiford via Gcc-patches
Jakub Jelinek writes: > On Wed, Aug 09, 2023 at 05:55:28PM +0100, Richard Sandiford wrote: >> Jakub: do you remember what the reason was? I don't mind dropping >> "function", but it feels weird to drop the quotes around "simd". >> Seems like, if we do that, there'll one day be a patch to add >> t

Re: [PATCH] aarch64: enable mixed-types for aarch64 simdclones

2023-08-09 Thread Richard Sandiford via Gcc-patches
"Andre Vieira (lists)" writes: > Here is my new version, see inline response to your comments. > > New cover letter: > > This patch enables the use of mixed-types for simd clones for AArch64, > adds aarch64 as a target_vect_simd_clones and corrects the way the > simdlen is chosen for non-specifi

Re: [PATCH] VECT: Support loop len control on EXTRACT_LAST vectorization

2023-08-09 Thread Richard Sandiford via Gcc-patches
"juzhe.zh...@rivai.ai" writes: > Hi, Richi. > >>> that should be > >>> || (!LOOP_VINFO_FULLY_MASKED_P (loop_vinfo) >>> && !LOOP_VINFO_FULLY_WITH_LENGTH_P (loop_vinfo)) > >>> I think. It seems to imply that SLP isn't supported with >>> masking/lengthing. > > Oh, yes. At first glance, the

Re: [PATCH] aarch64: SVE/NEON Bridging intrinsics

2023-08-09 Thread Richard Sandiford via Gcc-patches
Richard Ball writes: > ACLE has added intrinsics to bridge between SVE and Neon. > > The NEON_SVE Bridge adds intrinsics that allow conversions between NEON and > SVE vectors. > > This patch adds support to GCC for the following 3 intrinsics: > svset_neonq, svget_neonq and svdup_neonq > > gcc/Chan

Re: [PATCH][GCC] aarch64: Add support for Cortex-A520 CPU

2023-08-08 Thread Richard Sandiford via Gcc-patches
Richard Ball writes: > This patch adds support for the Cortex-A520 CPU to GCC. > > No regressions on aarch64-none-elf. > > Ok for master? > > > gcc/ChangeLog: > > * config/aarch64/aarch64-cores.def (AARCH64_CORE): Add > Cortex-A520 CPU. > * config/aarch64/aarch64-tune.md: Regene

Re: [PATCH] aarch64: enable mixed-types for aarch64 simdclones

2023-08-08 Thread Richard Sandiford via Gcc-patches
"Andre Vieira (lists)" writes: > Hi, > > This patch enables the use of mixed-types for simd clones for AArch64 > and adds aarch64 as a target_vect_simd_clones. > > Bootstrapped and regression tested on aarch64-unknown-linux-gnu > > gcc/ChangeLog: > > * config/aarch64/aarch64.cc (currentl

Re: [RFC] [v2] Extend fold_vec_perm to handle VLA vectors

2023-08-08 Thread Richard Sandiford via Gcc-patches
Prathamesh Kulkarni writes: > On Fri, 4 Aug 2023 at 20:36, Richard Sandiford > wrote: >> >> Full review this time, sorry for skipping the tests earlier. > Thanks for the detailed review! Please find my responses inline below. >> >> Prathamesh Kulkarni writes: >> > diff --git a/gcc/fold-const

Re: [RFC] [v2] Extend fold_vec_perm to handle VLA vectors

2023-08-04 Thread Richard Sandiford via Gcc-patches
Full review this time, sorry for skipping the tests earlier. Prathamesh Kulkarni writes: > diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc > index 7e5494dfd39..680d0e54fd4 100644 > --- a/gcc/fold-const.cc > +++ b/gcc/fold-const.cc > @@ -85,6 +85,10 @@ along with GCC; see the file COPYING3.

Re: [PATCH] tree-optimization/110838 - vectorization of widened right shifts

2023-08-04 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: > The following fixes a problem with my last attempt of avoiding > out-of-bound shift values for vectorized right shifts of widened > operands. Instead of truncating the shift amount with a bitwise > and we actually need to saturate it to the target precision. > > The follo
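The difference between truncating the shift amount with a bitwise and and saturating it can be seen in a scalar model of a 16-bit lane. This is a sketch, not the vectorizer code: all function names are hypothetical, and it assumes that `>>` on a negative signed value is an arithmetic shift, as GCC documents.

```c
#include <stdint.h>

/* Reference semantics: widen to 32 bits, shift by N (0..31), narrow.  */
static int16_t
widened_shift (int16_t x, unsigned n)
{
  return (int16_t) ((int32_t) x >> n);
}

/* Shift as a 16-bit lane would: only meaningful for N < 16.  */
static int16_t
lane_shift (int16_t x, unsigned n)
{
  return (int16_t) (x >> n);
}

/* Truncate the amount with a bitwise and (wrong for N >= 16).  */
static int16_t
shift_masked (int16_t x, unsigned n)
{
  return lane_shift (x, n & 15);
}

/* Saturate the amount to the lane precision minus one.  */
static int16_t
shift_saturated (int16_t x, unsigned n)
{
  return lane_shift (x, n < 15 ? n : 15);
}

/* Return 1 if saturation matches the widened shift for every 16-bit
   input and every shift amount in 0..31.  */
static int
saturation_agrees (void)
{
  for (int32_t x = INT16_MIN; x <= INT16_MAX; ++x)
    for (unsigned n = 0; n < 32; ++n)
      if (shift_saturated ((int16_t) x, n) != widened_shift ((int16_t) x, n))
        return 0;
  return 1;
}
```

For `x = -1234` and a shift of 16, the widened reference gives -1, saturation gives -1, and masking gives -1234, since `16 & 15` is 0.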

Re: [RFC] Combine zero_extract and sign_extend for TARGET_TRULY_NOOP_TRUNCATION

2023-08-04 Thread Richard Sandiford via Gcc-patches
YunQiang Su writes: > PR #104914 > > On TRULY_NOOP_TRUNCATION_MODES_P (DImode, SImode)) == true platforms, > zero_extract (SI, SI) can be sign-extended. So, if a zero_extract (DI, > DI) following with an sign_extend(SI, DI) can be merged to a single > zero_extract (SI, SI). > > gcc/ChangeLog: >

Re: [PATCH]AArch64 update costing for MLA by invariant

2023-08-03 Thread Richard Sandiford via Gcc-patches
Tamar Christina writes: >> >> Do you see vect_constant_defs in practice, or is this just for >> >> completeness? >> >> I would expect any constants to appear as direct operands. I don't >> >> mind keeping it if it's just a belt-and-braces thing though. >> > >> > In the latency case where I had a

Re: [RFC] [v2] Extend fold_vec_perm to handle VLA vectors

2023-08-03 Thread Richard Sandiford via Gcc-patches
Richard Sandiford writes: > Prathamesh Kulkarni writes: >> On Tue, 25 Jul 2023 at 18:25, Richard Sandiford >> wrote: >>> >>> Hi, >>> >>> Thanks for the rework and sorry for the slow review. >> Hi Richard, >> Thanks for the suggestions! Please find my responses inline below. >>> >>> Prathamesh K

Re: [RFC] [v2] Extend fold_vec_perm to handle VLA vectors

2023-08-03 Thread Richard Sandiford via Gcc-patches
Prathamesh Kulkarni writes: > On Tue, 25 Jul 2023 at 18:25, Richard Sandiford > wrote: >> >> Hi, >> >> Thanks for the rework and sorry for the slow review. > Hi Richard, > Thanks for the suggestions! Please find my responses inline below. >> >> Prathamesh Kulkarni writes: >> > Hi Richard, >> >

Re: [PATCH]AArch64 Undo vec_widen_shiftl optabs [PR106346]

2023-08-03 Thread Richard Sandiford via Gcc-patches
Tamar Christina writes: >> > + >> > +(define_constraint "D3" >> > + "@internal >> > + A constraint that matches vector of immediates that is with 0 to >> > +(bits(mode)/2)-1." >> > + (and (match_code "const,const_vector") >> > + (match_test "aarch64_const_vec_all_same_in_range_p (op, 0, >> >

[PATCH] poly_int: Handle more can_div_trunc_p cases

2023-08-03 Thread Richard Sandiford via Gcc-patches
can_div_trunc_p (a, b, &Q, &r) tries to compute a Q and r that satisfy the usual conditions for truncating division: (1) a = b * Q + r (2) |b * Q| <= |a| (3) |r| < |b| We can compute Q using the constant component (the case when all indeterminates are zero). Since |r| < |b| for th
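Conditions (1) to (3) are the familiar properties of truncating division on plain integers. The sketch below checks them for scalar `long` values only; it does not model `poly_int` indeterminates, and `trunc_div_conditions_hold` is a hypothetical helper, not part of GCC.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Compute q and r with C's truncating `/` and `%`, then verify:
   (1) a = b * q + r, (2) |b * q| <= |a|, (3) |r| < |b|.
   Division by zero is rejected.  Callers should avoid LONG_MIN,
   where labs and the division itself can overflow.  */
static bool
trunc_div_conditions_hold (long a, long b)
{
  if (b == 0)
    return false;
  long q = a / b; /* rounds toward zero since C99 */
  long r = a % b; /* takes the sign of a */
  return a == b * q + r
         && labs (b * q) <= labs (a)
         && labs (r) < labs (b);
}
```

For example, `a = -7`, `b = 2` gives `q = -3` and `r = -1`, which satisfies all three conditions.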

Re: [PATCH] AArch64: Do not increase the vect reduction latency by multiplying count [PR110625]

2023-08-03 Thread Richard Sandiford via Gcc-patches
Hao Liu OS writes: > Hi Richard, > > Update the patch with a simple case (see below case and comments). It shows > a live stmt may not have reduction def, which introduce the ICE. > > Is it OK for trunk? OK, thanks. Richard > > Fix the assertion failure on empty reduction define in info_

Re: [PATCH] tree-optimization/110838 - vectorization of widened shifts

2023-08-02 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: > [...] >> >> in vect_determine_precisions_from_range. Maybe we should drop >> >> the shift handling from there and instead rely on >> >> vect_determine_precisions_from_users, extending: >> >> >> >> if (TREE_CODE (shift) != INTEGER_CST >> >> || !wi::ltu_p (wi::to_w

Re: [PATCH][gensupport]: Don't segfault on empty attrs list

2023-08-02 Thread Richard Sandiford via Gcc-patches
Tamar Christina writes: > Hi All, > > Currently we segfault when len == 0 for an attribute list. > > essentially [cons: =0, 1, 2, 3; attrs: ] segfaults but should be equivalent to > [cons: =0, 1, 2, 3] and [cons: =0, 1, 2, 3; attrs:]. This fixes it by just > returning early and leaving it to the

Re: [PATCH]AArch64 Undo vec_widen_shiftl optabs [PR106346]

2023-08-02 Thread Richard Sandiford via Gcc-patches
Tamar Christina writes: > Hi All, > > In GCC 11 we implemented the vectorizer optab for widening left shifts, > however this optab is only supported for uniform shift constants. > > At the moment GCC still has two loop vectorization strategy (classical loop > and > SLP based loop vec) and the opt

Re: [PATCH] tree-optimization/110838 - vectorization of widened shifts

2023-08-02 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: > On Tue, 1 Aug 2023, Richard Sandiford wrote: > >> Richard Sandiford writes: >> > Richard Biener via Gcc-patches writes: >> >> The following makes sure to limit the shift operand when vectorizing >> >> (short)((int)x >> 31) via (short)x >> 31 as the out of bounds shift >>

Re: [PATCH]AArch64 update costing for MLA by invariant

2023-08-02 Thread Richard Sandiford via Gcc-patches
Tamar Christina writes: >> Tamar Christina writes: >> > Hi All, >> > >> > When determining issue rates we currently discount non-constant MLA >> > accumulators for Advanced SIMD but don't do it for the latency. >> > >> > This means the costs for Advanced SIMD with a constant accumulator are >> >

Re: [PATCH]AArch64 update costing for combining vector conditionals

2023-08-02 Thread Richard Sandiford via Gcc-patches
Tamar Christina writes: > Hi All, > > boolean comparisons have different cost depending on the mode. e.g. > a && b when predicated doesn't require an addition instruction, the AND is > free Nit (for the commit msg): additional Maybe: for SVE, a && b doesn't require an additional instruction

Re: [PATCH]AArch64 update costing for MLA by invariant

2023-08-02 Thread Richard Sandiford via Gcc-patches
Tamar Christina writes: > Hi All, > > When determining issue rates we currently discount non-constant MLA > accumulators > for Advanced SIMD but don't do it for the latency. > > This means the costs for Advanced SIMD with a constant accumulator are wrong > and > results in us costing SVE and Adv

Re: [PATCH 2/5] [RISC-V] Generate Zicond instruction for basic semantics

2023-08-02 Thread Richard Sandiford via Gcc-patches
Jeff Law via Gcc-patches writes: > On 8/1/23 05:18, Richard Sandiford wrote: >> >> Where were you seeing the requirement for pointer equality? genrecog.cc >> at least uses rtx_equal_p, and I think it has to. E.g. some patterns >> use (match_dup ...) to match output and input mems, and mem rtxes

Re: [PATCH 2/5] [RISC-V] Generate Zicond instruction for basic semantics

2023-08-01 Thread Richard Sandiford via Gcc-patches
Jeff Law via Gcc-patches writes: > On 7/19/23 04:11, Xiao Zeng wrote: >> This patch completes the recognition of the basic semantics >> defined in the spec, namely: >> >> Conditional zero, if condition is equal to zero >>rd = (rs2 == 0) ? 0 : rs1 >> Conditional zero, if condition is non zero
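The basic semantics quoted in this snippet can be written out as a scalar C model. This is a sketch: the first function follows the formula shown above, the second follows the Zicond definition of `czero.nez` (its formula is cut off in the snippet), and `select_via_zicond` is a hypothetical helper showing how the two combine into a conditional select.

```c
#include <stdint.h>

/* czero.eqz: zero the result if the condition register is zero.  */
static uint64_t
czero_eqz (uint64_t rs1, uint64_t rs2)
{
  return rs2 == 0 ? 0 : rs1;
}

/* czero.nez: zero the result if the condition register is nonzero.  */
static uint64_t
czero_nez (uint64_t rs1, uint64_t rs2)
{
  return rs2 != 0 ? 0 : rs1;
}

/* cond ? a : b, built from the two conditional-zero operations.
   Exactly one operand of the OR is zeroed for any COND.  */
static uint64_t
select_via_zicond (uint64_t cond, uint64_t a, uint64_t b)
{
  return czero_eqz (a, cond) | czero_nez (b, cond);
}
```

For instance, `select_via_zicond (1, 5, 9)` yields 5 and `select_via_zicond (0, 5, 9)` yields 9.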

Re: [PATCH] tree-optimization/110838 - vectorization of widened shifts

2023-08-01 Thread Richard Sandiford via Gcc-patches
Richard Sandiford writes: > Richard Biener via Gcc-patches writes: >> The following makes sure to limit the shift operand when vectorizing >> (short)((int)x >> 31) via (short)x >> 31 as the out of bounds shift >> operand otherwise invokes undefined behavior. When we determine >> whether we can d

Re: [PATCH] tree-optimization/110838 - vectorization of widened shifts

2023-08-01 Thread Richard Sandiford via Gcc-patches
Richard Biener via Gcc-patches writes: > The following makes sure to limit the shift operand when vectorizing > (short)((int)x >> 31) via (short)x >> 31 as the out of bounds shift > operand otherwise invokes undefined behavior. When we determine > whether we can demote the operand we know we at m

Re: [PATCH V2] VECT: Support CALL vectorization for COND_LEN_*

2023-07-31 Thread Richard Sandiford via Gcc-patches
Juzhe-Zhong writes: > Hi, Richard and Richi. > > Base on the suggestions from Richard: > https://gcc.gnu.org/pipermail/gcc-patches/2023-July/625396.html > > This patch choose (1) approach that Richard provided, meaning: > > RVV implements cond_* optabs as expanders. RVV therefore supports > both

Re: [PATCH] internal-fn: Refine macro define of COND_* and COND_LEN_* internal functions

2023-07-31 Thread Richard Sandiford via Gcc-patches
juzhe.zh...@rivai.ai writes: > From: Ju-Zhe Zhong > > Hi, Richard and Richi. > > Based on previous discussions, we should make COND_* and COND_LEN_* > consistent. > > So, this patch defines these internal functions together using these 2 > wrappers: > > #ifndef DEF_INTERNAL_COND_FN > #define DEF_INTERN

Re: [PATCH] Add POLY_INT_CST support to fold_ctor_reference in gimple-fold.cc

2023-07-31 Thread Richard Sandiford via Gcc-patches
Richard Ball writes: > Add POLY_INT_CST support to code within > fold_ctor_reference. This code previously > only supported INTEGER_CST which caused a > bug when using VEC_PERM_EXPR with SVE vectors. Just to add for others: this is a prerequisite for a follow-on patch, so the change will be teste

Re: [PATCH] AArch64: Do not increase the vect reduction latency by multiplying count [PR110625]

2023-07-31 Thread Richard Sandiford via Gcc-patches
Hao Liu OS writes: >> Which test case do you see this for? The two tests in the patch still >> seem to report correct latencies for me if I make the change above. > > Not the newly added tests. It is still the existing case causing the > previous ICE (i.e. assertion problem): gcc.target/aarch64

Re: [PATCH] AArch64: Do not increase the vect reduction latency by multiplying count [PR110625]

2023-07-28 Thread Richard Sandiford via Gcc-patches
Sorry for the slow response. Hao Liu OS writes: >> Ah, thanks. In that case, Hao, I think we can avoid the ICE by changing: >> >> if ((kind == scalar_stmt || kind == vector_stmt || kind == vec_to_scalar) >> && vect_is_reduction (stmt_info)) >> >> to: >> >> if ((kind == scalar_stmt || k

Re: [PATCH] AArch64: Do not increase the vect reduction latency by multiplying count [PR110625]

2023-07-26 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: > On Wed, Jul 26, 2023 at 11:14 AM Richard Sandiford > wrote: >> >> Richard Biener writes: >> > On Wed, Jul 26, 2023 at 4:02 AM Hao Liu OS via Gcc-patches >> > wrote: >> >> >> >> > When was STMT_VINFO_REDUC_DEF empty? I just want to make sure that >> >> > we're not pape

Re: [PATCH] AArch64: Do not increase the vect reduction latency by multiplying count [PR110625]

2023-07-26 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: > On Wed, Jul 26, 2023 at 4:02 AM Hao Liu OS via Gcc-patches > wrote: >> >> > When was STMT_VINFO_REDUC_DEF empty? I just want to make sure that we're >> > not papering over an issue elsewhere. >> >> Yes, I also wonder if this is an issue in vectorizable_reduction. Below

Re: vectorizer: Avoid an OOB access from vectorization

2023-07-25 Thread Richard Sandiford via Gcc-patches
Was leaving a bit of time in case Richi had any comments, but: Matthew Malcomson writes: > Our checks for whether the vectorization of a given loop would make an > out of bounds access miss the case when the vector we load is so large > as to span multiple iterations worth of data (while only bei

Re: [RFC] [v2] Extend fold_vec_perm to handle VLA vectors

2023-07-25 Thread Richard Sandiford via Gcc-patches
Hi, Thanks for the rework and sorry for the slow review. Prathamesh Kulkarni writes: > Hi Richard, > This is reworking of patch to extend fold_vec_perm to handle VLA vectors. > The attached patch unifies handling of VLS and VLA vector_csts, while > using fallback code > for ctors. > > For VLS ve

Re: [PATCH] VECT: Support CALL vectorization for COND_LEN_*

2023-07-25 Thread Richard Sandiford via Gcc-patches
"juzhe.zh...@rivai.ai" writes: > Hi, Richard. >>> I think we should have an internal-fn helper that returns IFN_COND_LEN_* >>> for a given IFN_COND_*. It could handle IFN_MASK_LOAD -> IFN_MASK_LEN_LOAD >>> etc. too. > Could you name this helper function for me? Does it call > "get_conditional_le

Re: [PATCH] VECT: Support CALL vectorization for COND_LEN_*

2023-07-25 Thread Richard Sandiford via Gcc-patches
"juzhe.zh...@rivai.ai" writes: > Thanks Richard. > > Do you suggest we should add a macro like this first: > > #ifndef DEF_INTERNAL_COND_FN > #define DEF_INTERNAL_COND_FN(NAME, FLAGS, OPTAB, TYPE) \ > DEF_INTERNAL_OPTAB_FN (COND_##NAME, FLAGS, cond_##optab, cond_##TYPE) > DEF_INTERNAL_OPTAB_FN

Re: [PATCH] AArch64: Do not increase the vect reduction latency by multiplying count [PR110625]

2023-07-25 Thread Richard Sandiford via Gcc-patches
Hao Liu OS writes: > Hi, > > Thanks for the suggestion. I tested it and found a gcc_assert failure: > gcc.target/aarch64/sve/cost_model_13.c (internal compiler error: in > info_for_reduction, at tree-vect-loop.cc:5473) > > It is caused by empty STMT_VINFO_REDUC_DEF. When was STMT_VINFO_REDU

Re: [PATCH] VECT: Support CALL vectorization for COND_LEN_*

2023-07-25 Thread Richard Sandiford via Gcc-patches
钟居哲 writes: > Hi, Richi. Thank you so much for review. > >>> This function doesn't seem to care about conditional vectorization >>> support, so why are you changing it? > > I debug and analyze the code here: > > Breakpoint 1, vectorizable_call (vinfo=0x3d358d0, stmt_info=0x3dcc820, > gsi=0x0, vec

Re: [PATCH 2/2] AARCH64: Turn off unwind tables for crtbeginT.o

2023-07-24 Thread Richard Sandiford via Gcc-patches
Andrew Pinski via Gcc-patches writes: > The problem -fasynchronous-unwind-tables is on by default for aarch64 > We need turn it off for crt*.o because it would make __EH_FRAME_BEGIN__ point > to .eh_frame data from crtbeginT.o instead of the user-defined object > during static linking. Could you

Re: [PATCH] AArch64: Do not increase the vect reduction latency by multiplying count [PR110625]

2023-07-24 Thread Richard Sandiford via Gcc-patches
Hao Liu OS writes: > This only affects the new costs in aarch64 backend. Currently, the reduction > latency of vector body is too large as it is multiplied by stmt count. As the > scalar reduction latency is small, the new costs model may think "scalar code > would issue more quickly" and increa

Re: [PATCH] Remove SLP_TREE_VEC_STMTS in favor of SLP_TREE_VEC_DEFS

2023-07-24 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: > The following unifies SLP_TREE_VEC_STMTS into SLP_TREE_VEC_DEFS > which can handle all cases we need. > > Bootstrap & regtest running on x86_64-unknown-linux-gnu. Nice! Just curious... > @@ -149,6 +147,20 @@ _slp_tree::~_slp_tree () > free (failed); > } > > +/*

Re: [PATCH]AArch64 fix regexp for live_1.c sve test

2023-07-21 Thread Richard Sandiford via Gcc-patches
Jan Hubicka writes: > Avoid scaling flat loop profiles of vectorized loops > > As discussed, when vectorizing loop with static profile, it is not always > good idea > to divide the header frequency by vectorization factor because the profile may > not realistically represent the expected number o

Re: [PATCH] tree-optimization/110742 - fix latent issue with permuting existing vectors

2023-07-20 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: >> Am 20.07.2023 um 18:59 schrieb Richard Sandiford : >> >> Richard Biener writes: > Am 20.07.2023 um 16:09 schrieb Richard Sandiford > : Richard Biener via Gcc-patches writes: > When we materialize a layout we push edge permutes to constant/exte

Re: [PATCH] tree-optimization/110742 - fix latent issue with permuting existing vectors

2023-07-20 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: >> Am 20.07.2023 um 16:09 schrieb Richard Sandiford : >> >> Richard Biener via Gcc-patches writes: >>> When we materialize a layout we push edge permutes to constant/external >>> defs without checking we can actually do so. For externals defined >>> by vector stmts rathe

Re: [PATCH] tree-optimization/110742 - fix latent issue with permuting existing vectors

2023-07-20 Thread Richard Sandiford via Gcc-patches
Richard Biener via Gcc-patches writes: > When we materialize a layout we push edge permutes to constant/external > defs without checking we can actually do so. For externals defined > by vector stmts rather than scalar components we can't. > > Bootstrapped and tested on x86_64-unknown-linux-gnu.

Re: [PATCH] sccvn: Correct the index of bias for IFN_LEN_STORE [PR110744]

2023-07-20 Thread Richard Sandiford via Gcc-patches
"Kewen.Lin" writes: > Hi, > > Commit r14-2267-gb8806f6ffbe72e adjusts the arguments order > of LEN_STORE from {len,vector,bias} to {len,bias,vector}, > in order to make them consistent with LEN_MASK_STORE and > MASK_STORE. But it missed to update the related handlings > in tree-ssa-sccvn.cc, it c

Re: [PATCH] testsuite: Add a test case for PR110729

2023-07-20 Thread Richard Sandiford via Gcc-patches
"Kewen.Lin" writes: > Hi, > > As PR110729 reported, there was one issue for .section > __patchable_function_entries with -ffunction-sections, that > is we put the same symbol as link_to section symbol for all > functions wrongly. The commit r13-4294 for PR99889 has > fixed this with the correspon

Re: [PATCH v2] Store_bit_field_1: Use SUBREG instead of REG if possible

2023-07-20 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: > On Thu, 20 Jul 2023, Richard Sandiford wrote: > >> Jeff Law via Gcc-patches writes: >> > On 7/19/23 04:25, Richard Biener wrote: >> >> On Wed, 19 Jul 2023, YunQiang Su wrote: >> >> >> >>> Eric Botcazou wrote on Wed, 19 Jul 2023 at 17:45: >> >> > I don't see that. That's d

Re: [PATCH]AArch64 fix regexp for live_1.c sve test

2023-07-20 Thread Richard Sandiford via Gcc-patches
Jan Hubicka writes: >> Tamar Christina writes: >> > Hi All, >> > >> > The resulting predicate register of a whilelo is not >> > restricted to the lower half of the predicate register file. >> > >> > As such these tests started failing after recent changes >> > because the whilelo outside the loop

Re: [PATCH]AArch64 fix regexp for live_1.c sve test

2023-07-20 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: > On Thu, 20 Jul 2023, Richard Sandiford wrote: > >> Tamar Christina writes: >> > Hi All, >> > >> > The resulting predicate register of a whilelo is not >> > restricted to the lower half of the predicate register file. >> > >> > As such these tests started failing after rec

Re: [PATCH v2] Store_bit_field_1: Use SUBREG instead of REG if possible

2023-07-20 Thread Richard Sandiford via Gcc-patches
Jeff Law via Gcc-patches writes: > On 7/19/23 04:25, Richard Biener wrote: >> On Wed, 19 Jul 2023, YunQiang Su wrote: >> >>> Eric Botcazou wrote on Wed, 19 Jul 2023 at 17:45: > I don't see that. That's definitely not what GCC expects here, > the left-most word of the doubleword should be unchan

Re: [GCC 13 PATCH] aarch64: Remove architecture dependencies from intrinsics

2023-07-19 Thread Richard Sandiford via Gcc-patches
Andrew Carlotti writes: > Updated patch to fix the fp16 intrinsic pragmas, and pushed to master. > OK to backport to GCC 13? OK, thanks. Richard > Many intrinsics currently depend on both an architecture version and a > feature, despite the corresponding instructions being available within > GC

Re: [PATCH]AArch64 fix regexp for live_1.c sve test

2023-07-19 Thread Richard Sandiford via Gcc-patches
Tamar Christina writes: > Hi All, > > The resulting predicate register of a whilelo is not > restricted to the lower half of the predicate register file. > > As such these tests started failing after recent changes > because the whilelo outside the loop is getting assigned p15. It's the whilelo i

Re: [PATCH v2 0/2] ifcvt: Allow if conversion of arithmetic in basic blocks with multiple sets

2023-07-18 Thread Richard Sandiford via Gcc-patches
Manolis Tsamis writes: > On Tue, Jul 18, 2023 at 1:12 AM Richard Sandiford > wrote: >> >> Manolis Tsamis writes: >> > noce_convert_multiple_sets has been introduced and extended over time to >> > handle >> > if conversion for blocks with multiple sets. Currently this is focused on >> > register

Re: [x86-64] RFC: Add nosse abi attribute

2023-07-17 Thread Richard Sandiford via Gcc-patches
Michael Matz via Gcc-patches writes: > Hello, > > the ELF psABI for x86-64 doesn't have any callee-saved SSE > registers (there were actual reasons for that, but those don't > matter anymore). This starts to hurt some uses, as it means that > as soon as you have a call (say to memmove/memcpy, eve

Re: [PATCH v2 0/2] ifcvt: Allow if conversion of arithmetic in basic blocks with multiple sets

2023-07-17 Thread Richard Sandiford via Gcc-patches
Manolis Tsamis writes: > noce_convert_multiple_sets has been introduced and extended over time to > handle > if conversion for blocks with multiple sets. Currently this is focused on > register moves and rejects any sort of arithmetic operations. > > This series is an extension to allow more sequ

Re: [PATCH V2] RTL_SSA: Relax PHI_MODE in phi_setup

2023-07-17 Thread Richard Sandiford via Gcc-patches
juzhe.zh...@rivai.ai writes: > From: Ju-Zhe Zhong > > Hi, Richard. > > RISC-V port needs to add a bunch VLS modes (V16QI,V32QI,V64QI,...etc) > There are sharing same REG_CLASS with VLA modes (VNx16QI,VNx32QI,...etc) > > When I am adding those VLS modes, the RTL_SSA initialization in VSETVL PASS >

Re: [PATCH] RTL_SSA: Relax PHI_MODE in phi_setup

2023-07-17 Thread Richard Sandiford via Gcc-patches
Juzhe-Zhong writes: > Hi, Richard. > > RISC-V port needs to add a bunch VLS modes (V16QI,V32QI,V64QI,...etc) > There are sharing same REG_CLASS with VLA modes (VNx16QI,VNx32QI,...etc) > > When I am adding those VLS modes, the RTL_SSA initialization in VSETVL PASS > (inserted after RA) ICE: > rvv.

Re: [WIP RFC] Add support for keyword-based attributes

2023-07-17 Thread Richard Sandiford via Gcc-patches
Jason Merrill writes: > On Sun, Jul 16, 2023 at 6:50 AM Richard Sandiford > wrote: > >> Jakub Jelinek writes: >> > On Fri, Jul 14, 2023 at 04:56:18PM +0100, Richard Sandiford via >> Gcc-patches wrote: >> >> Summary: We'd like to be able to speci

Re: [WIP RFC] Add support for keyword-based attributes

2023-07-17 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: > On Fri, Jul 14, 2023 at 5:58 PM Richard Sandiford via Gcc-patches > wrote: >> >> Summary: We'd like to be able to specify some attributes using >> keywords, rather than the traditional __attribute__ or [[...]] >> syntax. Wou

Re: [WIP RFC] Add support for keyword-based attributes

2023-07-16 Thread Richard Sandiford via Gcc-patches
Jakub Jelinek writes: > On Fri, Jul 14, 2023 at 04:56:18PM +0100, Richard Sandiford via Gcc-patches > wrote: >> Summary: We'd like to be able to specify some attributes using >> keywords, rather than the traditional __attribute__ or [[...]] >> syntax. Would that

Re: [WIP RFC] Add support for keyword-based attributes

2023-07-16 Thread Richard Sandiford via Gcc-patches
Thanks for the feedback. Nathan Sidwell writes: > On 7/14/23 11:56, Richard Sandiford wrote: >> Summary: We'd like to be able to specify some attributes using >> keywords, rather than the traditional __attribute__ or [[...]] >> syntax. Would that be OK? >> >> In more detail: >> >> We'd like to

[WIP RFC] Add support for keyword-based attributes

2023-07-14 Thread Richard Sandiford via Gcc-patches
Summary: We'd like to be able to specify some attributes using keywords, rather than the traditional __attribute__ or [[...]] syntax. Would that be OK? In more detail: We'd like to add some new target-specific attributes for Arm SME. These attributes affect semantics and code generation and so t

Re: [pushed][LRA][PR110372]: Refine reload pseudo class

2023-07-12 Thread Richard Sandiford via Gcc-patches
Vladimir Makarov writes: > On 7/12/23 06:07, Richard Sandiford wrote: >> Vladimir Makarov via Gcc-patches writes: >>> diff --git a/gcc/lra-assigns.cc b/gcc/lra-assigns.cc >>> index 73fbef29912..2f95121df06 100644 >>> --- a/gcc/lra-assigns.cc >>> +++ b/gcc/lra-assigns.cc >>> @@ -1443,10 +1443,11 @

Re: [PATCH] tree-optimization/94864 - vector insert of vector extract simplification

2023-07-12 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: > The PRs ask for optimizing of > > _1 = BIT_FIELD_REF ; > result_4 = BIT_INSERT_EXPR ; > > to a vector permutation. The following implements this as > match.pd pattern, improving code generation on x86_64. > > On the RTL level we face the issue that backend patterns in

Re: [PATCH V3] VECT: Apply COND_LEN_* into vectorizable_operation

2023-07-12 Thread Richard Sandiford via Gcc-patches
juzhe.zh...@rivai.ai writes: > From: Ju-Zhe Zhong > > Hi, Richard and Richi. > As we discussed before, COND_LEN_* patterns were added for multiple > situations. > This patch applies COND_LEN_* for the following situation: > > Support for the situation that in "vectorizable_operation": > /* If ope

Re: [PATCH] simplify-rtx: Fix invalid simplification with paradoxical subregs [PR110206]

2023-07-12 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: > On Wed, Jul 12, 2023 at 1:05 PM Uros Bizjak wrote: >> >> On Wed, Jul 12, 2023 at 12:58 PM Uros Bizjak wrote: >> > >> > On Wed, Jul 12, 2023 at 12:23 PM Richard Sandiford >> > wrote: >> > > >> > > Richard Biener via Gcc-patches writes: >> > > > On Mon, Jul 10, 2023 at 1

Re: [PATCH V2] VECT: Apply COND_LEN_* into vectorizable_operation

2023-07-12 Thread Richard Sandiford via Gcc-patches
juzhe.zh...@rivai.ai writes: > From: Ju-Zhe Zhong > > Hi, Richard and Richi. > As we discussed before, COND_LEN_* patterns were added for multiple > situations. > This patch applies COND_LEN_* for the following situation: > > Support for the situation that in "vectorizable_operation": > /* If ope

Re: [PATCH] simplify-rtx: Fix invalid simplification with paradoxical subregs [PR110206]

2023-07-12 Thread Richard Sandiford via Gcc-patches
Richard Biener via Gcc-patches writes: > On Mon, Jul 10, 2023 at 1:01 PM Uros Bizjak wrote: >> >> On Mon, Jul 10, 2023 at 11:47 AM Richard Biener >> wrote: >> > >> > On Mon, Jul 10, 2023 at 11:26 AM Uros Bizjak wrote: >> > > >> > > On Mon, Jul 10, 2023 at 11:17 AM Richard Biener >> > > wrote:

Re: [pushed][LRA][PR110372]: Refine reload pseudo class

2023-07-12 Thread Richard Sandiford via Gcc-patches
Vladimir Makarov via Gcc-patches writes: > The following patch solves > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110372 > > The patch was successfully bootstrapped and tested on x86-64. > > commit 1f7e5a7b91862b999aab88ee0319052aaf00f0f1 > Author: Vladimir N. Makarov > Date: Fri Jul 7 09:

Re: [PATCH V5] RISC-V: Support gather_load/scatter RVV auto-vectorization

2023-07-12 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: > On Wed, 12 Jul 2023, juzhe.zh...@rivai.ai wrote: > >> Thanks Richard. >> >> Is it correct that the better way is to add optabs >> (len_strided_load/len_strided_store), >> then expand LEN_MASK_GATHER_LOAD/LEN_MASK_SCATTER_STORE to >> len_strided_load/len_strided_store op

Re: [PATCH] VECT: Apply COND_LEN_* into vectorizable_operation

2023-07-12 Thread Richard Sandiford via Gcc-patches
juzhe.zh...@rivai.ai writes: > From: Ju-Zhe Zhong > > Hi, Richard and Richi. > As we discussed before, COND_LEN_* patterns were added for multiple > situations. > This patch applies COND_LEN_* for the following situation: > > Support for the situation that in "vectorizable_operation": > /* If ope

Re: [PATCH] genopinit: Allow more than 256 modes.

2023-07-11 Thread Richard Sandiford via Gcc-patches
Robin Dapp writes: > Ok so the consensus seems to rather stay with 32 bits and only > change the shift to 10/20? Yeah. The check would then be: if (NUM_OPTABS > 0xfff || NUM_MACHINE_MODES > 0x3ff) fatal ("genopinit range assumptions invalid"); > As MACHINE_MODE_BITSIZE is already > 16 we
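The 10/20 shift and the range check quoted above can be illustrated with a minimal sketch. It assumes a 32-bit key laid out as 12 bits of optab number followed by two 10-bit mode numbers; the names `pack_key`, `key_optab`, `key_mode1`, `key_mode2`, `MAX_OPTAB` and `MAX_MODE` are hypothetical and are not taken from genopinit itself.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical limits implied by a 12/10/10 split of a 32-bit key.  */
#define MAX_OPTAB 0xfffu
#define MAX_MODE  0x3ffu

/* Pack an optab number and two mode numbers into one 32-bit key:
   bits 20-31 hold the optab, bits 10-19 the first mode and
   bits 0-9 the second mode.  The assert stands in for the
   "range assumptions invalid" check in the message above.  */
static uint32_t
pack_key (uint32_t optab, uint32_t mode1, uint32_t mode2)
{
  assert (optab <= MAX_OPTAB && mode1 <= MAX_MODE && mode2 <= MAX_MODE);
  return (optab << 20) | (mode1 << 10) | mode2;
}

/* Recover the individual fields from a packed key.  */
static uint32_t key_optab (uint32_t key) { return key >> 20; }
static uint32_t key_mode1 (uint32_t key) { return (key >> 10) & MAX_MODE; }
static uint32_t key_mode2 (uint32_t key) { return key & MAX_MODE; }
```

With this layout the three fields use exactly 32 bits, which is why the check bounds the optab count by 0xfff and the mode count by 0x3ff.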

Re: [PATCH] genopinit: Allow more than 256 modes.

2023-07-11 Thread Richard Sandiford via Gcc-patches
Richard Sandiford writes: > Robin Dapp via Gcc-patches writes: >> Hi, >> >> upcoming changes for RISC-V will have us exceed 256 modes or 8 bits. The >> helper functions in gen* rely on the opcode as well as two modes fitting >> into an unsigned int (a signed int even if we consider the qsort defa

Re: [PATCH] genopinit: Allow more than 256 modes.

2023-07-11 Thread Richard Sandiford via Gcc-patches
Robin Dapp via Gcc-patches writes: > Hi, > > upcoming changes for RISC-V will have us exceed 256 modes or 8 bits. The > helper functions in gen* rely on the opcode as well as two modes fitting > into an unsigned int (a signed int even if we consider the qsort default > comparison function). This

Re: [PATCH] Vect: use a small step to calculate induction for the unrolled loop (PR tree-optimization/110449)

2023-07-07 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: >> On 06.07.2023 at 19:50, Richard Sandiford wrote: >> >> Richard Biener via Gcc-patches writes: On Wed, Jul 5, 2023 at 8:44 AM Hao Liu OS via Gcc-patches wrote: Hi, If a loop is unrolled by n times during vectorization, two steps are use

Re: [PATCH] Vect: use a small step to calculate induction for the unrolled loop (PR tree-optimization/110449)

2023-07-06 Thread Richard Sandiford via Gcc-patches
Richard Biener via Gcc-patches writes: > On Wed, Jul 5, 2023 at 8:44 AM Hao Liu OS via Gcc-patches > wrote: >> >> Hi, >> >> If a loop is unrolled by n times during vectorization, two steps are used to >> calculate the induction variable: >> - The small step for the unrolled ith-copy: vec_1 = vec

Re: [PATCH] Vect: select small VF for epilog of unrolled loop (PR tree-optimization/110474)

2023-07-05 Thread Richard Sandiford via Gcc-patches
Hao Liu OS via Gcc-patches writes: > Hi, > > If a loop is unrolled during vectorization (i.e. suggested_unroll_factor > 1), > the VFs of both main and epilog loop are enlarged. The epilog vect loop is > specific for a loop with small iteration counts, so a large VF may hurt > performance. > > Thi

Re: [PATCH][RFC] target/110456 - avoid loop masking with zero distance dependences

2023-07-05 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: > On Tue, 4 Jul 2023, Richard Sandiford wrote: > >> Richard Biener writes: >> > On Thu, 29 Jun 2023, Richard Biener wrote: >> > >> >> On Thu, 29 Jun 2023, Richard Sandiford wrote: >> >> >> >> > Richard Biener writes: >> >> > > With applying loop masking to epilogues on x8

Re: [PATCH] middle-end/110541 - VEC_PERM_EXPR documentation is off

2023-07-05 Thread Richard Sandiford via Gcc-patches
Richard Biener via Gcc-patches writes: > The following adjusts the tree.def documentation about VEC_PERM_EXPR > which wasn't adjusted when the restrictions of permutes with constant > mask were relaxed. I was going to complain about having two copies of the documentation, but then I realised that

Re: [PATCH][RFC] target/110456 - avoid loop masking with zero distance dependences

2023-07-04 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: > On Thu, 29 Jun 2023, Richard Biener wrote: > >> On Thu, 29 Jun 2023, Richard Sandiford wrote: >> >> > Richard Biener writes: >> > > With applying loop masking to epilogues on x86_64 AVX512 we see >> > > some significant performance regressions when evaluating SPEC CPU 20

Re: [PATCH] tree-optimization/110310 - move vector epilogue disabling to analysis phase

2023-07-04 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: > On Tue, 4 Jul 2023, Richard Biener wrote: > >> On Mon, 3 Jul 2023, Richard Sandiford wrote: >> >> > Richard Biener writes: >> > > The following removes late deciding to elide vectorized epilogues to >> > > the analysis phase and also avoids altering the epilogues niter.

Re: [PATCH] tree-optimization/110310 - move vector epilogue disabling to analysis phase

2023-07-03 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: > The following removes late deciding to elide vectorized epilogues to > the analysis phase and also avoids altering the epilogues niter. > The costing part from vect_determine_partial_vectors_and_peeling is > moved to vect_analyze_loop_costing where we use the main loop > a

Re: [PATCH V7] Machine Description: Add LEN_MASK_{GATHER_LOAD, SCATTER_STORE} pattern

2023-07-03 Thread Richard Sandiford via Gcc-patches
juzhe.zh...@rivai.ai writes: > From: Ju-Zhe Zhong > > Hi, Richi and Richard. > > Based on the review comments from Richard: > https://gcc.gnu.org/pipermail/gcc-patches/2023-July/623405.html > > I change len_mask_gather_load/len_mask_scatter_store order into: > {len,bias,mask} > > We adjust adding

Re: [PATCH] middle-end/110495 - avoid associating constants with (VL) vectors

2023-07-03 Thread Richard Sandiford via Gcc-patches
Richard Biener via Gcc-patches writes: > When trying to associate (v + INT_MAX) + INT_MAX we are using > the TREE_OVERFLOW bit to check for correctness. That isn't > working for VECTOR_CSTs and it can't in general when one considers > VL vectors. It looks like it should work for COMPLEX_CSTs but

Re: [PATCH V2] Middle-end: Change order of LEN_MASK_LOAD/LEN_MASK_STORE arguments

2023-07-03 Thread Richard Sandiford via Gcc-patches
juzhe.zh...@rivai.ai writes: > From: Ju-Zhe Zhong > > Hi, Richard. I fix the order as you suggested. > > Before this patch, the order is {len,mask,bias}. > > Now, after this patch, the order becomes {len,bias,mask}. > > Since you said we should not need 'internal_fn_bias_index', the bias index > s

[PATCH] aarch64: Fix vector-to-vector vec_extract

2023-07-03 Thread Richard Sandiford via Gcc-patches
The documentation says: - @cindex @code{vec_extract@var{m}@var{n}} instruction pattern @item @samp{vec_extract@var{m}@var{n}} Extract given field from the vector value. [...] The @var{n} mode is the mode of the field or vect

Re: [PATCH] Middle-end: Change order of LEN_MASK_LOAD/LEN_MASK_STORE arguments

2023-07-03 Thread Richard Sandiford via Gcc-patches
juzhe.zh...@rivai.ai writes: > From: Ju-Zhe Zhong > > Hi, Richard and Richi. > > According to Richard's review comments: > https://gcc.gnu.org/pipermail/gcc-patches/2023-July/623405.html > > current len, bias and mask order is not reasonable. > > Change {len,mask,bias} into {len,bias,mask}. > > Th
