RE: [PATCH 2/2]middle-end: use two's complement equality when comparing IVs during candidate selection [PR114932]

2024-09-10 Thread Tamar Christina
ping > -Original Message- > From: Tamar Christina > Sent: Tuesday, August 20, 2024 2:06 PM > To: gcc-patches@gcc.gnu.org > Cc: nd ; rguent...@suse.de; j...@ventanamicro.com > Subject: [PATCH 2/2]middle-end: use two's complement equality when comparing > IVs

RE: [PATCH 1/2]middle-end: refactor type to be explicit in operand_equal_p [PR114932]

2024-09-10 Thread Tamar Christina
ping > -Original Message- > From: Tamar Christina > Sent: Tuesday, August 20, 2024 2:06 PM > To: gcc-patches@gcc.gnu.org > Cc: nd ; rguent...@suse.de; j...@ventanamicro.com > Subject: [PATCH 1/2]middle-end: refactor type to be explicit in > operand_equal_p >

RE: [PATCH 4/4]AArch64: Define VECTOR_STORE_FLAG_VALUE.

2024-09-09 Thread Tamar Christina
> -Original Message- > From: Richard Sandiford > Sent: Monday, September 9, 2024 9:29 AM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw > ; Marcus Shawcroft > ; ktkac...@gcc.gnu.org > Subject: Re: [PATCH 4/4]AArch64: Define VECTOR_STOR

RE: [PATCH]middle-end: check that the lhs of a COND_EXPR is an SSA_NAME in cond_store recognition [PR116628]

2024-09-06 Thread Tamar Christina
> -Original Message- > From: Kyrylo Tkachov > Sent: Friday, September 6, 2024 2:15 PM > To: Tamar Christina > Cc: GCC Patches ; nd ; Richard Biener > ; j...@ventanamicro.com > Subject: Re: [PATCH]middle-end: check that the lhs of a COND_EXPR is an > SSA_NAME i

RE: [PATCH 3/4][rtl]: simplify boolean vector EQ and NE comparisons

2024-09-06 Thread Tamar Christina
> -Original Message- > From: Richard Sandiford > Sent: Friday, September 6, 2024 2:21 PM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd > Subject: Re: [PATCH 3/4][rtl]: simplify boolean vector EQ and NE comparisons > > Tamar Christina writes: >

RE: [PATCH 2/4]middle-end: lower COND_EXPR into gimple form in vect_recog_bool_pattern

2024-09-06 Thread Tamar Christina
> -Original Message- > From: Richard Biener > Sent: Friday, September 6, 2024 2:09 PM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; j...@ventanamicro.com > Subject: Re: [PATCH 2/4]middle-end: lower COND_EXPR into gimple form in > vect_recog_bool_patte

[PATCH]middle-end: check that the lhs of a COND_EXPR is an SSA_NAME in cond_store recognition [PR116628]

2024-09-06 Thread Tamar Christina
Hi All, Because the vect_recog_bool_pattern can at the moment still transition out of GIMPLE and back into GENERIC the vect_recog_cond_store_pattern can end up using an expression as a mask rather than an SSA_NAME. This adds an explicit check that we have a mask and not an expression. Bootstrapp

[PATCH 4/4]AArch64: Define VECTOR_STORE_FLAG_VALUE.

2024-09-03 Thread Tamar Christina
Hi All, This defines VECTOR_STORE_FLAG_VALUE to CONST1_RTX for AArch64 so we simplify vector comparisons in AArch64. With this enabled res: movi v0.4s, 0 cmeq v0.4s, v0.4s, v0.4s ret is simplified to: res: mvni v0.4s, 0 ret NOTE: I don't really
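
A lane-by-lane model in C may help show why the two instruction sequences quoted above produce the same vector. This is only a sketch under stated assumptions: the function names (`cmeq_4s`, `mvni_4s`, `sequences_agree`) are hypothetical and model the instructions on four 32-bit lanes; none of this is code from the patch.

```c
#include <assert.h>
#include <stdint.h>

#define LANES 4

/* Model of CMEQ on .4s lanes: a lane becomes all-ones when the
   inputs are equal and zero otherwise.  */
void cmeq_4s (uint32_t dst[LANES], const uint32_t a[LANES],
              const uint32_t b[LANES])
{
  for (int i = 0; i < LANES; i++)
    dst[i] = (a[i] == b[i]) ? UINT32_MAX : 0;
}

/* Model of MVNI on .4s lanes: every lane gets the bitwise NOT of
   the immediate.  */
void mvni_4s (uint32_t dst[LANES], uint32_t imm)
{
  for (int i = 0; i < LANES; i++)
    dst[i] = ~imm;
}

/* Hypothetical check: returns 1 when "movi 0; cmeq v0, v0, v0" and
   "mvni 0" yield the same lanes.  */
int sequences_agree (void)
{
  uint32_t v0[LANES] = { 0, 0, 0, 0 };   /* movi v0.4s, 0 */
  uint32_t before[LANES], after[LANES];

  cmeq_4s (before, v0, v0);              /* cmeq v0.4s, v0.4s, v0.4s */
  mvni_4s (after, 0);                    /* mvni v0.4s, 0 */

  for (int i = 0; i < LANES; i++)
    if (before[i] != after[i])
      return 0;
  return 1;
}
```

Comparing a register with itself is true in every lane, so the result is the all-ones constant that `mvni` with a zero immediate materializes directly.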

[PATCH 3/4][rtl]: simplify boolean vector EQ and NE comparisons

2024-09-03 Thread Tamar Christina
Hi All, This adds vector constant simplification for EQ and NE. This is useful since the vectorizer generates a lot more vector compares now, in particular NE and EQ and so these help us optimize cases where the values were not known at GIMPLE but instead only at RTL. Bootstrapped Regtested on a

[PATCH 2/4]middle-end: lower COND_EXPR into gimple form in vect_recog_bool_pattern

2024-09-03 Thread Tamar Christina
Hi All, Currently the vectorizer cheats when lowering COND_EXPR during bool recog. In the cases where the conditional is loop invariant or non-boolean it instead converts the operation back into GENERIC and hides much of the operation from the analysis part of the vectorizer. i.e. a ? b : c is

[PATCH 1/4]middle-end: have vect_recog_cond_store_pattern use pattern statement for cond if available

2024-09-03 Thread Tamar Christina
Hi All, When vectorizing a conditional operation we rely on the bool_recog pattern to hit and convert the bool of the operand to a valid mask. However we are currently not using the converted operand as this is in a pattern statement. This change updates it to look at the actual statement to be

[PATCH][docs]: [committed] remove double mention of armv9-a.

2024-09-03 Thread Tamar Christina
Hi All, The list of available architectures for Arm incorrectly lists armv9-a twice. This removes the duplicate armv9-a enumeration from the part of the list having M-profile targets. Committed under the obvious rule. Thanks, Tamar gcc/ChangeLog: * doc/invoke.texi: Remove duplicate

[PATCH][testsuite]: remove -fwrapv from signbit-5.c

2024-09-03 Thread Tamar Christina
Hi All, The meaning of the testcase was changed by passing it -fwrapv. The test failed on some platforms because it exercised implementation-defined behavior with respect to INT_MIN in generic code. Instead of using -fwrapv this just removes the border case from the test so

RE: [RFC][PATCH] AArch64: Remove AARCH64_EXTRA_TUNE_USE_NEW_VECTOR_COSTS

2024-08-28 Thread Tamar Christina
> -Original Message- > From: Kyrylo Tkachov > Sent: Wednesday, August 28, 2024 8:55 AM > To: Tamar Christina > Cc: Richard Sandiford ; Jennifer Schmitz > ; gcc-patches@gcc.gnu.org; Kyrylo Tkachov > > Subject: Re: [RFC][PATCH] AArch64: Remove > AARCH64_EXTR

RE: [RFC][PATCH] AArch64: Remove AARCH64_EXTRA_TUNE_USE_NEW_VECTOR_COSTS

2024-08-27 Thread Tamar Christina
> -Original Message- > From: Richard Sandiford > Sent: Tuesday, August 27, 2024 11:46 AM > To: Tamar Christina > Cc: Jennifer Schmitz ; gcc-patches@gcc.gnu.org; Kyrylo > Tkachov > Subject: Re: [RFC][PATCH] AArch64: Remove > AARCH64_EXTRA_TUNE_USE_NEW_VECTOR_CO

RE: [RFC][PATCH] AArch64: Remove AARCH64_EXTRA_TUNE_USE_NEW_VECTOR_COSTS

2024-08-23 Thread Tamar Christina
Hi Jennifer, > -Original Message- > From: Jennifer Schmitz > Sent: Friday, August 23, 2024 1:07 PM > To: gcc-patches@gcc.gnu.org > Cc: Richard Sandiford ; Kyrylo Tkachov > > Subject: [RFC][PATCH] AArch64: Remove > AARCH64_EXTRA_TUNE_USE_NEW_VECTOR_COSTS > > This patch removes the AARCH6

RE: [RFC] Support single lane SLP early break

2024-08-22 Thread Tamar Christina
> -Original Message- > From: Richard Biener > Sent: Wednesday, August 21, 2024 12:12 PM > To: Tamar Christina > Cc: GCC Patches > Subject: Re: [RFC] Support single lane SLP early break > > On Tue, 20 Aug 2024, Tamar Christina wrote: > > > Hi, > >

RE: [PATCH] testsuite: Add -fwrapv to signbit-5.c

2024-08-22 Thread Tamar Christina
> -Original Message- > From: Torbjorn SVENSSON > Sent: Wednesday, August 21, 2024 2:23 PM > To: Tamar Christina ; Richard Biener > > Cc: Jeff Law ; gcc-patches@gcc.gnu.org; Richard > Earnshaw ; quic_apin...@quicinc.com; > yvan.r...@foss.st.com > Subject:

[PATCH 2/2]middle-end: use two's complement equality when comparing IVs during candidate selection [PR114932]

2024-08-20 Thread Tamar Christina
Hi All, IVOPTS normally uses affine trees to perform comparisons between different IVs, but these seem to have been missing in two key spots, where normal tree equivalence is used instead. In some cases where we have a two's complement equivalence but not a strict signedness equivalence we end up ge
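
The distinction drawn above between a two's complement equivalence and a strict signedness equivalence can be illustrated with plain C integers. This is a minimal sketch, not GCC's implementation: `same_bits_p` and `same_value_p` are hypothetical helpers, and fixed 64-bit integers stand in for tree expressions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true when A and B have the same 64-bit
   pattern, regardless of signedness.  Converting a signed value to
   uint64_t is well defined (reduction modulo 2^64).  */
bool same_bits_p (int64_t a, uint64_t b)
{
  return (uint64_t) a == b;
}

/* Hypothetical helper: the stricter test, which additionally
   requires A to denote the same mathematical value as B.  */
bool same_value_p (int64_t a, uint64_t b)
{
  return a >= 0 && (uint64_t) a == b;
}
```

For example, a signed -4 and an unsigned `UINT64_MAX - 3` share a bit pattern, so they pass the first test but fail the second.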

[PATCH 1/2]middle-end: refactor type to be explicit in operand_equal_p [PR114932]

2024-08-20 Thread Tamar Christina
Hi All, This is a refactoring with no expected behavioral change. The goal with this is to make the type of the expressions being used explicit. I did not change all the recursive calls to operand_equal_p () to recurse directly to the new function but instead this goes through the top level call

RE: [PATCH V3 10/10] autovectorizer: Test autovectorization of different dot-prod modes.

2024-08-20 Thread Tamar Christina
> -Original Message- > From: Richard Biener > Sent: Tuesday, August 20, 2024 1:54 PM > To: Tamar Christina > Cc: Victor Do Nascimento ; gcc- > patc...@gcc.gnu.org; claz...@gmail.com; hongtao@intel.com; > s...@gcc.gnu.org; bernds_...@t-online.de; al...@

RE: [PATCH] testsuite: Add -fwrapv to signbit-5.c

2024-08-20 Thread Tamar Christina
> -Original Message- > From: Richard Biener > Sent: Tuesday, August 20, 2024 12:33 PM > To: Torbjorn SVENSSON > Cc: Jeff Law ; gcc-patches@gcc.gnu.org; Richard > Earnshaw ; quic_apin...@quicinc.com; > yvan.r...@foss.st.com; Tamar Christina > Subject: Re: [PATCH

[RFC] Support single lane SLP early break

2024-08-20 Thread Tamar Christina
Hi, I've been working on a prototype of moving early break to SLP. As we've discussed on IRC I've decided to first try adding the gconds as roots and start SLP discovery using them as roots. This works great and doesn't require any changes to build_slp, it also has the additional benefit in that

[RFC] early vector boolean lowering

2024-08-20 Thread Tamar Christina
Hi, As you know I've been working on removing the code that demotes GIMPLE COND_EXPR to GENERIC during vect_recog_bool_pattern. To restate why: the issue we have today is that the mask (boolean argument of a COND_EXPR) is not always available during pattern matching. This is a problem

RE: [PATCH V3 10/10] autovectorizer: Test autovectorization of different dot-prod modes.

2024-08-20 Thread Tamar Christina
> -Original Message- > From: Richard Biener > Sent: Tuesday, August 20, 2024 10:37 AM > To: Victor Do Nascimento > Cc: gcc-patches@gcc.gnu.org; Tamar Christina ; > claz...@gmail.com; hongtao@intel.com; s...@gcc.gnu.org; bernds_cb1@t- > online.de; al...@

RE: [PATCH v1] Vect: Promote unsigned .SAT_ADD constant operand for vectorizable_call

2024-08-20 Thread Tamar Christina
Hi Pan, > -Original Message- > From: Li, Pan2 > Sent: Tuesday, August 20, 2024 1:58 AM > To: Tamar Christina ; Jakub Jelinek > > Cc: Richard Biener ; gcc-patches@gcc.gnu.org; > juzhe.zh...@rivai.ai; kito.ch...@gmail.com; jeffreya...@gmail.com; > rdapp

RE: [PATCH v1] Vect: Promote unsigned .SAT_ADD constant operand for vectorizable_call

2024-08-19 Thread Tamar Christina
> -Original Message- > From: Jakub Jelinek > Sent: Monday, August 19, 2024 8:25 PM > To: Tamar Christina > Cc: Li, Pan2 ; Richard Biener ; > gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; kito.ch...@gmail.com; > jeffreya...@gmail.com; rdapp@gmail.com; Liu, Hon

RE: [PATCH v1] Vect: Promote unsigned .SAT_ADD constant operand for vectorizable_call

2024-08-19 Thread Tamar Christina
Hi Pan, > > Thanks Jakub for explaining. > > Hi Richard, > > Does it mean we need to do some promotion similar as this patch to make the > vectorizable_call happy > when there is a constant operand? I am not sure if there is a better approach > for > this case. I'll leave it up to Richi, but

RE: [PATCH V3 02/10] autovectorizer: Add basic support for convert optabs

2024-08-15 Thread Tamar Christina
Hi Victor, > -Original Message- > From: Victor Do Nascimento > Sent: Thursday, August 15, 2024 9:44 AM > To: gcc-patches@gcc.gnu.org > Cc: Tamar Christina ; claz...@gmail.com; > hongtao@intel.com; s...@gcc.gnu.org; bernds_...@t-online.de; > al...@redhat.com;

RE: [PATCH V2 02/10] autovectorizer: Add basic support for convert optabs

2024-08-14 Thread Tamar Christina
Hi Victor, > -Original Message- > From: Victor Do Nascimento > Sent: Tuesday, August 13, 2024 1:42 PM > To: gcc-patches@gcc.gnu.org > Cc: Tamar Christina ; claz...@gmail.com; > hongtao@intel.com; s...@gcc.gnu.org; bernds_...@t-online.de; > al...@redhat.com;

RE: [PATCH][RFC] aarch64: Reduce FP reassociation width for Neoverse V2 and set AARCH64_EXTRA_TUNE_FULLY_PIPELINED_FMA

2024-08-12 Thread Tamar Christina
> -Original Message- > From: Kyrylo Tkachov > Sent: Monday, August 12, 2024 3:54 PM > To: Tamar Christina > Cc: GCC Patches ; Richard Sandiford > > Subject: Re: [PATCH][RFC] aarch64: Reduce FP reassociation width for Neoverse > V2 and set AARCH64_EXTRA_TUNE_FUL

RE: [PATCH][RFC] aarch64: Reduce FP reassociation width for Neoverse V2 and set AARCH64_EXTRA_TUNE_FULLY_PIPELINED_FMA

2024-08-12 Thread Tamar Christina
Hi Kyrill, > -Original Message- > From: Kyrylo Tkachov > Sent: Monday, August 12, 2024 3:07 PM > To: GCC Patches > Cc: Tamar Christina ; Richard Sandiford > > Subject: [PATCH][RFC] aarch64: Reduce FP reassociation width for Neoverse V2 > and set AARCH64_EXTRA_

[PATCH]AArch64: Fix signbit mask creation after late combine [PR116229]

2024-08-07 Thread Tamar Christina
Hi All, The optimization to generate a DI signbit constant by using fneg was relying on nothing being able to push the constant into the negate. It's run quite late for this reason. However late combine now runs after it and triggers RTL simplification based on the neg. When -fno-signed-zeros t

RE: [PATCH 8/8]AArch64: take gather/scatter decode overhead into account

2024-08-01 Thread Tamar Christina
> -Original Message- > From: Tamar Christina > Sent: Thursday, August 1, 2024 9:51 AM > To: Richard Sandiford > Cc: Kyrylo Tkachov ; gcc-patches@gcc.gnu.org; nd > ; Richard Earnshaw ; Marcus > Shawcroft ; ktkac...@gcc.gnu.org > Subject: RE: [PATCH 8/8]AArch64: ta

RE: [PATCH 8/8]AArch64: take gather/scatter decode overhead into account

2024-08-01 Thread Tamar Christina
> -Original Message- > From: Richard Sandiford > Sent: Wednesday, July 31, 2024 7:17 PM > To: Tamar Christina > Cc: Kyrylo Tkachov ; gcc-patches@gcc.gnu.org; nd > ; Richard Earnshaw ; Marcus > Shawcroft ; ktkac...@gcc.gnu.org > Subject: Re: [PATCH 8/8]AArch64: ta

RE: [PATCH 8/8]AArch64: take gather/scatter decode overhead into account

2024-07-31 Thread Tamar Christina
Hi Kyrill, > > /* True if the vector body contains a store to a decl and if the > > function is known to have a vld1 from the same decl. > > > > @@ -17291,6 +17297,17 @@ aarch64_vector_costs::add_stmt_cost (int count, > vect_cost_for_stmt kind, > >stmt_cost = aarch64_detect_vector_s

RE: [RFC][middle-end] SLP Early break and control flow support in GCC

2024-07-30 Thread Tamar Christina
> -Original Message- > From: Richard Biener > Sent: Thursday, July 18, 2024 10:00 AM > To: Tamar Christina > Cc: GCC Patches ; Richard Sandiford > > Subject: RE: [RFC][middle-end] SLP Early break and control flow support in GCC > > On Wed, 17 Jul

RE: [PATCH 1/8]AArch64: Update Neoverse V2 cost model to release costs

2024-07-26 Thread Tamar Christina
> -Original Message- > From: Richard Sandiford > Sent: Friday, July 26, 2024 2:12 PM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw > ; Marcus Shawcroft > ; ktkac...@gcc.gnu.org > Subject: Re: [PATCH 1/8]AArch64: Update Neoverse

RE: [PATCH 5/8]AArch64: Update Generic Armv9-a cost model to release costs

2024-07-26 Thread Tamar Christina
> -Original Message- > From: Kyrylo Tkachov > Sent: Friday, July 26, 2024 1:35 PM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw > ; Marcus Shawcroft > ; ktkac...@gcc.gnu.org; Richard Sandiford > > Subject: Re: [PATCH 5/8]AArch64:

RE: [PATCH 1/8]AArch64: Update Neoverse V2 cost model to release costs

2024-07-26 Thread Tamar Christina
> -Original Message- > From: Kyrylo Tkachov > Sent: Friday, July 26, 2024 1:10 PM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw > ; Marcus Shawcroft > ; ktkac...@gcc.gnu.org; Richard Sandiford > > Subject: Re: [PATCH 1/8]AArch

RE: [PATCH 3/8]AArch64: Add Neoverse V3AE core definition and cost model

2024-07-26 Thread Tamar Christina
Hi All, This is a new version with the confirmed correct part number. An updated TRM is being published. Bootstrapped Regtested on aarch64-none-linux-gnu and no issues. Ok for master? Thanks, Tamar gcc/ChangeLog: * config/aarch64/aarch64-cores.def (neoverse-v3ae): New. * confi

RE: [PATCH]AArch64: check for vector mode in get_mask_mode [PR116074]

2024-07-26 Thread Tamar Christina
> -Original Message- > From: Richard Sandiford > Sent: Friday, July 26, 2024 10:43 AM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw > ; Marcus Shawcroft > ; ktkac...@gcc.gnu.org > Subject: Re: [PATCH]AArch64: check for vector mode in g

[PATCH]middle-end: check for vector mode before in get_mask_mode [PR116074]

2024-07-26 Thread Tamar Christina
Hi All, For historical reasons AArch64 has TI mode vector types but does not consider TImode a vector mode. What's happening in the PR is that get_vectype_for_scalar_type is returning vector(1) TImode for a TImode scalar. This then fails when we call targetm.vectorize.get_mask_mode (vecmode).exi

RE: [PATCH]AArch64: check for vector mode in get_mask_mode [PR116074]

2024-07-26 Thread Tamar Christina
> -Original Message- > From: Richard Sandiford > Sent: Friday, July 26, 2024 10:24 AM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw > ; Marcus Shawcroft > ; ktkac...@gcc.gnu.org > Subject: Re: [PATCH]AArch64: check for vector mode in g

[PATCH 5/8]AArch64: Update Generic Armv9-a cost model to release costs

2024-07-26 Thread Tamar Christina
Hi All, This updates the costs for generic-armv9-a based on the updated costs for Neoverse V2 and Neoverse N2. Bootstrapped Regtested on aarch64-none-linux-gnu and no issues. Ok for master? Thanks, Tamar gcc/ChangeLog: * config/aarch64/tuning_models/generic_armv9_a.h: Update costs. ---

[PATCH 3/8]AArch64: Add Neoverse V3AE core definition and cost model

2024-07-26 Thread Tamar Christina
Hi All, This adds a cost model and core definition for Neoverse V3AE. Bootstrapped Regtested on aarch64-none-linux-gnu and no issues. Ok for master? Thanks, Tamar gcc/ChangeLog: * config/aarch64/aarch64-cores.def (neoverse-v3ae): New. * config/aarch64/aarch64-tune.md: Regenera

[PATCH 7/8]AArch64: Add Cortex-X925 core definition and cost model

2024-07-26 Thread Tamar Christina
Hi All, This adds a cost model and core definition for Cortex-X925. Bootstrapped Regtested on aarch64-none-linux-gnu and no issues. Ok for master? Thanks, Tamar gcc/ChangeLog: * config/aarch64/aarch64-cores.def (cortex-x925): New. * config/aarch64/aarch64-tune.md: Regenerate.

[PATCH 8/8]AArch64: take gather/scatter decode overhead into account

2024-07-26 Thread Tamar Christina
Hi All, Gathers and scatters are not usually beneficial when the loop count is small. This is because there's not only a cost to their execution within the loop but there is also some cost to enter loops with them. As such this patch models this overhead. For generic tuning we however still prefe

[PATCH 2/8]AArch64: Add Neoverse V3 core definition and cost model

2024-07-26 Thread Tamar Christina
Hi All, This adds a cost model and core definition for Neoverse V3. It also makes Cortex-X4 use the Neoverse V3 cost model. Bootstrapped Regtested on aarch64-none-linux-gnu and no issues. Ok for master? Thanks, Tamar gcc/ChangeLog: * config/aarch64/aarch64-cores.def (cortex-x4): Upda

[PATCH 6/8]AArch64: Update Neoverse N2 cost model to release costs

2024-07-26 Thread Tamar Christina
Hi All, This updates the cost for Neoverse N2 to reflect the updated Software Optimization Guide. Bootstrapped Regtested on aarch64-none-linux-gnu and no issues. Ok for master? Thanks, Tamar gcc/ChangeLog: * config/aarch64/tuning_models/neoversen2.h: Update costs. --- diff --git a/gc

[PATCH 4/8]AArch64: Add Neoverse N3 and Cortex-A725 core definition and cost model

2024-07-26 Thread Tamar Christina
Hi All, This adds a cost model and core definition for Neoverse N3 and Cortex-A725. It also makes Cortex-A725 use the Neoverse N3 cost model. Bootstrapped Regtested on aarch64-none-linux-gnu and no issues. Ok for master? Thanks, Tamar gcc/ChangeLog: * config/aarch64/aarch64-cores.def

[PATCH 1/8]AArch64: Update Neoverse V2 cost model to release costs

2024-07-26 Thread Tamar Christina
Hi All, This updates the cost for Neoverse V2 to reflect the updated Software Optimization Guide. It also makes Cortex-X3 use the Neoverse V2 cost model. Bootstrapped Regtested on aarch64-none-linux-gnu and no issues. Ok for master? Thanks, Tamar gcc/ChangeLog: * config/aarch64/aarch

[PATCH]AArch64: check for vector mode in get_mask_mode [PR116074]

2024-07-25 Thread Tamar Christina
Hi All, For historical reasons AArch64 has TI mode vector types but does not consider TImode a vector mode. What's happening in the PR is that get_vectype_for_scalar_type is returning vector(1) TImode for a TImode scalar. This then fails when we call targetm.vectorize.get_mask_mode (vecmode).exi

RE: [PATCH][contrib]: support json output from check_GNU_style_lib.py

2024-07-23 Thread Tamar Christina
Hi Both, > -Original Message- > From: Jonathan Wakely > Sent: Monday, July 22, 2024 3:21 PM > To: Filip Kastl > Cc: Tamar Christina ; gcc-patches@gcc.gnu.org; nd > > Subject: Re: [PATCH][contrib]: support json output from check_GNU_style_lib.py > > On Mon,

RE: [PATCH v1] Match: Only allow single use of MIN_EXPR for SAT_TRUNC form 2 [PR115863]

2024-07-18 Thread Tamar Christina
> -Original Message- > From: pan2...@intel.com > Sent: Thursday, July 18, 2024 1:27 PM > To: gcc-patches@gcc.gnu.org > Cc: juzhe.zh...@rivai.ai; kito.ch...@gmail.com; richard.guent...@gmail.com; > Tamar Christina ; jeffreya...@gmail.com; > rdapp@gmail.com; hongt

[PATCH][contrib]: support json output from check_GNU_style_lib.py

2024-07-18 Thread Tamar Christina
Hi All, It would be useful to automated tools if check_GNU_style[_lib] supported returning the result in a structured format like json. With this change calling: > cat patch | ./contrib/check_GNU_style.py --format json - | jq . produces: [ { "type": 1, "msg": "lines should not excee

RE: [PATCH v1] Match: Bugfix .SAT_TRUNC honor types has no mode precision [PR115961]

2024-07-17 Thread Tamar Christina
> -Original Message- > From: Richard Sandiford > Sent: Wednesday, July 17, 2024 8:55 PM > To: Richard Biener > Cc: pan2...@intel.com; gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; > kito.ch...@gmail.com; Tamar Christina ; > jeffreya...@gmail.com; rdapp..

RE: [RFC][middle-end] SLP Early break and control flow support in GCC

2024-07-17 Thread Tamar Christina
> -Original Message- > From: Richard Biener > Sent: Tuesday, July 16, 2024 4:08 PM > To: Tamar Christina > Cc: GCC Patches ; Richard Sandiford > > Subject: Re: [RFC][middle-end] SLP Early break and control flow support in GCC > > On Mon, 15 Jul 2024, Tamar

RE: [PATCH]middle-end: fix 0 offset creation and folding [PR115936]

2024-07-17 Thread Tamar Christina
> -Original Message- > From: Richard Biener > Sent: Tuesday, July 16, 2024 12:47 PM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; j...@ventanamicro.com > Subject: Re: [PATCH]middle-end: fix 0 offset creation and folding [PR115936] > > On Tue, 16 J

[PATCH]middle-end: fix 0 offset creation and folding [PR115936]

2024-07-16 Thread Tamar Christina
Hi All, As shown in PR115936 SCEV and IVOPTS create an invalid IV when the IV is a pointer type: ivtmp.39_65 = ivtmp.39_59 + 0B; where the IVs are DI mode and the offset is a pointer. This comes from this weird candidate: Candidate 8: Var befor: ivtmp.39_59 Var after: ivtmp.39_65 Incr

[RFC][middle-end] SLP Early break and control flow support in GCC

2024-07-15 Thread Tamar Christina
Hi All, This RFC document covers at a high level how to extend early break support in GCC to support SLP and how this will be extended in the future to support full control flow in GCC. The basic idea in this is based on the paper "All You Need Is Superword-Level Parallelism: Systematic Control-F

RE: [PATCH][ivopts]: use affine_tree when comparing IVs during candidate selection [PR114932]

2024-07-11 Thread Tamar Christina
> -Original Message- > From: Richard Biener > Sent: Thursday, July 11, 2024 1:10 PM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; bin.ch...@linux.alibaba.com > Subject: RE: [PATCH][ivopts]: use affine_tree when comparing IVs during > candidate > sele

RE: [PATCH][ivopts]: perform affine fold on unsigned addressing modes known not to overflow. [PR114932]

2024-07-11 Thread Tamar Christina
-Original Message- > From: Richard Biener > Sent: Thursday, July 11, 2024 12:39 PM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; bin.ch...@linux.alibaba.com > Subject: RE: [PATCH][ivopts]: perform affine fold on unsigned addressing modes > known not to o

RE: [PATCH 02/10] autovectorizer: Add basic support for convert optabs

2024-07-11 Thread Tamar Christina
Hi Victor, > -Original Message- > From: Victor Do Nascimento > Sent: Wednesday, July 10, 2024 3:06 PM > To: gcc-patches@gcc.gnu.org > Cc: Richard Sandiford ; Richard Earnshaw > ; Victor Do Nascimento > > Subject: [PATCH 02/10] autovectorizer: Add basic support for convert optabs > > Giv

RE: [PATCH 10/10] autovectorizer: Test autovectorization of different dot-prod modes.

2024-07-11 Thread Tamar Christina
Hi Victor, > -Original Message- > From: Victor Do Nascimento > Sent: Wednesday, July 10, 2024 3:06 PM > To: gcc-patches@gcc.gnu.org > Cc: Richard Sandiford ; Richard Earnshaw > ; Victor Do Nascimento > > Subject: [PATCH 10/10] autovectorizer: Test autovectorization of different > dot- >

RE: [PATCH]middle-end: Implement conditonal store vectorizer pattern [PR115531]

2024-07-10 Thread Tamar Christina
Sorry missed a review comment to change !DR_IS_WRITE into DR_IS_READ. Updated patch: Bootstrapped Regtested on aarch64-none-linux-gnu and no issues. Ok for master? Thanks, Tamar gcc/ChangeLog: PR tree-optimization/115531 * tree-vect-patterns.cc (vect_cond_store_pattern_same_re

RE: [PATCH 2/3] Support group-size of three in SLP load permutation lowering

2024-07-10 Thread Tamar Christina
> -Original Message- > From: Richard Biener > Sent: Wednesday, July 10, 2024 10:04 AM > To: gcc-patches@gcc.gnu.org > Subject: [PATCH 2/3] Support group-size of three in SLP load permutation > lowering > > The following adds support for group-size three in SLP load permutation > lowering

[PATCH 2/2]AArch64: implement TARGET_VECTORIZE_CONDITIONAL_OPERATION_IS_EXPENSIVE [PR115531].

2024-07-10 Thread Tamar Christina
Hi All, This implements the new target hook indicating that for AArch64 when possible we prefer masked operations for any type vs doing LOAD + SELECT or SELECT + STORE. Bootstrapped Regtested on aarch64-none-linux-gnu and no issues. Ok for master? Thanks, Tamar gcc/ChangeLog: PR tree-

RE: [PATCH]middle-end: Implement conditonal store vectorizer pattern [PR115531]

2024-07-10 Thread Tamar Christina
> > > > > > > + } > > > > + > > > > + if (new_code == ERROR_MARK) > > > > + { > > > > + /* We couldn't flip the condition, so invert the mask > > > > instead. */ > > > > + itype = TREE_TYPE (cmp_ls); > > > > + conv = gimple_build_assign (var, BIT_XOR_EXPR,

RE: [PATCH][ivopts]: use affine_tree when comparing IVs during candidate selection [PR114932]

2024-07-10 Thread Tamar Christina
> > > I might also point back to the idea I threw in somewhere, adding > > > OEP_VALUE (or a better name) to the set of flags accepted by > > > operand_equal_p. You mentioned hashing IIRC but I don't see the patches > > > touching hashing? > > > > > > > Yes, That can indeed be done with this appro

RE: [PATCH][ivopts]: perform affine fold on unsigned addressing modes known not to overflow. [PR114932]

2024-07-10 Thread Tamar Christina
> -Original Message- > From: Richard Biener > Sent: Thursday, June 20, 2024 8:55 AM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; bin.ch...@linux.alibaba.com > Subject: RE: [PATCH][ivopts]: perform affine fold on unsigned addressing modes > known not t

RE: [PATCH 2/2]AArch64: lower 2 reg TBL permutes with one zero register to 1 reg TBL.

2024-07-05 Thread Tamar Christina
> > +v16qi f3b (v16qi a) > > +{ > > + v16qi zeros = {0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0}; > > + return __builtin_shufflevector (a, zeros, 0, 5, 1, 6, 2, 7, 3, 8, 4, 9, > > 5, 10, 6, 11, > 7, 12); > > +} > > + > > +/* { dg-final { scan-assembler-times {tbl\tv[0-9]+.16b, \{v[0-9]+.16b\}, > > v[0- >

RE: [PATCH 1/2]AArch64: make aarch64_simd_vec_unpack_lo_/_hi_ consistent.

2024-07-05 Thread Tamar Christina
> > The principle is that, say: > > > > (vec_select:V2SI (reg:V2DI R) (parallel [(const_int 0) (const_int 1)])) > > > > is (for little-endian) equivalent to: > > > > (subreg:V2SI (reg:V2DI R) 0) > > Sigh, of course I meant V4SI rather than V2DI in the above :) > > > and similarly for the equi

RE: [PATCH 1/2]AArch64: make aarch64_simd_vec_unpack_lo_/_hi_ consistent.

2024-07-04 Thread Tamar Christina
> -Original Message- > From: Richard Sandiford > Sent: Thursday, July 4, 2024 12:46 PM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw > ; Marcus Shawcroft > ; ktkac...@gcc.gnu.org > Subject: Re: [PATCH 1/2]AArch64: make aarch64_

[PATCH 2/2]AArch64: lower 2 reg TBL permutes with one zero register to 1 reg TBL.

2024-07-04 Thread Tamar Christina
Hi All, When a two reg TBL is performed with one operand being a zero vector we can instead use a single reg TBL and map the indices for accessing the zero vector to an out of range constant. On AArch64 out of range indices into a TBL have the defined semantics of setting the element to zero. Many
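
The lowering described above can be modeled byte by byte in C. This is a sketch under stated assumptions rather than the patch itself: all function names are hypothetical, the vectors are 16 bytes wide, and 255 is chosen here as the out-of-range constant (the message does not say which constant is used).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { VEC_BYTES = 16 };

/* Model of a one-register TBL: indices >= 16 are out of range and
   produce a zero byte.  */
void tbl1 (uint8_t dst[VEC_BYTES], const uint8_t table[VEC_BYTES],
           const uint8_t idx[VEC_BYTES])
{
  for (int i = 0; i < VEC_BYTES; i++)
    dst[i] = idx[i] < VEC_BYTES ? table[idx[i]] : 0;
}

/* Model of a two-register TBL: indices 0-15 select from LO, 16-31
   from HI, anything larger produces a zero byte.  */
void tbl2 (uint8_t dst[VEC_BYTES], const uint8_t lo[VEC_BYTES],
           const uint8_t hi[VEC_BYTES], const uint8_t idx[VEC_BYTES])
{
  for (int i = 0; i < VEC_BYTES; i++)
    {
      if (idx[i] < VEC_BYTES)
        dst[i] = lo[idx[i]];
      else if (idx[i] < 2 * VEC_BYTES)
        dst[i] = hi[idx[i] - VEC_BYTES];
      else
        dst[i] = 0;
    }
}

/* Rewrite every index that does not select from the first operand
   to 255 (an assumed out-of-range constant).  */
void remap_zero_operand (uint8_t out[VEC_BYTES],
                         const uint8_t idx[VEC_BYTES])
{
  for (int i = 0; i < VEC_BYTES; i++)
    out[i] = idx[i] < VEC_BYTES ? idx[i] : 0xff;
}

/* Returns 1 when the one-register form with remapped indices
   matches the two-register form whose second operand is zero.  */
int lowering_matches (const uint8_t a[VEC_BYTES],
                      const uint8_t idx[VEC_BYTES])
{
  const uint8_t zeros[VEC_BYTES] = { 0 };
  uint8_t two[VEC_BYTES], one[VEC_BYTES], remapped[VEC_BYTES];

  tbl2 (two, a, zeros, idx);
  remap_zero_operand (remapped, idx);
  tbl1 (one, a, remapped);
  return memcmp (two, one, VEC_BYTES) == 0;
}

/* Hypothetical demo: interleave the low bytes of A with zeros.  */
int zip_with_zero_demo (void)
{
  uint8_t a[VEC_BYTES];
  const uint8_t idx[VEC_BYTES] = { 0, 16, 1, 17, 2, 18, 3, 19,
                                   4, 20, 5, 21, 6, 22, 7, 23 };

  for (int i = 0; i < VEC_BYTES; i++)
    a[i] = (uint8_t) (i + 1);
  return lowering_matches (a, idx);
}
```

Because every byte read from the zero vector is zero anyway, redirecting those indices out of range leaves the result unchanged while freeing the second table register.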

[PATCH 1/2]AArch64: make aarch64_simd_vec_unpack_lo_/_hi_ consistent.

2024-07-04 Thread Tamar Christina
Hi All, The fix for PR18127 reworked the uxtl to zip optimization. In doing so it undid the changes in aarch64_simd_vec_unpack_lo_ and this now no longer matches aarch64_simd_vec_unpack_hi_. It still works because the RTL generated by aarch64_simd_vec_unpack_lo_ overlaps with the general zero ext

[PATCH][committed][testsuite]: Update test for PR115537 to use SVE .

2024-07-04 Thread Tamar Christina
Hi All, The PR was about SVE codegen, but the testcase accidentally used neoverse-n1 instead of neoverse-v1 as in the original report. This updates the tool options. Regtested on aarch64-none-linux-gnu and no issues. Committed under the obvious rule. Thanks, Tamar gcc/testsuite/ChangeLog:

RE: [PATCH v1] Vect: Support IFN SAT_TRUNC for unsigned vector int

2024-07-02 Thread Tamar Christina
nks, Tamar > -Original Message- > From: pan2...@intel.com > Sent: Tuesday, July 2, 2024 2:32 PM > To: gcc-patches@gcc.gnu.org > Cc: juzhe.zh...@rivai.ai; kito.ch...@gmail.com; richard.guent...@gmail.com; > Tamar Christina ; jeffreya...@gmail.com; > rdapp@gmail.com; Pan

RE: [PATCH 1/2]middle-end: fix wide_int_constant_multiple_p when VAL and DIV are 0. [PR114932]

2024-07-01 Thread Tamar Christina
> -Original Message- > From: Tamar Christina > Sent: Monday, July 1, 2024 9:14 PM > To: gcc-patches@gcc.gnu.org > Cc: nd ; rguent...@suse.de; j...@ventanamicro.com > Subject: [PATCH 1/2]middle-end: fix wide_int_constant_multiple_p when VAL and > DIV are 0. [

[PATCH 2/2]middle-end: replace constant_multiple_of with aff_combination_constant_multiple_p [PR114932]

2024-07-01 Thread Tamar Christina
Hi All, The current implementation of constant_multiple_of is doing a more limited version of aff_combination_constant_multiple_p. The only non-debug usage of constant_multiple_of will proceed with the values as affine trees. There is scope for further optimization here, namely I believe that if

[PATCH 1/2]middle-end: fix wide_int_constant_multiple_p when VAL and DIV are 0. [PR114932]

2024-07-01 Thread Tamar Christina
Hi All, wide_int_constant_multiple_p checks whether, for two tree expressions a and b, there is a multiplier c such that a == b * c. This code however seems to think that no such c exists when a = 0 and b = 0, which is of course wrong. This fixes it and also fixes the comment. Bootstrap
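
The zero case discussed above can be sketched with ordinary C integers. This is a simplified analogue, not the GCC function: `constant_multiple_p` and `multiple_or` are hypothetical names, `long` stands in for GCC's wide integers, and returning 1 as the multiplier when both inputs are zero is an assumption made for illustration (any multiplier satisfies 0 == 0 * c).

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical analogue: if there is a C with VAL == DIV * C,
   store it in *MULT and return true.  */
bool constant_multiple_p (long val, long div, long *mult)
{
  if (val == 0)
    {
      /* 0 == DIV * 0 always holds.  When DIV is also 0 every
         multiplier works; 1 is picked here as an assumption.  */
      *mult = (div == 0) ? 1 : 0;
      return true;
    }

  if (div == 0)
    return false;               /* Nonzero VAL cannot equal 0 * C.  */

  if (val == LONG_MIN && div == -1)
    return false;               /* Quotient is not representable.  */

  if (val % div != 0)
    return false;

  *mult = val / div;
  return true;
}

/* Hypothetical convenience wrapper: the multiplier, or FALLBACK
   when none exists.  */
long multiple_or (long val, long div, long fallback)
{
  long mult;
  return constant_multiple_p (val, div, &mult) ? mult : fallback;
}
```

With this sketch, `multiple_or (0, 0, -1)` yields a multiplier instead of the fallback, which is the behavior the message argues for.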

RE: [PATCH v3] Vect: Support truncate after .SAT_SUB pattern in zip

2024-06-28 Thread Tamar Christina
> -Original Message- > From: Richard Biener > Sent: Friday, June 28, 2024 6:39 AM > To: Li, Pan2 > Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; kito.ch...@gmail.com; > jeffreya...@gmail.com; rdapp....@gmail.com; Tamar Christina > > Subject: Re: [PATCH v3

RE: [RFC PATCH] cse: Add another CSE pass after split1

2024-06-28 Thread Tamar Christina
Hi, > -Original Message- > From: Palmer Dabbelt > Sent: Thursday, June 27, 2024 10:57 PM > To: gcc-patches@gcc.gnu.org > Cc: Palmer Dabbelt > Subject: [RFC PATCH] cse: Add another CSE pass after split1 > > This is really more of a question than a patch. > > Looking at PR/115687 I manag

RE: [PATCH][c++ frontend]: check for missing condition for novector [PR115623]

2024-06-27 Thread Tamar Christina
> -Original Message- > From: Jason Merrill > Sent: Tuesday, June 25, 2024 10:24 PM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; nat...@acm.org > Subject: Re: [PATCH][c++ frontend]: check for missing condition for novector > [PR115623] > > On 6/2

RE: [PATCH] aarch64: Remove RNG and MTE from -mcpu=neoverse-v2

2024-06-27 Thread Tamar Christina
> -Original Message- > From: Kyrylo Tkachov > Sent: Thursday, June 27, 2024 3:49 PM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; Richard Earnshaw ; > Richard Sandiford > Subject: Re: [PATCH] aarch64: Remove RNG and MTE from -mcpu=neoverse-v2 > > H

RE: [PATCH] aarch64: Remove RNG and MTE from -mcpu=neoverse-v2

2024-06-27 Thread Tamar Christina
Hi Kyrill, > -Original Message- > From: Kyrylo Tkachov > Sent: Thursday, June 27, 2024 9:58 AM > To: gcc-patches@gcc.gnu.org > Cc: Richard Earnshaw ; Richard Sandiford > > Subject: [PATCH] aarch64: Remove RNG and MTE from -mcpu=neoverse-v2 > > Hi all, > > According to the TRM for Neove

RE: [PATCH]middle-end: Implement conditonal store vectorizer pattern [PR115531]

2024-06-26 Thread Tamar Christina
> -Original Message- > From: Richard Biener > Sent: Wednesday, June 26, 2024 2:23 PM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; j...@ventanamicro.com > Subject: Re: [PATCH]middle-end: Implement conditonal store vectorizer pattern > [PR115531] >

Re: [PATCH][c++ frontend]: check for missing condition for novector [PR115623]

2024-06-25 Thread Tamar Christina
The 06/25/2024 17:10, Jason Merrill wrote: > On 6/25/24 04:01, Tamar Christina wrote: > > Hi All, > > > > It looks like I forgot to check in the C++ frontend if a condition exist > > for the > > loop being adorned with novector. This causes a segfault because c

[PATCH]middle-end: Implement conditonal store vectorizer pattern [PR115531]

2024-06-25 Thread Tamar Christina
Hi All, This adds a conditional store optimization for the vectorizer as a pattern. The vectorizer already supports modifying memory accesses because of the pattern based gather/scatter recognition. Doing it in the vectorizer allows us to still keep the ability to vectorize such loops for archite
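For readers unfamiliar with the term, the kind of loop a conditional store optimization targets looks roughly like the sketch below. The function names are hypothetical and this is not taken from the patch or its testcases. The second function shows a select-style form with the same result when `dst` is readable and writable for every index; it is included only for comparison and is not a claim about what the vectorizer pattern emits.

```c
#include <stddef.h>

/* Scalar loop with a conditional store: dst[i] is written only when
   cond[i] is non-zero.  */
void
cond_store (int *restrict dst, const int *restrict src,
            const unsigned char *restrict cond, size_t n)
{
  for (size_t i = 0; i < n; i++)
    if (cond[i])
      dst[i] = src[i];
}

/* Same result expressed as load, select, unconditional store.  Assumes
   every dst[i] may be both read and written.  */
void
cond_store_select (int *restrict dst, const int *restrict src,
                   const unsigned char *restrict cond, size_t n)
{
  for (size_t i = 0; i < n; i++)
    {
      int old = dst[i];
      dst[i] = cond[i] ? src[i] : old;
    }
}
```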

[PATCH][c++ frontend]: check for missing condition for novector [PR115623]

2024-06-25 Thread Tamar Christina
Hi All, It looks like I forgot to check in the C++ frontend if a condition exists for the loop being adorned with novector. This causes a segfault because cond isn't expected to be null. This fixes it by issuing the same kind of diagnostics we issue for the other pragmas. Bootstrapped Regtested

RE: [PATCH v2] Vect: Support truncate after .SAT_SUB pattern in zip

2024-06-24 Thread Tamar Christina
> -Original Message- > From: Li, Pan2 > Sent: Tuesday, June 25, 2024 7:06 AM > To: Tamar Christina ; gcc-patches@gcc.gnu.org > Cc: juzhe.zh...@rivai.ai; kito.ch...@gmail.com; richard.guent...@gmail.com; > jeffreya...@gmail.com; pins...@gmail.com > Subject: RE: [PA

RE: [PATCH v2] Vect: Support truncate after .SAT_SUB pattern in zip

2024-06-24 Thread Tamar Christina
> -Original Message- > From: Li, Pan2 > Sent: Tuesday, June 25, 2024 3:25 AM > To: Tamar Christina ; gcc-patches@gcc.gnu.org > Cc: juzhe.zh...@rivai.ai; kito.ch...@gmail.com; richard.guent...@gmail.com; > jeffreya...@gmail.com; pins...@gmail.com > Subject: RE: [PA

RE: [PATCH v2] Vect: Support truncate after .SAT_SUB pattern in zip

2024-06-24 Thread Tamar Christina
Hi, > -Original Message- > From: pan2...@intel.com > Sent: Monday, June 24, 2024 2:55 PM > To: gcc-patches@gcc.gnu.org > Cc: juzhe.zh...@rivai.ai; kito.ch...@gmail.com; richard.guent...@gmail.com; > jeffreya...@gmail.com; pins...@gmail.com; Pan Li > Subject: [PATCH v2] Vect: Support trun

RE: [PATCH][ivopts]: perform affine fold on unsigned addressing modes known not to overflow. [PR114932]

2024-06-24 Thread Tamar Christina
> -Original Message- > From: Richard Biener > Sent: Thursday, June 20, 2024 8:55 AM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; bin.ch...@linux.alibaba.com > Subject: RE: [PATCH][ivopts]: perform affine fold on unsigned addressing modes > known not t

RE: [PATCH 1/3 v3] vect: generate suitable convert insn for int -> int, float -> float and int <-> float.

2024-06-24 Thread Tamar Christina
> -Original Message- > From: Richard Biener > Sent: Monday, June 24, 2024 1:34 PM > To: Hu, Lin1 > Cc: gcc-patches@gcc.gnu.org; Liu, Hongtao ; > ubiz...@gmail.com > Subject: RE: [PATCH 1/3 v3] vect: generate suitable convert insn for int -> > int, float > -> float and int <-> float. > >

RE: [PATCH][ivopts]: use affine_tree when comparing IVs during candidate selection [PR114932]

2024-06-24 Thread Tamar Christina
> -Original Message- > From: Richard Biener > Sent: Thursday, June 20, 2024 8:49 AM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; bin.ch...@linux.alibaba.com > Subject: RE: [PATCH][ivopts]: use affine_tree when comparing IVs during > candidate

RE: [PATCH][ivopts]: use affine_tree when comparing IVs during candidate selection [PR114932]

2024-06-19 Thread Tamar Christina
> -Original Message- > From: Michael Matz > Sent: Wednesday, June 19, 2024 3:46 PM > To: Tamar Christina > Cc: Richard Biener ; gcc-patches@gcc.gnu.org; nd > ; bin.ch...@linux.alibaba.com > Subject: RE: [PATCH][ivopts]: use affine_tree when comparing IVs during >

RE: [PATCH][ivopts]: use affine_tree when comparing IVs during candidate selection [PR114932]

2024-06-19 Thread Tamar Christina
> -Original Message- > From: Richard Biener > Sent: Wednesday, June 19, 2024 12:55 PM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; bin.ch...@linux.alibaba.com > Subject: Re: [PATCH][ivopts]: use affine_tree when comparing IVs during > candidate

RE: [PATCH][ivopts]: perform affine fold on unsigned addressing modes known not to overflow. [PR114932]

2024-06-19 Thread Tamar Christina
> -Original Message- > From: Richard Biener > Sent: Wednesday, June 19, 2024 1:14 PM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; bin.ch...@linux.alibaba.com > Subject: Re: [PATCH][ivopts]: perform affine fold on unsigned addressing modes > known not t

RE: [PATCH v3] aarch64: Add vector popcount besides QImode [PR113859]

2024-06-17 Thread Tamar Christina
Hi, > -Original Message- > From: Pengxuan Zheng > Sent: Friday, June 14, 2024 12:57 AM > To: gcc-patches@gcc.gnu.org > Cc: Pengxuan Zheng > Subject: [PATCH v3] aarch64: Add vector popcount besides QImode [PR113859] > > This patch improves GCC’s vectorization of __builtin_popcount for aa
