RE: [PATCH v3] Internal-fn: Introduce new internal function SAT_ADD

2024-05-02 Thread Tamar Christina
t y) noexcept { uint64_t z; if (!__builtin_add_overflow(x, y, &z)) return z; return -1u; } Is a valid and common way to do saturation too. But for now, it's fine. Cheers, Tamar > Sorry not sure if my understanding is correct, feel free to correct me. > > Pan >
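The overflow-builtin idiom mentioned in this reply can be written out in full roughly as follows. This is a minimal sketch rather than the code from the thread: the function name `sat_add_u64` is hypothetical, and clamping to `UINT64_MAX` is an assumption about the intended saturation value.

```c
#include <stdint.h>

/* Saturating unsigned add built on the GCC/Clang overflow builtin.
   The name and the UINT64_MAX clamp are assumptions for this sketch. */
uint64_t sat_add_u64(uint64_t x, uint64_t y)
{
    uint64_t z;
    /* __builtin_add_overflow stores the wrapped sum in z and
       returns true when the mathematical result did not fit. */
    if (!__builtin_add_overflow(x, y, &z))
        return z;          /* no overflow: the plain sum is correct */
    return UINT64_MAX;     /* overflow: clamp to the largest value */
}
```

The builtin keeps the overflow test separate from the arithmetic, which is presumably why it reads as a natural way to express saturation.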

RE: [PATCH v3] Internal-fn: Introduce new internal function SAT_ADD

2024-05-01 Thread Tamar Christina
> -Original Message- > From: Li, Pan2 > Sent: Thursday, May 2, 2024 4:11 AM > To: Tamar Christina ; gcc-patches@gcc.gnu.org > Cc: juzhe.zh...@rivai.ai; kito.ch...@gmail.com; richard.guent...@gmail.com; > Liu, Hongtao > Subject: RE: [PATCH v3] Internal-fn: Introduce

RE: [PATCH v3] Internal-fn: Introduce new internal function SAT_ADD

2024-05-01 Thread Tamar Christina
Hi, > From: Pan Li > > Update in v3: > * Rebase upstream for conflict. > > Update in v2: > * Fix one failure for x86 bootstrap. > > Original log: > > This patch would like to add the middle-end presentation for the > saturation add. Aka set the result of add to the max when overflow. > It
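The behaviour described in the original log, setting the result of the add to the maximum on overflow, can be sketched in plain C. This is only an illustration of the semantics for one unsigned width; the name `sat_add_u32` is hypothetical and this is not necessarily the source form the patch matches.

```c
#include <stdint.h>

/* Unsigned saturating add written with a wraparound check.
   Hypothetical helper for illustration only. */
uint32_t sat_add_u32(uint32_t x, uint32_t y)
{
    uint32_t sum = x + y;               /* wraps modulo 2^32 on overflow */
    /* For unsigned operands the sum is smaller than x exactly
       when the addition wrapped around. */
    return sum < x ? UINT32_MAX : sum;
}
```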

[PATCH]middle-end: refactory vect_recog_absolute_difference to simplify flow [PR114769]

2024-04-19 Thread Tamar Christina
Hi All, As the reporter in PR114769 points out the control flow for the abd detection is hard to follow. This is because vect_recog_absolute_difference has two different ways it can return true. 1. It can return true when the widening operation is matched, in which case unprom is set,

[PATCH]AArch64: remove reliance on register allocator for simd/gpreg costing. [PR114741]

2024-04-18 Thread Tamar Christina
Hi All, In PR114741 we see that we have a regression in codegen when SVE is enabled where the simple testcase: void foo(unsigned v, unsigned *p) { *p = v & 1; } generates foo: fmov s31, w0 and z31.s, z31.s, #1 str s31, [x1] ret instead of: foo:

RE: [PATCH]middle-end: skip vectorization check on ilp32 on vect-early-break_124-pr114403.c

2024-04-16 Thread Tamar Christina
> On Tue, Apr 16, 2024 at 09:00:53AM +0200, Richard Biener wrote: > > > PR tree-optimization/114403 > > > * gcc.dg/vect/vect-early-break_124-pr114403.c: Skip in ilp32. > > > > > > --- > > > diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_124-pr114403.c >

[PATCH]middle-end: skip vectorization check on ilp32 on vect-early-break_124-pr114403.c

2024-04-15 Thread Tamar Christina
Hi all, The testcase seems to fail vectorization on -m32 since the access pattern is determined as too complex. This skips the vectorization check on ilp32 systems as I couldn't find a better proxy for being able to do strided 64-bit loads and I suspect it would fail on all 32-bit targets.

docs: document early break support and pragma novector

2024-04-15 Thread Tamar Christina
docs: document early break support and pragma novector --- diff --git a/htdocs/gcc-14/changes.html b/htdocs/gcc-14/changes.html index b4c602a523717c1d64333e44aefb60ba0ed02e7a..aceecb86f17443cfae637e90987427b98c42f6eb 100644 --- a/htdocs/gcc-14/changes.html +++ b/htdocs/gcc-14/changes.html @@

[PATCH]middle-end: adjust loop upper bounds when peeling for gaps and early break [PR114403].

2024-04-12 Thread Tamar Christina
Hi All, This is a story all about how the peeling for gaps introduces a bug in the upper bounds. Before I go further, I'll first explain how I understand this to work for loops with a single exit. When peeling for gaps we peel N < VF iterations to scalar. This happens by removing N iterations

[PATCH]middle-end vect: adjust loop upper bounds when peeling for gaps and early break [PR114403]

2024-04-04 Thread Tamar Christina
Hi All, The report shows that we end up in a situation where the code has been peeled for gaps and we have an early break. The code for peeling for gaps assumes that a scalar loop needs to perform at least one iteration. However this doesn't take into account early break where the scalar loop

Summary: [PATCH][committed]AArch64: Do not allow SIMD clones with simdlen 1 [PR113552][GCC 13/12/11 backport]

2024-03-12 Thread Tamar Christina
Hi All, This is a backport of g:306713c953d509720dc394c43c0890548bb0ae07. The AArch64 vector PCS does not allow simd calls with simdlen 1, however due to a bug we currently do allow it for num == 0. This causes us to emit a symbol that doesn't exist and we fail to link. Bootstrapped Regtested

RE: [PATCH] vect: Do not peel epilogue for partial vectors [PR114196].

2024-03-07 Thread Tamar Christina
> -Original Message- > From: Richard Biener > Sent: Thursday, March 7, 2024 8:47 AM > To: Robin Dapp > Cc: gcc-patches ; Tamar Christina > > Subject: Re: [PATCH] vect: Do not peel epilogue for partial vectors > [PR114196]. > > On Wed, Mar 6, 202

RE: [PATCH v2] Draft|Internal-fn: Introduce internal fn saturation US_PLUS

2024-02-27 Thread Tamar Christina
e. This would allow us to better understand what kind of gimple we would have to deal with in ISEL and VECT if we decide not to lower early. Thanks, Tamar > Pan > > -Original Message- > From: Tamar Christina > Sent: Tuesday, February 27, 2024 5:57 PM > To: Richard Biener

RE: [PATCH v1] Internal-fn: Add new internal function SAT_ADDU

2024-02-27 Thread Tamar Christina
> Am 19.02.24 um 08:36 schrieb Richard Biener: > > On Sat, Feb 17, 2024 at 11:30 AM wrote: > >> > >> From: Pan Li > >> > >> This patch would like to add the middle-end presentation for the > >> unsigned saturation add. Aka set the result of add to the max > >> when overflow. It will take the

RE: [PATCH v2] Draft|Internal-fn: Introduce internal fn saturation US_PLUS

2024-02-27 Thread Tamar Christina
> -Original Message- > From: Richard Biener > Sent: Tuesday, February 27, 2024 9:44 AM > To: Tamar Christina > Cc: pan2...@intel.com; gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; > yanzhang.w...@intel.com; kito.ch...@gmail.com; > richard.sandiford@arm.com2;

RE: [PATCH]middle-end: delay updating of dominators until later during vectorization. [PR114081]

2024-02-26 Thread Tamar Christina
> > The testcase shows an interesting case where we have multiple loops sharing > > a > > live value and have an early exit that go to the same location. The > > additional > > complication is that on x86_64 with -mavx we seem to also do prologue > > peeling > > on the loops. > > > > We

[PATCH]middle-end: delay updating of dominators until later during vectorization. [PR114081]

2024-02-25 Thread Tamar Christina
Hi All, The testcase shows an interesting case where we have multiple loops sharing a live value and have an early exit that goes to the same location. The additional complication is that on x86_64 with -mavx we seem to also do prologue peeling on the loops. We correctly identify which BB we need

RE: [PATCH v2] Draft|Internal-fn: Introduce internal fn saturation US_PLUS

2024-02-25 Thread Tamar Christina
Hi Pan, > From: Pan Li > > Hi Richard & Tamar, > > Try the DEF_INTERNAL_INT_EXT_FN as your suggestion. By mapping > us_plus$a3 to the RTL representation (us_plus:m x y) in optabs.def. > And then expand_US_PLUS in internal-fn.cc. Not very sure if my > understanding is correct for

[PATCH]middle-end: update vuses out of loop which use a vdef that's moved [PR114068]

2024-02-23 Thread Tamar Christina
Hi All, In certain cases we can have a situation where the merge block has a vUSE virtual PHI and the exits do not. In this case for instance the exits lead to an abort so they have no virtual PHIs. If we have a store before the first exit and we move it to a later block during vectorization we

RE: [PATCH v1] Internal-fn: Add new internal function SAT_ADDU

2024-02-19 Thread Tamar Christina
> -Original Message- > From: Li, Pan2 > Sent: Monday, February 19, 2024 12:59 PM > To: Tamar Christina ; Richard Biener > > Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; Wang, Yanzhang > ; kito.ch...@gmail.com > Subject: RE: [PATCH v1] Internal-fn: Add new in

RE: [PATCH]AArch64: xfail modes_1.f90 [PR107071]

2024-02-19 Thread Tamar Christina
> -Original Message- > From: Tamar Christina > Sent: Thursday, February 15, 2024 11:05 AM > To: Richard Earnshaw (lists) ; gcc- > patc...@gcc.gnu.org > Cc: nd ; Marcus Shawcroft ; Kyrylo > Tkachov ; Richard Sandiford > > Subject: RE: [PATCH]AArch64:

RE: [PATCH v1] Internal-fn: Add new internal function SAT_ADDU

2024-02-19 Thread Tamar Christina
Thanks for doing this! > -Original Message- > From: Li, Pan2 > Sent: Monday, February 19, 2024 8:42 AM > To: Richard Biener > Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; Wang, Yanzhang > ; kito.ch...@gmail.com; Tamar Christina > > Subject: RE: [PATCH

RE: [PATCH] aarch64: Improve PERM<{0}, a, ...> (64bit) by adding whole vector shift right [PR113872]

2024-02-15 Thread Tamar Christina
> -Original Message- > From: Richard Sandiford > Sent: Thursday, February 15, 2024 2:56 PM > To: Andrew Pinski > Cc: gcc-patches@gcc.gnu.org; Tamar Christina > Subject: Re: [PATCH] aarch64: Improve PERM<{0}, a, ...> (64bit) by adding > whole > vector shif

RE: [PATCH]AArch64: xfail modes_1.f90 [PR107071]

2024-02-15 Thread Tamar Christina
> -Original Message- > From: Richard Earnshaw (lists) > Sent: Thursday, February 15, 2024 11:01 AM > To: Tamar Christina ; gcc-patches@gcc.gnu.org > Cc: nd ; Marcus Shawcroft ; Kyrylo > Tkachov ; Richard Sandiford > > Subject: Re: [PATCH]AArch64: xfail modes_1.f

[PATCH]AArch64: xfail modes_1.f90 [PR107071]

2024-02-15 Thread Tamar Christina
Hi All, This test has never worked on AArch64 since the day it was committed. It has a number of issues that prevent it from working on AArch64: 1. IEEE does not require that FP operations raise a SIGFPE for FP operations, only that an exception is raised somehow. 2. Most Arm designed

RE: [PATCH]AArch64: remove ls64 from being mandatory on armv8.7-a..

2024-02-15 Thread Tamar Christina
Hi, this is a new version of the patch updating some additional tests because some of the LTO tests required a newer binutils than my distro had. --- The Arm Architectural Reference Manual (Version J.a, section A2.9 on FEAT_LS64) shows that ls64 is an optional extension and should not be

RE: [PATCH]AArch64: update vget_set_lane_1.c test output

2024-02-15 Thread Tamar Christina
> -Original Message- > From: Richard Sandiford > Sent: Thursday, February 1, 2024 4:42 PM > To: Tamar Christina > Cc: Andrew Pinski ; gcc-patches@gcc.gnu.org; nd > ; Richard Earnshaw ; Marcus > Shawcroft ; Kyrylo Tkachov > > Subject: Re: [PATCH]AArch64: up

[PATCH]AArch64: remove ls64 from being mandatory on armv8.7-a..

2024-02-14 Thread Tamar Christina
Hi All, The Arm Architectural Reference Manual (Version J.a, section A2.9 on FEAT_LS64) shows that ls64 is an optional extension and should not be enabled by default for Armv8.7-a. This drops it from the mandatory bits for the architecture and brings GCC in line with LLVM and the architecture.

RE: [PATCH]middle-end: inspect all exits for additional annotations for loop.

2024-02-14 Thread Tamar Christina
> > I think this isn't entirely good. For simple cases for > do {} while the condition ends up in the latch while for while () {} > loops it ends up in the header. In your case the latch isn't empty > so it doesn't end up with the conditional. > > I think your patch is OK to the point of

[PATCH]middle-end: inspect all exits for additional annotations for loop.

2024-02-14 Thread Tamar Christina
Hi All, Attaching a pragma to a loop which has a complex condition often gets the pragma dropped. e.g. #pragma GCC novector while (i < N && parse_tables_n--) before lowering this is represented as: if (ANNOTATE_EXPR ) ... But after lowering the condition is broken apart and attached to
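A self-contained loop of the shape quoted above might look like the following sketch. Only the pragma and the loop condition come from the message; the function `copy_some`, its parameters, and the value of `N` are hypothetical. Compilers that do not know `#pragma GCC novector` may warn about an unknown pragma.

```c
#define N 16

/* Hypothetical example: copies elements until either bound is reached.
   The compound condition is what gets split apart during lowering. */
int copy_some(int *dst, const int *src, int parse_tables_n)
{
    int i = 0;
#pragma GCC novector
    while (i < N && parse_tables_n--) {
        dst[i] = src[i];
        i++;
    }
    return i;   /* number of elements copied */
}
```

Because the condition has two parts, the loop ends up with more than one conditional after lowering, which is the situation the patch title refers to when it talks about inspecting all exits.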

[PATCH]middle-end: update vector loop upper bounds when early break vect [PR113734]

2024-02-13 Thread Tamar Christina
Hi All, When doing early break vectorization we should treat the final iteration as possibly being partial. This is so that when we calculate the vector loop upper bounds we take into account that the final iteration could have done some work. The attached testcase shows that if we don't then cunroll

RE: [PATCH]middle-end: add two debug counters for early-break vectorization debugging

2024-02-08 Thread Tamar Christina
> -Original Message- > From: Richard Biener > Sent: Thursday, February 8, 2024 2:16 PM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; j...@ventanamicro.com > Subject: Re: [PATCH]middle-end: add two debug counters for early-break > vectorization debuggi

[PATCH]middle-end: add two debug counters for early-break vectorization debugging

2024-02-08 Thread Tamar Christina
Hi All, This adds two new debug counters to aid in debugging early break code. - vect_force_last_exit: when reached will always force the final loop exit. - vect_skip_exit: when reached will skip selecting the current candidate exit as the loop exit. The first counter

RE: [PATCH]middle-end: don't cache restart_loop in vectorizable_live_operations [PR113808]

2024-02-08 Thread Tamar Christina
> Please either drop lastprivate(k) clause or use linear(k:1) > The iteration var of simd loop without collapse or with > collapse(1) is implicitly linear with the step, and even linear > means the value from the last iteration can be used after the > simd construct. Overriding the data sharing

[PATCH]middle-end: don't cache restart_loop in vectorizable_live_operations [PR113808]

2024-02-08 Thread Tamar Christina
Hi All, There's a bug in vectorizable_live_operation that restart_loop is defined outside the loop. This variable is supposed to indicate whether we are doing a first or last index reduction. The problem is that by defining it outside the loop it becomes dependent on the order we visit the

[PATCH][committed]middle-end: fix pointer conversion error in testcase vect-early-break_110-pr113467.c

2024-02-08 Thread Tamar Christina
Hi All, I had missed a conversion from unsigned long to uint64_t. This fixes the failing test on -m32. Regtested on x86_64-pc-linux-gnu with -m32 and no issues. Committed as obvious. Thanks, Tamar gcc/testsuite/ChangeLog: * gcc.dg/vect/vect-early-break_110-pr113467.c: Change unsigned
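The width difference behind this kind of -m32 failure can be shown with a small sketch. The helper `set_bit64` is hypothetical and is not taken from the testcase; it only illustrates why `uint64_t` is the safer type when 64-bit values are intended.

```c
#include <stdint.h>

/* Hypothetical helper: sets bit n (0..63) in a 64-bit mask. */
uint64_t set_bit64(uint64_t mask, unsigned n)
{
    /* UINT64_C(1) is 64 bits wide on every target. A plain 1UL is
       only 32 bits wide on ILP32 targets, so shifting it by 32 or
       more would be undefined there. */
    return mask | (UINT64_C(1) << n);
}
```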

RE: [PATCH]middle-end: fix ICE when moving statements to empty BB [PR113731]

2024-02-05 Thread Tamar Christina
> It looks like LOOP_VINFO_EARLY_BRK_STORES is "reverse"? Is that > why you are doing gsi_move_before + gsi_prev? Why do gsi_prev > at all? > As discussed on IRC, then how about this one. Incremental building passed all tests and bootstrap is running. Ok for master if bootstrap and regtesting

RE: [PATCH]middle-end: add additional runtime test for [PR113467]

2024-02-05 Thread Tamar Christina
> > Ok for master? > > I think you need a lp64 target check for the large constants or > alternatively use uint64_t? > Ok, how about this one. Regtested on x86_64-pc-linux-gnu with -m32,-m64 and no issues. Ok for master? Thanks, Tamar gcc/testsuite/ChangeLog: PR

RE: [PATCH]middle-end: fix ICE when moving statements to empty BB [PR113731]

2024-02-05 Thread Tamar Christina
> -Original Message- > From: Richard Biener > Sent: Monday, February 5, 2024 1:22 PM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; j...@ventanamicro.com > Subject: Re: [PATCH]middle-end: fix ICE when moving statements to empty BB > [PR113731] > >

[PATCH]middle-end: fix ICE when destination BB for stores starts with a label [PR113750]

2024-02-05 Thread Tamar Christina
Hi All, The report shows that if the FE leaves a label as the first thing in the dest BB then we ICE because we move the stores before the label. This is easy to fix if we know that there's still only one way into the BB. We would have already rejected the loop if there was multiple paths into

[PATCH]middle-end: fix ICE when moving statements to empty BB [PR113731]

2024-02-05 Thread Tamar Christina
Hi All, We use gsi_move_before (_gsi, _gsi); to request that the new statement be placed before any other statement. Typically this then moves the current pointer to be after the statement we just inserted. However it looks like when the BB is empty, this does not happen and the CUR pointer

[PATCH]middle-end: add additional runtime test for [PR113467]

2024-02-05 Thread Tamar Christina
Hi All, This just adds an additional runtime testcase for the fixed issue. Bootstrapped Regtested on aarch64-none-linux-gnu and no issues. Ok for master? Thanks, Tamar gcc/testsuite/ChangeLog: PR tree-optimization/113467 * gcc.dg/vect/vect-early-break_110-pr113467.c: New

RE: [PATCH]middle-end: check memory accesses in the destination block [PR113588].

2024-02-01 Thread Tamar Christina
> > > > If the above is correct then I think I understand what you're saying and > > will update the patch and do some Checks. > > Yes, I think that's what I wanted to say. > As discussed: Bootstrapped Regtested on aarch64-none-linux-gnu and x86_64-pc-linux-gnu no issues. Also checked both

RE: [PATCH]AArch64: update vget_set_lane_1.c test output

2024-02-01 Thread Tamar Christina
> -Original Message- > From: Richard Sandiford > Sent: Thursday, February 1, 2024 2:24 PM > To: Andrew Pinski > Cc: Tamar Christina ; gcc-patches@gcc.gnu.org; nd > ; Richard Earnshaw ; Marcus > Shawcroft ; Kyrylo Tkachov > > Subject: Re: [PATCH]AArch64: up

[PATCH]AArch64: update vget_set_lane_1.c test output

2024-02-01 Thread Tamar Christina
Hi All, In the vget_set_lane_1.c test the following entries now generate a zip1 instead of an INS BUILD_TEST (float32x2_t, float32x2_t, , , f32, 1, 0) BUILD_TEST (int32x2_t, int32x2_t, , , s32, 1, 0) BUILD_TEST (uint32x2_t, uint32x2_t, , , u32, 1, 0) This is because the non-Q variant for

[PATCH 2/2][libsanitizer] hwasan: Remove testsuite check for a complaint message [PR112644]

2024-01-31 Thread Tamar Christina
Hi All, With recent updates to hwasan runtime libraries, the error reporting for this particular check has been reworked. I would question why it has lost this message. To me it looks strange that num_descriptions_printed is incremented whenever we call PrintHeapOrGlobalCandidate whether

[PATCH 1/2][libsanitizer] hwasan: Remove testsuite check for a complaint message [PR112644]

2024-01-31 Thread Tamar Christina
Hi All, Recent libhwasan updates[1] intercept various string and memory functions. These functions have checking in them, which means there's no need to inline the checking. This patch marks said functions as intercepted, and adjusts a testcase to handle the difference. It also looks for HWASAN

RE: [PATCH][libsanitizer]: Sync fixes for asan interceptors from upstream [PR112644]

2024-01-31 Thread Tamar Christina
> -Original Message- > From: Andrew Pinski > Sent: Monday, January 29, 2024 9:55 PM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; ja...@redhat.com; > do...@redhat.com; k...@google.com; dvyu...@google.com > Subject: Re: [PATCH][libsanitizer]: Sync fixes

RE: [PATCH]middle-end: check memory accesses in the destination block [PR113588].

2024-01-30 Thread Tamar Christina
> -Original Message- > From: Richard Biener > Sent: Tuesday, January 30, 2024 9:51 AM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; j...@ventanamicro.com > Subject: Re: [PATCH]middle-end: check memory accesses in the destination block > [PR113588]. >

[PATCH]middle-end: check memory accesses in the destination block [PR113588].

2024-01-29 Thread Tamar Christina
Hi All, When analyzing loads for early break it was always the intention that for the exit where things get moved to we only check the loads that can be reached from the condition. However the main loop checks all loads and we skip the destination BB. As such we never actually check the loads

[PATCH]AArch64: relax cbranch tests to accepted inverted branches [PR113502]

2024-01-29 Thread Tamar Christina
Hi All, Recently something in the midend had started inverting the branches by inverting the condition and the branches. While this is fine, it makes it hard to actually test. In RTL I disable scheduling and BB reordering to prevent this. But in GIMPLE there seems to be nothing I can do.

[PATCH][libsanitizer]: Sync fixes for asan interceptors from upstream [PR112644]

2024-01-29 Thread Tamar Christina
Hi All, This cherry-picks and squashes the differences between commits d3e5c20ab846303874a2a25e5877c72271fc798b..76e1e45922e6709392fb82aac44bebe3dbc2ea63 from LLVM upstream from compiler-rt/lib/hwasan/ to GCC on the changes relevant for GCC. This is required to fix the linked PR. As mentioned

[PATCH]AArch64: Do not allow SIMD clones with simdlen 1 [PR113552]

2024-01-24 Thread Tamar Christina
Hi All, The AArch64 vector PCS does not allow simd calls with simdlen 1, however due to a bug we currently do allow it for num == 0. This causes us to emit a symbol that doesn't exist and we fail to link. Bootstrapped Regtested on aarch64-none-linux-gnu and no issues. Ok for master? and for

[PATCH]AArch64: Fix expansion of Advanced SIMD div and mul using SVE [PR109636]

2024-01-24 Thread Tamar Christina
Hi All, As suggested in the ticket this replaces the expansion by converting the Advanced SIMD types to SVE types by simply printing out an SVE register for these instructions. This fixes the subreg issues since there are no subregs involved anymore. Bootstrapped Regtested on

[PATCH]middle-end: rename main_exit_p in reduction code.

2024-01-23 Thread Tamar Christina
Hi All, This renames main_exit_p to last_val_reduc_p to more accurately reflect what the value is calculating. Ok for master if bootstrap passes? Incremental build shows it's fine. Thanks, Tamar gcc/ChangeLog: * tree-vect-loop.cc (vect_get_vect_def,

[PATCH]middle-end: fix epilog reductions when vector iters peeled [PR113364]

2024-01-23 Thread Tamar Christina
Hi All, This fixes a bug where vect_create_epilog_for_reduction does not handle the case where all exits are early exits. In this case we should do like induction handling code does and not have a main exit. Bootstrapped Regtested on x86_64-pc-linux-gnu with --enable-checking=release

[PATCH]middle-end: remove more usages of single_exit

2024-01-12 Thread Tamar Christina
Hi All, This replaces two more usages of single_exit that I had missed before. They both seem to happen when we re-use the ifcvt scalar loop for versioning. The condition in versioning is the same as the one for when we don't re-use the scalar loop. I hit these during an LTO enabled bootstrap

[PATCH]middle-end testsuite: remove -save-temps from many tests [PR113319]

2024-01-11 Thread Tamar Christina
Hi All, This removes -save-temps from the tests I've introduced to fix the LTO mismatches. Bootstrapped Regtested on aarch64-none-linux-gnu, x86_64-pc-linux-gnu and no issues. Ok for master? Thanks, Tamar gcc/testsuite/ChangeLog: PR testsuite/113319 * gcc.dg/bic-bitmask-13.c:

[PATCH]middle-end: make memory analysis for early break more deterministic [PR113135]

2024-01-11 Thread Tamar Christina
Hi All, Instead of searching for where to move stores to, they should always be in exit belonging to the latch. We can only ever delay stores and even if we pick a different exit than the latch one as the main one, effects still happen in program order when vectorized. If we don't move the

[PATCH]middle-end: fill in reduction PHI for all alt exits [PR113144]

2024-01-10 Thread Tamar Christina
Hi All, When we have a loop with more than 2 exits and a reduction I forgot to fill in the PHI value for all alternate exits. All alternate exits use the same PHI value so we should loop over the new PHI elements and copy the value across since we call the reduction calculation code only once

RE: [PATCH][testsuite]: Make bitint early vect test more accurate

2024-01-10 Thread Tamar Christina
> But I'm afraid I have no idea how is this supposed to work on > non-bitint targets or where __BITINT_MAXWIDTH__ is smaller than 9020. > There is no loop at all there, so what should be vectorized? > Yeah It was giving an unresolved and I didn't notice in diff. > I'd say introduce > # Return 1

[PATCH][testsuite]: Make bitint early vect test more accurate

2024-01-10 Thread Tamar Christina
Hi All, This changes the tests I committed for PR113287 to also run on targets that don't support bitint. Regtested on aarch64-none-linux-gnu, x86_64-pc-linux-gnu and no issues and tests run on both. Ok for master? Thanks, Tamar gcc/testsuite/ChangeLog: PR tree-optimization/113287

RE: [PATCH]middle-end: correctly identify the edge taken when condition is true. [PR113287]

2024-01-10 Thread Tamar Christina
> -Original Message- > From: Jakub Jelinek > Sent: Wednesday, January 10, 2024 2:42 PM > To: Tamar Christina ; Richard Biener > > Cc: gcc-patches@gcc.gnu.org; nd ; j...@ventanamicro.com > Subject: Re: [PATCH]middle-end: correctly identify the edge taken when

[PATCH]middle-end: correctly identify the edge taken when condition is true. [PR113287]

2024-01-10 Thread Tamar Christina
Hi All, The vectorizer needs to know during early break vectorization whether the edge that will be taken if the condition is true stays or leaves the loop. This is because the code assumes that if you take the true branch you exit the loop. If you don't exit the loop it has to generate a

[PATCH][committed][c++ frontend]: initialize ivdep value

2024-01-10 Thread Tamar Christina
Hi All, Should control enter the switch from one of the cases other than the IVDEP one then the variable remains uninitialized. This fixes it by initializing it to false. Bootstrapped Regtested on aarch64-none-linux-gnu, x86_64-pc-linux-gnu and no issues Committed as obvious. Thanks, Tamar
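The pattern behind this fix can be sketched as follows. The enum values and the function `wants_ivdep` are hypothetical and do not reflect the C++ front end's actual code; the sketch only shows why initializing the flag before the switch matters.

```c
#include <stdbool.h>

/* Hypothetical pragma kinds; the names are illustrative only. */
enum pragma_kind { PK_IVDEP, PK_UNROLL, PK_NOVECTOR };

bool wants_ivdep(enum pragma_kind kind)
{
    bool ivdep = false;   /* initialized so every path reads a defined value */
    switch (kind) {
    case PK_IVDEP:
        ivdep = true;
        break;
    case PK_UNROLL:
    case PK_NOVECTOR:
        break;            /* these paths never assign ivdep */
    }
    return ivdep;
}
```

Without the initializer, the two non-IVDEP cases would return an indeterminate value.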

RE: [PATCH]middle-end: Don't apply copysign optimization if target does not implement optab [PR112468]

2024-01-10 Thread Tamar Christina
ping > -Original Message- > From: Tamar Christina > Sent: Friday, January 5, 2024 1:31 PM > To: Xi Ruoyao ; Palmer Dabbelt > Cc: gcc-patches@gcc.gnu.org; nd ; rguent...@suse.de; Jeff Law > > Subject: RE: [PATCH]middle-end: Don't apply copysign optimizat

[PATCH][committed]middle-end: removed unused variable in vectorizable_live_operation_1

2024-01-09 Thread Tamar Christina
Hi All, It looks like the previous patch had an unused variable. It's odd that my bootstrap didn't catch it (I'm assuming -Werror is still on for O3 bootstraps) but this fixes it. Committed to fix bootstrap. Thanks, Tamar gcc/ChangeLog: * tree-vect-loop.cc

RE: [PATCH]middle-end: check if target can do extract first for early breaks [PR113199]

2024-01-09 Thread Tamar Christina
Hmm I'm confused as to why It didn't break mine.. just did one again.. anyway I'll remove the unused variable. > -Original Message- > From: Rainer Orth > Sent: Tuesday, January 9, 2024 4:06 PM > To: Richard Biener > Cc: Tamar Christina ; gcc-patches@gcc.gn

RE: [PATCH]middle-end: Fix dominators updates when peeling with multiple exits [PR113144]

2024-01-09 Thread Tamar Christina
> -Original Message- > From: Richard Biener > Sent: Tuesday, January 9, 2024 1:51 PM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; j...@ventanamicro.com > Subject: RE: [PATCH]middle-end: Fix dominators updates when peeling with > multiple exits [PR11314

RE: [PATCH]middle-end: Fix dominators updates when peeling with multiple exits [PR113144]

2024-01-09 Thread Tamar Christina
> -Original Message- > From: Richard Biener > Sent: Tuesday, January 9, 2024 12:26 PM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; j...@ventanamicro.com > Subject: RE: [PATCH]middle-end: Fix dominators updates when peeling with > multiple exits [PR1

RE: [PATCH]Arm: Update early-break tests to accept thumb output too.

2024-01-09 Thread Tamar Christina
> > 3f40b2a241953 100644 > > --- a/gcc/testsuite/gcc.target/arm/vect-early-break-cbranch.c > > +++ b/gcc/testsuite/gcc.target/arm/vect-early-break-cbranch.c > > @@ -16,8 +16,12 @@ int b[N] = {0}; > > ** vpmax.u32 d[0-9]+, d[0-9]+, d[0-9]+ > > ** vpmax.u32 d[0-9]+, d[0-9]+, d[0-9]+ >

[PATCH]Arm: Update early-break tests to accept thumb output too.

2024-01-09 Thread Tamar Christina
Hi All, The tests I recently added for early break fail in thumb mode because in thumb mode `cbz/cbnz` exist and so the cmp+branch is fused. This updates the testcases to accept either output. Tested on arm-none-linux-gnueabihf with -mthumb/-marm. Ok for master? Thanks, Tamar

RE: [PATCH]middle-end: Fix dominators updates when peeling with multiple exits [PR113144]

2024-01-09 Thread Tamar Christina
> This makes it quadratic in the number of vectorized early exit loops > in a function. The vectorizer CFG manipulation operates in a local > enough bubble that programmatic updating of dominators should be > possible (after all we manage to produce correct SSA form!), the > proposed change gets

RE: [PATCH]middle-end: check if target can do extract first for early breaks [PR113199]

2024-01-09 Thread Tamar Christina
> > - > > - gimple_seq_add_seq (, tem); > > - > > - scalar_res = gimple_build (, CFN_EXTRACT_LAST, scalar_type, > > -mask, vec_lhs_phi); > > + scalar_res = gimple_build (, CFN_VEC_EXTRACT, TREE_TYPE > (vectype), > > +

RE: [PATCH]middle-end: check if target can do extract first for early breaks [PR113199]

2024-01-08 Thread Tamar Christina
> -Original Message- > From: Richard Biener > Sent: Monday, January 8, 2024 12:48 PM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; j...@ventanamicro.com > Subject: Re: [PATCH]middle-end: check if target can do extract first for > early breaks > [PR11

RE: [PATCH]middle-end: maintain LCSSA form when peeled vector iterations have virtual operands

2024-01-08 Thread Tamar Christina
> -Original Message- > From: Richard Biener > Sent: Monday, January 8, 2024 12:38 PM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; j...@ventanamicro.com > Subject: Re: [PATCH]middle-end: maintain LCSSA form when peeled vector > iterations have virtual ope

RE: [PATCH]middle-end: rejects loops with nonlinear inductions and early breaks [PR113163]

2024-01-08 Thread Tamar Christina
> -Original Message- > From: Richard Biener > Sent: Monday, January 8, 2024 12:07 PM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; j...@ventanamicro.com > Subject: Re: [PATCH]middle-end: rejects loops with nonlinear inductions and > early > breaks [P

RE: [PATCH] tree-optimization/113026 - avoid vector epilog in more cases

2024-01-08 Thread Tamar Christina
> -Original Message- > From: Richard Biener > Sent: Monday, January 8, 2024 11:29 AM > To: gcc-patches@gcc.gnu.org > Cc: Tamar Christina > Subject: [PATCH] tree-optimization/113026 - avoid vector epilog in more cases > > The following avoids creating a n

[PATCH][frontend]: don't ice with pragma NOVECTOR if loop in C has no condition [PR113267]

2024-01-08 Thread Tamar Christina
Hi All, In C you can have loops without a condition, the original version of the patch was rejecting the use of #pragma GCC novector, however during review it was changed to not do this with the reason that we didn't want to give a compile error with such cases. However because annotations seem

Re: [PATCH]middle-end: thread through existing LCSSA variable for alternative exits too [PR113237]

2024-01-08 Thread Tamar Christina
for alternative exits too [PR113237] On 1/7/24 18:29, Tamar Christina wrote: > gcc/ChangeLog: > >PR tree-optimization/113237 >* tree-vect-loop-manip.cc (slpeel_tree_duplicate_loop_to_edge_cfg): Use >existing LCSSA variable for exit when all exits are earl

[PATCH]middle-end: thread through existing LCSSA variable for alternative exits too [PR113237]

2024-01-07 Thread Tamar Christina
Hi All, Building on top of the previous patch, similar to when we have a single exit if we have a case where all exits are considered early exits and there are existing non virtual phi then in order to maintain LCSSA we have to use the existing PHI variables. We can't simply clear them and just

RE: [PATCH]middle-end: Don't apply copysign optimization if target does not implement optab [PR112468]

2024-01-05 Thread Tamar Christina
> On Fri, 2024-01-05 at 11:02 +0000, Tamar Christina wrote: > > Ok, so something like: > > > > > > ([istarget loongarch*-*-*] && > > > > ([check_effective_target_loongarch_sx] || > > > > [check_effective_target_hard_float])) >

RE: [PATCH]middle-end: Don't apply copysign optimization if target does not implement optab [PR112468]

2024-01-05 Thread Tamar Christina
> -Original Message- > From: Xi Ruoyao > Sent: Thursday, January 4, 2024 10:39 PM > To: Palmer Dabbelt ; Tamar Christina > > Cc: gcc-patches@gcc.gnu.org; nd ; rguent...@suse.de; Jeff Law > > Subject: Re: [PATCH]middle-end: Don't apply copysign optimi

[PATCH]middle-end: Don't apply copysign optimization if target does not implement optab [PR112468]

2024-01-04 Thread Tamar Christina
Hi All, Currently GCC does not treat IFN_COPYSIGN the same as the copysign tree expr. The latter has a libcall fallback and the IFN can only do optabs. Because of this the change I made to optimize copysign only works if the target has implemented the optab, but it should work for those that have

RE: [PATCH 20/21]Arm: Add Advanced SIMD cbranch implementation

2024-01-04 Thread Tamar Christina
> -Original Message- > From: Kyrylo Tkachov > Sent: Thursday, January 4, 2024 11:12 AM > To: Tamar Christina ; gcc-patches@gcc.gnu.org > Cc: nd ; Ramana Radhakrishnan > ; Richard Earnshaw > ; ni...@redhat.com > Subject: RE: [PATCH 20/21]Arm: Add Advanced SIM

RE: [PATCH 20/21]Arm: Add Advanced SIMD cbranch implementation

2024-01-04 Thread Tamar Christina
Ping, --- Hi All, This adds an implementation for conditional branch optab for AArch32. The previous version only allowed operand 0 but it looks like cbranch expansion does not check with the target and so we have to implement all. I therefore did not commit it. This is a larger version. I've

[PATCH]middle-end: check if target can do extract first for early breaks [PR113199]

2024-01-02 Thread Tamar Christina
Hi All, I was generating the vector reverse mask without checking if the target actually supported such an operation. It also seems like more targets implement VEC_EXTRACT than permute on mask registers. So this adds a check for IFN_VEC_EXTRACT support when required and changes the select first

RE: skip vector profiles multiple exits

2024-01-02 Thread Tamar Christina
> -Original Message- > From: Jan Hubicka > Sent: Friday, December 29, 2023 10:32 PM > To: Tamar Christina > Cc: rguent...@suse.de; GCC Patches ; nd > > Subject: Re: skip vector profiles multiple exits > > > Hi Honza, > Hi, > > > > I was

skip vector profiles multiple exits

2023-12-29 Thread Tamar Christina
Hi Honza, I wasn't sure what to do here so I figured I'd ask. In adding support for multiple exits to the vectorizer I didn't know how to update this bit: https://github.com/gcc-mirror/gcc/blob/master/gcc/tree-vect-loop-manip.cc#L3363 Essentially, if skip_vector (i.e. not enough iterations to

[PATCH]middle-end: maintain LCSSA form when peeled vector iterations have virtual operands

2023-12-29 Thread Tamar Christina
Hi All, This patch fixes several interconnected issues. 1. When picking an exit we wanted to check for niter_desc.may_be_zero not true. i.e. we want to pick an exit which we know will iterate at least once. However niter_desc.may_be_zero is not a boolean. It is a tree that encodes a

[PATCH]middle-end: Fix dominators updates when peeling with multiple exits [PR113144]

2023-12-29 Thread Tamar Christina
Hi All, Only trying to update certain dominators doesn't seem to work very well because as the loop gets versioned, peeled, or skip_vector then we end up with very complicated control flow. This means that the final merge blocks for the loop exit are not easy to find or update. Instead of

[PATCH]middle-end: rejects loops with nonlinear inductions and early breaks [PR113163]

2023-12-29 Thread Tamar Christina
Hi All, We can't support nonlinear inductions other than neg when vectorizing early breaks and iteration count is known. For early break we currently require a peeled epilog but in these cases we can't compute the remaining values. Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.

[PATCH 20/21]Arm: Add Advanced SIMD cbranch implementation

2023-12-29 Thread Tamar Christina
Hi All, This adds an implementation for conditional branch optab for AArch32. The previous version only allowed operand 0 but it looks like cbranch expansion does not check with the target and so we have to implement all. I therefore did not commit it. This is a larger version. For e.g. void

[PATCH]AArch64 Update costing for vector conversions [PR110625]

2023-12-29 Thread Tamar Christina
Hi All, In gimple the operation short _8; double _9; _9 = (double) _8; denotes two operations. First we have to widen from short to long and then convert this integer to a double. Currently however we only count the widen/truncate operations: (double) _5 6 times vec_promote_demote costs 12
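To make the two-step reading of the conversion concrete, here is a rough C-level sketch. The function names are made up for this example, and int64_t stands in for the "long" intermediate on the assumption of an LP64 target; it only shows that the direct cast and the explicit widen-then-convert give the same values, it says nothing about the actual cost model.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical example: the conversion as written in source,
   a single cast from short to double.  */
void convert_direct(const short *in, double *out, int n)
{
    assert(n >= 0);
    for (int i = 0; i < n; i++)
        out[i] = (double) in[i];
}

/* The same conversion spelled as the two operations described above:
   widen the short to a 64-bit integer, then convert that to double.  */
void convert_two_step(const short *in, double *out, int n)
{
    assert(n >= 0);
    for (int i = 0; i < n; i++) {
        int64_t wide = in[i];       /* step 1: integer widening */
        out[i] = (double) wide;     /* step 2: int -> fp conversion */
    }
}
```

Every short value is exactly representable as a double, so both functions produce identical results; the point of the sketch is only that there are two separate operations to account for.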

[PATCH][committed]middle-end: explicitly initialize vec_stmts [PR113132]

2023-12-25 Thread Tamar Christina
Hi All, when configured with --enable-checking=release we get a false positive on the use of vec_stmts as the compiler seems unable to notice it gets initialized through the pass-by-reference. This explicitly initializes the local. Bootstrapped Regtested on x86_64-pc-linux-gnu and no issues.

[PATCH][testsuite]: Add more pragma novector to new tests

2023-12-24 Thread Tamar Christina
Hi All, This patch was pre-approved by Richi. This updates the testsuite and adds more #pragma GCC novector to various tests that would otherwise vectorize the vector result checking code. This cleans out the testsuite since the last rebase and prepares for the landing of the early break

RE: [PATCH 3/21]middle-end: Implement code motion and dependency analysis for early breaks

2023-12-20 Thread Tamar Christina
> > + /* If we've moved a VDEF, extract the defining MEM and update > > +usages of it. */ > > + tree vdef; > > + /* This statement is to be moved. */ > > + if ((vdef = gimple_vdef (stmt))) > > + LOOP_VINFO_EARLY_BRK_CONFLICT_STMTS >

RE: RE: [PATCH] Regression FIX: Remove vect_variable_length XFAIL from some tests

2023-12-19 Thread Tamar Christina
guess whoever added the vect_variable_length intended it to fail when VLA though. Perhaps these tests need a dg-add-options ? Since I think other tests already test fixed-length vectors. But let's see what Richi says. Thanks, Tamar From: 钟居哲 Sent: Tuesday, December 19, 2023 1:02 PM To: Tamar

RE: [PATCH] Regression FIX: Remove vect_variable_length XFAIL from some tests

2023-12-19 Thread Tamar Christina
Hi Juzhe, > -Original Message- > From: Juzhe-Zhong > Sent: Tuesday, December 19, 2023 11:19 AM > To: gcc-patches@gcc.gnu.org > Cc: rguent...@suse.de; Tamar Christina ; Juzhe- > Zhong > Subject: [PATCH] Regression FIX: Remove vect_variable_length XFAIL from

[PATCH]middle-end: Handle hybrid SLP induction vectorization with early breaks.

2023-12-19 Thread Tamar Christina
Hi All, While we don't support SLP for early break vectorization, we can land in the situation where the induction was vectorized through hybrid SLP. This means when vectorizing the early break live operation we need to get the results of the SLP operation. Bootstrapped Regtested on
