[Bug tree-optimization/114635] OpenMP reductions fail dependency analysis

2024-04-15 Thread kugan at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114635 --- Comment #18 from kugan at gcc dot gnu.org --- Also, can we set INT_MAX when there is no explicit safelen specified in OMP. Something like: --- a/gcc/omp-low.cc +++ b/gcc/omp-low.cc @@ -6975,14 +6975,11 @@ lower_rec_input_clauses (tree

[Bug tree-optimization/114635] OpenMP reductions fail dependency analysis

2024-04-15 Thread kugan at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114635 --- Comment #12 from kugan at gcc dot gnu.org --- (In reply to Jakub Jelinek from comment #11) > (In reply to kugan from comment #9) > > Looking at the options, looks to me that making loop->safelen a poly_in is > > the wa

[Bug tree-optimization/114635] OpenMP reductions fail dependency analysis

2024-04-15 Thread kugan at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114635 --- Comment #10 from kugan at gcc dot gnu.org --- Created attachment 57946 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57946=edit patch patch to make loop->safelen a poly_int

[Bug tree-optimization/114635] OpenMP reductions fail dependency analysis

2024-04-15 Thread kugan at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114635 --- Comment #9 from kugan at gcc dot gnu.org --- Looking at the options, looks to me that making loop->safelen a poly_in is the way to go. (In reply to Jakub Jelinek from comment #4) > The OpenMP safelen clause argument is a scalar integ

[Bug tree-optimization/53947] [meta-bug] vectorizer missed-optimizations

2024-04-10 Thread kugan at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947 Bug 53947 depends on bug 114653, which changed state. Bug 114653 Summary: Not vectorizing the loop with openmp reduction. https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114653 What|Removed |Added

[Bug tree-optimization/114635] OpenMP reductions fail dependency analysis

2024-04-10 Thread kugan at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114635 kugan at gcc dot gnu.org changed: What|Removed |Added CC||kugan at gcc dot gnu.org

[Bug middle-end/114653] Not vectorizing the loop with openmp reduction.

2024-04-10 Thread kugan at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114653 kugan at gcc dot gnu.org changed: What|Removed |Added Resolution|--- |DUPLICATE Status

[Bug middle-end/114653] Not vectorizing the loop with openmp reduction.

2024-04-10 Thread kugan at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114653 --- Comment #5 from kugan at gcc dot gnu.org --- ddd for the : ref_a: _57 = D.4803[_20]; ref_b: D.4803[_20] = _ifc__174; We get DDR_ARE_DEPENDENT (ddr) == chrec_dont_know. Hence apply_safelen ().

[Bug middle-end/114653] Not vectorizing the loop with openmp reduction.

2024-04-09 Thread kugan at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114653 --- Comment #4 from kugan at gcc dot gnu.org --- This particular loop has loop->safelen set to 16. Does this mean this can never be loop vectorized for VLA?

[Bug middle-end/114653] Not vectorizing the loop with openmp reduction.

2024-04-09 Thread kugan at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114653 --- Comment #3 from kugan at gcc dot gnu.org --- For SVE mode in vect_analyze_loop_2, we have (gdb) p min_vf $15 = {coeffs = {4, 4}} (gdb) p max_vf $16 = 16 Thus maybe_lt (max_vf, min_vf)) is false. This results in bad data dependence.

[Bug middle-end/114653] Not vectorizing the loop with openmp reduction.

2024-04-09 Thread kugan at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114653 --- Comment #2 from kugan at gcc dot gnu.org --- Thanks. I see the following in the log: test.cpp:33:53: missed: not vectorized: relevant stmt not supported: _54 = .MASK_LOAD (_53, 32B, _171); test.cpp:22:19: missed: bad operation

[Bug middle-end/114653] New: Not vectoring the loop with openmp reduction.

2024-04-09 Thread kugan at gcc dot gnu.org via Gcc-bugs
: middle-end Assignee: unassigned at gcc dot gnu.org Reporter: kugan at gcc dot gnu.org Target Milestone: --- Created attachment 57910 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57910=edit testcase Main loop in the attached test case is not vectorized with -fope

[Bug middle-end/111683] [11/12/13/14 Regression] Incorrect answer when using SSE2 intrinsics with -O3 since r7-3163-g973625a04b3d9351f2485e37f7d3382af2aed87e

2024-03-09 Thread kugan at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111683 --- Comment #5 from kugan at gcc dot gnu.org --- -O3 -fno-tree-vectorize and -O3 -fno-tree-vrp works. I looked at the ever dump and it is not doing anything suspicious. Looks like range_info usage in vectoriser is causing the problem.

[Bug libgomp/113698] GNU OpenMP with OMP_PROC_BIND alters thread affinity in a way that negatively affects performance

2024-02-09 Thread kugan at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113698 --- Comment #4 from kugan at gcc dot gnu.org --- Thanks for looking into this. The main reason we ere seeing performance issue turned out to be due to glibc malloc issue in https://sourceware.org/bugzilla/show_bug.cgi?id=30945

[Bug libgomp/113698] New: GNU OpenMP with OMP_PROC_BIND alters thread affinity in a way that negatively affects performance

2024-01-31 Thread kugan at gcc dot gnu.org via Gcc-bugs
Severity: normal Priority: P3 Component: libgomp Assignee: unassigned at gcc dot gnu.org Reporter: kugan at gcc dot gnu.org CC: jakub at gcc dot gnu.org Target Milestone: --- Created attachment 57275 --> https://gcc.gnu.

Re: [PR47785] COLLECT_AS_OPTIONS

2019-11-07 Thread Kugan Vivekanandarajah
Hi Richard, Thanks for the review. On Tue, 5 Nov 2019 at 23:08, Richard Biener wrote: > > On Tue, Nov 5, 2019 at 12:17 AM Kugan Vivekanandarajah > wrote: > > > > Hi, > > Thanks for the review. > > > > On Tue, 5 Nov 2019 at 03:57, H.J. Lu wrote: > >

Re: [PR47785] COLLECT_AS_OPTIONS

2019-11-04 Thread Kugan Vivekanandarajah
Hi, Thanks for the review. On Tue, 5 Nov 2019 at 03:57, H.J. Lu wrote: > > On Sun, Nov 3, 2019 at 6:45 PM Kugan Vivekanandarajah > wrote: > > > > Thanks for the reviews. > > > > > > On Sat, 2 Nov 2019 at 02:49, H.J. Lu wrote: > > > > > &g

Re: [PR47785] COLLECT_AS_OPTIONS

2019-11-03 Thread Kugan Vivekanandarajah
Thanks for the reviews. On Sat, 2 Nov 2019 at 02:49, H.J. Lu wrote: > > On Thu, Oct 31, 2019 at 6:33 PM Kugan Vivekanandarajah > wrote: > > > > On Wed, 30 Oct 2019 at 03:11, H.J. Lu wrote: > > > > > > On Sun, Oct 27, 2019 at 6:33 PM Kugan Vivekanand

Re: [PR47785] COLLECT_AS_OPTIONS

2019-10-31 Thread Kugan Vivekanandarajah
On Wed, 30 Oct 2019 at 03:11, H.J. Lu wrote: > > On Sun, Oct 27, 2019 at 6:33 PM Kugan Vivekanandarajah > wrote: > > > > Hi Richard, > > > > Thanks for the review. > > > > On Wed, 23 Oct 2019 at 23:07, Richard Biener > > wrote:

Re: [PR47785] COLLECT_AS_OPTIONS

2019-10-28 Thread Kugan Vivekanandarajah
Hi Bernhard, Thanks for the review. On Tue, 29 Oct 2019 at 08:52, Bernhard Reutner-Fischer wrote: > > On Mon, 28 Oct 2019 11:53:06 +1100 > Kugan Vivekanandarajah wrote: > > > On Wed, 23 Oct 2019 at 23:07, Richard Biener > > wrote: > > > > Did you try this

[Bug driver/47785] GCC with -flto does not pass -Wa options to the assembler

2019-10-22 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=47785 kugan at gcc dot gnu.org changed: What|Removed |Added CC||kugan at gcc dot gnu.org

Re: [PR47785] COLLECT_AS_OPTIONS

2019-10-21 Thread Kugan Vivekanandarajah
Hi Richard, Thanks for the pointers. On Fri, 11 Oct 2019 at 22:33, Richard Biener wrote: > > On Fri, Oct 11, 2019 at 6:15 AM Kugan Vivekanandarajah > wrote: > > > > Hi Richard, > > Thanks for the review. > > > > On Wed, 2 Oct 2019 at 20:41, Richard Bien

Re: [PR47785] COLLECT_AS_OPTIONS

2019-10-10 Thread Kugan Vivekanandarajah
Hi Richard, Thanks for the review. On Wed, 2 Oct 2019 at 20:41, Richard Biener wrote: > > On Wed, Oct 2, 2019 at 10:39 AM Kugan Vivekanandarajah > wrote: > > > > Hi, > > > > As mentioned in the PR, attached patch adds COLLECT_AS_OPTIONS for > > passi

[ARM] Enable DF only when TARGET_VFP_DOUBLE

2019-10-09 Thread Kugan Vivekanandarajah
uot;:30 -1 (nil)) This looks like due to a typo in the md patterns. Attached patch fixes this. Bootsrapped and regression tested on arm-linux-gnueabihf without any regressions. Is this OK for trunk? Thanks, Kugan gcc/ChangeLog: 2019-10-10 kugan.vivekanandarajah * config/arm/vf

[PR47785] COLLECT_AS_OPTIONS

2019-10-02 Thread Kugan Vivekanandarajah
either adjusting partitioning according to flags or emitting multiple object files from a single LTRANS CU. We could consider this as a follow up. Bootstrapped and regression tests on arm-linux-gcc. Is this OK for trunk? Thanks, Kugan gcc/ChangeLog: 2019-10-02 kugan.vivekanandarajah PR lto

[Bug ipa/91468] Suspicious codes in ipa-prop.c and ipa-cp.c

2019-08-26 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91468 kugan at gcc dot gnu.org changed: What|Removed |Added CC||kugan at gcc dot gnu.org

Re: On-Demand range technology [1/5] - Executive Summary

2019-06-21 Thread Kugan Vivekanandarajah
precision, the patch would be correct and used to eliminate redundant zero/sign extensions. Please let me know if my explanation is not clear and I will show it with more examples. Thanks, Kugan On Fri, 21 Jun 2019 at 23:27, Andrew MacLeod wrote: > > On 6/19/19 11:04 PM, Kugan Vivekanandarajah

Re: AARCH64 configure check for gas -mabi support

2019-06-20 Thread Kugan Vivekanandarajah
behalf as rev 205891. > > > > On 11 December 2013 13:27, Marcus Shawcroft > > wrote: > > > On 10/12/13 20:23, Kugan wrote: > > > > > >> gcc/ > > >> > > >> +2013-12-11 Kugan Vivekanandarajah > > >> + * con

Re: [PATCH 0/2][RFC][PR88836][AARCH64] Fix redundant ptest instruction

2019-06-19 Thread Kugan Vivekanandarajah
Hi Richard, Thanks for your comments. On Thu, 16 May 2019 at 18:13, Richard Sandiford wrote: > > kugan.vivekanandara...@linaro.org writes: > > From: Kugan Vivekanandarajah > > > > Inorder to fix this PR. > > * We need to change the whilelo pattern in

Re: On-Demand range technology [1/5] - Executive Summary

2019-06-19 Thread Kugan Vivekanandarajah
that it is not possible to get value ranges in PROMOTED_MODE precision on demand. Or is there any way we can use on-demand ranger here? Thanks, Kugan On Thu, 23 May 2019 at 11:28, Andrew MacLeod wrote: > > Now that stage 1 has reopened, I’d like to reopen a discussion about the > t

[Bug target/88834] [SVE] Poor addressing mode choices for LD2 and ST2

2019-06-17 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88834 --- Comment #21 from kugan at gcc dot gnu.org --- (In reply to Christophe Lyon from comment #20) > Hi Kugan, > > The new test fails with -mabi=ilp32: > FAIL: gcc.target/aarch64/pr88834.c scan-assembler-times \\tld2w\\t{z[0-9]+.s &

Fix ICE due to commit for PR88834

2019-06-16 Thread Kugan Vivekanandarajah
believe this is the only way we can have GET_MODE_UNIT_SIZE of 0. Otherwise, we can check for GET_MODE_UNIT_SIZE of zero. Bootstrapped and regression tested attached patch on x86_64-linux-gnu with no new regressions. Is this OK for trunk? Thanks, Kugan gcc/ChangeLog: 2019-06-17 Kugan Vivekanandarajah

Re: [AARCH64] Fix typo in comment

2019-06-12 Thread Kugan Vivekanandarajah
Hi Kyrill, Thanks for the comments. Committed as you suggested. Thanks, Kugan On Wed, 12 Jun 2019 at 18:07, Kyrill Tkachov wrote: > > Hi Kugan, > > On 6/12/19 4:59 AM, Kugan Vivekanandarajah wrote: > > AArch64 comment for ADDSUB iterator is a typo or copy-and-paste error.

[Bug target/88838] [SVE] Use 32-bit WHILELO in LP64 mode

2019-06-12 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88838 --- Comment #6 from kugan at gcc dot gnu.org --- Author: kugan Date: Thu Jun 13 03:34:28 2019 New Revision: 272233 URL: https://gcc.gnu.org/viewcvs?rev=272233=gcc=rev Log: gcc/ChangeLog: 2019-06-13 Kugan Vivekanandarajah PR target

[Bug target/88834] [SVE] Poor addressing mode choices for LD2 and ST2

2019-06-12 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88834 --- Comment #19 from kugan at gcc dot gnu.org --- Author: kugan Date: Thu Jun 13 03:18:54 2019 New Revision: 272232 URL: https://gcc.gnu.org/viewcvs?rev=272232=gcc=rev Log: gcc/ChangeLog: 2019-06-13 Kugan Vivekanandarajah PR target

[AARCH64] Fix typo in comment

2019-06-11 Thread Kugan Vivekanandarajah
AArch64 comment for ADDSUB iterator is a typo or copy-and-paste error. Attached patch fixes this. I believe this falls under obvious category. I will commit it after 48hrs unless comments should be better worded. Thanks, Kugan gcc/ChangeLog: 2019-06-12 Kugan Vivekanandarajah * config

Re: [RFC][PR88838][SVE] Use 32-bit WHILELO in LP64 mode

2019-06-06 Thread Kugan Vivekanandarajah
Hi Richard, On Thu, 6 Jun 2019 at 22:07, Richard Sandiford wrote: > > Kugan Vivekanandarajah writes: > > Hi Richard, > > > > On Thu, 6 Jun 2019 at 19:35, Richard Sandiford > > wrote: > >> > >> Kugan Vivekanandarajah writes: > >> &g

Re: [RFC][PR88838][SVE] Use 32-bit WHILELO in LP64 mode

2019-06-06 Thread Kugan Vivekanandarajah
Hi Richard, On Thu, 6 Jun 2019 at 19:35, Richard Sandiford wrote: > > Kugan Vivekanandarajah writes: > > Hi Richard, > > > > Thanks for the review. Attached is the latest patch. > > > > For testcase like cond_arith_1.c, with the patch, gcc ICE in fwprop. I

Re: [RFC][PR88838][SVE] Use 32-bit WHILELO in LP64 mode

2019-06-05 Thread Kugan Vivekanandarajah
, bailout when it is UNSPEC and MODEs are not compatible. */ + if (GET_MODE_CLASS (mode) != GET_MODE_CLASS (GET_MODE (reg))) +return false; new_rtx = propagate_rtx (*loc, mode, reg, src, optimize_bb_for_speed_p (BLOCK_FOR_INSN (use_insn))); Thanks, Kugan On Mon, 3 Jun

Re: [RFC][PR88838][SVE] Use 32-bit WHILELO in LP64 mode

2019-06-02 Thread Kugan Vivekanandarajah
Hi Richard, Thanks for the review, On Fri, 31 May 2019 at 19:43, Richard Sandiford wrote: > > Kugan Vivekanandarajah writes: > > @@ -609,8 +615,14 @@ vect_set_loop_masks_directly (struct loop *loop, > > loop_vec_info loop_vinfo, > > > >/* Get the mas

Re: [RFC][PR88838][SVE] Use 32-bit WHILELO in LP64 mode

2019-05-30 Thread Kugan Vivekanandarajah
Hi Richard, Thanks for the review. On Tue, 28 May 2019 at 20:44, Richard Sandiford wrote: > > Kugan Vivekanandarajah writes: > > [...] > > diff --git a/gcc/tree-vect-loop-manip.c b/gcc/tree-vect-loop-manip.c > > index b3fae5b..c15b8a2 100644 > > --- a/gcc/tree-v

Re: [RFC][PR88838][SVE] Use 32-bit WHILELO in LP64 mode

2019-05-27 Thread Kugan Vivekanandarajah
Hi Richard, Thanks for the review. On Sat, 25 May 2019 at 19:41, Richard Sandiford wrote: > > Kugan Vivekanandarajah writes: > > diff --git a/gcc/tree-vect-loop-manip.c b/gcc/tree-vect-loop-manip.c > > index 77d3dac..d6452a1 100644 > > --- a/gcc/tree-vect-loop-manip.c

Re: [PATCH 1/2] Add support for IVOPT

2019-05-21 Thread Kugan Vivekanandarajah
Hi Richard, On Fri, 17 May 2019 at 18:47, Richard Sandiford wrote: > > Kugan Vivekanandarajah writes: > > [...] > >> > +{ > >> > + struct mem_address parts = {NULL_TREE, integer_one_node, > >> > + N

[RFC][PR88838][SVE] Use 32-bit WHILELO in LP64 mode

2019-05-21 Thread Kugan Vivekanandarajah
Hi, Attached RFC patch attempts to use 32-bit WHILELO in LP64 mode to fix the PR. Bootstarp and regression testing ongoing. In earlier testing, I ran into an issue related to fwprop. I will tackle that based on the feedback for the patch. Thanks, Kugan From

Re: [PATCH v3 2/3] Add predict_doloop_p target hook

2019-05-16 Thread Kugan Vivekanandarajah
_TYPE (niters); > + unsigned cost = 0; > + bool speed = optimize_loop_for_speed_p (loop); > + int regno = LAST_VIRTUAL_REGISTER + 1; > + walk_tree (, prepare_decl_rtl, , NULL); > + start_sequence (); > + expand_expr (niters, NULL_RTX, TYPE_MODE (type), EXPAND_NORMAL); > + r

Re: [PATCH 1/2] Add support for IVOPT

2019-05-16 Thread Kugan Vivekanandarajah
Hi Richard, On Thu, 16 May 2019 at 21:14, Richard Biener wrote: > > On Wed, May 15, 2019 at 4:40 AM wrote: > > > > From: Kugan Vivekanandarajah > > > > gcc/ChangeLog: > > > > 2019-05-15 Kugan Vivekanandarajah > > > >

Re: [PATCH 1/2] Add support for IVOPT

2019-05-16 Thread Kugan Vivekanandarajah
Hi Richard, On Wed, 15 May 2019 at 16:57, Richard Sandiford wrote: > > Thanks for doing this. > > kugan.vivekanandara...@linaro.org writes: > > From: Kugan Vivekanandarajah > > > > gcc/ChangeLog: > > > > 2019-05-15 Kugan Vivekanandarajah > >

Re: [PATCH 2/2] aarch64 back-end changes

2019-05-15 Thread Kugan Vivekanandarajah
Hi Richard, On Wed, 15 May 2019 at 23:24, Richard Earnshaw (lists) wrote: > > On 15/05/2019 13:48, Richard Earnshaw (lists) wrote: > > On 15/05/2019 03:39, kugan.vivekanandara...@linaro.org wrote: > >> From: Kugan Vivekanandarajah > >> > > > > The subje

[PATCH 2/2] [PR88836][aarch64] Fix CSE to process parallel rtx dest one by one

2019-05-15 Thread kugan . vivekanandarajah
From: Kugan Vivekanandarajah This patch changes cse_insn to process parallel rtx one by one such that any destination rtx in cse list is invalidated before processing the next. gcc/ChangeLog: 2019-05-16 Kugan Vivekanandarajah PR target/88834 * cse.c (safe_hash): Handle

[PATCH 1/2] [PR88836][aarch64] Set CC_REGNUM instead of clobber

2019-05-15 Thread kugan . vivekanandarajah
From: Kugan Vivekanandarajah For aarch64 sve while_ult pattern, Set CC_REGNUM instead of clobbering. gcc/ChangeLog: 2019-05-16 Kugan Vivekanandarajah PR target/88834 * config/aarch64/aarch64-sve.md (while_ult): Set CC_REGNUM instead of clobbering. Change-Id

[PATCH 0/2][RFC][PR88836][AARCH64] Fix redundant ptest instruction

2019-05-15 Thread kugan . vivekanandarajah
From: Kugan Vivekanandarajah Inorder to fix this PR. * We need to change the whilelo pattern in backend * Change RTL CSE such that: - Add support for VEC_DUPLICATE - When handling PARALLEL rtx in cse_insn, we kill CSE defined by all the parallel rtx at the end. For example

[PATCH 2/2] aarch64 back-end changes

2019-05-14 Thread kugan . vivekanandarajah
From: Kugan Vivekanandarajah gcc/ChangeLog: 2019-05-15 Kugan Vivekanandarajah PR target/88834 * config/aarch64/aarch64.c (aarch64_classify_address): Relax allow_reg_index_p. gcc/testsuite/ChangeLog: 2019-05-15 Kugan Vivekanandarajah PR target/88834

[PATCH 0/2] [RFC][PR88834]

2019-05-14 Thread kugan . vivekanandarajah
From: Kugan Vivekanandarajah In PR88834, IVOPT is not selecting the right addressing mode. Inorder to fix thix, we need to add support to add IV uses for IFN_MASK_LOAD_LANES and IFN_MASK_STORE_LANES. In addition, we also need to add IV candidate with scaled by the element or access size

[PATCH 1/2] Add support for IVOPT

2019-05-14 Thread kugan . vivekanandarajah
From: Kugan Vivekanandarajah gcc/ChangeLog: 2019-05-15 Kugan Vivekanandarajah PR target/88834 * tree-ssa-loop-ivopts.c (get_mem_type_for_internal_fn): Handle IFN_MASK_LOAD_LANES and IFN_MASK_STORE_LANES. (find_interesting_uses_stmt): Likewise

Re: [aarch64][RFA][rtl-optimization/87763] Fix insv_1 and insv_2 for aarch64

2019-04-22 Thread Kugan Vivekanandarajah
king/inserting move only when new pattern is returned? Thanks, Kugan > > Jeff

[Bug target/88834] [SVE] Poor addressing mode choices for LD2 and ST2

2019-04-09 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88834 --- Comment #17 from kugan at gcc dot gnu.org --- (In reply to Wilco from comment #16) > (In reply to kugan from comment #15) > > (In reply to Wilco from comment #11) > > > There is also something odd with the way the loop iter

[Bug target/88834] [SVE] Poor addressing mode choices for LD2 and ST2

2019-04-08 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88834 --- Comment #15 from kugan at gcc dot gnu.org --- (In reply to Wilco from comment #11) > There is also something odd with the way the loop iterates, this doesn't > look right: > > whilelo p0.s, x3, x4 > incwx3

[Bug target/88834] [SVE] Poor addressing mode choices for LD2 and ST2

2019-04-08 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88834 --- Comment #14 from kugan at gcc dot gnu.org --- Created attachment 46104 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46104=edit testcase

[Bug target/88834] [SVE] Poor addressing mode choices for LD2 and ST2

2019-04-08 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88834 kugan at gcc dot gnu.org changed: What|Removed |Added Attachment #46040|0 |1 is obsolete

[Bug target/88834] [SVE] Poor addressing mode choices for LD2 and ST2

2019-04-08 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88834 --- Comment #12 from kugan at gcc dot gnu.org --- (In reply to rsand...@gcc.gnu.org from comment #10) > (In reply to kugan from comment #9) > > Created attachment 46040 [details] > > patch > > Wasn't sure whether this patch

[Bug rtl-optimization/89862] LTO bootstrap fails for ARM

2019-03-29 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89862 --- Comment #4 from kugan at gcc dot gnu.org --- Author: kugan Date: Sat Mar 30 04:28:51 2019 New Revision: 270031 URL: https://gcc.gnu.org/viewcvs?rev=270031=gcc=rev Log: 2019-03-29 Kugan Vivekanandarajah Backport from mainline

[Bug rtl-optimization/89862] LTO bootstrap fails for ARM

2019-03-29 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89862 --- Comment #3 from kugan at gcc dot gnu.org --- Author: kugan Date: Sat Mar 30 04:24:22 2019 New Revision: 270030 URL: https://gcc.gnu.org/viewcvs?rev=270030=gcc=rev Log: 2019-03-29 Kugan Vivekanandarajah Eric Botcazou

[PR89862] Fix ARM lto bootstrap

2019-03-28 Thread Kugan Vivekanandarajah
for this. However, it is being tested with LTO bootstrap for ARM. I therefore believe that it is OK. I have also tested the patch with x86_64-linux-gnu with no new regressions. Is this OK for trunk? Thanks, Kugan diff --git a/gcc/rtl.h b/gcc/rtl.h index f991919..52ecd5a 100644 --- a/gcc/rtl.h +++ b/gcc

[Bug rtl-optimization/89862] LTO bootstrap fails for ARM

2019-03-28 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89862 --- Comment #2 from kugan at gcc dot gnu.org --- (In reply to Eric Botcazou from comment #1) > Can you try this instead? > > Index: rtl.h > === > --- rtl.h (

[Bug target/88834] [SVE] Poor addressing mode choices for LD2 and ST2

2019-03-27 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88834 kugan at gcc dot gnu.org changed: What|Removed |Added Attachment #45686|0 |1 is obsolete

[Bug target/88834] [SVE] Poor addressing mode choices for LD2 and ST2

2019-03-27 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88834 --- Comment #8 from kugan at gcc dot gnu.org --- (In reply to rsand...@gcc.gnu.org from comment #7) > Thanks for looking at this. > > (In reply to kugan from comment #6) > > cmp w3, 0 > > ble .L1 >

[Bug rtl-optimization/89862] New: LTO bootstrap fails for ARM

2019-03-27 Thread kugan at gcc dot gnu.org
Assignee: unassigned at gcc dot gnu.org Reporter: kugan at gcc dot gnu.org Target Milestone: --- Created attachment 46039 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46039=edit patch With the commit: commit 67c18bce7054934528ff5930cca283b4ac967dca Author: ebotcazou D

[Bug target/88838] [SVE] Use 32-bit WHILELO in LP64 mode

2019-03-20 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88838 --- Comment #5 from kugan at gcc dot gnu.org --- Created attachment 46000 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46000=edit RFC patch RFC patch fixes this for review.

[SVE ACLE] svbic implementation

2019-03-19 Thread Kugan Vivekanandarajah
I have committed attached patch to aarch64/sve-acle-branch branch which implements svbic. Thanks, Kugan From 182bd15334874844bef5e317f55a6497f77e12ff Mon Sep 17 00:00:00 2001 From: Kugan Vivekanandarajah Date: Thu, 24 Jan 2019 20:57:19 +1100 Subject: [PATCH 1/3] svbic Change-Id

[Bug target/88836] [SVE] Redundant PTEST in loop test

2019-02-21 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88836 --- Comment #2 from kugan at gcc dot gnu.org --- Created attachment 45795 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=45795=edit RFC patch AFIK, we need to: 1. Change the whilelo pattern in backend 2. Change RTL CSE - Add supp

[Bug target/88838] [SVE] Use 32-bit WHILELO in LP64 mode

2019-02-21 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88838 --- Comment #4 from kugan at gcc dot gnu.org --- sorry wr(In reply to kugan from comment #3) > Created attachment 45794 [details] > RFC patch Oops wrong place, it should be for https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88836

[Bug target/88838] [SVE] Use 32-bit WHILELO in LP64 mode

2019-02-21 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88838 --- Comment #3 from kugan at gcc dot gnu.org --- Created attachment 45794 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=45794=edit RFC patch

[Bug target/88838] [SVE] Use 32-bit WHILELO in LP64 mode

2019-02-21 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88838 kugan at gcc dot gnu.org changed: What|Removed |Added CC||kugan at gcc dot gnu.org

[Bug target/88834] [SVE] Poor addressing mode choices for LD2 and ST2

2019-02-12 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88834 --- Comment #6 from kugan at gcc dot gnu.org --- > > Note the difference in mode for aarch64_classify_address. Not sure if this > is because of the way my patch changes ivopt. Yes, it ws my mistake in iv-use. with attached patch,

[Bug target/88834] [SVE] Poor addressing mode choices for LD2 and ST2

2019-02-12 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88834 kugan at gcc dot gnu.org changed: What|Removed |Added Attachment #45661|0 |1 is obsolete

[Bug tree-optimization/89296] New: tree copy-header masking uninitialized warning

2019-02-11 Thread kugan at gcc dot gnu.org
: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: kugan at gcc dot gnu.org Target Milestone: --- void test_func(void) { int loop; // uninitialized and "garbage" while (!loop) { loop = get_a_value(); // <- must be

[Bug target/88834] [SVE] Poor addressing mode choices for LD2 and ST2

2019-02-11 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88834 --- Comment #4 from kugan at gcc dot gnu.org --- Created attachment 45661 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=45661=edit ivopt patch v1

[Bug target/88834] [SVE] Poor addressing mode choices for LD2 and ST2

2019-02-11 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88834 --- Comment #3 from kugan at gcc dot gnu.org --- I added iv-use for MASKED_LOAD_LANE and the result is cmp w3, 0 ble .L1 sub w5, w3, #1 mov x4, 0 lsr w5, w5, 1 add w5, w5, 1

[Bug target/88834] [SVE] Poor addressing mode choices for LD2 and ST2

2019-02-03 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88834 kugan at gcc dot gnu.org changed: What|Removed |Added CC||kugan at gcc dot gnu.org

[SVE ACLE] Implements svdot

2019-01-17 Thread Kugan Vivekanandarajah
I committed the following patch which implements svdot to aarch64/sve-acle-branch. branch Thanks, Kugan From b75cd8ba8f911c137380677b85882c22a6467bf6 Mon Sep 17 00:00:00 2001 From: Kugan Vivekanandarajah Date: Fri, 18 Jan 2019 09:07:10 +1100 Subject: [PATCH] [SVE ACLE] Implements svdot Change

[SVE ACLE] Implements svmulh

2019-01-17 Thread Kugan Vivekanandarajah
I committed the following patch which implements svmulh to aarch64/sve-acle-branch. branch Thanks, Kugan From 33b76de8ef5f370dfacba0addef2fe0b1f2a61db Mon Sep 17 00:00:00 2001 From: Kugan Vivekanandarajah Date: Fri, 18 Jan 2019 07:33:26 +1100 Subject: [PATCH] [SVE ACLE] Implements svmulh Change

[SVE ACLE] Implements svabs, svnot, svneg and svsqrt

2019-01-15 Thread Kugan Vivekanandarajah
I committed the following patch which implements svabs, svnot, svneg and svsqrt to aarch64/sve-acle-branch. branch Thanks, Kugan From 2af9609a58cf7efbed93f15413224a2552b9696d Mon Sep 17 00:00:00 2001 From: Kugan Vivekanandarajah Date: Wed, 16 Jan 2019 07:45:52 +1100 Subject: [PATCH] [SVE ACLE

[Bug sanitizer/88333] [9 Regression] ice in asan_emit_stack_protection, at asan.c:1574

2018-12-06 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88333 kugan at gcc dot gnu.org changed: What|Removed |Added CC||kugan at gcc dot gnu.org

[Bug sanitizer/88350] Linux kernel build ICE with allyesconfig for aarch64

2018-12-06 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88350 kugan at gcc dot gnu.org changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution

[Bug sanitizer/88350] Linux kernel build ICE with allyesconfig for aarch64

2018-12-06 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88350 kugan at gcc dot gnu.org changed: What|Removed |Added Alias|PR88333 | --- Comment #2 from kugan

[Bug sanitizer/88350] New: Linux kernel build ICE with allyesconfig for aarch64

2018-12-04 Thread kugan at gcc dot gnu.org
Component: sanitizer Assignee: unassigned at gcc dot gnu.org Reporter: kugan at gcc dot gnu.org CC: dodji at gcc dot gnu.org, dvyukov at gcc dot gnu.org, jakub at gcc dot gnu.org, kcc at gcc dot gnu.org, marxin at gcc dot gnu.org Target Milestone

[Bug rtl-optimization/88212] New: IRA Register Coalescing not working for the testcase

2018-11-26 Thread kugan at gcc dot gnu.org
Component: rtl-optimization Assignee: unassigned at gcc dot gnu.org Reporter: kugan at gcc dot gnu.org Target Milestone: --- When compiling the following on aarch64 with -O2: #include void g(int32_t *p, int32x2x2_t val, int x) { vst2_lane_s32(p,val,0); } generates

[Bug target/86677] popcount builtin detection is breaking some kernel build

2018-11-12 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86677 --- Comment #13 from kugan at gcc dot gnu.org --- Author: kugan Date: Mon Nov 12 23:43:56 2018 New Revision: 266039 URL: https://gcc.gnu.org/viewcvs?rev=266039=gcc=rev Log: gcc/ChangeLog: 2018-11-13 Kugan Vivekanandarajah PR middle

[Bug middle-end/87528] Popcount changes caused 531.deepsjeng_r run-time regression on Skylake

2018-11-12 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87528 --- Comment #7 from kugan at gcc dot gnu.org --- Author: kugan Date: Mon Nov 12 23:43:56 2018 New Revision: 266039 URL: https://gcc.gnu.org/viewcvs?rev=266039=gcc=rev Log: gcc/ChangeLog: 2018-11-13 Kugan Vivekanandarajah PR middle

Re: [RFC][PR87528][PR86677] Disable builtin popcount detection when back-end does not define it

2018-11-11 Thread Kugan Vivekanandarajah
Hi Richard, Thanks for the review. On Thu, 8 Nov 2018 at 00:03, Richard Biener wrote: > > On Fri, Nov 2, 2018 at 10:02 AM Kugan Vivekanandarajah > wrote: > > > > Hi Richard, > > Thanks for the review. > > On Tue, 30 Oct 2018 at 01:25, Richard Biener > >

Re: [RFC][PR87528][PR86677] Disable builtin popcount detection when back-end does not define it

2018-11-02 Thread Kugan Vivekanandarajah
Hi Richard, Thanks for the review. On Tue, 30 Oct 2018 at 01:25, Richard Biener wrote: > > On Mon, Oct 29, 2018 at 2:06 AM Kugan Vivekanandarajah > wrote: > > > > Hi Richard and Jeff, > > > > Thanks for your comments. > > > > On Fri, 26

[Bug c++/87469] [9 Regression] ice in record_estimate, at tree-ssa-loop-niter.c:3271

2018-10-29 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87469 --- Comment #5 from kugan at gcc dot gnu.org --- Author: kugan Date: Mon Oct 29 22:02:45 2018 New Revision: 265605 URL: https://gcc.gnu.org/viewcvs?rev=265605=gcc=rev Log: gcc/testsuite/ChangeLog: 2018-10-29 Kugan Vivekanandarajah

Re: [RFC][PR87528][PR86677] Disable builtin popcount detection when back-end does not define it

2018-10-28 Thread Kugan Vivekanandarajah
Hi Richard and Jeff, Thanks for your comments. On Fri, 26 Oct 2018 at 19:40, Richard Biener wrote: > > On Fri, Oct 26, 2018 at 4:55 AM Jeff Law wrote: > > > > On 10/25/18 4:33 PM, Kugan Vivekanandarajah wrote: > > > Hi, > > > > > > PR87528

[PR87469] ICE in record_estimate, at tree-ssa-loop-niter.c

2018-10-27 Thread Kugan Vivekanandarajah
. Is this OK? Thanks, Kugan gcc/testsuite/ChangeLog: 2018-10-26 Kugan Vivekanandarajah PR middle-end/87469 * g++.dg/pr87469.C: New test. gcc/ChangeLog: 2018-10-26 Kugan Vivekanandarajah PR middle-end/87469 * tree-ssa-loop-niter.c (number_of_iterations_popcount): Fix niter max

[ABSU_EXPR] Add some of the missing patterns in match.pd

2018-10-25 Thread Kugan Vivekanandarajah
Hi, This patch adds some of the missing patterns in match.pd for ABSU_EXPR and it is a revised version based on the review at https://gcc.gnu.org/ml/gcc-patches/2018-07/msg00046.html Bootstrapped and regression tested on x86_64-linux-gnu with no new regressions. Is this OK trunk? Thanks, Kugan

[RFC][PR87528][PR86677] Disable builtin popcount detection when back-end does not define it

2018-10-25 Thread Kugan Vivekanandarajah
discussions) does this. Bootstrapped and regression tested on x86_64-linux-gnu with no new regressions. We need to disable the popcount* testcases. I will have to define a effective_target_with_popcount in gcc/testsuite/lib/target-supports.exp if this patch is OK? Thanks, Kugan gcc/ChangeLog: 2018

[Bug c++/87469] [9 Regression] ice in record_estimate, at tree-ssa-loop-niter.c:3271

2018-10-17 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87469 --- Comment #4 from kugan at gcc dot gnu.org --- In the loop here, the value defined in the loop (e) is used outside the loop hence this should not be detected as popcount (AFIK). I will have a look at fixing this.

[SVE ACLE] Implements ACLE svdup, svindex, svqad/qsub, svabd and svmul

2018-10-15 Thread Kugan Vivekanandarajah
Hi, Attached patch implements ACLE svdup, svindex, svqad/qsub, svabd and svmul built-ins. Committed to ACLE branch, Thanks, Kugan 0001-svdup-svindex-svqad-qsub-svabd-and-svmul.patch.gz Description: application/gzip

[Bug target/87253] New: Python test_ctypes fails when built with gcc 8.2

2018-09-08 Thread kugan at gcc dot gnu.org
: target Assignee: unassigned at gcc dot gnu.org Reporter: kugan at gcc dot gnu.org Target Milestone: --- Python-2.7.15 Steps to reproduce error In Python src directory: ./configure make ./python Lib/test/regrtest.py -v test_ctypes

Re: [RFC] Fix recent popcount change is breaking

2018-07-27 Thread Kugan Vivekanandarajah
tions for other libgcc functions IIRC. >> >>Can you please Kugan create Linux kernel bug for that? So that >>discussion >>can happen? > > There's no discussion necessary, libgcc is the core compiler runtime. If you > choose not to use it you have to provide your own implem

  1   2   3   4   5   6   7   8   9   >