RE: [PATCH PR95696] regrename creates overlapping register allocations for vliw

2020-08-03 Thread Yangfei (Felix)
install if it's good to go. Felix > -Original Message- > From: Richard Sandiford [mailto:richard.sandif...@arm.com] > Sent: Friday, July 31, 2020 5:33 PM > To: Zhongyunde > Cc: gcc-patches@gcc.gnu.org; Yangfei (Felix) > Subject: Re: [PATCH PR95696] regrename creates ov

RE: [PATCH PR95961] vect: ICE: in exact_div, at poly-int.h:2182

2020-07-02 Thread Yangfei (Felix)
Hi, > -Original Message- > From: Richard Sandiford [mailto:richard.sandif...@arm.com] > Sent: Thursday, July 2, 2020 5:17 PM > To: Yangfei (Felix) > Cc: Richard Biener ; Richard Biener > ; gcc-patches@gcc.gnu.org > Subject: Re: [PATCH PR95961] vect: ICE: in exact_di

RE: [PATCH PR95961] vect: ICE: in exact_div, at poly-int.h:2182

2020-07-02 Thread Yangfei (Felix)
Hi, > -Original Message- > From: Richard Sandiford [mailto:richard.sandif...@arm.com] > Sent: Wednesday, July 1, 2020 9:03 PM > To: Yangfei (Felix) > Cc: Richard Biener ; Richard Biener > ; gcc-patches@gcc.gnu.org > Subject: Re: [PATCH PR95961] vect: ICE: in exact_di

RE: [PATCH PR95961] vect: ICE: in exact_div, at poly-int.h:2182

2020-07-01 Thread Yangfei (Felix)
Hi, > -Original Message- > From: Richard Biener [mailto:rguent...@suse.de] > Sent: Tuesday, June 30, 2020 10:50 PM > To: Richard Sandiford > Cc: Richard Biener ; Yangfei (Felix) > ; gcc-patches@gcc.gnu.org > Subject: Re: [PATCH PR95961] vect: ICE: in exact_di

[PATCH PR95961] vect: ICE: in exact_div, at poly-int.h:2182

2020-06-29 Thread Yangfei (Felix)
Hi, PR: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95961 In the test case for PR95961, vectorization factor computed by vect_determine_vectorization_factor is [8,8]. But this is updated to [1,1] later by vect_update_vf_for_slp. When we call vect_get_num_vectors in

[PATCH] vect: Use vect_relevant_for_alignment_p consistently

2020-06-18 Thread Yangfei (Felix)
Hi, Noticed two places in tree-vect-data-refs.c where we can use function vect_relevant_for_alignment_p. Looks like these two were missed when we were introducing the function. Bootstrapped and tested on aarch64-linux-gnu. OK to go? ChangeLog modification is contained in the patch.

RE: [PATCH] vect: Use LOOP_VINFO_DATAREFS and LOOP_VINFO_DDRS consistently

2020-06-15 Thread Yangfei (Felix)
Hi, > -Original Message- > From: Richard Biener [mailto:richard.guent...@gmail.com] > Sent: Monday, June 15, 2020 5:12 PM > To: Yangfei (Felix) > Cc: gcc-patches@gcc.gnu.org > Subject: Re: [PATCH] vect: Use LOOP_VINFO_DATAREFS and > LOOP_VINFO_DDRS consisten

RE: [PATCH] vect: Use LOOP_VINFO_DATAREFS and LOOP_VINFO_DDRS consistently

2020-06-15 Thread Yangfei (Felix)
Hi Richard, > -Original Message- > From: Richard Biener [mailto:richard.guent...@gmail.com] > Sent: Monday, June 15, 2020 3:25 PM > To: Yangfei (Felix) > Cc: gcc-patches@gcc.gnu.org > Subject: Re: [PATCH] vect: Use LOOP_VINFO_DATAREFS and > LOOP_VINFO_DDRS consiste

[PATCH] vect: Use LOOP_VINFO_DATAREFS and LOOP_VINFO_DDRS consistently

2020-06-12 Thread Yangfei (Felix)
Hi, This is a minor code refactoring in tree-vect-data-refs.c and tree-vect-loop.c. Use LOOP_VINFO_DATAREFS and LOOP_VINFO_DDRS when possible and rename several parameters to make code more consistent. Bootstrapped and tested on aarch64-linux-gnu. OK? Thanks, Felix gcc/ +2020-06-13 Felix

RE: [PATCH PR95570] vect: ICE: Segmentation fault in vect_loop_versioning

2020-06-11 Thread Yangfei (Felix)
Hi, > -Original Message- > From: Richard Sandiford [mailto:richard.sandif...@arm.com] > Sent: Friday, June 12, 2020 2:29 AM > To: Yangfei (Felix) > Cc: gcc-patches@gcc.gnu.org > Subject: Re: [PATCH PR95570] vect: ICE: Segmentation fault in > vect_loop_versioning

RE: [PATCH PR95570] vect: ICE: Segmentation fault in vect_loop_versioning

2020-06-11 Thread Yangfei (Felix)
Hi, > -Original Message- > From: Richard Sandiford [mailto:richard.sandif...@arm.com] > Sent: Thursday, June 11, 2020 12:23 AM > To: Yangfei (Felix) > Cc: gcc-patches@gcc.gnu.org > Subject: Re: [PATCH PR95570] vect: ICE: Segmentation fault in > vect_loop_versioning

[PATCH PR95570] vect: ICE: Segmentation fault in vect_loop_versioning

2020-06-10 Thread Yangfei (Felix)
Hi, PR: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95570 Here, we are doing loop versioning for alignment. The only dr here is a gather-scatter operation: x[start][i]. Scalar evolution analysis for this dr failed, so DR_STEP is NULL_TREE, which leads to the segfault. But

RE: [PATCH PR95254] aarch64: gcc generate inefficient code with fixed sve vector length

2020-06-04 Thread Yangfei (Felix)
Hi, > -Original Message- > From: Richard Sandiford [mailto:richard.sandif...@arm.com] > Sent: Tuesday, June 2, 2020 7:17 PM > To: Yangfei (Felix) > Cc: gcc-patches@gcc.gnu.org; Uros Bizjak ; Jakub > Jelinek ; Hongtao Liu ; H.J. Lu > > Subject: Re: [PATCH PR9525

RE: [PATCH PR95459] aarch64: ICE in aarch64_short_vector_p, at config/aarch64/aarch64.c:16803

2020-06-03 Thread Yangfei (Felix)
Hi Richard, > -Original Message- > From: Richard Sandiford [mailto:richard.sandif...@arm.com] > Sent: Wednesday, June 3, 2020 1:19 AM > To: Yangfei (Felix) > Cc: gcc-patches@gcc.gnu.org > Subject: Re: [PATCH PR95459] aarch64: ICE in aarch64_short_vector_p, at > con

[PATCH PR95459] aarch64: ICE in aarch64_short_vector_p, at config/aarch64/aarch64.c:16803

2020-06-02 Thread Yangfei (Felix)
Hi, Please review this trivial patch fixing an ICE in aarch64_short_vector_p. Bug: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95459 In aarch64_short_vector_p, we are simply checking whether a type (and a mode) is a 64/128-bit short vector or not. This should not be affected

PING: RE: [PATCH PR94026] combine missed opportunity to simplify comparisons with zero

2020-06-02 Thread Yangfei (Felix)
Gentle ping ... > -Original Message- > From: Yangfei (Felix) > Sent: Wednesday, May 27, 2020 11:52 AM > To: 'Segher Boessenkool' > Cc: gcc-patches@gcc.gnu.org; Zhanghaijian (A) > Subject: RE: [PATCH PR94026] combine missed opportunity to simplify > comparis

RE: [PATCH PR95254] aarch64: gcc generate inefficient code with fixed sve vector length

2020-06-01 Thread Yangfei (Felix)
Hi, > -Original Message- > From: Richard Sandiford [mailto:richard.sandif...@arm.com] > Sent: Monday, June 1, 2020 4:47 PM > To: Yangfei (Felix) > Cc: gcc-patches@gcc.gnu.org; Uros Bizjak ; Jakub > Jelinek ; Hongtao Liu ; H.J. Lu > > Subject: Re: [PATCH PR9525

RE: [PATCH PR95254] aarch64: gcc generate inefficient code with fixed sve vector length

2020-05-31 Thread Yangfei (Felix)
Hi, > -Original Message- > From: Richard Sandiford [mailto:richard.sandif...@arm.com] > Sent: Sunday, May 31, 2020 12:01 AM > To: Yangfei (Felix) > Cc: gcc-patches@gcc.gnu.org; Uros Bizjak ; Jakub > Jelinek ; Hongtao Liu ; H.J. Lu > > Subject: Re: [PATCH PR9525

RE: [PATCH PR95254] aarch64: gcc generate inefficient code with fixed sve vector length

2020-05-30 Thread Yangfei (Felix)
Hi, > -Original Message- > From: Yangfei (Felix) > Sent: Friday, May 29, 2020 2:56 PM > To: 'Hongtao Liu' ; H.J. Lu > Cc: gcc-patches@gcc.gnu.org; Uros Bizjak ; Jakub > Jelinek ; Richard Sandiford > > Subject: RE: [PATCH PR95254] aarch64: gcc generate inefficie

RE: [PATCH PR95254] aarch64: gcc generate inefficient code with fixed sve vector length

2020-05-29 Thread Yangfei (Felix)
Hi, > -Original Message- > From: Hongtao Liu [mailto:crazy...@gmail.com] > Sent: Friday, May 29, 2020 11:24 AM > To: H.J. Lu > Cc: Yangfei (Felix) ; gcc-patches@gcc.gnu.org; > Uros Bizjak ; Jakub Jelinek ; > Richard Sandiford > Subject: Re: [PATCH PR9525

RE: [PATCH PR95254] aarch64: gcc generate inefficient code with fixed sve vector length

2020-05-28 Thread Yangfei (Felix)
Hi, > -Original Message- > From: Richard Sandiford [mailto:richard.sandif...@arm.com] > Sent: Thursday, May 28, 2020 12:07 AM > To: Yangfei (Felix) > Cc: gcc-patches@gcc.gnu.org > Subject: Re: [PATCH PR95254] aarch64: gcc generate inefficient code with > fixed sve v

RE: [PATCH PR95254] aarch64: gcc generate inefficient code with fixed sve vector length

2020-05-27 Thread Yangfei (Felix)
Hi, > -Original Message- > From: Richard Sandiford [mailto:richard.sandif...@arm.com] > Sent: Tuesday, May 26, 2020 11:58 PM > To: Yangfei (Felix) > Cc: gcc-patches@gcc.gnu.org > Subject: Re: [PATCH PR95254] aarch64: gcc generate inefficient code with > fixed sve vec

RE: [PATCH PR94026] combine missed opportunity to simplify comparisons with zero

2020-05-26 Thread Yangfei (Felix)
Hi, > -Original Message- > From: Segher Boessenkool [mailto:seg...@kernel.crashing.org] > Sent: Tuesday, May 26, 2020 11:32 PM > To: Yangfei (Felix) > Cc: gcc-patches@gcc.gnu.org; Zhanghaijian (A) > Subject: Re: [PATCH PR94026] combine missed opportunity to simp

RE: [PATCH PR94026] combine missed opportunity to simplify comparisons with zero

2020-05-25 Thread Yangfei (Felix)
Hi, > -Original Message- > From: Segher Boessenkool [mailto:seg...@kernel.crashing.org] > Sent: Tuesday, May 26, 2020 12:27 AM > To: Yangfei (Felix) > Cc: gcc-patches@gcc.gnu.org; Zhanghaijian (A) > Subject: Re: [PATCH PR94026] combine missed opportunity to simp

RE: [PATCH PR94026] combine missed opportunity to simplify comparisons with zero

2020-05-24 Thread Yangfei (Felix)
Hi, > -Original Message- > From: Segher Boessenkool [mailto:seg...@kernel.crashing.org] > Sent: Saturday, May 23, 2020 10:57 PM > To: Yangfei (Felix) > Cc: gcc-patches@gcc.gnu.org; Zhanghaijian (A) > Subject: Re: [PATCH PR94026] combine missed opportunity to simp

RE: [PATCH PR95254] aarch64: gcc generate inefficient code with fixed sve vector length

2020-05-22 Thread Yangfei (Felix)
Hi Richard, Thanks for the suggestions. > -Original Message- > From: Richard Sandiford [mailto:richard.sandif...@arm.com] > Sent: Thursday, May 21, 2020 5:22 PM > To: Yangfei (Felix) > Cc: gcc-patches@gcc.gnu.org > Subject: Re: [PATCH PR95254] aarch64: gcc generat

[PATCH PR95254] aarch64: gcc generate inefficient code with fixed sve vector length

2020-05-21 Thread Yangfei (Felix)
Hi, Noticed a tiny SVE-related performance issue: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95254 For the given test case, SLP succeeds with VNx8HImode with or without option -msve-vector-bits=256. The root cause for the difference is that we choose a different mode in

RE: [PATCH PR94991] aarch64: ICE: Segmentation fault with option -mgeneral-regs-only

2020-05-11 Thread Yangfei (Felix)
Hi, > -Original Message- > From: Richard Sandiford [mailto:richard.sandif...@arm.com] > Sent: Monday, May 11, 2020 10:27 PM > To: Yangfei (Felix) > Cc: gcc-patches@gcc.gnu.org > Subject: Re: [PATCH PR94991] aarch64: ICE: Segmentation fault with option - > mgenera

[PATCH PR94991] aarch64: ICE: Segmentation fault with option -mgeneral-regs-only

2020-05-07 Thread Yangfei (Felix)
Hi, Witnessed another ICE with option -mgeneral-regs-only. I have created a bug for that: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94991 For the given testcase, we are doing FAIL for scalar floating move expand pattern since TARGET_FLOAT is false with option -mgeneral-regs-only.

RE: [PATCH PR94026] combine missed opportunity to simplify comparisons with zero

2020-05-06 Thread Yangfei (Felix)
Hi, > -Original Message- > From: Segher Boessenkool [mailto:seg...@kernel.crashing.org] > Sent: Tuesday, March 24, 2020 10:58 PM > To: Yangfei (Felix) > Cc: gcc-patches@gcc.gnu.org; Zhanghaijian (A) > Subject: Re: [PATCH PR94026] combine missed opportunity to simp

RE: [PATCH PR94784] ICE: in simplify_vector_constructor, at tree-ssa-forwprop.c:2482

2020-04-27 Thread Yangfei (Felix)
> -Original Message- > From: Richard Sandiford [mailto:richard.sandif...@arm.com] > Sent: Monday, April 27, 2020 6:10 PM > To: Yangfei (Felix) > Cc: gcc-patches@gcc.gnu.org > Subject: Re: [PATCH PR94784] ICE: in simplify_vector_constructor, at tree- > ssa-forwpr

RE: [PATCH PR94784] ICE: in simplify_vector_constructor, at tree-ssa-forwprop.c:2482

2020-04-27 Thread Yangfei (Felix)
> -Original Message- > From: Richard Sandiford [mailto:richard.sandif...@arm.com] > Sent: Monday, April 27, 2020 3:51 PM > To: Yangfei (Felix) > Cc: gcc-patches@gcc.gnu.org > Subject: Re: [PATCH PR94784] ICE: in simplify_vector_constructor, at tree- > ssa-forwprop

[PATCH PR94784] ICE: in simplify_vector_constructor, at tree-ssa-forwprop.c:2482

2020-04-27 Thread Yangfei (Felix)
Hi, I see one gcc_assert was introduced in: https://gcc.gnu.org/pipermail/gcc-patches/2020-April/544271.html This is causing an ICE for certain cases. I have created a PR for this: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94784 I did some checking and it looks like everything works fine

[PATCH] aarch64: add tests for CPP predefines under -mgeneral-regs-only

2020-04-23 Thread Yangfei (Felix)
Hi, I noticed that gcc.target/aarch64/pragma_cpp_predefs_1.c performs testing for -mgeneral-regs-only. This adds similar testing in the following two tests to make sure CPP predefine redefinitions on #pragma work as expected when the -mgeneral-regs-only option is specified (See

RE: [PATCH PR94678] aarch64: unexpected result with -mgeneral-regs-only and sve

2020-04-22 Thread Yangfei (Felix)
Hi, > -Original Message- > From: Richard Sandiford [mailto:richard.sandif...@arm.com] > Sent: Wednesday, April 22, 2020 6:03 PM > To: Yangfei (Felix) > Cc: gcc-patches@gcc.gnu.org > Subject: Re: [PATCH PR94678] aarch64: unexpected result with -mgeneral- > regs-onl

RE: [PATCH PR94678] aarch64: unexpected result with -mgeneral-regs-only and sve

2020-04-22 Thread Yangfei (Felix)
> -Original Message- > From: Richard Sandiford [mailto:richard.sandif...@arm.com] > Sent: Tuesday, April 21, 2020 6:11 PM > To: Yangfei (Felix) > Cc: gcc-patches@gcc.gnu.org > Subject: Re: [PATCH PR94678] aarch64: unexpected result with -mgeneral- > regs-on

RE: [PATCH PR94678] aarch64: unexpected result with -mgeneral-regs-only and sve

2020-04-21 Thread Yangfei (Felix)
> -Original Message- > From: Richard Sandiford [mailto:richard.sandif...@arm.com] > Sent: Tuesday, April 21, 2020 4:01 PM > To: Yangfei (Felix) > Cc: gcc-patches@gcc.gnu.org > Subject: Re: [PATCH PR94678] aarch64: unexpected result with -mgeneral- > regs-only and sve

[PATCH PR94678] aarch64: unexpected result with -mgeneral-regs-only and sve

2020-04-21 Thread Yangfei (Felix)
Hi, It looks like there are several issues out there for sve codegen with -mgeneral-regs-only. I have created a bug for that: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94678 We do ISA extension checks for SVE in check_required_extensions(aarch64-sve-builtins.cc). I think we may

RE: [PATCH] ICE: in vectorizable_load, at tree-vect-stmts.c:9173

2020-03-31 Thread Yangfei (Felix)
Hi! > -Original Message- > From: Richard Sandiford [mailto:richard.sandif...@arm.com] > Sent: Tuesday, March 31, 2020 4:55 PM > To: Yangfei (Felix) > Cc: gcc-patches@gcc.gnu.org; rguent...@suse.de > Subject: Re: [PATCH] ICE: in vectorizable_load, at tree-vect-stmts.c:9

RE: [PATCH] ICE: in vectorizable_load, at tree-vect-stmts.c:9173

2020-03-31 Thread Yangfei (Felix)
Hi! > -Original Message- > From: Richard Sandiford [mailto:richard.sandif...@arm.com] > Sent: Monday, March 30, 2020 8:08 PM > To: Yangfei (Felix) > Cc: gcc-patches@gcc.gnu.org; rguent...@suse.de > Subject: Re: [PATCH] ICE: in vectorizable_load, at tree-vect-stmts.c:

RE: [PATCH] ICE: in vectorizable_load, at tree-vect-stmts.c:9173

2020-03-30 Thread Yangfei (Felix)
Hi! > -Original Message- > From: Yangfei (Felix) > Sent: Monday, March 30, 2020 5:28 PM > To: gcc-patches@gcc.gnu.org > Cc: 'rguent...@suse.de' > Subject: [PATCH] ICE: in vectorizable_load, at tree-vect-stmts.c:9173 > > Hi, > > New bug: https://gcc.gnu.org/

[PATCH] ICE: in vectorizable_load, at tree-vect-stmts.c:9173

2020-03-30 Thread Yangfei (Felix)
Hi, New bug: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94398 With -mstrict-align, aarch64_builtin_support_vector_misalignment will return false when the misalignment factor is unknown at compile time. Then vect_supportable_dr_alignment returns dr_unaligned_unsupported, which triggers the ICE.

RE: [RFC] Should widening_mul should consider block frequency?

2020-03-26 Thread Yangfei (Felix)
> -Original Message- > From: Richard Biener [mailto:richard.guent...@gmail.com] > Sent: Thursday, March 26, 2020 3:37 PM > To: Yangfei (Felix) > Cc: gcc-patches@gcc.gnu.org > Subject: Re: [RFC] Should widening_mul should consider block frequency? > > > >

RE: [RFC] Should widening_mul should consider block frequency?

2020-03-25 Thread Yangfei (Felix)
Hi! > -Original Message- > From: Richard Biener [mailto:richard.guent...@gmail.com] > Sent: Tuesday, March 24, 2020 10:14 PM > To: Yangfei (Felix) > Cc: gcc-patches@gcc.gnu.org > Subject: Re: [RFC] Should widening_mul should consider block frequency? > > >

RE: [RFC] Should widening_mul should consider block frequency?

2020-03-24 Thread Yangfei (Felix)
Hi! > -Original Message- > From: Richard Biener [mailto:richard.guent...@gmail.com] > Sent: Monday, March 23, 2020 11:25 PM > To: Yangfei (Felix) > Cc: gcc-patches@gcc.gnu.org > Subject: Re: [RFC] Should widening_mul should consider block frequency? > > On Mon,

RE: [PATCH PR94026] combine missed opportunity to simplify comparisons with zero

2020-03-24 Thread Yangfei (Felix)
Hi! > -Original Message- > From: Segher Boessenkool [mailto:seg...@kernel.crashing.org] > Sent: Monday, March 23, 2020 8:10 PM > To: Yangfei (Felix) > Cc: gcc-patches@gcc.gnu.org; Zhanghaijian (A) > Subject: Re: [PATCH PR94026] combine missed opportunity to simp

[RFC] Should widening_mul should consider block frequency?

2020-03-23 Thread Yangfei (Felix)
Hi, I created a bug for this issue: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94269 Looks like widening_mul phase may move multiply instruction from outside the loop to inside the loop, merging with one add instruction inside the loop. This will increase the cost of the loop at

RE: [PATCH PR94026] combine missed opportunity to simplify comparisons with zero

2020-03-23 Thread Yangfei (Felix)
> -Original Message- > From: Segher Boessenkool [mailto:seg...@kernel.crashing.org] > Sent: Friday, March 20, 2020 9:38 AM > To: Yangfei (Felix) > Cc: gcc-patches@gcc.gnu.org; Zhanghaijian (A) > Subject: Re: [PATCH PR94026] combine missed opportunity to simplify >

RE: [PATCH PR94026] combine missed opportunity to simplify comparisons with zero

2020-03-18 Thread Yangfei (Felix)
Hi, > -Original Message- > From: Segher Boessenkool [mailto:seg...@kernel.crashing.org] > Sent: Thursday, March 19, 2020 7:52 AM > To: Yangfei (Felix) > Cc: gcc-patches@gcc.gnu.org; Zhanghaijian (A) > Subject: Re: [PATCH PR94026] combine missed opportunity to simp

RE: [PATCH PR94026] combine missed opportunity to simplify comparisons with zero

2020-03-16 Thread Yangfei (Felix)
Hi, > -Original Message- > From: Segher Boessenkool [mailto:seg...@kernel.crashing.org] > Sent: Tuesday, March 17, 2020 1:58 AM > To: Yangfei (Felix) > Cc: gcc-patches@gcc.gnu.org; Zhanghaijian (A) > Subject: Re: [PATCH PR94026] combine missed opportunity to simp

RE: [PATCH PR94026] combine missed opportunity to simplify comparisons with zero

2020-03-16 Thread Yangfei (Felix)
Hi, > -Original Message- > From: Segher Boessenkool [mailto:seg...@kernel.crashing.org] > Sent: Saturday, March 14, 2020 12:07 AM > To: Yangfei (Felix) > Cc: gcc-patches@gcc.gnu.org; Zhanghaijian (A) > Subject: Re: [PATCH PR94026] combine missed opportunity to simp

RE: [PATCH PR94026] combine missed opportunity to simplify comparisons with zero

2020-03-12 Thread Yangfei (Felix)
> -Original Message- > From: Segher Boessenkool [mailto:seg...@kernel.crashing.org] > Sent: Friday, March 13, 2020 7:50 AM > To: Yangfei (Felix) > Cc: gcc-patches@gcc.gnu.org; Zhanghaijian (A) > Subject: Re: [PATCH PR94026] combine missed opportunity to simplify >

RE: [PATCH PR94026] combine missed opportunity to simplify comparisons with zero

2020-03-05 Thread Yangfei (Felix)
> -Original Message- > From: Jeff Law [mailto:l...@redhat.com] > Sent: Thursday, March 5, 2020 11:37 PM > To: Yangfei (Felix) ; gcc-patches@gcc.gnu.org > Cc: Zhanghaijian (A) > Subject: Re: [PATCH PR94026] combine missed opportunity to simplify > comparisons with zero

[PATCH PR94026] combine missed opportunity to simplify comparisons with zero

2020-03-04 Thread Yangfei (Felix)
Hi, This is a simple fix for PR94026. With this fix, combine will try to make an extraction if we are in an equality comparison and this is an AND with a constant which is a power of two minus one. The shift here should be a constant. For example, combine will transform (compare (and
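
The condition mentioned in the message above (an AND mask that is a power of two minus one, used in an equality comparison) can be illustrated with a small C sketch. This is not the code from the patch; the helper names is_pow2_minus_one and low_bits_zero are hypothetical and only show the arithmetic involved.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not taken from the patch: returns true when
   MASK has the form 2^n - 1 for some n >= 1, i.e. a contiguous run
   of low set bits such as 0x1, 0x3 or 0xff.  */
static bool
is_pow2_minus_one (uint64_t mask)
{
  /* Adding one to 0...01...1 carries through every set bit, so the
     sum shares no bits with the original value.  */
  return mask != 0 && (mask & (mask + 1)) == 0;
}

/* For such a mask, (x & mask) == 0 inspects only the low N bits of X,
   which is the same as asking whether X shifted left by (64 - N) is
   zero.  Assumes 1 <= N <= 64.  */
static bool
low_bits_zero (uint64_t x, unsigned n)
{
  return (x << (64 - n)) == 0;
}
```

With a mask of 0xff (n = 8), low_bits_zero (x, 8) and (x & 0xff) == 0 agree for every x, which is why a comparison of this shape can be expressed as a bit-field extraction or a shift.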

Re: [PATCH, AArch64] atomics: prefetch the destination for write prior to ldxr/stxr loops

2016-03-07 Thread Yangfei (Felix)
> On Mon, Mar 7, 2016 at 7:27 PM, Yangfei (Felix) <felix.y...@huawei.com> wrote: > > Hi, > > > > As discussed in LKML: > http://lists.infradead.org/pipermail/linux-arm-kernel/2015-July/355996.html, > the > cost of changing a cache line > > from sh

[PATCH, AArch64] atomics: prefetch the destination for write prior to ldxr/stxr loops

2016-03-07 Thread Yangfei (Felix)
Hi, As discussed in LKML: http://lists.infradead.org/pipermail/linux-arm-kernel/2015-July/355996.html, the cost of changing a cache line from shared to exclusive state can be significant on aarch64 cores, especially when this is triggered by an exclusive store, since it may result

Re: [PATCH] Only accept BUILT_IN_NORMAL stringops for interesting_stringop_to_profile_p

2015-08-20 Thread Yangfei (Felix)
operations. + 2015-08-18 Segher Boessenkool seg...@kernel.crashing.org Backport from mainline: On Thu, Aug 20, 2015 at 5:17 AM, Yangfei (Felix) felix.y...@huawei.com wrote: Hi, As DECL_FUNCTION_CODE is overloaded for builtin functions in different classes, so need to check

[PATCH] Only accept BUILT_IN_NORMAL stringops for interesting_stringop_to_profile_p

2015-08-19 Thread Yangfei (Felix)
Hi, As DECL_FUNCTION_CODE is overloaded for builtin functions in different classes, we need to check the builtin class before using fcode. Patch posted below. Bootstrapped on x86_64-suse-linux, OK for trunk? Thanks. Index: gcc/value-prof.c
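
Why the class has to be checked before the function code is used can be shown with a toy model in C. This is a sketch under stated assumptions, not GCC's real data structures: the class names mirror GCC's built_in_class enum, while toy_fndecl, TOY_BUILT_IN_MEMCPY and toy_interesting_stringop_p are hypothetical.

```c
#include <stdbool.h>

/* Toy model only.  The enumerator names follow GCC's built_in_class;
   everything else here is made up for illustration.  */
enum built_in_class
{
  NOT_BUILT_IN,
  BUILT_IN_FRONTEND,
  BUILT_IN_MD,
  BUILT_IN_NORMAL
};

struct toy_fndecl
{
  enum built_in_class cls;
  unsigned code;   /* Meaning depends on CLS.  */
};

/* Hypothetical code value standing in for a normal-class string builtin.  */
enum { TOY_BUILT_IN_MEMCPY = 7 };

static bool
toy_interesting_stringop_p (const struct toy_fndecl *fn)
{
  /* Check the class first: a target (MD) builtin may reuse the same
     numeric code for an unrelated function.  */
  if (fn->cls != BUILT_IN_NORMAL)
    return false;
  return fn->code == TOY_BUILT_IN_MEMCPY;
}
```

In this model a declaration with class BUILT_IN_MD and code 7 is rejected, whereas comparing the code alone would have mistaken it for the string operation.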

[PING^2, AArch64] Add long-call attribute and pragma interfaces

2015-05-05 Thread Yangfei (Felix)
Patch ping ... On 04/02/2015 11:59 PM, Yangfei (Felix) wrote: Patch ping: https://gcc.gnu.org/ml/gcc-patches/2014-11/msg01148.html This patch needs documentation for the new attributes and pragmas before it can be committed. (Since this is a new feature I think it has to wait

Re: [PING, AArch64] Add long-call attribute and pragma interfaces

2015-04-12 Thread Yangfei (Felix)
On 04/02/2015 11:59 PM, Yangfei (Felix) wrote: Patch ping: https://gcc.gnu.org/ml/gcc-patches/2014-11/msg01148.html This patch needs documentation for the new attributes and pragmas before it can be committed. (Since this is a new feature I think it has to wait until stage 1, too

Re: [RFC AArch64] Implement TARGET_PROMOTE_FUNCTION_MODE for ILP32 code generation

2015-04-07 Thread Yangfei (Felix)
Hi Andrew, Sorry for the late reply. Seems that I misunderstood the AAPCS64 specification. Thanks for the clarification. On Mar 16, 2015, at 2:28 AM, Yangfei (Felix) felix.y...@huawei.com wrote: Hi, For this trivial testcase: extern int bar (int , int); int

[PING, AArch64] Add long-call attribute and pragma interfaces

2015-04-03 Thread Yangfei (Felix)
Patch ping: https://gcc.gnu.org/ml/gcc-patches/2014-11/msg01148.html Thanks.

[RFC AArch64] Implement TARGET_PROMOTE_FUNCTION_MODE for ILP32 code generation

2015-03-16 Thread Yangfei (Felix)
Hi, For this trivial testcase: extern int bar (int, int); int foo (int *a, int *b) { return bar (*a, *b); } I noticed that GCC generates redundant zero-extension instructions under ILP32 (aarch64-linux-gnu-gcc -S -O2 -mabi=ilp32). Assembly code: .arch armv8-a+fp+simd

[PING ^ 5] [PATCH, AARCH64] Add support for -mlong-calls option

2015-02-14 Thread Yangfei (Felix)
Ping ... Patch ping: https://gcc.gnu.org/ml/gcc-patches/2014-11/msg02258.html Any comments, Richard? Thanks.

Re: [PING] [PATCH] [AArch64, NEON] Add vfms_n_f32, vfmsq_n_f32 and vfmsq_n_f64 specified by the ACLE

2015-01-21 Thread Yangfei (Felix)
On 21/01/15 09:22, Yangfei (Felix) wrote: This is a ping for: https://gcc.gnu.org/ml/gcc-patches/2014-12/msg01008.html I updated the testcase adding test for vfmsq_n_f64 intrinsic. Test OK for both aarch64-linux-gnu and aarch64_be-linux-gnu-gcc. OK for the trunk? Thanks. Index

[PING] [PATCH] [AArch64, NEON] Add vfms_n_f32, vfmsq_n_f32 and vfmsq_n_f64 specified by the ACLE

2015-01-21 Thread Yangfei (Felix)
This is a ping for: https://gcc.gnu.org/ml/gcc-patches/2014-12/msg01008.html I updated the testcase adding test for vfmsq_n_f64 intrinsic. Test OK for both aarch64-linux-gnu and aarch64_be-linux-gnu-gcc. OK for the trunk? Thanks. Index: gcc/ChangeLog

[PING] [PATCH] [AArch64, NEON] Fix testcases add by r218484

2015-01-19 Thread Yangfei (Felix)
Hi, This is a ping for: https://gcc.gnu.org/ml/gcc-patches/2014-12/msg01328.html OK for the trunk? Thanks.

[PING ^ 4] [RFC PATCH, AARCH64] Add support for -mlong-calls option

2015-01-19 Thread Yangfei (Felix)
Patch ping: https://gcc.gnu.org/ml/gcc-patches/2014-11/msg02258.html Any comments, Richard? Thanks.

Re: [PATCH, autofdo] Some code cleanup

2015-01-17 Thread Yangfei (Felix)
Hi, I updated the patch adding one ChangeLog entry. OK for the trunk? Thanks. Index: gcc/ChangeLog === --- gcc/ChangeLog (revision 219297) +++ gcc/ChangeLog (working copy) @@ -1,3 +1,12 @@ +2015-01-17 Felix

Re: [PATCH] [AArch64, NEON] Improve vpmaxX vpminX intrinsics

2015-01-13 Thread Yangfei (Felix)
On 09/12/14 08:17, Yangfei (Felix) wrote: On 28 November 2014 at 09:23, Yangfei (Felix) felix.y...@huawei.com wrote: Hi, This patch converts vpmaxX vpminX intrinsics to use builtin functions instead of the previous inline assembly syntax. Regtested with aarch64-linux-gnu

[PATCH, autofdo] Some code cleanup

2015-01-12 Thread Yangfei (Felix)
Hi, The attached patch does some code cleanup for auto-profile.c: fix typos and remove some unnecessary MAX/MIN checks plus some else. OK for the trunk? Index: gcc/auto-profile.c === --- gcc/auto-profile.c (revision 219297)

[PATCH] Fix PR64240

2014-12-16 Thread Yangfei (Felix)
Hi, This patch fixes an obvious typo which may affect the DDG creation of SMS and make this optimization produce buggy code. Bootstrapped on x86_64-suse-linux. Also passed check-gcc test for aarch64-linux-gnu. OK for the trunk? Index: gcc/ddg.c

Re: [PATCH] Fix PR64240

2014-12-16 Thread Yangfei (Felix)
On December 16, 2014 9:51:25 AM CET, Yangfei (Felix) felix.y...@huawei.com wrote: Hi, This patch fixes an obvious typo which may affect the DDG creation of SMS and make this optimization produce buggy code. Bootstrapped on x86_64-suse-linux. Also passed check-gcc test for aarch64-linux

Re: [PATCH] [AArch64, NEON] Fix testcases add by r218484

2014-12-16 Thread Yangfei (Felix)
#define DECL_VABD_VAR(VAR) \ be careful with your cut and paste. VABD should probably be VFMA_N here, although it's purely a naming convention :-) The v3 patch attached fixed this minor issue. Thanks. It's OK for me with that change, but I'm not a maintainer. One

Re: [PATCH] [AArch64, NEON] Fix testcases add by r218484

2014-12-12 Thread Yangfei (Felix)
Thanks for reviewing the patch. See my comments inlined: This patch fixes these two issues. Three changes: 1. vfma_f32, vfmaq_f32, vfms_f32, vfmsq_f32 are only available for arm*-*-* target with the FMA feature, we take care of this through the macro __ARM_FEATURE_FMA. 2. vfma_n_f32

[PATCH] [AArch64, NEON] Add vfms_n_f32, vfmsq_n_f32 and vfmsq_n_f64 specified by the ACLE

2014-12-11 Thread Yangfei (Felix)
Hi, This patch adds three intrinsics that are required by the ACLE specification. A new testcase is added which covers vfms_n_f32 and vfmsq_n_f32. Tested on both aarch64-linux-gnu and aarch64_be-linux-gnu. OK? Index: gcc/ChangeLog

Re: [PING ^ 3][PATCH, AArch64] Add doloop_end pattern for -fmodulo-sched

2014-12-10 Thread Yangfei (Felix)
--- gcc/config/aarch64/aarch64.c(revision 217394) +++ gcc/config/aarch64/aarch64.c(working copy) @@ -10224,6 +10224,9 @@ aarch64_use_by_pieces_infrastructure_p (unsigned i #define TARGET_USE_BY_PIECES_INFRASTRUCTURE_P \ aarch64_use_by_pieces_infrastructure_p +#undef

Re: [COMMITTED] [PING] [PATCH] [AArch64, NEON] More NEON intrinsics improvement

2014-12-10 Thread Yangfei (Felix)
+__extension__ static __inline float32x2_t __attribute__ +((__always_inline__)) +vfms_f32 (float32x2_t __a, float32x2_t __b, float32x2_t __c) { + return __builtin_aarch64_fmav2sf (-__b, __c, __a); } + +__extension__ static __inline float32x4_t __attribute__

[PATCH] [AArch64, NEON] Fix testcases add by r218484

2014-12-10 Thread Yangfei (Felix)
Hi, We found that the committed patch was not correctly generated from our local branch. This caused some code necessary for the testcases to be missing. As pointed out by Christophe in https://gcc.gnu.org/ml/gcc-patches/2014-12/msg00778.html, we need to rework the testcases so that they can work

Re: [PATCH] [AArch64, NEON] Improve vpmaxX vpminX intrinsics

2014-12-09 Thread Yangfei (Felix)
On 28 November 2014 at 09:23, Yangfei (Felix) felix.y...@huawei.com wrote: Hi, This patch converts vpmaxX vpminX intrinsics to use builtin functions instead of the previous inline assembly syntax. Regtested with aarch64-linux-gnu on QEMU. Also passed the glorious testsuite

Re: [PATCH] [AArch64, NEON] Improve vpmaxX vpminX intrinsics

2014-12-09 Thread Yangfei (Felix)
You'll need to rebase over Alan Lawrance's patch. https://gcc.gnu.org/ml/gcc-patches/2014-12/msg00279.html Yes, see my new patch: https://gcc.gnu.org/ml/gcc-patches/2014-12/msg00750.html +;; Pairwise Integer Max/Min operations. +(define_insn aarch64_maxmin_unspmode + [(set

[PING ^ 3] [RFC PATCH, AARCH64] Add support for -mlong-calls option

2014-12-09 Thread Yangfei (Felix)
Hi, This is a ping for: https://gcc.gnu.org/ml/gcc-patches/2014-11/msg02258.html Thanks.

[PATCH, trivial] [AArch64] Remove declaration of removed function from aarch64-protos.h

2014-12-09 Thread Yangfei (Felix)
The definition of function aarch64_function_profiler has been removed since GCC 4.9, but the declaration is still there in aarch64-protos.h. So remove it. OK for the trunk? Index: gcc/ChangeLog === --- gcc/ChangeLog (revision

[COMMITTED] [PING] [PATCH] [AArch64, NEON] More NEON intrinsics improvement

2014-12-08 Thread Yangfei (Felix)
On 5 December 2014 at 18:44, Tejas Belagod tejas.bela...@arm.com wrote: +__extension__ static __inline float32x2_t __attribute__ +((__always_inline__)) +vfms_f32 (float32x2_t __a, float32x2_t __b, float32x2_t __c) { + return __builtin_aarch64_fmav2sf (-__b, __c, __a); } +

[PING] [PATCH] [AArch64, NEON] More NEON intrinsics improvement

2014-12-03 Thread Yangfei (Felix)
Any comments? Thanks. Hi, This patch converts more intrinsics to use builtin functions instead of the previous inline assembly syntax. Passed the glorious testsuite of Christophe Lyon. Three testcases are added for the testing of intrinsics which are not covered by

[PATCH] [AArch64, NEON] Improve vpmaxX vpminX intrinsics

2014-11-28 Thread Yangfei (Felix)
Hi, This patch converts vpmaxX vpminX intrinsics to use builtin functions instead of the previous inline assembly syntax. Regtested with aarch64-linux-gnu on QEMU. Also passed the glorious testsuite of Christophe Lyon. OK for the trunk? Index: gcc/ChangeLog

Re: [PATCH PR59593] [arm] Backport r217772 r217826 to 4.8 4.9

2014-11-28 Thread Yangfei (Felix)
I've backported this fix to 4.8 4.9 branch. These patches have been tested for armeb-none-eabi-gcc/g++ with qemu, and both the test results were ok. Looks OK with me. Ramana, is this OK for the 4.8 4.9 branches? Thanks.

Re: [PATCH, PR63742][ARM] Fix arm *movhi_insn_arch4 pattern for big-endian

2014-11-20 Thread Yangfei (Felix)
On 19/11/14 09:29, Yangfei (Felix) wrote: Sorry for missing the point. It seems to me that 't2' here will conflict with condition of the pattern *movhi_insn_arch4: TARGET_ARM arm_arch4 (register_operand (operands[0], HImode) || register_operand (operands[1

Re: [PATCH, PR63742][ARM] Fix arm *movhi_insn_arch4 pattern for big-endian

2014-11-20 Thread Yangfei (Felix)
On 19/11/14 09:29, Yangfei (Felix) wrote: Sorry for missing the point. It seems to me that 't2' here will conflict with condition of the pattern *movhi_insn_arch4: TARGET_ARM arm_arch4 (register_operand (operands[0], HImode) || register_operand

Re: [PATCH, PR63742][ARM] Fix arm *movhi_insn_arch4 pattern for big-endian

2014-11-19 Thread Yangfei (Felix)
Sorry for missing the point. It seems to me that 't2' here will conflict with condition of the pattern *movhi_insn_arch4: TARGET_ARM arm_arch4 (register_operand (operands[0], HImode) || register_operand (operands[1], HImode)) #define TARGET_ARM

[PATCH] [AArch64, NEON] More NEON intrinsics improvement

2014-11-18 Thread Yangfei (Felix)
Hi, This patch converts more intrinsics to use builtin functions instead of the previous inline assembly syntax. Passed the glorious testsuite of Christophe Lyon. Three testcases are added for the testing of intrinsics which are not covered by the testsuite:

Re: [PATCH 0/3][AArch64]More intrinsics/builtins improvements

2014-11-18 Thread Yangfei (Felix)
Yangfei (Felix) wrote: These three are logically independent, but all on a common theme, and I've tested them all together by bootstrapped + check-gcc on aarch64-none-elf cross-tested check-gcc on aarch64_be-none-elf Ok for trunk? Hi Alan, It seems that we are duplicating

Re: [PING ^ 3][PATCH, AArch64] Add doloop_end pattern for -fmodulo-sched

2014-11-18 Thread Yangfei (Felix)
= TARGET_INITIALIZER; #include gt-aarch64.h On 17 November 2014 07:59, Yangfei (Felix) felix.y...@huawei.com wrote: +2014-11-13 Felix Yang felix.y...@huawei.com + + * config/aarch64/aarch64.md (doloop_end): New pattern. + This looks like a straight copy of the ARM code, but without

Re: [PATCH, PR63742][ARM] Fix arm *movhi_insn_arch4 pattern for big-endian

2014-11-18 Thread Yangfei (Felix)
On 06/11/14 08:35, Yangfei (Felix) wrote: The idea is simple: Use movw for certain const source operand instead of ldrh. And exclude the const values which cannot be handled by mov/mvn/movw. I am doing regression test for this patch. Assuming no issue pops up, OK
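
As background to the idea quoted above, the sketch below shows one way to decide whether a constant can be materialised by a single mov, mvn or movw instead of a load. It is not the code from the patch: the function names are hypothetical, and it assumes the standard A32 rule that a data-processing immediate is an 8-bit value rotated right by an even amount, and that movw (available from ARMv6T2) accepts any 16-bit value.

```c
#include <stdbool.h>
#include <stdint.h>

/* Rotate a 32-bit value left by n bits (n taken modulo 32). */
static uint32_t rotl32(uint32_t v, unsigned n)
{
    n &= 31;
    return n ? (v << n) | (v >> (32 - n)) : v;
}

/* True if v is an A32 "modified immediate": an 8-bit value rotated
   right by an even amount.  Rotating left by the same amount undoes
   the encoding, so we look for a rotation that fits in 8 bits.  */
static bool is_a32_modified_imm(uint32_t v)
{
    for (unsigned rot = 0; rot < 32; rot += 2)
        if (rotl32(v, rot) <= 0xFFu)
            return true;
    return false;
}

/* Hypothetical helper: true if v can be built by one mov, mvn or movw. */
static bool const_fits_mov_mvn_movw(uint32_t v)
{
    return is_a32_modified_imm(v)     /* mov  rd, #v          */
        || is_a32_modified_imm(~v)    /* mvn  rd, #~v         */
        || v <= 0xFFFFu;              /* movw rd, #v (v6T2+)  */
}
```

Under these assumptions 0x1234 is accepted only through the movw case, 0xFFFFFFFE through the mvn case, and 0x12345678 is rejected and would still need a load.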

Re: [PING ^ 3][PATCH, AArch64] Add doloop_end pattern for -fmodulo-sched

2014-11-18 Thread Yangfei (Felix)
On 11/18/2014 11:48 AM, Yangfei (Felix) wrote: +(define_expand doloop_end + [(use (match_operand 0 )) ; loop pseudo + (use (match_operand 1 ))] ; label + +{ + /* Currently SMS relies on the do-loop pattern to recognize loops + where (1) the control part consists

[PING ^ 2][RFC PATCH, AARCH64] Add support for -mlong-calls option

2014-11-18 Thread Yangfei (Felix)
Ping again? Any comment please? Ping? I hope this patch can catch up with stage 1 of GCC-5.0. Thanks. Hi Felix, Sorry for the delay responding, I've been out of the office recently and I'm only just catching up on a backlog of GCC related emails. I'm in two minds

Re: [PATCH, PR63742][ARM] Fix arm *movhi_insn_arch4 pattern for big-endian

2014-11-18 Thread Yangfei (Felix)
On 18/11/14 11:02, Yangfei (Felix) wrote: On 06/11/14 08:35, Yangfei (Felix) wrote: The idea is simple: Use movw for certain const source operand instead of ldrh. And exclude the const values which cannot be handled by mov/mvn/movw. I am doing regression test

Re: [PING ^ 2][RFC PATCH, AARCH64] Add support for -mlong-calls option

2014-11-18 Thread Yangfei (Felix)
On Tue, Nov 18, 2014 at 11:51 AM, Yangfei (Felix) felix.y...@huawei.com wrote: Ping again? Any comment please? Pinging daily is only going to irritate people. Please desist from doing so. Ramana Oh, thanks for reminding me. And sorry if this bothers you guys. The end of stage1

Re: [PING][PATCH] [AARCH64, NEON] Improve vcls(q?) vcnt(q?) and vld1(q?)_dup intrinsics

2014-11-18 Thread Yangfei (Felix)
On 17 November 2014 06:58, Yangfei (Felix) felix.y...@huawei.com wrote: PING? BTW: It seems that Alan's way of improving vld1(q?)_dup intrinsic is more elegant. So is the improvement of vcls(q?) vcnt(q?) OK for trunk? Thanks. Please rebase over Alan's patch and repost, thank you

[PING ^ 3][PATCH, AArch64] Add doloop_end pattern for -fmodulo-sched

2014-11-17 Thread Yangfei (Felix)
PING? Is it OK for me to apply this patch? Thanks. On 11/12/2014 11:01 AM, Yangfei (Felix) wrote: +(define_expand doloop_end + [(use (match_operand 0 )) ; loop pseudo + (use (match_operand 1 ))] ; label + + +{ Drop the surrounding the { }. r
