Re: [testsuite] Fix PR93935 to guard case under vect_hw_misalign

2020-03-11 Thread Kewen.Lin via Gcc-patches
Hi, Gentle ping this patch, also request to backport to gcc9 after some burn-in time. BR, Kewen on 2020/2/26 下午2:17, Kewen.Lin wrote: > Hi, > > This patch is to apply the same fix as r267528 to another similar case > bb-slp-over-widen-2.c which requires misaligned vector access. > > Verified

[PATCH, vect] Check alignment for no peeling gaps handling

2020-04-10 Thread Kewen.Lin via Gcc-patches
Hi, This is one fix following Richi's comments here: https://gcc.gnu.org/pipermail/gcc-patches/2020-March/542232.html I noticed the current half vector support for no peeling gaps handled some cases which never check the half size vector support. By further investigation, those cases are safe

Re: [RFC] split pseudos during loop unrolling in RTL unroller

2020-04-15 Thread Kewen.Lin via Gcc-patches
on 2020/4/15 下午2:21, Richard Biener via Gcc-patches wrote: > On Wed, Apr 15, 2020 at 3:56 AM Jiufu Guo via Gcc-patches > wrote: >> >> Hi, >> >> As you may know, we have loop unroll pass in RTL which was introduced a few >> years ago, and works for a long time. Currently, this unroller is using

[PATCH, testsuite] Fix PR94079 by respecting vect_hw_misalign

2020-04-08 Thread Kewen.Lin via Gcc-patches
Hi, This is another vect case which requires special handling with vect_hw_misalign. The alignment of the second part requires misaligned vector access support. This patch is to adjust the related guard condition and comments. Verified it on ppc64-redhat-linux (Power7 BE). Is it ok for

[PATCH] Fix PR94043 by making vect_live_op generate lc-phi

2020-03-30 Thread Kewen.Lin via Gcc-patches
Hi, As PR94043 shows, my commit r10-4524 exposed one issue in vectorizable_live_operation, which inserts one extra BB before the single exit, leading to unexpected operand expansion and an unexpected loop depth assertion. As Richi suggested, this patch is to teach vectorizable_live_operation to

[PATCH] Fix PR94401 by considering reverse overrun

2020-04-02 Thread Kewen.Lin via Gcc-patches
Hi, The commit r10-7415 brings scalar type consideration to eliminate epilogue peeling for gaps, but it exposed one problem that the current handling doesn't consider the memory access type VMAT_CONTIGUOUS_REVERSE, for which the overrun happens on low address side. This patch is to make the

Re: [PATCH] Fix PR94401 by considering reverse overrun

2020-04-02 Thread Kewen.Lin via Gcc-patches
Hi, on 2020/4/2 下午4:28, Jakub Jelinek wrote: > Hi! > > On Thu, Apr 02, 2020 at 03:15:42PM +0800, Kewen.Lin via Gcc-patches wrote: > > Just formatting nits, not commenting on what the actual patch does. > >> --- a/gcc/tree-vect-stmts.c >> +++ b/gcc/tree-vect-s

Re: [PATCH] Fix PR94401 by considering reverse overrun

2020-04-02 Thread Kewen.Lin via Gcc-patches
on 2020/4/2 下午5:21, Richard Biener wrote: > On Thu, Apr 2, 2020 at 9:15 AM Kewen.Lin wrote: >> >> Hi, >> >> The commit r10-7415 brings scalar type consideration >> to eliminate epilogue peeling for gaps, but it exposed >> one problem that the current handling doesn't consider >> the memory access

[PATCH] Fix PR94443 with gsi_insert_seq_before

2020-04-02 Thread Kewen.Lin via Gcc-patches
on 2020/4/2 上午6:51, H.J. Lu wrote: > > This caused: > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94449 > Thanks for reporting this. The attached patch is to fix the stupid mistake by using gsi_insert_seq_before instead of gsi_insert_before. BTW, the regression testing on one x86_64

[PATCH v3] Fix PR90332 by extending half size vector mode

2020-03-26 Thread Kewen.Lin via Gcc-patches
Hi Richi, on 2020/3/25 下午4:25, Richard Biener wrote: > On Tue, Mar 24, 2020 at 9:30 AM Kewen.Lin wrote: >> >> Hi, >> >> The new version with refactoring has been attached. >> Bootstrapped/regtested on powerpc64le-linux-gnu (LE) P8 and P9. >> >> Is it ok for trunk? > > Yes. > Thanks! I'm

Re: [PATCH, vect] Check alignment for no peeling gaps handling

2020-04-28 Thread Kewen.Lin via Gcc-patches
Hi, Gentle ping for this patch. https://gcc.gnu.org/pipermail/gcc-patches/2020-April/543701.html BR, Kewen on 2020/4/10 下午5:28, Kewen.Lin via Gcc-patches wrote: > Hi, > > This is one fix following Richi's comments here: > https://gcc.gnu.org/pipermail/gcc-patches/2020-March/542232

[PATCH 3/4 V3 GCC11] IVOPTs Consider cost_step on different forms during unrolling

2020-05-12 Thread Kewen.Lin via Gcc-patches
Hi, Updated to v3 according to 2/4's param change. BR, Kewen --- gcc/ChangeLog 2020-MM-DD Kewen Lin * tree-ssa-loop-ivopts.c (struct iv_group): New field reg_offset_p. (struct iv_cand): New field reg_offset_p. (struct ivopts_data): New field

Re: [PATCH 2/4 GCC11] Add target hook stride_dform_valid_p

2020-05-12 Thread Kewen.Lin via Gcc-patches
Hi, I'd like to ping this patch as well as its siblings. Thanks in advance. 1/4 v3 https://gcc.gnu.org/pipermail/gcc-patches/2020-February/540171.html 2/4 v3 https://gcc.gnu.org/pipermail/gcc-patches/2020-March/541387.html 3/4 v3 https://gcc.gnu.org/pipermail/gcc-patches/2020-May/545643.html

Re: [PATCH] Fix PR90332 by extending half size vector mode

2020-03-18 Thread Kewen.Lin via Gcc-patches
on 2020/3/18 下午6:40, Richard Biener wrote: > On Wed, Mar 18, 2020 at 11:39 AM Richard Biener > wrote: >> >> On Wed, Mar 18, 2020 at 11:06 AM Kewen.Lin wrote: >>> >>> Hi, >>> >>> As PR90332 shows, the current scalar epilogue peeling for gaps >>> elimination requires expected vec_init optab with

Re: [PATCH] Fix PR90332 by extending half size vector mode

2020-03-18 Thread Kewen.Lin via Gcc-patches
Hi Richi, Thanks for your comments. on 2020/3/18 下午6:39, Richard Biener wrote: > On Wed, Mar 18, 2020 at 11:06 AM Kewen.Lin wrote: >> >> Hi, >> >> As PR90332 shows, the current scalar epilogue peeling for gaps >> elimination requires expected vec_init optab with two half size >> vector mode.

[PATCH] Fix PR90332 by extending half size vector mode

2020-03-18 Thread Kewen.Lin via Gcc-patches
Hi, As PR90332 shows, the current scalar epilogue peeling for gaps elimination requires expected vec_init optab with two half size vector mode. On Power, we don't support vector mode like V8QI, so can't support optab like vec_initv16qiv8qi. But we want to leverage existing scalar mode like DI
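A minimal C sketch of the idea in the message above, not GCC's implementation: when no 8-byte vector type is available, the two halves of a 16-byte vector can be carried in 64-bit integers instead. The names `vec16` and `vec16_from_halves` are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a 16-byte vector (the role of V16QI). */
typedef struct { uint8_t b[16]; } vec16;

/* Build the 16-byte value from two 8-byte pieces that are held in
   64-bit integers (the role of DI mode) rather than in 8-byte
   vectors (the role of V8QI). */
static vec16 vec16_from_halves(const uint8_t *lo_src, const uint8_t *hi_src)
{
    uint64_t lo, hi;
    vec16 v;

    memcpy(&lo, lo_src, sizeof lo);   /* first half as one scalar load */
    memcpy(&hi, hi_src, sizeof hi);   /* second half as one scalar load */
    memcpy(v.b, &lo, sizeof lo);      /* place the halves side by side */
    memcpy(v.b + 8, &hi, sizeof hi);
    return v;
}
```

The sketch only shows the data movement; whether a target can do this profitably depends on its optabs, which is what the patch discussion is about.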

[PATCH v2] Fix PR90332 by extending half size vector mode

2020-03-24 Thread Kewen.Lin via Gcc-patches
Hi, on 2020/3/18 下午11:10, Richard Biener wrote: > On Wed, Mar 18, 2020 at 2:56 PM Kewen.Lin wrote: >> >> Hi Richi, >> >> Thanks for your comments. >> >> on 2020/3/18 下午6:39, Richard Biener wrote: >>> On Wed, Mar 18, 2020 at 11:06 AM Kewen.Lin wrote: >> This path can define overrun_p to

Re: [PATCH] test/rs6000: Add Power9 and up as vect_len target

2020-08-31 Thread Kewen.Lin via Gcc-patches
Hi Will, Thanks for the review! on 2020/9/1 上午1:13, will schmidt wrote: > On Mon, 2020-08-31 at 14:43 +0800, Kewen.Lin via Gcc-patches wrote: >> Hi, >> >> Power9 supports vector with length in bytes load/store, this patch >> is to teach check_effective_target_

Re: [PATCH v2] testsuite: Update some vect cases for partial vectors

2020-08-30 Thread Kewen.Lin via Gcc-patches
Hi Richard, > >> +# Return true if loops using partial vectors are supported but only for >> loops >> +# whose need to iterate can be removed, that is, value of >> +# param_vect_partial_vector_usage is set to 1. > > For these comments, I think it would be good to use the sourcebuild.texi >

PING [PATCH 1/4] unroll: Add middle-end unroll factor estimation

2020-08-30 Thread Kewen.Lin via Gcc-patches
Hi, I'd like to gentle ping this since IVOPTs part is already to land. https://gcc.gnu.org/pipermail/gcc-patches/2020-May/546698.html BR, Kewen on 2020/5/28 下午8:19, Kewen.Lin via Gcc-patches wrote: > > gcc/ChangeLog > > 2020-MM-DD Kewen Lin > > * cfgloop.h (struc

[PATCH] test/rs6000: Add Power9 and up as vect_len target

2020-08-31 Thread Kewen.Lin via Gcc-patches
Hi, Power9 supports vector load/store with length in bytes; this patch teaches check_effective_target_vect_len_load_store to treat Power9 and later as effective vector-with-length targets. It also supplements the documentation for has_arch_pwr*. Bootstrapped/regtested on powerpc64le-linux-gnu P8.

[PATCH,GCC9]rs6000: Backport fixes for PR92923 and PR93136

2020-08-30 Thread Kewen.Lin via Gcc-patches
Hi, This patch is to backport the fix for PR92923 and its subsequent fix for PR93136 to the GCC-9 branch. We found that builtin functions needlessly using VIEW_CONVERT_EXPRs on their operands can cause a noticeable performance issue, especially when they are in a hotspot. One typical case is

Re: [PATCH 3/4 v3] ivopts: Consider cost_step on different forms during unrolling

2020-09-01 Thread Kewen.Lin via Gcc-patches
Hi Segher, on 2020/9/1 上午3:41, Segher Boessenkool wrote: > Hi! > > Just a note: > > On Tue, Aug 25, 2020 at 08:46:55PM +0800, Kewen.Lin wrote: >> 1) Currently address_cost hook on rs6000 always return zero, but at least >> from Power7, pre_inc/pre_dec kind instructions are cracked, it means we

Re: [PATCH 3/4 v3] ivopts: Consider cost_step on different forms during unrolling

2020-09-01 Thread Kewen.Lin via Gcc-patches
Hi Bin, >> 2) This case makes me think we should exclude ainc candidates in function >> mark_reg_offset_candidates. The justification is that: ainc candidate >> handles step update itself and when we calculate the cost for it against >> its ainc_use, the cost_step has been reduced. When

Re: [PATCH 3/4 v3] ivopts: Consider cost_step on different forms during unrolling

2020-09-01 Thread Kewen.Lin via Gcc-patches
Hi Bin, I've updated the patch to punt ainc_use candidates as below: > + /* Skip AINC candidate since it contains address update itself, > +the replicated AINC computations when unrolling still have > +updates, unlike reg_offset_p candidates

Re: [PATCH] test/rs6000: Add Power9 and up as vect_len target

2020-08-31 Thread Kewen.Lin via Gcc-patches
Hi Segher, >> proc check_effective_target_vect_len_load_store { } { >> -return 0 >> +return [expr { [check_effective_target_has_arch_pwr9] }] >> } > > Why not just > > return check_effective_target_has_arch_pwr9; > > ? (Or lose at least two pairs of brackets if not all three :-) )

[PATCH] test/rs6000: Replace test target p8 and p9+

2020-08-31 Thread Kewen.Lin via Gcc-patches
Hi, This is a trivial patch to clean existing rs6000 test targets p8 and p9+ with existing has_arch_pwr8 and has_arch_pwr9 target combination or only one of them. Not sure if it's a good idea to tidy this, but send out for comments. Bootstrapped/regtested on powerpc64le-linux-gnu P9. Any

[PATCH] rs6000: Use direct move for char/short vector CTOR [PR96933]

2020-09-08 Thread Kewen.Lin via Gcc-patches
Hi, This patch is to make vector CTOR with char/short leverage direct move instructions when they are available. With one constructed test case, it can speed up 145% for char and 190% for short on P9. Tested SPEC2017 x264_r at -Ofast on P9, it gets 1.61% speedup (but based on unexpected SLP see

[PATCH v2] rs6000: Use direct move for char/short vector CTOR [PR96933]

2020-09-09 Thread Kewen.Lin via Gcc-patches
Hi, As Segher's suggestion in the PR, for 128bit_direct_move, this new version leverages vector pack insns instead of vector perms with one control vector. The performance evaluation shows that it's on par with the previous version for char, while it's better than the previous for short.

PING^2 [PATCH 1/4] unroll: Add middle-end unroll factor estimation

2020-09-15 Thread Kewen.Lin via Gcc-patches
Hi, Gentle ping this: https://gcc.gnu.org/pipermail/gcc-patches/2020-May/546698.html BR, Kewen on 2020/8/31 下午1:49, Kewen.Lin via Gcc-patches wrote: > Hi, > > I'd like to gentle ping this since IVOPTs part is already to land. > > https://gcc.gnu.org/pipermail/gcc-patches/2020-

Re: [PATCH 3/4 v3] ivopts: Consider cost_step on different forms during unrolling

2020-09-15 Thread Kewen.Lin via Gcc-patches
Hi Hans, on 2020/9/6 上午10:47, Hans-Peter Nilsson wrote: > On Tue, 1 Sep 2020, Bin.Cheng via Gcc-patches wrote: >>> Great idea! With explicitly specified -funroll-loops, it's bootstrapped >>> but the regression testing did show one failure (the only one): >>> >>> PASS->FAIL: gcc.dg/sms-4.c

[PATCH v2] rs6000: Remove useless insns fed into lvx/stvx [PR97019]

2020-09-15 Thread Kewen.Lin via Gcc-patches
Hi Segher, Thanks for the review! >> * config/rs6000/rs6000-p8swap.c (insn_rtx_pair_t): New type. > > Please don't do that. The "first" and "second" are completely > meaningless. Also, keeping it separate arrays can very well result in > better machine code, and certainly makes easier

Re: [PATCH v2] rs6000: Remove useless insns fed into lvx/stvx [PR97019]

2020-09-15 Thread Kewen.Lin via Gcc-patches
Hi Segher, Thanks for your suggestions! >> + for (unsigned i = 0; i < and_insns.length (); ++i) > > "i++" is used more often, is more traditional. > Updated. >> --- /dev/null >> +++ b/gcc/testsuite/gcc.target/powerpc/pr97019.c >> @@ -0,0 +1,82 @@ >> +/* This issue can only exist on

[PATCH]rs6000: Remove useless insns fed into lvx/stvx [PR97019]

2020-09-14 Thread Kewen.Lin via Gcc-patches
Hi, This patch is to extend the existing function find_alignment_op to check that all definitions of base_reg are AND operations with mask -16B to force the alignment. If all are satisfied, it passes all AND operations and instructions in one vector to recombine_lvx_pattern and recombine_stvx_pattern,
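As a rough illustration of why an AND with -16 in front of lvx/stvx can be redundant: those instructions ignore the low four bits of the effective address, so masking the address beforehand does not change which 16-byte block is accessed. The sketch below models the address arithmetic only; the function names are hypothetical and it says nothing about how the RTL pass itself is written.

```c
#include <assert.h>
#include <stdint.h>

/* The explicit alignment step: AND the address with -16,
   i.e. clear the low four bits. */
static uintptr_t and_minus_16(uintptr_t addr)
{
    return addr & (uintptr_t)-16;
}

/* Model of the address an lvx/stvx actually accesses: the low four
   bits of the effective address are ignored. */
static uintptr_t lvx_effective_address(uintptr_t addr)
{
    return addr & ~(uintptr_t)15;
}
```

Under this model, `lvx_effective_address(and_minus_16(a))` equals `lvx_effective_address(a)` for every address `a`, which is the sense in which the AND feeding the load or store is useless.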

Re: [PATCH 3/4 v3] ivopts: Consider cost_step on different forms during unrolling

2020-09-04 Thread Kewen.Lin via Gcc-patches
Hi Segher, >> Good question! I agree that they can execute in parallel, but it depends >> on how we interpret the addressing cost, if it's for required execution >> resource, I think it's off, since comparing with ld, the ldu has two iops >> and extra ALU requirement. > > OTOH, if you do not

Re: [PATCH 3/4 v3] ivopts: Consider cost_step on different forms during unrolling

2020-09-02 Thread Kewen.Lin via Gcc-patches
Hi Segher, on 2020/9/2 下午6:25, Segher Boessenkool wrote: > Hi! > > On Wed, Sep 02, 2020 at 11:16:00AM +0800, Kewen.Lin wrote: >> on 2020/9/1 上午3:41, Segher Boessenkool wrote: >>> On Tue, Aug 25, 2020 at 08:46:55PM +0800, Kewen.Lin wrote: 1) Currently address_cost hook on rs6000 always

Re: [PATCH] vec: remove unreachable code

2020-09-04 Thread Kewen.Lin via Gcc-patches
Hi Andrea, on 2020/9/4 下午8:11, Andrea Corallo wrote: > Hi all, > > just a small patch removing a piece of unreachable code in > 'vect_estimate_min_profitable_iters' given the condition > (LOOP_VINFO_USING_PARTIAL_VECTORS_P (loop_vinfo)) is always true as > checked just above. > FWIW, I had

Re: [PATCH 3/4 v3] ivopts: Consider cost_step on different forms during unrolling

2020-09-04 Thread Kewen.Lin via Gcc-patches
Hi Segher, on 2020/9/4 下午10:16, Segher Boessenkool wrote: > Hi! > > On Fri, Sep 04, 2020 at 04:47:37PM +0800, Kewen.Lin wrote: Apart from that, one P9 specific point is that the update form load isn't preferred, the reason is that the instruction can not retire until both parts

PING^1 [PATCH v2] rs6000: Use direct move for char/short vector CTOR [PR96933]

2020-10-13 Thread Kewen.Lin via Gcc-patches
Hi, I'd like to gentle ping this patch: https://gcc.gnu.org/pipermail/gcc-patches/2020-September/553555.html BR, Kewen on 2020/9/10 上午11:19, Kewen.Lin via Gcc-patches wrote: > Hi, > > As Segher's suggestion in the PR, for 128bit_direct_move, this new > version leverages vecto

[PATCH v2] pass: Run cleanup passes before SLP [PR96789]

2020-10-13 Thread Kewen.Lin via Gcc-patches
Hi! >> Can you repeat the compile-time measurement there? I also wonder >> whether we should worry about compile-time at -O[12] when SLP is not run. >> Thus, probably rename the cleanup pass to pre_slp_scalar_cleanup and >> gate it on && flag_slp_vectorize > > Good idea, will evaluate it. >

PING^3 [PATCH 1/4] unroll: Add middle-end unroll factor estimation

2020-10-13 Thread Kewen.Lin via Gcc-patches
Hi, Gentle ping this: https://gcc.gnu.org/pipermail/gcc-patches/2020-May/546698.html BR, Kewen on 2020/9/15 下午3:44, Kewen.Lin via Gcc-patches wrote: > Hi, > > Gentle ping this: > > https://gcc.gnu.org/pipermail/gcc-patches/2020-May/546698.html > > BR, > Kewen

[PATCH 3/4 v3] ivopts: Consider cost_step on different forms during unrolling

2020-08-25 Thread Kewen.Lin via Gcc-patches
Hi Bin, >> >> For one particular case like: >> >> for (i = 0; i < SIZE; i++) >> y[i] = a * x[i] + z[i]; >> >> we will mark reg_offset_p for IV candidates on x as below: >>- (unsigned long) (x_18(D) + 8)// only mark this before. >>- x_18(D) + 8 >>-
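To make the quoted loop concrete, here is a hedged C sketch of what unrolling by two does to its address computations. It assumes 8-byte `double` elements (the message does not state the type) and a hand-written unroll factor of two; the function names are hypothetical. The point is that the second copy of the body can address `x[i + 1]` as the same base plus a constant 8-byte offset, so only one induction variable update is needed per unrolled iteration.

```c
#include <assert.h>
#include <stddef.h>

/* Original form: one element per iteration. */
static void axpz(double *y, const double *x, const double *z,
                 double a, size_t n)
{
    for (size_t i = 0; i < n; i++)
        y[i] = a * x[i] + z[i];
}

/* Hand-unrolled by two: both copies of the body share one induction
   variable; the second copy uses base + constant offset addressing. */
static void axpz_unrolled2(double *y, const double *x, const double *z,
                           double a, size_t n)
{
    size_t i = 0;

    for (; i + 1 < n; i += 2) {
        y[i]     = a * x[i]     + z[i];
        y[i + 1] = a * x[i + 1] + z[i + 1];   /* same base, offset +8 bytes */
    }
    for (; i < n; i++)                        /* leftover element, if any */
        y[i] = a * x[i] + z[i];
}
```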

[PATCH 3/4 v2] ivopts: Consider cost_step on different forms during unrolling

2020-08-18 Thread Kewen.Lin via Gcc-patches
Hi Bin, > I see, it's similar to the auto-increment case where cost should be > recorded only once. So this is okay given 1) fine predicting > rtl-unroll is likely impossible here; 2) the patch has very limited > impact. > Really appreciate your time and patience! I extended the previous

[PATCH v2] options: Make --help= to emit values post-overrided

2020-08-18 Thread Kewen.Lin via Gcc-patches
Hi Segher, on 2020/8/15 上午6:01, Segher Boessenkool wrote: > Hi! > > On Fri, Aug 14, 2020 at 01:42:24PM +0800, Kewen.Lin wrote: >>> I think personally I'd prefer an option (3): call >>> target_option_override_hook directly in decode_options, >>> if help_option_arguments is nonempty. Like you

[PATCH v2] testsuite: Update some vect cases for partial vectors

2020-08-19 Thread Kewen.Lin via Gcc-patches
Hi Richard, >> Yeah, the comments were confusing, its intent is to check which targets >> support partial vectors and which usage to be used. >> >> How about to update them like: >> >> "Return true if loops using partial vectors are supported and usage kind is >> 1/2". > > I wasn't really

Re: [PATCH] pass: Run cleanup passes before SLP [PR96789]

2020-09-29 Thread Kewen.Lin via Gcc-patches
Hi Richard, Thanks for the comments! > diff --git a/gcc/tree-ssa-loop-ivcanon.c b/gcc/tree-ssa-loop-ivcanon.c > index 298ab215530..7016f993339 100644 > --- a/gcc/tree-ssa-loop-ivcanon.c > +++ b/gcc/tree-ssa-loop-ivcanon.c > @@ -1605,6 +1605,14 @@ pass_complete_unroll::execute (function *fun) >

[PATCH] pass: Run cleanup passes before SLP [PR96789]

2020-09-29 Thread Kewen.Lin via Gcc-patches
Hi, As the discussion in PR96789, we found that some scalar stmts which can be eliminated by some passes after SLP, but we still modeled their costs when trying to SLP, it could impact vectorizer's decision. One typical case is the case in PR on target Power. As Richard suggested there, this

Re: [PATCH] vect/test: Don't check for epilogue loop [PR97075]

2020-09-21 Thread Kewen.Lin via Gcc-patches
Hi Richard, on 2020/9/21 下午2:50, Richard Sandiford wrote: > "Kewen.Lin" writes: >> Hi Richard, >>> "Kewen.Lin" writes: Hi, The commit r11-3230 brings a nice improvement to use full vectors instead of partial vectors when available. But it caused some vector with length

Re: [PATCH] vect: Fix epilogue loop handling of partial vectors

2020-09-23 Thread Kewen.Lin via Gcc-patches
on 2020/9/23 下午7:33, Richard Sandiford wrote: > "Kewen.Lin" writes: >> on 2020/9/22 下午10:34, Richard Sandiford wrote: >>> Also, while splitting out the logic that handles epilogues with >>> constant iterations, I added a check to make sure that we don't >>> try to use partial vectors to vectorise

Re: [PATCH] vect: Fix epilogue loop handling of partial vectors

2020-09-22 Thread Kewen.Lin via Gcc-patches
Hi Richard, on 2020/9/22 下午10:34, Richard Sandiford wrote: > Richard Sandiford writes: >> I'll try to have a patch ready tomorrow morning European time. > > Well, I totally failed to hit that deadline. When testing on Power, > I saw a couple of extra failures, but I now think they're

[PATCH] vect/test: Don't check for epilogue loop [PR97075]

2020-09-17 Thread Kewen.Lin via Gcc-patches
Hi, The commit r11-3230 brings a nice improvement to use full vectors instead of partial vectors when available. But it caused some vector with length test cases to fail on Power. The failure on gcc.target/powerpc/p9-vec-length-epil-7.c exposed one issue that: we call function

Re: [PATCH 5/7] vect: Support vector load/store with length in vectorizer

2020-05-27 Thread Kewen.Lin via Gcc-patches
on 2020/5/27 下午6:02, Richard Sandiford wrote: > "Kewen.Lin" writes: >> Hi Richard, >> >> Thanks for your comments! >> >> on 2020/5/26 下午8:49, Richard Sandiford wrote: >>> "Kewen.Lin" writes: @@ -626,6 +645,12 @@ public: /* True if have decided to use a fully-masked loop. */

Ping^1 [PATCH 2/4 V3] Add target hook stride_dform_valid_p

2020-05-27 Thread Kewen.Lin via Gcc-patches
! Kewen on 2020/5/13 下午1:50, Kewen.Lin via Gcc-patches wrote: > Hi, > > I'd like to ping this patch as well as its siblings. Thanks in advance. > > 1/4 v3 https://gcc.gnu.org/pipermail/gcc-patches/2020-February/540171.html > 2/4 v3 https://gcc.gnu.org/pipermail/gcc-patches/2020-Mar

[PATCH 0/4] IVOPTs consider step cost for different forms when unrolling

2020-05-28 Thread Kewen.Lin via Gcc-patches
Hi, This is one repost and you can refer to the original series via https://gcc.gnu.org/pipermail/gcc-patches/2020-January/538360.html. As we discussed in the thread https://gcc.gnu.org/ml/gcc-patches/2020-01/msg00196.html Original: https://gcc.gnu.org/ml/gcc-patches/2020-01/msg00104.html, I'm

[PATCH 1/4] unroll: Add middle-end unroll factor estimation

2020-05-28 Thread Kewen.Lin via Gcc-patches
gcc/ChangeLog 2020-MM-DD Kewen Lin * cfgloop.h (struct loop): New field estimated_unroll. * tree-ssa-loop-manip.c (decide_unroll_const_iter): New function. (decide_unroll_runtime_iter): Likewise. (decide_unroll_stupid): Likewise.

[PATCH 2/4] param: Introduce one param to control ivopts reg-offset consideration

2020-05-28 Thread Kewen.Lin via Gcc-patches
gcc/ChangeLog 2020-MM-DD Kewen Lin * doc/invoke.texi (iv-consider-reg-offset-for-unroll): Document new option. * params.opt (iv-consider-reg-offset-for-unroll): New. * config/s390/s390.c (s390_option_override_internal): Disable parameter

[PATCH 3/4] ivopts: Consider cost_step on different forms during unrolling

2020-05-28 Thread Kewen.Lin via Gcc-patches
gcc/ChangeLog 2020-MM-DD Kewen Lin * tree-ssa-loop-ivopts.c (struct iv_group): New field reg_offset_p. (struct iv_cand): New field reg_offset_p. (struct ivopts_data): New field consider_reg_offset_for_unroll_p. (dump_groups): Dump group with reg_offset_p.

[PATCH 5/7] vect: Support vector load/store with length in vectorizer

2020-05-25 Thread Kewen.Lin via Gcc-patches
gcc/ChangeLog 2020-MM-DD Kewen Lin * doc/invoke.texi (vect-with-length-scope): Document new option. * params.opt (vect-with-length-scope): New. * tree-vect-loop-manip.c (vect_set_loop_lens_directly): New function. (vect_set_loop_condition_len): Likewise.

[PATCH 4/7] hook/rs6000: Add vectorize length mode for vector with length

2020-05-25 Thread Kewen.Lin via Gcc-patches
gcc/ChangeLog: 2020-MM-DD Kewen Lin * config/rs6000/rs6000.c (TARGET_VECTORIZE_LENGTH_MODE): New macro. * doc/tm.texi: Regenerate. * doc/tm.texi.in: New hook. * target.def: Likewise. --- gcc/config/rs6000/rs6000.c | 3 +++ gcc/doc/tm.texi| 6

[PATCH 3/7] vect: Factor out codes for niters smaller than vf check

2020-05-25 Thread Kewen.Lin via Gcc-patches
gcc/ChangeLog: 2020-MM-DD Kewen Lin * tree-vect-loop.c (known_niters_smaller_than_vf): New function, factored out from ... (vect_analyze_loop_costing): ... here. --- gcc/tree-vect-loop.c | 31 ++- 1 file changed, 22 insertions(+), 9

[PATCH 2/7] rs6000: lenload/lenstore optab support

2020-05-25 Thread Kewen.Lin via Gcc-patches
gcc/ChangeLog: 2020-MM-DD Kewen Lin * config/rs6000/vsx.md (lenloaddi): New define_expand. (lenstoredi): Likewise. --- gcc/config/rs6000/vsx.md | 30 ++ 1 file changed, 30 insertions(+) diff --git a/gcc/config/rs6000/vsx.md

[PATCH 7/7] rs6000/testsuite: Vector with length test cases

2020-05-26 Thread Kewen.Lin via Gcc-patches
gcc/testsuite/ChangeLog 2020-MM-DD Kewen Lin * gcc.target/powerpc/p9-vec-length-1.h: New test. * gcc.target/powerpc/p9-vec-length-2.h: New test. * gcc.target/powerpc/p9-vec-length-3.h: New test. * gcc.target/powerpc/p9-vec-length-4.h: New test. *

[PATCH 6/7] ivopts: Add handlings for vector with length IFNs

2020-05-25 Thread Kewen.Lin via Gcc-patches
gcc/ChangeLog 2020-MM-DD Kewen Lin * tree-ssa-loop-ivopts.c (get_mem_type_for_internal_fn): Handle IFN_LEN_LOAD and IFN_LEN_STORE. (get_alias_ptr_type_for_ptr_address): Likewise. --- gcc/tree-ssa-loop-ivopts.c | 4 1 file changed, 4 insertions(+) diff --git

[PATCH 0/7] Support vector load/store with length

2020-05-25 Thread Kewen.Lin via Gcc-patches
Hi all, This patch set adds support for vector load/store with length, Power ISA 3.0 brings instructions lxvl/stxvl to perform vector load/store with length, it's good to be exploited for those cases we don't have enough stuffs to fill in the whole vector like epilogues. This support mainly
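The epilogue use case mentioned above can be sketched in portable C. This is an emulation under stated assumptions, not the patch set's code: `load_with_len` and `store_with_len` are hypothetical helpers that stand in for length-controlled vector accesses by touching only `len` bytes, and the 16-byte vector size is assumed. With such operations, the final partial chunk is handled by the same loop body as the full chunks instead of a separate scalar epilogue.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { VEC_BYTES = 16 };

/* Hypothetical stand-in for a 16-byte vector register. */
typedef struct { unsigned char b[VEC_BYTES]; } vec16;

/* Emulated length-controlled load: reads only len bytes (at most 16);
   the remaining bytes are zero-filled in this emulation. */
static vec16 load_with_len(const unsigned char *p, size_t len)
{
    vec16 v;

    memset(v.b, 0, sizeof v.b);
    memcpy(v.b, p, len < VEC_BYTES ? len : VEC_BYTES);
    return v;
}

/* Emulated length-controlled store: writes only len bytes (at most 16). */
static void store_with_len(unsigned char *p, vec16 v, size_t len)
{
    memcpy(p, v.b, len < VEC_BYTES ? len : VEC_BYTES);
}

/* Copy n bytes in 16-byte steps; the last step simply uses a shorter
   length, so no scalar epilogue loop is needed. */
static void copy_bytes(unsigned char *dst, const unsigned char *src, size_t n)
{
    for (size_t i = 0; i < n; i += VEC_BYTES) {
        size_t len = n - i;               /* bytes still to process */

        if (len > VEC_BYTES)
            len = VEC_BYTES;
        store_with_len(dst + i, load_with_len(src + i, len), len);
    }
}
```

For example, copying 37 bytes takes three steps with lengths 16, 16 and 5, and nothing past the 37th byte of either buffer is read or written.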

[PATCH 1/7] ifn/optabs: Support vector load/store with length

2020-05-25 Thread Kewen.Lin via Gcc-patches
gcc/ChangeLog: 2020-MM-DD Kewen Lin * doc/md.texi (lenload@var{m}@var{n}): Document. (lenstore@var{m}@var{n}): Likewise. * internal-fn.c (len_load_direct): New macro. (len_store_direct): Likewise. (expand_len_load_optab_fn): Likewise.

Re: [PATCH 0/7] Support vector load/store with length

2020-05-26 Thread Kewen.Lin via Gcc-patches
Hi Richi, on 2020/5/26 下午3:12, Richard Biener wrote: > On Tue, 26 May 2020, Kewen.Lin wrote: > >> Hi all, >> >> This patch set adds support for vector load/store with length, Power >> ISA 3.0 brings instructions lxvl/stxvl to perform vector load/store with >> length, it's good to be exploited

Re: [PATCH 0/7] Support vector load/store with length

2020-05-26 Thread Kewen.Lin via Gcc-patches
on 2020/5/26 下午5:44, Richard Biener wrote: > On Tue, 26 May 2020, Kewen.Lin wrote: > >> Hi Richi, >> >> on 2020/5/26 下午3:12, Richard Biener wrote: >>> On Tue, 26 May 2020, Kewen.Lin wrote: >>> Hi all, This patch set adds support for vector load/store with length, Power ISA

Re: [PATCH 3/4] ivopts: Consider cost_step on different forms during unrolling

2020-06-01 Thread Kewen.Lin via Gcc-patches
Hi Richard, Thanks for the comments! on 2020/6/2 上午1:59, Richard Sandiford wrote: > Could you go into more detail about this choice of cost calculation? > It looks like we first calculate per-group flags, which are true only if > the unrolled offsets are valid for all uses in the group. Then we

[PATCH 5/7 v3] vect: Support vector load/store with length in vectorizer

2020-06-02 Thread Kewen.Lin via Gcc-patches
Hi Richard, on 2020/5/29 下午4:32, Richard Sandiford wrote: > "Kewen.Lin" writes: >> on 2020/5/27 下午6:02, Richard Sandiford wrote: >>> "Kewen.Lin" writes: Hi Richard, Snip ... >> >> Thanks a lot for your detailed explanation! This proposal looks good >> based on the current

Re: [PATCH 0/7] Support vector load/store with length

2020-05-27 Thread Kewen.Lin via Gcc-patches
on 2020/5/27 下午3:25, Richard Biener wrote: > On Tue, 26 May 2020, Segher Boessenkool wrote: > >> Hi! >> >> On Tue, May 26, 2020 at 01:29:30PM +0100, Richard Sandiford wrote: >>> FWIW, I agree adding .LEN_LOAD and .LEN_STORE seems like a good >>> approach. I think it'll be more maintainable in

Re: [PATCH 5/7] vect: Support vector load/store with length in vectorizer

2020-05-27 Thread Kewen.Lin via Gcc-patches
Hi Richard, Thanks for your comments! on 2020/5/26 下午8:49, Richard Sandiford wrote: > "Kewen.Lin" writes: >> @@ -626,6 +645,12 @@ public: >>/* True if have decided to use a fully-masked loop. */ >>bool fully_masked_p; >> >> + /* Records whether we still have the option of using a

Re: [PATCH 2/2] rs6000: tune loop size for cunroll at O2

2020-05-19 Thread Kewen.Lin via Gcc-patches
Hi Jeff, on 2020/5/20 上午11:58, Jiufu Guo via Gcc-patches wrote: > Hi, > > This patch check the size of a loop to be unrolled/peeled completely, > and set the limits to a number (24). This prevents large loop from > being unrolled, then avoid binary size increasing, and this limit keeps >

Re: [PATCH] vect/test: Don't check for epilogue loop [PR97075]

2020-09-20 Thread Kewen.Lin via Gcc-patches
Hi Richard, > "Kewen.Lin" writes: >> Hi, >> >> The commit r11-3230 brings a nice improvement to use full >> vectors instead of partial vectors when available. But >> it caused some vector with length test cases to fail on >> Power. >> >> The failure on gcc.target/powerpc/p9-vec-length-epil-7.c

Re: [PATCH] testsuite/rs6000: Add option to ignore vect cost model

2020-07-16 Thread Kewen.Lin via Gcc-patches
Hi, on 2020/7/17 上午4:31, Segher Boessenkool wrote: > Hi! > > On Thu, Jul 16, 2020 at 02:51:23PM +0800, Kewen.Lin wrote: >> In my testing with cost tweaking for vector with length, I found >> two cases below didn't get the expected output. Since the expected >> instructions rely on the

[PATCH] rs6000: Rename adjust_vectorization_cost

2020-07-21 Thread Kewen.Lin via Gcc-patches
Hi, This trivial patch is to rename adjust_vectorization_cost to adjust_vect_cost_per_stmt. Hope it's more meaningful, as well as to avoid the confusion between the possible to be landed function "adjust_vect_cost" and "adjust_vectorization_cost". Even without "adjust_vect_cost", I guess it's

[PATCH v2] vect/rs6000: Support vector with length cost modeling

2020-07-21 Thread Kewen.Lin via Gcc-patches
Hi Richard, on 2020/7/21 下午3:57, Richard Biener wrote: > On Tue, Jul 21, 2020 at 7:52 AM Kewen.Lin wrote: >> >> Hi, >> >> This patch is to add the cost modeling for vector with length, >> it mainly follows what we generate for vector with length in >> functions vect_set_loop_controls_directly

Re: [PATCH v2] vect/rs6000: Support vector with length cost modeling

2020-07-22 Thread Kewen.Lin via Gcc-patches
Hi Richard, on 2020/7/22 下午2:38, Richard Biener wrote: > On Wed, Jul 22, 2020 at 3:26 AM Kewen.Lin wrote: >> >> Hi Richard, >> >> on 2020/7/21 下午3:57, Richard Biener wrote: >>> On Tue, Jul 21, 2020 at 7:52 AM Kewen.Lin wrote: Hi, This patch is to add the cost modeling for

[PATCH] testsuite: Update some vect cases for partial vectors

2020-08-05 Thread Kewen.Lin via Gcc-patches
Hi, This patch is to adjust some existing vectorization test cases to stay well with the newly introduced partial vector usages. Bootstrapped/regtested on aarch64-linux-gnu and powerpc64le-linux-gnu P9 (with explicit param vect-partial-vector-usage=1 and enablement on

[PATCH] vect: Skip epilogue loops for dbgcnt check [PR96451]

2020-08-05 Thread Kewen.Lin via Gcc-patches
Hi, As the PR shows, commit r11-2453 exposed one issue: the vectorizer wants to vectorize the epilogue loop and leaves the if-cvt body there, but a later dbgcnt check disables it, and the leftover scalar mask_store statement causes an ICE. As Richard pointed out in that PR, the dbgcnt is to count original

Re: [PATCH/RFC] options: Make --help= to emit values post-overrided

2020-08-06 Thread Kewen.Lin via Gcc-patches
Hi Segher! Thanks for the comments! on 2020/8/7 上午6:04, Segher Boessenkool wrote: > Hi! > > On Thu, Aug 06, 2020 at 08:37:23PM +0800, Kewen.Lin wrote: >> When I was working to update patch as Richard's review comments >> here https://gcc.gnu.org/pipermail/gcc-patches/2020-August/551474.html, >>

Re: [PATCH v5] vect/rs6000: Support vector with length cost modeling

2020-08-06 Thread Kewen.Lin via Gcc-patches
on 2020/8/5 下午10:06, Segher Boessenkool wrote: > On Wed, Aug 05, 2020 at 08:27:57AM +0100, Richard Sandiford wrote: >> OK for the vectoriser parts with those changes, thanks. > > The rs6000 part is still fine as well. Thanks! > > Committed via r11-2586. Thanks all! BR, Kewen

Re: [PATCH] testsuite: Update some vect cases for partial vectors

2020-08-06 Thread Kewen.Lin via Gcc-patches
Hi Richard, Thanks for the review! on 2020/8/6 下午1:52, Richard Sandiford wrote: > "Kewen.Lin" writes: >> diff --git a/gcc/testsuite/gcc.dg/vect/slp-multitypes-11.c >> b/gcc/testsuite/gcc.dg/vect/slp-multitypes-11.c >> index 5200ed1cd94..da6fb12eb0d 100644 >> ---

Re: [PATCH 3/4] ivopts: Consider cost_step on different forms during unrolling

2020-08-10 Thread Kewen.Lin via Gcc-patches
Hi Bin, on 2020/8/10 下午8:38, Bin.Cheng wrote: > On Mon, Aug 10, 2020 at 12:27 PM Kewen.Lin wrote: >> >> Hi Bin, >> >> Thanks for the review!! >> >> on 2020/8/8 下午4:01, Bin.Cheng wrote: >>> Hi Kewen, >>> Sorry for the late reply. >>> The patch's most important change is below cost computation:

[PATCH] testsuite: Add -fno-common to pr82374.c [PR94077]

2020-08-12 Thread Kewen.Lin via Gcc-patches
Hi, As the PR comments show, the case gcc.dg/gomp/pr82374.c fails on Power7 since gcc8. But it passes from gcc10. By looking into the difference, it's due to that gcc10 sets -fno-common as default, which makes vectorizer force the alignment and be able to use aligned vector load/store on those

Re: [PATCH v4] vect/rs6000: Support vector with length cost modeling

2020-07-31 Thread Kewen.Lin via Gcc-patches
on 2020/7/31 下午9:01, Richard Biener wrote: > On Fri, Jul 31, 2020 at 2:37 PM Kewen.Lin wrote: >> >> Hi Richards, >> >> on 2020/7/31 下午7:20, Richard Biener wrote: >>> On Fri, Jul 31, 2020 at 1:03 PM Richard Sandiford >>> wrote: "Kewen.Lin" writes: >>> + bool niters_known_p =

Re: [PATCH v4] vect/rs6000: Support vector with length cost modeling

2020-07-31 Thread Kewen.Lin via Gcc-patches
Hi Richards, on 2020/7/31 下午7:20, Richard Biener wrote: > On Fri, Jul 31, 2020 at 1:03 PM Richard Sandiford > wrote: >> >> "Kewen.Lin" writes: > + bool niters_known_p = LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo); > + bool need_iterate_p > + = (!LOOP_VINFO_EPILOGUE_P

Re: Refactor peel_iters_{pro,epi}logue cost model handlings

2020-07-31 Thread Kewen.Lin via Gcc-patches
on 2020/7/31 下午6:57, Richard Sandiford wrote: > "Kewen.Lin" writes: >> Hi Richard, >> >> on 2020/7/27 下午9:10, Richard Sandiford wrote: >>> "Kewen.Lin" writes: Hi, As Richard S. suggested in the thread: https://gcc.gnu.org/pipermail/gcc-patches/2020-July/550633.html

[PATCH v5] vect/rs6000: Support vector with length cost modeling

2020-07-31 Thread Kewen.Lin via Gcc-patches
Hi Richard, New version v5 is attached. v5 main changes against v4: 1) use _stmt instead of _cnt to avoid confusion 2) factor out function vect_rgroup_iv_might_wrap_p 3) use generic scalar_stmt for min/max stmt Does this look better? Thanks in advance! BR, Kewen - gcc/ChangeLog:

Re: [PATCH] options: Make --help= to emit values post-overrided

2020-08-13 Thread Kewen.Lin via Gcc-patches
Hi Richard, Thanks for the comments! on 2020/8/13 上午12:10, Richard Sandiford wrote: > "Kewen.Lin" writes: >> Hi Segher, >> >> on 2020/8/7 下午10:42, Segher Boessenkool wrote: >>> Hi! >>> >>> On Fri, Aug 07, 2020 at 10:44:10AM +0800, Kewen.Lin wrote: > I think this makes a lot of sense.

[PATCH/RFC] options: Make --help= to emit values post-overrided

2020-08-06 Thread Kewen.Lin via Gcc-patches
Hi, When I was working to update patch as Richard's review comments here https://gcc.gnu.org/pipermail/gcc-patches/2020-August/551474.html, I noticed that the options "-Q --help=params" don't show the final values after target option overriding, instead it emits the default values in params.opt

Re: [PATCH 3/4] ivopts: Consider cost_step on different forms during unrolling

2020-08-09 Thread Kewen.Lin via Gcc-patches
Hi Bin, Thanks for the review!! on 2020/8/8 下午4:01, Bin.Cheng wrote: > Hi Kewen, > Sorry for the late reply. > The patch's most important change is below cost computation: > >> @@ -5890,6 +5973,10 @@ determine_iv_cost (struct ivopts_data *data, struct >> iv_cand *cand) >> cost_step =

[PATCH] options: Make --help= to emit values post-overrided

2020-08-09 Thread Kewen.Lin via Gcc-patches
Hi Segher, on 2020/8/7 下午10:42, Segher Boessenkool wrote: > Hi! > > On Fri, Aug 07, 2020 at 10:44:10AM +0800, Kewen.Lin wrote: >>> I think this makes a lot of sense. >>> btw, not sure whether it's a good idea to move target_option_override_hook call into print_specific_help and use one

[PATCH 1/7 v8] ifn/optabs: Support vector load/store with length

2020-07-01 Thread Kewen.Lin via Gcc-patches
Hi Richard, on 2020/6/30 下午11:32, Richard Sandiford wrote: > "Kewen.Lin" writes: >> Hi Richard, >> >> Thanks for the comments! >> >> on 2020/6/29 下午6:07, Richard Sandiford wrote: >>> Thanks for the update. I agree with the summary of the IRC discussion >>> except for… >>> >>> "Kewen.Lin"

Re: [PATCH 5/7 v6] vect: Support vector load/store with length in vectorizer

2020-07-01 Thread Kewen.Lin via Gcc-patches
Hi Richard, Many thanks for your great review comments! on 2020/7/1 上午3:53, Richard Sandiford wrote: > "Kewen.Lin" writes: >> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi >> index 06a04e3d7dd..284c15705ea 100644 >> --- a/gcc/doc/invoke.texi >> +++ b/gcc/doc/invoke.texi >> @@ -13389,6

Re: [PATCH 5/7 v6] vect: Support vector load/store with length in vectorizer

2020-07-08 Thread Kewen.Lin via Gcc-patches
Hi Richard, on 2020/7/7 下午6:44, Richard Sandiford wrote: > "Kewen.Lin" writes: >> on 2020/7/2 下午1:20, Kewen.Lin via Gcc-patches wrote: >>> on 2020/7/1 下午11:17, Richard Sandiford wrote: >>>> "Kewen.Lin" writes: >>>>> on 202

[PATCH] vect: Enhance condition check to use partial vectors in vectorizable_condition

2020-07-08 Thread Kewen.Lin via Gcc-patches
Hi, This patch is derived from the review of vector with length patch series. The length-based partial vector approach doesn't support reduction so far, so we would like to disable vectorization with partial vectors explicitly for it in vectorizable_condition. Otherwise, it will cause some

[PATCH] vect/testsuite: Adjust dumping for fully masking decision

2020-07-08 Thread Kewen.Lin via Gcc-patches
Hi, As Richard S. suggested in the review of vector with length patch series, we can use one message on "partial vectors" instead of "fully with masking". This patch is to update the dumping string and related test cases. Bootstrapped/regtested on aarch64-linux-gnu. Is it ok for trunk? BR,

Re: [PATCH 5/7 v6] vect: Support vector load/store with length in vectorizer

2020-07-08 Thread Kewen.Lin via Gcc-patches
Hi Richard, on 2020/7/7 下午6:15, Richard Sandiford wrote: > "Kewen.Lin" writes: >> Hi Richard, >> >> on 2020/7/1 下午11:17, Richard Sandiford wrote: >>> "Kewen.Lin" writes: on 2020/7/1 上午3:53, Richard Sandiford wrote: > "Kewen.Lin" writes: Sorry, I didn't quite follow this

Re: [PATCH 5/7 v6] vect: Support vector load/store with length in vectorizer

2020-07-07 Thread Kewen.Lin via Gcc-patches
Hi Richard, on 2020/7/2 下午1:20, Kewen.Lin via Gcc-patches wrote: > on 2020/7/1 下午11:17, Richard Sandiford wrote: >> "Kewen.Lin" writes: >>> on 2020/7/1 上午3:53, Richard Sandiford wrote: >>>> "Kewen.Lin" writes: [...] >> Hmm, OK. Bu

[PATCH 5/7 v7] vect: Support vector load/store with length in vectorizer

2020-07-10 Thread Kewen.Lin via Gcc-patches
Hi Richard, The new version v7 is attached which has addressed your review comments on v6. Could you have a further look? Many thanks in advance! Bootstrapped/regtested on aarch64-linux-gnu and powerpc64le-linux-gnu P9. Even with explicit vect-partial-vector-usage settings 1/2 on Power target,
