Re: [PATCH, vec-tails] Support loop epilogue vectorization

2016-12-21 Thread Yuri Rumyantsev
Sorry, I put the wrong test; fixed here.

2016-12-21 13:12 GMT+03:00 Yuri Rumyantsev:
> Hi Richard,
>
> I occasionally found out a bug in my patch related to epilogue
> vectorization without masking : need to put label before
> initialization.
>
> Could you please review and

Re: [PATCH, vec-tails] Support loop epilogue vectorization

2016-12-21 Thread Yuri Rumyantsev
Hi Richard,

I accidentally found a bug in my patch related to epilogue vectorization without masking: a label needs to be put before the initialization.

Could you please review it and integrate it into trunk? The test case is also attached.

Thanks in advance.
Yuri.

ChangeLog:
2016-12-21 Yuri Rumyantsev

Re: [PATCH, vec-tails] Support loop epilogue vectorization

2016-12-01 Thread Richard Biener
On Thu, 1 Dec 2016, Yuri Rumyantsev wrote:
> Thanks Richard for your comments.
>
> You asked me about possible performance improvements for AVX2 machines
> - we did not see any visible speed-up for spec2k with any method of

Spec 2000? Can you check with SPEC 2006 or CPUv6? Did you see

Re: [PATCH, vec-tails] Support loop epilogue vectorization

2016-12-01 Thread Yuri Rumyantsev
Thanks Richard for your comments.

You asked me about possible performance improvements for AVX2 machines - we did not see any visible speed-up for spec2k with any method of masking, including epilogue masking and combining; we saw one only on an AVX512 machine (KNL).

I will answer your question later.

Re: [PATCH, vec-tails] Support loop epilogue vectorization

2016-12-01 Thread Richard Biener
On Mon, 28 Nov 2016, Yuri Rumyantsev wrote:
> Richard!
>
> I attached vect dump for the part of attached test-case which
> illustrated how vectorization of epilogues works through masking:
> #define SIZE 1023
> #define ALIGN 64
>
> extern int posix_memalign(void **memptr, __SIZE_TYPE__

Re: [PATCH, vec-tails] Support loop epilogue vectorization

2016-11-29 Thread Christophe Lyon
On 18 November 2016 at 16:54, Christophe Lyon wrote:
> On 18 November 2016 at 16:46, Yuri Rumyantsev wrote:
>> It is very strange that this test failed on arm, since it requires
>> target avx2 to check vectorizer dumps:
>>
>> /* { dg-final {

Re: [PATCH, vec-tails] Support loop epilogue vectorization

2016-11-28 Thread Yuri Rumyantsev
Richard!

I attached the vect dump for the part of the attached test-case which illustrates how vectorization of epilogues works through masking:

#define SIZE 1023
#define ALIGN 64

extern int posix_memalign(void **memptr, __SIZE_TYPE__ alignment, __SIZE_TYPE__ size) __attribute__((weak));
extern void
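Archive note: the test case above is truncated, so the following is only a rough scalar sketch of what "vectorizing the epilogue through masking" means, not the code from the patch or its dump. It assumes 8 lanes per vector (32-byte vectors of 4-byte ints); the function name `scale_with_masked_epilogue` and the constant `LANES` are hypothetical. The leftover iterations are handled by one more vector-width step in which only the loads and stores are masked.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical lane count: eight 4-byte ints per 32-byte vector. */
enum { LANES = 8 };

/* Scalar emulation of dst[i] = src[i] * factor with a masked epilogue.
   Returns the number of active lanes in the masked step (0 if none). */
size_t scale_with_masked_epilogue(int *dst, const int *src, size_t n, int factor)
{
    assert(dst != NULL && src != NULL);

    size_t i = 0;

    /* Main vector loop: full vectors only. */
    for (; i + LANES <= n; i += LANES)
        for (size_t lane = 0; lane < LANES; lane++)
            dst[i + lane] = src[i + lane] * factor;

    /* Masked epilogue: one more vector-width step for the leftovers. */
    size_t active = n - i;
    if (active > 0) {
        int vec[LANES] = {0};

        /* Masked load: inactive lanes never touch memory. */
        for (size_t lane = 0; lane < LANES; lane++)
            if (lane < active)
                vec[lane] = src[i + lane];

        /* The arithmetic itself is not masked. */
        for (size_t lane = 0; lane < LANES; lane++)
            vec[lane] *= factor;

        /* Masked store: inactive lanes never write memory. */
        for (size_t lane = 0; lane < LANES; lane++)
            if (lane < active)
                dst[i + lane] = vec[lane];
    }
    return active;
}
```

With a trip count of 1023, as in the `SIZE` define above, the main loop would cover 1016 elements and the masked step the remaining 7, so no scalar remainder loop is needed under these assumptions.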

Re: [PATCH, vec-tails] Support loop epilogue vectorization

2016-11-28 Thread Richard Biener
On Thu, 24 Nov 2016, Yuri Rumyantsev wrote:
> Hi All,
>
> Here is the second patch which supports epilogue vectorization using
> masking without cost model. Currently it is possible
> only with passing parameter "--param vect-epilogues-mask=1".
>
> Bootstrapping and regression testing did not

Re: [PATCH, vec-tails] Support loop epilogue vectorization

2016-11-24 Thread Yuri Rumyantsev
Hi All,

Here is the second patch, which supports epilogue vectorization using masking without a cost model. Currently it is enabled only by passing the parameter "--param vect-epilogues-mask=1".

Bootstrapping and regression testing did not show any new regressions.

Any comments will be appreciated.

Re: [PATCH, vec-tails] Support loop epilogue vectorization

2016-11-18 Thread Christophe Lyon
On 18 November 2016 at 16:46, Yuri Rumyantsev wrote:
> It is very strange that this test failed on arm, since it requires
> target avx2 to check vectorizer dumps:
>
> /* { dg-final { scan-tree-dump-times "LOOP VECTORIZED" 2 "vect" {
> target avx2_runtime } } } */
> /* {

Re: [PATCH, vec-tails] Support loop epilogue vectorization

2016-11-18 Thread Yuri Rumyantsev
It is very strange that this test failed on arm, since it requires target avx2 to check vectorizer dumps:

/* { dg-final { scan-tree-dump-times "LOOP VECTORIZED" 2 "vect" { target avx2_runtime } } } */
/* { dg-final { scan-tree-dump-times "LOOP EPILOGUE VECTORIZED \\(VS=16\\)" 2 "vect" { target
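Archive note: the second directive checks that the epilogue was itself vectorized with a 16-byte vector size. A rough scalar sketch of that shape follows; it is an illustration, not the vectorizer's output. It assumes a 32-byte main vector (8 ints, as on AVX2) and a 16-byte epilogue vector (4 ints, matching VS=16); the name `add_arrays` and the lane constants are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical lane counts for 4-byte ints: 32-byte main vectors and
   16-byte epilogue vectors. */
enum { MAIN_LANES = 8, EPILOGUE_LANES = 4 };

/* Scalar emulation of c[i] = a[i] + b[i] split into a main vector loop,
   a narrower (non-masked) vectorized epilogue, and a scalar remainder.
   Returns how many elements were left to the scalar remainder. */
size_t add_arrays(int *c, const int *a, const int *b, size_t n)
{
    assert(c != NULL && a != NULL && b != NULL);

    size_t i = 0;

    /* Main vector loop: each step covers MAIN_LANES elements. */
    for (; i + MAIN_LANES <= n; i += MAIN_LANES)
        for (size_t lane = 0; lane < MAIN_LANES; lane++)
            c[i + lane] = a[i + lane] + b[i + lane];

    /* Vectorized epilogue: same body, half the vector size, no masks. */
    for (; i + EPILOGUE_LANES <= n; i += EPILOGUE_LANES)
        for (size_t lane = 0; lane < EPILOGUE_LANES; lane++)
            c[i + lane] = a[i + lane] + b[i + lane];

    /* Scalar remainder: fewer than EPILOGUE_LANES elements are left. */
    size_t scalar_iters = n - i;
    for (; i < n; i++)
        c[i] = a[i] + b[i];

    return scalar_iters;
}
```

Under these assumptions a trip count of 1023 leaves 3 scalar iterations instead of the 7 that remain when the epilogue is not vectorized.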

Re: [PATCH, vec-tails] Support loop epilogue vectorization

2016-11-18 Thread Christophe Lyon
On 15 November 2016 at 15:41, Yuri Rumyantsev wrote:
> Hi All,
>
> Here is patch for non-masked epilogue vectorization.
>
> Bootstrap and regression testing did not show any new failures.
>
> Is it OK for trunk?
>
> Thanks.
> Changelog:
>
> 2016-11-15 Yuri Rumyantsev

Re: [PATCH, vec-tails] Support loop epilogue vectorization

2016-11-16 Thread Richard Biener
On Tue, 15 Nov 2016, Yuri Rumyantsev wrote:
> Hi All,
>
> Here is patch for non-masked epilogue vectorization.
>
> Bootstrap and regression testing did not show any new failures.
>
> Is it OK for trunk?

Ok for trunk. I believe we ultimately want to remove the new --param and enable this

Re: [PATCH, vec-tails] Support loop epilogue vectorization

2016-11-15 Thread Yuri Rumyantsev
Hi All,

Here is the patch for non-masked epilogue vectorization.

Bootstrap and regression testing did not show any new failures.

Is it OK for trunk?

Thanks.

Changelog:
2016-11-15 Yuri Rumyantsev
* params.def (PARAM_VECT_EPILOGUES_NOMASK): New.
* tree-if-conv.c

Re: [PATCH, vec-tails] Support loop epilogue vectorization

2016-11-14 Thread Richard Biener
On November 14, 2016 4:39:40 PM GMT+01:00, Yuri Rumyantsev wrote:
>Richard,
>
>I checked one of the tests designed for epilogue vectorization using
>patches 1 - 3 and found out that built compiler performs vectorization
>of epilogues with --param vect-epilogues-nomask=1

Re: [PATCH, vec-tails] Support loop epilogue vectorization

2016-11-14 Thread Yuri Rumyantsev
Richard,

I checked one of the tests designed for epilogue vectorization using patches 1 - 3 and found out that the built compiler performs vectorization of epilogues with --param vect-epilogues-nomask=1 passed:

$ gcc -Ofast -mavx2 t1.c -S --param vect-epilogues-nomask=1 -o t1.new-nomask.s

Re: [PATCH, vec-tails] Support loop epilogue vectorization

2016-11-14 Thread Richard Biener
On Mon, 14 Nov 2016, Yuri Rumyantsev wrote:
> Richard,
>
> In my previous patch I forgot to remove couple lines related to aux field.
> Here is the correct updated patch.

Yeah, I noticed. This patch would be ok for trunk (together with necessary parts from 1 and 2) if all not required parts

Re: [PATCH, vec-tails] Support loop epilogue vectorization

2016-11-14 Thread Yuri Rumyantsev
Richard,

In my previous patch I forgot to remove a couple of lines related to the aux field. Here is the correct updated patch.

Thanks.
Yuri.

2016-11-14 15:51 GMT+03:00 Richard Biener:
> On Fri, 11 Nov 2016, Yuri Rumyantsev wrote:
>
>> Richard,
>>
>> I prepare updated 3 patch with

Re: [PATCH, vec-tails] Support loop epilogue vectorization

2016-11-14 Thread Richard Biener
On Fri, 11 Nov 2016, Yuri Rumyantsev wrote:
> Richard,
>
> Here is fixed version of updated patch 3.
>
> Any comments will be appreciated.

Looks good apart from

+  if (epilogue)
+    {
+      epilogue->force_vectorize = loop->force_vectorize;
+      epilogue->safelen = loop->safelen;
+

Re: [PATCH, vec-tails] Support loop epilogue vectorization

2016-11-14 Thread Richard Biener
On Fri, 11 Nov 2016, Yuri Rumyantsev wrote:
> Richard,
>
> I prepare updated 3 patch with passing additional argument to
> vect_analyze_loop as you proposed (untested).
>
> You wrote:
> Btw, I wonder if you can produce a single patch containing just
> epilogue vectorization, that is combine

Re: [PATCH, vec-tails] Support loop epilogue vectorization

2016-11-11 Thread Yuri Rumyantsev
Richard,

Here is the fixed version of updated patch 3.

Any comments will be appreciated.

Thanks.
Yuri.

2016-11-11 17:15 GMT+03:00 Yuri Rumyantsev:
> Richard,
>
> Sorry for confusion but my updated patch does not work properly, so I
> need to fix it.
>
> Yuri.
>
> 2016-11-11

Re: [PATCH, vec-tails] Support loop epilogue vectorization

2016-11-11 Thread Yuri Rumyantsev
Richard,

Sorry for the confusion, but my updated patch does not work properly, so I need to fix it.

Yuri.

2016-11-11 14:15 GMT+03:00 Yuri Rumyantsev:
> Richard,
>
> I prepare updated 3 patch with passing additional argument to
> vect_analyze_loop as you proposed (untested).
>
>

Re: [PATCH, vec-tails] Support loop epilogue vectorization

2016-11-11 Thread Yuri Rumyantsev
Richard,

I prepared an updated patch 3 that passes an additional argument to vect_analyze_loop, as you proposed (untested).

You wrote:
> Btw, I wonder if you can produce a single patch containing just epilogue vectorization, that is combine patches 1-3 but rip out changes only needed by later patches?

Re: [PATCH, vec-tails] Support loop epilogue vectorization

2016-11-10 Thread Richard Biener
On Thu, 10 Nov 2016, Richard Biener wrote:
> On Tue, 8 Nov 2016, Yuri Rumyantsev wrote:
>
> > Richard,
> >
> > Here is updated 3 patch.
> >
> > I checked that all new tests related to epilogue vectorization passed with
> > it.
> >
> > Your comments will be appreciated.
>
> A lot better now.

Re: [PATCH, vec-tails] Support loop epilogue vectorization

2016-11-10 Thread Richard Biener
On Tue, 8 Nov 2016, Yuri Rumyantsev wrote:
> Richard,
>
> Here is updated 3 patch.
>
> I checked that all new tests related to epilogue vectorization passed with it.
>
> Your comments will be appreciated.

A lot better now. Instead of the ->aux dance I now prefer to pass the original loops

Re: [PATCH, vec-tails] Support loop epilogue vectorization

2016-11-09 Thread Richard Biener
On Wed, 9 Nov 2016, Yuri Rumyantsev wrote:
> Thanks Richard for your comments.
> You proposed to handle epilogue loop just like normal short-trip loop
> but this is not exactly true since e.g. epilogue must not be peeled
> for alignment.

But if we know the epilogue data-refs are aligned we

Re: [PATCH, vec-tails] Support loop epilogue vectorization

2016-11-09 Thread Bin.Cheng
On Wed, Nov 9, 2016 at 12:12 PM, Yuri Rumyantsev wrote:
> I am familiar with SVE extension and understand that implemented
> approach might be not suitable for ARM. The main point is that only
> load/store instructions are masked but all other calculations are not
> (we did

Re: [PATCH, vec-tails] Support loop epilogue vectorization

2016-11-09 Thread Yuri Rumyantsev
I am familiar with the SVE extension and understand that the implemented approach might not be suitable for ARM. The main point is that only load/store instructions are masked but all other calculations are not (we did a special conversion for reduction statements to implement merging predication semantics).
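Archive note: the point about reductions can be illustrated with a scalar sketch. This is not the conversion done in the patch, only an emulation of the idea under stated assumptions: 8 lanes per vector, a sum reduction, and hypothetical names (`sum_with_masked_tail`, `LANES`, `GARBAGE`). The add in the tail step is not masked, so the accumulator for inactive lanes must be merged back from its old value; otherwise whatever sits in those lanes would leak into the result.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical lane count and a stand-in for undefined inactive lanes. */
enum { LANES = 8, GARBAGE = 0x5a5a };

/* Scalar emulation of a sum reduction whose tail is one masked step. */
int sum_with_masked_tail(const int *src, size_t n)
{
    assert(src != NULL || n == 0);

    int acc[LANES] = {0};
    size_t i = 0;

    /* Main vector loop: per-lane partial sums over full vectors. */
    for (; i + LANES <= n; i += LANES)
        for (size_t lane = 0; lane < LANES; lane++)
            acc[lane] += src[i + lane];

    size_t active = n - i;
    if (active > 0) {
        int vec[LANES];

        /* Masked load: inactive lanes hold an arbitrary value. */
        for (size_t lane = 0; lane < LANES; lane++)
            vec[lane] = (lane < active) ? src[i + lane] : GARBAGE;

        for (size_t lane = 0; lane < LANES; lane++) {
            /* Unmasked add, computed for every lane. */
            int updated = acc[lane] + vec[lane];
            /* Merging: inactive lanes keep the old accumulator value. */
            acc[lane] = (lane < active) ? updated : acc[lane];
        }
    }

    /* Horizontal sum of the per-lane accumulators. */
    int total = 0;
    for (size_t lane = 0; lane < LANES; lane++)
        total += acc[lane];
    return total;
}
```

Dropping the merge step in this sketch would add `GARBAGE` once per inactive lane, which is why a plain masked load alone is not enough for reductions.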

Re: [PATCH, vec-tails] Support loop epilogue vectorization

2016-11-09 Thread Bin.Cheng
On Wed, Nov 9, 2016 at 11:28 AM, Yuri Rumyantsev wrote:
> Thanks Richard for your comments.
> You proposed to handle epilogue loop just like normal short-trip loop
> but this is not exactly true since e.g. epilogue must not be peeled
> for alignment.

Yes there must be some

Re: [PATCH, vec-tails] Support loop epilogue vectorization

2016-11-09 Thread Yuri Rumyantsev
Thanks Richard for your comments.

You proposed to handle the epilogue loop just like a normal short-trip loop, but this is not exactly true since e.g. the epilogue must not be peeled for alignment.

It is not clear to me what my next steps are. Should I re-design the patch completely or simply decompose

Re: [PATCH, vec-tails] Support loop epilogue vectorization

2016-11-09 Thread Bin.Cheng
On Tue, Nov 1, 2016 at 12:38 PM, Yuri Rumyantsev wrote:
> Hi All,
>
> I re-send all patches sent by Ilya earlier for review which support
> vectorization of loop epilogues and loops with low trip count. We
> assume that the only patch - vec-tails-07-combine-tail.patch - was

Re: [PATCH, vec-tails] Support loop epilogue vectorization

2016-11-08 Thread Yuri Rumyantsev
Richard,

Here is the updated patch 3.

I checked that all new tests related to epilogue vectorization passed with it.

Your comments will be appreciated.

2016-11-08 15:38 GMT+03:00 Richard Biener:
> On Thu, 3 Nov 2016, Yuri Rumyantsev wrote:
>
>> Hi Richard,
>>
>> I did not

Re: [PATCH, vec-tails] Support loop epilogue vectorization

2016-11-08 Thread Richard Biener
On Thu, 3 Nov 2016, Yuri Rumyantsev wrote:
> Hi Richard,
>
> I did not understand your last remark:
>
> > That is, here (and avoid the FOR_EACH_LOOP change):
> >
> > @@ -580,12 +586,21 @@ vectorize_loops (void)
> > && dump_enabled_p ())
> > dump_printf_loc

Re: [PATCH, vec-tails] Support loop epilogue vectorization

2016-11-06 Thread Richard Biener
On November 5, 2016 3:40:04 AM GMT+01:00, Jeff Law wrote:
>On 11/02/2016 06:27 AM, Richard Biener wrote:
>> I'm still torn about all the rest of the stuff and question its
>> usability (esp. merging the epilogue with the main vector loop).
>> But it has already been approved ...

Re: [PATCH, vec-tails] Support loop epilogue vectorization

2016-11-05 Thread Jeff Law
On 11/02/2016 06:27 AM, Richard Biener wrote:
> I'm still torn about all the rest of the stuff and question its
> usability (esp. merging the epilogue with the main vector loop).
> But it has already been approved ... oh well.

Note that merging of the epilogue with the main vector loop may well be

Re: [PATCH, vec-tails] Support loop epilogue vectorization

2016-11-03 Thread Yuri Rumyantsev
Hi Richard,

I did not understand your last remark:

> That is, here (and avoid the FOR_EACH_LOOP change):
>
> @@ -580,12 +586,21 @@ vectorize_loops (void)
> && dump_enabled_p ())
> dump_printf_loc (MSG_OPTIMIZED_LOCATIONS, vect_location,
> "loop

Re: [PATCH, vec-tails] Support loop epilogue vectorization

2016-11-02 Thread Richard Biener
On Tue, 1 Nov 2016, Yuri Rumyantsev wrote:
> Hi All,
>
> I re-send all patches sent by Ilya earlier for review which support
> vectorization of loop epilogues and loops with low trip count. We
> assume that the only patch - vec-tails-07-combine-tail.patch - was not
> approved by Jeff.
>
> I did