RE: [PATCH PR95199] vect: Remove extra variable created for memory reference
On Sun, 14 Jun 2020, zhoukaipeng (A) wrote: > Hi, > > I modified the issue you mentioned. Bootstrap and tested on aarch64 Linux > platform again. No new regression witnessed. > > For "*vec_offset", it is indeed a point optimizable. However, it is not > able to eliminate it by the same way as "*dataref_bump". The > cse_and_gimplify_to_preheader will return the cached operand if > operand_compare::operand_equal_p for the cached one and the new one > returns true. But it did not work well for "*vec_offset". You have to see what is actually in *vec_offset, if there's partly gimplified but not CSEd stmts in there then that's the issue. Richard. > Do you have any suggestions for the problem? > > Thanks, > Kaipeng Zhou > > > -Original Message- > > From: Richard Sandiford [mailto:richard.sandif...@arm.com] > > Sent: Friday, June 12, 2020 3:48 PM > > To: zhoukaipeng (A) > > Cc: gcc-patches@gcc.gnu.org; rguent...@suse.de; am...@gcc.gnu.org; > > rgue...@gcc.gnu.org > > Subject: Re: [PATCH PR95199] vect: Remove extra variable created for > > memory reference > > > > "zhoukaipeng (A)" writes: > > > Hi, > > > > > > This is a fix for pr95199. > > > In vectorization, vinfo->ivexpr_map is supposed to catch those "IV base > > and/or step expressions". Just call cse_and_gimplify_to_preheader to > > handle gathering/scattering to avoid the extra variable. > > > > > > Bootstrap and tested on aarch64/x86_64 Linux platform. No new > > regression witnessed. > > > > > > Is it ok? > > > > > > Thanks, > > > Kaipeng Zhou > > > > > > From 12cf9b362576735e4c584c48cd6db3d2b0f2e14b Mon Sep 17 00:00:00 > > 2001 > > > From: Kaipeng Zhou > > > Date: Fri, 12 Jun 2020 00:54:15 +0800 > > > Subject: [PATCH] vect: Remove extra variable created for memory > > > reference in loop vectorization. > > > > > > Vectorization create two copies of the same IV and IVOPTs does not > > > perform CSE. But vinfo->ivexpr_map is supposed to catch those "IV > > > base and/or step expressions". Just call > > > cse_and_gimplify_to_preheader to handle gathering/scattering to avoid > > the extra variable. > > > > > > 2020-06-12 Bin Cheng > > > Kaipeng Zhou > > > > > > PR tree-optimization/95199 > > > * tree-vect-stmts.c: Use CSE map to catch the IV step and eliminate > > > extra variable. > > > > > > 2020-06-12 Bin Cheng > > > Kaipeng Zhou > > > > > > PR tree-optimization/95199 > > > * gcc.dg/vect/pr95199.c: New test. > > > --- > > > gcc/ChangeLog | 7 +++ > > > gcc/testsuite/ChangeLog | 6 ++ > > > gcc/testsuite/gcc.dg/vect/pr95199.c | 17 + > > > gcc/tree-vect-stmts.c | 1 + > > > 4 files changed, 31 insertions(+) > > > create mode 100644 gcc/testsuite/gcc.dg/vect/pr95199.c > > > > > > diff --git a/gcc/ChangeLog b/gcc/ChangeLog index > > > c92582df7fe..753d70fc94b 100644 > > > --- a/gcc/ChangeLog > > > +++ b/gcc/ChangeLog > > > @@ -1,3 +1,10 @@ > > > +2020-06-12 Bin Cheng > > > + Kaipeng Zhou > > > + > > > + PR tree-optimization/95199 > > > + * tree-vect-stmts.c: Use CSE map to catch the IV step and eliminate > > > + extra variable. > > > + > > > 2020-06-08 Tobias Burnus > > > > > > PR lto/94848 > > > diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog index > > > 60d9ecca3ed..a27dd3fa072 100644 > > > --- a/gcc/testsuite/ChangeLog > > > +++ b/gcc/testsuite/ChangeLog > > > @@ -1,3 +1,9 @@ > > > +2020-06-12 Bin Cheng > > > + Kaipeng Zhou > > > + > > > + PR tree-optimization/95199 > > > + * gcc.dg/vect/pr95199.c: New test. > > > + > > > 2020-06-08 Harald Anlauf > > > > > > PR fortran/95195 > > > diff --git a/gcc/testsuite/gcc.dg/vect/pr95199.c > > > b/gcc/testsuite/gcc.dg/vect/pr95199.c > > > new file mode 100644 > > > index 000..ec201feaec8 > > > --- /dev/null > > > +++ b/gcc/testsuite/gcc.dg/vect/pr95199.c > > > @@ -0,0 +1,17 @@ > > > +/* { dg-do compile { target aarch64-*-* } } */ > > > +/* { dg-options "-O3 -march=armv8.2-a+fp+sve" } */ > > > + &
RE: [PATCH PR95199] vect: Remove extra variable created for memory reference
Hi, I modified the issue you mentioned. Bootstrap and tested on aarch64 Linux platform again. No new regression witnessed. For "*vec_offset", it is indeed a point optimizable. However, it is not able to eliminate it by the same way as "*dataref_bump". The cse_and_gimplify_to_preheader will return the cached operand if operand_compare::operand_equal_p for the cached one and the new one returns true. But it did not work well for "*vec_offset". Do you have any suggestions for the problem? Thanks, Kaipeng Zhou > -Original Message- > From: Richard Sandiford [mailto:richard.sandif...@arm.com] > Sent: Friday, June 12, 2020 3:48 PM > To: zhoukaipeng (A) > Cc: gcc-patches@gcc.gnu.org; rguent...@suse.de; am...@gcc.gnu.org; > rgue...@gcc.gnu.org > Subject: Re: [PATCH PR95199] vect: Remove extra variable created for > memory reference > > "zhoukaipeng (A)" writes: > > Hi, > > > > This is a fix for pr95199. > > In vectorization, vinfo->ivexpr_map is supposed to catch those "IV base > and/or step expressions". Just call cse_and_gimplify_to_preheader to > handle gathering/scattering to avoid the extra variable. > > > > Bootstrap and tested on aarch64/x86_64 Linux platform. No new > regression witnessed. > > > > Is it ok? > > > > Thanks, > > Kaipeng Zhou > > > > From 12cf9b362576735e4c584c48cd6db3d2b0f2e14b Mon Sep 17 00:00:00 > 2001 > > From: Kaipeng Zhou > > Date: Fri, 12 Jun 2020 00:54:15 +0800 > > Subject: [PATCH] vect: Remove extra variable created for memory > > reference in loop vectorization. > > > > Vectorization create two copies of the same IV and IVOPTs does not > > perform CSE. But vinfo->ivexpr_map is supposed to catch those "IV > > base and/or step expressions". Just call > > cse_and_gimplify_to_preheader to handle gathering/scattering to avoid > the extra variable. > > > > 2020-06-12 Bin Cheng > > Kaipeng Zhou > > > > PR tree-optimization/95199 > > * tree-vect-stmts.c: Use CSE map to catch the IV step and eliminate > > extra variable. > > > > 2020-06-12 Bin Cheng > > Kaipeng Zhou > > > > PR tree-optimization/95199 > > * gcc.dg/vect/pr95199.c: New test. > > --- > > gcc/ChangeLog | 7 +++ > > gcc/testsuite/ChangeLog | 6 ++ > > gcc/testsuite/gcc.dg/vect/pr95199.c | 17 + > > gcc/tree-vect-stmts.c | 1 + > > 4 files changed, 31 insertions(+) > > create mode 100644 gcc/testsuite/gcc.dg/vect/pr95199.c > > > > diff --git a/gcc/ChangeLog b/gcc/ChangeLog index > > c92582df7fe..753d70fc94b 100644 > > --- a/gcc/ChangeLog > > +++ b/gcc/ChangeLog > > @@ -1,3 +1,10 @@ > > +2020-06-12 Bin Cheng > > + Kaipeng Zhou > > + > > + PR tree-optimization/95199 > > + * tree-vect-stmts.c: Use CSE map to catch the IV step and eliminate > > + extra variable. > > + > > 2020-06-08 Tobias Burnus > > > > PR lto/94848 > > diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog index > > 60d9ecca3ed..a27dd3fa072 100644 > > --- a/gcc/testsuite/ChangeLog > > +++ b/gcc/testsuite/ChangeLog > > @@ -1,3 +1,9 @@ > > +2020-06-12 Bin Cheng > > + Kaipeng Zhou > > + > > + PR tree-optimization/95199 > > + * gcc.dg/vect/pr95199.c: New test. > > + > > 2020-06-08 Harald Anlauf > > > > PR fortran/95195 > > diff --git a/gcc/testsuite/gcc.dg/vect/pr95199.c > > b/gcc/testsuite/gcc.dg/vect/pr95199.c > > new file mode 100644 > > index 000..ec201feaec8 > > --- /dev/null > > +++ b/gcc/testsuite/gcc.dg/vect/pr95199.c > > @@ -0,0 +1,17 @@ > > +/* { dg-do compile { target aarch64-*-* } } */ > > +/* { dg-options "-O3 -march=armv8.2-a+fp+sve" } */ > > + > > +void > > +foo (double *a, double *b, double m, int inc_x, int inc_y) { > > + int ix = 0, iy = 0; > > + for (int i = 0; i < 1000; ++i) > > +{ > > + a[ix] += m * b[iy]; > > + ix += inc_x; > > + iy += inc_y; > > +} > > + return ; > > +} > > + > > +/* { dg-final { scan-assembler-times "add" 12 } } */ > > For asm tests, it's probably better to put this in aarch64/sve instead. > The scan-assembler should be a bit more precise though, since just "add" > could catch filenames like /homd/ladd/gcc. E.g. maybe it would be clearer to > make the regexp match
Re: [PATCH PR95199] vect: Remove extra variable created for memory reference
"zhoukaipeng (A)" writes: > Hi, > > This is a fix for pr95199. > In vectorization, vinfo->ivexpr_map is supposed to catch those "IV base > and/or step expressions". Just call cse_and_gimplify_to_preheader to handle > gathering/scattering to avoid the extra variable. > > Bootstrap and tested on aarch64/x86_64 Linux platform. No new regression > witnessed. > > Is it ok? > > Thanks, > Kaipeng Zhou > > From 12cf9b362576735e4c584c48cd6db3d2b0f2e14b Mon Sep 17 00:00:00 2001 > From: Kaipeng Zhou > Date: Fri, 12 Jun 2020 00:54:15 +0800 > Subject: [PATCH] vect: Remove extra variable created for memory reference in > loop vectorization. > > Vectorization create two copies of the same IV and IVOPTs does not perform > CSE. But vinfo->ivexpr_map is supposed to catch those "IV base and/or step > expressions". Just call cse_and_gimplify_to_preheader to handle > gathering/scattering to avoid the extra variable. > > 2020-06-12 Bin Cheng > Kaipeng Zhou > > PR tree-optimization/95199 > * tree-vect-stmts.c: Use CSE map to catch the IV step and eliminate > extra variable. > > 2020-06-12 Bin Cheng > Kaipeng Zhou > > PR tree-optimization/95199 > * gcc.dg/vect/pr95199.c: New test. > --- > gcc/ChangeLog | 7 +++ > gcc/testsuite/ChangeLog | 6 ++ > gcc/testsuite/gcc.dg/vect/pr95199.c | 17 + > gcc/tree-vect-stmts.c | 1 + > 4 files changed, 31 insertions(+) > create mode 100644 gcc/testsuite/gcc.dg/vect/pr95199.c > > diff --git a/gcc/ChangeLog b/gcc/ChangeLog > index c92582df7fe..753d70fc94b 100644 > --- a/gcc/ChangeLog > +++ b/gcc/ChangeLog > @@ -1,3 +1,10 @@ > +2020-06-12 Bin Cheng > + Kaipeng Zhou > + > + PR tree-optimization/95199 > + * tree-vect-stmts.c: Use CSE map to catch the IV step and eliminate > + extra variable. > + > 2020-06-08 Tobias Burnus > > PR lto/94848 > diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog > index 60d9ecca3ed..a27dd3fa072 100644 > --- a/gcc/testsuite/ChangeLog > +++ b/gcc/testsuite/ChangeLog > @@ -1,3 +1,9 @@ > +2020-06-12 Bin Cheng > + Kaipeng Zhou > + > + PR tree-optimization/95199 > + * gcc.dg/vect/pr95199.c: New test. > + > 2020-06-08 Harald Anlauf > > PR fortran/95195 > diff --git a/gcc/testsuite/gcc.dg/vect/pr95199.c > b/gcc/testsuite/gcc.dg/vect/pr95199.c > new file mode 100644 > index 000..ec201feaec8 > --- /dev/null > +++ b/gcc/testsuite/gcc.dg/vect/pr95199.c > @@ -0,0 +1,17 @@ > +/* { dg-do compile { target aarch64-*-* } } */ > +/* { dg-options "-O3 -march=armv8.2-a+fp+sve" } */ > + > +void > +foo (double *a, double *b, double m, int inc_x, int inc_y) > +{ > + int ix = 0, iy = 0; > + for (int i = 0; i < 1000; ++i) > +{ > + a[ix] += m * b[iy]; > + ix += inc_x; > + iy += inc_y; > +} > + return ; > +} > + > +/* { dg-final { scan-assembler-times "add" 12 } } */ For asm tests, it's probably better to put this in aarch64/sve instead. The scan-assembler should be a bit more precise though, since just "add" could catch filenames like /homd/ladd/gcc. E.g. maybe it would be clearer to make the regexp match the register operands too. (In that case it's better to quote the regexp with {…} rather than "…", to reduce the number of backslashes.) > diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c > index b24b0fe4304..26a2b143b01 100644 > --- a/gcc/tree-vect-stmts.c > +++ b/gcc/tree-vect-stmts.c > @@ -2969,6 +2969,7 @@ vect_get_strided_load_store_ops (stmt_vec_info > stmt_info, >tree bump = size_binop (MULT_EXPR, > fold_convert (sizetype, unshare_expr (DR_STEP (dr))), > size_int (TYPE_VECTOR_SUBPARTS (vectype))); > + bump = cse_and_gimplify_to_preheader (loop_vinfo, bump); >*dataref_bump = force_gimple_operand (bump, , true, NULL_TREE); >if (stmts) > gsi_insert_seq_on_edge_immediate (loop_preheader_edge (loop), stmts); Agree it's good to do this, but using cse_and_gimplify_to_preheader makes the force_gimple_operand and gsi_insert_seq_on_edge_immediate redundant. I guess we might as well do the same thing for “*vec_offset” too. Thanks, Richard
[PATCH PR95199] vect: Remove extra variable created for memory reference
Hi, This is a fix for pr95199. In vectorization, vinfo->ivexpr_map is supposed to catch those "IV base and/or step expressions". Just call cse_and_gimplify_to_preheader to handle gathering/scattering to avoid the extra variable. Bootstrap and tested on aarch64/x86_64 Linux platform. No new regression witnessed. Is it ok? Thanks, Kaipeng Zhou 0001-vect-Remove-extra-variable-created-for-memory-refere.patch Description: 0001-vect-Remove-extra-variable-created-for-memory-refere.patch