"Yangfei (Felix)" <felix.y...@huawei.com> writes:
> diff --git a/gcc/testsuite/gcc.dg/vect/vect-ctor-1.c b/gcc/testsuite/gcc.dg/vect/vect-ctor-1.c
> index e050db1a2e4..ea39fcac0e0 100644
> --- a/gcc/testsuite/gcc.dg/vect/vect-ctor-1.c
> +++ b/gcc/testsuite/gcc.dg/vect/vect-ctor-1.c
> @@ -1,6 +1,7 @@
>  /* { dg-do compile } */
>  /* { dg-additional-options "-O3" } */
>  /* { dg-additional-options "-mavx2" { target { i?86-*-* x86_64-*-* } } } */
> +/* { dg-additional-options "-march=armv8.2-a+sve -fno-vect-cost-model" { target aarch64*-*-* } } */
>
>  typedef struct {
>    unsigned short mprr_2[5][16][16];

This test is useful for Advanced SIMD too, so I think we should continue
to test it with whatever options the person running the testsuite chose.
Instead we could duplicate the test in gcc.target/aarch64/sve with
appropriate options.

> diff --git a/gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c
> index eb8288e7a85..b30a7d8a3bb 100644
> --- a/gcc/tree-vect-data-refs.c
> +++ b/gcc/tree-vect-data-refs.c
> @@ -1823,8 +1823,11 @@ vect_enhance_data_refs_alignment (loop_vec_info loop_vinfo)
>                  {
>                    poly_uint64 nscalars = (STMT_SLP_TYPE (stmt_info)
>                                            ? vf * DR_GROUP_SIZE (stmt_info) : vf);
> -                  possible_npeel_number
> -                    = vect_get_num_vectors (nscalars, vectype);
> +                  if (maybe_lt (nscalars, TYPE_VECTOR_SUBPARTS (vectype)))
> +                    possible_npeel_number = 0;
> +                  else
> +                    possible_npeel_number
> +                      = vect_get_num_vectors (nscalars, vectype);
>
>                    /* NPEEL_TMP is 0 when there is no misalignment, but also
>                       allow peeling NELEMENTS.  */

OK, so this is coming from:

  int s[16][2];
  …
  … = s[j][1];

and an SLP node containing 16 instances of “s[j][1]”. The DR_GROUP_SIZE
is 2 because that's the inner dimension of “s”.

I don't think maybe_lt is right here though. The same problem could in
principle happen for cases in which NSCALARS > TYPE_VECTOR_SUBPARTS,
e.g. for different inner dimensions of “s”.

I think the testcase shows that using DR_GROUP_SIZE in this calculation
is flawed. I'm not sure whether we can really do better given the current
representation though. This is one case where having a separate
dr_vec_info per SLP node would help.

Maybe one option (for now) would be to use:

  if (multiple_p (nscalars, TYPE_VECTOR_SUBPARTS (vectype)))
    possible_npeel_number = vect_get_num_vectors (nscalars, vectype);
  else
    /* This isn't a simple stream of contiguous vector accesses.  It's hard
       to predict from the available information how many vector accesses
       we'll actually need per iteration, so be conservative and assume
       one.  */
    possible_npeel_number = 1;

BTW, I'm not sure whether the current choice of STMT_SLP_TYPE (stmt_info)
instead of PURE_SLP_STMT (stmt_info) is optimal or not. It means that for
hybrid SLP we base the peeling on the SLP stmt rather than the non-SLP
stmt. I guess hybrid SLP is going away soon though, so let's not worry
about that. :-)

Maybe Richard has a better suggestion.

Thanks,
Richard