"Yangfei (Felix)" <felix.y...@huawei.com> writes:
> diff --git a/gcc/testsuite/gcc.dg/vect/vect-ctor-1.c 
> b/gcc/testsuite/gcc.dg/vect/vect-ctor-1.c
> index e050db1a2e4..ea39fcac0e0 100644
> --- a/gcc/testsuite/gcc.dg/vect/vect-ctor-1.c
> +++ b/gcc/testsuite/gcc.dg/vect/vect-ctor-1.c
> @@ -1,6 +1,7 @@
>  /* { dg-do compile } */
>  /* { dg-additional-options "-O3" } */
>  /* { dg-additional-options "-mavx2" { target { i?86-*-* x86_64-*-* } } } */
> +/* { dg-additional-options "-march=armv8.2-a+sve -fno-vect-cost-model" { 
> target aarch64*-*-* } } */
>  
>  typedef struct {
>      unsigned short mprr_2[5][16][16];

This test is useful for Advanced SIMD too, so I think we should continue
to test it with whatever options the person running the testsuite chose.
Instead we could duplicate the test in gcc.target/aarch64/sve with
appropriate options.

> diff --git a/gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c
> index eb8288e7a85..b30a7d8a3bb 100644
> --- a/gcc/tree-vect-data-refs.c
> +++ b/gcc/tree-vect-data-refs.c
> @@ -1823,8 +1823,11 @@ vect_enhance_data_refs_alignment (loop_vec_info 
> loop_vinfo)
>               {
>                 poly_uint64 nscalars = (STMT_SLP_TYPE (stmt_info)
>                                         ? vf * DR_GROUP_SIZE (stmt_info) : 
> vf);
> -               possible_npeel_number
> -                 = vect_get_num_vectors (nscalars, vectype);
> +               if (maybe_lt (nscalars, TYPE_VECTOR_SUBPARTS (vectype)))
> +                 possible_npeel_number = 0;
> +               else
> +                 possible_npeel_number
> +                   = vect_get_num_vectors (nscalars, vectype);
>  
>                 /* NPEEL_TMP is 0 when there is no misalignment, but also
>                    allow peeling NELEMENTS.  */

OK, so this is coming from:

  int s[16][2];
  …
  … =s[j][1];

and an SLP node containing 16 instances of “s[j][1]”.  The DR_GROUP_SIZE
is 2 because that's the inner dimension of “s”.

I don't think maybe_lt is right here though.  The same problem could in
principle happen for cases in which NSCALARS > TYPE_VECTOR_SUBPARTS,
e.g. for different inner dimensions of “s”.

I think the testcase shows that using DR_GROUP_SIZE in this calculation
is flawed.  I'm not sure whether we can really do better given the current
representation though.  This is one case where having a separate dr_vec_info
per SLP node would help.

Maybe one option (for now) would be to use:

  if (multiple_p (nscalars, TREE_VECTOR_SUBPARTS (vectype)))
    possible_npeel_number = vect_get_num_vectors (nscalars, vectype);
  else
    /* This isn't a simple stream of contiguous vector accesses.  It's hard
       to predict from the available information how many vector accesses
       we'll actually need per iteration, so be conservative and assume
       one.  */
    possible_npeel_number = 1;

BTW, I'm not sure whether the current choice of STMT_SLP_TYPE (stmt_info)
instead of PURE_SLP_STMT (stmt_info) is optimal or not.  It means that for
hybrid SLP we base the peeling on the SLP stmt rather than the non-SLP stmt.
I guess hybrid SLP is going away soon though, so let's not worry about
that. :-)

Maybe Richard has a better suggestion.

Thanks,
Richard

Reply via email to