> On 06.09.2024 at 16:05, Robin Dapp <rdapp....@gmail.com> wrote:
> 
> Hi,
> 
> PR112694 shows that we try to create sub-vectors of single-element
> vectors because can_duplicate_and_interleave_p returns true.

Can we avoid querying the function?  CCing Richard who should know more about 
this.

Richard 

> The problem resurfaced in PR116611.
> 
> This patch makes can_duplicate_and_interleave_p return false
> if count / nvectors == 0 (i.e. nvectors > count) and removes the
> corresponding check in the riscv backend.
> 
> This partially gets rid of the FAIL in slp-19a.c.  At least when built
> with cost model we don't have LOAD_LANES anymore.  Without cost model,
> as in the test suite, we choose a different path and still end up with
> LOAD_LANES.
> 
> Bootstrapped and regtested on x86 and power10, regtested on
> rv64gcv_zvfh_zvbb.  Still waiting for the aarch64 results.
> 
> Regards
> Robin
> 
> gcc/ChangeLog:
> 
>    PR target/112694
>    PR target/116611
> 
>    * config/riscv/riscv-v.cc (expand_vec_perm_const): Remove early
>    return.
>    * tree-vect-slp.cc (can_duplicate_and_interleave_p): Return
>    false when we cannot create sub-elements.
> ---
> gcc/config/riscv/riscv-v.cc | 9 ---------
> gcc/tree-vect-slp.cc        | 4 ++++
> 2 files changed, 4 insertions(+), 9 deletions(-)
> 
> diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
> index 9b6c3a21e2d..5c5ed63d22e 100644
> --- a/gcc/config/riscv/riscv-v.cc
> +++ b/gcc/config/riscv/riscv-v.cc
> @@ -3709,15 +3709,6 @@ expand_vec_perm_const (machine_mode vmode, machine_mode op_mode, rtx target,
>      mask to do the iteration loop control. Just disable it directly.  */
>   if (GET_MODE_CLASS (vmode) == MODE_VECTOR_BOOL)
>     return false;
> -  /* FIXME: Explicitly disable VLA interleave SLP vectorization when we
> -     may encounter ICE for poly size (1, 1) vectors in loop vectorizer.
> -     Ideally, middle-end loop vectorizer should be able to disable it
> -     itself, We can remove the codes here when middle-end code is able
> -     to disable VLA SLP vectorization for poly size (1, 1) VF.  */
> -  if (!BYTES_PER_RISCV_VECTOR.is_constant ()
> -      && maybe_lt (BYTES_PER_RISCV_VECTOR * TARGET_MAX_LMUL,
> -           poly_int64 (16, 16)))
> -    return false;
> 
>   struct expand_vec_perm_d d;
> 
> diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
> index 3d2973698e2..17b59870c69 100644
> --- a/gcc/tree-vect-slp.cc
> +++ b/gcc/tree-vect-slp.cc
> @@ -434,6 +434,10 @@ can_duplicate_and_interleave_p (vec_info *vinfo, unsigned int count,
>   unsigned int nvectors = 1;
>   for (;;)
>     {
> +      /* We need to be able to fuse COUNT / NVECTORS elements together,
> +     so no point in continuing if there are none.  */
> +      if (nvectors > count)
> +    return false;
>       scalar_int_mode int_mode;
>       poly_int64 elt_bits = elt_bytes * BITS_PER_UNIT;
>       if (int_mode_for_size (elt_bits, 1).exists (&int_mode))
> --
> 2.46.0
> 

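To make the effect of the new early exit concrete, here is a small standalone model of the loop. It is only a sketch under simplifying assumptions, not the GCC code: sizes are plain integers instead of poly_int64, the "suitable integer mode exists" test is reduced to "the fused size fits in max_int_bytes", and the names can_fuse_model and max_int_bytes are made up for illustration. The real function additionally checks vector types and permute support, which the model ignores.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified model of the search loop.  COUNT elements of
   ELT_SIZE_BYTES each are fused into one integer; if no integer of that
   size is available, the fused size is halved and the number of vectors
   doubled.  MAX_INT_BYTES stands in for the widest usable integer mode.  */
static bool
can_fuse_model (unsigned int count, unsigned int elt_size_bytes,
                unsigned int max_int_bytes, unsigned int *nvectors_out)
{
  assert (count > 0 && elt_size_bytes > 0);

  unsigned int fused_bytes = count * elt_size_bytes;
  unsigned int nvectors = 1;
  for (;;)
    {
      /* Mirrors the check added by the patch: each vector has to fuse
         COUNT / NVECTORS elements, so give up once that would be zero.  */
      if (nvectors > count)
        return false;

      /* Simplified stand-in for the integer-mode lookup.  */
      if (fused_bytes <= max_int_bytes)
        {
          *nvectors_out = nvectors;
          return true;
        }

      /* Halve the fused size and retry with twice as many vectors.  */
      if (fused_bytes % 2 != 0)
        return false;
      fused_bytes /= 2;
      nvectors *= 2;
    }
}
```

In this model, four 4-byte elements with an 8-byte limit succeed with two vectors, while a single 16-byte element is rejected as soon as nvectors reaches 2. Without the early exit the model would accept that case and "fuse" half an element per vector, which corresponds to the sub-vector-of-a-single-element situation described in the patch.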