Michael Collison <colli...@rivosinc.com> writes:
> While working on autovectorizing for the RISCV port I encountered an issue
> where can_duplicate_and_interleave_p assumes that GET_MODE_NUNITS is
> evenly divisible by two. The RISC-V target has vector modes (e.g. VNx1DImode),
> where GET_MODE_NUNITS is equal to one.
>
> Tested on RISCV and x86_64-linux-gnu. Okay?
>
> 2023-03-09  Michael Collison  <colli...@rivosinc.com>
>
>       * tree-vect-slp.cc (can_duplicate_and_interleave_p):
>       Check that GET_MODE_NUNITS is a multiple of 2.

OK, thanks.  Doesn't need to wait for any of the other patches
in the series.

Richard

> ---
>  gcc/tree-vect-slp.cc | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
> index d73deaecce0..a64fe454e19 100644
> --- a/gcc/tree-vect-slp.cc
> +++ b/gcc/tree-vect-slp.cc
> @@ -423,10 +423,13 @@ can_duplicate_and_interleave_p (vec_info *vinfo, unsigned int count,
>           (GET_MODE_BITSIZE (int_mode), 1);
>         tree vector_type
>           = get_vectype_for_scalar_type (vinfo, int_type, count);
> +       poly_int64 half_nelts;
>         if (vector_type
>             && VECTOR_MODE_P (TYPE_MODE (vector_type))
>             && known_eq (GET_MODE_SIZE (TYPE_MODE (vector_type)),
> -                        GET_MODE_SIZE (base_vector_mode)))
> +                        GET_MODE_SIZE (base_vector_mode))
> +           && multiple_p (GET_MODE_NUNITS (TYPE_MODE (vector_type)),
> +                          2, &half_nelts))
>           {
>             /* Try fusing consecutive sequences of COUNT / NVECTORS elements
>                together into elements of type INT_TYPE and using the result
> @@ -434,7 +437,7 @@ can_duplicate_and_interleave_p (vec_info *vinfo, unsigned int count,
>             poly_uint64 nelts = GET_MODE_NUNITS (TYPE_MODE (vector_type));
>             vec_perm_builder sel1 (nelts, 2, 3);
>             vec_perm_builder sel2 (nelts, 2, 3);
> -           poly_int64 half_nelts = exact_div (nelts, 2);
> +
>             for (unsigned int i = 0; i < 3; ++i)
>               {
>                 sel1.quick_push (i);
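
For readers unfamiliar with the poly_int helpers, here is a rough standalone
sketch of why the patch moves the divisibility test into the condition. It
does not use GCC's real types: ToyPoly, toy_multiple_p and run_examples are
hypothetical names invented for illustration, and the model assumes a value of
the form c0 + c1 * x, where x is a non-negative quantity known only at run
time. The coefficient pairs used for the one-unit and four-unit cases are
likewise assumptions, not values taken from the RISC-V backend.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a two-coefficient poly_int: c0 + c1 * x,
// where x >= 0 is only known at run time. Not GCC's real type.
struct ToyPoly {
  std::int64_t c0;
  std::int64_t c1;
};

// Toy analogue of a multiple_p (value, factor, &quotient) check.
// Returns true only if VALUE is a multiple of FACTOR for every x, which
// in this model means both coefficients are divisible by FACTOR.
// On success the quotient is stored through QUOTIENT; on failure
// *QUOTIENT is left untouched.
bool toy_multiple_p(const ToyPoly &value, std::int64_t factor,
                    ToyPoly *quotient) {
  if (value.c0 % factor != 0 || value.c1 % factor != 0)
    return false;
  *quotient = ToyPoly{value.c0 / factor, value.c1 / factor};
  return true;
}

// Exercises the two cases discussed in the patch description.
void run_examples() {
  ToyPoly half{0, 0};

  // Assumed model of a mode with a single unit per x: 1 + 1 * x.
  // It cannot be halved, so the guarded block would be skipped.
  assert(!toy_multiple_p(ToyPoly{1, 1}, 2, &half));

  // Assumed model of a mode with four units per x: 4 + 4 * x.
  // Halving succeeds and yields 2 + 2 * x.
  assert(toy_multiple_p(ToyPoly{4, 4}, 2, &half));
  assert(half.c0 == 2 && half.c1 == 2);
}
```

In this toy model the unconditional division that the patch removes would
have no valid result for the one-unit case, whereas the combined test-and-divide
form simply makes the condition false and falls through to the next candidate.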
