Re: [PATCH v4 07/10] vect: Verify that GET_MODE_NUNITS is a multiple of 2.
writes: > Yes, like kito said. > We won't enable VNx1DImode in auto-vectorization so it's meaningless to fix > it here. > We dynamic adjust the minimum vector-length for different '-march' according > to RVV ISA specification. > So we strongly suggest that we should drop this fix. I think the patch should go in regardless. If we have a port with a VNx1 mode then the exact_div is at best dubious and at worst wrong. Thanks, Richard
Re: [PATCH v4 07/10] vect: Verify that GET_MODE_NUNITS is a multiple of 2.
Juzhe and Kito, Thank you for the clarification. On 4/18/23 18:48, juzhe.zh...@rivai.ai wrote: Yes, like kito said. We won't enable VNx1DImode in auto-vectorization so it's meaningless to fix it here. We dynamic adjust the minimum vector-length for different '-march' according to RVV ISA specification. So we strongly suggest that we should drop this fix. Thanks. juzhe.zh...@rivai.ai *From:* Kito Cheng <mailto:kito.ch...@gmail.com> *Date:* 2023-04-19 02:21 *To:* Richard Biener <mailto:richard.guent...@gmail.com>; Jeff Law <mailto:jeffreya...@gmail.com>; Palmer Dabbelt <mailto:pal...@dabbelt.com> *CC:* Michael Collison <mailto:colli...@rivosinc.com>; gcc-patches <mailto:gcc-patches@gcc.gnu.org>; 钟居哲 <mailto:juzhe.zh...@rivai.ai> *Subject:* Re: [PATCH v4 07/10] vect: Verify that GET_MODE_NUNITS is a multiple of 2. Few more background about RVV: RISC-V has provide different VLEN configuration by different ISA extension like `zve32x`, `zve64x` and `v` zve32x just guarantee the minimal VLEN is 32 bits, zve64x guarantee the minimal VLEN is 64 bits, and v guarantee the minimal VLEN is 128 bits, Current status (without that patch): Zve32x: Mode for one vector register mode is VNx1SImode and VNx1DImode is invalid mode - one vector register could hold 1 + 1x SImode where x is 0~n, so it might hold just one SI Zve64x: Mode for one vector register mode is VNx1DImode or VNx2SImode - one vector register could hold 1 + 1x DImode where x is 0~n, so it might hold just one DI - one vector register could hold 2 + 2x SImode where x is 0~n, so it might hold just two SI So what I want to say here is VNx1DImode is really NOT safe to assume to have more than two DI in theory. However `v` extension guarantees the minimal VLEN is 128 bits. We are trying to introduce another type/mode mapping for this configure: v: Mode for one vector register mode is VNx2DImode or VNx4SImode - one vector register could hold 2 + 2x DImode where x is 0~n, so it will hold at least two DI - one vector register could hold 4 + 4x SImode where x is 0~n, so it will hold at least four DI So GET_MODE_NUNITS for a single vector register with DI mode will become 2 (VNx2DImode) if it is really possible, which is a more precise way to model the vector extension for RISC-V . On Tue, Apr 18, 2023 at 10:28 PM Kito Cheng wrote: > > Wait, VNx1DImode can be really evaluate to just one element if > -march=rv64g_zve64x, > > I thinks this should be just fixed on backend by this patch: > > https://patchwork.ozlabs.org/project/gcc/patch/20230414014518.15458-1-juzhe.zh...@rivai.ai/ > > On Tue, Apr 18, 2023 at 2:12 PM Richard Biener via Gcc-patches > wrote: > > > > On Mon, Apr 17, 2023 at 8:42 PM Michael Collison wrote: > > > > > > While working on autovectorizing for the RISCV port I encountered an issue > > > where can_duplicate_and_interleave_p assumes that GET_MODE_NUNITS is a > > > evenly divisible by two. The RISC-V target has vector modes (e.g. VNx1DImode), > > > where GET_MODE_NUNITS is equal to one. > > > > > > Tested on RISCV and x86_64-linux-gnu. Okay? > > > > OK. > > > > > 2023-03-09 Michael Collison > > > > > > * tree-vect-slp.cc (can_duplicate_and_interleave_p): > > > Check that GET_MODE_NUNITS is a multiple of 2. > > > --- > > > gcc/tree-vect-slp.cc | 7 +-- > > > 1 file changed, 5 insertions(+), 2 deletions(-) > > > > > > diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc > > > index d73deaecce0..a64fe454e19 100644 > > > --- a/gcc/tree-vect-slp.cc > > > +++ b/gcc/tree-vect-slp.cc > > > @@ -423,10 +423,13 @@ can_duplicate_and_interleave_p (vec_info *vinfo, unsigned int count, > > > (GET_MODE_BITSIZE (int_mode), 1); > > > tree vector_type > > > = get_vectype_for_scalar_type (vinfo, int_type, count); > > > + poly_int64 half_nelts; > > > if (vector_type > > > && VECTOR_MODE_P (TYPE_MODE (vector_type)) > > > && known_eq (GET_MODE_SIZE (TYPE_MODE (vector_type)), > > > - GET_MODE_SIZE (base_vector_mode))) > > > + GET_MODE_SIZE (base_vector_mode)) > > > +
Re: Re: [PATCH v4 07/10] vect: Verify that GET_MODE_NUNITS is a multiple of 2.
Yes, like kito said. We won't enable VNx1DImode in auto-vectorization so it's meaningless to fix it here. We dynamic adjust the minimum vector-length for different '-march' according to RVV ISA specification. So we strongly suggest that we should drop this fix. Thanks. juzhe.zh...@rivai.ai From: Kito Cheng Date: 2023-04-19 02:21 To: Richard Biener; Jeff Law; Palmer Dabbelt CC: Michael Collison; gcc-patches; 钟居哲 Subject: Re: [PATCH v4 07/10] vect: Verify that GET_MODE_NUNITS is a multiple of 2. Few more background about RVV: RISC-V has provide different VLEN configuration by different ISA extension like `zve32x`, `zve64x` and `v` zve32x just guarantee the minimal VLEN is 32 bits, zve64x guarantee the minimal VLEN is 64 bits, and v guarantee the minimal VLEN is 128 bits, Current status (without that patch): Zve32x: Mode for one vector register mode is VNx1SImode and VNx1DImode is invalid mode - one vector register could hold 1 + 1x SImode where x is 0~n, so it might hold just one SI Zve64x: Mode for one vector register mode is VNx1DImode or VNx2SImode - one vector register could hold 1 + 1x DImode where x is 0~n, so it might hold just one DI - one vector register could hold 2 + 2x SImode where x is 0~n, so it might hold just two SI So what I want to say here is VNx1DImode is really NOT safe to assume to have more than two DI in theory. However `v` extension guarantees the minimal VLEN is 128 bits. We are trying to introduce another type/mode mapping for this configure: v: Mode for one vector register mode is VNx2DImode or VNx4SImode - one vector register could hold 2 + 2x DImode where x is 0~n, so it will hold at least two DI - one vector register could hold 4 + 4x SImode where x is 0~n, so it will hold at least four DI So GET_MODE_NUNITS for a single vector register with DI mode will become 2 (VNx2DImode) if it is really possible, which is a more precise way to model the vector extension for RISC-V . On Tue, Apr 18, 2023 at 10:28 PM Kito Cheng wrote: > > Wait, VNx1DImode can be really evaluate to just one element if > -march=rv64g_zve64x, > > I thinks this should be just fixed on backend by this patch: > > https://patchwork.ozlabs.org/project/gcc/patch/20230414014518.15458-1-juzhe.zh...@rivai.ai/ > > On Tue, Apr 18, 2023 at 2:12 PM Richard Biener via Gcc-patches > wrote: > > > > On Mon, Apr 17, 2023 at 8:42 PM Michael Collison > > wrote: > > > > > > While working on autovectorizing for the RISCV port I encountered an issue > > > where can_duplicate_and_interleave_p assumes that GET_MODE_NUNITS is a > > > evenly divisible by two. The RISC-V target has vector modes (e.g. > > > VNx1DImode), > > > where GET_MODE_NUNITS is equal to one. > > > > > > Tested on RISCV and x86_64-linux-gnu. Okay? > > > > OK. > > > > > 2023-03-09 Michael Collison > > > > > > * tree-vect-slp.cc (can_duplicate_and_interleave_p): > > > Check that GET_MODE_NUNITS is a multiple of 2. > > > --- > > > gcc/tree-vect-slp.cc | 7 +-- > > > 1 file changed, 5 insertions(+), 2 deletions(-) > > > > > > diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc > > > index d73deaecce0..a64fe454e19 100644 > > > --- a/gcc/tree-vect-slp.cc > > > +++ b/gcc/tree-vect-slp.cc > > > @@ -423,10 +423,13 @@ can_duplicate_and_interleave_p (vec_info *vinfo, > > > unsigned int count, > > > (GET_MODE_BITSIZE (int_mode), 1); > > > tree vector_type > > > = get_vectype_for_scalar_type (vinfo, int_type, count); > > > + poly_int64 half_nelts; > > > if (vector_type > > > && VECTOR_MODE_P (TYPE_MODE (vector_type)) > > > && known_eq (GET_MODE_SIZE (TYPE_MODE (vector_type)), > > > - GET_MODE_SIZE (base_vector_mode))) > > > + GET_MODE_SIZE (base_vector_mode)) > > > + && multiple_p (GET_MODE_NUNITS (TYPE_MODE (vector_type)), > > > +2, &half_nelts)) > > > { > > > /* Try fusing consecutive sequences of COUNT / NVECTORS > > > elements > > > together into elements of type INT_TYPE and using the > > > result > > > @@ -434,7 +437,7 @@ can_duplicate_and_interleave_p (vec_info *vinfo, > > > unsigned int count, > > > poly_uint64 nelts = GET_MODE_NUNITS (TYPE_MODE > > > (vector_type)); > > > vec_perm_builder sel1 (nelts, 2, 3); > > > vec_perm_builder sel2 (nelts, 2, 3); > > > - poly_int64 half_nelts = exact_div (nelts, 2); > > > + > > > for (unsigned int i = 0; i < 3; ++i) > > > { > > > sel1.quick_push (i); > > > -- > > > 2.34.1 > > >
Re: [PATCH v4 07/10] vect: Verify that GET_MODE_NUNITS is a multiple of 2.
Few more background about RVV: RISC-V has provide different VLEN configuration by different ISA extension like `zve32x`, `zve64x` and `v` zve32x just guarantee the minimal VLEN is 32 bits, zve64x guarantee the minimal VLEN is 64 bits, and v guarantee the minimal VLEN is 128 bits, Current status (without that patch): Zve32x: Mode for one vector register mode is VNx1SImode and VNx1DImode is invalid mode - one vector register could hold 1 + 1x SImode where x is 0~n, so it might hold just one SI Zve64x: Mode for one vector register mode is VNx1DImode or VNx2SImode - one vector register could hold 1 + 1x DImode where x is 0~n, so it might hold just one DI - one vector register could hold 2 + 2x SImode where x is 0~n, so it might hold just two SI So what I want to say here is VNx1DImode is really NOT safe to assume to have more than two DI in theory. However `v` extension guarantees the minimal VLEN is 128 bits. We are trying to introduce another type/mode mapping for this configure: v: Mode for one vector register mode is VNx2DImode or VNx4SImode - one vector register could hold 2 + 2x DImode where x is 0~n, so it will hold at least two DI - one vector register could hold 4 + 4x SImode where x is 0~n, so it will hold at least four DI So GET_MODE_NUNITS for a single vector register with DI mode will become 2 (VNx2DImode) if it is really possible, which is a more precise way to model the vector extension for RISC-V . On Tue, Apr 18, 2023 at 10:28 PM Kito Cheng wrote: > > Wait, VNx1DImode can be really evaluate to just one element if > -march=rv64g_zve64x, > > I thinks this should be just fixed on backend by this patch: > > https://patchwork.ozlabs.org/project/gcc/patch/20230414014518.15458-1-juzhe.zh...@rivai.ai/ > > On Tue, Apr 18, 2023 at 2:12 PM Richard Biener via Gcc-patches > wrote: > > > > On Mon, Apr 17, 2023 at 8:42 PM Michael Collison > > wrote: > > > > > > While working on autovectorizing for the RISCV port I encountered an issue > > > where can_duplicate_and_interleave_p assumes that GET_MODE_NUNITS is a > > > evenly divisible by two. The RISC-V target has vector modes (e.g. > > > VNx1DImode), > > > where GET_MODE_NUNITS is equal to one. > > > > > > Tested on RISCV and x86_64-linux-gnu. Okay? > > > > OK. > > > > > 2023-03-09 Michael Collison > > > > > > * tree-vect-slp.cc (can_duplicate_and_interleave_p): > > > Check that GET_MODE_NUNITS is a multiple of 2. > > > --- > > > gcc/tree-vect-slp.cc | 7 +-- > > > 1 file changed, 5 insertions(+), 2 deletions(-) > > > > > > diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc > > > index d73deaecce0..a64fe454e19 100644 > > > --- a/gcc/tree-vect-slp.cc > > > +++ b/gcc/tree-vect-slp.cc > > > @@ -423,10 +423,13 @@ can_duplicate_and_interleave_p (vec_info *vinfo, > > > unsigned int count, > > > (GET_MODE_BITSIZE (int_mode), 1); > > > tree vector_type > > > = get_vectype_for_scalar_type (vinfo, int_type, count); > > > + poly_int64 half_nelts; > > > if (vector_type > > > && VECTOR_MODE_P (TYPE_MODE (vector_type)) > > > && known_eq (GET_MODE_SIZE (TYPE_MODE (vector_type)), > > > - GET_MODE_SIZE (base_vector_mode))) > > > + GET_MODE_SIZE (base_vector_mode)) > > > + && multiple_p (GET_MODE_NUNITS (TYPE_MODE (vector_type)), > > > +2, &half_nelts)) > > > { > > > /* Try fusing consecutive sequences of COUNT / NVECTORS > > > elements > > > together into elements of type INT_TYPE and using the > > > result > > > @@ -434,7 +437,7 @@ can_duplicate_and_interleave_p (vec_info *vinfo, > > > unsigned int count, > > > poly_uint64 nelts = GET_MODE_NUNITS (TYPE_MODE > > > (vector_type)); > > > vec_perm_builder sel1 (nelts, 2, 3); > > > vec_perm_builder sel2 (nelts, 2, 3); > > > - poly_int64 half_nelts = exact_div (nelts, 2); > > > + > > > for (unsigned int i = 0; i < 3; ++i) > > > { > > > sel1.quick_push (i); > > > -- > > > 2.34.1 > > >
Re: [PATCH v4 07/10] vect: Verify that GET_MODE_NUNITS is a multiple of 2.
Wait, VNx1DImode can be really evaluate to just one element if -march=rv64g_zve64x, I thinks this should be just fixed on backend by this patch: https://patchwork.ozlabs.org/project/gcc/patch/20230414014518.15458-1-juzhe.zh...@rivai.ai/ On Tue, Apr 18, 2023 at 2:12 PM Richard Biener via Gcc-patches wrote: > > On Mon, Apr 17, 2023 at 8:42 PM Michael Collison > wrote: > > > > While working on autovectorizing for the RISCV port I encountered an issue > > where can_duplicate_and_interleave_p assumes that GET_MODE_NUNITS is a > > evenly divisible by two. The RISC-V target has vector modes (e.g. > > VNx1DImode), > > where GET_MODE_NUNITS is equal to one. > > > > Tested on RISCV and x86_64-linux-gnu. Okay? > > OK. > > > 2023-03-09 Michael Collison > > > > * tree-vect-slp.cc (can_duplicate_and_interleave_p): > > Check that GET_MODE_NUNITS is a multiple of 2. > > --- > > gcc/tree-vect-slp.cc | 7 +-- > > 1 file changed, 5 insertions(+), 2 deletions(-) > > > > diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc > > index d73deaecce0..a64fe454e19 100644 > > --- a/gcc/tree-vect-slp.cc > > +++ b/gcc/tree-vect-slp.cc > > @@ -423,10 +423,13 @@ can_duplicate_and_interleave_p (vec_info *vinfo, > > unsigned int count, > > (GET_MODE_BITSIZE (int_mode), 1); > > tree vector_type > > = get_vectype_for_scalar_type (vinfo, int_type, count); > > + poly_int64 half_nelts; > > if (vector_type > > && VECTOR_MODE_P (TYPE_MODE (vector_type)) > > && known_eq (GET_MODE_SIZE (TYPE_MODE (vector_type)), > > - GET_MODE_SIZE (base_vector_mode))) > > + GET_MODE_SIZE (base_vector_mode)) > > + && multiple_p (GET_MODE_NUNITS (TYPE_MODE (vector_type)), > > +2, &half_nelts)) > > { > > /* Try fusing consecutive sequences of COUNT / NVECTORS > > elements > > together into elements of type INT_TYPE and using the > > result > > @@ -434,7 +437,7 @@ can_duplicate_and_interleave_p (vec_info *vinfo, > > unsigned int count, > > poly_uint64 nelts = GET_MODE_NUNITS (TYPE_MODE (vector_type)); > > vec_perm_builder sel1 (nelts, 2, 3); > > vec_perm_builder sel2 (nelts, 2, 3); > > - poly_int64 half_nelts = exact_div (nelts, 2); > > + > > for (unsigned int i = 0; i < 3; ++i) > > { > > sel1.quick_push (i); > > -- > > 2.34.1 > >
Re: [PATCH v4 07/10] vect: Verify that GET_MODE_NUNITS is a multiple of 2.
On Mon, Apr 17, 2023 at 8:42 PM Michael Collison wrote: > > While working on autovectorizing for the RISCV port I encountered an issue > where can_duplicate_and_interleave_p assumes that GET_MODE_NUNITS is a > evenly divisible by two. The RISC-V target has vector modes (e.g. VNx1DImode), > where GET_MODE_NUNITS is equal to one. > > Tested on RISCV and x86_64-linux-gnu. Okay? OK. > 2023-03-09 Michael Collison > > * tree-vect-slp.cc (can_duplicate_and_interleave_p): > Check that GET_MODE_NUNITS is a multiple of 2. > --- > gcc/tree-vect-slp.cc | 7 +-- > 1 file changed, 5 insertions(+), 2 deletions(-) > > diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc > index d73deaecce0..a64fe454e19 100644 > --- a/gcc/tree-vect-slp.cc > +++ b/gcc/tree-vect-slp.cc > @@ -423,10 +423,13 @@ can_duplicate_and_interleave_p (vec_info *vinfo, > unsigned int count, > (GET_MODE_BITSIZE (int_mode), 1); > tree vector_type > = get_vectype_for_scalar_type (vinfo, int_type, count); > + poly_int64 half_nelts; > if (vector_type > && VECTOR_MODE_P (TYPE_MODE (vector_type)) > && known_eq (GET_MODE_SIZE (TYPE_MODE (vector_type)), > - GET_MODE_SIZE (base_vector_mode))) > + GET_MODE_SIZE (base_vector_mode)) > + && multiple_p (GET_MODE_NUNITS (TYPE_MODE (vector_type)), > +2, &half_nelts)) > { > /* Try fusing consecutive sequences of COUNT / NVECTORS elements > together into elements of type INT_TYPE and using the result > @@ -434,7 +437,7 @@ can_duplicate_and_interleave_p (vec_info *vinfo, unsigned > int count, > poly_uint64 nelts = GET_MODE_NUNITS (TYPE_MODE (vector_type)); > vec_perm_builder sel1 (nelts, 2, 3); > vec_perm_builder sel2 (nelts, 2, 3); > - poly_int64 half_nelts = exact_div (nelts, 2); > + > for (unsigned int i = 0; i < 3; ++i) > { > sel1.quick_push (i); > -- > 2.34.1 >