
RE: [PATCH v1] RISC-V: Support VLS mode for vec_set

2023-09-18 Thread Li, Pan2 via Gcc-patches

Committed, thanks Kito.

Pan

-Original Message-
From: Kito Cheng  
Sent: Monday, September 18, 2023 11:36 AM
To: Li, Pan2 
Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; Wang, Yanzhang 

Subject: Re: [PATCH v1] RISC-V: Support VLS mode for vec_set

LGTM

On Mon, Sep 18, 2023 at 11:27 AM Pan Li via Gcc-patches
 wrote:
>
> From: Pan Li 
>
> This patch adds VLS mode support for vec_set; both INT
> and FP are included.
>
> Sample code:
>
> typedef long long vl_t \
>   __attribute__((vector_size(2 * sizeof (long long))));
>
> vl_t init_vl (vl_t v, unsigned index, unsigned value)
> {
>   v[index] = value;
>
>   return v;
> }
>
> Before this patch:
> init_vl:
>   addi sp,sp,-16
>   vsetivli zero,2,e64,m1,ta,ma
>   vle64.v  v1,0(a1)
>   vse64.v  v1,0(sp)
>   slli a4,a2,32
>   srli a2,a4,29
>   add  a2,sp,a2
>   slli a3,a3,32
>   srli a3,a3,32
>   sd   a3,0(a2)
>   vle64.v  v1,0(sp)
>   vse64.v  v1,0(a0)
>   addi sp,sp,16
>   jr   ra
>
> After this patch:
> init_vl:
>   vsetivli    zero,2,e64,m1,ta,ma
>   vle64.v     v1,0(a1)
>   slli        a3,a3,32
>   srli        a3,a3,32
>   addi        a5,a2,1
>   vsetvli     zero,a5,e64,m1,tu,ma
>   vmv.v.x     v2,a3
>   vslideup.vx v1,v2,a2
>   vsetivli    zero,2,e64,m1,ta,ma
>   vse64.v     v1,0(a0)
>   ret
>
> Please note this patch depends on the RVV SCALAR_MOVE_MERGED_OP bugfix.
>
> gcc/ChangeLog:
>
> * config/riscv/autovec.md: Extend to vls mode.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/rvv/autovec/vls/def.h: New macros.
> * gcc.target/riscv/rvv/autovec/vls/vec-set-1.c: New test.
> * gcc.target/riscv/rvv/autovec/vls/vec-set-10.c: New test.
> * gcc.target/riscv/rvv/autovec/vls/vec-set-11.c: New test.
> * gcc.target/riscv/rvv/autovec/vls/vec-set-12.c: New test.
> * gcc.target/riscv/rvv/autovec/vls/vec-set-13.c: New test.
> * gcc.target/riscv/rvv/autovec/vls/vec-set-14.c: New test.
> * gcc.target/riscv/rvv/autovec/vls/vec-set-15.c: New test.
> * gcc.target/riscv/rvv/autovec/vls/vec-set-16.c: New test.
> * gcc.target/riscv/rvv/autovec/vls/vec-set-17.c: New test.
> * gcc.target/riscv/rvv/autovec/vls/vec-set-18.c: New test.
> * gcc.target/riscv/rvv/autovec/vls/vec-set-19.c: New test.
> * gcc.target/riscv/rvv/autovec/vls/vec-set-2.c: New test.
> * gcc.target/riscv/rvv/autovec/vls/vec-set-20.c: New test.
> * gcc.target/riscv/rvv/autovec/vls/vec-set-21.c: New test.
> * gcc.target/riscv/rvv/autovec/vls/vec-set-22.c: New test.
> * gcc.target/riscv/rvv/autovec/vls/vec-set-3.c: New test.
> * gcc.target/riscv/rvv/autovec/vls/vec-set-4.c: New test.
> * gcc.target/riscv/rvv/autovec/vls/vec-set-5.c: New test.
> * gcc.target/riscv/rvv/autovec/vls/vec-set-6.c: New test.
> * gcc.target/riscv/rvv/autovec/vls/vec-set-7.c: New test.
> * gcc.target/riscv/rvv/autovec/vls/vec-set-8.c: New test.
> * gcc.target/riscv/rvv/autovec/vls/vec-set-9.c: New test.
>
> Signed-off-by: Pan Li 
> ---
>  gcc/config/riscv/autovec.md   |  4 +--
>  .../gcc.target/riscv/rvv/autovec/vls/def.h| 18 ++
>  .../riscv/rvv/autovec/vls/vec-set-1.c | 35 +++
>  .../riscv/rvv/autovec/vls/vec-set-10.c| 31 
>  .../riscv/rvv/autovec/vls/vec-set-11.c| 29 +++
>  .../riscv/rvv/autovec/vls/vec-set-12.c| 21 +++
>  .../riscv/rvv/autovec/vls/vec-set-13.c| 20 +++
>  .../riscv/rvv/autovec/vls/vec-set-14.c| 19 ++
>  .../riscv/rvv/autovec/vls/vec-set-15.c| 18 ++
>  .../riscv/rvv/autovec/vls/vec-set-16.c| 21 +++
>  .../riscv/rvv/autovec/vls/vec-set-17.c| 20 +++
>  .../riscv/rvv/autovec/vls/vec-set-18.c| 19 ++
>  .../riscv/rvv/autovec/vls/vec-set-19.c| 18 ++
>  .../riscv/rvv/autovec/vls/vec-set-2.c | 33 +
>  .../riscv/rvv/autovec/vls/vec-set-20.c| 20 +++
>  .../riscv/rvv/autovec/vls/vec-set-21.c| 19 ++
>  .../riscv/rvv/autovec/vls/vec-set-22.c| 18 ++
>  .../riscv/rvv/autovec/vls/vec-set-3.c | 31 
>  .../riscv/rvv/autovec/vls/vec-set-4.c | 29 +++
>  .../riscv/rvv/autovec/vls/vec-set-5.c | 35 +++
>  .../riscv/rvv/autovec/vls/vec-set-6.c | 33 +
>  .../riscv/rvv/autovec/vls/vec-set-7.c | 31 
>  .../riscv/rvv/autovec/vls/vec-set-8.c | 29 +++
>  .../riscv/rvv/autovec/vls/vec-set-9.c | 33 +
>  24 files changed, 582 insertions(+), 2 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/vec-set-1.c
>  create mode 100644 
> gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/vec-set-10.c
>  create mode 100644 
>
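
For readers following along, the element insert that vec_set implements can be modeled with the GCC vector extension on any host. This is only a sketch of the semantics described in the patch above (one lane is replaced, the other lanes are kept); the demo function name is made up for illustration and is not part of the patch or its tests.

```c
#include <assert.h>

/* Two 64-bit lanes, the same shape as vl_t in the patch description.  */
typedef long long vl_t __attribute__ ((vector_size (2 * sizeof (long long))));

/* Element insert: only lane `index` changes, every other lane keeps
   the value it had in `v`.  */
vl_t
init_vl (vl_t v, unsigned index, unsigned value)
{
  v[index] = value;
  return v;
}

/* Hypothetical demo, not taken from the testsuite.  */
void
vec_set_demo (void)
{
  vl_t v = {10, 20};
  vl_t r = init_vl (v, 1, 7);

  /* Lane 1 was replaced, lane 0 is untouched.  */
  assert (r[0] == 10 && r[1] == 7);
}
```

The "after" assembly expresses the same thing with a tail-undisturbed vslideup.vx, which leaves the elements below the slide offset as they were.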

RE: [PATCH v1] RISC-V: Bugfix for scalar move with merged operand

2023-09-18 Thread Li, Pan2 via Gcc-patches

Committed, thanks Jeff and Robin.

Pan

-Original Message-
From: Jeff Law  
Sent: Tuesday, September 19, 2023 1:44 AM
To: Robin Dapp ; Li, Pan2 ; 
gcc-patches@gcc.gnu.org
Cc: juzhe.zh...@rivai.ai; Wang, Yanzhang ; 
kito.ch...@gmail.com
Subject: Re: [PATCH v1] RISC-V: Bugfix for scalar move with merged operand



On 9/18/23 04:00, Robin Dapp wrote:
>> I must be missing something.  Doesn't insn 10 broadcast the immediate
>> 0x2 to both elements of r142?!?  What am I missing?
> It is indeed a bit misleading.  The difference is in the mask which
> is not displayed in the short form.  So we actually use a vec_dup
> for a single-element move, essentially a masked vec_dup where only
> one element is masked in.
Ah :-)

> 
> The problem was that the original doesn't use a merging "vec_set"
> but a "destructive" one where the other elements get ignored.
> 
> The fix is OK IMHO.
Agreed.

jeff

RE: [PATCH v1] RISC-V: Bugfix for scalar move with merged operand

2023-09-18 Thread Li, Pan2 via Gcc-patches

Thanks Robin, let's wait for Jeff's confirmation on this.

Pan

-Original Message-
From: Robin Dapp  
Sent: Monday, September 18, 2023 6:01 PM
To: Jeff Law ; Li, Pan2 ; 
gcc-patches@gcc.gnu.org
Cc: rdapp@gmail.com; juzhe.zh...@rivai.ai; Wang, Yanzhang 
; kito.ch...@gmail.com
Subject: Re: [PATCH v1] RISC-V: Bugfix for scalar move with merged operand

> I must be missing something.  Doesn't insn 10 broadcast the immediate
> 0x2 to both elements of r142?!?  What am I missing?
It is indeed a bit misleading.  The difference is in the mask which
is not displayed in the short form.  So we actually use a vec_dup
for a single-element move, essentially a masked vec_dup where only
one element is masked in.

The problem was that the original doesn't use a merging "vec_set"
but a "destructive" one where the other elements get ignored.

The fix is OK IMHO. 

Regards
 Robin
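
The distinction Robin draws between a merging and a "destructive" vec_set can be sketched in plain C. This is a rough model only, assuming two 64-bit lanes; the function names and the POISON marker (a stand-in for "value not specified") are invented for illustration and do not appear in GCC.

```c
#include <assert.h>
#include <stddef.h>

#define NLANES 2

/* Stand-in for "this lane holds no defined value".  */
#define POISON 0x5a5a5a5a5a5a5a5aLL

/* Merging insert: lanes other than `index` are taken from `merge`.  */
void
vec_set_merging (long long dest[NLANES], const long long merge[NLANES],
                 size_t index, long long value)
{
  for (size_t i = 0; i < NLANES; i++)
    dest[i] = (i == index) ? value : merge[i];
}

/* "Destructive" insert: only lane `index` is defined afterwards, so
   nothing keeps the producers of the other lanes alive.  */
void
vec_set_destructive (long long dest[NLANES], size_t index, long long value)
{
  for (size_t i = 0; i < NLANES; i++)
    dest[i] = (i == index) ? value : POISON;
}

/* Hypothetical demo mirroring p[0] = p[1] = 2 from the test case.  */
void
merge_demo (void)
{
  long long t[NLANES] = {0, 2};   /* lane 1 was already set to 2 */
  long long a[NLANES], b[NLANES];

  vec_set_merging (a, t, 0, 2);
  assert (a[0] == 2 && a[1] == 2);

  vec_set_destructive (b, 0, 2);
  assert (b[0] == 2 && b[1] == POISON);
}
```

In the second form the earlier write to lane 1 has no visible use, which is consistent with DCE deleting it in the dump from the original report.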

RE: [PATCH] RISC-V: Remove autovec-vls.md file and clean up VLS move modes[NFC]

2023-09-18 Thread Li, Pan2 via Gcc-patches

Committed, thanks Kito.

Pan


-Original Message-
From: Gcc-patches  On Behalf 
Of Kito Cheng via Gcc-patches
Sent: Monday, September 18, 2023 4:01 PM
To: Juzhe-Zhong 
Cc: gcc-patches@gcc.gnu.org; kito.ch...@gmail.com; jeffreya...@gmail.com; 
rdapp@gmail.com
Subject: Re: [PATCH] RISC-V: Remove autovec-vls.md file and clean up VLS move 
modes[NFC]

LGTM :)

On Mon, Sep 18, 2023 at 3:07 PM Juzhe-Zhong  wrote:
>
> We have largely supported VLS modes. Only the move patterns of VLS modes are
> different from the VLA patterns. The rest of them are the same.
>
> We always extend the current VLA patterns with VLS modes:
>
> VI --> V_VLSI
> VF --> V_VLSF
>
> It makes no sense to have a separate file holding very few VLS patterns
> that cannot be extended from the current VLA patterns.
>
> So remove autovec-vls.md
>
> gcc/ChangeLog:
>
> * config/riscv/vector.md (mov): New pattern.
> (*mov_mem_to_mem): Ditto.
> (*mov): Ditto.
> (@mov_lra): Ditto.
> (*mov_lra): Ditto.
> (*mov_vls): Ditto.
> (movmisalign): Ditto.
> (@vec_duplicate): Ditto.
> * config/riscv/autovec-vls.md: Removed.
>
> ---
>  gcc/config/riscv/autovec-vls.md | 196 
>  gcc/config/riscv/vector.md  | 172 +++-
>  2 files changed, 170 insertions(+), 198 deletions(-)
>  delete mode 100644 gcc/config/riscv/autovec-vls.md
>
> diff --git a/gcc/config/riscv/autovec-vls.md b/gcc/config/riscv/autovec-vls.md
> deleted file mode 100644
> index 3488f452e5d..000
> --- a/gcc/config/riscv/autovec-vls.md
> +++ /dev/null
> @@ -1,196 +0,0 @@
> -;; Machine description for VLS of RVV auto-vectorization.
> -;; Copyright (C) 2023 Free Software Foundation, Inc.
> -;; Contributed by Juzhe Zhong (juzhe.zh...@rivai.ai), RiVAI Technologies Ltd.
> -
> -;; This file is part of GCC.
> -
> -;; GCC is free software; you can redistribute it and/or modify
> -;; it under the terms of the GNU General Public License as published by
> -;; the Free Software Foundation; either version 3, or (at your option)
> -;; any later version.
> -
> -;; GCC is distributed in the hope that it will be useful,
> -;; but WITHOUT ANY WARRANTY; without even the implied warranty of
> -;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> -;; GNU General Public License for more details.
> -
> -;; You should have received a copy of the GNU General Public License
> -;; along with GCC; see the file COPYING3.  If not see
> -;; .
> -
> -;; We define VLS modes as 'define_insn_and_split' with normal
> -;; RTX_CODE operation, so we can gain benefits from Combine optimizations.
> -
> -;; -
> -;;  Moves Operations
> -;; -
> -
> -(define_expand "mov"
> -  [(set (match_operand:VLS_AVL_IMM 0 "reg_or_mem_operand")
> -   (match_operand:VLS_AVL_IMM 1 "general_operand"))]
> -  "TARGET_VECTOR"
> -{
> -  if (riscv_vector::legitimize_move (operands[0], operands[1]))
> -DONE;
> -})
> -
> -(define_insn_and_split "*mov_mem_to_mem"
> -  [(set (match_operand:VLS_AVL_IMM 0 "memory_operand")
> -   (match_operand:VLS_AVL_IMM 1 "memory_operand"))]
> -  "TARGET_VECTOR && can_create_pseudo_p ()"
> -  "#"
> -  "&& 1"
> -  [(const_int 0)]
> -  {
> -if (GET_MODE_BITSIZE (mode).to_constant () <= MAX_BITS_PER_WORD)
> -  {
> -/* Opitmize the following case:
> -
> -   typedef int8_t v2qi __attribute__ ((vector_size (2)));
> -   v2qi v = *(v2qi*)in;
> -   *(v2qi*)out = v;
> -
> -   We prefer scalar load/store instead of vle.v/vse.v when
> -   the VLS modes size is smaller scalar mode.  */
> -machine_mode mode;
> -unsigned size = GET_MODE_BITSIZE (mode).to_constant ();
> -if (FLOAT_MODE_P (mode))
> - mode = mode_for_size (size, MODE_FLOAT, 0).require ();
> -else
> - mode = mode_for_size (size, MODE_INT, 0).require ();
> -emit_move_insn (gen_lowpart (mode, operands[0]),
> -   gen_lowpart (mode, operands[1]));
> -  }
> -else
> -  {
> -   operands[1] = force_reg (mode, operands[1]);
> -   emit_move_insn (operands[0], operands[1]);
> -  }
> -DONE;
> -  }
> -  [(set_attr "type" "vmov")]
> -)
> -
> -(define_insn_and_split "*mov"
> -  [(set (match_operand:VLS_AVL_IMM 0 "reg_or_mem_operand" "=vr, m, vr")
> -   (match_operand:VLS_AVL_IMM 1 "reg_or_mem_operand" "  m,vr, vr"))]
> -  "TARGET_VECTOR
> -   && (register_operand (operands[0], mode)
> -   || register_operand (operands[1], mode))"
> -  "@
> -   #
> -   #
> -   vmv%m1r.v\t%0,%1"
> -  "&& reload_completed
> -   && (!register_operand (operands[0], mode)
> -   || !register_operand (operands[1], mode))"
> -  [(const_int 0)]
> -  {
> -bool ok_p = riscv_vector::legitimize_move (operands[0], operands[1]);
> -

RE: [PATCH] RISC-V: Support VLS modes reduction[PR111153]

2023-09-18 Thread Li, Pan2 via Gcc-patches

Committed, thanks Kito.

Pan

-Original Message-
From: Gcc-patches  On Behalf 
Of Kito Cheng via Gcc-patches
Sent: Monday, September 18, 2023 4:20 PM
To: Juzhe-Zhong 
Cc: gcc-patches@gcc.gnu.org; kito.ch...@sifive.com; jeffreya...@gmail.com; 
rdapp@gmail.com
Subject: Re: [PATCH] RISC-V: Support VLS modes reduction[PR111153]

LGTM

On Sun, Sep 17, 2023 at 10:07 AM Juzhe-Zhong  wrote:
>
> This patch supports VLS reduction vectorization.
>
> It can optimize the current reduction vectorization codegen with current COST 
> model.
>
> #define DEF_REDUC_PLUS(TYPE)\
> TYPE __attribute__ ((noinline, noclone))\
> reduc_plus_##TYPE (TYPE * __restrict a, int n)  \
> {   \
>   TYPE r = 0;   \
>   for (int i = 0; i < n; ++i)   \
> r += a[i];  \
>   return r; \
> }
>
> #define TEST_PLUS(T)\
>   T (int32_t)   \
>
> TEST_PLUS (DEF_REDUC_PLUS)
>
>
> Before this patch:
>
> vle32.v       v2,0(a5)
> addi          a5,a5,16
> vadd.vv       v1,v1,v2
> bne           a5,a4,.L4
> lui           a4,%hi(.LC0)
> lui           a5,%hi(.LC1)
> addi          a4,a4,%lo(.LC0)
> vlm.v         v0,0(a4)
> addi          a5,a5,%lo(.LC1)
> andi          a1,a1,-4
> vmv1r.v       v2,v3
> vlm.v         v4,0(a5)
> vcompress.vm  v2,v1,v0
> vmv1r.v       v0,v4
> vadd.vv       v1,v2,v1
> vcompress.vm  v3,v1,v0
> vadd.vv       v3,v3,v1
> vmv.x.s       a0,v3
> sext.w        a0,a0
> beq           a3,a1,.L12
>
> After this patch:
>
> vle32.v       v2,0(a5)
> addi          a5,a5,16
> vadd.vv       v1,v1,v2
> bne           a5,a4,.L4
> li            a5,0
> andi          a1,a1,-4
> vmv.s.x       v2,a5
> vredsum.vs    v1,v1,v2
> vmv.x.s       a0,v1
> beq           a3,a1,.L12
>
> gcc/ChangeLog:
>
> * config/riscv/autovec.md: Add VLS modes.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/rvv/autovec/vls/def.h: Add VLS mode reduction case.
> * gcc.target/riscv/rvv/autovec/vls/reduc-1.c: New test.
> * gcc.target/riscv/rvv/autovec/vls/reduc-10.c: New test.
> * gcc.target/riscv/rvv/autovec/vls/reduc-11.c: New test.
> * gcc.target/riscv/rvv/autovec/vls/reduc-12.c: New test.
> * gcc.target/riscv/rvv/autovec/vls/reduc-13.c: New test.
> * gcc.target/riscv/rvv/autovec/vls/reduc-14.c: New test.
> * gcc.target/riscv/rvv/autovec/vls/reduc-15.c: New test.
> * gcc.target/riscv/rvv/autovec/vls/reduc-16.c: New test.
> * gcc.target/riscv/rvv/autovec/vls/reduc-17.c: New test.
> * gcc.target/riscv/rvv/autovec/vls/reduc-18.c: New test.
> * gcc.target/riscv/rvv/autovec/vls/reduc-19.c: New test.
> * gcc.target/riscv/rvv/autovec/vls/reduc-2.c: New test.
> * gcc.target/riscv/rvv/autovec/vls/reduc-20.c: New test.
> * gcc.target/riscv/rvv/autovec/vls/reduc-21.c: New test.
> * gcc.target/riscv/rvv/autovec/vls/reduc-3.c: New test.
> * gcc.target/riscv/rvv/autovec/vls/reduc-4.c: New test.
> * gcc.target/riscv/rvv/autovec/vls/reduc-5.c: New test.
> * gcc.target/riscv/rvv/autovec/vls/reduc-6.c: New test.
> * gcc.target/riscv/rvv/autovec/vls/reduc-7.c: New test.
> * gcc.target/riscv/rvv/autovec/vls/reduc-8.c: New test.
> * gcc.target/riscv/rvv/autovec/vls/reduc-9.c: New test.
>
> ---
>  gcc/config/riscv/autovec.md   |  2 +-
>  .../gcc.target/riscv/rvv/autovec/vls/def.h| 30 +++
>  .../riscv/rvv/autovec/vls/reduc-1.c   | 31 +++
>  .../riscv/rvv/autovec/vls/reduc-10.c  | 50 
>  .../riscv/rvv/autovec/vls/reduc-11.c  | 46 +++
>  .../riscv/rvv/autovec/vls/reduc-12.c  | 30 +++
>  .../riscv/rvv/autovec/vls/reduc-13.c  | 28 +++
>  .../riscv/rvv/autovec/vls/reduc-14.c  | 26 ++
>  .../riscv/rvv/autovec/vls/reduc-15.c  | 81 +++
>  .../riscv/rvv/autovec/vls/reduc-16.c  | 75 +
>  .../riscv/rvv/autovec/vls/reduc-17.c  | 69 
>  .../riscv/rvv/autovec/vls/reduc-18.c  | 63 +++
>  .../riscv/rvv/autovec/vls/reduc-19.c  | 18 +
>  .../riscv/rvv/autovec/vls/reduc-2.c   | 29 +++
>  .../riscv/rvv/autovec/vls/reduc-20.c  | 17 
>  .../riscv/rvv/autovec/vls/reduc-21.c  | 16 
>  .../riscv/rvv/autovec/vls/reduc-3.c   | 27 +++
>  .../riscv/rvv/autovec/vls/reduc-4.c   | 25 ++
>  .../riscv/rvv/autovec/vls/reduc-5.c   | 18 +
>  .../riscv/rvv/autovec/vls/reduc-6.c   | 17 
>  .../riscv/rvv/autovec/vls/reduc-7.c   | 16 
>  .../riscv/rvv/autovec/vls/reduc-8.c   | 58 +
>
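
As a reading aid for the "after" sequence above, here is a scalar C model of what the vectorized reduction computes: per-lane partial sums in the loop (vadd.vv), then one horizontal sum seeded with 0 (vmv.s.x followed by vredsum.vs). The lane count of 4 and the function names are assumptions made for this sketch, and the scalar tail loop stands in for whatever epilogue the compiler actually emits.

```c
#include <assert.h>
#include <stdint.h>

#define LANES 4   /* assumed: four int32 lanes per vector */

int32_t
reduc_plus_model (const int32_t *a, int n)
{
  int32_t acc[LANES] = {0};
  int i = 0;

  /* Vector loop body: lane-wise accumulate, like vadd.vv.  */
  for (; i + LANES <= n; i += LANES)
    for (int l = 0; l < LANES; l++)
      acc[l] += a[i + l];

  /* Horizontal step: seed (0) plus the sum of all lanes, like
     vredsum.vs with a zero scalar operand.  */
  int32_t r = 0;
  for (int l = 0; l < LANES; l++)
    r += acc[l];

  /* Scalar tail for the elements that do not fill a full vector.  */
  for (; i < n; i++)
    r += a[i];

  return r;
}

/* Hypothetical demo.  */
void
reduc_plus_demo (void)
{
  int32_t a[10] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};

  assert (reduc_plus_model (a, 10) == 55);
  assert (reduc_plus_model (a, 3) == 6);
}
```

The "before" sequence reaches the same result through two vcompress.vm/vadd.vv rounds, which is what the single vredsum.vs replaces.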

RE: [PATCH v1] RISC-V: Bugfix for scalar move with merged operand

2023-09-17 Thread Li, Pan2 via Gcc-patches

> I must be missing something.  Doesn't insn 10 broadcast the immediate 
> 0x2 to both elements of r142?!?  What am I missing?

Thanks Jeff for comments.

The insn 10 is VECTOR_SCALAR_MOV, aka vmv.s.x from the asm code.

Pan

-Original Message-
From: Jeff Law  
Sent: Sunday, September 17, 2023 11:53 PM
To: Li, Pan2 ; gcc-patches@gcc.gnu.org
Cc: juzhe.zh...@rivai.ai; Wang, Yanzhang ; 
kito.ch...@gmail.com; rdapp@gmail.com
Subject: Re: [PATCH v1] RISC-V: Bugfix for scalar move with merged operand




On 9/17/23 01:42, Pan Li via Gcc-patches wrote:
> From: Pan Li 
> 
> Given below example for VLS mode
> 
> void
> test (vl_t *u)
> {
>vl_t t;
>long long *p = (long long *)&t;
> 
>p[0] = p[1] = 2;
> 
>*u = t;
> }
> 
> The vec_set will simplify the insn to vmv.s.x when index is 0, without
> merged operand. That will result in some problems in DCE, aka:
> 
> 1:  137[DI] = a0
> 2:  138[V2DI] = 134[V2DI]  // deleted by DCE
> 3:  139[DI] = #2   // deleted by DCE
> 4:  140[DI] = #2   // deleted by DCE
> 5:  141[V2DI] = vec_dup:V2DI (139[DI]) // deleted by DCE
> 6:  138[V2DI] = vslideup_imm (138[V2DI], 141[V2DI], 1) // deleted by DCE
> 7:  135[V2DI] = 138[V2DI]  // deleted by DCE
> 8:  142[V2DI] = 135[V2DI]  // deleted by DCE
> 9:  143[DI] = #2
> 10: 142[V2DI] = vec_dup:V2DI (143[DI])
> 11: (137[DI]) = 142[V2DI]
> 
> The higher 64 bits of 142[V2DI] are unknown here, which generates
> incorrect code when it is stored back to memory. This patch fixes
> this issue by adding a new SCALAR_MOVE_MERGED_OP for vec_set.
I must be missing something.  Doesn't insn 10 broadcast the immediate 
0x2 to both elements of r142?!?  What am I missing?

JEff

RE: [PATCH V4] RISC-V: Expand VLS mode to scalar mode move[PR111391]

2023-09-16 Thread Li, Pan2 via Gcc-patches

Committed, thanks Robin.

Pan

-Original Message-
From: Gcc-patches  On Behalf 
Of Robin Dapp via Gcc-patches
Sent: Friday, September 15, 2023 11:44 PM
To: 钟居哲 ; Jeff Law ; kito.cheng 

Cc: rdapp@gmail.com; gcc-patches ; kito.cheng 

Subject: Re: [PATCH V4] RISC-V: Expand VLS mode to scalar mode move[PR111391]

> You mean this patch is ok?

I thought about it a bit more.  From my point of view the patch is OK
for now in order to get the bug out of the way.

In the longer term I would really prefer a more "regular" solution
(i.e. via hard_regno_mode_ok) and related.  I can take care of that
once I have a bit of time but for now let's go ahead.

Regards
 Robin

RE: [PATCH v1] RISC-V: Support FP SGNJX autovec for VLS mode

2023-09-15 Thread Li, Pan2 via Gcc-patches

Committed, thanks Juzhe.

Pan

From: 钟居哲 
Sent: Saturday, September 16, 2023 7:21 AM
To: Li, Pan2 ; gcc-patches 
Cc: Li, Pan2 ; Wang, Yanzhang ; 
kito.cheng 
Subject: Re: [PATCH v1] RISC-V: Support FP SGNJX autovec for VLS mode

lgtm


juzhe.zh...@rivai.ai

From: pan2.li
Date: 2023-09-15 21:23
To: gcc-patches
CC: juzhe.zhong; 
pan2.li; 
yanzhang.wang; 
kito.cheng
Subject: [PATCH v1] RISC-V: Support FP SGNJX autovec for VLS mode
From: Pan Li <pan2...@intel.com>

This patch enables VLS mode autovec for the
floating-point binary operation SGNJX.

Sample code:

void
test (float * restrict out, float * restrict in1, float * restrict in2)
{
  for (int i = 0; i < 128; i++)
out[i] = in1[i] * copysignf (1.0, in2[i]);
}

Before this patch:
test:
  li          a5,128
  vsetvli     zero,a5,e32,m1,ta,ma
  vle32.v     v2,0(a1)
  lui         a4,%hi(.LC0)
  flw         fa5,%lo(.LC0)(a4)
  vfmv.v.f    v1,fa5
  vle32.v     v3,0(a2)
  vfsgnj.vv   v1,v1,v3
  vfmul.vv    v1,v1,v2
  vse32.v     v1,0(a0)
  ret

After this patch:
test:
  li  a5,128
  vsetvli zero,a5,e32,m1,ta,ma
  vle32.v v1,0(a1)
  vle32.v v2,0(a2)
  vfsgnjx.vv  v1,v1,v2
  vse32.v v1,0(a0)
  ret

This SGNJX autovec also applies to calls to copysign/copysignf from
math.h, and it depends on the option -ffast-math.
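
To illustrate why in1[i] * copysignf (1.0, in2[i]) can collapse into a single sign-injection, here is a small host-side C sketch. It assumes IEEE 754 binary32 floats; xorsign_f32 and the demo are names invented for this note, not GCC internals. For ordinary finite inputs, XORing the sign bit of y into x gives the same value as the multiply.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Return x with its sign bit XORed with the sign bit of y, which is
   the per-element effect of vfsgnjx.  */
float
xorsign_f32 (float x, float y)
{
  uint32_t xb, yb;

  memcpy (&xb, &x, sizeof xb);
  memcpy (&yb, &y, sizeof yb);
  xb ^= yb & UINT32_C (0x80000000);
  memcpy (&x, &xb, sizeof x);
  return x;
}

/* Hypothetical demo comparing against the source-level expression.  */
void
xorsign_demo (void)
{
  assert (xorsign_f32 (3.0f, -2.0f) == 3.0f * copysignf (1.0f, -2.0f));
  assert (xorsign_f32 (-3.0f, -2.0f) == 3.0f);
  assert (xorsign_f32 (3.0f, 2.0f) == 3.0f);
}
```

The two forms can differ in the sign of a NaN result, which may be part of why the transformation is tied to -ffast-math; that reading is an inference, not something stated in the patch.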

gcc/ChangeLog:

* config/riscv/autovec-vls.md (xorsign3): New pattern.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vls/def.h: New macro.
* gcc.target/riscv/rvv/autovec/vls/floating-point-sgnjx-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/floating-point-sgnjx-2.c: New test.

Signed-off-by: Pan Li <pan2...@intel.com>
---
gcc/config/riscv/autovec-vls.md   | 21 +
.../gcc.target/riscv/rvv/autovec/vls/def.h|  8 
.../rvv/autovec/vls/floating-point-sgnjx-1.c  | 43 +++
.../rvv/autovec/vls/floating-point-sgnjx-2.c  | 31 +
4 files changed, 103 insertions(+)
create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/floating-point-sgnjx-1.c
create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/floating-point-sgnjx-2.c

diff --git a/gcc/config/riscv/autovec-vls.md b/gcc/config/riscv/autovec-vls.md
index 6f48f7d6232..d4ed2081537 100644
--- a/gcc/config/riscv/autovec-vls.md
+++ b/gcc/config/riscv/autovec-vls.md
@@ -289,6 +289,27 @@ (define_insn_and_split "copysign3"
   [(set_attr "type" "vector")]
)
+;; -
+;; Includes:
+;; - vfsgnjx.vv
+;; - vfsgnjx.vf
+;; -
+(define_insn_and_split "xorsign3"
+  [(set (match_operand:VLSF 0 "register_operand")
+(unspec:VLSF
+  [(match_operand:VLSF  1 "register_operand")
+   (match_operand:VLSF  2 "register_operand")] UNSPEC_VXORSIGN))]
+  "TARGET_VECTOR && can_create_pseudo_p ()"
+  "#"
+  "&& 1"
+  [(const_int 0)]
+  {
+riscv_vector::emit_vlmax_insn (code_for_pred (UNSPEC_VXORSIGN, mode),
+riscv_vector::BINARY_OP, operands);
+DONE;
+  }
+)
+
;; ---
;;  [INT] Unary operations
;; ---
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/def.h 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/def.h
index 1edc1910920..81c4570836b 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/def.h
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/def.h
@@ -258,3 +258,11 @@ typedef double v512df __attribute__ ((vector_size (4096)));
     for (int i = 0; i < NUM; ++i)                                              \
       a[i] = (b[i] > c[i]) OP (d[i] < e[i]);                                   \
   }
+
+#define DEF_SGNJX_VV(PREFIX, NUM, TYPE, CALL)                                  \
+  void __attribute__ ((noinline, noclone))                                     \
+  PREFIX##_##TYPE##NUM (TYPE *restrict a, TYPE *restrict b, TYPE *restrict c)  \
+  {                                                                            \
+    for (int i = 0; i < NUM; ++i)                                              \
+      a[i] = b[i] * CALL (1.0, c[i]);                                          \
+  }
diff --git 
a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/floating-point-sgnjx-1.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/floating-point-sgnjx-1.c
new file mode 100644
index 000..86c23ef0436
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/floating-point-sgnjx-1.c
@@ -0,0 +1,43 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv_zvfh_zvl4096b -mabi=lp64d -O3 
-fno-schedule-insns -fno-schedule-insns2

RE: [PATCH] test: Block SLP check of slp-35.c for vect_strided5

2023-09-15 Thread Li, Pan2 via Gcc-patches

Committed, thanks Richard.

Pan

-Original Message-
From: Gcc-patches  On Behalf 
Of Richard Biener via Gcc-patches
Sent: Friday, September 15, 2023 6:07 PM
To: Juzhe-Zhong 
Cc: gcc-patches@gcc.gnu.org; jeffreya...@gmail.com
Subject: Re: [PATCH] test: Block SLP check of slp-35.c for vect_strided5

On Fri, 15 Sep 2023, Juzhe-Zhong wrote:

> gcc/testsuite/ChangeLog:

OK.

>   * gcc.dg/vect/slp-35.c: Block SLP check for vect_strided5 targets.
> 
> ---
>  gcc/testsuite/gcc.dg/vect/slp-35.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/gcc/testsuite/gcc.dg/vect/slp-35.c 
> b/gcc/testsuite/gcc.dg/vect/slp-35.c
> index 5e9f6739e1f..2c9d168e096 100644
> --- a/gcc/testsuite/gcc.dg/vect/slp-35.c
> +++ b/gcc/testsuite/gcc.dg/vect/slp-35.c
> @@ -68,5 +68,5 @@ int main (void)
>  }
>  
>  /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
> -/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" 
> } } */
> +/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" 
> { target {! vect_strided5 } } } } */
>
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)

RE: [PATCH] test: Block SLP check of slp-34.c for vect_strided5

2023-09-15 Thread Li, Pan2 via Gcc-patches

Committed, thanks Richard.

Pan

-Original Message-
From: Gcc-patches  On Behalf 
Of Richard Biener via Gcc-patches
Sent: Friday, September 15, 2023 6:07 PM
To: Juzhe-Zhong 
Cc: gcc-patches@gcc.gnu.org; jeffreya...@gmail.com
Subject: Re: [PATCH] test: Block SLP check of slp-34.c for vect_strided5

On Fri, 15 Sep 2023, Juzhe-Zhong wrote:

> Since RISC-V uses vsseg5, which is vect_store_lanes with stride 5,
> it failed on RISC-V.

OK.

> gcc/testsuite/ChangeLog:
> 
>   * gcc.dg/vect/slp-34.c: Block check for vect_strided5.
> 
> ---
>  gcc/testsuite/gcc.dg/vect/slp-34.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/gcc/testsuite/gcc.dg/vect/slp-34.c 
> b/gcc/testsuite/gcc.dg/vect/slp-34.c
> index 41832d7f519..53b8284d084 100644
> --- a/gcc/testsuite/gcc.dg/vect/slp-34.c
> +++ b/gcc/testsuite/gcc.dg/vect/slp-34.c
> @@ -57,5 +57,5 @@ int main (void)
>  }
>  
>  /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
> -/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 2 "vect"  
> } } */
> +/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 2 "vect" 
> { target {! vect_strided5 } } } } */
>
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)

RE: [PATCH] test: Block vect_strided5 for slp-34-big-array.c SLP check

2023-09-15 Thread Li, Pan2 via Gcc-patches

Committed, thanks Richard.

Pan

-Original Message-
From: Gcc-patches  On Behalf 
Of Richard Biener via Gcc-patches
Sent: Friday, September 15, 2023 6:07 PM
To: Juzhe-Zhong 
Cc: gcc-patches@gcc.gnu.org; jeffreya...@gmail.com
Subject: Re: [PATCH] test: Block vect_strided5 for slp-34-big-array.c SLP check

On Fri, 15 Sep 2023, Juzhe-Zhong wrote:

> It failed on RISC-V since it uses vect_store_lanes with array size 5.

OK.

> gcc/testsuite/ChangeLog:
> 
>   * gcc.dg/vect/slp-34-big-array.c: Block SLP check for vect_strided5.
> 
> ---
>  gcc/testsuite/gcc.dg/vect/slp-34-big-array.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/gcc/testsuite/gcc.dg/vect/slp-34-big-array.c 
> b/gcc/testsuite/gcc.dg/vect/slp-34-big-array.c
> index 0baaff7dc6e..db0e440639e 100644
> --- a/gcc/testsuite/gcc.dg/vect/slp-34-big-array.c
> +++ b/gcc/testsuite/gcc.dg/vect/slp-34-big-array.c
> @@ -63,5 +63,5 @@ int main (void)
>  }
>  
>  /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
> -/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 2 "vect"  
> } } */
> +/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 2 "vect" 
> { target {! vect_strided5 } } } } */
>  
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)

RE: [PATCH] test: Block slp-16.c check for target support vect_strided6

2023-09-15 Thread Li, Pan2 via Gcc-patches

Committed, thanks Richard.

Pan

-Original Message-
From: Gcc-patches  On Behalf 
Of Richard Biener via Gcc-patches
Sent: Friday, September 15, 2023 5:38 PM
To: Juzhe-Zhong 
Cc: gcc-patches@gcc.gnu.org; jeffreya...@gmail.com; richard.sandif...@arm.com
Subject: Re: [PATCH] test: Block slp-16.c check for target support vect_strided6

On Fri, 15 Sep 2023, Juzhe-Zhong wrote:

> This testcase FAILs on RISC-V because RISC-V supports vect_load_lanes with 6.
> FAIL: gcc.dg/vect/slp-16.c -flto -ffat-lto-objects  scan-tree-dump-times vect 
> "vectorizing stmts using SLP" 2
> FAIL: gcc.dg/vect/slp-16.c scan-tree-dump-times vect "vectorizing stmts using 
> SLP" 2
> 
> Since it uses vlseg6 (vect_load_lanes with array size = 6)

OK.

> gcc/testsuite/ChangeLog:
> 
>   * gcc.dg/vect/slp-16.c: Block vect_strided6.
>   * lib/target-supports.exp: Add strided type.
> 
> ---
>  gcc/testsuite/gcc.dg/vect/slp-16.c| 2 +-
>  gcc/testsuite/lib/target-supports.exp | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/gcc/testsuite/gcc.dg/vect/slp-16.c 
> b/gcc/testsuite/gcc.dg/vect/slp-16.c
> index d053a64276d..44ba730bda8 100644
> --- a/gcc/testsuite/gcc.dg/vect/slp-16.c
> +++ b/gcc/testsuite/gcc.dg/vect/slp-16.c
> @@ -67,5 +67,5 @@ int main (void)
>  }
>  
>  /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect"  { target 
> vect_int_mult } } } */
> -/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 2 "vect" 
> { target vect_int_mult } } } */
> +/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 2 "vect" 
> { target { vect_int_mult && {! vect_strided6 } } } } } */
>
> diff --git a/gcc/testsuite/lib/target-supports.exp 
> b/gcc/testsuite/lib/target-supports.exp
> index edaa010258f..2de41cef2f6 100644
> --- a/gcc/testsuite/lib/target-supports.exp
> +++ b/gcc/testsuite/lib/target-supports.exp
> @@ -8621,7 +8621,7 @@ proc check_effective_target_vect_interleave { } {
>&& [check_effective_target_s390_vx]) }}]
>  }
>  
> -foreach N {2 3 4 8} {
> +foreach N {2 3 4 5 6 7 8} {
>  eval [string map [list N $N] {
>   # Return 1 if the target supports 2-vector interleaving
>   proc check_effective_target_vect_stridedN { } {
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)

RE: [PATCH] test: Isolate slp-1.c check of target supports vect_strided5

2023-09-15 Thread Li, Pan2 via Gcc-patches

Committed, thanks Richard.

Pan

-Original Message-
From: Gcc-patches  On Behalf 
Of Richard Biener via Gcc-patches
Sent: Friday, September 15, 2023 5:38 PM
To: Juzhe-Zhong 
Cc: gcc-patches@gcc.gnu.org; jeffreya...@gmail.com; richard.sandif...@arm.com
Subject: Re: [PATCH] test: Isolate slp-1.c check of target supports 
vect_strided5

On Fri, 15 Sep 2023, Juzhe-Zhong wrote:

> This test failed in RISC-V:
> FAIL: gcc.dg/vect/slp-1.c -flto -ffat-lto-objects  scan-tree-dump-times vect 
> "vectorizing stmts using SLP" 4
> FAIL: gcc.dg/vect/slp-1.c scan-tree-dump-times vect "vectorizing stmts using 
> SLP" 4
> 
> Because this loop:
>   /* SLP with unrolling by 8.  */
>   for (i = 0; i < N; i++)
> {
>   out[i*5] = 8;
>   out[i*5 + 1] = 7;
>   out[i*5 + 2] = 81;
>   out[i*5 + 3] = 28;
>   out[i*5 + 4] = 18;
> }
> 
> is using vect_load_lanes with array size = 5
> instead of SLP.
> 
> Once we adjust the cost of lanes load/store, it will use SLP.

OK.

> gcc/testsuite/ChangeLog:
> 
>   * gcc.dg/vect/slp-1.c: Add vect_strided5.
> 
> ---
>  gcc/testsuite/gcc.dg/vect/slp-1.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/gcc/testsuite/gcc.dg/vect/slp-1.c 
> b/gcc/testsuite/gcc.dg/vect/slp-1.c
> index 82e4f6469fb..d4a13f12df6 100644
> --- a/gcc/testsuite/gcc.dg/vect/slp-1.c
> +++ b/gcc/testsuite/gcc.dg/vect/slp-1.c
> @@ -122,5 +122,5 @@ int main (void)
>  }
>  
>  /* { dg-final { scan-tree-dump-times "vectorized 4 loops" 1 "vect"  } } */
> -/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 4 "vect" 
> } } */
> -  
> +/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 4 "vect" 
> { target {! vect_strided5 } } } } */
> +/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 3 "vect" 
> { target vect_strided5 } } } */
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)
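
For context on the stride-5 case discussed above, the loop from slp-1.c can be viewed as five "lane" vectors interleaved by one segment store. The C sketch below models that view on the host; the vector length of 4, the array layout and the function names are assumptions for illustration and say nothing about how the vectorizer represents this internally.

```c
#include <assert.h>
#include <string.h>

enum { GROUP = 5, VL = 4 };   /* five fields, assumed four elements each */

/* Scalar form of the loop quoted from slp-1.c.  */
void
store_scalar (int *out, int n)
{
  for (int i = 0; i < n; i++)
    {
      out[i * 5] = 8;
      out[i * 5 + 1] = 7;
      out[i * 5 + 2] = 81;
      out[i * 5 + 3] = 28;
      out[i * 5 + 4] = 18;
    }
}

/* Store-lanes view: lanes[f] holds field f for VL iterations, and the
   store interleaves the five vectors element by element.  */
void
store_lanes5 (int *out, int lanes[GROUP][VL])
{
  for (int i = 0; i < VL; i++)
    for (int f = 0; f < GROUP; f++)
      out[i * GROUP + f] = lanes[f][i];
}

/* Hypothetical demo: both forms write the same memory image.  */
void
store_lanes5_demo (void)
{
  static const int field[GROUP] = {8, 7, 81, 28, 18};
  int lanes[GROUP][VL];
  int expect[GROUP * VL], got[GROUP * VL];

  for (int f = 0; f < GROUP; f++)
    for (int i = 0; i < VL; i++)
      lanes[f][i] = field[f];

  store_scalar (expect, VL);
  store_lanes5 (got, lanes);
  assert (memcmp (expect, got, sizeof expect) == 0);
}
```

When the target offers such a 5-field segment store, the loop no longer needs SLP, which matches the adjusted dg-final count of 3 for vect_strided5 targets.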

RE: [PATCH v3] RISC-V: Implement RESOLVE_OVERLOADED_BUILTIN for RVV intrinsic

2023-09-14 Thread Li, Pan2 via Gcc-patches

Thanks Lehua, actually yes.

We will try the hashmap approach and keep you posted.

Pan

-Original Message-
From: Lehua Ding  
Sent: Friday, September 15, 2023 10:29 AM
To: Li, Pan2 ; gcc-patches@gcc.gnu.org
Cc: Wang, Yanzhang ; kito.ch...@gmail.com; 
juzhe.zh...@rivai.ai
Subject: Re: [PATCH v3] RISC-V: Implement RESOLVE_OVERLOADED_BUILTIN for RVV 
intrinsic

Hi Pan,

> +function_instance *
> +function_base::get_non_overloaded_instance (unsigned int code,
> + vec ) const
> +{
> +  unsigned int code_limit = vec_safe_length (registered_functions);
> +
> +  for (unsigned fun_code = code; fun_code < code_limit; fun_code++)
> +{
> +  registered_function *rfun = (*registered_functions)[fun_code];
> +  function_instance instance = rfun->instance;
> +
> +  if (rfun->overloaded_p)
> + continue;
> +
> +  unsigned k;
> +  const rvv_arg_type_info *args = instance.op_info->args;
> +
> +  for (k = 0; args[k].base_type != NUM_BASE_TYPES; k++)
> + {
> +   if (k >= arglist.length ())
> + break;

Can we continue early if the args length is not equal to the arglist length,
before this loop:

   if (args length != arglist.length ())
     continue;

   for (k = 0; args[k].base_type != NUM_BASE_TYPES; k++)
   {
 ...

-- 
Best,
Lehua
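
A self-contained C sketch of the early-continue idea, for illustration only. The types below (arg_type, candidate) are simplified stand-ins invented for this note and are not the real riscv-vector-builtins structures; the point is just the arity check placed before the per-argument comparison.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins; ARG_END plays the role of NUM_BASE_TYPES.  */
enum arg_type { ARG_END = 0, ARG_VINT32, ARG_VFLOAT32, ARG_SIZE };

struct candidate
{
  int overloaded_p;
  const enum arg_type *args;   /* ARG_END-terminated */
};

static size_t
arg_count (const enum arg_type *args)
{
  size_t n = 0;
  while (args[n] != ARG_END)
    n++;
  return n;
}

const struct candidate *
find_non_overloaded (const struct candidate *cands, size_t ncands,
                     size_t start, const enum arg_type *arglist,
                     size_t nargs)
{
  for (size_t c = start; c < ncands; c++)
    {
      const struct candidate *cand = &cands[c];

      if (cand->overloaded_p)
        continue;

      /* Suggested early-out: a different arity can never match.  */
      if (arg_count (cand->args) != nargs)
        continue;

      size_t k;
      for (k = 0; k < nargs; k++)
        if (cand->args[k] != arglist[k])
          break;

      if (k == nargs)
        return cand;
    }
  return NULL;
}

/* Hypothetical demo.  */
void
find_demo (void)
{
  static const enum arg_type a2[] = {ARG_VINT32, ARG_SIZE, ARG_END};
  static const enum arg_type a3[]
    = {ARG_VINT32, ARG_VINT32, ARG_SIZE, ARG_END};
  const struct candidate cands[] = {{1, a3}, {0, a2}, {0, a3}};
  const enum arg_type call[] = {ARG_VINT32, ARG_VINT32, ARG_SIZE};

  assert (find_non_overloaded (cands, 3, 0, call, 3) == &cands[2]);
  assert (find_non_overloaded (cands, 3, 0, call, 1) == NULL);
}
```

With the arity check up front, a call that passes more arguments than a candidate declares is also rejected, a case the per-argument loop alone does not catch.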

RE: Re: [PATCH v3] RISC-V: Implement RESOLVE_OVERLOADED_BUILTIN for RVV intrinsic

2023-09-14 Thread Li, Pan2 via Gcc-patches

Thanks Juzhe for the comments. Got the point; I will try a hashmap-like 
approach to get the non-overloaded function later, in PATCH v4. Sorry for the 
delay, I am in the middle of something.

Pan

From: juzhe.zh...@rivai.ai 
Sent: Friday, September 15, 2023 10:21 AM
To: Li, Pan2 ; gcc-patches 
Cc: Li, Pan2 ; Wang, Yanzhang ; 
kito.cheng 
Subject: Re: Re: [PATCH v3] RISC-V: Implement RESOLVE_OVERLOADED_BUILTIN for 
RVV intrinsic

More information:

For PRED_TYPE_tumu, it is easy to analyze: we just need to count the arguments 
in the arglist.
If the arglist has 5 arguments (mask, merge, op1, op2, len), then it must be TUMU.

What I mean is that we should be able to quickly compute the arguments needed 
to construct the function_instance.
Then we can get the non-overloaded function.

juzhe.zh...@rivai.ai

From: juzhe.zh...@rivai.ai
Date: 2023-09-15 10:02
To: pan2.li; 
gcc-patches
CC: pan2.li; 
yanzhang.wang; 
kito.cheng
Subject: Re: [PATCH v3] RISC-V: Implement RESOLVE_OVERLOADED_BUILTIN for RVV 
intrinsic
Sorry to comment again.

I am not happy with the current get_non_overloaded_instance function.

I think the search approach is very inefficient:

+function_instance *
+function_base::get_non_overloaded_instance (unsigned int code,
+                                            vec<tree, va_gc> &arglist) const
+{
+  unsigned int code_limit = vec_safe_length (registered_functions);
+
+  for (unsigned fun_code = code; fun_code < code_limit; fun_code++)
+{
+  registered_function *rfun = (*registered_functions)[fun_code];
+  function_instance instance = rfun->instance;
+
+  if (rfun->overloaded_p)
+ continue;
+
+  unsigned k;
+  const rvv_arg_type_info *args = instance.op_info->args;
+
+  for (k = 0; args[k].base_type != NUM_BASE_TYPES; k++)
+ {
+   if (k >= arglist.length ())
+ break;
+
+   if (TYPE_MODE (instance.get_arg_type (k))
+ != TYPE_MODE (TREE_TYPE (arglist[k])))
+ break;
+ }
+
+      if (args[k].base_type == NUM_BASE_TYPES)
+        return &rfun->instance;
+}
+
+  return NULL;
+}

Instead, I think we should build a table that maps the arguments to the 
non-overloaded function, so that we can get the "instance" efficiently.

E.g. for the vint8mf8_t tumu vadd intrinsic, the instance looks like this:
function_instance ("vadd", bases::vadd, shapes::alu,
  iu_ops[VECTOR_TYPE_vuint8mf8_t], PRED_TYPE_tumu, _vvv_ops);

Since get_non_overloaded_instance is already a member function of the base class,
the first 3 arguments ("vadd", bases::vadd, shapes::alu)
should already be known, since it is a known function_base.

The last 3 arguments may need some elegant analysis or a map table to look them 
up quickly.

So, I think we should consider this framework seriously.
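
One way to picture such a table is the sketch below. It is only an illustration under stated assumptions, not the GCC implementation: `overload_table`, `resolved_function`, `arg_mode` and the string key format are all hypothetical, and the function names stored in it are placeholders. The idea is that the base name plus the argument modes of the call form a hash key, so the lookup no longer scans every registered function.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-ins for the machine modes of the call arguments.
enum arg_mode { MODE_SI, MODE_DI, MODE_VNx2SI, MODE_VNx2BI };

// Hypothetical record for a registered non-overloaded function.
struct resolved_function
{
  std::string name;
  unsigned int code;
};

// Maps (base name, argument modes) to the non-overloaded function.
class overload_table
{
public:
  void add (const std::string &base, const std::vector<arg_mode> &modes,
            const resolved_function &fn)
  {
    m_table[make_key (base, modes)] = fn;
  }

  // Return the matching function, or nullptr if nothing was registered.
  const resolved_function *lookup (const std::string &base,
                                   const std::vector<arg_mode> &modes) const
  {
    auto it = m_table.find (make_key (base, modes));
    return it == m_table.end () ? nullptr : &it->second;
  }

private:
  // Build a key such as "vadd:2:2:1" from the base name and the modes.
  static std::string make_key (const std::string &base,
                               const std::vector<arg_mode> &modes)
  {
    std::string key = base;
    for (arg_mode m : modes)
      {
        key += ':';
        key += std::to_string (static_cast<int> (m));
      }
    return key;
  }

  std::unordered_map<std::string, resolved_function> m_table;
};
```

With this layout, a three-argument call and a five-argument (mask, merge, op1, op2, len) call on the same base name land on different keys, so the argument count is part of the lookup for free.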

juzhe.zh...@rivai.ai

From: pan2.li
Date: 2023-09-12 16:46
To: gcc-patches
CC: juzhe.zhong; 
pan2.li; 
yanzhang.wang; 
kito.cheng
Subject: [PATCH v3] RISC-V: Implement RESOLVE_OVERLOADED_BUILTIN for RVV 
intrinsic
From: Pan Li <pan2...@intel.com>

Update in v3:

* Rewrite comment for overloaded function add.
* Move get_non_overloaded_instance to function_base.

Update in v2:

* Add get_non_overloaded_instance for function instance.
* Fix overload check for policy function.
* Enrich the test cases check.

Original log:

This patch adds the framework to support the RVV overloaded
intrinsic API in riscv-xxx-xxx-gcc, as riscv-xxx-xxx-g++ already does.

It mostly leverages the hook TARGET_RESOLVE_OVERLOADED_BUILTIN,
with the steps below.

* Register overloaded functions.
* Add function_resolver for overloaded function resolving.
* Add resolve API for function shape with default implementation.
* Implement the hook that maps the overloaded API to the non-overloaded API.

We validated this framework with the vmv_v intrinsic API(s), and we will
add support for more intrinsic APIs in follow-up patches.

gcc/ChangeLog:

* config/riscv/riscv-c.cc
(riscv_resolve_overloaded_builtin): New function for the hook.
(riscv_register_pragmas): Register the hook
* config/riscv/riscv-protos.h (resolve_overloaded_builtin): New decl.
* config/riscv/riscv-vector-builtins-shapes.cc (build_one):
Register overloaded function.
(struct overloaded_base): New struct for overloaded shape.
(struct non_overloaded_base): New struct for non overloaded shape.
(struct move_def): Inherit overloaded shape.
* config/riscv/riscv-vector-builtins.cc
(function_base::get_non_overloaded_instance): New API impl.
(function_builder::add_function): Add overloaded arg.
(function_resolver::function_resolver): New constructor.
(function_builder::add_overloaded_function): New API

RE: [PATCH] RISC-V: Support VLS modes mask operations

2023-09-14 Thread Li, Pan2 via Gcc-patches

Committed, thanks Kito.

Pan

-----Original Message-----
From: Gcc-patches  On Behalf 
Of Kito Cheng via Gcc-patches
Sent: Thursday, September 14, 2023 10:23 PM
To: Juzhe-Zhong 
Cc: GCC Patches ; Kito Cheng ; 
Jeff Law ; Robin Dapp 
Subject: Re: [PATCH] RISC-V: Support VLS modes mask operations

LGTM

On Thu, Sep 14, 2023 at 20:44, Juzhe-Zhong wrote:

> This patch supports mask operations (comparison and logical).
>
> This patch removes the following FAILs from the "vect" testsuite:
> FAIL: gcc.dg/vect/vect-bic-bitmask-12.c -flto -ffat-lto-objects
> scan-tree-dump dce7 "<=\\s*.+{ 255,.+}"
> FAIL: gcc.dg/vect/vect-bic-bitmask-12.c scan-tree-dump dce7 "<=\\s*.+{
> 255,.+}"
> FAIL: gcc.dg/vect/vect-bic-bitmask-23.c -flto -ffat-lto-objects
> scan-tree-dump dce7 "<=\\s*.+{ 255, 15, 1, 65535 }"
> FAIL: gcc.dg/vect/vect-bic-bitmask-23.c scan-tree-dump dce7 "<=\\s*.+{
> 255, 15, 1, 65535 }"
>
> Full regression passed (with 4 fewer FAILs).
>
> gcc/ChangeLog:
>
> * config/riscv/autovec-opt.md: Add VLS mask modes.
> * config/riscv/autovec.md (@vcond_mask_): Remove @.
> (vcond_mask_): Add VLS mask modes.
> * config/riscv/vector.md: Ditto.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/rvv/autovec/vls/def.h: Add VLS tests.
> * gcc.target/riscv/rvv/autovec/vls/cmp-1.c: New test.
> * gcc.target/riscv/rvv/autovec/vls/cmp-2.c: New test.
> * gcc.target/riscv/rvv/autovec/vls/cmp-3.c: New test.
> * gcc.target/riscv/rvv/autovec/vls/cmp-4.c: New test.
> * gcc.target/riscv/rvv/autovec/vls/cmp-5.c: New test.
> * gcc.target/riscv/rvv/autovec/vls/cmp-6.c: New test.
> * gcc.target/riscv/rvv/autovec/vls/mask-1.c: New test.
> * gcc.target/riscv/rvv/autovec/vls/mask-2.c: New test.
> * gcc.target/riscv/rvv/autovec/vls/mask-3.c: New test.
>
> ---
>  gcc/config/riscv/autovec-opt.md   |  18 +--
>  gcc/config/riscv/autovec.md   |  32 +++---
>  gcc/config/riscv/vector.md|  60 +-
>  .../gcc.target/riscv/rvv/autovec/vls/cmp-1.c  | 106 ++
>  .../gcc.target/riscv/rvv/autovec/vls/cmp-2.c  | 106 ++
>  .../gcc.target/riscv/rvv/autovec/vls/cmp-3.c  | 106 ++
>  .../gcc.target/riscv/rvv/autovec/vls/cmp-4.c  | 106 ++
>  .../gcc.target/riscv/rvv/autovec/vls/cmp-5.c  | 106 ++
>  .../gcc.target/riscv/rvv/autovec/vls/cmp-6.c  | 106 ++
>  .../gcc.target/riscv/rvv/autovec/vls/def.h|   9 ++
>  .../gcc.target/riscv/rvv/autovec/vls/mask-1.c |  69 
>  .../gcc.target/riscv/rvv/autovec/vls/mask-2.c |  69 
>  .../gcc.target/riscv/rvv/autovec/vls/mask-3.c |  69 
>  13 files changed, 907 insertions(+), 55 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/cmp-1.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/cmp-2.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/cmp-3.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/cmp-4.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/cmp-5.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/cmp-6.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/mask-1.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/mask-2.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/mask-3.c
>
> diff --git a/gcc/config/riscv/autovec-opt.md
> b/gcc/config/riscv/autovec-opt.md
> index e26c01856ff..22ab8afc994 100644
> --- a/gcc/config/riscv/autovec-opt.md
> +++ b/gcc/config/riscv/autovec-opt.md
> @@ -67,10 +67,10 @@
>  ;;
> -
>
>  (define_insn_and_split "*not"
> -  [(set (match_operand:VB 0 "register_operand"   "=vr")
> -   (bitmanip_bitwise:VB
> - (not:VB (match_operand:VB 2 "register_operand" " vr"))
> - (match_operand:VB 1 "register_operand" " vr")))]
> +  [(set (match_operand:VB_VLS 0 "register_operand"   "=vr")
> +   (bitmanip_bitwise:VB_VLS
> + (not:VB_VLS (match_operand:VB_VLS 2 "register_operand" " vr"))
> + (match_operand:VB_VLS 1 "register_operand" " vr")))]
>"TARGET_VECTOR && can_create_pseudo_p ()"
>"#"
>"&& 1"
> @@ -93,11 +93,11 @@
>  ;;
> -
>
>  (define_insn_and_split "*n"
> -  [(set (match_operand:VB 0 "register_operand" "=vr")
> -   (not:VB
> - (any_bitwise:VB
> -   (match_operand:VB 1 "register_operand" " vr")
> -   (match_operand:VB 2 "register_operand" " vr"]
> +  [(set (match_operand:VB_VLS 0 "register_operand" "=vr")
> +   (not:VB_VLS
> + (any_bitwise:VB_VLS
> +   (match_operand:VB_VLS 1 "register_operand" " vr")
> +   (match_operand:VB_VLS 2 "register_operand" "

RE: [PATCH V3] RISC-V: Fix ICE in get_avl_or_vl_reg

2023-09-14 Thread Li, Pan2 via Gcc-patches

Committed, thanks Kito.

Pan

-----Original Message-----
From: Gcc-patches  On Behalf 
Of Kito Cheng via Gcc-patches
Sent: Thursday, September 14, 2023 3:56 PM
To: Juzhe-Zhong 
Cc: gcc-patches@gcc.gnu.org; kito.ch...@gmail.com; jeffreya...@gmail.com; 
rdapp@gmail.com
Subject: Re: [PATCH V3] RISC-V: Fix ICE in get_avl_or_vl_reg

lgtm

On Thu, Sep 14, 2023 at 3:52 PM Juzhe-Zhong  wrote:
>
> update v1 -> v2: Add available fortran compiler check in rvv-fortran.exp.
>
> This patch fixes the ICE in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111395
>
> update v2 -> v3: Remove redundant format.
>
> PR target/111395
>
> gcc/ChangeLog:
>
> * config/riscv/riscv-vsetvl.cc (avl_info::operator==): Fix ICE.
> (vector_insn_info::global_merge): Ditto.
> (vector_insn_info::get_avl_or_vl_reg): Ditto.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/rvv/fortran/pr111395.f90: New test.
> * gcc.target/riscv/rvv/rvv-fortran.exp: New test.
>
> ---
>  gcc/config/riscv/riscv-vsetvl.cc  | 28 +++-
>  .../gcc.target/riscv/rvv/fortran/pr111395.f90 | 41 +
>  .../gcc.target/riscv/rvv/rvv-fortran.exp  | 45 +++
>  3 files changed, 103 insertions(+), 11 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/fortran/pr111395.f90
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/rvv-fortran.exp
>
> diff --git a/gcc/config/riscv/riscv-vsetvl.cc 
> b/gcc/config/riscv/riscv-vsetvl.cc
> index f81361c4ccd..8ec54092a48 100644
> --- a/gcc/config/riscv/riscv-vsetvl.cc
> +++ b/gcc/config/riscv/riscv-vsetvl.cc
> @@ -1652,6 +1652,8 @@ avl_info::operator== (const avl_info &other) const
>/* Handle VLMAX AVL.  */
>if (vlmax_avl_p (m_value))
>  return vlmax_avl_p (other.get_value ());
> +  if (vlmax_avl_p (other.get_value ()))
> +return false;
>
>/* If any source is undef value, we think they are not equal.  */
>if (!m_source || !other.get_source ())
> @@ -2258,6 +2260,18 @@ vector_insn_info::global_merge (const vector_insn_info 
> &merge_info,
> new_info.set_avl_source (first_set);
>  }
>
> +  /* Make sure VLMAX AVL always has a set_info the get VL.  */
> +  if (vlmax_avl_p (new_info.get_avl ()))
> +{
> +  if (this->get_avl_source ())
> +   new_info.set_avl_source (this->get_avl_source ());
> +  else
> +   {
> + gcc_assert (merge_info.get_avl_source ());
> + new_info.set_avl_source (merge_info.get_avl_source ());
> +   }
> +}
> +
>new_info.fuse_sew_lmul (*this, merge_info);
>new_info.fuse_tail_policy (*this, merge_info);
>new_info.fuse_mask_policy (*this, merge_info);
> @@ -2274,9 +2288,6 @@ vector_insn_info::get_avl_or_vl_reg (void) const
>if (!vlmax_avl_p (get_avl ()))
>  return get_avl ();
>
> -  if (get_avl_source ())
> -return get_avl_reg_rtx ();
> -
>rtx_insn *rinsn = get_insn ()->rtl ();
>if (has_vl_op (rinsn) || vsetvl_insn_p (rinsn))
>  {
> @@ -2288,14 +2299,9 @@ vector_insn_info::get_avl_or_vl_reg (void) const
> return vl;
>  }
>
> -  /* A DIRTY (polluted EMPTY) block if:
> -   - get_insn is scalar move (no AVL or VL operand).
> -   - get_avl_source is null (no def in the current DIRTY block).
> - Then we trace the previous insn which must be the insn
> - already inserted in Phase 2 to get the VL operand for VLMAX.  */
> -  rtx_insn *prev_rinsn = PREV_INSN (rinsn);
> -  gcc_assert (prev_rinsn && vsetvl_insn_p (prev_rinsn));
> -  return ::get_vl (prev_rinsn);
> +  /* We always has avl_source if it is VLMAX AVL.  */
> +  gcc_assert (get_avl_source ());
> +  return get_avl_reg_rtx ();
>  }
>
>  bool
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/fortran/pr111395.f90 
> b/gcc/testsuite/gcc.target/riscv/rvv/fortran/pr111395.f90
> new file mode 100644
> index 000..71253fe6bc5
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/fortran/pr111395.f90
> @@ -0,0 +1,41 @@
> +! { dg-do compile }
> +! { dg-options "-march=rv64gcv -mabi=lp64d -Ofast -std=legacy" }
> +
> +MODULE a
> +  REAL b
> +CONTAINS
> +  SUBROUTINE c(d,KTE)
> +REAL,DIMENSION(KTE) :: d,e,f,g
> +REAL,DIMENSION(KTE) :: h
> +i : DO j=1,b
> +   z=k
> +   DO l=m,n
> +  IF(o>=p)THEN
> + IF(l +q=z/0
> + ENDIF
> + e=q
> + f=EXP(r)
> +  ENDIF
> +   ENDDO
> +   s : DO t=1,2
> +  DO l=m,u
> + v=v+l
> +  ENDDO
> +  IF(w<=x)THEN
> + DO l=w,x
> +g=y
> + ENDDO
> +  ENDIF
> +   ENDDO  s
> +   aa=v
> +   ab=ac/aa
> +   k=ad/ab
> +ENDDO  i
> +IF(ae>af)THEN
> +   DO l=m,n
> +  d=h
> +   ENDDO
> +ENDIF
> +  END SUBROUTINE c
> +END MODULE a
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/rvv-fortran.exp 
> b/gcc/testsuite/gcc.target/riscv/rvv/rvv-fortran.exp
> new file mode 100644
> index

RE: [PATCH] RISC-V: Support VLS modes VEC_EXTRACT auto-vectorization

2023-09-13 Thread Li, Pan2 via Gcc-patches

Committed, thanks Robin.

Pan

-----Original Message-----
From: Gcc-patches  On Behalf 
Of Robin Dapp via Gcc-patches
Sent: Wednesday, September 13, 2023 8:46 PM
To: juzhe.zh...@rivai.ai; gcc-patches 
Cc: rdapp@gmail.com; kito.cheng ; Kito.cheng 
; jeffreyalaw 
Subject: Re: [PATCH] RISC-V: Support VLS modes VEC_EXTRACT auto-vectorization

> Yes. We need the additional helper function since I will call emit_insn 
> (gen_vec_extract (mode, mode)
> in the following patch, which fixes the PR111391 ICE.

OK.

Regards
 Robin

RE: [PATCH v1] RISC-V: Bugfix PR111362 for incorrect frm emit

2023-09-13 Thread Li, Pan2 via Gcc-patches

Committed, thanks Kito.

Pan

-----Original Message-----
From: Kito Cheng  
Sent: Wednesday, September 13, 2023 2:16 PM
To: Li, Pan2 
Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; Wang, Yanzhang 

Subject: Re: [PATCH v1] RISC-V: Bugfix PR111362 for incorrect frm emit

LGTM :)

On Wed, Sep 13, 2023 at 2:07 PM Pan Li via Gcc-patches
 wrote:
>
> From: Pan Li 
>
> When the mode switches from NONE to CALL, we restore the
> frm, but we lack a check for whether there is a static frm insn in cfun.
>
> This patch fixes this by adding a static frm insn check.
>
> gcc/ChangeLog:
>
> * PR target/111362
> * config/riscv/riscv.cc (riscv_emit_frm_mode_set): Bugfix.
>
> gcc/testsuite/ChangeLog:
>
> * PR target/111362
> * gcc.target/riscv/rvv/base/no-honor-frm-1.c: New test.
>
> Signed-off-by: Pan Li 
> ---
>  gcc/config/riscv/riscv.cc|  2 +-
>  .../gcc.target/riscv/rvv/base/no-honor-frm-1.c   | 12 
>  2 files changed, 13 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/no-honor-frm-1.c
>
> diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
> index 9d04ddd69e0..762937b0e37 100644
> --- a/gcc/config/riscv/riscv.cc
> +++ b/gcc/config/riscv/riscv.cc
> @@ -9173,7 +9173,7 @@ riscv_emit_frm_mode_set (int mode, int prev_mode)
>rtx frm = gen_int_mode (mode, SImode);
>
>if (mode == riscv_vector::FRM_DYN_CALL
> -   && prev_mode != riscv_vector::FRM_DYN)
> +   && prev_mode != riscv_vector::FRM_DYN && STATIC_FRM_P (cfun))
> /* No need to emit when prev mode is DYN already.  */
> emit_insn (gen_fsrmsi_restore_volatile (backup_reg));
>else if (mode == riscv_vector::FRM_DYN_EXIT && STATIC_FRM_P (cfun)
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/no-honor-frm-1.c 
> b/gcc/testsuite/gcc.target/riscv/rvv/base/no-honor-frm-1.c
> new file mode 100644
> index 000..b2e0f217bfa
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/no-honor-frm-1.c
> @@ -0,0 +1,12 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv64gcv -mabi=lp64 -O3 -Wno-psabi" } */
> +
> +void foo (void) {
> +  for (unsigned i = 0; i < sizeof(foo); i++)
> +__builtin_printf("%d", i);
> +}
> +
> +/* { dg-final { scan-assembler-not {fsrmi\s+[axs][0-9]+,\s*[01234]} } } */
> +/* { dg-final { scan-assembler-not {fsrmi\s+[01234]} } } */
> +/* { dg-final { scan-assembler-not {fsrm\s+[axs][0-9]+} } } */
> +/* { dg-final { scan-assembler-not {frrm\s+[axs][0-9]+} } } */
> --
> 2.34.1
>
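
The one-line condition change in riscv_emit_frm_mode_set quoted above can be modeled as a pair of predicates. This is a simplified sketch, not the GCC code: the enum is a hypothetical subset with illustrative values, and `has_static_frm` stands in for `STATIC_FRM_P (cfun)`.

```cpp
// Hypothetical subset of the FRM modes; the numeric values are illustrative.
enum frm_mode { FRM_NONE, FRM_DYN, FRM_DYN_CALL, FRM_DYN_EXIT };

// Before the fix: restore frm on any switch to DYN_CALL from a non-DYN mode,
// whether or not the function contains a static frm insn.
static bool
needs_restore_before_fix (frm_mode mode, frm_mode prev_mode,
                          bool /* has_static_frm */)
{
  return mode == FRM_DYN_CALL && prev_mode != FRM_DYN;
}

// After the fix: additionally require a static frm insn in the function.
static bool
needs_restore_after_fix (frm_mode mode, frm_mode prev_mode,
                         bool has_static_frm)
{
  return mode == FRM_DYN_CALL && prev_mode != FRM_DYN && has_static_frm;
}
```

For the NONE to CALL switch in a function without any static frm insn, the first predicate is true and the second is false, which matches the new test expecting no frm instructions in the output.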

RE: [PATCH v1] RISC-V: Remove unused structure in cost model

2023-09-12 Thread Li, Pan2 via Gcc-patches

Committed, thanks Jeff.

Pan

-----Original Message-----
From: Jeff Law  
Sent: Tuesday, September 12, 2023 9:12 PM
To: Li, Pan2 ; gcc-patches@gcc.gnu.org
Cc: Wang, Yanzhang ; kito.ch...@gmail.com; 
juzhe.zh...@rivai.ai
Subject: Re: [PATCH v1] RISC-V: Remove unused structure in cost model



On 9/12/23 07:02, Pan Li via Gcc-patches wrote:
> From: Pan Li 
> 
> The struct range is unused, remove it.
> 
> gcc/ChangeLog:
> 
>   * config/riscv/riscv-vector-costs.h (struct range): Removed.
OK
jeff

RE: [PATCH V5] RISC-V: Support Dynamic LMUL Cost model

2023-09-12 Thread Li, Pan2 via Gcc-patches

Committed, thanks Robin.

Pan

-----Original Message-----
From: Gcc-patches  On Behalf 
Of Robin Dapp via Gcc-patches
Sent: Tuesday, September 12, 2023 7:07 PM
To: Juzhe-Zhong ; gcc-patches@gcc.gnu.org
Cc: kito.ch...@sifive.com; kito.ch...@gmail.com
Subject: Re: [PATCH V5] RISC-V: Support Dynamic LMUL Cost model

LGTM.  We should just keep in mind the restrictions discussed in the
other thread.

Regards
 Robin

RE: [PATCH v2] RISC-V: Implement RESOLVE_OVERLOADED_BUILTIN for RVV intrinsic

2023-09-12 Thread Li, Pan2 via Gcc-patches

>I think it's better to move 'get_non_overloaded_instance' into function_base.
Sure.
> Please rewrite the comment; don't mention aarch64 SVE.
Sure.

>Could you run your RVV intrinsic API CI with this patch?
>I am worried that the resolve changes will break the existing API support.

This patch only enables resolving for vmv_v; the test cases ensure the 
correctness of both the existing API and the overloaded API of vmv_v.

Will send the v3 for this change.

Pan


From: juzhe.zh...@rivai.ai 
Sent: Tuesday, September 12, 2023 3:47 PM
To: Li, Pan2 ; gcc-patches 
Cc: Li, Pan2 ; Wang, Yanzhang ; 
kito.cheng 
Subject: Re: [PATCH v2] RISC-V: Implement RESOLVE_OVERLOADED_BUILTIN for RVV 
intrinsic

I think it's better to move 'get_non_overloaded_instance' into function_base.

+  /* To avoid API conflicting, we use void return type and void argument
+ for the overloaded function register, like aarch64-sve.  */

Please rewrite the comment; don't mention aarch64 SVE.

Could you run your RVV intrinsic API CI with this patch?
I am worried that the resolve changes will break the existing API support.



juzhe.zh...@rivai.ai

From: pan2.li
Date: 2023-09-12 15:20
To: gcc-patches
CC: juzhe.zhong; 
pan2.li; 
yanzhang.wang; 
kito.cheng
Subject: [PATCH v2] RISC-V: Implement RESOLVE_OVERLOADED_BUILTIN for RVV 
intrinsic
From: Pan Li <pan2...@intel.com>

Update in v2:

* Add get_non_overloaded_instance for function instance.
* Fix overload check for policy function.
* Enrich the test cases check.

Original log:

This patch adds the framework to support the RVV overloaded
intrinsic API in riscv-xxx-xxx-gcc, as riscv-xxx-xxx-g++ already does.

It mostly leverages the hook TARGET_RESOLVE_OVERLOADED_BUILTIN,
with the steps below.

* Register overloaded functions.
* Add function_resolver for overloaded function resolving.
* Add resolve API for function shape with default implementation.
* Implement the hook that maps the overloaded API to the non-overloaded API.

We validated this framework with the vmv_v intrinsic API(s), and we will
add support for more intrinsic APIs in follow-up patches.

gcc/ChangeLog:

* config/riscv/riscv-c.cc
(riscv_resolve_overloaded_builtin): New function for the hook.
(riscv_register_pragmas): Register the hook
* config/riscv/riscv-protos.h (resolve_overloaded_builtin): New decl.
* config/riscv/riscv-vector-builtins-shapes.cc (build_one):
Register overloaded function.
(struct overloaded_base): New struct for overloaded shape.
(struct non_overloaded_base): New struct for non overloaded shape.
(struct move_def): Inherit overloaded shape.
* config/riscv/riscv-vector-builtins.cc
(function_instance::get_non_overloaded_instance): New API impl.
(function_builder::add_function): Add overloaded arg.
(function_resolver::function_resolver): New constructor.
(function_builder::add_overloaded_function): New API impl.
(function_resolver::resolve): Ditto.
(function_resolver::lookup): Ditto.
(function_resolver::get_sub_code): Ditto.
(resolve_overloaded_builtin): New function impl.
* config/riscv/riscv-vector-builtins.h:
(class function_resolver): New class.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/overloaded_rv32_vmv_v.c: New test.
* gcc.target/riscv/rvv/base/overloaded_rv64_vmv_v.c: New test.
* gcc.target/riscv/rvv/base/overloaded_vmv_v.h: New test.

Signed-off-by: Pan Li <pan2...@intel.com>
---
gcc/config/riscv/riscv-c.cc   |  36 
gcc/config/riscv/riscv-protos.h   |   1 +
.../riscv/riscv-vector-builtins-shapes.cc |  20 ++-
gcc/config/riscv/riscv-vector-builtins.cc | 155 +-
gcc/config/riscv/riscv-vector-builtins.h  |  35 +++-
.../riscv/rvv/base/overloaded_rv32_vmv_v.c|   8 +
.../riscv/rvv/base/overloaded_rv64_vmv_v.c|   8 +
.../riscv/rvv/base/overloaded_vmv_v.h |  27 +++
8 files changed, 287 insertions(+), 3 deletions(-)
create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/base/overloaded_rv32_vmv_v.c
create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/base/overloaded_rv64_vmv_v.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/overloaded_vmv_v.h

diff --git a/gcc/config/riscv/riscv-c.cc b/gcc/config/riscv/riscv-c.cc
index 283052ae313..060edd3129d 100644
--- a/gcc/config/riscv/riscv-c.cc
+++ b/gcc/config/riscv/riscv-c.cc
@@ -220,11 +220,47 @@ riscv_check_builtin_call (location_t loc, vec 
arg_loc, tree fndecl,
   gcc_unreachable ();
}
+/* Implement TARGET_RESOLVE_OVERLOADED_BUILTIN.  */
+static tree
+riscv_resolve_overloaded_builtin (unsigned int uncast_location, tree fndecl,
+   void *uncast_arglist)
+{
+  vec empty = {};
+  location_t loc = (location_t) uncast_location;
+  vec *arglist = (vec *) uncast_arglist;
+  unsigned int code = DECL_MD_FUNCTION_CODE (fndecl);
+

RE: RE: [PATCH v1] RISC-V: Implement RESOLVE_OVERLOADED_BUILTIN for RVV intrinsic

2023-09-11 Thread Li, Pan2 via Gcc-patches

Got it, will have a try.

Pan

From: juzhe.zh...@rivai.ai 
Sent: Tuesday, September 12, 2023 9:30 AM
To: Li, Pan2 
Cc: kito.cheng ; gcc-patches ; 
Wang, Yanzhang 
Subject: Re: RE: [PATCH v1] RISC-V: Implement RESOLVE_OVERLOADED_BUILTIN for 
RVV intrinsic

Add a function called get_non_overloaded_instance to the instance.
The instance already knows it is void vmv (void).
In this function, search the arglist and return the real non-overloaded decl.

juzhe.zh...@rivai.ai

From: Li, Pan2
Date: 2023-09-12 09:20
To: 钟居哲
CC: kito.cheng; 
gcc-patches; Wang, 
Yanzhang
Subject: RE: RE: [PATCH v1] RISC-V: Implement RESOLVE_OVERLOADED_BUILTIN for 
RVV intrinsic
We cannot rely on this instance and still get a correct result.
The rfun in the code below is the builtin registered for the overloaded 
function, which is registered as void xxx(void), as aarch64 does, to avoid the 
conflict.

Let's take vmv_v_i32m1 as an example in the rfun table.

Index 0: void vmv_v(void) overloaded
Index 1: i32m1 vmv_v_v_i32m1_i32m1 (i32m1, size_t) non-overloaded
Index 2: placeholder.

When we enter the hook (i.e. the code listed below), the rfun we have is the 
index 0 rfun, not the index 1 rfun.
We then need the arglist to look up the index 1 rfun for the underlying call, 
and to build the instance for the index 1 rfun.

Aarch64 has the same rfun table as above; it uses a loop to parse the 
arglist, matching machine modes against a predefined type suffix (which is not 
available in RISC-V).

I think both try to solve almost the same problem, but with different 
implementation details.
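
The table layout described above can be sketched in a few lines. This is a toy model, not the GCC data structures: `registered_fn`, `arg_mode` and `resolve` are hypothetical, and argument types are reduced to a small enum of modes. It shows why the entry handed to the hook (index 0, registered as void (void)) carries no usable signature, so the call arguments are needed to find the index 1 entry.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-ins for the machine modes of the arguments.
enum arg_mode { MODE_DI, MODE_VNx2SI };

// Toy model of one entry in the registered function table.
struct registered_fn
{
  std::string name;
  bool overloaded_p;
  std::vector<arg_mode> arg_modes;  // empty for the void (void) entry
};

// CODE is the index of the overloaded entry seen by the hook.  Scan the
// entries after it and return the first non-overloaded one whose argument
// modes equal those of the call, or nullptr if there is none.
static const registered_fn *
resolve (const std::vector<registered_fn> &table, std::size_t code,
         const std::vector<arg_mode> &call_modes)
{
  if (code >= table.size () || !table[code].overloaded_p)
    return nullptr;

  for (std::size_t i = code + 1; i < table.size (); i++)
    {
      const registered_fn &fn = table[i];
      if (fn.overloaded_p)
        continue;
      if (fn.arg_modes == call_modes)
        return &fn;
    }

  return nullptr;
}
```

Calling `resolve` with index 0 and the modes of an (i32m1, size_t) call returns the index 1 entry in this model, while the index 0 entry by itself has an empty signature and cannot describe the call.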

Pan

From: 钟居哲 <juzhe.zh...@rivai.ai>
Sent: Tuesday, September 12, 2023 7:20 AM
To: Li, Pan2 <pan2...@intel.com>
Cc: kito.cheng <kito.ch...@gmail.com>; gcc-patches <gcc-patches@gcc.gnu.org>; 
Wang, Yanzhang <yanzhang.w...@intel.com>
Subject: Re: RE: [PATCH v1] RISC-V: Implement RESOLVE_OVERLOADED_BUILTIN for 
RVV intrinsic

I don't understand.

+tree
+resolve_overloaded_builtin (location_t loc, unsigned int code,
+                            vec<tree, va_gc> *arglist)
+{
+  if (code >= vec_safe_length (registered_functions))
+    return NULL_TREE;
+
+  const registered_function *rfun = (*registered_functions)[code];
+
+  if (!rfun || !rfun->overloaded_p)
+    return NULL_TREE;
+
+  return function_resolver (loc, rfun->instance, rfun->decl, *arglist)
+    .resolve ();
+}
You already have rfun->instance. Just using this instance should be good enough.

juzhe.zh...@rivai.ai

From: Li, Pan2
Date: 2023-09-11 23:24
To: 钟居哲
CC: kito.cheng; 
gcc-patches; Wang, 
Yanzhang
Subject: RE: RE: [PATCH v1] RISC-V: Implement RESOLVE_OVERLOADED_BUILTIN for 
RVV intrinsic
For a function instance with a void return type and void arguments, it is easy, 
as you mention below.

For a general API (to get the right hash), you need to build the rvv_type_info, 
predications_type_index and rvv_op_info
from the arglist (aka vec) passed in by the hook.

Then we need to construct the above parameters from a tree argument. Sorry, I am 
not sure whether I understand correctly, but I failed
to find anywhere with a similar usage.

Could you please share some best practices for the transformation 
from tree to the above types?

Pan

From: 钟居哲 <juzhe.zh...@rivai.ai>
Sent: Monday, September 11, 2023 9:07 PM
To: Li, Pan2 <pan2...@intel.com>
Cc: kito.cheng <kito.ch...@gmail.com>; gcc-patches <gcc-patches@gcc.gnu.org>; 
Wang, Yanzhang <yanzhang.w...@intel.com>
Subject: Re: RE: [PATCH v1] RISC-V: Implement RESOLVE_OVERLOADED_BUILTIN for 
RVV intrinsic

function_instance
get_read_vl_instance (void)
{
  return function_instance ("read_vl", bases::read_vl, shapes::read_vl,
  none_ops[0], PRED_TYPE_none, _none_void_ops);
}

tree
get_read_vl_decl (void)
{
  function_instance instance = get_read_vl_instance ();
  hashval_t hash = instance.hash ();
  registered_function *rfn = function_table->find_with_hash (instance, hash);
  gcc_assert (rfn);
  return rfn->decl;
}

You should refer to it. I don't see why it is hard for us to construct the 
instance first, then use that instance's hash to get the decl.

juzhe.zh...@rivai.ai

From: Li, Pan2
Date: 2023-09-11 20:26
To: juzhe.zhong
CC: kito.cheng; 
gcc-patches; Wang, 
Yanzhang
Subject: RE: [PATCH v1] RISC-V: Implement RESOLVE_OVERLOADED_BUILTIN for RVV 
intrinsic
> No. You must construct instance. 'strcmp' is very ugly.

Strcmp here

RE: RE: [PATCH v1] RISC-V: Implement RESOLVE_OVERLOADED_BUILTIN for RVV intrinsic

2023-09-11 Thread Li, Pan2 via Gcc-patches

We cannot rely on this instance and still get a correct result.
The rfun in the code below is the builtin registered for the overloaded 
function, which is registered as void xxx(void), as aarch64 does, to avoid the 
conflict.

Let's take vmv_v_i32m1 as an example in the rfun table.

Index 0: void vmv_v(void) overloaded
Index 1: i32m1 vmv_v_v_i32m1_i32m1 (i32m1, size_t) non-overloaded
Index 2: placeholder.

When we enter the hook (i.e. the code listed below), the rfun we have is the 
index 0 rfun, not the index 1 rfun.
We then need the arglist to look up the index 1 rfun for the underlying call, 
and to build the instance for the index 1 rfun.

Aarch64 has the same rfun table as above; it uses a loop to parse the 
arglist, matching machine modes against a predefined type suffix (which is not 
available in RISC-V).

I think both try to solve almost the same problem, but with different 
implementation details.

Pan

From: 钟居哲 
Sent: Tuesday, September 12, 2023 7:20 AM
To: Li, Pan2 
Cc: kito.cheng ; gcc-patches ; 
Wang, Yanzhang 
Subject: Re: RE: [PATCH v1] RISC-V: Implement RESOLVE_OVERLOADED_BUILTIN for 
RVV intrinsic

I don't understand.


+tree
+resolve_overloaded_builtin (location_t loc, unsigned int code,
+                            vec<tree, va_gc> *arglist)
+{
+  if (code >= vec_safe_length (registered_functions))
+    return NULL_TREE;
+
+  const registered_function *rfun = (*registered_functions)[code];
+
+  if (!rfun || !rfun->overloaded_p)
+    return NULL_TREE;
+
+  return function_resolver (loc, rfun->instance, rfun->decl, *arglist)
+    .resolve ();
+}
You already have rfun->instance. Just using this instance should be good enough.

juzhe.zh...@rivai.ai

From: Li, Pan2
Date: 2023-09-11 23:24
To: 钟居哲
CC: kito.cheng; 
gcc-patches; Wang, 
Yanzhang
Subject: RE: RE: [PATCH v1] RISC-V: Implement RESOLVE_OVERLOADED_BUILTIN for 
RVV intrinsic
For a function instance with a void return type and void arguments, it is easy, 
as you mention below.

For a general API (to get the right hash), you need to build the rvv_type_info, 
predications_type_index and rvv_op_info
from the arglist (aka vec) passed in by the hook.

Then we need to construct the above parameters from a tree argument. Sorry, I am 
not sure whether I understand correctly, but I failed
to find anywhere with a similar usage.

Could you please share some best practices for the transformation 
from tree to the above types?

Pan

From: 钟居哲 <juzhe.zh...@rivai.ai>
Sent: Monday, September 11, 2023 9:07 PM
To: Li, Pan2 <pan2...@intel.com>
Cc: kito.cheng <kito.ch...@gmail.com>; gcc-patches <gcc-patches@gcc.gnu.org>; 
Wang, Yanzhang <yanzhang.w...@intel.com>
Subject: Re: RE: [PATCH v1] RISC-V: Implement RESOLVE_OVERLOADED_BUILTIN for 
RVV intrinsic

function_instance
get_read_vl_instance (void)
{
  return function_instance ("read_vl", bases::read_vl, shapes::read_vl,
  none_ops[0], PRED_TYPE_none, _none_void_ops);
}

tree
get_read_vl_decl (void)
{
  function_instance instance = get_read_vl_instance ();
  hashval_t hash = instance.hash ();
  registered_function *rfn = function_table->find_with_hash (instance, hash);
  gcc_assert (rfn);
  return rfn->decl;
}

You should refer to it. I don't see why it is hard for us to construct the 
instance first, then use that instance's hash to get the decl.

juzhe.zh...@rivai.ai

From: Li, Pan2
Date: 2023-09-11 20:26
To: juzhe.zhong
CC: kito.cheng; 
gcc-patches; Wang, 
Yanzhang
Subject: RE: [PATCH v1] RISC-V: Implement RESOLVE_OVERLOADED_BUILTIN for RVV 
intrinsic
> No. You must construct instance. 'strcmp' is very ugly.

The strcmp here is defensive code for an early exit when nothing is found (it 
can be removed without affecting correctness); it is not required to find the 
right declaration.

Pan

From: juzhe.zhong <juzhe.zh...@rivai.ai>
Sent: Monday, September 11, 2023 8:20 PM
To: Li, Pan2 <pan2...@intel.com>
Cc: kito.cheng <kito.ch...@gmail.com>; gcc-patches <gcc-patches@gcc.gnu.org>; 
Wang, Yanzhang <yanzhang.w...@intel.com>
Subject: Re: [PATCH v1] RISC-V: Implement RESOLVE_OVERLOADED_BUILTIN for RVV 
intrinsic

No. You must construct instance. 'strcmp' is very ugly.
---- Replied Message ----
From: Li, Pan2
Date: 09/11/2023 20:09
To: juzhe.zh...@rivai.ai, kito.cheng
Cc: gcc-patches, Wang, Yanzhang
Subject: RE: Re: [PATCH v1] RISC-V: Implement RESOLVE_OVERLOADED_BUILTIN for RVV 
intrinsic
> -if (overloaded_p && instance.pred == PRED_TYPE_m)
> +if (overloaded_p)

RE: RE: [PATCH v1] RISC-V: Implement RESOLVE_OVERLOADED_BUILTIN for RVV intrinsic

2023-09-11 Thread Li, Pan2 via Gcc-patches

For a function instance with a void return type and void arguments, it is easy, 
as you mention below.

For a general API (to get the right hash), you need to build the rvv_type_info, 
predications_type_index and rvv_op_info
from the arglist (aka vec) passed in by the hook.

Then we need to construct the above parameters from a tree argument. Sorry, I am 
not sure whether I understand correctly, but I failed
to find anywhere with a similar usage.

Could you please share some best practices for the transformation 
from tree to the above types?

Pan

From: 钟居哲 
Sent: Monday, September 11, 2023 9:07 PM
To: Li, Pan2 
Cc: kito.cheng ; gcc-patches ; 
Wang, Yanzhang 
Subject: Re: RE: [PATCH v1] RISC-V: Implement RESOLVE_OVERLOADED_BUILTIN for 
RVV intrinsic

function_instance
get_read_vl_instance (void)
{
  return function_instance ("read_vl", bases::read_vl, shapes::read_vl,
  none_ops[0], PRED_TYPE_none, _none_void_ops);
}

tree
get_read_vl_decl (void)
{
  function_instance instance = get_read_vl_instance ();
  hashval_t hash = instance.hash ();
  registered_function *rfn = function_table->find_with_hash (instance, hash);
  gcc_assert (rfn);
  return rfn->decl;
}

You should refer to it. I don't see why it is hard for us to construct the 
instance first, then use that instance's hash to get the decl.
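
The lookup suggested above (build the key first, then hash it) can be sketched in plain C++ as follows. This is only an illustration under stated assumptions: Instance, InstanceHash, FunctionTable, Decl and lookup_by_instance are hypothetical stand-ins for function_instance, its hash, function_table and a tree decl, and are not the GCC types themselves.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for function_instance: only the fields needed to
// identify one concrete intrinsic.
struct Instance {
  std::string base_name;
  int type_index;  // placeholder for rvv_type_info
  int pred;        // placeholder for predication_type_index

  bool operator==(const Instance &o) const {
    return base_name == o.base_name && type_index == o.type_index
           && pred == o.pred;
  }
};

// Hypothetical hash that combines all identifying fields of the key.
struct InstanceHash {
  std::size_t operator()(const Instance &i) const {
    std::size_t h = std::hash<std::string>()(i.base_name);
    h = h * 31 + std::hash<int>()(i.type_index);
    h = h * 31 + std::hash<int>()(i.pred);
    return h;
  }
};

using Decl = int;         // placeholder for a tree decl
const Decl NO_DECL = -1;  // placeholder for NULL_TREE

// Placeholder for function_table.
using FunctionTable = std::unordered_map<Instance, Decl, InstanceHash>;

// Construct the key, hash it, and return the registered decl if one exists.
// No name comparison loop over all registered functions is needed.
Decl lookup_by_instance(const FunctionTable &table, const Instance &key) {
  assert(!key.base_name.empty());
  auto it = table.find(key);
  return it == table.end() ? NO_DECL : it->second;
}
```

The open question in the thread is how to obtain the key fields from the argument list; the sketch assumes they are already known.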

juzhe.zh...@rivai.ai

From: Li, Pan2
Date: 2023-09-11 20:26
To: juzhe.zhong
CC: kito.cheng; 
gcc-patches; Wang, 
Yanzhang
Subject: RE: [PATCH v1] RISC-V: Implement RESOLVE_OVERLOADED_BUILTIN for RVV 
intrinsic
> No. You must construct instance. 'strcmp' is very ugly.

The strcmp here is defensive code for an early exit when nothing is found. It 
can be removed without affecting correctness and is not required to find the 
right declaration.

Pan

From: juzhe.zhong <juzhe.zh...@rivai.ai>
Sent: Monday, September 11, 2023 8:20 PM
To: Li, Pan2 <pan2...@intel.com>
Cc: kito.cheng <kito.ch...@gmail.com>; gcc-patches 
<gcc-patches@gcc.gnu.org>; Wang, Yanzhang 
<yanzhang.w...@intel.com>
Subject: Re: [PATCH v1] RISC-V: Implement RESOLVE_OVERLOADED_BUILTIN for RVV 
intrinsic

No. You must construct instance. 'strcmp' is very ugly.
 Replied Message 
From
Li, Pan2
Date
09/11/2023 20:09
To
juzhe.zh...@rivai.ai,
kito.cheng
Cc
gcc-patches,
Wang, Yanzhang
Subject
RE: Re: [PATCH v1] RISC-V: Implement RESOLVE_OVERLOADED_BUILTIN for RVV 
intrinsic
> -if (overloaded_p && instance.pred == PRED_TYPE_m)
> +if (overloaded_p)

Thanks for pointing this out. I misunderstood the policy functions, so this 
change is a mistake; I will send a V2 to fix it.

> Plz change it into :

Actually, it is not easy to convert to this approach because aarch64 has a 
different implementation of type information, such as type_suffix_info 
(aarch64 loops over the type suffixes to get the arglist type in 
infer_vector_or_tuple_type).

Thus, it is not easy to construct rvv_type_info, predication_type_index and 
rvv_op_info from the arglist, all of which function_instance requires at 
construction.



Pan

From: juzhe.zh...@rivai.ai <juzhe.zh...@rivai.ai>
Sent: Monday, September 11, 2023 5:13 PM
To: kito.cheng <kito.ch...@gmail.com>
Cc: Li, Pan2 <pan2...@intel.com>; gcc-patches 
<gcc-patches@gcc.gnu.org>; Wang, Yanzhang 
<yanzhang.w...@intel.com>
Subject: Re: Re: [PATCH v1] RISC-V: Implement RESOLVE_OVERLOADED_BUILTIN for 
RVV intrinsic

>> Just make sure it's the right change?
It seems incorrect to me.

More comments (I just reviewed again):


+tree
+function_resolver::lookup ()
+{
+  unsigned int code_limit = vec_safe_length (registered_functions);
+
+  for (unsigned code = get_sub_code () + 1; code < code_limit; code++)
+    {
+      registered_function *rfun = (*registered_functions)[code];
+      function_instance instance = rfun->instance;
+
+      if (strcmp (base_name, instance.base_name) != 0)
+        break;
+
+      if (rfun->overloaded_p)
+        continue;
+
+      unsigned k;
+      const rvv_arg_type_info *args = instance.op_info->args;
+
+      for (k = 0; args[k].base_type != NUM_BASE_TYPES; k++)
+        {
+          if (k >= m_arglist.length ())
+            break;
+
+          if (TYPE_MODE (instance.get_arg_type (k))
+              != TYPE_MODE (TREE_TYPE (m_arglist[k])))
+            break;
+        }
+
+      if (args[k].base_type == NUM_BASE_TYPES)
+        return rfun->decl;
+    }
+
+  return NULL_TREE;
+}

Plz change it into :

/* Silently check whether there is an instance of the function with the
   mode suffix given by MODE and the type suffixes given by TYPE0 and TYPE1.
   Return its function decl if so,

RE: [PATCH] RISC-V: Enable RVV scalable vectorization by default[PR111311]

2023-09-11 Thread Li, Pan2 via Gcc-patches

Committed, thanks Jeff.

Pan

-Original Message-
From: Gcc-patches  On Behalf 
Of Jeff Law via Gcc-patches
Sent: Monday, September 11, 2023 9:12 PM
To: juzhe.zh...@rivai.ai; gcc-patches 
Cc: Kito.cheng ; kito.cheng 
Subject: Re: [PATCH] RISC-V: Enable RVV scalable vectorization by 
default[PR111311]



On 9/10/23 21:42, juzhe.zh...@rivai.ai wrote:
> Ping this patch.
> 
> I think it's time to enable scalable vectorization by default and do the 
> whole regression every time (except vect.exp that we didn't enable yet)
> 
> Update current FAILs status:
> 
> Real FAILS (ICE and execution FAIL):
> 
> FAIL: gcc.dg/pr70252.c (internal compiler error: in 
> gimple_expand_vec_cond_expr, at gimple-isel.cc:284)
> FAIL: gcc.dg/pr70252.c (test for excess errors)
> FAIL: gcc.dg/pr92301.c execution test
> 
> Robin is working on these 3 issues, and they will be solved soon.
> 
> FAIL: g++.dg/torture/vshuf-v4df.C   -O2 -flto -fno-use-linker-plugin 
> -flto-partition=none  (internal compiler error: in as_a, at machmode.h:381)
> FAIL: g++.dg/torture/vshuf-v4df.C   -O2 -flto -fno-use-linker-plugin 
> -flto-partition=none  (test for excess errors)
> FAIL: g++.dg/torture/vshuf-v4df.C   -O2 -flto -fuse-linker-plugin 
> -fno-fat-lto-objects  (internal compiler error: in as_a, at machmode.h:381)
> FAIL: g++.dg/torture/vshuf-v4df.C   -O2 -flto -fuse-linker-plugin 
> -fno-fat-lto-objects  (test for excess errors)
> 
> This is a long-known issue that I have mentioned many times; we need 
> help with LTO since it is caused by the mode bits extension.
> 
> The rest bogus FAILs:
> FAIL: gcc.dg/unroll-8.c scan-rtl-dump loop2_unroll "Not unrolling loop, 
> doesn't roll"
> FAIL: gcc.dg/unroll-8.c scan-rtl-dump loop2_unroll "likely upper bound: 6"
> FAIL: gcc.dg/unroll-8.c scan-rtl-dump loop2_unroll "realistic bound: -1"
> FAIL: gcc.dg/var-expand1.c scan-rtl-dump loop2_unroll "Expanding 
> Accumulator"
> FAIL: gcc.dg/tree-ssa/cunroll-16.c scan-tree-dump cunroll "optimized: 
> loop with [0-9]+ iterations completely unrolled"
> FAIL: gcc.dg/tree-ssa/cunroll-16.c scan-tree-dump-not optimized "foo"
> FAIL: gcc.dg/tree-ssa/forwprop-40.c scan-tree-dump-times optimized 
> "BIT_FIELD_REF" 0
> FAIL: gcc.dg/tree-ssa/forwprop-40.c scan-tree-dump-times optimized 
> "BIT_INSERT_EXPR" 0
> FAIL: gcc.dg/tree-ssa/forwprop-41.c scan-tree-dump-times optimized 
> "BIT_FIELD_REF" 0
> FAIL: gcc.dg/tree-ssa/forwprop-41.c scan-tree-dump-times optimized 
> "BIT_INSERT_EXPR" 1
> FAIL: gcc.dg/tree-ssa/gen-vect-11b.c scan-tree-dump-times vect 
> "vectorized 0 loops" 1
> FAIL: gcc.dg/tree-ssa/gen-vect-11c.c scan-tree-dump-times vect 
> "vectorized 0 loops" 1
> FAIL: gcc.dg/tree-ssa/gen-vect-26.c scan-tree-dump-times vect "Alignment 
> of access forced using peeling" 1
> FAIL: gcc.dg/tree-ssa/gen-vect-28.c scan-tree-dump-times vect "Alignment 
> of access forced using peeling" 1
> FAIL: gcc.dg/tree-ssa/loop-bound-1.c scan-tree-dump ivopts "bounded by 254"
> FAIL: gcc.dg/tree-ssa/loop-bound-2.c scan-tree-dump ivopts "bounded by 254"
> FAIL: gcc.dg/tree-ssa/predcom-2.c scan-tree-dump-times pcom "Unrolling 2 
> times." 2
> FAIL: gcc.dg/tree-ssa/predcom-4.c scan-tree-dump-times pcom "Combination" 1
> FAIL: gcc.dg/tree-ssa/predcom-4.c scan-tree-dump-times pcom "Unrolling 3 
> times." 1
> FAIL: gcc.dg/tree-ssa/predcom-5.c scan-tree-dump-times pcom "Combination" 2
> FAIL: gcc.dg/tree-ssa/predcom-5.c scan-tree-dump-times pcom "Unrolling 3 
> times." 1
> FAIL: gcc.dg/tree-ssa/predcom-9.c scan-tree-dump pcom "Executing 
> predictive commoning without unrolling"
> FAIL: gcc.dg/tree-ssa/reassoc-46.c scan-tree-dump-times optimized 
> "(?:vect_)?sum_[\\d._]+ = (?:(?:vect_)?_[\\d._]+ \\+ 
> (?:vect_)?sum_[\\d._]+|(?:vect_)?sum_[\\d._]+ \\+ (?:vect_)?_[\\d._]+)" 1
> FAIL: gcc.dg/tree-ssa/scev-10.c scan-tree-dump-times ivopts " 
>   Type:\\tREFERENCE ADDRESS\n" 1
> FAIL: gcc.dg/tree-ssa/scev-11.c scan-tree-dump-times ivopts " 
>   Type:\\tREFERENCE ADDRESS\n" 2
> FAIL: gcc.dg/tree-ssa/scev-14.c scan-tree-dump ivopts "Overflowness wrto 
> loop niter:\tNo-overflow"
> FAIL: gcc.dg/tree-ssa/scev-9.c scan-tree-dump-times ivopts " 
>   Type:\\tREFERENCE ADDRESS\n" 1
> FAIL: gcc.dg/tree-ssa/split-path-11.c scan-tree-dump-times split-paths 
> "join point for if-convertable half-diamond" 1
> 
> These are bogus dump FAILs; I have confirmed each of them, and we 
> have the same behavior as SVE.
> 
> So is this patch ok for trunk ?
Yes, this is OK for the trunk.

Jeff

RE: [PATCH v1] RISC-V: Implement RESOLVE_OVERLOADED_BUILTIN for RVV intrinsic

2023-09-11 Thread Li, Pan2 via Gcc-patches

> No. You must construct instance. 'strcmp' is very ugly.

The strcmp here is defensive code for an early exit when nothing is found. It 
can be removed without affecting correctness and is not required to find the 
right declaration.

Pan

From: juzhe.zhong 
Sent: Monday, September 11, 2023 8:20 PM
To: Li, Pan2 
Cc: kito.cheng ; gcc-patches ; 
Wang, Yanzhang 
Subject: Re: [PATCH v1] RISC-V: Implement RESOLVE_OVERLOADED_BUILTIN for RVV 
intrinsic

No. You must construct instance. 'strcmp' is very ugly.
 Replied Message 
From
Li, Pan2
Date
09/11/2023 20:09
To
juzhe.zh...@rivai.ai,
kito.cheng
Cc
gcc-patches,
Wang, Yanzhang
Subject
RE: Re: [PATCH v1] RISC-V: Implement RESOLVE_OVERLOADED_BUILTIN for RVV 
intrinsic
> -if (overloaded_p && instance.pred == PRED_TYPE_m)
> +if (overloaded_p)

Thanks for pointing this out. I misunderstood the policy functions, so this 
change is a mistake; I will send a V2 to fix it.

> Plz change it into :

Actually, it is not easy to convert to this approach because aarch64 has a 
different implementation of type information, such as type_suffix_info 
(aarch64 loops over the type suffixes to get the arglist type in 
infer_vector_or_tuple_type).

Thus, it is not easy to construct rvv_type_info, predication_type_index and 
rvv_op_info from the arglist, all of which function_instance requires at 
construction.



Pan

From: juzhe.zh...@rivai.ai <juzhe.zh...@rivai.ai>
Sent: Monday, September 11, 2023 5:13 PM
To: kito.cheng <kito.ch...@gmail.com>
Cc: Li, Pan2 <pan2...@intel.com>; gcc-patches 
<gcc-patches@gcc.gnu.org>; Wang, Yanzhang 
<yanzhang.w...@intel.com>
Subject: Re: Re: [PATCH v1] RISC-V: Implement RESOLVE_OVERLOADED_BUILTIN for 
RVV intrinsic

>> Just make sure it's the right change?
It seems incorrect to me.

More comments (I just reviewed again):


+tree
+function_resolver::lookup ()
+{
+  unsigned int code_limit = vec_safe_length (registered_functions);
+
+  for (unsigned code = get_sub_code () + 1; code < code_limit; code++)
+    {
+      registered_function *rfun = (*registered_functions)[code];
+      function_instance instance = rfun->instance;
+
+      if (strcmp (base_name, instance.base_name) != 0)
+        break;
+
+      if (rfun->overloaded_p)
+        continue;
+
+      unsigned k;
+      const rvv_arg_type_info *args = instance.op_info->args;
+
+      for (k = 0; args[k].base_type != NUM_BASE_TYPES; k++)
+        {
+          if (k >= m_arglist.length ())
+            break;
+
+          if (TYPE_MODE (instance.get_arg_type (k))
+              != TYPE_MODE (TREE_TYPE (m_arglist[k])))
+            break;
+        }
+
+      if (args[k].base_type == NUM_BASE_TYPES)
+        return rfun->decl;
+    }
+
+  return NULL_TREE;
+}

Plz change it into :

/* Silently check whether there is an instance of the function with the
   mode suffix given by MODE and the type suffixes given by TYPE0 and TYPE1.
   Return its function decl if so, otherwise return null.  */
tree
function_resolver::lookup_form (mode_suffix_index mode,
                                type_suffix_index type0,
                                type_suffix_index type1)
{
  type_suffix_pair types = { type0, type1 };
  function_instance instance (base_name, base, shape, mode, types, pred);
  registered_function *rfn
    = function_table->find_with_hash (instance, instance.hash ());
  return rfn ? rfn->decl : NULL_TREE;
}


juzhe.zh...@rivai.ai

From: Kito Cheng
Date: 2023-09-11 17:04
To: juzhe.zh...@rivai.ai
CC: pan2.li; 
gcc-patches; 
yanzhang.wang
Subject: Re: [PATCH v1] RISC-V: Implement RESOLVE_OVERLOADED_BUILTIN for RVV 
intrinsic
> @@ -545,7 +563,7 @@ struct move_def : public build_base
>  /* According to rvv-intrinsic-doc, it does not add "_m" suffix
> for vop_m C++ overloaded API.  */
> -if (overloaded_p && instance.pred == PRED_TYPE_m)
> +if (overloaded_p)

Just make sure it's the right change?

>return b.finish_name ();
>  b.append_name (predication_suffixes[instance.pred]);
>  return b.finish_name ();

RE: Re: [PATCH v1] RISC-V: Implement RESOLVE_OVERLOADED_BUILTIN for RVV intrinsic

2023-09-11 Thread Li, Pan2 via Gcc-patches

> -if (overloaded_p && instance.pred == PRED_TYPE_m)
> +if (overloaded_p)

Thanks for pointing this out. I misunderstood the policy functions, so this 
change is a mistake; I will send a V2 to fix it.

> Plz change it into :

Actually, it is not easy to convert to this approach because aarch64 has a 
different implementation of type information, such as type_suffix_info 
(aarch64 loops over the type suffixes to get the arglist type in 
infer_vector_or_tuple_type).

Thus, it is not easy to construct rvv_type_info, predication_type_index and 
rvv_op_info from the arglist, all of which function_instance requires at 
construction.



Pan

From: juzhe.zh...@rivai.ai 
Sent: Monday, September 11, 2023 5:13 PM
To: kito.cheng 
Cc: Li, Pan2 ; gcc-patches ; Wang, 
Yanzhang 
Subject: Re: Re: [PATCH v1] RISC-V: Implement RESOLVE_OVERLOADED_BUILTIN for 
RVV intrinsic

>> Just make sure it's the right change?
It seems incorrect to me.

More comments (I just reviewed again):


+tree
+function_resolver::lookup ()
+{
+  unsigned int code_limit = vec_safe_length (registered_functions);
+
+  for (unsigned code = get_sub_code () + 1; code < code_limit; code++)
+    {
+      registered_function *rfun = (*registered_functions)[code];
+      function_instance instance = rfun->instance;
+
+      if (strcmp (base_name, instance.base_name) != 0)
+        break;
+
+      if (rfun->overloaded_p)
+        continue;
+
+      unsigned k;
+      const rvv_arg_type_info *args = instance.op_info->args;
+
+      for (k = 0; args[k].base_type != NUM_BASE_TYPES; k++)
+        {
+          if (k >= m_arglist.length ())
+            break;
+
+          if (TYPE_MODE (instance.get_arg_type (k))
+              != TYPE_MODE (TREE_TYPE (m_arglist[k])))
+            break;
+        }
+
+      if (args[k].base_type == NUM_BASE_TYPES)
+        return rfun->decl;
+    }
+
+  return NULL_TREE;
+}

Plz change it into :

/* Silently check whether there is an instance of the function with the
   mode suffix given by MODE and the type suffixes given by TYPE0 and TYPE1.
   Return its function decl if so, otherwise return null.  */
tree
function_resolver::lookup_form (mode_suffix_index mode,
                                type_suffix_index type0,
                                type_suffix_index type1)
{
  type_suffix_pair types = { type0, type1 };
  function_instance instance (base_name, base, shape, mode, types, pred);
  registered_function *rfn
    = function_table->find_with_hash (instance, instance.hash ());
  return rfn ? rfn->decl : NULL_TREE;
}


juzhe.zh...@rivai.ai

From: Kito Cheng
Date: 2023-09-11 17:04
To: juzhe.zh...@rivai.ai
CC: pan2.li; 
gcc-patches; 
yanzhang.wang
Subject: Re: [PATCH v1] RISC-V: Implement RESOLVE_OVERLOADED_BUILTIN for RVV 
intrinsic
> @@ -545,7 +563,7 @@ struct move_def : public build_base
>  /* According to rvv-intrinsic-doc, it does not add "_m" suffix
> for vop_m C++ overloaded API.  */
> -if (overloaded_p && instance.pred == PRED_TYPE_m)
> +if (overloaded_p)

Just make sure it's the right change?

>return b.finish_name ();
>  b.append_name (predication_suffixes[instance.pred]);
>  return b.finish_name ();

RE: [PATCH] RISC-V: Remove redundant functions

2023-09-11 Thread Li, Pan2 via Gcc-patches

Committed, thanks Kito.

Pan

-Original Message-
From: Gcc-patches  On Behalf 
Of Kito Cheng via Gcc-patches
Sent: Monday, September 11, 2023 5:26 PM
To: Juzhe-Zhong 
Cc: gcc-patches@gcc.gnu.org; kito.ch...@gmail.com
Subject: Re: [PATCH] RISC-V: Remove redundant functions

LGTM

On Mon, Sep 11, 2023 at 5:20 PM Juzhe-Zhong  wrote:
>
> I just finished the V2 version of the LMUL cost model.
> It turns out we don't need these redundant functions.
>
> Remove them.
>
> gcc/ChangeLog:
>
> * config/riscv/riscv-protos.h (get_all_predecessors): Remove.
> (get_all_successors): Ditto.
> * config/riscv/riscv-v.cc (get_all_predecessors): Ditto.
> (get_all_successors): Ditto.
>
> ---
>  gcc/config/riscv/riscv-protos.h |  2 --
>  gcc/config/riscv/riscv-v.cc | 48 -
>  2 files changed, 50 deletions(-)
>
> diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
> index 46d77ef927c..e91a55ec057 100644
> --- a/gcc/config/riscv/riscv-protos.h
> +++ b/gcc/config/riscv/riscv-protos.h
> @@ -498,8 +498,6 @@ enum floating_point_rounding_mode get_frm_mode (rtx);
>  opt_machine_mode vectorize_related_mode (machine_mode, scalar_mode,
>  poly_uint64);
>  unsigned int autovectorize_vector_modes (vec *, bool);
> -hash_set<basic_block> get_all_predecessors (basic_block);
> -hash_set<basic_block> get_all_successors (basic_block);
>  bool cmp_lmul_le_one (machine_mode);
>  bool cmp_lmul_gt_one (machine_mode);
>  }
> diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
> index 3cd1f61de0e..4d95bd773a2 100644
> --- a/gcc/config/riscv/riscv-v.cc
> +++ b/gcc/config/riscv/riscv-v.cc
> @@ -3388,54 +3388,6 @@ expand_fold_extract_last (rtx *ops)
>emit_label (end_label);
>  }
>
> -hash_set<basic_block>
> -get_all_predecessors (basic_block bb)
> -{
> -  hash_set<basic_block> blocks;
> -  auto_vec<basic_block> work_list;
> -  hash_set<basic_block> visited_list;
> -  work_list.safe_push (bb);
> -
> -  while (!work_list.is_empty ())
> -{
> -  basic_block new_bb = work_list.pop ();
> -  visited_list.add (new_bb);
> -  edge e;
> -  edge_iterator ei;
> -  FOR_EACH_EDGE (e, ei, new_bb->preds)
> -   {
> - if (!visited_list.contains (e->src))
> -   work_list.safe_push (e->src);
> - blocks.add (e->src);
> -   }
> -}
> -  return blocks;
> -}
> -
> -hash_set<basic_block>
> -get_all_successors (basic_block bb)
> -{
> -  hash_set<basic_block> blocks;
> -  auto_vec<basic_block> work_list;
> -  hash_set<basic_block> visited_list;
> -  work_list.safe_push (bb);
> -
> -  while (!work_list.is_empty ())
> -{
> -  basic_block new_bb = work_list.pop ();
> -  visited_list.add (new_bb);
> -  edge e;
> -  edge_iterator ei;
> -  FOR_EACH_EDGE (e, ei, new_bb->succs)
> -   {
> - if (!visited_list.contains (e->dest))
> -   work_list.safe_push (e->dest);
> - blocks.add (e->dest);
> -   }
> -}
> -  return blocks;
> -}
> -
>  /* Return true if the LMUL of comparison less than or equal to one.  */
>  bool
>  cmp_lmul_le_one (machine_mode mode)
> --
> 2.36.3
>

RE: [PATCH v1] Mode-Switching: Add optional EMIT_AFTER hook

2023-09-11 Thread Li, Pan2 via Gcc-patches

Hi Jeff,

Kindly ping for the Patch V2 as below.

https://gcc.gnu.org/pipermail/gcc-patches/2023-August/628508.html

Pan

-Original Message-
From: Li, Pan2  
Sent: Friday, August 25, 2023 8:45 PM
To: Li, Pan2 ; Jeff Law ; 
gcc-patches@gcc.gnu.org
Cc: juzhe.zh...@rivai.ai; Wang, Yanzhang ; 
kito.ch...@gmail.com
Subject: RE: [PATCH v1] Mode-Switching: Add optional EMIT_AFTER hook

Hi Jeff,

> You might also peek at the RTL gcse/pre code which is also LCM based and 
> has the same class of problems.

I found a similar approach to take care of this in gcse.cc/pre_edge_insert with 
some comments as below.

  /* We can't insert anything on an abnormal and
   critical edge, so we insert the insn at the end of
   the previous block. There are several alternatives
   detailed in Morgans book P277 (sec 10.5) for
   handling this situation.  This one is easiest for
   now.  */

if (eg->flags & EDGE_ABNORMAL)
  insert_insn_end_basic_block (index_map[j], bb);
else
  {
    insn = process_insert_insn (index_map[j]);
    insert_insn_on_edge (insn, eg);
  }

From the comments, it looks like insert_insn_end_basic_block is designed to 
handle the ABNORMAL edge by inserting at the end of the previous block.
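
To make the decision concrete, here is a small standalone C++ sketch of the same idea. It is only an illustration: Block, Edge and insert_for_edge are hypothetical simplified types and names, not the GCC CFG API, and the real code works on RTL insns rather than strings.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified CFG types for illustration only.
struct Block {
  std::vector<std::string> insns;
};

struct Edge {
  Block *src;
  Block *dest;
  bool abnormal;                     // stands in for EDGE_ABNORMAL
  std::vector<std::string> pending;  // insns queued for insertion on the edge
};

// Place INSN for edge E.  Nothing can be inserted on an abnormal edge, so in
// that case the insn goes to the end of the source block instead.  Returns
// true if the insn was queued on the edge, false if it was appended to the
// source block.
bool insert_for_edge(Edge &e, const std::string &insn) {
  assert(e.src != nullptr);
  if (e.abnormal) {
    e.src->insns.push_back(insn);
    return false;
  }
  e.pending.push_back(insn);
  return true;
}
```

An optional EMIT_AFTER style hook would presumably need the same fallback for abnormal edges, but that is an assumption and not something the gcse code states.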

Pan

-Original Message-
From: Gcc-patches  On Behalf 
Of Li, Pan2 via Gcc-patches
Sent: Thursday, August 24, 2023 12:54 PM
To: Jeff Law ; gcc-patches@gcc.gnu.org
Cc: juzhe.zh...@rivai.ai; Wang, Yanzhang ; 
kito.ch...@gmail.com
Subject: RE: [PATCH v1] Mode-Switching: Add optional EMIT_AFTER hook

Thanks Jeff.

> That implies a save/restore pair around the call (possibly optimized so 
> that we minimize the number of save/restores).  I would have expected 
> x86 to already be doing this.  But maybe there's some ABI thing around 
> mmx vs x86 state that allows it to be avoided

Very similar to save/restore but optional.
If there is no static rounding mode intrinsic here, it is unnecessary to add a 
save/restore
pair around the call. I bet mode-switching takes care of this already.
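
As a rough user-level analogy of an optional save/restore pair, the sketch below uses the standard <cfenv> rounding mode rather than the RISC-V frm register. RoundingModeGuard, callee_that_clobbers_mode and call_with_optional_save are hypothetical names made up for this example; this is not how the mode-switching pass is implemented.

```cpp
#include <cassert>
#include <cfenv>

// Saves the current floating-point rounding mode on construction and
// restores it on destruction.
class RoundingModeGuard {
 public:
  RoundingModeGuard() : saved_(std::fegetround()) { assert(saved_ >= 0); }
  ~RoundingModeGuard() { std::fesetround(saved_); }
  RoundingModeGuard(const RoundingModeGuard &) = delete;
  RoundingModeGuard &operator=(const RoundingModeGuard &) = delete;

 private:
  int saved_;
};

// Hypothetical callee that changes the rounding mode and never restores it.
void callee_that_clobbers_mode() { std::fesetround(FE_UPWARD); }

// Call the callee, wrapping it in a save/restore pair only when the caller
// actually depends on the rounding mode.  Returns the rounding mode that is
// in effect after the call.
int call_with_optional_save(bool caller_depends_on_mode) {
  if (caller_depends_on_mode) {
    RoundingModeGuard guard;  // save here, restore at the end of this scope
    callee_that_clobbers_mode();
  } else {
    callee_that_clobbers_mode();  // no save/restore needed
  }
  return std::fegetround();
}
```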

Pan

-Original Message-
From: Jeff Law  
Sent: Thursday, August 24, 2023 7:27 AM
To: Li, Pan2 ; gcc-patches@gcc.gnu.org
Cc: juzhe.zh...@rivai.ai; Wang, Yanzhang ; 
kito.ch...@gmail.com
Subject: Re: [PATCH v1] Mode-Switching: Add optional EMIT_AFTER hook



On 8/23/23 08:54, Li, Pan2 wrote:
> Thanks Jeff for comments.
> 
>> Understood.  So the natural question is why does x86/sh not need this
>> for its mode switching?   Don't all the same issues exist on those
>> targets as well?
> 
> AFAIK, it comes from the different design principles of the risc-v and 
> x86/arm intrinsic APIs.
> The risc-v rvv FP rounding mode intrinsic API sits one abstraction level 
> above the insn itself, while
> the x86/arm one only indicates the semantics of the insn.
> 
> For example, if a vector instruction such as VFADD doesn't have a static 
> rounding mode (aka rm encoded in the insn),
> x86/arm has no intrinsic API that takes a rounding mode argument, 
> while the risc-v fp
> vector intrinsic will always have a static rounding mode API if the frm is 
> honored.
> 
> In short, the risc-v intrinsic API is closer to the end-user, while the 
> x86/arm intrinsic API is closer to the insn itself.
OK, but I'm still struggling to see how the distinction is important 
here.  Ultimately there's a state at a call site.  We need to make sure 
that state from the current function doesn't impact the callee and we 
need to make sure that the callee doesn't impact the state in the caller.

That implies a save/restore pair around the call (possibly optimized so 
that we minimize the number of save/restores).  I would have expected 
x86 to already be doing this.  But maybe there's some ABI thing around 
mmx vs x86 state that allows it to be avoided

> 
> For the rest part, will have a try based on your suggestion soon as I am in 
> the middle of something.
No problem.  Get to it when you can.  I think it affects you more than 
me :-)

jeff

RE: [PATCH] RISC-V: Expand fixed-vlmax/vls vector permutation in targethook

2023-09-10 Thread Li, Pan2 via Gcc-patches

Committed, thanks Jeff.

Pan

-Original Message-
From: Gcc-patches  On Behalf 
Of Jeff Law via Gcc-patches
Sent: Sunday, September 10, 2023 9:38 PM
To: Juzhe-Zhong ; gcc-patches@gcc.gnu.org
Cc: kito.ch...@sifive.com; kito.ch...@gmail.com
Subject: Re: [PATCH] RISC-V: Expand fixed-vlmax/vls vector permutation in 
targethook



On 9/9/23 20:33, Juzhe-Zhong wrote:
> While debugging FAIL: gcc.dg/pr92301.c execution test, I realized that 
> a vls vector permutation failed to vectorize because of this early 
> return false:
> 
> -  /* For constant size indices, we dont't need to handle it here.
> - Just leave it to vec_perm.  */
> -  if (d->perm.length ().is_constant ())
> -return false;
> 
> To avoid more potential failed vectorization cases, expand it in the 
> targethook now.
> 
> gcc/ChangeLog:
> 
>   * config/riscv/riscv-v.cc (shuffle_generic_patterns): Expand 
> fixed-vlmax/vls vector permutation.
OK.
jeff

RE: [PATCH V2] RISC-V: Avoid unnecessary slideup in compress pattern of vec_perm

2023-09-10 Thread Li, Pan2 via Gcc-patches

Committed, thanks Jeff.

Pan

-Original Message-
From: Gcc-patches  On Behalf 
Of Jeff Law via Gcc-patches
Sent: Sunday, September 10, 2023 11:25 PM
To: Juzhe-Zhong ; gcc-patches@gcc.gnu.org
Cc: kito.ch...@sifive.com; kito.ch...@gmail.com
Subject: Re: [PATCH V2] RISC-V: Avoid unnecessary slideup in compress pattern 
of vec_perm



On 9/10/23 08:07, Juzhe-Zhong wrote:
> gcc/ChangeLog:
> 
>   * config/riscv/riscv-v.cc (shuffle_compress_patterns): Avoid 
> unnecessary slideup.
OK
jeff

RE: [PATCH] RISC-V: Fix dump FILE of VSETVL PASS[PR111311]

2023-09-09 Thread Li, Pan2 via Gcc-patches

Committed, thanks Kito.

Pan

-Original Message-
From: Gcc-patches  On Behalf 
Of Kito Cheng via Gcc-patches
Sent: Sunday, September 10, 2023 9:22 AM
To: Juzhe-Zhong 
Cc: GCC Patches ; Kito Cheng 
Subject: Re: [PATCH] RISC-V: Fix dump FILE of VSETVL PASS[PR111311]

LGTM

Juzhe-Zhong  於 2023年9月10日 週日 07:58 寫道：

> To keep the dump FILE from getting too big, add TDF_DETAILS.
>
> This patch fixes the following FAILs in
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111311
>
> FAIL: gcc.c-torture/unsorted/dump-noaddr.c.*r.vsetvl,  -O3
> -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer
> -finline-functions  comparison
> FAIL: gcc.c-torture/unsorted/dump-noaddr.c.*r.vsetvl,  -O3 -g  comparison
>
> gcc/ChangeLog:
>
> * config/riscv/riscv-vsetvl.cc (pass_vsetvl::vsetvl_fusion): Add
> TDF_DETAILS.
> (pass_vsetvl::pre_vsetvl): Ditto.
> (pass_vsetvl::init): Ditto.
> (pass_vsetvl::lazy_vsetvl): Ditto.
>
> ---
>  gcc/config/riscv/riscv-vsetvl.cc | 10 +-
>  1 file changed, 5 insertions(+), 5 deletions(-)
>
> diff --git a/gcc/config/riscv/riscv-vsetvl.cc
> b/gcc/config/riscv/riscv-vsetvl.cc
> index ae362a3f6a8..134b97737ae 100644
> --- a/gcc/config/riscv/riscv-vsetvl.cc
> +++ b/gcc/config/riscv/riscv-vsetvl.cc
> @@ -3438,7 +3438,7 @@ pass_vsetvl::vsetvl_fusion (void)
> m_vector_manager->vector_kill,
> m_vector_manager->vector_earliest);
>changed_p |= earliest_fusion ();
> -  if (dump_file)
> +  if (dump_file && (dump_flags & TDF_DETAILS))
> {
>   fprintf (dump_file, "\nEARLIEST fusion %d\n", fusion_no);
>   m_vector_manager->dump (dump_file);
> @@ -3720,7 +3720,7 @@ pass_vsetvl::pre_vsetvl (void)
>
>/* We should dump the information before CFG is changed. Otherwise it
> will
>   produce ICE (internal compiler error).  */
> -  if (dump_file)
> +  if (dump_file && (dump_flags & TDF_DETAILS))
>  m_vector_manager->dump (dump_file);
>
>refine_vsetvls ();
> @@ -4250,7 +4250,7 @@ pass_vsetvl::init (void)
>m_vector_manager = new vector_infos_manager ();
>compute_probabilities ();
>
> -  if (dump_file)
> +  if (dump_file && (dump_flags & TDF_DETAILS))
>  {
>fprintf (dump_file, "\nPrologue: Initialize vector infos\n");
>m_vector_manager->dump (dump_file);
> @@ -4334,7 +4334,7 @@ pass_vsetvl::lazy_vsetvl (void)
>  fprintf (dump_file, "\nPhase 1: Compute local backward vector
> infos\n");
>for (const bb_info *bb : crtl->ssa->bbs ())
>  compute_local_backward_infos (bb);
> -  if (dump_file)
> +  if (dump_file && (dump_flags & TDF_DETAILS))
>  m_vector_manager->dump (dump_file);
>
>/* Phase 2 - Emit vsetvl instructions within each basic block according
> to
> @@ -4344,7 +4344,7 @@ pass_vsetvl::lazy_vsetvl (void)
>  "\nPhase 2: Emit vsetvl instruction within each block\n");
>for (const bb_info *bb : crtl->ssa->bbs ())
>  emit_local_forward_vsetvls (bb);
> -  if (dump_file)
> +  if (dump_file && (dump_flags & TDF_DETAILS))
>  m_vector_manager->dump (dump_file);
>
>/* Phase 3 - Propagate demanded info across blocks.  */
> --
> 2.36.3
>
>
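
The gating pattern used throughout the patch above can be shown in isolation. In this minimal sketch, DUMP_DETAILS and dump_phase are hypothetical names standing in for TDF_DETAILS and the pass's dump code; only the shape of the condition is taken from the patch.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical flag bit; stands in for GCC's TDF_DETAILS.
enum : unsigned { DUMP_DETAILS = 1u << 0 };

// Write a one-line summary whenever a dump file is open, but write the
// verbose details only when the details flag was requested as well.
// Returns the number of lines written.
int dump_phase(std::FILE *dump_file, unsigned dump_flags,
               const char *summary, const char *details) {
  assert(summary != nullptr && details != nullptr);
  int lines = 0;
  if (dump_file) {
    std::fprintf(dump_file, "%s\n", summary);
    ++lines;
  }
  if (dump_file && (dump_flags & DUMP_DETAILS)) {
    std::fprintf(dump_file, "%s\n", details);
    ++lines;
  }
  return lines;
}
```

Without the flag the dump stays small, and the large per-block information only appears when details are asked for.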

RE: [PATCH] RISC-V: Suppress bogus warning for VLS types

2023-09-08 Thread Li, Pan2 via Gcc-patches

Committed, thanks Kito.

Pan

-Original Message-
From: Gcc-patches  On Behalf 
Of Kito Cheng via Gcc-patches
Sent: Friday, September 8, 2023 4:27 PM
To: Juzhe-Zhong 
Cc: GCC Patches ; Kito Cheng 
Subject: Re: [PATCH] RISC-V: Suppress bogus warning for VLS types

LGTM

Juzhe-Zhong  於 2023年9月8日 週五 16:20 寫道：

> This patch fixes 100+ bogus FAILs due to the experimental vector ABI
> warning.
>
> gcc/ChangeLog:
>
> * config/riscv/riscv.cc (riscv_pass_in_vector_p): Only allow RVV
> type.
>
> ---
>  gcc/config/riscv/riscv.cc | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
> index 9f0c8bbe9ed..81682d95ba4 100644
> --- a/gcc/config/riscv/riscv.cc
> +++ b/gcc/config/riscv/riscv.cc
> @@ -4414,7 +4414,7 @@ riscv_pass_in_vector_p (const_tree type)
>  {
>static int warned = 0;
>
> -  if (type && riscv_v_ext_mode_p (TYPE_MODE (type)) && !warned)
> +  if (type && riscv_vector::lookup_vector_type_attribute (type) &&
> !warned)
>  {
>warning (OPT_Wpsabi,
>"ABI for the vector type is currently in experimental stage
> and "
> --
> 2.36.3
>
>

RE: [PATCH] RISC-V: Fix incorrect nregs calculation for VLS modes

2023-09-08 Thread Li, Pan2 via Gcc-patches

Committed, thanks Kito.

Pan

-Original Message-
From: Gcc-patches  On Behalf 
Of Kito Cheng via Gcc-patches
Sent: Friday, September 8, 2023 4:12 PM
To: Juzhe-Zhong 
Cc: GCC Patches ; Kito Cheng 
Subject: Re: [PATCH] RISC-V: Fix incorrect nregs calculation for VLS modes

LGTM

Juzhe-Zhong  於 2023年9月8日 週五 15:52 寫道：

> This patch fixes an obvious bug: TARGET_MIN_VLEN is a bit size, not a byte size.
>
> All these following bugs are fixed with this patch:
> FAIL: gcc.target/riscv/zero-scratch-regs-3.c   -O0  (internal compiler
> error: in gen_reg_rtx, at emit-rtl.cc:1176)
> FAIL: gcc.target/riscv/zero-scratch-regs-3.c   -O0  (test for excess
> errors)
> FAIL: gcc.target/riscv/zero-scratch-regs-3.c   -O1  (internal compiler
> error: in gen_reg_rtx, at emit-rtl.cc:1176)
> FAIL: gcc.target/riscv/zero-scratch-regs-3.c   -O1  (test for excess
> errors)
> FAIL: gcc.target/riscv/zero-scratch-regs-3.c   -O2  (internal compiler
> error: in gen_reg_rtx, at emit-rtl.cc:1176)
> FAIL: gcc.target/riscv/zero-scratch-regs-3.c   -O2  (test for excess
> errors)
> FAIL: gcc.target/riscv/zero-scratch-regs-3.c   -O2 -flto
> -fno-use-linker-plugin -flto-partition=none  (internal compiler error: in
> gen_reg_rtx, at emit-rtl.cc:1176)
> FAIL: gcc.target/riscv/zero-scratch-regs-3.c   -O2 -flto
> -fno-use-linker-plugin -flto-partition=none  (test for excess errors)
> FAIL: gcc.target/riscv/zero-scratch-regs-3.c   -O2 -flto
> -fuse-linker-plugin -fno-fat-lto-objects  (internal compiler error: in
> gen_reg_rtx, at emit-rtl.cc:1176)
> FAIL: gcc.target/riscv/zero-scratch-regs-3.c   -O2 -flto
> -fuse-linker-plugin -fno-fat-lto-objects  (test for excess errors)
> FAIL: gcc.target/riscv/zero-scratch-regs-3.c   -O3 -g  (internal compiler
> error: in gen_reg_rtx, at emit-rtl.cc:1176)
> FAIL: gcc.target/riscv/zero-scratch-regs-3.c   -O3 -g  (test for excess
> errors)
> FAIL: gcc.target/riscv/zero-scratch-regs-3.c   -Os  (internal compiler
> error: in gen_reg_rtx, at emit-rtl.cc:1176)
> FAIL: gcc.target/riscv/zero-scratch-regs-3.c   -Os  (test for excess
> errors)
> FAIL: gcc.target/riscv/rvv/base/mov-13.c (internal compiler error: in
> partial_subreg_p, at rtl.h:3186)
> FAIL: gcc.target/riscv/rvv/base/mov-13.c (test for excess errors)
> FAIL: gcc.target/riscv/rvv/base/spill-1.c (internal compiler error: in
> partial_subreg_p, at rtl.h:3186)
> FAIL: gcc.target/riscv/rvv/base/spill-1.c (test for excess errors)
> FAIL: gcc.target/riscv/rvv/base/spill-2.c (internal compiler error: in
> partial_subreg_p, at rtl.h:3186)
> FAIL: gcc.target/riscv/rvv/base/spill-2.c (test for excess errors)
> FAIL: gcc.target/riscv/rvv/base/spill-3.c (internal compiler error: in
> partial_subreg_p, at rtl.h:3186)
> FAIL: gcc.target/riscv/rvv/base/spill-3.c (test for excess errors)
> FAIL: gcc.target/riscv/rvv/base/spill-4.c (internal compiler error: in
> partial_subreg_p, at rtl.h:3186)
> FAIL: gcc.target/riscv/rvv/base/spill-4.c (test for excess errors)
> FAIL: gcc.target/riscv/rvv/base/spill-5.c (internal compiler error: in
> partial_subreg_p, at rtl.h:3186)
> FAIL: gcc.target/riscv/rvv/base/spill-5.c (test for excess errors)
> FAIL: gcc.target/riscv/rvv/base/spill-6.c (internal compiler error: in
> partial_subreg_p, at rtl.h:3186)
> FAIL: gcc.target/riscv/rvv/base/spill-6.c (test for excess errors)
> FAIL: gcc.target/riscv/rvv/base/spill-sp-adjust.c (internal compiler
> error: in partial_subreg_p, at rtl.h:3186)
> FAIL: gcc.target/riscv/rvv/base/spill-sp-adjust.c (test for excess errors)
>
> gcc/ChangeLog:
>
> * config/riscv/riscv.cc (riscv_hard_regno_nregs): Fix bug.
>
> ---
>  gcc/config/riscv/riscv.cc | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
> index c0c9c990a23..9f0c8bbe9ed 100644
> --- a/gcc/config/riscv/riscv.cc
> +++ b/gcc/config/riscv/riscv.cc
> @@ -7548,7 +7548,7 @@ riscv_hard_regno_nregs (unsigned int regno,
> machine_mode mode)
>/* For VLS modes, we allocate registers according to TARGET_MIN_VLEN.
> */
>if (riscv_v_ext_vls_mode_p (mode))
>  {
> -  int size = GET_MODE_SIZE (mode).to_constant ();
> +  int size = GET_MODE_BITSIZE (mode).to_constant ();
>if (size < TARGET_MIN_VLEN)
> return 1;
>else
> --
> 2.36.3
>
>
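
A note on the hunk above: TARGET_MIN_VLEN is a bit count, so comparing it against a byte size makes large VLS modes look as if they fit in one register. The following is a minimal sketch in plain C, not the GCC code. MIN_VLEN_BITS is a hypothetical stand-in for TARGET_MIN_VLEN, the function names are illustrative, and the division in the else branch is an assumption, since the hunk only shows the comparison.

```c
#include <assert.h>

/* Hypothetical stand-in for TARGET_MIN_VLEN, which is expressed in bits. */
enum { MIN_VLEN_BITS = 128 };

/* Registers needed for a fixed-length (VLS) mode, comparing bits with bits,
   as the patched code does via GET_MODE_BITSIZE.  The division below is an
   assumption: the quoted hunk does not show the else branch. */
static int vls_nregs(int mode_size_bytes)
{
    assert(mode_size_bytes > 0);
    int size_bits = mode_size_bytes * 8;
    if (size_bits < MIN_VLEN_BITS)
        return 1;
    return size_bits / MIN_VLEN_BITS;
}

/* Pre-patch behaviour: a byte count is compared against a bit count. */
static int vls_nregs_buggy(int mode_size_bytes)
{
    assert(mode_size_bytes > 0);
    if (mode_size_bytes < MIN_VLEN_BITS)
        return 1;
    return mode_size_bytes / MIN_VLEN_BITS;
}
```

With these assumptions, a 64-byte (512-bit) mode needs four 128-bit registers, while the byte-based comparison reports one.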

RE: [PATCH] RISC-V: Remove incorrect earliest vsetvl post optimization[PR111313]

2023-09-06 Thread Li, Pan2 via Gcc-patches

Committed, thanks Kito.

Pan

-----Original Message-----
From: Gcc-patches  On Behalf 
Of Kito Cheng via Gcc-patches
Sent: Thursday, September 7, 2023 11:39 AM
To: Juzhe-Zhong 
Cc: GCC Patches ; Kito Cheng 
Subject: Re: [PATCH] RISC-V: Remove incorrect earliest vsetvl post 
optimization[PR111313]

LGTM

Juzhe-Zhong wrote on Thursday, September 7, 2023, at 11:36:

> This patch removes the incorrect earliest post vsetvl optimization.
> The bug was found in vect-double-reduc-5.c, which fails at runtime
> (execution fail), and also in PR111313.
>
> For VLMAX intrinsics, we always emit a bogus pattern, vlmax_avl
> (see vector.md), to
> occupy a scalar register that is used by the following RVV instruction
> whose AVL is VLMAX.
>
> For O2, O3 and Ofast, earliest LCM works well.
> However, for O1, vlmax_avl is not well optimized by the earlier passes,
> which confuses earliest LCM,
> so we end up with some redundant vsetvli zero,zero instructions
> in O1. (Note that O2, O3 and Ofast are all fine.)
>
> To elide those redundant vsetvli zero,zero instructions, I had added
> cleanup_earliest_vsetvls.
>
> Now, after reviewing the implementation of this post optimization again, I
> found that it is incorrect and that it is hard to
> do post optimizations for vsetvls that earliest LCM failed to
> eliminate.
>
> Besides, such performance issues only happen in O1 or O0, so they may
> not be serious.
> So remove it; we may find another way (e.g. adjusting the vlmax_avl
> pattern COST)
> to optimize it if we really need to care about performance for O1.
>
> PR target/111313
>
> gcc/ChangeLog:
>
> * config/riscv/riscv-vsetvl.cc
> (pass_vsetvl::cleanup_earliest_vsetvls): Remove.
> (pass_vsetvl::df_post_optimization): Remove incorrect function.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/rvv/vsetvl/avl_single-13.c: Adapt test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-17.c: Skip check for
> O1.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-18.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-19.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-20.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_phi-1.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_phi-10.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_phi-11.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_phi-12.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_phi-13.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_phi-14.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_phi-15.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_phi-16.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_phi-17.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_phi-18.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_phi-19.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_phi-2.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_phi-20.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_phi-21.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_phi-22.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_phi-23.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_phi-24.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_phi-25.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_phi-26.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_phi-27.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_phi-28.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_phi-3.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_phi-4.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_phi-5.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_phi-6.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_phi-7.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_phi-8.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_phi-9.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_switch_vtype-2.c: Ditto.
> * gcc.target/riscv/rvv/autovec/pr111313.c: New test.
>
> ---
>  gcc/config/riscv/riscv-vsetvl.cc  | 58 ---
>  .../gcc.target/riscv/rvv/autovec/pr111313.c   | 20 +++
>  .../riscv/rvv/vsetvl/avl_single-13.c  |  2 +-
>  .../riscv/rvv/vsetvl/vlmax_bb_prop-17.c   |  8 +--
>  .../riscv/rvv/vsetvl/vlmax_bb_prop-18.c   |  8 +--
>  .../riscv/rvv/vsetvl/vlmax_bb_prop-19.c   |  2 +-
>  .../riscv/rvv/vsetvl/vlmax_bb_prop-20.c   |  4 +-
>  .../gcc.target/riscv/rvv/vsetvl/vlmax_phi-1.c |  2 +-
>  .../riscv/rvv/vsetvl/vlmax_phi-10.c   |  2 +-
>  .../riscv/rvv/vsetvl/vlmax_phi-11.c   |  2 +-
>  .../riscv/rvv/vsetvl/vlmax_phi-12.c   |  2 +-
>  .../riscv/rvv/vsetvl/vlmax_phi-13.c   |  2 +-
>  .../riscv/rvv/vsetvl/vlmax_phi-14.c   |  2 +-
>  .../riscv/rvv/vsetvl/vlmax_phi-15.c   |  2 +-
>  .../riscv/rvv/vsetvl/vlmax_phi-16.c   |  2 +-
>  .../riscv/rvv/vsetvl/vlmax_phi-17.c   |  2 +-
>

RE: [PATCH] RISC-V: Remove unreasonable TARGET_64BIT for VLS modes with size = 64bit

2023-09-06 Thread Li, Pan2 via Gcc-patches

Committed, thanks Robin.

Pan

-----Original Message-----
From: Gcc-patches  On Behalf 
Of Robin Dapp via Gcc-patches
Sent: Wednesday, September 6, 2023 9:39 PM
To: Juzhe-Zhong ; gcc-patches@gcc.gnu.org
Cc: kito.ch...@sifive.com; kito.ch...@gmail.com
Subject: Re: [PATCH] RISC-V: Remove unreasonable TARGET_64BIT for VLS modes 
with size = 64bit

LGTM.

Regards
 Robin

RE: [PATCH] RISC-V: Fix VSETVL PASS AVL/VL fetch bug[111295]

2023-09-06 Thread Li, Pan2 via Gcc-patches

Committed, thanks Robin.

Pan

-----Original Message-----
From: Gcc-patches  On Behalf 
Of Robin Dapp via Gcc-patches
Sent: Wednesday, September 6, 2023 9:38 PM
To: Juzhe-Zhong ; gcc-patches@gcc.gnu.org
Cc: kito.ch...@sifive.com; kito.ch...@gmail.com
Subject: Re: [PATCH] RISC-V: Fix VSETVL PASS AVL/VL fetch bug[111295]

OK.

Regards
 Robin

RE: [PATCH v1] RISC-V: Fix incorrect folder for VRGATHERI16 test case

2023-09-06 Thread Li, Pan2 via Gcc-patches

Committed, thanks Juzhe and sorry for my silly mistake.

Pan

From: juzhe.zhong 
Sent: Wednesday, September 6, 2023 8:53 PM
To: Li, Pan2 
Cc: gcc-patches@gcc.gnu.org; Li, Pan2 ; Wang, Yanzhang 
; kito.ch...@gmail.com
Subject: Re: [PATCH v1] RISC-V: Fix incorrect folder for VRGATHERI16 test case

lgtm
---- Replied Message ----
From: pan2...@intel.com
Date: 09/06/2023 20:52
To: gcc-patches@gcc.gnu.org
Cc: juzhe.zh...@rivai.ai, pan2...@intel.com, yanzhang.w...@intel.com, kito.ch...@gmail.com
Subject: [PATCH v1] RISC-V: Fix incorrect folder for VRGATHERI16 test case

RE: [PATCH v1] RISC-V: Support FP SGNJ autovec for VLS mode

2023-09-05 Thread Li, Pan2 via Gcc-patches

Committed, thanks Juzhe.

Pan

From: juzhe.zh...@rivai.ai 
Sent: Tuesday, September 5, 2023 7:14 PM
To: Li, Pan2 ; gcc-patches 
Cc: Li, Pan2 ; Wang, Yanzhang ; 
kito.cheng 
Subject: Re: [PATCH v1] RISC-V: Support FP SGNJ autovec for VLS mode

LGTM


juzhe.zh...@rivai.ai

From: pan2.li
Date: 2023-09-05 18:32
To: gcc-patches
CC: juzhe.zhong; 
pan2.li; 
yanzhang.wang; 
kito.cheng
Subject: [PATCH v1] RISC-V: Support FP SGNJ autovec for VLS mode
From: Pan Li <pan2...@intel.com>

This patch would like to allow the VLS mode autovec for the
floating-point sign-injection operation SGNJ (copysign).

Given below code example:

void test(float * restrict out, float * restrict in1, float * restrict in2)
{
  for (int i = 0; i < 128; i++)
    out[i] = __builtin_copysignf (in1[i], in2[i]);
}

Before this patch:
test:
  csrr    a4,vlenb
  slli    a4,a4,1
  li      a5,128
  bleu    a5,a4,.L2
  mv      a5,a4
.L2:
  vsetvli zero,a5,e32,m8,ta,ma
  vle32.v v8,0(a1)
  vle32.v v16,0(a2)
  vsetvli a4,zero,e32,m8,ta,ma
  vfsgnj.vv   v8,v8,v16
  vsetvli zero,a5,e32,m8,ta,ma
  vse32.v v8,0(a0)
  ret

After this patch:
test:
  li  a5,128
  vsetvli zero,a5,e32,m1,ta,ma
  vle32.v v1,0(a1)
  vle32.v v2,0(a2)
  vfsgnj.vv   v1,v1,v2
  vse32.v v1,0(a0)
  ret
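
For reference, what each vfsgnj.vv lane computes can be written as a scalar model: the result takes its magnitude from the first operand and its sign bit from the second. This is only a sketch under the assumption that float is IEEE 754 binary32; sgnj_ref and copysign_ref are illustrative names, not part of the patch.

```c
#include <stdint.h>
#include <string.h>

/* One lane of sign injection: magnitude bits of a, sign bit of b.
   Assumes float is IEEE 754 binary32 (sign in bit 31). */
static float sgnj_ref(float a, float b)
{
    uint32_t ua, ub, ur;
    float out;

    memcpy(&ua, &a, sizeof ua);
    memcpy(&ub, &b, sizeof ub);
    ur = (ua & 0x7fffffffu) | (ub & 0x80000000u);
    memcpy(&out, &ur, sizeof out);
    return out;
}

/* Element-wise reference for the loop in the example above. */
static void copysign_ref(float *restrict out, const float *restrict in1,
                         const float *restrict in2, int n)
{
    for (int i = 0; i < n; i++)
        out[i] = sgnj_ref(in1[i], in2[i]);
}
```

Such a model can be compared against the vectorized loop output in a run test.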

Signed-off-by: Pan Li <pan2...@intel.com>

gcc/ChangeLog:

* config/riscv/autovec-vls.md (copysign3): New pattern.
* config/riscv/vector.md: Extend iterator for VLS.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vls/def.h: New macro.
* gcc.target/riscv/rvv/autovec/vls/floating-point-sgnj-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/floating-point-sgnj-2.c: New test.
---
gcc/config/riscv/autovec-vls.md   | 22 ++
gcc/config/riscv/vector.md| 24 +--
.../gcc.target/riscv/rvv/autovec/vls/def.h|  8 
.../rvv/autovec/vls/floating-point-sgnj-1.c   | 43 +++
.../rvv/autovec/vls/floating-point-sgnj-2.c   | 43 +++
5 files changed, 128 insertions(+), 12 deletions(-)
create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/floating-point-sgnj-1.c
create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/floating-point-sgnj-2.c

diff --git a/gcc/config/riscv/autovec-vls.md b/gcc/config/riscv/autovec-vls.md
index 7ef29637e33..31b6c4ae714 100644
--- a/gcc/config/riscv/autovec-vls.md
+++ b/gcc/config/riscv/autovec-vls.md
@@ -255,6 +255,28 @@ (define_insn_and_split "3"
[(set_attr "type" "vector")]
)
+;; -
+;; Includes:
+;; - vfsgnj.vv
+;; - vfsgnj.vf
+;; -
+(define_insn_and_split "copysign3"
+  [(set (match_operand:VLSF 0 "register_operand")
+(unspec:VLSF
+  [(match_operand:VLSF  1 "register_operand")
+   (match_operand:VLSF  2 "register_operand")] UNSPEC_VCOPYSIGN))]
+  "TARGET_VECTOR && can_create_pseudo_p ()"
+  "#"
+  "&& 1"
+  [(const_int 0)]
+  {
+riscv_vector::emit_vlmax_insn (code_for_pred (UNSPEC_VCOPYSIGN, 
mode),
+riscv_vector::BINARY_OP, operands);
+DONE;
+  }
+  [(set_attr "type" "vector")]
+)
+
;; 
---
;;  [INT] Unary operations
;; 
---
diff --git a/gcc/config/riscv/vector.md b/gcc/config/riscv/vector.md
index 9d7b4bbe1d4..fc985ff6a01 100644
--- a/gcc/config/riscv/vector.md
+++ b/gcc/config/riscv/vector.md
@@ -6166,8 +6166,8 @@ (define_insn "@pred__reverse_scalar"
(symbol_ref "riscv_vector::get_frm_mode (operands[9])"))])
(define_insn "@pred_"
-  [(set (match_operand:VF 0 "register_operand"   "=vd, vd, vr, vr")
- (if_then_else:VF
+  [(set (match_operand:V_VLSF 0 "register_operand"   "=vd, vd, vr, vr")
+ (if_then_else:V_VLSF
  (unspec:
[(match_operand: 1 "vector_mask_operand" " vm, vm,Wc1,Wc1")
 (match_operand 5 "vector_length_operand"" rK, rK, rK, rK")
@@ -6176,10 +6176,10 @@ (define_insn "@pred_"
 (match_operand 8 "const_int_operand""  i,  i,  i,  i")
 (reg:SI VL_REGNUM)
 (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
-   (unspec:VF
- [(match_operand:VF 3 "register_operand"   " vr, vr, vr, vr")
-  (match_operand:VF 4 "register_operand"   " vr, vr, vr, vr")] 
VCOPYSIGNS)
-   (match_operand:VF 2 "vector_merge_operand" " vu,  0, vu,  0")))]
+   (unspec:V_VLSF
+ [(match_operand:V_VLSF 3 "register_operand"  " vr, vr, vr, vr")
+  (match_operand:V_VLSF 4 "register_operand"  " vr, vr, vr, vr")] 
VCOPYSIGNS)
+   (match_operand:V_VLSF 2 "vector_merge_operand" " vu,  0, vu,  0")))]
   "TARGET_VECTOR"
   "vfsgnj.vv\t%0,%3,%4%p1"
   [(set_attr "type"

RE: [PATCH v1] RISC-V: Support FP16 for RVV VRGATHEREI16 intrinsic

2023-09-04 Thread Li, Pan2 via Gcc-patches

Committed, thanks Kito.

Pan

-----Original Message-----
From: Kito Cheng  
Sent: Monday, September 4, 2023 3:29 PM
To: Li, Pan2 
Cc: gcc-patches@gcc.gnu.org; Wang, Yanzhang ; 
juzhe.zh...@rivai.ai
Subject: Re: [PATCH v1] RISC-V: Support FP16 for RVV VRGATHEREI16 intrinsic

LGTM

On Mon, Sep 4, 2023 at 3:18 PM Pan Li via Gcc-patches
 wrote:
>
> From: Pan Li 
>
> This patch would like to add FP16 support for the VRGATHEREI16
> intrinsic. Aka:
>
> * __riscv_vrgatherei16_vv_f16mf4
> * __riscv_vrgatherei16_vv_f16mf4_m
>
> As well as f16mf2 to f16m8 types.
>
> Signed-off-by: Pan Li 
>
> gcc/ChangeLog:
>
> * config/riscv/riscv-vector-builtins-types.def
> (vfloat16mf4_t): Add FP16 intrinsic def.
> (vfloat16mf2_t): Ditto.
> (vfloat16m1_t): Ditto.
> (vfloat16m2_t): Ditto.
> (vfloat16m4_t): Ditto.
> (vfloat16m8_t): Ditto.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/rvv/intrisinc-vrgatherei16.c: New test.
> ---
>  .../riscv/riscv-vector-builtins-types.def |  9 ++
>  .../riscv/rvv/intrisinc-vrgatherei16.c| 28 +++
>  2 files changed, 37 insertions(+)
>  create mode 100644 
> gcc/testsuite/gcc.target/riscv/rvv/intrisinc-vrgatherei16.c
>
> diff --git a/gcc/config/riscv/riscv-vector-builtins-types.def 
> b/gcc/config/riscv/riscv-vector-builtins-types.def
> index 1c3cc0eb222..6aa45ae9a7e 100644
> --- a/gcc/config/riscv/riscv-vector-builtins-types.def
> +++ b/gcc/config/riscv/riscv-vector-builtins-types.def
> @@ -689,11 +689,20 @@ DEF_RVV_EI16_OPS (vuint64m1_t, RVV_REQUIRE_ELEN_64)
>  DEF_RVV_EI16_OPS (vuint64m2_t, RVV_REQUIRE_ELEN_64)
>  DEF_RVV_EI16_OPS (vuint64m4_t, RVV_REQUIRE_ELEN_64)
>  DEF_RVV_EI16_OPS (vuint64m8_t, RVV_REQUIRE_ELEN_64)
> +
> +DEF_RVV_EI16_OPS (vfloat16mf4_t, RVV_REQUIRE_ELEN_FP_16 | 
> RVV_REQUIRE_MIN_VLEN_64)
> +DEF_RVV_EI16_OPS (vfloat16mf2_t, RVV_REQUIRE_ELEN_FP_16)
> +DEF_RVV_EI16_OPS (vfloat16m1_t, RVV_REQUIRE_ELEN_FP_16)
> +DEF_RVV_EI16_OPS (vfloat16m2_t, RVV_REQUIRE_ELEN_FP_16)
> +DEF_RVV_EI16_OPS (vfloat16m4_t, RVV_REQUIRE_ELEN_FP_16)
> +DEF_RVV_EI16_OPS (vfloat16m8_t, RVV_REQUIRE_ELEN_FP_16)
> +
>  DEF_RVV_EI16_OPS (vfloat32mf2_t, RVV_REQUIRE_ELEN_FP_32 | 
> RVV_REQUIRE_MIN_VLEN_64)
>  DEF_RVV_EI16_OPS (vfloat32m1_t, RVV_REQUIRE_ELEN_FP_32)
>  DEF_RVV_EI16_OPS (vfloat32m2_t, RVV_REQUIRE_ELEN_FP_32)
>  DEF_RVV_EI16_OPS (vfloat32m4_t, RVV_REQUIRE_ELEN_FP_32)
>  DEF_RVV_EI16_OPS (vfloat32m8_t, RVV_REQUIRE_ELEN_FP_32)
> +
>  DEF_RVV_EI16_OPS (vfloat64m1_t, RVV_REQUIRE_ELEN_FP_64)
>  DEF_RVV_EI16_OPS (vfloat64m2_t, RVV_REQUIRE_ELEN_FP_64)
>  DEF_RVV_EI16_OPS (vfloat64m4_t, RVV_REQUIRE_ELEN_FP_64)
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/intrisinc-vrgatherei16.c 
> b/gcc/testsuite/gcc.target/riscv/rvv/intrisinc-vrgatherei16.c
> new file mode 100644
> index 000..59c6d7c887d
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/intrisinc-vrgatherei16.c
> @@ -0,0 +1,28 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv64gcv_zvfh -mabi=lp64 -O3 -Wno-psabi" } */
> +
> +#include "riscv_vector.h"
> +
> +typedef _Float16 float16_t;
> +
> +vfloat16mf4_t test_vrgatherei16_vv_f16mf4(vfloat16mf4_t op1, vuint16mf4_t 
> op2,
> +  size_t vl) {
> +  return __riscv_vrgatherei16_vv_f16mf4(op1, op2, vl);
> +}
> +
> +vfloat16m8_t test_vrgatherei16_vv_f16m8(vfloat16m8_t op1, vuint16m8_t op2,
> +  size_t vl) {
> +  return __riscv_vrgatherei16_vv_f16m8(op1, op2, vl);
> +}
> +
> +vfloat16mf4_t test_vrgatherei16_vv_f16mf4_m(vbool64_t mask, vfloat16mf4_t 
> op1,
> +  vuint16mf4_t op2, size_t vl) {
> +  return __riscv_vrgatherei16_vv_f16mf4_m(mask, op1, op2, vl);
> +}
> +
> +vfloat16m8_t test_vrgatherei16_vv_f16m8_m(vbool2_t mask, vfloat16m8_t op1,
> +  vuint16m8_t op2, size_t vl) {
> +  return __riscv_vrgatherei16_vv_f16m8_m(mask, op1, op2, vl);
> +}
> +
> +/* { dg-final { scan-assembler-times 
> {vrgatherei16.vv\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 4 } } */
> --
> 2.34.1
>
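
As background for the intrinsics added above: vrgatherei16.vv gathers elements of the source vector using 16-bit indices, and an index at or beyond VLMAX yields zero. The sketch below is a scalar model of the unmasked form only. FP16 elements are represented as raw uint16_t bit patterns, since a gather only moves bits. The function name and the explicit vlmax parameter are assumptions for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Scalar model of an unmasked vrgatherei16.vv on 16-bit elements.
   dst must not overlap src or index.  Elements past vl are left untouched. */
static void vrgatherei16_ref(uint16_t *dst, const uint16_t *src,
                             const uint16_t *index, size_t vl, size_t vlmax)
{
    for (size_t i = 0; i < vl; i++)
        dst[i] = (index[i] >= vlmax) ? 0 : src[index[i]];
}
```

The masked (_m) variants additionally skip inactive elements, which this model does not cover.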

RE: [PATCH v1] RISC-V: Support FP MAX/MIN autovec for VLS mode

2023-09-02 Thread Li, Pan2 via Gcc-patches

Committed, thanks Kito.

Pan

From: Kito Cheng 
Sent: Saturday, September 2, 2023 11:41 PM
To: Li, Pan2 
Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; Wang, Yanzhang 

Subject: Re: [PATCH v1] RISC-V: Support FP MAX/MIN autovec for VLS mode

Ok

Pan Li via Gcc-patches <gcc-patches@gcc.gnu.org> wrote on Saturday, September 2, 2023, at 16:54:
From: Pan Li <pan2...@intel.com>

This patch would like to allow the VLS mode autovec for the
floating-point binary operation MAX/MIN.

Given below code example:

void test (float *out, float *in1, float *in2)
{
  for (int i = 0; i < 128; i++)
    out[i] = in1[i] > in2[i] ? in1[i] : in2[i];
    // Or out[i] = fmax (in1[i], in2[i]);
}

Before this patch:
test:
  csrr    a4,vlenb
  slli    a4,a4,1
  li      a5,128
  bleu    a5,a4,.L2
  mv      a5,a4
.L2:
  vsetvli zero,a5,e32,m8,ta,ma
  vle32.v v16,0(a1)
  vle32.v v8,0(a2)
  vsetvli a3,zero,e32,m8,ta,ma
  vmfgt.vv    v0,v16,v8
  vmerge.vvm  v8,v8,v16,v0
  vsetvli zero,a5,e32,m8,ta,ma
  vse32.v v8,0(a0)
  ret

After this patch:
test:
  li  a5,128
  vsetvli zero,a5,e32,m1,ta,ma
  vle32.v v1,0(a1)
  vle32.v v2,0(a2)
  vfmax.vv    v1,v1,v2
  vse32.v v1,0(a0)
  ret

This MAX/MIN autovec also applies to function calls such as fmaxf/fmax
from math.h, and it depends on the option -ffast-math.
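
One likely reason for the -ffast-math dependency (an assumption, the message does not state it) is NaN handling: the ternary form and the C fmax rule disagree when one operand is NaN, so they can only be treated as the same operation when NaNs are ignored. A small sketch, with illustrative function names:

```c
/* The ternary from the example: the result depends on operand order
   when a NaN is involved, because any comparison with NaN is false. */
static float max_ternary(float a, float b)
{
    return a > b ? a : b;
}

/* Follows the C fmaxf rule: if exactly one operand is NaN, return the other.
   Written by hand (x != x is true only for NaN) to keep the sketch
   independent of libm. */
static float max_nan_aware(float a, float b)
{
    if (a != a)
        return b;
    if (b != b)
        return a;
    return a > b ? a : b;
}
```

For ordinary inputs both functions agree. With a NaN second operand, max_ternary returns NaN while max_nan_aware returns the other value.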

Signed-off-by: Pan Li <pan2...@intel.com>

gcc/ChangeLog:

* config/riscv/autovec-vls.md (3): New pattern for
fmax/fmin
* config/riscv/vector.md: Add VLS modes to vfmax/vfmin.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vls/def.h: New macros.
* gcc.target/riscv/rvv/autovec/vls/floating-point-max-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/floating-point-max-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls/floating-point-max-3.c: New test.
* gcc.target/riscv/rvv/autovec/vls/floating-point-max-4.c: New test.
* gcc.target/riscv/rvv/autovec/vls/floating-point-max-5.c: New test.
* gcc.target/riscv/rvv/autovec/vls/floating-point-min-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/floating-point-min-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls/floating-point-min-3.c: New test.
* gcc.target/riscv/rvv/autovec/vls/floating-point-min-4.c: New test.
* gcc.target/riscv/rvv/autovec/vls/floating-point-min-5.c: New test.
---
 gcc/config/riscv/autovec-vls.md   | 23 ++
 gcc/config/riscv/vector.md| 12 +++---
 .../gcc.target/riscv/rvv/autovec/vls/def.h| 16 +++
 .../rvv/autovec/vls/floating-point-max-1.c| 43 +++
 .../rvv/autovec/vls/floating-point-max-2.c| 43 +++
 .../rvv/autovec/vls/floating-point-max-3.c| 43 +++
 .../rvv/autovec/vls/floating-point-max-4.c| 43 +++
 .../rvv/autovec/vls/floating-point-max-5.c| 31 +
 .../rvv/autovec/vls/floating-point-min-1.c| 43 +++
 .../rvv/autovec/vls/floating-point-min-2.c| 43 +++
 .../rvv/autovec/vls/floating-point-min-3.c| 43 +++
 .../rvv/autovec/vls/floating-point-min-4.c| 43 +++
 .../rvv/autovec/vls/floating-point-min-5.c| 31 +
 13 files changed, 451 insertions(+), 6 deletions(-)
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/floating-point-max-1.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/floating-point-max-2.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/floating-point-max-3.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/floating-point-max-4.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/floating-point-max-5.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/floating-point-min-1.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/floating-point-min-2.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/floating-point-min-3.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/floating-point-min-4.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/floating-point-min-5.c

diff --git a/gcc/config/riscv/autovec-vls.md b/gcc/config/riscv/autovec-vls.md
index 4ca640c11e2..7ef29637e33 100644
--- a/gcc/config/riscv/autovec-vls.md
+++ b/gcc/config/riscv/autovec-vls.md
@@ -232,6 +232,29 @@ (define_insn_and_split "3"
 [(set_attr "type" "vector")]
 )

+;; -
+;; Includes:
+;; - vfmin.vv/vfmax.vv
+;; - vfmin.vf/vfmax.vf
+;; - fmax/fmaxf in math.h
+;; -
+(define_insn_and_split "3"
+  [(set (match_operand:VLSF 0 "register_operand")
+(any_float_binop_nofrm:VLSF
+ (match_operand:VLSF 1 "")
+ (match_operand:VLSF 2 "")))]
+  "TARGET_VECTOR && can_create_pseudo_p ()"
+  "#"
+  "&& 1"
+  [(const_int 0)]
+{
+

RE: [PATCH] RISC-V: Enable VECT_COMPARE_COSTS by default

2023-09-01 Thread Li, Pan2 via Gcc-patches

Committed, thanks Robin.

Pan

-----Original Message-----
From: Gcc-patches  On Behalf 
Of Robin Dapp via Gcc-patches
Sent: Friday, September 1, 2023 5:58 PM
To: Juzhe-Zhong ; gcc-patches@gcc.gnu.org
Cc: kito.ch...@sifive.com; kito.ch...@gmail.com
Subject: Re: [PATCH] RISC-V: Enable VECT_COMPARE_COSTS by default

Hi Juzhe,

thanks, this is OK, we would have needed this sooner or later anyway.

Regards
 Robin

RE: [PATCH] RISC-V: Add dynamic LMUL compile option

2023-09-01 Thread Li, Pan2 via Gcc-patches

Committed, thanks Robin.

Pan

-----Original Message-----
From: Gcc-patches  On Behalf 
Of Robin Dapp via Gcc-patches
Sent: Friday, September 1, 2023 5:58 PM
To: Juzhe-Zhong ; gcc-patches@gcc.gnu.org
Cc: kito.ch...@sifive.com; kito.ch...@gmail.com
Subject: Re: [PATCH] RISC-V: Add dynamic LMUL compile option

LGTM

Regards
 Robin

RE: [PATCH v1] RISC-V: Support FP ADD/SUB/MUL/DIV autovec for VLS mode

2023-09-01 Thread Li, Pan2 via Gcc-patches

Committed, thanks Juzhe.

Pan

From: 钟居哲 
Sent: Friday, September 1, 2023 3:28 PM
To: Li, Pan2 ; gcc-patches 
Cc: Li, Pan2 ; Wang, Yanzhang ; 
kito.cheng 
Subject: Re: [PATCH v1] RISC-V: Support FP ADD/SUB/MUL/DIV autovec for VLS mode

LGTM.


juzhe.zh...@rivai.ai

From: pan2.li
Date: 2023-09-01 11:33
To: gcc-patches
CC: juzhe.zhong; 
pan2.li; 
yanzhang.wang; 
kito.cheng
Subject: [PATCH v1] RISC-V: Support FP ADD/SUB/MUL/DIV autovec for VLS mode
From: Pan Li <pan2...@intel.com>

This patch would like to allow the VLS mode autovec for the
floating-point binary operation ADD/SUB/MUL/DIV.

Given below code example:

void test (float *out, float *in1, float *in2)
{
  for (int i = 0; i < 128; i++)
    out[i] = in1[i] + in2[i];
}

Before this patch:
test:
  csrr a4,vlenb
  slli a4,a4,1
  li   a5,128
  bleu a5,a4,.L38
  mv   a5,a4
.L38:
  vsetvli  zero,a5,e32,m8,ta,ma
  vle32.v  v16,0(a1)
  vsetvli  a4,zero,e32,m8,ta,ma
  vmv.v.i  v8,0
  vsetvli  zero,a5,e32,m8,tu,ma
  vle32.v  v24,0(a2)
  vfadd.vv v8,v24,v16
  vse32.v  v8,0(a0)
  ret

After this patch:
test:
  li   a5,128
  vsetvli  zero,a5,e32,m1,ta,ma
  vle32.v  v1,0(a2)
  vle32.v  v2,0(a1)
  vfadd.vv v1,v1,v2
  vse32.v  v1,0(a0)
  ret

Please note this patch also fixes the execution failures of the
vect test cases below.

* vect-alias-check-10.c
* vect-alias-check-11.c
* vect-alias-check-12.c
* vect-alias-check-14.c

Signed-off-by: Pan Li <pan2...@intel.com>

gcc/ChangeLog:

* config/riscv/autovec-vls.md (3): New pattern for
vls floating-point autovec.
* config/riscv/vector-iterators.md: New iterator for
floating-point V and VLS.
* config/riscv/vector.md: Add VLS to floating-point binop.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vls/def.h:
* gcc.target/riscv/rvv/autovec/vls/floating-point-add-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/floating-point-add-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls/floating-point-add-3.c: New test.
* gcc.target/riscv/rvv/autovec/vls/floating-point-div-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/floating-point-div-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls/floating-point-div-3.c: New test.
* gcc.target/riscv/rvv/autovec/vls/floating-point-mul-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/floating-point-mul-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls/floating-point-mul-3.c: New test.
* gcc.target/riscv/rvv/autovec/vls/floating-point-sub-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/floating-point-sub-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls/floating-point-sub-3.c: New test.
---
gcc/config/riscv/autovec-vls.md   | 24 ++
gcc/config/riscv/vector-iterators.md  | 80 +++
gcc/config/riscv/vector.md| 12 +--
.../gcc.target/riscv/rvv/autovec/vls/def.h|  8 ++
.../rvv/autovec/vls/floating-point-add-1.c| 43 ++
.../rvv/autovec/vls/floating-point-add-2.c| 43 ++
.../rvv/autovec/vls/floating-point-add-3.c| 43 ++
.../rvv/autovec/vls/floating-point-div-1.c| 43 ++
.../rvv/autovec/vls/floating-point-div-2.c| 43 ++
.../rvv/autovec/vls/floating-point-div-3.c| 43 ++
.../rvv/autovec/vls/floating-point-mul-1.c| 43 ++
.../rvv/autovec/vls/floating-point-mul-2.c| 43 ++
.../rvv/autovec/vls/floating-point-mul-3.c| 43 ++
.../rvv/autovec/vls/floating-point-sub-1.c| 43 ++
.../rvv/autovec/vls/floating-point-sub-2.c| 43 ++
.../rvv/autovec/vls/floating-point-sub-3.c| 43 ++
16 files changed, 634 insertions(+), 6 deletions(-)
create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/floating-point-add-1.c
create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/floating-point-add-2.c
create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/floating-point-add-3.c
create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/floating-point-div-1.c
create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/floating-point-div-2.c
create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/floating-point-div-3.c
create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/floating-point-mul-1.c
create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/floating-point-mul-2.c
create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/floating-point-mul-3.c
create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/floating-point-sub-1.c
create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/floating-point-sub-2.c
create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/floating-point-sub-3.c

diff --git a/gcc/config/riscv/autovec-vls.md b/gcc/config/riscv/autovec-vls.md

RE: [PATCH v1] RISC-V: Support rounding mode for VFMSAC/VFMSUB autovec

2023-08-31 Thread Li, Pan2 via Gcc-patches

Committed, thanks Kito.

Pan

-----Original Message-----
From: Kito Cheng  
Sent: Thursday, August 31, 2023 9:09 PM
To: Li, Pan2 
Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; Wang, Yanzhang 

Subject: Re: [PATCH v1] RISC-V: Support rounding mode for VFMSAC/VFMSUB autovec

LGTM

On Thu, Aug 24, 2023 at 3:13 PM Pan Li via Gcc-patches
 wrote:
>
> From: Pan Li 
>
> There will be a case like below for intrinsic and autovec combination.
>
> vfadd RTZ   <- intrinsic static rounding
> vfmsub  <- autovec/autovec-opt
>
> The autovec generated vfmsub should take DYN mode, and the
> frm must be restored before the vfmsub insn. This patch
> would like to fix this issue by:
>
> * Add the frm operand to the autovec/autovec-opt pattern.
> * Set the frm_mode attr to DYN.
>
> Thus, the frm flow when combining autovec and intrinsic should be:
>
> +
> | frrm  a5
> | ...
> | fsrmi 4
> | vfadd   <- intrinsic static rounding.
> | ...
> | fsrm  a5
> | vfmsub  <- autovec/autovec-opt
> | ...
> +
>
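
The save, set, restore flow in the box above can be illustrated with a toy model in C. This is not how GCC implements it. The frm CSR is modelled as a plain variable, the enum uses the standard RISC-V frm encodings, and all names (frm_csr, run_sequence, mode_seen_by_*) are hypothetical.

```c
/* Standard RISC-V frm field encodings. */
enum frm { FRM_RNE = 0, FRM_RTZ = 1, FRM_RDN = 2, FRM_RUP = 3, FRM_RMM = 4 };

/* Toy model of the frm CSR and of what each instruction observes. */
static enum frm frm_csr = FRM_RNE;
static enum frm mode_seen_by_vfadd;
static enum frm mode_seen_by_vfmsub;

static void run_sequence(enum frm static_mode)
{
    enum frm saved = frm_csr;        /* frrm: back up the caller's mode     */
    frm_csr = static_mode;           /* fsrmi: install the static mode      */
    mode_seen_by_vfadd = frm_csr;    /* vfadd uses the intrinsic's rounding */
    frm_csr = saved;                 /* fsrm: restore before autovec code   */
    mode_seen_by_vfmsub = frm_csr;   /* vfmsub (DYN) sees the original mode */
}
```

The point of the model is the ordering: the restore has to happen before the autovec-generated instruction, otherwise vfmsub would run with the intrinsic's static rounding mode.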
> Signed-off-by: Pan Li 
>
> gcc/ChangeLog:
>
> * config/riscv/autovec-opt.md: Add FRM_REGNUM to vfmsac/vfmsub
> * config/riscv/autovec.md: Ditto.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/rvv/base/float-point-frm-autovec-2.c: New test.
> ---
>  gcc/config/riscv/autovec-opt.md   | 36 
>  gcc/config/riscv/autovec.md   | 30 ---
>  .../rvv/base/float-point-frm-autovec-2.c  | 88 +++
>  3 files changed, 127 insertions(+), 27 deletions(-)
>  create mode 100644 
> gcc/testsuite/gcc.target/riscv/rvv/base/float-point-frm-autovec-2.c
>
> diff --git a/gcc/config/riscv/autovec-opt.md b/gcc/config/riscv/autovec-opt.md
> index 4b07e80ad95..732a51edacd 100644
> --- a/gcc/config/riscv/autovec-opt.md
> +++ b/gcc/config/riscv/autovec-opt.md
> @@ -583,13 +583,15 @@ (define_insn_and_split "*single_widen_fnma"
>  ;; vect__13.182_33 = .FMS (vect__11.180_35, vect__8.176_40, vect__4.172_45);
>  (define_insn_and_split "*double_widen_fms"
>[(set (match_operand:VWEXTF 0 "register_operand")
> -   (fma:VWEXTF
> - (float_extend:VWEXTF
> -   (match_operand: 2 "register_operand"))
> - (float_extend:VWEXTF
> -   (match_operand: 3 "register_operand"))
> - (neg:VWEXTF
> -   (match_operand:VWEXTF 1 "register_operand"]
> +   (unspec:VWEXTF
> + [(fma:VWEXTF
> +   (float_extend:VWEXTF
> + (match_operand: 2 "register_operand"))
> +   (float_extend:VWEXTF
> + (match_operand: 3 "register_operand"))
> +   (neg:VWEXTF
> + (match_operand:VWEXTF 1 "register_operand")))
> +  (reg:SI FRM_REGNUM)] UNSPEC_VFFMA))]
>"TARGET_VECTOR && can_create_pseudo_p ()"
>"#"
>"&& 1"
> @@ -600,17 +602,20 @@ (define_insn_and_split "*double_widen_fms"
>  DONE;
>}
>[(set_attr "type" "vfwmuladd")
> -   (set_attr "mode" "")])
> +   (set_attr "mode" "")
> +   (set (attr "frm_mode") (symbol_ref "riscv_vector::FRM_DYN"))])
>
>  ;; This helps to match ext + fms.
>  (define_insn_and_split "*single_widen_fms"
>[(set (match_operand:VWEXTF 0 "register_operand")
> -   (fma:VWEXTF
> - (float_extend:VWEXTF
> -   (match_operand: 2 "register_operand"))
> - (match_operand:VWEXTF 3 "register_operand")
> - (neg:VWEXTF
> -   (match_operand:VWEXTF 1 "register_operand"]
> +   (unspec:VWEXTF
> + [(fma:VWEXTF
> +   (float_extend:VWEXTF
> + (match_operand: 2 "register_operand"))
> +   (match_operand:VWEXTF 3 "register_operand")
> +   (neg:VWEXTF
> + (match_operand:VWEXTF 1 "register_operand")))
> +  (reg:SI FRM_REGNUM)] UNSPEC_VFFMA))]
>"TARGET_VECTOR && can_create_pseudo_p ()"
>"#"
>"&& 1"
> @@ -627,7 +632,8 @@ (define_insn_and_split "*single_widen_fms"
>  DONE;
>}
>[(set_attr "type" "vfwmuladd")
> -   (set_attr "mode" "")])
> +   (set_attr "mode" "")
> +   (set (attr "frm_mode") (symbol_ref "riscv_vector::FRM_DYN"))])
>
>  ;; -
>  ;;  [FP] VFWNMACC
> diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md
> index 4894986d2a5..d9f1a10eb66 100644
> --- a/gcc/config/riscv/autovec.md
> +++ b/gcc/config/riscv/autovec.md
> @@ -1218,24 +1218,29 @@ (define_insn_and_split "*fnma"
>  (define_expand "fms4"
>[(parallel
>  [(set (match_operand:VF 0 "register_operand")
> - (fma:VF
> -   (match_operand:VF 1 "register_operand")
> -   (match_operand:VF 2 "register_operand")
> -   (neg:VF
> - (match_operand:VF 3 "register_operand"
> + (unspec:VF
> +   [(fma:VF
> + (match_operand:VF 1 "register_operand")
> + (match_operand:VF 2 "register_operand")
> + (neg:VF
> +   (match_operand:VF 3

RE: [PATCH v1] RISC-V: Support rounding mode for VFMADD/VFMACC autovec

2023-08-31 Thread Li, Pan2 via Gcc-patches

Committed, thanks Kito.

Pan

-----Original Message-----
From: Kito Cheng  
Sent: Thursday, August 31, 2023 9:10 PM
To: Li, Pan2 
Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; Wang, Yanzhang 

Subject: Re: [PATCH v1] RISC-V: Support rounding mode for VFMADD/VFMACC autovec

LGTM

On Thu, Aug 24, 2023 at 12:49 PM Pan Li via Gcc-patches
 wrote:
>
> From: Pan Li 
>
> There will be a case like below for intrinsic and autovec combination.
>
> vfadd RTZ   <- intrinsic static rounding
> vfmadd  <- autovec/autovec-opt
>
> The autovec generated vfmadd should take DYN mode, and the
> frm must be restored before the vfmadd insn. This patch
> would like to fix this issue by:
>
> * Add the frm operand to the vfmadd/vfmacc autovec/autovec-opt pattern.
> * Set the frm_mode attr to DYN.
>
> Thus, the frm flow when combining autovec and intrinsic should be:
>
> +
> | frrm  a5
> | ...
> | fsrmi 4
> | vfadd   <- intrinsic static rounding.
> | ...
> | fsrm  a5
> | vfmadd  <- autovec/autovec-opt
> | ...
> +
>
> However, we leverage unspec instead of use to consume the FRM register
> because there are some restrictions from the combine pass. Some code
> paths of try_combine may require XVECLEN (pat, 0) == 2 for
> recog_for_combine, and adding a new use would make XVECLEN (pat, 0) == 3
> and cause the vfwmacc optimization to fail, for example in the
> tests widen-complicate-5.c and widen-8.c.
>
> Finally, there are other fma cases, and they will be covered in
> the following patches.
>
> Signed-off-by: Pan Li 
> Co-Authored-By: Ju-Zhe Zhong 
>
> gcc/ChangeLog:
>
> * config/riscv/autovec-opt.md: Add FRM_REGNUM to vfmadd/vfmacc.
> * config/riscv/autovec.md: Ditto.
> * config/riscv/vector-iterators.md: Add UNSPEC_VFFMA.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/rvv/base/float-point-frm-autovec-1.c: New test.
> ---
>  gcc/config/riscv/autovec-opt.md   | 32 ---
>  gcc/config/riscv/autovec.md   | 26 +++---
>  gcc/config/riscv/vector-iterators.md  |  2 +
>  .../rvv/base/float-point-frm-autovec-1.c  | 88 +++
>  4 files changed, 125 insertions(+), 23 deletions(-)
>  create mode 100644 
> gcc/testsuite/gcc.target/riscv/rvv/base/float-point-frm-autovec-1.c
>
> diff --git a/gcc/config/riscv/autovec-opt.md b/gcc/config/riscv/autovec-opt.md
> index 99b609a99d9..4b07e80ad95 100644
> --- a/gcc/config/riscv/autovec-opt.md
> +++ b/gcc/config/riscv/autovec-opt.md
> @@ -459,12 +459,14 @@ (define_insn_and_split "*pred_single_widen_mul"
>  ;; vect__13.182_33 = .FMA (vect__11.180_35, vect__8.176_40, vect__4.172_45);
>  (define_insn_and_split "*double_widen_fma"
>[(set (match_operand:VWEXTF 0 "register_operand")
> -   (fma:VWEXTF
> - (float_extend:VWEXTF
> -   (match_operand: 2 "register_operand"))
> - (float_extend:VWEXTF
> -   (match_operand: 3 "register_operand"))
> - (match_operand:VWEXTF 1 "register_operand")))]
> +   (unspec:VWEXTF
> + [(fma:VWEXTF
> +   (float_extend:VWEXTF
> + (match_operand: 2 "register_operand"))
> +   (float_extend:VWEXTF
> + (match_operand: 3 "register_operand"))
> +   (match_operand:VWEXTF 1 "register_operand"))
> +  (reg:SI FRM_REGNUM)] UNSPEC_VFFMA))]
>"TARGET_VECTOR && can_create_pseudo_p ()"
>"#"
>"&& 1"
> @@ -475,16 +477,19 @@ (define_insn_and_split "*double_widen_fma"
>  DONE;
>}
>[(set_attr "type" "vfwmuladd")
> -   (set_attr "mode" "")])
> +   (set_attr "mode" "")
> +   (set (attr "frm_mode") (symbol_ref "riscv_vector::FRM_DYN"))])
>
>  ;; This helps to match ext + fma.
>  (define_insn_and_split "*single_widen_fma"
>[(set (match_operand:VWEXTF 0 "register_operand")
> -   (fma:VWEXTF
> - (float_extend:VWEXTF
> -   (match_operand: 2 "register_operand"))
> - (match_operand:VWEXTF 3 "register_operand")
> - (match_operand:VWEXTF 1 "register_operand")))]
> +   (unspec:VWEXTF
> + [(fma:VWEXTF
> +   (float_extend:VWEXTF
> + (match_operand: 2 "register_operand"))
> +   (match_operand:VWEXTF 3 "register_operand")
> +   (match_operand:VWEXTF 1 "register_operand"))
> +  (reg:SI FRM_REGNUM)] UNSPEC_VFFMA))]
>"TARGET_VECTOR && can_create_pseudo_p ()"
>"#"
>"&& 1"
> @@ -501,7 +506,8 @@ (define_insn_and_split "*single_widen_fma"
>  DONE;
>}
>[(set_attr "type" "vfwmuladd")
> -   (set_attr "mode" "")])
> +   (set_attr "mode" "")
> +   (set (attr "frm_mode") (symbol_ref "riscv_vector::FRM_DYN"))])
>
>  ;; -
>  ;;  [FP] VFWNMSAC
> diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md
> index acca4c22b90..4894986d2a5 100644
> --- a/gcc/config/riscv/autovec.md
> +++ b/gcc/config/riscv/autovec.md
> @@ -1126,22 +1126,27 @@

RE: [PATCH] RISC-V: Add Vector cost model framework for RVV

2023-08-31 Thread Li, Pan2 via Gcc-patches

Committed, thanks Kito.

Pan

-Original Message-
From: Gcc-patches  On Behalf 
Of Kito Cheng via Gcc-patches
Sent: Thursday, August 31, 2023 8:39 PM
To: Robin Dapp 
Cc: gcc-patches@gcc.gnu.org; kito.ch...@gmail.com; Juzhe-Zhong 

Subject: Re: [PATCH] RISC-V: Add Vector cost model framework for RVV

LGTM, Awesome!! It seems a sign of the next big move for RISC-V vectorization!

On Thu, Aug 31, 2023 at 8:36 PM Robin Dapp  wrote:
>
> OK.  As it doesn't do anything and we'll be needing it anyway no harm
> in adding it.
>
> Regards
>  Robin

RE: [PATCH] test: Adapt slp-26.c check for RVV

2023-08-30 Thread Li, Pan2 via Gcc-patches

Committed, thanks Richard.

Pan

-Original Message-
From: Gcc-patches  On Behalf 
Of Richard Biener via Gcc-patches
Sent: Wednesday, August 30, 2023 8:23 PM
To: Juzhe-Zhong 
Cc: gcc-patches@gcc.gnu.org
Subject: Re: [PATCH] test: Adapt slp-26.c check for RVV

On Wed, 30 Aug 2023, Juzhe-Zhong wrote:

> Fix FAILs:
> FAIL: gcc.dg/vect/slp-26.c -flto -ffat-lto-objects  scan-tree-dump-times vect 
> "vectorized 0 loops" 1
> FAIL: gcc.dg/vect/slp-26.c -flto -ffat-lto-objects  scan-tree-dump-times vect 
> "vectorizing stmts using SLP" 0
> FAIL: gcc.dg/vect/slp-26.c scan-tree-dump-times vect "vectorized 0 loops" 1
> FAIL: gcc.dg/vect/slp-26.c scan-tree-dump-times vect "vectorizing stmts using 
> SLP" 0
> 
> Since RVV is able to vectorize it with VLS modes like amdgcn.

OK

> gcc/testsuite/ChangeLog:
> 
>   * gcc.dg/vect/slp-26.c: Adapt for RVV.
> 
> ---
>  gcc/testsuite/gcc.dg/vect/slp-26.c | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/gcc/testsuite/gcc.dg/vect/slp-26.c 
> b/gcc/testsuite/gcc.dg/vect/slp-26.c
> index d398a5acb0c..196981d83c1 100644
> --- a/gcc/testsuite/gcc.dg/vect/slp-26.c
> +++ b/gcc/testsuite/gcc.dg/vect/slp-26.c
> @@ -47,7 +47,7 @@ int main (void)
>return 0;
>  }
>  
> -/* { dg-final { scan-tree-dump-times "vectorized 0 loops" 1 "vect" { target 
> { ! { mips_msa || amdgcn-*-* } } } } } */
> -/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target 
> { mips_msa || amdgcn-*-* } } } } */
> -/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 0 "vect" 
> { target { ! { mips_msa || amdgcn-*-* } } } } } */
> -/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" 
> { target { mips_msa || amdgcn-*-* } } } } */
> +/* { dg-final { scan-tree-dump-times "vectorized 0 loops" 1 "vect" { target 
> { ! { mips_msa || { amdgcn-*-* || riscv_vector } } } } } } */
> +/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target 
> { mips_msa || { amdgcn-*-* || riscv_vector } } } } } */
> +/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 0 "vect" 
> { target { ! { mips_msa || { amdgcn-*-* || riscv_vector } } } } } } */
> +/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" 
> { target { mips_msa || { amdgcn-*-* || riscv_vector } } } } } */
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)

RE: [PATCH] test: Add xfail into slp-reduc-7.c for RVV VLA vectorization

2023-08-30 Thread Li, Pan2 via Gcc-patches

Committed, thanks Richard.

Pan

-Original Message-
From: Gcc-patches  On Behalf 
Of Richard Biener via Gcc-patches
Sent: Wednesday, August 30, 2023 8:23 PM
To: Juzhe-Zhong 
Cc: gcc-patches@gcc.gnu.org
Subject: Re: [PATCH] test: Add xfail into slp-reduc-7.c for RVV VLA 
vectorization

On Wed, 30 Aug 2023, Juzhe-Zhong wrote:

> Like ARM SVE, add RVV variable length xfail.

OK

> gcc/testsuite/ChangeLog:
> 
>   * gcc.dg/vect/slp-reduc-7.c: Add RVV.
> 
> ---
>  gcc/testsuite/gcc.dg/vect/slp-reduc-7.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/gcc/testsuite/gcc.dg/vect/slp-reduc-7.c 
> b/gcc/testsuite/gcc.dg/vect/slp-reduc-7.c
> index 7a958f24733..a8528ab53ee 100644
> --- a/gcc/testsuite/gcc.dg/vect/slp-reduc-7.c
> +++ b/gcc/testsuite/gcc.dg/vect/slp-reduc-7.c
> @@ -57,5 +57,5 @@ int main (void)
>  /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { xfail 
> vect_no_int_add } } } */
>  /* For variable-length SVE, the number of scalar statements in the
> reduction exceeds the number of elements in a 128-bit granule.  */
> -/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" 
> { xfail { vect_no_int_add || { aarch64_sve && vect_variable_length } } } } } 
> */
> +/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" 
> { xfail { vect_no_int_add || { { aarch64_sve && vect_variable_length } || { 
> riscv_vector && vect_variable_length } } } } } } */
>  /* { dg-final { scan-tree-dump-times "VEC_PERM_EXPR" 0 "vect" { xfail { 
> aarch64_sve && vect_variable_length } } } } */
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)

RE: [PATCH] test: Add xfail for riscv_vector

2023-08-30 Thread Li, Pan2 via Gcc-patches

Committed, thanks Richard.

Pan

-Original Message-
From: Gcc-patches  On Behalf 
Of Richard Biener via Gcc-patches
Sent: Wednesday, August 30, 2023 4:36 PM
To: Juzhe-Zhong 
Cc: gcc-patches@gcc.gnu.org
Subject: Re: [PATCH] test: Add xfail for riscv_vector

On Wed, 30 Aug 2023, Juzhe-Zhong wrote:

> Like ARM SVE, when we enable scalable vectorization for RVV,
> we can't constant fold these yet, for either ARM SVE or RVV.
> 
> 
> Ok for trunk ?

OK.

> gcc/testsuite/ChangeLog:
> 
>   * gcc.dg/vect/pr88598-1.c: Add riscv_vector.
>   * gcc.dg/vect/pr88598-2.c: Ditto.
>   * gcc.dg/vect/pr88598-3.c: Ditto.
> 
> ---
>  gcc/testsuite/gcc.dg/vect/pr88598-1.c | 2 +-
>  gcc/testsuite/gcc.dg/vect/pr88598-2.c | 2 +-
>  gcc/testsuite/gcc.dg/vect/pr88598-3.c | 2 +-
>  3 files changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/gcc/testsuite/gcc.dg/vect/pr88598-1.c 
> b/gcc/testsuite/gcc.dg/vect/pr88598-1.c
> index e25c6c04543..ddcebb067ea 100644
> --- a/gcc/testsuite/gcc.dg/vect/pr88598-1.c
> +++ b/gcc/testsuite/gcc.dg/vect/pr88598-1.c
> @@ -51,4 +51,4 @@ main ()
>  
>  /* ??? We need more constant folding for this to work with fully-masked
> loops.  */
> -/* { dg-final { scan-tree-dump-not {REDUC_PLUS} "optimized" { xfail 
> aarch64_sve } } } */
> +/* { dg-final { scan-tree-dump-not {REDUC_PLUS} "optimized" { xfail { 
> aarch64_sve || riscv_vector } } } } */
> diff --git a/gcc/testsuite/gcc.dg/vect/pr88598-2.c 
> b/gcc/testsuite/gcc.dg/vect/pr88598-2.c
> index f4c41bd8e58..ef5ea8a1a86 100644
> --- a/gcc/testsuite/gcc.dg/vect/pr88598-2.c
> +++ b/gcc/testsuite/gcc.dg/vect/pr88598-2.c
> @@ -51,4 +51,4 @@ main ()
>  
>  /* ??? We need more constant folding for this to work with fully-masked
> loops.  */
> -/* { dg-final { scan-tree-dump-not {REDUC_PLUS} "optimized" { xfail 
> aarch64_sve } } } */
> +/* { dg-final { scan-tree-dump-not {REDUC_PLUS} "optimized" { xfail { 
> aarch64_sve || riscv_vector } } } } */
> diff --git a/gcc/testsuite/gcc.dg/vect/pr88598-3.c 
> b/gcc/testsuite/gcc.dg/vect/pr88598-3.c
> index 0fc23bf0ee7..75b8d024a95 100644
> --- a/gcc/testsuite/gcc.dg/vect/pr88598-3.c
> +++ b/gcc/testsuite/gcc.dg/vect/pr88598-3.c
> @@ -51,4 +51,4 @@ main ()
>  
>  /* ??? We need more constant folding for this to work with fully-masked
> loops.  */
> -/* { dg-final { scan-tree-dump-not {REDUC_PLUS} "optimized" { xfail 
> aarch64_sve } } } */
> +/* { dg-final { scan-tree-dump-not {REDUC_PLUS} "optimized" { xfail { 
> aarch64_sve || riscv_vector } } } } */
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)

RE: [PATCH] test: Fix XPASS of RVV

2023-08-30 Thread Li, Pan2 via Gcc-patches

Committed, thanks Richard.

Pan

-Original Message-
From: Gcc-patches  On Behalf 
Of Richard Biener via Gcc-patches
Sent: Wednesday, August 30, 2023 6:24 PM
To: Juzhe-Zhong 
Cc: gcc-patches@gcc.gnu.org
Subject: Re: [PATCH] test: Fix XPASS of RVV

On Wed, 30 Aug 2023, Juzhe-Zhong wrote:

> XPASS: gcc.dg/vect/vect-outer-4e.c -flto -ffat-lto-objects  
> scan-tree-dump-times vect "OUTER LOOP VECTORIZED" 1
> XPASS: gcc.dg/vect/vect-outer-4e.c scan-tree-dump-times vect "OUTER LOOP 
> VECTORIZED" 1
> XPASS: gcc.dg/vect/vect-outer-4f.c -flto -ffat-lto-objects  
> scan-tree-dump-times vect "OUTER LOOP VECTORIZED" 1
> XPASS: gcc.dg/vect/vect-outer-4f.c scan-tree-dump-times vect "OUTER LOOP 
> VECTORIZED" 1
> XPASS: gcc.dg/vect/vect-outer-4g.c -flto -ffat-lto-objects  
> scan-tree-dump-times vect "OUTER LOOP VECTORIZED" 1
> XPASS: gcc.dg/vect/vect-outer-4g.c scan-tree-dump-times vect "OUTER LOOP 
> VECTORIZED" 1
> XPASS: gcc.dg/vect/vect-outer-4k.c -flto -ffat-lto-objects  
> scan-tree-dump-times vect "OUTER LOOP VECTORIZED" 1
> XPASS: gcc.dg/vect/vect-outer-4k.c scan-tree-dump-times vect "OUTER LOOP 
> VECTORIZED" 1
> XPASS: gcc.dg/vect/vect-outer-4l.c -flto -ffat-lto-objects  
> scan-tree-dump-times vect "OUTER LOOP VECTORIZED" 1
> XPASS: gcc.dg/vect/vect-outer-4l.c scan-tree-dump-times vect "OUTER LOOP 
> VECTORIZED" 1
> 
> Like ARM SVE, fix these XPASSes for RVV.

OK.

> gcc/testsuite/ChangeLog:
> 
>   * gcc.dg/vect/vect-double-reduc-5.c: Add riscv.
>   * gcc.dg/vect/vect-outer-4e.c: Ditto.
>   * gcc.dg/vect/vect-outer-4f.c: Ditto.
>   * gcc.dg/vect/vect-outer-4g.c: Ditto.
>   * gcc.dg/vect/vect-outer-4k.c: Ditto.
>   * gcc.dg/vect/vect-outer-4l.c: Ditto.
> 
> ---
>  gcc/testsuite/gcc.dg/vect/vect-double-reduc-5.c | 2 +-
>  gcc/testsuite/gcc.dg/vect/vect-outer-4e.c   | 2 +-
>  gcc/testsuite/gcc.dg/vect/vect-outer-4f.c   | 2 +-
>  gcc/testsuite/gcc.dg/vect/vect-outer-4g.c   | 2 +-
>  gcc/testsuite/gcc.dg/vect/vect-outer-4k.c   | 2 +-
>  gcc/testsuite/gcc.dg/vect/vect-outer-4l.c   | 2 +-
>  6 files changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/gcc/testsuite/gcc.dg/vect/vect-double-reduc-5.c 
> b/gcc/testsuite/gcc.dg/vect/vect-double-reduc-5.c
> index 7465eae1c47..b990405745e 100644
> --- a/gcc/testsuite/gcc.dg/vect/vect-double-reduc-5.c
> +++ b/gcc/testsuite/gcc.dg/vect/vect-double-reduc-5.c
> @@ -53,5 +53,5 @@ int main ()
>  
>  /* Vectorization of loops with multiple types and double reduction is not 
> supported yet.  */   
> -/* { dg-final { scan-tree-dump-times "OUTER LOOP VECTORIZED" 1 "vect" { 
> xfail { ! aarch64*-*-* } } } } */
> +/* { dg-final { scan-tree-dump-times "OUTER LOOP VECTORIZED" 1 "vect" { 
> xfail { ! { aarch64*-*-* riscv*-*-* } } } } } */
>
> diff --git a/gcc/testsuite/gcc.dg/vect/vect-outer-4e.c 
> b/gcc/testsuite/gcc.dg/vect/vect-outer-4e.c
> index e65a092f5bf..cc9e96f5d58 100644
> --- a/gcc/testsuite/gcc.dg/vect/vect-outer-4e.c
> +++ b/gcc/testsuite/gcc.dg/vect/vect-outer-4e.c
> @@ -23,4 +23,4 @@ foo (){
>return;
>  }
>  
> -/* { dg-final { scan-tree-dump-times "OUTER LOOP VECTORIZED" 1 "vect" { 
> xfail { ! aarch64*-*-* } } } } */
> +/* { dg-final { scan-tree-dump-times "OUTER LOOP VECTORIZED" 1 "vect" { 
> xfail { ! { aarch64*-*-* riscv*-*-* } } } } } */
> diff --git a/gcc/testsuite/gcc.dg/vect/vect-outer-4f.c 
> b/gcc/testsuite/gcc.dg/vect/vect-outer-4f.c
> index a88014a2fbf..c903dc9bfea 100644
> --- a/gcc/testsuite/gcc.dg/vect/vect-outer-4f.c
> +++ b/gcc/testsuite/gcc.dg/vect/vect-outer-4f.c
> @@ -65,4 +65,4 @@ int main (void)
>return 0;
>  }
>  
> -/* { dg-final { scan-tree-dump-times "OUTER LOOP VECTORIZED" 1 "vect" { 
> xfail { ! aarch64*-*-* } } } } */
> +/* { dg-final { scan-tree-dump-times "OUTER LOOP VECTORIZED" 1 "vect" { 
> xfail { ! { aarch64*-*-* riscv*-*-* } } } } } */
> diff --git a/gcc/testsuite/gcc.dg/vect/vect-outer-4g.c 
> b/gcc/testsuite/gcc.dg/vect/vect-outer-4g.c
> index a88014a2fbf..c903dc9bfea 100644
> --- a/gcc/testsuite/gcc.dg/vect/vect-outer-4g.c
> +++ b/gcc/testsuite/gcc.dg/vect/vect-outer-4g.c
> @@ -65,4 +65,4 @@ int main (void)
>return 0;
>  }
>  
> -/* { dg-final { scan-tree-dump-times "OUTER LOOP VECTORIZED" 1 "vect" { 
> xfail { ! aarch64*-*-* } } } } */
> +/* { dg-final { scan-tree-dump-times "OUTER LOOP VECTORIZED" 1 "vect" { 
> xfail { ! { aarch64*-*-* riscv*-*-* } } } } } */
> diff --git a/gcc/testsuite/gcc.dg/vect/vect-outer-4k.c 
> b/gcc/testsuite/gcc.dg/vect/vect-outer-4k.c
> index a88014a2fbf..c903dc9bfea 100644
> --- a/gcc/testsuite/gcc.dg/vect/vect-outer-4k.c
> +++ b/gcc/testsuite/gcc.dg/vect/vect-outer-4k.c
> @@ -65,4 +65,4 @@ int main (void)
>return 0;
>  }
>  
> -/* { dg-final { scan-tree-dump-times "OUTER LOOP VECTORIZED" 1 "vect" { 
> xfail { ! aarch64*-*-* } } } } */
> +/* { dg-final { scan-tree-dump-times "OUTER LOOP VECTORIZED" 1 "vect" { 
> xfail { ! { aarch64*-*-* riscv*-*-* } } } } } */
> diff --git

RE: [PATCH] RISC-V: Make sure we get VL REG operand for VLMAX vsetvl

2023-08-29 Thread Li, Pan2 via Gcc-patches

Committed, thanks Kito.

Pan

-Original Message-
From: Gcc-patches  On Behalf 
Of Kito Cheng via Gcc-patches
Sent: Wednesday, August 30, 2023 10:57 AM
To: Juzhe-Zhong 
Cc: gcc-patches@gcc.gnu.org; kito.ch...@sifive.com
Subject: Re: [PATCH] RISC-V: Make sure we get VL REG operand for VLMAX vsetvl

Lgtm

On Wed, Aug 30, 2023 at 10:22 Juzhe-Zhong wrote:

> Fix ICE in "vect" testsuite:
>
> FAIL: gcc.dg/vect/pr64495.c (internal compiler error: in df_uses_record,
> at df-scan.cc:2958)
> FAIL: gcc.dg/vect/pr64495.c (test for excess errors)
>
> After this patch, all VSETVL pass related bugs found so far in "vect" are
> fixed.
>
> gcc/ChangeLog:
>
> * config/riscv/riscv-vsetvl.cc
> (vector_insn_info::get_avl_or_vl_reg): Fix bug.
>
> ---
>  gcc/config/riscv/riscv-vsetvl.cc | 16 
>  1 file changed, 12 insertions(+), 4 deletions(-)
>
> diff --git a/gcc/config/riscv/riscv-vsetvl.cc
> b/gcc/config/riscv/riscv-vsetvl.cc
> index 73d672b083b..1386d9250ca 100644
> --- a/gcc/config/riscv/riscv-vsetvl.cc
> +++ b/gcc/config/riscv/riscv-vsetvl.cc
> @@ -2300,18 +2300,26 @@ vector_insn_info::get_avl_or_vl_reg (void) const
>if (!vlmax_avl_p (get_avl ()))
>  return get_avl ();
>
> -  if (has_vl_op (get_insn ()->rtl ()) || vsetvl_insn_p (get_insn ()->rtl
> ()))
> -return ::get_vl (get_insn ()->rtl ());
> -
>if (get_avl_source ())
>  return get_avl_reg_rtx ();
>
> +  rtx_insn *rinsn = get_insn ()->rtl ();
> +  if (has_vl_op (rinsn) || vsetvl_insn_p (rinsn))
> +{
> +  rtx vl = ::get_vl (rinsn);
> +  /* For VLMAX, we should make sure we get the
> +REG to emit 'vsetvl VL,zero' since the 'VL'
> +should be the REG according to RVV ISA.  */
> +  if (REG_P (vl))
> +   return vl;
> +}
> +
>/* A DIRTY (polluted EMPTY) block if:
> - get_insn is scalar move (no AVL or VL operand).
> - get_avl_source is null (no def in the current DIRTY block).
>   Then we trace the previous insn which must be the insn
>   already inserted in Phase 2 to get the VL operand for VLMAX.  */
> -  rtx_insn *prev_rinsn = PREV_INSN (get_insn ()->rtl ());
> +  rtx_insn *prev_rinsn = PREV_INSN (rinsn);
>gcc_assert (prev_rinsn && vsetvl_insn_p (prev_rinsn));
>return ::get_vl (prev_rinsn);
>  }
> --
> 2.36.3
>
>

RE: Re: [PATCH] RISC-V: Enable movmisalign for VLS modes

2023-08-29 Thread Li, Pan2 via Gcc-patches

Committed, thanks Jeff and Kito.

Pan

-Original Message-
From: Gcc-patches  On Behalf 
Of ???
Sent: Wednesday, August 30, 2023 6:27 AM
To: Jeff Law ; kito.cheng 
Cc: gcc-patches ; kito.cheng 
Subject: Re: Re: [PATCH] RISC-V: Enable movmisalign for VLS modes

> OK for the trunk.
Thanks. Will commit it soon.

> Is force_reg safe for movmisalign?
Both operands[0] and operands[1] are vector QImode already, so it's safe to 
force reg.
And we have fully tested MEM->MEM and CONST->MEM in gcc.dg/vect.



juzhe.zh...@rivai.ai
 
From: Jeff Law
Date: 2023-08-29 22:23
To: Kito Cheng; Juzhe-Zhong
CC: gcc-patches; kito.cheng
Subject: Re: [PATCH] RISC-V: Enable movmisalign for VLS modes
 
 
On 8/29/23 07:54, Kito Cheng via Gcc-patches wrote:
>> +/* To support misalign data movement, we should use
>> +   minimum element alignment load/store.  */
>> +unsigned int size = GET_MODE_SIZE (GET_MODE_INNER (mode));
>> +poly_int64 nunits = GET_MODE_NUNITS (mode) * size;
>> +machine_mode mode = riscv_vector::get_vector_mode (QImode, 
>> nunits).require ();
>> +operands[0] = gen_lowpart (mode, operands[0]);
>> +operands[1] = gen_lowpart (mode, operands[1]);
>> +if (MEM_P (operands[0]) && !register_operand (operands[1], mode))
>> +  operands[1] = force_reg (mode, operands[1]);
> 
> Is force_reg safe for movmisalign?
It should be.  It's a pretty common idiom.  Essentially it's going to 
result in generating this for the MEM->MEM case:
 
MEM->REG
REG->MEM
 
 
Both of which are likely to go through the misalign expander.
 
I was about to ACK when I had to leave for a few minutes.
 
OK for the trunk.
 
jeff
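
As a rough illustration of the "minimum element alignment" idea in the quoted hunk (reinterpret the vector as byte elements so that no access needs more than byte alignment), here is a minimal scalar sketch. It is not the RVV expander; the function name copy_u32_misaligned is hypothetical and the code only mirrors the nunits * element-size computation.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not GCC code: copy N 32-bit elements between
   buffers that may be misaligned by viewing them as byte elements,
   which is the scalar analogue of switching to a QImode vector.  */
void copy_u32_misaligned (void *dst, const void *src, size_t n)
{
  unsigned char *d = dst;
  const unsigned char *s = src;
  /* Element count times element size gives the number of byte units.  */
  size_t nbytes = n * sizeof (uint32_t);

  for (size_t i = 0; i < nbytes; i++)
    d[i] = s[i];   /* byte accesses have no alignment requirement */
}
```

Because every access is one byte wide, the copy is valid at any address, at the cost of moving more (smaller) elements.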

RE: [PATCH v1] RISC-V: Fix one ICE for vect test vect-multitypes-5

2023-08-29 Thread Li, Pan2 via Gcc-patches

Committed, thanks Kito.

Pan

-Original Message-
From: Kito Cheng  
Sent: Tuesday, August 29, 2023 9:46 PM
To: Li, Pan2 
Cc: gcc-patches@gcc.gnu.org; Wang, Yanzhang ; 
juzhe.zh...@rivai.ai
Subject: Re: [PATCH v1] RISC-V: Fix one ICE for vect test vect-multitypes-5

LGTM, thanks :)

On Tue, Aug 29, 2023 at 6:50 PM Pan Li via Gcc-patches
 wrote:
>
> From: Pan Li 
>
> An ICE occurs when building vect-multitypes-5.c with a command similar to:
>
> riscv64-unknown-elf-gcc -O3 \
>   -march=rv64imafdcv -mabi=lp64d -mcmodel=medlow \
>   -fdiagnostics-plain-output -flto -ffat-lto-objects \
>   --param riscv-autovec-preference=scalable -Wno-psabi \
>   -ftree-vectorize -fno-tree-loop-distribute-patterns \
>   -fno-vect-cost-model -fno-common -O2 -fdump-tree-vect-details \
>   gcc/testsuite/gcc.dg/vect/vect-multitypes-5.c -o test.elf -lm
>
> The RTL below is not handled in riscv_legitimize_const_move, so it
> falls through to the default path. The default force_const_mem then
> returns NULL_RTX, and the compiler ICEs when operating on that
> NULL_RTX.
>
> (const:DI
>   (plus:DI
> (symbol_ref:DI ("ic") [flags 0x2] )
> (const_poly_int:DI [16, 16])))
>
> This patch would like to take care of this rtl in riscv_legitimize_const_move.
>
> Signed-off-by: Pan Li 
> Co-Authored-By: Ju-Zhe Zhong 
>
> gcc/ChangeLog:
>
> * config/riscv/riscv.cc (riscv_legitimize_poly_move): New declaration.
> (riscv_legitimize_const_move): Handle ref plus const poly.
> ---
>  gcc/config/riscv/riscv.cc | 23 +++
>  1 file changed, 23 insertions(+)
>
> diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
> index 1d6e278ea90..bab6ed70b2d 100644
> --- a/gcc/config/riscv/riscv.cc
> +++ b/gcc/config/riscv/riscv.cc
> @@ -366,6 +366,7 @@ static const struct riscv_tune_param 
> optimize_size_tune_info = {
>
>  static tree riscv_handle_fndecl_attribute (tree *, tree, tree, int, bool *);
>  static tree riscv_handle_type_attribute (tree *, tree, tree, int, bool *);
> +static void riscv_legitimize_poly_move (machine_mode, rtx, rtx, rtx);
>
>  /* Defining target-specific uses of __attribute__.  */
>  static const struct attribute_spec riscv_attribute_table[] =
> @@ -2118,6 +2119,28 @@ riscv_legitimize_const_move (machine_mode mode, rtx 
> dest, rtx src)
>return;
>  }
>
> +  /* Handle below format.
> + (const:DI
> +   (plus:DI
> +(symbol_ref:DI ("ic") [flags 0x2] ) <- 
> op_0
> +(const_poly_int:DI [16, 16]) // <- op_1
> + ))
> +   */
> +  rtx src_op_0 = XEXP (src, 0);
> +
> +  if (GET_CODE (src) == CONST && GET_CODE (src_op_0) == PLUS
> +&& CONST_POLY_INT_P (XEXP (src_op_0, 1)))
> +{
> +  rtx dest_tmp = gen_reg_rtx (mode);
> +  rtx tmp = gen_reg_rtx (mode);
> +
> +  riscv_emit_move (dest, XEXP (src_op_0, 0));
> +  riscv_legitimize_poly_move (mode, dest_tmp, tmp, XEXP (src_op_0, 1));
> +
> +  emit_insn (gen_rtx_SET (dest, gen_rtx_PLUS (mode, dest, dest_tmp)));
> +  return;
> +}
> +
>src = force_const_mem (mode, src);
>
>/* When using explicit relocs, constant pool references are sometimes
> --
> 2.34.1
>
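
The three steps in the new hunk (move the symbol into dest, materialize the poly constant into a temporary, add the two) can be sketched with plain integers. This is only a toy model: struct poly2 and both functions are hypothetical names, and the sketch assumes the usual reading of a two-coefficient poly_int [c0, c1] as c0 + c1 * x for a runtime value x.

```c
#include <stdint.h>

/* Toy stand-in for a two-coefficient poly_int: value = c0 + c1 * x,
   where x is only known at run time.  Hypothetical, not GCC code.  */
struct poly2 { int64_t c0; int64_t c1; };

int64_t poly2_eval (struct poly2 p, int64_t x)
{
  return p.c0 + p.c1 * x;
}

/* Mirrors the order of operations in the patch for
   (const (plus (symbol_ref) (const_poly_int))).  */
uint64_t symbol_plus_poly (uint64_t sym, struct poly2 off, int64_t x)
{
  uint64_t dest = sym;                     /* move the symbol into dest */
  int64_t dest_tmp = poly2_eval (off, x);  /* legitimize the poly part */
  dest = dest + (uint64_t) dest_tmp;       /* dest = dest + dest_tmp */
  return dest;
}
```

For example, with off = [16, 16] the offset is 16 when x is 0 and 32 when x is 1.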

RE: [PATCH] RISC-V: Fix uninitialized probability for GIMPLE IR tests

2023-08-28 Thread Li, Pan2 via Gcc-patches

Committed, thanks Kito.

Pan

-Original Message-
From: Gcc-patches  On Behalf 
Of Kito Cheng via Gcc-patches
Sent: Monday, August 28, 2023 8:59 PM
To: Juzhe-Zhong 
Cc: GCC Patches ; Kito Cheng 
Subject: Re: [PATCH] RISC-V: Fix uninitialized probability for GIMPLE IR tests

LGTM

On Mon, Aug 28, 2023 at 19:40 Juzhe-Zhong wrote:

> This patch fixes the uninitialized probability in GIMPLE IR code tests:
> FAIL: gcc.dg/vect/slp-reduc-10a.c (internal compiler error: in
> compute_probabilities, at config/riscv/riscv-vsetvl.cc:4358)
> FAIL: gcc.dg/vect/slp-reduc-10a.c (test for excess errors)
> FAIL: gcc.dg/vect/slp-reduc-10a.c -flto -ffat-lto-objects (internal
> compiler error: in compute_probabilities, at
> config/riscv/riscv-vsetvl.cc:4358)
> FAIL: gcc.dg/vect/slp-reduc-10a.c -flto -ffat-lto-objects (test for excess
> errors)
> FAIL: gcc.dg/vect/slp-reduc-10b.c (internal compiler error: in
> compute_probabilities, at config/riscv/riscv-vsetvl.cc:4358)
> FAIL: gcc.dg/vect/slp-reduc-10b.c (test for excess errors)
> FAIL: gcc.dg/vect/slp-reduc-10b.c -flto -ffat-lto-objects (internal
> compiler error: in compute_probabilities, at
> config/riscv/riscv-vsetvl.cc:4358)
> FAIL: gcc.dg/vect/slp-reduc-10b.c -flto -ffat-lto-objects (test for excess
> errors)
> FAIL: gcc.dg/vect/slp-reduc-10c.c (internal compiler error: in
> compute_probabilities, at config/riscv/riscv-vsetvl.cc:4358)
> FAIL: gcc.dg/vect/slp-reduc-10c.c (test for excess errors)
> FAIL: gcc.dg/vect/slp-reduc-10c.c -flto -ffat-lto-objects (internal
> compiler error: in compute_probabilities, at
> config/riscv/riscv-vsetvl.cc:4358)
> FAIL: gcc.dg/vect/slp-reduc-10c.c -flto -ffat-lto-objects (test for excess
> errors)
> FAIL: gcc.dg/vect/slp-reduc-10d.c (internal compiler error: in
> compute_probabilities, at config/riscv/riscv-vsetvl.cc:4358)
> FAIL: gcc.dg/vect/slp-reduc-10d.c (test for excess errors)
> FAIL: gcc.dg/vect/slp-reduc-10d.c -flto -ffat-lto-objects (internal
> compiler error: in compute_probabilities, at
> config/riscv/riscv-vsetvl.cc:4358)
> FAIL: gcc.dg/vect/slp-reduc-10d.c -flto -ffat-lto-objects (test for excess
> errors)
> FAIL: gcc.dg/vect/slp-reduc-10e.c (internal compiler error: in
> compute_probabilities, at config/riscv/riscv-vsetvl.cc:4358)
> FAIL: gcc.dg/vect/slp-reduc-10e.c (test for excess errors)
> FAIL: gcc.dg/vect/slp-reduc-10e.c -flto -ffat-lto-objects (internal
> compiler error: in compute_probabilities, at
> config/riscv/riscv-vsetvl.cc:4358)
> FAIL: gcc.dg/vect/slp-reduc-10e.c -flto -ffat-lto-objects (test for excess
> errors)
> FAIL: gcc.dg/vect/vect-cond-arith-2.c (internal compiler error: in
> compute_probabilities, at config/riscv/riscv-vsetvl.cc:4358)
> FAIL: gcc.dg/vect/vect-cond-arith-2.c (test for excess errors)
> FAIL: gcc.dg/vect/vect-cond-arith-2.c -flto -ffat-lto-objects (internal
> compiler error: in compute_probabilities, at
> config/riscv/riscv-vsetvl.cc:4358)
> FAIL: gcc.dg/vect/vect-cond-arith-2.c -flto -ffat-lto-objects (test for
> excess errors)
>
> gcc/ChangeLog:
>
> * config/riscv/riscv-vsetvl.cc (pass_vsetvl::earliest_fusion):
> Skip never probability.
> (pass_vsetvl::compute_probabilities): Fix uninitialized probability.
>
> ---
>  gcc/config/riscv/riscv-vsetvl.cc | 13 -
>  1 file changed, 12 insertions(+), 1 deletion(-)
>
> diff --git a/gcc/config/riscv/riscv-vsetvl.cc
> b/gcc/config/riscv/riscv-vsetvl.cc
> index 48e89fe2c03..f7ae6c16bee 100644
> --- a/gcc/config/riscv/riscv-vsetvl.cc
> +++ b/gcc/config/riscv/riscv-vsetvl.cc
> @@ -3272,6 +3272,10 @@ pass_vsetvl::earliest_fusion (void)
>   if (expr.empty_p ())
> continue;
>   edge eg = INDEX_EDGE (m_vector_manager->vector_edge_list, ed);
> + /* If it is the edge that we never reach, skip its possible PRE
> +fusion conservatively.  */
> + if (eg->probability == profile_probability::never ())
> +   break;
>   if (eg->src == ENTRY_BLOCK_PTR_FOR_FN (cfun)
>   || eg->dest == EXIT_BLOCK_PTR_FOR_FN (cfun))
> break;
> @@ -4359,7 +4363,14 @@ pass_vsetvl::compute_probabilities (void)
>FOR_EACH_EDGE (e, ei, cfg_bb->succs)
> {
>   auto &new_prob = get_block_info (e->dest).probability;
> - if (!new_prob.initialized_p ())
> + /* Normally, the edge probability should be initialized.
> +However, some special testing code which is written in
> +GIMPLE IR style force the edge probility uninitialized,
> +we conservatively set it as never so that it will not
> +affect PRE (Phase 3 && Phse 4).  */
> + if (!e->probability.initialized_p ())
> +   new_prob = profile_probability::never ();
> + else if (!new_prob.initialized_p ())
> new_prob = curr_prob * e->probability;
>   else if (new_prob == profile_probability::always ())
> continue;
> --
> 2.36.3
>
>

RE: [PATCH V2] RISC-V: Support LEN_FOLD_EXTRACT_LAST auto-vectorization

2023-08-25 Thread Li, Pan2 via Gcc-patches

Committed, thanks Robin.

Pan

-Original Message-
From: Gcc-patches  On Behalf 
Of Robin Dapp via Gcc-patches
Sent: Thursday, August 24, 2023 6:23 PM
To: Juzhe-Zhong ; gcc-patches@gcc.gnu.org
Cc: rdapp@gmail.com; kito.ch...@gmail.com; kito.ch...@sifive.com; 
jeffreya...@gmail.com
Subject: Re: [PATCH V2] RISC-V: Support LEN_FOLD_EXTRACT_LAST auto-vectorization

LGTM.

Regards
 Robin

RE: [PATCH v1] Mode-Switching: Add optional EMIT_AFTER hook

2023-08-25 Thread Li, Pan2 via Gcc-patches

Hi Jeff,

> You might also peek at the RTL gcse/pre code which is also LCM based and 
> has the same class of problems.

I found a similar approach to take care of this in gcse.cc/pre_edge_insert with 
some comments as below.

  /* We can't insert anything on an abnormal and
   critical edge, so we insert the insn at the end of
   the previous block. There are several alternatives
   detailed in Morgans book P277 (sec 10.5) for
   handling this situation.  This one is easiest for
   now.  */

if (eg->flags & EDGE_ABNORMAL)
  insert_insn_end_basic_block (index_map[j], bb);
else
  {
  insn = process_insert_insn (index_map[j]);
  insert_insn_on_edge (insn, eg);
  }

Judging from the comments, insert_insn_end_basic_block is designed to handle
the ABNORMAL edge by inserting at the end of the previous block.
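
The decision quoted above boils down to one branch on the edge flags. A toy model, with hypothetical names (toy_edge, choose_insert_point) that are not part of gcse.cc, might look like:

```c
#include <stdbool.h>

/* Where the new insn ends up.  Hypothetical names, not GCC code.  */
enum insert_point { INSERT_AT_END_OF_SRC_BLOCK, INSERT_ON_EDGE };

/* Minimal stand-in for a CFG edge.  */
struct toy_edge { int src; int dest; bool abnormal; };

/* Nothing can be placed on an abnormal edge, so fall back to the end
   of the source block; otherwise insert on the edge itself.  */
enum insert_point choose_insert_point (const struct toy_edge *e)
{
  return e->abnormal ? INSERT_AT_END_OF_SRC_BLOCK : INSERT_ON_EDGE;
}
```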

Pan

-Original Message-
From: Gcc-patches  On Behalf 
Of Li, Pan2 via Gcc-patches
Sent: Thursday, August 24, 2023 12:54 PM
To: Jeff Law ; gcc-patches@gcc.gnu.org
Cc: juzhe.zh...@rivai.ai; Wang, Yanzhang ; 
kito.ch...@gmail.com
Subject: RE: [PATCH v1] Mode-Switching: Add optional EMIT_AFTER hook

Thanks Jeff.

> That implies a save/restore pair around the call (possibly optimized so 
> that we minimize the number of save/restores).  I would have expected 
> x86 to already be doing this.  But maybe there's some ABI thing around 
> mmx vs x86 state that allows it to be avoided

Very similar to save/restore, but optional.
If there is no static rounding mode intrinsic here, it is unnecessary to add a
save/restore pair around the call. I bet mode-switching takes care of this
already.

Pan

-Original Message-
From: Jeff Law  
Sent: Thursday, August 24, 2023 7:27 AM
To: Li, Pan2 ; gcc-patches@gcc.gnu.org
Cc: juzhe.zh...@rivai.ai; Wang, Yanzhang ; 
kito.ch...@gmail.com
Subject: Re: [PATCH v1] Mode-Switching: Add optional EMIT_AFTER hook



On 8/23/23 08:54, Li, Pan2 wrote:
> Thanks Jeff for comments.
> 
>> Understood.  So the natural question is why does x86/sh not need this
>> for its mode switching?   Don't all the same issues exist on those
>> targets as well?
> 
> AFAIK, it comes from the different design principles behind the RISC-V and
> x86/arm intrinsic APIs.
> The RISC-V RVV FP rounding mode intrinsic API sits one abstraction level
> above the insn itself, while the x86/arm API only reflects the semantics of
> the insn.
> 
> For example, if a vector instruction such as VFADD doesn't have a static
> rounding mode (i.e. rm encoded in the insn), x86/arm provide no intrinsic
> API that takes a rounding mode argument. The RISC-V FP vector intrinsics,
> in contrast, always have a static rounding mode API if frm is honored.
> 
> In short, the RISC-V intrinsic API is closer to the end user, while the
> x86/arm intrinsic API is closer to the insn itself.
OK, but I'm still struggling to see how the distinction is important 
here.  Ultimately there's a state at a call site.  We need to make sure 
that state from the current function doesn't impact the callee and we 
need to make sure that the callee doesn't impact the state in the caller.

That implies a save/restore pair around the call (possibly optimized so 
that we minimize the number of save/restores).  I would have expected 
x86 to already be doing this.  But maybe there's some ABI thing around 
mmx vs x86 state that allows it to be avoided

> 
> For the rest part, will have a try based on your suggestion soon as I am in 
> the middle of something.
No problem.  Get to it when you can.  I think it affects you more than 
me :-)

jeff
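
The save/restore pairing discussed in this thread can be sketched with a toy model of the frm CSR. Everything below is hypothetical (struct toy_cpu and the toy_* functions are not GCC or RVV APIs); only the numeric rounding mode encodings follow the RISC-V frm field. The sketch just shows that the surrounding code observes its original mode again after a static-rounding operation.

```c
/* Toy model of the frm CSR; all names are hypothetical.  */
enum toy_frm { TOY_RNE = 0, TOY_RTZ = 1, TOY_RDN = 2, TOY_RUP = 3, TOY_RMM = 4 };

struct toy_cpu { enum toy_frm frm; };

/* Read the current rounding mode (the role "frrm" plays).  */
static enum toy_frm toy_frrm (const struct toy_cpu *cpu)
{
  return cpu->frm;
}

/* Write a rounding mode (the role "fsrm"/"fsrmi" plays).  */
static void toy_fsrm (struct toy_cpu *cpu, enum toy_frm mode)
{
  cpu->frm = mode;
}

/* Save frm, switch to a static MODE, note which mode the
   static-rounding operation would run under, then restore frm so
   later dynamic-mode code is unaffected.  */
enum toy_frm toy_with_static_frm (struct toy_cpu *cpu, enum toy_frm mode)
{
  enum toy_frm saved = toy_frrm (cpu);  /* save */
  toy_fsrm (cpu, mode);                 /* switch to the static mode */
  enum toy_frm seen = toy_frrm (cpu);   /* mode seen by the operation */
  toy_fsrm (cpu, saved);                /* restore */
  return seen;
}
```

If no static rounding mode is ever requested, the save/restore pair is simply not needed, which matches the "optional" remark earlier in the thread.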

RE: [PATCH v1] RISC-V: Support rounding mode for VFNMSAC/VFNMSUB autovec

2023-08-24 Thread Li, Pan2 via Gcc-patches

Thanks Kito, will commit it after VFMADD, VFMSAC.

Pan

-Original Message-
From: Kito Cheng  
Sent: Thursday, August 24, 2023 10:24 PM
To: Li, Pan2 
Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; Wang, Yanzhang 

Subject: Re: [PATCH v1] RISC-V: Support rounding mode for VFNMSAC/VFNMSUB 
autovec

LGTM

On Thu, Aug 24, 2023 at 5:35 PM Pan Li via Gcc-patches
 wrote:
>
> From: Pan Li 
>
> There will be a case like below for intrinsic and autovec combination.
>
> vfadd RTZ   <- intrinsic static rounding
> vfnmsub <- autovec/autovec-opt
>
> The autovec generated vfnmsub should take DYN mode, and the
> frm must be restored before the vfnmsub insn. This patch
> would like to fix this issue by:
>
> * Add the frm operand to the autovec/autovec-opt pattern.
> * Set the frm_mode attr to DYN.
>
> Thus, the frm flow when combining autovec and intrinsic should be:
>
> +
> | frrm  a5
> | ...
> | fsrmi 4
> | vfadd   <- intrinsic static rounding.
> | ...
> | fsrm  a5
> | vfnmsub <- autovec/autovec-opt
> | ...
> +
>
> Signed-off-by: Pan Li 
>
> gcc/ChangeLog:
>
> * config/riscv/autovec-opt.md: Add FRM_REGNUM to vfnmsac/vfnmsub
> * config/riscv/autovec.md: Ditto.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/rvv/base/float-point-frm-autovec-3.c: New test.
> ---
>  gcc/config/riscv/autovec-opt.md   | 34 ---
>  gcc/config/riscv/autovec.md   | 30 ---
>  .../rvv/base/float-point-frm-autovec-3.c  | 88 +++
>  3 files changed, 126 insertions(+), 26 deletions(-)
>  create mode 100644 
> gcc/testsuite/gcc.target/riscv/rvv/base/float-point-frm-autovec-3.c
>
> diff --git a/gcc/config/riscv/autovec-opt.md b/gcc/config/riscv/autovec-opt.md
> index 732a51edacd..54ca6df721c 100644
> --- a/gcc/config/riscv/autovec-opt.md
> +++ b/gcc/config/riscv/autovec-opt.md
> @@ -523,13 +523,15 @@ (define_insn_and_split "*single_widen_fma"
>  ;; vect__13.182_33 = .FNMA (vect__11.180_35, vect__8.176_40, vect__4.172_45);
>  (define_insn_and_split "*double_widen_fnma"
>[(set (match_operand:VWEXTF 0 "register_operand")
> -   (fma:VWEXTF
> - (neg:VWEXTF
> +   (unspec:VWEXTF
> + [(fma:VWEXTF
> +   (neg:VWEXTF
> + (float_extend:VWEXTF
> +   (match_operand: 2 "register_operand")))
> (float_extend:VWEXTF
> - (match_operand: 2 "register_operand")))
> - (float_extend:VWEXTF
> -   (match_operand: 3 "register_operand"))
> - (match_operand:VWEXTF 1 "register_operand")))]
> + (match_operand: 3 "register_operand"))
> +   (match_operand:VWEXTF 1 "register_operand"))
> +  (reg:SI FRM_REGNUM)] UNSPEC_VFFMA))]
>"TARGET_VECTOR && can_create_pseudo_p ()"
>"#"
>"&& 1"
> @@ -540,17 +542,20 @@ (define_insn_and_split "*double_widen_fnma"
>  DONE;
>}
>[(set_attr "type" "vfwmuladd")
> -   (set_attr "mode" "")])
> +   (set_attr "mode" "")
> +   (set (attr "frm_mode") (symbol_ref "riscv_vector::FRM_DYN"))])
>
>  ;; This helps to match ext + fnma.
>  (define_insn_and_split "*single_widen_fnma"
>[(set (match_operand:VWEXTF 0 "register_operand")
> -   (fma:VWEXTF
> - (neg:VWEXTF
> -   (float_extend:VWEXTF
> - (match_operand: 2 "register_operand")))
> - (match_operand:VWEXTF 3 "register_operand")
> - (match_operand:VWEXTF 1 "register_operand")))]
> +   (unspec:VWEXTF
> + [(fma:VWEXTF
> +   (neg:VWEXTF
> + (float_extend:VWEXTF
> +   (match_operand: 2 "register_operand")))
> +   (match_operand:VWEXTF 3 "register_operand")
> +   (match_operand:VWEXTF 1 "register_operand"))
> +  (reg:SI FRM_REGNUM)] UNSPEC_VFFMA))]
>"TARGET_VECTOR && can_create_pseudo_p ()"
>"#"
>"&& 1"
> @@ -567,7 +572,8 @@ (define_insn_and_split "*single_widen_fnma"
>  DONE;
>}
>[(set_attr "type" "vfwmuladd")
> -   (set_attr "mode" "")])
> +   (set_attr "mode" "")
> +   (set (attr "frm_mode") (symbol_ref "riscv_vector::FRM_DYN"))])
>
>  ;; -
>  ;;  [FP] VFWMSAC
> diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md
> index 0c1c546817a..28396c6175d 100644
> --- a/gcc/config/riscv/autovec.md
> +++ b/gcc/config/riscv/autovec.md
> @@ -1174,24 +1174,29 @@ (define_insn_and_split "*fma"
>  (define_expand "fnma4"
>[(parallel
>  [(set (match_operand:VF 0 "register_operand")
> - (fma:VF
> -   (neg:VF
> - (match_operand:VF 1 "register_operand"))
> -   (match_operand:VF 2 "register_operand")
> -   (match_operand:VF 3 "register_operand")))
> + (unspec:VF
> +   [(fma:VF
> + (neg:VF
> +   (match_operand:VF 1 "register_operand"))
> + (match_operand:VF 2 "register_operand")
> + (match_operand:VF 3

RE: [PATCH] RISC-V: Add COND_LEN_FNMA/COND_LEN_FMS/COND_LEN_FNMS testcases

2023-08-24 Thread Li, Pan2 via Gcc-patches

Committed, thanks Robin.

Pan

-----Original Message-----
From: Gcc-patches  On Behalf 
Of Robin Dapp via Gcc-patches
Sent: Thursday, August 24, 2023 7:03 PM
To: 钟居哲 ; gcc-patches 
Cc: rdapp@gmail.com; kito.cheng ; kito.cheng 
; Jeff Law 
Subject: Re: [PATCH] RISC-V: Add COND_LEN_FNMA/COND_LEN_FMS/COND_LEN_FNMS 
testcases

OK.

Regards
 Robin

RE: [PATCH V2] gimple_fold: Support COND_LEN_FNMA/COND_LEN_FMS/COND_LEN_FNMS gimple fold

2023-08-24 Thread Li, Pan2 via Gcc-patches

Committed, thanks Richard.

Pan

-----Original Message-----
From: Gcc-patches  On Behalf 
Of Richard Sandiford via Gcc-patches
Sent: Thursday, August 24, 2023 6:34 PM
To: Juzhe-Zhong 
Cc: gcc-patches@gcc.gnu.org; rguent...@suse.de
Subject: Re: [PATCH V2] gimple_fold: Support 
COND_LEN_FNMA/COND_LEN_FMS/COND_LEN_FNMS gimple fold

Juzhe-Zhong  writes:
> Hi, Richard and Richi.
>
> Currently, GCC supports COND_LEN_FMA for floating-point without -ffast-math.
> It's supported in tree-ssa-math-opts.cc. However, GCC fails to support 
> COND_LEN_FNMA/COND_LEN_FMS/COND_LEN_FNMS.
>
> Consider the following case:
> #define TEST_TYPE(TYPE)                                                  \
>   __attribute__ ((noipa)) void ternop_##TYPE (TYPE *__restrict dst,      \
>                                               TYPE *__restrict a,        \
>                                               TYPE *__restrict b, int n) \
>   {                                                                      \
>     for (int i = 0; i < n; i++)                                          \
>       dst[i] -= a[i] * b[i];                                             \
>   }
>
> #define TEST_ALL()                                                       \
>   TEST_TYPE (float)                                                      \
>
> TEST_ALL ()
>
> Gimple IR for RVV:
>
> ...
> _39 = -vect__8.14_26;
> vect__10.16_21 = .COND_LEN_FMA ({ -1, ... }, vect__6.11_30, _39, 
> vect__4.8_34, vect__4.8_34, _46, 0);
> ...
>
> This is because of the following piece of code in tree-ssa-math-opts.cc:
>
>   if (len)
>   fma_stmt
> = gimple_build_call_internal (IFN_COND_LEN_FMA, 7, cond, mulop1, op2,
>   addop, else_value, len, bias);
>   else if (cond)
>   fma_stmt = gimple_build_call_internal (IFN_COND_FMA, 5, cond, mulop1,
>  op2, addop, else_value);
>   else
>   fma_stmt = gimple_build_call_internal (IFN_FMA, 3, mulop1, op2, addop);
>   gimple_set_lhs (fma_stmt, gimple_get_lhs (use_stmt));
>   gimple_call_set_nothrow (fma_stmt, !stmt_can_throw_internal (cfun,
>  use_stmt));
>   gsi_replace (&gsi, fma_stmt, true);
>   /* Follow all SSA edges so that we generate FMS, FNMA and FNMS
>      regardless of where the negation occurs.  */
>   gimple *orig_stmt = gsi_stmt (gsi);
>   if (fold_stmt (&gsi, follow_all_ssa_edges))
>   {
> if (maybe_clean_or_replace_eh_stmt (orig_stmt, gsi_stmt (gsi)))
>   gcc_unreachable ();
> update_stmt (gsi_stmt (gsi));
>   }
>
> 'fold_stmt' fails to fold NEGATE_EXPR + COND_LEN_FMA into COND_LEN_FNMA.
>
> This patch supports folding the statement into:
>
> vect__10.16_21 = .COND_LEN_FNMA ({ -1, ... }, vect__8.14_26, vect__6.11_30, 
> vect__4.8_34, { 0.0, ... }, _46, 0);
>
> Note that COND_LEN_FNMA has 7 arguments and COND_LEN_ADD has 6 arguments.
>
> Extend maximum num ops:
> -  static const unsigned int MAX_NUM_OPS = 5;
> +  static const unsigned int MAX_NUM_OPS = 7;
>
> Bootstrap and Regtest on X86 passed.
> Tested on aarch64 Qemu.
>
> Fully tested COND_LEN_FNMA/COND_LEN_FMS/COND_LEN_FNMS on RISC-V backend.
>
>
> gcc/ChangeLog:
>
> * genmatch.cc (decision_tree::gen): Support 
> COND_LEN_FNMA/COND_LEN_FMS/COND_LEN_FNMS gimple fold.
> * gimple-match-exports.cc (gimple_simplify): Ditto.
> (gimple_resimplify6): New function.
> (gimple_resimplify7): New function.
> (gimple_match_op::resimplify): Support 
> COND_LEN_FNMA/COND_LEN_FMS/COND_LEN_FNMS gimple fold.
> (convert_conditional_op): Ditto.
> (build_call_internal): Ditto.
> (try_conditional_simplification): Ditto.
> (gimple_extract): Ditto.
> * gimple-match.h (gimple_match_cond::gimple_match_cond): Ditto.
> * internal-fn.cc (CASE): Ditto.

OK, thanks.

Richard
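
As a rough illustration of what the folded call computes, the per-lane behaviour of .COND_LEN_FNMA can be modelled in scalar C++. This is only a sketch under assumptions: the name cond_len_fnma, the use of std::vector, and the choice that lanes at or beyond len + bias take else_value are made for the example and are not taken from the patch. The multiply-subtract below is also not guaranteed to be fused the way a real FNMA instruction is.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical scalar model of a mask- and length-controlled negated
// multiply-add. Active lanes (index below len + bias, mask bit set)
// compute -(a * b) + c; every other lane takes else_value.
std::vector<float> cond_len_fnma(const std::vector<bool>& mask,
                                 const std::vector<float>& a,
                                 const std::vector<float>& b,
                                 const std::vector<float>& c,
                                 const std::vector<float>& else_value,
                                 long len, long bias)
{
  std::vector<float> out(a.size());
  const long active = len + bias;
  for (std::size_t i = 0; i < a.size(); ++i)
    {
      const bool lane_on = static_cast<long>(i) < active && mask[i];
      // Not necessarily a single-rounding fused operation in this model.
      out[i] = lane_on ? c[i] - a[i] * b[i] : else_value[i];
    }
  return out;
}
```

With an all-true mask this matches the dst[i] -= a[i] * b[i] loop from the test case for the first len lanes.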

>
> ---
>  gcc/genmatch.cc |   2 +-
>  gcc/gimple-match-exports.cc | 123 ++--
>  gcc/gimple-match.h  |  16 -
>  gcc/internal-fn.cc  |   7 +-
>  4 files changed, 138 insertions(+), 10 deletions(-)
>
> diff --git a/gcc/genmatch.cc b/gcc/genmatch.cc
> index f46d2e1520d..a1925a747a7 100644
> --- a/gcc/genmatch.cc
> +++ b/gcc/genmatch.cc
> @@ -4052,7 +4052,7 @@ decision_tree::gen (vec  , bool gimple)
>  }
>fprintf (stderr, "removed %u duplicate tails\n", rcnt);
>  
> -  for (unsigned n = 1; n <= 5; ++n)
> +  for (unsigned n = 1; n <= 7; ++n)
>  {
>bool has_kids_p = false;
>  
> diff --git a/gcc/gimple-match-exports.cc b/gcc/gimple-match-exports.cc
> index 7aeb4ddb152..b36027b0bad 100644
> --- a/gcc/gimple-match-exports.cc
> +++ b/gcc/gimple-match-exports.cc
> @@ -60,6 +60,12 @@ extern bool gimple_simplify (gimple_match_op *, gimple_seq 
> *, tree (*)(tree),
>code_helper,

RE: [PATCH] VECT: Apply LEN_FOLD_EXTRACT_LAST into loop vectorizer

2023-08-24 Thread Li, Pan2 via Gcc-patches

Committed, thanks Richard.

Pan

-----Original Message-----
From: Gcc-patches  On Behalf 
Of Richard Biener via Gcc-patches
Sent: Thursday, August 24, 2023 2:39 PM
To: Juzhe-Zhong 
Cc: gcc-patches@gcc.gnu.org; richard.sandif...@arm.com
Subject: Re: [PATCH] VECT: Apply LEN_FOLD_EXTRACT_LAST into loop vectorizer

On Thu, 24 Aug 2023, Juzhe-Zhong wrote:

> Hi.
> 
> This patch applies LEN_FOLD_EXTRACT_LAST in the loop vectorizer.
> 
> Consider the following case:
> #include 
> 
> #define N 32
> 
> /* Simple condition reduction.  */
> 
> int __attribute__ ((noinline, noclone))
> condition_reduction (int *a, int min_v)
> {
>   int last = 66; /* High start value.  */
> 
>   for (int i = 0; i < N; i++)
> if (a[i] < min_v)
>   last = i;
> 
>   return last;
> }
> 
> With this patch, we can generate the following IR:
> 
>   _44 = .SELECT_VL (ivtmp_42, POLY_INT_CST [4, 4]);
>   _34 = vect_vec_iv_.5_33 + { POLY_INT_CST [4, 4], ... };
>   ivtmp_36 = _44 * 4;
>   vect__4.8_39 = .MASK_LEN_LOAD (vectp_a.6_37, 32B, { -1, ... }, _44, 0);
> 
>   mask__11.9_41 = vect__4.8_39 < vect_cst__40;
>   last_5 = .LEN_FOLD_EXTRACT_LAST (last_14, mask__11.9_41, vect_vec_iv_.5_33, 
> _44, 0);
>   ...

LGTM.

Thanks,
Richard.
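
The extract-last reduction in the IR above can be sketched in scalar C++ as follows. The function name len_fold_extract_last and its signature are hypothetical and chosen only for this example; the assumption is that the operation returns the last element among the first len + bias lanes whose mask bit is set, and the fallback value otherwise.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical scalar model: scan the first len + bias lanes and keep
// the value of the last lane whose mask bit is set. If no such lane
// exists, the incoming fallback value is returned unchanged.
int len_fold_extract_last(int fallback, const std::vector<bool>& mask,
                          const std::vector<int>& values, long len, long bias)
{
  int result = fallback;
  const long active = len + bias;
  for (std::size_t i = 0; i < values.size(); ++i)
    if (static_cast<long>(i) < active && mask[i])
      result = values[i];
  return result;
}
```

For condition_reduction, values would be the induction-variable vector and mask the result of a[i] < min_v, so the call yields the last matching index, or the start value 66 when nothing matches.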

> gcc/ChangeLog:
> 
> * tree-vect-loop.cc (vectorizable_reduction): Apply 
> LEN_FOLD_EXTRACT_LAST.
> * tree-vect-stmts.cc (vectorizable_condition): Ditto.
> 
> ---
>  gcc/tree-vect-loop.cc  |  7 --
>  gcc/tree-vect-stmts.cc | 52 --
>  2 files changed, 50 insertions(+), 9 deletions(-)
> 
> diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
> index 1cd6c291377..ebee8037e02 100644
> --- a/gcc/tree-vect-loop.cc
> +++ b/gcc/tree-vect-loop.cc
> @@ -7494,8 +7494,11 @@ vectorizable_reduction (loop_vec_info loop_vinfo,
>   }
>  
>if (reduc_chain_length == 1
> -   && direct_internal_fn_supported_p (IFN_FOLD_EXTRACT_LAST,
> -  vectype_in, OPTIMIZE_FOR_SPEED))
> +   && (direct_internal_fn_supported_p (IFN_FOLD_EXTRACT_LAST, vectype_in,
> +   OPTIMIZE_FOR_SPEED)
> +   || direct_internal_fn_supported_p (IFN_LEN_FOLD_EXTRACT_LAST,
> +  vectype_in,
> +  OPTIMIZE_FOR_SPEED)))
>   {
> if (dump_enabled_p ())
>   dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
> diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
> index 413a88750d6..be9f3a280bd 100644
> --- a/gcc/tree-vect-stmts.cc
> +++ b/gcc/tree-vect-stmts.cc
> @@ -11740,8 +11740,17 @@ vectorizable_condition (vec_info *vinfo,
> && LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P (loop_vinfo))
>   {
> if (reduction_type == EXTRACT_LAST_REDUCTION)
> - vect_record_loop_mask (loop_vinfo, &LOOP_VINFO_MASKS (loop_vinfo),
> -ncopies * vec_num, vectype, NULL);
> + {
> +   if (direct_internal_fn_supported_p (IFN_LEN_FOLD_EXTRACT_LAST,
> +   vectype, OPTIMIZE_FOR_SPEED))
> + vect_record_loop_len (loop_vinfo,
> +   &LOOP_VINFO_LENS (loop_vinfo),
> +   ncopies * vec_num, vectype, 1);
> +   else
> + vect_record_loop_mask (loop_vinfo,
> +&LOOP_VINFO_MASKS (loop_vinfo),
> +ncopies * vec_num, vectype, NULL);
> + }
> /* Extra inactive lanes should be safe for vect_nested_cycle.  */
> else if (STMT_VINFO_DEF_TYPE (reduc_info) != vect_nested_cycle)
>   {
> @@ -11772,7 +11781,13 @@ vectorizable_condition (vec_info *vinfo,
>   mask to the condition, or to its inverse.  */
>  
>vec_loop_masks *masks = NULL;
> -  if (loop_vinfo && LOOP_VINFO_FULLY_MASKED_P (loop_vinfo))
> +  vec_loop_lens *lens = NULL;
> +  if (loop_vinfo && LOOP_VINFO_FULLY_WITH_LENGTH_P (loop_vinfo))
> +{
> +  if (reduction_type == EXTRACT_LAST_REDUCTION)
> + lens = &LOOP_VINFO_LENS (loop_vinfo);
> +}
> +  else if (loop_vinfo && LOOP_VINFO_FULLY_MASKED_P (loop_vinfo))
>  {
>if (reduction_type == EXTRACT_LAST_REDUCTION)
>   masks = &LOOP_VINFO_MASKS (loop_vinfo);
> @@ -11910,7 +11925,8 @@ vectorizable_condition (vec_info *vinfo,
>/* Force vec_compare to be an SSA_NAME rather than a comparison,
>in cases where that's necessary.  */
>  
> -  if (masks || reduction_type == EXTRACT_LAST_REDUCTION)
> +  tree len = NULL_TREE, bias = NULL_TREE;
> +  if (masks || lens || reduction_type == EXTRACT_LAST_REDUCTION)
>   {
> if (!is_gimple_val (vec_compare))
>   {
> @@ -11931,6 +11947,23 @@ vectorizable_condition (vec_info *vinfo,
> vec_compare = vec_compare_name;
>   }
>  
> +   if (direct_internal_fn_supported_p (IFN_LEN_FOLD_EXTRACT_LAST,
> +

RE: [PATCH v2] RISC-V: Fix one typo in autovec.md pattern comment

2023-08-24 Thread Li, Pan2 via Gcc-patches

Committed, thanks Juzhe.

Pan

From: 钟居哲 
Sent: Thursday, August 24, 2023 4:37 PM
To: Li, Pan2 ; gcc-patches 
Cc: Li, Pan2 ; Wang, Yanzhang ; 
kito.cheng 
Subject: Re: [PATCH v2] RISC-V: Fix one typo in autovec.md pattern comment

LGTM


juzhe.zh...@rivai.ai

From: pan2.li
Date: 2023-08-24 16:14
To: gcc-patches
CC: juzhe.zhong; 
pan2.li; 
yanzhang.wang; 
kito.cheng
Subject: [PATCH v2] RISC-V: Fix one typo in autovec.md pattern comment
From: Pan Li <pan2...@intel.com>

vfmsac => vfnmacc
vfmsub => vfnmadd

Signed-off-by: Pan Li <pan2...@intel.com>

gcc/ChangeLog:

* config/riscv/autovec.md: Fix typo.
---
gcc/config/riscv/autovec.md | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md
index e1addc07036..e9659b2b157 100644
--- a/gcc/config/riscv/autovec.md
+++ b/gcc/config/riscv/autovec.md
@@ -1253,11 +1253,11 @@ (define_insn_and_split "*fms"
(set_attr "mode" "")])
;; -
-;;  [FP] VFMSAC and VFMSUB
+;;  [FP] VFNMACC and VFNMADD
;; -
;; Includes:
-;; - vfmsac
-;; - vfmsub
+;; - vfnmacc
+;; - vfnmadd
;; -
(define_expand "fnms4"
--
2.34.1

RE: [PATCH v1] RISC-V: Fix one typo in autovec.md pattern comment

2023-08-24 Thread Li, Pan2 via Gcc-patches

Thanks Kito. It looks like this needs some additional changes; I will send a V2 for this.

Pan

From: Kito Cheng 
Sent: Thursday, August 24, 2023 3:44 PM
To: Li, Pan2 
Cc: GCC Patches ; 钟居哲 ; Wang, 
Yanzhang 
Subject: Re: [PATCH v1] RISC-V: Fix one typo in autovec.md pattern comment

LGTM

Pan Li via Gcc-patches <gcc-patches@gcc.gnu.org> wrote on Thu, Aug 24, 2023 at 15:41:
From: Pan Li <pan2...@intel.com>

Fix below typo for the pattern comment.

vfmsac => vfnmsac
vfmsub => vfnmsub

Signed-off-by: Pan Li <pan2...@intel.com>

gcc/ChangeLog:

* config/riscv/autovec.md: Fix typo.
---
 gcc/config/riscv/autovec.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md
index d9f1a10eb66..18950ac7c4f 100644
--- a/gcc/config/riscv/autovec.md
+++ b/gcc/config/riscv/autovec.md
@@ -1258,11 +1258,11 @@ (define_insn_and_split "*fms"
(set (attr "frm_mode") (symbol_ref "riscv_vector::FRM_DYN"))])

 ;; -
-;;  [FP] VFMSAC and VFMSUB
+;;  [FP] VFNMSAC and VFNMSUB
 ;; -
 ;; Includes:
-;; - vfmsac
-;; - vfmsub
+;; - vfnmsac
+;; - vfnmsub
 ;; -

 (define_expand "fnms4"
--
2.34.1

RE: [PATCH v2] RISC-V: Refactor RVV class by frm_op_type template arg

2023-08-24 Thread Li, Pan2 via Gcc-patches

Committed, thanks Kito.

Pan

From: Kito Cheng 
Sent: Thursday, August 24, 2023 3:44 PM
To: Li, Pan2 
Cc: GCC Patches ; 钟居哲 ; Robin 
Dapp ; Jeff Law ; Wang, Yanzhang 

Subject: Re: [PATCH v2] RISC-V: Refactor RVV class by frm_op_type template arg

LGTM

Pan Li via Gcc-patches <gcc-patches@gcc.gnu.org> wrote on Tue, Aug 22, 2023 at 12:20:
From: Pan Li <pan2...@intel.com>

Update in v2:

* Added gcc_assert for vx format in binop.
* Passed riscv/rvv.exp test.

Original Log:

As suggested by Kito, we will add a new frm_op_type template arg
to the op class, to avoid the duplicated expand function.

Signed-off-by: Pan Li <pan2...@intel.com>

gcc/ChangeLog:

* config/riscv/riscv-vector-builtins-bases.cc
(class binop_frm): Removed.
(class reverse_binop_frm): Ditto.
(class widen_binop_frm): Ditto.
(class vfmacc_frm): Ditto.
(class vfnmacc_frm): Ditto.
(class vfmsac_frm): Ditto.
(class vfnmsac_frm): Ditto.
(class vfmadd_frm): Ditto.
(class vfnmadd_frm): Ditto.
(class vfmsub_frm): Ditto.
(class vfnmsub_frm): Ditto.
(class vfwmacc_frm): Ditto.
(class vfwnmacc_frm): Ditto.
(class vfwmsac_frm): Ditto.
(class vfwnmsac_frm): Ditto.
(class unop_frm): Ditto.
(class vfrec7_frm): Ditto.
(class binop): Add frm_op_type template arg.
(class unop): Ditto.
(class widen_binop): Ditto.
(class widen_binop_fp): Ditto.
(class reverse_binop): Ditto.
(class vfmacc): Ditto.
(class vfnmsac): Ditto.
(class vfmadd): Ditto.
(class vfnmsub): Ditto.
(class vfnmacc): Ditto.
(class vfmsac): Ditto.
(class vfnmadd): Ditto.
(class vfmsub): Ditto.
(class vfwmacc): Ditto.
(class vfwnmacc): Ditto.
(class vfwmsac): Ditto.
(class vfwnmsac): Ditto.
(class float_misc): Ditto.
---
 .../riscv/riscv-vector-builtins-bases.cc  | 571 +-
 1 file changed, 143 insertions(+), 428 deletions(-)

diff --git a/gcc/config/riscv/riscv-vector-builtins-bases.cc 
b/gcc/config/riscv/riscv-vector-builtins-bases.cc
index 5ee7d3119db..54582ee130c 100644
--- a/gcc/config/riscv/riscv-vector-builtins-bases.cc
+++ b/gcc/config/riscv/riscv-vector-builtins-bases.cc
@@ -262,41 +262,21 @@ public:
vremu/vsadd/vsaddu/vssub/vssubu
vfadd/vfsub/
 */
-template
+template
 class binop : public function_base
 {
 public:
-  rtx expand (function_expander &e) const override
+  bool has_rounding_mode_operand_p () const override
   {
-switch (e.op_info->op)
-  {
-  case OP_TYPE_vx:
-  case OP_TYPE_vf:
-   return e.use_exact_insn (code_for_pred_scalar (CODE, e.vector_mode ()));
-  case OP_TYPE_vv:
-   return e.use_exact_insn (code_for_pred (CODE, e.vector_mode ()));
-  default:
-   gcc_unreachable ();
-  }
+return FRM_OP == HAS_FRM;
   }
-};
-
-/* Implements below instructions for now.
-   - vfadd
-   - vfsub
-   - vfmul
-   - vfdiv
-*/
-template
-class binop_frm : public function_base
-{
-public:
-  bool has_rounding_mode_operand_p () const override { return true; }

   rtx expand (function_expander ) const override
   {
 switch (e.op_info->op)
   {
+  case OP_TYPE_vx:
+   gcc_assert (FRM_OP == NO_FRM);
   case OP_TYPE_vf:
return e.use_exact_insn (code_for_pred_scalar (CODE, e.vector_mode ()));
   case OP_TYPE_vv:
@@ -307,365 +287,6 @@ public:
   }
 };

-/* Implements below instructions for frm
-   - vfrsub
-   - vfrdiv
-*/
-template
-class reverse_binop_frm : public function_base
-{
-public:
-  bool has_rounding_mode_operand_p () const override { return true; }
-
-public:
-  rtx expand (function_expander &e) const override
-  {
-return e.use_exact_insn (
-  code_for_pred_reverse_scalar (CODE, e.vector_mode ()));
-  }
-};
-
-/* Implements below instructions for frm
-   - vfwadd
-   - vfwsub
-   - vfwmul
-*/
-template
-class widen_binop_frm : public function_base
-{
-public:
-  bool has_rounding_mode_operand_p () const override { return true; }
-
-  rtx expand (function_expander &e) const override
-  {
-switch (e.op_info->op)
-  {
-  case OP_TYPE_vv:
-   return e.use_exact_insn (
- code_for_pred_dual_widen (CODE, e.vector_mode ()));
-  case OP_TYPE_vf:
-   return e.use_exact_insn (
- code_for_pred_dual_widen_scalar (CODE, e.vector_mode ()));
-  case OP_TYPE_wv:
-   if (CODE == PLUS)
- return e.use_exact_insn (
-   code_for_pred_single_widen_add (e.vector_mode ()));
-   else
- return e.use_exact_insn (
-   code_for_pred_single_widen_sub (e.vector_mode ()));
-  case OP_TYPE_wf:
-   return e.use_exact_insn (
- code_for_pred_single_widen_scalar (CODE, e.vector_mode ()));
-  default:
-   gcc_unreachable ();
-  }
-  }
-};
-
-/* Implements below instructions for frm
-   - vfmacc
-*/
-class vfmacc_frm

RE: [PATCH v1] RISC-V: Refactor RVV class by frm_op_type template arg

2023-08-23 Thread Li, Pan2 via Gcc-patches

> So in the expand method, you added a case for OP_TYPE_vx. 

Actually this patch doesn't add a case for OP_TYPE_vx. Before this patch there 
were two classes, binop_frm and binop.
binop_frm doesn't have OP_TYPE_vx while binop does. When the whole binop_frm 
class is deleted, the git diff makes it
look like a case for OP_TYPE_vx was added, but it actually was not.

As Jeff pre-approved, I will commit the v2 (with the gcc_assert suggested by 
Kito) around the end of this week if there are no more comments.

Pan
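
The idea behind the refactor, one class template with an extra frm parameter instead of a parallel *_frm class per operation, can be sketched in standalone C++. Everything suffixed _model, the pattern_kind enum, and the int CODE parameter are hypothetical stand-ins invented for this example; the real classes derive from function_base and return insns through the function expander.

```cpp
#include <cassert>
#include <cstdlib>

enum frm_op_type { NO_FRM, HAS_FRM };
enum op_type { OP_TYPE_vx, OP_TYPE_vf, OP_TYPE_vv };
enum pattern_kind { PRED_SCALAR, PRED };  // stand-in for the chosen insn code

// Hypothetical stand-in for the builtin base class.
struct function_base_model
{
  virtual ~function_base_model() = default;
  virtual bool has_rounding_mode_operand_p() const { return false; }
  virtual pattern_kind expand(op_type op) const = 0;
};

// One template covers both the plain and the rounding-mode variant,
// so the expand logic is written once.
template <int CODE, frm_op_type FRM_OP = NO_FRM>
struct binop_model : function_base_model
{
  bool has_rounding_mode_operand_p() const override
  {
    return FRM_OP == HAS_FRM;
  }

  pattern_kind expand(op_type op) const override
  {
    switch (op)
      {
      case OP_TYPE_vx:
        // Mirrors the requested assertion: the vx form never carries frm.
        assert(FRM_OP == NO_FRM);
        return PRED_SCALAR;
      case OP_TYPE_vf:
        return PRED_SCALAR;
      case OP_TYPE_vv:
        return PRED;
      }
    std::abort();  // unreachable for valid op_type values
  }
};
```

Instantiating binop_model<1> gives the variant without a rounding mode operand and binop_model<1, HAS_FRM> the one with it, which is why the diff appears to add OP_TYPE_vx when it only merges the two classes.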

-----Original Message-----
From: Gcc-patches  On Behalf 
Of Li, Pan2 via Gcc-patches
Sent: Tuesday, August 22, 2023 8:10 AM
To: Kito Cheng ; Jeff Law 
Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; Wang, Yanzhang 

Subject: RE: [PATCH v1] RISC-V: Refactor RVV class by frm_op_type template arg

Thanks Kito and Jeff for comments, will double check and address the comment in 
v2.

Pan

-----Original Message-----
From: Kito Cheng  
Sent: Monday, August 21, 2023 11:07 PM
To: Jeff Law 
Cc: Li, Pan2 ; gcc-patches@gcc.gnu.org; 
juzhe.zh...@rivai.ai; Wang, Yanzhang 
Subject: Re: [PATCH v1] RISC-V: Refactor RVV class by frm_op_type template arg

Just one nit from me: please add an assertion to OP_TYPE_vx to make sure
there is no FRM_OP == HAS_FRM there

On Mon, Aug 21, 2023 at 11:04 PM Jeff Law via Gcc-patches
 wrote:
>
>
>
> On 8/17/23 20:53, Pan Li via Gcc-patches wrote:
> > From: Pan Li 
> >
> > As suggested by Kito, we will add a new frm_op_type template arg
> > to the op class, to avoid the duplicated expand function.
> >
> > Signed-off-by: Pan Li 
> >
> > gcc/ChangeLog:
> >
> >   * config/riscv/riscv-vector-builtins-bases.cc
> >   (class binop_frm): Removed.
> >   (class reverse_binop_frm): Ditto.
> >   (class widen_binop_frm): Ditto.
> >   (class vfmacc_frm): Ditto.
> >   (class vfnmacc_frm): Ditto.
> >   (class vfmsac_frm): Ditto.
> >   (class vfnmsac_frm): Ditto.
> >   (class vfmadd_frm): Ditto.
> >   (class vfnmadd_frm): Ditto.
> >   (class vfmsub_frm): Ditto.
> >   (class vfnmsub_frm): Ditto.
> >   (class vfwmacc_frm): Ditto.
> >   (class vfwnmacc_frm): Ditto.
> >   (class vfwmsac_frm): Ditto.
> >   (class vfwnmsac_frm): Ditto.
> >   (class unop_frm): Ditto.
> >   (class vfrec7_frm): Ditto.
> >   (class binop): Add frm_op_type template arg.
> >   (class unop): Ditto.
> >   (class widen_binop): Ditto.
> >   (class widen_binop_fp): Ditto.
> >   (class reverse_binop): Ditto.
> >   (class vfmacc): Ditto.
> >   (class vfnmsac): Ditto.
> >   (class vfmadd): Ditto.
> >   (class vfnmsub): Ditto.
> >   (class vfnmacc): Ditto.
> >   (class vfmsac): Ditto.
> >   (class vfnmadd): Ditto.
> >   (class vfmsub): Ditto.
> >   (class vfwmacc): Ditto.
> >   (class vfwnmacc): Ditto.
> >   (class vfwmsac): Ditto.
> >   (class vfwnmsac): Ditto.
> >   (class float_misc): Ditto.
> So in the expand method, you added a case for OP_TYPE_vx.  I assume that
> was intentional -- but it's not mentioned anywhere in the ChangeLog.  So
> please update the ChangeLog if it was intentional or remove the change
> if it wasn't intentional.  Pre-approved with whichever change is
> appropriate.
>
> Thanks,
> Jeff

RE: [PATCH v1] Mode-Switching: Add optional EMIT_AFTER hook

2023-08-23 Thread Li, Pan2 via Gcc-patches

Thanks Jeff.

> That implies a save/restore pair around the call (possibly optimized so 
> that we minimize the number of save/restores).  I would have expected 
> x86 to already be doing this.  But maybe there's some ABI thing around 
> mmx vs x86 state that allows it to be avoided

Very similar to save/restore, but optional.
If there is no static rounding mode intrinsic here, it is unnecessary to add 
a save/restore
pair around the call. I bet mode-switching takes care of this already.

Pan

-----Original Message-----
From: Jeff Law  
Sent: Thursday, August 24, 2023 7:27 AM
To: Li, Pan2 ; gcc-patches@gcc.gnu.org
Cc: juzhe.zh...@rivai.ai; Wang, Yanzhang ; 
kito.ch...@gmail.com
Subject: Re: [PATCH v1] Mode-Switching: Add optional EMIT_AFTER hook



On 8/23/23 08:54, Li, Pan2 wrote:
> Thanks Jeff for comments.
> 
>> Understood.  So the natural question is why does x86/sh not need this
>> for its mode switching?   Don't all the same issues exist on those
>> targets as well?
> 
> AFAIK, it comes from the different design principles of the RISC-V and 
> x86/arm intrinsic APIs.
> The RISC-V RVV FP rounding mode intrinsic API sits one abstraction level 
> above the insn itself, while
> the x86/arm API only expresses the semantics of the insn.
> 
> For example, if a vector instruction such as VFADD doesn't have a static 
> rounding mode (i.e. rm encoded in the insn),
> x86/arm has no intrinsic API that takes a rounding mode argument, 
> while the RISC-V FP
> vector intrinsics always have a static rounding mode API if the frm is 
> honored.
> 
> In short, the RISC-V intrinsic API is closer to the end user, while the 
> x86/arm intrinsic API is closer to the insn itself.
OK, but I'm still struggling to see how the distinction is important 
here.  Ultimately there's a state at a call site.  We need to make sure 
that state from the current function doesn't impact the callee and we 
need to make sure that the callee doesn't impact the state in the caller.

That implies a save/restore pair around the call (possibly optimized so 
that we minimize the number of save/restores).  I would have expected 
x86 to already be doing this.  But maybe there's some ABI thing around 
mmx vs x86 state that allows it to be avoided

> 
> For the rest part, will have a try based on your suggestion soon as I am in 
> the middle of something.
No problem.  Get to it when you can.  I think it affects you more than 
me :-)

jeff

RE: [PATCH v1] Mode-Switching: Add optional EMIT_AFTER hook

2023-08-23 Thread Li, Pan2 via Gcc-patches

Thanks Jeff for comments.

> Understood.  So the natural question is why does x86/sh not need this 
> for its mode switching?   Don't all the same issues exist on those 
> targets as well?

AFAIK, it comes from the different design principles of the RISC-V and 
x86/arm intrinsic APIs.
The RISC-V RVV FP rounding mode intrinsic API sits one abstraction level above 
the insn itself, while
the x86/arm API only expresses the semantics of the insn.

For example, if a vector instruction such as VFADD doesn't have a static 
rounding mode (i.e. rm encoded in the insn),
x86/arm has no intrinsic API that takes a rounding mode argument, 
while the RISC-V FP
vector intrinsics always have a static rounding mode API if the frm is 
honored.

In short, the RISC-V intrinsic API is closer to the end user, while the x86/arm 
intrinsic API is closer to the insn itself.

For the rest, I will give it a try based on your suggestion soon, as I am in 
the middle of something.

Pan

-----Original Message-----
From: Jeff Law  
Sent: Wednesday, August 23, 2023 10:25 PM
To: Li, Pan2 ; gcc-patches@gcc.gnu.org
Cc: juzhe.zh...@rivai.ai; Wang, Yanzhang ; 
kito.ch...@gmail.com
Subject: Re: [PATCH v1] Mode-Switching: Add optional EMIT_AFTER hook



On 8/23/23 00:03, Li, Pan2 wrote:
> Thanks Jeff for the comments, and sorry for the late response.
> 
> The background comes from the CALL insn. For the RISC-V dynamic rounding mode 
> we need to:
> 
> 1. Restore the frm BEFORE the call, so that the static rounding mode does 
> not pollute the call.
> 2. Back up the frm AFTER the call, to ensure the frm value after the call is live.
> 
> Currently we don't handle this elegantly, and we would like to refine 
> this part with the optional EMIT_AFTER hook.
Understood.  So the natural question is why does x86/sh not need this 
for its mode switching?   Don't all the same issues exist on those 
targets as well?

> 
>> I'm not aware of a case where we can have an insn with control flow that
>> isn't the end of the block.  So perhaps turn that second conditional
>> into an assertion inside the true arm?
> 
> I'm not sure my understanding is correct, but there may be a call insn in 
> the middle of the bb.
> Can that be considered control flow?
In the case where the call is control flow, then it'll end the block. 
Examples of this would be if the call could throw or perform a nonlocal 
goto.  For "normal" calls, they are not considered control flow and can 
show up in the middle of a block.

> 
>> Is this really correct for EDGE_ABNORMAL?  If the abnormal edge is
>> created by, say a nonlocal goto, exception handling, etc, then the insn
>> you insert at the end of the block will never be executed.
> 
> Got it, let me give this a try, and also check whether some code 
> elsewhere takes care of this already.
You might also peek at the RTL gcse/pre code which is also LCM based and 
has the same class of problems.

jeff

RE: [PATCH v1] Mode-Switching: Add optional EMIT_AFTER hook

2023-08-23 Thread Li, Pan2 via Gcc-patches

Thanks Jeff for the comments, and sorry for the late response.

The background comes from the CALL insn. For the RISC-V dynamic rounding mode 
we need to:

1. Restore the frm BEFORE the call, so that the static rounding mode does not 
pollute the call.
2. Back up the frm AFTER the call, to ensure the frm value after the call is live.

Currently we don't handle this elegantly, and we would like to refine this 
part with the optional EMIT_AFTER hook.

> I'm not aware of a case where we can have an insn with control flow that 
> isn't the end of the block.  So perhaps turn that second conditional 
> into an assertion inside the true arm?

I'm not sure my understanding is correct, but there may be a call insn in the 
middle of the bb.
Can that be considered control flow?

> Is this really correct for EDGE_ABNORMAL?  If the abnormal edge is 
> created by, say a nonlocal goto, exception handling, etc, then the insn 
> you insert at the end of the block will never be executed.

Got it, let me give this a try, and also check whether some code elsewhere 
takes care of this already.

Pan
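
The two rules above can be illustrated with a toy model in which a plain global stands in for the frm CSR. This is only a sketch under assumptions: the variable frm, the functions callee_sets_rup and caller_with_static_rounding, and the bookkeeping variable observed_by_callee are all invented for the example, and real code would use frrm/fsrm instructions emitted by the compiler rather than assignments. The enum values follow the RISC-V frm encoding.

```cpp
// RISC-V frm field encodings.
enum frm_mode { RNE = 0, RTZ = 1, RDN = 2, RUP = 3, RMM = 4 };

unsigned frm = RNE;               // toy stand-in for the frm CSR
unsigned observed_by_callee = 0;  // what the callee saw on entry

// A callee that reads the dynamic rounding mode and then changes it.
void callee_sets_rup()
{
  observed_by_callee = frm;
  frm = RUP;
}

// A caller that mixes static-rounding operations with a call.
void caller_with_static_rounding(void (*callee)())
{
  unsigned backup = frm;  // like frrm: save the dynamic mode
  frm = RTZ;              // like fsrmi: static rounding for an intrinsic
  /* ... operation with static RTZ rounding ... */

  frm = backup;           // rule 1: restore before the call
  callee();
  backup = frm;           // rule 2: back up after the call

  frm = RTZ;              // another static-rounding operation
  /* ... operation with static RTZ rounding ... */
  frm = backup;           // dynamic mode is live again afterwards
}
```

In this model the callee never sees RTZ, and whatever mode the callee leaves behind is what the caller treats as the dynamic mode after the call.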


-----Original Message-----
From: Jeff Law  
Sent: Monday, August 21, 2023 10:24 PM
To: Li, Pan2 ; gcc-patches@gcc.gnu.org
Cc: juzhe.zh...@rivai.ai; Wang, Yanzhang ; 
kito.ch...@gmail.com
Subject: Re: [PATCH v1] Mode-Switching: Add optional EMIT_AFTER hook



On 8/21/23 01:26, pan2...@intel.com wrote:
> From: Pan Li 
> 
> We already have an EMIT hook in mode switching, which inserts the
> insn before in most cases. However, some architectures such as RISC-V
> require an additional insn to be inserted after when meeting a call.
> 
> |
> | <- EMIT HOOK, insert the insn before.
>   +---+
>   | ptr->insn |
>   +---+
> | <- EMIT_AFTER HOOK, insert the insn after.
> |
> 
> Thus, this patch adds one optional EMIT_AFTER hook, which
> will try to insert the emitted insn after. The end-user can either
> implement this hook or leave it NULL as is.
> 
> If the backend ignores this optional hook, there is no impact on the
> original mode switching behavior. If the backend implements this optional
> hook, mode switching will try to insert the insn after. Please note that
> EMIT_AFTER doesn't have any impact on the EMIT hook.
> 
> Passed both the regression and bootstrap tests on x86.
> 
> Signed-off-by: Pan Li 
> 
> gcc/ChangeLog:
> 
>   * doc/tm.texi: Add hook def and update the description.
>   * doc/tm.texi.in: Ditto.
>   * mode-switching.cc (optimize_mode_switching): Insert the
>   emitted insn after ptr->insn.
>   * target.def (insn): Define emit_after hook.
Not a full review.  I think I need to know a bit more about why you need 
these additional hooks.

Presumably you can't use the current ".emit" hook because it doesn't 
give you access to the block or insn that you can then iterate on for 
insertion on the outgoing edges?



> @@ -831,6 +833,49 @@ optimize_mode_switching (void)
>   emit_insn_before (mode_set, ptr->insn_ptr);
>   }
>   
> +   if (targetm.mode_switching.emit_after)
> + {
> +   if (control_flow_insn_p (ptr->insn_ptr)
> + && ptr->insn_ptr == BB_END (bb))
I'm not aware of a case where we can have an insn with control flow that 
isn't the end of the block.  So perhaps turn that second conditional 
into an assertion inside the true arm?


> + {
> +   edge eg;
> +   edge_iterator eg_iterator;
> +
> +   FOR_EACH_EDGE (eg, eg_iterator, bb->succs)
> + {
> +   start_sequence ();
> +   targetm.mode_switching.emit_after (entity_map[j],
> + ptr->mode, cur_mode, ptr->regs_live);
> +   mode_set = get_insns ();
> +   end_sequence ();
> +
> +   if (mode_set != NULL_RTX)
> + {
> +   if (eg->flags & EDGE_ABNORMAL)
> + insert_insn_end_basic_block (mode_set, bb);
> +   else
> + insert_insn_on_edge (mode_set, eg);
Is this really correct for EDGE_ABNORMAL?  If the abnormal edge is 
created by, say a nonlocal goto, exception handling, etc, then the insn 
you insert at the end of the block will never be executed.

This is a classic problem with these classes of algorithms and I suspect 
there's code elsewhere to deal with these cases.



Jeff
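
For readers following the thread, here is a minimal sketch of the optional-hook
idea from the patch description, written in plain C rather than against GCC
internals. Everything in it (struct mode_hooks, emit_before_insn,
emit_after_insn, switch_mode_at, and the string log) is a hypothetical stand-in
and not the actual targetm.mode_switching interface. It only illustrates that a
NULL emit_after leaves the existing behaviour untouched, and it does not model
the edge insertion or the EDGE_ABNORMAL question raised above.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a target hook table: emit is mandatory,
   emit_after is optional and may be left NULL.  */
struct mode_hooks
{
  void (*emit) (char *log, int mode);
  void (*emit_after) (char *log, int mode);
};

/* Hypothetical "before" hook: records a mode-set marker in the log.  */
static void
emit_before_insn (char *log, int mode)
{
  char buf[32];
  snprintf (buf, sizeof buf, "set(%d);", mode);
  strcat (log, buf);
}

/* Hypothetical "after" hook: records a restore marker in the log.  */
static void
emit_after_insn (char *log, int mode)
{
  char buf[32];
  snprintf (buf, sizeof buf, "restore(%d);", mode);
  strcat (log, buf);
}

/* Model of processing one insn: always call emit before the insn,
   and call emit_after only when the backend provided it.  */
static void
switch_mode_at (const struct mode_hooks *hooks, char *log, int mode)
{
  hooks->emit (log, mode);
  strcat (log, "insn;");
  if (hooks->emit_after)
    hooks->emit_after (log, mode);
}
```

With a table of { emit_before_insn, NULL } the log contains only the "before"
marker and the insn. Filling in emit_after_insn appends the "after" marker
without changing what emit produced.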

RE: [PATCH] VECT: Add LEN_FOLD_EXTRACT_LAST pattern

2023-08-22 Thread Li, Pan2 via Gcc-patches

Committed as passed both the regression and bootstrap tests in x86, thanks 
Richard.

Pan

-Original Message-
From: Gcc-patches  On Behalf 
Of Richard Biener via Gcc-patches
Sent: Tuesday, August 22, 2023 7:08 PM
To: Juzhe-Zhong 
Cc: gcc-patches@gcc.gnu.org; richard.sandif...@arm.com
Subject: Re: [PATCH] VECT: Add LEN_FOLD_EXTRACT_LAST pattern

On Tue, 22 Aug 2023, Juzhe-Zhong wrote:

> Hi, Richard and Richi.
> 
> This is the last autovec pattern I want to add for RVV (length loop control).
> 
> This patch is supposed to handle the following case:
> 
> int __attribute__ ((noinline, noclone))
> condition_reduction (int *a, int min_v, int n)
> {
>   int last = 66; /* High start value.  */
> 
>   for (int i = 0; i < n; i++)
> if (a[i] < min_v)
>   last = i;
> 
>   return last;
> }
> 
> ARM SVE IR:
> 
>   ...
>   mask__7.11_39 = vect__4.10_37 < vect_cst__38;
>   _40 = loop_mask_36 & mask__7.11_39;
>   last_5 = .FOLD_EXTRACT_LAST (last_15, _40, vect_vec_iv_.7_32);
>   ...
> 
> RVV IR, we want to see:
>  ...
>  loop_len = SELECT_VL
>  mask__7.11_39 = vect__4.10_37 < vect_cst__38;
>  last_5 = .LEN_FOLD_EXTRACT_LAST (last_15, _40, vect_vec_iv_.7_32, loop_len, bias);
>  ...

OK.

Richard.

> gcc/ChangeLog:
> 
>   * doc/md.texi: Add LEN_FOLD_EXTRACT_LAST pattern.
>   * internal-fn.cc (fold_len_extract_direct): Ditto.
>   (expand_fold_len_extract_optab_fn): Ditto.
>   (direct_fold_len_extract_optab_supported_p): Ditto.
>   * internal-fn.def (LEN_FOLD_EXTRACT_LAST): Ditto.
> 
> ---
>  gcc/doc/md.texi | 6 ++
>  gcc/internal-fn.cc  | 5 +
>  gcc/internal-fn.def | 3 +++
>  3 files changed, 14 insertions(+)
> 
> diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
> index 89562fdb43c..24453693d89 100644
> --- a/gcc/doc/md.texi
> +++ b/gcc/doc/md.texi
> @@ -5636,6 +5636,12 @@ has mode @var{m} and operands 0 and 1 have the mode 
> appropriate for
>  one element of @var{m}.  Operand 2 has the usual mask mode for vectors
>  of mode @var{m}; see @code{TARGET_VECTORIZE_GET_MASK_MODE}.
>  
> +@cindex @code{len_fold_extract_last_@var{m}} instruction pattern
> +@item @code{len_fold_extract_last_@var{m}}
> +Like @samp{fold_extract_last_@var{m}}, but takes an extra length operand as
> +operand 4 and an extra bias operand as operand 5.  The last associated element
> +is extracted should have the index i < len (operand 4) + bias (operand 5).
> +
>  @cindex @code{fold_left_plus_@var{m}} instruction pattern
>  @item @code{fold_left_plus_@var{m}}
>  Take scalar operand 1 and successively add each element from vector
> diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc
> index 314f63b614b..4138cc31d7e 100644
> --- a/gcc/internal-fn.cc
> +++ b/gcc/internal-fn.cc
> @@ -188,6 +188,7 @@ init_internal_fns ()
>  #define cond_len_ternary_direct { 1, 1, true }
>  #define while_direct { 0, 2, false }
>  #define fold_extract_direct { 2, 2, false }
> +#define fold_len_extract_direct { 2, 2, false }
>  #define fold_left_direct { 1, 1, false }
>  #define mask_fold_left_direct { 1, 1, false }
>  #define mask_len_fold_left_direct { 1, 1, false }
> @@ -3863,6 +3864,9 @@ expand_convert_optab_fn (internal_fn fn, gcall *stmt, 
> convert_optab optab,
>  #define expand_fold_extract_optab_fn(FN, STMT, OPTAB) \
>expand_direct_optab_fn (FN, STMT, OPTAB, 3)
>  
> +#define expand_fold_len_extract_optab_fn(FN, STMT, OPTAB) \
> +  expand_direct_optab_fn (FN, STMT, OPTAB, 5)
> +
>  #define expand_fold_left_optab_fn(FN, STMT, OPTAB) \
>expand_direct_optab_fn (FN, STMT, OPTAB, 2)
>  
> @@ -3980,6 +3984,7 @@ multi_vector_optab_supported_p (convert_optab optab, 
> tree_pair types,
>  #define direct_mask_len_store_optab_supported_p convert_optab_supported_p
>  #define direct_while_optab_supported_p convert_optab_supported_p
>  #define direct_fold_extract_optab_supported_p direct_optab_supported_p
> +#define direct_fold_len_extract_optab_supported_p direct_optab_supported_p
>  #define direct_fold_left_optab_supported_p direct_optab_supported_p
>  #define direct_mask_fold_left_optab_supported_p direct_optab_supported_p
>  #define direct_mask_len_fold_left_optab_supported_p direct_optab_supported_p
> diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
> index 594f7881511..d09403c0a91 100644
> --- a/gcc/internal-fn.def
> +++ b/gcc/internal-fn.def
> @@ -312,6 +312,9 @@ DEF_INTERNAL_OPTAB_FN (EXTRACT_LAST, ECF_CONST | 
> ECF_NOTHROW,
>  DEF_INTERNAL_OPTAB_FN (FOLD_EXTRACT_LAST, ECF_CONST | ECF_NOTHROW,
>  fold_extract_last, fold_extract)
>  
> +DEF_INTERNAL_OPTAB_FN (LEN_FOLD_EXTRACT_LAST, ECF_CONST | ECF_NOTHROW,
> +len_fold_extract_last, fold_len_extract)
> +
>  DEF_INTERNAL_OPTAB_FN (FOLD_LEFT_PLUS, ECF_CONST | ECF_NOTHROW,
>  fold_left_plus, fold_left)
>  
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)
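
As a reading aid for the md.texi wording in the patch above, here is a scalar
reference model of what the description suggests LEN_FOLD_EXTRACT_LAST
computes. It is a sketch under assumptions: the function name
len_fold_extract_last_ref, the int element type, and the bool mask
representation are all hypothetical, and the model follows the documented
"index i < len + bias" rule rather than any actual expander code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Scalar model (hypothetical helper, not a GCC function):
   among the elements with index i < len + bias, return the last vec[i]
   whose mask[i] is set; if there is none, return the fallback scalar.  */
static int
len_fold_extract_last_ref (int fallback, const bool *mask, const int *vec,
                           size_t len, ptrdiff_t bias)
{
  int result = fallback;
  ptrdiff_t active = (ptrdiff_t) len + bias;

  /* Later matches overwrite earlier ones, so the last match wins.  */
  for (ptrdiff_t i = 0; i < active; i++)
    if (mask[i])
      result = vec[i];

  return result;
}
```

In the condition_reduction example from the patch description, fallback plays
the role of the running value of last (initially 66), mask corresponds to
a[i] < min_v, and vec holds the induction variable values.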

RE: [PATCH V5] VECT: Support loop len control on EXTRACT_LAST vectorization

2023-08-22 Thread Li, Pan2 via Gcc-patches

Committed, thanks all.

Pan

-Original Message-
From: Gcc-patches  On Behalf 
Of Stefan Schulze Frielinghaus via Gcc-patches
Sent: Tuesday, August 22, 2023 2:23 PM
To: Juzhe-Zhong 
Cc: gcc-patches@gcc.gnu.org; richard.sandif...@arm.com; rguent...@suse.de; 
li...@linux.ibm.com
Subject: Re: [PATCH V5] VECT: Support loop len control on EXTRACT_LAST 
vectorization

On Mon, Aug 21, 2023 at 06:59:55PM +0800, Juzhe-Zhong wrote:
> Co-Authored-By: Kewen.Lin 
> 
> Hi, @Richi and @Richard, based on the previous discussion, I simply fixed the
> issues for powerpc and s390 with your suggestions:
> 
> -  machine_mode len_load_mode = get_len_load_store_mode
> -(loop_vinfo->vector_mode, true).require ();
> -  machine_mode len_store_mode = get_len_load_store_mode
> -(loop_vinfo->vector_mode, false).require ();
> +  machine_mode len_load_mode, len_store_mode;
> +  if (!get_len_load_store_mode (loop_vinfo->vector_mode, true)
> +         .exists (&len_load_mode))
> +    return false;
> +  if (!get_len_load_store_mode (loop_vinfo->vector_mode, false)
> +         .exists (&len_store_mode))
> +    return false;
> 
> Hi, @Kewen and @Stefan

Successfully bootstrapped and regtested on s390 (z900 as well as z16;
both for 31- and 64-bit).

Thanks,
Stefan
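
The change quoted above replaces an unconditional .require () with an
.exists () check that bails out. The difference can be sketched in plain C as
follows. This is only an illustration under assumptions: struct opt_int,
opt_require, opt_exists, lookup_len_mode, and verify_loop_lens_model are
hypothetical stand-ins, and the set of "supported" widths is made up. It is not
the opt_mode class or the real vect_verify_loop_lens.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical optional value: present says whether value is meaningful.  */
struct opt_int
{
  bool present;
  int value;
};

/* "require" style: the caller insists the value exists; absence is a bug.  */
static int
opt_require (struct opt_int o)
{
  assert (o.present);
  return o.value;
}

/* "exists" style: report absence to the caller instead of asserting.  */
static bool
opt_exists (struct opt_int o, int *out)
{
  if (!o.present)
    return false;
  *out = o.value;
  return true;
}

/* Made-up lookup: pretend only widths 16 and 32 have a length mode.  */
static struct opt_int
lookup_len_mode (int vector_width)
{
  struct opt_int result = { false, 0 };
  if (vector_width == 16 || vector_width == 32)
    {
      result.present = true;
      result.value = vector_width / 8;
    }
  return result;
}

/* Model of the fixed check: give up gracefully when no mode exists.  */
static bool
verify_loop_lens_model (int vector_width)
{
  int len_mode;
  if (!opt_exists (lookup_len_mode (vector_width), &len_mode))
    return false;
  return len_mode > 0;
}
```

A target without the mode now gets a plain "false" from the check, whereas the
require-style call would have asserted.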

> 
> Could you test this patch again ? Thanks.
> 
> Co-Authored-By: Kewen.Lin 
> 
> gcc/ChangeLog:
> 
>   * tree-vect-loop.cc (vect_verify_loop_lens): Add exists check.
>   (vectorizable_live_operation): Add live vectorization for length loop 
> control.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/riscv/rvv/autovec/partial/live-1.c: New test.
>   * gcc.target/riscv/rvv/autovec/partial/live_run-1.c: New test.
> 
> ---
>  .../riscv/rvv/autovec/partial/live-1.c| 34 +++
>  .../riscv/rvv/autovec/partial/live_run-1.c| 35 
>  gcc/tree-vect-loop.cc | 89 ++-
>  3 files changed, 138 insertions(+), 20 deletions(-)
>  create mode 100644 
> gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/live-1.c
>  create mode 100644 
> gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/live_run-1.c
> 
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/live-1.c 
> b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/live-1.c
> new file mode 100644
> index 000..75fa2eba8cc
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/live-1.c
> @@ -0,0 +1,34 @@
> +/* { dg-do compile } */
> +/* { dg-additional-options "-march=rv32gcv -mabi=ilp32d --param 
> riscv-autovec-preference=scalable -fdump-tree-optimized-details" } */
> +
> +#include 
> +
> +#define EXTRACT_LAST(TYPE)                     \
> +  TYPE __attribute__ ((noinline, noclone))     \
> +  test_##TYPE (TYPE *x, int n, TYPE value)     \
> +  {                                            \
> +    TYPE last;                                 \
> +    for (int j = 0; j < n; ++j)                \
> +      {                                        \
> +        last = x[j];                           \
> +        x[j] = last * value;                   \
> +      }                                        \
> +    return last;                               \
> +  }
> +
> +#define TEST_ALL(T)                            \
> +  T (int8_t)                                   \
> +  T (int16_t)                                  \
> +  T (int32_t)                                  \
> +  T (int64_t)                                  \
> +  T (uint8_t)                                  \
> +  T (uint16_t)                                 \
> +  T (uint32_t)                                 \
> +  T (uint64_t)                                 \
> +  T (_Float16)                                 \
> +  T (float)                                    \
> +  T (double)
> +
> +TEST_ALL (EXTRACT_LAST)
> +
> +/* { dg-final { scan-tree-dump-times "\.VEC_EXTRACT" 10 "optimized" } } */
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/live_run-1.c 
> b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/live_run-1.c
> new file mode 100644
> index 000..42913a112c6
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/live_run-1.c
> @@ -0,0 +1,35 @@
> +/* { dg-do run { target {

RE: [PATCH v1] RISC-V: Refactor RVV class by frm_op_type template arg

2023-08-21 Thread Li, Pan2 via Gcc-patches

Thanks Kito and Jeff for comments, will double check and address the comment in 
v2.

Pan

-Original Message-
From: Kito Cheng  
Sent: Monday, August 21, 2023 11:07 PM
To: Jeff Law 
Cc: Li, Pan2 ; gcc-patches@gcc.gnu.org; 
juzhe.zh...@rivai.ai; Wang, Yanzhang 
Subject: Re: [PATCH v1] RISC-V: Refactor RVV class by frm_op_type template arg

Just one nit from me: please add an assertion to the OP_TYPE_vx case to make
sure there is no FRM_OP == HAS_FRM there.

On Mon, Aug 21, 2023 at 11:04 PM Jeff Law via Gcc-patches
 wrote:
>
>
>
> On 8/17/23 20:53, Pan Li via Gcc-patches wrote:
> > From: Pan Li 
> >
> > As suggested by Kito, we will add a new frm_op_type template arg
> > to the op class, to avoid duplicating the expand function.
> >
> > Signed-off-by: Pan Li 
> >
> > gcc/ChangeLog:
> >
> >   * config/riscv/riscv-vector-builtins-bases.cc
> >   (class binop_frm): Removed.
> >   (class reverse_binop_frm): Ditto.
> >   (class widen_binop_frm): Ditto.
> >   (class vfmacc_frm): Ditto.
> >   (class vfnmacc_frm): Ditto.
> >   (class vfmsac_frm): Ditto.
> >   (class vfnmsac_frm): Ditto.
> >   (class vfmadd_frm): Ditto.
> >   (class vfnmadd_frm): Ditto.
> >   (class vfmsub_frm): Ditto.
> >   (class vfnmsub_frm): Ditto.
> >   (class vfwmacc_frm): Ditto.
> >   (class vfwnmacc_frm): Ditto.
> >   (class vfwmsac_frm): Ditto.
> >   (class vfwnmsac_frm): Ditto.
> >   (class unop_frm): Ditto.
> >   (class vfrec7_frm): Ditto.
> >   (class binop): Add frm_op_type template arg.
> >   (class unop): Ditto.
> >   (class widen_binop): Ditto.
> >   (class widen_binop_fp): Ditto.
> >   (class reverse_binop): Ditto.
> >   (class vfmacc): Ditto.
> >   (class vfnmsac): Ditto.
> >   (class vfmadd): Ditto.
> >   (class vfnmsub): Ditto.
> >   (class vfnmacc): Ditto.
> >   (class vfmsac): Ditto.
> >   (class vfnmadd): Ditto.
> >   (class vfmsub): Ditto.
> >   (class vfwmacc): Ditto.
> >   (class vfwnmacc): Ditto.
> >   (class vfwmsac): Ditto.
> >   (class vfwnmsac): Ditto.
> >   (class float_misc): Ditto.
> So in the expand method, you added a case for OP_TYPE_vx.  I assume that
> was intentional -- but it's not mentioned anywhere in the ChangeLog.  So
> please update the ChangeLog if it was intentional or remove the change
> if it wasn't intentional.  Pre-approved with whichever change is
> appropriate.
>
> Thanks,
> Jeff

RE: [PATCH v1] RISC-V: Support RVV VFWREDUSUM.VS rounding mode intrinsic API

2023-08-21 Thread Li, Pan2 via Gcc-patches

Committed, thanks Jeff.

Pan

-Original Message-
From: Jeff Law  
Sent: Monday, August 21, 2023 11:06 PM
To: Li, Pan2 ; gcc-patches@gcc.gnu.org
Cc: juzhe.zh...@rivai.ai; Wang, Yanzhang ; 
kito.ch...@gmail.com
Subject: Re: [PATCH v1] RISC-V: Support RVV VFWREDUSUM.VS rounding mode 
intrinsic API



On 8/17/23 02:05, Pan Li via Gcc-patches wrote:
> From: Pan Li 
> 
> This patch would like to support the rounding mode API for the
> VFWREDUSUM.VS as the below samples
> 
> * __riscv_vfwredusum_vs_f32m1_f64m1_rm
> * __riscv_vfwredusum_vs_f32m1_f64m1_rm_m
> 
> Signed-off-by: Pan Li 
> 
> gcc/ChangeLog:
> 
>   * config/riscv/riscv-vector-builtins-bases.cc
>   (vfwredusum_frm_obj): New declaration.
>   (BASE): Ditto.
>   * config/riscv/riscv-vector-builtins-bases.h: Ditto.
>   * config/riscv/riscv-vector-builtins-functions.def
>   (vfwredusum_frm): New intrinsic function def.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/riscv/rvv/base/float-point-wredusum.c: New test.
OK
jeff

RE: RE: [PATCH v1] RISC-V: Support RVV VFWREDUSUM.VS rounding mode intrinsic API

2023-08-21 Thread Li, Pan2 via Gcc-patches

By design, HAS_FRM must be present if the insn honors FRM.
For example, if an insn doesn't honor FRM, there should be only one declaration,
as below.

static CONSTEXPR const binop vfmax_obj;

But if an insn honors FRM, there will be two declarations, as below, for code reuse.

static CONSTEXPR const binop vfsub_obj;
static CONSTEXPR const binop_frm vfadd_frm_obj;

Pan
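
To illustrate the "one implementation, two declarations" idea in a
self-contained way, here is a small C sketch. GCC itself uses a C++ template
argument for this. The enum parameter below is a plain-C substitute, and
describe_binop with its output strings is purely hypothetical. The only detail
taken from the patches in this archive is that the _rm intrinsic variants
carry one extra rounding-mode operand before vl.

```c
#include <stdio.h>

/* Mirrors the NO_FRM / HAS_FRM distinction discussed in the thread.  */
enum frm_op_type
{
  NO_FRM,
  HAS_FRM
};

/* Hypothetical helper: one function serves both flavours of an operation.
   When frm is HAS_FRM, the described call gets an extra rounding-mode
   operand; otherwise the operand list is unchanged.  Returns the value
   snprintf returns.  */
static int
describe_binop (char *buf, size_t size, const char *name,
                enum frm_op_type frm)
{
  if (frm == HAS_FRM)
    return snprintf (buf, size, "%s_rm (op1, op2, frm, vl)", name);
  return snprintf (buf, size, "%s (op1, op2, vl)", name);
}
```

An insn that never honors FRM would only ever be described with NO_FRM, while
an insn that honors FRM gets both forms from the same code.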

From: juzhe.zh...@rivai.ai 
Sent: Monday, August 21, 2023 4:48 PM
To: Li, Pan2 ; gcc-patches 
Cc: Wang, Yanzhang ; kito.cheng 
Subject: Re: RE: [PATCH v1] RISC-V: Support RVV VFWREDUSUM.VS rounding mode 
intrinsic API

Yes. I wonder why some floating-point rounding mode declarations have HAS_FRM
while others don't?


juzhe.zh...@rivai.ai


RE: [PATCH v1] RISC-V: Support RVV VFWREDUSUM.VS rounding mode intrinsic API

2023-08-21 Thread Li, Pan2 via Gcc-patches

To double-check, do you mean this declaration?

+static CONSTEXPR const widen_freducop 
vfwredusum_frm_obj;

Pan

From: juzhe.zh...@rivai.ai 
Sent: Monday, August 21, 2023 2:40 PM
To: Li, Pan2 ; gcc-patches 
Cc: Li, Pan2 ; Wang, Yanzhang ; 
kito.cheng 
Subject: Re: [PATCH v1] RISC-V: Support RVV VFWREDUSUM.VS rounding mode 
intrinsic API

Why does this patch not have HAS_FRM?


juzhe.zh...@rivai.ai

From: pan2.li
Date: 2023-08-17 16:05
To: gcc-patches
CC: juzhe.zhong; 
pan2.li; 
yanzhang.wang; 
kito.cheng
Subject: [PATCH v1] RISC-V: Support RVV VFWREDUSUM.VS rounding mode intrinsic 
API
From: Pan Li <pan2...@intel.com>

This patch would like to support the rounding mode API for the
VFWREDUSUM.VS as the below samples

* __riscv_vfwredusum_vs_f32m1_f64m1_rm
* __riscv_vfwredusum_vs_f32m1_f64m1_rm_m

Signed-off-by: Pan Li <pan2...@intel.com>

gcc/ChangeLog:

* config/riscv/riscv-vector-builtins-bases.cc
(vfwredusum_frm_obj): New declaration.
(BASE): Ditto.
* config/riscv/riscv-vector-builtins-bases.h: Ditto.
* config/riscv/riscv-vector-builtins-functions.def
(vfwredusum_frm): New intrinsic function def.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/float-point-wredusum.c: New test.
---
.../riscv/riscv-vector-builtins-bases.cc  |  2 ++
.../riscv/riscv-vector-builtins-bases.h   |  1 +
.../riscv/riscv-vector-builtins-functions.def |  1 +
.../riscv/rvv/base/float-point-wredusum.c | 33 +++
4 files changed, 37 insertions(+)
create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/base/float-point-wredusum.c

diff --git a/gcc/config/riscv/riscv-vector-builtins-bases.cc 
b/gcc/config/riscv/riscv-vector-builtins-bases.cc
index abf03bab0da..5ee7d3119db 100644
--- a/gcc/config/riscv/riscv-vector-builtins-bases.cc
+++ b/gcc/config/riscv/riscv-vector-builtins-bases.cc
@@ -2548,6 +2548,7 @@ static CONSTEXPR const freducop 
vfredosum_frm_obj;
static CONSTEXPR const reducop vfredmax_obj;
static CONSTEXPR const reducop vfredmin_obj;
static CONSTEXPR const widen_freducop vfwredusum_obj;
+static CONSTEXPR const widen_freducop 
vfwredusum_frm_obj;
static CONSTEXPR const widen_freducop vfwredosum_obj;
static CONSTEXPR const widen_freducop 
vfwredosum_frm_obj;
static CONSTEXPR const vmv vmv_x_obj;
@@ -2810,6 +2811,7 @@ BASE (vfredmin)
BASE (vfwredosum)
BASE (vfwredosum_frm)
BASE (vfwredusum)
+BASE (vfwredusum_frm)
BASE (vmv_x)
BASE (vmv_s)
BASE (vfmv_f)
diff --git a/gcc/config/riscv/riscv-vector-builtins-bases.h 
b/gcc/config/riscv/riscv-vector-builtins-bases.h
index c1bb164a712..69d4562091f 100644
--- a/gcc/config/riscv/riscv-vector-builtins-bases.h
+++ b/gcc/config/riscv/riscv-vector-builtins-bases.h
@@ -247,6 +247,7 @@ extern const function_base *const vfredmin;
extern const function_base *const vfwredosum;
extern const function_base *const vfwredosum_frm;
extern const function_base *const vfwredusum;
+extern const function_base *const vfwredusum_frm;
extern const function_base *const vmv_x;
extern const function_base *const vmv_s;
extern const function_base *const vfmv_f;
diff --git a/gcc/config/riscv/riscv-vector-builtins-functions.def 
b/gcc/config/riscv/riscv-vector-builtins-functions.def
index da1157f5a56..3ce06dc60b7 100644
--- a/gcc/config/riscv/riscv-vector-builtins-functions.def
+++ b/gcc/config/riscv/riscv-vector-builtins-functions.def
@@ -508,6 +508,7 @@ DEF_RVV_FUNCTION (vfwredosum, reduc_alu, no_mu_preds, 
wf_vs_ops)
DEF_RVV_FUNCTION (vfwredusum, reduc_alu, no_mu_preds, wf_vs_ops)
DEF_RVV_FUNCTION (vfwredosum_frm, reduc_alu_frm, no_mu_preds, wf_vs_ops)
+DEF_RVV_FUNCTION (vfwredusum_frm, reduc_alu_frm, no_mu_preds, wf_vs_ops)
/* 15. Vector Mask Instructions.  */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-wredusum.c 
b/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-wredusum.c
new file mode 100644
index 000..6c888c10c0d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-wredusum.c
@@ -0,0 +1,33 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64 -O3 -Wno-psabi" } */
+
+#include "riscv_vector.h"
+
+vfloat64m1_t
+test_riscv_vfwredusum_vs_f32m1_f64m1_rm (vfloat32m1_t op1, vfloat64m1_t op2,
+ size_t vl) {
+  return __riscv_vfwredusum_vs_f32m1_f64m1_rm (op1, op2, 0, vl);
+}
+
+vfloat64m1_t
+test_vfwredusum_vs_f32m1_f64m1_rm_m (vbool32_t mask, vfloat32m1_t op1,
+  vfloat64m1_t op2, size_t vl) {
+  return __riscv_vfwredusum_vs_f32m1_f64m1_rm_m (mask, op1, op2, 1, vl);
+}
+
+vfloat64m1_t
+test_riscv_vfwredusum_vs_f32m1_f64m1 (vfloat32m1_t op1, vfloat64m1_t op2,
+   size_t vl) {
+  return __riscv_vfwredusum_vs_f32m1_f64m1 (op1, op2, vl);
+}
+
+vfloat64m1_t
+test_vfwredusum_vs_f32m1_f64m1_m (vbool32_t mask, vfloat32m1_t op1,
+   vfloat64m1_t op2, size_t vl) {
+

RE: [PATCH v1] RISC-V: Support RVV VFREDOSUM.VS rounding mode intrinsic API

2023-08-17 Thread Li, Pan2 via Gcc-patches

Committed, thanks Kito.

Pan

-Original Message-
From: Kito Cheng  
Sent: Thursday, August 17, 2023 3:30 PM
To: Li, Pan2 
Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; Wang, Yanzhang 

Subject: Re: [PATCH v1] RISC-V: Support RVV VFREDOSUM.VS rounding mode 
intrinsic API

lgtm

On Thu, Aug 17, 2023 at 2:23 PM Pan Li via Gcc-patches
 wrote:
>
> From: Pan Li 
>
> This patch would like to support the rounding mode API for the
> VFREDOSUM.VS as the below samples.
>
> * __riscv_vfredosum_vs_f32m1_f32m1_rm
> * __riscv_vfredosum_vs_f32m1_f32m1_rm_m
>
> Signed-off-by: Pan Li 
>
> gcc/ChangeLog:
>
> * config/riscv/riscv-vector-builtins-bases.cc
> (vfredosum_frm_obj): New declaration.
> (BASE): Ditto.
> * config/riscv/riscv-vector-builtins-bases.h: Ditto.
> * config/riscv/riscv-vector-builtins-functions.def
> (vfredosum_frm): New intrinsic function def.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/rvv/base/float-point-redosum.c: New test.
> ---
>  .../riscv/riscv-vector-builtins-bases.cc  |  2 ++
>  .../riscv/riscv-vector-builtins-bases.h   |  1 +
>  .../riscv/riscv-vector-builtins-functions.def |  1 +
>  .../riscv/rvv/base/float-point-redosum.c  | 33 +++
>  4 files changed, 37 insertions(+)
>  create mode 100644 
> gcc/testsuite/gcc.target/riscv/rvv/base/float-point-redosum.c
>
> diff --git a/gcc/config/riscv/riscv-vector-builtins-bases.cc 
> b/gcc/config/riscv/riscv-vector-builtins-bases.cc
> index 65f1d9c8ff7..ef2991359da 100644
> --- a/gcc/config/riscv/riscv-vector-builtins-bases.cc
> +++ b/gcc/config/riscv/riscv-vector-builtins-bases.cc
> @@ -2539,6 +2539,7 @@ static CONSTEXPR const 
> widen_reducop vwredsumu_obj;
>  static CONSTEXPR const freducop vfredusum_obj;
>  static CONSTEXPR const freducop vfredusum_frm_obj;
>  static CONSTEXPR const freducop vfredosum_obj;
> +static CONSTEXPR const freducop vfredosum_frm_obj;
>  static CONSTEXPR const reducop vfredmax_obj;
>  static CONSTEXPR const reducop vfredmin_obj;
>  static CONSTEXPR const widen_freducop vfwredusum_obj;
> @@ -2797,6 +2798,7 @@ BASE (vwredsumu)
>  BASE (vfredusum)
>  BASE (vfredusum_frm)
>  BASE (vfredosum)
> +BASE (vfredosum_frm)
>  BASE (vfredmax)
>  BASE (vfredmin)
>  BASE (vfwredosum)
> diff --git a/gcc/config/riscv/riscv-vector-builtins-bases.h 
> b/gcc/config/riscv/riscv-vector-builtins-bases.h
> index fd1a84f3e68..da8412b66df 100644
> --- a/gcc/config/riscv/riscv-vector-builtins-bases.h
> +++ b/gcc/config/riscv/riscv-vector-builtins-bases.h
> @@ -241,6 +241,7 @@ extern const function_base *const vwredsumu;
>  extern const function_base *const vfredusum;
>  extern const function_base *const vfredusum_frm;
>  extern const function_base *const vfredosum;
> +extern const function_base *const vfredosum_frm;
>  extern const function_base *const vfredmax;
>  extern const function_base *const vfredmin;
>  extern const function_base *const vfwredosum;
> diff --git a/gcc/config/riscv/riscv-vector-builtins-functions.def 
> b/gcc/config/riscv/riscv-vector-builtins-functions.def
> index 90a83c02d52..80e65bfb14b 100644
> --- a/gcc/config/riscv/riscv-vector-builtins-functions.def
> +++ b/gcc/config/riscv/riscv-vector-builtins-functions.def
> @@ -501,6 +501,7 @@ DEF_RVV_FUNCTION (vfredmax, reduc_alu, no_mu_preds, 
> f_vs_ops)
>  DEF_RVV_FUNCTION (vfredmin, reduc_alu, no_mu_preds, f_vs_ops)
>
>  DEF_RVV_FUNCTION (vfredusum_frm, reduc_alu_frm, no_mu_preds, f_vs_ops)
> +DEF_RVV_FUNCTION (vfredosum_frm, reduc_alu_frm, no_mu_preds, f_vs_ops)
>
>  // 14.4. Vector Widening Floating-Point Reduction Instructions
>  DEF_RVV_FUNCTION (vfwredosum, reduc_alu, no_mu_preds, wf_vs_ops)
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-redosum.c 
> b/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-redosum.c
> new file mode 100644
> index 000..2e6a3c28a89
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-redosum.c
> @@ -0,0 +1,33 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv64gcv -mabi=lp64 -O3 -Wno-psabi" } */
> +
> +#include "riscv_vector.h"
> +
> +vfloat32m1_t
> +test_riscv_vfredosum_vs_f32m1_f32m1_rm (vfloat32m1_t op1, vfloat32m1_t op2,
> +   size_t vl) {
> +  return __riscv_vfredosum_vs_f32m1_f32m1_rm (op1, op2, 0, vl);
> +}
> +
> +vfloat32m1_t
> +test_vfredosum_vs_f32m1_f32m1_rm_m (vbool32_t mask, vfloat32m1_t op1,
> +   vfloat32m1_t op2, size_t vl) {
> +  return __riscv_vfredosum_vs_f32m1_f32m1_rm_m (mask, op1, op2, 1, vl);
> +}
> +
> +vfloat32m1_t
> +test_riscv_vfredosum_vs_f32m1_f32m1 (vfloat32m1_t op1, vfloat32m1_t op2,
> +size_t vl) {
> +  return __riscv_vfredosum_vs_f32m1_f32m1 (op1, op2, vl);
> +}
> +
> +vfloat32m1_t
> +test_vfredosum_vs_f32m1_f32m1_m (vbool32_t mask, vfloat32m1_t op1,
> +vfloat32m1_t op2, size_t vl) {
> +  return

RE: [PATCH v1] RISC-V: Support RVV VFREDOSUM.VS rounding mode intrinsic API

2023-08-17 Thread Li, Pan2 via Gcc-patches

Committed, thanks Kito.

Pan

-Original Message-
From: Li, Pan2  
Sent: Thursday, August 17, 2023 2:23 PM
To: gcc-patches@gcc.gnu.org
Cc: juzhe.zh...@rivai.ai; Li, Pan2 ; Wang, Yanzhang 
; kito.ch...@gmail.com
Subject: [PATCH v1] RISC-V: Support RVV VFREDOSUM.VS rounding mode intrinsic API

From: Pan Li 

This patch would like to support the rounding mode API for the
VFREDOSUM.VS as the below samples.

* __riscv_vfredosum_vs_f32m1_f32m1_rm
* __riscv_vfredosum_vs_f32m1_f32m1_rm_m

Signed-off-by: Pan Li 

gcc/ChangeLog:

* config/riscv/riscv-vector-builtins-bases.cc
(vfredosum_frm_obj): New declaration.
(BASE): Ditto.
* config/riscv/riscv-vector-builtins-bases.h: Ditto.
* config/riscv/riscv-vector-builtins-functions.def
(vfredosum_frm): New intrinsic function def.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/float-point-redosum.c: New test.
---
 .../riscv/riscv-vector-builtins-bases.cc  |  2 ++
 .../riscv/riscv-vector-builtins-bases.h   |  1 +
 .../riscv/riscv-vector-builtins-functions.def |  1 +
 .../riscv/rvv/base/float-point-redosum.c  | 33 +++
 4 files changed, 37 insertions(+)
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/base/float-point-redosum.c

diff --git a/gcc/config/riscv/riscv-vector-builtins-bases.cc 
b/gcc/config/riscv/riscv-vector-builtins-bases.cc
index 65f1d9c8ff7..ef2991359da 100644
--- a/gcc/config/riscv/riscv-vector-builtins-bases.cc
+++ b/gcc/config/riscv/riscv-vector-builtins-bases.cc
@@ -2539,6 +2539,7 @@ static CONSTEXPR const widen_reducop 
vwredsumu_obj;
 static CONSTEXPR const freducop vfredusum_obj;
 static CONSTEXPR const freducop vfredusum_frm_obj;
 static CONSTEXPR const freducop vfredosum_obj;
+static CONSTEXPR const freducop vfredosum_frm_obj;
 static CONSTEXPR const reducop vfredmax_obj;
 static CONSTEXPR const reducop vfredmin_obj;
 static CONSTEXPR const widen_freducop vfwredusum_obj;
@@ -2797,6 +2798,7 @@ BASE (vwredsumu)
 BASE (vfredusum)
 BASE (vfredusum_frm)
 BASE (vfredosum)
+BASE (vfredosum_frm)
 BASE (vfredmax)
 BASE (vfredmin)
 BASE (vfwredosum)
diff --git a/gcc/config/riscv/riscv-vector-builtins-bases.h 
b/gcc/config/riscv/riscv-vector-builtins-bases.h
index fd1a84f3e68..da8412b66df 100644
--- a/gcc/config/riscv/riscv-vector-builtins-bases.h
+++ b/gcc/config/riscv/riscv-vector-builtins-bases.h
@@ -241,6 +241,7 @@ extern const function_base *const vwredsumu;
 extern const function_base *const vfredusum;
 extern const function_base *const vfredusum_frm;
 extern const function_base *const vfredosum;
+extern const function_base *const vfredosum_frm;
 extern const function_base *const vfredmax;
 extern const function_base *const vfredmin;
 extern const function_base *const vfwredosum;
diff --git a/gcc/config/riscv/riscv-vector-builtins-functions.def 
b/gcc/config/riscv/riscv-vector-builtins-functions.def
index 90a83c02d52..80e65bfb14b 100644
--- a/gcc/config/riscv/riscv-vector-builtins-functions.def
+++ b/gcc/config/riscv/riscv-vector-builtins-functions.def
@@ -501,6 +501,7 @@ DEF_RVV_FUNCTION (vfredmax, reduc_alu, no_mu_preds, 
f_vs_ops)
 DEF_RVV_FUNCTION (vfredmin, reduc_alu, no_mu_preds, f_vs_ops)
 
 DEF_RVV_FUNCTION (vfredusum_frm, reduc_alu_frm, no_mu_preds, f_vs_ops)
+DEF_RVV_FUNCTION (vfredosum_frm, reduc_alu_frm, no_mu_preds, f_vs_ops)
 
 // 14.4. Vector Widening Floating-Point Reduction Instructions
 DEF_RVV_FUNCTION (vfwredosum, reduc_alu, no_mu_preds, wf_vs_ops)
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-redosum.c 
b/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-redosum.c
new file mode 100644
index 000..2e6a3c28a89
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-redosum.c
@@ -0,0 +1,33 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64 -O3 -Wno-psabi" } */
+
+#include "riscv_vector.h"
+
+vfloat32m1_t
+test_riscv_vfredosum_vs_f32m1_f32m1_rm (vfloat32m1_t op1, vfloat32m1_t op2,
+   size_t vl) {
+  return __riscv_vfredosum_vs_f32m1_f32m1_rm (op1, op2, 0, vl);
+}
+
+vfloat32m1_t
+test_vfredosum_vs_f32m1_f32m1_rm_m (vbool32_t mask, vfloat32m1_t op1,
+   vfloat32m1_t op2, size_t vl) {
+  return __riscv_vfredosum_vs_f32m1_f32m1_rm_m (mask, op1, op2, 1, vl);
+}
+
+vfloat32m1_t
+test_riscv_vfredosum_vs_f32m1_f32m1 (vfloat32m1_t op1, vfloat32m1_t op2,
+size_t vl) {
+  return __riscv_vfredosum_vs_f32m1_f32m1 (op1, op2, vl);
+}
+
+vfloat32m1_t
+test_vfredosum_vs_f32m1_f32m1_m (vbool32_t mask, vfloat32m1_t op1,
+vfloat32m1_t op2, size_t vl) {
+  return __riscv_vfredosum_vs_f32m1_f32m1_m (mask, op1, op2, vl);
+}
+
+/* { dg-final { scan-assembler-times 
{vfredosum\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 4 } } */
+/* { dg-final { scan-assembler-times {frrm\s+[axs][0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {fsrm\s+[axs][0-9]+} 2 } } */
+/* {

RE: [PATCH v1] RISC-V: Support RVV VFREDUSUM.VS rounding mode intrinsic API

2023-08-17 Thread Li, Pan2 via Gcc-patches

Committed, thanks Kito.

Pan

From: Kito Cheng 
Sent: Thursday, August 17, 2023 11:33 AM
To: Li, Pan2 
Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; Wang, Yanzhang 

Subject: Re: [PATCH v1] RISC-V: Support RVV VFREDUSUM.VS rounding mode 
intrinsic API

Lgtm

On Thu, Aug 17, 2023 at 11:09 AM Pan Li via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
From: Pan Li <pan2...@intel.com>

This patch adds rounding mode API support for VFREDUSUM.VS, as in the
samples below.

* __riscv_vfredusum_vs_f32m1_f32m1_rm
* __riscv_vfredusum_vs_f32m1_f32m1_rm_m
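
For context on why a rounding-mode variant matters for a reduction: every floating-point addition in the sum rounds, so both the rounding mode and the order of the additions can change the result. The host-side C sketch below is not the GCC implementation and uses no RVV intrinsics; `ordered_sum` and `pairwise_sum` are hypothetical helper names. It contrasts a strict in-order sum (the behaviour the ordered reduction vfredosum guarantees) with one tree-shaped order of the kind an unordered reduction such as vfredusum is free to use.

```c
#include <assert.h>
#include <stddef.h>

/* Strict left-to-right sum, starting from `init`.  */
float ordered_sum(float init, const float *v, size_t n)
{
    /* volatile forces each partial sum to be rounded to float.  */
    volatile float acc = init;
    for (size_t i = 0; i < n; i++)
        acc = acc + v[i];
    return acc;
}

/* Tree-shaped sum: split the input in half, sum each half, add the halves.  */
float pairwise_sum(const float *v, size_t n)
{
    assert(n > 0);
    if (n == 1)
        return v[0];
    size_t half = n / 2;
    volatile float lo = pairwise_sum(v, half);
    volatile float hi = pairwise_sum(v + half, n - half);
    volatile float sum = lo + hi;
    return sum;
}
```

With the input {1e8f, 1.0f, -1e8f, 1.0f} and the default round-to-nearest mode, `ordered_sum (0.0f, v, 4)` yields 1.0f while `pairwise_sum (v, 4)` yields 0.0f, because 1e8f + 1.0f rounds back to 1e8f.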

Signed-off-by: Pan Li <pan2...@intel.com>

gcc/ChangeLog:

* config/riscv/riscv-vector-builtins-bases.cc
(class freducop): Add frm_op_type template arg.
(vfredusum_frm_obj): New declaration.
(BASE): Ditto.
* config/riscv/riscv-vector-builtins-bases.h: Ditto.
* config/riscv/riscv-vector-builtins-functions.def
(vfredusum_frm): New intrinsic function def.
* config/riscv/riscv-vector-builtins-shapes.cc
(struct reduc_alu_frm_def): New class for frm shape.
(SHAPE): New declaration.
* config/riscv/riscv-vector-builtins-shapes.h: Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/float-point-redusum.c: New test.
---
 .../riscv/riscv-vector-builtins-bases.cc  |  9 -
 .../riscv/riscv-vector-builtins-bases.h   |  1 +
 .../riscv/riscv-vector-builtins-functions.def |  2 +
 .../riscv/riscv-vector-builtins-shapes.cc | 39 +++
 .../riscv/riscv-vector-builtins-shapes.h  |  1 +
 .../riscv/rvv/base/float-point-redusum.c  | 33 
 6 files changed, 84 insertions(+), 1 deletion(-)
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/base/float-point-redusum.c

diff --git a/gcc/config/riscv/riscv-vector-builtins-bases.cc 
b/gcc/config/riscv/riscv-vector-builtins-bases.cc
index ad04647f9ba..65f1d9c8ff7 100644
--- a/gcc/config/riscv/riscv-vector-builtins-bases.cc
+++ b/gcc/config/riscv/riscv-vector-builtins-bases.cc
@@ -1847,10 +1847,15 @@ public:
 };

 /* Implements floating-point reduction instructions.  */
-template
+template
 class freducop : public function_base
 {
 public:
+  bool has_rounding_mode_operand_p () const override
+  {
+return FRM_OP == HAS_FRM;
+  }
+
   bool apply_mask_policy_p () const override { return false; }

   rtx expand (function_expander ) const override
@@ -2532,6 +2537,7 @@ static CONSTEXPR const reducop vredxor_obj;
 static CONSTEXPR const widen_reducop vwredsum_obj;
 static CONSTEXPR const widen_reducop vwredsumu_obj;
 static CONSTEXPR const freducop vfredusum_obj;
+static CONSTEXPR const freducop vfredusum_frm_obj;
 static CONSTEXPR const freducop vfredosum_obj;
 static CONSTEXPR const reducop vfredmax_obj;
 static CONSTEXPR const reducop vfredmin_obj;
@@ -2789,6 +2795,7 @@ BASE (vredxor)
 BASE (vwredsum)
 BASE (vwredsumu)
 BASE (vfredusum)
+BASE (vfredusum_frm)
 BASE (vfredosum)
 BASE (vfredmax)
 BASE (vfredmin)
diff --git a/gcc/config/riscv/riscv-vector-builtins-bases.h 
b/gcc/config/riscv/riscv-vector-builtins-bases.h
index c8c649c4bb0..fd1a84f3e68 100644
--- a/gcc/config/riscv/riscv-vector-builtins-bases.h
+++ b/gcc/config/riscv/riscv-vector-builtins-bases.h
@@ -239,6 +239,7 @@ extern const function_base *const vredxor;
 extern const function_base *const vwredsum;
 extern const function_base *const vwredsumu;
 extern const function_base *const vfredusum;
+extern const function_base *const vfredusum_frm;
 extern const function_base *const vfredosum;
 extern const function_base *const vfredmax;
 extern const function_base *const vfredmin;
diff --git a/gcc/config/riscv/riscv-vector-builtins-functions.def 
b/gcc/config/riscv/riscv-vector-builtins-functions.def
index cfbc125dcd8..90a83c02d52 100644
--- a/gcc/config/riscv/riscv-vector-builtins-functions.def
+++ b/gcc/config/riscv/riscv-vector-builtins-functions.def
@@ -500,6 +500,8 @@ DEF_RVV_FUNCTION (vfredosum, reduc_alu, no_mu_preds, 
f_vs_ops)
 DEF_RVV_FUNCTION (vfredmax, reduc_alu, no_mu_preds, f_vs_ops)
 DEF_RVV_FUNCTION (vfredmin, reduc_alu, no_mu_preds, f_vs_ops)

+DEF_RVV_FUNCTION (vfredusum_frm, reduc_alu_frm, no_mu_preds, f_vs_ops)
+
 // 14.4. Vector Widening Floating-Point Reduction Instructions
 DEF_RVV_FUNCTION (vfwredosum, reduc_alu, no_mu_preds, wf_vs_ops)
 DEF_RVV_FUNCTION (vfwredusum, reduc_alu, no_mu_preds, wf_vs_ops)
diff --git a/gcc/config/riscv/riscv-vector-builtins-shapes.cc 
b/gcc/config/riscv/riscv-vector-builtins-shapes.cc
index 80329113af3..f8fdec863e6 100644
--- a/gcc/config/riscv/riscv-vector-builtins-shapes.cc
+++ b/gcc/config/riscv/riscv-vector-builtins-shapes.cc
@@ -371,6 +371,44 @@ struct narrow_alu_frm_def : public build_frm_base
   }
 };

+/* reduc_alu_frm_def class.  */
+struct reduc_alu_frm_def : public build_frm_base
+{
+  char *get_name (function_builder , const function_instance ,
+ bool overloaded_p) const override
+  {
+char base_name[BASE_NAME_MAX_LEN] = {};
+
+

RE: [PATCH v1] RISC-V: Support RVV VFNCVT.F.{X|XU|F}.W rounding mode intrinsic API

2023-08-17 Thread Li, Pan2 via Gcc-patches

Committed, thanks Kito.

Pan

From: Kito Cheng 
Sent: Thursday, August 17, 2023 11:32 AM
To: Li, Pan2 
Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; Wang, Yanzhang 

Subject: Re: [PATCH v1] RISC-V: Support RVV VFNCVT.F.{X|XU|F}.W rounding mode 
intrinsic API

Lgtm

On Thu, Aug 17, 2023 at 10:19 AM Pan Li via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
From: Pan Li <pan2...@intel.com>

This patch adds rounding mode API support for VFNCVT.F.{X|XU|F}.W, as in
the samples below.

* __riscv_vfncvt_f_x_w_f32m1_rm
* __riscv_vfncvt_f_x_w_f32m1_rm_m
* __riscv_vfncvt_f_xu_w_f32m1_rm
* __riscv_vfncvt_f_xu_w_f32m1_rm_m
* __riscv_vfncvt_f_f_w_f32m1_rm
* __riscv_vfncvt_f_f_w_f32m1_rm_m
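
The tests in this series pass the literal values 0 and 1 as the rounding-mode operand of the `_rm` intrinsics. As a reading aid, the sketch below lists the static rounding-mode encodings of the RISC-V frm field and maps a value to a short name. It is a host-side illustration only: the enumerator names and `frm_name` are hypothetical and local to this sketch, not identifiers from GCC or from riscv_vector.h.

```c
#include <assert.h>
#include <stddef.h>

/* Static rounding-mode encodings of the RISC-V frm field.
   Enumerator names are hypothetical and local to this sketch.  */
enum frm_mode {
    FRM_RNE = 0, /* round to nearest, ties to even */
    FRM_RTZ = 1, /* round towards zero */
    FRM_RDN = 2, /* round down, towards negative infinity */
    FRM_RUP = 3, /* round up, towards positive infinity */
    FRM_RMM = 4  /* round to nearest, ties to max magnitude */
};

/* Return a short name for a static rounding mode, or NULL when the
   value is not one of the five static modes above.  */
const char *frm_name(unsigned frm)
{
    static const char *const names[] = {"rne", "rtz", "rdn", "rup", "rmm"};
    if (frm >= sizeof names / sizeof names[0])
        return NULL;
    return names[frm];
}
```

Read this way, a call that passes 0 requests round-to-nearest-even and a call that passes 1 requests round-towards-zero.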

Signed-off-by: Pan Li <pan2...@intel.com>

gcc/ChangeLog:

* config/riscv/riscv-vector-builtins-bases.cc
(class vfncvt_f): Add frm_op_type template arg.
(vfncvt_f_frm_obj): New declaration.
(BASE): Ditto.
* config/riscv/riscv-vector-builtins-bases.h: Ditto.
* config/riscv/riscv-vector-builtins-functions.def
(vfncvt_f_frm): New intrinsic function def.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/float-point-ncvt-f.c: New test.
---
 .../riscv/riscv-vector-builtins-bases.cc  | 10 ++-
 .../riscv/riscv-vector-builtins-bases.h   |  1 +
 .../riscv/riscv-vector-builtins-functions.def |  3 +
 .../riscv/rvv/base/float-point-ncvt-f.c   | 69 +++
 4 files changed, 82 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/float-point-ncvt-f.c

diff --git a/gcc/config/riscv/riscv-vector-builtins-bases.cc 
b/gcc/config/riscv/riscv-vector-builtins-bases.cc
index acadec2afca..ad04647f9ba 100644
--- a/gcc/config/riscv/riscv-vector-builtins-bases.cc
+++ b/gcc/config/riscv/riscv-vector-builtins-bases.cc
@@ -1786,9 +1786,15 @@ public:
   }
 };

+template
 class vfncvt_f : public function_base
 {
 public:
+  bool has_rounding_mode_operand_p () const override
+  {
+return FRM_OP == HAS_FRM;
+  }
+
   rtx expand (function_expander ) const override
   {
 if (e.op_info->op == OP_TYPE_f_w)
@@ -2512,7 +2518,8 @@ static CONSTEXPR const vfncvt_x 
vfncvt_xu_obj;
 static CONSTEXPR const vfncvt_x 
vfncvt_xu_frm_obj;
 static CONSTEXPR const vfncvt_rtz_x vfncvt_rtz_x_obj;
 static CONSTEXPR const vfncvt_rtz_x vfncvt_rtz_xu_obj;
-static CONSTEXPR const vfncvt_f vfncvt_f_obj;
+static CONSTEXPR const vfncvt_f vfncvt_f_obj;
+static CONSTEXPR const vfncvt_f vfncvt_f_frm_obj;
 static CONSTEXPR const vfncvt_rod_f vfncvt_rod_f_obj;
 static CONSTEXPR const reducop vredsum_obj;
 static CONSTEXPR const reducop vredmaxu_obj;
@@ -2769,6 +2776,7 @@ BASE (vfncvt_xu_frm)
 BASE (vfncvt_rtz_x)
 BASE (vfncvt_rtz_xu)
 BASE (vfncvt_f)
+BASE (vfncvt_f_frm)
 BASE (vfncvt_rod_f)
 BASE (vredsum)
 BASE (vredmaxu)
diff --git a/gcc/config/riscv/riscv-vector-builtins-bases.h 
b/gcc/config/riscv/riscv-vector-builtins-bases.h
index 9bd09a41960..c8c649c4bb0 100644
--- a/gcc/config/riscv/riscv-vector-builtins-bases.h
+++ b/gcc/config/riscv/riscv-vector-builtins-bases.h
@@ -226,6 +226,7 @@ extern const function_base *const vfncvt_xu_frm;
 extern const function_base *const vfncvt_rtz_x;
 extern const function_base *const vfncvt_rtz_xu;
 extern const function_base *const vfncvt_f;
+extern const function_base *const vfncvt_f_frm;
 extern const function_base *const vfncvt_rod_f;
 extern const function_base *const vredsum;
 extern const function_base *const vredmaxu;
diff --git a/gcc/config/riscv/riscv-vector-builtins-functions.def 
b/gcc/config/riscv/riscv-vector-builtins-functions.def
index 1e0e989fc2a..cfbc125dcd8 100644
--- a/gcc/config/riscv/riscv-vector-builtins-functions.def
+++ b/gcc/config/riscv/riscv-vector-builtins-functions.def
@@ -474,6 +474,9 @@ DEF_RVV_FUNCTION (vfncvt_rod_f, narrow_alu, full_preds, 
f_to_nf_f_w_ops)

 DEF_RVV_FUNCTION (vfncvt_x_frm, narrow_alu_frm, full_preds, f_to_ni_f_w_ops)
 DEF_RVV_FUNCTION (vfncvt_xu_frm, narrow_alu_frm, full_preds, f_to_nu_f_w_ops)
+DEF_RVV_FUNCTION (vfncvt_f_frm, narrow_alu_frm, full_preds, i_to_nf_x_w_ops)
+DEF_RVV_FUNCTION (vfncvt_f_frm, narrow_alu_frm, full_preds, u_to_nf_xu_w_ops)
+DEF_RVV_FUNCTION (vfncvt_f_frm, narrow_alu_frm, full_preds, f_to_nf_f_w_ops)

 /* 14. Vector Reduction Operations.  */

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-ncvt-f.c 
b/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-ncvt-f.c
new file mode 100644
index 000..d6d4be5e98e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-ncvt-f.c
@@ -0,0 +1,69 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64 -O3 -Wno-psabi" } */
+
+#include "riscv_vector.h"
+
+vfloat32m1_t
+test_riscv_vfncvt_f_x_w_f32m1_rm (vint64m2_t op1, size_t vl) {
+  return __riscv_vfncvt_f_x_w_f32m1_rm (op1, 0, vl);
+}
+
+vfloat32m1_t
+test_vfncvt_f_x_w_f32m1_rm_m (vbool32_t mask, vint64m2_t op1, size_t vl) {
+  return __riscv_vfncvt_f_x_w_f32m1_rm_m (mask, op1, 1, vl);

RE: [PATCH v1] RISC-V: Support RVV VFNCVT.XU.F.W rounding mode intrinsic API

2023-08-17 Thread Li, Pan2 via Gcc-patches

Committed, thanks Kito.

Pan

-Original Message-
From: Gcc-patches  On Behalf 
Of Li, Pan2 via Gcc-patches
Sent: Thursday, August 17, 2023 10:18 AM
To: Kito Cheng 
Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; Wang, Yanzhang 

Subject: RE: [PATCH v1] RISC-V: Support RVV VFNCVT.XU.F.W rounding mode 
intrinsic API

Thanks Kito, will commit it after the VFNCVT.X.F.W one, aka the signed integer 
cvt.

Pan

-Original Message-
From: Kito Cheng  
Sent: Thursday, August 17, 2023 9:30 AM
To: Li, Pan2 
Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; Wang, Yanzhang 

Subject: Re: [PATCH v1] RISC-V: Support RVV VFNCVT.XU.F.W rounding mode 
intrinsic API

LGTM

On Thu, Aug 17, 2023 at 9:23 AM Pan Li via Gcc-patches
 wrote:
>
> From: Pan Li 
>
> This patch adds rounding mode API support for VFNCVT.XU.F.W, as in the
> samples below.
>
> * __riscv_vfncvt_xu_f_w_u16mf2_rm
> * __riscv_vfncvt_xu_f_w_u16mf2_rm_m
>
> Signed-off-by: Pan Li 
>
> gcc/ChangeLog:
>
> * config/riscv/riscv-vector-builtins-bases.cc
> (vfncvt_xu_frm_obj): New declaration.
> (BASE): Ditto.
> * config/riscv/riscv-vector-builtins-bases.h: Ditto.
> * config/riscv/riscv-vector-builtins-functions.def
> (vfncvt_xu_frm): New intrinsic function def.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/rvv/base/float-point-ncvt-xu.c: New test.
> ---
>  .../riscv/riscv-vector-builtins-bases.cc  |  2 ++
>  .../riscv/riscv-vector-builtins-bases.h   |  1 +
>  .../riscv/riscv-vector-builtins-functions.def |  1 +
>  .../riscv/rvv/base/float-point-ncvt-xu.c  | 29 +++
>  4 files changed, 33 insertions(+)
>  create mode 100644 
> gcc/testsuite/gcc.target/riscv/rvv/base/float-point-ncvt-xu.c
>
> diff --git a/gcc/config/riscv/riscv-vector-builtins-bases.cc 
> b/gcc/config/riscv/riscv-vector-builtins-bases.cc
> index 2f40eeaeda5..acadec2afca 100644
> --- a/gcc/config/riscv/riscv-vector-builtins-bases.cc
> +++ b/gcc/config/riscv/riscv-vector-builtins-bases.cc
> @@ -2509,6 +2509,7 @@ static CONSTEXPR const vfwcvt_f vfwcvt_f_obj;
>  static CONSTEXPR const vfncvt_x vfncvt_x_obj;
>  static CONSTEXPR const vfncvt_x vfncvt_x_frm_obj;
>  static CONSTEXPR const vfncvt_x vfncvt_xu_obj;
> +static CONSTEXPR const vfncvt_x 
> vfncvt_xu_frm_obj;
>  static CONSTEXPR const vfncvt_rtz_x vfncvt_rtz_x_obj;
>  static CONSTEXPR const vfncvt_rtz_x vfncvt_rtz_xu_obj;
>  static CONSTEXPR const vfncvt_f vfncvt_f_obj;
> @@ -2764,6 +2765,7 @@ BASE (vfwcvt_f)
>  BASE (vfncvt_x)
>  BASE (vfncvt_x_frm)
>  BASE (vfncvt_xu)
> +BASE (vfncvt_xu_frm)
>  BASE (vfncvt_rtz_x)
>  BASE (vfncvt_rtz_xu)
>  BASE (vfncvt_f)
> diff --git a/gcc/config/riscv/riscv-vector-builtins-bases.h 
> b/gcc/config/riscv/riscv-vector-builtins-bases.h
> index edff0de2715..9bd09a41960 100644
> --- a/gcc/config/riscv/riscv-vector-builtins-bases.h
> +++ b/gcc/config/riscv/riscv-vector-builtins-bases.h
> @@ -222,6 +222,7 @@ extern const function_base *const vfwcvt_f;
>  extern const function_base *const vfncvt_x;
>  extern const function_base *const vfncvt_x_frm;
>  extern const function_base *const vfncvt_xu;
> +extern const function_base *const vfncvt_xu_frm;
>  extern const function_base *const vfncvt_rtz_x;
>  extern const function_base *const vfncvt_rtz_xu;
>  extern const function_base *const vfncvt_f;
> diff --git a/gcc/config/riscv/riscv-vector-builtins-functions.def 
> b/gcc/config/riscv/riscv-vector-builtins-functions.def
> index 5e37bae318a..1e0e989fc2a 100644
> --- a/gcc/config/riscv/riscv-vector-builtins-functions.def
> +++ b/gcc/config/riscv/riscv-vector-builtins-functions.def
> @@ -473,6 +473,7 @@ DEF_RVV_FUNCTION (vfncvt_f, narrow_alu, full_preds, 
> f_to_nf_f_w_ops)
>  DEF_RVV_FUNCTION (vfncvt_rod_f, narrow_alu, full_preds, f_to_nf_f_w_ops)
>
>  DEF_RVV_FUNCTION (vfncvt_x_frm, narrow_alu_frm, full_preds, f_to_ni_f_w_ops)
> +DEF_RVV_FUNCTION (vfncvt_xu_frm, narrow_alu_frm, full_preds, f_to_nu_f_w_ops)
>
>  /* 14. Vector Reduction Operations.  */
>
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-ncvt-xu.c 
> b/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-ncvt-xu.c
> new file mode 100644
> index 000..82c3e1364bf
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-ncvt-xu.c
> @@ -0,0 +1,29 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv64gcv -mabi=lp64 -O3 -Wno-psabi" } */
> +
> +#include "riscv_vector.h"
> +
> +vuint16mf2_t
> +test_riscv_vfncvt_xu_f_w_u16mf2_rm (vfloat32m1_t op1, size_t vl) {
> +  return __riscv_vfncvt_xu_f_w_u16mf2_rm (op1, 0, vl);
> +}
> +
> +vuint16mf2_t
> +test_vfncvt_xu_f_w_u16mf2_rm_m (vbool32_t m

RE: [PATCH v1] RISC-V: Support RVV VFNCVT.X.F.W rounding mode intrinsic API

2023-08-17 Thread Li, Pan2 via Gcc-patches

Committed, thanks Kito.

Pan

-Original Message-
From: Kito Cheng  
Sent: Thursday, August 17, 2023 3:30 PM
To: Li, Pan2 
Cc: juzhe.zh...@rivai.ai
Subject: Re: [PATCH v1] RISC-V: Support RVV VFNCVT.X.F.W rounding mode 
intrinsic API

Yeah, I missed that, LGTM :P

On Thu, Aug 17, 2023 at 2:28 PM Li, Pan2  wrote:
>
> Hi Kito,
>
> In case you missed this one: it is the precondition for committing the rest 
> of the rounding mode API patches.
> Thanks in advance; we are close to completing all of the rounding mode APIs.
>
> Pan
>
> -Original Message-
> From: Li, Pan2 
> Sent: Wednesday, August 16, 2023 8:54 PM
> To: gcc-patches@gcc.gnu.org
> Cc: juzhe.zh...@rivai.ai; Li, Pan2 ; Wang, Yanzhang 
> ; kito.ch...@gmail.com
> Subject: [PATCH v1] RISC-V: Support RVV VFNCVT.X.F.W rounding mode intrinsic 
> API
>
> From: Pan Li 
>
> This patch adds rounding mode API support for VFNCVT.X.F.W, as in the
> samples below.
>
> * __riscv_vfncvt_x_f_w_i16mf2_rm
> * __riscv_vfncvt_x_f_w_i16mf2_rm_m
>
> Signed-off-by: Pan Li 
>
> gcc/ChangeLog:
>
> * config/riscv/riscv-vector-builtins-bases.cc
> (class vfncvt_x): Add frm_op_type template arg.
> (BASE): New declaration.
> * config/riscv/riscv-vector-builtins-bases.h: Ditto.
> * config/riscv/riscv-vector-builtins-functions.def
> (vfncvt_x_frm): New intrinsic function def.
> * config/riscv/riscv-vector-builtins-shapes.cc
> (struct narrow_alu_frm_def): New shape function for frm.
> (SHAPE): New declaration.
> * config/riscv/riscv-vector-builtins-shapes.h: Ditto.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/rvv/base/float-point-ncvt-x.c: New test.
> ---
>  .../riscv/riscv-vector-builtins-bases.cc  |  9 -
>  .../riscv/riscv-vector-builtins-bases.h   |  1 +
>  .../riscv/riscv-vector-builtins-functions.def |  2 +
>  .../riscv/riscv-vector-builtins-shapes.cc | 39 +++
>  .../riscv/riscv-vector-builtins-shapes.h  |  1 +
>  .../riscv/rvv/base/float-point-ncvt-x.c   | 29 ++
>  6 files changed, 80 insertions(+), 1 deletion(-)
>  create mode 100644 
> gcc/testsuite/gcc.target/riscv/rvv/base/float-point-ncvt-x.c
>
> diff --git a/gcc/config/riscv/riscv-vector-builtins-bases.cc 
> b/gcc/config/riscv/riscv-vector-builtins-bases.cc
> index 050ecbe780c..2f40eeaeda5 100644
> --- a/gcc/config/riscv/riscv-vector-builtins-bases.cc
> +++ b/gcc/config/riscv/riscv-vector-builtins-bases.cc
> @@ -1759,10 +1759,15 @@ public:
>  };
>
>  /* Implements vfncvt.x.  */
> -template
> +template
>  class vfncvt_x : public function_base
>  {
>  public:
> +  bool has_rounding_mode_operand_p () const override
> +  {
> +return FRM_OP == HAS_FRM;
> +  }
> +
>rtx expand (function_expander ) const override
>{
>  return e.use_exact_insn (
> @@ -2502,6 +2507,7 @@ static CONSTEXPR const vfwcvt_rtz_x 
> vfwcvt_rtz_x_obj;
>  static CONSTEXPR const vfwcvt_rtz_x vfwcvt_rtz_xu_obj;
>  static CONSTEXPR const vfwcvt_f vfwcvt_f_obj;
>  static CONSTEXPR const vfncvt_x vfncvt_x_obj;
> +static CONSTEXPR const vfncvt_x vfncvt_x_frm_obj;
>  static CONSTEXPR const vfncvt_x vfncvt_xu_obj;
>  static CONSTEXPR const vfncvt_rtz_x vfncvt_rtz_x_obj;
>  static CONSTEXPR const vfncvt_rtz_x vfncvt_rtz_xu_obj;
> @@ -2756,6 +2762,7 @@ BASE (vfwcvt_rtz_x)
>  BASE (vfwcvt_rtz_xu)
>  BASE (vfwcvt_f)
>  BASE (vfncvt_x)
> +BASE (vfncvt_x_frm)
>  BASE (vfncvt_xu)
>  BASE (vfncvt_rtz_x)
>  BASE (vfncvt_rtz_xu)
> diff --git a/gcc/config/riscv/riscv-vector-builtins-bases.h 
> b/gcc/config/riscv/riscv-vector-builtins-bases.h
> index 6565740c597..edff0de2715 100644
> --- a/gcc/config/riscv/riscv-vector-builtins-bases.h
> +++ b/gcc/config/riscv/riscv-vector-builtins-bases.h
> @@ -220,6 +220,7 @@ extern const function_base *const vfwcvt_rtz_x;
>  extern const function_base *const vfwcvt_rtz_xu;
>  extern const function_base *const vfwcvt_f;
>  extern const function_base *const vfncvt_x;
> +extern const function_base *const vfncvt_x_frm;
>  extern const function_base *const vfncvt_xu;
>  extern const function_base *const vfncvt_rtz_x;
>  extern const function_base *const vfncvt_rtz_xu;
> diff --git a/gcc/config/riscv/riscv-vector-builtins-functions.def 
> b/gcc/config/riscv/riscv-vector-builtins-functions.def
> index 22c039c8cbb..5e37bae318a 100644
> --- a/gcc/config/riscv/riscv-vector-builtins-functions.def
> +++ b/gcc/config/riscv/riscv-vector-builtins-functions.def
> @@ -472,6 +472,8 @@ DEF_RVV_FUNCTION (vfncvt_f, narrow_alu, full_preds, 
> u_to_nf_xu_w_ops)
>  DEF_RVV_FUNCTION (vfncvt_f, narrow_alu, full_preds, f_to_nf_f_w_ops)
>  DEF_RVV_FUNCTION (vfncvt_rod_f, narrow_alu, full_preds, f_to_nf_f_w_ops)
>
> +DEF_RVV_FUNCTION (vfncvt_x_frm, narrow_alu_frm, full_preds, f_to_ni_f_w_ops)
> +
>  /* 14. Vector Reduction Operations.  */
>
>  // 14.1. Vector Single-Width Integer Reduction Instructions
> diff --git a/gcc/config/riscv/riscv-vector-builtins-shapes.cc 
>

RE: [PATCH] RISC-V: Fix incorrect VTYPE fusion for floating point scalar move insn[PR111037]

2023-08-17 Thread Li, Pan2 via Gcc-patches

Committed, thanks Kito.

Pan

-Original Message-
From: Gcc-patches  On Behalf 
Of Kito Cheng via Gcc-patches
Sent: Thursday, August 17, 2023 2:08 PM
To: Juzhe-Zhong 
Cc: gcc-patches@gcc.gnu.org; kito.ch...@sifive.com; jeffreya...@gmail.com; 
rdapp@gmail.com
Subject: Re: [PATCH] RISC-V: Fix incorrect VTYPE fusion for floating point 
scalar move insn[PR111037]

LGTM, thanks :)

On Thu, Aug 17, 2023 at 1:59 PM Juzhe-Zhong  wrote:
>
> void foo(_Float16 y, int64_t *i64p)
> {
>   vint64m1_t vx =__riscv_vle64_v_i64m1 (i64p, 1);
>   vx = __riscv_vadd_vv_i64m1 (vx, vx, 1);
>   vfloat16m1_t vy =__riscv_vfmv_s_f_f16m1 (y, 1);
>   asm volatile ("# use %0 %1" : : "vr"(vx), "vr" (vy));
> }
>
> zve64f:
> foo:
> vsetivli zero,1,e16,mf4,ta,ma
> vle64.v v1,0(a0)
> vfmv.s.f v2,fa0
> vsetvli zero,zero,e64,m1,ta,ma
> vadd.vv v1,v1,v1
>
> zve64d:
> foo:
> vsetivli zero,1,e64,m1,ta,ma
> vle64.v v1,0(a0)
> vfmv.s.f v2,fa0
> vadd.vv v1,v1,v1
>
> PR target/111037
>
> gcc/ChangeLog:
>
> * config/riscv/riscv-vsetvl.cc (float_insn_valid_sew_p): New function.
> (second_sew_less_than_first_sew_p): Fix bug.
> (first_sew_less_than_second_sew_p): Ditto.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/rvv/base/pr111037-1.c: New test.
> * gcc.target/riscv/rvv/base/pr111037-2.c: New test.
>
> ---
>  gcc/config/riscv/riscv-vsetvl.cc  | 22 +--
>  .../gcc.target/riscv/rvv/base/pr111037-1.c| 15 +
>  .../gcc.target/riscv/rvv/base/pr111037-2.c|  8 +++
>  3 files changed, 43 insertions(+), 2 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111037-1.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111037-2.c
>
> diff --git a/gcc/config/riscv/riscv-vsetvl.cc 
> b/gcc/config/riscv/riscv-vsetvl.cc
> index 08c487d82c0..79cbac01047 100644
> --- a/gcc/config/riscv/riscv-vsetvl.cc
> +++ b/gcc/config/riscv/riscv-vsetvl.cc
> @@ -1183,18 +1183,36 @@ second_ratio_invalid_for_first_lmul_p (const 
> vector_insn_info ,
>return calculate_sew (info1.get_vlmul (), info2.get_ratio ()) == 0;
>  }
>
> +static bool
> +float_insn_valid_sew_p (const vector_insn_info , unsigned int sew)
> +{
> +  if (info.get_insn () && info.get_insn ()->is_real ()
> +  && get_attr_type (info.get_insn ()->rtl ()) == TYPE_VFMOVFV)
> +{
> +  if (sew == 16)
> +   return TARGET_VECTOR_ELEN_FP_16;
> +  else if (sew == 32)
> +   return TARGET_VECTOR_ELEN_FP_32;
> +  else if (sew == 64)
> +   return TARGET_VECTOR_ELEN_FP_64;
> +}
> +  return true;
> +}
> +
>  static bool
>  second_sew_less_than_first_sew_p (const vector_insn_info ,
>   const vector_insn_info )
>  {
> -  return info2.get_sew () < info1.get_sew ();
> +  return info2.get_sew () < info1.get_sew ()
> +|| !float_insn_valid_sew_p (info1, info2.get_sew ());
>  }
>
>  static bool
>  first_sew_less_than_second_sew_p (const vector_insn_info ,
>   const vector_insn_info )
>  {
> -  return info1.get_sew () < info2.get_sew ();
> +  return info1.get_sew () < info2.get_sew ()
> +|| !float_insn_valid_sew_p (info2, info1.get_sew ());
>  }
>
>  /* return 0 if LMUL1 == LMUL2.
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr111037-1.c 
> b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111037-1.c
> new file mode 100644
> index 000..0b7b32fc3e6
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111037-1.c
> @@ -0,0 +1,15 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv32gc_zve64f_zvfh -mabi=ilp32d -O3" } */
> +
> +#include "riscv_vector.h"
> +
> +void foo(_Float16 y, int64_t *i64p)
> +{
> +  vint64m1_t vx =__riscv_vle64_v_i64m1 (i64p, 1);
> +  vx = __riscv_vadd_vv_i64m1 (vx, vx, 1);
> +  vfloat16m1_t vy =__riscv_vfmv_s_f_f16m1 (y, 1);
> +  asm volatile ("# use %0 %1" : : "vr"(vx), "vr" (vy));
> +}
> +
> +/* { dg-final { scan-assembler-times 
> {vsetivli\s+zero,\s*1,\s*e16,\s*mf4,\s*t[au],\s*m[au]} 1 } } */
> +/* { dg-final { scan-assembler-times 
> {vsetvli\s+zero,\s*zero,\s*e64,\s*m1,\s*t[au],\s*m[au]} 1 } } */
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr111037-2.c 
> b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111037-2.c
> new file mode 100644
> index 000..ac50da71726
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111037-2.c
> @@ -0,0 +1,8 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv32gc_zve64d_zvfh -mabi=ilp32d -O3" } */
> +
> +#include "pr111037-1.c"
> +
> +/* { dg-final { scan-assembler-times 
> {vsetivli\s+zero,\s*1,\s*e64,\s*m1,\s*t[au],\s*m[au]} 1 } } */
> +/* { dg-final { scan-assembler-not {vsetvli} } } */
> +/* { dg-final { scan-assembler-times {vsetivli} 1 } } */
> --
> 2.36.3
>
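
The patch above keeps a floating-point scalar move from being fused into a vsetvl whose SEW names a floating-point width the target lacks. That is why the zve64f output keeps an e16,mf4 configuration for vfmv.s.f, while zve64d can use e64,m1 throughout. The host-side C model below is a simplified sketch, not GCC code: `struct fp_elen`, `fp_move_sew_ok` and `sew_lmul_ratio` are hypothetical names, and the struct stands in for the TARGET_VECTOR_ELEN_FP_* flags. The second helper computes the SEW/LMUL ratio, which is 64 for both e16,mf4 and e64,m1.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for GCC's TARGET_VECTOR_ELEN_FP_* flags.  */
struct fp_elen {
    bool fp16, fp32, fp64;
};

/* Mirrors the shape of float_insn_valid_sew_p for a scalar FP move:
   a candidate SEW of 16, 32 or 64 is only acceptable when the target
   supports floating point of that width.  */
bool fp_move_sew_ok(struct fp_elen t, unsigned sew)
{
    switch (sew) {
    case 16: return t.fp16;
    case 32: return t.fp32;
    case 64: return t.fp64;
    default: return true; /* other widths are not rejected, as in the patch */
    }
}

/* SEW/LMUL ratio, with LMUL given as the fraction num/den (mf4 is 1/4).  */
unsigned sew_lmul_ratio(unsigned sew, unsigned lmul_num, unsigned lmul_den)
{
    assert(lmul_num != 0);
    return sew * lmul_den / lmul_num;
}
```

Under these assumptions, a zve64f_zvfh target modelled as {fp16, fp32, no fp64} rejects SEW 64 for the scalar move, so the e64 configuration needed by vadd.vv has to be a separate vsetvli, as in the expected assembly of pr111037-1.c.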

RE: [PATCH v1] RISC-V: Support RVV VFNCVT.XU.F.W rounding mode intrinsic API

2023-08-16 Thread Li, Pan2 via Gcc-patches

Thanks Kito, will commit it after the VFNCVT.X.F.W one, aka the signed integer 
cvt.

Pan

-Original Message-
From: Kito Cheng  
Sent: Thursday, August 17, 2023 9:30 AM
To: Li, Pan2 
Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; Wang, Yanzhang 

Subject: Re: [PATCH v1] RISC-V: Support RVV VFNCVT.XU.F.W rounding mode 
intrinsic API

LGTM

On Thu, Aug 17, 2023 at 9:23 AM Pan Li via Gcc-patches
 wrote:
>
> From: Pan Li 
>
> This patch adds rounding mode API support for VFNCVT.XU.F.W, as in the
> samples below.
>
> * __riscv_vfncvt_xu_f_w_u16mf2_rm
> * __riscv_vfncvt_xu_f_w_u16mf2_rm_m
>
> Signed-off-by: Pan Li 
>
> gcc/ChangeLog:
>
> * config/riscv/riscv-vector-builtins-bases.cc
> (vfncvt_xu_frm_obj): New declaration.
> (BASE): Ditto.
> * config/riscv/riscv-vector-builtins-bases.h: Ditto.
> * config/riscv/riscv-vector-builtins-functions.def
> (vfncvt_xu_frm): New intrinsic function def.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/rvv/base/float-point-ncvt-xu.c: New test.
> ---
>  .../riscv/riscv-vector-builtins-bases.cc  |  2 ++
>  .../riscv/riscv-vector-builtins-bases.h   |  1 +
>  .../riscv/riscv-vector-builtins-functions.def |  1 +
>  .../riscv/rvv/base/float-point-ncvt-xu.c  | 29 +++
>  4 files changed, 33 insertions(+)
>  create mode 100644 
> gcc/testsuite/gcc.target/riscv/rvv/base/float-point-ncvt-xu.c
>
> diff --git a/gcc/config/riscv/riscv-vector-builtins-bases.cc 
> b/gcc/config/riscv/riscv-vector-builtins-bases.cc
> index 2f40eeaeda5..acadec2afca 100644
> --- a/gcc/config/riscv/riscv-vector-builtins-bases.cc
> +++ b/gcc/config/riscv/riscv-vector-builtins-bases.cc
> @@ -2509,6 +2509,7 @@ static CONSTEXPR const vfwcvt_f vfwcvt_f_obj;
>  static CONSTEXPR const vfncvt_x vfncvt_x_obj;
>  static CONSTEXPR const vfncvt_x vfncvt_x_frm_obj;
>  static CONSTEXPR const vfncvt_x vfncvt_xu_obj;
> +static CONSTEXPR const vfncvt_x 
> vfncvt_xu_frm_obj;
>  static CONSTEXPR const vfncvt_rtz_x vfncvt_rtz_x_obj;
>  static CONSTEXPR const vfncvt_rtz_x vfncvt_rtz_xu_obj;
>  static CONSTEXPR const vfncvt_f vfncvt_f_obj;
> @@ -2764,6 +2765,7 @@ BASE (vfwcvt_f)
>  BASE (vfncvt_x)
>  BASE (vfncvt_x_frm)
>  BASE (vfncvt_xu)
> +BASE (vfncvt_xu_frm)
>  BASE (vfncvt_rtz_x)
>  BASE (vfncvt_rtz_xu)
>  BASE (vfncvt_f)
> diff --git a/gcc/config/riscv/riscv-vector-builtins-bases.h 
> b/gcc/config/riscv/riscv-vector-builtins-bases.h
> index edff0de2715..9bd09a41960 100644
> --- a/gcc/config/riscv/riscv-vector-builtins-bases.h
> +++ b/gcc/config/riscv/riscv-vector-builtins-bases.h
> @@ -222,6 +222,7 @@ extern const function_base *const vfwcvt_f;
>  extern const function_base *const vfncvt_x;
>  extern const function_base *const vfncvt_x_frm;
>  extern const function_base *const vfncvt_xu;
> +extern const function_base *const vfncvt_xu_frm;
>  extern const function_base *const vfncvt_rtz_x;
>  extern const function_base *const vfncvt_rtz_xu;
>  extern const function_base *const vfncvt_f;
> diff --git a/gcc/config/riscv/riscv-vector-builtins-functions.def 
> b/gcc/config/riscv/riscv-vector-builtins-functions.def
> index 5e37bae318a..1e0e989fc2a 100644
> --- a/gcc/config/riscv/riscv-vector-builtins-functions.def
> +++ b/gcc/config/riscv/riscv-vector-builtins-functions.def
> @@ -473,6 +473,7 @@ DEF_RVV_FUNCTION (vfncvt_f, narrow_alu, full_preds, 
> f_to_nf_f_w_ops)
>  DEF_RVV_FUNCTION (vfncvt_rod_f, narrow_alu, full_preds, f_to_nf_f_w_ops)
>
>  DEF_RVV_FUNCTION (vfncvt_x_frm, narrow_alu_frm, full_preds, f_to_ni_f_w_ops)
> +DEF_RVV_FUNCTION (vfncvt_xu_frm, narrow_alu_frm, full_preds, f_to_nu_f_w_ops)
>
>  /* 14. Vector Reduction Operations.  */
>
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-ncvt-xu.c 
> b/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-ncvt-xu.c
> new file mode 100644
> index 000..82c3e1364bf
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-ncvt-xu.c
> @@ -0,0 +1,29 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv64gcv -mabi=lp64 -O3 -Wno-psabi" } */
> +
> +#include "riscv_vector.h"
> +
> +vuint16mf2_t
> +test_riscv_vfncvt_xu_f_w_u16mf2_rm (vfloat32m1_t op1, size_t vl) {
> +  return __riscv_vfncvt_xu_f_w_u16mf2_rm (op1, 0, vl);
> +}
> +
> +vuint16mf2_t
> +test_vfncvt_xu_f_w_u16mf2_rm_m (vbool32_t mask, vfloat32m1_t op1, size_t vl) 
> {
> +  return __riscv_vfncvt_xu_f_w_u16mf2_rm_m (mask, op1, 1, vl);
> +}
> +
> +vuint16mf2_t
> +test_riscv_vfncvt_xu_f_w_u16mf2 (vfloat32m1_t op1, size_t vl) {
> +  return __riscv_vfncvt_xu_f_w_u16mf2 (op1, vl);
> +}
> +
> +vuint16mf2_t
> +test_vfncvt_xu_f_w_u16mf2_m (vbool32_t mask, vfloat32m1_t op1, size_t vl) {
> +  return __riscv_vfncvt_xu_f_w_u16mf2_m (mask, op1, vl);
> +}
> +
> +/* { dg-final { scan-assembler-times {vfncvt\.xu\.f\.w\s+v[0-9]+,\s*v[0-9]+} 
> 4 } } */
> +/* { dg-final { scan-assembler-times {frrm\s+[axs][0-9]+} 2 } } */
> +/* { dg-final { scan-assembler-times {fsrm\s+[axs][0-9]+} 2 } }

RE: [PATCH v2] RISC-V: Support RVV VFWCVT.XU.F.V rounding mode intrinsic API

2023-08-16 Thread Li, Pan2 via Gcc-patches

Committed, thanks Kito.

Pan

-Original Message-
From: Kito Cheng  
Sent: Wednesday, August 16, 2023 5:54 PM
To: Li, Pan2 
Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; kito.ch...@sifive.com; Wang, 
Yanzhang 
Subject: Re: [PATCH v2] RISC-V: Support RVV VFWCVT.XU.F.V rounding mode 
intrinsic API

ok

On Wed, Aug 16, 2023 at 4:10 PM Pan Li via Gcc-patches
 wrote:
>
> From: Pan Li 
>
> This patch adds rounding mode API support for VFWCVT.XU.F.V, as in the
> samples below.
>
> * __riscv_vfwcvt_xu_f_v_u64m2_rm
> * __riscv_vfwcvt_xu_f_v_u64m2_rm_m
>
> Signed-off-by: Pan Li 
>
> gcc/ChangeLog:
>
> * config/riscv/riscv-vector-builtins-bases.cc
> (BASE): New declaration.
> * config/riscv/riscv-vector-builtins-bases.h: Ditto.
> * config/riscv/riscv-vector-builtins-functions.def
> (vfwcvt_xu_frm): New intrinsic function def.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/rvv/base/float-point-wcvt-xu.c: New test.
> ---
>  .../riscv/riscv-vector-builtins-bases.cc  |  2 ++
>  .../riscv/riscv-vector-builtins-bases.h   |  1 +
>  .../riscv/riscv-vector-builtins-functions.def |  1 +
>  .../riscv/rvv/base/float-point-wcvt-xu.c  | 29 +++
>  4 files changed, 33 insertions(+)
>  create mode 100644 
> gcc/testsuite/gcc.target/riscv/rvv/base/float-point-wcvt-xu.c
>
> diff --git a/gcc/config/riscv/riscv-vector-builtins-bases.cc 
> b/gcc/config/riscv/riscv-vector-builtins-bases.cc
> index 22640745398..6621c77c3f2 100644
> --- a/gcc/config/riscv/riscv-vector-builtins-bases.cc
> +++ b/gcc/config/riscv/riscv-vector-builtins-bases.cc
> @@ -2497,6 +2497,7 @@ static CONSTEXPR const vfcvt_f vfcvt_f_frm_obj;
>  static CONSTEXPR const vfwcvt_x vfwcvt_x_obj;
>  static CONSTEXPR const vfwcvt_x vfwcvt_x_frm_obj;
>  static CONSTEXPR const vfwcvt_x vfwcvt_xu_obj;
> +static CONSTEXPR const vfwcvt_x 
> vfwcvt_xu_frm_obj;
>  static CONSTEXPR const vfwcvt_rtz_x vfwcvt_rtz_x_obj;
>  static CONSTEXPR const vfwcvt_rtz_x vfwcvt_rtz_xu_obj;
>  static CONSTEXPR const vfwcvt_f vfwcvt_f_obj;
> @@ -2750,6 +2751,7 @@ BASE (vfcvt_f_frm)
>  BASE (vfwcvt_x)
>  BASE (vfwcvt_x_frm)
>  BASE (vfwcvt_xu)
> +BASE (vfwcvt_xu_frm)
>  BASE (vfwcvt_rtz_x)
>  BASE (vfwcvt_rtz_xu)
>  BASE (vfwcvt_f)
> diff --git a/gcc/config/riscv/riscv-vector-builtins-bases.h 
> b/gcc/config/riscv/riscv-vector-builtins-bases.h
> index dd711846cbe..6565740c597 100644
> --- a/gcc/config/riscv/riscv-vector-builtins-bases.h
> +++ b/gcc/config/riscv/riscv-vector-builtins-bases.h
> @@ -215,6 +215,7 @@ extern const function_base *const vfcvt_f_frm;
>  extern const function_base *const vfwcvt_x;
>  extern const function_base *const vfwcvt_x_frm;
>  extern const function_base *const vfwcvt_xu;
> +extern const function_base *const vfwcvt_xu_frm;
>  extern const function_base *const vfwcvt_rtz_x;
>  extern const function_base *const vfwcvt_rtz_xu;
>  extern const function_base *const vfwcvt_f;
> diff --git a/gcc/config/riscv/riscv-vector-builtins-functions.def 
> b/gcc/config/riscv/riscv-vector-builtins-functions.def
> index 4e6cc793447..22c039c8cbb 100644
> --- a/gcc/config/riscv/riscv-vector-builtins-functions.def
> +++ b/gcc/config/riscv/riscv-vector-builtins-functions.def
> @@ -460,6 +460,7 @@ DEF_RVV_FUNCTION (vfwcvt_f, alu, full_preds, 
> u_to_wf_xu_v_ops)
>  DEF_RVV_FUNCTION (vfwcvt_f, alu, full_preds, f_to_wf_f_v_ops)
>
>  DEF_RVV_FUNCTION (vfwcvt_x_frm, alu_frm, full_preds, f_to_wi_f_v_ops)
> +DEF_RVV_FUNCTION (vfwcvt_xu_frm, alu_frm, full_preds, f_to_wu_f_v_ops)
>
>  // 13.19. Narrowing Floating-Point/Integer Type-Convert Instructions
>  DEF_RVV_FUNCTION (vfncvt_x, narrow_alu, full_preds, f_to_ni_f_w_ops)
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-wcvt-xu.c 
> b/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-wcvt-xu.c
> new file mode 100644
> index 000..29449e79b69
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-wcvt-xu.c
> @@ -0,0 +1,29 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv64gcv -mabi=lp64 -O3 -Wno-psabi" } */
> +
> +#include "riscv_vector.h"
> +
> +vuint64m2_t
> +test_riscv_vfwcvt_xu_f_v_u64m2_rm (vfloat32m1_t op1, size_t vl) {
> +  return __riscv_vfwcvt_xu_f_v_u64m2_rm (op1, 0, vl);
> +}
> +
> +vuint64m2_t
> +test_vfwcvt_xu_f_v_u64m2_rm_m (vbool32_t mask, vfloat32m1_t op1, size_t vl) {
> +  return __riscv_vfwcvt_xu_f_v_u64m2_rm_m (mask, op1, 1, vl);
> +}
> +
> +vuint64m2_t
> +test_riscv_vfwcvt_xu_f_v_u64m2 (vfloat32m1_t op1, size_t vl) {
> +  return __riscv_vfwcvt_xu_f_v_u64m2 (op1, vl);
> +}
> +
> +vuint64m2_t
> +test_vfwcvt_xu_f_v_u64m2_m (vbool32_t mask, vfloat32m1_t op1, size_t vl) {
> +  return __riscv_vfwcvt_xu_f_v_u64m2_m (mask, op1, vl);
> +}
> +
> +/* { dg-final { scan-assembler-times {vfwcvt\.xu\.f\.v\s+v[0-9]+,\s*v[0-9]+} 
> 4 } } */
> +/* { dg-final { scan-assembler-times {frrm\s+[axs][0-9]+} 2 } } */
> +/* { dg-final { scan-assembler-times {fsrm\s+[axs][0-9]+} 2 } } */
> +/* {

RE: [PATCH v1] RISC-V: Fix one build error for template default arg

2023-08-16 Thread Li, Pan2 via Gcc-patches

Committed, thanks Kito.

Pan

-Original Message-
From: Kito Cheng  
Sent: Wednesday, August 16, 2023 5:49 PM
To: Li, Pan2 
Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; kito.ch...@sifive.com; Wang, 
Yanzhang 
Subject: Re: [PATCH v1] RISC-V: Fix one build error for template default arg

ok

On Wed, Aug 16, 2023 at 5:44 PM Pan Li via Gcc-patches
 wrote:
>
> From: Pan Li 
>
> In some build option combinations, the default template argument may result
> in the error below. This patch fixes it by passing an explicit argument.
>
> riscv-vector-builtins-bases.cc:2495:24: error: invalid use of template-name \
>   ‘riscv_vector::vfcvt_f’ without an argument list
>
> Signed-off-by: Pan Li 
>
> gcc/ChangeLog:
>
> * config/riscv/riscv-vector-builtins-bases.cc: Use explicit argument.
> ---
>  gcc/config/riscv/riscv-vector-builtins-bases.cc | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/gcc/config/riscv/riscv-vector-builtins-bases.cc 
> b/gcc/config/riscv/riscv-vector-builtins-bases.cc
> index 22640745398..18453e54b51 100644
> --- a/gcc/config/riscv/riscv-vector-builtins-bases.cc
> +++ b/gcc/config/riscv/riscv-vector-builtins-bases.cc
> @@ -2492,7 +2492,7 @@ static CONSTEXPR const vfcvt_x 
> vfcvt_xu_obj;
>  static CONSTEXPR const vfcvt_x 
> vfcvt_xu_frm_obj;
>  static CONSTEXPR const vfcvt_rtz_x vfcvt_rtz_x_obj;
>  static CONSTEXPR const vfcvt_rtz_x vfcvt_rtz_xu_obj;
> -static CONSTEXPR const vfcvt_f vfcvt_f_obj;
> +static CONSTEXPR const vfcvt_f<NO_FRM> vfcvt_f_obj;
>  static CONSTEXPR const vfcvt_f vfcvt_f_frm_obj;
>  static CONSTEXPR const vfwcvt_x vfwcvt_x_obj;
>  static CONSTEXPR const vfwcvt_x vfwcvt_x_frm_obj;
> --
> 2.34.1
>
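The quoted error is what a compiler reports when a class template whose parameters all have defaults is named without an argument list. A minimal standalone sketch follows; `cvt_sketch` and the object names are hypothetical stand-ins, only the enumerator names mirror the patch series, and reading "some build option combination" as "a pre-C++17 build" is an assumption rather than something the patch states.

```cpp
// Hypothetical stand-ins; only NO_FRM / HAS_FRM mirror the patch series.
enum frm_op_type { NO_FRM, HAS_FRM };

template <enum frm_op_type FRM_OP = NO_FRM>
struct cvt_sketch
{
  constexpr bool has_rounding_mode_operand_p () const
  {
    return FRM_OP == HAS_FRM;
  }
};

// static constexpr cvt_sketch bare_obj{};   // bare name: rejected before C++17
static constexpr cvt_sketch<NO_FRM> plain_obj{};  // explicit argument: accepted
static constexpr cvt_sketch<HAS_FRM> frm_obj{};
```

Writing `cvt_sketch<>` would also work; spelling the argument out, as the patch does, keeps the declaration visually parallel to its `_frm` sibling.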

RE: [PATCH v2] RISC-V: Support RVV VFWCVT.X.F.V rounding mode intrinsic API

2023-08-16 Thread Li, Pan2 via Gcc-patches

Committed, thanks Kito.

Pan

From: Kito Cheng 
Sent: Wednesday, August 16, 2023 3:38 PM
To: Li, Pan2 
Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; Wang, Yanzhang 

Subject: Re: [PATCH v2] RISC-V: Support RVV VFWCVT.X.F.V rounding mode 
intrinsic API

lgtm

On Wed, Aug 16, 2023 at 3:32 PM <pan2...@intel.com> wrote:
From: Pan Li <pan2...@intel.com>

This patch would like to support the rounding mode API for the
VFWCVT.X.F.V as the below samples.

* __riscv_vfwcvt_x_f_v_i64m2_rm
* __riscv_vfwcvt_x_f_v_i64m2_rm_m

Signed-off-by: Pan Li <pan2...@intel.com>

gcc/ChangeLog:

* config/riscv/riscv-vector-builtins-bases.cc
(BASE): New declaration.
* config/riscv/riscv-vector-builtins-bases.h: Ditto.
* config/riscv/riscv-vector-builtins-functions.def
(vfwcvt_x_frm): New intrinsic function def.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/float-point-wcvt-x.c: New test.
---
 .../riscv/riscv-vector-builtins-bases.cc  |  9 +-
 .../riscv/riscv-vector-builtins-bases.h   |  1 +
 .../riscv/riscv-vector-builtins-functions.def |  2 ++
 .../riscv/rvv/base/float-point-wcvt-x.c   | 29 +++
 4 files changed, 40 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/float-point-wcvt-x.c

diff --git a/gcc/config/riscv/riscv-vector-builtins-bases.cc 
b/gcc/config/riscv/riscv-vector-builtins-bases.cc
index c78fa8e5b62..22640745398 100644
--- a/gcc/config/riscv/riscv-vector-builtins-bases.cc
+++ b/gcc/config/riscv/riscv-vector-builtins-bases.cc
@@ -1715,10 +1715,15 @@ public:
 };

 /* Implements vfwcvt.x.  */
-template<int UNSPEC>
+template<int UNSPEC, enum frm_op_type FRM_OP = NO_FRM>
 class vfwcvt_x : public function_base
 {
 public:
+  bool has_rounding_mode_operand_p () const override
+  {
+return FRM_OP == HAS_FRM;
+  }
+
   rtx expand (function_expander ) const override
   {
 return e.use_exact_insn (
@@ -2490,6 +2495,7 @@ static CONSTEXPR const vfcvt_rtz_x 
vfcvt_rtz_xu_obj;
 static CONSTEXPR const vfcvt_f vfcvt_f_obj;
 static CONSTEXPR const vfcvt_f vfcvt_f_frm_obj;
 static CONSTEXPR const vfwcvt_x vfwcvt_x_obj;
+static CONSTEXPR const vfwcvt_x vfwcvt_x_frm_obj;
 static CONSTEXPR const vfwcvt_x vfwcvt_xu_obj;
 static CONSTEXPR const vfwcvt_rtz_x vfwcvt_rtz_x_obj;
 static CONSTEXPR const vfwcvt_rtz_x vfwcvt_rtz_xu_obj;
@@ -2742,6 +2748,7 @@ BASE (vfcvt_rtz_xu)
 BASE (vfcvt_f)
 BASE (vfcvt_f_frm)
 BASE (vfwcvt_x)
+BASE (vfwcvt_x_frm)
 BASE (vfwcvt_xu)
 BASE (vfwcvt_rtz_x)
 BASE (vfwcvt_rtz_xu)
diff --git a/gcc/config/riscv/riscv-vector-builtins-bases.h 
b/gcc/config/riscv/riscv-vector-builtins-bases.h
index 08452587180..dd711846cbe 100644
--- a/gcc/config/riscv/riscv-vector-builtins-bases.h
+++ b/gcc/config/riscv/riscv-vector-builtins-bases.h
@@ -213,6 +213,7 @@ extern const function_base *const vfcvt_rtz_xu;
 extern const function_base *const vfcvt_f;
 extern const function_base *const vfcvt_f_frm;
 extern const function_base *const vfwcvt_x;
+extern const function_base *const vfwcvt_x_frm;
 extern const function_base *const vfwcvt_xu;
 extern const function_base *const vfwcvt_rtz_x;
 extern const function_base *const vfwcvt_rtz_xu;
diff --git a/gcc/config/riscv/riscv-vector-builtins-functions.def 
b/gcc/config/riscv/riscv-vector-builtins-functions.def
index 8dbcd946d11..4e6cc793447 100644
--- a/gcc/config/riscv/riscv-vector-builtins-functions.def
+++ b/gcc/config/riscv/riscv-vector-builtins-functions.def
@@ -459,6 +459,8 @@ DEF_RVV_FUNCTION (vfwcvt_f, alu, full_preds, 
i_to_wf_x_v_ops)
 DEF_RVV_FUNCTION (vfwcvt_f, alu, full_preds, u_to_wf_xu_v_ops)
 DEF_RVV_FUNCTION (vfwcvt_f, alu, full_preds, f_to_wf_f_v_ops)

+DEF_RVV_FUNCTION (vfwcvt_x_frm, alu_frm, full_preds, f_to_wi_f_v_ops)
+
 // 13.19. Narrowing Floating-Point/Integer Type-Convert Instructions
 DEF_RVV_FUNCTION (vfncvt_x, narrow_alu, full_preds, f_to_ni_f_w_ops)
 DEF_RVV_FUNCTION (vfncvt_xu, narrow_alu, full_preds, f_to_nu_f_w_ops)
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-wcvt-x.c 
b/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-wcvt-x.c
new file mode 100644
index 000..8f67ec00966
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-wcvt-x.c
@@ -0,0 +1,29 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64 -O3 -Wno-psabi" } */
+
+#include "riscv_vector.h"
+
+vint64m2_t
+test_riscv_vfwcvt_x_f_v_i64m2_rm (vfloat32m1_t op1, size_t vl) {
+  return __riscv_vfwcvt_x_f_v_i64m2_rm (op1, 0, vl);
+}
+
+vint64m2_t
+test_vfwcvt_x_f_v_i64m2_rm_m (vbool32_t mask, vfloat32m1_t op1, size_t vl) {
+  return __riscv_vfwcvt_x_f_v_i64m2_rm_m (mask, op1, 1, vl);
+}
+
+vint64m2_t
+test_riscv_vfwcvt_x_f_v_i64m2 (vfloat32m1_t op1, size_t vl) {
+  return __riscv_vfwcvt_x_f_v_i64m2 (op1, vl);
+}
+
+vint64m2_t
+test_vfwcvt_x_f_v_i64m2_m (vbool32_t mask, vfloat32m1_t op1, size_t vl) {
+  return __riscv_vfwcvt_x_f_v_i64m2_m (mask, op1, vl);
+}
+
+/* { dg-final { scan-assembler-times {vfwcvt\.x\.f\.v\s+v[0-9]+,\s*v[0-9]+}

RE: [PATCH v2] RISC-V: Support RVV VFCVT.F.X.V and VFCVT.F.XU.V rounding mode intrinsic API

2023-08-16 Thread Li, Pan2 via Gcc-patches

Committed, thanks Kito.

Pan

From: Kito Cheng 
Sent: Wednesday, August 16, 2023 3:12 PM
To: Li, Pan2 
Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; Wang, Yanzhang 

Subject: Re: [PATCH v2] RISC-V: Support RVV VFCVT.F.X.V and VFCVT.F.XU.V 
rounding mode intrinsic API

lgtm

On Wed, Aug 16, 2023 at 2:51 PM <pan2...@intel.com> wrote:
From: Pan Li <pan2...@intel.com>

This patch would like to support the rounding mode API for the
VFCVT.F.X.V and VFCVT.F.XU.V as the below samples.

* __riscv_vfcvt_f_x_v_f32m1_rm
* __riscv_vfcvt_f_x_v_f32m1_rm_m
* __riscv_vfcvt_f_xu_v_f32m1_rm
* __riscv_vfcvt_f_xu_v_f32m1_rm_m

Signed-off-by: Pan Li <pan2...@intel.com>

gcc/ChangeLog:

* config/riscv/riscv-vector-builtins-bases.cc (BASE): New declaration.
* config/riscv/riscv-vector-builtins-bases.h: Ditto.
* config/riscv/riscv-vector-builtins-functions.def
(vfcvt_f_frm): New intrinsic function def.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/float-point-cvt-f.c: New test.
---
 .../riscv/riscv-vector-builtins-bases.cc  |  8 +++
 .../riscv/riscv-vector-builtins-bases.h   |  1 +
 .../riscv/riscv-vector-builtins-functions.def |  2 +
 .../riscv/rvv/base/float-point-cvt-f.c| 50 +++
 4 files changed, 61 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/float-point-cvt-f.c

diff --git a/gcc/config/riscv/riscv-vector-builtins-bases.cc 
b/gcc/config/riscv/riscv-vector-builtins-bases.cc
index 421f4096db8..c78fa8e5b62 100644
--- a/gcc/config/riscv/riscv-vector-builtins-bases.cc
+++ b/gcc/config/riscv/riscv-vector-builtins-bases.cc
@@ -1694,9 +1694,15 @@ public:
   }
 };

+template<enum frm_op_type FRM_OP = NO_FRM>
 class vfcvt_f : public function_base
 {
 public:
+  bool has_rounding_mode_operand_p () const override
+  {
+return FRM_OP == HAS_FRM;
+  }
+
   rtx expand (function_expander ) const override
   {
 if (e.op_info->op == OP_TYPE_x_v)
@@ -2482,6 +2488,7 @@ static CONSTEXPR const vfcvt_x vfcvt_xu_frm_obj;
 static CONSTEXPR const vfcvt_rtz_x vfcvt_rtz_x_obj;
 static CONSTEXPR const vfcvt_rtz_x vfcvt_rtz_xu_obj;
 static CONSTEXPR const vfcvt_f vfcvt_f_obj;
+static CONSTEXPR const vfcvt_f vfcvt_f_frm_obj;
 static CONSTEXPR const vfwcvt_x vfwcvt_x_obj;
 static CONSTEXPR const vfwcvt_x vfwcvt_xu_obj;
 static CONSTEXPR const vfwcvt_rtz_x vfwcvt_rtz_x_obj;
@@ -2733,6 +2740,7 @@ BASE (vfcvt_xu_frm)
 BASE (vfcvt_rtz_x)
 BASE (vfcvt_rtz_xu)
 BASE (vfcvt_f)
+BASE (vfcvt_f_frm)
 BASE (vfwcvt_x)
 BASE (vfwcvt_xu)
 BASE (vfwcvt_rtz_x)
diff --git a/gcc/config/riscv/riscv-vector-builtins-bases.h 
b/gcc/config/riscv/riscv-vector-builtins-bases.h
index 98b61655692..08452587180 100644
--- a/gcc/config/riscv/riscv-vector-builtins-bases.h
+++ b/gcc/config/riscv/riscv-vector-builtins-bases.h
@@ -211,6 +211,7 @@ extern const function_base *const vfcvt_xu_frm;
 extern const function_base *const vfcvt_rtz_x;
 extern const function_base *const vfcvt_rtz_xu;
 extern const function_base *const vfcvt_f;
+extern const function_base *const vfcvt_f_frm;
 extern const function_base *const vfwcvt_x;
 extern const function_base *const vfwcvt_xu;
 extern const function_base *const vfwcvt_rtz_x;
diff --git a/gcc/config/riscv/riscv-vector-builtins-functions.def 
b/gcc/config/riscv/riscv-vector-builtins-functions.def
index 613bbe7a855..8dbcd946d11 100644
--- a/gcc/config/riscv/riscv-vector-builtins-functions.def
+++ b/gcc/config/riscv/riscv-vector-builtins-functions.def
@@ -447,6 +447,8 @@ DEF_RVV_FUNCTION (vfcvt_f, alu, full_preds, u_to_f_xu_v_ops)

 DEF_RVV_FUNCTION (vfcvt_x_frm, alu_frm, full_preds, f_to_i_f_v_ops)
 DEF_RVV_FUNCTION (vfcvt_xu_frm, alu_frm, full_preds, f_to_u_f_v_ops)
+DEF_RVV_FUNCTION (vfcvt_f_frm, alu_frm, full_preds, i_to_f_x_v_ops)
+DEF_RVV_FUNCTION (vfcvt_f_frm, alu_frm, full_preds, u_to_f_xu_v_ops)

 // 13.18. Widening Floating-Point/Integer Type-Convert Instructions
 DEF_RVV_FUNCTION (vfwcvt_x, alu, full_preds, f_to_wi_f_v_ops)
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-cvt-f.c 
b/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-cvt-f.c
new file mode 100644
index 000..424a38ede13
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-cvt-f.c
@@ -0,0 +1,50 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64 -O3 -Wno-psabi" } */
+
+#include "riscv_vector.h"
+
+vfloat32m1_t
+test_riscv_vfcvt_f_x_v_f32m1_rm (vint32m1_t op1, size_t vl) {
+  return __riscv_vfcvt_f_x_v_f32m1_rm (op1, 0, vl);
+}
+
+vfloat32m1_t
+test_riscv_vfcvt_f_x_v_f32m1_rm_m (vbool32_t mask, vint32m1_t op1, size_t vl) {
+  return __riscv_vfcvt_f_x_v_f32m1_rm_m (mask, op1, 0, vl);
+}
+
+vfloat32m1_t
+test_riscv_vfcvt_f_xu_v_f32m1_rm (vuint32m1_t op1, size_t vl) {
+  return __riscv_vfcvt_f_xu_v_f32m1_rm (op1, 0, vl);
+}
+
+vfloat32m1_t
+test_riscv_vfcvt_f_xu_v_f32m1_rm_m (vbool32_t mask, vuint32m1_t op1,
+   size_t vl) {
+  return __riscv_vfcvt_f_xu_v_f32m1_rm_m (mask, op1, 0,

RE: [PATCH v2] RISC-V: Support RVV VFCVT.XU.F.V rounding mode intrinsic API

2023-08-16 Thread Li, Pan2 via Gcc-patches

Committed, thanks Kito.

Pan

-Original Message-
From: Kito Cheng  
Sent: Wednesday, August 16, 2023 3:02 PM
To: Li, Pan2 
Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; Wang, Yanzhang 

Subject: Re: [PATCH v2] RISC-V: Support RVV VFCVT.XU.F.V rounding mode 
intrinsic API

lgtm

On Wed, Aug 16, 2023 at 2:21 PM Pan Li via Gcc-patches
 wrote:
>
> From: Pan Li 
>
> This patch would like to support the rounding mode API for the
> VFCVT.XU.F.V as the below samples.
>
> * __riscv_vfcvt_xu_f_v_u32m1_rm
> * __riscv_vfcvt_xu_f_v_u32m1_rm_m
>
> Signed-off-by: Pan Li 
>
> gcc/ChangeLog:
>
> * config/riscv/riscv-vector-builtins-bases.cc
> (BASE): New declaration.
> * config/riscv/riscv-vector-builtins-bases.h: Ditto.
> * config/riscv/riscv-vector-builtins-functions.def
> (vfcvt_xu_frm): New intrinsic function def.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/rvv/base/float-point-cvt-xu.c: New test.
> ---
>  .../riscv/riscv-vector-builtins-bases.cc  |  2 ++
>  .../riscv/riscv-vector-builtins-bases.h   |  1 +
>  .../riscv/riscv-vector-builtins-functions.def |  1 +
>  .../riscv/rvv/base/float-point-cvt-xu.c   | 29 +++
>  4 files changed, 33 insertions(+)
>  create mode 100644 
> gcc/testsuite/gcc.target/riscv/rvv/base/float-point-cvt-xu.c
>
> diff --git a/gcc/config/riscv/riscv-vector-builtins-bases.cc 
> b/gcc/config/riscv/riscv-vector-builtins-bases.cc
> index 817d2ed016a..421f4096db8 100644
> --- a/gcc/config/riscv/riscv-vector-builtins-bases.cc
> +++ b/gcc/config/riscv/riscv-vector-builtins-bases.cc
> @@ -2478,6 +2478,7 @@ static CONSTEXPR const vmv_v vfmv_v_obj;
>  static CONSTEXPR const vfcvt_x vfcvt_x_obj;
>  static CONSTEXPR const vfcvt_x vfcvt_x_frm_obj;
>  static CONSTEXPR const vfcvt_x vfcvt_xu_obj;
> +static CONSTEXPR const vfcvt_x 
> vfcvt_xu_frm_obj;
>  static CONSTEXPR const vfcvt_rtz_x vfcvt_rtz_x_obj;
>  static CONSTEXPR const vfcvt_rtz_x vfcvt_rtz_xu_obj;
>  static CONSTEXPR const vfcvt_f vfcvt_f_obj;
> @@ -2728,6 +2729,7 @@ BASE (vfmv_v)
>  BASE (vfcvt_x)
>  BASE (vfcvt_x_frm)
>  BASE (vfcvt_xu)
> +BASE (vfcvt_xu_frm)
>  BASE (vfcvt_rtz_x)
>  BASE (vfcvt_rtz_xu)
>  BASE (vfcvt_f)
> diff --git a/gcc/config/riscv/riscv-vector-builtins-bases.h 
> b/gcc/config/riscv/riscv-vector-builtins-bases.h
> index 50a7d7ffb6f..98b61655692 100644
> --- a/gcc/config/riscv/riscv-vector-builtins-bases.h
> +++ b/gcc/config/riscv/riscv-vector-builtins-bases.h
> @@ -207,6 +207,7 @@ extern const function_base *const vfmv_v;
>  extern const function_base *const vfcvt_x;
>  extern const function_base *const vfcvt_x_frm;
>  extern const function_base *const vfcvt_xu;
> +extern const function_base *const vfcvt_xu_frm;
>  extern const function_base *const vfcvt_rtz_x;
>  extern const function_base *const vfcvt_rtz_xu;
>  extern const function_base *const vfcvt_f;
> diff --git a/gcc/config/riscv/riscv-vector-builtins-functions.def 
> b/gcc/config/riscv/riscv-vector-builtins-functions.def
> index 8b6a7cc49f3..613bbe7a855 100644
> --- a/gcc/config/riscv/riscv-vector-builtins-functions.def
> +++ b/gcc/config/riscv/riscv-vector-builtins-functions.def
> @@ -446,6 +446,7 @@ DEF_RVV_FUNCTION (vfcvt_f, alu, full_preds, 
> i_to_f_x_v_ops)
>  DEF_RVV_FUNCTION (vfcvt_f, alu, full_preds, u_to_f_xu_v_ops)
>
>  DEF_RVV_FUNCTION (vfcvt_x_frm, alu_frm, full_preds, f_to_i_f_v_ops)
> +DEF_RVV_FUNCTION (vfcvt_xu_frm, alu_frm, full_preds, f_to_u_f_v_ops)
>
>  // 13.18. Widening Floating-Point/Integer Type-Convert Instructions
>  DEF_RVV_FUNCTION (vfwcvt_x, alu, full_preds, f_to_wi_f_v_ops)
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-cvt-xu.c 
> b/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-cvt-xu.c
> new file mode 100644
> index 000..bb164b2b001
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-cvt-xu.c
> @@ -0,0 +1,29 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv64gcv -mabi=lp64 -O3 -Wno-psabi" } */
> +
> +#include "riscv_vector.h"
> +
> +vuint32m1_t
> +test_riscv_vfcvt_xu_f_v_u32m1_rm (vfloat32m1_t op1, size_t vl) {
> +  return __riscv_vfcvt_xu_f_v_u32m1_rm (op1, 0, vl);
> +}
> +
> +vuint32m1_t
> +test_vfcvt_xu_f_v_u32m1_rm_m (vbool32_t mask, vfloat32m1_t op1, size_t vl) {
> +  return __riscv_vfcvt_xu_f_v_u32m1_rm_m (mask, op1, 1, vl);
> +}
> +
> +vuint32m1_t
> +test_riscv_vfcvt_xu_f_vv_u32m1 (vfloat32m1_t op1, size_t vl) {
> +  return __riscv_vfcvt_xu_f_v_u32m1 (op1, vl);
> +}
> +
> +vuint32m1_t
> +test_vfcvt_xu_f_v_u32m1_m (vbool32_t mask, vfloat32m1_t op1, size_t vl) {
> +  return __riscv_vfcvt_xu_f_v_u32m1_m (mask, op1, vl);
> +}
> +
> +/* { dg-final { scan-assembler-times {vfcvt\.xu\.f\.v\s+v[0-9]+,\s*v[0-9]+} 
> 4 } } */
> +/* { dg-final { scan-assembler-times {frrm\s+[axs][0-9]+} 2 } } */
> +/* { dg-final { scan-assembler-times {fsrm\s+[axs][0-9]+} 2 } } */
> +/* { dg-final { scan-assembler-times {fsrmi\s+[01234]} 2 } } */
> --
> 2.34.1
>
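In the test above, the `_rm` intrinsics take the rounding mode as an explicit operand just before `vl` (0 and 1 in the calls; in the RISC-V frm encoding these are round-to-nearest-even and round-towards-zero), and the scans expect a save/set/restore sequence around each such call. For readers without an RVV toolchain, the same idea can be sketched in portable C++ with `<cfenv>`. This is only an analog under that assumption: `convert_with_rounding` is a hypothetical helper, not part of the patch or of the RVV intrinsic API.

```cpp
#include <cfenv>
#include <cmath>

// Hypothetical helper: convert x to an integer under the given <cfenv>
// rounding mode, then restore the caller's mode.  The three steps are
// loosely comparable to the frrm / fsrmi / fsrm instructions the test
// scans for.
long convert_with_rounding (float x, int mode)
{
  const int saved = std::fegetround ();  // save the current mode
  std::fesetround (mode);                // switch to the requested mode
  volatile float in = x;                 // volatile pins the conversion
  volatile long out = std::lrint (in);   // between the two mode switches
  std::fesetround (saved);               // restore the saved mode
  return out;
}
```

With this helper, `convert_with_rounding (2.5f, FE_TONEAREST)` and `convert_with_rounding (2.5f, FE_UPWARD)` give different integers for the same input, which is the behavior difference the explicit rounding-mode operand exists to control.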

RE: Re: [PATCH] RISC-V: Support MASK_LEN_{LOAD_LANES,STORE_LANES}

2023-08-16 Thread Li, Pan2 via Gcc-patches

Committed, thanks Jeff.

Pan

-Original Message-
From: Gcc-patches  On Behalf 
Of juzhe.zh...@rivai.ai
Sent: Wednesday, August 16, 2023 9:23 AM
To: jeffreyalaw ; gcc-patches 
Cc: kito.cheng ; Kito.cheng ; 
Robin Dapp 
Subject: Re: Re: [PATCH] RISC-V: Support MASK_LEN_{LOAD_LANES,STORE_LANES}

Thanks Jeff.
I realize the quad_trunc/oct_trunc change is not necessary. I will remove that.

The middle-end support is approved and is being tested on both x86 and ARM; it
will be committed soon.

Will commit this patch after the middle-end patch is committed.

Thanks.


juzhe.zh...@rivai.ai
 
From: Jeff Law
Date: 2023-08-15 22:18
To: Juzhe-Zhong; gcc-patches
CC: kito.cheng; kito.cheng; rdapp.gcc
Subject: Re: [PATCH] RISC-V: Support MASK_LEN_{LOAD_LANES,STORE_LANES}
 
 
On 8/14/23 06:15, Juzhe-Zhong wrote:
> This patch depends on middle-end support:
> https://gcc.gnu.org/pipermail/gcc-patches/2023-August/627305.html
> 
> This patch allows us to auto-vectorize the following case:
> 
> #define TEST_LOOP(NAME, OUTTYPE, INTYPE, MASKTYPE)                        \
>   void __attribute__ ((noinline, noclone))                                \
>   NAME##_8 (OUTTYPE *__restrict dest, INTYPE *__restrict src,             \
>             MASKTYPE *__restrict cond, intptr_t n)                        \
>   {                                                                       \
>     for (intptr_t i = 0; i < n; ++i)                                      \
>       if (cond[i])                                                        \
>         dest[i] = (src[i * 8] + src[i * 8 + 1] + src[i * 8 + 2]           \
>                    + src[i * 8 + 3] + src[i * 8 + 4] + src[i * 8 + 5]     \
>                    + src[i * 8 + 6] + src[i * 8 + 7]);                    \
>   }
>
> #define TEST2(NAME, OUTTYPE, INTYPE)                                      \
>   TEST_LOOP (NAME##_f32, OUTTYPE, INTYPE, int32_t)                        \
>
> #define TEST1(NAME, OUTTYPE)                                              \
>   TEST2 (NAME##_i32, OUTTYPE, int32_t)                                    \
>
> #define TEST(NAME)                                                        \
>   TEST1 (NAME##_i32, int32_t)                                             \
> 
> TEST (test)
> 
> ASM:
> 
> test_i32_i32_f32_8:
> ble a3,zero,.L5
> .L3:
> vsetvli a4,a3,e8,mf4,ta,ma
> vle32.v v0,0(a2)
> vsetvli a5,zero,e32,m1,ta,ma
> vmsne.vi v0,v0,0
> vsetvli zero,a4,e32,m1,ta,ma
> vlseg8e32.v v8,(a1),v0.t
> vsetvli a5,zero,e32,m1,ta,ma
> slli a6,a4,2
> vadd.vv v1,v9,v8
> slli a7,a4,5
> vadd.vv v1,v1,v10
> sub a3,a3,a4
> vadd.vv v1,v1,v11
> vadd.vv v1,v1,v12
> vadd.vv v1,v1,v13
> vadd.vv v1,v1,v14
> vadd.vv v1,v1,v15
> vsetvli zero,a4,e32,m1,ta,ma
> vse32.v v1,0(a0),v0.t
> add a2,a2,a6
> add a1,a1,a7
> add a0,a0,a6
> bne a3,zero,.L3
> .L5:
> ret
> 
> gcc/ChangeLog:
> 
>  * config/riscv/autovec.md (vec_mask_len_load_lanes): 
> New pattern.
>  (vec_mask_len_store_lanes): Ditto.
>  (2): Fix pattern for ICE.
>  (2): Ditto.
>  * config/riscv/riscv-protos.h (expand_lanes_load_store): New 
> function.
>  * config/riscv/riscv-v.cc (get_mask_mode): Add tuple mode mask mode.
>  (expand_lanes_load_store): New function.
>  * config/riscv/vector-iterators.md: New iterator.
I would generally recommend sending independent fixes separately.  In 
particular the quad_trunc, oct_trunc changes seem like they should have 
been a separate patch.  But no need to resend this time.  Just try to 
break out distinct changes like those into their own patch.
 
OK, but obviously hold off committing until the generic support is 
approved and committed.
 
Thanks,
jeff
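As a reading aid for the `vlseg8e32.v ... v0.t` loop quoted above, here is a scalar model of what one masked, length-limited 8-field segment load plus the sum computes. It is a sketch under stated assumptions: `masked_len_sum8` is a hypothetical name, and leaving inactive destination elements untouched is a modeling choice that matches the source loop, not a statement about the optab's contract for inactive lanes.

```cpp
#include <cstddef>
#include <cstdint>

// Scalar model (not GCC code) of the vectorized TEST_LOOP body: for each
// element i below len whose mask is set, gather the 8 interleaved fields
// src[i*8 + 0..7] and store their sum to dest[i].
void masked_len_sum8 (std::int32_t *dest, const std::int32_t *src,
                      const std::int32_t *cond, std::size_t len)
{
  for (std::size_t i = 0; i < len; ++i)
    {
      if (!cond[i])
        continue;  // inactive element: dest[i] is left untouched here
      std::int32_t sum = 0;
      for (std::size_t field = 0; field < 8; ++field)
        sum += src[i * 8 + field];  // field k of element i is at src[i*8+k]
      dest[i] = sum;
    }
}
```

The vector code does the same work `vl` elements at a time: the mask comes from comparing `cond` against zero, and the length comes from `vsetvli`.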

RE: [PATCH V2] VECT: Apply MASK_LEN_{LOAD_LANES, STORE_LANES} into vectorizer

2023-08-16 Thread Li, Pan2 via Gcc-patches

Committed, thanks Richard.

Pan

-Original Message-
From: Gcc-patches  On Behalf 
Of Richard Biener via Gcc-patches
Sent: Tuesday, August 15, 2023 8:35 PM
To: Juzhe-Zhong 
Cc: gcc-patches@gcc.gnu.org; richard.sandif...@arm.com
Subject: Re: [PATCH V2] VECT: Apply MASK_LEN_{LOAD_LANES, STORE_LANES} into 
vectorizer

On Tue, 15 Aug 2023, Juzhe-Zhong wrote:

> Hi, Richard and Richi.
> 
> This patch adds MASK_LEN_{LOAD_LANES,STORE_LANES} support to the
> vectorizer.
> 
> Consider this simple case:
> 
> void __attribute__ ((noinline, noclone))
> foo (int *__restrict a, int *__restrict b, int *__restrict c,
> int *__restrict d, int *__restrict e, int *__restrict f,
> int *__restrict g, int *__restrict h, int *__restrict j, int n)
> {
>   for (int i = 0; i < n; ++i)
> {
>   a[i] = j[i * 8];
>   b[i] = j[i * 8 + 1];
>   c[i] = j[i * 8 + 2];
>   d[i] = j[i * 8 + 3];
>   e[i] = j[i * 8 + 4];
>   f[i] = j[i * 8 + 5];
>   g[i] = j[i * 8 + 6];
>   h[i] = j[i * 8 + 7];
> }
> }
> 
> RVV Gimple IR:
> 
>   _79 = .SELECT_VL (ivtmp_81, POLY_INT_CST [4, 4]);
>   ivtmp_125 = _79 * 32;
>   vect_array.8 = .MASK_LEN_LOAD_LANES (vectp_j.6_124, 32B, { -1, ... }, _79, 
> 0);
>   vect__8.9_122 = vect_array.8[0];
>   vect__8.10_121 = vect_array.8[1];
>   vect__8.11_120 = vect_array.8[2];
>   vect__8.12_119 = vect_array.8[3];
>   vect__8.13_118 = vect_array.8[4];
>   vect__8.14_117 = vect_array.8[5];
>   vect__8.15_116 = vect_array.8[6];
>   vect__8.16_115 = vect_array.8[7];
>   vect_array.8 ={v} {CLOBBER};
>   ivtmp_114 = _79 * 4;
>   .MASK_LEN_STORE (vectp_a.17_113, 32B, { -1, ... }, _79, 0, vect__8.9_122);
>   .MASK_LEN_STORE (vectp_b.19_109, 32B, { -1, ... }, _79, 0, vect__8.10_121);
>   .MASK_LEN_STORE (vectp_c.21_105, 32B, { -1, ... }, _79, 0, vect__8.11_120);
>   .MASK_LEN_STORE (vectp_d.23_101, 32B, { -1, ... }, _79, 0, vect__8.12_119);
>   .MASK_LEN_STORE (vectp_e.25_97, 32B, { -1, ... }, _79, 0, vect__8.13_118);
>   .MASK_LEN_STORE (vectp_f.27_93, 32B, { -1, ... }, _79, 0, vect__8.14_117);
>   .MASK_LEN_STORE (vectp_g.29_89, 32B, { -1, ... }, _79, 0, vect__8.15_116);
>   .MASK_LEN_STORE (vectp_h.31_85, 32B, { -1, ... }, _79, 0, vect__8.16_115);
> 
> ASM:
> 
> foo:
>   lw  t4,8(sp)
>   ld  t5,0(sp)
>   ble t4,zero,.L5
> .L3:
>   vsetvli t1,t4,e8,mf4,ta,ma
>   vlseg8e32.v v8,(t5)
>   slli t3,t1,2
>   slli t6,t1,5
>   vse32.v v8,0(a0)
>   vse32.v v9,0(a1)
>   vse32.v v10,0(a2)
>   vse32.v v11,0(a3)
>   vse32.v v12,0(a4)
>   vse32.v v13,0(a5)
>   vse32.v v14,0(a6)
>   vse32.v v15,0(a7)
>   sub t4,t4,t1
>   add t5,t5,t6
>   add a0,a0,t3
>   add a1,a1,t3
>   add a2,a2,t3
>   add a3,a3,t3
>   add a4,a4,t3
>   add a5,a5,t3
>   add a6,a6,t3
>   add a7,a7,t3
>   bne t4,zero,.L3
> .L5:
>   ret
> 
> The details of the approach:
> 
> Step 1 - Modify the LANES LOAD/STORE support function
> (vect_load_lanes_supported/vect_store_lanes_supported):
> 
> +/* Return FN if vec_{masked_,mask_len,}load_lanes is available for COUNT
> +   vectors of type VECTYPE.  MASKED_P says whether the masked form is 
> needed. */
>  
> -bool
> +internal_fn
>  vect_load_lanes_supported (tree vectype, unsigned HOST_WIDE_INT count,
>  bool masked_p)
>  {
> -  if (masked_p)
> -return vect_lanes_optab_supported_p ("vec_mask_load_lanes",
> -  vec_mask_load_lanes_optab,
> -  vectype, count);
> +  if (vect_lanes_optab_supported_p ("vec_mask_len_load_lanes",
> + vec_mask_len_load_lanes_optab,
> + vectype, count))
> +return IFN_MASK_LEN_LOAD_LANES;
> +  else if (masked_p)
> +{
> +  if (vect_lanes_optab_supported_p ("vec_mask_load_lanes",
> + vec_mask_load_lanes_optab,
> + vectype, count))
> + return IFN_MASK_LOAD_LANES;
> +}
>else
> -return vect_lanes_optab_supported_p ("vec_load_lanes",
> -  vec_load_lanes_optab,
> -  vectype, count);
> +{
> +  if (vect_lanes_optab_supported_p ("vec_load_lanes",
> + vec_load_lanes_optab,
> + vectype, count))
> + return IFN_LOAD_LANES;
> +}
> +  return IFN_LAST;
>  }
>  
> Instead of returning TRUE or FALSE for whether the target supports the LANES
> LOAD/STORE, the function now returns the internal_fn of the LANES LOAD/STORE
> that the target supports. If the target doesn't support any LANES LOAD/STORE
> optab, it returns IFN_LAST.
> 
> Step 2 - Compute IFN for LANES LOAD/STORE (Only compute once).
> 
>   if (!STMT_VINFO_STRIDED_P (first_stmt_info)
> && (can_overrun_p || !would_overrun_p)
>
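The Step 1 change described above (return which internal function is usable instead of a yes/no answer) can be sketched in isolation. Everything here is a hypothetical stand-in: `internal_fn_sketch` reuses the IFN names only for readability, and `target_sketch` replaces the real optab queries with three booleans.

```cpp
// Hypothetical stand-ins for GCC's internal_fn enum and optab queries.
enum internal_fn_sketch
{
  IFN_LOAD_LANES,
  IFN_MASK_LOAD_LANES,
  IFN_MASK_LEN_LOAD_LANES,
  IFN_LAST  // sentinel: no lanes load is supported
};

struct target_sketch
{
  bool load_lanes;
  bool mask_load_lanes;
  bool mask_len_load_lanes;
};

// Same order of checks as the patched function: the mask+len form wins
// whenever the target has it; otherwise the masked or unmasked form is
// chosen according to masked_p, and IFN_LAST means "nothing usable".
constexpr internal_fn_sketch
load_lanes_supported (const target_sketch &t, bool masked_p)
{
  return t.mask_len_load_lanes
           ? IFN_MASK_LEN_LOAD_LANES
           : masked_p ? (t.mask_load_lanes ? IFN_MASK_LOAD_LANES : IFN_LAST)
                      : (t.load_lanes ? IFN_LOAD_LANES : IFN_LAST);
}
```

A caller then compares the result against `IFN_LAST` where it used to test a bool, and can keep the returned value to decide later which call to emit.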

RE: [PATCH v2] RISC-V: Support RVV VFCVT.X.F.V rounding mode intrinsic API

2023-08-16 Thread Li, Pan2 via Gcc-patches

Committed, thanks Kito.

Pan

From: Kito Cheng 
Sent: Wednesday, August 16, 2023 1:57 PM
To: Li, Pan2 
Cc: GCC Patches ; 钟居哲 ; Wang, 
Yanzhang 
Subject: Re: [PATCH v2] RISC-V: Support RVV VFCVT.X.F.V rounding mode intrinsic 
API

LGTM

On Wed, Aug 16, 2023 at 1:17 PM <pan2...@intel.com> wrote:
From: Pan Li <pan2...@intel.com>

This patch would like to support the rounding mode API for the
VFCVT.X.F.V as the below samples.

* __riscv_vfcvt_x_f_v_i32m1_rm
* __riscv_vfcvt_x_f_v_i32m1_rm_m

Signed-off-by: Pan Li <pan2...@intel.com>

gcc/ChangeLog:

* config/riscv/riscv-vector-builtins-bases.cc
(enum frm_op_type): New type for frm.
(BASE): New declaration.
* config/riscv/riscv-vector-builtins-bases.h: Ditto.
* config/riscv/riscv-vector-builtins-functions.def
(vfcvt_x_frm): New intrinsic function def.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/float-point-cvt-x.c: New test.
---
 .../riscv/riscv-vector-builtins-bases.cc  | 15 +-
 .../riscv/riscv-vector-builtins-bases.h   |  1 +
 .../riscv/riscv-vector-builtins-functions.def |  2 ++
 .../riscv/rvv/base/float-point-cvt-x.c| 29 +++
 4 files changed, 46 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/float-point-cvt-x.c

diff --git a/gcc/config/riscv/riscv-vector-builtins-bases.cc 
b/gcc/config/riscv/riscv-vector-builtins-bases.cc
index f2124080ef9..817d2ed016a 100644
--- a/gcc/config/riscv/riscv-vector-builtins-bases.cc
+++ b/gcc/config/riscv/riscv-vector-builtins-bases.cc
@@ -58,6 +58,12 @@ enum lst_type
   LST_INDEXED,
 };

+enum frm_op_type
+{
+  NO_FRM,
+  HAS_FRM,
+};
+
 /* Helper function to fold vleff and vlsegff.  */
 static gimple *
 fold_fault_load (gimple_folder )
@@ -1662,10 +1668,15 @@ public:
 };

 /* Implements vfcvt.x.  */
-template<int UNSPEC>
+template<int UNSPEC, enum frm_op_type FRM_OP = NO_FRM>
 class vfcvt_x : public function_base
 {
 public:
+  bool has_rounding_mode_operand_p () const override
+  {
+return FRM_OP == HAS_FRM;
+  }
+
   rtx expand (function_expander ) const override
   {
 return e.use_exact_insn (code_for_pred_fcvt_x_f (UNSPEC, e.arg_mode (0)));
@@ -2465,6 +2476,7 @@ static CONSTEXPR const vfclass vfclass_obj;
 static CONSTEXPR const vmerge vfmerge_obj;
 static CONSTEXPR const vmv_v vfmv_v_obj;
 static CONSTEXPR const vfcvt_x vfcvt_x_obj;
+static CONSTEXPR const vfcvt_x vfcvt_x_frm_obj;
 static CONSTEXPR const vfcvt_x vfcvt_xu_obj;
 static CONSTEXPR const vfcvt_rtz_x vfcvt_rtz_x_obj;
 static CONSTEXPR const vfcvt_rtz_x vfcvt_rtz_xu_obj;
@@ -2714,6 +2726,7 @@ BASE (vfclass)
 BASE (vfmerge)
 BASE (vfmv_v)
 BASE (vfcvt_x)
+BASE (vfcvt_x_frm)
 BASE (vfcvt_xu)
 BASE (vfcvt_rtz_x)
 BASE (vfcvt_rtz_xu)
diff --git a/gcc/config/riscv/riscv-vector-builtins-bases.h 
b/gcc/config/riscv/riscv-vector-builtins-bases.h
index 2a9381eec5e..50a7d7ffb6f 100644
--- a/gcc/config/riscv/riscv-vector-builtins-bases.h
+++ b/gcc/config/riscv/riscv-vector-builtins-bases.h
@@ -205,6 +205,7 @@ extern const function_base *const vfclass;
 extern const function_base *const vfmerge;
 extern const function_base *const vfmv_v;
 extern const function_base *const vfcvt_x;
+extern const function_base *const vfcvt_x_frm;
 extern const function_base *const vfcvt_xu;
 extern const function_base *const vfcvt_rtz_x;
 extern const function_base *const vfcvt_rtz_xu;
diff --git a/gcc/config/riscv/riscv-vector-builtins-functions.def 
b/gcc/config/riscv/riscv-vector-builtins-functions.def
index 34def6bb82f..8b6a7cc49f3 100644
--- a/gcc/config/riscv/riscv-vector-builtins-functions.def
+++ b/gcc/config/riscv/riscv-vector-builtins-functions.def
@@ -445,6 +445,8 @@ DEF_RVV_FUNCTION (vfcvt_rtz_xu, alu, full_preds, 
f_to_u_f_v_ops)
 DEF_RVV_FUNCTION (vfcvt_f, alu, full_preds, i_to_f_x_v_ops)
 DEF_RVV_FUNCTION (vfcvt_f, alu, full_preds, u_to_f_xu_v_ops)

+DEF_RVV_FUNCTION (vfcvt_x_frm, alu_frm, full_preds, f_to_i_f_v_ops)
+
 // 13.18. Widening Floating-Point/Integer Type-Convert Instructions
 DEF_RVV_FUNCTION (vfwcvt_x, alu, full_preds, f_to_wi_f_v_ops)
 DEF_RVV_FUNCTION (vfwcvt_xu, alu, full_preds, f_to_wu_f_v_ops)
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-cvt-x.c 
b/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-cvt-x.c
new file mode 100644
index 000..e090f0f97e9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-cvt-x.c
@@ -0,0 +1,29 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64 -O3 -Wno-psabi" } */
+
+#include "riscv_vector.h"
+
+vint32m1_t
+test_riscv_vfcvt_x_f_vv_i32m1_rm (vfloat32m1_t op1, size_t vl) {
+  return __riscv_vfcvt_x_f_v_i32m1_rm (op1, 0, vl);
+}
+
+vint32m1_t
+test_vfcvt_x_f_vv_i32m1_rm_m (vbool32_t mask, vfloat32m1_t op1, size_t vl) {
+  return __riscv_vfcvt_x_f_v_i32m1_rm_m (mask, op1, 1, vl);
+}
+
+vint32m1_t
+test_riscv_vfcvt_x_f_vv_i32m1 (vfloat32m1_t op1, size_t vl) {
+  return __riscv_vfcvt_x_f_v_i32m1 (op1, vl);
+}
+
+vint32m1_t
+test_vfcvt_x_f_vv_i32m1_m (vbool32_t

RE: [PATCH v1] RISC-V: Support RVV VFCVT.X.F.V rounding mode intrinsic API

2023-08-15 Thread Li, Pan2 via Gcc-patches

Got it, thanks!

I will start with the CVT and the remaining frm instructions first, and then refactor.

Pan

-----Original Message-----
From: Kito Cheng  
Sent: Wednesday, August 16, 2023 11:44 AM
To: Li, Pan2 
Cc: juzhe.zh...@rivai.ai; gcc-patches ; Wang, Yanzhang 

Subject: Re: [PATCH v1] RISC-V: Support RVV VFCVT.X.F.V rounding mode intrinsic 
API

I would prefer to introduce an enum template argument and refactor
existing code later :)

On Wed, Aug 16, 2023 at 11:40 AM Li, Pan2 via Gcc-patches
 wrote:
>
> That should work as well, but it may require some changes to existing code, 
> such as the declarations.
> I am OK with either the enum or the inheritance approach, and will start with 
> the CVT parts, then refactor the existing frm classes.
>
> Do you have any suggestion for making the decision?
>
> Pan
>
> -Original Message-
> From: Kito Cheng 
> Sent: Wednesday, August 16, 2023 11:30 AM
> To: Li, Pan2 
> Cc: juzhe.zh...@rivai.ai; gcc-patches ; Wang, 
> Yanzhang 
> Subject: Re: [PATCH v1] RISC-V: Support RVV VFCVT.X.F.V rounding mode 
> intrinsic API
>
> Or using an enum value rather than bool?
>
> I am thinking we could also simplify/remove most other frm classes,
> some practical example:
>
>
> diff --git a/gcc/config/riscv/riscv-vector-builtins-bases.cc
> b/gcc/config/riscv/riscv-vector-builtins-bases.cc
> index 2074dac0f16..ace63e963a5 100644
> --- a/gcc/config/riscv/riscv-vector-builtins-bases.cc
> +++ b/gcc/config/riscv/riscv-vector-builtins-bases.cc
> @@ -58,6 +58,11 @@ enum lst_type
>   LST_INDEXED,
> };
>
> +enum frm_op_type
> +{
> +  NO_FRM,
> +  HAS_FRM
> +};
> /* Helper function to fold vleff and vlsegff.  */
> static gimple *
> fold_fault_load (gimple_folder )
> @@ -256,41 +261,22 @@ public:
>vremu/vsadd/vsaddu/vssub/vssubu
>vfadd/vfsub/
> */
> -template
> +template
> class binop : public function_base
> {
> public:
> -  rtx expand (function_expander &e) const override
> +  bool has_rounding_mode_operand_p () const override
>   {
> -switch (e.op_info->op)
> -  {
> -  case OP_TYPE_vx:
> -  case OP_TYPE_vf:
> -   return e.use_exact_insn (code_for_pred_scalar (CODE, e.vector_mode 
> ()));
> -  case OP_TYPE_vv:
> -   return e.use_exact_insn (code_for_pred (CODE, e.vector_mode ()));
> -  default:
> -   gcc_unreachable ();
> -  }
> +return FRM_OP == HAS_FRM;
>   }
> -};
> -
> -/* Implements below instructions for now.
> -   - vfadd
> -   - vfsub
> -   - vfmul
> -   - vfdiv
> -*/
> -template
> -class binop_frm : public function_base
> -{
> -public:
> -  bool has_rounding_mode_operand_p () const override { return true; }
>
>   rtx expand (function_expander &e) const override
>   {
> switch (e.op_info->op)
>   {
> +  case OP_TYPE_vx:
> +   gcc_assert (FRM_OP == NO_FRM);
> +   gcc_fallthrough ();
>   case OP_TYPE_vf:
>return e.use_exact_insn (code_for_pred_scalar (CODE, e.vector_mode 
> ()));
>   case OP_TYPE_vv:
> @@ -1648,10 +1634,15 @@ public:
> };
>
> /* Implements vfcvt.x.  */
> -template
> +template
> class vfcvt_x : public function_base
> {
> public:
> +  bool has_rounding_mode_operand_p () const override
> +  {
> +return FRM_OP == HAS_FRM;
> +  }
> +
>   rtx expand (function_expander &e) const override
>   {
> return e.use_exact_insn (code_for_pred_fcvt_x_f (UNSPEC, e.arg_mode (0)));
> @@ -2389,8 +2380,8 @@ static CONSTEXPR const viota viota_obj;
> static CONSTEXPR const vid vid_obj;
> static CONSTEXPR const binop vfadd_obj;
> static CONSTEXPR const binop vfsub_obj;
> -static CONSTEXPR const binop_frm vfadd_frm_obj;
> -static CONSTEXPR const binop_frm vfsub_frm_obj;
> +static CONSTEXPR const binop vfadd_frm_obj;
> +static CONSTEXPR const binop vfsub_frm_obj;
> static CONSTEXPR const reverse_binop vfrsub_obj;
> static CONSTEXPR const reverse_binop_frm vfrsub_frm_obj;
> static CONSTEXPR const widen_binop vfwadd_obj;
> @@ -2398,9 +2389,9 @@ static CONSTEXPR const widen_binop_frm
> vfwadd_frm_obj;
> static CONSTEXPR const widen_binop vfwsub_obj;
> static CONSTEXPR const widen_binop_frm vfwsub_frm_obj;
> static CONSTEXPR const binop vfmul_obj;
> -static CONSTEXPR const binop_frm vfmul_frm_obj;
> +static CONSTEXPR const binop vfmul_frm_obj;
> static CONSTEXPR const binop vfdiv_obj;
> -static CONSTEXPR const binop_frm vfdiv_frm_obj;
> +static CONSTEXPR const binop vfdiv_frm_obj;
> static CONSTEXPR const reverse_binop vfrdiv_obj;
> static CONSTEXPR const reverse_binop_frm vfrdiv_frm_obj;
> static CONSTEXPR const widen_binop vfwmul_obj;

RE: [PATCH v1] RISC-V: Support RVV VFCVT.X.F.V rounding mode intrinsic API

2023-08-15 Thread Li, Pan2 via Gcc-patches

That should work as well, but it may require some changes to existing code, 
such as the declarations.
I am OK with either the enum or the inheritance approach, and will start with 
the CVT parts, then refactor the existing frm classes.

Do you have any suggestion for making the decision?

Pan

-----Original Message-----
From: Kito Cheng  
Sent: Wednesday, August 16, 2023 11:30 AM
To: Li, Pan2 
Cc: juzhe.zh...@rivai.ai; gcc-patches ; Wang, Yanzhang 

Subject: Re: [PATCH v1] RISC-V: Support RVV VFCVT.X.F.V rounding mode intrinsic 
API

Or using an enum value rather than bool?

I am thinking we could also simplify/remove most other frm classes,
some practical example:


diff --git a/gcc/config/riscv/riscv-vector-builtins-bases.cc
b/gcc/config/riscv/riscv-vector-builtins-bases.cc
index 2074dac0f16..ace63e963a5 100644
--- a/gcc/config/riscv/riscv-vector-builtins-bases.cc
+++ b/gcc/config/riscv/riscv-vector-builtins-bases.cc
@@ -58,6 +58,11 @@ enum lst_type
  LST_INDEXED,
};

+enum frm_op_type
+{
+  NO_FRM,
+  HAS_FRM
+};
/* Helper function to fold vleff and vlsegff.  */
static gimple *
fold_fault_load (gimple_folder )
@@ -256,41 +261,22 @@ public:
   vremu/vsadd/vsaddu/vssub/vssubu
   vfadd/vfsub/
*/
-template
+template
class binop : public function_base
{
public:
-  rtx expand (function_expander &e) const override
+  bool has_rounding_mode_operand_p () const override
  {
-switch (e.op_info->op)
-  {
-  case OP_TYPE_vx:
-  case OP_TYPE_vf:
-   return e.use_exact_insn (code_for_pred_scalar (CODE, e.vector_mode ()));
-  case OP_TYPE_vv:
-   return e.use_exact_insn (code_for_pred (CODE, e.vector_mode ()));
-  default:
-   gcc_unreachable ();
-  }
+return FRM_OP == HAS_FRM;
  }
-};
-
-/* Implements below instructions for now.
-   - vfadd
-   - vfsub
-   - vfmul
-   - vfdiv
-*/
-template
-class binop_frm : public function_base
-{
-public:
-  bool has_rounding_mode_operand_p () const override { return true; }

  rtx expand (function_expander &e) const override
  {
switch (e.op_info->op)
  {
+  case OP_TYPE_vx:
+   gcc_assert (FRM_OP == NO_FRM);
+   gcc_fallthrough ();
  case OP_TYPE_vf:
   return e.use_exact_insn (code_for_pred_scalar (CODE, e.vector_mode ()));
  case OP_TYPE_vv:
@@ -1648,10 +1634,15 @@ public:
};

/* Implements vfcvt.x.  */
-template
+template
class vfcvt_x : public function_base
{
public:
+  bool has_rounding_mode_operand_p () const override
+  {
+return FRM_OP == HAS_FRM;
+  }
+
  rtx expand (function_expander &e) const override
  {
return e.use_exact_insn (code_for_pred_fcvt_x_f (UNSPEC, e.arg_mode (0)));
@@ -2389,8 +2380,8 @@ static CONSTEXPR const viota viota_obj;
static CONSTEXPR const vid vid_obj;
static CONSTEXPR const binop vfadd_obj;
static CONSTEXPR const binop vfsub_obj;
-static CONSTEXPR const binop_frm vfadd_frm_obj;
-static CONSTEXPR const binop_frm vfsub_frm_obj;
+static CONSTEXPR const binop vfadd_frm_obj;
+static CONSTEXPR const binop vfsub_frm_obj;
static CONSTEXPR const reverse_binop vfrsub_obj;
static CONSTEXPR const reverse_binop_frm vfrsub_frm_obj;
static CONSTEXPR const widen_binop vfwadd_obj;
@@ -2398,9 +2389,9 @@ static CONSTEXPR const widen_binop_frm
vfwadd_frm_obj;
static CONSTEXPR const widen_binop vfwsub_obj;
static CONSTEXPR const widen_binop_frm vfwsub_frm_obj;
static CONSTEXPR const binop vfmul_obj;
-static CONSTEXPR const binop_frm vfmul_frm_obj;
+static CONSTEXPR const binop vfmul_frm_obj;
static CONSTEXPR const binop vfdiv_obj;
-static CONSTEXPR const binop_frm vfdiv_frm_obj;
+static CONSTEXPR const binop vfdiv_frm_obj;
static CONSTEXPR const reverse_binop vfrdiv_obj;
static CONSTEXPR const reverse_binop_frm vfrdiv_frm_obj;
static CONSTEXPR const widen_binop vfwmul_obj;
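
As a rough, self-contained illustration of the enum approach in the diff above (a sketch, not the GCC code itself): `binop_sketch`, `op_type`, and `pattern_kind` are hypothetical stand-ins for `binop`, `e.op_info->op`, and the result of the `code_for_pred*` calls, and the class does not derive from the real `function_base`.

```cpp
#include <cassert>

enum frm_op_type { NO_FRM, HAS_FRM };

// Hypothetical stand-ins for the operand kinds and the selected insn pattern.
enum op_type { OP_TYPE_vv, OP_TYPE_vx, OP_TYPE_vf };
enum pattern_kind { PRED, PRED_SCALAR };

// One template covers both the plain and the rounding mode variants.
template<frm_op_type FRM_OP = NO_FRM>
struct binop_sketch
{
  bool has_rounding_mode_operand_p () const { return FRM_OP == HAS_FRM; }

  pattern_kind expand (op_type op) const
  {
    switch (op)
      {
      case OP_TYPE_vx:
        // As in the diff, the vx form must only be reached without FRM.
        assert (FRM_OP == NO_FRM);
        /* Fall through.  */
      case OP_TYPE_vf:
        return PRED_SCALAR;
      case OP_TYPE_vv:
        return PRED;
      }
    return PRED; // Not reached for valid op_type values.
  }
};
```

With this shape, `binop_sketch<>` plays the role of the old `binop` and `binop_sketch<HAS_FRM>` the role of the old `binop_frm`, so the separate class can go away while the call sites stay readable.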

RE: [PATCH v1] RISC-V: Support RVV VFCVT.X.F.V rounding mode intrinsic API

2023-08-15 Thread Li, Pan2 via Gcc-patches

Thanks Kito for the comments. How about leveraging inheritance instead of a template? 
AFAIK, a bool argument is discouraged to some extent. 
For example, the expand part could be reused as below.

class vfcvt_x : public function_base
 {
 public:
+  virtual bool has_rounding_mode_operand_p () const { return false; }
+
   rtx expand (function_expander &e) const override
   {
 return e.use_exact_insn (code_for_pred_fcvt_x_f (UNSPEC, e.arg_mode (0)));
   }
 };

+/* Implements below instructions for frm
+   - vfcvt_x
+*/
+template
+class vfcvt_x_frm : public vfcvt_x
+{
+public:
+  bool has_rounding_mode_operand_p () const override { return true; }
+};
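
A minimal standalone sketch of the same inheritance idea, assuming a simplified `function_base`: the `_sketch` names, the `int expand ()` signature, and the UNSPEC value used in any instantiation are hypothetical and only model the shape of the proposal.

```cpp
// Simplified stand-in for GCC's function_base (assumption for illustration).
struct function_base
{
  virtual ~function_base () = default;
  virtual bool has_rounding_mode_operand_p () const { return false; }
  virtual int expand () const = 0;
};

// Base conversion class: owns the shared expand logic.
template<int UNSPEC>
struct vfcvt_x_sketch : function_base
{
  // Stand-in for e.use_exact_insn (...); it just reports the UNSPEC value.
  int expand () const override { return UNSPEC; }
};

// The frm variant only flips the rounding mode predicate and inherits expand.
template<int UNSPEC>
struct vfcvt_x_frm_sketch : vfcvt_x_sketch<UNSPEC>
{
  bool has_rounding_mode_operand_p () const override { return true; }
};
```

The trade-off is one extra class per frm variant, in exchange for not adding a second template argument to the existing class.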

Pan

-----Original Message-----
From: Kito Cheng  
Sent: Tuesday, August 15, 2023 11:34 PM
To: juzhe.zh...@rivai.ai
Cc: Li, Pan2 ; gcc-patches ; Wang, 
Yanzhang 
Subject: Re: [PATCH v1] RISC-V: Support RVV VFCVT.X.F.V rounding mode intrinsic 
API

A random idea just came to mind: maybe we could introduce one more
template argument to reduce the code needed for the rounding mode
intrinsics?

example:

diff --git a/gcc/config/riscv/riscv-vector-builtins-bases.cc
b/gcc/config/riscv/riscv-vector-builtins-bases.cc
index 2074dac0f16..9cc60842a5b 100644
--- a/gcc/config/riscv/riscv-vector-builtins-bases.cc
+++ b/gcc/config/riscv/riscv-vector-builtins-bases.cc
@@ -1648,10 +1648,11 @@ public:
};

/* Implements vfcvt.x.  */
-template
+template
class vfcvt_x : public function_base
{
public:
+  bool has_rounding_mode_operand_p () const override { return HAS_FRM; }
  rtx expand (function_expander &e) const override
  {
return e.use_exact_insn (code_for_pred_fcvt_x_f (UNSPEC, e.arg_mode (0)));
@@ -2451,6 +2452,7 @@ static CONSTEXPR const vmerge vfmerge_obj;
static CONSTEXPR const vmv_v vfmv_v_obj;
static CONSTEXPR const vfcvt_x vfcvt_x_obj;
static CONSTEXPR const vfcvt_x vfcvt_xu_obj;
+static CONSTEXPR const vfcvt_x vfcvt_x_frm_obj;
static CONSTEXPR const vfcvt_rtz_x vfcvt_rtz_x_obj;
static CONSTEXPR const vfcvt_rtz_x vfcvt_rtz_xu_obj;
static CONSTEXPR const vfcvt_f vfcvt_f_obj;
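
The bool template argument idea above can be sketched in isolation as follows. This is only an illustration: `function_base` is reduced to a single hook, `vfcvt_x_sketch` and `UNSPEC_VFCVT_SKETCH` are hypothetical names, and the default of `false` for `HAS_FRM` is an assumption, since the template parameter list is not visible in the diff.

```cpp
// Simplified stand-in for GCC's function_base (assumption for illustration).
struct function_base
{
  virtual ~function_base () = default;
  virtual bool has_rounding_mode_operand_p () const { return false; }
};

// HAS_FRM is assumed to default to false, so that existing instantiations
// keep their meaning without being touched.
template<int UNSPEC, bool HAS_FRM = false>
struct vfcvt_x_sketch : function_base
{
  bool has_rounding_mode_operand_p () const override { return HAS_FRM; }
  int unspec () const { return UNSPEC; }
};

// Hypothetical UNSPEC value; the real one comes from the machine description.
constexpr int UNSPEC_VFCVT_SKETCH = 1;

static const vfcvt_x_sketch<UNSPEC_VFCVT_SKETCH> vfcvt_x_obj{};
static const vfcvt_x_sketch<UNSPEC_VFCVT_SKETCH, true> vfcvt_x_frm_obj{};
```

Both objects share one class body; only the second template argument distinguishes the frm variant.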

RE: [PATCH] RISC-V: Fix autovec_length_operand predicate[PR110989]

2023-08-15 Thread Li, Pan2 via Gcc-patches

Committed, thanks Robin.

Pan

-----Original Message-----
From: Gcc-patches  On Behalf 
Of Robin Dapp via Gcc-patches
Sent: Tuesday, August 15, 2023 6:43 PM
To: Juzhe-Zhong ; gcc-patches@gcc.gnu.org
Cc: rdapp@gmail.com; kito.ch...@sifive.com; kito.ch...@gmail.com; 
jeffreya...@gmail.com
Subject: Re: [PATCH] RISC-V: Fix autovec_length_operand predicate[PR110989]

> Currently, an incorrect configuration of the autovec_length_operand predicate
> is exposed by PR110989 in the following situation:

In case you haven't committed it yet: This is OK.

Regards
 Robin

RE: [PATCH v4] Mode-Switching: Fix SET_SRC ICE for create_pre_exit

2023-08-14 Thread Li, Pan2 via Gcc-patches

Committed, as it passed both the bootstrap and regression tests on x86. Thanks Jeff.

Pan

-----Original Message-----
From: Jeff Law  
Sent: Tuesday, August 15, 2023 1:21 AM
To: Li, Pan2 ; gcc-patches@gcc.gnu.org
Cc: juzhe.zh...@rivai.ai; kito.ch...@sifive.com; Wang, Yanzhang 

Subject: Re: [PATCH v4] Mode-Switching: Fix SET_SRC ICE for create_pre_exit



On 8/12/23 18:56, pan2...@intel.com wrote:
> From: Pan Li 
> 
> In some cases, like gcc/testsuite/gcc.dg/pr78148.c on RISC-V, there will
> be only 1 operand when SET_SRC is applied in create_pre_exit. For example:
> 
> (insn 13 9 14 2 (clobber (reg/i:TI 10 a0)) 
> "gcc/testsuite/gcc.dg/pr78148.c":24:1 -1
>(expr_list:REG_UNUSED (reg/i:TI 10 a0)
>  (nil)))
> 
> Unfortunately, SET_SRC requires at least 2 operands, so a segmentation
> fault occurs here. For the SH4 part that triggers it, the access looks
> valid only when the return_copy_pat is a load or something like that. Thus,
> this patch tries to fix it by restricting SET_SRC to SET insns.
> 
> Signed-off-by: Pan Li 
> 
> gcc/ChangeLog:
> 
>   * mode-switching.cc (create_pre_exit): Add SET insn check.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/riscv/mode-switch-ice-1.c: New test.
OK.  Thanks for the updated version.

jeff
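
For readers following the thread, the guard described in the patch can be sketched in isolation. This is a toy model and not the actual mode-switching.cc change: `pattern`, `pattern_code`, and `get_set_src` are hypothetical names that only mimic the idea of checking for a SET before reading its source operand.

```cpp
// Hypothetical, heavily simplified model of an RTL pattern.
enum pattern_code { CODE_SET, CODE_CLOBBER };

struct pattern
{
  pattern_code code;
  int operands[2];   // A CLOBBER only uses operands[0].
  int num_operands;
};

// Returns true and stores the source operand only when PAT is a SET;
// any other code (for example a one-operand CLOBBER) is skipped.
inline bool get_set_src (const pattern &pat, int *src)
{
  if (pat.code != CODE_SET)
    return false;
  *src = pat.operands[1];
  return true;
}
```

A caller would test the return value before using `*src`, which mirrors restricting the SET_SRC access to SET insns.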

RE: [PATCH v1] RISC-V: Support RVV VFREC7 rounding mode intrinsic API

2023-08-14 Thread Li, Pan2 via Gcc-patches

Committed, thanks Kito.

Pan

From: Kito Cheng 
Sent: Monday, August 14, 2023 11:02 PM
To: Li, Pan2 
Cc: Wang, Yanzhang ; gcc-patches 
; 钟居哲 
Subject: Re: [PATCH v1] RISC-V: Support RVV VFREC7 rounding mode intrinsic API

Checked against the doc and the LLVM implementation, LGTM.

RE: [PATCH v1] RISC-V: Support RVV VFREC7 rounding mode intrinsic API

2023-08-14 Thread Li, Pan2 via Gcc-patches

Thanks Kito for comments, updated in PATCH v2.

https://gcc.gnu.org/pipermail/gcc-patches/2023-August/627367.html

Pan

-----Original Message-----
From: Kito Cheng  
Sent: Monday, August 14, 2023 10:07 PM
To: 钟居哲 
Cc: Li, Pan2 ; gcc-patches ; Wang, 
Yanzhang 
Subject: Re: [PATCH v1] RISC-V: Support RVV VFREC7 rounding mode intrinsic API

> +template

You don't need a template class here since it can only be UNSPEC_VFREC7.

> +class vfrec7_frm : public function_base
> +{
> +public:
> +  bool has_rounding_mode_operand_p () const override { return true; }
> +
> +  rtx expand (function_expander &e) const override
> +  {
> +return e.use_exact_insn (code_for_pred (UNSPEC, e.vector_mode ()));
> +  }
> +};
> +
> /* Implements vrsub.  */
> class vrsub : public function_base
> {
> @@ -2433,6 +2448,7 @@ static CONSTEXPR const unop vfsqrt_obj;
> static CONSTEXPR const unop_frm vfsqrt_frm_obj;
> static CONSTEXPR const float_misc vfrsqrt7_obj;
> static CONSTEXPR const float_misc vfrec7_obj;
> +static CONSTEXPR const vfrec7_frm vfrec7_frm_obj;

Then `static CONSTEXPR const vfrec7_frm vfrec7_frm_obj;` here

> static CONSTEXPR const binop vfmin_obj;
> static CONSTEXPR const binop vfmax_obj;
> static CONSTEXPR const float_misc vfsgnj_obj;

RE: [PATCH v1] RISC-V: Support RVV VFSQRT rounding mode intrinsic API

2023-08-14 Thread Li, Pan2 via Gcc-patches

Committed, thanks Juzhe.

Pan

From: juzhe.zh...@rivai.ai 
Sent: Monday, August 14, 2023 3:44 PM
To: Li, Pan2 ; gcc-patches 
Cc: Li, Pan2 ; Wang, Yanzhang ; 
kito.cheng 
Subject: Re: [PATCH v1] RISC-V: Support RVV VFSQRT rounding mode intrinsic API

LGTM


juzhe.zh...@rivai.ai

From: pan2.li
Date: 2023-08-14 15:39
To: gcc-patches
CC: juzhe.zhong; 
pan2.li; 
yanzhang.wang; 
kito.cheng
Subject: [PATCH v1] RISC-V: Support RVV VFSQRT rounding mode intrinsic API
From: Pan Li <pan2...@intel.com>

This patch adds support for the rounding mode API of
VFSQRT, as in the samples below.

* __riscv_vfsqrt_v_f32m1_rm
* __riscv_vfsqrt_v_f32m1_rm_m

Signed-off-by: Pan Li <pan2...@intel.com>

gcc/ChangeLog:

* config/riscv/riscv-vector-builtins-bases.cc
(class unop_frm): New class for frm.
(vfsqrt_frm_obj): New declaration.
(BASE): Ditto.
* config/riscv/riscv-vector-builtins-bases.h: Ditto.
* config/riscv/riscv-vector-builtins-functions.def
(vfsqrt_frm): New intrinsic function definition.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/float-point-sqrt.c: New test.
---
.../riscv/riscv-vector-builtins-bases.cc  | 17 ++
.../riscv/riscv-vector-builtins-bases.h   |  1 +
.../riscv/riscv-vector-builtins-functions.def |  2 ++
.../riscv/rvv/base/float-point-sqrt.c | 31 +++
4 files changed, 51 insertions(+)
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/float-point-sqrt.c

diff --git a/gcc/config/riscv/riscv-vector-builtins-bases.cc 
b/gcc/config/riscv/riscv-vector-builtins-bases.cc
index b458560a040..2074dac0f16 100644
--- a/gcc/config/riscv/riscv-vector-builtins-bases.cc
+++ b/gcc/config/riscv/riscv-vector-builtins-bases.cc
@@ -631,6 +631,21 @@ public:
   }
};
+/* Implements below instructions for frm
+   - vfsqrt
+*/
+template
+class unop_frm : public function_base
+{
+public:
+  bool has_rounding_mode_operand_p () const override { return true; }
+
+  rtx expand (function_expander &e) const override
+  {
+return e.use_exact_insn (code_for_pred (CODE, e.vector_mode ()));
+  }
+};
+
/* Implements vrsub.  */
class vrsub : public function_base
{
@@ -2415,6 +2430,7 @@ static CONSTEXPR const vfwmsac_frm vfwmsac_frm_obj;
static CONSTEXPR const vfwnmsac vfwnmsac_obj;
static CONSTEXPR const vfwnmsac_frm vfwnmsac_frm_obj;
static CONSTEXPR const unop vfsqrt_obj;
+static CONSTEXPR const unop_frm vfsqrt_frm_obj;
static CONSTEXPR const float_misc vfrsqrt7_obj;
static CONSTEXPR const float_misc vfrec7_obj;
static CONSTEXPR const binop vfmin_obj;
@@ -2662,6 +2678,7 @@ BASE (vfwmsac_frm)
BASE (vfwnmsac)
BASE (vfwnmsac_frm)
BASE (vfsqrt)
+BASE (vfsqrt_frm)
BASE (vfrsqrt7)
BASE (vfrec7)
BASE (vfmin)
diff --git a/gcc/config/riscv/riscv-vector-builtins-bases.h 
b/gcc/config/riscv/riscv-vector-builtins-bases.h
index 85e8b9a3769..5c91381bd4c 100644
--- a/gcc/config/riscv/riscv-vector-builtins-bases.h
+++ b/gcc/config/riscv/riscv-vector-builtins-bases.h
@@ -184,6 +184,7 @@ extern const function_base *const vfwmsac_frm;
extern const function_base *const vfwnmsac;
extern const function_base *const vfwnmsac_frm;
extern const function_base *const vfsqrt;
+extern const function_base *const vfsqrt_frm;
extern const function_base *const vfrsqrt7;
extern const function_base *const vfrec7;
extern const function_base *const vfmin;
diff --git a/gcc/config/riscv/riscv-vector-builtins-functions.def 
b/gcc/config/riscv/riscv-vector-builtins-functions.def
index 7e2a4ab2969..a821aca6a4b 100644
--- a/gcc/config/riscv/riscv-vector-builtins-functions.def
+++ b/gcc/config/riscv/riscv-vector-builtins-functions.def
@@ -388,6 +388,8 @@ DEF_RVV_FUNCTION (vfwnmsac_frm, alu_frm, full_preds, 
f_wwfv_ops)
// 13.8. Vector Floating-Point Square-Root Instruction
DEF_RVV_FUNCTION (vfsqrt, alu, full_preds, f_v_ops)
+DEF_RVV_FUNCTION (vfsqrt_frm, alu_frm, full_preds, f_v_ops)
+
// 13.9. Vector Floating-Point Reciprocal Square-Root Estimate Instruction
DEF_RVV_FUNCTION (vfrsqrt7, alu, full_preds, f_v_ops)
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-sqrt.c 
b/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-sqrt.c
new file mode 100644
index 000..afd1fb2b8f6
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-sqrt.c
@@ -0,0 +1,31 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64 -O3 -Wno-psabi" } */
+
+#include "riscv_vector.h"
+
+typedef float float32_t;
+
+vfloat32m1_t
+test_riscv_vfsqrt_vv_f32m1_rm (vfloat32m1_t op1, size_t vl) {
+  return __riscv_vfsqrt_v_f32m1_rm (op1, 0, vl);
+}
+
+vfloat32m1_t
+test_vfsqrt_vv_f32m1_rm_m (vbool32_t mask, vfloat32m1_t op1, size_t vl) {
+  return __riscv_vfsqrt_v_f32m1_rm_m (mask, op1, 1, vl);
+}
+
+vfloat32m1_t
+test_riscv_vfsqrt_vv_f32m1 (vfloat32m1_t op1, size_t

RE: [PATCH v1] RISC-V: Support RVV VFWNMSAC rounding mode intrinsic API

2023-08-14 Thread Li, Pan2 via Gcc-patches

Committed, thanks Juzhe.

Pan

From: juzhe.zh...@rivai.ai 
Sent: Monday, August 14, 2023 2:43 PM
To: Li, Pan2 ; gcc-patches 
Cc: Li, Pan2 ; Wang, Yanzhang ; 
kito.cheng 
Subject: Re: [PATCH v1] RISC-V: Support RVV VFWNMSAC rounding mode intrinsic API

LGTM


juzhe.zh...@rivai.ai

From: pan2.li
Date: 2023-08-14 14:07
To: gcc-patches
CC: juzhe.zhong; 
pan2.li; 
yanzhang.wang; 
kito.cheng
Subject: [PATCH v1] RISC-V: Support RVV VFWNMSAC rounding mode intrinsic API
From: Pan Li <pan2...@intel.com>

This patch adds support for the rounding mode API of
VFWNMSAC, as in the samples below.

* __riscv_vfwnmsac_vv_f64m2_rm
* __riscv_vfwnmsac_vv_f64m2_rm_m
* __riscv_vfwnmsac_vf_f64m2_rm
* __riscv_vfwnmsac_vf_f64m2_rm_m

Signed-off-by: Pan Li <pan2...@intel.com>

gcc/ChangeLog:

* config/riscv/riscv-vector-builtins-bases.cc
(class vfwnmsac_frm): New class for frm.
(vfwnmsac_frm_obj): New declaration.
(BASE): Ditto.
* config/riscv/riscv-vector-builtins-bases.h: Ditto.
* config/riscv/riscv-vector-builtins-functions.def
(vfwnmsac_frm): New intrinsic function definition.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/float-point-wnmsac.c: New test.
---
.../riscv/riscv-vector-builtins-bases.cc  | 25 ++
.../riscv/riscv-vector-builtins-bases.h   |  1 +
.../riscv/riscv-vector-builtins-functions.def |  2 +
.../riscv/rvv/base/float-point-wnmsac.c   | 47 +++
4 files changed, 75 insertions(+)
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/float-point-wnmsac.c

diff --git a/gcc/config/riscv/riscv-vector-builtins-bases.cc 
b/gcc/config/riscv/riscv-vector-builtins-bases.cc
index 5a5da903cb2..b458560a040 100644
--- a/gcc/config/riscv/riscv-vector-builtins-bases.cc
+++ b/gcc/config/riscv/riscv-vector-builtins-bases.cc
@@ -608,6 +608,29 @@ public:
   }
};
+/* Implements below instructions for frm
+   - vfwnmsac
+*/
+class vfwnmsac_frm : public function_base
+{
+public:
+  bool has_rounding_mode_operand_p () const override { return true; }
+
+  bool has_merge_operand_p () const override { return false; }
+
+  rtx expand (function_expander &e) const override
+  {
+if (e.op_info->op == OP_TYPE_vf)
+  return e.use_widen_ternop_insn (
+ code_for_pred_widen_mul_neg_scalar (PLUS, e.vector_mode ()));
+if (e.op_info->op == OP_TYPE_vv)
+  return e.use_widen_ternop_insn (
+ code_for_pred_widen_mul_neg (PLUS, e.vector_mode ()));
+
+gcc_unreachable ();
+  }
+};
+
/* Implements vrsub.  */
class vrsub : public function_base
{
@@ -2390,6 +2413,7 @@ static CONSTEXPR const vfwnmacc_frm vfwnmacc_frm_obj;
static CONSTEXPR const vfwmsac vfwmsac_obj;
static CONSTEXPR const vfwmsac_frm vfwmsac_frm_obj;
static CONSTEXPR const vfwnmsac vfwnmsac_obj;
+static CONSTEXPR const vfwnmsac_frm vfwnmsac_frm_obj;
static CONSTEXPR const unop vfsqrt_obj;
static CONSTEXPR const float_misc vfrsqrt7_obj;
static CONSTEXPR const float_misc vfrec7_obj;
@@ -2636,6 +2660,7 @@ BASE (vfwnmacc_frm)
BASE (vfwmsac)
BASE (vfwmsac_frm)
BASE (vfwnmsac)
+BASE (vfwnmsac_frm)
BASE (vfsqrt)
BASE (vfrsqrt7)
BASE (vfrec7)
diff --git a/gcc/config/riscv/riscv-vector-builtins-bases.h 
b/gcc/config/riscv/riscv-vector-builtins-bases.h
index 09356dd7ac8..85e8b9a3769 100644
--- a/gcc/config/riscv/riscv-vector-builtins-bases.h
+++ b/gcc/config/riscv/riscv-vector-builtins-bases.h
@@ -182,6 +182,7 @@ extern const function_base *const vfwnmacc_frm;
extern const function_base *const vfwmsac;
extern const function_base *const vfwmsac_frm;
extern const function_base *const vfwnmsac;
+extern const function_base *const vfwnmsac_frm;
extern const function_base *const vfsqrt;
extern const function_base *const vfrsqrt7;
extern const function_base *const vfrec7;
diff --git a/gcc/config/riscv/riscv-vector-builtins-functions.def 
b/gcc/config/riscv/riscv-vector-builtins-functions.def
index e2a79607d04..7e2a4ab2969 100644
--- a/gcc/config/riscv/riscv-vector-builtins-functions.def
+++ b/gcc/config/riscv/riscv-vector-builtins-functions.def
@@ -382,6 +382,8 @@ DEF_RVV_FUNCTION (vfwnmacc_frm, alu_frm, full_preds, 
f_wwvv_ops)
DEF_RVV_FUNCTION (vfwnmacc_frm, alu_frm, full_preds, f_wwfv_ops)
DEF_RVV_FUNCTION (vfwmsac_frm, alu_frm, full_preds, f_wwvv_ops)
DEF_RVV_FUNCTION (vfwmsac_frm, alu_frm, full_preds, f_wwfv_ops)
+DEF_RVV_FUNCTION (vfwnmsac_frm, alu_frm, full_preds, f_wwvv_ops)
+DEF_RVV_FUNCTION (vfwnmsac_frm, alu_frm, full_preds, f_wwfv_ops)
// 13.8. Vector Floating-Point Square-Root Instruction
DEF_RVV_FUNCTION (vfsqrt, alu, full_preds, f_v_ops)
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-wnmsac.c 
b/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-wnmsac.c
new file mode 100644
index 000..13eb306313c
--- /dev/null
+++
