Re: Re: [PATCH] RISC-V: Support non-SLP unordered reduction

2023-07-17 Thread juzhe.zh...@rivai.ai
Addressed the comments.

V2 patch:
https://gcc.gnu.org/pipermail/gcc-patches/2023-July/624638.html 

I added the following, as you suggested:

+/* Change insn and assert that the change always happens.  */
+static void
+validate_change_or_fail (rtx object, rtx *loc, rtx new_rtx, bool in_group)
+{
+  bool change_p = validate_change (object, loc, new_rtx, in_group);
+  gcc_assert (change_p);
+}

Could you take a look again?


juzhe.zh...@rivai.ai
 


Re: [PATCH] RISC-V: Support non-SLP unordered reduction

2023-07-17 Thread Kito Cheng via Gcc-patches
> @@ -247,6 +248,7 @@ void emit_vlmax_cmp_mu_insn (unsigned, rtx *);
>  void emit_vlmax_masked_mu_insn (unsigned, int, rtx *);
>  void emit_scalar_move_insn (unsigned, rtx *);
>  void emit_nonvlmax_integer_move_insn (unsigned, rtx *, rtx);
> +//void emit_vlmax_reduction_insn (unsigned, rtx *);

Please drop this.


> diff --git a/gcc/config/riscv/riscv-vsetvl.cc b/gcc/config/riscv/riscv-vsetvl.cc
> index 586dc8e5379..97a9dad8a77 100644
> --- a/gcc/config/riscv/riscv-vsetvl.cc
> +++ b/gcc/config/riscv/riscv-vsetvl.cc
> @@ -646,7 +646,8 @@ gen_vsetvl_pat (enum vsetvl_type insn_type, const vl_vtype_info &info, rtx vl)
>  }
>
>  static rtx
> -gen_vsetvl_pat (rtx_insn *rinsn, const vector_insn_info &info)
> +gen_vsetvl_pat (rtx_insn *rinsn, const vector_insn_info &info,
> +                rtx vl = NULL_RTX)
>  {
>    rtx new_pat;
>    vl_vtype_info new_info = info;
> @@ -657,7 +658,7 @@ gen_vsetvl_pat (rtx_insn *rinsn, const vector_insn_info &info)
>    if (vsetvl_insn_p (rinsn) || vlmax_avl_p (info.get_avl ()))
>      {
>        rtx dest = get_vl (rinsn);

Change this to:

      rtx dest = vl ? vl : get_vl (rinsn);

> -      new_pat = gen_vsetvl_pat (VSETVL_NORMAL, new_info, dest);
> +      new_pat = gen_vsetvl_pat (VSETVL_NORMAL, new_info, vl ? vl : dest);

and keep dest here.

>      }
>    else if (INSN_CODE (rinsn) == CODE_FOR_vsetvl_vtype_change_only)
>      new_pat = gen_vsetvl_pat (VSETVL_VTYPE_CHANGE_ONLY, new_info, NULL_RTX);

Should we handle the non-null vl case in the else-if and else branches?
If not, add `assert (vl == NULL_RTX)` there.
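
To make the two review points above concrete, here is a minimal standalone sketch. It is not GCC code: `VsetvlKind`, `pick_vl_dest`, and the use of register-name strings in place of `rtx` are all hypothetical stand-ins chosen for illustration. It shows the override being folded into `dest` once, and an assertion guarding the branches that do not consume the optional `vl` argument.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for illustration only; these are not GCC types.
enum class VsetvlKind { Normal, VtypeChangeOnly, DiscardResult };

// Returns the name of the VL destination register the generated pattern
// would use, or an empty string when the pattern has no VL destination.
// insn_vl models get_vl (rinsn); vl models the optional override argument.
std::string pick_vl_dest(VsetvlKind kind, const std::string &insn_vl,
                         const std::string *vl = nullptr)
{
  if (kind == VsetvlKind::Normal)
    return vl ? *vl : insn_vl;  // models: rtx dest = vl ? vl : get_vl (rinsn);

  // The remaining branches ignore vl, so reject callers that pass one
  // instead of silently dropping it.
  assert(vl == nullptr);
  return "";
}
```

With this shape, a caller that passes an override to a branch that cannot honor it fails immediately in a checked build, rather than producing a pattern that quietly ignores the request.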

> @@ -818,7 +819,8 @@ change_insn (rtx_insn *rinsn, rtx new_pat)
>        print_rtl_single (dump_file, PATTERN (rinsn));
>      }
>
> -  validate_change (rinsn, &PATTERN (rinsn), new_pat, false);
> +  bool change_p = validate_change (rinsn, &PATTERN (rinsn), new_pat, false);
> +  gcc_assert (change_p);

I think we could create a wrapper for validate_change that makes sure
it returns true, and use that wrapper at all the other call sites too?

E.g. validate_change_or_fail?


RE: Re: [PATCH] RISC-V: Support non-SLP unordered reduction

2023-07-15 Thread Li, Pan2 via Gcc-patches
I filed a separate patch targeting GCC 13 for this bug; the rvv.exp and
riscv.exp tests pass. Unfortunately, it is not easy to reproduce the bug
with the intrinsic API.

https://gcc.gnu.org/pipermail/gcc-patches/2023-July/624574.html

Pan


Re: Re: [PATCH] RISC-V: Support non-SLP unordered reduction

2023-07-14 Thread 钟居哲
So to be safe, I think it should be backported to GCC 13, even though I don't
have an intrinsic testcase that reproduces it.



juzhe.zh...@rivai.ai
 

Re: Re: [PATCH] RISC-V: Support non-SLP unordered reduction

2023-07-14 Thread 钟居哲
>> Is it a performance bug or a correctness bug? If it is a correctness bug,
>> does it also appear in GCC 13?

It's a correctness bug.

The bug is as follows:

vsetvli zero, 1, e16, m1, ta, ma  -> the VSETVL pass detects that this can be fused into "t1,zero,e16,m2,ta,ma", but the change fails in change_insn
vmv.s.x v1,a5
...
vsetvli t1,zero,e16,m2,ta,ma  -> elided
vlse16.v v2...

So we end up with:

vsetvli zero, 1, e16, m1, ta, ma
vmv.s.x v1,a5
...
vlse16.v v2...

which is incorrect: the second vsetvli has been removed even though the first one was never updated.
I tried to reproduce this situation with intrinsics but failed.
It seems that it can only be reproduced by reduction auto-vectorization.
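
The failure mode described above can be modeled outside the compiler. The following is a toy sketch under stated assumptions, not the VSETVL pass itself: `Insn`, `try_change`, `change_or_fail`, and the configuration strings are all hypothetical. It shows that if a rejected change is ignored and the second vsetvli is still elided, the load runs under the stale configuration, whereas a checked change fails loudly at the point of rejection.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical model of an instruction; not a GCC data structure.
struct Insn {
  std::string text;    // mnemonic, for readability only
  std::string config;  // non-empty only for vsetvli: the state it establishes
  bool elided;         // true once the pass has removed the instruction
};

// Stand-in for a change that the backend may reject (returns false).
bool try_change(Insn &insn, const std::string &new_config, bool accepted)
{
  if (!accepted)
    return false;
  insn.config = new_config;
  return true;
}

// Checked variant: a rejected change is a hard error, never silently ignored.
void change_or_fail(Insn &insn, const std::string &new_config, bool accepted)
{
  if (!try_change(insn, new_config, accepted))
    throw std::logic_error("change rejected");
}

// Configuration in effect when the instruction at `index` executes.
std::string config_seen_by(const std::vector<Insn> &seq, std::size_t index)
{
  std::string current;
  for (std::size_t i = 0; i <= index && i < seq.size(); ++i)
    if (!seq[i].elided && !seq[i].config.empty())
      current = seq[i].config;
  return current;
}

std::vector<Insn> make_sequence()
{
  return {{"vsetvli zero,1", "avl=1,e16,m1", false},
          {"vmv.s.x v1,a5", "", false},
          {"vsetvli t1,zero", "avl=vlmax,e16,m2", false},
          {"vlse16.v v2", "", false}};
}

// Buggy shape: the result of the change is ignored, the elision still happens.
std::string fuse_unchecked(bool accepted)
{
  std::vector<Insn> seq = make_sequence();
  try_change(seq[0], seq[2].config, accepted);
  seq[2].elided = true;
  return config_seen_by(seq, 3);  // state under which vlse16.v executes
}

// Checked shape: the elision is only reached if the change succeeded.
std::string fuse_checked(bool accepted)
{
  std::vector<Insn> seq = make_sequence();
  change_or_fail(seq[0], seq[2].config, accepted);
  seq[2].elided = true;
  return config_seen_by(seq, 3);
}
```

In this model, `fuse_unchecked(false)` leaves the load under "avl=1,e16,m1", which mirrors the incorrect sequence shown above, while `fuse_checked(false)` throws instead of producing wrong code.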



juzhe.zh...@rivai.ai
 

Re: [PATCH] RISC-V: Support non-SLP unordered reduction

2023-07-14 Thread Kito Cheng via Gcc-patches
 wrote on Fri, 14 Jul 2023 at 20:31:

> From: Ju-Zhe Zhong 
>
> This patch adds reduc_*_scal to support reduction auto-vectorization.
>
> Use COND_LEN_* + reduc_*_scal to support unordered non-SLP
> auto-vectorization.
>
> Consider the following case:
> int __attribute__((noipa))
> and_loop (int32_t * __restrict x,
>           int32_t n, int res)
> {
>   for (int i = 0; i < n; ++i)
>     res &= x[i];
>   return res;
> }
>
> ASM:
> and_loop:
>         ble     a1,zero,.L4
>         vsetvli a3,zero,e32,m1,ta,ma
>         vmv.v.i v1,-1
> .L3:
>         vsetvli a5,a1,e32,m1,tu,ma   -> MUST BE "TU".
>         slli    a4,a5,2
>         sub     a1,a1,a5
>         vle32.v v2,0(a0)
>         add     a0,a0,a4
>         vand.vv v1,v2,v1
>         bne     a1,zero,.L3
>         vsetivli        zero,1,e32,m1,ta,ma
>         vmv.v.i v2,-1
>         vsetvli a3,zero,e32,m1,ta,ma
>         vredand.vs      v1,v1,v2
>         vmv.x.s a5,v1
>         and     a0,a2,a5
>         ret
> .L4:
>         mv      a0,a2
>         ret
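
Why the loop's vsetvli must use the tail-undisturbed ("tu") policy can be seen in a scalar model of the assembly above. This is a hedged sketch, not the generated code: the lane count `kVlmax`, the `TailPolicy` enum, and `and_loop_model` are assumptions introduced for illustration. Under a tail-agnostic policy the hardware may overwrite tail lanes with all 1s, which would discard partial results accumulated in those lanes by earlier, full-length iterations.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumed number of 32-bit lanes per vector register (illustrative only).
constexpr std::size_t kVlmax = 4;

// Undisturbed: tail lanes keep their value.
// AgnosticAllOnes: tail lanes are overwritten with 1s, one of the outcomes
// a tail-agnostic policy permits.
enum class TailPolicy { Undisturbed, AgnosticAllOnes };

// Scalar model of the strip-mined AND reduction; acc plays the role of v1.
int32_t and_loop_model(const std::vector<int32_t> &x, int32_t res,
                       TailPolicy policy)
{
  if (x.empty())
    return res;                              // ble a1,zero,.L4

  std::array<int32_t, kVlmax> acc;
  acc.fill(-1);                              // vmv.v.i v1,-1

  std::size_t i = 0;
  while (i < x.size()) {
    // vsetvli a5,a1,...: process at most kVlmax elements this iteration.
    std::size_t vl = std::min(kVlmax, x.size() - i);
    for (std::size_t lane = 0; lane < vl; ++lane)
      acc[lane] &= x[i + lane];              // vle32.v + vand.vv
    if (policy == TailPolicy::AgnosticAllOnes)
      for (std::size_t lane = vl; lane < kVlmax; ++lane)
        acc[lane] = -1;                      // tail lanes clobbered
    i += vl;
  }

  int32_t reduced = -1;                      // vredand.vs with scalar -1
  for (int32_t lane : acc)
    reduced &= lane;
  return res & reduced;                      // and a0,a2,a5
}
```

For the six-element input {0xFF, 0xFF, 0xF0, 0x0F, 0xFF, 0xFF} with res = -1, the undisturbed model returns 0 (the true AND of all elements), while the all-ones tail model returns 0xFF because lanes 2 and 3 lose the values accumulated in the first iteration.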
>
> Also fix a bug in the VSETVL pass that is exposed by the reduction testcases.
>


Is it a performance bug or a correctness bug? If it is a correctness bug,
does it also appear in GCC 13?


> SLP reduction and floating-point in-order reduction are not supported yet.
>
> gcc/ChangeLog:
>
> * config/riscv/autovec.md (reduc_plus_scal_<mode>): New pattern.
> (reduc_smax_scal_<mode>): Ditto.
> (reduc_umax_scal_<mode>): Ditto.
> (reduc_smin_scal_<mode>): Ditto.
> (reduc_umin_scal_<mode>): Ditto.
> (reduc_and_scal_<mode>): Ditto.
> (reduc_ior_scal_<mode>): Ditto.
> (reduc_xor_scal_<mode>): Ditto.
> * config/riscv/riscv-protos.h (enum insn_type): New enum.
> (emit_nonvlmax_integer_move_insn): Add reduction.
> (expand_reduction): New function.
> * config/riscv/riscv-v.cc (emit_vlmax_reduction_insn): Ditto.
> (emit_vlmax_fp_reduction_insn): Ditto.
> (get_m1_mode): Ditto.
> (expand_cond_len_binop): Fix name.
> (expand_reduction): New function.
> * config/riscv/riscv-vsetvl.cc (gen_vsetvl_pat): Fix bug.
> (change_insn): Ditto.
> (change_vsetvl_insn): Ditto.
> (pass_vsetvl::backward_demand_fusion): Ditto.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/rvv/rvv.exp: Add reduction tests.
> * gcc.target/riscv/rvv/autovec/reduc/reduc-1.c: New test.
> * gcc.target/riscv/rvv/autovec/reduc/reduc-2.c: New test.
> * gcc.target/riscv/rvv/autovec/reduc/reduc-3.c: New test.
> * gcc.target/riscv/rvv/autovec/reduc/reduc-4.c: New test.
> * gcc.target/riscv/rvv/autovec/reduc/reduc_run-1.c: New test.
> * gcc.target/riscv/rvv/autovec/reduc/reduc_run-2.c: New test.
> * gcc.target/riscv/rvv/autovec/reduc/reduc_run-3.c: New test.
> * gcc.target/riscv/rvv/autovec/reduc/reduc_run-4.c: New test.
>
> ---
>  gcc/config/riscv/autovec.md   | 138 ++
>  gcc/config/riscv/riscv-protos.h   |   3 +
>  gcc/config/riscv/riscv-v.cc   |  84 ++-
>  gcc/config/riscv/riscv-vsetvl.cc  |  28 +++-
>  .../riscv/rvv/autovec/reduc/reduc-1.c | 118 +++
>  .../riscv/rvv/autovec/reduc/reduc-2.c | 129 
>  .../riscv/rvv/autovec/reduc/reduc-3.c |  65 +
>  .../riscv/rvv/autovec/reduc/reduc-4.c |  59 
>  .../riscv/rvv/autovec/reduc/reduc_run-1.c |  56 +++
>  .../riscv/rvv/autovec/reduc/reduc_run-2.c |  79 ++
>  .../riscv/rvv/autovec/reduc/reduc_run-3.c |  49 +++
>  .../riscv/rvv/autovec/reduc/reduc_run-4.c |  66 +
>  gcc/testsuite/gcc.target/riscv/rvv/rvv.exp|   2 +
>  13 files changed, 868 insertions(+), 8 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc-1.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc-2.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc-3.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc-4.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_run-1.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_run-2.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_run-3.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_run-4.c
>
> diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md
> index 0476b1dea45..a74f66f41ac 100644
> --- a/gcc/config/riscv/autovec.md
> +++ b/gcc/config/riscv/autovec.md
> @@ -1531,3 +1531,141 @@
>riscv_vector::expand_cond_len_binop (, operands);
>DONE;
>  })
> +
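> +;; =========================================================================
> +;; == Reductions
> +;; =========================================================================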
> +
> +;;
>