RE: [PATCH v2] RISC-V: Allow RVV VMS{Compare}(V1, V1) simplify to VMCLR
Cool, Thank you!

Pan

-----Original Message-----
From: Kito Cheng
Sent: Friday, April 28, 2023 8:37 PM
To: Li, Pan2
Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; Wang, Yanzhang
Subject: Re: [PATCH v2] RISC-V: Allow RVV VMS{Compare}(V1, V1) simplify to VMCLR

pushed, thanks!
Re: [PATCH v2] RISC-V: Allow RVV VMS{Compare}(V1, V1) simplify to VMCLR
pushed, thanks!
RE: [PATCH v2] RISC-V: Allow RVV VMS{Compare}(V1, V1) simplify to VMCLR
Passed both the X86 bootstrap and regression test.

Pan

-----Original Message-----
From: Li, Pan2
Sent: Friday, April 28, 2023 2:45 PM
To: Kito Cheng
Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; Wang, Yanzhang
Subject: RE: [PATCH v2] RISC-V: Allow RVV VMS{Compare}(V1, V1) simplify to VMCLR
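For readers following the thread: the patch relies on the fact that ne, lt, ltu, gt and gtu are all false when an element is compared with itself, so an unmasked compare of a vector register against itself yields an all-zero mask over the first vl elements. The scalar model below is only a sketch of that identity. It is not GCC code, and the names cmp_kind, compare_elem and self_compare_is_all_false are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model (not GCC code): the five compare kinds
   that the patch description lists as simplifying to VMCLR.  */
enum cmp_kind { CMP_NE, CMP_LT, CMP_LTU, CMP_GT, CMP_GTU };

/* Result of one element-wise compare; the unsigned kinds reinterpret
   the 8-bit element as uint8_t.  */
static int compare_elem(enum cmp_kind kind, int8_t a, int8_t b)
{
    switch (kind) {
    case CMP_NE:  return a != b;
    case CMP_LT:  return a < b;
    case CMP_LTU: return (uint8_t)a < (uint8_t)b;
    case CMP_GT:  return a > b;
    case CMP_GTU: return (uint8_t)a > (uint8_t)b;
    }
    return 0;
}

/* Returns 1 if every one of the first vl mask bits is 0 when both
   compare operands are the same vector v1, and 0 otherwise.  */
int self_compare_is_all_false(enum cmp_kind kind, const int8_t *v1, size_t vl)
{
    assert(v1 != NULL || vl == 0);
    for (size_t i = 0; i < vl; i++)
        if (compare_elem(kind, v1[i], v1[i]))
            return 0;
    return 1;
}
```

Calling self_compare_is_all_false with any of the five kinds and any input returns 1, which is the property that lets the compiler emit vmclr.m instead of loading the operand and comparing it.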
RE: [PATCH v2] RISC-V: Allow RVV VMS{Compare}(V1, V1) simplify to VMCLR
Thanks, Kito. Yes, you are right. I am investigating this right now from simplify rtl, given that we had one similar case, VMORN, previously.

Pan

-----Original Message-----
From: Kito Cheng
Sent: Friday, April 28, 2023 2:41 PM
To: Li, Pan2
Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; Wang, Yanzhang
Subject: Re: [PATCH v2] RISC-V: Allow RVV VMS{Compare}(V1, V1) simplify to VMCLR
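The follow-up raised in this part of the thread is the opposite case: an equality compare of a vector with itself is true for every element, so its mask should have the first vl bits set. The sketch below models that with plain C. It is an illustration under assumptions, not the eventual GCC change: the helper names are hypothetical, element i is stored in bit i of the mask buffer, and bits past vl are simply left at 0 here, whereas real hardware follows the tail policy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: element-wise equality, with both operands
   passed explicitly.  */
static int elem_eq(int8_t a, int8_t b)
{
    return a == b;
}

/* Pack the result of "v1 == v1" for the first vl elements into mask,
   one bit per element, element i in bit i.  mask must hold
   (vl + 7) / 8 bytes; bits past vl stay 0 in this model.  */
void self_eq_mask(const int8_t *v1, size_t vl, uint8_t *mask)
{
    assert(v1 != NULL && mask != NULL);
    memset(mask, 0, (vl + 7) / 8);
    for (size_t i = 0; i < vl; i++)
        if (elem_eq(v1[i], v1[i]))
            mask[i / 8] |= (uint8_t)(1u << (i % 8));
}

/* Reference mask with the first vl bits set and the rest clear.  */
void first_vl_bits_set(size_t vl, uint8_t *mask)
{
    assert(mask != NULL);
    memset(mask, 0, (vl + 7) / 8);
    for (size_t i = 0; i < vl; i++)
        mask[i / 8] |= (uint8_t)(1u << (i % 8));
}
```

For any input, the two functions fill the buffer identically, which is why the eq case is a candidate for the same kind of shortcut as the five compares handled by the patch.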
Re: [PATCH v2] RISC-V: Allow RVV VMS{Compare}(V1, V1) simplify to VMCLR
LGTM

I thought it could optimize __riscv_vmseq_vv_i8m8_b1(v1, v1, vl) too, but I
don't know why
(eq:VNx128BI (reg/v:VNx128QI 137 [ v1 ]) (reg/v:VNx128QI 137 [ v1 ]))
is not evaluated to true. Anyway, I guess that should be your next step to
investigate :)

On Fri, Apr 28, 2023 at 10:46 AM wrote:
>
> From: Pan Li
>
> When some RVV integer compare operators act on the same vector
> registers without a mask, they can be simplified to VMCLR.
>
> This patch allows ne, lt, ltu, gt and gtu to perform this kind of
> simplification by adding one new define_split.
>
> Given we have:
> vbool1_t test_shortcut_for_riscv_vmslt_case_0(vint8m8_t v1, size_t vl) {
>   return __riscv_vmslt_vv_i8m8_b1(v1, v1, vl);
> }
>
> Before this patch:
> vsetvli  zero,a2,e8,m8,ta,ma
> vl8re8.v v24,0(a1)
> vmslt.vv v8,v24,v24
> vsetvli  a5,zero,e8,m8,ta,ma
> vsm.v    v8,0(a0)
> ret
>
> After this patch:
> vsetvli  zero,a2,e8,mf8,ta,ma
> vmclr.m  v24    <- optimized to vmclr.m
> vsetvli  zero,a5,e8,mf8,ta,ma
> vsm.v    v24,0(a0)
> ret
>
> As above, we may have one instruction eliminated and require fewer
> vector registers.
>
> gcc/ChangeLog:
>
> * config/riscv/vector.md: Add new define split to perform
> the simplification.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/rvv/base/integer_compare_insn_shortcut.c: New test.
>
> Signed-off-by: Pan Li
> Co-authored-by: kito-cheng
> ---
>  gcc/config/riscv/vector.md                    |  32 ++
>  .../rvv/base/integer_compare_insn_shortcut.c  | 291 ++
>  2 files changed, 323 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/integer_compare_insn_shortcut.c
>
> diff --git a/gcc/config/riscv/vector.md b/gcc/config/riscv/vector.md
> index b3d23441679..1642822d098 100644
> --- a/gcc/config/riscv/vector.md
> +++ b/gcc/config/riscv/vector.md
> @@ -7689,3 +7689,35 @@ (define_insn "@pred_fault_load<mode>"
>    "vle<sew>ff.v\t%0,%3%p1"
>    [(set_attr "type" "vldff")
>     (set_attr "mode" "<MODE>")])
> +
> +;; -----------------------------------------------------------------------------
> +;; Integer Compare Instructions Simplification
> +;; -----------------------------------------------------------------------------
> +;; Simplify to VMCLR.m Includes:
> +;; - 1. VMSNE
> +;; - 2. VMSLT
> +;; - 3. VMSLTU
> +;; - 4. VMSGT
> +;; - 5. VMSGTU
> +;; -----------------------------------------------------------------------------
> +(define_split
> +  [(set (match_operand:VB      0 "register_operand")
> +        (if_then_else:VB
> +          (unspec:VB
> +            [(match_operand:VB 1 "vector_all_trues_mask_operand")
> +             (match_operand    4 "vector_length_operand")
> +             (match_operand    5 "const_int_operand")
> +             (match_operand    6 "const_int_operand")
> +             (reg:SI VL_REGNUM)
> +             (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
> +          (match_operand:VB    3 "vector_move_operand")
> +          (match_operand:VB    2 "vector_undef_operand")))]
> +  "TARGET_VECTOR"
> +  [(const_int 0)]
> +  {
> +    emit_insn (gen_pred_mov (<MODE>mode, operands[0], CONST1_RTX (<MODE>mode),
> +                             RVV_VUNDEF (<MODE>mode), operands[3],
> +                             operands[4], operands[5]));
> +    DONE;
> +  }
> +)
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/integer_compare_insn_shortcut.c b/gcc/testsuite/gcc.target/riscv/rvv/base/integer_compare_insn_shortcut.c
> new file mode 100644
> index 000..8954adad09d
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/integer_compare_insn_shortcut.c
> @@ -0,0 +1,291 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv64gcv -mabi=lp64 -O3" } */
> +
> +#include "riscv_vector.h"
> +
> +vbool1_t test_shortcut_for_riscv_vmseq_case_0(vint8m8_t v1, size_t vl) {
> +  return __riscv_vmseq_vv_i8m8_b1(v1, v1, vl);
> +}
> +
> +vbool2_t test_shortcut_for_riscv_vmseq_case_1(vint8m4_t v1, size_t vl) {
> +  return __riscv_vmseq_vv_i8m4_b2(v1, v1, vl);
> +}
> +
> +vbool4_t test_shortcut_for_riscv_vmseq_case_2(vint8m2_t v1, size_t vl) {
> +  return __riscv_vmseq_vv_i8m2_b4(v1, v1, vl);
> +}
> +
> +vbool8_t test_shortcut_for_riscv_vmseq_case_3(vint8m1_t v1, size_t vl) {
> +  return __riscv_vmseq_vv_i8m1_b8(v1, v1, vl);
> +}
> +
> +vbool16_t test_shortcut_for_riscv_vmseq_case_4(vint8mf2_t v1, size_t vl) {
> +  return __riscv_vmseq_vv_i8mf2_b16(v1, v1, vl);
> +}
> +
> +vbool32_t test_shortcut_for_riscv_vmseq_case_5(vint8mf4_t v1, size_t vl) {
> +  return __riscv_vmseq_vv_i8mf4_b32(v1, v1, vl);
> +}
> +
> +vbool64_t test_shortcut_for_riscv_vmseq_case_6(vint8mf8_t v1, size_t vl) {
> +  return __riscv_vmseq_vv_i8mf8_b64(v1, v1, vl);
> +}
> +
> +vbool1_t test_shortcut_for_riscv_vmsne_case_0(vint8m8_t v1, size_t vl) {
> +  return __riscv_vmsne_vv_i8m8_b1(v1, v1, vl);
> +}
> +
> +vbool2_t test_shortcut_for_riscv_vmsne_case_1(vint8m4_t v1, size_t vl) {
> +  return __riscv_vmsne_vv_i8m4_b2(v1, v1, vl