Re: [PATCH] RISC-V: Allow LICM hoist POLY_INT configuration code sequence
On 2/6/24 6:14 AM, Robin Dapp wrote:
>> The root cause is this following RTL pattern, after fwprop1:
>>
>> (insn 82 78 84 9 (set (reg:DI 230)
>>         (sign_extend:DI (minus:SI (subreg/s/v:SI (reg:DI 150 [ niters.10 ]) 0)
>>                 (subreg:SI (reg:DI 221) 0)))) 13 {subsi3_extended}
>>      (expr_list:REG_EQUAL (sign_extend:DI (plus:SI (subreg/s/v:SI (reg:DI 150 [ niters.10 ]) 0)
>>                 *(const_poly_int:SI [-16, -16])*))
>>         (nil)))
>>
>> The highlight *(const_poly_int:SI [-16, -16])* causes ICE.
>>
>> This RTL is because:
>> (insn 69 68 71 8 (set (reg:DI 221)
>>         (const_poly_int:DI [16, 16])) 208 {*movdi_64bit}
>>      (nil))
>> (insn 82 78 84 9 (set (reg:DI 230)
>>         (sign_extend:DI (minus:SI (subreg/s/v:SI (reg:DI 150 [ niters.10 ]) 0)
>>                 (subreg:SI (reg:DI 221) 0)))) 13 {subsi3_extended}
>>   > (subreg:SI (const_poly_int:SI [-16, -16])) fwprop1 add (const_poly_int:SI [-16, -16]) reg_equal
>>      (expr_list:REG_EQUAL (sign_extend:DI (plus:SI (subreg/s/v:SI (reg:DI 150 [ niters.10 ]) 0)
>>                 (const_poly_int:SI [-16, -16])))
>>         (nil)))
>
> I'm seeing a slightly different pattern but that doesn't change
> the problem.
>
>> (set (reg:SI) (subreg:SI (DI: poly value))) but it causes ICE that I
>> mentioned above.
>
> That's indeed a bit more idiomatic and I wouldn't oppose that.
>
> The problem causing the ICE is that we want to simplify a PLUS
> with (const_poly_int:SI [16, 16]) and (const_int 0) but the mode is
> DImode. My suspicion is that this is caused by our addsi3_extended
> pattern and we fail to deduce the proper mode for analysis.

Certainly possible. It didn't even occur to me that a POLY_INT would
slip through here.

> I'm just speculating but maybe that's because we assert that a
> plus is of the form simple_reg_p (op0) && CONSTANT_P (op1).
> Usually, constants don't have a mode and can just be used.
> poly_int_csts do have one and need to be explicitly converted
> (kind of).
>
> We can only analyze this zero_extended plus at all since Jeff
> added the addsi3_extended handling for loop-iv. Maybe we could
> punt like
>
> diff --git a/gcc/loop-iv.cc b/gcc/loop-iv.cc
> index eb7e923a38b..796413c25a3 100644
> --- a/gcc/loop-iv.cc
> +++ b/gcc/loop-iv.cc
> @@ -714,6 +714,9 @@ get_biv_step_1 (df_ref def, scalar_int_mode outer_mode, rtx reg,
>        if (!simple_reg_p (op0) || !CONSTANT_P (op1))
>         return false;
>
> +      if (CONST_POLY_INT_P (op1) && GET_MODE (op1) != outer_mode)
> +       return false;
> +
>
> This helps for your test case but I haven't done any further
> testing. I'd think this is relatively safe because it's only
> a missed analysis/optimization in the worst case.
> Still, generally, I don't see a reason why we wouldn't be able
> to analyze this?

I don't think it would significantly hurt anything. IIRC that bit of
code was to fix a minor regression caused by the backend changes.

I would ACK that patch given the usual testing cycle.

jeff
Re: Re: [PATCH] RISC-V: Allow LICM hoist POLY_INT configuration code sequence
Hi, Robin.

Could you continue on this LICM issue? I am not sure whether my fix is correct, or whether you can find another way to make LICM work.

juzhe.zh...@rivai.ai

From: Robin Dapp
Date: 2024-02-06 21:14
To: juzhe.zh...@rivai.ai; kito.cheng
CC: rdapp.gcc; gcc-patches; Kito.cheng; jeffreyalaw
Subject: Re: [PATCH] RISC-V: Allow LICM hoist POLY_INT configuration code sequence
> [...]
Re: [PATCH] RISC-V: Allow LICM hoist POLY_INT configuration code sequence
> The root cause is this following RTL pattern, after fwprop1:
>
> (insn 82 78 84 9 (set (reg:DI 230)
>         (sign_extend:DI (minus:SI (subreg/s/v:SI (reg:DI 150 [ niters.10 ]) 0)
>                 (subreg:SI (reg:DI 221) 0)))) 13 {subsi3_extended}
>      (expr_list:REG_EQUAL (sign_extend:DI (plus:SI (subreg/s/v:SI (reg:DI 150 [ niters.10 ]) 0)
>                 *(const_poly_int:SI [-16, -16])*))
>         (nil)))
>
> The highlight *(const_poly_int:SI [-16, -16])* causes ICE.
>
> This RTL is because:
> (insn 69 68 71 8 (set (reg:DI 221)
>         (const_poly_int:DI [16, 16])) 208 {*movdi_64bit}
>      (nil))
> (insn 82 78 84 9 (set (reg:DI 230)
>         (sign_extend:DI (minus:SI (subreg/s/v:SI (reg:DI 150 [ niters.10 ]) 0)
>                 (subreg:SI (reg:DI 221) 0)))) 13 {subsi3_extended}
>   > (subreg:SI (const_poly_int:SI [-16, -16])) fwprop1 add (const_poly_int:SI [-16, -16]) reg_equal
>      (expr_list:REG_EQUAL (sign_extend:DI (plus:SI (subreg/s/v:SI (reg:DI 150 [ niters.10 ]) 0)
>                 (const_poly_int:SI [-16, -16])))
>         (nil)))

I'm seeing a slightly different pattern but that doesn't change
the problem.

> (set (reg:SI) (subreg:SI (DI: poly value))) but it causes ICE that I
> mentioned above.

That's indeed a bit more idiomatic and I wouldn't oppose that.

The problem causing the ICE is that we want to simplify a PLUS
with (const_poly_int:SI [16, 16]) and (const_int 0) but the mode is
DImode. My suspicion is that this is caused by our addsi3_extended
pattern and we fail to deduce the proper mode for analysis.

I'm just speculating but maybe that's because we assert that a
plus is of the form simple_reg_p (op0) && CONSTANT_P (op1).
Usually, constants don't have a mode and can just be used.
poly_int_csts do have one and need to be explicitly converted
(kind of).

We can only analyze this zero_extended plus at all since Jeff
added the addsi3_extended handling for loop-iv. Maybe we could
punt like

diff --git a/gcc/loop-iv.cc b/gcc/loop-iv.cc
index eb7e923a38b..796413c25a3 100644
--- a/gcc/loop-iv.cc
+++ b/gcc/loop-iv.cc
@@ -714,6 +714,9 @@ get_biv_step_1 (df_ref def, scalar_int_mode outer_mode, rtx reg,
       if (!simple_reg_p (op0) || !CONSTANT_P (op1))
        return false;

+      if (CONST_POLY_INT_P (op1) && GET_MODE (op1) != outer_mode)
+       return false;
+

This helps for your test case but I haven't done any further
testing. I'd think this is relatively safe because it's only
a missed analysis/optimization in the worst case.
Still, generally, I don't see a reason why we wouldn't be able
to analyze this?

Regards
 Robin
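
To make the mode argument above concrete, here is a minimal standalone sketch of the "punt" check. It is only an illustration under stated assumptions: `Mode`, `Constant` and `can_analyze_step` are hypothetical stand-ins, not GCC's RTL types or API. The real check is the `CONST_POLY_INT_P (op1) && GET_MODE (op1) != outer_mode` test in the diff.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for machine modes; VOID models a constant
// that carries no mode of its own (as a plain const_int does).
enum class Mode { VOID, HI, SI, DI };

// Hypothetical stand-in for the constant operand of a PLUS.
struct Constant {
  bool is_poly;    // true for a const_poly_int-like constant
  Mode mode;       // VOID for a plain integer constant
  int64_t coeff0;  // constant term
  int64_t coeff1;  // runtime-scaled term (0 for a plain constant)
};

// Returns false ("punt") when a poly constant's own mode differs from
// the mode the step analysis runs in; modeless constants pass through.
bool can_analyze_step(const Constant &op1, Mode outer_mode) {
  if (op1.is_poly && op1.mode != outer_mode)
    return false;
  return true;
}
```

With these stand-ins, a poly constant in SI mode is rejected when the analysis runs in DI mode, while a modeless constant is accepted in any mode. Rejecting only costs a missed analysis, which matches the "relatively safe" reasoning in the message.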
Re: Re: [PATCH] RISC-V: Allow LICM hoist POLY_INT configuration code sequence
y value)). --> outer mode bigger than inner mode in dest operand.

We never have (subreg: (poly_value)), so we won't have an ICE.

However, I don't think our previous approach is correct. Actually, I believe we should apply the following, which should be better:

(set (reg:SI) (subreg:SI (DI: poly value)))

but it causes the ICE that I mentioned above.

Also, I tried the following, which can fix this issue:

diff --git a/gcc/loop-iv.cc b/gcc/loop-iv.cc
index eb7e923a38b..09750951845 100644
--- a/gcc/loop-iv.cc
+++ b/gcc/loop-iv.cc
@@ -646,10 +646,10 @@ get_biv_step_1 (df_ref def, scalar_int_mode outer_mode, rtx reg,
   if (!set)
     return false;
-  rhs = find_reg_equal_equiv_note (insn);
-  if (rhs)
-    rhs = XEXP (rhs, 0);
-  else
+  //rhs = find_reg_equal_equiv_note (insn);
+  //if (rhs)
+  //  rhs = XEXP (rhs, 0);
+  //else
     rhs = SET_SRC (set);

Any thoughts?

juzhe.zh...@rivai.ai

From: Kito Cheng
Date: 2024-02-02 16:50
To: Juzhe-Zhong
CC: gcc-patches; kito.cheng; jeffreyalaw; rdapp.gcc
Subject: Re: [PATCH] RISC-V: Allow LICM hoist POLY_INT configuration code sequence

LGTM :)

On Thu, Feb 1, 2024 at 11:46 PM Juzhe-Zhong wrote:
> [...]
Re: [PATCH] RISC-V: Allow LICM hoist POLY_INT configuration code sequence
LGTM :)

On Thu, Feb 1, 2024 at 11:46 PM Juzhe-Zhong wrote:
> [...]
[PATCH] RISC-V: Allow LICM hoist POLY_INT configuration code sequence
Realized in a recent benchmark evaluation (coremark-pro zip-test):

        vid.v   v2
        vmv.v.i v5,0
.L9:
        vle16.v v3,0(a4)
        vrsub.vx        v4,v2,a6   ---> LICM failed to hoist it outside the loop.

The root cause is:

(insn 56 47 57 4 (set (subreg:DI (reg:HI 220) 0)
        (reg:DI 223)) "rvv.c":11:9 208 {*movdi_64bit}  -> Its result is used by the following vrsub.vx, which suppresses the hoist of the vrsub.vx
     (nil))

(insn 57 56 59 4 (set (reg:RVVMF2HI 216)
        (if_then_else:RVVMF2HI (unspec:RVVMF32BI [
                    (const_vector:RVVMF32BI repeat [
                            (const_int 1 [0x1])
                        ])
                    (reg:DI 350)
                    (const_int 2 [0x2]) repeated x2
                    (const_int 1 [0x1])
                    (reg:SI 66 vl)
                    (reg:SI 67 vtype)
                ] UNSPEC_VPREDICATE)
            (minus:RVVMF2HI (vec_duplicate:RVVMF2HI (reg:HI 220))
                (reg:RVVMF2HI 217))
            (unspec:RVVMF2HI [
                    (reg:DI 0 zero)
                ] UNSPEC_VUNDEF))) "rvv.c":11:9 6938 {pred_subrvvmf2hi_reverse_scalar}
     (expr_list:REG_DEAD (reg:HI 220)
        (nil)))

This patch fixes it by generating (set (reg:HI) (subreg:HI (reg:DI))) instead of
(set (subreg:DI (reg:HI)) (reg:DI)).

After this patch:

        vid.v   v2
        vrsub.vx        v2,v2,a7
        vmv.v.i v4,0
.L3:
        vle16.v v3,0(a4)

Tested on both RV32 and RV64, no regression.

gcc/ChangeLog:

        * config/riscv/riscv.cc (riscv_legitimize_move): Fix poly_int dest generation.

gcc/testsuite/ChangeLog:

        * gcc.target/riscv/rvv/autovec/poly_licm-1.c: New test.
        * gcc.target/riscv/rvv/autovec/poly_licm-2.c: New test.

---
 gcc/config/riscv/riscv.cc                     |  9 ---
 .../riscv/rvv/autovec/poly_licm-1.c           | 18 +
 .../riscv/rvv/autovec/poly_licm-2.c           | 27 +++
 3 files changed, 50 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/poly_licm-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/poly_licm-2.c

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 529ef5e84b7..6e22b43e618 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -2711,16 +2711,17 @@ riscv_legitimize_move (machine_mode mode, rtx dest, rtx src)
                                     (const_poly_int:HI [m, n])
                                     (const_poly_int:SI [m, n]).  */
           rtx tmp = gen_reg_rtx (Pmode);
-          riscv_legitimize_poly_move (Pmode, gen_lowpart (Pmode, dest), tmp,
-                                      src);
+          rtx tmp2 = gen_reg_rtx (Pmode);
+          riscv_legitimize_poly_move (Pmode, tmp2, tmp, src);
+          emit_move_insn (dest, gen_lowpart (mode, tmp2));
         }
       else
         {
           /* In RV32 system, handle (const_poly_int:SI [m, n])
                                     (const_poly_int:DI [m, n]).
              In RV64 system, handle (const_poly_int:DI [m, n]).
-             FIXME: Maybe we could gen SImode in RV32 and then sign-extend to DImode,
-             the offset should not exceed 4GiB in general.  */
+             FIXME: Maybe we could gen SImode in RV32 and then sign-extend to
+             DImode, the offset should not exceed 4GiB in general.  */
           rtx tmp = gen_reg_rtx (mode);
           riscv_legitimize_poly_move (mode, dest, tmp, src);
         }
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/poly_licm-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/poly_licm-1.c
new file mode 100644
index 000..b7da65f0996
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/poly_licm-1.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -fno-schedule-insns -fno-schedule-insns2" } */
+
+extern int wsize;
+
+typedef unsigned short Posf;
+#define NIL 0
+
+void foo (Posf *p)
+{
+  register unsigned n, m;
+  do {
+      m = *--p;
+      *p = (Posf)(m >= wsize ? m-wsize : NIL);
+  } while (--n);
+}
+
+/* { dg-final { scan-assembler-times {vid\.v\s+v[0-9]+\s+addi\s+\s*[a-x0-9]+,\s*[a-x0-9]+,\s*-1\s+vrsub\.vx\s+} 1 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/poly_licm-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/poly_licm-2.c
new file mode 100644
index 000..ffb3c63149f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/poly_licm-2.c
@@ -0,0 +1,27 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -fno-schedule-insns -fno-schedule-insns2" } */
+
+typedef unsigned short uint16_t;
+
+void AAA (uint16_t *x, uint16_t *y, unsigned wsize, unsigned count)
+{
+  unsigned m = 0, n = count;
+  register uint16_t *p;
+
+  p = x;
+
+  do {
+    m = *--p;
+    *p = (uint16_t)(m >= wsize ? m-wsize : 0);
+  } while (--n);
+
+  n = wsize;
+  p = y;
+
+  do {
+      m = *--p;
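
The riscv.cc hunk in this patch computes the poly value into a fresh full-width register and then moves only its low part into the narrow destination. The following is a rough standalone model of that two-step idea, assuming a poly value of the form coeff0 + coeff1 * x with x known only at run time. `PolyValue`, `eval_wide` and `lowpart_hi` are hypothetical names for illustration and are not part of GCC.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of a two-coefficient poly value: coeff0 + coeff1 * x,
// where x is a quantity known only at run time.
struct PolyValue {
  int64_t coeff0;
  int64_t coeff1;
};

// Step 1: evaluate in the full 64-bit width (the role the fresh
// Pmode temporary plays on RV64). Unsigned arithmetic wraps by definition.
inline uint64_t eval_wide(PolyValue p, uint64_t x) {
  return static_cast<uint64_t>(p.coeff0) + static_cast<uint64_t>(p.coeff1) * x;
}

// Step 2: keep only the low 16 bits, modelling the lowpart move
// into a 16-bit (HI-like) destination.
inline uint16_t lowpart_hi(uint64_t wide) {
  return static_cast<uint16_t>(wide);
}
```

In this model the destination is only ever written as a whole narrow value, which corresponds to the (set (reg:HI) (subreg:HI (reg:DI))) form the patch description says it now generates.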