Re: [PATCH] RISC-V: Allow LICM hoist POLY_INT configuration code sequence

2024-03-18 Thread Jeff Law




On 2/6/24 6:14 AM, Robin Dapp wrote:

>> The root cause is this following RTL pattern, after fwprop1:
>>
>> (insn 82 78 84 9 (set (reg:DI 230)
>>         (sign_extend:DI (minus:SI (subreg/s/v:SI (reg:DI 150 [ niters.10 ]) 0)
>>                 (subreg:SI (reg:DI 221) 0 13 {subsi3_extended}
>>      (expr_list:REG_EQUAL (sign_extend:DI (plus:SI (subreg/s/v:SI (reg:DI 150
>> [ niters.10 ]) 0)
>>                 *(const_poly_int:SI [-16, -16])*))
>>         (nil)))
>>
>> The highlight *(const_poly_int:SI [-16, -16])*
>> causes ICE.
>>
>> This RTL is because:
>> (insn 69 68 71 8 (set (reg:DI 221)
>>         (const_poly_int:DI [16, 16])) 208 {*movdi_64bit}
>>      (nil))
>> (insn 82 78 84 9 (set (reg:DI 230)
>>         (sign_extend:DI (minus:SI (subreg/s/v:SI (reg:DI 150 [ niters.10 ]) 0)
>>                 (subreg:SI (reg:DI 221) 0 13 {subsi3_extended}
>>                   > (subreg:SI (const_poly_int:SI [-16, -16]))
>> fwprop1 add  (const_poly_int:SI [-16, -16]) reg_equal
>>      (expr_list:REG_EQUAL (sign_extend:DI (plus:SI (subreg/s/v:SI (reg:DI 150
>> [ niters.10 ]) 0)
>>                 (const_poly_int:SI [-16, -16])))
>>         (nil)))


> I'm seeing a slightly different pattern but that doesn't change
> the problem.
>
>> (set (reg:SI)  (subreg:SI (DI: poly value))) but it causes ICE that I
>> mentioned above.
>
> That's indeed a bit more idiomatic and I wouldn't oppose that.
>
> The problem causing the ICE is that we want to simplify a PLUS
> with (const_poly_int:SI [16, 16]) and (const_int 0) but the mode
> is DImode.  My suspicion is that this is caused by our
> addsi3_extended pattern and we fail to deduce the proper mode
> for analysis.

Certainly possible.  It didn't even occur to me that a POLY_INT would
slip through here.




> I'm just speculating but maybe that's because we assert that a
> plus is of the form simple_reg_p (op0) && CONSTANT_P (op1).
> Usually, constants don't have a mode and can just be used.
> poly_int_csts do have one and need to be explicitly converted
> (kind of).
>
> We can only analyze this sign-extended plus at all since Jeff
> added the addsi3_extended handling for loop-iv.   Maybe we could
> punt like
>
> diff --git a/gcc/loop-iv.cc b/gcc/loop-iv.cc
> index eb7e923a38b..796413c25a3 100644
> --- a/gcc/loop-iv.cc
> +++ b/gcc/loop-iv.cc
> @@ -714,6 +714,9 @@ get_biv_step_1 (df_ref def, scalar_int_mode outer_mode, rtx
> reg,
>    if (!simple_reg_p (op0) || !CONSTANT_P (op1))
>  return false;
>
> + if (CONST_POLY_INT_P (op1) && GET_MODE (op1) != outer_mode)
> +   return false;
> +
>
> This helps for your test case but I haven't done any further
> testing.  I'd think this is relatively safe because it's only
> a missed analysis/optimization in the worst case.
> Still, generally, I don't see a reason why we wouldn't be able
> to analyze this?

I don't think it would significantly hurt anything.  IIRC that bit of code was
to fix a minor regression caused by the backend changes.


I would ACK that patch given the usual testing cycle.

jeff



Re: Re: [PATCH] RISC-V: Allow LICM hoist POLY_INT configuration code sequence

2024-02-17 Thread juzhe.zh...@rivai.ai
Hi, Robin. Could you continue looking into this LICM issue?
I am not sure whether my fix is correct; perhaps you can find another way to
make LICM work?



juzhe.zh...@rivai.ai
 
From: Robin Dapp
Date: 2024-02-06 21:14
To: juzhe.zh...@rivai.ai; kito.cheng
CC: rdapp.gcc; gcc-patches; Kito.cheng; jeffreyalaw
Subject: Re: [PATCH] RISC-V: Allow LICM hoist POLY_INT configuration code 
sequence
> The root cause is this following RTL pattern, after fwprop1:
> 
> (insn 82 78 84 9 (set (reg:DI 230)
> (sign_extend:DI (minus:SI (subreg/s/v:SI (reg:DI 150 [ niters.10 ]) 0)
> (subreg:SI (reg:DI 221) 0 13 {subsi3_extended}
>  (expr_list:REG_EQUAL (sign_extend:DI (plus:SI (subreg/s/v:SI (reg:DI 150 
> [ niters.10 ]) 0)
> *(const_poly_int:SI [-16, -16])*))
> (nil)))
> 
> The highlight *(const_poly_int:SI [-16, -16])*
> causes ICE.
> 
> This RTL is because:
> (insn 69 68 71 8 (set (reg:DI 221)
> (const_poly_int:DI [16, 16])) 208 {*movdi_64bit}
>  (nil))
> (insn 82 78 84 9 (set (reg:DI 230)
> (sign_extend:DI (minus:SI (subreg/s/v:SI (reg:DI 150 [ niters.10 ]) 0)
> (subreg:SI (reg:DI 221) 0 13 {subsi3_extended}
>   > (subreg:SI (const_poly_int:SI [-16, 
> -16])) fwprop1 add  (const_poly_int:SI [-16, -16]) reg_equal
>  (expr_list:REG_EQUAL (sign_extend:DI (plus:SI (subreg/s/v:SI (reg:DI 150 
> [ niters.10 ]) 0)
> (const_poly_int:SI [-16, -16])))
> (nil)))
 
I'm seeing a slightly different pattern but that doesn't change
the problem.
 
> (set (reg:SI)  (subreg:SI (DI: poly value))) but it causes ICE that I
> mentioned above.
 
That's indeed a bit more idiomatic and I wouldn't oppose that.
 
The problem causing the ICE is that we want to simplify a PLUS
with (const_poly_int:SI [16, 16]) and (const_int 0) but the mode
is DImode.  My suspicion is that this is caused by our
addsi3_extended pattern and we fail to deduce the proper mode
for analysis.
 
I'm just speculating but maybe that's because we assert that a
plus is of the form simple_reg_p (op0) && CONSTANT_P (op1).
Usually, constants don't have a mode and can just be used.
poly_int_csts do have one and need to be explicitly converted
(kind of).
 
We can only analyze this sign-extended plus at all since Jeff
added the addsi3_extended handling for loop-iv.   Maybe we could
punt like
 
diff --git a/gcc/loop-iv.cc b/gcc/loop-iv.cc
index eb7e923a38b..796413c25a3 100644
--- a/gcc/loop-iv.cc
+++ b/gcc/loop-iv.cc
@@ -714,6 +714,9 @@ get_biv_step_1 (df_ref def, scalar_int_mode outer_mode, rtx 
reg,
  if (!simple_reg_p (op0) || !CONSTANT_P (op1))
return false;
+ if (CONST_POLY_INT_P (op1) && GET_MODE (op1) != outer_mode)
+   return false;
+
 
This helps for your test case but I haven't done any further
testing.  I'd think this is relatively safe because it's only
a missed analysis/optimization in the worst case.
Still, generally, I don't see a reason why we wouldn't be able
to analyze this?
 
Regards
Robin
 
 


Re: [PATCH] RISC-V: Allow LICM hoist POLY_INT configuration code sequence

2024-02-06 Thread Robin Dapp
> The root cause is this following RTL pattern, after fwprop1:
> 
> (insn 82 78 84 9 (set (reg:DI 230)
>         (sign_extend:DI (minus:SI (subreg/s/v:SI (reg:DI 150 [ niters.10 ]) 0)
>                 (subreg:SI (reg:DI 221) 0 13 {subsi3_extended}
>      (expr_list:REG_EQUAL (sign_extend:DI (plus:SI (subreg/s/v:SI (reg:DI 150 
> [ niters.10 ]) 0)
>                 *(const_poly_int:SI [-16, -16])*))
>         (nil)))
> 
> The highlight *(const_poly_int:SI [-16, -16])*
> causes ICE.
> 
> This RTL is because:
> (insn 69 68 71 8 (set (reg:DI 221)
>         (const_poly_int:DI [16, 16])) 208 {*movdi_64bit}
>      (nil))
> (insn 82 78 84 9 (set (reg:DI 230)
>         (sign_extend:DI (minus:SI (subreg/s/v:SI (reg:DI 150 [ niters.10 ]) 0)
>                 (subreg:SI (reg:DI 221) 0 13 {subsi3_extended}            
>                               > (subreg:SI (const_poly_int:SI [-16, 
> -16])) fwprop1 add  (const_poly_int:SI [-16, -16]) reg_equal
>      (expr_list:REG_EQUAL (sign_extend:DI (plus:SI (subreg/s/v:SI (reg:DI 150 
> [ niters.10 ]) 0)
>                 (const_poly_int:SI [-16, -16])))
>         (nil)))

I'm seeing a slightly different pattern but that doesn't change
the problem.
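
For readers less familiar with the notation, here is a minimal sketch of what
the REG_EQUAL note encodes. It assumes the usual reading of a two-coefficient
const_poly_int [c0, c1] as c0 + c1 * x for an indeterminate x that is only
known at run time. The names poly2, poly2_eval and poly2_neg are hypothetical
and are not GCC APIs; the sketch only illustrates why a minus of [16, 16] can
be recorded as a plus of [-16, -16].

```c
#include <stdint.h>

/* Toy stand-in for a two-coefficient poly_int: value = c0 + c1 * x,
   where x is only known at run time.  Not GCC's poly_int class.  */
struct poly2
{
  int64_t c0;
  int64_t c1;
};

/* Evaluate the polynomial for one concrete runtime x.  */
int64_t poly2_eval (struct poly2 p, int64_t x)
{
  return p.c0 + p.c1 * x;
}

/* Negate coefficient-wise: -(c0 + c1 * x) == (-c0) + (-c1) * x.  */
struct poly2 poly2_neg (struct poly2 p)
{
  struct poly2 r = { -p.c0, -p.c1 };
  return r;
}
```

For any x, niters - poly2_eval (p, x) equals
niters + poly2_eval (poly2_neg (p), x), which is the rewrite visible in the
note.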

> (set (reg:SI)  (subreg:SI (DI: poly value))) but it causes ICE that I
> mentioned above.

That's indeed a bit more idiomatic and I wouldn't oppose that.

The problem causing the ICE is that we want to simplify a PLUS
with (const_poly_int:SI [16, 16]) and (const_int 0) but the mode
is DImode.  My suspicion is that this is caused by our
addsi3_extended pattern and we fail to deduce the proper mode
for analysis.

I'm just speculating but maybe that's because we assert that a
plus is of the form simple_reg_p (op0) && CONSTANT_P (op1).
Usually, constants don't have a mode and can just be used.
poly_int_csts do have one and need to be explicitly converted
(kind of).

We can only analyze this sign-extended plus at all since Jeff
added the addsi3_extended handling for loop-iv.   Maybe we could
punt like

diff --git a/gcc/loop-iv.cc b/gcc/loop-iv.cc
index eb7e923a38b..796413c25a3 100644
--- a/gcc/loop-iv.cc
+++ b/gcc/loop-iv.cc
@@ -714,6 +714,9 @@ get_biv_step_1 (df_ref def, scalar_int_mode outer_mode, rtx 
reg,
  if (!simple_reg_p (op0) || !CONSTANT_P (op1))
return false;
 
+ if (CONST_POLY_INT_P (op1) && GET_MODE (op1) != outer_mode)
+   return false;
+

This helps for your test case but I haven't done any further
testing.  I'd think this is relatively safe because it's only
a missed analysis/optimization in the worst case.
Still, generally, I don't see a reason why we wouldn't be able
to analyze this?
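
A minimal sketch of that punt on a toy model, assuming constants are either
modeless (like const_int, which is VOIDmode in RTL) or carry their own mode
(like const_poly_int). The enum, struct and function names are hypothetical
and this is not the loop-iv.cc code; it only shows that a moded constant whose
mode differs from outer_mode makes the analysis give up instead of simplifying
in the wrong mode.

```c
#include <stdbool.h>

/* Toy machine modes; TOY_VOID models a modeless constant.  */
enum toy_mode { TOY_VOID, TOY_SI, TOY_DI };

/* Toy constant operand: is_poly marks a const_poly_int-like value.  */
struct toy_const
{
  bool is_poly;
  enum toy_mode mode;
};

/* Return true if the step constant may be analyzed in OUTER_MODE.
   Mirrors the proposed check: punt on a poly constant whose own mode
   differs from the mode the analysis is running in.  */
bool toy_step_ok (struct toy_const op1, enum toy_mode outer_mode)
{
  if (op1.is_poly && op1.mode != outer_mode)
    return false;
  return true;
}
```

In this model an SImode poly constant analyzed in DImode is rejected, while a
modeless constant or a DImode poly constant is still accepted.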

Regards
 Robin



Re: Re: [PATCH] RISC-V: Allow LICM hoist POLY_INT configuration code sequence

2024-02-03 Thread juzhe.zh...@rivai.ai
y value)). --> outer mode bigger than inner 
mode in dest operand.

We never have (subreg: (poly_value)) there, so we don't get the ICE. However,
I don't think our previous approach is correct.

Actually, I believe we should generate the following, which should be better:

 (set (reg:SI)  (subreg:SI (DI: poly value))) but it causes ICE that I 
mentioned above.

I also tried the following, which fixes this issue:

diff --git a/gcc/loop-iv.cc b/gcc/loop-iv.cc
index eb7e923a38b..09750951845 100644
--- a/gcc/loop-iv.cc
+++ b/gcc/loop-iv.cc
@@ -646,10 +646,10 @@ get_biv_step_1 (df_ref def, scalar_int_mode outer_mode, 
rtx reg,
   if (!set)
 return false;

-  rhs = find_reg_equal_equiv_note (insn);
-  if (rhs)
-rhs = XEXP (rhs, 0);
-  else
+  //rhs = find_reg_equal_equiv_note (insn);
+  //if (rhs)
+  //  rhs = XEXP (rhs, 0);
+  //else
 rhs = SET_SRC (set);

Any thoughts ?
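
To make the effect of that change concrete, here is a toy sketch, assuming an
insn is modeled as a source expression plus an optional REG_EQUAL-style note.
The toy_insn struct and toy_pick_rhs are hypothetical names, not GCC code.
With the note lookup disabled, the analysis always sees the SET_SRC and never
the note carrying the SImode poly constant, presumably at the cost of ignoring
notes that would help in other cases.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy insn: what it computes, plus an optional equivalent form.  */
struct toy_insn
{
  const char *set_src;     /* Expression the insn actually computes.  */
  const char *equal_note;  /* Equivalent form from a note, or NULL.  */
};

/* Pick the expression to analyze.  USE_NOTE == true models the current
   code (prefer the note); false models the commented-out variant.  */
const char *toy_pick_rhs (const struct toy_insn *insn, bool use_note)
{
  if (use_note && insn->equal_note != NULL)
    return insn->equal_note;
  return insn->set_src;
}
```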





juzhe.zh...@rivai.ai
 
From: Kito Cheng
Date: 2024-02-02 16:50
To: Juzhe-Zhong
CC: gcc-patches; kito.cheng; jeffreyalaw; rdapp.gcc
Subject: Re: [PATCH] RISC-V: Allow LICM hoist POLY_INT configuration code 
sequence
LGTM :)
 
On Thu, Feb 1, 2024 at 11:46 PM Juzhe-Zhong  wrote:
>
> Noticed in a recent benchmark evaluation (coremark-pro zip-test):
>
> vid.v   v2
> vmv.v.i v5,0
> .L9:
> vle16.v v3,0(a4)
> vrsub.vx v4,v2,a6   ---> LICM failed to hoist it outside the loop.
>
> The root cause is:
>
> (insn 56 47 57 4 (set (subreg:DI (reg:HI 220) 0)
> (reg:DI 223)) "rvv.c":11:9 208 {*movdi_64bit}  -> Its result is used by
> the following vrsub.vx, which suppresses the hoisting of that vrsub.vx
>  (nil))
>
> (insn 57 56 59 4 (set (reg:RVVMF2HI 216)
> (if_then_else:RVVMF2HI (unspec:RVVMF32BI [
> (const_vector:RVVMF32BI repeat [
> (const_int 1 [0x1])
> ])
> (reg:DI 350)
> (const_int 2 [0x2]) repeated x2
> (const_int 1 [0x1])
> (reg:SI 66 vl)
> (reg:SI 67 vtype)
> ] UNSPEC_VPREDICATE)
> (minus:RVVMF2HI (vec_duplicate:RVVMF2HI (reg:HI 220))
> (reg:RVVMF2HI 217))
> (unspec:RVVMF2HI [
> (reg:DI 0 zero)
> ] UNSPEC_VUNDEF))) "rvv.c":11:9 6938 
> {pred_subrvvmf2hi_reverse_scalar}
>  (expr_list:REG_DEAD (reg:HI 220)
> (nil)))
>
> This patch fixes it by generating (set (reg:HI) (subreg:HI (reg:DI)))
> instead of (set (subreg:DI (reg:HI)) (reg:DI)).
>
> After this patch:
>
> vid.v   v2
> vrsub.vx v2,v2,a7
> vmv.v.i v4,0
> .L3:
> vle16.v v3,0(a4)
>
> Tested on both RV32 and RV64 with no regressions.
>
> gcc/ChangeLog:
>
> * config/riscv/riscv.cc (riscv_legitimize_move): Fix poly_int dest 
> generation.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/rvv/autovec/poly_licm-1.c: New test.
> * gcc.target/riscv/rvv/autovec/poly_licm-2.c: New test.
>
> ---
>  gcc/config/riscv/riscv.cc |  9 ---
>  .../riscv/rvv/autovec/poly_licm-1.c   | 18 +
>  .../riscv/rvv/autovec/poly_licm-2.c   | 27 +++
>  3 files changed, 50 insertions(+), 4 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/poly_licm-1.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/poly_licm-2.c
>
> diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
> index 529ef5e84b7..6e22b43e618 100644
> --- a/gcc/config/riscv/riscv.cc
> +++ b/gcc/config/riscv/riscv.cc
> @@ -2711,16 +2711,17 @@ riscv_legitimize_move (machine_mode mode, rtx dest, 
> rtx src)
> (const_poly_int:HI [m, n])
> (const_poly_int:SI [m, n]).  */
>   rtx tmp = gen_reg_rtx (Pmode);
> - riscv_legitimize_poly_move (Pmode, gen_lowpart (Pmode, dest), tmp,
> - src);
> + rtx tmp2 = gen_reg_rtx (Pmode);
> + riscv_legitimize_poly_move (Pmode, tmp2, tmp, src);
> + emit_move_insn (dest, gen_lowpart (mode, tmp2));
> }
>else
> {
>   /* In RV32 system, handle (const_poly_int:SI [m, n])
> (const_poly_int:DI [m, n]).
>  In RV64 system, handle (const_poly_int:DI [m, n]).
> -   FIXME: Maybe we could gen SImode in RV32 and then sign-extend to 
> DImode,
> -   the offset should not exceed 4GiB in general.  */
> +FIXME: Maybe we could gen SImode in RV32 and then sign-extend to
> +DImode, the offset should 

Re: [PATCH] RISC-V: Allow LICM hoist POLY_INT configuration code sequence

2024-02-02 Thread Kito Cheng
LGTM :)

On Thu, Feb 1, 2024 at 11:46 PM Juzhe-Zhong  wrote:
>
> Noticed in a recent benchmark evaluation (coremark-pro zip-test):
>
> vid.v   v2
> vmv.v.i v5,0
> .L9:
> vle16.v v3,0(a4)
> vrsub.vx v4,v2,a6   ---> LICM failed to hoist it outside the loop.
>
> The root cause is:
>
> (insn 56 47 57 4 (set (subreg:DI (reg:HI 220) 0)
> (reg:DI 223)) "rvv.c":11:9 208 {*movdi_64bit}  -> Its result is used by
> the following vrsub.vx, which suppresses the hoisting of that vrsub.vx
>  (nil))
>
> (insn 57 56 59 4 (set (reg:RVVMF2HI 216)
> (if_then_else:RVVMF2HI (unspec:RVVMF32BI [
> (const_vector:RVVMF32BI repeat [
> (const_int 1 [0x1])
> ])
> (reg:DI 350)
> (const_int 2 [0x2]) repeated x2
> (const_int 1 [0x1])
> (reg:SI 66 vl)
> (reg:SI 67 vtype)
> ] UNSPEC_VPREDICATE)
> (minus:RVVMF2HI (vec_duplicate:RVVMF2HI (reg:HI 220))
> (reg:RVVMF2HI 217))
> (unspec:RVVMF2HI [
> (reg:DI 0 zero)
> ] UNSPEC_VUNDEF))) "rvv.c":11:9 6938 
> {pred_subrvvmf2hi_reverse_scalar}
>  (expr_list:REG_DEAD (reg:HI 220)
> (nil)))
>
> This patch fixes it by generating (set (reg:HI) (subreg:HI (reg:DI)))
> instead of (set (subreg:DI (reg:HI)) (reg:DI)).
>
> After this patch:
>
> vid.v   v2
> vrsub.vx v2,v2,a7
> vmv.v.i v4,0
> .L3:
> vle16.v v3,0(a4)
>
> Tested on both RV32 and RV64 with no regressions.
>
> gcc/ChangeLog:
>
> * config/riscv/riscv.cc (riscv_legitimize_move): Fix poly_int dest 
> generation.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/rvv/autovec/poly_licm-1.c: New test.
> * gcc.target/riscv/rvv/autovec/poly_licm-2.c: New test.
>
> ---
>  gcc/config/riscv/riscv.cc |  9 ---
>  .../riscv/rvv/autovec/poly_licm-1.c   | 18 +
>  .../riscv/rvv/autovec/poly_licm-2.c   | 27 +++
>  3 files changed, 50 insertions(+), 4 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/poly_licm-1.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/poly_licm-2.c
>
> diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
> index 529ef5e84b7..6e22b43e618 100644
> --- a/gcc/config/riscv/riscv.cc
> +++ b/gcc/config/riscv/riscv.cc
> @@ -2711,16 +2711,17 @@ riscv_legitimize_move (machine_mode mode, rtx dest, 
> rtx src)
> (const_poly_int:HI [m, n])
> (const_poly_int:SI [m, n]).  */
>   rtx tmp = gen_reg_rtx (Pmode);
> - riscv_legitimize_poly_move (Pmode, gen_lowpart (Pmode, dest), tmp,
> - src);
> + rtx tmp2 = gen_reg_rtx (Pmode);
> + riscv_legitimize_poly_move (Pmode, tmp2, tmp, src);
> + emit_move_insn (dest, gen_lowpart (mode, tmp2));
> }
>else
> {
>   /* In RV32 system, handle (const_poly_int:SI [m, n])
> (const_poly_int:DI [m, n]).
>  In RV64 system, handle (const_poly_int:DI [m, n]).
> -   FIXME: Maybe we could gen SImode in RV32 and then sign-extend to 
> DImode,
> -   the offset should not exceed 4GiB in general.  */
> +FIXME: Maybe we could gen SImode in RV32 and then sign-extend to
> +DImode, the offset should not exceed 4GiB in general.  */
>   rtx tmp = gen_reg_rtx (mode);
>   riscv_legitimize_poly_move (mode, dest, tmp, src);
> }
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/poly_licm-1.c 
> b/gcc/testsuite/gcc.target/riscv/rvv/autovec/poly_licm-1.c
> new file mode 100644
> index 000..b7da65f0996
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/poly_licm-1.c
> @@ -0,0 +1,18 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -fno-schedule-insns 
> -fno-schedule-insns2" } */
> +
> +extern int wsize;
> +
> +typedef unsigned short Posf;
> +#define NIL 0
> +
> +void foo (Posf *p)
> +{
> +  register unsigned n, m;
> +  do {
> +  m = *--p;
> +  *p = (Posf)(m >= wsize ? m-wsize : NIL);
> +  } while (--n);
> +}
> +
> +/* { dg-final { scan-assembler-times 
> {vid\.v\s+v[0-9]+\s+addi\s+\s*[a-x0-9]+,\s*[a-x0-9]+,\s*-1\s+vrsub\.vx\s+} 1 
> } } */
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/poly_licm-2.c 
> b/gcc/testsuite/gcc.target/riscv/rvv/autovec/poly_licm-2.c
> new file mode 100644
> index 000..ffb3c63149f
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/poly_licm-2.c
> @@ -0,0 +1,27 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -fno-schedule-insns 
> -fno-schedule-insns2" } */
> +
> +typedef unsigned short uint16_t;
> +
> +void AAA

[PATCH] RISC-V: Allow LICM hoist POLY_INT configuration code sequence

2024-02-01 Thread Juzhe-Zhong
Noticed in a recent benchmark evaluation (coremark-pro zip-test):

vid.v   v2
vmv.v.i v5,0
.L9:
vle16.v v3,0(a4)
vrsub.vx v4,v2,a6   ---> LICM failed to hoist it outside the loop.

The root cause is:

(insn 56 47 57 4 (set (subreg:DI (reg:HI 220) 0)
(reg:DI 223)) "rvv.c":11:9 208 {*movdi_64bit}  -> Its result is used by
the following vrsub.vx, which suppresses the hoisting of that vrsub.vx
 (nil))  

(insn 57 56 59 4 (set (reg:RVVMF2HI 216)
(if_then_else:RVVMF2HI (unspec:RVVMF32BI [
(const_vector:RVVMF32BI repeat [
(const_int 1 [0x1])
])
(reg:DI 350)
(const_int 2 [0x2]) repeated x2
(const_int 1 [0x1])
(reg:SI 66 vl)
(reg:SI 67 vtype)
] UNSPEC_VPREDICATE)
(minus:RVVMF2HI (vec_duplicate:RVVMF2HI (reg:HI 220))
(reg:RVVMF2HI 217))
(unspec:RVVMF2HI [
(reg:DI 0 zero)
] UNSPEC_VUNDEF))) "rvv.c":11:9 6938 
{pred_subrvvmf2hi_reverse_scalar}
 (expr_list:REG_DEAD (reg:HI 220)
(nil)))

This patch fixes it by generating (set (reg:HI) (subreg:HI (reg:DI)))
instead of (set (subreg:DI (reg:HI)) (reg:DI)).
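
A rough C analogy of the new sequence, assuming a 64-bit Pmode and an HImode
destination. The function names and the stand-in computation are hypothetical;
in the patch the value comes from riscv_legitimize_poly_move. The poly value
is computed into a fresh full-width temporary, and the narrow destination then
receives its low part as an ordinary full definition, rather than being
written through a wider subreg of itself.

```c
#include <stdint.h>

/* Stand-in for the full-width poly computation (c0 + c1 * x).  */
uint64_t toy_poly_value (uint64_t c0, uint64_t c1, uint64_t x)
{
  return c0 + c1 * x;
}

/* New shape: tmp2 = poly value in full width; dest = lowpart (tmp2).  */
uint16_t toy_move_poly_to_hi (uint64_t c0, uint64_t c1, uint64_t x)
{
  uint64_t tmp2 = toy_poly_value (c0, c1, x);
  return (uint16_t) tmp2;   /* Keep only the low 16 bits.  */
}
```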

After this patch:

vid.v   v2
vrsub.vx v2,v2,a7
vmv.v.i v4,0
.L3:
vle16.v v3,0(a4)

Tested on both RV32 and RV64 with no regressions.
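
A scalar C analogy of the hoist, for illustration only. It assumes the
per-element semantics vid.v: v[i] = i and vrsub.vx vd,vs2,rs1:
vd[i] = rs1 - vs2[i]; the function names and the fixed lane count are
hypothetical. Both versions compute the same result, but the second evaluates
the loop-invariant reverse-subtract once before the loop, which corresponds to
vrsub.vx sitting above .L3 in the output after the patch.

```c
#include <stddef.h>
#include <stdint.h>

#define LANES 4   /* Hypothetical fixed lane count for the sketch.  */

/* Invariant recomputed on every iteration (shape before the patch).  */
uint32_t sum_unhoisted (const uint16_t *in, size_t iters, uint16_t a)
{
  uint32_t sum = 0;
  for (size_t k = 0; k < iters; k++)
    for (size_t i = 0; i < LANES; i++)
      {
        uint16_t rsub = (uint16_t) (a - i);   /* vrsub.vx on vid.v  */
        sum += in[k * LANES + i] + rsub;
      }
  return sum;
}

/* Invariant computed once, outside the loop (shape after the patch).  */
uint32_t sum_hoisted (const uint16_t *in, size_t iters, uint16_t a)
{
  uint16_t rsub[LANES];
  for (size_t i = 0; i < LANES; i++)
    rsub[i] = (uint16_t) (a - i);

  uint32_t sum = 0;
  for (size_t k = 0; k < iters; k++)
    for (size_t i = 0; i < LANES; i++)
      sum += in[k * LANES + i] + rsub[i];
  return sum;
}
```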

gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_legitimize_move): Fix poly_int dest 
generation.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/poly_licm-1.c: New test.
* gcc.target/riscv/rvv/autovec/poly_licm-2.c: New test.

---
 gcc/config/riscv/riscv.cc |  9 ---
 .../riscv/rvv/autovec/poly_licm-1.c   | 18 +
 .../riscv/rvv/autovec/poly_licm-2.c   | 27 +++
 3 files changed, 50 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/poly_licm-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/poly_licm-2.c

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 529ef5e84b7..6e22b43e618 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -2711,16 +2711,17 @@ riscv_legitimize_move (machine_mode mode, rtx dest, rtx 
src)
(const_poly_int:HI [m, n])
(const_poly_int:SI [m, n]).  */
  rtx tmp = gen_reg_rtx (Pmode);
- riscv_legitimize_poly_move (Pmode, gen_lowpart (Pmode, dest), tmp,
- src);
+ rtx tmp2 = gen_reg_rtx (Pmode);
+ riscv_legitimize_poly_move (Pmode, tmp2, tmp, src);
+ emit_move_insn (dest, gen_lowpart (mode, tmp2));
}
   else
{
  /* In RV32 system, handle (const_poly_int:SI [m, n])
(const_poly_int:DI [m, n]).
 In RV64 system, handle (const_poly_int:DI [m, n]).
-   FIXME: Maybe we could gen SImode in RV32 and then sign-extend to DImode,
-   the offset should not exceed 4GiB in general.  */
+FIXME: Maybe we could gen SImode in RV32 and then sign-extend to
+DImode, the offset should not exceed 4GiB in general.  */
  rtx tmp = gen_reg_rtx (mode);
  riscv_legitimize_poly_move (mode, dest, tmp, src);
}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/poly_licm-1.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/poly_licm-1.c
new file mode 100644
index 000..b7da65f0996
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/poly_licm-1.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -fno-schedule-insns 
-fno-schedule-insns2" } */
+
+extern int wsize;
+
+typedef unsigned short Posf;
+#define NIL 0
+
+void foo (Posf *p)
+{
+  register unsigned n, m;
+  do {
+  m = *--p;
+  *p = (Posf)(m >= wsize ? m-wsize : NIL);
+  } while (--n);
+}
+
+/* { dg-final { scan-assembler-times 
{vid\.v\s+v[0-9]+\s+addi\s+\s*[a-x0-9]+,\s*[a-x0-9]+,\s*-1\s+vrsub\.vx\s+} 1 } 
} */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/poly_licm-2.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/poly_licm-2.c
new file mode 100644
index 000..ffb3c63149f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/poly_licm-2.c
@@ -0,0 +1,27 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -fno-schedule-insns 
-fno-schedule-insns2" } */
+
+typedef unsigned short uint16_t;
+
+void AAA (uint16_t *x, uint16_t *y, unsigned wsize, unsigned count)
+{
+  unsigned m = 0, n = count;
+  register uint16_t *p;
+
+  p = x;
+
+  do {
+m = *--p;
+*p = (uint16_t)(m >= wsize ? m-wsize : 0);
+  } while (--n);
+
+  n = wsize;
+  p = y;
+
+  do {
+  m = *--p;
+