Re: [PATCH 0/1] tcg: Adjust simd_desc size encoding

2020-08-31 Thread Frank Chang
On Tue, Sep 1, 2020 at 6:29 AM Richard Henderson <
richard.hender...@linaro.org> wrote:

> Frank, this is intended to address the vector size limitation
> that you encountered with the risc-v rvv patch set, as per
>
> https://lists.gnu.org/archive/html/qemu-devel/2020-08/msg07924.html
>
> although not exactly like that email says.
>
> This will allow vectors up to 2k bytes in length.
> Please test, if you can.
>
>
> r~
>
>
> Richard Henderson (1):
>   tcg: Adjust simd_desc size encoding
>
>  include/tcg/tcg-gvec-desc.h | 38 -
>  tcg/tcg-op-gvec.c   | 35 ++
>  2 files changed, 52 insertions(+), 21 deletions(-)
>
> --
> 2.25.1
>
>
Thanks Richard, I will give it a try on my RVV 1.0.
Thanks for the quick fix.

Frank Chang


Re: [RFC v5 06/68] target/riscv: rvv-1.0: add translation-time vector context status

2020-10-05 Thread Frank Chang
On Sat, Oct 3, 2020 at 12:19 AM Richard Henderson <
richard.hender...@linaro.org> wrote:

> On 9/29/20 2:03 PM, frank.ch...@sifive.com wrote:
> > +++ b/target/riscv/insn_trans/trans_rvv.c.inc
> > @@ -41,6 +41,7 @@ static bool trans_vsetvl(DisasContext *ctx, arg_vsetvl *a)
> >  gen_get_gpr(s2, a->rs2);
> >  gen_helper_vsetvl(dst, cpu_env, s1, s2);
> >  gen_set_gpr(a->rd, dst);
> > +mark_vs_dirty(ctx);
> >  tcg_gen_movi_tl(cpu_pc, ctx->pc_succ_insn);
> >  lookup_and_goto_ptr(ctx);
> >  ctx->base.is_jmp = DISAS_NORETURN;
> > @@ -72,7 +73,7 @@ static bool trans_vsetvli(DisasContext *ctx, arg_vsetvli *a)
> >  }
> >  gen_helper_vsetvl(dst, cpu_env, s1, s2);
> >  gen_set_gpr(a->rd, dst);
> > -gen_goto_tb(ctx, 0, ctx->pc_succ_insn);
> > +mark_vs_dirty(ctx);
>
> Removing the gen_goto_tb can't be right all by itself.
>

Oops, I think I somehow messed up the commits.
This commit should only contain the + mark_vs_dirty(ctx); change.
The removal of gen_goto_tb(ctx, 0, ctx->pc_succ_insn); and its replacement by:
+ tcg_gen_movi_tl(cpu_pc, ctx->pc_succ_insn);
+ lookup_and_goto_ptr(ctx);
belong in a later commit: target/riscv: rvv-1.0: update check functions.


> I think you want to be sharing the code between vsetvl and vsetvli now.
> Just pass in a TCGv value to a common helper.
>

The only difference between vsetvl and vsetvli now is where s2 comes from:
vsetvli takes it from the zimm field and vsetvl from the rs2 register.
The two fields have different formats and are read in different ways,
i.e. s2 = tcg_const_tl(a->zimm); versus gen_get_gpr(s2, a->rs2);

Is there an elegant way to retrieve both values with shared code?


>
> r~
>

Thanks,
Frank Chang


Re: [RFC v5 00/68] support vector extension v1.0

2020-10-20 Thread Frank Chang
On Wed, Sep 30, 2020 at 3:04 AM  wrote:

> From: Frank Chang 
>
> This patchset implements the vector extension v1.0 for RISC-V on QEMU.
>
> This patchset is sent as RFC because RVV v1.0 is still in draft state.
> v2 patchset was sent for RVV v0.9 and bumped to RVV v1.0 since v3 patchset.
>
> The port is available here:
> https://github.com/sifive/qemu/tree/rvv-1.0-upstream-v5
>
> You can change the cpu argument: vext_spec to v1.0 (i.e. vext_spec=v1.0)
> to run with RVV v1.0 instructions.
>
> Note: This patchset depends on two other patchsets listed in Based-on
>   section below so it might not be able to be built unless those two
>   patchsets are applied.
>
> Changelog:
>
> v5
>   * refactor RVV v1.0 check functions.
> (Thanks to Richard Henderson's bitwise tricks.)
>   * relax RV_VLEN_MAX to 1024-bits.
>   * implement vstart CSR's behaviors.
>   * trigger illegal instruction exception if frm is not valid for
> vector floating-point instructions.
>   * rebase on riscv-to-apply.next.
>
> v4
>   * remove explicit float flmul variable in DisasContext.
>   * replace floating-point calculations with shift operations to
> improve performance.
>   * relax RV_VLEN_MAX to 512-bits.
>
> v3
>   * apply nan-box helpers from Richard Henderson.
>   * remove fp16 api changes as they are sent independently in another
> patchset by Chih-Min Chao.
>   * remove all tail elements clear functions as tail elements can
> retain unchanged for either VTA set to undisturbed or agnostic.
>   * add fp16 nan-box check generator function.
>   * add floating-point rounding mode enum.
>   * replace flmul arithmetic with shifts to avoid floating-point
> conversions.
>   * add Zvqmac extension.
>   * replace gdbstub vector register xml files with dynamic generator.
>   * bumped to RVV v1.0.
>   * RVV v1.0 related changes:
> * add vlre.v and vsr.v vector whole register
>   load/store instructions
> * add vrgatherei16 instruction.
> * rearranged bits in vtype to make vlmul bits into a contiguous
>   field.
>
> v2
>   * drop v0.7.1 support.
>   * replace invisible return check macros with functions.
>   * move mark_vs_dirty() to translators.
>   * add SSTATUS_VS flag for s-mode.
>   * nan-box scalar fp register for floating-point operations.
>   * add gdbstub files for vector registers to allow system-mode
> debugging with GDB.
>
> Based-on: <20200909001647.532249-1-richard.hender...@linaro.org/>
> Based-on: <1596102747-20226-1-git-send-email-chihmin.c...@sifive.com/>
>
> Frank Chang (62):
>   target/riscv: drop vector 0.7.1 and add 1.0 support
>   target/riscv: Use FIELD_EX32() to extract wd field
>   target/riscv: rvv-1.0: introduce writable misa.v field
>   target/riscv: rvv-1.0: add translation-time vector context status
>   target/riscv: rvv-1.0: remove rvv related codes from fcsr registers
>   target/riscv: rvv-1.0: check MSTATUS_VS when accessing vector csr
> registers
>   target/riscv: rvv-1.0: remove MLEN calculations
>   target/riscv: rvv-1.0: add fractional LMUL
>   target/riscv: rvv-1.0: add VMA and VTA
>   target/riscv: rvv-1.0: update check functions
>   target/riscv: introduce more imm value modes in translator functions
>   target/riscv: rvv:1.0: add translation-time nan-box helper function
>   target/riscv: rvv-1.0: configure instructions
>   target/riscv: rvv-1.0: stride load and store instructions
>   target/riscv: rvv-1.0: index load and store instructions
>   target/riscv: rvv-1.0: fix address index overflow bug of indexed
> load/store insns
>   target/riscv: rvv-1.0: fault-only-first unit stride load
>   target/riscv: rvv-1.0: amo operations
>   target/riscv: rvv-1.0: load/store whole register instructions
>   target/riscv: rvv-1.0: update vext_max_elems() for load/store insns
>   target/riscv: rvv-1.0: take fractional LMUL into vector max elements
> calculation
>   target/riscv: rvv-1.0: floating-point square-root instruction
>   target/riscv: rvv-1.0: floating-point classify instructions
>   target/riscv: rvv-1.0: mask population count instruction
>   target/riscv: rvv-1.0: find-first-set mask bit instruction
>   target/riscv: rvv-1.0: set-X-first mask bit instructions
>   target/riscv: rvv-1.0: iota instruction
>   target/riscv: rvv-1.0: element index instruction
>   target/riscv: rvv-1.0: allow load element with sign-extended
>   target/riscv: rvv-1.0: register gather instructions
>   target/riscv: rvv-1.0: integer scalar move instructions
>   target/riscv: rvv-1.0: floating-point move instruction
>   target/riscv: rvv-1.0: floating-point scalar move instructions
>   target/riscv: rvv-1.0: whole register move instructions
>   tar

Re: [RFC v5 06/68] target/riscv: rvv-1.0: add translation-time vector context status

2020-10-05 Thread Frank Chang
On Mon, Oct 5, 2020 at 10:00 PM Richard Henderson <
richard.hender...@linaro.org> wrote:

> On 10/5/20 2:12 AM, Frank Chang wrote:
> > I think you want to be sharing the code between vsetvl and vsetvli now.
> > Just pass in a TCGv value to a common helper.
> >
> >
> > The only difference between vsetvl and vsetvli now is where s2 comes from:
> > vsetvli takes it from the zimm field and vsetvl from the rs2 register.
> > The two fields have different formats and are read in different ways,
> > i.e. s2 = tcg_const_tl(a->zimm); versus gen_get_gpr(s2, a->rs2);
> >
> > Is there an elegant way to retrieve both values with shared code?
>
> Yes, like I (too briefly) described:
>
> static bool do_vsetvl(DisasContext *ctx,
>   int rd, int rs1, TCGv s2)
> {
> // existing contents of trans_vsetvl
> // do continue to free s2.
> }
>
> static bool trans_vsetvl(DisasContext *ctx, arg_vsetvl *a)
> {
> TCGv s2 = tcg_temp_new();
> gen_get_gpr(s2, a->rs2);
> return do_vsetvl(ctx, a->rd, a->rs1, s2);
> }
>
> static bool trans_vsetvli(DisasContext *ctx, arg_vsetvli *a)
> {
> TCGv s2 = tcg_const_tl(a->zimm);
> return do_vsetvl(ctx, a->rd, a->rs1, s2);
> }
>
>
> r~
>

Oops, I misunderstood which "helper function" you meant.
I thought you meant a helper function in vector_helper.c.
I'll update the code in the next version of the patchset.

Thanks,
Frank Chang


Re: [RFC v4 00/70] support vector extension v1.0

2020-08-25 Thread Frank Chang
On Mon, Aug 17, 2020 at 4:50 PM  wrote:

> From: Frank Chang 
>
> This patchset implements the vector extension v1.0 for RISC-V on QEMU.
>
> This patchset is sent as RFC because RVV v1.0 is still in draft state.
> v2 patchset was sent for RVV v0.9 and bumped to RVV v1.0 since v3 patchset.
>
> The port is available here:
> https://github.com/sifive/qemu/tree/rvv-1.0-upstream-v4
>
> You can change the cpu argument: vext_spec to v1.0 (i.e. vext_spec=v1.0)
> to run with RVV v1.0 instructions.
>
> Note: This patchset depends on two other patchsets listed in Based-on
>   section below so it might not be able to be built unless those two
>   patchsets are applied.
>
> Changelog:
>
> v4
>   * remove explicit float flmul variable in DisasContext.
>   * replace floating-point calculations with shift operations to
> improve performance.
>   * relax RV_VLEN_MAX to 512-bits.
>
> v3
>   * apply nan-box helpers from Richard Henderson.
>   * remove fp16 api changes as they are sent independently in another
> patchset by Chih-Min Chao.
>   * remove all tail elements clear functions as tail elements can
> retain unchanged for either VTA set to undisturbed or agnostic.
>   * add fp16 nan-box check generator function.
>   * add floating-point rounding mode enum.
>   * replace flmul arithmetic with shifts to avoid floating-point
> conversions.
>   * add Zvqmac extension.
>   * replace gdbstub vector register xml files with dynamic generator.
>   * bumped to RVV v1.0.
>   * RVV v1.0 related changes:
> * add vlre.v and vsr.v vector whole register
>   load/store instructions
> * add vrgatherei16 instruction.
> * rearranged bits in vtype to make vlmul bits into a contiguous
>   field.
>
> v2
>   * drop v0.7.1 support.
>   * replace invisible return check macros with functions.
>   * move mark_vs_dirty() to translators.
>   * add SSTATUS_VS flag for s-mode.
>   * nan-box scalar fp register for floating-point operations.
>   * add gdbstub files for vector registers to allow system-mode
> debugging with GDB.
>
> Based-on: <20200724002807.441147-1-richard.hender...@linaro.org/>
> Based-on: <1596102747-20226-1-git-send-email-chihmin.c...@sifive.com/>
>
> Frank Chang (62):
>   target/riscv: drop vector 0.7.1 and add 1.0 support
>   target/riscv: Use FIELD_EX32() to extract wd field
>   target/riscv: rvv-1.0: introduce writable misa.v field
>   target/riscv: rvv-1.0: remove rvv related codes from fcsr registers
>   target/riscv: rvv-1.0: check MSTATUS_VS when accessing vector csr
> registers
>   target/riscv: rvv-1.0: remove MLEN calculations
>   target/riscv: rvv-1.0: add fractional LMUL
>   target/riscv: rvv-1.0: add VMA and VTA
>   target/riscv: rvv-1.0: update check functions
>   target/riscv: introduce more imm value modes in translator functions
>   target/riscv: rvv:1.0: add translation-time nan-box helper function
>   target/riscv: rvv-1.0: configure instructions
>   target/riscv: rvv-1.0: stride load and store instructions
>   target/riscv: rvv-1.0: index load and store instructions
>   target/riscv: rvv-1.0: fix address index overflow bug of indexed
> load/store insns
>   target/riscv: rvv-1.0: fault-only-first unit stride load
>   target/riscv: rvv-1.0: amo operations
>   target/riscv: rvv-1.0: load/store whole register instructions
>   target/riscv: rvv-1.0: update vext_max_elems() for load/store insns
>   target/riscv: rvv-1.0: take fractional LMUL into vector max elements
> calculation
>   target/riscv: rvv-1.0: floating-point square-root instruction
>   target/riscv: rvv-1.0: floating-point classify instructions
>   target/riscv: rvv-1.0: mask population count instruction
>   target/riscv: rvv-1.0: find-first-set mask bit instruction
>   target/riscv: rvv-1.0: set-X-first mask bit instructions
>   target/riscv: rvv-1.0: iota instruction
>   target/riscv: rvv-1.0: element index instruction
>   target/riscv: rvv-1.0: allow load element with sign-extended
>   target/riscv: rvv-1.0: register gather instructions
>   target/riscv: rvv-1.0: integer scalar move instructions
>   target/riscv: rvv-1.0: floating-point move instruction
>   target/riscv: rvv-1.0: floating-point scalar move instructions
>   target/riscv: rvv-1.0: whole register move instructions
>   target/riscv: rvv-1.0: integer extension instructions
>   target/riscv: rvv-1.0: single-width averaging add and subtract
> instructions
>   target/riscv: rvv-1.0: single-width bit shift instructions
>   target/riscv: rvv-1.0: integer add-with-carry/subtract-with-borrow
>   target/riscv: rvv-1.0: narrowing integer right shift instructions
>   target/riscv: rvv-1.0: widening integer multiply-add instructio

Re: [RFC v4 00/70] support vector extension v1.0

2020-08-26 Thread Frank Chang
On Thu, Aug 27, 2020 at 12:56 AM Alistair Francis 
wrote:

> On Tue, Aug 25, 2020 at 1:29 AM Frank Chang 
> wrote:
> >
> > On Mon, Aug 17, 2020 at 4:50 PM  wrote:
> >>
> >> From: Frank Chang 
> >>
> >> This patchset implements the vector extension v1.0 for RISC-V on QEMU.
> >>
> >> This patchset is sent as RFC because RVV v1.0 is still in draft state.
> >> v2 patchset was sent for RVV v0.9 and bumped to RVV v1.0 since v3
> patchset.
> >>
> >> The port is available here:
> >> https://github.com/sifive/qemu/tree/rvv-1.0-upstream-v4
> >>
> >> You can change the cpu argument: vext_spec to v1.0 (i.e. vext_spec=v1.0)
> >> to run with RVV v1.0 instructions.
> >>
> >> Note: This patchset depends on two other patchsets listed in Based-on
> >>   section below so it might not be able to be built unless those two
> >>   patchsets are applied.
> >>
> >> Changelog:
> >>
> >> v4
> >>   * remove explicit float flmul variable in DisasContext.
> >>   * replace floating-point calculations with shift operations to
> >> improve performance.
> >>   * relax RV_VLEN_MAX to 512-bits.
> >>
> >> v3
> >>   * apply nan-box helpers from Richard Henderson.
> >>   * remove fp16 api changes as they are sent independently in another
> >> patchset by Chih-Min Chao.
> >>   * remove all tail elements clear functions as tail elements can
> >> retain unchanged for either VTA set to undisturbed or agnostic.
> >>   * add fp16 nan-box check generator function.
> >>   * add floating-point rounding mode enum.
> >>   * replace flmul arithmetic with shifts to avoid floating-point
> >> conversions.
> >>   * add Zvqmac extension.
> >>   * replace gdbstub vector register xml files with dynamic generator.
> >>   * bumped to RVV v1.0.
> >>   * RVV v1.0 related changes:
> >> * add vlre.v and vsr.v vector whole register
> >>   load/store instructions
> >> * add vrgatherei16 instruction.
> >> * rearranged bits in vtype to make vlmul bits into a contiguous
> >>   field.
> >>
> >> v2
> >>   * drop v0.7.1 support.
> >>   * replace invisible return check macros with functions.
> >>   * move mark_vs_dirty() to translators.
> >>   * add SSTATUS_VS flag for s-mode.
> >>   * nan-box scalar fp register for floating-point operations.
> >>   * add gdbstub files for vector registers to allow system-mode
> >> debugging with GDB.
> >>
> >> Based-on: <20200724002807.441147-1-richard.hender...@linaro.org/>
> >> Based-on: <1596102747-20226-1-git-send-email-chihmin.c...@sifive.com/>
> >>
> >> Frank Chang (62):
> >>   target/riscv: drop vector 0.7.1 and add 1.0 support
> >>   target/riscv: Use FIELD_EX32() to extract wd field
> >>   target/riscv: rvv-1.0: introduce writable misa.v field
> >>   target/riscv: rvv-1.0: remove rvv related codes from fcsr registers
> >>   target/riscv: rvv-1.0: check MSTATUS_VS when accessing vector csr
> >> registers
> >>   target/riscv: rvv-1.0: remove MLEN calculations
> >>   target/riscv: rvv-1.0: add fractional LMUL
> >>   target/riscv: rvv-1.0: add VMA and VTA
> >>   target/riscv: rvv-1.0: update check functions
> >>   target/riscv: introduce more imm value modes in translator functions
> >>   target/riscv: rvv:1.0: add translation-time nan-box helper function
> >>   target/riscv: rvv-1.0: configure instructions
> >>   target/riscv: rvv-1.0: stride load and store instructions
> >>   target/riscv: rvv-1.0: index load and store instructions
> >>   target/riscv: rvv-1.0: fix address index overflow bug of indexed
> >> load/store insns
> >>   target/riscv: rvv-1.0: fault-only-first unit stride load
> >>   target/riscv: rvv-1.0: amo operations
> >>   target/riscv: rvv-1.0: load/store whole register instructions
> >>   target/riscv: rvv-1.0: update vext_max_elems() for load/store insns
> >>   target/riscv: rvv-1.0: take fractional LMUL into vector max elements
> >> calculation
> >>   target/riscv: rvv-1.0: floating-point square-root instruction
> >>   target/riscv: rvv-1.0: floating-point classify instructions
> >>   target/riscv: rvv-1.0: mask population count instruction
> >>   target/riscv: rvv-1.0: find-first-set mask bit instruction
> >>   target/riscv: rvv-1.0: set-X-first mask bit in

Re: [RFC v4 00/70] support vector extension v1.0

2020-08-26 Thread Frank Chang
On Thu, Aug 27, 2020 at 2:03 AM Alistair Francis 
wrote:

> On Wed, Aug 26, 2020 at 10:39 AM Frank Chang 
> wrote:
> >
> > On Thu, Aug 27, 2020 at 12:56 AM Alistair Francis 
> wrote:
> >>
> >> On Tue, Aug 25, 2020 at 1:29 AM Frank Chang 
> wrote:
> >> >
> >> > On Mon, Aug 17, 2020 at 4:50 PM  wrote:
> >> >>
> >> >> From: Frank Chang 
> >> >>
> >> >> This patchset implements the vector extension v1.0 for RISC-V on
> QEMU.
> >> >>
> >> >> This patchset is sent as RFC because RVV v1.0 is still in draft
> state.
> >> >> v2 patchset was sent for RVV v0.9 and bumped to RVV v1.0 since v3
> patchset.
> >> >>
> >> >> The port is available here:
> >> >> https://github.com/sifive/qemu/tree/rvv-1.0-upstream-v4
> >> >>
> >> >> You can change the cpu argument: vext_spec to v1.0 (i.e.
> vext_spec=v1.0)
> >> >> to run with RVV v1.0 instructions.
> >> >>
> >> >> Note: This patchset depends on two other patchsets listed in Based-on
> >> >>   section below so it might not be able to be built unless those two
> >> >>   patchsets are applied.
> >> >>
> >> >> Changelog:
> >> >>
> >> >> v4
> >> >>   * remove explicit float flmul variable in DisasContext.
> >> >>   * replace floating-point calculations with shift operations to
> >> >> improve performance.
> >> >>   * relax RV_VLEN_MAX to 512-bits.
> >> >>
> >> >> v3
> >> >>   * apply nan-box helpers from Richard Henderson.
> >> >>   * remove fp16 api changes as they are sent independently in another
> >> >> patchset by Chih-Min Chao.
> >> >>   * remove all tail elements clear functions as tail elements can
> >> >> retain unchanged for either VTA set to undisturbed or agnostic.
> >> >>   * add fp16 nan-box check generator function.
> >> >>   * add floating-point rounding mode enum.
> >> >>   * replace flmul arithmetic with shifts to avoid floating-point
> >> >> conversions.
> >> >>   * add Zvqmac extension.
> >> >>   * replace gdbstub vector register xml files with dynamic generator.
> >> >>   * bumped to RVV v1.0.
> >> >>   * RVV v1.0 related changes:
> >> >> * add vlre.v and vsr.v vector whole register
> >> >>   load/store instructions
> >> >> * add vrgatherei16 instruction.
> >> >> * rearranged bits in vtype to make vlmul bits into a contiguous
> >> >>   field.
> >> >>
> >> >> v2
> >> >>   * drop v0.7.1 support.
> >> >>   * replace invisible return check macros with functions.
> >> >>   * move mark_vs_dirty() to translators.
> >> >>   * add SSTATUS_VS flag for s-mode.
> >> >>   * nan-box scalar fp register for floating-point operations.
> >> >>   * add gdbstub files for vector registers to allow system-mode
> >> >> debugging with GDB.
> >> >>
> >> >> Based-on: <20200724002807.441147-1-richard.hender...@linaro.org/>
> >> >> Based-on: <
> 1596102747-20226-1-git-send-email-chihmin.c...@sifive.com/>
> >> >>
> >> >> Frank Chang (62):
> >> >>   target/riscv: drop vector 0.7.1 and add 1.0 support
> >> >>   target/riscv: Use FIELD_EX32() to extract wd field
> >> >>   target/riscv: rvv-1.0: introduce writable misa.v field
> >> >>   target/riscv: rvv-1.0: remove rvv related codes from fcsr registers
> >> >>   target/riscv: rvv-1.0: check MSTATUS_VS when accessing vector csr
> >> >> registers
> >> >>   target/riscv: rvv-1.0: remove MLEN calculations
> >> >>   target/riscv: rvv-1.0: add fractional LMUL
> >> >>   target/riscv: rvv-1.0: add VMA and VTA
> >> >>   target/riscv: rvv-1.0: update check functions
> >> >>   target/riscv: introduce more imm value modes in translator
> functions
> >> >>   target/riscv: rvv:1.0: add translation-time nan-box helper function
> >> >>   target/riscv: rvv-1.0: configure instructions
> >> >>   target/riscv: rvv-1.0: stride load and store instructions
> >> >>   target/riscv: rvv-1.0: index load and store instructions
> >

[RFC v4 08/70] target/riscv: rvv-1.0: add vcsr register

2020-08-17 Thread frank . chang
From: LIU Zhiwei 

Signed-off-by: LIU Zhiwei 
Signed-off-by: Frank Chang 
Reviewed-by: Richard Henderson 
---
 target/riscv/cpu_bits.h |  7 +++
 target/riscv/csr.c  | 21 +
 2 files changed, 28 insertions(+)

diff --git a/target/riscv/cpu_bits.h b/target/riscv/cpu_bits.h
index 5b0be0bb888..7afdd4814bb 100644
--- a/target/riscv/cpu_bits.h
+++ b/target/riscv/cpu_bits.h
@@ -60,9 +60,16 @@
 #define CSR_VSTART          0x008
 #define CSR_VXSAT           0x009
 #define CSR_VXRM            0x00a
+#define CSR_VCSR            0x00f
 #define CSR_VL              0xc20
 #define CSR_VTYPE           0xc21
 
+/* VCSR fields */
+#define VCSR_VXSAT_SHIFT    0
+#define VCSR_VXSAT          (0x1 << VCSR_VXSAT_SHIFT)
+#define VCSR_VXRM_SHIFT     1
+#define VCSR_VXRM           (0x3 << VCSR_VXRM_SHIFT)
+
 /* User Timers and Counters */
 #define CSR_CYCLE           0xc00
 #define CSR_TIME            0xc01
diff --git a/target/riscv/csr.c b/target/riscv/csr.c
index 005839390a1..c87f2ddbf7d 100644
--- a/target/riscv/csr.c
+++ b/target/riscv/csr.c
@@ -247,6 +247,26 @@ static int write_vstart(CPURISCVState *env, int csrno, target_ulong val)
     return 0;
 }
 
+static int read_vcsr(CPURISCVState *env, int csrno, target_ulong *val)
+{
+    *val = (env->vxrm << VCSR_VXRM_SHIFT) | (env->vxsat << VCSR_VXSAT_SHIFT);
+    return 0;
+}
+
+static int write_vcsr(CPURISCVState *env, int csrno, target_ulong val)
+{
+#if !defined(CONFIG_USER_ONLY)
+    if (!env->debugger && !riscv_cpu_vector_enabled(env)) {
+        return -1;
+    }
+    env->mstatus |= MSTATUS_VS;
+#endif
+
+    env->vxrm = (val & VCSR_VXRM) >> VCSR_VXRM_SHIFT;
+    env->vxsat = (val & VCSR_VXSAT) >> VCSR_VXSAT_SHIFT;
+    return 0;
+}
+
 /* User Timers and Counters */
 static int read_instret(CPURISCVState *env, int csrno, target_ulong *val)
 {
@@ -1265,6 +1285,7 @@ static riscv_csr_operations csr_ops[CSR_TABLE_SIZE] = {
     [CSR_VSTART] =  { vs,   read_vstart,  write_vstart  },
     [CSR_VXSAT] =   { vs,   read_vxsat,   write_vxsat   },
     [CSR_VXRM] =    { vs,   read_vxrm,    write_vxrm    },
+    [CSR_VCSR] =    { vs,   read_vcsr,    write_vcsr    },
     [CSR_VL] =      { vs,   read_vl                     },
     [CSR_VTYPE] =   { vs,   read_vtype                  },
     /* User Timers and Counters */
-- 
2.17.1
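
The field packing done by read_vcsr() and write_vcsr() in the patch above can be tried outside QEMU. What follows is only a minimal sketch: the VCSR_* macros are copied from the patch, while struct vec_fixed_state, vcsr_pack() and vcsr_unpack() are hypothetical names standing in for CPURISCVState and the CSR callbacks. The mstatus.VS check and update are left out.

```c
#include <stdint.h>

/* Field layout copied from the patch: vxsat is bit 0, vxrm is bits 2:1. */
#define VCSR_VXSAT_SHIFT 0
#define VCSR_VXSAT       (0x1u << VCSR_VXSAT_SHIFT)
#define VCSR_VXRM_SHIFT  1
#define VCSR_VXRM        (0x3u << VCSR_VXRM_SHIFT)

/* Hypothetical stand-in for the two CPURISCVState fields used here. */
struct vec_fixed_state {
    uint32_t vxrm;   /* fixed-point rounding mode, 2 bits */
    uint32_t vxsat;  /* fixed-point saturation flag, 1 bit */
};

/* Combine both fields into one vcsr value, as read_vcsr() does. */
static uint32_t vcsr_pack(const struct vec_fixed_state *st)
{
    return (st->vxrm << VCSR_VXRM_SHIFT) | (st->vxsat << VCSR_VXSAT_SHIFT);
}

/* Split a written vcsr value back into the fields, as write_vcsr() does.
 * Bits outside the two masks are dropped. */
static void vcsr_unpack(struct vec_fixed_state *st, uint32_t val)
{
    st->vxrm = (val & VCSR_VXRM) >> VCSR_VXRM_SHIFT;
    st->vxsat = (val & VCSR_VXSAT) >> VCSR_VXSAT_SHIFT;
}
```

With vxrm = 2 and vxsat = 1 the packed value is 0x5, and unpacking a value with all bits set leaves vxrm = 3 and vxsat = 1 because the masks discard everything else.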




[RFC v4 05/70] target/riscv: rvv-1.0: introduce writable misa.v field

2020-08-17 Thread frank . chang
From: Frank Chang 

Implementations may have a writable misa.v field. Analogous to the way
in which the floating-point unit is handled, the mstatus.vs field may
exist even if misa.v is clear.

Signed-off-by: Frank Chang 
Reviewed-by: Richard Henderson 
---
 target/riscv/csr.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/riscv/csr.c b/target/riscv/csr.c
index 46c35266cb5..7f937e5b9c8 100644
--- a/target/riscv/csr.c
+++ b/target/riscv/csr.c
@@ -494,7 +494,7 @@ static int write_misa(CPURISCVState *env, int csrno, target_ulong val)
     val &= env->misa_mask;
 
     /* Mask extensions that are not supported by QEMU */
-    val &= (RVI | RVE | RVM | RVA | RVF | RVD | RVC | RVS | RVU);
+    val &= (RVI | RVE | RVM | RVA | RVF | RVD | RVC | RVS | RVU | RVV);
 
     /* 'D' depends on 'F', so clear 'D' if 'F' is not present */
     if ((val & RVD) && !(val & RVF)) {
-- 
2.17.1
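
To see what the one-line change above does to a written misa value, here is a small standalone sketch. It assumes one bit per extension letter, counted from 'A'; misa_filter() is a hypothetical name, and only the three steps visible in the hunk are modelled (the rest of write_misa() is not shown in the patch and is not reproduced here).

```c
#include <stdint.h>

/* One bit per extension letter, counted from 'A' (assumed layout). */
#define RV(x) ((uint32_t)1 << ((x) - 'A'))
#define RVI RV('I')
#define RVE RV('E')
#define RVM RV('M')
#define RVA RV('A')
#define RVF RV('F')
#define RVD RV('D')
#define RVC RV('C')
#define RVS RV('S')
#define RVU RV('U')
#define RVV RV('V')

/* Hypothetical helper: filter a value written to misa.
 * misa_mask plays the role of env->misa_mask. */
static uint32_t misa_filter(uint32_t val, uint32_t misa_mask)
{
    /* Keep only bits this CPU model allows to be written. */
    val &= misa_mask;

    /* Keep only extensions the emulator knows; 'V' is now in the list. */
    val &= (RVI | RVE | RVM | RVA | RVF | RVD | RVC | RVS | RVU | RVV);

    /* 'D' depends on 'F', so clear 'D' if 'F' is not present. */
    if ((val & RVD) && !(val & RVF)) {
        val &= ~RVD;
    }
    return val;
}
```

Before the patch, a guest write that set the 'V' bit would have had it masked off in the second step; with RVV in the list the bit survives as long as misa_mask permits it.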




[RFC v4 39/70] target/riscv: rvv-1.0: integer extension instructions

2020-08-17 Thread frank . chang
From: Frank Chang 

Add the following instructions:

* vzext.vf2
* vzext.vf4
* vzext.vf8
* vsext.vf2
* vsext.vf4
* vsext.vf8
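
As a rough illustration of what these instructions compute, the sketch below widens each source element to a destination element that is 2 or 4 times as wide, filling the upper bits with zeros (vzext) or with copies of the sign bit (vsext). It is a simplified model, not the helper code from this patch: masking, vstart and tail handling are ignored, and the function names are hypothetical, loosely following the helper naming (suffix = destination element width).

```c
#include <stddef.h>
#include <stdint.h>

/* vzext.vf2 with 16-bit destination elements: 8-bit sources, zero-filled. */
static void zext_vf2_h(uint16_t *vd, const uint8_t *vs2, size_t vl)
{
    for (size_t i = 0; i < vl; i++) {
        vd[i] = vs2[i];
    }
}

/* vsext.vf2 with 16-bit destination elements: 8-bit sources, sign-filled. */
static void sext_vf2_h(int16_t *vd, const int8_t *vs2, size_t vl)
{
    for (size_t i = 0; i < vl; i++) {
        vd[i] = vs2[i];
    }
}

/* vsext.vf4 with 32-bit destination elements: 8-bit sources, sign-filled. */
static void sext_vf4_w(int32_t *vd, const int8_t *vs2, size_t vl)
{
    for (size_t i = 0; i < vl; i++) {
        vd[i] = vs2[i];
    }
}
```

The source byte 0xff becomes 0x00ff under zext_vf2_h() and -1 (0xffff) under sext_vf2_h(). The other width combinations follow the same pattern with different element types.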

Signed-off-by: Frank Chang 
Reviewed-by: Richard Henderson 
---
 target/riscv/helper.h   | 14 
 target/riscv/insn32.decode  |  8 +++
 target/riscv/insn_trans/trans_rvv.inc.c | 86 +
 target/riscv/vector_helper.c| 31 +
 4 files changed, 139 insertions(+)

diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index 35fb09d2892..7ce2fa08d58 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -1118,3 +1118,17 @@ DEF_HELPER_6(vcompress_vm_b, void, ptr, ptr, ptr, ptr, env, i32)
 DEF_HELPER_6(vcompress_vm_h, void, ptr, ptr, ptr, ptr, env, i32)
 DEF_HELPER_6(vcompress_vm_w, void, ptr, ptr, ptr, ptr, env, i32)
 DEF_HELPER_6(vcompress_vm_d, void, ptr, ptr, ptr, ptr, env, i32)
+
+DEF_HELPER_5(vzext_vf2_h, void, ptr, ptr, ptr, env, i32)
+DEF_HELPER_5(vzext_vf2_w, void, ptr, ptr, ptr, env, i32)
+DEF_HELPER_5(vzext_vf2_d, void, ptr, ptr, ptr, env, i32)
+DEF_HELPER_5(vzext_vf4_w, void, ptr, ptr, ptr, env, i32)
+DEF_HELPER_5(vzext_vf4_d, void, ptr, ptr, ptr, env, i32)
+DEF_HELPER_5(vzext_vf8_d, void, ptr, ptr, ptr, env, i32)
+
+DEF_HELPER_5(vsext_vf2_h, void, ptr, ptr, ptr, env, i32)
+DEF_HELPER_5(vsext_vf2_w, void, ptr, ptr, ptr, env, i32)
+DEF_HELPER_5(vsext_vf2_d, void, ptr, ptr, ptr, env, i32)
+DEF_HELPER_5(vsext_vf4_w, void, ptr, ptr, ptr, env, i32)
+DEF_HELPER_5(vsext_vf4_d, void, ptr, ptr, ptr, env, i32)
+DEF_HELPER_5(vsext_vf8_d, void, ptr, ptr, ptr, env, i32)
diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
index 65ff1688c25..2b9700a42ad 100644
--- a/target/riscv/insn32.decode
+++ b/target/riscv/insn32.decode
@@ -619,5 +619,13 @@ vmv2r_v         100111 1 ..... 00001 011 ..... 1010111 @r2rd
 vmv4r_v         100111 1 ..... 00011 011 ..... 1010111 @r2rd
 vmv8r_v         100111 1 ..... 00111 011 ..... 1010111 @r2rd
 
+# Vector Integer Extension
+vzext_vf2       010010 . ..... 00110 010 ..... 1010111 @r2_vm
+vzext_vf4       010010 . ..... 00100 010 ..... 1010111 @r2_vm
+vzext_vf8       010010 . ..... 00010 010 ..... 1010111 @r2_vm
+vsext_vf2       010010 . ..... 00111 010 ..... 1010111 @r2_vm
+vsext_vf4       010010 . ..... 00101 010 ..... 1010111 @r2_vm
+vsext_vf8       010010 . ..... 00011 010 ..... 1010111 @r2_vm
+
 vsetvli         0 ........... ..... 111 ..... 1010111  @r2_zimm
 vsetvl          1000000 ..... ..... 111 ..... 1010111  @r
diff --git a/target/riscv/insn_trans/trans_rvv.inc.c b/target/riscv/insn_trans/trans_rvv.inc.c
index 52f2f4902c0..5cd099bed7b 100644
--- a/target/riscv/insn_trans/trans_rvv.inc.c
+++ b/target/riscv/insn_trans/trans_rvv.inc.c
@@ -3504,3 +3504,89 @@ GEN_VMV_WHOLE_TRANS(vmv1r_v, 1)
 GEN_VMV_WHOLE_TRANS(vmv2r_v, 2)
 GEN_VMV_WHOLE_TRANS(vmv4r_v, 4)
 GEN_VMV_WHOLE_TRANS(vmv8r_v, 8)
+
+static bool int_ext_check(DisasContext *s, arg_rmr *a, uint8_t div)
+{
+    uint8_t from = (s->sew + 3) - div;
+    bool ret = require_rvv(s);
+    ret &= (from >= 3 && from <= 8) &&
+           (a->rd != a->rs2) &&
+           require_align(a->rd, 1 << s->lmul) &&
+           require_align(a->rs2, 1 << (s->lmul - div)) &&
+           require_vm(a->vm, a->rd);
+    if ((s->lmul - div) < 0) {
+        ret &= require_noover(a->rd, 1 << s->lmul,
+                              a->rs2, 1 << (s->lmul - div));
+    } else {
+        ret &= require_noover_widen(a->rd, 1 << s->lmul, a->rs2,
+                                    1 << (s->lmul - div));
+    }
+    return ret;
+}
+
+static bool int_ext_op(DisasContext *s, arg_rmr *a, uint8_t seq)
+{
+    uint32_t data = 0;
+    gen_helper_gvec_3_ptr *fn;
+    TCGLabel *over = gen_new_label();
+    tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_vl, 0, over);
+
+    static gen_helper_gvec_3_ptr * const fns[6][4] = {
+        {
+            NULL, gen_helper_vzext_vf2_h,
+            gen_helper_vzext_vf2_w, gen_helper_vzext_vf2_d
+        },
+        {
+            NULL, NULL,
+            gen_helper_vzext_vf4_w, gen_helper_vzext_vf4_d,
+        },
+        {
+            NULL, NULL,
+            NULL, gen_helper_vzext_vf8_d
+        },
+        {
+            NULL, gen_helper_vsext_vf2_h,
+            gen_helper_vsext_vf2_w, gen_helper_vsext_vf2_d
+        },
+        {
+            NULL, NULL,
+            gen_helper_vsext_vf4_w, gen_helper_vsext_vf4_d,
+        },
+        {
+            NULL, NULL,
+            NULL, gen_helper_vsext_vf8_d
+        }
+    };
+
+    fn = fns[seq][s->sew];
+    if (fn == NULL) {
+        return false;
+    }
+
+    data = FIELD_DP32(data, VDATA, VM, a->vm);
+
+    tcg_gen_gvec_3_ptr(vreg_ofs(s, a->rd), vreg_ofs(s, 0),
+                       vreg_ofs(s, a->rs2), cpu_env, 0,
+                       s->vlen / 8, data, fn);
+
+    mark_vs_dirty(s

[RFC v4 15/70] target/riscv: introduce more imm value modes in translator functions

2020-08-17 Thread frank . chang
From: Frank Chang 

Extend the immediate handling in the translator functions beyond
zero-extension and sign-extension, adding more modes so that it applies
to the various formats of vector instructions.

* IMM_ZX:         Zero-extended
* IMM_SX:         Sign-extended
* IMM_TRUNC_SEW:  Truncate to log2(SEW) bits
* IMM_TRUNC_2SEW: Truncate to log2(2*SEW) bits
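
The four modes can be exercised in isolation with the sketch below. It is an approximation under stated assumptions: extract64() and sextract64() are local re-implementations in the spirit of QEMU's bitops helpers (without their assertions), and the log2_sew_bytes parameter is a hypothetical replacement for s->sew, read here as 0 for SEW=8 up to 3 for SEW=64, so that adding 3 gives log2(SEW).

```c
#include <stdint.h>

/* Unsigned bit-field extract; length must be in 1..64. */
static uint64_t extract64(uint64_t value, int start, int length)
{
    return (value >> start) & (~0ULL >> (64 - length));
}

/* Signed bit-field extract. Relies on arithmetic right shift of
 * negative values, which mainstream compilers provide. */
static int64_t sextract64(uint64_t value, int start, int length)
{
    return ((int64_t)(value << (64 - length - start))) >> (64 - length);
}

typedef enum {
    IMM_ZX,         /* Zero-extended */
    IMM_SX,         /* Sign-extended */
    IMM_TRUNC_SEW,  /* Truncate to log2(SEW) bits */
    IMM_TRUNC_2SEW, /* Truncate to log2(2*SEW) bits */
} imm_mode_t;

/* Same switch as in the patch, with the DisasContext replaced by the
 * single value it contributes (assumption: SEW = 8 << log2_sew_bytes). */
static int64_t extract_imm(unsigned log2_sew_bytes, uint32_t imm,
                           imm_mode_t imm_mode)
{
    switch (imm_mode) {
    case IMM_ZX:
        return (int64_t)extract64(imm, 0, 5);
    case IMM_SX:
        return sextract64(imm, 0, 5);
    case IMM_TRUNC_SEW:
        return (int64_t)extract64(imm, 0, (int)log2_sew_bytes + 3);
    case IMM_TRUNC_2SEW:
        return (int64_t)extract64(imm, 0, (int)log2_sew_bytes + 4);
    }
    return 0; /* not reached for valid modes */
}
```

For the 5-bit immediate 0x1f this gives 31 for IMM_ZX and -1 for IMM_SX. With SEW=8 the truncating modes keep 3 and 4 bits, giving 7 and 15, which is the range a shift amount can usefully take for that element width and for its double-width form.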

Signed-off-by: Frank Chang 
---
 target/riscv/insn_trans/trans_rvv.inc.c | 115 ++--
 1 file changed, 66 insertions(+), 49 deletions(-)

diff --git a/target/riscv/insn_trans/trans_rvv.inc.c b/target/riscv/insn_trans/trans_rvv.inc.c
index 4ab556f784d..daaa47ac9c3 100644
--- a/target/riscv/insn_trans/trans_rvv.inc.c
+++ b/target/riscv/insn_trans/trans_rvv.inc.c
@@ -1283,8 +1283,32 @@ static void tcg_gen_gvec_rsubs(unsigned vece, uint32_t dofs, uint32_t aofs,
 
 GEN_OPIVX_GVEC_TRANS(vrsub_vx, rsubs)
 
+typedef enum {
+    IMM_ZX,         /* Zero-extended */
+    IMM_SX,         /* Sign-extended */
+    IMM_TRUNC_SEW,  /* Truncate to log(SEW) bits */
+    IMM_TRUNC_2SEW, /* Truncate to log(2*SEW) bits */
+} imm_mode_t;
+
+static int64_t extract_imm(DisasContext *s, uint32_t imm, imm_mode_t imm_mode)
+{
+    switch (imm_mode) {
+    case IMM_ZX:
+        return extract64(imm, 0, 5);
+    case IMM_SX:
+        return sextract64(imm, 0, 5);
+    case IMM_TRUNC_SEW:
+        return extract64(imm, 0, s->sew + 3);
+    case IMM_TRUNC_2SEW:
+        return extract64(imm, 0, s->sew + 4);
+    default:
+        g_assert_not_reached();
+    }
+}
+
 static bool opivi_trans(uint32_t vd, uint32_t imm, uint32_t vs2, uint32_t vm,
-                        gen_helper_opivx *fn, DisasContext *s, int zx)
+                        gen_helper_opivx *fn, DisasContext *s,
+                        imm_mode_t imm_mode)
 {
     TCGv_ptr dest, src2, mask;
     TCGv src1;
@@ -1297,11 +1321,8 @@ static bool opivi_trans(uint32_t vd, uint32_t imm, uint32_t vs2, uint32_t vm,
     dest = tcg_temp_new_ptr();
     mask = tcg_temp_new_ptr();
     src2 = tcg_temp_new_ptr();
-    if (zx) {
-        src1 = tcg_const_tl(imm);
-    } else {
-        src1 = tcg_const_tl(sextract64(imm, 0, 5));
-    }
+    src1 = tcg_const_tl(extract_imm(s, imm, imm_mode));
+
     data = FIELD_DP32(data, VDATA, VM, vm);
     data = FIELD_DP32(data, VDATA, LMUL, s->lmul);
     desc = tcg_const_i32(simd_desc(0, s->vlen / 8, data));
@@ -1327,28 +1348,23 @@ typedef void GVecGen2iFn(unsigned, uint32_t, uint32_t, 
int64_t,
 
 static inline bool
 do_opivi_gvec(DisasContext *s, arg_rmrr *a, GVecGen2iFn *gvec_fn,
-  gen_helper_opivx *fn, int zx)
+  gen_helper_opivx *fn, imm_mode_t imm_mode)
 {
 if (!opivx_check(s, a)) {
 return false;
 }
 
 if (a->vm && s->vl_eq_vlmax) {
-if (zx) {
-gvec_fn(s->sew, vreg_ofs(s, a->rd), vreg_ofs(s, a->rs2),
-extract64(a->rs1, 0, 5), MAXSZ(s), MAXSZ(s));
-} else {
-gvec_fn(s->sew, vreg_ofs(s, a->rd), vreg_ofs(s, a->rs2),
-sextract64(a->rs1, 0, 5), MAXSZ(s), MAXSZ(s));
-}
+gvec_fn(s->sew, vreg_ofs(s, a->rd), vreg_ofs(s, a->rs2),
+extract_imm(s, a->rs1, imm_mode), MAXSZ(s), MAXSZ(s));
 mark_vs_dirty(s);
 return true;
 }
-return opivi_trans(a->rd, a->rs1, a->rs2, a->vm, fn, s, zx);
+return opivi_trans(a->rd, a->rs1, a->rs2, a->vm, fn, s, imm_mode);
 }
 
 /* OPIVI with GVEC IR */
-#define GEN_OPIVI_GVEC_TRANS(NAME, ZX, OPIVX, SUF) \
+#define GEN_OPIVI_GVEC_TRANS(NAME, IMM_MODE, OPIVX, SUF) \
 static bool trans_##NAME(DisasContext *s, arg_rmrr *a) \
 {  \
 static gen_helper_opivx * const fns[4] = { \
@@ -1356,10 +1372,10 @@ static bool trans_##NAME(DisasContext *s, arg_rmrr *a) \
 gen_helper_##OPIVX##_w, gen_helper_##OPIVX##_d,\
 }; \
 return do_opivi_gvec(s, a, tcg_gen_gvec_##SUF, \
- fns[s->sew], ZX); \
+ fns[s->sew], IMM_MODE);   \
 }
 
-GEN_OPIVI_GVEC_TRANS(vadd_vi, 0, vadd_vx, addi)
+GEN_OPIVI_GVEC_TRANS(vadd_vi, IMM_SX, vadd_vx, addi)
 
 static void tcg_gen_gvec_rsubi(unsigned vece, uint32_t dofs, uint32_t aofs,
int64_t c, uint32_t oprsz, uint32_t maxsz)
@@ -1369,7 +1385,7 @@ static void tcg_gen_gvec_rsubi(unsigned vece, uint32_t dofs, uint32_t aofs,
 tcg_temp_free_i64(tmp);
 }
 
-GEN_OPIVI_GVEC_TRANS(vrsub_vi, 0, vrsub_vx, rsubi)
+GEN_OPIVI_GVEC_TRANS(vrsub_vi, IMM_SX, vrsub_vx, rsubi)
 
 /* Vector Widening Integer Add/Subtract */
 
@@ -1624,7 +1640,7 @@ GEN_OPIVX_TRANS(vmadc_vxm, opivx_vmadc_check)
 GEN_OPIVX_TRANS(vmsbc_vxm, opivx_vmadc_check)
 
 /* OPIVI without GVE
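
For readers following the immediate handling above, here is a minimal standalone sketch of what extract_imm() selects between. It is not the patch's code: extract_bits() and sextract_bits() are hypothetical stand-ins for QEMU's extract64()/sextract64(), extract_imm_sketch() takes sew directly instead of a DisasContext, and sew is assumed to hold log2(SEW / 8), so that sew + 3 is log2(SEW).

```c
#include <assert.h>
#include <stdint.h>

typedef enum { IMM_ZX, IMM_SX, IMM_TRUNC_SEW, IMM_TRUNC_2SEW } imm_mode_t;

/* Zero-extended bit field [start, start + len) of value (stand-in helper). */
static uint64_t extract_bits(uint64_t value, int start, int len)
{
    assert(start >= 0 && len > 0 && len <= 64 - start);
    return (value >> start) & (~0ULL >> (64 - len));
}

/* Same field, sign-extended from its top bit (stand-in helper). */
static int64_t sextract_bits(uint64_t value, int start, int len)
{
    uint64_t field = extract_bits(value, start, len);
    uint64_t sign = 1ULL << (len - 1);
    return (int64_t)((field ^ sign) - sign);
}

/* Hypothetical analogue of extract_imm(); sew is log2(SEW / 8). */
static int64_t extract_imm_sketch(int sew, uint32_t imm, imm_mode_t mode)
{
    switch (mode) {
    case IMM_ZX:          /* 5-bit immediate, zero-extended */
        return (int64_t)extract_bits(imm, 0, 5);
    case IMM_SX:          /* 5-bit immediate, sign-extended */
        return sextract_bits(imm, 0, 5);
    case IMM_TRUNC_SEW:   /* keep log2(SEW) bits */
        return (int64_t)extract_bits(imm, 0, sew + 3);
    case IMM_TRUNC_2SEW:  /* keep log2(2 * SEW) bits */
        return (int64_t)extract_bits(imm, 0, sew + 4);
    }
    assert(0 && "unknown immediate mode");
    return 0;
}
```

With sew = 0 (SEW = 8) the same raw immediate 0x1f gives 31 for IMM_ZX, -1 for IMM_SX and 7 for IMM_TRUNC_SEW, which is why a single mode argument can replace the old zx flag.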

[RFC v4 23/70] target/riscv: rvv-1.0: load/store whole register instructions

2020-08-17 Thread frank . chang
From: Frank Chang 

Add the following instructions:

* vlre.v
* vsr.v

Signed-off-by: Frank Chang 
---
 target/riscv/helper.h   | 21 
 target/riscv/insn32.decode  | 22 
 target/riscv/insn_trans/trans_rvv.inc.c | 72 +
 target/riscv/vector_helper.c| 65 ++
 4 files changed, 180 insertions(+)

diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index 9200178d25c..25d076d71a8 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -145,6 +145,27 @@ DEF_HELPER_5(vle16ff_v, void, ptr, ptr, tl, env, i32)
 DEF_HELPER_5(vle32ff_v, void, ptr, ptr, tl, env, i32)
 DEF_HELPER_5(vle64ff_v, void, ptr, ptr, tl, env, i32)
 
+DEF_HELPER_4(vl1re8_v, void, ptr, tl, env, i32)
+DEF_HELPER_4(vl1re16_v, void, ptr, tl, env, i32)
+DEF_HELPER_4(vl1re32_v, void, ptr, tl, env, i32)
+DEF_HELPER_4(vl1re64_v, void, ptr, tl, env, i32)
+DEF_HELPER_4(vl2re8_v, void, ptr, tl, env, i32)
+DEF_HELPER_4(vl2re16_v, void, ptr, tl, env, i32)
+DEF_HELPER_4(vl2re32_v, void, ptr, tl, env, i32)
+DEF_HELPER_4(vl2re64_v, void, ptr, tl, env, i32)
+DEF_HELPER_4(vl4re8_v, void, ptr, tl, env, i32)
+DEF_HELPER_4(vl4re16_v, void, ptr, tl, env, i32)
+DEF_HELPER_4(vl4re32_v, void, ptr, tl, env, i32)
+DEF_HELPER_4(vl4re64_v, void, ptr, tl, env, i32)
+DEF_HELPER_4(vl8re8_v, void, ptr, tl, env, i32)
+DEF_HELPER_4(vl8re16_v, void, ptr, tl, env, i32)
+DEF_HELPER_4(vl8re32_v, void, ptr, tl, env, i32)
+DEF_HELPER_4(vl8re64_v, void, ptr, tl, env, i32)
+DEF_HELPER_4(vs1r_v, void, ptr, tl, env, i32)
+DEF_HELPER_4(vs2r_v, void, ptr, tl, env, i32)
+DEF_HELPER_4(vs4r_v, void, ptr, tl, env, i32)
+DEF_HELPER_4(vs8r_v, void, ptr, tl, env, i32)
+
 DEF_HELPER_6(vamoswapei8_32_v, void, ptr, ptr, tl, ptr, env, i32)
 DEF_HELPER_6(vamoswapei8_64_v, void, ptr, ptr, tl, ptr, env, i32)
 DEF_HELPER_6(vamoswapei16_32_v, void, ptr, ptr, tl, ptr, env, i32)
diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
index 6a9cf6ad534..c99575d1360 100644
--- a/target/riscv/insn32.decode
+++ b/target/riscv/insn32.decode
@@ -267,6 +267,28 @@ vle16ff_v ... 000 . 1 . 101 . 111 @r2_nfvm
 vle32ff_v ... 000 . 1 . 110 . 111 @r2_nfvm
 vle64ff_v ... 000 . 1 . 111 . 111 @r2_nfvm
 
+# Vector whole register insns
+vl1re8_v  000 000 1 01000 . 000 . 111 @r2
+vl1re16_v 000 000 1 01000 . 101 . 111 @r2
+vl1re32_v 000 000 1 01000 . 110 . 111 @r2
+vl1re64_v 000 000 1 01000 . 111 . 111 @r2
+vl2re8_v  001 000 1 01000 . 000 . 111 @r2
+vl2re16_v 001 000 1 01000 . 101 . 111 @r2
+vl2re32_v 001 000 1 01000 . 110 . 111 @r2
+vl2re64_v 001 000 1 01000 . 111 . 111 @r2
+vl4re8_v  011 000 1 01000 . 000 . 111 @r2
+vl4re16_v 011 000 1 01000 . 101 . 111 @r2
+vl4re32_v 011 000 1 01000 . 110 . 111 @r2
+vl4re64_v 011 000 1 01000 . 111 . 111 @r2
+vl8re8_v  111 000 1 01000 . 000 . 111 @r2
+vl8re16_v 111 000 1 01000 . 101 . 111 @r2
+vl8re32_v 111 000 1 01000 . 110 . 111 @r2
+vl8re64_v 111 000 1 01000 . 111 . 111 @r2
+vs1r_v000 000 1 01000 . 000 . 0100111 @r2
+vs2r_v001 000 1 01000 . 000 . 0100111 @r2
+vs4r_v011 000 1 01000 . 000 . 0100111 @r2
+vs8r_v111 000 1 01000 . 000 . 0100111 @r2
+
 #*** Vector AMO operations are encoded under the standard AMO major opcode ***
 vamoswapei8_v   1 . . . . 000 . 010 @r_wdvm
 vamoswapei16_v  1 . . . . 101 . 010 @r_wdvm
diff --git a/target/riscv/insn_trans/trans_rvv.inc.c b/target/riscv/insn_trans/trans_rvv.inc.c
index 1377604d599..6a2f175b50a 100644
--- a/target/riscv/insn_trans/trans_rvv.inc.c
+++ b/target/riscv/insn_trans/trans_rvv.inc.c
@@ -1016,6 +1016,78 @@ GEN_VEXT_TRANS(vle16ff_v, 16, 1, r2nfvm, ldff_op, ld_us_check)
 GEN_VEXT_TRANS(vle32ff_v, 32, 2, r2nfvm, ldff_op, ld_us_check)
 GEN_VEXT_TRANS(vle64ff_v, 64, 3, r2nfvm, ldff_op, ld_us_check)
 
+/*
+ * load and store whole register instructions
+ */
+typedef void gen_helper_ldst_whole(TCGv_ptr, TCGv, TCGv_env, TCGv_i32);
+
+static bool ldst_whole_trans(uint32_t vd, uint32_t rs1, uint32_t data,
+ gen_helper_ldst_whole *fn, DisasContext *s,
+ bool is_store)
+{
+TCGv_ptr dest;
+TCGv base;
+TCGv_i32 desc;
+
+dest = tcg_temp_new_ptr();
+base = tcg_temp_new();
+desc = tcg_const_i32(simd_desc(0, s->vlen / 8, data));
+
+gen_get_gpr(base, rs1);
+tcg_gen_addi_ptr(dest, cpu_env, vreg_ofs(s, vd));
+
+fn(dest, base, cpu_env, desc);
+
+tcg_temp_free_ptr(dest);
+tcg_temp_free(base);
+tcg_temp_free_i32(desc);
+if (!is_store) {
+mark_vs_dirty(s);
+}
+return true;
+}
+
+/*
+ * load and store wh

[RFC v4 17/70] target/riscv: rvv-1.0: configure instructions

2020-08-17 Thread frank . chang
From: Frank Chang 

Signed-off-by: Frank Chang 
Reviewed-by: Richard Henderson 
---
 target/riscv/insn_trans/trans_rvv.inc.c | 12 
 target/riscv/vector_helper.c| 14 +-
 2 files changed, 21 insertions(+), 5 deletions(-)

diff --git a/target/riscv/insn_trans/trans_rvv.inc.c b/target/riscv/insn_trans/trans_rvv.inc.c
index 4b8ae5470c3..4efe323920b 100644
--- a/target/riscv/insn_trans/trans_rvv.inc.c
+++ b/target/riscv/insn_trans/trans_rvv.inc.c
@@ -98,8 +98,10 @@ static bool trans_vsetvl(DisasContext *ctx, arg_vsetvl *a)
 s2 = tcg_temp_new();
 dst = tcg_temp_new();
 
-/* Using x0 as the rs1 register specifier, encodes an infinite AVL */
-if (a->rs1 == 0) {
+if (a->rd == 0 && a->rs1 == 0) {
+s1 = tcg_temp_new();
+tcg_gen_mov_tl(s1, cpu_vl);
+} else if (a->rs1 == 0) {
 /* As the mask is at least one bit, RV_VLEN_MAX is >= VLMAX */
 s1 = tcg_const_tl(RV_VLEN_MAX);
 } else {
@@ -131,8 +133,10 @@ static bool trans_vsetvli(DisasContext *ctx, arg_vsetvli *a)
 s2 = tcg_const_tl(a->zimm);
 dst = tcg_temp_new();
 
-/* Using x0 as the rs1 register specifier, encodes an infinite AVL */
-if (a->rs1 == 0) {
+if (a->rd == 0 && a->rs1 == 0) {
+s1 = tcg_temp_new();
+tcg_gen_mov_tl(s1, cpu_vl);
+} else if (a->rs1 == 0) {
 /* As the mask is at least one bit, RV_VLEN_MAX is >= VLMAX */
 s1 = tcg_const_tl(RV_VLEN_MAX);
 } else {
diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c
index 7b4b1151b97..430b25d16c2 100644
--- a/target/riscv/vector_helper.c
+++ b/target/riscv/vector_helper.c
@@ -31,12 +31,24 @@ target_ulong HELPER(vsetvl)(CPURISCVState *env, target_ulong s1,
 {
 int vlmax, vl;
 RISCVCPU *cpu = env_archcpu(env);
+uint64_t lmul = FIELD_EX64(s2, VTYPE, VLMUL);
 uint16_t sew = 8 << FIELD_EX64(s2, VTYPE, VSEW);
 uint8_t ediv = FIELD_EX64(s2, VTYPE, VEDIV);
 bool vill = FIELD_EX64(s2, VTYPE, VILL);
 target_ulong reserved = FIELD_EX64(s2, VTYPE, RESERVED);
 
-if ((sew > cpu->cfg.elen) || vill || (ediv != 0) || (reserved != 0)) {
+if (lmul & 4) {
+/* Fractional LMUL. */
+if (lmul == 4 ||
+cpu->cfg.elen >> (8 - lmul) < sew) {
+vill = true;
+}
+}
+
+if ((sew > cpu->cfg.elen)
+|| vill
+|| (ediv != 0)
+|| (reserved != 0)) {
 /* only set vill bit. */
 env->vtype = FIELD_DP64(0, VTYPE, VILL, 1);
 env->vl = 0;
-- 
2.17.1
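
As a reading aid for the fractional-LMUL check in the vsetvl helper above, here is a minimal sketch of the same test as a standalone predicate. vtype_is_illegal() and its parameters are hypothetical names, and the ediv, vill and reserved-bit checks from the helper are left out. The vlmul argument is the raw 3-bit field, where 5, 6 and 7 encode LMUL = 1/8, 1/4 and 1/2 and 4 is reserved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical predicate: true when the (vlmul, SEW) pair should set vill.
 * vlmul is the raw 3-bit field, sew_bits is SEW in bits, elen is ELEN.
 */
static bool vtype_is_illegal(uint32_t vlmul, uint32_t sew_bits, uint32_t elen)
{
    assert(vlmul < 8);

    if (sew_bits > elen) {
        return true;                 /* element wider than the hart supports */
    }
    if (vlmul & 4) {
        /* Fractional LMUL: 1/8, 1/4, 1/2 are encoded as 5, 6, 7. */
        if (vlmul == 4) {
            return true;             /* reserved encoding */
        }
        if ((elen >> (8 - vlmul)) < sew_bits) {
            return true;             /* ELEN * LMUL is smaller than SEW */
        }
    }
    return false;
}
```

For example, with ELEN = 64, LMUL = 1/8 leaves room for SEW = 8 only, so vtype_is_illegal(5, 16, 64) is true while vtype_is_illegal(5, 8, 64) is false.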




[RFC v4 35/70] target/riscv: rvv-1.0: integer scalar move instructions

2020-08-17 Thread frank . chang
From: Frank Chang 

* Remove "vmv.s.x: do nothing if rs1 == 0" constraint.
* Add vmv.x.s instruction.

Signed-off-by: Frank Chang 
Reviewed-by: Richard Henderson 
---
 target/riscv/insn32.decode  |  3 +-
 target/riscv/insn_trans/trans_rvv.inc.c | 45 -
 2 files changed, 39 insertions(+), 9 deletions(-)

diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
index 67306ac7161..6b90b67c7cc 100644
--- a/target/riscv/insn32.decode
+++ b/target/riscv/insn32.decode
@@ -598,8 +598,9 @@ vmsif_m 010100 . . 00011 010 . 1010111 @r2_vm
 vmsof_m 010100 . . 00010 010 . 1010111 @r2_vm
 viota_m 010100 . . 1 010 . 1010111 @r2_vm
 vid_v   010100 . 0 10001 010 . 1010111 @r1_vm
+vmv_x_s 01 1 . 0 010 . 1010111 @r2rd
+vmv_s_x 01 1 0 . 110 . 1010111 @r2
 vext_x_v001100 1 . . 010 . 1010111 @r
-vmv_s_x 001101 1 0 . 110 . 1010111 @r2
 vfmv_f_s001100 1 . 0 001 . 1010111 @r2rd
 vfmv_s_f001101 1 0 . 101 . 1010111 @r2
 vslideup_vx 001110 . . . 100 . 1010111 @r_vm
diff --git a/target/riscv/insn_trans/trans_rvv.inc.c b/target/riscv/insn_trans/trans_rvv.inc.c
index 392a1eba6b9..92d34be5a99 100644
--- a/target/riscv/insn_trans/trans_rvv.inc.c
+++ b/target/riscv/insn_trans/trans_rvv.inc.c
@@ -3207,27 +3207,56 @@ static void vec_element_storei(DisasContext *s, int vreg,
 store_element(val, cpu_env, endian_ofs(s, vreg, idx), s->sew);
 }
 
+/* vmv.x.s rd, vs2 # x[rd] = vs2[0] */
+static bool trans_vmv_x_s(DisasContext *s, arg_vmv_x_s *a)
+{
+if (require_rvv(s) &&
+vext_check_isa_ill(s)) {
+TCGv_i64 t1;
+TCGv dest;
+
+t1 = tcg_temp_new_i64();
+dest = tcg_temp_new();
+/*
+ * load vreg and sign-extend to 64 bits,
+ * then truncate to XLEN bits before storing to gpr.
+ */
+vec_element_loadi(s, t1, a->rs2, 0, true);
+tcg_gen_trunc_i64_tl(dest, t1);
+gen_set_gpr(a->rd, dest);
+tcg_temp_free_i64(t1);
+tcg_temp_free(dest);
+
+return true;
+}
+return false;
+}
+
 /* vmv.s.x vd, rs1 # vd[0] = rs1 */
 static bool trans_vmv_s_x(DisasContext *s, arg_vmv_s_x *a)
 {
-if (vext_check_isa_ill(s)) {
+if (require_rvv(s) &&
+vext_check_isa_ill(s)) {
 /* This instruction ignores LMUL and vector register groups */
-int maxsz = s->vlen >> 3;
 TCGv_i64 t1;
+TCGv s1;
 TCGLabel *over = gen_new_label();
 
 tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_vl, 0, over);
-tcg_gen_gvec_dup_imm(SEW64, vreg_ofs(s, a->rd), maxsz, maxsz, 0);
-if (a->rs1 == 0) {
-goto done;
-}
 
 t1 = tcg_temp_new_i64();
-tcg_gen_extu_tl_i64(t1, cpu_gpr[a->rs1]);
+s1 = tcg_temp_new();
+
+/*
+ * load gpr and sign-extend to 64 bits,
+ * then truncate to SEW bits when storing to vreg.
+ */
+gen_get_gpr(s1, a->rs1);
+tcg_gen_ext_tl_i64(t1, s1);
 vec_element_storei(s, a->rd, 0, t1);
 tcg_temp_free_i64(t1);
+tcg_temp_free(s1);
 mark_vs_dirty(s);
-done:
 gen_set_label(over);
 return true;
 }
-- 
2.17.1
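
The two comments in the patch above ("sign-extend to 64 bits, then truncate") can be illustrated with plain integer arithmetic. This is a minimal sketch assuming an RV32 hart (XLEN = 32). low_mask(), sext_element(), vmv_x_s_rv32() and vmv_s_x_rv32() are hypothetical names, not QEMU functions, and the sketch models only the value conversion, not the TCG ops.

```c
#include <assert.h>
#include <stdint.h>

/* Mask covering the low `bits` bits; bits must be 8, 16, 32 or 64. */
static uint64_t low_mask(unsigned bits)
{
    assert(bits == 8 || bits == 16 || bits == 32 || bits == 64);
    return bits == 64 ? ~0ULL : (1ULL << bits) - 1;
}

/* Sign-extend the low sew_bits of a raw element to 64 bits. */
static int64_t sext_element(uint64_t raw, unsigned sew_bits)
{
    uint64_t field = raw & low_mask(sew_bits);
    uint64_t sign = 1ULL << (sew_bits - 1);
    return (int64_t)((field ^ sign) - sign);
}

/* vmv.x.s on RV32: x[rd] = low XLEN bits of sign-extended vs2[0]. */
static uint32_t vmv_x_s_rv32(uint64_t elem0, unsigned sew_bits)
{
    return (uint32_t)sext_element(elem0, sew_bits);
}

/* vmv.s.x on RV32: vd[0] = low SEW bits of sign-extended x[rs1]. */
static uint64_t vmv_s_x_rv32(uint32_t gpr, unsigned sew_bits)
{
    int64_t wide = (int32_t)gpr;     /* sign-extend XLEN to 64 bits */
    return (uint64_t)wide & low_mask(sew_bits);
}
```

So an 8-bit element 0x80 reads back as 0xffffff80 in the GPR, and writing the GPR value 0x80000000 with SEW = 64 stores 0xffffffff80000000 in vd[0].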




[RFC v4 36/70] target/riscv: rvv-1.0: floating-point move instruction

2020-08-17 Thread frank . chang
From: Frank Chang 

NaN-box the scalar floating-point register according to the RVV 1.0 rules.

Signed-off-by: Frank Chang 
---
 target/riscv/insn_trans/trans_rvv.inc.c | 11 +--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/target/riscv/insn_trans/trans_rvv.inc.c b/target/riscv/insn_trans/trans_rvv.inc.c
index 92d34be5a99..7a12b89dc13 100644
--- a/target/riscv/insn_trans/trans_rvv.inc.c
+++ b/target/riscv/insn_trans/trans_rvv.inc.c
@@ -2689,12 +2689,17 @@ GEN_OPFVF_TRANS(vfmerge_vfm,  opfvf_check)
 static bool trans_vfmv_v_f(DisasContext *s, arg_vfmv_v_f *a)
 {
 if (require_rvv(s) &&
+has_ext(s, RVF) &&
 vext_check_isa_ill(s) &&
 require_align(a->rd, 1 << s->lmul) &&
 (s->sew != 0)) {
+TCGv_i64 t1 = tcg_temp_local_new_i64();
+/* NaN-box f[rs1] */
+do_nanbox(s, t1, cpu_fpr[a->rs1]);
+
 if (s->vl_eq_vlmax) {
 tcg_gen_gvec_dup_i64(s->sew, vreg_ofs(s, a->rd),
- MAXSZ(s), MAXSZ(s), cpu_fpr[a->rs1]);
+ MAXSZ(s), MAXSZ(s), t1);
 mark_vs_dirty(s);
 } else {
 TCGv_ptr dest;
@@ -2711,13 +2716,15 @@ static bool trans_vfmv_v_f(DisasContext *s, arg_vfmv_v_f *a)
 dest = tcg_temp_new_ptr();
 desc = tcg_const_i32(simd_desc(0, s->vlen / 8, data));
 tcg_gen_addi_ptr(dest, cpu_env, vreg_ofs(s, a->rd));
-fns[s->sew - 1](dest, cpu_fpr[a->rs1], cpu_env, desc);
+
+fns[s->sew - 1](dest, t1, cpu_env, desc);
 
 tcg_temp_free_ptr(dest);
 tcg_temp_free_i32(desc);
 mark_vs_dirty(s);
 gen_set_label(over);
 }
+tcg_temp_free_i64(t1);
 return true;
 }
 return false;
-- 
2.17.1
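
The body of do_nanbox() is not part of the hunks above, so the following is only a sketch of the general RISC-V NaN-boxing rule for a single-precision value held in a 64-bit FP register. It is an assumption that the SEW = 32 case behaves this way. All names here are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define CANONICAL_NAN_F32 0x7fc00000u

/* Box a 32-bit float pattern: upper 32 bits are all ones. */
static uint64_t nanbox_f32(uint32_t bits)
{
    return 0xffffffff00000000ULL | bits;
}

/* Unbox: a value that is not properly boxed reads as the canonical NaN. */
static uint32_t check_nanbox_f32(uint64_t reg)
{
    if ((reg >> 32) == 0xffffffffu) {
        return (uint32_t)reg;
    }
    return CANONICAL_NAN_F32;
}

/* Normalize a register so that narrower consumers always see a valid box. */
static uint64_t rebox_f32(uint64_t reg)
{
    uint64_t out = nanbox_f32(check_nanbox_f32(reg));
    assert((out >> 32) == 0xffffffffu);
    return out;
}
```

A register holding 0x000000003f800000 (1.0f without the box) is therefore treated as NaN rather than as 1.0f, which is the case the temporary t1 in the patch guards against before the value is splatted.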




[RFC v4 34/70] target/riscv: rvv-1.0: register gather instructions

2020-08-17 Thread frank . chang
From: Frank Chang 

* Add vrgatherei16.vv instruction.

Signed-off-by: Frank Chang 
---
 target/riscv/helper.h   |  4 
 target/riscv/insn32.decode  |  1 +
 target/riscv/insn_trans/trans_rvv.inc.c | 21 +++--
 target/riscv/vector_helper.c| 23 ++-
 4 files changed, 38 insertions(+), 11 deletions(-)

diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index a5d58010134..35fb09d2892 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -1105,6 +1105,10 @@ DEF_HELPER_6(vrgather_vv_b, void, ptr, ptr, ptr, ptr, env, i32)
 DEF_HELPER_6(vrgather_vv_h, void, ptr, ptr, ptr, ptr, env, i32)
 DEF_HELPER_6(vrgather_vv_w, void, ptr, ptr, ptr, ptr, env, i32)
 DEF_HELPER_6(vrgather_vv_d, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(vrgatherei16_vv_b, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(vrgatherei16_vv_h, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(vrgatherei16_vv_w, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(vrgatherei16_vv_d, void, ptr, ptr, ptr, ptr, env, i32)
 DEF_HELPER_6(vrgather_vx_b, void, ptr, ptr, tl, ptr, env, i32)
 DEF_HELPER_6(vrgather_vx_h, void, ptr, ptr, tl, ptr, env, i32)
 DEF_HELPER_6(vrgather_vx_w, void, ptr, ptr, tl, ptr, env, i32)
diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
index 15afc469cb0..67306ac7161 100644
--- a/target/riscv/insn32.decode
+++ b/target/riscv/insn32.decode
@@ -609,6 +609,7 @@ vslidedown_vx   00 . . . 100 . 1010111 @r_vm
 vslidedown_vi   00 . . . 011 . 1010111 @r_vm
 vslide1down_vx  00 . . . 110 . 1010111 @r_vm
 vrgather_vv 001100 . . . 000 . 1010111 @r_vm
+vrgatherei16_vv 001110 . . . 000 . 1010111 @r_vm
 vrgather_vx 001100 . . . 100 . 1010111 @r_vm
 vrgather_vi 001100 . . . 011 . 1010111 @r_vm
 vcompress_vm010111 - . . 010 . 1010111 @r
diff --git a/target/riscv/insn_trans/trans_rvv.inc.c b/target/riscv/insn_trans/trans_rvv.inc.c
index be5149fa762..392a1eba6b9 100644
--- a/target/riscv/insn_trans/trans_rvv.inc.c
+++ b/target/riscv/insn_trans/trans_rvv.inc.c
@@ -3319,7 +3319,21 @@ static bool vrgather_vv_check(DisasContext *s, arg_rmrr *a)
require_vm(a->vm, a->rd);
 }
 
+static bool vrgatherei16_vv_check(DisasContext *s, arg_rmrr *a)
+{
+int8_t emul = 4 - (s->sew + 3) + s->lmul;
+return require_rvv(s) &&
+   vext_check_isa_ill(s) &&
+   (emul >= -3 && emul <= 3) &&
+   require_align(a->rd, 1 << s->lmul) &&
+   require_align(a->rs1, 1 << emul) &&
+   require_align(a->rs2, 1 << s->lmul) &&
+   (a->rd != a->rs2 && a->rd != a->rs1) &&
+   require_vm(a->vm, a->rd);
+}
+
 GEN_OPIVV_TRANS(vrgather_vv, vrgather_vv_check)
+GEN_OPIVV_TRANS(vrgatherei16_vv, vrgatherei16_vv_check)
 
 static bool vrgather_vx_check(DisasContext *s, arg_rmrr *a)
 {
@@ -3339,7 +3353,8 @@ static bool trans_vrgather_vx(DisasContext *s, arg_rmrr *a)
 }
 
 if (a->vm && s->vl_eq_vlmax) {
-int vlmax = s->vlen;
+int scale = s->lmul - (s->sew + 3);
+int vlmax = scale < 0 ? s->vlen >> -scale : s->vlen << scale;
 TCGv_i64 dest = tcg_temp_new_i64();
 
 if (a->rs1 == 0) {
@@ -3370,7 +3385,9 @@ static bool trans_vrgather_vi(DisasContext *s, arg_rmrr *a)
 }
 
 if (a->vm && s->vl_eq_vlmax) {
-if (a->rs1 >= s->vlen) {
+int scale = s->lmul - (s->sew + 3);
+int vlmax = scale < 0 ? s->vlen >> -scale : s->vlen << scale;
+if (a->rs1 >= vlmax) {
 tcg_gen_gvec_dup_imm(SEW64, vreg_ofs(s, a->rd),
  MAXSZ(s), MAXSZ(s), 0);
 } else {
diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c
index 716e1926ee2..26a8ac6fe25 100644
--- a/target/riscv/vector_helper.c
+++ b/target/riscv/vector_helper.c
@@ -4693,11 +4693,11 @@ GEN_VEXT_VSLIDE1DOWN_VX(vslide1down_vx_w, uint32_t, H4)
 GEN_VEXT_VSLIDE1DOWN_VX(vslide1down_vx_d, uint64_t, H8)
 
 /* Vector Register Gather Instruction */
-#define GEN_VEXT_VRGATHER_VV(NAME, ETYPE, H)  \
+#define GEN_VEXT_VRGATHER_VV(NAME, TS1, TS2, HS1, HS2)\
 void HELPER(NAME)(void *vd, void *v0, void *vs1, void *vs2,   \
   CPURISCVState *env, uint32_t desc)  \
 { \
-uint32_t vlmax = vext_max_elems(desc, ctzl(sizeof(ETYPE)));   \
+uint32_t vlmax = vext_max_elems(desc, ctzl(sizeof(TS1))); \
 uint32_t vm = vext_vm(desc);  
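
The vlmax and EMUL arithmetic used in the translation checks above is easy to get wrong by one, so here is a small worked sketch. It assumes, as in the patch, that sew holds log2(SEW / 8) and lmul holds log2(LMUL) as a signed value. The function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* VLMAX = VLEN * LMUL / SEW, computed with shifts as in trans_vrgather_vx. */
static int vlmax_from(int vlen, int sew, int lmul)
{
    assert(vlen > 0 && sew >= 0 && sew <= 3 && lmul >= -3 && lmul <= 3);
    int scale = lmul - (sew + 3);
    return scale < 0 ? vlen >> -scale : vlen << scale;
}

/* log2(EMUL) of the 16-bit index operand: log2(16) - log2(SEW) + log2(LMUL). */
static int ei16_emul(int sew, int lmul)
{
    return 4 - (sew + 3) + lmul;
}

/* The index register group must itself have a legal EMUL (1/8 .. 8). */
static bool ei16_emul_ok(int sew, int lmul)
{
    int emul = ei16_emul(sew, lmul);
    return emul >= -3 && emul <= 3;
}
```

With VLEN = 128, SEW = 32 and LMUL = 1 this gives VLMAX = 4. With SEW = 8 and LMUL = 8 the index operand would need EMUL = 16, so vrgatherei16_vv_check() has to reject it.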

[RFC v4 54/70] target/riscv: rvv-1.0: narrowing fixed-point clip instructions

2020-08-17 Thread frank . chang
From: Frank Chang 

Signed-off-by: Frank Chang 
Reviewed-by: Richard Henderson 
---
 target/riscv/helper.h   | 24 ++--
 target/riscv/insn32.decode  | 12 +++---
 target/riscv/insn_trans/trans_rvv.inc.c | 12 +++---
 target/riscv/vector_helper.c| 52 -
 4 files changed, 50 insertions(+), 50 deletions(-)

diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index 6d98de1be15..0a21440d98d 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -786,18 +786,18 @@ DEF_HELPER_6(vssra_vx_h, void, ptr, ptr, tl, ptr, env, i32)
 DEF_HELPER_6(vssra_vx_w, void, ptr, ptr, tl, ptr, env, i32)
 DEF_HELPER_6(vssra_vx_d, void, ptr, ptr, tl, ptr, env, i32)
 
-DEF_HELPER_6(vnclip_vv_b, void, ptr, ptr, ptr, ptr, env, i32)
-DEF_HELPER_6(vnclip_vv_h, void, ptr, ptr, ptr, ptr, env, i32)
-DEF_HELPER_6(vnclip_vv_w, void, ptr, ptr, ptr, ptr, env, i32)
-DEF_HELPER_6(vnclipu_vv_b, void, ptr, ptr, ptr, ptr, env, i32)
-DEF_HELPER_6(vnclipu_vv_h, void, ptr, ptr, ptr, ptr, env, i32)
-DEF_HELPER_6(vnclipu_vv_w, void, ptr, ptr, ptr, ptr, env, i32)
-DEF_HELPER_6(vnclipu_vx_b, void, ptr, ptr, tl, ptr, env, i32)
-DEF_HELPER_6(vnclipu_vx_h, void, ptr, ptr, tl, ptr, env, i32)
-DEF_HELPER_6(vnclipu_vx_w, void, ptr, ptr, tl, ptr, env, i32)
-DEF_HELPER_6(vnclip_vx_b, void, ptr, ptr, tl, ptr, env, i32)
-DEF_HELPER_6(vnclip_vx_h, void, ptr, ptr, tl, ptr, env, i32)
-DEF_HELPER_6(vnclip_vx_w, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vnclip_wv_b, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(vnclip_wv_h, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(vnclip_wv_w, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(vnclipu_wv_b, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(vnclipu_wv_h, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(vnclipu_wv_w, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(vnclipu_wx_b, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vnclipu_wx_h, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vnclipu_wx_w, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vnclip_wx_b, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vnclip_wx_h, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vnclip_wx_w, void, ptr, ptr, tl, ptr, env, i32)
 
 DEF_HELPER_6(vfadd_vv_h, void, ptr, ptr, ptr, ptr, env, i32)
 DEF_HELPER_6(vfadd_vv_w, void, ptr, ptr, ptr, ptr, env, i32)
diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
index d181db197ef..39565ef047c 100644
--- a/target/riscv/insn32.decode
+++ b/target/riscv/insn32.decode
@@ -487,12 +487,12 @@ vssrl_vi101010 . . . 011 . 1010111 @r_vm
 vssra_vv101011 . . . 000 . 1010111 @r_vm
 vssra_vx101011 . . . 100 . 1010111 @r_vm
 vssra_vi101011 . . . 011 . 1010111 @r_vm
-vnclipu_vv  101110 . . . 000 . 1010111 @r_vm
-vnclipu_vx  101110 . . . 100 . 1010111 @r_vm
-vnclipu_vi  101110 . . . 011 . 1010111 @r_vm
-vnclip_vv   10 . . . 000 . 1010111 @r_vm
-vnclip_vx   10 . . . 100 . 1010111 @r_vm
-vnclip_vi   10 . . . 011 . 1010111 @r_vm
+vnclipu_wv  101110 . . . 000 . 1010111 @r_vm
+vnclipu_wx  101110 . . . 100 . 1010111 @r_vm
+vnclipu_wi  101110 . . . 011 . 1010111 @r_vm
+vnclip_wv   10 . . . 000 . 1010111 @r_vm
+vnclip_wx   10 . . . 100 . 1010111 @r_vm
+vnclip_wi   10 . . . 011 . 1010111 @r_vm
 vfadd_vv00 . . . 001 . 1010111 @r_vm
 vfadd_vf00 . . . 101 . 1010111 @r_vm
 vfsub_vv10 . . . 001 . 1010111 @r_vm
diff --git a/target/riscv/insn_trans/trans_rvv.inc.c b/target/riscv/insn_trans/trans_rvv.inc.c
index c452292652c..41a60cf2fb9 100644
--- a/target/riscv/insn_trans/trans_rvv.inc.c
+++ b/target/riscv/insn_trans/trans_rvv.inc.c
@@ -2376,12 +2376,12 @@ GEN_OPIVI_TRANS(vssrl_vi, IMM_ZX, vssrl_vx, opivx_check)
 GEN_OPIVI_TRANS(vssra_vi, IMM_SX, vssra_vx, opivx_check)
 
 /* Vector Narrowing Fixed-Point Clip Instructions */
-GEN_OPIVV_NARROW_TRANS(vnclipu_vv)
-GEN_OPIVV_NARROW_TRANS(vnclip_vv)
-GEN_OPIVX_NARROW_TRANS(vnclipu_vx)
-GEN_OPIVX_NARROW_TRANS(vnclip_vx)
-GEN_OPIVI_NARROW_TRANS(vnclipu_vi, IMM_ZX, vnclipu_vx)
-GEN_OPIVI_NARROW_TRANS(vnclip_vi, IMM_ZX, vnclip_vx)
+GEN_OPIWV_NARROW_TRANS(vnclipu_wv)
+GEN_OPIWV_NARROW_TRANS(vnclip_wv)
+GEN_OPIWX_NARROW_TRANS(vnclipu_wx)
+GEN_OPIWX_NARROW_TRANS(vnclip_wx)
+GEN_OPIWI_NARROW_TRANS(vnclipu_wi, IMM_ZX, vnclipu_wx)
+GEN_OPIWI_NARROW_TRANS(vnclip_wi, IMM_ZX, vnclip_wx)
 
 /*
  *** Vector Float Point Arithmetic Instructions
diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c
index 09f0b03e2c5..15a646af361 100644
--- a/target/riscv/vector_helper.c
+++ b/target/riscv/vector_helper.c
@@ -3128,19 +3128,19 @@ vnclip32(CPURISCVState *env, int vxrm, int64_t a, int32_t b

[RFC v4 62/70] target/riscv: introduce floating-point rounding mode enum

2020-08-17 Thread frank . chang
From: Frank Chang 

Signed-off-by: Frank Chang 
---
 target/riscv/fpu_helper.c   | 12 ++--
 target/riscv/insn_trans/trans_rvv.inc.c | 18 +-
 target/riscv/internals.h|  9 +
 3 files changed, 24 insertions(+), 15 deletions(-)

diff --git a/target/riscv/fpu_helper.c b/target/riscv/fpu_helper.c
index bb346a82499..92e076c6ed8 100644
--- a/target/riscv/fpu_helper.c
+++ b/target/riscv/fpu_helper.c
@@ -55,23 +55,23 @@ void helper_set_rounding_mode(CPURISCVState *env, uint32_t rm)
 {
 int softrm;
 
-if (rm == 7) {
+if (rm == FRM_DYN) {
 rm = env->frm;
 }
 switch (rm) {
-case 0:
+case FRM_RNE:
 softrm = float_round_nearest_even;
 break;
-case 1:
+case FRM_RTZ:
 softrm = float_round_to_zero;
 break;
-case 2:
+case FRM_RDN:
 softrm = float_round_down;
 break;
-case 3:
+case FRM_RUP:
 softrm = float_round_up;
 break;
-case 4:
+case FRM_RMM:
 softrm = float_round_ties_away;
 break;
 default:
diff --git a/target/riscv/insn_trans/trans_rvv.inc.c b/target/riscv/insn_trans/trans_rvv.inc.c
index 4f33c42990e..c148ed40c9f 100644
--- a/target/riscv/insn_trans/trans_rvv.inc.c
+++ b/target/riscv/insn_trans/trans_rvv.inc.c
@@ -2430,7 +2430,7 @@ static bool trans_##NAME(DisasContext *s, arg_rmrr *a) \
 gen_helper_##NAME##_d, \
 }; \
 TCGLabel *over = gen_new_label();  \
-gen_set_rm(s, 7);  \
+gen_set_rm(s, FRM_DYN);\
 tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_vl, 0, over);  \
\
 data = FIELD_DP32(data, VDATA, VM, a->vm); \
@@ -2510,7 +2510,7 @@ static bool trans_##NAME(DisasContext *s, arg_rmrr *a) \
 gen_helper_##NAME##_w,\
 gen_helper_##NAME##_d,\
 };\
-gen_set_rm(s, 7); \
+gen_set_rm(s, FRM_DYN);   \
 data = FIELD_DP32(data, VDATA, VM, a->vm);\
 data = FIELD_DP32(data, VDATA, LMUL, s->lmul);\
 return opfvf_trans(a->rd, a->rs1, a->rs2, data,   \
@@ -2542,7 +2542,7 @@ static bool trans_##NAME(DisasContext *s, arg_rmrr *a) \
 gen_helper_##NAME##_h, gen_helper_##NAME##_w,\
 };   \
 TCGLabel *over = gen_new_label();\
-gen_set_rm(s, 7);\
+gen_set_rm(s, FRM_DYN);  \
 tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_vl, 0, over);\
  \
 data = FIELD_DP32(data, VDATA, VM, a->vm);   \
@@ -2578,7 +2578,7 @@ static bool trans_##NAME(DisasContext *s, arg_rmrr *a) \
 static gen_helper_opfvf *const fns[2] = {\
 gen_helper_##NAME##_h, gen_helper_##NAME##_w,\
 };   \
-gen_set_rm(s, 7);\
+gen_set_rm(s, FRM_DYN);  \
 data = FIELD_DP32(data, VDATA, VM, a->vm);   \
 data = FIELD_DP32(data, VDATA, LMUL, s->lmul);   \
 return opfvf_trans(a->rd, a->rs1, a->rs2, data,  \
@@ -2608,7 +2608,7 @@ static bool trans_##NAME(DisasContext *s, arg_rmrr *a) \
 gen_helper_##NAME##_h, gen_helper_##NAME##_w,  \
 }; \
 TCGLabel *over = gen_new_label();  \
-gen_set_rm(s, 7);  \
+gen_set_rm(s, FRM_DYN);\
 tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_vl, 0, over);  \
\
 data = FIELD_DP32(data, VDATA, VM, a->vm); \
@@ -2644,7 +2644,7 @@ static bool trans_##NAME(DisasContext *s, arg_rmrr *a) \
 static gen_helper_opfvf *const fns[2] = {\
 gen_helper_##NAME##_h, gen_helper_##NAME##_w,\
 };   \
-gen_set_rm(s, 7);\
+gen_set_rm(s, FRM_DYN);   
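
The internals.h hunk that defines the enum is not shown above. The sketch below reconstructs the values from the case labels that the fpu_helper.c hunk replaces (0, 1, 2, 3, 4 and 7), so treat the exact declaration as an assumption. resolve_rm() is a hypothetical helper that mirrors only the "dynamic means use frm" step. What the default branch of the real helper does is not visible in this hunk, so the sketch simply returns -1 there.

```c
#include <assert.h>

/* Values inferred from the numeric case labels replaced in the patch. */
enum {
    FRM_RNE = 0,   /* round to nearest, ties to even */
    FRM_RTZ = 1,   /* round towards zero */
    FRM_RDN = 2,   /* round down (towards -inf) */
    FRM_RUP = 3,   /* round up (towards +inf) */
    FRM_RMM = 4,   /* round to nearest, ties to max magnitude */
    FRM_DYN = 7    /* use the dynamic rounding mode in frm */
};

/* Hypothetical: effective rounding mode, or -1 for a reserved encoding. */
static int resolve_rm(int rm, int frm)
{
    assert(rm >= 0 && rm <= 7 && frm >= 0 && frm <= 7);

    if (rm == FRM_DYN) {
        rm = frm;
    }
    switch (rm) {
    case FRM_RNE:
    case FRM_RTZ:
    case FRM_RDN:
    case FRM_RUP:
    case FRM_RMM:
        return rm;
    default:
        return -1;
    }
}
```

Every gen_set_rm(s, FRM_DYN) call in the vector translation macros corresponds to the first branch: the instruction carries no static rounding mode, so the value in frm decides.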

[RFC v4 42/70] target/riscv: rvv-1.0: integer add-with-carry/subtract-with-borrow

2020-08-17 Thread frank . chang
From: Frank Chang 

Clear tail elements only if VTA is agnostic.

Signed-off-by: Frank Chang 
---
 target/riscv/insn32.decode  | 20 ++--
 target/riscv/insn_trans/trans_rvv.inc.c |  2 +-
 target/riscv/vector_helper.c| 14 --
 3 files changed, 15 insertions(+), 21 deletions(-)

diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
index fd00ee6fdca..e62bad906a3 100644
--- a/target/riscv/insn32.decode
+++ b/target/riscv/insn32.decode
@@ -342,16 +342,16 @@ vwsubu_wv   110110 . . . 010 . 1010111 @r_vm
 vwsubu_wx   110110 . . . 110 . 1010111 @r_vm
 vwsub_wv110111 . . . 010 . 1010111 @r_vm
 vwsub_wx110111 . . . 110 . 1010111 @r_vm
-vadc_vvm01 1 . . 000 . 1010111 @r_vm_1
-vadc_vxm01 1 . . 100 . 1010111 @r_vm_1
-vadc_vim01 1 . . 011 . 1010111 @r_vm_1
-vmadc_vvm   010001 1 . . 000 . 1010111 @r_vm_1
-vmadc_vxm   010001 1 . . 100 . 1010111 @r_vm_1
-vmadc_vim   010001 1 . . 011 . 1010111 @r_vm_1
-vsbc_vvm010010 1 . . 000 . 1010111 @r_vm_1
-vsbc_vxm010010 1 . . 100 . 1010111 @r_vm_1
-vmsbc_vvm   010011 1 . . 000 . 1010111 @r_vm_1
-vmsbc_vxm   010011 1 . . 100 . 1010111 @r_vm_1
+vadc_vvm01 0 . . 000 . 1010111 @r_vm_1
+vadc_vxm01 0 . . 100 . 1010111 @r_vm_1
+vadc_vim01 0 . . 011 . 1010111 @r_vm_1
+vmadc_vvm   010001 . . . 000 . 1010111 @r_vm
+vmadc_vxm   010001 . . . 100 . 1010111 @r_vm
+vmadc_vim   010001 . . . 011 . 1010111 @r_vm
+vsbc_vvm010010 0 . . 000 . 1010111 @r_vm_1
+vsbc_vxm010010 0 . . 100 . 1010111 @r_vm_1
+vmsbc_vvm   010011 . . . 000 . 1010111 @r_vm
+vmsbc_vxm   010011 . . . 100 . 1010111 @r_vm
 vand_vv 001001 . . . 000 . 1010111 @r_vm
 vand_vx 001001 . . . 100 . 1010111 @r_vm
 vand_vi 001001 . . . 011 . 1010111 @r_vm
diff --git a/target/riscv/insn_trans/trans_rvv.inc.c b/target/riscv/insn_trans/trans_rvv.inc.c
index b763c3956cb..c8ebfa6c3f5 100644
--- a/target/riscv/insn_trans/trans_rvv.inc.c
+++ b/target/riscv/insn_trans/trans_rvv.inc.c
@@ -1774,7 +1774,7 @@ static bool trans_##NAME(DisasContext *s, arg_rmrr *a) \
 
 /*
  * For vadc and vsbc, an illegal instruction exception is raised if the
- * destination vector register is v0 and LMUL > 1. (Section 12.3)
+ * destination vector register is v0 and LMUL > 1. (Section 12.4)
  */
 static bool opivv_vadc_check(DisasContext *s, arg_rmrr *a)
 {
diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c
index ace6fcd28d8..70394611b21 100644
--- a/target/riscv/vector_helper.c
+++ b/target/riscv/vector_helper.c
@@ -1205,19 +1205,16 @@ void HELPER(NAME)(void *vd, void *v0, void *vs1, void *vs2,   \
   CPURISCVState *env, uint32_t desc)  \
 { \
 uint32_t vl = env->vl;\
-uint32_t vlmax = vext_maxsz(desc) / sizeof(ETYPE);\
+uint32_t vm = vext_vm(desc);  \
 uint32_t i;   \
   \
 for (i = 0; i < vl; i++) {\
 ETYPE s1 = *((ETYPE *)vs1 + H(i));\
 ETYPE s2 = *((ETYPE *)vs2 + H(i));\
-uint8_t carry = vext_elem_mask(v0, i);\
+uint8_t carry = !vm ? vext_elem_mask(v0, i) : 0;  \
   \
 vext_set_elem_mask(vd, i, DO_OP(s2, s1, carry));  \
 } \
-for (; i < vlmax; i++) {  \
-vext_set_elem_mask(vd, i, 0); \
-} \
 }
 
 GEN_VEXT_VMADC_VVM(vmadc_vvm_b, uint8_t,  H1, DO_MADC)
@@ -1235,19 +1232,16 @@ void HELPER(NAME)(void *vd, void *v0, target_ulong s1, \
   void *vs2, CPURISCVState *env, uint32_t desc) \
 {   \
 uint32_t vl = env->vl;  \
-uint32_t vlmax = vext_max_elems(desc, ctzl(sizeof(ETYPE))); \
+uint32_t vm = vext_vm(desc);\
 uint32_t i; \
 \
 for (i = 0; i < vl; i++) {  \
 ETYPE s2 =
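
The effect of the new "!vm ? vext_elem_mask(v0, i) : 0" expression in the vmadc helpers can be shown on a single 8-bit element. This is a minimal sketch, not the DO_MADC macro from vector_helper.c: madc_u8() and vmadc_elem() are hypothetical names, and the carry-out is computed in a wider type for clarity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Carry-out of an 8-bit add with carry-in, computed in a wider type. */
static bool madc_u8(uint8_t s2, uint8_t s1, bool carry_in)
{
    unsigned sum = (unsigned)s2 + s1 + (carry_in ? 1u : 0u);
    assert(sum <= 2u * UINT8_MAX + 1u);
    return sum > UINT8_MAX;
}

/*
 * One element of vmadc: when vm is clear, the v0 mask bit supplies the
 * carry-in; when vm is set, there is no carry-in at all.
 */
static bool vmadc_elem(uint8_t s2, uint8_t s1, bool vm, bool mask_bit)
{
    bool carry = !vm ? mask_bit : false;
    return madc_u8(s2, s1, carry);
}
```

For instance, 0xff + 0x00 produces a carry-out only when vm is clear and the mask bit is set, which matches the decode change that lets vmadc and vmsbc accept either vm value.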

[RFC v4 58/70] target/riscv: rvv-1.0: remove widening saturating scaled multiply-add

2020-08-17 Thread frank . chang
From: Frank Chang 

Signed-off-by: Frank Chang 
Reviewed-by: Richard Henderson 
---
 target/riscv/helper.h   |  22 ---
 target/riscv/insn32.decode  |   7 -
 target/riscv/insn_trans/trans_rvv.inc.c |   9 --
 target/riscv/vector_helper.c| 205 
 4 files changed, 243 deletions(-)

diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index 0a21440d98d..ac655b8f274 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -747,28 +747,6 @@ DEF_HELPER_6(vsmul_vx_h, void, ptr, ptr, tl, ptr, env, i32)
 DEF_HELPER_6(vsmul_vx_w, void, ptr, ptr, tl, ptr, env, i32)
 DEF_HELPER_6(vsmul_vx_d, void, ptr, ptr, tl, ptr, env, i32)
 
-DEF_HELPER_6(vwsmaccu_vv_b, void, ptr, ptr, ptr, ptr, env, i32)
-DEF_HELPER_6(vwsmaccu_vv_h, void, ptr, ptr, ptr, ptr, env, i32)
-DEF_HELPER_6(vwsmaccu_vv_w, void, ptr, ptr, ptr, ptr, env, i32)
-DEF_HELPER_6(vwsmacc_vv_b, void, ptr, ptr, ptr, ptr, env, i32)
-DEF_HELPER_6(vwsmacc_vv_h, void, ptr, ptr, ptr, ptr, env, i32)
-DEF_HELPER_6(vwsmacc_vv_w, void, ptr, ptr, ptr, ptr, env, i32)
-DEF_HELPER_6(vwsmaccsu_vv_b, void, ptr, ptr, ptr, ptr, env, i32)
-DEF_HELPER_6(vwsmaccsu_vv_h, void, ptr, ptr, ptr, ptr, env, i32)
-DEF_HELPER_6(vwsmaccsu_vv_w, void, ptr, ptr, ptr, ptr, env, i32)
-DEF_HELPER_6(vwsmaccu_vx_b, void, ptr, ptr, tl, ptr, env, i32)
-DEF_HELPER_6(vwsmaccu_vx_h, void, ptr, ptr, tl, ptr, env, i32)
-DEF_HELPER_6(vwsmaccu_vx_w, void, ptr, ptr, tl, ptr, env, i32)
-DEF_HELPER_6(vwsmacc_vx_b, void, ptr, ptr, tl, ptr, env, i32)
-DEF_HELPER_6(vwsmacc_vx_h, void, ptr, ptr, tl, ptr, env, i32)
-DEF_HELPER_6(vwsmacc_vx_w, void, ptr, ptr, tl, ptr, env, i32)
-DEF_HELPER_6(vwsmaccsu_vx_b, void, ptr, ptr, tl, ptr, env, i32)
-DEF_HELPER_6(vwsmaccsu_vx_h, void, ptr, ptr, tl, ptr, env, i32)
-DEF_HELPER_6(vwsmaccsu_vx_w, void, ptr, ptr, tl, ptr, env, i32)
-DEF_HELPER_6(vwsmaccus_vx_b, void, ptr, ptr, tl, ptr, env, i32)
-DEF_HELPER_6(vwsmaccus_vx_h, void, ptr, ptr, tl, ptr, env, i32)
-DEF_HELPER_6(vwsmaccus_vx_w, void, ptr, ptr, tl, ptr, env, i32)
-
 DEF_HELPER_6(vssrl_vv_b, void, ptr, ptr, ptr, ptr, env, i32)
 DEF_HELPER_6(vssrl_vv_h, void, ptr, ptr, ptr, ptr, env, i32)
 DEF_HELPER_6(vssrl_vv_w, void, ptr, ptr, ptr, ptr, env, i32)
diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
index 39565ef047c..99320705cca 100644
--- a/target/riscv/insn32.decode
+++ b/target/riscv/insn32.decode
@@ -474,13 +474,6 @@ vasubu_vv   001010 . . . 010 . 1010111 @r_vm
 vasubu_vx   001010 . . . 110 . 1010111 @r_vm
 vsmul_vv100111 . . . 000 . 1010111 @r_vm
 vsmul_vx100111 . . . 100 . 1010111 @r_vm
-vwsmaccu_vv 00 . . . 000 . 1010111 @r_vm
-vwsmaccu_vx 00 . . . 100 . 1010111 @r_vm
-vwsmacc_vv  01 . . . 000 . 1010111 @r_vm
-vwsmacc_vx  01 . . . 100 . 1010111 @r_vm
-vwsmaccsu_vv10 . . . 000 . 1010111 @r_vm
-vwsmaccsu_vx10 . . . 100 . 1010111 @r_vm
-vwsmaccus_vx11 . . . 100 . 1010111 @r_vm
 vssrl_vv101010 . . . 000 . 1010111 @r_vm
 vssrl_vx101010 . . . 100 . 1010111 @r_vm
 vssrl_vi101010 . . . 011 . 1010111 @r_vm
diff --git a/target/riscv/insn_trans/trans_rvv.inc.c b/target/riscv/insn_trans/trans_rvv.inc.c
index 9c92ad62915..d3b1499c64c 100644
--- a/target/riscv/insn_trans/trans_rvv.inc.c
+++ b/target/riscv/insn_trans/trans_rvv.inc.c
@@ -2358,15 +2358,6 @@ GEN_OPIVX_TRANS(vasubu_vx,  opivx_check)
 GEN_OPIVV_TRANS(vsmul_vv, opivv_check)
 GEN_OPIVX_TRANS(vsmul_vx,  opivx_check)
 
-/* Vector Widening Saturating Scaled Multiply-Add */
-GEN_OPIVV_WIDEN_TRANS(vwsmaccu_vv, opivv_widen_check)
-GEN_OPIVV_WIDEN_TRANS(vwsmacc_vv, opivv_widen_check)
-GEN_OPIVV_WIDEN_TRANS(vwsmaccsu_vv, opivv_widen_check)
-GEN_OPIVX_WIDEN_TRANS(vwsmaccu_vx)
-GEN_OPIVX_WIDEN_TRANS(vwsmacc_vx)
-GEN_OPIVX_WIDEN_TRANS(vwsmaccsu_vx)
-GEN_OPIVX_WIDEN_TRANS(vwsmaccus_vx)
-
 /* Vector Single-Width Scaling Shift Instructions */
 GEN_OPIVV_TRANS(vssrl_vv, opivv_check)
 GEN_OPIVV_TRANS(vssra_vv, opivv_check)
diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c
index 00743cbce34..1aeb3b5e4aa 100644
--- a/target/riscv/vector_helper.c
+++ b/target/riscv/vector_helper.c
@@ -2747,211 +2747,6 @@ GEN_VEXT_VX_RM(vsmul_vx_h, 2, 2)
 GEN_VEXT_VX_RM(vsmul_vx_w, 4, 4)
 GEN_VEXT_VX_RM(vsmul_vx_d, 8, 8)
 
-/* Vector Widening Saturating Scaled Multiply-Add */
-static inline uint16_t
-vwsmaccu8(CPURISCVState *env, int vxrm, uint8_t a, uint8_t b,
-  uint16_t c)
-{
-uint8_t round;
-uint16_t res = (uint16_t)a * b;
-
-round = get_round(vxrm, res, 4);
-res   = (res >> 4) + round;
-return saddu16(env, vxrm, c, res);
-}
-
-static inline uint32_t
-vwsmaccu16(CPURISCVState *env, int vxrm, uint16_t a, uint16_t b,
-   uint32_t c)
-{
-uint8_t round;
-uint32_t res = (ui

[RFC v4 63/70] target/riscv: rvv-1.0: floating-point/integer type-convert instructions

2020-08-17 Thread frank . chang
From: Frank Chang 

Add the following instructions:

* vfcvt.rtz.xu.f.v
* vfcvt.rtz.x.f.v

Also adjust GEN_OPFV_TRANS() to accept multiple floating-point rounding
modes.
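
As a rough illustration of what the "rtz" variants mean, here is a minimal standalone sketch of a round-towards-zero float-to-integer conversion. The function names are hypothetical and are not the helpers added by this patch (those are not shown in this excerpt). The sketch also assumes in-range inputs, since a plain C cast is undefined for out-of-range values.

```c
#include <stdint.h>

/*
 * Hypothetical illustration, not QEMU code.
 * A C float-to-integer cast discards the fractional part, i.e. it
 * rounds towards zero regardless of the dynamic rounding mode.
 * Callers must pass values that fit in the destination type.
 */
static int32_t cvt_rtz_i32(float x)
{
    return (int32_t)x;
}

static uint32_t cvt_rtz_u32(float x)
{
    return (uint32_t)x;
}
```

With these definitions, cvt_rtz_i32(2.7f) gives 2 and cvt_rtz_i32(-2.7f) gives -2, whereas a conversion using a round-to-nearest dynamic mode would give 3 and -3.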

Signed-off-by: Frank Chang 
---
 target/riscv/helper.h   |  6 ++
 target/riscv/insn32.decode  | 11 +++
 target/riscv/insn_trans/trans_rvv.inc.c | 18 ++
 target/riscv/vector_helper.c| 22 ++
 4 files changed, 45 insertions(+), 12 deletions(-)

diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index a9ec14c49ad..5ef37b9dc49 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -984,6 +984,12 @@ DEF_HELPER_5(vfcvt_f_xu_v_d, void, ptr, ptr, ptr, env, i32)
 DEF_HELPER_5(vfcvt_f_x_v_h, void, ptr, ptr, ptr, env, i32)
 DEF_HELPER_5(vfcvt_f_x_v_w, void, ptr, ptr, ptr, env, i32)
 DEF_HELPER_5(vfcvt_f_x_v_d, void, ptr, ptr, ptr, env, i32)
+DEF_HELPER_5(vfcvt_rtz_xu_f_v_h, void, ptr, ptr, ptr, env, i32)
+DEF_HELPER_5(vfcvt_rtz_xu_f_v_w, void, ptr, ptr, ptr, env, i32)
+DEF_HELPER_5(vfcvt_rtz_xu_f_v_d, void, ptr, ptr, ptr, env, i32)
+DEF_HELPER_5(vfcvt_rtz_x_f_v_h, void, ptr, ptr, ptr, env, i32)
+DEF_HELPER_5(vfcvt_rtz_x_f_v_w, void, ptr, ptr, ptr, env, i32)
+DEF_HELPER_5(vfcvt_rtz_x_f_v_d, void, ptr, ptr, ptr, env, i32)
 
 DEF_HELPER_5(vfwcvt_xu_f_v_h, void, ptr, ptr, ptr, env, i32)
 DEF_HELPER_5(vfwcvt_xu_f_v_w, void, ptr, ptr, ptr, env, i32)
diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
index 425cfd7cb32..c25c03dfb7c 100644
--- a/target/riscv/insn32.decode
+++ b/target/riscv/insn32.decode
@@ -556,10 +556,13 @@ vmfge_vf01 . . . 101 . 1010111 @r_vm
 vfclass_v   010011 . . 1 001 . 1010111 @r2_vm
 vfmerge_vfm 010111 0 . . 101 . 1010111 @r_vm_0
 vfmv_v_f010111 1 0 . 101 . 1010111 @r2
-vfcvt_xu_f_v100010 . . 0 001 . 1010111 @r2_vm
-vfcvt_x_f_v 100010 . . 1 001 . 1010111 @r2_vm
-vfcvt_f_xu_v100010 . . 00010 001 . 1010111 @r2_vm
-vfcvt_f_x_v 100010 . . 00011 001 . 1010111 @r2_vm
+
+vfcvt_xu_f_v   010010 . . 0 001 . 1010111 @r2_vm
+vfcvt_x_f_v010010 . . 1 001 . 1010111 @r2_vm
+vfcvt_f_xu_v   010010 . . 00010 001 . 1010111 @r2_vm
+vfcvt_f_x_v010010 . . 00011 001 . 1010111 @r2_vm
+vfcvt_rtz_xu_f_v   010010 . . 00110 001 . 1010111 @r2_vm
+vfcvt_rtz_x_f_v010010 . . 00111 001 . 1010111 @r2_vm
 vfwcvt_xu_f_v   100010 . . 01000 001 . 1010111 @r2_vm
 vfwcvt_x_f_v100010 . . 01001 001 . 1010111 @r2_vm
 vfwcvt_f_xu_v   100010 . . 01010 001 . 1010111 @r2_vm
diff --git a/target/riscv/insn_trans/trans_rvv.inc.c b/target/riscv/insn_trans/trans_rvv.inc.c
index c148ed40c9f..9cc5e2315cd 100644
--- a/target/riscv/insn_trans/trans_rvv.inc.c
+++ b/target/riscv/insn_trans/trans_rvv.inc.c
@@ -2710,7 +2710,7 @@ static bool opfv_check(DisasContext *s, arg_rmr *a)
(s->sew != 0);
 }
 
-#define GEN_OPFV_TRANS(NAME, CHECK)\
+#define GEN_OPFV_TRANS(NAME, CHECK, FRM)   \
 static bool trans_##NAME(DisasContext *s, arg_rmr *a)  \
 {  \
 if (CHECK(s, a)) { \
@@ -2721,7 +2721,7 @@ static bool trans_##NAME(DisasContext *s, arg_rmr *a)  \
 gen_helper_##NAME##_d, \
 }; \
 TCGLabel *over = gen_new_label();  \
-gen_set_rm(s, FRM_DYN);\
+gen_set_rm(s, FRM);\
 tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_vl, 0, over);  \
\
 data = FIELD_DP32(data, VDATA, VM, a->vm); \
@@ -2736,7 +2736,7 @@ static bool trans_##NAME(DisasContext *s, arg_rmr *a)  \
 return false;  \
 }
 
-GEN_OPFV_TRANS(vfsqrt_v, opfv_check)
+GEN_OPFV_TRANS(vfsqrt_v, opfv_check, FRM_DYN)
 
 /* Vector Floating-Point MIN/MAX Instructions */
 GEN_OPFVV_TRANS(vfmin_vv, opfvv_check)
@@ -2782,7 +2782,7 @@ GEN_OPFVF_TRANS(vmfgt_vf, opfvf_cmp_check)
 GEN_OPFVF_TRANS(vmfge_vf, opfvf_cmp_check)
 
 /* Vector Floating-Point Classify Instruction */
-GEN_OPFV_TRANS(vfclass_v, opfv_check)
+GEN_OPFV_TRANS(vfclass_v, opfv_check, FRM_DYN)
 
 /* Vector Floating-Point Merge Instruction */
 GEN_OPFVF_TRANS(vfmerge_vfm,  opfvf_check)
@@ -2832,10 +2832,12 @@ static bool trans_vfmv_v_f(DisasContext *s, arg_vfmv_v_f *a)
 }
 
 /* Single-Width Floating-Point/Integer Type-Convert Instructions */
-GEN_OPFV_TRANS(vfcvt_xu_f_v, opfv_check)
-GEN_OPFV_TRANS(vfcvt_x_f_v, opfv_check)
-GEN_OPFV

[RFC v4 57/70] target/riscv: rvv-1.0: single-width scaling shift instructions

2020-08-17 Thread frank . chang
From: Frank Chang 

Truncate the vssrl.vi and vssra.vi immediate values to log2(SEW) bits.

Signed-off-by: Frank Chang 
---
 target/riscv/insn_trans/trans_rvv.inc.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/target/riscv/insn_trans/trans_rvv.inc.c b/target/riscv/insn_trans/trans_rvv.inc.c
index 2ebe2373237..9c92ad62915 100644
--- a/target/riscv/insn_trans/trans_rvv.inc.c
+++ b/target/riscv/insn_trans/trans_rvv.inc.c
@@ -2372,8 +2372,8 @@ GEN_OPIVV_TRANS(vssrl_vv, opivv_check)
 GEN_OPIVV_TRANS(vssra_vv, opivv_check)
 GEN_OPIVX_TRANS(vssrl_vx,  opivx_check)
 GEN_OPIVX_TRANS(vssra_vx,  opivx_check)
-GEN_OPIVI_TRANS(vssrl_vi, IMM_ZX, vssrl_vx, opivx_check)
-GEN_OPIVI_TRANS(vssra_vi, IMM_SX, vssra_vx, opivx_check)
+GEN_OPIVI_TRANS(vssrl_vi, IMM_TRUNC_SEW, vssrl_vx, opivx_check)
+GEN_OPIVI_TRANS(vssra_vi, IMM_TRUNC_SEW, vssra_vx, opivx_check)
 
 /* Vector Narrowing Fixed-Point Clip Instructions */
 GEN_OPIWV_NARROW_TRANS(vnclipu_wv)
-- 
2.17.1
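
The truncation described in the commit message above can be sketched as follows. This is a minimal standalone sketch, not the IMM_TRUNC_SEW implementation (which is not part of this excerpt). The helper name is hypothetical, and sew_log2 is assumed to be log2 of SEW in bits (3 for SEW=8, 4 for SEW=16, and so on).

```c
#include <stdint.h>

/*
 * Hypothetical illustration, not QEMU code.
 * Keep only the low log2(SEW) bits of a shift immediate, so the shift
 * amount is always smaller than the element width.
 * sew_log2 is assumed to be in the range 3..6 (SEW = 8..64 bits).
 */
static uint32_t trunc_imm_to_sew(uint32_t imm, unsigned sew_log2)
{
    return imm & ((1u << sew_log2) - 1u);
}
```

For example, a 5-bit immediate of 31 becomes 7 for SEW=8 and 15 for SEW=16, while it is left unchanged for SEW=32.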




[RFC v4 67/70] target/riscv: rvv-1.0: relax RV_VLEN_MAX to 512-bits

2020-08-17 Thread frank . chang
From: Frank Chang 

GVEC only supports MAXSZ and OPRSZ in the range [8..256] bytes, and
LMUL can be fractional, so the maximum vector size operated on may be
less than 8 bytes or larger than 256 bytes.
Skip GVEC if the maximum vector size is < 8 bytes or > 256 bytes.

Signed-off-by: Frank Chang 

--
Relaxing the limits on MAXSZ or OPRSZ might be a better
approach.

Signed-off-by: Frank Chang 
---
 target/riscv/cpu.h  | 13 +++--
 target/riscv/insn_trans/trans_rvv.inc.c |  2 +-
 target/riscv/vector_helper.c|  2 +-
 3 files changed, 9 insertions(+), 8 deletions(-)

diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
index 6e9b17c4e38..2c7ce500fa7 100644
--- a/target/riscv/cpu.h
+++ b/target/riscv/cpu.h
@@ -92,7 +92,7 @@ typedef struct CPURISCVState CPURISCVState;
 
 #include "pmp.h"
 
-#define RV_VLEN_MAX 256
+#define RV_VLEN_MAX 512
 
 FIELD(VTYPE, VLMUL, 0, 3)
 FIELD(VTYPE, VSEW, 3, 3)
@@ -413,16 +413,17 @@ static inline void cpu_get_tb_cpu_state(CPURISCVState *env, target_ulong *pc,
 /*
  * If env->vl equals to VLMAX, we can use generic vector operation
  * expanders (GVEC) to accerlate the vector operations.
- * However, as LMUL could be a fractional number. The maximum
- * vector size can be operated might be less than 8 bytes,
- * which is not supported by GVEC. So we set vl_eq_vlmax flag to true
- * only when maxsz >= 8 bytes.
+ * However, as GVEC only supports MAXSZ and OPRSZ in the range of:
+ * [8..256] bytes and LMUL could be a fractional number. The maximum
+ * vector size can be operated might be less than 8 bytes or
+ * larger than 256 bytes. So we set vl_eq_vlmax flag to true only
+ * when maxsz >= 8 bytes and <= 256 bytes.
  */
 uint32_t vlmax = vext_get_vlmax(env_archcpu(env), env->vtype);
 uint32_t sew = FIELD_EX64(env->vtype, VTYPE, VSEW);
 uint32_t maxsz = vlmax << sew;
 bool vl_eq_vlmax = (env->vstart == 0) && (vlmax == env->vl)
-   && (maxsz >= 8);
+   && (maxsz >= 8) && (maxsz <= 256);
 flags = FIELD_DP32(flags, TB_FLAGS, VILL,
 FIELD_EX64(env->vtype, VTYPE, VILL));
 flags = FIELD_DP32(flags, TB_FLAGS, SEW, sew);
diff --git a/target/riscv/insn_trans/trans_rvv.inc.c b/target/riscv/insn_trans/trans_rvv.inc.c
index f2edf804460..9ad64762239 100644
--- a/target/riscv/insn_trans/trans_rvv.inc.c
+++ b/target/riscv/insn_trans/trans_rvv.inc.c
@@ -669,7 +669,7 @@ static bool ldst_us_trans(uint32_t vd, uint32_t rs1, uint32_t data,
 
 /*
  * As simd_desc supports at most 256 bytes, and in this implementation,
- * the max vector group length is 1024 bytes. So split it into two parts.
+ * the max vector group length is 2048 bytes. So split it into two parts.
  *
  * The first part is vlen in bytes, encoded in maxsz of simd_desc.
  * The second part is lmul, encoded in data of simd_desc.
diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c
index 316e435f8af..07d1ee60717 100644
--- a/target/riscv/vector_helper.c
+++ b/target/riscv/vector_helper.c
@@ -129,7 +129,7 @@ static uint32_t vext_wd(uint32_t desc)
 static inline uint32_t vext_max_elems(uint32_t desc, uint32_t esz)
 {
 /*
- * As simd_desc support at most 256 bytes, the max vlen is 256 bits.
+ * As simd_desc support at most 256 bytes, the max vlen is 512 bits.
  * so vlen in bytes (vlenb) is encoded as maxsz.
  */
 uint32_t vlenb = simd_maxsz(desc);
-- 
2.17.1
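
The size condition added to cpu_get_tb_cpu_state() in the patch above can be restated as a small standalone predicate. This is only a sketch: the function name is hypothetical, and vsew is assumed to be the encoded SEW field as used in the patch (0 for 1-byte elements), so that vlmax << vsew is the maximum vector size in bytes.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical illustration, not QEMU code.
 * Returns true when the maximum vector size in bytes falls inside the
 * [8..256] byte range that the commit message says GVEC supports.
 */
static bool gvec_size_ok(uint32_t vlmax, uint32_t vsew)
{
    uint32_t maxsz = vlmax << vsew;   /* elements * bytes per element */
    return maxsz >= 8 && maxsz <= 256;
}
```

With fractional LMUL, 4 one-byte elements (4 bytes) fall below the range, and 128 four-byte elements (512 bytes) fall above it, so both cases would take the non-GVEC path.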




[RFC v4 68/70] target/riscv: gdb: modify gdb csr xml file to align with csr register map

2020-08-17 Thread frank . chang
From: Hsiangkai Wang 

Signed-off-by: Hsiangkai Wang 
Signed-off-by: Frank Chang 
---
 gdb-xml/riscv-32bit-csr.xml | 11 ++-
 gdb-xml/riscv-64bit-csr.xml | 11 ++-
 target/riscv/gdbstub.c  |  4 ++--
 3 files changed, 14 insertions(+), 12 deletions(-)

diff --git a/gdb-xml/riscv-32bit-csr.xml b/gdb-xml/riscv-32bit-csr.xml
index da1bf19e2f4..3d2031da7dc 100644
--- a/gdb-xml/riscv-32bit-csr.xml
+++ b/gdb-xml/riscv-32bit-csr.xml
@@ -110,6 +110,8 @@
   
   
   
+  
+  
   
   
   
@@ -232,12 +234,11 @@
   
   
   
-  
-  
-  
-  
-  
+  
+  
   
+  
+  
   
   
   
diff --git a/gdb-xml/riscv-64bit-csr.xml b/gdb-xml/riscv-64bit-csr.xml
index 6aa4bed9f50..90394562930 100644
--- a/gdb-xml/riscv-64bit-csr.xml
+++ b/gdb-xml/riscv-64bit-csr.xml
@@ -110,6 +110,8 @@
   
   
   
+  
+  
   
   
   
@@ -232,12 +234,11 @@
   
   
   
-  
-  
-  
-  
-  
+  
+  
   
+  
+  
   
   
   
diff --git a/target/riscv/gdbstub.c b/target/riscv/gdbstub.c
index eba12a86f2e..f7c5212e274 100644
--- a/target/riscv/gdbstub.c
+++ b/target/riscv/gdbstub.c
@@ -418,13 +418,13 @@ void riscv_cpu_register_gdb_regs_for_features(CPUState *cs)
 }
 #if defined(TARGET_RISCV32)
 gdb_register_coprocessor(cs, riscv_gdb_get_csr, riscv_gdb_set_csr,
- 240, "riscv-32bit-csr.xml", 0);
+ 241, "riscv-32bit-csr.xml", 0);
 
 gdb_register_coprocessor(cs, riscv_gdb_get_virtual, riscv_gdb_set_virtual,
  1, "riscv-32bit-virtual.xml", 0);
 #elif defined(TARGET_RISCV64)
 gdb_register_coprocessor(cs, riscv_gdb_get_csr, riscv_gdb_set_csr,
- 240, "riscv-64bit-csr.xml", 0);
+ 241, "riscv-64bit-csr.xml", 0);
 
 gdb_register_coprocessor(cs, riscv_gdb_get_virtual, riscv_gdb_set_virtual,
  1, "riscv-64bit-virtual.xml", 0);
-- 
2.17.1




[RFC v4 00/70] support vector extension v1.0

2020-08-17 Thread frank . chang
From: Frank Chang 

This patchset implements the vector extension v1.0 for RISC-V on QEMU.

This patchset is sent as an RFC because RVV v1.0 is still in draft state.
The v2 patchset targeted RVV v0.9; the series has targeted RVV v1.0 since v3.

The port is available here:
https://github.com/sifive/qemu/tree/rvv-1.0-upstream-v4

You can change the cpu argument: vext_spec to v1.0 (i.e. vext_spec=v1.0)
to run with RVV v1.0 instructions.

Note: This patchset depends on the two other patchsets listed in the
  Based-on section below, so it might not build unless both are
  applied.

Changelog:

v4
  * remove explicit float flmul variable in DisasContext.
  * replace floating-point calculations with shift operations to
improve performance.
  * relax RV_VLEN_MAX to 512-bits.

v3
  * apply nan-box helpers from Richard Henderson.
  * remove fp16 api changes as they are sent independently in another
patchset by Chih-Min Chao.
  * remove all tail-element clearing functions, as tail elements can
remain unchanged whether VTA is set to undisturbed or agnostic.
  * add fp16 nan-box check generator function.
  * add floating-point rounding mode enum.
  * replace flmul arithmetic with shifts to avoid floating-point
conversions.
  * add Zvqmac extension.
  * replace gdbstub vector register xml files with dynamic generator.
  * bumped to RVV v1.0.
  * RVV v1.0 related changes:
* add vlre.v and vsr.v vector whole register
  load/store instructions
* add vrgatherei16 instruction.
* rearranged bits in vtype to make vlmul bits into a contiguous
  field.

v2
  * drop v0.7.1 support.
  * replace invisible return check macros with functions.
  * move mark_vs_dirty() to translators.
  * add SSTATUS_VS flag for s-mode.
  * nan-box scalar fp register for floating-point operations.
  * add gdbstub files for vector registers to allow system-mode
debugging with GDB.

Based-on: <20200724002807.441147-1-richard.hender...@linaro.org/>
Based-on: <1596102747-20226-1-git-send-email-chihmin.c...@sifive.com/>

Frank Chang (62):
  target/riscv: drop vector 0.7.1 and add 1.0 support
  target/riscv: Use FIELD_EX32() to extract wd field
  target/riscv: rvv-1.0: introduce writable misa.v field
  target/riscv: rvv-1.0: remove rvv related codes from fcsr registers
  target/riscv: rvv-1.0: check MSTATUS_VS when accessing vector csr
registers
  target/riscv: rvv-1.0: remove MLEN calculations
  target/riscv: rvv-1.0: add fractional LMUL
  target/riscv: rvv-1.0: add VMA and VTA
  target/riscv: rvv-1.0: update check functions
  target/riscv: introduce more imm value modes in translator functions
  target/riscv: rvv:1.0: add translation-time nan-box helper function
  target/riscv: rvv-1.0: configure instructions
  target/riscv: rvv-1.0: stride load and store instructions
  target/riscv: rvv-1.0: index load and store instructions
  target/riscv: rvv-1.0: fix address index overflow bug of indexed
load/store insns
  target/riscv: rvv-1.0: fault-only-first unit stride load
  target/riscv: rvv-1.0: amo operations
  target/riscv: rvv-1.0: load/store whole register instructions
  target/riscv: rvv-1.0: update vext_max_elems() for load/store insns
  target/riscv: rvv-1.0: take fractional LMUL into vector max elements
calculation
  target/riscv: rvv-1.0: floating-point square-root instruction
  target/riscv: rvv-1.0: floating-point classify instructions
  target/riscv: rvv-1.0: mask population count instruction
  target/riscv: rvv-1.0: find-first-set mask bit instruction
  target/riscv: rvv-1.0: set-X-first mask bit instructions
  target/riscv: rvv-1.0: iota instruction
  target/riscv: rvv-1.0: element index instruction
  target/riscv: rvv-1.0: allow load element with sign-extended
  target/riscv: rvv-1.0: register gather instructions
  target/riscv: rvv-1.0: integer scalar move instructions
  target/riscv: rvv-1.0: floating-point move instruction
  target/riscv: rvv-1.0: floating-point scalar move instructions
  target/riscv: rvv-1.0: whole register move instructions
  target/riscv: rvv-1.0: integer extension instructions
  target/riscv: rvv-1.0: single-width averaging add and subtract
instructions
  target/riscv: rvv-1.0: single-width bit shift instructions
  target/riscv: rvv-1.0: integer add-with-carry/subtract-with-borrow
  target/riscv: rvv-1.0: narrowing integer right shift instructions
  target/riscv: rvv-1.0: widening integer multiply-add instructions
  target/riscv: rvv-1.0: add Zvqmac extension
  target/riscv: rvv-1.0: quad-widening integer multiply-add instructions
  target/riscv: rvv-1.0: single-width saturating add and subtract
instructions
  target/riscv: rvv-1.0: integer comparison instructions
  target/riscv: use softfloat lib float16 comparison functions
  target/riscv: rvv-1.0: floating-point compare instructions
  target/riscv: rvv-1.0: mask-register logical instructions
  target/riscv: rvv-1.0: slide instructions
  target/riscv: rvv-1.0: floating-point slide 

[RFC v4 02/70] target/riscv: Use FIELD_EX32() to extract wd field

2020-08-17 Thread frank . chang
From: Frank Chang 

Signed-off-by: Frank Chang 
Reviewed-by: Richard Henderson 
---
 target/riscv/vector_helper.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c
index 793af990673..43ba272c09b 100644
--- a/target/riscv/vector_helper.c
+++ b/target/riscv/vector_helper.c
@@ -98,7 +98,7 @@ static inline uint32_t vext_lmul(uint32_t desc)
 
 static uint32_t vext_wd(uint32_t desc)
 {
-return (simd_data(desc) >> 11) & 0x1;
+return FIELD_EX32(simd_data(desc), VDATA, WD);
 }
 
 /*
-- 
2.17.1
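
For readers unfamiliar with field-extraction macros, the following standalone sketch shows the kind of shift-and-mask that a named field accessor replaces. It is not the FIELD_EX32() definition. The helper name is hypothetical, and the bit position 11 with width 1 is taken only from the line removed by the patch above.

```c
#include <stdint.h>

/*
 * Hypothetical illustration, not QEMU code.
 * Extract `length` bits starting at bit `shift` from a 32-bit value.
 * Requires 1 <= length <= 32 and shift + length <= 32.
 */
static uint32_t field_extract32(uint32_t value, unsigned shift,
                                unsigned length)
{
    return (value >> shift) & (~0u >> (32 - length));
}
```

Under these assumptions, field_extract32(x, 11, 1) computes the same value as the open-coded (x >> 11) & 0x1, but a named field keeps the position and width in one place.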




[RFC v4 24/70] target/riscv: rvv-1.0: update vext_max_elems() for load/store insns

2020-08-17 Thread frank . chang
From: Frank Chang 

Unlike other vector instructions, load/store vector instructions return
the maximum vector size calculated with EMUL.
For other vector instructions, return VLMAX as the maximum vector size.
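
The EMUL calculation can be sketched in log2 form, following EMUL = (EEW / SEW) * LMUL. This is a minimal standalone sketch modelled on the vext_get_emul() function added in the diff below. The names are hypothetical, vsew is assumed to be the encoded SEW (0 for 8 bits), and lmul_log2 is assumed to be log2(LMUL), which is negative for fractional LMUL.

```c
#include <stdint.h>

/* Count trailing zero bits; x must be nonzero (here a power of two). */
static int ctz_u32(uint32_t x)
{
    int n = 0;
    while ((x & 1u) == 0) {
        x >>= 1;
        n++;
    }
    return n;
}

/*
 * Hypothetical illustration, not QEMU code.
 * log2(EMUL) = log2(EEW) - log2(SEW) + log2(LMUL), clamped to 0 so
 * that a fractional EMUL is treated as a single register.
 */
static uint8_t emul_log2(uint32_t eew_bits, int vsew, int lmul_log2)
{
    int emul = ctz_u32(eew_bits) - (vsew + 3) + lmul_log2;
    return emul < 0 ? 0 : (uint8_t)emul;
}
```

For example, EEW=64 with SEW=8 and LMUL=1 gives log2(EMUL) = 3, that is EMUL=8, while EEW=8 with SEW=64 and LMUL=1 would be fractional and is clamped to 0.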

Signed-off-by: Frank Chang 
---
 target/riscv/insn_trans/trans_rvv.inc.c | 57 +++-
 target/riscv/vector_helper.c| 90 ++---
 2 files changed, 88 insertions(+), 59 deletions(-)

diff --git a/target/riscv/insn_trans/trans_rvv.inc.c b/target/riscv/insn_trans/trans_rvv.inc.c
index 6a2f175b50a..334e1fc123b 100644
--- a/target/riscv/insn_trans/trans_rvv.inc.c
+++ b/target/riscv/insn_trans/trans_rvv.inc.c
@@ -583,11 +583,17 @@ static bool vext_check_isa_ill(DisasContext *s)
 static bool trans_##NAME(DisasContext *s, arg_##ARGTYPE * a) \
 {\
 if (CHECK(s, a, EEW)) {  \
-return OP(s, a, SEQ);\
+return OP(s, a, EEW, SEQ);   \
 }\
 return false;\
 }
 
+static uint8_t vext_get_emul(DisasContext *s, uint8_t eew)
+{
+int8_t emul = ctzl(eew) - (s->sew + 3) + s->lmul;
+return emul < 0 ? 0 : emul;
+}
+
 /*
  *** unit stride load and store
  */
@@ -611,7 +617,7 @@ static bool ldst_us_trans(uint32_t vd, uint32_t rs1, uint32_t data,
 
 /*
  * As simd_desc supports at most 256 bytes, and in this implementation,
- * the max vector group length is 2048 bytes. So split it into two parts.
+ * the max vector group length is 1024 bytes. So split it into two parts.
  *
  * The first part is vlen in bytes, encoded in maxsz of simd_desc.
  * The second part is lmul, encoded in data of simd_desc.
@@ -635,7 +641,7 @@ static bool ldst_us_trans(uint32_t vd, uint32_t rs1, uint32_t data,
 return true;
 }
 
-static bool ld_us_op(DisasContext *s, arg_r2nfvm *a, uint8_t seq)
+static bool ld_us_op(DisasContext *s, arg_r2nfvm *a, uint8_t eew, uint8_t seq)
 {
 uint32_t data = 0;
 gen_helper_ldst_us *fn;
@@ -653,8 +659,14 @@ static bool ld_us_op(DisasContext *s, arg_r2nfvm *a, uint8_t seq)
 return false;
 }
 
+/*
+ * Vector load/store instructions have the EEW encoded
+ * directly in the instructions. The maximum vector size is
+ * calculated with EMUL rather than LMUL.
+ */
+uint8_t emul = vext_get_emul(s, eew);
 data = FIELD_DP32(data, VDATA, VM, a->vm);
-data = FIELD_DP32(data, VDATA, LMUL, s->lmul);
+data = FIELD_DP32(data, VDATA, LMUL, emul);
 data = FIELD_DP32(data, VDATA, NF, a->nf);
 return ldst_us_trans(a->rd, a->rs1, data, fn, s, false);
 }
@@ -671,7 +683,7 @@ GEN_VEXT_TRANS(vle16_v, 16, 1, r2nfvm, ld_us_op, ld_us_check)
 GEN_VEXT_TRANS(vle32_v, 32, 2, r2nfvm, ld_us_op, ld_us_check)
 GEN_VEXT_TRANS(vle64_v, 64, 3, r2nfvm, ld_us_op, ld_us_check)
 
-static bool st_us_op(DisasContext *s, arg_r2nfvm *a, uint8_t seq)
+static bool st_us_op(DisasContext *s, arg_r2nfvm *a, uint8_t eew, uint8_t seq)
 {
 uint32_t data = 0;
 gen_helper_ldst_us *fn;
@@ -689,8 +701,9 @@ static bool st_us_op(DisasContext *s, arg_r2nfvm *a, uint8_t seq)
 return false;
 }
 
+uint8_t emul = vext_get_emul(s, eew);
 data = FIELD_DP32(data, VDATA, VM, a->vm);
-data = FIELD_DP32(data, VDATA, LMUL, s->lmul);
+data = FIELD_DP32(data, VDATA, LMUL, emul);
 data = FIELD_DP32(data, VDATA, NF, a->nf);
 return ldst_us_trans(a->rd, a->rs1, data, fn, s, true);
 }
@@ -749,7 +762,8 @@ static bool ldst_stride_trans(uint32_t vd, uint32_t rs1, uint32_t rs2,
 return true;
 }
 
-static bool ld_stride_op(DisasContext *s, arg_rnfvm *a, uint8_t seq)
+static bool ld_stride_op(DisasContext *s, arg_rnfvm *a, uint8_t eew,
+ uint8_t seq)
 {
 uint32_t data = 0;
 gen_helper_ldst_stride *fn;
@@ -763,8 +777,9 @@ static bool ld_stride_op(DisasContext *s, arg_rnfvm *a, uint8_t seq)
 return false;
 }
 
+uint8_t emul = vext_get_emul(s, eew);
 data = FIELD_DP32(data, VDATA, VM, a->vm);
-data = FIELD_DP32(data, VDATA, LMUL, s->lmul);
+data = FIELD_DP32(data, VDATA, LMUL, emul);
 data = FIELD_DP32(data, VDATA, NF, a->nf);
 return ldst_stride_trans(a->rd, a->rs1, a->rs2, data, fn, s, false);
 }
@@ -781,7 +796,8 @@ GEN_VEXT_TRANS(vlse16_v, 16, 1, rnfvm, ld_stride_op, ld_stride_check)
 GEN_VEXT_TRANS(vlse32_v, 32, 2, rnfvm, ld_stride_op, ld_stride_check)
 GEN_VEXT_TRANS(vlse64_v, 64, 3, rnfvm, ld_stride_op, ld_stride_check)
 
-static bool st_stride_op(DisasContext *s, arg_rnfvm *a, uint8_t seq)
+static bool st_stride_op(DisasContext *s, arg_rnfvm *a, uint8_t eew,
+ uint8_t seq)
 {
 uint32_t data = 0;
 gen_helper_ldst_stride *fn;
@@ -791,8 +807,9 @@ static bool

[RFC v4 20/70] target/riscv: rvv-1.0: fix address index overflow bug of indexed load/store insns

2020-08-17 Thread frank . chang
From: Frank Chang 

Change ETYPE from signed to unsigned integer types to prevent index
overflow, which would produce a wrong indexed address.

Signed-off-by: Frank Chang 
---
 target/riscv/vector_helper.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c
index 272a65ebb3a..92a2161e373 100644
--- a/target/riscv/vector_helper.c
+++ b/target/riscv/vector_helper.c
@@ -363,10 +363,10 @@ static target_ulong NAME(target_ulong base,\
 return (base + *((ETYPE *)vs2 + H(idx)));  \
 }
 
-GEN_VEXT_GET_INDEX_ADDR(idx_b, int8_t,  H1)
-GEN_VEXT_GET_INDEX_ADDR(idx_h, int16_t, H2)
-GEN_VEXT_GET_INDEX_ADDR(idx_w, int32_t, H4)
-GEN_VEXT_GET_INDEX_ADDR(idx_d, int64_t, H8)
+GEN_VEXT_GET_INDEX_ADDR(idx_b, uint8_t,  H1)
+GEN_VEXT_GET_INDEX_ADDR(idx_h, uint16_t, H2)
+GEN_VEXT_GET_INDEX_ADDR(idx_w, uint32_t, H4)
+GEN_VEXT_GET_INDEX_ADDR(idx_d, uint64_t, H8)
 
 static inline void
 vext_ldst_index(void *vd, void *v0, target_ulong base,
-- 
2.17.1
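
The effect of the ETYPE change above can be seen with an 8-bit index whose top bit is set. The following is a minimal standalone sketch with hypothetical function names, not the GEN_VEXT_GET_INDEX_ADDR macro itself. It assumes the usual two's-complement behaviour when converting 0x80 to int8_t.

```c
#include <stdint.h>

/*
 * Hypothetical illustration, not QEMU code.
 * Signed element type: index 0x80 is read as -128, so the computed
 * address ends up below the base.
 */
static uint64_t index_addr_signed8(uint64_t base, uint8_t raw_index)
{
    return base + (uint64_t)(int64_t)(int8_t)raw_index;
}

/* Unsigned element type: index 0x80 is read as +128. */
static uint64_t index_addr_unsigned8(uint64_t base, uint8_t raw_index)
{
    return base + (uint64_t)raw_index;
}
```

With base 0x1000 and a raw index of 0x80, the signed variant yields 0x0F80 and the unsigned variant yields 0x1080, which is the zero-extended offset the fix is aiming for.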




[RFC v4 10/70] target/riscv: rvv-1.0: check MSTATUS_VS when accessing vector csr registers

2020-08-17 Thread frank . chang
From: Frank Chang 

If VS field is off, accessing vector csr registers should raise an
illegal-instruction exception.

Signed-off-by: Frank Chang 
Reviewed-by: Richard Henderson 
---
 target/riscv/csr.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/target/riscv/csr.c b/target/riscv/csr.c
index 6379718e1b6..ed8f6e175f4 100644
--- a/target/riscv/csr.c
+++ b/target/riscv/csr.c
@@ -56,6 +56,11 @@ static int fs(CPURISCVState *env, int csrno)
 static int vs(CPURISCVState *env, int csrno)
 {
 if (env->misa & RVV) {
+#if !defined(CONFIG_USER_ONLY)
+if (!env->debugger && !riscv_cpu_vector_enabled(env)) {
+return -1;
+}
+#endif
 return 0;
 }
 return -1;
-- 
2.17.1
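
The access rule in the patch above can be restated as a standalone predicate. This is only a sketch: the struct and its fields are hypothetical stand-ins for the state the real vs() function reads from CPURISCVState, and the CONFIG_USER_ONLY distinction is left out.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the relevant CPU state; not QEMU code. */
struct vec_csr_state {
    bool has_rvv;      /* misa has the V bit set */
    bool debugger;     /* access comes from the debugger */
    bool vs_enabled;   /* mstatus.VS is not Off */
};

/*
 * Returns 0 when a vector CSR access is allowed and -1 when it should
 * raise an illegal-instruction exception, mirroring the patch logic.
 */
static int vector_csr_check(const struct vec_csr_state *st)
{
    if (!st->has_rvv) {
        return -1;
    }
    if (!st->debugger && !st->vs_enabled) {
        return -1;
    }
    return 0;
}
```

In this sketch a debugger access is allowed even when VS is off, matching the !env->debugger condition in the patch.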




[RFC v4 09/70] target/riscv: rvv-1.0: add vlenb register

2020-08-17 Thread frank . chang
From: Greentime Hu 

Signed-off-by: Greentime Hu 
Signed-off-by: Frank Chang 
Reviewed-by: Richard Henderson 
---
 target/riscv/cpu_bits.h | 1 +
 target/riscv/csr.c  | 7 +++
 2 files changed, 8 insertions(+)

diff --git a/target/riscv/cpu_bits.h b/target/riscv/cpu_bits.h
index 7afdd4814bb..fe055b67a6a 100644
--- a/target/riscv/cpu_bits.h
+++ b/target/riscv/cpu_bits.h
@@ -63,6 +63,7 @@
 #define CSR_VCSR0x00f
 #define CSR_VL  0xc20
 #define CSR_VTYPE   0xc21
+#define CSR_VLENB   0xc22
 
 /* VCSR fields */
 #define VCSR_VXSAT_SHIFT0
diff --git a/target/riscv/csr.c b/target/riscv/csr.c
index c87f2ddbf7d..6379718e1b6 100644
--- a/target/riscv/csr.c
+++ b/target/riscv/csr.c
@@ -184,6 +184,12 @@ static int read_vtype(CPURISCVState *env, int csrno, target_ulong *val)
 return 0;
 }
 
+static int read_vlenb(CPURISCVState *env, int csrno, target_ulong *val)
+{
+*val = env_archcpu(env)->cfg.vlen >> 3;
+return 0;
+}
+
 static int read_vl(CPURISCVState *env, int csrno, target_ulong *val)
 {
 *val = env->vl;
@@ -1288,6 +1294,7 @@ static riscv_csr_operations csr_ops[CSR_TABLE_SIZE] = {
 [CSR_VCSR] ={ vs,   read_vcsr,write_vcsr},
 [CSR_VL] =  { vs,   read_vl },
 [CSR_VTYPE] =   { vs,   read_vtype  },
+[CSR_VLENB] =   { vs,   read_vlenb  },
 /* User Timers and Counters */
 [CSR_CYCLE] =   { ctr,  read_instret},
 [CSR_INSTRET] = { ctr,  read_instret},
-- 
2.17.1




[RFC v4 19/70] target/riscv: rvv-1.0: index load and store instructions

2020-08-17 Thread frank . chang
From: Frank Chang 

Signed-off-by: Frank Chang 
---
 target/riscv/helper.h   |  67 
 target/riscv/insn32.decode  |  21 ++-
 target/riscv/insn_trans/trans_rvv.inc.c | 193 
 target/riscv/vector_helper.c|  89 ++-
 4 files changed, 214 insertions(+), 156 deletions(-)

diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index 2311ce39cfd..8a5d97969da 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -108,41 +108,38 @@ DEF_HELPER_6(vsse8_v, void, ptr, ptr, tl, tl, env, i32)
 DEF_HELPER_6(vsse16_v, void, ptr, ptr, tl, tl, env, i32)
 DEF_HELPER_6(vsse32_v, void, ptr, ptr, tl, tl, env, i32)
 DEF_HELPER_6(vsse64_v, void, ptr, ptr, tl, tl, env, i32)
-DEF_HELPER_6(vlxb_v_b, void, ptr, ptr, tl, ptr, env, i32)
-DEF_HELPER_6(vlxb_v_h, void, ptr, ptr, tl, ptr, env, i32)
-DEF_HELPER_6(vlxb_v_w, void, ptr, ptr, tl, ptr, env, i32)
-DEF_HELPER_6(vlxb_v_d, void, ptr, ptr, tl, ptr, env, i32)
-DEF_HELPER_6(vlxh_v_h, void, ptr, ptr, tl, ptr, env, i32)
-DEF_HELPER_6(vlxh_v_w, void, ptr, ptr, tl, ptr, env, i32)
-DEF_HELPER_6(vlxh_v_d, void, ptr, ptr, tl, ptr, env, i32)
-DEF_HELPER_6(vlxw_v_w, void, ptr, ptr, tl, ptr, env, i32)
-DEF_HELPER_6(vlxw_v_d, void, ptr, ptr, tl, ptr, env, i32)
-DEF_HELPER_6(vlxe_v_b, void, ptr, ptr, tl, ptr, env, i32)
-DEF_HELPER_6(vlxe_v_h, void, ptr, ptr, tl, ptr, env, i32)
-DEF_HELPER_6(vlxe_v_w, void, ptr, ptr, tl, ptr, env, i32)
-DEF_HELPER_6(vlxe_v_d, void, ptr, ptr, tl, ptr, env, i32)
-DEF_HELPER_6(vlxbu_v_b, void, ptr, ptr, tl, ptr, env, i32)
-DEF_HELPER_6(vlxbu_v_h, void, ptr, ptr, tl, ptr, env, i32)
-DEF_HELPER_6(vlxbu_v_w, void, ptr, ptr, tl, ptr, env, i32)
-DEF_HELPER_6(vlxbu_v_d, void, ptr, ptr, tl, ptr, env, i32)
-DEF_HELPER_6(vlxhu_v_h, void, ptr, ptr, tl, ptr, env, i32)
-DEF_HELPER_6(vlxhu_v_w, void, ptr, ptr, tl, ptr, env, i32)
-DEF_HELPER_6(vlxhu_v_d, void, ptr, ptr, tl, ptr, env, i32)
-DEF_HELPER_6(vlxwu_v_w, void, ptr, ptr, tl, ptr, env, i32)
-DEF_HELPER_6(vlxwu_v_d, void, ptr, ptr, tl, ptr, env, i32)
-DEF_HELPER_6(vsxb_v_b, void, ptr, ptr, tl, ptr, env, i32)
-DEF_HELPER_6(vsxb_v_h, void, ptr, ptr, tl, ptr, env, i32)
-DEF_HELPER_6(vsxb_v_w, void, ptr, ptr, tl, ptr, env, i32)
-DEF_HELPER_6(vsxb_v_d, void, ptr, ptr, tl, ptr, env, i32)
-DEF_HELPER_6(vsxh_v_h, void, ptr, ptr, tl, ptr, env, i32)
-DEF_HELPER_6(vsxh_v_w, void, ptr, ptr, tl, ptr, env, i32)
-DEF_HELPER_6(vsxh_v_d, void, ptr, ptr, tl, ptr, env, i32)
-DEF_HELPER_6(vsxw_v_w, void, ptr, ptr, tl, ptr, env, i32)
-DEF_HELPER_6(vsxw_v_d, void, ptr, ptr, tl, ptr, env, i32)
-DEF_HELPER_6(vsxe_v_b, void, ptr, ptr, tl, ptr, env, i32)
-DEF_HELPER_6(vsxe_v_h, void, ptr, ptr, tl, ptr, env, i32)
-DEF_HELPER_6(vsxe_v_w, void, ptr, ptr, tl, ptr, env, i32)
-DEF_HELPER_6(vsxe_v_d, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vlxei8_8_v, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vlxei8_16_v, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vlxei8_32_v, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vlxei8_64_v, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vlxei16_8_v, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vlxei16_16_v, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vlxei16_32_v, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vlxei16_64_v, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vlxei32_8_v, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vlxei32_16_v, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vlxei32_32_v, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vlxei32_64_v, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vlxei64_8_v, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vlxei64_16_v, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vlxei64_32_v, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vlxei64_64_v, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vsxei8_8_v, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vsxei8_16_v, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vsxei8_32_v, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vsxei8_64_v, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vsxei16_8_v, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vsxei16_16_v, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vsxei16_32_v, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vsxei16_64_v, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vsxei32_8_v, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vsxei32_16_v, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vsxei32_32_v, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vsxei32_64_v, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vsxei64_8_v, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vsxei64_16_v, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vsxei64_32_v, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vsxei64_64_v, void, ptr, ptr, tl, ptr, env, i32)
 DEF_HELPER_5(vlbff_v_b, void, ptr, ptr, tl, env, i32)
 DEF_HELPER_5(vlbff_v_h, void, ptr, ptr, tl, env, i32)
 DEF_HELPER_5(vlbff_v_w, void, ptr, ptr, tl, env, i32)
diff

[RFC v4 46/70] target/riscv: rvv-1.0: quad-widening integer multiply-add instructions

2020-08-17 Thread frank . chang
From: Frank Chang 

Add the following instructions:

* vqmaccu.vv
* vqmaccu.vx
* vqmacc.vv
* vqmacc.vx
* vqmaccsu.vv
* vqmaccsu.vx
* vqmaccus.vx
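
As a rough per-element illustration of "quad-widening", the sketch below multiplies two 8-bit sources and accumulates into a 32-bit destination (4*SEW = SEW op SEW). The function names are hypothetical and this is not the helper code from this patch. Wrap-around on accumulation is an assumption of the sketch.

```c
#include <stdint.h>

/*
 * Hypothetical illustration, not QEMU code.
 * Signed x signed: 8-bit sources, 32-bit accumulator.
 * The addition is done in unsigned arithmetic to avoid signed
 * overflow in C; the result wraps modulo 2^32.
 */
static int32_t qmacc_s8(int8_t a, int8_t b, int32_t acc)
{
    int32_t prod = (int32_t)a * (int32_t)b;
    return (int32_t)((uint32_t)acc + (uint32_t)prod);
}

/* Unsigned x unsigned: 8-bit sources, 32-bit accumulator. */
static uint32_t qmaccu_u8(uint8_t a, uint8_t b, uint32_t acc)
{
    return acc + (uint32_t)a * (uint32_t)b;
}
```

For example, qmacc_s8(-128, 127, 10) gives -16246 and qmaccu_u8(255, 255, 1) gives 65026. Neither product fits in 16 bits together with a large accumulator, which is why the destination is four times as wide as the sources.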

Signed-off-by: Frank Chang 
---
 target/riscv/helper.h   |  15 
 target/riscv/insn32.decode  |   7 ++
 target/riscv/insn_trans/trans_rvv.inc.c | 109 
 target/riscv/vector_helper.c|  40 +
 4 files changed, 171 insertions(+)

diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index fe37bd2f4af..6825c15e025 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -640,6 +640,21 @@ DEF_HELPER_6(vwmaccus_vx_b, void, ptr, ptr, tl, ptr, env, i32)
 DEF_HELPER_6(vwmaccus_vx_h, void, ptr, ptr, tl, ptr, env, i32)
 DEF_HELPER_6(vwmaccus_vx_w, void, ptr, ptr, tl, ptr, env, i32)
 
+DEF_HELPER_6(vqmaccu_vv_b, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(vqmaccu_vv_h, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(vqmacc_vv_b, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(vqmacc_vv_h, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(vqmaccsu_vv_b, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(vqmaccsu_vv_h, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(vqmaccu_vx_b, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vqmaccu_vx_h, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vqmacc_vx_b, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vqmacc_vx_h, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vqmaccsu_vx_b, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vqmaccsu_vx_h, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vqmaccus_vx_b, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vqmaccus_vx_h, void, ptr, ptr, tl, ptr, env, i32)
+
 DEF_HELPER_6(vmerge_vvm_b, void, ptr, ptr, ptr, ptr, env, i32)
 DEF_HELPER_6(vmerge_vvm_h, void, ptr, ptr, ptr, ptr, env, i32)
 DEF_HELPER_6(vmerge_vvm_w, void, ptr, ptr, ptr, ptr, env, i32)
diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
index 2e305d492d8..b2ecc8dd4d1 100644
--- a/target/riscv/insn32.decode
+++ b/target/riscv/insn32.decode
@@ -441,6 +441,13 @@ vwmacc_vx   01 . . . 110 . 1010111 @r_vm
 vwmaccsu_vv 11 . . . 010 . 1010111 @r_vm
 vwmaccsu_vx 11 . . . 110 . 1010111 @r_vm
 vwmaccus_vx 10 . . . 110 . 1010111 @r_vm
+vqmaccu_vv  00 . . . 000 . 1010111 @r_vm
+vqmaccu_vx  00 . . . 100 . 1010111 @r_vm
+vqmacc_vv   01 . . . 000 . 1010111 @r_vm
+vqmacc_vx   01 . . . 100 . 1010111 @r_vm
+vqmaccsu_vv 11 . . . 000 . 1010111 @r_vm
+vqmaccsu_vx 11 . . . 100 . 1010111 @r_vm
+vqmaccus_vx 10 . . . 100 . 1010111 @r_vm
 vmv_v_v 010111 1 0 . 000 . 1010111 @r2
 vmv_v_x 010111 1 0 . 100 . 1010111 @r2
 vmv_v_i 010111 1 0 . 011 . 1010111 @r2
diff --git a/target/riscv/insn_trans/trans_rvv.inc.c b/target/riscv/insn_trans/trans_rvv.inc.c
index 11dddc3252c..809280f4c5c 100644
--- a/target/riscv/insn_trans/trans_rvv.inc.c
+++ b/target/riscv/insn_trans/trans_rvv.inc.c
@@ -64,6 +64,11 @@ static bool require_rvv(DisasContext *s)
 return true;
 }
 
+static bool require_ext_vqmac(DisasContext *s)
+{
+return s->ext_vqmac;
+}
+
 /* Destination vector register group cannot overlap source mask register. */
 static bool require_vm(int vm, int rd)
 {
@@ -461,6 +466,53 @@ static bool vext_check_dss(DisasContext *s, int vd, int vs1, int vs2,
 return ret;
 }
 
+/*
+ * Check function for vector instruction with format:
+ * quad-width result and single-width sources (4*SEW = SEW op SEW)
+ *
+ * is_vs1: indicates whether insn[19:15] is a vs1 field or not.
+ *
+ * Rules to be checked here:
+ *   1. The largest vector register group used by an instruction
+ *  cannot be greater than 8 vector registers (Section 5.2):
+ *  => LMUL < 4.
+ *  => SEW < 32.
+ *   2. Destination vector register number is a multiple of 4 * LMUL.
+ *  (Section 3.3.2)
+ *   3. Source (vs2, vs1) vector register numbers are multiples of LMUL.
+ *  (Section 3.3.2)
+ *   4. Destination vector register cannot overlap a source vector
+ *  register (vs2, vs1) group.
+ *  (Section 5.2)
+ *   5. Destination vector register group for a masked vector
+ *  instruction cannot overlap the source mask register (v0).
+ *  (Section 5.3)
+ */
+static bool vext_check_qss(DisasContext *s, int vd, int vs1, int vs2,
+   int vm, bool is_vs1)
+{
+bool ret = (s->lmul <= 1) &&
+   (s->sew < 2) &&
+   require_align(vd, 1 << (s->lmul + 2)) &&
+   require_align(vs2, 1 << s->lmul) &&
+   require_vm(vm, vd);
+if (s->lmul < 0) {
+ret &= require_noover(vd, 1 << (s->lmul + 2), vs2, 1 << 
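
The excerpt of this patch is cut off here. As a rough standalone illustration of rules 1 to 4 in the comment above, here is a minimal sketch. It is not the patch's implementation: reg_is_aligned(), groups_overlap() and quad_widen_ok() are hypothetical names, the sketch only handles integer LMUL, and it ignores the mask-register rule.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: true if `reg` is a multiple of `group`,
 * where `group` is a power of two (1, 2, 4 or 8). */
static bool reg_is_aligned(int reg, int group)
{
    assert(group > 0 && (group & (group - 1)) == 0);
    return (reg & (group - 1)) == 0;
}

/* Hypothetical helper: true if the register ranges
 * [a, a + a_len) and [b, b + b_len) share any register. */
static bool groups_overlap(int a, int a_len, int b, int b_len)
{
    return a < b + b_len && b < a + a_len;
}

/*
 * Sketch of rules 1-4 for a quad-widening op (4*SEW = SEW op SEW).
 * lmul_log2 is log2(LMUL) and sew_log2 is log2(SEW / 8), so
 * "LMUL < 4" becomes lmul_log2 <= 1 and "SEW < 32" becomes sew_log2 < 2.
 * Assumption: fractional LMUL (lmul_log2 < 0) is rejected for simplicity.
 */
static bool quad_widen_ok(int lmul_log2, int sew_log2,
                          int vd, int vs1, int vs2)
{
    if (lmul_log2 < 0 || lmul_log2 > 1 || sew_log2 < 0 || sew_log2 >= 2) {
        return false;
    }
    int src_regs = 1 << lmul_log2;   /* registers per source group */
    int dst_regs = src_regs * 4;     /* destination group is 4x wider */

    return reg_is_aligned(vd, dst_regs) &&
           reg_is_aligned(vs1, src_regs) &&
           reg_is_aligned(vs2, src_regs) &&
           !groups_overlap(vd, dst_regs, vs1, src_regs) &&
           !groups_overlap(vd, dst_regs, vs2, src_regs);
}
```

With LMUL=1 and SEW=8, a destination of v4 occupies v4..v7, so sources v1 and v2 are accepted while a source of v5 is rejected.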

[RFC v4 43/70] target/riscv: rvv-1.0: narrowing integer right shift instructions

2020-08-17 Thread frank . chang
From: Frank Chang 

Signed-off-by: Frank Chang 
Reviewed-by: Richard Henderson 
---
 target/riscv/helper.h   | 24 ++--
 target/riscv/insn32.decode  | 12 +-
 target/riscv/insn_trans/trans_rvv.inc.c | 30 -
 target/riscv/vector_helper.c| 24 ++--
 4 files changed, 45 insertions(+), 45 deletions(-)

diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index 3560bf1d4f5..fe37bd2f4af 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -397,18 +397,18 @@ DEF_HELPER_6(vsra_vx_h, void, ptr, ptr, tl, ptr, env, i32)
 DEF_HELPER_6(vsra_vx_w, void, ptr, ptr, tl, ptr, env, i32)
 DEF_HELPER_6(vsra_vx_d, void, ptr, ptr, tl, ptr, env, i32)
 
-DEF_HELPER_6(vnsrl_vv_b, void, ptr, ptr, ptr, ptr, env, i32)
-DEF_HELPER_6(vnsrl_vv_h, void, ptr, ptr, ptr, ptr, env, i32)
-DEF_HELPER_6(vnsrl_vv_w, void, ptr, ptr, ptr, ptr, env, i32)
-DEF_HELPER_6(vnsra_vv_b, void, ptr, ptr, ptr, ptr, env, i32)
-DEF_HELPER_6(vnsra_vv_h, void, ptr, ptr, ptr, ptr, env, i32)
-DEF_HELPER_6(vnsra_vv_w, void, ptr, ptr, ptr, ptr, env, i32)
-DEF_HELPER_6(vnsrl_vx_b, void, ptr, ptr, tl, ptr, env, i32)
-DEF_HELPER_6(vnsrl_vx_h, void, ptr, ptr, tl, ptr, env, i32)
-DEF_HELPER_6(vnsrl_vx_w, void, ptr, ptr, tl, ptr, env, i32)
-DEF_HELPER_6(vnsra_vx_b, void, ptr, ptr, tl, ptr, env, i32)
-DEF_HELPER_6(vnsra_vx_h, void, ptr, ptr, tl, ptr, env, i32)
-DEF_HELPER_6(vnsra_vx_w, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vnsrl_wv_b, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(vnsrl_wv_h, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(vnsrl_wv_w, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(vnsra_wv_b, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(vnsra_wv_h, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(vnsra_wv_w, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(vnsrl_wx_b, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vnsrl_wx_h, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vnsrl_wx_w, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vnsra_wx_b, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vnsra_wx_h, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vnsra_wx_w, void, ptr, ptr, tl, ptr, env, i32)
 
 DEF_HELPER_6(vmseq_vv_b, void, ptr, ptr, ptr, ptr, env, i32)
 DEF_HELPER_6(vmseq_vv_h, void, ptr, ptr, ptr, ptr, env, i32)
diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
index e62bad906a3..c4fe9767585 100644
--- a/target/riscv/insn32.decode
+++ b/target/riscv/insn32.decode
@@ -370,12 +370,12 @@ vsrl_vi 101000 . . . 011 . 1010111 @r_vm
 vsra_vv 101001 . . . 000 . 1010111 @r_vm
 vsra_vx 101001 . . . 100 . 1010111 @r_vm
 vsra_vi 101001 . . . 011 . 1010111 @r_vm
-vnsrl_vv101100 . . . 000 . 1010111 @r_vm
-vnsrl_vx101100 . . . 100 . 1010111 @r_vm
-vnsrl_vi101100 . . . 011 . 1010111 @r_vm
-vnsra_vv101101 . . . 000 . 1010111 @r_vm
-vnsra_vx101101 . . . 100 . 1010111 @r_vm
-vnsra_vi101101 . . . 011 . 1010111 @r_vm
+vnsrl_wv101100 . . . 000 . 1010111 @r_vm
+vnsrl_wx101100 . . . 100 . 1010111 @r_vm
+vnsrl_wi101100 . . . 011 . 1010111 @r_vm
+vnsra_wv101101 . . . 000 . 1010111 @r_vm
+vnsra_wx101101 . . . 100 . 1010111 @r_vm
+vnsra_wi101101 . . . 011 . 1010111 @r_vm
 vmseq_vv011000 . . . 000 . 1010111 @r_vm
 vmseq_vx011000 . . . 100 . 1010111 @r_vm
 vmseq_vi011000 . . . 011 . 1010111 @r_vm
diff --git a/target/riscv/insn_trans/trans_rvv.inc.c b/target/riscv/insn_trans/trans_rvv.inc.c
index c8ebfa6c3f5..11dddc3252c 100644
--- a/target/riscv/insn_trans/trans_rvv.inc.c
+++ b/target/riscv/insn_trans/trans_rvv.inc.c
@@ -1920,7 +1920,7 @@ GEN_OPIVI_GVEC_TRANS(vsrl_vi, IMM_TRUNC_SEW, vsrl_vx, shri)
 GEN_OPIVI_GVEC_TRANS(vsra_vi, IMM_TRUNC_SEW, vsra_vx, sari)
 
 /* Vector Narrowing Integer Right Shift Instructions */
-static bool opivv_narrow_check(DisasContext *s, arg_rmrr *a)
+static bool opiwv_narrow_check(DisasContext *s, arg_rmrr *a)
 {
 return require_rvv(s) &&
vext_check_isa_ill(s) &&
@@ -1928,10 +1928,10 @@ static bool opivv_narrow_check(DisasContext *s, arg_rmrr *a)
 }
 
 /* OPIVV with NARROW */
-#define GEN_OPIVV_NARROW_TRANS(NAME)   \
+#define GEN_OPIWV_NARROW_TRANS(NAME)   \
 static bool trans_##NAME(DisasContext *s, arg_rmrr *a) \
 {  \
-if (opivv_narrow_check(s, a)) {\
+if (opiwv_narrow_check(s, a)) {\
 uint32_t data = 0; \
 static gen_h

[RFC v4 37/70] target/riscv: rvv-1.0: floating-point scalar move instructions

2020-08-17 Thread frank . chang
From: Frank Chang 

NaN-box the scalar floating-point register according to the RVV 1.0 rules.

Signed-off-by: Frank Chang 
---
 target/riscv/insn32.decode  |  4 +--
 target/riscv/insn_trans/trans_rvv.inc.c | 42 ++---
 2 files changed, 25 insertions(+), 21 deletions(-)

diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
index 6b90b67c7cc..97fce34fcd8 100644
--- a/target/riscv/insn32.decode
+++ b/target/riscv/insn32.decode
@@ -601,8 +601,8 @@ vid_v   010100 . 0 10001 010 . 1010111 @r1_vm
 vmv_x_s 01 1 . 0 010 . 1010111 @r2rd
 vmv_s_x 01 1 0 . 110 . 1010111 @r2
 vext_x_v001100 1 . . 010 . 1010111 @r
-vfmv_f_s001100 1 . 0 001 . 1010111 @r2rd
-vfmv_s_f001101 1 0 . 101 . 1010111 @r2
+vfmv_f_s01 1 . 0 001 . 1010111 @r2rd
+vfmv_s_f01 1 0 . 101 . 1010111 @r2
 vslideup_vx 001110 . . . 100 . 1010111 @r_vm
 vslideup_vi 001110 . . . 011 . 1010111 @r_vm
 vslide1up_vx001110 . . . 110 . 1010111 @r_vm
diff --git a/target/riscv/insn_trans/trans_rvv.inc.c b/target/riscv/insn_trans/trans_rvv.inc.c
index 7a12b89dc13..95fdd972fdf 100644
--- a/target/riscv/insn_trans/trans_rvv.inc.c
+++ b/target/riscv/insn_trans/trans_rvv.inc.c
@@ -3273,14 +3273,22 @@ static bool trans_vmv_s_x(DisasContext *s, arg_vmv_s_x *a)
 /* Floating-Point Scalar Move Instructions */
 static bool trans_vfmv_f_s(DisasContext *s, arg_vfmv_f_s *a)
 {
-if (!s->vill && has_ext(s, RVF) &&
-(s->mstatus_fs != 0) && (s->sew != 0)) {
-unsigned int len = 8 << s->sew;
+if (require_rvv(s) &&
+vext_check_isa_ill(s) &&
+has_ext(s, RVF) &&
+(s->mstatus_fs != 0) &&
+(s->sew != 0)) {
+unsigned int ofs = (8 << s->sew);
+unsigned int len = 64 - ofs;
+TCGv_i64 t_nan;
 
 vec_element_loadi(s, cpu_fpr[a->rd], a->rs2, 0, false);
-if (len < 64) {
-tcg_gen_ori_i64(cpu_fpr[a->rd], cpu_fpr[a->rd],
-MAKE_64BIT_MASK(len, 64 - len));
+/* NaN-box f[rd] as necessary for SEW */
+if (len) {
+t_nan = tcg_const_i64(UINT64_MAX);
+tcg_gen_deposit_i64(cpu_fpr[a->rd], cpu_fpr[a->rd],
+t_nan, ofs, len);
+tcg_temp_free_i64(t_nan);
 }
 
 mark_fs_dirty(s);
@@ -3292,25 +3300,21 @@ static bool trans_vfmv_f_s(DisasContext *s, arg_vfmv_f_s *a)
 /* vfmv.s.f vd, rs1 # vd[0] = rs1 (vs2=0) */
 static bool trans_vfmv_s_f(DisasContext *s, arg_vfmv_s_f *a)
 {
-if (!s->vill && has_ext(s, RVF) && (s->sew != 0)) {
-TCGv_i64 t1;
+if (require_rvv(s) &&
+vext_check_isa_ill(s) &&
+has_ext(s, RVF) &&
+(s->sew != 0)) {
 /* The instructions ignore LMUL and vector register group. */
-uint32_t vlmax = s->vlen >> 3;
+TCGv_i64 t1;
+TCGLabel *over = gen_new_label();
 
 /* if vl == 0, skip vector register write back */
-TCGLabel *over = gen_new_label();
 tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_vl, 0, over);
 
-/* zeroed all elements */
-tcg_gen_gvec_dup_imm(SEW64, vreg_ofs(s, a->rd), vlmax, vlmax, 0);
-
-/* NaN-box f[rs1] as necessary for SEW */
+/* NaN-box f[rs1] */
 t1 = tcg_temp_new_i64();
-if (s->sew == MO_64 && !has_ext(s, RVD)) {
-tcg_gen_ori_i64(t1, cpu_fpr[a->rs1], MAKE_64BIT_MASK(32, 32));
-} else {
-tcg_gen_mov_i64(t1, cpu_fpr[a->rs1]);
-}
+do_nanbox(s, t1, cpu_fpr[a->rs1]);
+
 vec_element_storei(s, a->rd, 0, t1);
 tcg_temp_free_i64(t1);
 mark_vs_dirty(s);
-- 
2.17.1
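
For readers following the NaN-boxing change in the patch above, here is a minimal standalone sketch of the bit manipulation involved: every bit above the low SEW bits of a 64-bit register value is set to 1. The names nanbox() and is_nanboxed() are hypothetical and are not QEMU functions; the patch itself does this with TCG ops.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: NaN-box a narrower floating-point value held in
 * the low `sew_bits` bits of a 64-bit register by setting all upper
 * bits to 1. For sew_bits == 64 there is nothing to box.
 */
static uint64_t nanbox(uint64_t value, unsigned sew_bits)
{
    assert(sew_bits == 16 || sew_bits == 32 || sew_bits == 64);
    if (sew_bits == 64) {
        return value;
    }
    uint64_t low_mask = (UINT64_C(1) << sew_bits) - 1;
    return value | ~low_mask;
}

/* Hypothetical helper: true if all bits above `sew_bits` are set. */
static bool is_nanboxed(uint64_t value, unsigned sew_bits)
{
    assert(sew_bits == 16 || sew_bits == 32 || sew_bits == 64);
    if (sew_bits == 64) {
        return true;
    }
    uint64_t high_mask = ~((UINT64_C(1) << sew_bits) - 1);
    return (value & high_mask) == high_mask;
}
```

For example, boxing the half-precision pattern 0x3c00 gives 0xffffffffffff3c00, and boxing the single-precision pattern 0x3f800000 gives 0xffffffff3f800000.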




[RFC v4 49/70] target/riscv: use softfloat lib float16 comparison functions

2020-08-17 Thread frank . chang
From: Frank Chang 

Signed-off-by: Frank Chang 
Reviewed-by: Richard Henderson 
---
 target/riscv/vector_helper.c | 19 ---
 1 file changed, 19 deletions(-)

diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c
index f80c13b0857..e6441f18465 100644
--- a/target/riscv/vector_helper.c
+++ b/target/riscv/vector_helper.c
@@ -3980,12 +3980,6 @@ void HELPER(NAME)(void *vd, void *v0, void *vs1, void *vs2,   \
 } \
 }
 
-static bool float16_eq_quiet(uint16_t a, uint16_t b, float_status *s)
-{
-FloatRelation compare = float16_compare_quiet(a, b, s);
-return compare == float_relation_equal;
-}
-
 GEN_VEXT_CMP_VV_ENV(vmfeq_vv_h, uint16_t, H2, float16_eq_quiet)
 GEN_VEXT_CMP_VV_ENV(vmfeq_vv_w, uint32_t, H4, float32_eq_quiet)
 GEN_VEXT_CMP_VV_ENV(vmfeq_vv_d, uint64_t, H8, float64_eq_quiet)
@@ -4041,12 +4035,6 @@ GEN_VEXT_CMP_VF(vmfne_vf_h, uint16_t, H2, vmfne16)
 GEN_VEXT_CMP_VF(vmfne_vf_w, uint32_t, H4, vmfne32)
 GEN_VEXT_CMP_VF(vmfne_vf_d, uint64_t, H8, vmfne64)
 
-static bool float16_lt(uint16_t a, uint16_t b, float_status *s)
-{
-FloatRelation compare = float16_compare(a, b, s);
-return compare == float_relation_less;
-}
-
 GEN_VEXT_CMP_VV_ENV(vmflt_vv_h, uint16_t, H2, float16_lt)
 GEN_VEXT_CMP_VV_ENV(vmflt_vv_w, uint32_t, H4, float32_lt)
 GEN_VEXT_CMP_VV_ENV(vmflt_vv_d, uint64_t, H8, float64_lt)
@@ -4054,13 +4042,6 @@ GEN_VEXT_CMP_VF(vmflt_vf_h, uint16_t, H2, float16_lt)
 GEN_VEXT_CMP_VF(vmflt_vf_w, uint32_t, H4, float32_lt)
 GEN_VEXT_CMP_VF(vmflt_vf_d, uint64_t, H8, float64_lt)
 
-static bool float16_le(uint16_t a, uint16_t b, float_status *s)
-{
-FloatRelation compare = float16_compare(a, b, s);
-return compare == float_relation_less ||
-   compare == float_relation_equal;
-}
-
 GEN_VEXT_CMP_VV_ENV(vmfle_vv_h, uint16_t, H2, float16_le)
 GEN_VEXT_CMP_VV_ENV(vmfle_vv_w, uint32_t, H4, float32_le)
 GEN_VEXT_CMP_VV_ENV(vmfle_vv_d, uint64_t, H8, float64_le)
-- 
2.17.1




[RFC v4 48/70] target/riscv: rvv-1.0: integer comparison instructions

2020-08-17 Thread frank . chang
From: Frank Chang 

* Sign-extend vmsleu.vi and vmsgtu.vi immediate values.
* Stop setting tail elements to zero: tail elements may be left unchanged
  under both the tail-undisturbed and the tail-agnostic VTA settings.

Signed-off-by: Frank Chang 
---
 target/riscv/insn_trans/trans_rvv.inc.c | 4 ++--
 target/riscv/vector_helper.c| 8 
 2 files changed, 2 insertions(+), 10 deletions(-)

diff --git a/target/riscv/insn_trans/trans_rvv.inc.c b/target/riscv/insn_trans/trans_rvv.inc.c
index ef100254830..c3be3dd97ff 100644
--- a/target/riscv/insn_trans/trans_rvv.inc.c
+++ b/target/riscv/insn_trans/trans_rvv.inc.c
@@ -2090,9 +2090,9 @@ GEN_OPIVX_TRANS(vmsgt_vx, opivx_cmp_check)
 
 GEN_OPIVI_TRANS(vmseq_vi, IMM_SX, vmseq_vx, opivx_cmp_check)
 GEN_OPIVI_TRANS(vmsne_vi, IMM_SX, vmsne_vx, opivx_cmp_check)
-GEN_OPIVI_TRANS(vmsleu_vi, IMM_ZX, vmsleu_vx, opivx_cmp_check)
+GEN_OPIVI_TRANS(vmsleu_vi, IMM_SX, vmsleu_vx, opivx_cmp_check)
 GEN_OPIVI_TRANS(vmsle_vi, IMM_SX, vmsle_vx, opivx_cmp_check)
-GEN_OPIVI_TRANS(vmsgtu_vi, IMM_ZX, vmsgtu_vx, opivx_cmp_check)
+GEN_OPIVI_TRANS(vmsgtu_vi, IMM_SX, vmsgtu_vx, opivx_cmp_check)
 GEN_OPIVI_TRANS(vmsgt_vi, IMM_SX, vmsgt_vx, opivx_cmp_check)
 
 /* Vector Integer Min/Max Instructions */
diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c
index 544c8e38fca..f80c13b0857 100644
--- a/target/riscv/vector_helper.c
+++ b/target/riscv/vector_helper.c
@@ -1403,7 +1403,6 @@ void HELPER(NAME)(void *vd, void *v0, void *vs1, void *vs2,   \
 { \
 uint32_t vm = vext_vm(desc);  \
 uint32_t vl = env->vl;\
-uint32_t vlmax = vext_maxsz(desc) / sizeof(ETYPE);\
 uint32_t i;   \
   \
 for (i = 0; i < vl; i++) {\
@@ -1414,9 +1413,6 @@ void HELPER(NAME)(void *vd, void *v0, void *vs1, void *vs2,   \
 } \
 vext_set_elem_mask(vd, i, DO_OP(s2, s1)); \
 } \
-for (; i < vlmax; i++) {  \
-vext_set_elem_mask(vd, i, 0); \
-} \
 }
 
 GEN_VEXT_CMP_VV(vmseq_vv_b, uint8_t,  H1, DO_MSEQ)
@@ -1455,7 +1451,6 @@ void HELPER(NAME)(void *vd, void *v0, target_ulong s1, void *vs2,   \
 {   \
 uint32_t vm = vext_vm(desc);\
 uint32_t vl = env->vl;  \
-uint32_t vlmax = vext_maxsz(desc) / sizeof(ETYPE);  \
 uint32_t i; \
 \
 for (i = 0; i < vl; i++) {  \
@@ -1466,9 +1461,6 @@ void HELPER(NAME)(void *vd, void *v0, target_ulong s1, void *vs2,   \
 vext_set_elem_mask(vd, i,   \
 DO_OP(s2, (ETYPE)(target_long)s1)); \
 }   \
-for (; i < vlmax; i++) {\
-vext_set_elem_mask(vd, i, 0);   \
-}   \
 }
 
 GEN_VEXT_CMP_VX(vmseq_vx_b, uint8_t,  H1, DO_MSEQ)
-- 
2.17.1
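
To illustrate why the IMM_ZX to IMM_SX switch in the patch above matters for the unsigned compares, here is a minimal standalone sketch for 8-bit elements. The 5-bit immediate is sign-extended first and only then interpreted as an unsigned value of element width. The helper names sext5(), msgtu_vi_u8() and msleu_vi_u8() are hypothetical and only model the arithmetic, not the QEMU helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: sign-extend the low 5 bits of `imm5`. */
static int64_t sext5(uint32_t imm5)
{
    int64_t v = imm5 & 0x1f;
    return (v ^ 0x10) - 0x10;
}

/* Model of vmsgtu.vi on one 8-bit element: elem > imm, unsigned. */
static bool msgtu_vi_u8(uint8_t elem, uint32_t imm5)
{
    /* 0x1f sign-extends to -1, which is 0xff as an unsigned byte. */
    uint8_t rhs = (uint8_t)sext5(imm5);
    return elem > rhs;
}

/* Model of vmsleu.vi on one 8-bit element: elem <= imm, unsigned. */
static bool msleu_vi_u8(uint8_t elem, uint32_t imm5)
{
    uint8_t rhs = (uint8_t)sext5(imm5);
    return elem <= rhs;
}
```

With zero extension an immediate of 0x1f would compare as 31; with sign extension it compares as 255, so an element of 200 is not greater than it.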




[RFC v4 56/70] target/riscv: rvv-1.0: widening floating-point reduction instructions

2020-08-17 Thread frank . chang
From: Frank Chang 

Signed-off-by: Frank Chang 
---
 target/riscv/insn_trans/trans_rvv.inc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/riscv/insn_trans/trans_rvv.inc.c b/target/riscv/insn_trans/trans_rvv.inc.c
index 41a60cf2fb9..2ebe2373237 100644
--- a/target/riscv/insn_trans/trans_rvv.inc.c
+++ b/target/riscv/insn_trans/trans_rvv.inc.c
@@ -2977,7 +2977,7 @@ GEN_OPFVV_TRANS(vfredmax_vs, reduction_check)
 GEN_OPFVV_TRANS(vfredmin_vs, reduction_check)
 
 /* Vector Widening Floating-Point Reduction Instructions */
-GEN_OPFVV_WIDEN_TRANS(vfwredsum_vs, reduction_check)
+GEN_OPFVV_WIDEN_TRANS(vfwredsum_vs, reduction_widen_check)
 
 /*
  *** Vector Mask Operations
-- 
2.17.1




[RFC v4 64/70] target/riscv: rvv-1.0: widening floating-point/integer type-convert

2020-08-17 Thread frank . chang
From: Frank Chang 

Add the following instructions:

* vfwcvt.rtz.xu.f.v
* vfwcvt.rtz.x.f.v

Also adjust GEN_OPFV_WIDEN_TRANS() to accept multiple floating-point
rounding modes.
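
As a rough illustration of what a widening round-towards-zero conversion computes per element, here is a minimal standalone sketch converting a 32-bit float to a 64-bit signed integer. The function name f32_to_i64_rtz() is hypothetical. The saturation and NaN results are assumed to follow the RISC-V scalar FCVT convention (NaN and positive overflow give the largest value, negative overflow the smallest); the patch itself relies on softfloat with the rounding mode passed through the new FRM macro argument.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>

/*
 * Hypothetical model of one element of a widening float-to-signed
 * conversion with round-towards-zero.
 */
static int64_t f32_to_i64_rtz(float x)
{
    if (isnan(x)) {
        return INT64_MAX;                 /* assumed NaN result */
    }
    if (x >= 9223372036854775808.0f) {    /* x >= 2^63 */
        return INT64_MAX;
    }
    if (x < -9223372036854775808.0f) {    /* x < -2^63 */
        return INT64_MIN;
    }
    /* A C cast from float to integer truncates towards zero. */
    return (int64_t)x;
}
```

Here 2.9f converts to 2 and -2.9f converts to -2, regardless of the dynamic rounding mode.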

Signed-off-by: Frank Chang 
---
 target/riscv/helper.h   |  6 +++
 target/riscv/insn32.decode  | 13 ---
 target/riscv/insn_trans/trans_rvv.inc.c | 50 +
 target/riscv/vector_helper.c| 25 -
 4 files changed, 81 insertions(+), 13 deletions(-)

diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index 5ef37b9dc49..7539b4a5004 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -995,12 +995,18 @@ DEF_HELPER_5(vfwcvt_xu_f_v_h, void, ptr, ptr, ptr, env, i32)
 DEF_HELPER_5(vfwcvt_xu_f_v_w, void, ptr, ptr, ptr, env, i32)
 DEF_HELPER_5(vfwcvt_x_f_v_h, void, ptr, ptr, ptr, env, i32)
 DEF_HELPER_5(vfwcvt_x_f_v_w, void, ptr, ptr, ptr, env, i32)
+DEF_HELPER_5(vfwcvt_f_xu_v_b, void, ptr, ptr, ptr, env, i32)
 DEF_HELPER_5(vfwcvt_f_xu_v_h, void, ptr, ptr, ptr, env, i32)
 DEF_HELPER_5(vfwcvt_f_xu_v_w, void, ptr, ptr, ptr, env, i32)
+DEF_HELPER_5(vfwcvt_f_x_v_b, void, ptr, ptr, ptr, env, i32)
 DEF_HELPER_5(vfwcvt_f_x_v_h, void, ptr, ptr, ptr, env, i32)
 DEF_HELPER_5(vfwcvt_f_x_v_w, void, ptr, ptr, ptr, env, i32)
 DEF_HELPER_5(vfwcvt_f_f_v_h, void, ptr, ptr, ptr, env, i32)
 DEF_HELPER_5(vfwcvt_f_f_v_w, void, ptr, ptr, ptr, env, i32)
+DEF_HELPER_5(vfwcvt_rtz_xu_f_v_h, void, ptr, ptr, ptr, env, i32)
+DEF_HELPER_5(vfwcvt_rtz_xu_f_v_w, void, ptr, ptr, ptr, env, i32)
+DEF_HELPER_5(vfwcvt_rtz_x_f_v_h, void, ptr, ptr, ptr, env, i32)
+DEF_HELPER_5(vfwcvt_rtz_x_f_v_w, void, ptr, ptr, ptr, env, i32)
 
 DEF_HELPER_5(vfncvt_xu_f_v_h, void, ptr, ptr, ptr, env, i32)
 DEF_HELPER_5(vfncvt_xu_f_v_w, void, ptr, ptr, ptr, env, i32)
diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
index c25c03dfb7c..fae96194078 100644
--- a/target/riscv/insn32.decode
+++ b/target/riscv/insn32.decode
@@ -563,11 +563,14 @@ vfcvt_f_xu_v   010010 . . 00010 001 . 1010111 @r2_vm
 vfcvt_f_x_v010010 . . 00011 001 . 1010111 @r2_vm
 vfcvt_rtz_xu_f_v   010010 . . 00110 001 . 1010111 @r2_vm
 vfcvt_rtz_x_f_v010010 . . 00111 001 . 1010111 @r2_vm
-vfwcvt_xu_f_v   100010 . . 01000 001 . 1010111 @r2_vm
-vfwcvt_x_f_v100010 . . 01001 001 . 1010111 @r2_vm
-vfwcvt_f_xu_v   100010 . . 01010 001 . 1010111 @r2_vm
-vfwcvt_f_x_v100010 . . 01011 001 . 1010111 @r2_vm
-vfwcvt_f_f_v100010 . . 01100 001 . 1010111 @r2_vm
+
+vfwcvt_xu_f_v  010010 . . 01000 001 . 1010111 @r2_vm
+vfwcvt_x_f_v   010010 . . 01001 001 . 1010111 @r2_vm
+vfwcvt_f_xu_v  010010 . . 01010 001 . 1010111 @r2_vm
+vfwcvt_f_x_v   010010 . . 01011 001 . 1010111 @r2_vm
+vfwcvt_f_f_v   010010 . . 01100 001 . 1010111 @r2_vm
+vfwcvt_rtz_xu_f_v  010010 . . 01110 001 . 1010111 @r2_vm
+vfwcvt_rtz_x_f_v   010010 . . 0 001 . 1010111 @r2_vm
 vfncvt_xu_f_v   100010 . . 1 001 . 1010111 @r2_vm
 vfncvt_x_f_v100010 . . 10001 001 . 1010111 @r2_vm
 vfncvt_f_xu_v   100010 . . 10010 001 . 1010111 @r2_vm
diff --git a/target/riscv/insn_trans/trans_rvv.inc.c b/target/riscv/insn_trans/trans_rvv.inc.c
index 9cc5e2315cd..877655d9671 100644
--- a/target/riscv/insn_trans/trans_rvv.inc.c
+++ b/target/riscv/insn_trans/trans_rvv.inc.c
@@ -2854,7 +2854,7 @@ static bool opfv_widen_check(DisasContext *s, arg_rmr *a)
(s->sew != 0);
 }
 
-#define GEN_OPFV_WIDEN_TRANS(NAME) \
+#define GEN_OPFV_WIDEN_TRANS(NAME, FRM)\
 static bool trans_##NAME(DisasContext *s, arg_rmr *a)  \
 {  \
 if (opfv_widen_check(s, a)) {  \
@@ -2864,7 +2864,7 @@ static bool trans_##NAME(DisasContext *s, arg_rmr *a)  \
 gen_helper_##NAME##_w, \
 }; \
 TCGLabel *over = gen_new_label();  \
-gen_set_rm(s, FRM_DYN);\
+gen_set_rm(s, FRM);\
 tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_vl, 0, over);  \
\
 data = FIELD_DP32(data, VDATA, VM, a->vm); \
@@ -2879,11 +2879,47 @@ static bool trans_##NAME(DisasContext *s, arg_rmr *a)  \
 return false;  \
 }
 
-GEN_OPFV_WIDEN_TRANS(vfwcvt_xu_f_v)
-GEN_OPFV_WIDEN_TRANS(vfwcvt_x_f_v)
-GEN_OPFV_WIDEN_TRANS(vfwcvt_f_xu_v)
-GEN_OPFV_WIDEN_TRANS(vfwcvt_f_x_v)
-GEN_OPFV_WIDEN_TRANS(vfwcvt_f_f_v)
+GEN_OPFV_WIDEN_TRANS(vfwcvt_xu_f_v, FRM_DYN)
+GEN_OPFV_WIDEN

[RFC v4 61/70] target/riscv: rvv-1.0: floating-point min/max instructions

2020-08-17 Thread frank . chang
From: Frank Chang 

Signed-off-by: Frank Chang 
---
 target/riscv/vector_helper.c | 24 
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c
index 600d2b53353..4d9a1cf3651 100644
--- a/target/riscv/vector_helper.c
+++ b/target/riscv/vector_helper.c
@@ -3640,28 +3640,28 @@ GEN_VEXT_V_ENV(vfsqrt_v_w, 4, 4)
 GEN_VEXT_V_ENV(vfsqrt_v_d, 8, 8)
 
 /* Vector Floating-Point MIN/MAX Instructions */
-RVVCALL(OPFVV2, vfmin_vv_h, OP_UUU_H, H2, H2, H2, float16_minnum)
-RVVCALL(OPFVV2, vfmin_vv_w, OP_UUU_W, H4, H4, H4, float32_minnum)
-RVVCALL(OPFVV2, vfmin_vv_d, OP_UUU_D, H8, H8, H8, float64_minnum)
+RVVCALL(OPFVV2, vfmin_vv_h, OP_UUU_H, H2, H2, H2, float16_minnum_noprop)
+RVVCALL(OPFVV2, vfmin_vv_w, OP_UUU_W, H4, H4, H4, float32_minnum_noprop)
+RVVCALL(OPFVV2, vfmin_vv_d, OP_UUU_D, H8, H8, H8, float64_minnum_noprop)
 GEN_VEXT_VV_ENV(vfmin_vv_h, 2, 2)
 GEN_VEXT_VV_ENV(vfmin_vv_w, 4, 4)
 GEN_VEXT_VV_ENV(vfmin_vv_d, 8, 8)
-RVVCALL(OPFVF2, vfmin_vf_h, OP_UUU_H, H2, H2, float16_minnum)
-RVVCALL(OPFVF2, vfmin_vf_w, OP_UUU_W, H4, H4, float32_minnum)
-RVVCALL(OPFVF2, vfmin_vf_d, OP_UUU_D, H8, H8, float64_minnum)
+RVVCALL(OPFVF2, vfmin_vf_h, OP_UUU_H, H2, H2, float16_minnum_noprop)
+RVVCALL(OPFVF2, vfmin_vf_w, OP_UUU_W, H4, H4, float32_minnum_noprop)
+RVVCALL(OPFVF2, vfmin_vf_d, OP_UUU_D, H8, H8, float64_minnum_noprop)
 GEN_VEXT_VF(vfmin_vf_h, 2, 2)
 GEN_VEXT_VF(vfmin_vf_w, 4, 4)
 GEN_VEXT_VF(vfmin_vf_d, 8, 8)
 
-RVVCALL(OPFVV2, vfmax_vv_h, OP_UUU_H, H2, H2, H2, float16_maxnum)
-RVVCALL(OPFVV2, vfmax_vv_w, OP_UUU_W, H4, H4, H4, float32_maxnum)
-RVVCALL(OPFVV2, vfmax_vv_d, OP_UUU_D, H8, H8, H8, float64_maxnum)
+RVVCALL(OPFVV2, vfmax_vv_h, OP_UUU_H, H2, H2, H2, float16_maxnum_noprop)
+RVVCALL(OPFVV2, vfmax_vv_w, OP_UUU_W, H4, H4, H4, float32_maxnum_noprop)
+RVVCALL(OPFVV2, vfmax_vv_d, OP_UUU_D, H8, H8, H8, float64_maxnum_noprop)
 GEN_VEXT_VV_ENV(vfmax_vv_h, 2, 2)
 GEN_VEXT_VV_ENV(vfmax_vv_w, 4, 4)
 GEN_VEXT_VV_ENV(vfmax_vv_d, 8, 8)
-RVVCALL(OPFVF2, vfmax_vf_h, OP_UUU_H, H2, H2, float16_maxnum)
-RVVCALL(OPFVF2, vfmax_vf_w, OP_UUU_W, H4, H4, float32_maxnum)
-RVVCALL(OPFVF2, vfmax_vf_d, OP_UUU_D, H8, H8, float64_maxnum)
+RVVCALL(OPFVF2, vfmax_vf_h, OP_UUU_H, H2, H2, float16_maxnum_noprop)
+RVVCALL(OPFVF2, vfmax_vf_w, OP_UUU_W, H4, H4, float32_maxnum_noprop)
+RVVCALL(OPFVF2, vfmax_vf_d, OP_UUU_D, H8, H8, float64_maxnum_noprop)
 GEN_VEXT_VF(vfmax_vf_h, 2, 2)
 GEN_VEXT_VF(vfmax_vf_w, 4, 4)
 GEN_VEXT_VF(vfmax_vf_d, 8, 8)
-- 
2.17.1




[RFC v4 41/70] target/riscv: rvv-1.0: single-width bit shift instructions

2020-08-17 Thread frank . chang
From: Frank Chang 

Truncate the immediate values of vsll.vi, vsrl.vi and vsra.vi to log2(SEW) bits.

Signed-off-by: Frank Chang 
Reviewed-by: Richard Henderson 
---
 target/riscv/insn_trans/trans_rvv.inc.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/target/riscv/insn_trans/trans_rvv.inc.c b/target/riscv/insn_trans/trans_rvv.inc.c
index 16e0941efb6..b763c3956cb 100644
--- a/target/riscv/insn_trans/trans_rvv.inc.c
+++ b/target/riscv/insn_trans/trans_rvv.inc.c
@@ -1915,9 +1915,9 @@ GEN_OPIVX_GVEC_SHIFT_TRANS(vsll_vx,  shls)
 GEN_OPIVX_GVEC_SHIFT_TRANS(vsrl_vx,  shrs)
 GEN_OPIVX_GVEC_SHIFT_TRANS(vsra_vx,  sars)
 
-GEN_OPIVI_GVEC_TRANS(vsll_vi, IMM_ZX, vsll_vx, shli)
-GEN_OPIVI_GVEC_TRANS(vsrl_vi, IMM_ZX, vsrl_vx, shri)
-GEN_OPIVI_GVEC_TRANS(vsra_vi, IMM_ZX, vsra_vx, sari)
+GEN_OPIVI_GVEC_TRANS(vsll_vi, IMM_TRUNC_SEW, vsll_vx, shli)
+GEN_OPIVI_GVEC_TRANS(vsrl_vi, IMM_TRUNC_SEW, vsrl_vx, shri)
+GEN_OPIVI_GVEC_TRANS(vsra_vi, IMM_TRUNC_SEW, vsra_vx, sari)
 
 /* Vector Narrowing Integer Right Shift Instructions */
 static bool opivv_narrow_check(DisasContext *s, arg_rmrr *a)
-- 
2.17.1




[RFC v4 69/70] target/riscv: gdb: support vector registers for rv64

2020-08-17 Thread frank . chang
From: Hsiangkai Wang 

Signed-off-by: Hsiangkai Wang 
Signed-off-by: Frank Chang 
---
 gdb-xml/riscv-64bit-csr.xml |   7 ++
 target/riscv/cpu.c  |   1 +
 target/riscv/cpu.h  |  25 +++
 target/riscv/gdbstub.c  | 126 +++-
 4 files changed, 157 insertions(+), 2 deletions(-)

diff --git a/gdb-xml/riscv-64bit-csr.xml b/gdb-xml/riscv-64bit-csr.xml
index 90394562930..f768c3202a4 100644
--- a/gdb-xml/riscv-64bit-csr.xml
+++ b/gdb-xml/riscv-64bit-csr.xml
@@ -248,4 +248,11 @@
   
   
   
+  
+  
+  
+  
+  
+  
+  
 
diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index 8844975bf94..e04cea5514c 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -548,6 +548,7 @@ static void riscv_cpu_class_init(ObjectClass *c, void *data)
 #elif defined(TARGET_RISCV64)
 cc->gdb_core_xml_file = "riscv-64bit-cpu.xml";
 #endif
+cc->gdb_get_dynamic_xml = riscv_gdb_get_dynamic_xml;
 cc->gdb_stop_before_watchpoint = true;
 cc->disas_set_info = riscv_cpu_disas_set_info;
 #ifndef CONFIG_USER_ONLY
diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
index 2c7ce500fa7..932b7e8d0fe 100644
--- a/target/riscv/cpu.h
+++ b/target/riscv/cpu.h
@@ -102,6 +102,16 @@ FIELD(VTYPE, VEDIV, 8, 2)
 FIELD(VTYPE, RESERVED, 10, sizeof(target_ulong) * 8 - 11)
 FIELD(VTYPE, VILL, sizeof(target_ulong) * 8 - 1, 1)
 
+/**
+ * DynamicGDBXMLInfo:
+ * @desc: Contains the XML descriptions.
+ * @num: Number of the registers in this XML seen by GDB.
+ */
+typedef struct DynamicGDBXMLInfo {
+char *desc;
+int num;
+} DynamicGDBXMLInfo;
+
 struct CPURISCVState {
 target_ulong gpr[32];
 uint64_t fpr[32]; /* assume both F and D extensions */
@@ -295,6 +305,8 @@ typedef struct RISCVCPU {
 bool mmu;
 bool pmp;
 } cfg;
+
+DynamicGDBXMLInfo dyn_vreg_xml;
 } RISCVCPU;
 
 static inline int riscv_has_ext(CPURISCVState *env, target_ulong ext)
@@ -485,6 +497,19 @@ typedef struct {
 void riscv_get_csr_ops(int csrno, riscv_csr_operations *ops);
 void riscv_set_csr_ops(int csrno, riscv_csr_operations *ops);
 
+/*
+ * Helper to dynamically generate XML descriptions of the
+ * vector registers. Returns the number of registers in each set.
+ */
+int ricsv_gen_dynamic_vector_xml(CPUState *cpu, int base_reg);
+
+/*
+ * Returns the dynamically generated XML for the gdb stub.
+ * Returns a pointer to the XML contents for the specified XML file or NULL
+ * if the XML name doesn't match the predefined one.
+ */
+const char *riscv_gdb_get_dynamic_xml(CPUState *cpu, const char *xmlname);
+
 void riscv_cpu_register_gdb_regs_for_features(CPUState *cs);
 
 #endif /* RISCV_CPU_H */
diff --git a/target/riscv/gdbstub.c b/target/riscv/gdbstub.c
index f7c5212e274..ceb73a08b25 100644
--- a/target/riscv/gdbstub.c
+++ b/target/riscv/gdbstub.c
@@ -268,6 +268,39 @@ static int csr_register_map[] = {
 CSR_MUCOUNTEREN,
 CSR_MSCOUNTEREN,
 CSR_MHCOUNTEREN,
+CSR_VSTART,
+CSR_VXSAT,
+CSR_VXRM,
+CSR_VCSR,
+CSR_VL,
+CSR_VTYPE,
+CSR_VLENB,
+};
+
+struct TypeSize {
+const char *gdb_type;
+const char *id;
+int size;
+const char suffix;
+};
+
+static const struct TypeSize vec_lanes[] = {
+/* quads */
+{ "uint128", "quads", 128, 'q' },
+/* 64 bit */
+{ "uint64", "longs", 64, 'l' },
+/* 32 bit */
+{ "uint32", "words", 32, 'w' },
+/* 16 bit */
+{ "uint16", "shorts", 16, 's' },
+/*
+ * TODO: currently there is no reliable way of telling
+ * if the remote gdb actually understands ieee_half so
+ * we don't expose it in the target description for now.
+ * { "ieee_half", 16, 'h', 'f' },
+ */
+/* bytes */
+{ "uint8", "bytes", 8, 'b' },
 };
 
 int riscv_cpu_gdb_read_register(CPUState *cs, GByteArray *mem_buf, int n)
@@ -351,6 +384,34 @@ static int riscv_gdb_set_fpu(CPURISCVState *env, uint8_t *mem_buf, int n)
 return 0;
 }
 
+static int riscv_gdb_get_vector(CPURISCVState *env, GByteArray *buf, int n)
+{
+uint16_t vlenb = env_archcpu(env)->cfg.vlen >> 3;
+if (n < 32) {
+int i;
+int cnt = 0;
+for (i = 0; i < vlenb; i += 8) {
+cnt += gdb_get_reg64(buf,
+ env->vreg[(n * vlenb + i) / 8]);
+}
+return cnt;
+}
+return 0;
+}
+
+static int riscv_gdb_set_vector(CPURISCVState *env, uint8_t *mem_buf, int n)
+{
+uint16_t vlenb = env_archcpu(env)->cfg.vlen >> 3;
+if (n < 32) {
+int i;
+for (i = 0; i < vlenb; i += 8) {
+env->vreg[(n * vlenb + i) / 8] = ldq_p(mem_buf + i);
+}
+return vlenb;
+}
+return 0;
+}
+
 static int riscv_gdb_get_csr(CPURISCVState *env, GByteArray *buf, int n)
 {
 if (n < ARRAY_SIZE(csr_register_map)) {
@@ -405,6 +466,51 @@ static int riscv_gdb_set_
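
The excerpt of this patch is cut off here. The index arithmetic used by riscv_gdb_get_vector() above can be tried out in isolation with the following minimal sketch. It assumes, as the patch's indexing suggests, that the register file is a flat array of 64-bit chunks; read_vreg() is a hypothetical name and does not use the gdbstub API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: copy vector register `n`, which is `vlenb`
 * bytes wide, out of a flat register file of 64-bit chunks.
 * Returns the number of bytes copied.
 */
static size_t read_vreg(const uint64_t *vreg_file, unsigned n,
                        unsigned vlenb, uint64_t *out)
{
    assert(vlenb % 8 == 0);
    size_t bytes = 0;
    for (unsigned i = 0; i < vlenb; i += 8) {
        /* Byte offset n * vlenb + i, expressed in 64-bit chunks. */
        out[i / 8] = vreg_file[(n * vlenb + i) / 8];
        bytes += 8;
    }
    return bytes;
}
```

With vlenb = 16, register 2 starts at byte offset 32, so it covers chunks 4 and 5 of the flat array.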

[RFC v4 11/70] target/riscv: rvv-1.0: remove MLEN calculations

2020-08-17 Thread frank . chang
From: Frank Chang 

In the RVV 1.0 design, MLEN is hardcoded to 1 (Section 4.5).
Thus, remove all MLEN-related calculations.
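
With MLEN fixed at 1, the mask for element i is simply bit i of the mask register. A minimal standalone sketch of that lookup follows, assuming the mask register is viewed as an array of 64-bit words; mask_bit() and set_mask_bit() are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: read the mask bit for element `i` (MLEN = 1). */
static bool mask_bit(const uint64_t *v0, unsigned i)
{
    return (v0[i / 64] >> (i % 64)) & 1;
}

/* Hypothetical helper: write the mask bit for element `i` (MLEN = 1). */
static void set_mask_bit(uint64_t *vd, unsigned i, bool value)
{
    uint64_t bit = UINT64_C(1) << (i % 64);
    if (value) {
        vd[i / 64] |= bit;
    } else {
        vd[i / 64] &= ~bit;
    }
}
```

Element 65, for example, maps to bit 1 of the second 64-bit word.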

Signed-off-by: Frank Chang 
Reviewed-by: Richard Henderson 
---
 target/riscv/insn_trans/trans_rvv.inc.c |  35 +---
 target/riscv/internals.h|   9 +-
 target/riscv/translate.c|   2 -
 target/riscv/vector_helper.c| 250 ++--
 4 files changed, 110 insertions(+), 186 deletions(-)

diff --git a/target/riscv/insn_trans/trans_rvv.inc.c b/target/riscv/insn_trans/trans_rvv.inc.c
index 1b021603c1c..b529474403e 100644
--- a/target/riscv/insn_trans/trans_rvv.inc.c
+++ b/target/riscv/insn_trans/trans_rvv.inc.c
@@ -247,7 +247,6 @@ static bool ld_us_op(DisasContext *s, arg_r2nfvm *a, uint8_t seq)
 return false;
 }
 
-data = FIELD_DP32(data, VDATA, MLEN, s->mlen);
 data = FIELD_DP32(data, VDATA, VM, a->vm);
 data = FIELD_DP32(data, VDATA, LMUL, s->lmul);
 data = FIELD_DP32(data, VDATA, NF, a->nf);
@@ -300,7 +299,6 @@ static bool st_us_op(DisasContext *s, arg_r2nfvm *a, uint8_t seq)
 return false;
 }
 
-data = FIELD_DP32(data, VDATA, MLEN, s->mlen);
 data = FIELD_DP32(data, VDATA, VM, a->vm);
 data = FIELD_DP32(data, VDATA, LMUL, s->lmul);
 data = FIELD_DP32(data, VDATA, NF, a->nf);
@@ -387,7 +385,6 @@ static bool ld_stride_op(DisasContext *s, arg_rnfvm *a, uint8_t seq)
 return false;
 }
 
-data = FIELD_DP32(data, VDATA, MLEN, s->mlen);
 data = FIELD_DP32(data, VDATA, VM, a->vm);
 data = FIELD_DP32(data, VDATA, LMUL, s->lmul);
 data = FIELD_DP32(data, VDATA, NF, a->nf);
@@ -426,7 +423,6 @@ static bool st_stride_op(DisasContext *s, arg_rnfvm *a, uint8_t seq)
   gen_helper_vsse_v_w,  gen_helper_vsse_v_d }
 };
 
-data = FIELD_DP32(data, VDATA, MLEN, s->mlen);
 data = FIELD_DP32(data, VDATA, VM, a->vm);
 data = FIELD_DP32(data, VDATA, LMUL, s->lmul);
 data = FIELD_DP32(data, VDATA, NF, a->nf);
@@ -518,7 +514,6 @@ static bool ld_index_op(DisasContext *s, arg_rnfvm *a, uint8_t seq)
 return false;
 }
 
-data = FIELD_DP32(data, VDATA, MLEN, s->mlen);
 data = FIELD_DP32(data, VDATA, VM, a->vm);
 data = FIELD_DP32(data, VDATA, LMUL, s->lmul);
 data = FIELD_DP32(data, VDATA, NF, a->nf);
@@ -570,7 +565,6 @@ static bool st_index_op(DisasContext *s, arg_rnfvm *a, uint8_t seq)
 return false;
 }
 
-data = FIELD_DP32(data, VDATA, MLEN, s->mlen);
 data = FIELD_DP32(data, VDATA, VM, a->vm);
 data = FIELD_DP32(data, VDATA, LMUL, s->lmul);
 data = FIELD_DP32(data, VDATA, NF, a->nf);
@@ -649,7 +643,6 @@ static bool ldff_op(DisasContext *s, arg_r2nfvm *a, uint8_t seq)
 return false;
 }
 
-data = FIELD_DP32(data, VDATA, MLEN, s->mlen);
 data = FIELD_DP32(data, VDATA, VM, a->vm);
 data = FIELD_DP32(data, VDATA, LMUL, s->lmul);
 data = FIELD_DP32(data, VDATA, NF, a->nf);
@@ -760,7 +753,6 @@ static bool amo_op(DisasContext *s, arg_rwdvm *a, uint8_t seq)
 }
 }
 
-data = FIELD_DP32(data, VDATA, MLEN, s->mlen);
 data = FIELD_DP32(data, VDATA, VM, a->vm);
 data = FIELD_DP32(data, VDATA, LMUL, s->lmul);
 data = FIELD_DP32(data, VDATA, WD, a->wd);
@@ -839,7 +831,6 @@ do_opivv_gvec(DisasContext *s, arg_rmrr *a, GVecGen3Fn *gvec_fn,
 } else {
 uint32_t data = 0;
 
-data = FIELD_DP32(data, VDATA, MLEN, s->mlen);
 data = FIELD_DP32(data, VDATA, VM, a->vm);
 data = FIELD_DP32(data, VDATA, LMUL, s->lmul);
 tcg_gen_gvec_4_ptr(vreg_ofs(s, a->rd), vreg_ofs(s, 0),
@@ -885,7 +876,6 @@ static bool opivx_trans(uint32_t vd, uint32_t rs1, uint32_t vs2, uint32_t vm,
 src1 = tcg_temp_new();
 gen_get_gpr(src1, rs1);
 
-data = FIELD_DP32(data, VDATA, MLEN, s->mlen);
 data = FIELD_DP32(data, VDATA, VM, vm);
 data = FIELD_DP32(data, VDATA, LMUL, s->lmul);
 desc = tcg_const_i32(simd_desc(0, s->vlen / 8, data));
@@ -1034,7 +1024,6 @@ static bool opivi_trans(uint32_t vd, uint32_t imm, uint32_t vs2, uint32_t vm,
 } else {
 src1 = tcg_const_tl(sextract64(imm, 0, 5));
 }
-data = FIELD_DP32(data, VDATA, MLEN, s->mlen);
 data = FIELD_DP32(data, VDATA, VM, vm);
 data = FIELD_DP32(data, VDATA, LMUL, s->lmul);
 desc = tcg_const_i32(simd_desc(0, s->vlen / 8, data));
@@ -1130,7 +1119,6 @@ static bool do_opivv_widen(DisasContext *s, arg_rmrr *a,
 TCGLabel *over = gen_new_label();
 tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_vl, 0, over);
 
-data = FIELD_DP32(data, VDATA, MLEN, s->mlen);
 data = FIELD_DP32(data, VDATA, VM, a->vm);
 data = FIELD_DP32(data, VDATA, LMUL, s->lmul);
 tcg_gen_gvec_4_ptr(vreg_ofs(s, a->rd), vreg_ofs(s, 0),
@@ -1219,7 +1207,6 @@ static bool do_opiwv_widen(DisasCont

[RFC v4 01/70] target/riscv: drop vector 0.7.1 and add 1.0 support

2020-08-17 Thread frank . chang
From: Frank Chang 

Signed-off-by: Frank Chang 
Reviewed-by: Richard Henderson 
Reviewed-by: Alistair Francis 
---
 target/riscv/cpu.c | 10 +-
 target/riscv/cpu.h |  2 +-
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index 228b9bdb5d6..085381fee00 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -339,7 +339,7 @@ static void riscv_cpu_realize(DeviceState *dev, Error **errp)
 CPURISCVState *env = &cpu->env;
 RISCVCPUClass *mcc = RISCV_CPU_GET_CLASS(dev);
 int priv_version = PRIV_VERSION_1_11_0;
-int vext_version = VEXT_VERSION_0_07_1;
+int vext_version = VEXT_VERSION_1_00_0;
 target_ulong target_misa = 0;
 Error *local_err = NULL;
 
@@ -455,8 +455,8 @@ static void riscv_cpu_realize(DeviceState *dev, Error **errp)
 return;
 }
 if (cpu->cfg.vext_spec) {
-if (!g_strcmp0(cpu->cfg.vext_spec, "v0.7.1")) {
-vext_version = VEXT_VERSION_0_07_1;
+if (!g_strcmp0(cpu->cfg.vext_spec, "v1.0")) {
+vext_version = VEXT_VERSION_1_00_0;
 } else {
 error_setg(errp,
"Unsupported vector spec version '%s'",
@@ -464,8 +464,8 @@ static void riscv_cpu_realize(DeviceState *dev, Error **errp)
 return;
 }
 } else {
-qemu_log("vector verison is not specified, "
-"use the default value v0.7.1\n");
+qemu_log("vector version is not specified, "
+"use the default value v1.0\n");
 }
 set_vext_version(env, vext_version);
 }
diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
index a804a5d0bab..f9ef20fe89a 100644
--- a/target/riscv/cpu.h
+++ b/target/riscv/cpu.h
@@ -79,7 +79,7 @@ enum {
 #define PRIV_VERSION_1_10_0 0x00011000
 #define PRIV_VERSION_1_11_0 0x00011100
 
-#define VEXT_VERSION_0_07_1 0x00000701
+#define VEXT_VERSION_1_00_0 0x00010000
 
 #define TRANSLATE_PMP_FAIL 2
 #define TRANSLATE_FAIL 1
-- 
2.17.1




[RFC v4 03/70] target/riscv: rvv-1.0: add mstatus VS field

2020-08-17 Thread frank . chang
From: LIU Zhiwei 

Signed-off-by: LIU Zhiwei 
Signed-off-by: Frank Chang 
Reviewed-by: Richard Henderson 
---
 target/riscv/cpu.h|  6 ++
 target/riscv/cpu_bits.h   |  1 +
 target/riscv/cpu_helper.c | 16 +++-
 target/riscv/csr.c| 25 -
 4 files changed, 46 insertions(+), 2 deletions(-)

diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
index f9ef20fe89a..08d2c10a024 100644
--- a/target/riscv/cpu.h
+++ b/target/riscv/cpu.h
@@ -317,6 +317,7 @@ int riscv_cpu_gdb_read_register(CPUState *cpu, GByteArray *buf, int reg);
 int riscv_cpu_gdb_write_register(CPUState *cpu, uint8_t *buf, int reg);
 bool riscv_cpu_exec_interrupt(CPUState *cs, int interrupt_request);
 bool riscv_cpu_fp_enabled(CPURISCVState *env);
+bool riscv_cpu_vector_enabled(CPURISCVState *env);
 bool riscv_cpu_virt_enabled(CPURISCVState *env);
 void riscv_cpu_set_virt_enabled(CPURISCVState *env, bool enable);
 bool riscv_cpu_force_hs_excep_enabled(CPURISCVState *env);
@@ -360,6 +361,7 @@ void riscv_cpu_set_fflags(CPURISCVState *env, target_ulong);
 
 #define TB_FLAGS_MMU_MASK   3
 #define TB_FLAGS_MSTATUS_FS MSTATUS_FS
+#define TB_FLAGS_MSTATUS_VS MSTATUS_VS
 
 typedef CPURISCVState CPUArchState;
 typedef RISCVCPU ArchCPU;
@@ -410,11 +412,15 @@ static inline void cpu_get_tb_cpu_state(CPURISCVState *env, target_ulong *pc,
 
 #ifdef CONFIG_USER_ONLY
 flags |= TB_FLAGS_MSTATUS_FS;
+flags |= TB_FLAGS_MSTATUS_VS;
 #else
 flags |= cpu_mmu_index(env, 0);
 if (riscv_cpu_fp_enabled(env)) {
 flags |= env->mstatus & MSTATUS_FS;
 }
+if (riscv_cpu_vector_enabled(env)) {
+flags |= env->mstatus & MSTATUS_VS;
+}
 #endif
 *pflags = flags;
 }
diff --git a/target/riscv/cpu_bits.h b/target/riscv/cpu_bits.h
index 8117e8b5a7e..a8b31208833 100644
--- a/target/riscv/cpu_bits.h
+++ b/target/riscv/cpu_bits.h
@@ -367,6 +367,7 @@
 #define MSTATUS_SPIE        0x00000020
 #define MSTATUS_MPIE        0x00000080
 #define MSTATUS_SPP         0x00000100
+#define MSTATUS_VS          0x00000600
 #define MSTATUS_MPP         0x00001800
 #define MSTATUS_FS          0x00006000
 #define MSTATUS_XS          0x00018000
diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
index 75d2ae34349..3fae736529a 100644
--- a/target/riscv/cpu_helper.c
+++ b/target/riscv/cpu_helper.c
@@ -108,10 +108,24 @@ bool riscv_cpu_fp_enabled(CPURISCVState *env)
 return false;
 }
 
+/* Return true if vector support is currently enabled */
+bool riscv_cpu_vector_enabled(CPURISCVState *env)
+{
+    if (env->mstatus & MSTATUS_VS) {
+        if (riscv_cpu_virt_enabled(env) && !(env->mstatus_hs & MSTATUS_VS)) {
+            return false;
+        }
+        return true;
+    }
+
+    return false;
+}
+
 void riscv_cpu_swap_hypervisor_regs(CPURISCVState *env)
 {
 target_ulong mstatus_mask = MSTATUS_MXR | MSTATUS_SUM | MSTATUS_FS |
-MSTATUS_SPP | MSTATUS_SPIE | MSTATUS_SIE;
+MSTATUS_SPP | MSTATUS_SPIE | MSTATUS_SIE |
+MSTATUS_VS;
 bool current_virt = riscv_cpu_virt_enabled(env);
 
 g_assert(riscv_has_ext(env, RVH));
diff --git a/target/riscv/csr.c b/target/riscv/csr.c
index 6a96a01b1cf..b0413f52d77 100644
--- a/target/riscv/csr.c
+++ b/target/riscv/csr.c
@@ -180,6 +180,7 @@ static int write_fcsr(CPURISCVState *env, int csrno, target_ulong val)
 return -1;
 }
 env->mstatus |= MSTATUS_FS;
+env->mstatus |= MSTATUS_VS;
 #endif
 env->frm = (val & FSR_RD) >> FSR_RD_SHIFT;
 if (vs(env, csrno) >= 0) {
@@ -210,6 +211,13 @@ static int read_vxrm(CPURISCVState *env, int csrno, target_ulong *val)
 
 static int write_vxrm(CPURISCVState *env, int csrno, target_ulong val)
 {
+#if !defined(CONFIG_USER_ONLY)
+if (!env->debugger && !riscv_cpu_vector_enabled(env)) {
+return -1;
+}
+env->mstatus |= MSTATUS_VS;
+#endif
+
 env->vxrm = val;
 return 0;
 }
@@ -222,6 +230,13 @@ static int read_vxsat(CPURISCVState *env, int csrno, target_ulong *val)
 
 static int write_vxsat(CPURISCVState *env, int csrno, target_ulong val)
 {
+#if !defined(CONFIG_USER_ONLY)
+if (!env->debugger && !riscv_cpu_vector_enabled(env)) {
+return -1;
+}
+env->mstatus |= MSTATUS_VS;
+#endif
+
 env->vxsat = val;
 return 0;
 }
@@ -234,6 +249,13 @@ static int read_vstart(CPURISCVState *env, int csrno, target_ulong *val)
 
 static int write_vstart(CPURISCVState *env, int csrno, target_ulong val)
 {
+#if !defined(CONFIG_USER_ONLY)
+if (!env->debugger && !riscv_cpu_vector_enabled(env)) {
+return -1;
+}
+env->mstatus |= MSTATUS_VS;
+#endif
+
 env->vstart = val;
 return 0;
 }
@@ -400,7 +422,7 @@ static int write_mstatus(CPURISCVState *env, int csrno, target_ulong val)
 mask = MSTATUS_SIE | MSTATUS_SPIE | MSTATUS_MIE | MSTAT

[RFC v4 04/70] target/riscv: rvv-1.0: add sstatus VS field

2020-08-17 Thread frank . chang
From: LIU Zhiwei 

Signed-off-by: LIU Zhiwei 
Signed-off-by: Frank Chang 
Reviewed-by: Richard Henderson 
---
 target/riscv/cpu_bits.h | 1 +
 target/riscv/csr.c  | 2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/target/riscv/cpu_bits.h b/target/riscv/cpu_bits.h
index a8b31208833..5b0be0bb888 100644
--- a/target/riscv/cpu_bits.h
+++ b/target/riscv/cpu_bits.h
@@ -422,6 +422,7 @@
 #define SSTATUS_UPIE        0x00000010
 #define SSTATUS_SPIE        0x00000020
 #define SSTATUS_SPP         0x00000100
+#define SSTATUS_VS          0x00000600
 #define SSTATUS_FS          0x00006000
 #define SSTATUS_XS          0x00018000
 #define SSTATUS_PUM         0x00040000 /* until: priv-1.9.1 */
diff --git a/target/riscv/csr.c b/target/riscv/csr.c
index b0413f52d77..46c35266cb5 100644
--- a/target/riscv/csr.c
+++ b/target/riscv/csr.c
@@ -365,7 +365,7 @@ static const target_ulong delegable_excps =
 (1ULL << (RISCV_EXCP_STORE_GUEST_AMO_ACCESS_FAULT));
 static const target_ulong sstatus_v1_10_mask = SSTATUS_SIE | SSTATUS_SPIE |
 SSTATUS_UIE | SSTATUS_UPIE | SSTATUS_SPP | SSTATUS_FS | SSTATUS_XS |
-SSTATUS_SUM | SSTATUS_MXR | SSTATUS_SD;
+SSTATUS_SUM | SSTATUS_MXR | SSTATUS_SD | SSTATUS_VS;
 static const target_ulong sip_writable_mask = SIP_SSIP | MIP_USIP | MIP_UEIP;
 static const target_ulong hip_writable_mask = MIP_VSSIP | MIP_VSTIP | MIP_VSEIP;
 static const target_ulong vsip_writable_mask = MIP_VSSIP;
-- 
2.17.1
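
A note for readers following the VS plumbing in the two patches above: the
mask 0x00000600 selects bits [10:9] of mstatus/sstatus. The sketch below
shows how that two-bit field can be read and how the enable check in
riscv_cpu_vector_enabled() behaves. It is a standalone illustration, not
QEMU code: MSTATUS_VS_SHIFT, vs_state(), vector_enabled() and set_vs_dirty()
are hypothetical names, and the Off/Initial/Clean/Dirty encoding is assumed
to follow the same convention the privileged spec uses for FS.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MSTATUS_VS       0x00000600u  /* bits [10:9], value taken from cpu_bits.h */
#define MSTATUS_VS_SHIFT 9            /* hypothetical helper constant */

/* Extract the 2-bit VS state: 0 = Off, 1 = Initial, 2 = Clean, 3 = Dirty. */
static unsigned vs_state(uint32_t status)
{
    return (status & MSTATUS_VS) >> MSTATUS_VS_SHIFT;
}

/*
 * Same shape as the check in the patch: vector state must be on in
 * mstatus, and, when virtualization is active, also in the HS-level copy.
 */
static bool vector_enabled(uint32_t mstatus, bool virt, uint32_t mstatus_hs)
{
    if (vs_state(mstatus) == 0) {
        return false;
    }
    if (virt && vs_state(mstatus_hs) == 0) {
        return false;
    }
    return true;
}

/* OR-ing in the whole mask sets both bits, which reads back as Dirty. */
static uint32_t set_vs_dirty(uint32_t status)
{
    uint32_t out = status | MSTATUS_VS;
    assert(vs_state(out) == 3);
    return out;
}
```

This is why the CSR write paths can simply do "env->mstatus |= MSTATUS_VS":
whatever the previous non-Off state was, the field ends up as Dirty.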




[RFC v4 14/70] target/riscv: rvv-1.0: update check functions

2020-08-17 Thread frank . chang
From: Frank Chang 

Update check functions with RVV 1.0 rules.

Signed-off-by: Frank Chang 
---
 target/riscv/insn_trans/trans_rvv.inc.c | 708 
 1 file changed, 476 insertions(+), 232 deletions(-)

diff --git a/target/riscv/insn_trans/trans_rvv.inc.c b/target/riscv/insn_trans/trans_rvv.inc.c
index b529474403e..4ab556f784d 100644
--- a/target/riscv/insn_trans/trans_rvv.inc.c
+++ b/target/riscv/insn_trans/trans_rvv.inc.c
@@ -19,11 +19,79 @@
 #include "tcg/tcg-gvec-desc.h"
 #include "internals.h"
 
+#define NVPR    32
+
+static inline bool is_aligned(const uint8_t val, const uint8_t pos)
+{
+    return pos ? (val & (pos - 1)) == 0 : true;
+}
+
+static inline bool is_overlapped(const uint8_t astart, uint8_t asize,
+                                 const uint8_t bstart, uint8_t bsize)
+{
+    asize = asize == 0 ? 1 : asize;
+    bsize = bsize == 0 ? 1 : bsize;
+
+    const int aend = astart + asize;
+    const int bend = bstart + bsize;
+
+    return MAX(aend, bend) - MIN(astart, bstart) < asize + bsize;
+}
+
+static inline bool is_overlapped_widen(const uint8_t astart, uint8_t asize,
+                                       const uint8_t bstart, uint8_t bsize)
+{
+    asize = asize == 0 ? 1 : asize;
+    bsize = bsize == 0 ? 1 : bsize;
+
+    const int aend = astart + asize;
+    const int bend = bstart + bsize;
+
+    if (astart < bstart &&
+        is_overlapped(astart, asize, bstart, bsize) &&
+        !is_overlapped(astart, asize, bstart + bsize, bsize)) {
+        return false;
+    } else {
+        return MAX(aend, bend) - MIN(astart, bstart) < asize + bsize;
+    }
+}
+
+static bool require_rvv(DisasContext *s)
+{
+    if (s->mstatus_vs == 0) {
+        return false;
+    }
+    return true;
+}
+
+/* Destination vector register group cannot overlap source mask register. */
+static bool require_vm(int vm, int rd)
+{
+    return (vm != 0 || rd != 0);
+}
+
+static bool require_align(const uint8_t val, const uint8_t pos)
+{
+    return is_aligned(val, pos);
+}
+
+static bool require_noover(const uint8_t astart, const uint8_t asize,
+                           const uint8_t bstart, const uint8_t bsize)
+{
+    return !is_overlapped(astart, asize, bstart, bsize);
+}
+
+static bool require_noover_widen(const uint8_t astart, const uint8_t asize,
+                                 const uint8_t bstart, const uint8_t bsize)
+{
+    return !is_overlapped_widen(astart, asize, bstart, bsize);
+}
+
 static bool trans_vsetvl(DisasContext *ctx, arg_vsetvl *a)
 {
 TCGv s1, s2, dst;
 
-if (!has_ext(ctx, RVV)) {
+if (!require_rvv(ctx) || !has_ext(ctx, RVV)) {
 return false;
 }
 
@@ -56,7 +124,7 @@ static bool trans_vsetvli(DisasContext *ctx, arg_vsetvli *a)
 {
 TCGv s1, s2, dst;
 
-if (!has_ext(ctx, RVV)) {
+if (!require_rvv(ctx) || !has_ext(ctx, RVV)) {
 return false;
 }
 
@@ -101,53 +169,266 @@ static bool vext_check_isa_ill(DisasContext *s)
 }
 
 /*
- * There are two rules check here.
+ * Check function for vector instruction with format:
+ * single-width result and single-width sources (SEW = SEW op SEW)
  *
- * 1. Vector register numbers are multiples of LMUL. (Section 3.2)
+ * is_vs1: indicates whether insn[19:15] is a vs1 field or not.
  *
- * 2. For all widening instructions, the destination LMUL value must also be
- *a supported LMUL value. (Section 11.2)
+ * Rules to be checked here:
+ *   1. Destination vector register group for a masked vector
+ *      instruction cannot overlap the source mask register (v0).
+ *      (Section 5.3)
+ *   2. Destination vector register number is a multiple of LMUL.
+ *      (Section 3.3.2)
+ *   3. Source (vs2, vs1) vector register numbers are multiples of LMUL.
+ *      (Section 3.3.2)
  */
-static bool vext_check_reg(DisasContext *s, uint32_t reg, bool widen)
+static bool vext_check_sss(DisasContext *s, int vd, int vs1,
+                           int vs2, int vm, bool is_vs1)
+{
+    bool ret = require_vm(vm, vd);
+    if (s->lmul > 0) {
+        ret &= require_align(vd, 1 << s->lmul) &&
+               require_align(vs2, 1 << s->lmul);
+        if (is_vs1) {
+            ret &= require_align(vs1, 1 << s->lmul);
+        }
+    }
+    return ret;
+}
+
+/*
+ * Check function for maskable vector instruction with format:
+ * single-width result and single-width sources (SEW = SEW op SEW)
+ *
+ * is_vs1: indicates whether insn[19:15] is a vs1 field or not.
+ *
+ * Rules to be checked here:
+ *   1. Source (vs2, vs1) vector register numbers are multiples of LMUL.
+ *      (Section 3.3.2)
+ *   2. Destination vector register cannot overlap a source vector
+ *      register (vs2, vs1) group.
+ *      (Section 5.2)
+ */
+static bool vext_check_mss(DisasContext *s, int vd, int vs1,
+   int vs2, bool is_vs1)
 {
-/*
- * The destination vector register group results are arranged as

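For anyone checking the register-group arithmetic in the 14/70 patch above,
here is a minimal standalone sketch of the alignment and overlap tests,
assuming register groups are described as (first register, number of
registers). The name ranges_overlap() is hypothetical; the logic is intended
to mirror is_aligned() and is_overlapped() from the patch, with the extra
assumption (enforced here by an assert, which the patch does not have) that
the alignment argument is zero or a power of two.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True when val is a multiple of pos; pos == 0 means "no constraint". */
static bool is_aligned(uint8_t val, uint8_t pos)
{
    assert((pos & (pos - 1)) == 0);   /* the mask trick needs a power of two */
    return pos ? (val & (pos - 1)) == 0 : true;
}

/*
 * Two half-open ranges [astart, astart + asize) and [bstart, bstart + bsize)
 * overlap exactly when the span covering both is shorter than the sum of
 * their lengths. A size of 0 is treated as a single register.
 */
static bool ranges_overlap(uint8_t astart, uint8_t asize,
                           uint8_t bstart, uint8_t bsize)
{
    asize = asize == 0 ? 1 : asize;
    bsize = bsize == 0 ? 1 : bsize;

    const int aend = astart + asize;
    const int bend = bstart + bsize;
    const int lo = astart < bstart ? astart : bstart;
    const int hi = aend > bend ? aend : bend;

    return hi - lo < asize + bsize;
}
```

For example, with LMUL = 4 the group v8..v11 overlaps v10..v11 but not
v12..v15, and v6 is rejected as a group base because 6 is not a multiple
of 4.
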
[RFC v4 18/70] target/riscv: rvv-1.0: stride load and store instructions

2020-08-17 Thread frank . chang
From: Frank Chang 

Signed-off-by: Frank Chang 
---
 target/riscv/helper.h   | 129 +++---
 target/riscv/insn32.decode  |  43 +++--
 target/riscv/insn_trans/trans_rvv.inc.c | 221 +++-
 target/riscv/vector_helper.c| 188 ++--
 4 files changed, 192 insertions(+), 389 deletions(-)

diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index acc298219da..2311ce39cfd 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -84,111 +84,30 @@ DEF_HELPER_1(hyp_tlb_flush, void, env)
 
 /* Vector functions */
 DEF_HELPER_3(vsetvl, tl, env, tl, tl)
-DEF_HELPER_5(vlb_v_b, void, ptr, ptr, tl, env, i32)
-DEF_HELPER_5(vlb_v_b_mask, void, ptr, ptr, tl, env, i32)
-DEF_HELPER_5(vlb_v_h, void, ptr, ptr, tl, env, i32)
-DEF_HELPER_5(vlb_v_h_mask, void, ptr, ptr, tl, env, i32)
-DEF_HELPER_5(vlb_v_w, void, ptr, ptr, tl, env, i32)
-DEF_HELPER_5(vlb_v_w_mask, void, ptr, ptr, tl, env, i32)
-DEF_HELPER_5(vlb_v_d, void, ptr, ptr, tl, env, i32)
-DEF_HELPER_5(vlb_v_d_mask, void, ptr, ptr, tl, env, i32)
-DEF_HELPER_5(vlh_v_h, void, ptr, ptr, tl, env, i32)
-DEF_HELPER_5(vlh_v_h_mask, void, ptr, ptr, tl, env, i32)
-DEF_HELPER_5(vlh_v_w, void, ptr, ptr, tl, env, i32)
-DEF_HELPER_5(vlh_v_w_mask, void, ptr, ptr, tl, env, i32)
-DEF_HELPER_5(vlh_v_d, void, ptr, ptr, tl, env, i32)
-DEF_HELPER_5(vlh_v_d_mask, void, ptr, ptr, tl, env, i32)
-DEF_HELPER_5(vlw_v_w, void, ptr, ptr, tl, env, i32)
-DEF_HELPER_5(vlw_v_w_mask, void, ptr, ptr, tl, env, i32)
-DEF_HELPER_5(vlw_v_d, void, ptr, ptr, tl, env, i32)
-DEF_HELPER_5(vlw_v_d_mask, void, ptr, ptr, tl, env, i32)
-DEF_HELPER_5(vle_v_b, void, ptr, ptr, tl, env, i32)
-DEF_HELPER_5(vle_v_b_mask, void, ptr, ptr, tl, env, i32)
-DEF_HELPER_5(vle_v_h, void, ptr, ptr, tl, env, i32)
-DEF_HELPER_5(vle_v_h_mask, void, ptr, ptr, tl, env, i32)
-DEF_HELPER_5(vle_v_w, void, ptr, ptr, tl, env, i32)
-DEF_HELPER_5(vle_v_w_mask, void, ptr, ptr, tl, env, i32)
-DEF_HELPER_5(vle_v_d, void, ptr, ptr, tl, env, i32)
-DEF_HELPER_5(vle_v_d_mask, void, ptr, ptr, tl, env, i32)
-DEF_HELPER_5(vlbu_v_b, void, ptr, ptr, tl, env, i32)
-DEF_HELPER_5(vlbu_v_b_mask, void, ptr, ptr, tl, env, i32)
-DEF_HELPER_5(vlbu_v_h, void, ptr, ptr, tl, env, i32)
-DEF_HELPER_5(vlbu_v_h_mask, void, ptr, ptr, tl, env, i32)
-DEF_HELPER_5(vlbu_v_w, void, ptr, ptr, tl, env, i32)
-DEF_HELPER_5(vlbu_v_w_mask, void, ptr, ptr, tl, env, i32)
-DEF_HELPER_5(vlbu_v_d, void, ptr, ptr, tl, env, i32)
-DEF_HELPER_5(vlbu_v_d_mask, void, ptr, ptr, tl, env, i32)
-DEF_HELPER_5(vlhu_v_h, void, ptr, ptr, tl, env, i32)
-DEF_HELPER_5(vlhu_v_h_mask, void, ptr, ptr, tl, env, i32)
-DEF_HELPER_5(vlhu_v_w, void, ptr, ptr, tl, env, i32)
-DEF_HELPER_5(vlhu_v_w_mask, void, ptr, ptr, tl, env, i32)
-DEF_HELPER_5(vlhu_v_d, void, ptr, ptr, tl, env, i32)
-DEF_HELPER_5(vlhu_v_d_mask, void, ptr, ptr, tl, env, i32)
-DEF_HELPER_5(vlwu_v_w, void, ptr, ptr, tl, env, i32)
-DEF_HELPER_5(vlwu_v_w_mask, void, ptr, ptr, tl, env, i32)
-DEF_HELPER_5(vlwu_v_d, void, ptr, ptr, tl, env, i32)
-DEF_HELPER_5(vlwu_v_d_mask, void, ptr, ptr, tl, env, i32)
-DEF_HELPER_5(vsb_v_b, void, ptr, ptr, tl, env, i32)
-DEF_HELPER_5(vsb_v_b_mask, void, ptr, ptr, tl, env, i32)
-DEF_HELPER_5(vsb_v_h, void, ptr, ptr, tl, env, i32)
-DEF_HELPER_5(vsb_v_h_mask, void, ptr, ptr, tl, env, i32)
-DEF_HELPER_5(vsb_v_w, void, ptr, ptr, tl, env, i32)
-DEF_HELPER_5(vsb_v_w_mask, void, ptr, ptr, tl, env, i32)
-DEF_HELPER_5(vsb_v_d, void, ptr, ptr, tl, env, i32)
-DEF_HELPER_5(vsb_v_d_mask, void, ptr, ptr, tl, env, i32)
-DEF_HELPER_5(vsh_v_h, void, ptr, ptr, tl, env, i32)
-DEF_HELPER_5(vsh_v_h_mask, void, ptr, ptr, tl, env, i32)
-DEF_HELPER_5(vsh_v_w, void, ptr, ptr, tl, env, i32)
-DEF_HELPER_5(vsh_v_w_mask, void, ptr, ptr, tl, env, i32)
-DEF_HELPER_5(vsh_v_d, void, ptr, ptr, tl, env, i32)
-DEF_HELPER_5(vsh_v_d_mask, void, ptr, ptr, tl, env, i32)
-DEF_HELPER_5(vsw_v_w, void, ptr, ptr, tl, env, i32)
-DEF_HELPER_5(vsw_v_w_mask, void, ptr, ptr, tl, env, i32)
-DEF_HELPER_5(vsw_v_d, void, ptr, ptr, tl, env, i32)
-DEF_HELPER_5(vsw_v_d_mask, void, ptr, ptr, tl, env, i32)
-DEF_HELPER_5(vse_v_b, void, ptr, ptr, tl, env, i32)
-DEF_HELPER_5(vse_v_b_mask, void, ptr, ptr, tl, env, i32)
-DEF_HELPER_5(vse_v_h, void, ptr, ptr, tl, env, i32)
-DEF_HELPER_5(vse_v_h_mask, void, ptr, ptr, tl, env, i32)
-DEF_HELPER_5(vse_v_w, void, ptr, ptr, tl, env, i32)
-DEF_HELPER_5(vse_v_w_mask, void, ptr, ptr, tl, env, i32)
-DEF_HELPER_5(vse_v_d, void, ptr, ptr, tl, env, i32)
-DEF_HELPER_5(vse_v_d_mask, void, ptr, ptr, tl, env, i32)
-DEF_HELPER_6(vlsb_v_b, void, ptr, ptr, tl, tl, env, i32)
-DEF_HELPER_6(vlsb_v_h, void, ptr, ptr, tl, tl, env, i32)
-DEF_HELPER_6(vlsb_v_w, void, ptr, ptr, tl, tl, env, i32)
-DEF_HELPER_6(vlsb_v_d, void, ptr, ptr, tl, tl, env, i32)
-DEF_HELPER_6(vlsh_v_h, void, ptr, ptr, tl, tl, env, i32)
-DEF_HELPER_6(vlsh_v_w, void, ptr, ptr, tl, tl, env, i32)
-DEF_HELPER_6(vlsh_v_d, void, ptr, ptr, tl, tl, env, i32)
-DEF_HELPER_6(vlsw_v_w, void, ptr, ptr, tl, tl

[RFC v4 40/70] target/riscv: rvv-1.0: single-width averaging add and subtract instructions

2020-08-17 Thread frank . chang
From: Frank Chang 

Add the following instructions:

* vaaddu.vv
* vaaddu.vx
* vasubu.vv
* vasubu.vx

Remove the following instructions:

* vadd.vi

Signed-off-by: Frank Chang 
---
 target/riscv/helper.h   | 16 ++
 target/riscv/insn32.decode  | 13 +++--
 target/riscv/insn_trans/trans_rvv.inc.c |  5 +-
 target/riscv/vector_helper.c| 74 +
 4 files changed, 102 insertions(+), 6 deletions(-)

diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index 7ce2fa08d58..3560bf1d4f5 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -694,18 +694,34 @@ DEF_HELPER_6(vaadd_vv_b, void, ptr, ptr, ptr, ptr, env, i32)
 DEF_HELPER_6(vaadd_vv_h, void, ptr, ptr, ptr, ptr, env, i32)
 DEF_HELPER_6(vaadd_vv_w, void, ptr, ptr, ptr, ptr, env, i32)
 DEF_HELPER_6(vaadd_vv_d, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(vaaddu_vv_b, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(vaaddu_vv_h, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(vaaddu_vv_w, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(vaaddu_vv_d, void, ptr, ptr, ptr, ptr, env, i32)
 DEF_HELPER_6(vasub_vv_b, void, ptr, ptr, ptr, ptr, env, i32)
 DEF_HELPER_6(vasub_vv_h, void, ptr, ptr, ptr, ptr, env, i32)
 DEF_HELPER_6(vasub_vv_w, void, ptr, ptr, ptr, ptr, env, i32)
 DEF_HELPER_6(vasub_vv_d, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(vasubu_vv_b, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(vasubu_vv_h, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(vasubu_vv_w, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(vasubu_vv_d, void, ptr, ptr, ptr, ptr, env, i32)
 DEF_HELPER_6(vaadd_vx_b, void, ptr, ptr, tl, ptr, env, i32)
 DEF_HELPER_6(vaadd_vx_h, void, ptr, ptr, tl, ptr, env, i32)
 DEF_HELPER_6(vaadd_vx_w, void, ptr, ptr, tl, ptr, env, i32)
 DEF_HELPER_6(vaadd_vx_d, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vaaddu_vx_b, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vaaddu_vx_h, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vaaddu_vx_w, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vaaddu_vx_d, void, ptr, ptr, tl, ptr, env, i32)
 DEF_HELPER_6(vasub_vx_b, void, ptr, ptr, tl, ptr, env, i32)
 DEF_HELPER_6(vasub_vx_h, void, ptr, ptr, tl, ptr, env, i32)
 DEF_HELPER_6(vasub_vx_w, void, ptr, ptr, tl, ptr, env, i32)
 DEF_HELPER_6(vasub_vx_d, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vasubu_vx_b, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vasubu_vx_h, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vasubu_vx_w, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vasubu_vx_d, void, ptr, ptr, tl, ptr, env, i32)
 
 DEF_HELPER_6(vsmul_vv_b, void, ptr, ptr, ptr, ptr, env, i32)
 DEF_HELPER_6(vsmul_vv_h, void, ptr, ptr, ptr, ptr, env, i32)
diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
index 2b9700a42ad..fd00ee6fdca 100644
--- a/target/riscv/insn32.decode
+++ b/target/riscv/insn32.decode
@@ -457,11 +457,14 @@ vssubu_vv       100010 . ..... ..... 000 ..... 1010111 @r_vm
 vssubu_vx       100010 . ..... ..... 100 ..... 1010111 @r_vm
 vssub_vv        100011 . ..... ..... 000 ..... 1010111 @r_vm
 vssub_vx        100011 . ..... ..... 100 ..... 1010111 @r_vm
-vaadd_vv        100100 . ..... ..... 000 ..... 1010111 @r_vm
-vaadd_vx        100100 . ..... ..... 100 ..... 1010111 @r_vm
-vaadd_vi        100100 . ..... ..... 011 ..... 1010111 @r_vm
-vasub_vv        100110 . ..... ..... 000 ..... 1010111 @r_vm
-vasub_vx        100110 . ..... ..... 100 ..... 1010111 @r_vm
+vaadd_vv        001001 . ..... ..... 010 ..... 1010111 @r_vm
+vaadd_vx        001001 . ..... ..... 110 ..... 1010111 @r_vm
+vaaddu_vv       001000 . ..... ..... 010 ..... 1010111 @r_vm
+vaaddu_vx       001000 . ..... ..... 110 ..... 1010111 @r_vm
+vasub_vv        001011 . ..... ..... 010 ..... 1010111 @r_vm
+vasub_vx        001011 . ..... ..... 110 ..... 1010111 @r_vm
+vasubu_vv       001010 . ..... ..... 010 ..... 1010111 @r_vm
+vasubu_vx       001010 . ..... ..... 110 ..... 1010111 @r_vm
 vsmul_vv        100111 . ..... ..... 000 ..... 1010111 @r_vm
 vsmul_vx        100111 . ..... ..... 100 ..... 1010111 @r_vm
 vwsmaccu_vv     111100 . ..... ..... 000 ..... 1010111 @r_vm
diff --git a/target/riscv/insn_trans/trans_rvv.inc.c b/target/riscv/insn_trans/trans_rvv.inc.c
index 5cd099bed7b..16e0941efb6 100644
--- a/target/riscv/insn_trans/trans_rvv.inc.c
+++ b/target/riscv/insn_trans/trans_rvv.inc.c
@@ -2237,10 +2237,13 @@ GEN_OPIVI_TRANS(vsadd_vi, IMM_SX, vsadd_vx, opivx_check)
 
 /* Vector Single-Width Averaging Add and Subtract */
 GEN_OPIVV_TRANS(vaadd_vv, opivv_check)
+GEN_OPIVV_TRANS(vaaddu_vv, opivv_check)
 GEN_OPIVV_TRANS(vasub_vv, opivv_check)
+GEN_OPIVV_TRANS(vasubu_vv, opivv_check)
 GEN_OPIVX_TRANS(vaadd_vx,  opivx_check)
+GEN_OPIVX_TRANS(vaaddu_vx,  opivx_check)
 GEN_OPIVX_TRANS(vasub_vx,  opivx_check)
-GEN_OPIVI_TRANS(vaadd_vi, 0, vaadd_vx, opivx_check)
+GEN_OPIVX_TRANS(vasubu_vx,  opivx_check)
 
 /* Vector Single-Width Fractional Multiply with Rounding and Saturation

[RFC v4 32/70] target/riscv: rvv-1.0: element index instruction

2020-08-17 Thread frank . chang
From: Frank Chang 

Signed-off-by: Frank Chang 
Reviewed-by: Richard Henderson 
---
 target/riscv/insn32.decode | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
index 7a10fc27c5f..15afc469cb0 100644
--- a/target/riscv/insn32.decode
+++ b/target/riscv/insn32.decode
@@ -597,7 +597,7 @@ vmsbf_m         010100 . ..... 00001 010 ..... 1010111 @r2_vm
 vmsif_m         010100 . ..... 00011 010 ..... 1010111 @r2_vm
 vmsof_m         010100 . ..... 00010 010 ..... 1010111 @r2_vm
 viota_m         010100 . ..... 10000 010 ..... 1010111 @r2_vm
-vid_v           010110 . 00000 10001 010 ..... 1010111 @r1_vm
+vid_v           010100 . 00000 10001 010 ..... 1010111 @r1_vm
 vext_x_v        001100 1 ..... ..... 010 ..... 1010111 @r
 vmv_s_x         001101 1 00000 ..... 110 ..... 1010111 @r2
 vfmv_f_s        001100 1 ..... 00000 001 ..... 1010111 @r2rd
-- 
2.17.1




[RFC v4 38/70] target/riscv: rvv-1.0: whole register move instructions

2020-08-17 Thread frank . chang
From: Frank Chang 

Add the following instructions:

* vmv1r.v
* vmv2r.v
* vmv4r.v
* vmv8r.v

Signed-off-by: Frank Chang 
---
 target/riscv/insn32.decode  |  4 
 target/riscv/insn_trans/trans_rvv.inc.c | 25 +
 2 files changed, 29 insertions(+)

diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
index 97fce34fcd8..65ff1688c25 100644
--- a/target/riscv/insn32.decode
+++ b/target/riscv/insn32.decode
@@ -614,6 +614,10 @@ vrgatherei16_vv 001110 . ..... ..... 000 ..... 1010111 @r_vm
 vrgather_vx     001100 . ..... ..... 100 ..... 1010111 @r_vm
 vrgather_vi     001100 . ..... ..... 011 ..... 1010111 @r_vm
 vcompress_vm    010111 - ..... ..... 010 ..... 1010111 @r
+vmv1r_v         100111 1 ..... 00000 011 ..... 1010111 @r2rd
+vmv2r_v         100111 1 ..... 00001 011 ..... 1010111 @r2rd
+vmv4r_v         100111 1 ..... 00011 011 ..... 1010111 @r2rd
+vmv8r_v         100111 1 ..... 00111 011 ..... 1010111 @r2rd
 
 vsetvli         0 ........... ..... 111 ..... 1010111  @r2_zimm
 vsetvl          1000000 ..... ..... 111 ..... 1010111  @r
diff --git a/target/riscv/insn_trans/trans_rvv.inc.c b/target/riscv/insn_trans/trans_rvv.inc.c
index 95fdd972fdf..52f2f4902c0 100644
--- a/target/riscv/insn_trans/trans_rvv.inc.c
+++ b/target/riscv/insn_trans/trans_rvv.inc.c
@@ -3479,3 +3479,28 @@ static bool trans_vcompress_vm(DisasContext *s, arg_r *a)
 }
 return false;
 }
+
+/*
+ * Whole Vector Register Move Instructions ignore vtype and vl setting.
+ * Thus, we don't need to check vill bit. (Section 17.6)
+ */
+#define GEN_VMV_WHOLE_TRANS(NAME, LEN)                          \
+static bool trans_##NAME(DisasContext *s, arg_##NAME * a)       \
+{                                                               \
+    if (require_rvv(s) &&                                       \
+        QEMU_IS_ALIGNED(a->rd, LEN) &&                          \
+        QEMU_IS_ALIGNED(a->rs2, LEN)) {                         \
+        /* EEW = 8 */                                           \
+        tcg_gen_gvec_mov(MO_8, vreg_ofs(s, a->rd),              \
+                         vreg_ofs(s, a->rs2),                   \
+                         s->vlen / 8 * LEN, s->vlen / 8 * LEN); \
+        mark_vs_dirty(s);                                       \
+        return true;                                            \
+    }                                                           \
+    return false;                                               \
+}
+
+GEN_VMV_WHOLE_TRANS(vmv1r_v, 1)
+GEN_VMV_WHOLE_TRANS(vmv2r_v, 2)
+GEN_VMV_WHOLE_TRANS(vmv4r_v, 4)
+GEN_VMV_WHOLE_TRANS(vmv8r_v, 8)
-- 
2.17.1
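
To make the semantics of the vmv<nr>r.v translation above concrete, here is
a small standalone model, assuming the vector register file is a flat byte
array of 32 registers of vlenb = VLEN / 8 bytes each. It is only a sketch:
whole_reg_move(), whole_reg_move_demo() and the flat-array layout are
hypothetical and are not how QEMU stores vector state. The checks follow the
macro: both register numbers must be multiples of the group size, and the
copy length is vlenb * nregs bytes regardless of vtype and vl.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NUM_VREGS 32

/* Copy nregs whole registers from rs2 to rd; returns false if illegal. */
static bool whole_reg_move(uint8_t *regfile, size_t vlenb,
                           unsigned rd, unsigned rs2, unsigned nregs)
{
    assert(regfile != NULL);
    if (nregs != 1 && nregs != 2 && nregs != 4 && nregs != 8) {
        return false;
    }
    if (rd >= NUM_VREGS || rs2 >= NUM_VREGS) {
        return false;
    }
    if (rd % nregs != 0 || rs2 % nregs != 0) {
        return false;                 /* group base must be aligned */
    }
    /* memmove, because rd == rs2 is a legal (if pointless) encoding. */
    memmove(regfile + (size_t)rd * vlenb,
            regfile + (size_t)rs2 * vlenb,
            (size_t)nregs * vlenb);
    return true;
}

/* Demo: an unaligned move is rejected, an aligned one copies 2 registers. */
static bool whole_reg_move_demo(void)
{
    enum { VLENB = 16 };              /* VLEN = 128 bits, chosen arbitrarily */
    uint8_t regfile[NUM_VREGS * VLENB];
    size_t i;

    for (i = 0; i < sizeof(regfile); ++i) {
        regfile[i] = (uint8_t)i;
    }
    if (whole_reg_move(regfile, VLENB, 3, 4, 2)) {
        return false;                 /* v3 is not a multiple of 2 */
    }
    if (!whole_reg_move(regfile, VLENB, 8, 4, 2)) {
        return false;
    }
    return memcmp(regfile + 8 * VLENB, regfile + 4 * VLENB, 2 * VLENB) == 0;
}
```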




[RFC v4 60/70] target/riscv: rvv-1.0: remove integer extract instruction

2020-08-17 Thread frank . chang
From: Frank Chang 

Signed-off-by: Frank Chang 
Reviewed-by: Richard Henderson 
---
 target/riscv/insn32.decode  |  1 -
 target/riscv/insn_trans/trans_rvv.inc.c | 23 ---
 2 files changed, 24 deletions(-)

diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
index 994ef3031b5..425cfd7cb32 100644
--- a/target/riscv/insn32.decode
+++ b/target/riscv/insn32.decode
@@ -603,7 +603,6 @@ viota_m         010100 . ..... 10000 010 ..... 1010111 @r2_vm
 vid_v           010100 . 00000 10001 010 ..... 1010111 @r1_vm
 vmv_x_s         010000 1 ..... 00000 010 ..... 1010111 @r2rd
 vmv_s_x         010000 1 00000 ..... 110 ..... 1010111 @r2
-vext_x_v        001100 1 ..... ..... 010 ..... 1010111 @r
 vfmv_f_s        010000 1 ..... 00000 001 ..... 1010111 @r2rd
 vfmv_s_f        010000 1 00000 ..... 101 ..... 1010111 @r2
 vslideup_vx     001110 . ..... ..... 100 ..... 1010111 @r_vm
diff --git a/target/riscv/insn_trans/trans_rvv.inc.c b/target/riscv/insn_trans/trans_rvv.inc.c
index a1d6f7a844b..4f33c42990e 100644
--- a/target/riscv/insn_trans/trans_rvv.inc.c
+++ b/target/riscv/insn_trans/trans_rvv.inc.c
@@ -3158,8 +3158,6 @@ static bool trans_vid_v(DisasContext *s, arg_vid_v *a)
  *** Vector Permutation Instructions
  */
 
-/* Integer Extract Instruction */
-
 static void load_element(TCGv_i64 dest, TCGv_ptr base,
  int ofs, int sew, bool sign)
 {
@@ -3261,27 +3259,6 @@ static void vec_element_loadi(DisasContext *s, TCGv_i64 dest,
 load_element(dest, cpu_env, endian_ofs(s, vreg, idx), s->sew, sign);
 }
 
-static bool trans_vext_x_v(DisasContext *s, arg_r *a)
-{
-TCGv_i64 tmp = tcg_temp_new_i64();
-TCGv dest = tcg_temp_new();
-
-if (a->rs1 == 0) {
-/* Special case vmv.x.s rd, vs2. */
-vec_element_loadi(s, tmp, a->rs2, 0, false);
-} else {
-/* This instruction ignores LMUL and vector register groups */
-int vlmax = s->vlen >> (3 + s->sew);
-vec_element_loadx(s, tmp, a->rs2, cpu_gpr[a->rs1], vlmax);
-}
-tcg_gen_trunc_i64_tl(dest, tmp);
-gen_set_gpr(a->rd, dest);
-
-tcg_temp_free(dest);
-tcg_temp_free_i64(tmp);
-return true;
-}
-
 /* Integer Scalar Move Instruction */
 
 static void store_element(TCGv_i64 val, TCGv_ptr base,
-- 
2.17.1




[RFC v4 59/70] target/riscv: rvv-1.0: remove vmford.vv and vmford.vf

2020-08-17 Thread frank . chang
From: Frank Chang 

Signed-off-by: Frank Chang 
Reviewed-by: Richard Henderson 
---
 target/riscv/helper.h   |  6 --
 target/riscv/insn32.decode  |  2 --
 target/riscv/insn_trans/trans_rvv.inc.c |  2 --
 target/riscv/vector_helper.c| 13 -
 4 files changed, 23 deletions(-)

diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index ac655b8f274..a9ec14c49ad 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -963,12 +963,6 @@ DEF_HELPER_6(vmfgt_vf_d, void, ptr, ptr, i64, ptr, env, i32)
 DEF_HELPER_6(vmfge_vf_h, void, ptr, ptr, i64, ptr, env, i32)
 DEF_HELPER_6(vmfge_vf_w, void, ptr, ptr, i64, ptr, env, i32)
 DEF_HELPER_6(vmfge_vf_d, void, ptr, ptr, i64, ptr, env, i32)
-DEF_HELPER_6(vmford_vv_h, void, ptr, ptr, ptr, ptr, env, i32)
-DEF_HELPER_6(vmford_vv_w, void, ptr, ptr, ptr, ptr, env, i32)
-DEF_HELPER_6(vmford_vv_d, void, ptr, ptr, ptr, ptr, env, i32)
-DEF_HELPER_6(vmford_vf_h, void, ptr, ptr, i64, ptr, env, i32)
-DEF_HELPER_6(vmford_vf_w, void, ptr, ptr, i64, ptr, env, i32)
-DEF_HELPER_6(vmford_vf_d, void, ptr, ptr, i64, ptr, env, i32)
 
 DEF_HELPER_5(vfclass_v_h, void, ptr, ptr, ptr, env, i32)
 DEF_HELPER_5(vfclass_v_w, void, ptr, ptr, ptr, env, i32)
diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
index 99320705cca..994ef3031b5 100644
--- a/target/riscv/insn32.decode
+++ b/target/riscv/insn32.decode
@@ -553,8 +553,6 @@ vmfle_vv        011001 . ..... ..... 001 ..... 1010111 @r_vm
 vmfle_vf        011001 . ..... ..... 101 ..... 1010111 @r_vm
 vmfgt_vf        011101 . ..... ..... 101 ..... 1010111 @r_vm
 vmfge_vf        011111 . ..... ..... 101 ..... 1010111 @r_vm
-vmford_vv       011010 . ..... ..... 001 ..... 1010111 @r_vm
-vmford_vf       011010 . ..... ..... 101 ..... 1010111 @r_vm
 vfclass_v       010011 . ..... 10000 001 ..... 1010111 @r2_vm
 vfmerge_vfm     010111 0 ..... ..... 101 ..... 1010111 @r_vm_0
 vfmv_v_f        010111 1 00000 ..... 101 ..... 1010111 @r2
diff --git a/target/riscv/insn_trans/trans_rvv.inc.c b/target/riscv/insn_trans/trans_rvv.inc.c
index d3b1499c64c..a1d6f7a844b 100644
--- a/target/riscv/insn_trans/trans_rvv.inc.c
+++ b/target/riscv/insn_trans/trans_rvv.inc.c
@@ -2765,7 +2765,6 @@ GEN_OPFVV_TRANS(vmfeq_vv, opfvv_cmp_check)
 GEN_OPFVV_TRANS(vmfne_vv, opfvv_cmp_check)
 GEN_OPFVV_TRANS(vmflt_vv, opfvv_cmp_check)
 GEN_OPFVV_TRANS(vmfle_vv, opfvv_cmp_check)
-GEN_OPFVV_TRANS(vmford_vv, opfvv_cmp_check)
 
 static bool opfvf_cmp_check(DisasContext *s, arg_rmrr *a)
 {
@@ -2781,7 +2780,6 @@ GEN_OPFVF_TRANS(vmflt_vf, opfvf_cmp_check)
 GEN_OPFVF_TRANS(vmfle_vf, opfvf_cmp_check)
 GEN_OPFVF_TRANS(vmfgt_vf, opfvf_cmp_check)
 GEN_OPFVF_TRANS(vmfge_vf, opfvf_cmp_check)
-GEN_OPFVF_TRANS(vmford_vf, opfvf_cmp_check)
 
 /* Vector Floating-Point Classify Instruction */
 GEN_OPFV_TRANS(vfclass_v, opfv_check)
diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c
index 1aeb3b5e4aa..600d2b53353 100644
--- a/target/riscv/vector_helper.c
+++ b/target/riscv/vector_helper.c
@@ -3883,19 +3883,6 @@ GEN_VEXT_CMP_VF(vmfge_vf_h, uint16_t, H2, vmfge16)
 GEN_VEXT_CMP_VF(vmfge_vf_w, uint32_t, H4, vmfge32)
 GEN_VEXT_CMP_VF(vmfge_vf_d, uint64_t, H8, vmfge64)
 
-static bool float16_unordered_quiet(uint16_t a, uint16_t b, float_status *s)
-{
-FloatRelation compare = float16_compare_quiet(a, b, s);
-return compare == float_relation_unordered;
-}
-
-GEN_VEXT_CMP_VV_ENV(vmford_vv_h, uint16_t, H2, !float16_unordered_quiet)
-GEN_VEXT_CMP_VV_ENV(vmford_vv_w, uint32_t, H4, !float32_unordered_quiet)
-GEN_VEXT_CMP_VV_ENV(vmford_vv_d, uint64_t, H8, !float64_unordered_quiet)
-GEN_VEXT_CMP_VF(vmford_vf_h, uint16_t, H2, !float16_unordered_quiet)
-GEN_VEXT_CMP_VF(vmford_vf_w, uint32_t, H4, !float32_unordered_quiet)
-GEN_VEXT_CMP_VF(vmford_vf_d, uint64_t, H8, !float64_unordered_quiet)
-
 /* Vector Floating-Point Classify Instruction */
 #define OPIVV1(NAME, TD, T2, TX2, HD, HS2, OP) \
 static void do_##NAME(void *vd, void *vs2, int i)  \
-- 
2.17.1




[RFC v4 52/70] target/riscv: rvv-1.0: slide instructions

2020-08-17 Thread frank . chang
From: Frank Chang 

* Remove clear function from helper functions as the tail elements
  are unchanged in RVV 1.0.

Signed-off-by: Frank Chang 
---
 target/riscv/vector_helper.c | 19 ---
 1 file changed, 12 insertions(+), 7 deletions(-)

diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c
index ea1715b5484..2f1460b624d 100644
--- a/target/riscv/vector_helper.c
+++ b/target/riscv/vector_helper.c
@@ -4689,17 +4689,22 @@ GEN_VEXT_VSLIDEUP_VX(vslideup_vx_d, uint64_t, H8)
 void HELPER(NAME)(void *vd, void *v0, target_ulong s1, void *vs2,         \
                   CPURISCVState *env, uint32_t desc)                      \
 {                                                                         \
-    uint32_t vlmax = env_archcpu(env)->cfg.vlen;                          \
+    uint32_t vlmax = vext_max_elems(desc, ctzl(sizeof(ETYPE)));           \
     uint32_t vm = vext_vm(desc);                                          \
     uint32_t vl = env->vl;                                                \
-    target_ulong offset = s1, i;                                          \
+    target_ulong i_max, i;                                                \
                                                                           \
-    for (i = 0; i < vl; ++i) {                                            \
-        target_ulong j = i + offset;                                      \
-        if (!vm && !vext_elem_mask(v0, i)) {                              \
-            continue;                                                     \
+    i_max = MIN(s1 < vlmax ? vlmax - s1 : 0, vl);                         \
+    for (i = 0; i < i_max; ++i) {                                         \
+        if (vm || vext_elem_mask(v0, i)) {                                \
+            *((ETYPE *)vd + H(i)) = *((ETYPE *)vs2 + H(i + s1));          \
+        }                                                                 \
+    }                                                                     \
+                                                                          \
+    for (i = i_max; i < vl; ++i) {                                        \
+        if (vm || vext_elem_mask(v0, i)) {                                \
+            *((ETYPE *)vd + H(i)) = 0;                                    \
         }                                                                 \
-        *((ETYPE *)vd + H(i)) = j >= vlmax ? 0 : *((ETYPE *)vs2 + H(j));  \
     }                                                                     \
 }
 
-- 
2.17.1
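
For readers following the hunk above, here is a minimal standalone sketch of the same two-loop structure, outside of the helper macro. The function name, the flat uint32_t arrays and the byte-per-element mask are assumptions made for illustration only; the real helper uses ETYPE, H() and vext_elem_mask().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Illustrative model of a masked slide-down on 32-bit elements.
 * mask[i] != 0 means element i is active; vm != 0 means "unmasked".
 * Hypothetical names and layout, not the QEMU helper itself.
 */
static void slidedown_u32(uint32_t *vd, const uint32_t *vs2,
                          const uint8_t *mask, int vm,
                          size_t vl, size_t vlmax, size_t offset)
{
    size_t in_range, i_max, i;

    assert(vl <= vlmax);

    /* Number of destination elements whose source index stays below vlmax. */
    in_range = offset < vlmax ? vlmax - offset : 0;
    i_max = in_range < vl ? in_range : vl;

    /* Elements with an in-range source are copied from vs2[i + offset]. */
    for (i = 0; i < i_max; ++i) {
        if (vm || mask[i]) {
            vd[i] = vs2[i + offset];
        }
    }

    /* The remaining body elements would read past vlmax, so they become 0. */
    for (i = i_max; i < vl; ++i) {
        if (vm || mask[i]) {
            vd[i] = 0;
        }
    }

    /* Elements at index >= vl (the tail) are left untouched. */
}
```

Splitting the loop at i_max keeps the index i + offset from ever being formed for out-of-range elements, which is presumably why the patch avoids the single loop with the j >= vlmax test.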




[RFC v4 06/70] target/riscv: rvv-1.0: add translation-time vector context status

2020-08-17 Thread frank . chang
From: LIU Zhiwei 

Signed-off-by: LIU Zhiwei 
Signed-off-by: Frank Chang 
Reviewed-by: Richard Henderson 
---
 target/riscv/insn_trans/trans_rvv.inc.c | 69 -
 target/riscv/translate.c| 33 
 2 files changed, 90 insertions(+), 12 deletions(-)

diff --git a/target/riscv/insn_trans/trans_rvv.inc.c 
b/target/riscv/insn_trans/trans_rvv.inc.c
index 887c6b88831..1b021603c1c 100644
--- a/target/riscv/insn_trans/trans_rvv.inc.c
+++ b/target/riscv/insn_trans/trans_rvv.inc.c
@@ -48,6 +48,7 @@ static bool trans_vsetvl(DisasContext *ctx, arg_vsetvl *a)
 tcg_temp_free(s1);
 tcg_temp_free(s2);
 tcg_temp_free(dst);
+mark_vs_dirty(ctx);
 return true;
 }
 
@@ -78,6 +79,7 @@ static bool trans_vsetvli(DisasContext *ctx, arg_vsetvli *a)
 tcg_temp_free(s1);
 tcg_temp_free(s2);
 tcg_temp_free(dst);
+mark_vs_dirty(ctx);
 return true;
 }
 
@@ -163,7 +165,8 @@ typedef void gen_helper_ldst_us(TCGv_ptr, TCGv_ptr, TCGv,
 TCGv_env, TCGv_i32);
 
 static bool ldst_us_trans(uint32_t vd, uint32_t rs1, uint32_t data,
-  gen_helper_ldst_us *fn, DisasContext *s)
+  gen_helper_ldst_us *fn, DisasContext *s,
+  bool is_store)
 {
 TCGv_ptr dest, mask;
 TCGv base;
@@ -195,6 +198,9 @@ static bool ldst_us_trans(uint32_t vd, uint32_t rs1, 
uint32_t data,
 tcg_temp_free_ptr(mask);
 tcg_temp_free(base);
 tcg_temp_free_i32(desc);
+if (!is_store) {
+mark_vs_dirty(s);
+}
 gen_set_label(over);
 return true;
 }
@@ -245,7 +251,7 @@ static bool ld_us_op(DisasContext *s, arg_r2nfvm *a, 
uint8_t seq)
 data = FIELD_DP32(data, VDATA, VM, a->vm);
 data = FIELD_DP32(data, VDATA, LMUL, s->lmul);
 data = FIELD_DP32(data, VDATA, NF, a->nf);
-return ldst_us_trans(a->rd, a->rs1, data, fn, s);
+return ldst_us_trans(a->rd, a->rs1, data, fn, s, false);
 }
 
 static bool ld_us_check(DisasContext *s, arg_r2nfvm* a)
@@ -298,7 +304,7 @@ static bool st_us_op(DisasContext *s, arg_r2nfvm *a, 
uint8_t seq)
 data = FIELD_DP32(data, VDATA, VM, a->vm);
 data = FIELD_DP32(data, VDATA, LMUL, s->lmul);
 data = FIELD_DP32(data, VDATA, NF, a->nf);
-return ldst_us_trans(a->rd, a->rs1, data, fn, s);
+return ldst_us_trans(a->rd, a->rs1, data, fn, s, true);
 }
 
 static bool st_us_check(DisasContext *s, arg_r2nfvm* a)
@@ -321,7 +327,7 @@ typedef void gen_helper_ldst_stride(TCGv_ptr, TCGv_ptr, 
TCGv,
 
 static bool ldst_stride_trans(uint32_t vd, uint32_t rs1, uint32_t rs2,
   uint32_t data, gen_helper_ldst_stride *fn,
-  DisasContext *s)
+  DisasContext *s, bool is_store)
 {
 TCGv_ptr dest, mask;
 TCGv base, stride;
@@ -348,6 +354,9 @@ static bool ldst_stride_trans(uint32_t vd, uint32_t rs1, 
uint32_t rs2,
 tcg_temp_free(base);
 tcg_temp_free(stride);
 tcg_temp_free_i32(desc);
+if (!is_store) {
+mark_vs_dirty(s);
+}
 gen_set_label(over);
 return true;
 }
@@ -382,7 +391,7 @@ static bool ld_stride_op(DisasContext *s, arg_rnfvm *a, 
uint8_t seq)
 data = FIELD_DP32(data, VDATA, VM, a->vm);
 data = FIELD_DP32(data, VDATA, LMUL, s->lmul);
 data = FIELD_DP32(data, VDATA, NF, a->nf);
-return ldst_stride_trans(a->rd, a->rs1, a->rs2, data, fn, s);
+return ldst_stride_trans(a->rd, a->rs1, a->rs2, data, fn, s, false);
 }
 
 static bool ld_stride_check(DisasContext *s, arg_rnfvm* a)
@@ -426,7 +435,7 @@ static bool st_stride_op(DisasContext *s, arg_rnfvm *a, 
uint8_t seq)
 return false;
 }
 
-return ldst_stride_trans(a->rd, a->rs1, a->rs2, data, fn, s);
+return ldst_stride_trans(a->rd, a->rs1, a->rs2, data, fn, s, true);
 }
 
 static bool st_stride_check(DisasContext *s, arg_rnfvm* a)
@@ -449,7 +458,7 @@ typedef void gen_helper_ldst_index(TCGv_ptr, TCGv_ptr, TCGv,
 
 static bool ldst_index_trans(uint32_t vd, uint32_t rs1, uint32_t vs2,
  uint32_t data, gen_helper_ldst_index *fn,
- DisasContext *s)
+ DisasContext *s, bool is_store)
 {
 TCGv_ptr dest, mask, index;
 TCGv base;
@@ -476,6 +485,9 @@ static bool ldst_index_trans(uint32_t vd, uint32_t rs1, 
uint32_t vs2,
 tcg_temp_free_ptr(index);
 tcg_temp_free(base);
 tcg_temp_free_i32(desc);
+if (!is_store) {
+mark_vs_dirty(s);
+}
 gen_set_label(over);
 return true;
 }
@@ -510,7 +522,7 @@ static bool ld_index_op(DisasContext *s, arg_rnfvm *a, 
uint8_t seq)
 data = FIELD_DP32(data, VDATA, VM, a->vm);
 data = FIELD_DP32(data, VDATA, LMUL, s->lmul);
 data = FIELD_DP32(data, VDATA, NF, a->nf);
-return ldst_index_trans(a->rd, a->rs1, a->rs2, data, fn, s);
+return ldst_ind

[RFC v4 07/70] target/riscv: rvv-1.0: remove rvv related codes from fcsr registers

2020-08-17 Thread frank . chang
From: Frank Chang 

* Remove VXRM and VXSAT fields from the FCSR register, as they are
  present only in the VCSR register.
* Remove RVV loose check in fs() predicate function.

Signed-off-by: Frank Chang 
---
 target/riscv/csr.c | 13 -
 1 file changed, 13 deletions(-)

diff --git a/target/riscv/csr.c b/target/riscv/csr.c
index 7f937e5b9c8..005839390a1 100644
--- a/target/riscv/csr.c
+++ b/target/riscv/csr.c
@@ -46,10 +46,6 @@ void riscv_set_csr_ops(int csrno, riscv_csr_operations *ops)
 static int fs(CPURISCVState *env, int csrno)
 {
 #if !defined(CONFIG_USER_ONLY)
-/* loose check condition for fcsr in vector extension */
-if ((csrno == CSR_FCSR) && (env->misa & RVV)) {
-return 0;
-}
 if (!env->debugger && !riscv_cpu_fp_enabled(env)) {
 return -1;
 }
@@ -166,10 +162,6 @@ static int read_fcsr(CPURISCVState *env, int csrno, 
target_ulong *val)
 #endif
 *val = (riscv_cpu_get_fflags(env) << FSR_AEXC_SHIFT)
 | (env->frm << FSR_RD_SHIFT);
-if (vs(env, csrno) >= 0) {
-*val |= (env->vxrm << FSR_VXRM_SHIFT)
-| (env->vxsat << FSR_VXSAT_SHIFT);
-}
 return 0;
 }
 
@@ -180,13 +172,8 @@ static int write_fcsr(CPURISCVState *env, int csrno, 
target_ulong val)
 return -1;
 }
 env->mstatus |= MSTATUS_FS;
-env->mstatus |= MSTATUS_VS;
 #endif
 env->frm = (val & FSR_RD) >> FSR_RD_SHIFT;
-if (vs(env, csrno) >= 0) {
-env->vxrm = (val & FSR_VXRM) >> FSR_VXRM_SHIFT;
-env->vxsat = (val & FSR_VXSAT) >> FSR_VXSAT_SHIFT;
-}
 riscv_cpu_set_fflags(env, (val & FSR_AEXC) >> FSR_AEXC_SHIFT);
 return 0;
 }
-- 
2.17.1
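
As a rough illustration of what fcsr carries once the vector fields are gone, the sketch below packs and unpacks only fflags (bits 4:0) and frm (bits 7:5), which is the layout given by the RISC-V F extension. The struct and function names are hypothetical and are not the QEMU definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Field layout of fcsr per the RISC-V F extension. */
#define FCSR_FFLAGS_SHIFT 0
#define FCSR_FFLAGS_MASK  (UINT32_C(0x1f) << FCSR_FFLAGS_SHIFT)
#define FCSR_FRM_SHIFT    5
#define FCSR_FRM_MASK     (UINT32_C(0x7) << FCSR_FRM_SHIFT)

/* Hypothetical stand-in for the relevant CPU state. */
struct fp_csr_state {
    uint32_t fflags;   /* accrued exception flags */
    uint32_t frm;      /* dynamic rounding mode */
};

/* Compose fcsr from fflags and frm only; no vxrm/vxsat bits. */
static uint32_t fcsr_read(const struct fp_csr_state *s)
{
    return ((s->fflags << FCSR_FFLAGS_SHIFT) & FCSR_FFLAGS_MASK) |
           ((s->frm << FCSR_FRM_SHIFT) & FCSR_FRM_MASK);
}

/* Split a written value back into fflags and frm; other bits are ignored. */
static void fcsr_write(struct fp_csr_state *s, uint32_t val)
{
    s->frm = (val & FCSR_FRM_MASK) >> FCSR_FRM_SHIFT;
    s->fflags = (val & FCSR_FFLAGS_MASK) >> FCSR_FFLAGS_SHIFT;
}
```

In this model a write of all ones reads back as 0xff, since nothing above bit 7 is stored.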




[RFC v4 22/70] target/riscv: rvv-1.0: amo operations

2020-08-17 Thread frank . chang
From: Frank Chang 

Signed-off-by: Frank Chang 
---
 target/riscv/helper.h   | 100 +++---
 target/riscv/insn32-64.decode   |  18 +-
 target/riscv/insn32.decode  |  36 +++-
 target/riscv/insn_trans/trans_rvv.inc.c | 220 ++
 target/riscv/vector_helper.c| 232 
 5 files changed, 407 insertions(+), 199 deletions(-)

diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index 3d931ba0c70..9200178d25c 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -145,36 +145,80 @@ DEF_HELPER_5(vle16ff_v, void, ptr, ptr, tl, env, i32)
 DEF_HELPER_5(vle32ff_v, void, ptr, ptr, tl, env, i32)
 DEF_HELPER_5(vle64ff_v, void, ptr, ptr, tl, env, i32)
 
+DEF_HELPER_6(vamoswapei8_32_v, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vamoswapei8_64_v, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vamoswapei16_32_v, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vamoswapei16_64_v, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vamoswapei32_32_v, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vamoswapei32_64_v, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vamoaddei8_32_v, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vamoaddei8_64_v, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vamoaddei16_32_v, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vamoaddei16_64_v, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vamoaddei32_32_v, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vamoaddei32_64_v, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vamoxorei8_32_v, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vamoxorei8_64_v, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vamoxorei16_32_v, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vamoxorei16_64_v, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vamoxorei32_32_v, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vamoxorei32_64_v, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vamoandei8_32_v, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vamoandei8_64_v, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vamoandei16_32_v, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vamoandei16_64_v, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vamoandei32_32_v, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vamoandei32_64_v, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vamoorei8_32_v, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vamoorei8_64_v, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vamoorei16_32_v, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vamoorei16_64_v, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vamoorei32_32_v, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vamoorei32_64_v, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vamominei8_32_v, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vamominei8_64_v, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vamominei16_32_v, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vamominei16_64_v, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vamominei32_32_v, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vamominei32_64_v, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vamomaxei8_32_v, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vamomaxei8_64_v, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vamomaxei16_32_v, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vamomaxei16_64_v, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vamomaxei32_32_v, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vamomaxei32_64_v, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vamominuei8_32_v, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vamominuei8_64_v, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vamominuei16_32_v, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vamominuei16_64_v, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vamominuei32_32_v, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vamominuei32_64_v, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vamomaxuei8_32_v, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vamomaxuei8_64_v, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vamomaxuei16_32_v, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vamomaxuei16_64_v, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vamomaxuei32_32_v, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vamomaxuei32_64_v, void, ptr, ptr, tl, ptr, env, i32)
 #ifdef TARGET_RISCV64
-DEF_HELPER_6(vamoswapw_v_d, void, ptr, ptr, tl, ptr, env, i32)
-DEF_HELPER_6(vamoswapd_v_d, void, ptr, ptr, tl, ptr, env, i32)
-DEF_HELPER_6(vamoaddw_v_d,  void, ptr, ptr, tl, ptr, env, i32)
-DEF_HELPER_6(vamoaddd_v_d,  void, ptr, ptr, tl, ptr, env, i32)
-DEF_HELPER_6(vamoxorw_v_d,  void, ptr, ptr, tl, ptr, env, i32)
-DEF_HELPER_6(vamoxord_v_d,  void, ptr, ptr, tl, ptr, env, i32)
-DEF_HELPER_6(vamoandw_v_d,  void, ptr, ptr, tl, ptr, env, i32)
-DEF_HELPER_6(vamoandd_v_d,  void, ptr, ptr, tl, ptr, env, i32)
-DEF_HELPER_6(vamoorw_v_d,   void, ptr, ptr, tl, ptr, env, i32)
-DEF_HELPER_6(vamoord_v_d,   void

[RFC v4 16/70] target/riscv: rvv:1.0: add translation-time nan-box helper function

2020-08-17 Thread frank . chang
From: Frank Chang 

* Add an fp16 nan-box check generator function: if a 16-bit input is
  not properly nan-boxed, the input is replaced with the default qnan.
* Add a do_nanbox() helper function that uses gen_check_nanbox_X() to
  generate the NaN-boxed floating-point values based on the SEW setting.
* Apply the nanbox helper in opfvf_trans.

Signed-off-by: Frank Chang 
---
 target/riscv/insn_trans/trans_rvv.inc.c | 35 -
 target/riscv/translate.c| 10 +++
 2 files changed, 44 insertions(+), 1 deletion(-)

diff --git a/target/riscv/insn_trans/trans_rvv.inc.c 
b/target/riscv/insn_trans/trans_rvv.inc.c
index daaa47ac9c3..4b8ae5470c3 100644
--- a/target/riscv/insn_trans/trans_rvv.inc.c
+++ b/target/riscv/insn_trans/trans_rvv.inc.c
@@ -2076,6 +2076,33 @@ GEN_OPIVI_NARROW_TRANS(vnclip_vi, IMM_ZX, vnclip_vx)
 /*
  *** Vector Float Point Arithmetic Instructions
  */
+
+/*
+ * As RVF-only cpus always have values NaN-boxed to 64-bits,
+ * RVF and RVD can be treated equally.
+ * We don't have to deal with the cases of: SEW > FLEN.
+ *
+ * If SEW < FLEN, check whether input fp register is a valid
+ * NaN-boxed value, in which case the least-significant SEW bits
+ * of the f register are used, else the canonical NaN value is used.
+ */
+static void do_nanbox(DisasContext *s, TCGv_i64 out, TCGv_i64 in)
+{
+switch (s->sew) {
+case 1:
+gen_check_nanbox_h(out, in);
+break;
+case 2:
+gen_check_nanbox_s(out, in);
+break;
+case 3:
+tcg_gen_mov_i64(out, in);
+break;
+default:
+g_assert_not_reached();
+}
+}
+
 /* Vector Single-Width Floating-Point Add/Subtract Instructions */
 
 /*
@@ -2128,6 +2155,7 @@ static bool opfvf_trans(uint32_t vd, uint32_t rs1, 
uint32_t vs2,
 {
 TCGv_ptr dest, src2, mask;
 TCGv_i32 desc;
+TCGv_i64 t1;
 
 TCGLabel *over = gen_new_label();
 tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_vl, 0, over);
@@ -2141,12 +2169,17 @@ static bool opfvf_trans(uint32_t vd, uint32_t rs1, 
uint32_t vs2,
 tcg_gen_addi_ptr(src2, cpu_env, vreg_ofs(s, vs2));
 tcg_gen_addi_ptr(mask, cpu_env, vreg_ofs(s, 0));
 
-fn(dest, mask, cpu_fpr[rs1], src2, cpu_env, desc);
+/* NaN-box f[rs1] */
+t1 = tcg_temp_new_i64();
+do_nanbox(s, t1, cpu_fpr[rs1]);
+
+fn(dest, mask, t1, src2, cpu_env, desc);
 
 tcg_temp_free_ptr(dest);
 tcg_temp_free_ptr(mask);
 tcg_temp_free_ptr(src2);
 tcg_temp_free_i32(desc);
+tcg_temp_free_i64(t1);
 mark_vs_dirty(s);
 gen_set_label(over);
 return true;
diff --git a/target/riscv/translate.c b/target/riscv/translate.c
index 10ef55bbeb7..0b3f5f1b4ba 100644
--- a/target/riscv/translate.c
+++ b/target/riscv/translate.c
@@ -121,6 +121,16 @@ static void gen_nanbox_s(TCGv_i64 out, TCGv_i64 in)
  *
  * Here, the result is always nan-boxed, even the canonical nan.
  */
+static void gen_check_nanbox_h(TCGv_i64 out, TCGv_i64 in)
+{
+TCGv_i64 t_max = tcg_const_i64(0xull);
+TCGv_i64 t_nan = tcg_const_i64(0x7e00ull);
+
+tcg_gen_movcond_i64(TCG_COND_GEU, out, in, t_max, in, t_nan);
+tcg_temp_free_i64(t_max);
+tcg_temp_free_i64(t_nan);
+}
+
 static void gen_check_nanbox_s(TCGv_i64 out, TCGv_i64 in)
 {
 TCGv_i64 t_max = tcg_const_i64(0xull);
-- 
2.17.1
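
The fp16 check described in the commit message can be modelled in plain C as follows. This is only a sketch: it assumes the usual RISC-V rule that a narrower value held in a 64-bit f register is valid only when all upper bits are ones, and it returns the bare 16-bit payload rather than a TCG value. The mask constant and function name are illustrative, not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption for this sketch: the upper 48 bits all set means "boxed". */
#define BOX_H_MASK  UINT64_C(0xffffffffffff0000)
/* Canonical half-precision quiet NaN. */
#define FP16_QNAN   UINT16_C(0x7e00)

/*
 * Return the 16-bit value held in a 64-bit f register if it is
 * properly NaN-boxed, otherwise the canonical fp16 NaN.
 */
static uint16_t check_nanbox_h(uint64_t freg)
{
    if ((freg & BOX_H_MASK) == BOX_H_MASK) {
        return (uint16_t)freg;
    }
    return FP16_QNAN;
}
```

Because the boxed encodings are exactly the 64-bit values at or above the mask, the TCG version can express the same test as a single unsigned compare feeding a movcond.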




[RFC v4 31/70] target/riscv: rvv-1.0: iota instruction

2020-08-17 Thread frank . chang
From: Frank Chang 

Signed-off-by: Frank Chang 
Reviewed-by: Richard Henderson 
---
 target/riscv/insn32.decode | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
index 0992d6ac86d..7a10fc27c5f 100644
--- a/target/riscv/insn32.decode
+++ b/target/riscv/insn32.decode
@@ -596,7 +596,7 @@ vfirst_m01 . . 10001 010 . 1010111 
@r2_vm
 vmsbf_m 010100 . . 1 010 . 1010111 @r2_vm
 vmsif_m 010100 . . 00011 010 . 1010111 @r2_vm
 vmsof_m 010100 . . 00010 010 . 1010111 @r2_vm
-viota_m 010110 . . 1 010 . 1010111 @r2_vm
+viota_m 010100 . . 1 010 . 1010111 @r2_vm
 vid_v   010110 . 0 10001 010 . 1010111 @r1_vm
 vext_x_v001100 1 . . 010 . 1010111 @r
 vmv_s_x 001101 1 0 . 110 . 1010111 @r2
-- 
2.17.1




[RFC v4 26/70] target/riscv: rvv-1.0: floating-point square-root instruction

2020-08-17 Thread frank . chang
From: Frank Chang 

Signed-off-by: Frank Chang 
Reviewed-by: Richard Henderson 
---
 target/riscv/insn32.decode | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
index c99575d1360..f142aa5d073 100644
--- a/target/riscv/insn32.decode
+++ b/target/riscv/insn32.decode
@@ -527,7 +527,7 @@ vfwmsac_vv  10 . . . 001 . 1010111 @r_vm
 vfwmsac_vf  10 . . . 101 . 1010111 @r_vm
 vfwnmsac_vv 11 . . . 001 . 1010111 @r_vm
 vfwnmsac_vf 11 . . . 101 . 1010111 @r_vm
-vfsqrt_v100011 . . 0 001 . 1010111 @r2_vm
+vfsqrt_v010011 . . 0 001 . 1010111 @r2_vm
 vfmin_vv000100 . . . 001 . 1010111 @r_vm
 vfmin_vf000100 . . . 101 . 1010111 @r_vm
 vfmax_vv000110 . . . 001 . 1010111 @r_vm
-- 
2.17.1




[RFC v4 45/70] target/riscv: rvv-1.0: add Zvqmac extension

2020-08-17 Thread frank . chang
From: Frank Chang 

Signed-off-by: Frank Chang 
---
 target/riscv/cpu.c   | 1 +
 target/riscv/cpu.h   | 1 +
 target/riscv/translate.c | 2 ++
 3 files changed, 4 insertions(+)

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index 085381fee00..8844975bf94 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -512,6 +512,7 @@ static Property riscv_cpu_properties[] = {
 DEFINE_PROP_BOOL("Counters", RISCVCPU, cfg.ext_counters, true),
 DEFINE_PROP_BOOL("Zifencei", RISCVCPU, cfg.ext_ifencei, true),
 DEFINE_PROP_BOOL("Zicsr", RISCVCPU, cfg.ext_icsr, true),
+DEFINE_PROP_BOOL("Zvqmac", RISCVCPU, cfg.ext_vqmac, true),
 DEFINE_PROP_STRING("priv_spec", RISCVCPU, cfg.priv_spec),
 DEFINE_PROP_STRING("vext_spec", RISCVCPU, cfg.vext_spec),
 DEFINE_PROP_UINT16("vlen", RISCVCPU, cfg.vlen, 128),
diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
index 715faed8824..6e9b17c4e38 100644
--- a/target/riscv/cpu.h
+++ b/target/riscv/cpu.h
@@ -285,6 +285,7 @@ typedef struct RISCVCPU {
 bool ext_counters;
 bool ext_ifencei;
 bool ext_icsr;
+bool ext_vqmac;
 
 char *priv_spec;
 char *user_spec;
diff --git a/target/riscv/translate.c b/target/riscv/translate.c
index 0b3f5f1b4ba..5817e9344e9 100644
--- a/target/riscv/translate.c
+++ b/target/riscv/translate.c
@@ -75,6 +75,7 @@ typedef struct DisasContext {
 uint8_t sew;
 uint16_t vlen;
 bool vl_eq_vlmax;
+bool ext_vqmac;
 } DisasContext;
 
 #ifdef TARGET_RISCV64
@@ -870,6 +871,7 @@ static void riscv_tr_init_disas_context(DisasContextBase 
*dcbase, CPUState *cs)
 ctx->misa = env->misa;
 ctx->frm = -1;  /* unknown rounding mode */
 ctx->ext_ifencei = cpu->cfg.ext_ifencei;
+ctx->ext_vqmac = cpu->cfg.ext_vqmac;
 ctx->vlen = cpu->cfg.vlen;
 ctx->vill = FIELD_EX32(tb_flags, TB_FLAGS, VILL);
 ctx->sew = FIELD_EX32(tb_flags, TB_FLAGS, SEW);
-- 
2.17.1




[RFC v4 44/70] target/riscv: rvv-1.0: widening integer multiply-add instructions

2020-08-17 Thread frank . chang
From: Frank Chang 

Signed-off-by: Frank Chang 
Reviewed-by: Richard Henderson 
---
 target/riscv/insn32.decode | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
index c4fe9767585..2e305d492d8 100644
--- a/target/riscv/insn32.decode
+++ b/target/riscv/insn32.decode
@@ -438,9 +438,9 @@ vwmaccu_vv  00 . . . 010 . 1010111 @r_vm
 vwmaccu_vx  00 . . . 110 . 1010111 @r_vm
 vwmacc_vv   01 . . . 010 . 1010111 @r_vm
 vwmacc_vx   01 . . . 110 . 1010111 @r_vm
-vwmaccsu_vv 10 . . . 010 . 1010111 @r_vm
-vwmaccsu_vx 10 . . . 110 . 1010111 @r_vm
-vwmaccus_vx 11 . . . 110 . 1010111 @r_vm
+vwmaccsu_vv 11 . . . 010 . 1010111 @r_vm
+vwmaccsu_vx 11 . . . 110 . 1010111 @r_vm
+vwmaccus_vx 10 . . . 110 . 1010111 @r_vm
 vmv_v_v 010111 1 0 . 000 . 1010111 @r2
 vmv_v_x 010111 1 0 . 100 . 1010111 @r2
 vmv_v_i 010111 1 0 . 011 . 1010111 @r2
-- 
2.17.1




[RFC v4 25/70] target/riscv: rvv-1.0: take fractional LMUL into vector max elements calculation

2020-08-17 Thread frank . chang
From: Frank Chang 

Update vext_get_vlmax() and MAXSZ() to take fractional LMUL into
account for RVV 1.0.

Signed-off-by: Frank Chang 
---
 target/riscv/cpu.h  | 43 ++---
 target/riscv/insn_trans/trans_rvv.inc.c | 12 ++-
 2 files changed, 42 insertions(+), 13 deletions(-)

diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
index 8b5e6429015..715faed8824 100644
--- a/target/riscv/cpu.h
+++ b/target/riscv/cpu.h
@@ -376,18 +376,27 @@ FIELD(TB_FLAGS, SEW, 6, 3)
 FIELD(TB_FLAGS, VILL, 11, 1)
 
 /*
- * A simplification for VLMAX
- * = (1 << LMUL) * VLEN / (8 * (1 << SEW))
- * = (VLEN << LMUL) / (8 << SEW)
- * = (VLEN << LMUL) >> (SEW + 3)
- * = VLEN >> (SEW + 3 - LMUL)
+ * Encode LMUL to lmul as follows:
+ * LMUL vlmul lmul
+ *  1   000   0
+ *  2   001   1
+ *  4   010   2
+ *  8   011   3
+ *  -   100   -
+ * 1/8  101  -3
+ * 1/4  110  -2
+ * 1/2  111  -1
+ *
+ * then, we can calculate VLMAX = vlen >> (vsew + 3 - lmul)
+ * e.g. vlen = 256 bits, SEW = 16, LMUL = 1/8
+ *  => VLMAX = vlen >> (1 + 3 - (-3))
+ *   = 256 >> 7
+ *   = 2
  */
 static inline uint32_t vext_get_vlmax(RISCVCPU *cpu, target_ulong vtype)
 {
-uint8_t sew, lmul;
-
-sew = FIELD_EX64(vtype, VTYPE, VSEW);
-lmul = FIELD_EX64(vtype, VTYPE, VLMUL);
+uint8_t sew = FIELD_EX64(vtype, VTYPE, VSEW);
+int8_t lmul = sextract32(FIELD_EX64(vtype, VTYPE, VLMUL), 0, 3);
 return cpu->cfg.vlen >> (sew + 3 - lmul);
 }
 
@@ -400,12 +409,22 @@ static inline void cpu_get_tb_cpu_state(CPURISCVState 
*env, target_ulong *pc,
 *cs_base = 0;
 
 if (riscv_has_ext(env, RVV)) {
+/*
+ * If env->vl equals VLMAX, we can use generic vector operation
+ * expanders (GVEC) to accelerate the vector operations.
+ * However, as LMUL can be fractional, the maximum vector size
+ * that can be operated on might be less than 8 bytes,
+ * which is not supported by GVEC. So we set vl_eq_vlmax flag to true
+ * only when maxsz >= 8 bytes.
+ */
 uint32_t vlmax = vext_get_vlmax(env_archcpu(env), env->vtype);
-bool vl_eq_vlmax = (env->vstart == 0) && (vlmax == env->vl);
+uint32_t sew = FIELD_EX64(env->vtype, VTYPE, VSEW);
+uint32_t maxsz = vlmax << sew;
+bool vl_eq_vlmax = (env->vstart == 0) && (vlmax == env->vl)
+   && (maxsz >= 8);
 flags = FIELD_DP32(flags, TB_FLAGS, VILL,
 FIELD_EX64(env->vtype, VTYPE, VILL));
-flags = FIELD_DP32(flags, TB_FLAGS, SEW,
-FIELD_EX64(env->vtype, VTYPE, VSEW));
+flags = FIELD_DP32(flags, TB_FLAGS, SEW, sew);
 flags = FIELD_DP32(flags, TB_FLAGS, LMUL,
 FIELD_EX64(env->vtype, VTYPE, VLMUL));
 flags = FIELD_DP32(flags, TB_FLAGS, VL_EQ_VLMAX, vl_eq_vlmax);
diff --git a/target/riscv/insn_trans/trans_rvv.inc.c 
b/target/riscv/insn_trans/trans_rvv.inc.c
index 334e1fc123b..2c6efce00a7 100644
--- a/target/riscv/insn_trans/trans_rvv.inc.c
+++ b/target/riscv/insn_trans/trans_rvv.inc.c
@@ -1268,7 +1268,17 @@ GEN_VEXT_TRANS(vamomaxuei64_v, 64, 35, rwdvm, amo_op, 
amo_check)
 /*
  *** Vector Integer Arithmetic Instructions
  */
-#define MAXSZ(s) (s->vlen >> (3 - s->lmul))
+
+/*
+ * MAXSZ returns the maximum vector size, in bytes, that can be
+ * operated on. It is used in GVEC IR when the vl_eq_vlmax flag is
+ * set to true to accelerate vector operations.
+ */
+static inline uint32_t MAXSZ(DisasContext *s)
+{
+int scale = s->lmul - 3;
+return scale < 0 ? s->vlen >> -scale : s->vlen << scale;
+}
 
 static bool opivv_check(DisasContext *s, arg_rmrr *a)
 {
-- 
2.17.1




[RFC v4 51/70] target/riscv: rvv-1.0: mask-register logical instructions

2020-08-17 Thread frank . chang
From: Frank Chang 

Signed-off-by: Frank Chang 
---
 target/riscv/insn_trans/trans_rvv.inc.c | 3 ++-
 target/riscv/vector_helper.c| 4 
 2 files changed, 2 insertions(+), 5 deletions(-)

diff --git a/target/riscv/insn_trans/trans_rvv.inc.c 
b/target/riscv/insn_trans/trans_rvv.inc.c
index c3be3dd97ff..41789a2ba6f 100644
--- a/target/riscv/insn_trans/trans_rvv.inc.c
+++ b/target/riscv/insn_trans/trans_rvv.inc.c
@@ -2987,7 +2987,8 @@ GEN_OPFVV_WIDEN_TRANS(vfwredsum_vs, reduction_check)
 #define GEN_MM_TRANS(NAME) \
 static bool trans_##NAME(DisasContext *s, arg_r *a)\
 {  \
-if (vext_check_isa_ill(s)) {   \
+if (require_rvv(s) &&  \
+vext_check_isa_ill(s)) {   \
 uint32_t data = 0; \
 gen_helper_gvec_4_ptr *fn = gen_helper_##NAME; \
 TCGLabel *over = gen_new_label();  \
diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c
index 766622d3878..ea1715b5484 100644
--- a/target/riscv/vector_helper.c
+++ b/target/riscv/vector_helper.c
@@ -4490,7 +4490,6 @@ void HELPER(NAME)(void *vd, void *v0, void *vs1,  
\
   void *vs2, CPURISCVState *env,  \
   uint32_t desc)  \
 { \
-uint32_t vlmax = env_archcpu(env)->cfg.vlen;  \
 uint32_t vl = env->vl;\
 uint32_t i;   \
 int a, b; \
@@ -4500,9 +4499,6 @@ void HELPER(NAME)(void *vd, void *v0, void *vs1,  
\
 b = vext_elem_mask(vs2, i);   \
 vext_set_elem_mask(vd, i, OP(b, a));  \
 } \
-for (; i < vlmax; i++) {  \
-vext_set_elem_mask(vd, i, 0); \
-} \
 }
 
 #define DO_NAND(N, M)  (!(N & M))
-- 
2.17.1




[RFC v4 66/70] target/riscv: rvv-1.0: narrowing floating-point/integer type-convert

2020-08-17 Thread frank . chang
From: Frank Chang 

Signed-off-by: Frank Chang 
---
 target/riscv/helper.h   | 30 ++
 target/riscv/insn32.decode  | 15 +++--
 target/riscv/insn_trans/trans_rvv.inc.c | 51 ++---
 target/riscv/vector_helper.c| 76 ++---
 4 files changed, 130 insertions(+), 42 deletions(-)

diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index b128610978d..2ecacdc225e 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -1009,16 +1009,26 @@ DEF_HELPER_5(vfwcvt_rtz_xu_f_v_w, void, ptr, ptr, ptr, 
env, i32)
 DEF_HELPER_5(vfwcvt_rtz_x_f_v_h, void, ptr, ptr, ptr, env, i32)
 DEF_HELPER_5(vfwcvt_rtz_x_f_v_w, void, ptr, ptr, ptr, env, i32)
 
-DEF_HELPER_5(vfncvt_xu_f_v_h, void, ptr, ptr, ptr, env, i32)
-DEF_HELPER_5(vfncvt_xu_f_v_w, void, ptr, ptr, ptr, env, i32)
-DEF_HELPER_5(vfncvt_x_f_v_h, void, ptr, ptr, ptr, env, i32)
-DEF_HELPER_5(vfncvt_x_f_v_w, void, ptr, ptr, ptr, env, i32)
-DEF_HELPER_5(vfncvt_f_xu_v_h, void, ptr, ptr, ptr, env, i32)
-DEF_HELPER_5(vfncvt_f_xu_v_w, void, ptr, ptr, ptr, env, i32)
-DEF_HELPER_5(vfncvt_f_x_v_h, void, ptr, ptr, ptr, env, i32)
-DEF_HELPER_5(vfncvt_f_x_v_w, void, ptr, ptr, ptr, env, i32)
-DEF_HELPER_5(vfncvt_f_f_v_h, void, ptr, ptr, ptr, env, i32)
-DEF_HELPER_5(vfncvt_f_f_v_w, void, ptr, ptr, ptr, env, i32)
+DEF_HELPER_5(vfncvt_xu_f_w_b, void, ptr, ptr, ptr, env, i32)
+DEF_HELPER_5(vfncvt_xu_f_w_h, void, ptr, ptr, ptr, env, i32)
+DEF_HELPER_5(vfncvt_xu_f_w_w, void, ptr, ptr, ptr, env, i32)
+DEF_HELPER_5(vfncvt_x_f_w_b, void, ptr, ptr, ptr, env, i32)
+DEF_HELPER_5(vfncvt_x_f_w_h, void, ptr, ptr, ptr, env, i32)
+DEF_HELPER_5(vfncvt_x_f_w_w, void, ptr, ptr, ptr, env, i32)
+DEF_HELPER_5(vfncvt_f_xu_w_h, void, ptr, ptr, ptr, env, i32)
+DEF_HELPER_5(vfncvt_f_xu_w_w, void, ptr, ptr, ptr, env, i32)
+DEF_HELPER_5(vfncvt_f_x_w_h, void, ptr, ptr, ptr, env, i32)
+DEF_HELPER_5(vfncvt_f_x_w_w, void, ptr, ptr, ptr, env, i32)
+DEF_HELPER_5(vfncvt_f_f_w_h, void, ptr, ptr, ptr, env, i32)
+DEF_HELPER_5(vfncvt_f_f_w_w, void, ptr, ptr, ptr, env, i32)
+DEF_HELPER_5(vfncvt_rod_f_f_w_h, void, ptr, ptr, ptr, env, i32)
+DEF_HELPER_5(vfncvt_rod_f_f_w_w, void, ptr, ptr, ptr, env, i32)
+DEF_HELPER_5(vfncvt_rtz_xu_f_w_b, void, ptr, ptr, ptr, env, i32)
+DEF_HELPER_5(vfncvt_rtz_xu_f_w_h, void, ptr, ptr, ptr, env, i32)
+DEF_HELPER_5(vfncvt_rtz_xu_f_w_w, void, ptr, ptr, ptr, env, i32)
+DEF_HELPER_5(vfncvt_rtz_x_f_w_b, void, ptr, ptr, ptr, env, i32)
+DEF_HELPER_5(vfncvt_rtz_x_f_w_h, void, ptr, ptr, ptr, env, i32)
+DEF_HELPER_5(vfncvt_rtz_x_f_w_w, void, ptr, ptr, ptr, env, i32)
 
 DEF_HELPER_6(vredsum_vs_b, void, ptr, ptr, ptr, ptr, env, i32)
 DEF_HELPER_6(vredsum_vs_h, void, ptr, ptr, ptr, ptr, env, i32)
diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
index fae96194078..3b42cb01a77 100644
--- a/target/riscv/insn32.decode
+++ b/target/riscv/insn32.decode
@@ -571,11 +571,16 @@ vfwcvt_f_x_v   010010 . . 01011 001 . 1010111 
@r2_vm
 vfwcvt_f_f_v   010010 . . 01100 001 . 1010111 @r2_vm
 vfwcvt_rtz_xu_f_v  010010 . . 01110 001 . 1010111 @r2_vm
 vfwcvt_rtz_x_f_v   010010 . . 0 001 . 1010111 @r2_vm
-vfncvt_xu_f_v   100010 . . 1 001 . 1010111 @r2_vm
-vfncvt_x_f_v100010 . . 10001 001 . 1010111 @r2_vm
-vfncvt_f_xu_v   100010 . . 10010 001 . 1010111 @r2_vm
-vfncvt_f_x_v100010 . . 10011 001 . 1010111 @r2_vm
-vfncvt_f_f_v100010 . . 10100 001 . 1010111 @r2_vm
+
+vfncvt_xu_f_w  010010 . . 1 001 . 1010111 @r2_vm
+vfncvt_x_f_w   010010 . . 10001 001 . 1010111 @r2_vm
+vfncvt_f_xu_w  010010 . . 10010 001 . 1010111 @r2_vm
+vfncvt_f_x_w   010010 . . 10011 001 . 1010111 @r2_vm
+vfncvt_f_f_w   010010 . . 10100 001 . 1010111 @r2_vm
+vfncvt_rod_f_f_w   010010 . . 10101 001 . 1010111 @r2_vm
+vfncvt_rtz_xu_f_w  010010 . . 10110 001 . 1010111 @r2_vm
+vfncvt_rtz_x_f_w   010010 . . 10111 001 . 1010111 @r2_vm
+
 vredsum_vs  00 . . . 010 . 1010111 @r_vm
 vredand_vs  01 . . . 010 . 1010111 @r_vm
 vredor_vs   10 . . . 010 . 1010111 @r_vm
diff --git a/target/riscv/insn_trans/trans_rvv.inc.c 
b/target/riscv/insn_trans/trans_rvv.inc.c
index 877655d9671..f2edf804460 100644
--- a/target/riscv/insn_trans/trans_rvv.inc.c
+++ b/target/riscv/insn_trans/trans_rvv.inc.c
@@ -2936,7 +2936,7 @@ static bool opfv_narrow_check(DisasContext *s, arg_rmr *a)
(s->sew != 0);
 }
 
-#define GEN_OPFV_NARROW_TRANS(NAME)\
+#define GEN_OPFV_NARROW_TRANS(NAME, FRM)   \
 static bool trans_##NAME(DisasContext *s, arg_rmr *a)  \
 {  \
 if (opfv_narrow_check(s, a)) { \
@@ -2946,7 +2946,7 @@ static bool trans_##NAME(DisasContext *s, arg_

[RFC v4 65/70] target/riscv: add "set round to odd" rounding mode helper function

2020-08-17 Thread frank . chang
From: Frank Chang 

helper_set_rounding_mode() is responsible for SIGILL, and "round to odd"
should be an interface private to translation, so add a new independent
helper_set_rod_rounding_mode().

Signed-off-by: Frank Chang 
---
 target/riscv/fpu_helper.c | 5 +
 target/riscv/helper.h | 1 +
 target/riscv/internals.h  | 1 +
 target/riscv/translate.c  | 5 +
 4 files changed, 12 insertions(+)

diff --git a/target/riscv/fpu_helper.c b/target/riscv/fpu_helper.c
index 92e076c6ed8..a01b8eab0b3 100644
--- a/target/riscv/fpu_helper.c
+++ b/target/riscv/fpu_helper.c
@@ -81,6 +81,11 @@ void helper_set_rounding_mode(CPURISCVState *env, uint32_t 
rm)
 set_float_rounding_mode(softrm, &env->fp_status);
 }
 
+void helper_set_rod_rounding_mode(CPURISCVState *env)
+{
+set_float_rounding_mode(float_round_to_odd, &env->fp_status);
+}
+
 static uint64_t do_fmadd_s(CPURISCVState *env, uint64_t rs1, uint64_t rs2,
uint64_t rs3, int flags)
 {
diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index 7539b4a5004..b128610978d 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -3,6 +3,7 @@ DEF_HELPER_2(raise_exception, noreturn, env, i32)
 
 /* Floating Point - rounding mode */
 DEF_HELPER_FLAGS_2(set_rounding_mode, TCG_CALL_NO_WG, void, env, i32)
+DEF_HELPER_FLAGS_1(set_rod_rounding_mode, TCG_CALL_NO_WG, void, env)
 
 /* Floating Point - fused */
 DEF_HELPER_FLAGS_4(fmadd_s, TCG_CALL_NO_RWG, i64, env, i64, i64, i64)
diff --git a/target/riscv/internals.h b/target/riscv/internals.h
index d9ea6a32188..20fb6f2cb7e 100644
--- a/target/riscv/internals.h
+++ b/target/riscv/internals.h
@@ -44,6 +44,7 @@ enum {
 FRM_RUP = 3,/* Round Up */
 FRM_RMM = 4,/* Round to Nearest, ties to Max Magnitude */
 FRM_DYN = 7,/* Dynamic rounding mode */
+FRM_ROD = 8,/* Round to Odd */
 };
 
 static inline uint64_t nanbox_s(float32 f)
diff --git a/target/riscv/translate.c b/target/riscv/translate.c
index 5817e9344e9..9ae331cbc1a 100644
--- a/target/riscv/translate.c
+++ b/target/riscv/translate.c
@@ -30,6 +30,7 @@
 #include "exec/log.h"
 
 #include "instmap.h"
+#include "internals.h"
 
 /* global register indices */
 static TCGv cpu_gpr[32], cpu_pc, cpu_vl;
@@ -584,6 +585,10 @@ static void gen_set_rm(DisasContext *ctx, int rm)
 return;
 }
 ctx->frm = rm;
+if (rm == FRM_ROD) {
+gen_helper_set_rod_rounding_mode(cpu_env);
+return;
+}
 t0 = tcg_const_i32(rm);
 gen_helper_set_rounding_mode(cpu_env, t0);
 tcg_temp_free_i32(t0);
-- 
2.17.1




[RFC v4 70/70] target/riscv: gdb: support vector registers for rv32

2020-08-17 Thread frank . chang
From: Greentime Hu 

This patch adds vector support for rv32 gdb. It allows the gdb client to
access vector registers correctly.

Signed-off-by: Greentime Hu 
Signed-off-by: Frank Chang 
---
 gdb-xml/riscv-32bit-csr.xml | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/gdb-xml/riscv-32bit-csr.xml b/gdb-xml/riscv-32bit-csr.xml
index 3d2031da7dc..bb98b927995 100644
--- a/gdb-xml/riscv-32bit-csr.xml
+++ b/gdb-xml/riscv-32bit-csr.xml
@@ -248,4 +248,11 @@
   
   
   
+  
+  
+  
+  
+  
+  
+  
 
-- 
2.17.1




[RFC v4 55/70] target/riscv: rvv-1.0: single-width floating-point reduction

2020-08-17 Thread frank . chang
From: Frank Chang 

Signed-off-by: Frank Chang 
---
 target/riscv/vector_helper.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c
index 15a646af361..00743cbce34 100644
--- a/target/riscv/vector_helper.c
+++ b/target/riscv/vector_helper.c
@@ -4432,14 +4432,14 @@ GEN_VEXT_FRED(vfredsum_vs_w, uint32_t, uint32_t, H4, H4, float32_add)
 GEN_VEXT_FRED(vfredsum_vs_d, uint64_t, uint64_t, H8, H8, float64_add)
 
 /* Maximum value */
-GEN_VEXT_FRED(vfredmax_vs_h, uint16_t, uint16_t, H2, H2, float16_maxnum)
-GEN_VEXT_FRED(vfredmax_vs_w, uint32_t, uint32_t, H4, H4, float32_maxnum)
-GEN_VEXT_FRED(vfredmax_vs_d, uint64_t, uint64_t, H8, H8, float64_maxnum)
+GEN_VEXT_FRED(vfredmax_vs_h, uint16_t, uint16_t, H2, H2, float16_maxnum_noprop)
+GEN_VEXT_FRED(vfredmax_vs_w, uint32_t, uint32_t, H4, H4, float32_maxnum_noprop)
+GEN_VEXT_FRED(vfredmax_vs_d, uint64_t, uint64_t, H8, H8, float64_maxnum_noprop)
 
 /* Minimum value */
-GEN_VEXT_FRED(vfredmin_vs_h, uint16_t, uint16_t, H2, H2, float16_minnum)
-GEN_VEXT_FRED(vfredmin_vs_w, uint32_t, uint32_t, H4, H4, float32_minnum)
-GEN_VEXT_FRED(vfredmin_vs_d, uint64_t, uint64_t, H8, H8, float64_minnum)
+GEN_VEXT_FRED(vfredmin_vs_h, uint16_t, uint16_t, H2, H2, float16_minnum_noprop)
+GEN_VEXT_FRED(vfredmin_vs_w, uint32_t, uint32_t, H4, H4, float32_minnum_noprop)
+GEN_VEXT_FRED(vfredmin_vs_d, uint64_t, uint64_t, H8, H8, float64_minnum_noprop)
 
 /* Vector Widening Floating-Point Reduction Instructions */
 /* Unordered reduce 2*SEW = 2*SEW + sum(promote(SEW)) */
-- 
2.17.1




[RFC v4 27/70] target/riscv: rvv-1.0: floating-point classify instructions

2020-08-17 Thread frank . chang
From: Frank Chang 

Signed-off-by: Frank Chang 
Reviewed-by: Richard Henderson 
---
 target/riscv/insn32.decode | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
index f142aa5d073..a800c989050 100644
--- a/target/riscv/insn32.decode
+++ b/target/riscv/insn32.decode
@@ -550,7 +550,7 @@ vmfgt_vf011101 . . . 101 . 1010111 @r_vm
 vmfge_vf01 . . . 101 . 1010111 @r_vm
 vmford_vv   011010 . . . 001 . 1010111 @r_vm
 vmford_vf   011010 . . . 101 . 1010111 @r_vm
-vfclass_v   100011 . . 1 001 . 1010111 @r2_vm
+vfclass_v   010011 . . 1 001 . 1010111 @r2_vm
 vfmerge_vfm 010111 0 . . 101 . 1010111 @r_vm_0
 vfmv_v_f010111 1 0 . 101 . 1010111 @r2
 vfcvt_xu_f_v100010 . . 0 001 . 1010111 @r2_vm
-- 
2.17.1




[RFC v4 21/70] target/riscv: rvv-1.0: fault-only-first unit stride load

2020-08-17 Thread frank . chang
From: Frank Chang 

Signed-off-by: Frank Chang 
---
 target/riscv/helper.h   | 27 +++-
 target/riscv/insn32.decode  | 14 +++
 target/riscv/insn_trans/trans_rvv.inc.c | 31 --
 target/riscv/vector_helper.c| 56 +
 4 files changed, 38 insertions(+), 90 deletions(-)

diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index 8a5d97969da..3d931ba0c70 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -140,28 +140,11 @@ DEF_HELPER_6(vsxei64_8_v, void, ptr, ptr, tl, ptr, env, i32)
 DEF_HELPER_6(vsxei64_16_v, void, ptr, ptr, tl, ptr, env, i32)
 DEF_HELPER_6(vsxei64_32_v, void, ptr, ptr, tl, ptr, env, i32)
 DEF_HELPER_6(vsxei64_64_v, void, ptr, ptr, tl, ptr, env, i32)
-DEF_HELPER_5(vlbff_v_b, void, ptr, ptr, tl, env, i32)
-DEF_HELPER_5(vlbff_v_h, void, ptr, ptr, tl, env, i32)
-DEF_HELPER_5(vlbff_v_w, void, ptr, ptr, tl, env, i32)
-DEF_HELPER_5(vlbff_v_d, void, ptr, ptr, tl, env, i32)
-DEF_HELPER_5(vlhff_v_h, void, ptr, ptr, tl, env, i32)
-DEF_HELPER_5(vlhff_v_w, void, ptr, ptr, tl, env, i32)
-DEF_HELPER_5(vlhff_v_d, void, ptr, ptr, tl, env, i32)
-DEF_HELPER_5(vlwff_v_w, void, ptr, ptr, tl, env, i32)
-DEF_HELPER_5(vlwff_v_d, void, ptr, ptr, tl, env, i32)
-DEF_HELPER_5(vleff_v_b, void, ptr, ptr, tl, env, i32)
-DEF_HELPER_5(vleff_v_h, void, ptr, ptr, tl, env, i32)
-DEF_HELPER_5(vleff_v_w, void, ptr, ptr, tl, env, i32)
-DEF_HELPER_5(vleff_v_d, void, ptr, ptr, tl, env, i32)
-DEF_HELPER_5(vlbuff_v_b, void, ptr, ptr, tl, env, i32)
-DEF_HELPER_5(vlbuff_v_h, void, ptr, ptr, tl, env, i32)
-DEF_HELPER_5(vlbuff_v_w, void, ptr, ptr, tl, env, i32)
-DEF_HELPER_5(vlbuff_v_d, void, ptr, ptr, tl, env, i32)
-DEF_HELPER_5(vlhuff_v_h, void, ptr, ptr, tl, env, i32)
-DEF_HELPER_5(vlhuff_v_w, void, ptr, ptr, tl, env, i32)
-DEF_HELPER_5(vlhuff_v_d, void, ptr, ptr, tl, env, i32)
-DEF_HELPER_5(vlwuff_v_w, void, ptr, ptr, tl, env, i32)
-DEF_HELPER_5(vlwuff_v_d, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(vle8ff_v, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(vle16ff_v, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(vle32ff_v, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(vle64ff_v, void, ptr, ptr, tl, env, i32)
+
 #ifdef TARGET_RISCV64
 DEF_HELPER_6(vamoswapw_v_d, void, ptr, ptr, tl, ptr, env, i32)
 DEF_HELPER_6(vamoswapd_v_d, void, ptr, ptr, tl, ptr, env, i32)
diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
index 46542d162e6..b0aaa186b8b 100644
--- a/target/riscv/insn32.decode
+++ b/target/riscv/insn32.decode
@@ -249,14 +249,6 @@ vsse16_v... 010 . . . 101 . 0100111 @r_nfvm
 vsse32_v... 010 . . . 110 . 0100111 @r_nfvm
 vsse64_v... 010 . . . 111 . 0100111 @r_nfvm
 
-vlbff_v... 100 . 1 . 000 . 111 @r2_nfvm
-vlhff_v... 100 . 1 . 101 . 111 @r2_nfvm
-vlwff_v... 100 . 1 . 110 . 111 @r2_nfvm
-vleff_v... 000 . 1 . 111 . 111 @r2_nfvm
-vlbuff_v   ... 000 . 1 . 000 . 111 @r2_nfvm
-vlhuff_v   ... 000 . 1 . 101 . 111 @r2_nfvm
-vlwuff_v   ... 000 . 1 . 110 . 111 @r2_nfvm
-
 # Vector indexed load insns.
 vlxei8_v  ... 011 . . . 000 . 111 @r_nfvm
 vlxei16_v ... 011 . . . 101 . 111 @r_nfvm
@@ -269,6 +261,12 @@ vsxei16_v ... 0-1 . . . 101 . 0100111 @r_nfvm
 vsxei32_v ... 0-1 . . . 110 . 0100111 @r_nfvm
 vsxei64_v ... 0-1 . . . 111 . 0100111 @r_nfvm
 
+# Vector unit-stride fault-only-first load insns.
+vle8ff_v  ... 000 . 1 . 000 . 111 @r2_nfvm
+vle16ff_v ... 000 . 1 . 101 . 111 @r2_nfvm
+vle32ff_v ... 000 . 1 . 110 . 111 @r2_nfvm
+vle64ff_v ... 000 . 1 . 111 . 111 @r2_nfvm
+
 #*** Vector AMO operations are encoded under the standard AMO major opcode ***
 vamoswapw_v 1 . . . . 110 . 010 @r_wdvm
 vamoaddw_v  0 . . . . 110 . 010 @r_wdvm
diff --git a/target/riscv/insn_trans/trans_rvv.inc.c b/target/riscv/insn_trans/trans_rvv.inc.c
index 74e83824b36..6bb3cd47ff9 100644
--- a/target/riscv/insn_trans/trans_rvv.inc.c
+++ b/target/riscv/insn_trans/trans_rvv.inc.c
@@ -946,24 +946,12 @@ static bool ldff_op(DisasContext *s, arg_r2nfvm *a, uint8_t seq)
 {
 uint32_t data = 0;
 gen_helper_ldst_us *fn;
-static gen_helper_ldst_us * const fns[7][4] = {
-{ gen_helper_vlbff_v_b,  gen_helper_vlbff_v_h,
-  gen_helper_vlbff_v_w,  gen_helper_vlbff_v_d },
-{ NULL,  gen_helper_vlhff_v_h,
-  gen_helper_vlhff_v_w,  gen_helper_vlhff_v_d },
-{ NULL,  NULL,
-  gen_helper_vlwff_v_w,  gen_helper_vlwff_v_d },
-{ gen_helper_vleff_v_b,  gen_helper_vleff_v_h,
-  gen_helper_vleff_v_w,  gen_helper_vleff_v_d },
-{ gen_helper_vlbuff_v_b, gen_helper_vlbuff_v_h

[RFC v4 12/70] target/riscv: rvv-1.0: add fractional LMUL

2020-08-17 Thread frank . chang
From: Frank Chang 

Introduce the concept of fractional LMUL for RVV 1.0.
In RVV 1.0, the LMUL bits are contiguous in the vtype register.

Signed-off-by: Frank Chang 
---
 target/riscv/cpu.h   | 15 ---
 target/riscv/translate.c | 16 ++--
 target/riscv/vector_helper.c | 16 ++--
 3 files changed, 36 insertions(+), 11 deletions(-)

diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
index 08d2c10a024..d0f9a76ca01 100644
--- a/target/riscv/cpu.h
+++ b/target/riscv/cpu.h
@@ -94,10 +94,10 @@ typedef struct CPURISCVState CPURISCVState;
 
 #define RV_VLEN_MAX 256
 
-FIELD(VTYPE, VLMUL, 0, 2)
-FIELD(VTYPE, VSEW, 2, 3)
-FIELD(VTYPE, VEDIV, 5, 2)
-FIELD(VTYPE, RESERVED, 7, sizeof(target_ulong) * 8 - 9)
+FIELD(VTYPE, VLMUL, 0, 3)
+FIELD(VTYPE, VSEW, 3, 3)
+FIELD(VTYPE, VEDIV, 8, 2)
+FIELD(VTYPE, RESERVED, 10, sizeof(target_ulong) * 8 - 11)
 FIELD(VTYPE, VILL, sizeof(target_ulong) * 8 - 1, 1)
 
 struct CPURISCVState {
@@ -368,9 +368,10 @@ typedef RISCVCPU ArchCPU;
 #include "exec/cpu-all.h"
 
 FIELD(TB_FLAGS, VL_EQ_VLMAX, 2, 1)
-FIELD(TB_FLAGS, LMUL, 3, 2)
-FIELD(TB_FLAGS, SEW, 5, 3)
-FIELD(TB_FLAGS, VILL, 8, 1)
+FIELD(TB_FLAGS, LMUL, 3, 3)
+FIELD(TB_FLAGS, SEW, 6, 3)
+/* Skip MSTATUS_VS (0x600) fields */
+FIELD(TB_FLAGS, VILL, 11, 1)
 
 /*
  * A simplification for VLMAX
diff --git a/target/riscv/translate.c b/target/riscv/translate.c
index 7b6088677d4..10ef55bbeb7 100644
--- a/target/riscv/translate.c
+++ b/target/riscv/translate.c
@@ -59,7 +59,19 @@ typedef struct DisasContext {
 bool ext_ifencei;
 /* vector extension */
 bool vill;
-uint8_t lmul;
+/*
+ * Encode LMUL to lmul as follows:
+ * LMUL   vlmul   lmul
+ *  1      000      0
+ *  2      001      1
+ *  4      010      2
+ *  8      011      3
+ *  -      100      -
+ * 1/8     101     -3
+ * 1/4     110     -2
+ * 1/2     111     -1
+ */
+int8_t lmul;
 uint8_t sew;
 uint16_t vlen;
 bool vl_eq_vlmax;
@@ -851,7 +863,7 @@ static void riscv_tr_init_disas_context(DisasContextBase *dcbase, CPUState *cs)
 ctx->vlen = cpu->cfg.vlen;
 ctx->vill = FIELD_EX32(tb_flags, TB_FLAGS, VILL);
 ctx->sew = FIELD_EX32(tb_flags, TB_FLAGS, SEW);
-ctx->lmul = FIELD_EX32(tb_flags, TB_FLAGS, LMUL);
+ctx->lmul = sextract32(FIELD_EX32(tb_flags, TB_FLAGS, LMUL), 0, 3);
 ctx->vl_eq_vlmax = FIELD_EX32(tb_flags, TB_FLAGS, VL_EQ_VLMAX);
 }
 
diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c
index f42346cb9ca..37c510b98f0 100644
--- a/target/riscv/vector_helper.c
+++ b/target/riscv/vector_helper.c
@@ -86,9 +86,21 @@ static inline uint32_t vext_vm(uint32_t desc)
 return FIELD_EX32(simd_data(desc), VDATA, VM);
 }
 
-static inline uint32_t vext_lmul(uint32_t desc)
+/*
+ * Encode LMUL to lmul as follows:
+ * LMUL   vlmul   lmul
+ *  1      000      0
+ *  2      001      1
+ *  4      010      2
+ *  8      011      3
+ *  -      100      -
+ * 1/8     101     -3
+ * 1/4     110     -2
+ * 1/2     111     -1
+ */
+static inline int32_t vext_lmul(uint32_t desc)
 {
-return FIELD_EX32(simd_data(desc), VDATA, LMUL);
+return sextract32(FIELD_EX32(simd_data(desc), VDATA, LMUL), 0, 3);
 }
 
 static uint32_t vext_wd(uint32_t desc)
-- 
2.17.1




[RFC v4 28/70] target/riscv: rvv-1.0: mask population count instruction

2020-08-17 Thread frank . chang
From: Frank Chang 

Signed-off-by: Frank Chang 
Reviewed-by: Richard Henderson 
---
 target/riscv/helper.h   | 2 +-
 target/riscv/insn32.decode  | 2 +-
 target/riscv/insn_trans/trans_rvv.inc.c | 7 ---
 target/riscv/vector_helper.c| 6 +++---
 4 files changed, 9 insertions(+), 8 deletions(-)

diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index 25d076d71a8..0a1179370b1 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -1066,7 +1066,7 @@ DEF_HELPER_6(vmnor_mm, void, ptr, ptr, ptr, ptr, env, i32)
 DEF_HELPER_6(vmornot_mm, void, ptr, ptr, ptr, ptr, env, i32)
 DEF_HELPER_6(vmxnor_mm, void, ptr, ptr, ptr, ptr, env, i32)
 
-DEF_HELPER_4(vmpopc_m, tl, ptr, ptr, env, i32)
+DEF_HELPER_4(vpopc_m, tl, ptr, ptr, env, i32)
 
 DEF_HELPER_4(vmfirst_m, tl, ptr, ptr, env, i32)
 
diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
index a800c989050..3d2d43ebd8a 100644
--- a/target/riscv/insn32.decode
+++ b/target/riscv/insn32.decode
@@ -591,7 +591,7 @@ vmor_mm 011010 - . . 010 . 1010111 @r
 vmnor_mm00 - . . 010 . 1010111 @r
 vmornot_mm  011100 - . . 010 . 1010111 @r
 vmxnor_mm   01 - . . 010 . 1010111 @r
-vmpopc_m010100 . . - 010 . 1010111 @r2_vm
+vpopc_m 01 . . 1 010 . 1010111 @r2_vm
 vmfirst_m   010101 . . - 010 . 1010111 @r2_vm
 vmsbf_m 010110 . . 1 010 . 1010111 @r2_vm
 vmsif_m 010110 . . 00011 010 . 1010111 @r2_vm
diff --git a/target/riscv/insn_trans/trans_rvv.inc.c b/target/riscv/insn_trans/trans_rvv.inc.c
index 2c6efce00a7..ce963c33af8 100644
--- a/target/riscv/insn_trans/trans_rvv.inc.c
+++ b/target/riscv/insn_trans/trans_rvv.inc.c
@@ -2895,8 +2895,8 @@ GEN_MM_TRANS(vmnor_mm)
 GEN_MM_TRANS(vmornot_mm)
 GEN_MM_TRANS(vmxnor_mm)
 
-/* Vector mask population count vmpopc */
-static bool trans_vmpopc_m(DisasContext *s, arg_rmr *a)
+/* Vector mask population count vpopc */
+static bool trans_vpopc_m(DisasContext *s, arg_rmr *a)
 {
 if (require_rvv(s) &&
 vext_check_isa_ill(s)) {
@@ -2915,13 +2915,14 @@ static bool trans_vmpopc_m(DisasContext *s, arg_rmr *a)
 tcg_gen_addi_ptr(src2, cpu_env, vreg_ofs(s, a->rs2));
 tcg_gen_addi_ptr(mask, cpu_env, vreg_ofs(s, 0));
 
-gen_helper_vmpopc_m(dst, mask, src2, cpu_env, desc);
+gen_helper_vpopc_m(dst, mask, src2, cpu_env, desc);
 gen_set_gpr(a->rd, dst);
 
 tcg_temp_free_ptr(mask);
 tcg_temp_free_ptr(src2);
 tcg_temp_free(dst);
 tcg_temp_free_i32(desc);
+
 return true;
 }
 return false;
diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c
index f802e8c9c05..13694c1b2c4 100644
--- a/target/riscv/vector_helper.c
+++ b/target/riscv/vector_helper.c
@@ -4447,9 +4447,9 @@ GEN_VEXT_MASK_VV(vmnor_mm, DO_NOR)
 GEN_VEXT_MASK_VV(vmornot_mm, DO_ORNOT)
 GEN_VEXT_MASK_VV(vmxnor_mm, DO_XNOR)
 
-/* Vector mask population count vmpopc */
-target_ulong HELPER(vmpopc_m)(void *v0, void *vs2, CPURISCVState *env,
-  uint32_t desc)
+/* Vector mask population count vpopc */
+target_ulong HELPER(vpopc_m)(void *v0, void *vs2, CPURISCVState *env,
+ uint32_t desc)
 {
 target_ulong cnt = 0;
 uint32_t vm = vext_vm(desc);
-- 
2.17.1




[RFC v4 29/70] target/riscv: rvv-1.0: find-first-set mask bit instruction

2020-08-17 Thread frank . chang
From: Frank Chang 

Signed-off-by: Frank Chang 
Reviewed-by: Richard Henderson 
---
 target/riscv/helper.h   | 2 +-
 target/riscv/insn32.decode  | 2 +-
 target/riscv/insn_trans/trans_rvv.inc.c | 4 ++--
 target/riscv/vector_helper.c| 6 +++---
 4 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index 0a1179370b1..a5d58010134 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -1068,7 +1068,7 @@ DEF_HELPER_6(vmxnor_mm, void, ptr, ptr, ptr, ptr, env, i32)
 
 DEF_HELPER_4(vpopc_m, tl, ptr, ptr, env, i32)
 
-DEF_HELPER_4(vmfirst_m, tl, ptr, ptr, env, i32)
+DEF_HELPER_4(vfirst_m, tl, ptr, ptr, env, i32)
 
 DEF_HELPER_5(vmsbf_m, void, ptr, ptr, ptr, env, i32)
 DEF_HELPER_5(vmsif_m, void, ptr, ptr, ptr, env, i32)
diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
index 3d2d43ebd8a..d72120cfd85 100644
--- a/target/riscv/insn32.decode
+++ b/target/riscv/insn32.decode
@@ -592,7 +592,7 @@ vmnor_mm00 - . . 010 . 1010111 @r
 vmornot_mm  011100 - . . 010 . 1010111 @r
 vmxnor_mm   01 - . . 010 . 1010111 @r
 vpopc_m 01 . . 1 010 . 1010111 @r2_vm
-vmfirst_m   010101 . . - 010 . 1010111 @r2_vm
+vfirst_m01 . . 10001 010 . 1010111 @r2_vm
 vmsbf_m 010110 . . 1 010 . 1010111 @r2_vm
 vmsif_m 010110 . . 00011 010 . 1010111 @r2_vm
 vmsof_m 010110 . . 00010 010 . 1010111 @r2_vm
diff --git a/target/riscv/insn_trans/trans_rvv.inc.c b/target/riscv/insn_trans/trans_rvv.inc.c
index ce963c33af8..e1f9903a8b5 100644
--- a/target/riscv/insn_trans/trans_rvv.inc.c
+++ b/target/riscv/insn_trans/trans_rvv.inc.c
@@ -2929,7 +2929,7 @@ static bool trans_vpopc_m(DisasContext *s, arg_rmr *a)
 }
 
 /* vmfirst find-first-set mask bit */
-static bool trans_vmfirst_m(DisasContext *s, arg_rmr *a)
+static bool trans_vfirst_m(DisasContext *s, arg_rmr *a)
 {
 if (require_rvv(s) &&
 vext_check_isa_ill(s)) {
@@ -2948,7 +2948,7 @@ static bool trans_vmfirst_m(DisasContext *s, arg_rmr *a)
 tcg_gen_addi_ptr(src2, cpu_env, vreg_ofs(s, a->rs2));
 tcg_gen_addi_ptr(mask, cpu_env, vreg_ofs(s, 0));
 
-gen_helper_vmfirst_m(dst, mask, src2, cpu_env, desc);
+gen_helper_vfirst_m(dst, mask, src2, cpu_env, desc);
 gen_set_gpr(a->rd, dst);
 
 tcg_temp_free_ptr(mask);
diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c
index 13694c1b2c4..973eb689c51 100644
--- a/target/riscv/vector_helper.c
+++ b/target/riscv/vector_helper.c
@@ -4466,9 +4466,9 @@ target_ulong HELPER(vpopc_m)(void *v0, void *vs2, CPURISCVState *env,
 return cnt;
 }
 
-/* vmfirst find-first-set mask bit*/
-target_ulong HELPER(vmfirst_m)(void *v0, void *vs2, CPURISCVState *env,
-   uint32_t desc)
+/* vfirst find-first-set mask bit*/
+target_ulong HELPER(vfirst_m)(void *v0, void *vs2, CPURISCVState *env,
+  uint32_t desc)
 {
 uint32_t vm = vext_vm(desc);
 uint32_t vl = env->vl;
-- 
2.17.1




[RFC v4 50/70] target/riscv: rvv-1.0: floating-point compare instructions

2020-08-17 Thread frank . chang
From: Frank Chang 

Signed-off-by: Frank Chang 
---
 target/riscv/vector_helper.c | 8 
 1 file changed, 8 deletions(-)

diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c
index e6441f18465..766622d3878 100644
--- a/target/riscv/vector_helper.c
+++ b/target/riscv/vector_helper.c
@@ -3963,7 +3963,6 @@ void HELPER(NAME)(void *vd, void *v0, void *vs1, void *vs2,   \
 { \
 uint32_t vm = vext_vm(desc);  \
 uint32_t vl = env->vl;\
-uint32_t vlmax = vext_maxsz(desc) / sizeof(ETYPE);\
 uint32_t i;   \
   \
 for (i = 0; i < vl; i++) {\
@@ -3975,9 +3974,6 @@ void HELPER(NAME)(void *vd, void *v0, void *vs1, void *vs2,   \
 vext_set_elem_mask(vd, i, \
 DO_OP(s2, s1, &env->fp_status));   \
 } \
-for (; i < vlmax; i++) {  \
-vext_set_elem_mask(vd, i, 0); \
-} \
 }
 
 GEN_VEXT_CMP_VV_ENV(vmfeq_vv_h, uint16_t, H2, float16_eq_quiet)
@@ -3990,7 +3986,6 @@ void HELPER(NAME)(void *vd, void *v0, uint64_t s1, void *vs2,   \
 {   \
 uint32_t vm = vext_vm(desc);\
 uint32_t vl = env->vl;  \
-uint32_t vlmax = vext_max_elems(desc, ctzl(sizeof(ETYPE))); \
 uint32_t i; \
 \
 for (i = 0; i < vl; i++) {  \
@@ -4001,9 +3996,6 @@ void HELPER(NAME)(void *vd, void *v0, uint64_t s1, void *vs2,   \
 vext_set_elem_mask(vd, i,   \
 DO_OP(s2, (ETYPE)s1, &env->fp_status));  \
 }   \
-for (; i < vlmax; i++) {\
-vext_set_elem_mask(vd, i, 0);   \
-}   \
 }
 
 GEN_VEXT_CMP_VF(vmfeq_vf_h, uint16_t, H2, float16_eq_quiet)
-- 
2.17.1




[RFC v4 30/70] target/riscv: rvv-1.0: set-X-first mask bit instructions

2020-08-17 Thread frank . chang
From: Frank Chang 

Signed-off-by: Frank Chang 
Reviewed-by: Richard Henderson 
---
 target/riscv/insn32.decode  | 6 +++---
 target/riscv/insn_trans/trans_rvv.inc.c | 5 -
 target/riscv/vector_helper.c| 4 
 3 files changed, 7 insertions(+), 8 deletions(-)

diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
index d72120cfd85..0992d6ac86d 100644
--- a/target/riscv/insn32.decode
+++ b/target/riscv/insn32.decode
@@ -593,9 +593,9 @@ vmornot_mm  011100 - . . 010 . 1010111 @r
 vmxnor_mm   01 - . . 010 . 1010111 @r
 vpopc_m 01 . . 1 010 . 1010111 @r2_vm
 vfirst_m01 . . 10001 010 . 1010111 @r2_vm
-vmsbf_m 010110 . . 1 010 . 1010111 @r2_vm
-vmsif_m 010110 . . 00011 010 . 1010111 @r2_vm
-vmsof_m 010110 . . 00010 010 . 1010111 @r2_vm
+vmsbf_m 010100 . . 1 010 . 1010111 @r2_vm
+vmsif_m 010100 . . 00011 010 . 1010111 @r2_vm
+vmsof_m 010100 . . 00010 010 . 1010111 @r2_vm
 viota_m 010110 . . 1 010 . 1010111 @r2_vm
 vid_v   010110 . 0 10001 010 . 1010111 @r1_vm
 vext_x_v001100 1 . . 010 . 1010111 @r
diff --git a/target/riscv/insn_trans/trans_rvv.inc.c b/target/riscv/insn_trans/trans_rvv.inc.c
index e1f9903a8b5..b21fa747d84 100644
--- a/target/riscv/insn_trans/trans_rvv.inc.c
+++ b/target/riscv/insn_trans/trans_rvv.inc.c
@@ -2966,7 +2966,10 @@ static bool trans_vfirst_m(DisasContext *s, arg_rmr *a)
 #define GEN_M_TRANS(NAME)  \
 static bool trans_##NAME(DisasContext *s, arg_rmr *a)  \
 {  \
-if (vext_check_isa_ill(s)) {   \
+if (require_rvv(s) &&  \
+vext_check_isa_ill(s) &&   \
+require_vm(a->vm, a->rd) &&\
+(a->rd != a->rs2)) {   \
 uint32_t data = 0; \
 gen_helper_gvec_3_ptr *fn = gen_helper_##NAME; \
 TCGLabel *over = gen_new_label();  \
diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c
index 973eb689c51..716e1926ee2 100644
--- a/target/riscv/vector_helper.c
+++ b/target/riscv/vector_helper.c
@@ -4493,7 +4493,6 @@ enum set_mask_type {
 static void vmsetm(void *vd, void *v0, void *vs2, CPURISCVState *env,
uint32_t desc, enum set_mask_type type)
 {
-uint32_t vlmax = env_archcpu(env)->cfg.vlen;
 uint32_t vm = vext_vm(desc);
 uint32_t vl = env->vl;
 int i;
@@ -4523,9 +4522,6 @@ static void vmsetm(void *vd, void *v0, void *vs2, CPURISCVState *env,
 }
 }
 }
-for (; i < vlmax; i++) {
-vext_set_elem_mask(vd, i, 0);
-}
 }
 
 void HELPER(vmsbf_m)(void *vd, void *v0, void *vs2, CPURISCVState *env,
-- 
2.17.1




[RFC v4 33/70] target/riscv: rvv-1.0: allow load element with sign-extended

2020-08-17 Thread frank . chang
From: Frank Chang 

For some vector instructions (e.g. vmv.s.x), the element is loaded with
sign extension.

Signed-off-by: Frank Chang 
Reviewed-by: Richard Henderson 
---
 target/riscv/insn_trans/trans_rvv.inc.c | 32 +
 1 file changed, 22 insertions(+), 10 deletions(-)

diff --git a/target/riscv/insn_trans/trans_rvv.inc.c b/target/riscv/insn_trans/trans_rvv.inc.c
index b21fa747d84..be5149fa762 100644
--- a/target/riscv/insn_trans/trans_rvv.inc.c
+++ b/target/riscv/insn_trans/trans_rvv.inc.c
@@ -3052,17 +3052,29 @@ static bool trans_vid_v(DisasContext *s, arg_vid_v *a)
 /* Integer Extract Instruction */
 
 static void load_element(TCGv_i64 dest, TCGv_ptr base,
- int ofs, int sew)
+ int ofs, int sew, bool sign)
 {
 switch (sew) {
 case MO_8:
-tcg_gen_ld8u_i64(dest, base, ofs);
+if (!sign) {
+tcg_gen_ld8u_i64(dest, base, ofs);
+} else {
+tcg_gen_ld8s_i64(dest, base, ofs);
+}
 break;
 case MO_16:
-tcg_gen_ld16u_i64(dest, base, ofs);
+if (!sign) {
+tcg_gen_ld16u_i64(dest, base, ofs);
+} else {
+tcg_gen_ld16s_i64(dest, base, ofs);
+}
 break;
 case MO_32:
-tcg_gen_ld32u_i64(dest, base, ofs);
+if (!sign) {
+tcg_gen_ld32u_i64(dest, base, ofs);
+} else {
+tcg_gen_ld32s_i64(dest, base, ofs);
+}
 break;
 case MO_64:
 tcg_gen_ld_i64(dest, base, ofs);
@@ -3117,7 +3129,7 @@ static void vec_element_loadx(DisasContext *s, TCGv_i64 dest,
 
 /* Perform the load. */
 load_element(dest, base,
- vreg_ofs(s, vreg), s->sew);
+ vreg_ofs(s, vreg), s->sew, false);
 tcg_temp_free_ptr(base);
 tcg_temp_free_i32(ofs);
 
@@ -3135,9 +3147,9 @@ static void vec_element_loadx(DisasContext *s, TCGv_i64 dest,
 }
 
 static void vec_element_loadi(DisasContext *s, TCGv_i64 dest,
-  int vreg, int idx)
+  int vreg, int idx, bool sign)
 {
-load_element(dest, cpu_env, endian_ofs(s, vreg, idx), s->sew);
+load_element(dest, cpu_env, endian_ofs(s, vreg, idx), s->sew, sign);
 }
 
 static bool trans_vext_x_v(DisasContext *s, arg_r *a)
@@ -3147,7 +3159,7 @@ static bool trans_vext_x_v(DisasContext *s, arg_r *a)
 
 if (a->rs1 == 0) {
 /* Special case vmv.x.s rd, vs2. */
-vec_element_loadi(s, tmp, a->rs2, 0);
+vec_element_loadi(s, tmp, a->rs2, 0, false);
 } else {
 /* This instruction ignores LMUL and vector register groups */
 int vlmax = s->vlen >> (3 + s->sew);
@@ -3229,7 +3241,7 @@ static bool trans_vfmv_f_s(DisasContext *s, arg_vfmv_f_s *a)
 (s->mstatus_fs != 0) && (s->sew != 0)) {
 unsigned int len = 8 << s->sew;
 
-vec_element_loadi(s, cpu_fpr[a->rd], a->rs2, 0);
+vec_element_loadi(s, cpu_fpr[a->rd], a->rs2, 0, false);
 if (len < 64) {
 tcg_gen_ori_i64(cpu_fpr[a->rd], cpu_fpr[a->rd],
 MAKE_64BIT_MASK(len, 64 - len));
@@ -3331,7 +3343,7 @@ static bool trans_vrgather_vx(DisasContext *s, arg_rmrr *a)
 TCGv_i64 dest = tcg_temp_new_i64();
 
 if (a->rs1 == 0) {
-vec_element_loadi(s, dest, a->rs2, 0);
+vec_element_loadi(s, dest, a->rs2, 0, false);
 } else {
 vec_element_loadx(s, dest, a->rs2, cpu_gpr[a->rs1], vlmax);
 }
-- 
2.17.1
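
Note on the patch above: the new `sign` flag only changes how an element narrower than 64 bits is widened into the 64-bit destination. The host-side sketch below shows that difference. It is an analogue only, not TCG code; load_element_sketch() and its log2_bytes parameter (playing the role of sew, 0..3) are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical host-side analogue of load_element(): read one element of
 * (1 << log2_bytes) bytes at base + ofs and widen it to 64 bits, either
 * zero-extended (sign == false) or sign-extended (sign == true).
 */
static uint64_t load_element_sketch(const void *base, size_t ofs,
                                    unsigned log2_bytes, bool sign)
{
    const uint8_t *p = (const uint8_t *)base + ofs;
    uint64_t v;

    assert(log2_bytes <= 3);
    switch (log2_bytes) {
    case 0: { uint8_t t;  memcpy(&t, p, sizeof(t)); v = t; break; }
    case 1: { uint16_t t; memcpy(&t, p, sizeof(t)); v = t; break; }
    case 2: { uint32_t t; memcpy(&t, p, sizeof(t)); v = t; break; }
    default: { uint64_t t; memcpy(&t, p, sizeof(t)); v = t; break; }
    }

    if (sign && log2_bytes < 3) {
        /* Flip the element's top bit, then subtract it back out. */
        uint64_t m = 1ULL << ((8u << log2_bytes) - 1);
        v = (v ^ m) - m;
    }
    return v;
}
```

A 64-bit element needs no extension, which matches the unchanged MO_64 case in the diff.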




[RFC v4 53/70] target/riscv: rvv-1.0: floating-point slide instructions

2020-08-17 Thread frank . chang
From: Frank Chang 

Add the following instructions:

* vfslide1up.vf
* vfslide1down.vf

Signed-off-by: Frank Chang 
---
 target/riscv/helper.h   |   7 ++
 target/riscv/insn32.decode  |   2 +
 target/riscv/insn_trans/trans_rvv.inc.c |   4 +
 target/riscv/vector_helper.c| 141 
 4 files changed, 109 insertions(+), 45 deletions(-)

diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index 6825c15e025..6d98de1be15 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -1132,6 +1132,13 @@ DEF_HELPER_6(vslide1down_vx_h, void, ptr, ptr, tl, ptr, env, i32)
 DEF_HELPER_6(vslide1down_vx_w, void, ptr, ptr, tl, ptr, env, i32)
 DEF_HELPER_6(vslide1down_vx_d, void, ptr, ptr, tl, ptr, env, i32)
 
+DEF_HELPER_6(vfslide1up_vf_h, void, ptr, ptr, i64, ptr, env, i32)
+DEF_HELPER_6(vfslide1up_vf_w, void, ptr, ptr, i64, ptr, env, i32)
+DEF_HELPER_6(vfslide1up_vf_d, void, ptr, ptr, i64, ptr, env, i32)
+DEF_HELPER_6(vfslide1down_vf_h, void, ptr, ptr, i64, ptr, env, i32)
+DEF_HELPER_6(vfslide1down_vf_w, void, ptr, ptr, i64, ptr, env, i32)
+DEF_HELPER_6(vfslide1down_vf_d, void, ptr, ptr, i64, ptr, env, i32)
+
 DEF_HELPER_6(vrgather_vv_b, void, ptr, ptr, ptr, ptr, env, i32)
 DEF_HELPER_6(vrgather_vv_h, void, ptr, ptr, ptr, ptr, env, i32)
 DEF_HELPER_6(vrgather_vv_w, void, ptr, ptr, ptr, ptr, env, i32)
diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
index b2ecc8dd4d1..d181db197ef 100644
--- a/target/riscv/insn32.decode
+++ b/target/riscv/insn32.decode
@@ -548,6 +548,8 @@ vfsgnjn_vv  001001 . . . 001 . 1010111 @r_vm
 vfsgnjn_vf  001001 . . . 101 . 1010111 @r_vm
 vfsgnjx_vv  001010 . . . 001 . 1010111 @r_vm
 vfsgnjx_vf  001010 . . . 101 . 1010111 @r_vm
+vfslide1up_vf   001110 . . . 101 . 1010111 @r_vm
+vfslide1down_vf 00 . . . 101 . 1010111 @r_vm
 vmfeq_vv011000 . . . 001 . 1010111 @r_vm
 vmfeq_vf011000 . . . 101 . 1010111 @r_vm
 vmfne_vv011100 . . . 001 . 1010111 @r_vm
diff --git a/target/riscv/insn_trans/trans_rvv.inc.c b/target/riscv/insn_trans/trans_rvv.inc.c
index 41789a2ba6f..c452292652c 100644
--- a/target/riscv/insn_trans/trans_rvv.inc.c
+++ b/target/riscv/insn_trans/trans_rvv.inc.c
@@ -3460,6 +3460,10 @@ GEN_OPIVX_TRANS(vslidedown_vx, slidedown_check)
 GEN_OPIVX_TRANS(vslide1down_vx, slidedown_check)
 GEN_OPIVI_TRANS(vslidedown_vi, IMM_ZX, vslidedown_vx, slidedown_check)
 
+/* Vector Floating-Point Slide Instructions */
+GEN_OPFVF_TRANS(vfslide1up_vf, slideup_check)
+GEN_OPFVF_TRANS(vfslide1down_vf, slidedown_check)
+
 /* Vector Register Gather Instruction */
 static bool vrgather_vv_check(DisasContext *s, arg_rmrr *a)
 {
diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c
index 2f1460b624d..09f0b03e2c5 100644
--- a/target/riscv/vector_helper.c
+++ b/target/riscv/vector_helper.c
@@ -4714,57 +4714,108 @@ GEN_VEXT_VSLIDEDOWN_VX(vslidedown_vx_h, uint16_t, H2)
 GEN_VEXT_VSLIDEDOWN_VX(vslidedown_vx_w, uint32_t, H4)
 GEN_VEXT_VSLIDEDOWN_VX(vslidedown_vx_d, uint64_t, H8)
 
-#define GEN_VEXT_VSLIDE1UP_VX(NAME, ETYPE, H) \
-void HELPER(NAME)(void *vd, void *v0, target_ulong s1, void *vs2, \
-  CPURISCVState *env, uint32_t desc)  \
-{ \
-uint32_t vm = vext_vm(desc);  \
-uint32_t vl = env->vl;\
-uint32_t i;   \
-  \
-for (i = 0; i < vl; i++) {\
-if (!vm && !vext_elem_mask(v0, i)) {  \
-continue; \
-} \
-if (i == 0) { \
-*((ETYPE *)vd + H(i)) = s1;   \
-} else {  \
-*((ETYPE *)vd + H(i)) = *((ETYPE *)vs2 + H(i - 1));   \
-} \
-} \
+#define GEN_VEXT_VSLIE1UP(ESZ, H)   \
+static void vslide1up_##ESZ(void *vd, void *v0, target_ulong s1, void *vs2, \
+ CPURISCVState *env, uint32_t desc) \
+{   \
+typedef uint##ESZ##_t ETYPE;\
+uint32_t vm 

[RFC v4 47/70] target/riscv: rvv-1.0: single-width saturating add and subtract instructions

2020-08-17 Thread frank . chang
From: Frank Chang 

Sign-extend vsaddu.vi immediate value.

Signed-off-by: Frank Chang 
---
 target/riscv/insn_trans/trans_rvv.inc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/riscv/insn_trans/trans_rvv.inc.c b/target/riscv/insn_trans/trans_rvv.inc.c
index 809280f4c5c..ef100254830 100644
--- a/target/riscv/insn_trans/trans_rvv.inc.c
+++ b/target/riscv/insn_trans/trans_rvv.inc.c
@@ -2341,7 +2341,7 @@ GEN_OPIVX_TRANS(vsaddu_vx,  opivx_check)
 GEN_OPIVX_TRANS(vsadd_vx,  opivx_check)
 GEN_OPIVX_TRANS(vssubu_vx,  opivx_check)
 GEN_OPIVX_TRANS(vssub_vx,  opivx_check)
-GEN_OPIVI_TRANS(vsaddu_vi, IMM_ZX, vsaddu_vx, opivx_check)
+GEN_OPIVI_TRANS(vsaddu_vi, IMM_SX, vsaddu_vx, opivx_check)
 GEN_OPIVI_TRANS(vsadd_vi, IMM_SX, vsadd_vx, opivx_check)
 
 /* Vector Single-Width Averaging Add and Subtract */
-- 
2.17.1
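
Note on the IMM_ZX to IMM_SX change above: as I read the RVV 1.0 draft, the 5-bit immediate of vsaddu.vi is sign-extended to SEW bits and only then treated as an unsigned operand. The sketch below illustrates that for SEW = 8. The helper names are hypothetical and the code is not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend the raw 5-bit immediate field (0..31) to a host integer. */
static int32_t sext_simm5(uint32_t field)
{
    assert(field < 32);
    return (int32_t)(field ^ 16u) - 16;
}

/*
 * Unsigned saturating add for one 8-bit element (SEW = 8).
 * The immediate is sign-extended, truncated to SEW bits, and then
 * used as an unsigned value, so field 0x1f acts as 0xff rather than 31.
 */
static uint8_t vsaddu_vi_e8_sketch(uint8_t elem, uint32_t imm_field)
{
    uint8_t imm = (uint8_t)sext_simm5(imm_field);
    unsigned sum = (unsigned)elem + imm;

    return sum > UINT8_MAX ? UINT8_MAX : (uint8_t)sum;
}
```

With zero extension, field 0x1f would add 31; with sign extension it adds 0xff, so any non-zero element saturates.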




Re: [RFC v3 26/71] target/riscv: rvv-1.0: update vext_max_elems() for load/store insns

2020-08-15 Thread Frank Chang
On Sat, Aug 15, 2020 at 2:36 AM Richard Henderson <
richard.hender...@linaro.org> wrote:

> On 8/13/20 7:48 PM, Frank Chang wrote:
> > esz is passed from e.g. GEN_VEXT_LD_STRIDE() macro:
> >
> >> #define GEN_VEXT_LD_STRIDE(NAME, ETYPE, LOAD_FN)\
> >> void HELPER(NAME)(void *vd, void * v0, target_ulong base,  \
> >>   target_ulong stride, CPURISCVState *env, \
> >>   uint32_t desc)   \
> >> {  \
> >> uint32_t vm = vext_vm(desc);   \
> >> vext_ldst_stride(vd, v0, base, stride, env, desc, vm, LOAD_FN, \
> >>  sizeof(ETYPE), GETPC(), MMU_DATA_LOAD);   \
> >> }
> >>
> >> GEN_VEXT_LD_STRIDE(vlse8_v,  int8_t,  lde_b)
> >
> > which is calculated by sizeof(ETYPE), so the results would be: 1, 2, 4,
> > 8.
> > and vext_max_elems() is called by e.g. vext_ldst_stride():
>
> Ah, yes.
>
> >> uint32_t max_elems = vext_max_elems(desc, esz);
> >
> > I can add another parameter to the macro and pass the hard-coded
> > log2(esz) number
> > if it's the better way instead of using ctzl().
> > Or if there's another approach to get the log2(esz) number more
> > elegantly?
>
> Using ctzl(sizeof(type)) in the GEN_VEXT_LD_STRIDE macro will work well.
> This will be constant folded by the compiler.
>
>
> r~
>

I checked the code again.
GEN_VEXT_LD_STRIDE() eventually calls vext_ldst_stride() and passes esz
as a parameter.
However, esz is used not only in vext_max_elems() but also in other
calculations, e.g.:

probe_pages(env, base + stride * i, nf * esz, ra, access_type);
and
target_ulong addr = base + stride * i + k * esz;

If we pass ctzl(sizeof(type)) in GEN_VEXT_LD_STRIDE(),
I would still have to compute (1 << esz) to get the correct element size in
the above calculations.
Would that cancel out the performance gain we get in vext_max_elems()?

Frank Chang
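
To make the trade-off in the question above concrete, here is a small sketch of passing log2(esz) instead of esz. It assumes GCC or Clang, where __builtin_ctzl() on a constant sizeof() is normally folded at compile time. LOG2_ESZ, elem_addr() and max_elems_sketch() are hypothetical names, not the helpers from the series.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* log2 of the element size; the argument is a compile-time constant. */
#define LOG2_ESZ(ETYPE) ((uint32_t)__builtin_ctzl(sizeof(ETYPE)))

/*
 * Mirrors `base + stride * i + k * esz`, with the multiply by esz
 * expressed as a left shift by log2_esz.
 */
static uint64_t elem_addr(uint64_t base, uint64_t stride,
                          uint32_t i, uint32_t k, uint32_t log2_esz)
{
    return base + stride * i + ((uint64_t)k << log2_esz);
}

/* Bytes in the register group divided by the element size, as a shift. */
static uint32_t max_elems_sketch(uint32_t group_bytes, uint32_t log2_esz)
{
    assert(log2_esz <= 3);
    return group_bytes >> log2_esz;
}
```

Under these assumptions, every use of esz becomes a shift by log2_esz, so recovering the byte size with (1 << log2_esz) or (k << log2_esz) should cost no more than the original multiply.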


Re: [RFC v3 26/71] target/riscv: rvv-1.0: update vext_max_elems() for load/store insns

2020-08-15 Thread Frank Chang
On Sat, Aug 15, 2020 at 2:36 AM Richard Henderson <
richard.hender...@linaro.org> wrote:

> On 8/13/20 7:48 PM, Frank Chang wrote:
> > esz is passed from e.g. GEN_VEXT_LD_STRIDE() macro:
> >
> >> #define GEN_VEXT_LD_STRIDE(NAME, ETYPE, LOAD_FN)\
> >> void HELPER(NAME)(void *vd, void * v0, target_ulong base,  \
> >>   target_ulong stride, CPURISCVState *env, \
> >>   uint32_t desc)   \
> >> {  \
> >> uint32_t vm = vext_vm(desc);   \
> >> vext_ldst_stride(vd, v0, base, stride, env, desc, vm, LOAD_FN, \
> >>  sizeof(ETYPE), GETPC(), MMU_DATA_LOAD);   \
> >> }
> >>
> >> GEN_VEXT_LD_STRIDE(vlse8_v,  int8_t,  lde_b)
> >
> > which is calculated by sizeof(ETYPE), so the results would be: 1, 2, 4,
> > 8.
> > and vext_max_elems() is called by e.g. vext_ldst_stride():
>
> Ah, yes.
>
> >> uint32_t max_elems = vext_max_elems(desc, esz);
> >
> > I can add another parameter to the macro and pass the hard-coded
> > log2(esz) number
> > if it's the better way instead of using ctzl().
> > Or if there's another approach to get the log2(esz) number more
> > elegantly?
>
> Using ctzl(sizeof(type)) in the GEN_VEXT_LD_STRIDE macro will work well.
> This will be constant folded by the compiler.
>
>
> r~
>

Nice, I hadn't thought of the compiler optimization.
I will fix the code and send out a new version of the patchset.
Thanks for the tips.

Frank Chang


[RFC v5 24/68] target/riscv: rvv-1.0: update vext_max_elems() for load/store insns

2020-09-29 Thread frank . chang
From: Frank Chang 

Signed-off-by: Frank Chang 
---
 target/riscv/insn_trans/trans_rvv.c.inc | 32 +++--
 target/riscv/vector_helper.c| 90 ++---
 2 files changed, 74 insertions(+), 48 deletions(-)

diff --git a/target/riscv/insn_trans/trans_rvv.c.inc b/target/riscv/insn_trans/trans_rvv.c.inc
index dd59e67ced..82b01586b2 100644
--- a/target/riscv/insn_trans/trans_rvv.c.inc
+++ b/target/riscv/insn_trans/trans_rvv.c.inc
@@ -603,6 +603,12 @@ static bool trans_##NAME(DisasContext *s, arg_##ARGTYPE * 
a) \
 return false;\
 }
 
+static uint8_t vext_get_emul(DisasContext *s, uint8_t eew)
+{
+int8_t emul = eew - s->sew + s->lmul;
+return emul < 0 ? 0 : emul;
+}
+
 /*
  *** unit stride load and store
  */
@@ -668,8 +674,14 @@ static bool ld_us_op(DisasContext *s, arg_r2nfvm *a, 
uint8_t eew)
 return false;
 }
 
+/*
+ * Vector load/store instructions have the EEW encoded
+ * directly in the instructions. The maximum vector size is
+ * calculated with EMUL rather than LMUL.
+ */
+uint8_t emul = vext_get_emul(s, eew);
 data = FIELD_DP32(data, VDATA, VM, a->vm);
-data = FIELD_DP32(data, VDATA, LMUL, s->lmul);
+data = FIELD_DP32(data, VDATA, LMUL, emul);
 data = FIELD_DP32(data, VDATA, NF, a->nf);
 return ldst_us_trans(a->rd, a->rs1, data, fn, s, false);
 }
@@ -704,8 +716,9 @@ static bool st_us_op(DisasContext *s, arg_r2nfvm *a, 
uint8_t eew)
 return false;
 }
 
+uint8_t emul = vext_get_emul(s, eew);
 data = FIELD_DP32(data, VDATA, VM, a->vm);
-data = FIELD_DP32(data, VDATA, LMUL, s->lmul);
+data = FIELD_DP32(data, VDATA, LMUL, emul);
 data = FIELD_DP32(data, VDATA, NF, a->nf);
 return ldst_us_trans(a->rd, a->rs1, data, fn, s, true);
 }
@@ -778,8 +791,9 @@ static bool ld_stride_op(DisasContext *s, arg_rnfvm *a, 
uint8_t eew)
 return false;
 }
 
+uint8_t emul = vext_get_emul(s, eew);
 data = FIELD_DP32(data, VDATA, VM, a->vm);
-data = FIELD_DP32(data, VDATA, LMUL, s->lmul);
+data = FIELD_DP32(data, VDATA, LMUL, emul);
 data = FIELD_DP32(data, VDATA, NF, a->nf);
 return ldst_stride_trans(a->rd, a->rs1, a->rs2, data, fn, s, false);
 }
@@ -806,8 +820,9 @@ static bool st_stride_op(DisasContext *s, arg_rnfvm *a, 
uint8_t eew)
 gen_helper_vsse32_v,  gen_helper_vsse64_v
 };
 
+uint8_t emul = vext_get_emul(s, eew);
 data = FIELD_DP32(data, VDATA, VM, a->vm);
-data = FIELD_DP32(data, VDATA, LMUL, s->lmul);
+data = FIELD_DP32(data, VDATA, LMUL, emul);
 data = FIELD_DP32(data, VDATA, NF, a->nf);
 fn = fns[eew];
 if (fn == NULL) {
@@ -904,8 +919,9 @@ static bool ld_index_op(DisasContext *s, arg_rnfvm *a, 
uint8_t eew)
 
 fn = fns[eew][s->sew];
 
+uint8_t emul = vext_get_emul(s, s->sew);
 data = FIELD_DP32(data, VDATA, VM, a->vm);
-data = FIELD_DP32(data, VDATA, LMUL, s->lmul);
+data = FIELD_DP32(data, VDATA, LMUL, emul);
 data = FIELD_DP32(data, VDATA, NF, a->nf);
 return ldst_index_trans(a->rd, a->rs1, a->rs2, data, fn, s, false);
 }
@@ -955,8 +971,9 @@ static bool st_index_op(DisasContext *s, arg_rnfvm *a, 
uint8_t eew)
 
 fn = fns[eew][s->sew];
 
+uint8_t emul = vext_get_emul(s, s->sew);
 data = FIELD_DP32(data, VDATA, VM, a->vm);
-data = FIELD_DP32(data, VDATA, LMUL, s->lmul);
+data = FIELD_DP32(data, VDATA, LMUL, emul);
 data = FIELD_DP32(data, VDATA, NF, a->nf);
 return ldst_index_trans(a->rd, a->rs1, a->rs2, data, fn, s, true);
 }
@@ -1020,8 +1037,9 @@ static bool ldff_op(DisasContext *s, arg_r2nfvm *a, 
uint8_t eew)
 return false;
 }
 
+uint8_t emul = vext_get_emul(s, eew);
 data = FIELD_DP32(data, VDATA, VM, a->vm);
-data = FIELD_DP32(data, VDATA, LMUL, s->lmul);
+data = FIELD_DP32(data, VDATA, LMUL, emul);
 data = FIELD_DP32(data, VDATA, NF, a->nf);
 return ldff_trans(a->rd, a->rs1, data, fn, s);
 }
diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c
index 57564c5c0c..8556ab3b0d 100644
--- a/target/riscv/vector_helper.c
+++ b/target/riscv/vector_helper.c
@@ -17,6 +17,7 @@
  */
 
 #include "qemu/osdep.h"
+#include "qemu/host-utils.h"
 #include "cpu.h"
 #include "exec/memop.h"
 #include "exec/exec-all.h"
@@ -121,14 +122,21 @@ static uint32_t vext_wd(uint32_t desc)
 }
 
 /*
- * Get vector group length in bytes. Its range is [64, 2048].
+ * Get the maximum number of elements can be operated.
  *
- * As simd_desc support at most 256, the max vlen is 512 bits.
- * So vlen in bytes is encoded as maxsz.
+ * esz: log2 of element size in bytes.
  */
-static inline uint32_t vext_maxsz(uint32_t desc)
+static inline uint32_t vext_max_elems(uint32_t desc, uint32_t esz)
 {
-   
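
The EMUL computation introduced by vext_get_emul() in the diff above can be
read as log2 arithmetic: EMUL = (EEW / SEW) * LMUL becomes eew - sew + lmul
when all three are log2 encodings. The following is a stand-alone sketch of
that helper with the DisasContext fields passed as plain arguments. It assumes
that eew and sew use the same 0..3 encoding (as the fns[eew][s->sew] tables
suggest) and that lmul is a signed log2 value; emul_log2_sketch is a
hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Stand-alone mirror of vext_get_emul() from the diff above.
 * All inputs are log2 encodings (assumed):
 *   eew  - element width encoded in the instruction (0 = 8-bit .. 3 = 64-bit)
 *   sew  - currently selected element width, same encoding
 *   lmul - register group multiplier, negative for fractional LMUL
 */
static uint8_t emul_log2_sketch(uint8_t eew, uint8_t sew, int8_t lmul)
{
    int8_t emul = (int8_t)(eew - sew + lmul);

    /* Like the patch, clamp fractional results to 0 (i.e. EMUL = 1). */
    return emul < 0 ? 0 : (uint8_t)emul;
}
```

For example, a 64-bit EEW with SEW = 32 and LMUL = 1 gives 3 - 2 + 0 = 1,
that is EMUL = 2, while an 8-bit EEW with SEW = 32 and LMUL = 2 gives -1 and
is clamped to 0.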

[RFC v5 39/68] target/riscv: rvv-1.0: integer extension instructions

2020-09-29 Thread frank . chang
From: Frank Chang 

Add the following instructions:

* vzext.vf2
* vzext.vf4
* vzext.vf8
* vsext.vf2
* vsext.vf4
* vsext.vf8

Signed-off-by: Frank Chang 
Reviewed-by: Richard Henderson 
---
 target/riscv/helper.h   | 14 +
 target/riscv/insn32.decode  |  8 +++
 target/riscv/insn_trans/trans_rvv.c.inc | 80 +
 target/riscv/vector_helper.c| 31 ++
 4 files changed, 133 insertions(+)

diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index 287500db8b..fd59b07af3 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -1122,3 +1122,17 @@ DEF_HELPER_6(vcompress_vm_b, void, ptr, ptr, ptr, ptr, 
env, i32)
 DEF_HELPER_6(vcompress_vm_h, void, ptr, ptr, ptr, ptr, env, i32)
 DEF_HELPER_6(vcompress_vm_w, void, ptr, ptr, ptr, ptr, env, i32)
 DEF_HELPER_6(vcompress_vm_d, void, ptr, ptr, ptr, ptr, env, i32)
+
+DEF_HELPER_5(vzext_vf2_h, void, ptr, ptr, ptr, env, i32)
+DEF_HELPER_5(vzext_vf2_w, void, ptr, ptr, ptr, env, i32)
+DEF_HELPER_5(vzext_vf2_d, void, ptr, ptr, ptr, env, i32)
+DEF_HELPER_5(vzext_vf4_w, void, ptr, ptr, ptr, env, i32)
+DEF_HELPER_5(vzext_vf4_d, void, ptr, ptr, ptr, env, i32)
+DEF_HELPER_5(vzext_vf8_d, void, ptr, ptr, ptr, env, i32)
+
+DEF_HELPER_5(vsext_vf2_h, void, ptr, ptr, ptr, env, i32)
+DEF_HELPER_5(vsext_vf2_w, void, ptr, ptr, ptr, env, i32)
+DEF_HELPER_5(vsext_vf2_d, void, ptr, ptr, ptr, env, i32)
+DEF_HELPER_5(vsext_vf4_w, void, ptr, ptr, ptr, env, i32)
+DEF_HELPER_5(vsext_vf4_d, void, ptr, ptr, ptr, env, i32)
+DEF_HELPER_5(vsext_vf8_d, void, ptr, ptr, ptr, env, i32)
diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
index 2280627553..158ef6e49f 100644
--- a/target/riscv/insn32.decode
+++ b/target/riscv/insn32.decode
@@ -630,5 +630,13 @@ vmv2r_v 100111 1 . 1 011 . 1010111 
@r2rd
 vmv4r_v 100111 1 . 00011 011 . 1010111 @r2rd
 vmv8r_v 100111 1 . 00111 011 . 1010111 @r2rd
 
+# Vector Integer Extension
+vzext_vf2   010010 . . 00110 010 . 1010111 @r2_vm
+vzext_vf4   010010 . . 00100 010 . 1010111 @r2_vm
+vzext_vf8   010010 . . 00010 010 . 1010111 @r2_vm
+vsext_vf2   010010 . . 00111 010 . 1010111 @r2_vm
+vsext_vf4   010010 . . 00101 010 . 1010111 @r2_vm
+vsext_vf8   010010 . . 00011 010 . 1010111 @r2_vm
+
 vsetvli 0 ... . 111 . 1010111  @r2_zimm
 vsetvl  100 . . 111 . 1010111  @r
diff --git a/target/riscv/insn_trans/trans_rvv.c.inc 
b/target/riscv/insn_trans/trans_rvv.c.inc
index dec3940663..12250c776d 100644
--- a/target/riscv/insn_trans/trans_rvv.c.inc
+++ b/target/riscv/insn_trans/trans_rvv.c.inc
@@ -3522,3 +3522,83 @@ GEN_VMV_WHOLE_TRANS(vmv1r_v, 1)
 GEN_VMV_WHOLE_TRANS(vmv2r_v, 2)
 GEN_VMV_WHOLE_TRANS(vmv4r_v, 4)
 GEN_VMV_WHOLE_TRANS(vmv8r_v, 8)
+
+static bool int_ext_check(DisasContext *s, arg_rmr *a, uint8_t div)
+{
+uint8_t from = (s->sew + 3) - div;
+bool ret = require_rvv(s) &&
+(from >= 3 && from <= 8) &&
+(a->rd != a->rs2) &&
+require_align(a->rd, s->lmul) &&
+require_align(a->rs2, s->lmul - div) &&
+require_vm(a->vm, a->rd) &&
+require_noover(a->rd, s->lmul, a->rs2, s->lmul - div);
+return ret;
+}
+
+static bool int_ext_op(DisasContext *s, arg_rmr *a, uint8_t seq)
+{
+uint32_t data = 0;
+gen_helper_gvec_3_ptr *fn;
+TCGLabel *over = gen_new_label();
+tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_vl, 0, over);
+
+static gen_helper_gvec_3_ptr * const fns[6][4] = {
+{
+NULL, gen_helper_vzext_vf2_h,
+gen_helper_vzext_vf2_w, gen_helper_vzext_vf2_d
+},
+{
+NULL, NULL,
+gen_helper_vzext_vf4_w, gen_helper_vzext_vf4_d,
+},
+{
+NULL, NULL,
+NULL, gen_helper_vzext_vf8_d
+},
+{
+NULL, gen_helper_vsext_vf2_h,
+gen_helper_vsext_vf2_w, gen_helper_vsext_vf2_d
+},
+{
+NULL, NULL,
+gen_helper_vsext_vf4_w, gen_helper_vsext_vf4_d,
+},
+{
+NULL, NULL,
+NULL, gen_helper_vsext_vf8_d
+}
+};
+
+fn = fns[seq][s->sew];
+if (fn == NULL) {
+return false;
+}
+
+data = FIELD_DP32(data, VDATA, VM, a->vm);
+
+tcg_gen_gvec_3_ptr(vreg_ofs(s, a->rd), vreg_ofs(s, 0),
+   vreg_ofs(s, a->rs2), cpu_env, 0,
+   s->vlen / 8, data, fn);
+
+mark_vs_dirty(s);
+gen_set_label(over);
+return true;
+}
+
+/* Vector Integer Extension */
+#define GEN_INT_EXT_TRANS(NAME, DIV, SEQ) \
+static bool trans_##NAME(DisasContext *s, arg_rmr *a) \
+{
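
As an illustration of what the new instructions compute, here is a minimal
sketch of vzext.vf2 and vsext.vf2 for a 16-bit destination element width. It
is not the helper code from this patch: masking, vstart and host-endian
element ordering are deliberately ignored, and the function names are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* vzext.vf2 with SEW=16: each 8-bit source element is zero-extended. */
static void zext_vf2_h_sketch(uint16_t *vd, const uint8_t *vs2, size_t vl)
{
    for (size_t i = 0; i < vl; i++) {
        vd[i] = vs2[i];
    }
}

/* vsext.vf2 with SEW=16: each 8-bit source element is sign-extended. */
static void sext_vf2_h_sketch(int16_t *vd, const int8_t *vs2, size_t vl)
{
    for (size_t i = 0; i < vl; i++) {
        vd[i] = vs2[i];
    }
}
```

The vf4 and vf8 forms follow the same pattern with a source element that is a
quarter or an eighth of SEW, which is why int_ext_check() rejects combinations
where the source width would drop below 8 bits.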

[RFC v5 27/68] target/riscv: rvv-1.0: floating-point classify instructions

2020-09-29 Thread frank . chang
From: Frank Chang 

Signed-off-by: Frank Chang 
Reviewed-by: Richard Henderson 
---
 target/riscv/insn32.decode | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
index 6c95a3460a..958914458d 100644
--- a/target/riscv/insn32.decode
+++ b/target/riscv/insn32.decode
@@ -561,7 +561,7 @@ vmfgt_vf011101 . . . 101 . 1010111 @r_vm
 vmfge_vf01 . . . 101 . 1010111 @r_vm
 vmford_vv   011010 . . . 001 . 1010111 @r_vm
 vmford_vf   011010 . . . 101 . 1010111 @r_vm
-vfclass_v   100011 . . 1 001 . 1010111 @r2_vm
+vfclass_v   010011 . . 1 001 . 1010111 @r2_vm
 vfmerge_vfm 010111 0 . . 101 . 1010111 @r_vm_0
 vfmv_v_f010111 1 0 . 101 . 1010111 @r2
 vfcvt_xu_f_v100010 . . 0 001 . 1010111 @r2_vm
-- 
2.17.1




[RFC v5 55/68] target/riscv: rvv-1.0: remove widening saturating scaled multiply-add

2020-09-29 Thread frank . chang
From: Frank Chang 

Signed-off-by: Frank Chang 
Reviewed-by: Richard Henderson 
---
 target/riscv/helper.h   |  22 ---
 target/riscv/insn32.decode  |   7 -
 target/riscv/insn_trans/trans_rvv.c.inc |   9 --
 target/riscv/vector_helper.c| 205 
 4 files changed, 243 deletions(-)

diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index c2d6be790d..24d575162d 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -736,28 +736,6 @@ DEF_HELPER_6(vsmul_vx_h, void, ptr, ptr, tl, ptr, env, i32)
 DEF_HELPER_6(vsmul_vx_w, void, ptr, ptr, tl, ptr, env, i32)
 DEF_HELPER_6(vsmul_vx_d, void, ptr, ptr, tl, ptr, env, i32)
 
-DEF_HELPER_6(vwsmaccu_vv_b, void, ptr, ptr, ptr, ptr, env, i32)
-DEF_HELPER_6(vwsmaccu_vv_h, void, ptr, ptr, ptr, ptr, env, i32)
-DEF_HELPER_6(vwsmaccu_vv_w, void, ptr, ptr, ptr, ptr, env, i32)
-DEF_HELPER_6(vwsmacc_vv_b, void, ptr, ptr, ptr, ptr, env, i32)
-DEF_HELPER_6(vwsmacc_vv_h, void, ptr, ptr, ptr, ptr, env, i32)
-DEF_HELPER_6(vwsmacc_vv_w, void, ptr, ptr, ptr, ptr, env, i32)
-DEF_HELPER_6(vwsmaccsu_vv_b, void, ptr, ptr, ptr, ptr, env, i32)
-DEF_HELPER_6(vwsmaccsu_vv_h, void, ptr, ptr, ptr, ptr, env, i32)
-DEF_HELPER_6(vwsmaccsu_vv_w, void, ptr, ptr, ptr, ptr, env, i32)
-DEF_HELPER_6(vwsmaccu_vx_b, void, ptr, ptr, tl, ptr, env, i32)
-DEF_HELPER_6(vwsmaccu_vx_h, void, ptr, ptr, tl, ptr, env, i32)
-DEF_HELPER_6(vwsmaccu_vx_w, void, ptr, ptr, tl, ptr, env, i32)
-DEF_HELPER_6(vwsmacc_vx_b, void, ptr, ptr, tl, ptr, env, i32)
-DEF_HELPER_6(vwsmacc_vx_h, void, ptr, ptr, tl, ptr, env, i32)
-DEF_HELPER_6(vwsmacc_vx_w, void, ptr, ptr, tl, ptr, env, i32)
-DEF_HELPER_6(vwsmaccsu_vx_b, void, ptr, ptr, tl, ptr, env, i32)
-DEF_HELPER_6(vwsmaccsu_vx_h, void, ptr, ptr, tl, ptr, env, i32)
-DEF_HELPER_6(vwsmaccsu_vx_w, void, ptr, ptr, tl, ptr, env, i32)
-DEF_HELPER_6(vwsmaccus_vx_b, void, ptr, ptr, tl, ptr, env, i32)
-DEF_HELPER_6(vwsmaccus_vx_h, void, ptr, ptr, tl, ptr, env, i32)
-DEF_HELPER_6(vwsmaccus_vx_w, void, ptr, ptr, tl, ptr, env, i32)
-
 DEF_HELPER_6(vssrl_vv_b, void, ptr, ptr, ptr, ptr, env, i32)
 DEF_HELPER_6(vssrl_vv_h, void, ptr, ptr, ptr, ptr, env, i32)
 DEF_HELPER_6(vssrl_vv_w, void, ptr, ptr, ptr, ptr, env, i32)
diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
index 979b0317e8..d6468750a1 100644
--- a/target/riscv/insn32.decode
+++ b/target/riscv/insn32.decode
@@ -478,13 +478,6 @@ vasubu_vv   001010 . . . 010 . 1010111 
@r_vm
 vasubu_vx   001010 . . . 110 . 1010111 @r_vm
 vsmul_vv100111 . . . 000 . 1010111 @r_vm
 vsmul_vx100111 . . . 100 . 1010111 @r_vm
-vwsmaccu_vv 00 . . . 000 . 1010111 @r_vm
-vwsmaccu_vx 00 . . . 100 . 1010111 @r_vm
-vwsmacc_vv  01 . . . 000 . 1010111 @r_vm
-vwsmacc_vx  01 . . . 100 . 1010111 @r_vm
-vwsmaccsu_vv10 . . . 000 . 1010111 @r_vm
-vwsmaccsu_vx10 . . . 100 . 1010111 @r_vm
-vwsmaccus_vx11 . . . 100 . 1010111 @r_vm
 vssrl_vv101010 . . . 000 . 1010111 @r_vm
 vssrl_vx101010 . . . 100 . 1010111 @r_vm
 vssrl_vi101010 . . . 011 . 1010111 @r_vm
diff --git a/target/riscv/insn_trans/trans_rvv.c.inc 
b/target/riscv/insn_trans/trans_rvv.c.inc
index 6df96f4597..20781ab5d1 100644
--- a/target/riscv/insn_trans/trans_rvv.c.inc
+++ b/target/riscv/insn_trans/trans_rvv.c.inc
@@ -2266,15 +2266,6 @@ GEN_OPIVX_TRANS(vasubu_vx,  opivx_check)
 GEN_OPIVV_TRANS(vsmul_vv, opivv_check)
 GEN_OPIVX_TRANS(vsmul_vx,  opivx_check)
 
-/* Vector Widening Saturating Scaled Multiply-Add */
-GEN_OPIVV_WIDEN_TRANS(vwsmaccu_vv, opivv_widen_check)
-GEN_OPIVV_WIDEN_TRANS(vwsmacc_vv, opivv_widen_check)
-GEN_OPIVV_WIDEN_TRANS(vwsmaccsu_vv, opivv_widen_check)
-GEN_OPIVX_WIDEN_TRANS(vwsmaccu_vx)
-GEN_OPIVX_WIDEN_TRANS(vwsmacc_vx)
-GEN_OPIVX_WIDEN_TRANS(vwsmaccsu_vx)
-GEN_OPIVX_WIDEN_TRANS(vwsmaccus_vx)
-
 /* Vector Single-Width Scaling Shift Instructions */
 GEN_OPIVV_TRANS(vssrl_vv, opivv_check)
 GEN_OPIVV_TRANS(vssra_vv, opivv_check)
diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c
index e6931466d4..549a476490 100644
--- a/target/riscv/vector_helper.c
+++ b/target/riscv/vector_helper.c
@@ -2703,211 +2703,6 @@ GEN_VEXT_VX_RM(vsmul_vx_h, 2, 2)
 GEN_VEXT_VX_RM(vsmul_vx_w, 4, 4)
 GEN_VEXT_VX_RM(vsmul_vx_d, 8, 8)
 
-/* Vector Widening Saturating Scaled Multiply-Add */
-static inline uint16_t
-vwsmaccu8(CPURISCVState *env, int vxrm, uint8_t a, uint8_t b,
-  uint16_t c)
-{
-uint8_t round;
-uint16_t res = (uint16_t)a * b;
-
-round = get_round(vxrm, res, 4);
-res   = (res >> 4) + round;
-return saddu16(env, vxrm, c, res);
-}
-
-static inline uint32_t
-vwsmaccu16(CPURISCVState *env, int vxrm, uint16_t a, uint16_t b,
-   uint32_t c)
-{
-uint8_t round;
-uint32_t res = (uint32_t

[RFC v5 44/68] target/riscv: rvv-1.0: widening integer multiply-add instructions

2020-09-29 Thread frank . chang
From: Frank Chang 

Signed-off-by: Frank Chang 
Reviewed-by: Richard Henderson 
---
 target/riscv/insn32.decode | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
index 4517f8ed54..c75d728fc5 100644
--- a/target/riscv/insn32.decode
+++ b/target/riscv/insn32.decode
@@ -449,9 +449,9 @@ vwmaccu_vv  00 . . . 010 . 1010111 @r_vm
 vwmaccu_vx  00 . . . 110 . 1010111 @r_vm
 vwmacc_vv   01 . . . 010 . 1010111 @r_vm
 vwmacc_vx   01 . . . 110 . 1010111 @r_vm
-vwmaccsu_vv 10 . . . 010 . 1010111 @r_vm
-vwmaccsu_vx 10 . . . 110 . 1010111 @r_vm
-vwmaccus_vx 11 . . . 110 . 1010111 @r_vm
+vwmaccsu_vv 11 . . . 010 . 1010111 @r_vm
+vwmaccsu_vx 11 . . . 110 . 1010111 @r_vm
+vwmaccus_vx 10 . . . 110 . 1010111 @r_vm
 vmv_v_v 010111 1 0 . 000 . 1010111 @r2
 vmv_v_x 010111 1 0 . 100 . 1010111 @r2
 vmv_v_i 010111 1 0 . 011 . 1010111 @r2
-- 
2.17.1




[RFC v5 60/68] target/riscv: rvv-1.0: floating-point/integer type-convert instructions

2020-09-29 Thread frank . chang
From: Frank Chang 

Add the following instructions:

* vfcvt.rtz.xu.f.v
* vfcvt.rtz.x.f.v

Also adjust GEN_OPFV_TRANS() to accept multiple floating-point rounding
modes.

Signed-off-by: Frank Chang 
---
 target/riscv/insn32.decode  | 11 ++--
 target/riscv/insn_trans/trans_rvv.c.inc | 83 -
 2 files changed, 60 insertions(+), 34 deletions(-)

diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
index c3d9ef4fe1..88d8f0eb0b 100644
--- a/target/riscv/insn32.decode
+++ b/target/riscv/insn32.decode
@@ -560,10 +560,13 @@ vmfge_vf01 . . . 101 . 1010111 
@r_vm
 vfclass_v   010011 . . 1 001 . 1010111 @r2_vm
 vfmerge_vfm 010111 0 . . 101 . 1010111 @r_vm_0
 vfmv_v_f010111 1 0 . 101 . 1010111 @r2
-vfcvt_xu_f_v100010 . . 0 001 . 1010111 @r2_vm
-vfcvt_x_f_v 100010 . . 1 001 . 1010111 @r2_vm
-vfcvt_f_xu_v100010 . . 00010 001 . 1010111 @r2_vm
-vfcvt_f_x_v 100010 . . 00011 001 . 1010111 @r2_vm
+
+vfcvt_xu_f_v   010010 . . 0 001 . 1010111 @r2_vm
+vfcvt_x_f_v010010 . . 1 001 . 1010111 @r2_vm
+vfcvt_f_xu_v   010010 . . 00010 001 . 1010111 @r2_vm
+vfcvt_f_x_v010010 . . 00011 001 . 1010111 @r2_vm
+vfcvt_rtz_xu_f_v   010010 . . 00110 001 . 1010111 @r2_vm
+vfcvt_rtz_x_f_v010010 . . 00111 001 . 1010111 @r2_vm
 vfwcvt_xu_f_v   100010 . . 01000 001 . 1010111 @r2_vm
 vfwcvt_x_f_v100010 . . 01001 001 . 1010111 @r2_vm
 vfwcvt_f_xu_v   100010 . . 01010 001 . 1010111 @r2_vm
diff --git a/target/riscv/insn_trans/trans_rvv.c.inc 
b/target/riscv/insn_trans/trans_rvv.c.inc
index 01ef472413..452447e5ed 100644
--- a/target/riscv/insn_trans/trans_rvv.c.inc
+++ b/target/riscv/insn_trans/trans_rvv.c.inc
@@ -2618,33 +2618,42 @@ static bool opfv_check(DisasContext *s, arg_rmr *a)
vext_check_ss(s, a->rd, a->rs2, a->vm);
 }
 
-#define GEN_OPFV_TRANS(NAME, CHECK)\
-static bool trans_##NAME(DisasContext *s, arg_rmr *a)  \
-{  \
-if (CHECK(s, a)) { \
-uint32_t data = 0; \
-static gen_helper_gvec_3_ptr * const fns[3] = {\
-gen_helper_##NAME##_h, \
-gen_helper_##NAME##_w, \
-gen_helper_##NAME##_d, \
-}; \
-TCGLabel *over = gen_new_label();  \
-gen_set_rm(s, RISCV_FRM_DYN);  \
-tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_vl, 0, over);  \
-   \
-data = FIELD_DP32(data, VDATA, VM, a->vm); \
-data = FIELD_DP32(data, VDATA, LMUL, s->lmul); \
-tcg_gen_gvec_3_ptr(vreg_ofs(s, a->rd), vreg_ofs(s, 0), \
-   vreg_ofs(s, a->rs2), cpu_env, 0,\
-   s->vlen / 8, data, fns[s->sew - 1]);\
-mark_vs_dirty(s);  \
-gen_set_label(over);   \
-return true;   \
-}  \
-return false;  \
+static bool do_opfv(DisasContext *s, arg_rmr *a,
+gen_helper_gvec_3_ptr *fn,
+bool (*checkfn)(DisasContext *, arg_rmr *),
+int rm)
+{
+if (checkfn(s, a)) {
+uint32_t data = 0;
+gen_set_rm(s, RISCV_FRM_DYN);
+TCGLabel *over = gen_new_label();
+gen_set_rm(s, rm);
+tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_vl, 0, over);
+
+data = FIELD_DP32(data, VDATA, VM, a->vm);
+data = FIELD_DP32(data, VDATA, LMUL, s->lmul);
+tcg_gen_gvec_3_ptr(vreg_ofs(s, a->rd), vreg_ofs(s, 0),
+vreg_ofs(s, a->rs2), cpu_env, 0,
+s->vlen / 8, data, fn);
+mark_vs_dirty(s);
+gen_set_label(over);
+return true;
+}
+return false;
+}
+
+#define GEN_OPFV_TRANS(NAME, CHECK, FRM)   \
+static bool trans_##NAME(DisasContext *s, arg_rmr *a)  \
+{  \
+static gen_helper_gvec_3_ptr * const fns[3] = {\
+gen_helper_##NAME##_h, \
+gen_helper_##NAME##_w, \
+gen_helper_##NAME##_d  \
+}; 
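
The refactoring in this patch moves the body of the macro into a shared
function that takes the rounding mode as a parameter, so each generated
trans_ function only selects a helper and a mode. The sketch below shows the
same pattern outside QEMU. Everything in it is hypothetical: the enum, the
function names, and the use of round-to-nearest-even as a stand-in for the
dynamic rounding mode. The conversion is only meaningful for finite inputs
whose magnitude fits in int32_t.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical rounding modes; QEMU uses its own RISCV_FRM_* constants. */
enum rm_sketch { RM_RNE_SKETCH, RM_RTZ_SKETCH };

/* Shared worker: the rounding mode is an argument, not baked into a macro. */
static int32_t cvt_x_f_sketch(float f, enum rm_sketch rm)
{
    int32_t t = (int32_t)f;            /* a C cast truncates toward zero */
    float frac = f - (float)t;
    int32_t step = f < 0 ? -1 : 1;

    if (rm == RM_RTZ_SKETCH) {
        return t;
    }
    if (frac < 0) {
        frac = -frac;
    }
    /* Round to nearest, ties to even. */
    if (frac > 0.5f || (frac == 0.5f && (t & 1))) {
        return t + step;
    }
    return t;
}

/* Thin generator macro: one wrapper per instruction, each with a fixed mode. */
#define GEN_CVT_SKETCH(NAME, RM)      \
static int32_t NAME(float f)          \
{                                     \
    return cvt_x_f_sketch(f, RM);     \
}

GEN_CVT_SKETCH(cvt_x_f_rne_sketch, RM_RNE_SKETCH)
GEN_CVT_SKETCH(cvt_rtz_x_f_sketch, RM_RTZ_SKETCH)
```

With this layout, adding another rounding variant is a one-line macro
invocation rather than a copy of the whole function body.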

[RFC v5 00/68] support vector extension v1.0

2020-09-29 Thread frank . chang
From: Frank Chang 

This patchset implements the vector extension v1.0 for RISC-V on QEMU.

This patchset is sent as RFC because RVV v1.0 is still in draft state.
The v2 patchset targeted RVV v0.9; the series was bumped to RVV v1.0 in v3.

The port is available here:
https://github.com/sifive/qemu/tree/rvv-1.0-upstream-v5

Set the cpu argument vext_spec to v1.0 (i.e. vext_spec=v1.0)
to run with RVV v1.0 instructions.

Note: This patchset depends on the two other patchsets listed in the
  Based-on section below, so it may not build unless those two
  patchsets are applied.

Changelog:

v5
  * refactor RVV v1.0 check functions.
(Thanks to Richard Henderson's bitwise tricks.)
  * relax RV_VLEN_MAX to 1024-bits.
  * implement vstart CSR's behaviors.
  * trigger illegal instruction exception if frm is not valid for
vector floating-point instructions.
  * rebase on riscv-to-apply.next.

v4
  * remove explicit float flmul variable in DisasContext.
  * replace floating-point calculations with shift operations to
improve performance.
  * relax RV_VLEN_MAX to 512-bits.

v3
  * apply nan-box helpers from Richard Henderson.
  * remove fp16 api changes as they are sent independently in another
    patchset by Chih-Min Chao.
  * remove all tail element clearing functions, as tail elements can
    remain unchanged whether VTA is set to undisturbed or agnostic.
  * add fp16 nan-box check generator function.
  * add floating-point rounding mode enum.
  * replace flmul arithmetic with shifts to avoid floating-point
conversions.
  * add Zvqmac extension.
  * replace gdbstub vector register xml files with dynamic generator.
  * bumped to RVV v1.0.
  * RVV v1.0 related changes:
* add vlre.v and vsr.v vector whole register
  load/store instructions
* add vrgatherei16 instruction.
* rearranged bits in vtype to make vlmul bits into a contiguous
  field.

v2
  * drop v0.7.1 support.
  * replace invisible return check macros with functions.
  * move mark_vs_dirty() to translators.
  * add SSTATUS_VS flag for s-mode.
  * nan-box scalar fp register for floating-point operations.
  * add gdbstub files for vector registers to allow system-mode
debugging with GDB.
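
To illustrate the v4 and v3 changelog entries about replacing floating-point
LMUL arithmetic with shifts: when LMUL is kept as a signed log2 value,
VLMAX = (VLEN / SEW) * LMUL needs only a single shift. This is a sketch of the
idea rather than the code from the series; vlmax_sketch and its parameter
names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * VLMAX = (VLEN / SEW) * LMUL, computed with shifts only.
 *   vlen_bits - vector register length in bits (a power of two)
 *   vsew      - log2(SEW / 8): 0 for 8-bit .. 3 for 64-bit elements
 *   lmul_log2 - log2(LMUL), negative for fractional LMUL
 */
static uint32_t vlmax_sketch(uint32_t vlen_bits, uint8_t vsew,
                             int8_t lmul_log2)
{
    int shift = lmul_log2 - (vsew + 3);

    return shift >= 0 ? vlen_bits << shift : vlen_bits >> -shift;
}
```

For example, VLEN = 128, SEW = 32 and LMUL = 1/2 gives a shift of -6 and a
VLMAX of 2, with no floating-point value involved.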

Based-on: <20200909001647.532249-1-richard.hender...@linaro.org/>
Based-on: <1596102747-20226-1-git-send-email-chihmin.c...@sifive.com/>

Frank Chang (62):
  target/riscv: drop vector 0.7.1 and add 1.0 support
  target/riscv: Use FIELD_EX32() to extract wd field
  target/riscv: rvv-1.0: introduce writable misa.v field
  target/riscv: rvv-1.0: add translation-time vector context status
  target/riscv: rvv-1.0: remove rvv related codes from fcsr registers
  target/riscv: rvv-1.0: check MSTATUS_VS when accessing vector csr
registers
  target/riscv: rvv-1.0: remove MLEN calculations
  target/riscv: rvv-1.0: add fractional LMUL
  target/riscv: rvv-1.0: add VMA and VTA
  target/riscv: rvv-1.0: update check functions
  target/riscv: introduce more imm value modes in translator functions
  target/riscv: rvv:1.0: add translation-time nan-box helper function
  target/riscv: rvv-1.0: configure instructions
  target/riscv: rvv-1.0: stride load and store instructions
  target/riscv: rvv-1.0: index load and store instructions
  target/riscv: rvv-1.0: fix address index overflow bug of indexed
load/store insns
  target/riscv: rvv-1.0: fault-only-first unit stride load
  target/riscv: rvv-1.0: amo operations
  target/riscv: rvv-1.0: load/store whole register instructions
  target/riscv: rvv-1.0: update vext_max_elems() for load/store insns
  target/riscv: rvv-1.0: take fractional LMUL into vector max elements
calculation
  target/riscv: rvv-1.0: floating-point square-root instruction
  target/riscv: rvv-1.0: floating-point classify instructions
  target/riscv: rvv-1.0: mask population count instruction
  target/riscv: rvv-1.0: find-first-set mask bit instruction
  target/riscv: rvv-1.0: set-X-first mask bit instructions
  target/riscv: rvv-1.0: iota instruction
  target/riscv: rvv-1.0: element index instruction
  target/riscv: rvv-1.0: allow load element with sign-extended
  target/riscv: rvv-1.0: register gather instructions
  target/riscv: rvv-1.0: integer scalar move instructions
  target/riscv: rvv-1.0: floating-point move instruction
  target/riscv: rvv-1.0: floating-point scalar move instructions
  target/riscv: rvv-1.0: whole register move instructions
  target/riscv: rvv-1.0: integer extension instructions
  target/riscv: rvv-1.0: single-width averaging add and subtract
instructions
  target/riscv: rvv-1.0: single-width bit shift instructions
  target/riscv: rvv-1.0: integer add-with-carry/subtract-with-borrow
  target/riscv: rvv-1.0: narrowing integer right shift instructions
  target/riscv: rvv-1.0: widening integer multiply-add instructions
  target/riscv: rvv-1.0: single-width saturating add and subtract
instructions
  target/riscv: rvv-1.0: integer comparison instructions
  target/ri

[RFC v5 30/68] target/riscv: rvv-1.0: set-X-first mask bit instructions

2020-09-29 Thread frank . chang
From: Frank Chang 

Signed-off-by: Frank Chang 
Reviewed-by: Richard Henderson 
---
 target/riscv/insn32.decode  | 6 +++---
 target/riscv/insn_trans/trans_rvv.c.inc | 5 -
 target/riscv/vector_helper.c| 4 
 3 files changed, 7 insertions(+), 8 deletions(-)

diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
index d7dac12883..0fed7c9e56 100644
--- a/target/riscv/insn32.decode
+++ b/target/riscv/insn32.decode
@@ -604,9 +604,9 @@ vmornot_mm  011100 - . . 010 . 1010111 @r
 vmxnor_mm   01 - . . 010 . 1010111 @r
 vpopc_m 01 . . 1 010 . 1010111 @r2_vm
 vfirst_m01 . . 10001 010 . 1010111 @r2_vm
-vmsbf_m 010110 . . 1 010 . 1010111 @r2_vm
-vmsif_m 010110 . . 00011 010 . 1010111 @r2_vm
-vmsof_m 010110 . . 00010 010 . 1010111 @r2_vm
+vmsbf_m 010100 . . 1 010 . 1010111 @r2_vm
+vmsif_m 010100 . . 00011 010 . 1010111 @r2_vm
+vmsof_m 010100 . . 00010 010 . 1010111 @r2_vm
 viota_m 010110 . . 1 010 . 1010111 @r2_vm
 vid_v   010110 . 0 10001 010 . 1010111 @r1_vm
 vext_x_v001100 1 . . 010 . 1010111 @r
diff --git a/target/riscv/insn_trans/trans_rvv.c.inc 
b/target/riscv/insn_trans/trans_rvv.c.inc
index c34877140f..ae2b224b0f 100644
--- a/target/riscv/insn_trans/trans_rvv.c.inc
+++ b/target/riscv/insn_trans/trans_rvv.c.inc
@@ -2982,7 +2982,10 @@ static bool trans_vfirst_m(DisasContext *s, arg_rmr *a)
 #define GEN_M_TRANS(NAME)  \
 static bool trans_##NAME(DisasContext *s, arg_rmr *a)  \
 {  \
-if (vext_check_isa_ill(s)) {   \
+if (require_rvv(s) &&  \
+vext_check_isa_ill(s) &&   \
+require_vm(a->vm, a->rd) &&\
+(a->rd != a->rs2)) {   \
 uint32_t data = 0; \
 gen_helper_gvec_3_ptr *fn = gen_helper_##NAME; \
 TCGLabel *over = gen_new_label();  \
diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c
index ecc9be7733..8ccf538141 100644
--- a/target/riscv/vector_helper.c
+++ b/target/riscv/vector_helper.c
@@ -4466,7 +4466,6 @@ enum set_mask_type {
 static void vmsetm(void *vd, void *v0, void *vs2, CPURISCVState *env,
uint32_t desc, enum set_mask_type type)
 {
-uint32_t vlmax = env_archcpu(env)->cfg.vlen;
 uint32_t vm = vext_vm(desc);
 uint32_t vl = env->vl;
 int i;
@@ -4496,9 +4495,6 @@ static void vmsetm(void *vd, void *v0, void *vs2, 
CPURISCVState *env,
 }
 }
 }
-for (; i < vlmax; i++) {
-vext_set_elem_mask(vd, i, 0);
-}
 }
 
 void HELPER(vmsbf_m)(void *vd, void *v0, void *vs2, CPURISCVState *env,
-- 
2.17.1




[RFC v5 29/68] target/riscv: rvv-1.0: find-first-set mask bit instruction

2020-09-29 Thread frank . chang
From: Frank Chang 

Signed-off-by: Frank Chang 
Reviewed-by: Richard Henderson 
---
 target/riscv/helper.h   | 2 +-
 target/riscv/insn32.decode  | 2 +-
 target/riscv/insn_trans/trans_rvv.c.inc | 4 ++--
 target/riscv/vector_helper.c| 6 +++---
 4 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index e046591a42..8d40009871 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -1072,7 +1072,7 @@ DEF_HELPER_6(vmxnor_mm, void, ptr, ptr, ptr, ptr, env, 
i32)
 
 DEF_HELPER_4(vpopc_m, tl, ptr, ptr, env, i32)
 
-DEF_HELPER_4(vmfirst_m, tl, ptr, ptr, env, i32)
+DEF_HELPER_4(vfirst_m, tl, ptr, ptr, env, i32)
 
 DEF_HELPER_5(vmsbf_m, void, ptr, ptr, ptr, env, i32)
 DEF_HELPER_5(vmsif_m, void, ptr, ptr, ptr, env, i32)
diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
index 0cb76682a0..d7dac12883 100644
--- a/target/riscv/insn32.decode
+++ b/target/riscv/insn32.decode
@@ -603,7 +603,7 @@ vmnor_mm00 - . . 010 . 1010111 @r
 vmornot_mm  011100 - . . 010 . 1010111 @r
 vmxnor_mm   01 - . . 010 . 1010111 @r
 vpopc_m 01 . . 1 010 . 1010111 @r2_vm
-vmfirst_m   010101 . . - 010 . 1010111 @r2_vm
+vfirst_m01 . . 10001 010 . 1010111 @r2_vm
 vmsbf_m 010110 . . 1 010 . 1010111 @r2_vm
 vmsif_m 010110 . . 00011 010 . 1010111 @r2_vm
 vmsof_m 010110 . . 00010 010 . 1010111 @r2_vm
diff --git a/target/riscv/insn_trans/trans_rvv.c.inc 
b/target/riscv/insn_trans/trans_rvv.c.inc
index f9d280b0c5..c34877140f 100644
--- a/target/riscv/insn_trans/trans_rvv.c.inc
+++ b/target/riscv/insn_trans/trans_rvv.c.inc
@@ -2945,7 +2945,7 @@ static bool trans_vpopc_m(DisasContext *s, arg_rmr *a)
 }
 
 /* vmfirst find-first-set mask bit */
-static bool trans_vmfirst_m(DisasContext *s, arg_rmr *a)
+static bool trans_vfirst_m(DisasContext *s, arg_rmr *a)
 {
 if (require_rvv(s) &&
 vext_check_isa_ill(s)) {
@@ -2964,7 +2964,7 @@ static bool trans_vmfirst_m(DisasContext *s, arg_rmr *a)
 tcg_gen_addi_ptr(src2, cpu_env, vreg_ofs(s, a->rs2));
 tcg_gen_addi_ptr(mask, cpu_env, vreg_ofs(s, 0));
 
-gen_helper_vmfirst_m(dst, mask, src2, cpu_env, desc);
+gen_helper_vfirst_m(dst, mask, src2, cpu_env, desc);
 gen_set_gpr(a->rd, dst);
 
 tcg_temp_free_ptr(mask);
diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c
index 517e7344b9..ecc9be7733 100644
--- a/target/riscv/vector_helper.c
+++ b/target/riscv/vector_helper.c
@@ -4439,9 +4439,9 @@ target_ulong HELPER(vpopc_m)(void *v0, void *vs2, 
CPURISCVState *env,
 return cnt;
 }
 
-/* vmfirst find-first-set mask bit*/
-target_ulong HELPER(vmfirst_m)(void *v0, void *vs2, CPURISCVState *env,
-   uint32_t desc)
+/* vfirst find-first-set mask bit*/
+target_ulong HELPER(vfirst_m)(void *v0, void *vs2, CPURISCVState *env,
+  uint32_t desc)
 {
 uint32_t vm = vext_vm(desc);
 uint32_t vl = env->vl;
-- 
2.17.1
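
For reference, a minimal sketch of the find-first-set semantics behind the
renamed vfirst.m helper: it returns the index of the first active element
whose mask bit is set in vs2, or -1 if there is none. The sketch assumes one
mask bit per element stored LSB-first in a byte array and ignores host
endianness; the function names are hypothetical and the body is not copied
from the helper in this patch.

```c
#include <assert.h>
#include <stdint.h>

/* Read mask bit i from a byte array (assumed layout: LSB-first). */
static int mask_bit_sketch(const uint8_t *m, uint32_t i)
{
    return (m[i / 8] >> (i % 8)) & 1;
}

/*
 * Index of the first active element with its bit set in vs2, else -1.
 * vm == 0 means the operation is masked by v0; vm == 1 means unmasked.
 */
static long vfirst_sketch(const uint8_t *v0, const uint8_t *vs2,
                          uint32_t vm, uint32_t vl)
{
    for (uint32_t i = 0; i < vl; i++) {
        if (!vm && !mask_bit_sketch(v0, i)) {
            continue;               /* element is masked off */
        }
        if (mask_bit_sketch(vs2, i)) {
            return (long)i;
        }
    }
    return -1;
}
```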




[RFC v5 50/68] target/riscv: rvv-1.0: floating-point slide instructions

2020-09-29 Thread frank . chang
From: Frank Chang 

Add the following instructions:

* vfslide1up.vf
* vfslide1down.vf

Signed-off-by: Frank Chang 
---
 target/riscv/helper.h   |   7 ++
 target/riscv/insn32.decode  |   2 +
 target/riscv/insn_trans/trans_rvv.c.inc |  16 +++
 target/riscv/vector_helper.c| 141 
 4 files changed, 121 insertions(+), 45 deletions(-)

diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index c0109a8689..d956d73b1b 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -1121,6 +1121,13 @@ DEF_HELPER_6(vslide1down_vx_h, void, ptr, ptr, tl, ptr, 
env, i32)
 DEF_HELPER_6(vslide1down_vx_w, void, ptr, ptr, tl, ptr, env, i32)
 DEF_HELPER_6(vslide1down_vx_d, void, ptr, ptr, tl, ptr, env, i32)
 
+DEF_HELPER_6(vfslide1up_vf_h, void, ptr, ptr, i64, ptr, env, i32)
+DEF_HELPER_6(vfslide1up_vf_w, void, ptr, ptr, i64, ptr, env, i32)
+DEF_HELPER_6(vfslide1up_vf_d, void, ptr, ptr, i64, ptr, env, i32)
+DEF_HELPER_6(vfslide1down_vf_h, void, ptr, ptr, i64, ptr, env, i32)
+DEF_HELPER_6(vfslide1down_vf_w, void, ptr, ptr, i64, ptr, env, i32)
+DEF_HELPER_6(vfslide1down_vf_d, void, ptr, ptr, i64, ptr, env, i32)
+
 DEF_HELPER_6(vrgather_vv_b, void, ptr, ptr, ptr, ptr, env, i32)
 DEF_HELPER_6(vrgather_vv_h, void, ptr, ptr, ptr, ptr, env, i32)
 DEF_HELPER_6(vrgather_vv_w, void, ptr, ptr, ptr, ptr, env, i32)
diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
index c75d728fc5..7e14f7f977 100644
--- a/target/riscv/insn32.decode
+++ b/target/riscv/insn32.decode
@@ -552,6 +552,8 @@ vfsgnjn_vv  001001 . . . 001 . 1010111 @r_vm
 vfsgnjn_vf  001001 . . . 101 . 1010111 @r_vm
 vfsgnjx_vv  001010 . . . 001 . 1010111 @r_vm
 vfsgnjx_vf  001010 . . . 101 . 1010111 @r_vm
+vfslide1up_vf   001110 . . . 101 . 1010111 @r_vm
+vfslide1down_vf 00 . . . 101 . 1010111 @r_vm
 vmfeq_vv011000 . . . 001 . 1010111 @r_vm
 vmfeq_vf011000 . . . 101 . 1010111 @r_vm
 vmfne_vv011100 . . . 001 . 1010111 @r_vm
diff --git a/target/riscv/insn_trans/trans_rvv.c.inc 
b/target/riscv/insn_trans/trans_rvv.c.inc
index 5726fd7133..ac29c8f36c 100644
--- a/target/riscv/insn_trans/trans_rvv.c.inc
+++ b/target/riscv/insn_trans/trans_rvv.c.inc
@@ -3369,6 +3369,22 @@ GEN_OPIVX_TRANS(vslidedown_vx, slidedown_check)
 GEN_OPIVX_TRANS(vslide1down_vx, slidedown_check)
 GEN_OPIVI_TRANS(vslidedown_vi, IMM_ZX, vslidedown_vx, slidedown_check)
 
+/* Vector Floating-Point Slide Instructions */
+static bool fslideup_check(DisasContext *s, arg_rmrr *a)
+{
+return slideup_check(s, a) &&
+   require_rvf(s);
+}
+
+static bool fslidedown_check(DisasContext *s, arg_rmrr *a)
+{
+return slidedown_check(s, a) &&
+   require_rvf(s);
+}
+
+GEN_OPFVF_TRANS(vfslide1up_vf, fslideup_check)
+GEN_OPFVF_TRANS(vfslide1down_vf, fslidedown_check)
+
 /* Vector Register Gather Instruction */
 static bool vrgather_vv_check(DisasContext *s, arg_rmrr *a)
 {
diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c
index fac387353f..df4f0b2d6a 100644
--- a/target/riscv/vector_helper.c
+++ b/target/riscv/vector_helper.c
@@ -4664,57 +4664,108 @@ GEN_VEXT_VSLIDEDOWN_VX(vslidedown_vx_h, uint16_t, H2)
 GEN_VEXT_VSLIDEDOWN_VX(vslidedown_vx_w, uint32_t, H4)
 GEN_VEXT_VSLIDEDOWN_VX(vslidedown_vx_d, uint64_t, H8)
 
-#define GEN_VEXT_VSLIDE1UP_VX(NAME, ETYPE, H) \
-void HELPER(NAME)(void *vd, void *v0, target_ulong s1, void *vs2, \
-  CPURISCVState *env, uint32_t desc)  \
-{ \
-uint32_t vm = vext_vm(desc);  \
-uint32_t vl = env->vl;\
-uint32_t i;   \
-  \
-for (i = 0; i < vl; i++) {\
-if (!vm && !vext_elem_mask(v0, i)) {  \
-continue; \
-} \
-if (i == 0) { \
-*((ETYPE *)vd + H(i)) = s1;   \
-} else {  \
-*((ETYPE *)vd + H(i)) = *((ETYPE *)vs2 + H(i - 1));   \
-} \
-} \
+#define GEN_VEXT_VSLIE1UP(ESZ, H)   \
+static void vslide1up_##ESZ(void *vd, void *

[RFC v5 52/68] target/riscv: rvv-1.0: single-width floating-point reduction

2020-09-29 Thread frank . chang
From: Frank Chang 

Signed-off-by: Frank Chang 
Reviewed-by: Richard Henderson 
---
 target/riscv/insn_trans/trans_rvv.c.inc | 12 +---
 target/riscv/vector_helper.c| 12 ++--
 2 files changed, 15 insertions(+), 9 deletions(-)

diff --git a/target/riscv/insn_trans/trans_rvv.c.inc b/target/riscv/insn_trans/trans_rvv.c.inc
index 0d5872830b..e2f4de6078 100644
--- a/target/riscv/insn_trans/trans_rvv.c.inc
+++ b/target/riscv/insn_trans/trans_rvv.c.inc
@@ -2884,9 +2884,15 @@ GEN_OPIVV_WIDEN_TRANS(vwredsum_vs, reduction_widen_check)
 GEN_OPIVV_WIDEN_TRANS(vwredsumu_vs, reduction_widen_check)
 
 /* Vector Single-Width Floating-Point Reduction Instructions */
-GEN_OPFVV_TRANS(vfredsum_vs, reduction_check)
-GEN_OPFVV_TRANS(vfredmax_vs, reduction_check)
-GEN_OPFVV_TRANS(vfredmin_vs, reduction_check)
+static bool freduction_check(DisasContext *s, arg_rmrr *a)
+{
+return reduction_check(s, a) &&
+   require_rvf(s);
+}
+
+GEN_OPFVV_TRANS(vfredsum_vs, freduction_check)
+GEN_OPFVV_TRANS(vfredmax_vs, freduction_check)
+GEN_OPFVV_TRANS(vfredmin_vs, freduction_check)
 
 /* Vector Widening Floating-Point Reduction Instructions */
 GEN_OPFVV_WIDEN_TRANS(vfwredsum_vs, reduction_check)
diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c
index c5048882e9..e6931466d4 100644
--- a/target/riscv/vector_helper.c
+++ b/target/riscv/vector_helper.c
@@ -4382,14 +4382,14 @@ GEN_VEXT_FRED(vfredsum_vs_w, uint32_t, uint32_t, H4, H4, float32_add)
 GEN_VEXT_FRED(vfredsum_vs_d, uint64_t, uint64_t, H8, H8, float64_add)
 
 /* Maximum value */
-GEN_VEXT_FRED(vfredmax_vs_h, uint16_t, uint16_t, H2, H2, float16_maxnum)
-GEN_VEXT_FRED(vfredmax_vs_w, uint32_t, uint32_t, H4, H4, float32_maxnum)
-GEN_VEXT_FRED(vfredmax_vs_d, uint64_t, uint64_t, H8, H8, float64_maxnum)
+GEN_VEXT_FRED(vfredmax_vs_h, uint16_t, uint16_t, H2, H2, float16_maxnum_noprop)
+GEN_VEXT_FRED(vfredmax_vs_w, uint32_t, uint32_t, H4, H4, float32_maxnum_noprop)
+GEN_VEXT_FRED(vfredmax_vs_d, uint64_t, uint64_t, H8, H8, float64_maxnum_noprop)
 
 /* Minimum value */
-GEN_VEXT_FRED(vfredmin_vs_h, uint16_t, uint16_t, H2, H2, float16_minnum)
-GEN_VEXT_FRED(vfredmin_vs_w, uint32_t, uint32_t, H4, H4, float32_minnum)
-GEN_VEXT_FRED(vfredmin_vs_d, uint64_t, uint64_t, H8, H8, float64_minnum)
+GEN_VEXT_FRED(vfredmin_vs_h, uint16_t, uint16_t, H2, H2, float16_minnum_noprop)
+GEN_VEXT_FRED(vfredmin_vs_w, uint32_t, uint32_t, H4, H4, float32_minnum_noprop)
+GEN_VEXT_FRED(vfredmin_vs_d, uint64_t, uint64_t, H8, H8, float64_minnum_noprop)
 
 /* Vector Widening Floating-Point Reduction Instructions */
 /* Unordered reduce 2*SEW = 2*SEW + sum(promote(SEW)) */
-- 
2.17.1




[RFC v5 59/68] target/riscv: introduce floating-point rounding mode enum

2020-09-29 Thread frank . chang
From: Frank Chang 

Signed-off-by: Frank Chang 
Reviewed-by: Richard Henderson 
---
 target/riscv/fpu_helper.c   | 12 ++--
 target/riscv/insn_trans/trans_rvv.c.inc | 18 +-
 target/riscv/internals.h|  9 +
 3 files changed, 24 insertions(+), 15 deletions(-)
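
Note (not part of the patch): the enum added to internals.h is cut off in this archive copy. Judging only from the literals replaced in the helper_set_rounding_mode() hunk (0 to 4 in the switch, 7 for the dynamic mode), the constants would have to carry the values sketched here. The resolve_rm() helper is hypothetical and only illustrates the "DYN means use frm" step; it is not a QEMU function.

```c
#include <stdint.h>

/*
 * Values inferred from the numeric literals this patch replaces.
 * The real definition is in target/riscv/internals.h, which is not
 * visible in this excerpt.
 */
enum {
    RISCV_FRM_RNE = 0, /* round to nearest, ties to even */
    RISCV_FRM_RTZ = 1, /* round towards zero */
    RISCV_FRM_RDN = 2, /* round down */
    RISCV_FRM_RUP = 3, /* round up */
    RISCV_FRM_RMM = 4, /* round to nearest, ties away from zero */
    RISCV_FRM_DYN = 7, /* take the mode from the frm CSR */
};

/* Hypothetical helper: pick the effective rounding mode. */
static uint32_t resolve_rm(uint32_t rm, uint32_t frm)
{
    return rm == RISCV_FRM_DYN ? frm : rm;
}
```

With these values, gen_set_rm(s, RISCV_FRM_DYN) is equivalent to the old gen_set_rm(s, 7).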

diff --git a/target/riscv/fpu_helper.c b/target/riscv/fpu_helper.c
index bb346a8249..3850146fec 100644
--- a/target/riscv/fpu_helper.c
+++ b/target/riscv/fpu_helper.c
@@ -55,23 +55,23 @@ void helper_set_rounding_mode(CPURISCVState *env, uint32_t rm)
 {
 int softrm;
 
-if (rm == 7) {
+if (rm == RISCV_FRM_DYN) {
 rm = env->frm;
 }
 switch (rm) {
-case 0:
+case RISCV_FRM_RNE:
 softrm = float_round_nearest_even;
 break;
-case 1:
+case RISCV_FRM_RTZ:
 softrm = float_round_to_zero;
 break;
-case 2:
+case RISCV_FRM_RDN:
 softrm = float_round_down;
 break;
-case 3:
+case RISCV_FRM_RUP:
 softrm = float_round_up;
 break;
-case 4:
+case RISCV_FRM_RMM:
 softrm = float_round_ties_away;
 break;
 default:
diff --git a/target/riscv/insn_trans/trans_rvv.c.inc b/target/riscv/insn_trans/trans_rvv.c.inc
index 37c97f8c61..01ef472413 100644
--- a/target/riscv/insn_trans/trans_rvv.c.inc
+++ b/target/riscv/insn_trans/trans_rvv.c.inc
@@ -2338,7 +2338,7 @@ static bool trans_##NAME(DisasContext *s, arg_rmrr *a) \
 gen_helper_##NAME##_d, \
 }; \
 TCGLabel *over = gen_new_label();  \
-gen_set_rm(s, 7);  \
+gen_set_rm(s, RISCV_FRM_DYN);  \
 tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_vl, 0, over);  \
\
 data = FIELD_DP32(data, VDATA, VM, a->vm); \
@@ -2418,7 +2418,7 @@ static bool trans_##NAME(DisasContext *s, arg_rmrr *a) \
 gen_helper_##NAME##_w,\
 gen_helper_##NAME##_d,\
 };\
-gen_set_rm(s, 7); \
+gen_set_rm(s, RISCV_FRM_DYN); \
 data = FIELD_DP32(data, VDATA, VM, a->vm);\
 data = FIELD_DP32(data, VDATA, LMUL, s->lmul);\
 return opfvf_trans(a->rd, a->rs1, a->rs2, data,   \
@@ -2450,7 +2450,7 @@ static bool trans_##NAME(DisasContext *s, arg_rmrr *a) \
 gen_helper_##NAME##_h, gen_helper_##NAME##_w,\
 };   \
 TCGLabel *over = gen_new_label();\
-gen_set_rm(s, 7);\
+gen_set_rm(s, RISCV_FRM_DYN);\
 tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_vl, 0, over);\
  \
 data = FIELD_DP32(data, VDATA, VM, a->vm);   \
@@ -2486,7 +2486,7 @@ static bool trans_##NAME(DisasContext *s, arg_rmrr *a) \
 static gen_helper_opfvf *const fns[2] = {\
 gen_helper_##NAME##_h, gen_helper_##NAME##_w,\
 };   \
-gen_set_rm(s, 7);\
+gen_set_rm(s, RISCV_FRM_DYN);\
 data = FIELD_DP32(data, VDATA, VM, a->vm);   \
 data = FIELD_DP32(data, VDATA, LMUL, s->lmul);   \
 return opfvf_trans(a->rd, a->rs1, a->rs2, data,  \
@@ -2516,7 +2516,7 @@ static bool trans_##NAME(DisasContext *s, arg_rmrr *a) \
 gen_helper_##NAME##_h, gen_helper_##NAME##_w,  \
 }; \
 TCGLabel *over = gen_new_label();  \
-gen_set_rm(s, 7);  \
+gen_set_rm(s, RISCV_FRM_DYN);  \
 tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_vl, 0, over);  \
\
 data = FIELD_DP32(data, VDATA, VM, a->vm); \
@@ -2552,7 +2552,7 @@ static bool trans_##NAME(DisasContext *s, arg_rmrr *a) \
 static gen_helper_opfvf *const fns[2] = {\
 gen_helper_##NAME##_h, gen_helper_##NAME##_w,\
 };   \
-gen_set_rm(s, 7); 

[RFC v5 01/68] target/riscv: drop vector 0.7.1 and add 1.0 support

2020-09-29 Thread frank . chang
From: Frank Chang 

Signed-off-by: Frank Chang 
Reviewed-by: Richard Henderson 
Reviewed-by: Alistair Francis 
---
 target/riscv/cpu.c | 10 +-
 target/riscv/cpu.h |  2 +-
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index 57c006df5d..17c138bb90 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -345,7 +345,7 @@ static void riscv_cpu_realize(DeviceState *dev, Error **errp)
 CPURISCVState *env = &cpu->env;
 RISCVCPUClass *mcc = RISCV_CPU_GET_CLASS(dev);
 int priv_version = PRIV_VERSION_1_11_0;
-int vext_version = VEXT_VERSION_0_07_1;
+int vext_version = VEXT_VERSION_1_00_0;
 target_ulong target_misa = 0;
 Error *local_err = NULL;
 
@@ -463,8 +463,8 @@ static void riscv_cpu_realize(DeviceState *dev, Error **errp)
 return;
 }
 if (cpu->cfg.vext_spec) {
-if (!g_strcmp0(cpu->cfg.vext_spec, "v0.7.1")) {
-vext_version = VEXT_VERSION_0_07_1;
+if (!g_strcmp0(cpu->cfg.vext_spec, "v1.0")) {
+vext_version = VEXT_VERSION_1_00_0;
 } else {
 error_setg(errp,
"Unsupported vector spec version '%s'",
@@ -472,8 +472,8 @@ static void riscv_cpu_realize(DeviceState *dev, Error **errp)
 return;
 }
 } else {
-qemu_log("vector verison is not specified, "
-"use the default value v0.7.1\n");
+qemu_log("vector version is not specified, "
+"use the default value v1.0\n");
 }
 set_vext_version(env, vext_version);
 }
diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
index 65daa73675..bf10b64fcb 100644
--- a/target/riscv/cpu.h
+++ b/target/riscv/cpu.h
@@ -79,7 +79,7 @@ enum {
 #define PRIV_VERSION_1_10_0 0x00011000
 #define PRIV_VERSION_1_11_0 0x00011100
 
-#define VEXT_VERSION_0_07_1 0x0701
+#define VEXT_VERSION_1_00_0 0x0001
 
 #define TRANSLATE_PMP_FAIL 2
 #define TRANSLATE_FAIL 1
-- 
2.17.1




[RFC v5 04/68] target/riscv: rvv-1.0: add sstatus VS field

2020-09-29 Thread frank . chang
From: LIU Zhiwei 

Signed-off-by: LIU Zhiwei 
Signed-off-by: Frank Chang 
Reviewed-by: Richard Henderson 
---
 target/riscv/cpu_bits.h | 1 +
 target/riscv/csr.c  | 2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)
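
Note (not part of the patch): a minimal sketch of how the new mask selects the VS field, assuming the field occupies bits [10:9] as the 0x600 mask implies. The sstatus_get_vs() helper and the state names in the comment are illustrative, not taken from this series.

```c
#include <stdint.h>

#define SSTATUS_VS 0x00000600u /* two-bit field at bits [10:9] */

/*
 * Hypothetical accessor: returns the raw 2-bit VS state.
 * The usual reading (same as FS) is 0 = Off, 1 = Initial,
 * 2 = Clean, 3 = Dirty.
 */
static inline uint32_t sstatus_get_vs(uint64_t sstatus)
{
    return (uint32_t)((sstatus & SSTATUS_VS) >> 9);
}
```

Adding SSTATUS_VS to sstatus_v1_10_mask is what makes this field visible through the sstatus view of mstatus.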

diff --git a/target/riscv/cpu_bits.h b/target/riscv/cpu_bits.h
index a65d3ab2b3..d77f790dec 100644
--- a/target/riscv/cpu_bits.h
+++ b/target/riscv/cpu_bits.h
@@ -425,6 +425,7 @@
 #define SSTATUS_UPIE0x0010
 #define SSTATUS_SPIE0x0020
 #define SSTATUS_SPP 0x0100
+#define SSTATUS_VS  0x0600
 #define SSTATUS_FS  0x6000
 #define SSTATUS_XS  0x00018000
 #define SSTATUS_PUM 0x0004 /* until: priv-1.9.1 */
diff --git a/target/riscv/csr.c b/target/riscv/csr.c
index ca04b44aea..05aca3243b 100644
--- a/target/riscv/csr.c
+++ b/target/riscv/csr.c
@@ -423,7 +423,7 @@ static const target_ulong delegable_excps =
 (1ULL << (RISCV_EXCP_STORE_GUEST_AMO_ACCESS_FAULT));
 static const target_ulong sstatus_v1_10_mask = SSTATUS_SIE | SSTATUS_SPIE |
 SSTATUS_UIE | SSTATUS_UPIE | SSTATUS_SPP | SSTATUS_FS | SSTATUS_XS |
-SSTATUS_SUM | SSTATUS_MXR | SSTATUS_SD;
+SSTATUS_SUM | SSTATUS_MXR | SSTATUS_SD | SSTATUS_VS;
 static const target_ulong sip_writable_mask = SIP_SSIP | MIP_USIP | MIP_UEIP;
 static const target_ulong hip_writable_mask = MIP_VSSIP | MIP_VSTIP | MIP_VSEIP;
 static const target_ulong vsip_writable_mask = MIP_VSSIP;
-- 
2.17.1




[RFC v5 14/68] target/riscv: rvv-1.0: update check functions

2020-09-29 Thread frank . chang
From: Frank Chang 

Update check functions with RVV 1.0 rules.

Signed-off-by: Frank Chang 
---
 target/riscv/insn_trans/trans_rvv.c.inc | 735 
 1 file changed, 502 insertions(+), 233 deletions(-)
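
Note (not part of the patch): the overlap and alignment helpers introduced in the diff below can be exercised outside QEMU. The sketch copies their logic into a standalone file; max_i()/min_i() stand in for QEMU's MAX/MIN macros and a plain bit mask stands in for extract32(), so treat it as an approximation of the patch, not the code itself.

```c
#include <stdbool.h>
#include <stdint.h>

static inline int max_i(int a, int b) { return a > b ? a : b; }
static inline int min_i(int a, int b) { return a < b ? a : b; }

/* True if register ranges [astart, astart+asize) and
 * [bstart, bstart+bsize) share at least one register. */
static bool is_overlapped(int8_t astart, int8_t asize,
                          int8_t bstart, int8_t bsize)
{
    const int8_t aend = astart + asize;
    const int8_t bend = bstart + bsize;

    return max_i(aend, bend) - min_i(astart, bstart) < asize + bsize;
}

/* Register number must be a multiple of 2^lmul; fractional LMUL
 * (lmul < 0) and LMUL = 1 (lmul == 0) accept any register. */
static bool require_align(int8_t val, int8_t lmul)
{
    return lmul <= 0 || (val & ((1 << lmul) - 1)) == 0;
}

/* No overlap, or the overlap permitted by rule 3 (wider destination
 * overlapping only in its highest-numbered part). */
static bool require_noover(int8_t dst, int8_t dst_lmul,
                           int8_t src, int8_t src_lmul)
{
    int8_t dst_size = dst_lmul <= 0 ? 1 : 1 << dst_lmul;
    int8_t src_size = src_lmul <= 0 ? 1 : 1 << src_lmul;

    if (dst_size > src_size) {
        if (dst < src &&
            src_lmul >= 0 &&
            is_overlapped(dst, dst_size, src, src_size) &&
            !is_overlapped(dst, dst_size, src + src_size, src_size)) {
            return true;
        }
    }

    return !is_overlapped(dst, dst_size, src, src_size);
}
```

For example, a destination group v8-v9 (lmul = 1) with source v9 (lmul = 0) passes, because the overlap sits in the top register of the destination group, while the same destination with source v8 is rejected.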

diff --git a/target/riscv/insn_trans/trans_rvv.c.inc b/target/riscv/insn_trans/trans_rvv.c.inc
index d32ce2b3de..91a08d4faf 100644
--- a/target/riscv/insn_trans/trans_rvv.c.inc
+++ b/target/riscv/insn_trans/trans_rvv.c.inc
@@ -19,11 +19,124 @@
 #include "tcg/tcg-gvec-desc.h"
 #include "internals.h"
 
+static inline bool is_overlapped(const int8_t astart, int8_t asize,
+ const int8_t bstart, int8_t bsize)
+{
+const int8_t aend = astart + asize;
+const int8_t bend = bstart + bsize;
+
+return MAX(aend, bend) - MIN(astart, bstart) < asize + bsize;
+}
+
+static bool require_rvv(DisasContext *s)
+{
+return s->mstatus_vs != 0;
+}
+
+static bool require_rvf(DisasContext *s)
+{
+if (s->mstatus_fs == 0) {
+return false;
+}
+
+switch (s->sew) {
+case MO_16:
+case MO_32:
+return has_ext(s, RVF);
+case MO_64:
+return has_ext(s, RVD);
+default:
+return false;
+}
+}
+
+static bool require_scale_rvf(DisasContext *s)
+{
+if (s->mstatus_fs == 0) {
+return false;
+}
+
+switch (s->sew) {
+case MO_8:
+case MO_16:
+return has_ext(s, RVF);
+case MO_32:
+return has_ext(s, RVD);
+default:
+return false;
+}
+}
+
+/* Destination vector register group cannot overlap source mask register. */
+static bool require_vm(int vm, int vd)
+{
+return (vm != 0 || vd != 0);
+}
+
+static bool require_nf(int vd, int nf, int lmul)
+{
+int size = nf << MAX(lmul, 0);
+return size <= 8 && vd + size <= 32;
+}
+
+/*
+ * Vector register should be aligned with the passed-in LMUL (EMUL).
+ * If LMUL < 0, i.e. fractional LMUL, any vector register is allowed.
+ */
+static bool require_align(const int8_t val, const int8_t lmul)
+{
+return lmul <= 0 || extract32(val, 0, lmul) == 0;
+}
+
+/*
+ * A destination vector register group can overlap a source vector
+ * register group only if one of the following holds:
+ *  1. The destination EEW equals the source EEW.
+ *  2. The destination EEW is smaller than the source EEW and the overlap
+ * is in the lowest-numbered part of the source register group.
+ *  3. The destination EEW is greater than the source EEW, the source EMUL
+ * is at least 1, and the overlap is in the highest-numbered part of
+ * the destination register group.
+ * (Section 5.2)
+ *
+ * This function returns true if one of the following holds:
+ *  * Destination vector register group does not overlap a source vector
+ *register group.
+ *  * Rule 3 met.
+ * For rule 1, overlap is allowed so this function doesn't need to be called.
+ * For rule 2, (vd == vs). Caller has to check whether: (vd != vs) before
+ * calling this function.
+ */
+static bool require_noover(const int8_t dst, const int8_t dst_lmul,
+   const int8_t src, const int8_t src_lmul)
+{
+int8_t dst_size = dst_lmul <= 0 ? 1 : 1 << dst_lmul;
+int8_t src_size = src_lmul <= 0 ? 1 : 1 << src_lmul;
+
+/* Destination EEW is greater than the source EEW, check rule 3. */
+if (dst_size > src_size) {
+if (dst < src &&
+src_lmul >= 0 &&
+is_overlapped(dst, dst_size, src, src_size) &&
+!is_overlapped(dst, dst_size, src + src_size, src_size)) {
+return true;
+}
+}
+
+return !is_overlapped(dst, dst_size, src, src_size);
+}
+
+static bool require_noover_seg(const int8_t dst, const int8_t nf,
+   const int8_t src)
+{
+return !is_overlapped(dst, nf, src, 1);
+}
+
 static bool trans_vsetvl(DisasContext *ctx, arg_vsetvl *a)
 {
 TCGv s1, s2, dst;
 
-if (!has_ext(ctx, RVV)) {
+if (!require_rvv(ctx) || !has_ext(ctx, RVV)) {
 return false;
 }
 
@@ -56,7 +169,7 @@ static bool trans_vsetvli(DisasContext *ctx, arg_vsetvli *a)
 {
 TCGv s1, s2, dst;
 
-if (!has_ext(ctx, RVV)) {
+if (!require_rvv(ctx) || !has_ext(ctx, RVV)) {
 return false;
 }
 
@@ -99,54 +212,248 @@ static bool vext_check_isa_ill(DisasContext *s)
 return !s->vill;
 }
 
+static bool vext_check_ss(DisasContext *s, int vd, int vs, int vm)
+{
+return require_vm(vm, vd) &&
+require_align(vd, s->lmul) &&
+require_align(vs, s->lmul);
+}
+
 /*
- * There are two rules check here.
+ * Check function for vector instruction with format:
+ * single-width result and single-width sources (SEW = SEW op SEW)
  *
- * 1. Vector register numbers are multiples of LMUL. (Section 3.2)
+ * is_vs1: indicates whether insn[19:15] is a vs1 field or not.
  *
- * 2. For all widening instructions, the

[RFC v5 09/68] target/riscv: rvv-1.0: add vlenb register

2020-09-29 Thread frank . chang
From: Greentime Hu 

Signed-off-by: Greentime Hu 
Signed-off-by: Frank Chang 
Reviewed-by: Richard Henderson 
---
 target/riscv/cpu_bits.h | 1 +
 target/riscv/csr.c  | 7 +++
 2 files changed, 8 insertions(+)

diff --git a/target/riscv/cpu_bits.h b/target/riscv/cpu_bits.h
index 0cf8a04dd8..1a84b7fd75 100644
--- a/target/riscv/cpu_bits.h
+++ b/target/riscv/cpu_bits.h
@@ -63,6 +63,7 @@
 #define CSR_VCSR0x00f
 #define CSR_VL  0xc20
 #define CSR_VTYPE   0xc21
+#define CSR_VLENB   0xc22
 
 /* VCSR fields */
 #define VCSR_VXSAT_SHIFT0
diff --git a/target/riscv/csr.c b/target/riscv/csr.c
index aa58b0b369..cf9718908e 100644
--- a/target/riscv/csr.c
+++ b/target/riscv/csr.c
@@ -241,6 +241,12 @@ static int read_vtype(CPURISCVState *env, int csrno, target_ulong *val)
 return 0;
 }
 
+static int read_vlenb(CPURISCVState *env, int csrno, target_ulong *val)
+{
+*val = env_archcpu(env)->cfg.vlen >> 3;
+return 0;
+}
+
 static int read_vl(CPURISCVState *env, int csrno, target_ulong *val)
 {
 *val = env->vl;
@@ -1400,6 +1406,7 @@ static riscv_csr_operations csr_ops[CSR_TABLE_SIZE] = {
 [CSR_VCSR] ={ vs,   read_vcsr,write_vcsr},
 [CSR_VL] =  { vs,   read_vl },
 [CSR_VTYPE] =   { vs,   read_vtype  },
+[CSR_VLENB] =   { vs,   read_vlenb  },
 /* User Timers and Counters */
 [CSR_CYCLE] =   { ctr,  read_instret},
 [CSR_INSTRET] = { ctr,  read_instret},
-- 
2.17.1




[RFC v5 20/68] target/riscv: rvv-1.0: fix address index overflow bug of indexed load/store insns

2020-09-29 Thread frank . chang
From: Frank Chang 

Change ETYPE from signed to unsigned integer types to prevent index
overflow, which would otherwise produce a wrong indexed address.

Signed-off-by: Frank Chang 
Reviewed-by: Richard Henderson 
---
 target/riscv/vector_helper.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)
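
Note (not part of the patch): a small standalone sketch of the failure mode, using the 8-bit case. It mirrors the `base + *((ETYPE *)vs2 + H(idx))` expression from GEN_VEXT_GET_INDEX_ADDR but drops the H() host-endian adjustment and uses uint64_t in place of target_ulong; the function names and the sample_vs2 array are made up for illustration.

```c
#include <stdint.h>

/* Two index bytes as they might sit in the vs2 register. */
static const uint8_t sample_vs2[2] = { 0x7f, 0xff };

/* Old behaviour: the index element is read as a signed value,
 * so 0xff is sign-extended to -1 before being added to base. */
static uint64_t index_addr_signed8(uint64_t base, const void *vs2, int idx)
{
    return base + *((const int8_t *)vs2 + idx);
}

/* New behaviour: the index element is zero-extended, so 0xff
 * contributes an offset of 255. */
static uint64_t index_addr_unsigned8(uint64_t base, const void *vs2, int idx)
{
    return base + *((const uint8_t *)vs2 + idx);
}
```

With base 0x1000 and index byte 0xff, the signed variant yields 0x0fff while the unsigned variant yields 0x10ff; both agree for index bytes below 0x80.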

diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c
index 368259f75a..9349a36b41 100644
--- a/target/riscv/vector_helper.c
+++ b/target/riscv/vector_helper.c
@@ -361,10 +361,10 @@ static target_ulong NAME(target_ulong base,\
 return (base + *((ETYPE *)vs2 + H(idx)));  \
 }
 
-GEN_VEXT_GET_INDEX_ADDR(idx_b, int8_t,  H1)
-GEN_VEXT_GET_INDEX_ADDR(idx_h, int16_t, H2)
-GEN_VEXT_GET_INDEX_ADDR(idx_w, int32_t, H4)
-GEN_VEXT_GET_INDEX_ADDR(idx_d, int64_t, H8)
+GEN_VEXT_GET_INDEX_ADDR(idx_b, uint8_t,  H1)
+GEN_VEXT_GET_INDEX_ADDR(idx_h, uint16_t, H2)
+GEN_VEXT_GET_INDEX_ADDR(idx_w, uint32_t, H4)
+GEN_VEXT_GET_INDEX_ADDR(idx_d, uint64_t, H8)
 
 static inline void
 vext_ldst_index(void *vd, void *v0, target_ulong base,
-- 
2.17.1




[RFC v5 23/68] target/riscv: rvv-1.0: load/store whole register instructions

2020-09-29 Thread frank . chang
From: Frank Chang 

Add the following instructions:

* vl<nf>re<eew>.v
* vs<nf>r.v

Signed-off-by: Frank Chang 
---
 target/riscv/helper.h   | 21 
 target/riscv/insn32.decode  | 22 
 target/riscv/insn_trans/trans_rvv.c.inc | 69 +
 target/riscv/vector_helper.c| 65 +++
 4 files changed, 177 insertions(+)

diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index b73df6512d..50cca2952c 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -149,6 +149,27 @@ DEF_HELPER_5(vle16ff_v, void, ptr, ptr, tl, env, i32)
 DEF_HELPER_5(vle32ff_v, void, ptr, ptr, tl, env, i32)
 DEF_HELPER_5(vle64ff_v, void, ptr, ptr, tl, env, i32)
 
+DEF_HELPER_4(vl1re8_v, void, ptr, tl, env, i32)
+DEF_HELPER_4(vl1re16_v, void, ptr, tl, env, i32)
+DEF_HELPER_4(vl1re32_v, void, ptr, tl, env, i32)
+DEF_HELPER_4(vl1re64_v, void, ptr, tl, env, i32)
+DEF_HELPER_4(vl2re8_v, void, ptr, tl, env, i32)
+DEF_HELPER_4(vl2re16_v, void, ptr, tl, env, i32)
+DEF_HELPER_4(vl2re32_v, void, ptr, tl, env, i32)
+DEF_HELPER_4(vl2re64_v, void, ptr, tl, env, i32)
+DEF_HELPER_4(vl4re8_v, void, ptr, tl, env, i32)
+DEF_HELPER_4(vl4re16_v, void, ptr, tl, env, i32)
+DEF_HELPER_4(vl4re32_v, void, ptr, tl, env, i32)
+DEF_HELPER_4(vl4re64_v, void, ptr, tl, env, i32)
+DEF_HELPER_4(vl8re8_v, void, ptr, tl, env, i32)
+DEF_HELPER_4(vl8re16_v, void, ptr, tl, env, i32)
+DEF_HELPER_4(vl8re32_v, void, ptr, tl, env, i32)
+DEF_HELPER_4(vl8re64_v, void, ptr, tl, env, i32)
+DEF_HELPER_4(vs1r_v, void, ptr, tl, env, i32)
+DEF_HELPER_4(vs2r_v, void, ptr, tl, env, i32)
+DEF_HELPER_4(vs4r_v, void, ptr, tl, env, i32)
+DEF_HELPER_4(vs8r_v, void, ptr, tl, env, i32)
+
 DEF_HELPER_6(vamoswapei8_32_v, void, ptr, ptr, tl, ptr, env, i32)
 DEF_HELPER_6(vamoswapei8_64_v, void, ptr, ptr, tl, ptr, env, i32)
 DEF_HELPER_6(vamoswapei16_32_v, void, ptr, ptr, tl, ptr, env, i32)
diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
index f7b9aae844..44d35c0271 100644
--- a/target/riscv/insn32.decode
+++ b/target/riscv/insn32.decode
@@ -278,6 +278,28 @@ vle16ff_v ... 000 . 1 . 101 . 111 @r2_nfvm
 vle32ff_v ... 000 . 1 . 110 . 111 @r2_nfvm
 vle64ff_v ... 000 . 1 . 111 . 111 @r2_nfvm
 
+# Vector whole register insns
+vl1re8_v  000 000 1 01000 . 000 . 111 @r2
+vl1re16_v 000 000 1 01000 . 101 . 111 @r2
+vl1re32_v 000 000 1 01000 . 110 . 111 @r2
+vl1re64_v 000 000 1 01000 . 111 . 111 @r2
+vl2re8_v  001 000 1 01000 . 000 . 111 @r2
+vl2re16_v 001 000 1 01000 . 101 . 111 @r2
+vl2re32_v 001 000 1 01000 . 110 . 111 @r2
+vl2re64_v 001 000 1 01000 . 111 . 111 @r2
+vl4re8_v  011 000 1 01000 . 000 . 111 @r2
+vl4re16_v 011 000 1 01000 . 101 . 111 @r2
+vl4re32_v 011 000 1 01000 . 110 . 111 @r2
+vl4re64_v 011 000 1 01000 . 111 . 111 @r2
+vl8re8_v  111 000 1 01000 . 000 . 111 @r2
+vl8re16_v 111 000 1 01000 . 101 . 111 @r2
+vl8re32_v 111 000 1 01000 . 110 . 111 @r2
+vl8re64_v 111 000 1 01000 . 111 . 111 @r2
+vs1r_v000 000 1 01000 . 000 . 0100111 @r2
+vs2r_v001 000 1 01000 . 000 . 0100111 @r2
+vs4r_v011 000 1 01000 . 000 . 0100111 @r2
+vs8r_v111 000 1 01000 . 000 . 0100111 @r2
+
 #*** Vector AMO operations are encoded under the standard AMO major opcode ***
 vamoswapei8_v   1 . . . . 000 . 010 @r_wdvm
 vamoswapei16_v  1 . . . . 101 . 010 @r_wdvm
diff --git a/target/riscv/insn_trans/trans_rvv.c.inc b/target/riscv/insn_trans/trans_rvv.c.inc
index 8d69956acc..dd59e67ced 100644
--- a/target/riscv/insn_trans/trans_rvv.c.inc
+++ b/target/riscv/insn_trans/trans_rvv.c.inc
@@ -1031,6 +1031,75 @@ GEN_VEXT_TRANS(vle16ff_v, MO_16, r2nfvm, ldff_op, ld_us_check)
 GEN_VEXT_TRANS(vle32ff_v, MO_32, r2nfvm, ldff_op, ld_us_check)
 GEN_VEXT_TRANS(vle64ff_v, MO_64, r2nfvm, ldff_op, ld_us_check)
 
+/*
+ * load and store whole register instructions
+ */
+typedef void gen_helper_ldst_whole(TCGv_ptr, TCGv, TCGv_env, TCGv_i32);
+
+static bool ldst_whole_trans(uint32_t vd, uint32_t rs1, uint32_t nf,
+ gen_helper_ldst_whole *fn, DisasContext *s,
+ bool is_store)
+{
+TCGv_ptr dest;
+TCGv base;
+TCGv_i32 desc;
+
+uint32_t data = FIELD_DP32(0, VDATA, NF, nf);
+dest = tcg_temp_new_ptr();
+base = tcg_temp_new();
+desc = tcg_const_i32(simd_desc(0, s->vlen / 8, data));
+
+gen_get_gpr(base, rs1);
+tcg_gen_addi_ptr(dest, cpu_env, vreg_ofs(s, vd));
+
+fn(dest, base, cpu_env, desc);
+
+tcg_temp_free_ptr(dest);
+tcg_temp_free(base);
+tcg_temp_free_i32(desc);
+if (!is_store) {
+mark_vs_dirt

[RFC v5 36/68] target/riscv: rvv-1.0: floating-point move instruction

2020-09-29 Thread frank . chang
From: Frank Chang 

NaN-box the scalar floating-point register according to RVV 1.0's rules.

Signed-off-by: Frank Chang 
---
 target/riscv/insn_trans/trans_rvv.c.inc | 16 ++--
 1 file changed, 14 insertions(+), 2 deletions(-)
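
Note (not part of the patch): do_nanbox() itself is not shown in this diff, so the sketch only illustrates the general RISC-V NaN-boxing convention, in which a value narrower than the 64-bit register has all upper bits set to 1. The nanbox() function and its sew_bits parameter are hypothetical; the real helper may instead validate an already-boxed input.

```c
#include <stdint.h>

/*
 * Hypothetical illustration: box a sew_bits-wide value into a
 * 64-bit register image by filling the upper bits with ones.
 * A 64-bit value needs no boxing and is returned unchanged.
 */
static uint64_t nanbox(uint64_t val, unsigned sew_bits)
{
    if (sew_bits >= 64) {
        return val;
    }
    return val | (~UINT64_C(0) << sew_bits);
}
```

For instance, the single-precision pattern 0x3f800000 (1.0f) boxes to 0xffffffff3f800000.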

diff --git a/target/riscv/insn_trans/trans_rvv.c.inc b/target/riscv/insn_trans/trans_rvv.c.inc
index 46317ae490..254cface60 100644
--- a/target/riscv/insn_trans/trans_rvv.c.inc
+++ b/target/riscv/insn_trans/trans_rvv.c.inc
@@ -2709,9 +2709,15 @@ static bool trans_vfmv_v_f(DisasContext *s, arg_vfmv_v_f *a)
 require_rvf(s) &&
 vext_check_isa_ill(s) &&
 require_align(a->rd, s->lmul)) {
+TCGv_i64 t1;
+
 if (s->vl_eq_vlmax) {
+t1 = tcg_temp_new_i64();
+/* NaN-box f[rs1] */
+do_nanbox(s, t1, cpu_fpr[a->rs1]);
+
 tcg_gen_gvec_dup_i64(s->sew, vreg_ofs(s, a->rd),
- MAXSZ(s), MAXSZ(s), cpu_fpr[a->rs1]);
+ MAXSZ(s), MAXSZ(s), t1);
 mark_vs_dirty(s);
 } else {
 TCGv_ptr dest;
@@ -2725,16 +2731,22 @@ static bool trans_vfmv_v_f(DisasContext *s, arg_vfmv_v_f *a)
 TCGLabel *over = gen_new_label();
 tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_vl, 0, over);
 
+t1 = tcg_temp_new_i64();
+/* NaN-box f[rs1] */
+do_nanbox(s, t1, cpu_fpr[a->rs1]);
+
 dest = tcg_temp_new_ptr();
 desc = tcg_const_i32(simd_desc(0, s->vlen / 8, data));
 tcg_gen_addi_ptr(dest, cpu_env, vreg_ofs(s, a->rd));
-fns[s->sew - 1](dest, cpu_fpr[a->rs1], cpu_env, desc);
+
+fns[s->sew - 1](dest, t1, cpu_env, desc);
 
 tcg_temp_free_ptr(dest);
 tcg_temp_free_i32(desc);
 mark_vs_dirty(s);
 gen_set_label(over);
 }
+tcg_temp_free_i64(t1);
 return true;
 }
 return false;
-- 
2.17.1




[RFC v5 26/68] target/riscv: rvv-1.0: floating-point square-root instruction

2020-09-29 Thread frank . chang
From: Frank Chang 

Signed-off-by: Frank Chang 
Reviewed-by: Richard Henderson 
---
 target/riscv/insn32.decode | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
index 44d35c0271..6c95a3460a 100644
--- a/target/riscv/insn32.decode
+++ b/target/riscv/insn32.decode
@@ -538,7 +538,7 @@ vfwmsac_vv  10 . . . 001 . 1010111 @r_vm
 vfwmsac_vf  10 . . . 101 . 1010111 @r_vm
 vfwnmsac_vv 11 . . . 001 . 1010111 @r_vm
 vfwnmsac_vf 11 . . . 101 . 1010111 @r_vm
-vfsqrt_v100011 . . 0 001 . 1010111 @r2_vm
+vfsqrt_v010011 . . 0 001 . 1010111 @r2_vm
 vfmin_vv000100 . . . 001 . 1010111 @r_vm
 vfmin_vf000100 . . . 101 . 1010111 @r_vm
 vfmax_vv000110 . . . 001 . 1010111 @r_vm
-- 
2.17.1




[RFC v5 32/68] target/riscv: rvv-1.0: element index instruction

2020-09-29 Thread frank . chang
From: Frank Chang 

Signed-off-by: Frank Chang 
Reviewed-by: Richard Henderson 
---
 target/riscv/insn32.decode | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
index 33b4612a69..c3b42b051c 100644
--- a/target/riscv/insn32.decode
+++ b/target/riscv/insn32.decode
@@ -608,7 +608,7 @@ vmsbf_m 010100 . . 1 010 . 1010111 @r2_vm
 vmsif_m 010100 . . 00011 010 . 1010111 @r2_vm
 vmsof_m 010100 . . 00010 010 . 1010111 @r2_vm
 viota_m 010100 . . 1 010 . 1010111 @r2_vm
-vid_v   010110 . 0 10001 010 . 1010111 @r1_vm
+vid_v   010100 . 0 10001 010 . 1010111 @r1_vm
 vext_x_v001100 1 . . 010 . 1010111 @r
 vmv_s_x 001101 1 0 . 110 . 1010111 @r2
 vfmv_f_s001100 1 . 0 001 . 1010111 @r2rd
-- 
2.17.1



