Re: [Qemu-devel] [PATCH 00/20] target/arm: sve system mode patches

2018-08-08 Thread Laurent Desnogues
Hello,

On Thu, Aug 9, 2018 at 6:21 AM, Richard Henderson wrote:
> This is my current set of patches for running SVE in system mode.
>
> The first half deals with the system registers that affect SVE.
> I recall that Peter has said he'd like the first patch to be
> done a different way, but we haven't had a chance to talk about
> what form it should take.  I've left it as-is since it does what
> I need for now.
>
> The second half re-implements the SVE memory operations.
> The FF and NF loads had been stubbed out.  Getting those to work
> requires some infrastructure that can be reused to speed up normal
> loads -- one guest-to-host tlb lookup can be reused for the rest
> of the page.

I did not review every patch individually, but I tested the whole series
and found no issues.

Tested-by: Laurent Desnogues 

Thanks,

Laurent

>
> r~
>
>
> Based-on: <20180809034033.10579-1-richard.hender...@linaro.org>
> Richard Henderson (20):
>   target/arm: Set ISAR bits for -cpu max
>   target/arm: Set ID_AA64PFR0 bits for SVE for -cpu max
>   target/arm: Define ID_AA64ZFR0_EL1
>   target/arm: Adjust sve_exception_el
>   target/arm: Fix arm_cpu_data_is_big_endian for aa64 user-only
>   target/arm: Fix arm_current_el for user-only
>   target/arm: Fix is_a64 for user-only
>   target/arm: Pass in current_el to fp and sve_exception_el
>   target/arm: Handle SVE vector length changes in system mode
>   target/arm: Adjust aarch64_cpu_dump_state for system mode SVE
>   target/arm: Clear unused predicate bits for LD1RQ
>   target/arm: Rewrite helper_sve_ld1*_r using pages
>   target/arm: Rewrite helper_sve_ld[234]*_r
>   target/arm: Rewrite helper_sve_st[1234]*_r
>   target/arm: Split contiguous loads for endianness
>   target/arm: Split contiguous stores for endianness
>   target/arm: Rewrite vector gather loads
>   target/arm: Rewrite vector gather stores
>   target/arm: Rewrite vector gather first-fault loads
>   target/arm: Pass TCGMemOpIdx to sve memory helpers
>
>  target/arm/cpu.h   |   47 +-
>  target/arm/helper-sve.h|  385 +--
>  target/arm/internals.h |5 +
>  target/arm/cpu.c   |   24 +-
>  target/arm/cpu64.c |   93 +-
>  target/arm/helper.c|  237 +++--
>  target/arm/op_helper.c |1 +
>  target/arm/sve_helper.c| 2062 +---
>  target/arm/translate-a64.c |8 +-
>  target/arm/translate-sve.c |  670 
>  10 files changed, 2453 insertions(+), 1079 deletions(-)
>
> --
> 2.17.1
>



Re: [Qemu-devel] [PATCH 08/11] target/arm: Fix offset scaling for LD_zprr and ST_zprr

2018-08-08 Thread Laurent Desnogues
On Thu, Aug 9, 2018 at 5:40 AM, Richard Henderson wrote:
> The scaling should be solely on the memory operation size; the number
> of registers being loaded does not come into the initial computation.
>
> Cc: qemu-sta...@nongnu.org (3.0.1)
> Reported-by: Laurent Desnogues 
> Signed-off-by: Richard Henderson 

Tested-by: Laurent Desnogues 
Reviewed-by: Laurent Desnogues 

Laurent

> ---
>  target/arm/translate-sve.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
> index f635822a61..d27bc8c946 100644
> --- a/target/arm/translate-sve.c
> +++ b/target/arm/translate-sve.c
> @@ -4665,8 +4665,7 @@ static bool trans_LD_zprr(DisasContext *s, arg_rprr_load *a, uint32_t insn)
>      }
>      if (sve_access_check(s)) {
>          TCGv_i64 addr = new_tmp_a64(s);
> -        tcg_gen_muli_i64(addr, cpu_reg(s, a->rm),
> -                         (a->nreg + 1) << dtype_msz(a->dtype));
> +        tcg_gen_shli_i64(addr, cpu_reg(s, a->rm), dtype_msz(a->dtype));
>          tcg_gen_add_i64(addr, addr, cpu_reg_sp(s, a->rn));
>          do_ld_zpa(s, a->rd, a->pg, addr, a->dtype, a->nreg);
>      }
> @@ -4899,7 +4898,7 @@ static bool trans_ST_zprr(DisasContext *s, arg_rprr_store *a, uint32_t insn)
>      }
>      if (sve_access_check(s)) {
>          TCGv_i64 addr = new_tmp_a64(s);
> -        tcg_gen_muli_i64(addr, cpu_reg(s, a->rm), (a->nreg + 1) << a->msz);
> +        tcg_gen_shli_i64(addr, cpu_reg(s, a->rm), a->msz);
>          tcg_gen_add_i64(addr, addr, cpu_reg_sp(s, a->rn));
>          do_st_zpa(s, a->rd, a->pg, addr, a->msz, a->esz, a->nreg);
>      }
> --
> 2.17.1
>
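The scaling change can be checked with plain integers. The following is only a sketch, not QEMU code: `zprr_addr_old` and `zprr_addr_new` are hypothetical names that mirror the arithmetic of the removed `tcg_gen_muli_i64` and the added `tcg_gen_shli_i64` lines, assuming `msz` is the log2 of the memory operation size and `nreg` is the register count minus one.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the TCG ops in the diff; not QEMU API. */

/* Old computation: index scaled by (nreg + 1) << msz. */
static uint64_t zprr_addr_old(uint64_t base, uint64_t index,
                              unsigned nreg, unsigned msz)
{
    assert(msz <= 3 && nreg <= 3);
    return base + index * ((uint64_t)(nreg + 1) << msz);
}

/* Fixed computation: index scaled by the memory operation size only. */
static uint64_t zprr_addr_new(uint64_t base, uint64_t index, unsigned msz)
{
    assert(msz <= 3);
    return base + (index << msz);
}
```

With a single register (`nreg == 0`) the two agree, which would explain why only the multi-register forms were affected.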



Re: [Qemu-devel] [PATCH v2 05/22] check: Only test usb-xhci-nec when it is compiled in

2018-08-08 Thread Thomas Huth
On 08/08/2018 07:02 PM, Juan Quintela wrote:
> Thomas Huth wrote:
[...]
> I didn't want to go "further", but I think that what we should have here
> is something like:
> 
> check-qtest-$(CONFIG_USB_XHCI_NEC) += tests/usb-hcd-xhci-test$(EXESUF)
> gcov-files-$(CONFIG_USB_XHCI) += hw/usb/hcd-xhci.c
> 
> and remove the arch-specific bits.  If an arch doesn't support it, we
> now have the CONFIG_USB_XHCI bits to not _enable_ it there.
> 
> What do you think?
> 
> Thanks, Juan.

If "make check" passes with that change, that sounds like the right
solution to me, too.

 Thomas



Re: [Qemu-devel] [PATCH 07/11] target/arm: Fix offset for LD1R instructions

2018-08-08 Thread Laurent Desnogues
On Thu, Aug 9, 2018 at 5:40 AM, Richard Henderson wrote:
> The immediate should be scaled by the size of the memory reference,
> not the size of the elements into which it is loaded.
>
> Cc: qemu-sta...@nongnu.org (3.0.1)
> Reported-by: Laurent Desnogues 
> Signed-off-by: Richard Henderson 

Tested-by: Laurent Desnogues 
Reviewed-by: Laurent Desnogues 

Laurent

> ---
>  target/arm/translate-sve.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
> index 9e63b5f8e5..f635822a61 100644
> --- a/target/arm/translate-sve.c
> +++ b/target/arm/translate-sve.c
> @@ -4819,6 +4819,7 @@ static bool trans_LD1R_zpri(DisasContext *s, arg_rpri_load *a, uint32_t insn)
>      unsigned vsz = vec_full_reg_size(s);
>      unsigned psz = pred_full_reg_size(s);
>      unsigned esz = dtype_esz[a->dtype];
> +    unsigned msz = dtype_msz(a->dtype);
>      TCGLabel *over = gen_new_label();
>      TCGv_i64 temp;
>
> @@ -4842,7 +4843,7 @@ static bool trans_LD1R_zpri(DisasContext *s, arg_rpri_load *a, uint32_t insn)
>
>      /* Load the data.  */
>      temp = tcg_temp_new_i64();
> -    tcg_gen_addi_i64(temp, cpu_reg_sp(s, a->rn), a->imm << esz);
> +    tcg_gen_addi_i64(temp, cpu_reg_sp(s, a->rn), a->imm << msz);
>      tcg_gen_qemu_ld_i64(temp, temp, get_mem_index(s),
>                          s->be_data | dtype_mop[a->dtype]);
>
> --
> 2.17.1
>
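The difference between the two shift amounts shows up whenever the memory size and the element size differ, for example a byte load replicated into 64-bit elements. A small sketch, assuming both sizes are log2 byte counts; `ld1r_addr` is a hypothetical helper, not a QEMU function:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: base plus an unsigned immediate scaled by a
 * log2 size.  Passing msz gives the fixed behaviour; passing esz
 * reproduces the old one.
 */
static uint64_t ld1r_addr(uint64_t base, unsigned imm, unsigned log2_size)
{
    assert(log2_size <= 3);
    return base + ((uint64_t)imm << log2_size);
}
```

For a byte in memory (`msz == 0`) loaded into doubleword elements (`esz == 3`), an immediate of 5 should address `base + 5`; scaling by `esz` would instead give `base + 40`.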



Re: [Qemu-devel] [PATCH 06/11] target/arm: Fix sign-extension in sve do_ldr/do_str

2018-08-08 Thread Laurent Desnogues
On Thu, Aug 9, 2018 at 5:40 AM, Richard Henderson wrote:
> The expression (int) imm + (uint32_t) len_align turns into uint32_t
> and thus with negative imm produces a memory operation at the wrong
> offset.  None of the numbers involved are particularly large, so
> change everything to use int.
>
> Cc: qemu-sta...@nongnu.org (3.0.1)
> Reported-by: Laurent Desnogues 
> Signed-off-by: Richard Henderson 

Tested-by: Laurent Desnogues 
Reviewed-by: Laurent Desnogues 

Laurent

> ---
>  target/arm/translate-sve.c | 18 --
>  1 file changed, 8 insertions(+), 10 deletions(-)
>
> diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
> index 89efc80ee7..9e63b5f8e5 100644
> --- a/target/arm/translate-sve.c
> +++ b/target/arm/translate-sve.c
> @@ -4372,12 +4372,11 @@ static bool trans_UCVTF_dd(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
>   * The load should begin at the address Rn + IMM.
>   */
>
> -static void do_ldr(DisasContext *s, uint32_t vofs, uint32_t len,
> -                   int rn, int imm)
> +static void do_ldr(DisasContext *s, uint32_t vofs, int len, int rn, int imm)
>  {
> -    uint32_t len_align = QEMU_ALIGN_DOWN(len, 8);
> -    uint32_t len_remain = len % 8;
> -    uint32_t nparts = len / 8 + ctpop8(len_remain);
> +    int len_align = QEMU_ALIGN_DOWN(len, 8);
> +    int len_remain = len % 8;
> +    int nparts = len / 8 + ctpop8(len_remain);
>      int midx = get_mem_index(s);
>      TCGv_i64 addr, t0, t1;
>
> @@ -4458,12 +4457,11 @@ static void do_ldr(DisasContext *s, uint32_t vofs, uint32_t len,
>  }
>
>  /* Similarly for stores.  */
> -static void do_str(DisasContext *s, uint32_t vofs, uint32_t len,
> -                   int rn, int imm)
> +static void do_str(DisasContext *s, uint32_t vofs, int len, int rn, int imm)
>  {
> -    uint32_t len_align = QEMU_ALIGN_DOWN(len, 8);
> -    uint32_t len_remain = len % 8;
> -    uint32_t nparts = len / 8 + ctpop8(len_remain);
> +    int len_align = QEMU_ALIGN_DOWN(len, 8);
> +    int len_remain = len % 8;
> +    int nparts = len / 8 + ctpop8(len_remain);
>      int midx = get_mem_index(s);
>      TCGv_i64 addr, t0;
>
> --
> 2.17.1
>
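The conversion problem described in the commit message can be reproduced outside QEMU. This is a minimal sketch assuming a 32-bit `int` and a sum that is widened to a 64-bit offset afterwards; `offset_mixed` and `offset_int` are illustrative names only.

```c
#include <assert.h>
#include <stdint.h>

static_assert(sizeof(int) == 4, "sketch assumes 32-bit int");

/*
 * Mixed int + uint32_t: the usual arithmetic conversions make the sum
 * uint32_t, so a negative imm wraps and is then zero-extended.
 */
static int64_t offset_mixed(int imm, uint32_t len_align)
{
    return imm + len_align;
}

/* All-int arithmetic: the sum stays signed and sign-extends. */
static int64_t offset_int(int imm, int len_align)
{
    return imm + len_align;
}
```

With `imm = -16` and `len_align = 8`, the mixed version yields 4294967288 (0xFFFFFFF8) where -8 was intended; the all-int version yields -8.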



[Qemu-devel] [PATCH 19/20] target/arm: Rewrite vector gather first-fault loads

2018-08-08 Thread Richard Henderson
This implements the feature for softmmu, and moves the
main loop out of a macro and into a function.
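For readers unfamiliar with first-fault loads, here is a toy model of the behaviour as I understand it from the architecture: a fault on the first active element is reported normally, while a fault on a later element is suppressed and recorded by clearing FFR bits from that element onwards. Everything here (the flat `mem` array, `probe`, `gather_first_fault`, the return codes) is hypothetical and much simpler than the helpers in this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum { FF_OK = 0, FF_FAULT = -1 };

/* Toy memory model: addresses below mem_size are mapped. */
static bool probe(uint64_t addr, size_t mem_size)
{
    return addr < mem_size;
}

static int gather_first_fault(uint8_t *dst, bool *ffr, const bool *pred,
                              const uint64_t *addrs, size_t n,
                              const uint8_t *mem, size_t mem_size)
{
    bool first = true;

    assert(dst && ffr && pred && addrs && mem);
    for (size_t i = 0; i < n; i++) {
        if (!pred[i]) {
            dst[i] = 0;             /* inactive elements are zeroed */
            continue;
        }
        if (!probe(addrs[i], mem_size)) {
            if (first) {
                return FF_FAULT;    /* first active element: real fault */
            }
            /* Later element: suppress the fault, clear FFR, stop. */
            for (size_t j = i; j < n; j++) {
                ffr[j] = false;
            }
            return FF_OK;           /* remaining dst left untouched here */
        }
        dst[i] = mem[addrs[i]];
        first = false;
    }
    return FF_OK;
}
```

A caller would then inspect `ffr` to find out how many elements were actually loaded.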

Signed-off-by: Richard Henderson 
---
 target/arm/helper-sve.h|  84 ---
 target/arm/sve_helper.c| 290 +++--
 target/arm/translate-sve.c |  84 +--
 3 files changed, 321 insertions(+), 137 deletions(-)

diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index 6b9b93af45..9e79182ab4 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -1401,69 +1401,111 @@ DEF_HELPER_FLAGS_6(sve_ldsds_be_zd, TCG_CALL_NO_WG,
 
 DEF_HELPER_FLAGS_6(sve_ldffbsu_zsu, TCG_CALL_NO_WG,
void, env, ptr, ptr, ptr, tl, i32)
-DEF_HELPER_FLAGS_6(sve_ldffhsu_zsu, TCG_CALL_NO_WG,
+DEF_HELPER_FLAGS_6(sve_ldffhsu_le_zsu, TCG_CALL_NO_WG,
void, env, ptr, ptr, ptr, tl, i32)
-DEF_HELPER_FLAGS_6(sve_ldffssu_zsu, TCG_CALL_NO_WG,
+DEF_HELPER_FLAGS_6(sve_ldffhsu_be_zsu, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffss_le_zsu, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffss_be_zsu, TCG_CALL_NO_WG,
void, env, ptr, ptr, ptr, tl, i32)
 DEF_HELPER_FLAGS_6(sve_ldffbss_zsu, TCG_CALL_NO_WG,
void, env, ptr, ptr, ptr, tl, i32)
-DEF_HELPER_FLAGS_6(sve_ldffhss_zsu, TCG_CALL_NO_WG,
+DEF_HELPER_FLAGS_6(sve_ldffhss_le_zsu, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffhss_be_zsu, TCG_CALL_NO_WG,
void, env, ptr, ptr, ptr, tl, i32)
 
 DEF_HELPER_FLAGS_6(sve_ldffbsu_zss, TCG_CALL_NO_WG,
void, env, ptr, ptr, ptr, tl, i32)
-DEF_HELPER_FLAGS_6(sve_ldffhsu_zss, TCG_CALL_NO_WG,
+DEF_HELPER_FLAGS_6(sve_ldffhsu_le_zss, TCG_CALL_NO_WG,
void, env, ptr, ptr, ptr, tl, i32)
-DEF_HELPER_FLAGS_6(sve_ldffssu_zss, TCG_CALL_NO_WG,
+DEF_HELPER_FLAGS_6(sve_ldffhsu_be_zss, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffss_le_zss, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffss_be_zss, TCG_CALL_NO_WG,
void, env, ptr, ptr, ptr, tl, i32)
 DEF_HELPER_FLAGS_6(sve_ldffbss_zss, TCG_CALL_NO_WG,
void, env, ptr, ptr, ptr, tl, i32)
-DEF_HELPER_FLAGS_6(sve_ldffhss_zss, TCG_CALL_NO_WG,
+DEF_HELPER_FLAGS_6(sve_ldffhss_le_zss, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffhss_be_zss, TCG_CALL_NO_WG,
void, env, ptr, ptr, ptr, tl, i32)
 
 DEF_HELPER_FLAGS_6(sve_ldffbdu_zsu, TCG_CALL_NO_WG,
void, env, ptr, ptr, ptr, tl, i32)
-DEF_HELPER_FLAGS_6(sve_ldffhdu_zsu, TCG_CALL_NO_WG,
+DEF_HELPER_FLAGS_6(sve_ldffhdu_le_zsu, TCG_CALL_NO_WG,
void, env, ptr, ptr, ptr, tl, i32)
-DEF_HELPER_FLAGS_6(sve_ldffsdu_zsu, TCG_CALL_NO_WG,
+DEF_HELPER_FLAGS_6(sve_ldffhdu_be_zsu, TCG_CALL_NO_WG,
void, env, ptr, ptr, ptr, tl, i32)
-DEF_HELPER_FLAGS_6(sve_ldffddu_zsu, TCG_CALL_NO_WG,
+DEF_HELPER_FLAGS_6(sve_ldffsdu_le_zsu, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffsdu_be_zsu, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffdd_le_zsu, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffdd_be_zsu, TCG_CALL_NO_WG,
void, env, ptr, ptr, ptr, tl, i32)
 DEF_HELPER_FLAGS_6(sve_ldffbds_zsu, TCG_CALL_NO_WG,
void, env, ptr, ptr, ptr, tl, i32)
-DEF_HELPER_FLAGS_6(sve_ldffhds_zsu, TCG_CALL_NO_WG,
+DEF_HELPER_FLAGS_6(sve_ldffhds_le_zsu, TCG_CALL_NO_WG,
void, env, ptr, ptr, ptr, tl, i32)
-DEF_HELPER_FLAGS_6(sve_ldffsds_zsu, TCG_CALL_NO_WG,
+DEF_HELPER_FLAGS_6(sve_ldffhds_be_zsu, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffsds_le_zsu, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffsds_be_zsu, TCG_CALL_NO_WG,
void, env, ptr, ptr, ptr, tl, i32)
 
 DEF_HELPER_FLAGS_6(sve_ldffbdu_zss, TCG_CALL_NO_WG,
void, env, ptr, ptr, ptr, tl, i32)
-DEF_HELPER_FLAGS_6(sve_ldffhdu_zss, TCG_CALL_NO_WG,
+DEF_HELPER_FLAGS_6(sve_ldffhdu_le_zss, TCG_CALL_NO_WG,
void, env, ptr, ptr, ptr, tl, i32)
-DEF_HELPER_FLAGS_6(sve_ldffsdu_zss, TCG_CALL_NO_WG,
+DEF_HELPER_FLAGS_6(sve_ldffhdu_be_zss, TCG_CALL_NO_WG,
void, env, ptr, ptr, ptr, tl, i32)
-DEF_HELPER_FLAGS_6(sve_ldffddu_zss, TCG_CALL_NO_WG,
+DEF_HELPER_FLAGS_6(sve_ldffsdu_le_zss, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffsdu_be_zss, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)

[Qemu-devel] [PATCH 17/20] target/arm: Rewrite vector gather loads

2018-08-08 Thread Richard Henderson
This fixes the endianness problem for softmmu, and moves
the main loop out of a macro and into an inlined function.
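The macro-to-function restructuring can be pictured as follows. This is a generic sketch of the pattern, not the code in this patch: the loop is written once as an inline function taking a per-element callback, and each load type gets a thin wrapper. All names (`elem_load_fn`, `gather_loop`, `gather_u8`, `gather_u16_le`) and the flat `mem` array are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Per-element load callback (hypothetical; stands in for a guest load). */
typedef uint64_t elem_load_fn(const uint8_t *mem, uint64_t addr);

static uint64_t load_u8(const uint8_t *mem, uint64_t addr)
{
    return mem[addr];
}

static uint64_t load_u16_le(const uint8_t *mem, uint64_t addr)
{
    return (uint64_t)mem[addr] | ((uint64_t)mem[addr + 1] << 8);
}

/* The shared main loop, written once as a function instead of a macro. */
static inline void gather_loop(uint64_t *dst, const uint8_t *pred,
                               const uint64_t *offsets, size_t n,
                               const uint8_t *mem, uint64_t base,
                               elem_load_fn *fn)
{
    assert(fn != NULL);
    for (size_t i = 0; i < n; i++) {
        /* Inactive elements are zeroed in this sketch. */
        dst[i] = pred[i] ? fn(mem, base + offsets[i]) : 0;
    }
}

/* Thin per-type wrappers; the compiler can inline loop and callback. */
static void gather_u8(uint64_t *dst, const uint8_t *pred,
                      const uint64_t *offsets, size_t n,
                      const uint8_t *mem, uint64_t base)
{
    gather_loop(dst, pred, offsets, n, mem, base, load_u8);
}

static void gather_u16_le(uint64_t *dst, const uint8_t *pred,
                          const uint64_t *offsets, size_t n,
                          const uint8_t *mem, uint64_t base)
{
    gather_loop(dst, pred, offsets, n, mem, base, load_u16_le);
}
```

Compared with a macro, the loop body is type-checked once and can be stepped through in a debugger, while inlining should keep the generated code comparable.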

Signed-off-by: Richard Henderson 
---
 target/arm/helper-sve.h|  84 +
 target/arm/sve_helper.c| 218 +++--
 target/arm/translate-sve.c | 244 +
 3 files changed, 380 insertions(+), 166 deletions(-)

diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index 1ad043101a..49d1c09e30 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -1292,69 +1292,111 @@ DEF_HELPER_FLAGS_4(sve_st1sd_be_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
 
 DEF_HELPER_FLAGS_6(sve_ldbsu_zsu, TCG_CALL_NO_WG,
void, env, ptr, ptr, ptr, tl, i32)
-DEF_HELPER_FLAGS_6(sve_ldhsu_zsu, TCG_CALL_NO_WG,
+DEF_HELPER_FLAGS_6(sve_ldhsu_le_zsu, TCG_CALL_NO_WG,
void, env, ptr, ptr, ptr, tl, i32)
-DEF_HELPER_FLAGS_6(sve_ldssu_zsu, TCG_CALL_NO_WG,
+DEF_HELPER_FLAGS_6(sve_ldhsu_be_zsu, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldss_le_zsu, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldss_be_zsu, TCG_CALL_NO_WG,
void, env, ptr, ptr, ptr, tl, i32)
 DEF_HELPER_FLAGS_6(sve_ldbss_zsu, TCG_CALL_NO_WG,
void, env, ptr, ptr, ptr, tl, i32)
-DEF_HELPER_FLAGS_6(sve_ldhss_zsu, TCG_CALL_NO_WG,
+DEF_HELPER_FLAGS_6(sve_ldhss_le_zsu, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldhss_be_zsu, TCG_CALL_NO_WG,
void, env, ptr, ptr, ptr, tl, i32)
 
 DEF_HELPER_FLAGS_6(sve_ldbsu_zss, TCG_CALL_NO_WG,
void, env, ptr, ptr, ptr, tl, i32)
-DEF_HELPER_FLAGS_6(sve_ldhsu_zss, TCG_CALL_NO_WG,
+DEF_HELPER_FLAGS_6(sve_ldhsu_le_zss, TCG_CALL_NO_WG,
void, env, ptr, ptr, ptr, tl, i32)
-DEF_HELPER_FLAGS_6(sve_ldssu_zss, TCG_CALL_NO_WG,
+DEF_HELPER_FLAGS_6(sve_ldhsu_be_zss, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldss_le_zss, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldss_be_zss, TCG_CALL_NO_WG,
void, env, ptr, ptr, ptr, tl, i32)
 DEF_HELPER_FLAGS_6(sve_ldbss_zss, TCG_CALL_NO_WG,
void, env, ptr, ptr, ptr, tl, i32)
-DEF_HELPER_FLAGS_6(sve_ldhss_zss, TCG_CALL_NO_WG,
+DEF_HELPER_FLAGS_6(sve_ldhss_le_zss, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldhss_be_zss, TCG_CALL_NO_WG,
void, env, ptr, ptr, ptr, tl, i32)
 
 DEF_HELPER_FLAGS_6(sve_ldbdu_zsu, TCG_CALL_NO_WG,
void, env, ptr, ptr, ptr, tl, i32)
-DEF_HELPER_FLAGS_6(sve_ldhdu_zsu, TCG_CALL_NO_WG,
+DEF_HELPER_FLAGS_6(sve_ldhdu_le_zsu, TCG_CALL_NO_WG,
void, env, ptr, ptr, ptr, tl, i32)
-DEF_HELPER_FLAGS_6(sve_ldsdu_zsu, TCG_CALL_NO_WG,
+DEF_HELPER_FLAGS_6(sve_ldhdu_be_zsu, TCG_CALL_NO_WG,
void, env, ptr, ptr, ptr, tl, i32)
-DEF_HELPER_FLAGS_6(sve_ldddu_zsu, TCG_CALL_NO_WG,
+DEF_HELPER_FLAGS_6(sve_ldsdu_le_zsu, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldsdu_be_zsu, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_lddd_le_zsu, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_lddd_be_zsu, TCG_CALL_NO_WG,
void, env, ptr, ptr, ptr, tl, i32)
 DEF_HELPER_FLAGS_6(sve_ldbds_zsu, TCG_CALL_NO_WG,
void, env, ptr, ptr, ptr, tl, i32)
-DEF_HELPER_FLAGS_6(sve_ldhds_zsu, TCG_CALL_NO_WG,
+DEF_HELPER_FLAGS_6(sve_ldhds_le_zsu, TCG_CALL_NO_WG,
void, env, ptr, ptr, ptr, tl, i32)
-DEF_HELPER_FLAGS_6(sve_ldsds_zsu, TCG_CALL_NO_WG,
+DEF_HELPER_FLAGS_6(sve_ldhds_be_zsu, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldsds_le_zsu, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldsds_be_zsu, TCG_CALL_NO_WG,
void, env, ptr, ptr, ptr, tl, i32)
 
 DEF_HELPER_FLAGS_6(sve_ldbdu_zss, TCG_CALL_NO_WG,
void, env, ptr, ptr, ptr, tl, i32)
-DEF_HELPER_FLAGS_6(sve_ldhdu_zss, TCG_CALL_NO_WG,
+DEF_HELPER_FLAGS_6(sve_ldhdu_le_zss, TCG_CALL_NO_WG,
void, env, ptr, ptr, ptr, tl, i32)
-DEF_HELPER_FLAGS_6(sve_ldsdu_zss, TCG_CALL_NO_WG,
+DEF_HELPER_FLAGS_6(sve_ldhdu_be_zss, TCG_CALL_NO_WG,
void, env, ptr, ptr, ptr, tl, i32)
-DEF_HELPER_FLAGS_6(sve_ldddu_zss, TCG_CALL_NO_WG,
+DEF_HELPER_FLAGS_6(sve_ldsdu_le_zss, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldsdu_be_zss, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_lddd_le_zss, 

[Qemu-devel] [PATCH 18/20] target/arm: Rewrite vector gather stores

2018-08-08 Thread Richard Henderson
This fixes the endianness problem for softmmu, and moves
the main loop out of a macro and into an inlined function.

Signed-off-by: Richard Henderson 
---
 target/arm/helper-sve.h|  52 ++
 target/arm/sve_helper.c| 139 -
 target/arm/translate-sve.c |  74 +---
 3 files changed, 177 insertions(+), 88 deletions(-)

diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index 49d1c09e30..6b9b93af45 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -1468,41 +1468,67 @@ DEF_HELPER_FLAGS_6(sve_ldffsds_zd, TCG_CALL_NO_WG,
 
 DEF_HELPER_FLAGS_6(sve_stbs_zsu, TCG_CALL_NO_WG,
void, env, ptr, ptr, ptr, tl, i32)
-DEF_HELPER_FLAGS_6(sve_sths_zsu, TCG_CALL_NO_WG,
+DEF_HELPER_FLAGS_6(sve_sths_le_zsu, TCG_CALL_NO_WG,
void, env, ptr, ptr, ptr, tl, i32)
-DEF_HELPER_FLAGS_6(sve_stss_zsu, TCG_CALL_NO_WG,
+DEF_HELPER_FLAGS_6(sve_sths_be_zsu, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_stss_le_zsu, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_stss_be_zsu, TCG_CALL_NO_WG,
void, env, ptr, ptr, ptr, tl, i32)
 
 DEF_HELPER_FLAGS_6(sve_stbs_zss, TCG_CALL_NO_WG,
void, env, ptr, ptr, ptr, tl, i32)
-DEF_HELPER_FLAGS_6(sve_sths_zss, TCG_CALL_NO_WG,
+DEF_HELPER_FLAGS_6(sve_sths_le_zss, TCG_CALL_NO_WG,
void, env, ptr, ptr, ptr, tl, i32)
-DEF_HELPER_FLAGS_6(sve_stss_zss, TCG_CALL_NO_WG,
+DEF_HELPER_FLAGS_6(sve_sths_be_zss, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_stss_le_zss, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_stss_be_zss, TCG_CALL_NO_WG,
void, env, ptr, ptr, ptr, tl, i32)
 
 DEF_HELPER_FLAGS_6(sve_stbd_zsu, TCG_CALL_NO_WG,
void, env, ptr, ptr, ptr, tl, i32)
-DEF_HELPER_FLAGS_6(sve_sthd_zsu, TCG_CALL_NO_WG,
+DEF_HELPER_FLAGS_6(sve_sthd_le_zsu, TCG_CALL_NO_WG,
void, env, ptr, ptr, ptr, tl, i32)
-DEF_HELPER_FLAGS_6(sve_stsd_zsu, TCG_CALL_NO_WG,
+DEF_HELPER_FLAGS_6(sve_sthd_be_zsu, TCG_CALL_NO_WG,
void, env, ptr, ptr, ptr, tl, i32)
-DEF_HELPER_FLAGS_6(sve_stdd_zsu, TCG_CALL_NO_WG,
+DEF_HELPER_FLAGS_6(sve_stsd_le_zsu, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_stsd_be_zsu, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_stdd_le_zsu, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_stdd_be_zsu, TCG_CALL_NO_WG,
void, env, ptr, ptr, ptr, tl, i32)
 
 DEF_HELPER_FLAGS_6(sve_stbd_zss, TCG_CALL_NO_WG,
void, env, ptr, ptr, ptr, tl, i32)
-DEF_HELPER_FLAGS_6(sve_sthd_zss, TCG_CALL_NO_WG,
+DEF_HELPER_FLAGS_6(sve_sthd_le_zss, TCG_CALL_NO_WG,
void, env, ptr, ptr, ptr, tl, i32)
-DEF_HELPER_FLAGS_6(sve_stsd_zss, TCG_CALL_NO_WG,
+DEF_HELPER_FLAGS_6(sve_sthd_be_zss, TCG_CALL_NO_WG,
void, env, ptr, ptr, ptr, tl, i32)
-DEF_HELPER_FLAGS_6(sve_stdd_zss, TCG_CALL_NO_WG,
+DEF_HELPER_FLAGS_6(sve_stsd_le_zss, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_stsd_be_zss, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_stdd_le_zss, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_stdd_be_zss, TCG_CALL_NO_WG,
void, env, ptr, ptr, ptr, tl, i32)
 
 DEF_HELPER_FLAGS_6(sve_stbd_zd, TCG_CALL_NO_WG,
void, env, ptr, ptr, ptr, tl, i32)
-DEF_HELPER_FLAGS_6(sve_sthd_zd, TCG_CALL_NO_WG,
+DEF_HELPER_FLAGS_6(sve_sthd_le_zd, TCG_CALL_NO_WG,
void, env, ptr, ptr, ptr, tl, i32)
-DEF_HELPER_FLAGS_6(sve_stsd_zd, TCG_CALL_NO_WG,
+DEF_HELPER_FLAGS_6(sve_sthd_be_zd, TCG_CALL_NO_WG,
void, env, ptr, ptr, ptr, tl, i32)
-DEF_HELPER_FLAGS_6(sve_stdd_zd, TCG_CALL_NO_WG,
+DEF_HELPER_FLAGS_6(sve_stsd_le_zd, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_stsd_be_zd, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_stdd_le_zd, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_stdd_be_zd, TCG_CALL_NO_WG,
void, env, ptr, ptr, ptr, tl, i32)
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index 76d3f021e4..0a4756bff9 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -5235,61 +5235,100 @@ DO_LDFF1_ZPZ_D(sve_ldffsds_zd, uint64_t, int32_t, cpu_ldl_data_ra)
 
 /* Stores with a vector index.  */
 
-#define DO_ST1_ZPZ_S(NAME, TYPEI, FN)   \
-void 

[Qemu-devel] [PATCH 15/20] target/arm: Split contiguous loads for endianness

2018-08-08 Thread Richard Henderson
We can choose the endianness at translation time, rather than
re-computing it at execution time.
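As a generic illustration of this idea (not the actual helper tables in translate-sve.c), the endianness test can be resolved once by indexing a function table, so the per-element loop carries no branch on byte order. The names `load16_fn`, `select_load16` and `load_vector16` are made up for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint16_t load16_fn(const uint8_t *p);

static uint16_t load16_le(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint16_t load16_be(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/* Picked once, at the point corresponding to translation time. */
static load16_fn *select_load16(bool big_endian)
{
    static load16_fn * const fns[2] = { load16_le, load16_be };
    return fns[big_endian];
}

/* Executed many times; no endianness test inside the loop. */
static void load_vector16(load16_fn *fn, uint16_t *dst,
                          const uint8_t *src, size_t n)
{
    assert(fn != NULL);
    for (size_t i = 0; i < n; i++) {
        dst[i] = fn(src + 2 * i);
    }
}
```

The cost is one helper per byte order, which matches the `_le` and `_be` pairs in the diff below.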

Signed-off-by: Richard Henderson 
---
 target/arm/helper-sve.h| 117 +++---
 target/arm/sve_helper.c|  70 ++---
 target/arm/translate-sve.c | 196 +
 3 files changed, 252 insertions(+), 131 deletions(-)

diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index 023952a9a4..526caec8da 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -1128,20 +1128,35 @@ DEF_HELPER_FLAGS_4(sve_ld2bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
 DEF_HELPER_FLAGS_4(sve_ld3bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
 DEF_HELPER_FLAGS_4(sve_ld4bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
 
-DEF_HELPER_FLAGS_4(sve_ld1hh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
-DEF_HELPER_FLAGS_4(sve_ld2hh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
-DEF_HELPER_FLAGS_4(sve_ld3hh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
-DEF_HELPER_FLAGS_4(sve_ld4hh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld1hh_le_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld2hh_le_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld3hh_le_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld4hh_le_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
 
-DEF_HELPER_FLAGS_4(sve_ld1ss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
-DEF_HELPER_FLAGS_4(sve_ld2ss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
-DEF_HELPER_FLAGS_4(sve_ld3ss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
-DEF_HELPER_FLAGS_4(sve_ld4ss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld1hh_be_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld2hh_be_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld3hh_be_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld4hh_be_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
 
-DEF_HELPER_FLAGS_4(sve_ld1dd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
-DEF_HELPER_FLAGS_4(sve_ld2dd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
-DEF_HELPER_FLAGS_4(sve_ld3dd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
-DEF_HELPER_FLAGS_4(sve_ld4dd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld1ss_le_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld2ss_le_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld3ss_le_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld4ss_le_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_4(sve_ld1ss_be_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld2ss_be_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld3ss_be_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld4ss_be_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_4(sve_ld1dd_le_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld2dd_le_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld3dd_le_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld4dd_le_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_4(sve_ld1dd_be_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld2dd_be_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld3dd_be_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld4dd_be_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
 
 DEF_HELPER_FLAGS_4(sve_ld1bhu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
 DEF_HELPER_FLAGS_4(sve_ld1bsu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
@@ -1150,13 +1165,21 @@ DEF_HELPER_FLAGS_4(sve_ld1bhs_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
 DEF_HELPER_FLAGS_4(sve_ld1bss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
 DEF_HELPER_FLAGS_4(sve_ld1bds_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
 
-DEF_HELPER_FLAGS_4(sve_ld1hsu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
-DEF_HELPER_FLAGS_4(sve_ld1hdu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
-DEF_HELPER_FLAGS_4(sve_ld1hss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
-DEF_HELPER_FLAGS_4(sve_ld1hds_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld1hsu_le_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld1hdu_le_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld1hss_le_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld1hds_le_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
 
-DEF_HELPER_FLAGS_4(sve_ld1sdu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
-DEF_HELPER_FLAGS_4(sve_ld1sds_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld1hsu_be_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld1hdu_be_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld1hss_be_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld1hds_be_r, TCG_CALL_NO_WG, void, env, ptr, tl, 

[Qemu-devel] [PATCH 16/20] target/arm: Split contiguous stores for endianness

2018-08-08 Thread Richard Henderson
We can choose the endianness at translation time, rather than
re-computing it at execution time.

Signed-off-by: Richard Henderson 
---
 target/arm/helper-sve.h| 48 +
 target/arm/sve_helper.c| 11 --
 target/arm/translate-sve.c | 72 +-
 3 files changed, 96 insertions(+), 35 deletions(-)

diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index 526caec8da..1ad043101a 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -1248,29 +1248,47 @@ DEF_HELPER_FLAGS_4(sve_st2bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
 DEF_HELPER_FLAGS_4(sve_st3bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
 DEF_HELPER_FLAGS_4(sve_st4bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
 
-DEF_HELPER_FLAGS_4(sve_st1hh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
-DEF_HELPER_FLAGS_4(sve_st2hh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
-DEF_HELPER_FLAGS_4(sve_st3hh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
-DEF_HELPER_FLAGS_4(sve_st4hh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_st1hh_le_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_st2hh_le_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_st3hh_le_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_st4hh_le_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
 
-DEF_HELPER_FLAGS_4(sve_st1ss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
-DEF_HELPER_FLAGS_4(sve_st2ss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
-DEF_HELPER_FLAGS_4(sve_st3ss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
-DEF_HELPER_FLAGS_4(sve_st4ss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_st1hh_be_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_st2hh_be_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_st3hh_be_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_st4hh_be_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
 
-DEF_HELPER_FLAGS_4(sve_st1dd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
-DEF_HELPER_FLAGS_4(sve_st2dd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
-DEF_HELPER_FLAGS_4(sve_st3dd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
-DEF_HELPER_FLAGS_4(sve_st4dd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_st1ss_le_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_st2ss_le_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_st3ss_le_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_st4ss_le_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_4(sve_st1ss_be_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_st2ss_be_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_st3ss_be_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_st4ss_be_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_4(sve_st1dd_le_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_st2dd_le_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_st3dd_le_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_st4dd_le_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_4(sve_st1dd_be_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_st2dd_be_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_st3dd_be_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_st4dd_be_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
 
 DEF_HELPER_FLAGS_4(sve_st1bh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
 DEF_HELPER_FLAGS_4(sve_st1bs_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
 DEF_HELPER_FLAGS_4(sve_st1bd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
 
-DEF_HELPER_FLAGS_4(sve_st1hs_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
-DEF_HELPER_FLAGS_4(sve_st1hd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_st1hs_le_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_st1hd_le_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_st1hs_be_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_st1hd_be_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
 
-DEF_HELPER_FLAGS_4(sve_st1sd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_st1sd_le_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_st1sd_be_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
 
 DEF_HELPER_FLAGS_6(sve_ldbsu_zsu, TCG_CALL_NO_WG,
void, env, ptr, ptr, ptr, tl, i32)
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index 56e2f523c5..92c0e961a9 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -4940,12 +4940,17 @@ void __attribute__((flatten)) HELPER(sve_st##N##NAME##_r)   \
 }
 
 #define DO_STN_2(N, NAME, ESIZE, MSIZE) \
-void __attribute__((flatten)) HELPER(sve_st##N##NAME##_r) \
+void __attribute__((flatten)) HELPER(sve_st##N##NAME##_le_r)  \
 (CPUARMState 

[Qemu-devel] [PATCH 12/20] target/arm: Rewrite helper_sve_ld1*_r using pages

2018-08-08 Thread Richard Henderson
Uses tlb_vaddr_to_host for correct operation with softmmu.
Optimize for accesses within a single page or pair of pages.
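The single-page versus page-pair distinction comes down to simple address arithmetic. The sketch below assumes a fixed 4 KiB page (`PAGE_SIZE_SKETCH` is a made-up constant, not QEMU's `TARGET_PAGE_SIZE`) and uses hypothetical helper names; it does not call `tlb_vaddr_to_host`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE_SKETCH 4096u   /* assumption for this example only */

/* Number of bytes of [addr, addr + len) that lie in the first page. */
static size_t bytes_in_first_page(uint64_t addr, size_t len)
{
    size_t to_end = PAGE_SIZE_SKETCH - (size_t)(addr & (PAGE_SIZE_SKETCH - 1));
    return len < to_end ? len : to_end;
}

/* How many pages, and hence address translations, the access touches. */
static unsigned pages_touched(uint64_t addr, size_t len)
{
    assert(len > 0);
    uint64_t first = addr / PAGE_SIZE_SKETCH;
    uint64_t last = (addr + len - 1) / PAGE_SIZE_SKETCH;
    return (unsigned)(last - first + 1);
}
```

Since an SVE vector is at most 256 bytes, a single-register contiguous access under this 4 KiB assumption touches one page or two, so one or two translations can serve every element.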

Perf report comparison for cortex-strings test-strlen
with aarch64-linux-user:

before:
   1.59%  qemu-aarch64  qemu-aarch64  [.] do_sve_ld1bb_r
   0.86%  qemu-aarch64  qemu-aarch64  [.] do_sve_ldff1bb_r
after:
   0.09%  qemu-aarch64  qemu-aarch64  [.] helper_sve_ldff1bb_r
   0.01%  qemu-aarch64  qemu-aarch64  [.] sve_ld1bb_host
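
For readers following along, the "single page or pair of pages" case
can be pictured with a minimal sketch like the one below.  It is not
the code from this patch: SKETCH_PAGE_SIZE, bytes_to_page_end and
split_access are hypothetical names, and the page size is an assumed
constant rather than the target's real value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed page size for illustration only. */
#define SKETCH_PAGE_SIZE 4096u

/* Number of bytes from addr up to the end of the page containing it. */
static size_t bytes_to_page_end(uint64_t addr)
{
    return SKETCH_PAGE_SIZE - (size_t)(addr & (SKETCH_PAGE_SIZE - 1));
}

/*
 * Split an access of len bytes starting at addr into the part that
 * lies on the first page and the remainder on the following page.
 * len is expected to be at most one page, so at most two pages are
 * touched and one address translation per page is enough.
 */
static void split_access(uint64_t addr, size_t len,
                         size_t *first, size_t *second)
{
    size_t room = bytes_to_page_end(addr);

    *first = len < room ? len : room;
    *second = len - *first;
}
```

An access that ends before the page boundary yields second == 0, which
is the fast path: a single lookup covers the whole vector.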

Signed-off-by: Richard Henderson 
---
 target/arm/sve_helper.c | 839 
 1 file changed, 675 insertions(+), 164 deletions(-)

diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index e03f954a26..4ca9412e20 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -1688,6 +1688,45 @@ static void swap_memmove(void *vd, void *vs, size_t n)
 }
 }
 
+/* Similarly for memset of 0.  */
+static void swap_memzero(void *vd, size_t n)
+{
+uintptr_t d = (uintptr_t)vd;
+uintptr_t o = (d | n) & 7;
+size_t i;
+
+if (likely(n == 0)) {
+return;
+}
+#ifndef HOST_WORDS_BIGENDIAN
+o = 0;
+#endif
+switch (o) {
+case 0:
+memset(vd, 0, n);
+break;
+
+case 4:
+for (i = 0; i < n; i += 4) {
+*(uint32_t *)H1_4(d + i) = 0;
+}
+break;
+
+case 2:
+case 6:
+for (i = 0; i < n; i += 2) {
+*(uint16_t *)H1_2(d + i) = 0;
+}
+break;
+
+default:
+for (i = 0; i < n; i++) {
+*(uint8_t *)H1(d + i) = 0;
+}
+break;
+}
+}
+
 void HELPER(sve_ext)(void *vd, void *vn, void *vm, uint32_t desc)
 {
 intptr_t opr_sz = simd_oprsz(desc);
@@ -3927,32 +3966,438 @@ void HELPER(sve_fcmla_zpzzz_d)(CPUARMState *env, void *vg, uint32_t desc)
 /*
  * Load contiguous data, protected by a governing predicate.
  */
-#define DO_LD1(NAME, FN, TYPEE, TYPEM, H)  \
-static void do_##NAME(CPUARMState *env, void *vd, void *vg, \
-  target_ulong addr, intptr_t oprsz,   \
-  uintptr_t ra)\
-{  \
-intptr_t i = 0;\
-do {   \
-uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3));\
-do {   \
-TYPEM m = 0;   \
-if (pg & 1) {  \
-m = FN(env, addr, ra); \
-}  \
-*(TYPEE *)(vd + H(i)) = m; \
-i += sizeof(TYPEE), pg >>= sizeof(TYPEE);  \
-addr += sizeof(TYPEM); \
-} while (i & 15);  \
-} while (i < oprsz);   \
-}  \
-void HELPER(NAME)(CPUARMState *env, void *vg,  \
-  target_ulong addr, uint32_t desc)\
-{  \
-do_##NAME(env, &env->vfp.zregs[simd_data(desc)], vg,   \
-  addr, simd_oprsz(desc), GETPC());\
+
+/* Load elements into VD, controlled by VG, from HOST+MEM_OFS.
+ * Memory is valid through MEM_MAX.  The register element indices
+ * are inferred from MEM_OFS, as modified by the types for which
+ * the helper is built.  Return the MEM_OFS of the first element
+ * not loaded (which is MEM_MAX if they are all loaded).
+ *
+ * For softmmu, we have fully validated the guest page.  For user-only,
+ * we cannot fully validate without taking the mmap lock, but since we
+ * know the access is within one host page, if any access is valid they
+ * all must be valid.  However, it may be that no access is valid and
+ * they have all been predicated false.
+ */
+typedef intptr_t sve_ld1_host_fn(void *vd, void *vg, void *host,
+ intptr_t mem_ofs, intptr_t mem_max);
+
+/* Load one element into VD+REG_OFF from (ENV,VADDR,RA).
+ * The controlling predicate is known to be true.
+ */
+typedef void sve_ld1_tlb_fn(CPUARMState *env, void *vd, intptr_t reg_off,
+target_ulong vaddr, int mmu_idx, uintptr_t ra);
+
+/*
+ * Generate the above primitives.
+ */
+
+#define DO_LD_HOST(NAME, H, TYPEE, TYPEM, HOST) \
+static intptr_t sve_##NAME##_host(void *vd, void *vg, void *host,   \
+  intptr_t mem_off, const intptr_t mem_max) \
+{   \
+intptr_t reg_off = mem_off * (sizeof(TYPEE) / sizeof(TYPEM));   \
+uint64_t *pg = vg;  \
+while (mem_off + sizeof(TYPEM) <= mem_max) { 

[Qemu-devel] [PATCH 14/20] target/arm: Rewrite helper_sve_st[1234]*_r

2018-08-08 Thread Richard Henderson
This fixes the endianness problem for softmmu, and moves
the main loop out of a macro and into an inlined function.

Signed-off-by: Richard Henderson 
---
 target/arm/sve_helper.c | 351 
 1 file changed, 172 insertions(+), 179 deletions(-)

diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index 5cc7de5077..4eae6569cc 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -3987,6 +3987,7 @@ typedef intptr_t sve_ld1_host_fn(void *vd, void *vg, void *host,
  */
 typedef void sve_ld1_tlb_fn(CPUARMState *env, void *vd, intptr_t reg_off,
 target_ulong vaddr, int mmu_idx, uintptr_t ra);
+typedef sve_ld1_tlb_fn sve_st1_tlb_fn;
 
 /*
  * Generate the above primitives.
@@ -4772,214 +4773,206 @@ DO_LDFF1_LDNF1_2(dd,  3, 3)
 /*
  * Store contiguous data, protected by a governing predicate.
  */
-#define DO_ST1(NAME, FN, TYPEE, TYPEM, H)  \
-void HELPER(NAME)(CPUARMState *env, void *vg,  \
-  target_ulong addr, uint32_t desc)\
-{  \
-intptr_t i, oprsz = simd_oprsz(desc);  \
-intptr_t ra = GETPC(); \
-unsigned rd = simd_data(desc); \
-void *vd = &env->vfp.zregs[rd];\
-for (i = 0; i < oprsz; ) { \
-uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3));\
-do {   \
-if (pg & 1) {  \
-TYPEM m = *(TYPEE *)(vd + H(i));   \
-FN(env, addr, m, ra);  \
-}  \
-i += sizeof(TYPEE), pg >>= sizeof(TYPEE);  \
-addr += sizeof(TYPEM); \
-} while (i & 15);  \
-}  \
+
+#ifdef CONFIG_SOFTMMU
+#define DO_ST_TLB(NAME, H, TYPEM, HOST, MOEND, TLB) \
+static void sve_##NAME##_tlb(CPUARMState *env, void *vd, intptr_t reg_off,  \
+ target_ulong addr, int mmu_idx, uintptr_t ra)  \
+{   \
+TCGMemOpIdx oi = make_memop_idx(ctz32(sizeof(TYPEM)) | MOEND, mmu_idx); \
+TLB(env, addr, *(TYPEM *)(vd + H(reg_off)), oi, ra);\
 }
-
-#define DO_ST1_D(NAME, FN, TYPEM)  \
-void HELPER(NAME)(CPUARMState *env, void *vg,  \
-  target_ulong addr, uint32_t desc)\
-{  \
-intptr_t i, oprsz = simd_oprsz(desc) / 8;  \
-intptr_t ra = GETPC(); \
-unsigned rd = simd_data(desc); \
-uint64_t *d = &env->vfp.zregs[rd].d[0];\
-uint8_t *pg = vg;  \
-for (i = 0; i < oprsz; i += 1) {   \
-if (pg[H1(i)] & 1) {   \
-FN(env, addr, d[i], ra);   \
-}  \
-addr += sizeof(TYPEM); \
-}  \
+#else
+#define DO_ST_TLB(NAME, H, TYPEM, HOST, MOEND, TLB) \
+static void sve_##NAME##_tlb(CPUARMState *env, void *vd, intptr_t reg_off,  \
+ target_ulong addr, int mmu_idx, uintptr_t ra)  \
+{   \
+HOST(g2h(addr), *(TYPEM *)(vd + H(reg_off)));   \
 }
+#endif
 
-#define DO_ST2(NAME, FN, TYPEE, TYPEM, H)  \
-void HELPER(NAME)(CPUARMState *env, void *vg,  \
-  target_ulong addr, uint32_t desc)\
-{  \
-intptr_t i, oprsz = simd_oprsz(desc);  \
-intptr_t ra = GETPC(); \
-unsigned rd = simd_data(desc); \
-void *d1 = &env->vfp.zregs[rd];\
-void *d2 = &env->vfp.zregs[(rd + 1) & 31]; \
-for (i = 0; i < oprsz; ) { \
-uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3));\
-do {   \
-if (pg & 1) {  \
-TYPEM m1 = *(TYPEE *)(d1 + H(i));  \
-TYPEM m2 = *(TYPEE *)(d2 + H(i));  \
-FN(env, addr, m1, ra); \
-FN(env, addr + sizeof(TYPEM), m2, ra); \
-}  \
-i += 

[Qemu-devel] [PATCH 10/20] target/arm: Adjust aarch64_cpu_dump_state for system mode SVE

2018-08-08 Thread Richard Henderson
Use the existing helpers to determine if (1) the fpu is enabled,
(2) sve state is enabled, and (3) the current sve vector length.

Signed-off-by: Richard Henderson 
---
 target/arm/cpu.h   | 4 
 target/arm/helper.c| 6 +++---
 target/arm/translate-a64.c | 8 ++--
 3 files changed, 13 insertions(+), 5 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index 18b3c92c2e..33d06f2340 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -920,6 +920,10 @@ target_ulong do_arm_semihosting(CPUARMState *env);
 void aarch64_sync_32_to_64(CPUARMState *env);
 void aarch64_sync_64_to_32(CPUARMState *env);
 
+int fp_exception_el(CPUARMState *env, int cur_el);
+int sve_exception_el(CPUARMState *env, int cur_el);
+uint32_t sve_zcr_len_for_el(CPUARMState *env, int el);
+
 static inline bool is_a64(CPUARMState *env)
 {
 #ifdef CONFIG_USER_ONLY
diff --git a/target/arm/helper.c b/target/arm/helper.c
index fb79b27cf6..64ff71b722 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -4344,7 +4344,7 @@ static const ARMCPRegInfo debug_lpae_cp_reginfo[] = {
  * take care of raising that exception.
  * C.f. the ARM pseudocode function CheckSVEEnabled.
  */
-static int sve_exception_el(CPUARMState *env, int el)
+int sve_exception_el(CPUARMState *env, int el)
 {
 #ifndef CONFIG_USER_ONLY
 if (el <= 1) {
@@ -4402,7 +4402,7 @@ static int sve_exception_el(CPUARMState *env, int el)
 /*
  * Given that SVE is enabled, return the vector length for EL.
  */
-static uint32_t sve_zcr_len_for_el(CPUARMState *env, int el)
+uint32_t sve_zcr_len_for_el(CPUARMState *env, int el)
 {
 ARMCPU *cpu = arm_env_get_cpu(env);
 uint32_t zcr_len = cpu->sve_max_vq - 1;
@@ -12352,7 +12352,7 @@ uint32_t HELPER(crc32c)(uint32_t acc, uint32_t val, uint32_t bytes)
 /* Return the exception level to which FP-disabled exceptions should
  * be taken, or 0 if FP is enabled.
  */
-static int fp_exception_el(CPUARMState *env, int cur_el)
+int fp_exception_el(CPUARMState *env, int cur_el)
 {
 #ifndef CONFIG_USER_ONLY
 int fpen;
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index b29dc49c4f..4a0ca8c906 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -166,11 +166,15 @@ void aarch64_cpu_dump_state(CPUState *cs, FILE *f,
 cpu_fprintf(f, "\n");
 return;
 }
+if (fp_exception_el(env, el) != 0) {
+cpu_fprintf(f, "FPU disabled\n");
+return;
+}
 cpu_fprintf(f, " FPCR=%08x FPSR=%08x\n",
 vfp_get_fpcr(env), vfp_get_fpsr(env));
 
-if (arm_feature(env, ARM_FEATURE_SVE)) {
-int j, zcr_len = env->vfp.zcr_el[1] & 0xf; /* fix for system mode */
+if (arm_feature(env, ARM_FEATURE_SVE) && sve_exception_el(env, el) == 0) {
+int j, zcr_len = sve_zcr_len_for_el(env, el);
 
 for (i = 0; i <= FFR_PRED_NUM; i++) {
 bool eol;
-- 
2.17.1




[Qemu-devel] [PATCH 11/20] target/arm: Clear unused predicate bits for LD1RQ

2018-08-08 Thread Richard Henderson
The 16-byte load only uses 16 predicate bits.  But while
reusing the other load infrastructure, we find other bits
that are set and trigger an assert.  To avoid this and
retain the assert, zero-extend the predicate that we pass
to the LD1 helper.
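
As a rough illustration of why the diff below adds 6 to the offset on
big-endian hosts, here is a minimal host-side sketch.  It is not the
TCG code from the patch; host_is_big_endian, low16_offset and
zext_low16 are hypothetical helpers, and the predicate word is modelled
as a plain uint64_t.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Detect host byte order at run time by inspecting a known value. */
static int host_is_big_endian(void)
{
    uint16_t probe = 1;
    uint8_t first;

    memcpy(&first, &probe, 1);
    return first == 0;
}

/*
 * Byte offset of the least significant 16 bits inside a 64-bit word:
 * 0 on a little-endian host, 6 on a big-endian host.
 */
static size_t low16_offset(void)
{
    return host_is_big_endian() ? 6 : 0;
}

/* Read only the low 16 predicate bits and zero-extend them. */
static uint64_t zext_low16(const uint64_t *pred)
{
    uint16_t lo;

    memcpy(&lo, (const uint8_t *)pred + low16_offset(), sizeof(lo));
    return lo;
}
```

The value returned has every bit above bit 15 clear, which is the
property the assert in the LD1 helper relies on.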

Reported-by: Laurent Desnogues 
Signed-off-by: Richard Henderson 
---
 target/arm/translate-sve.c | 25 +++--
 1 file changed, 23 insertions(+), 2 deletions(-)

diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index d27bc8c946..bef6b8242d 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -4765,12 +4765,33 @@ static void do_ldrq(DisasContext *s, int zt, int pg, TCGv_i64 addr, int msz)
 unsigned vsz = vec_full_reg_size(s);
 TCGv_ptr t_pg;
 TCGv_i32 desc;
+int poff;
 
 /* Load the first quadword using the normal predicated load helpers.  */
 desc = tcg_const_i32(simd_desc(16, 16, zt));
-t_pg = tcg_temp_new_ptr();
 
-tcg_gen_addi_ptr(t_pg, cpu_env, pred_full_reg_offset(s, pg));
+poff = pred_full_reg_offset(s, pg);
+if (vsz > 16) {
+/*
+ * Zero-extend the first 16 bits of the predicate into a temporary.
+ * This avoids triggering an assert making sure we don't have bits
+ * set within a predicate beyond VQ, but we have lowered VQ to 1
+ * for this load operation.
+ */
+TCGv_i64 tmp = tcg_temp_new_i64();
+#ifdef HOST_WORDS_BIGENDIAN
+poff += 6;
+#endif
+tcg_gen_ld16u_i64(tmp, cpu_env, poff);
+
+poff = offsetof(CPUARMState, vfp.preg_tmp);
+tcg_gen_st_i64(tmp, cpu_env, poff);
+tcg_temp_free_i64(tmp);
+}
+
+t_pg = tcg_temp_new_ptr();
+tcg_gen_addi_ptr(t_pg, cpu_env, poff);
+
 fns[msz](cpu_env, t_pg, addr, desc);
 
 tcg_temp_free_ptr(t_pg);
-- 
2.17.1




[Qemu-devel] [PATCH 07/20] target/arm: Fix is_a64 for user-only

2018-08-08 Thread Richard Henderson
Saves about 8k code size in qemu-aarch64.

Signed-off-by: Richard Henderson 
---
 target/arm/cpu.h | 8 
 1 file changed, 8 insertions(+)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index aedaf2631e..ed51a2f5aa 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -918,7 +918,15 @@ void aarch64_sync_64_to_32(CPUARMState *env);
 
 static inline bool is_a64(CPUARMState *env)
 {
+#ifdef CONFIG_USER_ONLY
+# ifdef TARGET_AARCH64
+return true;
+# else
+return false;
+# endif
+#else
 return env->aarch64;
+#endif
 }
 
 /* you can call this signal handler from your SIGBUS and SIGSEGV
-- 
2.17.1




[Qemu-devel] [PATCH 20/20] target/arm: Pass TCGMemOpIdx to sve memory helpers

2018-08-08 Thread Richard Henderson
There is quite a lot of code required to compute cpu_mem_index,
or even put together the full TCGMemOpIdx.  This can easily be
done at translation time.
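
A minimal sketch of the packing this implies, going by the extract32
calls in the diff below: the memop index sits in the low
MEMOPIDX_SHIFT bits of the descriptor's data field and the 5-bit
register number sits above it.  SKETCH_DATA_SHIFT is an assumed
placeholder for the real data-field position, and pack_desc_data,
desc_memop_idx and desc_reg are hypothetical names, not QEMU functions.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_DATA_SHIFT 10  /* assumed position of the data field */
#define MEMOPIDX_SHIFT    8   /* width reserved for the memop index */

/* Extract length bits of value starting at bit start. */
static uint32_t extract_bits(uint32_t value, int start, int length)
{
    return (value >> start) & ((1u << length) - 1);
}

/* Translation time: combine the memop index and register number. */
static uint32_t pack_desc_data(uint32_t oi, uint32_t rd)
{
    uint32_t data = (oi & 0xffu) | ((rd & 0x1fu) << MEMOPIDX_SHIFT);

    return data << SKETCH_DATA_SHIFT;
}

/* Run time: recover the memop index without recomputing it. */
static uint32_t desc_memop_idx(uint32_t desc)
{
    return extract_bits(desc, SKETCH_DATA_SHIFT, MEMOPIDX_SHIFT);
}

/* Run time: recover the destination register number. */
static uint32_t desc_reg(uint32_t desc)
{
    return extract_bits(desc, SKETCH_DATA_SHIFT + MEMOPIDX_SHIFT, 5);
}
```

The helper then only pays for two shifts and masks instead of
recomputing the mmu index on every call.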

Signed-off-by: Richard Henderson 
---
 target/arm/internals.h |   5 ++
 target/arm/sve_helper.c| 138 +++--
 target/arm/translate-sve.c |  67 +++---
 3 files changed, 121 insertions(+), 89 deletions(-)

diff --git a/target/arm/internals.h b/target/arm/internals.h
index dc9357766c..24c0444c8d 100644
--- a/target/arm/internals.h
+++ b/target/arm/internals.h
@@ -796,4 +796,9 @@ static inline uint32_t arm_debug_exception_fsr(CPUARMState *env)
 }
 }
 
+/* Note make_memop_idx reserves 4 bits for mmu_idx, and MO_BSWAP is bit 3.
+ * Thus a TCGMemOpIdx, without any MO_ALIGN bits, fits in 8 bits.
+ */
+#define MEMOPIDX_SHIFT  8
+
 #endif
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index 6728862326..5bae600d17 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -19,6 +19,7 @@
 
 #include "qemu/osdep.h"
 #include "cpu.h"
+#include "internals.h"
 #include "exec/exec-all.h"
 #include "exec/cpu_ldst.h"
 #include "exec/helper-proto.h"
@@ -3986,7 +3987,7 @@ typedef intptr_t sve_ld1_host_fn(void *vd, void *vg, void *host,
  * The controlling predicate is known to be true.
  */
 typedef void sve_ld1_tlb_fn(CPUARMState *env, void *vd, intptr_t reg_off,
-target_ulong vaddr, int mmu_idx, uintptr_t ra);
+target_ulong vaddr, TCGMemOpIdx oi, uintptr_t ra);
 typedef sve_ld1_tlb_fn sve_st1_tlb_fn;
 
 /*
@@ -4013,16 +4014,15 @@ static intptr_t sve_##NAME##_host(void *vd, void *vg, void *host,   \
 #ifdef CONFIG_SOFTMMU
 #define DO_LD_TLB(NAME, H, TYPEE, TYPEM, HOST, MOEND, TLB) \
 static void sve_##NAME##_tlb(CPUARMState *env, void *vd, intptr_t reg_off,  \
- target_ulong addr, int mmu_idx, uintptr_t ra)  \
+ target_ulong addr, TCGMemOpIdx oi, uintptr_t ra)  \
 {   \
-TCGMemOpIdx oi = make_memop_idx(ctz32(sizeof(TYPEM)) | MOEND, mmu_idx); \
 TYPEM val = TLB(env, addr, oi, ra); \
 *(TYPEE *)(vd + H(reg_off)) = val;  \
 }
 #else
 #define DO_LD_TLB(NAME, H, TYPEE, TYPEM, HOST, MOEND, TLB)  \
 static void sve_##NAME##_tlb(CPUARMState *env, void *vd, intptr_t reg_off,  \
- target_ulong addr, int mmu_idx, uintptr_t ra)  \
+ target_ulong addr, TCGMemOpIdx oi, uintptr_t ra)  \
 {   \
 TYPEM val = HOST(g2h(addr));\
 *(TYPEE *)(vd + H(reg_off)) = val;  \
@@ -4287,11 +4287,13 @@ static void sve_ld1_r(CPUARMState *env, void *vg, const target_ulong addr,
   sve_ld1_host_fn *host_fn,
   sve_ld1_tlb_fn *tlb_fn)
 {
-void *vd = &env->vfp.zregs[simd_data(desc)];
+const TCGMemOpIdx oi = extract32(desc, SIMD_DATA_SHIFT, MEMOPIDX_SHIFT);
+const int mmu_idx = get_mmuidx(oi);
+const unsigned rd = extract32(desc, SIMD_DATA_SHIFT + MEMOPIDX_SHIFT, 5);
+void *vd = &env->vfp.zregs[rd];
 const int diffsz = esz - msz;
 const intptr_t reg_max = simd_oprsz(desc);
 const intptr_t mem_max = reg_max >> diffsz;
-const int mmu_idx = cpu_mmu_index(env, false);
 ARMVectorReg scratch;
 void *host, *result;
 intptr_t split;
@@ -4345,7 +4347,7 @@ static void sve_ld1_r(CPUARMState *env, void *vg, const target_ulong addr,
  * on I/O memory, it may succeed but not bring in the TLB entry.
  * But even then we have still made forward progress.
  */
-tlb_fn(env, result, reg_off, addr + mem_off, mmu_idx, retaddr);
+tlb_fn(env, result, reg_off, addr + mem_off, oi, retaddr);
 reg_off += 1 << esz;
 }
 #endif
@@ -4406,9 +4408,9 @@ static void sve_ld2_r(CPUARMState *env, void *vg, target_ulong addr,
   uint32_t desc, int size, uintptr_t ra,
   sve_ld1_tlb_fn *tlb_fn)
 {
-const int mmu_idx = cpu_mmu_index(env, false);
+const TCGMemOpIdx oi = extract32(desc, SIMD_DATA_SHIFT, MEMOPIDX_SHIFT);
+const unsigned rd = extract32(desc, SIMD_DATA_SHIFT + MEMOPIDX_SHIFT, 5);
 intptr_t i, oprsz = simd_oprsz(desc);
-unsigned rd = simd_data(desc);
 ARMVectorReg scratch[2] = { };
 
 set_helper_retaddr(ra);
@@ -4416,8 +4418,8 @@ static void sve_ld2_r(CPUARMState *env, void *vg, target_ulong addr,
 uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3));
 do {
 if (pg & 1) {
-tlb_fn(env, &scratch[0], i, addr, mmu_idx, ra);
-tlb_fn(env, &scratch[1], i, addr + size, mmu_idx, ra);
+tlb_fn(env, &scratch[0], i, addr, oi, 

[Qemu-devel] [PATCH 09/20] target/arm: Handle SVE vector length changes in system mode

2018-08-08 Thread Richard Henderson
SVE vector length can change when changing EL, or when writing
to one of the ZCR_ELn registers.

For correctness, our implementation requires that predicate bits
that are inaccessible are never set, which means noticing length
changes and zeroing the appropriate register bits.
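
To make "zeroing the appropriate register bits" concrete for the
predicate registers, here is a minimal sketch of the per-word mask,
following the same arithmetic as aarch64_sve_narrow_vq in the diff
below.  pred_keep_mask is a hypothetical name; vq counts 16-byte
quadwords, and each quadword owns 16 predicate bits, so one 64-bit
predicate word covers four quadwords.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Mask to AND into the j-th 64-bit predicate word when the vector
 * length is narrowed to vq quadwords.
 */
static uint64_t pred_keep_mask(unsigned vq, unsigned j)
{
    unsigned full_words = vq / 4;   /* words that stay fully accessible */
    unsigned rem = vq & 3;          /* quadwords in the partial word */

    if (j < full_words) {
        return ~0ULL;               /* entirely within the new length */
    }
    if (j == full_words && rem != 0) {
        return ~(~0ULL << (16 * rem));  /* keep the low 16 * rem bits */
    }
    return 0;                       /* entirely beyond the new length */
}
```

With vq == 1 only the low 16 bits of word 0 survive; with vq == 4 word
0 is kept whole and every later word is cleared.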

Signed-off-by: Richard Henderson 
---
 target/arm/cpu.h   |   4 ++
 target/arm/cpu64.c |  42 --
 target/arm/helper.c| 127 -
 target/arm/op_helper.c |   1 +
 4 files changed, 119 insertions(+), 55 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index ed51a2f5aa..18b3c92c2e 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -910,6 +910,10 @@ int arm_cpu_write_elf32_note(WriteCoreDumpFunction f, CPUState *cs,
 int aarch64_cpu_gdb_read_register(CPUState *cpu, uint8_t *buf, int reg);
 int aarch64_cpu_gdb_write_register(CPUState *cpu, uint8_t *buf, int reg);
 void aarch64_sve_narrow_vq(CPUARMState *env, unsigned vq);
+void aarch64_sve_change_el(CPUARMState *env, int old_el, int new_el);
+#else
+static inline void aarch64_sve_narrow_vq(CPUARMState *env, unsigned vq) { }
+static inline void aarch64_sve_change_el(CPUARMState *env, int o, int n) { }
 #endif
 
 target_ulong do_arm_semihosting(CPUARMState *env);
diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
index ae650b608e..16272f1358 100644
--- a/target/arm/cpu64.c
+++ b/target/arm/cpu64.c
@@ -439,45 +439,3 @@ static void aarch64_cpu_register_types(void)
 }
 
 type_init(aarch64_cpu_register_types)
-
-/* The manual says that when SVE is enabled and VQ is widened the
- * implementation is allowed to zero the previously inaccessible
- * portion of the registers.  The corollary to that is that when
- * SVE is enabled and VQ is narrowed we are also allowed to zero
- * the now inaccessible portion of the registers.
- *
- * The intent of this is that no predicate bit beyond VQ is ever set.
- * Which means that some operations on predicate registers themselves
- * may operate on full uint64_t or even unrolled across the maximum
- * uint64_t[4].  Performing 4 bits of host arithmetic unconditionally
- * may well be cheaper than conditionals to restrict the operation
- * to the relevant portion of a uint16_t[16].
- *
- * TODO: Need to call this for changes to the real system registers
- * and EL state changes.
- */
-void aarch64_sve_narrow_vq(CPUARMState *env, unsigned vq)
-{
-int i, j;
-uint64_t pmask;
-
-assert(vq >= 1 && vq <= ARM_MAX_VQ);
-assert(vq <= arm_env_get_cpu(env)->sve_max_vq);
-
-/* Zap the high bits of the zregs.  */
-for (i = 0; i < 32; i++) {
-memset(&env->vfp.zregs[i].d[2 * vq], 0, 16 * (ARM_MAX_VQ - vq));
-}
-
-/* Zap the high bits of the pregs and ffr.  */
-pmask = 0;
-if (vq & 3) {
-pmask = ~(-1ULL << (16 * (vq & 3)));
-}
-for (j = vq / 4; j < ARM_MAX_VQ / 4; j++) {
-for (i = 0; i < 17; ++i) {
-env->vfp.pregs[i].p[j] &= pmask;
-}
-pmask = 0;
-}
-}
diff --git a/target/arm/helper.c b/target/arm/helper.c
index 290b1a849e..fb79b27cf6 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -4399,11 +4399,44 @@ static int sve_exception_el(CPUARMState *env, int el)
 return 0;
 }
 
+/*
+ * Given that SVE is enabled, return the vector length for EL.
+ */
+static uint32_t sve_zcr_len_for_el(CPUARMState *env, int el)
+{
+ARMCPU *cpu = arm_env_get_cpu(env);
+uint32_t zcr_len = cpu->sve_max_vq - 1;
+
+if (el <= 1) {
+zcr_len = MIN(zcr_len, 0xf & (uint32_t)env->vfp.zcr_el[1]);
+}
+if (el < 2 && arm_feature(env, ARM_FEATURE_EL2)) {
+zcr_len = MIN(zcr_len, 0xf & (uint32_t)env->vfp.zcr_el[2]);
+}
+if (el < 3 && arm_feature(env, ARM_FEATURE_EL3)) {
+zcr_len = MIN(zcr_len, 0xf & (uint32_t)env->vfp.zcr_el[3]);
+}
+return zcr_len;
+}
+
 static void zcr_write(CPUARMState *env, const ARMCPRegInfo *ri,
   uint64_t value)
 {
+int cur_el = arm_current_el(env);
+int old_len = sve_zcr_len_for_el(env, cur_el);
+int new_len;
+
 /* Bits other than [3:0] are RAZ/WI.  */
 raw_write(env, ri, value & 0xf);
+
+/*
+ * Because we arrived here, we know both FP and SVE are enabled;
+ * otherwise we would have trapped access to the ZCR_ELn register.
+ */
+new_len = sve_zcr_len_for_el(env, cur_el);
+if (new_len < old_len) {
+aarch64_sve_narrow_vq(env, new_len + 1);
+}
 }
 
 static const ARMCPRegInfo zcr_el1_reginfo = {
@@ -8100,8 +8133,11 @@ static void arm_cpu_do_interrupt_aarch64(CPUState *cs)
 unsigned int new_el = env->exception.target_el;
 target_ulong addr = env->cp15.vbar_el[new_el];
 unsigned int new_mode = aarch64_pstate_mode(new_el, true);
+unsigned int cur_el = arm_current_el(env);
 
-if (arm_current_el(env) < new_el) {
+aarch64_sve_change_el(env, cur_el, new_el);
+
+if (cur_el < new_el) {
 /* Entry vector offset depends on whether the 

[Qemu-devel] [PATCH 13/20] target/arm: Rewrite helper_sve_ld[234]*_r

2018-08-08 Thread Richard Henderson
Use the same *_tlb primitives as we use for ld1.  This is not
a significant change, but does (for linux-user) hoist the setting
of helper_retaddr, and (for softmmu) hoist the computation of
the current mmu_idx outside the loop.

This also fixes the endianness problem for softmmu, and moves
the main loop out of a macro and into an inlined function.
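
The structure of the new helpers (load into a scratch buffer, write
back only after every element has been accessed) can be sketched in
isolation as below.  This is a simplified model rather than the patch
code: SKETCH_VLEN, checked_load and ld2_bytes are hypothetical, a
"fault" is modelled as a false return instead of an exception, and
elements are single bytes governed by one predicate bit each.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_VLEN 16  /* assumed vector length in bytes */

/* Model of a faulting load: fails when index is out of bounds. */
static bool checked_load(const uint8_t *mem, size_t mem_len,
                         size_t index, uint8_t *out)
{
    if (index >= mem_len) {
        return false;
    }
    *out = mem[index];
    return true;
}

/*
 * De-interleave byte pairs from mem into d0 and d1 under predicate pg.
 * Inactive elements become zero.  On a fault, d0 and d1 are left
 * untouched because results are staged in scratch buffers first.
 */
static bool ld2_bytes(uint8_t d0[SKETCH_VLEN], uint8_t d1[SKETCH_VLEN],
                      const uint8_t *mem, size_t mem_len, uint16_t pg)
{
    uint8_t s0[SKETCH_VLEN] = { 0 };
    uint8_t s1[SKETCH_VLEN] = { 0 };
    size_t i;

    for (i = 0; i < SKETCH_VLEN; i++) {
        if ((pg >> i) & 1) {
            if (!checked_load(mem, mem_len, 2 * i, &s0[i]) ||
                !checked_load(mem, mem_len, 2 * i + 1, &s1[i])) {
                return false;
            }
        }
    }

    /* Commit only once every active element has loaded successfully. */
    memcpy(d0, s0, SKETCH_VLEN);
    memcpy(d1, s1, SKETCH_VLEN);
    return true;
}
```

Staging in scratch is what keeps the destination registers unmodified
when an element in the middle of the vector faults.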

Signed-off-by: Richard Henderson 
---
 target/arm/sve_helper.c | 210 ++--
 1 file changed, 117 insertions(+), 93 deletions(-)

diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index 4ca9412e20..5cc7de5077 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -4398,109 +4398,133 @@ DO_LD1_2(ld1dd,  3, 3)
 #undef DO_LD1_1
 #undef DO_LD1_2
 
-#define DO_LD2(NAME, FN, TYPEE, TYPEM, H)  \
-void HELPER(NAME)(CPUARMState *env, void *vg,  \
-  target_ulong addr, uint32_t desc)\
-{  \
-intptr_t i, oprsz = simd_oprsz(desc);  \
-intptr_t ra = GETPC(); \
-unsigned rd = simd_data(desc); \
-void *d1 = &env->vfp.zregs[rd];\
-void *d2 = &env->vfp.zregs[(rd + 1) & 31]; \
-for (i = 0; i < oprsz; ) { \
-uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3));\
-do {   \
-TYPEM m1 = 0, m2 = 0;  \
-if (pg & 1) {  \
-m1 = FN(env, addr, ra);\
-m2 = FN(env, addr + sizeof(TYPEM), ra);\
-}  \
-*(TYPEE *)(d1 + H(i)) = m1;\
-*(TYPEE *)(d2 + H(i)) = m2;\
-i += sizeof(TYPEE), pg >>= sizeof(TYPEE);  \
-addr += 2 * sizeof(TYPEM); \
-} while (i & 15);  \
-}  \
+/*
+ * Common helpers for all contiguous 2,3,4-register predicated loads.
+ */
+static void sve_ld2_r(CPUARMState *env, void *vg, target_ulong addr,
+  uint32_t desc, int size, uintptr_t ra,
+  sve_ld1_tlb_fn *tlb_fn)
+{
+const int mmu_idx = cpu_mmu_index(env, false);
+intptr_t i, oprsz = simd_oprsz(desc);
+unsigned rd = simd_data(desc);
+ARMVectorReg scratch[2] = { };
+
+set_helper_retaddr(ra);
+for (i = 0; i < oprsz; ) {
+uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3));
+do {
+if (pg & 1) {
+tlb_fn(env, &scratch[0], i, addr, mmu_idx, ra);
+tlb_fn(env, &scratch[1], i, addr + size, mmu_idx, ra);
+}
+i += size, pg >>= size;
+addr += 2 * size;
+} while (i & 15);
+}
+set_helper_retaddr(0);
+
+/* Wait until all exceptions have been raised to write back.  */
+memcpy(&env->vfp.zregs[rd], &scratch[0], oprsz);
+memcpy(&env->vfp.zregs[(rd + 1) & 31], &scratch[1], oprsz);
 }
 
-#define DO_LD3(NAME, FN, TYPEE, TYPEM, H)  \
-void HELPER(NAME)(CPUARMState *env, void *vg,  \
-  target_ulong addr, uint32_t desc)\
-{  \
-intptr_t i, oprsz = simd_oprsz(desc);  \
-intptr_t ra = GETPC(); \
-unsigned rd = simd_data(desc); \
-void *d1 = &env->vfp.zregs[rd];\
-void *d2 = &env->vfp.zregs[(rd + 1) & 31]; \
-void *d3 = &env->vfp.zregs[(rd + 2) & 31]; \
-for (i = 0; i < oprsz; ) { \
-uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3));\
-do {   \
-TYPEM m1 = 0, m2 = 0, m3 = 0;  \
-if (pg & 1) {  \
-m1 = FN(env, addr, ra);\
-m2 = FN(env, addr + sizeof(TYPEM), ra);\
-m3 = FN(env, addr + 2 * sizeof(TYPEM), ra); \
-}  \
-*(TYPEE *)(d1 + H(i)) = m1;\
-*(TYPEE *)(d2 + H(i)) = m2;\
-*(TYPEE *)(d3 + H(i)) = m3;\
-i += sizeof(TYPEE), pg >>= sizeof(TYPEE);  \
-addr += 3 * sizeof(TYPEM); \
-} while (i & 15);  \
-}  \
+static void sve_ld3_r(CPUARMState *env, void *vg, target_ulong addr,
+  uint32_t desc, int size, uintptr_t ra,
+  sve_ld1_tlb_fn *tlb_fn)
+{
+const int 

[Qemu-devel] [PATCH 03/20] target/arm: Define ID_AA64ZFR0_EL1

2018-08-08 Thread Richard Henderson
Given that the only field defined for this new register may only
be 0, we don't actually need to change anything except the name.

Signed-off-by: Richard Henderson 
---
 target/arm/helper.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index c24c66d43e..61a79e4c44 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -4956,9 +4956,10 @@ void register_cp_regs_for_features(ARMCPU *cpu)
   .opc0 = 3, .opc1 = 0, .crn = 0, .crm = 4, .opc2 = 3,
   .access = PL1_R, .type = ARM_CP_CONST,
   .resetvalue = 0 },
-{ .name = "ID_AA64PFR4_EL1_RESERVED", .state = ARM_CP_STATE_AA64,
+{ .name = "ID_AA64ZFR0_EL1", .state = ARM_CP_STATE_AA64,
   .opc0 = 3, .opc1 = 0, .crn = 0, .crm = 4, .opc2 = 4,
   .access = PL1_R, .type = ARM_CP_CONST,
+  /* At present, only SVEver == 0 is defined anyway.  */
   .resetvalue = 0 },
 { .name = "ID_AA64PFR5_EL1_RESERVED", .state = ARM_CP_STATE_AA64,
   .opc0 = 3, .opc1 = 0, .crn = 0, .crm = 4, .opc2 = 5,
-- 
2.17.1




[Qemu-devel] [PATCH 06/20] target/arm: Fix arm_current_el for user-only

2018-08-08 Thread Richard Henderson
Saves about 12k code size in qemu-aarch64.

Signed-off-by: Richard Henderson 
---
 target/arm/cpu.h | 4 
 1 file changed, 4 insertions(+)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index 2d6d7d03aa..aedaf2631e 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -1958,6 +1958,9 @@ static inline bool arm_v7m_is_handler_mode(CPUARMState *env)
  */
 static inline int arm_current_el(CPUARMState *env)
 {
+#ifdef CONFIG_USER_ONLY
+return 0;
+#else
 if (arm_feature(env, ARM_FEATURE_M)) {
 return arm_v7m_is_handler_mode(env) ||
 !(env->v7m.control[env->v7m.secure] & 1);
@@ -1984,6 +1987,7 @@ static inline int arm_current_el(CPUARMState *env)
 
 return 1;
 }
+#endif
 }
 
 typedef struct ARMCPRegInfo ARMCPRegInfo;
-- 
2.17.1




[Qemu-devel] [PATCH 05/20] target/arm: Fix arm_cpu_data_is_big_endian for aa64 user-only

2018-08-08 Thread Richard Henderson
Unlike aa32, endianness cannot be adjusted by userland in aa64.

Signed-off-by: Richard Henderson 
---
 target/arm/cpu.h | 27 +--
 1 file changed, 17 insertions(+), 10 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index 9526ed27cb..2d6d7d03aa 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -2709,8 +2709,6 @@ static inline bool arm_sctlr_b(CPUARMState *env)
 /* Return true if the processor is in big-endian mode. */
 static inline bool arm_cpu_data_is_big_endian(CPUARMState *env)
 {
-int cur_el;
-
 /* In 32bit endianness is determined by looking at CPSR's E bit */
 if (!is_a64(env)) {
 return
@@ -2729,15 +2727,24 @@ static inline bool arm_cpu_data_is_big_endian(CPUARMState *env)
 arm_sctlr_b(env) ||
 #endif
 ((env->uncached_cpsr & CPSR_E) ? 1 : 0);
+} else {
+#ifdef CONFIG_USER_ONLY
+/* AArch64 does not have a SETEND instruction; endianness
+ * for usermode is fixed at compile-time.
+ */
+# ifdef TARGET_WORDS_BIGENDIAN
+return true;
+# else
+return false;
+# endif
+#else
+int cur_el = arm_current_el(env);
+if (cur_el == 0) {
+return (env->cp15.sctlr_el[1] & SCTLR_E0E) != 0;
+}
+return (env->cp15.sctlr_el[cur_el] & SCTLR_EE) != 0;
+#endif
 }
-
-cur_el = arm_current_el(env);
-
-if (cur_el == 0) {
-return (env->cp15.sctlr_el[1] & SCTLR_E0E) != 0;
-}
-
-return (env->cp15.sctlr_el[cur_el] & SCTLR_EE) != 0;
 }
 
 #include "exec/cpu-all.h"
-- 
2.17.1




[Qemu-devel] [PATCH 08/20] target/arm: Pass in current_el to fp and sve_exception_el

2018-08-08 Thread Richard Henderson
We are going to want to determine whether sve is enabled
for an EL other than the current one.

Signed-off-by: Richard Henderson 
---
 target/arm/helper.c | 21 +
 1 file changed, 9 insertions(+), 12 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index 26e9098c5f..290b1a849e 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -4344,12 +4344,10 @@ static const ARMCPRegInfo debug_lpae_cp_reginfo[] = {
  * take care of raising that exception.
  * C.f. the ARM pseudocode function CheckSVEEnabled.
  */
-static int sve_exception_el(CPUARMState *env)
+static int sve_exception_el(CPUARMState *env, int el)
 {
 #ifndef CONFIG_USER_ONLY
-unsigned current_el = arm_current_el(env);
-
-if (current_el <= 1) {
+if (el <= 1) {
 bool disabled = false;
 
 /* The CPACR.ZEN controls traps to EL1:
@@ -4360,7 +4358,7 @@ static int sve_exception_el(CPUARMState *env)
 if (!extract32(env->cp15.cpacr_el1, 16, 1)) {
 disabled = true;
 } else if (!extract32(env->cp15.cpacr_el1, 17, 1)) {
-disabled = current_el == 0;
+disabled = el == 0;
 }
 if (disabled) {
 /* route_to_el2 */
@@ -4373,7 +4371,7 @@ static int sve_exception_el(CPUARMState *env)
 if (!extract32(env->cp15.cpacr_el1, 20, 1)) {
 disabled = true;
 } else if (!extract32(env->cp15.cpacr_el1, 21, 1)) {
-disabled = current_el == 0;
+disabled = el == 0;
 }
 if (disabled) {
 return 0;
@@ -4383,7 +4381,7 @@ static int sve_exception_el(CPUARMState *env)
 /* CPTR_EL2.  Since TZ and TFP are positive,
  * they will be zero when EL2 is not present.
  */
-if (current_el <= 2 && !arm_is_secure_below_el3(env)) {
+if (el <= 2 && !arm_is_secure_below_el3(env)) {
 if (env->cp15.cptr_el[2] & CPTR_TZ) {
 return 2;
 }
@@ -12318,11 +12316,10 @@ uint32_t HELPER(crc32c)(uint32_t acc, uint32_t val, uint32_t bytes)
 /* Return the exception level to which FP-disabled exceptions should
  * be taken, or 0 if FP is enabled.
  */
-static inline int fp_exception_el(CPUARMState *env)
+static int fp_exception_el(CPUARMState *env, int cur_el)
 {
 #ifndef CONFIG_USER_ONLY
 int fpen;
-int cur_el = arm_current_el(env);
 
 /* CPACR and the CPTR registers don't exist before v6, so FP is
  * always accessible
@@ -12385,11 +12382,12 @@ void cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
   target_ulong *cs_base, uint32_t *pflags)
 {
 ARMMMUIdx mmu_idx = core_to_arm_mmu_idx(env, cpu_mmu_index(env, false));
-int fp_el = fp_exception_el(env);
+int current_el = arm_current_el(env);
+int fp_el = fp_exception_el(env, current_el);
 uint32_t flags;
 
 if (is_a64(env)) {
-int sve_el = sve_exception_el(env);
+int sve_el = sve_exception_el(env, current_el);
 uint32_t zcr_len;
 
 *pc = env->pc;
@@ -12404,7 +12402,6 @@ void cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
 if (sve_el != 0 && fp_el == 0) {
 zcr_len = 0;
 } else {
-int current_el = arm_current_el(env);
 ARMCPU *cpu = arm_env_get_cpu(env);
 
 zcr_len = cpu->sve_max_vq - 1;
-- 
2.17.1




[Qemu-devel] [PATCH 04/20] target/arm: Adjust sve_exception_el

2018-08-08 Thread Richard Henderson
Check for EL3 before testing CPTR_EL3.EZ.  Return 0 when the exception
should be routed via AdvSIMDFPAccessTrap.  Mirror the structure of
CheckSVEEnabled more closely.
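
The two-bit enable decode used for both CPACR.ZEN and CPACR.FPEN in the
diff below can be sketched on its own as follows.  field_traps and
zen_traps are hypothetical names; the sketch only covers the EL0/EL1
case, mirrors the bit tests in the new code (bit 0 clear traps both
levels, bit 1 clear traps only EL0), and leaves out the routing to EL2.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Decode a two-bit enable field for an access made at el (0 or 1):
 *   0, 2: trap EL0 and EL1 accesses
 *   1:    trap only EL0 accesses
 *   3:    trap no accesses
 */
static bool field_traps(unsigned field, int el)
{
    if (!(field & 1)) {
        return true;
    }
    if (!(field & 2)) {
        return el == 0;
    }
    return false;
}

/* ZEN occupies bits [17:16] of the CPACR value. */
static bool zen_traps(uint64_t cpacr, int el)
{
    return field_traps((unsigned)((cpacr >> 16) & 3), el);
}
```

Testing bit 0 first is what makes the values 0 and 2 behave
identically, matching the comment in the patch.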

Fixes: 5be5e8eda78
Signed-off-by: Richard Henderson 
---
 target/arm/helper.c | 96 ++---
 1 file changed, 46 insertions(+), 50 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index 61a79e4c44..26e9098c5f 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -4338,67 +4338,63 @@ static const ARMCPRegInfo debug_lpae_cp_reginfo[] = {
 REGINFO_SENTINEL
 };
 
-/* Return the exception level to which SVE-disabled exceptions should
- * be taken, or 0 if SVE is enabled.
+/* Return the exception level to which exceptions should be taken
+ * via SVEAccessTrap.  If an exception should be routed through
+ * AArch64.AdvSIMDFPAccessTrap, return 0; fp_exception_el should
+ * take care of raising that exception.
+ * C.f. the ARM pseudocode function CheckSVEEnabled.
  */
 static int sve_exception_el(CPUARMState *env)
 {
 #ifndef CONFIG_USER_ONLY
 unsigned current_el = arm_current_el(env);
 
-/* The CPACR.ZEN controls traps to EL1:
- * 0, 2 : trap EL0 and EL1 accesses
- * 1: trap only EL0 accesses
- * 3: trap no accesses
+if (current_el <= 1) {
+bool disabled = false;
+
+/* The CPACR.ZEN controls traps to EL1:
+ * 0, 2 : trap EL0 and EL1 accesses
+ * 1: trap only EL0 accesses
+ * 3: trap no accesses
+ */
+if (!extract32(env->cp15.cpacr_el1, 16, 1)) {
+disabled = true;
+} else if (!extract32(env->cp15.cpacr_el1, 17, 1)) {
+disabled = current_el == 0;
+}
+if (disabled) {
+/* route_to_el2 */
+return (arm_feature(env, ARM_FEATURE_EL2)
+&& !arm_is_secure(env)
+&& (env->cp15.hcr_el2 & HCR_TGE) ? 2 : 1);
+}
+
+/* Check CPACR.FPEN.  */
+if (!extract32(env->cp15.cpacr_el1, 20, 1)) {
+disabled = true;
+} else if (!extract32(env->cp15.cpacr_el1, 21, 1)) {
+disabled = current_el == 0;
+}
+if (disabled) {
+return 0;
+}
+}
+
+/* CPTR_EL2.  Since TZ and TFP are positive,
+ * they will be zero when EL2 is not present.
  */
-switch (extract32(env->cp15.cpacr_el1, 16, 2)) {
-default:
-if (current_el <= 1) {
-/* Trap to PL1, which might be EL1 or EL3 */
-if (arm_is_secure(env) && !arm_el_is_aa64(env, 3)) {
-return 3;
-}
-return 1;
+if (current_el <= 2 && !arm_is_secure_below_el3(env)) {
+if (env->cp15.cptr_el[2] & CPTR_TZ) {
+return 2;
 }
-break;
-case 1:
-if (current_el == 0) {
-return 1;
+if (env->cp15.cptr_el[2] & CPTR_TFP) {
+return 0;
 }
-break;
-case 3:
-break;
 }
 
-/* Similarly for CPACR.FPEN, after having checked ZEN.  */
-switch (extract32(env->cp15.cpacr_el1, 20, 2)) {
-default:
-if (current_el <= 1) {
-if (arm_is_secure(env) && !arm_el_is_aa64(env, 3)) {
-return 3;
-}
-return 1;
-}
-break;
-case 1:
-if (current_el == 0) {
-return 1;
-}
-break;
-case 3:
-break;
-}
-
-/* CPTR_EL2.  Check both TZ and TFP.  */
-if (current_el <= 2
-&& (env->cp15.cptr_el[2] & (CPTR_TFP | CPTR_TZ))
-&& !arm_is_secure_below_el3(env)) {
-return 2;
-}
-
-/* CPTR_EL3.  Check both EZ and TFP.  */
-if (!(env->cp15.cptr_el[3] & CPTR_EZ)
-|| (env->cp15.cptr_el[3] & CPTR_TFP)) {
+/* CPTR_EL3.  Since EZ is negative we must check for EL3.  */
+if (arm_feature(env, ARM_FEATURE_EL3)
+&& !(env->cp15.cptr_el[3] & CPTR_EZ)) {
 return 3;
 }
 #endif
-- 
2.17.1
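
The two-bit CPACR_EL1 enable fields that this patch tests one bit at a time (ZEN at bit 16, FPEN at bit 20) can be sketched in isolation as below. This is only an illustration, assuming the encoding given in the patch comment (0 and 2 trap EL0 and EL1, 1 traps EL0 only, 3 traps nothing) and a current EL of 0 or 1. The names `field32` and `cpacr_field_traps` are hypothetical and are not QEMU functions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Extract LENGTH bits (1..32) of VALUE starting at bit START. */
static uint32_t field32(uint32_t value, int start, int length)
{
    return (value >> start) & (~0u >> (32 - length));
}

/*
 * Hypothetical helper: does the two-bit enable field at SHIFT trap an
 * access made from CURRENT_EL (0 or 1)?
 *   0, 2: trap EL0 and EL1 accesses
 *   1:    trap only EL0 accesses
 *   3:    trap no accesses
 */
static bool cpacr_field_traps(uint32_t cpacr, int shift, int current_el)
{
    if (!field32(cpacr, shift, 1)) {
        return true;                /* low bit clear: values 0 and 2 */
    }
    if (!field32(cpacr, shift + 1, 1)) {
        return current_el == 0;     /* value 1 */
    }
    return false;                   /* value 3 */
}
```

Testing the low bit first covers values 0 and 2 in one branch, which is why the rewritten function needs no switch statement.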




[Qemu-devel] [PATCH 01/20] target/arm: Set ISAR bits for -cpu max

2018-08-08 Thread Richard Henderson
For the supported extensions, fill in the appropriate bits in
ID_ISAR5, ID_ISAR6, ID_AA64ISAR0, ID_AA64ISAR1.

Signed-off-by: Richard Henderson 
---
 target/arm/cpu.c   | 24 +---
 target/arm/cpu64.c | 36 
 2 files changed, 45 insertions(+), 15 deletions(-)

diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index b25898ed4c..71daa39e86 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -1802,19 +1802,29 @@ static void arm_max_initfn(Object *obj)
 kvm_arm_set_cpu_features_from_host(cpu);
 } else {
 cortex_a15_initfn(obj);
+
+set_feature(&cpu->env, ARM_FEATURE_V8_AES);
+cpu->id_isar5 = deposit32(cpu->id_isar5, 4, 4, 2);
+set_feature(&cpu->env, ARM_FEATURE_V8_SHA1);
+cpu->id_isar5 = deposit32(cpu->id_isar5, 8, 4, 1);
+set_feature(&cpu->env, ARM_FEATURE_V8_SHA256);
+cpu->id_isar5 = deposit32(cpu->id_isar5, 12, 4, 1);
+set_feature(&cpu->env, ARM_FEATURE_CRC);
+cpu->id_isar5 = deposit32(cpu->id_isar5, 16, 4, 1);
+set_feature(&cpu->env, ARM_FEATURE_V8_RDM);
+cpu->id_isar5 = deposit32(cpu->id_isar5, 24, 4, 1);
+set_feature(&cpu->env, ARM_FEATURE_V8_FCMA);
+cpu->id_isar5 = deposit32(cpu->id_isar5, 28, 4, 1);
+
+set_feature(&cpu->env, ARM_FEATURE_V8_DOTPROD);
+cpu->id_isar6 = deposit32(cpu->id_isar6, 4, 4, 1);
+
 #ifdef CONFIG_USER_ONLY
 /* We don't set these in system emulation mode for the moment,
  * since we don't correctly set the ID registers to advertise them,
  */
 set_feature(&cpu->env, ARM_FEATURE_V8);
-set_feature(&cpu->env, ARM_FEATURE_V8_AES);
-set_feature(&cpu->env, ARM_FEATURE_V8_SHA1);
-set_feature(&cpu->env, ARM_FEATURE_V8_SHA256);
 set_feature(&cpu->env, ARM_FEATURE_V8_PMULL);
-set_feature(&cpu->env, ARM_FEATURE_CRC);
-set_feature(&cpu->env, ARM_FEATURE_V8_RDM);
-set_feature(&cpu->env, ARM_FEATURE_V8_DOTPROD);
-set_feature(&cpu->env, ARM_FEATURE_V8_FCMA);
 #endif
 #endif
 }
 }
diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
index 800bff780e..4d629bb99b 100644
--- a/target/arm/cpu64.c
+++ b/target/arm/cpu64.c
@@ -254,6 +254,34 @@ static void aarch64_max_initfn(Object *obj)
 kvm_arm_set_cpu_features_from_host(cpu);
 } else {
 aarch64_a57_initfn(obj);
+
+set_feature(&cpu->env, ARM_FEATURE_V8_SHA512);
+cpu->id_aa64isar0 = deposit64(cpu->id_aa64isar0, 12, 4, 2);
+
+set_feature(&cpu->env, ARM_FEATURE_V8_ATOMICS);
+cpu->id_aa64isar0 = deposit64(cpu->id_aa64isar0, 20, 4, 2);
+
+set_feature(&cpu->env, ARM_FEATURE_V8_RDM);
+cpu->id_aa64isar0 = deposit64(cpu->id_aa64isar0, 28, 4, 1);
+cpu->id_isar5 = deposit32(cpu->id_isar5, 24, 4, 1);
+
+set_feature(&cpu->env, ARM_FEATURE_V8_SHA3);
+cpu->id_aa64isar0 = deposit64(cpu->id_aa64isar0, 32, 4, 1);
+
+set_feature(&cpu->env, ARM_FEATURE_V8_SM3);
+cpu->id_aa64isar0 = deposit64(cpu->id_aa64isar0, 36, 4, 1);
+
+set_feature(&cpu->env, ARM_FEATURE_V8_SM4);
+cpu->id_aa64isar0 = deposit64(cpu->id_aa64isar0, 40, 4, 1);
+
+set_feature(&cpu->env, ARM_FEATURE_V8_DOTPROD);
+cpu->id_aa64isar0 = deposit64(cpu->id_aa64isar0, 44, 4, 1);
+cpu->id_isar6 = deposit32(cpu->id_isar6, 4, 4, 1);
+
+set_feature(&cpu->env, ARM_FEATURE_V8_FCMA);
+cpu->id_aa64isar1 = deposit64(cpu->id_aa64isar1, 16, 4, 1);
+cpu->id_isar5 = deposit32(cpu->id_isar5, 28, 4, 1);
+
 #ifdef CONFIG_USER_ONLY
 /* We don't set these in system emulation mode for the moment,
  * since we don't correctly set the ID registers to advertise them,
@@ -261,15 +289,7 @@ static void aarch64_max_initfn(Object *obj)
  * whereas the architecture requires them to be present in both if
  * present in either.
  */
-set_feature(&cpu->env, ARM_FEATURE_V8_SHA512);
-set_feature(&cpu->env, ARM_FEATURE_V8_SHA3);
-set_feature(&cpu->env, ARM_FEATURE_V8_SM3);
-set_feature(&cpu->env, ARM_FEATURE_V8_SM4);
-set_feature(&cpu->env, ARM_FEATURE_V8_ATOMICS);
-set_feature(&cpu->env, ARM_FEATURE_V8_RDM);
-set_feature(&cpu->env, ARM_FEATURE_V8_DOTPROD);
 set_feature(&cpu->env, ARM_FEATURE_V8_FP16);
-set_feature(&cpu->env, ARM_FEATURE_V8_FCMA);
 set_feature(&cpu->env, ARM_FEATURE_SVE);
 /* For usermode -cpu max we can use a larger and more efficient DCZ
  * blocksize since we don't have to follow what the hardware does.
-- 
2.17.1
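
For readers unfamiliar with the `deposit32(value, start, length, fieldval)` calls used above to fill the ID register fields, the operation can be sketched as follows. This is a minimal stand-in written for illustration, not QEMU's own implementation; the name `deposit_field32` is hypothetical.

```c
#include <stdint.h>

/*
 * Replace the LENGTH-bit field (1..32 bits) starting at bit START of
 * VALUE with FIELDVAL, leaving all other bits of VALUE untouched.
 */
static uint32_t deposit_field32(uint32_t value, int start, int length,
                                uint32_t fieldval)
{
    uint32_t mask = (~0u >> (32 - length)) << start;
    return (value & ~mask) | ((fieldval << start) & mask);
}
```

With this definition, depositing the value 2 into bits [7:4] of a zero register yields 0x20, which corresponds to the ID_ISAR5 update made next to ARM_FEATURE_V8_AES in the patch.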




[Qemu-devel] [PATCH 02/20] target/arm: Set ID_AA64PFR0 bits for SVE for -cpu max

2018-08-08 Thread Richard Henderson
This is a hair out of spec in that we have, and advertise, support
for fp16 in aarch64 mode, but neither have nor advertise the same
in aarch32 mode.  Rationale as commented.

Signed-off-by: Richard Henderson 
---
 target/arm/cpu64.c | 21 +++--
 1 file changed, 15 insertions(+), 6 deletions(-)

diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
index 4d629bb99b..ae650b608e 100644
--- a/target/arm/cpu64.c
+++ b/target/arm/cpu64.c
@@ -282,15 +282,24 @@ static void aarch64_max_initfn(Object *obj)
 cpu->id_aa64isar1 = deposit64(cpu->id_aa64isar1, 16, 4, 1);
 cpu->id_isar5 = deposit32(cpu->id_isar5, 28, 4, 1);
 
-#ifdef CONFIG_USER_ONLY
-/* We don't set these in system emulation mode for the moment,
- * since we don't correctly set the ID registers to advertise them,
- * and in some cases they're only available in AArch64 and not AArch32,
- * whereas the architecture requires them to be present in both if
- * present in either.
+/* TODO: This is not yet implemented for AArch32, whereas the
+ * architecture requires a feature to be present in both if
+ * it is present in either.  However, it is required by SVE,
+ * so we don't want to leave it out of AArch64 state.
+ *
+ * Practically, the Linux kernel does not query the MVFR1 bit
+ * nor expose this as a HWCAP bit to AArch32 userland.  Thus
+ * userland, if it wanted to use fp16, would have to probe for
+ * support by executing an insn and checking for SIGILL.
+ * At which point it will get the correct answer: unsupported.
  */
 set_feature(&cpu->env, ARM_FEATURE_V8_FP16);
+cpu->id_aa64pfr0 = deposit64(cpu->id_aa64pfr0, 20, 4, 1);
+
 set_feature(&cpu->env, ARM_FEATURE_SVE);
+cpu->id_aa64pfr0 = deposit64(cpu->id_aa64pfr0, 32, 4, 1);
+
+#ifdef CONFIG_USER_ONLY
 /* For usermode -cpu max we can use a larger and more efficient DCZ
  * blocksize since we don't have to follow what the hardware does.
  */
-- 
2.17.1




[Qemu-devel] [PATCH 00/20] target/arm: sve system mode patches

2018-08-08 Thread Richard Henderson
This is my current set of patches for running SVE in system mode.

The first half deals with the system registers that affect SVE.
I recall that Peter has said he'd like the first patch to be
done a different way, but we haven't had a chance to talk about
what form it should take.  I've left it as-is since it does what
I need for now.

The second half re-implements the SVE memory operations.
The FF and NF loads had been stubbed out.  Getting those to work
requires some infrastructure that can be reused to speed up normal
loads -- one guest-to-host tlb lookup can be reused for the rest
of the page.


r~


Based-on: <20180809034033.10579-1-richard.hender...@linaro.org>
Richard Henderson (20):
  target/arm: Set ISAR bits for -cpu max
  target/arm: Set ID_AA64PFR0 bits for SVE for -cpu max
  target/arm: Define ID_AA64ZFR0_EL1
  target/arm: Adjust sve_exception_el
  target/arm: Fix arm_cpu_data_is_big_endian for aa64 user-only
  target/arm: Fix arm_current_el for user-only
  target/arm: Fix is_a64 for user-only
  target/arm: Pass in current_el to fp and sve_exception_el
  target/arm: Handle SVE vector length changes in system mode
  target/arm: Adjust aarch64_cpu_dump_state for system mode SVE
  target/arm: Clear unused predicate bits for LD1RQ
  target/arm: Rewrite helper_sve_ld1*_r using pages
  target/arm: Rewrite helper_sve_ld[234]*_r
  target/arm: Rewrite helper_sve_st[1234]*_r
  target/arm: Split contiguous loads for endianness
  target/arm: Split contiguous stores for endianness
  target/arm: Rewrite vector gather loads
  target/arm: Rewrite vector gather stores
  target/arm: Rewrite vector gather first-fault loads
  target/arm: Pass TCGMemOpIdx to sve memory helpers

 target/arm/cpu.h   |   47 +-
 target/arm/helper-sve.h|  385 +--
 target/arm/internals.h |5 +
 target/arm/cpu.c   |   24 +-
 target/arm/cpu64.c |   93 +-
 target/arm/helper.c|  237 +++--
 target/arm/op_helper.c |1 +
 target/arm/sve_helper.c| 2062 +---
 target/arm/translate-a64.c |8 +-
 target/arm/translate-sve.c |  670 
 10 files changed, 2453 insertions(+), 1079 deletions(-)

-- 
2.17.1




[Qemu-devel] [PATCH 11/11] target/arm: Add sve-max-vq cpu property to -cpu max

2018-08-08 Thread Richard Henderson
This allows the default (and maximum) vector length to be set
from the command-line, which is extraordinarily helpful in
debugging problems depending on vector length without having to
bake knowledge of PR_SVE_SET_VL into every guest binary.
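
The way this patch bounds a requested vector length against the new property can be sketched as a standalone function. This is an illustration under stated assumptions only: `sve_pick_vq` is a hypothetical name, the return value 0 stands in for the EINVAL path, and the limits (at most 512 quadwords, length a multiple of 16 bytes) are the ones checked in the syscall hunk of this patch.

```c
#include <stdint.h>

/*
 * Hypothetical sketch: map a requested vector length in bytes to a
 * quadword count (vq), clamped to [1, sve_max_vq].  Returns 0 when the
 * request would be rejected.
 */
static uint32_t sve_pick_vq(long requested_vl, uint32_t sve_max_vq)
{
    uint32_t vq;

    if (requested_vl < 0 || requested_vl > 512 * 16 || (requested_vl & 15)) {
        return 0;                       /* invalid request */
    }
    vq = (uint32_t)(requested_vl / 16); /* bytes to quadwords */
    if (vq < 1) {
        vq = 1;                         /* at least one quadword */
    }
    if (vq > sve_max_vq) {
        vq = sve_max_vq;                /* bounded by the cpu property */
    }
    return vq;
}
```

A request larger than the configured maximum is therefore clamped rather than rejected, as long as it stays within the range the kernel itself considers valid.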

Cc: qemu-sta...@nongnu.org (3.0.1)
Signed-off-by: Richard Henderson 
---
 target/arm/cpu.h |  3 +++
 linux-user/syscall.c | 19 +--
 target/arm/cpu.c |  6 +++---
 target/arm/cpu64.c   | 29 +
 target/arm/helper.c  |  7 +--
 5 files changed, 53 insertions(+), 11 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index e310ffc29d..9526ed27cb 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -857,6 +857,9 @@ struct ARMCPU {
 
 /* Used to synchronize KVM and QEMU in-kernel device levels */
 uint8_t device_irq_level;
+
+/* Used to set the maximum vector length the cpu will support.  */
+uint32_t sve_max_vq;
 };
 
 static inline ARMCPU *arm_env_get_cpu(CPUARMState *env)
diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index dfc851cc35..5a4af76c03 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -10848,15 +10848,22 @@ abi_long do_syscall(void *cpu_env, int num, abi_long arg1,
 #endif
 #ifdef TARGET_AARCH64
 case TARGET_PR_SVE_SET_VL:
-/* We cannot support either PR_SVE_SET_VL_ONEXEC
-   or PR_SVE_VL_INHERIT.  Therefore, anything above
-   ARM_MAX_VQ results in EINVAL.  */
+/*
+ * We cannot support either PR_SVE_SET_VL_ONEXEC or
+ * PR_SVE_VL_INHERIT.  Note the kernel definition
+ * of sve_vl_valid allows for VQ=512, i.e. VL=8192,
+ * even though the current architectural maximum is VQ=16.
+ */
 ret = -TARGET_EINVAL;
 if (arm_feature(cpu_env, ARM_FEATURE_SVE)
-&& arg2 >= 0 && arg2 <= ARM_MAX_VQ * 16 && !(arg2 & 15)) {
+&& arg2 >= 0 && arg2 <= 512 * 16 && !(arg2 & 15)) {
 CPUARMState *env = cpu_env;
-int old_vq = (env->vfp.zcr_el[1] & 0xf) + 1;
-int vq = MAX(arg2 / 16, 1);
+ARMCPU *cpu = arm_env_get_cpu(env);
+uint32_t vq, old_vq;
+
+old_vq = (env->vfp.zcr_el[1] & 0xf) + 1;
+vq = MAX(arg2 / 16, 1);
+vq = MIN(vq, cpu->sve_max_vq);
 
 if (vq < old_vq) {
 aarch64_sve_narrow_vq(env, vq);
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index 64a8005a4b..b25898ed4c 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -168,9 +168,9 @@ static void arm_cpu_reset(CPUState *s)
 env->cp15.cpacr_el1 = deposit64(env->cp15.cpacr_el1, 16, 2, 3);
 env->cp15.cptr_el[3] |= CPTR_EZ;
 /* with maximum vector length */
-env->vfp.zcr_el[1] = ARM_MAX_VQ - 1;
-env->vfp.zcr_el[2] = ARM_MAX_VQ - 1;
-env->vfp.zcr_el[3] = ARM_MAX_VQ - 1;
+env->vfp.zcr_el[1] = cpu->sve_max_vq - 1;
+env->vfp.zcr_el[2] = env->vfp.zcr_el[1];
+env->vfp.zcr_el[3] = env->vfp.zcr_el[1];
 #else
 /* Reset into the highest available EL */
 if (arm_feature(env, ARM_FEATURE_EL3)) {
diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
index d0581d59d8..800bff780e 100644
--- a/target/arm/cpu64.c
+++ b/target/arm/cpu64.c
@@ -29,6 +29,7 @@
 #include "sysemu/sysemu.h"
 #include "sysemu/kvm.h"
 #include "kvm_arm.h"
+#include "qapi/visitor.h"
 
 static inline void set_feature(CPUARMState *env, int feature)
 {
@@ -217,6 +218,29 @@ static void aarch64_a53_initfn(Object *obj)
 define_arm_cp_regs(cpu, cortex_a57_a53_cp_reginfo);
 }
 
+static void cpu_max_get_sve_vq(Object *obj, Visitor *v, const char *name,
+   void *opaque, Error **errp)
+{
+ARMCPU *cpu = ARM_CPU(obj);
+visit_type_uint32(v, name, &cpu->sve_max_vq, errp);
+}
+
+static void cpu_max_set_sve_vq(Object *obj, Visitor *v, const char *name,
+   void *opaque, Error **errp)
+{
+ARMCPU *cpu = ARM_CPU(obj);
+Error *err = NULL;
+
+visit_type_uint32(v, name, &cpu->sve_max_vq, &err);
+
+if (!err && (cpu->sve_max_vq == 0 || cpu->sve_max_vq > ARM_MAX_VQ)) {
+error_setg(&err, "unsupported SVE vector length");
+error_append_hint(&err, "Valid sve-max-vq in range [1-%d]\n",
+  ARM_MAX_VQ);
+}
+error_propagate(errp, err);
+}
+
 /* -cpu max: if KVM is enabled, like -cpu host (best possible with this host);
  * otherwise, a CPU with as many features enabled as our emulation supports.
  * The version of '-cpu max' for qemu-system-arm is defined in cpu.c;
@@ -253,6 +277,10 @@ static void aarch64_max_initfn(Object *obj)
 cpu->ctr = 0x80038003; /* 32 byte I and D cacheline size, VIPT icache */
 cpu->dcz_blocksize = 7; /*  512 bytes */
 #endif
+
+cpu->sve_max_vq = ARM_MAX_VQ;
+object_property_add(obj, "sve-max-vq", "uint32", 

[Qemu-devel] [PATCH 08/11] target/arm: Fix offset scaling for LD_zprr and ST_zprr

2018-08-08 Thread Richard Henderson
The scaling should be solely on the memory operation size; the number
of registers being loaded does not come into the initial computation.

Cc: qemu-sta...@nongnu.org (3.0.1)
Reported-by: Laurent Desnogues 
Signed-off-by: Richard Henderson 
---
 target/arm/translate-sve.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index f635822a61..d27bc8c946 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -4665,8 +4665,7 @@ static bool trans_LD_zprr(DisasContext *s, arg_rprr_load *a, uint32_t insn)
 }
 if (sve_access_check(s)) {
 TCGv_i64 addr = new_tmp_a64(s);
-tcg_gen_muli_i64(addr, cpu_reg(s, a->rm),
- (a->nreg + 1) << dtype_msz(a->dtype));
+tcg_gen_shli_i64(addr, cpu_reg(s, a->rm), dtype_msz(a->dtype));
 tcg_gen_add_i64(addr, addr, cpu_reg_sp(s, a->rn));
 do_ld_zpa(s, a->rd, a->pg, addr, a->dtype, a->nreg);
 }
@@ -4899,7 +4898,7 @@ static bool trans_ST_zprr(DisasContext *s, arg_rprr_store *a, uint32_t insn)
 }
 if (sve_access_check(s)) {
 TCGv_i64 addr = new_tmp_a64(s);
-tcg_gen_muli_i64(addr, cpu_reg(s, a->rm), (a->nreg + 1) << a->msz);
+tcg_gen_shli_i64(addr, cpu_reg(s, a->rm), a->msz);
 tcg_gen_add_i64(addr, addr, cpu_reg_sp(s, a->rn));
 do_st_zpa(s, a->rd, a->pg, addr, a->msz, a->esz, a->nreg);
 }
-- 
2.17.1




[Qemu-devel] [PATCH 07/11] target/arm: Fix offset for LD1R instructions

2018-08-08 Thread Richard Henderson
The immediate should be scaled by the size of the memory reference,
not the size of the elements into which it is loaded.

Cc: qemu-sta...@nongnu.org (3.0.1)
Reported-by: Laurent Desnogues 
Signed-off-by: Richard Henderson 
---
 target/arm/translate-sve.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index 9e63b5f8e5..f635822a61 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -4819,6 +4819,7 @@ static bool trans_LD1R_zpri(DisasContext *s, arg_rpri_load *a, uint32_t insn)
 unsigned vsz = vec_full_reg_size(s);
 unsigned psz = pred_full_reg_size(s);
 unsigned esz = dtype_esz[a->dtype];
+unsigned msz = dtype_msz(a->dtype);
 TCGLabel *over = gen_new_label();
 TCGv_i64 temp;
 
@@ -4842,7 +4843,7 @@ static bool trans_LD1R_zpri(DisasContext *s, arg_rpri_load *a, uint32_t insn)
 
 /* Load the data.  */
 temp = tcg_temp_new_i64();
-tcg_gen_addi_i64(temp, cpu_reg_sp(s, a->rn), a->imm << esz);
+tcg_gen_addi_i64(temp, cpu_reg_sp(s, a->rn), a->imm << msz);
 tcg_gen_qemu_ld_i64(temp, temp, get_mem_index(s),
 s->be_data | dtype_mop[a->dtype]);
 
-- 
2.17.1




[Qemu-devel] [PATCH 04/11] target/arm: Fix typo in helper_sve_movz_d

2018-08-08 Thread Richard Henderson
Cc: qemu-sta...@nongnu.org (3.0.1)
Tested-by: Laurent Desnogues 
Reviewed-by: Laurent Desnogues 
Reviewed-by: Alex Bennée 
Reviewed-by: Philippe Mathieu-Daudé 
Reported-by: Laurent Desnogues 
Signed-off-by: Richard Henderson 
---
 target/arm/sve_helper.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index 87594a8adb..c3cbec9cf5 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -1042,7 +1042,7 @@ void HELPER(sve_movz_d)(void *vd, void *vn, void *vg, uint32_t desc)
 uint64_t *d = vd, *n = vn;
 uint8_t *pg = vg;
 for (i = 0; i < opr_sz; i += 1) {
-d[i] = n[1] & -(uint64_t)(pg[H1(i)] & 1);
+d[i] = n[i] & -(uint64_t)(pg[H1(i)] & 1);
 }
 }
 
-- 
2.17.1




[Qemu-devel] [PATCH 10/11] target/arm: Dump SVE state if enabled

2018-08-08 Thread Richard Henderson
Also fold the FPCR/FPSR state onto the same line as PSTATE,
and mention but do not dump disabled FPU state.

Cc: qemu-sta...@nongnu.org (3.0.1)
Signed-off-by: Richard Henderson 
---
 target/arm/translate-a64.c | 95 +-
 1 file changed, 83 insertions(+), 12 deletions(-)

diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index 358f169c75..b29dc49c4f 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -152,8 +152,7 @@ void aarch64_cpu_dump_state(CPUState *cs, FILE *f,
 } else {
 ns_status = "";
 }
-
-cpu_fprintf(f, "\nPSTATE=%08x %c%c%c%c %sEL%d%c\n",
+cpu_fprintf(f, "PSTATE=%08x %c%c%c%c %sEL%d%c",
 psr,
 psr & PSTATE_N ? 'N' : '-',
 psr & PSTATE_Z ? 'Z' : '-',
@@ -163,17 +162,89 @@ void aarch64_cpu_dump_state(CPUState *cs, FILE *f,
 el,
 psr & PSTATE_SP ? 'h' : 't');
 
-if (flags & CPU_DUMP_FPU) {
-int numvfpregs = 32;
-for (i = 0; i < numvfpregs; i++) {
-uint64_t *q = aa64_vfp_qreg(env, i);
-uint64_t vlo = q[0];
-uint64_t vhi = q[1];
-cpu_fprintf(f, "q%02d=%016" PRIx64 ":%016" PRIx64 "%c",
-i, vhi, vlo, (i & 1 ? '\n' : ' '));
+if (!(flags & CPU_DUMP_FPU)) {
+cpu_fprintf(f, "\n");
+return;
+}
+cpu_fprintf(f, " FPCR=%08x FPSR=%08x\n",
+vfp_get_fpcr(env), vfp_get_fpsr(env));
+
+if (arm_feature(env, ARM_FEATURE_SVE)) {
+int j, zcr_len = env->vfp.zcr_el[1] & 0xf; /* fix for system mode */
+
+for (i = 0; i <= FFR_PRED_NUM; i++) {
+bool eol;
+if (i == FFR_PRED_NUM) {
+cpu_fprintf(f, "FFR=");
+/* It's last, so end the line.  */
+eol = true;
+} else {
+cpu_fprintf(f, "P%02d=", i);
+switch (zcr_len) {
+case 0:
+eol = i % 8 == 7;
+break;
+case 1:
+eol = i % 6 == 5;
+break;
+case 2:
+case 3:
+eol = i % 3 == 2;
+break;
+default:
+/* More than one quadword per predicate.  */
+eol = true;
+break;
+}
+}
+for (j = zcr_len / 4; j >= 0; j--) {
+int digits;
+if (j * 4 + 4 <= zcr_len + 1) {
+digits = 16;
+} else {
+digits = (zcr_len % 4 + 1) * 4;
+}
+cpu_fprintf(f, "%0*" PRIx64 "%s", digits,
+env->vfp.pregs[i].p[j],
+j ? ":" : eol ? "\n" : " ");
+}
+}
+
+for (i = 0; i < 32; i++) {
+if (zcr_len == 0) {
+cpu_fprintf(f, "Z%02d=%016" PRIx64 ":%016" PRIx64 "%s",
+i, env->vfp.zregs[i].d[1],
+env->vfp.zregs[i].d[0], i & 1 ? "\n" : " ");
+} else if (zcr_len == 1) {
+cpu_fprintf(f, "Z%02d=%016" PRIx64 ":%016" PRIx64
+":%016" PRIx64 ":%016" PRIx64 "\n",
+i, env->vfp.zregs[i].d[3], env->vfp.zregs[i].d[2],
+env->vfp.zregs[i].d[1], env->vfp.zregs[i].d[0]);
+} else {
+for (j = zcr_len; j >= 0; j--) {
+bool odd = (zcr_len - j) % 2 != 0;
+if (j == zcr_len) {
+cpu_fprintf(f, "Z%02d[%x-%x]=", i, j, j - 1);
+} else if (!odd) {
+if (j > 0) {
+cpu_fprintf(f, "   [%x-%x]=", j, j - 1);
+} else {
+cpu_fprintf(f, " [%x]=", j);
+}
+}
+cpu_fprintf(f, "%016" PRIx64 ":%016" PRIx64 "%s",
+env->vfp.zregs[i].d[j * 2 + 1],
+env->vfp.zregs[i].d[j * 2],
+odd || j == 0 ? "\n" : ":");
+}
+}
+}
+} else {
+for (i = 0; i < 32; i++) {
+uint64_t *q = aa64_vfp_qreg(env, i);
+cpu_fprintf(f, "Q%02d=%016" PRIx64 ":%016" PRIx64 "%s",
+i, q[1], q[0], (i & 1 ? "\n" : " "));
 }
-cpu_fprintf(f, "FPCR: %08x  FPSR: %08x\n",
-vfp_get_fpcr(env), vfp_get_fpsr(env));
 }
 }
 
-- 
2.17.1




[Qemu-devel] [PATCH 02/11] target/arm: Fix typo in do_sat_addsub_64

2018-08-08 Thread Richard Henderson
Used the wrong temporary in the computation of subtractive overflow.

Cc: qemu-sta...@nongnu.org (3.0.1)
Tested-by: Laurent Desnogues 
Reviewed-by: Laurent Desnogues 
Reported-by: Laurent Desnogues 
Signed-off-by: Richard Henderson 
---
 target/arm/translate-sve.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index 374051cd20..9dd4c38bab 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -1625,7 +1625,7 @@ static void do_sat_addsub_64(TCGv_i64 reg, TCGv_i64 val, bool u, bool d)
 /* Detect signed overflow for subtraction.  */
 tcg_gen_xor_i64(t0, reg, val);
 tcg_gen_sub_i64(t1, reg, val);
-tcg_gen_xor_i64(reg, reg, t0);
+tcg_gen_xor_i64(reg, reg, t1);
 tcg_gen_and_i64(t0, t0, reg);
 
 /* Bound the result.  */
-- 
2.17.1




[Qemu-devel] [PATCH 06/11] target/arm: Fix sign-extension in sve do_ldr/do_str

2018-08-08 Thread Richard Henderson
The expression (int) imm + (uint32_t) len_align turns into uint32_t
and thus with negative imm produces a memory operation at the wrong
offset.  None of the numbers involved are particularly large, so
change everything to use int.

Cc: qemu-sta...@nongnu.org (3.0.1)
Reported-by: Laurent Desnogues 
Signed-off-by: Richard Henderson 
---
 target/arm/translate-sve.c | 18 --
 1 file changed, 8 insertions(+), 10 deletions(-)

diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index 89efc80ee7..9e63b5f8e5 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -4372,12 +4372,11 @@ static bool trans_UCVTF_dd(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
  * The load should begin at the address Rn + IMM.
  */
 
-static void do_ldr(DisasContext *s, uint32_t vofs, uint32_t len,
-   int rn, int imm)
+static void do_ldr(DisasContext *s, uint32_t vofs, int len, int rn, int imm)
 {
-uint32_t len_align = QEMU_ALIGN_DOWN(len, 8);
-uint32_t len_remain = len % 8;
-uint32_t nparts = len / 8 + ctpop8(len_remain);
+int len_align = QEMU_ALIGN_DOWN(len, 8);
+int len_remain = len % 8;
+int nparts = len / 8 + ctpop8(len_remain);
 int midx = get_mem_index(s);
 TCGv_i64 addr, t0, t1;
 
@@ -4458,12 +4457,11 @@ static void do_ldr(DisasContext *s, uint32_t vofs, uint32_t len,
 }
 
 /* Similarly for stores.  */
-static void do_str(DisasContext *s, uint32_t vofs, uint32_t len,
-   int rn, int imm)
+static void do_str(DisasContext *s, uint32_t vofs, int len, int rn, int imm)
 {
-uint32_t len_align = QEMU_ALIGN_DOWN(len, 8);
-uint32_t len_remain = len % 8;
-uint32_t nparts = len / 8 + ctpop8(len_remain);
+int len_align = QEMU_ALIGN_DOWN(len, 8);
+int len_remain = len % 8;
+int nparts = len / 8 + ctpop8(len_remain);
 int midx = get_mem_index(s);
 TCGv_i64 addr, t0;
 
-- 
2.17.1




[Qemu-devel] [PATCH 03/11] target/arm: Reorganize SVE WHILE

2018-08-08 Thread Richard Henderson
The pseudocode for this operation is an increment + compare loop,
so comparing <= the maximum integer produces an all-true predicate.

Rather than bound in both the inline code and the helper, pass the
helper the number of predicate bits to set instead of the number
of predicate elements to set.

Cc: qemu-sta...@nongnu.org (3.0.1)
Tested-by: Laurent Desnogues 
Reviewed-by: Laurent Desnogues 
Reported-by: Laurent Desnogues 
Signed-off-by: Richard Henderson 
---
 target/arm/sve_helper.c|  5 
 target/arm/translate-sve.c | 49 +-
 2 files changed, 32 insertions(+), 22 deletions(-)

diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index 9bd0694d55..87594a8adb 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -2846,11 +2846,6 @@ uint32_t HELPER(sve_while)(void *vd, uint32_t count, uint32_t pred_desc)
 return flags;
 }
 
-/* Scale from predicate element count to bits.  */
-count <<= esz;
-/* Bound to the bits in the predicate.  */
-count = MIN(count, oprsz * 8);
-
 /* Set all of the requested bits.  */
 for (i = 0; i < count / 64; ++i) {
 d->p[i] = esz_mask;
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index 9dd4c38bab..89efc80ee7 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -3173,19 +3173,19 @@ static bool trans_CTERM(DisasContext *s, arg_CTERM *a, uint32_t insn)
 
 static bool trans_WHILE(DisasContext *s, arg_WHILE *a, uint32_t insn)
 {
-if (!sve_access_check(s)) {
-return true;
-}
-
-TCGv_i64 op0 = read_cpu_reg(s, a->rn, 1);
-TCGv_i64 op1 = read_cpu_reg(s, a->rm, 1);
-TCGv_i64 t0 = tcg_temp_new_i64();
-TCGv_i64 t1 = tcg_temp_new_i64();
+TCGv_i64 op0, op1, t0, t1, tmax;
 TCGv_i32 t2, t3;
 TCGv_ptr ptr;
 unsigned desc, vsz = vec_full_reg_size(s);
 TCGCond cond;
 
+if (!sve_access_check(s)) {
+return true;
+}
+
+op0 = read_cpu_reg(s, a->rn, 1);
+op1 = read_cpu_reg(s, a->rm, 1);
+
 if (!a->sf) {
 if (a->u) {
 tcg_gen_ext32u_i64(op0, op0);
@@ -3198,32 +3198,47 @@ static bool trans_WHILE(DisasContext *s, arg_WHILE *a, uint32_t insn)
 
 /* For the helper, compress the different conditions into a computation
  * of how many iterations for which the condition is true.
- *
- * This is slightly complicated by 0 <= UINT64_MAX, which is nominally
- * 2**64 iterations, overflowing to 0.  Of course, predicate registers
- * aren't that large, so any value >= predicate size is sufficient.
  */
+t0 = tcg_temp_new_i64();
+t1 = tcg_temp_new_i64();
 tcg_gen_sub_i64(t0, op1, op0);
 
-/* t0 = MIN(op1 - op0, vsz).  */
-tcg_gen_movi_i64(t1, vsz);
-tcg_gen_umin_i64(t0, t0, t1);
+tmax = tcg_const_i64(vsz >> a->esz);
 if (a->eq) {
 /* Equality means one more iteration.  */
 tcg_gen_addi_i64(t0, t0, 1);
+
+/* If op1 is max (un)signed integer (and the only time the addition
+ * above could overflow), then we produce an all-true predicate by
+ * setting the count to the vector length.  This is because the
+ * pseudocode is described as an increment + compare loop, and the
+ * max integer would always compare true.
+ */
+tcg_gen_movi_i64(t1, (a->sf
+  ? (a->u ? UINT64_MAX : INT64_MAX)
+  : (a->u ? UINT32_MAX : INT32_MAX)));
+tcg_gen_movcond_i64(TCG_COND_EQ, t0, op1, t1, tmax, t0);
 }
 
-/* t0 = (condition true ? t0 : 0).  */
+/* Bound to the maximum.  */
+tcg_gen_umin_i64(t0, t0, tmax);
+tcg_temp_free_i64(tmax);
+
+/* Set the count to zero if the condition is false.  */
 cond = (a->u
 ? (a->eq ? TCG_COND_LEU : TCG_COND_LTU)
 : (a->eq ? TCG_COND_LE : TCG_COND_LT));
 tcg_gen_movi_i64(t1, 0);
 tcg_gen_movcond_i64(cond, t0, op0, op1, t0, t1);
+tcg_temp_free_i64(t1);
 
+/* Since we're bounded, pass as a 32-bit type.  */
 t2 = tcg_temp_new_i32();
 tcg_gen_extrl_i64_i32(t2, t0);
 tcg_temp_free_i64(t0);
-tcg_temp_free_i64(t1);
+
+/* Scale elements to bits.  */
+tcg_gen_shli_i32(t2, t2, a->esz);
 
 desc = (vsz / 8) - 2;
 desc = deposit32(desc, SIMD_DATA_SHIFT, 2, a->esz);
-- 
2.17.1




[Qemu-devel] [PATCH 09/11] target/arm: Reformat integer register dump

2018-08-08 Thread Richard Henderson
With PC, there are 33 registers.  Three per line lines up nicely
without overflowing 80 columns.

Cc: qemu-sta...@nongnu.org (3.0.1)
Signed-off-by: Richard Henderson 
---
 target/arm/translate-a64.c | 13 ++---
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index 45a6c2a3aa..358f169c75 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -137,14 +137,13 @@ void aarch64_cpu_dump_state(CPUState *cs, FILE *f,
 int el = arm_current_el(env);
 const char *ns_status;
 
-cpu_fprintf(f, "PC=%016"PRIx64"  SP=%016"PRIx64"\n",
-env->pc, env->xregs[31]);
-for (i = 0; i < 31; i++) {
-cpu_fprintf(f, "X%02d=%016"PRIx64, i, env->xregs[i]);
-if ((i % 4) == 3) {
-cpu_fprintf(f, "\n");
+cpu_fprintf(f, " PC=%016" PRIx64 " ", env->pc);
+for (i = 0; i < 32; i++) {
+if (i == 31) {
+cpu_fprintf(f, " SP=%016" PRIx64 "\n", env->xregs[i]);
 } else {
-cpu_fprintf(f, " ");
+cpu_fprintf(f, "X%02d=%016" PRIx64 "%s", i, env->xregs[i],
+(i + 2) % 3 ? " " : "\n");
 }
 }
 
-- 
2.17.1




[Qemu-devel] [PATCH 01/11] target/arm: Fix sign of sve_cmpeq_ppzw/sve_cmpne_ppzw

2018-08-08 Thread Richard Henderson
The normal vector element is sign-extended before
comparing with the wide vector element.

Cc: qemu-sta...@nongnu.org (3.0.1)
Tested-by: Laurent Desnogues 
Reviewed-by: Laurent Desnogues 
Reviewed-by: Alex Bennée 
Reported-by: Laurent Desnogues 
Signed-off-by: Richard Henderson 
---
 target/arm/sve_helper.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index 54795c9194..9bd0694d55 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -2436,13 +2436,13 @@ uint32_t HELPER(NAME)(void *vd, void *vn, void *vm, void *vg, uint32_t desc) \
 #define DO_CMP_PPZW_S(NAME, TYPE, TYPEW, OP) \
 DO_CMP_PPZW(NAME, TYPE, TYPEW, OP, H1_4, 0xull)
 
-DO_CMP_PPZW_B(sve_cmpeq_ppzw_b, uint8_t,  uint64_t, ==)
-DO_CMP_PPZW_H(sve_cmpeq_ppzw_h, uint16_t, uint64_t, ==)
-DO_CMP_PPZW_S(sve_cmpeq_ppzw_s, uint32_t, uint64_t, ==)
+DO_CMP_PPZW_B(sve_cmpeq_ppzw_b, int8_t,  uint64_t, ==)
+DO_CMP_PPZW_H(sve_cmpeq_ppzw_h, int16_t, uint64_t, ==)
+DO_CMP_PPZW_S(sve_cmpeq_ppzw_s, int32_t, uint64_t, ==)
 
-DO_CMP_PPZW_B(sve_cmpne_ppzw_b, uint8_t,  uint64_t, !=)
-DO_CMP_PPZW_H(sve_cmpne_ppzw_h, uint16_t, uint64_t, !=)
-DO_CMP_PPZW_S(sve_cmpne_ppzw_s, uint32_t, uint64_t, !=)
+DO_CMP_PPZW_B(sve_cmpne_ppzw_b, int8_t,  uint64_t, !=)
+DO_CMP_PPZW_H(sve_cmpne_ppzw_h, int16_t, uint64_t, !=)
+DO_CMP_PPZW_S(sve_cmpne_ppzw_s, int32_t, uint64_t, !=)
 
 DO_CMP_PPZW_B(sve_cmpgt_ppzw_b, int8_t,   int64_t, >)
 DO_CMP_PPZW_H(sve_cmpgt_ppzw_h, int16_t,  int64_t, >)
-- 
2.17.1




[Qemu-devel] [PATCH 05/11] target/arm: Fix typo in helper_sve_ld1hss_r

2018-08-08 Thread Richard Henderson
Cc: qemu-sta...@nongnu.org (3.0.1)
Signed-off-by: Richard Henderson 
---
 target/arm/sve_helper.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index c3cbec9cf5..e03f954a26 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -4045,7 +4045,7 @@ DO_LD1(sve_ld1bdu_r, cpu_ldub_data_ra, uint64_t, uint8_t, )
 DO_LD1(sve_ld1bds_r, cpu_ldsb_data_ra, uint64_t, int8_t, )
 
 DO_LD1(sve_ld1hsu_r, cpu_lduw_data_ra, uint32_t, uint16_t, H1_4)
-DO_LD1(sve_ld1hss_r, cpu_ldsw_data_ra, uint32_t, int8_t, H1_4)
+DO_LD1(sve_ld1hss_r, cpu_ldsw_data_ra, uint32_t, int16_t, H1_4)
 DO_LD1(sve_ld1hdu_r, cpu_lduw_data_ra, uint64_t, uint16_t, )
 DO_LD1(sve_ld1hds_r, cpu_ldsw_data_ra, uint64_t, int16_t, )
 
-- 
2.17.1
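
A small sketch of the effect of the one-token fix above, assuming the loaded halfword passes through a temporary of the memory type before being stored into a 32-bit element. The function names are hypothetical, and the narrowing conversion to int8_t is implementation-defined in C (it keeps the low byte on the usual two's-complement compilers).

```c
#include <stdint.h>

/* Pre-patch: the signed halfword is squeezed through an int8_t temporary. */
static inline uint32_t ld1hss_elem_buggy(int16_t loaded)
{
    int8_t m = (int8_t)loaded;   /* drops the high byte */
    return (uint32_t)m;          /* sign-extends from 8 bits only */
}

/* Post-patch: the temporary is an int16_t, so the full halfword survives. */
static inline uint32_t ld1hss_elem_fixed(int16_t loaded)
{
    int16_t m = loaded;
    return (uint32_t)m;          /* sign-extends from 16 bits */
}
```

For example, a loaded value of 0x1234 would be stored as 0x34 by the buggy variant and as 0x1234 by the fixed one.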




[Qemu-devel] [PATCH 00/11] target/arm: sve linux-user patches

2018-08-08 Thread Richard Henderson
I posted a few of these before, and I thought Peter had applied them
to his target-arm.for-3-1 branch, but I don't see them there now.  

I've taken the opportunity to tag all of these for backport into the
next stable release.  I'm intending to do so for all of the correctness
patches affecting sve linux-user so that 3.0.1 will be usable long-term.


r~


Richard Henderson (11):
  target/arm: Fix sign of sve_cmpeq_ppzw/sve_cmpne_ppzw
  target/arm: Fix typo in do_sat_addsub_64
  target/arm: Reorganize SVE WHILE
  target/arm: Fix typo in helper_sve_movz_d
  target/arm: Fix typo in helper_sve_ld1hss_r
  target/arm: Fix sign-extension in sve do_ldr/do_str
  target/arm: Fix offset for LD1R instructions
  target/arm: Fix offset scaling for LD_zprr and ST_zprr
  target/arm: Reformat integer register dump
  target/arm: Dump SVE state if enabled
  target/arm: Add sve-max-vq cpu property to -cpu max

 target/arm/cpu.h   |   3 ++
 linux-user/syscall.c   |  19 ---
 target/arm/cpu.c   |   6 +--
 target/arm/cpu64.c |  29 ++
 target/arm/helper.c|   7 ++-
 target/arm/sve_helper.c|  21 +++-
 target/arm/translate-a64.c | 108 ++---
 target/arm/translate-sve.c |  77 +++---
 8 files changed, 195 insertions(+), 75 deletions(-)

-- 
2.17.1




Re: [Qemu-devel] [PATCH v3 10/10] migration: show the statistics of compression

2018-08-08 Thread Peter Xu
On Thu, Aug 09, 2018 at 11:13:17AM +0800, Xiao Guangrong wrote:
> 
> 
> On 08/08/2018 02:12 PM, Peter Xu wrote:
> > On Tue, Aug 07, 2018 at 05:12:09PM +0800, guangrong.x...@gmail.com wrote:
> > 
> > [...]
> > 
> > > @@ -1602,6 +1614,26 @@ static void migration_update_rates(RAMState *rs, 
> > > int64_t end_time)
> > >   rs->xbzrle_cache_miss_prev) / page_count;
> > >   rs->xbzrle_cache_miss_prev = xbzrle_counters.cache_miss;
> > >   }
> > > +
> > > +if (migrate_use_compression()) {
> > > +compression_counters.busy_rate = 
> > > (double)(compression_counters.busy -
> > > +rs->compress_thread_busy_prev) / page_count;
> > 
> > So this is related to the previous patch - I still doubt its
> > correctness if page_count is the host pages count rather than the
> > guest pages'.  Other than that the patch looks good to me.
> 
> I think I can boldly treat that as your Reviewed-by. :)

Yes, please do. :)

Regards,

-- 
Peter Xu



Re: [Qemu-devel] [PATCH v3 10/10] migration: show the statistics of compression

2018-08-08 Thread Xiao Guangrong




On 08/08/2018 02:12 PM, Peter Xu wrote:

On Tue, Aug 07, 2018 at 05:12:09PM +0800, guangrong.x...@gmail.com wrote:

[...]


@@ -1602,6 +1614,26 @@ static void migration_update_rates(RAMState *rs, int64_t 
end_time)
  rs->xbzrle_cache_miss_prev) / page_count;
  rs->xbzrle_cache_miss_prev = xbzrle_counters.cache_miss;
  }
+
+if (migrate_use_compression()) {
+compression_counters.busy_rate = (double)(compression_counters.busy -
+rs->compress_thread_busy_prev) / page_count;


So this is related to the previous patch - I still doubt its
correctness if page_count is the host pages count rather than the
guest pages'.  Other than that the patch looks good to me.


I think I can boldly treat that as your Reviewed-by. :)



Re: [Qemu-devel] [PATCH v3 08/10] migration: handle the error condition properly

2018-08-08 Thread Xiao Guangrong




On 08/08/2018 10:11 PM, Dr. David Alan Gilbert wrote:

* Xiao Guangrong (guangrong.x...@gmail.com) wrote:



On 08/08/2018 01:08 PM, Peter Xu wrote:

On Tue, Aug 07, 2018 at 05:12:07PM +0800, guangrong.x...@gmail.com wrote:

From: Xiao Guangrong 

ram_find_and_save_block() can return a negative value if any error happens;
however, it is completely ignored in the current code


Could you hint me where we'll return an error?



I think control_save_page() may return an error condition, but I am not
very familiar with it ... Other places look safe _currently_. These functions
were designed to return errors anyway.


ram_control_save_page's return is checked by control_save_page which
returns true/false but sets *pages to a return value.

What I'd need to follow closely is the case where ram_control_save_page
returns RAM_SAVE_CONTROL_DELAYED, in that case control_save_page I think
returns with *pages=-1 and returns true.
And I think in that case ram_save_target_page can leak that -1 - hmm.

Now, ram_save_host_page already checks for <0 and will return that,
but I think that would potentially loop in ram_find_and_save_block; I'm
not sure we want to change that or not!


ram_find_and_save_block() will continue the loop only if ram_save_host_page
returns zero:

..
if (found) {
pages = ram_save_host_page(rs, , last_stage);
}
} while (!pages && again);

IMHO, how to change the code really depends on the semantics of these functions;
based on their comments, returning error conditions is the current
semantics.

Another choice would be to squash error conditions into QEMUFile and
adapt the comments and code of these functions to reflect the new semantics
clearly.

Which one do you guys prefer? :)
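
The loop condition quoted above can be sketched in isolation. This is only an illustration under stated assumptions: the array of scripted results stands in for successive ram_save_host_page() calls, and "again" simply means that more results remain. Neither is how migration/ram.c tracks its state.

```c
#include <stdbool.h>

/*
 * Hypothetical model of the do/while in ram_find_and_save_block():
 * keep scanning while the last attempt saved nothing (0) and there is
 * more to scan. A negative result ends the loop just like a positive one.
 */
static inline int find_and_save_sketch(const int *results, int len)
{
    int pages = 0;
    int i = 0;
    bool again;

    do {
        again = i < len;                    /* more blocks left to scan? */
        pages = again ? results[i++] : 0;   /* stand-in for ram_save_host_page() */
    } while (!pages && again);

    return pages;
}
```

With scripted results 0, 0, -5, 3 the sketch stops at -5 and hands it to the caller, which is why the caller has to check for a negative return.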




Re: [Qemu-devel] [PATCH] spapr_cpu_core: vmstate_[un]register per-CPU data from (un)realizefn

2018-08-08 Thread David Gibson
On Wed, Aug 08, 2018 at 09:29:19PM +0530, Bharata B Rao wrote:
> VMStateDescription vmstate_spapr_cpu_state was added by commit
> b94020268e0b6 (spapr_cpu_core: migrate per-CPU data) to migrate per-CPU
> data with the required vmstate registration and unregistration calls.
> However the unregistration is being done only from vcpu creation error path
> and not from CPU delete path.
> 
> This causes migration to fail with the following error if migration is
> attempted after a CPU unplug like this:
> Unknown savevm section or instance 'spapr_cpu' 16
> Additionally this leaves the source VM unresponsive after migration failure.
> 
> Fix this by ensuring the vmstate_unregister happens during CPU removal.
> Fixing this becomes easier when vmstate (un)registration calls are moved to
> vcpu (un)realize functions which is what this patch does.
> 
> Fixes: https://bugs.launchpad.net/qemu/+bug/1785972
> Reported-by: Satheesh Rajendran 
> Signed-off-by: Bharata B Rao 

Applied to ppc-for-3.1.

Unfortunately, despite being a clear regression, I think it's too late
for 3.0 :(.

Mike, can you queue this for 3.0.1 as well, thanks?

> ---
>  hw/ppc/spapr_cpu_core.c | 62 
> +
>  1 file changed, 32 insertions(+), 30 deletions(-)
> 
> diff --git a/hw/ppc/spapr_cpu_core.c b/hw/ppc/spapr_cpu_core.c
> index 993759db47..bb88a3ce4e 100644
> --- a/hw/ppc/spapr_cpu_core.c
> +++ b/hw/ppc/spapr_cpu_core.c
> @@ -113,26 +113,6 @@ const char *spapr_get_cpu_core_type(const char *cpu_type)
>  return object_class_get_name(oc);
>  }
>  
> -static void spapr_unrealize_vcpu(PowerPCCPU *cpu)
> -{
> -qemu_unregister_reset(spapr_cpu_reset, cpu);
> -object_unparent(cpu->intc);
> -cpu_remove_sync(CPU(cpu));
> -object_unparent(OBJECT(cpu));
> -}
> -
> -static void spapr_cpu_core_unrealize(DeviceState *dev, Error **errp)
> -{
> -sPAPRCPUCore *sc = SPAPR_CPU_CORE(OBJECT(dev));
> -CPUCore *cc = CPU_CORE(dev);
> -int i;
> -
> -for (i = 0; i < cc->nr_threads; i++) {
> -spapr_unrealize_vcpu(sc->threads[i]);
> -}
> -g_free(sc->threads);
> -}
> -
>  static bool slb_shadow_needed(void *opaque)
>  {
>  sPAPRCPUState *spapr_cpu = opaque;
> @@ -207,10 +187,34 @@ static const VMStateDescription vmstate_spapr_cpu_state 
> = {
>  }
>  };
>  
> +static void spapr_unrealize_vcpu(PowerPCCPU *cpu, sPAPRCPUCore *sc)
> +{
> +if (!sc->pre_3_0_migration) {
> +vmstate_unregister(NULL, &vmstate_spapr_cpu_state, cpu->machine_data);
> +}
> +qemu_unregister_reset(spapr_cpu_reset, cpu);
> +object_unparent(cpu->intc);
> +cpu_remove_sync(CPU(cpu));
> +object_unparent(OBJECT(cpu));
> +}
> +
> +static void spapr_cpu_core_unrealize(DeviceState *dev, Error **errp)
> +{
> +sPAPRCPUCore *sc = SPAPR_CPU_CORE(OBJECT(dev));
> +CPUCore *cc = CPU_CORE(dev);
> +int i;
> +
> +for (i = 0; i < cc->nr_threads; i++) {
> +spapr_unrealize_vcpu(sc->threads[i], sc);
> +}
> +g_free(sc->threads);
> +}
> +
>  static void spapr_realize_vcpu(PowerPCCPU *cpu, sPAPRMachineState *spapr,
> -   Error **errp)
> +   sPAPRCPUCore *sc, Error **errp)
>  {
>  CPUPPCState *env = &cpu->env;
> +CPUState *cs = CPU(cpu);
>  Error *local_err = NULL;
>  
>  object_property_set_bool(OBJECT(cpu), true, "realized", &local_err);
> @@ -233,6 +237,11 @@ static void spapr_realize_vcpu(PowerPCCPU *cpu, 
> sPAPRMachineState *spapr,
>  goto error_unregister;
>  }
>  
> +if (!sc->pre_3_0_migration) {
> +vmstate_register(NULL, cs->cpu_index, &vmstate_spapr_cpu_state,
> + cpu->machine_data);
> +}
> +
>  return;
>  
>  error_unregister:
> @@ -272,10 +281,6 @@ static PowerPCCPU *spapr_create_vcpu(sPAPRCPUCore *sc, 
> int i, Error **errp)
>  }
>  
>  cpu->machine_data = g_new0(sPAPRCPUState, 1);
> -if (!sc->pre_3_0_migration) {
> -vmstate_register(NULL, cs->cpu_index, &vmstate_spapr_cpu_state,
> - cpu->machine_data);
> -}
>  
>  object_unref(obj);
>  return cpu;
> @@ -290,9 +295,6 @@ static void spapr_delete_vcpu(PowerPCCPU *cpu, 
> sPAPRCPUCore *sc)
>  {
>  sPAPRCPUState *spapr_cpu = spapr_cpu_state(cpu);
>  
> -if (!sc->pre_3_0_migration) {
> -vmstate_unregister(NULL, &vmstate_spapr_cpu_state, cpu->machine_data);
> -}
>  cpu->machine_data = NULL;
>  g_free(spapr_cpu);
>  object_unparent(OBJECT(cpu));
> @@ -325,7 +327,7 @@ static void spapr_cpu_core_realize(DeviceState *dev, 
> Error **errp)
>  }
>  
>  for (j = 0; j < cc->nr_threads; j++) {
> -spapr_realize_vcpu(sc->threads[j], spapr, &local_err);
> +spapr_realize_vcpu(sc->threads[j], spapr, sc, &local_err);
>  if (local_err) {
>  goto err_unrealize;
>  }
> @@ -334,7 +336,7 @@ static void spapr_cpu_core_realize(DeviceState *dev, 
> Error **errp)
>  
>  err_unrealize:
>  while (--j >= 0) {
> -   
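
The pairing described in the commit message above (register in realize, unregister in unrealize) can be sketched with a toy registry. Everything below is hypothetical: the toy_* names and the fixed-size array are illustration only and do not reflect the real vmstate_register()/vmstate_unregister() implementation.

```c
#include <assert.h>

#define TOY_MAX_SECTIONS 16

/* Hypothetical miniature of the savevm section list (not the QEMU API). */
static const void *toy_sections[TOY_MAX_SECTIONS];
static int toy_nr_sections;

static inline void toy_vmstate_register(const void *opaque)
{
    assert(toy_nr_sections < TOY_MAX_SECTIONS);
    toy_sections[toy_nr_sections++] = opaque;
}

static inline void toy_vmstate_unregister(const void *opaque)
{
    for (int i = 0; i < toy_nr_sections; i++) {
        if (toy_sections[i] == opaque) {
            /* Order does not matter here, so fill the hole with the last entry. */
            toy_sections[i] = toy_sections[--toy_nr_sections];
            return;
        }
    }
}

struct toy_vcpu {
    int machine_data;   /* stands in for cpu->machine_data */
};

/* Registration lives in realize ... */
static inline void toy_realize_vcpu(struct toy_vcpu *cpu)
{
    toy_vmstate_register(&cpu->machine_data);
}

/* ... and the matching unregistration in unrealize, so unplug always undoes it. */
static inline void toy_unrealize_vcpu(struct toy_vcpu *cpu)
{
    toy_vmstate_unregister(&cpu->machine_data);
}

/* Plug and unplug one vCPU; returns the number of stale sections left behind. */
static inline int toy_hotplug_cycle(void)
{
    struct toy_vcpu cpu = { 0 };

    toy_realize_vcpu(&cpu);
    toy_unrealize_vcpu(&cpu);
    return toy_nr_sections;
}
```

Because both calls sit in the realize/unrealize pair, a plug followed by an unplug leaves no stale section behind, which is the property the reported migration failure was missing.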

Re: [Qemu-devel] [PATCH v3 0/4] Balloon inhibit enhancements, vfio restriction

2018-08-08 Thread Alex Williamson
On Wed, 8 Aug 2018 11:45:43 +0800
Peter Xu  wrote:

> On Wed, Aug 08, 2018 at 12:58:32AM +0300, Michael S. Tsirkin wrote:
> > At least with VTD, it seems entirely possible to change e.g. a PMD
> > atomically to point to a different set of PTEs, then flush.
> > That will allow removing memory at high granularity for
> > an arbitrary device without mdev or PASID dependency.  
> 
> My understanding is that the guest driver should prohibit this kind of
> operation (say, modifying PMD).

There's currently no need for this sort of operation within the dma api
and the iommu api doesn't offer it either.

> Actually I don't see how it can
> happen in Linux if the kernel drivers always call the IOMMU API since
> there are only map/unmap APIs rather than this atomic-modify API.

Exactly, the vfio dma mapping api is just an extension of the iommu api
and there's only map and unmap.  Furthermore, unmap can currently return
more than requested if the original mapping made use of superpages in
the iommu, so the only way to achieve page level granularity is to make
only page size mappings.  Otherwise we're talking about new apis
across the board.
 
> The thing is that IMHO it's the guest driver's responsibility to make
> sure the pages will never be used by the device before it removes the
> entry (including modifying the PMD since that actually removes all the
> entries on the old PMD).  If not, I would see it a guest kernel bug
> instead of the bug in the emulation code.

This is why there is no atomic modify in the dma api, we have drivers
that directly manage the buffers for a device and know when it's in use
and when it's not.  There's never a need, currently, to replace the iova
mapping for a single page within a larger buffer.  Maybe the dma api
could also find use for it, but it seems more unique to the iommu api
that we have a "buffer", which happens to be a contiguous RAM region
for the VM, where we do want to change the mapping of a single page.
That single page might currently be mapped by a 2MB or 1GB page in the
case of Intel, or by an arbitrary page size in the case of AMD.  vfio
is the driver managing these mappings, but versus the dma api, we don't
have any insight to the device behavior, including inflight dma.  We can
stop all dma for the device, but not without interfering and potentially
breaking the behavior of the device.

So again, I think this comes down to new iommu driver support and new
iommu apis and new vfio apis to enable some sort of atomic update
interface, or sacrificing performance and adding bloat by forcing page
size mappings.  Thanks,

Alex



[Qemu-devel] [PATCH v4 4/5] qcow2: Set the default cache-clean-interval to 10 minutes

2018-08-08 Thread Leonid Bloch
The default cache-clean-interval is set to 10 minutes, in order to lower
the overhead of the qcow2 caches (previously the default was 0, i.e.
disabled).

Signed-off-by: Leonid Bloch 
---
 block/qcow2.c| 2 +-
 block/qcow2.h| 1 +
 docs/qcow2-cache.txt | 4 ++--
 qapi/block-core.json | 3 ++-
 qemu-options.hx  | 2 +-
 5 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index 3f75b6e701..15d849d1f0 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -941,7 +941,7 @@ static int qcow2_update_options_prepare(BlockDriverState 
*bs,
 /* New interval for cache cleanup timer */
 r->cache_clean_interval =
 qemu_opt_get_number(opts, QCOW2_OPT_CACHE_CLEAN_INTERVAL,
-s->cache_clean_interval);
+DEFAULT_CACHE_CLEAN_INTERVAL);
 #ifndef CONFIG_LINUX
 if (r->cache_clean_interval != 0) {
 error_setg(errp, QCOW2_OPT_CACHE_CLEAN_INTERVAL
diff --git a/block/qcow2.h b/block/qcow2.h
index d77a31d932..587b053453 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -77,6 +77,7 @@
 
 #define DEFAULT_CLUSTER_SIZE 65536
 
+#define DEFAULT_CACHE_CLEAN_INTERVAL 600  /* seconds */
 
 #define QCOW2_OPT_LAZY_REFCOUNTS "lazy-refcounts"
 #define QCOW2_OPT_DISCARD_REQUEST "pass-discard-request"
diff --git a/docs/qcow2-cache.txt b/docs/qcow2-cache.txt
index c7625cdeb3..9926f83ada 100644
--- a/docs/qcow2-cache.txt
+++ b/docs/qcow2-cache.txt
@@ -202,8 +202,8 @@ This example removes all unused cache entries every 15 
minutes:
 
-drive file=hd.qcow2,cache-clean-interval=900
 
-If unset, the default value for this parameter is 0 and it disables
-this feature.
+If unset, the default value for this parameter is 600. Setting it to 0
+disables this feature.
 
 Note that this functionality currently relies on the MADV_DONTNEED
 argument for madvise() to actually free the memory. This is a
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 5b9084a394..7c6115096a 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -2830,7 +2830,8 @@
 #
 # @cache-clean-interval:  clean unused entries in the L2 and refcount
 # caches. The interval is in seconds. The default value
-# is 0 and it disables this feature (since 2.5)
+# is 600. Setting 0 disables this feature. (since 2.5)
+#
 # @encrypt:   Image decryption options. Mandatory for
 # encrypted images, except when doing a metadata-only
 # probe of the image. (since 2.10)
diff --git a/qemu-options.hx b/qemu-options.hx
index d6e15b2f06..8cebb0c77d 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -767,7 +767,7 @@ it which is not used for the L2 cache)
 
 @item cache-clean-interval
 Clean unused entries in the L2 and refcount caches. The interval is in seconds.
-The default value is 0 and it disables this feature.
+The default value is 600. Setting it to 0 disables this feature.
 
 @item pass-discard-request
 Whether discard requests to the qcow2 device should be forwarded to the data
-- 
2.17.1




[Qemu-devel] [PATCH v4 5/5] qcow2: Explicit number replaced by a constant

2018-08-08 Thread Leonid Bloch
Signed-off-by: Leonid Bloch 
---
 block/qcow2.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index 15d849d1f0..0d9d20e46b 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -1321,7 +1321,7 @@ static int coroutine_fn qcow2_do_open(BlockDriverState 
*bs, QDict *options,
 /* 2^(s->refcount_order - 3) is the refcount width in bytes */
 s->refcount_block_bits = s->cluster_bits - (s->refcount_order - 3);
 s->refcount_block_size = 1 << s->refcount_block_bits;
-bs->total_sectors = header.size / 512;
+bs->total_sectors = header.size / BDRV_SECTOR_SIZE;
 s->csize_shift = (62 - (s->cluster_bits - 8));
 s->csize_mask = (1 << (s->cluster_bits - 8)) - 1;
 s->cluster_offset_mask = (1LL << s->csize_shift) - 1;
@@ -3447,7 +3447,7 @@ static int coroutine_fn 
qcow2_co_truncate(BlockDriverState *bs, int64_t offset,
 goto fail;
 }
 
-old_length = bs->total_sectors * 512;
+old_length = bs->total_sectors * BDRV_SECTOR_SIZE;
 new_l1_size = size_to_l1(s, offset);
 
 if (offset < old_length) {
-- 
2.17.1




[Qemu-devel] [PATCH v4 2/5] qcow2: Assign the L2 cache relatively to image size

2018-08-08 Thread Leonid Bloch
Sufficient L2 cache can noticeably improve the performance when using
large images with frequent I/O. The memory overhead is not significant
in most cases, as the cache size is only 1 MB for each 8 GB of virtual
image size (with the default cluster size of 64 KB).

Previously, the L2 cache was allocated without considering the image
size, and an option existed to manually determine this size. Thus to
achieve full coverage of the image by the L2 cache (i.e. use more than
the default value of MAX(1 MB, 8 clusters)), a user needed to calculate
the required size manually or using a script, and pass this value to
the 'l2-cache-size' option.

Now, the L2 cache is assigned taking the actual image size into account,
and will cover the entire image, unless the size needed for that is
larger than a certain maximum. This maximum is set to 32 MB by default
(enough to cover a 256 GB image using the default cluster size) but can
be increased or decreased using the 'l2-cache-size' option. This option
was previously documented as the *maximum* L2 cache size, and this patch
makes it behave as such, instead of as a constant size. Also, the
existing option 'cache-size' can limit the sum of both L2 and refcount
caches, as previously.

Signed-off-by: Leonid Bloch 
---
 block/qcow2.c  | 33 +
 block/qcow2.h  |  4 +---
 docs/qcow2-cache.txt   | 24 ++--
 qemu-options.hx|  6 +++---
 tests/qemu-iotests/137 |  1 -
 tests/qemu-iotests/137.out |  1 -
 6 files changed, 31 insertions(+), 38 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index ec9e6238a0..98cb96aaca 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -777,29 +777,35 @@ static void read_cache_sizes(BlockDriverState *bs, 
QemuOpts *opts,
  uint64_t *refcount_cache_size, Error **errp)
 {
 BDRVQcow2State *s = bs->opaque;
-uint64_t combined_cache_size;
+uint64_t combined_cache_size, l2_cache_max_setting;
 bool l2_cache_size_set, refcount_cache_size_set, combined_cache_size_set;
-int min_refcount_cache = MIN_REFCOUNT_CACHE_SIZE * s->cluster_size;
+uint64_t min_refcount_cache = MIN_REFCOUNT_CACHE_SIZE * s->cluster_size;
 
 combined_cache_size_set = qemu_opt_get(opts, QCOW2_OPT_CACHE_SIZE);
 l2_cache_size_set = qemu_opt_get(opts, QCOW2_OPT_L2_CACHE_SIZE);
 refcount_cache_size_set = qemu_opt_get(opts, 
QCOW2_OPT_REFCOUNT_CACHE_SIZE);
 
 combined_cache_size = qemu_opt_get_size(opts, QCOW2_OPT_CACHE_SIZE, 0);
-*l2_cache_size = qemu_opt_get_size(opts, QCOW2_OPT_L2_CACHE_SIZE, 0);
+l2_cache_max_setting = qemu_opt_get_size(opts, QCOW2_OPT_L2_CACHE_SIZE,
+ DEFAULT_L2_CACHE_MAX_SIZE);
 *refcount_cache_size = qemu_opt_get_size(opts,
  QCOW2_OPT_REFCOUNT_CACHE_SIZE, 0);
 
 *l2_cache_entry_size = qemu_opt_get_size(
 opts, QCOW2_OPT_L2_CACHE_ENTRY_SIZE, s->cluster_size);
 
+uint64_t virtual_disk_size = bs->total_sectors * BDRV_SECTOR_SIZE;
+uint64_t max_l2_cache = virtual_disk_size / (s->cluster_size / 8);
+*l2_cache_size = MIN(max_l2_cache, l2_cache_max_setting);
+
 if (combined_cache_size_set) {
 if (l2_cache_size_set && refcount_cache_size_set) {
 error_setg(errp, QCOW2_OPT_CACHE_SIZE ", " QCOW2_OPT_L2_CACHE_SIZE
" and " QCOW2_OPT_REFCOUNT_CACHE_SIZE " may not be set "
"at the same time");
 return;
-} else if (*l2_cache_size > combined_cache_size) {
+} else if (l2_cache_size_set &&
+   (l2_cache_max_setting > combined_cache_size)) {
 error_setg(errp, QCOW2_OPT_L2_CACHE_SIZE " may not exceed "
QCOW2_OPT_CACHE_SIZE);
 return;
@@ -814,29 +820,16 @@ static void read_cache_sizes(BlockDriverState *bs, 
QemuOpts *opts,
 } else if (refcount_cache_size_set) {
 *l2_cache_size = combined_cache_size - *refcount_cache_size;
 } else {
-uint64_t virtual_disk_size = bs->total_sectors * BDRV_SECTOR_SIZE;
-uint64_t max_l2_cache = virtual_disk_size / (s->cluster_size / 8);
-
 /* Assign as much memory as possible to the L2 cache, and
  * use the remainder for the refcount cache */
-if (combined_cache_size >= max_l2_cache + min_refcount_cache) {
-*l2_cache_size = max_l2_cache;
+if (combined_cache_size >= *l2_cache_size + min_refcount_cache) {
 *refcount_cache_size = combined_cache_size - *l2_cache_size;
 } else {
-*refcount_cache_size =
-MIN(combined_cache_size, min_refcount_cache);
+*refcount_cache_size = MIN(combined_cache_size,
+   min_refcount_cache);
 *l2_cache_size = combined_cache_size - 
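
The sizing rule from the commit message above can be checked with a short sketch. The formula (virtual size divided by cluster_size / 8, capped by the maximum) is taken from the patch; the function names and the MiB/GiB macros are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define MiB (1024ULL * 1024)
#define GiB (1024ULL * MiB)
#define L2_CACHE_MAX_DEFAULT (32 * MiB)   /* default cap named in the commit message */

/* Cache size needed to cover the whole image: each 8-byte L2 entry maps one cluster. */
static inline uint64_t l2_cache_full_coverage(uint64_t virtual_size,
                                              uint64_t cluster_size)
{
    assert(cluster_size >= 8);
    return virtual_size / (cluster_size / 8);
}

/* Cover the entire image unless that would exceed the configured maximum. */
static inline uint64_t l2_cache_size(uint64_t virtual_size,
                                     uint64_t cluster_size,
                                     uint64_t max_setting)
{
    uint64_t need = l2_cache_full_coverage(virtual_size, cluster_size);

    return need < max_setting ? need : max_setting;
}
```

With 64 KB clusters this gives 1 MB of cache per 8 GB of image, and the 32 MB default cap is reached at 256 GB, matching the figures quoted in the commit message.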

[Qemu-devel] [PATCH v4 3/5] qcow2: Resize the cache upon image resizing

2018-08-08 Thread Leonid Bloch
The caches are now recalculated upon image resizing. This is done
because the new default behavior of sizing the L2 cache relative to
the image size implies that the cache should be adapted accordingly
after an image resize.

Signed-off-by: Leonid Bloch 
---
 block/qcow2.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/block/qcow2.c b/block/qcow2.c
index 98cb96aaca..3f75b6e701 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -3415,6 +3415,7 @@ static int coroutine_fn 
qcow2_co_truncate(BlockDriverState *bs, int64_t offset,
 uint64_t old_length;
 int64_t new_l1_size;
 int ret;
+QDict *options;
 
 if (prealloc != PREALLOC_MODE_OFF && prealloc != PREALLOC_MODE_METADATA &&
 prealloc != PREALLOC_MODE_FALLOC && prealloc != PREALLOC_MODE_FULL)
@@ -3639,6 +3640,8 @@ static int coroutine_fn 
qcow2_co_truncate(BlockDriverState *bs, int64_t offset,
 }
 }
 
+bs->total_sectors = offset / BDRV_SECTOR_SIZE;
+
 /* write updated header.size */
 offset = cpu_to_be64(offset);
 ret = bdrv_pwrite_sync(bs->file, offsetof(QCowHeader, size),
@@ -3649,6 +3652,13 @@ static int coroutine_fn 
qcow2_co_truncate(BlockDriverState *bs, int64_t offset,
 }
 
 s->l1_vm_state_index = new_l1_size;
+
+/* Update cache sizes */
+options = qdict_clone_shallow(bs->options);
+ret = qcow2_update_options(bs, options, s->flags, errp);
+if (ret < 0) {
+goto fail;
+}
 ret = 0;
 fail:
 qemu_co_mutex_unlock(&s->lock);
-- 
2.17.1




[Qemu-devel] [PATCH v4 1/5] qcow2: Options' documentation fixes

2018-08-08 Thread Leonid Bloch
Signed-off-by: Leonid Bloch 
---
 docs/qcow2-cache.txt | 3 +++
 qemu-options.hx  | 9 ++---
 2 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/docs/qcow2-cache.txt b/docs/qcow2-cache.txt
index 8a09a5cc5f..5bf2a8ad29 100644
--- a/docs/qcow2-cache.txt
+++ b/docs/qcow2-cache.txt
@@ -130,6 +130,9 @@ There are a few things that need to be taken into account:
memory as possible to the L2 cache before increasing the refcount
cache size.
 
+ - At most two of "l2-cache-size", "refcount-cache-size", and "cache-size"
+   can be set simultaneously.
+
 Unlike L2 tables, refcount blocks are not used during normal I/O but
 only during allocations and internal snapshots. In most cases they are
 accessed sequentially (even during random guest I/O) so increasing the
diff --git a/qemu-options.hx b/qemu-options.hx
index b1bf0f485f..f6804758d3 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -752,15 +752,18 @@ image file)
 
 @item cache-size
 The maximum total size of the L2 table and refcount block caches in bytes
-(default: 1048576 bytes or 8 clusters, whichever is larger)
+(default: the sum of l2-cache-size and refcount-cache-size)
 
 @item l2-cache-size
 The maximum size of the L2 table cache in bytes
-(default: 4/5 of the total cache size)
+(default: if cache-size is not defined - 1048576 bytes or 8 clusters, whichever
+is larger; otherwise, as large as possible or needed within the cache-size,
+while permitting the requested or the minimal refcount cache size)
 
 @item refcount-cache-size
 The maximum size of the refcount block cache in bytes
-(default: 1/5 of the total cache size)
+(default: 4 times the cluster size; or if cache-size is specified, the part of
+it which is not used for the L2 cache)
 
 @item cache-clean-interval
 Clean unused entries in the L2 and refcount caches. The interval is in seconds.
-- 
2.17.1




[Qemu-devel] [PATCH v4 0/5] qcow2: Take the image size into account when allocating the L2 cache

2018-08-08 Thread Leonid Bloch
This series makes the qcow2 L2 cache assignment aware of the image size,
with the intention for it to cover the entire image. The importance of
this change is in noticeable performance improvement, especially with
heavy random I/O. The memory overhead is not big in most cases, as only
1 MB of cache for every 8 GB of image size is used. For cases with very
large images and/or small cluster sizes, or systems with limited RAM
resources, there is an upper limit on the default L2 cache: 32 MB. To
modify this limit one can use the already existing 'l2-cache-size' and
'cache-size' options. Moreover, this fixes the behavior of 'l2-cache-size',
as it was documented as the *maximum* L2 cache size, but in practice
behaved as the absolute size.

To compensate for the memory overhead that this behavior may add, the
default cache-clean-interval is now 10 minutes (it was previously
disabled).

The L2 cache is also resized accordingly, by default, if the image is
resized.

Additionally, a few minor changes are made (refactoring and documentation
fixes).

Differences from v1:
* .gitignore modification patch removed (unneeded).
* The grammar fix in conflicting cache sizing patch removed (merged).
* The update total_sectors when resizing patch squashed with the
  resizing patch.
* L2 cache is now capped at 32 MB.
* The default cache-clean-interval is set to 30 seconds.

Differences from v2:
* Made it clear in the documentation that setting cache-clean-interval
  to 0 disables this feature.

Differences from v3:
* The default cache-clean-interval is set to 10 minutes instead of 30
  seconds before.
* Commit message changes to better explain the patches.
* Some refactoring.

Leonid Bloch (5):
  qcow2: Options' documentation fixes
  qcow2: Assign the L2 cache relatively to image size
  qcow2: Resize the cache upon image resizing
  qcow2: Set the default cache-clean-interval to 10 minutes
  qcow2: Explicit number replaced by a constant

 block/qcow2.c  | 49 --
 block/qcow2.h  |  5 ++--
 docs/qcow2-cache.txt   | 31 ++--
 qapi/block-core.json   |  3 ++-
 qemu-options.hx| 11 +
 tests/qemu-iotests/137 |  1 -
 tests/qemu-iotests/137.out |  1 -
 7 files changed, 56 insertions(+), 45 deletions(-)

-- 
2.17.1




Re: [Qemu-devel] [PATCH v4 3/6] loader: extract rom_free() function

2018-08-08 Thread Alistair Francis
On Fri, Aug 3, 2018 at 7:47 AM, Stefan Hajnoczi  wrote:
> The next patch will need to free a rom.  There is already code to do
> this in rom_add_file().
>
> Note that rom_add_file() uses:
>
>   rom = g_malloc0(sizeof(*rom));
>   ...
>   if (rom->fw_dir) {
>   g_free(rom->fw_dir);
>   g_free(rom->fw_file);
>   }
>
> The conditional is unnecessary since g_free(NULL) is a no-op.
>
> Signed-off-by: Stefan Hajnoczi 

Reviewed-by: Alistair Francis 

Alistair

> ---
>  hw/core/loader.c | 21 -
>  1 file changed, 12 insertions(+), 9 deletions(-)
>
> diff --git a/hw/core/loader.c b/hw/core/loader.c
> index bbb6e65bb5..0c72e7c05a 100644
> --- a/hw/core/loader.c
> +++ b/hw/core/loader.c
> @@ -847,6 +847,17 @@ struct Rom {
>  static FWCfgState *fw_cfg;
>  static QTAILQ_HEAD(, Rom) roms = QTAILQ_HEAD_INITIALIZER(roms);
>
> +/* rom->data must be heap-allocated (do not use with rom_add_elf_program()) 
> */
> +static void rom_free(Rom *rom)
> +{
> +g_free(rom->data);
> +g_free(rom->path);
> +g_free(rom->name);
> +g_free(rom->fw_dir);
> +g_free(rom->fw_file);
> +g_free(rom);
> +}
> +
>  static inline bool rom_order_compare(Rom *rom, Rom *item)
>  {
>  return ((uintptr_t)(void *)rom->as > (uintptr_t)(void *)item->as) ||
> @@ -995,15 +1006,7 @@ err:
>  if (fd != -1)
>  close(fd);
>
> -g_free(rom->data);
> -g_free(rom->path);
> -g_free(rom->name);
> -if (fw_dir) {
> -g_free(rom->fw_dir);
> -g_free(rom->fw_file);
> -}
> -g_free(rom);
> -
> +rom_free(rom);
>  return -1;
>  }
>
> --
> 2.17.1
>
>
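
The point about dropping the conditional can be sketched with the standard C allocator, where free(NULL) is likewise defined to do nothing. The struct and function names below are hypothetical stand-ins, not the QEMU Rom type or the GLib calls used in the patch.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical, trimmed-down stand-in for struct Rom. */
struct toy_rom {
    char *name;
    char *fw_dir;    /* may legitimately stay NULL */
    char *fw_file;   /* may legitimately stay NULL */
    unsigned char *data;
};

static inline struct toy_rom *toy_rom_new(const char *name)
{
    struct toy_rom *rom = calloc(1, sizeof(*rom));
    size_t len = strlen(name) + 1;

    if (!rom) {
        return NULL;
    }
    rom->name = malloc(len);
    if (!rom->name) {
        free(rom);
        return NULL;
    }
    memcpy(rom->name, name, len);
    return rom;
}

/* No NULL checks needed: free(NULL) is a no-op, as is g_free(NULL). */
static inline void toy_rom_free(struct toy_rom *rom)
{
    free(rom->data);
    free(rom->name);
    free(rom->fw_dir);
    free(rom->fw_file);
    free(rom);
}
```

A rom created this way has no fw_dir, fw_file or data, and toy_rom_free() still releases it safely without any special casing.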



Re: [Qemu-devel] [PATCH v4 4/6] loader: add rom transaction API

2018-08-08 Thread Alistair Francis
On Fri, Aug 3, 2018 at 7:47 AM, Stefan Hajnoczi  wrote:
> Image file loaders may add a series of roms.  If an error occurs partway
> through loading there is no easy way to drop previously added roms.
>
> This patch adds a transaction mechanism that works like this:
>
>   rom_transaction_begin();
>   ...call rom_add_*()...
>   rom_transaction_end(ok);
>
> If ok is false then roms added in this transaction are dropped.
>
> Signed-off-by: Stefan Hajnoczi 
> ---
>  include/hw/loader.h | 19 +++
>  hw/core/loader.c| 32 
>  2 files changed, 51 insertions(+)
>
> diff --git a/include/hw/loader.h b/include/hw/loader.h
> index e98b84b8f9..5235f119a3 100644
> --- a/include/hw/loader.h
> +++ b/include/hw/loader.h
> @@ -225,6 +225,25 @@ int rom_check_and_register_reset(void);
>  void rom_set_fw(FWCfgState *f);
>  void rom_set_order_override(int order);
>  void rom_reset_order_override(void);
> +
> +/**
> + * rom_transaction_begin:
> + *
> + * Call this before a series of rom_add_*() calls.  Call
> + * rom_transaction_end() afterwards to commit or abort.  These functions are
> + * useful for undoing a series of rom_add_*() calls if image file loading 
> fails
> + * partway through.
> + */
> +void rom_transaction_begin(void);
> +
> +/**
> + * rom_transaction_end:
> + * @commit: true to commit added roms, false to drop added roms
> + *
> + * Call this after a series of rom_add_*() calls.  See 
> rom_transaction_begin().
> + */
> +void rom_transaction_end(bool commit);
> +
>  int rom_copy(uint8_t *dest, hwaddr addr, size_t size);
>  void *rom_ptr(hwaddr addr, size_t size);
>  void hmp_info_roms(Monitor *mon, const QDict *qdict);
> diff --git a/hw/core/loader.c b/hw/core/loader.c
> index 0c72e7c05a..612420b870 100644
> --- a/hw/core/loader.c
> +++ b/hw/core/loader.c
> @@ -840,6 +840,8 @@ struct Rom {
>  char *fw_dir;
>  char *fw_file;
>
> +bool committed;
> +
>  hwaddr addr;
>  QTAILQ_ENTRY(Rom) next;
>  };
> @@ -877,6 +879,8 @@ static void rom_insert(Rom *rom)
>  rom->as = _space_memory;
>  }
>
> +rom->committed = false;
> +
>  /* List is ordered by load address in the same address space */
>  QTAILQ_FOREACH(item, &roms, next) {
>  if (rom_order_compare(rom, item)) {
> @@ -1168,6 +1172,34 @@ void rom_reset_order_override(void)
>  fw_cfg_reset_order_override(fw_cfg);
>  }
>
> +void rom_transaction_begin(void)
> +{
> +Rom *rom;
> +
> +/* Ignore ROMs added without the transaction API */
> +QTAILQ_FOREACH(rom, &roms, next) {
> +rom->committed = true;

My only thought is that maybe this should produce a warning or error
if a ROM isn't committed.

Alistair

> +}
> +}
> +
> +void rom_transaction_end(bool commit)
> +{
> +Rom *rom;
> +Rom *tmp;
> +
> +QTAILQ_FOREACH_SAFE(rom, &roms, next, tmp) {
> +if (rom->committed) {
> +continue;
> +}
> +if (commit) {
> +rom->committed = true;
> +} else {
> +QTAILQ_REMOVE(&roms, rom, next);
> +rom_free(rom);
> +}
> +}
> +}
> +
>  static Rom *find_rom(hwaddr addr, size_t size)
>  {
>  Rom *rom;
> --
> 2.17.1
>
>
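
The begin/end mechanism in the patch above can be sketched on a plain singly linked list. This is a simplified model under stated assumptions: the toy_* names are hypothetical, the list is unordered, and QEMU's QTAILQ macros and Rom fields are not used.

```c
#include <stdbool.h>
#include <stdlib.h>

struct toy_rom {
    int id;
    bool committed;
    struct toy_rom *next;
};

static struct toy_rom *toy_roms;

/* New roms start out uncommitted, as in rom_insert(). */
static inline bool toy_rom_add(int id)
{
    struct toy_rom *rom = calloc(1, sizeof(*rom));

    if (!rom) {
        return false;
    }
    rom->id = id;
    rom->committed = false;
    rom->next = toy_roms;
    toy_roms = rom;
    return true;
}

/* Everything already on the list is treated as committed. */
static inline void toy_transaction_begin(void)
{
    for (struct toy_rom *rom = toy_roms; rom; rom = rom->next) {
        rom->committed = true;
    }
}

/* Commit the roms added since begin, or unlink and free them. */
static inline void toy_transaction_end(bool commit)
{
    struct toy_rom **link = &toy_roms;

    while (*link) {
        struct toy_rom *rom = *link;

        if (!rom->committed && !commit) {
            *link = rom->next;
            free(rom);
            continue;
        }
        rom->committed = true;
        link = &rom->next;
    }
}

static inline int toy_rom_count(void)
{
    int n = 0;

    for (struct toy_rom *rom = toy_roms; rom; rom = rom->next) {
        n++;
    }
    return n;
}
```

A loader would call toy_transaction_begin(), add its roms, and then pass its success flag to toy_transaction_end(); on failure only the roms added inside the transaction disappear, while earlier ones stay on the list.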



[Qemu-devel] [PATCH v2 3/4] tests/boot-serial-test: Add microbit board testcase

2018-08-08 Thread Julia Suvorova via Qemu-devel
New mini-kernel test for nRF51 SoC UART.

Signed-off-by: Julia Suvorova 
---
 tests/boot-serial-test.c | 19 +++
 1 file changed, 19 insertions(+)

diff --git a/tests/boot-serial-test.c b/tests/boot-serial-test.c
index 952a2e7ead..19714c3f87 100644
--- a/tests/boot-serial-test.c
+++ b/tests/boot-serial-test.c
@@ -62,6 +62,24 @@ static const uint8_t kernel_aarch64[] = {
 0xfd, 0xff, 0xff, 0x17, /* b   -12 (loop) */
 };
 
+static const uint8_t kernel_nrf51[] = {
+0x00, 0x00, 0x00, 0x00, /* Stack top address */
+0x09, 0x00, 0x00, 0x00, /* Reset handler address */
+0x04, 0x4a, /* ldr  r2, [pc, #16] Get ENABLE */
+0x04, 0x21, /* movs r1, #4 */
+0x11, 0x60, /* str  r1, [r2] */
+0x04, 0x4a, /* ldr  r2, [pc, #16] Get STARTTX */
+0x01, 0x21, /* movs r1, #1 */
+0x11, 0x60, /* str  r1, [r2] */
+0x03, 0x4a, /* ldr  r2, [pc, #12] Get TXD */
+0x54, 0x21, /* movs r1, 'T' */
+0x11, 0x60, /* str  r1, [r2] */
+0xfe, 0xe7, /* b. */
+0x00, 0x25, 0x00, 0x40, /* 0x40002500 = UART ENABLE */
+0x08, 0x20, 0x00, 0x40, /* 0x40002008 = UART STARTTX */
+0x1c, 0x25, 0x00, 0x40  /* 0x4000251c = UART TXD */
+};
+
 typedef struct testdef {
 const char *arch;   /* Target architecture */
 const char *machine;/* Name of the machine */
@@ -107,6 +125,7 @@ static testdef_t tests[] = {
 { "hppa", "hppa", "", "SeaBIOS wants SYSTEM HALT" },
 { "aarch64", "virt", "-cpu cortex-a57", "TT", sizeof(kernel_aarch64),
   kernel_aarch64 },
+{ "arm", "microbit", "", "T", sizeof(kernel_nrf51), kernel_nrf51 },
 
 { NULL }
 };
-- 
2.17.1
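
(Illustration only, not part of the patch.)  The PC-relative offsets in
kernel_nrf51 can be checked by hand.  A 16-bit Thumb "LDR Rt, [PC, #imm]"
reads from the instruction address plus 4, rounded down to a multiple of 4,
plus imm.  The sketch below assumes the array is placed at address 0, so that
array offsets equal addresses; the function names are made up for this note.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Address read by a 16-bit Thumb "LDR Rt, [PC, #imm]"
 * (encoding: 01001 Rt imm8, byte offset = imm8 * 4).
 */
uint32_t thumb_ldr_literal_addr(uint32_t insn_addr, uint16_t insn)
{
    uint32_t base = (insn_addr + 4) & ~(uint32_t)3;
    uint32_t offset = (uint32_t)(insn & 0xff) * 4;

    assert((insn & 0xf800) == 0x4800);  /* must be LDR (literal) */
    return base + offset;
}

/* The three loads in kernel_nrf51 sit at offsets 0x08, 0x0e and 0x14;
 * the bytes "0x04, 0x4a" form the little-endian halfword 0x4a04. */
void thumb_ldr_literal_example(void)
{
    assert(thumb_ldr_literal_addr(0x08, 0x4a04) == 0x1c);  /* ENABLE word */
    assert(thumb_ldr_literal_addr(0x0e, 0x4a04) == 0x20);  /* STARTTX word */
    assert(thumb_ldr_literal_addr(0x14, 0x4a03) == 0x24);  /* TXD word */
}
```

The three results match the offsets of the three literal words at the end of
the array, which follow the ten 2-byte instructions starting at offset 0x08.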




[Qemu-devel] [PATCH v2 4/4] tests/microbit-test: Check nRF51 UART functionality

2018-08-08 Thread Julia Suvorova via Qemu-devel
Some functional tests for:
Basic reception/transmission
Suspending
INTEN* registers

Based-on: <20180806100114.21410-6-cont...@steffen-goertz.de>

Signed-off-by: Julia Suvorova 
---
 tests/microbit-test.c | 106 --
 1 file changed, 103 insertions(+), 3 deletions(-)

diff --git a/tests/microbit-test.c b/tests/microbit-test.c
index 08e2210916..8b69d83684 100644
--- a/tests/microbit-test.c
+++ b/tests/microbit-test.c
@@ -17,7 +17,10 @@
 #include "qemu/osdep.h"
 #include "exec/hwaddr.h"
 #include "libqtest.h"
+#include "hw/char/nrf51_uart.h"
 
+#include <sys/socket.h>
+#include <sys/un.h>
 
 #define PAGE_SIZE   1024
 #define FLASH_SIZE  (256 * PAGE_SIZE)
@@ -48,7 +51,6 @@
 #define GPIO_PULLDOWN 1
 #define GPIO_PULLUP 3
 
-
 static void fill_and_erase(hwaddr base, hwaddr size, uint32_t address_reg)
 {
 /* Fill memory */
@@ -204,19 +206,117 @@ static void test_nrf51_gpio(void)
 g_assert_false(true);
 }
 
+static bool wait_for_event(uint32_t event_addr)
+{
+int i;
+
+for (i = 0; i < 1000; i++) {
+if (readl(event_addr) == 1) {
+writel(event_addr, 0x00);
+return true;
+}
+g_usleep(1);
+}
+
+return false;
+}
+
+static void rw_to_rxd(int sock_fd, const char *in, char *out)
+{
+int i;
+
+g_assert(write(sock_fd, in, strlen(in)) == strlen(in));
+for (i = 0; i < strlen(in); i++) {
+g_assert(wait_for_event(UART_BASE + A_UART_RXDRDY));
+out[i] = readl(UART_BASE + A_UART_RXD);
+}
+out[i] = '\0';
+}
+
+static void w_to_txd(const char *in)
+{
+int i;
+
+for (i = 0; i < strlen(in); i++) {
+writel(UART_BASE + A_UART_TXD, in[i]);
+g_assert(wait_for_event(UART_BASE + A_UART_TXDRDY));
+}
+}
+
+static void test_nrf51_uart(const void *data)
+{
+int sock_fd = *((const int *) data);
+char s[10];
+
+g_assert(write(sock_fd, "c", 1) == 1);
+g_assert(readl(UART_BASE + A_UART_RXD) == 0);
+
+writel(UART_BASE + A_UART_ENABLE, 0x04);
+writel(UART_BASE + A_UART_STARTRX, 0x01);
+
+g_assert(wait_for_event(UART_BASE + A_UART_RXDRDY));
+writel(UART_BASE + A_UART_RXDRDY, 0x00);
+g_assert(readl(UART_BASE + A_UART_RXD) == 'c');
+
+writel(UART_BASE + A_UART_INTENSET, 0x04);
+g_assert(readl(UART_BASE + A_UART_INTEN) == 0x04);
+writel(UART_BASE + A_UART_INTENCLR, 0x04);
+g_assert(readl(UART_BASE + A_UART_INTEN) == 0x00);
+
+rw_to_rxd(sock_fd, "hello", s);
+g_assert(strcmp(s, "hello") == 0);
+
+writel(UART_BASE + A_UART_STARTTX, 0x01);
+w_to_txd("d");
+g_assert(read(sock_fd, s, 10) == 1);
+g_assert(s[0] == 'd');
+
+writel(UART_BASE + A_UART_SUSPEND, 0x01);
+writel(UART_BASE + A_UART_TXD, 'h');
+writel(UART_BASE + A_UART_STARTTX, 0x01);
+w_to_txd("world");
+g_assert(read(sock_fd, s, 10) == 5);
+g_assert(strcmp(s, "world") == 0);
+}
+
 int main(int argc, char **argv)
 {
 int ret;
+char serialtmpdir[] = "/tmp/qtest-microbit-serial-sXX";
+char serialtmp[40];
+int sock_fd;
+struct sockaddr_un addr;
+
+g_assert(mkdtemp(serialtmpdir));
+sprintf(serialtmp, "%s/sock", serialtmpdir);
+
+sock_fd = socket(AF_UNIX, SOCK_STREAM, 0);
+g_assert(sock_fd != -1);
+
+memset(&addr, 0, sizeof(struct sockaddr_un));
+
+addr.sun_family = AF_UNIX;
+strncpy(addr.sun_path, serialtmp, sizeof(addr.sun_path) - 1);
 
 g_test_init(&argc, &argv, NULL);
 
-global_qtest = qtest_startf("-machine microbit");
+global_qtest = qtest_startf("-machine microbit "
+"-chardev socket,id=s0,path=%s,server,nowait "
+"-no-shutdown -serial chardev:s0",
+serialtmp);
+
+g_assert(connect(sock_fd, (const struct sockaddr *) &addr,
+ sizeof(struct sockaddr_un)) != -1);
 
 qtest_add_func("/microbit/nrf51/nvmc", test_nrf51_nvmc);
 qtest_add_func("/microbit/nrf51/gpio", test_nrf51_gpio);
-
+qtest_add_data_func("/microbit/nrf51/uart", &sock_fd, test_nrf51_uart);
 ret = g_test_run();
 
 qtest_quit(global_qtest);
+
+close(sock_fd);
+rmdir(serialtmpdir);
+
 return ret;
 }
-- 
2.17.1
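
(Illustration only, not part of the patch.)  The INTEN* check in
test_nrf51_uart() relies on INTENSET and INTENCLR acting as set-bits and
clear-bits views of a single INTEN value, as uart_write() and uart_read() in
patch 1/4 implement them.  A standalone model of that behaviour, with
hypothetical names (IntenReg, IntenModel, inten_write(), inten_read()):

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical register selector; the device itself decodes A_UART_* offsets. */
typedef enum { REG_INTEN, REG_INTENSET, REG_INTENCLR } IntenReg;

typedef struct {
    uint32_t inten;     /* the single backing value */
} IntenModel;

/* INTEN overwrites, INTENSET ORs bits in, INTENCLR clears the written bits. */
void inten_write(IntenModel *m, IntenReg reg, uint32_t value)
{
    switch (reg) {
    case REG_INTEN:
        m->inten = value;
        break;
    case REG_INTENSET:
        m->inten |= value;
        break;
    case REG_INTENCLR:
        m->inten &= ~value;
        break;
    }
}

/* All three registers read back the same backing value. */
uint32_t inten_read(const IntenModel *m, IntenReg reg)
{
    (void)reg;
    return m->inten;
}

/* Same sequence as the INTENSET/INTENCLR part of the test. */
void inten_example(void)
{
    IntenModel m = { 0 };

    inten_write(&m, REG_INTENSET, 0x04);
    assert(inten_read(&m, REG_INTEN) == 0x04);
    inten_write(&m, REG_INTENCLR, 0x04);
    assert(inten_read(&m, REG_INTEN) == 0x00);
}
```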




[Qemu-devel] [PATCH v2 2/4] hw/arm/nrf51_soc: Connect UART to nRF51 SoC

2018-08-08 Thread Julia Suvorova via Qemu-devel
Wire up nRF51 UART in the corresponding SoC using in-place init/realize.

Based-on: <20180803052137.10602-1-j...@jms.id.au>

Signed-off-by: Julia Suvorova 
---
 hw/arm/nrf51_soc.c | 20 
 include/hw/arm/nrf51_soc.h |  3 +++
 2 files changed, 23 insertions(+)

diff --git a/hw/arm/nrf51_soc.c b/hw/arm/nrf51_soc.c
index 9f9649c780..8b5602f363 100644
--- a/hw/arm/nrf51_soc.c
+++ b/hw/arm/nrf51_soc.c
@@ -38,9 +38,12 @@
 #define NRF51822_FLASH_SIZE (256 * 1024)
 #define NRF51822_SRAM_SIZE  (16 * 1024)
 
+#define BASE_TO_IRQ(base) ((base >> 12) & 0x1F)
+
 static void nrf51_soc_realize(DeviceState *dev_soc, Error **errp)
 {
 NRF51State *s = NRF51_SOC(dev_soc);
+MemoryRegion *mr = NULL;
 Error *err = NULL;
 
 if (!s->board_memory) {
@@ -70,6 +73,19 @@ static void nrf51_soc_realize(DeviceState *dev_soc, Error **errp)
 }
 memory_region_add_subregion(&s->container, SRAM_BASE, &s->sram);
 
+/* UART */
+qdev_prop_set_chr(DEVICE(&s->uart), "chardev", serial_hd(0));
+object_property_set_bool(OBJECT(&s->uart), true, "realized", &err);
+if (err) {
+error_propagate(errp, err);
+return;
+}
+mr = sysbus_mmio_get_region(SYS_BUS_DEVICE(&s->uart), 0);
+memory_region_add_subregion_overlap(&s->container, UART_BASE, mr, 0);
+sysbus_connect_irq(SYS_BUS_DEVICE(&s->uart), 0,
+   qdev_get_gpio_in(DEVICE(&s->cpu),
+BASE_TO_IRQ(UART_BASE)));
+
 create_unimplemented_device("nrf51_soc.io", IOMEM_BASE, IOMEM_SIZE);
 create_unimplemented_device("nrf51_soc.ficr", FICR_BASE, FICR_SIZE);
 create_unimplemented_device("nrf51_soc.private", 0xF000, 0x1000);
@@ -86,6 +102,10 @@ static void nrf51_soc_init(Object *obj)
 qdev_set_parent_bus(DEVICE(&s->cpu), sysbus_get_default());
 qdev_prop_set_string(DEVICE(&s->cpu), "cpu-type", ARM_CPU_TYPE_NAME("cortex-m0"));
 qdev_prop_set_uint32(DEVICE(&s->cpu), "num-irq", 32);
+
+object_initialize(&s->uart, sizeof(s->uart), TYPE_NRF51_UART);
+object_property_add_child(obj, "uart", OBJECT(&s->uart), &error_abort);
+qdev_set_parent_bus(DEVICE(&s->uart), sysbus_get_default());
 }
 
 static Property nrf51_soc_properties[] = {
diff --git a/include/hw/arm/nrf51_soc.h b/include/hw/arm/nrf51_soc.h
index e380ec26b8..46a1c1a66c 100644
--- a/include/hw/arm/nrf51_soc.h
+++ b/include/hw/arm/nrf51_soc.h
@@ -13,6 +13,7 @@
 #include "qemu/osdep.h"
 #include "hw/sysbus.h"
 #include "hw/arm/armv7m.h"
+#include "hw/char/nrf51_uart.h"
 
 #define TYPE_NRF51_SOC "nrf51-soc"
 #define NRF51_SOC(obj) \
@@ -25,6 +26,8 @@ typedef struct NRF51State {
 /*< public >*/
 ARMv7MState cpu;
 
+NRF51UARTState uart;
+
 MemoryRegion iomem;
 MemoryRegion sram;
 MemoryRegion flash;
-- 
2.17.1
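
(Illustration only, not part of the patch.)  A worked example of the
BASE_TO_IRQ() mapping: the peripheral ID, and so the IRQ number, is bits
[16:12] of the base address.  The value of UART_BASE is not shown in this
series; 0x40002000 is assumed here because the register addresses in the
boot-serial test (0x40002500 for ENABLE, 0x40002008 for STARTTX) imply it.
The extra parentheses around the macro argument are an addition of this
sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Same mapping as in the patch, with the argument parenthesised. */
#define BASE_TO_IRQ(base) (((base) >> 12) & 0x1F)

/* Assumed value, inferred from the register addresses used in the tests. */
#define EXAMPLE_UART_BASE 0x40002000u

/* 0x40002000 >> 12 is 0x40002; masking with 0x1F leaves 2. */
unsigned example_uart_irq(void)
{
    return BASE_TO_IRQ(EXAMPLE_UART_BASE);
}

void base_to_irq_example(void)
{
    assert(example_uart_irq() == 2);
}
```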




[Qemu-devel] [PATCH v2 1/4] hw/char: Implement nRF51 SoC UART

2018-08-08 Thread Julia Suvorova via Qemu-devel
Not implemented: CTS/NCTS, PSEL*.

Signed-off-by: Julia Suvorova 
---
 hw/char/Makefile.objs|   1 +
 hw/char/nrf51_uart.c | 329 +++
 hw/char/trace-events |   4 +
 include/hw/char/nrf51_uart.h |  78 +
 4 files changed, 412 insertions(+)
 create mode 100644 hw/char/nrf51_uart.c
 create mode 100644 include/hw/char/nrf51_uart.h

diff --git a/hw/char/Makefile.objs b/hw/char/Makefile.objs
index b570531291..c4947d7ae7 100644
--- a/hw/char/Makefile.objs
+++ b/hw/char/Makefile.objs
@@ -1,5 +1,6 @@
 common-obj-$(CONFIG_IPACK) += ipoctal232.o
 common-obj-$(CONFIG_ESCC) += escc.o
+common-obj-$(CONFIG_NRF51_SOC) += nrf51_uart.o
 common-obj-$(CONFIG_PARALLEL) += parallel.o
 common-obj-$(CONFIG_PARALLEL) += parallel-isa.o
 common-obj-$(CONFIG_PL011) += pl011.o
diff --git a/hw/char/nrf51_uart.c b/hw/char/nrf51_uart.c
new file mode 100644
index 00..55404e8f37
--- /dev/null
+++ b/hw/char/nrf51_uart.c
@@ -0,0 +1,329 @@
+/*
+ * nRF51 SoC UART emulation
+ *
+ * See nRF51 Series Reference Manual, "29 Universal Asynchronous
+ * Receiver/Transmitter" for hardware specifications:
+ * http://infocenter.nordicsemi.com/pdf/nRF51_RM_v3.0.pdf
+ *
+ * Copyright (c) 2018 Julia Suvorova 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 or
+ * (at your option) any later version.
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/log.h"
+#include "hw/char/nrf51_uart.h"
+#include "trace.h"
+
+static void nrf51_uart_update_irq(NRF51UARTState *s)
+{
+bool irq = false;
+
+irq |= (s->reg[R_UART_RXDRDY] &&
+(s->reg[R_UART_INTEN] & R_UART_INTEN_RXDRDY_MASK));
+irq |= (s->reg[R_UART_TXDRDY] &&
+(s->reg[R_UART_INTEN] & R_UART_INTEN_TXDRDY_MASK));
+irq |= (s->reg[R_UART_ERROR]  &&
+(s->reg[R_UART_INTEN] & R_UART_INTEN_ERROR_MASK));
+irq |= (s->reg[R_UART_RXTO]   &&
+(s->reg[R_UART_INTEN] & R_UART_INTEN_RXTO_MASK));
+
+qemu_set_irq(s->irq, irq);
+}
+
+static uint64_t uart_read(void *opaque, hwaddr addr, unsigned int size)
+{
+NRF51UARTState *s = NRF51_UART(opaque);
+uint64_t r;
+
+if (!s->enabled) {
+return 0;
+}
+
+switch (addr) {
+case A_UART_RXD:
+r = s->rx_fifo[s->rx_fifo_pos];
+if (s->rx_started && s->rx_fifo_len) {
+qemu_chr_fe_accept_input(&s->chr);
+s->rx_fifo_pos = (s->rx_fifo_pos + 1) % UART_FIFO_LENGTH;
+s->rx_fifo_len--;
+if (s->rx_fifo_len) {
+s->reg[R_UART_RXDRDY] = 1;
+nrf51_uart_update_irq(s);
+}
+}
+break;
+case A_UART_INTENSET:
+case A_UART_INTENCLR:
+case A_UART_INTEN:
+r = s->reg[R_UART_INTEN];
+break;
+default:
+r = s->reg[addr / 4];
+break;
+}
+
+trace_nrf51_uart_read(addr, r, size);
+
+return r;
+}
+
+static gboolean uart_transmit(GIOChannel *chan, GIOCondition cond, void *opaque)
+{
+NRF51UARTState *s = NRF51_UART(opaque);
+int r;
+uint8_t c = s->reg[R_UART_TXD];
+
+s->watch_tag = 0;
+
+r = qemu_chr_fe_write(&s->chr, &c, 1);
+if (r <= 0) {
+s->watch_tag = qemu_chr_fe_add_watch(&s->chr, G_IO_OUT | G_IO_HUP,
+ uart_transmit, s);
+if (!s->watch_tag) {
+/* The hardware has no transmit error reporting,
+ * so silently drop the byte
+ */
+goto buffer_drained;
+}
+return FALSE;
+}
+
+buffer_drained:
+s->reg[R_UART_TXDRDY] = 1;
+s->pending_tx_byte = false;
+return FALSE;
+}
+
+static void uart_cancel_transmit(NRF51UARTState *s)
+{
+if (s->watch_tag) {
+g_source_remove(s->watch_tag);
+s->watch_tag = 0;
+}
+}
+
+static void uart_write(void *opaque, hwaddr addr,
+   uint64_t value, unsigned int size)
+{
+NRF51UARTState *s = NRF51_UART(opaque);
+
+trace_nrf51_uart_write(addr, value, size);
+
+if (!s->enabled && (addr != A_UART_ENABLE)) {
+return;
+}
+
+switch (addr) {
+case A_UART_TXD:
+if (!s->pending_tx_byte && s->tx_started) {
+s->reg[R_UART_TXD] = value;
+s->pending_tx_byte = true;
+uart_transmit(NULL, G_IO_OUT, s);
+}
+break;
+case A_UART_INTEN:
+s->reg[R_UART_INTEN] = value;
+break;
+case A_UART_INTENSET:
+s->reg[R_UART_INTEN] |= value;
+break;
+case A_UART_INTENCLR:
+s->reg[R_UART_INTEN] &= ~value;
+break;
+case A_UART_TXDRDY ... A_UART_RXTO:
+s->reg[addr / 4] = value;
+break;
+case A_UART_ERRORSRC:
+s->reg[addr / 4] &= ~value;
+break;
+case A_UART_RXD:
+break;
+case A_UART_RXDRDY:
+if (value == 0) {
+s->reg[R_UART_RXDRDY] = 0;
+}
+break;
+case 

[Qemu-devel] [PATCH v2 0/4] arm: Add nRF51 SoC UART support

2018-08-08 Thread Julia Suvorova via Qemu-devel
This series adds support for the nRF51 SoC UART, which is used in the
BBC Micro:bit board, and a QTest for it.

v2:
* Suspend/Enable functionality added
* Connection to SoC moved to a separate patch
* Added QTest for checking reception functionality
* Mini-kernel test changed to fit current implementation
* Addressed review comments on R_*, uart_can_receive, VMState,
  uart_transmit

Julia Suvorova (4):
  hw/char: Implement nRF51 SoC UART
  hw/arm/nrf51_soc: Connect UART to nRF51 SoC
  tests/boot-serial-test: Add microbit board testcase
  tests/microbit-test: Check nRF51 UART functionality

 hw/arm/nrf51_soc.c   |  20 +++
 hw/char/Makefile.objs|   1 +
 hw/char/nrf51_uart.c | 329 +++
 hw/char/trace-events |   4 +
 include/hw/arm/nrf51_soc.h   |   3 +
 include/hw/char/nrf51_uart.h |  78 +
 tests/boot-serial-test.c |  19 ++
 tests/microbit-test.c| 106 ++-
 8 files changed, 557 insertions(+), 3 deletions(-)
 create mode 100644 hw/char/nrf51_uart.c
 create mode 100644 include/hw/char/nrf51_uart.h

-- 
2.17.1




Re: [Qemu-devel] [PATCH] fw_cfg: add bootdevice-ignore-suffixes property

2018-08-08 Thread Eduardo Habkost
On Wed, Aug 08, 2018 at 09:19:31PM +0100, Mark Cave-Ayland wrote:
> On 08/08/18 20:53, Eduardo Habkost wrote:
> 
> > On Wed, Aug 08, 2018 at 08:19:51PM +0100, Mark Cave-Ayland wrote:
> > > For the older machines (such as Mac and SPARC) the DT nodes representing
> > > bootdevices for disk nodes are irregular for mainly historical reasons.
> > > 
> > > Since the majority of bootdevice nodes for these machines either do not have a
> > > separate disk node or require different (custom) names then it is much easier
> > > to disable all suffixes for a particular machine by setting the ignore_suffixes
> > > parameter to get_boot_devices_list() to true, and customise the disk nodes as
> > > required.
> > > 
> > > Here we add a new bootdevice-ignore-suffixes property to the FW_CFG device to
> > > allow the generation of disk suffixes to be controlled on a per-machine basis.
> > > 
> > > Signed-off-by: Mark Cave-Ayland 
> > 
> > Reviewed-by: Eduardo Habkost 
> > 
> > But I would prefer to see this merged only after we see machines
> > actually using the property.  Can you send that as a single
> > series later?
> 
> I don't have any more time until tomorrow evening now, but FWIW I've pushed
> my working branch to
> https://github.com/mcayland/qemu/commits/openbios-bootindex3 if you want to
> take a quick look. Example command line:
> 
> $ ./qemu-system-sparc64 -drive
> file=disk.img,if=none,index=0,id=cd,media=cdrom -device
> virtio-blk-pci,bus=pciB,drive=cd,bootindex=0 -m 256 -nographic
> 
> Would you still like me to post this to the list properly tomorrow evening?

It's up to you (I don't know when you think your series will be
ready).  Maybe you'll want to adopt the patch below so you don't
need to inline your fw_cfg_init*() calls anymore?  (also, up to
you)


> > Also, maybe we can do it in a simpler way:
> > 
> > I now see that fw_cfg is not the only user of
> > get_boot_devices_list().  I didn't want to have a fw_cfg-specific
> > field in MachineClass, but we can make it not fw_cfg-specific
> > if we make it affect all get_boot_devices_list() calls.
> > 
> > What do you think of the patch below?
> > 
> > (Patch is untested)
[...]

-- 
Eduardo



[Qemu-devel] [PATCH 1/3] vhost-user-scsi: move host_features into VHostSCSICommon

2018-08-08 Thread Greg Edwards
In preparation for having vhost-scsi also make use of host_features,
move it from struct VHostUserSCSI into struct VHostSCSICommon.

Signed-off-by: Greg Edwards 
---
 hw/scsi/vhost-user-scsi.c | 15 ---
 include/hw/virtio/vhost-scsi-common.h |  1 +
 include/hw/virtio/vhost-user-scsi.h   |  1 -
 3 files changed, 9 insertions(+), 8 deletions(-)

diff --git a/hw/scsi/vhost-user-scsi.c b/hw/scsi/vhost-user-scsi.c
index 9355cfdf07f9..694cb801209a 100644
--- a/hw/scsi/vhost-user-scsi.c
+++ b/hw/scsi/vhost-user-scsi.c
@@ -141,9 +141,10 @@ static uint64_t vhost_user_scsi_get_features(VirtIODevice *vdev,
  uint64_t features, Error **errp)
 {
 VHostUserSCSI *s = VHOST_USER_SCSI(vdev);
+VHostSCSICommon *vsc = VHOST_SCSI_COMMON(s);
 
 /* Turn on predefined features supported by this device */
-features |= s->host_features;
+features |= vsc->host_features;
 
 return vhost_scsi_common_get_features(vdev, features, errp);
 }
@@ -157,12 +158,12 @@ static Property vhost_user_scsi_properties[] = {
 DEFINE_PROP_UINT32("max_sectors", VirtIOSCSICommon, conf.max_sectors,
0x),
 DEFINE_PROP_UINT32("cmd_per_lun", VirtIOSCSICommon, conf.cmd_per_lun, 128),
-DEFINE_PROP_BIT64("hotplug", VHostUserSCSI, host_features,
-VIRTIO_SCSI_F_HOTPLUG,
-true),
-DEFINE_PROP_BIT64("param_change", VHostUserSCSI, host_features,
- VIRTIO_SCSI_F_CHANGE,
- true),
+DEFINE_PROP_BIT64("hotplug", VHostSCSICommon, host_features,
+  VIRTIO_SCSI_F_HOTPLUG,
+  true),
+DEFINE_PROP_BIT64("param_change", VHostSCSICommon, host_features,
+   VIRTIO_SCSI_F_CHANGE,
+   true),
 DEFINE_PROP_END_OF_LIST(),
 };
 
diff --git a/include/hw/virtio/vhost-scsi-common.h b/include/hw/virtio/vhost-scsi-common.h
index 4553be4bc378..57fb1d87b51d 100644
--- a/include/hw/virtio/vhost-scsi-common.h
+++ b/include/hw/virtio/vhost-scsi-common.h
@@ -35,6 +35,7 @@ typedef struct VHostSCSICommon {
 int channel;
 int target;
 int lun;
+uint64_t host_features;
 } VHostSCSICommon;
 
 int vhost_scsi_common_start(VHostSCSICommon *vsc);
diff --git a/include/hw/virtio/vhost-user-scsi.h b/include/hw/virtio/vhost-user-scsi.h
index 3ec34ae867ab..e429cacd8e06 100644
--- a/include/hw/virtio/vhost-user-scsi.h
+++ b/include/hw/virtio/vhost-user-scsi.h
@@ -30,7 +30,6 @@
 
 typedef struct VHostUserSCSI {
 VHostSCSICommon parent_obj;
-uint64_t host_features;
 VhostUserState *vhost_user;
 } VHostUserSCSI;
 
-- 
2.17.1




[Qemu-devel] [PATCH 3/3] vhost-scsi: expose 't10_pi' property for VIRTIO_SCSI_F_T10_PI

2018-08-08 Thread Greg Edwards
Allow toggling on/off the VIRTIO_SCSI_F_T10_PI feature bit for both
vhost-scsi and vhost-user-scsi devices.

Signed-off-by: Greg Edwards 
---
 hw/scsi/vhost-scsi.c  | 3 +++
 hw/scsi/vhost-user-scsi.c | 3 +++
 2 files changed, 6 insertions(+)

diff --git a/hw/scsi/vhost-scsi.c b/hw/scsi/vhost-scsi.c
index 9c1bea8ff327..becf55008553 100644
--- a/hw/scsi/vhost-scsi.c
+++ b/hw/scsi/vhost-scsi.c
@@ -238,6 +238,9 @@ static Property vhost_scsi_properties[] = {
 DEFINE_PROP_UINT32("max_sectors", VirtIOSCSICommon, conf.max_sectors,
0x),
 DEFINE_PROP_UINT32("cmd_per_lun", VirtIOSCSICommon, conf.cmd_per_lun, 128),
+DEFINE_PROP_BIT64("t10_pi", VHostSCSICommon, host_features,
+ VIRTIO_SCSI_F_T10_PI,
+ false),
 DEFINE_PROP_END_OF_LIST(),
 };
 
diff --git a/hw/scsi/vhost-user-scsi.c b/hw/scsi/vhost-user-scsi.c
index 26491daaa0bf..2e1ba4a87bb1 100644
--- a/hw/scsi/vhost-user-scsi.c
+++ b/hw/scsi/vhost-user-scsi.c
@@ -152,6 +152,9 @@ static Property vhost_user_scsi_properties[] = {
 DEFINE_PROP_BIT64("param_change", VHostSCSICommon, host_features,
VIRTIO_SCSI_F_CHANGE,
true),
+DEFINE_PROP_BIT64("t10_pi", VHostSCSICommon, host_features,
+ VIRTIO_SCSI_F_T10_PI,
+ false),
 DEFINE_PROP_END_OF_LIST(),
 };
 
-- 
2.17.1




[Qemu-devel] [PATCH 2/3] vhost-scsi: unify vhost-scsi get_features implementations

2018-08-08 Thread Greg Edwards
Move the enablement of preset host features into the common
vhost_scsi_common_get_features() function.  This is in preparation for
having vhost-scsi also make use of host_features.

Signed-off-by: Greg Edwards 
---
 hw/scsi/vhost-scsi-common.c |  3 +++
 hw/scsi/vhost-user-scsi.c   | 14 +-
 2 files changed, 4 insertions(+), 13 deletions(-)

diff --git a/hw/scsi/vhost-scsi-common.c b/hw/scsi/vhost-scsi-common.c
index e2a5828af137..b7fbab65dd17 100644
--- a/hw/scsi/vhost-scsi-common.c
+++ b/hw/scsi/vhost-scsi-common.c
@@ -96,6 +96,9 @@ uint64_t vhost_scsi_common_get_features(VirtIODevice *vdev, uint64_t features,
 {
 VHostSCSICommon *vsc = VHOST_SCSI_COMMON(vdev);
 
+/* Turn on predefined features supported by this device */
+features |= vsc->host_features;
+
 return vhost_get_features(&vsc->dev, vsc->feature_bits, features);
 }
 
diff --git a/hw/scsi/vhost-user-scsi.c b/hw/scsi/vhost-user-scsi.c
index 694cb801209a..26491daaa0bf 100644
--- a/hw/scsi/vhost-user-scsi.c
+++ b/hw/scsi/vhost-user-scsi.c
@@ -137,18 +137,6 @@ static void vhost_user_scsi_unrealize(DeviceState *dev, Error **errp)
 }
 }
 
-static uint64_t vhost_user_scsi_get_features(VirtIODevice *vdev,
- uint64_t features, Error **errp)
-{
-VHostUserSCSI *s = VHOST_USER_SCSI(vdev);
-VHostSCSICommon *vsc = VHOST_SCSI_COMMON(s);
-
-/* Turn on predefined features supported by this device */
-features |= vsc->host_features;
-
-return vhost_scsi_common_get_features(vdev, features, errp);
-}
-
 static Property vhost_user_scsi_properties[] = {
 DEFINE_PROP_CHR("chardev", VirtIOSCSICommon, conf.chardev),
 DEFINE_PROP_UINT32("boot_tpgt", VirtIOSCSICommon, conf.boot_tpgt, 0),
@@ -188,7 +176,7 @@ static void vhost_user_scsi_class_init(ObjectClass *klass, void *data)
 set_bit(DEVICE_CATEGORY_STORAGE, dc->categories);
 vdc->realize = vhost_user_scsi_realize;
 vdc->unrealize = vhost_user_scsi_unrealize;
-vdc->get_features = vhost_user_scsi_get_features;
+vdc->get_features = vhost_scsi_common_get_features;
 vdc->set_config = vhost_scsi_common_set_config;
 vdc->set_status = vhost_user_scsi_set_status;
 fwc->get_dev_path = vhost_scsi_common_get_fw_dev_path;
-- 
2.17.1




[Qemu-devel] [PATCH 0/3] expose VIRTIO_SCSI_F_T10_PI for vhost-scsi and vhost-user-scsi

2018-08-08 Thread Greg Edwards
Unify the get_features functions for vhost-scsi and vhost-user-scsi, including
their use of host_features, and expose a new 't10_pi' property to enable
negotiation of the VIRTIO_SCSI_F_T10_PI feature bit with the backend.

Greg Edwards (3):
  vhost-user-scsi: move host_features into VHostSCSICommon
  vhost-scsi: unify vhost-scsi get_features implementations
  vhost-scsi: expose 't10_pi' property for VIRTIO_SCSI_F_T10_PI

 hw/scsi/vhost-scsi-common.c   |  3 +++
 hw/scsi/vhost-scsi.c  |  3 +++
 hw/scsi/vhost-user-scsi.c | 28 ++-
 include/hw/virtio/vhost-scsi-common.h |  1 +
 include/hw/virtio/vhost-user-scsi.h   |  1 -
 5 files changed, 17 insertions(+), 19 deletions(-)

-- 
2.17.1
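
(Illustration only, not part of the series.)  A standalone model of what the
three patches add up to: each bit property toggles one bit of a shared
host_features word, and the common get_features hook ORs that word into the
offered features.  CommonState, common_init(), set_feature_prop() and
common_get_features() are hypothetical names; the bit numbers are the ones the
virtio-scsi specification assigns to HOTPLUG, CHANGE and T10_PI.  The backend
filtering done by vhost_get_features() is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Feature bit numbers as assigned by the virtio-scsi specification. */
enum {
    F_HOTPLUG = 1,
    F_CHANGE  = 2,
    F_T10_PI  = 3,
};

/* Hypothetical stand-in for the host_features part of VHostSCSICommon. */
typedef struct {
    uint64_t host_features;
} CommonState;

/* Set or clear one feature bit, as a bit property would. */
void set_feature_prop(CommonState *s, unsigned bit, bool on)
{
    if (on) {
        s->host_features |= UINT64_C(1) << bit;
    } else {
        s->host_features &= ~(UINT64_C(1) << bit);
    }
}

/* Property defaults from the series: hotplug and param_change on, t10_pi off. */
void common_init(CommonState *s)
{
    s->host_features = 0;
    set_feature_prop(s, F_HOTPLUG, true);
    set_feature_prop(s, F_CHANGE, true);
    set_feature_prop(s, F_T10_PI, false);
}

/* Turn on the predefined features; backend filtering is not modelled. */
uint64_t common_get_features(const CommonState *s, uint64_t features)
{
    return features | s->host_features;
}

void host_features_example(void)
{
    CommonState s;

    common_init(&s);
    assert(common_get_features(&s, 0) == 0x6);
    set_feature_prop(&s, F_T10_PI, true);   /* models t10_pi=on */
    assert(common_get_features(&s, 0) == 0xe);
}
```

Only vhost-user-scsi defines the hotplug and param_change properties in this
series, so for vhost-scsi just the t10_pi bit of the model applies.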




Re: [Qemu-devel] [RFC PATCH 0/4] "pc: acpi: _CST support"

2018-08-08 Thread Michael S. Tsirkin
On Wed, Aug 08, 2018 at 05:15:45PM +0200, Igor Mammedov wrote:
> It's an alternative approach to
>  1) [PATCH hack dontapply v2 0/7] Dynamic _CST generation
> which instead of dynamic AML loading uses static AML with
> dynamic values.  It allows us to keep firmware blob static and
> to avoid split firmware issue (1) in case of cross version migration.

I think there's a misunderstanding. That patch only declares a couple of
states but that is just for debugging/demonstration purposes.  A typical
real CPU has more states (e.g. some intel CPUs have ~10 levels).


> ABI in this case is confined to cpu hotplug IO registers
> (i.e. do it old school way, like we used to do so far).
> This way we don't have to add yet another ABI to keep dynamic
> AML code under control (1).
> 
> Tested with: XPsp3 - WS2016 guests.
> 
> CC: "Michael S. Tsirkin" 
> 
> 
> Igor Mammedov (3):
>   acpi: add aml_create_byte_field()
>   pc: acpi: add _CST support
>   acpi: add support for CST update notification
> 
> Michael S. Tsirkin (1):
>   acpi: aml: add aml_register()
> 
>  include/hw/acpi/aml-build.h |   6 ++
>  include/hw/acpi/cpu.h   |  10 +++
>  docs/specs/acpi_cpu_hotplug.txt |  21 +-
>  hw/acpi/aml-build.c |  28 +++
>  hw/acpi/cpu.c   | 158 +++-
>  hw/acpi/piix4.c |   2 +
>  hw/i386/acpi-build.c|   5 +-
>  tests/bios-tables-test.c|   1 +
>  8 files changed, 225 insertions(+), 6 deletions(-)
> 
> -- 
> 2.7.4



Re: [Qemu-devel] [PATCH] fw_cfg: add bootdevice-ignore-suffixes property

2018-08-08 Thread Mark Cave-Ayland

On 08/08/18 20:53, Eduardo Habkost wrote:


On Wed, Aug 08, 2018 at 08:19:51PM +0100, Mark Cave-Ayland wrote:

For the older machines (such as Mac and SPARC) the DT nodes representing
bootdevices for disk nodes are irregular for mainly historical reasons.

Since the majority of bootdevice nodes for these machines either do not have a
separate disk node or require different (custom) names then it is much easier
to disable all suffixes for a particular machine by setting the ignore_suffixes
parameter to get_boot_devices_list() to true, and customise the disk nodes as
required.

Here we add a new bootdevice-ignore-suffixes property to the FW_CFG device to
allow the generation of disk suffixes to be controlled on a per-machine basis.

Signed-off-by: Mark Cave-Ayland 


Reviewed-by: Eduardo Habkost 

But I would prefer to see this merged only after we see machines
actually using the property.  Can you send that as a single
series later?


I don't have any more time until tomorrow evening now, but FWIW I've 
pushed my working branch to 
https://github.com/mcayland/qemu/commits/openbios-bootindex3 if you want 
to take a quick look. Example command line:


$ ./qemu-system-sparc64 -drive 
file=disk.img,if=none,index=0,id=cd,media=cdrom -device 
virtio-blk-pci,bus=pciB,drive=cd,bootindex=0 -m 256 -nographic


Would you still like me to post this to the list properly tomorrow evening?


Also, maybe we can do it in a simpler way:

I now see that fw_cfg is not the only user of
get_boot_devices_list().  I didn't want to have a fw_cfg-specific
field in MachineClass, but we can make it not fw_cfg-specific
if we make it affect all get_boot_devices_list() calls.

What do you think of the patch below?

(Patch is untested)

Signed-off-by: Eduardo Habkost 
---
diff --git a/include/hw/boards.h b/include/hw/boards.h
index d139a431a6..f82f28468b 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -206,6 +206,7 @@ struct MachineClass {
  bool auto_enable_numa_with_memhp;
  void (*numa_auto_assign_ram)(MachineClass *mc, NodeInfo *nodes,
   int nb_nodes, ram_addr_t size);
+bool ignore_boot_device_suffixes;
  
  HotplugHandler *(*get_hotplug_handler)(MachineState *machine,

 DeviceState *dev);
diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
index 76ef6196a7..8d6095d98b 100644
--- a/include/sysemu/sysemu.h
+++ b/include/sysemu/sysemu.h
@@ -182,7 +182,7 @@ void hmp_info_usb(Monitor *mon, const QDict *qdict);
  
  void add_boot_device_path(int32_t bootindex, DeviceState *dev,

const char *suffix);
-char *get_boot_devices_list(size_t *size, bool ignore_suffixes);
+char *get_boot_devices_list(size_t *size);
  
  DeviceState *get_boot_device(uint32_t position);

  void check_boot_index(int32_t bootindex, Error **errp);
diff --git a/bootdevice.c b/bootdevice.c
index 1141009114..1d225202f9 100644
--- a/bootdevice.c
+++ b/bootdevice.c
@@ -29,6 +29,7 @@
  #include "qemu/error-report.h"
  #include "sysemu/reset.h"
  #include "hw/qdev-core.h"
+#include "hw/boards.h"
  
  typedef struct FWBootEntry FWBootEntry;
  
@@ -208,11 +209,13 @@ DeviceState *get_boot_device(uint32_t position)

   * memory pointed by "size" is assigned total length of the array in bytes
   *
   */
-char *get_boot_devices_list(size_t *size, bool ignore_suffixes)
+char *get_boot_devices_list(size_t *size)
  {
  FWBootEntry *i;
  size_t total = 0;
  char *list = NULL;
+MachineClass *mc = MACHINE_GET_CLASS(qdev_get_machine());
+bool ignore_suffixes = mc->ignore_boot_device_suffixes;
  
  QTAILQ_FOREACH(i, &fw_boot_order, link) {

  char *devpath = NULL,  *suffix = NULL;
diff --git a/hw/nvram/fw_cfg.c b/hw/nvram/fw_cfg.c
index b23e7f64a8..d79a568f54 100644
--- a/hw/nvram/fw_cfg.c
+++ b/hw/nvram/fw_cfg.c
@@ -861,7 +861,7 @@ static void fw_cfg_machine_reset(void *opaque)
  void *ptr;
  size_t len;
  FWCfgState *s = opaque;
-char *bootindex = get_boot_devices_list(&len, false);
+char *bootindex = get_boot_devices_list(&len);
  
  ptr = fw_cfg_modify_file(s, "bootorder", (uint8_t *)bootindex, len);

  g_free(ptr);
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 421b2dd09b..47bc63b085 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1160,7 +1160,7 @@ static void spapr_dt_chosen(sPAPRMachineState *spapr, void *fdt)
  const char *boot_device = machine->boot_order;
  char *stdout_path = spapr_vio_stdout_path(spapr->vio_bus);
  size_t cb = 0;
-char *bootlist = get_boot_devices_list(&cb, true);
+char *bootlist = get_boot_devices_list(&cb);
  
  _FDT(chosen = fdt_add_subnode(fdt, 0, "chosen"));
  
@@ -3950,6 +3950,7 @@ static void spapr_machine_class_init(ObjectClass *oc, void *data)
  
  mc->desc = "pSeries Logical Partition (PAPR compliant)";
  
+mc->ignore_boot_device_suffixes = true;

  /*
   * We set up the default / latest behaviour here.  The 

Re: [Qemu-devel] [RFC PATCH 3/4] pc: acpi: add _CST support

2018-08-08 Thread Michael S. Tsirkin
On Wed, Aug 08, 2018 at 05:15:48PM +0200, Igor Mammedov wrote:
> Reuse CPU hotplug IO registers for passing a CST entry
> containing package for shallowest C1 using mwait and
> read it out in guest with new CCST AML method.

I don't see how 1 entry is enough. We need to describe full _CST package so
that guest can do reasonable power management on the CPU.


> The CState support is optional and could be turned on
> with '-global PIIX4_PM.cstate=on' CLI option.
> 
> Signed-off-by: Igor Mammedov 
> ---
> for demo purposes it's wired only to piix4
> TODO: q35 wiring
> 
> 'tested' with rhel7 and XPsp3 - WS2016
>  (i.e. it boots and all windows versions happy about AML qemu produces)
> ---
>  include/hw/acpi/cpu.h   |   9 +++
>  docs/specs/acpi_cpu_hotplug.txt |  10 ++-
>  hw/acpi/cpu.c   | 131
>  hw/acpi/piix4.c |   2 +
>  hw/i386/acpi-build.c|   5 +-
>  tests/bios-tables-test.c|   1 +
>  6 files changed, 156 insertions(+), 2 deletions(-)
> 
> diff --git a/include/hw/acpi/cpu.h b/include/hw/acpi/cpu.h
> index 89ce172..eb79cbf 100644
> --- a/include/hw/acpi/cpu.h
> +++ b/include/hw/acpi/cpu.h
> @@ -17,6 +17,12 @@
>  #include "hw/acpi/aml-build.h"
>  #include "hw/hotplug.h"
>  
> +typedef struct AcpiCState {
> +uint32_t current_cst_field;
> +uint32_t latency;
> +uint32_t power;
> +} AcpiCState;
> +
>  typedef struct AcpiCpuStatus {
>  struct CPUState *cpu;
>  uint64_t arch_id;
> @@ -24,6 +30,7 @@ typedef struct AcpiCpuStatus {
>  bool is_removing;
>  uint32_t ost_event;
>  uint32_t ost_status;
> +AcpiCState cst;
>  } AcpiCpuStatus;
>  
>  typedef struct CPUHotplugState {
> @@ -32,6 +39,7 @@ typedef struct CPUHotplugState {
>  uint8_t command;
>  uint32_t dev_count;
>  AcpiCpuStatus *devs;
> +bool enable_cstate;
>  } CPUHotplugState;
>  
>  void acpi_cpu_plug_cb(HotplugHandler *hotplug_dev,
> @@ -50,6 +58,7 @@ void cpu_hotplug_hw_init(MemoryRegion *as, Object *owner,
>  typedef struct CPUHotplugFeatures {
>  bool apci_1_compatible;
>  bool has_legacy_cphp;
> +bool cstate_enabled;
>  } CPUHotplugFeatures;
>  
>  void build_cpus_aml(Aml *table, MachineState *machine, CPUHotplugFeatures opts,
> diff --git a/docs/specs/acpi_cpu_hotplug.txt b/docs/specs/acpi_cpu_hotplug.txt
> index ee219c8..adfb026 100644
> --- a/docs/specs/acpi_cpu_hotplug.txt
> +++ b/docs/specs/acpi_cpu_hotplug.txt
> @@ -47,6 +47,12 @@ read access:
>in case of error or unsupported command reads is 0x
>current 'Command field' value:
>0: returns PXM value corresponding to device
> +  3: sequential reads return a sequence of DWORDs
> +   {
> + AddressSpaceKeyword, RegisterBitWidth, RegisterBitOffset,
> + RegisterAddress Lo, RegisterAddress Hi, AccessSize,
> + C State type, Latency, Power,
> +   }
>  
>  write access:
>  offset:
> @@ -75,7 +81,9 @@ write access:
>  1: following writes to 'Command data' register set OST event
> register in QEMU
>  2: following writes to 'Command data' register set OST status
> -   register in QEMU
> +3: following reads from 'Command data' register return Cx
> +   state (command execution resets unread field counter to the 
> 1st
> +   field).
>  other values: reserved
>  [0x6-0x7] reserved
>  [0x8] Command data: (DWORD access)
> diff --git a/hw/acpi/cpu.c b/hw/acpi/cpu.c
> index 5ae595e..7ef04f9 100644
> --- a/hw/acpi/cpu.c
> +++ b/hw/acpi/cpu.c
> @@ -16,6 +16,7 @@ enum {
>  CPHP_GET_NEXT_CPU_WITH_EVENT_CMD = 0,
>  CPHP_OST_EVENT_CMD = 1,
>  CPHP_OST_STATUS_CMD = 2,
> +CPHP_READ_CST_CMD = 3,
>  CPHP_CMD_MAX
>  };
>  
> @@ -73,6 +74,41 @@ static uint64_t cpu_hotplug_rd(void *opaque, hwaddr addr, 
> unsigned size)
>  case CPHP_GET_NEXT_CPU_WITH_EVENT_CMD:
> val = cpu_st->selector;
> break;
> +case CPHP_READ_CST_CMD:
> +switch (cdev->cst.current_cst_field) {
> +case 0:
> +val = cpu_to_le32(AML_AS_FFH); /* AddressSpaceKeyword */
> +break;
> +case 1:  /* RegisterBitWidth */
> +val = cpu_to_le32(1); /* Vendor: Intel */
> +break;
> +case 2:  /* RegisterBitOffset */
> +val = cpu_to_le32(2); /* Class: Native C State Instruction */
> +break;
> +case 3:  /* RegisterAddress Lo */
> +val = cpu_to_le64(0); /* Arg0: mwait EAX hint */
> +break;
> +case 4:  /* RegisterAddress Hi */
> +val = cpu_to_le32(0); /* Reserved */
> +break;
> +case 5:  /* AccessSize */
> +val = cpu_to_le32(0); /* Arg1 */

Re: [Qemu-devel] [PATCH] fw_cfg: add bootdevice-ignore-suffixes property

2018-08-08 Thread Eduardo Habkost
On Wed, Aug 08, 2018 at 09:39:49PM +0200, Laszlo Ersek wrote:
> On 08/08/18 21:19, Mark Cave-Ayland wrote:
> > For the older machines (such as Mac and SPARC) the DT nodes representing
> > bootdevices for disk nodes are irregular for mainly historical reasons.
> > 
> > Since the majority of bootdevice nodes for these machines either do not 
> > have a
> > separate disk node or require different (custom) names then it is much 
> > easier
> > to disable all suffixes for a particular machine by setting the 
> > ignore_suffixes
> > parameter to get_boot_devices_list() to true, and customise the disk nodes 
> > as
> > required.
> > 
> > Here we add a new bootdevice-ignore-suffixes property to the FW_CFG device 
> > to
> > allow the generation of disk suffixes to be controlled on a per-machine 
> > basis.
> > 
> > Signed-off-by: Mark Cave-Ayland 
> > ---
> >  hw/nvram/fw_cfg.c | 9 -
> >  include/hw/nvram/fw_cfg.h | 1 +
> >  2 files changed, 9 insertions(+), 1 deletion(-)
> > 
> > diff --git a/hw/nvram/fw_cfg.c b/hw/nvram/fw_cfg.c
> > index b23e7f64a8..52488b999f 100644
> > --- a/hw/nvram/fw_cfg.c
> > +++ b/hw/nvram/fw_cfg.c
> > @@ -861,7 +861,8 @@ static void fw_cfg_machine_reset(void *opaque)
> >  void *ptr;
> >  size_t len;
> >  FWCfgState *s = opaque;
> > -char *bootindex = get_boot_devices_list(&len, false);
> > +char *bootindex = get_boot_devices_list(&len,
> > +s->bootdevice_ignore_suffixes);
> >  
> >  ptr = fw_cfg_modify_file(s, "bootorder", (uint8_t *)bootindex, len);
> >  g_free(ptr);
> > @@ -990,12 +991,18 @@ FWCfgState *fw_cfg_find(void)
> >  return FW_CFG(object_resolve_path_type("", TYPE_FW_CFG, NULL));
> >  }
> >  
> > +static Property fw_cfg_properties[] = {
> > +DEFINE_PROP_BOOL("bootdevice-ignore-suffixes", FWCfgState,
> > + bootdevice_ignore_suffixes, false),
> > +DEFINE_PROP_END_OF_LIST(),
> > +};
> 
> I've got two questions which are not "loaded" -- I honestly have no clue:
> 
> - Do we intend to expose this to users and higher-level tools? If not,
> should it be called "x-..." (experimental)? I can't remember the rules
> about "x-" properties.

That would be a good idea.  But maybe we can skip using QOM to
address this, and go back to a MachineClass field (but this time
not fw_cfg-specific).  See my other reply.


> 
> - I vaguely recall that earlier we tried to add properties to the fw_cfg
> base class, but ultimately added them to the derived classes (see
> "fw_cfg_mem_properties" and "fw_cfg_io_properties"). Despite the fact
> that the referenced fields themselves (dma_enabled, file_slots) belong
> to the base class; IOW, the properties refer to "parent_obj.xxx". I
> don't really remember why we did this. I seem to recall issues
> otherwise, with setting the property from the command line due to object
> construction / realization order, or whatever.

Maybe it was a workaround to an old bug in compat_props handling.
I will try to find out.

> 
> Mark, can you verify whether you can control
> "bootdevice-ignore-suffixes" from the command line, e.g. via "-global"?
> 
> The object model keeps scaring me. :(

You're not alone.  :(

-- 
Eduardo



Re: [Qemu-devel] [PATCH] fw_cfg: add bootdevice-ignore-suffixes property

2018-08-08 Thread Eduardo Habkost
On Wed, Aug 08, 2018 at 08:19:51PM +0100, Mark Cave-Ayland wrote:
> For the older machines (such as Mac and SPARC) the DT nodes representing
> bootdevices for disk nodes are irregular for mainly historical reasons.
> 
> Since the majority of bootdevice nodes for these machines either do not have a
> separate disk node or require different (custom) names then it is much easier
> to disable all suffixes for a particular machine by setting the 
> ignore_suffixes
> parameter to get_boot_devices_list() to true, and customise the disk nodes as
> required.
> 
> Here we add a new bootdevice-ignore-suffixes property to the FW_CFG device to
> allow the generation of disk suffixes to be controlled on a per-machine basis.
> 
> Signed-off-by: Mark Cave-Ayland 

Reviewed-by: Eduardo Habkost 

But I would prefer to see this merged only after we see machines
actually using the property.  Can you send that as a single
series later?

Also, maybe we can do it in a simpler way:

I now see that fw_cfg is not the only user of
get_boot_devices_list().  I didn't want to have a fw_cfg-specific
field in MachineClass, but we can make it not fw_cfg-specific
if we make it affect all get_boot_devices_list() calls.

What do you think of the patch below?

(Patch is untested)

Signed-off-by: Eduardo Habkost 
---
diff --git a/include/hw/boards.h b/include/hw/boards.h
index d139a431a6..f82f28468b 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -206,6 +206,7 @@ struct MachineClass {
 bool auto_enable_numa_with_memhp;
 void (*numa_auto_assign_ram)(MachineClass *mc, NodeInfo *nodes,
  int nb_nodes, ram_addr_t size);
+bool ignore_boot_device_suffixes;
 
 HotplugHandler *(*get_hotplug_handler)(MachineState *machine,
DeviceState *dev);
diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
index 76ef6196a7..8d6095d98b 100644
--- a/include/sysemu/sysemu.h
+++ b/include/sysemu/sysemu.h
@@ -182,7 +182,7 @@ void hmp_info_usb(Monitor *mon, const QDict *qdict);
 
 void add_boot_device_path(int32_t bootindex, DeviceState *dev,
   const char *suffix);
-char *get_boot_devices_list(size_t *size, bool ignore_suffixes);
+char *get_boot_devices_list(size_t *size);
 
 DeviceState *get_boot_device(uint32_t position);
 void check_boot_index(int32_t bootindex, Error **errp);
diff --git a/bootdevice.c b/bootdevice.c
index 1141009114..1d225202f9 100644
--- a/bootdevice.c
+++ b/bootdevice.c
@@ -29,6 +29,7 @@
 #include "qemu/error-report.h"
 #include "sysemu/reset.h"
 #include "hw/qdev-core.h"
+#include "hw/boards.h"
 
 typedef struct FWBootEntry FWBootEntry;
 
@@ -208,11 +209,13 @@ DeviceState *get_boot_device(uint32_t position)
  * memory pointed by "size" is assigned total length of the array in bytes
  *
  */
-char *get_boot_devices_list(size_t *size, bool ignore_suffixes)
+char *get_boot_devices_list(size_t *size)
 {
 FWBootEntry *i;
 size_t total = 0;
 char *list = NULL;
+MachineClass *mc = MACHINE_GET_CLASS(qdev_get_machine());
+bool ignore_suffixes = mc->ignore_boot_device_suffixes;
 
 QTAILQ_FOREACH(i, &fw_boot_order, link) {
 char *devpath = NULL,  *suffix = NULL;
diff --git a/hw/nvram/fw_cfg.c b/hw/nvram/fw_cfg.c
index b23e7f64a8..d79a568f54 100644
--- a/hw/nvram/fw_cfg.c
+++ b/hw/nvram/fw_cfg.c
@@ -861,7 +861,7 @@ static void fw_cfg_machine_reset(void *opaque)
 void *ptr;
 size_t len;
 FWCfgState *s = opaque;
-char *bootindex = get_boot_devices_list(&len, false);
+char *bootindex = get_boot_devices_list(&len);
 
 ptr = fw_cfg_modify_file(s, "bootorder", (uint8_t *)bootindex, len);
 g_free(ptr);
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 421b2dd09b..47bc63b085 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1160,7 +1160,7 @@ static void spapr_dt_chosen(sPAPRMachineState *spapr, 
void *fdt)
 const char *boot_device = machine->boot_order;
 char *stdout_path = spapr_vio_stdout_path(spapr->vio_bus);
 size_t cb = 0;
-char *bootlist = get_boot_devices_list(&cb, true);
+char *bootlist = get_boot_devices_list(&cb);
 
 _FDT(chosen = fdt_add_subnode(fdt, 0, "chosen"));
 
@@ -3950,6 +3950,7 @@ static void spapr_machine_class_init(ObjectClass *oc, 
void *data)
 
 mc->desc = "pSeries Logical Partition (PAPR compliant)";
 
+mc->ignore_boot_device_suffixes = true;
 /*
  * We set up the default / latest behaviour here.  The class_init
  * functions for the specific versioned machine types can override
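
For readers following along, the idea in the patch above can be modelled
outside QEMU. The sketch below is only an illustration under stated
assumptions: "struct boot_entry", "struct machine_class" and
build_boot_list() are hypothetical stand-ins, not the QEMU types or the
real get_boot_devices_list(), and the newline-separated output is a
simplification. It shows a single per-machine flag deciding whether the
suffix is appended, so that callers no longer pass ignore_suffixes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a boot entry: device path plus optional suffix. */
struct boot_entry {
    const char *devpath;
    const char *suffix;   /* may be NULL */
};

/* Hypothetical stand-in for the per-machine class flag. */
struct machine_class {
    bool ignore_boot_device_suffixes;
};

/*
 * Build a newline-separated list of boot paths in out.
 * The suffix is appended only when the machine does not ignore suffixes.
 * Returns the number of characters written, excluding the final NUL.
 */
static size_t build_boot_list(const struct machine_class *mc,
                              const struct boot_entry *entries, size_t n,
                              char *out, size_t outsz)
{
    size_t used = 0;

    assert(out != NULL && outsz > 0);
    out[0] = '\0';

    for (size_t i = 0; i < n; i++) {
        const char *suffix = "";
        int w;

        if (!mc->ignore_boot_device_suffixes && entries[i].suffix) {
            suffix = entries[i].suffix;
        }
        w = snprintf(out + used, outsz - used, "%s%s%s",
                     i ? "\n" : "", entries[i].devpath, suffix);
        if (w < 0 || (size_t)w >= outsz - used) {
            out[used] = '\0';   /* drop the entry that did not fit */
            break;
        }
        used += (size_t)w;
    }
    return used;
}
```

The flag lives with the machine rather than with the caller, which is the
point of the proposal: every user of the list sees the same behaviour for
a given machine type.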



Re: [Qemu-devel] [PATCH] fw_cfg: add bootdevice-ignore-suffixes property

2018-08-08 Thread Laszlo Ersek
On 08/08/18 21:19, Mark Cave-Ayland wrote:
> For the older machines (such as Mac and SPARC) the DT nodes representing
> bootdevices for disk nodes are irregular for mainly historical reasons.
> 
> Since the majority of bootdevice nodes for these machines either do not have a
> separate disk node or require different (custom) names then it is much easier
> to disable all suffixes for a particular machine by setting the 
> ignore_suffixes
> parameter to get_boot_devices_list() to true, and customise the disk nodes as
> required.
> 
> Here we add a new bootdevice-ignore-suffixes property to the FW_CFG device to
> allow the generation of disk suffixes to be controlled on a per-machine basis.
> 
> Signed-off-by: Mark Cave-Ayland 
> ---
>  hw/nvram/fw_cfg.c | 9 -
>  include/hw/nvram/fw_cfg.h | 1 +
>  2 files changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/nvram/fw_cfg.c b/hw/nvram/fw_cfg.c
> index b23e7f64a8..52488b999f 100644
> --- a/hw/nvram/fw_cfg.c
> +++ b/hw/nvram/fw_cfg.c
> @@ -861,7 +861,8 @@ static void fw_cfg_machine_reset(void *opaque)
>  void *ptr;
>  size_t len;
>  FWCfgState *s = opaque;
> -char *bootindex = get_boot_devices_list(&len, false);
> +char *bootindex = get_boot_devices_list(&len,
> +s->bootdevice_ignore_suffixes);
>  
>  ptr = fw_cfg_modify_file(s, "bootorder", (uint8_t *)bootindex, len);
>  g_free(ptr);
> @@ -990,12 +991,18 @@ FWCfgState *fw_cfg_find(void)
>  return FW_CFG(object_resolve_path_type("", TYPE_FW_CFG, NULL));
>  }
>  
> +static Property fw_cfg_properties[] = {
> +DEFINE_PROP_BOOL("bootdevice-ignore-suffixes", FWCfgState,
> + bootdevice_ignore_suffixes, false),
> +DEFINE_PROP_END_OF_LIST(),
> +};

I've got two questions which are not "loaded" -- I honestly have no clue:

- Do we intend to expose this to users and higher-level tools? If not,
should it be called "x-..." (experimental)? I can't remember the rules
about "x-" properties.

- I vaguely recall that earlier we tried to add properties to the fw_cfg
base class, but ultimately added them to the derived classes (see
"fw_cfg_mem_properties" and "fw_cfg_io_properties"). Despite the fact
that the referenced fields themselves (dma_enabled, file_slots) belong
to the base class; IOW, the properties refer to "parent_obj.xxx". I
don't really remember why we did this. I seem to recall issues
otherwise, with setting the property from the command line due to object
construction / realization order, or whatever.

Mark, can you verify whether you can control
"bootdevice-ignore-suffixes" from the command line, e.g. via "-global"?

The object model keeps scaring me. :(

Laszlo

>  
>  static void fw_cfg_class_init(ObjectClass *klass, void *data)
>  {
>  DeviceClass *dc = DEVICE_CLASS(klass);
>  
>  dc->reset = fw_cfg_reset;
> +dc->props = fw_cfg_properties;
>  dc->vmsd = &vmstate_fw_cfg;
>  }
>  
> diff --git a/include/hw/nvram/fw_cfg.h b/include/hw/nvram/fw_cfg.h
> index b2259cc4a3..848c83aef4 100644
> --- a/include/hw/nvram/fw_cfg.h
> +++ b/include/hw/nvram/fw_cfg.h
> @@ -58,6 +58,7 @@ struct FWCfgState {
>  uint16_t cur_entry;
>  uint32_t cur_offset;
>  Notifier machine_ready;
> +bool bootdevice_ignore_suffixes;
>  
>  int fw_cfg_order_override;
>  
> 




Re: [Qemu-devel] [PATCH v2 18/34] tests: virtio: separate ccw tests from libqos

2018-08-08 Thread Laurent Vivier
On 06/08/2018 16:33, Emanuele Giuseppe Esposito wrote:
> From: Paolo Bonzini 
> 
> Because qtest does not support s390 channel I/O, s390 only performs smoke 
> tests on
> those few devices that do not have any functional tests.  Therefore, every 
> time we
> add functional tests for a virtio device, the choice is between removing
> those tests from the s390 suite (so that s390 actually _loses_ coverage)
> or sprinkling the test with architecture checks.
> 
> This patch simply creates a ccw-specific test that only performs smoke tests 
> on
> all virtio-ccw devices.  If channel I/O support is ever added to qtest and 
> libqos,
> then this file can go away.  In the meanwhile, it simplifies maintenance and
> makes sure that all virtio devices are tested.
> 
> Signed-off-by: Paolo Bonzini 
> Signed-off-by: Emanuele Giuseppe Esposito 

You can add the "Acked-by" from Cornelia and the "Reviewed-by" from Thomas.

Thanks,
Laurent



Re: [Qemu-devel] [PATCH v2 04/34] tests/qgraph: x86_64/pc machine node

2018-08-08 Thread Laurent Vivier
On 06/08/2018 16:33, Emanuele Giuseppe Esposito wrote:
> Add pc machine for the x86_64 QEMU binary. This machine contains an 
> i440FX-pcihost
> driver, that contains itself a pci-bus-pc that produces the pci-bus interface.
> 
> Signed-off-by: Emanuele Giuseppe Esposito 
> ---
>  tests/Makefile.include   |   3 +
>  tests/libqos/x86_64_pc-machine.c | 110 +++
>  2 files changed, 113 insertions(+)
>  create mode 100644 tests/libqos/x86_64_pc-machine.c
> 
> diff --git a/tests/Makefile.include b/tests/Makefile.include
> index f04f9fbc3a..4e7b4bb614 100644
> --- a/tests/Makefile.include
> +++ b/tests/Makefile.include
> @@ -771,7 +771,10 @@ libqos-imx-obj-y = $(libqos-obj-y) tests/libqos/i2c-imx.o
>  libqos-usb-obj-y = $(libqos-spapr-obj-y) $(libqos-pc-obj-y) 
> tests/libqos/usb.o
>  libqos-virtio-obj-y = $(libqos-spapr-obj-y) $(libqos-pc-obj-y) 
> tests/libqos/virtio.o tests/libqos/virtio-pci.o tests/libqos/virtio-mmio.o 
> tests/libqos/malloc-generic.o
>  
> +libqgraph-machines-obj-y = tests/libqos/x86_64_pc-machine.o
> +
>  libqgraph-pci-obj-y = $(libqos-pc-obj-y)
> +libqgraph-pci-obj-y += $(libqgraph-machines-obj-y)
>  
>  check-unit-y += tests/test-qgraph$(EXESUF)
>  tests/test-qgraph$(EXESUF): tests/test-qgraph.o $(libqgraph-obj-y)
> diff --git a/tests/libqos/x86_64_pc-machine.c 
> b/tests/libqos/x86_64_pc-machine.c
> new file mode 100644
> index 00..e3eddf2eba
> --- /dev/null
> +++ b/tests/libqos/x86_64_pc-machine.c
> @@ -0,0 +1,110 @@
> +/*
> + * libqos driver framework
> + *
> + * Copyright (c) 2018 Emanuele Giuseppe Esposito 
> 
> + *
> + * This library is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU Lesser General Public
> + * License version 2 as published by the Free Software Foundation.
> + *
> + * This library is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * Lesser General Public License for more details.
> + *
> + * You should have received a copy of the GNU Lesser General Public
> + * License along with this library; if not, see 
> 
> + */
> +
> +#include "qemu/osdep.h"
> +#include "libqtest.h"
> +#include "libqos/qgraph.h"
> +#include "pci-pc.h"
> +#include "malloc-pc.h"
> +
> +typedef struct QX86_64_PCMachine QX86_64_PCMachine;
> +typedef struct i440FX_pcihost i440FX_pcihost;
> +typedef struct QSDHCI_PCI  QSDHCI_PCI;
> +
> +struct i440FX_pcihost {
> +QOSGraphObject obj;
> +QPCIBusPC pci;
> +};
> +
> +struct QX86_64_PCMachine {
> +QOSGraphObject obj;
> +QGuestAllocator *alloc;
> +i440FX_pcihost bridge;
> +};
> +
> +/* i440FX_pcihost */
> +
> +static QOSGraphObject *i440FX_host_get_device(void *obj, const char *device)
> +{
> +i440FX_pcihost *host = obj;
> +if (!g_strcmp0(device, "pci-bus-pc")) {
> +return &host->pci.obj;
> +}
> +printf("%s not present in i440FX-pcihost\n", device);

fprintf(stderr, ...

> +abort();

g_assert_not_reached()

> +}
> +
> +static void qos_create_i440FX_host(i440FX_pcihost *host,
> +   QGuestAllocator *alloc)
> +{
> +host->obj.get_device = i440FX_host_get_device;
> +qpci_init_pc(&host->pci, global_qtest, alloc);
> +}
> +
> +/* x86_64/pc machine */
> +
> +static void pc_destroy(QOSGraphObject *obj)
> +{
> +QX86_64_PCMachine *machine = (QX86_64_PCMachine *) obj;
> +pc_alloc_uninit(machine->alloc);
> +g_free(machine);
> +}
> +
> +static void *pc_get_driver(void *object, const char *interface)
> +{
> +QX86_64_PCMachine *machine = object;
> +if (!g_strcmp0(interface, "guest_allocator")) {

Perhaps we can call that "memory"  rather than "guest_allocator", as it
gives access to the memory of the guest?

> +return machine->alloc;
> +}
> +
> +printf("%s not present in x86_64/pc\n", interface);

fprintf(stderr, ...

> +abort();

g_assert_not_reached()

> +}
> +
> +static QOSGraphObject *pc_get_device(void *obj, const char *device)
> +{
> +QX86_64_PCMachine *machine = obj;
> +if (!g_strcmp0(device, "i440FX-pcihost")) {
> +return &machine->bridge.obj;
> +}
> +
> +printf("%s not present in x86_64/pc\n", device);

fprintf(stderr, ...

> +abort();

g_assert_not_reached()

> +}
> +
> +static void *qos_create_machine_pc(void)
> +{
> +QX86_64_PCMachine *machine = g_new0(QX86_64_PCMachine, 1);
> +machine->obj.get_device = pc_get_device;
> +machine->obj.get_driver = pc_get_driver;
> +machine->obj.destructor = pc_destroy;
> +machine->alloc = pc_alloc_init_flags(global_qtest, ALLOC_NO_FLAGS);
> +qos_create_i440FX_host(&machine->bridge, machine->alloc);
> +
> +return &machine->obj;
> +}
> +
> +static void pc_machine(void)
> +{
> +qos_node_create_machine("x86_64/pc", qos_create_machine_pc);
> +qos_node_create_driver("i440FX-pcihost", NULL);
> +qos_node_contains("x86_64/pc", 

[Qemu-devel] [PATCH] fw_cfg: add bootdevice-ignore-suffixes property

2018-08-08 Thread Mark Cave-Ayland
For the older machines (such as Mac and SPARC) the DT nodes representing
bootdevices for disk nodes are irregular for mainly historical reasons.

Since the majority of bootdevice nodes for these machines either do not have a
separate disk node or require different (custom) names then it is much easier
to disable all suffixes for a particular machine by setting the ignore_suffixes
parameter to get_boot_devices_list() to true, and customise the disk nodes as
required.

Here we add a new bootdevice-ignore-suffixes property to the FW_CFG device to
allow the generation of disk suffixes to be controlled on a per-machine basis.

Signed-off-by: Mark Cave-Ayland 
---
 hw/nvram/fw_cfg.c | 9 -
 include/hw/nvram/fw_cfg.h | 1 +
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/hw/nvram/fw_cfg.c b/hw/nvram/fw_cfg.c
index b23e7f64a8..52488b999f 100644
--- a/hw/nvram/fw_cfg.c
+++ b/hw/nvram/fw_cfg.c
@@ -861,7 +861,8 @@ static void fw_cfg_machine_reset(void *opaque)
 void *ptr;
 size_t len;
 FWCfgState *s = opaque;
-char *bootindex = get_boot_devices_list(&len, false);
+char *bootindex = get_boot_devices_list(&len,
+s->bootdevice_ignore_suffixes);
 
 ptr = fw_cfg_modify_file(s, "bootorder", (uint8_t *)bootindex, len);
 g_free(ptr);
@@ -990,12 +991,18 @@ FWCfgState *fw_cfg_find(void)
 return FW_CFG(object_resolve_path_type("", TYPE_FW_CFG, NULL));
 }
 
+static Property fw_cfg_properties[] = {
+DEFINE_PROP_BOOL("bootdevice-ignore-suffixes", FWCfgState,
+ bootdevice_ignore_suffixes, false),
+DEFINE_PROP_END_OF_LIST(),
+};
 
 static void fw_cfg_class_init(ObjectClass *klass, void *data)
 {
 DeviceClass *dc = DEVICE_CLASS(klass);
 
 dc->reset = fw_cfg_reset;
+dc->props = fw_cfg_properties;
 dc->vmsd = &vmstate_fw_cfg;
 }
 
diff --git a/include/hw/nvram/fw_cfg.h b/include/hw/nvram/fw_cfg.h
index b2259cc4a3..848c83aef4 100644
--- a/include/hw/nvram/fw_cfg.h
+++ b/include/hw/nvram/fw_cfg.h
@@ -58,6 +58,7 @@ struct FWCfgState {
 uint16_t cur_entry;
 uint32_t cur_offset;
 Notifier machine_ready;
+bool bootdevice_ignore_suffixes;
 
 int fw_cfg_order_override;
 
-- 
2.11.0




Re: [Qemu-devel] [PATCH 2/2] fw_cfg: set the get_boot_devices_list() ignore_suffixes parameter from machine property

2018-08-08 Thread Mark Cave-Ayland

On 07/08/18 20:45, Eduardo Habkost wrote:


Is this sufficient, or are the compat properties supposed to be versioned
according to the QEMU machine version?


I never saw compat_properties being used for non-versioned
machines, but it should work for this use case as well.

But, I'm not sure this is the best option.  If the machine type
is not versioned, you can simply manually set the device property
to the desired value when creating the device inside your machine
init function.  See how the "data_width" and "dma_enabled"
properties are set by existing machines, for an example.


After some more digging, I can see that you are right and that we can 
actually get away with a standard qdev property for the fw_cfg device 
rather than a machine property after all - it's just that I need to 
convert both machines away from using the old fw_cfg_init() functions 
first. I'll send over what I have shortly.



ATB,

Mark.



Re: [Qemu-devel] [PATCH 3/4] serial-mcb: Add serial via MEN chameleon bus

2018-08-08 Thread Philippe Mathieu-Daudé
Hi Johannes,

On 08/08/2018 11:16 AM, Johannes Thumshirn wrote:
> Add MEN z125 UART over MEN Chameleon Bus emulation.
> 
> Signed-off-by: Johannes Thumshirn 
> ---
>  hw/char/Makefile.objs |  1 +
>  hw/char/serial-mcb.c  | 97 
> +++
>  2 files changed, 98 insertions(+)
>  create mode 100644 hw/char/serial-mcb.c
> 
> diff --git a/hw/char/Makefile.objs b/hw/char/Makefile.objs
> index b57053129107..063f1720974d 100644
> --- a/hw/char/Makefile.objs
> +++ b/hw/char/Makefile.objs
> @@ -6,6 +6,7 @@ common-obj-$(CONFIG_PL011) += pl011.o
>  common-obj-$(CONFIG_SERIAL) += serial.o
>  common-obj-$(CONFIG_SERIAL_ISA) += serial-isa.o
>  common-obj-$(CONFIG_SERIAL_PCI) += serial-pci.o
> +common-obj-$(CONFIG_MCB) += serial-mcb.o
>  common-obj-$(CONFIG_VIRTIO_SERIAL) += virtio-console.o
>  common-obj-$(CONFIG_XILINX) += xilinx_uartlite.o
>  common-obj-$(CONFIG_XEN) += xen_console.o
> diff --git a/hw/char/serial-mcb.c b/hw/char/serial-mcb.c
> new file mode 100644
> index ..a1087bc369dd
> --- /dev/null
> +++ b/hw/char/serial-mcb.c
> @@ -0,0 +1,97 @@
> +/*
> + * QEMU MEN 16z125 UART over MCB emulation
> + *
> + * Copyright (C) 2016 Johannes Thumshirn 
> + *
> + * This code is licensed under the GNU GPL v2 or (at your opinion) any
> + * later version
> + */
> +
> +#include "qemu/osdep.h"
> +#include "qapi/error.h"
> +#include "hw/char/serial.h"
> +#include "hw/mcb/mcb.h"
> +
> +typedef struct {

IMHO adding ...

   /*< private >*/


> +MCBDevice dev;

   /*< public >*/

... makes it more explicit that MCBDevice contains the QOM parent
DeviceState. Since you don't seem to use it you can even name it parent_obj:

   /*< private >*/
   MCBDevice parent_obj;
   /*< public >*/
   ...

> +SerialState state;
> +} MCBSerialState;
> +
> +static void serial_mcb_realize(DeviceState *dev, Error **errp)
> +{
> +MCBDevice *mdev = MCB_DEVICE(dev);
> +MCBSerialState *mss = DO_UPCAST(MCBSerialState, dev, mdev);
> +MCBus *bus = MCB_BUS(qdev_get_parent_bus(DEVICE(dev)));
> +SerialState *s = &mss->state;
> +Error *err = 0;
> +
> +mdev->gdd = mcb_new_chameleon_descriptor(bus, 125, mdev->rev,
> + mdev->var, 0x10);
> +if (!mdev->gdd) {
> +return;
> +}
> +
> +s->baudbase = 115200;
> +serial_realize_core(s, &err);
> +if (err != NULL) {
> +error_propagate(errp, err);
> +return;
> +}
> +
> +s->irq = mcb_allocate_irq(&mss->dev);
> +memory_region_init_io(>io, OBJECT(mss), _io_ops, s, "serial", 
> 8);
> +memory_region_add_subregion(>mmio_region, mdev->gdd->offset, 
> >io);
> +}
> +
> +static void serial_mcb_unrealize(DeviceState *dev, Error **errp)
> +{
> +MCBDevice *mdev = MCB_DEVICE(dev);
> +
> +g_free(>gdd);
> +}
> +
> +static const VMStateDescription vmstate_mcb_serial = {
> +.name = "mcb-serial",
> +.version_id = 1,
> +.minimum_version_id = 1,
> +.fields = (VMStateField[]) {
> +VMSTATE_MCB_DEVICE(dev, MCBSerialState),
> +VMSTATE_STRUCT(state, MCBSerialState, 0, vmstate_serial, 
> SerialState),
> +VMSTATE_END_OF_LIST()
> +}
> +};
> +
> +static Property serial_mcb_properties[] = {
> +DEFINE_PROP_CHR("chardev", MCBSerialState, state.chr),
> +DEFINE_PROP_UINT8("rev", MCBSerialState, dev.rev, 0),
> +DEFINE_PROP_UINT8("var", MCBSerialState, dev.var, 0),
> +DEFINE_PROP_END_OF_LIST(),
> +};
> +
> +static void serial_mcb_class_initfn(ObjectClass *klass, void *data)
> +{
> +DeviceClass *dc = DEVICE_CLASS(klass);
> +MCBDeviceClass *mc = MCB_DEVICE_CLASS(klass);
> +
> +mc->realize = serial_mcb_realize;
> +mc->unrealize = serial_mcb_unrealize;
> +
> +set_bit(DEVICE_CATEGORY_INPUT, dc->categories);
> +dc->desc = "MEN 16z125 UART over MCB";
> +dc->vmsd = &vmstate_mcb_serial;
> +dc->props = serial_mcb_properties;
> +}
> +
> +static const TypeInfo serial_mcb_info = {
> +.name = "mcb-serial",
> +.parent = TYPE_MCB_DEVICE,
> +.instance_size = sizeof(MCBSerialState),
> +.class_init = serial_mcb_class_initfn,
> +};
> +
> +static void serial_mcb_register_types(void)
> +{
> +type_register_static(&serial_mcb_info);
> +}
> +
> +type_init(serial_mcb_register_types);
> 





Re: [Qemu-devel] [PATCH v2] monitor: print message when using 'help' with an unknown command

2018-08-08 Thread Dr. David Alan Gilbert
* Collin Walling (wall...@linux.ibm.com) wrote:
> When typing 'help' followed by an unknown command, QEMU will
> not print anything to the command line to let the user know
> they typed a bad command. Let's fix this by printing a message
> to the monitor when this happens. For example:
> 
> (qemu) help xyz
> unknown command: 'xyz'
> 
> Reported-by: Stefan Zimmermann 
> Signed-off-by: Collin Walling 

Reviewed-by: Dr. David Alan Gilbert 

> ---
>  monitor.c | 9 -
>  1 file changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/monitor.c b/monitor.c
> index 7af1f18..deeb41c 100644
> --- a/monitor.c
> +++ b/monitor.c
> @@ -1013,6 +1013,7 @@ static void help_cmd_dump(Monitor *mon, const mon_cmd_t 
> *cmds,
>char **args, int nb_args, int arg_index)
>  {
>  const mon_cmd_t *cmd;
> +size_t i;
>  
>  /* No valid arg need to compare with, dump all in *cmds */
>  if (arg_index >= nb_args) {
> @@ -1034,9 +1035,15 @@ static void help_cmd_dump(Monitor *mon, const 
> mon_cmd_t *cmds,
>  } else {
>  help_cmd_dump_one(mon, cmd, args, arg_index);
>  }
> -break;
> +return;
>  }
>  }
> +
> +/* Command not found */
> +monitor_printf(mon, "unknown command: '");
> +for (i = 0; i <= arg_index; i++) {
> +monitor_printf(mon, "%s%s", args[i], i == arg_index ? "'\n" : " ");
> +}
>  }
>  
>  static void help_cmd(Monitor *mon, const char *name)
> -- 
> 2.7.4
> 
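
The loop added by the patch prints every argument up to and including the
one that failed to match, so a bad sub-command is reported together with
its parent command. Below is a minimal standalone sketch of that
formatting, assuming a plain character buffer in place of the monitor;
format_unknown() is a hypothetical helper, not a QEMU function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Format "unknown command: '<args[0]> ... <args[arg_index]>'\n" into out.
 * Mirrors the loop in the patch: arguments are separated by a space and
 * the last one is followed by the closing quote and a newline.
 * Returns the number of characters written, excluding the final NUL.
 */
static size_t format_unknown(char *out, size_t outsz,
                             char **args, int arg_index)
{
    size_t used;
    int w;

    assert(out != NULL && outsz > 0);

    w = snprintf(out, outsz, "unknown command: '");
    if (w < 0 || (size_t)w >= outsz) {
        out[0] = '\0';
        return 0;
    }
    used = (size_t)w;

    for (int i = 0; i <= arg_index; i++) {
        w = snprintf(out + used, outsz - used, "%s%s", args[i],
                     i == arg_index ? "'\n" : " ");
        if (w < 0 || (size_t)w >= outsz - used) {
            out[used] = '\0';   /* stop cleanly if the buffer is too small */
            break;
        }
        used += (size_t)w;
    }
    return used;
}
```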
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK



Re: [Qemu-devel] [PATCH v2 03/34] tests/qgraph: pci-pc driver and interface nodes

2018-08-08 Thread Laurent Vivier
On 06/08/2018 16:33, Emanuele Giuseppe Esposito wrote:
> Add pci-bus-pc node, move QPCIBusPC struct declaration in its header
> (since it will be needed by other drivers) and introduce a setter method
> for drivers that do not need to allocate but have to initialize QPCIBusPC.
> 
> Signed-off-by: Emanuele Giuseppe Esposito 
> ---
>  tests/Makefile.include |  4 +++-
>  tests/libqos/pci-pc.c  | 41 +---
>  tests/libqos/pci-pc.h  | 15 -
>  tests/libqos/pci.c | 48 +++---
>  tests/libqos/pci.h | 15 +
>  5 files changed, 110 insertions(+), 13 deletions(-)
> 
> diff --git a/tests/Makefile.include b/tests/Makefile.include
> index eabf9ed8b4..f04f9fbc3a 100644
> --- a/tests/Makefile.include
> +++ b/tests/Makefile.include
> @@ -771,11 +771,13 @@ libqos-imx-obj-y = $(libqos-obj-y) 
> tests/libqos/i2c-imx.o
>  libqos-usb-obj-y = $(libqos-spapr-obj-y) $(libqos-pc-obj-y) 
> tests/libqos/usb.o
>  libqos-virtio-obj-y = $(libqos-spapr-obj-y) $(libqos-pc-obj-y) 
> tests/libqos/virtio.o tests/libqos/virtio-pci.o tests/libqos/virtio-mmio.o 
> tests/libqos/malloc-generic.o
>  
> +libqgraph-pci-obj-y = $(libqos-pc-obj-y)
> +
>  check-unit-y += tests/test-qgraph$(EXESUF)
>  tests/test-qgraph$(EXESUF): tests/test-qgraph.o $(libqgraph-obj-y)
>  
>  check-qtest-pci-y += tests/qos-test$(EXESUF)
> -tests/qos-test$(EXESUF): tests/qos-test.o $(libqgraph-obj-y)
> +tests/qos-test$(EXESUF): tests/qos-test.o $(libqgraph-pci-obj-y)
>  
>  tests/qmp-test$(EXESUF): tests/qmp-test.o
>  tests/device-introspect-test$(EXESUF): tests/device-introspect-test.o
> diff --git a/tests/libqos/pci-pc.c b/tests/libqos/pci-pc.c
> index 83a3a32129..f5fb94eabc 100644
> --- a/tests/libqos/pci-pc.c
> +++ b/tests/libqos/pci-pc.c
> @@ -18,15 +18,9 @@
>  
>  #include "qemu-common.h"
>  
> -
>  #define ACPI_PCIHP_ADDR 0xae00
>  #define PCI_EJ_BASE 0x0008
>  
> -typedef struct QPCIBusPC
> -{
> -QPCIBus bus;
> -} QPCIBusPC;
> -
>  static uint8_t qpci_pc_pio_readb(QPCIBus *bus, uint32_t addr)
>  {
>  return inb(addr);
> @@ -115,12 +109,23 @@ static void qpci_pc_config_writel(QPCIBus *bus, int 
> devfn, uint8_t offset, uint3
>  outl(0xcfc, value);
>  }
>  
> -QPCIBus *qpci_pc_new(QTestState *qts, QGuestAllocator *alloc)
> +static void *qpci_get_driver(void *obj, const char *interface)
>  {
> -QPCIBusPC *ret = g_new0(QPCIBusPC, 1);
> +QPCIBusPC *qpci = obj;
> +if (!g_strcmp0(interface, "pci-bus")) {
> +return &qpci->bus;
> +}
> +printf("%s not present in pci-bus-pc\n", interface);

fprintf(stderr, ...) ?

> +abort();

You should use g_assert_not_reached().

> +}
>  
> +void qpci_init_pc(QPCIBusPC *ret, QTestState *qts, QGuestAllocator *alloc)
> +{
>  assert(qts);
>  
> +/* tests can use pci-bus */
> +ret->bus.has_buggy_msi = FALSE;

I think you should introduce the field has_buggy_msi when it is
really needed: with patch 09/34 and pci-spapr.

> +
>  ret->bus.pio_readb = qpci_pc_pio_readb;
>  ret->bus.pio_readw = qpci_pc_pio_readw;
>  ret->bus.pio_readl = qpci_pc_pio_readl;
> @@ -147,11 +152,23 @@ QPCIBus *qpci_pc_new(QTestState *qts, QGuestAllocator 
> *alloc)
>  ret->bus.mmio_alloc_ptr = 0xE000;
>  ret->bus.mmio_limit = 0x1ULL;
>  
> +ret->obj.get_driver = qpci_get_driver;
> +}
> +
> +QPCIBus *qpci_pc_new(QTestState *qts, QGuestAllocator *alloc)

As I said for 02/34, I think qpci_new_pc() would be a better name, to
match the one we have for qpci_free_pc().

> +{
> +QPCIBusPC *ret = g_new0(QPCIBusPC, 1);

perhaps you can rename "ret" to "qpci"?

> +qpci_init_pc(ret, qts, alloc);
> +
>  return &ret->bus;
>  }
>  
>  void qpci_free_pc(QPCIBus *bus)
>  {
> +if (!bus) {
> +return;
> +}
> +
>  QPCIBusPC *s = container_of(bus, QPCIBusPC, bus);

Generally, gcc doesn't like to mix declarations and instructions. I
think something like this would be nicer:

QPCIBusPC *s;

if (!bus) {
return;
}
s = container_of(bus, QPCIBusPC, bus);

>  
>  g_free(s);
> @@ -176,3 +193,11 @@ void qpci_unplug_acpi_device_test(const char *id, 
> uint8_t slot)
>  
>  qmp_eventwait("DEVICE_DELETED");
>  }
> +
> +static void qpci_pc(void)

This name is not very clear. On the QEMU side, type_init() is generally
used with a function named XXX_register_types.

Perhaps you can use a name like "qpci_pc_register_nodes"?

> +{
> +qos_node_create_driver("pci-bus-pc", NULL);
> +qos_node_produces("pci-bus-pc", "pci-bus");
> +}
> +
> +libqos_init(qpci_pc);
> diff --git a/tests/libqos/pci-pc.h b/tests/libqos/pci-pc.h
> index 88be29eaf3..a3754c1c86 100644
> --- a/tests/libqos/pci-pc.h
> +++ b/tests/libqos/pci-pc.h
> @@ -15,9 +15,22 @@
>  
>  #include "libqos/pci.h"
>  #include "libqos/malloc.h"
> +#include "libqos/qgraph.h"
>  
> +typedef struct QPCIBusPC {
> +QOSGraphObject obj;
> +QPCIBus bus;
> +} QPCIBusPC;
> +
> +/* qpci_init_pc():
> 

Re: [Qemu-devel] [PATCH v3 2/5] qcow2: Make the default L2 cache sufficient to cover the entire image

2018-08-08 Thread Leonid Bloch

On 08/08/2018 06:16 PM, Alberto Garcia wrote:

On Wed 08 Aug 2018 04:35:19 PM CEST, Leonid Bloch wrote:

The way I see it: there are two simple changes from the user's point of
view (they can even be two separate patches).

1) The default l2-cache-size is now 32MB. DEFAULT_L2_CACHE_CLUSTERS is
  useless now and disappears.

I don't think that it can be a separate patch, because unless the other
logic is changed, the cache will occupy 32 MB *always*, regardless of
the image size, and that's quite a big and unneeded overhead.


Change the order of both patches then :-)


Do you really think it's necessary? The increase of the default max
size is directly tied to the functionality change: it will be harmful
to increase the maximum before the new functionality is implemented,
and there is no need to change the functionality if the default max is
not increased.


I think that we're looking at this from two different perspectives.

a) If I understand you correctly, you see this as a way to make the user
forget about the L2 cache: we guarantee that it's going to be big
enough for the entire image, so simply forget about it. Exception: if
you're using very large images you will have to set its size
manually, but for the vast majority of cases you'll be alright with
the default (32MB).


Yes, just with a small fix: my aim is not to make the user forget about 
the L2 cache, my aim is to make it as large as needed to cover the 
entire image in order to increase the performance. This implies 
increasing its size. Because for images smaller than 8 GB the 
performance will stay the same, and the memory usage will not be that 
different: less than 1 MB difference, while the overall QEMU memory 
overhead is about 600 MB. That's why I think that increasing the max 
size is an integral part of this patch, because just changing the 
behavior, without changing the max size, will not cause a noticeable 
improvement. But it will cause some complications, like changing the 
code for the current maximal value, and then changing to 32 MB in a 
separate patch. This doesn't look necessary to me.


Leonid.



b) The way I see it: setting the right L2 cache size is not trivial, it
depends on the image and cluster sizes, and it involves a trade-off
between how much memory you want to use and how much performance
you're willing to sacrifice. QEMU has many use cases and there's no
good default, you need to make the numbers yourself if you want to
fine-tune it. Don't blindly trust the new default size (32MB) because
it won't be enough for many cases. But we can promise you this: make
l2-cache-size the maximum amount of memory you're willing to spend on
this disk image's cache, and we guarantee that we'll only use the
amount that we need to give you the best performance.

I hope (a) was a fair description of what you're trying to achieve with
these patches. But I also hope that you can see why making l2_cache_size
= MIN(l2_cache_size, virtual_disk_size / (s->cluster_size / 8)) is a
worthwhile change on its own, even if we didn't increase the default
cache size to 32MB.
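
To put numbers on the MIN() expression above, here is a minimal sketch in
plain C. The helper name effective_l2_cache_size() and the sample sizes are
made up for illustration; this is not the code from the patch, and it
ignores any rounding or alignment a real implementation may need.

```c
#include <stdint.h>

#define MIN(a, b) ((a) < (b) ? (a) : (b))

/*
 * Hypothetical helper (not QEMU's API): clamp the configured L2 cache
 * size to what is needed to map the whole image.  Each L2 entry is
 * 8 bytes and maps one cluster, so one byte of cache covers
 * cluster_size / 8 bytes of virtual disk.
 */
static uint64_t effective_l2_cache_size(uint64_t l2_cache_size,
                                        uint64_t virtual_disk_size,
                                        uint64_t cluster_size)
{
    uint64_t needed = virtual_disk_size / (cluster_size / 8);

    return MIN(l2_cache_size, needed);
}
```

With 64 KiB clusters, an 8 GiB image needs only 1 MiB of cache, so a 32 MiB
setting acts as an upper bound rather than a fixed allocation; an image of
256 GiB or more would use the full 32 MiB.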

Berto





Re: [Qemu-devel] [PATCH v2 05/22] check: Only test usb-xhci-nec when it is compiled in

2018-08-08 Thread Juan Quintela
Thomas Huth  wrote:
> On 08/08/2018 01:48 PM, Juan Quintela wrote:
>> Signed-off-by: Juan Quintela 
>> ---
>>  tests/Makefile.include | 9 +
>>  1 file changed, 5 insertions(+), 4 deletions(-)
>> 
>> diff --git a/tests/Makefile.include b/tests/Makefile.include
>> index 4e5f47aac0..1105469daa 100644
>> --- a/tests/Makefile.include
>> +++ b/tests/Makefile.include
>> @@ -290,8 +290,9 @@ endif
>>  gcov-files-i386-$(CONFIG_USB_EHCI) += hw/usb/hcd-ehci.c
>>  gcov-files-i386-y += hw/usb/dev-hid.c
>>  gcov-files-i386-y += hw/usb/dev-storage.c
>> -check-qtest-i386-y += tests/usb-hcd-xhci-test$(EXESUF)
>> -gcov-files-i386-y += hw/usb/hcd-xhci.c
>> +check-qtest-i386-$(CONFIG_USB_XHCI_NEC) += tests/usb-hcd-xhci-test$(EXESUF)
>> +gcov-files-i386-$(CONFIG_USB_XHCI) += hw/usb/hcd-xhci.c
>> +gcov-files-i386-$(CONFIG_USB_XHCI) += hw/usb/hcd-xhci-nec.c
>
> Maybe use CONFIG_USB_XHCI_NEC for the hcd-xhci-nec.c entry instead?

Not sure, this split looks artificial. My read of this:


commit 0bbb2f3df1ffd9ccf7135a69a450c6929bc0b915
Author: Gerd Hoffmann 
Date:   Wed May 17 12:33:12 2017 +0200

xhci: split into multiple files


is that it got split so that we could add more controllers, which never
happened.  I am not sure what we need to do.

>>  check-qtest-i386-y += tests/cpu-plug-test$(EXESUF)
>>  check-qtest-i386-y += tests/q35-test$(EXESUF)
>>  check-qtest-i386-y += tests/vmgenid-test$(EXESUF)
>> @@ -349,8 +350,8 @@ check-qtest-ppc64-$(CONFIG_USB_OHCI) +=
>> tests/usb-hcd-ohci-test$(EXESUF)
>>  gcov-files-ppc64-$(CONFIG_USB_OHCI) += hw/usb/hcd-ohci.c
>>  check-qtest-ppc64-$(CONFIG_USB_UHCI) += tests/usb-hcd-uhci-test$(EXESUF)
>>  gcov-files-ppc64-$(CONFIG_USB_UHCI) += hw/usb/hcd-uhci.c
>> -check-qtest-ppc64-y += tests/usb-hcd-xhci-test$(EXESUF)
>> -gcov-files-ppc64-y += hw/usb/hcd-xhci.c
>> +check-qtest-ppc64-$(CONFIG_USB_XHCI_NEC) += tests/usb-hcd-xhci-test$(EXESUF)
>> +gcov-files-ppc64-$(CONFIG_USB_XHCI) += hw/usb/hcd-xhci.c
>
> Also add hcd-xhci-nec.c gcov entry here?

I didn't want to go "further", but I think that what we should have here is
something like:


check-qtest-$(CONFIG_USB_XHCI_NEC) += tests/usb-hcd-xhci-test$(EXESUF)
gcov-files-$(CONFIG_USB_XHCI) += hw/usb/hcd-xhci.c

and remove the arch specific bits.  If one arch doesn't support it, we
now have CONFIG_USB_XHCI bits to not _enable_ it there.

What do you think?

Thanks, Juan.



Re: [Qemu-devel] [PATCH v2 06/22] i386-softmmu: Configuration is identical to x86_64-softmmu

2018-08-08 Thread Juan Quintela
Thomas Huth  wrote:
> On 08/08/2018 01:48 PM, Juan Quintela wrote:
>> If we ever changed that, just make the things that are different
>> explicit.
>> 
>> Signed-off-by: Juan Quintela 
>> ---
>>  default-configs/i386-softmmu.mak | 65 +---
>>  1 file changed, 1 insertion(+), 64 deletions(-)
>> 
>> diff --git a/default-configs/i386-softmmu.mak 
>> b/default-configs/i386-softmmu.mak
>> index 8827166ba1..6ec7a3b0ae 100644
>> --- a/default-configs/i386-softmmu.mak
>> +++ b/default-configs/i386-softmmu.mak
>> @@ -1,66 +1,3 @@
>>  # Default configuration for i386-softmmu
>>  
>> -include pci.mak
>> -include sound.mak
>> -include usb.mak
>> -CONFIG_QXL=$(CONFIG_SPICE)
>> -CONFIG_VGA_ISA=y
>> -CONFIG_VGA_CIRRUS=y
>> -CONFIG_VMWARE_VGA=y
>> -CONFIG_VMXNET3_PCI=y
>> -CONFIG_VIRTIO_VGA=y
>> -CONFIG_VMMOUSE=y
>> -CONFIG_IPMI=y
>> -CONFIG_IPMI_LOCAL=y
>> -CONFIG_IPMI_EXTERN=y
>> -CONFIG_ISA_IPMI_KCS=y
>> -CONFIG_ISA_IPMI_BT=y
>> -CONFIG_PARALLEL=y
>> -CONFIG_I8254=y
>> -CONFIG_PCSPK=y
>> -CONFIG_PCKBD=y
>> -CONFIG_FDC=y
>> -CONFIG_ACPI=y
>> -CONFIG_ACPI_X86=y
>> -CONFIG_ACPI_X86_ICH=y
>> -CONFIG_ACPI_MEMORY_HOTPLUG=y
>> -CONFIG_ACPI_CPU_HOTPLUG=y
>> -CONFIG_APM=y
>> -CONFIG_I8257=y
>> -CONFIG_IDE_ISA=y
>> -CONFIG_IDE_PIIX=y
>> -CONFIG_NE2000_ISA=y
>> -CONFIG_HPET=y
>> -CONFIG_APPLESMC=y
>> -CONFIG_I8259=y
>> -CONFIG_PFLASH_CFI01=y
>> -CONFIG_TPM_TIS=$(CONFIG_TPM)
>> -CONFIG_TPM_CRB=$(CONFIG_TPM)
>> -CONFIG_MC146818RTC=y
>> -CONFIG_PCI_PIIX=y
>> -CONFIG_WDT_IB700=y
>> -CONFIG_ISA_DEBUG=y
>> -CONFIG_ISA_TESTDEV=y
>> -CONFIG_VMPORT=y
>> -CONFIG_SGA=y
>> -CONFIG_LPC_ICH9=y
>> -CONFIG_PCI_Q35=y
>> -CONFIG_APIC=y
>> -CONFIG_IOAPIC=y
>> -CONFIG_PVPANIC=y
>> -CONFIG_MEM_HOTPLUG=y
>> -CONFIG_NVDIMM=y
>> -CONFIG_ACPI_NVDIMM=y
>> -CONFIG_PCIE_PORT=y
>> -CONFIG_XIO3130=y
>> -CONFIG_IOH3420=y
>> -CONFIG_I82801B11=y
>> -CONFIG_SMBIOS=y
>> -CONFIG_HYPERV_TESTDEV=$(CONFIG_KVM)
>> -CONFIG_PXB=y
>> -CONFIG_ACPI_VMGENID=y
>> -CONFIG_FW_CFG_DMA=y
>> -CONFIG_I2C=y
>> -CONFIG_SEV=$(CONFIG_KVM)
>> -CONFIG_VTD=y
>> -CONFIG_AMD_IOMMU=y
>> +include x86_64-softmmu.mak
>
> That's theoretically a good idea, but I think I'd rather do it the other
> way round: include i386-softmmu.mak in the x86_64 config file.
> Rationale: x86_64 is supposed to be a superset of i386, not the other
> way round, so when we will ever get a CONFIG_SWITCH_FOR_X86_64_ONLY,
> it's easier to handle if the includes are done the other way round.

[1]

> And that's also how we do it in aarch64-softmmu.mak and ppc64-softmmu.mak.

This is a good point.

But for (1), I really think that we should make i386 only for old stuff,
and remove things like ISA devices from x86_64 (ok, I know that there is
some ISA on all chipsets, but ne2000-isa is not one of them).

Anyways, I don't really care enough, so I will change that.

Later, Juan.



[Qemu-devel] [PATCH] spapr_cpu_core: vmstate_[un]register per-CPU data from (un)realizefn

2018-08-08 Thread Bharata B Rao
VMStateDescription vmstate_spapr_cpu_state was added by commit
b94020268e0b6 (spapr_cpu_core: migrate per-CPU data) to migrate per-CPU
data with the required vmstate registration and unregistration calls.
However the unregistration is being done only from the vcpu creation error
path and not from the CPU delete path.

This causes migration to fail with the following error if migration is
attempted after a CPU unplug like this:
Unknown savevm section or instance 'spapr_cpu' 16
Additionally this leaves the source VM unresponsive after migration failure.

Fix this by ensuring the vmstate_unregister happens during CPU removal.
Fixing this becomes easier when vmstate (un)registration calls are moved to
vcpu (un)realize functions which is what this patch does.
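
The pairing argument can be illustrated with a toy registry in plain C.
Everything here (toy_register(), toy_unregister(), struct toy_cpu and so on)
is a hypothetical stand-in, not the vmstate API; it only shows that when
registration lives in realize and unregistration in unrealize, no stale
entry survives an unplug.

```c
#include <stddef.h>

#define TOY_MAX_SECTIONS 8

/* Toy stand-in for the list of registered migration sections. */
static const void *toy_sections[TOY_MAX_SECTIONS];

/* Record an opaque pointer in the first free slot; -1 if the table is full. */
static int toy_register(const void *opaque)
{
    for (size_t i = 0; i < TOY_MAX_SECTIONS; i++) {
        if (toy_sections[i] == NULL) {
            toy_sections[i] = opaque;
            return 0;
        }
    }
    return -1;
}

/* Drop every slot that refers to this opaque pointer. */
static void toy_unregister(const void *opaque)
{
    for (size_t i = 0; i < TOY_MAX_SECTIONS; i++) {
        if (toy_sections[i] == opaque) {
            toy_sections[i] = NULL;
        }
    }
}

/* Number of sections a migration would currently try to save. */
static size_t toy_section_count(void)
{
    size_t n = 0;

    for (size_t i = 0; i < TOY_MAX_SECTIONS; i++) {
        if (toy_sections[i] != NULL) {
            n++;
        }
    }
    return n;
}

struct toy_cpu {
    int machine_data;
};

/* Register in realize ... */
static int toy_realize_vcpu(struct toy_cpu *cpu)
{
    return toy_register(&cpu->machine_data);
}

/* ... and unregister in unrealize, so hot-unplug leaves nothing behind. */
static void toy_unrealize_vcpu(struct toy_cpu *cpu)
{
    toy_unregister(&cpu->machine_data);
}
```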

Fixes: https://bugs.launchpad.net/qemu/+bug/1785972
Reported-by: Satheesh Rajendran 
Signed-off-by: Bharata B Rao 
---
 hw/ppc/spapr_cpu_core.c | 62 +
 1 file changed, 32 insertions(+), 30 deletions(-)

diff --git a/hw/ppc/spapr_cpu_core.c b/hw/ppc/spapr_cpu_core.c
index 993759db47..bb88a3ce4e 100644
--- a/hw/ppc/spapr_cpu_core.c
+++ b/hw/ppc/spapr_cpu_core.c
@@ -113,26 +113,6 @@ const char *spapr_get_cpu_core_type(const char *cpu_type)
 return object_class_get_name(oc);
 }
 
-static void spapr_unrealize_vcpu(PowerPCCPU *cpu)
-{
-qemu_unregister_reset(spapr_cpu_reset, cpu);
-object_unparent(cpu->intc);
-cpu_remove_sync(CPU(cpu));
-object_unparent(OBJECT(cpu));
-}
-
-static void spapr_cpu_core_unrealize(DeviceState *dev, Error **errp)
-{
-sPAPRCPUCore *sc = SPAPR_CPU_CORE(OBJECT(dev));
-CPUCore *cc = CPU_CORE(dev);
-int i;
-
-for (i = 0; i < cc->nr_threads; i++) {
-spapr_unrealize_vcpu(sc->threads[i]);
-}
-g_free(sc->threads);
-}
-
 static bool slb_shadow_needed(void *opaque)
 {
 sPAPRCPUState *spapr_cpu = opaque;
@@ -207,10 +187,34 @@ static const VMStateDescription vmstate_spapr_cpu_state = 
{
 }
 };
 
+static void spapr_unrealize_vcpu(PowerPCCPU *cpu, sPAPRCPUCore *sc)
+{
+if (!sc->pre_3_0_migration) {
+vmstate_unregister(NULL, &vmstate_spapr_cpu_state, cpu->machine_data);
+}
+qemu_unregister_reset(spapr_cpu_reset, cpu);
+object_unparent(cpu->intc);
+cpu_remove_sync(CPU(cpu));
+object_unparent(OBJECT(cpu));
+}
+
+static void spapr_cpu_core_unrealize(DeviceState *dev, Error **errp)
+{
+sPAPRCPUCore *sc = SPAPR_CPU_CORE(OBJECT(dev));
+CPUCore *cc = CPU_CORE(dev);
+int i;
+
+for (i = 0; i < cc->nr_threads; i++) {
+spapr_unrealize_vcpu(sc->threads[i], sc);
+}
+g_free(sc->threads);
+}
+
 static void spapr_realize_vcpu(PowerPCCPU *cpu, sPAPRMachineState *spapr,
-   Error **errp)
+   sPAPRCPUCore *sc, Error **errp)
 {
 CPUPPCState *env = &cpu->env;
+CPUState *cs = CPU(cpu);
 Error *local_err = NULL;
 
 object_property_set_bool(OBJECT(cpu), true, "realized", &local_err);
@@ -233,6 +237,11 @@ static void spapr_realize_vcpu(PowerPCCPU *cpu, 
sPAPRMachineState *spapr,
 goto error_unregister;
 }
 
+if (!sc->pre_3_0_migration) {
+vmstate_register(NULL, cs->cpu_index, &vmstate_spapr_cpu_state,
+ cpu->machine_data);
+}
+
 return;
 
 error_unregister:
@@ -272,10 +281,6 @@ static PowerPCCPU *spapr_create_vcpu(sPAPRCPUCore *sc, int 
i, Error **errp)
 }
 
 cpu->machine_data = g_new0(sPAPRCPUState, 1);
-if (!sc->pre_3_0_migration) {
-vmstate_register(NULL, cs->cpu_index, &vmstate_spapr_cpu_state,
- cpu->machine_data);
-}
 
 object_unref(obj);
 return cpu;
@@ -290,9 +295,6 @@ static void spapr_delete_vcpu(PowerPCCPU *cpu, sPAPRCPUCore 
*sc)
 {
 sPAPRCPUState *spapr_cpu = spapr_cpu_state(cpu);
 
-if (!sc->pre_3_0_migration) {
-vmstate_unregister(NULL, &vmstate_spapr_cpu_state, cpu->machine_data);
-}
 cpu->machine_data = NULL;
 g_free(spapr_cpu);
 object_unparent(OBJECT(cpu));
@@ -325,7 +327,7 @@ static void spapr_cpu_core_realize(DeviceState *dev, Error 
**errp)
 }
 
 for (j = 0; j < cc->nr_threads; j++) {
-spapr_realize_vcpu(sc->threads[j], spapr, &local_err);
+spapr_realize_vcpu(sc->threads[j], spapr, sc, &local_err);
 if (local_err) {
 goto err_unrealize;
 }
@@ -334,7 +336,7 @@ static void spapr_cpu_core_realize(DeviceState *dev, Error 
**errp)
 
 err_unrealize:
 while (--j >= 0) {
-spapr_unrealize_vcpu(sc->threads[j]);
+spapr_unrealize_vcpu(sc->threads[j], sc);
 }
 err:
 while (--i >= 0) {
-- 
2.14.3




[Qemu-devel] [Bug 1785698] Re: Solaris build error: unknown type name ‘gcry_error_t’

2018-08-08 Thread Michele Denber
"echo  $solaris "

That gives:

# /usr/xpg4/bin/sh ../configure --extra-cflags="-m32" 
--target-list=x86_64-softmmu
 yes 
Install prefix/usr/local
BIOS directory/usr/local/share/qemu
firmware path /usr/local/share/qemu-firmware
binary directory  /usr/local/bin
library directory /usr/local/lib
module directory  /usr/local/lib/qemu
libexec directory /usr/local/libexec
include directory /usr/local/include
config directory  /usr/local/etc
local state directory   /usr/local/var
Manual directory  /usr/local/share/man
ELF interp prefix /usr/gnemul/qemu-%M
...

Then:

# libgcrypt-config --cflags
-I/opt/csw/include
# libgcrypt-config --libs
-L/opt/csw/lib -lgcrypt -lgpg-error
# echo $SHELL
/bin/bash
# bash --version
GNU bash, version 4.3.33(1)-release (sparc-sun-solaris2.10)
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later 
#

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1785698

Title:
  Solaris build error: unknown type name ‘gcry_error_t’

Status in QEMU:
  New

Bug description:
  Building qemu 2.12.0 on a Sun Oracle Enterprise M3000 SPARC64 VII,
  Solaris 10 Update 11, opencsw toolchain and gcc 7.3.0, gmake fails
  with a bunch of related errors all in cipher-gcrypt.c:

  /export/home/denber/qemu-2.12.0/crypto/cipher-gcrypt.c:262:32: error: 
‘gcry_cipher_hd_t’ undeclared (first use in this function); did you mean 
‘gcry_cipher_info’?
   err = gcry_cipher_encrypt((gcry_cipher_hd_t)ctx, dst, length, src, 
length);^~~~
  gcry_cipher_info
  /export/home/denber/qemu-2.12.0/crypto/cipher-gcrypt.c:262:49: error: 
expected ‘)’ before ‘ctx’
   err = gcry_cipher_encrypt((gcry_cipher_hd_t)ctx, dst, length, src, 
length); ^~~
  /export/home/denber/qemu-2.12.0/crypto/cipher-gcrypt.c:262:11: error: too few 
arguments to function ‘gcry_cipher_encrypt’
   err = gcry_cipher_encrypt((gcry_cipher_hd_t)ctx, dst, length, src, 
length);   ^~~
  In file included from 
/export/home/denber/qemu-2.12.0/crypto/cipher-gcrypt.c:25:0,
   from /export/home/denber/qemu-2.12.0/crypto/cipher.c:153:
  /usr/include/gcrypt.h:566:5: note: declared here
   int gcry_cipher_encrypt (GcryCipherHd h,
   ^~~
  In file included from /export/home/denber/qemu-2.12.0/crypto/cipher.c:153:0:
  /export/home/denber/qemu-2.12.0/crypto/cipher-gcrypt.c: In function 
‘qcrypto_gcrypt_xts_decrypt’:
  /export/home/denber/qemu-2.12.0/crypto/cipher-gcrypt.c:271:5: error: unknown 
type name ‘gcry_error_t’; did you mean ‘g_error’?
   gcry_error_t err;
   ^~~~
   g_error
  /export/home/denber/qemu-2.12.0/crypto/cipher-gcrypt.c:272:32: error: 
‘gcry_cipher_hd_t’ undeclared (first use in this function); did you mean 
‘gcry_cipher_info’?
   err = gcry_cipher_decrypt((gcry_cipher_hd_t)ctx, dst, length, src, 
length);^~~~
  gcry_cipher_info
  /export/home/denber/qemu-2.12.0/crypto/cipher-gcrypt.c:272:49: error: 
expected ‘)’ before ‘ctx’
   err = gcry_cipher_decrypt((gcry_cipher_hd_t)ctx, dst, length, src, 
length); ^~~
  /export/home/denber/qemu-2.12.0/crypto/cipher-gcrypt.c:272:11: error: too few 
arguments to function ‘gcry_cipher_decrypt’
   err = gcry_cipher_decrypt((gcry_cipher_hd_t)ctx, dst, length, src, 
length);   ^~~
  In file included from 
/export/home/denber/qemu-2.12.0/crypto/cipher-gcrypt.c:25:0,
   from /export/home/denber/qemu-2.12.0/crypto/cipher.c:153:
  /usr/include/gcrypt.h:571:5: note: declared here
   int gcry_cipher_decrypt (GcryCipherHd h,
   ^~~
  In file included from /export/home/denber/qemu-2.12.0/crypto/cipher.c:153:0:
  /export/home/denber/qemu-2.12.0/crypto/cipher-gcrypt.c: In function 
‘qcrypto_gcrypt_cipher_encrypt’:
  /export/home/denber/qemu-2.12.0/crypto/cipher-gcrypt.c:284:5: error: unknown 
type name ‘gcry_error_t’; did you mean ‘g_error’?
   gcry_error_t err;
   ^~~~
   g_error
  /export/home/denber/qemu-2.12.0/crypto/cipher-gcrypt.c:293:21: warning: 
passing argument 1 of ‘xts_encrypt’ makes pointer from integer without a cast 
[-Wint-conversion]
   xts_encrypt(ctx->handle, ctx->tweakhandle,
   ^~~
  In file included from 
/export/home/denber/qemu-2.12.0/crypto/cipher-gcrypt.c:22:0,
   from /export/home/denber/qemu-2.12.0/crypto/cipher.c:153:
  /export/home/denber/qemu-2.12.0/include/crypto/xts.h:73:6: note: expected 
‘const void *’ but argument is of type ‘int’
   void xts_encrypt(const void *datactx,
    ^~~
  In file included from 

Re: [Qemu-devel] [PATCH 04/21] block/commit: utilize job_exit shim

2018-08-08 Thread Kevin Wolf
Am 07.08.2018 um 06:33 hat John Snow geschrieben:
> Change the manual deferment to commit_complete into the implicit
> callback to job_exit.
> 
> Signed-off-by: John Snow 

There is one tricky thing in this patch that the commit message could be
a bit more explicit about, which is moving job_completed() to a later
point.

This is the code that happens between the old call of job_completed()
and the new one:

/* If bdrv_drop_intermediate() didn't already do that, remove the commit
 * filter driver from the backing chain. Do this as the final step so that
 * the 'consistent read' permission can be granted.  */
if (remove_commit_top_bs) {
    bdrv_child_try_set_perm(commit_top_bs->backing, 0, BLK_PERM_ALL,
                            &error_abort);
    bdrv_replace_node(commit_top_bs, backing_bs(commit_top_bs),
                      &error_abort);
}

bdrv_unref(commit_top_bs);
bdrv_unref(top);

As the comment states, bdrv_replace_node() requires that the permission
restrictions that the commit job made are already lifted. The most
important part is done by the explicit block_job_remove_all_bdrv() call
right before this hunk. It still leaves bjob->blk around, which could
have implications, but luckily we didn't take any permissions for that
one:

s = block_job_create(job_id, &commit_job_driver, NULL, bs, 0, BLK_PERM_ALL,
                     speed, JOB_DEFAULT, NULL, NULL, errp);

So I think we got everything out of the way and bdrv_replace_node() can
do what it wants to do.

Kevin



Re: [Qemu-devel] [RFC PATCH 4/4] disas: allow capstone to defer to a fallback function on failure

2018-08-08 Thread Alex Bennée


Alex Bennée  writes:

> We can abuse the CS_OPT_SKIPDATA by providing a call back when
> capstone can't disassemble something. The passing of the string to the
> dump function is a little clunky but works.
>
> Signed-off-by: Alex Bennée 
> ---
>  disas.c | 30 +-
>  include/disas/bfd.h | 11 ++-
>  target/arm/cpu.c|  4 
>  3 files changed, 43 insertions(+), 2 deletions(-)
>
> diff --git a/disas.c b/disas.c
> index 5325b7e6be..dfd2c251c5 100644
> --- a/disas.c
> +++ b/disas.c
> @@ -178,6 +178,20 @@ static int print_insn_od_target(bfd_vma pc, 
> disassemble_info *info)
> to share this across calls and across host vs target disassembly.  */
>  static __thread cs_insn *cap_insn;
>
> +
> +/* Handle fall-back dissasembly. We don't print here but we do set
> + * cap_fallback_str for cap_dump_insn to used*/
> +static size_t cap_disas_fallback(const uint8_t *code, size_t code_size,
> + size_t offset, void *user_data)
> +{
> +disassemble_info *info = (disassemble_info *) user_data;
> +info->cap_fallback_str = g_malloc0(256);
> +size_t skip = info->capstone_fallback_func(code + offset,
> +   info->cap_fallback_str, 256);
> +return skip;
> +}
> +
> +
>  /* Initialize the Capstone library.  */
>  /* ??? It would be nice to cache this.  We would need one handle for the
> host and one for the target.  For most targets we can reset specific
> @@ -206,6 +220,14 @@ static cs_err cap_disas_start(disassemble_info *info, 
> csh *handle)
>  cs_option(*handle, CS_OPT_SYNTAX, CS_OPT_SYNTAX_ATT);
>  }
>
> +if (info->capstone_fallback_func) {
> +cs_opt_skipdata skipdata = {
> +.callback = cap_disas_fallback,
> +.user_data = info,

This also needs:

.mnemonic = "deadbeef",

just to stop the capstone skip handling crashing. For some reason this
only showed up when I started doing Aarch64-to-Aarch64 testing.

> +};
> +cs_option(*handle, CS_OPT_SKIPDATA_SETUP, (size_t) &skipdata);
> +}
> +
>  /* "Disassemble" unknown insns as ".byte W,X,Y,Z".  */
>  cs_option(*handle, CS_OPT_SKIPDATA, CS_OPT_ON);
>
> @@ -281,7 +303,13 @@ static void cap_dump_insn(disassemble_info *info, 
> cs_insn *insn)
>  }
>
>  /* Print the actual instruction.  */
> -print(info->stream, "  %-8s %s\n", insn->mnemonic, insn->op_str);
> +if (info->cap_fallback_str) {
> +print(info->stream, "  %s\n", info->cap_fallback_str);
> +g_free(info->cap_fallback_str);
> +info->cap_fallback_str = NULL;
> +} else {
> +print(info->stream, "  %-8s %s\n", insn->mnemonic, insn->op_str);
> +}
>
>  /* Dump any remaining part of the insn on subsequent lines.  */
>  for (i = split; i < n; i += split) {
> diff --git a/include/disas/bfd.h b/include/disas/bfd.h
> index 1f69a6e9d3..9d99bfef48 100644
> --- a/include/disas/bfd.h
> +++ b/include/disas/bfd.h
> @@ -377,6 +377,12 @@ typedef struct disassemble_info {
>int cap_insn_unit;
>int cap_insn_split;
>
> +  /* Fallback function to disassemble things capstone can't. */
> +  size_t (*capstone_fallback_func)
> +(const uint8_t *insn, char *ptr, size_t n);
> +
> +  char *cap_fallback_str;
> +
>  } disassemble_info;
>
>
> @@ -491,7 +497,10 @@ int generic_symbol_at_address(bfd_vma, struct 
> disassemble_info *);
>(INFO).bytes_per_chunk = 0, \
>(INFO).display_endian = BFD_ENDIAN_UNKNOWN, \
>(INFO).disassembler_options = NULL, \
> -  (INFO).insn_info_valid = 0
> +  (INFO).insn_info_valid = 0, \
> +  (INFO).capstone_fallback_func = NULL, \
> +  (INFO).cap_fallback_str = NULL
> +
>
>  #ifndef ATTRIBUTE_UNUSED
>  #define ATTRIBUTE_UNUSED __attribute__((unused))
> diff --git a/target/arm/cpu.c b/target/arm/cpu.c
> index 64a8005a4b..cfefbfb0b9 100644
> --- a/target/arm/cpu.c
> +++ b/target/arm/cpu.c
> @@ -519,6 +519,10 @@ static void arm_disas_set_info(CPUState *cpu, 
> disassemble_info *info)
>  info->cap_arch = CS_ARCH_ARM64;
>  info->cap_insn_unit = 4;
>  info->cap_insn_split = 4;
> +
> +#if defined(TARGET_AARCH64)
> +info->capstone_fallback_func = do_aarch64_fallback_disassembly;
> +#endif
>  } else {
>  int cap_mode;
>  if (env->thumb) {


--
Alex Bennée



Re: [Qemu-devel] [PATCH 01/21] jobs: canonize Error object

2018-08-08 Thread Kevin Wolf
Am 08.08.2018 um 17:50 hat John Snow geschrieben:
> 
> 
> On 08/08/2018 10:57 AM, Kevin Wolf wrote:
> > Am 07.08.2018 um 06:33 hat John Snow geschrieben:
> >> Jobs presently use both an Error object in the case of the create job,
> >> and char strings in the case of generic errors elsewhere.
> >>
> >> Unify the two paths as just j->err, and remove the extra argument from
> >> job_completed.
> >>
> >> Signed-off-by: John Snow 
> > 
> > Hm, not sure. Overall, this feels like a step backwards.
> > 
> >> diff --git a/include/qemu/job.h b/include/qemu/job.h
> >> index 18c9223e31..845ad00c03 100644
> >> --- a/include/qemu/job.h
> >> +++ b/include/qemu/job.h
> >> @@ -124,12 +124,12 @@ typedef struct Job {
> >>  /** Estimated progress_current value at the completion of the job */
> >>  int64_t progress_total;
> >>  
> >> -/** Error string for a failed job (NULL if, and only if, job->ret == 
> >> 0) */
> >> -char *error;
> >> -
> >>  /** ret code passed to job_completed. */
> >>  int ret;
> >>  
> >> +/** Error object for a failed job **/
> >> +Error *err;
> >> +
> >>  /** The completion function that will be called when the job 
> >> completes.  */
> >>  BlockCompletionFunc *cb;
> > 
> > This is the change that I could agree with, though I don't think it
> > makes a big difference: Whether you store the string immediately or an
> > Error object from which you get the string later, doesn't really make a
> > big difference.
> > 
> > Maybe we find more uses and having an Error object is common practice in
> > QEMU, so no objections to this change.
> > 
> >> @@ -484,15 +484,13 @@ void job_transition_to_ready(Job *job);
> >>  /**
> >>   * @job: The job being completed.
> >>   * @ret: The status code.
> >> - * @error: The error message for a failing job (only with @ret < 0). If 
> >> @ret is
> >> - * negative, but NULL is given for @error, strerror() is used.
> >>   *
> >>   * Marks @job as completed. If @ret is non-zero, the job transaction it 
> >> is part
> >>   * of is aborted. If @ret is zero, the job moves into the WAITING state. 
> >> If it
> >>   * is the last job to complete in its transaction, all jobs in the 
> >> transaction
> >>   * move from WAITING to PENDING.
> >>   */
> >> -void job_completed(Job *job, int ret, Error *error);
> >> +void job_completed(Job *job, int ret);
> > 
> > I don't like this one, though.
> > 
> > Before this change, job_completed(..., NULL) was a clear sign that the
> > error message probably needed an improvement, because an errno string
> > doesn't usually describe error situations very well. We may not have a
> > much better message in some cases, but in most cases we just don't pass
> > one because an error message after job creation is still a quite new
> > thing in the QAPI schema.
> > 
> > What we should get rid of in the long term is the int ret, not the Error
> > *error. I suspect callers really just distinguish success/error without
> > actually looking at the error code.
> > 
> > With this change applied, what's your new conversion plan for making
> > sure that every failing caller of job_completed() has set job->error
> > first?
> > 
> 
> Getting rid of job_completed and moving to our fairly ubiquitous "ret &
> Error *" combo.

Yup, with the context of the discussion for patch 2, if you make .start
(or .run) take an Error **errp, that sounds like a good plan to me.
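
Roughly what I have in mind, as a self-contained sketch. ToyError,
toy_error_setg(), ToyJob and toy_job_entry() are made-up stand-ins for
illustration, not the real Error or Job interfaces; the only point is that
the generic entry code collects both the return value and the error object
from the driver callback, so the driver never has to call a completion
function itself.

```c
#include <stdio.h>

/* Toy stand-in for an error object; hypothetical, not QEMU's Error. */
typedef struct ToyError {
    char msg[128];
} ToyError;

/* Set *errp once; a single static slot keeps the sketch malloc-free. */
static void toy_error_setg(ToyError **errp, const char *msg)
{
    static ToyError storage;

    if (errp && !*errp) {
        snprintf(storage.msg, sizeof(storage.msg), "%s", msg);
        *errp = &storage;
    }
}

typedef struct ToyJob ToyJob;

/* Driver callback in the proposed shape: int return plus an errp out-parameter. */
typedef int ToyJobRun(ToyJob *job, ToyError **errp);

struct ToyJob {
    ToyJobRun *run;
    int ret;
    ToyError *err;
};

/* Generic entry point: stores both results on the job in one place. */
static void toy_job_entry(ToyJob *job)
{
    job->ret = job->run(job, &job->err);
}

/* Example driver that fails with a descriptive message. */
static int toy_failing_run(ToyJob *job, ToyError **errp)
{
    (void)job;
    toy_error_setg(errp, "backing file is read-only");
    return -1;
}

/* Example driver that succeeds and leaves errp untouched. */
static int toy_ok_run(ToyJob *job, ToyError **errp)
{
    (void)job;
    (void)errp;
    return 0;
}
```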

> >> @@ -666,8 +665,8 @@ static void job_update_rc(Job *job)
> >>  job->ret = -ECANCELED;
> >>  }
> >>  if (job->ret) {
> >> -if (!job->error) {
> >> -job->error = g_strdup(strerror(-job->ret));
> >> +if (!job->err) {
> >> +error_setg_errno(&job->err, -job->ret, "job failed");
> >>  }
> >>  job_state_transition(job, JOB_STATUS_ABORTING);
> >>  }
> > 
> > This hunk just makes the error message more verbose with a "job failed"
> > prefix that doesn't add information. If it's the error string for a job,
> > of course the job failed.
> > 
> > Kevin
> > 
> 
> Yeah, it's not a good prefix, but if I wanted to use the error object in
> a general way across all jobs, I needed _some_ kind of prefix there...

Shouldn't this one work?

error_setg(&job->err, strerror(-job->ret));

Kevin



Re: [Qemu-devel] [PATCH 01/21] jobs: canonize Error object

2018-08-08 Thread John Snow



On 08/08/2018 10:57 AM, Kevin Wolf wrote:
> Am 07.08.2018 um 06:33 hat John Snow geschrieben:
>> Jobs presently use both an Error object in the case of the create job,
>> and char strings in the case of generic errors elsewhere.
>>
>> Unify the two paths as just j->err, and remove the extra argument from
>> job_completed.
>>
>> Signed-off-by: John Snow 
> 
> Hm, not sure. Overall, this feels like a step backwards.
> 
>> diff --git a/include/qemu/job.h b/include/qemu/job.h
>> index 18c9223e31..845ad00c03 100644
>> --- a/include/qemu/job.h
>> +++ b/include/qemu/job.h
>> @@ -124,12 +124,12 @@ typedef struct Job {
>>  /** Estimated progress_current value at the completion of the job */
>>  int64_t progress_total;
>>  
>> -/** Error string for a failed job (NULL if, and only if, job->ret == 0) 
>> */
>> -char *error;
>> -
>>  /** ret code passed to job_completed. */
>>  int ret;
>>  
>> +/** Error object for a failed job **/
>> +Error *err;
>> +
>>  /** The completion function that will be called when the job completes. 
>>  */
>>  BlockCompletionFunc *cb;
> 
> This is the change that I could agree with, though I don't think it
> makes a big difference: Whether you store the string immediately or an
> Error object from which you get the string later, doesn't really make a
> big difference.
> 
> Maybe we find more uses and having an Error object is common practice in
> QEMU, so no objections to this change.
> 
>> @@ -484,15 +484,13 @@ void job_transition_to_ready(Job *job);
>>  /**
>>   * @job: The job being completed.
>>   * @ret: The status code.
>> - * @error: The error message for a failing job (only with @ret < 0). If 
>> @ret is
>> - * negative, but NULL is given for @error, strerror() is used.
>>   *
>>   * Marks @job as completed. If @ret is non-zero, the job transaction it is 
>> part
>>   * of is aborted. If @ret is zero, the job moves into the WAITING state. If 
>> it
>>   * is the last job to complete in its transaction, all jobs in the 
>> transaction
>>   * move from WAITING to PENDING.
>>   */
>> -void job_completed(Job *job, int ret, Error *error);
>> +void job_completed(Job *job, int ret);
> 
> I don't like this one, though.
> 
> Before this change, job_completed(..., NULL) was a clear sign that the
> error message probably needed an improvement, because an errno string
> doesn't usually describe error situations very well. We may not have a
> much better message in some cases, but in most cases we just don't pass
> one because an error message after job creation is still a quite new
> thing in the QAPI schema.
> 
> What we should get rid of in the long term is the int ret, not the Error
> *error. I suspect callers really just distinguish success/error without
> actually looking at the error code.
> 
> With this change applied, what's your new conversion plan for making
> sure that every failing caller of job_completed() has set job->error
> first?
> 

Getting rid of job_completed and moving to our fairly ubiquitous "ret &
Error *" combo.

>> @@ -666,8 +665,8 @@ static void job_update_rc(Job *job)
>>  job->ret = -ECANCELED;
>>  }
>>  if (job->ret) {
>> -if (!job->error) {
>> -job->error = g_strdup(strerror(-job->ret));
>> +if (!job->err) {
>> +error_setg_errno(&job->err, -job->ret, "job failed");
>>  }
>>  job_state_transition(job, JOB_STATUS_ABORTING);
>>  }
> 
> This hunk just makes the error message more verbose with a "job failed"
> prefix that doesn't add information. If it's the error string for a job,
> of course the job failed.
> 
> Kevin
> 

Yeah, it's not a good prefix, but if I wanted to use the error object in
a general way across all jobs, I needed _some_ kind of prefix there...



Re: [Qemu-devel] [PATCH 02/21] jobs: add exit shim

2018-08-08 Thread Kevin Wolf
Am 08.08.2018 um 17:38 hat John Snow geschrieben:
> On 08/08/2018 11:23 AM, Kevin Wolf wrote:
> > Am 08.08.2018 um 06:02 hat Jeff Cody geschrieben:
> >> On Tue, Aug 07, 2018 at 12:33:30AM -0400, John Snow wrote:
> >>> Most jobs do the same thing when they leave their running loop:
> >>> - Store the return code in a structure
> >>> - wait to receive this structure in the main thread
> >>> - signal job completion via job_completed
> >>>
> >>> More seriously, when we utilize job_defer_to_main_loop_bh to call
> >>> a function that calls job_completed, job_finalize_single will run
> >>> in a context where it has recursively taken the aio_context lock,
> >>> which can cause hangs if it puts down a reference that causes a flush.
> >>>
> >>> The job infrastructure is perfectly capable of registering job
> >>> completion itself when we leave the job's entry point. In this
> >>> context, we can signal job completion from outside of the aio_context,
> >>> which should allow for job cleanup code to run with only one lock.
> >>>
> >>> Signed-off-by: John Snow 
> >>
> >> I like the simplification, both in SLOC and in exit logic (as seen in
> >> patches 3-7).
> > 
> > I agree, unifying this seems like a good idea.
> > 
> > Like in the first patch, I'm not convinced of the details, though.
> > Essentially, this is my objection regarding job->err extended to
> > job->ret: You rely on jobs setting job->ret and job->err, but the
> > interfaces don't really show this.
> > 
> >>> @@ -546,6 +559,12 @@ static void coroutine_fn job_co_entry(void *opaque)
> >>>  assert(job && job->driver && job->driver->start);
> >>>  job_pause_point(job);
> >>>  job->driver->start(job);
> >>
> >> One nit-picky observation here, that is unrelated to this patch: reading
> >> through, it may not be so obvious that 'start' is really a 'run' or
> >> 'execute', (linguistically, to me 'start' implies a kick-off rather than
> >> ongoing execution).
> > 
> > I had exactly the same thought. My proposal is to change the existing...
> > 
> > CoroutineEntry *start;
> > 
> > ...which is just short for...
> > 
> > void coroutine_fn start(void *opaque);
> > 
> > ...into this one:
> > 
> > int coroutine_fn run(void *opaque, Error **errp);
> > 
> > I see that at the end of the series, you actually introduced an int
> > > return value already. I would have done that from the start, but as long as
> > > the final state makes sense, I won't insist.
> > 
> > But can we have the Error **errp addition, too? Pretty please?
> > 
> > Kevin
> > 
> 
> I'm actually glad you want that addition, I was considering very
> strongly adding it but I felt like I had made the series long enough
> already and didn't want to change too much all at once.
> 
> The basic thought was just:
> 
> "It'd sure be nice to have a generic function entry point that looks
> like it returns the same error information as our non-coroutine functions."
> 
> I can absolutely work that in, and break this series into two parts:
> 
> (1) Rework jobs infrastructure to use the new run signature, and
> (2) Rework jobs to use the finalization callbacks.
> 
> Sound good?

I haven't looked at the rest of the series yet, but so far this sounds
good to me.

Kevin



Re: [Qemu-devel] [PATCH 02/21] jobs: add exit shim

2018-08-08 Thread John Snow



On 08/08/2018 11:23 AM, Kevin Wolf wrote:
> Am 08.08.2018 um 06:02 hat Jeff Cody geschrieben:
>> On Tue, Aug 07, 2018 at 12:33:30AM -0400, John Snow wrote:
>>> Most jobs do the same thing when they leave their running loop:
>>> - Store the return code in a structure
>>> - wait to receive this structure in the main thread
>>> - signal job completion via job_completed
>>>
>>> More seriously, when we utilize job_defer_to_main_loop_bh to call
>>> a function that calls job_completed, job_finalize_single will run
>>> in a context where it has recursively taken the aio_context lock,
>>> which can cause hangs if it puts down a reference that causes a flush.
>>>
>>> The job infrastructure is perfectly capable of registering job
>>> completion itself when we leave the job's entry point. In this
>>> context, we can signal job completion from outside of the aio_context,
>>> which should allow for job cleanup code to run with only one lock.
>>>
>>> Signed-off-by: John Snow 
>>
>> I like the simplification, both in SLOC and in exit logic (as seen in
>> patches 3-7).
> 
> I agree, unifying this seems like a good idea.
> 
> Like in the first patch, I'm not convinced of the details, though.
> Essentially, this is my objection regarding job->err extended to
> job->ret: You rely on jobs setting job->ret and job->err, but the
> interfaces don't really show this.
> 
>>> @@ -546,6 +559,12 @@ static void coroutine_fn job_co_entry(void *opaque)
>>>  assert(job && job->driver && job->driver->start);
>>>  job_pause_point(job);
>>>  job->driver->start(job);
>>
>> One nit-picky observation here, that is unrelated to this patch: reading
>> through, it may not be so obvious that 'start' is really a 'run' or
>> 'execute', (linguistically, to me 'start' implies a kick-off rather than
>> ongoing execution).
> 
> I had exactly the same thought. My proposal is to change the existing...
> 
> CoroutineEntry *start;
> 
> ...which is just short for...
> 
> void coroutine_fn start(void *opaque);
> 
> ...into this one:
> 
> int coroutine_fn run(void *opaque, Error **errp);
> 
> I see that at the end of the series, you actually introduced an int
> return value already. I would have done that from the start, but as long as
> the final state makes sense, I won't insist.
> 
> But can we have the Error **errp addition, too? Pretty please?
> 
> Kevin
> 

I'm actually glad you want that addition, I was considering very
strongly adding it but I felt like I had made the series long enough
already and didn't want to change too much all at once.

The basic thought was just:

"It'd sure be nice to have a generic function entry point that looks
like it returns the same error information as our non-coroutine functions."

I can absolutely work that in, and break this series into two parts:

(1) Rework jobs infrastructure to use the new run signature, and
(2) Rework jobs to use the finalization callbacks.

Sound good?

--js



Re: [Qemu-devel] [PATCH v3 2/5] qcow2: Make the default L2 cache sufficient to cover the entire image

2018-08-08 Thread Alberto Garcia
On Wed 08 Aug 2018 03:58:02 PM CEST, Alberto Garcia wrote:
>  1) If l2-cache-size > l2_metadata_size, then make l2-cache-size =
> l2_metadata_size. This is already useful on its own, even with the
> current default of 1MB.
>
>  2) Increase the default to 32MB. This won't waste additional memory for
> small images because of the previous patch, and will cover images up
> to 256GB. If you have larger images you would need to increase
> l2-cache-size manually if you want to cache all the L2 metadata.

Let's try this way: we had many users requesting us to add a new option
to set l2-cache-size=100%, but we never agreed on a good API. Patch (1)
would do precisely that (l2-cache-size=1T). Patch (2) changes the
default, which may be better and probably enough for many users, but
it's not what solves the problem.

Berto



Re: [Qemu-devel] [PATCH v2 08/34] tests/qgraph: rename qpci_init_spapr functions

2018-08-08 Thread Laurent Vivier
On 06/08/2018 16:33, Emanuele Giuseppe Esposito wrote:
> Rename qpci_init_spapr to qpci_new_spapr, since the function actually
> allocates a new QPCIBusSPAPR and initializes it.
> 

I think you should merge this one with 02/34.

Thanks,
Laurent




Re: [Qemu-devel] [PATCH hack dontapply v2 0/7] Dynamic _CST generation

2018-08-08 Thread Igor Mammedov
On Thu, 2 Aug 2018 11:18:08 +0200
Igor Mammedov  wrote:

> On Thu, 26 Jul 2018 19:09:22 +0300
> "Michael S. Tsirkin"  wrote:
> 
> > On Wed, Jul 25, 2018 at 05:53:35PM +0200, Igor Mammedov wrote:  
> > > On Wed, 25 Jul 2018 15:44:37 +0300
> > > "Michael S. Tsirkin"  wrote:
> > > 
> > > > On Wed, Jul 25, 2018 at 02:32:11PM +0200, Igor Mammedov wrote:
> > > > > On Tue, 10 Jul 2018 03:01:30 +0300
> > > > > "Michael S. Tsirkin"  wrote:
> > > > >   
> > > > > > > Now that basic support for guest CPU PM is upstream, I started
> > > > > > > looking into making it migratable.  Since a VM can be migrated
> > > > > > > between different hosts, PM info needs to change each time we
> > > > > > > move the VM to a different host.
> > > > > > Since notification is async, there will be a window when the
> > > > > > guest is still using old Cstates on the new host. What will happen
> > > > > > if the machine is migrated to a host that doesn't support a Cstate
> > > > > > that was used on the source when the guest was started?
> > > > 
> > > > My testing shows mwait with a wrong hint works, presumably it just uses
> > > > a lot of power.
> > > > 
> > > > > > This adds infrastructure - based on Load/Unload - to support exactly
> > > > > > that: QEMU generates AML (changing it on migration) and stores it in
> > > > > > reserved memory, OSPM loads _CST from there on demand.  
> > > > > Cool and very tempting idea but I have 2 worries about this approach:
> > > > > 1. How well does it work with Windows based guests?
> > > > > >    (I've tried something similar, generating new AML from AML
> > > > > >    itself to get rid of our long if/else chains there that make up
> > > > > >    the Notify target name. I ditched it; unfortunately I don't
> > > > > >    recall why.)
> > > > 
> > > > After trying it, I can tell you why - it's a horrid mess of
> > > > unreadable code, with arbitrary restraints on package
> > > > length etc.
> > > in my case it was only 4 character NameString, but Windows probably
> > > wasn't happy or just ignored it.
> > > Considering recent development (TPM series) it might have been issue
> > > with SYSTEM_MEMORY not working properly if used as byte buffer field.
> > > 
> > > > Even if it's an unreadable mess, it's still a stable mess and a very
> > > > constrained one, within a single firmware blob that came from the source.
> > 
> > That's exactly the argument we had for keeping ACPI
> > generation in bios. It's just not an interface that scales.
> >   
> > > Hence it's more preferable than split brain in this series.
> > > 
> > > But I don't think we even need dynamic AML for _CST usecase at all,
> > > existing cpuhp interface should work just fine for it and should be
> > > simpler as all infrastructure is already there.
> > 
> > Not sure I get what you mean. Could you post a patch?
> >   
> > > > > 2. (probably biggest issue) Loading dynamically generated AML
> > > > >    basically would make all AML code ABI, so that static AML
> > > > >    in RAM of old QEMU version would match dynamically generated
> > > > >    one on target side with new QEMU (/me generalizing approach
> > > > >    beyond _CST support).
> > > > >    I'd try to keep our AML version-less as much as possible
> > > > >    and go this route only if there is no other way to do it.
> > > > 
> > > > That's a good point, thanks for bringing this up!
> > > > 
> > > > So it seems that we will need to define the ABI, yes. All AML code is
> > > > an over-statement though, there are specific entry points
> > > > we must maintain, right?
> > > Well, I'm rather unsure what and where could break,
> > > hence I'm afraid of a new beast and it probably won't be easy
> > > to convince me that ABI would be able to keep things manageable
> > > and stable.
> > > Considering the large amount of AML code that we already have,
> > > I'm not confident that eyeballing during review and testing the latest
> > > firmware/qemu would be sufficient, as we would suddenly have an exploded
> > > test matrix where the firmware in use is a mix of old and new parts.
> > 
> > Well there's one exported method so far and it relies on one container
> > device in static table set which does not sound too hard to keep stable.
> > 
> > Given how simple the dynamic table is, how about just checking it
> > with a unit test? It is literally a single return statement.
> > If it returns a valid package, that is all that we
> > care about. I can write some firmware to test that constraint.
> > 
> >   
> > > > And that in the end isn't fundamentally different from the ABIs
> > > > mandated by the ACPI spec.
> > > > 
> > > > So I have these ideas to try to mitigate the issues:
> > > > - document the ABI (like we have in the ACPI spec)
> > > > - use a specific prefix for all external calls (like _ for ACPI spec
> > > >   ones)
> > > > - use a specific (different) prefix for all internal loaded calls (like
> > > >   [A-Z] for non-ACPI spec 

Re: [Qemu-devel] [PATCH 02/21] jobs: add exit shim

2018-08-08 Thread Kevin Wolf
Am 08.08.2018 um 06:02 hat Jeff Cody geschrieben:
> On Tue, Aug 07, 2018 at 12:33:30AM -0400, John Snow wrote:
> > Most jobs do the same thing when they leave their running loop:
> > - Store the return code in a structure
> > - wait to receive this structure in the main thread
> > - signal job completion via job_completed
> > 
> > More seriously, when we utilize job_defer_to_main_loop_bh to call
> > a function that calls job_completed, job_finalize_single will run
> > in a context where it has recursively taken the aio_context lock,
> > which can cause hangs if it puts down a reference that causes a flush.
> > 
> > The job infrastructure is perfectly capable of registering job
> > completion itself when we leave the job's entry point. In this
> > context, we can signal job completion from outside of the aio_context,
> > which should allow for job cleanup code to run with only one lock.
> > 
> > Signed-off-by: John Snow 
> 
> I like the simplification, both in SLOC and in exit logic (as seen in
> patches 3-7).

I agree, unifying this seems like a good idea.

Like in the first patch, I'm not convinced of the details, though.
Essentially, this is my objection regarding job->err extended to
job->ret: You rely on jobs setting job->ret and job->err, but the
interfaces don't really show this.

> > @@ -546,6 +559,12 @@ static void coroutine_fn job_co_entry(void *opaque)
> >  assert(job && job->driver && job->driver->start);
> >  job_pause_point(job);
> >  job->driver->start(job);
> 
> One nit-picky observation here, that is unrelated to this patch: reading
> through, it may not be so obvious that 'start' is really a 'run' or
> 'execute', (linguistically, to me 'start' implies a kick-off rather than
> ongoing execution).

I had exactly the same thought. My proposal is to change the existing...

CoroutineEntry *start;

...which is just short for...

void coroutine_fn start(void *opaque);

...into this one:

int coroutine_fn run(void *opaque, Error **errp);

I see that at the end of the series, you actually introduced an int
return value already. I would have done that from the start, but as long as
the final state makes sense, I won't insist.

But can we have the Error **errp addition, too? Pretty please?

Kevin
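
For illustration, a minimal sketch of how a driver callback with the proposed signature could feed job->ret and job->err from one central place. The Error struct, sketch_error_set(), failing_run() and job_entry() are hypothetical stand-ins rather than QEMU code, the Job and JobDriver structs are reduced to the fields needed here, and the coroutine_fn annotation is omitted so the example compiles on its own.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Minimal stand-in for QEMU's Error object; illustration only. */
typedef struct Error {
    char msg[128];
} Error;

/* Hypothetical helper, loosely modelled on error_setg(). */
void sketch_error_set(Error **errp, const char *msg)
{
    if (errp && !*errp) {
        *errp = calloc(1, sizeof(**errp));
        if (*errp) {
            snprintf((*errp)->msg, sizeof((*errp)->msg), "%s", msg);
        }
    }
}

typedef struct JobDriver {
    /* Proposed shape: return code plus Error, replacing void start(). */
    int (*run)(void *opaque, Error **errp);
} JobDriver;

typedef struct Job {
    const JobDriver *driver;
    int ret;
    Error *err;
} Job;

/* Example driver body that fails and reports why. */
int failing_run(void *opaque, Error **errp)
{
    (void)opaque;
    sketch_error_set(errp, "could not read source image");
    return -EIO;
}

/* Generic entry point: the only place that stores ret and err. */
void job_entry(Job *job)
{
    job->ret = job->driver->run(job, &job->err);
    /* Convention assumed here: a failing run() also sets an Error. */
    assert(job->ret == 0 || job->err != NULL);
}
```

The design point is that the interface itself now shows what a job must report: drivers return a code and fill errp, and never write to job->ret or job->err directly.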



Re: [Qemu-devel] [PATCH v3 2/5] qcow2: Make the default L2 cache sufficient to cover the entire image

2018-08-08 Thread Alberto Garcia
On Wed 08 Aug 2018 04:35:19 PM CEST, Leonid Bloch wrote:
>>>> The way I see it: there are two simple changes from the user's point of
>>>> view (they can even be two separate patches).
>>>>
>>>> 1) The default l2-cache-size is now 32MB. DEFAULT_L2_CACHE_CLUSTERS is
>>>>    useless now and disappears.
>>> I don't think that it can be a separate patch, because unless the other
>>> logic is changed, the cache will occupy 32 MB *always*, regardless of
>>> the image size, and that's quite a big and unneeded overhead.
>> 
>> Change the order of both patches then :-)
>
> Do you really think it's necessary? The increase of the default max
> size is directly tied to the functionality change: it will be harmful
> to increase the maximum before the new functionality is implemented,
> and there is no need to change the functionality if the default max is
> not increased.

I think that we're looking at this from two different perspectives.

a) If I understand you correctly, you see this as a way to make the user
   forget about the L2 cache: we guarantee that it's going to be big
   enough for the entire image, so simply forget about it. Exception: if
   you're using very large images you will have to set its size
   manually, but for the vast majority of cases you'll be alright with
   the default (32MB).

b) The way I see it: setting the right L2 cache size is not trivial, it
   depends on the image and cluster sizes, and it involves a trade-off
   between how much memory you want to use and how much performance
   you're willing to sacrifice. QEMU has many use cases and there's no
   good default, you need to make the numbers yourself if you want to
   fine-tune it. Don't blindly trust the new default size (32MB) because
   it won't be enough for many cases. But we can promise you this: make
   l2-cache-size the maximum amount of memory you're willing to spend on
   this disk image's cache, and we guarantee that we'll only use the
   amount that we need to give you the best performance.

I hope (a) was a fair description of what you're trying to achieve with
these patches. But I also hope that you can see why making l2_cache_size
= MIN(l2_cache_size, virtual_disk_size / (s->cluster_size / 8)) is a
worthwhile change on its own, even if we didn't increase the default
cache size to 32MB.

Berto
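
The arithmetic behind the MIN() expression above can be sketched as follows. The function names are hypothetical, the 8-byte L2 entry size comes from the "/ 8" in the formula, and the sketch ignores any rounding or alignment a real implementation would apply.

```c
#include <assert.h>
#include <stdint.h>

/* Each qcow2 L2 table entry is 8 bytes and maps one cluster. */
#define L2_ENTRY_SIZE 8u

/* Bytes of L2 metadata needed to map the whole virtual disk. */
uint64_t l2_metadata_size(uint64_t virtual_disk_size, uint64_t cluster_size)
{
    return virtual_disk_size / (cluster_size / L2_ENTRY_SIZE);
}

/* Treat the configured l2-cache-size as an upper bound, not a fixed size. */
uint64_t effective_l2_cache_size(uint64_t configured,
                                 uint64_t virtual_disk_size,
                                 uint64_t cluster_size)
{
    uint64_t needed = l2_metadata_size(virtual_disk_size, cluster_size);
    return configured < needed ? configured : needed;
}
```

Assuming 64 KiB clusters, a 256 GiB image needs 256 GiB / 8192 = 32 MiB of L2 metadata, which matches the coverage quoted for the proposed default, while a 1 GiB image would use only 128 KiB of a 32 MiB limit.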



[Qemu-devel] [RFC PATCH 4/4] acpi: add support for CST update notification

2018-08-08 Thread Igor Mammedov
Reuse the CPU hotplug notification interface to notify the guest about a CST change.

Signed-off-by: Igor Mammedov 
---
 include/hw/acpi/cpu.h   |  1 +
 docs/specs/acpi_cpu_hotplug.txt | 11 ---
 hw/acpi/cpu.c   | 27 ++-
 3 files changed, 35 insertions(+), 4 deletions(-)

diff --git a/include/hw/acpi/cpu.h b/include/hw/acpi/cpu.h
index eb79cbf..51d7fc6 100644
--- a/include/hw/acpi/cpu.h
+++ b/include/hw/acpi/cpu.h
@@ -28,6 +28,7 @@ typedef struct AcpiCpuStatus {
 uint64_t arch_id;
 bool is_inserting;
 bool is_removing;
+bool is_cst_update;
 uint32_t ost_event;
 uint32_t ost_status;
 AcpiCState cst;
diff --git a/docs/specs/acpi_cpu_hotplug.txt b/docs/specs/acpi_cpu_hotplug.txt
index adfb026..41ba236 100644
--- a/docs/specs/acpi_cpu_hotplug.txt
+++ b/docs/specs/acpi_cpu_hotplug.txt
@@ -41,7 +41,10 @@ read access:
   It's valid only when bit 0 is set.
2: Device remove event, used to distinguish device for which
   no device eject request to OSPM was issued.
-   3-7: reserved and should be ignored by OSPM
+   3: reserved and should be ignored by OSPM
+   4: Device CST event, used to distinguish device for which
+  device eject request to OSPM was issued
+   5-7: reserved and should be ignored by OSPM
 [0x5-0x7] reserved
 [0x8] Command data: (DWORD access)
   in case of error or unsupported command reads is 0x
@@ -70,14 +73,16 @@ write access:
selected CPU device
 3: if set to 1 initiates device eject, set by OSPM when it
triggers CPU device removal and calls _EJ0 method
+4: if set to 1 clears CST update event, set by OSPM
+   after it has emitted CST update notification
 4-7: reserved, OSPM must clear them before writing to register
 [0x5] Command field: (1 byte access)
   value:
-0: selects a CPU device with inserting/removing events and
+0: selects a CPU device with inserting/removing/cst events and
following reads from 'Command data' register return
selected CPU (CPU selector value). If no CPU with events
found, the current CPU selector doesn't change and
-   corresponding insert/remove event flags are not set.
+   corresponding insert/remove/cst event flags are not set.
 1: following writes to 'Command data' register set OST event
register in QEMU
 2: following writes to 'Command data' register set OST status
diff --git a/hw/acpi/cpu.c b/hw/acpi/cpu.c
index 7ef04f9..c4a9fac 100644
--- a/hw/acpi/cpu.c
+++ b/hw/acpi/cpu.c
@@ -67,6 +67,7 @@ static uint64_t cpu_hotplug_rd(void *opaque, hwaddr addr, unsigned size)
 val |= cdev->cpu ? 1 : 0;
 val |= cdev->is_inserting ? 2 : 0;
 val |= cdev->is_removing  ? 4 : 0;
+val |= cdev->is_cst_update ? 16 : 0;
 trace_cpuhp_acpi_read_flags(cpu_st->selector, val);
 break;
 case ACPI_CPU_CMD_DATA_OFFSET_RW:
@@ -162,6 +163,8 @@ static void cpu_hotplug_wr(void *opaque, hwaddr addr, uint64_t data,
 dev = DEVICE(cdev->cpu);
 hotplug_ctrl = qdev_get_hotplug_handler(dev);
 hotplug_handler_unplug(hotplug_ctrl, dev, NULL);
+} else if (data & 16) { /* clear CST update event */
+cdev->is_cst_update = false;
 }
 break;
 case ACPI_CPU_CMD_OFFSET_WR:
@@ -312,6 +315,7 @@ static const VMStateDescription vmstate_cstate_sts = {
 VMSTATE_UINT32(cst.current_cst_field, AcpiCpuStatus),
 VMSTATE_UINT32(cst.latency, AcpiCpuStatus),
 VMSTATE_UINT32(cst.power, AcpiCpuStatus),
+VMSTATE_BOOL(is_cst_update, AcpiCpuStatus),
 VMSTATE_END_OF_LIST()
 }
 };
@@ -383,6 +387,7 @@ const VMStateDescription vmstate_cpu_hotplug = {
 #define CPU_INSERT_EVENT  "CINS"
 #define CPU_REMOVE_EVENT  "CRMV"
 #define CPU_EJECT_EVENT   "CEJ0"
+#define CPU_CST_EVENT "CSTU"
 
 void build_cpus_aml(Aml *table, MachineState *machine, CPUHotplugFeatures opts,
 hwaddr io_base,
@@ -435,7 +440,13 @@ void build_cpus_aml(Aml *table, MachineState *machine, CPUHotplugFeatures opts,
 aml_append(field, aml_named_field(CPU_REMOVE_EVENT, 1));
 /* initiates device eject, write only */
 aml_append(field, aml_named_field(CPU_EJECT_EVENT, 1));
-aml_append(field, aml_reserved_field(4));
+if (opts.cstate_enabled) {
+/* (read) 1 if has a CST event. (write) 1 to clear event */
+aml_append(field, aml_named_field(CPU_CST_EVENT, 1));
+aml_append(field, aml_reserved_field(3));
+} else {
+aml_append(field, aml_reserved_field(4));
+}
 aml_append(field, aml_named_field(CPU_COMMAND, 8));
 aml_append(cpu_ctrl_dev, field);
 
@@ -470,6 +481,7 @@ 
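
As a reading aid for the flags register touched by this patch, here is a minimal model of the bit layout implied by the cpu_hotplug_rd() and cpu_hotplug_wr() hunks (1, 2, 4 and 16). The type and function names are hypothetical, and only the CST-clear branch of the write path is modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Flag bits as they appear in the read hunk; bit 3 is not reported there. */
enum {
    FLAG_ENABLED    = 1u << 0,
    FLAG_INSERTING  = 1u << 1,
    FLAG_REMOVING   = 1u << 2,
    FLAG_CST_UPDATE = 1u << 4,
};

/* Hypothetical, reduced version of the per-CPU status. */
typedef struct CpuStatus {
    bool present;
    bool is_inserting;
    bool is_removing;
    bool is_cst_update;
} CpuStatus;

/* Compose the flags byte the guest reads. */
uint8_t read_flags(const CpuStatus *c)
{
    uint8_t val = 0;

    val |= c->present       ? FLAG_ENABLED    : 0;
    val |= c->is_inserting  ? FLAG_INSERTING  : 0;
    val |= c->is_removing   ? FLAG_REMOVING   : 0;
    val |= c->is_cst_update ? FLAG_CST_UPDATE : 0;
    return val;
}

/* Guest write: writing 1 to bit 4 acknowledges the CST update event. */
void write_flags(CpuStatus *c, uint8_t data)
{
    if (data & FLAG_CST_UPDATE) {
        c->is_cst_update = false;
    }
}
```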

[Qemu-devel] [RFC PATCH 0/4] "pc: acpi: _CST support"

2018-08-08 Thread Igor Mammedov
This is an alternative approach to
 1) [PATCH hack dontapply v2 0/7] Dynamic _CST generation
which, instead of dynamic AML loading, uses static AML with
dynamic values.  It allows us to keep the firmware blob static and
to avoid the split firmware issue of (1) in case of cross-version migration.

ABI in this case is confined to cpu hotplug IO registers
(i.e. do it old school way, like we used to do so far).
This way we don't have to add yet another ABI to keep dynamic
AML code under control (1).

Tested with: XPsp3 - WS2016 guests.

CC: "Michael S. Tsirkin" 


Igor Mammedov (3):
  acpi: add aml_create_byte_field()
  pc: acpi: add _CST support
  acpi: add support for CST update notification

Michael S. Tsirkin (1):
  acpi: aml: add aml_register()

 include/hw/acpi/aml-build.h |   6 ++
 include/hw/acpi/cpu.h   |  10 +++
 docs/specs/acpi_cpu_hotplug.txt |  21 +-
 hw/acpi/aml-build.c |  28 +++
 hw/acpi/cpu.c   | 158 +++-
 hw/acpi/piix4.c |   2 +
 hw/i386/acpi-build.c|   5 +-
 tests/bios-tables-test.c|   1 +
 8 files changed, 225 insertions(+), 6 deletions(-)

-- 
2.7.4




[Qemu-devel] [RFC PATCH 1/4] acpi: aml: add aml_register()

2018-08-08 Thread Igor Mammedov
From: "Michael S. Tsirkin" 

Based on a patch by Igor Mammedov.

Signed-off-by: Igor Mammedov 
Signed-off-by: Michael S. Tsirkin 
---
 include/hw/acpi/aml-build.h |  5 +
 hw/acpi/aml-build.c | 21 +
 2 files changed, 26 insertions(+)

diff --git a/include/hw/acpi/aml-build.h b/include/hw/acpi/aml-build.h
index 6c36903..10c7946 100644
--- a/include/hw/acpi/aml-build.h
+++ b/include/hw/acpi/aml-build.h
@@ -346,6 +346,11 @@ Aml *aml_qword_memory(AmlDecode dec, AmlMinFixed min_fixed,
   uint64_t len);
 Aml *aml_dma(AmlDmaType typ, AmlDmaBusMaster bm, AmlTransferSize sz,
  uint8_t channel);
+Aml *aml_register(AmlAddressSpace as,
+  uint8_t bit_width,
+  uint8_t bit_offset,
+  uint64_t address,
+  uint8_t access_size);
 Aml *aml_sleep(uint64_t msec);
 
 /* Block AML object primitives */
diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
index 1e43cd7..def62b3 100644
--- a/hw/acpi/aml-build.c
+++ b/hw/acpi/aml-build.c
@@ -874,6 +874,27 @@ Aml *aml_irq_no_flags(uint8_t irq)
 return var;
 }
 
+/*
+ * ACPI: 2.0: 16.2.4.16 ASL Macro for Generic Register Descriptor
+ *
+ * access_size comes from:
+ * ACPI 3.0: 17.5.98 Register (Generic Register Resource Descriptor Macro)
+ */
+Aml *aml_register(AmlAddressSpace as,
+  uint8_t bit_width,
+  uint8_t bit_offset,
+  uint64_t address,
+  uint8_t access_size)
+{
+Aml *var = aml_alloc();
+
+build_append_byte(var->buf, 0x82);/* Generic Register Descriptor */
+build_append_byte(var->buf, 0x0C);/* Length, bits[7:0] */
+build_append_byte(var->buf, 0x0); /* Length, bits[15:8] */
+build_append_gas(var->buf, as, bit_width, bit_offset, access_size, address);
+return var;
+}
+
 /* ACPI 1.0b: 16.2.5.4 Type 2 Opcodes Encoding: DefLNot */
 Aml *aml_lnot(Aml *arg)
 {
-- 
2.7.4
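
For readers without the ACPI spec at hand, the byte stream that aml_register() emits can be sketched as a standalone encoder. encode_generic_register() is a hypothetical helper, not part of QEMU. It assumes that build_append_gas() writes the address space ID, bit width, bit offset and access size as single bytes followed by a 64-bit little-endian address, which is what the length of 0x0C (4 + 8 bytes) suggests.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 1 tag byte + 2 length bytes + 12 bytes of Generic Address Structure. */
#define GENERIC_REGISTER_DESC_LEN 15

/* Hypothetical encoder mirroring the byte order used in the patch. */
size_t encode_generic_register(uint8_t out[GENERIC_REGISTER_DESC_LEN],
                               uint8_t space_id, uint8_t bit_width,
                               uint8_t bit_offset, uint64_t address,
                               uint8_t access_size)
{
    size_t n = 0;

    out[n++] = 0x82;        /* Generic Register Descriptor */
    out[n++] = 0x0C;        /* Length, bits[7:0] */
    out[n++] = 0x00;        /* Length, bits[15:8] */
    out[n++] = space_id;
    out[n++] = bit_width;
    out[n++] = bit_offset;
    out[n++] = access_size;
    for (int i = 0; i < 8; i++) {
        out[n++] = (uint8_t)(address >> (8 * i));   /* little-endian */
    }
    return n;
}
```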




[Qemu-devel] [RFC PATCH 2/4] acpi: add aml_create_byte_field()

2018-08-08 Thread Igor Mammedov
Will be used for packing the _CST package in a follow-up patch.

Signed-off-by: Igor Mammedov 
---
 include/hw/acpi/aml-build.h | 1 +
 hw/acpi/aml-build.c | 7 +++
 2 files changed, 8 insertions(+)

diff --git a/include/hw/acpi/aml-build.h b/include/hw/acpi/aml-build.h
index 10c7946..8c8ca0b 100644
--- a/include/hw/acpi/aml-build.h
+++ b/include/hw/acpi/aml-build.h
@@ -371,6 +371,7 @@ Aml *aml_release(Aml *mutex);
 Aml *aml_alias(const char *source_object, const char *alias_object);
 Aml *aml_create_field(Aml *srcbuf, Aml *bit_index, Aml *num_bits,
   const char *name);
+Aml *aml_create_byte_field(Aml *srcbuf, Aml *index, const char *name);
 Aml *aml_create_dword_field(Aml *srcbuf, Aml *index, const char *name);
 Aml *aml_create_qword_field(Aml *srcbuf, Aml *index, const char *name);
 Aml *aml_varpackage(uint32_t num_elements);
diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
index def62b3..977d929 100644
--- a/hw/acpi/aml-build.c
+++ b/hw/acpi/aml-build.c
@@ -1094,6 +1094,13 @@ Aml *aml_create_field(Aml *srcbuf, Aml *bit_index, Aml *num_bits,
 return var;
 }
 
+/* ACPI 1.0b: 16.2.5.2 Named Objects Encoding: DefCreateByteField */
+Aml *aml_create_byte_field(Aml *srcbuf, Aml *index, const char *name)
+{
+return create_field_common(0x8C /* CreateByteFieldOp */,
+   srcbuf, index, name);
+}
+
 /* ACPI 1.0b: 16.2.5.2 Named Objects Encoding: DefCreateDWordField */
 Aml *aml_create_dword_field(Aml *srcbuf, Aml *index, const char *name)
 {
-- 
2.7.4




[Qemu-devel] [RFC PATCH 3/4] pc: acpi: add _CST support

2018-08-08 Thread Igor Mammedov
Reuse the CPU hotplug IO registers to pass a CST entry containing
a package for the shallowest C-state (C1, using mwait), and
read it out in the guest with the new CCST AML method.

The CState support is optional and could be turned on
with '-global PIIX4_PM.cstate=on' CLI option.

Signed-off-by: Igor Mammedov 
---
for demo purposes it's wired only to piix4
TODO: q35 wiring

'tested' with rhel7 and XPsp3 - WS2016
 (i.e. it boots and all Windows versions are happy with the AML QEMU produces)
---
 include/hw/acpi/cpu.h   |   9 +++
 docs/specs/acpi_cpu_hotplug.txt |  10 ++-
 hw/acpi/cpu.c   | 131 
 hw/acpi/piix4.c |   2 +
 hw/i386/acpi-build.c|   5 +-
 tests/bios-tables-test.c|   1 +
 6 files changed, 156 insertions(+), 2 deletions(-)

diff --git a/include/hw/acpi/cpu.h b/include/hw/acpi/cpu.h
index 89ce172..eb79cbf 100644
--- a/include/hw/acpi/cpu.h
+++ b/include/hw/acpi/cpu.h
@@ -17,6 +17,12 @@
 #include "hw/acpi/aml-build.h"
 #include "hw/hotplug.h"
 
+typedef struct AcpiCState {
+uint32_t current_cst_field;
+uint32_t latency;
+uint32_t power;
+} AcpiCState;
+
 typedef struct AcpiCpuStatus {
 struct CPUState *cpu;
 uint64_t arch_id;
@@ -24,6 +30,7 @@ typedef struct AcpiCpuStatus {
 bool is_removing;
 uint32_t ost_event;
 uint32_t ost_status;
+AcpiCState cst;
 } AcpiCpuStatus;
 
 typedef struct CPUHotplugState {
@@ -32,6 +39,7 @@ typedef struct CPUHotplugState {
 uint8_t command;
 uint32_t dev_count;
 AcpiCpuStatus *devs;
+bool enable_cstate;
 } CPUHotplugState;
 
 void acpi_cpu_plug_cb(HotplugHandler *hotplug_dev,
@@ -50,6 +58,7 @@ void cpu_hotplug_hw_init(MemoryRegion *as, Object *owner,
 typedef struct CPUHotplugFeatures {
 bool apci_1_compatible;
 bool has_legacy_cphp;
+bool cstate_enabled;
 } CPUHotplugFeatures;
 
 void build_cpus_aml(Aml *table, MachineState *machine, CPUHotplugFeatures opts,
diff --git a/docs/specs/acpi_cpu_hotplug.txt b/docs/specs/acpi_cpu_hotplug.txt
index ee219c8..adfb026 100644
--- a/docs/specs/acpi_cpu_hotplug.txt
+++ b/docs/specs/acpi_cpu_hotplug.txt
@@ -47,6 +47,12 @@ read access:
   in case of error or unsupported command reads is 0x
   current 'Command field' value:
   0: returns PXM value corresponding to device
+  3: sequential reads return a sequence of DWORDs
+   {
+ AddressSpaceKeyword, RegisterBitWidth, RegisterBitOffset,
+ RegisterAddress Lo, RegisterAddress Hi, AccessSize,
+ C State type, Latency, Power,
+   }
 
 write access:
 offset:
@@ -75,7 +81,9 @@ write access:
 1: following writes to 'Command data' register set OST event
register in QEMU
 2: following writes to 'Command data' register set OST status
-   register in QEMU
+3: following reads from 'Command data' register return Cx
+   state (command execution resets unread field counter to the 1st
+   field).
 other values: reserved
 [0x6-0x7] reserved
 [0x8] Command data: (DWORD access)
diff --git a/hw/acpi/cpu.c b/hw/acpi/cpu.c
index 5ae595e..7ef04f9 100644
--- a/hw/acpi/cpu.c
+++ b/hw/acpi/cpu.c
@@ -16,6 +16,7 @@ enum {
 CPHP_GET_NEXT_CPU_WITH_EVENT_CMD = 0,
 CPHP_OST_EVENT_CMD = 1,
 CPHP_OST_STATUS_CMD = 2,
+CPHP_READ_CST_CMD = 3,
 CPHP_CMD_MAX
 };
 
@@ -73,6 +74,41 @@ static uint64_t cpu_hotplug_rd(void *opaque, hwaddr addr, unsigned size)
 case CPHP_GET_NEXT_CPU_WITH_EVENT_CMD:
val = cpu_st->selector;
break;
+case CPHP_READ_CST_CMD:
+switch (cdev->cst.current_cst_field) {
+case 0:
+val = cpu_to_le32(AML_AS_FFH); /* AddressSpaceKeyword */
+break;
+case 1:  /* RegisterBitWidth */
+val = cpu_to_le32(1); /* Vendor: Intel */
+break;
+case 2:  /* RegisterBitOffset */
+val = cpu_to_le32(2); /* Class: Native C State Instruction */
+break;
+case 3:  /* RegisterAddress Lo */
+val = cpu_to_le64(0); /* Arg0: mwait EAX hint */
+break;
+case 4:  /* RegisterAddress Hi */
+val = cpu_to_le32(0); /* Reserved */
+break;
+case 5:  /* AccessSize */
+val = cpu_to_le32(0); /* Arg1 */
+break;
+case 6:
+val = cpu_to_le32(1); /* The C State type C1*/
+break;
+case 7:
+val = cpu_to_le32(cdev->cst.latency);
+break;
+case 8:
+val = cpu_to_le32(cdev->cst.power);
+break;
+default:
+val = 0x;
+   break;
+}
+

[Qemu-devel] [Bug 1783362] Re: qemu-user: mmap should return failure (MAP_FAILED, -1) instead of success (NULL, 0) when len==0

2018-08-08 Thread umarcor
** Changed in: qemu
   Status: Fix Committed => Fix Released

** Changed in: qemu (Ubuntu)
   Status: Fix Committed => Fix Released

-- 
You received this bug notification because you are a member of
qemu-devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1783362

Title:
  qemu-user: mmap should return failure (MAP_FAILED, -1) instead of
  success (NULL, 0) when len==0

Status in QEMU:
  Fix Released
Status in qemu package in Ubuntu:
  Fix Released

Bug description:
  As shown in
  https://github.com/beehive-lab/mambo/issues/19#issuecomment-407420602,
  with len==0 mmap returns success (NULL, 0) instead of failure
  (MAP_FAILED, -1) on an x86_64 host executing an ELF 64-bit LSB
  executable, ARM aarch64 binary.

  Steps to reproduce the bug:

  - (cross-)compile the attached source file:

  $ aarch64-linux-gnu-gcc -static -std=gnu99 -lpthread test/mmap_qemu.c -o mmap_qemu

  - Execute on an x86_64 host with qemu-user and qemu-user-binfmt:

  $ ./mmap_qemu
  alloc: 0
  MAP_FAILED: -1
  errno: 0
  mmap_qemu: test/mmap_qemu.c:15: main: Assertion `alloc == MAP_FAILED' failed.
  qemu: uncaught target signal 6 (Aborted) - core dumped
  Aborted (core dumped)

  - Execute on an ARM host without any additional dependency:

  $ ./mmap_qemu
  alloc: -1
  MAP_FAILED: -1
  errno: 22

  The bug is present in Fedora:

  $ qemu-aarch64 --version
  qemu-aarch64 version 2.11.2(qemu-2.11.2-1.fc28)
  Copyright (c) 2003-2017 Fabrice Bellard and the QEMU Project developers
  $ uname -r
  4.17.7-200.fc28.x86_64

  And also in Ubuntu:

  $ qemu-aarch64 --version
  qemu-aarch64 version 2.12.0 (Debian 1:2.12+dfsg-3ubuntu3)
  Copyright (c) 2003-2017 Fabrice Bellard and the QEMU Project developers
  $ uname -r
  4.15.0-23-generic

  Possibly related to:

  - https://lists.freebsd.org/pipermail/freebsd-hackers/2009-July/029109.html
  - https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=203852

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1783362/+subscriptions



Re: [Qemu-devel] [PATCH 01/21] jobs: canonize Error object

2018-08-08 Thread Kevin Wolf
Am 07.08.2018 um 06:33 hat John Snow geschrieben:
> Jobs presently use both an Error object in the case of the create job,
> and char strings in the case of generic errors elsewhere.
> 
> Unify the two paths as just j->err, and remove the extra argument from
> job_completed.
> 
> Signed-off-by: John Snow 

Hm, not sure. Overall, this feels like a step backwards.

> diff --git a/include/qemu/job.h b/include/qemu/job.h
> index 18c9223e31..845ad00c03 100644
> --- a/include/qemu/job.h
> +++ b/include/qemu/job.h
> @@ -124,12 +124,12 @@ typedef struct Job {
>  /** Estimated progress_current value at the completion of the job */
>  int64_t progress_total;
>  
> -/** Error string for a failed job (NULL if, and only if, job->ret == 0) */
> -char *error;
> -
>  /** ret code passed to job_completed. */
>  int ret;
>  
> +/** Error object for a failed job **/
> +Error *err;
> +
>  /** The completion function that will be called when the job completes. */
>  BlockCompletionFunc *cb;

This is the change that I could agree with, though I don't think it
matters much: whether you store the string immediately or an Error
object from which you get the string later makes little practical
difference.

Maybe we will find more uses, and having an Error object is common
practice in QEMU, so no objections to this change.

> @@ -484,15 +484,13 @@ void job_transition_to_ready(Job *job);
>  /**
>   * @job: The job being completed.
>   * @ret: The status code.
> - * @error: The error message for a failing job (only with @ret < 0). If @ret is
> - * negative, but NULL is given for @error, strerror() is used.
>   *
>   * Marks @job as completed. If @ret is non-zero, the job transaction it is part
>   * of is aborted. If @ret is zero, the job moves into the WAITING state. If it
>   * is the last job to complete in its transaction, all jobs in the transaction
>   * move from WAITING to PENDING.
>   */
> -void job_completed(Job *job, int ret, Error *error);
> +void job_completed(Job *job, int ret);

I don't like this one, though.

Before this change, job_completed(..., NULL) was a clear sign that the
error message probably needed an improvement, because an errno string
doesn't usually describe error situations very well. We may not have a
much better message in some cases, but in most cases we just don't pass
one because an error message after job creation is still a quite new
thing in the QAPI schema.

What we should get rid of in the long term is the int ret, not the Error
*error. I suspect callers really just distinguish success/error without
actually looking at the error code.

With this change applied, what's your new conversion plan for making
sure that every failing caller of job_completed() has set job->error
first?

> @@ -666,8 +665,8 @@ static void job_update_rc(Job *job)
>  job->ret = -ECANCELED;
>  }
>  if (job->ret) {
> -if (!job->error) {
> -job->error = g_strdup(strerror(-job->ret));
> +if (!job->err) {
> +error_setg_errno(&job->err, -job->ret, "job failed");
>  }
>  job_state_transition(job, JOB_STATUS_ABORTING);
>  }

This hunk just makes the error message more verbose with a "job failed"
prefix that doesn't add information. If it's the error string for a job,
of course the job failed.

Kevin



Re: [Qemu-devel] [PATCH 1/4] Add MEN Chameleon Bus emulation

2018-08-08 Thread Johannes Thumshirn
On Wed, Aug 08, 2018 at 03:52:05PM +0100, Peter Maydell wrote:
> On 8 August 2018 at 15:16, Johannes Thumshirn  wrote:
> > The MEN Chameleon Bus (MCB) is an on-chip bus system exposing IP Cores of an
> > FPGA to a outside bus system like PCIe.
> >
> > Signed-off-by: Johannes Thumshirn 
> 
> 
> > --- /dev/null
> > +++ b/hw/mcb/mcb.c
> > @@ -0,0 +1,180 @@
> > +/*
> > + * QEMU MEN Chameleon Bus emulation
> > + *
> > + * Copyright (C) 2016 Johannes Thumshirn 
> 
> Really 2016?

Yes, originally. I'll bump it to 2016 - 2018.

> 
> > + *
> > + * This code is licensed under the GNU GPL v2 or (at your opinion) any
> 
> This should say "option" -- wording is important in license
> statements :-)  You might prefer to copy-and-paste an
> existing copyright header.

This should've been a copy from somewhere. Anyway, I'll fix
it.

Thanks.

-- 
Johannes Thumshirn  Storage
jthumsh...@suse.de                                +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850



Re: [Qemu-devel] [PULL 21/35] block: fix QEMU crash with scsi-hd and drive_del

2018-08-08 Thread Eric Blake

On 08/08/2018 09:32 AM, Vladimir Sementsov-Ogievskiy wrote:

>>> What's more, in commit f140e300, we specifically called out in the
>>> commit message that maybe it was better to trace when we detect
>>> connection closed rather than log it to stdout, and in all cases in
>>> that commit, the additional 'Connection closed' messages do not add
>>> any information to the error message already displayed by the rest of
>>> the code.
>>
>> Ok, agree, I'll do it in reconnect series.
>
> hmm, do what?
>
> I was going to change these error messages to be traces, but now I'm not
> sure that it's a good idea.


Traces are fine. They won't show up in iotests, but will show up when 
debugging a failed connection.


> We have generic errp returned from the
> function, and why to drop it from logs?


Because it is redundant with the very next line already in the log. Any 
error encountered when trying to write to a disconnected server is 
redundant with an already-reported error due to detecting EOF on reading 
from the server.


> Fixing iotest is not a good
> reason, better is to adjust iotest itself a bit (just commit changed
> output) and forget about it. Is iotest racy itself, did you see
> different output running 83 iotest, not testing by hand?


The condition for the output of the 'Connection closed' message is racy 
- it depends entirely on the timing of whether the client was able to 
send() a read request to the server prior to the server disconnecting 
immediately after negotiation ended.  If the client loses the race and 
detects the server hangup prior to writing anything, you get one path; 
if the client wins the race and successfully writes the request and only 
later learns that the server has disconnected when trying to read the 
response to that request, you get the other path. The window for the 
race changed (and the iotests did not seem to ever expose it short of 
this particular change to the block layer to do an extra drain), but I 
could still imagine scenarios where iotests will trigger the opposite 
path of the race from what is expected, depending on load, since I don't 
see any synchronization points between the two processes where the 
server is hanging up after negotiation without reading the client's 
request, but where the client may or may not have had time to get its 
request sent to the server's queue.


So, even though I have not seen the iotest fail directly because of a 
race, I think that this commit causing failures in the iotest is 
evidence that the test is not robust with those extra 'Connection 
closed' messages being output.  Switching the output to be a trace 
instead should be just fine; overall, the client's attempt to read when 
the server hangs up will be an EIO failure whether or not the client was 
able to send() its request and merely fails to get a reply (server 
disconnect was slow), or whether the client was not even able to send() 
its request (server disconnect was fast).
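The two sides of the race can be modelled with a socketpair. This is a sketch under stated assumptions, not QEMU's NBD client: the function name request_against_hangup is hypothetical, the "server" is just the other end of the pair, and every failure is collapsed to -EIO to mirror the outcome described above.

```c
#include <errno.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical model of the race. The "server" end hangs up without
 * ever reading; the flag selects whether the hangup lands before or
 * after the client's send(). Returns 0 if a reply arrived, -EIO on
 * any failure.
 */
static int request_against_hangup(int server_hangs_up_first)
{
    int sv[2];
    char byte = 'r';
    int ret = -EIO;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0) {
        return -EIO;
    }
    if (server_hangs_up_first) {
        close(sv[1]);               /* client loses the race */
    }
    /* MSG_NOSIGNAL: report a dead peer as an error instead of SIGPIPE */
    if (send(sv[0], &byte, 1, MSG_NOSIGNAL) == 1) {
        if (!server_hangs_up_first) {
            close(sv[1]);           /* client wins: hangup after send() */
        }
        /* No reply will ever come; recv() reports EOF or an error */
        if (recv(sv[0], &byte, 1, 0) == 1) {
            ret = 0;
        }
    } else if (!server_hangs_up_first) {
        close(sv[1]);               /* unexpected send failure: clean up */
    }
    close(sv[0]);
    return ret;
}
```

Both orderings end in -EIO; only the place where the disconnect is noticed (the write or the read) differs, which is why an extra message tied to one of the two paths makes the expected output timing-dependent.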


--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



Re: [Qemu-devel] [PATCH v3 3/5] qcow2: Resize the cache upon image resizing

2018-08-08 Thread Leonid Bloch

On 08/08/2018 05:13 PM, Alberto Garcia wrote:

> On Wed 08 Aug 2018 09:10:49 AM CEST, Leonid Bloch wrote:
>> The caches are now recalculated upon image resizing. This is done
>> because the new default behavior of assigning a sufficient L2 cache to
>> cover the entire image implies that the cache will still be sufficient
>> after an image resizing.
>
> This is related to what I mentioned in the previous patch. The default
> behavior doesn't make the cache try to cover the entire image (or at
> least it doesn't *extend* the cache, which is what I understand from
> this paragraph). What it does is *reduce* the cache if the smaller
> version is enough for the entire image.


But it doesn't say that it extends the cache. It says that it *adapts* 
the cache to the image size, and therefore it should be resized when the 
image is resized. At least I understand it this way. That said, I'd 
mention the limit there, instead of just "sufficient".
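The "adapt to the image size, up to a limit" behaviour discussed here can be sketched as follows. It relies on the qcow2 fact that each L2 entry is 8 bytes and maps one cluster; the function names and the limit passed in are assumptions for illustration, not the code from this series, and the alignment of the result to the cluster size that real code would need is omitted.

```c
#include <stdint.h>

#define L2_ENTRY_SIZE 8     /* bytes per qcow2 L2 table entry */

/* L2 cache bytes needed to map every cluster of the image. */
static uint64_t l2_cache_to_cover(uint64_t virtual_size,
                                  uint64_t cluster_size)
{
    uint64_t clusters = (virtual_size + cluster_size - 1) / cluster_size;

    return clusters * L2_ENTRY_SIZE;
}

/*
 * Hypothetical policy: use just enough cache for the whole image, but
 * never more than max_cache. Must be recomputed when the image is
 * resized, since virtual_size changes.
 */
static uint64_t adapted_l2_cache(uint64_t virtual_size,
                                 uint64_t cluster_size,
                                 uint64_t max_cache)
{
    uint64_t needed = l2_cache_to_cover(virtual_size, cluster_size);

    return needed < max_cache ? needed : max_cache;
}
```

For example, a 16 GiB image with 64 KiB clusters needs 2 MiB of L2 cache; growing it to 512 GiB would need 64 MiB, so with an assumed 32 MiB limit the cache stops at 32 MiB and no longer covers the whole image.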





>> Signed-off-by: Leonid Bloch 
>> ---
>>   block/qcow2.c | 8 
>>   1 file changed, 8 insertions(+)
>>
>> diff --git a/block/qcow2.c b/block/qcow2.c
>> index 98cb96aaca..f60cb92169 100644
>> --- a/block/qcow2.c
>> +++ b/block/qcow2.c
>> @@ -3639,6 +3639,8 @@ static int coroutine_fn qcow2_co_truncate(BlockDriverState *bs, int64_t offset,
>>   }
>>   }
>>   
>> +bs->total_sectors = offset / BDRV_SECTOR_SIZE;
>> +
>>   /* write updated header.size */
>>   offset = cpu_to_be64(offset);
>>   ret = bdrv_pwrite_sync(bs->file, offsetof(QCowHeader, size),
>> @@ -3649,6 +3651,12 @@ static int coroutine_fn qcow2_co_truncate(BlockDriverState *bs, int64_t offset,
>>   }
>>   
>>   s->l1_vm_state_index = new_l1_size;
>
> You could add an empty line here for readability.


Yes, definitely, I will. Thanks.




>> +/* Update cache sizes */
>> +QDict *options = qdict_clone_shallow(bs->options);
>
> C99 allows variable declarations in the middle of a block, but we're
> still doing it at the beginning (I don't know if there's a good reason
> for this?).


I did it for readability, and I didn't see a style directive for this. 
But if the style requires it - no problem. :)




> Otherwise the patch looks good to me. Thanks!


Thanks!

Leonid.



> Berto




