date:20200618




> On Jun 18, 2020, at 6:29 PM, Richard Henderson  
> wrote:
> 
> On 6/12/20 9:20 PM, Lijun Pan wrote:
>> +#define VMULH_DO(name, op, element, cast_orig, cast_temp)   \
>> +void helper_vmulh##name(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)   
>> \
>> +{   
>> \
>> +int i;  \
>> +\
>> +for (i = 0; i < ARRAY_SIZE(r->element); i++) {  \
>> +r->element[i] = (cast_orig)(((cast_temp)a->element[i] op \
>> +(cast_temp)b->element[i]) >> 32);   \
>> +}   \
>> +}
>> +VMULH_DO(sw, *, s32, int32_t, int64_t)
>> +VMULH_DO(uw, *, u32, uint32_t, uint64_t)
>> +#undef VMULH_DO
> 
> There's no point in calling the macro "VMUL" and then passing in "op" as a
> parameter.  Just inline the multiply directly.

Do you mean writing the two functions out directly, like this?

void helper_vmulhsw(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
{
    int i;

    for (i = 0; i < 4; i++) {
        r->s32[i] = (int32_t)(((int64_t)a->s32[i] * (int64_t)b->s32[i]) >> 32);
    }
}

void helper_vmulhuw(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
{
    int i;

    for (i = 0; i < 4; i++) {
        r->u32[i] = (uint32_t)(((uint64_t)a->u32[i] * (uint64_t)b->u32[i]) >> 32);
    }
}

Thanks,
Lijun
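
For readers following the thread, the multiply-high operation under discussion can be modelled in standalone C. This is only a sketch: the names mulh_s32, mulh_u32 and vmulhsw_model are hypothetical and are not QEMU helpers, and plain arrays stand in for ppc_avr_t. Each lane is widened to 64 bits, multiplied, and the upper 32 bits of the product are kept.

```c
#include <stdint.h>

/* High 32 bits of the signed 64-bit product of two 32-bit values. */
static int32_t mulh_s32(int32_t a, int32_t b)
{
    int64_t prod = (int64_t)a * (int64_t)b;   /* cannot overflow 64 bits */

    /*
     * Right-shifting a negative value is implementation-defined in ISO C.
     * An arithmetic shift is assumed here, which is what GCC and Clang do.
     */
    return (int32_t)(prod >> 32);
}

/* High 32 bits of the unsigned 64-bit product of two 32-bit values. */
static uint32_t mulh_u32(uint32_t a, uint32_t b)
{
    return (uint32_t)(((uint64_t)a * (uint64_t)b) >> 32);
}

/* Four-lane model of the signed word variant (arrays stand in for ppc_avr_t). */
static void vmulhsw_model(int32_t r[4], const int32_t a[4], const int32_t b[4])
{
    for (int i = 0; i < 4; i++) {
        r[i] = mulh_s32(a[i], b[i]);
    }
}
```

Note that the shift is applied to the full 64-bit product before narrowing, which is why the placement of the parentheses in the cast expression matters.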

Re: [PATCH 2/6] target/ppc: add vmulld instruction



> On Jun 18, 2020, at 6:27 PM, Richard Henderson  
> wrote:
> 
> On 6/12/20 9:20 PM, Lijun Pan wrote:
>> vmulld: Vector Multiply Low Doubleword.
>> 
>> Signed-off-by: Lijun Pan 
>> ---
>> target/ppc/helper.h | 1 +
>> target/ppc/int_helper.c | 1 +
>> target/ppc/translate/vmx-impl.inc.c | 1 +
>> target/ppc/translate/vmx-ops.inc.c  | 1 +
>> 4 files changed, 4 insertions(+)
>> 
>> diff --git a/target/ppc/helper.h b/target/ppc/helper.h
>> index 2dfa1c6942..c3f087ccb3 100644
>> --- a/target/ppc/helper.h
>> +++ b/target/ppc/helper.h
>> @@ -185,6 +185,7 @@ DEF_HELPER_3(vmuloub, void, avr, avr, avr)
>> DEF_HELPER_3(vmulouh, void, avr, avr, avr)
>> DEF_HELPER_3(vmulouw, void, avr, avr, avr)
>> DEF_HELPER_3(vmuluwm, void, avr, avr, avr)
>> +DEF_HELPER_3(vmulld, void, avr, avr, avr)
>> DEF_HELPER_3(vslo, void, avr, avr, avr)
>> DEF_HELPER_3(vsro, void, avr, avr, avr)
>> DEF_HELPER_3(vsrv, void, avr, avr, avr)
>> diff --git a/target/ppc/int_helper.c b/target/ppc/int_helper.c
>> index be53cd6f68..afbcdd05b4 100644
>> --- a/target/ppc/int_helper.c
>> +++ b/target/ppc/int_helper.c
>> @@ -533,6 +533,7 @@ void helper_vprtybq(ppc_avr_t *r, ppc_avr_t *b)
>> }   \
>> }
>> VARITH_DO(muluwm, *, u32)
>> +VARITH_DO(mulld, *, s64)
> 
> From this implementation, I would say that both vmuluwm and vmulld can be
> implemented with tcg_gen_gvec_mul().
> 
> I guess vmuluwm was missed when many of the other vmx operations were
> converted to gvec.
> 
> Please first convert vmuluwm to tcg_gen_gvec_mul, then implement vmulld in the
> same manner.

I grepped the git repo and found that only arm uses tcg_gen_gvec_mul.
The original implementation is very straightforward and is used in many
places throughout target/ppc/int_helper.c. Why do we need to convert to
tcg_gen_gvec_mul, which seems very convoluted to me?

Thanks,
Lijun
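
One reason a single generic vector multiply can serve both vmuluwm and vmulld is that the low half of a product does not depend on whether the operands are read as signed or unsigned. The sketch below models that in standalone C; mul_low64 and vmulld_model are hypothetical names, and the code does not use any TCG API.

```c
#include <stdint.h>

/*
 * Low 64 bits of a 64x64 product. Unsigned arithmetic wraps modulo 2^64,
 * so this is well defined and gives the same bit pattern a signed
 * "multiply low" would produce.
 */
static uint64_t mul_low64(uint64_t a, uint64_t b)
{
    return a * b;
}

/* Two-lane model of a doubleword multiply-low over a 128-bit vector. */
static void vmulld_model(uint64_t r[2], const uint64_t a[2], const uint64_t b[2])
{
    for (int i = 0; i < 2; i++) {
        r[i] = mul_low64(a[i], b[i]);
    }
}
```

Doing the multiply on unsigned lanes also sidesteps signed-overflow undefined behaviour, which a signed 64-bit element type would be exposed to in C.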

Re: [PATCH 1/6] target/ppc: add byte-reverse br[dwh] instructions




> On Jun 18, 2020, at 6:19 PM, Richard Henderson  
> wrote:
> 
> On 6/12/20 9:20 PM, Lijun Pan wrote:
>> POWER ISA 3.1 introduces following byte-reverse instructions:
>> brd: Byte-Reverse Doubleword X-form
>> brw: Byte-Reverse Word X-form
>> brh: Byte-Reverse Halfword X-form
>> 
>> Signed-off-by: Lijun Pan 
>> ---
>> target/ppc/translate.c | 62 ++
>> 1 file changed, 62 insertions(+)
>> 
>> diff --git a/target/ppc/translate.c b/target/ppc/translate.c
>> index 4ce3d664b5..2d48fbc8db 100644
>> --- a/target/ppc/translate.c
>> +++ b/target/ppc/translate.c
>> @@ -6971,7 +6971,69 @@ static void gen_dform3D(DisasContext *ctx)
>> return gen_invalid(ctx);
>> }
>> 
>> +/* brd */
>> +static void gen_brd(DisasContext *ctx)
>> +{
>> +TCGv_i64 temp = tcg_temp_new_i64();
>> +
>> +tcg_gen_bswap64_i64(temp, cpu_gpr[rS(ctx->opcode)]);
>> +tcg_gen_st_i64(temp, cpu_env, offsetof(CPUPPCState, 
>> gpr[rA(ctx->opcode)]));
> 
> 
> The store is wrong.  You cannot modify storage behind a tcg global variable
> like that.  This should just be
> 
>tcg_gen_bswap64_i64(cpu_gpr[rA(ctx->opcode)],
>cpu_gpr[rS(ctx->opcode)]);

Why can’t I retrieve the offset via “offsetof(CPUPPCState, 
gpr[rA(ctx->opcode)])”?
I would like to learn more.

> Is this code within an ifdef for TARGET_PPC64?

I will change it to 
+#if defined(TARGET_PPC64)
+GEN_HANDLER_E(brd, 0x1F, 0x1B, 0x05, 0xF801, PPC_NONE, PPC2_ISA300),
+GEN_HANDLER_E(brw, 0x1F, 0x1B, 0x04, 0xF801, PPC_NONE, PPC2_ISA300),
+GEN_HANDLER_E(brh, 0x1F, 0x1B, 0x06, 0xF801, PPC_NONE, PPC2_ISA300),
+#endif

> If not, then this will break the 32-bit qemu-system-ppc build.
> Are you sure you have built and tested all configurations?
> 
> 
>> +/* brw */
>> +static void gen_brw(DisasContext *ctx)
>> +{
>> +TCGv_i64 temp = tcg_temp_new_i64();
>> +TCGv_i64 lsb = tcg_temp_new_i64();
>> +TCGv_i64 msb = tcg_temp_new_i64();
>> +
>> +tcg_gen_movi_i64(lsb, 0xull);
>> +tcg_gen_and_i64(temp, lsb, cpu_gpr[rS(ctx->opcode)]);
>> +tcg_gen_bswap32_i64(lsb, temp);
>> +
>> +tcg_gen_shri_i64(msb, cpu_gpr[rS(ctx->opcode)], 32);
>> +tcg_gen_bswap32_i64(temp, msb);
>> +tcg_gen_shli_i64(msb, temp, 32);
>> +
>> +tcg_gen_or_i64(temp, lsb, msb);
>> +
>> +tcg_gen_st_i64(temp, cpu_env, offsetof(CPUPPCState, 
>> gpr[rA(ctx->opcode)]));
> 
> Again, the store is wrong.
> 
> In addition, this can be computed as
> 
>tcg_gen_bswap64_i64(dest, source);
>tcg_gen_rotli_i64(dest, dest, 32);
> 
>> +static void gen_brh(DisasContext *ctx)
>> +{
>> +TCGv_i64 temp = tcg_temp_new_i64();
>> +TCGv_i64 t0 = tcg_temp_new_i64();
>> +TCGv_i64 t1 = tcg_temp_new_i64();
>> +TCGv_i64 t2 = tcg_temp_new_i64();
>> +TCGv_i64 t3 = tcg_temp_new_i64();
>> +
>> +tcg_gen_movi_i64(t0, 0x00ff00ff00ff00ffull);
>> +tcg_gen_shri_i64(t1, cpu_gpr[rS(ctx->opcode)], 8);
>> +tcg_gen_and_i64(t2, t1, t0);
>> +tcg_gen_and_i64(t1, cpu_gpr[rS(ctx->opcode)], t0);
>> +tcg_gen_shli_i64(t1, t1, 8);
>> +tcg_gen_or_i64(temp, t1, t2);
>> +tcg_gen_st_i64(temp, cpu_env, offsetof(CPUPPCState, 
>> gpr[rA(ctx->opcode)]));
> 
> Again, the store is wrong.
> 
> 
> r~
>
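
The byte-reverse identities suggested in the review can be checked with a standalone C model. This is a sketch only: swap_bytes64, brd, brw and brh are hypothetical plain-C functions standing in for the TCG ops, not QEMU code. It shows that brw equals a 64-bit byte swap followed by a 32-bit rotate, and that brh can be done with one mask and two shifts.

```c
#include <stdint.h>

/* Portable 64-bit byte swap, standing in for tcg_gen_bswap64_i64. */
static uint64_t swap_bytes64(uint64_t x)
{
    /* Swap adjacent bytes, then adjacent halfwords, then the two words. */
    x = ((x & 0x00ff00ff00ff00ffULL) << 8) | ((x >> 8) & 0x00ff00ff00ff00ffULL);
    x = ((x & 0x0000ffff0000ffffULL) << 16) | ((x >> 16) & 0x0000ffff0000ffffULL);
    return (x << 32) | (x >> 32);
}

/* brd: reverse all eight bytes of the doubleword. */
static uint64_t brd(uint64_t x)
{
    return swap_bytes64(x);
}

/* brw: reverse bytes within each 32-bit word = bswap64 then rotate by 32. */
static uint64_t brw(uint64_t x)
{
    uint64_t t = swap_bytes64(x);
    return (t << 32) | (t >> 32);
}

/* brh: reverse bytes within each 16-bit halfword. */
static uint64_t brh(uint64_t x)
{
    const uint64_t mask = 0x00ff00ff00ff00ffULL;
    return ((x & mask) << 8) | ((x >> 8) & mask);
}
```

For the input 0x0123456789abcdef, the word-wise reversal keeps 0x01234567 and 0x89abcdef in their original halves and only reverses the bytes inside each, which is exactly what the rotate restores after the full swap.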

Re: [PATCH v2] target/ppc: add vmsumudm vmsumcud instructions




> On Jun 18, 2020, at 6:09 PM, Richard Henderson  
> wrote:
> 
> On 6/15/20 1:53 PM, Lijun Pan wrote:
>>>> +static inline void uint128_add(uint64_t ah, uint64_t al, uint64_t bh,
>>>> +  uint64_t bl, uint64_t *rh, uint64_t *rl, uint64_t *ca)
>>>> +{
>>>> +  __uint128_t a = (__uint128_t)ah << 64 | (__uint128_t)al;
>>>> +  __uint128_t b = (__uint128_t)bh << 64 | (__uint128_t)bl;
>>>> +  __uint128_t r = a + b;
>>>> +
>>>> +  *rh = (uint64_t)(r >> 64);
>>>> +  *rl = (uint64_t)r;
>>>> +  *ca = (~a < b);
>>>> +}
>>> 
>>> This is *not* what I had in mind at all.
>>> 
>>> int128.h should be operating on Int128, and *not* component uint64_t values.
>> 
>> Should uint128_add() be included in a new file called uint128.h? or still at 
>> host-utils.h?
> 
> If you want this sort of specific operation, you should leave it in 
> target/ppc/.
> 
> I had been hoping that you could make use of Int128 as-is, or with minimal
> adjustment in the same style.
> 
>> vmsumudm/vmsumcud operate as follows:
>> 1. 128-bit prod1 = (high 64 bits of a) * (high 64 bits of b), // I reuse 
>> mulu64()

This implementation does not rely on 128-bit compiler support (CONFIG_INT128
not defined), hence the use of mulu64().

>> 2. 128-bit prod2 = (high 64 bits of b) * (high 64 bits of b), // I reuse 
>> mulu64()
>> 3. 128-bit result = prod1 + prod2 + c; // I added addu128() in v1, renamed 
>> it to uint128_add() in v2
> 
> Really?  That seems a very odd computation.  Your code,
> 
>> +prod1 = (__uint128_t)ah * (__uint128_t)bh;
>> +prod2 = (__uint128_t)al * (__uint128_t)bl;
>> +r = prod1 + prod2 + c;
> 
> is slightly different, but still very odd.

The three lines of code above use 128-bit compiler support (#ifdef CONFIG_INT128).

> 
> Why would we be adding the intermediate 128th bit of the 256-bit product
> (prod1, bit 0) with the 0th bit of the 256-bit product (prod2, bit 0).
> 
> Unfortunately, I can't find the v3.1 spec online yet, so I can't look at this
> myself.  What is the instruction supposed to produce?

https://ibm.ent.box.com/s/hhjfw0x0lrbtyzmiaffnbxh2fuo0fog0

> 
>> To better understand your request, may I ask you several questions:
>> 1. keep mulsum() in target/ppc/int_helper.c?
> 
> Probably.
> 
>> If so, it will inevitably have  #ifdef CONFIG_INT128 #else #endif in that 
>> function.  
> 
> No, you don't have to ifdef.  You can use uint64_t alone and not rely on
> compiler support for __uint128_t at all.
> 
> 
> r~
>
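
The closing suggestion, doing the arithmetic with uint64_t alone, can be sketched in standalone C as follows. Everything here is an assumption for illustration: U128, mul64x64, add128 and msum_udm are hypothetical names, not QEMU functions, and the operation modelled is the one used in the code quoted above (high doubleword product plus low doubleword product plus a 128-bit addend), not a statement of what the ISA specifies.

```c
#include <stdint.h>

typedef struct { uint64_t hi, lo; } U128;               /* hypothetical 128-bit pair */
typedef struct { U128 sum; unsigned carry; } MsumResult;

/* Full 64x64 -> 128 multiply using four 32x32 partial products. */
static U128 mul64x64(uint64_t a, uint64_t b)
{
    uint64_t a_lo = (uint32_t)a, a_hi = a >> 32;
    uint64_t b_lo = (uint32_t)b, b_hi = b >> 32;
    uint64_t p0 = a_lo * b_lo, p1 = a_lo * b_hi;
    uint64_t p2 = a_hi * b_lo, p3 = a_hi * b_hi;
    /* Middle column: fits easily, at most about 3 * 2^32. */
    uint64_t mid = (p0 >> 32) + (uint32_t)p1 + (uint32_t)p2;
    U128 r;

    r.lo = (mid << 32) | (uint32_t)p0;
    r.hi = p3 + (p1 >> 32) + (p2 >> 32) + (mid >> 32);
    return r;
}

/* 128-bit add; *carry receives the carry out of bit 127 (0 or 1). */
static U128 add128(U128 a, U128 b, unsigned *carry)
{
    U128 r;
    uint64_t c0;

    r.lo = a.lo + b.lo;
    c0 = r.lo < a.lo;               /* carry out of the low half */
    r.hi = a.hi + b.hi;
    *carry = r.hi < a.hi;           /* carry from adding the high halves */
    r.hi += c0;
    *carry += r.hi < c0;            /* carry from propagating c0 */
    return r;
}

/* ah*bh + al*bl + c, modulo 2^128, with the number of carries out (0..2). */
static MsumResult msum_udm(uint64_t ah, uint64_t al, uint64_t bh, uint64_t bl,
                           uint64_t ch, uint64_t cl)
{
    MsumResult res;
    unsigned c1, c2;
    U128 c = { ch, cl };
    U128 s = add128(mul64x64(ah, bh), mul64x64(al, bl), &c1);

    res.sum = add128(s, c, &c2);
    res.carry = c1 + c2;
    return res;
}
```

The carry count is kept separate from the modular sum so that a modulo variant and a carry-out variant could share the same core without any #ifdef on compiler 128-bit support.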

Re: Memory leak in spapr_machine_init()?

On Thu, Jun 18, 2020 at 08:55:53AM +0200, Markus Armbruster wrote:
> Either I'm confused (quite possible), or kvmppc_check_papr_resize_hpt()
> can leak an Error object on failure.  Please walk through the code with
> me:
> 
> kvmppc_check_papr_resize_hpt(&resize_hpt_err);
> 
> This sets @resize_hpt_err on failure.
> 
> if (spapr->resize_hpt == SPAPR_RESIZE_HPT_DEFAULT) {
> /*
>  * If the user explicitly requested a mode we should either
>  * supply it, or fail completely (which we do below).  But if
>  * it's not set explicitly, we reset our mode to something
>  * that works
>  */
> if (resize_hpt_err) {
> spapr->resize_hpt = SPAPR_RESIZE_HPT_DISABLED;
> error_free(resize_hpt_err);
> resize_hpt_err = NULL;
> 
> Case 1: failure and SPAPR_RESIZE_HPT_DEFAULT; we free @resize_hpt_err.
> Good.
> 
> } else {
> spapr->resize_hpt = smc->resize_hpt_default;
> }
> }
> 
> assert(spapr->resize_hpt != SPAPR_RESIZE_HPT_DEFAULT);
> 
> if ((spapr->resize_hpt != SPAPR_RESIZE_HPT_DISABLED) && 
> resize_hpt_err) {
> /*
>  * User requested HPT resize, but this host can't supply it.  
> Bail out
>  */
> error_report_err(resize_hpt_err);
> exit(1);
> 
> Case 2: failure and not SPAPR_RESIZE_HPT_DISABLED; fatal.  Good.
> 
> }
> 
> What about case 3: failure and SPAPR_RESIZE_HPT_DISABLED?
> 
> Good if we get here via case 1 (we freed @resize_hpt_err).
> 
> Else, ???

I think you're right, and we leak it in this case - I think I forgot
that in the DISABLED case we still (unnecessarily) ask the kernel if
it can do it.

Of course, it will only happen once per run, so it's not like it's a
particularly noticeable leak.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson
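
The three cases walked through above can be reproduced in a small standalone model. This is a sketch under stated assumptions: Err, check_resize, err_free and init_resize are hypothetical stand-ins written for illustration, not QEMU's Error API or the spapr code, and a counter of live error objects plays the role of a leak detector. The model frees the error in the explicitly disabled case, which is one possible way to close the leak.

```c
#include <stdbool.h>
#include <stdlib.h>

typedef struct Err { const char *msg; } Err;            /* stand-in for Error */
enum resize_mode { RESIZE_DEFAULT, RESIZE_DISABLED, RESIZE_ENABLED };

static int live_errors;     /* Err objects allocated and not yet freed */

/* Models the capability check: sets *errp when the host cannot resize. */
static void check_resize(bool host_ok, Err **errp)
{
    if (!host_ok) {
        *errp = malloc(sizeof(**errp));
        if (!*errp) {
            abort();
        }
        (*errp)->msg = "host cannot resize HPT";
        live_errors++;
    }
}

static void err_free(Err *e)
{
    if (e) {
        free(e);
        live_errors--;
    }
}

/* Returns the final mode, or -1 where the real code would report and exit. */
static int init_resize(enum resize_mode mode, bool host_ok)
{
    Err *err = NULL;

    check_resize(host_ok, &err);

    if (mode == RESIZE_DEFAULT) {
        if (err) {                      /* case 1: fall back and free */
            mode = RESIZE_DISABLED;
            err_free(err);
            err = NULL;
        } else {
            mode = RESIZE_ENABLED;
        }
    }

    if (mode != RESIZE_DISABLED && err) {
        err_free(err);                  /* case 2: fatal in the real code */
        return -1;
    }

    err_free(err);                      /* case 3: explicitly disabled */
    return mode;
}
```

Without the final err_free call, invoking init_resize(RESIZE_DISABLED, false) would leave live_errors at 1, which corresponds to the leak described in the message.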



Re: [PATCH v1 2/2] sifive_e: Support the revB machine

2020-06-18 Thread Palmer Dabbelt


On Thu, 18 Jun 2020 16:18:20 PDT (-0700), alistai...@gmail.com wrote:

On Thu, Jun 18, 2020 at 3:42 PM Palmer Dabbelt  wrote:


On Wed, 10 Jun 2020 15:13:49 PDT (-0700), alistai...@gmail.com wrote:
> On Thu, May 28, 2020 at 11:13 AM Alistair Francis  
wrote:
>>
>> On Thu, May 21, 2020 at 8:57 AM Alistair Francis  
wrote:
>> >
>> > On Wed, May 20, 2020 at 4:08 PM Palmer Dabbelt  wrote:
>> > >
>> > > On Thu, 14 May 2020 13:47:10 PDT (-0700), Alistair Francis wrote:
>> > > > Signed-off-by: Alistair Francis 
>> > > > ---
>> > > >  hw/riscv/sifive_e.c | 35 +++
>> > > >  include/hw/riscv/sifive_e.h |  1 +
>> > > >  2 files changed, 32 insertions(+), 4 deletions(-)
>> > > >
>> > > > diff --git a/hw/riscv/sifive_e.c b/hw/riscv/sifive_e.c
>> > > > index 472a98970b..cb7818341b 100644
>> > > > --- a/hw/riscv/sifive_e.c
>> > > > +++ b/hw/riscv/sifive_e.c
>> > > > @@ -98,10 +98,14 @@ static void riscv_sifive_e_init(MachineState 
*machine)
>> > > >  memmap[SIFIVE_E_DTIM].base, main_mem);
>> > > >
>> > > >  /* Mask ROM reset vector */
>> > > > -uint32_t reset_vec[2] = {
>> > > > -0x204002b7,/* 0x1000: lui t0,0x20400 */
>> > > > -0x00028067,/* 0x1004: jr  t0 */
>> > > > -};
>> > > > +uint32_t reset_vec[2];
>> > > > +
>> > > > +if (s->revb) {
>> > > > +reset_vec[0] = 0x200102b7;/* 0x1000: lui 
t0,0x20010 */
>> > > > +} else {
>> > > > +reset_vec[0] = 0x204002b7;/* 0x1000: lui 
t0,0x20400 */
>> > > > +}
>> > > > +reset_vec[1] = 0x00028067;/* 0x1004: jr  t0 */
>> > > >
>> > > >  /* copy in the reset vector in little_endian byte order */
>> > > >  for (i = 0; i < sizeof(reset_vec) >> 2; i++) {
>> > > > @@ -115,8 +119,31 @@ static void riscv_sifive_e_init(MachineState 
*machine)
>> > > >  }
>> > > >  }
>> > > >
>> > > > +static bool sifive_e_machine_get_revb(Object *obj, Error **errp)
>> > > > +{
>> > > > +SiFiveEState *s = RISCV_E_MACHINE(obj);
>> > > > +
>> > > > +return s->revb;
>> > > > +}
>> > > > +
>> > > > +static void sifive_e_machine_set_revb(Object *obj, bool value, Error 
**errp)
>> > > > +{
>> > > > +SiFiveEState *s = RISCV_E_MACHINE(obj);
>> > > > +
>> > > > +s->revb = value;
>> > > > +}
>> > > > +
>> > > >  static void sifive_e_machine_instance_init(Object *obj)
>> > > >  {
>> > > > +SiFiveEState *s = RISCV_E_MACHINE(obj);
>> > > > +
>> > > > +s->revb = false;
>> > > > +object_property_add_bool(obj, "revb", sifive_e_machine_get_revb,
>> > > > + sifive_e_machine_set_revb, NULL);
>> > > > +object_property_set_description(obj, "revb",
>> > > > +"Set on to tell QEMU that it should 
model "
>> > > > +"the revB HiFive1 board",
>> > > > +NULL);
>> > > >  }
>> > > >
>> > > >  static void sifive_e_machine_class_init(ObjectClass *oc, void *data)
>> > > > diff --git a/include/hw/riscv/sifive_e.h b/include/hw/riscv/sifive_e.h
>> > > > index 414992119e..0d3cd07fcc 100644
>> > > > --- a/include/hw/riscv/sifive_e.h
>> > > > +++ b/include/hw/riscv/sifive_e.h
>> > > > @@ -45,6 +45,7 @@ typedef struct SiFiveEState {
>> > > >
>> > > >  /*< public >*/
>> > > >  SiFiveESoCState soc;
>> > > > +bool revb;
>> > > >  } SiFiveEState;
>> > > >
>> > > >  #define TYPE_RISCV_E_MACHINE MACHINE_TYPE_NAME("sifive_e")
>> > >
>> > > IIRC there are way more differences between the un-suffixed FE310 and 
the Rev
>> > > B, specifically the interrupt map is all different.
>> >
>> > The three IRQs that QEMU uses for the SiFive E (UART0, UART1 and GPIO)
>> > all seem to be the same.
>>
>> Ping!
>
> Ping^2
>
> Applying to RISC-V tree.

They're not: uart0 is interrupt 3 on the rev b but 5 on the non-rev b (which
they don't call rev a but I'm going to :)).  There isn't even a uart1 in the
DTS on the rev a, and the GPIO interrupts are different as well.

The DTS files are in SiFive's SDK:

https://github.com/sifive/freedom-e-sdk/blob/master/bsp/sifive-hifive1-revb/core.dts
https://github.com/sifive/freedom-e-sdk/blob/master/bsp/sifive-hifive1/core.dts

which should also generate some test programs.  When I was there we tested on
QEMU for the platforms that were supported, so there should be some support for
doing so still.


Am I reading the wrong data sheets?

For revB I looked at
https://sifive.cdn.prismic.io/sifive%2F9ecbb623-7c7f-4acc-966f-9bb10ecdb62e_fe310-g002.pdf,
page 46, table 26. The interrupts for the revB match what we currently
have in QEMU.

For revA 
https://sifive.cdn.prismic.io/sifive%2F500a69f8-af3a-4fd9-927f-10ca77077532_fe310-g000.pdf,
page 42, table 21 also matches what we have in QEMU.

To me it looks like both have two UARTs and both have the same
interrupt numbers.

The actual software I am running also hasn't changed UART interrupt
numbering between the two boards and

Re: [PULL V2 00/33] Net patches

2020-06-18 Thread Jason Wang




On 2020/6/18 下午10:05, no-re...@patchew.org wrote:

/tmp/qemu-test/src/tests/qht-bench.c:287:29: error: implicit conversion from 
'unsigned long' to 'double' changes value from 18446744073709551615 to 
18446744073709551616 [-Werror,-Wimplicit-int-float-conversion]
 *threshold = rate * UINT64_MAX;
   ~ ^~
/usr/include/stdint.h:130:23: note: expanded from macro 'UINT64_MAX'
---
18446744073709551615UL
^~



Cc Emilio.

This looks like an issue unrelated to this pull request.

Thanks
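
The warning quoted above arises because UINT64_MAX (2^64 - 1) has no exact double representation and rounds up to 2^64 on conversion. The following standalone sketch shows one way to scale a rate in [0, 1] to a 64-bit threshold without that implicit conversion. It is an illustration only: rate_to_threshold and TWO_POW_64 are hypothetical names, and this is not claimed to be the fix applied to qht-bench.c.

```c
#include <stdint.h>

/* 2^64 as a double; exactly representable because it is a power of two. */
static const double TWO_POW_64 = 18446744073709551616.0;

/* Map a rate in [0, 1] to a threshold in [0, UINT64_MAX], saturating. */
static uint64_t rate_to_threshold(double rate)
{
    if (!(rate > 0.0)) {        /* zero, negative, or NaN */
        return 0;
    }
    if (rate >= 1.0) {          /* rate * 2^64 would not fit in uint64_t */
        return UINT64_MAX;
    }
    /* rate < 1.0, so the product is strictly below 2^64 and the cast is safe. */
    return (uint64_t)(rate * TWO_POW_64);
}
```

Multiplying by a power of two only changes the exponent, so no rounding is introduced before the final conversion to an integer.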

Re: [PATCH v3 0/9] Generalize memory encryption models

Patchew URL: 
https://patchew.org/QEMU/20200619020602.118306-1-da...@gibson.dropbear.id.au/



Hi,

This series failed the asan build test. Please find the testing commands and
their output below. If you have Docker installed, you can probably reproduce it
locally.

=== TEST SCRIPT BEGIN ===
#!/bin/bash
export ARCH=x86_64
make docker-image-fedora V=1 NETWORK=1
time make docker-test-debug@fedora TARGET_LIST=x86_64-softmmu J=14 NETWORK=1
=== TEST SCRIPT END ===

  CC  contrib/vhost-user-input/main.o
  LINKtests/qemu-iotests/socket_scm_helper
  GEN docs/interop/qemu-qmp-ref.html
/usr/bin/ld: 
/usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o):
 warning: common of `__interception::real_vfork' overridden by definition from 
/usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o)
  GEN docs/interop/qemu-qmp-ref.txt
  GEN docs/interop/qemu-qmp-ref.7
  CC  qga/commands.o
---
  SIGNpc-bios/optionrom/pvh.bin
  LINKelf2dmp
  CC  qemu-img.o
/usr/bin/ld: 
/usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o):
 warning: common of `__interception::real_vfork' overridden by definition from 
/usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o)
  LINKqemu-io
  LINKqemu-edid
/usr/bin/ld: 
/usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o):
 warning: common of `__interception::real_vfork' overridden by definition from 
/usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o)
  LINKfsdev/virtfs-proxy-helper
/usr/bin/ld: 
/usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o):
 warning: common of `__interception::real_vfork' overridden by definition from 
/usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o)
/usr/bin/ld: 
/usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o):
 warning: common of `__interception::real_vfork' overridden by definition from 
/usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o)
  LINKscsi/qemu-pr-helper
  LINKqemu-bridge-helper
  AR  libvhost-user.a
---
  GEN docs/interop/qemu-ga-ref.txt
  GEN docs/interop/qemu-ga-ref.7
  LINKqemu-keymap
/usr/bin/ld: 
/usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o):
 warning: common of `__interception::real_vfork' overridden by definition from 
/usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o)
/usr/bin/ld: 
/usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o):
 warning: common of `__interception::real_vfork' overridden by definition from 
/usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o)
  LINKivshmem-client
/usr/bin/ld: 
/usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o):
 warning: common of `__interception::real_vfork' overridden by definition from 
/usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o)
  LINKivshmem-server
/usr/bin/ld: 
/usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o):
 warning: common of `__interception::real_vfork' overridden by definition from 
/usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o)
/usr/bin/ld: 
/usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o):
 warning: common of `__interception::real_vfork' overridden by definition from 
/usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o)
  LINKqemu-nbd
  LINKqemu-storage-daemon
/usr/bin/ld: 
/usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o):
 warning: common of `__interception::real_vfork' overridden by definition from 
/usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o)
/usr/bin/ld: 
/usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o):
 warning: common of `__interception::real_vfork' overridden by definition from 
/usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o)
  LINKvirtiofsd
/usr/bin/ld: 
/usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o):
 warning: common of `__interception::real_vfork' overridden by definition from 
/usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o)
  LINKvhost-user-input
  LINKqemu-ga
/usr/bin/ld: 
/usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o):
 warning: common of `__interception::real_vfork' overridden by definition from 
/usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o)
  LINKqemu-img

[PATCH v3 9/9] host trust limitation: Alter virtio default properties for protected guests

The default behaviour for virtio devices is not to use the platform's normal
DMA paths, but instead to use the fact that it's running in a hypervisor
to directly access guest memory.  That doesn't work if the guest's memory
is protected from hypervisor access, such as with AMD's SEV or POWER's PEF.

So, if a host trust limitation mechanism is enabled, then apply the
iommu_platform=on option so it will go through normal DMA mechanisms.
Those will presumably have some way of marking memory as shared with the
hypervisor or hardware so that DMA will work.

Signed-off-by: David Gibson 
---
 hw/core/machine.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/hw/core/machine.c b/hw/core/machine.c
index a71792bc16..8dfc1bb3f8 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -28,6 +28,8 @@
 #include "hw/mem/nvdimm.h"
 #include "migration/vmstate.h"
 #include "exec/host-trust-limitation.h"
+#include "hw/virtio/virtio.h"
+#include "hw/virtio/virtio-pci.h"
 
 GlobalProperty hw_compat_5_0[] = {
 { "virtio-balloon-device", "page-poison", "false" },
@@ -1165,6 +1167,15 @@ void machine_run_board_init(MachineState *machine)
  * areas.
  */
 machine_set_mem_merge(OBJECT(machine), false, &error_abort);
+
+/*
+ * Virtio devices can't count on directly accessing guest
+ * memory, so they need iommu_platform=on to use normal DMA
+ * mechanisms.  That requires disabling legacy virtio support
+ * for virtio pci devices
+ */
+object_register_sugar_prop(TYPE_VIRTIO_PCI, "disable-legacy", "on");
+object_register_sugar_prop(TYPE_VIRTIO_DEVICE, "iommu_platform", "on");
 }
 
 machine_class->init(machine);
-- 
2.26.2

[PATCH v3 6/9] host trust limitation: Add Error ** to HostTrustLimitation::kvm_init

This allows failures to be reported richly and idiomatically.

Signed-off-by: David Gibson 
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Richard Henderson 
---
 accel/kvm/kvm-all.c  |  4 +++-
 include/exec/host-trust-limitation.h |  2 +-
 target/i386/sev.c| 31 ++--
 3 files changed, 19 insertions(+), 18 deletions(-)

diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index 9645271ca5..c236ebeae0 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -2158,9 +2158,11 @@ static int kvm_init(MachineState *ms)
 if (ms->htl) {
 HostTrustLimitationClass *htlc =
 HOST_TRUST_LIMITATION_GET_CLASS(ms->htl);
+Error *local_err = NULL;
 
-ret = htlc->kvm_init(ms->htl);
+ret = htlc->kvm_init(ms->htl, &local_err);
 if (ret < 0) {
+error_report_err(local_err);
 goto err;
 }
 }
diff --git a/include/exec/host-trust-limitation.h 
b/include/exec/host-trust-limitation.h
index fc30ea3f78..d93b537280 100644
--- a/include/exec/host-trust-limitation.h
+++ b/include/exec/host-trust-limitation.h
@@ -30,7 +30,7 @@
 typedef struct HostTrustLimitationClass {
 InterfaceClass parent;
 
-int (*kvm_init)(HostTrustLimitation *);
+int (*kvm_init)(HostTrustLimitation *, Error **);
 int (*encrypt_data)(HostTrustLimitation *, uint8_t *, uint64_t);
 } HostTrustLimitationClass;
 
diff --git a/target/i386/sev.c b/target/i386/sev.c
index 052a05d15a..829f78436a 100644
--- a/target/i386/sev.c
+++ b/target/i386/sev.c
@@ -617,7 +617,7 @@ sev_vm_state_change(void *opaque, int running, RunState 
state)
 }
 }
 
-static int sev_kvm_init(HostTrustLimitation *htl)
+static int sev_kvm_init(HostTrustLimitation *htl, Error **errp)
 {
 SevGuestState *sev = SEV_GUEST(htl);
 char *devname;
@@ -633,14 +633,14 @@ static int sev_kvm_init(HostTrustLimitation *htl)
 host_cbitpos = ebx & 0x3f;
 
 if (host_cbitpos != sev->cbitpos) {
-error_report("%s: cbitpos check failed, host '%d' requested '%d'",
- __func__, host_cbitpos, sev->cbitpos);
+error_setg(errp, "%s: cbitpos check failed, host '%d' requested '%d'",
+   __func__, host_cbitpos, sev->cbitpos);
 goto err;
 }
 
 if (sev->reduced_phys_bits < 1) {
-error_report("%s: reduced_phys_bits check failed, it should be >=1,"
- " requested '%d'", __func__, sev->reduced_phys_bits);
+error_setg(errp, "%s: reduced_phys_bits check failed, it should be 
>=1,"
+   " requested '%d'", __func__, sev->reduced_phys_bits);
 goto err;
 }
 
@@ -649,20 +649,19 @@ static int sev_kvm_init(HostTrustLimitation *htl)
 devname = object_property_get_str(OBJECT(sev), "sev-device", NULL);
 sev->sev_fd = open(devname, O_RDWR);
 if (sev->sev_fd < 0) {
-error_report("%s: Failed to open %s '%s'", __func__,
- devname, strerror(errno));
-}
-g_free(devname);
-if (sev->sev_fd < 0) {
+error_setg(errp, "%s: Failed to open %s '%s'", __func__,
+   devname, strerror(errno));
+g_free(devname);
 goto err;
 }
+g_free(devname);
 
 ret = sev_platform_ioctl(sev->sev_fd, SEV_PLATFORM_STATUS, &status,
  &fw_error);
 if (ret) {
-error_report("%s: failed to get platform status ret=%d "
- "fw_error='%d: %s'", __func__, ret, fw_error,
- fw_error_to_str(fw_error));
+error_setg(errp, "%s: failed to get platform status ret=%d "
+   "fw_error='%d: %s'", __func__, ret, fw_error,
+   fw_error_to_str(fw_error));
 goto err;
 }
 sev->build_id = status.build;
@@ -672,14 +671,14 @@ static int sev_kvm_init(HostTrustLimitation *htl)
 trace_kvm_sev_init();
 ret = sev_ioctl(sev->sev_fd, KVM_SEV_INIT, NULL, &fw_error);
 if (ret) {
-error_report("%s: failed to initialize ret=%d fw_error=%d '%s'",
- __func__, ret, fw_error, fw_error_to_str(fw_error));
+error_setg(errp, "%s: failed to initialize ret=%d fw_error=%d '%s'",
+   __func__, ret, fw_error, fw_error_to_str(fw_error));
 goto err;
 }
 
 ret = sev_launch_start(sev);
 if (ret) {
-error_report("%s: failed to create encryption context", __func__);
+error_setg(errp, "%s: failed to create encryption context", __func__);
 goto err;
 }
 
-- 
2.26.2

[PATCH v3 5/9] host trust limitation: Decouple kvm_memcrypt_*() helpers from KVM

The kvm_memcrypt_enabled() and kvm_memcrypt_encrypt_data() helper functions
don't conceptually have any connection to KVM (although it's not possible
in practice to use them without it).

They also rely on looking at the global KVMState.  But the same information
is available from the machine, and the only existing callers have natural
access to the machine state.

Therefore, move and rename them to helpers in host-trust-limitation.h,
taking an explicit machine parameter.

Signed-off-by: David Gibson 
Reviewed-by: Richard Henderson 
---
 accel/kvm/kvm-all.c  | 27 -
 accel/stubs/kvm-stub.c   | 10 
 hw/i386/pc_sysfw.c   |  6 +++--
 include/exec/host-trust-limitation.h | 36 
 include/sysemu/kvm.h | 17 -
 5 files changed, 40 insertions(+), 56 deletions(-)

diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index d8e8fa345e..9645271ca5 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -118,9 +118,6 @@ struct KVMState
 KVMMemoryListener memory_listener;
 QLIST_HEAD(, KVMParkedVcpu) kvm_parked_vcpus;
 
-/* host trust limitation (e.g. by guest memory encryption) */
-HostTrustLimitation *htl;
-
 /* For "info mtree -f" to tell if an MR is registered in KVM */
 int nr_as;
 struct KVMAs {
@@ -219,28 +216,6 @@ int kvm_get_max_memslots(void)
 return s->nr_slots;
 }
 
-bool kvm_memcrypt_enabled(void)
-{
-if (kvm_state && kvm_state->htl) {
-return true;
-}
-
-return false;
-}
-
-int kvm_memcrypt_encrypt_data(uint8_t *ptr, uint64_t len)
-{
-HostTrustLimitation *htl = kvm_state->htl;
-
-if (htl) {
-HostTrustLimitationClass *htlc = HOST_TRUST_LIMITATION_GET_CLASS(htl);
-
-return htlc->encrypt_data(htl, ptr, len);
-}
-
-return 1;
-}
-
 /* Called with KVMMemoryListener.slots_lock held */
 static KVMSlot *kvm_get_free_slot(KVMMemoryListener *kml)
 {
@@ -2188,8 +2163,6 @@ static int kvm_init(MachineState *ms)
 if (ret < 0) {
 goto err;
 }
-
-kvm_state->htl = ms->htl;
 }
 
 ret = kvm_arch_init(ms, s);
diff --git a/accel/stubs/kvm-stub.c b/accel/stubs/kvm-stub.c
index 82f118d2df..78b3eef117 100644
--- a/accel/stubs/kvm-stub.c
+++ b/accel/stubs/kvm-stub.c
@@ -104,16 +104,6 @@ int kvm_on_sigbus(int code, void *addr)
 return 1;
 }
 
-bool kvm_memcrypt_enabled(void)
-{
-return false;
-}
-
-int kvm_memcrypt_encrypt_data(uint8_t *ptr, uint64_t len)
-{
-  return 1;
-}
-
 #ifndef CONFIG_USER_ONLY
 int kvm_irqchip_add_msi_route(KVMState *s, int vector, PCIDevice *dev)
 {
diff --git a/hw/i386/pc_sysfw.c b/hw/i386/pc_sysfw.c
index ec2a3b3e7e..cab5ac5695 100644
--- a/hw/i386/pc_sysfw.c
+++ b/hw/i386/pc_sysfw.c
@@ -38,6 +38,7 @@
 #include "sysemu/sysemu.h"
 #include "hw/block/flash.h"
 #include "sysemu/kvm.h"
+#include "exec/host-trust-limitation.h"
 
 /*
  * We don't have a theoretically justifiable exact lower bound on the base
@@ -196,10 +197,11 @@ static void pc_system_flash_map(PCMachineState *pcms,
 pc_isa_bios_init(rom_memory, flash_mem, size);
 
 /* Encrypt the pflash boot ROM */
-if (kvm_memcrypt_enabled()) {
+if (host_trust_limitation_enabled(MACHINE(pcms))) {
 flash_ptr = memory_region_get_ram_ptr(flash_mem);
 flash_size = memory_region_size(flash_mem);
-ret = kvm_memcrypt_encrypt_data(flash_ptr, flash_size);
+ret = host_trust_limitation_encrypt(MACHINE(pcms),
+flash_ptr, flash_size);
 if (ret) {
 error_report("failed to encrypt pflash rom");
 exit(1);
diff --git a/include/exec/host-trust-limitation.h 
b/include/exec/host-trust-limitation.h
index a19f12ae14..fc30ea3f78 100644
--- a/include/exec/host-trust-limitation.h
+++ b/include/exec/host-trust-limitation.h
@@ -14,6 +14,7 @@
 #define QEMU_HOST_TRUST_LIMITATION_H
 
 #include "qom/object.h"
+#include "hw/boards.h"
 
 #define TYPE_HOST_TRUST_LIMITATION "host-trust-limitation"
 #define HOST_TRUST_LIMITATION(obj)\
@@ -33,4 +34,39 @@ typedef struct HostTrustLimitationClass {
 int (*encrypt_data)(HostTrustLimitation *, uint8_t *, uint64_t);
 } HostTrustLimitationClass;
 
+/**
+ * host_trust_limitation_enabled - return whether guest memory is protected
+ * from hypervisor access (with memory
+ * encryption or otherwise)
+ * Returns: true guest memory is not directly accessible to qemu
+ *  false guest memory is directly accessible to qemu
+ */
+static inline bool host_trust_limitation_enabled(MachineState *machine)
+{
+return !!machine->htl;
+}
+
+/**
+ * host_trust_limitation_encrypt: encrypt the memory range to make
+ *it guest accessible
+ *
+ *

[PATCH v3 7/9] spapr: Add PEF based host trust limitation

Some upcoming POWER machines have a system called PEF (Protected
Execution Facility) which uses a small ultravisor to allow guests to
run in a way that they can't be eavesdropped by the hypervisor.  The
effect is roughly similar to AMD SEV, although the mechanisms are
quite different.

Most of the work of this is done between the guest, KVM and the
ultravisor, with little need for involvement by qemu.  However qemu
does need to tell KVM to allow secure VMs.

Because the availability of secure mode is a guest visible difference
which depends on having the right hardware and firmware, we don't
enable this by default.  In order to run a secure guest you need to
create a "pef-guest" object and set the host-trust-limitation machine
property to point to it.

Note that this just *allows* secure guests, the architecture of PEF is
such that the guest still needs to talk to the ultravisor to enter
secure mode.  Qemu has no direct way of knowing if the guest is in
secure mode, and certainly can't know until well after machine
creation time.

Signed-off-by: David Gibson 
Acked-by: Ram Pai 
---
 target/ppc/Makefile.objs |  2 +-
 target/ppc/pef.c | 83 
 2 files changed, 84 insertions(+), 1 deletion(-)
 create mode 100644 target/ppc/pef.c

diff --git a/target/ppc/Makefile.objs b/target/ppc/Makefile.objs
index e8fa18ce13..ac93b9700e 100644
--- a/target/ppc/Makefile.objs
+++ b/target/ppc/Makefile.objs
@@ -6,7 +6,7 @@ obj-y += machine.o mmu_helper.o mmu-hash32.o monitor.o arch_dump.o
 obj-$(TARGET_PPC64) += mmu-hash64.o mmu-book3s-v3.o compat.o
 obj-$(TARGET_PPC64) += mmu-radix64.o
 endif
-obj-$(CONFIG_KVM) += kvm.o
+obj-$(CONFIG_KVM) += kvm.o pef.o
 obj-$(call lnot,$(CONFIG_KVM)) += kvm-stub.o
 obj-y += dfp_helper.o
 obj-y += excp_helper.o
diff --git a/target/ppc/pef.c b/target/ppc/pef.c
new file mode 100644
index 0000000000..53a6af0347
--- /dev/null
+++ b/target/ppc/pef.c
@@ -0,0 +1,83 @@
+/*
+ * PEF (Protected Execution Facility) for POWER support
+ *
+ * Copyright David Gibson, Redhat Inc. 2020
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ *
+ */
+
+#include "qemu/osdep.h"
+
+#include "qapi/error.h"
+#include "qom/object_interfaces.h"
+#include "sysemu/kvm.h"
+#include "migration/blocker.h"
+#include "exec/host-trust-limitation.h"
+
+#define TYPE_PEF_GUEST "pef-guest"
+#define PEF_GUEST(obj)  \
+OBJECT_CHECK(PefGuestState, (obj), TYPE_PEF_GUEST)
+
+typedef struct PefGuestState PefGuestState;
+
+/**
+ * PefGuestState:
+ *
+ * The PefGuestState object is used for creating and managing a PEF
+ * guest.
+ *
+ * # $QEMU \
+ * -object pef-guest,id=pef0 \
+ * -machine ...,host-trust-limitation=pef0
+ */
+struct PefGuestState {
+Object parent_obj;
+};
+
+static int pef_kvm_init(HostTrustLimitation *gmpo, Error **errp)
+{
+if (!kvm_check_extension(kvm_state, KVM_CAP_PPC_SECURE_GUEST)) {
+error_setg(errp,
+   "KVM implementation does not support Secure VMs (is an ultravisor running?)");
+return -1;
+} else {
+int ret = kvm_vm_enable_cap(kvm_state, KVM_CAP_PPC_SECURE_GUEST, 0, 1);
+
+if (ret < 0) {
+error_setg(errp,
+   "Error enabling PEF with KVM");
+return -1;
+}
+}
+
+return 0;
+}
+
+static void pef_guest_class_init(ObjectClass *oc, void *data)
+{
+HostTrustLimitationClass *gmpc = HOST_TRUST_LIMITATION_CLASS(oc);
+
+gmpc->kvm_init = pef_kvm_init;
+}
+
+static const TypeInfo pef_guest_info = {
+.parent = TYPE_OBJECT,
+.name = TYPE_PEF_GUEST,
+.instance_size = sizeof(PefGuestState),
+.class_init = pef_guest_class_init,
+.interfaces = (InterfaceInfo[]) {
+{ TYPE_HOST_TRUST_LIMITATION },
+{ TYPE_USER_CREATABLE },
+{ }
+}
+};
+
+static void
+pef_register_types(void)
+{
+type_register_static(&pef_guest_info);
+}
+
+type_init(pef_register_types);
-- 
2.26.2

[PATCH v3 1/9] host trust limitation: Introduce new host trust limitation interface

Several architectures have mechanisms which are designed to protect guest
memory from interference or eavesdropping by a compromised hypervisor.  AMD
SEV does this with in-chip memory encryption and Intel has a similar
mechanism.  POWER's Protected Execution Facility (PEF) accomplishes a
similar goal using an ultravisor and new memory protection features,
instead of encryption.

To (partially) unify handling for these, this introduces a new
HostTrustLimitation QOM interface.

Signed-off-by: David Gibson 
---
 backends/Makefile.objs   |  2 ++
 backends/host-trust-limitation.c | 29 
 include/exec/host-trust-limitation.h | 33 
 include/qemu/typedefs.h  |  1 +
 4 files changed, 65 insertions(+)
 create mode 100644 backends/host-trust-limitation.c
 create mode 100644 include/exec/host-trust-limitation.h

diff --git a/backends/Makefile.objs b/backends/Makefile.objs
index 28a847cd57..af761c9ab1 100644
--- a/backends/Makefile.objs
+++ b/backends/Makefile.objs
@@ -21,3 +21,5 @@ common-obj-$(CONFIG_LINUX) += hostmem-memfd.o
 common-obj-$(CONFIG_GIO) += dbus-vmstate.o
 dbus-vmstate.o-cflags = $(GIO_CFLAGS)
 dbus-vmstate.o-libs = $(GIO_LIBS)
+
+common-obj-y += host-trust-limitation.o
diff --git a/backends/host-trust-limitation.c b/backends/host-trust-limitation.c
new file mode 100644
index 0000000000..96a381cd8a
--- /dev/null
+++ b/backends/host-trust-limitation.c
@@ -0,0 +1,29 @@
+/*
+ * QEMU Host Trust Limitation interface
+ *
+ * Copyright: David Gibson, Red Hat Inc. 2020
+ *
+ * Authors:
+ *  David Gibson 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or
+ * later.  See the COPYING file in the top-level directory.
+ *
+ */
+
+#include "qemu/osdep.h"
+
+#include "exec/host-trust-limitation.h"
+
+static const TypeInfo host_trust_limitation_info = {
+.name = TYPE_HOST_TRUST_LIMITATION,
+.parent = TYPE_INTERFACE,
+.class_size = sizeof(HostTrustLimitationClass),
+};
+
+static void host_trust_limitation_register_types(void)
+{
+type_register_static(&host_trust_limitation_info);
+}
+
+type_init(host_trust_limitation_register_types)
diff --git a/include/exec/host-trust-limitation.h b/include/exec/host-trust-limitation.h
new file mode 100644
index 0000000000..03887b1be1
--- /dev/null
+++ b/include/exec/host-trust-limitation.h
@@ -0,0 +1,33 @@
+/*
+ * QEMU Host Trust Limitation interface
+ *
+ * Copyright: David Gibson, Red Hat Inc. 2020
+ *
+ * Authors:
+ *  David Gibson 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or
+ * later.  See the COPYING file in the top-level directory.
+ *
+ */
+#ifndef QEMU_HOST_TRUST_LIMITATION_H
+#define QEMU_HOST_TRUST_LIMITATION_H
+
+#include "qom/object.h"
+
+#define TYPE_HOST_TRUST_LIMITATION "host-trust-limitation"
+#define HOST_TRUST_LIMITATION(obj)\
+INTERFACE_CHECK(HostTrustLimitation, (obj),   \
+TYPE_HOST_TRUST_LIMITATION)
+#define HOST_TRUST_LIMITATION_CLASS(klass)\
+OBJECT_CLASS_CHECK(HostTrustLimitationClass, (klass), \
+   TYPE_HOST_TRUST_LIMITATION)
+#define HOST_TRUST_LIMITATION_GET_CLASS(obj)  \
+OBJECT_GET_CLASS(HostTrustLimitationClass, (obj), \
+ TYPE_HOST_TRUST_LIMITATION)
+
+typedef struct HostTrustLimitationClass {
+InterfaceClass parent;
+} HostTrustLimitationClass;
+
+#endif /* QEMU_HOST_TRUST_LIMITATION_H */
diff --git a/include/qemu/typedefs.h b/include/qemu/typedefs.h
index ce4a78b687..f75c7eb2f2 100644
--- a/include/qemu/typedefs.h
+++ b/include/qemu/typedefs.h
@@ -51,6 +51,7 @@ typedef struct FWCfgIoState FWCfgIoState;
 typedef struct FWCfgMemState FWCfgMemState;
 typedef struct FWCfgState FWCfgState;
 typedef struct HostMemoryBackend HostMemoryBackend;
+typedef struct HostTrustLimitation HostTrustLimitation;
 typedef struct I2CBus I2CBus;
 typedef struct I2SCodec I2SCodec;
 typedef struct IOMMUMemoryRegion IOMMUMemoryRegion;
-- 
2.26.2

[PATCH v3 8/9] spapr: PEF: block migration

We haven't yet implemented the fairly involved handshaking that will be
needed to migrate PEF protected guests.  For now, just use a migration
blocker so we get a meaningful error if someone attempts this (this is the
same approach used by AMD SEV).
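
A minimal standalone sketch of the blocker idea, for illustration only.
It does not use QEMU's migration API: add_blocker(), try_migrate() and
the fixed-size list are hypothetical stand-ins.  The point is that once
a reason has been registered, any migration attempt fails with that
reason instead of failing in some obscure way later.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a migration blocker list; not QEMU's API. */
#define MAX_BLOCKERS 8

static const char *blockers[MAX_BLOCKERS];
static size_t nr_blockers;

/* Register a reason why migration must be refused. Returns 0 on success. */
static int add_blocker(const char *reason)
{
    if (nr_blockers == MAX_BLOCKERS) {
        return -1;
    }
    blockers[nr_blockers++] = reason;
    return 0;
}

/*
 * Attempt a migration: refuse with the first registered reason if any
 * blocker exists. *reason is only written on failure.
 */
static bool try_migrate(const char **reason)
{
    if (nr_blockers > 0) {
        *reason = blockers[0];
        return false;
    }
    return true;
}
```

In this model, calling add_blocker() from the init path corresponds to
what pef_kvm_init() does once the capability has been enabled.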

Signed-off-by: David Gibson 
---
 target/ppc/pef.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/target/ppc/pef.c b/target/ppc/pef.c
index 53a6af0347..6a50efd580 100644
--- a/target/ppc/pef.c
+++ b/target/ppc/pef.c
@@ -36,6 +36,8 @@ struct PefGuestState {
 Object parent_obj;
 };
 
+static Error *pef_mig_blocker;
+
 static int pef_kvm_init(HostTrustLimitation *gmpo, Error **errp)
 {
 if (!kvm_check_extension(kvm_state, KVM_CAP_PPC_SECURE_GUEST)) {
@@ -52,6 +54,10 @@ static int pef_kvm_init(HostTrustLimitation *gmpo, Error **errp)
 }
 }
 
+/* add migration blocker */
+error_setg(&pef_mig_blocker, "PEF: Migration is not implemented");
+migrate_add_blocker(pef_mig_blocker, &error_abort);
+
 return 0;
 }
 
-- 
2.26.2

[PATCH v3 2/9] host trust limitation: Handle memory encryption via interface

At the moment AMD SEV sets a special function pointer, plus an opaque
handle in KVMState to let things know how to encrypt guest memory.

Now that we have a QOM interface for handling things related to host trust
limitation, use a QOM method on that interface, rather than a bare function
pointer for this.
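
As a rough illustration of the difference, here is a standalone sketch
of dispatching through a per-class method table instead of a bare
function pointer plus opaque handle.  It is not QOM code: Protector,
ProtectorClass and the XOR "encryption" are hypothetical and exist only
to make the example self-contained.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified model of an interface with a method table. */
typedef struct Protector Protector;

typedef struct ProtectorClass {
    int (*encrypt_data)(Protector *p, uint8_t *ptr, uint64_t len);
} ProtectorClass;

struct Protector {
    const ProtectorClass *klass;
    uint8_t key; /* toy per-object state */
};

/* Toy "encryption": XOR every byte with the object's key. */
static int xor_encrypt(Protector *p, uint8_t *ptr, uint64_t len)
{
    for (uint64_t i = 0; i < len; i++) {
        ptr[i] ^= p->key;
    }
    return 0;
}

static const ProtectorClass xor_class = { .encrypt_data = xor_encrypt };

/*
 * Generic caller.  Like kvm_memcrypt_encrypt_data() in the patch, it
 * returns 1 when no protection object is configured and otherwise
 * forwards to the method of the object's class.
 */
static int encrypt_guest_data(Protector *p, uint8_t *ptr, uint64_t len)
{
    if (p) {
        return p->klass->encrypt_data(p, ptr, len);
    }
    return 1;
}
```

The caller only needs the object; the typed object pointer replaces the
void * handle, and the class supplies the method.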

Signed-off-by: David Gibson 
Reviewed-by: Richard Henderson 
---
 accel/kvm/kvm-all.c  |  38 ++---
 accel/kvm/sev-stub.c |   7 +-
 include/exec/host-trust-limitation.h |   3 +
 include/sysemu/sev.h |   4 +-
 target/i386/sev.c| 117 +++
 5 files changed, 79 insertions(+), 90 deletions(-)

diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index f24d7da783..1e43e27f45 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -39,12 +39,12 @@
 #include "qemu/main-loop.h"
 #include "trace.h"
 #include "hw/irq.h"
-#include "sysemu/sev.h"
 #include "sysemu/balloon.h"
 #include "qapi/visitor.h"
 #include "qapi/qapi-types-common.h"
 #include "qapi/qapi-visit-common.h"
 #include "sysemu/reset.h"
+#include "exec/host-trust-limitation.h"
 
 #include "hw/boards.h"
 
@@ -118,9 +118,8 @@ struct KVMState
 KVMMemoryListener memory_listener;
 QLIST_HEAD(, KVMParkedVcpu) kvm_parked_vcpus;
 
-/* memory encryption */
-void *memcrypt_handle;
-int (*memcrypt_encrypt_data)(void *handle, uint8_t *ptr, uint64_t len);
+/* host trust limitation (e.g. by guest memory encryption) */
+HostTrustLimitation *htl;
 
 /* For "info mtree -f" to tell if an MR is registered in KVM */
 int nr_as;
@@ -222,7 +221,7 @@ int kvm_get_max_memslots(void)
 
 bool kvm_memcrypt_enabled(void)
 {
-if (kvm_state && kvm_state->memcrypt_handle) {
+if (kvm_state && kvm_state->htl) {
 return true;
 }
 
@@ -231,10 +230,12 @@ bool kvm_memcrypt_enabled(void)
 
 int kvm_memcrypt_encrypt_data(uint8_t *ptr, uint64_t len)
 {
-if (kvm_state->memcrypt_handle &&
-kvm_state->memcrypt_encrypt_data) {
-return kvm_state->memcrypt_encrypt_data(kvm_state->memcrypt_handle,
-  ptr, len);
+HostTrustLimitation *htl = kvm_state->htl;
+
+if (htl) {
+HostTrustLimitationClass *htlc = HOST_TRUST_LIMITATION_GET_CLASS(htl);
+
+return htlc->encrypt_data(htl, ptr, len);
 }
 
 return 1;
@@ -2180,13 +2181,24 @@ static int kvm_init(MachineState *ms)
  * encryption context.
  */
 if (ms->memory_encryption) {
-kvm_state->memcrypt_handle = sev_guest_init(ms->memory_encryption);
-if (!kvm_state->memcrypt_handle) {
+Object *obj = object_resolve_path_component(object_get_objects_root(),
+ms->memory_encryption);
+
+if (object_dynamic_cast(obj, TYPE_HOST_TRUST_LIMITATION)) {
+HostTrustLimitation *htl = HOST_TRUST_LIMITATION(obj);
+HostTrustLimitationClass *htlc
+= HOST_TRUST_LIMITATION_GET_CLASS(htl);
+
+ret = htlc->kvm_init(htl);
+if (ret < 0) {
+goto err;
+}
+
+kvm_state->htl = htl;
+} else {
 ret = -1;
 goto err;
 }
-
-kvm_state->memcrypt_encrypt_data = sev_encrypt_data;
 }
 
 ret = kvm_arch_init(ms, s);
diff --git a/accel/kvm/sev-stub.c b/accel/kvm/sev-stub.c
index 4f97452585..9c7c897593 100644
--- a/accel/kvm/sev-stub.c
+++ b/accel/kvm/sev-stub.c
@@ -15,12 +15,7 @@
 #include "qemu-common.h"
 #include "sysemu/sev.h"
 
-int sev_encrypt_data(void *handle, uint8_t *ptr, uint64_t len)
-{
-abort();
-}
-
-void *sev_guest_init(const char *id)
+HostTrustLimitation *sev_guest_init(const char *id)
 {
 return NULL;
 }
diff --git a/include/exec/host-trust-limitation.h b/include/exec/host-trust-limitation.h
index 03887b1be1..a19f12ae14 100644
--- a/include/exec/host-trust-limitation.h
+++ b/include/exec/host-trust-limitation.h
@@ -28,6 +28,9 @@
 
 typedef struct HostTrustLimitationClass {
 InterfaceClass parent;
+
+int (*kvm_init)(HostTrustLimitation *);
+int (*encrypt_data)(HostTrustLimitation *, uint8_t *, uint64_t);
 } HostTrustLimitationClass;
 
 #endif /* QEMU_HOST_TRUST_LIMITATION_H */
diff --git a/include/sysemu/sev.h b/include/sysemu/sev.h
index 98c1ec8d38..a4aee6a87d 100644
--- a/include/sysemu/sev.h
+++ b/include/sysemu/sev.h
@@ -16,6 +16,6 @@
 
 #include "sysemu/kvm.h"
 
-void *sev_guest_init(const char *id);
-int sev_encrypt_data(void *handle, uint8_t *ptr, uint64_t len);
+HostTrustLimitation *sev_guest_init(const char *id);
+
 #endif
diff --git a/target/i386/sev.c b/target/i386/sev.c
index d273174ad3..052a05d15a 100644
--- a/target/i386/sev.c
+++ b/target/i386/sev.c
@@ -28,6 +28,7 @@
 #include "sysemu/runstate.h"
 #include "trace.h"
 #include "migration/blocker.h"
+#include "exec/host-trust-limitation.h"
 
 #define TYPE_SEV_GUEST "sev-guest"
 #define SEV_GUEST(obj)

[PATCH v3 0/9] Generalize memory encryption models

A number of hardware platforms are implementing mechanisms whereby the
hypervisor does not have unfettered access to guest memory, in order
to mitigate the security impact of a compromised hypervisor.

AMD's SEV implements this with in-cpu memory encryption, and Intel has
its own memory encryption mechanism.  POWER has an upcoming mechanism
to accomplish this in a different way, using a new memory protection
level plus a small trusted ultravisor.  s390 also has a protected
execution environment.

The current code (committed or draft) for these features has each
platform's version configured entirely differently.  That doesn't seem
ideal for users, or particularly for management layers.

AMD SEV introduces a notionally generic machine option
"memory-encryption", but it doesn't actually cover any cases other
than SEV.

This series is a proposal to at least partially unify configuration
for these mechanisms, by renaming and generalizing AMD's
"memory-encryption" property.  It is replaced by a
"host-trust-limitation" property pointing to a platform specific
object which configures and manages the specific details.

For now this series covers just AMD SEV and POWER PEF.  I'm hoping it
can be extended to cover the Intel and s390 mechanisms as well,
though.

Please apply.

Changes since RFCv2:
 * Rebased
 * Removed preliminary SEV cleanups (they've been merged)
 * Changed name to "host trust limitation"
 * Added migration blocker to the PEF code (based on SEV's version)
Changes since RFCv1:
 * Rebased
 * Fixed some errors pointed out by Dave Gilbert

David Gibson (9):
  host trust limitation: Introduce new host trust limitation interface
  host trust limitation: Handle memory encryption via interface
  host trust limitation: Move side effect out of
machine_set_memory_encryption()
  host trust limitation: Rework the "memory-encryption" property
  host trust limitation: Decouple kvm_memcrypt_*() helpers from KVM
  host trust limitation: Add Error ** to HostTrustLimitation::kvm_init
  spapr: Add PEF based host trust limitation
  spapr: PEF: block migration
  host trust limitation: Alter virtio default properties for protected
guests

 accel/kvm/kvm-all.c  |  40 ++--
 accel/kvm/sev-stub.c |   7 +-
 accel/stubs/kvm-stub.c   |  10 --
 backends/Makefile.objs   |   2 +
 backends/host-trust-limitation.c |  29 ++
 hw/core/machine.c|  61 +--
 hw/i386/pc_sysfw.c   |   6 +-
 include/exec/host-trust-limitation.h |  72 +
 include/hw/boards.h  |   2 +-
 include/qemu/typedefs.h  |   1 +
 include/sysemu/kvm.h |  17 
 include/sysemu/sev.h |   4 +-
 target/i386/sev.c| 146 ---
 target/ppc/Makefile.objs |   2 +-
 target/ppc/pef.c |  89 
 15 files changed, 325 insertions(+), 163 deletions(-)
 create mode 100644 backends/host-trust-limitation.c
 create mode 100644 include/exec/host-trust-limitation.h
 create mode 100644 target/ppc/pef.c

-- 
2.26.2

[PATCH v3 3/9] host trust limitation: Move side effect out of machine_set_memory_encryption()

When the "memory-encryption" property is set, we also disable KSM
merging for the guest, since it won't accomplish anything.

We want that, but doing it in the property set function itself is
theoretically incorrect, in the unlikely event of some configuration
environment that set the property then cleared it again before
constructing the guest.

More importantly, it makes some other cleanups we want more difficult.
So, instead move this logic to machine_run_board_init() conditional on
the final value of the property.
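
A small standalone sketch of why the ordering matters.  The Machine
struct and both functions are hypothetical miniatures, not the real
MachineState code: the setter only records the value, and the decision
about merging is taken once, at init time, from the final value.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical miniature of the machine state; names are illustrative. */
typedef struct Machine {
    const char *memory_encryption; /* NULL when unset */
    bool mem_merge;                /* KSM merging, on by default */
} Machine;

/* The setter only records the value; it has no other side effects. */
static void set_memory_encryption(Machine *m, const char *value)
{
    m->memory_encryption = value;
}

/* Init-time hook: act on the final value of the property. */
static void run_board_init(Machine *m)
{
    if (m->memory_encryption) {
        m->mem_merge = false;
    }
}
```

With this split, setting the property and then clearing it again before
init leaves merging enabled, which the side effect in the setter could
not guarantee.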

Signed-off-by: David Gibson 
Reviewed-by: Richard Henderson 
---
 hw/core/machine.c | 17 +
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/hw/core/machine.c b/hw/core/machine.c
index 1d80ab0e1d..fdc0c7e038 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -435,14 +435,6 @@ static void machine_set_memory_encryption(Object *obj, 
const char *value,
 
 g_free(ms->memory_encryption);
 ms->memory_encryption = g_strdup(value);
-
-/*
- * With memory encryption, the host can't see the real contents of RAM,
- * so there's no point in it trying to merge areas.
- */
-if (value) {
-machine_set_mem_merge(obj, false, errp);
-}
 }
 
 static bool machine_get_nvdimm(Object *obj, Error **errp)
@@ -1135,6 +1127,15 @@ void machine_run_board_init(MachineState *machine)
 }
 }
 
+if (machine->memory_encryption) {
+/*
+ * With host trust limitation, the host can't see the real
+ * contents of RAM, so there's no point in it trying to merge
+ * areas.
+ */
+machine_set_mem_merge(OBJECT(machine), false, &error_abort);
+}
+
 machine_class->init(machine);
 }
 
-- 
2.26.2

[PATCH v3 4/9] host trust limitation: Rework the "memory-encryption" property

Currently the "memory-encryption" property is only looked at once we get to
kvm_init().  Although protection of guest memory from the hypervisor isn't
something that could really ever work with TCG, it's not conceptually tied
to the KVM accelerator.

In addition, the way the string property is resolved to an object is
almost identical to how a QOM link property is handled.

So, create a new "host-trust-limitation" link property which sets this QOM
interface link directly in the machine.  For compatibility we keep the
"memory-encryption" property, but now implemented in terms of the new
property.
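
To illustrate the compatibility scheme, a standalone sketch follows.
Obj, Machine, resolve() and the array used as an object registry are
hypothetical simplifications, not QOM: the legacy string setter looks
the id up and forwards to the link, and the legacy getter reports the
id of whatever object is linked.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical miniature objects; not QOM types. */
typedef struct Obj {
    const char *id;
} Obj;

typedef struct Machine {
    Obj *htl; /* the link; NULL when unset */
} Machine;

/* Look an object up by id in a flat registry. */
static Obj *resolve(Obj *objs, size_t n, const char *id)
{
    for (size_t i = 0; i < n; i++) {
        if (strcmp(objs[i].id, id) == 0) {
            return &objs[i];
        }
    }
    return NULL;
}

/* New-style property: set the link directly. */
static void set_htl(Machine *m, Obj *o)
{
    m->htl = o;
}

/* Legacy setter: resolve the id, then forward to the link. */
static int set_memory_encryption(Machine *m, Obj *objs, size_t n,
                                 const char *id)
{
    Obj *o = resolve(objs, n, id);

    if (!o) {
        return -1; /* no such object */
    }
    set_htl(m, o);
    return 0;
}

/* Legacy getter: id of the linked object, or NULL if none. */
static const char *get_memory_encryption(const Machine *m)
{
    return m->htl ? m->htl->id : NULL;
}
```

Only the link holds state, so the two properties cannot disagree.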

Signed-off-by: David Gibson 
---
 accel/kvm/kvm-all.c | 23 +++
 hw/core/machine.c   | 41 -
 include/hw/boards.h |  2 +-
 3 files changed, 44 insertions(+), 22 deletions(-)

diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index 1e43e27f45..d8e8fa345e 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -2180,25 +2180,16 @@ static int kvm_init(MachineState *ms)
  * if memory encryption object is specified then initialize the memory
  * encryption context.
  */
-if (ms->memory_encryption) {
-Object *obj = object_resolve_path_component(object_get_objects_root(),
-ms->memory_encryption);
-
-if (object_dynamic_cast(obj, TYPE_HOST_TRUST_LIMITATION)) {
-HostTrustLimitation *htl = HOST_TRUST_LIMITATION(obj);
-HostTrustLimitationClass *htlc
-= HOST_TRUST_LIMITATION_GET_CLASS(htl);
-
-ret = htlc->kvm_init(htl);
-if (ret < 0) {
-goto err;
-}
+if (ms->htl) {
+HostTrustLimitationClass *htlc =
+HOST_TRUST_LIMITATION_GET_CLASS(ms->htl);
 
-kvm_state->htl = htl;
-} else {
-ret = -1;
+ret = htlc->kvm_init(ms->htl);
+if (ret < 0) {
 goto err;
 }
+
+kvm_state->htl = ms->htl;
 }
 
 ret = kvm_arch_init(ms, s);
diff --git a/hw/core/machine.c b/hw/core/machine.c
index fdc0c7e038..a71792bc16 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -27,6 +27,7 @@
 #include "hw/pci/pci.h"
 #include "hw/mem/nvdimm.h"
 #include "migration/vmstate.h"
+#include "exec/host-trust-limitation.h"
 
 GlobalProperty hw_compat_5_0[] = {
 { "virtio-balloon-device", "page-poison", "false" },
@@ -425,16 +426,37 @@ static char *machine_get_memory_encryption(Object *obj, 
Error **errp)
 {
 MachineState *ms = MACHINE(obj);
 
-return g_strdup(ms->memory_encryption);
+if (ms->htl) {
+return object_get_canonical_path_component(OBJECT(ms->htl));
+}
+
+return NULL;
 }
 
 static void machine_set_memory_encryption(Object *obj, const char *value,
 Error **errp)
 {
-MachineState *ms = MACHINE(obj);
+Object *htl =
+object_resolve_path_component(object_get_objects_root(), value);
+
+if (!htl) {
+error_setg(errp, "No such memory encryption object '%s'", value);
+return;
+}
 
-g_free(ms->memory_encryption);
-ms->memory_encryption = g_strdup(value);
+object_property_set_link(obj, htl, "host-trust-limitation", errp);
+}
+
+static void machine_check_host_trust_limitation(const Object *obj,
+const char *name,
+Object *new_target,
+Error **errp)
+{
+/*
+ * So far the only constraint is that the target has the
+ * TYPE_HOST_TRUST_LIMITATION interface, and that's checked by the
+ * QOM core
+ */
 }
 
 static bool machine_get_nvdimm(Object *obj, Error **errp)
@@ -855,6 +877,15 @@ static void machine_class_init(ObjectClass *oc, void *data)
 object_class_property_set_description(oc, "enforce-config-section",
 "Set on to enforce configuration section migration");
 
+object_class_property_add_link(oc, "host-trust-limitation",
+   TYPE_HOST_TRUST_LIMITATION,
+   offsetof(MachineState, htl),
+   machine_check_host_trust_limitation,
+   OBJ_PROP_LINK_STRONG);
+object_class_property_set_description(oc, "host-trust-limitation",
+"Set host trust limitation object to use");
+
+/* For compatibility */
 object_class_property_add_str(oc, "memory-encryption",
 machine_get_memory_encryption, machine_set_memory_encryption);
 object_class_property_set_description(oc, "memory-encryption",
@@ -1127,7 +1158,7 @@ void machine_run_board_init(MachineState *machine)
 }
 }
 
-if (machine->memory_encryption) {
+if (machine->htl) {
 /*
  * With host trust limitation, the host can't see the real
  * contents of RAM, so there's no point in it trying to

Re: [PATCH 1/3] riscv: Unify Qemu's reset vector code path

2020-06-18 Thread Atish Patra

On Thu, Jun 18, 2020 at 1:03 AM Bin Meng  wrote:
>
> On Wed, Jun 17, 2020 at 3:30 AM Atish Patra  wrote:
> >
> > Currently, all riscv machines have identical reset vector code
> > implementations with memory addresses being different for all machines.
> > They can be easily combined into a single function in common code.
> >
> > Move it to common function and let all the machines use the common function.
> >
> > Signed-off-by: Atish Patra 
> > ---
> >  hw/riscv/boot.c | 46 +
> >  hw/riscv/sifive_u.c | 38 +++---
>
> sifive_u's reset vector has to be different to emulate the real
> hardware MSEL pin state.
> Please rebase this on top of the following series:
> http://patchwork.ozlabs.org/project/qemu-devel/list/?series=183567
>
Sure, I will rebase. I think sifive_u may be used in the future to emulate
other SiFive boards, which may require additional data in ROM. That's why
it's better to keep the reset vector code in sifive_u and just unify
spike & virt.

> >  hw/riscv/spike.c| 38 +++---
> >  hw/riscv/virt.c | 37 +++--
> >  include/hw/riscv/boot.h |  2 ++
> >  5 files changed, 57 insertions(+), 104 deletions(-)
> >
>
> Regards,
> Bin
>


-- 
Regards,
Atish

Re: [PATCH 2/3] RISC-V: Copy the fdt in dram instead of ROM

2020-06-18 Thread Atish Patra

On Thu, Jun 18, 2020 at 1:26 AM Bin Meng  wrote:
>
> On Wed, Jun 17, 2020 at 3:29 AM Atish Patra  wrote:
> >
> > Currently, the fdt is copied to the ROM after the reset vector. The firmware
> > has to copy it to DRAM. Instead of this, directly copy the device tree to a
> > pre-computed dram address. The device tree load address should be as far as
> > possible from kernel and initrd images. That's why it is kept at the end of
> > the DRAM or 4GB whichever is lesser.
> >
> > Signed-off-by: Atish Patra 
> > ---
> >  hw/riscv/boot.c | 45 ++---
> >  hw/riscv/sifive_u.c | 14 -
> >  hw/riscv/spike.c| 14 -
> >  hw/riscv/virt.c | 13 +++-
> >  include/hw/riscv/boot.h |  5 -
> >  5 files changed, 75 insertions(+), 16 deletions(-)
> >
> > diff --git a/hw/riscv/boot.c b/hw/riscv/boot.c
> > index 8ed96da600c9..0378b7f1bd58 100644
> > --- a/hw/riscv/boot.c
> > +++ b/hw/riscv/boot.c
> > @@ -160,25 +160,51 @@ hwaddr riscv_load_initrd(const char *filename, 
> > uint64_t mem_size,
> >  return *start + size;
> >  }
> >
> > +hwaddr riscv_calc_fdt_load_addr(hwaddr dram_base, uint64_t mem_size, void *fdt)
> > +{
> > +hwaddr temp, fdt_addr;
> > +hwaddr dram_end = dram_base + mem_size;
> > +int fdtsize = fdt_totalsize(fdt);
> > +
> > +if (fdtsize <= 0) {
> > +error_report("invalid device-tree");
> > +exit(1);
> > +}
> > +/*
> > + * We should put fdt as far as possible to avoid kernel/initrd overwriting
> > + * its content. But it should be addressable by 32 bit system as well.
> > + * Thus, put it at an aligned address that is at least the fdt size below
> > + * the end of dram or 4GB, whichever is lesser.
> > + */
> > +temp = MIN(dram_end, 4096 * MiB);
> > +fdt_addr = QEMU_ALIGN_DOWN(temp - fdtsize, 2 * MiB);
> > +
> > +return fdt_addr;
> > +}
> > +
> >  void riscv_setup_rom_reset_vec(hwaddr start_addr, hwaddr rom_base,
> > -   hwaddr rom_size, void *fdt)
> > +   hwaddr rom_size,
> > +   hwaddr fdt_load_addr, void *fdt)
> >  {
> >  int i;
> >  /* reset vector */
> > -uint32_t reset_vec[8] = {
> > -0x00000297,  /* 1:  auipc  t0, %pcrel_hi(dtb) */
> > -0x02028593,  /* addi   a1, t0, %pcrel_lo(1b) */
> > +uint32_t reset_vec[10] = {
> > +0x00000297,  /* 1:  auipc  t0, %pcrel_hi(fw_dyn) */
> >  0xf1402573,  /* csrr   a0, mhartid  */
> >  #if defined(TARGET_RISCV32)
> > +0x0202a583,  /* lw a1, 32(t0) */
> >  0x0182a283,  /* lw t0, 24(t0) */
> >  #elif defined(TARGET_RISCV64)
> > +0x0202b583,  /* ld a1, 32(t0) */
> >  0x0182b283,  /* ld t0, 24(t0) */
> >  #endif
> >  0x00028067,  /* jr t0 */
> >  0x00000000,
> >  start_addr,  /* start: .dword */
> >  0x00000000,
> > - /* dtb: */
> > +fdt_load_addr,   /* fdt_laddr: .dword */
> > +0x00000000,
> > + /* fw_dyn: */
> >  };
> >
> >  /* copy in the reset vector in little_endian byte order */
> > @@ -189,14 +215,9 @@ void riscv_setup_rom_reset_vec(hwaddr start_addr, 
> > hwaddr rom_base,
> >rom_base, &address_space_memory);
> >
> >  /* copy in the device tree */
> > -if (fdt_pack(fdt) || fdt_totalsize(fdt) >
> > -rom_size - sizeof(reset_vec)) {
> > -error_report("not enough space to store device-tree");
> > -exit(1);
> > -}
> >  qemu_fdt_dumpdtb(fdt, fdt_totalsize(fdt));
> > -rom_add_blob_fixed_as("mrom.fdt", fdt, fdt_totalsize(fdt),
> > -   rom_base + sizeof(reset_vec),
> > +
> > +rom_add_blob_fixed_as("fdt", fdt, fdt_totalsize(fdt), fdt_load_addr,
> > &address_space_memory);
> >
> >  return;
> > diff --git a/hw/riscv/sifive_u.c b/hw/riscv/sifive_u.c
> > index c2712570e0d9..1a1540c7f98d 100644
> > --- a/hw/riscv/sifive_u.c
> > +++ b/hw/riscv/sifive_u.c
> > @@ -31,6 +31,7 @@
> >   */
> >
> >  #include "qemu/osdep.h"
> > +#include "qemu/units.h"
> >  #include "qemu/log.h"
> >  #include "qemu/error-report.h"
> >  #include "qapi/error.h"
> > @@ -325,6 +326,7 @@ static void sifive_u_machine_init(MachineState *machine)
> >  MemoryRegion *main_mem = g_new(MemoryRegion, 1);
> >  MemoryRegion *flash0 = g_new(MemoryRegion, 1);
> >  target_ulong start_addr = memmap[SIFIVE_U_DRAM].base;
> > +hwaddr fdt_load_addr;
> >
> >  /* Initialize SoC */
> >  object_initialize_child(OBJECT(machine), "soc", &s->soc,
> > @@ -369,13 +371,23 @@ static void sifive_u_machine_init(MachineState 
> > *machine)
>
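
For reference, the address calculation in the quoted
riscv_calc_fdt_load_addr() can be checked with a standalone sketch.
This is an illustration only: it uses plain uint64_t instead of hwaddr,
defines its own MiB and align_down() in place of the QEMU macros, and
takes the fdt size as a parameter rather than reading it from the blob.

```c
#include <stdint.h>

#define MiB (1024ULL * 1024ULL)

/* Round n down to a multiple of m (m must be non-zero). */
static uint64_t align_down(uint64_t n, uint64_t m)
{
    return n / m * m;
}

/*
 * Place the fdt just below the end of DRAM or below 4 GiB, whichever
 * is lower, aligned down to 2 MiB.  Assumes fdt_size is smaller than
 * that limit.
 */
static uint64_t calc_fdt_load_addr(uint64_t dram_base, uint64_t mem_size,
                                   uint64_t fdt_size)
{
    uint64_t dram_end = dram_base + mem_size;
    uint64_t limit = dram_end < 4096 * MiB ? dram_end : 4096 * MiB;

    return align_down(limit - fdt_size, 2 * MiB);
}
```

With DRAM at 0x80000000, 128 MiB of memory and a 64 KiB fdt, this gives
0x87e00000.  With 4 GiB of memory the 4 GiB cap applies and the result
is 0xffe00000.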

Re: [PATCH v2 1/1] virtio-ccw: auto-manage VIRTIO_F_IOMMU_PLATFORM if PV

On Wed, Jun 10, 2020 at 11:37:14PM +0200, Halil Pasic wrote:
> On Wed, 10 Jun 2020 14:25:54 +1000
> David Gibson  wrote:
> 
> > > > I'm going to definitely have a good look at that. What I think special
> > > > about s390 is that F_ACCESS_PLATFORM is hurting us because all IO needs
> > > > to go through ZONE_DMA (this is a problem of the implementation that
> > > > stems from a limitation of the DMA API, upstream didn't let me
> > > > fix it).   
> > > 
> > > My understanding is that power runs into similar issues, but I don't
> > > know much about power, so I might be entirely wrong :)  
> > 
> > Sort of, but not to the same extent, I think.
> 
> I'm curious what are the ramifications of a misguided hotplug on POWER?
> Does using F_ACCESS_PLATFORM when it isn't required have any
> significant drawbacks, or are you fine to just go with the safe option?

I expect it will have some performance impact, though it shouldn't be
*that* bad, at least if your guest kernel is ddw / large IOMMU window
capable.

Changing the default would require a machine type version bump, of
course.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson



Re: [PATCH v2 1/1] virtio-ccw: auto-manage VIRTIO_F_IOMMU_PLATFORM if PV

On Wed, Jun 10, 2020 at 03:57:03PM +0200, Halil Pasic wrote:
> On Wed, 10 Jun 2020 14:29:29 +1000
> David Gibson  wrote:
> 
> > On Tue, Jun 09, 2020 at 06:28:39PM +0200, Halil Pasic wrote:
> > > On Tue, 9 Jun 2020 17:47:47 +0200
> > > Claudio Imbrenda  wrote:
> > > 
> > > > On Tue, 9 Jun 2020 11:41:30 +0200
> > > > Halil Pasic  wrote:
> > > > 
> > > > [...]
> > > > 
> > > > > I don't know. Janosch could answer that, but he is on vacation. Adding
> > > > > Claudio maybe he can answer. My understanding is, that while it might
> > > > > be possible, it is ugly at best. The ability to do a transition is
> > > > > indicated by a CPU model feature. Indicating the feature to the guest
> > > > > and then failing the transition sounds wrong to me.
> > > > 
> > > > I agree. If the feature is advertised, then it has to work. I don't
> > > > think we even have an architected way to fail the transition for that
> > > > reason.
> > > > 
> > > > What __could__ be done is to prevent qemu from even starting if an
> > > > incompatible device is specified together with PV.
> > > 
> > > AFAIU, the "specified together with PV" is the problem here. Currently
> > > we don't "specify PV" but PV is just a capability that is managed by the
> > > CPU model (like so many other). I.e. the fact that the
> > > virtualization environment is capable of providing PV (unpack facility
> > > available), and the fact, that the end user didn't fence the unpack
> > > facility, does not mean, the user is dead set to use PV.
> > > 
> > > My understanding is, that we want PV to just work, without having to
> > > put together a peculiar VM definition that says: this is going to be
> > > used as a PV VM.
> > 
> > Having had a similar discussion for POWER, I no longer think this is a
> > wise model.  I think we want to have an explicit "allow PV" option -
> > but we do want it to be a *single* option, rather than having to
> > change configuration of a whole bunch of places.
> > 
> > My intention is for my 'host-trust-limitation' series to accomplish
> > that.
> 
> Dave, many thanks for your input. I would be interested to read up that
> discussion you had for POWER to try to catch the train of thought. Can
> you give me a pointer?

Urgh.. not really.. it was spread out over several discussions, some
of which were on IRC or Slack, rather than email.

> My current understanding is that s390x already has the "allow PV" option,
> which is the CPU model feature. But its dynamics is just like the
> dynamics of other CPU model features, in a sense that you may have to
> disable it explicitly.
> 
> Our problem is, that iommu_platform=on comes at a price point for us,
> and we don't want to enforce it when it is not needed. And if the guest
> does not decide to do the transition to protected, it is not needed.
> 
> Thus any scheme where we pessimise based on the sheer possibility of
> protected virtualization seems wrong to me.

Hrm, I see your point.  So... I guess my thinking is that although the
strict meaning of the proposed host-trust-limitation option is just
that "protection _can_ be used, at the guest/platform's option", it is
a strong hint that we're expecting protection to be used.

So would this work for s390:
  * The cpu feature remains, as now, enabled by default
  * The host-trust-limitation option would apply the virtio options
necessary for protection (and any other changes to defaults we
discover we need), just as it does for SEV and POWER PEF
  * Optionally, the s390 machine type code could error out if
host-trust-limitation is specified, but the cpu option is
explicitly disabled
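
A minimal sketch of that proposal as a standalone decision function,
to make the three points concrete.  Config, Result and apply() are
hypothetical names, not s390 machine code, and the error case
implements the "optionally" item above.

```c
#include <stdbool.h>

/* Hypothetical inputs: what the user configured. */
typedef struct Config {
    bool htl_specified;      /* host-trust-limitation set on the machine */
    bool cpu_unpack_enabled; /* cpu feature, enabled by default */
} Config;

/* Hypothetical outputs: whether startup proceeds, and virtio defaults. */
typedef struct Result {
    bool ok;
    bool iommu_platform;
} Result;

static Result apply(const Config *c)
{
    Result r = { .ok = true, .iommu_platform = false };

    if (c->htl_specified) {
        if (!c->cpu_unpack_enabled) {
            /* Protection requested, but the cpu feature was disabled. */
            r.ok = false;
            return r;
        }
        /* Protection expected: switch the virtio defaults. */
        r.iommu_platform = true;
    }
    return r;
}
```

Without host-trust-limitation the cpu feature alone changes nothing in
the virtio defaults, which is the case Halil is concerned about.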

> The sad thing is that QEMU has every information it needs to do what is
> best: for paravirtualized devices
> * use F_ACCESS_PLATFORM when needed, to make the guest work harder and
> work around the access restrictions imposed by memory protection, and 
> * don't use F_ACCESS_PLATFORM when not needed, and allow for
>   optimization based on the fact that no such access restrictions exist.

Right.. IIUC you're suggesting delaying finalization of the device's
featureset until the guest driver actually starts probing it.

> Sure we can burden up the user, to tell us if the VM is intended to be
> used with memory protection or not. But what does it buy us? The
> opportunity to create dodgy configurations?

So, I don't know what the situation is with z, but for POWER, machines
with the ultravisor running are rare (read, not actually available
outside IBM yet), and not directly tied to a cpu version (obviously
you need a cpu with support, but you also need to actually be running
under an ultravisor, which is optional).  So what are our options:

1) Require explicitly enabling PEF support - this is burdening the
   user, as you say, but..

2) Allow by default - but fail if the host doesn't have support.  That
   means explicitly *disabling* on non-ultravisor machines, a much
   bigger imposition on the user

3) Enable conditionally depending on host support.
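
Option 3 is essentially the on/off/auto pattern suggested elsewhere in this thread. As a rough sketch only: the names `ProtMode` and `resolve_protection` are made up for illustration, and `host_supported` stands for whatever host probe would really be used. None of this is existing QEMU code.

```c
#include <stdbool.h>

/* Hypothetical tri-state for a "protection" style machine option. */
typedef enum { PROT_OFF, PROT_ON, PROT_AUTO } ProtMode;

/*
 * Resolve the user's setting against what the host can do.
 * Returns 0 and stores the effective state in *enabled, or returns -1
 * when "on" was requested but the host lacks support (option 2's failure).
 */
static int resolve_protection(ProtMode mode, bool host_supported, bool *enabled)
{
    switch (mode) {
    case PROT_OFF:
        *enabled = false;
        return 0;
    case PROT_ON:
        if (!host_supported) {
            return -1;
        }
        *enabled = true;
        return 0;
    case PROT_AUTO:
        /* Option 3: follow the host silently. */
        *enabled = host_supported;
        return 0;
    }
    return -1;
}
```

With "auto" as the default, nobody on a non-ultravisor machine has to disable anything explicitly, while "on" still gives a hard error when support is missing.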

Re: [PATCH v2 1/1] virtio-ccw: auto-manage VIRTIO_F_IOMMU_PLATFORM if PV

On Wed, Jun 10, 2020 at 03:19:22PM +0200, Viktor Mihajlovski wrote:
> 
> 
> On 6/10/20 12:24 PM, David Hildenbrand wrote:
> > On 10.06.20 12:07, David Gibson wrote:
> > > On Wed, Jun 10, 2020 at 09:22:45AM +0200, David Hildenbrand wrote:
> > > > On 10.06.20 06:31, David Gibson wrote:
> > > > > On Tue, Jun 09, 2020 at 12:44:39PM -0400, Michael S. Tsirkin wrote:
> > > > > > On Tue, Jun 09, 2020 at 06:28:39PM +0200, Halil Pasic wrote:
> > > > > > > On Tue, 9 Jun 2020 17:47:47 +0200
> > > > > > > Claudio Imbrenda  wrote:
> > > > > > > 
> > > > > > > > On Tue, 9 Jun 2020 11:41:30 +0200
> > > > > > > > Halil Pasic  wrote:
> > > > > > > > 
> > > > > > > > [...]
> > > > > > > > 
> > > > > > > > > I don't know. Janosch could answer that, but he is on vacation. Adding
> > > > > > > > > Claudio maybe he can answer. My understanding is, that while it might
> > > > > > > > > be possible, it is ugly at best. The ability to do a transition is
> > > > > > > > > indicated by a CPU model feature. Indicating the feature to the guest
> > > > > > > > > and then failing the transition sounds wrong to me.
> > > > > > > > 
> > > > > > > > I agree. If the feature is advertised, then it has to work. I don't
> > > > > > > > think we even have an architected way to fail the transition for that
> > > > > > > > reason.
> > > > > > > > 
> > > > > > > > What __could__ be done is to prevent qemu from even starting if an
> > > > > > > > incompatible device is specified together with PV.
> > > > > > > 
> > > > > > > AFAIU, the "specified together with PV" is the problem here. Currently
> > > > > > > we don't "specify PV" but PV is just a capability that is managed by the
> > > > > > > CPU model (like so many other).
> > > > > > 
> > > > > > So if we want to keep it user friendly, there could be
> > > > > > protection property with values on/off/auto, and auto
> > > > > > would poke at host capability to figure out whether
> > > > > > it's supported.
> > > > > > 
> > > > > > Both virtio and CPU would inherit from that.
> > > > > 
> > > > > Right, that's what I have in mind for my 'host-trust-limitation'
> > > > > property (a generalized version of the existing 'memory-encryption'
> > > > > machine option).  My draft patches already set virtio properties
> > > > > accordingly, it should be possible to set (default) cpu properties as
> > > > > well.
> > > > 
> > > > No crazy CPU model hacks please (at least speaking for the s390x).
> > > 
> > > Uh... I'm not really sure what you have in mind here.
> > > 
> > 
> > Reading along I got the impression that we want to glue the availability
> > of CPU features to other QEMU cmdline parameters (besides the
> > accelerator). ("to set (default) cpu properties as well"). If we are
> > talking about other CPU properties not expressed as CPU features (e.g.,
> > -cpu X,Y=on ...), then there is no issue.
> > 
> 
> The reason that the capability to run in PV mode is expressed in the CPU
> model is that this capability *is* provided by the CPU in terms of
> available instructions. I wouldn't see a benefit in providing
> a meta-property that needs to be synced with the CPU model.
> 
> So, if something has to be concluded from the fact that a VM
> could run in PV mode, that decision should be derived from the
> CPU model.

The trouble is that that approach is inherently s390 specific, and I'm
hoping we can make the configuration at least somewhat common between
platforms.

It also seems a very nasty layering violation to me for changing cpu
properties to affect apparently unrelated devices (like virtio, for
example).  It's still a bit nasty doing it from a machine property,
but it seems more reasonable to me that a machine property could
affect things elsewhere in the.. well.. machine.

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson



Re: [PATCH v2 1/1] virtio-ccw: auto-manage VIRTIO_F_IOMMU_PLATFORM if PV

On Wed, Jun 10, 2020 at 12:24:14PM +0200, David Hildenbrand wrote:
> On 10.06.20 12:07, David Gibson wrote:
> > On Wed, Jun 10, 2020 at 09:22:45AM +0200, David Hildenbrand wrote:
> >> On 10.06.20 06:31, David Gibson wrote:
> >>> On Tue, Jun 09, 2020 at 12:44:39PM -0400, Michael S. Tsirkin wrote:
> >>>> On Tue, Jun 09, 2020 at 06:28:39PM +0200, Halil Pasic wrote:
> >>>> > On Tue, 9 Jun 2020 17:47:47 +0200
> >>>> > Claudio Imbrenda wrote:
> >>>> >
> >>>> >> On Tue, 9 Jun 2020 11:41:30 +0200
> >>>> >> Halil Pasic wrote:
> >>>> >>
> >>>> >> [...]
> >>>> >>
> >>>> >>> I don't know. Janosch could answer that, but he is on vacation. Adding
> >>>> >>> Claudio maybe he can answer. My understanding is, that while it might
> >>>> >>> be possible, it is ugly at best. The ability to do a transition is
> >>>> >>> indicated by a CPU model feature. Indicating the feature to the guest
> >>>> >>> and then failing the transition sounds wrong to me.
> >>>> >>
> >>>> >> I agree. If the feature is advertised, then it has to work. I don't
> >>>> >> think we even have an architected way to fail the transition for that
> >>>> >> reason.
> >>>> >>
> >>>> >> What __could__ be done is to prevent qemu from even starting if an
> >>>> >> incompatible device is specified together with PV.
> >>>> >
> >>>> > AFAIU, the "specified together with PV" is the problem here. Currently
> >>>> > we don't "specify PV" but PV is just a capability that is managed by the
> >>>> > CPU model (like so many other).
> >>>>
> >>>> So if we want to keep it user friendly, there could be
> >>>> protection property with values on/off/auto, and auto
> >>>> would poke at host capability to figure out whether
> >>>> it's supported.
> >>>>
> >>>> Both virtio and CPU would inherit from that.
> >>>
> >>> Right, that's what I have in mind for my 'host-trust-limitation'
> >>> property (a generalized version of the existing 'memory-encryption'
> >>> machine option).  My draft patches already set virtio properties
> >>> accordingly, it should be possible to set (default) cpu properties as
> >>> well.
> >>
> >> No crazy CPU model hacks please (at least speaking for the s390x).
> > 
> > Uh... I'm not really sure what you have in mind here.
> > 
> 
> Reading along I got the impression that we want to glue the availability
> of CPU features to other QEMU cmdline parameters (besides the
> accelerator). ("to set (default) cpu properties as well"). If we are
> talking about other CPU properties not expressed as CPU features (e.g.,
> -cpu X,Y=on ...), then there is no issue.

Well, depends what you mean by "glue".  What I have in mind is that
setting the host-trust-limitation machine property will change the
defaults for cpu features to include the necessary feature for s390,
just as the draft code already changes the defaults for the relevant
virtio properties.  My intention is that if you explicitly put feature
properties on the cpu, that will override those defaults.
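
To make the intended precedence concrete, here is a minimal sketch of "machine property sets defaults, explicit properties win". The `FeatureProp` type and both helpers are hypothetical names for illustration and do not correspond to how QEMU's property code is actually structured.

```c
#include <stdbool.h>

/* Hypothetical feature slot: a value plus whether the user set it explicitly. */
typedef struct {
    bool value;
    bool user_set;
} FeatureProp;

/* Record an explicit user choice, e.g. from a -cpu or -device property. */
static void feature_set_explicit(FeatureProp *p, bool value)
{
    p->value = value;
    p->user_set = true;
}

/* Apply a machine-level default; it never overrides an explicit choice. */
static void feature_apply_default(FeatureProp *p, bool value)
{
    if (!p->user_set) {
        p->value = value;
    }
}
```

The order of the two calls does not matter here: an explicit setting sticks whether it arrives before or after the machine-level default is applied.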

Is that acceptable?  I'm aware that this property affecting things in
distant devices is kinda weird and ugly, but I don't see how else we
can make configuring this not horribly complicated and differently so
for each platform.

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson



[Bug 1883984] Re: QEMU S/390x sqxbr (128-bit IEEE 754 square root) crashes qemu-system-s390x

2020-06-18 Thread Bruno Haible

Another way to reproduce this bug is with qemu-s390x and a cross-compiled
binary:

$ s390x-linux-gnu-gcc-5 -static -o bug-sqrtl-one-line.s390x bug-sqrtl-one-line.c
$ qemu-s390x bug-sqrtl-one-line.s390x
Segmentation fault (core dumped)

Find attached the binary.

** Attachment added: "statically compiled binary"
   
https://bugs.launchpad.net/qemu/+bug/1883984/+attachment/5385168/+files/bug-sqrtl-one-line.s390x

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1883984

Title:
  QEMU S/390x sqxbr (128-bit IEEE 754 square root) crashes qemu-system-s390x

Status in QEMU:
  New

Bug description:
  In porting software to guest Ubuntu 18.04 and 20.04 VMs for S/390x, I discovered
  that some of my own numerical programs, and also a GNU configure script for at
  least one package with CC=clang, would cause an instant crash of the VM, sometimes
  also destroying recently opened files, and producing long strings of NUL characters
  in /var/log/syslog in the S/390 guest O/S.

  Further detective work narrowed the cause of the crash down to a single IBM S/390
  instruction: sqxbr (128-bit IEEE 754 square root).  Here is a one-line program
  that when compiled and run on a VM hosted on QEMU emulator version 4.2.0
  (Debian 1:4.2-3ubuntu6.1) [hosted on Ubuntu 20.04 on a Dell Precision 7920
  workstation with an Intel Xeon Platinum 8253 CPU], and also on QEMU emulator
  version 5.0.0, reproducibly produces a VM crash under qemu-system-s390x.

  % cat bug-sqrtl-one-line.c
  int main(void) { volatile long double x, r; x = 4.0L; __asm__ __volatile__("sqxbr %0, %1" : "=f" (r) : "f" (x)); return (0);}

  % cc bug-sqrtl-one-line.c && ./a.out
  Segmentation fault (core dumped)

  The problem code may be the function float128_sqrt() defined in
  qemu-5.0.0/fpu/softfloat.c starting at line 7619.  I have NOT attempted to run
  the qemu-system-s390x executable under a debugger.  However, I observe that
  S/390 is the only CPU family that I know of, except possibly for a Fujitsu
  SPARC-64, that has a 128-bit square root in hardware.
  Thus, this instruction bug may not have been seen before.


Re: [PATCH v3 4/4] spapr: Forbid nested KVM-HV in pre-power9 compat mode

On Mon, Jun 15, 2020 at 11:20:31AM +0200, Greg Kurz wrote:
> On Sat, 13 Jun 2020 17:18:04 +1000
> David Gibson  wrote:
> 
> > On Thu, Jun 11, 2020 at 03:40:33PM +0200, Greg Kurz wrote:
> > > Nested KVM-HV only works on POWER9.
> > > 
> > > Signed-off-by: Greg Kurz 
> > > Reviewed-by: Laurent Vivier 
> > 
> > Hrm.  I have mixed feelings about this.  It does bring forward an
> > error that we'd otherwise only discover when we try to load the kvm
> > module in the guest.
> > 
> > On the other hand, it's kind of a layering violation - really it's
> > KVM's business to report what it can and can't do, rather than having
> > qemu anticipate it.
> > 
> 
> Agreed and it seems that we can probably get KVM to report that
> already. I'll have closer look.
> 
> > Allowing POWER8 compat for an L2 is something we hope to have in the
> > fairly near future.
> 
> Ok but I guess we don't want to start an L2 in compat POWER8 mode
> with cap-nested-hv=on, do we ?

Sorry, "L2" was misleading, I really mean L.  Setting
cap-nested-hv kind of implies there's a L which
contradicts that.

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson



[Bug 1884169] [NEW] There is no option group 'fsdev' for OSX

2020-06-18 Thread Judah Holanda Correia Lima

Public bug reported:

When I try to use -fsoption on OSX I receive this error:

-fsdev local,security_model=mapped,id=fsdev0,path=devel/dmos-example:
There is no option group 'fsdev'

** Affects: qemu
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1884169

Title:
  There is no option group 'fsdev' for OSX

Status in QEMU:
  New

Bug description:
  When I try to use -fsoption on OSX I receive this error:

  -fsdev local,security_model=mapped,id=fsdev0,path=devel/dmos-example:
  There is no option group 'fsdev'


Re: [PATCH 0/6] Add several Power ISA 3.1 32/64-bit vector instructions

On 6/12/20 9:20 PM, Lijun Pan wrote:
> This patch series add several newly introduced 32/64-bit vector
> instructions in Power ISA 3.1. The newly added instructions are
> flagged as ISA300 temporarily in vmx-ops.inc.c and vmx-impl.inc.c
> to make them compile and function since Power ISA 3.1, together
> with next generation processor, has not been fully enabled in QEMU
> yet. When Power ISA 3.1 and next generation processor are fully
> supported, the flags should be changed.

This is not the correct procedure.

Step 1 is to add a new define for ISA301, which is not enabled for any processor.

Step 2 is to add all of the new instructions, using ISA301.  In this way there
is no intermediate point in which a 3.01 instruction is enabled for 3.00.  In
addition, we do not have extra churn simply to change the ISA.

Step 3 is to add a new processor for which ISA301 is set.  It is often
reasonable to have a fake processor named "max" that contains all of the
available features.  For ppc, I see that "max" is currently aliased to "7400_v2.9".

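The three steps can be illustrated with a toy model of feature gating. This is a sketch under assumptions: `ISA_300`/`ISA_301`, `CpuModel`, `InsnDef` and `insn_enabled` are hypothetical names, and the real ppc translator keeps its feature bits and opcode tables in a different form.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical feature bits standing in for the real ISA flag sets. */
#define ISA_300 (1u << 0)
#define ISA_301 (1u << 1)   /* step 1: defined, but set on no CPU yet */

typedef struct {
    const char *name;
    uint32_t isa_flags;     /* features this CPU model implements */
} CpuModel;

typedef struct {
    const char *mnemonic;
    uint32_t required;      /* feature bits the instruction needs */
} InsnDef;

/* An instruction is available only when the CPU carries all required bits. */
static bool insn_enabled(const CpuModel *cpu, const InsnDef *insn)
{
    return (cpu->isa_flags & insn->required) == insn->required;
}
```

New instructions tagged with the new bit (step 2) stay invisible to every existing model until a processor that sets the bit is added (step 3), so no 3.00 CPU ever exposes them in between.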
r~

Re: [PATCH 6/6] target/ppc: add vdiv{su}{wd} vmod{su}{wd} instructions

On 6/12/20 9:20 PM, Lijun Pan wrote:
> +#define VDIV_MOD_DO(name, op, element)  \
> +void helper_v##name(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)   \
> +{   \
> +int i;  \
> +\
> +for (i = 0; i < ARRAY_SIZE(r->element); i++) {  \
> +r->element[i] = a->element[i] op b->element[i]; \
> +}   \
> +}

You're missing all of the divide-by-zero handling.
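
For illustration, a per-lane sketch of what that handling might look like. The assumption here is that the architected result for x / 0 and for INT32_MIN / -1 is undefined, so returning 0 is acceptable; that should be checked against the ISA text. The point is only that the host must never execute a trapping or undefined C division. The function names are hypothetical.

```c
#include <stdint.h>

/*
 * Divide one signed 32-bit lane without invoking C undefined behaviour.
 * Assumption: the result for b == 0 and for INT32_MIN / -1 is undefined
 * architecturally, so this sketch simply returns 0 in those cases.
 */
static int32_t lane_divs32(int32_t a, int32_t b)
{
    if (b == 0 || (a == INT32_MIN && b == -1)) {
        return 0;
    }
    return a / b;
}

/* Signed remainder with the same two guarded cases. */
static int32_t lane_mods32(int32_t a, int32_t b)
{
    if (b == 0 || (a == INT32_MIN && b == -1)) {
        return 0;
    }
    return a % b;
}

/* Unsigned division only needs the zero-divisor guard. */
static uint32_t lane_divu32(uint32_t a, uint32_t b)
{
    return b == 0 ? 0 : a / b;
}
```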


r~

Re: [PATCH 5/6] fix the prototype of muls64/mulu64

On 6/12/20 9:20 PM, Lijun Pan wrote:
> The prototypes of muls64/mulu64 in host-utils.h should match the
> definitions in host-utils.c
> 
> Signed-off-by: Lijun Pan 
> ---
>  include/qemu/host-utils.h | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)

Reviewed-by: Richard Henderson 
CC: qemu-trivial.


r~

Re: [PATCH 4/6] target/ppc: add vmulh{su}d instructions

On 6/12/20 9:20 PM, Lijun Pan wrote:
> +void helper_vmulhsd(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
> +{
> + int i;
> + uint64_t h64 = 0;
> + uint64_t l64 = 0;
> +
> + for (i = 0; i < 2; i++) {
> + muls64(&l64, &h64, a->s64[i], b->s64[i]);
> + r->s64[i] = h64;
> + }
> +}

Indentation is off.

This can just as easily be written as

uint64_t discard;

muls64(&discard, &r->u64[0], a->s64[0], b->s64[0]);
muls64(&discard, &r->u64[1], a->s64[1], b->s64[1]);

and similarly for helper_vmulhud.
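
A self-contained sketch of that shape, for anyone who wants to try it outside the tree. `vec128_t` and `muls64_sketch` are stand-ins invented here for ppc_avr_t and muls64(); the stand-in assumes a compiler that provides `__int128` (GCC or Clang on a 64-bit host), which the real helper does not require.

```c
#include <stdint.h>

/* Stand-in for ppc_avr_t, reduced to the two views used here. */
typedef union {
    int64_t s64[2];
    uint64_t u64[2];
} vec128_t;

/*
 * Stand-in for muls64(): signed 64x64 -> 128-bit product, returned as
 * low and high halves.  Assumes __int128 is available.
 */
static void muls64_sketch(uint64_t *plow, uint64_t *phigh, int64_t a, int64_t b)
{
    __int128 p = (__int128)a * b;

    *plow = (uint64_t)p;
    *phigh = (uint64_t)((unsigned __int128)p >> 64);
}

/* High doubleword of each signed product; the low half is discarded. */
static void vmulhsd_sketch(vec128_t *r, const vec128_t *a, const vec128_t *b)
{
    uint64_t discard;

    muls64_sketch(&discard, &r->u64[0], a->s64[0], b->s64[0]);
    muls64_sketch(&discard, &r->u64[1], a->s64[1], b->s64[1]);
}
```

Writing the high half straight into the destination removes the loop and both temporaries from the original patch.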


r~

Re: [PATCH 3/6] targetc/ppc: add vmulh{su}w instructions

On 6/12/20 9:20 PM, Lijun Pan wrote:
> +#define VMULH_DO(name, op, element, cast_orig, cast_temp)\
> +void helper_vmulh##name(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)   \
> +{   \
> + int i;  \
> + \
> + for (i = 0; i < ARRAY_SIZE(r->element); i++) {  \
> + r->element[i] = (cast_orig)(((cast_temp)a->element[i] op \
> + (cast_temp)b->element[i]) >> 32);   \
> + }   \
> +}
> +VMULH_DO(sw, *, s32, int32_t, int64_t)
> +VMULH_DO(uw, *, u32, uint32_t, uint64_t)
> +#undef VMULH_DO

There's no point in calling the macro "VMUL" and then passing in "op" as a
parameter.  Just inline the multiply directly.

Also, fix your indentation.
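
As an illustration of the multiply written out directly, here is a standalone sketch of the two word helpers. `vec128w_t` is a made-up stand-in for ppc_avr_t, and the signed variant relies on `>>` of a negative int64_t being an arithmetic shift, which is implementation-defined in C but holds on the compilers QEMU supports.

```c
#include <stdint.h>

/* Stand-in for ppc_avr_t, reduced to the word views. */
typedef union {
    int32_t s32[4];
    uint32_t u32[4];
} vec128w_t;

/* Signed: widen to 64 bits so the product cannot overflow, keep bits 63:32. */
static void vmulhsw_sketch(vec128w_t *r, const vec128w_t *a, const vec128w_t *b)
{
    for (int i = 0; i < 4; i++) {
        r->s32[i] = (int32_t)(((int64_t)a->s32[i] * b->s32[i]) >> 32);
    }
}

/* Unsigned: same shape with unsigned widening. */
static void vmulhuw_sketch(vec128w_t *r, const vec128w_t *a, const vec128w_t *b)
{
    for (int i = 0; i < 4; i++) {
        r->u32[i] = (uint32_t)(((uint64_t)a->u32[i] * b->u32[i]) >> 32);
    }
}
```

Note the parenthesisation: the shift applies to the 64-bit product, and only then is the result narrowed.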


r~

Re: [PATCH v1 2/2] sifive_e: Support the revB machine

2020-06-18 Thread Alistair Francis

On Thu, Jun 18, 2020 at 3:42 PM Palmer Dabbelt  wrote:
>
> On Wed, 10 Jun 2020 15:13:49 PDT (-0700), alistai...@gmail.com wrote:
> > On Thu, May 28, 2020 at 11:13 AM Alistair Francis wrote:
> >>
> >> On Thu, May 21, 2020 at 8:57 AM Alistair Francis wrote:
> >> >
> >> > On Wed, May 20, 2020 at 4:08 PM Palmer Dabbelt wrote:
> >> > >
> >> > > On Thu, 14 May 2020 13:47:10 PDT (-0700), Alistair Francis wrote:
> >> > > > Signed-off-by: Alistair Francis 
> >> > > > ---
> >> > > >  hw/riscv/sifive_e.c | 35 +++
> >> > > >  include/hw/riscv/sifive_e.h |  1 +
> >> > > >  2 files changed, 32 insertions(+), 4 deletions(-)
> >> > > >
> >> > > > diff --git a/hw/riscv/sifive_e.c b/hw/riscv/sifive_e.c
> >> > > > index 472a98970b..cb7818341b 100644
> >> > > > --- a/hw/riscv/sifive_e.c
> >> > > > +++ b/hw/riscv/sifive_e.c
> >> > > > @@ -98,10 +98,14 @@ static void riscv_sifive_e_init(MachineState *machine)
> >> > > >  memmap[SIFIVE_E_DTIM].base, main_mem);
> >> > > >
> >> > > >  /* Mask ROM reset vector */
> >> > > > -uint32_t reset_vec[2] = {
> >> > > > -0x204002b7,/* 0x1000: lui t0,0x20400 */
> >> > > > -0x00028067,/* 0x1004: jr  t0 */
> >> > > > -};
> >> > > > +uint32_t reset_vec[2];
> >> > > > +
> >> > > > +if (s->revb) {
> >> > > > +reset_vec[0] = 0x200102b7;/* 0x1000: lui t0,0x20010 */
> >> > > > +} else {
> >> > > > +reset_vec[0] = 0x204002b7;/* 0x1000: lui t0,0x20400 */
> >> > > > +}
> >> > > > +reset_vec[1] = 0x00028067;/* 0x1004: jr  t0 */
> >> > > >
> >> > > >  /* copy in the reset vector in little_endian byte order */
> >> > > >  for (i = 0; i < sizeof(reset_vec) >> 2; i++) {
> >> > > > @@ -115,8 +119,31 @@ static void riscv_sifive_e_init(MachineState *machine)
> >> > > >  }
> >> > > >  }
> >> > > >
> >> > > > +static bool sifive_e_machine_get_revb(Object *obj, Error **errp)
> >> > > > +{
> >> > > > +SiFiveEState *s = RISCV_E_MACHINE(obj);
> >> > > > +
> >> > > > +return s->revb;
> >> > > > +}
> >> > > > +
> >> > > > +static void sifive_e_machine_set_revb(Object *obj, bool value, Error **errp)
> >> > > > +{
> >> > > > +SiFiveEState *s = RISCV_E_MACHINE(obj);
> >> > > > +
> >> > > > +s->revb = value;
> >> > > > +}
> >> > > > +
> >> > > >  static void sifive_e_machine_instance_init(Object *obj)
> >> > > >  {
> >> > > > +SiFiveEState *s = RISCV_E_MACHINE(obj);
> >> > > > +
> >> > > > +s->revb = false;
> >> > > > +object_property_add_bool(obj, "revb", sifive_e_machine_get_revb,
> >> > > > + sifive_e_machine_set_revb, NULL);
> >> > > > +object_property_set_description(obj, "revb",
> >> > > > +"Set on to tell QEMU that it should model "
> >> > > > +"the revB HiFive1 board",
> >> > > > +NULL);
> >> > > >  }
> >> > > >
> >> > > >  static void sifive_e_machine_class_init(ObjectClass *oc, void *data)
> >> > > > diff --git a/include/hw/riscv/sifive_e.h b/include/hw/riscv/sifive_e.h
> >> > > > index 414992119e..0d3cd07fcc 100644
> >> > > > --- a/include/hw/riscv/sifive_e.h
> >> > > > +++ b/include/hw/riscv/sifive_e.h
> >> > > > @@ -45,6 +45,7 @@ typedef struct SiFiveEState {
> >> > > >
> >> > > >  /*< public >*/
> >> > > >  SiFiveESoCState soc;
> >> > > > +bool revb;
> >> > > >  } SiFiveEState;
> >> > > >
> >> > > >  #define TYPE_RISCV_E_MACHINE MACHINE_TYPE_NAME("sifive_e")
> >> > >
> >> > > IIRC there are way more differences between the un-suffixed FE310 and the
> >> > > Rev B, specifically the interrupt map is all different.
> >> >
> >> > The three IRQs that QEMU uses for the SiFive E (UART0, UART1 and GPIO)
> >> > all seem to be the same.
> >>
> >> Ping!
> >
> > Ping^2
> >
> > Applying to RISC-V tree.
>
> They're not: uart0 is interrupt 3 on the rev b but 5 on the non-rev b (which
> they don't call rev a but I'm going to :)).  There isn't even a uart1 in the
> DTS on the rev a, and the GPIO interrupts are different as well.
>
> The DTS files are in SiFive's SDK:
>
> https://github.com/sifive/freedom-e-sdk/blob/master/bsp/sifive-hifive1-revb/core.dts
> https://github.com/sifive/freedom-e-sdk/blob/master/bsp/sifive-hifive1/core.dts
>
> which should also generate some test programs.  When I was there we tested on
> QEMU for the platforms that were supported, so there should be some support for
> doing so still.

Am I reading the wrong data sheets?

For revB I looked at
https://sifive.cdn.prismic.io/sifive%2F9ecbb623-7c7f-4acc-966f-9bb10ecdb62e_fe310-g002.pdf,
page 46, table 26. The interrupts for the revB match what we currently
have in QEMU.

For revA 
https://sifive.cdn.prismic.io/sifive%2F500a69f8-af3a-4fd9-927f-10ca77077532_fe310-g000.pdf,

Re: [PATCH 2/6] target/ppc: add vmulld instruction

On 6/12/20 9:20 PM, Lijun Pan wrote:
> vmulld: Vector Multiply Low Doubleword.
> 
> Signed-off-by: Lijun Pan 
> ---
>  target/ppc/helper.h | 1 +
>  target/ppc/int_helper.c | 1 +
>  target/ppc/translate/vmx-impl.inc.c | 1 +
>  target/ppc/translate/vmx-ops.inc.c  | 1 +
>  4 files changed, 4 insertions(+)
> 
> diff --git a/target/ppc/helper.h b/target/ppc/helper.h
> index 2dfa1c6942..c3f087ccb3 100644
> --- a/target/ppc/helper.h
> +++ b/target/ppc/helper.h
> @@ -185,6 +185,7 @@ DEF_HELPER_3(vmuloub, void, avr, avr, avr)
>  DEF_HELPER_3(vmulouh, void, avr, avr, avr)
>  DEF_HELPER_3(vmulouw, void, avr, avr, avr)
>  DEF_HELPER_3(vmuluwm, void, avr, avr, avr)
> +DEF_HELPER_3(vmulld, void, avr, avr, avr)
>  DEF_HELPER_3(vslo, void, avr, avr, avr)
>  DEF_HELPER_3(vsro, void, avr, avr, avr)
>  DEF_HELPER_3(vsrv, void, avr, avr, avr)
> diff --git a/target/ppc/int_helper.c b/target/ppc/int_helper.c
> index be53cd6f68..afbcdd05b4 100644
> --- a/target/ppc/int_helper.c
> +++ b/target/ppc/int_helper.c
> @@ -533,6 +533,7 @@ void helper_vprtybq(ppc_avr_t *r, ppc_avr_t *b)
>  }   \
>  }
>  VARITH_DO(muluwm, *, u32)
> +VARITH_DO(mulld, *, s64)

From this implementation, I would say that both vmuluwm and vmulld can be
implemented with tcg_gen_gvec_mul().

I guess vmuluwm was missed when many of the other vmx operations were converted
to gvec.

Please first convert vmuluwm to tcg_gen_gvec_mul, then implement vmulld in the
same manner.
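
The reason a generic element-wise multiply is enough for both: the low 64 bits of a product are the same whether the operands are treated as signed or unsigned. A small standalone sketch of the lane arithmetic follows (plain C, not TCG; `mul_low_u64x2` is a hypothetical name). Doing it on unsigned values also avoids the signed-overflow undefined behaviour that an s64 `*` in C would have.

```c
#include <stdint.h>

/*
 * Low 64 bits of each doubleword product, computed modulo 2^64.
 * Unsigned arithmetic is used so that wraparound is well defined in C.
 */
static void mul_low_u64x2(uint64_t r[2], const uint64_t a[2], const uint64_t b[2])
{
    for (int i = 0; i < 2; i++) {
        r[i] = a[i] * b[i];
    }
}
```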


r~

Re: [PATCH 1/6] target/ppc: add byte-reverse br[dwh] instructions

On 6/12/20 9:20 PM, Lijun Pan wrote:
> POWER ISA 3.1 introduces following byte-reverse instructions:
> brd: Byte-Reverse Doubleword X-form
> brw: Byte-Reverse Word X-form
> brh: Byte-Reverse Halfword X-form
> 
> Signed-off-by: Lijun Pan 
> ---
>  target/ppc/translate.c | 62 ++
>  1 file changed, 62 insertions(+)
> 
> diff --git a/target/ppc/translate.c b/target/ppc/translate.c
> index 4ce3d664b5..2d48fbc8db 100644
> --- a/target/ppc/translate.c
> +++ b/target/ppc/translate.c
> @@ -6971,7 +6971,69 @@ static void gen_dform3D(DisasContext *ctx)
>  return gen_invalid(ctx);
>  }
>  
> +/* brd */
> +static void gen_brd(DisasContext *ctx)
> +{
> + TCGv_i64 temp = tcg_temp_new_i64();
> +
> + tcg_gen_bswap64_i64(temp, cpu_gpr[rS(ctx->opcode)]);
> + tcg_gen_st_i64(temp, cpu_env, offsetof(CPUPPCState, gpr[rA(ctx->opcode)]));


The store is wrong.  You cannot modify storage behind a tcg global variable
like that.  This should just be

tcg_gen_bswap64_i64(cpu_gpr[rA(ctx->opcode)],
cpu_gpr[rS(ctx->opcode)]);

Is this code within an ifdef for TARGET_PPC64?
If not, then this will break the 32-bit qemu-system-ppc build.
Are you sure you have built and tested all configurations?


> +/* brw */
> +static void gen_brw(DisasContext *ctx)
> +{
> + TCGv_i64 temp = tcg_temp_new_i64();
> + TCGv_i64 lsb = tcg_temp_new_i64();
> + TCGv_i64 msb = tcg_temp_new_i64();
> +
> + tcg_gen_movi_i64(lsb, 0xffffffffull);
> + tcg_gen_and_i64(temp, lsb, cpu_gpr[rS(ctx->opcode)]);
> + tcg_gen_bswap32_i64(lsb, temp);
> + 
> + tcg_gen_shri_i64(msb, cpu_gpr[rS(ctx->opcode)], 32);
> + tcg_gen_bswap32_i64(temp, msb);
> + tcg_gen_shli_i64(msb, temp, 32);
> + 
> + tcg_gen_or_i64(temp, lsb, msb);
> +
> + tcg_gen_st_i64(temp, cpu_env, offsetof(CPUPPCState, gpr[rA(ctx->opcode)]));

Again, the store is wrong.

In addition, this can be computed as

tcg_gen_bswap64_i64(dest, source);
tcg_gen_rotli_i64(dest, dest, 32);
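
The equivalence is easy to check on the host with plain C. The helper names below are made up for this sketch: a full 64-bit byte swap also exchanges the two words, and the rotate by 32 puts them back, leaving each word byte-reversed in place.

```c
#include <stdint.h>

static uint32_t bswap32_sketch(uint32_t x)
{
    return (x >> 24) | ((x >> 8) & 0x0000ff00u) |
           ((x << 8) & 0x00ff0000u) | (x << 24);
}

static uint64_t bswap64_sketch(uint64_t x)
{
    return ((uint64_t)bswap32_sketch((uint32_t)x) << 32) |
           bswap32_sketch((uint32_t)(x >> 32));
}

/* Reference: reverse the bytes of each 32-bit word independently. */
static uint64_t brw_reference(uint64_t x)
{
    return ((uint64_t)bswap32_sketch((uint32_t)(x >> 32)) << 32) |
           bswap32_sketch((uint32_t)x);
}

/* Two-operation form: full 64-bit byte swap, then rotate by 32 bits. */
static uint64_t brw_two_ops(uint64_t x)
{
    uint64_t t = bswap64_sketch(x);

    return (t << 32) | (t >> 32);
}
```
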

> +static void gen_brh(DisasContext *ctx)
> +{
> + TCGv_i64 temp = tcg_temp_new_i64();
> + TCGv_i64 t0 = tcg_temp_new_i64();
> + TCGv_i64 t1 = tcg_temp_new_i64();
> + TCGv_i64 t2 = tcg_temp_new_i64();
> + TCGv_i64 t3 = tcg_temp_new_i64();
> +
> + tcg_gen_movi_i64(t0, 0x00ff00ff00ff00ffull);
> + tcg_gen_shri_i64(t1, cpu_gpr[rS(ctx->opcode)], 8);
> + tcg_gen_and_i64(t2, t1, t0);
> + tcg_gen_and_i64(t1, cpu_gpr[rS(ctx->opcode)], t0);
> + tcg_gen_shli_i64(t1, t1, 8);
> + tcg_gen_or_i64(temp, t1, t2);
> + tcg_gen_st_i64(temp, cpu_env, offsetof(CPUPPCState, gpr[rA(ctx->opcode)]));

Again, the store is wrong.


r~

Re: [PATCH v2] target/ppc: add vmsumudm vmsumcud instructions

On 6/15/20 1:53 PM, Lijun Pan wrote:
>>> +static inline void uint128_add(uint64_t ah, uint64_t al, uint64_t bh,
>>> +   uint64_t bl, uint64_t *rh, uint64_t *rl, uint64_t *ca)
>>> +{
>>> +   __uint128_t a = (__uint128_t)ah << 64 | (__uint128_t)al;
>>> +   __uint128_t b = (__uint128_t)bh << 64 | (__uint128_t)bl;
>>> +   __uint128_t r = a + b;
>>> +
>>> +   *rh = (uint64_t)(r >> 64);
>>> +   *rl = (uint64_t)r;
>>> +   *ca = (~a < b);
>>> +}
>>
>> This is *not* what I had in mind at all.
>>
>> int128.h should be operating on Int128, and *not* component uint64_t values.
> 
> Should uint128_add() be included in a new file called uint128.h? or still at host-utils.h?

If you want this sort of specific operation, you should leave it in target/ppc/.

I had been hoping that you could make use of Int128 as-is, or with minimal
adjustment in the same style.

> vmsumudm/vmsumcud operate as follows:
> 1. 128-bit prod1 = (high 64 bits of a) * (high 64 bits of b), // I reuse mulu64()
> 2. 128-bit prod2 = (high 64 bits of b) * (high 64 bits of b), // I reuse mulu64()
> 3. 128-bit result = prod1 + prod2 + c; // I added addu128() in v1, renamed it to uint128_add() in v2

Really?  That seems a very odd computation.  Your code,

> + prod1 = (__uint128_t)ah * (__uint128_t)bh;
> + prod2 = (__uint128_t)al * (__uint128_t)bl;
> + r = prod1 + prod2 + c;

is slightly different, but still very odd.

Why would we be adding the intermediate 128th bit of the 256-bit product
(prod1, bit 0) with the 0th bit of the 256-bit product (prod2, bit 0)?

Unfortunately, I can't find the v3.1 spec online yet, so I can't look at this
myself.  What is the instruction supposed to produce?

> To better understand your request, may I ask you several questions:
> 1. keep mulsum() in target/ppc/int_helper.c?

Probably.

> If so, it will inevitably have  #ifdef CONFIG_INT128 #else #endif in that function.

No, you don't have to ifdef.  You can use uint64_t alone and not rely on
compiler support for __uint128_t at all.
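
For example, a 128-bit add with carry-out needs nothing beyond uint64_t. This is only a sketch with a made-up name (`add128`), not a proposal for the final interface:

```c
#include <stdint.h>

/*
 * Add two 128-bit values held as (high, low) uint64_t pairs.
 * Stores the 128-bit sum in *rh:*rl and returns the carry out of
 * bit 127 (0 or 1).
 */
static unsigned add128(uint64_t ah, uint64_t al, uint64_t bh, uint64_t bl,
                       uint64_t *rh, uint64_t *rl)
{
    uint64_t lo = al + bl;
    unsigned carry_lo = lo < al;    /* low half wrapped: carry into high half */
    uint64_t hi = ah + bh;
    unsigned carry_hi = hi < ah;    /* high half wrapped on its own */
    uint64_t hi2 = hi + carry_lo;

    carry_hi |= hi2 < hi;           /* or wrapped when absorbing the low carry */

    *rl = lo;
    *rh = hi2;
    return carry_hi;
}
```

The two high-half carries can never both be set, so OR-ing them is sufficient.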

r~

Re: [PATCH v3 0/8] s390: Extended-Length SCCB & DIAGNOSE 0x318

Patchew URL: 
https://patchew.org/QEMU/2020061858.23287-1-wall...@linux.ibm.com/



Hi,

This series failed the asan build test. Please find the testing commands and
their output below. If you have Docker installed, you can probably reproduce it
locally.

=== TEST SCRIPT BEGIN ===
#!/bin/bash
export ARCH=x86_64
make docker-image-fedora V=1 NETWORK=1
time make docker-test-debug@fedora TARGET_LIST=x86_64-softmmu J=14 NETWORK=1
=== TEST SCRIPT END ===

  LINK    elf2dmp
  AR      libqemuutil.a
  CC      qemu-img.o
/usr/bin/ld: /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o): warning: common of `__interception::real_vfork' overridden by definition from /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o)
  AR      libvhost-user.a
  GEN     docs/interop/qemu-ga-ref.html
  GEN     docs/interop/qemu-ga-ref.txt
  GEN     docs/interop/qemu-ga-ref.7
  LINK    qemu-keymap
  LINK    ivshmem-client
  LINK    ivshmem-server
  LINK    qemu-nbd
  LINK    qemu-storage-daemon
  LINK    qemu-io
  LINK    qemu-edid
  LINK    fsdev/virtfs-proxy-helper
  LINK    scsi/qemu-pr-helper
  LINK    qemu-bridge-helper
  LINK    virtiofsd
  LINK    vhost-user-input
  LINK    qemu-ga
  LINK    qemu-img

Re: [PATCH v1 2/2] sifive_e: Support the revB machine

2020-06-18 Thread Palmer Dabbelt


On Wed, 10 Jun 2020 15:13:49 PDT (-0700), alistai...@gmail.com wrote:

On Thu, May 28, 2020 at 11:13 AM Alistair Francis  wrote:


On Thu, May 21, 2020 at 8:57 AM Alistair Francis  wrote:
>
> On Wed, May 20, 2020 at 4:08 PM Palmer Dabbelt  wrote:
> >
> > On Thu, 14 May 2020 13:47:10 PDT (-0700), Alistair Francis wrote:
> > > Signed-off-by: Alistair Francis 
> > > ---
> > >  hw/riscv/sifive_e.c | 35 +++
> > >  include/hw/riscv/sifive_e.h |  1 +
> > >  2 files changed, 32 insertions(+), 4 deletions(-)
> > >
> > > diff --git a/hw/riscv/sifive_e.c b/hw/riscv/sifive_e.c
> > > index 472a98970b..cb7818341b 100644
> > > --- a/hw/riscv/sifive_e.c
> > > +++ b/hw/riscv/sifive_e.c
> > > @@ -98,10 +98,14 @@ static void riscv_sifive_e_init(MachineState *machine)
> > >  memmap[SIFIVE_E_DTIM].base, main_mem);
> > >
> > >  /* Mask ROM reset vector */
> > > -uint32_t reset_vec[2] = {
> > > -0x204002b7,/* 0x1000: lui t0,0x20400 */
> > > -0x00028067,/* 0x1004: jr  t0 */
> > > -};
> > > +uint32_t reset_vec[2];
> > > +
> > > +if (s->revb) {
> > > +reset_vec[0] = 0x200102b7;/* 0x1000: lui t0,0x20010 
*/
> > > +} else {
> > > +reset_vec[0] = 0x204002b7;/* 0x1000: lui t0,0x20400 
*/
> > > +}
> > > +reset_vec[1] = 0x00028067;/* 0x1004: jr  t0 */
> > >
> > >  /* copy in the reset vector in little_endian byte order */
> > >  for (i = 0; i < sizeof(reset_vec) >> 2; i++) {
> > > @@ -115,8 +119,31 @@ static void riscv_sifive_e_init(MachineState 
*machine)
> > >  }
> > >  }
> > >
> > > +static bool sifive_e_machine_get_revb(Object *obj, Error **errp)
> > > +{
> > > +SiFiveEState *s = RISCV_E_MACHINE(obj);
> > > +
> > > +return s->revb;
> > > +}
> > > +
> > > +static void sifive_e_machine_set_revb(Object *obj, bool value, Error 
**errp)
> > > +{
> > > +SiFiveEState *s = RISCV_E_MACHINE(obj);
> > > +
> > > +s->revb = value;
> > > +}
> > > +
> > >  static void sifive_e_machine_instance_init(Object *obj)
> > >  {
> > > +SiFiveEState *s = RISCV_E_MACHINE(obj);
> > > +
> > > +s->revb = false;
> > > +object_property_add_bool(obj, "revb", sifive_e_machine_get_revb,
> > > + sifive_e_machine_set_revb, NULL);
> > > +object_property_set_description(obj, "revb",
> > > +"Set on to tell QEMU that it should model 
"
> > > +"the revB HiFive1 board",
> > > +NULL);
> > >  }
> > >
> > >  static void sifive_e_machine_class_init(ObjectClass *oc, void *data)
> > > diff --git a/include/hw/riscv/sifive_e.h b/include/hw/riscv/sifive_e.h
> > > index 414992119e..0d3cd07fcc 100644
> > > --- a/include/hw/riscv/sifive_e.h
> > > +++ b/include/hw/riscv/sifive_e.h
> > > @@ -45,6 +45,7 @@ typedef struct SiFiveEState {
> > >
> > >  /*< public >*/
> > >  SiFiveESoCState soc;
> > > +bool revb;
> > >  } SiFiveEState;
> > >
> > >  #define TYPE_RISCV_E_MACHINE MACHINE_TYPE_NAME("sifive_e")
> >
> > IIRC there are way more differences between the un-suffixed FE310 and the 
Rev
> > B, specifically the interrupt map is all different.
>
> The three IRQs that QEMU uses for the SiFive E (UART0, UART1 and GPIO)
> all seem to be the same.

Ping!


Ping^2

Applying to RISC-V tree.


They're not: uart0 is interrupt 3 on the rev b but 5 on the non-rev b (which
they don't call rev a but I'm going to :)).  There isn't even a uart1 in the
DTS on the rev a, and the GPIO interrupts are different as well.

The DTS files are in SiFive's SDK:

https://github.com/sifive/freedom-e-sdk/blob/master/bsp/sifive-hifive1-revb/core.dts
https://github.com/sifive/freedom-e-sdk/blob/master/bsp/sifive-hifive1/core.dts

which should also generate some test programs.  When I was there we tested on
QEMU for the platforms that were supported, so there should be some support for
doing so still.



Alistair



>
> Alistair

[Bug 1818075] Re: qemu x86 TCG doesn't support AVX insns

2020-06-18 Thread Ronald Antony

Of course it’s open source, I get that. When I say „xyz should be done“
then in the sense of „2+2 should be 4“ not in the sense of „you must
implement xyz right now“ ;)

Nonetheless, if you run e.g. on an ARM platform the command

qemu-system-x86_64 -cpu help

then it shouldn’t list a slew of CPUs as „available“ that clearly aren’t
working.

It should then list fully implemented CPUs separately from partially 
implemented CPUs (if listing them at all), and if it does list incomplete 
implementations, it should indicate what’s missing.
It’s just a horrible user experience, if based on such output, one spends 
significant time trying to get some emulation running, only to then discover 
from runtime error messages, that an „available“ CPU isn’t actually available.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1818075

Title:
  qemu x86 TCG doesn't support AVX insns

Status in QEMU:
  New

Bug description:
  I'm trying to execute code that has been built with -march=skylake
  -mtune=generic -mavx2 under qemu-user x86-64 with -cpu Skylake-Client.
  However this code just hangs at 100% CPU.

  Adding input tracing shows that it is likely hanging when dealing with
  an AVX instruction:

  warning: TCG doesn't support requested feature: CPUID.01H:ECX.fma [bit 12]
  warning: TCG doesn't support requested feature: CPUID.01H:ECX.pcid [bit 17]
  warning: TCG doesn't support requested feature: CPUID.01H:ECX.x2apic [bit 21]
  warning: TCG doesn't support requested feature: CPUID.01H:ECX.tsc-deadline [bit 24]
  warning: TCG doesn't support requested feature: CPUID.01H:ECX.avx [bit 28]
  warning: TCG doesn't support requested feature: CPUID.01H:ECX.f16c [bit 29]
  warning: TCG doesn't support requested feature: CPUID.01H:ECX.rdrand [bit 30]
  warning: TCG doesn't support requested feature: CPUID.07H:EBX.hle [bit 4]
  warning: TCG doesn't support requested feature: CPUID.07H:EBX.avx2 [bit 5]
  warning: TCG doesn't support requested feature: CPUID.07H:EBX.invpcid [bit 10]
  warning: TCG doesn't support requested feature: CPUID.07H:EBX.rtm [bit 11]
  warning: TCG doesn't support requested feature: CPUID.07H:EBX.rdseed [bit 18]
  warning: TCG doesn't support requested feature: CPUID.8001H:ECX.3dnowprefetch [bit 8]
  warning: TCG doesn't support requested feature: CPUID.0DH:EAX.xsavec [bit 1]

  IN:
  0x4000b4ef3b:  c5 fb 5c ca  vsubsd   %xmm2, %xmm0, %xmm1
  0x4000b4ef3f:  c4 e1 fb 2c d1   vcvttsd2si %xmm1, %rdx
  0x4000b4ef44:  4c 31 e2 xorq %r12, %rdx
  0x4000b4ef47:  48 85 d2 testq%rdx, %rdx
  0x4000b4ef4a:  79 9ejns  0x4000b4eeea

  [ hangs ]

  Attaching a gdb produces this stacktrace:

  (gdb) bt
  #0  canonicalize (status=0x55a20ff67a88, parm=0x55a20bb807e0 
, part=...)
  at 
/data/poky-tmp/master/work/x86_64-linux/qemu-native/3.1.0-r0/qemu-3.1.0/fpu/softfloat.c:350
  #1  float64_unpack_canonical (s=0x55a20ff67a88, f=0)
  at 
/data/poky-tmp/master/work/x86_64-linux/qemu-native/3.1.0-r0/qemu-3.1.0/fpu/softfloat.c:547
  #2  float64_sub (a=0, b=4890909195324358656, status=0x55a20ff67a88)
  at 
/data/poky-tmp/master/work/x86_64-linux/qemu-native/3.1.0-r0/qemu-3.1.0/fpu/softfloat.c:776
  #3  0x55a20baa1949 in helper_subsd (env=, 
d=0x55a20ff67ad8, s=)
  at 
/data/poky-tmp/master/work/x86_64-linux/qemu-native/3.1.0-r0/qemu-3.1.0/target/i386/ops_sse.h:623
  #4  0x55a20cfcfea8 in static_code_gen_buffer ()
  #5  0x55a20ba3f764 in cpu_tb_exec (itb=, 
cpu=0x55a20cea2180 )
  at 
/data/poky-tmp/master/work/x86_64-linux/qemu-native/3.1.0-r0/qemu-3.1.0/accel/tcg/cpu-exec.c:171
  #6  cpu_loop_exec_tb (tb_exit=, last_tb=, tb=,
  cpu=0x55a20cea2180 )
  at 
/data/poky-tmp/master/work/x86_64-linux/qemu-native/3.1.0-r0/qemu-3.1.0/accel/tcg/cpu-exec.c:615
  #7  cpu_exec (cpu=cpu@entry=0x55a20ff5f4d0)
  at 
/data/poky-tmp/master/work/x86_64-linux/qemu-native/3.1.0-r0/qemu-3.1.0/accel/tcg/cpu-exec.c:725
  #8  0x55a20ba6d728 in cpu_loop (env=0x55a20ff67780)
  at 
/data/poky-tmp/master/work/x86_64-linux/qemu-native/3.1.0-r0/qemu-3.1.0/linux-user/x86_64/../i386/cpu_loop.c:93
  #9  0x55a20ba049ff in main (argc=, argv=0x7ffc58572868, 
envp=)
  at 
/data/poky-tmp/master/work/x86_64-linux/qemu-native/3.1.0-r0/qemu-3.1.0/linux-user/main.c:819

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1818075/+subscriptions

Re: [PATCH v3 0/8] s390: Extended-Length SCCB & DIAGNOSE 0x318

Patchew URL: 
https://patchew.org/QEMU/2020061858.23287-1-wall...@linux.ibm.com/



Hi,

This series seems to have some coding style problems. See output below for
more information:

Subject: [PATCH v3 0/8] s390: Extended-Length SCCB & DIAGNOSE 0x318
Type: series
Message-id: 2020061858.23287-1-wall...@linux.ibm.com

=== TEST SCRIPT BEGIN ===
#!/bin/bash
git rev-parse base > /dev/null || exit 0
git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram
./scripts/checkpatch.pl --mailback base..
=== TEST SCRIPT END ===

From https://github.com/patchew-project/qemu
 * [new tag] patchew/2020061858.23287-1-wall...@linux.ibm.com -> 
patchew/2020061858.23287-1-wall...@linux.ibm.com
Switched to a new branch 'test'
7c099e1 s390: guest support for diagnose 0x318
11cf174 s390/kvm: header sync for diag318
9858413 s390/sclp: add extended-length sccb support for kvm guest
a8e6274 s390/sclp: use cpu offset to locate cpu entries
b839587 s390/sclp: read sccb from mem based on sccb length
2d754fb s390/sclp: rework sclp boundary and length checks
a7dce9a s390/sclp: check sccb len before filling in data
ea700d4 s390/sclp: get machine once during read scp/cpu info

=== OUTPUT BEGIN ===
1/8 Checking commit ea700d4d8a47 (s390/sclp: get machine once during read 
scp/cpu info)
2/8 Checking commit a7dce9a1bf29 (s390/sclp: check sccb len before filling in 
data)
3/8 Checking commit 2d754fbb29f6 (s390/sclp: rework sclp boundary and length 
checks)
4/8 Checking commit b839587fc52a (s390/sclp: read sccb from mem based on sccb 
length)
5/8 Checking commit a8e6274f786b (s390/sclp: use cpu offset to locate cpu 
entries)
6/8 Checking commit 985841391b37 (s390/sclp: add extended-length sccb support 
for kvm guest)
WARNING: line over 80 characters
#115: FILE: target/s390x/cpu_features_def.inc.h:100:
+DEF_FEAT(EXTENDED_LENGTH_SCCB, "els", STFL, 140, "Extended-length SCCB 
facility")

total: 0 errors, 1 warnings, 80 lines checked

Patch 6/8 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
7/8 Checking commit 11cf174b3455 (s390/kvm: header sync for diag318)
8/8 Checking commit 7c099e11b2ba (s390: guest support for diagnose 0x318)
ERROR: line over 90 characters
#103: FILE: target/s390x/cpu_features_def.inc.h:125:
+/* Features exposed via SCLP SCCB Facilities byte 134 (bit numbers relative to 
byte-134) */

WARNING: line over 80 characters
#104: FILE: target/s390x/cpu_features_def.inc.h:126:
+DEF_FEAT(DIAG_318, "diag318", SCLP_FAC134, 0, "Control program name and 
version codes")

total: 1 errors, 1 warnings, 161 lines checked

Patch 8/8 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

=== OUTPUT END ===

Test command exited with code: 1


The full log is available at
http://patchew.org/logs/2020061858.23287-1-wall...@linux.ibm.com/testing.checkpatch/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-de...@redhat.com

[PATCH v3 6/8] s390/sclp: add extended-length sccb support for kvm guest

As more features and facilities are added to the Read SCP Info (RSCPI)
response, more space is required to store them. The space used to store
these new features intrudes on the space originally used to store CPU
entries. This means as more features and facilities are added to the
RSCPI response, less space can be used to store CPU entries.

With the Extended-Length SCCB (ELS) facility, a KVM guest can execute
the RSCPI command and determine if the SCCB is large enough to store a
complete response. If it is not large enough, then the required length
will be set in the SCCB header.

The caller of the SCLP command is responsible for creating a
large-enough SCCB to store a complete response. Proper checking should
be in place, and the caller should execute the command once more with
the large-enough SCCB.
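
The length negotiation described above can be sketched roughly as follows.
This is a simplified illustration, not the patch itself: the header struct,
the 16-byte entry size, the response-code value and all sketch_* names are
assumptions made for the example, byte-order conversion is left out, and the
control-mask check the real code performs is omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the SCCB header (host byte order). */
struct sketch_sccb_header {
    uint16_t length;        /* length of the SCCB supplied by the caller */
    uint16_t response_code; /* set by the handler */
};

enum {
    SKETCH_RC_INSUFFICIENT_LENGTH = 0x0300, /* placeholder value */
    SKETCH_ENTRY_BYTES            = 16,     /* assumed CPU entry size */
};

/*
 * Returns true if the caller's SCCB can hold the fixed data plus one
 * entry per CPU. On failure the response code is set and, if the
 * extended-length facility is enabled, the required length is reported
 * back through the header so the caller can retry.
 */
static bool sketch_check_len(struct sketch_sccb_header *h, int num_cpus,
                             int data_len, bool els_enabled)
{
    int required = data_len + num_cpus * SKETCH_ENTRY_BYTES;

    assert(required > 0 && required <= UINT16_MAX);
    if (h->length < required) {
        h->response_code = SKETCH_RC_INSUFFICIENT_LENGTH;
        if (els_enabled) {
            h->length = (uint16_t)required;
        }
        return false;
    }
    return true;
}
```

A caller would check the return value, allocate an SCCB of the reported
length, and issue the command again.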

This facility also enables an extended SCCB for the Read CPU Info
(RCPUI) command.

When this facility is enabled, the boundary violation response cannot
be a result from the RSCPI, RSCPI Forced, or RCPUI commands.

In order to tolerate kernels that do not yet have full support for this
feature, a "fixed" offset to the start of the CPU Entries within the
Read SCP Info struct is set to allow for the original 248 max entries
when this feature is disabled.
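
As a rough check of the "248 max entries" figure: assuming a 4 KiB SCCB and
16-byte CPU entries (both assumptions for this sketch, as are the sketch_*
names), the number of entries that fit depends only on where the entry
region starts.

```c
#include <assert.h>
#include <stddef.h>

/* Assumptions for illustration: a 4 KiB SCCB and 16-byte CPU entries. */
enum {
    SKETCH_SCCB_SIZE      = 4096,
    SKETCH_CPU_ENTRY_SIZE = 16,
};

/* Number of whole CPU entries that fit after the first `offset` bytes. */
static size_t sketch_max_cpu_entries(size_t offset)
{
    assert(offset <= SKETCH_SCCB_SIZE);
    return (SKETCH_SCCB_SIZE - offset) / SKETCH_CPU_ENTRY_SIZE;
}
```

With the fixed offset of 128 this gives 248 entries; a hypothetical larger
offset of 144 would leave room for only 247, which is the kind of loss the
fixed offset is meant to avoid for kernels without ELS support.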

Additionally, this is introduced as a CPU feature to protect the guest
from migrating to a machine that does not support storing an extended
SCCB. This could otherwise hinder the VM from being able to read all
available CPU entries after migration (such as during re-ipl).

Signed-off-by: Collin Walling 
---
 hw/s390x/sclp.c | 21 -
 include/hw/s390x/sclp.h |  1 +
 target/s390x/cpu_features_def.inc.h |  1 +
 target/s390x/gen-features.c |  1 +
 target/s390x/kvm.c  |  8 
 5 files changed, 31 insertions(+), 1 deletion(-)

diff --git a/hw/s390x/sclp.c b/hw/s390x/sclp.c
index 0dfbe6e5ec..f7c49e339e 100644
--- a/hw/s390x/sclp.c
+++ b/hw/s390x/sclp.c
@@ -56,6 +56,18 @@ static bool sccb_has_valid_boundary(uint64_t sccb_addr, 
uint32_t code,
 uint64_t sccb_boundary = (sccb_addr & PAGE_MASK) + PAGE_SIZE;
 
 switch (code & SCLP_CMD_CODE_MASK) {
+case SCLP_CMDW_READ_SCP_INFO:
+case SCLP_CMDW_READ_SCP_INFO_FORCED:
+case SCLP_CMDW_READ_CPU_INFO:
+/*
+ * An extended-length SCCB is only allowed for Read SCP/CPU Info and
+ * is allowed to exceed the 4k boundary. The respective commands will
+ * set the length field to the required length if an insufficient
+ * SCCB length is provided.
+ */
+if (s390_has_feat(S390_FEAT_EXTENDED_LENGTH_SCCB)) {
+return true;
+}
 default:
 if (sccb_max_addr < sccb_boundary) {
 return true;
@@ -72,6 +84,10 @@ static bool sccb_sufficient_len(SCCB *sccb, int num_cpus, 
int data_len)
 
 if (be16_to_cpu(sccb->h.length) < required_len) {
 sccb->h.response_code = cpu_to_be16(SCLP_RC_INSUFFICIENT_SCCB_LENGTH);
+if (s390_has_feat(S390_FEAT_EXTENDED_LENGTH_SCCB) &&
+sccb->h.control_mask[2] & SCLP_VARIABLE_LENGTH_RESPONSE) {
+sccb->h.length = required_len;
+}
 return false;
 }
 return true;
@@ -101,7 +117,9 @@ static void prepare_cpu_entries(MachineState *ms, CPUEntry 
*entry, int *count)
  */
 static inline int get_read_scp_info_data_len(void)
 {
-return offsetof(ReadInfo, entries);
+return s390_has_feat(S390_FEAT_EXTENDED_LENGTH_SCCB) ?
+   offsetof(ReadInfo, entries) :
+   SCLP_READ_SCP_INFO_FIXED_CPU_OFFSET;
 }
 
 /* Provide information about the configuration, CPUs and storage */
@@ -116,6 +134,7 @@ static void read_SCP_info(SCLPDevice *sclp, SCCB *sccb)
 CPUEntry *entries_start = (void *)sccb + data_len;
 
 if (!sccb_sufficient_len(sccb, machine->possible_cpus->len, data_len)) {
+warn_report("insufficient sccb size to store read scp info response");
 return;
 }
 
diff --git a/include/hw/s390x/sclp.h b/include/hw/s390x/sclp.h
index 822eff4396..ef2d63eae9 100644
--- a/include/hw/s390x/sclp.h
+++ b/include/hw/s390x/sclp.h
@@ -110,6 +110,7 @@ typedef struct CPUEntry {
 uint8_t reserved1;
 } QEMU_PACKED CPUEntry;
 
+#define SCLP_READ_SCP_INFO_FIXED_CPU_OFFSET 128
 typedef struct ReadInfo {
 SCCBHeader h;
 uint16_t rnmax;
diff --git a/target/s390x/cpu_features_def.inc.h 
b/target/s390x/cpu_features_def.inc.h
index 5942f81f16..1c04cc18f4 100644
--- a/target/s390x/cpu_features_def.inc.h
+++ b/target/s390x/cpu_features_def.inc.h
@@ -97,6 +97,7 @@ DEF_FEAT(GUARDED_STORAGE, "gs", STFL, 133, "Guarded-storage 
facility")
 DEF_FEAT(VECTOR_PACKED_DECIMAL, "vxpd", STFL, 134, "Vector packed decimal 
facility")
 DEF_FEAT(VECTOR_ENH, "vxeh", STFL, 135, "Vector enhancements facility")
 DEF_FEAT(MULTIPLE_EPOCH, "mepoch", STFL, 139, "Multiple-epoch facility")
+DEF_FEAT(EXTENDED_LENGTH_SCCB, "els", STFL, 140, "Extended-length SCCB

[PATCH v3 5/8] s390/sclp: use cpu offset to locate cpu entries

The start of the CPU entry region in the Read SCP Info response data is
denoted by the offset_cpu field. As such, QEMU needs to begin creating
entries at this address. Note that the length of the Read SCP Info data
(data_len) denotes the same value as the cpu offset.

This is in preparation for when Read SCP Info inevitably introduces new
bytes that push the start of the CPUEntry field further away.

Read CPU Info is unlikely to ever change, so let's not bother
accounting for the offset there.

Signed-off-by: Collin Walling 
Reviewed-by: Thomas Huth 
---
 hw/s390x/sclp.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/hw/s390x/sclp.c b/hw/s390x/sclp.c
index 772b7b3b01..0dfbe6e5ec 100644
--- a/hw/s390x/sclp.c
+++ b/hw/s390x/sclp.c
@@ -113,13 +113,14 @@ static void read_SCP_info(SCLPDevice *sclp, SCCB *sccb)
 int rnsize, rnmax;
 IplParameterBlock *ipib = s390_ipl_get_iplb();
 int data_len = get_read_scp_info_data_len();
+CPUEntry *entries_start = (void *)sccb + data_len;
 
 if (!sccb_sufficient_len(sccb, machine->possible_cpus->len, data_len)) {
 return;
 }
 
 /* CPU information */
-prepare_cpu_entries(machine, read_info->entries, &cpu_count);
+prepare_cpu_entries(machine, entries_start, &cpu_count);
 read_info->entries_cpu = cpu_to_be16(cpu_count);
 read_info->offset_cpu = cpu_to_be16(data_len);
 read_info->highest_cpu = cpu_to_be16(machine->smp.max_cpus - 1);
-- 
2.21.3

[PATCH v3 8/8] s390: guest support for diagnose 0x318

DIAGNOSE 0x318 (diag318) is an s390 instruction that allows the storage
of diagnostic information that is collected by the firmware in the case
of hardware/firmware service events.

QEMU handles the instruction by storing the info in the CPU state. A
subsequent register sync will communicate the data to the hypervisor.

QEMU handles the migration via a VM State Description.

This feature depends on the Extended-Length SCCB (els) feature. If
els is not present, then a warning will be printed and the SCLP bit
that allows the Linux kernel to execute the instruction will not be
set.

Availability of this instruction is determined by byte 134 (aka fac134)
bit 0 of the SCLP Read Info block. This coincidentally expands into the
space used for CPU entries, which means VMs running with the diag318
capability may not be able to read information regarding all CPUs
unless the guest kernel supports an extended-length SCCB.
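
A guest-side test of that bit might look roughly like the sketch below. It
assumes the usual s390 convention that bit 0 is the most significant bit of
a byte; the function name and the raw-buffer view of the Read Info block are
hypothetical and only meant to illustrate the byte/bit arithmetic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum { SKETCH_FAC134_BYTE = 134 };

/*
 * Tests bit 0 of byte 134 in a raw Read Info response buffer.
 * Assumption: bit 0 is the leftmost (most significant) bit of the byte.
 */
static bool sketch_diag318_available(const uint8_t *read_info, unsigned len)
{
    assert(read_info != 0);
    if (len <= SKETCH_FAC134_BYTE) {
        return false; /* response too short to contain byte 134 */
    }
    return (read_info[SKETCH_FAC134_BYTE] & 0x80u) != 0;
}
```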

This feature is not supported in protected virtualization mode.

Signed-off-by: Collin Walling 
---
 hw/s390x/sclp.c |  5 +
 include/hw/s390x/sclp.h |  3 +++
 target/s390x/cpu.h  |  3 ++-
 target/s390x/cpu_features.h |  1 +
 target/s390x/cpu_features_def.inc.h |  3 +++
 target/s390x/cpu_models.c   |  1 +
 target/s390x/gen-features.c |  1 +
 target/s390x/kvm.c  | 31 +
 target/s390x/machine.c  | 17 
 9 files changed, 64 insertions(+), 1 deletion(-)

diff --git a/hw/s390x/sclp.c b/hw/s390x/sclp.c
index f7c49e339e..78dbfbe427 100644
--- a/hw/s390x/sclp.c
+++ b/hw/s390x/sclp.c
@@ -152,6 +152,11 @@ static void read_SCP_info(SCLPDevice *sclp, SCCB *sccb)
 s390_get_feat_block(S390_FEAT_TYPE_SCLP_CONF_CHAR_EXT,
  read_info->conf_char_ext);
 
+if (s390_has_feat(S390_FEAT_EXTENDED_LENGTH_SCCB)) {
+s390_get_feat_block(S390_FEAT_TYPE_SCLP_FAC134,
+&read_info->fac134);
+}
+
 read_info->facilities = cpu_to_be64(SCLP_HAS_CPU_INFO |
 SCLP_HAS_IOA_RECONFIG);
 
diff --git a/include/hw/s390x/sclp.h b/include/hw/s390x/sclp.h
index ef2d63eae9..ccb9f0a676 100644
--- a/include/hw/s390x/sclp.h
+++ b/include/hw/s390x/sclp.h
@@ -133,6 +133,9 @@ typedef struct ReadInfo {
 uint16_t highest_cpu;
 uint8_t  _reserved5[124 - 122]; /* 122-123 */
 uint32_t hmfai;
+uint8_t  _reserved7[134 - 128]; /* 128-133 */
+uint8_t  fac134;
+uint8_t  _reserved8[144 - 135]; /* 135-143 */
 struct CPUEntry entries[];
 } QEMU_PACKED ReadInfo;
 
diff --git a/target/s390x/cpu.h b/target/s390x/cpu.h
index 035427521c..52765961cf 100644
--- a/target/s390x/cpu.h
+++ b/target/s390x/cpu.h
@@ -112,6 +112,8 @@ struct CPUS390XState {
 uint16_t external_call_addr;
 DECLARE_BITMAP(emergency_signals, S390_MAX_CPUS);
 
+uint64_t diag318_info;
+
 /* Fields up to this point are cleared by a CPU reset */
 struct {} end_reset_fields;
 
@@ -136,7 +138,6 @@ struct CPUS390XState {
 
 /* currently processed sigp order */
 uint8_t sigp_order;
-
 };
 
 static inline uint64_t *get_freg(CPUS390XState *cs, int nr)
diff --git a/target/s390x/cpu_features.h b/target/s390x/cpu_features.h
index da695a8346..f74f7fc3a1 100644
--- a/target/s390x/cpu_features.h
+++ b/target/s390x/cpu_features.h
@@ -23,6 +23,7 @@ typedef enum {
 S390_FEAT_TYPE_STFL,
 S390_FEAT_TYPE_SCLP_CONF_CHAR,
 S390_FEAT_TYPE_SCLP_CONF_CHAR_EXT,
+S390_FEAT_TYPE_SCLP_FAC134,
 S390_FEAT_TYPE_SCLP_CPU,
 S390_FEAT_TYPE_MISC,
 S390_FEAT_TYPE_PLO,
diff --git a/target/s390x/cpu_features_def.inc.h 
b/target/s390x/cpu_features_def.inc.h
index 1c04cc18f4..f82b4b5ec1 100644
--- a/target/s390x/cpu_features_def.inc.h
+++ b/target/s390x/cpu_features_def.inc.h
@@ -122,6 +122,9 @@ DEF_FEAT(SIE_CMMA, "cmma", SCLP_CONF_CHAR_EXT, 1, "SIE: 
Collaborative-memory-man
 DEF_FEAT(SIE_PFMFI, "pfmfi", SCLP_CONF_CHAR_EXT, 9, "SIE: PFMF interpretation 
facility")
 DEF_FEAT(SIE_IBS, "ibs", SCLP_CONF_CHAR_EXT, 10, "SIE: 
Interlock-and-broadcast-suppression facility")
 
+/* Features exposed via SCLP SCCB Facilities byte 134 (bit numbers relative to 
byte-134) */
+DEF_FEAT(DIAG_318, "diag318", SCLP_FAC134, 0, "Control program name and 
version codes")
+
 /* Features exposed via SCLP CPU info. */
 DEF_FEAT(SIE_F2, "sief2", SCLP_CPU, 4, "SIE: interception format 2 (Virtual 
SIE)")
 DEF_FEAT(SIE_SKEY, "skey", SCLP_CPU, 5, "SIE: Storage-key facility")
diff --git a/target/s390x/cpu_models.c b/target/s390x/cpu_models.c
index 2fa609bffe..034673be54 100644
--- a/target/s390x/cpu_models.c
+++ b/target/s390x/cpu_models.c
@@ -827,6 +827,7 @@ static void check_consistency(const S390CPUModel *model)
 { S390_FEAT_PTFF_STOE, S390_FEAT_MULTIPLE_EPOCH },
 { S390_FEAT_PTFF_STOUE, S390_FEAT_MULTIPLE_EPOCH },
 { S390_FEAT_AP_QUEUE_INTERRUPT_CONTROL, S390_FEAT_AP },
+{ S390_FEAT_DIAG_318,

[PATCH v3 7/8] s390/kvm: header sync for diag318

Signed-off-by: Collin Walling 
---
 linux-headers/asm-s390/kvm.h | 5 -
 linux-headers/linux/kvm.h| 1 +
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/linux-headers/asm-s390/kvm.h b/linux-headers/asm-s390/kvm.h
index 0138ccb0d8..98665dff19 100644
--- a/linux-headers/asm-s390/kvm.h
+++ b/linux-headers/asm-s390/kvm.h
@@ -231,11 +231,13 @@ struct kvm_guest_debug_arch {
 #define KVM_SYNC_GSCB   (1UL << 9)
 #define KVM_SYNC_BPBC   (1UL << 10)
 #define KVM_SYNC_ETOKEN (1UL << 11)
+#define KVM_SYNC_DIAG318 (1UL << 12)
 
 #define KVM_SYNC_S390_VALID_FIELDS \
(KVM_SYNC_PREFIX | KVM_SYNC_GPRS | KVM_SYNC_ACRS | KVM_SYNC_CRS | \
 KVM_SYNC_ARCH0 | KVM_SYNC_PFAULT | KVM_SYNC_VRS | KVM_SYNC_RICCB | \
-KVM_SYNC_FPRS | KVM_SYNC_GSCB | KVM_SYNC_BPBC | KVM_SYNC_ETOKEN)
+KVM_SYNC_FPRS | KVM_SYNC_GSCB | KVM_SYNC_BPBC | KVM_SYNC_ETOKEN | \
+KVM_SYNC_DIAG318)
 
 /* length and alignment of the sdnx as a power of two */
 #define SDNXC 8
@@ -254,6 +256,7 @@ struct kvm_sync_regs {
__u64 pft;  /* pfault token [PFAULT] */
__u64 pfs;  /* pfault select [PFAULT] */
__u64 pfc;  /* pfault compare [PFAULT] */
+   __u64 diag318;  /* diagnose 0x318 info */
union {
__u64 vrs[32][2];   /* vector registers (KVM_SYNC_VRS) */
__u64 fprs[16]; /* fp registers (KVM_SYNC_FPRS) */
diff --git a/linux-headers/linux/kvm.h b/linux-headers/linux/kvm.h
index 9804495a46..444fdd977f 100644
--- a/linux-headers/linux/kvm.h
+++ b/linux-headers/linux/kvm.h
@@ -1017,6 +1017,7 @@ struct kvm_ppc_resize_hpt {
 #define KVM_CAP_S390_VCPU_RESETS 179
 #define KVM_CAP_S390_PROTECTED 180
 #define KVM_CAP_PPC_SECURE_GUEST 181
+#define KVM_CAP_S390_DIAG318 184
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
-- 
2.21.3

[PATCH v3 1/8] s390/sclp: get machine once during read scp/cpu info

Functions within read scp/cpu info will need access to the machine
state. Let's make a call to retrieve the machine state once and
pass the appropriate data to the respective functions.

Signed-off-by: Collin Walling 
Reviewed-by: David Hildenbrand 
Reviewed-by: Thomas Huth 
---
 hw/s390x/sclp.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/hw/s390x/sclp.c b/hw/s390x/sclp.c
index 20aca30ac4..7875334037 100644
--- a/hw/s390x/sclp.c
+++ b/hw/s390x/sclp.c
@@ -49,9 +49,8 @@ static inline bool sclp_command_code_valid(uint32_t code)
 return false;
 }
 
-static void prepare_cpu_entries(SCLPDevice *sclp, CPUEntry *entry, int *count)
+static void prepare_cpu_entries(MachineState *ms, CPUEntry *entry, int *count)
 {
-MachineState *ms = MACHINE(qdev_get_machine());
 uint8_t features[SCCB_CPU_FEATURE_LEN] = { 0 };
 int i;
 
@@ -77,7 +76,7 @@ static void read_SCP_info(SCLPDevice *sclp, SCCB *sccb)
 IplParameterBlock *ipib = s390_ipl_get_iplb();
 
 /* CPU information */
-prepare_cpu_entries(sclp, read_info->entries, &cpu_count);
+prepare_cpu_entries(machine, read_info->entries, &cpu_count);
 read_info->entries_cpu = cpu_to_be16(cpu_count);
 read_info->offset_cpu = cpu_to_be16(offsetof(ReadInfo, entries));
 read_info->highest_cpu = cpu_to_be16(machine->smp.max_cpus - 1);
@@ -132,10 +131,11 @@ static void read_SCP_info(SCLPDevice *sclp, SCCB *sccb)
 /* Provide information about the CPU */
 static void sclp_read_cpu_info(SCLPDevice *sclp, SCCB *sccb)
 {
+MachineState *machine = MACHINE(qdev_get_machine());
 ReadCpuInfo *cpu_info = (ReadCpuInfo *) sccb;
 int cpu_count;
 
-prepare_cpu_entries(sclp, cpu_info->entries, &cpu_count);
+prepare_cpu_entries(machine, cpu_info->entries, &cpu_count);
 cpu_info->nr_configured = cpu_to_be16(cpu_count);
 cpu_info->offset_configured = cpu_to_be16(offsetof(ReadCpuInfo, entries));
 cpu_info->nr_standby = cpu_to_be16(0);
-- 
2.21.3

[PATCH v3 4/8] s390/sclp: read sccb from mem based on sccb length

The header of the SCCB contains the actual length of the SCCB. Instead
of using a static 4K size, let's allow for a variable size determined
by the value set in the header. The proper checks are already in place
to ensure the SCCB length is sufficient to store a full response, and
that the length does not cross any explicitly-set boundaries.
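
The two-stage read can be sketched as below. This is only an illustration
under stated assumptions: guest memory is modelled as a plain byte buffer,
the header is taken to be 8 bytes with a big-endian length in its first two
bytes, and the sk_* names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum {
    SK_HEADER_LEN = 8,    /* assumed size of the SCCB header */
    SK_MAX_SCCB   = 4096, /* capacity of the work buffer in this sketch */
};

/*
 * Copies an SCCB from `guest` into `work` (which must hold SK_MAX_SCCB
 * bytes). The header is read first; the length it declares then decides
 * how much is copied. Returns the number of bytes copied, or 0 if the
 * declared length is not usable.
 */
static size_t sk_read_sccb(const uint8_t *guest, size_t guest_size,
                           uint8_t *work)
{
    uint16_t len;

    assert(guest != NULL && work != NULL);
    if (guest_size < SK_HEADER_LEN) {
        return 0;
    }
    /* Stage 1: header only. */
    memcpy(work, guest, SK_HEADER_LEN);
    len = (uint16_t)((work[0] << 8) | work[1]); /* big-endian length */

    /* Validate before trusting the guest-supplied length. */
    if (len < SK_HEADER_LEN || len > SK_MAX_SCCB || len > guest_size) {
        return 0;
    }
    /* Stage 2: the full SCCB, as sized by its own header. */
    memcpy(work, guest, len);
    return len;
}
```

The point of the ordering is that the length is validated between the two
copies, so the second copy never reads more than the checks allow.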

Signed-off-by: Collin Walling 
Reviewed-by: Thomas Huth 
---
 hw/s390x/sclp.c | 13 -
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/hw/s390x/sclp.c b/hw/s390x/sclp.c
index 0710138f91..772b7b3b01 100644
--- a/hw/s390x/sclp.c
+++ b/hw/s390x/sclp.c
@@ -256,9 +256,8 @@ int sclp_service_call_protected(CPUS390XState *env, 
uint64_t sccb,
 SCLPDevice *sclp = get_sclp_device();
 SCLPDeviceClass *sclp_c = SCLP_GET_CLASS(sclp);
 SCCB work_sccb;
-hwaddr sccb_len = sizeof(SCCB);
 
-s390_cpu_pv_mem_read(env_archcpu(env), 0, &work_sccb, sccb_len);
+s390_cpu_pv_mem_read(env_archcpu(env), 0, &work_sccb, sizeof(SCCBHeader));
 
 if (!sclp_command_code_valid(code)) {
 work_sccb.h.response_code = cpu_to_be16(SCLP_RC_INVALID_SCLP_COMMAND);
@@ -269,6 +268,9 @@ int sclp_service_call_protected(CPUS390XState *env, 
uint64_t sccb,
 goto out_write;
 }
 
+s390_cpu_pv_mem_read(env_archcpu(env), 0, &work_sccb,
+ be16_to_cpu(work_sccb.h.length));
+
 sclp_c->execute(sclp, &work_sccb, code);
 out_write:
 s390_cpu_pv_mem_write(env_archcpu(env), 0, &work_sccb,
@@ -283,8 +285,6 @@ int sclp_service_call(CPUS390XState *env, uint64_t sccb, 
uint32_t code)
 SCLPDeviceClass *sclp_c = SCLP_GET_CLASS(sclp);
 SCCB work_sccb;
 
-hwaddr sccb_len = sizeof(SCCB);
-
 /* first some basic checks on program checks */
 if (env->psw.mask & PSW_MASK_PSTATE) {
 return -PGM_PRIVILEGED;
@@ -302,7 +302,7 @@ int sclp_service_call(CPUS390XState *env, uint64_t sccb, 
uint32_t code)
  * from playing dirty tricks by modifying the memory content after
  * the host has checked the values
  */
-cpu_physical_memory_read(sccb, &work_sccb, sccb_len);
+cpu_physical_memory_read(sccb, &work_sccb, sizeof(SCCBHeader));
 
 /* Valid sccb sizes */
 if (be16_to_cpu(work_sccb.h.length) < sizeof(SCCBHeader)) {
@@ -318,6 +318,9 @@ int sclp_service_call(CPUS390XState *env, uint64_t sccb, 
uint32_t code)
 goto out_write;
 }
 
+/* the header contains the actual length of the sccb */
+cpu_physical_memory_read(sccb, &work_sccb, 
be16_to_cpu(work_sccb.h.length));
+
 sclp_c->execute(sclp, &work_sccb, code);
 out_write:
 cpu_physical_memory_write(sccb, &work_sccb,
-- 
2.21.3

[PATCH v3 3/8] s390/sclp: rework sclp boundary and length checks

Rework the SCLP boundary check to account for different SCLP commands
(eventually) allowing different boundary sizes.
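
For reference, the default page-boundary rule can be sketched as a small
standalone function. The 4 KiB page size and the sk_* names are assumptions
for the example; the arithmetic mirrors the check in the diff below.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SK_PAGE_SIZE 4096ULL
#define SK_PAGE_MASK (~(SK_PAGE_SIZE - 1))

/*
 * True if an SCCB of `length` bytes starting at `addr` stays within the
 * page that contains `addr`.
 */
static bool sk_within_page(uint64_t addr, uint16_t length)
{
    uint64_t max_addr, boundary;

    assert(length > 0);
    max_addr = addr + length - 1;                    /* last byte used */
    boundary = (addr & SK_PAGE_MASK) + SK_PAGE_SIZE; /* start of next page */
    return max_addr < boundary;
}
```

A page-aligned 4096-byte SCCB passes, while the same length starting 8 bytes
into the page does not.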

Move the length check code into a separate function, and introduce a
new function to determine the length of the read SCP data (i.e. the size
from the start of the struct to where the CPU entries should begin).

The format of read CPU info is unlikely to change in the future,
so we do not require a separate function to calculate its length.

Signed-off-by: Collin Walling 
---
 hw/s390x/sclp.c | 59 -
 1 file changed, 49 insertions(+), 10 deletions(-)

diff --git a/hw/s390x/sclp.c b/hw/s390x/sclp.c
index 181ce04007..0710138f91 100644
--- a/hw/s390x/sclp.c
+++ b/hw/s390x/sclp.c
@@ -49,6 +49,34 @@ static inline bool sclp_command_code_valid(uint32_t code)
 return false;
 }
 
+static bool sccb_has_valid_boundary(uint64_t sccb_addr, uint32_t code,
+SCCBHeader *header)
+{
+uint64_t sccb_max_addr = sccb_addr + be16_to_cpu(header->length) - 1;
+uint64_t sccb_boundary = (sccb_addr & PAGE_MASK) + PAGE_SIZE;
+
+switch (code & SCLP_CMD_CODE_MASK) {
+default:
+if (sccb_max_addr < sccb_boundary) {
+return true;
+}
+}
+header->response_code = cpu_to_be16(SCLP_RC_SCCB_BOUNDARY_VIOLATION);
+return false;
+}
+
+/* Calculates sufficient SCCB length to store a full Read SCP/CPU response */
+static bool sccb_sufficient_len(SCCB *sccb, int num_cpus, int data_len)
+{
+int required_len = data_len + num_cpus * sizeof(CPUEntry);
+
+if (be16_to_cpu(sccb->h.length) < required_len) {
+sccb->h.response_code = cpu_to_be16(SCLP_RC_INSUFFICIENT_SCCB_LENGTH);
+return false;
+}
+return true;
+}
+
 static void prepare_cpu_entries(MachineState *ms, CPUEntry *entry, int *count)
 {
 uint8_t features[SCCB_CPU_FEATURE_LEN] = { 0 };
@@ -66,6 +94,16 @@ static void prepare_cpu_entries(MachineState *ms, CPUEntry 
*entry, int *count)
 }
 }
 
+/*
+ * The data length denotes the start of the struct to where the first
+ * CPU entry is to be allocated. This value also denotes the offset_cpu
+ * field.
+ */
+static inline int get_read_scp_info_data_len(void)
+{
+return offsetof(ReadInfo, entries);
+}
+
 /* Provide information about the configuration, CPUs and storage */
 static void read_SCP_info(SCLPDevice *sclp, SCCB *sccb)
 {
@@ -74,17 +112,16 @@ static void read_SCP_info(SCLPDevice *sclp, SCCB *sccb)
 int cpu_count;
 int rnsize, rnmax;
 IplParameterBlock *ipib = s390_ipl_get_iplb();
+int data_len = get_read_scp_info_data_len();
 
-if (be16_to_cpu(sccb->h.length) <
-  (sizeof(ReadInfo) + machine->possible_cpus->len * sizeof(CPUEntry))) 
{
-sccb->h.response_code = cpu_to_be16(SCLP_RC_INSUFFICIENT_SCCB_LENGTH);
+if (!sccb_sufficient_len(sccb, machine->possible_cpus->len, data_len)) {
 return;
 }
 
 /* CPU information */
 prepare_cpu_entries(machine, read_info->entries, &cpu_count);
 read_info->entries_cpu = cpu_to_be16(cpu_count);
-read_info->offset_cpu = cpu_to_be16(offsetof(ReadInfo, entries));
+read_info->offset_cpu = cpu_to_be16(data_len);
 read_info->highest_cpu = cpu_to_be16(machine->smp.max_cpus - 1);
 
 read_info->ibc_val = cpu_to_be32(s390_get_ibc_val());
@@ -133,17 +170,16 @@ static void sclp_read_cpu_info(SCLPDevice *sclp, SCCB 
*sccb)
 {
 MachineState *machine = MACHINE(qdev_get_machine());
 ReadCpuInfo *cpu_info = (ReadCpuInfo *) sccb;
+int data_len = offsetof(ReadCpuInfo, entries);
 int cpu_count;
 
-if (be16_to_cpu(sccb->h.length) <
-  (sizeof(ReadInfo) + machine->possible_cpus->len * sizeof(CPUEntry))) 
{
-sccb->h.response_code = cpu_to_be16(SCLP_RC_INSUFFICIENT_SCCB_LENGTH);
+if (!sccb_sufficient_len(sccb, machine->possible_cpus->len, data_len)) {
 return;
 }
 
 prepare_cpu_entries(machine, cpu_info->entries, &cpu_count);
 cpu_info->nr_configured = cpu_to_be16(cpu_count);
-cpu_info->offset_configured = cpu_to_be16(offsetof(ReadCpuInfo, entries));
+cpu_info->offset_configured = cpu_to_be16(data_len);
 cpu_info->nr_standby = cpu_to_be16(0);
 
 /* The standby offset is 16-byte for each CPU */
@@ -229,6 +265,10 @@ int sclp_service_call_protected(CPUS390XState *env, 
uint64_t sccb,
 goto out_write;
 }
 
+if (!sccb_has_valid_boundary(sccb, code, &work_sccb.h)) {
+goto out_write;
+}
+
 sclp_c->execute(sclp, &work_sccb, code);
 out_write:
 s390_cpu_pv_mem_write(env_archcpu(env), 0, &work_sccb,
@@ -274,8 +314,7 @@ int sclp_service_call(CPUS390XState *env, uint64_t sccb, 
uint32_t code)
 goto out_write;
 }
 
-if ((sccb + be16_to_cpu(work_sccb.h.length)) > ((sccb & PAGE_MASK) + 
PAGE_SIZE)) {
-work_sccb.h.response_code = 
cpu_to_be16(SCLP_RC_SCCB_BOUNDARY_VIOLATION);
+if (!sccb_has_valid_boundary(sccb, code, &work_sccb.h)) {
 goto out_write;
 }
 
--
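
The boundary test that this patch moves into sccb_has_valid_boundary() can be illustrated in isolation. The following is a minimal sketch only, assuming a 4 KiB page and host-endian values; the DEMO_* names and demo_within_page() are made up for illustration and are not part of the patch, which additionally handles byte order and the response code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption for illustration: 4 KiB pages, as on s390x. */
#define DEMO_PAGE_SIZE 4096ULL
#define DEMO_PAGE_MASK (~(DEMO_PAGE_SIZE - 1))

/*
 * Return true if an SCCB of `length` bytes starting at guest address
 * `sccb_addr` ends at or before the end of the page it starts in.
 * This mirrors the removed open-coded test:
 *   (sccb + length) > ((sccb & PAGE_MASK) + PAGE_SIZE)  => violation
 */
bool demo_within_page(uint64_t sccb_addr, uint16_t length)
{
    /* The mask trick only works for power-of-two page sizes. */
    assert((DEMO_PAGE_SIZE & (DEMO_PAGE_SIZE - 1)) == 0);

    uint64_t page_end = (sccb_addr & DEMO_PAGE_MASK) + DEMO_PAGE_SIZE;
    return sccb_addr + length <= page_end;
}
```

A page-aligned SCCB may use the whole page, while the same length starting eight bytes later crosses into the next page and would be rejected.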

[PATCH v3 2/8] s390/sclp: check sccb len before filling in data

The SCCB must be checked for a sufficient length before it is filled
with any data. If the length is insufficient, then the SCLP command
is suppressed and the proper response code is set in the SCCB header.

Fixes: 832be0d8a3bb ("s390x: sclp: Report insufficient SCCB length")
Signed-off-by: Collin Walling 
Reviewed-by: Janosch Frank 
---
 hw/s390x/sclp.c | 24 
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/hw/s390x/sclp.c b/hw/s390x/sclp.c
index 7875334037..181ce04007 100644
--- a/hw/s390x/sclp.c
+++ b/hw/s390x/sclp.c
@@ -75,6 +75,12 @@ static void read_SCP_info(SCLPDevice *sclp, SCCB *sccb)
 int rnsize, rnmax;
 IplParameterBlock *ipib = s390_ipl_get_iplb();
 
+if (be16_to_cpu(sccb->h.length) <
+  (sizeof(ReadInfo) + machine->possible_cpus->len * sizeof(CPUEntry))) 
{
+sccb->h.response_code = cpu_to_be16(SCLP_RC_INSUFFICIENT_SCCB_LENGTH);
+return;
+}
+
 /* CPU information */
 prepare_cpu_entries(machine, read_info->entries, &cpu_count);
 read_info->entries_cpu = cpu_to_be16(cpu_count);
@@ -83,12 +89,6 @@ static void read_SCP_info(SCLPDevice *sclp, SCCB *sccb)
 
 read_info->ibc_val = cpu_to_be32(s390_get_ibc_val());
 
-if (be16_to_cpu(sccb->h.length) <
-(sizeof(ReadInfo) + cpu_count * sizeof(CPUEntry))) {
-sccb->h.response_code = cpu_to_be16(SCLP_RC_INSUFFICIENT_SCCB_LENGTH);
-return;
-}
-
 /* Configuration Characteristic (Extension) */
 s390_get_feat_block(S390_FEAT_TYPE_SCLP_CONF_CHAR,
  read_info->conf_char);
@@ -135,17 +135,17 @@ static void sclp_read_cpu_info(SCLPDevice *sclp, SCCB 
*sccb)
 ReadCpuInfo *cpu_info = (ReadCpuInfo *) sccb;
 int cpu_count;
 
-prepare_cpu_entries(machine, cpu_info->entries, &cpu_count);
-cpu_info->nr_configured = cpu_to_be16(cpu_count);
-cpu_info->offset_configured = cpu_to_be16(offsetof(ReadCpuInfo, entries));
-cpu_info->nr_standby = cpu_to_be16(0);
-
 if (be16_to_cpu(sccb->h.length) <
-(sizeof(ReadCpuInfo) + cpu_count * sizeof(CPUEntry))) {
+  (sizeof(ReadInfo) + machine->possible_cpus->len * sizeof(CPUEntry))) 
{
 sccb->h.response_code = cpu_to_be16(SCLP_RC_INSUFFICIENT_SCCB_LENGTH);
 return;
 }
 
+prepare_cpu_entries(machine, cpu_info->entries, &cpu_count);
+cpu_info->nr_configured = cpu_to_be16(cpu_count);
+cpu_info->offset_configured = cpu_to_be16(offsetof(ReadCpuInfo, entries));
+cpu_info->nr_standby = cpu_to_be16(0);
+
 /* The standby offset is 16-byte for each CPU */
 cpu_info->offset_standby = cpu_to_be16(cpu_info->offset_configured
 + cpu_info->nr_configured*sizeof(CPUEntry));
-- 
2.21.3
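
The ordering this patch enforces (validate the length first, write data second) can be shown with a small standalone model. This is a sketch under stated assumptions, not the QEMU code: struct demo_hdr, the DEMO_RC_* values and both functions are hypothetical, values are kept in host byte order, and each CPU entry is reduced to a single marker byte.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative response codes; these are NOT the real SCLP values. */
enum { DEMO_RC_OK = 0, DEMO_RC_INSUFFICIENT_LENGTH = 1 };

/* Hypothetical, simplified header (host byte order). */
struct demo_hdr {
    uint16_t length;
    uint16_t response_code;
};

/* Check that the guest-supplied length covers header data plus all entries. */
bool demo_sufficient_len(struct demo_hdr *h, size_t num_cpus,
                         size_t data_len, size_t entry_len)
{
    size_t required = data_len + num_cpus * entry_len;

    if (h->length < required) {
        h->response_code = DEMO_RC_INSUFFICIENT_LENGTH;
        return false;
    }
    return true;
}

/* Write one marker byte per entry, but only after the length check passes. */
bool demo_fill_entries(struct demo_hdr *h, uint8_t *buf, size_t buf_size,
                       size_t num_cpus, size_t data_len, size_t entry_len)
{
    /* The backing buffer must be at least as large as the claimed length. */
    assert(buf_size >= h->length);

    if (!demo_sufficient_len(h, num_cpus, data_len, entry_len)) {
        return false; /* command suppressed: nothing has been written */
    }
    for (size_t i = 0; i < num_cpus; i++) {
        buf[data_len + i * entry_len] = 0xff;
    }
    return true;
}
```

With the check placed first, a too-short SCCB gets only the response code set and its data area stays untouched, which is the behaviour the commit message describes.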

[PATCH v3 0/8] s390: Extended-Length SCCB & DIAGNOSE 0x318

Changelog:

v3

• Device IOCTLs removed
- diag 318 info is now communicated via sync_regs

• Reset code removed
- this is now handled in KVM
- diag318_info is stored within the CPU reset portion of the
S390CPUState

• Various cleanups for ELS preliminary patches

v2

• QEMU now handles the instruction call
- as such, the "enable diag 318" IOCTL has been removed

• patch #1 now changes the read scp/cpu info functions to
  retrieve the machine state once
- as such, I have not added any ack's or r-bs since this
  patch differs from the previous version

• patch #3 introduces a new "get_read_scp_info_data_len"
  function in order to clean up the variable data length assignment
  in patch #7
- a comment above this function should help clarify what's
  going on to make things a bit easier to read

• other misc clean ups and fixes
- s/diag318/diag_318 in order to keep the naming scheme
  consistent with Linux and other diag-related code
- s/byte_134/fac134 to align naming scheme with Linux

---

This patch series introduces two features for an s390 KVM guest:
- Extended-Length SCCB (els) for the Read SCP/CPU Info SCLP 
commands
- DIAGNOSE 0x318 (diag_318) enabling / migration handling

The diag 318 feature depends on els and KVM support.

The els feature is handled entirely with QEMU, and does not require 
KVM support.

Both features are made available starting with the zEC12-full model.

These patches are introduced together for two main reasons:
- els allows diag 318 to exist while retaining the original 248 
VCPU max
- diag 318 is presented to show how els is useful

Full els support is dependent on the Linux kernel, which must react
to the SCLP response code and set an appropriate-length SCCB.

A user should take care when tuning the CPU model for a VM.
If a user defines a VM with els support and specifies 248 CPUs, but
the guest Linux kernel cannot react to the SCLP response code, then
the guest will crash immediately upon kernel startup.

Collin L. Walling (8):
  s390/sclp: get machine once during read scp/cpu info
  s390/sclp: check sccb len before filling in data
  s390/sclp: rework sclp boundary and length checks
  s390/sclp: read sccb from mem based on sccb length
  s390/sclp: use cpu offset to locate cpu entries
  s390/sclp: add extended-length sccb support for kvm guest
  s390/kvm: header sync for diag318
  s390: guest support for diagnose 0x318

 hw/s390x/sclp.c | 117 ++--
 include/hw/s390x/sclp.h |   4 +
 linux-headers/asm-s390/kvm.h|   5 +-
 linux-headers/linux/kvm.h   |   1 +
 target/s390x/cpu.h  |   3 +-
 target/s390x/cpu_features.h |   1 +
 target/s390x/cpu_features_def.inc.h |   4 +
 target/s390x/cpu_models.c   |   1 +
 target/s390x/gen-features.c |   2 +
 target/s390x/kvm.c  |  39 ++
 target/s390x/machine.c  |  17 
 11 files changed, 167 insertions(+), 27 deletions(-)

-- 
2.21.3
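
The 248 VCPU figure mentioned above can be reproduced with simple arithmetic. The sketch below is an illustration under assumptions that are not stated in this cover letter: a single-page 4096-byte SCCB, 16-byte CPU entries (the per-CPU size mentioned in a sclp.c comment), and a 128-byte block of data before the first CPU entry. demo_max_cpu_entries() is a hypothetical helper, not a QEMU function.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Number of fixed-size CPU entries that fit into an SCCB of sccb_len
 * bytes when the first entry starts at offset header_len.
 */
size_t demo_max_cpu_entries(size_t sccb_len, size_t header_len,
                            size_t entry_len)
{
    assert(entry_len > 0);

    if (sccb_len < header_len) {
        return 0; /* not even the fixed part fits */
    }
    return (sccb_len - header_len) / entry_len;
}
```

Under these assumptions, demo_max_cpu_entries(4096, 128, 16) gives 248. If the fixed part grew by one 16-byte slot (a hypothetical 144-byte offset), the same page would hold only 247 entries, which is the kind of loss an SCCB longer than one page avoids.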

[Bug 1884095] Re: QEMU not sufficiently focused on qEMUlation, with resulting holes in TCG emulation coverage

2020-06-18 Thread Ronald Antony

BTW: just because I bracket a report with why I think a matter is worth
fixing, shouldn’t make it „invalid“.

The instructions aren’t implemented, yet the CPUs are listed as
available, which is a bug in my book, as functionality is advertised
that is unavailable.

--
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1884095

Title:
QEMU not sufficiently focused on qEMUlation, with resulting holes in
TCG emulation coverage

Status in QEMU:
Invalid

Bug description:
It seems that QEMU has stopped emphasizing the EMU part of the name,
and is too much focused on virtualization.

My interest is at running legacy operating systems, and as such, they must
run on foreign CPU platforms. m68 on intel, intel on ARM, etc.
Time doesn't stand still, and reliance on KVM and similar x86-on-x86 tricks,
which allow the delegation of certain CPU features to the host CPU is going to
not work going forward.

If the rumored transition of Apple to ARM is going to take place,
people will want to e.g. emulate for testing or legacy purposes a
variety of operating systems, incl. NeXTSTEP, Windows, earlier
versions of MacOS on ARM Macs.

Testing that scenario, i.e. macOS on an ARM board with the lowest
possible CPU capable of running modern macOS, results in these
problems (and of course utter failure achieving the goal):

qemu-system-x86_64: warning: TCG doesn't support requested feature:
CPUID.01H:ECX.fma [bit 12]
qemu-system-x86_64: warning: TCG doesn't support requested feature:
CPUID.01H:ECX.avx [bit 28]
qemu-system-x86_64: warning: TCG doesn't support requested feature:
CPUID.07H:EBX.avx2 [bit 5]
qemu-system-x86_64: warning: TCG doesn't support requested feature:
CPUID.8007H:EDX.invtsc [bit 8]
qemu-system-x86_64: warning: TCG doesn't support requested feature:
CPUID.0DH:EAX.xsavec [bit 1]

And this is emulating a lowly Penryn CPU with the required CPU flags for
macOS:
-cpu
Penryn,vendor=GenuineIntel,+sse3,+sse4.2,+aes,+xsave,+avx,+xsaveopt,+xsavec,+xgetbv1,+avx2,+bmi2,+smep,+bmi1,+fma,+movbe,+invtsc

Attempting to emulate a more feature laden intel CPU results in even
more issues.

I would propose that no CPU should be considered supported unless it
can be fully handled by TCG on a non-native host. KVM, native-on-
native etc. are nice to have, but peripheral to qEMUlation when it
boils down to it. At the very least, there should be a CLEAR
distinction which CPUs require KVM to be used, and which can be fully
emulated. It should not require wasting an afternoon to figure out
that an emulation attempt is futile because TCG lacks essential
functionality.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1884095/+subscriptions

[Bug 1884095] Re: QEMU not sufficiently focused on qEMUlation, with resulting holes in TCG emulation coverage

2020-06-18 Thread Ronald Antony

The comments with the other reports were just in support of getting them
fixed, and providing a reason as to why that matters. Someone looking at
those reports may not read this one, and as the issues are symptoms of
the same larger issue, this report was filed as an overarching report,
as AVX is just one aspect. Depending on the CPU model picked, an entire
slew of error messages are generated.

Fact is, an emulator that claims it emulates a CPU has a bug, if that
CPU cannot be properly emulated. Hence this report.

For the emulator not to have to be considered buggy,
EITHER
the CPU type has to be delisted as supported
OR
the missing instructions must be implemented.

But it’s not proper to say QEMU can emulate an x86_64 Penryn system,
when trying to do so fails miserably because of instructions
unimplemented in TCG.

At the very least the documentation and online help would have to
distinguish between KVM-only CPU types and TCG CPU types.

Downloading the QEMU 5 sources and compiling them on an ARM64
platform results in

qemu-system-x86_64 -cpu help

listing all sorts of CPUs as „available“ even though these have
significant gaps in the covered instruction set. If that’s not a bug, I
don’t know what is.

How you go about fixing it is a different matter. You could remove the CPUs,
mark them as incompletely implemented, or add support for the missing features.
It might even be possible to interest Intel in contributing code from their
SDE project to TCG.


Re: RFC: use VFIO over a UNIX domain socket to implement device offloading

2020-06-18 Thread John G Johnson




> On Jun 15, 2020, at 3:49 AM, Stefan Hajnoczi  wrote:
> 
> 
> It's challenging to implement a fast and secure IOMMU. The simplest
> approach is secure but not fast: add protocol messages for
> DMA_READ(iova, length) and DMA_WRITE(iova, buffer, length).
> 

We do have protocol messages for the case where no FD is
associated with the DMA region:  VFIO_USER_DMA_READ/WRITE.


> An issue with file descriptor passing is that it's hard to revoke access
> once the file descriptor has been passed. memfd supports sealing with
> fcntl(F_ADD_SEALS), but it doesn't revoke mmap(MAP_WRITE) on other processes.
> 
> Memory Protection Keys don't seem to be useful here either and their
> availability is limited (see pkeys(7)).
> 
> One crazy idea is to use KVM as a sandbox for running the device and let
> the vIOMMU control the page tables instead of the device (guest). That
> way the hardware MMU provides memory translation, but I think this is
> impractical because the guest environment is too different from the
> Linux userspace environment.
> 
> As a starting point adding DMA_READ/DMA_WRITE messages would provide the
> functionality and security. Unfortunately it makes DMA expensive and
> performance will suffer.
> 

Are you advocating for only using VFIO_USER_DMA_READ/WRITE and
not passing FDs at all?  The performance penalty would be large for the
cases where the client and server are equally trusted.  Or are you
advocating for an option where the slower methods are used for cases
where the server is less trusted?

JJ

Re: [PATCH] riscv: plic: Add a couple of mising sifive_plic_update calls

Patchew URL: https://patchew.org/QEMU/20200618210649.22451-1-jrt...@jrtc27.com/



Hi,

This series failed the asan build test. Please find the testing commands and
their output below. If you have Docker installed, you can probably reproduce it
locally.

=== TEST SCRIPT BEGIN ===
#!/bin/bash
export ARCH=x86_64
make docker-image-fedora V=1 NETWORK=1
time make docker-test-debug@fedora TARGET_LIST=x86_64-softmmu J=14 NETWORK=1
=== TEST SCRIPT END ===

  CC  qga/commands.o
  CC  qga/guest-agent-command-state.o
  CC  qga/main.o
/usr/bin/ld: 
/usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o):
 warning: common of `__interception::real_vfork' overridden by definition from 
/usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o)
  CC  qga/commands-posix.o
  CC  qga/channel-posix.o
  CC  qga/qapi-generated/qga-qapi-types.o
---
  AR  libqemuutil.a
  LINKelf2dmp
  CC  qemu-img.o
/usr/bin/ld: 
/usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o):
 warning: common of `__interception::real_vfork' overridden by definition from 
/usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o)
  AS  pc-bios/optionrom/multiboot.o
  AS  pc-bios/optionrom/pvh.o
  AS  pc-bios/optionrom/linuxboot.o
---
  LINKqemu-io
  LINKqemu-edid
  LINKfsdev/virtfs-proxy-helper
/usr/bin/ld: 
/usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o):
 warning: common of `__interception::real_vfork' overridden by definition from 
/usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o)
/usr/bin/ld: 
/usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o):
 warning: common of `__interception::real_vfork' overridden by definition from 
/usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o)
  LINKscsi/qemu-pr-helper
  BUILD   pc-bios/optionrom/linuxboot_dma.img
  LINKqemu-bridge-helper
---
  BUILD   pc-bios/optionrom/linuxboot.raw
  BUILD   pc-bios/optionrom/multiboot.raw
  BUILD   pc-bios/optionrom/kvmvapic.raw
/usr/bin/ld: 
/usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o):
 warning: common of `__interception::real_vfork' overridden by definition from 
/usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o)
  GEN docs/interop/qemu-ga-ref.html
  GEN docs/interop/qemu-ga-ref.txt
  SIGNpc-bios/optionrom/linuxboot.bin
---
  LINKivshmem-server
  LINKqemu-nbd
  LINKqemu-storage-daemon
/usr/bin/ld: 
/usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o):
 warning: common of `__interception::real_vfork' overridden by definition from 
/usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o)
  LINKqemu-img
  LINKvirtiofsd
/usr/bin/ld: 
/usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o):
 warning: common

[PATCH] riscv: plic: Add a couple of mising sifive_plic_update calls

2020-06-18 Thread Jessica Clarke

Claiming an interrupt and changing the source priority both potentially
affect whether an interrupt is pending, thus we must re-compute xEIP.
Note that we don't put the sifive_plic_update inside sifive_plic_claim
so that the logging of a claim (and the resulting IRQ) happens before
the state update, making the causal effect clear, and that we drop the
explicit call to sifive_plic_print_state when claiming since
sifive_plic_update already does that automatically at the end for us.

This can result in both spurious interrupt storms if you fail to
complete an IRQ before enabling interrupts (and no other actions occur
that result in a call to sifive_plic_update), but also more importantly
lost interrupts if a disabled interrupt is pending and then becomes
enabled.

Signed-off-by: Jessica Clarke 
---
 hw/riscv/sifive_plic.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/hw/riscv/sifive_plic.c b/hw/riscv/sifive_plic.c
index d91e82b8ab..c20c192034 100644
--- a/hw/riscv/sifive_plic.c
+++ b/hw/riscv/sifive_plic.c
@@ -255,8 +255,8 @@ static uint64_t sifive_plic_read(void *opaque, hwaddr addr, 
unsigned size)
 plic->addr_config[addrid].hartid,
 mode_to_char(plic->addr_config[addrid].mode),
 value);
-sifive_plic_print_state(plic);
 }
+sifive_plic_update(plic);
 return value;
 }
 }
@@ -287,6 +287,7 @@ static void sifive_plic_write(void *opaque, hwaddr addr, 
uint64_t value,
 qemu_log("plic: write priority: irq=%d priority=%d\n",
 irq, plic->source_priority[irq]);
 }
+sifive_plic_update(plic);
 return;
 } else if (addr >= plic->pending_base && /* 1 bit per source */
addr < plic->pending_base + (plic->num_sources >> 3))
-- 
2.20.1
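
Why both a claim and a priority write need a recomputation can be seen in a toy model. The following is a minimal sketch, not the sifive_plic.c implementation: struct demo_plic and the demo_plic_* functions are hypothetical, the model has a single context with 32 sources, and it assumes the usual PLIC rule that a source is signalled only when it is pending, enabled, and its priority is above the threshold.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_NUM_SOURCES 32

/* Hypothetical single-context PLIC model; source 0 is reserved. */
struct demo_plic {
    uint32_t pending;                     /* one bit per source */
    uint32_t enabled;                     /* one bit per source */
    uint8_t priority[DEMO_NUM_SOURCES];   /* 0 means "never interrupt" */
    uint8_t threshold;
    bool eip;                             /* derived external-interrupt line */
};

/* Recompute the derived interrupt line from the current state. */
void demo_plic_update(struct demo_plic *p)
{
    uint32_t active = p->pending & p->enabled;

    p->eip = false;
    for (int irq = 1; irq < DEMO_NUM_SOURCES; irq++) {
        if ((active & (1u << irq)) && p->priority[irq] > p->threshold) {
            p->eip = true;
        }
    }
}

/* A priority write changes which sources qualify, so update afterwards. */
void demo_plic_set_priority(struct demo_plic *p, int irq, uint8_t prio)
{
    assert(irq > 0 && irq < DEMO_NUM_SOURCES);
    p->priority[irq] = prio;
    demo_plic_update(p);
}

/* Claim the highest-priority pending source; returns 0 if there is none. */
int demo_plic_claim(struct demo_plic *p)
{
    uint32_t active = p->pending & p->enabled;
    int best = 0;
    uint8_t best_prio = 0;

    for (int irq = 1; irq < DEMO_NUM_SOURCES; irq++) {
        if ((active & (1u << irq)) && p->priority[irq] > best_prio) {
            best = irq;
            best_prio = p->priority[irq];
        }
    }
    if (best) {
        p->pending &= ~(1u << best);  /* claiming clears the pending bit */
    }
    demo_plic_update(p);              /* the line may drop after a claim */
    return best;
}
```

If either demo_plic_update() call were dropped, eip would go stale: it would stay low after a pending source's priority is raised (a lost interrupt) or stay high after the only pending source is claimed (a spurious one).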

Re: [PATCH v7 13/42] target/arm: Define arm_cpu_do_unaligned_access for user-only

On 6/18/20 10:03 AM, Richard Henderson wrote:
> First, this could definitely be delayed to the follow-on linux-user patch set.

Bah, no, I need the function to be defined at least,
even if it isn't reachable yet.


r~

Re: [PATCH v3 5/8] acpi: Enable TPM IRQ

2020-06-18 Thread Stefan Berger


On 6/18/20 4:12 PM, Michael S. Tsirkin wrote:

On Wed, Jun 17, 2020 at 07:59:51AM -0400, Stefan Berger wrote:

On 6/17/20 4:22 AM, Auger Eric wrote:

Hi Stefan,

On 6/16/20 10:57 PM, Stefan Berger wrote:

From: Stefan Berger 

Move the TPM TIS IRQ to unused IRQ 13, which is the only one accepted by
Windows. Query for the TPM's irq number and enable the TPM IRQ unless
TPM_IRQ_DISABLED is returned.

Signed-off-by: Stefan Berger 
CC: Michael S. Tsirkin 
---
   hw/i386/acpi-build.c  | 11 +--
   include/hw/acpi/tpm.h |  2 +-
   2 files changed, 6 insertions(+), 7 deletions(-)

diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 900f786d08..bb9a7f8497 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -2021,6 +2021,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
   build_append_pci_bus_devices(scope, bus, pm->pcihp_bridge_en);
   if (TPM_IS_TIS_ISA(tpm)) {
+int8_t irq = tpm_get_irqnum(tpm);
   if (misc->tpm_version == TPM_VERSION_2_0) {
   dev = aml_device("TPM");
   aml_append(dev, aml_name_decl("_HID",
@@ -2035,12 +2036,10 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
   crs = aml_resource_template();
   aml_append(crs, aml_memory32_fixed(TPM_TIS_ADDR_BASE,
  TPM_TIS_ADDR_SIZE, AML_READ_WRITE));
-/*
-FIXME: TPM_TIS_IRQ=5 conflicts with PNP0C0F irqs,
-Rewrite to take IRQ from TPM device model and
-fix default IRQ value there to use some unused IRQ
- */
-/* aml_append(crs, aml_irq_no_flags(TPM_TIS_IRQ)); */
+
+if (irq != TPM_IRQ_DISABLED) {

Out of curiosity what is the goal to expose the irq num as a property
settable by the end-user if only 13 is known to work in all cases. At
least shouldn't we warn the end-user in case he attempts to change the
default value?

For Windows only IRQ 13 works (and I am not sure whether this has always
been like this), Linux accepts several other ones. As for exposing it to the
end-user, I may have taken this from soundblaster (sb16.c), which also
exposes it. If someone plays around with the irq numbers I would say they
must have some more PC knowledge than just trying random numbers.


   Stefan

So is this useful to anyone? If no I'd say drop it.



So we can remove command line options?



I'm guessing sb16 has it since it is useful for running extremely old OSes 
which might
have weird quirks for a specific hardware.

Re: [PATCH v5 3/3] hw/net/imx_fec: improve PHY implementation.

2020-06-18 Thread Jean-Christophe DUBOIS


On 15/06/2020 at 15:03, Peter Maydell wrote:

On Thu, 4 Jun 2020 at 13:39, Jean-Christophe Dubois  
wrote:

improve the PHY implementation with more generic code.

This patch removes a lot of hardcoded values and replaces them with
generic symbols from header files.

Signed-off-by: Jean-Christophe Dubois 
---
  v2: Not present
  v3: Not present
  v4: Not present
  v5: improve PHY implementation.

  hw/net/imx_fec.c | 76 +++-
  include/hw/net/mii.h |  4 +++
  2 files changed, 50 insertions(+), 30 deletions(-)



-case 5: /* Auto-neg Link Partner Ability */
-val = 0x0f71;
+case MII_ANLPAR: /* Auto-neg Link Partner Ability */
+val = / | MII_ANLPAR_10 | MII_ANLPAR_10FD |
+  MII_ANLPAR_TX | MII_ANLPAR_TXFD | MII_ANLPAR_PAUSE |
+  MII_ANLPAR_PAUSEASY;

The old value is 0x0f71, but the new one with the constants
is 0x0de1.


First off, I should say that this PHY, first borrowed by mcf_fec.c
(the ColdFire ethernet device) from lan9118 (and now by imx_fec.c), is not
one used on any real i.MX (i.MX6, i.MX7, i.MX31, i.MX25, ...) based
board that I know of (this particular PHY is embedded in the lan9118
ethernet device).


It is there because we were in need of a PHY, and this PHY needs to be
simple and more or less standard.


I might have missed something, but I am not really aware of a way in QEMU
to swap PHYs for a given ethernet emulator depending on the emulated board.


So here this PHY was just a blind cut and paste of the lan9118.c PHY
part to get a reasonable working PHY for the FEC/ENET device.


So the previous value of this register is not really meaningful. It
is a mix of standard MII-defined bits and LAN911X-specific bits (for
which I don't necessarily have definitions).


Here I decided to restrict the implementation of this rather "virtual"
PHY to only the standard defined bits.


Actually, I think I should have removed a lot more lan911x-specific
bits/registers to get to a really simple/trivial standard PHY.



-case 30:/* Interrupt mask */
+case MII_SMC911X_IM:/* Interrupt mask */
  val = s->phy_int_mask;
  break;
-case 17:
-case 18:
+case MII_NSR:
+val = 1 << 6;
+break;

The old code didn't have a case for MII_NSR (16).


I am not sure anymore why I added the MII_NSR register. It is not present on
the lan9118 ethernet device, but it is a standard defined register.



+case MII_LBREMR:
+case MII_REC:
  case 27:
  case 31:



-case 4: /* Auto-neg advertisement */
-s->phy_advertise = (val & 0x2d7f) | 0x80;
+case MII_ANAR: /* Auto-neg advertisement */
+s->phy_advertise = (val & (MII_ANAR_PAUSE_ASYM | MII_ANAR_PAUSE |
+   MII_ANAR_TXFD | MII_ANAR_TX |
+   MII_ANAR_10FD | MII_ANAR_10 | 0x1f)) |
+   MII_ANAR_TX;

The old code does & 0x2d7f; the new code is & 0xdff.

Same reason as the ANLPAR register.

  break;

If some of these are bug fixes, please can you put them in a separate
patch, so that the "use symbolic constants" change can be reviewed
as making no functional changes?

thanks
-- PMM
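
The numeric differences pointed out in this review can be reproduced with a few lines of C. The sketch below is not the QEMU code: the DEMO_* names are made up for illustration, and the bit positions are an assumption based on the standard auto-negotiation base page layout (selector field in bits 4:0, 10BASE-T in bit 5, 10BASE-T full duplex in bit 6, 100BASE-TX in bit 7, 100BASE-TX full duplex in bit 8, pause in bit 10, asymmetric pause in bit 11), which may not match every constant in include/hw/net/mii.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed standard auto-negotiation bits; names are illustrative only. */
enum {
    DEMO_AN_SELECTOR_CSMACD = 0x0001,   /* selector field value, bits 4:0 */
    DEMO_AN_SELECTOR_MASK   = 0x001f,
    DEMO_AN_10              = 1 << 5,
    DEMO_AN_10FD            = 1 << 6,
    DEMO_AN_TX              = 1 << 7,
    DEMO_AN_TXFD            = 1 << 8,
    DEMO_AN_PAUSE           = 1 << 10,
    DEMO_AN_PAUSE_ASYM      = 1 << 11,
};

/* Link partner ability built only from the standard bits listed above. */
uint16_t demo_anlpar_standard(void)
{
    return DEMO_AN_SELECTOR_CSMACD | DEMO_AN_10 | DEMO_AN_10FD |
           DEMO_AN_TX | DEMO_AN_TXFD | DEMO_AN_PAUSE | DEMO_AN_PAUSE_ASYM;
}

/* Writable-bit mask for the advertisement register, same bit set. */
uint16_t demo_anar_mask(void)
{
    return DEMO_AN_PAUSE_ASYM | DEMO_AN_PAUSE | DEMO_AN_TXFD | DEMO_AN_TX |
           DEMO_AN_10FD | DEMO_AN_10 | DEMO_AN_SELECTOR_MASK;
}

/* True if bit number `bit` (0..15) differs between two register values. */
bool demo_bit_differs(uint16_t old_val, uint16_t new_val, unsigned bit)
{
    assert(bit < 16);
    return ((old_val ^ new_val) >> bit) & 1;
}
```

Under these assumptions the symbolic ANLPAR value comes out as 0x0de1 and the ANAR mask as 0x0dff, matching the figures quoted in the review. Comparing against the old literals, 0x0f71 differs from 0x0de1 in bits 4, 7 and 9, and 0x2d7f differs from 0x0dff in bits 7 and 13.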

Re: [DISCUSSION] GCOV support

On Thu, 18 Jun 2020 at 20:41, Aleksandar Markovic
 wrote:
> On Thursday, 18 June 2020, Aleksandar Markovic
>  wrote:
>> You may recall that I signalled on a couple of occasions that there are some
>> problems related to gcov builds in out-of-tree builds.
>>
>> It turned out that those problems manifest on some older Linux distributions,
>> and are always related to gcovr being older than 4.1. An older gcovr
>> simply doesn't properly connect an executable with its source files,
>> and no coverage report is generated (or perhaps only some small portions,
>> but, in any case, gcov builds are virtually unusable).

Ah. Thanks for tracking this down.

>> I propose that we don't bother supporting systems with gcovr older than 4.1.
>> We could check the version of gcovr in configure, and refuse gcov builds if that
>> version is older than 4.1.

Seems potentially reasonable. We don't actually check for gcovr at all in
configure right now...

It looks like we only use gcovr in creating the coverage-report.html --
I guess in theory if you wanted to use gcov directly and didn't care
about the coverage report you could still do that without a new gcovr
(ie if you were just using the facilities we provided before commit
fe8bf5f62972 in 2018). But then we'd have to make the handling of the
coverage report conditional on "do we have gcovr". I don't use gcov
so I don't have any idea whether "use gcov data but don't bother with
the gcovr coverage report" is a useful thing for anybody to be doing
or if it's silly and not worth the effort to try to support.

https://repology.org/project/gcovr/versions has the distro
coverage of gcovr versions. I note that Ubuntu Bionic only
has 3.4 still. But for a developer-use-only tool we can
be a bit less strict about our supported-distros policy I think.

Side note: the coverage-report.html targets probably ought
to be only allowed if we have gcov/gcovr enabled.

thanks
-- PMM

Re: [PATCH v2] riscv: plic: Honour source priorities

Patchew URL: https://patchew.org/QEMU/20200618202343.20455-1-jrt...@jrtc27.com/



Hi,

This series failed the asan build test. Please find the testing commands and
their output below. If you have Docker installed, you can probably reproduce it
locally.

=== TEST SCRIPT BEGIN ===
#!/bin/bash
export ARCH=x86_64
make docker-image-fedora V=1 NETWORK=1
time make docker-test-debug@fedora TARGET_LIST=x86_64-softmmu J=14 NETWORK=1
=== TEST SCRIPT END ===

  CC  qga/commands-posix.o
  CC  qga/channel-posix.o
  CC  qga/qapi-generated/qga-qapi-types.o
/usr/bin/ld: 
/usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o):
 warning: common of `__interception::real_vfork' overridden by definition from 
/usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o)
  CC  qga/qapi-generated/qga-qapi-visit.o
  CC  qga/qapi-generated/qga-qapi-commands.o
  CC  qga/qapi-generated/qga-qapi-init-commands.o
---
  GEN docs/interop/qemu-ga-ref.html
  GEN docs/interop/qemu-ga-ref.txt
  GEN docs/interop/qemu-ga-ref.7
/usr/bin/ld: 
/usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o):
 warning: common of `__interception::real_vfork' overridden by definition from 
/usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o)
  LINKqemu-keymap
  LINKivshmem-client
/usr/bin/ld: 
/usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o):
 warning: common of `__interception::real_vfork' overridden by definition from 
/usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o)
  LINKivshmem-server
/usr/bin/ld: 
/usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o):
 warning: common of `__interception::real_vfork' overridden by definition from 
/usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o)
  LINKqemu-nbd
/usr/bin/ld: 
/usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o):
 warning: common of `__interception::real_vfork' overridden by definition from 
/usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o)
  LINKqemu-storage-daemon
/usr/bin/ld: 
/usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o):
 warning: common of `__interception::real_vfork' overridden by definition from 
/usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o)
  LINKqemu-img
  AS  pc-bios/optionrom/multiboot.o
/usr/bin/ld: 
/usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o):
 warning: common of `__interception::real_vfork' overridden by definition from 
/usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o)
  LINKqemu-io
  LINKqemu-edid
  AS  pc-bios/optionrom/linuxboot.o
  CC  pc-bios/optionrom/linuxboot_dma.o
  AS  pc-bios/optionrom/kvmvapic.o
  AS  pc-bios/optionrom/pvh.o
/usr/bin/ld: 
/usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o):
 warning: common of `__interception::real_vfork' overridden by definition from 
/usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o)
  CC  pc-bios/optionrom/pvh_main.o
  BUILD   pc-bios/optionrom/multiboot.img
  LINKfsdev/virtfs-proxy-helper
/usr/bin/ld: 
/usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o):
 warning: common of `__interception::real_vfork' overridden by definition from 
/usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o)
  BUILD   pc-bios/optionrom/multiboot.raw
  BUILD   pc-bios/optionrom/linuxboot.img
  BUILD   pc-bios/optionrom/linuxboot_dma.img
  BUILD   pc-bios/optionrom/pvh.img
  BUILD   pc-bios/optionrom/linuxboot.raw
  BUILD   pc-bios/optionrom/linuxboot_dma.raw
/usr/bin/ld: 
/usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o):
 warning: common of `__interception::real_vfork' overridden by definition from 
/usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o)
/usr/bin/ld: 
/usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o):
 warning: common of `__interception::real_vfork' overridden by definition from 
/usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o)
  LINK    scsi/qemu-pr-helper
  LINK    qemu-bridge-helper
  BUILD   pc-bios/optionrom/pvh.raw
---
  BUILD   pc-bios/optionrom/kvmvapic.img
  SIGN    pc-bios/optionrom/pvh.bin
  LINK    vhost-user-input
/usr/bin/ld: 
/usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o):
 warning: common of `__interception::real_vfork' overridden by definition from

Re: Usage of pci bus

On Thu, 18 Jun 2020 at 20:36, Gautam Bhat  wrote:
> I am confused with the usage of PCI bus for connecting different
> peripherals. If I want to emulate an ARM board which doesn't have a
> PCI controller how can I emulate it to be as close to the real board
> as possible? Is there an ARM interconnect or something where I can
> connect the peripheral controllers and the peripherals to these
> controllers?

I'm not sure what you're asking here. If the board you want
to emulate doesn't have a PCI controller, then just don't
implement a model of a PCI controller. You won't be able
to use any PCI devices, but that's fine, because the real
hardware you're modelling doesn't have PCI devices.
Most of the devices you emulate are going to be the simple
straightforward type which have some MMIO-mapped registers
and an interrupt line. In QEMU we call those "sysbus" devices;
there are lots of examples in the tree.
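
To make that concrete, here is a rough standalone sketch of what such a device amounts to: a few registers behind read/write callbacks plus one interrupt level. This is deliberately not QEMU's sysbus API; every name and register offset below is made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical register offsets for a toy device (not from any real board). */
enum {
    TOY_REG_STATUS     = 0x0, /* read-only: pending event bits */
    TOY_REG_INT_ENABLE = 0x4, /* read/write: events allowed to interrupt */
    TOY_REG_INT_CLEAR  = 0x8  /* write-only: write 1 to clear an event */
};

typedef struct ToyDevice {
    uint32_t status;
    uint32_t int_enable;
    bool irq_level; /* current level of the single interrupt line */
} ToyDevice;

/* Recompute the interrupt line from the register state. */
static void toy_update_irq(ToyDevice *d)
{
    d->irq_level = (d->status & d->int_enable) != 0;
}

/* Guest load from the device's MMIO window. */
static uint32_t toy_mmio_read(ToyDevice *d, uint32_t offset)
{
    switch (offset) {
    case TOY_REG_STATUS:
        return d->status;
    case TOY_REG_INT_ENABLE:
        return d->int_enable;
    default:
        return 0; /* unimplemented registers read as zero */
    }
}

/* Guest store to the device's MMIO window. */
static void toy_mmio_write(ToyDevice *d, uint32_t offset, uint32_t value)
{
    switch (offset) {
    case TOY_REG_INT_ENABLE:
        d->int_enable = value;
        break;
    case TOY_REG_INT_CLEAR:
        d->status &= ~value;
        break;
    default:
        break; /* writes to other offsets are ignored */
    }
    toy_update_irq(d);
}

/* Device-internal event, e.g. a timer expiring. */
static void toy_raise_event(ToyDevice *d, uint32_t bits)
{
    d->status |= bits;
    toy_update_irq(d);
}
```

In the real tree the read/write pair would be hooked up through a memory region and the level through a qemu_irq; the existing sysbus devices show the actual calls.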

thanks
-- PMM

[PATCH v2] riscv: plic: Honour source priorities

2020-06-18 Thread Jessica Clarke

The source priorities can be used to order sources with respect to other
sources, not just as a way to enable/disable them based off a threshold.
We must therefore always claim the highest-priority source, rather than
the first source we find.

Signed-off-by: Jessica Clarke 
---
Changes since v1:

 * Initialise max_prio to plic->target_priority[addrid] rather than 0,
   allowing the target priority comparison to be dropped and covered by
   the max_prio comparison.

 hw/riscv/sifive_plic.c | 17 ++++++++++++-----
 1 file changed, 12 insertions(+), 5 deletions(-)

diff --git a/hw/riscv/sifive_plic.c b/hw/riscv/sifive_plic.c
index 4f216c5585..d91e82b8ab 100644
--- a/hw/riscv/sifive_plic.c
+++ b/hw/riscv/sifive_plic.c
@@ -166,6 +166,9 @@ static void sifive_plic_update(SiFivePLICState *plic)
 static uint32_t sifive_plic_claim(SiFivePLICState *plic, uint32_t addrid)
 {
 int i, j;
+uint32_t max_irq = 0;
+uint32_t max_prio = plic->target_priority[addrid];
+
 for (i = 0; i < plic->bitfield_words; i++) {
 uint32_t pending_enabled_not_claimed =
 (plic->pending[i] & ~plic->claimed[i]) &
@@ -177,14 +180,18 @@ static uint32_t sifive_plic_claim(SiFivePLICState *plic, uint32_t addrid)
 int irq = (i << 5) + j;
 uint32_t prio = plic->source_priority[irq];
 int enabled = pending_enabled_not_claimed & (1 << j);
-if (enabled && prio > plic->target_priority[addrid]) {
-sifive_plic_set_pending(plic, irq, false);
-sifive_plic_set_claimed(plic, irq, true);
-return irq;
+if (enabled && prio > max_prio) {
+max_irq = irq;
+max_prio = prio;
 }
 }
 }
-return 0;
+
+if (max_irq) {
+sifive_plic_set_pending(plic, max_irq, false);
+sifive_plic_set_claimed(plic, max_irq, true);
+}
+return max_irq;
 }

 static uint64_t sifive_plic_read(void *opaque, hwaddr addr, unsigned size)
--
2.20.1
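
For readers who want to play with the claim rule outside QEMU, the following is a minimal standalone sketch of the same idea: scan every candidate source and keep the one with the highest priority above the threshold. The ToyPlic type, its flat arrays and NUM_SOURCES are assumptions made for this example only; the real device uses bitfield words as in the patch above.

```c
#include <stdbool.h>
#include <stdint.h>

#define NUM_SOURCES 8 /* hypothetical; source 0 means "no interrupt" */

typedef struct ToyPlic {
    uint32_t priority[NUM_SOURCES];
    bool pending[NUM_SOURCES];
    bool claimed[NUM_SOURCES];
    bool enabled[NUM_SOURCES];
    uint32_t threshold; /* stands in for target_priority[addrid] */
} ToyPlic;

/*
 * Claim the highest-priority pending source whose priority is strictly
 * above the threshold. Seeding max_prio with the threshold folds the
 * threshold check into the maximum search, as in v2 of the patch.
 * With a strict '>' comparison, ties go to the lowest source number.
 */
static uint32_t toy_plic_claim(ToyPlic *p)
{
    uint32_t max_irq = 0;
    uint32_t max_prio = p->threshold;

    for (uint32_t irq = 1; irq < NUM_SOURCES; irq++) {
        bool candidate = p->pending[irq] && !p->claimed[irq] &&
                         p->enabled[irq];
        if (candidate && p->priority[irq] > max_prio) {
            max_irq = irq;
            max_prio = p->priority[irq];
        }
    }

    if (max_irq) {
        p->pending[max_irq] = false;
        p->claimed[max_irq] = true;
    }
    return max_irq;
}
```

With sources 2 (priority 1) and 5 (priority 3) both pending, the first claim returns 5 and the second returns 2; a first-match scan would have returned 2 first.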

Re: [PATCH v25 QEMU 3/3] virtio-balloon: Replace free page hinting references to 'report' with 'hint'

2020-06-18 Thread David Hildenbrand

> 
>> 2. Unclear semantics. Alex tried to document what the actual semantics
>> of hinted pages are. Assume the following in the guest to a previously
>> hinted page
>> 
>> /* page was hinted and is reused now */
>> if (page[x] != Y)
>>    page[x] = Y;
>> /* migration ends, we now run on the destination */
>> BUG_ON(page[x] != Y);
>> /* BUG, because the content chan
>> 
>> A guest can observe that. And that could be a random driver that just
>> allocated a page.
>> 
>> (I *assume* in Linux we might catch that using kasan, but I am not 100%
>> sure, also, the actual semantics to document are unclear - e.g., for
>> other guests)
>> 
>> As Alex mentioned, it is not even guaranteed in QEMU that we receive a
>> zero page on the destination, it could also be something else (e.g.,
>> previously migrated values).
> 
> So this is only an issue for pages that are pushed out of the balloon
> as a part of the shrinker process though. So fixing it would be pretty
> straightforward as we would just have to initialize or at least dirty
> pages that are leaked as a part of the shrinker. That may have an
> impact on performance though as it would result in us dirtying pages
> that are freed as a result of the shrinker being triggered.
> 

It really depends on the desired semantics, which are unclear because there is 
no doc/spec. Either QEMU is buggy or the kernel is buggy.
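
A toy model of the scenario quoted above, as a sketch only. It assumes that a hinted page is skipped by migration unless the guest dirtied it afterwards, and that the destination page then holds whatever it held before; ToyPage, the helper names and the value of Y are made up here.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_Y 0x5a /* the value the guest expects to find in page[x] */

typedef struct ToyPage {
    uint8_t byte; /* stands in for page[x] */
    bool dirty;   /* set when the guest writes after the hint */
} ToyPage;

/* Guest reuses the page: it only writes if the content differs. */
static void toy_guest_reuse(ToyPage *page)
{
    if (page->byte != TOY_Y) {
        page->byte = TOY_Y;
        page->dirty = true;
    }
}

/*
 * Content seen on the destination. A hinted page that was never
 * re-dirtied is not transferred, so the destination keeps its own
 * initial content (zero or some previously migrated value).
 */
static uint8_t toy_migrate(const ToyPage *src, bool hinted,
                           uint8_t dest_initial)
{
    if (hinted && !src->dirty) {
        return dest_initial;
    }
    return src->byte;
}
```

If the page already held Y when it was hinted, toy_guest_reuse() performs no write, nothing gets dirtied, and the destination ends up with dest_initial instead of Y. That is the BUG_ON case from the example.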

>> 3. If I am not wrong, the iothread works in
>> virtio_ballloon_get_free_page_hints() on the virtqueue only with holding
>> the free_page_lock (no BQL).
>> 
>> Assume we're migrating, the iothread is active, and the guest triggers a
>> device reset.
>> 
>> virtio_balloon_device_reset() will trigger a
>> virtio_balloon_free_page_stop(s). That won't actually wait for the
>> iothread to stop, it will only temporarily lock free_page_lock and
>> update s->free_page_report_status.
>> 
>> I think there can be a race between the device reset and the iothread.
>> Once virtio_balloon_free_page_stop() returned,
>> virtio_ballloon_get_free_page_hints() can still call
>> - virtio_queue_set_notification(vq, 0);
>> - virtio_queue_set_notification(vq, 1);
>> - virtio_notify(vdev, vq);
>> - virtqueue_pop()
>> 
>> I doubt this is very nice.
> 
> And our conversation had me start looking through references to
> virtio_balloon_free_page_stop. It looks like we call it when we
> unrealize the device or reset the device. It might make more sense for
> us to look at pushing the status to DONE and forcing the iothread to
> be flushed out.
> 
>> There are other concerns I had regarding the iothread (e.g., while
>> reporting is active, virtio_ballloon_get_free_page_hints() is
>> essentially a busy loop, in contrast to documented -
>> continue_to_get_hints will always be true).
>> 
>>> The appeal of hinting is that it's 0 overhead outside migration,
>>> and pains were taken to avoid keeping pages locked while
>>> hypervisor is busy.
>>> 
>>> If we are to drop hinting completely we need to show that reporting
>>> can be comparable, and we'll probably want to add a mode for
>>> reporting that behaves somewhat similarly.
>> 
>> Depends on the actual users. If we're dropping a feature that nobody is
>> actively using, I don't think we have to show anything.
>> 
>> This feature obviously saw no proper review.
> 
> I'm pretty sure it had some, as it went through several iterations as
> I recall. However I don't think the review of the virtio interface was
> very detailed as I think most of the attention was on the kernel
> interface.

Yes, that's what I meant. The kernel side and the migration code (QEMU) got a 
lot of attention.

Re: [PATCH v3 5/8] acpi: Enable TPM IRQ

On Wed, Jun 17, 2020 at 07:59:51AM -0400, Stefan Berger wrote:
> On 6/17/20 4:22 AM, Auger Eric wrote:
> > Hi Stefan,
> > 
> > On 6/16/20 10:57 PM, Stefan Berger wrote:
> > > From: Stefan Berger 
> > > 
> > > Move the TPM TIS IRQ to unused IRQ 13, which is the only one accepted by
> > > Windows. Query for the TPM's irq number and enable the TPM IRQ unless
> > > TPM_IRQ_DISABLED is returned.
> > > 
> > > Signed-off-by: Stefan Berger 
> > > CC: Michael S. Tsirkin 
> > > ---
> > >   hw/i386/acpi-build.c  | 11 +--
> > >   include/hw/acpi/tpm.h |  2 +-
> > >   2 files changed, 6 insertions(+), 7 deletions(-)
> > > 
> > > diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
> > > index 900f786d08..bb9a7f8497 100644
> > > --- a/hw/i386/acpi-build.c
> > > +++ b/hw/i386/acpi-build.c
> > > @@ -2021,6 +2021,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
> > >   build_append_pci_bus_devices(scope, bus, pm->pcihp_bridge_en);
> > >   if (TPM_IS_TIS_ISA(tpm)) {
> > > +int8_t irq = tpm_get_irqnum(tpm);
> > >   if (misc->tpm_version == TPM_VERSION_2_0) {
> > >   dev = aml_device("TPM");
> > >   aml_append(dev, aml_name_decl("_HID",
> > > @@ -2035,12 +2036,10 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
> > >   crs = aml_resource_template();
> > >   aml_append(crs, aml_memory32_fixed(TPM_TIS_ADDR_BASE,
> > >  TPM_TIS_ADDR_SIZE, AML_READ_WRITE));
> > > -/*
> > > -FIXME: TPM_TIS_IRQ=5 conflicts with PNP0C0F irqs,
> > > -Rewrite to take IRQ from TPM device model and
> > > -fix default IRQ value there to use some unused IRQ
> > > - */
> > > -/* aml_append(crs, aml_irq_no_flags(TPM_TIS_IRQ)); */
> > > +
> > > +if (irq != TPM_IRQ_DISABLED) {
> > Out of curiosity what is the goal to expose the irq num as a property
> > settable by the end-user if only 13 is known to work in all cases. At
> > least shouldn't we warn the end-user in case he attempts to change the
> > default value?
> 
> For Windows only IRQ 13 works (and I am not sure whether this has always
> been like this), Linux accepts several other ones. As for exposing it to the
> end-user, I may have taken this from soundblaster (sb16.c), which also
> exposes it. If someone plays around with the irq numbers I would say they
> must have some more PC knowledge than just trying random numbers.
> 
> 
>    Stefan

So is this useful to anyone? If not, I'd say drop it.
I'm guessing sb16 has it since it is useful for running extremely old OSes,
which might have weird quirks for specific hardware.

Re: [PATCH v9 00/10] acpi: i386 tweaks

On Wed, Jun 17, 2020 at 09:11:28AM +0200, Gerd Hoffmann wrote:
> First batch of microvm patches, some generic acpi stuff.
> Split the acpi-build.c monster, specifically split the
> pc and q35 and pci bits into a separate file which we
> can skip building at some point in the future.

Thanks for the patches!
Pls take a look at tests/qtest/bios-tables-test.c, the comment
at top of this file outlines the process to use for
changing expected test files.
As it is, the patches won't be easily rebaseable or backportable.
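
Roughly, as that comment describes it: first list the affected expected files in tests/qtest/bios-tables-test-allowed-diff.h, then make the source change, and only afterwards regenerate the expected binaries in a commit of their own, emptying the list again. The sketch below shows the idea of such an allow-list in standalone form; the array name, the two example entries and the exact-match lookup are assumptions for illustration, not the test's actual code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Stand-in for the list kept in bios-tables-test-allowed-diff.h:
 * expected files whose mismatch is tolerated while a series is in flight.
 * The entries are examples only.
 */
static const char *const toy_allowed_diff_files[] = {
    "tests/data/acpi/pc/DSDT",
    "tests/data/acpi/q35/DSDT",
    NULL
};

/* Return true if a mismatch in this expected file should be ignored. */
static bool toy_diff_allowed(const char *path)
{
    for (size_t i = 0; toy_allowed_diff_files[i] != NULL; i++) {
        if (strcmp(toy_allowed_diff_files[i], path) == 0) {
            return true;
        }
    }
    return false;
}
```

Because the source patches never touch the binaries, they stay rebaseable; only the final "update expected files" commit has to be redone on conflicts.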


> v2 changes: leave acpi-build.c largely as-is, move useful
> bits to other places to allow them being reused, specifically:
> 
>  * move isa device generator functions to individual isa devices.
>  * move fw_cfg generator function to fw_cfg.c
> 
> v3 changes: fix rtc, support multiple lpt devices.
> 
> v4 changes:
>  * drop merged patches.
>  * split rtc crs change to separate patch.
>  * added two cleanup patches.
>  * picked up ack & review tags.
> 
> v5 changes:
>  * add comment for rtc crs update.
>  * add even more cleanup patches.
>  * picked up ack & review tags.
> 
> v6 changes:
>  * floppy: move cmos_get_fd_drive_type.
>  * picked up ack & review tags.
> 
> v7 changes:
>  * rebased to mst/pci branch, resolved stubs conflict.
>  * dropped patches already queued up in mst/pci.
>  * added missing sign-off.
>  * picked up ack & review tags.
> 
> v8 changes:
>  * (re-)add patch to allow acpi table changes
> 
> v9 changes:
>  * add asl changes to commit messages.
>  * update acpi test data.
> 
> take care,
>   Gerd
> 
> Gerd Hoffmann (10):
>   acpi: bios-tables-test: show more context on asl diffs
>   acpi: move aml builder code for floppy device
>   floppy: make isa_fdc_get_drive_max_chs static
>   floppy: move cmos_get_fd_drive_type() from pc
>   acpi: move aml builder code for i8042 (kbd+mouse) device
>   acpi: factor out fw_cfg_add_acpi_dsdt()
>   acpi: simplify build_isa_devices_aml()
>   acpi: drop serial/parallel enable bits from dsdt
>   acpi: drop build_piix4_pm()
>   acpi: q35: drop _SB.PCI0.ISA.LPCD opregion.
> 
>  hw/i386/fw_cfg.h  |   1 +
>  include/hw/block/fdc.h|   3 +-
>  include/hw/i386/pc.h  |   1 -
>  hw/block/fdc.c| 111 +++-
>  hw/i386/acpi-build.c  | 210 +-
>  hw/i386/fw_cfg.c  |  28 
>  hw/i386/pc.c  |  25 
>  hw/input/pckbd.c  |  31 +
>  stubs/cmos.c  |   7 +
>  tests/qtest/bios-tables-test.c|   2 +-
>  stubs/Makefile.objs   |   1 +
>  tests/data/acpi/pc/DSDT   | Bin 5014 -> 4934 bytes
>  tests/data/acpi/pc/DSDT.acpihmat  | Bin 6338 -> 6258 bytes
>  tests/data/acpi/pc/DSDT.bridge| Bin 6873 -> 6793 bytes
>  tests/data/acpi/pc/DSDT.cphp  | Bin 5477 -> 5397 bytes
>  tests/data/acpi/pc/DSDT.dimmpxm   | Bin 6667 -> 6587 bytes
>  tests/data/acpi/pc/DSDT.ipmikcs   | Bin 5086 -> 5006 bytes
>  tests/data/acpi/pc/DSDT.memhp | Bin 6373 -> 6293 bytes
>  tests/data/acpi/pc/DSDT.numamem   | Bin 5020 -> 4940 bytes
>  tests/data/acpi/q35/DSDT  | Bin 7752 -> 7678 bytes
>  tests/data/acpi/q35/DSDT.acpihmat | Bin 9076 -> 9002 bytes
>  tests/data/acpi/q35/DSDT.bridge   | Bin 7769 -> 7695 bytes
>  tests/data/acpi/q35/DSDT.cphp | Bin 8215 -> 8141 bytes
>  tests/data/acpi/q35/DSDT.dimmpxm  | Bin 9405 -> 9331 bytes
>  tests/data/acpi/q35/DSDT.ipmibt   | Bin 7827 -> 7753 bytes
>  tests/data/acpi/q35/DSDT.memhp| Bin 9111 -> 9037 bytes
>  tests/data/acpi/q35/DSDT.mmio64   | Bin 8882 -> 8808 bytes
>  tests/data/acpi/q35/DSDT.numamem  | Bin 7758 -> 7684 bytes
>  tests/data/acpi/q35/DSDT.tis  | Bin 8357 -> 8283 bytes
>  29 files changed, 185 insertions(+), 235 deletions(-)
>  create mode 100644 stubs/cmos.c
> 
> -- 
> 2.18.4

Re: [PATCH v9 08/10] acpi: drop serial/parallel enable bits from dsdt

On Wed, Jun 17, 2020 at 09:11:36AM +0200, Gerd Hoffmann wrote:
> The _STA methods for COM+LPT used to reference them,
> but that isn't the case any more.
> 
> piix4 DSDT changes:
> 
>  Scope (_SB.PCI0)
>  {
>  Device (ISA)
>  {
>  Name (_ADR, 0x0001)  // _ADR: Address
>  OperationRegion (P40C, PCI_Config, 0x60, 0x04)
> -Field (^PX13.P13C, AnyAcc, NoLock, Preserve)
> -{
> -Offset (0x5F),
> -,   7,
> -LPEN,   1,
> -Offset (0x67),
> -,   3,
> -CAEN,   1,
> -,   3,
> -CBEN,   1
> -}
>  }
>  }
> 
> ich9 DSDT changes:
> 
>  Scope (_SB.PCI0)
>  {
>  Device (ISA)
>  {
>  Name (_ADR, 0x001F)  // _ADR: Address
>  OperationRegion (PIRQ, PCI_Config, 0x60, 0x0C)
>  OperationRegion (LPCD, PCI_Config, 0x80, 0x02)
>  Field (LPCD, AnyAcc, NoLock, Preserve)
>  {
>  COMA,   3,
>  ,   1,
>  COMB,   3,
>  Offset (0x01),
>  LPTD,   2
>  }
> -
> -OperationRegion (LPCE, PCI_Config, 0x82, 0x02)
> -Field (LPCE, AnyAcc, NoLock, Preserve)
> -{
> -CAEN,   1,
> -CBEN,   1,
> -LPEN,   1
> -}
>  }
>  }
> 
> Signed-off-by: Gerd Hoffmann 
> Reviewed-by: Igor Mammedov 

Don't make binary file changes with source changes.
Pls follow process at top of tests/qtest/bios-tables-test.c

> ---
>  hw/i386/acpi-build.c  |  23 ---
>  tests/data/acpi/pc/DSDT   | Bin 5014 -> 4972 bytes
>  tests/data/acpi/pc/DSDT.acpihmat  | Bin 6338 -> 6296 bytes
>  tests/data/acpi/pc/DSDT.bridge| Bin 6873 -> 6831 bytes
>  tests/data/acpi/pc/DSDT.cphp  | Bin 5477 -> 5435 bytes
>  tests/data/acpi/pc/DSDT.dimmpxm   | Bin 6667 -> 6625 bytes
>  tests/data/acpi/pc/DSDT.ipmikcs   | Bin 5086 -> 5044 bytes
>  tests/data/acpi/pc/DSDT.memhp | Bin 6373 -> 6331 bytes
>  tests/data/acpi/pc/DSDT.numamem   | Bin 5020 -> 4978 bytes
>  tests/data/acpi/q35/DSDT  | Bin 7752 -> 7718 bytes
>  tests/data/acpi/q35/DSDT.acpihmat | Bin 9076 -> 9042 bytes
>  tests/data/acpi/q35/DSDT.bridge   | Bin 7769 -> 7735 bytes
>  tests/data/acpi/q35/DSDT.cphp | Bin 8215 -> 8181 bytes
>  tests/data/acpi/q35/DSDT.dimmpxm  | Bin 9405 -> 9371 bytes
>  tests/data/acpi/q35/DSDT.ipmibt   | Bin 7827 -> 7793 bytes
>  tests/data/acpi/q35/DSDT.memhp| Bin 9111 -> 9077 bytes
>  tests/data/acpi/q35/DSDT.mmio64   | Bin 8882 -> 8848 bytes
>  tests/data/acpi/q35/DSDT.numamem  | Bin 7758 -> 7724 bytes
>  tests/data/acpi/q35/DSDT.tis  | Bin 8357 -> 8323 bytes
>  19 files changed, 23 deletions(-)
> 
> diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
> index d27cecc877c4..ffbdbee51aa8 100644
> --- a/hw/i386/acpi-build.c
> +++ b/hw/i386/acpi-build.c
> @@ -1360,15 +1360,6 @@ static void build_q35_isa_bridge(Aml *table)
>  aml_append(field, aml_named_field("LPTD", 2));
>  aml_append(dev, field);
>  
> -aml_append(dev, aml_operation_region("LPCE", AML_PCI_CONFIG,
> - aml_int(0x82), 0x02));
> -/* enable bits */
> -field = aml_field("LPCE", AML_ANY_ACC, AML_NOLOCK, AML_PRESERVE);
> -aml_append(field, aml_named_field("CAEN", 1));
> -aml_append(field, aml_named_field("CBEN", 1));
> -aml_append(field, aml_named_field("LPEN", 1));
> -aml_append(dev, field);
> -
>  aml_append(scope, dev);
>  aml_append(table, scope);
>  }
> @@ -1392,7 +1383,6 @@ static void build_piix4_isa_bridge(Aml *table)
>  {
>  Aml *dev;
>  Aml *scope;
> -Aml *field;
>  
>  scope =  aml_scope("_SB.PCI0");
>  dev = aml_device("ISA");
> @@ -1401,19 +1391,6 @@ static void build_piix4_isa_bridge(Aml *table)
>  /* PIIX PCI to ISA irq remapping */
>  aml_append(dev, aml_operation_region("P40C", AML_PCI_CONFIG,
>   aml_int(0x60), 0x04));
> -/* enable bits */
> -field = aml_field("^PX13.P13C", AML_ANY_ACC, AML_NOLOCK, AML_PRESERVE);
> -/* Offset(0x5f),, 7, */
> -aml_append(field, aml_reserved_field(0x2f8));
> -aml_append(field, aml_reserved_field(7));
> -aml_append(field, aml_named_field("LPEN", 1));
> -/* Offset(0x67),, 3, */
> -aml_append(field, aml_reserved_field(0x38));
> -aml_append(field, aml_reserved_field(3));
> -aml_append(field, aml_named_field("CAEN", 1));
> -aml_append(field, aml_reserved_field(3));
> -aml_append(field, aml_named_field("CBEN", 1));
> -aml_append(dev, field);
>  
>  aml_append(scope, dev);
>  aml_append(table, scope);
> diff --git a/tests/data/acpi/pc/DSDT b/tests/data/acpi/pc/DSDT
> index 
>
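
As a side note on the removed piix4 code: the ASL `Offset (0x5F)` and `Offset (0x67)` directives show up in the builder as aml_reserved_field(0x2f8) and aml_reserved_field(0x38), because reserved fields are sized in bits. A small sketch of that arithmetic (the helper name is made up):

```c
#include <stdint.h>

/*
 * Number of reserved bits needed to advance from the current bit
 * position inside a Field to the start of the given byte offset.
 * The caller must ensure byte_offset * 8 >= cur_bit.
 */
static uint32_t toy_reserved_bits_to_offset(uint32_t cur_bit,
                                            uint32_t byte_offset)
{
    return byte_offset * 8 - cur_bit;
}
```

Starting at bit 0, reaching byte 0x5F takes 0x5F * 8 = 0x2f8 bits. The following 7 reserved bits plus the 1-bit LPEN end at bit 768, so reaching byte 0x67 (bit 824) takes another 56 = 0x38 bits, matching the values in the removed lines.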

Re: [PATCH v9 02/10] acpi: move aml builder code for floppy device

On Wed, Jun 17, 2020 at 09:11:30AM +0200, Gerd Hoffmann wrote:
> DSDT change: isa device order changes in case MI1 (ipmi) is present.
> 
> Signed-off-by: Gerd Hoffmann 
> Reviewed-by: Igor Mammedov 
> ---

Pls follow process outlined at the top of tests/qtest/bios-tables-test.c
Don't change expected files with source.
Thanks!

>  hw/block/fdc.c  |  83 
>  hw/i386/acpi-build.c|  83 
>  stubs/cmos.c|   7 +++
>  stubs/Makefile.objs |   1 +
>  tests/data/acpi/pc/DSDT.ipmikcs | Bin 5086 -> 5086 bytes
>  5 files changed, 91 insertions(+), 83 deletions(-)
>  create mode 100644 stubs/cmos.c
> 
> diff --git a/hw/block/fdc.c b/hw/block/fdc.c
> index 8528b9a3c722..c92436772292 100644
> --- a/hw/block/fdc.c
> +++ b/hw/block/fdc.c
> @@ -32,6 +32,8 @@
>  #include "qapi/error.h"
>  #include "qemu/error-report.h"
>  #include "qemu/timer.h"
> +#include "hw/i386/pc.h"
> +#include "hw/acpi/aml-build.h"
>  #include "hw/irq.h"
>  #include "hw/isa/isa.h"
>  #include "hw/qdev-properties.h"
> @@ -2765,6 +2767,85 @@ void isa_fdc_get_drive_max_chs(FloppyDriveType type,
>  (*maxc)--;
>  }
>  
> +static Aml *build_fdinfo_aml(int idx, FloppyDriveType type)
> +{
> +Aml *dev, *fdi;
> +uint8_t maxc, maxh, maxs;
> +
> +isa_fdc_get_drive_max_chs(type, &maxc, &maxh, &maxs);
> +
> +dev = aml_device("FLP%c", 'A' + idx);
> +
> +aml_append(dev, aml_name_decl("_ADR", aml_int(idx)));
> +
> +fdi = aml_package(16);
> +aml_append(fdi, aml_int(idx));  /* Drive Number */
> +aml_append(fdi,
> +aml_int(cmos_get_fd_drive_type(type)));  /* Device Type */
> +/*
> + * the values below are the limits of the drive, and are thus independent
> + * of the inserted media
> + */
> +aml_append(fdi, aml_int(maxc));  /* Maximum Cylinder Number */
> +aml_append(fdi, aml_int(maxs));  /* Maximum Sector Number */
> +aml_append(fdi, aml_int(maxh));  /* Maximum Head Number */
> +/*
> + * SeaBIOS returns the below values for int 0x13 func 0x08 regardless of
> + * the drive type, so shall we
> + */
> +aml_append(fdi, aml_int(0xAF));  /* disk_specify_1 */
> +aml_append(fdi, aml_int(0x02));  /* disk_specify_2 */
> +aml_append(fdi, aml_int(0x25));  /* disk_motor_wait */
> +aml_append(fdi, aml_int(0x02));  /* disk_sector_siz */
> +aml_append(fdi, aml_int(0x12));  /* disk_eot */
> +aml_append(fdi, aml_int(0x1B));  /* disk_rw_gap */
> +aml_append(fdi, aml_int(0xFF));  /* disk_dtl */
> +aml_append(fdi, aml_int(0x6C));  /* disk_formt_gap */
> +aml_append(fdi, aml_int(0xF6));  /* disk_fill */
> +aml_append(fdi, aml_int(0x0F));  /* disk_head_sttl */
> +aml_append(fdi, aml_int(0x08));  /* disk_motor_strt */
> +
> +aml_append(dev, aml_name_decl("_FDI", fdi));
> +return dev;
> +}
> +
> +static void fdc_isa_build_aml(ISADevice *isadev, Aml *scope)
> +{
> +Aml *dev;
> +Aml *crs;
> +int i;
> +
> +#define ACPI_FDE_MAX_FD 4
> +uint32_t fde_buf[5] = {
> +0, 0, 0, 0, /* presence of floppy drives #0 - #3 */
> +cpu_to_le32(2)  /* tape presence (2 == never present) */
> +};
> +
> +crs = aml_resource_template();
> +aml_append(crs, aml_io(AML_DECODE16, 0x03F2, 0x03F2, 0x00, 0x04));
> +aml_append(crs, aml_io(AML_DECODE16, 0x03F7, 0x03F7, 0x00, 0x01));
> +aml_append(crs, aml_irq_no_flags(6));
> +aml_append(crs,
> +aml_dma(AML_COMPATIBILITY, AML_NOTBUSMASTER, AML_TRANSFER8, 2));
> +
> +dev = aml_device("FDC0");
> +aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0700")));
> +aml_append(dev, aml_name_decl("_CRS", crs));
> +
> +for (i = 0; i < MIN(MAX_FD, ACPI_FDE_MAX_FD); i++) {
> +FloppyDriveType type = isa_fdc_get_drive_type(isadev, i);
> +
> +if (type < FLOPPY_DRIVE_TYPE_NONE) {
> +fde_buf[i] = cpu_to_le32(1);  /* drive present */
> +aml_append(dev, build_fdinfo_aml(i, type));
> +}
> +}
> +aml_append(dev, aml_name_decl("_FDE",
> +   aml_buffer(sizeof(fde_buf), (uint8_t *)fde_buf)));
> +
> +aml_append(scope, dev);
> +}
> +
>  static const VMStateDescription vmstate_isa_fdc ={
>  .name = "fdc",
>  .version_id = 2,
> @@ -2798,11 +2879,13 @@ static Property isa_fdc_properties[] = {
>  static void isabus_fdc_class_init(ObjectClass *klass, void *data)
>  {
>  DeviceClass *dc = DEVICE_CLASS(klass);
> +ISADeviceClass *isa = ISA_DEVICE_CLASS(klass);
>  
>  dc->realize = isabus_fdc_realize;
>  dc->fw_name = "fdc";
>  dc->reset = fdctrl_external_reset_isa;
>  dc->vmsd = &vmstate_isa_fdc;
> +isa->build_aml = fdc_isa_build_aml;
>  device_class_set_props(dc, isa_fdc_properties);
>  set_bit(DEVICE_CATEGORY_STORAGE, dc->categories);
>  }
> diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
> index 900f786d08de..45297d9a90e7 100644
> --- a/hw/i386/acpi-build.c
> +++
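
The _FDE buffer built in fdc_isa_build_aml() above is five little-endian 32-bit words: one presence flag per drive, then the tape word. A standalone sketch of that layout, with hypothetical helper names and a hand-rolled little-endian store in place of cpu_to_le32():

```c
#include <stdint.h>

#define TOY_FDE_MAX_FD 4
#define TOY_FDE_BYTES  ((TOY_FDE_MAX_FD + 1) * 4)

/* Store a 32-bit value in little-endian byte order, host-independent. */
static void toy_put_le32(uint8_t *out, uint32_t v)
{
    out[0] = (uint8_t)(v & 0xff);
    out[1] = (uint8_t)((v >> 8) & 0xff);
    out[2] = (uint8_t)((v >> 16) & 0xff);
    out[3] = (uint8_t)((v >> 24) & 0xff);
}

/*
 * Fill a 20-byte _FDE-style buffer: words 0-3 are 1 for a present
 * drive and 0 otherwise, word 4 is 2 ("tape never present").
 */
static void toy_build_fde(const int present[TOY_FDE_MAX_FD],
                          uint8_t out[TOY_FDE_BYTES])
{
    for (int i = 0; i < TOY_FDE_MAX_FD; i++) {
        toy_put_le32(out + 4 * i, present[i] ? 1 : 0);
    }
    toy_put_le32(out + 4 * TOY_FDE_MAX_FD, 2);
}
```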

Re: [PATCH v9 05/10] acpi: move aml builder code for i8042 (kbd+mouse) device

On Wed, Jun 17, 2020 at 09:11:33AM +0200, Gerd Hoffmann wrote:
> DSDT change: isa device order changes in case MI1 (ipmi) is present.
> 
> Signed-off-by: Gerd Hoffmann 
> Reviewed-by: Philippe Mathieu-Daudé 
> Reviewed-by: Igor Mammedov 
> ---
>  hw/i386/acpi-build.c|  39 
>  hw/input/pckbd.c|  31 +
>  tests/data/acpi/pc/DSDT.ipmikcs | Bin 5086 -> 5086 bytes
>  tests/data/acpi/q35/DSDT.ipmibt | Bin 7827 -> 7827 bytes
>  4 files changed, 31 insertions(+), 39 deletions(-)

Please don't add binary file diffs together with source changes.
Pls follow the process outlined in tests/qtest/bios-tables-test.c.

> diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
> index 45297d9a90e7..13113e83dfe2 100644
> --- a/hw/i386/acpi-build.c
> +++ b/hw/i386/acpi-build.c
> @@ -938,42 +938,6 @@ static void build_hpet_aml(Aml *table)
>  aml_append(table, scope);
>  }
>  
> -static Aml *build_kbd_device_aml(void)
> -{
> -Aml *dev;
> -Aml *crs;
> -
> -dev = aml_device("KBD");
> -aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0303")));
> -
> -aml_append(dev, aml_name_decl("_STA", aml_int(0xf)));
> -
> -crs = aml_resource_template();
> -aml_append(crs, aml_io(AML_DECODE16, 0x0060, 0x0060, 0x01, 0x01));
> -aml_append(crs, aml_io(AML_DECODE16, 0x0064, 0x0064, 0x01, 0x01));
> -aml_append(crs, aml_irq_no_flags(1));
> -aml_append(dev, aml_name_decl("_CRS", crs));
> -
> -return dev;
> -}
> -
> -static Aml *build_mouse_device_aml(void)
> -{
> -Aml *dev;
> -Aml *crs;
> -
> -dev = aml_device("MOU");
> -aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0F13")));
> -
> -aml_append(dev, aml_name_decl("_STA", aml_int(0xf)));
> -
> -crs = aml_resource_template();
> -aml_append(crs, aml_irq_no_flags(12));
> -aml_append(dev, aml_name_decl("_CRS", crs));
> -
> -return dev;
> -}
> -
>  static Aml *build_vmbus_device_aml(VMBusBridge *vmbus_bridge)
>  {
>  Aml *dev;
> @@ -1019,9 +983,6 @@ static void build_isa_devices_aml(Aml *table)
>  Aml *scope = aml_scope("_SB.PCI0.ISA");
>  Object *obj = object_resolve_path_type("", TYPE_ISA_BUS, &ambiguous);
>  
> -aml_append(scope, build_kbd_device_aml());
> -aml_append(scope, build_mouse_device_aml());
> -
>  if (ambiguous) {
>  error_report("Multiple ISA busses, unable to define IPMI ACPI data");
>  } else if (!obj) {
> diff --git a/hw/input/pckbd.c b/hw/input/pckbd.c
> index 60a41303203a..29d633ca9478 100644
> --- a/hw/input/pckbd.c
> +++ b/hw/input/pckbd.c
> @@ -26,6 +26,7 @@
>  #include "qemu/log.h"
>  #include "hw/isa/isa.h"
>  #include "migration/vmstate.h"
> +#include "hw/acpi/aml-build.h"
>  #include "hw/input/ps2.h"
>  #include "hw/irq.h"
>  #include "hw/input/i8042.h"
> @@ -561,12 +562,42 @@ static void i8042_realizefn(DeviceState *dev, Error **errp)
>  qemu_register_reset(kbd_reset, s);
>  }
>  
> +static void i8042_build_aml(ISADevice *isadev, Aml *scope)
> +{
> +Aml *kbd;
> +Aml *mou;
> +Aml *crs;
> +
> +crs = aml_resource_template();
> +aml_append(crs, aml_io(AML_DECODE16, 0x0060, 0x0060, 0x01, 0x01));
> +aml_append(crs, aml_io(AML_DECODE16, 0x0064, 0x0064, 0x01, 0x01));
> +aml_append(crs, aml_irq_no_flags(1));
> +
> +kbd = aml_device("KBD");
> +aml_append(kbd, aml_name_decl("_HID", aml_eisaid("PNP0303")));
> +aml_append(kbd, aml_name_decl("_STA", aml_int(0xf)));
> +aml_append(kbd, aml_name_decl("_CRS", crs));
> +
> +crs = aml_resource_template();
> +aml_append(crs, aml_irq_no_flags(12));
> +
> +mou = aml_device("MOU");
> +aml_append(mou, aml_name_decl("_HID", aml_eisaid("PNP0F13")));
> +aml_append(mou, aml_name_decl("_STA", aml_int(0xf)));
> +aml_append(mou, aml_name_decl("_CRS", crs));
> +
> +aml_append(scope, kbd);
> +aml_append(scope, mou);
> +}
> +
>  static void i8042_class_initfn(ObjectClass *klass, void *data)
>  {
>  DeviceClass *dc = DEVICE_CLASS(klass);
> +ISADeviceClass *isa = ISA_DEVICE_CLASS(klass);
>  
>  dc->realize = i8042_realizefn;
>  dc->vmsd = &vmstate_kbd_isa;
> +isa->build_aml = i8042_build_aml;
>  set_bit(DEVICE_CATEGORY_INPUT, dc->categories);
>  }
>  
> diff --git a/tests/data/acpi/pc/DSDT.ipmikcs b/tests/data/acpi/pc/DSDT.ipmikcs
> index 
> c285651131dc2ab8b0f32de750d7ac02a8b09936..1c19e2f354d022279d7e1343fa7212396d8c25a0
>  100644
> GIT binary patch
> delta 20
> ccmcboeouYFO2)~oOdgY0GRAE7Wtu1m09c?0ga7~l
> 
> delta 20
> ccmcboeouYFO2)|_8Dl1|Wc1kV%QR6C0AuwCZvX%Q
> 
> diff --git a/tests/data/acpi/q35/DSDT.ipmibt b/tests/data/acpi/q35/DSDT.ipmibt
> index 
> 38723daef80421ea528b2ad2d411e7357df43956..0173c3668a6cdef80127de7880a19cb5c5ea7dc0
>  100644
> GIT binary patch
> delta 20
> ccmbPiJK1)F4%6fgChy5QOfj1;GaZly08TOoX8-^I
> 
> delta 20
> ccmbPiJK1)F4%6gvrkKe(Ox~L>GaZly08Od~RsaA1
> 
> -- 
> 2.18.4

Re: [PATCH] riscv: plic: Honour source priorities

Patchew URL: https://patchew.org/QEMU/20200618193556.19459-1-jrt...@jrtc27.com/



Hi,

This series failed the asan build test. Please find the testing commands and
their output below. If you have Docker installed, you can probably reproduce it
locally.

=== TEST SCRIPT BEGIN ===
#!/bin/bash
export ARCH=x86_64
make docker-image-fedora V=1 NETWORK=1
time make docker-test-debug@fedora TARGET_LIST=x86_64-softmmu J=14 NETWORK=1
=== TEST SCRIPT END ===

  GEN docs/interop/qemu-qmp-ref.7
  CC  qga/commands.o
  CC  qga/guest-agent-command-state.o
/usr/bin/ld: 
/usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o):
 warning: common of `__interception::real_vfork' overridden by definition from 
/usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o)
  CC  qga/main.o
  CC  qga/commands-posix.o
  CC  qga/channel-posix.o
---
  GEN docs/interop/qemu-ga-ref.html
  GEN docs/interop/qemu-ga-ref.txt
  GEN docs/interop/qemu-ga-ref.7
/usr/bin/ld: 
/usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o):
 warning: common of `__interception::real_vfork' overridden by definition from 
/usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o)
  LINK    qemu-ga
/usr/bin/ld: 
/usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o):
 warning: common of `__interception::real_vfork' overridden by definition from 
/usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o)
  LINK    qemu-keymap
  AS  pc-bios/optionrom/multiboot.o
  AS  pc-bios/optionrom/linuxboot.o
  CC  pc-bios/optionrom/linuxboot_dma.o
/usr/bin/ld: 
/usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o):
 warning: common of `__interception::real_vfork' overridden by definition from 
/usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o)
  AS  pc-bios/optionrom/kvmvapic.o
  LINK    ivshmem-client
  AS  pc-bios/optionrom/pvh.o
---
  BUILD   pc-bios/optionrom/multiboot.img
  LINK    ivshmem-server
/usr/bin/ld  BUILD   pc-bios/optionrom/linuxboot.img
: 
/usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o):
 warning: common of `__interception::real_vfork' overridden by definition from 
/usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o)
  BUILD   pc-bios/optionrom/linuxboot_dma.img
  BUILD   pc-bios/optionrom/kvmvapic.img
  BUILD   pc-bios/optionrom/multiboot.raw
---
  LINK    qemu-nbd
  BUILD   pc-bios/optionrom/pvh.img
  SIGN    pc-bios/optionrom/multiboot.bin
/usr/bin/ld: 
/usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o):
 warning: common of `__interception::real_vfork' overridden by definition from 
/usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o)
  SIGN    pc-bios/optionrom/linuxboot.bin
/usr/bin/ld: 
/usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o):
 warning: common of `__interception::real_vfork' overridden by definition from 
/usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o)
  LINK    qemu-storage-daemon
  SIGN    pc-bios/optionrom/linuxboot_dma.bin
  SIGN    pc-bios/optionrom/kvmvapic.bin
---
  LINK    fsdev/virtfs-proxy-helper
  LINK    scsi/qemu-pr-helper
  LINK    qemu-bridge-helper
/usr/bin/ld: 
/usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o):
 warning: common of `__interception::real_vfork' overridden by definition from 
/usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o)
  LINK    virtiofsd
/usr/bin/ld: /usr/bin/ld: 
/usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o)/usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o):
 warning: common of `: warning: common of 
`__interception::real_vfork__interception::real_vfork' overridden by definition 
from ' overridden by definition from 
/usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o)
/usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o)
/usr/bin/ld: 
/usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o):
 warning: common of `__interception::real_vfork' overridden by definition from 
/usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o)
/usr/bin/ld: 
/usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o):
 warning: common of `__interception::real_vfork' overridden by definition from 
/usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o)
/usr/bin/ld: 
/usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o):
 warning: common of

Re: [RFC v6 2/4] cpu-throttle: new module, extracted from cpus.c

2020-06-18 Thread Laurent Vivier

On 18/06/2020 21:03, Claudio Fontana wrote:
> move the vcpu throttling functionality into its own module.
> 
> This functionality is not specific to any accelerator,
> and it is used currently by migration to slow down guests to try to
> have migrations converge, and by the cocoa MacOS UI to throttle speed.
> 
> cpu-throttle contains the controls to adjust and inspect throttle
> settings, start (set) and stop vcpu throttling, and the throttling
> function itself that is run periodically on vcpus to make them take a nap.
> 
> Execution of the throttling function on all vcpus is triggered by a timer,
> registered at module initialization.
> 
> No functionality change.
> 
> Signed-off-by: Claudio Fontana 
> Reviewed-by: Alex Bennée 
> ---
>  MAINTAINERS   |   1 +
>  include/hw/core/cpu.h |  37 -
>  include/qemu/main-loop.h  |   5 ++
>  include/sysemu/cpu-throttle.h |  68 +++
>  migration/migration.c |   1 +
>  migration/ram.c   |   1 +
>  softmmu/Makefile.objs |   1 +
>  softmmu/cpu-throttle.c| 122 ++
>  softmmu/cpus.c|  95 +++-
>  9 files changed, 207 insertions(+), 124 deletions(-)
>  create mode 100644 include/sysemu/cpu-throttle.h
>  create mode 100644 softmmu/cpu-throttle.c
> 

Reviewed-by: Laurent Vivier
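
As a rough illustration of the "take a nap" throttling described in the quoted commit message, the sketch below computes how long a vcpu would sleep for a given throttle percentage. The 10 ms awake slice (TIMESLICE_NS) and the name throttle_sleep_ns are assumptions made for this example and are not taken from the patch; only the 1 to 99 percent range comes from the cpu_throttle_set documentation.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed length of the awake slice; the value is illustrative only. */
#define TIMESLICE_NS 10000000LL   /* 10 ms */

/*
 * Nap length for a vcpu that should sleep `pct` percent of the time.
 * From sleep / (sleep + awake) = pct / 100 it follows that
 * sleep = awake * pct / (100 - pct).
 */
int64_t throttle_sleep_ns(int pct)
{
    assert(pct >= 1 && pct <= 99);   /* valid range per cpu_throttle_set */
    return TIMESLICE_NS * pct / (100 - pct);
}
```

Under these assumptions a 25 percent throttle gives roughly 3.3 ms of sleep per 10 ms awake, which is the same ratio as 10 ms of sleep for every 30 ms awake.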

Re: [PATCH v25 QEMU 3/3] virtio-balloon: Replace free page hinting references to 'report' with 'hint'

2020-06-18 Thread Alexander Duyck

On Thu, Jun 18, 2020 at 10:10 AM David Hildenbrand  wrote:
>
> >>
> >> Ugh, ...
> >>
> >> @MST, you might have missed that in another discussion, what's your
> >> general opinion about removing free page hinting in QEMU (and Linux)? We
> >> keep finding issues in the QEMU implementation, including non-trivial
> >> ones, and have to speculate about the actual semantics. I can see that
> >> e.g., libvirt does not support it yet.
> >
> > Not maintaining two similar features sounds attractive.

Agreed. Just to make sure we are all on the same page I am adding Wei
Wang since he was the original author for page hinting.

> I consider free page hinting (in QEMU) to be in an unmaintainable state
> (and it looks like Alex and I are fixing a feature we don't actually
> intend to use / not aware of users). In contrast to that, the free page
> reporting functionality/implementation is a walk in the park.
>
> >
> > I'm still trying to get my head around the list of issues.  So far they
> > all look kind of minor to me.  Would you like to summarize them
> > somewhere?
>
> Some things I still have in my mind
>
>
> 1. If migration fails during RAM precopy, the guest will never receive a
> DONE notification. Probably easy to fix.

Agreed. It is just a matter of finding the right point to add a hook
so that if we abort the migration we can report DONE.

> 2. Unclear semantics. Alex tried to document what the actual semantics
> of hinted pages are. Assume the following in the guest to a previously
> hinted page
>
> /* page was hinted and is reused now */
> if (page[x] != Y)
>         page[x] = Y;
> /* migration ends, we now run on the destination */
> BUG_ON(page[x] != Y);
> /* BUG, because the content changed */
>
> A guest can observe that. And that could be a random driver that just
> allocated a page.
>
> (I *assume* in Linux we might catch that using kasan, but I am not 100%
> sure, also, the actual semantics to document are unclear - e.g., for
> other guests)
>
> As Alex mentioned, it is not even guaranteed in QEMU that we receive a
> zero page on the destination, it could also be something else (e.g.,
> previously migrated values).

So this is only an issue for pages that are pushed out of the balloon
as a part of the shrinker process though. So fixing it would be pretty
straightforward as we would just have to initialize or at least dirty
pages that are leaked as a part of the shrinker. That may have an
impact on performance though as it would result in us dirtying pages
that are freed as a result of the shrinker being triggered.

> 3. If I am not wrong, the iothread works in
> virtio_ballloon_get_free_page_hints() on the virtqueue only with holding
> the free_page_lock (no BQL).
>
> Assume we're migrating, the iothread is active, and the guest triggers a
> device reset.
>
> virtio_balloon_device_reset() will trigger a
> virtio_balloon_free_page_stop(s). That won't actually wait for the
> iothread to stop, it will only temporarily lock free_page_lock and
> update s->free_page_report_status.
>
> I think there can be a race between the device reset and the iothread.
> Once virtio_balloon_free_page_stop() returned,
> virtio_ballloon_get_free_page_hints() can still call
> - virtio_queue_set_notification(vq, 0);
> - virtio_queue_set_notification(vq, 1);
> - virtio_notify(vdev, vq);
> - virtqueue_pop()
>
> I doubt this is very nice.

Our conversation had me start looking through the references to
virtio_balloon_free_page_stop. It looks like we call it when we
unrealize the device or reset the device. It might make more sense for
us to look at pushing the status to DONE and forcing the iothread to
be flushed out.

> There are other concerns I had regarding the iothread (e.g., while
> reporting is active, virtio_ballloon_get_free_page_hints() is
> essentially a busy loop, in contrast to documented -
> continue_to_get_hints will always be true).
>
> > The appeal of hinting is that it's 0 overhead outside migration,
> > and pains were taken to avoid keeping pages locked while
> > hypervisor is busy.
> >
> > If we are to drop hinting completely we need to show that reporting
> > can be comparable, and we'll probably want to add a mode for
> > reporting that behaves somewhat similarly.
>
> Depends on the actual users. If we're dropping a feature that nobody is
> actively using, I don't think we have to show anything.
>
> This feature obviously saw no proper review.

I'm pretty sure it had some, as it went through several iterations as
I recall. However I don't think the review of the virtio interface was
very detailed as I think most of the attention was on the kernel
interface.

As far as trying to do this with page reporting it would be doable,
but I would need to use something like the command interface so that I
would have a way to tell the driver when to drop the reported bit from
the pages and when to stop/resume hinting. However it still wouldn't
resolve the issue of copy on write style pages where the page is only
read and

Re: [DISCUSSION] GCOV support

2020-06-18 Thread Aleksandar Markovic

On Thursday, 18 June 2020, Aleksandar Markovic <
aleksandar.qemu.de...@gmail.com> wrote:

> Hi, Alex, Peter.
>
> You may recall that I signalled on a couple of occasions that there are some
> problems related to gcov builds in out-of-tree builds.
>
> It turned out that those problems manifest on some older Linux
> distributions, and are always related to gcovr being older than 4.1. An
> older gcovr simply doesn't properly connect an executable with its
> source files, and no coverage report is generated (or perhaps only some
> small portions, but, in any case, gcov builds are virtually unusable).
>
> I propose that we don't bother supporting systems with gcovr older than
> 4.1. We could check the version of gcovr in configure, and refuse gcov builds
> if that version is older than 4.1.
>
>
More precisely, I propose that "configure --enable-gcov" should not be
possible if the gcovr version is older than 4.1 or, of course, if gcovr is
absent from the system altogether.

(Note: In-tree gcov builds do not have this problem; older gcovr versions
work perfectly there, but we want to switch to out-of-tree builds only anyway.)



> This would remove one obstacle towards removing the support of in-tree
> builds. (I am not sure about future Meson-based builds; I hope they will
> support gcov builds, and work in an almost identical way.)
>
> If you agree with the proposal on the level of design, Alex, can you perhaps
> write the corresponding patch? I gather you are more familiar with
> modifying configure than I am. Or should I do it?
>
> Warmly,
> Aleksandar
>

Re: [RFC v6 1/4] softmmu: move softmmu only files from root

2020-06-18 Thread Laurent Vivier

On 18/06/2020 21:03, Claudio Fontana wrote:
> move arch_init, balloon, cpus, ioport, memory, memory_mapping, qtest.
> 
> They are all specific to CONFIG_SOFTMMU.
> 
> Signed-off-by: Claudio Fontana 
> Reviewed-by: Alex Bennée 
> ---
>  MAINTAINERS  | 12 ++--
>  Makefile.target  |  7 ++-
>  softmmu/Makefile.objs| 10 ++
>  arch_init.c => softmmu/arch_init.c   |  0
>  balloon.c => softmmu/balloon.c   |  0
>  cpus.c => softmmu/cpus.c |  0
>  ioport.c => softmmu/ioport.c |  0
>  memory.c => softmmu/memory.c |  0
>  memory_mapping.c => softmmu/memory_mapping.c |  0
>  qtest.c => softmmu/qtest.c   |  0
>  10 files changed, 18 insertions(+), 11 deletions(-)
>  rename arch_init.c => softmmu/arch_init.c (100%)
>  rename balloon.c => softmmu/balloon.c (100%)
>  rename cpus.c => softmmu/cpus.c (100%)
>  rename ioport.c => softmmu/ioport.c (100%)
>  rename memory.c => softmmu/memory.c (100%)
>  rename memory_mapping.c => softmmu/memory_mapping.c (100%)
>  rename qtest.c => softmmu/qtest.c (100%)
> 

Good idea...

Reviewed-by: Laurent Vivier

Thanks,
Laurent

[PATCH] riscv: plic: Honour source priorities

2020-06-18 Thread Jessica Clarke

The source priorities can be used to order sources with respect to other
sources, not just as a way to enable/disable them based off a threshold.
We must therefore always claim the highest-priority source, rather than
the first source we find.

Signed-off-by: Jessica Clarke 
---
 hw/riscv/sifive_plic.c | 19 ++-
 1 file changed, 14 insertions(+), 5 deletions(-)

diff --git a/hw/riscv/sifive_plic.c b/hw/riscv/sifive_plic.c
index 4f216c5585..911e006095 100644
--- a/hw/riscv/sifive_plic.c
+++ b/hw/riscv/sifive_plic.c
@@ -166,6 +166,9 @@ static void sifive_plic_update(SiFivePLICState *plic)
 static uint32_t sifive_plic_claim(SiFivePLICState *plic, uint32_t addrid)
 {
     int i, j;
+    uint32_t max_irq = 0;
+    uint32_t max_prio = 0;
+
     for (i = 0; i < plic->bitfield_words; i++) {
         uint32_t pending_enabled_not_claimed =
             (plic->pending[i] & ~plic->claimed[i]) &
@@ -177,14 +180,20 @@ static uint32_t sifive_plic_claim(SiFivePLICState *plic, uint32_t addrid)
             int irq = (i << 5) + j;
             uint32_t prio = plic->source_priority[irq];
             int enabled = pending_enabled_not_claimed & (1 << j);
-            if (enabled && prio > plic->target_priority[addrid]) {
-                sifive_plic_set_pending(plic, irq, false);
-                sifive_plic_set_claimed(plic, irq, true);
-                return irq;
+            if (enabled && prio > plic->target_priority[addrid] &&
+                prio > max_prio)
+            {
+                max_irq = irq;
+                max_prio = prio;
             }
         }
     }
-    return 0;
+
+    if (max_irq) {
+        sifive_plic_set_pending(plic, max_irq, false);
+        sifive_plic_set_claimed(plic, max_irq, true);
+    }
+    return max_irq;
 }
 
 static uint64_t sifive_plic_read(void *opaque, hwaddr addr, unsigned size)
-- 
2.20.1

Usage of pci bus

2020-06-18 Thread Gautam Bhat

Hi,

I am confused about the usage of the PCI bus for connecting different
peripherals. If I want to emulate an ARM board which doesn't have a
PCI controller, how can I emulate it to be as close to the real board
as possible? Is there an ARM interconnect or something similar where I can
connect the peripheral controllers, and the peripherals to these
controllers?

Thanks,
Gautam.

[DISCUSSION] GCOV support

2020-06-18 Thread Aleksandar Markovic

Hi, Alex, Peter.

You may recall that I signalled on a couple of occasions that there are some
problems related to gcov builds in out-of-tree builds.

It turned out that those problems manifest on some older Linux
distributions, and are always related to gcovr being older than 4.1. An
older gcovr simply doesn't properly connect an executable with its
source files, and no coverage report is generated (or perhaps only some
small portions, but, in any case, gcov builds are virtually unusable).

I propose that we don't bother supporting systems with gcovr older than
4.1. We could check the version of gcovr in configure, and refuse gcov builds
if that version is older than 4.1.

This would remove one obstacle towards removing the support of in-tree
builds. (I am not sure about future Meson-based builds; I hope they will
support gcov builds, and work in an almost identical way.)

If you agree with the proposal on the level of design, Alex, can you perhaps
write the corresponding patch? I gather you are more familiar with
modifying configure than I am. Or should I do it?

Warmly,
Aleksandar
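
The proposed check boils down to a "version is at least 4.1" comparison. configure is a shell script, so the C below is only a sketch of the comparison logic, not a patch. The function name version_at_least is hypothetical, and the assumption that the tool's version banner contains a "MAJOR.MINOR" pair (for example "gcovr 4.2") has not been verified against gcovr itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Find the first "MAJOR.MINOR" pair in `text` and compare it with a
 * minimum version. Returns false when no version number can be parsed.
 */
bool version_at_least(const char *text, int min_major, int min_minor)
{
    int major, minor;

    assert(text != NULL);
    /* Skip leading non-digit characters such as a program name. */
    while (*text && (*text < '0' || *text > '9')) {
        text++;
    }
    if (sscanf(text, "%d.%d", &major, &minor) != 2) {
        return false;
    }
    return major > min_major || (major == min_major && minor >= min_minor);
}
```

Comparing the numbers rather than the strings matters here: a plain string comparison would rank "4.10" below "4.2".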

Re: [Virtio-fs] [PATCH 0/2] virtiofsd: drop Linux capabilities(7)

2020-06-18 Thread Vivek Goyal

On Thu, Jun 18, 2020 at 08:16:55PM +0100, Dr. David Alan Gilbert wrote:
> * Vivek Goyal (vgo...@redhat.com) wrote:
> > On Thu, Apr 16, 2020 at 05:49:05PM +0100, Stefan Hajnoczi wrote:
> > > virtiofsd doesn't need all of the Linux capabilities(7) available to root.
> > > Keep a whitelisted set of capabilities that we require.  This improves
> > > security in case virtiofsd is compromised by making it hard for an
> > > attacker to gain further access to the system.
> > 
> > Hi Stefan,
> > 
> > I just noticed that this patch set breaks overlayfs on top of virtiofs.
> > 
> > overlayfs sets "trusted.overlay.*" and xattrs in trusted domain
> > need CAP_SYS_ADMIN.
> > 
> > man xattr says.
> > 
> > >    Trusted extended attributes
> > >    Trusted extended attributes are visible and accessible only to
> > >    processes that have the CAP_SYS_ADMIN capability.  Attributes in
> > >    this class are used to implement mechanisms in user space (i.e.,
> > >    outside the kernel) which keep information in extended attributes
> > >    to which ordinary processes should not have access.
> > 
> > There is a chance that overlay moves away from trusted xattr in future.
> > But for now we need to make it work. This is an important use case for
> > kata docker in docker build.
> > 
> > Maybe we can add an option to virtiofsd, say "--add-cap ", and
> > ask the user to pass in "--add-cap cap_sys_admin" if they need to run the
> > daemon with this capability.
> 
> I'll admit I don't like the idea of giving it cap_sys_admin.
> Can you explain:
>   a) What overlayfs uses trusted for?

overlayfs stores a bunch of metadata and uses "trusted" xattrs for it.

>   b) If something nasty was to write junk into the trusted attributes,
> what would happen?

This directory is owned by the guest. So it should be able to write
anything it wants, as long as the process in the guest has CAP_SYS_ADMIN, right?

>   c) I see overlayfs has a fallback check if xattr isn't supported at
> all - what is the consequence?

I think it falls back to read-only mode.

For a moment forget about overlayfs. Say a user process in the guest with
CAP_SYS_ADMIN is writing trusted.foo. Should that succeed? This is a
passthrough filesystem, so it should go through. But currently it
won't.

Thanks
Vivek
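
To make the trusted.foo case concrete, here is a minimal sketch of the namespace test that decides whether an xattr name falls under the CAP_SYS_ADMIN rule quoted from the man page. The helper name xattr_is_trusted is hypothetical; it only classifies names and does not call into the kernel or into virtiofsd.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Prefix of the trusted xattr namespace, as in "trusted.overlay.*". */
#define TRUSTED_PREFIX "trusted."

/* True if `name` is in the namespace that requires CAP_SYS_ADMIN. */
bool xattr_is_trusted(const char *name)
{
    assert(name != NULL);
    return strncmp(name, TRUSTED_PREFIX, strlen(TRUSTED_PREFIX)) == 0;
}
```

Any request for which this returns true is one that a daemon without CAP_SYS_ADMIN cannot pass through to the host filesystem, which is the failure described above.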

Re: [RFC v6 0/4] QEMU cpus.c refactoring

Patchew URL: https://patchew.org/QEMU/20200618190401.4895-1-cfont...@suse.de/



Hi,

This series failed the asan build test. Please find the testing commands and
their output below. If you have Docker installed, you can probably reproduce it
locally.

=== TEST SCRIPT BEGIN ===
#!/bin/bash
export ARCH=x86_64
make docker-image-fedora V=1 NETWORK=1
time make docker-test-debug@fedora TARGET_LIST=x86_64-softmmu J=14 NETWORK=1
=== TEST SCRIPT END ===

  CC  backends/rng-builtin.o
  CC  backends/rng-random.o
  CC  backends/tpm.o
/tmp/qemu-test/src/dma-helpers.c:154:20: error: use of undeclared identifier 'use_icount'
        if (mem && use_icount && dbs->dir == DMA_DIRECTION_FROM_DEVICE) {
                   ^
1 error generated.
make: *** [/tmp/qemu-test/src/rules.mak:69: dma-helpers.o] Error 1
make: *** Waiting for unfinished jobs
Traceback (most recent call last):
  File "./tests/docker/docker.py", line 669, in 
---
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['sudo', '-n', 'docker', 'run', 
'--label', 'com.qemu.instance.uuid=ee200de2b1c042c6a61d565997d83f7d', '-u', 
'1003', '--security-opt', 'seccomp=unconfined', '--rm', '-e', 
'TARGET_LIST=x86_64-softmmu', '-e', 'EXTRA_CONFIGURE_OPTS=', '-e', 'V=', '-e', 
'J=14', '-e', 'DEBUG=', '-e', 'SHOW_ENV=', '-e', 'CCACHE_DIR=/var/tmp/ccache', 
'-v', '/home/patchew2/.cache/qemu-docker-ccache:/var/tmp/ccache:z', '-v', 
'/var/tmp/patchew-tester-tmp-yi6rts79/src/docker-src.2020-06-18-15.16.58.31304:/var/tmp/qemu:z,ro',
 'qemu:fedora', '/var/tmp/qemu/run', 'test-debug']' returned non-zero exit 
status 2.
filter=--filter=label=com.qemu.instance.uuid=ee200de2b1c042c6a61d565997d83f7d
make[1]: *** [docker-run] Error 1
make[1]: Leaving directory `/var/tmp/patchew-tester-tmp-yi6rts79/src'
make: *** [docker-run-test-debug@fedora] Error 2

real    3m49.171s
user    0m8.544s


The full log is available at
http://patchew.org/logs/20200618190401.4895-1-cfont...@suse.de/testing.asan/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-de...@redhat.com

Re: [PATCH v7 14/42] target/arm: Add helper_probe_access

On 6/18/20 6:33 AM, Peter Maydell wrote:
> On Wed, 3 Jun 2020 at 02:13, Richard Henderson
>  wrote:
>>
>> Raise an exception if the given virtual memory is not accessible.
>>
>> Signed-off-by: Richard Henderson 
>> ---
> 
>> diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
>> index 0ee2ef403e..b032829194 100644
>> --- a/target/arm/translate-a64.c
>> +++ b/target/arm/translate-a64.c
>> @@ -232,6 +232,19 @@ static void gen_address_with_allocation_tag0(TCGv_i64 dst, TCGv_i64 src)
>>      tcg_gen_andi_i64(dst, src, ~MAKE_64BIT_MASK(56, 4));
>>  }
>>
>> +static void gen_probe_access(DisasContext *s, TCGv_i64 ptr,
>> +                             MMUAccessType acc, int log2_size)
>> +{
>> +    TCGv_i32 t_acc = tcg_const_i32(acc);
>> +    TCGv_i32 t_idx = tcg_const_i32(get_mem_index(s));
>> +    TCGv_i32 t_size = tcg_const_i32(1 << log2_size);
>> +
>> +    gen_helper_probe_access(cpu_env, ptr, t_acc, t_idx, t_size);
>> +    tcg_temp_free_i32(t_acc);
>> +    tcg_temp_free_i32(t_idx);
>> +    tcg_temp_free_i32(t_size);
>> +}
> 
> This isn't called from anywhere -- clang is probably going to
> complain about that.

Ah, yes.  I thought it would be helpful to split this patch, but I guess not.
I'll merge with the next patch where it's used.

r~

Re: [Virtio-fs] [PATCH 0/2] virtiofsd: drop Linux capabilities(7)

2020-06-18 Thread Dr. David Alan Gilbert

* Vivek Goyal (vgo...@redhat.com) wrote:
> On Thu, Apr 16, 2020 at 05:49:05PM +0100, Stefan Hajnoczi wrote:
> > virtiofsd doesn't need all of the Linux capabilities(7) available to root.
> > Keep a whitelisted set of capabilities that we require.  This improves
> > security in case virtiofsd is compromised by making it hard for an
> > attacker to gain further access to the system.
> 
> Hi Stefan,
> 
> I just noticed that this patch set breaks overlayfs on top of virtiofs.
> 
> overlayfs sets "trusted.overlay.*" and xattrs in trusted domain
> need CAP_SYS_ADMIN.
> 
> man xattr says.
> 
>Trusted extended attributes
>Trusted extended attributes are visible and accessible only to
>processes that have the CAP_SYS_ADMIN capability.  Attributes in this
>class are used to implement mechanisms in user space (i.e., outside the
>kernel) which keep information in extended attributes to which ordinary
>processes should not have access.
> 
> There is a chance that overlay moves away from trusted xattr in future.
> But for now we need to make it work. This is an important use case for
> kata docker in docker build.
> 
> Maybe we can add an option to virtiofsd, say "--add-cap ", and
> ask the user to pass in "--add-cap cap_sys_admin" if they need to run the
> daemon with this capability.

I'll admit I don't like the idea of giving it cap_sys_admin.
Can you explain:
  a) What overlayfs uses trusted for?
  b) If something nasty was to write junk into the trusted attributes,
what would happen?
  c) I see overlayfs has a fallback check if xattr isn't supported at
all - what is the consequence?

Dave

> Thanks
> Vivek
> 
> > 
> > Stefan Hajnoczi (2):
> >   virtiofsd: only retain file system capabilities
> >   virtiofsd: drop all capabilities in the wait parent process
> > 
> >  tools/virtiofsd/passthrough_ll.c | 51 
> >  1 file changed, 51 insertions(+)
> > 
> > -- 
> > 2.25.1
> > 
> > ___
> > Virtio-fs mailing list
> > virtio...@redhat.com
> > https://www.redhat.com/mailman/listinfo/virtio-fs
> 
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

Re: [RFC v6 0/4] QEMU cpus.c refactoring

Patchew URL: https://patchew.org/QEMU/20200618190401.4895-1-cfont...@suse.de/



Hi,

This series failed the docker-quick@centos7 build test. Please find the testing 
commands and
their output below. If you have Docker installed, you can probably reproduce it
locally.

=== TEST SCRIPT BEGIN ===
#!/bin/bash
make docker-image-centos7 V=1 NETWORK=1
time make docker-test-quick@centos7 SHOW_ENV=1 J=14 NETWORK=1
=== TEST SCRIPT END ===

  CC  audio/spiceaudio.o
  CC  audio/wavcapture.o
/tmp/qemu-test/src/dma-helpers.c: In function 'dma_blk_cb':
/tmp/qemu-test/src/dma-helpers.c:154:20: error: 'use_icount' undeclared (first use in this function)
         if (mem && use_icount && dbs->dir == DMA_DIRECTION_FROM_DEVICE) {
                    ^
/tmp/qemu-test/src/dma-helpers.c:154:20: note: each undeclared identifier is reported only once for each function it appears in
make: *** [dma-helpers.o] Error 1
make: *** Waiting for unfinished jobs
Traceback (most recent call last):
  File "./tests/docker/docker.py", line 669, in 
---
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['sudo', '-n', 'docker', 'run', 
'--label', 'com.qemu.instance.uuid=a051d518277a4c87b0693e58cfd4408b', '-u', 
'1001', '--security-opt', 'seccomp=unconfined', '--rm', '-e', 'TARGET_LIST=', 
'-e', 'EXTRA_CONFIGURE_OPTS=', '-e', 'V=', '-e', 'J=14', '-e', 'DEBUG=', '-e', 
'SHOW_ENV=1', '-e', 'CCACHE_DIR=/var/tmp/ccache', '-v', 
'/home/patchew/.cache/qemu-docker-ccache:/var/tmp/ccache:z', '-v', 
'/var/tmp/patchew-tester-tmp-k2f41z2d/src/docker-src.2020-06-18-15.14.16.19744:/var/tmp/qemu:z,ro',
 'qemu:centos7', '/var/tmp/qemu/run', 'test-quick']' returned non-zero exit 
status 2.
filter=--filter=label=com.qemu.instance.uuid=a051d518277a4c87b0693e58cfd4408b
make[1]: *** [docker-run] Error 1
make[1]: Leaving directory `/var/tmp/patchew-tester-tmp-k2f41z2d/src'
make: *** [docker-run-test-quick@centos7] Error 2

real    2m27.787s
user    0m8.701s


The full log is available at
http://patchew.org/logs/20200618190401.4895-1-cfont...@suse.de/testing.docker-quick@centos7/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-de...@redhat.com

Re: [RFC v6 0/4] QEMU cpus.c refactoring

Patchew URL: https://patchew.org/QEMU/20200618190401.4895-1-cfont...@suse.de/



Hi,

This series failed the docker-mingw@fedora build test. Please find the testing 
commands and
their output below. If you have Docker installed, you can probably reproduce it
locally.

=== TEST SCRIPT BEGIN ===
#! /bin/bash
export ARCH=x86_64
make docker-image-fedora V=1 NETWORK=1
time make docker-test-mingw@fedora J=14 NETWORK=1
=== TEST SCRIPT END ===

  CC  backends/rng-builtin.o
  CC  backends/hostmem.o
/tmp/qemu-test/src/dma-helpers.c: In function 'dma_blk_cb':
/tmp/qemu-test/src/dma-helpers.c:154:20: error: 'use_icount' undeclared (first use in this function)
  154 |         if (mem && use_icount && dbs->dir == DMA_DIRECTION_FROM_DEVICE) {
      |                    ^~~~~~~~~~
/tmp/qemu-test/src/dma-helpers.c:154:20: note: each undeclared identifier is reported only once for each function it appears in
make: *** [/tmp/qemu-test/src/rules.mak:69: dma-helpers.o] Error 1
make: *** Waiting for unfinished jobs
Traceback (most recent call last):
  File "./tests/docker/docker.py", line 669, in 
---
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['sudo', '-n', 'docker', 'run', 
'--label', 'com.qemu.instance.uuid=552161979fa24612938148f7958fb4a8', '-u', 
'1003', '--security-opt', 'seccomp=unconfined', '--rm', '-e', 'TARGET_LIST=', 
'-e', 'EXTRA_CONFIGURE_OPTS=', '-e', 'V=', '-e', 'J=14', '-e', 'DEBUG=', '-e', 
'SHOW_ENV=', '-e', 'CCACHE_DIR=/var/tmp/ccache', '-v', 
'/home/patchew2/.cache/qemu-docker-ccache:/var/tmp/ccache:z', '-v', 
'/var/tmp/patchew-tester-tmp-469ygt4n/src/docker-src.2020-06-18-15.12.41.17535:/var/tmp/qemu:z,ro',
 'qemu:fedora', '/var/tmp/qemu/run', 'test-mingw']' returned non-zero exit 
status 2.
filter=--filter=label=com.qemu.instance.uuid=552161979fa24612938148f7958fb4a8
make[1]: *** [docker-run] Error 1
make[1]: Leaving directory `/var/tmp/patchew-tester-tmp-469ygt4n/src'
make: *** [docker-run-test-mingw@fedora] Error 2

real    3m18.463s
user    0m8.153s


The full log is available at
http://patchew.org/logs/20200618190401.4895-1-cfont...@suse.de/testing.docker-mingw@fedora/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-de...@redhat.com

Re: [Virtio-fs] [PATCH 0/2] virtiofsd: drop Linux capabilities(7)

2020-06-18 Thread Vivek Goyal

On Thu, Apr 16, 2020 at 05:49:05PM +0100, Stefan Hajnoczi wrote:
> virtiofsd doesn't need all of the Linux capabilities(7) available to root.
> Keep a whitelisted set of capabilities that we require.  This improves
> security in case virtiofsd is compromised by making it hard for an attacker
> to gain further access to the system.

Hi Stefan,

I just noticed that this patch set breaks overlayfs on top of virtiofs.

overlayfs sets "trusted.overlay.*" and xattrs in trusted domain
need CAP_SYS_ADMIN.

man xattr says.

   Trusted extended attributes
   Trusted extended attributes are visible and accessible only to
   processes that have the CAP_SYS_ADMIN capability.  Attributes in this
   class are used to implement mechanisms in user space (i.e., outside the
   kernel) which keep information in extended attributes to which ordinary
   processes should not have access.

There is a chance that overlay moves away from trusted xattr in future.
But for now we need to make it work. This is an important use case for
kata docker in docker build.

Maybe we can add an option to virtiofsd, say "--add-cap ", and
ask the user to pass in "--add-cap cap_sys_admin" if they need to run the
daemon with this capability.

Thanks
Vivek

> 
> Stefan Hajnoczi (2):
>   virtiofsd: only retain file system capabilities
>   virtiofsd: drop all capabilities in the wait parent process
> 
>  tools/virtiofsd/passthrough_ll.c | 51 
>  1 file changed, 51 insertions(+)
> 
> -- 
> 2.25.1
> 

[RFC v6 3/4] cpu-timers, icount: new modules

refactoring of cpus.c continues with cpu timer state extraction.

cpu-timers: responsible for the cpu timers state, and for access to
cpu clocks and ticks.

icount: counts the TCG instructions executed. As such it is specific to
the TCG accelerator. Therefore, it is built only under CONFIG_TCG.

One complication is due to qtest, which misuses icount to warp time
(qtest_clock_warp). To solve this problem, detach qtest from icount
instead, and use a trivial separate counter for it.

This requires fixing assumptions scattered in the code that
qtest_enabled() implies icount_enabled().

No functionality change.

Signed-off-by: Claudio Fontana 
Reviewed-by: Alex Bennée 
---
 MAINTAINERS  |   2 +
 accel/qtest.c|   6 +-
 accel/tcg/cpu-exec.c |  43 ++-
 accel/tcg/tcg-all.c  |   7 +-
 accel/tcg/translate-all.c|   3 +-
 docs/replay.txt  |   6 +-
 exec.c   |   4 -
 hw/core/ptimer.c |   8 +-
 hw/i386/x86.c|   1 +
 include/exec/cpu-all.h   |   4 +
 include/exec/exec-all.h  |   4 +-
 include/qemu/timer.h |  22 +-
 include/sysemu/cpu-timers.h  |  81 +
 include/sysemu/cpus.h|  12 +-
 include/sysemu/qtest.h   |   2 +
 include/sysemu/replay.h  |   4 +-
 replay/replay.c  |   6 +-
 softmmu/Makefile.objs|   2 +
 softmmu/cpu-timers.c | 284 
 softmmu/cpus.c   | 750 +--
 softmmu/icount.c | 499 
 softmmu/qtest.c  |  34 +-
 softmmu/timers-state.h   |  69 
 softmmu/vl.c |   8 +-
 stubs/Makefile.objs  |   3 +-
 stubs/clock-warp.c   |   4 +-
 stubs/cpu-get-clock.c|   3 +-
 stubs/cpu-get-icount.c   |  21 --
 stubs/icount.c   |  22 ++
 stubs/qemu-timer-notify-cb.c |   8 +
 stubs/qtest.c|   5 +
 target/alpha/translate.c |   3 +-
 target/arm/helper.c  |   7 +-
 target/riscv/csr.c   |   8 +-
 tests/ptimer-test-stubs.c|   7 +-
 tests/test-timed-average.c   |   2 +-
 util/main-loop.c |   4 +-
 util/qemu-timer.c|  12 +-
 38 files changed, 1119 insertions(+), 851 deletions(-)
 create mode 100644 include/sysemu/cpu-timers.h
 create mode 100644 softmmu/cpu-timers.c
 create mode 100644 softmmu/icount.c
 create mode 100644 softmmu/timers-state.h
 delete mode 100644 stubs/cpu-get-icount.c
 create mode 100644 stubs/icount.c
 create mode 100644 stubs/qemu-timer-notify-cb.c
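
The "trivial separate counter" mentioned in the description can be pictured with the sketch below. It is an illustration under stated assumptions, not the code from this series: the names sketch_clock_get and sketch_clock_warp are hypothetical, and a real warp function would presumably also run any timers that become due, which is omitted here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a qtest-only virtual clock, in nanoseconds. */
static int64_t sketch_clock_ns;

/* Read the test clock; no instruction counting is involved. */
int64_t sketch_clock_get(void)
{
    return sketch_clock_ns;
}

/* Warp the clock forward to `dest`; the clock never moves backwards. */
void sketch_clock_warp(int64_t dest)
{
    assert(dest >= 0);
    if (dest > sketch_clock_ns) {
        sketch_clock_ns = dest;
    }
}
```

Keeping this counter independent of the instruction counter is what lets the icount code be built only under CONFIG_TCG while the test clock stays available to qtest.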

diff --git a/MAINTAINERS b/MAINTAINERS
index 78c8f2490c..cc355c4fac 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2180,6 +2180,8 @@ F: softmmu/vl.c
 F: softmmu/main.c
 F: softmmu/cpus.c
 F: softmmu/cpu-throttle.c
+F: softmmu/cpu-timers.c
+F: softmmu/icount.c
 F: qapi/run-state.json
 
 Human Monitor (HMP)
diff --git a/accel/qtest.c b/accel/qtest.c
index 5b88f55921..119d0f16a4 100644
--- a/accel/qtest.c
+++ b/accel/qtest.c
@@ -19,14 +19,10 @@
 #include "sysemu/accel.h"
 #include "sysemu/qtest.h"
 #include "sysemu/cpus.h"
+#include "sysemu/cpu-timers.h"
 
 static int qtest_init_accel(MachineState *ms)
 {
-    QemuOpts *opts = qemu_opts_create(qemu_find_opts("icount"), NULL, 0,
-                                      &error_abort);
-    qemu_opt_set(opts, "shift", "0", &error_abort);
-    configure_icount(opts, &error_abort);
-    qemu_opts_del(opts);
     return 0;
 }
 
diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
index d95c4848a4..82155c1db3 100644
--- a/accel/tcg/cpu-exec.c
+++ b/accel/tcg/cpu-exec.c
@@ -19,6 +19,7 @@
 
 #include "qemu/osdep.h"
 #include "qemu-common.h"
+#include "qemu/qemu-print.h"
 #include "cpu.h"
 #include "trace.h"
 #include "disas/disas.h"
@@ -36,6 +37,8 @@
 #include "hw/i386/apic.h"
 #endif
 #include "sysemu/cpus.h"
+#include "exec/cpu-all.h"
+#include "sysemu/cpu-timers.h"
 #include "sysemu/replay.h"
 
 /* -icount align implementation. */
@@ -56,6 +59,9 @@ typedef struct SyncClocks {
 #define MAX_DELAY_PRINT_RATE 20LL
 #define MAX_NB_PRINTS 100
 
+static int64_t max_delay;
+static int64_t max_advance;
+
 static void align_clocks(SyncClocks *sc, CPUState *cpu)
 {
     int64_t cpu_icount;
@@ -65,7 +71,7 @@ static void align_clocks(SyncClocks *sc, CPUState *cpu)
     }
 
     cpu_icount = cpu->icount_extra + cpu_neg(cpu)->icount_decr.u16.low;
-    sc->diff_clk += cpu_icount_to_ns(sc->last_cpu_icount - cpu_icount);
+    sc->diff_clk += icount_to_ns(sc->last_cpu_icount - cpu_icount);
     sc->last_cpu_icount = cpu_icount;
 
     if (sc->diff_clk > VM_CLOCK_ADVANCE) {
@@ -98,9 +104,9 @@ static void print_delay(const SyncClocks *sc)
 (-sc->diff_clk / (float)10LL <
  (threshold_delay - THRESHOLD_REDUCE))) {
 threshold_delay = (-sc->diff_clk / 10LL) + 1;
-printf("Warning: The guest is now late by %.1f to %.1f seconds\n",
-   threshold_delay - 1,
-

[RFC v6 2/4] cpu-throttle: new module, extracted from cpus.c

move the vcpu throttling functionality into its own module.

This functionality is not specific to any accelerator,
and it is used currently by migration to slow down guests to try to
have migrations converge, and by the cocoa MacOS UI to throttle speed.

cpu-throttle contains the controls to adjust and inspect throttle
settings, start (set) and stop vcpu throttling, and the throttling
function itself that is run periodically on vcpus to make them take a nap.

Execution of the throttling function on all vcpus is triggered by a timer,
registered at module initialization.

No functionality change.

Signed-off-by: Claudio Fontana 
Reviewed-by: Alex Bennée 
---
 MAINTAINERS   |   1 +
 include/hw/core/cpu.h |  37 -
 include/qemu/main-loop.h  |   5 ++
 include/sysemu/cpu-throttle.h |  68 +++
 migration/migration.c |   1 +
 migration/ram.c   |   1 +
 softmmu/Makefile.objs |   1 +
 softmmu/cpu-throttle.c| 122 ++
 softmmu/cpus.c|  95 +++-
 9 files changed, 207 insertions(+), 124 deletions(-)
 create mode 100644 include/sysemu/cpu-throttle.h
 create mode 100644 softmmu/cpu-throttle.c

diff --git a/MAINTAINERS b/MAINTAINERS
index 31e5a7aa4d..78c8f2490c 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2179,6 +2179,7 @@ F: util/qemu-timer.c
 F: softmmu/vl.c
 F: softmmu/main.c
 F: softmmu/cpus.c
+F: softmmu/cpu-throttle.c
 F: qapi/run-state.json
 
 Human Monitor (HMP)
diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
index b3f4b79318..5542577d2b 100644
--- a/include/hw/core/cpu.h
+++ b/include/hw/core/cpu.h
@@ -822,43 +822,6 @@ bool cpu_exists(int64_t id);
  */
 CPUState *cpu_by_arch_id(int64_t id);
 
-/**
- * cpu_throttle_set:
- * @new_throttle_pct: Percent of sleep time. Valid range is 1 to 99.
- *
- * Throttles all vcpus by forcing them to sleep for the given percentage of
- * time. A throttle_percentage of 25 corresponds to a 75% duty cycle roughly.
- * (example: 10ms sleep for every 30ms awake).
- *
- * cpu_throttle_set can be called as needed to adjust new_throttle_pct.
- * Once the throttling starts, it will remain in effect until cpu_throttle_stop
- * is called.
- */
-void cpu_throttle_set(int new_throttle_pct);
-
-/**
- * cpu_throttle_stop:
- *
- * Stops the vcpu throttling started by cpu_throttle_set.
- */
-void cpu_throttle_stop(void);
-
-/**
- * cpu_throttle_active:
- *
- * Returns: %true if the vcpus are currently being throttled, %false otherwise.
- */
-bool cpu_throttle_active(void);
-
-/**
- * cpu_throttle_get_percentage:
- *
- * Returns the vcpu throttle percentage. See cpu_throttle_set for details.
- *
- * Returns: The throttle percentage in range 1 to 99.
- */
-int cpu_throttle_get_percentage(void);
-
 #ifndef CONFIG_USER_ONLY
 
 typedef void (*CPUInterruptHandler)(CPUState *, int);
diff --git a/include/qemu/main-loop.h b/include/qemu/main-loop.h
index a6d20b0719..8e98613656 100644
--- a/include/qemu/main-loop.h
+++ b/include/qemu/main-loop.h
@@ -303,6 +303,11 @@ void qemu_mutex_unlock_iothread(void);
  */
 void qemu_cond_wait_iothread(QemuCond *cond);
 
+/*
+ * qemu_cond_timedwait_iothread: like the previous, but with timeout
+ */
+void qemu_cond_timedwait_iothread(QemuCond *cond, int ms);
+
 /* internal interfaces */
 
 void qemu_fd_register(int fd);
diff --git a/include/sysemu/cpu-throttle.h b/include/sysemu/cpu-throttle.h
new file mode 100644
index 00..d65bdef6d0
--- /dev/null
+++ b/include/sysemu/cpu-throttle.h
@@ -0,0 +1,68 @@
+/*
+ * Copyright (c) 2012 SUSE LINUX Products GmbH
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, see
+ * 
+ */
+
+#ifndef SYSEMU_CPU_THROTTLE_H
+#define SYSEMU_CPU_THROTTLE_H
+
+#include "qemu/timer.h"
+
+/**
+ * cpu_throttle_init:
+ *
+ * Initialize the CPU throttling API.
+ */
+void cpu_throttle_init(void);
+
+/**
+ * cpu_throttle_set:
+ * @new_throttle_pct: Percent of sleep time. Valid range is 1 to 99.
+ *
+ * Throttles all vcpus by forcing them to sleep for the given percentage of
+ * time. A throttle_percentage of 25 corresponds to a 75% duty cycle roughly.
+ * (example: 10ms sleep for every 30ms awake).
+ *
+ * cpu_throttle_set can be called as needed to adjust new_throttle_pct.
+ * Once the throttling starts, it will remain in effect until cpu_throttle_stop
+ *

[RFC v6 1/4] softmmu: move softmmu only files from root

Move arch_init, balloon, cpus, ioport, memory, memory_mapping, qtest.

They are all specific to CONFIG_SOFTMMU.

Signed-off-by: Claudio Fontana 
Reviewed-by: Alex Bennée 
---
 MAINTAINERS  | 12 ++--
 Makefile.target  |  7 ++-
 softmmu/Makefile.objs| 10 ++
 arch_init.c => softmmu/arch_init.c   |  0
 balloon.c => softmmu/balloon.c   |  0
 cpus.c => softmmu/cpus.c |  0
 ioport.c => softmmu/ioport.c |  0
 memory.c => softmmu/memory.c |  0
 memory_mapping.c => softmmu/memory_mapping.c |  0
 qtest.c => softmmu/qtest.c   |  0
 10 files changed, 18 insertions(+), 11 deletions(-)
 rename arch_init.c => softmmu/arch_init.c (100%)
 rename balloon.c => softmmu/balloon.c (100%)
 rename cpus.c => softmmu/cpus.c (100%)
 rename ioport.c => softmmu/ioport.c (100%)
 rename memory.c => softmmu/memory.c (100%)
 rename memory_mapping.c => softmmu/memory_mapping.c (100%)
 rename qtest.c => softmmu/qtest.c (100%)

diff --git a/MAINTAINERS b/MAINTAINERS
index 955cc8dd5c..31e5a7aa4d 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -115,7 +115,7 @@ Overall TCG CPUs
 M: Richard Henderson 
 R: Paolo Bonzini 
 S: Maintained
-F: cpus.c
+F: softmmu/cpus.c
 F: cpus-common.c
 F: exec.c
 F: accel/tcg/
@@ -1686,7 +1686,7 @@ M: David Hildenbrand 
 S: Maintained
 F: hw/virtio/virtio-balloon*.c
 F: include/hw/virtio/virtio-balloon.h
-F: balloon.c
+F: softmmu/balloon.c
 F: include/sysemu/balloon.h
 
 virtio-9p
@@ -2135,12 +2135,12 @@ Memory API
 M: Paolo Bonzini 
 S: Supported
 F: include/exec/ioport.h
-F: ioport.c
 F: include/exec/memop.h
 F: include/exec/memory.h
 F: include/exec/ram_addr.h
 F: include/exec/ramblock.h
-F: memory.c
+F: softmmu/ioport.c
+F: softmmu/memory.c
 F: include/exec/memory-internal.h
 F: exec.c
 F: scripts/coccinelle/memory-region-housekeeping.cocci
@@ -2172,13 +2172,13 @@ F: ui/cocoa.m
 Main loop
 M: Paolo Bonzini 
 S: Maintained
-F: cpus.c
 F: include/qemu/main-loop.h
 F: include/sysemu/runstate.h
 F: util/main-loop.c
 F: util/qemu-timer.c
 F: softmmu/vl.c
 F: softmmu/main.c
+F: softmmu/cpus.c
 F: qapi/run-state.json
 
 Human Monitor (HMP)
@@ -2333,7 +2333,7 @@ M: Thomas Huth 
 M: Laurent Vivier 
 R: Paolo Bonzini 
 S: Maintained
-F: qtest.c
+F: softmmu/qtest.c
 F: accel/qtest.c
 F: tests/qtest/
 X: tests/qtest/bios-tables-test-allowed-diff.h
diff --git a/Makefile.target b/Makefile.target
index 8ed1eba95b..7fbf5d8b92 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -152,16 +152,13 @@ endif #CONFIG_BSD_USER
 #
 # System emulator target
 ifdef CONFIG_SOFTMMU
-obj-y += arch_init.o cpus.o gdbstub.o balloon.o ioport.o
-obj-y += qtest.o
+obj-y += softmmu/
+obj-y += gdbstub.o
 obj-y += dump/
 obj-y += hw/
 obj-y += monitor/
 obj-y += qapi/
-obj-y += memory.o
-obj-y += memory_mapping.o
 obj-y += migration/ram.o
-obj-y += softmmu/
 LIBS := $(libs_softmmu) $(LIBS)
 
 # Hardware support
diff --git a/softmmu/Makefile.objs b/softmmu/Makefile.objs
index dd15c24346..a4bd9f2f52 100644
--- a/softmmu/Makefile.objs
+++ b/softmmu/Makefile.objs
@@ -1,3 +1,13 @@
 softmmu-main-y = softmmu/main.o
+
+obj-y += arch_init.o
+obj-y += cpus.o
+obj-y += balloon.o
+obj-y += ioport.o
+obj-y += memory.o
+obj-y += memory_mapping.o
+
+obj-y += qtest.o
+
 obj-y += vl.o
 vl.o-cflags := $(GPROF_CFLAGS) $(SDL_CFLAGS)
diff --git a/arch_init.c b/softmmu/arch_init.c
similarity index 100%
rename from arch_init.c
rename to softmmu/arch_init.c
diff --git a/balloon.c b/softmmu/balloon.c
similarity index 100%
rename from balloon.c
rename to softmmu/balloon.c
diff --git a/cpus.c b/softmmu/cpus.c
similarity index 100%
rename from cpus.c
rename to softmmu/cpus.c
diff --git a/ioport.c b/softmmu/ioport.c
similarity index 100%
rename from ioport.c
rename to softmmu/ioport.c
diff --git a/memory.c b/softmmu/memory.c
similarity index 100%
rename from memory.c
rename to softmmu/memory.c
diff --git a/memory_mapping.c b/softmmu/memory_mapping.c
similarity index 100%
rename from memory_mapping.c
rename to softmmu/memory_mapping.c
diff --git a/qtest.c b/softmmu/qtest.c
similarity index 100%
rename from qtest.c
rename to softmmu/qtest.c
-- 
2.16.4

[RFC v6 4/4] cpus: extract out accel-specific code to each accel

Each accelerator registers a new "CpusAccel" interface
implementation on initialization, providing functions for
starting a vcpu, kicking a vcpu, and synchronizing state.

This way the code in cpus.c is now all general softmmu code,
nothing accelerator-specific anymore.

There is still some ifdeffery for WIN32 though.
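
The registration pattern described above can be sketched in isolation
as follows. This is a minimal, self-contained illustration and not the
code from this patch: the Demo* types, the member names and the toy
accelerator are all assumptions made up for the example. It keeps a
pointer to the registered ops table rather than a copy, as mentioned in
the cover letter for v6.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for CPUState; only what the example needs. */
typedef struct DemoCpu {
    int started;
    int kicked;
} DemoCpu;

/* Hypothetical ops table that each accelerator would fill in. */
typedef struct DemoCpusAccel {
    void (*create_vcpu_thread)(DemoCpu *cpu);  /* mandatory */
    void (*kick_vcpu_thread)(DemoCpu *cpu);    /* optional */
    void (*synchronize_state)(DemoCpu *cpu);   /* optional */
} DemoCpusAccel;

/* The generic code only keeps a pointer to the registered table. */
static const DemoCpusAccel *demo_cpus_accel;

static void demo_cpus_register_accel(const DemoCpusAccel *ca)
{
    assert(ca != NULL);
    assert(ca->create_vcpu_thread != NULL);
    demo_cpus_accel = ca;
}

/* Generic entry points: no accelerator-specific code here. */
static void demo_start_vcpu(DemoCpu *cpu)
{
    assert(demo_cpus_accel != NULL);
    demo_cpus_accel->create_vcpu_thread(cpu);
}

static void demo_kick_vcpu(DemoCpu *cpu)
{
    /* Optional hook: silently skip if the accelerator has none. */
    if (demo_cpus_accel && demo_cpus_accel->kick_vcpu_thread) {
        demo_cpus_accel->kick_vcpu_thread(cpu);
    }
}

/* A toy "accelerator" that just records what was asked of it. */
static void toy_create(DemoCpu *cpu) { cpu->started = 1; }
static void toy_kick(DemoCpu *cpu) { cpu->kicked++; }

static const DemoCpusAccel toy_accel = {
    .create_vcpu_thread = toy_create,
    .kick_vcpu_thread = toy_kick,
};
```

An accelerator's init function would call the register function once
with its own table; everything else goes through the generic entry
points.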

Signed-off-by: Claudio Fontana 
---
 MAINTAINERS   |   1 +
 accel/Makefile.objs   |   2 +-
 accel/kvm/Makefile.objs   |   2 +
 accel/kvm/kvm-all.c   |  15 +-
 accel/kvm/kvm-cpus.c  |  94 +
 accel/kvm/kvm-cpus.h  |  17 +
 accel/qtest/Makefile.objs |   2 +
 accel/qtest/qtest-cpus.c  | 105 +
 accel/qtest/qtest-cpus.h  |  17 +
 accel/{ => qtest}/qtest.c |   7 +
 accel/stubs/kvm-stub.c|   3 +-
 accel/tcg/Makefile.objs   |   1 +
 accel/tcg/tcg-all.c   |  12 +-
 accel/tcg/tcg-cpus.c  | 523 
 accel/tcg/tcg-cpus.h  |  17 +
 hw/core/cpu.c |   1 +
 include/sysemu/cpus.h |  33 ++
 include/sysemu/hw_accel.h |  57 +--
 include/sysemu/kvm.h  |   2 +-
 softmmu/cpus.c| 899 +++---
 stubs/Makefile.objs   |   1 +
 stubs/cpu-synchronize-state.c |  15 +
 target/i386/Makefile.objs |   7 +-
 target/i386/hax-all.c |   6 +-
 target/i386/hax-cpus.c|  85 
 target/i386/hax-cpus.h|  17 +
 target/i386/hax-i386.h|   2 +
 target/i386/hax-posix.c   |  12 +
 target/i386/hax-windows.c |  20 +
 target/i386/hvf/Makefile.objs |   2 +-
 target/i386/hvf/hvf-cpus.c| 141 +++
 target/i386/hvf/hvf-cpus.h|  17 +
 target/i386/hvf/hvf.c |   3 +
 target/i386/whpx-all.c|   3 +
 target/i386/whpx-cpus.c   |  96 +
 target/i386/whpx-cpus.h   |  17 +
 36 files changed, 1348 insertions(+), 906 deletions(-)
 create mode 100644 accel/kvm/kvm-cpus.c
 create mode 100644 accel/kvm/kvm-cpus.h
 create mode 100644 accel/qtest/Makefile.objs
 create mode 100644 accel/qtest/qtest-cpus.c
 create mode 100644 accel/qtest/qtest-cpus.h
 rename accel/{ => qtest}/qtest.c (86%)
 create mode 100644 accel/tcg/tcg-cpus.c
 create mode 100644 accel/tcg/tcg-cpus.h
 create mode 100644 stubs/cpu-synchronize-state.c
 create mode 100644 target/i386/hax-cpus.c
 create mode 100644 target/i386/hax-cpus.h
 create mode 100644 target/i386/hvf/hvf-cpus.c
 create mode 100644 target/i386/hvf/hvf-cpus.h
 create mode 100644 target/i386/whpx-cpus.c
 create mode 100644 target/i386/whpx-cpus.h

diff --git a/MAINTAINERS b/MAINTAINERS
index cc355c4fac..e723757843 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -427,6 +427,7 @@ WHPX CPUs
 M: Sunil Muthuswamy 
 S: Supported
 F: target/i386/whpx-all.c
+F: target/i386/whpx-cpus.c
 F: target/i386/whp-dispatch.h
 F: accel/stubs/whpx-stub.c
 F: include/sysemu/whpx.h
diff --git a/accel/Makefile.objs b/accel/Makefile.objs
index ff72f0d030..c5e58eb53d 100644
--- a/accel/Makefile.objs
+++ b/accel/Makefile.objs
@@ -1,5 +1,5 @@
 common-obj-$(CONFIG_SOFTMMU) += accel.o
-obj-$(call land,$(CONFIG_SOFTMMU),$(CONFIG_POSIX)) += qtest.o
+obj-$(call land,$(CONFIG_SOFTMMU),$(CONFIG_POSIX)) += qtest/
 obj-$(CONFIG_KVM) += kvm/
 obj-$(CONFIG_TCG) += tcg/
 obj-$(CONFIG_XEN) += xen/
diff --git a/accel/kvm/Makefile.objs b/accel/kvm/Makefile.objs
index fdfa481578..ce0f492b8d 100644
--- a/accel/kvm/Makefile.objs
+++ b/accel/kvm/Makefile.objs
@@ -1,2 +1,4 @@
 obj-y += kvm-all.o
+obj-y += kvm-cpus.o
+
 obj-$(call lnot,$(CONFIG_SEV)) += sev-stub.o
diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index f24d7da783..204385f514 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -45,6 +45,10 @@
 #include "qapi/qapi-types-common.h"
 #include "qapi/qapi-visit-common.h"
 #include "sysemu/reset.h"
+#include "qemu/guest-random.h"
+
+#include "sysemu/hw_accel.h"
+#include "kvm-cpus.h"
 
 #include "hw/boards.h"
 
@@ -379,7 +383,7 @@ err:
 return ret;
 }
 
-int kvm_destroy_vcpu(CPUState *cpu)
+static int do_kvm_destroy_vcpu(CPUState *cpu)
 {
 KVMState *s = kvm_state;
 long mmap_size;
@@ -413,6 +417,14 @@ err:
 return ret;
 }
 
+void kvm_destroy_vcpu(CPUState *cpu)
+{
+if (do_kvm_destroy_vcpu(cpu) < 0) {
+error_report("kvm_destroy_vcpu failed");
+exit(EXIT_FAILURE);
+}
+}
+
 static int kvm_get_vcpu(KVMState *s, unsigned long vcpu_id)
 {
 struct KVMParkedVcpu *cpu;
@@ -2225,6 +2237,7 @@ static int kvm_init(MachineState *ms)
 qemu_balloon_inhibit(true);
 }
 
+cpus_register_accel(_cpus);
 return 0;
 
 err:
diff --git a/accel/kvm/kvm-cpus.c b/accel/kvm/kvm-cpus.c
new file mode 100644
index 00..ac6945a9e6
--- /dev/null
+++ b/accel/kvm/kvm-cpus.c
@@ -0,0 +1,94 @@
+/*
+ * QEMU KVM support
+ *
+ * Copyright IBM, Corp. 2008
+ *   Red Hat, Inc. 2008
+ *
+ * Authors:
+ *  Anthony Liguori   
+ *  Glauber Costa 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or

[RFC v6 0/4] QEMU cpus.c refactoring

Motivation and higher level steps:

https://lists.gnu.org/archive/html/qemu-devel/2020-05/msg04628.html

MAIN OPEN POINTS:

* confirmation on hvf state (Roman)

* naming of "cpus.c" and functions, more cpus_ prefix use? (Roman)

* should we accept the additional clunkiness and overhead for each call of
  the kick and CPU sync state for all archs/accels of putting CpusAccel
  inside AccelClass? I would tend towards no here, but welcome further
  opinions or data if any. (Roman)

* should we reorder patches or moves inside patches to avoid code going
  from cpus.c to softmmu/cpus.c and then again to softmmu/somethingelse.c ?
  (Philippe)

* some questions about headers in include/softmmu (Philippe)



v5 -> v6:

* rebased changes on top of Emilio G. Cota changes to cpus.c
  "cpu: convert queued work to a QSIMPLEQ"

* keep a pointer in cpus.c instead of a copy of CpusAccel
  (Alex)




v4 -> v5: rebase on latest master

* rebased changes on top of Roman's series to remove one of the extra states
  for hvf.
  (Is the result now functional for HVF?)

* rebased changes on top of icount changes and fixes to icount_configure and
  the new shift vmstate. (Markus)

v3 -> v4:

* overall: added copyright headers to all files that were missing them
  (used copyright and license of the module the stuff was extracted from).
  For the new interface files, added SUSE LLC.

* 1/4 (move softmmu only files from root):

  MAINTAINERS: moved softmmu/cpus.c to its final location (from patch 2)

* 2/4 (cpu-throttle):

  MAINTAINERS (to patch 1),
  copyright Fabrice Bellard and license from cpus.c

* 3/4 (cpu-timers, icount):

  - MAINTAINERS: add cpu-timers.c and icount.c to Paolo

  - break very long lines (patchew)

  - add copyright SUSE LLC, GPLv2 to cpu-timers.h

  - add copyright Fabrice Bellard and license from cpus.c to timers-state.h
as it is lifted from cpus.c

  - vl.c: in configure_accelerators bail out if icount_enabled()
and !tcg_enabled() as qtest does not enable icount anymore.

* 4/4 (accel stuff to accel):

  - add copyright SUSE LLC to files that mostly only consist of the
new interface. Add whatever copyright was in the accelerator code
if instead they mostly consist of accelerator code.

  - change a comment to mention the result of the AccelClass experiment

  - moved qtest accelerator into accel/qtest/ , make it like the others.

  - rename xxx-cpus-interface to xxx-cpus (remove "interface" from names)

  - rename accel_int to cpus_accel

  - rename CpusAccel functions from cpu_synchronize_* to synchronize_*




v2 -> v3:

* turned into a 4 patch series, adding a first patch moving
  softmmu code currently in top_srcdir to softmmu/

* cpu-throttle: moved to softmmu/

* cpu-timers, icount:

  - moved to softmmu/

  - fixed assumption of qtest_enabled() => icount_enabled()
  causing the failure of check-qtest-arm goal, in test-arm-mptimer.c

  Fix is in hw/core/ptimer.c,

  where the artificial timeout rate limit should not be applied
  under qtest_enabled(), in a similar way to how it is not applied
  for icount_enabled().

* CpuAccelInterface: no change.





v1 -> v2:

* 1/3 (cpu-throttle): provide a description in the commit message

* 2/3 (cpu-timers, icount): in this v2 separate icount from cpu-timers,
  as icount is actually TCG-specific. Only build it under CONFIG_TCG.

  To do this, qtest had to be detached from icount. To this end, a
  trivial global counter for qtest has been introduced.

* 3/3 (CpuAccelInterface): provided a description.

This is point 8) in that plan. The idea is to extract the unrelated parts
in cpus, and register interfaces from each single accelerator to the main
cpus module (cpus.c).

While doing this RFC, I noticed some assumptions about Windows being
either TCG or HAX (not considering WHPX) that might need to be revisited.
I added a comment there.

The thing builds successfully based on Linux cross-compilations for
windows/hax, windows/whpx, and I got a good build on Darwin/hvf.

Tests run successfully for tcg and kvm configurations, but I did not test on
Windows or Darwin.

I welcome your feedback and help on this,

Claudio

Claudio Fontana (4):
  softmmu: move softmmu only files from root
  cpu-throttle: new module, extracted from cpus.c
  cpu-timers, icount: new modules
  cpus: extract out accel-specific code to each accel

 MAINTAINERS  |   14 +-
 Makefile.target  |7 +-
 accel/kvm/Makefile.objs  |2 +
 accel/kvm/kvm-all.c  |   15 +-
 accel/kvm/kvm-cpus-interface.c   |   94 ++
 accel/kvm/kvm-cpus-interface.h   |8 +
 accel/qtest.c|   88 +-
 accel/stubs/kvm-stub.c   |3 +-
 accel/tcg/Makefile.objs  |1 +
 accel/tcg/cpu-exec.c |   43 +-
 accel/tcg/tcg-all.c  |   19 +-

Re: [virtio-dev] Re: [PATCH v25 QEMU 3/3] virtio-balloon: Replace free page hinting references to 'report' with 'hint'

2020-06-18 Thread Dr. David Alan Gilbert

* Alexander Duyck (alexander.du...@gmail.com) wrote:
> On Tue, May 26, 2020 at 9:14 PM Alexander Duyck
>  wrote:
> >
> > From: Alexander Duyck 
> >
> > In an upcoming patch a feature named Free Page Reporting is about to be
> > added. In order to avoid any confusion we should drop the use of the word
> > 'report' when referring to Free Page Hinting. So what this patch does is go
> > through and replace all instances of 'report' with 'hint" when we are
> > referring to free page hinting.
> >
> > Acked-by: David Hildenbrand 
> > Signed-off-by: Alexander Duyck 
> > ---
> >  hw/virtio/virtio-balloon.c |   78 
> > ++--
> >  include/hw/virtio/virtio-balloon.h |   20 +
> >  2 files changed, 49 insertions(+), 49 deletions(-)
> >
> > diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
> > index 3e2ac1104b5f..dc15409b0bb6 100644
> > --- a/hw/virtio/virtio-balloon.c
> > +++ b/hw/virtio/virtio-balloon.c
> 
> ...
> 
> > @@ -817,14 +817,14 @@ static int virtio_balloon_post_load_device(void 
> > *opaque, int version_id)
> >  return 0;
> >  }
> >
> > -static const VMStateDescription vmstate_virtio_balloon_free_page_report = {
> > +static const VMStateDescription vmstate_virtio_balloon_free_page_hint = {
> >  .name = "virtio-balloon-device/free-page-report",
> >  .version_id = 1,
> >  .minimum_version_id = 1,
> >  .needed = virtio_balloon_free_page_support,
> >  .fields = (VMStateField[]) {
> > -VMSTATE_UINT32(free_page_report_cmd_id, VirtIOBalloon),
> > -VMSTATE_UINT32(free_page_report_status, VirtIOBalloon),
> > +VMSTATE_UINT32(free_page_hint_cmd_id, VirtIOBalloon),
> > +VMSTATE_UINT32(free_page_hint_status, VirtIOBalloon),
> >  VMSTATE_END_OF_LIST()
> >  }
> >  };
> 
> So I noticed this patch wasn't in the list of patches pulled, but that
> is probably for the best since I believe the change above might have
> broken migration as VMSTATE_UINT32 does a stringify on the first
> parameter.
> Any advice on how to address it, or should I just give up on renaming
> free_page_report_cmd_id and free_page_report_status?

The field names never hit the wire; the migration format is trivial
binary, especially of things like integers - that lands as just 4 bytes
on the wire [ hopefully in the place where the destination expects to
receive them ].
You need to be careful of the names of top level vmstate devices, and
the names of subsections; I don't think any other naming is in the
stream.
(We've even done hacks in the past of converting a VMSTATE_UINT32 to a
pair of UINT16 )
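
To illustrate the point that field names are not part of the stream,
here is a small standalone sketch. It is not QEMU code: the demo_*
helpers and the two structs are made up for the example, and it assumes
the integers are written big-endian, 4 bytes each. Two structs whose
members are named differently produce byte-identical output.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Write a 32-bit value as 4 big-endian bytes; returns the new offset. */
static size_t demo_put_be32(uint8_t *buf, size_t pos, uint32_t v)
{
    assert(buf != NULL);
    buf[pos + 0] = (uint8_t)(v >> 24);
    buf[pos + 1] = (uint8_t)(v >> 16);
    buf[pos + 2] = (uint8_t)(v >> 8);
    buf[pos + 3] = (uint8_t)v;
    return pos + 4;
}

/* Same layout, different member names (before and after the rename). */
struct DemoOldNames {
    uint32_t free_page_report_cmd_id;
    uint32_t free_page_report_status;
};

struct DemoNewNames {
    uint32_t free_page_hint_cmd_id;
    uint32_t free_page_hint_status;
};

/* Only the values are emitted; the member names never appear. */
static size_t demo_save_old(const struct DemoOldNames *s, uint8_t *buf)
{
    size_t pos = 0;
    pos = demo_put_be32(buf, pos, s->free_page_report_cmd_id);
    pos = demo_put_be32(buf, pos, s->free_page_report_status);
    return pos;
}

static size_t demo_save_new(const struct DemoNewNames *s, uint8_t *buf)
{
    size_t pos = 0;
    pos = demo_put_be32(buf, pos, s->free_page_hint_cmd_id);
    pos = demo_put_be32(buf, pos, s->free_page_hint_status);
    return pos;
}
```

What does need to stay stable in this picture is anything that is
written as a string, such as the section or subsection name.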

Dave

> Looking at this I wonder why we even need to migrate these values? It
> seems like if we are completing a migration the cmd_id should always
> be "DONE" shouldn't it? It isn't as if we are going to migrate the
> hinting from one host to another. We will have to start over which is
> essentially the signal that the "DONE" value provides. Same thing for
> the status. We shouldn't be able to migrate unless both of these are
> already in the "DONE" state so if anything I wonder if we shouldn't
> have that as the initial state for the device and just drop the
> migration info.
> 
> Thanks.
> 
> - Alex
> 
> -
> To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
> For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org
> 
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

Re: [PATCH v7 03/42] target/arm: Add support for MTE to SCTLR_ELx

On Thu, 18 Jun 2020 at 19:08, Richard Henderson
 wrote:
>
> On 6/18/20 3:52 AM, Peter Maydell wrote:
> >> +if (ri->state == ARM_CP_STATE_AA64 && !cpu_isar_feature(aa64_mte, 
> >> cpu)) {
> >> +if (ri->opc1 == 6) { /* SCTLR_EL3 */
> >> +value &= ~(SCTLR_ITFSB | SCTLR_TCF | SCTLR_ATA);
> >> +} else {
> >> +value &= ~(SCTLR_ITFSB | SCTLR_TCF0 | SCTLR_TCF |
> >> +   SCTLR_ATA0 | SCTLR_ATA);
> >> +}
> >
> > Doesn't SCTLR_EL2 have the same "no ATA0 and no TCF0" that
> > SCTLR_EL3 does?
>
> No.  With HCR.{E2H,TGE} = '11', those fields are present.

Ah, right.

Reviewed-by: Peter Maydell 

thanks
-- PMM

Re: [PATCH v2 0/7] Add several Power ISA 3.1 32/64-bit vector instructions




> On Jun 17, 2020, at 7:42 PM, no-re...@patchew.org wrote:
> 
> Patchew URL: 
> https://patchew.org/QEMU/20200618001127.34438-1-...@linux.ibm.com/
> 
> 
> 
> Hi,
> 
> This series failed the asan build test. Please find the testing commands and
> their output below. If you have Docker installed, you can probably reproduce 
> it
> locally.
> 
> === TEST SCRIPT BEGIN ===
> #!/bin/bash
> export ARCH=x86_64
> make docker-image-fedora V=1 NETWORK=1
> time make docker-test-debug@fedora TARGET_LIST=x86_64-softmmu J=14 NETWORK=1
> === TEST SCRIPT END ===

It looks like the errors generated below are not directly related to the code
changes I made.
Does anyone know why it still reports errors?

Thanks,
Lijun

> 
>  GEN docs/interop/qemu-qmp-ref.html
>  GEN docs/interop/qemu-qmp-ref.txt
>  GEN docs/interop/qemu-qmp-ref.7
> /usr/bin/ld: 
> /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o):
>  warning: common of `__interception::real_vfork' overridden by definition 
> from 
> /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o)
>  CC  qga/guest-agent-command-state.o
>  CC  qga/commands.o
>  CC  qga/main.o
> ---
>  GEN docs/interop/qemu-ga-ref.html
>  GEN docs/interop/qemu-ga-ref.txt
>  GEN docs/interop/qemu-ga-ref.7
> /usr/bin/ld: 
> /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o):
>  warning: common of `__interception::real_vfork' overridden by definition 
> from 
> /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o)
>  AS  pc-bios/optionrom/linuxboot.o
>  CC  pc-bios/optionrom/linuxboot_dma.o
>  AS  pc-bios/optionrom/multiboot.o
> ---
>  BUILD   pc-bios/optionrom/linuxboot.img
>  BUILD   pc-bios/optionrom/kvmvapic.raw
>  BUILD   pc-bios/optionrom/multiboot.raw
> /usr/bin/ld: 
> /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o):
>  warning: common of `__interception::real_vfork' overridden by definition 
> from 
> /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o)
>  BUILD   pc-bios/optionrom/linuxboot.raw
>  BUILD   pc-bios/optionrom/linuxboot_dma.img
>  SIGNpc-bios/optionrom/kvmvapic.bin
> ---
>  LINKivshmem-client
>  SIGNpc-bios/optionrom/linuxboot_dma.bin
>  LINKivshmem-server
> /usr/bin/ld: 
> /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o):
>  warning: common of `__interception::real_vfork' overridden by definition 
> from 
> /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o)
>  BUILD   pc-bios/optionrom/pvh.img
>  BUILD   pc-bios/optionrom/pvh.raw
>  SIGNpc-bios/optionrom/pvh.bin
>  LINKqemu-nbd
> /usr/bin/ld: 
> /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o):
>  warning: common of `__interception::real_vfork' overridden by definition 
> from 
> /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o)
>  LINKqemu-storage-daemon
> /usr/bin/ld: 
> /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o):
>  warning: common of `__interception::real_vfork' overridden by definition 
> from 
> /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o)
>  LINKqemu-img
> /usr/bin/ld: 
> /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o):
>  warning: common of `__interception::real_vfork' overridden by definition 
> from 
> /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o)
>  LINKqemu-io
> /usr/bin/ld: 
> /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o):
>  warning: common of `__interception::real_vfork' overridden by definition 
> from 
> /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o)
>  LINKqemu-edid
>  LINKfsdev/virtfs-proxy-helper
> /usr/bin/ld: 
> /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o):
>  warning: common of `__interception::real_vfork' overridden by definition 
> from 
> /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o)
> /usr/bin/ld: 
> /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o):
>  warning: common of `__interception::real_vfork' overridden by definition 
> from 
> /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o)
>  LINKscsi/qemu-pr-helper
> /usr/bin/ld: 
> /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o):
>  warning: common of `__interception::real_vfork' overridden by definition 
> from 
> /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o)
> /usr/bin/ld: 
>

Re: [PATCH 0/3] Add OpenSBI dynamic firmware support

2020-06-18 Thread Atish Patra

On Thu, Jun 18, 2020 at 1:56 AM Bin Meng  wrote:
>
> On Wed, Jun 17, 2020 at 3:29 AM Atish Patra  wrote:
> >
> > This series adds support OpenSBI dynamic firmware support to Qemu.
> > Qemu loader passes the information about the DT and next stage (i.e. kernel
> > or U-boot) via "a2" register. It allows the user to build bigger OS images
> > without worrying about overwriting DT. It also unifies the reset vector code
>
> I am not sure in what situation overwriting DT could happen. Could you
> please elaborate?
>

Currently, the DT is loaded at 0x8220 (34MB offset) for fw_jump.
Thus, a bigger kernel image would overwrite the DT. In fact, this was
reported by the FreeBSD folks:
https://github.com/riscv/opensbi/issues/169

There are temporary solutions that can put DT a little bit further or
put it within 2MB offset. But that's
just delaying the inevitable.
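
The failure mode can be sketched as a simple range check. This is only
an illustration: the function name is made up, offsets are relative to
the start of DRAM, and the 2MB kernel load offset used in the usage note
is an assumption, not something taken from this series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_MIB (1024ULL * 1024ULL)

/*
 * Returns true if a kernel image of kernel_size bytes loaded at
 * kernel_off would cover the first byte of a DT placed at dt_off.
 * All offsets are relative to the start of DRAM.
 */
static bool demo_kernel_overwrites_dt(uint64_t kernel_off,
                                      uint64_t kernel_size,
                                      uint64_t dt_off)
{
    assert(kernel_size > 0);
    return kernel_off <= dt_off && dt_off < kernel_off + kernel_size;
}
```

With the DT at a fixed 34MB offset and the kernel assumed to start at
2MB, any image larger than 32MB reaches the DT. Passing the DT address
to the firmware at run time removes that fixed limit.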

> > in rom and dt placement. Now, the DT is copied directly in DRAM instead of 
> > ROM.
> >
> > The changes have been verified on following qemu machines.
> >
> > 64bit:
> >  - spike, sifive_u, virt
> > 32bit:
> >  - virt
>
> Any test instructions?
>

You just need to provide fw_dynamic instead of fw_jump in the -bios argument.

For example, here is my QEMU command line for testing:

qemu-system-riscv64 -M virt -smp 4 -m 2g -display none -serial mon:stdio \
    -bios ~/workspace/opensbi/build/platform/generic/firmware/fw_dynamic.bin \
    -kernel /home/atish/workspace/linux/arch/riscv/boot/Image \
    -initrd /home/atish/workspace/rootfs_images/riscv64_busybox_rootfs.img \
    -object rng-random,filename=/dev/urandom,id=rng0 \
    -device virtio-rng-device,rng=rng0 \
    -device virtio-net-device,netdev=usernet \
    -netdev user,id=usernet,hostfwd=tcp::1-:22 \
    -d in_asm -D log \
    -append 'rw console=ttyS0 earlycon'

> >
> > I have also verified fw_jump on all the above platforms to ensure that this
> > series doesn't break the existing setup.
> >
>
> Regards,
> Bin
>


-- 
Regards,
Atish

Re: [PATCH v7 03/42] target/arm: Add support for MTE to SCTLR_ELx

On 6/18/20 3:52 AM, Peter Maydell wrote:
>> +if (ri->state == ARM_CP_STATE_AA64 && !cpu_isar_feature(aa64_mte, cpu)) 
>> {
>> +if (ri->opc1 == 6) { /* SCTLR_EL3 */
>> +value &= ~(SCTLR_ITFSB | SCTLR_TCF | SCTLR_ATA);
>> +} else {
>> +value &= ~(SCTLR_ITFSB | SCTLR_TCF0 | SCTLR_TCF |
>> +   SCTLR_ATA0 | SCTLR_ATA);
>> +}
> 
> Doesn't SCTLR_EL2 have the same "no ATA0 and no TCF0" that
> SCTLR_EL3 does?

No.  With HCR.{E2H,TGE} = '11', those fields are present.
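
For readers following along, the masking in the quoted hunk can be
written as a standalone function. This is a sketch only: the DEMO_*
bit positions are my understanding of the SCTLR_ELx layout and should
be checked against the Arm ARM, the function name is made up, and the
AArch64-state check from the real code is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed SCTLR_ELx bit positions for the MTE-related fields. */
#define DEMO_SCTLR_ITFSB (1ULL << 37)
#define DEMO_SCTLR_TCF0  (3ULL << 38)
#define DEMO_SCTLR_TCF   (3ULL << 40)
#define DEMO_SCTLR_ATA0  (1ULL << 42)
#define DEMO_SCTLR_ATA   (1ULL << 43)

/*
 * Clear the MTE control fields from a value being written to
 * SCTLR_ELx when the CPU does not implement MTE.  opc1 == 6 selects
 * SCTLR_EL3, which has no TCF0/ATA0 fields; SCTLR_EL1 and SCTLR_EL2
 * both take the other branch, since EL2 has those fields when
 * HCR.{E2H,TGE} = '11'.
 */
static uint64_t demo_sctlr_mask_mte(uint64_t value, int opc1, bool have_mte)
{
    assert(opc1 >= 0);
    if (have_mte) {
        return value;
    }
    if (opc1 == 6) { /* SCTLR_EL3 */
        value &= ~(DEMO_SCTLR_ITFSB | DEMO_SCTLR_TCF | DEMO_SCTLR_ATA);
    } else {
        value &= ~(DEMO_SCTLR_ITFSB | DEMO_SCTLR_TCF0 | DEMO_SCTLR_TCF |
                   DEMO_SCTLR_ATA0 | DEMO_SCTLR_ATA);
    }
    return value;
}
```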


r~

Re: [PATCH v7 13/42] target/arm: Define arm_cpu_do_unaligned_access for user-only

On Thu, 18 Jun 2020 at 18:04, Richard Henderson
 wrote:
> First, this could definitely be delayed to the follow-on linux-user patch set.
>
> Second, in the linux-user patch set, I decode the syndrome data to determine
> what kind of segv to deliver for MTE synchronous faults.  It would be easy to
> extend that just a little to notice the usual syndrome for unaligned accesses.
>  Which may be less confusing than abusing the v7m exception code?

Yeah, if we're going to look at syndrome data anyway that might
be clearer.

The other thing it would be really nice to be able to feed through
(via syndrome info or otherwise) is the difference between SIGSEGV
with si_code == SEGV_ACCERR vs SEGV_MAPERR.
At the moment handle_cpu_signal() knows the difference, but it
doesn't have a way to pass this through to tlb_fill, and then
cpu_loop() has to make up a si_code when it gets the EXCP_DATA_ABORT.
I mention this mostly in case it affects how you want to design
how you treat alignment and MTE faults -- it might be that the
si_code stuff is better dealt with entirely differently.
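
As a reminder of the distinction being discussed, here is a tiny
sketch of how the two si_code values would be chosen if the fault path
knew whether the page was mapped. The function name and its single
boolean input are assumptions for illustration; this is not how
cpu_loop() currently gets its information.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stdbool.h>

/*
 * SEGV_MAPERR: the address is not mapped to any object.
 * SEGV_ACCERR: the address is mapped, but the access was not
 *              permitted by the mapping's protections.
 */
static int demo_segv_si_code(bool page_mapped)
{
    return page_mapped ? SEGV_ACCERR : SEGV_MAPERR;
}
```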

thanks
-- PMM

[Bug 1883560] Re: mips linux-user builds occasionly crash randomly only to be fixed by a full clean re-build

2020-06-18 Thread Laurent Vivier

Aleksandar, Alex, see comment #1.

I think the problem happens because I moved the syscall_nr.h from source
directory to build directory. If source directory is not cleaned up
correctly, the build will not generate the new header in the build
directory but in source directory and some targets that need 32bit,
64bit or a new API will only use the first one generated (as in this
case they are all at the same place).

See the following PR:
https://patchew.org/QEMU/20200316161550.336150-1-laur...@vivier.eu/

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1883560

Title:
  mips linux-user builds occasionly crash randomly only to be fixed by a
  full clean re-build

Status in QEMU:
  New

Bug description:
  From time to time I find check-tcg crashes with a one of the MIPS
  binaries. The last time it crashed was running the test:

./mips64el-linux-user/qemu-mips64el ./tests/tcg/mips64el-linux-
  user/threadcount

  Inevitably after some time noodling around wondering what could be
  causing this weird behaviour I wonder if it is a build issue. I wipe
  all the mips* build directories, re-run configure and re-build and
  voila problem goes away.

  It seems there must be some sort of build artefact which isn't being
  properly re-generated on a build update which causes weird problems.
  Additional data point if I:

rm -rf mips64el-linux-user
../../configure
make

  then I see failures in mip32 builds - eg:

  GEN mipsn32el-linux-user/config-target.h
In file included from /home/alex/lsrc/qemu.git/linux-user/syscall_defs.h:10,
 from /home/alex/lsrc/qemu.git/linux-user/qemu.h:16,
 from /home/alex/lsrc/qemu.git/linux-user/linuxload.c:5:
/home/alex/lsrc/qemu.git/linux-user/mips64/syscall_nr.h:1: error: unterminated #ifndef
 #ifndef LINUX_USER_MIPS64_SYSCALL_NR_H

make[1]: *** [/home/alex/lsrc/qemu.git/rules.mak:69: linux-user/linuxload.o] Error 1
make[1]: *** Waiting for unfinished jobs

  which implies there is a cross dependency between different targets
  somewhere. If I executed:

rm -rf mips*

  before re-configuring and re-building then everything works again.


what are the requirements on target/ code for -icount to work correctly?

For -icount mode to work, there are requirements on the target/
code (notably around marking up "I/O" instructions). Unfortunately
we've never documented what these are, which makes it pretty rough
for people writing new targets or reviewing changes to existing ones.
Does anybody understand what they actually are?

Some more specific questions on the general theme:

Q1: the comment on gen_io_end() says:
/*
 * cpu->can_do_io is cleared automatically at the beginning of
 * each translation block.  The cost is minimal and only paid
 * for -icount, plus it would be very easy to forget doing it
 * in the translator.  Therefore, backends only need to call
 * gen_io_start.
 */
but in fact multiple backends *do* call gen_io_end(). When
does a backend have to call this, and when not? Or are those
all legacy useless calls we should delete? (If so, can we
just get rid of this function entirely ?)

Q2: is it a requirement that after an insn which is "known to be
an I/O insn" (like x86 in/out), and which is marked up with
gen_io_start()/gen_io_end(), we also end the TB?
Or is it OK to generate more insns after that one? If the former,
is there somewhere we can assert() that this is done ?

Q3: why does gen_tb_start() call gen_io_end()? This is the
*start* of the TB so by definition we haven't started doing
any IO yet...
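
To make the questions above concrete, here is a toy model of the
behaviour the quoted comment describes. It is only a sketch: ToyCPU and
the toy_* functions are hypothetical names, not QEMU code, and the model
assumes nothing beyond what the comment says (the flag is cleared at the
start of each TB, so only the "start" marker is needed). What real QEMU
does when an unmarked insn performs I/O is not modelled; the toy simply
rejects the access.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for per-CPU state; not QEMU's CPUState. */
typedef struct ToyCPU {
    bool can_do_io;   /* may the current insn touch a device? */
    int io_count;     /* number of device accesses that were allowed */
} ToyCPU;

/* Per the quoted comment: the flag is cleared at the start of every TB. */
static void toy_tb_start(ToyCPU *cpu)
{
    cpu->can_do_io = false;
}

/* Toy analogue of gen_io_start(): the next insn is allowed to do I/O. */
static void toy_io_start(ToyCPU *cpu)
{
    cpu->can_do_io = true;
}

/* Toy device access: rejected (returns false) if the insn was not marked. */
static bool toy_io_access(ToyCPU *cpu)
{
    if (!cpu->can_do_io) {
        return false;
    }
    cpu->io_count++;
    return true;
}

/* Run two toy TBs and return how many accesses were allowed. */
static int toy_run_two_tbs(ToyCPU *cpu)
{
    bool ok;

    toy_tb_start(cpu);
    toy_io_start(cpu);
    ok = toy_io_access(cpu);    /* marked insn: allowed */
    assert(ok);

    /* No explicit "io end" here: the next TB start clears the flag. */
    toy_tb_start(cpu);
    ok = toy_io_access(cpu);    /* unmarked insn: rejected */
    assert(!ok);

    return cpu->io_count;
}
```

In this model an explicit "io end" call is redundant whenever the marked
insn is the last one in its TB, which is the situation Q2 asks about. If
more insns followed in the same TB, the flag would stay set for them too.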

thanks
-- PMM

Re: [PATCH v7 39/42] target/arm: Enable MTE

On 6/18/20 9:39 AM, Peter Maydell wrote:
>>  t = cpu->isar.id_aa64pfr1;
>>  t = FIELD_DP64(t, ID_AA64PFR1, BT, 1);
>> +t = FIELD_DP64(t, ID_AA64PFR1, MTE, 2);
>>  cpu->isar.id_aa64pfr1 = t;
> 
> If we don't actually have tagged memory yet should we really
> set the MTE field to 2 rather than 1 ?

Well, we reduce that later in arm_cpu_realizefn.
But perhaps this patch should be sorted after patch 41.
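
For readers following the field arithmetic, a self-contained sketch of
the deposit-then-reduce sequence being discussed. The toy_* helpers are
hypothetical stand-ins for FIELD_DP64 and friends, the field positions
(BT in bits [3:0], MTE in bits [11:8]) are taken from the architecture's
description of ID_AA64PFR1_EL1, and the "reduce to 1 when there is no
tag memory" step is an assumption based on this exchange, not a copy of
arm_cpu_realizefn.

```c
#include <assert.h>
#include <stdint.h>

/* Deposit val into the len-bit field of reg that starts at bit shift. */
static uint64_t toy_deposit64(uint64_t reg, unsigned shift, unsigned len,
                              uint64_t val)
{
    uint64_t mask;

    assert(len > 0 && len < 64 && shift + len <= 64);
    mask = ((UINT64_C(1) << len) - 1) << shift;
    return (reg & ~mask) | ((val << shift) & mask);
}

/* Extract the len-bit field of reg that starts at bit shift. */
static uint64_t toy_extract64(uint64_t reg, unsigned shift, unsigned len)
{
    assert(len > 0 && len < 64 && shift + len <= 64);
    return (reg >> shift) & ((UINT64_C(1) << len) - 1);
}

/* ID_AA64PFR1_EL1 field layout used by this sketch. */
enum { TOY_BT_SHIFT = 0, TOY_MTE_SHIFT = 8, TOY_FIELD_LEN = 4 };

/* Mirrors the quoted hunk: BT = 1, MTE = 2. */
static uint64_t toy_init_pfr1(uint64_t t)
{
    t = toy_deposit64(t, TOY_BT_SHIFT, TOY_FIELD_LEN, 1);
    t = toy_deposit64(t, TOY_MTE_SHIFT, TOY_FIELD_LEN, 2);
    return t;
}

/* Assumed later step: without tag memory, lower MTE from 2 to 1. */
static uint64_t toy_realize_pfr1(uint64_t t, int have_tag_memory)
{
    if (!have_tag_memory &&
        toy_extract64(t, TOY_MTE_SHIFT, TOY_FIELD_LEN) > 1) {
        t = toy_deposit64(t, TOY_MTE_SHIFT, TOY_FIELD_LEN, 1);
    }
    return t;
}
```

Starting from zero, toy_init_pfr1() yields 0x201, and
toy_realize_pfr1() turns that into 0x101 when no tag memory is present.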


r~

Re: [PATCH v7 25/42] target/arm: Implement helper_mte_check1