date:20230323


Hello Cedric,

I tried to use ast2600-evb machine but it is not getting any message on 
I2C bus.


Any suggestions?

# Start the software TPM emulator.
    $ swtpm socket --tpmstate dir=/tmp/mytpm1   --ctrl 
type=unixio,path=/tmp/mytpm1/swtpm-sock   --tpm2   --log level=100


# Start a qemu and point at swtpm. I am using i2c bus 12 and address 0x2e
    $ ~/qemu/build/qemu-system-arm -M ast2600-evb -nographic -kernel 
$IMAGEDIR/zImage -dtb $IMAGEDIR/aspeed-ast2600-evb.dtb -initrd 
$IMAGEDIR/rootfs.cpio -chardev 
socket,id=chrtpm,path=/tmp/mytpm1/swtpm-sock -tpmdev 
emulator,id=tpm0,chardev=chrtpm -device 
tpm-tis-i2c,tpmdev=tpm0,bus=aspeed.i2c.bus.12,address=0x2e


# Inside the ast2600-evb machine. Insantiated the device
    # echo 12 0x2e > /sys/bus/i2c/devices/i2c-12/new_device
    [  158.265321] i2c i2c-12: new_device: Instantiated device 12 at 0x2e

# Tried to instantiate TPM device but nothing happening on I2C bus.
    # echo 12-002e > /sys/bus/i2c/drivers/tpm_tis_i2c/bind
    sh: write error: No such device

Thanks & Regards,

Ninad Palsule

On 3/23/23 2:49 AM, Cédric Le Goater wrote:

On 3/23/23 04:01, Ninad Palsule wrote:

This is a documentation change for I2C TPM device support.

Qemu already supports devices attached to ISA and sysbus.
This drop adds support for the I2C bus attached TPM devices.

Signed-off-by: Ninad Palsule 

---
V2:

Incorporated Stephen's review comments
- Added example in the document.
---
  docs/specs/tpm.rst | 20 +++-
  1 file changed, 19 insertions(+), 1 deletion(-)

diff --git a/docs/specs/tpm.rst b/docs/specs/tpm.rst
index 535912a92b..bf7249b09c 100644
--- a/docs/specs/tpm.rst
+++ b/docs/specs/tpm.rst
@@ -21,11 +21,15 @@ QEMU files related to TPM TIS interface:
   - ``hw/tpm/tpm_tis_common.c``
   - ``hw/tpm/tpm_tis_isa.c``
   - ``hw/tpm/tpm_tis_sysbus.c``
+ - ``hw/tpm/tpm_tis_i2c.c``
   - ``hw/tpm/tpm_tis.h``
    Both an ISA device and a sysbus device are available. The former is
  used with pc/q35 machine while the latter can be instantiated in the
-Arm virt machine.
+Arm virt machine. An I2C device support is also added which can be
+instantiated in the arm based emulation machine. An I2C device is also
+supported for the Arm virt machine. This device only supports the
+TPM 2 protocol.
    CRB interface
  -
@@ -348,6 +352,20 @@ In case an Arm virt machine is emulated, use the 
following command line:

  -drive if=pflash,format=raw,file=flash0.img,readonly=on \
  -drive if=pflash,format=raw,file=flash1.img
  +In case a Rainier bmc machine is emulated, use the following 
command line:

+
+.. code-block:: console
+
+  qemu-system-arm -M rainier-bmc -nographic \
+    -kernel ${IMAGEPATH}/fitImage-linux.bin \
+    -dtb ${IMAGEPATH}/aspeed-bmc-ibm-rainier.dtb \
+    -initrd ${IMAGEPATH}/obmc-phosphor-initramfs.rootfs.cpio.xz \
+    -drive 
file=${IMAGEPATH}/obmc-phosphor-image.rootfs.wic.qcow2,if=sd,index=2\
+    -net nic -net 
user,hostfwd=:127.0.0.1:-:22,hostfwd=:127.0.0.1:2443-:443\

+    -chardev socket,id=chrtpm,path=/tmp/mytpm1/swtpm-sock \
+    -tpmdev emulator,id=tpm0,chardev=chrtpm \
+    -device tpm-tis-i2c,tpmdev=tpm0,bus=aspeed.i2c.bus.12,address=0x2e



The rainier images are not the easiest to find. Could we use an 
AST2600 EVB

machine instead and instantiate the device from user space ? see commit
3302184f7f or 7a7308eae0.

Thanks,

C.


  In case SeaBIOS is used as firmware, it should show the TPM menu item
  after entering the menu with 'ESC'.

[PATCH v3 0/3] Add support for TPM devices over I2C bus

Hello,

I have incorporated review comments from Stefan and Cedric. Please
review.

This drop adds support for the TPM devices attached to the I2C bus. It
only supports the TPM2 protocol. You need to run it with the external
TPM emulator like swtpm. I have tested it with swtpm.

I have refered to the work done by zhdan...@meta.com but at the core
level out implementation is different.
https://github.com/theopolis/qemu/commit/2e2e57cde9e419c36af8071bb85392ad1ed70966

Based-on: $MESSAGE_ID

Ninad Palsule (3):
  docs: Add support for TPM devices over I2C bus
  TPM TIS: Add support for TPM devices over I2C bus
  New I2C: Add support for TPM devices over I2C bus

 docs/specs/tpm.rst  |  20 +-
 hw/arm/Kconfig  |   1 +
 hw/tpm/Kconfig  |   7 +
 hw/tpm/meson.build  |   1 +
 hw/tpm/tpm_tis.h|   3 +
 hw/tpm/tpm_tis_common.c |  32 +++
 hw/tpm/tpm_tis_i2c.c| 513 
 include/hw/acpi/tpm.h   |  28 +++
 include/sysemu/tpm.h|   3 +
 9 files changed, 607 insertions(+), 1 deletion(-)
 create mode 100644 hw/tpm/tpm_tis_i2c.c

-- 
2.37.2

Re: [PATCH v4 1/2] target/riscv: separate priv from mmu_idx

On 3/24/2023 9:02 AM, Wu, Fei wrote:
> On 3/23/2023 11:53 PM, Richard Henderson wrote:
>> On 3/22/23 19:44, Fei Wu wrote:
>>> Currently it's assumed the 2 low bits of mmu_idx map to privilege mode,
>>> this assumption won't last as we are about to add more mmu_idx.
>>>
>>> Signed-off-by: Fei Wu 
>>> ---
>>>   target/riscv/cpu.h | 1 -
>>>   target/riscv/cpu_helper.c  | 2 +-
>>>   target/riscv/insn_trans/trans_privileged.c.inc | 2 +-
>>>   target/riscv/insn_trans/trans_xthead.c.inc | 7 +--
>>>   target/riscv/translate.c   | 3 +++
>>>   5 files changed, 6 insertions(+), 9 deletions(-)
>>>
>>> diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
>>> index 638e47c75a..66f7e3d1ba 100644
>>> --- a/target/riscv/cpu.h
>>> +++ b/target/riscv/cpu.h
>>> @@ -623,7 +623,6 @@ G_NORETURN void
>>> riscv_raise_exception(CPURISCVState *env,
>>>   target_ulong riscv_cpu_get_fflags(CPURISCVState *env);
>>>   void riscv_cpu_set_fflags(CPURISCVState *env, target_ulong);
>>>   -#define TB_FLAGS_PRIV_MMU_MASK    3
>>>   #define TB_FLAGS_PRIV_HYP_ACCESS_MASK   (1 << 2)
>>>   #define TB_FLAGS_MSTATUS_FS MSTATUS_FS
>>>   #define TB_FLAGS_MSTATUS_VS MSTATUS_VS
>>> diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
>>> index f88c503cf4..76e1b0100e 100644
>>> --- a/target/riscv/cpu_helper.c
>>> +++ b/target/riscv/cpu_helper.c
>>> @@ -762,7 +762,7 @@ static int get_physical_address(CPURISCVState
>>> *env, hwaddr *physical,
>>>    * (riscv_cpu_do_interrupt) is correct */
>>>   MemTxResult res;
>>>   MemTxAttrs attrs = MEMTXATTRS_UNSPECIFIED;
>>> -    int mode = mmu_idx & TB_FLAGS_PRIV_MMU_MASK;
>>> +    int mode = env->priv;
>>
>> This is incorrect.  You must map back from mmu_idx to priv.
>> Recall the semantics of MPRV.
>>
> The following code (~20 lines below) takes MPRV into consideration:
> 
> if (riscv_cpu_two_stage_lookup(mmu_idx)) {
> mode = get_field(env->hstatus, HSTATUS_SPVP);
> } else if (mode == PRV_M && access_type != MMU_INST_FETCH) {
> if (get_field(env->mstatus, MSTATUS_MPRV)) {
> mode = get_field(env->mstatus, MSTATUS_MPP);
> }
> }
> 
> I think it's the same logic as the original one.
> 
>>> diff --git a/target/riscv/insn_trans/trans_xthead.c.inc
>>> b/target/riscv/insn_trans/trans_xthead.c.inc
>>> index df504c3f2c..adfb53cb4c 100644
>>> --- a/target/riscv/insn_trans/trans_xthead.c.inc
>>> +++ b/target/riscv/insn_trans/trans_xthead.c.inc
>>> @@ -265,12 +265,7 @@ static bool trans_th_tst(DisasContext *ctx,
>>> arg_th_tst *a)
>>>     static inline int priv_level(DisasContext *ctx)
>>>   {
>>> -#ifdef CONFIG_USER_ONLY
>>> -    return PRV_U;
>>> -#else
>>> - /* Priv level is part of mem_idx. */
>>> -    return ctx->mem_idx & TB_FLAGS_PRIV_MMU_MASK;
>>> -#endif
>>> +    return ctx->priv;
>>>   }
>>
>> I guess we aren't expecting optimization to remove dead system code?
>> That would be the only reason to keep the ifdef.
>>
>> This function should be hoisted to translate.c, or simply replaced by
>> the field access.
>>
> I only want to replace mem_idx to priv here, Zhiwei might decide how to
> adjust the code further.
> 
I'm not sure if this is good English, sorry for that if it's not. I mean
I want to keep this patch small.

Thanks,
Fei.

> Thanks,
> Fei.
> 
>>> @@ -1162,8 +1163,10 @@ static void
>>> riscv_tr_init_disas_context(DisasContextBase *dcbase, CPUState *cs)
>>>   } else {
>>>   ctx->virt_enabled = false;
>>>   }
>>> +    ctx->priv = env->priv;
>>
>> Incorrect, as Zhiwei pointed out.
>> I gave you the changes required to TB_FLAGS...
>>
>>
>> r~
>

Re: [PATCH v4 1/2] target/riscv: separate priv from mmu_idx

On 3/24/2023 12:07 AM, Richard Henderson wrote:
> On 3/22/23 23:00, Wu, Fei wrote:
 +    ctx->priv = env->priv;
>>>
>>> This is not right. You should put env->priv into tb flags before you use
>>> it in translation.
>>>
>> I see some other env usages in this function, when will env->priv and
>> tb_flags.priv mismatch (assume we have recorded priv in tb_flags)?
> 
> You are correct that they should match, because of tb_flags, but it is
> bad form to read from env, as that leads to errors.  Since you *can*
> read the same data from tb_flags, you should.
> 
I lack some background here, why should tb_flags be preferred if env has
the same info? Then for reading from tb_flags, we need to add it to
tb_flags first.

Correct me if I'm wrong. The only strong reason we add some flag to
tb_flags is that it causes codegen generate different code because of
this flag change, and it looks like priv is not one of them, neither is
mmu_idx, the generated code does use mmu_idx, but no different code
generated for it.

I think here we have some other reasons to include more, e.g. reference
env can be error-prone in some way. So there are minimal flags must be
in tb_flags, but we think it's better to add more?

Thanks,
Fei.

> The read of misa_ext and misa_mxl_max are correct, because they are
> constant set at cpu init/realize.
> 
> The read of vstart is incorrect.  The TB_FLAGS field is VL_EQ_VLMAX,
> which includes a test for vstart == 0, but the smaller vstart == 0 test
> is not extractable from that.  Thus the usage in e.g.
> vext_check_reduction is incorrect.  One would require a new TB_FLAGS bit
> to encode vstart ==/!= 0 alone.
> 
> 
> r~

Re: [PATCH v4 1/2] target/riscv: separate priv from mmu_idx

On 3/23/2023 11:53 PM, Richard Henderson wrote:
> On 3/22/23 19:44, Fei Wu wrote:
>> Currently it's assumed the 2 low bits of mmu_idx map to privilege mode,
>> this assumption won't last as we are about to add more mmu_idx.
>>
>> Signed-off-by: Fei Wu 
>> ---
>>   target/riscv/cpu.h | 1 -
>>   target/riscv/cpu_helper.c  | 2 +-
>>   target/riscv/insn_trans/trans_privileged.c.inc | 2 +-
>>   target/riscv/insn_trans/trans_xthead.c.inc | 7 +--
>>   target/riscv/translate.c   | 3 +++
>>   5 files changed, 6 insertions(+), 9 deletions(-)
>>
>> diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
>> index 638e47c75a..66f7e3d1ba 100644
>> --- a/target/riscv/cpu.h
>> +++ b/target/riscv/cpu.h
>> @@ -623,7 +623,6 @@ G_NORETURN void
>> riscv_raise_exception(CPURISCVState *env,
>>   target_ulong riscv_cpu_get_fflags(CPURISCVState *env);
>>   void riscv_cpu_set_fflags(CPURISCVState *env, target_ulong);
>>   -#define TB_FLAGS_PRIV_MMU_MASK    3
>>   #define TB_FLAGS_PRIV_HYP_ACCESS_MASK   (1 << 2)
>>   #define TB_FLAGS_MSTATUS_FS MSTATUS_FS
>>   #define TB_FLAGS_MSTATUS_VS MSTATUS_VS
>> diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
>> index f88c503cf4..76e1b0100e 100644
>> --- a/target/riscv/cpu_helper.c
>> +++ b/target/riscv/cpu_helper.c
>> @@ -762,7 +762,7 @@ static int get_physical_address(CPURISCVState
>> *env, hwaddr *physical,
>>    * (riscv_cpu_do_interrupt) is correct */
>>   MemTxResult res;
>>   MemTxAttrs attrs = MEMTXATTRS_UNSPECIFIED;
>> -    int mode = mmu_idx & TB_FLAGS_PRIV_MMU_MASK;
>> +    int mode = env->priv;
> 
> This is incorrect.  You must map back from mmu_idx to priv.
> Recall the semantics of MPRV.
> 
The following code (~20 lines below) takes MPRV into consideration:

if (riscv_cpu_two_stage_lookup(mmu_idx)) {
mode = get_field(env->hstatus, HSTATUS_SPVP);
} else if (mode == PRV_M && access_type != MMU_INST_FETCH) {
if (get_field(env->mstatus, MSTATUS_MPRV)) {
mode = get_field(env->mstatus, MSTATUS_MPP);
}
}

I think it's the same logic as the original one.

>> diff --git a/target/riscv/insn_trans/trans_xthead.c.inc
>> b/target/riscv/insn_trans/trans_xthead.c.inc
>> index df504c3f2c..adfb53cb4c 100644
>> --- a/target/riscv/insn_trans/trans_xthead.c.inc
>> +++ b/target/riscv/insn_trans/trans_xthead.c.inc
>> @@ -265,12 +265,7 @@ static bool trans_th_tst(DisasContext *ctx,
>> arg_th_tst *a)
>>     static inline int priv_level(DisasContext *ctx)
>>   {
>> -#ifdef CONFIG_USER_ONLY
>> -    return PRV_U;
>> -#else
>> - /* Priv level is part of mem_idx. */
>> -    return ctx->mem_idx & TB_FLAGS_PRIV_MMU_MASK;
>> -#endif
>> +    return ctx->priv;
>>   }
> 
> I guess we aren't expecting optimization to remove dead system code?
> That would be the only reason to keep the ifdef.
> 
> This function should be hoisted to translate.c, or simply replaced by
> the field access.
> 
I only want to replace mem_idx to priv here, Zhiwei might decide how to
adjust the code further.

Thanks,
Fei.

>> @@ -1162,8 +1163,10 @@ static void
>> riscv_tr_init_disas_context(DisasContextBase *dcbase, CPUState *cs)
>>   } else {
>>   ctx->virt_enabled = false;
>>   }
>> +    ctx->priv = env->priv;
> 
> Incorrect, as Zhiwei pointed out.
> I gave you the changes required to TB_FLAGS...
> 
> 
> r~

Re: [PATCH for-8.1 v4 11/25] target/riscv/cpu.c: set cpu config in set_misa()





On 3/22/23 23:14, LIU Zhiwei wrote:


On 2023/3/23 6:19, Daniel Henrique Barboza wrote:

set_misa() is setting all 'misa' related env states and nothing else.
But other functions, namely riscv_cpu_validate_set_extensions(), uses
the config object to do its job.

This creates a need to set the single letter extensions in the cfg
object to keep both in sync. At this moment this is being done by
register_cpu_props(), forcing every CPU to do a call to this function.

Let's beef up set_misa() and make the function do the sync for us. This
will relieve named CPUs to having to call register_cpu_props(), which
will then be redesigned to a more specialized role next.

Signed-off-by: Daniel Henrique Barboza 
---
  target/riscv/cpu.c | 43 ---
  target/riscv/cpu.h |  4 ++--
  2 files changed, 34 insertions(+), 13 deletions(-)

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index 36c55abda0..df5c0bda70 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -236,8 +236,40 @@ const char *riscv_cpu_get_trap_name(target_ulong cause, 
bool async)
  static void set_misa(CPURISCVState *env, RISCVMXL mxl, uint32_t ext)
  {
+    RISCVCPU *cpu;
+
  env->misa_mxl_max = env->misa_mxl = mxl;
  env->misa_ext_mask = env->misa_ext = ext;
+
+    /*
+ * ext = 0 will only be a thing during cpu_init() functions
+ * as a way of setting an extension-agnostic CPU. We do
+ * not support clearing misa_ext* and the ext_N flags in
+ * RISCVCPUConfig in regular circunstances.
+ */
+    if (ext == 0) {
+    return;
+    }
+
+    /*
+ * We can't use riscv_cpu_cfg() in this case because it is
+ * a read-only inline and we're going to change the values
+ * of cpu->cfg.
+ */
+    cpu = env_archcpu(env);
+
+    cpu->cfg.ext_i = ext & RVI;
+    cpu->cfg.ext_e = ext & RVE;
+    cpu->cfg.ext_m = ext & RVM;
+    cpu->cfg.ext_a = ext & RVA;
+    cpu->cfg.ext_f = ext & RVF;
+    cpu->cfg.ext_d = ext & RVD;
+    cpu->cfg.ext_v = ext & RVV;
+    cpu->cfg.ext_c = ext & RVC;
+    cpu->cfg.ext_s = ext & RVS;
+    cpu->cfg.ext_u = ext & RVU;
+    cpu->cfg.ext_h = ext & RVH;
+    cpu->cfg.ext_j = ext & RVJ;
  }
  #ifndef CONFIG_USER_ONLY
@@ -340,7 +372,6 @@ static void riscv_any_cpu_init(Object *obj)
  #endif
  env->priv_ver = PRIV_VERSION_LATEST;
-    register_cpu_props(obj);


This patch will break the original logic. We can only can a empty CPU here.


  /* inherited from parent obj via riscv_cpu_init() */
  cpu->cfg.ext_ifencei = true;
@@ -368,7 +399,6 @@ static void rv64_sifive_u_cpu_init(Object *obj)
  RISCVCPU *cpu = RISCV_CPU(obj);
  CPURISCVState *env = >env;
  set_misa(env, MXL_RV64, RVI | RVM | RVA | RVF | RVD | RVC | RVS | RVU);
-    register_cpu_props(obj);
  env->priv_ver = PRIV_VERSION_1_10_0;
  #ifndef CONFIG_USER_ONLY
  set_satp_mode_max_supported(RISCV_CPU(obj), VM_1_10_SV39);
@@ -387,7 +417,6 @@ static void rv64_sifive_e_cpu_init(Object *obj)
  RISCVCPU *cpu = RISCV_CPU(obj);
  set_misa(env, MXL_RV64, RVI | RVM | RVA | RVC | RVU);
-    register_cpu_props(obj);
  env->priv_ver = PRIV_VERSION_1_10_0;
  #ifndef CONFIG_USER_ONLY
  set_satp_mode_max_supported(cpu, VM_1_10_MBARE);
@@ -408,9 +437,6 @@ static void rv64_thead_c906_cpu_init(Object *obj)
  env->priv_ver = PRIV_VERSION_1_11_0;
  cpu->cfg.ext_g = true;
-    cpu->cfg.ext_c = true;
-    cpu->cfg.ext_u = true;
-    cpu->cfg.ext_s = true;


Why specially for these configurations?


Because there is a set_misa() call right before:

set_misa(env, MXL_RV64, RVI | RVM | RVA | RVF | RVD | RVC | RVS | RVU);

And since set_misa() is now syncing with cpu->cfg these are unneeded. There
are no other cases like that in other cpu_init() functions. I'll make sure to
mention this in the commit message in the next version.

Patch 14 is going to do the same thing for ext_g in this same function.


Thanks,

Daniel



Zhiwei


  cpu->cfg.ext_icsr = true;
  cpu->cfg.ext_zfh = true;
  cpu->cfg.mmu = true;
@@ -472,8 +498,6 @@ static void rv32_sifive_u_cpu_init(Object *obj)
  RISCVCPU *cpu = RISCV_CPU(obj);
  CPURISCVState *env = >env;
  set_misa(env, MXL_RV32, RVI | RVM | RVA | RVF | RVD | RVC | RVS | RVU);
-
-    register_cpu_props(obj);
  env->priv_ver = PRIV_VERSION_1_10_0;
  #ifndef CONFIG_USER_ONLY
  set_satp_mode_max_supported(RISCV_CPU(obj), VM_1_10_SV32);
@@ -492,7 +516,6 @@ static void rv32_sifive_e_cpu_init(Object *obj)
  RISCVCPU *cpu = RISCV_CPU(obj);
  set_misa(env, MXL_RV32, RVI | RVM | RVA | RVC | RVU);
-    register_cpu_props(obj);
  env->priv_ver = PRIV_VERSION_1_10_0;
  #ifndef CONFIG_USER_ONLY
  set_satp_mode_max_supported(cpu, VM_1_10_MBARE);
@@ -510,7 +533,6 @@ static void rv32_ibex_cpu_init(Object *obj)
  RISCVCPU *cpu = RISCV_CPU(obj);
  set_misa(env, MXL_RV32, RVI | RVM | RVC | RVU);
-    register_cpu_props(obj);
  env->priv_ver = PRIV_VERSION_1_11_0;
  #ifndef CONFIG_USER_ONLY

Re: [PATCH 1/3] docs: Add support for TPM devices over I2C bus




On 3/23/23 2:49 AM, Cédric Le Goater wrote:

On 3/23/23 04:01, Ninad Palsule wrote:

This is a documentation change for I2C TPM device support.

Qemu already supports devices attached to ISA and sysbus.
This drop adds support for the I2C bus attached TPM devices.

Signed-off-by: Ninad Palsule 

---
V2:

Incorporated Stephen's review comments
- Added example in the document.
---
  docs/specs/tpm.rst | 20 +++-
  1 file changed, 19 insertions(+), 1 deletion(-)

diff --git a/docs/specs/tpm.rst b/docs/specs/tpm.rst
index 535912a92b..bf7249b09c 100644
--- a/docs/specs/tpm.rst
+++ b/docs/specs/tpm.rst
@@ -21,11 +21,15 @@ QEMU files related to TPM TIS interface:
   - ``hw/tpm/tpm_tis_common.c``
   - ``hw/tpm/tpm_tis_isa.c``
   - ``hw/tpm/tpm_tis_sysbus.c``
+ - ``hw/tpm/tpm_tis_i2c.c``
   - ``hw/tpm/tpm_tis.h``
    Both an ISA device and a sysbus device are available. The former is
  used with pc/q35 machine while the latter can be instantiated in the
-Arm virt machine.
+Arm virt machine. An I2C device support is also added which can be
+instantiated in the arm based emulation machine. An I2C device is also
+supported for the Arm virt machine. This device only supports the
+TPM 2 protocol.
    CRB interface
  -
@@ -348,6 +352,20 @@ In case an Arm virt machine is emulated, use the 
following command line:

  -drive if=pflash,format=raw,file=flash0.img,readonly=on \
  -drive if=pflash,format=raw,file=flash1.img
  +In case a Rainier bmc machine is emulated, use the following 
command line:

+
+.. code-block:: console
+
+  qemu-system-arm -M rainier-bmc -nographic \
+    -kernel ${IMAGEPATH}/fitImage-linux.bin \
+    -dtb ${IMAGEPATH}/aspeed-bmc-ibm-rainier.dtb \
+    -initrd ${IMAGEPATH}/obmc-phosphor-initramfs.rootfs.cpio.xz \
+    -drive 
file=${IMAGEPATH}/obmc-phosphor-image.rootfs.wic.qcow2,if=sd,index=2\
+    -net nic -net 
user,hostfwd=:127.0.0.1:-:22,hostfwd=:127.0.0.1:2443-:443\

+    -chardev socket,id=chrtpm,path=/tmp/mytpm1/swtpm-sock \
+    -tpmdev emulator,id=tpm0,chardev=chrtpm \
+    -device tpm-tis-i2c,tpmdev=tpm0,bus=aspeed.i2c.bus.12,address=0x2e



The rainier images are not the easiest to find. Could we use an 
AST2600 EVB

machine instead and instantiate the device from user space ? see commit
3302184f7f or 7a7308eae0.

Ok, I will check on AST2600 EVB machine. The rainier images are 
available here: 
https://jenkins.openbmc.org/job/ci-openbmc/lastSuccessfulBuild/distro=ubuntu,label=docker-builder,target=p10bmc/artifact/openbmc/build/tmp/deploy/images/p10bmc/


Thanks for the review.

Ninad


Thanks,

C.


  In case SeaBIOS is used as firmware, it should show the TPM menu item
  after entering the menu with 'ESC'.

Re: [PATCH 3/3] New I2C: Add support for TPM devices over I2C bus




On 3/23/23 3:37 AM, Cédric Le Goater wrote:

On 3/23/23 04:01, Ninad Palsule wrote:

Qemu already supports devices attached to ISA and sysbus. This drop adds
support for the I2C bus attached TPM devices. I2C model only supports
TPM2 protocol.

This commit includes changes for the common code.
- Added I2C emulation model. Logic was added in the model to temporarily
   cache the data as I2C interface works per byte basis.
- New tpm type "tpm-tis-i2c" added for I2C support. User specify this
   string on command line.

Testing:
   TPM I2C device modulte is tested using SWTPM (software based TPM
   package). The qemu used the rainier machine and it was connected to
   swtpm over the socket interface.

   The command to start swtpm is as follows:
   $ swtpm socket --tpmstate dir=/tmp/mytpm1    \
  --ctrl type=unixio,path=/tmp/mytpm1/swtpm-sock  \
  --tpm2 --log level=100

   The command to start qemu is as follows:
   $ qemu-system-arm -M rainier-bmc -nographic \
 -kernel ${IMAGEPATH}/fitImage-linux.bin \
 -dtb ${IMAGEPATH}/aspeed-bmc-ibm-rainier.dtb \
 -initrd 
${IMAGEPATH}/obmc-phosphor-initramfs.rootfs.cpio.xz \
 -drive 
file=${IMAGEPATH}/obmc-phosphor-image.rootfs.wic.qcow2,if=sd,index=2 \
 -net nic -net 
user,hostfwd=:127.0.0.1:-:22,hostfwd=:127.0.0.1:2443-:443 \

 -chardev socket,id=chrtpm,path=/tmp/mytpm1/swtpm-sock \
 -tpmdev emulator,id=tpm0,chardev=chrtpm \
 -device 
tpm-tis-i2c,tpmdev=tpm0,bus=aspeed.i2c.bus.12,address=0x2e


   Note: Currently you need to specify the I2C bus and device address on
 command line. In future we can add a device at board level.


yes. Anyhow, it is better to start with user created device first than 
with

default devices created at the board level.



I could not find the  arch/arm/boot/dts/aspeed-ast2600-evb.dtb in 
https://jenkins.openbmc.org/job/ci-openbmc/lastSuccessfulBuild/ 
. Can 
you please point me to the location?






Signed-off-by: Ninad Palsule 
---
V2:
Incorporated Stephen's review comments.
- Handled checksum related register in I2C layer
- Defined I2C interface capabilities and return those instead of
   capabilities from TPM TIS. Add required capabilities from TIS.
- Do not cache FIFO data in the I2C layer.
- Make sure that Device address change register is not passed to I2C
   layer as capability indicate that it is not supported.
- Added boundary checks.
- Make sure that bits 26-31 are zeroed for the TPM_STS register on read
- Updated Kconfig files for new define.
---
  hw/arm/Kconfig   |   1 +
  hw/tpm/Kconfig   |   7 +
  hw/tpm/meson.build   |   1 +
  hw/tpm/tpm_tis_i2c.c | 440 +++
  include/sysemu/tpm.h |   3 +
  5 files changed, 452 insertions(+)
  create mode 100644 hw/tpm/tpm_tis_i2c.c

diff --git a/hw/arm/Kconfig b/hw/arm/Kconfig
index b5aed4aff5..05d6ef1a31 100644
--- a/hw/arm/Kconfig
+++ b/hw/arm/Kconfig
@@ -6,6 +6,7 @@ config ARM_VIRT
  imply VFIO_PLATFORM
  imply VFIO_XGMAC
  imply TPM_TIS_SYSBUS
+    imply TPM_TIS_I2C
  imply NVDIMM
  select ARM_GIC
  select ACPI
diff --git a/hw/tpm/Kconfig b/hw/tpm/Kconfig
index 29e82f3c92..a46663288c 100644
--- a/hw/tpm/Kconfig
+++ b/hw/tpm/Kconfig
@@ -1,3 +1,10 @@
+config TPM_TIS_I2C
+    bool
+    depends on TPM
+    select TPM_BACKEND
+    select I2C
+    select TPM_TIS
+
  config TPM_TIS_ISA
  bool
  depends on TPM && ISA_BUS
diff --git a/hw/tpm/meson.build b/hw/tpm/meson.build
index 7abc2d794a..76fe3cb098 100644
--- a/hw/tpm/meson.build
+++ b/hw/tpm/meson.build
@@ -1,6 +1,7 @@
  softmmu_ss.add(when: 'CONFIG_TPM_TIS', if_true: 
files('tpm_tis_common.c'))
  softmmu_ss.add(when: 'CONFIG_TPM_TIS_ISA', if_true: 
files('tpm_tis_isa.c'))
  softmmu_ss.add(when: 'CONFIG_TPM_TIS_SYSBUS', if_true: 
files('tpm_tis_sysbus.c'))
+softmmu_ss.add(when: 'CONFIG_TPM_TIS_I2C', if_true: 
files('tpm_tis_i2c.c'))

  softmmu_ss.add(when: 'CONFIG_TPM_CRB', if_true: files('tpm_crb.c'))
  softmmu_ss.add(when: 'CONFIG_TPM_TIS', if_true: files('tpm_ppi.c'))
  softmmu_ss.add(when: 'CONFIG_TPM_CRB', if_true: files('tpm_ppi.c'))
diff --git a/hw/tpm/tpm_tis_i2c.c b/hw/tpm/tpm_tis_i2c.c
new file mode 100644
index 00..5cec5f7806
--- /dev/null
+++ b/hw/tpm/tpm_tis_i2c.c
@@ -0,0 +1,440 @@
+/*
+ * tpm_tis_i2c.c - QEMU's TPM TIS I2C Device
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 
or later.

+ * See the COPYING file in the top-level directory.
+ *
+ * Implementation of the TIS interface according to specs found at
+ * http://www.trustedcomputinggroup.org. This implementation currently
+ * supports version 1.3, 21 March 2013
+ * In the developers menu choose the PC Client section then find the 
TIS

+ * specification.
+ *
+ * TPM TIS for TPM 2 implementation following TCG PC Client Platform
+ * TPM Profile (PTP)

Cxl devel!

2023-03-23 Thread Maverickk 78

Hello Jonathan

Raghu here, I'm going over your cxl patches for past few days, it's very
impressive.

I want to get involved and contribute in your endeavor, may be bits &
pieces to start.

If you're specific trivial task(cvl/pcie/fm) about cxl, please let me know.

Regards
Raghu

Re: [PATCH 0/3] Add support for TPM devices over I2C bus

On 3/23/23 2:23 AM, Cédric Le Goater wrote:

Hello Ninad,

On 3/23/23 04:01, Ninad Palsule wrote:

This drop adds support for the TPM devices attached to the I2C bus. It
only supports the TPM2 protocol. You need to run it with the external
TPM emulator like swtpm. I have tested it with swtpm.

I have refered to the work done by zhdan...@meta.com but at the core
level out implementation is different.
https://github.com/theopolis/qemu/commit/2e2e57cde9e419c36af8071bb85392ad1ed70966

Based-on: $MESSAGE_ID
---
V2:
Incorporated Stephan's comments.

Please add a version to the patchsets you send :

git format-patch -v 2 --cover-letter

it is better practice and easier to track in our mailboxes. The automated
tools patchew, patchwork, also track them.

Yes, I noted down. Sorry I missed that last time.

Ninad Palsule (3):
docs: Add support for TPM devices over I2C bus

Generally we add the docs after support. No big deal.

Ok, I will remember this next time.

TPM TIS: Add support for TPM devices over I2C bus
New I2C: Add support for TPM devices over I2C bus

Have you looked at adding tests ? qtest or avocado ?

We discussed offline about it with Stefan and the I2C qos framework in
qtest is a bit of a challenge for the TPM purpose. See the thread here :

https://lore.kernel.org/qemu-devel/dd43ec84-51e4-11f7-e067-2fb57a567...@linux.ibm.com/T/#u

Stefan has already created some tests. Thanks Stefan.

Thanks for the review!

Ninad

Thanks,

[PATCH v2 1/1] hw/arm: do not free machine->fdt in arm_load_dtb()

At this moment, arm_load_dtb() can free machine->fdt when
binfo->dtb_filename is NULL. If there's no 'dtb_filename', 'fdt' will be
retrieved by binfo->get_dtb(). If get_dtb() returns machine->fdt, as is
the case of machvirt_dtb() from hw/arm/virt.c, fdt now has a pointer to
machine->fdt. And, in that case, the existing g_free(fdt) at the end of
arm_load_dtb() will make machine->fdt point to an invalid memory region.

After the command 'dumpdtb' were introduced a couple of releases ago,
running it with any ARM machine that uses arm_load_dtb() will crash
QEMU.

Let's enable all arm_load_dtb() callers to use dumpdtb properly. Instead
of freeing 'fdt', assign it back to ms->fdt.

Note that all current callers (sbsa-ref.c, virt.c, xlnx-versal-virt.c)
are assigning ms->fdt before arm_load_dtb() is called, regardless of
whether the user is inputting an external FDT via '-dtb'. To avoid
leaking the board FDT if '-dtb' is used (since we're assigning ms->fdt
in the end), free ms->fdt before load_device_tree().

Cc: Peter Maydell 
Cc: qemu-...@nongnu.org
Fixes: bf353ad55590f ("qmp/hmp, device_tree.c: introduce dumpdtb")
Reported-by: Markus Armbruster i
Signed-off-by: Daniel Henrique Barboza 
---
 hw/arm/boot.c | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/hw/arm/boot.c b/hw/arm/boot.c
index 50e5141116..de18c0a969 100644
--- a/hw/arm/boot.c
+++ b/hw/arm/boot.c
@@ -549,6 +549,13 @@ int arm_load_dtb(hwaddr addr, const struct arm_boot_info 
*binfo,
 goto fail;
 }
 
+/*
+ * If we're here we won't be using the ms->fdt from the board.
+ * We'll assign a new ms->fdt at the end, so free it now to
+ * avoid leaking the board FDT.
+ */
+g_free(ms->fdt);
+
 fdt = load_device_tree(filename, );
 if (!fdt) {
 fprintf(stderr, "Couldn't open dtb file %s\n", filename);
@@ -689,7 +696,8 @@ int arm_load_dtb(hwaddr addr, const struct arm_boot_info 
*binfo,
 qemu_register_reset_nosnapshotload(qemu_fdt_randomize_seeds,
rom_ptr_for_as(as, addr, size));
 
-g_free(fdt);
+/* Set ms->fdt for 'dumpdtb' QMP/HMP command */
+ms->fdt = fdt;
 
 return size;
 
-- 
2.39.2

[PATCH v2 0/1] fix dumpdtb crash with ARM machines

Hi,

In this version I fixed a mem leak that was happening if the user inputs
a fdt via '-dtb'. In that case we would assign the updated FDT on top of
the existing board FDT that was already assigned to ms->fdt.

Tested as follows:

$ ./qemu-system-aarch64 -S -M virt -display none -qmp stdio
{"QMP": {"version": {"qemu": {"micro": 91, "minor": 2, "major": 7}, "package": 
"v8.0.0-rc1-37-ge573ef31e7-dirty"}, "capabilities": ["oob"]}}
{"execute": "qmp_capabilities", "arguments": {"enable": ["oob"]}}
{"return": {}}

{"execute": "dumpdtb", "arguments": {"filename": "fdt.dtb"}}
{"return": {}}

^Cqemu-system-aarch64: terminating on signal 2
{"timestamp": {"seconds": 1679603324, "microseconds": 62159}, "event": 
"SHUTDOWN", "data": {"guest": false, "reason": "host-signal"}}
$ 


$ ./qemu-system-aarch64 -S -M virt -display none -qmp stdio -dtb fdt.dtb 
{"QMP": {"version": {"qemu": {"micro": 91, "minor": 2, "major": 7}, "package": 
"v8.0.0-rc1-37-ge573ef31e7-dirty"}, "capabilities": ["oob"]}}
{"execute": "qmp_capabilities", "arguments": {"enable": ["oob"]}}
{"return": {}}

{"execute": "dumpdtb", "arguments": {"filename": "fdt.dtb"}}
{"return": {}}
^Cqemu-system-aarch64: terminating on signal 2
{"timestamp": {"seconds": 1679603344, "microseconds": 145748}, "event": 
"SHUTDOWN", "data": {"guest": false, "reason": "host-signal"}}
$ 


First we use 'dumpdtb' to dump the board FDT in fdt.dtb, then we use it as
an argument to '-dtb' and do the test again. This covers both code paths.


Changes from v1:
- g_free(ms->fdt) before load_device_tree()
- v1 link: https://lists.gnu.org/archive/html/qemu-devel/2023-03/msg05930.html

Cc: Peter Maydell 
Cc: Markus Armbruster 
Cc: qemu-...@nongnu.org

Daniel Henrique Barboza (1):
  hw/arm: do not free machine->fdt in arm_load_dtb()

 hw/arm/boot.c | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

-- 
2.39.2

Re: [PATCH 1/1] hw/arm: do not free machine->fdt in arm_load_dtb()





On 3/23/23 14:59, Peter Maydell wrote:

On Thu, 23 Mar 2023 at 17:54, Daniel Henrique Barboza
 wrote:




On 3/23/23 14:38, Peter Maydell wrote:

On Thu, 23 Mar 2023 at 16:11, Daniel Henrique Barboza
 wrote:

-g_free(fdt);
+/* Set ms->fdt for 'dumpdtb' QMP/HMP command */
+ms->fdt = fdt;


With this we're now setting ms->fdt twice for the virt board: we set
it in create_fdt() in hw/arm/virt.c, and then we set it again here.
Which is the right place to set it?

Is the QMP 'dumpdtb' command intended to dump the DTB only for
board types where the DTB is created at runtime by QEMU? Or
is it supposed to also work for DTBs that were originally
provided by the user using the '-dtb' command line? The docs
don't say. If we want the former, then we should be setting
ms->fdt in the board code; if the latter, then here is right.


My original intent with this command was to dump the current state of the FDT,
regardless of whether the FDT was loaded via -dtb or at runtime.


Mmm. I think that makes sense; we do make a few tweaks to the DTB
even if it was user-provided and you might want to check those
for debug purposes. So we should keep this assignment, and remove
the now-unneeded setting of ms->fdt in create_fdt().



I don't think we can remove it. arm_load_dtb() does the following:

if (binfo->dtb_filename) {
   (...)
} else {
fdt = binfo->get_dtb(binfo, );
if (!fdt) {
fprintf(stderr, "Board was unable to create a dtb blob\n");
goto fail;
}
}

So if we don't have a '-dtb' option, fdt = binfo->get_dtb(). For the 'virt' 
machine,
machvirt_dtb(), will return ms->fdt. So we would SIGSEG right at the start.

And now that I think more about it, this patch is leaking the board FDT if we're
using the FDT from dtb_filename, isn't it? We're assigning a new ms->fdt on top
of the existing ms->fdt from the board. I'll send a new version.

Also, given that we're not using the board FDT at all if '-dtb' is present, I
think it would be good to move create_fdt() from machvirt_init() to 
machvirt_dtb().
Some code juggling would be required (some functions from init() are using 
ms->fdt)
but it think it would make the code clearer - create the fdt board only if we're
really going to use it. I'll see if I can pull this off and send a 8.1 patch
with it.


Thanks,


Daniel






thanks
-- PMM

Re: [PATCH 3/3] New I2C: Add support for TPM devices over I2C bus




On 3/23/23 7:18 AM, Stefan Berger wrote:



On 3/22/23 23:01, Ninad Palsule wrote:

Qemu already supports devices attached to ISA and sysbus. This drop adds
support for the I2C bus attached TPM devices. I2C model only supports
TPM2 protocol.

This commit includes changes for the common code.
- Added I2C emulation model. Logic was added in the model to temporarily
   cache the data as I2C interface works per byte basis.
- New tpm type "tpm-tis-i2c" added for I2C support. User specify this
   string on command line.

Testing:
   TPM I2C device modulte is tested using SWTPM (software based TPM
   package). The qemu used the rainier machine and it was connected to
   swtpm over the socket interface.

   The command to start swtpm is as follows:
   $ swtpm socket --tpmstate dir=/tmp/mytpm1    \
  --ctrl type=unixio,path=/tmp/mytpm1/swtpm-sock  \
  --tpm2 --log level=100

   The command to start qemu is as follows:
   $ qemu-system-arm -M rainier-bmc -nographic \
 -kernel ${IMAGEPATH}/fitImage-linux.bin \
 -dtb ${IMAGEPATH}/aspeed-bmc-ibm-rainier.dtb \
 -initrd 
${IMAGEPATH}/obmc-phosphor-initramfs.rootfs.cpio.xz \
 -drive 
file=${IMAGEPATH}/obmc-phosphor-image.rootfs.wic.qcow2,if=sd,index=2 \
 -net nic -net 
user,hostfwd=:127.0.0.1:-:22,hostfwd=:127.0.0.1:2443-:443 \

 -chardev socket,id=chrtpm,path=/tmp/mytpm1/swtpm-sock \
 -tpmdev emulator,id=tpm0,chardev=chrtpm \
 -device 
tpm-tis-i2c,tpmdev=tpm0,bus=aspeed.i2c.bus.12,address=0x2e


   Note: Currently you need to specify the I2C bus and device address on
 command line. In future we can add a device at board level.

Signed-off-by: Ninad Palsule 
---
V2:
Incorporated Stephen's review comments.
- Handled checksum related register in I2C layer
- Defined I2C interface capabilities and return those instead of
   capabilities from TPM TIS. Add required capabilities from TIS.
- Do not cache FIFO data in the I2C layer.
- Make sure that Device address change register is not passed to I2C
   layer as capability indicate that it is not supported.
- Added boundary checks.
- Make sure that bits 26-31 are zeroed for the TPM_STS register on read
- Updated Kconfig files for new define.
---
  hw/arm/Kconfig   |   1 +
  hw/tpm/Kconfig   |   7 +
  hw/tpm/meson.build   |   1 +
  hw/tpm/tpm_tis_i2c.c | 440 +++
  include/sysemu/tpm.h |   3 +
  5 files changed, 452 insertions(+)
  create mode 100644 hw/tpm/tpm_tis_i2c.c

diff --git a/hw/arm/Kconfig b/hw/arm/Kconfig
index b5aed4aff5..05d6ef1a31 100644
--- a/hw/arm/Kconfig
+++ b/hw/arm/Kconfig
@@ -6,6 +6,7 @@ config ARM_VIRT
  imply VFIO_PLATFORM
  imply VFIO_XGMAC
  imply TPM_TIS_SYSBUS
+    imply TPM_TIS_I2C
  imply NVDIMM
  select ARM_GIC
  select ACPI
diff --git a/hw/tpm/Kconfig b/hw/tpm/Kconfig
index 29e82f3c92..a46663288c 100644
--- a/hw/tpm/Kconfig
+++ b/hw/tpm/Kconfig
@@ -1,3 +1,10 @@
+config TPM_TIS_I2C
+    bool
+    depends on TPM
+    select TPM_BACKEND
+    select I2C
+    select TPM_TIS
+
  config TPM_TIS_ISA
  bool
  depends on TPM && ISA_BUS
diff --git a/hw/tpm/meson.build b/hw/tpm/meson.build
index 7abc2d794a..76fe3cb098 100644
--- a/hw/tpm/meson.build
+++ b/hw/tpm/meson.build
@@ -1,6 +1,7 @@
  softmmu_ss.add(when: 'CONFIG_TPM_TIS', if_true: 
files('tpm_tis_common.c'))
  softmmu_ss.add(when: 'CONFIG_TPM_TIS_ISA', if_true: 
files('tpm_tis_isa.c'))
  softmmu_ss.add(when: 'CONFIG_TPM_TIS_SYSBUS', if_true: 
files('tpm_tis_sysbus.c'))
+softmmu_ss.add(when: 'CONFIG_TPM_TIS_I2C', if_true: 
files('tpm_tis_i2c.c'))

  softmmu_ss.add(when: 'CONFIG_TPM_CRB', if_true: files('tpm_crb.c'))
  softmmu_ss.add(when: 'CONFIG_TPM_TIS', if_true: files('tpm_ppi.c'))
  softmmu_ss.add(when: 'CONFIG_TPM_CRB', if_true: files('tpm_ppi.c'))
diff --git a/hw/tpm/tpm_tis_i2c.c b/hw/tpm/tpm_tis_i2c.c
new file mode 100644
index 00..5cec5f7806
--- /dev/null
+++ b/hw/tpm/tpm_tis_i2c.c
@@ -0,0 +1,440 @@
+/*
+ * tpm_tis_i2c.c - QEMU's TPM TIS I2C Device
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 
or later.

+ * See the COPYING file in the top-level directory.
+ *
+ * Implementation of the TIS interface according to specs found at
+ * http://www.trustedcomputinggroup.org. This implementation currently
+ * supports version 1.3, 21 March 2013
+ * In the developers menu choose the PC Client section then find the 
TIS

+ * specification.
+ *
+ * TPM TIS for TPM 2 implementation following TCG PC Client Platform
+ * TPM Profile (PTP) Specification, Familiy 2.0, Revision 00.43
+ *
+ * TPM I2C implementation follows TCG TPM I2c Interface specification,
+ * Family 2.0, Level 00, Revision 1.00
+ */
+
+#include "qemu/osdep.h"
+#include "hw/i2c/i2c.h"
+#include "hw/qdev-properties.h"
+#include "hw/acpi/tpm.h"
+#include "migration/vmstate.h"
+#include "tpm_prop.h"
+#include "tpm_tis.h"
+#include "qom/object.h"

[PATCH for 8.1 v2 6/6] vdpa: Cache cvq group in VhostVDPAState

Continue the move of code that interacts with the device from control
virtqueue start to control virtqueue init.

As with previous patches, it reduces the number of ioctls in the
migration, reducing failure possibilities.

Signed-off-by: Eugenio Pérez 
---
 net/vhost-vdpa.c | 40 +++-
 1 file changed, 27 insertions(+), 13 deletions(-)

diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index db2c9afcb3..6a60e8cc2b 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -41,6 +41,12 @@ typedef struct VhostVDPAState {
 void *cvq_cmd_out_buffer;
 virtio_net_ctrl_ack *status;
 
+/* CVQ group if cvq_isolated_mq */
+uint32_t cvq_group_mq;
+
+/* CVQ group if cvq_isolated */
+uint32_t cvq_group;
+
 /* The device always have SVQ enabled */
 bool always_svq;
 
@@ -480,7 +486,6 @@ static int vhost_vdpa_net_cvq_start(NetClientState *nc)
 struct vhost_vdpa *v;
 int64_t cvq_group;
 int r;
-Error *err = NULL;
 
 assert(nc->info->type == NET_CLIENT_DRIVER_VHOST_VDPA);
 
@@ -509,18 +514,14 @@ static int vhost_vdpa_net_cvq_start(NetClientState *nc)
 if (!s->cvq_isolated_mq) {
 return 0;
 }
+
+cvq_group = s->cvq_group_mq;
 } else {
 if (!s->cvq_isolated) {
 return 0;
 }
-}
 
-cvq_group = vhost_vdpa_get_vring_group(v->device_fd,
-   v->dev->vq_index_end - 1,
-   );
-if (unlikely(cvq_group < 0)) {
-error_report_err(err);
-return cvq_group;
+cvq_group = s->cvq_group;
 }
 
 r = vhost_vdpa_set_address_space_id(v, cvq_group, VHOST_VDPA_NET_CVQ_ASID);
@@ -790,11 +791,13 @@ static const VhostShadowVirtqueueOps 
vhost_vdpa_net_svq_ops = {
  * @device_fd vhost-vdpa file descriptor
  * @features features to negotiate
  * @cvq_index Control vq index
+ * @pcvq_group: Returns CVQ group if cvq is isolated.
  *
  * Returns -1 in case of error, 0 if false and 1 if true
  */
 static int vhost_vdpa_cvq_is_isolated(int device_fd, uint64_t features,
-  unsigned cvq_index, Error **errp)
+  unsigned cvq_index, uint32_t *pcvq_group,
+  Error **errp)
 {
 int64_t cvq_group;
 int r;
@@ -810,6 +813,7 @@ static int vhost_vdpa_cvq_is_isolated(int device_fd, 
uint64_t features,
 return cvq_group;
 }
 
+*pcvq_group = (uint32_t)cvq_group;
 for (int i = 0; i < cvq_index; ++i) {
 int64_t group = vhost_vdpa_get_vring_group(device_fd, i, errp);
 
@@ -836,12 +840,15 @@ static int vhost_vdpa_cvq_is_isolated(int device_fd, 
uint64_t features,
  *negotiated.
  * @cvq_isolated_mq   It'll be set to true if cvq is isolated if mq is
  *negotiated.
+ * @cvq_group CVQ group if MQ is not negotiated.
+ * @cvq_group_mq  CVQ group if MQ is negotiated.
  *
  * Returns -1 in case of failure
  */
 static int vhost_vdpa_probe_cvq_isolation(int device_fd, uint64_t features,
   int cvq_index, bool *cvq_isolated,
-  bool *cvq_isolated_mq, Error **errp)
+  bool *cvq_isolated_mq, uint32_t 
*cvq_group,
+  uint32_t *cvq_group_mq, Error **errp)
 {
 uint64_t backend_features;
 int r;
@@ -850,6 +857,8 @@ static int vhost_vdpa_probe_cvq_isolation(int device_fd, 
uint64_t features,
 
 *cvq_isolated = false;
 *cvq_isolated_mq = false;
+*cvq_group = 0;
+*cvq_group_mq = 0;
 r = ioctl(device_fd, VHOST_GET_BACKEND_FEATURES, _features);
 if (unlikely(r < 0)) {
 error_setg_errno(errp, errno, "Cannot get vdpa backend_features");
@@ -862,7 +871,7 @@ static int vhost_vdpa_probe_cvq_isolation(int device_fd, 
uint64_t features,
 
 r = vhost_vdpa_cvq_is_isolated(device_fd,
features & ~BIT_ULL(VIRTIO_NET_F_MQ), 2,
-   errp);
+   cvq_group, errp);
 if (unlikely(r < 0)) {
 if (r == -ENOTSUP) {
 /*
@@ -884,7 +893,8 @@ static int vhost_vdpa_probe_cvq_isolation(int device_fd, 
uint64_t features,
 return 0;
 }
 
-r = vhost_vdpa_cvq_is_isolated(device_fd, features, cvq_index * 2, errp);
+r = vhost_vdpa_cvq_is_isolated(device_fd, features, cvq_index * 2,
+   cvq_group_mq, errp);
 if (unlikely(r < 0)) {
 return r;
 }
@@ -911,6 +921,7 @@ static NetClientState *net_vhost_vdpa_init(NetClientState 
*peer,
 int ret = 0;
 assert(name);
 bool cvq_isolated, cvq_isolated_mq;
+uint32_t cvq_group, cvq_group_mq;
 
 if (is_datapath) {
 nc = qemu_new_net_client(_vhost_vdpa_info, peer, device,
@@ -918,7 +929,8 @@ static NetClientState

[PATCH for 8.1 v2 1/6] vdpa: Remove status in reset tracing

It is always 0 and it is not useful to route call through file
descriptor.

Signed-off-by: Eugenio Pérez 
---
 hw/virtio/vhost-vdpa.c | 2 +-
 hw/virtio/trace-events | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index bc6bad23d5..bbabea18f3 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -716,7 +716,7 @@ static int vhost_vdpa_reset_device(struct vhost_dev *dev)
 uint8_t status = 0;
 
 ret = vhost_vdpa_call(dev, VHOST_VDPA_SET_STATUS, );
-trace_vhost_vdpa_reset_device(dev, status);
+trace_vhost_vdpa_reset_device(dev);
 v->suspended = false;
 return ret;
 }
diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
index 8f8d05cf9b..6265231683 100644
--- a/hw/virtio/trace-events
+++ b/hw/virtio/trace-events
@@ -44,7 +44,7 @@ vhost_vdpa_set_mem_table(void *dev, uint32_t nregions, 
uint32_t padding) "dev: %
 vhost_vdpa_dump_regions(void *dev, int i, uint64_t guest_phys_addr, uint64_t 
memory_size, uint64_t userspace_addr, uint64_t flags_padding) "dev: %p %d: 
guest_phys_addr: 0x%"PRIx64" memory_size: 0x%"PRIx64" userspace_addr: 
0x%"PRIx64" flags_padding: 0x%"PRIx64
 vhost_vdpa_set_features(void *dev, uint64_t features) "dev: %p features: 
0x%"PRIx64
 vhost_vdpa_get_device_id(void *dev, uint32_t device_id) "dev: %p device_id 
%"PRIu32
-vhost_vdpa_reset_device(void *dev, uint8_t status) "dev: %p status: 0x%"PRIx8
+vhost_vdpa_reset_device(void *dev) "dev: %p"
 vhost_vdpa_get_vq_index(void *dev, int idx, int vq_idx) "dev: %p idx: %d vq 
idx: %d"
 vhost_vdpa_set_vring_ready(void *dev) "dev: %p"
 vhost_vdpa_dump_config(void *dev, const char *line) "dev: %p %s"
-- 
2.31.1

[PATCH for 8.1 v2 5/6] vdpa: move CVQ isolation check to net_init_vhost_vdpa

Evaluating it at start time instead of initialization time may make the
guest capable of dynamically adding or removing migration blockers.

Also, moving to initialization reduces the number of ioctls in the
migration, reducing failure possibilities.

As a drawback we need to check for CVQ isolation twice: one time with no
MQ negotiated and another one acking it, as long as the device supports
it.  This is because Vring ASID / group management is based on vq
indexes, but we don't know the index of CVQ before negotiating MQ.

Signed-off-by: Eugenio Pérez 
---
v2: Take out the reset of the device from vhost_vdpa_cvq_is_isolated
---
 net/vhost-vdpa.c | 194 ---
 1 file changed, 151 insertions(+), 43 deletions(-)

diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index 4397c0d4b3..db2c9afcb3 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -43,6 +43,13 @@ typedef struct VhostVDPAState {
 
 /* The device always have SVQ enabled */
 bool always_svq;
+
+/* The device can isolate CVQ in its own ASID if MQ is negotiated */
+bool cvq_isolated_mq;
+
+/* The device can isolate CVQ in its own ASID if MQ is not negotiated */
+bool cvq_isolated;
+
 bool started;
 } VhostVDPAState;
 
@@ -361,15 +368,8 @@ static NetClientInfo net_vhost_vdpa_info = {
 .check_peer_type = vhost_vdpa_check_peer_type,
 };
 
-/**
- * Get vring virtqueue group
- *
- * @device_fd  vdpa device fd
- * @vq_index   Virtqueue index
- *
- * Return -errno in case of error, or vq group if success.
- */
-static int64_t vhost_vdpa_get_vring_group(int device_fd, unsigned vq_index)
+static int64_t vhost_vdpa_get_vring_group(int device_fd, unsigned vq_index,
+  Error **errp)
 {
 struct vhost_vring_state state = {
 .index = vq_index,
@@ -378,8 +378,7 @@ static int64_t vhost_vdpa_get_vring_group(int device_fd, 
unsigned vq_index)
 
 if (unlikely(r < 0)) {
 r = -errno;
-error_report("Cannot get VQ %u group: %s", vq_index,
- g_strerror(errno));
+error_setg_errno(errp, errno, "Cannot get VQ %u group", vq_index);
 return r;
 }
 
@@ -479,9 +478,9 @@ static int vhost_vdpa_net_cvq_start(NetClientState *nc)
 {
 VhostVDPAState *s, *s0;
 struct vhost_vdpa *v;
-uint64_t backend_features;
 int64_t cvq_group;
-int cvq_index, r;
+int r;
+Error *err = NULL;
 
 assert(nc->info->type == NET_CLIENT_DRIVER_VHOST_VDPA);
 
@@ -501,42 +500,29 @@ static int vhost_vdpa_net_cvq_start(NetClientState *nc)
 /*
  * If we early return in these cases SVQ will not be enabled. The migration
  * will be blocked as long as vhost-vdpa backends will not offer _F_LOG.
- *
- * Calling VHOST_GET_BACKEND_FEATURES as they are not available in v->dev
- * yet.
  */
-r = ioctl(v->device_fd, VHOST_GET_BACKEND_FEATURES, _features);
-if (unlikely(r < 0)) {
-error_report("Cannot get vdpa backend_features: %s(%d)",
-g_strerror(errno), errno);
-return -1;
-}
-if (!(backend_features & BIT_ULL(VHOST_BACKEND_F_IOTLB_ASID)) ||
-!vhost_vdpa_net_valid_svq_features(v->dev->features, NULL)) {
+if (!vhost_vdpa_net_valid_svq_features(v->dev->features, NULL)) {
 return 0;
 }
 
-/*
- * Check if all the virtqueues of the virtio device are in a different vq
- * than the last vq. VQ group of last group passed in cvq_group.
- */
-cvq_index = v->dev->vq_index_end - 1;
-cvq_group = vhost_vdpa_get_vring_group(v->device_fd, cvq_index);
-if (unlikely(cvq_group < 0)) {
-return cvq_group;
-}
-for (int i = 0; i < cvq_index; ++i) {
-int64_t group = vhost_vdpa_get_vring_group(v->device_fd, i);
-
-if (unlikely(group < 0)) {
-return group;
+if (v->dev->features & BIT_ULL(VIRTIO_NET_F_MQ)) {
+if (!s->cvq_isolated_mq) {
+return 0;
 }
-
-if (group == cvq_group) {
+} else {
+if (!s->cvq_isolated) {
 return 0;
 }
 }
 
+cvq_group = vhost_vdpa_get_vring_group(v->device_fd,
+   v->dev->vq_index_end - 1,
+   );
+if (unlikely(cvq_group < 0)) {
+error_report_err(err);
+return cvq_group;
+}
+
 r = vhost_vdpa_set_address_space_id(v, cvq_group, VHOST_VDPA_NET_CVQ_ASID);
 if (unlikely(r < 0)) {
 return r;
@@ -798,6 +784,116 @@ static const VhostShadowVirtqueueOps 
vhost_vdpa_net_svq_ops = {
 .avail_handler = vhost_vdpa_net_handle_ctrl_avail,
 };
 
+/**
+ * Probe the device to check control virtqueue is isolated.
+ *
+ * @device_fd vhost-vdpa file descriptor
+ * @features features to negotiate
+ * @cvq_index Control vq index
+ *
+ * Returns -1 in case of error, 0 if false and 1 if true
+ */
+static int vhost_vdpa_cvq_is_isolated(int device_fd, uint64_t features,
+

[PATCH for 8.1 v2 4/6] vdpa: return errno in vhost_vdpa_get_vring_group error

We need to tell in the caller, as some errors are expected in a normal
workflow.  In particular, parent drivers in recent kernels with
VHOST_BACKEND_F_IOTLB_ASID may not support vring groups.  In that case,
-ENOTSUP is returned.

This is the case of vp_vdpa in Linux 6.2.

Next patches in this series will use that information to know if it must
abort or not.  Also, next patches return properly an errp instead of
printing with error_report.

Signed-off-by: Eugenio Pérez 
---
 net/vhost-vdpa.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index 99904a0da7..4397c0d4b3 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -361,6 +361,14 @@ static NetClientInfo net_vhost_vdpa_info = {
 .check_peer_type = vhost_vdpa_check_peer_type,
 };
 
+/**
+ * Get vring virtqueue group
+ *
+ * @device_fd  vdpa device fd
+ * @vq_index   Virtqueue index
+ *
+ * Return -errno in case of error, or vq group if success.
+ */
 static int64_t vhost_vdpa_get_vring_group(int device_fd, unsigned vq_index)
 {
 struct vhost_vring_state state = {
@@ -369,6 +377,7 @@ static int64_t vhost_vdpa_get_vring_group(int device_fd, 
unsigned vq_index)
 int r = ioctl(device_fd, VHOST_VDPA_GET_VRING_GROUP, );
 
 if (unlikely(r < 0)) {
+r = -errno;
 error_report("Cannot get VQ %u group: %s", vq_index,
  g_strerror(errno));
 return r;
-- 
2.31.1

[PATCH for 8.1 v2 3/6] vdpa: add vhost_vdpa_set_dev_features_fd

This allows to set the features of a vhost-vdpa device from external
subsystems like vhost-net.  It is used in subsequent patches to
negotiate features and probe for CVQ ASID isolation.

Reviewed-by: Stefano Garzarella 
Signed-off-by: Eugenio Pérez 
---
 include/hw/virtio/vhost-vdpa.h |  1 +
 hw/virtio/vhost-vdpa.c | 20 +---
 2 files changed, 14 insertions(+), 7 deletions(-)

diff --git a/include/hw/virtio/vhost-vdpa.h b/include/hw/virtio/vhost-vdpa.h
index 28de7da91e..a9cb6f3a32 100644
--- a/include/hw/virtio/vhost-vdpa.h
+++ b/include/hw/virtio/vhost-vdpa.h
@@ -55,6 +55,7 @@ typedef struct vhost_vdpa {
 } VhostVDPA;
 
 void vhost_vdpa_reset_status_fd(int fd);
+int vhost_vdpa_set_dev_features_fd(int fd, uint64_t features);
 int vhost_vdpa_get_iova_range(int fd, struct vhost_vdpa_iova_range 
*iova_range);
 
 int vhost_vdpa_dma_map(struct vhost_vdpa *v, uint32_t asid, hwaddr iova,
diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index 7a2053b8d9..acd5be46a9 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -651,11 +651,22 @@ static int vhost_vdpa_set_mem_table(struct vhost_dev *dev,
 return 0;
 }
 
+int vhost_vdpa_set_dev_features_fd(int fd, uint64_t features)
+{
+int ret;
+
+ret = vhost_vdpa_call_fd(fd, VHOST_SET_FEATURES, );
+if (ret) {
+return ret;
+}
+
+return vhost_vdpa_add_status_fd(fd, VIRTIO_CONFIG_S_FEATURES_OK);
+}
+
 static int vhost_vdpa_set_features(struct vhost_dev *dev,
uint64_t features)
 {
 struct vhost_vdpa *v = dev->opaque;
-int ret;
 
 if (!vhost_vdpa_first_dev(dev)) {
 return 0;
@@ -678,12 +689,7 @@ static int vhost_vdpa_set_features(struct vhost_dev *dev,
 }
 
 trace_vhost_vdpa_set_features(dev, features);
-ret = vhost_vdpa_call(dev, VHOST_SET_FEATURES, );
-if (ret) {
-return ret;
-}
-
-return vhost_vdpa_add_status(dev, VIRTIO_CONFIG_S_FEATURES_OK);
+return vhost_vdpa_set_dev_features_fd(vhost_vdpa_dev_fd(dev), features);
 }
 
 static int vhost_vdpa_set_backend_cap(struct vhost_dev *dev)
-- 
2.31.1

[PATCH for 8.1 v2 0/6] Move ASID test to vhost-vdpa net initialization

QEMU v8.0.0-rc0 is able to switch dynamically between vhost-vdpa passthrough
and SVQ mode as long as the net device does not have CVQ.  The net device
state followed (and migrated) by CVQ requires special care.

A pre-requisite to add CVQ to that framework is to determine if devices with
CVQ are migratable or not at initialization time.  The solution to it is to
always shadow only CVQ, and vq groups and ASID are used for that.

However, current qemu version only checks ASID at device start (as "driver set
DRIVER_OK status bit"), not at device initialization.  A check at
initialization time is required.  Otherwise, the guest would be able to set
and remove migration blockers at will [1].

This series is a requisite for migration of vhost-vdpa net devices with CVQ.
However it already makes sense by its own, as it reduces the number of ioctls
at migration time, decreasing the error paths there.

[1] 
https://lore.kernel.org/qemu-devel/2616f0cd-f9e8-d183-ea78-db1be4825...@redhat.com/
---
v2:
* Take out the reset of the device from vhost_vdpa_cvq_is_isolated
  (reported by Lei Yang).
* Expand patch messages by Stefano G. questions.

Eugenio Pérez (6):
  vdpa: Remove status in reset tracing
  vdpa: add vhost_vdpa_reset_status_fd
  vdpa: add vhost_vdpa_set_dev_features_fd
  vdpa: return errno in vhost_vdpa_get_vring_group error
  vdpa: move CVQ isolation check to net_init_vhost_vdpa
  vdpa: Cache cvq group in VhostVDPAState

 include/hw/virtio/vhost-vdpa.h |   2 +
 hw/virtio/vhost-vdpa.c |  78 -
 net/vhost-vdpa.c   | 199 +++--
 hw/virtio/trace-events |   2 +-
 4 files changed, 221 insertions(+), 60 deletions(-)

-- 
2.31.1

[PATCH for 8.1 v2 2/6] vdpa: add vhost_vdpa_reset_status_fd

This allows to reset a vhost-vdpa device from external subsystems like
vhost-net, since it does not have any struct vhost_dev by the time we
need to use it.

It is used in subsequent patches to negotiate features
and probe for CVQ ASID isolation.

Signed-off-by: Eugenio Pérez 
---
 include/hw/virtio/vhost-vdpa.h |  1 +
 hw/virtio/vhost-vdpa.c | 58 +++---
 2 files changed, 41 insertions(+), 18 deletions(-)

diff --git a/include/hw/virtio/vhost-vdpa.h b/include/hw/virtio/vhost-vdpa.h
index c278a2a8de..28de7da91e 100644
--- a/include/hw/virtio/vhost-vdpa.h
+++ b/include/hw/virtio/vhost-vdpa.h
@@ -54,6 +54,7 @@ typedef struct vhost_vdpa {
 VhostVDPAHostNotifier notifier[VIRTIO_QUEUE_MAX];
 } VhostVDPA;
 
+void vhost_vdpa_reset_status_fd(int fd);
 int vhost_vdpa_get_iova_range(int fd, struct vhost_vdpa_iova_range 
*iova_range);
 
 int vhost_vdpa_dma_map(struct vhost_vdpa *v, uint32_t asid, hwaddr iova,
diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index bbabea18f3..7a2053b8d9 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -335,38 +335,45 @@ static const MemoryListener vhost_vdpa_memory_listener = {
 .region_del = vhost_vdpa_listener_region_del,
 };
 
-static int vhost_vdpa_call(struct vhost_dev *dev, unsigned long int request,
- void *arg)
+static int vhost_vdpa_dev_fd(const struct vhost_dev *dev)
 {
 struct vhost_vdpa *v = dev->opaque;
-int fd = v->device_fd;
-int ret;
 
 assert(dev->vhost_ops->backend_type == VHOST_BACKEND_TYPE_VDPA);
+return v->device_fd;
+}
+
+static int vhost_vdpa_call_fd(int fd, unsigned long int request, void *arg)
+{
+int ret = ioctl(fd, request, arg);
 
-ret = ioctl(fd, request, arg);
 return ret < 0 ? -errno : ret;
 }
 
-static int vhost_vdpa_add_status(struct vhost_dev *dev, uint8_t status)
+static int vhost_vdpa_call(struct vhost_dev *dev, unsigned long int request,
+   void *arg)
+{
+return vhost_vdpa_call_fd(vhost_vdpa_dev_fd(dev), request, arg);
+}
+
+static int vhost_vdpa_add_status_fd(int fd, uint8_t status)
 {
 uint8_t s;
 int ret;
 
-trace_vhost_vdpa_add_status(dev, status);
-ret = vhost_vdpa_call(dev, VHOST_VDPA_GET_STATUS, );
+ret = vhost_vdpa_call_fd(fd, VHOST_VDPA_GET_STATUS, );
 if (ret < 0) {
 return ret;
 }
 
 s |= status;
 
-ret = vhost_vdpa_call(dev, VHOST_VDPA_SET_STATUS, );
+ret = vhost_vdpa_call_fd(fd, VHOST_VDPA_SET_STATUS, );
 if (ret < 0) {
 return ret;
 }
 
-ret = vhost_vdpa_call(dev, VHOST_VDPA_GET_STATUS, );
+ret = vhost_vdpa_call_fd(fd, VHOST_VDPA_GET_STATUS, );
 if (ret < 0) {
 return ret;
 }
@@ -378,6 +385,12 @@ static int vhost_vdpa_add_status(struct vhost_dev *dev, 
uint8_t status)
 return 0;
 }
 
+static int vhost_vdpa_add_status(struct vhost_dev *dev, uint8_t status)
+{
+trace_vhost_vdpa_add_status(dev, status);
+return vhost_vdpa_add_status_fd(vhost_vdpa_dev_fd(dev), status);
+}
+
 int vhost_vdpa_get_iova_range(int fd, struct vhost_vdpa_iova_range *iova_range)
 {
 int ret = ioctl(fd, VHOST_VDPA_GET_IOVA_RANGE, iova_range);
@@ -709,16 +722,20 @@ static int vhost_vdpa_get_device_id(struct vhost_dev *dev,
 return ret;
 }
 
+static int vhost_vdpa_reset_device_fd(int fd)
+{
+uint8_t status = 0;
+
+return vhost_vdpa_call_fd(fd, VHOST_VDPA_SET_STATUS, );
+}
+
 static int vhost_vdpa_reset_device(struct vhost_dev *dev)
 {
 struct vhost_vdpa *v = dev->opaque;
-int ret;
-uint8_t status = 0;
 
-ret = vhost_vdpa_call(dev, VHOST_VDPA_SET_STATUS, );
-trace_vhost_vdpa_reset_device(dev);
 v->suspended = false;
-return ret;
+trace_vhost_vdpa_reset_device(dev);
+return vhost_vdpa_reset_device_fd(vhost_vdpa_dev_fd(dev));
 }
 
 static int vhost_vdpa_get_vq_index(struct vhost_dev *dev, int idx)
@@ -1170,6 +1187,13 @@ static int vhost_vdpa_dev_start(struct vhost_dev *dev, 
bool started)
 return 0;
 }
 
+void vhost_vdpa_reset_status_fd(int fd)
+{
+vhost_vdpa_reset_device_fd(fd);
+vhost_vdpa_add_status_fd(fd, VIRTIO_CONFIG_S_ACKNOWLEDGE |
+ VIRTIO_CONFIG_S_DRIVER);
+}
+
 static void vhost_vdpa_reset_status(struct vhost_dev *dev)
 {
 struct vhost_vdpa *v = dev->opaque;
@@ -1178,9 +1202,7 @@ static void vhost_vdpa_reset_status(struct vhost_dev *dev)
 return;
 }
 
-vhost_vdpa_reset_device(dev);
-vhost_vdpa_add_status(dev, VIRTIO_CONFIG_S_ACKNOWLEDGE |
-   VIRTIO_CONFIG_S_DRIVER);
+vhost_vdpa_reset_status_fd(vhost_vdpa_dev_fd(dev));
 memory_listener_unregister(>listener);
 }
 
-- 
2.31.1

Re: [PATCH v17 6/8] qemu-iotests: test new zone operations

On Thu, Mar 23, 2023 at 01:08:32PM +0800, Sam Li wrote:
> The new block layer APIs of zoned block devices can be tested by:
> $ tests/qemu-iotests/check zoned
> Run each zone operation on a newly created null_blk device
> and see whether it outputs the same zone information.
> 
> Signed-off-by: Sam Li 
> Reviewed-by: Stefan Hajnoczi 
> ---
>  tests/qemu-iotests/tests/zoned | 86 ++
>  tests/qemu-iotests/tests/zoned.out | 53 ++
>  2 files changed, 139 insertions(+)
>  create mode 100755 tests/qemu-iotests/tests/zoned
>  create mode 100644 tests/qemu-iotests/tests/zoned.out
> 
> diff --git a/tests/qemu-iotests/tests/zoned b/tests/qemu-iotests/tests/zoned
> new file mode 100755
> index 00..53097e44d9
> --- /dev/null
> +++ b/tests/qemu-iotests/tests/zoned
> @@ -0,0 +1,86 @@
> +#!/usr/bin/env bash
> +#
> +# Test zone management operations.
> +#
> +
> +seq="$(basename $0)"
> +echo "QA output created by $seq"
> +status=1 # failure is the default!
> +
> +_cleanup()
> +{
> +  _cleanup_test_img
> +  sudo rmmod null_blk
> +}
> +trap "_cleanup; exit \$status" 0 1 2 3 15
> +
> +# get standard environment, filters and checks
> +. ../common.rc
> +. ../common.filter
> +. ../common.qemu
> +
> +# This test only runs on Linux hosts with raw image files.
> +_supported_fmt raw
> +_supported_proto file
> +_supported_os Linux
> +
> +IMG="--image-opts -n driver=host_device,filename=/dev/nullb0"
> +QEMU_IO_OPTIONS=$QEMU_IO_OPTIONS_NO_FMT
> +
> +echo "Testing a null_blk device:"
> +echo "case 1: if the operations work"
> +sudo modprobe null_blk nr_devices=1 zoned=1

I took a look at how existing qemu-iotests use sudo. The run it in
non-interactive mode and skip the test if sudo is unavailable.

Please do something like this to check for sudo support:

  sudo -n true || _notrun 'Password-less sudo required'

Then always use "sudo -n ...".


> +sudo chmod 0666 /dev/nullb0
> +
> +echo "(1) report the first zone:"
> +$QEMU_IO $IMG -c "zrp 0 1"
> +echo
> +echo "report the first 10 zones"
> +$QEMU_IO $IMG -c "zrp 0 10"
> +echo
> +echo "report the last zone:"
> +$QEMU_IO $IMG -c "zrp 0x3e7000 2" # 0x3e7000 / 512 = 0x1f38
> +echo
> +echo
> +echo "(2) opening the first zone"
> +$QEMU_IO $IMG -c "zo 0 268435456"  # 268435456 / 512 = 524288
> +echo "report after:"
> +$QEMU_IO $IMG -c "zrp 0 1"
> +echo
> +echo "opening the second zone"
> +$QEMU_IO $IMG -c "zo 268435456 268435456" #
> +echo "report after:"
> +$QEMU_IO $IMG -c "zrp 268435456 1"
> +echo
> +echo "opening the last zone"
> +$QEMU_IO $IMG -c "zo 0x3e7000 268435456"
> +echo "report after:"
> +$QEMU_IO $IMG -c "zrp 0x3e7000 2"
> +echo
> +echo
> +echo "(3) closing the first zone"
> +$QEMU_IO $IMG -c "zc 0 268435456"
> +echo "report after:"
> +$QEMU_IO $IMG -c "zrp 0 1"
> +echo
> +echo "closing the last zone"
> +$QEMU_IO $IMG -c "zc 0x3e7000 268435456"
> +echo "report after:"
> +$QEMU_IO $IMG -c "zrp 0x3e7000 2"
> +echo
> +echo
> +echo "(4) finishing the second zone"
> +$QEMU_IO $IMG -c "zf 268435456 268435456"
> +echo "After finishing a zone:"
> +$QEMU_IO $IMG -c "zrp 268435456 1"
> +echo
> +echo
> +echo "(5) resetting the second zone"
> +$QEMU_IO $IMG -c "zrs 268435456 268435456"
> +echo "After resetting a zone:"
> +$QEMU_IO $IMG -c "zrp 268435456 1"
> +
> +# success, all done
> +echo "*** done"
> +rm -f $seq.full
> +status=0
> diff --git a/tests/qemu-iotests/tests/zoned.out 
> b/tests/qemu-iotests/tests/zoned.out
> new file mode 100644
> index 00..b2d061da49
> --- /dev/null
> +++ b/tests/qemu-iotests/tests/zoned.out
> @@ -0,0 +1,53 @@
> +QA output created by zoned
> +Testing a null_blk device:
> +case 1: if the operations work
> +(1) report the first zone:
> +start: 0x0, len 0x8, cap 0x8, wptr 0x0, zcond:1, [type: 2]
> +
> +report the first 10 zones
> +start: 0x0, len 0x8, cap 0x8, wptr 0x0, zcond:1, [type: 2]
> +start: 0x8, len 0x8, cap 0x8, wptr 0x8, zcond:1, [type: 2]
> +start: 0x10, len 0x8, cap 0x8, wptr 0x10, zcond:1, [type: 2]
> +start: 0x18, len 0x8, cap 0x8, wptr 0x18, zcond:1, [type: 2]
> +start: 0x20, len 0x8, cap 0x8, wptr 0x20, zcond:1, [type: 2]
> +start: 0x28, len 0x8, cap 0x8, wptr 0x28, zcond:1, [type: 2]
> +start: 0x30, len 0x8, cap 0x8, wptr 0x30, zcond:1, [type: 2]
> +start: 0x38, len 0x8, cap 0x8, wptr 0x38, zcond:1, [type: 2]
> +start: 0x40, len 0x8, cap 0x8, wptr 0x40, zcond:1, [type: 2]
> +start: 0x48, len 0x8, cap 0x8, wptr 0x48, zcond:1, [type: 2]
> +
> +report the last zone:
> +start: 0x1f38, len 0x8, cap 0x8, wptr 0x1f38, zcond:1, 
> [type: 2]
> +
> +
> +(2) opening the first zone
> +report after:
> +start: 0x0, len 0x8, cap 0x8, wptr 0x0, zcond:3, [type: 2]
> +
> +opening the second zone
> +report after:
> +start: 0x8, len 0x8, cap 0x8, wptr 0x8, zcond:3, [type: 2]
> +

Re: [PATCH v8 0/4] Add zoned storage emulation to virtio-blk driver

On Thu, Mar 23, 2023 at 01:28:24PM +0800, Sam Li wrote:
> This patch adds zoned storage emulation to the virtio-blk driver.
> 
> The patch implements the virtio-blk ZBD support standardization that is
> recently accepted by virtio-spec. The link to related commit is at
> 
> https://github.com/oasis-tcs/virtio-spec/commit/b4e8efa0fa6c8d844328090ad15db65af8d7d981
> 
> The Linux zoned device code that implemented by Dmitry Fomichev has been
> released at the latest Linux version v6.3-rc1.
> 
> Aside: adding zoned=on alike options to virtio-blk device will be
> considered in following-up plan.
> 
> v7:
> - address Stefan's review comments
>   * rm aio_context_acquire/release in handle_req
>   * rename function return type
>   * rename BLOCK_ACCT_APPEND to BLOCK_ACCT_ZONE_APPEND for clarity
> 
> v6:
> - update headers to v6.3-rc1
> 
> v5:
> - address Stefan's review comments
>   * restore the way writing zone append result to buffer
>   * fix error checking case and other errands
> 
> v4:
> - change the way writing zone append request result to buffer
> - change zone state, zone type value of virtio_blk_zone_descriptor
> - add trace events for new zone APIs
> 
> v3:
> - use qemuio_from_buffer to write status bit [Stefan]
> - avoid using req->elem directly [Stefan]
> - fix error checkings and memory leak [Stefan]
> 
> v2:
> - change units of emulated zone op coresponding to block layer APIs
> - modify error checking cases [Stefan, Damien]
> 
> v1:
> - add zoned storage emulation
> 
> Sam Li (4):
>   include: update virtio_blk headers to v6.3-rc1
>   virtio-blk: add zoned storage emulation for zoned devices
>   block: add accounting for zone append operation
>   virtio-blk: add some trace events for zoned emulation
> 
>  block/qapi-sysemu.c  |  11 +
>  block/qapi.c |  18 +
>  hw/block/trace-events|   7 +
>  hw/block/virtio-blk-common.c |   2 +
>  hw/block/virtio-blk.c| 405 +++
>  include/block/accounting.h   |   1 +
>  include/standard-headers/drm/drm_fourcc.h|  12 +
>  include/standard-headers/linux/ethtool.h |  48 ++-
>  include/standard-headers/linux/fuse.h|  45 ++-
>  include/standard-headers/linux/pci_regs.h|   1 +
>  include/standard-headers/linux/vhost_types.h |   2 +
>  include/standard-headers/linux/virtio_blk.h  | 105 +
>  linux-headers/asm-arm64/kvm.h|   1 +
>  linux-headers/asm-x86/kvm.h  |  34 +-
>  linux-headers/linux/kvm.h|   9 +
>  linux-headers/linux/vfio.h   |  15 +-
>  linux-headers/linux/vhost.h  |   8 +
>  qapi/block-core.json |  62 ++-
>  qapi/block.json  |   4 +
>  19 files changed, 769 insertions(+), 21 deletions(-)
> 
> -- 
> 2.39.2
> 

I'll wait for the next version that addresses the comments before
merging, but I'm happy now:

Reviewed-by: Stefan Hajnoczi 


signature.asc
Description: PGP signature

Re: [PATCH v8 0/4] Add zoned storage emulation to virtio-blk driver

On Thu, Mar 23, 2023 at 09:38:03PM +0800, Sam Li wrote:
> Matias Bjørling  于2023年3月23日周四 21:26写道：
> > On 23/03/2023 06.28, Sam Li wrote:
> For the question, this patch is exposing the zoned interface through
> virtio-blk only. It's a good suggestion to put a use case inside
> documentation. I will add it in the subsequent patch.

Regarding the state of other zoned devices:

--device scsi-block should be able to pass through SCSI ZBC devices.

QEMU supports NVMe ZNS emulation for testing, but cannot pass through
zoned devices from the host yet. If you have an NVMe ZNS device you can
use VFIO PCI pass the entire NVMe PCI adapter through to the guest.

Stefan

signature.asc
Description: PGP signature

Re: [PATCH] Change the default for Mixed declarations.

2023-03-23 Thread Daniel P . Berrangé

On Tue, Feb 14, 2023 at 05:07:38PM +0100, Juan Quintela wrote:
> Hi
> 
> I want to enter a discussion about changing the default of the style
> guide.
> 
> There are several reasons for that:
> - they exist since C99 (i.e. all supported compilers support them)
> - they eliminate the posibility of an unitialized variable.

Actually they don't do that reliably. In fact, when combined
with usage of 'goto', they introduce uninitialized variables,
despite the declaration having an initialization present, and
thus actively mislead reviewers into thinking their code is
safe.

Consider this example:

#include 
#include 

void foo(int something) {
  if (something == 8) {
goto cleanup;
  }

  int nitems = 3;
  int *items = malloc(sizeof(int) *nitems);

  printf("Hello world %p\n", items);

 cleanup:
  printf("nitems=%d items=%p\n", nitems, items);
  if (nitems) {
free(items);
  }
}

int main(int argc, char **argv) {
  foo(atoi(argv[1]));
  return 0;
}

Superficially everything looks awesome right - the variables are
all initialized at time of declaration after all.

$ gcc -Wall -o mixed mixed.c

$ ./mixed 3
Hello world 0x18ef2a0
nitems=3 items=0x18ef2a0

$ ./mixed 8
nitems=32677 items=0x7fa5a9209000
free(): invalid pointer
Aborted (core dumped)

What happens is that when you 'goto $LABEL' across a variable
declaration, the variable is in scope at your target label, but
its declared initializers never get run :-(

Luckily you can protect against that with gcc:

$ gcc -Wjump-misses-init -Wall -o mixed mixed.c
mixed.c: In function ‘foo’:
mixed.c:7:12: warning: jump skips variable initialization [-Wjump-misses-init]
7 |goto cleanup;
  |^~~~
mixed.c:15:5: note: label ‘cleanup’ defined here
   15 | cleanup:
  | ^~~
mixed.c:11:13: note: ‘items’ declared here
   11 |int *items = malloc(sizeof(int) *nitems);
  | ^
mixed.c:7:12: warning: jump skips variable initialization [-Wjump-misses-init]
7 |goto cleanup;
  |^~~~
mixed.c:15:5: note: label ‘cleanup’ defined here
   15 | cleanup:
  | ^~~
mixed.c:10:12: note: ‘nitems’ declared here
   10 |int nitems = 3;
  |^~

however that will warn about *all* cases where we jump over a
declared variable, even if the variable we're jumping over is
not used at the target label location. IOW, it has significant
false positive rates. There are quite a few triggers for this
in the QEMU code already if we turn on this warning.

It also doesn't alter that the code initialization is misleading
to read.

> - (at least for me), declaring the index inside the for make clear
>   that index is not used outside the for.

I'll admit that declaring loop indexes in the for() is a nice
bit, but I'm not a fan in general of mixing the declarations
in the middle of code for projects that use the 'goto cleanup'
pattern.

> - Current documentation already declares that they are allowed in some
>   cases.
> - Lots of places already use them.
> 
> We can change the text to whatever you want, just wondering if it is
> valib to change the standard.
> 
> Doing a trivial grep through my local qemu messages (around 100k) it
> shows that some people are complaining that they are not allowed, and
> other saying that they are used all over the place.

IMHO the status quo is bad because it is actively dangerous when
combined with goto and we aren't using any compiler warnings to
help us.

Either we allow it, but use -Wjump-misses-init to prevent mixing
delayed declarations with gotos, and just avoid this when it triggers
a false positive.

Or we forbid it, rewrite current cases that use it, and then add
-Wdeclaration-after-statement to enforce it.

IMHO if we are concerned about uninitialized variables then I think
a better approach is to add -ftrivial-auto-var-init=zero, which will
make the compiler initialize all variables to 0 if they lack an
explicit initializer. 

> Discuss.
> ---
>  docs/devel/style.rst | 23 ++-
>  1 file changed, 14 insertions(+), 9 deletions(-)
> 
> diff --git a/docs/devel/style.rst b/docs/devel/style.rst
> index 68aa776930..dc248aa9e4 100644
> --- a/docs/devel/style.rst
> +++ b/docs/devel/style.rst
> @@ -202,15 +202,20 @@ Furthermore, it is the QEMU coding style.
>  Declarations
>  
>  
> -Mixed declarations (interleaving statements and declarations within
> -blocks) are generally not allowed; declarations should be at the beginning
> -of blocks.
> -
> -Every now and then, an exception is made for declarations inside a
> -#ifdef or #ifndef block: if the code looks nicer, such declarations can
> -be placed at the top of the block even if there are statements above.
> -On the other hand, however, it's often best to move that #ifdef/#ifndef
> -block to a separate function altogether.
> +Declaring variables at first use has two advantages:
> +- we can see the right type of the variable just to the

[PATCH 2/2] virtio-scsi: stop using aio_disable_external() during unplug

This patch is part of an effort to remove the aio_disable_external()
API because it does not fit in a multi-queue block layer world where
many AioContexts may be submitting requests to the same disk.

The SCSI emulation code is already in good shape to stop using
aio_disable_external(). The API is only used by commit 9c5aad84da1c
("virtio-scsi: fixed virtio_scsi_ctx_check failed when detaching scsi
disk") to ensure that virtio_scsi_hotunplug() works while the guest
driver is submitting I/O.

Ensure virtio_scsi_hotunplug() is safe as follows:

1. qdev_simple_device_unplug_cb() -> qdev_unrealize() ->
   device_set_realized() calls qatomic_set(>realized, false) so
   that future scsi_device_get() calls return NULL because they exclude
   SCSIDevices with realized=false.

   That means virtio-scsi will reject new I/O requests to this
   SCSIDevice with VIRTIO_SCSI_S_BAD_TARGET even while
   virtio_scsi_hotunplug() is still executing. We are protected against
   new requests!

2. Add a call to scsi_device_purge_requests() from scsi_unrealize() so
   that in-flight requests are cancelled synchronously. This ensures
   that no in-flight requests remain once qdev_simple_device_unplug_cb()
   returns.

Thanks to these two conditions we don't need aio_disable_external()
anymore.

Cc: Zhengui Li 
Signed-off-by: Stefan Hajnoczi 
---
 hw/scsi/scsi-disk.c   | 1 +
 hw/scsi/virtio-scsi.c | 3 ---
 2 files changed, 1 insertion(+), 3 deletions(-)

diff --git a/hw/scsi/scsi-disk.c b/hw/scsi/scsi-disk.c
index 97c9b1c8cd..e01bd84541 100644
--- a/hw/scsi/scsi-disk.c
+++ b/hw/scsi/scsi-disk.c
@@ -2522,6 +2522,7 @@ static void scsi_realize(SCSIDevice *dev, Error **errp)
 
 static void scsi_unrealize(SCSIDevice *dev)
 {
+scsi_device_purge_requests(dev, SENSE_CODE(RESET));
 del_boot_device_lchs(>qdev, NULL);
 }
 
diff --git a/hw/scsi/virtio-scsi.c b/hw/scsi/virtio-scsi.c
index 000961446c..a02f9233ec 100644
--- a/hw/scsi/virtio-scsi.c
+++ b/hw/scsi/virtio-scsi.c
@@ -1061,11 +1061,8 @@ static void virtio_scsi_hotunplug(HotplugHandler 
*hotplug_dev, DeviceState *dev,
 VirtIODevice *vdev = VIRTIO_DEVICE(hotplug_dev);
 VirtIOSCSI *s = VIRTIO_SCSI(vdev);
 SCSIDevice *sd = SCSI_DEVICE(dev);
-AioContext *ctx = s->ctx ?: qemu_get_aio_context();
 
-aio_disable_external(ctx);
 qdev_simple_device_unplug_cb(hotplug_dev, dev, errp);
-aio_enable_external(ctx);
 
 if (s->ctx) {
 virtio_scsi_acquire(s);
-- 
2.39.2

[PATCH 0/2] virtio-scsi: stop using aio_disable_external() during unplug

The aio_disable_external() API is a solution for stopping I/O during critical
sections. The newer BlockDevOps->drained_begin/end/poll() callbacks offer a
cleaner solution that supports the upcoming multi-queue block layer. This
series removes aio_disable_external() from the virtio-scsi emulation code.

Patch 1 is a fix for something I noticed when reading the code.

Patch 2 replaces aio_disable_external() with functionality for safe hot unplug
that's mostly already present in the SCSI emulation code.

Stefan Hajnoczi (2):
  virtio-scsi: avoid race between unplug and transport event
  virtio-scsi: stop using aio_disable_external() during unplug

 hw/scsi/scsi-bus.c|  3 ++-
 hw/scsi/scsi-disk.c   |  1 +
 hw/scsi/virtio-scsi.c | 21 +
 3 files changed, 12 insertions(+), 13 deletions(-)

-- 
2.39.2

[PATCH 1/2] virtio-scsi: avoid race between unplug and transport event

Only report a transport reset event to the guest after the SCSIDevice
has been unrealized by qdev_simple_device_unplug_cb().

qdev_simple_device_unplug_cb() sets the SCSIDevice's qdev.realized field
to false so that scsi_device_find/get() no longer see it.

scsi_target_emulate_report_luns() also needs to be updated to filter out
SCSIDevices that are unrealized.

These changes ensure that the guest driver does not see the SCSIDevice
that's being unplugged if it responds very quickly to the transport
reset event.

I noticed this issue when reading the code and have not reproduced it.

Signed-off-by: Stefan Hajnoczi 
---
 hw/scsi/scsi-bus.c|  3 ++-
 hw/scsi/virtio-scsi.c | 18 +-
 2 files changed, 11 insertions(+), 10 deletions(-)

diff --git a/hw/scsi/scsi-bus.c b/hw/scsi/scsi-bus.c
index c97176110c..f9bd064833 100644
--- a/hw/scsi/scsi-bus.c
+++ b/hw/scsi/scsi-bus.c
@@ -487,7 +487,8 @@ static bool scsi_target_emulate_report_luns(SCSITargetReq 
*r)
 DeviceState *qdev = kid->child;
 SCSIDevice *dev = SCSI_DEVICE(qdev);
 
-if (dev->channel == channel && dev->id == id && dev->lun != 0) {
+if (dev->channel == channel && dev->id == id && dev->lun != 0 &&
+qatomic_load_acquire(>qdev.realized)) {
 store_lun(tmp, dev->lun);
 g_byte_array_append(buf, tmp, 8);
 len += 8;
diff --git a/hw/scsi/virtio-scsi.c b/hw/scsi/virtio-scsi.c
index 612c525d9d..000961446c 100644
--- a/hw/scsi/virtio-scsi.c
+++ b/hw/scsi/virtio-scsi.c
@@ -1063,15 +1063,6 @@ static void virtio_scsi_hotunplug(HotplugHandler 
*hotplug_dev, DeviceState *dev,
 SCSIDevice *sd = SCSI_DEVICE(dev);
 AioContext *ctx = s->ctx ?: qemu_get_aio_context();
 
-if (virtio_vdev_has_feature(vdev, VIRTIO_SCSI_F_HOTPLUG)) {
-virtio_scsi_acquire(s);
-virtio_scsi_push_event(s, sd,
-   VIRTIO_SCSI_T_TRANSPORT_RESET,
-   VIRTIO_SCSI_EVT_RESET_REMOVED);
-scsi_bus_set_ua(>bus, SENSE_CODE(REPORTED_LUNS_CHANGED));
-virtio_scsi_release(s);
-}
-
 aio_disable_external(ctx);
 qdev_simple_device_unplug_cb(hotplug_dev, dev, errp);
 aio_enable_external(ctx);
@@ -1082,6 +1073,15 @@ static void virtio_scsi_hotunplug(HotplugHandler 
*hotplug_dev, DeviceState *dev,
 blk_set_aio_context(sd->conf.blk, qemu_get_aio_context(), NULL);
 virtio_scsi_release(s);
 }
+
+if (virtio_vdev_has_feature(vdev, VIRTIO_SCSI_F_HOTPLUG)) {
+virtio_scsi_acquire(s);
+virtio_scsi_push_event(s, sd,
+   VIRTIO_SCSI_T_TRANSPORT_RESET,
+   VIRTIO_SCSI_EVT_RESET_REMOVED);
+scsi_bus_set_ua(>bus, SENSE_CODE(REPORTED_LUNS_CHANGED));
+virtio_scsi_release(s);
+}
 }
 
 static struct SCSIBusInfo virtio_scsi_scsi_info = {
-- 
2.39.2

RE: [PATCH] Hexagon (translate.c): avoid redundant PC updates on COF

2023-03-23 Thread Taylor Simpson




> -Original Message-
> From: Matheus Tavares Bernardino 
> Sent: Wednesday, March 22, 2023 3:17 PM
> To: qemu-devel@nongnu.org
> Cc: Taylor Simpson ; richard.hender...@linaro.org;
> a...@rev.ng
> Subject: [PATCH] Hexagon (translate.c): avoid redundant PC updates on COF
> 
> When there is a conditional change of flow or an endloop instruction, we
> preload HEX_REG_PC with ctx->next_PC at gen_start_packet().
> Nonetheless, we still generate TCG code to do this update again at
> gen_goto_tb() when the condition for the COF is not met, thus producing
> redundant instructions. This can be seen with the following packet:
> 
>  0x004002e4:  0x5c20d000 {   if (!P0) jump:t PC+0 }
> 
> Which generates this TCG code:
> 
> 004002e4
> -> mov_i32 pc,$0x4002e8
>and_i32 loc9,p0,$0x1
>mov_i32 branch_taken,loc9
>add_i32 pkt_cnt,pkt_cnt,$0x2
>add_i32 insn_cnt,insn_cnt,$0x2
>brcond_i32 branch_taken,$0x0,ne,$L1
>goto_tb $0x0
>mov_i32 pc,$0x4002e4
>exit_tb $0x7fb0c36e5200
>set_label $L1
>goto_tb $0x1
> -> mov_i32 pc,$0x4002e8
>exit_tb $0x7fb0c36e5201
>set_label $L0
>exit_tb $0x7fb0c36e5203
> 
> Note that even after optimizations, the redundant PC update is still
> present:
> 
> 004002e4
> -> mov_i32 pc,$0x4002e8 sync: 0  dead: 0 1  pref=0x
>mov_i32 branch_taken,$0x1sync: 0  dead: 0 1  pref=0x
>add_i32 pkt_cnt,pkt_cnt,$0x2 sync: 0  dead: 0 1  pref=0x
>add_i32 insn_cnt,insn_cnt,$0x2   sync: 0  dead: 0 1 2  pref=0x
>goto_tb $0x1
> -> mov_i32 pc,$0x4002e8 sync: 0  dead: 0 1  pref=0x
>exit_tb $0x7fb0c36e5201
>set_label $L0
>exit_tb $0x7fb0c36e5203
> 
> With this patch, the second redundant update is properly discarded.
> 
> Note that we need the additional "move_to_pc" flag instead of just avoiding
> the update whenever `dest == ctx->next_PC`, as that could potentially skip
> updates from a COF with met condition, whose
> ctx->branch_dest just happens to be equal to ctx->next_PC.
> 
> Signed-off-by: Matheus Tavares Bernardino 
> ---
>  target/hexagon/translate.c | 21 +
>  1 file changed, 13 insertions(+), 8 deletions(-)

Reviewed-by: Taylor Simpson

Re: [PATCH] vfio/pci: add support for VF token

2023-03-23 Thread Alex Williamson

On Thu, 23 Mar 2023 06:19:45 +0900
Minwoo Im  wrote:

> > On Mon, 20 Mar 2023 11:03:40 +0100
> > Cédric Le Goater  wrote:
> >   
> > > On 3/20/23 08:35, Minwoo Im wrote:  
> > > > VF token was introduced [1] to kernel vfio-pci along with SR-IOV
> > > > support [2].  This patch adds support VF token among PF and VF(s). To
> > > > passthu PCIe VF to a VM, kernel >= v5.7 needs this.
> > > >
> > > > It can be configured with UUID like:
> > > >
> > > >-device vfio-pci,host=:BB:DD:F,vf-token=,...
> > > >
> > > > [1] https://lore.kernel.org/linux-  
> > pci/158396393244.5601.10297430724964025753.st...@gimli.home/  
> > > > [2] https://lore.kernel.org/linux-  
> > pci/158396044753.5601.14804870681174789709.st...@gimli.home/  
> > > >
> > > > Cc: Alex Williamson 
> > > > Signed-off-by: Minwoo Im 
> > > > Reviewed-by: Klaus Jensen 
> > > > ---
> > > >   hw/vfio/pci.c | 13 -
> > > >   hw/vfio/pci.h |  1 +
> > > >   2 files changed, 13 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
> > > > index ec9a854361..cf27f28936 100644
> > > > --- a/hw/vfio/pci.c
> > > > +++ b/hw/vfio/pci.c
> > > > @@ -2856,6 +2856,8 @@ static void vfio_realize(PCIDevice *pdev, Error 
> > > > **errp)
> > > >   int groupid;
> > > >   int i, ret;
> > > >   bool is_mdev;
> > > > +char uuid[UUID_FMT_LEN];
> > > > +char *name;
> > > >
> > > >   if (!vbasedev->sysfsdev) {
> > > >   if (!(~vdev->host.domain || ~vdev->host.bus ||
> > > > @@ -2936,7 +2938,15 @@ static void vfio_realize(PCIDevice *pdev, Error 
> > > > **errp)
> > > >   goto error;
> > > >   }
> > > >
> > > > -ret = vfio_get_device(group, vbasedev->name, vbasedev, errp);
> > > > +if (!qemu_uuid_is_null(>vf_token)) {
> > > > +qemu_uuid_unparse(>vf_token, uuid);
> > > > +name = g_strdup_printf("%s vf_token=%s", vbasedev->name, uuid);
> > > > +} else {
> > > > +name = vbasedev->name;
> > > > +}
> > > > +
> > > > +ret = vfio_get_device(group, name, vbasedev, errp);
> > > > +g_free(name);
> > > >   if (ret) {
> > > >   vfio_put_group(group);
> > > >   goto error;  
> > >
> > > Shouldn't we set the VF token in the kernel also ? See this QEMU 
> > > implementation
> > >
> > >https://lore.kernel.org/lkml/20200204161737.34696...@w520.home/
> > >
> > > May be I misunderstood.
> > >  
> > 
> > I think you're referring to the part there that calls
> > VFIO_DEVICE_FEATURE in order to set a VF token.  I don't think that's
> > necessarily applicable here.  I believe this patch is only trying to
> > make it so that QEMU can consume a VF associated with a PF owned by a
> > userspace vfio driver, ie. not QEMU.  
> 
> Yes, that's what this patch exactly does.
> 
> > 
> > Setting the VF token is only relevant to PFs, which would require
> > significantly more SR-IOV infrastructure in QEMU than sketched out in
> > that proof-of-concept patch.  Even if we did have a QEMU owned PF where
> > we wanted to generate VFs, the token we use in that case would likely
> > need to be kept private to QEMU, not passed on the command line.
> > Thanks,  
> 
> Can we also take a command line property for the PF for that case that
> QEMU owns a PF?  I think the one who wants to make QEMU owns PF or VF
> should know the VF token.  If I've missed anything, please let me know.

IIRC, the only case where a VF token is required for a PF is if there
are existing VFs in use.  Opening the PF would then require a token
matching the VFs.  In general though, if the PF is owned by QEMU, the
VF token should likely be an internal secret to QEMU.  Configuring the
PF device with a token suggests that VFs could be created and bound to
other userspace drivers outside of the control of the QEMU instance
that owns the PF.  Therefore I would not suggest adding the ability to
set the VF token for a PF without a strong use case in-hand, an
certainly not when QEMU doesn't expose SR-IOV support to be able to
manage VFs itself.  Thanks,

Alex

Re: QMP command dumpdtb crash bug

2023-03-23 Thread Bernhard Beschow




Am 23. März 2023 15:13:28 UTC schrieb Daniel Henrique Barboza 
:
>
>
>On 3/23/23 10:38, Peter Maydell wrote:
>> On Thu, 23 Mar 2023 at 13:29, Markus Armbruster  wrote:
>>> 
>>> Peter, Daniel offers two ways to fix this bug (see below).  Got a
>>> preference?
>> 
>> Not freeing seems the correct thing. As Daniel says, this
>> should have been a prerequisite for implementing the
>> command in the first place (you need to change the lifecycle
>> of the fdt blob from "delete when done with in the arm boot code"
>> to "delete on machine finalize"). It looks like somehow we added
>> the command but missed out on getting all of the prerequisite
>> patches in. (File under "need to be cautious about applying partial
>> patchsets", I guess.)
>
>Yeah, I'm at fault here. I should've been more insistent about acking
>the ARM patch. All other patches that we left behind was optional, meaning
>that the machine wouldn't implement the command but nothing bad would happen,
>but the ARM patch was kind of mandatory because arm_load_dtb() is
>freeing ms->fdt without assigning it to NULL.
>
>> 
>> Did anything else from that initial patchset get omitted?
>
>Searching the ML I see that I sent a message saying that I pushed patches 1,
>6 and 8-15 via ppc-next. This means that these patches got left behind:
>
> 2  hw/core: free ms->fdt in machine_finalize()
> 3  hw/arm: do not free machine->fdt in arm_load_dtb()
> 4  hw/mips: set machine->fdt in boston_mach_init()
> 5  hw/microblaze: set machine->fdt in microblaze_load_dtb()
> 7  hw/ppc: set machine->fdt in ppce500_load_device_tree()

We dealt with e500 in a different series. So 7 is basically in 8.0 already (but 
in a different form).

Best regards,
Bernhard

>15  hw/xtensa: set machine->fdt in xtfpga_init()
>
>
>Patch 2 was suggested by Phil and changes the common code to free ms->fdt
>during machine_finalize(). Can be re-sent I guess.
>
>All other patches, aside from patch 3 from ARM, are optional because the
>machine isn't freeing ms->fdt or anything like that.
>
>
>I'll rebase and re-sent patch 3 as a bug fix. I'll re-sent the hw/core patch
>as well for 8.1.
>
>
>Daniel
>
>
>> 
>> thanks
>> -- PMM
>

Re: [PATCH] qtests: avoid printing comments before g_test_init()

2023-03-23 Thread Juan Quintela

Daniel P. Berrangé  wrote:
> The TAP protocol version line must be the first thing printed on
> stdout. The migration test failed that requirement in certain
> scenarios:
>
>   # Skipping test: Userfault not available (builtdtime)
>   TAP version 13
>   # random seed: R02Sc120c807f11053eb90bfea845ba1e368
>   1..32
>   # Start of x86_64 tests
>   # Start of migration tests
>   
>
> The TAP version is printed by g_test_init(), so we need to make
> sure that any methods which print are run after that.
>
> Signed-off-by: Daniel P. Berrangé 
> ---

Reviewed-by: Juan Quintela 

> -const bool has_kvm = qtest_has_accel("kvm");
> -const bool has_uffd = ufd_version_check();
> -const char *arch = qtest_get_arch();
> +bool has_kvm;
> +bool has_uffd;
> +const char *arch;

Why don't you move also the declarations of the variables?
I think that one of the biggest troubles of C is variables that are not
initialized.

All compilers that we support are C99 or later, so we can do that (and
we already do in lot of places.)

And yeap, I know that CodingStyle says otherwise, but I think that what
is wrong is CodingStyle.

https://lists.gnu.org/archive/html/qemu-devel/2023-02/msg03836.html

Later, Juan.

[PATCH] hw/ssi: Fix Linux driver init issue with xilinx_spi

2023-03-23 Thread Chris Rauer

The problem is that the Linux driver expects the master transaction inhibit
bit(R_SPICR_MTI) to be set during driver initialization so that it can
detect the fifo size but QEMU defaults it to zero out of reset.  The
datasheet indicates this bit is active on reset.

See page 25, SPI Control Register section:
https://www.xilinx.com/content/dam/xilinx/support/documents/ip_documentation/axi_quad_spi/v3_2/pg153-axi-quad-spi.pdf

Signed-off-by: Chris Rauer 
---
 hw/ssi/xilinx_spi.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/hw/ssi/xilinx_spi.c b/hw/ssi/xilinx_spi.c
index 552927622f..d4de2e7aab 100644
--- a/hw/ssi/xilinx_spi.c
+++ b/hw/ssi/xilinx_spi.c
@@ -156,6 +156,7 @@ static void xlx_spi_do_reset(XilinxSPI *s)
 txfifo_reset(s);
 
 s->regs[R_SPISSR] = ~0;
+s->regs[R_SPICR] = R_SPICR_MTI;
 xlx_spi_update_irq(s);
 xlx_spi_update_cs(s);
 }
-- 
2.40.0.348.gf938b09366-goog

Re: [PATCH 1/1] hw/arm: do not free machine->fdt in arm_load_dtb()

On Thu, 23 Mar 2023 at 17:54, Daniel Henrique Barboza
 wrote:
>
>
>
> On 3/23/23 14:38, Peter Maydell wrote:
> > On Thu, 23 Mar 2023 at 16:11, Daniel Henrique Barboza
> >  wrote:
> >> -g_free(fdt);
> >> +/* Set ms->fdt for 'dumpdtb' QMP/HMP command */
> >> +ms->fdt = fdt;
> >
> > With this we're now setting ms->fdt twice for the virt board: we set
> > it in create_fdt() in hw/arm/virt.c, and then we set it again here.
> > Which is the right place to set it?
> >
> > Is the QMP 'dumpdtb' command intended to dump the DTB only for
> > board types where the DTB is created at runtime by QEMU? Or
> > is it supposed to also work for DTBs that were originally
> > provided by the user using the '-dtb' command line? The docs
> > don't say. If we want the former, then we should be setting
> > ms->fdt in the board code; if the latter, then here is right.
>
> My original intent with this command was to dump the current state of the FDT,
> regardless of whether the FDT was loaded via -dtb or at runtime.

Mmm. I think that makes sense; we do make a few tweaks to the DTB
even if it was user-provided and you might want to check those
for debug purposes. So we should keep this assignment, and remove
the now-unneeded setting of ms->fdt in create_fdt().

thanks
-- PMM

Re: [PATCH 1/1] hw/arm: do not free machine->fdt in arm_load_dtb()





On 3/23/23 14:38, Peter Maydell wrote:

On Thu, 23 Mar 2023 at 16:11, Daniel Henrique Barboza
 wrote:


At this moment, arm_load_dtb() can free machine->fdt when
binfo->dtb_filename is NULL. If there's no 'dtb_filename', 'fdt' will be
retrieved by binfo->get_dtb(). If get_dtb() returns machine->fdt, as is
the case of machvirt_dtb() from hw/arm/virt.c, fdt now has a pointer to
machine->fdt. And, in that case, the existing g_free(fdt) at the end of
arm_load_dtb() will make machine->fdt point to an invalid memory region.

After the command 'dumpdtb' were introduced a couple of releases ago,
running it with any ARM machine that uses arm_load_dtb() will crash
QEMU.

One alternative would be to mark machine->fdt = NULL when exiting
arm_load_dtb() when freeing the fdt. Another is to not free the fdt and,
instead, update machine->fdt with the new fdt generated. This will
enable dumpdtb for all ARM machines that uses arm_load_dtb(), regardless
of having 'dtb_filename' or not.

Cc: Peter Maydell 
Cc: qemu-...@nongnu.org
Fixes: bf353ad55590f ("qmp/hmp, device_tree.c: introduce dumpdtb")
Reported-by: Markus Armbruster i
Signed-off-by: Daniel Henrique Barboza 
---
  hw/arm/boot.c | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/hw/arm/boot.c b/hw/arm/boot.c
index 50e5141116..9418cc3373 100644
--- a/hw/arm/boot.c
+++ b/hw/arm/boot.c
@@ -689,7 +689,8 @@ int arm_load_dtb(hwaddr addr, const struct arm_boot_info 
*binfo,
  qemu_register_reset_nosnapshotload(qemu_fdt_randomize_seeds,
 rom_ptr_for_as(as, addr, size));

-g_free(fdt);
+/* Set ms->fdt for 'dumpdtb' QMP/HMP command */
+ms->fdt = fdt;


With this we're now setting ms->fdt twice for the virt board: we set
it in create_fdt() in hw/arm/virt.c, and then we set it again here.
Which is the right place to set it?

Is the QMP 'dumpdtb' command intended to dump the DTB only for
board types where the DTB is created at runtime by QEMU? Or
is it supposed to also work for DTBs that were originally
provided by the user using the '-dtb' command line? The docs
don't say. If we want the former, then we should be setting
ms->fdt in the board code; if the latter, then here is right.


My original intent with this command was to dump the current state of the FDT,
regardless of whether the FDT was loaded via -dtb or at runtime.

Ideally it would also reflect hotplug changes that affects the FDT as well, 
although
I'm aware of only one board that does that (ppc64 pseries) and we would need 
some
work done that to update ms->fdt after the hotplug/hotunplug path.

Perhaps a doc path would also be good.


Daniel



thanks
-- PMM

QEMU fortnightly developers call for agenda for 2023-04-04

2023-03-23 Thread Juan Quintela




Hi

Please, send any topic that you are interested in covering.

Topics on the backburner:
- qemu single binary
  On this week call, phillipe said that he wanted to discuss pci irqs
  and default devices?
- The future of icount
  My understanding is that Alex wanted to wait until the openning of 8.1
  to continue discussions/show code.

Anything else?

At the end of Monday I will send an email with the agenda or the
cancellation of the call, so hurry up.

After discussions on the QEMU Summit, we are going to have always open a
QEMU call where you can add topics.

 Call details:

By popular demand, a google calendar public entry with it

  
https://calendar.google.com/calendar/event?action=TEMPLATE=NWR0NWppODdqNXFyYzAwbzYza3RxN2dob3VfMjAyMjEwMThUMTMwMDAwWiBlZ2VkN2NraTA1bG11MXRuZ3ZrbDN0aGlkc0Bn=eged7cki05lmu1tngvkl3thids%40group.calendar.google.com=ALL

(Let me know if you have any problems with the calendar entry.  I just
gave up about getting right at the same time CEST, CET, EDT and DST).

If you need phone number details,  contact me privately

Thanks, Juan.

Re: [PATCH v3 8/8] target/arm: Add CPU property for QARMA3, enable FPACCombined by default


On 3/22/23 13:25, Aaron Lindsay wrote:

Signed-off-by: Aaron Lindsay
---
  target/arm/cpu.h   |  1 +
  target/arm/cpu64.c | 48 +++---
  2 files changed, 34 insertions(+), 15 deletions(-)


Reviewed-by: Richard Henderson 


r~

Re: [PATCH v3 7/8] target/arm: Implement v8.3 FPAC and FPACCOMBINE


On 3/22/23 13:25, Aaron Lindsay wrote:

Signed-off-by: Aaron Lindsay
---
  target/arm/syndrome.h |  7 +++
  target/arm/tcg/pauth_helper.c | 16 
  2 files changed, 23 insertions(+)


Reviewed-by: Richard Henderson 


r~

Re: [PATCH v3 2/8] target/arm: v8.3 PAC ID_AA64ISAR[12] feature-detection


On 3/22/23 13:25, Aaron Lindsay wrote:

+static inline int isar_feature_pauth_get_features(const ARMISARegisters *id)
+{
+if (isar_feature_aa64_pauth_arch_qarma5(id)) {
+return FIELD_EX64(id->id_aa64isar1, ID_AA64ISAR1, APA);
+} else if (isar_feature_aa64_pauth_arch_qarma3(id)) {
+return FIELD_EX64(id->id_aa64isar2, ID_AA64ISAR2, APA3);
+} else {
+return FIELD_EX64(id->id_aa64isar1, ID_AA64ISAR1, API);
+}
+}


Only one of these fields is allowed to be non-zero, so we can avoid the tests and simply 
OR them all together.  With a comment to that effect, of course.


Otherwise,
Reviewed-by: Richard Henderson 


r~

Re: [PATCH 1/1] hw/arm: do not free machine->fdt in arm_load_dtb()

On Thu, 23 Mar 2023 at 16:11, Daniel Henrique Barboza
 wrote:
>
> At this moment, arm_load_dtb() can free machine->fdt when
> binfo->dtb_filename is NULL. If there's no 'dtb_filename', 'fdt' will be
> retrieved by binfo->get_dtb(). If get_dtb() returns machine->fdt, as is
> the case of machvirt_dtb() from hw/arm/virt.c, fdt now has a pointer to
> machine->fdt. And, in that case, the existing g_free(fdt) at the end of
> arm_load_dtb() will make machine->fdt point to an invalid memory region.
>
> After the command 'dumpdtb' were introduced a couple of releases ago,
> running it with any ARM machine that uses arm_load_dtb() will crash
> QEMU.
>
> One alternative would be to mark machine->fdt = NULL when exiting
> arm_load_dtb() when freeing the fdt. Another is to not free the fdt and,
> instead, update machine->fdt with the new fdt generated. This will
> enable dumpdtb for all ARM machines that uses arm_load_dtb(), regardless
> of having 'dtb_filename' or not.
>
> Cc: Peter Maydell 
> Cc: qemu-...@nongnu.org
> Fixes: bf353ad55590f ("qmp/hmp, device_tree.c: introduce dumpdtb")
> Reported-by: Markus Armbruster i
> Signed-off-by: Daniel Henrique Barboza 
> ---
>  hw/arm/boot.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/hw/arm/boot.c b/hw/arm/boot.c
> index 50e5141116..9418cc3373 100644
> --- a/hw/arm/boot.c
> +++ b/hw/arm/boot.c
> @@ -689,7 +689,8 @@ int arm_load_dtb(hwaddr addr, const struct arm_boot_info 
> *binfo,
>  qemu_register_reset_nosnapshotload(qemu_fdt_randomize_seeds,
> rom_ptr_for_as(as, addr, size));
>
> -g_free(fdt);
> +/* Set ms->fdt for 'dumpdtb' QMP/HMP command */
> +ms->fdt = fdt;

With this we're now setting ms->fdt twice for the virt board: we set
it in create_fdt() in hw/arm/virt.c, and then we set it again here.
Which is the right place to set it?

Is the QMP 'dumpdtb' command intended to dump the DTB only for
board types where the DTB is created at runtime by QEMU? Or
is it supposed to also work for DTBs that were originally
provided by the user using the '-dtb' command line? The docs
don't say. If we want the former, then we should be setting
ms->fdt in the board code; if the latter, then here is right.

thanks
-- PMM

Re: [PATCH v3 1/8] target/arm: Add ID_AA64ISAR2_EL1


On 3/22/23 13:25, Aaron Lindsay wrote:

--- a/target/arm/hvf/hvf.c
+++ b/target/arm/hvf/hvf.c
@@ -507,6 +507,7 @@ static bool 
hvf_arm_get_host_cpu_features(ARMHostCPUFeatures *ahcf)
  { HV_SYS_REG_ID_AA64DFR1_EL1, _isar.id_aa64dfr1 },
  { HV_SYS_REG_ID_AA64ISAR0_EL1, _isar.id_aa64isar0 },
  { HV_SYS_REG_ID_AA64ISAR1_EL1, _isar.id_aa64isar1 },
+{ HV_SYS_REG_ID_AA64ISAR2_EL1, _isar.id_aa64isar2 },


Ah, I may have given you a bum steer here.
In MacOSX 12.6, there is no HV_SYS_REG_ID_AA64ISAR2_EL1 enumerator.

Irritatingly, it is an enum not a define we can test via the preprocessor.
In addition, the form of the enum,

HV_SYS_REG_ID_AA64ISAR0_EL1 = 0xc030,
HV_SYS_REG_ID_AA64ISAR1_EL1 = 0xc031,
HV_SYS_REG_ID_AA64MMFR0_EL1 = 0xc038,
HV_SYS_REG_ID_AA64MMFR1_EL1 = 0xc039,

suggests an encoding of the system register number, so we would eventually be able to use 
0xc032.  But I wouldn't want to try that without the interface being properly exposed.  I 
don't know if some later version of MacOS already does so.


Otherwise,
Reviewed-by: Richard Henderson 


r~

Re: [PATCH v3] target/riscv: reduce overhead of MSTATUS_SUM change


On 3/23/23 06:24, Wu, Fei wrote:

There is MSTATUS_MPRV and MSTATUS_MPP kind of thing, priv+sum is not
able to represent all of the status, probably we can just add an extra
'priv' at the back of TB_FLAGS?


MPRV+MPP looks not necessary be in TB_FLAGS, it's just used to calculate
the mmu_idx. Is it necessary to encode SUM into TB_FLAGS?


We need priv separate from midx for priv_level() and similar uses.
We do not want MPRV=1, MPP=U to prevent M-mode from executing M instructions.

With PRIV and MIDX in TB_FLAGS, we do not need to separately encode MPRV+MPP+SUM, because 
all of those are composited into MIDX.



r~

Re: qapi: [RFC] Doc comment convention for @arg: sections

On Thu, 23 Mar 2023 at 14:48, Markus Armbruster  wrote:
>
> The QAPI schema doc comment language provides special syntax for command
> and event arguments, struct and union members, alternate branches,
> enumeration values, and features: "sections" starting with @arg:.
>
> By convention, we format them like this:
>
> # @arg: Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed
> #   do eiusmod tempor incididunt ut labore et dolore magna
> #   aliqua.
>
> Okay for names as short as "name", but we have much longer ones.  Their
> description gets squeezed against the right margin, like this:
>
> # @dirty-sync-missed-zero-copy: Number of times dirty RAM synchronization 
> could
> #   not avoid copying dirty pages. This is 
> between
> #   0 and @dirty-sync-count * 
> @multifd-channels.
> #   (since 7.1)
>
> The description is effectively 50 characters wide.  Easy enough to read,
> but awkward to write.

The documentation also permits a second form:

 # @argone:
 # This is a two line description
 # in the first style.

We tend to use that for type names, not field names, but IIRC
it's the same handling for both.

I'll re-mention here something I said back in 2020 when we were
landing the rST-conversion of the qapi docs:

There is rST syntax for field lists and option lists:
https://docutils.sourceforge.io/docs/ref/rst/restructuredtext.html#field-lists
which are kind of similar to what we're doing with @foo: stuff
markup, and which handle indentation like this:

:Hello: This field has a short field name, so aligning the field
body with the first line is feasible.

:Number-of-African-swallows-required-to-carry-a-coconut: It would
be very difficult to align the field body with the left edge
of the first line. It may even be preferable not to begin the
body on the same line as the marker.

The differences to what we have today are:
 * indent of lines 2+ is determined by the indent of line 2, not 1
 * lines 2+ must be indented, so anything that currently uses
   "no indent, start in column 0" would need indenting. (This would
   be a lot of change to our current docs text.)

At the time I think I basically went with "whatever requires the
minimum amount of change to the existing doc comments and parser
to get them into a shape that Sphinx is happy with". But if we're
up for a wide reformatting then maybe it would be worth following
the rST syntax?

PS: I see with a quick grep we also have misformatted field docs;
check out for instance the HTML rendering of the bps_max etc
fields of BlockDeviceInfo, which is weird because the second
line of the field docs in the sources is wrongly indented.

-- PMM

[PATCH 1/1] hw/arm: do not free machine->fdt in arm_load_dtb()

At this moment, arm_load_dtb() can free machine->fdt when
binfo->dtb_filename is NULL. If there's no 'dtb_filename', 'fdt' will be
retrieved by binfo->get_dtb(). If get_dtb() returns machine->fdt, as is
the case of machvirt_dtb() from hw/arm/virt.c, fdt now has a pointer to
machine->fdt. And, in that case, the existing g_free(fdt) at the end of
arm_load_dtb() will make machine->fdt point to an invalid memory region.

After the command 'dumpdtb' were introduced a couple of releases ago,
running it with any ARM machine that uses arm_load_dtb() will crash
QEMU.

One alternative would be to mark machine->fdt = NULL when exiting
arm_load_dtb() when freeing the fdt. Another is to not free the fdt and,
instead, update machine->fdt with the new fdt generated. This will
enable dumpdtb for all ARM machines that uses arm_load_dtb(), regardless
of having 'dtb_filename' or not.

Cc: Peter Maydell 
Cc: qemu-...@nongnu.org
Fixes: bf353ad55590f ("qmp/hmp, device_tree.c: introduce dumpdtb")
Reported-by: Markus Armbruster i
Signed-off-by: Daniel Henrique Barboza 
---
 hw/arm/boot.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/hw/arm/boot.c b/hw/arm/boot.c
index 50e5141116..9418cc3373 100644
--- a/hw/arm/boot.c
+++ b/hw/arm/boot.c
@@ -689,7 +689,8 @@ int arm_load_dtb(hwaddr addr, const struct arm_boot_info 
*binfo,
 qemu_register_reset_nosnapshotload(qemu_fdt_randomize_seeds,
rom_ptr_for_as(as, addr, size));
 
-g_free(fdt);
+/* Set ms->fdt for 'dumpdtb' QMP/HMP command */
+ms->fdt = fdt;
 
 return size;
 
-- 
2.39.2

Re: [PATCH v3] target/riscv: reduce overhead of MSTATUS_SUM change


On 3/22/23 17:38, Wu, Fei wrote:

There is MSTATUS_MPRV and MSTATUS_MPP kind of thing, priv+sum is not
able to represent all of the status, probably we can just add an extra
'priv' at the back of TB_FLAGS?


Yes, I think that's required.


r~

[PATCH 0/1] fix dumpdtb crash with ARM machines

Hi,

This is a re-post of "[PATCH v8 03/16] hw/arm: do not free machine->fdt
in arm_load_dtb()":

https://lists.gnu.org/archive/html/qemu-devel/2022-09/msg04201.html

Turns out that I drop the ball and left this patch behind. Aside from
some patches of that series that were optional, the way ARM code is
working ATM is causing 'dumpdtb' to crash QEMU, as reported by Markus in
https://lists.gnu.org/archive/html/qemu-devel/2023-03/msg05858.html .

Applying this patch fixes the reported crash:

$ ./qemu-system-aarch64 -S -M virt -display none -qmp stdio
{"QMP": {"version": {"qemu": {"micro": 91, "minor": 2, "major": 7}, "package": 
"v8.0.0-rc1-37-g298c4469cf"}, "capabilities": ["oob"]}}
{"execute": "qmp_capabilities", "arguments": {"enable": ["oob"]}}
{"return": {}}
{"execute": "dumpdtb", "arguments": {"filename": "fdt.dtb"}}
{"return": {}}
^Cqemu-system-aarch64: terminating on signal 2
{"timestamp": {"seconds": 1679587153, "microseconds": 714319}, "event": 
"SHUTDOWN", "data": {"guest": false, "reason": "host-signal"}}
$ 
$ dtc -I dtb -O dts fdt.dtb | grep timer
timer {
compatible = "arm,armv7-timer";
$

Cc: Peter Maydell 
Cc: Markus Armbruster 
Cc: qemu-...@nongnu.org

Daniel Henrique Barboza (1):
  hw/arm: do not free machine->fdt in arm_load_dtb()

 hw/arm/boot.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

-- 
2.39.2

Re: [PATCH v4 1/2] target/riscv: separate priv from mmu_idx


On 3/22/23 23:00, Wu, Fei wrote:

+    ctx->priv = env->priv;


This is not right. You should put env->priv into tb flags before you use
it in translation.


I see some other env usages in this function, when will env->priv and
tb_flags.priv mismatch (assume we have recorded priv in tb_flags)?


You are correct that they should match, because of tb_flags, but it is bad form to read 
from env, as that leads to errors.  Since you *can* read the same data from tb_flags, you 
should.


The read of misa_ext and misa_mxl_max are correct, because they are constant set at cpu 
init/realize.


The read of vstart is incorrect.  The TB_FLAGS field is VL_EQ_VLMAX, which includes a test 
for vstart == 0, but the smaller vstart == 0 test is not extractable from that.  Thus the 
usage in e.g. vext_check_reduction is incorrect.  One would require a new TB_FLAGS bit to 
encode vstart ==/!= 0 alone.



r~

Re: [RFC PATCH v1] hw/misc: add i2c slave device that passes i2c ops outside

2023-03-23 Thread Corey Minyard

On Thu, Mar 23, 2023 at 10:09:02AM +, Karol Nowak wrote:
> Hi,
> 
> There is a feature I prepared which may be practical for some QEMU users.
> 
> The feature provides a new I2C slave device
> that prepares a message depending what i2c-slave callback was called
> and sends it outside of QEMU through the character device to a client
> that receives that message, processes it and send back a response.
> Thanks to that feature,
> a user can emulate a logic of I2C device outside of QEMU.
> The message contains 3 bytes ended with CRLF: BBB\r\l
> Basically, the I2C slave does 4 steps in each i2c-slave callback:
> * encode
> * send
> * receive
> * decode
> 
> I put more details in esp32_i2c_tcp_slave.c
> and also provided a demo client in python that uses TCP.
> 
> The feature still needs some improvements, but the question is:
> * Do you find the feature useful?

Someone else has proposed this before with a patch, and it was actually
pretty complete and mostly ok, but I pointed out an issue and never
heard back from them.  This feature is something that might be nice.  As
you say, this needs some improvements.  Some I would point out:

Obviously this can't be named esp32, it needs to be general.

All the I/O (reading and writing) has to be non-blocking.  I/O handling
in qemu is single-threaded, if you block anywhere you basically stop
qemu.  You need to implement something where whatever you do (like
handling a NAK, for instance) it doesn't block qemu.

The protocol you have implemented is basically an extension of the QEMU
protocol.  That's probably not ideal, it would be best to think about a
general protocol for extending I2C over a TCP connection.  A lot of the
details of the QEMU implementation is probably not necessary over a TCP
connection.

-corey

> 
> 
> NOTE:
> The feature originally was prepared for espressif/qemu
> that's why there are references to esp32
> 
> 
> Signed-off-by: Karol Nowak 
> ---
>  hw/misc/esp32_i2c_tcp_slave.c | 288 ++
>  include/hw/misc/esp32_i2c_tcp_slave.h |  19 ++
>  tests/i2c-tcp-demo/i2c-tcp-demo.py| 133 
>  3 files changed, 440 insertions(+)
>  create mode 100644 hw/misc/esp32_i2c_tcp_slave.c
>  create mode 100644 include/hw/misc/esp32_i2c_tcp_slave.h
>  create mode 100644 tests/i2c-tcp-demo/i2c-tcp-demo.py
> 
> diff --git a/hw/misc/esp32_i2c_tcp_slave.c b/hw/misc/esp32_i2c_tcp_slave.c
> new file mode 100644
> index 00..db3b6d366a
> --- /dev/null
> +++ b/hw/misc/esp32_i2c_tcp_slave.c
> @@ -0,0 +1,288 @@
> +#include "qemu/osdep.h"
> +#include "qemu/error-report.h"
> +#include "qemu/log.h"
> +#include "hw/i2c/i2c.h"
> +#include "hw/irq.h"
> +#include "hw/misc/esp32_i2c_tcp_slave.h"
> +#include "qemu/module.h"
> +
> +#include "qapi/qmp/json-writer.h"
> +#include "chardev/char-fe.h"
> +#include "io/channel-socket.h"
> +#include "chardev/char-io.h"
> +#include "chardev/char-socket.h"
> +#include "qapi/error.h"
> +
> +/*
> + * Description:
> + * To allow to emulate a I2C slave device which is not supported by QEMU,
> + * a new I2C slave device was created that encapsulates I2C operations
> + * and passes them through a selected chardev to the host
> + * where a client resides that implements a logic of emulated device.
> + *
> + *
> + * Architecture:
> + *---
> + *| QEMU|
> + *| | ---
> + *|  ESP32 Firmware writes  | | |
> + *|  to I2C Slave   | | I2C Slave Emulation |
> + *| | | |
> + *|  ---&-& |
> + *|  | I2C Slave at 0x7F&   tcp   & recv msg|
> + *|  ---&-& process msg |
> + *| | | send respone|
> + *| | | |
> + *| | | |
> + *--- |--
> + *
> + *
> + * Syntax & protocol:
> + *  QEMU I2C Slave sends a msg in following format: BBB\r\n
> + *  where each 'B' represents a single byte 0-255
> + *  QEMU I2C Slave expects a respone message in the same format as fast 
> as possible
> + *  Example:
> + * req: 0x45 0x01 0x00 \r\n
> + *resp: 0x45 0x01 0x00 \r\n
> + *
> + *  The format BBB\r\n
> + *first 'B' is a message type
> + *second 'B' is a data value
> + *third 'B' is an error value (not used at the moment)
> + *
> + *  There are three types of message
> + *'E' or 0x45 - Event:
> + *'S' or 0x53 - Send: byte sent to emulated I2C Slave
> + *'R' or 0x52 - Recv: byte to be received by I2C Master
> + *
> + *
> + *  'E' message
> + *second byte is an event type:
> + * 0x0: I2C_START_RECV
> + *

Re: [PATCH v4 1/2] target/riscv: separate priv from mmu_idx


On 3/22/23 19:44, Fei Wu wrote:

Currently it's assumed the 2 low bits of mmu_idx map to privilege mode,
this assumption won't last as we are about to add more mmu_idx.

Signed-off-by: Fei Wu 
---
  target/riscv/cpu.h | 1 -
  target/riscv/cpu_helper.c  | 2 +-
  target/riscv/insn_trans/trans_privileged.c.inc | 2 +-
  target/riscv/insn_trans/trans_xthead.c.inc | 7 +--
  target/riscv/translate.c   | 3 +++
  5 files changed, 6 insertions(+), 9 deletions(-)

diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
index 638e47c75a..66f7e3d1ba 100644
--- a/target/riscv/cpu.h
+++ b/target/riscv/cpu.h
@@ -623,7 +623,6 @@ G_NORETURN void riscv_raise_exception(CPURISCVState *env,
  target_ulong riscv_cpu_get_fflags(CPURISCVState *env);
  void riscv_cpu_set_fflags(CPURISCVState *env, target_ulong);
  
-#define TB_FLAGS_PRIV_MMU_MASK3

  #define TB_FLAGS_PRIV_HYP_ACCESS_MASK   (1 << 2)
  #define TB_FLAGS_MSTATUS_FS MSTATUS_FS
  #define TB_FLAGS_MSTATUS_VS MSTATUS_VS
diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
index f88c503cf4..76e1b0100e 100644
--- a/target/riscv/cpu_helper.c
+++ b/target/riscv/cpu_helper.c
@@ -762,7 +762,7 @@ static int get_physical_address(CPURISCVState *env, hwaddr 
*physical,
   * (riscv_cpu_do_interrupt) is correct */
  MemTxResult res;
  MemTxAttrs attrs = MEMTXATTRS_UNSPECIFIED;
-int mode = mmu_idx & TB_FLAGS_PRIV_MMU_MASK;
+int mode = env->priv;


This is incorrect.  You must map back from mmu_idx to priv.
Recall the semantics of MPRV.


diff --git a/target/riscv/insn_trans/trans_xthead.c.inc 
b/target/riscv/insn_trans/trans_xthead.c.inc
index df504c3f2c..adfb53cb4c 100644
--- a/target/riscv/insn_trans/trans_xthead.c.inc
+++ b/target/riscv/insn_trans/trans_xthead.c.inc
@@ -265,12 +265,7 @@ static bool trans_th_tst(DisasContext *ctx, arg_th_tst *a)
  
  static inline int priv_level(DisasContext *ctx)

  {
-#ifdef CONFIG_USER_ONLY
-return PRV_U;
-#else
- /* Priv level is part of mem_idx. */
-return ctx->mem_idx & TB_FLAGS_PRIV_MMU_MASK;
-#endif
+return ctx->priv;
  }


I guess we aren't expecting optimization to remove dead system code?
That would be the only reason to keep the ifdef.

This function should be hoisted to translate.c, or simply replaced by the field 
access.


@@ -1162,8 +1163,10 @@ static void riscv_tr_init_disas_context(DisasContextBase 
*dcbase, CPUState *cs)
  } else {
  ctx->virt_enabled = false;
  }
+ctx->priv = env->priv;


Incorrect, as Zhiwei pointed out.
I gave you the changes required to TB_FLAGS...


r~

Re: [PATCH 2/3] TPM TIS: Add support for TPM devices over I2C bus




On 3/23/23 2:44 AM, Cédric Le Goater wrote:

On 3/23/23 04:01, Ninad Palsule wrote:

Qemu already supports devices attached to ISA and sysbus. This drop adds
support for the I2C bus attached TPM devices.

This commit includes changes for the common code.
- Added support for the new checksum registers which are required for
   the I2C support. The checksum calculation is handled in the qemu
   common code.
- Added wrapper function for read and write data so that I2C code can
   call it without MMIO interface.

Signed-off-by: Ninad Palsule 
---
V2:

Incorporated Stephen's comments.

- Removed checksum enable and checksum get registers.
- Added checksum calculation function which can be called from
   i2c layer.
---
  hw/tpm/tpm_tis.h    |  3 +++
  hw/tpm/tpm_tis_common.c | 32 
  2 files changed, 35 insertions(+)

diff --git a/hw/tpm/tpm_tis.h b/hw/tpm/tpm_tis.h
index f6b5872ba6..6f29a508dd 100644
--- a/hw/tpm/tpm_tis.h
+++ b/hw/tpm/tpm_tis.h
@@ -86,5 +86,8 @@ int tpm_tis_pre_save(TPMState *s);
  void tpm_tis_reset(TPMState *s);
  enum TPMVersion tpm_tis_get_tpm_version(TPMState *s);
  void tpm_tis_request_completed(TPMState *s, int ret);
+uint32_t tpm_tis_read_data(TPMState *s, hwaddr addr, unsigned size);
+void tpm_tis_write_data(TPMState *s, hwaddr addr, uint64_t val, 
uint32_t size);

+uint16_t tpm_tis_get_checksum(TPMState *s);
    #endif /* TPM_TPM_TIS_H */
diff --git a/hw/tpm/tpm_tis_common.c b/hw/tpm/tpm_tis_common.c
index 503be2a541..b1acde74cb 100644
--- a/hw/tpm/tpm_tis_common.c
+++ b/hw/tpm/tpm_tis_common.c
@@ -26,6 +26,8 @@
  #include "hw/irq.h"
  #include "hw/isa/isa.h"
  #include "qapi/error.h"
+#include "qemu/bswap.h"
+#include "qemu/crc-ccitt.h"
  #include "qemu/module.h"
    #include "hw/acpi/tpm.h"
@@ -447,6 +449,27 @@ static uint64_t tpm_tis_mmio_read(void *opaque, 
hwaddr addr,

  return val;
  }
  +/*
+ * A wrapper read function so that it can be directly called without
+ * mmio.
+ */
+uint32_t tpm_tis_read_data(TPMState *s, hwaddr addr, unsigned size)
+{
+    return tpm_tis_mmio_read(s, addr, size);
+}
+
+/*
+ * Calculate current data buffer checksum
+ */
+uint16_t tpm_tis_get_checksum(TPMState *s)
+{
+    uint16_t val = 0x;
+
+    val = cpu_to_be16(crc_ccitt(0, s->buffer, s->rw_offset));


this routine could simply return cpu_to_be16(


Done.

Thank you for the review.

Ninad


Thanks,

C.



+
+    return val;
+}
+
  /*
   * Write a value to a register of the TIS interface
   * See specs pages 33-63 for description of the registers
@@ -767,6 +790,15 @@ static void tpm_tis_mmio_write(void *opaque, 
hwaddr addr,

  }
  }
  +/*
+ * A wrapper write function so that it can be directly called without
+ * mmio.
+ */
+void tpm_tis_write_data(TPMState *s, hwaddr addr, uint64_t val, 
uint32_t size)

+{
+    tpm_tis_mmio_write(s, addr, val, size);
+}
+
  const MemoryRegionOps tpm_tis_memory_ops = {
  .read = tpm_tis_mmio_read,
  .write = tpm_tis_mmio_write,

Re: QMP command dumpdtb crash bug





On 3/23/23 10:38, Peter Maydell wrote:

On Thu, 23 Mar 2023 at 13:29, Markus Armbruster  wrote:


Peter, Daniel offers two ways to fix this bug (see below).  Got a
preference?


Not freeing seems the correct thing. As Daniel says, this
should have been a prerequisite for implementing the
command in the first place (you need to change the lifecycle
of the fdt blob from "delete when done with in the arm boot code"
to "delete on machine finalize"). It looks like somehow we added
the command but missed out on getting all of the prerequisite
patches in. (File under "need to be cautious about applying partial
patchsets", I guess.)


Yeah, I'm at fault here. I should've been more insistent about acking
the ARM patch. All other patches that we left behind was optional, meaning
that the machine wouldn't implement the command but nothing bad would happen,
but the ARM patch was kind of mandatory because arm_load_dtb() is
freeing ms->fdt without assigning it to NULL.



Did anything else from that initial patchset get omitted?


Searching the ML I see that I sent a message saying that I pushed patches 1,
6 and 8-15 via ppc-next. This means that these patches got left behind:

 2  hw/core: free ms->fdt in machine_finalize()
 3  hw/arm: do not free machine->fdt in arm_load_dtb()
 4  hw/mips: set machine->fdt in boston_mach_init()
 5  hw/microblaze: set machine->fdt in microblaze_load_dtb()
 7  hw/ppc: set machine->fdt in ppce500_load_device_tree()
15  hw/xtensa: set machine->fdt in xtfpga_init()


Patch 2 was suggested by Phil and changes the common code to free ms->fdt
during machine_finalize(). Can be re-sent I guess.

All other patches, aside from patch 3 from ARM, are optional because the
machine isn't freeing ms->fdt or anything like that.


I'll rebase and re-sent patch 3 as a bug fix. I'll re-sent the hw/core patch
as well for 8.1.


Daniel




thanks
-- PMM

Re: [PATCH 0/9] (mostly) block: add more coroutine_fn annotations, use bdrv/blk_co_*

2023-03-23 Thread Kevin Wolf

Am 09.03.2023 um 09:44 hat Paolo Bonzini geschrieben:
> "Mostly" because there is a 9pfs patch in here too.
> 
> The series was developed with the help of vrc and the clang TSA annotations.

Thanks, fixed up some style issues and applied to block-next.

Kevin

[PATCH] block/export: only acquire AioContext once for vhost_user_server_stop()

vhost_user_server_stop() uses AIO_WAIT_WHILE(). AIO_WAIT_WHILE()
requires that AioContext is only acquired once.

Since blk_exp_request_shutdown() already acquires the AioContext it
shouldn't be acquired again in vhost_user_server_stop().

Signed-off-by: Stefan Hajnoczi 
---
 util/vhost-user-server.c | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/util/vhost-user-server.c b/util/vhost-user-server.c
index 40f36ea214..5b6216069c 100644
--- a/util/vhost-user-server.c
+++ b/util/vhost-user-server.c
@@ -346,10 +346,9 @@ static void vu_accept(QIONetListener *listener, 
QIOChannelSocket *sioc,
 aio_context_release(server->ctx);
 }
 
+/* server->ctx acquired by caller */
 void vhost_user_server_stop(VuServer *server)
 {
-aio_context_acquire(server->ctx);
-
 qemu_bh_delete(server->restart_listener_bh);
 server->restart_listener_bh = NULL;
 
@@ -366,8 +365,6 @@ void vhost_user_server_stop(VuServer *server)
 AIO_WAIT_WHILE(server->ctx, server->co_trip);
 }
 
-aio_context_release(server->ctx);
-
 if (server->listener) {
 qio_net_listener_disconnect(server->listener);
 object_unref(OBJECT(server->listener));
-- 
2.39.2

[PATCH for-8.0 v2] aio-posix: fix race between epoll upgrade and aio_set_fd_handler()

If another thread calls aio_set_fd_handler() while the IOThread event
loop is upgrading from ppoll(2) to epoll(7) then we might miss new
AioHandlers. The epollfd will not monitor the new AioHandler's fd,
resulting in hangs.

Take the AioHandler list lock while upgrading to epoll. This prevents
AioHandlers from changing while epoll is being set up. If we cannot lock
because we're in a nested event loop, then don't upgrade to epoll (it
will happen next time we're not in a nested call).

The downside to taking the lock is that the aio_set_fd_handler() thread
has to wait until the epoll upgrade is finished, which involves many
epoll_ctl(2) system calls. However, this scenario is rare and I couldn't
think of another solution that is still simple.

Reported-by: Qing Wang 
Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2090998
Cc: Paolo Bonzini 
Cc: Fam Zheng 
Signed-off-by: Stefan Hajnoczi 
---
v2:
- Use qemu_lockcnt_inc_and_unlock() instead of qemu_lockcnt_unlock() [Paolo]
---
 util/fdmon-epoll.c | 25 ++---
 1 file changed, 18 insertions(+), 7 deletions(-)

diff --git a/util/fdmon-epoll.c b/util/fdmon-epoll.c
index e11a8a022e..1683aa1105 100644
--- a/util/fdmon-epoll.c
+++ b/util/fdmon-epoll.c
@@ -127,6 +127,8 @@ static bool fdmon_epoll_try_enable(AioContext *ctx)
 
 bool fdmon_epoll_try_upgrade(AioContext *ctx, unsigned npfd)
 {
+bool ok;
+
 if (ctx->epollfd < 0) {
 return false;
 }
@@ -136,14 +138,23 @@ bool fdmon_epoll_try_upgrade(AioContext *ctx, unsigned 
npfd)
 return false;
 }
 
-if (npfd >= EPOLL_ENABLE_THRESHOLD) {
-if (fdmon_epoll_try_enable(ctx)) {
-return true;
-} else {
-fdmon_epoll_disable(ctx);
-}
+if (npfd < EPOLL_ENABLE_THRESHOLD) {
+return false;
 }
-return false;
+
+/* The list must not change while we add fds to epoll */
+if (!qemu_lockcnt_dec_if_lock(>list_lock)) {
+return false;
+}
+
+ok = fdmon_epoll_try_enable(ctx);
+
+qemu_lockcnt_inc_and_unlock(>list_lock);
+
+if (!ok) {
+fdmon_epoll_disable(ctx);
+}
+return ok;
 }
 
 void fdmon_epoll_setup(AioContext *ctx)
-- 
2.39.2

qapi: [RFC] Doc comment convention for @arg: sections

The QAPI schema doc comment language provides special syntax for command
and event arguments, struct and union members, alternate branches,
enumeration values, and features: "sections" starting with @arg:.

By convention, we format them like this:

# @arg: Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed
#   do eiusmod tempor incididunt ut labore et dolore magna
#   aliqua.

Okay for names as short as "name", but we have much longer ones.  Their
description gets squeezed against the right margin, like this:

# @dirty-sync-missed-zero-copy: Number of times dirty RAM synchronization 
could
#   not avoid copying dirty pages. This is 
between
#   0 and @dirty-sync-count * @multifd-channels.
#   (since 7.1)

The description is effectively 50 characters wide.  Easy enough to read,
but awkward to write.

The awkward squeeze against the right margin makes people go beyond it,
which produces two undesirables: arguments about style, and descriptions
that are unnecessarily hard to read, like this one:

# @postcopy-vcpu-blocktime: list of the postcopy blocktime per vCPU.  This 
is
#   only present when the postcopy-blocktime 
migration capability
#   is enabled. (Since 3.0)

Ugly, too.

I'd like to change the convention to

# @arg:
# Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed
# do eiusmod tempor incididunt ut labore et dolore magna aliqua.

# @dirty-sync-missed-zero-copy:
# Number of times dirty RAM synchronization could not avoid
# copying dirty pages.  This is between 0 and @dirty-sync-count
# * @multifd-channels.  (since 7.1)

# @postcopy-vcpu-blocktime:
# list of the postcopy blocktime per vCPU.  This is only present
# when the postcopy-blocktime migration capability is
# enabled.  (Since 3.0)

We may want to keep short descriptions one the same line, like

# @multifd-bytes: The number of bytes sent through multifd (since 3.0)

We could instead do

# @arg: Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed
# do eiusmod tempor incididunt ut labore et dolore magna aliqua.

Another option would be

# @arg:
# Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do
# eiusmod tempor incididunt ut labore et dolore magna aliqua.

or even

# @arg: Lorem ipsum dolor sit amet, consectetur adipiscing elit,
# sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.

but I find these less readable.

A bulk reformatting of doc comments will mess up git-blame, which will
be kind of painful[*].  But so is the status quo.

Thoughts?



[*] Yes, I'm aware of (and grateful for) --ignore-rev & friends.  It's
still kind of painful.

Re: [PATCH 1/2] ui/gtk: use widget size for cursor motion event

2023-03-23 Thread Marc-André Lureau

Hi Erico

On Thu, Mar 23, 2023 at 9:02 AM Kasireddy, Vivek
 wrote:
>
> Hi Erico,
>
> > >
> > >>
> > >> The gd_motion_event size has some calculations for the cursor position,
> > >> which also take into account things like different size of the
> > >> framebuffer compared to the window size.
> > >> The use of window size makes things more difficult though, as at least
> > >> in the case of Wayland includes the size of ui elements like a menu bar
> > >> at the top of the window. This leads to a wrong position calculation by
> > >> a few pixels.
> > >> Fix it by using the size of the widget, which already returns the size
> > >> of the actual space to render the framebuffer.
> > >>
> > >> Signed-off-by: Erico Nunes 
> > >> ---
> > >>  ui/gtk.c | 8 +++-
> > >>  1 file changed, 3 insertions(+), 5 deletions(-)
> > >>
> > >> diff --git a/ui/gtk.c b/ui/gtk.c
> > >> index fd82e9b1ca..d1b2a80c2b 100644
> > >> --- a/ui/gtk.c
> > >> +++ b/ui/gtk.c
> > >> @@ -868,7 +868,6 @@ static gboolean gd_motion_event(GtkWidget *widget,
> > >> GdkEventMotion *motion,
> > >>  {
> > >>  VirtualConsole *vc = opaque;
> > >>  GtkDisplayState *s = vc->s;
> > >> -GdkWindow *window;
> > >>  int x, y;
> > >>  int mx, my;
> > >>  int fbh, fbw;
> > >> @@ -881,10 +880,9 @@ static gboolean gd_motion_event(GtkWidget
> > *widget,
> > >> GdkEventMotion *motion,
> > >>  fbw = surface_width(vc->gfx.ds) * vc->gfx.scale_x;
> > >>  fbh = surface_height(vc->gfx.ds) * vc->gfx.scale_y;
> > >>
> > >> -window = gtk_widget_get_window(vc->gfx.drawing_area);
> > >> -ww = gdk_window_get_width(window);
> > >> -wh = gdk_window_get_height(window);
> > >> -ws = gdk_window_get_scale_factor(window);
> > >> +ww = gtk_widget_get_allocated_width(widget);
> > >> +wh = gtk_widget_get_allocated_height(widget);
> > > [Kasireddy, Vivek] Could you please confirm if this works in X-based 
> > > compositor
> > > environments as well? Last time I checked (with Fedora 36 and Gnome + X), 
> > > the
> > > get_allocated_xxx APIs were not accurate in X-based environments. 
> > > Therefore,
> > > I restricted the above change to Wayland-based environments only:
> > > https://lists.nongnu.org/archive/html/qemu-devel/2022-11/msg03100.html
> >
> > Yes, I tested again and it seems to work fine for me even with the gtk
> > ui running on X. I'm using Fedora 37.
> [Kasireddy, Vivek] Ok, in that case, this patch is
> Acked-by: Vivek Kasireddy 
>
> >
> > I was not aware of that patch series though and just spent some time
> > debugging these ui issues. It looks like your series was missed?
> [Kasireddy, Vivek] Yeah, not sure why my series was missed but in
> retrospect, I probably should have separated out bug fix patches
> from new feature enablement patches.
>
> >
> > I'm still debugging additional issues with cursor position calculation,
> > especially in wayland environments (and in particular with
> > vhost-user-gpu now). Do those patches address more cursor issues?
> [Kasireddy, Vivek] They do address more cursor issues but not sure how
> helpful they would be for you as most of them deal with relative mode +
> Wayland environment. However, there is another one that deals with
> cursor/pointer in absolute mode + multiple monitors:
> https://lists.nongnu.org/archive/html/qemu-devel/2022-11/msg03097.html
>

Should we queue the 2 patches from this series? (note that they were
not correctly handled by patchew, probably because you dropped the
cover letter).

For me -display gtk is unusable on hidpi & wayland anyway, because the
cursor position given to the guest does not match the dimensions given
for the monitor.

Also relative mouse support is broken as well (mouse wrapping and
confinement/grab is not supported by gdk/gtk on wayland).

I am not actively looking at these problems, they are "solved" with
spice (use -display spice-app). And I am also regularly working on a
gtk4/rust widget, using -display dbus
(https://gitlab.gnome.org/malureau/rdw). There is also
https://gitlab.gnome.org/chergert/libmks as a gtk4/C alternative. I am
not sure we should keep maintaining the gtk3 backend going forward.
And as a Gtk4/-display dbus client mature, I hope we can offer a
better alternative than ui/sdl or ui/cocoa on other platforms as well.


-- 
Marc-André Lureau

Re: [PATCH v8 0/4] Add zoned storage emulation to virtio-blk driver

2023-03-23 Thread Matias Bjørling


On 23/03/2023 06.28, Sam Li wrote:

This patch adds zoned storage emulation to the virtio-blk driver.

The patch implements the virtio-blk ZBD support standardization that is
recently accepted by virtio-spec. The link to related commit is at

https://github.com/oasis-tcs/virtio-spec/commit/b4e8efa0fa6c8d844328090ad15db65af8d7d981

The Linux zoned device code that implemented by Dmitry Fomichev has been
released at the latest Linux version v6.3-rc1.

Aside: adding zoned=on alike options to virtio-blk device will be
considered in following-up plan.

v7:
- address Stefan's review comments
   * rm aio_context_acquire/release in handle_req
   * rename function return type
   * rename BLOCK_ACCT_APPEND to BLOCK_ACCT_ZONE_APPEND for clarity

v6:
- update headers to v6.3-rc1

v5:
- address Stefan's review comments
   * restore the way writing zone append result to buffer
   * fix error checking case and other errands

v4:
- change the way writing zone append request result to buffer
- change zone state, zone type value of virtio_blk_zone_descriptor
- add trace events for new zone APIs

v3:
- use qemuio_from_buffer to write status bit [Stefan]
- avoid using req->elem directly [Stefan]
- fix error checkings and memory leak [Stefan]

v2:
- change units of emulated zone op coresponding to block layer APIs
- modify error checking cases [Stefan, Damien]

v1:
- add zoned storage emulation

Sam Li (4):
   include: update virtio_blk headers to v6.3-rc1
   virtio-blk: add zoned storage emulation for zoned devices
   block: add accounting for zone append operation
   virtio-blk: add some trace events for zoned emulation

  block/qapi-sysemu.c  |  11 +
  block/qapi.c |  18 +
  hw/block/trace-events|   7 +
  hw/block/virtio-blk-common.c |   2 +
  hw/block/virtio-blk.c| 405 +++
  include/block/accounting.h   |   1 +
  include/standard-headers/drm/drm_fourcc.h|  12 +
  include/standard-headers/linux/ethtool.h |  48 ++-
  include/standard-headers/linux/fuse.h|  45 ++-
  include/standard-headers/linux/pci_regs.h|   1 +
  include/standard-headers/linux/vhost_types.h |   2 +
  include/standard-headers/linux/virtio_blk.h  | 105 +
  linux-headers/asm-arm64/kvm.h|   1 +
  linux-headers/asm-x86/kvm.h  |  34 +-
  linux-headers/linux/kvm.h|   9 +
  linux-headers/linux/vfio.h   |  15 +-
  linux-headers/linux/vhost.h  |   8 +
  qapi/block-core.json |  62 ++-
  qapi/block.json  |   4 +
  19 files changed, 769 insertions(+), 21 deletions(-)




Hi Sam,

I applied your patches and can report that they work with both SMR HDDs 
and ZNS SSDs. Very nice work!


Regarding the documentation (docs/system/qemu-block-drivers.rst.inc). Is 
it possible to expose the host's zoned block device through something 
else than virtio-blk? If not, I wouldn't mind seeing the documentation 
updated to show a case when using the virtio-blk driver.


For example (this also includes the device part):

-device virtio-blk-pci,drive=drive0,id=virtblk0 \
-blockdev 
host_device,node-name=drive0,filename=/dev/nullb0,cache.direct=on``


It might also be nice to describe the shorthand for those that likes to 
pass in the parameters using only the -drive parameter.


 -drive driver=host_device,file=/dev/nullb0,if=virtio,cache.direct=on

Cheers, Matias

Re: [PATCH v8 2/4] virtio-blk: add zoned storage emulation for zoned devices

2023-03-23 Thread Matias Bjørling


On 23/03/2023 06.28, Sam Li wrote:

This patch extends virtio-blk emulation to handle zoned device commands
by calling the new block layer APIs to perform zoned device I/O on
behalf of the guest. It supports Report Zone, four zone oparations (open,
close, finish, reset), and Append Zone.

The VIRTIO_BLK_F_ZONED feature bit will only be set if the host does
support zoned block devices. Regular block devices(conventional zones)
will not be set.

The guest os can use blktests, fio to test those commands on zoned devices.
Furthermore, using zonefs to test zone append write is also supported.

Signed-off-by: Sam Li 
---
  hw/block/virtio-blk-common.c |   2 +
  hw/block/virtio-blk.c| 389 +++
  2 files changed, 391 insertions(+)

diff --git a/hw/block/virtio-blk-common.c b/hw/block/virtio-blk-common.c
index ac52d7c176..e2f8e2f6da 100644
--- a/hw/block/virtio-blk-common.c
+++ b/hw/block/virtio-blk-common.c
@@ -29,6 +29,8 @@ static const VirtIOFeature feature_sizes[] = {
   .end = endof(struct virtio_blk_config, discard_sector_alignment)},
  {.flags = 1ULL << VIRTIO_BLK_F_WRITE_ZEROES,
   .end = endof(struct virtio_blk_config, write_zeroes_may_unmap)},
+{.flags = 1ULL << VIRTIO_BLK_F_ZONED,
+ .end = endof(struct virtio_blk_config, zoned)},
  {}
  };


I used the qemu monitor to expect the state of the devices, and on the 
zoned block device specific entries, the zoned device feature shows up 
in the "unknown-features" field (info virtio-status )


What is missing is an entry in the blk_feature_map structure within 
hw/virtio/virtio-qmp.c. The below fixes it up.


diff --git i/hw/virtio/virtio-qmp.c w/hw/virtio/virtio-qmp.c
index b70148aba9..3efa529bab 100644
--- i/hw/virtio/virtio-qmp.c
+++ w/hw/virtio/virtio-qmp.c
@@ -176,6 +176,8 @@ static const qmp_virtio_feature_map_t 
virtio_blk_feature_map[] = {

 "VIRTIO_BLK_F_DISCARD: Discard command supported"),
 FEATURE_ENTRY(VIRTIO_BLK_F_WRITE_ZEROES, \
 "VIRTIO_BLK_F_WRITE_ZEROES: Write zeroes command supported"),
+FEATURE_ENTRY(VIRTIO_BLK_F_ZONED, \
+"VIRTIO_BLK_F_ZONED: Zoned block device"),
 #ifndef VIRTIO_BLK_NO_LEGACY
 FEATURE_ENTRY(VIRTIO_BLK_F_BARRIER, \
 "VIRTIO_BLK_F_BARRIER: Request barriers supported"),

Which then lets qemu report the support like this:

(qemu) info virtio-status /machine/peripheral/virtblk0/virtio-backend
/machine/peripheral/virtblk0/virtio-backend:
  device_name: virtio-blk
  device_id:   2
  vhost_started:   false
  bus_name:(null)
  broken:  false
  disabled:false
  disable_legacy_check:false
  started: true
  use_started: true
  start_on_kick:   false
  use_guest_notifier_mask: true
  vm_running:  true
  num_vqs: 4
  queue_sel:   3
  isr: 1
  endianness:  little
  status:
VIRTIO_CONFIG_S_ACKNOWLEDGE: Valid virtio device found,
VIRTIO_CONFIG_S_DRIVER: Guest OS compatible with device,
VIRTIO_CONFIG_S_FEATURES_OK: Feature negotiation complete,
VIRTIO_CONFIG_S_DRIVER_OK: Driver setup and ready
  Guest features:
VIRTIO_RING_F_EVENT_IDX: Used & avail. event fields enabled,
VIRTIO_RING_F_INDIRECT_DESC: Indirect descriptors supported,
VIRTIO_F_VERSION_1: Device compliant for v1 spec (legacy)
VIRTIO_BLK_F_CONFIG_WCE: Cache writeback and ...,
VIRTIO_BLK_F_FLUSH: Flush command supported,
VIRTIO_BLK_F_ZONED: Zoned block device,
VIRTIO_BLK_F_WRITE_ZEROES: Write zeroes command supported,
VIRTIO_BLK_F_MQ: Multiqueue supported,
VIRTIO_BLK_F_TOPOLOGY: Topology information available,
VIRTIO_BLK_F_BLK_SIZE: Block size of disk available,
VIRTIO_BLK_F_GEOMETRY: Legacy geometry available,
VIRTIO_BLK_F_SEG_MAX: Max segments in a request is seg_max
  unknown-features(0x0100)
  Host features:
VIRTIO_RING_F_EVENT_IDX: Used & avail. event fields enabled,
VIRTIO_RING_F_INDIRECT_DESC: Indirect descriptors supported,
VIRTIO_F_VERSION_1: Device compliant for v1 spec (legacy),
VIRTIO_F_ANY_LAYOUT: Device accepts arbitrary desc. layouts,
VIRTIO_F_NOTIFY_ON_EMPTY: Notify when device ...,
VHOST_USER_F_PROTOCOL_FEATURES: Vhost-user protocol ...,
VIRTIO_BLK_F_CONFIG_WCE: Cache writeback and w...,
VIRTIO_BLK_F_FLUSH: Flush command supported,
VIRTIO_BLK_F_ZONED: Zoned block device,
VIRTIO_BLK_F_WRITE_ZEROES: Write zeroes command supported,
VIRTIO_BLK_F_MQ: Multiqueue supported,
VIRTIO_BLK_F_TOPOLOGY: Topology information available,
VIRTIO_BLK_F_BLK_SIZE: Block size of disk available,
VIRTIO_BLK_F_GEOMETRY: Legacy geometry available,
VIRTIO_BLK_F_SEG_MAX: Max segments in a request is seg_max

[PATCH for-8.0 v2 1/1] modules: load modules from /var/run/qemu/ directory firstly

2023-03-23 Thread Siddhi Katage

From: Siddhi Katage 

An old running QEMU will try to load modules with new build-id first, this
will fail as expected, then QEMU will fallback to load the old modules that
matches its build-id from /var/run/qemu/ directory.
Make /var/run/qemu/ directory as first search path to load modules.

Fixes: bd83c861c0 ("modules: load modules from versioned /var/run dir")
Signed-off-by: Siddhi Katage 
Reviewed-by: Philippe Mathieu-Daudé 
---
v2: Added reviewed-by tag
 
 util/module.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/util/module.c b/util/module.c
index 32e2631..b723d65 100644
--- a/util/module.c
+++ b/util/module.c
@@ -233,17 +233,17 @@ int module_load(const char *prefix, const char *name, 
Error **errp)
 g_hash_table_add(loaded_modules, module_name);
 
 search_dir = getenv("QEMU_MODULE_DIR");
-if (search_dir != NULL) {
-dirs[n_dirs++] = g_strdup_printf("%s", search_dir);
-}
-dirs[n_dirs++] = get_relocated_path(CONFIG_QEMU_MODDIR);
-
 #ifdef CONFIG_MODULE_UPGRADES
 version_dir = g_strcanon(g_strdup(QEMU_PKGVERSION),
  G_CSET_A_2_Z G_CSET_a_2_z G_CSET_DIGITS "+-.~",
  '_');
 dirs[n_dirs++] = g_strdup_printf("/var/run/qemu/%s", version_dir);
 #endif
+if (search_dir != NULL) {
+dirs[n_dirs++] = g_strdup_printf("%s", search_dir);
+}
+dirs[n_dirs++] = get_relocated_path(CONFIG_QEMU_MODDIR);
+
 assert(n_dirs <= ARRAY_SIZE(dirs));
 
 /* end of resources managed by the out: label */
-- 
1.8.3.1

Re: [PATCH] Hexagon (translate.c): avoid redundant PC updates on COF

2023-03-23 Thread Anton Johansson via




On 3/22/23 22:17, Matheus Tavares Bernardino wrote:

When there is a conditional change of flow or an endloop instruction, we
preload HEX_REG_PC with ctx->next_PC at gen_start_packet(). Nonetheless,
we still generate TCG code to do this update again at gen_goto_tb() when
the condition for the COF is not met, thus producing redundant
instructions. This can be seen with the following packet:

  0x004002e4:  0x5c20d000 {   if (!P0) jump:t PC+0 }

Which generates this TCG code:

 004002e4
-> mov_i32 pc,$0x4002e8
and_i32 loc9,p0,$0x1
mov_i32 branch_taken,loc9
add_i32 pkt_cnt,pkt_cnt,$0x2
add_i32 insn_cnt,insn_cnt,$0x2
brcond_i32 branch_taken,$0x0,ne,$L1
goto_tb $0x0
mov_i32 pc,$0x4002e4
exit_tb $0x7fb0c36e5200
set_label $L1
goto_tb $0x1
-> mov_i32 pc,$0x4002e8
exit_tb $0x7fb0c36e5201
set_label $L0
exit_tb $0x7fb0c36e5203

Note that even after optimizations, the redundant PC update is still
present:

 004002e4
-> mov_i32 pc,$0x4002e8 sync: 0  dead: 0 1  pref=0x
mov_i32 branch_taken,$0x1sync: 0  dead: 0 1  pref=0x
add_i32 pkt_cnt,pkt_cnt,$0x2 sync: 0  dead: 0 1  pref=0x
add_i32 insn_cnt,insn_cnt,$0x2   sync: 0  dead: 0 1 2  pref=0x
goto_tb $0x1
-> mov_i32 pc,$0x4002e8 sync: 0  dead: 0 1  pref=0x
exit_tb $0x7fb0c36e5201
set_label $L0
exit_tb $0x7fb0c36e5203

With this patch, the second redundant update is properly discarded.

Note that we need the additional "move_to_pc" flag instead of just
avoiding the update whenever `dest == ctx->next_PC`, as that could
potentially skip updates from a COF with met condition, whose
ctx->branch_dest just happens to be equal to ctx->next_PC.

Signed-off-by: Matheus Tavares Bernardino 
---
  target/hexagon/translate.c | 21 +
  1 file changed, 13 insertions(+), 8 deletions(-)

diff --git target/hexagon/translate.c target/hexagon/translate.c
index 665476ab48..58d638f734 100644
--- target/hexagon/translate.c
+++ target/hexagon/translate.c


Looks good, I appreciate the thorough motivation for this patch!

Reviewed-by: Anton Johansson 

--
Anton Johansson,
rev.ng Labs Srl.

Re: [PATCH for-8.0] aio-posix: fix race between epoll upgrade and aio_set_fd_handler()

On Thu, Mar 23, 2023 at 06:02:36AM +0100, Paolo Bonzini wrote:
> Il mer 22 mar 2023, 15:55 Stefan Hajnoczi  ha scritto:
> 
> > +/* The list must not change while we add fds to epoll */
> > +if (!qemu_lockcnt_dec_if_lock(>list_lock)) {
> > +return false;
> > +}
> > +
> > +ok = fdmon_epoll_try_enable(ctx);
> > +
> > +qemu_lockcnt_unlock(>list_lock);
> >
> 
> Shouldn't this be inc_and_unlock to balance the change made by dec_if_lock?

Yes, the caller expects list_lock to still be incremented. Thanks for
catching this!

Stefan


signature.asc
Description: PGP signature

Re: [PATCH v3] target/riscv: reduce overhead of MSTATUS_SUM change

On 3/23/2023 8:38 AM, Wu, Fei wrote:
> On 3/22/2023 9:19 PM, Richard Henderson wrote:
>> On 3/22/23 05:12, Fei Wu wrote:
>>> Kernel needs to access user mode memory e.g. during syscalls, the window
>>> is usually opened up for a very limited time through MSTATUS.SUM, the
>>> overhead is too much if tlb_flush() gets called for every SUM change.
>>>
>>> This patch creates a separate MMU index for S+SUM, so that it's not
>>> necessary to flush tlb anymore when SUM changes. This is similar to how
>>> ARM handles Privileged Access Never (PAN).
>>>
>>> Result of 'pipe 10' from unixbench boosts from 223656 to 1705006. Many
>>> other syscalls benefit a lot from this too.
>>>
>>> Signed-off-by: Fei Wu 
>>> ---
>>>   target/riscv/cpu-param.h  |  2 +-
>>>   target/riscv/cpu.h    |  2 +-
>>>   target/riscv/cpu_bits.h   |  1 +
>>>   target/riscv/cpu_helper.c | 11 +++
>>>   target/riscv/csr.c    |  2 +-
>>>   5 files changed, 15 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/target/riscv/cpu-param.h b/target/riscv/cpu-param.h
>>> index ebaf26d26d..9e21b943f9 100644
>>> --- a/target/riscv/cpu-param.h
>>> +++ b/target/riscv/cpu-param.h
>>> @@ -27,6 +27,6 @@
>>>    *  - S mode HLV/HLVX/HSV 0b101
>>>    *  - M mode HLV/HLVX/HSV 0b111
>>>    */
>>> -#define NB_MMU_MODES 8
>>> +#define NB_MMU_MODES 16
>>
>> This line no longer exists on master.
>> The comment above should be updated, and perhaps moved.
>>
>>>   #define TB_FLAGS_PRIV_MMU_MASK    3
>>> -#define TB_FLAGS_PRIV_HYP_ACCESS_MASK   (1 << 2)
>>> +#define TB_FLAGS_PRIV_HYP_ACCESS_MASK   (1 << 3)
>>
>> You can't do this, as you're now overlapping
>>
> As you mentioned below HYP_ACCESS_MASK is set directly by hyp
> instruction translation, there is no overlapping if it's not part of
> TB_FLAGS.
> 
>> FIELD(TB_FLAGS, LMUL, 3, 3)
>>
>> You'd need to shift all other fields up to do this.
>> There is room, to be sure.
>>
>> Or you could reuse MMU mode number 2.  For that you'd need to separate
>> DisasContext.mem_idx from priv.  Which should probably be done anyway,
>> because tests such as
>>
> Yes, it looks good to reuse number 2. I tried this v3 patch again with a
> different MMUIdx_S_SUM number, only 5 is okay below 8, for the other
> number there is no kernel message from guest after opensbi output. I
> need to find it out.
> 
>> insn_trans/trans_privileged.c.inc:    if
>> (semihosting_enabled(ctx->mem_idx < PRV_S) &&
>>
>> are already borderline wrong.
>> Yes, it's better not to compare idx to priv.
> 
>> I suggest
>>
>> - #define TB_FLAGS_PRIV_MMU_MASK    3
>> - #define TB_FLAGS_PRIV_HYP_ACCESS_MASK   (1 << 2)
>>
>> HYP_ACCESS_MASK never needed to be part of TB_FLAGS; it is only set
>> directly by the hyp access instruction translation.  Drop the PRIV mask
>> and represent that directly:
>>
>> - FIELD(TB_FLAGS, MEM_IDX, 0, 3)
>> + FIELD(TB_FLAGS, PRIV, 0, 2)
>> + FIELD(TB_FLAGS, SUM, 2, 1)
>>
>> Let SUM occupy the released bit.
>>
>> In internals.h,
>>
>> /*
>>  * The current MMU Modes are:
>>  *  - U 0b000
>>  *  - S 0b001
>>  *  - S+SUM 0b010
>>  *  - M 0b011
>>  *  - HLV/HLVX/HSV adds 0b100
>>  */
>> #define MMUIdx_U    0
>> #define MMUIdx_S    1
>> #define MMUIdx_S_SUM    2
>> #define MMUIdx_M    3
>> #define MMU_HYP_ACCESS_BIT  (1 << 2)
>>
>>
>> In riscv_tr_init_disas_context:
>>
>>     ctx->priv = FIELD_EX32(tb_flags, TB_FLAGS, PRIV);
>>     ctx->mmu_idx = ctx->priv;
>>     if (ctx->mmu_idx == PRV_S && FIELD_EX32(tb_flags, TB_FLAGS, SUM)) {
>>     ctx->mmu_idx = MMUIdx_S_SUM;
>>     }
>>
> There is MSTATUS_MPRV and MSTATUS_MPP kind of thing, priv+sum is not
> able to represent all of the status, probably we can just add an extra
> 'priv' at the back of TB_FLAGS?
> 
MPRV+MPP looks not necessary be in TB_FLAGS, it's just used to calculate
the mmu_idx. Is it necessary to encode SUM into TB_FLAGS?

Thanks,
Fei.

> Thanks,
> Fei.
> 
>> and similarly in riscv_cpu_mmu_index.
>>
>> Fix all uses of ctx->mmu_idx that are not specifically for memory
>> operations.
>>
>>
>> r~
>

Re: [PATCH v8 2/4] virtio-blk: add zoned storage emulation for zoned devices

2023-03-23 Thread Sam Li

Matias Bjørling  于2023年3月23日周四 21:39写道：
>
> On 23/03/2023 06.28, Sam Li wrote:
> > This patch extends virtio-blk emulation to handle zoned device commands
> > by calling the new block layer APIs to perform zoned device I/O on
> > behalf of the guest. It supports Report Zone, four zone oparations (open,
> > close, finish, reset), and Append Zone.
> >
> > The VIRTIO_BLK_F_ZONED feature bit will only be set if the host does
> > support zoned block devices. Regular block devices(conventional zones)
> > will not be set.
> >
> > The guest os can use blktests, fio to test those commands on zoned devices.
> > Furthermore, using zonefs to test zone append write is also supported.
> >
> > Signed-off-by: Sam Li 
> > ---
> >   hw/block/virtio-blk-common.c |   2 +
> >   hw/block/virtio-blk.c| 389 +++
> >   2 files changed, 391 insertions(+)
> >
> > diff --git a/hw/block/virtio-blk-common.c b/hw/block/virtio-blk-common.c
> > index ac52d7c176..e2f8e2f6da 100644
> > --- a/hw/block/virtio-blk-common.c
> > +++ b/hw/block/virtio-blk-common.c
> > @@ -29,6 +29,8 @@ static const VirtIOFeature feature_sizes[] = {
> >.end = endof(struct virtio_blk_config, discard_sector_alignment)},
> >   {.flags = 1ULL << VIRTIO_BLK_F_WRITE_ZEROES,
> >.end = endof(struct virtio_blk_config, write_zeroes_may_unmap)},
> > +{.flags = 1ULL << VIRTIO_BLK_F_ZONED,
> > + .end = endof(struct virtio_blk_config, zoned)},
> >   {}
> >   };
>
> I used the qemu monitor to expect the state of the devices, and on the
> zoned block device specific entries, the zoned device feature shows up
> in the "unknown-features" field (info virtio-status )
>
> What is missing is an entry in the blk_feature_map structure within
> hw/virtio/virtio-qmp.c. The below fixes it up.
>
> diff --git i/hw/virtio/virtio-qmp.c w/hw/virtio/virtio-qmp.c
> index b70148aba9..3efa529bab 100644
> --- i/hw/virtio/virtio-qmp.c
> +++ w/hw/virtio/virtio-qmp.c
> @@ -176,6 +176,8 @@ static const qmp_virtio_feature_map_t
> virtio_blk_feature_map[] = {
>   "VIRTIO_BLK_F_DISCARD: Discard command supported"),
>   FEATURE_ENTRY(VIRTIO_BLK_F_WRITE_ZEROES, \
>   "VIRTIO_BLK_F_WRITE_ZEROES: Write zeroes command supported"),
> +FEATURE_ENTRY(VIRTIO_BLK_F_ZONED, \
> +"VIRTIO_BLK_F_ZONED: Zoned block device"),
>   #ifndef VIRTIO_BLK_NO_LEGACY
>   FEATURE_ENTRY(VIRTIO_BLK_F_BARRIER, \
>   "VIRTIO_BLK_F_BARRIER: Request barriers supported"),
>
> Which then lets qemu report the support like this:
>
> (qemu) info virtio-status /machine/peripheral/virtblk0/virtio-backend
> /machine/peripheral/virtblk0/virtio-backend:
>device_name: virtio-blk
>device_id:   2
>vhost_started:   false
>bus_name:(null)
>broken:  false
>disabled:false
>disable_legacy_check:false
>started: true
>use_started: true
>start_on_kick:   false
>use_guest_notifier_mask: true
>vm_running:  true
>num_vqs: 4
>queue_sel:   3
>isr: 1
>endianness:  little
>status:
>  VIRTIO_CONFIG_S_ACKNOWLEDGE: Valid virtio device found,
>  VIRTIO_CONFIG_S_DRIVER: Guest OS compatible with device,
>  VIRTIO_CONFIG_S_FEATURES_OK: Feature negotiation complete,
>  VIRTIO_CONFIG_S_DRIVER_OK: Driver setup and ready
>Guest features:
>  VIRTIO_RING_F_EVENT_IDX: Used & avail. event fields enabled,
>  VIRTIO_RING_F_INDIRECT_DESC: Indirect descriptors supported,
>  VIRTIO_F_VERSION_1: Device compliant for v1 spec (legacy)
>  VIRTIO_BLK_F_CONFIG_WCE: Cache writeback and ...,
>  VIRTIO_BLK_F_FLUSH: Flush command supported,
>  VIRTIO_BLK_F_ZONED: Zoned block device,
>  VIRTIO_BLK_F_WRITE_ZEROES: Write zeroes command supported,
>  VIRTIO_BLK_F_MQ: Multiqueue supported,
>  VIRTIO_BLK_F_TOPOLOGY: Topology information available,
>  VIRTIO_BLK_F_BLK_SIZE: Block size of disk available,
>  VIRTIO_BLK_F_GEOMETRY: Legacy geometry available,
>  VIRTIO_BLK_F_SEG_MAX: Max segments in a request is seg_max
>unknown-features(0x0100)
>Host features:
>  VIRTIO_RING_F_EVENT_IDX: Used & avail. event fields enabled,
>  VIRTIO_RING_F_INDIRECT_DESC: Indirect descriptors supported,
>  VIRTIO_F_VERSION_1: Device compliant for v1 spec (legacy),
>  VIRTIO_F_ANY_LAYOUT: Device accepts arbitrary desc. layouts,
>  VIRTIO_F_NOTIFY_ON_EMPTY: Notify when device ...,
>  VHOST_USER_F_PROTOCOL_FEATURES: Vhost-user protocol ...,
>  VIRTIO_BLK_F_CONFIG_WCE: Cache writeback and w...,
>  VIRTIO_BLK_F_FLUSH: Flush command supported,
>  VIRTIO_BLK_F_ZONED: Zoned block device,
>  VIRTIO_BLK_F_WRITE_ZEROES:

Re: [PATCH 1/1] accel/kvm/kvm-all: fix vm crash when set dirty ring and memorybacking

2023-03-23 Thread Peter Xu

On Thu, Mar 23, 2023 at 09:19:15PM +0800, Jiajing Zhou wrote:
> From: "zhoujiajing.vergil" 
> 
> It is possible enter this function when the cpu not finished creating but
> is already in the cpu list. The value of dirty_gfns is null, causing vm
> crash here.
> 
> When both dirty-ring and memorybacking are set, creating a vm will assert
> on kvm_dirty_ring_reap_one. Part of the xml as follows:
> 
> 
> 
>   ...
>   
> 
>   
> 
>   
>   ...
>   
> 
> 
>   
> 
>   
>   ...
> 
> 
> The kvm-reaper thread was created before vcpu thread, and the value of
> cpu->kvm_dirty_gfns is assigned at cpu thread. In the x86_cpu_realizefn
> function, the cpu is inserted into the cpu list first, and then the cpu
> thread is created for initialization. The entry functions are
> cpu_exec_realizefn and qemu_init_vcpu. In the existing logic, the
> kvm-reaper thread traverses the cpu list every second and finally call
> kvm_dirty_ring_reap_one for each cpu in the list. If cpu has been inserted
> into cpu list but has not been initialized so that the value of dirty_gfns
> is null, kvm-reaper thread call kvm_dirty_ring_reap_one will cause vm crash.
> 
> The call stack is as follows:
>   kvm_dirty_ring_reaper_thread
> -> kvm_dirty_ring_reap
>->kvm_dirty_ring_reap_locked
>  ->kvm_dirty_ring_reap_one
> 
> 
> Signed-off-by: zhoujiajing.vergil 

Acked-by: Peter Xu 

And there's a prior fix last year:

https://lore.kernel.org/r/20220927154653.77296-1-pet...@redhat.com

The most recent post that I'm aware of is by Yong:

https://lore.kernel.org/r/1d14deb6684bcb7de1c9633c5bd21113988cc698.1676563222.git.huang...@chinatelecom.cn

A bunch of people hit this already.

Paolo, ping yet again - would you consider merging any of the versions?
For this one I'd think it'll be good to have even for 8.0.

Thanks!

-- 
Peter Xu

Re: [PATCH v4 1/2] target/riscv: separate priv from mmu_idx

On 3/23/2023 2:59 PM, LIU Zhiwei wrote:
> 
> On 2023/3/23 14:00, Wu, Fei wrote:
>> On 3/23/2023 1:37 PM, LIU Zhiwei wrote:
>>> On 2023/3/23 10:44, Fei Wu wrote:
 Currently it's assumed the 2 low bits of mmu_idx map to privilege mode,
 this assumption won't last as we are about to add more mmu_idx.
>>> For patch set has more than 1 patch, usually add a cover letter.
>> This is cover letter:
>>     https://www.mail-archive.com/qemu-devel@nongnu.org/msg950849.html
>>
>> I added scripts/get_maintainer.pl to .git/config,
> Interesting.
>> it couldn't find out
>> the maintainers for the cover letter, so I added the mail lists to "To"
>> manually.
> Maybe you should also cc to maintainers manually. I don't know the
> automatically way.
>>
 Signed-off-by: Fei Wu 
 ---
    target/riscv/cpu.h | 1 -
    target/riscv/cpu_helper.c  | 2 +-
    target/riscv/insn_trans/trans_privileged.c.inc | 2 +-
    target/riscv/insn_trans/trans_xthead.c.inc | 7 +--
    target/riscv/translate.c   | 3 +++
    5 files changed, 6 insertions(+), 9 deletions(-)

 diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
 index 638e47c75a..66f7e3d1ba 100644
 --- a/target/riscv/cpu.h
 +++ b/target/riscv/cpu.h
 @@ -623,7 +623,6 @@ G_NORETURN void
 riscv_raise_exception(CPURISCVState *env,
    target_ulong riscv_cpu_get_fflags(CPURISCVState *env);
    void riscv_cpu_set_fflags(CPURISCVState *env, target_ulong);
    -#define TB_FLAGS_PRIV_MMU_MASK    3
    #define TB_FLAGS_PRIV_HYP_ACCESS_MASK   (1 << 2)
    #define TB_FLAGS_MSTATUS_FS MSTATUS_FS
    #define TB_FLAGS_MSTATUS_VS MSTATUS_VS
 diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
 index f88c503cf4..76e1b0100e 100644
 --- a/target/riscv/cpu_helper.c
 +++ b/target/riscv/cpu_helper.c
 @@ -762,7 +762,7 @@ static int get_physical_address(CPURISCVState
 *env, hwaddr *physical,
     * (riscv_cpu_do_interrupt) is correct */
    MemTxResult res;
    MemTxAttrs attrs = MEMTXATTRS_UNSPECIFIED;
 -    int mode = mmu_idx & TB_FLAGS_PRIV_MMU_MASK;
 +    int mode = env->priv;
    bool use_background = false;
    hwaddr ppn;
    RISCVCPU *cpu = env_archcpu(env);
 diff --git a/target/riscv/insn_trans/trans_privileged.c.inc
 b/target/riscv/insn_trans/trans_privileged.c.inc
 index 59501b2780..9305b18299 100644
 --- a/target/riscv/insn_trans/trans_privileged.c.inc
 +++ b/target/riscv/insn_trans/trans_privileged.c.inc
 @@ -52,7 +52,7 @@ static bool trans_ebreak(DisasContext *ctx,
 arg_ebreak *a)
     * that no exception will be raised when fetching them.
     */
    -    if (semihosting_enabled(ctx->mem_idx < PRV_S) &&
 +    if (semihosting_enabled(ctx->priv < PRV_S) &&
    (pre_addr & TARGET_PAGE_MASK) == (post_addr &
 TARGET_PAGE_MASK)) {
    pre    = opcode_at(>base, pre_addr);
    ebreak = opcode_at(>base, ebreak_addr);
 diff --git a/target/riscv/insn_trans/trans_xthead.c.inc
 b/target/riscv/insn_trans/trans_xthead.c.inc
 index df504c3f2c..adfb53cb4c 100644
 --- a/target/riscv/insn_trans/trans_xthead.c.inc
 +++ b/target/riscv/insn_trans/trans_xthead.c.inc
 @@ -265,12 +265,7 @@ static bool trans_th_tst(DisasContext *ctx,
 arg_th_tst *a)
      static inline int priv_level(DisasContext *ctx)
    {
 -#ifdef CONFIG_USER_ONLY
 -    return PRV_U;
 -#else
 - /* Priv level is part of mem_idx. */
 -    return ctx->mem_idx & TB_FLAGS_PRIV_MMU_MASK;
 -#endif
 +    return ctx->priv;
    }
      /* Test if priv level is M, S, or U (cannot fail). */
 diff --git a/target/riscv/translate.c b/target/riscv/translate.c
 index 0ee8ee147d..e8880f9423 100644
 --- a/target/riscv/translate.c
 +++ b/target/riscv/translate.c
 @@ -69,6 +69,7 @@ typedef struct DisasContext {
    uint32_t mstatus_hs_fs;
    uint32_t mstatus_hs_vs;
    uint32_t mem_idx;
 +    uint32_t priv;
    /* Remember the rounding mode encoded in the previous fp
 instruction,
   which we have already installed into env->fp_status.  Or
 -1 for
   no previous fp instruction.  Note that we exit the TB when
 writing
 @@ -1162,8 +1163,10 @@ static void
 riscv_tr_init_disas_context(DisasContextBase *dcbase, CPUState *cs)
    } else {
    ctx->virt_enabled = false;
    }
 +    ctx->priv = env->priv;
>>> This is not right. You should put env->priv into tb flags before you use
>>> it in translation.
>>>
>> I see some other env usages in this function,
> Don't do it that way. It just be merged by accident. It will make review
> harder and probably be wrong.
>> when will env->priv and
>> tb_flags.priv mismatch

Re: QMP command dumpdtb crash bug

On Thu, 23 Mar 2023 at 13:29, Markus Armbruster  wrote:
>
> Peter, Daniel offers two ways to fix this bug (see below).  Got a
> preference?

Not freeing seems the correct thing. As Daniel says, this
should have been a prerequisite for implementing the
command in the first place (you need to change the lifecycle
of the fdt blob from "delete when done with in the arm boot code"
to "delete on machine finalize"). It looks like somehow we added
the command but missed out on getting all of the prerequisite
patches in. (File under "need to be cautious about applying partial
patchsets", I guess.)

Did anything else from that initial patchset get omitted?

thanks
-- PMM

Re: [PATCH v8 0/4] Add zoned storage emulation to virtio-blk driver

2023-03-23 Thread Sam Li

Matias Bjørling  于2023年3月23日周四 21:26写道：
>
> On 23/03/2023 06.28, Sam Li wrote:
> > This patch adds zoned storage emulation to the virtio-blk driver.
> >
> > The patch implements the virtio-blk ZBD support standardization that is
> > recently accepted by virtio-spec. The link to related commit is at
> >
> > https://github.com/oasis-tcs/virtio-spec/commit/b4e8efa0fa6c8d844328090ad15db65af8d7d981
> >
> > The Linux zoned device code that implemented by Dmitry Fomichev has been
> > released at the latest Linux version v6.3-rc1.
> >
> > Aside: adding zoned=on alike options to virtio-blk device will be
> > considered in following-up plan.
> >
> > v7:
> > - address Stefan's review comments
> >* rm aio_context_acquire/release in handle_req
> >* rename function return type
> >* rename BLOCK_ACCT_APPEND to BLOCK_ACCT_ZONE_APPEND for clarity
> >
> > v6:
> > - update headers to v6.3-rc1
> >
> > v5:
> > - address Stefan's review comments
> >* restore the way writing zone append result to buffer
> >* fix error checking case and other errands
> >
> > v4:
> > - change the way writing zone append request result to buffer
> > - change zone state, zone type value of virtio_blk_zone_descriptor
> > - add trace events for new zone APIs
> >
> > v3:
> > - use qemuio_from_buffer to write status bit [Stefan]
> > - avoid using req->elem directly [Stefan]
> > - fix error checkings and memory leak [Stefan]
> >
> > v2:
> > - change units of emulated zone op coresponding to block layer APIs
> > - modify error checking cases [Stefan, Damien]
> >
> > v1:
> > - add zoned storage emulation
> >
> > Sam Li (4):
> >include: update virtio_blk headers to v6.3-rc1
> >virtio-blk: add zoned storage emulation for zoned devices
> >block: add accounting for zone append operation
> >virtio-blk: add some trace events for zoned emulation
> >
> >   block/qapi-sysemu.c  |  11 +
> >   block/qapi.c |  18 +
> >   hw/block/trace-events|   7 +
> >   hw/block/virtio-blk-common.c |   2 +
> >   hw/block/virtio-blk.c| 405 +++
> >   include/block/accounting.h   |   1 +
> >   include/standard-headers/drm/drm_fourcc.h|  12 +
> >   include/standard-headers/linux/ethtool.h |  48 ++-
> >   include/standard-headers/linux/fuse.h|  45 ++-
> >   include/standard-headers/linux/pci_regs.h|   1 +
> >   include/standard-headers/linux/vhost_types.h |   2 +
> >   include/standard-headers/linux/virtio_blk.h  | 105 +
> >   linux-headers/asm-arm64/kvm.h|   1 +
> >   linux-headers/asm-x86/kvm.h  |  34 +-
> >   linux-headers/linux/kvm.h|   9 +
> >   linux-headers/linux/vfio.h   |  15 +-
> >   linux-headers/linux/vhost.h  |   8 +
> >   qapi/block-core.json |  62 ++-
> >   qapi/block.json  |   4 +
> >   19 files changed, 769 insertions(+), 21 deletions(-)
> >
>
>
> Hi Sam,
>
> I applied your patches and can report that they work with both SMR HDDs
> and ZNS SSDs. Very nice work!
>
> Regarding the documentation (docs/system/qemu-block-drivers.rst.inc). Is
> it possible to expose the host's zoned block device through something
> else than virtio-blk? If not, I wouldn't mind seeing the documentation
> updated to show a case when using the virtio-blk driver.
>
> For example (this also includes the device part):
>
> -device virtio-blk-pci,drive=drive0,id=virtblk0 \
> -blockdev
> host_device,node-name=drive0,filename=/dev/nullb0,cache.direct=on``
>
> It might also be nice to describe the shorthand for those that likes to
> pass in the parameters using only the -drive parameter.
>
>   -drive driver=host_device,file=/dev/nullb0,if=virtio,cache.direct=on

Hi Matias,

I'm glad it works. Thanks for your feedback!

For the question, this patch is exposing the zoned interface through
virtio-blk only. It's a good suggestion to put a use case inside
documentation. I will add it in the subsequent patch.

Thanks,
Sam

Re: QMP command dumpdtb crash bug

Peter, Daniel offers two ways to fix this bug (see below).  Got a
preference?

Daniel Henrique Barboza  writes:

> On 3/23/23 03:29, Markus Armbruster wrote:
>> Watch this:
>> 
>>  $ gdb --args ../qemu/bld/qemu-system-aarch64 -S -M virt -display none 
>> -qmp stdio
>>  [...]
>>  (gdb) r
>>  [...]
>>  {"QMP": {"version": {"qemu": {"micro": 50, "minor": 2, "major": 7}, 
>> "package": "v7.2.0-2331-gda89f78a7d"}, "capabilities": ["oob"]}}
>>  [New Thread 0x7fffed62c6c0 (LWP 1021967)]
>>  {"execute": "qmp_capabilities", "arguments": {"enable": ["oob"]}}
>>  {"return": {}}
>>  {"execute": "dumpdtb", "arguments": {"filename": "fdt.dtb"}}
>> 
>>  Thread 1 "qemu-system-aar" received signal SIGSEGV, Segmentation fault.
>>  qmp_dumpdtb (filename=0x581c5170 "fdt.dtb", 
>> errp=errp@entry=0x7fffdae8)
>>  at ../softmmu/device_tree.c:661
>>  661 size = fdt_totalsize(current_machine->fdt);
>> 
>> current_machine->fdt is non-null here.  The crash is within
>> fdt_totalsize().
>
>
> Back when I added this command [1] one of the patches of this series was:
>
> [PATCH v8 03/16] hw/arm: do not free machine->fdt in arm_load_dtb()
>
> https://lists.gnu.org/archive/html/qemu-devel/2022-09/msg04201.html
>
> The patch aimed to address what you're seeing here. machine->fdt is not NULL,
> but it was freed in arm_load_dtb() without assigning it to NULL and it 
> contains
> junk.
>
> Back then this patch got no acks/reviews and got left behind. If I apply it 
> now
> at current master your example works.
>
>> 
>> I suspect ...
>> 
>>  void qmp_dumpdtb(const char *filename, Error **errp)
>>  {
>>  g_autoptr(GError) err = NULL;
>>  uint32_t size;
>> 
>>  if (!current_machine->fdt) {
>>  error_setg(errp, "This machine doesn't have a FDT");
>>  return;
>>  }
>> 
>> ... we're missing an "FDT isn't ready" guard here.
>
>
> There might be a case where a guard would be needed, but for all ARM machines 
> that
> uses arm_load_dtb() I can say that the dumpdtb is broken regardless of 
> whether you
> attempt it during early boot or not.
>
> One solution is to just apply the patch I mentioned above. Another solution 
> is to
> make a g_free(fdt) in arm_load_dtb and also do a ms->fdt = NULL to tell 
> dumpdtb()
> that there is no fdt available.
>
> I don't see any particular reason why arm machines can't support this 
> command, so
> let me know and I can re-send that patch.
>
>
>
> Thanks,
>
>
> Daniel
>
>
> [1] https://lists.gnu.org/archive/html/qemu-devel/2022-09/msg04190.html
>
>> 
>>  size = fdt_totalsize(current_machine->fdt);
>> 
>>  g_assert(size > 0);
>> 
>>  if (!g_file_set_contents(filename, current_machine->fdt, size, 
>> )) {
>>  error_setg(errp, "Error saving FDT to file %s: %s",
>> filename, err->message);
>>  }
>>  }
>> 
>> Also, I think the error message "does not have a FDT" should say "an
>> FDT".
>>

Re: [PATCH] hw/xenpv: Initialize Xen backend operations

2023-03-23 Thread Paul Durrant


On 23/03/2023 10:57, David Woodhouse wrote:

From: David Woodhouse 

As the Xen backend operations were abstracted out into a function table to
allow for internally emulated Xen support, we missed the xen_init_pv()
code path which also needs to install the operations for the true Xen
libraries. Add the missing call to setup_xen_backend_ops().

Fixes: b6cacfea0b38 ("hw/xen: Add evtchn operations to allow redirection to internal 
emulation")
Reported-by: Anthony PERARD 
Signed-off-by: David Woodhouse 
---
  hw/xenpv/xen_machine_pv.c | 2 ++
  1 file changed, 2 insertions(+)


Reviewed-by: Paul Durrant 



diff --git a/hw/xenpv/xen_machine_pv.c b/hw/xenpv/xen_machine_pv.c
index 2e759d0619..17cda5ec13 100644
--- a/hw/xenpv/xen_machine_pv.c
+++ b/hw/xenpv/xen_machine_pv.c
@@ -35,6 +35,8 @@ static void xen_init_pv(MachineState *machine)
  DriveInfo *dinfo;
  int i;
  
+setup_xen_backend_ops();

+
  /* Initialize backend core & drivers */
  xen_be_init();

[PATCH 1/1] accel/kvm/kvm-all: fix vm crash when set dirty ring and memorybacking

2023-03-23 Thread Jiajing Zhou

From: "zhoujiajing.vergil" 

It is possible enter this function when the cpu not finished creating but
is already in the cpu list. The value of dirty_gfns is null, causing vm
crash here.

When both dirty-ring and memorybacking are set, creating a vm will assert
on kvm_dirty_ring_reap_one. Part of the xml as follows:



  ...
  

  

  
  ...
  


  

  
  ...


The kvm-reaper thread was created before vcpu thread, and the value of
cpu->kvm_dirty_gfns is assigned at cpu thread. In the x86_cpu_realizefn
function, the cpu is inserted into the cpu list first, and then the cpu
thread is created for initialization. The entry functions are
cpu_exec_realizefn and qemu_init_vcpu. In the existing logic, the
kvm-reaper thread traverses the cpu list every second and finally call
kvm_dirty_ring_reap_one for each cpu in the list. If cpu has been inserted
into cpu list but has not been initialized so that the value of dirty_gfns
is null, kvm-reaper thread call kvm_dirty_ring_reap_one will cause vm crash.

The call stack is as follows:
  kvm_dirty_ring_reaper_thread
-> kvm_dirty_ring_reap
   ->kvm_dirty_ring_reap_locked
 ->kvm_dirty_ring_reap_one


Signed-off-by: zhoujiajing.vergil 
---
 accel/kvm/kvm-all.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index f2a6ea6a68..698a59162a 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -685,6 +685,11 @@ static uint32_t kvm_dirty_ring_reap_one(KVMState *s, 
CPUState *cpu)
 uint32_t ring_size = s->kvm_dirty_ring_size;
 uint32_t count = 0, fetch = cpu->kvm_fetch_index;
 
+/* return 0 when cpu not finished creating */
+if (!cpu->created) {
+return 0;
+}
+
 assert(dirty_gfns && ring_size);
 trace_kvm_dirty_ring_reap_vcpu(cpu->cpu_index);
 
-- 
2.40.0

vhost-user's disk can support physical_block_size=4096, logical_block_size=512?

2023-03-23 Thread 顾泽兵 via

Hello community,

I have a disk with both logical and physical sector size being 4096. I have
a qcow2 image which is built from a virtual machine has legacy 512 bytes
sector size.
I use spdk-vhost as the backend storage.

The commandline of qemu is like this:
/usr/bin/qemu-system-x86_64  \
-nographic \
-name debug-threads=on -nic none  -m 8G,slots=3,maxmem=16G -mem-prealloc
-cpu host  -enable-kvm  -M q35 \
-smp 8,sockets=4,cores=2,threads=1 \
-object
memory-backend-file,id=ram-node0,mem-path=/dev/hugepages,share=yes,size=8G,host-nodes=0,policy=bind
\
-numa node,nodeid=0,cpus='0-7',memdev=ram-node0 \
-device
pcie-root-port,port=0x10,chassis=1,id=pci.1,bus=pcie.0,multifunction=on,addr=0x2
\
-object iothread,id=iothread1 \
-object iothread,id=iothread2 \
-device pcie-root-port,port=0x11,chassis=2,id=pci.2,bus=pcie.0,addr=0x2.0x1
\
-device pcie-root-port,port=0x12,chassis=3,id=pci.3,bus=pcie.0,addr=0x2.0x2
\
-device pcie-root-port,port=0x13,chassis=4,id=pci.4,bus=pcie.0,addr=0x2.0x3
\
-device pcie-root-port,port=0x14,chassis=5,id=pci.5,bus=pcie.0,addr=0x2.0x4
\
*-chardev
socket,id=charvirtio-disk1,path=/var/run/spdk/qemu/disk0.sock,reconnect=1 \*
*-device
vhost-user-blk-pci-non-transitional,chardev=charvirtio-disk1,bus=pci.4,addr=0x0,id=virtio-disk1,num-queues=2,bootindex=0
\*


The result is that the system can’t start normally.
As I konw, there are 2 questios:
1、Seabios can't support sector size of 4096.
2、The qcow2 image which is built with sector size of 512 can’t interpreted
with sector size of 4096. The key point is MBR partion table.


I have konwn that QEMU is supposed to emulate 512 byte sectors on top of a
4k sector disk on the host. The command line is like this:
'physical_block_size=4096,logical_block_size=512’.

Unfortunately, When I add 'physical_block_size=4096,logical_block_size=512’
to vhost-user-blk-pci-non-transitional, qemu print error:
Property 'vhost-user-blk-pci-non-transitional.physical_block_size' not
found.

Can we have a plan to support params of
'physical_block_size=4096,logical_block_size=512’
to vhost-user-blk-pci-non-transitional? Or any other advice on how to
possibly work around this?
Thank you very much!

[RFC] accel/kvm/kvm-all: fix vm crash when set dirty ring and memorybacking

2023-03-23 Thread Jiajing Zhou

From: "zhoujiajing.vergil" 

It is possible enter this function when the cpu not finished creating but
is already in the cpu list. The value of dirty_gfns is null, causing vm
crash here.

The call stack is as follows:
  kvm_dirty_ring_reaper_thread
-> kvm_dirty_ring_reap
   ->kvm_dirty_ring_reap_locked
 ->kvm_dirty_ring_reap_one


When both dirty-ring and memorybacking are set, creating a vm will assert 
on kvm_dirty_ring_reap_one. Part of the xml as follows:



  ...
  

  

  
  ...
  


  

  
  ...


The kvm-reaper thread was created before vcpu thread, and the value of 
cpu->kvm_dirty_gfns is assigned at cpu thread. In the x86_cpu_realizefn 
function, the cpu is inserted into the cpu list first, and then the cpu 
thread is created for initialization. The entry functions are 
cpu_exec_realizefn and qemu_init_vcpu. In the existing logic, the 
kvm-reaper thread traverses the cpu list every second and finally call 
kvm_dirty_ring_reap_one for each cpu in the list. If cpu has been inserted 
into cpu list but has not been initialized so that the value of dirty_gfns 
is null, kvm-reaper thread call kvm_dirty_ring_reap_one will cause vm crash.


Signed-off-by: zhoujiajing.vergil 
---
 accel/kvm/kvm-all.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index f2a6ea6a68..ecd873fe73 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -685,6 +685,10 @@ static uint32_t kvm_dirty_ring_reap_one(KVMState *s, 
CPUState *cpu)
 uint32_t ring_size = s->kvm_dirty_ring_size;
 uint32_t count = 0, fetch = cpu->kvm_fetch_index;
 
+/* return 0 when cpu not finished creating */
+if(!cpu->created)
+   return 0;
+
 assert(dirty_gfns && ring_size);
 trace_kvm_dirty_ring_reap_vcpu(cpu->cpu_index);
 
-- 
2.20.1

Re: [PATCH 00/45] Add RISC-V vector cryptographic instruction set support


Hi,

On 3/10/23 13:03, Lawrence Hunter wrote:

NB: this is an update over the patch series submitted today (2023/03/10) at 
09:11. It fixes some accidental mangling of commits 02, 04 and 08/45.

This patchset provides an implementation for Zvkb, Zvkned, Zvknh, Zvksh, Zvkg, 
and Zvksed of the draft RISC-V vector cryptography extensions as per the 
20230303 version of the specification(1) (1fcbb30). Please note that the Zvkt 
data-independent execution latency extension has not been implemented, and we 
would recommend not using these patches in an environment where timing attacks 
are an issue.

Work performed by Dickon, Lawrence, Nazar, Kiran, and William from Codethink 
sponsored by SiFive, as well as Max Chou and Frank Chang from SiFive.

For convenience we have created a git repo with our patches on top of a recent 
master. https://github.com/CodethinkLabs/qemu-ct

1. https://github.com/riscv/riscv-crypto/releases


For the next versions I suggest CCing qemu-ri...@nongnu.org as well. It's an
easier way to keep track of RISC-V contributions since qemu-devel has a lot of
traffic.

Thanks,


Daniel



Dickon Hood (2):
   qemu/bitops.h: Limit rotate amounts
   target/riscv: Add vrol.[vv,vx] and vror.[vv,vx,vi] decoding,
 translation and execution support

Kiran Ostrolenk (7):
   target/riscv: Refactor some of the generic vector functionality
   target/riscv: Refactor some of the generic vector functionality
   target/riscv: Refactor some of the generic vector functionality
   target/riscv: Add vsha2ms.vv decoding, translation and execution
 support
   target/riscv: Add zvksh cpu property
   target/riscv: Add vsm3c.vi decoding, translation and execution support
   target/riscv: Expose zvksh cpu property

Lawrence Hunter (17):
   target/riscv: Add vclmul.vv decoding, translation and execution
 support
   target/riscv: Add vclmul.vx decoding, translation and execution
 support
   target/riscv: Add vclmulh.vv decoding, translation and execution
 support
   target/riscv: Add vclmulh.vx decoding, translation and execution
 support
   target/riscv: Add vaesef.vv decoding, translation and execution
 support
   target/riscv: Add vaesef.vs decoding, translation and execution
 support
   target/riscv: Add vaesdf.vv decoding, translation and execution
 support
   target/riscv: Add vaesdf.vs decoding, translation and execution
 support
   target/riscv: Add vaesdm.vv decoding, translation and execution
 support
   target/riscv: Add vaesdm.vs decoding, translation and execution
 support
   target/riscv: Add vaesz.vs decoding, translation and execution support
   target/riscv: Add vsha2c[hl].vv decoding, translation and execution
 support
   target/riscv: Add vsm3me.vv decoding, translation and execution
 support
   target/riscv: Add zvkg cpu property
   target/riscv: Add vgmul.vv decoding, translation and execution support
   target/riscv: Add vghsh.vv decoding, translation and execution support
   target/riscv: Expose zvkg cpu property

Max Chou (5):
   crypto: Create sm4_subword
   crypto: Add SM4 constant parameter CK
   target/riscv: Add zvksed cfg property
   target/riscv: Add Zvksed support
   target/riscv: Expose Zvksed property

Nazar Kazakov (11):
   target/riscv: Add zvkb cpu property
   target/riscv: Refactor some of the generic vector functionality
   target/riscv: Add vrev8.v decoding, translation and execution support
   target/riscv: Add vandn.[vv,vx] decoding, translation and execution
 support
   target/riscv: Expose zvkb cpu property
   target/riscv: Add zvkned cpu property
   target/riscv: Add vaeskf1.vi decoding, translation and execution
 support
   target/riscv: Add vaeskf2.vi decoding, translation and execution
 support
   target/riscv: Expose zvkned cpu property
   target/riscv: Add zvknh cpu properties
   target/riscv: Expose zvknh cpu properties

William Salmon (3):
   target/riscv: Add vbrev8.v decoding, translation and execution support
   target/riscv: Add vaesem.vv decoding, translation and execution
 support
   target/riscv: Add vaesem.vs decoding, translation and execution
 support

  accel/tcg/tcg-runtime-gvec.c |   11 +
  accel/tcg/tcg-runtime.h  |1 +
  crypto/sm4.c |   10 +
  include/crypto/sm4.h |9 +
  include/qemu/bitops.h|   24 +-
  target/arm/tcg/crypto_helper.c   |   10 +-
  target/riscv/cpu.c   |   36 +
  target/riscv/cpu.h   |7 +
  target/riscv/helper.h|   71 ++
  target/riscv/insn32.decode   |   49 +
  target/riscv/insn_trans/trans_rvv.c.inc  |   93 +-
  target/riscv/insn_trans/trans_rvzvkb.c.inc   |  220 
  target/riscv/insn_trans/trans_rvzvkg.c.inc   |   40 +
  target/riscv/insn_trans/trans_rvzvkned.c.inc |  170 +++
  target/riscv/insn_trans/trans_rvzvknh.c.inc  |   84

Re: [PATCH 3/3] New I2C: Add support for TPM devices over I2C bus

2023-03-23 Thread Stefan Berger





On 3/22/23 23:01, Ninad Palsule wrote:

Qemu already supports devices attached to ISA and sysbus. This drop adds
support for the I2C bus attached TPM devices. I2C model only supports
TPM2 protocol.

This commit includes changes for the common code.
- Added I2C emulation model. Logic was added in the model to temporarily
   cache the data as I2C interface works per byte basis.
- New tpm type "tpm-tis-i2c" added for I2C support. User specify this
   string on command line.

Testing:
   TPM I2C device modulte is tested using SWTPM (software based TPM
   package). The qemu used the rainier machine and it was connected to
   swtpm over the socket interface.

   The command to start swtpm is as follows:
   $ swtpm socket --tpmstate dir=/tmp/mytpm1\
  --ctrl type=unixio,path=/tmp/mytpm1/swtpm-sock  \
  --tpm2 --log level=100

   The command to start qemu is as follows:
   $ qemu-system-arm -M rainier-bmc -nographic \
 -kernel ${IMAGEPATH}/fitImage-linux.bin \
 -dtb ${IMAGEPATH}/aspeed-bmc-ibm-rainier.dtb \
 -initrd ${IMAGEPATH}/obmc-phosphor-initramfs.rootfs.cpio.xz \
 -drive 
file=${IMAGEPATH}/obmc-phosphor-image.rootfs.wic.qcow2,if=sd,index=2 \
 -net nic -net 
user,hostfwd=:127.0.0.1:-:22,hostfwd=:127.0.0.1:2443-:443 \
 -chardev socket,id=chrtpm,path=/tmp/mytpm1/swtpm-sock \
 -tpmdev emulator,id=tpm0,chardev=chrtpm \
 -device tpm-tis-i2c,tpmdev=tpm0,bus=aspeed.i2c.bus.12,address=0x2e

   Note: Currently you need to specify the I2C bus and device address on
 command line. In future we can add a device at board level.

Signed-off-by: Ninad Palsule 
---
V2:
Incorporated Stephen's review comments.
- Handled checksum related register in I2C layer
- Defined I2C interface capabilities and return those instead of
   capabilities from TPM TIS. Add required capabilities from TIS.
- Do not cache FIFO data in the I2C layer.
- Make sure that Device address change register is not passed to I2C
   layer as capability indicate that it is not supported.
- Added boundary checks.
- Make sure that bits 26-31 are zeroed for the TPM_STS register on read
- Updated Kconfig files for new define.
---
  hw/arm/Kconfig   |   1 +
  hw/tpm/Kconfig   |   7 +
  hw/tpm/meson.build   |   1 +
  hw/tpm/tpm_tis_i2c.c | 440 +++
  include/sysemu/tpm.h |   3 +
  5 files changed, 452 insertions(+)
  create mode 100644 hw/tpm/tpm_tis_i2c.c

diff --git a/hw/arm/Kconfig b/hw/arm/Kconfig
index b5aed4aff5..05d6ef1a31 100644
--- a/hw/arm/Kconfig
+++ b/hw/arm/Kconfig
@@ -6,6 +6,7 @@ config ARM_VIRT
  imply VFIO_PLATFORM
  imply VFIO_XGMAC
  imply TPM_TIS_SYSBUS
+imply TPM_TIS_I2C
  imply NVDIMM
  select ARM_GIC
  select ACPI
diff --git a/hw/tpm/Kconfig b/hw/tpm/Kconfig
index 29e82f3c92..a46663288c 100644
--- a/hw/tpm/Kconfig
+++ b/hw/tpm/Kconfig
@@ -1,3 +1,10 @@
+config TPM_TIS_I2C
+bool
+depends on TPM
+select TPM_BACKEND
+select I2C
+select TPM_TIS
+
  config TPM_TIS_ISA
  bool
  depends on TPM && ISA_BUS
diff --git a/hw/tpm/meson.build b/hw/tpm/meson.build
index 7abc2d794a..76fe3cb098 100644
--- a/hw/tpm/meson.build
+++ b/hw/tpm/meson.build
@@ -1,6 +1,7 @@
  softmmu_ss.add(when: 'CONFIG_TPM_TIS', if_true: files('tpm_tis_common.c'))
  softmmu_ss.add(when: 'CONFIG_TPM_TIS_ISA', if_true: files('tpm_tis_isa.c'))
  softmmu_ss.add(when: 'CONFIG_TPM_TIS_SYSBUS', if_true: 
files('tpm_tis_sysbus.c'))
+softmmu_ss.add(when: 'CONFIG_TPM_TIS_I2C', if_true: files('tpm_tis_i2c.c'))
  softmmu_ss.add(when: 'CONFIG_TPM_CRB', if_true: files('tpm_crb.c'))
  softmmu_ss.add(when: 'CONFIG_TPM_TIS', if_true: files('tpm_ppi.c'))
  softmmu_ss.add(when: 'CONFIG_TPM_CRB', if_true: files('tpm_ppi.c'))
diff --git a/hw/tpm/tpm_tis_i2c.c b/hw/tpm/tpm_tis_i2c.c
new file mode 100644
index 00..5cec5f7806
--- /dev/null
+++ b/hw/tpm/tpm_tis_i2c.c
@@ -0,0 +1,440 @@
+/*
+ * tpm_tis_i2c.c - QEMU's TPM TIS I2C Device
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ *
+ * Implementation of the TIS interface according to specs found at
+ * http://www.trustedcomputinggroup.org. This implementation currently
+ * supports version 1.3, 21 March 2013
+ * In the developers menu choose the PC Client section then find the TIS
+ * specification.
+ *
+ * TPM TIS for TPM 2 implementation following TCG PC Client Platform
+ * TPM Profile (PTP) Specification, Familiy 2.0, Revision 00.43
+ *
+ * TPM I2C implementation follows TCG TPM I2c Interface specification,
+ * Family 2.0, Level 00, Revision 1.00
+ */
+
+#include "qemu/osdep.h"
+#include "hw/i2c/i2c.h"
+#include "hw/qdev-properties.h"
+#include "hw/acpi/tpm.h"
+#include "migration/vmstate.h"
+#include "tpm_prop.h"
+#include "tpm_tis.h"
+#include "qom/object.h"
+#include "block/aio.h"
+#include "qemu/main-loop.h"
+

Re: QMP command dumpdtb crash bug





On 3/23/23 03:29, Markus Armbruster wrote:

Watch this:

 $ gdb --args ../qemu/bld/qemu-system-aarch64 -S -M virt -display none -qmp 
stdio
 [...]
 (gdb) r
 [...]
 {"QMP": {"version": {"qemu": {"micro": 50, "minor": 2, "major": 7}, "package": 
"v7.2.0-2331-gda89f78a7d"}, "capabilities": ["oob"]}}
 [New Thread 0x7fffed62c6c0 (LWP 1021967)]
 {"execute": "qmp_capabilities", "arguments": {"enable": ["oob"]}}
 {"return": {}}
 {"execute": "dumpdtb", "arguments": {"filename": "fdt.dtb"}}

 Thread 1 "qemu-system-aar" received signal SIGSEGV, Segmentation fault.
 qmp_dumpdtb (filename=0x581c5170 "fdt.dtb", 
errp=errp@entry=0x7fffdae8)
 at ../softmmu/device_tree.c:661
 661size = fdt_totalsize(current_machine->fdt);

current_machine->fdt is non-null here.  The crash is within
fdt_totalsize().



Back when I added this command [1] one of the patches of this series was:

[PATCH v8 03/16] hw/arm: do not free machine->fdt in arm_load_dtb()

https://lists.gnu.org/archive/html/qemu-devel/2022-09/msg04201.html

The patch aimed to address what you're seeing here. machine->fdt is not NULL,
but it was freed in arm_load_dtb() without assigning it to NULL and it contains
junk.

Back then this patch got no acks/reviews and got left behind. If I apply it now
at current master your example works.



I suspect ...

 void qmp_dumpdtb(const char *filename, Error **errp)
 {
 g_autoptr(GError) err = NULL;
 uint32_t size;

 if (!current_machine->fdt) {
 error_setg(errp, "This machine doesn't have a FDT");
 return;
 }

... we're missing an "FDT isn't ready" guard here.



There might be a case where a guard would be needed, but for all ARM machines 
that
uses arm_load_dtb() I can say that the dumpdtb is broken regardless of whether 
you
attempt it during early boot or not.

One solution is to just apply the patch I mentioned above. Another solution is 
to
make a g_free(fdt) in arm_load_dtb and also do a ms->fdt = NULL to tell 
dumpdtb()
that there is no fdt available.

I don't see any particular reason why arm machines can't support this command, 
so
let me know and I can re-send that patch.



Thanks,


Daniel


[1] https://lists.gnu.org/archive/html/qemu-devel/2022-09/msg04190.html



 size = fdt_totalsize(current_machine->fdt);

 g_assert(size > 0);

 if (!g_file_set_contents(filename, current_machine->fdt, size, )) {
 error_setg(errp, "Error saving FDT to file %s: %s",
filename, err->message);
 }
 }

Also, I think the error message "does not have a FDT" should say "an
FDT".

Re: [PATCH 00/45] Add RISC-V vector cryptographic instruction set support

2023-03-23 Thread Christoph Müllner

On Thu, Mar 23, 2023 at 12:34 PM Lawrence Hunter
 wrote:
>
> On 21/03/2023 12:02, Christoph Müllner wrote:
> > On Fri, Mar 10, 2023 at 10:16 AM Lawrence Hunter
> >  wrote:
> >>
> >> This patchset provides an implementation for Zvkb, Zvkned, Zvknh,
> >> Zvksh, Zvkg, and Zvksed of the draft RISC-V vector cryptography
> >> extensions as per the 20230303 version of the specification(1)
> >> (1fcbb30). Please note that the Zvkt data-independent execution
> >> latency extension has not been implemented, and we would recommend not
> >> using these patches in an environment where timing attacks are an
> >> issue.
> >>
> >> Work performed by Dickon, Lawrence, Nazar, Kiran, and William from
> >> Codethink sponsored by SiFive, as well as Max Chou and Frank Chang
> >> from SiFive.
> >>
> >> For convenience we have created a git repo with our patches on top of
> >> a recent master. https://github.com/CodethinkLabs/qemu-ct
> >
> > I did test and review this patchset.
> > Since most of my comments affect multiple patches I have summarized
> > them here in one email.
> > Observations that only affect a single patch will be sent in response
> > to the corresponding email.
> >
> > I have tested this series with the OpenSSL PR for Zvk that can be found
> > here:
> >https://github.com/openssl/openssl/pull/20149
> > I ran with all Zvk* extensions enabled (using Zvkg for GCM) and with
> > Zvkb only (using Zvkb for GCM).
> > All tests succeed. Note, however, that the test coverage is limited
> > (e.g. no .vv instructions, vstart is always zero).
> >
> > When sending out a follow-up version (even if it just introduces a
> > minimal fix),
> > then consider using patchset versioning (e.g. git format-patch -v2
> > ...).
>
> Ok, will do
>
> > It might be a matter of taste, but I would prefer a series that groups
> > and orders the commits differently:
> >a) independent changes to the existing code (refactoring only, but
> > no new features) - one commit per topic
> >b) introduction of new functionality - one commit per extension
> > A series using such a commit granularity and order would be easier to
> > maintain and review (and not result in 45 patches).
> > Also, the refactoring changes could land before Zvk freezes if
> > maintainers decide to do so.
>
> Makes sense, will do
>
> > So far all translation files in target/riscv/insn_trans/* contain
> > multiple extensions if they are related.
> > I think we should follow this pattern and use a common trans_zvk.c.inc
> > file.
>
> Agree, will do
>
> > All patches to insn32.decode have comments of the form "RV64 Zvk*
> > vector crypto extension".
> > What is the point of the "RV64"? I would simply remove that.
>
> Ok, will remove it
>
> > All instructions set "env->vstart = 0;" at the end.
> > I don't think that this is correct (the specification does not require
> > this).
>
> That's from vector spec: "All vector instructions are defined to begin
> execution with the element number given in the vstart CSR, leaving
> earlier elements in the destination vector undisturbed, and to reset the
> vstart CSR to zero at the end of execution." - from "3.7. Vector Start
> Index CSR vstart"

Yes, you are right.
I just created a PR for the Zvk* spec to clarify this:
  https://github.com/riscv/riscv-crypto/pull/308

>
> > The tests of the reserved encodings are not consistent:
> > * Zvknh does a dynamic test (query tcg_gen_*())
> > * Zvkned does a dynamic test (tcg_gen_*())
> > * Zvkg does not test for (vl%EGS == 0)
>
> Zvkg also does dynamic test, by calling macros GEN_V_UNMASKED_TRANS and
> GEN_VV_UNMASKED_TRANS
>
> > The vl CSR can only be updated by the vset{i}vl{i} instructions.
> > The same applies to the vstart CSR and the vtype CSR that holds vsew,
> > vlmul and other fields.
> > The current code tests the VSTART/SEW value using "s->vstart % 4 ==
> > 0"/"s->sew == MO_32".
> > Why is it not possible to do the same with VL, i.e. "s->vl % 4 == 0"
> > (after adding it to DisasContext)?
>
> vl can also be updated by another instruction - from vector spec "3.5.
> Vector Length Register vl" - "The XLEN-bit-wide read-only vl CSR can
> only be updated by the vset{i}vl{i} instructions, and the
> fault-only-first vector load instruction variants." So just because of
> that fault-only-first instruction we need dynamic checks.
>
> vstart is just another CSR -- software can write to it, but probably
> shouldn't.  Whether that's ever going to be useful outside testing ISA
> conformance tests or not I don't know, but it's clearly read-write (also
> section 3.7).
>
> > Also, I would introduce named constants or macros for the EGS values
> > to avoid magic constants in the code
> > (some extensions do that - e.g. ZVKSED_EGS).
>
> Makes sense, will do
>
> Best,
> Lawrence

Re: [PATCH 00/45] Add RISC-V vector cryptographic instruction set support

2023-03-23 Thread Lawrence Hunter


On 21/03/2023 12:02, Christoph Müllner wrote:

On Fri, Mar 10, 2023 at 10:16 AM Lawrence Hunter
 wrote:


This patchset provides an implementation for Zvkb, Zvkned, Zvknh, 
Zvksh, Zvkg, and Zvksed of the draft RISC-V vector cryptography 
extensions as per the 20230303 version of the specification(1) 
(1fcbb30). Please note that the Zvkt data-independent execution 
latency extension has not been implemented, and we would recommend not 
using these patches in an environment where timing attacks are an 
issue.


Work performed by Dickon, Lawrence, Nazar, Kiran, and William from 
Codethink sponsored by SiFive, as well as Max Chou and Frank Chang 
from SiFive.


For convenience we have created a git repo with our patches on top of 
a recent master. https://github.com/CodethinkLabs/qemu-ct


I did test and review this patchset.
Since most of my comments affect multiple patches I have summarized
them here in one email.
Observations that only affect a single patch will be sent in response
to the corresponding email.

I have tested this series with the OpenSSL PR for Zvk that can be found 
here:

   https://github.com/openssl/openssl/pull/20149
I ran with all Zvk* extensions enabled (using Zvkg for GCM) and with
Zvkb only (using Zvkb for GCM).
All tests succeed. Note, however, that the test coverage is limited
(e.g. no .vv instructions, vstart is always zero).

When sending out a follow-up version (even if it just introduces a 
minimal fix),
then consider using patchset versioning (e.g. git format-patch -v2 
...).


Ok, will do


It might be a matter of taste, but I would prefer a series that groups
and orders the commits differently:
   a) independent changes to the existing code (refactoring only, but
no new features) - one commit per topic
   b) introduction of new functionality - one commit per extension
A series using such a commit granularity and order would be easier to
maintain and review (and not result in 45 patches).
Also, the refactoring changes could land before Zvk freezes if
maintainers decide to do so.


Makes sense, will do


So far all translation files in target/riscv/insn_trans/* contain
multiple extensions if they are related.
I think we should follow this pattern and use a common trans_zvk.c.inc 
file.


Agree, will do


All patches to insn32.decode have comments of the form "RV64 Zvk*
vector crypto extension".
What is the point of the "RV64"? I would simply remove that.


Ok, will remove it


All instructions set "env->vstart = 0;" at the end.
I don't think that this is correct (the specification does not require 
this).


That's from vector spec: "All vector instructions are defined to begin 
execution with the element number given in the vstart CSR, leaving 
earlier elements in the destination vector undisturbed, and to reset the 
vstart CSR to zero at the end of execution." - from "3.7. Vector Start 
Index CSR vstart"



The tests of the reserved encodings are not consistent:
* Zvknh does a dynamic test (query tcg_gen_*())
* Zvkned does a dynamic test (tcg_gen_*())
* Zvkg does not test for (vl%EGS == 0)


Zvkg also does dynamic test, by calling macros GEN_V_UNMASKED_TRANS and 
GEN_VV_UNMASKED_TRANS



The vl CSR can only be updated by the vset{i}vl{i} instructions.
The same applies to the vstart CSR and the vtype CSR that holds vsew,
vlmul and other fields.
The current code tests the VSTART/SEW value using "s->vstart % 4 ==
0"/"s->sew == MO_32".
Why is it not possible to do the same with VL, i.e. "s->vl % 4 == 0"
(after adding it to DisasContext)?


vl can also be updated by another instruction - from vector spec "3.5. 
Vector Length Register vl" - "The XLEN-bit-wide read-only vl CSR can 
only be updated by the vset{i}vl{i} instructions, and the 
fault-only-first vector load instruction variants." So just because of 
that fault-only-first instruction we need dynamic checks.


vstart is just another CSR -- software can write to it, but probably 
shouldn't.  Whether that's ever going to be useful outside testing ISA 
conformance tests or not I don't know, but it's clearly read-write (also 
section 3.7).



Also, I would introduce named constants or macros for the EGS values
to avoid magic constants in the code
(some extensions do that - e.g. ZVKSED_EGS).


Makes sense, will do

Best,
Lawrence

Re: [PATCH 02/45] target/riscv: Refactor some of the generic vector functionality

2023-03-23 Thread Lawrence Hunter


On 21/03/2023 12:02, Christoph Müllner wrote:

On Fri, Mar 10, 2023 at 5:06 PM Lawrence Hunter
 wrote:


From: Kiran Ostrolenk 

Summary of refactoring:

* take some functions/macros out of `vector_helper` and put them in a
new module called `vector_internals`

* factor the non SEW-specific stuff out of `GEN_OPIVV_TRANS` into
function `opivv_trans` (similar to `opivi_trans`)


I think splitting this commit into two changes would be better.
Besides that the two changes look reasonable and correct.


Makes sense, we can do this

Best,
Lawrence

Re: [PATCH] hw/arm/virt: support both pl011 and 16550 uart

On Wed, 22 Mar 2023 at 23:27, Patrick Venture  wrote:
>
> From: Shu-Chun Weng 
>
> Select uart for virt machine from pl011 and ns16550a with
> -M virt,uart={pl011|ns16550a}.

Firm "no". The PL011 is a perfectly good UART widely
supported by guest OSes, and there is no reason
why we should want to also provide a 16550.

thanks
-- PMM

Re: [PATCH] hw/nvme: Change alignment in dma functions for nvme_blk_*

2023-03-23 Thread Klaus Jensen

On Mar 20 13:40, Mateusz Kozlowski wrote:
> Since the nvme_blk_read/write are used by both the data and metadata
> portions of the IO, it can't have the 512B alignment requirement.
> Without this change any metadata transfer, which length isn't a multiple
> of 512B and which is bigger than 512B, will result in only a partial
> transfer.
> 
> Signed-off-by: Mateusz Kozlowski 

Thanks Mateusz,

LGTM.

Reviewed-by: Klaus Jensen 


signature.asc
Description: PGP signature

[PATCH] hw/xenpv: Initialize Xen backend operations

2023-03-23 Thread David Woodhouse

From: David Woodhouse 

As the Xen backend operations were abstracted out into a function table to
allow for internally emulated Xen support, we missed the xen_init_pv()
code path which also needs to install the operations for the true Xen
libraries. Add the missing call to setup_xen_backend_ops().

Fixes: b6cacfea0b38 ("hw/xen: Add evtchn operations to allow redirection to 
internal emulation")
Reported-by: Anthony PERARD 
Signed-off-by: David Woodhouse 
---
 hw/xenpv/xen_machine_pv.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/hw/xenpv/xen_machine_pv.c b/hw/xenpv/xen_machine_pv.c
index 2e759d0619..17cda5ec13 100644
--- a/hw/xenpv/xen_machine_pv.c
+++ b/hw/xenpv/xen_machine_pv.c
@@ -35,6 +35,8 @@ static void xen_init_pv(MachineState *machine)
 DriveInfo *dinfo;
 int i;
 
+setup_xen_backend_ops();
+
 /* Initialize backend core & drivers */
 xen_be_init();
 
-- 
2.34.1




smime.p7s
Description: S/MIME cryptographic signature

Re: [PATCH v2 09/27] hw/xen: Add evtchn operations to allow redirection to internal emulation

2023-03-23 Thread David Woodhouse

On Thu, 2023-03-23 at 10:29 +, Anthony PERARD wrote:
> On Tue, Mar 07, 2023 at 05:17:32PM +, David Woodhouse wrote:
> > From: David Woodhouse 
> > 
> > The existing implementation calling into the real libxenevtchn moves to
> > a new file hw/xen/xen-operations.c, and is called via a function table
> > which in a subsequent commit will also be able to invoke the emulated
> > event channel support.
> > 
> > Signed-off-by: David Woodhouse 
> > Reviewed-by: Paul Durrant 
> 
> Hi David, Paul,
> 
> This patch prevents existing use case from booting, that is even with the
> state change notification fix.

Just checking... definitely the *event channel* patch in commit b6cacfe
and not the xenstore patch which comes later?

The state change notification fix doesn't even apply yet, does it?


smime.p7s
Description: S/MIME cryptographic signature

Example output for query-vcpu-dirty-limit

query-vcpu-dirty-limit's doc comment has an example, but it shows only
input, no output:

##
# @query-vcpu-dirty-limit:
#
# Returns information about virtual CPU dirty page rate limits, if any.
#
# Since: 7.1
#
# Example:
#   {"execute": "query-vcpu-dirty-limit"}
#
##

The simplest fix is

# Example:
# -> {"execute": "query-vcpu-dirty-limit"}
# <- {"return": []}

But I'd rather show a non-empty reply here.  I don't want to make one
up, because making up example output tends to result in incorrect
examples.  Could you run the command for me in context where it returns
non-empty output?

Re: QMP command set-vcpu-dirty-limit hangs