date:20240311

On Fri Mar 8, 2024 at 9:19 PM AEST, Harsh Prateek Bora wrote:
> There is an existing Nested-HV API to enable nested guests on powernv
> machines. However, that is not supported on pseries/PowerVM LPARs.
> This patch series implements required hcall interfaces to enable nested
> guests with KVM on PowerVM.
> Unlike Nested-HV, with this API, entire L2 state is retained by L0
> during guest entry/exit and uses pre-defined Guest State Buffer (GSB)
> format to communicate guest state between L1 and L2 via L0.
>
> L0 here refers to the phyp/PowerVM, or launching a Qemu TCG L0 with the
> newly introduced option cap-nested-papr=true.
> L1 refers to the LPAR host on PowerVM or Linux booted on Qemu TCG with
> above mentioned option cap-nested-papr=true.
> L2 refers to nested guest running on top of L1 using KVM.
> No SW changes needed for Qemu running in L1 Linux as well as L2 Kernel.
>
> Linux Kernel side support is already merged upstream:

This is all looking pretty good to me now. Considering it's quite
self-contained and adding a new feature, I think it's good to
merge.

I would like to have an avocado test for it, but the avocado
framework has a bug that's causing issues with the ppc hv test,
and we might have to move to a new host kernel with the KVM
support merged. I can take a look at that, I think we can add
new tests after soft-freeze.

Thanks,
Nick

> ---
> commit 19d31c5f115754c369c0995df47479c384757f82
> Author: Jordan Niethe 
> Date:   Thu Sep 14 13:05:59 2023 +1000
>
> KVM: PPC: Add support for nestedv2 guests
> ---
> For more details, refer to the documentation in either
> patch series.
>
> There are scripts available to assist in setting up an environment for
> testing nested guests at https://github.com/iamjpn/kvm-powervm-test
>
> A tree with this series is available at:
> https://github.com/planetharsh/qemu/tree/upstream-0305-v5
>
> Thanks to Michael Neuling, Shivaprasad Bhat, Amit Machhiwal, Kautuk
> Consul, Vaibhav Jain and Jordan Niethe.
>
> Changelog:
> v5: addressed review comments from Nick on v4
> v4: 
> https://lore.kernel.org/qemu-devel/20240220083609.748325-1-hars...@linux.ibm.com/
> v3: 
> https://lore.kernel.org/qemu-devel/20240118052438.1475437-1-hars...@linux.ibm.com/
> v2: 
> https://lore.kernel.org/qemu-devel/20231012104951.194876-1-hars...@linux.ibm.com/
> v1: 
> https://lore.kernel.org/qemu-devel/2023090604.448244-1-hars...@linux.ibm.com/
>
> Harsh Prateek Bora (14):
>   spapr: nested: register nested-hv api hcalls only for cap-nested-hv
>   spapr: nested: move nested part of spapr_get_pate into spapr_nested.c
>   spapr: nested: Introduce SpaprMachineStateNested to store related
> info.
>   spapr: nested: keep nested-hv related code restricted to its API.
>   spapr: nested: Document Nested PAPR API
>   spapr: nested: Introduce H_GUEST_[GET|SET]_CAPABILITIES hcalls.
>   spapr: nested: Introduce H_GUEST_[CREATE|DELETE] hcalls.
>   spapr: nested: Introduce H_GUEST_CREATE_VCPU hcall.
>   spapr: nested: Extend nested_ppc_state for nested PAPR API
>   spapr: nested: Initialize the GSB elements lookup table.
>   spapr: nested: Introduce H_GUEST_[GET|SET]_STATE hcalls.
>   spapr: nested: Use correct source for parttbl info for nested PAPR
> API.
>   spapr: nested: Introduce H_GUEST_RUN_VCPU hcall.
>   spapr: nested: Introduce cap-nested-papr for Nested PAPR API
>
>  docs/devel/nested-papr.txt|  119 +++
>  include/hw/ppc/spapr.h|   27 +-
>  include/hw/ppc/spapr_nested.h |  428 -
>  target/ppc/cpu.h  |4 +
>  hw/ppc/ppc.c  |   10 +
>  hw/ppc/spapr.c|   35 +-
>  hw/ppc/spapr_caps.c   |   62 ++
>  hw/ppc/spapr_hcall.c  |   24 +-
>  hw/ppc/spapr_nested.c | 1550 -
>  9 files changed, 2204 insertions(+), 55 deletions(-)
>  create mode 100644 docs/devel/nested-papr.txt

[PATCH] error: Move ERRP_GUARD() to the beginning of the function

2024-03-11 Thread Zhao Liu

From: Zhao Liu 

Since commit 05e385d2a9 ("error: Move ERRP_GUARD() to the beginning
of the function"), new code has been added that doesn't put
ERRP_GUARD() at the beginning of the function.

As stated in the commit 05e385d2a9: "include/qapi/error.h advises to put
ERRP_GUARD() right at the beginning of the function, because only then
can it guard the whole function.", so clean up the few spots
disregarding the advice.

Inspired-by: Markus Armbruster 
Signed-off-by: Zhao Liu 
---
 * Inspired by Markus' original cleanup and copied his commit message.
---
 block.c| 2 +-
 block/qapi.c   | 6 +++---
 hw/s390x/s390-virtio-ccw.c | 2 +-
 migration/options.c| 2 +-
 migration/postcopy-ram.c   | 4 ++--
 net/vhost-vdpa.c   | 3 +--
 6 files changed, 9 insertions(+), 10 deletions(-)

diff --git a/block.c b/block.c
index 1ed9214f66ed..8a43a83c11ca 100644
--- a/block.c
+++ b/block.c
@@ -534,9 +534,9 @@ typedef struct CreateCo {
 int coroutine_fn bdrv_co_create(BlockDriver *drv, const char *filename,
 QemuOpts *opts, Error **errp)
 {
+ERRP_GUARD();
 int ret;
 GLOBAL_STATE_CODE();
-ERRP_GUARD();
 
 if (!drv->bdrv_co_create_opts) {
 error_setg(errp, "Driver '%s' does not support image creation",
diff --git a/block/qapi.c b/block/qapi.c
index 9e806fa230d8..31183d493341 100644
--- a/block/qapi.c
+++ b/block/qapi.c
@@ -46,11 +46,11 @@ BlockDeviceInfo *bdrv_block_device_info(BlockBackend *blk,
 bool flat,
 Error **errp)
 {
+ERRP_GUARD();
 ImageInfo **p_image_info;
 ImageInfo *backing_info;
 BlockDriverState *backing;
 BlockDeviceInfo *info;
-ERRP_GUARD();
 
 if (!bs->drv) {
 error_setg(errp, "Block device %s is ejected", bs->node_name);
@@ -330,8 +330,8 @@ void bdrv_query_image_info(BlockDriverState *bs,
bool skip_implicit_filters,
Error **errp)
 {
-ImageInfo *info;
 ERRP_GUARD();
+ImageInfo *info;
 
 info = g_new0(ImageInfo, 1);
 bdrv_do_query_node_info(bs, qapi_ImageInfo_base(info), errp);
@@ -382,10 +382,10 @@ void bdrv_query_block_graph_info(BlockDriverState *bs,
  BlockGraphInfo **p_info,
  Error **errp)
 {
+ERRP_GUARD();
 BlockGraphInfo *info;
 BlockChildInfoList **children_list_tail;
 BdrvChild *c;
-ERRP_GUARD();
 
 info = g_new0(BlockGraphInfo, 1);
 bdrv_do_query_node_info(bs, qapi_BlockGraphInfo_base(info), errp);
diff --git a/hw/s390x/s390-virtio-ccw.c b/hw/s390x/s390-virtio-ccw.c
index 62804cc2281d..4b6aab8eef98 100644
--- a/hw/s390x/s390-virtio-ccw.c
+++ b/hw/s390x/s390-virtio-ccw.c
@@ -312,9 +312,9 @@ static void ccw_init(MachineState *machine)
 static void s390_cpu_plug(HotplugHandler *hotplug_dev,
 DeviceState *dev, Error **errp)
 {
+ERRP_GUARD();
 MachineState *ms = MACHINE(hotplug_dev);
 S390CPU *cpu = S390_CPU(dev);
-ERRP_GUARD();
 
 g_assert(!ms->possible_cpus->cpus[cpu->env.core_id].cpu);
 ms->possible_cpus->cpus[cpu->env.core_id].cpu = OBJECT(dev);
diff --git a/migration/options.c b/migration/options.c
index 40eb9309401c..80f49a6a8562 100644
--- a/migration/options.c
+++ b/migration/options.c
@@ -478,9 +478,9 @@ static bool migrate_incoming_started(void)
  */
 bool migrate_caps_check(bool *old_caps, bool *new_caps, Error **errp)
 {
+ERRP_GUARD();
 MigrationIncomingState *mis = migration_incoming_get_current();
 
-ERRP_GUARD();
 #ifndef CONFIG_LIVE_BLOCK_MIGRATION
 if (new_caps[MIGRATION_CAPABILITY_BLOCK]) {
 error_setg(errp, "QEMU compiled without old-style (blk/-b, inc/-i) "
diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c
index 0273dc6a94ac..eccff499cb20 100644
--- a/migration/postcopy-ram.c
+++ b/migration/postcopy-ram.c
@@ -283,10 +283,10 @@ static bool request_ufd_features(int ufd, uint64_t features)
 static bool ufd_check_and_apply(int ufd, MigrationIncomingState *mis,
 Error **errp)
 {
+ERRP_GUARD();
 uint64_t asked_features = 0;
 static uint64_t supported_features;
 
-ERRP_GUARD();
 /*
  * it's not possible to
  * request UFFD_API twice per one fd
@@ -371,6 +371,7 @@ static int test_ramblock_postcopiable(RAMBlock *rb, Error **errp)
  */
 bool postcopy_ram_supported_by_host(MigrationIncomingState *mis, Error **errp)
 {
+ERRP_GUARD();
 long pagesize = qemu_real_host_page_size();
 int ufd = -1;
 bool ret = false; /* Error unless we change it */
@@ -380,7 +381,6 @@ bool postcopy_ram_supported_by_host(MigrationIncomingState *mis, Error **errp)
 uint64_t feature_mask;
 RAMBlock *block;
 
-ERRP_GUARD();
 if (qemu_target_page_size() > pagesize) {
 error_setg(errp, "Target page size bigger than host page size");
 goto out;
diff
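
To illustrate the rule the patch above enforces, here is a minimal,
self-contained sketch of why the guard belongs on the first line. It is a
toy model and not QEMU's implementation: ToyError, toy_error_set() and
TOY_ERRP_GUARD() are hypothetical names, the sketch only models the
"caller passed NULL" case, and it relies on the GCC/Clang cleanup
attribute.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-in for an error object; NOT QEMU's Error type. */
typedef struct ToyError {
    const char *msg;
} ToyError;

/* Report an error unless the caller passed NULL to ignore errors. */
static void toy_error_set(ToyError **errp, const char *msg)
{
    if (errp == NULL) {
        return;
    }
    assert(*errp == NULL);          /* no error may be pending already */
    *errp = malloc(sizeof(**errp));
    if (*errp == NULL) {
        abort();
    }
    (*errp)->msg = msg;
}

/* Holds the local error that the guard substitutes for a NULL errp. */
typedef struct ToyGuard {
    ToyError *local;
} ToyGuard;

/* Runs automatically at scope exit; drops an error nobody asked for. */
static void toy_guard_cleanup(ToyGuard *guard)
{
    free(guard->local);
}

/*
 * Hypothetical guard macro: after it runs, errp is never NULL, so the
 * function body may dereference it. Only code placed AFTER the macro
 * gets this guarantee, which is why it belongs on the first line.
 */
#define TOY_ERRP_GUARD()                                        \
    __attribute__((cleanup(toy_guard_cleanup)))                 \
    ToyGuard toy_guard_ = { NULL };                             \
    do {                                                        \
        if (errp == NULL) {                                     \
            errp = &toy_guard_.local;                           \
        }                                                       \
    } while (0)

/* Helper that reports failure only through errp. */
static void toy_helper(int value, ToyError **errp)
{
    if (value < 0) {
        toy_error_set(errp, "negative value");
    }
}

/* Returns value * 2, or -1 on error. */
static int toy_checked_double(int value, ToyError **errp)
{
    TOY_ERRP_GUARD();               /* first statement in the function */

    toy_helper(value, errp);
    if (*errp) {                    /* safe even when the caller passed NULL */
        return -1;
    }
    return value * 2;
}
```

In this sketch, any statement that dereferences errp before the guard
line would crash when the caller passes NULL, which is the situation the
cleanup avoids by moving the guard above the local declarations.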

[PATCH v2] target/riscv: Implement dynamic establishment of custom decoder

2024-03-11 Thread Huang Tao

In this patch, we modify the decoder to be a freely composable data
structure instead of a hardcoded one. It can be built up dynamically
according to the extensions.
This approach has several benefits:
1. It provides support for heterogeneous CPU architectures. As we add the
   decoder to RISCVCPU, each CPU can have its own decoder, and the decoders
   can differ according to each CPU's features.
2. It improves decoding efficiency. We run the guard_func once, when building
   up the decoder, to decide whether a decoder is added to dynamic_decoder.
   There is therefore no need to run the guard_func when decoding each
   instruction.
3. It allows vendor or dynamic CPUs to customize their own decoder functions
   to improve decoding efficiency, especially as vendor-defined instruction
   sets grow. Because the decoder is built dynamically, decoding can skip the
   guard functions of the other decoders.
4. It is a preparatory patch for adding a vendor decoder before
   decode_insn32() with minimal overhead for users that don't need that
   particular vendor decoder.

Signed-off-by: Huang Tao 
Suggested-by: Christoph Muellner 
Co-authored-by: LIU Zhiwei 
---
 target/riscv/cpu.c | 19 +++
 target/riscv/cpu.h |  2 ++
 target/riscv/cpu_decoder.h | 34 ++
 target/riscv/translate.c   | 28 
 4 files changed, 67 insertions(+), 16 deletions(-)
 create mode 100644 target/riscv/cpu_decoder.h

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index 5ff0192c52..5ea5232ed8 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -38,6 +38,7 @@
 #include "kvm/kvm_riscv.h"
 #include "tcg/tcg-cpu.h"
 #include "tcg/tcg.h"
+#include "cpu_decoder.h"
 
 /* RISC-V CPU definitions */
 static const char riscv_single_letter_exts[] = "IEMAFDQCBPVH";
@@ -1102,6 +1103,22 @@ static void riscv_cpu_satp_mode_finalize(RISCVCPU *cpu, Error **errp)
 }
 #endif
 
+static void riscv_cpu_finalize_dynamic_decoder(RISCVCPU *cpu)
+{
+decode_fn *dynamic_decoders;
+dynamic_decoders = g_new0(decode_fn, decoder_table_size);
+int j = 0;
+for (size_t i = 0; i < decoder_table_size; ++i) {
+if (decoder_table[i].guard_func &&
+decoder_table[i].guard_func(&cpu->cfg)) {
+dynamic_decoders[j] = decoder_table[i].decode_fn;
+j++;
+}
+}
+
+cpu->decoders = dynamic_decoders;
+}
+
 void riscv_cpu_finalize_features(RISCVCPU *cpu, Error **errp)
 {
 Error *local_err = NULL;
@@ -1127,6 +1144,8 @@ void riscv_cpu_finalize_features(RISCVCPU *cpu, Error **errp)
 return;
 }
 }
+
+riscv_cpu_finalize_dynamic_decoder(cpu);
 }
 
 static void riscv_cpu_realize(DeviceState *dev, Error **errp)
diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
index 5d291a7092..bb96af97f9 100644
--- a/target/riscv/cpu.h
+++ b/target/riscv/cpu.h
@@ -30,6 +30,7 @@
 #include "qemu/int128.h"
 #include "cpu_bits.h"
 #include "cpu_cfg.h"
+#include "cpu_decoder.h"
 #include "qapi/qapi-types-common.h"
 #include "cpu-qom.h"
 
@@ -457,6 +458,7 @@ struct ArchCPU {
 uint32_t pmu_avail_ctrs;
 /* Mapping of events to counters */
 GHashTable *pmu_event_ctr_map;
+const decode_fn *decoders;
 };
 
 /**
diff --git a/target/riscv/cpu_decoder.h b/target/riscv/cpu_decoder.h
new file mode 100644
index 00..549414ce4c
--- /dev/null
+++ b/target/riscv/cpu_decoder.h
@@ -0,0 +1,34 @@
+/*
+ * QEMU RISC-V CPU Decoder
+ *
+ * Copyright (c) 2023-2024 Alibaba Group
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2 or later, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef RISCV_CPU_DECODER_H
+#define RISCV_CPU_DECODER_H
+
+struct DisasContext;
+struct RISCVCPUConfig;
+typedef struct RISCVDecoder {
+bool (*guard_func)(const struct RISCVCPUConfig *);
+bool (*decode_fn)(struct DisasContext *, uint32_t);
+} RISCVDecoder;
+
+typedef bool (*decode_fn)(struct DisasContext *, uint32_t);
+
+extern const size_t decoder_table_size;
+
+extern const RISCVDecoder decoder_table[];
+#endif
diff --git a/target/riscv/translate.c b/target/riscv/translate.c
index 177418b2b9..3f50737a50 100644
--- a/target/riscv/translate.c
+++ b/target/riscv/translate.c
@@ -115,6 +115,7 @@ typedef struct DisasContext {
 bool frm_valid;
 /* TCG of the current insn_start */
 TCGOp *insn_start;
+const decode_fn *decoders;
 } DisasContext;
 
 static inline bool has_ext(DisasContext *ctx,
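
The build-once idea described in the commit message above can be sketched
in isolation as follows. This is a simplified illustration and not the
patch's code: ToyCfg, ToyDecoder, ToyCpu and every toy_* function are
hypothetical, the three decoders only look at the 7-bit major opcode, and
the explicit n_decoders count is a choice made for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical per-CPU configuration (stands in for the real config). */
typedef struct ToyCfg {
    bool ext_vendor_a;
    bool ext_vendor_b;
} ToyCfg;

typedef bool (*toy_decode_fn)(uint32_t insn);

/* One table entry: a guard that tests the config, and a decoder. */
typedef struct ToyDecoder {
    bool (*guard)(const ToyCfg *cfg);
    toy_decode_fn decode;
} ToyDecoder;

static bool guard_always(const ToyCfg *cfg) { (void)cfg; return true; }
static bool guard_vendor_a(const ToyCfg *cfg) { return cfg->ext_vendor_a; }
static bool guard_vendor_b(const ToyCfg *cfg) { return cfg->ext_vendor_b; }

/* Toy decoders: each one only recognizes a single major opcode. */
static bool decode_vendor_a(uint32_t insn) { return (insn & 0x7f) == 0x0b; }
static bool decode_vendor_b(uint32_t insn) { return (insn & 0x7f) == 0x2b; }
static bool decode_base(uint32_t insn)     { return (insn & 0x7f) == 0x13; }

static const ToyDecoder toy_table[] = {
    { guard_vendor_a, decode_vendor_a },
    { guard_vendor_b, decode_vendor_b },
    { guard_always,   decode_base },
};
#define TOY_TABLE_SIZE (sizeof(toy_table) / sizeof(toy_table[0]))

typedef struct ToyCpu {
    ToyCfg cfg;
    toy_decode_fn *decoders;    /* filtered list, built once */
    size_t n_decoders;
} ToyCpu;

/* Run every guard once and keep only the decoders this CPU enables. */
static void toy_finalize_decoders(ToyCpu *cpu)
{
    size_t n = 0;

    cpu->decoders = calloc(TOY_TABLE_SIZE, sizeof(*cpu->decoders));
    assert(cpu->decoders != NULL);
    for (size_t i = 0; i < TOY_TABLE_SIZE; i++) {
        if (toy_table[i].guard(&cpu->cfg)) {
            cpu->decoders[n++] = toy_table[i].decode;
        }
    }
    cpu->n_decoders = n;
}

/* Per-instruction path: no guard calls, only the pre-filtered decoders. */
static bool toy_decode(const ToyCpu *cpu, uint32_t insn)
{
    for (size_t i = 0; i < cpu->n_decoders; i++) {
        if (cpu->decoders[i](insn)) {
            return true;
        }
    }
    return false;
}
```

With ext_vendor_b set and ext_vendor_a clear, the filtered list in this
sketch holds two entries, and an instruction that only vendor A would
accept falls through every remaining decoder without any guard being
evaluated at decode time.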

Re: [PATCH v5 0/8] qtest: migration: Add tests for introducing 'channels' argument in migrate QAPIs

The passing build can also be checked at
https://gitlab.com/galahet/Qemu/-/pipelines/1209497470


On 12/03/24 3:23 am, Het Gala wrote:

The recent migrate QAPI changes allow the 'channels' argument to be used
directly, which avoids redundant URI string parsing.

v4->v5 Changelog:

1. Remove redundant imports from migration-test.c after moving helper
functions to migration-helpers.c.
2. Call migrate_get_socket_address(to) and let it fetch the
"socket-address" parameter internally via qdict_get(), which is clearer
to the reader.
3. Free the qdict; other small fixups.


[...]


Het Gala (8):
   Add 'to' object into migrate_qmp()
   Replace connect_uri and move migrate_get_socket_address inside
 migrate_qmp
   Replace migrate_get_connect_uri inplace of migrate_get_socket_address
   Add channels parameter in migrate_qmp_fail
   Add migrate_set_ports into migrate_qmp to update migration port value
   Add channels parameter in migrate_qmp
   Add multifd_tcp_plain test using list of channels instead of uri
   Add negative tests to validate migration QAPIs

  tests/qtest/migration-helpers.c | 158 +++-
  tests/qtest/migration-helpers.h |  10 +-
  tests/qtest/migration-test.c| 180 +---
  3 files changed, 257 insertions(+), 91 deletions(-)

Regards,
Het Gala

Re: [PATCH-for-9.0] docs: Deprecate the pseries-2.12 machines

On Tue Mar 12, 2024 at 5:04 AM AEST, Philippe Mathieu-Daudé wrote:
> pSeries machines before 3.0 have complex migration backward
> compatibility code we'd like to get rid of. The last
> one is 2.12, which is 6 years old. We just deprecated up
> to the 2.11 machine in commit 1392617d35 ("spapr: Tag
> pseries-2.1 - 2.11 machines as deprecated").
> Take the opportunity to also deprecate the 2.12 machines.
>
> Signed-off-by: Philippe Mathieu-Daudé 
> ---
> In 2025 I'd like to get rid of the code related to:
>
>   include/hw/ppc/spapr_cpu_core.h:31:bool pre_3_0_migration; /* older machine don't know about SpaprCpuState */

Acked-by: Nicholas Piggin 

I can merge this via the PPC tree.

Thanks,
Nick

> ---
>  docs/about/deprecated.rst | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/docs/about/deprecated.rst b/docs/about/deprecated.rst
> index dfd681cd02..65111513cc 100644
> --- a/docs/about/deprecated.rst
> +++ b/docs/about/deprecated.rst
> @@ -237,13 +237,13 @@ The Nios II architecture is orphan.
>  The machine is no longer in existence and has been long unmaintained
>  in QEMU. This also holds for the TC51828 16MiB flash that it uses.
>  
> -``pseries-2.1`` up to ``pseries-2.11`` (since 9.0)
> +``pseries-2.1`` up to ``pseries-2.12`` (since 9.0)
>  ''''''''''''''''''''''''''''''''''''''''''''''''''
>  
> -Older pseries machines before version 2.12 have undergone many changes
> +Older pseries machines before version 3.0 have undergone many changes
>  to correct issues, mostly regarding migration compatibility. These are
>  no longer maintained and removing them will make the code easier to
> -read and maintain. Use versions 2.12 and above as a replacement.
> +read and maintain. Use versions 3.0 and above as a replacement.
>  
>  Backend options
>  ---------------

Re: [PATCH v2] spapr: Tag pseries-2.1 - 2.11 machines as deprecated

On Tue Mar 12, 2024 at 4:56 AM AEST, Daniel P. Berrangé wrote:
> On Mon, Mar 11, 2024 at 06:46:53PM +0100, Philippe Mathieu-Daudé wrote:
> > Hi,
> > 
> > On 14/12/23 19:17, Cédric Le Goater wrote:
> > > pseries machines before version 2.11 have undergone many changes to
> > > correct issues, mostly regarding migration compatibility. This is
> > > obfuscating the code uselessly and makes maintenance more difficult.
> > > Remove them and only keep the last version of the 2.x series, 2.12,
> > > still in use by old distros.
> > 
> > By the time we get to QEMU v9.2, will pseries-2.12 still be used
> > by old distros? (which ones btw?)
>
> That's the wrong question really.
>
> Machine types are there to facilitate live migration, and by
> extension also handle save/restore to disk.
>
> So the question is more which distros are likely to ship
> new QEMU 9.2, and also still need the ability to incoming
> migrate from an older version of their distro where 2.12
> (or a downstream equiv) was a fully supported machine type.

From Cedric's list, they are ~2018 vintage. I don't know if
there is upstream policy on this, but 2025 seems reasonable
to remove support for live migration from then.

Thanks,
Nick

Re: [PATCH 01/13] ppc: Drop support for POWER9 and POWER10 DD1 chips

2024-03-11 Thread Harsh Prateek Bora





On 3/12/24 10:20, Harsh Prateek Bora wrote:



On 3/12/24 00:21, Nicholas Piggin wrote:

The POWER9 DD1 and POWER10 DD1 chips are not public and are no longer of
any use in QEMU. Remove them.

Signed-off-by: Nicholas Piggin 
---
  hw/ppc/spapr_cpu_core.c |  2 --
  target/ppc/cpu-models.c |  4 
  target/ppc/cpu_init.c   |  7 ++-
  target/ppc/kvm.c    | 11 ---
  4 files changed, 2 insertions(+), 22 deletions(-)


Do we want to squash in removal of the macro as well?




Actually both, correcting diff:

diff --git a/target/ppc/cpu-models.h b/target/ppc/cpu-models.h
index 0229ef3a9a..7d89b41214 100644
--- a/target/ppc/cpu-models.h
+++ b/target/ppc/cpu-models.h
@@ -348,11 +348,9 @@ enum {
 CPU_POWERPC_POWER8NVL_BASE = 0x004C0000,
 CPU_POWERPC_POWER8NVL_v10  = 0x004C0100,
 CPU_POWERPC_POWER9_BASE= 0x004E0000,
-CPU_POWERPC_POWER9_DD1 = 0x004E1100,
 CPU_POWERPC_POWER9_DD20= 0x004E1200,
 CPU_POWERPC_POWER9_DD22= 0x004E1202,
 CPU_POWERPC_POWER10_BASE   = 0x00800000,
-CPU_POWERPC_POWER10_DD1= 0x00801100,
 CPU_POWERPC_POWER10_DD20   = 0x00801200,
 CPU_POWERPC_970_v22= 0x00390202,
 CPU_POWERPC_970FX_v10  = 0x00391100,



With that,

Reviewed-by: Harsh Prateek Bora 

regards,
Harsh



diff --git a/hw/ppc/spapr_cpu_core.c b/hw/ppc/spapr_cpu_core.c
index 40b7c52f7f..50523ead25 100644
--- a/hw/ppc/spapr_cpu_core.c
+++ b/hw/ppc/spapr_cpu_core.c
@@ -394,10 +394,8 @@ static const TypeInfo spapr_cpu_core_type_infos[] = {
  DEFINE_SPAPR_CPU_CORE_TYPE("power8_v2.0"),
  DEFINE_SPAPR_CPU_CORE_TYPE("power8e_v2.1"),
  DEFINE_SPAPR_CPU_CORE_TYPE("power8nvl_v1.0"),
-    DEFINE_SPAPR_CPU_CORE_TYPE("power9_v1.0"),
  DEFINE_SPAPR_CPU_CORE_TYPE("power9_v2.0"),
  DEFINE_SPAPR_CPU_CORE_TYPE("power9_v2.2"),
-    DEFINE_SPAPR_CPU_CORE_TYPE("power10_v1.0"),
  DEFINE_SPAPR_CPU_CORE_TYPE("power10_v2.0"),
  #ifdef CONFIG_KVM
  DEFINE_SPAPR_CPU_CORE_TYPE("host"),
diff --git a/target/ppc/cpu-models.c b/target/ppc/cpu-models.c
index 36e465b390..f2301b43f7 100644
--- a/target/ppc/cpu-models.c
+++ b/target/ppc/cpu-models.c
@@ -728,14 +728,10 @@
  "POWER8 v2.0")
  POWERPC_DEF("power8nvl_v1.0", CPU_POWERPC_POWER8NVL_v10, POWER8,
  "POWER8NVL v1.0")
-    POWERPC_DEF("power9_v1.0",   CPU_POWERPC_POWER9_DD1, POWER9,
-    "POWER9 v1.0")
  POWERPC_DEF("power9_v2.0",   CPU_POWERPC_POWER9_DD20,POWER9,
  "POWER9 v2.0")
  POWERPC_DEF("power9_v2.2",   CPU_POWERPC_POWER9_DD22,POWER9,
  "POWER9 v2.2")
-    POWERPC_DEF("power10_v1.0",  CPU_POWERPC_POWER10_DD1,POWER10,
-    "POWER10 v1.0")
  POWERPC_DEF("power10_v2.0",  CPU_POWERPC_POWER10_DD20,   POWER10,
  "POWER10 v2.0")
  #endif /* defined (TARGET_PPC64) */
diff --git a/target/ppc/cpu_init.c b/target/ppc/cpu_init.c
index 1d3d1db7c3..572cbdf25f 100644
--- a/target/ppc/cpu_init.c
+++ b/target/ppc/cpu_init.c
@@ -6350,10 +6350,7 @@ static bool ppc_pvr_match_power9(PowerPCCPUClass *pcc, uint32_t pvr, bool best)
  return false;
  }
-    if ((pvr & 0x0f00) == 0x100) {
-    /* DD1.x always matches power9_v1.0 */
-    return true;
-    } else if ((pvr & 0x0f00) == 0x200) {
+    if ((pvr & 0x0f00) == 0x200) {
  if ((pvr & 0xf) < 2) {
  /* DD2.0, DD2.1 match power9_v2.0 */
  if ((pcc->pvr & 0xf) == 0) {
@@ -6536,7 +6533,7 @@ static bool ppc_pvr_match_power10(PowerPCCPUClass *pcc, uint32_t pvr, bool best)
  }
  if ((pvr & 0x0f00) == (pcc->pvr & 0x0f00)) {
-    /* Major DD version matches to power10_v1.0 and power10_v2.0 */
+    /* Major DD version matches power10_v2.0 */
  return true;
  }
diff --git a/target/ppc/kvm.c b/target/ppc/kvm.c
index bcf30a5400..525fbe3892 100644
--- a/target/ppc/kvm.c
+++ b/target/ppc/kvm.c
@@ -2369,17 +2369,6 @@ static void kvmppc_host_cpu_class_init(ObjectClass *oc, void *data)
  #if defined(TARGET_PPC64)
  pcc->radix_page_info = kvmppc_get_radix_page_info();
-
-    if ((pcc->pvr & 0xffffff00) == CPU_POWERPC_POWER9_DD1) {
-    /*
- * POWER9 DD1 has some bugs which make it not really ISA 3.00
- * compliant.  More importantly, advertising ISA 3.00
- * architected mode may prevent guests from activating
- * necessary DD1 workarounds.
- */
-    pcc->pcr_supported &= ~(PCR_COMPAT_3_00 | PCR_COMPAT_2_07
-    | PCR_COMPAT_2_06 | PCR_COMPAT_2_05);
-    }
  #endif /* defined(TARGET_PPC64) */
  }

Re: [PATCH 06/13] ppc/spapr: Add pa-features for POWER10 machines

On Tue Mar 12, 2024 at 7:07 AM AEST, BALATON Zoltan wrote:
> On Mon, 11 Mar 2024, Philippe Mathieu-Daudé wrote:
> > On 11/3/24 19:51, Nicholas Piggin wrote:
> >> From: Benjamin Gray 
> >> 
> >> Add POWER10 pa-features entry.
> >> 
> >> Notably DEXCR and [P]HASHST/[P]HASHCHK instruction support is
> >> advertised. Each DEXCR aspect is allocated a bit in the device tree,
> >> using the 68--71 byte range (inclusive). The functionality of the
> >> [P]HASHST/[P]HASHCHK instructions is separately declared in byte 72,
> >> bit 0 (BE).
> >> 
> >> Signed-off-by: Benjamin Gray 
> >> [npiggin: reword title and changelog, adjust a few bits]
> >> Signed-off-by: Nicholas Piggin 
> >> ---
> >>   hw/ppc/spapr.c | 34 ++
> >>   1 file changed, 34 insertions(+)
> >> 
> >> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> >> index 247f920f07..128bfe11a8 100644
> >> --- a/hw/ppc/spapr.c
> >> +++ b/hw/ppc/spapr.c
> >> @@ -265,6 +265,36 @@ static void spapr_dt_pa_features(SpaprMachineState *spapr,
> >>   /* 60: NM atomic, 62: RNG */
> >>   0x80, 0x00, 0x80, 0x00, 0x00, 0x00, /* 60 - 65 */
> >>   };
> >> +/* 3.1 removes SAO, HTM support */
> >> +uint8_t pa_features_31[] = { 74, 0,
> >
> > Nitpicking because pre-existing, all these arrays could be static const.
>
> If we are at it then maybe also s/0x00/   0/ because having a stream of 
> 0x80 and 0x00 is not the most readable.

Eh, it's more readable because it aligns columns. But something like
this would probably be more readable and less error prone:

PA_FEATURE_SET(pa_features_31,  6, 0); /* DS207 */
PA_FEATURE_SET(pa_features_31, 18, 0); /* Vector scalar */

I just didn't quite find something I like yet. I won't change style
before adding the missing bits either way, but certainly would be
good to clean it up after.

Thanks,
Nick
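
One possible shape for the PA_FEATURE_SET() idea floated above, as a
hedged sketch only. Nothing here is from the series: the TOY_* names are
hypothetical, the two-byte header offset is an assumption read off the
quoted "{ 74, 0," initializer, and bit 0 is taken to be the most
significant bit, following the "bit 0 (BE)" wording in the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layout: two header bytes, then feature byte 0, 1, 2, ... */
#define TOY_PA_HEADER_LEN 2

/*
 * Set one feature bit. "byte" is the feature byte number as used in the
 * comments of the quoted arrays; "bit" uses big-endian numbering, so
 * bit 0 is the most significant bit (0x80).
 */
static void toy_pa_feature_set(uint8_t *pa, size_t pa_size,
                               unsigned byte, unsigned bit)
{
    assert(bit < 8);
    assert(TOY_PA_HEADER_LEN + (size_t)byte < pa_size);
    pa[TOY_PA_HEADER_LEN + byte] |= (uint8_t)(0x80u >> bit);
}

/* Convenience wrapper for true arrays, so sizeof() gives the bound. */
#define TOY_PA_FEATURE_SET(arr, byte, bit) \
    toy_pa_feature_set((arr), sizeof(arr), (byte), (bit))
```

Under these assumptions, setting byte 60 bit 0 and byte 62 bit 0 produces
the same 0x80 values that the quoted "60: NM atomic, 62: RNG" row spells
out by hand, and an out-of-range byte number trips the assertion instead
of silently writing past the array.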

Re: [PATCH 01/13] ppc: Drop support for POWER9 and POWER10 DD1 chips

2024-03-11 Thread Harsh Prateek Bora





On 3/12/24 00:21, Nicholas Piggin wrote:

The POWER9 DD1 and POWER10 DD1 chips are not public and are no longer of
any use in QEMU. Remove them.

Signed-off-by: Nicholas Piggin 
---
  hw/ppc/spapr_cpu_core.c |  2 --
  target/ppc/cpu-models.c |  4 
  target/ppc/cpu_init.c   |  7 ++-
  target/ppc/kvm.c| 11 ---
  4 files changed, 2 insertions(+), 22 deletions(-)


Do we want to squash in removal of the macro as well?

diff --git a/target/ppc/cpu-models.h b/target/ppc/cpu-models.h
index 0229ef3a9a..a5167873ae 100644
--- a/target/ppc/cpu-models.h
+++ b/target/ppc/cpu-models.h
@@ -348,7 +348,6 @@ enum {
 CPU_POWERPC_POWER8NVL_BASE = 0x004C0000,
 CPU_POWERPC_POWER8NVL_v10  = 0x004C0100,
 CPU_POWERPC_POWER9_BASE= 0x004E0000,
-CPU_POWERPC_POWER9_DD1 = 0x004E1100,
 CPU_POWERPC_POWER9_DD20= 0x004E1200,
 CPU_POWERPC_POWER9_DD22= 0x004E1202,
 CPU_POWERPC_POWER10_BASE   = 0x00800000,

With that,

Reviewed-by: Harsh Prateek Bora 

regards,
Harsh



diff --git a/hw/ppc/spapr_cpu_core.c b/hw/ppc/spapr_cpu_core.c
index 40b7c52f7f..50523ead25 100644
--- a/hw/ppc/spapr_cpu_core.c
+++ b/hw/ppc/spapr_cpu_core.c
@@ -394,10 +394,8 @@ static const TypeInfo spapr_cpu_core_type_infos[] = {
  DEFINE_SPAPR_CPU_CORE_TYPE("power8_v2.0"),
  DEFINE_SPAPR_CPU_CORE_TYPE("power8e_v2.1"),
  DEFINE_SPAPR_CPU_CORE_TYPE("power8nvl_v1.0"),
-DEFINE_SPAPR_CPU_CORE_TYPE("power9_v1.0"),
  DEFINE_SPAPR_CPU_CORE_TYPE("power9_v2.0"),
  DEFINE_SPAPR_CPU_CORE_TYPE("power9_v2.2"),
-DEFINE_SPAPR_CPU_CORE_TYPE("power10_v1.0"),
  DEFINE_SPAPR_CPU_CORE_TYPE("power10_v2.0"),
  #ifdef CONFIG_KVM
  DEFINE_SPAPR_CPU_CORE_TYPE("host"),
diff --git a/target/ppc/cpu-models.c b/target/ppc/cpu-models.c
index 36e465b390..f2301b43f7 100644
--- a/target/ppc/cpu-models.c
+++ b/target/ppc/cpu-models.c
@@ -728,14 +728,10 @@
  "POWER8 v2.0")
  POWERPC_DEF("power8nvl_v1.0", CPU_POWERPC_POWER8NVL_v10, POWER8,
  "POWER8NVL v1.0")
-POWERPC_DEF("power9_v1.0",   CPU_POWERPC_POWER9_DD1, POWER9,
-"POWER9 v1.0")
  POWERPC_DEF("power9_v2.0",   CPU_POWERPC_POWER9_DD20,POWER9,
  "POWER9 v2.0")
  POWERPC_DEF("power9_v2.2",   CPU_POWERPC_POWER9_DD22,POWER9,
  "POWER9 v2.2")
-POWERPC_DEF("power10_v1.0",  CPU_POWERPC_POWER10_DD1,POWER10,
-"POWER10 v1.0")
  POWERPC_DEF("power10_v2.0",  CPU_POWERPC_POWER10_DD20,   POWER10,
  "POWER10 v2.0")
  #endif /* defined (TARGET_PPC64) */
diff --git a/target/ppc/cpu_init.c b/target/ppc/cpu_init.c
index 1d3d1db7c3..572cbdf25f 100644
--- a/target/ppc/cpu_init.c
+++ b/target/ppc/cpu_init.c
@@ -6350,10 +6350,7 @@ static bool ppc_pvr_match_power9(PowerPCCPUClass *pcc, uint32_t pvr, bool best)
  return false;
  }
  
-if ((pvr & 0x0f00) == 0x100) {
-/* DD1.x always matches power9_v1.0 */
-return true;
-} else if ((pvr & 0x0f00) == 0x200) {
+if ((pvr & 0x0f00) == 0x200) {
  if ((pvr & 0xf) < 2) {
  /* DD2.0, DD2.1 match power9_v2.0 */
  if ((pcc->pvr & 0xf) == 0) {
@@ -6536,7 +6533,7 @@ static bool ppc_pvr_match_power10(PowerPCCPUClass *pcc, uint32_t pvr, bool best)
  }
  
  if ((pvr & 0x0f00) == (pcc->pvr & 0x0f00)) {
-/* Major DD version matches to power10_v1.0 and power10_v2.0 */
+/* Major DD version matches power10_v2.0 */
  return true;
  }
  
diff --git a/target/ppc/kvm.c b/target/ppc/kvm.c
index bcf30a5400..525fbe3892 100644
--- a/target/ppc/kvm.c
+++ b/target/ppc/kvm.c
@@ -2369,17 +2369,6 @@ static void kvmppc_host_cpu_class_init(ObjectClass *oc, void *data)
  
  #if defined(TARGET_PPC64)
  pcc->radix_page_info = kvmppc_get_radix_page_info();
-
-if ((pcc->pvr & 0xffffff00) == CPU_POWERPC_POWER9_DD1) {
-/*
- * POWER9 DD1 has some bugs which make it not really ISA 3.00
- * compliant.  More importantly, advertising ISA 3.00
- * architected mode may prevent guests from activating
- * necessary DD1 workarounds.
- */
-pcc->pcr_supported &= ~(PCR_COMPAT_3_00 | PCR_COMPAT_2_07
-| PCR_COMPAT_2_06 | PCR_COMPAT_2_05);
-}
  #endif /* defined(TARGET_PPC64) */
  }
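
For readers following the ppc_pvr_match_power9() hunk above, the field
tests it uses can be sketched on their own. This is a minimal sketch and
not the QEMU function: the toy_* names are hypothetical, only the
DD2.0/DD2.1 branch visible in the quoted hunk is modelled, and returning
false for every other case is an assumption made to keep the example
closed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Major DD level: the nibble selected by the 0x0f00 mask in the hunk. */
static unsigned toy_pvr_dd_major(uint32_t pvr)
{
    return (pvr & 0x0f00) >> 8;
}

/* Minor DD level: the nibble selected by the 0xf mask in the hunk. */
static unsigned toy_pvr_dd_minor(uint32_t pvr)
{
    return pvr & 0xf;
}

/*
 * Does a host PVR match a CPU class PVR under the visible rule
 * "DD2.0, DD2.1 match power9_v2.0"? With the DD1 branch removed,
 * a DD1.x host PVR matches nothing.
 */
static bool toy_power9_dd2_early_match(uint32_t class_pvr, uint32_t pvr)
{
    /* Sketch precondition: only DD2.x classes remain after the patch. */
    assert(toy_pvr_dd_major(class_pvr) == 2);

    if (toy_pvr_dd_major(pvr) != 2) {
        return false;               /* e.g. DD1.x, no longer supported */
    }
    if (toy_pvr_dd_minor(pvr) < 2) {
        /* DD2.0 and DD2.1 map to the class whose minor level is 0. */
        return toy_pvr_dd_minor(class_pvr) == 0;
    }
    return false;                   /* later minors: not shown in the hunk */
}
```

Using the enum values quoted in the thread, a DD2.1 part (0x004E1201)
matches the power9_v2.0 class (0x004E1200) but not the power9_v2.2 class
(0x004E1202), and a DD1 part (0x004E1100) matches neither.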

Re: [PATCH 06/13] ppc/spapr: Add pa-features for POWER10 machines

On Tue Mar 12, 2024 at 6:05 AM AEST, Philippe Mathieu-Daudé wrote:
> On 11/3/24 19:51, Nicholas Piggin wrote:
> > From: Benjamin Gray 
> > 
> > Add POWER10 pa-features entry.
> > 
> > Notably DEXCR and [P]HASHST/[P]HASHCHK instruction support is
> > advertised. Each DEXCR aspect is allocated a bit in the device tree,
> > using the 68--71 byte range (inclusive). The functionality of the
> > [P]HASHST/[P]HASHCHK instructions is separately declared in byte 72,
> > bit 0 (BE).
> > 
> > Signed-off-by: Benjamin Gray 
> > [npiggin: reword title and changelog, adjust a few bits]
> > Signed-off-by: Nicholas Piggin 
> > ---
> >   hw/ppc/spapr.c | 34 ++
> >   1 file changed, 34 insertions(+)
> > 
> > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > index 247f920f07..128bfe11a8 100644
> > --- a/hw/ppc/spapr.c
> > +++ b/hw/ppc/spapr.c
> > @@ -265,6 +265,36 @@ static void spapr_dt_pa_features(SpaprMachineState *spapr,
> >   /* 60: NM atomic, 62: RNG */
> >   0x80, 0x00, 0x80, 0x00, 0x00, 0x00, /* 60 - 65 */
> >   };
> > +/* 3.1 removes SAO, HTM support */
> > +uint8_t pa_features_31[] = { 74, 0,
>
> Nitpicking because pre-existing, all these arrays could be static const.

That's true. I was looking for a nicer way to do it, probably generate
the bits with macros and share between spapr and pnv. This is just a
quick dumb approach to getting the missing bits in for now.

Thanks,
Nick

RE: [PATCH v2 3/9] aspeed/sdmc: Add AST2700 support

2024-03-11 Thread Jamin Lin

> >>
> > Hi Cedric,
> >
> > Thanks for the review, and sorry for the late reply.
> >
> >> On 3/4/24 10:29, Jamin Lin wrote:
> >>> The SDRAM memory controller (DRAMC) controls access to external
> >>> DDR4 and DDR5 SDRAM and powers up the DDR4 and DDR5 PHY.
> >>>
> >>> The DRAM memory controller of the AST2700 is not backward compatible
> >>> with previous chips such as the AST2600, AST2500 and AST2400.
> >>>
> >>> Max memory is now 8GiB on the AST2700. Introduce a new
> >>> aspeed_2700_sdmc class with read/write operation and reset handlers.
> >>>
> >>> Define the DRAMC protected and unprotected registers for the
> >>> AST2700 and increase the register set to 0x1000.
> >>>
> >>> Signed-off-by: Troy Lee 
> >>> Signed-off-by: Jamin Lin 
> >>> ---
> >>>hw/misc/aspeed_sdmc.c | 215 ++
> >>>include/hw/misc/aspeed_sdmc.h |   4 +-
> >>>2 files changed, 198 insertions(+), 21 deletions(-)
> >>>
> >>> diff --git a/hw/misc/aspeed_sdmc.c b/hw/misc/aspeed_sdmc.c
> >>> index 64cd1a81dc..63fb7936c4 100644
> >>> --- a/hw/misc/aspeed_sdmc.c
> >>> +++ b/hw/misc/aspeed_sdmc.c
> >>> @@ -27,6 +27,7 @@
> >>>#define   PROT_SOFTLOCKED0x00
> >>>
> >>>#define   PROT_KEY_UNLOCK 0xFC600309
> >>> +#define   PROT_2700_KEY_UNLOCK  0x1688A8A8
> >>>#define   PROT_KEY_HARDLOCK   0xDEADDEAD /* AST2600 */
> >>>
> >>>/* Configuration Register */
> >>> @@ -54,6 +55,46 @@
> >>>#define R_DRAM_TIME   (0x8c / 4)
> >>>#define R_ECC_ERR_INJECT  (0xb4 / 4)
> >>>
> >>> +/* AST2700 Register */
> >>> +#define R_2700_PROT (0x00 / 4)
> >>> +#define R_INT_STATUS(0x04 / 4)
> >>> +#define R_INT_CLEAR (0x08 / 4)
> >>> +#define R_INT_MASK  (0x0c / 4)
> >>> +#define R_MAIN_CONF (0x10 / 4)
> >>> +#define R_MAIN_CONTROL  (0x14 / 4)
> >>> +#define R_MAIN_STATUS   (0x18 / 4)
> >>> +#define R_ERR_STATUS(0x1c / 4)
> >>> +#define R_ECC_FAIL_STATUS   (0x78 / 4)
> >>> +#define R_ECC_FAIL_ADDR (0x7c / 4)
> >>> +#define R_ECC_TESTING_CONTROL   (0x80 / 4)
> >>> +#define R_PROT_REGION_LOCK_STATUS   (0x94 / 4)
> >>> +#define R_TEST_FAIL_ADDR(0xd4 / 4)
> >>> +#define R_TEST_FAIL_D0  (0xd8 / 4)
> >>> +#define R_TEST_FAIL_D1  (0xdc / 4)
> >>> +#define R_TEST_FAIL_D2  (0xe0 / 4)
> >>> +#define R_TEST_FAIL_D3  (0xe4 / 4)
> >>> +#define R_DBG_STATUS(0xf4 / 4)
> >>> +#define R_PHY_INTERFACE_STATUS  (0xf8 / 4)
> >>> +#define R_GRAPHIC_MEM_BASE_ADDR (0x10c / 4)
> >>> +#define R_PORT0_INTERFACE_MONITOR0  (0x240 / 4)
> >>> +#define R_PORT0_INTERFACE_MONITOR1  (0x244 / 4)
> >>> +#define R_PORT0_INTERFACE_MONITOR2  (0x248 / 4)
> >>> +#define R_PORT1_INTERFACE_MONITOR0  (0x2c0 / 4)
> >>> +#define R_PORT1_INTERFACE_MONITOR1  (0x2c4 / 4)
> >>> +#define R_PORT1_INTERFACE_MONITOR2  (0x2c8 / 4)
> >>> +#define R_PORT2_INTERFACE_MONITOR0  (0x340 / 4)
> >>> +#define R_PORT2_INTERFACE_MONITOR1  (0x344 / 4)
> >>> +#define R_PORT2_INTERFACE_MONITOR2  (0x348 / 4)
> >>> +#define R_PORT3_INTERFACE_MONITOR0  (0x3c0 / 4)
> >>> +#define R_PORT3_INTERFACE_MONITOR1  (0x3c4 / 4)
> >>> +#define R_PORT3_INTERFACE_MONITOR2  (0x3c8 / 4)
> >>> +#define R_PORT4_INTERFACE_MONITOR0  (0x440 / 4)
> >>> +#define R_PORT4_INTERFACE_MONITOR1  (0x444 / 4)
> >>> +#define R_PORT4_INTERFACE_MONITOR2  (0x448 / 4)
> >>> +#define R_PORT5_INTERFACE_MONITOR0  (0x4c0 / 4)
> >>> +#define R_PORT5_INTERFACE_MONITOR1  (0x4c4 / 4)
> >>> +#define R_PORT5_INTERFACE_MONITOR2  (0x4c8 / 4)
> >>> +
> >>>/*
> >>> * Configuration register Ox4 (for Aspeed AST2400 SOC)
> >>> *
> >>> @@ -76,10 +117,6 @@
> >>>#define ASPEED_SDMC_VGA_32MB0x2
> >>>#define ASPEED_SDMC_VGA_64MB0x3
> >>>#define ASPEED_SDMC_DRAM_SIZE(x)(x & 0x3)
> >>> -#define ASPEED_SDMC_DRAM_64MB   0x0
> >>> -#define ASPEED_SDMC_DRAM_128MB  0x1
> >>> -#define ASPEED_SDMC_DRAM_256MB  0x2
> >>> -#define ASPEED_SDMC_DRAM_512MB  0x3
> >>>
> >>>#define ASPEED_SDMC_READONLY_MASK
> \
> >>>(ASPEED_SDMC_RESERVED | ASPEED_SDMC_VGA_COMPAT |
> \
> >>> @@ -100,22 +137,24 @@
> >>>#define ASPEED_SDMC_CACHE_ENABLE(1 << 10) /* differs
> >> from AST2400 */
> >>>#define ASPEED_SDMC_DRAM_TYPE   (1 << 4)  /* differs
> >> from AST2400 */
> >>>
> >>> -/* DRAM size definitions differs */
> >>> -#define ASPEED_SDMC_AST2500_128MB   0x0
> >>> -#define ASPEED_SDMC_AST2500_256MB   0x1
> >>> -#define ASPEED_SDMC_AST2500_512MB   0x2
> >>> -#define ASPEED_SDMC_AST2500_1024MB  0x3
> >>> -
> >>> -#define ASPEED_SDMC_AST2600_256MB   0x0
> >>> -#define ASPEED_SDMC_AST2600_512MB   0x1
> >>> -#define ASPEED_SDMC_AST2600_1024MB  0x2
> >>> -#define ASPEED_SDMC_AST2600_2048MB  0x3
> >>> -
> >> Please

Re: [PATCH V3 1/1] target/loongarch: Fixed tlb huge page loading issue

2024-03-11 Thread lixianglai


Hi Richard:


@@ -495,30 +508,10 @@ target_ulong helper_lddir(CPULoongArchState 
*env, target_ulong base,

  shift = FIELD_EX64(env->CSR_PWCL, CSR_PWCL, PTEWIDTH);
  shift = (shift + 1) * 3;
  -    if (huge) {
-    return base;
-    }
-    switch (level) {
-    case 1:
-    dir_base = FIELD_EX64(env->CSR_PWCL, CSR_PWCL, DIR1_BASE);
-    dir_width = FIELD_EX64(env->CSR_PWCL, CSR_PWCL, DIR1_WIDTH);
-    break;
-    case 2:
-    dir_base = FIELD_EX64(env->CSR_PWCL, CSR_PWCL, DIR2_BASE);
-    dir_width = FIELD_EX64(env->CSR_PWCL, CSR_PWCL, DIR2_WIDTH);
-    break;
-    case 3:
-    dir_base = FIELD_EX64(env->CSR_PWCH, CSR_PWCH, DIR3_BASE);
-    dir_width = FIELD_EX64(env->CSR_PWCH, CSR_PWCH, DIR3_WIDTH);
-    break;
-    case 4:
-    dir_base = FIELD_EX64(env->CSR_PWCH, CSR_PWCH, DIR4_BASE);
-    dir_width = FIELD_EX64(env->CSR_PWCH, CSR_PWCH, DIR4_WIDTH);
-    break;
-    default:
-    do_raise_exception(env, EXCCODE_INE, GETPC());
+    if (get_dir_base_width(env, &dir_base, &dir_width, level) != 0) {
  return 0;
  }


I believe that we should not raise an exception here at all.  This 
illegal instruction exception is based on the LDDIR immediate operand, 
so we should have diagnosed this error and raised an exception in 
trans_lddir().


After consulting the hardware engineers: when the level value is greater
than 4, the hardware does not raise an exception. We can therefore check the
level in helper_lddir and, if it is not valid, return base directly.
Checking the validity of the immediate in trans_lddir is not appropriate.

The actual behaviour should be implemented in the instruction emulation,
and the invalid level should be logged,


like this:

target_ulong helper_lddir( )
{
    if ((level == 0) || (level > 4)) {
        qemu_log_mask(LOG_GUEST_ERROR, "Illegal instruction level %lu\n",
                      level);
        return base;
    }
..
}



Therefore the default label should use only g_assert_not_reached(), 
and there need not be a error return from get_dir_base_width at all.



@@ -534,17 +527,38 @@ void helper_ldpte(CPULoongArchState *env, 
target_ulong base, target_ulong odd,

  bool huge = (base >> LOONGARCH_PAGE_HUGE_SHIFT) & 0x1;
  uint64_t ptbase = FIELD_EX64(env->CSR_PWCL, CSR_PWCL, PTBASE);
  uint64_t ptwidth = FIELD_EX64(env->CSR_PWCL, CSR_PWCL, PTWIDTH);
+    uint64_t dir_base, dir_width;
+    uint64_t huge_page_level;
    base = base & TARGET_PHYS_MASK;
    if (huge) {
-    /* Huge Page. base is paddr */
+    /*
+ * Gets the huge page level
+ * Clears the huge page level information in the address
+ * Clears huge page bit
+ * Gets huge page size
+ */
+    huge_page_level = (base & HUGE_PAGE_LEVEL_MASK) >>
+  HUGE_PAGE_LEVEL_SHIFT;
+
+    base &= ~HUGE_PAGE_LEVEL_MASK;
+
  tmp0 = base ^ (1 << LOONGARCH_PAGE_HUGE_SHIFT);
  /* Move Global bit */
  tmp0 = ((tmp0 & (1 << LOONGARCH_HGLOBAL_SHIFT))  >>
  LOONGARCH_HGLOBAL_SHIFT) << R_TLBENTRY_G_SHIFT |
  (tmp0 & (~(1 << LOONGARCH_HGLOBAL_SHIFT)));
-    ps = ptbase + ptwidth - 1;
+
+    huge_page_level++;


Why are you incrementing the level?


The level is incremented to obtain the dir_base of the next level up,
because I use the upper level's dir_base directly as the huge page size
when calculating the page size.

This differs from the hardware implementation, which is explained below.
In the next version I will follow the hardware method to calculate the
size of the huge page.




I think you want

    level = MIN(level, 1);

Google translates the documentation for LDPTE as "bits [14:13] ... 
should be a non-zero value".  I don't know if "should" is precisely 
correct here (english technical documents prefer "shall" or "may" to 
indicate a hard requirement vs optional behaviour). The document does 
not appear to say what happens if the value is zero.




After consulting the hardware engineers: LDPTE uses the dir_base + dir_width
selected by bits [14:13] as the page size,

and when bits [14:13] are 0, the page size is PTbase + PTwidth.

So [14..13]bits can be zero and we should revise the manual.
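
The selection rule described above can be sketched in standalone C. This is only an illustration under assumptions, not the QEMU implementation: `struct walk_cfg` and `huge_page_ps` are hypothetical names, the arrays stand in for the CSR_PWCL/CSR_PWCH fields, and it assumes that a non-zero level N in bits [14:13] selects the DirN base and width.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the page-walk CSR fields (not QEMU's types). */
struct walk_cfg {
    uint64_t pt_base, pt_width;  /* PTbase / PTwidth */
    uint64_t dir_base[4];        /* index 1..3 = Dir1..Dir3, index 0 unused */
    uint64_t dir_width[4];
};

/*
 * Return the page-size value for a huge page whose level field
 * (bits [14:13] of the entry) is `level`.
 */
static uint64_t huge_page_ps(const struct walk_cfg *cfg, unsigned level)
{
    assert(level <= 3);          /* a 2-bit field cannot exceed 3 */
    if (level == 0) {
        /* Level 0 falls back to the page-table geometry. */
        return cfg->pt_base + cfg->pt_width;
    }
    /* Assumption: level N selects the DirN base and width. */
    return cfg->dir_base[level] + cfg->dir_width[level];
}
```

With PTbase = 12, PTwidth = 9 and Dir1 base = 21, width = 9, level 0 yields 21 and level 1 yields 30.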

I plan to add handling of case 0 to get_dir_base_width, so that it will
not receive an illegal level argument from ldpte.

Because the level is validated at the entry of the lddir helper,
get_dir_base_width will not receive an illegal level argument from lddir
either.

So you will not receive level == 0 and level > 4:


static void get_dir_base_width(CPULoongArchState *env, uint64_t *dir_base,
   uint64_t *dir_width, target_ulong level)
{
    switch (level) {
    case 0:
    *dir_base = FIELD_EX64(env->CSR_PWCL, CSR_PWCL, PTBASE);
    *dir_width =

Re: [PATCH] target/ppc: Fix GDB SPR regnum indexing

2024-03-11 Thread Akihiko Odaki


On 2024/03/12 2:54, Nicholas Piggin wrote:

Fix an off by one bug.
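
For illustration only, the ordering issue can be shown with a minimal standalone sketch. `struct reg` and `assign_ids` are hypothetical and are not QEMU code; the point is simply that the index written into the generated description and the id stored on the register must be taken from the counter before it is advanced.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified register record (not QEMU's SPR type). */
struct reg {
    const char *name;
    int gdb_id;      /* id stored on the register itself */
    int xml_regnum;  /* regnum written into the generated XML */
};

/* Assign ids so that the XML regnum and the stored id always agree. */
static int assign_ids(struct reg *regs, size_t n)
{
    int num_regs = 0;

    assert(regs != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        regs[i].xml_regnum = num_regs; /* emit the entry with the current index */
        regs[i].gdb_id = num_regs;     /* record the same index */
        num_regs++;                    /* only then advance the counter */
    }
    return num_regs;
}
```

If the entry were emitted after the increment, every `xml_regnum` would be one higher than the matching `gdb_id`.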

Cc: Akihiko Odaki 
Cc: Alex Bennée 
Fixes: 1b53948ff8f70 ("target/ppc: Use GDBFeature for dynamic XML")
Signed-off-by: Nicholas Piggin 


Reviewed-by: Akihiko Odaki 


---
  target/ppc/gdbstub.c | 7 +++
  1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/target/ppc/gdbstub.c b/target/ppc/gdbstub.c
index 122ea9d0c0..80a2e7990b 100644
--- a/target/ppc/gdbstub.c
+++ b/target/ppc/gdbstub.c
@@ -324,6 +324,9 @@ static void gdb_gen_spr_feature(CPUState *cs)
  continue;
  }
  
+gdb_feature_builder_append_reg(&builder, g_ascii_strdown(spr->name, -1),

+   TARGET_LONG_BITS, num_regs,
+   "int", "spr");
  /*
   * GDB identifies registers based on the order they are
   * presented in the XML. These ids will not match QEMU's
@@ -334,10 +337,6 @@ static void gdb_gen_spr_feature(CPUState *cs)
   */
  spr->gdb_id = num_regs;
  num_regs++;
-
-gdb_feature_builder_append_reg(&builder, g_ascii_strdown(spr->name, -1),
-   TARGET_LONG_BITS, num_regs,
-   "int", "spr");
  }
  
  gdb_feature_builder_end(&builder);

Re: [PATCH v2] target/riscv: raise an exception when CSRRS/CSRRC writes a read-only CSR

2024-03-11 Thread LIU Zhiwei




On 2024/3/11 11:08, Yu-Ming Chang wrote:

Both CSRRS and CSRRC always read the addressed CSR and cause any read side
effects regardless of rs1 and rd fields. Note that if rs1 specifies a register
holding a zero value other than x0, the instruction will still attempt to write
the unmodified value back to the CSR and will cause any attendant side effects.

So if CSRRS or CSRRC tries to write a read-only CSR with rs1 which specifies
a register holding a zero value, an illegal instruction exception should be
raised.
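
A minimal standalone sketch of the rule described above, for illustration only. The function names are hypothetical; the read-only test mirrors the `get_field(csrno, 0xC00) == 3` check in the patch, and privilege checks are deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* CSR numbers are 12 bits wide; bits [11:10] == 0b11 mark a read-only CSR. */
static bool csr_is_read_only(uint16_t csrno)
{
    assert(csrno < 0x1000);
    return ((csrno >> 10) & 0x3) == 0x3;
}

/*
 * CSRRS/CSRRC attempt a write whenever the rs1 field is not x0,
 * regardless of the value held in that register.
 */
static bool csrrs_is_illegal(uint16_t csrno, unsigned rs1_index)
{
    bool writes = rs1_index != 0;

    return writes && csr_is_read_only(csrno);
}
```

With this rule, reading CSR 0xC00 through rs1 = x0 is allowed, while the same instruction with rs1 = x5 is illegal even if x5 holds zero.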

Signed-off-by: Yu-Ming Chang 
---
This incorporated the comments from Richard. Thank you.

  target/riscv/cpu.h   |  2 ++
  target/riscv/csr.c   | 17 ++---
  target/riscv/op_helper.c |  2 +-
  3 files changed, 17 insertions(+), 4 deletions(-)

diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
index 5d291a7092..452841ae2f 100644
--- a/target/riscv/cpu.h
+++ b/target/riscv/cpu.h
@@ -710,6 +710,8 @@ void cpu_get_tb_cpu_state(CPURISCVState *env, vaddr *pc,
  void riscv_cpu_update_mask(CPURISCVState *env);
  bool riscv_cpu_is_32bit(RISCVCPU *cpu);
  
+RISCVException riscv_csrr(CPURISCVState *env, int csrno,

+  target_ulong *ret_value);
  RISCVException riscv_csrrw(CPURISCVState *env, int csrno,
 target_ulong *ret_value,
 target_ulong new_value, target_ulong write_mask);
diff --git a/target/riscv/csr.c b/target/riscv/csr.c
index d4e8ac13b9..0d14ba2ba5 100644
--- a/target/riscv/csr.c
+++ b/target/riscv/csr.c
@@ -4306,7 +4306,7 @@ static RISCVException rmw_seed(CPURISCVState *env, int 
csrno,
  
  static inline RISCVException riscv_csrrw_check(CPURISCVState *env,

 int csrno,
-   bool write_mask)
+   bool write)
  {
  /* check privileges and return RISCV_EXCP_ILLEGAL_INST if check fails */
  bool read_only = get_field(csrno, 0xC00) == 3;
@@ -4328,7 +4328,7 @@ static inline RISCVException 
riscv_csrrw_check(CPURISCVState *env,
  }
  
  /* read / write check */

-if (write_mask && read_only) {
+if (write && read_only) {
  return RISCV_EXCP_ILLEGAL_INST;
  }
  
@@ -4415,11 +4415,22 @@ static RISCVException riscv_csrrw_do64(CPURISCVState *env, int csrno,

  return RISCV_EXCP_NONE;
  }
  
+RISCVException riscv_csrr(CPURISCVState *env, int csrno,

+   target_ulong *ret_value)
+{
+RISCVException ret = riscv_csrrw_check(env, csrno, false);
+if (ret != RISCV_EXCP_NONE) {
+return ret;
+}
+
+return riscv_csrrw_do64(env, csrno, ret_value, 0, 0);
+}
+
  RISCVException riscv_csrrw(CPURISCVState *env, int csrno,
 target_ulong *ret_value,
 target_ulong new_value, target_ulong write_mask)
  {
-RISCVException ret = riscv_csrrw_check(env, csrno, write_mask);
+RISCVException ret = riscv_csrrw_check(env, csrno, true);
  if (ret != RISCV_EXCP_NONE) {
  return ret;
  }
diff --git a/target/riscv/op_helper.c b/target/riscv/op_helper.c
index f414aaebdb..f3aa705be8 100644
--- a/target/riscv/op_helper.c
+++ b/target/riscv/op_helper.c
@@ -51,7 +51,7 @@ target_ulong helper_csrr(CPURISCVState *env, int csr)
  }
  
  target_ulong val = 0;

-RISCVException ret = riscv_csrrw(env, csr, &val, 0, 0);
+RISCVException ret = riscv_csrr(env, csr, &val);
  
  if (ret != RISCV_EXCP_NONE) {

  riscv_raise_exception(env, ret, GETPC());


Hi Yu-Ming,

The 128-bit CSR operations have a similar error. Could you fix that in
this patch set as well?


Otherwise,

Reviewed-by: LIU Zhiwei 

Thanks,
Zhiwei

[PATCH V8 6/8] physmem: Add helper function to destroy CPU AddressSpace

Virtual CPU Hot-unplug leads to unrealization of a CPU object. This also
involves destruction of the CPU AddressSpace. Add common function to help
destroy the CPU AddressSpace.
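
The counting scheme used by such a helper can be sketched in standalone C. This is an illustration only: `struct cpu`, `cpu_as_init` and `cpu_as_destroy` are hypothetical names, and plain `calloc`/`free` stand in for the real allocation and RCU-deferred freeing.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified per-CPU container (not QEMU's CPUState). */
struct cpu {
    int *ases;       /* stands in for the address-space array */
    int num_ases;    /* size of the array */
    int live_count;  /* slots not yet destroyed */
};

static void cpu_as_init(struct cpu *c, int num_ases)
{
    assert(num_ases > 0);
    c->ases = calloc((size_t)num_ases, sizeof(*c->ases));
    assert(c->ases != NULL);
    c->num_ases = num_ases;
    c->live_count = num_ases;
}

/* Destroy one slot; free the array itself once the last slot is gone. */
static void cpu_as_destroy(struct cpu *c, int asidx)
{
    assert(c->ases != NULL);
    assert(asidx >= 0 && asidx < c->num_ases);
    if (--c->live_count == 0) {
        free(c->ases);
        c->ases = NULL;
    }
}
```

Keeping a separate live counter means the array size never changes, so index checks stay valid until the last slot is destroyed.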

Signed-off-by: Salil Mehta 
Tested-by: Vishnu Pajjuri 
Reviewed-by: Gavin Shan 
Tested-by: Xianglai Li 
Tested-by: Miguel Luis 
Reviewed-by: Shaoqin Huang 
---
 include/exec/cpu-common.h |  8 
 include/hw/core/cpu.h |  1 +
 system/physmem.c  | 29 +
 3 files changed, 38 insertions(+)

diff --git a/include/exec/cpu-common.h b/include/exec/cpu-common.h
index 6346df17ce..a427d80340 100644
--- a/include/exec/cpu-common.h
+++ b/include/exec/cpu-common.h
@@ -123,6 +123,14 @@ size_t qemu_ram_pagesize_largest(void);
  */
 void cpu_address_space_init(CPUState *cpu, int asidx,
 const char *prefix, MemoryRegion *mr);
+/**
+ * cpu_address_space_destroy:
+ * @cpu: CPU for which address space needs to be destroyed
+ * @asidx: integer index of this address space
+ *
+ * Note that with KVM only one address space is supported.
+ */
+void cpu_address_space_destroy(CPUState *cpu, int asidx);
 
 void cpu_physical_memory_rw(hwaddr addr, void *buf,
 hwaddr len, bool is_write);
diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
index ec14f74ce5..e975b8085f 100644
--- a/include/hw/core/cpu.h
+++ b/include/hw/core/cpu.h
@@ -493,6 +493,7 @@ struct CPUState {
 QSIMPLEQ_HEAD(, qemu_work_item) work_list;
 
 CPUAddressSpace *cpu_ases;
+int cpu_ases_count;
 int num_ases;
 AddressSpace *as;
 MemoryRegion *memory;
diff --git a/system/physmem.c b/system/physmem.c
index 6e9ed97597..61b32ac4f2 100644
--- a/system/physmem.c
+++ b/system/physmem.c
@@ -761,6 +761,7 @@ void cpu_address_space_init(CPUState *cpu, int asidx,
 
 if (!cpu->cpu_ases) {
 cpu->cpu_ases = g_new0(CPUAddressSpace, cpu->num_ases);
+cpu->cpu_ases_count = cpu->num_ases;
 }
 
 newas = &cpu->cpu_ases[asidx];
@@ -774,6 +775,34 @@ void cpu_address_space_init(CPUState *cpu, int asidx,
 }
 }
 
+void cpu_address_space_destroy(CPUState *cpu, int asidx)
+{
+CPUAddressSpace *cpuas;
+
+assert(cpu->cpu_ases);
+assert(asidx >= 0 && asidx < cpu->num_ases);
+/* KVM cannot currently support multiple address spaces. */
+assert(asidx == 0 || !kvm_enabled());
+
+cpuas = &cpu->cpu_ases[asidx];
+if (tcg_enabled()) {
+memory_listener_unregister(&cpuas->tcg_as_listener);
+}
+
+address_space_destroy(cpuas->as);
+g_free_rcu(cpuas->as, rcu);
+
+if (asidx == 0) {
+/* reset the convenience alias for address space 0 */
+cpu->as = NULL;
+}
+
+if (--cpu->cpu_ases_count == 0) {
+g_free(cpu->cpu_ases);
+cpu->cpu_ases = NULL;
+}
+}
+
 AddressSpace *cpu_get_address_space(CPUState *cpu, int asidx)
 {
 /* Return the AddressSpace corresponding to the specified index */
-- 
2.34.1

Re: [PATCH v3 1/3] hw/core: Cleanup unused included headers in cpu-common.c

2024-03-11 Thread Zhao Liu

> > Thanks for helping me verify this!!
> > 
> > EMM, but I'm still not understanding how this approach distinguishes
> > whether hw/core/cpu-common.c needs the header (include/exec/cpu-common.h)
> > directly or just include/exec/memory.h needs that header? For the latter,
> > the header needn't be included in .c file.
> 
> Yes, you are right, it might not be necessary.
> 
> For all other headers in your series I checked that no function /
> definition is used in the source, but "exec/cpu-common.h" is too
> big to do that manually.

Thanks! I checked manually as well... In the future I'll also think
about if there's a more elegant way to do it.

> I mostly skipped it because it is odd to
> have cpu-common.c not including the header with the same name...

Yes, I think the "cpu-common.c" is the related .c file of
exec/cpu-common.h.

And the related header of "hw/core/cpu-common.c" should be
"hw/core/cpu.h".

Could we rename "hw/core/cpu-common.c" to "hw/core/cpu.c"? Then the
relationship could be clear.

> Also, in another series I split / reworked "exec/cpu-common.h" and
> IIRC a part of it will be included here. Bah, I'll stop writing
> and take your patch unmodified.

Many thanks!

Regards,
Zhao

[PATCH V8 8/8] docs/specs/acpi_hw_reduced_hotplug: Add the CPU Hotplug Event Bit

GED interface is used by many hotplug events like memory hotplug, NVDIMM hotplug
and non-hotplug events like system power down event. Each of these can be
selected using a bit in the 32 bit GED IO interface. A bit has been reserved for
the CPU hotplug event.
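
As an illustration of the selector layout, a minimal standalone sketch follows. The enum and function names are hypothetical; only the bit positions (0 memory hotplug, 1 power down, 2 NVDIMM hotplug, 3 CPU hotplug) come from the table in this document.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions in the 32-bit GED selector; the names are illustrative. */
enum ged_event_bit {
    GED_BIT_MEM_HOTPLUG    = 0,
    GED_BIT_PWR_DOWN       = 1,
    GED_BIT_NVDIMM_HOTPLUG = 2,
    GED_BIT_CPU_HOTPLUG    = 3,
};

/* Build the selector value with a single event bit set. */
static uint32_t ged_selector(unsigned bit)
{
    assert(bit <= GED_BIT_CPU_HOTPLUG);  /* bits 4-31 are reserved */
    return UINT32_C(1) << bit;
}

/* Test whether a given event bit is set in a selector value. */
static int ged_event_pending(uint32_t sel, unsigned bit)
{
    return (sel & ged_selector(bit)) != 0;
}
```

Under this layout the CPU hotplug event corresponds to the value 0x8 in the selector.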

Signed-off-by: Salil Mehta 
Reviewed-by: Gavin Shan 
---
 docs/specs/acpi_hw_reduced_hotplug.rst | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/docs/specs/acpi_hw_reduced_hotplug.rst 
b/docs/specs/acpi_hw_reduced_hotplug.rst
index 0bd3f9399f..3acd6fcd8b 100644
--- a/docs/specs/acpi_hw_reduced_hotplug.rst
+++ b/docs/specs/acpi_hw_reduced_hotplug.rst
@@ -64,7 +64,8 @@ GED IO interface (4 byte access)
0: Memory hotplug event
1: System power down event
2: NVDIMM hotplug event
-3-31: Reserved
+   3: CPU hotplug event
+4-31: Reserved
 
 **write_access:**
 
-- 
2.34.1

[PATCH V8 7/8] gdbstub: Add helper function to unregister GDB register space

Add common function to help unregister the GDB register space. This shall be
done in context to the CPU unrealization.

Signed-off-by: Salil Mehta 
Tested-by: Vishnu Pajjuri 
Reviewed-by: Gavin Shan 
Tested-by: Xianglai Li 
Tested-by: Miguel Luis 
Reviewed-by: Shaoqin Huang 
---
 gdbstub/gdbstub.c  | 12 
 include/exec/gdbstub.h |  6 ++
 2 files changed, 18 insertions(+)

diff --git a/gdbstub/gdbstub.c b/gdbstub/gdbstub.c
index 17efcae0d0..a8449dc309 100644
--- a/gdbstub/gdbstub.c
+++ b/gdbstub/gdbstub.c
@@ -615,6 +615,18 @@ void gdb_register_coprocessor(CPUState *cpu,
 }
 }
 
+void gdb_unregister_coprocessor_all(CPUState *cpu)
+{
+/*
+ * Safe to nuke everything. GDBRegisterState::xml is static const char so
+ * it won't be freed
+ */
+g_array_free(cpu->gdb_regs, true);
+
+cpu->gdb_regs = NULL;
+cpu->gdb_num_g_regs = 0;
+}
+
 static void gdb_process_breakpoint_remove_all(GDBProcess *p)
 {
 CPUState *cpu = gdb_get_first_cpu_in_process(p);
diff --git a/include/exec/gdbstub.h b/include/exec/gdbstub.h
index eb14b91139..249d4d4bc8 100644
--- a/include/exec/gdbstub.h
+++ b/include/exec/gdbstub.h
@@ -49,6 +49,12 @@ void gdb_register_coprocessor(CPUState *cpu,
   gdb_get_reg_cb get_reg, gdb_set_reg_cb set_reg,
   const GDBFeature *feature, int g_pos);
 
+/**
+ * gdb_unregister_coprocessor_all() - unregisters supplemental set of registers
+ * @cpu - the CPU associated with registers
+ */
+void gdb_unregister_coprocessor_all(CPUState *cpu);
+
 /**
  * gdbserver_start: start the gdb server
  * @port_or_device: connection spec for gdb
-- 
2.34.1

[PATCH V8 4/8] hw/acpi: Update GED _EVT method AML with CPU scan

OSPM evaluates _EVT method to map the event. The CPU hotplug event eventually
results in start of the CPU scan. Scan figures out the CPU and the kind of
event(plug/unplug) and notifies it back to the guest. Update the GED AML _EVT
method with the call to \\_SB.CPUS.CSCN

Also, macro CPU_SCAN_METHOD might be referenced in other places, such as during
GED initialization, so it makes sense to place its definition in a common
header file like cpu_hotplug.h. But doing this can break compilation because of
the conflicting macro definitions present in cpu.c and cpu_hotplug.c, both of
which get compiled for historic x86 reasons, i.e. the decision to use the
legacy(GPE.2)/modern(GED) CPU hotplug interface happens at runtime [1]. To
mitigate this, for now, declare a new common macro ACPI_CPU_SCAN_METHOD for the
CPU scan method instead.
(This needs a separate discussion later on for clean-up)

Reference:
[1] 
https://lore.kernel.org/qemu-devel/1463496205-251412-24-git-send-email-imamm...@redhat.com/

Co-developed-by: Keqian Zhu 
Signed-off-by: Keqian Zhu 
Signed-off-by: Salil Mehta 
Reviewed-by: Jonathan Cameron 
Reviewed-by: Gavin Shan 
Tested-by: Vishnu Pajjuri 
Tested-by: Xianglai Li 
Tested-by: Miguel Luis 
Reviewed-by: Shaoqin Huang 
---
 hw/acpi/cpu.c  | 2 +-
 hw/acpi/generic_event_device.c | 4 
 include/hw/acpi/cpu_hotplug.h  | 2 ++
 3 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/hw/acpi/cpu.c b/hw/acpi/cpu.c
index 69aaa563db..bde5a42a0b 100644
--- a/hw/acpi/cpu.c
+++ b/hw/acpi/cpu.c
@@ -323,7 +323,7 @@ const VMStateDescription vmstate_cpu_hotplug = {
 #define CPUHP_RES_DEVICE  "PRES"
 #define CPU_LOCK  "CPLK"
 #define CPU_STS_METHOD"CSTA"
-#define CPU_SCAN_METHOD   "CSCN"
+#define CPU_SCAN_METHOD   ACPI_CPU_SCAN_METHOD
 #define CPU_NOTIFY_METHOD "CTFY"
 #define CPU_EJECT_METHOD  "CEJ0"
 #define CPU_OST_METHOD"COST"
diff --git a/hw/acpi/generic_event_device.c b/hw/acpi/generic_event_device.c
index 35a71505a5..58c7882555 100644
--- a/hw/acpi/generic_event_device.c
+++ b/hw/acpi/generic_event_device.c
@@ -109,6 +109,10 @@ void build_ged_aml(Aml *table, const char *name, 
HotplugHandler *hotplug_dev,
 aml_append(if_ctx, aml_call0(MEMORY_DEVICES_CONTAINER "."
  MEMORY_SLOT_SCAN_METHOD));
 break;
+case ACPI_GED_CPU_HOTPLUG_EVT:
+aml_append(if_ctx, aml_call0(ACPI_CPU_CONTAINER "."
+ ACPI_CPU_SCAN_METHOD));
+break;
 case ACPI_GED_PWR_DOWN_EVT:
 aml_append(if_ctx,
aml_notify(aml_name(ACPI_POWER_BUTTON_DEVICE),
diff --git a/include/hw/acpi/cpu_hotplug.h b/include/hw/acpi/cpu_hotplug.h
index 48b291e45e..ef631750b4 100644
--- a/include/hw/acpi/cpu_hotplug.h
+++ b/include/hw/acpi/cpu_hotplug.h
@@ -20,6 +20,8 @@
 #include "hw/acpi/cpu.h"
 
 #define ACPI_CPU_HOTPLUG_REG_LEN 12
+#define ACPI_CPU_SCAN_METHOD "CSCN"
+#define ACPI_CPU_CONTAINER "\\_SB.CPUS"
 
 typedef struct AcpiCpuHotplug {
 Object *device;
-- 
2.34.1

[PATCH V8 5/8] hw/acpi: Update CPUs AML with cpu-(ctrl)dev change

CPUs Control device(\\_SB.PCI0) register interface for the x86 arch is IO port
based and existing CPUs AML code assumes _CRS objects would evaluate to a system
resource which describes IO Port address. But on ARM arch CPUs control
device(\\_SB.PRES) register interface is memory-mapped hence _CRS object should
evaluate to system resource which describes memory-mapped base address. Update
build CPUs AML function to accept both IO/MEMORY region spaces and accordingly
update the _CRS object.

On x86, CPU Hotplug uses Generic ACPI GPE Block Bit 2 (GPE.2) event handler to
notify OSPM about any CPU hot(un)plug events. Latest CPU Hotplug is based on
ACPI Generic Event Device framework and uses ACPI GED device for the same. Not
all architectures support GPE based CPU Hotplug event handler. Hence, make AML
for GPE.2 event handler conditional.

Co-developed-by: Keqian Zhu 
Signed-off-by: Keqian Zhu 
Signed-off-by: Salil Mehta 
Reviewed-by: Gavin Shan 
Tested-by: Vishnu Pajjuri 
Reviewed-by: Jonathan Cameron 
Tested-by: Xianglai Li 
Tested-by: Miguel Luis 
Reviewed-by: Shaoqin Huang 
---
 hw/acpi/cpu.c | 23 ---
 hw/i386/acpi-build.c  |  3 ++-
 include/hw/acpi/cpu.h |  5 +++--
 3 files changed, 21 insertions(+), 10 deletions(-)

diff --git a/hw/acpi/cpu.c b/hw/acpi/cpu.c
index bde5a42a0b..b5fb2075d0 100644
--- a/hw/acpi/cpu.c
+++ b/hw/acpi/cpu.c
@@ -339,9 +339,10 @@ const VMStateDescription vmstate_cpu_hotplug = {
 #define CPU_FW_EJECT_EVENT "CEJF"
 
 void build_cpus_aml(Aml *table, MachineState *machine, CPUHotplugFeatures opts,
-build_madt_cpu_fn build_madt_cpu, hwaddr io_base,
+build_madt_cpu_fn build_madt_cpu, hwaddr base_addr,
 const char *res_root,
-const char *event_handler_method)
+const char *event_handler_method,
+AmlRegionSpace rs)
 {
 Aml *ifctx;
 Aml *field;
@@ -366,13 +367,19 @@ void build_cpus_aml(Aml *table, MachineState *machine, 
CPUHotplugFeatures opts,
 aml_append(cpu_ctrl_dev, aml_mutex(CPU_LOCK, 0));
 
 crs = aml_resource_template();
-aml_append(crs, aml_io(AML_DECODE16, io_base, io_base, 1,
+if (rs == AML_SYSTEM_IO) {
+aml_append(crs, aml_io(AML_DECODE16, base_addr, base_addr, 1,
ACPI_CPU_HOTPLUG_REG_LEN));
+} else {
+aml_append(crs, aml_memory32_fixed(base_addr,
+   ACPI_CPU_HOTPLUG_REG_LEN, AML_READ_WRITE));
+}
+
 aml_append(cpu_ctrl_dev, aml_name_decl("_CRS", crs));
 
 /* declare CPU hotplug MMIO region with related access fields */
 aml_append(cpu_ctrl_dev,
-aml_operation_region("PRST", AML_SYSTEM_IO, aml_int(io_base),
+aml_operation_region("PRST", rs, aml_int(base_addr),
  ACPI_CPU_HOTPLUG_REG_LEN));
 
 field = aml_field("PRST", AML_BYTE_ACC, AML_NOLOCK,
@@ -696,9 +703,11 @@ void build_cpus_aml(Aml *table, MachineState *machine, 
CPUHotplugFeatures opts,
 aml_append(sb_scope, cpus_dev);
 aml_append(table, sb_scope);
 
-method = aml_method(event_handler_method, 0, AML_NOTSERIALIZED);
-aml_append(method, aml_call0("\\_SB.CPUS." CPU_SCAN_METHOD));
-aml_append(table, method);
+if (event_handler_method) {
+method = aml_method(event_handler_method, 0, AML_NOTSERIALIZED);
+aml_append(method, aml_call0("\\_SB.CPUS." CPU_SCAN_METHOD));
+aml_append(table, method);
+}
 
 g_free(cphp_res_path);
 }
diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 15242b9096..f0cdfaf9aa 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -1536,7 +1536,8 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
 .fw_unplugs_cpu = pm->smi_on_cpu_unplug,
 };
 build_cpus_aml(dsdt, machine, opts, pc_madt_cpu_entry,
-   pm->cpu_hp_io_base, "\\_SB.PCI0", "\\_GPE._E02");
+   pm->cpu_hp_io_base, "\\_SB.PCI0", "\\_GPE._E02",
+   AML_SYSTEM_IO);
 }
 
 if (pcms->memhp_io_base && nr_mem) {
diff --git a/include/hw/acpi/cpu.h b/include/hw/acpi/cpu.h
index e6e1a9ef59..48cded697c 100644
--- a/include/hw/acpi/cpu.h
+++ b/include/hw/acpi/cpu.h
@@ -61,9 +61,10 @@ typedef void (*build_madt_cpu_fn)(int uid, const 
CPUArchIdList *apic_ids,
   GArray *entry, bool force_enabled);
 
 void build_cpus_aml(Aml *table, MachineState *machine, CPUHotplugFeatures opts,
-build_madt_cpu_fn build_madt_cpu, hwaddr io_base,
+build_madt_cpu_fn build_madt_cpu, hwaddr base_addr,
 const char *res_root,
-const char *event_handler_method);
+const char *event_handler_method,
+AmlRegionSpace rs);
 
 void acpi_cpu_ospm_status(CPUHotplugState *cpu_st, ACPIOSTInfoList ***list);

[PATCH V8 3/8] hw/acpi: Update ACPI GED framework to support vCPU Hotplug

ACPI GED (as described in the ACPI 6.4 spec) uses an interrupt listed in the
_CRS object of GED to notify OSPM about an event. OSPM then demultiplexes the
notified event by evaluating the ACPI _EVT method to determine the type of
event. Use ACPI GED to also notify the guest kernel about any CPU hot(un)plug
events.

ACPI CPU hotplug related initialization should only happen if ACPI_CPU_HOTPLUG
support has been enabled for particular architecture. Add cpu_hotplug_hw_init()
stub to avoid compilation break.

Co-developed-by: Keqian Zhu 
Signed-off-by: Keqian Zhu 
Signed-off-by: Salil Mehta 
Reviewed-by: Jonathan Cameron 
Reviewed-by: Gavin Shan 
Reviewed-by: David Hildenbrand 
Reviewed-by: Shaoqin Huang 
Tested-by: Vishnu Pajjuri 
Tested-by: Xianglai Li 
Tested-by: Miguel Luis 
---
 hw/acpi/acpi-cpu-hotplug-stub.c|  6 ++
 hw/acpi/generic_event_device.c | 17 +
 include/hw/acpi/generic_event_device.h |  4 
 3 files changed, 27 insertions(+)

diff --git a/hw/acpi/acpi-cpu-hotplug-stub.c b/hw/acpi/acpi-cpu-hotplug-stub.c
index 3fc4b14c26..c6c61bb9cd 100644
--- a/hw/acpi/acpi-cpu-hotplug-stub.c
+++ b/hw/acpi/acpi-cpu-hotplug-stub.c
@@ -19,6 +19,12 @@ void legacy_acpi_cpu_hotplug_init(MemoryRegion *parent, 
Object *owner,
 return;
 }
 
+void cpu_hotplug_hw_init(MemoryRegion *as, Object *owner,
+ CPUHotplugState *state, hwaddr base_addr)
+{
+return;
+}
+
 void acpi_cpu_ospm_status(CPUHotplugState *cpu_st, ACPIOSTInfoList ***list)
 {
 return;
diff --git a/hw/acpi/generic_event_device.c b/hw/acpi/generic_event_device.c
index 2d6e91b124..35a71505a5 100644
--- a/hw/acpi/generic_event_device.c
+++ b/hw/acpi/generic_event_device.c
@@ -12,6 +12,7 @@
 #include "qemu/osdep.h"
 #include "qapi/error.h"
 #include "hw/acpi/acpi.h"
+#include "hw/acpi/cpu.h"
 #include "hw/acpi/generic_event_device.h"
 #include "hw/irq.h"
 #include "hw/mem/pc-dimm.h"
@@ -25,6 +26,7 @@ static const uint32_t ged_supported_events[] = {
 ACPI_GED_MEM_HOTPLUG_EVT,
 ACPI_GED_PWR_DOWN_EVT,
 ACPI_GED_NVDIMM_HOTPLUG_EVT,
+ACPI_GED_CPU_HOTPLUG_EVT,
 };
 
 /*
@@ -234,6 +236,8 @@ static void acpi_ged_device_plug_cb(HotplugHandler 
*hotplug_dev,
 } else {
 acpi_memory_plug_cb(hotplug_dev, &s->memhp_state, dev, errp);
 }
+} else if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
+acpi_cpu_plug_cb(hotplug_dev, &s->cpuhp_state, dev, errp);
 } else {
 error_setg(errp, "virt: device plug request for unsupported device"
" type: %s", object_get_typename(OBJECT(dev)));
@@ -248,6 +252,8 @@ static void acpi_ged_unplug_request_cb(HotplugHandler 
*hotplug_dev,
 if ((object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM) &&
!(object_dynamic_cast(OBJECT(dev), TYPE_NVDIMM)))) {
 acpi_memory_unplug_request_cb(hotplug_dev, &s->memhp_state, dev, errp);
+} else if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
+acpi_cpu_unplug_request_cb(hotplug_dev, &s->cpuhp_state, dev, errp);
 } else {
 error_setg(errp, "acpi: device unplug request for unsupported device"
" type: %s", object_get_typename(OBJECT(dev)));
@@ -261,6 +267,8 @@ static void acpi_ged_unplug_cb(HotplugHandler *hotplug_dev,
 
 if (object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
 acpi_memory_unplug_cb(&s->memhp_state, dev, errp);
+} else if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
+acpi_cpu_unplug_cb(&s->cpuhp_state, dev, errp);
 } else {
 error_setg(errp, "acpi: device unplug for unsupported device"
" type: %s", object_get_typename(OBJECT(dev)));
@@ -272,6 +280,7 @@ static void acpi_ged_ospm_status(AcpiDeviceIf *adev, 
ACPIOSTInfoList ***list)
 AcpiGedState *s = ACPI_GED(adev);
 
 acpi_memory_ospm_status(&s->memhp_state, list);
+acpi_cpu_ospm_status(&s->cpuhp_state, list);
 }
 
 static void acpi_ged_send_event(AcpiDeviceIf *adev, AcpiEventStatusBits ev)
@@ -286,6 +295,8 @@ static void acpi_ged_send_event(AcpiDeviceIf *adev, 
AcpiEventStatusBits ev)
 sel = ACPI_GED_PWR_DOWN_EVT;
 } else if (ev & ACPI_NVDIMM_HOTPLUG_STATUS) {
 sel = ACPI_GED_NVDIMM_HOTPLUG_EVT;
+} else if (ev & ACPI_CPU_HOTPLUG_STATUS) {
+sel = ACPI_GED_CPU_HOTPLUG_EVT;
 } else {
 /* Unknown event. Return without generating interrupt. */
 warn_report("GED: Unsupported event %d. No irq injected", ev);
@@ -400,6 +411,12 @@ static void acpi_ged_initfn(Object *obj)
 memory_region_init_io(&ged_st->regs, obj, &ged_regs_ops, ged_st,
   TYPE_ACPI_GED "-regs", ACPI_GED_REG_COUNT);
 sysbus_init_mmio(sbd, &ged_st->regs);
+
+memory_region_init(&s->container_cpuhp, OBJECT(dev), "cpuhp container",
+   ACPI_CPU_HOTPLUG_REG_LEN);
+sysbus_init_mmio(SYS_BUS_DEVICE(dev), &s->container_cpuhp);
+cpu_hotplug_hw_init(&s->container_cpuhp, OBJECT(dev),
+&s->cpuhp_state, 0);
 }
 
 static

[PATCH V8 2/8] hw/acpi: Move CPU ctrl-dev MMIO region len macro to common header file

CPU ctrl-dev MMIO region length could be used in ACPI GED and various other
architecture specific places. Move ACPI_CPU_HOTPLUG_REG_LEN macro to more
appropriate common header file.

Signed-off-by: Salil Mehta 
Reviewed-by: Alex Bennée 
Reviewed-by: Jonathan Cameron 
Reviewed-by: Gavin Shan 
Reviewed-by: David Hildenbrand 
Reviewed-by: Shaoqin Huang 
Tested-by: Vishnu Pajjuri 
Tested-by: Xianglai Li 
Tested-by: Miguel Luis 
---
 hw/acpi/cpu.c | 2 +-
 include/hw/acpi/cpu_hotplug.h | 2 ++
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/hw/acpi/cpu.c b/hw/acpi/cpu.c
index 2d81c1e790..69aaa563db 100644
--- a/hw/acpi/cpu.c
+++ b/hw/acpi/cpu.c
@@ -1,13 +1,13 @@
 #include "qemu/osdep.h"
 #include "migration/vmstate.h"
 #include "hw/acpi/cpu.h"
+#include "hw/acpi/cpu_hotplug.h"
 #include "hw/core/cpu.h"
 #include "qapi/error.h"
 #include "qapi/qapi-events-acpi.h"
 #include "trace.h"
 #include "sysemu/numa.h"
 
-#define ACPI_CPU_HOTPLUG_REG_LEN 12
 #define ACPI_CPU_SELECTOR_OFFSET_WR 0
 #define ACPI_CPU_FLAGS_OFFSET_RW 4
 #define ACPI_CPU_CMD_OFFSET_WR 5
diff --git a/include/hw/acpi/cpu_hotplug.h b/include/hw/acpi/cpu_hotplug.h
index 3b932a..48b291e45e 100644
--- a/include/hw/acpi/cpu_hotplug.h
+++ b/include/hw/acpi/cpu_hotplug.h
@@ -19,6 +19,8 @@
 #include "hw/hotplug.h"
 #include "hw/acpi/cpu.h"
 
+#define ACPI_CPU_HOTPLUG_REG_LEN 12
+
 typedef struct AcpiCpuHotplug {
 Object *device;
 MemoryRegion io;
-- 
2.34.1

[PATCH V8 1/8] accel/kvm: Extract common KVM vCPU {creation, parking} code

KVM vCPU creation is done once during the vCPU realization when Qemu vCPU thread
is spawned. This is common to all the architectures as of now.

Hot-unplug of vCPU results in destruction of the vCPU object in QOM but the
corresponding KVM vCPU object in the Host KVM is not destroyed as KVM doesn't
support vCPU removal. Therefore, its representative KVM vCPU object/context in
Qemu is parked.

Refactor architecture common logic so that some APIs could be reused by vCPU
Hotplug code of some architectures like ARM, Loongson etc. Update new/old APIs
with trace events instead of DPRINTF. No functional change is intended here.
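
The park-and-reuse idea can be sketched in standalone C as follows. This is an illustration under assumptions, not the QEMU code: `struct parked_vcpu`, `park_vcpu` and `unpark_vcpu` are hypothetical names, and a plain singly linked list stands in for the real list of parked vCPUs.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical parked-vCPU record: a singly linked list of (id, fd) pairs. */
struct parked_vcpu {
    unsigned long vcpu_id;
    int fd;
    struct parked_vcpu *next;
};

/* Park: remember the fd under its vCPU id instead of closing it. */
static void park_vcpu(struct parked_vcpu **head, unsigned long vcpu_id, int fd)
{
    struct parked_vcpu *p = calloc(1, sizeof(*p));

    assert(p != NULL);
    p->vcpu_id = vcpu_id;
    p->fd = fd;
    p->next = *head;
    *head = p;
}

/* Unpark: return the saved fd and drop the record, or -1 if none is parked. */
static int unpark_vcpu(struct parked_vcpu **head, unsigned long vcpu_id)
{
    for (struct parked_vcpu **pp = head; *pp != NULL; pp = &(*pp)->next) {
        if ((*pp)->vcpu_id == vcpu_id) {
            struct parked_vcpu *p = *pp;
            int fd = p->fd;

            *pp = p->next;
            free(p);
            return fd;
        }
    }
    return -1;
}
```

A creation path built on this would first call `unpark_vcpu()` and only request a new vCPU from the kernel when it returns -1.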

Signed-off-by: Salil Mehta 
Reviewed-by: Gavin Shan 
Tested-by: Vishnu Pajjuri 
Reviewed-by: Jonathan Cameron 
Tested-by: Xianglai Li 
Tested-by: Miguel Luis 
Reviewed-by: Shaoqin Huang 
---
 accel/kvm/kvm-all.c| 64 --
 accel/kvm/trace-events |  5 +++-
 include/sysemu/kvm.h   | 16 +++
 3 files changed, 69 insertions(+), 16 deletions(-)

diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index a8cecd040e..3bc3207bda 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -126,6 +126,7 @@ static QemuMutex kml_slots_lock;
 #define kvm_slots_unlock()  qemu_mutex_unlock(&kml_slots_lock)
 
 static void kvm_slot_init_dirty_bitmap(KVMSlot *mem);
+static int kvm_get_vcpu(KVMState *s, unsigned long vcpu_id);
 
 static inline void kvm_resample_fd_remove(int gsi)
 {
@@ -314,14 +315,53 @@ err:
 return ret;
 }
 
+void kvm_park_vcpu(CPUState *cpu)
+{
+struct KVMParkedVcpu *vcpu;
+
+trace_kvm_park_vcpu(cpu->cpu_index, kvm_arch_vcpu_id(cpu));
+
+vcpu = g_malloc0(sizeof(*vcpu));
+vcpu->vcpu_id = kvm_arch_vcpu_id(cpu);
+vcpu->kvm_fd = cpu->kvm_fd;
+QLIST_INSERT_HEAD(&kvm_state->kvm_parked_vcpus, vcpu, node);
+}
+
+int kvm_create_vcpu(CPUState *cpu)
+{
+unsigned long vcpu_id = kvm_arch_vcpu_id(cpu);
+KVMState *s = kvm_state;
+int kvm_fd;
+
+trace_kvm_create_vcpu(cpu->cpu_index, kvm_arch_vcpu_id(cpu));
+
+/* check if the KVM vCPU already exist but is parked */
+kvm_fd = kvm_get_vcpu(s, vcpu_id);
+if (kvm_fd < 0) {
+/* vCPU not parked: create a new KVM vCPU */
+kvm_fd = kvm_vm_ioctl(s, KVM_CREATE_VCPU, vcpu_id);
+if (kvm_fd < 0) {
+error_report("KVM_CREATE_VCPU IOCTL failed for vCPU %lu", vcpu_id);
+return kvm_fd;
+}
+}
+
+cpu->kvm_fd = kvm_fd;
+cpu->kvm_state = s;
+cpu->vcpu_dirty = true;
+cpu->dirty_pages = 0;
+cpu->throttle_us_per_full = 0;
+
+return 0;
+}
+
 static int do_kvm_destroy_vcpu(CPUState *cpu)
 {
 KVMState *s = kvm_state;
 long mmap_size;
-struct KVMParkedVcpu *vcpu = NULL;
 int ret = 0;
 
-trace_kvm_destroy_vcpu();
+trace_kvm_destroy_vcpu(cpu->cpu_index, kvm_arch_vcpu_id(cpu));
 
 ret = kvm_arch_destroy_vcpu(cpu);
 if (ret < 0) {
@@ -347,10 +387,7 @@ static int do_kvm_destroy_vcpu(CPUState *cpu)
 }
 }
 
-vcpu = g_malloc0(sizeof(*vcpu));
-vcpu->vcpu_id = kvm_arch_vcpu_id(cpu);
-vcpu->kvm_fd = cpu->kvm_fd;
-QLIST_INSERT_HEAD(&kvm_state->kvm_parked_vcpus, vcpu, node);
+kvm_park_vcpu(cpu);
 err:
 return ret;
 }
@@ -371,6 +408,8 @@ static int kvm_get_vcpu(KVMState *s, unsigned long vcpu_id)
 if (cpu->vcpu_id == vcpu_id) {
 int kvm_fd;
 
+trace_kvm_get_vcpu(vcpu_id);
+
 QLIST_REMOVE(cpu, node);
 kvm_fd = cpu->kvm_fd;
 g_free(cpu);
@@ -378,7 +417,7 @@ static int kvm_get_vcpu(KVMState *s, unsigned long vcpu_id)
 }
 }
 
-return kvm_vm_ioctl(s, KVM_CREATE_VCPU, (void *)vcpu_id);
+return -ENOENT;
 }
 
 int kvm_init_vcpu(CPUState *cpu, Error **errp)
@@ -389,19 +428,14 @@ int kvm_init_vcpu(CPUState *cpu, Error **errp)
 
 trace_kvm_init_vcpu(cpu->cpu_index, kvm_arch_vcpu_id(cpu));
 
-ret = kvm_get_vcpu(s, kvm_arch_vcpu_id(cpu));
+ret = kvm_create_vcpu(cpu);
 if (ret < 0) {
-error_setg_errno(errp, -ret, "kvm_init_vcpu: kvm_get_vcpu failed (%lu)",
+error_setg_errno(errp, -ret,
+ "kvm_init_vcpu: kvm_create_vcpu failed (%lu)",
  kvm_arch_vcpu_id(cpu));
 goto err;
 }
 
-cpu->kvm_fd = ret;
-cpu->kvm_state = s;
-cpu->vcpu_dirty = true;
-cpu->dirty_pages = 0;
-cpu->throttle_us_per_full = 0;
-
 mmap_size = kvm_ioctl(s, KVM_GET_VCPU_MMAP_SIZE, 0);
 if (mmap_size < 0) {
 ret = mmap_size;
diff --git a/accel/kvm/trace-events b/accel/kvm/trace-events
index a25902597b..5558cff0dc 100644
--- a/accel/kvm/trace-events
+++ b/accel/kvm/trace-events
@@ -9,6 +9,10 @@ kvm_device_ioctl(int fd, int type, void *arg) "dev fd %d, type 0x%x, arg %p"
 kvm_failed_reg_get(uint64_t id, const char *msg) "Warning: Unable to retrieve ONEREG %" PRIu64 " from KVM: %s"
 kvm_failed_reg_set(uint64_t id, const char *msg) "Warning: Unable to set ONEREG %" PRIu64 " to KVM: %s"

[PATCH V8 0/8] Add architecture agnostic code to support vCPU Hotplug

Virtual CPU hotplug support is being added across various architectures[1][3].
This series adds various code bits common across all architectures:

1. vCPU creation and Parking code refactor [Patch 1]
2. Update ACPI GED framework to support vCPU Hotplug [Patch 2,3]
3. ACPI CPUs AML code change [Patch 4,5]
4. Helper functions to support unrealization of CPU objects [Patch 6,7]
5. Docs [Patch 8]


Repository:

[*] https://github.com/salil-mehta/qemu.git virt-cpuhp-armv8/rfc-v2.common.v8


Revision History:

Patch-set V7 -> V8

1. Rebased and Fixed the conflicts

Patch-set  V6 -> V7
1. Addressed Alex Bennée's comments
   - Updated the docs
2. Addressed Igor Mammedov's comments
   - Merged patches [Patch V6 3/9] & [Patch V6 7/9] with [Patch V6 4/9]
   - Updated commit-log of [Patch V6 1/9] and [Patch V6 5/9] 
3. Added Shaoqin Huang's Reviewed-by tags for whole series.
Link: 
https://lore.kernel.org/qemu-devel/20231013105129.25648-1-salil.me...@huawei.com/

Patch-set  V5 -> V6
1. Addressed Gavin Shan's comments
   - Fixed the assert() ranges of address spaces
   - Rebased the patch-set to latest changes in the qemu.git
   - Added Reviewed-by tags for patches {8,9}
2. Addressed Jonathan Cameron's comments
   - Updated commit-log for [Patch V5 1/9] with mention of trace events
   - Added Reviewed-by tags for patches {1,5}
3. Added Tested-by tags from Xianglai Li
4. Fixed checkpatch.pl error "Qemu -> QEMU" in [Patch V5 1/9] 
Link: 
https://lore.kernel.org/qemu-devel/20231011194355.15628-1-salil.me...@huawei.com/

Patch-set  V4 -> V5
1. Addressed Gavin Shan's comments
   - Fixed the trace events print string for kvm_{create,get,park,destroy}_vcpu
   - Added Reviewed-by tag for patch {1}
2. Added Shaoqin Huang's Reviewed-by tags for Patches {2,3}
3. Added Tested-by Tag from Vishnu Pajjuri to the patch-set
4. Dropped the ARM specific [Patch V4 10/10]
Link: 
https://lore.kernel.org/qemu-devel/20231009203601.17584-1-salil.me...@huawei.com/

Patch-set  V3 -> V4
1. Addressed David Hildenbrand's comments
   - Fixed the wrong doc comment of kvm_park_vcpu API prototype
   - Added Reviewed-by tags for patches {2,4}
Link: 
https://lore.kernel.org/qemu-devel/20231009112812.10612-1-salil.me...@huawei.com/

Patch-set  V2 -> V3
1. Addressed Jonathan Cameron's comments
   - Fixed 'vcpu-id' type wrongly changed from 'unsigned long' to 'integer'
   - Removed unnecessary use of variable 'vcpu_id' in kvm_park_vcpu
   - Updated [Patch V2 3/10] commit-log with details of ACPI_CPU_SCAN_METHOD macro
   - Updated [Patch V2 5/10] commit-log with details of conditional event handler method
   - Added Reviewed-by tags for patches {2,3,4,6,7}
2. Addressed Gavin Shan's comments
   - Removed unnecessary use of variable 'vcpu_id' in kvm_park_vcpu
   - Fixed return value in kvm_get_vcpu from -1 to -ENOENT
   - Reset the value of 'gdb_num_g_regs' in gdb_unregister_coprocessor_all
   - Fixed the kvm_{create,park}_vcpu prototypes docs
   - Added Reviewed-by tags for patches {2,3,4,5,6,7,9,10}
3. Addressed one earlier missed comment by Alex Bennée in RFC V1
   - Added traces instead of DPRINTF in the newly added and some existing functions
Link: 
https://lore.kernel.org/qemu-devel/20230930001933.2660-1-salil.me...@huawei.com/

Patch-set V1 -> V2
1. Addressed Alex Bennée's comments
   - Refactored the kvm_create_vcpu logic to get rid of goto
   - Added the docs for kvm_{create,park}_vcpu prototypes
   - Split the gdbstub and AddressSpace destruction change into separate patches
   - Added Reviewed-by tags for patches {2,10}
Link: 
https://lore.kernel.org/qemu-devel/20230929124304.13672-1-salil.me...@huawei.com/

References:

[1] 
https://lore.kernel.org/qemu-devel/20230926100436.28284-1-salil.me...@huawei.com/
[2] https://lore.kernel.org/all/20230913163823.7880-1-james.mo...@arm.com/
[3] 
https://lore.kernel.org/qemu-devel/cover.1695697701.git.lixiang...@loongson.cn/



Salil Mehta (8):
  accel/kvm: Extract common KVM vCPU {creation,parking} code
  hw/acpi: Move CPU ctrl-dev MMIO region len macro to common header file
  hw/acpi: Update ACPI GED framework to support vCPU Hotplug
  hw/acpi: Update GED _EVT method AML with CPU scan
  hw/acpi: Update CPUs AML with cpu-(ctrl)dev change
  physmem: Add helper function to destroy CPU AddressSpace
  gdbstub: Add helper function to unregister GDB register space
  docs/specs/acpi_hw_reduced_hotplug: Add the CPU Hotplug Event Bit

 accel/kvm/kvm-all.c| 64 --
 accel/kvm/trace-events |  5 +-
 docs/specs/acpi_hw_reduced_hotplug.rst |  3 +-
 gdbstub/gdbstub.c  | 12 +
 hw/acpi/acpi-cpu-hotplug-stub.c|  6 +++
 hw/acpi/cpu.c  | 27 +++
 hw/acpi/generic_event_device.c | 21 +
 hw/i386/acpi-build.c   |  3 +-
 include/exec/cpu-common.h  |  8 
 include/exec/gdbstub.h |  6 +++
 include/hw/acpi/cpu.h  |  5 +-

Re: [PATCH v12 4/7] target/riscv: remove 'over' brconds from vector trans

2024-03-11 Thread LIU Zhiwei




On 2024/3/12 2:08, Daniel Henrique Barboza wrote:

The previous patch added an early vstart >= vl exit in all vector
helpers, most of them using the VSTART_CHECK_EARLY_EXIT() macro,
and now we're left with a lot of 'brcond' ops that have no use. The
pattern goes like this:

 VSTART_CHECK_EARLY_EXIT(env);
 (...)
 tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over);
 (...)
 gen_set_label(over);
 return true;

The early exit makes the 'brcond' unneeded since it's already guaranteed that
vstart < vl. Remove all 'over' conditionals from the vector helpers.

Note that not all insns use helpers, and for those cases the 'brcond'
jump is the only way to filter vstart >= vl. This is the case of
trans_vmv_s_x() and trans_vfmv_s_f(). We won't remove the 'brcond'
conditionals from them.

While we're at it, remove the (vl == 0) brconds from trans_rvbf16.c.inc
too since they're unneeded.
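
As a rough illustration of why the call-site branch becomes redundant, here is
a plain C model, not TCG code. Everything in it is hypothetical: struct vec_env,
vec_helper and the two callers only mimic the shape of a helper with an early
exit, and the model ignores any other state the real macro may touch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, minimal model of the vector state the check looks at. */
struct vec_env {
    uint32_t vstart;
    uint32_t vl;
};

static int helper_work_count;    /* how often the helper body really ran */

/* Helper with its own early exit, as the previous patch arranges. */
static void vec_helper(struct vec_env *env)
{
    assert(env);
    if (env->vstart >= env->vl) {
        return;                  /* nothing to do: early exit */
    }
    helper_work_count++;
}

/* Old-style caller: repeats the same condition before calling. */
static void call_guarded(struct vec_env *env)
{
    if (env->vstart >= env->vl) {
        return;                  /* models the 'over' brcond */
    }
    vec_helper(env);
}

/* New-style caller: relies on the helper's own check. */
static void call_unguarded(struct vec_env *env)
{
    vec_helper(env);
}
```

Both callers make the helper body run for exactly the same inputs, so the
extra guard only costs a branch. The argument does not apply where no helper is
called at all, which is why such translators keep their own check.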

Suggested-by: Richard Henderson 
Signed-off-by: Daniel Henrique Barboza 
Reviewed-by: Richard Henderson 
Reviewed-by: Alistair Francis 
---
  target/riscv/insn_trans/trans_rvbf16.c.inc |  12 ---
  target/riscv/insn_trans/trans_rvv.c.inc| 108 -
  target/riscv/insn_trans/trans_rvvk.c.inc   |  18 
  3 files changed, 138 deletions(-)

diff --git a/target/riscv/insn_trans/trans_rvbf16.c.inc 
b/target/riscv/insn_trans/trans_rvbf16.c.inc
index 8ee99df3f3..a842e76a6b 100644
--- a/target/riscv/insn_trans/trans_rvbf16.c.inc
+++ b/target/riscv/insn_trans/trans_rvbf16.c.inc
@@ -71,11 +71,8 @@ static bool trans_vfncvtbf16_f_f_w(DisasContext *ctx, 
arg_vfncvtbf16_f_f_w *a)
  
  if (opfv_narrow_check(ctx, a) && (ctx->sew == MO_16)) {

  uint32_t data = 0;
-TCGLabel *over = gen_new_label();
  
  gen_set_rm_chkfrm(ctx, RISCV_FRM_DYN);

-tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_vl, 0, over);
-tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over);
  
  data = FIELD_DP32(data, VDATA, VM, a->vm);

  data = FIELD_DP32(data, VDATA, LMUL, ctx->lmul);
@@ -87,7 +84,6 @@ static bool trans_vfncvtbf16_f_f_w(DisasContext *ctx, 
arg_vfncvtbf16_f_f_w *a)
 ctx->cfg_ptr->vlenb, data,
 gen_helper_vfncvtbf16_f_f_w);
  mark_vs_dirty(ctx);
-gen_set_label(over);
  return true;
  }
  return false;
@@ -100,11 +96,8 @@ static bool trans_vfwcvtbf16_f_f_v(DisasContext *ctx, 
arg_vfwcvtbf16_f_f_v *a)
  
  if (opfv_widen_check(ctx, a) && (ctx->sew == MO_16)) {

  uint32_t data = 0;
-TCGLabel *over = gen_new_label();
  
  gen_set_rm_chkfrm(ctx, RISCV_FRM_DYN);

-tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_vl, 0, over);
-tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over);
  
  data = FIELD_DP32(data, VDATA, VM, a->vm);

  data = FIELD_DP32(data, VDATA, LMUL, ctx->lmul);
@@ -116,7 +109,6 @@ static bool trans_vfwcvtbf16_f_f_v(DisasContext *ctx, 
arg_vfwcvtbf16_f_f_v *a)
 ctx->cfg_ptr->vlenb, data,
 gen_helper_vfwcvtbf16_f_f_v);
  mark_vs_dirty(ctx);
-gen_set_label(over);
  return true;
  }
  return false;
@@ -130,11 +122,8 @@ static bool trans_vfwmaccbf16_vv(DisasContext *ctx, 
arg_vfwmaccbf16_vv *a)
  if (require_rvv(ctx) && vext_check_isa_ill(ctx) && (ctx->sew == MO_16) &&
  vext_check_dss(ctx, a->rd, a->rs1, a->rs2, a->vm)) {
  uint32_t data = 0;
-TCGLabel *over = gen_new_label();
  
  gen_set_rm_chkfrm(ctx, RISCV_FRM_DYN);

-tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_vl, 0, over);
-tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over);
  
  data = FIELD_DP32(data, VDATA, VM, a->vm);

  data = FIELD_DP32(data, VDATA, LMUL, ctx->lmul);
@@ -147,7 +136,6 @@ static bool trans_vfwmaccbf16_vv(DisasContext *ctx, 
arg_vfwmaccbf16_vv *a)
 ctx->cfg_ptr->vlenb, data,
 gen_helper_vfwmaccbf16_vv);
  mark_vs_dirty(ctx);
-gen_set_label(over);
  return true;
  }
  return false;
diff --git a/target/riscv/insn_trans/trans_rvv.c.inc 
b/target/riscv/insn_trans/trans_rvv.c.inc
index 8c16a9f5b3..4c1a064cf6 100644
--- a/target/riscv/insn_trans/trans_rvv.c.inc
+++ b/target/riscv/insn_trans/trans_rvv.c.inc
@@ -616,9 +616,6 @@ static bool ldst_us_trans(uint32_t vd, uint32_t rs1, 
uint32_t data,
  TCGv base;
  TCGv_i32 desc;
  
-TCGLabel *over = gen_new_label();

-tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over);
-
  dest = tcg_temp_new_ptr();
  mask = tcg_temp_new_ptr();
  base = get_gpr(s, rs1, EXT_NONE);
@@ -660,7 +657,6 @@ static bool ldst_us_trans(uint32_t vd, uint32_t rs1, 
uint32_t data,
  tcg_gen_mb(TCG_MO_ALL | TCG_BAR_LDAQ);
  }
  
-gen_set_label(over);

  return true;
  }
  
@@ -802,9 +798,6 @@ static bool ldst_stride_trans(uint32_t vd, uint32_t

RE: [PATCH V2 09/11] migration: privatize colo interfaces

2024-03-11 Thread Zhang, Chen




> -Original Message-
> From: Steve Sistare 
> Sent: Tuesday, March 12, 2024 1:49 AM
> To: qemu-devel@nongnu.org
> Cc: Alex Williamson ; Cedric Le Goater
> ; Michael S. Tsirkin ; David Hildenbrand
> ; Peter Xu ; Fabiano Rosas
> ; Zhang, Hailiang ; Zhang,
> Chen ; Li Zhijian ; Jason Wang
> ; Hyman Huang ; Song
> Gao ; Alistair Francis ;
> Steve Sistare 
> Subject: [PATCH V2 09/11] migration: privatize colo interfaces
> 
> Remove private migration interfaces from net/colo-compare.c and push them
> to migration/colo.c.
> 
> Signed-off-by: Steve Sistare 

Reviewed-by: Zhang Chen 

Thanks
Chen

> ---
>  migration/colo.c   | 17 +++--
>  net/colo-compare.c |  3 +--
>  stubs/colo.c   |  1 -
>  3 files changed, 12 insertions(+), 9 deletions(-)
> 
> diff --git a/migration/colo.c b/migration/colo.c index 315e31f..84632a6
> 100644
> --- a/migration/colo.c
> +++ b/migration/colo.c
> @@ -63,9 +63,9 @@ static bool colo_runstate_is_stopped(void)
>  return runstate_check(RUN_STATE_COLO) || !runstate_is_running();  }
> 
> -static void colo_checkpoint_notify(void *opaque)
> +static void colo_checkpoint_notify(void)
>  {
> -MigrationState *s = opaque;
> +MigrationState *s = migrate_get_current();
>  int64_t next_notify_time;
> 
>  qemu_event_set(>colo_checkpoint_event);
> @@ -74,10 +74,15 @@ static void colo_checkpoint_notify(void *opaque)
>  timer_mod(s->colo_delay_timer, next_notify_time);  }
> 
> +static void colo_checkpoint_notify_timer(void *opaque) {
> +colo_checkpoint_notify();
> +}
> +
>  void colo_checkpoint_delay_set(void)
>  {
>  if (migration_in_colo_state()) {
> -colo_checkpoint_notify(migrate_get_current());
> +colo_checkpoint_notify();
>  }
>  }
> 
> @@ -162,7 +167,7 @@ static void primary_vm_do_failover(void)
>   * kick COLO thread which might wait at
>   * qemu_sem_wait(>colo_checkpoint_sem).
>   */
> -colo_checkpoint_notify(s);
> +colo_checkpoint_notify();
> 
>  /*
>   * Wake up COLO thread which may blocked in recv() or send(), @@ -518,7
> +523,7 @@ out:
> 
>  static void colo_compare_notify_checkpoint(Notifier *notifier, void *data)  {
> -colo_checkpoint_notify(data);
> +colo_checkpoint_notify();
>  }
> 
>  static void colo_process_checkpoint(MigrationState *s) @@ -642,7 +647,7
> @@ void migrate_start_colo_process(MigrationState *s)
>  bql_unlock();
>  qemu_event_init(>colo_checkpoint_event, false);
>  s->colo_delay_timer =  timer_new_ms(QEMU_CLOCK_HOST,
> -colo_checkpoint_notify, s);
> +colo_checkpoint_notify_timer, NULL);
> 
>  qemu_sem_init(>colo_exit_sem, 0);
>  colo_process_checkpoint(s);
> diff --git a/net/colo-compare.c b/net/colo-compare.c index f2dfc0e..c4ad0ab
> 100644
> --- a/net/colo-compare.c
> +++ b/net/colo-compare.c
> @@ -28,7 +28,6 @@
>  #include "sysemu/iothread.h"
>  #include "net/colo-compare.h"
>  #include "migration/colo.h"
> -#include "migration/migration.h"
>  #include "util.h"
> 
>  #include "block/aio-wait.h"
> @@ -189,7 +188,7 @@ static void
> colo_compare_inconsistency_notify(CompareState *s)
>  notify_remote_frame(s);
>  } else {
>  notifier_list_notify(_compare_notifiers,
> - migrate_get_current());
> + NULL);
>  }
>  }
> 
> diff --git a/stubs/colo.c b/stubs/colo.c index 08c9f98..f8c069b 100644
> --- a/stubs/colo.c
> +++ b/stubs/colo.c
> @@ -2,7 +2,6 @@
>  #include "qemu/notify.h"
>  #include "net/colo-compare.h"
>  #include "migration/colo.h"
> -#include "migration/migration.h"
>  #include "qemu/error-report.h"
>  #include "qapi/qapi-commands-migration.h"
> 
> --
> 1.8.3.1

[PATCH] meson: Make DEBUG_REMAP a meson option

2024-03-11 Thread Ilya Leoshkevich

Currently DEBUG_REMAP is a macro that needs to be manually #defined to
be activated, which makes it hard to have separate build directories
dedicated to testing the code with it. Promote it to a meson option.
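
For readers unfamiliar with what the macro toggles, the following is a minimal
standalone sketch of the bounce-buffer idea, assuming the build system defines
CONFIG_DEBUG_REMAP when the option is on. lock_buf and unlock_buf are
hypothetical names, not the QEMU functions, and the fallback #define exists only
so the sketch exercises the debug path when compiled on its own.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Assumption: normally provided by the generated config header. */
#ifndef CONFIG_DEBUG_REMAP
#define CONFIG_DEBUG_REMAP 1
#endif

/* Return a host pointer for a guest buffer. */
static void *lock_buf(void *guest, size_t len, int copy)
{
    assert(guest != NULL);
#ifdef CONFIG_DEBUG_REMAP
    /* Debug build: hand out a separate zeroed buffer, optionally pre-filled. */
    void *host = calloc(1, len ? len : 1);

    if (host && copy) {
        memcpy(host, guest, len);
    }
    return host;
#else
    /* Normal build: access the guest buffer in place. */
    (void)len;
    (void)copy;
    return guest;
#endif
}

/* Release the host pointer, writing len bytes back in the debug build. */
static void unlock_buf(void *host, void *guest, size_t len)
{
#ifdef CONFIG_DEBUG_REMAP
    if (!host) {
        return;
    }
    memcpy(guest, host, len);
    free(host);
#else
    (void)host;
    (void)guest;
    (void)len;
#endif
}
```

With the bounce buffer in place, code that forgets to unlock, or that writes
through a stale pointer, no longer happens to work by accident, which is the
kind of bug this debugging mode is meant to expose.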

Signed-off-by: Ilya Leoshkevich 
---
 bsd-user/qemu.h   | 6 ++
 linux-user/qemu.h | 4 +---
 linux-user/uaccess.c  | 4 ++--
 meson.build   | 4 
 meson_options.txt | 2 ++
 scripts/meson-buildoptions.sh | 3 +++
 6 files changed, 14 insertions(+), 9 deletions(-)

diff --git a/bsd-user/qemu.h b/bsd-user/qemu.h
index 1b0a591d2d2..8629f0dcde9 100644
--- a/bsd-user/qemu.h
+++ b/bsd-user/qemu.h
@@ -22,8 +22,6 @@
 #include "exec/cpu_ldst.h"
 #include "exec/exec-all.h"
 
-#undef DEBUG_REMAP
-
 #include "exec/user/abitypes.h"
 
 extern char **environ;
@@ -437,7 +435,7 @@ static inline void *lock_user(int type, abi_ulong 
guest_addr, long len,
 if (!access_ok(type, guest_addr, len)) {
 return NULL;
 }
-#ifdef DEBUG_REMAP
+#ifdef CONFIG_DEBUG_REMAP
 {
 void *addr;
 addr = g_malloc(len);
@@ -461,7 +459,7 @@ static inline void unlock_user(void *host_ptr, abi_ulong 
guest_addr,
long len)
 {
 
-#ifdef DEBUG_REMAP
+#ifdef CONFIG_DEBUG_REMAP
 if (!host_ptr) {
 return;
 }
diff --git a/linux-user/qemu.h b/linux-user/qemu.h
index 32cd43d9eff..4777856b529 100644
--- a/linux-user/qemu.h
+++ b/linux-user/qemu.h
@@ -4,8 +4,6 @@
 #include "cpu.h"
 #include "exec/cpu_ldst.h"
 
-#undef DEBUG_REMAP
-
 #include "exec/user/abitypes.h"
 
 #include "syscall_defs.h"
@@ -332,7 +330,7 @@ void *lock_user(int type, abi_ulong guest_addr, ssize_t 
len, bool copy);
 /* Unlock an area of guest memory.  The first LEN bytes must be
flushed back to guest memory. host_ptr = NULL is explicitly
allowed and does nothing. */
-#ifndef DEBUG_REMAP
+#ifndef CONFIG_DEBUG_REMAP
 static inline void unlock_user(void *host_ptr, abi_ulong guest_addr,
ssize_t len)
 {
diff --git a/linux-user/uaccess.c b/linux-user/uaccess.c
index 425cbf677f7..27e841e6510 100644
--- a/linux-user/uaccess.c
+++ b/linux-user/uaccess.c
@@ -14,7 +14,7 @@ void *lock_user(int type, abi_ulong guest_addr, ssize_t len, 
bool copy)
 return NULL;
 }
 host_addr = g2h_untagged(guest_addr);
-#ifdef DEBUG_REMAP
+#ifdef CONFIG_DEBUG_REMAP
 if (copy) {
 host_addr = g_memdup(host_addr, len);
 } else {
@@ -24,7 +24,7 @@ void *lock_user(int type, abi_ulong guest_addr, ssize_t len, 
bool copy)
 return host_addr;
 }
 
-#ifdef DEBUG_REMAP
+#ifdef CONFIG_DEBUG_REMAP
 void unlock_user(void *host_ptr, abi_ulong guest_addr, ssize_t len)
 {
 void *host_ptr_conv;
diff --git a/meson.build b/meson.build
index f9dbe7634e5..1427e9f8811 100644
--- a/meson.build
+++ b/meson.build
@@ -2342,6 +2342,7 @@ config_host_data.set('CONFIG_DEBUG_GRAPH_LOCK', 
get_option('debug_graph_lock'))
 config_host_data.set('CONFIG_DEBUG_MUTEX', get_option('debug_mutex'))
 config_host_data.set('CONFIG_DEBUG_STACK_USAGE', 
get_option('debug_stack_usage'))
 config_host_data.set('CONFIG_DEBUG_TCG', get_option('debug_tcg'))
+config_host_data.set('CONFIG_DEBUG_REMAP', get_option('debug_remap'))
 config_host_data.set('CONFIG_LIVE_BLOCK_MIGRATION', 
get_option('live_block_migration').allowed())
 config_host_data.set('CONFIG_QOM_CAST_DEBUG', get_option('qom_cast_debug'))
 config_host_data.set('CONFIG_REPLICATION', get_option('replication').allowed())
@@ -4285,6 +4286,9 @@ if config_all_accel.has_key('CONFIG_TCG')
   endif
   summary_info += {'TCG plugins':   get_option('plugins')}
   summary_info += {'TCG debug enabled': get_option('debug_tcg')}
+  if have_linux_user or have_bsd_user
+summary_info += {'syscall buffer debugging support': 
get_option('debug_remap')}
+  endif
 endif
 summary_info += {'target list':   ' '.join(target_dirs)}
 if have_system
diff --git a/meson_options.txt b/meson_options.txt
index 0a99a059ec8..7cd48a88e6e 100644
--- a/meson_options.txt
+++ b/meson_options.txt
@@ -85,6 +85,8 @@ option('plugins', type: 'boolean', value: false,
description: 'TCG plugins via shared library loading')
 option('debug_tcg', type: 'boolean', value: false,
description: 'TCG debugging')
+option('debug_remap', type: 'boolean', value: false,
+   description: 'syscall buffer debugging support')
 option('tcg_interpreter', type: 'boolean', value: false,
description: 'TCG with bytecode interpreter (slow)')
 option('safe_stack', type: 'boolean', value: false,
diff --git a/scripts/meson-buildoptions.sh b/scripts/meson-buildoptions.sh
index 680fa3f581d..aff58275daf 100644
--- a/scripts/meson-buildoptions.sh
+++ b/scripts/meson-buildoptions.sh
@@ -29,6 +29,7 @@ meson_options_help() {
   printf "%s\n" '  --enable-debug-graph-lock'
   printf "%s\n" '   graph lock debugging support'
   printf "%s\n" '  --enable-debug-mutex mutex debugging support'
+  printf

Re: [RISC-V][tech-server-soc] [RISC-V][tech-server-platform] [RFC 1/2] hw/riscv: Add server platform reference machine

2024-03-11 Thread Atish Kumar Patra

On Mon, Mar 11, 2024 at 7:38 AM Andrew Jones  wrote:
>
> On Mon, Mar 11, 2024 at 04:55:24AM -0700, Wu, Fei2 wrote:
> > On 3/8/2024 5:20 PM, Andrew Jones wrote:
> > > On Thu, Mar 07, 2024 at 02:26:18PM +0800, Wu, Fei wrote:
> > >> On 3/7/2024 8:48 AM, Alistair Francis wrote:
> > >>> On Thu, Mar 7, 2024 at 5:13 AM Atish Kumar Patra  
> > >>> wrote:
> > 
> >  On Wed, Mar 6, 2024 at 4:56 AM Wu, Fei  wrote:
> > >
> > > On 3/6/2024 8:19 AM, Alistair Francis wrote:
> > >> On Mon, Mar 4, 2024 at 8:28 PM Fei Wu  wrote:
> > > ...
> > >>> +config SERVER_PLATFORM_REF
> > >>> +bool
> > >>> +select RISCV_NUMA
> > >>> +select GOLDFISH_RTC
> > >>> +select PCI
> > >>> +select PCI_EXPRESS_GENERIC_BRIDGE
> > >>> +select PFLASH_CFI01
> > >>> +select SERIAL
> > >>> +select RISCV_ACLINT
> > >>> +select RISCV_APLIC
> > >>> +select RISCV_IMSIC
> > >>> +select SIFIVE_TEST
> > >>
> > >> Do we really need SiFive Test in the server platform?
> > >>
> > > It's used to reset the system, is there any better choice?
> > >>>
> > >>> If we add this now we are stuck with it forever (or at least a long
> > >>> time). So it'd be nice to think about these and decide if these really
> > >>> are the best way to do things. We don't have to just copy the existing
> > >>> virt machine.
> > >>>
> > >> We need a solution to poweroff/reboot, and sifive test is one of the
> > >> hardware implementations, so in general I think it's okay. But I agree
> > >> Sifive test looks a device for testing only.
> > >>
> > >>> There must be a more standard way to do this then MMIO mapped SiFive 
> > >>> hardware?
> > >>>
> > >> The mapped MMIO mechanism leveraged by Sifive test by itself is kinda
> > >> generic, the sbsa_ec for sbsa-ref is also an MMIO mapped device. These
> > >> two devices look very similar except different encodings of the
> > >> shutdown/reboot command.
> > >>
> > >> Probably we can have a generic shutdown/reboot device in QEMU for both
> > >> sifive test and sbsa_ec, and likely more (not in this patch series). In
> > >> this way, sifive test device will be replaced by this more generic
> > >> device. Any suggestions?
> > >
> > > Operating systems shouldn't need to implement odd-ball device drivers to
> > > function on a reference of a standard platform. So the reference platform
> > > should only be comprised of devices which have specifications and already,
> > > or will, have DT bindings. Generic devices would be best, but I don't
> > > think it should be a problem to use devices from multiple vendors. The
> > > devices just need to allow GPL drivers to be written. With all that in
> > > mind, what about adding a generic GPIO controller or using SiFive's GPIO
> > > controller. Then, we could add gpio-restart and gpio-poweroff.
> > >
> > I agree with most of what you said. Regarding generic devices, syscon
> > looks a better choice than gpio in the current situation.
> >
> > Linux kernel has these configurations enabled for virt, and I'm not
> > going to add a new soc for this new board currently, we can use the same
> > syscon interface for power, and it has already well supported.
> >
> > config SOC_VIRT
> >   bool "QEMU Virt Machine"
> >   select CLINT_TIMER if RISCV_M_MODE
> >   select POWER_RESET
> >   select POWER_RESET_SYSCON
> >   select POWER_RESET_SYSCON_POWEROFF
> >   select GOLDFISH
> >
> > For the qemu part, we can remove device 'sifive_test' and manage that
> > memory region directly with MemoryRegionOps, similar to what
> > hw/mips/boston.c does.
>
> OK, that sounds good. Also, I guess the real concern is whether firmware
> (e.g. OpenSBI) supports the platform's power-off device, since firmware
> will present the SRST SBI call to Linux, so Linux doesn't need to worry
> about it at all. However, if we want Linux to worry about it, then we

Syscon devices are already supported in OpenSBI. So syscon seems to be
the best option right now.

> can't forget to ensure we can implement the syscon interface in AML for
> ACPI too. Indeed, we should be introducing ACPI support for this reference
> machine type at the same time we introduce the machine in order to ensure
> all device selections have, or will have, both DT and ACPI support.
>

Yeah. In addition to that, this reference platform also needs to
generate a minimalistic DT for OpenSBI even though only ACPI is
required for Linux. IIRC, sbsa-ref also does something similar.

> Thanks,
> drew

[PATCH] gdbstub: Fix double close() of the follow-fork-mode socket

2024-03-11 Thread Ilya Leoshkevich

When the terminal GDB_FORK_ENABLED state is reached, the coordination
socket is not needed anymore and is therefore closed. However, if there
is a communication error between QEMU gdbstub and GDB, the generic
error handling code attempts to close it again.

Fix by closing it later - before returning - instead.
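
The general rule behind the fix, that a descriptor should be closed on exactly
one path, can be sketched as follows. This is not the gdbstub code:
handle_socket, its two flags and the close_counted wrapper are hypothetical and
exist only to make the single close observable.

```c
#include <assert.h>
#include <stdbool.h>
#include <unistd.h>

static int close_calls;          /* counts closes so the rule can be checked */

static void close_counted(int fd)
{
    close_calls++;
    close(fd);
}

/*
 * Hypothetical handler: 'done_early' models reaching a terminal state,
 * 'io_error' models a failed exchange with the peer. Every path leaves
 * through 'out', so the descriptor is closed once and only once.
 */
static int handle_socket(int fd, bool done_early, bool io_error)
{
    int ret = 0;

    assert(fd >= 0);
    if (done_early) {
        goto out;
    }
    if (io_error) {
        ret = -1;
        goto out;
    }
    /* normal processing would go here */
out:
    close_counted(fd);
    return ret;
}
```

Closing in an intermediate state and again in shared error handling is what
produces a double close; deferring the close to the point of return avoids it.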

Fixes: Coverity CID 1539966
Fixes: d547e711a8a5 ("gdbstub: Implement follow-fork-mode child")
Signed-off-by: Ilya Leoshkevich 
---
 gdbstub/user.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/gdbstub/user.c b/gdbstub/user.c
index 7f9f19a1249..08aed022e26 100644
--- a/gdbstub/user.c
+++ b/gdbstub/user.c
@@ -502,6 +502,7 @@ void gdbserver_fork_end(CPUState *cpu, pid_t pid)
 switch (gdbserver_user_state.fork_state) {
 case GDB_FORK_ENABLED:
 if (gdbserver_user_state.running_state) {
+close(fd);
 return;
 }
 QEMU_FALLTHROUGH;
@@ -527,7 +528,6 @@ void gdbserver_fork_end(CPUState *cpu, pid_t pid)
 gdbserver_user_state.fork_state = GDB_FORK_ACTIVE;
 break;
 case GDB_FORK_ENABLE:
-close(fd);
 gdbserver_user_state.fork_state = GDB_FORK_ENABLED;
 break;
 case GDB_FORK_DISABLE:
@@ -542,7 +542,6 @@ void gdbserver_fork_end(CPUState *cpu, pid_t pid)
 if (write(fd, , 1) != 1) {
 goto fail;
 }
-close(fd);
 gdbserver_user_state.fork_state = GDB_FORK_ENABLED;
 break;
 case GDB_FORK_DISABLING:
-- 
2.44.0

[PATCH 0/2] migration: mapped-ram fixes

Hi,

Here are the fixes for the dup() issues found by Coverity.

@Peter Xu, I fixed the leak you found, but I'm holding on to that patch
because I noticed we're not freeing the ioc in the error paths. I'll
need to add some infrastructure to be able to cancel the glib polling
(qio_channel_add_watch_full) when the channel creation fails before
the source has connected.

CI run: https://gitlab.com/farosas/qemu/-/pipelines/1209529440

Fabiano Rosas (2):
  io: Introduce qio_channel_file_new_dupfd
  migration: Fix error handling after dup in file migration

 include/io/channel-file.h | 18 ++
 io/channel-file.c | 12 
 migration/fd.c|  9 -
 migration/file.c  | 14 +++---
 4 files changed, 41 insertions(+), 12 deletions(-)

-- 
2.35.3

[PATCH 1/2] io: Introduce qio_channel_file_new_dupfd

Add a new helper function for creating a QIOChannelFile channel with a
duplicated file descriptor. This saves the calling code from having to
do error checking on the dup() call.
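
The shape of such a helper can be shown without the QIOChannel machinery. The
sketch below is a simplified model under stated assumptions: struct
file_channel, channel_new_fd and channel_new_dupfd are hypothetical, and a plain
int out-parameter replaces the Error ** used in QEMU.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical stand-in for a file-backed channel that owns its fd. */
struct file_channel {
    int fd;
};

/* Wrap an fd the channel will own; the caller must pass a valid fd. */
static struct file_channel *channel_new_fd(int fd)
{
    struct file_channel *c;

    assert(fd >= 0);
    c = calloc(1, sizeof(*c));
    if (c) {
        c->fd = fd;
    }
    return c;
}

/* Duplicate fd first and only build the channel if dup() succeeded. */
static struct file_channel *channel_new_dupfd(int fd, int *err)
{
    struct file_channel *c;
    int newfd = dup(fd);

    if (newfd < 0) {
        *err = errno;            /* report why dup() failed */
        return NULL;
    }

    c = channel_new_fd(newfd);
    if (!c) {
        *err = ENOMEM;
        close(newfd);            /* do not leak the duplicate */
    }
    return c;
}
```

Because the failure is caught before any channel object exists, callers only
need a NULL check and never see a channel holding -1 as its descriptor.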

Suggested-by: Daniel P. Berrangé 
Signed-off-by: Fabiano Rosas 
---
 include/io/channel-file.h | 18 ++
 io/channel-file.c | 12 
 2 files changed, 30 insertions(+)

diff --git a/include/io/channel-file.h b/include/io/channel-file.h
index 50e8eb1138..d373a4e44d 100644
--- a/include/io/channel-file.h
+++ b/include/io/channel-file.h
@@ -68,6 +68,24 @@ struct QIOChannelFile {
 QIOChannelFile *
 qio_channel_file_new_fd(int fd);
 
+/**
+ * qio_channel_file_new_dupfd:
+ * @fd: the file descriptor
+ * @errp: pointer to initialized error object
+ *
+ * Create a new IO channel object for a file represented by the @fd
+ * parameter. Like qio_channel_file_new_fd(), but the @fd is first
+ * duplicated with dup().
+ *
+ * The channel will own the duplicated file descriptor and will take
+ * responsibility for closing it, the original FD is owned by the
+ * caller.
+ *
+ * Returns: the new channel object
+ */
+QIOChannelFile *
+qio_channel_file_new_dupfd(int fd, Error **errp);
+
 /**
  * qio_channel_file_new_path:
  * @path: the file path
diff --git a/io/channel-file.c b/io/channel-file.c
index d4706fa592..cbdd03bf21 100644
--- a/io/channel-file.c
+++ b/io/channel-file.c
@@ -45,6 +45,18 @@ qio_channel_file_new_fd(int fd)
 return ioc;
 }
 
+QIOChannelFile *
+qio_channel_file_new_dupfd(int fd, Error **errp)
+{
+int newfd = dup(fd);
+
+if (newfd < 0) {
+error_setg_errno(errp, errno, "Could not dup FD %d", fd);
+return NULL;
+}
+
+return qio_channel_file_new_fd(newfd);
+}
 
 QIOChannelFile *
 qio_channel_file_new_path(const char *path,
-- 
2.35.3

[PATCH 2/2] migration: Fix error handling after dup in file migration

The file migration code was allowing a possible -1 from a failed call
to dup() to propagate into the new QIOChannelFile::fd before checking
for validity. Coverity doesn't like that, possibly due to the
lseek(-1, ...) call that would ensue before returning from the channel
creation routine.

Use the newly introduced qio_channel_file_new_dupfd() to properly check
the return of dup() before proceeding.

Fixes: CID 1539961
Fixes: CID 1539965
Fixes: CID 1539960
Fixes: 2dd7ee7a51 ("migration/multifd: Add incoming QIOChannelFile support")
Fixes: decdc76772 ("migration/multifd: Add mapped-ram support to fd: URI")
Reported-by: Peter Maydell 
Signed-off-by: Fabiano Rosas 
---
 migration/fd.c   |  9 -
 migration/file.c | 14 +++---
 2 files changed, 11 insertions(+), 12 deletions(-)

diff --git a/migration/fd.c b/migration/fd.c
index d4ae72d132..4e2a63a73d 100644
--- a/migration/fd.c
+++ b/migration/fd.c
@@ -80,6 +80,7 @@ static gboolean fd_accept_incoming_migration(QIOChannel *ioc,
 void fd_start_incoming_migration(const char *fdname, Error **errp)
 {
 QIOChannel *ioc;
+QIOChannelFile *fioc;
 int fd = monitor_fd_param(monitor_cur(), fdname, errp);
 if (fd == -1) {
 return;
@@ -103,15 +104,13 @@ void fd_start_incoming_migration(const char *fdname, 
Error **errp)
 int channels = migrate_multifd_channels();
 
 while (channels--) {
-ioc = QIO_CHANNEL(qio_channel_file_new_fd(dup(fd)));
-
-if (QIO_CHANNEL_FILE(ioc)->fd == -1) {
-error_setg(errp, "Failed to duplicate fd %d", fd);
+fioc = qio_channel_file_new_dupfd(fd, errp);
+if (!fioc) {
 return;
 }
 
 qio_channel_set_name(ioc, "migration-fd-incoming");
-qio_channel_add_watch_full(ioc, G_IO_IN,
+qio_channel_add_watch_full(QIO_CHANNEL(fioc), G_IO_IN,
fd_accept_incoming_migration,
NULL, NULL,
g_main_context_get_thread_default());
diff --git a/migration/file.c b/migration/file.c
index 164b079966..d458f48269 100644
--- a/migration/file.c
+++ b/migration/file.c
@@ -58,12 +58,13 @@ bool file_send_channel_create(gpointer opaque, Error **errp)
 int fd = fd_args_get_fd();
 
 if (fd && fd != -1) {
-ioc = qio_channel_file_new_fd(dup(fd));
+ioc = qio_channel_file_new_dupfd(fd, errp);
 } else {
 ioc = qio_channel_file_new_path(outgoing_args.fname, flags, 0, errp);
-if (!ioc) {
-goto out;
-}
+}
+
+if (!ioc) {
+goto out;
 }
 
 multifd_channel_connect(opaque, QIO_CHANNEL(ioc));
@@ -147,10 +148,9 @@ void file_start_incoming_migration(FileMigrationArgs *file_args, Error **errp)
NULL, NULL,
g_main_context_get_thread_default());
 
-fioc = qio_channel_file_new_fd(dup(fioc->fd));
+fioc = qio_channel_file_new_dupfd(fioc->fd, errp);
 
-if (!fioc || fioc->fd == -1) {
-error_setg(errp, "Error creating migration incoming channel");
+if (!fioc) {
 break;
 }
 } while (++i < channels);
-- 
2.35.3

[PULL 20/34] migration: export migration_is_running

From: Steve Sistare 

Delete the MigrationState parameter from migration_is_running and move
it to the public API in misc.h.

Signed-off-by: Steve Sistare 
Link: https://lore.kernel.org/r/1710179338-294359-5-git-send-email-steven.sist...@oracle.com
Signed-off-by: Peter Xu 
---
 include/migration/misc.h   |  1 +
 migration/migration.h  |  2 --
 migration/migration.c  | 10 ++
 migration/options.c|  4 ++--
 migration/savevm.c |  2 +-
 system/dirtylimit.c|  2 +-
 target/riscv/kvm/kvm-cpu.c |  4 ++--
 7 files changed, 13 insertions(+), 12 deletions(-)

diff --git a/include/migration/misc.h b/include/migration/misc.h
index e1f1bf853e..7526977de6 100644
--- a/include/migration/misc.h
+++ b/include/migration/misc.h
@@ -106,6 +106,7 @@ int migration_call_notifiers(MigrationState *s, MigrationEventType type,
 bool migration_in_setup(MigrationState *);
 bool migration_has_finished(MigrationState *);
 bool migration_has_failed(MigrationState *);
+bool migration_is_running(void);
 /* ...and after the device transmission */
 /* True if incoming migration entered POSTCOPY_INCOMING_DISCARD */
 bool migration_in_incoming_postcopy(void);
diff --git a/migration/migration.h b/migration/migration.h
index 736460aa8b..e4983db9c9 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -479,8 +479,6 @@ bool migrate_has_error(MigrationState *s);
 
 void migrate_fd_connect(MigrationState *s, Error *error_in);
 
-bool migration_is_running(int state);
-
 int migrate_init(MigrationState *s, Error **errp);
 bool migration_is_blocked(Error **errp);
 /* True if outgoing migration has entered postcopy phase */
diff --git a/migration/migration.c b/migration/migration.c
index 17859cbaee..546ba86c63 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1103,9 +1103,11 @@ bool migration_is_setup_or_active(void)
 }
 }
 
-bool migration_is_running(int state)
+bool migration_is_running(void)
 {
-switch (state) {
+MigrationState *s = current_migration;
+
+switch (s->state) {
 case MIGRATION_STATUS_ACTIVE:
 case MIGRATION_STATUS_POSTCOPY_ACTIVE:
 case MIGRATION_STATUS_POSTCOPY_PAUSED:
@@ -1477,7 +1479,7 @@ static void migrate_fd_cancel(MigrationState *s)
 
 do {
 old_state = s->state;
-if (!migration_is_running(old_state)) {
+if (!migration_is_running()) {
 break;
 }
 /* If the migration is paused, kick it out of the pause */
@@ -1962,7 +1964,7 @@ static bool migrate_prepare(MigrationState *s, bool blk, bool blk_inc,
 return true;
 }
 
-if (migration_is_running(s->state)) {
+if (migration_is_running()) {
 error_setg(errp, QERR_MIGRATION_ACTIVE);
 return false;
 }
diff --git a/migration/options.c b/migration/options.c
index 40eb930940..642cfb00a3 100644
--- a/migration/options.c
+++ b/migration/options.c
@@ -681,7 +681,7 @@ bool migrate_cap_set(int cap, bool value, Error **errp)
 MigrationState *s = migrate_get_current();
 bool new_caps[MIGRATION_CAPABILITY__MAX];
 
-if (migration_is_running(s->state)) {
+if (migration_is_running()) {
 error_setg(errp, QERR_MIGRATION_ACTIVE);
 return false;
 }
@@ -725,7 +725,7 @@ void qmp_migrate_set_capabilities(MigrationCapabilityStatusList *params,
 MigrationCapabilityStatusList *cap;
 bool new_caps[MIGRATION_CAPABILITY__MAX];
 
-if (migration_is_running(s->state) || migration_in_colo_state()) {
+if (migration_is_running() || migration_in_colo_state()) {
 error_setg(errp, QERR_MIGRATION_ACTIVE);
 return;
 }
diff --git a/migration/savevm.c b/migration/savevm.c
index 76b57a9888..388d7af7cd 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1706,7 +1706,7 @@ static int qemu_savevm_state(QEMUFile *f, Error **errp)
 MigrationState *ms = migrate_get_current();
 MigrationStatus status;
 
-if (migration_is_running(ms->state)) {
+if (migration_is_running()) {
 error_setg(errp, QERR_MIGRATION_ACTIVE);
 return -EINVAL;
 }
diff --git a/system/dirtylimit.c b/system/dirtylimit.c
index 051e0311c1..1622bb7426 100644
--- a/system/dirtylimit.c
+++ b/system/dirtylimit.c
@@ -451,7 +451,7 @@ static bool dirtylimit_is_allowed(void)
 {
 MigrationState *ms = migrate_get_current();
 
-if (migration_is_running(ms->state) &&
+if (migration_is_running() &&
 (!qemu_thread_is_self(&ms->thread)) &&
 migrate_dirty_limit() &&
 dirtylimit_in_service()) {
diff --git a/target/riscv/kvm/kvm-cpu.c b/target/riscv/kvm/kvm-cpu.c
index c7afdb1e81..cda7d78a77 100644
--- a/target/riscv/kvm/kvm-cpu.c
+++ b/target/riscv/kvm/kvm-cpu.c
@@ -44,7 +44,7 @@
 #include "kvm_riscv.h"
 #include "sbi_ecall_interface.h"
 #include "chardev/char-fe.h"
-#include "migration/migration.h"
+#include "migration/misc.h"
 #include "sysemu/runstate.h"
 #include "hw/riscv/numa.h"
 
@@ -729,7 +729,7 @@ static void kvm_riscv_put_regs_timer(CPUState

[PULL 31/34] migration/multifd: Implement zero page transmission on the multifd thread.

From: Hao Xiang 

1. Add zero_pages field in MultiFDPacket_t.
2. Implement zero page detection and handling on the multifd
threads for the non-compression, zlib and zstd compression backends.
3. Add a new value 'multifd' to the ZeroPageDetection enumeration.
4. Add zero page counters and update the multifd send/receive tracing
format to track the newly added counters.
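
The packet layout introduced here keeps normal pages first in the offset array and zero pages after them. One way to produce that layout is an in-place partition, sketched below. This is a standalone illustration and not the code from multifd-zero-page.c (which is cut off in this message); demo_page_is_zero(), demo_partition_pages() and DEMO_PAGE_SIZE are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_PAGE_SIZE 4096u

/* Naive byte-wise zero check, good enough for a demo. */
bool demo_page_is_zero(const uint8_t *page)
{
    for (size_t i = 0; i < DEMO_PAGE_SIZE; i++) {
        if (page[i] != 0) {
            return false;
        }
    }
    return true;
}

/*
 * Reorder offset[] so that offsets of non-zero pages come first and
 * offsets of zero pages follow. Returns the number of normal pages;
 * the remaining num - result entries refer to zero pages.
 */
size_t demo_partition_pages(const uint8_t *host, uint64_t *offset,
                            size_t num)
{
    size_t i = 0;   /* everything before i is a normal page */
    size_t j = num; /* everything from j onwards is a zero page */

    assert(host != NULL && (offset != NULL || num == 0));

    while (i < j) {
        if (!demo_page_is_zero(host + offset[i])) {
            i++;
            continue;
        }
        /* Move the zero page to the tail, then re-check slot i. */
        j--;
        uint64_t tmp = offset[i];
        offset[i] = offset[j];
        offset[j] = tmp;
    }

    return i;
}
```

The partition does not preserve the relative order of the zero pages, which is acceptable as long as the receiver only needs the two counts and the offsets in each group.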

Signed-off-by: Hao Xiang 
Acked-by: Markus Armbruster 
Reviewed-by: Fabiano Rosas 
Link: https://lore.kernel.org/r/20240311180015.3359271-5-hao.xi...@linux.dev
Signed-off-by: Peter Xu 
---
 qapi/migration.json  |  7 ++-
 migration/multifd.h  | 23 +++-
 hw/core/qdev-properties-system.c |  2 +-
 migration/multifd-zero-page.c| 87 ++
 migration/multifd-zlib.c | 21 ++--
 migration/multifd-zstd.c | 20 +--
 migration/multifd.c  | 90 +++-
 migration/ram.c  |  1 -
 migration/meson.build|  1 +
 migration/trace-events   |  8 +--
 10 files changed, 228 insertions(+), 32 deletions(-)
 create mode 100644 migration/multifd-zero-page.c

diff --git a/qapi/migration.json b/qapi/migration.json
index 83fdef73b9..2684e4e9ac 100644
--- a/qapi/migration.json
+++ b/qapi/migration.json
@@ -677,10 +677,15 @@
 #
 # @legacy: Perform zero page checking in main migration thread.
 #
+# @multifd: Perform zero page checking in multifd sender thread if
+# multifd migration is enabled, else in the main migration
+# thread as for @legacy.
+#
 # Since: 9.0
+#
 ##
 { 'enum': 'ZeroPageDetection',
-  'data': [ 'none', 'legacy' ] }
+  'data': [ 'none', 'legacy', 'multifd' ] }
 
 ##
 # @BitmapMigrationBitmapAliasTransform:
diff --git a/migration/multifd.h b/migration/multifd.h
index 7447c2bea3..c9d9b09239 100644
--- a/migration/multifd.h
+++ b/migration/multifd.h
@@ -55,14 +55,24 @@ typedef struct {
 /* size of the next packet that contains pages */
 uint32_t next_packet_size;
 uint64_t packet_num;
-uint64_t unused[4];/* Reserved for future use */
+/* zero pages */
+uint32_t zero_pages;
+uint32_t unused32[1];/* Reserved for future use */
+uint64_t unused64[3];/* Reserved for future use */
 char ramblock[256];
+/*
+ * This array contains the pointers to:
+ *  - normal pages (initial normal_pages entries)
+ *  - zero pages (following zero_pages entries)
+ */
 uint64_t offset[];
 } __attribute__((packed)) MultiFDPacket_t;
 
 typedef struct {
 /* number of used pages */
 uint32_t num;
+/* number of normal pages */
+uint32_t normal_num;
 /* number of allocated pages */
 uint32_t allocated;
 /* offset of each page */
@@ -136,6 +146,8 @@ typedef struct {
 uint64_t packets_sent;
 /* non zero pages sent through this channel */
 uint64_t total_normal_pages;
+/* zero pages sent through this channel */
+uint64_t total_zero_pages;
 /* buffers to send */
 struct iovec *iov;
 /* number of iovs used */
@@ -194,12 +206,18 @@ typedef struct {
 uint8_t *host;
 /* non zero pages recv through this channel */
 uint64_t total_normal_pages;
+/* zero pages recv through this channel */
+uint64_t total_zero_pages;
 /* buffers to recv */
 struct iovec *iov;
 /* Pages that are not zero */
 ram_addr_t *normal;
 /* num of non zero pages */
 uint32_t normal_num;
+/* Pages that are zero */
+ram_addr_t *zero;
+/* num of zero pages */
+uint32_t zero_num;
 /* used for de-compression methods */
 void *compress_data;
 } MultiFDRecvParams;
@@ -221,6 +239,9 @@ typedef struct {
 
 void multifd_register_ops(int method, MultiFDMethods *ops);
 void multifd_send_fill_packet(MultiFDSendParams *p);
+bool multifd_send_prepare_common(MultiFDSendParams *p);
+void multifd_send_zero_page_detect(MultiFDSendParams *p);
+void multifd_recv_zero_page_process(MultiFDRecvParams *p);
 
 static inline void multifd_send_prepare_header(MultiFDSendParams *p)
 {
diff --git a/hw/core/qdev-properties-system.c b/hw/core/qdev-properties-system.c
index 71a21bf24e..7eca2f2377 100644
--- a/hw/core/qdev-properties-system.c
+++ b/hw/core/qdev-properties-system.c
@@ -696,7 +696,7 @@ const PropertyInfo qdev_prop_granule_mode = {
 const PropertyInfo qdev_prop_zero_page_detection = {
 .name = "ZeroPageDetection",
 .description = "zero_page_detection values, "
-   "none,legacy",
+   "none,legacy,multifd",
 .enum_table = &ZeroPageDetection_lookup,
 .get = qdev_propinfo_get_enum,
 .set = qdev_propinfo_set_enum,
diff --git a/migration/multifd-zero-page.c b/migration/multifd-zero-page.c
new file mode 100644
index 0000000000..1ba38be636
--- /dev/null
+++ b/migration/multifd-zero-page.c
@@ -0,0 +1,87 @@
+/*
+ * Multifd zero page detection implementation.
+ *
+ * Copyright (c) 2024 Bytedance Inc
+ *
+ * Authors:
+ *  Hao Xiang 
+ *
+ * This work is licensed under the terms

[PULL 30/34] migration/multifd: Add new migration option zero-page-detection.

From: Hao Xiang 

This new parameter controls where zero page checking runs.
1. If this parameter is set to 'legacy', zero page checking is
done in the migration main thread.
2. If this parameter is set to 'none', zero page checking is disabled.

Signed-off-by: Hao Xiang 
Reviewed-by: Peter Xu 
Acked-by: Markus Armbruster 
Link: https://lore.kernel.org/r/20240311180015.3359271-4-hao.xi...@linux.dev
Signed-off-by: Peter Xu 
---
 qapi/migration.json | 33 ++---
 include/hw/qdev-properties-system.h |  4 
 migration/options.h |  1 +
 hw/core/qdev-properties-system.c| 10 +
 migration/migration-hmp-cmds.c  |  9 
 migration/options.c | 21 ++
 migration/ram.c |  4 
 7 files changed, 79 insertions(+), 3 deletions(-)

diff --git a/qapi/migration.json b/qapi/migration.json
index 51d188b902..83fdef73b9 100644
--- a/qapi/migration.json
+++ b/qapi/migration.json
@@ -670,6 +670,18 @@
 { 'enum': 'MigMode',
   'data': [ 'normal', 'cpr-reboot' ] }
 
+##
+# @ZeroPageDetection:
+#
+# @none: Do not perform zero page checking.
+#
+# @legacy: Perform zero page checking in main migration thread.
+#
+# Since: 9.0
+##
+{ 'enum': 'ZeroPageDetection',
+  'data': [ 'none', 'legacy' ] }
+
 ##
 # @BitmapMigrationBitmapAliasTransform:
 #
@@ -891,6 +903,10 @@
 # @mode: Migration mode. See description in @MigMode. Default is 'normal'.
 #(Since 8.2)
 #
+# @zero-page-detection: Whether and how to detect zero pages.
+# See description in @ZeroPageDetection.  Default is 'legacy'.
+# (since 9.0)
+#
 # Features:
 #
 # @deprecated: Member @block-incremental is deprecated.  Use
@@ -924,7 +940,8 @@
'block-bitmap-mapping',
{ 'name': 'x-vcpu-dirty-limit-period', 'features': ['unstable'] },
'vcpu-dirty-limit',
-   'mode'] }
+   'mode',
+   'zero-page-detection'] }
 
 ##
 # @MigrateSetParameters:
@@ -1083,6 +1100,10 @@
 # @mode: Migration mode. See description in @MigMode. Default is 'normal'.
 #(Since 8.2)
 #
+# @zero-page-detection: Whether and how to detect zero pages.
+# See description in @ZeroPageDetection.  Default is 'legacy'.
+# (since 9.0)
+#
 # Features:
 #
 # @deprecated: Member @block-incremental is deprecated.  Use
@@ -1136,7 +1157,8 @@
 '*x-vcpu-dirty-limit-period': { 'type': 'uint64',
 'features': [ 'unstable' ] },
 '*vcpu-dirty-limit': 'uint64',
-'*mode': 'MigMode'} }
+'*mode': 'MigMode',
+'*zero-page-detection': 'ZeroPageDetection'} }
 
 ##
 # @migrate-set-parameters:
@@ -1311,6 +1333,10 @@
 # @mode: Migration mode. See description in @MigMode. Default is 'normal'.
 #(Since 8.2)
 #
+# @zero-page-detection: Whether and how to detect zero pages.
+# See description in @ZeroPageDetection.  Default is 'legacy'.
+# (since 9.0)
+#
 # Features:
 #
 # @deprecated: Member @block-incremental is deprecated.  Use
@@ -1361,7 +1387,8 @@
 '*x-vcpu-dirty-limit-period': { 'type': 'uint64',
 'features': [ 'unstable' ] },
 '*vcpu-dirty-limit': 'uint64',
-'*mode': 'MigMode'} }
+'*mode': 'MigMode',
+'*zero-page-detection': 'ZeroPageDetection'} }
 
 ##
 # @query-migrate-parameters:
diff --git a/include/hw/qdev-properties-system.h b/include/hw/qdev-properties-system.h
index 626be87dd3..438f65389f 100644
--- a/include/hw/qdev-properties-system.h
+++ b/include/hw/qdev-properties-system.h
@@ -9,6 +9,7 @@ extern const PropertyInfo qdev_prop_reserved_region;
 extern const PropertyInfo qdev_prop_multifd_compression;
 extern const PropertyInfo qdev_prop_mig_mode;
 extern const PropertyInfo qdev_prop_granule_mode;
+extern const PropertyInfo qdev_prop_zero_page_detection;
 extern const PropertyInfo qdev_prop_losttickpolicy;
 extern const PropertyInfo qdev_prop_blockdev_on_error;
 extern const PropertyInfo qdev_prop_bios_chs_trans;
@@ -50,6 +51,9 @@ extern const PropertyInfo qdev_prop_iothread_vq_mapping_list;
MigMode)
 #define DEFINE_PROP_GRANULE_MODE(_n, _s, _f, _d) \
 DEFINE_PROP_SIGNED(_n, _s, _f, _d, qdev_prop_granule_mode, GranuleMode)
+#define DEFINE_PROP_ZERO_PAGE_DETECTION(_n, _s, _f, _d) \
+DEFINE_PROP_SIGNED(_n, _s, _f, _d, qdev_prop_zero_page_detection, \
+   ZeroPageDetection)
 #define DEFINE_PROP_LOSTTICKPOLICY(_n, _s, _f, _d) \
 DEFINE_PROP_SIGNED(_n, _s, _f, _d, qdev_prop_losttickpolicy, \
 LostTickPolicy)
diff --git a/migration/options.h b/migration/options.h
index b6b69c2bb7..ab8199e207 100644
--- a/migration/options.h
+++ b/migration/options.h
@@ -90,6 +90,7 @@ const char *migrate_tls_authz(void);
 const char *migrate_tls_creds(void);
 const char *migrate_tls_hostname(void);
 uint64_t

[PULL 25/34] migration: privatize colo interfaces

From: Steve Sistare 

Remove private migration interfaces from net/colo-compare.c and push them
to migration/colo.c.
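
A side effect of dropping the state argument is that a timer callback, which must still accept an opaque pointer, needs a thin adapter around the argument-free notify function. The sketch below shows that adapter pattern in isolation; DemoTimerCB, demo_timer_fire() and the counter are hypothetical and stand in for the QEMU timer API.

```c
#include <assert.h>
#include <stddef.h>

/* Callback signature a timer API typically expects (assumed here). */
typedef void (*DemoTimerCB)(void *opaque);

static int demo_notify_count;

/* Argument-free notify: it looks up whatever state it needs itself. */
void demo_checkpoint_notify(void)
{
    demo_notify_count++;
}

/* Adapter with the timer signature; the opaque pointer is unused. */
void demo_checkpoint_notify_timer(void *opaque)
{
    (void)opaque;
    demo_checkpoint_notify();
}

/* Stand-in for a timer firing its registered callback. */
void demo_timer_fire(DemoTimerCB cb, void *opaque)
{
    assert(cb != NULL);
    cb(opaque);
}

int demo_get_notify_count(void)
{
    return demo_notify_count;
}
```

Direct callers use the argument-free function, while the timer registers the adapter with a NULL opaque pointer.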

Signed-off-by: Steve Sistare 
Link: https://lore.kernel.org/r/1710179338-294359-10-git-send-email-steven.sist...@oracle.com
Signed-off-by: Peter Xu 
---
 migration/colo.c   | 17 +++--
 net/colo-compare.c |  3 +--
 stubs/colo.c   |  1 -
 3 files changed, 12 insertions(+), 9 deletions(-)

diff --git a/migration/colo.c b/migration/colo.c
index 315e31fe32..84632a603e 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -63,9 +63,9 @@ static bool colo_runstate_is_stopped(void)
 return runstate_check(RUN_STATE_COLO) || !runstate_is_running();
 }
 
-static void colo_checkpoint_notify(void *opaque)
+static void colo_checkpoint_notify(void)
 {
-MigrationState *s = opaque;
+MigrationState *s = migrate_get_current();
 int64_t next_notify_time;
 
 qemu_event_set(&s->colo_checkpoint_event);
@@ -74,10 +74,15 @@ static void colo_checkpoint_notify(void *opaque)
 timer_mod(s->colo_delay_timer, next_notify_time);
 }
 
+static void colo_checkpoint_notify_timer(void *opaque)
+{
+colo_checkpoint_notify();
+}
+
 void colo_checkpoint_delay_set(void)
 {
 if (migration_in_colo_state()) {
-colo_checkpoint_notify(migrate_get_current());
+colo_checkpoint_notify();
 }
 }
 
@@ -162,7 +167,7 @@ static void primary_vm_do_failover(void)
  * kick COLO thread which might wait at
  * qemu_sem_wait(&s->colo_checkpoint_sem).
  */
-colo_checkpoint_notify(s);
+colo_checkpoint_notify();
 
 /*
  * Wake up COLO thread which may blocked in recv() or send(),
@@ -518,7 +523,7 @@ out:
 
 static void colo_compare_notify_checkpoint(Notifier *notifier, void *data)
 {
-colo_checkpoint_notify(data);
+colo_checkpoint_notify();
 }
 
 static void colo_process_checkpoint(MigrationState *s)
@@ -642,7 +647,7 @@ void migrate_start_colo_process(MigrationState *s)
 bql_unlock();
 qemu_event_init(&s->colo_checkpoint_event, false);
 s->colo_delay_timer =  timer_new_ms(QEMU_CLOCK_HOST,
-colo_checkpoint_notify, s);
+colo_checkpoint_notify_timer, NULL);
 
 qemu_sem_init(&s->colo_exit_sem, 0);
 colo_process_checkpoint(s);
diff --git a/net/colo-compare.c b/net/colo-compare.c
index f2dfc0ebdc..c4ad0ab71f 100644
--- a/net/colo-compare.c
+++ b/net/colo-compare.c
@@ -28,7 +28,6 @@
 #include "sysemu/iothread.h"
 #include "net/colo-compare.h"
 #include "migration/colo.h"
-#include "migration/migration.h"
 #include "util.h"
 
 #include "block/aio-wait.h"
@@ -189,7 +188,7 @@ static void colo_compare_inconsistency_notify(CompareState *s)
 notify_remote_frame(s);
 } else {
 notifier_list_notify(&colo_compare_notifiers,
- migrate_get_current());
+ NULL);
 }
 }
 
diff --git a/stubs/colo.c b/stubs/colo.c
index 08c9f982d5..f8c069b739 100644
--- a/stubs/colo.c
+++ b/stubs/colo.c
@@ -2,7 +2,6 @@
 #include "qemu/notify.h"
 #include "net/colo-compare.h"
 #include "migration/colo.h"
-#include "migration/migration.h"
 #include "qemu/error-report.h"
 #include "qapi/qapi-commands-migration.h"
 
-- 
2.44.0

[PULL 27/34] migration: purge MigrationState from public interface

From: Steve Sistare 

Move remaining MigrationState references from the public file
misc.h to the private file migration.h.

Signed-off-by: Steve Sistare 
Link: https://lore.kernel.org/r/1710179338-294359-12-git-send-email-steven.sist...@oracle.com
Signed-off-by: Peter Xu 
---
 include/migration/misc.h | 6 ++
 migration/migration.h| 6 ++
 2 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/include/migration/misc.h b/include/migration/misc.h
index d563d2c801..c9e200f4eb 100644
--- a/include/migration/misc.h
+++ b/include/migration/misc.h
@@ -64,7 +64,6 @@ bool migration_is_active(void);
 bool migration_is_device(void);
 bool migration_thread_is_self(void);
 bool migration_is_setup_or_active(void);
-bool migrate_mode_is_cpr(MigrationState *);
 
 typedef enum MigrationEventType {
 MIG_EVENT_PRECOPY_SETUP,
@@ -103,16 +102,15 @@ void migration_add_notifier_mode(NotifierWithReturn *notify,
  MigrationNotifyFunc func, MigMode mode);
 
 void migration_remove_notifier(NotifierWithReturn *notify);
-int migration_call_notifiers(MigrationState *s, MigrationEventType type,
- Error **errp);
-bool migration_has_failed(MigrationState *);
 bool migration_is_running(void);
 void migration_file_set_error(int err);
 
 /* True if incoming migration entered POSTCOPY_INCOMING_DISCARD */
 bool migration_in_incoming_postcopy(void);
+
 /* True if incoming migration entered POSTCOPY_INCOMING_ADVISE */
 bool migration_incoming_postcopy_advised(void);
+
 /* True if background snapshot is active */
 bool migration_in_bg_snapshot(void);
 
diff --git a/migration/migration.h b/migration/migration.h
index e4983db9c9..8045e39c26 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -26,6 +26,7 @@
 #include "qom/object.h"
 #include "postcopy-ram.h"
 #include "sysemu/runstate.h"
+#include "migration/misc.h"
 
 struct PostcopyBlocktimeContext;
 
@@ -479,12 +480,17 @@ bool migrate_has_error(MigrationState *s);
 
 void migrate_fd_connect(MigrationState *s, Error *error_in);
 
+int migration_call_notifiers(MigrationState *s, MigrationEventType type,
+ Error **errp);
+
 int migrate_init(MigrationState *s, Error **errp);
 bool migration_is_blocked(Error **errp);
 /* True if outgoing migration has entered postcopy phase */
 bool migration_in_postcopy(void);
 bool migration_postcopy_is_alive(int state);
 MigrationState *migrate_get_current(void);
+bool migration_has_failed(MigrationState *);
+bool migrate_mode_is_cpr(MigrationState *);
 
 uint64_t ram_get_total_transferred_pages(void);
 
-- 
2.44.0

[PULL 21/34] migration: export vcpu_dirty_limit_period

From: Steve Sistare 

Define and export vcpu_dirty_limit_period to eliminate a dependency
on MigrationState.

Signed-off-by: Steve Sistare 
Link: https://lore.kernel.org/r/1710179338-294359-6-git-send-email-steven.sist...@oracle.com
Signed-off-by: Peter Xu 
---
 include/migration/client-options.h | 1 +
 migration/options.c| 7 +++
 system/dirtylimit.c| 3 +--
 3 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/include/migration/client-options.h b/include/migration/client-options.h
index 887fea1565..59f4b55cf4 100644
--- a/include/migration/client-options.h
+++ b/include/migration/client-options.h
@@ -20,5 +20,6 @@ bool migrate_switchover_ack(void);
 /* parameters */
 
 MigMode migrate_mode(void);
+uint64_t migrate_vcpu_dirty_limit_period(void);
 
 #endif
diff --git a/migration/options.c b/migration/options.c
index 642cfb00a3..09178c6f60 100644
--- a/migration/options.c
+++ b/migration/options.c
@@ -924,6 +924,13 @@ const char *migrate_tls_hostname(void)
 return s->parameters.tls_hostname;
 }
 
+uint64_t migrate_vcpu_dirty_limit_period(void)
+{
+MigrationState *s = migrate_get_current();
+
+return s->parameters.x_vcpu_dirty_limit_period;
+}
+
 uint64_t migrate_xbzrle_cache_size(void)
 {
 MigrationState *s = migrate_get_current();
diff --git a/system/dirtylimit.c b/system/dirtylimit.c
index 1622bb7426..b0afaa0776 100644
--- a/system/dirtylimit.c
+++ b/system/dirtylimit.c
@@ -77,14 +77,13 @@ static bool dirtylimit_quit;
 
 static void vcpu_dirty_rate_stat_collect(void)
 {
-MigrationState *s = migrate_get_current();
 VcpuStat stat;
 int i = 0;
 int64_t period = DIRTYLIMIT_CALC_TIME_MS;
 
 if (migrate_dirty_limit() &&
 migration_is_active()) {
-period = s->parameters.x_vcpu_dirty_limit_period;
+period = migrate_vcpu_dirty_limit_period();
 }
 
 /* calculate vcpu dirtyrate */
-- 
2.44.0

[PULL 15/34] migration: Fix format in error message

From: Anthony PERARD 

In file_write_ramblock_iov(), "offset" is "uintptr_t" and not
"ram_addr_t". While usually they are both equivalent, this is not the
case with CONFIG_XEN_BACKEND.

Use the right format. This will fix build on 32-bit.
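
For reference, a minimal standalone sketch of formatting a uintptr_t portably with the <inttypes.h> macro. The helper name demo_format_range_error() and the exact message text are hypothetical; only the use of PRIxPTR mirrors the fix.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Format an out-of-range offset. PRIxPTR expands to the correct
 * conversion for uintptr_t on both 32-bit and 64-bit builds, so the
 * format string and the argument type always agree.
 */
int demo_format_range_error(char *buf, size_t len, uintptr_t offset,
                            const char *idstr)
{
    assert(buf != NULL && idstr != NULL);

    return snprintf(buf, len,
                    "offset %" PRIxPTR " outside of ramblock %s range",
                    offset, idstr);
}
```

A format macro tied to a different type only works by accident on hosts where the two types happen to have the same width.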

Fixes: f427d90b9898 ("migration/multifd: Support outgoing mapped-ram stream format")
Signed-off-by: Anthony PERARD 
Link: https://lore.kernel.org/r/20240311123439.16844-1-anthony.per...@citrix.com
Signed-off-by: Peter Xu 
---
 migration/file.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/migration/file.c b/migration/file.c
index 164b079966..5054a60851 100644
--- a/migration/file.c
+++ b/migration/file.c
@@ -191,7 +191,7 @@ int file_write_ramblock_iov(QIOChannel *ioc, const struct iovec *iov,
  */
 offset = (uintptr_t) iov[slice_idx].iov_base - (uintptr_t) block->host;
 if (offset >= block->used_length) {
-error_setg(errp, "offset " RAM_ADDR_FMT
+error_setg(errp, "offset %" PRIxPTR
"outside of ramblock %s range", offset, block->idstr);
 ret = -1;
 break;
-- 
2.44.0

Re: [PATCH v4 0/8] qtest: migration: Add tests for introducing 'channels' argument in migrate QAPIs

On 12/03/24 3:08 am, Peter Xu wrote:

On Tue, Mar 12, 2024 at 03:01:51AM +0530, Het Gala wrote:

On 12/03/24 2:55 am, Peter Xu wrote:

On Sat, Mar 09, 2024 at 01:11:45PM +0530, Het Gala wrote:

Reference to the GitLab pipeline (before patchset):
https://urldefense.proofpoint.com/v2/url?u=https-3A__gitlab.com_galahet_Qemu_-2D_pipelines_1207185095=DwIBaQ=s883GpUCOChKOHiocYtGcg=-qwZZzrw4EKSsq0BK7MBd3wW1WEpXmJeng3ZUT5uBCg=y2xUaOwvRVC5eTpFNEdxb37JYDdxN61W406HlCyx3CWIVyBRgLwjJhAYALZLinoi=vZRNX33_DuLO1TsfTpYR_s9bf_EMFm3oHHH_eg57zE0=

Reference to the GitLab pipeline (after patchset):
https://urldefense.proofpoint.com/v2/url?u=https-3A__gitlab.com_galahet_Qemu_-2D_pipelines_1207183673=DwIBaQ=s883GpUCOChKOHiocYtGcg=-qwZZzrw4EKSsq0BK7MBd3wW1WEpXmJeng3ZUT5uBCg=y2xUaOwvRVC5eTpFNEdxb37JYDdxN61W406HlCyx3CWIVyBRgLwjJhAYALZLinoi=C73ka3k3ouAuRJYNVLPIBQiWx3jDFDDvVYDiEYqfE04=

Het,

Please still copy me on any migration patches. In this case Fabiano is
looking at it, so it'll all be fine, but it will still help me in marking
the emails.

Thanks,

So sorry about that, Peter. I am aware that you and Fabiano are the go-to
migration maintainers. I thought I had emailed or cc'd all the stakeholders
that should be involved in this patchset series. Even in earlier versions of
this patchset you were cc'ed, but somehow I just forgot to cc you this time.
Sure, I will take care next time. Again, apologies for the mix-up :)

No problem at all. As long as you have at least one maintainer copied,
logically nothing will get lost. It's just that it helps me in my routines.

Are you managing cc list manually for each version? In that case I suggest
you have a look at Stefan's tool:

I used to earlier. But lately Markus introduced me to
scripts/get_maintainer.pl -f
It gives the list of all the maintainers handling that particular file.
So that helped me for this patchset.
https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_stefanha_git-2Dpublish=DwIBaQ=s883GpUCOChKOHiocYtGcg=-qwZZzrw4EKSsq0BK7MBd3wW1WEpXmJeng3ZUT5uBCg=ydJfb02Wuk_NnlYl8-RkRkYXzWNpzlEht7yj5kakeAlz_WPoD6yvC7b-fVCeLzom=8KSe9MiMzmHda3uZ_uaGCIEjub4tSzpeDTpZZwq5knc=

Thanks a lot Peter, looks cool. Will try to explore and use git-publish
and its different methods for next patchset.

It might help a great deal with patch management, at least it does for me,
and it definitely covers more than maintaining the cc list for a patchset.

Yes, it looks like there are a lot of useful methods that I can leverage
in future :)

Regards,
Het Gala

[PULL 29/34] migration/multifd: Allow clearing of the file_bmap from multifd

From: Fabiano Rosas 

We currently only need to clear the mapped-ram file bitmap from the
migration thread during save_zero_page.

We're about to add support for zero page detection on the multifd
thread, so allow ramblock_set_file_bmap_atomic() to also clear the
bits.
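
The set-or-clear behaviour can be illustrated with C11 atomics. This is a sketch under assumptions: QEMU has its own bitmap helpers, and demo_bmap_update_atomic(), demo_bmap_test() and DEMO_BITS_PER_LONG are hypothetical names used only for this example.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Atomically set (set == true) or clear (set == false) bit nr. */
void demo_bmap_update_atomic(atomic_ulong *bmap, size_t nr, bool set)
{
    unsigned long mask = 1UL << (nr % DEMO_BITS_PER_LONG);

    assert(bmap != NULL);

    if (set) {
        atomic_fetch_or(&bmap[nr / DEMO_BITS_PER_LONG], mask);
    } else {
        atomic_fetch_and(&bmap[nr / DEMO_BITS_PER_LONG], ~mask);
    }
}

/* Read back a single bit. */
bool demo_bmap_test(atomic_ulong *bmap, size_t nr)
{
    unsigned long mask = 1UL << (nr % DEMO_BITS_PER_LONG);

    assert(bmap != NULL);

    return (atomic_load(&bmap[nr / DEMO_BITS_PER_LONG]) & mask) != 0;
}
```

Both directions are read-modify-write operations on a shared word, so threads touching neighbouring bits cannot lose each other's updates.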

Signed-off-by: Fabiano Rosas 
Link: https://lore.kernel.org/r/20240311180015.3359271-3-hao.xi...@linux.dev
Signed-off-by: Peter Xu 
---
 migration/ram.h | 3 ++-
 migration/multifd.c | 2 +-
 migration/ram.c | 8 ++--
 3 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/migration/ram.h b/migration/ram.h
index b9ac0da587..08feecaf51 100644
--- a/migration/ram.h
+++ b/migration/ram.h
@@ -75,7 +75,8 @@ bool ram_dirty_bitmap_reload(MigrationState *s, RAMBlock *rb, Error **errp);
 bool ramblock_page_is_discarded(RAMBlock *rb, ram_addr_t start);
 void postcopy_preempt_shutdown_file(MigrationState *s);
 void *postcopy_preempt_thread(void *opaque);
-void ramblock_set_file_bmap_atomic(RAMBlock *block, ram_addr_t offset);
+void ramblock_set_file_bmap_atomic(RAMBlock *block, ram_addr_t offset,
+   bool set);
 
 /* ram cache */
 int colo_init_ram_cache(void);
diff --git a/migration/multifd.c b/migration/multifd.c
index bf9d483f7a..3ba922694e 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -115,7 +115,7 @@ static void multifd_set_file_bitmap(MultiFDSendParams *p)
 assert(pages->block);
 
 for (int i = 0; i < p->pages->num; i++) {
-ramblock_set_file_bmap_atomic(pages->block, pages->offset[i]);
+ramblock_set_file_bmap_atomic(pages->block, pages->offset[i], true);
 }
 }
 
diff --git a/migration/ram.c b/migration/ram.c
index 3ee8cb47d3..dec2e73f8e 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -3149,9 +3149,13 @@ static void ram_save_file_bmap(QEMUFile *f)
 }
 }
 
-void ramblock_set_file_bmap_atomic(RAMBlock *block, ram_addr_t offset)
+void ramblock_set_file_bmap_atomic(RAMBlock *block, ram_addr_t offset, bool set)
 {
-set_bit_atomic(offset >> TARGET_PAGE_BITS, block->file_bmap);
+if (set) {
+set_bit_atomic(offset >> TARGET_PAGE_BITS, block->file_bmap);
+} else {
+clear_bit_atomic(offset >> TARGET_PAGE_BITS, block->file_bmap);
+}
 }
 
 /**
-- 
2.44.0

[PULL 05/34] migration: Report error when shutdown fails

From: Cédric Le Goater 

This will help detect issues regarding I/O channel usage.

Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Peter Xu 
Signed-off-by: Cédric Le Goater 
Link: https://lore.kernel.org/r/20240304122844.1888308-7-...@redhat.com
Signed-off-by: Peter Xu 
---
 migration/qemu-file.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/migration/qemu-file.c b/migration/qemu-file.c
index b10c882629..a10882d47f 100644
--- a/migration/qemu-file.c
+++ b/migration/qemu-file.c
@@ -63,6 +63,8 @@ struct QEMUFile {
  */
 int qemu_file_shutdown(QEMUFile *f)
 {
+Error *err = NULL;
+
 /*
  * We must set qemufile error before the real shutdown(), otherwise
  * there can be a race window where we thought IO all went though
@@ -91,7 +93,8 @@ int qemu_file_shutdown(QEMUFile *f)
 return -ENOSYS;
 }
 
-if (qio_channel_shutdown(f->ioc, QIO_CHANNEL_SHUTDOWN_BOTH, NULL) < 0) {
+if (qio_channel_shutdown(f->ioc, QIO_CHANNEL_SHUTDOWN_BOTH, &err) < 0) {
+error_report_err(err);
 return -EIO;
 }
 
-- 
2.44.0

[PULL 32/34] migration/multifd: Implement ram_save_target_page_multifd to handle multifd version of MigrationOps::ram_save_target_page.

From: Hao Xiang 

1. Add a dedicated handler for MigrationOps::ram_save_target_page in
multifd live migration.
2. Refactor ram_save_target_page_legacy so that the legacy and multifd
handlers don't have internal functions calling into each other.
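
The two points above amount to choosing one handler at setup time and keeping each handler self-contained. A simplified sketch of that dispatch follows. Every Demo* name is hypothetical, and the legacy branch assumes (from the QAPI description) that zero pages are checked on the main thread whenever detection is not 'none'.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum {
    DEMO_ZPD_NONE,
    DEMO_ZPD_LEGACY,
    DEMO_ZPD_MULTIFD,
} DemoZeroPageDetection;

typedef struct {
    bool multifd;
    DemoZeroPageDetection zpd;
} DemoConfig;

typedef enum {
    DEMO_SENT_ZERO_MAIN, /* zero page handled on the main thread */
    DEMO_SENT_PAGE_MAIN, /* full page sent on the main thread */
    DEMO_QUEUED_MULTIFD, /* page handed to a multifd worker */
} DemoResult;

typedef DemoResult (*DemoSavePageFn)(const DemoConfig *cfg,
                                     bool page_is_zero);

/* Non-multifd path: everything happens on the main thread. */
DemoResult demo_save_page_legacy(const DemoConfig *cfg, bool page_is_zero)
{
    assert(cfg != NULL);
    if (cfg->zpd != DEMO_ZPD_NONE && page_is_zero) {
        return DEMO_SENT_ZERO_MAIN;
    }
    return DEMO_SENT_PAGE_MAIN;
}

/* Multifd path: main-thread zero check only in 'legacy' mode. */
DemoResult demo_save_page_multifd(const DemoConfig *cfg, bool page_is_zero)
{
    assert(cfg != NULL);
    if (cfg->zpd == DEMO_ZPD_LEGACY && page_is_zero) {
        return DEMO_SENT_ZERO_MAIN;
    }
    return DEMO_QUEUED_MULTIFD;
}

/* Chosen once at setup, so the per-page path has no mode checks. */
DemoSavePageFn demo_select_handler(const DemoConfig *cfg)
{
    assert(cfg != NULL);
    return cfg->multifd ? demo_save_page_multifd : demo_save_page_legacy;
}
```

Selecting the function pointer once keeps the per-page path free of repeated mode checks, and neither handler has to call into the other.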

Signed-off-by: Hao Xiang 
Reviewed-by: Fabiano Rosas 
Message-Id: <20240226195654.934709-4-hao.xi...@bytedance.com>
Link: https://lore.kernel.org/r/20240311180015.3359271-6-hao.xi...@linux.dev
Signed-off-by: Peter Xu 
---
 migration/ram.c | 38 +-
 1 file changed, 29 insertions(+), 9 deletions(-)

diff --git a/migration/ram.c b/migration/ram.c
index c26435adc7..8deb84984f 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -2079,7 +2079,6 @@ static bool save_compress_page(RAMState *rs, PageSearchStatus *pss,
  */
 static int ram_save_target_page_legacy(RAMState *rs, PageSearchStatus *pss)
 {
-RAMBlock *block = pss->block;
 ram_addr_t offset = ((ram_addr_t)pss->page) << TARGET_PAGE_BITS;
 int res;
 
@@ -2095,17 +2094,33 @@ static int ram_save_target_page_legacy(RAMState *rs, PageSearchStatus *pss)
 return 1;
 }
 
+return ram_save_page(rs, pss);
+}
+
+/**
+ * ram_save_target_page_multifd: send one target page to multifd workers
+ *
+ * Returns 1 if the page was queued, -1 otherwise.
+ *
+ * @rs: current RAM state
+ * @pss: data about the page we want to send
+ */
+static int ram_save_target_page_multifd(RAMState *rs, PageSearchStatus *pss)
+{
+RAMBlock *block = pss->block;
+ram_addr_t offset = ((ram_addr_t)pss->page) << TARGET_PAGE_BITS;
+
 /*
- * Do not use multifd in postcopy as one whole host page should be
- * placed.  Meanwhile postcopy requires atomic update of pages, so even
- * if host page size == guest page size the dest guest during run may
- * still see partially copied pages which is data corruption.
+ * While using multifd live migration, we still need to handle zero
+ * page checking on the migration main thread.
  */
-if (migrate_multifd() && !migration_in_postcopy()) {
-return ram_save_multifd_page(block, offset);
+if (migrate_zero_page_detection() == ZERO_PAGE_DETECTION_LEGACY) {
+if (save_zero_page(rs, pss, offset)) {
+return 1;
+}
 }
 
-return ram_save_page(rs, pss);
+return ram_save_multifd_page(block, offset);
 }
 
 /* Should be called before sending a host page */
@@ -3112,7 +3127,12 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
 }
 
 migration_ops = g_malloc0(sizeof(MigrationOps));
-migration_ops->ram_save_target_page = ram_save_target_page_legacy;
+
+if (migrate_multifd()) {
+migration_ops->ram_save_target_page = ram_save_target_page_multifd;
+} else {
+migration_ops->ram_save_target_page = ram_save_target_page_legacy;
+}
 
 bql_unlock();
 ret = multifd_send_sync_main();
-- 
2.44.0

[PULL 07/34] migration: Add documentation for SaveVMHandlers

From: Cédric Le Goater 

The SaveVMHandlers structure is still in use for complex subsystems
and devices. Document the handlers since we are going to modify a few
later.

Reviewed-by: Peter Xu 
Signed-off-by: Cédric Le Goater 
Link: https://lore.kernel.org/r/20240304122844.1888308-9-...@redhat.com
Signed-off-by: Peter Xu 
---
 include/migration/register.h | 263 +++
 1 file changed, 237 insertions(+), 26 deletions(-)

diff --git a/include/migration/register.h b/include/migration/register.h
index 2e6a7d766e..d7b70a8be6 100644
--- a/include/migration/register.h
+++ b/include/migration/register.h
@@ -16,30 +16,130 @@
 
 #include "hw/vmstate-if.h"
 
+/**
+ * struct SaveVMHandlers: handler structure to finely control
+ * migration of complex subsystems and devices, such as RAM, block and
+ * VFIO.
+ */
 typedef struct SaveVMHandlers {
-/* This runs inside the BQL.  */
+
+/* The following handlers run inside the BQL. */
+
+/**
+ * @save_state
+ *
+ * Saves state section on the source using the latest state format
+ * version.
+ *
+ * Legacy method. Should be deprecated when all users are ported
+ * to VMStateDescription.
+ *
+ * @f: QEMUFile where to send the data
+ * @opaque: data pointer passed to register_savevm_live()
+ */
 void (*save_state)(QEMUFile *f, void *opaque);
 
-/*
- * save_prepare is called early, even before migration starts, and can be
- * used to perform early checks.
+/**
+ * @save_prepare
+ *
+ * Called early, even before migration starts, and can be used to
+ * perform early checks.
+ *
+ * @opaque: data pointer passed to register_savevm_live()
+ * @errp: pointer to Error*, to store an error if it happens.
+ *
+ * Returns zero to indicate success and negative for error
  */
 int (*save_prepare)(void *opaque, Error **errp);
+
+/**
+ * @save_setup
+ *
+ * Initializes the data structures on the source and transmits
+ * first section containing information on the device
+ *
+ * @f: QEMUFile where to send the data
+ * @opaque: data pointer passed to register_savevm_live()
+ *
+ * Returns zero to indicate success and negative for error
+ */
 int (*save_setup)(QEMUFile *f, void *opaque);
+
+/**
+ * @save_cleanup
+ *
+ * Uninitializes the data structures on the source
+ *
+ * @opaque: data pointer passed to register_savevm_live()
+ */
 void (*save_cleanup)(void *opaque);
+
+/**
+ * @save_live_complete_postcopy
+ *
+ * Called at the end of postcopy for all postcopyable devices.
+ *
+ * @f: QEMUFile where to send the data
+ * @opaque: data pointer passed to register_savevm_live()
+ *
+ * Returns zero to indicate success and negative for error
+ */
 int (*save_live_complete_postcopy)(QEMUFile *f, void *opaque);
+
+/**
+ * @save_live_complete_precopy
+ *
+ * Transmits the last section for the device containing any
+ * remaining data at the end of a precopy phase. When postcopy is
+ * enabled, devices that support postcopy will skip this step,
+ * where the final data will be flushed at the end of postcopy via
+ * @save_live_complete_postcopy instead.
+ *
+ * @f: QEMUFile where to send the data
+ * @opaque: data pointer passed to register_savevm_live()
+ *
+ * Returns zero to indicate success and negative for error
+ */
 int (*save_live_complete_precopy)(QEMUFile *f, void *opaque);
 
 /* This runs both outside and inside the BQL.  */
+
+/**
+ * @is_active
+ *
+ * Will skip a state section if not active
+ *
+ * @opaque: data pointer passed to register_savevm_live()
+ *
+ * Returns true if state section is active else false
+ */
 bool (*is_active)(void *opaque);
+
+/**
+ * @has_postcopy
+ *
+ * Checks if a device supports postcopy
+ *
+ * @opaque: data pointer passed to register_savevm_live()
+ *
+ * Returns true for postcopy support else false
+ */
 bool (*has_postcopy)(void *opaque);
 
-/* is_active_iterate
- * If it is not NULL then qemu_savevm_state_iterate will skip iteration if
- * it returns false. For example, it is needed for only-postcopy-states,
- * which needs to be handled by qemu_savevm_state_setup and
- * qemu_savevm_state_pending, but do not need iterations until not in
- * postcopy stage.
+/**
+ * @is_active_iterate
+ *
+ * As #SaveVMHandlers.is_active(), will skip an inactive state
+ * section in qemu_savevm_state_iterate.
+ *
+ * For example, it is needed for only-postcopy-states, which needs
+ * to be handled by qemu_savevm_state_setup() and
+ * qemu_savevm_state_pending(), but do not need iterations until
+ * not in postcopy stage.
+ *
+ * @opaque: data pointer passed to register_savevm_live()
+ *

[PULL 18/34] migration: export migration_is_setup_or_active

From: Steve Sistare 

Delete the MigrationState parameter from migration_is_setup_or_active
and move it to the public API in misc.h.
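
The shape of this change can be sketched outside QEMU as follows. This is
only an illustration under assumptions: every demo_* name is hypothetical,
the state enum is reduced to four values, and the setter exists only so the
sketch can be exercised. The point is that the predicate reads a
file-private "current" object instead of taking the state as an argument,
so callers no longer need the definition of the state structure.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, reduced set of states; QEMU's real status enum has more. */
enum demo_status {
    DEMO_STATUS_NONE,
    DEMO_STATUS_SETUP,
    DEMO_STATUS_ACTIVE,
    DEMO_STATUS_COMPLETED,
};

/* Private to this translation unit, in the role of current_migration. */
struct demo_migration {
    enum demo_status state;
};

static struct demo_migration demo_current = { DEMO_STATUS_NONE };

/* Illustration-only hook that lets a caller drive the private state. */
static void demo_set_status(enum demo_status status)
{
    assert(status <= DEMO_STATUS_COMPLETED);
    demo_current.state = status;
}

/* Public predicate: no parameter, so callers need not see the struct. */
static bool demo_is_setup_or_active(void)
{
    switch (demo_current.state) {
    case DEMO_STATUS_SETUP:
    case DEMO_STATUS_ACTIVE:
        return true;
    default:
        return false;
    }
}
```

With this shape a caller outside the migration code only needs the
prototype of the predicate, which is what allows the "migration/migration.h"
include to be dropped from net/vhost-vdpa.c in the diff below.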

Signed-off-by: Steve Sistare 
Reviewed-by: Philippe Mathieu-Daudé 
Link: https://lore.kernel.org/r/1710179338-294359-3-git-send-email-steven.sist...@oracle.com
Signed-off-by: Peter Xu 
---
 include/migration/misc.h |  1 +
 migration/migration.h|  1 -
 hw/vfio/common.c |  2 +-
 migration/migration.c| 12 ++--
 migration/ram.c  |  5 ++---
 net/vhost-vdpa.c |  3 +--
 6 files changed, 11 insertions(+), 13 deletions(-)

diff --git a/include/migration/misc.h b/include/migration/misc.h
index 4c226a40bb..79cff6224e 100644
--- a/include/migration/misc.h
+++ b/include/migration/misc.h
@@ -61,6 +61,7 @@ void migration_object_init(void);
 void migration_shutdown(void);
 bool migration_is_idle(void);
 bool migration_is_active(MigrationState *);
+bool migration_is_setup_or_active(void);
 bool migrate_mode_is_cpr(MigrationState *);
 
 typedef enum MigrationEventType {
diff --git a/migration/migration.h b/migration/migration.h
index 65c0b61cbd..736460aa8b 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -479,7 +479,6 @@ bool migrate_has_error(MigrationState *s);
 
 void migrate_fd_connect(MigrationState *s, Error *error_in);
 
-bool migration_is_setup_or_active(int state);
 bool migration_is_running(int state);
 
 int migrate_init(MigrationState *s, Error **errp);
diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 059bfdc07a..896eab8103 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -152,7 +152,7 @@ static void vfio_set_migration_error(int err)
 {
 MigrationState *ms = migrate_get_current();
 
-if (migration_is_setup_or_active(ms->state)) {
+if (migration_is_setup_or_active()) {
 WITH_QEMU_LOCK_GUARD(&ms->qemu_file_lock) {
 if (ms->to_dst_file) {
 qemu_file_set_error(ms->to_dst_file, err);
diff --git a/migration/migration.c b/migration/migration.c
index a49fcd53ee..af21403bad 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1081,9 +1081,11 @@ void migrate_send_rp_resume_ack(MigrationIncomingState *mis, uint32_t value)
  * Return true if we're already in the middle of a migration
  * (i.e. any of the active or setup states)
  */
-bool migration_is_setup_or_active(int state)
+bool migration_is_setup_or_active(void)
 {
-switch (state) {
+MigrationState *s = current_migration;
+
+switch (s->state) {
 case MIGRATION_STATUS_ACTIVE:
 case MIGRATION_STATUS_POSTCOPY_ACTIVE:
 case MIGRATION_STATUS_POSTCOPY_PAUSED:
@@ -1601,10 +1603,8 @@ bool migration_incoming_postcopy_advised(void)
 
 bool migration_in_bg_snapshot(void)
 {
-MigrationState *s = migrate_get_current();
-
 return migrate_background_snapshot() &&
-migration_is_setup_or_active(s->state);
+   migration_is_setup_or_active();
 }
 
 bool migration_is_idle(void)
@@ -2297,7 +2297,7 @@ static void *source_return_path_thread(void *opaque)
 trace_source_return_path_thread_entry();
 rcu_register_thread();
 
-while (migration_is_setup_or_active(ms->state)) {
+while (migration_is_setup_or_active()) {
 trace_source_return_path_thread_loop_top();
 
 header_type = qemu_get_be16(rp);
diff --git a/migration/ram.c b/migration/ram.c
index 2cd936d9ce..3ee8cb47d3 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -2909,10 +2909,9 @@ void qemu_guest_free_page_hint(void *addr, size_t len)
 RAMBlock *block;
 ram_addr_t offset;
 size_t used_len, start, npages;
-MigrationState *s = migrate_get_current();
 
 /* This function is currently expected to be used during live migration */
-if (!migration_is_setup_or_active(s->state)) {
+if (!migration_is_setup_or_active()) {
 return;
 }
 
@@ -3263,7 +3262,7 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
 
 out:
 if (ret >= 0
-&& migration_is_setup_or_active(migrate_get_current()->state)) {
+&& migration_is_setup_or_active()) {
 if (migrate_multifd() && migrate_multifd_flush_after_each_section() &&
 !migrate_mapped_ram()) {
 ret = multifd_send_sync_main();
diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index e6bdb4562d..8564817073 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -26,7 +26,6 @@
 #include 
 #include "standard-headers/linux/virtio_net.h"
 #include "monitor/monitor.h"
-#include "migration/migration.h"
 #include "migration/misc.h"
 #include "hw/virtio/vhost.h"
 
@@ -355,7 +354,7 @@ static int vhost_vdpa_net_data_start(NetClientState *nc)
 assert(nc->info->type == NET_CLIENT_DRIVER_VHOST_VDPA);
 
 if (s->always_svq ||
-migration_is_setup_or_active(migrate_get_current()->state)) {
+migration_is_setup_or_active()) {
 v->shadow_vqs_enabled = true;
 } else {
 v->shadow_vqs_enabled = false;
-- 
2.44.0

[PULL 22/34] migration: migration_thread_is_self

From: Steve Sistare 

Define and export migration_thread_is_self to eliminate a dependency
on MigrationState.

Signed-off-by: Steve Sistare 
Link: https://lore.kernel.org/r/1710179338-294359-7-git-send-email-steven.sist...@oracle.com
Signed-off-by: Peter Xu 
---
 include/migration/misc.h | 1 +
 migration/migration.c| 7 +++
 system/dirtylimit.c  | 5 +
 3 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/include/migration/misc.h b/include/migration/misc.h
index 7526977de6..c4b5416357 100644
--- a/include/migration/misc.h
+++ b/include/migration/misc.h
@@ -61,6 +61,7 @@ void migration_object_init(void);
 void migration_shutdown(void);
 bool migration_is_idle(void);
 bool migration_is_active(void);
+bool migration_thread_is_self(void);
 bool migration_is_setup_or_active(void);
 bool migrate_mode_is_cpr(MigrationState *);
 
diff --git a/migration/migration.c b/migration/migration.c
index 546ba86c63..afe72af0b1 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1647,6 +1647,13 @@ bool migration_is_active(void)
 s->state == MIGRATION_STATUS_POSTCOPY_ACTIVE);
 }
 
+bool migration_thread_is_self(void)
+{
+MigrationState *s = current_migration;
+
+return qemu_thread_is_self(&s->thread);
+}
+
 bool migrate_mode_is_cpr(MigrationState *s)
 {
 return s->parameters.mode == MIG_MODE_CPR_REBOOT;
diff --git a/system/dirtylimit.c b/system/dirtylimit.c
index b0afaa0776..ab20da34bb 100644
--- a/system/dirtylimit.c
+++ b/system/dirtylimit.c
@@ -25,7 +25,6 @@
 #include "sysemu/kvm.h"
 #include "trace.h"
 #include "migration/misc.h"
-#include "migration/migration.h"
 
 /*
  * Dirtylimit stop working if dirty page rate error
@@ -448,10 +447,8 @@ static void dirtylimit_cleanup(void)
  */
 static bool dirtylimit_is_allowed(void)
 {
-MigrationState *ms = migrate_get_current();
-
 if (migration_is_running() &&
-(!qemu_thread_is_self(&ms->thread)) &&
+!migration_thread_is_self() &&
 migrate_dirty_limit() &&
 dirtylimit_in_service()) {
 return false;
-- 
2.44.0

[PULL 10/34] migration/rdma: Fix a memory issue for migration

From: Yu Zhang 

In commit 3fa9642ff7 change was made to convert the RDMA backend to
accept MigrateAddress struct. However, the assignment of "host" leads
to data corruption on the target host and the failure of migration.

isock->host = rdma->host;

By allocating the memory explicitly for it with g_strdup_printf(), the
issue is fixed and the migration doesn't fail any more.
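
The ownership rule behind the fix can be sketched in plain C. This is an
illustration under assumptions, not the QEMU code: the demo_* types and
helpers are hypothetical, and demo_strdup() merely stands in for g_strdup().
One plausible reading of the bug is that two structures ended up referring
to the same heap string, so releasing one of them invalidated the other;
giving the second structure its own copy avoids that.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for the RDMA context and the socket address. */
struct demo_ctx {
    char *host;
};

struct demo_sock_addr {
    char *host;
};

/* Portable duplicate-string helper, in the role of g_strdup(). */
static char *demo_strdup(const char *s)
{
    size_t n;
    char *copy;

    assert(s != NULL);
    n = strlen(s) + 1;
    copy = malloc(n);
    if (copy) {
        memcpy(copy, s, n);
    }
    return copy;
}

/* The address gets its own copy, so each owner frees only its own string. */
static void demo_sock_addr_set_host(struct demo_sock_addr *addr,
                                    const struct demo_ctx *ctx)
{
    addr->host = demo_strdup(ctx->host);
}

static void demo_sock_addr_free(struct demo_sock_addr *addr)
{
    free(addr->host);
    addr->host = NULL;
}
```

After demo_sock_addr_free() the context's host string is still valid, which
would not hold if the pointer had simply been assigned.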

Fixes: 3fa9642ff7 ("migration: convert rdma backend to accept MigrateAddress")
Cc: qemu-stable 
Cc: Li Zhijian 
Link: https://lore.kernel.org/r/CAHEcVy4L_D6tuhJ8h=xlr4wapaprje3nnxzaeyunotrxq6c...@mail.gmail.com
Signed-off-by: Yu Zhang 
[peterx: use g_strdup() instead of g_strdup_printf(), per Zhijian]
Signed-off-by: Peter Xu 
---
 migration/rdma.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/migration/rdma.c b/migration/rdma.c
index a355dcea89..855753c671 100644
--- a/migration/rdma.c
+++ b/migration/rdma.c
@@ -3357,7 +3357,7 @@ static int qemu_rdma_accept(RDMAContext *rdma)
 goto err_rdma_dest_wait;
 }
 
-isock->host = rdma->host;
+isock->host = g_strdup(rdma->host);
 isock->port = g_strdup_printf("%d", rdma->port);
 
 /*
-- 
2.44.0

[PULL 28/34] migration/multifd: Allow zero pages in file migration

From: Fabiano Rosas 

Currently, it's an error to have no data pages in the multifd file
migration because zero page detection is done in the migration thread
and zero pages don't reach multifd. This is enforced with the
pages->num assert.

We're about to add zero page detection on the multifd thread. Fix the
file_write_ramblock_iov() to stop considering p->iovs_num=0 an error.
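
Why the initial value of ret matters can be shown with a reduced loop. This
is a sketch under assumptions: the demo_* names, the chunk structure and the
callback interface are all hypothetical and much simpler than the real
function. If the loop body never runs, the function returns whatever ret was
initialised to, so starting from 0 makes "nothing to write" a success rather
than an error.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified I/O vector entry. */
struct demo_chunk {
    const void *base;
    size_t len;
};

typedef int (*demo_write_fn)(const struct demo_chunk *chunk, void *opaque);

/* Returns 0 on success, including when nchunks is 0, and -1 on failure. */
static int demo_write_chunks(const struct demo_chunk *chunks, int nchunks,
                             demo_write_fn write_one, void *opaque)
{
    int ret = 0;    /* an empty list leaves this untouched */

    assert(write_one != NULL);
    for (int i = 0; i < nchunks; i++) {
        ret = write_one(&chunks[i], opaque);
        if (ret < 0) {
            break;
        }
    }
    return ret < 0 ? -1 : 0;
}

/* Example callback: adds each chunk's length to a size_t counter. */
static int demo_count_bytes(const struct demo_chunk *chunk, void *opaque)
{
    *(size_t *)opaque += chunk->len;
    return 0;
}

/* Example callback: simulates an I/O error. */
static int demo_always_fail(const struct demo_chunk *chunk, void *opaque)
{
    (void)chunk;
    (void)opaque;
    return -1;
}
```

Had ret started at -1, the empty call would have reported a failure even
though no write was attempted.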

Signed-off-by: Fabiano Rosas 
Link: https://lore.kernel.org/r/20240311180015.3359271-2-hao.xi...@linux.dev
Signed-off-by: Peter Xu 
---
 migration/file.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/migration/file.c b/migration/file.c
index 5054a60851..b0b963e0ce 100644
--- a/migration/file.c
+++ b/migration/file.c
@@ -159,7 +159,7 @@ void file_start_incoming_migration(FileMigrationArgs *file_args, Error **errp)
 int file_write_ramblock_iov(QIOChannel *ioc, const struct iovec *iov,
 int niov, RAMBlock *block, Error **errp)
 {
-ssize_t ret = -1;
+ssize_t ret = 0;
 int i, slice_idx, slice_num;
 uintptr_t base, next, offset;
 size_t len;
-- 
2.44.0

[PULL 34/34] migration/multifd: Add new migration test cases for legacy zero page checking.

From: Hao Xiang 

Now that zero page checking is done on the multifd sender threads by
default, we still provide an option for backward compatibility. This
change adds a qtest migration test case to set the zero-page-detection
option to "legacy" and run multifd migration with zero page checking on the
migration main thread.

Signed-off-by: Hao Xiang 
Reviewed-by: Peter Xu 
Link: https://lore.kernel.org/r/20240311180015.3359271-8-hao.xi...@linux.dev
Signed-off-by: Peter Xu 
---
 tests/qtest/migration-test.c | 52 
 1 file changed, 52 insertions(+)

diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
index 4023d808f9..71895abb7f 100644
--- a/tests/qtest/migration-test.c
+++ b/tests/qtest/migration-test.c
@@ -2771,6 +2771,24 @@ test_migrate_precopy_tcp_multifd_start(QTestState *from,
 return test_migrate_precopy_tcp_multifd_start_common(from, to, "none");
 }
 
+static void *
+test_migrate_precopy_tcp_multifd_start_zero_page_legacy(QTestState *from,
+QTestState *to)
+{
+test_migrate_precopy_tcp_multifd_start_common(from, to, "none");
+migrate_set_parameter_str(from, "zero-page-detection", "legacy");
+return NULL;
+}
+
+static void *
+test_migration_precopy_tcp_multifd_start_no_zero_page(QTestState *from,
+  QTestState *to)
+{
+test_migrate_precopy_tcp_multifd_start_common(from, to, "none");
+migrate_set_parameter_str(from, "zero-page-detection", "none");
+return NULL;
+}
+
 static void *
 test_migrate_precopy_tcp_multifd_zlib_start(QTestState *from,
 QTestState *to)
@@ -2812,6 +2830,36 @@ static void test_multifd_tcp_none(void)
 test_precopy_common(&args);
 }
 
+static void test_multifd_tcp_zero_page_legacy(void)
+{
+MigrateCommon args = {
+.listen_uri = "defer",
+.start_hook = test_migrate_precopy_tcp_multifd_start_zero_page_legacy,
+/*
+ * Multifd is more complicated than most of the features, it
+ * directly takes guest page buffers when sending, make sure
+ * everything will work alright even if guest page is changing.
+ */
+.live = true,
+};
+test_precopy_common(&args);
+}
+
+static void test_multifd_tcp_no_zero_page(void)
+{
+MigrateCommon args = {
+.listen_uri = "defer",
+.start_hook = test_migration_precopy_tcp_multifd_start_no_zero_page,
+/*
+ * Multifd is more complicated than most of the features, it
+ * directly takes guest page buffers when sending, make sure
+ * everything will work alright even if guest page is changing.
+ */
+.live = true,
+};
+test_precopy_common(&args);
+}
+
 static void test_multifd_tcp_zlib(void)
 {
 MigrateCommon args = {
@@ -3729,6 +3777,10 @@ int main(int argc, char **argv)
 }
 migration_test_add("/migration/multifd/tcp/plain/none",
test_multifd_tcp_none);
+migration_test_add("/migration/multifd/tcp/plain/zero-page/legacy",
+   test_multifd_tcp_zero_page_legacy);
+migration_test_add("/migration/multifd/tcp/plain/zero-page/none",
+   test_multifd_tcp_no_zero_page);
 migration_test_add("/migration/multifd/tcp/plain/cancel",
test_multifd_tcp_cancel);
 migration_test_add("/migration/multifd/tcp/plain/zlib",
-- 
2.44.0

[PULL 26/34] migration: delete unused accessors

From: Steve Sistare 

Signed-off-by: Steve Sistare 
Link: https://lore.kernel.org/r/1710179338-294359-11-git-send-email-steven.sist...@oracle.com
Signed-off-by: Peter Xu 
---
 include/migration/misc.h |  3 ---
 migration/migration.c| 10 --
 2 files changed, 13 deletions(-)

diff --git a/include/migration/misc.h b/include/migration/misc.h
index e521cd5229..d563d2c801 100644
--- a/include/migration/misc.h
+++ b/include/migration/misc.h
@@ -105,13 +105,10 @@ void migration_add_notifier_mode(NotifierWithReturn *notify,
 void migration_remove_notifier(NotifierWithReturn *notify);
 int migration_call_notifiers(MigrationState *s, MigrationEventType type,
  Error **errp);
-bool migration_in_setup(MigrationState *);
-bool migration_has_finished(MigrationState *);
 bool migration_has_failed(MigrationState *);
 bool migration_is_running(void);
 void migration_file_set_error(int err);
 
-/* ...and after the device transmission */
 /* True if incoming migration entered POSTCOPY_INCOMING_DISCARD */
 bool migration_in_incoming_postcopy(void);
 /* True if incoming migration entered POSTCOPY_INCOMING_ADVISE */
diff --git a/migration/migration.c b/migration/migration.c
index 216f63d62b..644e073b7d 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1548,16 +1548,6 @@ int migration_call_notifiers(MigrationState *s, MigrationEventType type,
 return ret;
 }
 
-bool migration_in_setup(MigrationState *s)
-{
-return s->state == MIGRATION_STATUS_SETUP;
-}
-
-bool migration_has_finished(MigrationState *s)
-{
-return s->state == MIGRATION_STATUS_COMPLETED;
-}
-
 bool migration_has_failed(MigrationState *s)
 {
 return (s->state == MIGRATION_STATUS_CANCELLED ||
-- 
2.44.0

[PULL 09/34] migration/multifd: Don't fsync when closing QIOChannelFile

From: Fabiano Rosas 

Commit bc38feddeb ("io: fsync before closing a file channel") added a
fsync/fdatasync at the closing point of the QIOChannelFile to ensure
integrity of the migration stream in case of QEMU crash.

The decision to do the sync at qio_channel_close() was not the best
since that function runs in the main thread and the fsync can cause
QEMU to hang for several minutes, depending on the migration size and
disk speed.

To fix the hang, remove the fsync from qio_channel_file_close().

At this moment, the migration code is the only user of the fsync and
we're taking the tradeoff of not having a sync at all, leaving the
responsibility to the upper layers.
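
What "leaving the responsibility to the upper layers" could look like for a
management application is sketched below. This is an assumption-laden
illustration, not QEMU code: demo_sync_file() is a hypothetical helper that
uses only POSIX open(), fsync() and close(), and it assumes the caller knows
the path of the finished migration file and can open it for writing.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: flush a finished file to stable storage.
 * Returns 0 on success and -1 if the file cannot be opened, synced
 * or closed.
 */
static int demo_sync_file(const char *path)
{
    int fd;
    int ret = 0;

    assert(path != NULL);
    fd = open(path, O_WRONLY);
    if (fd < 0) {
        return -1;
    }
    if (fsync(fd) < 0) {
        ret = -1;
    }
    if (close(fd) < 0) {
        ret = -1;
    }
    return ret;
}
```

Because the sync happens in the caller's own context, a slow disk would
delay that caller rather than QEMU's main thread.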

Fixes: bc38feddeb ("io: fsync before closing a file channel")
Reviewed-by: "Daniel P. Berrangé" 
Signed-off-by: Fabiano Rosas 
Link: https://lore.kernel.org/r/20240305195629.9922-1-faro...@suse.de
Link: https://lore.kernel.org/r/20240305174332.2553-1-faro...@suse.de
[peterx: add more comment to the qio_channel_close()]
Signed-off-by: Peter Xu 
---
 docs/devel/migration/main.rst |  3 ++-
 io/channel-file.c |  5 -
 migration/multifd.c   | 28 +++-
 3 files changed, 21 insertions(+), 15 deletions(-)

diff --git a/docs/devel/migration/main.rst b/docs/devel/migration/main.rst
index 8024275d6d..54385a23e5 100644
--- a/docs/devel/migration/main.rst
+++ b/docs/devel/migration/main.rst
@@ -44,7 +44,8 @@ over any transport.
 - file migration: do the migration using a file that is passed to QEMU
   by path. A file offset option is supported to allow a management
   application to add its own metadata to the start of the file without
-  QEMU interference.
+  QEMU interference. Note that QEMU does not flush cached file
+  data/metadata at the end of migration.
 
 In addition, support is included for migration using RDMA, which
 transports the page data using ``RDMA``, where the hardware takes care of
diff --git a/io/channel-file.c b/io/channel-file.c
index d4706fa592..a6ad7770c6 100644
--- a/io/channel-file.c
+++ b/io/channel-file.c
@@ -242,11 +242,6 @@ static int qio_channel_file_close(QIOChannel *ioc,
 {
 QIOChannelFile *fioc = QIO_CHANNEL_FILE(ioc);
 
-if (qemu_fdatasync(fioc->fd) < 0) {
-error_setg_errno(errp, errno,
- "Unable to synchronize file data with storage device");
-return -1;
-}
 if (qemu_close(fioc->fd) < 0) {
 error_setg_errno(errp, errno,
  "Unable to close file");
diff --git a/migration/multifd.c b/migration/multifd.c
index d4a44da559..bf9d483f7a 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -710,16 +710,26 @@ static bool multifd_send_cleanup_channel(MultiFDSendParams *p, Error **errp)
 if (p->c) {
 migration_ioc_unregister_yank(p->c);
 /*
- * An explicit close() on the channel here is normally not
- * required, but can be helpful for "file:" iochannels, where it
- * will include fdatasync() to make sure the data is flushed to the
- * disk backend.
+ * The object_unref() cannot guarantee the fd will always be
+ * released because finalize() of the iochannel is only
+ * triggered on the last reference and it's not guaranteed
+ * that we always hold the last refcount when reaching here.
  *
- * The object_unref() cannot guarantee that because: (1) finalize()
- * of the iochannel is only triggered on the last reference, and
- * it's not guaranteed that we always hold the last refcount when
- * reaching here, and, (2) even if finalize() is invoked, it only
- * does a close(fd) without data flush.
+ * Closing the fd explicitly has the benefit that if there is any
+ * registered I/O handler callbacks on such fd, that will get a
+ * POLLNVAL event and will further trigger the cleanup to finally
+ * release the IOC.
+ *
+ * FIXME: It should logically be guaranteed that all multifd
+ * channels have no I/O handler callback registered when reaching
+ * here, because migration thread will wait for all multifd channel
+ * establishments to complete during setup.  Since
+ * migrate_fd_cleanup() will be scheduled in main thread too, all
+ * previous callbacks should guarantee to be completed when
+ * reaching here.  See multifd_send_state.channels_created and its
+ * usage.  In the future, we could replace this with an assert
+ * making sure we're the last reference, or simply drop it if above
+ * is more clear to be justified.
  */
 qio_channel_close(p->c, &error_abort);
 object_unref(OBJECT(p->c));
-- 
2.44.0

[PULL 33/34] migration/multifd: Enable multifd zero page checking by default.

From: Hao Xiang 

1. Set default "zero-page-detection" option to "multifd". Now
zero page checking can be done in the multifd threads and this
becomes the default configuration.
2. Handle migration QEMU9.0 -> QEMU8.2 compatibility. We provide
backward compatibility where zero page checking is done from the
migration main thread.
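
The interplay between a new built-in default and a per-machine-version
override can be sketched as a table lookup. This is a simplified model under
assumptions: the demo_* names and the lookup function are hypothetical, and
QEMU's real global-property machinery is more involved. Only the table entry
itself mirrors the one this patch adds to hw_compat_8_2.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of one compatibility entry. */
struct demo_compat_prop {
    const char *driver;
    const char *property;
    const char *value;
};

/* Mirrors the entry added for 8.2 machine types in this patch. */
static const struct demo_compat_prop demo_compat_8_2[] = {
    { "migration", "zero-page-detection", "legacy" },
};

/* Returns the override if the table has one, else the built-in default. */
static const char *demo_resolve(const struct demo_compat_prop *table,
                                size_t n, const char *driver,
                                const char *property,
                                const char *built_in_default)
{
    assert(driver != NULL && property != NULL);
    for (size_t i = 0; i < n; i++) {
        if (strcmp(table[i].driver, driver) == 0 &&
            strcmp(table[i].property, property) == 0) {
            return table[i].value;
        }
    }
    return built_in_default;
}
```

In this model a machine type that carries the 8.2 table resolves the option
to "legacy", while one with no override falls through to the new "multifd"
default.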

Signed-off-by: Hao Xiang 
Reviewed-by: Fabiano Rosas 
Reviewed-by: Peter Xu 
Link: https://lore.kernel.org/r/20240311180015.3359271-7-hao.xi...@linux.dev
Signed-off-by: Peter Xu 
---
 qapi/migration.json | 6 +++---
 hw/core/machine.c   | 4 +++-
 migration/options.c | 2 +-
 3 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/qapi/migration.json b/qapi/migration.json
index 2684e4e9ac..aa1b39bce1 100644
--- a/qapi/migration.json
+++ b/qapi/migration.json
@@ -909,7 +909,7 @@
 #(Since 8.2)
 #
 # @zero-page-detection: Whether and how to detect zero pages.
-# See description in @ZeroPageDetection.  Default is 'legacy'.
+# See description in @ZeroPageDetection.  Default is 'multifd'.
 # (since 9.0)
 #
 # Features:
@@ -1106,7 +1106,7 @@
 #(Since 8.2)
 #
 # @zero-page-detection: Whether and how to detect zero pages.
-# See description in @ZeroPageDetection.  Default is 'legacy'.
+# See description in @ZeroPageDetection.  Default is 'multifd'.
 # (since 9.0)
 #
 # Features:
@@ -1339,7 +1339,7 @@
 #(Since 8.2)
 #
 # @zero-page-detection: Whether and how to detect zero pages.
-# See description in @ZeroPageDetection.  Default is 'legacy'.
+# See description in @ZeroPageDetection.  Default is 'multifd'.
 # (since 9.0)
 #
 # Features:
diff --git a/hw/core/machine.c b/hw/core/machine.c
index 9ac5d5389a..0e9d646b61 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -32,7 +32,9 @@
 #include "hw/virtio/virtio-net.h"
 #include "audio/audio.h"
 
-GlobalProperty hw_compat_8_2[] = {};
+GlobalProperty hw_compat_8_2[] = {
+{ "migration", "zero-page-detection", "legacy"},
+};
 const size_t hw_compat_8_2_len = G_N_ELEMENTS(hw_compat_8_2);
 
 GlobalProperty hw_compat_8_1[] = {
diff --git a/migration/options.c b/migration/options.c
index 8f2a3a2fa5..9ed2fe4bee 100644
--- a/migration/options.c
+++ b/migration/options.c
@@ -181,7 +181,7 @@ Property migration_properties[] = {
   MIG_MODE_NORMAL),
 DEFINE_PROP_ZERO_PAGE_DETECTION("zero-page-detection", MigrationState,
parameters.zero_page_detection,
-   ZERO_PAGE_DETECTION_LEGACY),
+   ZERO_PAGE_DETECTION_MULTIFD),
 
 /* Migration capabilities */
 DEFINE_PROP_MIG_CAP("x-xbzrle", MIGRATION_CAPABILITY_XBZRLE),
-- 
2.44.0

[PULL 06/34] migration: Remove SaveStateHandler and LoadStateHandler typedefs

From: Cédric Le Goater 

They are only used once.

Reviewed-by: Fabiano Rosas 
Reviewed-by: Peter Xu 
Signed-off-by: Cédric Le Goater 
Link: https://lore.kernel.org/r/20240304122844.1888308-8-...@redhat.com
Signed-off-by: Peter Xu 
---
 include/migration/register.h | 4 ++--
 include/qemu/typedefs.h  | 2 --
 2 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/include/migration/register.h b/include/migration/register.h
index 9ab1f79512..2e6a7d766e 100644
--- a/include/migration/register.h
+++ b/include/migration/register.h
@@ -18,7 +18,7 @@
 
 typedef struct SaveVMHandlers {
 /* This runs inside the BQL.  */
-SaveStateHandler *save_state;
+void (*save_state)(QEMUFile *f, void *opaque);
 
 /*
  * save_prepare is called early, even before migration starts, and can be
@@ -71,7 +71,7 @@ typedef struct SaveVMHandlers {
 /* This calculate the exact remaining data to transfer */
 void (*state_pending_exact)(void *opaque, uint64_t *must_precopy,
 uint64_t *can_postcopy);
-LoadStateHandler *load_state;
+int (*load_state)(QEMUFile *f, void *opaque, int version_id);
 int (*load_setup)(QEMUFile *f, void *opaque);
 int (*load_cleanup)(void *opaque);
 /* Called when postcopy migration wants to resume from failure */
diff --git a/include/qemu/typedefs.h b/include/qemu/typedefs.h
index a028dba4d0..50c277cf0b 100644
--- a/include/qemu/typedefs.h
+++ b/include/qemu/typedefs.h
@@ -151,8 +151,6 @@ typedef struct IRQState *qemu_irq;
 /*
  * Function types
  */
-typedef void SaveStateHandler(QEMUFile *f, void *opaque);
-typedef int LoadStateHandler(QEMUFile *f, void *opaque, int version_id);
 typedef void (*qemu_irq_handler)(void *opaque, int n, int level);
 
 #endif /* QEMU_TYPEDEFS_H */
-- 
2.44.0

[PULL 24/34] migration: migration_file_set_error

From: Steve Sistare 

Define and export migration_file_set_error to eliminate a dependency
on MigrationState.

Signed-off-by: Steve Sistare 
Link: https://lore.kernel.org/r/1710179338-294359-9-git-send-email-steven.sist...@oracle.com
Signed-off-by: Peter Xu 
---
 include/migration/misc.h |  2 ++
 hw/vfio/common.c |  9 +
 hw/vfio/migration.c  | 11 +++
 migration/migration.c| 11 +++
 4 files changed, 17 insertions(+), 16 deletions(-)

diff --git a/include/migration/misc.h b/include/migration/misc.h
index 28cfaed2c7..e521cd5229 100644
--- a/include/migration/misc.h
+++ b/include/migration/misc.h
@@ -109,6 +109,8 @@ bool migration_in_setup(MigrationState *);
 bool migration_has_finished(MigrationState *);
 bool migration_has_failed(MigrationState *);
 bool migration_is_running(void);
+void migration_file_set_error(int err);
+
 /* ...and after the device transmission */
 /* True if incoming migration entered POSTCOPY_INCOMING_DISCARD */
 bool migration_in_incoming_postcopy(void);
diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index de010680ff..b44204eade 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -39,7 +39,6 @@
 #include "sysemu/runstate.h"
 #include "trace.h"
 #include "qapi/error.h"
-#include "migration/migration.h"
 #include "migration/misc.h"
 #include "migration/blocker.h"
 #include "migration/qemu-file.h"
@@ -150,14 +149,8 @@ bool vfio_viommu_preset(VFIODevice *vbasedev)
 
 static void vfio_set_migration_error(int err)
 {
-MigrationState *ms = migrate_get_current();
-
 if (migration_is_setup_or_active()) {
-WITH_QEMU_LOCK_GUARD(&ms->qemu_file_lock) {
-if (ms->to_dst_file) {
-qemu_file_set_error(ms->to_dst_file, err);
-}
-}
+migration_file_set_error(err);
 }
 }
 
diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
index 49c0016add..a15fd486c6 100644
--- a/hw/vfio/migration.c
+++ b/hw/vfio/migration.c
@@ -17,13 +17,12 @@
 
 #include "sysemu/runstate.h"
 #include "hw/vfio/vfio-common.h"
-#include "migration/migration.h"
+#include "migration/misc.h"
 #include "migration/savevm.h"
 #include "migration/vmstate.h"
 #include "migration/qemu-file.h"
 #include "migration/register.h"
 #include "migration/blocker.h"
-#include "migration/misc.h"
 #include "qapi/error.h"
 #include "exec/ramlist.h"
 #include "exec/ram_addr.h"
@@ -714,9 +713,7 @@ static void vfio_vmstate_change_prepare(void *opaque, bool running,
  * Migration should be aborted in this case, but vm_state_notify()
  * currently does not support reporting failures.
  */
-if (migrate_get_current()->to_dst_file) {
-qemu_file_set_error(migrate_get_current()->to_dst_file, ret);
-}
+migration_file_set_error(ret);
 }
 
 trace_vfio_vmstate_change_prepare(vbasedev->name, running,
@@ -746,9 +743,7 @@ static void vfio_vmstate_change(void *opaque, bool running, RunState state)
  * Migration should be aborted in this case, but vm_state_notify()
  * currently does not support reporting failures.
  */
-if (migrate_get_current()->to_dst_file) {
-qemu_file_set_error(migrate_get_current()->to_dst_file, ret);
-}
+migration_file_set_error(ret);
 }
 
 trace_vfio_vmstate_change(vbasedev->name, running, RunState_str(state),
diff --git a/migration/migration.c b/migration/migration.c
index db1e627848..216f63d62b 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -3038,6 +3038,17 @@ static MigThrError postcopy_pause(MigrationState *s)
 }
 }
 
+void migration_file_set_error(int err)
+{
+MigrationState *s = current_migration;
+
+WITH_QEMU_LOCK_GUARD(&s->qemu_file_lock) {
+if (s->to_dst_file) {
+qemu_file_set_error(s->to_dst_file, err);
+}
+}
+}
+
 static MigThrError migration_detect_error(MigrationState *s)
 {
 int ret;
-- 
2.44.0

[PULL 12/34] physmem: Reduce local variable scope in flatview_read/write_continue()

From: Jonathan Cameron 

Precursor to factoring out the inner loops for reuse.

Reviewed-by: Peter Xu 
Signed-off-by: Jonathan Cameron 
Reviewed-by: David Hildenbrand 
Reviewed-by: Philippe Mathieu-Daudé 
Link: https://lore.kernel.org/r/20240307153710.30907-3-jonathan.came...@huawei.com
Signed-off-by: Peter Xu 
---
 system/physmem.c | 40 
 1 file changed, 20 insertions(+), 20 deletions(-)

diff --git a/system/physmem.c b/system/physmem.c
index e92bed50a6..e35aa29343 100644
--- a/system/physmem.c
+++ b/system/physmem.c
@@ -2688,10 +2688,7 @@ static MemTxResult flatview_write_continue(FlatView *fv, hwaddr addr,
hwaddr len, hwaddr mr_addr,
hwaddr l, MemoryRegion *mr)
 {
-uint8_t *ram_ptr;
-uint64_t val;
 MemTxResult result = MEMTX_OK;
-bool release_lock = false;
 const uint8_t *buf = ptr;
 
 for (;;) {
@@ -2699,7 +2696,9 @@ static MemTxResult flatview_write_continue(FlatView *fv, 
hwaddr addr,
 result |= MEMTX_ACCESS_ERROR;
 /* Keep going. */
 } else if (!memory_access_is_direct(mr, true)) {
-release_lock |= prepare_mmio_access(mr);
+uint64_t val;
+bool release_lock = prepare_mmio_access(mr);
+
 l = memory_access_size(mr, l, mr_addr);
 /* XXX: could force current_cpu to NULL to avoid
potential bugs */
@@ -2717,18 +2716,21 @@ static MemTxResult flatview_write_continue(FlatView 
*fv, hwaddr addr,
 val = ldn_he_p(buf, l);
 result |= memory_region_dispatch_write(mr, mr_addr, val,
size_memop(l), attrs);
+if (release_lock) {
+bql_unlock();
+}
+
+
 } else {
 /* RAM case */
-ram_ptr = qemu_ram_ptr_length(mr->ram_block, mr_addr, &l, false);
+
+uint8_t *ram_ptr = qemu_ram_ptr_length(mr->ram_block, mr_addr, &l,
+   false);
+
 memmove(ram_ptr, buf, l);
 invalidate_and_set_dirty(mr, mr_addr, l);
 }
 
-if (release_lock) {
-bql_unlock();
-release_lock = false;
-}
-
 len -= l;
 buf += l;
 addr += l;
@@ -2767,10 +2769,7 @@ MemTxResult flatview_read_continue(FlatView *fv, hwaddr 
addr,
hwaddr len, hwaddr mr_addr, hwaddr l,
MemoryRegion *mr)
 {
-uint8_t *ram_ptr;
-uint64_t val;
 MemTxResult result = MEMTX_OK;
-bool release_lock = false;
 uint8_t *buf = ptr;
 
 fuzz_dma_read_cb(addr, len, mr);
@@ -2780,7 +2779,9 @@ MemTxResult flatview_read_continue(FlatView *fv, hwaddr 
addr,
 /* Keep going. */
 } else if (!memory_access_is_direct(mr, false)) {
 /* I/O case */
-release_lock |= prepare_mmio_access(mr);
+uint64_t val;
+bool release_lock = prepare_mmio_access(mr);
+
 l = memory_access_size(mr, l, mr_addr);
 result |= memory_region_dispatch_read(mr, mr_addr, &val,
   size_memop(l), attrs);
@@ -2796,17 +2797,16 @@ MemTxResult flatview_read_continue(FlatView *fv, hwaddr 
addr,
(l == 8 && len >= 8));
 #endif
 stn_he_p(buf, l, val);
+if (release_lock) {
+bql_unlock();
+}
 } else {
 /* RAM case */
-ram_ptr = qemu_ram_ptr_length(mr->ram_block, mr_addr, &l, false);
+uint8_t *ram_ptr = qemu_ram_ptr_length(mr->ram_block, mr_addr, &l,
+   false);
 memcpy(buf, ram_ptr, l);
 }
 
-if (release_lock) {
-bql_unlock();
-release_lock = false;
-}
-
 len -= l;
 buf += l;
 addr += l;
-- 
2.44.0

[PULL 01/34] migration: Don't serialize devices in qemu_savevm_state_iterate()

From: Avihai Horon 

Commit 90697be8896c ("live migration: Serialize vmstate saving in stage
2") introduced device serialization in qemu_savevm_state_iterate(). The
rationale behind it was to first complete migration of slower changing
block devices and only then migrate the RAM, to avoid sending fast
changing RAM pages over and over.

This commit was added a long time ago, and while it was useful back
then, it is not the case anymore:
1. Block migration is deprecated, see commit 66db46ca83b8 ("migration:
   Deprecate block migration").
2. Today there are other iterative devices besides RAM and block, such
   as VFIO, which are registered for migration after RAM. With current
   serialization behavior, a fast changing device can block other
   devices from sending their data, which may prevent migration from
   converging in some cases.

The issue described in item 2 was observed in several VFIO migration
scenarios with switchover-ack capability enabled, where some workload on
the VM prevented RAM from ever reaching a hard zero, thus blocking VFIO
initial pre-copy data from being sent. Hence, destination could not ack
switchover and migration could not converge.

Fix that by not serializing iterative devices in
qemu_savevm_state_iterate().

Note that this still doesn't fully prevent device starvation. As
correctly pointed out by Peter [1], a fast changing device might
constantly consume all allocated bandwidth and block the following
devices. However, this scenario is more likely to happen only if
max-bandwidth is low.

[1] https://lore.kernel.org/qemu-devel/Zd6iw9dBhW6wKNxx@x1n/

Signed-off-by: Avihai Horon 
Reviewed-by: Fabiano Rosas 
Link: https://lore.kernel.org/r/20240304105339.20713-2-avih...@nvidia.com
Signed-off-by: Peter Xu 
---
 migration/savevm.c | 15 ++-
 1 file changed, 6 insertions(+), 9 deletions(-)

diff --git a/migration/savevm.c b/migration/savevm.c
index dc1fb9c0d3..e84b26e1c8 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1390,7 +1390,8 @@ int qemu_savevm_state_resume_prepare(MigrationState *s)
 int qemu_savevm_state_iterate(QEMUFile *f, bool postcopy)
 {
 SaveStateEntry *se;
-int ret = 1;
+bool all_finished = true;
+int ret;
 
 trace_savevm_state_iterate();
 QTAILQ_FOREACH(se, &savevm_state.handlers, entry) {
@@ -1431,16 +1432,12 @@ int qemu_savevm_state_iterate(QEMUFile *f, bool 
postcopy)
  "%d(%s): %d",
  se->section_id, se->idstr, ret);
 qemu_file_set_error(f, ret);
-}
-if (ret <= 0) {
-/* Do not proceed to the next vmstate before this one reported
-   completion of the current stage. This serializes the migration
-   and reduces the probability that a faster changing state is
-   synchronized over and over again. */
-break;
+return ret;
+} else if (!ret) {
+all_finished = false;
 }
 }
-return ret;
+return all_finished;
 }
 
 static bool should_send_vmdesc(void)
-- 
2.44.0
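
The iteration change in the patch above can be pictured with a small standalone sketch. It is not QEMU code: Handler, send_one_chunk() and iterate_all() are hypothetical stand-ins for SaveStateEntry, a save_live_iterate hook and qemu_savevm_state_iterate(). It only shows the control flow, assuming the same return convention (negative for error, 0 for more data, positive for finished).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a SaveStateEntry with an iterate hook. */
typedef struct Handler {
    int (*iterate)(struct Handler *h); /* <0 error, 0 more data, >0 finished */
    int remaining;                     /* chunks of data still to send */
    int calls;                         /* how many times iterate() ran */
} Handler;

/* Sends one chunk per call and reports completion once nothing is left. */
static int send_one_chunk(Handler *h)
{
    h->calls++;
    if (h->remaining > 0) {
        h->remaining--;
    }
    return h->remaining == 0;
}

/*
 * Visit every handler on each pass instead of stopping at the first one
 * that still has data. Returns <0 on error, 1 if all handlers reported
 * completion, 0 otherwise.
 */
static int iterate_all(Handler *handlers, size_t n)
{
    bool all_finished = true;

    for (size_t i = 0; i < n; i++) {
        int ret;

        assert(handlers[i].iterate != NULL);
        ret = handlers[i].iterate(&handlers[i]);
        if (ret < 0) {
            return ret;
        } else if (!ret) {
            all_finished = false;
        }
    }
    return all_finished;
}
```

With two handlers where the first needs three passes and the second only one, the second handler still runs on the first pass. Under the old serialized behavior it would have waited until the first handler reported completion.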

[PULL 23/34] migration: migration_is_device

From: Steve Sistare 

Define and export migration_is_device to eliminate a dependency
on MigrationState.

Signed-off-by: Steve Sistare 
Link: https://lore.kernel.org/r/1710179338-294359-8-git-send-email-steven.sist...@oracle.com
Signed-off-by: Peter Xu 
---
 include/migration/misc.h | 1 +
 hw/vfio/common.c | 4 +---
 migration/migration.c| 7 +++
 3 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/include/migration/misc.h b/include/migration/misc.h
index c4b5416357..28cfaed2c7 100644
--- a/include/migration/misc.h
+++ b/include/migration/misc.h
@@ -61,6 +61,7 @@ void migration_object_init(void);
 void migration_shutdown(void);
 bool migration_is_idle(void);
 bool migration_is_active(void);
+bool migration_is_device(void);
 bool migration_thread_is_self(void);
 bool migration_is_setup_or_active(void);
 bool migrate_mode_is_cpr(MigrationState *);
diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 2dbbf62e15..de010680ff 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -180,10 +180,8 @@ bool vfio_device_state_is_precopy(VFIODevice *vbasedev)
 static bool vfio_devices_all_dirty_tracking(VFIOContainerBase *bcontainer)
 {
 VFIODevice *vbasedev;
-MigrationState *ms = migrate_get_current();
 
-if (!migration_is_active() &&
-ms->state != MIGRATION_STATUS_DEVICE) {
+if (!migration_is_active() && !migration_is_device()) {
 return false;
 }
 
diff --git a/migration/migration.c b/migration/migration.c
index afe72af0b1..db1e627848 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1647,6 +1647,13 @@ bool migration_is_active(void)
 s->state == MIGRATION_STATUS_POSTCOPY_ACTIVE);
 }
 
+bool migration_is_device(void)
+{
+MigrationState *s = current_migration;
+
+return s->state == MIGRATION_STATUS_DEVICE;
+}
+
 bool migration_thread_is_self(void)
 {
 MigrationState *s = current_migration;
-- 
2.44.0
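
The pattern behind this patch (and the related migration_is_active change in this series) is a predicate that reads the current state internally, so callers need no access to the state struct. A minimal sketch follows, with every Demo/demo_ name being hypothetical and the enum values purely illustrative:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical subset of the migration states; values are illustrative. */
typedef enum {
    DEMO_STATUS_NONE,
    DEMO_STATUS_ACTIVE,
    DEMO_STATUS_POSTCOPY_ACTIVE,
    DEMO_STATUS_DEVICE,
} DemoStatus;

/* Private to the "migration core": clients never see this struct. */
typedef struct {
    DemoStatus state;
} DemoMigrationState;

static DemoMigrationState demo_current = { DEMO_STATUS_NONE };

/* Core-side setter, only here so the sketch can change state. */
static void demo_set_state(DemoStatus s)
{
    assert(s <= DEMO_STATUS_DEVICE);
    demo_current.state = s;
}

/* Public predicates: no state pointer appears in the signature. */
static bool demo_migration_is_active(void)
{
    return demo_current.state == DEMO_STATUS_ACTIVE ||
           demo_current.state == DEMO_STATUS_POSTCOPY_ACTIVE;
}

static bool demo_migration_is_device(void)
{
    return demo_current.state == DEMO_STATUS_DEVICE;
}

/* Client-side check with the same shape as the VFIO caller in the patch. */
static bool demo_dirty_tracking_possible(void)
{
    return demo_migration_is_active() || demo_migration_is_device();
}
```

The client function compiles against the two predicates alone, which is what lets a caller drop its dependency on the private header.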

[PULL 17/34] migration: remove migration.h references

From: Steve Sistare 

Remove migration.h from files that no longer need it due to
previous commits.

Signed-off-by: Steve Sistare 
Link: https://lore.kernel.org/r/1710179338-294359-2-git-send-email-steven.sist...@oracle.com
Signed-off-by: Peter Xu 
---
 hw/vfio/container.c| 1 -
 hw/virtio/vhost-user.c | 1 -
 hw/virtio/virtio-balloon.c | 1 -
 system/qdev-monitor.c  | 1 -
 target/loongarch/kvm/kvm.c | 1 -
 tests/unit/test-vmstate.c  | 1 -
 6 files changed, 6 deletions(-)

diff --git a/hw/vfio/container.c b/hw/vfio/container.c
index bd25b9fbad..ff081a12c2 100644
--- a/hw/vfio/container.c
+++ b/hw/vfio/container.c
@@ -32,7 +32,6 @@
 #include "sysemu/reset.h"
 #include "trace.h"
 #include "qapi/error.h"
-#include "migration/migration.h"
 #include "pci.h"
 
 VFIOGroupList vfio_group_list =
diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index a1eea8547e..1af8621481 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -26,7 +26,6 @@
 #include "qemu/sockets.h"
 #include "sysemu/runstate.h"
 #include "sysemu/cryptodev.h"
-#include "migration/migration.h"
 #include "migration/postcopy-ram.h"
 #include "trace.h"
 #include "exec/ramblock.h"
diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
index a59ff172bd..609e39a821 100644
--- a/hw/virtio/virtio-balloon.c
+++ b/hw/virtio/virtio-balloon.c
@@ -31,7 +31,6 @@
 #include "trace.h"
 #include "qemu/error-report.h"
 #include "migration/misc.h"
-#include "migration/migration.h"
 
 #include "hw/virtio/virtio-bus.h"
 #include "hw/virtio/virtio-access.h"
diff --git a/system/qdev-monitor.c b/system/qdev-monitor.c
index 09e07cab9b..c1243891c3 100644
--- a/system/qdev-monitor.c
+++ b/system/qdev-monitor.c
@@ -38,7 +38,6 @@
 #include "qemu/option_int.h"
 #include "sysemu/block-backend.h"
 #include "migration/misc.h"
-#include "migration/migration.h"
 #include "qemu/cutils.h"
 #include "hw/qdev-properties.h"
 #include "hw/clock.h"
diff --git a/target/loongarch/kvm/kvm.c b/target/loongarch/kvm/kvm.c
index c19978a970..11a69a3b4e 100644
--- a/target/loongarch/kvm/kvm.c
+++ b/target/loongarch/kvm/kvm.c
@@ -22,7 +22,6 @@
 #include "hw/irq.h"
 #include "qemu/log.h"
 #include "hw/loader.h"
-#include "migration/migration.h"
 #include "sysemu/runstate.h"
 #include "cpu-csr.h"
 #include "kvm_loongarch.h"
diff --git a/tests/unit/test-vmstate.c b/tests/unit/test-vmstate.c
index c4f9faa273..63f28f26f4 100644
--- a/tests/unit/test-vmstate.c
+++ b/tests/unit/test-vmstate.c
@@ -24,7 +24,6 @@
 
 #include "qemu/osdep.h"
 
-#include "../migration/migration.h"
 #include "migration/vmstate.h"
 #include "migration/qemu-file-types.h"
 #include "../migration/qemu-file.h"
-- 
2.44.0

[PULL 00/34] Migration 20240311 patches

From: Peter Xu 

The following changes since commit 7489f7f3f81dcb776df8c1b9a9db281fc21bf05f:

  Merge tag 'hw-misc-20240309' of https://github.com/philmd/qemu into staging 
(2024-03-09 20:12:21 +)

are available in the Git repository at:

  https://gitlab.com/peterx/qemu.git tags/migration-20240311-pull-request

for you to fetch changes up to 1815338df00fd0a3fe25085564c6966f74c8f43d:

  migration/multifd: Add new migration test cases for legacy zero page 
checking. (2024-03-11 16:57:09 -0400)


Migration pull request

- Avihai's fix to allow vmstate iterators to not starve for VFIO
- Maksim's fix on additional check on precopy load error
- Fabiano's fix on fdatasync() hang in mapped-ram
- Jonathan's fix on vring cached access over MMIO regions
- Cedric's cleanup patches 1-4 out of his error report series
- Yu's fix for RDMA migration (which used to be broken even for 8.2)
- Anthony's small cleanup/fix on err message
- Steve's patches to privatize migration.h
- Xiang's patchset to enable zero page detections in multifd threads



Anthony PERARD (1):
  migration: Fix format in error message

Avihai Horon (3):
  migration: Don't serialize devices in qemu_savevm_state_iterate()
  vfio/migration: Refactor vfio_save_state() return value
  vfio/migration: Add a note about migration rate limiting

Cédric Le Goater (4):
  migration: Report error when shutdown fails
  migration: Remove SaveStateHandler and LoadStateHandler typedefs
  migration: Add documentation for SaveVMHandlers
  migration: Do not call PRECOPY_NOTIFY_SETUP notifiers in case of error

Fabiano Rosas (3):
  migration/multifd: Don't fsync when closing QIOChannelFile
  migration/multifd: Allow zero pages in file migration
  migration/multifd: Allow clearing of the file_bmap from multifd

Hao Xiang (5):
  migration/multifd: Add new migration option zero-page-detection.
  migration/multifd: Implement zero page transmission on the multifd
thread.
  migration/multifd: Implement ram_save_target_page_multifd to handle
multifd version of MigrationOps::ram_save_target_page.
  migration/multifd: Enable multifd zero page checking by default.
  migration/multifd: Add new migration test cases for legacy zero page
checking.

Jonathan Cameron (4):
  physmem: Rename addr1 to more informative mr_addr in
flatview_read/write() and similar
  physmem: Reduce local variable scope in flatview_read/write_continue()
  physmem: Factor out body of flatview_read/write_continue() loop
  physmem: Fix wrong address in large
address_space_read/write_cached_slow()

Maksim Davydov (1):
  migration/ram: add additional check

Steve Sistare (12):
  migration: export fewer options
  migration: remove migration.h references
  migration: export migration_is_setup_or_active
  migration: export migration_is_active
  migration: export migration_is_running
  migration: export vcpu_dirty_limit_period
  migration: migration_thread_is_self
  migration: migration_is_device
  migration: migration_file_set_error
  migration: privatize colo interfaces
  migration: delete unused accessors
  migration: purge MigrationState from public interface

Yu Zhang (1):
  migration/rdma: Fix a memory issue for migration

 docs/devel/migration/main.rst   |   3 +-
 qapi/migration.json |  38 +++-
 include/hw/qdev-properties-system.h |   4 +
 include/migration/client-options.h  |  25 +++
 include/migration/misc.h|  18 +-
 include/migration/register.h| 267 +---
 include/qemu/typedefs.h |   2 -
 migration/migration.h   |   7 +-
 migration/multifd.h |  23 ++-
 migration/options.h |   7 +-
 migration/ram.h |   3 +-
 hw/core/machine.c   |   4 +-
 hw/core/qdev-properties-system.c|  10 ++
 hw/vfio/common.c|  17 +-
 hw/vfio/container.c |   1 -
 hw/vfio/migration.c |  24 ++-
 hw/virtio/vhost-user.c  |   1 -
 hw/virtio/virtio-balloon.c  |   2 -
 io/channel-file.c   |   5 -
 migration/colo.c|  17 +-
 migration/file.c|   4 +-
 migration/migration-hmp-cmds.c  |   9 +
 migration/migration.c   |  67 ---
 migration/multifd-zero-page.c   |  87 +
 migration/multifd-zlib.c|  21 ++-
 migration/multifd-zstd.c|  20 ++-
 migration/multifd.c | 120 ++---
 migration/options.c |  32 +++-
 migration/qemu-file.c   |   5 +-
 migration/ram.c |  62 +--
 migration/rdma.c|   2 +-
 migration/savevm.c  |  23 +--
 net/colo-compare.c  |   3 +-
 net/vhost-vdpa.c|   3 +-
 stubs/colo.c|   1 -
 system

[PULL 08/34] migration: Do not call PRECOPY_NOTIFY_SETUP notifiers in case of error

From: Cédric Le Goater 

When commit bd2270608fa0 ("migration/ram.c: add a notifier chain for
precopy") added PRECOPY_NOTIFY_SETUP notifiers at the end of
qemu_savevm_state_setup(), it didn't take into account a possible
error in the loop calling vmstate_save() or .save_setup() handlers.

Check ret value before calling the notifiers.

Reviewed-by: Peter Xu 
Signed-off-by: Cédric Le Goater 
Link: https://lore.kernel.org/r/20240304122844.1888308-10-...@redhat.com
Signed-off-by: Peter Xu 
---
 migration/savevm.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/migration/savevm.c b/migration/savevm.c
index e84b26e1c8..76b57a9888 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1317,7 +1317,7 @@ void qemu_savevm_state_setup(QEMUFile *f)
 MigrationState *ms = migrate_get_current();
 SaveStateEntry *se;
 Error *local_err = NULL;
-int ret;
+int ret = 0;
 
 json_writer_int64(ms->vmdesc, "page_size", qemu_target_page_size());
 json_writer_start_array(ms->vmdesc, "devices");
@@ -1351,6 +1351,10 @@ void qemu_savevm_state_setup(QEMUFile *f)
 }
 }
 
+if (ret) {
+return;
+}
+
 if (precopy_notify(PRECOPY_NOTIFY_SETUP, &local_err)) {
 error_report_err(local_err);
 }
-- 
2.44.0
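
The fix above amounts to "initialise the result, stop at the first failing hook, and only notify on success". A standalone sketch of that control flow follows. The names (SetupFn, demo_state_setup(), demo_notify_setup()) are hypothetical, and unlike qemu_savevm_state_setup() in this patch the sketch returns the error so it can be checked.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*SetupFn)(void);

static int notifier_calls; /* counts how often the notifier ran */

static void demo_notify_setup(void) { notifier_calls++; }
static int setup_ok(void) { return 0; }
static int setup_fail(void) { return -5; }

/* Run each setup hook; skip the notifier if any hook failed. */
static int demo_state_setup(const SetupFn *hooks, size_t n)
{
    int ret = 0; /* initialised so an empty hook list counts as success */

    for (size_t i = 0; i < n; i++) {
        assert(hooks[i] != NULL);
        ret = hooks[i]();
        if (ret < 0) {
            break;
        }
    }

    if (ret) {
        return ret;
    }

    demo_notify_setup();
    return 0;
}
```

The "int ret = 0" initialisation matters for the same reason as in the patch: with no hooks to run, the check after the loop would otherwise read an uninitialised value.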

[PULL 16/34] migration: export fewer options

From: Steve Sistare 

A small number of migration options are accessed by migration clients,
but to see them clients must include all of options.h, which is mostly
for migration core code.  migrate_mode() in particular will be needed by
multiple clients.

Refactor the option declarations so clients can see the necessary few via
misc.h, which already exports a portion of the client API.

Signed-off-by: Steve Sistare 
Link: https://lore.kernel.org/r/1710179319-294320-1-git-send-email-steven.sist...@oracle.com
Signed-off-by: Peter Xu 
---
 include/migration/client-options.h | 24 
 include/migration/misc.h   |  1 +
 migration/options.h|  6 +-
 hw/vfio/migration.c|  1 -
 hw/virtio/virtio-balloon.c |  1 -
 system/dirtylimit.c|  1 -
 6 files changed, 26 insertions(+), 8 deletions(-)
 create mode 100644 include/migration/client-options.h

diff --git a/include/migration/client-options.h 
b/include/migration/client-options.h
new file mode 100644
index 00..887fea1565
--- /dev/null
+++ b/include/migration/client-options.h
@@ -0,0 +1,24 @@
+/*
+ * QEMU public migration capabilities
+ *
+ * Copyright (c) 2012-2023 Red Hat Inc
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#ifndef QEMU_MIGRATION_CLIENT_OPTIONS_H
+#define QEMU_MIGRATION_CLIENT_OPTIONS_H
+
+/* capabilities */
+
+bool migrate_background_snapshot(void);
+bool migrate_dirty_limit(void);
+bool migrate_postcopy_ram(void);
+bool migrate_switchover_ack(void);
+
+/* parameters */
+
+MigMode migrate_mode(void);
+
+#endif
diff --git a/include/migration/misc.h b/include/migration/misc.h
index 5d1aa593ed..4c226a40bb 100644
--- a/include/migration/misc.h
+++ b/include/migration/misc.h
@@ -17,6 +17,7 @@
 #include "qemu/notify.h"
 #include "qapi/qapi-types-migration.h"
 #include "qapi/qapi-types-net.h"
+#include "migration/client-options.h"
 
 /* migration/ram.c */
 
diff --git a/migration/options.h b/migration/options.h
index 6ddd8dad9b..b6b69c2bb7 100644
--- a/migration/options.h
+++ b/migration/options.h
@@ -16,6 +16,7 @@
 
 #include "hw/qdev-properties.h"
 #include "hw/qdev-properties-system.h"
+#include "migration/client-options.h"
 
 /* migration properties */
 
@@ -24,12 +25,10 @@ extern Property migration_properties[];
 /* capabilities */
 
 bool migrate_auto_converge(void);
-bool migrate_background_snapshot(void);
 bool migrate_block(void);
 bool migrate_colo(void);
 bool migrate_compress(void);
 bool migrate_dirty_bitmaps(void);
-bool migrate_dirty_limit(void);
 bool migrate_events(void);
 bool migrate_mapped_ram(void);
 bool migrate_ignore_shared(void);
@@ -38,11 +37,9 @@ bool migrate_multifd(void);
 bool migrate_pause_before_switchover(void);
 bool migrate_postcopy_blocktime(void);
 bool migrate_postcopy_preempt(void);
-bool migrate_postcopy_ram(void);
 bool migrate_rdma_pin_all(void);
 bool migrate_release_ram(void);
 bool migrate_return_path(void);
-bool migrate_switchover_ack(void);
 bool migrate_validate_uuid(void);
 bool migrate_xbzrle(void);
 bool migrate_zero_blocks(void);
@@ -84,7 +81,6 @@ uint8_t migrate_max_cpu_throttle(void);
 uint64_t migrate_max_bandwidth(void);
 uint64_t migrate_avail_switchover_bandwidth(void);
 uint64_t migrate_max_postcopy_bandwidth(void);
-MigMode migrate_mode(void);
 int migrate_multifd_channels(void);
 MultiFDCompression migrate_multifd_compression(void);
 int migrate_multifd_zlib_level(void);
diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
index f82dcabc49..49c0016add 100644
--- a/hw/vfio/migration.c
+++ b/hw/vfio/migration.c
@@ -18,7 +18,6 @@
 #include "sysemu/runstate.h"
 #include "hw/vfio/vfio-common.h"
 #include "migration/migration.h"
-#include "migration/options.h"
 #include "migration/savevm.h"
 #include "migration/vmstate.h"
 #include "migration/qemu-file.h"
diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
index 89f853fa9e..a59ff172bd 100644
--- a/hw/virtio/virtio-balloon.c
+++ b/hw/virtio/virtio-balloon.c
@@ -32,7 +32,6 @@
 #include "qemu/error-report.h"
 #include "migration/misc.h"
 #include "migration/migration.h"
-#include "migration/options.h"
 
 #include "hw/virtio/virtio-bus.h"
 #include "hw/virtio/virtio-access.h"
diff --git a/system/dirtylimit.c b/system/dirtylimit.c
index b5607eb8c2..774ff44f79 100644
--- a/system/dirtylimit.c
+++ b/system/dirtylimit.c
@@ -26,7 +26,6 @@
 #include "trace.h"
 #include "migration/misc.h"
 #include "migration/migration.h"
-#include "migration/options.h"
 
 /*
  * Dirtylimit stop working if dirty page rate error
-- 
2.44.0

[PULL 03/34] vfio/migration: Add a note about migration rate limiting

From: Avihai Horon 

VFIO migration buffer size is currently limited to 1MB. Therefore, there
is no need to check whether the migration rate is exceeded, as in the worst
case it will be exceeded by only 1MB.

However, if the buffer size is later changed to a bigger value,
vfio_save_iterate() should enforce migration rate (similar to migration
RAM code).

Add a note about this in vfio_save_iterate() to serve as a reminder.

Suggested-by: Peter Xu 
Signed-off-by: Avihai Horon 
Reviewed-by: Fabiano Rosas 
Link: https://lore.kernel.org/r/20240304105339.20713-4-avih...@nvidia.com
Signed-off-by: Peter Xu 
---
 hw/vfio/migration.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
index 0af783a589..f82dcabc49 100644
--- a/hw/vfio/migration.c
+++ b/hw/vfio/migration.c
@@ -505,6 +505,12 @@ static bool vfio_is_active_iterate(void *opaque)
 return vfio_device_state_is_precopy(vbasedev);
 }
 
+/*
+ * Note about migration rate limiting: VFIO migration buffer size is currently
+ * limited to 1MB, so there is no need to check if migration rate exceeded (as
+ * in the worst case it will exceed by 1MB). However, if the buffer size is
+ * later changed to a bigger value, migration rate should be enforced here.
+ */
 static int vfio_save_iterate(QEMUFile *f, void *opaque)
 {
 VFIODevice *vbasedev = opaque;
-- 
2.44.0
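
The bound stated in the note can be checked with a simplified model. This is not QEMU's rate limiter: DemoRate and demo_send_until_limited() are hypothetical, and the model assumes the limit is only checked between whole chunks, so the overshoot can never exceed one chunk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_CHUNK_SIZE (1024 * 1024) /* 1MB, the buffer size named in the note */

typedef struct {
    uint64_t limit; /* bytes allowed in the current period */
    uint64_t sent;  /* bytes sent so far in the current period */
} DemoRate;

static bool demo_rate_exceeded(const DemoRate *r)
{
    return r->sent > r->limit;
}

/*
 * Sends whole chunks while the budget is not exceeded and returns the
 * overshoot in bytes. The check happens only between chunks.
 */
static uint64_t demo_send_until_limited(DemoRate *r, uint64_t chunk)
{
    assert(chunk > 0);
    while (!demo_rate_exceeded(r)) {
        r->sent += chunk;
    }
    return r->sent - r->limit;
}
```

The same reasoning shows why a larger buffer would change the picture: the overshoot bound grows with the chunk size, which is the case the note asks future changes to handle.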

[PULL 04/34] migration/ram: add additional check

From: Maksim Davydov 

If a migration stream is broken, reading the address and flags can return
zero. Thus, an irrelevant flag error will be returned instead of EIO.
This can be fixed by an additional check after the read.

Signed-off-by: Maksim Davydov 
Link: https://lore.kernel.org/r/20240304144203.158477-1-davydov-...@yandex-team.ru
Signed-off-by: Peter Xu 
---
 migration/ram.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/migration/ram.c b/migration/ram.c
index 003c28e133..2cd936d9ce 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -4214,6 +4214,12 @@ static int ram_load_precopy(QEMUFile *f)
 i++;
 
 addr = qemu_get_be64(f);
+ret = qemu_file_get_error(f);
+if (ret) {
+error_report("Getting RAM address failed");
+break;
+}
+
 flags = addr & ~TARGET_PAGE_MASK;
 addr &= TARGET_PAGE_MASK;
 
-- 
2.44.0
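
Why the extra check is needed can be shown with a toy stream reader. DemoStream, demo_get_be64() and demo_load_header() are hypothetical stand-ins (not the QEMUFile API), written under the assumption that a failed read yields 0 and records a sticky error, as the commit message describes. The 4KiB page mask is also an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a QEMUFile: a byte buffer plus a sticky error. */
typedef struct {
    const uint8_t *data;
    size_t len;
    size_t pos;
    int error; /* 0 or a negative errno value */
} DemoStream;

/* Reads a big-endian 64-bit value; on a short read returns 0 and sets error. */
static uint64_t demo_get_be64(DemoStream *s)
{
    uint64_t v = 0;

    if (s->error || s->len - s->pos < 8) {
        s->error = s->error ? s->error : -EIO;
        return 0;
    }
    for (int i = 0; i < 8; i++) {
        v = (v << 8) | s->data[s->pos++];
    }
    return v;
}

#define DEMO_PAGE_MASK (~(uint64_t)0xfff) /* assumes 4KiB pages */

/* Returns 0 on success or a negative errno if the stream is broken. */
static int demo_load_header(DemoStream *s, uint64_t *addr, uint64_t *flags)
{
    uint64_t raw;

    assert(addr != NULL && flags != NULL);
    raw = demo_get_be64(s);

    /* Check the stream before interpreting raw: 0 may mean "read failed". */
    if (s->error) {
        return s->error;
    }
    *flags = raw & ~DEMO_PAGE_MASK;
    *addr = raw & DEMO_PAGE_MASK;
    return 0;
}
```

Without the error check, a truncated stream would produce flags == 0, and the caller would report an unknown-flag problem rather than the I/O error that actually occurred.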

[PULL 02/34] vfio/migration: Refactor vfio_save_state() return value

From: Avihai Horon 

Currently, vfio_save_state() returns 1 regardless of whether there is
more data to send or not. This was done to prevent a fast changing VFIO
device from potentially blocking other devices from sending their data,
as qemu_savevm_state_iterate() serialized devices.

Now that qemu_savevm_state_iterate() no longer serializes devices, there
is no need for that.

Refactor vfio_save_state() to return 0 if more data is available and 1
if no more data is available.

Signed-off-by: Avihai Horon 
Reviewed-by: Fabiano Rosas 
Link: https://lore.kernel.org/r/20240304105339.20713-3-avih...@nvidia.com
Signed-off-by: Peter Xu 
---
 hw/vfio/migration.c | 6 +-
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
index 50140eda87..0af783a589 100644
--- a/hw/vfio/migration.c
+++ b/hw/vfio/migration.c
@@ -529,11 +529,7 @@ static int vfio_save_iterate(QEMUFile *f, void *opaque)
 trace_vfio_save_iterate(vbasedev->name, migration->precopy_init_size,
 migration->precopy_dirty_size);
 
-/*
- * A VFIO device's pre-copy dirty_bytes is not guaranteed to reach zero.
- * Return 1 so following handlers will not be potentially blocked.
- */
-return 1;
+return !migration->precopy_init_size && !migration->precopy_dirty_size;
 }
 
 static int vfio_save_complete_precopy(QEMUFile *f, void *opaque)
-- 
2.44.0
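
The new return convention can be sketched outside QEMU as follows. DemoPrecopy and demo_save_iterate() are hypothetical; only the two counter names and the final return expression are taken from the patch. How the counters are drained here is an assumption made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical subset of the per-device pre-copy counters. */
typedef struct {
    uint64_t precopy_init_size;
    uint64_t precopy_dirty_size;
} DemoPrecopy;

/*
 * Consumes up to chunk bytes, initial data first, then dirty data.
 * Returns 1 when no more data is pending and 0 otherwise.
 */
static int demo_save_iterate(DemoPrecopy *p, uint64_t chunk)
{
    uint64_t *src = p->precopy_init_size ? &p->precopy_init_size
                                         : &p->precopy_dirty_size;
    uint64_t n;

    assert(chunk > 0);
    n = *src < chunk ? *src : chunk;
    *src -= n;

    return !p->precopy_init_size && !p->precopy_dirty_size;
}
```

Returning 0 while data is pending is safe only because the caller no longer stops at the first handler that returns 0.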

[PULL 19/34] migration: export migration_is_active

From: Steve Sistare 

Delete the MigrationState parameter from migration_is_active so it
can be exported and used without including migration.h.

Signed-off-by: Steve Sistare 
Link: https://lore.kernel.org/r/1710179338-294359-4-git-send-email-steven.sist...@oracle.com
Signed-off-by: Peter Xu 
---
 include/migration/misc.h |  2 +-
 hw/vfio/common.c |  4 ++--
 migration/migration.c| 10 ++
 system/dirtylimit.c  |  2 +-
 4 files changed, 10 insertions(+), 8 deletions(-)

diff --git a/include/migration/misc.h b/include/migration/misc.h
index 79cff6224e..e1f1bf853e 100644
--- a/include/migration/misc.h
+++ b/include/migration/misc.h
@@ -60,7 +60,7 @@ void dump_vmstate_json_to_file(FILE *out_fp);
 void migration_object_init(void);
 void migration_shutdown(void);
 bool migration_is_idle(void);
-bool migration_is_active(MigrationState *);
+bool migration_is_active(void);
 bool migration_is_setup_or_active(void);
 bool migrate_mode_is_cpr(MigrationState *);
 
diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 896eab8103..2dbbf62e15 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -182,7 +182,7 @@ static bool 
vfio_devices_all_dirty_tracking(VFIOContainerBase *bcontainer)
 VFIODevice *vbasedev;
 MigrationState *ms = migrate_get_current();
 
-if (ms->state != MIGRATION_STATUS_ACTIVE &&
+if (!migration_is_active() &&
 ms->state != MIGRATION_STATUS_DEVICE) {
 return false;
 }
@@ -225,7 +225,7 @@ vfio_devices_all_running_and_mig_active(const 
VFIOContainerBase *bcontainer)
 {
 VFIODevice *vbasedev;
 
-if (!migration_is_active(migrate_get_current())) {
+if (!migration_is_active()) {
 return false;
 }
 
diff --git a/migration/migration.c b/migration/migration.c
index af21403bad..17859cbaee 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1406,7 +1406,7 @@ static void migrate_fd_cleanup(MigrationState *s)
 qemu_fclose(tmp);
 }
 
-assert(!migration_is_active(s));
+assert(!migration_is_active());
 
 if (s->state == MIGRATION_STATUS_CANCELLING) {
 migrate_set_state(&s->state, MIGRATION_STATUS_CANCELLING,
@@ -1637,8 +1637,10 @@ bool migration_is_idle(void)
 return false;
 }
 
-bool migration_is_active(MigrationState *s)
+bool migration_is_active(void)
 {
+MigrationState *s = current_migration;
+
 return (s->state == MIGRATION_STATUS_ACTIVE ||
 s->state == MIGRATION_STATUS_POSTCOPY_ACTIVE);
 }
@@ -3461,7 +3463,7 @@ static void *migration_thread(void *opaque)
 
 trace_migration_thread_setup_complete();
 
-while (migration_is_active(s)) {
+while (migration_is_active()) {
 if (urgent || !migration_rate_exceeded(s->to_dst_file)) {
 MigIterateState iter_state = migration_iteration_run(s);
 if (iter_state == MIG_ITERATE_SKIP) {
@@ -3607,7 +3609,7 @@ static void *bg_migration_thread(void *opaque)
 migration_bh_schedule(bg_migration_vm_start_bh, s);
 bql_unlock();
 
-while (migration_is_active(s)) {
+while (migration_is_active()) {
 MigIterateState iter_state = bg_migration_iteration_run(s);
 if (iter_state == MIG_ITERATE_SKIP) {
 continue;
diff --git a/system/dirtylimit.c b/system/dirtylimit.c
index 774ff44f79..051e0311c1 100644
--- a/system/dirtylimit.c
+++ b/system/dirtylimit.c
@@ -83,7 +83,7 @@ static void vcpu_dirty_rate_stat_collect(void)
 int64_t period = DIRTYLIMIT_CALC_TIME_MS;
 
 if (migrate_dirty_limit() &&
-migration_is_active(s)) {
+migration_is_active()) {
 period = s->parameters.x_vcpu_dirty_limit_period;
 }
 
-- 
2.44.0

[PULL 14/34] physmem: Fix wrong address in large address_space_read/write_cached_slow()

From: Jonathan Cameron 

If the access is bigger than the MemoryRegion supports,
flatview_read/write_continue() will attempt to update the Memory Region,
but the address passed to flatview_translate() is relative to the cache, not
to the FlatView.

On arm/virt with interleaved CXL memory emulation and virtio-blk-pci this
led to the first part of the descriptor being read from the CXL memory and
the second part from PA 0x8 which happens to be a blank region
of a flash chip and all ffs on this particular configuration.
Note this test requires the out of tree ARM support for CXL, but
the problem is more general.

Avoid this by adding new address_space_read_continue_cached()
and address_space_write_continue_cached() which share all the logic
with the flatview versions except for the MemoryRegion lookup which
is unnecessary as the MemoryRegionCache only covers one MemoryRegion.

Signed-off-by: Jonathan Cameron 
Link: https://lore.kernel.org/r/20240307153710.30907-5-jonathan.came...@huawei.com
Signed-off-by: Peter Xu 
---
 system/physmem.c | 63 +++-
 1 file changed, 57 insertions(+), 6 deletions(-)

diff --git a/system/physmem.c b/system/physmem.c
index 737869a3f5..6cfb7a80ab 100644
--- a/system/physmem.c
+++ b/system/physmem.c
@@ -3370,6 +3370,59 @@ static inline MemoryRegion 
*address_space_translate_cached(
 return section.mr;
 }
 
+/* Called within RCU critical section.  */
+static MemTxResult address_space_write_continue_cached(MemTxAttrs attrs,
+   const void *ptr,
+   hwaddr len,
+   hwaddr mr_addr,
+   hwaddr l,
+   MemoryRegion *mr)
+{
+MemTxResult result = MEMTX_OK;
+const uint8_t *buf = ptr;
+
+for (;;) {
+result |= flatview_write_continue_step(attrs, buf, len, mr_addr, &l,
+   mr);
+
+len -= l;
+buf += l;
+mr_addr += l;
+
+if (!len) {
+break;
+}
+
+l = len;
+}
+
+return result;
+}
+
+/* Called within RCU critical section.  */
+static MemTxResult address_space_read_continue_cached(MemTxAttrs attrs,
+  void *ptr, hwaddr len,
+  hwaddr mr_addr, hwaddr l,
+  MemoryRegion *mr)
+{
+MemTxResult result = MEMTX_OK;
+uint8_t *buf = ptr;
+
+for (;;) {
+result |= flatview_read_continue_step(attrs, buf, len, mr_addr, &l, mr);
+len -= l;
+buf += l;
+mr_addr += l;
+
+if (!len) {
+break;
+}
+l = len;
+}
+
+return result;
+}
+
 /* Called from RCU critical section. address_space_read_cached uses this
  * out of line function when the target is an MMIO or IOMMU region.
  */
@@ -3383,9 +3436,8 @@ address_space_read_cached_slow(MemoryRegionCache *cache, 
hwaddr addr,
 l = len;
 mr = address_space_translate_cached(cache, addr, &mr_addr, &l, false,
 MEMTXATTRS_UNSPECIFIED);
-return flatview_read_continue(cache->fv,
-  addr, MEMTXATTRS_UNSPECIFIED, buf, len,
-  mr_addr, l, mr);
+return address_space_read_continue_cached(MEMTXATTRS_UNSPECIFIED,
+  buf, len, mr_addr, l, mr);
 }
 
 /* Called from RCU critical section. address_space_write_cached uses this
@@ -3401,9 +3453,8 @@ address_space_write_cached_slow(MemoryRegionCache *cache, 
hwaddr addr,
 l = len;
 mr = address_space_translate_cached(cache, addr, &mr_addr, &l, true,
 MEMTXATTRS_UNSPECIFIED);
-return flatview_write_continue(cache->fv,
-   addr, MEMTXATTRS_UNSPECIFIED, buf, len,
-   mr_addr, l, mr);
+return address_space_write_continue_cached(MEMTXATTRS_UNSPECIFIED,
+   buf, len, mr_addr, l, mr);
 }
 
 #define ARG1_DECLMemoryRegionCache *cache
-- 
2.44.0
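
The shape of the new cached loops can be illustrated with a toy model. DemoRegion, demo_read_step() and demo_read_continue_cached() are hypothetical and far simpler than the real accessors. The point is only that, with a single fixed region, the loop advances the region-relative offset and never re-translates an address between steps.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical region: backing bytes plus the largest access it supports. */
typedef struct {
    uint8_t *bytes;
    size_t size;
    size_t max_access;
} DemoRegion;

/* One access: clamps *l to what the region supports, then copies. */
static void demo_read_step(const DemoRegion *mr, uint8_t *buf,
                           size_t mr_addr, size_t *l)
{
    if (*l > mr->max_access) {
        *l = mr->max_access;
    }
    assert(mr_addr + *l <= mr->size);
    memcpy(buf, mr->bytes + mr_addr, *l);
}

/*
 * Cached variant: the region is fixed, so only the region-relative
 * offset advances and no lookup happens between steps.
 */
static void demo_read_continue_cached(const DemoRegion *mr, uint8_t *buf,
                                      size_t len, size_t mr_addr)
{
    size_t l = len;

    assert(mr->max_access > 0);
    while (len) {
        demo_read_step(mr, buf, mr_addr, &l);
        len -= l;
        buf += l;
        mr_addr += l;
        l = len;
    }
}
```

A 10-byte read from a region limited to 4-byte accesses is split into steps of 4, 4 and 2 bytes, all against the same region.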

[PULL 13/34] physmem: Factor out body of flatview_read/write_continue() loop

From: Jonathan Cameron 

This code will be reused for the address_space_cached accessors
shortly.

Also reduce scope of result variable now we aren't directly
calling this in the loop.

Signed-off-by: Jonathan Cameron 
Reviewed-by: David Hildenbrand 
Link: https://lore.kernel.org/r/20240307153710.30907-4-jonathan.came...@huawei.com
Signed-off-by: Peter Xu 
---
 system/physmem.c | 169 +++
 1 file changed, 99 insertions(+), 70 deletions(-)

diff --git a/system/physmem.c b/system/physmem.c
index e35aa29343..737869a3f5 100644
--- a/system/physmem.c
+++ b/system/physmem.c
@@ -2681,6 +2681,56 @@ static bool flatview_access_allowed(MemoryRegion *mr, 
MemTxAttrs attrs,
 return false;
 }
 
+static MemTxResult flatview_write_continue_step(MemTxAttrs attrs,
+const uint8_t *buf,
+hwaddr len, hwaddr mr_addr,
+hwaddr *l, MemoryRegion *mr)
+{
+if (!flatview_access_allowed(mr, attrs, mr_addr, *l)) {
+return MEMTX_ACCESS_ERROR;
+}
+
+if (!memory_access_is_direct(mr, true)) {
+uint64_t val;
+MemTxResult result;
+bool release_lock = prepare_mmio_access(mr);
+
+*l = memory_access_size(mr, *l, mr_addr);
+/*
+ * XXX: could force current_cpu to NULL to avoid
+ * potential bugs
+ */
+
+/*
+ * Assure Coverity (and ourselves) that we are not going to OVERRUN
+ * the buffer by following ldn_he_p().
+ */
+#ifdef QEMU_STATIC_ANALYSIS
+assert((*l == 1 && len >= 1) ||
+   (*l == 2 && len >= 2) ||
+   (*l == 4 && len >= 4) ||
+   (*l == 8 && len >= 8));
+#endif
+val = ldn_he_p(buf, *l);
+result = memory_region_dispatch_write(mr, mr_addr, val,
+  size_memop(*l), attrs);
+if (release_lock) {
+bql_unlock();
+}
+
+return result;
+} else {
+/* RAM case */
+uint8_t *ram_ptr = qemu_ram_ptr_length(mr->ram_block, mr_addr, l,
+   false);
+
+memmove(ram_ptr, buf, *l);
+invalidate_and_set_dirty(mr, mr_addr, *l);
+
+return MEMTX_OK;
+}
+}
+
 /* Called within RCU critical section.  */
 static MemTxResult flatview_write_continue(FlatView *fv, hwaddr addr,
MemTxAttrs attrs,
@@ -2692,44 +2742,8 @@ static MemTxResult flatview_write_continue(FlatView *fv, 
hwaddr addr,
 const uint8_t *buf = ptr;
 
 for (;;) {
-if (!flatview_access_allowed(mr, attrs, mr_addr, l)) {
-result |= MEMTX_ACCESS_ERROR;
-/* Keep going. */
-} else if (!memory_access_is_direct(mr, true)) {
-uint64_t val;
-bool release_lock = prepare_mmio_access(mr);
-
-l = memory_access_size(mr, l, mr_addr);
-/* XXX: could force current_cpu to NULL to avoid
-   potential bugs */
-
-/*
- * Assure Coverity (and ourselves) that we are not going to OVERRUN
- * the buffer by following ldn_he_p().
- */
-#ifdef QEMU_STATIC_ANALYSIS
-assert((l == 1 && len >= 1) ||
-   (l == 2 && len >= 2) ||
-   (l == 4 && len >= 4) ||
-   (l == 8 && len >= 8));
-#endif
-val = ldn_he_p(buf, l);
-result |= memory_region_dispatch_write(mr, mr_addr, val,
-   size_memop(l), attrs);
-if (release_lock) {
-bql_unlock();
-}
-
-
-} else {
-/* RAM case */
-
-uint8_t *ram_ptr = qemu_ram_ptr_length(mr->ram_block, mr_addr, &l,
-   false);
-
-memmove(ram_ptr, buf, l);
-invalidate_and_set_dirty(mr, mr_addr, l);
-}
+result |= flatview_write_continue_step(attrs, buf, len, mr_addr, &l,
+   mr);
 
 len -= l;
 buf += l;
@@ -2763,6 +2777,52 @@ static MemTxResult flatview_write(FlatView *fv, hwaddr 
addr, MemTxAttrs attrs,
mr_addr, l, mr);
 }
 
+static MemTxResult flatview_read_continue_step(MemTxAttrs attrs, uint8_t *buf,
+   hwaddr len, hwaddr mr_addr,
+   hwaddr *l,
+   MemoryRegion *mr)
+{
+if (!flatview_access_allowed(mr, attrs, mr_addr, *l)) {
+return MEMTX_ACCESS_ERROR;
+}
+
+if (!memory_access_is_direct(mr, false)) {
+/* I/O case */
+uint64_t val;
+MemTxResult result;
+bool release_lock = prepare_mmio_access(mr);
+
+*l = memory_access_size(mr,

[PULL 11/34] physmem: Rename addr1 to more informative mr_addr in flatview_read/write() and similar

From: Jonathan Cameron 

The calls to flatview_read/write[_continue]() have parameters addr and
addr1 but the names give no indication of what they are addresses of.
Rename addr1 to mr_addr to reflect that it is the translated address
offset within the MemoryRegion returned by flatview_translate().
Similarly rename the parameter in address_space_read/write_cached_slow().

Suggested-by: Peter Xu 
Signed-off-by: Jonathan Cameron 
Reviewed-by: David Hildenbrand 
Link: 
https://lore.kernel.org/r/20240307153710.30907-2-jonathan.came...@huawei.com
Signed-off-by: Peter Xu 
---
 system/physmem.c | 50 
 1 file changed, 25 insertions(+), 25 deletions(-)

diff --git a/system/physmem.c b/system/physmem.c
index 6e9ed97597..e92bed50a6 100644
--- a/system/physmem.c
+++ b/system/physmem.c
@@ -2685,7 +2685,7 @@ static bool flatview_access_allowed(MemoryRegion *mr, 
MemTxAttrs attrs,
 static MemTxResult flatview_write_continue(FlatView *fv, hwaddr addr,
MemTxAttrs attrs,
const void *ptr,
-   hwaddr len, hwaddr addr1,
+   hwaddr len, hwaddr mr_addr,
hwaddr l, MemoryRegion *mr)
 {
 uint8_t *ram_ptr;
@@ -2695,12 +2695,12 @@ static MemTxResult flatview_write_continue(FlatView 
*fv, hwaddr addr,
 const uint8_t *buf = ptr;
 
 for (;;) {
-if (!flatview_access_allowed(mr, attrs, addr1, l)) {
+if (!flatview_access_allowed(mr, attrs, mr_addr, l)) {
 result |= MEMTX_ACCESS_ERROR;
 /* Keep going. */
 } else if (!memory_access_is_direct(mr, true)) {
 release_lock |= prepare_mmio_access(mr);
-l = memory_access_size(mr, l, addr1);
+l = memory_access_size(mr, l, mr_addr);
 /* XXX: could force current_cpu to NULL to avoid
potential bugs */
 
@@ -2715,13 +2715,13 @@ static MemTxResult flatview_write_continue(FlatView 
*fv, hwaddr addr,
(l == 8 && len >= 8));
 #endif
 val = ldn_he_p(buf, l);
-result |= memory_region_dispatch_write(mr, addr1, val,
+result |= memory_region_dispatch_write(mr, mr_addr, val,
size_memop(l), attrs);
 } else {
 /* RAM case */
-ram_ptr = qemu_ram_ptr_length(mr->ram_block, addr1, &l, false);
+ram_ptr = qemu_ram_ptr_length(mr->ram_block, mr_addr, &l, false);
 memmove(ram_ptr, buf, l);
-invalidate_and_set_dirty(mr, addr1, l);
+invalidate_and_set_dirty(mr, mr_addr, l);
 }
 
 if (release_lock) {
@@ -2738,7 +2738,7 @@ static MemTxResult flatview_write_continue(FlatView *fv, 
hwaddr addr,
 }
 
 l = len;
-mr = flatview_translate(fv, addr, &addr1, &l, true, attrs);
+mr = flatview_translate(fv, addr, &mr_addr, &l, true, attrs);
 }
 
 return result;
@@ -2749,22 +2749,22 @@ static MemTxResult flatview_write(FlatView *fv, hwaddr 
addr, MemTxAttrs attrs,
   const void *buf, hwaddr len)
 {
 hwaddr l;
-hwaddr addr1;
+hwaddr mr_addr;
 MemoryRegion *mr;
 
 l = len;
-mr = flatview_translate(fv, addr, &addr1, &l, true, attrs);
+mr = flatview_translate(fv, addr, &mr_addr, &l, true, attrs);
 if (!flatview_access_allowed(mr, attrs, addr, len)) {
 return MEMTX_ACCESS_ERROR;
 }
 return flatview_write_continue(fv, addr, attrs, buf, len,
-   addr1, l, mr);
+   mr_addr, l, mr);
 }
 
 /* Called within RCU critical section.  */
 MemTxResult flatview_read_continue(FlatView *fv, hwaddr addr,
MemTxAttrs attrs, void *ptr,
-   hwaddr len, hwaddr addr1, hwaddr l,
+   hwaddr len, hwaddr mr_addr, hwaddr l,
MemoryRegion *mr)
 {
 uint8_t *ram_ptr;
@@ -2775,14 +2775,14 @@ MemTxResult flatview_read_continue(FlatView *fv, hwaddr 
addr,
 
 fuzz_dma_read_cb(addr, len, mr);
 for (;;) {
-if (!flatview_access_allowed(mr, attrs, addr1, l)) {
+if (!flatview_access_allowed(mr, attrs, mr_addr, l)) {
 result |= MEMTX_ACCESS_ERROR;
 /* Keep going. */
 } else if (!memory_access_is_direct(mr, false)) {
 /* I/O case */
 release_lock |= prepare_mmio_access(mr);
-l = memory_access_size(mr, l, addr1);
-result |= memory_region_dispatch_read(mr, addr1, &val,
+l = memory_access_size(mr, l, mr_addr);
+result |= memory_region_dispatch_read(mr, mr_addr, &val,
   size_memop(l), attrs);
 
 /*
@@ -2798,7 +2798,7 @@ MemTxResult

[PATCH v5 1/8] Add 'to' object into migrate_qmp()

Add the 'to' object into migrate_qmp(), so we can use
migrate_get_socket_address() inside migrate_qmp() to get
the port value. This is not applied to other migrate_qmp*
because they don't need the port.

Signed-off-by: Het Gala 
Suggested-by: Fabiano Rosas 
Reviewed-by: Fabiano Rosas 
---
 tests/qtest/migration-helpers.c |  3 ++-
 tests/qtest/migration-helpers.h |  5 +++--
 tests/qtest/migration-test.c| 28 ++--
 3 files changed, 19 insertions(+), 17 deletions(-)

diff --git a/tests/qtest/migration-helpers.c b/tests/qtest/migration-helpers.c
index e451dbdbed..b6206a04fb 100644
--- a/tests/qtest/migration-helpers.c
+++ b/tests/qtest/migration-helpers.c
@@ -68,7 +68,8 @@ void migrate_qmp_fail(QTestState *who, const char *uri, const 
char *fmt, ...)
  * Arguments are built from @fmt... (formatted like
  * qobject_from_jsonf_nofail()) with "uri": @uri spliced in.
  */
-void migrate_qmp(QTestState *who, const char *uri, const char *fmt, ...)
+void migrate_qmp(QTestState *who, QTestState *to, const char *uri,
+ const char *fmt, ...)
 {
 va_list ap;
 QDict *args;
diff --git a/tests/qtest/migration-helpers.h b/tests/qtest/migration-helpers.h
index 3bf7ded1b9..e16a34c796 100644
--- a/tests/qtest/migration-helpers.h
+++ b/tests/qtest/migration-helpers.h
@@ -25,8 +25,9 @@ typedef struct QTestMigrationState {
 bool migrate_watch_for_events(QTestState *who, const char *name,
   QDict *event, void *opaque);
 
-G_GNUC_PRINTF(3, 4)
-void migrate_qmp(QTestState *who, const char *uri, const char *fmt, ...);
+G_GNUC_PRINTF(4, 5)
+void migrate_qmp(QTestState *who, QTestState *to, const char *uri,
+ const char *fmt, ...);
 
 G_GNUC_PRINTF(3, 4)
 void migrate_incoming_qmp(QTestState *who, const char *uri,
diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
index 4023d808f9..d9b4e28c12 100644
--- a/tests/qtest/migration-test.c
+++ b/tests/qtest/migration-test.c
@@ -1350,7 +1350,7 @@ static int migrate_postcopy_prepare(QTestState **from_ptr,
 wait_for_suspend(from, &src_state);
 
 g_autofree char *uri = migrate_get_socket_address(to, "socket-address");
-migrate_qmp(from, uri, "{}");
+migrate_qmp(from, to, uri, "{}");
 
 migrate_wait_for_dirty_mem(from, to);
 
@@ -1500,7 +1500,7 @@ static void postcopy_recover_fail(QTestState *from, 
QTestState *to)
 g_assert_cmpint(ret, ==, 1);
 
 migrate_recover(to, "fd:fd-mig");
-migrate_qmp(from, "fd:fd-mig", "{'resume': true}");
+migrate_qmp(from, to, "fd:fd-mig", "{'resume': true}");
 
 /*
  * Make sure both QEMU instances will go into RECOVER stage, then test
@@ -1588,7 +1588,7 @@ static void test_postcopy_recovery_common(MigrateCommon 
*args)
  * Try to rebuild the migration channel using the resume flag and
  * the newly created channel
  */
-migrate_qmp(from, uri, "{'resume': true}");
+migrate_qmp(from, to, uri, "{'resume': true}");
 
 /* Restore the postcopy bandwidth to unlimited */
 migrate_set_parameter_int(from, "max-postcopy-bandwidth", 0);
@@ -1669,7 +1669,7 @@ static void test_baddest(void)
 if (test_migrate_start(&from, &to, "tcp:127.0.0.1:0", &args)) {
 return;
 }
-migrate_qmp(from, "tcp:127.0.0.1:0", "{}");
+migrate_qmp(from, to, "tcp:127.0.0.1:0", "{}");
 wait_for_migration_fail(from, false);
 test_migrate_end(from, to, false);
 }
@@ -1708,7 +1708,7 @@ static void test_analyze_script(void)
 uri = g_strdup_printf("exec:cat > %s", file);
 
 migrate_ensure_converge(from);
-migrate_qmp(from, uri, "{}");
+migrate_qmp(from, to, uri, "{}");
 wait_for_migration_complete(from);
 
 pid = fork();
@@ -1777,7 +1777,7 @@ static void test_precopy_common(MigrateCommon *args)
 goto finish;
 }
 
-migrate_qmp(from, connect_uri, "{}");
+migrate_qmp(from, to, connect_uri, "{}");
 
 if (args->result != MIG_TEST_SUCCEED) {
 bool allow_active = args->result == MIG_TEST_FAIL;
@@ -1873,7 +1873,7 @@ static void test_file_common(MigrateCommon *args, bool 
stop_src)
 goto finish;
 }
 
-migrate_qmp(from, connect_uri, "{}");
+migrate_qmp(from, to, connect_uri, "{}");
 wait_for_migration_complete(from);
 
 /*
@@ -2029,7 +2029,7 @@ static void test_ignore_shared(void)
 /* Wait for the first serial output from the source */
 wait_for_serial("src_serial");
 
-migrate_qmp(from, uri, "{}");
+migrate_qmp(from, to, uri, "{}");
 
 migrate_wait_for_dirty_mem(from, to);
 
@@ -2605,7 +2605,7 @@ static void do_test_validate_uuid(MigrateStart *args, 
bool should_fail)
 /* Wait for the first serial output from the source */
 wait_for_serial("src_serial");
 
-migrate_qmp(from, uri, "{}");
+migrate_qmp(from, to, uri, "{}");
 
 if (should_fail) {
 qtest_set_expected_status(to, EXIT_FAILURE);
@@ -2708,7 +2708,7 @@ static void test_migrate_auto_converge(void)
 /* Wait for the first serial

[PATCH v5 5/8] Add migrate_set_ports into migrate_qmp to update migration port value

migrate_get_connect_qdict() returns a QDict describing the socket address
of the destination QEMU.

migrate_set_ports() walks the list of channels, reads the port from each
channel's address QDict, and replaces it with the destination's real port
when the test left it as 0.
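
The port substitution described above can be sketched independently of the
QDict/QList API. This is only an illustration under stated assumptions: the
struct sketch_channel type and sketch_set_ports() are hypothetical, and a
fixed-size string field stands in for the "port" key of each channel's
address dictionary.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_PORT_LEN 16

/* Hypothetical flat stand-in for one channel's address; not a QEMU type. */
struct sketch_channel {
    int has_port;                 /* unix and fd addresses carry no port */
    char port[SKETCH_PORT_LEN];
};

/*
 * Replace every placeholder port "0" with the port the destination is
 * actually listening on. Returns the number of channels updated.
 */
static size_t sketch_set_ports(struct sketch_channel *channels, size_t n,
                               const char *dst_port)
{
    size_t updated = 0;

    assert(channels || n == 0);
    if (!dst_port) {
        return 0;                 /* destination has no port to copy */
    }
    assert(strlen(dst_port) < SKETCH_PORT_LEN);

    for (size_t i = 0; i < n; i++) {
        if (channels[i].has_port && strcmp(channels[i].port, "0") == 0) {
            strcpy(channels[i].port, dst_port);
            updated++;
        }
    }
    return updated;
}
```

Channels that already name a concrete port are left alone, which mirrors the
"only if it was 0" condition in the patch.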

Signed-off-by: Het Gala 
Suggested-by: Fabiano Rosas 
---
 tests/qtest/migration-helpers.c | 75 +
 1 file changed, 75 insertions(+)

diff --git a/tests/qtest/migration-helpers.c b/tests/qtest/migration-helpers.c
index f215f44467..ed8d812e9d 100644
--- a/tests/qtest/migration-helpers.c
+++ b/tests/qtest/migration-helpers.c
@@ -16,6 +16,8 @@
 #include "qapi/qapi-visit-sockets.h"
 #include "qapi/qobject-input-visitor.h"
 #include "qapi/error.h"
+#include "qapi/qmp/qlist.h"
+#include "qemu/cutils.h"
 
 #include "migration-helpers.h"
 
@@ -48,6 +50,37 @@ static char *SocketAddress_to_str(SocketAddress *addr)
 }
 }
 
+static QDict *SocketAddress_to_qdict(SocketAddress *addr)
+{
+QDict *dict = qdict_new();
+
+switch (addr->type) {
+case SOCKET_ADDRESS_TYPE_INET:
+qdict_put_str(dict, "type", "inet");
+qdict_put_str(dict, "host", addr->u.inet.host);
+qdict_put_str(dict, "port", addr->u.inet.port);
+break;
+case SOCKET_ADDRESS_TYPE_UNIX:
+qdict_put_str(dict, "type", "unix");
+qdict_put_str(dict, "path", addr->u.q_unix.path);
+break;
+case SOCKET_ADDRESS_TYPE_FD:
+qdict_put_str(dict, "type", "fd");
+qdict_put_str(dict, "str", addr->u.fd.str);
+break;
+case SOCKET_ADDRESS_TYPE_VSOCK:
+qdict_put_str(dict, "type", "vsock");
+qdict_put_str(dict, "cid", addr->u.vsock.cid);
+qdict_put_str(dict, "port", addr->u.vsock.port);
+break;
+default:
+g_assert_not_reached();
+break;
+}
+
+return dict;
+}
+
 static SocketAddress *migrate_get_socket_address(QTestState *who)
 {
 QDict *rsp;
@@ -81,6 +114,46 @@ migrate_get_connect_uri(QTestState *who)
 return connect_uri;
 }
 
+static QDict *
+migrate_get_connect_qdict(QTestState *who)
+{
+SocketAddress *addrs;
+QDict *connect_qdict;
+
+addrs = migrate_get_socket_address(who);
+connect_qdict = SocketAddress_to_qdict(addrs);
+
+qapi_free_SocketAddress(addrs);
+return connect_qdict;
+}
+
+static void migrate_set_ports(QTestState *to, QList *channel_list)
+{
+QDict *addr;
+QListEntry *entry;
+g_autofree const char *addr_port = NULL;
+
+if (channel_list == NULL) {
+return;
+}
+
+addr = migrate_get_connect_qdict(to);
+
+QLIST_FOREACH_ENTRY(channel_list, entry) {
+QDict *channel = qobject_to(QDict, qlist_entry_obj(entry));
+QDict *addrdict = qdict_get_qdict(channel, "addr");
+
+if (qdict_haskey(addrdict, "port") &&
+qdict_haskey(addr, "port") &&
+(strcmp(qdict_get_str(addrdict, "port"), "0") == 0)) {
+addr_port = qdict_get_str(addr, "port");
+qdict_put_str(addrdict, "port", addr_port);
+}
+}
+
+qobject_unref(addr);
+}
+
 bool migrate_watch_for_events(QTestState *who, const char *name,
   QDict *event, void *opaque)
 {
@@ -139,6 +212,7 @@ void migrate_qmp(QTestState *who, QTestState *to, const 
char *uri,
 {
 va_list ap;
 QDict *args;
+QList *channel_list = NULL;
 g_autofree char *connect_uri = NULL;
 
 va_start(ap, fmt);
@@ -149,6 +223,7 @@ void migrate_qmp(QTestState *who, QTestState *to, const 
char *uri,
 if (!uri) {
 connect_uri = migrate_get_connect_uri(to);
 }
+migrate_set_ports(to, channel_list);
 qdict_put_str(args, "uri", uri ? uri : connect_uri);
 
 qtest_qmp_assert_success(who,
-- 
2.22.3

[PATCH v5 7/8] Add multifd_tcp_plain test using list of channels instead of uri

Add a positive test for multifd live migration that uses a list of
channels (restricted to one entry) as the starting point instead of
a plain uri string.

Signed-off-by: Het Gala 
Suggested-by: Fabiano Rosas 
---
 tests/qtest/migration-test.c | 30 +++---
 1 file changed, 27 insertions(+), 3 deletions(-)

diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
index bf27766eb0..392d5d0b62 100644
--- a/tests/qtest/migration-test.c
+++ b/tests/qtest/migration-test.c
@@ -655,6 +655,13 @@ typedef struct {
  */
 const char *connect_uri;
 
+/*
+ * Optional: JSON-formatted list of src QEMU URIs. If a port is
+ * defined as '0' in any QDict key a value of '0' will be
+ * automatically converted to the correct destination port.
+ */
+const char *connect_channels;
+
 /* Optional: callback to run at start to set migration parameters */
 TestMigrateStartHook start_hook;
 /* Optional: callback to run at finish to cleanup */
@@ -2740,7 +2747,7 @@ test_migrate_precopy_tcp_multifd_zstd_start(QTestState 
*from,
 }
 #endif /* CONFIG_ZSTD */
 
-static void test_multifd_tcp_none(void)
+static void test_multifd_tcp_uri_none(void)
 {
 MigrateCommon args = {
 .listen_uri = "defer",
@@ -2755,6 +2762,21 @@ static void test_multifd_tcp_none(void)
 test_precopy_common(&args);
 }
 
+static void test_multifd_tcp_channels_none(void)
+{
+MigrateCommon args = {
+.listen_uri = "defer",
+.start_hook = test_migrate_precopy_tcp_multifd_start,
+.live = true,
+.connect_channels = "[ { 'channel-type': 'main',"
+"'addr': { 'transport': 'socket',"
+"  'type': 'inet',"
+"  'host': '127.0.0.1',"
+"  'port': '0' } } ]",
+};
+test_precopy_common(&args);
+}
+
 static void test_multifd_tcp_zlib(void)
 {
 MigrateCommon args = {
@@ -3664,8 +3686,10 @@ int main(int argc, char **argv)
test_migrate_dirty_limit);
 }
 }
-migration_test_add("/migration/multifd/tcp/plain/none",
-   test_multifd_tcp_none);
+migration_test_add("/migration/multifd/tcp/uri/plain/none",
+   test_multifd_tcp_uri_none);
+migration_test_add("/migration/multifd/tcp/channels/plain/none",
+   test_multifd_tcp_channels_none);
 migration_test_add("/migration/multifd/tcp/plain/cancel",
test_multifd_tcp_cancel);
 migration_test_add("/migration/multifd/tcp/plain/zlib",
-- 
2.22.3

[PATCH v5 8/8] Add negative tests to validate migration QAPIs

The migration QAPI arguments 'uri' and 'channels' are mutually exclusive,
and exactly one of them must be given. Add negative validation tests, one
with both arguments present and one with neither present.
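
The rule the two negative tests exercise can be written down as a small
check. This is a sketch only: sketch_check_migrate_args() and its enum are
hypothetical names, not the validation code in QEMU's migration core.

```c
#include <stddef.h>

/* Hypothetical result codes for the argument check. */
enum sketch_args_check {
    SKETCH_ARGS_OK,
    SKETCH_ARGS_BOTH_SET,
    SKETCH_ARGS_NONE_SET,
};

/* Exactly one of uri and channels must be non-NULL. */
static enum sketch_args_check sketch_check_migrate_args(const char *uri,
                                                        const char *channels)
{
    if (uri && channels) {
        return SKETCH_ARGS_BOTH_SET;
    }
    if (!uri && !channels) {
        return SKETCH_ARGS_NONE_SET;
    }
    return SKETCH_ARGS_OK;
}
```

The both_set and none_set tests added by this patch correspond to the two
failing branches.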

Signed-off-by: Het Gala 
Suggested-by: Fabiano Rosas 
Reviewed-by: Fabiano Rosas 
---
 tests/qtest/migration-test.c | 53 
 1 file changed, 53 insertions(+)

diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
index 392d5d0b62..9e3146d23f 100644
--- a/tests/qtest/migration-test.c
+++ b/tests/qtest/migration-test.c
@@ -2608,6 +2608,55 @@ static void test_validate_uuid_dst_not_set(void)
 do_test_validate_uuid(&args, false);
 }
 
+static void do_test_validate_uri_channel(MigrateCommon *args)
+{
+QTestState *from, *to;
+
+if (test_migrate_start(&from, &to, args->listen_uri, &args->start)) {
+return;
+}
+
+/* Wait for the first serial output from the source */
+wait_for_serial("src_serial");
+
+/*
+ * 'uri' and 'channels' validation is checked even before the migration
+ * starts.
+ */
+migrate_qmp_fail(from, args->connect_uri, args->connect_channels, "{}");
+test_migrate_end(from, to, false);
+}
+
+static void test_validate_uri_channels_both_set(void)
+{
+MigrateCommon args = {
+.start = {
+.hide_stderr = true,
+},
+.listen_uri = "defer",
+.connect_uri = "tcp:127.0.0.1:0",
+.connect_channels = "[ { 'channel-type': 'main',"
+"'addr': { 'transport': 'socket',"
+"  'type': 'inet',"
+"  'host': '127.0.0.1',"
+"  'port': '0' } } ]",
+};
+
+do_test_validate_uri_channel(&args);
+}
+
+static void test_validate_uri_channels_none_set(void)
+{
+MigrateCommon args = {
+.start = {
+.hide_stderr = true,
+},
+.listen_uri = "defer",
+};
+
+do_test_validate_uri_channel(&args);
+}
+
 /*
  * The way auto_converge works, we need to do too many passes to
  * run this test.  Auto_converge logic is only run once every
@@ -3674,6 +3723,10 @@ int main(int argc, char **argv)
test_validate_uuid_src_not_set);
 migration_test_add("/migration/validate_uuid_dst_not_set",
test_validate_uuid_dst_not_set);
+migration_test_add("/migration/validate_uri/channels/both_set",
+   test_validate_uri_channels_both_set);
+migration_test_add("/migration/validate_uri/channels/none_set",
+   test_validate_uri_channels_none_set);
 /*
  * See explanation why this test is slow on function definition
  */
-- 
2.22.3

[PATCH v5 2/8] Replace connect_uri and move migrate_get_socket_address inside migrate_qmp

Move the calls to migrate_get_socket_address() into migrate_qmp().
Get rid of the local connect_uri and use only args->connect_uri,
because the 'to' object lets migrate_qmp() generate the connect URI
with the correct port number.
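
The fallback this patch introduces (use the caller's URI if there is one,
otherwise build it from the address the destination reports) can be sketched
as follows. The struct sketch_addr type and sketch_connect_uri() are
hypothetical simplifications; only the "tcp:%s:%s" and "unix:%s" formats are
taken from SocketAddress_to_str() in the diff below.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, simplified listening address; not a QEMU type. */
enum sketch_addr_type { SKETCH_ADDR_INET, SKETCH_ADDR_UNIX };

struct sketch_addr {
    enum sketch_addr_type type;
    const char *host;   /* INET only */
    const char *port;   /* INET only */
    const char *path;   /* UNIX only */
};

/*
 * Return the URI to connect to: the caller's URI when one is given,
 * otherwise a URI formatted into out from the destination's address.
 */
static const char *sketch_connect_uri(const char *uri,
                                      const struct sketch_addr *dst,
                                      char *out, size_t out_len)
{
    if (uri) {
        return uri;
    }
    assert(dst && out && out_len > 0);
    switch (dst->type) {
    case SKETCH_ADDR_INET:
        snprintf(out, out_len, "tcp:%s:%s", dst->host, dst->port);
        break;
    case SKETCH_ADDR_UNIX:
        snprintf(out, out_len, "unix:%s", dst->path);
        break;
    }
    return out;
}
```

With this shape, a test that listens on port 0 can pass NULL for the URI and
still connect to whatever port the destination actually bound.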

Signed-off-by: Het Gala 
Suggested-by: Fabiano Rosas 
---
 tests/qtest/migration-helpers.c | 54 -
 tests/qtest/migration-test.c| 83 -
 2 files changed, 63 insertions(+), 74 deletions(-)

diff --git a/tests/qtest/migration-helpers.c b/tests/qtest/migration-helpers.c
index b6206a04fb..3e8c19c4de 100644
--- a/tests/qtest/migration-helpers.c
+++ b/tests/qtest/migration-helpers.c
@@ -13,6 +13,9 @@
 #include "qemu/osdep.h"
 #include "qemu/ctype.h"
 #include "qapi/qmp/qjson.h"
+#include "qapi/qapi-visit-sockets.h"
+#include "qapi/qobject-input-visitor.h"
+#include "qapi/error.h"
 
 #include "migration-helpers.h"
 
@@ -24,6 +27,51 @@
  */
 #define MIGRATION_STATUS_WAIT_TIMEOUT 120
 
+static char *SocketAddress_to_str(SocketAddress *addr)
+{
+switch (addr->type) {
+case SOCKET_ADDRESS_TYPE_INET:
+return g_strdup_printf("tcp:%s:%s",
+   addr->u.inet.host,
+   addr->u.inet.port);
+case SOCKET_ADDRESS_TYPE_UNIX:
+return g_strdup_printf("unix:%s",
+   addr->u.q_unix.path);
+case SOCKET_ADDRESS_TYPE_FD:
+return g_strdup_printf("fd:%s", addr->u.fd.str);
+case SOCKET_ADDRESS_TYPE_VSOCK:
+return g_strdup_printf("tcp:%s:%s",
+   addr->u.vsock.cid,
+   addr->u.vsock.port);
+default:
+return g_strdup("unknown address type");
+}
+}
+
+static char *
+migrate_get_socket_address(QTestState *who, const char *parameter)
+{
+QDict *rsp;
+char *result;
+SocketAddressList *addrs;
+Visitor *iv = NULL;
+QObject *object;
+
+rsp = migrate_query(who);
+object = qdict_get(rsp, parameter);
+
+iv = qobject_input_visitor_new(object);
+visit_type_SocketAddressList(iv, NULL, &addrs, &error_abort);
+visit_free(iv);
+
+/* we are only using a single address */
+result = SocketAddress_to_str(addrs->value);
+
+qapi_free_SocketAddressList(addrs);
+qobject_unref(rsp);
+return result;
+}
+
 bool migrate_watch_for_events(QTestState *who, const char *name,
   QDict *event, void *opaque)
 {
@@ -73,13 +121,17 @@ void migrate_qmp(QTestState *who, QTestState *to, const 
char *uri,
 {
 va_list ap;
 QDict *args;
+g_autofree char *connect_uri = NULL;
 
 va_start(ap, fmt);
 args = qdict_from_vjsonf_nofail(fmt, ap);
 va_end(ap);
 
 g_assert(!qdict_haskey(args, "uri"));
-qdict_put_str(args, "uri", uri);
+if (!uri) {
+connect_uri = migrate_get_socket_address(to, "socket-address");
+}
+qdict_put_str(args, "uri", uri ? uri : connect_uri);
 
 qtest_qmp_assert_success(who,
  "{ 'execute': 'migrate', 'arguments': %p}", args);
diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
index d9b4e28c12..9bb24fd7c5 100644
--- a/tests/qtest/migration-test.c
+++ b/tests/qtest/migration-test.c
@@ -13,16 +13,12 @@
 #include "qemu/osdep.h"
 
 #include "libqtest.h"
-#include "qapi/error.h"
 #include "qapi/qmp/qdict.h"
 #include "qemu/module.h"
 #include "qemu/option.h"
 #include "qemu/range.h"
 #include "qemu/sockets.h"
 #include "chardev/char.h"
-#include "qapi/qapi-visit-sockets.h"
-#include "qapi/qobject-input-visitor.h"
-#include "qapi/qobject-output-visitor.h"
 #include "crypto/tlscredspsk.h"
 #include "qapi/qmp/qlist.h"
 
@@ -369,50 +365,6 @@ static void cleanup(const char *filename)
 unlink(path);
 }
 
-static char *SocketAddress_to_str(SocketAddress *addr)
-{
-switch (addr->type) {
-case SOCKET_ADDRESS_TYPE_INET:
-return g_strdup_printf("tcp:%s:%s",
-   addr->u.inet.host,
-   addr->u.inet.port);
-case SOCKET_ADDRESS_TYPE_UNIX:
-return g_strdup_printf("unix:%s",
-   addr->u.q_unix.path);
-case SOCKET_ADDRESS_TYPE_FD:
-return g_strdup_printf("fd:%s", addr->u.fd.str);
-case SOCKET_ADDRESS_TYPE_VSOCK:
-return g_strdup_printf("tcp:%s:%s",
-   addr->u.vsock.cid,
-   addr->u.vsock.port);
-default:
-return g_strdup("unknown address type");
-}
-}
-
-static char *migrate_get_socket_address(QTestState *who, const char *parameter)
-{
-QDict *rsp;
-char *result;
-SocketAddressList *addrs;
-Visitor *iv = NULL;
-QObject *object;
-
-rsp = migrate_query(who);
-object = qdict_get(rsp, parameter);
-
-iv = qobject_input_visitor_new(object);
-visit_type_SocketAddressList(iv, NULL, &addrs, &error_abort);
-visit_free(iv);
-
-/* we are only using a single address */
-result =

[PATCH v5 6/8] Add channels parameter in migrate_qmp

Alter migrate_qmp() to allow use of the channels parameter, but only
fill in the uri with the correct port number if there are no channels.
The invalid cases of passing both or neither are not allowed here;
those go through migrate_qmp_fail() instead.

Signed-off-by: Het Gala 
Suggested-by: Fabiano Rosas 
---
 tests/qtest/migration-helpers.c | 22 +-
 tests/qtest/migration-helpers.h |  4 ++--
 tests/qtest/migration-test.c| 28 ++--
 3 files changed, 29 insertions(+), 25 deletions(-)

diff --git a/tests/qtest/migration-helpers.c b/tests/qtest/migration-helpers.c
index ed8d812e9d..b2a90469fb 100644
--- a/tests/qtest/migration-helpers.c
+++ b/tests/qtest/migration-helpers.c
@@ -133,10 +133,6 @@ static void migrate_set_ports(QTestState *to, QList 
*channel_list)
 QListEntry *entry;
 g_autofree const char *addr_port = NULL;
 
-if (channel_list == NULL) {
-return;
-}
-
 addr = migrate_get_connect_qdict(to);
 
 QLIST_FOREACH_ENTRY(channel_list, entry) {
@@ -208,11 +204,10 @@ void migrate_qmp_fail(QTestState *who, const char *uri,
  * qobject_from_jsonf_nofail()) with "uri": @uri spliced in.
  */
 void migrate_qmp(QTestState *who, QTestState *to, const char *uri,
- const char *fmt, ...)
+ const char *channels, const char *fmt, ...)
 {
 va_list ap;
 QDict *args;
-QList *channel_list = NULL;
 g_autofree char *connect_uri = NULL;
 
 va_start(ap, fmt);
@@ -220,11 +215,20 @@ void migrate_qmp(QTestState *who, QTestState *to, const 
char *uri,
 va_end(ap);
 
 g_assert(!qdict_haskey(args, "uri"));
-if (!uri) {
+if (uri) {
+qdict_put_str(args, "uri", uri);
+} else if (!channels) {
 connect_uri = migrate_get_connect_uri(to);
+qdict_put_str(args, "uri", connect_uri);
+}
+
+g_assert(!qdict_haskey(args, "channels"));
+if (channels) {
+QObject *channels_obj = qobject_from_json(channels, &error_abort);
+QList *channel_list = qobject_to(QList, channels_obj);
+migrate_set_ports(to, channel_list);
+qdict_put_obj(args, "channels", channels_obj);
 }
-migrate_set_ports(to, channel_list);
-qdict_put_str(args, "uri", uri ? uri : connect_uri);
 
 qtest_qmp_assert_success(who,
  "{ 'execute': 'migrate', 'arguments': %p}", args);
diff --git a/tests/qtest/migration-helpers.h b/tests/qtest/migration-helpers.h
index 4e664148a5..1339835698 100644
--- a/tests/qtest/migration-helpers.h
+++ b/tests/qtest/migration-helpers.h
@@ -25,9 +25,9 @@ typedef struct QTestMigrationState {
 bool migrate_watch_for_events(QTestState *who, const char *name,
   QDict *event, void *opaque);
 
-G_GNUC_PRINTF(4, 5)
+G_GNUC_PRINTF(5, 6)
 void migrate_qmp(QTestState *who, QTestState *to, const char *uri,
- const char *fmt, ...);
+ const char *channels, const char *fmt, ...);
 
 G_GNUC_PRINTF(3, 4)
 void migrate_incoming_qmp(QTestState *who, const char *uri,
diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
index da4b0006c7..bf27766eb0 100644
--- a/tests/qtest/migration-test.c
+++ b/tests/qtest/migration-test.c
@@ -1301,7 +1301,7 @@ static int migrate_postcopy_prepare(QTestState **from_ptr,
 wait_for_serial("src_serial");
 wait_for_suspend(from, &src_state);
 
-migrate_qmp(from, to, NULL, "{}");
+migrate_qmp(from, to, NULL, NULL, "{}");
 
 migrate_wait_for_dirty_mem(from, to);
 
@@ -1451,7 +1451,7 @@ static void postcopy_recover_fail(QTestState *from, 
QTestState *to)
 g_assert_cmpint(ret, ==, 1);
 
 migrate_recover(to, "fd:fd-mig");
-migrate_qmp(from, to, "fd:fd-mig", "{'resume': true}");
+migrate_qmp(from, to, "fd:fd-mig", NULL, "{'resume': true}");
 
 /*
  * Make sure both QEMU instances will go into RECOVER stage, then test
@@ -1539,7 +1539,7 @@ static void test_postcopy_recovery_common(MigrateCommon 
*args)
  * Try to rebuild the migration channel using the resume flag and
  * the newly created channel
  */
-migrate_qmp(from, to, uri, "{'resume': true}");
+migrate_qmp(from, to, uri, NULL, "{'resume': true}");
 
 /* Restore the postcopy bandwidth to unlimited */
 migrate_set_parameter_int(from, "max-postcopy-bandwidth", 0);
@@ -1620,7 +1620,7 @@ static void test_baddest(void)
 if (test_migrate_start(&from, &to, "tcp:127.0.0.1:0", &args)) {
 return;
 }
-migrate_qmp(from, to, "tcp:127.0.0.1:0", "{}");
+migrate_qmp(from, to, "tcp:127.0.0.1:0", NULL, "{}");
 wait_for_migration_fail(from, false);
 test_migrate_end(from, to, false);
 }
@@ -1659,7 +1659,7 @@ static void test_analyze_script(void)
 uri = g_strdup_printf("exec:cat > %s", file);
 
 migrate_ensure_converge(from);
-migrate_qmp(from, to, uri, "{}");
+migrate_qmp(from, to, uri, NULL, "{}");
 wait_for_migration_complete(from);
 
 pid = fork();
@@ -1721,7 +1721,7 @@ static void

[PATCH v5 0/8] qtest: migration: Add tests for introducing 'channels' argument in migrate QAPIs

Recent migrate QAPI changes allow the 'channels' argument to be used
directly, which avoids redundant URI string parsing.

To ensure backward compatibility, both 'uri' and 'channels' are kept as
optional parameters in migration QMP commands. However, they are mutually
exclusive, and exactly one is required for a successful migration connection.
This patchset adds qtests to validate this behaviour of the 'uri' and
'channels' arguments.

Additionally, none of the existing migration qtests use 'channels' as the
primary method for validating migration QAPIs. This patchset also adds a test
that uses only the 'channels' argument as the initial entry point for
migration QAPIs.

Patch Summary:
--------------
Patch 1-2:
----------
Introduce the 'to' object in migrate_qmp() and move the calls to
migrate_get_socket_address() inside migrate_qmp(). Also, replace connect_uri
with args->connect_uri everywhere.

Patch 3-6:
----------
Add a channels argument to migrate_qmp and migrate_qmp_fail so that both
migration QAPI arguments can be passed independently. migrate_qmp needs the
port value changed from 0 to the port reported by migrate_get_socket_address;
add migrate_set_ports to do this.

Patch 7-8:
----------
Add 2 negative tests to validate the mutually exclusive behaviour of the
migration QAPI arguments. Add a positive multifd_tcp_plain qtest with only
channels as the initial entry point for migration QAPIs.


v4->v5 Changelog:

1. Remove redundant imports from migration-test.c after moving helper
   functions to migration-helpers.c
2. Call migrate_get_socket_address(to) and let it look up the
   'socket-address' parameter internally via qdict_get(), which makes more
   sense to the reader.
3. Free the qdict; other small fixups.

v3->v4 Changelog:

1. Introduced migrate_get_connect_uri and migrate_get_connect_qdict; both
   use migrate_get_socket_address to get the destination socket-address,
   and the latter then uses SocketAddress_to_qdict to convert it into a qdict.
2. Misc code changes.

v2->v3 Changelog:
-----------------
1. 'channels' introduction is not required now for migrate_qmp_incoming
2. Refactor the code into 7 different patches
3. Remove custom function for converting string to MigrationChannelList
4. Move calls for migrate_get_socket_address inside migrate_qmp so that
   migrate_set_ports can replace the QAPI's port with the correct value.

Het Gala (8):
  Add 'to' object into migrate_qmp()
  Replace connect_uri and move migrate_get_socket_address inside
migrate_qmp
  Replace migrate_get_connect_uri inplace of migrate_get_socket_address
  Add channels parameter in migrate_qmp_fail
  Add migrate_set_ports into migrate_qmp to update migration port value
  Add channels parameter in migrate_qmp
  Add multifd_tcp_plain test using list of channels instead of uri
  Add negative tests to validate migration QAPIs

 tests/qtest/migration-helpers.c | 158 +++-
 tests/qtest/migration-helpers.h |  10 +-
 tests/qtest/migration-test.c| 180 +---
 3 files changed, 257 insertions(+), 91 deletions(-)

-- 
2.22.3

[PATCH v5 4/8] Add channels parameter in migrate_qmp_fail

Alter migrate_qmp_fail() to accept uri and channels independently.
For channels, convert the JSON string into a QObject.
migrate_get_socket_address() is not involved here because the
command fails before the migration starts anyway.

Signed-off-by: Het Gala 
Suggested-by: Fabiano Rosas 
---
 tests/qtest/migration-helpers.c | 13 +++--
 tests/qtest/migration-helpers.h |  5 +++--
 tests/qtest/migration-test.c|  4 ++--
 3 files changed, 16 insertions(+), 6 deletions(-)

diff --git a/tests/qtest/migration-helpers.c b/tests/qtest/migration-helpers.c
index 8806dc841e..f215f44467 100644
--- a/tests/qtest/migration-helpers.c
+++ b/tests/qtest/migration-helpers.c
@@ -100,7 +100,8 @@ bool migrate_watch_for_events(QTestState *who, const char *name,
     return false;
 }
 
-void migrate_qmp_fail(QTestState *who, const char *uri, const char *fmt, ...)
+void migrate_qmp_fail(QTestState *who, const char *uri,
+                      const char *channels, const char *fmt, ...)
 {
     va_list ap;
     QDict *args, *err;
@@ -110,7 +111,15 @@ void migrate_qmp_fail(QTestState *who, const char *uri, const char *fmt, ...)
     va_end(ap);
 
     g_assert(!qdict_haskey(args, "uri"));
-    qdict_put_str(args, "uri", uri);
+    if (uri) {
+        qdict_put_str(args, "uri", uri);
+    }
+
+    g_assert(!qdict_haskey(args, "channels"));
+    if (channels) {
+        QObject *channels_obj = qobject_from_json(channels, &error_abort);
+        qdict_put_obj(args, "channels", channels_obj);
+    }
 
     err = qtest_qmp_assert_failure_ref(
         who, "{ 'execute': 'migrate', 'arguments': %p}", args);
diff --git a/tests/qtest/migration-helpers.h b/tests/qtest/migration-helpers.h
index e16a34c796..4e664148a5 100644
--- a/tests/qtest/migration-helpers.h
+++ b/tests/qtest/migration-helpers.h
@@ -33,8 +33,9 @@ G_GNUC_PRINTF(3, 4)
 void migrate_incoming_qmp(QTestState *who, const char *uri,
                           const char *fmt, ...);
 
-G_GNUC_PRINTF(3, 4)
-void migrate_qmp_fail(QTestState *who, const char *uri, const char *fmt, ...);
+G_GNUC_PRINTF(4, 5)
+void migrate_qmp_fail(QTestState *who, const char *uri,
+                      const char *channels, const char *fmt, ...);
 
 void migrate_set_capability(QTestState *who, const char *capability,
                             bool value);
diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
index 9bb24fd7c5..da4b0006c7 100644
--- a/tests/qtest/migration-test.c
+++ b/tests/qtest/migration-test.c
@@ -1717,7 +1717,7 @@ static void test_precopy_common(MigrateCommon *args)
     }
 
     if (args->result == MIG_TEST_QMP_ERROR) {
-        migrate_qmp_fail(from, args->connect_uri, "{}");
+        migrate_qmp_fail(from, args->connect_uri, NULL, "{}");
         goto finish;
     }
 
@@ -1812,7 +1812,7 @@ static void test_file_common(MigrateCommon *args, bool stop_src)
     }
 
     if (args->result == MIG_TEST_QMP_ERROR) {
-        migrate_qmp_fail(from, args->connect_uri, "{}");
+        migrate_qmp_fail(from, args->connect_uri, NULL, "{}");
         goto finish;
     }
 
-- 
2.22.3
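
The "uri and channels independently" behaviour described in the commit message can be illustrated without any QEMU types. The following is a minimal sketch only: build_migrate_args() is a hypothetical helper, not a function from the patch or from QEMU, and it assembles the JSON text by hand instead of using QDict. It performs no JSON escaping, so it is only suitable for simple test strings.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of QEMU): build the "arguments" object of a
 * 'migrate' command from an optional uri and an optional JSON channels list.
 * A NULL pointer means the corresponding key is left out, mirroring the
 * "if (uri)" / "if (channels)" checks in the patch.
 * Returns the number of characters written, or -1 if buf is too small.
 */
static int build_migrate_args(char *buf, size_t size,
                              const char *uri, const char *channels)
{
    int n;

    assert(buf != NULL && size > 0);

    if (uri && channels) {
        n = snprintf(buf, size, "{\"uri\": \"%s\", \"channels\": %s}",
                     uri, channels);
    } else if (uri) {
        n = snprintf(buf, size, "{\"uri\": \"%s\"}", uri);
    } else if (channels) {
        n = snprintf(buf, size, "{\"channels\": %s}", channels);
    } else {
        n = snprintf(buf, size, "{}");
    }

    /* Treat truncation as an error so callers never send a partial object. */
    return (n < 0 || (size_t)n >= size) ? -1 : n;
}
```

Passing both arguments produces an object with both keys; whether the migrate command accepts or rejects that combination is decided by QEMU itself, which is what the negative tests in this series exercise.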

[PATCH v5 3/8] Replace migrate_get_connect_uri inplace of migrate_get_socket_address

Refactor migrate_get_socket_address to internally utilize 'socket-address'
parameter, reducing redundancy in the function definition.

migrate_get_socket_address implicitly converts SocketAddress into str.
Move migrate_get_socket_address inside migrate_get_connect_uri which
should return the uri string instead.

Signed-off-by: Het Gala 
Suggested-by: Fabiano Rosas 
---
 tests/qtest/migration-helpers.c | 29 +++--
 1 file changed, 19 insertions(+), 10 deletions(-)

diff --git a/tests/qtest/migration-helpers.c b/tests/qtest/migration-helpers.c
index 3e8c19c4de..8806dc841e 100644
--- a/tests/qtest/migration-helpers.c
+++ b/tests/qtest/migration-helpers.c
@@ -48,28 +48,37 @@ static char *SocketAddress_to_str(SocketAddress *addr)
     }
 }
 
-static char *
-migrate_get_socket_address(QTestState *who, const char *parameter)
+static SocketAddress *migrate_get_socket_address(QTestState *who)
 {
     QDict *rsp;
-    char *result;
     SocketAddressList *addrs;
+    SocketAddress *addr;
     Visitor *iv = NULL;
     QObject *object;
 
     rsp = migrate_query(who);
-    object = qdict_get(rsp, parameter);
+    object = qdict_get(rsp, "socket-address");
 
     iv = qobject_input_visitor_new(object);
     visit_type_SocketAddressList(iv, NULL, &addrs, &error_abort);
+    addr = addrs->value;
     visit_free(iv);
 
-    /* we are only using a single address */
-    result = SocketAddress_to_str(addrs->value);
-
-    qapi_free_SocketAddressList(addrs);
     qobject_unref(rsp);
-    return result;
+    return addr;
+}
+
+static char *
+migrate_get_connect_uri(QTestState *who)
+{
+    SocketAddress *addrs;
+    char *connect_uri;
+
+    addrs = migrate_get_socket_address(who);
+    connect_uri = SocketAddress_to_str(addrs);
+
+    qapi_free_SocketAddress(addrs);
+    return connect_uri;
 }
 
 bool migrate_watch_for_events(QTestState *who, const char *name,
@@ -129,7 +138,7 @@ void migrate_qmp(QTestState *who, QTestState *to, const char *uri,
 
     g_assert(!qdict_haskey(args, "uri"));
     if (!uri) {
-        connect_uri = migrate_get_socket_address(to, "socket-address");
+        connect_uri = migrate_get_connect_uri(to);
     }
     qdict_put_str(args, "uri", uri ? uri : connect_uri);
 
-- 
2.22.3
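
The split introduced here (one function fetches the address, a separate one renders it as a uri string) can be sketched in plain C. This is an illustration under stated assumptions, not the QEMU code: struct sock_addr and sock_addr_to_uri() are hypothetical stand-ins for SocketAddress and SocketAddress_to_str(), only two address kinds are modelled, and the "tcp:" and "unix:" prefixes are assumed to match the uri forms the tests use.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Simplified stand-in for QEMU's SocketAddress; only two kinds are modelled. */
enum addr_type { ADDR_INET, ADDR_UNIX };

struct sock_addr {
    enum addr_type type;
    const char *host;   /* ADDR_INET only */
    const char *port;   /* ADDR_INET only */
    const char *path;   /* ADDR_UNIX only */
};

/*
 * Render an address as a migration uri string. The prefixes are assumptions
 * made for this sketch. Returns 0 on success, -1 on an unknown type or if
 * buf is too small.
 */
static int sock_addr_to_uri(const struct sock_addr *a, char *buf, size_t size)
{
    int n;

    assert(a != NULL && buf != NULL);

    switch (a->type) {
    case ADDR_INET:
        n = snprintf(buf, size, "tcp:%s:%s", a->host, a->port);
        break;
    case ADDR_UNIX:
        n = snprintf(buf, size, "unix:%s", a->path);
        break;
    default:
        return -1;
    }
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

Keeping the structured address as the intermediate value means a second converter (to a dictionary, for the 'channels' form) can reuse the same lookup without parsing a uri string back apart.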

Re: [PATCH v4 0/8] qtest: migration: Add tests for introducing 'channels' argument in migrate QAPIs

On Tue, Mar 12, 2024 at 03:01:51AM +0530, Het Gala wrote:
> 
> On 12/03/24 2:55 am, Peter Xu wrote:
> > On Sat, Mar 09, 2024 at 01:11:45PM +0530, Het Gala wrote:
> > > Can find the reference to the gitlab pipeline (before patchset) :
> > > https://gitlab.com/galahet/Qemu/-/pipelines/1207185095
> > > 
> > > Can find the reference to the gitlab pipeline (after patchset) :
> > > https://gitlab.com/galahet/Qemu/-/pipelines/1207183673
> > Het,
> > 
> > Please still copy me for any migration patches.  In this case Fabiano is
> > looking it'll be all fine, but it will still help me on marking the emails.
> > 
> > Thanks,
> So sorry about that Peter. I am aware that you and Fabiano are the go to
> migration maintainers. I thought I emailed or cc'd all the stakeholders that
> should be involved for this patchset series. Even in earlier series of this
> patchset, you were cc'ed, but somehow I just forgot to cc you for this
> patchset. Sure, will take care from next time. Again apologies for the
> mixup :)

No problem at all.  As long as you have at least one maintainer copied,
logically nothing will get lost.  It's just that it helps me with my routine.

Are you managing the cc list manually for each version?  In that case I
suggest you have a look at Stefan's tool:

https://github.com/stefanha/git-publish

It helps a great deal with patch management, at least for me, and it
definitely covers more than maintaining the cc list for a patchset.

-- 
Peter Xu

Re: [PATCH v4 0/8] qtest: migration: Add tests for introducing 'channels' argument in migrate QAPIs




On 12/03/24 2:55 am, Peter Xu wrote:
> On Sat, Mar 09, 2024 at 01:11:45PM +0530, Het Gala wrote:
>> Can find the reference to the gitlab pipeline (before patchset) :
>> https://gitlab.com/galahet/Qemu/-/pipelines/1207185095
>>
>> Can find the reference to the gitlab pipeline (after patchset) :
>> https://gitlab.com/galahet/Qemu/-/pipelines/1207183673
>
> Het,
>
> Please still copy me for any migration patches.  In this case Fabiano is
> looking it'll be all fine, but it will still help me on marking the emails.
>
> Thanks,

So sorry about that, Peter. I am aware that you and Fabiano are the go-to
migration maintainers. I thought I had emailed or cc'd all the stakeholders
who should be involved in this patchset series. You were cc'ed on earlier
versions of this patchset, but somehow I forgot to cc you on this one. Sure,
I will take care of it next time. Again, apologies for the mixup :)

Regards,
Het Gala

Re: [PATCH v4 0/8] qtest: migration: Add tests for introducing 'channels' argument in migrate QAPIs

On Sat, Mar 09, 2024 at 01:11:45PM +0530, Het Gala wrote:
> Can find the reference to the gitlab pipeline (before patchset) :
> https://gitlab.com/galahet/Qemu/-/pipelines/1207185095
> 
> Can find the reference to the gitlab pipeline (after patchset) :
> https://gitlab.com/galahet/Qemu/-/pipelines/1207183673

Het,

Please still copy me on any migration patches.  In this case Fabiano is
looking at it, so it'll be all fine, but it still helps me with marking the
emails.

Thanks,

-- 
Peter Xu

Re: [PATCH v4 5/8] Add migrate_set_ports into migrate_qmp to update migration port value



On 12/03/24 12:12 am, Fabiano Rosas wrote:
> Het Gala writes:
>
>> migrate_set_get_qdict gets qdict with the dst QEMU parameters
>
> s/set_//

Ack

>> migrate_set_ports() from list of channels reads each QDict for port,
>> and fills the port with correct value in case it was 0 in the test.
>>
>> Signed-off-by: Het Gala
>> Suggested-by: Fabiano Rosas
>> ---
>>  tests/qtest/migration-helpers.c | 73 +
>>  1 file changed, 73 insertions(+)
>>
>> diff --git a/tests/qtest/migration-helpers.c b/tests/qtest/migration-helpers.c
>> index 91c8a817d2..7c17d78d6b 100644
>> --- a/tests/qtest/migration-helpers.c
>> +++ b/tests/qtest/migration-helpers.c
>> @@ -17,6 +17,8 @@
>>  #include "qapi/qapi-visit-sockets.h"
>>  #include "qapi/qobject-input-visitor.h"
>>  #include "qapi/error.h"
>> +#include "qapi/qmp/qlist.h"
>> +#include "include/qemu/cutils.h"
>
> Extra "include/" here?

Ack

>>
>>  #include "migration-helpers.h"
>>
>> @@ -49,6 +51,37 @@ static char *SocketAddress_to_str(SocketAddress *addr)
>>      }
>>  }
>>
>> +static QDict *SocketAddress_to_qdict(SocketAddress *addr)
>> +{
>> +    QDict *dict = qdict_new();
>> +
>> +    switch (addr->type) {
>> +    case SOCKET_ADDRESS_TYPE_INET:
>> +        qdict_put_str(dict, "type", "inet");
>> +        qdict_put_str(dict, "host", addr->u.inet.host);
>> +        qdict_put_str(dict, "port", addr->u.inet.port);
>> +        break;
>> +    case SOCKET_ADDRESS_TYPE_UNIX:
>> +        qdict_put_str(dict, "type", "unix");
>> +        qdict_put_str(dict, "path", addr->u.q_unix.path);
>> +        break;
>> +    case SOCKET_ADDRESS_TYPE_FD:
>> +        qdict_put_str(dict, "type", "fd");
>> +        qdict_put_str(dict, "str", addr->u.fd.str);
>> +        break;
>> +    case SOCKET_ADDRESS_TYPE_VSOCK:
>> +        qdict_put_str(dict, "type", "vsock");
>> +        qdict_put_str(dict, "cid", addr->u.vsock.cid);
>> +        qdict_put_str(dict, "port", addr->u.vsock.port);
>> +        break;
>> +    default:
>> +        g_assert_not_reached();
>> +        break;
>> +    }
>> +
>> +    return dict;
>> +}
>> +
>>  static SocketAddress *
>>  migrate_get_socket_address(QTestState *who, const char *parameter)
>>  {
>> @@ -83,6 +116,44 @@ migrate_get_connect_uri(QTestState *who, const char *parameter)
>>      return connect_uri;
>>  }
>>
>> +static QDict *
>> +migrate_get_connect_qdict(QTestState *who, const char *parameter)
>> +{
>> +    SocketAddress *addrs;
>> +    QDict *connect_qdict;
>> +
>> +    addrs = migrate_get_socket_address(who, parameter);
>> +    connect_qdict = SocketAddress_to_qdict(addrs);
>> +
>> +    qapi_free_SocketAddress(addrs);
>> +    return connect_qdict;
>> +}
>> +
>> +static void migrate_set_ports(QTestState *to, QList *channel_list)
>> +{
>> +    QDict *addr;
>> +    QListEntry *entry;
>> +    g_autofree const char *addr_port = NULL;
>> +
>> +    if (channel_list == NULL) {
>> +        return;
>> +    }
>> +
>> +    addr = migrate_get_connect_qdict(to, "socket-address");
>
> addr needs to be freed.

Ack. Thanks for pointing this out

>> +
>> +    QLIST_FOREACH_ENTRY(channel_list, entry) {
>> +        QDict *channel = qobject_to(QDict, qlist_entry_obj(entry));
>> +        QDict *addrdict = qdict_get_qdict(channel, "addr");
>> +
>> +        if (qdict_haskey(addrdict, "port") &&
>> +            qdict_haskey(addr, "port") &&
>> +            (strcmp(qdict_get_str(addrdict, "port"), "0") == 0)) {
>> +            addr_port = qdict_get_str(addr, "port");
>> +            qdict_put_str(addrdict, "port", addr_port);
>> +        }
>> +    }
>> +}
>> +
>>  bool migrate_watch_for_events(QTestState *who, const char *name,
>>                                QDict *event, void *opaque)
>>  {
>> @@ -141,6 +212,7 @@ void migrate_qmp(QTestState *who, QTestState *to, const char *uri,
>>  {
>>      va_list ap;
>>      QDict *args;
>> +    QList *channel_list = NULL;
>>      g_autofree char *connect_uri = NULL;
>>
>>      va_start(ap, fmt);
>> @@ -151,6 +223,7 @@ void migrate_qmp(QTestState *who, QTestState *to, const char *uri,
>>      if (!uri) {
>>          connect_uri = migrate_get_connect_uri(to, "socket-address");
>>      }
>> +    migrate_set_ports(to, channel_list);
>>      qdict_put_str(args, "uri", uri ? uri : connect_uri);
>>
>>      qtest_qmp_assert_success(who,

Regards,
Het Gala
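
The port fix-up discussed in this review (replace a placeholder port of "0" in each channel with the port the destination actually bound to) can be shown without QDict or QList. This is a minimal sketch under assumptions: struct channel and set_ports() are hypothetical stand-ins for the channel dictionaries and migrate_set_ports(), and ownership of the strings is left to the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for one entry of the 'channels' list. */
struct channel {
    const char *port;   /* NULL when the address type has no port */
};

/*
 * Replace every placeholder port "0" with dst_port, the port reported by the
 * destination. Channels with a real port or with no port are left untouched.
 * Returns the number of channels updated.
 */
static size_t set_ports(struct channel *chans, size_t n, const char *dst_port)
{
    size_t updated = 0;

    /* As in the patch, an absent channel list is simply a no-op. */
    if (chans == NULL) {
        return 0;
    }
    assert(dst_port != NULL);

    for (size_t i = 0; i < n; i++) {
        if (chans[i].port && strcmp(chans[i].port, "0") == 0) {
            chans[i].port = dst_port;   /* borrowed pointer, not a copy */
            updated++;
        }
    }
    return updated;
}
```

Because the sketch stores a borrowed pointer, dst_port must outlive the channel array; the dictionary-based helper in the patch has its own ownership rules, which is the context of the "addr needs to be freed" remark above.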

Re: [PATCH 06/13] ppc/spapr: Add pa-features for POWER10 machines

2024-03-11 Thread BALATON Zoltan


On Mon, 11 Mar 2024, Philippe Mathieu-Daudé wrote:
> On 11/3/24 19:51, Nicholas Piggin wrote:
>> From: Benjamin Gray
>>
>> Add POWER10 pa-features entry.
>>
>> Notably DEXCR and [P]HASHST/[P]HASHCHK instruction support is
>> advertised. Each DEXCR aspect is allocated a bit in the device tree,
>> using the 68--71 byte range (inclusive). The functionality of the
>> [P]HASHST/[P]HASHCHK instructions is separately declared in byte 72,
>> bit 0 (BE).
>>
>> Signed-off-by: Benjamin Gray
>> [npiggin: reword title and changelog, adjust a few bits]
>> Signed-off-by: Nicholas Piggin
>> ---
>>  hw/ppc/spapr.c | 34 ++
>>  1 file changed, 34 insertions(+)
>>
>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
>> index 247f920f07..128bfe11a8 100644
>> --- a/hw/ppc/spapr.c
>> +++ b/hw/ppc/spapr.c
>> @@ -265,6 +265,36 @@ static void spapr_dt_pa_features(SpaprMachineState *spapr,
>>          /* 60: NM atomic, 62: RNG */
>>          0x80, 0x00, 0x80, 0x00, 0x00, 0x00, /* 60 - 65 */
>>      };
>> +    /* 3.1 removes SAO, HTM support */
>> +    uint8_t pa_features_31[] = { 74, 0,
>
> Nitpicking because pre-existing, all these arrays could be static const.

If we are at it then maybe also s/0x00/   0/ because having a stream of
0x80 and 0x00 is not the most readable.

Regards,
BALATON Zoltan

>> +        /* 0: MMU|FPU|SLB|RUN|DABR|NX, 1: fri[nzpm]|DABRX|SPRG3|SLB0|PP110 */
>> +        /* 2: VPM|DS205|PPR|DS202|DS206, 3: LSD|URG, 5: LE|CFAR|EB|LSQ */
>> +        0xf6, 0x1f, 0xc7, 0xc0, 0x00, 0xf0, /* 0 - 5 */
>> +        /* 6: DS207 */
>> +        0x80, 0x00, 0x00, 0x00, 0x00, 0x00, /* 6 - 11 */
>> +        /* 16: Vector */
>> +        0x00, 0x00, 0x00, 0x00, 0x80, 0x00, /* 12 - 17 */
>> +        /* 18: Vec. Scalar, 20: Vec. XOR */
>> +        0x80, 0x00, 0x80, 0x00, 0x00, 0x00, /* 18 - 23 */
>> +        /* 24: Ext. Dec, 26: 64 bit ftrs, 28: PM ftrs */
>> +        0x80, 0x00, 0x80, 0x00, 0x80, 0x00, /* 24 - 29 */
>> +        /* 32: LE atomic, 34: EBB + ext EBB */
>> +        0x00, 0x00, 0x80, 0x00, 0xC0, 0x00, /* 30 - 35 */
>> +        /* 40: Radix MMU */
>> +        0x00, 0x00, 0x00, 0x00, 0x80, 0x00, /* 36 - 41 */
>> +        /* 42: PM, 44: PC RA, 46: SC vec'd */
>> +        0x80, 0x00, 0x80, 0x00, 0x80, 0x00, /* 42 - 47 */
>> +        /* 48: SIMD, 50: QP BFP, 52: String */
>> +        0x80, 0x00, 0x80, 0x00, 0x80, 0x00, /* 48 - 53 */
>> +        /* 54: DecFP, 56: DecI, 58: SHA */
>> +        0x80, 0x00, 0x80, 0x00, 0x80, 0x00, /* 54 - 59 */
>> +        /* 60: NM atomic, 62: RNG */
>> +        0x80, 0x00, 0x80, 0x00, 0x00, 0x00, /* 60 - 65 */
>> +        /* 68: DEXCR[SBHE|IBRTPDUS|SRAPD|NPHIE|PHIE] */
>> +        0x00, 0x00, 0xce, 0x00, 0x00, 0x00, /* 66 - 71 */
>> +        /* 72: [P]HASHCHK */
>> +        0x80, 0x00, /* 72 - 73 */
>> +    };
>>      uint8_t *pa_features = NULL;
>>      size_t pa_size;
>> @@ -280,6 +310,10 @@ static void spapr_dt_pa_features(SpaprMachineState *spapr,
>>          pa_features = pa_features_300;
>>          pa_size = sizeof(pa_features_300);
>>      }
>> +    if (ppc_check_compat(cpu, CPU_POWERPC_LOGICAL_3_10, 0, cpu->compat_pvr)) {
>> +        pa_features = pa_features_31;
>> +        pa_size = sizeof(pa_features_31);
>> +    }
>>      if (!pa_features) {
>>          return;
>>      }
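
The "byte 72, bit 0 (BE)" wording in the commit message can be made concrete with a small lookup helper. This is a sketch under assumptions: pa_feature_is_set() is a hypothetical function, not part of hw/ppc/spapr.c, and the layout it assumes is the one visible in the quoted array (a two-byte header, here 74 and 0, followed by 74 attribute bytes). Only the two bytes this patch adds are populated; every other byte is left at zero for brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Test one attribute bit in a pa-features style array. Assumed layout: pa[0]
 * is the number of attribute bytes, pa[1] is 0, and attribute byte N lives at
 * pa[2 + N]. Bits are numbered big-endian, so bit 0 is the 0x80 bit.
 */
static bool pa_feature_is_set(const uint8_t *pa, size_t pa_size,
                              unsigned byte, unsigned be_bit)
{
    assert(pa != NULL && be_bit < 8);

    if (pa_size < 2 || byte >= pa[0] || (size_t)byte + 2 >= pa_size) {
        return false;   /* byte is outside the advertised range */
    }
    return (pa[2 + byte] >> (7 - be_bit)) & 1;
}

/* Only the bytes added for POWER10 in the patch; all other bytes are 0. */
static const uint8_t pa_features_tail[76] = {
    [0] = 74,           /* attribute bytes following the two-byte header */
    [2 + 68] = 0xce,    /* DEXCR aspects */
    [2 + 72] = 0x80,    /* [P]HASHST/[P]HASHCHK */
};
```

With this numbering, 0xce in byte 68 sets BE bits 0, 1, 4, 5 and 6, and 0x80 in byte 72 sets BE bit 0.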

Re: [PATCH v6 7/7] migration/multifd: Add new migration test cases for legacy zero page checking.

On Mon, Mar 11, 2024 at 06:00:15PM +0000, Hao Xiang wrote:
> From: Hao Xiang 
> 
> Now that zero page checking is done on the multifd sender threads by
> default, we still provide an option for backward compatibility. This
> change adds a qtest migration test case to set the zero-page-detection
> option to "legacy" and run multifd migration with zero page checking on the
> migration main thread.
> 
> Signed-off-by: Hao Xiang 
> Reviewed-by: Peter Xu 
> Message-Id: <20240301022829.3390548-6-hao.xi...@bytedance.com>

We don't need to attach message-id when posting patches.  I'll attach them
when queuing patches to make sure the link points to the exact version that
we merged.

I'll drop this line and the other one (in patch 3).  No action needed from
your side; just a heads-up for the future.

-- 
Peter Xu

Re: [PATCH v4 3/8] Replace migrate_get_connect_uri inplace of migrate_get_socket_address



On 12/03/24 2:21 am, Fabiano Rosas wrote:
> Het Gala writes:
>
>> On 11/03/24 11:49 pm, Fabiano Rosas wrote:
>>> Het Gala writes:
>>>
>>>>  bool migrate_watch_for_events(QTestState *who, const char *name,
>>>> @@ -130,7 +140,7 @@ void migrate_qmp(QTestState *who, QTestState *to, const char *uri,
>>>>
>>>>      g_assert(!qdict_haskey(args, "uri"));
>>>>      if (!uri) {
>>>> -        connect_uri = migrate_get_socket_address(to, "socket-address");
>>>> +        connect_uri = migrate_get_connect_uri(to, "socket-address");
>>>
>>> What's the point of the "socket-address" argument here? Seems a bit
>>> nonsensical to me to call: migrate_get_socket_address(..., "socket-address").
>>>
>>> What about we just suppress this throughout the stack and directly call:
>>>
>>>   object = qdict_get(rsp, "socket-address");
>>
>> Fabiano, I didn't get clearly understand your point here. From what I
>> understand, you want to call just
>> 1. migrate_get_connect_uri(to) and migrate_get_connect_qdict(to)
>
> Yes.

Ack

>> 2. delete migrate_get_socket_address(..., "socket-address") altogether
>
> No, just the string argument, not the whole function:
>
> static char *migrate_get_socket_address(QTestState *who) <---
> {
>     QDict *rsp;
>     char *result;
>     SocketAddressList *addrs;
>     Visitor *iv = NULL;
>     QObject *object;
>
>     rsp = migrate_query(who);
>     object = qdict_get(rsp, "socket-address"); <---
>     ...
> }
>
> If the thing is called migrate_get_SOCKET_ADDRESS(), it's obvious that
> the "socket-address" is the parameter we want. We even call
> SocketAddress_to_str, so there's no point in having that argument
> there. We will never call the function with something else in
> 'parameter'.

Ahh, okay. I got your point, and yes, it makes sense. Will just call
migrate_get_socket_address(to) and let it fetch "socket-address" via
qdict_get() internally.


Regards,

Het Gala

Re: [PATCH v4 3/8] Replace migrate_get_connect_uri inplace of migrate_get_socket_address

Het Gala  writes:

> On 11/03/24 11:49 pm, Fabiano Rosas wrote:
>> Het Gala  writes:
>>
>>>
>>>   bool migrate_watch_for_events(QTestState *who, const char *name,
>>> @@ -130,7 +140,7 @@ void migrate_qmp(QTestState *who, QTestState *to, const 
>>> char *uri,
>>>   
>>>   g_assert(!qdict_haskey(args, "uri"));
>>>   if (!uri) {
>>> -connect_uri = migrate_get_socket_address(to, "socket-address");
>>> +connect_uri = migrate_get_connect_uri(to, "socket-address");
>> What's the point of the "socket-address" argument here? Seems a bit
>> nonsensical to me to call: migrate_get_socket_address(..., "socket-address").
>>
>> What about we just suppress this throughout the stack and directly call:
>>
>>  object = qdict_get(rsp, "socket-address");
>
> Fabiano, I didn't get clearly understand your point here. From what I 
> understand,
> you want to call just
> 1. migrate_get_connect_uri(to) and migrate_get_connect_qdict(to)

Yes.

> 2. delete migrate_get_socket_address(..., "socket-address") altogether 

No, just the string argument, not the whole function:

static char *migrate_get_socket_address(QTestState *who) <---
{
    QDict *rsp;
    char *result;
    SocketAddressList *addrs;
    Visitor *iv = NULL;
    QObject *object;

    rsp = migrate_query(who);
    object = qdict_get(rsp, "socket-address"); <---
    ...
}

If the thing is called migrate_get_SOCKET_ADDRESS(), it's obvious that
the "socket-address" is the parameter we want. We even call
SocketAddress_to_str, so there's no point in having that argument
there. We will never call the function with something else in
'parameter'.

Re: [PATCH v6 6/7] migration/multifd: Enable multifd zero page checking by default.

Hao Xiang  writes:

> From: Hao Xiang 
>
> 1. Set default "zero-page-detection" option to "multifd". Now
> zero page checking can be done in the multifd threads and this
> becomes the default configuration.
> 2. Handle migration QEMU9.0 -> QEMU8.2 compatibility. We provide
> backward compatibility where zero page checking is done from the
> migration main thread.
>
> Signed-off-by: Hao Xiang 

Reviewed-by: Fabiano Rosas

Re: [PATCH v6 0/7] Introduce multifd zero page checking.

On Mon, Mar 11, 2024 at 06:00:08PM +0000, Hao Xiang wrote:
> v6 update:
> * Make ZERO_PAGE_DETECTION_NONE option work in legacy migration.
> * Rebase on top of 7489f7f3f81dcb776df8c1b9a9db281fc21bf05f.

Queued, thanks.

-- 
Peter Xu

Re: [PATCH v6 6/7] migration/multifd: Enable multifd zero page checking by default.

On Mon, Mar 11, 2024 at 06:00:14PM +0000, Hao Xiang wrote:
> From: Hao Xiang 
> 
> 1. Set default "zero-page-detection" option to "multifd". Now
> zero page checking can be done in the multifd threads and this
> becomes the default configuration.
> 2. Handle migration QEMU9.0 -> QEMU8.2 compatibility. We provide
> backward compatibility where zero page checking is done from the
> migration main thread.
> 
> Signed-off-by: Hao Xiang 

Reviewed-by: Peter Xu 

-- 
Peter Xu

Re: [PATCH] target/i386: fix direction of "32-bit MMU" test

2024-03-11 Thread Mark Cave-Ayland


On 11/03/2024 07:58, Paolo Bonzini wrote:

> The low bit of MMU indices for x86 TCG indicates whether the processor is
> in 32-bit mode and therefore linear addresses have to be masked to 32 bits.
> However, the index was computed incorrectly, leading to possible conflicts
> in the TLB for any address above 4G.
>
> Analyzed-by: Mark Cave-Ayland
> Fixes: b1661801c18 ("target/i386: Fix physical address truncation", 2024-02-28)
> Cc: qemu-sta...@nongnu.org
> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2206
> Signed-off-by: Paolo Bonzini
> ---
>  target/i386/cpu.h | 2 +-
>  target/i386/cpu.c | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/target/i386/cpu.h b/target/i386/cpu.h
> index 952174bb6f5..6b057380791 100644
> --- a/target/i386/cpu.h
> +++ b/target/i386/cpu.h
> @@ -2334,7 +2334,7 @@ static inline bool is_mmu_index_32(int mmu_index)
>
>  static inline int cpu_mmu_index_kernel(CPUX86State *env)
>  {
> -    int mmu_index_32 = (env->hflags & HF_LMA_MASK) ? 1 : 0;
> +    int mmu_index_32 = (env->hflags & HF_LMA_MASK) ? 0 : 1;
>      int mmu_index_base =
>          !(env->hflags & HF_SMAP_MASK) ? MMU_KNOSMAP64_IDX :
>          ((env->hflags & HF_CPL_MASK) < 3 && (env->eflags & AC_MASK)) ? MMU_KNOSMAP64_IDX : MMU_KSMAP64_IDX;
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index 2666ef38089..78524bc6073 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -7735,7 +7735,7 @@ static bool x86_cpu_has_work(CPUState *cs)
>  static int x86_cpu_mmu_index(CPUState *cs, bool ifetch)
>  {
>      CPUX86State *env = cpu_env(cs);
> -    int mmu_index_32 = (env->hflags & HF_CS64_MASK) ? 1 : 0;
> +    int mmu_index_32 = (env->hflags & HF_CS64_MASK) ? 0 : 1;
>      int mmu_index_base =
>          (env->hflags & HF_CPL_MASK) == 3 ? MMU_USER64_IDX :
>          !(env->hflags & HF_SMAP_MASK) ? MMU_KNOSMAP64_IDX :


LGTM. I've just done a few quick Windows boot tests, and all of Win98SE, WinXP and 
Win7 64-bit now appear to be working fine with this patch so:


Tested-by: Mark Cave-Ayland 


ATB,

Mark.
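
The direction fix can be checked in isolation with a small sketch. It assumes the convention stated in the commit message (the low bit of an MMU index marks the 32-bit variant) and that the final index is the even "64" base plus that bit. The SKETCH_ constants and sketch_ functions below are illustrative stand-ins; the real masks and index values are defined in target/i386/cpu.h and may differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative values only; the real definitions live in target/i386/cpu.h. */
#define SKETCH_HF_CS64_MASK   (1u << 15)
#define SKETCH_MMU_USER64_IDX 2   /* even base index: the 64-bit variant */

/* The low bit of an MMU index marks the 32-bit variant. */
static bool sketch_is_mmu_index_32(int mmu_index)
{
    assert(mmu_index >= 0);
    return mmu_index & 1;
}

/*
 * Corrected direction: the low bit is set when the CPU is NOT running 64-bit
 * code, so that addresses are masked to 32 bits only in that case.
 */
static int sketch_user_mmu_index(uint32_t hflags)
{
    int mmu_index_32 = (hflags & SKETCH_HF_CS64_MASK) ? 0 : 1;

    return SKETCH_MMU_USER64_IDX + mmu_index_32;
}
```

With the pre-fix ternary (1 : 0), a 64-bit code segment would have selected the odd index and had its addresses masked to 32 bits, which matches the TLB conflicts above 4G described in the commit message.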

Re: [PATCH V2 00/11] privatize migration.h

2024-03-11 Thread Steven Sistare


On 3/11/2024 4:28 PM, Peter Xu wrote:
> On Mon, Mar 11, 2024 at 04:24:14PM -0400, Steven Sistare wrote:
>> On 3/11/2024 3:45 PM, Steven Sistare wrote:
>>> On 3/11/2024 3:30 PM, Peter Xu wrote:
>>>> Steve,
>>>>
>>>> On Mon, Mar 11, 2024 at 10:48:47AM -0700, Steve Sistare wrote:
>>>>> Changes in V2:
>>>>>    * rebase to migration-next, add RB
>>>>
>>>> Not apply even to master branch.  Note that there're >=1 PULLs sent and
>>>> merged since my last reply..  Perhaps you rebased to the "old" next?
>>>
>>> I pulled from branch migration-next in https://gitlab.com/peterx/qemu a
>>> few hours ago, but I must have screwed up somewhere.  I'll figure it out
>>> and post a V4.
>>
>> My pull was a few hours old, but my patches still apply cleanly to the
>> most recent tip:
>>    a1bb5dd169f4 ("migration: Fix format in error message")
>>
>> I can send that as V3, but ...
>> Note that you must apply "migration: export fewer options" before
>> "privatize migration.h".  If that does not help, I will send V3.
>
> Ouch, I forgot that dependency... Sorry.
>
> Yeah it works now.  No need to resend for now.


Great! - steve

Re: [PATCH v4 3/8] Replace migrate_get_connect_uri inplace of migrate_get_socket_address



On 11/03/24 11:49 pm, Fabiano Rosas wrote:
> Het Gala writes:
>
>>  bool migrate_watch_for_events(QTestState *who, const char *name,
>> @@ -130,7 +140,7 @@ void migrate_qmp(QTestState *who, QTestState *to, const char *uri,
>>
>>      g_assert(!qdict_haskey(args, "uri"));
>>      if (!uri) {
>> -        connect_uri = migrate_get_socket_address(to, "socket-address");
>> +        connect_uri = migrate_get_connect_uri(to, "socket-address");
>
> What's the point of the "socket-address" argument here? Seems a bit
> nonsensical to me to call: migrate_get_socket_address(..., "socket-address").
>
> What about we just suppress this throughout the stack and directly call:
>
>   object = qdict_get(rsp, "socket-address");

Fabiano, I didn't clearly understand your point here. From what I
understand, you want to:
1. call just migrate_get_connect_uri(to) and migrate_get_connect_qdict(to)
2. delete migrate_get_socket_address(..., "socket-address") altogether
3. just call qdict_get(rsp, "socket-address"), which will return an object
4. then convert this object into a qdict and a uri string respectively?

Hmm, if that's the case, converting to a qdict shouldn't be a problem. But
for the uri string, is there a simpler method, or would writing a parsing
function be needed?



Regards,

Het Gala

Re: [PATCH V2 00/11] privatize migration.h

On Mon, Mar 11, 2024 at 04:24:14PM -0400, Steven Sistare wrote:
> On 3/11/2024 3:45 PM, Steven Sistare wrote:
> > On 3/11/2024 3:30 PM, Peter Xu wrote:
> > > Steve,
> > > 
> > > On Mon, Mar 11, 2024 at 10:48:47AM -0700, Steve Sistare wrote:
> > > > Changes in V2:
> > > >    * rebase to migration-next, add RB
> > > 
> > > Not apply even to master branch.  Note that there're >=1 PULLs sent and
> > > merged since my last reply..  Perhaps you rebased to the "old" next?
> > 
> > I pulled from branch migration-next in https://gitlab.com/peterx/qemu a
> > few hours ago, but I must have screwed up somewhere.  I'll figure it out
> > and post a V4.
> 
> My pull was a few hours old, but my patches still apply cleanly to the
> most recent tip:
>   a1bb5dd169f4 ("migration: Fix format in error message")
> 
> I can send that as V3, but ...
> Note that you must apply "migration: export fewer options" before
> "privatize migration.h".  If that does not help, I will send V3.

Ouch, I forgot that dependency... Sorry.

Yeah it works now.  No need to resend for now.

-- 
Peter Xu

Re: [PATCH V2 00/11] privatize migration.h

2024-03-11 Thread Steven Sistare


On 3/11/2024 3:45 PM, Steven Sistare wrote:
> On 3/11/2024 3:30 PM, Peter Xu wrote:
>> Steve,
>>
>> On Mon, Mar 11, 2024 at 10:48:47AM -0700, Steve Sistare wrote:
>>> Changes in V2:
>>>    * rebase to migration-next, add RB
>>
>> Not apply even to master branch.  Note that there're >=1 PULLs sent and
>> merged since my last reply..  Perhaps you rebased to the "old" next?
>
> I pulled from branch migration-next in https://gitlab.com/peterx/qemu a
> few hours ago, but I must have screwed up somewhere.  I'll figure it out
> and post a V4.

My pull was a few hours old, but my patches still apply cleanly to the
most recent tip:
  a1bb5dd169f4 ("migration: Fix format in error message")

I can send that as V3, but ...
Note that you must apply "migration: export fewer options" before
"privatize migration.h".  If that does not help, I will send V3.

- Steve

Re: [PATCH v4 00/25] migration: Improve error reporting

On Fri, Mar 08, 2024 at 04:15:08PM +0800, Peter Xu wrote:
> On Wed, Mar 06, 2024 at 02:34:15PM +0100, Cédric Le Goater wrote:
> > * [1-4] already queued in migration-next.
> >   
> >   migration: Report error when shutdown fails
> >   migration: Remove SaveStateHandler and LoadStateHandler typedefs
> >   migration: Add documentation for SaveVMHandlers
> >   migration: Do not call PRECOPY_NOTIFY_SETUP notifiers in case of error
> >   
> > * [5-9] are prerequisite changes in other components related to the
> >   migration save_setup() handler. They make sure a failure is not
> >   returned without setting an error.
> >   
> >   s390/stattrib: Add Error** argument to set_migrationmode() handler
> >   vfio: Always report an error in vfio_save_setup()
> >   migration: Always report an error in block_save_setup()
> >   migration: Always report an error in ram_save_setup()
> >   migration: Add Error** argument to vmstate_save()
> > 
> > * [10-15] are the core changes in migration and memory components to
> >   propagate an error reported in a save_setup() handler.
> > 
> >   migration: Add Error** argument to qemu_savevm_state_setup()
> >   migration: Add Error** argument to .save_setup() handler
> >   migration: Add Error** argument to .load_setup() handler
> 
> Further queued 5-12 in migration-staging (until here), thanks.

Just to keep a record: due to the virtio failover test failure and the
other block migration uncertainty in patch 7 (in which case we may want to
have a fix on sectors==0 case), I unqueued this chunk for 9.0.

Thanks,

-- 
Peter Xu

Re: [PATCH v6 4/7] migration/multifd: Implement zero page transmission on the multifd thread.

Hao Xiang  writes:

> From: Hao Xiang 
>
> 1. Add zero_pages field in MultiFDPacket_t.
> 2. Implements the zero page detection and handling on the multifd
> threads for non-compression, zlib and zstd compression backends.
> 3. Added a new value 'multifd' in ZeroPageDetection enumeration.
> 4. Adds zero page counters and updates multifd send/receive tracing
> format to track the newly added counters.
>
> Signed-off-by: Hao Xiang 
> Acked-by: Markus Armbruster 

Reviewed-by: Fabiano Rosas
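
Item 2 of the quoted commit message (zero page detection on the multifd threads) comes down to classifying each page in a batch before it is sent. The following is a minimal sketch of that idea, not the code from the patch: page_is_zero() and partition_zero_pages() are hypothetical names, the byte-by-byte scan stands in for whatever optimized zero check QEMU uses, and the "non-zero first, zero last" ordering is an assumption made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Return true when every byte of the page is zero. */
static bool page_is_zero(const uint8_t *page, size_t page_size)
{
    for (size_t i = 0; i < page_size; i++) {
        if (page[i] != 0) {
            return false;
        }
    }
    return true;
}

/*
 * Reorder offsets[] in place so that offsets of non-zero pages come first and
 * offsets of zero pages come last. Returns the number of non-zero pages; the
 * remaining count - result entries are the zero pages.
 */
static size_t partition_zero_pages(const uint8_t *ram, size_t page_size,
                                   size_t *offsets, size_t count)
{
    size_t i = 0;       /* next unclassified entry */
    size_t j = count;   /* start of the zero-page tail (exclusive bound) */

    assert(ram != NULL && offsets != NULL && page_size > 0);

    while (i < j) {
        if (!page_is_zero(ram + offsets[i], page_size)) {
            i++;
            continue;
        }
        /* Zero page: swap it to the tail and shrink the unclassified range. */
        j--;
        size_t tmp = offsets[i];
        offsets[i] = offsets[j];
        offsets[j] = tmp;
    }
    return i;
}
```

A sender built this way would transmit page contents only for the first group and just the offsets for the second, which is where the zero page counters mentioned in item 4 would be incremented.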

Re: [PATCH v4 10/25] migration: Add Error** argument to qemu_savevm_state_setup()

On Mon, Mar 11, 2024 at 07:12:11PM +0100, Cédric Le Goater wrote:
> On 3/8/24 15:17, Peter Xu wrote:
> > On Fri, Mar 08, 2024 at 02:55:30PM +0100, Cédric Le Goater wrote:
> > > On 3/8/24 14:39, Cédric Le Goater wrote:
> > > > On 3/8/24 14:14, Cédric Le Goater wrote:
> > > > > On 3/8/24 13:56, Peter Xu wrote:
> > > > > > On Wed, Mar 06, 2024 at 02:34:25PM +0100, Cédric Le Goater wrote:
> > > > > > > This prepares ground for the changes coming next which add an Error**
> > > > > > > argument to the .save_setup() handler. Callers of qemu_savevm_state_setup()
> > > > > > > now handle the error and fail earlier setting the migration state from
> > > > > > > MIGRATION_STATUS_SETUP to MIGRATION_STATUS_FAILED.
> > > > > > > 
> > > > > > > In qemu_savevm_state(), move the cleanup to preserve the error
> > > > > > > reported by .save_setup() handlers.
> > > > > > > 
> > > > > > > Since the previous behavior was to ignore errors at this step of
> > > > > > > migration, this change should be examined closely to check that
> > > > > > > cleanups are still correctly done.
> > > > > > > 
> > > > > > > Signed-off-by: Cédric Le Goater 
> > > > > > > ---
> > > > > > > 
> > > > > > >    Changes in v4:
> > > > > > >    - Merged cleanup change in qemu_savevm_state()
> > > > > > >    Changes in v3:
> > > > > > >    - Set migration state to MIGRATION_STATUS_FAILED
> > > > > > >    - Fixed error handling to be done under lock in 
> > > > > > > bg_migration_thread()
> > > > > > >    - Made sure an error is always set in case of failure in
> > > > > > >      qemu_savevm_state_setup()
> > > > > > >    migration/savevm.h    |  2 +-
> > > > > > >    migration/migration.c | 27 ---
> > > > > > >    migration/savevm.c    | 26 +++---
> > > > > > >    3 files changed, 40 insertions(+), 15 deletions(-)
> > > > > > > 
> > > > > > > diff --git a/migration/savevm.h b/migration/savevm.h
> > > > > > > index 
> > > > > > > 74669733dd63a080b765866c703234a5c4939223..9ec96a995c93a42aad621595f0ed58596c532328
> > > > > > >  100644
> > > > > > > --- a/migration/savevm.h
> > > > > > > +++ b/migration/savevm.h
> > > > > > > @@ -32,7 +32,7 @@
> > > > > > >    bool qemu_savevm_state_blocked(Error **errp);
> > > > > > >    void qemu_savevm_non_migratable_list(strList **reasons);
> > > > > > >    int qemu_savevm_state_prepare(Error **errp);
> > > > > > > -void qemu_savevm_state_setup(QEMUFile *f);
> > > > > > > +int qemu_savevm_state_setup(QEMUFile *f, Error **errp);
> > > > > > >    bool qemu_savevm_state_guest_unplug_pending(void);
> > > > > > >    int qemu_savevm_state_resume_prepare(MigrationState *s);
> > > > > > >    void qemu_savevm_state_header(QEMUFile *f);
> > > > > > > diff --git a/migration/migration.c b/migration/migration.c
> > > > > > > index 
> > > > > > > a49fcd53ee19df1ce0182bc99d7e064968f0317b..6d1544224e96f5edfe56939a9c8395d88ef29581
> > > > > > >  100644
> > > > > > > --- a/migration/migration.c
> > > > > > > +++ b/migration/migration.c
> > > > > > > @@ -3408,6 +3408,8 @@ static void *migration_thread(void *opaque)
> > > > > > >    int64_t setup_start = qemu_clock_get_ms(QEMU_CLOCK_HOST);
> > > > > > >    MigThrError thr_error;
> > > > > > >    bool urgent = false;
> > > > > > > +    Error *local_err = NULL;
> > > > > > > +    int ret;
> > > > > > >    thread = migration_threads_add("live_migration", 
> > > > > > > qemu_get_thread_id());
> > > > > > > @@ -3451,9 +3453,17 @@ static void *migration_thread(void *opaque)
> > > > > > >    }
> > > > > > >    bql_lock();
> > > > > > > -    qemu_savevm_state_setup(s->to_dst_file);
> > > > > > > > +    ret = qemu_savevm_state_setup(s->to_dst_file, &local_err);
> > > > > > >    bql_unlock();
> > > > > > > +    if (ret) {
> > > > > > > +    migrate_set_error(s, local_err);
> > > > > > > +    error_free(local_err);
> > > > > > > > +    migrate_set_state(&s->state, MIGRATION_STATUS_SETUP,
> > > > > > > +  MIGRATION_STATUS_FAILED);
> > > > > > > +    goto out;
> > > > > > > + }
> > > > > > 
> > > > > > There's a small indent issue, I can fix it.
> > > > > 
> > > > > > checkpatch didn't report anything.
> > > > > 
> > > > > > 
> > > > > > The bigger problem is I _think_ this will trigger a ci failure in 
> > > > > > the
> > > > > > virtio-net-failover test:
> > > > > > 
> > > > > > ▶ 121/464 
> > > > > > ERROR:../tests/qtest/virtio-net-failover.c:1203:test_migrate_abort_wait_unplug:
> > > > > >  assertion failed (status == "cancelling"): ("cancelled" == 
> > > > > > "cancelling") ERROR
> > > > > > 121/464 qemu:qtest+qtest-x86_64 / qtest-x86_64/virtio-net-failover  
> > > > > >   ERROR    4.77s   killed by signal 6 SIGABRT
> > > > > > > > > PYTHON=/builds/peterx/qemu/build/pyvenv/bin/python3.8 
> > > > > > > > > G_TEST_DBUS_DAEMON=/builds/peterx/qemu/tests/dbus-vmstate-daemon.sh
> > > > > > > > >  MALLOC_PERTURB_=161 QTEST_QEMU_IMG=./qemu-img 
> > > > >

[PATCH] virtio-blk: iothread-vq-mapping coroutine pool sizing

2024-03-11 Thread Stefan Hajnoczi

It is possible to hit the sysctl vm.max_map_count limit when the
coroutine pool size becomes large. Each coroutine requires two mappings
(one for the stack and one for the guard page). QEMU can crash with
"failed to set up stack guard page" or "failed to allocate memory for
stack" when this happens.

Coroutine pool sizing is simple when there is only one AioContext: sum
up all I/O requests across all virtqueues.

When the iothread-vq-mapping option is used we should calculate tighter
bounds: sum the I/O requests of the virtqueues handled by each AioContext
and take the maximum of those per-AioContext totals. This number is lower
than simply summing all virtqueues when only a subset of the virtqueues is
handled by each AioContext.

This is not a solution to hitting vm.max_map_count, but it helps. A
guest with 64 vCPUs (hence 64 virtqueues) across 4 IOThreads with one
iothread-vq-mapping virtio-blk device and a root disk without
iothread-vq-mapping goes from pool_max_size 16,448 to 10,304.

Reported-by: Sanjay Rao 
Reported-by: Boaz Ben Shabat 
Signed-off-by: Stefan Hajnoczi 
---
 include/hw/virtio/virtio-blk.h |  2 ++
 hw/block/virtio-blk.c  | 34 --
 2 files changed, 34 insertions(+), 2 deletions(-)

diff --git a/include/hw/virtio/virtio-blk.h b/include/hw/virtio/virtio-blk.h
index 5c14110c4b..ac29700ad4 100644
--- a/include/hw/virtio/virtio-blk.h
+++ b/include/hw/virtio/virtio-blk.h
@@ -74,6 +74,8 @@ struct VirtIOBlock {
 uint64_t host_features;
 size_t config_size;
 BlockRAMRegistrar blk_ram_registrar;
+
+unsigned coroutine_pool_size;
 };
 
 typedef struct VirtIOBlockReq {
diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index 738cb2ac36..0a14b2b175 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -1957,6 +1957,35 @@ static void virtio_blk_stop_ioeventfd(VirtIODevice *vdev)
 s->ioeventfd_stopping = false;
 }
 
+/* Increase the coroutine pool size to include our I/O requests */
+static void virtio_blk_inc_coroutine_pool_size(VirtIOBlock *s)
+{
+VirtIOBlkConf *conf = &s->conf;
+unsigned max_requests = 0;
+
+/* Tracks the total number of requests for AioContext */
+g_autoptr(GHashTable) counters = g_hash_table_new(NULL, NULL);
+
+/* Call this function after setting up vq_aio_context[] */
+assert(s->vq_aio_context);
+
+for (unsigned i = 0; i < conf->num_queues; i++) {
+AioContext *ctx = s->vq_aio_context[i];
+unsigned n = GPOINTER_TO_UINT(g_hash_table_lookup(counters, ctx));
+
+n += conf->queue_size / 2; /* this is a heuristic */
+
+g_hash_table_insert(counters, ctx, GUINT_TO_POINTER(n));
+
+if (n > max_requests) {
+max_requests = n;
+}
+}
+
+qemu_coroutine_inc_pool_size(max_requests);
+s->coroutine_pool_size = max_requests; /* stash it for ->unrealize() */
+}
+
 static void virtio_blk_device_realize(DeviceState *dev, Error **errp)
 {
 VirtIODevice *vdev = VIRTIO_DEVICE(dev);
@@ -2048,7 +2077,6 @@ static void virtio_blk_device_realize(DeviceState *dev, Error **errp)
 for (i = 0; i < conf->num_queues; i++) {
 virtio_add_queue(vdev, conf->queue_size, virtio_blk_handle_output);
 }
-qemu_coroutine_inc_pool_size(conf->num_queues * conf->queue_size / 2);
 
 /* Don't start ioeventfd if transport does not support notifiers. */
 if (!virtio_device_ioeventfd_enabled(vdev)) {
@@ -2065,6 +2093,8 @@ static void virtio_blk_device_realize(DeviceState *dev, Error **errp)
 return;
 }
 
+virtio_blk_inc_coroutine_pool_size(s);
+
 /*
  * This must be after virtio_init() so virtio_blk_dma_restart_cb() gets
  * called after ->start_ioeventfd() has already set blk's AioContext.
@@ -2096,7 +2126,7 @@ static void virtio_blk_device_unrealize(DeviceState *dev)
 for (i = 0; i < conf->num_queues; i++) {
 virtio_del_queue(vdev, i);
 }
-qemu_coroutine_dec_pool_size(conf->num_queues * conf->queue_size / 2);
+qemu_coroutine_dec_pool_size(s->coroutine_pool_size);
 qemu_mutex_destroy(&s->rq_lock);
 blk_ram_registrar_destroy(&s->blk_ram_registrar);
 qemu_del_vm_change_state_handler(s->change);
-- 
2.44.0
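
The sizing arithmetic described in the virtio-blk patch above can be
sketched as a small standalone C file. This is only an illustration, not
QEMU code: it replaces the GHashTable with a plain array, and the queue
size of 256, the base pool size of 64, and the round-robin placement of
virtqueues on IOThreads are assumptions chosen because they reproduce the
16,448 and 10,304 figures quoted in the commit message. All names below
are hypothetical.

```c
#include <assert.h>

#define QUEUE_SIZE 256u  /* assumed virtqueue size */
#define BASE_POOL   64u  /* assumed pool size before any device is added */
#define NUM_QUEUES  64u  /* 64 vCPUs, hence 64 virtqueues per device */
#define MAX_CTX     64u  /* upper bound on contexts in this sketch */

/* Old sizing: add up the requests of every virtqueue of the device. */
static unsigned pool_inc_sum(unsigned num_queues)
{
    return num_queues * QUEUE_SIZE / 2;
}

/*
 * New sizing: add up requests per context, then take the largest total.
 * vq_ctx[i] is the index of the context that handles virtqueue i.
 */
static unsigned pool_inc_max(const unsigned *vq_ctx, unsigned num_queues)
{
    unsigned counters[MAX_CTX] = {0};
    unsigned max_requests = 0;

    for (unsigned i = 0; i < num_queues; i++) {
        assert(vq_ctx[i] < MAX_CTX);
        counters[vq_ctx[i]] += QUEUE_SIZE / 2; /* same heuristic as the patch */
        if (counters[vq_ctx[i]] > max_requests) {
            max_requests = counters[vq_ctx[i]];
        }
    }
    return max_requests;
}

/*
 * Two-disk example from the commit message: one device spread over
 * 4 IOThreads and one root disk handled by a single context.
 */
static unsigned example_pool_max_size(int per_context)
{
    unsigned mapped[NUM_QUEUES], root[NUM_QUEUES];

    for (unsigned i = 0; i < NUM_QUEUES; i++) {
        mapped[i] = i % 4; /* assumed round-robin over 4 IOThreads */
        root[i] = 0;       /* every virtqueue in one context */
    }
    if (!per_context) {
        return BASE_POOL + pool_inc_sum(NUM_QUEUES) + pool_inc_sum(NUM_QUEUES);
    }
    return BASE_POOL + pool_inc_max(mapped, NUM_QUEUES)
                     + pool_inc_max(root, NUM_QUEUES);
}
```

Under these assumptions example_pool_max_size(0) gives the old total and
example_pool_max_size(1) the new one. The root disk contributes the same
amount either way because all of its virtqueues share one context; only
the mapped device shrinks, from 64 to 16 virtqueues' worth of requests.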

Re: [PATCH v4 2/8] Replace connect_uri and move migrate_get_socket_address inside migrate_qmp




On 11/03/24 11:46 pm, Fabiano Rosas wrote:

Het Gala  writes:


Move the calls to migrate_get_socket_address() into migrate_qmp().
Get rid of connect_uri and replace it with args->connect_uri only
because 'to' object will help to generate connect_uri with the
correct port number.

Signed-off-by: Het Gala 
Suggested-by: Fabiano Rosas 
Reviewed-by: Fabiano Rosas 
---
  tests/qtest/migration-helpers.c | 55 ++-
  tests/qtest/migration-test.c| 79 +
  2 files changed, 64 insertions(+), 70 deletions(-)

diff --git a/tests/qtest/migration-helpers.c b/tests/qtest/migration-helpers.c
index b6206a04fb..9af3c7d4d5 100644
--- a/tests/qtest/migration-helpers.c
+++ b/tests/qtest/migration-helpers.c
@@ -13,6 +13,10 @@
  #include "qemu/osdep.h"
  #include "qemu/ctype.h"
  #include "qapi/qmp/qjson.h"
+#include "qemu/sockets.h"
+#include "qapi/qapi-visit-sockets.h"
+#include "qapi/qobject-input-visitor.h"
+#include "qapi/error.h"

Are any of these now superfluous at migration-test.c?


Yes, actually all of them are now redundant or no longer required in 
migration-test.c. I will remove those includes from there.



Regards,

Het Gala

Re: [PATCH v4 10/25] migration: Add Error** argument to qemu_savevm_state_setup()