Re: [RFC PATCH] tests/vm: update openbsd image to 7.4

2024-02-26 Thread Thomas Huth

On 26/02/2024 23.48, Alex Bennée wrote:

The old links are dead so even if we have the ISO cached we can't
finish the install. Update to the current stable and tweak the install
strings.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2192
Signed-off-by: Alex Bennée 
---
  tests/vm/openbsd | 9 +
  1 file changed, 5 insertions(+), 4 deletions(-)


Thanks, this seems to work fine:

Tested-by: Thomas Huth 




Re: [PATCH v3 21/27] plugins: add an API to read registers

2024-02-26 Thread Akihiko Odaki

On 2024/02/27 1:56, Alex Bennée wrote:

We can only request a list of registers once the vCPU has been
initialised so the user needs to either call the get function on
vCPU initialisation or during the translation phase.

We don't expose the reg number to the plugin, instead hiding it behind
an opaque handle. For now this is just the gdb_regnum encapsulated in
an anonymous GPOINTER but in future as we add more state for plugins
to track we can expand it.
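
The opaque-handle idea described above can be illustrated with a minimal,
self-contained sketch. This is not the code from the patch: the names
reg_handle, reg_handle_wrap() and reg_handle_unwrap() are hypothetical, and
storing regnum + 1 (so that register 0 stays distinct from NULL) is an
assumption of this sketch, not something the commit message states.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Opaque to callers: the struct is declared but never defined. */
struct reg_handle;

/*
 * Hypothetical helper (not QEMU API): hide a register number inside a
 * pointer. Storing regnum + 1 keeps register 0 distinct from NULL.
 */
static struct reg_handle *reg_handle_wrap(int regnum)
{
    assert(regnum >= 0);
    return (struct reg_handle *)(uintptr_t)(regnum + 1);
}

/* Hypothetical helper: recover the register number, or -1 for NULL. */
static int reg_handle_unwrap(const struct reg_handle *h)
{
    return h ? (int)(uintptr_t)h - 1 : -1;
}
```

Because callers only ever see a pointer to an incomplete type, the
representation could later become a real structure without changing the
interface, which is the point the commit message makes.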

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1706
Cc: Akihiko Odaki 
Message-Id: <20240103173349.398526-39-alex.ben...@linaro.org>
Based-on: <20231025093128.33116-18-akihiko.od...@daynix.com>
Signed-off-by: Alex Bennée 
Reviewed-by: Pierrick Bouvier 


Hi,

Mostly looks good. I have a few trivial comments so please have a look 
at them.




---
v4
   - the get/read_registers functions are now implicitly for current
   vCPU only to avoid accidental cpu != current_cpu uses.
v5
   - make reg_handles as per-CPUPluginState variable.
v6
   - for now just wrap gdb_regnum
---
  include/qemu/qemu-plugin.h   | 48 +--
  plugins/api.c| 56 
  plugins/qemu-plugins.symbols |  2 ++
  3 files changed, 104 insertions(+), 2 deletions(-)

diff --git a/include/qemu/qemu-plugin.h b/include/qemu/qemu-plugin.h
index 93981f8f89f..3b6b18058d2 100644
--- a/include/qemu/qemu-plugin.h
+++ b/include/qemu/qemu-plugin.h
@@ -11,6 +11,7 @@
  #ifndef QEMU_QEMU_PLUGIN_H
  #define QEMU_QEMU_PLUGIN_H
  
+#include <glib.h>

  #include <inttypes.h>
  #include <stdbool.h>
  #include <stddef.h>
@@ -229,8 +230,8 @@ struct qemu_plugin_insn;
   * @QEMU_PLUGIN_CB_R_REGS: callback reads the CPU's regs
   * @QEMU_PLUGIN_CB_RW_REGS: callback reads and writes the CPU's regs
   *
- * Note: currently unused, plugins cannot read or change system
- * register state.
+ * Note: currently QEMU_PLUGIN_CB_RW_REGS is unused, plugins cannot change
+ * system register state.
   */
  enum qemu_plugin_cb_flags {
  QEMU_PLUGIN_CB_NO_REGS,
@@ -707,4 +708,47 @@ uint64_t qemu_plugin_end_code(void);
  QEMU_PLUGIN_API
  uint64_t qemu_plugin_entry_code(void);
  
+/** struct qemu_plugin_register - Opaque handle for register access */

+struct qemu_plugin_register;
+
+/**
+ * typedef qemu_plugin_reg_descriptor - register descriptions
+ *
+ * @handle: opaque handle for retrieving value with qemu_plugin_read_register
+ * @name: register name
+ * @feature: optional feature descriptor, can be NULL
+ */
+typedef struct {
+struct qemu_plugin_register *handle;
+const char *name;
+const char *feature;
+} qemu_plugin_reg_descriptor;
+
+/**
+ * qemu_plugin_get_registers() - return register list for current vCPU
+ *
+ * Returns a GArray of qemu_plugin_reg_descriptor or NULL. Caller
+ * frees the array (but not the const strings).
+ *
+ * Should be used from a qemu_plugin_register_vcpu_init_cb() callback
+ * after the vCPU is initialised, i.e. in the vCPU context.
+ */
+GArray *qemu_plugin_get_registers(void);
+
+/**
+ * qemu_plugin_read_register() - read register for current vCPU
+ *
+ * @handle: a @qemu_plugin_reg_handle handle
+ * @buf: A GByteArray for the data owned by the plugin
+ *
+ * This function is only available in a context that register read access is
+ * explicitly requested via the QEMU_PLUGIN_CB_R_REGS flag.
+ *
+ * Returns the size of the read register. The content of @buf is in target byte
+ * order. On failure returns -1
+ */
+int qemu_plugin_read_register(struct qemu_plugin_register *handle,
+  GByteArray *buf);
+
+
  #endif /* QEMU_QEMU_PLUGIN_H */
diff --git a/plugins/api.c b/plugins/api.c
index 54df72c1c00..03412598047 100644
--- a/plugins/api.c
+++ b/plugins/api.c
@@ -8,6 +8,7 @@
   *
   *  qemu_plugin_tb
   *  qemu_plugin_insn
+ *  qemu_plugin_register
   *
   * Which can then be passed back into the API to do additional things.
   * As such all the public functions in here are exported in
@@ -35,10 +36,12 @@
   */
  
  #include "qemu/osdep.h"

+#include "qemu/main-loop.h"
  #include "qemu/plugin.h"
  #include "qemu/log.h"
  #include "tcg/tcg.h"
  #include "exec/exec-all.h"
+#include "exec/gdbstub.h"
  #include "exec/ram_addr.h"
  #include "disas/disas.h"
  #include "plugin.h"
@@ -410,3 +413,56 @@ uint64_t qemu_plugin_entry_code(void)
  #endif
  return entry;
  }
+
+/*
+ * Create register handles.
+ *
+ * We need to create a handle for each register so the plugin
+ * infrastructure can call gdbstub to read a register. They are
+ * currently just a pointer encapsulation of the gdb_regnum but in
+ * future may hold internal plugin state so it's important plugin
+ * authors are not tempted to treat them as numbers.
+ *
+ * We also construct a result array with those handles and some
+ * ancillary data the plugin might find useful.
+ */
+
+static GArray *create_register_handles(CPUState *cs, GArray *gdbstub_regs)
+{


cs is unused.


+GArray *find_data = g_array_new(true, true,
+

Re: [PATCH] migration: Don't serialize migration while can't switchover

2024-02-26 Thread Peter Xu
On Thu, Feb 22, 2024 at 05:56:27PM +0200, Avihai Horon wrote:
> Currently, migration code serializes device data sending during pre-copy
> iterative phase. As noted in the code comment, this is done to prevent
> faster changing device from sending its data over and over.

Frankly speaking, I don't understand the rationale behind 90697be889 ("live
migration: Serialize vmstate saving in stage 2").  I don't even think I
noticed this logic before, even though I have worked on migration for a few
years...

I was thinking each device should always get its chance to run for some
period during iterations.  Do you know the reasoning behind it?  And I must
confess I also know little about block migration, which seems to be relevant
to this change.  Anyway, I also copy Jan just in case he'll be able to chime
in.

If there is a fast-changing device, then even if we stick with the current
iterator and don't proceed to the others, and assuming it can eventually
finish dumping all its data, we then proceed to the other devices and the
fast-changing device can accumulate dirty information again?

> 
> However, with switchover-ack capability enabled, this behavior can be
> problematic and may prevent migration from converging. The problem lies
> in the fact that an earlier device may never finish sending its data and
> thus block other devices from sending theirs.

Yes, this is a problem.

> 
> This bug was observed in several VFIO migration scenarios where some
> workload on the VM prevented RAM from ever reaching a hard zero, not
> allowing VFIO initial pre-copy data to be sent, and thus destination
> could not ack switchover. Note that the same scenario, but without
> switchover-ack, would converge.
> 
> Fix it by not serializing device data sending during pre-copy iterative
> phase if switchover was not acked yet.

I am still not fully convinced that it's even legal for one device to
consume all of the iterator's bandwidth, ignoring the rest..  Though again
it's not about this patch, but about commit 90697be889.

I'm wondering whether we should allow each device to have its own share of
the chance to push data on each call to qemu_savevm_state_iterate(),
irrespective of vfio's switchover-ack capability.

> 
> Fixes: 1b4adb10f898 ("migration: Implement switchover ack logic")
> Signed-off-by: Avihai Horon 
> ---
>  migration/savevm.h|  2 +-
>  migration/migration.c |  4 ++--
>  migration/savevm.c| 22 +++---
>  3 files changed, 18 insertions(+), 10 deletions(-)
> 
> diff --git a/migration/savevm.h b/migration/savevm.h
> index 74669733dd6..d4a368b522b 100644
> --- a/migration/savevm.h
> +++ b/migration/savevm.h
> @@ -36,7 +36,7 @@ void qemu_savevm_state_setup(QEMUFile *f);
>  bool qemu_savevm_state_guest_unplug_pending(void);
>  int qemu_savevm_state_resume_prepare(MigrationState *s);
>  void qemu_savevm_state_header(QEMUFile *f);
> -int qemu_savevm_state_iterate(QEMUFile *f, bool postcopy);
> +int qemu_savevm_state_iterate(QEMUFile *f, bool postcopy, bool 
> can_switchover);
>  void qemu_savevm_state_cleanup(void);
>  void qemu_savevm_state_complete_postcopy(QEMUFile *f);
>  int qemu_savevm_state_complete_precopy(QEMUFile *f, bool iterable_only,
> diff --git a/migration/migration.c b/migration/migration.c
> index ab21de2cadb..d8bfe1fb1b9 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -3133,7 +3133,7 @@ static MigIterateState 
> migration_iteration_run(MigrationState *s)
>  }
>  
>  /* Just another iteration step */
> -qemu_savevm_state_iterate(s->to_dst_file, in_postcopy);
> +qemu_savevm_state_iterate(s->to_dst_file, in_postcopy, can_switchover);
>  return MIG_ITERATE_RESUME;
>  }
>  
> @@ -3216,7 +3216,7 @@ static MigIterateState 
> bg_migration_iteration_run(MigrationState *s)
>  {
>  int res;
>  
> -res = qemu_savevm_state_iterate(s->to_dst_file, false);
> +res = qemu_savevm_state_iterate(s->to_dst_file, false, true);
>  if (res > 0) {
>  bg_migration_completion(s);
>  return MIG_ITERATE_BREAK;
> diff --git a/migration/savevm.c b/migration/savevm.c
> index d612c8a9020..3a012796375 100644
> --- a/migration/savevm.c
> +++ b/migration/savevm.c
> @@ -1386,7 +1386,7 @@ int qemu_savevm_state_resume_prepare(MigrationState *s)
>   *   0 : We haven't finished, caller have to go again
>   *   1 : We have finished, we can go to complete phase
>   */
> -int qemu_savevm_state_iterate(QEMUFile *f, bool postcopy)
> +int qemu_savevm_state_iterate(QEMUFile *f, bool postcopy, bool 
> can_switchover)
>  {
>  SaveStateEntry *se;
>  int ret = 1;
> @@ -1430,12 +1430,20 @@ int qemu_savevm_state_iterate(QEMUFile *f, bool 
> postcopy)
>   "%d(%s): %d",
>   se->section_id, se->idstr, ret);
>  qemu_file_set_error(f, ret);
> +return ret;
>  }
> -if (ret <= 0) {
> -/* Do not proceed to the next vmstate before this one reported
> -   completion of the 

Re: [PATCH v6 5/9] target/ppc: Simplify syscall exception handlers

2024-02-26 Thread Nicholas Piggin
On Thu Feb 22, 2024 at 10:20 PM AEST, BALATON Zoltan wrote:
> On Thu, 22 Feb 2024, Philippe Mathieu-Daudé wrote:
> > On 22/2/24 12:33, BALATON Zoltan wrote:
> >> After previous changes the hypercall handling in 7xx and 74xx
> >> exception handlers can be folded into one if statement to simpilfy
> >
> > "simplify"
> >
> >> this code. Also add "unlikely" to mark the less freqiently used branch
> >
> > "frequently"
>
> Could these be fixed up when merging please? I'd not resend again unless 
> there's some other things need fixing.

Main thing was the gen_exception_err code sharing with sc. If you
wouldn't mind, please resend the series with all fixups. I plan to
get another ppc PR in before soft freeze in ~2 weeks, so I'll
grab this if possible.

Thanks,
Nick



Re: [PATCH v6 5/9] target/ppc: Simplify syscall exception handlers

2024-02-26 Thread Nicholas Piggin
On Thu Feb 22, 2024 at 9:33 PM AEST, BALATON Zoltan wrote:
> After previous changes the hypercall handling in 7xx and 74xx
> exception handlers can be folded into one if statement to simpilfy
> this code. Also add "unlikely" to mark the less freqiently used branch
> for the compiler.
>
> Signed-off-by: BALATON Zoltan 
> Reviewed-by: Harsh Prateek Bora 
> ---
>  target/ppc/excp_helper.c | 24 
>  1 file changed, 8 insertions(+), 16 deletions(-)
>
> diff --git a/target/ppc/excp_helper.c b/target/ppc/excp_helper.c
> index 411d67376c..035a9fd968 100644
> --- a/target/ppc/excp_helper.c
> +++ b/target/ppc/excp_helper.c
> @@ -762,26 +762,22 @@ static void powerpc_excp_7xx(PowerPCCPU *cpu, int excp)
>  case POWERPC_EXCP_SYSCALL:   /* System call exception
> */
>  {
>  int lev = env->error_code;
> -
> -if (lev == 1 && cpu->vhyp) {
> -dump_hcall(env);
> -} else {
> -dump_syscall(env);
> -}
>  /*
>   * The Virtual Open Firmware (VOF) relies on the 'sc 1'
>   * instruction to communicate with QEMU. The pegasos2 machine
>   * uses VOF and the 7xx CPUs, so although the 7xx don't have
>   * HV mode, we need to keep hypercall support.
>   */
> -if (lev == 1 && cpu->vhyp) {
> +if (unlikely(lev == 1 && cpu->vhyp)) {
>  PPCVirtualHypervisorClass *vhc =
>  PPC_VIRTUAL_HYPERVISOR_GET_CLASS(cpu->vhyp);
> +dump_hcall(env);
>  vhc->hypercall(cpu->vhyp, cpu);
>  powerpc_reset_excp_state(cpu);
>  return;
> +} else {
> +dump_syscall(env);
>  }
> -
>  break;

You could avoid the else statement for these because the
hcall branch returns.
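
Roughly the shape I mean, as a standalone toy rather than a patch against
excp_helper.c. Everything prefixed toy_ is made up for illustration, and the
boolean return value only exists so the sketch can be exercised on its own.

```c
#include <assert.h>
#include <stdbool.h>

enum toy_trace { TRACE_NONE, TRACE_HCALL, TRACE_SYSCALL };

/* Records which dump function the toy handler called last. */
static enum toy_trace last_trace = TRACE_NONE;

static void toy_dump_hcall(void)   { last_trace = TRACE_HCALL; }
static void toy_dump_syscall(void) { last_trace = TRACE_SYSCALL; }

/*
 * Returns true if handled as a hypercall (early return), false if the
 * caller should continue with normal syscall exception delivery.
 */
static bool toy_handle_sc(int lev, bool have_vhyp)
{
    assert(lev == 0 || lev == 1);
    if (lev == 1 && have_vhyp) {
        toy_dump_hcall();
        return true;
    }
    /* No else needed: the branch above always returns. */
    toy_dump_syscall();
    return false;
}
```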

Actually books could be changed similarly too; I think dump_hcall can be
done in the books_vhyp_handles_hcall() branch. But you don't need to
change that in your patch since it's a behaviour change.

Thanks,
Nick



Re: [PATCH v6 3/9] target/ppc: Fix gen_sc to use correct nip

2024-02-26 Thread Nicholas Piggin
On Thu Feb 22, 2024 at 9:33 PM AEST, BALATON Zoltan wrote:
> Most exceptions are raised with nip pointing to the faulting
> instruction but the sc instruction generating a syscall exception
> leaves nip pointing to next instruction. Fix gen_sc to not use
> gen_exception_err() which sets nip back but correctly set nip to
> pc_next so we don't have to patch this in the exception handlers.
>
> Signed-off-by: BALATON Zoltan 
> Reviewed-by: Nicholas Piggin 

Mixed feelings about this one still but I suppose I will add it
now you have the tracing corrected. Although one more thing:

> diff --git a/target/ppc/translate.c b/target/ppc/translate.c
> index 049f636927..6a43eda3b9 100644
> --- a/target/ppc/translate.c
> +++ b/target/ppc/translate.c
> @@ -4535,15 +4535,17 @@ static void gen_hrfid(DisasContext *ctx)
>  #endif
>  static void gen_sc(DisasContext *ctx)
>  {
> -uint32_t lev;
> -
>  /*
>   * LEV is a 7-bit field, but the top 6 bits are treated as a reserved
>   * field (i.e., ignored). ISA v3.1 changes that to 5 bits, but that is
>   * for Ultravisor which TCG does not support, so just ignore the top 6.
>   */
> -lev = (ctx->opcode >> 5) & 0x1;
> -gen_exception_err(ctx, POWERPC_SYSCALL, lev);
> +uint32_t lev = (ctx->opcode >> 5) & 0x1;
> +
> +gen_update_nip(ctx, ctx->base.pc_next);
> +gen_helper_raise_exception_err(tcg_env, 
> tcg_constant_i32(POWERPC_SYSCALL),
> +   tcg_constant_i32(lev));
> +ctx->base.is_jmp = DISAS_NORETURN;
>  }
>  
>  #if defined(TARGET_PPC64)

Can you share this code with gen_exception_err, by making
gen_exception_err_nip that takes the nip?

Thanks,
Nick



Re: [PATCH 10/10] docs/devel/reset: Update to discuss system reset

2024-02-26 Thread Zhao Liu
On Tue, Feb 20, 2024 at 04:06:22PM +, Peter Maydell wrote:
> Date: Tue, 20 Feb 2024 16:06:22 +
> From: Peter Maydell 
> Subject: [PATCH 10/10] docs/devel/reset: Update to discuss system reset
> X-Mailer: git-send-email 2.34.1
> 
> Now that system reset uses a three-phase-reset, update the reset
> documentation to include a section describing how this works.
> Include documentation of the current major beartrap in reset, which
> is that only devices on the qbus tree will get automatically reset.
> 
> Signed-off-by: Peter Maydell 
> ---
> This merely documents the current situation, and says nothing
> about what we might like to do with it in future...
> ---
>  docs/devel/reset.rst | 44 ++--
>  1 file changed, 42 insertions(+), 2 deletions(-)
>

Reviewed-by: Zhao Liu 




Re: [PATCH 09/10] hw/core/machine: Use qemu_register_resettable for sysbus reset

2024-02-26 Thread Zhao Liu
On Tue, Feb 20, 2024 at 04:06:21PM +, Peter Maydell wrote:
> Date: Tue, 20 Feb 2024 16:06:21 +
> From: Peter Maydell 
> Subject: [PATCH 09/10] hw/core/machine: Use qemu_register_resettable for
>  sysbus reset
> X-Mailer: git-send-email 2.34.1
> 
> Move the reset of the sysbus (and thus all devices and buses anywhere
> on the qbus tree) from qemu_register_reset() to qemu_register_resettable().
> 
> This is a behaviour change: because qemu_register_resettable() is
> aware of three-phase reset, this now means that:
>  * 'enter' phase reset methods of devices and buses are called
>before any legacy reset callbacks registered with qemu_register_reset()
>  * 'exit' phase reset methods of devices and buses are called
>after any legacy qemu_register_reset() callbacks
> 
> Put another way, a qemu_register_reset() callback is now correctly
> ordered in the 'hold' phase along with any other 'hold' phase methods.
> 
> The motivation for doing this is that we will now be able to resolve
> some reset-ordering issues using the three-phase mechanism, because
> the 'exit' phase is always after the 'hold' phase, even when the
> 'hold' phase function was registered with qemu_register_reset().
> 
> Signed-off-by: Peter Maydell 
> ---
> I believe that given we don't make much use of enter/exit phases
> currently that this is unlikely to cause unexpected regressions due
> to an accidental reset-order dependency that is no longer satisfied,
> but it's always possible...
> ---
>  hw/core/machine.c | 7 +++
>  1 file changed, 3 insertions(+), 4 deletions(-)
>

Reviewed-by: Zhao Liu 
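
For reference, the ordering described in the quoted commit message can be
modelled with a small standalone sketch. This is a toy, not the Resettable
implementation: the toy_resettable structure, the event log and the callback
names are all invented here. A legacy callback is modelled as an object that
only has a 'hold' method.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_LOG_MAX 64

/* Event log so the call order can be inspected. */
static char toy_log[TOY_LOG_MAX];
static size_t toy_log_len;

static void toy_log_clear(void)
{
    memset(toy_log, 0, sizeof(toy_log));
    toy_log_len = 0;
}

static void toy_log_event(char c)
{
    assert(toy_log_len + 1 < TOY_LOG_MAX);
    toy_log[toy_log_len++] = c;
    toy_log[toy_log_len] = '\0';
}

/* Toy resettable object: any phase method may be NULL. */
struct toy_resettable {
    void (*enter)(void);
    void (*hold)(void);
    void (*exit)(void);
};

static void dev_enter(void)   { toy_log_event('E'); }
static void dev_hold(void)    { toy_log_event('H'); }
static void dev_exit(void)    { toy_log_event('X'); }
static void legacy_hold(void) { toy_log_event('L'); }

/* Run each phase across all objects before starting the next phase. */
static void toy_reset_all(const struct toy_resettable *objs, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (objs[i].enter) {
            objs[i].enter();
        }
    }
    for (size_t i = 0; i < n; i++) {
        if (objs[i].hold) {
            objs[i].hold();
        }
    }
    for (size_t i = 0; i < n; i++) {
        if (objs[i].exit) {
            objs[i].exit();
        }
    }
}
```

Even when the legacy object is registered first, the device 'enter' method
runs before the legacy callback and the device 'exit' method runs after it.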




Re: [PATCH 08/10] hw/core/reset: Implement qemu_register_reset via qemu_register_resettable

2024-02-26 Thread Zhao Liu
On Tue, Feb 20, 2024 at 04:06:20PM +, Peter Maydell wrote:
> Date: Tue, 20 Feb 2024 16:06:20 +
> From: Peter Maydell 
> Subject: [PATCH 08/10] hw/core/reset: Implement qemu_register_reset via
>  qemu_register_resettable
> X-Mailer: git-send-email 2.34.1
> 
> Reimplement qemu_register_reset() via qemu_register_resettable().
> 
> We define a new LegacyReset object which implements Resettable and
> defines its reset hold phase method to call a QEMUResetHandler
> function.  When qemu_register_reset() is called, we create a new
> LegacyReset object and add it to the simulation_reset
> ResettableContainer.  When qemu_unregister_reset() is called, we find
> the LegacyReset object in the container and remove it.
> 
> This implementation of qemu_unregister_reset() means we'll end up
> scanning the ResetContainer's list of child objects twice, once
> to find the LegacyReset object, and once in g_ptr_array_remove().
> In theory we could avoid this by having the ResettableContainer
> interface include a resettable_container_remove_with_equal_func()
> that took a callback method so that we could use
> g_ptr_array_find_with_equal_func() and g_ptr_array_remove_index().
> But we don't expect qemu_unregister_reset() to be called frequently
> or in hot paths, and we expect the simulation_reset container to
> usually not have many children.
> 
> Signed-off-by: Peter Maydell 
> ---
> The way that a legacy reset function needs to check the ShutdownCause
> and this doesn't line up with the ResetType is a bit awkward; this
> is an area we should come back and clean up, but I didn't want to
> tackle that in this patchset.
> ---
>  include/sysemu/reset.h |   7 ++-
>  hw/core/reset.c| 137 +++--
>  2 files changed, 110 insertions(+), 34 deletions(-)
>

Reviewed-by: Zhao Liu 
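
The wrapping approach in the quoted commit message can be sketched in plain C
as follows. This is a toy model only: toy_legacy_reset, the fixed-size array
and the toy_ functions are hypothetical stand-ins for the LegacyReset object
and the ResettableContainer, and unlike the implementation described above
(which scans the child list twice) the toy removes the entry in a single
scan.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef void toy_reset_fn(void *opaque);

/* Toy analogue of a LegacyReset object: a callback plus its opaque. */
struct toy_legacy_reset {
    toy_reset_fn *func;
    void *opaque;
};

#define TOY_MAX_CHILDREN 16

/* Toy analogue of the container of registered objects. */
static struct toy_legacy_reset toy_children[TOY_MAX_CHILDREN];
static size_t toy_nchildren;

static void toy_register_reset(toy_reset_fn *func, void *opaque)
{
    assert(func != NULL && toy_nchildren < TOY_MAX_CHILDREN);
    toy_children[toy_nchildren].func = func;
    toy_children[toy_nchildren].opaque = opaque;
    toy_nchildren++;
}

/* Linear scan for the matching (func, opaque) pair, then remove it. */
static bool toy_unregister_reset(toy_reset_fn *func, void *opaque)
{
    for (size_t i = 0; i < toy_nchildren; i++) {
        if (toy_children[i].func == func &&
            toy_children[i].opaque == opaque) {
            for (size_t j = i + 1; j < toy_nchildren; j++) {
                toy_children[j - 1] = toy_children[j];
            }
            toy_nchildren--;
            return true;
        }
    }
    return false;
}

/* The 'hold' phase: call every registered legacy callback. */
static void toy_reset_hold(void)
{
    for (size_t i = 0; i < toy_nchildren; i++) {
        toy_children[i].func(toy_children[i].opaque);
    }
}

/* Example callback: bump the counter passed as opaque. */
static void toy_count_reset(void *opaque)
{
    (*(int *)opaque)++;
}
```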




Re: [PATCH 07/10] hw/core/reset: Add qemu_{register, unregister}_resettable()

2024-02-26 Thread Zhao Liu
On Tue, Feb 20, 2024 at 04:06:19PM +, Peter Maydell wrote:
> Date: Tue, 20 Feb 2024 16:06:19 +
> From: Peter Maydell 
> Subject: [PATCH 07/10] hw/core/reset: Add qemu_{register,
>  unregister}_resettable()
> X-Mailer: git-send-email 2.34.1
> 
> Implement new functions qemu_register_resettable() and
> qemu_unregister_resettable().  These are intended to be
> three-phase-reset aware equivalents of the old qemu_register_reset()
> and qemu_unregister_reset().  Instead of passing in a function
> pointer and opaque, you register any QOM object that implements the
> Resettable interface.
> 
> The implementation is simple: we have a single global instance of a
> ResettableContainer, which we reset in qemu_devices_reset(), and
> the Resettable objects passed to qemu_register_resettable() are
> added to it.
> 
> Signed-off-by: Peter Maydell 
> ---
>  include/sysemu/reset.h | 37 ++---
>  hw/core/reset.c| 31 +--
>  2 files changed, 63 insertions(+), 5 deletions(-)

Reviewed-by: Zhao Liu  



[PATCH v2] hw/char/pl011: Add support for loopback

2024-02-26 Thread Tong Ho
This patch adds loopback for sent characters, sent BREAK,
and modem-control signals.

Loopback of send and modem-control is often used for uart
self tests in real hardware but missing from current pl011
model, resulting in self-test failures when running in QEMU.

This implementation matches what is observed in real pl011
hardware placed in loopback mode:
1. Input characters and BREAK events from serial backend
   are ignored, but
2. Both TX characters and BREAK events are still sent to
   serial backend, in addition to being looped back to RX.
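
As an illustration of the modem-control part, the mapping from control
outputs to status inputs (RI from Out2, DCD from Out1, CTS from RTS, DSR
from DTR) can be written as a pure function. The bit definitions below are
copied from this patch; the function name toy_loopback_fr() is hypothetical
and, unlike pl011_loopback_mdmctrl(), it neither touches device state nor
updates interrupts.

```c
#include <assert.h>
#include <stdint.h>

/* Flag Register (UARTFR) bits, as defined in the patch */
#define PL011_FLAG_RI   0x100
#define PL011_FLAG_DCD  0x04
#define PL011_FLAG_DSR  0x02
#define PL011_FLAG_CTS  0x01

/* Control Register (UARTCR) bits, as defined in the patch */
#define CR_OUT2 (1 << 13)
#define CR_OUT1 (1 << 12)
#define CR_RTS  (1 << 11)
#define CR_DTR  (1 << 10)
#define CR_LBE  (1 << 7)

/*
 * Hypothetical pure helper: given the control register and the current
 * flag register, return the flag register after modem-control loopback.
 * When loopback is disabled the flags are returned unchanged.
 */
static uint32_t toy_loopback_fr(uint32_t cr, uint32_t fr)
{
    const uint32_t modem_bits = PL011_FLAG_RI | PL011_FLAG_DCD |
                                PL011_FLAG_DSR | PL011_FLAG_CTS;

    if (!(cr & CR_LBE)) {
        return fr;
    }

    fr &= ~modem_bits;
    fr |= (cr & CR_OUT2) ? PL011_FLAG_RI  : 0;
    fr |= (cr & CR_OUT1) ? PL011_FLAG_DCD : 0;
    fr |= (cr & CR_RTS)  ? PL011_FLAG_CTS : 0;
    fr |= (cr & CR_DTR)  ? PL011_FLAG_DSR : 0;
    return fr;
}
```

Non-modem flag bits pass through untouched, and stale modem bits are cleared
before the new values are applied.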

Signed-off-by: Tong Ho 
Signed-off-by: Francisco Iglesias 
---
 hw/char/pl011.c | 110 +++-
 1 file changed, 108 insertions(+), 2 deletions(-)

diff --git a/hw/char/pl011.c b/hw/char/pl011.c
index 855cb82d08..8753b84a84 100644
--- a/hw/char/pl011.c
+++ b/hw/char/pl011.c
@@ -49,10 +49,14 @@ DeviceState *pl011_create(hwaddr addr, qemu_irq irq, 
Chardev *chr)
 }
 
 /* Flag Register, UARTFR */
+#define PL011_FLAG_RI   0x100
 #define PL011_FLAG_TXFE 0x80
 #define PL011_FLAG_RXFF 0x40
 #define PL011_FLAG_TXFF 0x20
 #define PL011_FLAG_RXFE 0x10
+#define PL011_FLAG_DCD  0x04
+#define PL011_FLAG_DSR  0x02
+#define PL011_FLAG_CTS  0x01
 
 /* Data Register, UARTDR */
 #define DR_BE   (1 << 10)
@@ -76,6 +80,13 @@ DeviceState *pl011_create(hwaddr addr, qemu_irq irq, Chardev 
*chr)
 #define LCR_FEN (1 << 4)
 #define LCR_BRK (1 << 0)
 
+/* Control Register, UARTCR */
+#define CR_OUT2 (1 << 13)
+#define CR_OUT1 (1 << 12)
+#define CR_RTS  (1 << 11)
+#define CR_DTR  (1 << 10)
+#define CR_LBE  (1 << 7)
+
 static const unsigned char pl011_id_arm[8] =
   { 0x11, 0x10, 0x14, 0x00, 0x0d, 0xf0, 0x05, 0xb1 };
 static const unsigned char pl011_id_luminary[8] =
@@ -251,6 +262,89 @@ static void pl011_trace_baudrate_change(const PL011State 
*s)
 s->ibrd, s->fbrd);
 }
 
+static bool pl011_loopback_enabled(PL011State *s)
+{
+return !!(s->cr & CR_LBE);
+}
+
+static void pl011_loopback_mdmctrl(PL011State *s)
+{
+uint32_t cr, fr, il;
+
+if (!pl011_loopback_enabled(s)) {
+return;
+}
+
+/*
+ * Loopback software-driven modem control outputs to modem status inputs:
+ *   FR.RI  <= CR.Out2
+ *   FR.DCD <= CR.Out1
+ *   FR.CTS <= CR.RTS
+ *   FR.DSR <= CR.DTR
+ *
+ * The loopback happens immediately even if this call is triggered
+ * by setting only CR.LBE.
+ *
+ * CTS/RTS updates due to enabled hardware flow controls are not
+ * dealt with here.
+ */
+cr = s->cr;
+fr = s->flags & ~(PL011_FLAG_RI | PL011_FLAG_DCD |
+  PL011_FLAG_DSR | PL011_FLAG_CTS);
+fr |= (cr & CR_OUT2) ? PL011_FLAG_RI  : 0;
+fr |= (cr & CR_OUT1) ? PL011_FLAG_DCD : 0;
+fr |= (cr & CR_RTS)  ? PL011_FLAG_CTS : 0;
+fr |= (cr & CR_DTR)  ? PL011_FLAG_DSR : 0;
+
+/* Change interrupts based on updated FR */
+il = s->int_level & ~(INT_DSR | INT_DCD | INT_CTS | INT_RI);
+il |= (fr & PL011_FLAG_DSR) ? INT_DSR : 0;
+il |= (fr & PL011_FLAG_DCD) ? INT_DCD : 0;
+il |= (fr & PL011_FLAG_CTS) ? INT_CTS : 0;
+il |= (fr & PL011_FLAG_RI)  ? INT_RI  : 0;
+
+s->flags = fr;
+s->int_level = il;
+pl011_update(s);
+}
+
+static void pl011_put_fifo(void *opaque, uint32_t value);
+
+static void pl011_loopback_tx(PL011State *s, uint32_t value)
+{
+if (!pl011_loopback_enabled(s)) {
+return;
+}
+
+/*
+ * Caveat:
+ *
+ * In real hardware, TX loopback happens at the serial-bit level
+ * and then reassembled by the RX logics back into bytes and placed
+ * into the RX fifo. That is, loopback happens after TX fifo.
+ *
+ * Because the real hardware TX fifo is time-drained at the frame
+ * rate governed by the configured serial format, some loopback
+ * bytes in TX fifo may still be able to get into the RX fifo
+ * that could be full at times while being drained at software
+ * pace.
+ *
+ * In such scenario, the RX draining pace is the major factor
+ * deciding which loopback bytes get into the RX fifo, unless
+ * hardware flow-control is enabled.
+ *
+ * For simplicity, the above described is not emulated.
+ */
+pl011_put_fifo(s, value);
+}
+
+static void pl011_loopback_break(PL011State *s, int brk_enable)
+{
+if (brk_enable) {
+pl011_loopback_tx(s, DR_BE);
+}
+}
+
 static void pl011_write(void *opaque, hwaddr offset,
 uint64_t value, unsigned size)
 {
@@ -266,6 +360,7 @@ static void pl011_write(void *opaque, hwaddr offset,
 /* XXX this blocks entire thread. Rewrite to use
  * qemu_chr_fe_write and background I/O callbacks */
 qemu_chr_fe_write_all(&s->chr, &ch, 1);
+pl011_loopback_tx(s, ch);
 s->int_level |= INT_TX;
 pl011_update(s);
 break;
@@ -295,13 +390,15 @@ static void pl011_write(void *opaque, hwaddr offset,
 int 

Re: [PATCH v6 11/41] Temporarily disable unimplemented rpi4b devices

2024-02-26 Thread Kambalin, Sergey
Hi Peter and Philippe!


Thank you for the review and feedback!


OK, I'll fix the PCIE-related comments and the overlapping issue.


BR,
Sergey Kambalin
Software Developer,
Auriga Inc.



From: Peter Maydell 
Sent: 26 February 2024 10:41:31
To: Philippe Mathieu-Daudé
Cc: Sergey Kambalin; qemu-...@nongnu.org; qemu-devel@nongnu.org; Kambalin, 
Sergey
Subject: Re: [PATCH v6 11/41] Temporarily disable unimplemented rpi4b devices

On Mon, 26 Feb 2024 at 16:06, Philippe Mathieu-Daudé  wrote:
>
> On 26/2/24 14:39, Peter Maydell wrote:
> > On Mon, 26 Feb 2024 at 13:35, Philippe Mathieu-Daudé  
> > wrote:
> >>
> >> On 26/2/24 01:02, Sergey Kambalin wrote:
> >>> +static void raspi4_modify_dtb(const struct arm_boot_info *info, void 
> >>> *fdt)
> >>> +{
> >>> +uint64_t ram_size;
> >>> +
> >>> +/* Temporarily disable following devices until they are implemented 
> >>> */
> >>> +const char *nodes_to_remove[] = {
> >>> +"brcm,bcm2711-pcie",
> >>> +"brcm,bcm2711-rng200",
> >>> +"brcm,bcm2711-thermal",
> >>> +"brcm,bcm2711-genet-v5",
> >>> +};
> >>> +
> >>> +for (int i = 0; i < ARRAY_SIZE(nodes_to_remove); i++) {
> >>> +const char *dev_str = nodes_to_remove[i];
> >>> +
> >>> +int offset = fdt_node_offset_by_compatible(fdt, -1, dev_str);
> >>> +if (offset >= 0) {
> >>> +if (!fdt_nop_node(fdt, offset)) {
> >>
> >> Peter, I remember a discussion where you were not keen on altering DTB
> >> for non-Virt machines.
> >>
> >> Since these devices are all implemented at the end of the series, why
> >> not add the devices then the raspi4 board at the end, so this patch is
> >> not even required?
> >
> > I'm not super-keen on it, but as you say it goes away once all
> > the devices are implemented, so I'm not too worried.
> >
> > Doing it this way around would let us take the first 11 patches
> > in the series into git now (they've all been reviewed), which
> > gives us (I think) a functional raspi4 with some missing devices,
> > which seems useful in the interim until the rest of the series
> > gets reviewed and committed.
>
> Fine by me! Sergey, don't we also need patch #39 (Add missed BCM2835
> properties) to have a happy Linux boot?
>
> Patch #17 "Implement BCM2838 thermal sensor" could also go in but it
> doesn't apply cleanly on top of 1-12; maybe Sergey can send a series
> of "patches already reviewed" on top so they get in for v9, postponing
> pcie/network for after release.

I'll put together a pullreq tomorrow (see my other email for details
of which patches plus the necessary changes to the avocado tests).
Sergey -- I suggest you wait til that gets upstream, and then
rebase on that.

-- PMM


Re: [PATCH V4 00/14] allow cpr-reboot for vfio

2024-02-26 Thread Peter Xu
On Mon, Feb 26, 2024 at 05:08:05PM -0500, Steven Sistare wrote:
> On 2/26/2024 3:21 PM, Steven Sistare wrote:
> > On 2/26/2024 4:01 AM, Peter Xu wrote:
> >> On Mon, Feb 26, 2024 at 09:49:46AM +0100, Cédric Le Goater wrote:
> >>> Go ahead. It will help me for the changes I am doing on error reporting
> >>> for VFIO migration. I will rebase on top.
> >>
> >> Thanks for confirming.  I queued the migration patches then, but left the
> >> two vfio ones for further discussion.
> > 
> > Very good, thanks.  I am always happy to move the ball a few yards closer to
> > the goal line :)
> 
> Peter, beware that patch 3 needs an edit before being queued.
> This hunk snuck in and should be deleted:
> 
> [PATCH V4 03/14] migration: convert to NotifierWithReturn
> diff --git a/roms/seabios-hppa b/roms/seabios-hppa
> index 03774ed..e4eac85 16
> --- a/roms/seabios-hppa
> +++ b/roms/seabios-hppa
> @@ -1 +1 @@
> -Subproject commit 03774edaad3bfae090ac96ca5450353c641637d1
> +Subproject commit e4eac85880e8677f96d8b9e94de9f2eec9c0751f

I see, I dropped this change in the patch.

https://gitlab.com/peterx/qemu/-/commit/ccea71f8f222593c47670366d7d86554586e596e

-- 
Peter Xu




RE: [PATCH RFCv2 1/8] backends/iommufd: Introduce helper function iommufd_device_get_hw_capabilities()

2024-02-26 Thread Duan, Zhenzhong


>-Original Message-
>From: Joao Martins 
>Subject: Re: [PATCH RFCv2 1/8] backends/iommufd: Introduce helper
>function iommufd_device_get_hw_capabilities()
>
>On 26/02/2024 07:29, Duan, Zhenzhong wrote:
>> Hi Joao,
>>
>>> -Original Message-
>>> From: Joao Martins 
>>> Subject: [PATCH RFCv2 1/8] backends/iommufd: Introduce helper
>function
>>> iommufd_device_get_hw_capabilities()
>>>
>>> The new helper will fetch vendor agnostic IOMMU capabilities supported
>>> both by hardware and software. Right now it is only iommu dirty
>>> tracking.
>>>
>>> Signed-off-by: Joao Martins 
>>> ---
>>> backends/iommufd.c   | 25 +
>>> include/sysemu/iommufd.h |  2 ++
>>> 2 files changed, 27 insertions(+)
>>>
>>> diff --git a/backends/iommufd.c b/backends/iommufd.c
>>> index d92791bba935..8486894f1b3f 100644
>>> --- a/backends/iommufd.c
>>> +++ b/backends/iommufd.c
>>> @@ -237,3 +237,28 @@ void iommufd_device_init(IOMMUFDDevice
>*idev)
>>> host_iommu_base_device_init(&idev->base, HID_IOMMUFD,
>>> sizeof(IOMMUFDDevice));
>>> }
>>> +
>>> +int iommufd_device_get_hw_capabilities(IOMMUFDDevice *idev,
>uint64_t
>>> *caps,
>>> +   Error **errp)
>>> +{
>>> +struct iommu_hw_info info = {
>>> +.size = sizeof(info),
>>> +.flags = 0,
>>> +.dev_id = idev->devid,
>>> +.data_len = 0,
>>> +.__reserved = 0,
>>> +.data_uptr = 0,
>>> +.out_capabilities = 0,
>>> +};
>>> +int ret;
>>> +
>>> +ret = ioctl(idev->iommufd->fd, IOMMU_GET_HW_INFO, &info);
>>> +if (ret) {
>>> +error_setg_errno(errp, errno,
>>> + "Failed to get hardware info capabilities");
>>> +} else {
>>> +*caps = info.out_capabilities;
>>> +}
>>> +
>>> +return ret;
>>> +}
>>
>> This helper is redundant with
>> https://lists.gnu.org/archive/html/qemu-devel/2024-02/msg00031.html
>> We have to get other elements in info in the nesting series, so would you
>> mind using that helper instead to avoid redundancy? I can move that patch
>> ahead for your usage.
>>
>
>Sure.
>
>Btw, speaking of which: your series could actually be split into two, one for
>the iommufd device abstraction part (patches 00-09) and another for the
>nesting bits (10-18). FWIW this series as submitted was actually just placed
>on top of the iommufd device abstraction.

I see, will split in next version.

>
>I am still planning on adding this same helper, probably just calling into
>yours, mostly because I disregard the data/data-size as I am only interested
>in vendor agnostic capabilities.

Sounds good.

Thanks
Zhenzhong



Re: [PATCH v4 22/34] migration/multifd: Prepare multifd sync for fixed-ram migration

2024-02-26 Thread Peter Xu
On Mon, Feb 26, 2024 at 07:52:20PM -0300, Fabiano Rosas wrote:
> Peter Xu  writes:
> 
> > On Tue, Feb 20, 2024 at 07:41:26PM -0300, Fabiano Rosas wrote:
> >> The fixed-ram migration can be performed live or non-live, but it is
> >> always asynchronous, i.e. the source machine and the destination
> >> machine are not migrating at the same time. We only need some pieces
> >> of the multifd sync operations.
> >> 
> >> multifd_send_sync_main()
> >> 
> >>   Issued by the ram migration code on the migration thread, causes the
> >>   multifd send channels to synchronize with the migration thread and
> >>   makes the sending side emit a packet with the MULTIFD_FLUSH flag.
> >> 
> >>   With fixed-ram we want to maintain the sync on the sending side
> >>   because that provides ordering between the rounds of dirty pages when
> >>   migrating live.
> >> 
> >> MULTIFD_FLUSH
> >> -
> >>   On the receiving side, the presence of the MULTIFD_FLUSH flag on a
> >>   packet causes the receiving channels to start synchronizing with the
> >>   main thread.
> >> 
> >>   We're not using packets with fixed-ram, so there's no MULTIFD_FLUSH
> >>   flag and therefore no channel sync on the receiving side.
> >> 
> >> multifd_recv_sync_main()
> >> 
> >>   Issued by the migration thread when the ram migration flag
> >>   RAM_SAVE_FLAG_MULTIFD_FLUSH is received, causes the migration thread
> >>   on the receiving side to start synchronizing with the recv
> >>   channels. Due to compatibility, this is also issued when
> >>   RAM_SAVE_FLAG_EOS is received.
> >> 
> >>   For fixed-ram we only need to synchronize the channels at the end of
> >>   migration to avoid doing cleanup before the channels have finished
> >>   their IO.
> >> 
> >> Make sure the multifd syncs are only issued at the appropriate
> >> times. Note that due to pre-existing backward compatibility issues, we
> >> have the multifd_flush_after_each_section property that enables an
> >> older behavior of synchronizing channels more frequently (and
> >> inefficiently). Fixed-ram should always run with that property
> >> disabled (default).
> >
> > What if the user enables multifd_flush_after_each_section=true?
> >
> > IMHO we don't necessarily need to attach the fixed-ram loading flush to any
> > flag in the stream.  For fixed-ram IIUC all the loads will happen in one
> > shot of ram_load() anyway when parsing the ramblock list, so.. how about we
> > decouple the fixed-ram load flush from the stream by always doing a sync in
> > ram_load() unconditionally?
> 
> I would like to. But it's not possible because ram_load() is called once
> per section. So once for each EOS flag on the stream. We'll have at
> least two calls to ram_load(), once due to qemu_savevm_state_iterate()
> and another due to qemu_savevm_state_complete_precopy().
> 
> The fact that fixed-ram can use just one load doesn't change the fact
> that we perform more than one "save". So we'll need to use the FLUSH
> flag in this case unfortunately.

After I re-read it, I found one more issue.

Now recv side sync is "once and for all" - it doesn't allow sync_main to be
called a second time, because it only syncs once the threads quit.  That is
IMHO making the code much harder to maintain, and we'll need rich comments to
explain why that is happening.

Ideally any "sync main" for recv threads can be called multiple times.  And
IMHO it's not really hard.  Actually it can make the code much cleaner by
merging some logic between socket-based and file-based from that regard.

I tried to play with your branch and propose something like this, just to
show what I meant. This should allow all new fixed-ram tests to pass here,
meanwhile it should allow sync main on recv side to be re-entrant, sharing
the logic with socket-based as much as possible:

=
diff --git a/migration/multifd.c b/migration/multifd.c
index a0202b5661..28480f6cfe 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -86,10 +86,8 @@ struct {
 /* number of created threads */
 int count;
 /*
- * For sockets: this is posted once for each MULTIFD_FLAG_SYNC flag.
- *
- * For files: this is only posted at the end of the file load to mark
- *completion of the load process.
+ * This is always posted by the recv threads, the main thread uses it
+ * to wait for recv threads to finish assigned tasks.
  */
 QemuSemaphore sem_sync;
 /* global number of generated multifd packets */
@@ -1316,38 +1314,55 @@ void multifd_recv_cleanup(void)
 multifd_recv_cleanup_state();
 }
 
-
-/*
- * Wait until all channels have finished receiving data. Once this
- * function returns, cleanup routines are safe to run.
- */
-static void multifd_file_recv_sync(void)
+static void multifd_recv_file_sync_request(void)
 {
 int i;
 
 for (i = 0; i < migrate_multifd_channels(); i++) {
 MultiFDRecvParams *p = &multifd_recv_state->params[i];
 
-

RE: [PATCH v5 3/3] virtio-iommu: Change the default granule to the host page size

2024-02-26 Thread Duan, Zhenzhong
Hi Eric,

>-Original Message-
>From: Eric Auger 
>Subject: [PATCH v5 3/3] virtio-iommu: Change the default granule to the
>host page size
>
>We used to set the default granule to 4KB but with VFIO assignment
>it makes more sense to use the actual host page size.
>
>Indeed when hotplugging a VFIO device protected by a virtio-iommu
>on a 64kB/64kB host/guest config, we currently get a qemu crash:
>
>"vfio: DMA mapping failed, unable to continue"
>
>This is due to the hot-attached VFIO device calling
>memory_region_iommu_set_page_size_mask() with 64kB granule
>whereas the virtio-iommu granule was already frozen to 4KB on
>machine init done.
>
>Set the granule property to "host" and introduce a new compat.
>The page size mask used before 9.0 was qemu_target_page_mask().
>Since the virtio-iommu currently only supports x86_64 and aarch64,
>this matched a 4KB granule.
>
>Note that the new default will prevent 4kB guest on 64kB host
>because the granule will be set to 64kB which would be larger
>than the guest page size. In that situation, the virtio-iommu
>driver fails on viommu_domain_finalise() with
>"granule 0x10000 larger than system page size 0x1000".
>
>In that case the workaround is to request 4K granule.
>
>The current limitation of global granule in the virtio-iommu
>should be removed and turned into per domain granule. But
>until we get this upgraded, this new default is probably
>better because I don't think anyone is currently interested in
>running a 4KB page size guest with virtio-iommu on a 64KB host.
>However supporting 64kB guest on 64kB host with virtio-iommu and
>VFIO looks a more important feature.
>
>Signed-off-by: Eric Auger 
>Reviewed-by: Philippe Mathieu-Daudé 

Reviewed-by: Zhenzhong Duan 

Thanks
Zhenzhong

>
>---
>
>v4 -> v5
>- use lower case, mandated by the JSON QAPI
>---
> hw/core/machine.c| 1 +
> hw/virtio/virtio-iommu.c | 2 +-
> 2 files changed, 2 insertions(+), 1 deletion(-)
>
>diff --git a/hw/core/machine.c b/hw/core/machine.c
>index 70ac96954c..56f38b6579 100644
>--- a/hw/core/machine.c
>+++ b/hw/core/machine.c
>@@ -35,6 +35,7 @@
>
> GlobalProperty hw_compat_8_2[] = {
> { TYPE_VIRTIO_IOMMU_PCI, "aw-bits", "64" },
>+{ TYPE_VIRTIO_IOMMU_PCI, "granule", "4k" },
> };
> const size_t hw_compat_8_2_len = G_N_ELEMENTS(hw_compat_8_2);
>
>diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
>index 33e0520bc8..6831446e29 100644
>--- a/hw/virtio/virtio-iommu.c
>+++ b/hw/virtio/virtio-iommu.c
>@@ -1548,7 +1548,7 @@ static Property virtio_iommu_properties[] = {
> DEFINE_PROP_BOOL("boot-bypass", VirtIOIOMMU, boot_bypass, true),
> DEFINE_PROP_UINT8("aw-bits", VirtIOIOMMU, aw_bits, 0),
> DEFINE_PROP_GRANULE_MODE("granule", VirtIOIOMMU, granule_mode,
>- GRANULE_MODE_4K),
>+ GRANULE_MODE_HOST),
> DEFINE_PROP_END_OF_LIST(),
> };
>
>--
>2.41.0
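
The 4kB-guest-on-64kB-host limitation described in the commit message can be illustrated with a small standalone sketch. This is not the QEMU or guest driver code: `granule_from_mask` and `granule_fits_guest` are hypothetical names, and the sketch assumes the granule is the lowest set bit of the page size mask.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define KiB 1024ULL

/* Assumption: the granule is the smallest page size in the mask,
 * i.e. the lowest set bit of the page size mask. */
static uint64_t granule_from_mask(uint64_t page_size_mask)
{
    return page_size_mask & (~page_size_mask + 1);
}

/* Hypothetical check: a guest can only use the vIOMMU when the granule
 * is not larger than the guest's own page size. */
static bool granule_fits_guest(uint64_t page_size_mask,
                               uint64_t guest_page_size)
{
    return granule_from_mask(page_size_mask) <= guest_page_size;
}
```

With a 64kB granule the mask is -(64 * KiB), so a 4kB guest fails the check while a 64kB guest passes it, which matches the workaround of requesting a 4K granule for 4kB guests.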



RE: [PATCH v5 1/3] qdev: Add a granule_mode property

2024-02-26 Thread Duan, Zhenzhong


>-Original Message-
>From: Eric Auger 
>Subject: [PATCH v5 1/3] qdev: Add a granule_mode property
>
>Introduce a new enum type property allowing to set an
>IOMMU granule. Values are 4k, 8k, 16k, 64k and host.
>This latter indicates the vIOMMU granule will match
>the host page size.
>
>A subsequent patch will add such a property to the
>virtio-iommu device.
>
>Signed-off-by: Eric Auger 
>Signed-off-by: Philippe Mathieu-Daudé 
>
>---
>v4 -> v5
>- remove code that can be automatically generated
>  and add the new enum in qapi/virtio.json (Philippe).
>  Added Phild's SOB. Lower case needs to be used due to
>  the JSON generation.
>
>v3 -> v4:
>- Add 8K
>---
> qapi/virtio.json| 18 ++
> include/hw/qdev-properties-system.h |  3 +++
> hw/core/qdev-properties-system.c| 15 +++
> 3 files changed, 36 insertions(+)
>
>diff --git a/qapi/virtio.json b/qapi/virtio.json
>index a79013fe89..95745fdfd7 100644
>--- a/qapi/virtio.json
>+++ b/qapi/virtio.json
>@@ -957,3 +957,21 @@
>
> { 'struct': 'DummyVirtioForceArrays',
>   'data': { 'unused-iothread-vq-mapping': ['IOThreadVirtQueueMapping'] } }
>+
>+##
>+# @GranuleMode:
>+#
>+# @4k: granule page size of 4KiB
>+#
>+# @8k: granule page size of 8KiB
>+#
>+# @16k: granule page size of 16KiB
>+#
>+# @64k: granule page size of 64KiB
>+#
>+# @host: granule matches the host page size
>+#
>+# Since: 9.0
>+##
>+{ 'enum': 'GranuleMode',
>+  'data': [ '4k', '8k', '16k', '64k', 'host' ] }
>diff --git a/include/hw/qdev-properties-system.h b/include/hw/qdev-properties-system.h
>index 06c359c190..626be87dd3 100644
>--- a/include/hw/qdev-properties-system.h
>+++ b/include/hw/qdev-properties-system.h
>@@ -8,6 +8,7 @@ extern const PropertyInfo qdev_prop_macaddr;
> extern const PropertyInfo qdev_prop_reserved_region;
> extern const PropertyInfo qdev_prop_multifd_compression;
> extern const PropertyInfo qdev_prop_mig_mode;
>+extern const PropertyInfo qdev_prop_granule_mode;
> extern const PropertyInfo qdev_prop_losttickpolicy;
> extern const PropertyInfo qdev_prop_blockdev_on_error;
> extern const PropertyInfo qdev_prop_bios_chs_trans;
>@@ -47,6 +48,8 @@ extern const PropertyInfo qdev_prop_iothread_vq_mapping_list;
> #define DEFINE_PROP_MIG_MODE(_n, _s, _f, _d) \
> DEFINE_PROP_SIGNED(_n, _s, _f, _d, qdev_prop_mig_mode, \
>MigMode)
>+#define DEFINE_PROP_GRANULE_MODE(_n, _s, _f, _d) \
>+DEFINE_PROP_SIGNED(_n, _s, _f, _d, qdev_prop_granule_mode, GranuleMode)
> #define DEFINE_PROP_LOSTTICKPOLICY(_n, _s, _f, _d) \
> DEFINE_PROP_SIGNED(_n, _s, _f, _d, qdev_prop_losttickpolicy, \
> LostTickPolicy)
>diff --git a/hw/core/qdev-properties-system.c b/hw/core/qdev-properties-system.c
>index 1a396521d5..685cffd064 100644
>--- a/hw/core/qdev-properties-system.c
>+++ b/hw/core/qdev-properties-system.c
>@@ -34,6 +34,7 @@
> #include "net/net.h"
> #include "hw/pci/pci.h"
> #include "hw/pci/pcie.h"
>+#include "hw/virtio/virtio-iommu.h"

This is unnecessary, otherwise,

Reviewed-by: Zhenzhong Duan 

Thanks
Zhenzhong

> #include "hw/i386/x86.h"
> #include "util/block-helpers.h"
>
>@@ -679,6 +680,20 @@ const PropertyInfo qdev_prop_mig_mode = {
> .set_default_value = qdev_propinfo_set_default_value_enum,
> };
>
>+/* --- GranuleMode --- */
>+
>+QEMU_BUILD_BUG_ON(sizeof(GranuleMode) != sizeof(int));
>+
>+const PropertyInfo qdev_prop_granule_mode = {
>+.name = "GranuleMode",
>+.description = "granule_mode values, "
>+   "4k, 8k, 16k, 64k, host",
>+.enum_table = &GranuleMode_lookup,
>+.get = qdev_propinfo_get_enum,
>+.set = qdev_propinfo_set_enum,
>+.set_default_value = qdev_propinfo_set_default_value_enum,
>+};
>+
> /* --- Reserved Region --- */
>
> /*
>--
>2.41.0



RE: [PATCH v5 2/3] virtio-iommu: Add a granule property

2024-02-26 Thread Duan, Zhenzhong



>-Original Message-
>From: Eric Auger 
>Subject: [PATCH v5 2/3] virtio-iommu: Add a granule property
>
>This allows to choose which granule will be used by
>default by the virtio-iommu. Current page size mask
>default is qemu_target_page_mask so this translates
>into a 4K granule.
>
>Signed-off-by: Eric Auger 

Reviewed-by: Zhenzhong Duan 

Thanks
Zhenzhong

>
>---
>v4 -> v5:
>- use -(n * KiB) (Phild)
>
>v3 -> v4:
>- granule_mode introduction moved to that patch
>---
> include/hw/virtio/virtio-iommu.h |  2 ++
> hw/virtio/virtio-iommu.c | 28 +---
> qemu-options.hx  |  3 +++
> 3 files changed, 30 insertions(+), 3 deletions(-)
>
>diff --git a/include/hw/virtio/virtio-iommu.h b/include/hw/virtio/virtio-iommu.h
>index 5fbe4677c2..f2785f7997 100644
>--- a/include/hw/virtio/virtio-iommu.h
>+++ b/include/hw/virtio/virtio-iommu.h
>@@ -24,6 +24,7 @@
> #include "hw/virtio/virtio.h"
> #include "hw/pci/pci.h"
> #include "qom/object.h"
>+#include "qapi/qapi-types-virtio.h"
>
> #define TYPE_VIRTIO_IOMMU "virtio-iommu-device"
> #define TYPE_VIRTIO_IOMMU_PCI "virtio-iommu-pci"
>@@ -67,6 +68,7 @@ struct VirtIOIOMMU {
> Notifier machine_done;
> bool granule_frozen;
> uint8_t aw_bits;
>+GranuleMode granule_mode;
> };
>
> #endif
>diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
>index 2ec5ef3cd1..33e0520bc8 100644
>--- a/hw/virtio/virtio-iommu.c
>+++ b/hw/virtio/virtio-iommu.c
>@@ -29,6 +29,7 @@
> #include "sysemu/reset.h"
> #include "sysemu/sysemu.h"
> #include "qemu/reserved-region.h"
>+#include "qemu/units.h"
> #include "qapi/error.h"
> #include "qemu/error-report.h"
> #include "trace.h"
>@@ -1115,8 +1116,8 @@ static int
>virtio_iommu_notify_flag_changed(IOMMUMemoryRegion *iommu_mr,
> }
>
> /*
>- * The default mask (TARGET_PAGE_MASK) is the smallest supported guest
>granule,
>- * for example 0xfffffffffffff000. When an assigned device has page size
>+ * The default mask depends on the "granule" property. For example, with
>+ * 4K granule, it is -(4 * KiB). When an assigned device has page size
>  * restrictions due to the hardware IOMMU configuration, apply this
>restriction
>  * to the mask.
>  */
>@@ -1313,7 +1314,26 @@ static void
>virtio_iommu_device_realize(DeviceState *dev, Error **errp)
>  * in vfio realize
>  */
> s->config.bypass = s->boot_bypass;
>-s->config.page_size_mask = qemu_target_page_mask();
>+
>+switch (s->granule_mode) {
>+case GRANULE_MODE_4K:
>+s->config.page_size_mask = -(4 * KiB);
>+break;
>+case GRANULE_MODE_8K:
>+s->config.page_size_mask = -(8 * KiB);
>+break;
>+case GRANULE_MODE_16K:
>+s->config.page_size_mask = -(16 * KiB);
>+break;
>+case GRANULE_MODE_64K:
>+s->config.page_size_mask = -(64 * KiB);
>+break;
>+case GRANULE_MODE_HOST:
>+s->config.page_size_mask = qemu_real_host_page_mask();
>+break;
>+default:
>+error_setg(errp, "Unsupported granule mode");
>+}
> if (s->aw_bits < 32 || s->aw_bits > 64) {
> error_setg(errp, "aw-bits must be within [32,64]");
> }
>@@ -1527,6 +1547,8 @@ static Property virtio_iommu_properties[] = {
>  TYPE_PCI_BUS, PCIBus *),
> DEFINE_PROP_BOOL("boot-bypass", VirtIOIOMMU, boot_bypass, true),
> DEFINE_PROP_UINT8("aw-bits", VirtIOIOMMU, aw_bits, 0),
>+DEFINE_PROP_GRANULE_MODE("granule", VirtIOIOMMU, granule_mode,
>+ GRANULE_MODE_4K),
> DEFINE_PROP_END_OF_LIST(),
> };
>
>diff --git a/qemu-options.hx b/qemu-options.hx
>index 3b670758b0..c7b43b67d5 100644
>--- a/qemu-options.hx
>+++ b/qemu-options.hx
>@@ -1179,6 +1179,9 @@ SRST
> ``aw-bits=val`` (val between 32 and 64, default depends on machine)
> This decides the address width of IOVA address space. It defaults
> to 39 bits on q35 machines and 48 bits on ARM virt machines.
>+``granule=val`` (possible values are 4K, 8K, 16K, 64K and host)
>+This decides the default granule to be exposed by the
>+virtio-iommu. If host, the granule matches the host page size.
>
> ERST
>
>--
>2.41.0
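
As a rough illustration of how the "granule" property translates into a page size mask, here is a minimal standalone sketch. It is not the QEMU code: `SketchGranuleMode` and `sketch_granule_mask` are hypothetical stand-ins for the QAPI-generated enum and the realize-time switch, and `sysconf(_SC_PAGESIZE)` is used here in place of QEMU's own host page size helper.

```c
#include <assert.h>
#include <stdint.h>
#include <unistd.h>

#define KiB 1024ULL

/* Hypothetical stand-in for the generated GranuleMode enum. */
typedef enum {
    SKETCH_GRANULE_4K,
    SKETCH_GRANULE_8K,
    SKETCH_GRANULE_16K,
    SKETCH_GRANULE_64K,
    SKETCH_GRANULE_HOST,
} SketchGranuleMode;

/* Store the page size mask for 'mode' in *mask.
 * Returns 0 on success, -1 for an unsupported mode. */
static int sketch_granule_mask(SketchGranuleMode mode, uint64_t *mask)
{
    switch (mode) {
    case SKETCH_GRANULE_4K:
        *mask = -(4 * KiB);
        return 0;
    case SKETCH_GRANULE_8K:
        *mask = -(8 * KiB);
        return 0;
    case SKETCH_GRANULE_16K:
        *mask = -(16 * KiB);
        return 0;
    case SKETCH_GRANULE_64K:
        *mask = -(64 * KiB);
        return 0;
    case SKETCH_GRANULE_HOST: {
        /* Assumption: the POSIX page size stands in for the host page size. */
        long sz = sysconf(_SC_PAGESIZE);
        if (sz <= 0) {
            return -1;
        }
        *mask = -(uint64_t)sz;
        return 0;
    }
    default:
        return -1;
    }
}
```

Negating a power-of-two size in unsigned arithmetic sets every bit from that size upwards, so -(4 * KiB) is 0xfffffffffffff000.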




Re: [PATCH 06/10] hw/core: Add ResetContainer which holds objects implementing Resettable

2024-02-26 Thread Zhao Liu
On Tue, Feb 20, 2024 at 04:06:18PM +0000, Peter Maydell wrote:
> Date: Tue, 20 Feb 2024 16:06:18 +0000
> From: Peter Maydell 
> Subject: [PATCH 06/10] hw/core: Add ResetContainer which holds objects
>  implementing Resettable
> X-Mailer: git-send-email 2.34.1
> 
> Implement a ResetContainer.  This is a subclass of Object, and it
> implements the Resettable interface.  The container holds a list of
> arbitrary other objects which implement Resettable, and when the
> container is reset, all the objects it contains are also reset.
> 
> This will allow us to have a 3-phase-reset equivalent of the old
> qemu_register_reset() API: we will have a single "simulation reset"
> top level ResetContainer, and objects in it are the equivalent of the
> old QEMUResetHandler functions.
> 
> The qemu_register_reset() API manages its list of callbacks using a
> QTAILQ, but here we use a GPtrArray for our list of Resettable
> children: we expect the "remove" operation (which will need to do an
> iteration through the list) to be fairly uncommon, and we get simpler
> code with fewer memory allocations.
> 
> Since there is currently no listed owner in MAINTAINERS for the
> existing reset-related source files, create a new section for
> them, and add these new files there also.
> 
> Signed-off-by: Peter Maydell 
> ---
>  MAINTAINERS  | 10 +
>  include/hw/core/resetcontainer.h | 48 
>  hw/core/resetcontainer.c | 76 
>  hw/core/meson.build  |  1 +
>  4 files changed, 135 insertions(+)
>  create mode 100644 include/hw/core/resetcontainer.h
>  create mode 100644 hw/core/resetcontainer.c
>

Reviewed-by: Zhao Liu 
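
For readers unfamiliar with the idea, the container described in the commit message can be sketched in plain C as follows. This is only an illustration under simplifying assumptions: it uses a hand-rolled growable pointer array instead of GPtrArray, a single reset callback instead of the three-phase Resettable interface, and all `Sketch*` names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

typedef void (*SketchResetFn)(void *opaque);

/* Hypothetical stand-in for an object implementing Resettable. */
typedef struct {
    SketchResetFn reset;
    void *opaque;
} SketchResettable;

/* Growable array of children, standing in for the GPtrArray. */
typedef struct {
    SketchResettable **children;
    size_t len;
    size_t cap;
} SketchResetContainer;

/* Append a child; returns 0 on success, -1 on allocation failure. */
static int sketch_container_add(SketchResetContainer *rc,
                                SketchResettable *obj)
{
    if (rc->len == rc->cap) {
        size_t ncap = rc->cap ? rc->cap * 2 : 4;
        SketchResettable **n = realloc(rc->children, ncap * sizeof(*n));
        if (!n) {
            return -1;
        }
        rc->children = n;
        rc->cap = ncap;
    }
    rc->children[rc->len++] = obj;
    return 0;
}

/* Remove a child with a linear scan; returns -1 if it is not present. */
static int sketch_container_remove(SketchResetContainer *rc,
                                   SketchResettable *obj)
{
    for (size_t i = 0; i < rc->len; i++) {
        if (rc->children[i] == obj) {
            memmove(&rc->children[i], &rc->children[i + 1],
                    (rc->len - i - 1) * sizeof(*rc->children));
            rc->len--;
            return 0;
        }
    }
    return -1;
}

/* Resetting the container resets every child, in insertion order. */
static void sketch_container_reset(SketchResetContainer *rc)
{
    for (size_t i = 0; i < rc->len; i++) {
        rc->children[i]->reset(rc->children[i]->opaque);
    }
}

/* Example callback: counts how many times the object was reset. */
static void sketch_count_reset(void *opaque)
{
    ++*(int *)opaque;
}
```

The linear scan on removal reflects the trade-off mentioned above: removal is expected to be rare, so a flat array keeps the common add and reset paths simple.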




Re: [PATCH] migration: Don't serialize migration while can't switchover

2024-02-26 Thread Wang, Lei
On 2/22/2024 23:56, Avihai Horon wrote:
> Currently, migration code serializes device data sending during pre-copy
> iterative phase. As noted in the code comment, this is done to prevent
> faster changing device from sending its data over and over.
> 
> However, with switchover-ack capability enabled, this behavior can be
> problematic and may prevent migration from converging. The problem lies
> in the fact that an earlier device may never finish sending its data and
> thus block other devices from sending theirs.
> 
> This bug was observed in several VFIO migration scenarios where some
> workload on the VM prevented RAM from ever reaching a hard zero, not
> allowing VFIO initial pre-copy data to be sent, and thus destination
> could not ack switchover. Note that the same scenario, but without
> switchover-ack, would converge.
> 
> Fix it by not serializing device data sending during pre-copy iterative
> phase if switchover was not acked yet.

Hi Avihai,

Can this bug be solved by ordering the priority of different device's handlers?

> 
> Fixes: 1b4adb10f898 ("migration: Implement switchover ack logic")
> Signed-off-by: Avihai Horon 
> ---
>  migration/savevm.h|  2 +-
>  migration/migration.c |  4 ++--
>  migration/savevm.c| 22 +++---
>  3 files changed, 18 insertions(+), 10 deletions(-)
> 
> diff --git a/migration/savevm.h b/migration/savevm.h
> index 74669733dd6..d4a368b522b 100644
> --- a/migration/savevm.h
> +++ b/migration/savevm.h
> @@ -36,7 +36,7 @@ void qemu_savevm_state_setup(QEMUFile *f);
>  bool qemu_savevm_state_guest_unplug_pending(void);
>  int qemu_savevm_state_resume_prepare(MigrationState *s);
>  void qemu_savevm_state_header(QEMUFile *f);
> -int qemu_savevm_state_iterate(QEMUFile *f, bool postcopy);
> +int qemu_savevm_state_iterate(QEMUFile *f, bool postcopy, bool 
> can_switchover);
>  void qemu_savevm_state_cleanup(void);
>  void qemu_savevm_state_complete_postcopy(QEMUFile *f);
>  int qemu_savevm_state_complete_precopy(QEMUFile *f, bool iterable_only,
> diff --git a/migration/migration.c b/migration/migration.c
> index ab21de2cadb..d8bfe1fb1b9 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -3133,7 +3133,7 @@ static MigIterateState 
> migration_iteration_run(MigrationState *s)
>  }
>  
>  /* Just another iteration step */
> -qemu_savevm_state_iterate(s->to_dst_file, in_postcopy);
> +qemu_savevm_state_iterate(s->to_dst_file, in_postcopy, can_switchover);
>  return MIG_ITERATE_RESUME;
>  }
>  
> @@ -3216,7 +3216,7 @@ static MigIterateState 
> bg_migration_iteration_run(MigrationState *s)
>  {
>  int res;
>  
> -res = qemu_savevm_state_iterate(s->to_dst_file, false);
> +res = qemu_savevm_state_iterate(s->to_dst_file, false, true);
>  if (res > 0) {
>  bg_migration_completion(s);
>  return MIG_ITERATE_BREAK;
> diff --git a/migration/savevm.c b/migration/savevm.c
> index d612c8a9020..3a012796375 100644
> --- a/migration/savevm.c
> +++ b/migration/savevm.c
> @@ -1386,7 +1386,7 @@ int qemu_savevm_state_resume_prepare(MigrationState *s)
>   *   0 : We haven't finished, caller have to go again
>   *   1 : We have finished, we can go to complete phase
>   */
> -int qemu_savevm_state_iterate(QEMUFile *f, bool postcopy)
> +int qemu_savevm_state_iterate(QEMUFile *f, bool postcopy, bool 
> can_switchover)
>  {
>  SaveStateEntry *se;
>  int ret = 1;
> @@ -1430,12 +1430,20 @@ int qemu_savevm_state_iterate(QEMUFile *f, bool 
> postcopy)
>   "%d(%s): %d",
>   se->section_id, se->idstr, ret);
>  qemu_file_set_error(f, ret);
> +return ret;
>  }
> -if (ret <= 0) {
> -/* Do not proceed to the next vmstate before this one reported
> -   completion of the current stage. This serializes the migration
> -   and reduces the probability that a faster changing state is
> -   synchronized over and over again. */
> +
> +if (ret == 0 && can_switchover) {
> +/*
> + * Do not proceed to the next vmstate before this one reported
> + * completion of the current stage. This serializes the migration
> + * and reduces the probability that a faster changing state is
> + * synchronized over and over again.
> + * Do it only if migration can switchover. If migration can't
> + * switchover yet, do proceed to let other devices send their 
> data
> + * too, as this may be required for switchover to be acked and
> + * migration to converge.
> + */
>  break;
>  }
>  }
> @@ -1724,7 +1732,7 @@ static int qemu_savevm_state(QEMUFile *f, Error **errp)
>  qemu_savevm_state_setup(f);
>  
>  while (qemu_file_get_error(f) == 0) {
> -if (qemu_savevm_state_iterate(f, false) > 0) {
> +if (qemu_savevm_state_iterate(f, false, true) 
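
The behaviour change under discussion can be modelled with a small standalone sketch. It is not the QEMU code: `SketchDevice` and `sketch_iterate` are hypothetical, each device simply reports a fixed result, and the way the sketch aggregates the return value (0 if any device is unfinished) is a choice made for the example rather than something taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a device with an iterate handler. */
typedef struct {
    int result; /* <0: error, 0: not finished, 1: finished */
    int calls;  /* how many times the device was asked to send data */
} SketchDevice;

/* One iteration round over all devices.
 * Returns <0 on error, 0 if some device has not finished, 1 otherwise. */
static int sketch_iterate(SketchDevice *devs, size_t n, bool can_switchover)
{
    int ret = 1;

    for (size_t i = 0; i < n; i++) {
        int r = devs[i].result;

        devs[i].calls++;
        if (r < 0) {
            return r;
        }
        if (r == 0) {
            ret = 0;
            if (can_switchover) {
                /* Serialize: do not move on before this device is done. */
                break;
            }
            /* Switchover not possible yet: let later devices send too. */
        }
    }
    return ret;
}
```

With can_switchover set to false, a device that never finishes no longer starves the devices registered after it, which is the convergence problem the patch describes.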

RE: [PATCH rfcv2 18/18] intel_iommu: Block migration if cap is updated

2024-02-26 Thread Duan, Zhenzhong


>-Original Message-
>From: Joao Martins 
>Subject: Re: [PATCH rfcv2 18/18] intel_iommu: Block migration if cap is
>updated
>
>On 01/02/2024 07:28, Zhenzhong Duan wrote:
>> When there is VFIO device and vIOMMU cap/ecap is updated based on host
>> IOMMU cap/ecap, migration should be blocked.
>>
>> Signed-off-by: Zhenzhong Duan 
>
>Is this really needed considering migration with vIOMMU is already blocked
>anyways?

VFIO device can be hot unplugged, then blocker due to vIOMMU is removed,
but we still need a blocker for cap/ecap update.

Thanks
Zhenzhong

>
>> ---
>>  hw/i386/intel_iommu.c | 16 ++--
>>  1 file changed, 14 insertions(+), 2 deletions(-)
>>
>> diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
>> index 72cc8b2c71..7f9ff653b2 100644
>> --- a/hw/i386/intel_iommu.c
>> +++ b/hw/i386/intel_iommu.c
>> @@ -39,6 +39,7 @@
>>  #include "hw/i386/apic_internal.h"
>>  #include "kvm/kvm_i386.h"
>>  #include "migration/vmstate.h"
>> +#include "migration/blocker.h"
>>  #include "trace.h"
>>
>>  #define S_AW_BITS (VTD_MGAW_FROM_CAP(s->cap) + 1)
>> @@ -3829,6 +3830,8 @@ static int
>vtd_check_legacy_hdev(IntelIOMMUState *s,
>>  return 0;
>>  }
>>
>> +static Error *vtd_mig_blocker;
>> +
>>  static int vtd_check_iommufd_hdev(IntelIOMMUState *s,
>>IOMMUFDDevice *idev,
>>Error **errp)
>> @@ -3860,8 +3863,17 @@ static int
>vtd_check_iommufd_hdev(IntelIOMMUState *s,
>>  tmp_cap |= VTD_CAP_MGAW(host_mgaw + 1);
>>  }
>>
>> -s->cap = tmp_cap;
>> -return 0;
>> +if (s->cap != tmp_cap) {
>> +if (vtd_mig_blocker == NULL) {
>> +error_setg(&vtd_mig_blocker,
>> +   "cap/ecap update from host IOMMU block migration");
>> +ret = migrate_add_blocker(&vtd_mig_blocker, errp);
>> +}
>> +if (!ret) {
>> +s->cap = tmp_cap;
>> +}
>> +}
>> +return ret;
>>  }
>>
>>  static int vtd_check_hdev(IntelIOMMUState *s, VTDHostIOMMUDevice
>*vtd_hdev,



[PATCH v2 5/5] tests: Add migration test for loongarch64

2024-02-26 Thread Bibo Mao
This patch adds migration test support for loongarch64. The test code
mostly comes from aarch64, except that it is booted as BIOS in QEMU, since
the kernel requires ELF format whereas the BIOS uses binary format.

In addition to providing the binary, this patch also includes the source
code and the build script in tests/migration/loongarch64. So users can
change the source and/or re-compile the binary as they wish.

Signed-off-by: Bibo Mao 
Reviewed-by: Fabiano Rosas 
Acked-by: Thomas Huth 
Acked-by: Peter Xu 
---
 tests/migration/Makefile |  2 +-
 tests/migration/loongarch64/Makefile | 18 +
 tests/migration/loongarch64/a-b-kernel.S | 49 
 tests/migration/loongarch64/a-b-kernel.h | 16 
 tests/migration/migration-test.h |  3 ++
 tests/qtest/meson.build  |  4 ++
 tests/qtest/migration-test.c | 10 +
 7 files changed, 101 insertions(+), 1 deletion(-)
 create mode 100644 tests/migration/loongarch64/Makefile
 create mode 100644 tests/migration/loongarch64/a-b-kernel.S
 create mode 100644 tests/migration/loongarch64/a-b-kernel.h

diff --git a/tests/migration/Makefile b/tests/migration/Makefile
index 13e99b1692..cfebfe23f8 100644
--- a/tests/migration/Makefile
+++ b/tests/migration/Makefile
@@ -5,7 +5,7 @@
 # See the COPYING file in the top-level directory.
 #
 
-TARGET_LIST = i386 aarch64 s390x
+TARGET_LIST = i386 aarch64 s390x loongarch64
 
 SRC_PATH = ../..
 
diff --git a/tests/migration/loongarch64/Makefile b/tests/migration/loongarch64/Makefile
new file mode 100644
index 0000000000..5d8719205f
--- /dev/null
+++ b/tests/migration/loongarch64/Makefile
@@ -0,0 +1,18 @@
+# To specify cross compiler prefix, use CROSS_PREFIX=
+#   $ make CROSS_PREFIX=loongarch64-linux-gnu-
+
+.PHONY: all clean
+all: a-b-kernel.h
+
+a-b-kernel.h: loongarch64.kernel
+   echo "$$__note" > $@
+   xxd -i $< | sed -e 's/.*int.*//' >> $@
+
+loongarch64.kernel: loongarch64.elf
+   $(CROSS_PREFIX)objcopy -j .text -O binary $< $@
+
+loongarch64.elf: a-b-kernel.S
+   $(CROSS_PREFIX)gcc -o $@ -nostdlib -Wl,--build-id=none $<
+
+clean:
+   $(RM) *.kernel *.elf
diff --git a/tests/migration/loongarch64/a-b-kernel.S b/tests/migration/loongarch64/a-b-kernel.S
new file mode 100644
index 0000000000..cd543345fe
--- /dev/null
+++ b/tests/migration/loongarch64/a-b-kernel.S
@@ -0,0 +1,49 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/*
+ * Copyright (c) 2024 Loongson Technology Corporation Limited
+ */
+#include "../migration-test.h"
+
+#define LOONGARCH_CSR_CRMD  0
+#define LOONGARCH_VIRT_UART 0x1FE001E0
+.section .text
+
+.globl  _start
+_start:
+/* output char 'A' to UART16550 */
+li.d$t0, LOONGARCH_VIRT_UART
+li.w$t1, 'A'
+st.b$t1, $t0, 0
+
+/* traverse test memory region */
+li.d$t0, LOONGARCH_TEST_MEM_START
+li.d$t1, LOONGARCH_TEST_MEM_END
+li.d$t2, TEST_MEM_PAGE_SIZE
+li.d$t4, LOONGARCH_VIRT_UART
+li.w$t5, 'B'
+
+clean:
+st.b$zero, $t0, 0
+add.d   $t0,   $t0, $t2
+bne $t0,   $t1, clean
+/* keeps a counter so we can limit the output speed */
+addi.d  $t6,   $zero, 0
+
+mainloop:
+li.d$t0, LOONGARCH_TEST_MEM_START
+
+innerloop:
+ld.bu   $t3, $t0, 0
+addi.w  $t3, $t3, 1
+ext.w.b $t3, $t3
+st.b$t3, $t0, 0
+add.d   $t0, $t0, $t2
+bne $t0, $t1, innerloop
+
+addi.d  $t6, $t6, 1
+andi$t6, $t6, 31
+bnez$t6, mainloop
+
+st.b$t5, $t4, 0
+b   mainloop
+nop
diff --git a/tests/migration/loongarch64/a-b-kernel.h b/tests/migration/loongarch64/a-b-kernel.h
new file mode 100644
index 0000000000..b3fe466754
--- /dev/null
+++ b/tests/migration/loongarch64/a-b-kernel.h
@@ -0,0 +1,16 @@
+/* This file is automatically generated from the assembly file in
+* tests/migration/loongarch64. Edit that file and then run "make all"
+* inside tests/migration to update, and then remember to send both
+* the header and the assembler differences in your patch submission.
+*/
+unsigned char loongarch64_kernel[] = {
+  0x0c, 0xc0, 0x3f, 0x14, 0x8c, 0x81, 0x87, 0x03, 0x0d, 0x04, 0x81, 0x03,
+  0x8d, 0x01, 0x00, 0x29, 0x0c, 0x00, 0x04, 0x14, 0x0d, 0x80, 0x0c, 0x14,
+  0x2e, 0x00, 0x00, 0x14, 0x10, 0xc0, 0x3f, 0x14, 0x10, 0x82, 0x87, 0x03,
+  0x11, 0x08, 0x81, 0x03, 0x80, 0x01, 0x00, 0x29, 0x8c, 0xb9, 0x10, 0x00,
+  0x8d, 0xf9, 0xff, 0x5f, 0x12, 0x00, 0xc0, 0x02, 0x0c, 0x00, 0x04, 0x14,
+  0x8f, 0x01, 0x00, 0x2a, 0xef, 0x05, 0x80, 0x02, 0xef, 0x5d, 0x00, 0x00,
+  0x8f, 0x01, 0x00, 0x29, 0x8c, 0xb9, 0x10, 0x00, 0x8d, 0xed, 0xff, 0x5f,
+  0x52, 0x06, 0xc0, 0x02, 0x52, 0x7e, 0x40, 0x03, 0x5f, 0xde, 0xff, 0x47,
+  0x11, 0x02, 0x00, 0x29, 0xff, 0xd7, 0xff, 0x53, 0x00, 0x00, 0x40, 0x03
+};
diff --git a/tests/migration/migration-test.h b/tests/migration/migration-test.h
index 68512c0b1b..f402e48349 100644
--- a/tests/migration/migration-test.h
+++ b/tests/migration/migration-test.h
@@ 
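
For readers who do not read LoongArch assembly, the guest workload in a-b-kernel.S can be approximated in C as below. This is a sketch under stated assumptions, not part of the patch: the functions are hypothetical, the test memory is an ordinary buffer rather than the LOONGARCH_TEST_MEM_START..END region, the initial 'A' output is omitted, and instead of writing 'B' to the UART the sketch only counts how often it would be written.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One pass of the inner loop: increment the first byte of every page. */
static void sketch_touch_pages(uint8_t *mem, size_t len, size_t page_size)
{
    for (size_t off = 0; off < len; off += page_size) {
        mem[off]++;
    }
}

/* Run 'passes' passes over the region and return how many times 'B'
 * would have been written: once every 32 passes, which is how the
 * assembly limits the output speed. */
static unsigned sketch_run(uint8_t *mem, size_t len, size_t page_size,
                           unsigned passes)
{
    unsigned counter = 0;
    unsigned printed = 0;

    for (unsigned p = 0; p < passes; p++) {
        sketch_touch_pages(mem, len, page_size);
        counter = (counter + 1) & 31;
        if (counter == 0) {
            printed++;
        }
    }
    return printed;
}
```

Because every page is dirtied on each pass, the guest keeps generating dirty memory for the migration to chase, and the periodic 'B' gives the test a sign that the guest is still running.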

[PATCH v2 0/5] Add migration test for loongarch64

2024-02-26 Thread Bibo Mao
A migration test case is added for loongarch64 here. Since a compat machine
type is required for the migration test case, a compat machine for QEMU 9.0
is also added for the loongarch virt machine.

The migration test case passes in both TCG and KVM mode with this
series applied.

---
Change in v2:
  1. Keep the default memory size unchanged at 1GB, only change the minimum
memory size to 256MB
  2. Remove tab char in file tests/migration/loongarch64/a-b-kernel.S
  3. Rebase patch on
https://patchwork.kernel.org/project/qemu-devel/patch/0bd892aa9b88e0f4cc904cb70efd0251fc1cde29.1708336919.git.lixiang...@loongson.cn
to avoid conflicts
---
Bibo Mao (5):
  hw/loongarch: Rename LOONGARCH_MACHINE with VIRT_MACHINE.
  hw/loongarch: Rename LoongArchMachineState with VirtMachineState
  hw/loongarch: Add compat machine for 9.0
  hw/loongarch: Set minimium memory size as 256M
  tests: Add migration test for loongarch64

 hw/loongarch/acpi-build.c|  80 +++---
 hw/loongarch/fw_cfg.c|   2 +-
 hw/loongarch/fw_cfg.h|   2 +-
 hw/loongarch/virt.c  | 345 +--
 include/hw/loongarch/virt.h  |  10 +-
 tests/migration/Makefile |   2 +-
 tests/migration/loongarch64/Makefile |  18 ++
 tests/migration/loongarch64/a-b-kernel.S |  49 
 tests/migration/loongarch64/a-b-kernel.h |  16 ++
 tests/migration/migration-test.h |   3 +
 tests/qtest/meson.build  |   4 +
 tests/qtest/migration-test.c |  10 +
 12 files changed, 337 insertions(+), 204 deletions(-)
 create mode 100644 tests/migration/loongarch64/Makefile
 create mode 100644 tests/migration/loongarch64/a-b-kernel.S
 create mode 100644 tests/migration/loongarch64/a-b-kernel.h


base-commit: 03d496a992d98650315af41be7c0ca6de2a28da1
-- 
2.39.3




[PATCH v2 3/5] hw/loongarch: Add compat machine for 9.0

2024-02-26 Thread Bibo Mao
Since migration test case requires compat machine type support,
compat machine is added for qemu 9.0 here.

Signed-off-by: Bibo Mao 
---
 hw/loongarch/virt.c | 60 +++--
 1 file changed, 47 insertions(+), 13 deletions(-)

diff --git a/hw/loongarch/virt.c b/hw/loongarch/virt.c
index 3bc35c58c9..f37f642ede 100644
--- a/hw/loongarch/virt.c
+++ b/hw/loongarch/virt.c
@@ -46,6 +46,32 @@
 #include "hw/block/flash.h"
 #include "qemu/error-report.h"
 
+#define DEFINE_VIRT_MACHINE_LATEST(major, minor, latest) \
+static void virt_##major##_##minor##_class_init(ObjectClass *oc, \
+void *data) \
+{ \
+MachineClass *mc = MACHINE_CLASS(oc); \
+virt_machine_##major##_##minor##_options(mc); \
+mc->desc = "QEMU " # major "." # minor " ARM Virtual Machine"; \
+if (latest) { \
+mc->alias = "virt"; \
+} \
+} \
+static const TypeInfo machvirt_##major##_##minor##_info = { \
+.name = MACHINE_TYPE_NAME("virt-" # major "." # minor), \
+.parent = TYPE_VIRT_MACHINE, \
+.class_init = virt_##major##_##minor##_class_init, \
+}; \
+static void machvirt_machine_##major##_##minor##_init(void) \
+{ \
+type_register_static(&machvirt_##major##_##minor##_info); \
+} \
+type_init(machvirt_machine_##major##_##minor##_init);
+
+#define DEFINE_VIRT_MACHINE_AS_LATEST(major, minor) \
+DEFINE_VIRT_MACHINE_LATEST(major, minor, true)
+#define DEFINE_VIRT_MACHINE(major, minor) \
+DEFINE_VIRT_MACHINE_LATEST(major, minor, false)
 
 struct loaderparams {
 uint64_t ram_size;
@@ -1200,18 +1226,26 @@ static void virt_class_init(ObjectClass *oc, void *data)
 #endif
 }
 
-static const TypeInfo virt_machine_types[] = {
-{
-.name   = TYPE_VIRT_MACHINE,
-.parent = TYPE_MACHINE,
-.instance_size  = sizeof(VirtMachineState),
-.class_init = virt_class_init,
-.instance_init = virt_machine_initfn,
-.interfaces = (InterfaceInfo[]) {
- { TYPE_HOTPLUG_HANDLER },
- { }
-},
-}
+static const TypeInfo virt_machine_info = {
+.name   = TYPE_VIRT_MACHINE,
+.parent = TYPE_MACHINE,
+.abstract   = true,
+.instance_size  = sizeof(VirtMachineState),
+.class_init = virt_class_init,
+.instance_init = virt_machine_initfn,
+.interfaces = (InterfaceInfo[]) {
+{ TYPE_HOTPLUG_HANDLER },
+{ }
+},
 };
 
-DEFINE_TYPES(virt_machine_types)
+static void machvirt_machine_init(void)
+{
+type_register_static(&virt_machine_info);
+}
+type_init(machvirt_machine_init);
+
+static void virt_machine_9_0_options(MachineClass *mc)
+{
+}
+DEFINE_VIRT_MACHINE_AS_LATEST(9, 0)
-- 
2.39.3
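
For readers unfamiliar with the versioned-machine macro pattern used in the
patch above, here is a minimal standalone sketch of the preprocessor
technique only. It assumes nothing about QEMU: DemoMachineClass and the
DEMO_* macros are hypothetical stand-ins, and the 8.2 entry exists purely to
show a non-latest version (it is not part of the patch).

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for QEMU's MachineClass; not the real structure. */
typedef struct DemoMachineClass {
    const char *name;   /* e.g. "virt-9.0" */
    const char *desc;   /* human readable description */
    const char *alias;  /* "virt" for the latest version, otherwise NULL */
} DemoMachineClass;

/*
 * '#' turns a macro argument into a string literal and '##' pastes tokens
 * together, so one invocation generates one class_init function per version.
 */
#define DEMO_DEFINE_MACHINE_LATEST(major, minor, latest)                  \
    static void demo_##major##_##minor##_class_init(DemoMachineClass *mc) \
    {                                                                     \
        mc->name = "virt-" #major "." #minor;                             \
        mc->desc = "demo " #major "." #minor " virtual machine";          \
        mc->alias = (latest) ? "virt" : NULL;                             \
    }

#define DEMO_DEFINE_MACHINE_AS_LATEST(major, minor) \
    DEMO_DEFINE_MACHINE_LATEST(major, minor, true)
#define DEMO_DEFINE_MACHINE(major, minor) \
    DEMO_DEFINE_MACHINE_LATEST(major, minor, false)

/* Only the latest version carries the unversioned "virt" alias. */
DEMO_DEFINE_MACHINE_AS_LATEST(9, 0)
DEMO_DEFINE_MACHINE(8, 2)
```

Calling demo_9_0_class_init() on a zeroed DemoMachineClass sets the name to
"virt-9.0" and the alias to "virt"; demo_8_2_class_init() leaves the alias
NULL, which is the point of having a separate "latest" variant.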




[PATCH v2 1/5] hw/loongarch: Rename LOONGARCH_MACHINE with VIRT_MACHINE.

2024-02-26 Thread Bibo Mao
On LoongArch systems there is currently only the virt machine type, so the
name LOONGARCH_MACHINE is confusing. Rename it to VIRT_MACHINE. Machine names
for other real hardware boards can be added in the future.

Signed-off-by: Bibo Mao 
---
 hw/loongarch/acpi-build.c   |  8 
 hw/loongarch/virt.c | 19 +--
 include/hw/loongarch/virt.h |  4 ++--
 3 files changed, 15 insertions(+), 16 deletions(-)

diff --git a/hw/loongarch/acpi-build.c b/hw/loongarch/acpi-build.c
index e5ab1080af..72322cdb1e 100644
--- a/hw/loongarch/acpi-build.c
+++ b/hw/loongarch/acpi-build.c
@@ -167,7 +167,7 @@ build_srat(GArray *table_data, BIOSLinker *linker, MachineState *machine)
 int i, arch_id, node_id;
 uint64_t mem_len, mem_base;
 int nb_numa_nodes = machine->numa_state->num_nodes;
-LoongArchMachineState *lams = LOONGARCH_MACHINE(machine);
+LoongArchMachineState *lams = VIRT_MACHINE(machine);
 MachineClass *mc = MACHINE_GET_CLASS(lams);
 const CPUArchIdList *arch_ids = mc->possible_cpu_arch_ids(machine);
 AcpiTable table = { .sig = "SRAT", .rev = 1, .oem_id = lams->oem_id,
@@ -279,7 +279,7 @@ static void
 build_la_ged_aml(Aml *dsdt, MachineState *machine)
 {
 uint32_t event;
-LoongArchMachineState *lams = LOONGARCH_MACHINE(machine);
+LoongArchMachineState *lams = VIRT_MACHINE(machine);
 
 build_ged_aml(dsdt, "\\_SB."GED_DEVICE,
   HOTPLUG_HANDLER(lams->acpi_ged),
@@ -391,7 +391,7 @@ static void
 build_dsdt(GArray *table_data, BIOSLinker *linker, MachineState *machine)
 {
 Aml *dsdt, *scope, *pkg;
-LoongArchMachineState *lams = LOONGARCH_MACHINE(machine);
+LoongArchMachineState *lams = VIRT_MACHINE(machine);
 AcpiTable table = { .sig = "DSDT", .rev = 1, .oem_id = lams->oem_id,
 .oem_table_id = lams->oem_table_id };
 
@@ -421,7 +421,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker, MachineState *machine)
 
 static void acpi_build(AcpiBuildTables *tables, MachineState *machine)
 {
-LoongArchMachineState *lams = LOONGARCH_MACHINE(machine);
+LoongArchMachineState *lams = VIRT_MACHINE(machine);
 GArray *table_offsets;
 AcpiFadtData fadt_data;
 unsigned facs, rsdt, dsdt;
diff --git a/hw/loongarch/virt.c b/hw/loongarch/virt.c
index 1e98d8bda5..0d4ea57e5b 100644
--- a/hw/loongarch/virt.c
+++ b/hw/loongarch/virt.c
@@ -823,7 +823,7 @@ static void loongarch_init(MachineState *machine)
 ram_addr_t ram_size = machine->ram_size;
 uint64_t highram_size = 0, phyAddr = 0;
 MemoryRegion *address_space_mem = get_system_memory();
-LoongArchMachineState *lams = LOONGARCH_MACHINE(machine);
+LoongArchMachineState *lams = VIRT_MACHINE(machine);
 int nb_numa_nodes = machine->numa_state->num_nodes;
 NodeInfo *numa_info = machine->numa_state->nodes;
 int i;
@@ -990,7 +990,7 @@ bool loongarch_is_acpi_enabled(LoongArchMachineState *lams)
 static void loongarch_get_acpi(Object *obj, Visitor *v, const char *name,
void *opaque, Error **errp)
 {
-LoongArchMachineState *lams = LOONGARCH_MACHINE(obj);
+LoongArchMachineState *lams = VIRT_MACHINE(obj);
 OnOffAuto acpi = lams->acpi;
 
 visit_type_OnOffAuto(v, name, &acpi, errp);
@@ -999,14 +999,14 @@ static void loongarch_get_acpi(Object *obj, Visitor *v, const char *name,
 static void loongarch_set_acpi(Object *obj, Visitor *v, const char *name,
void *opaque, Error **errp)
 {
-LoongArchMachineState *lams = LOONGARCH_MACHINE(obj);
+LoongArchMachineState *lams = VIRT_MACHINE(obj);
 
 visit_type_OnOffAuto(v, name, &lams->acpi, errp);
 }
 
 static void loongarch_machine_initfn(Object *obj)
 {
-LoongArchMachineState *lams = LOONGARCH_MACHINE(obj);
+LoongArchMachineState *lams = VIRT_MACHINE(obj);
 
 lams->acpi = ON_OFF_AUTO_AUTO;
 lams->oem_id = g_strndup(ACPI_BUILD_APPNAME6, 6);
@@ -1038,7 +1038,7 @@ static void virt_machine_device_pre_plug(HotplugHandler *hotplug_dev,
 static void virt_mem_unplug_request(HotplugHandler *hotplug_dev,
  DeviceState *dev, Error **errp)
 {
-LoongArchMachineState *lams = LOONGARCH_MACHINE(hotplug_dev);
+LoongArchMachineState *lams = VIRT_MACHINE(hotplug_dev);
 
 /* the acpi ged is always exist */
 hotplug_handler_unplug_request(HOTPLUG_HANDLER(lams->acpi_ged), dev,
@@ -1056,7 +1056,7 @@ static void virt_machine_device_unplug_request(HotplugHandler *hotplug_dev,
 static void virt_mem_unplug(HotplugHandler *hotplug_dev,
  DeviceState *dev, Error **errp)
 {
-LoongArchMachineState *lams = LOONGARCH_MACHINE(hotplug_dev);
+LoongArchMachineState *lams = VIRT_MACHINE(hotplug_dev);
 
 hotplug_handler_unplug(HOTPLUG_HANDLER(lams->acpi_ged), dev, errp);
 pc_dimm_unplug(PC_DIMM(dev), MACHINE(lams));
@@ -1074,7 +1074,7 @@ static void virt_machine_device_unplug(HotplugHandler *hotplug_dev,
 static void virt_mem_plug(HotplugHandler *hotplug_dev,

[PATCH v2 4/5] hw/loongarch: Set minimum memory size as 256M

2024-02-26 Thread Bibo Mao
The minimum memory size for the LoongArch UEFI BIOS is 256M, and some
test cases such as migration and qos use 256M of memory by default.

Set the minimum memory size for the LoongArch VirtMachine to 256M rather
than 1G, so that test cases running with 256M of memory can pass.

Signed-off-by: Bibo Mao 
---
 hw/loongarch/virt.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/loongarch/virt.c b/hw/loongarch/virt.c
index f37f642ede..1dadb8e299 100644
--- a/hw/loongarch/virt.c
+++ b/hw/loongarch/virt.c
@@ -864,8 +864,8 @@ static void virt_init(MachineState *machine)
 cpu_model = LOONGARCH_CPU_TYPE_NAME("la464");
 }
 
-if (ram_size < 1 * GiB) {
-error_report("ram_size must be greater than 1G.");
+if (ram_size < 256 * MiB) {
+error_report("ram_size must be greater than 256M.");
 exit(1);
 }
 create_fdt(vms);
-- 
2.39.3




[PATCH v2 2/5] hw/loongarch: Rename LoongArchMachineState with VirtMachineState

2024-02-26 Thread Bibo Mao
Rename LoongArchMachineState to VirtMachineState, change the variable
name LoongArchMachineState *lams to VirtMachineState *vms, and rename
the loongarch_xxx() functions to virt_xxx() as well.

Signed-off-by: Bibo Mao 
---
 hw/loongarch/acpi-build.c   |  80 +-
 hw/loongarch/fw_cfg.c   |   2 +-
 hw/loongarch/fw_cfg.h   |   2 +-
 hw/loongarch/virt.c | 290 ++--
 include/hw/loongarch/virt.h |   8 +-
 5 files changed, 191 insertions(+), 191 deletions(-)

diff --git a/hw/loongarch/acpi-build.c b/hw/loongarch/acpi-build.c
index 72322cdb1e..b6741809ef 100644
--- a/hw/loongarch/acpi-build.c
+++ b/hw/loongarch/acpi-build.c
@@ -105,14 +105,14 @@ build_facs(GArray *table_data)
 
 /* build MADT */
 static void
-build_madt(GArray *table_data, BIOSLinker *linker, LoongArchMachineState *lams)
+build_madt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
 {
-MachineState *ms = MACHINE(lams);
+MachineState *ms = MACHINE(vms);
 MachineClass *mc = MACHINE_GET_CLASS(ms);
 const CPUArchIdList *arch_ids = mc->possible_cpu_arch_ids(ms);
 int i, arch_id;
-AcpiTable table = { .sig = "APIC", .rev = 1, .oem_id = lams->oem_id,
-.oem_table_id = lams->oem_table_id };
+AcpiTable table = { .sig = "APIC", .rev = 1, .oem_id = vms->oem_id,
+.oem_table_id = vms->oem_table_id };
 
 acpi_table_begin(&table, table_data);
 
@@ -167,11 +167,11 @@ build_srat(GArray *table_data, BIOSLinker *linker, MachineState *machine)
 int i, arch_id, node_id;
 uint64_t mem_len, mem_base;
 int nb_numa_nodes = machine->numa_state->num_nodes;
-LoongArchMachineState *lams = VIRT_MACHINE(machine);
-MachineClass *mc = MACHINE_GET_CLASS(lams);
+VirtMachineState *vms = VIRT_MACHINE(machine);
+MachineClass *mc = MACHINE_GET_CLASS(vms);
 const CPUArchIdList *arch_ids = mc->possible_cpu_arch_ids(machine);
-AcpiTable table = { .sig = "SRAT", .rev = 1, .oem_id = lams->oem_id,
-.oem_table_id = lams->oem_table_id };
+AcpiTable table = { .sig = "SRAT", .rev = 1, .oem_id = vms->oem_id,
+.oem_table_id = vms->oem_table_id };
 
 acpi_table_begin(&table, table_data);
 build_append_int_noprefix(table_data, 1, 4); /* Reserved */
@@ -279,13 +279,13 @@ static void
 build_la_ged_aml(Aml *dsdt, MachineState *machine)
 {
 uint32_t event;
-LoongArchMachineState *lams = VIRT_MACHINE(machine);
+VirtMachineState *vms = VIRT_MACHINE(machine);
 
 build_ged_aml(dsdt, "\\_SB."GED_DEVICE,
-  HOTPLUG_HANDLER(lams->acpi_ged),
+  HOTPLUG_HANDLER(vms->acpi_ged),
   VIRT_SCI_IRQ, AML_SYSTEM_MEMORY,
   VIRT_GED_EVT_ADDR);
-event = object_property_get_uint(OBJECT(lams->acpi_ged),
+event = object_property_get_uint(OBJECT(vms->acpi_ged),
  "ged-event", &error_abort);
 if (event & ACPI_GED_MEM_HOTPLUG_EVT) {
 build_memory_hotplug_aml(dsdt, machine->ram_slots, "\\_SB", NULL,
@@ -295,7 +295,7 @@ build_la_ged_aml(Aml *dsdt, MachineState *machine)
 acpi_dsdt_add_power_button(dsdt);
 }
 
-static void build_pci_device_aml(Aml *scope, LoongArchMachineState *lams)
+static void build_pci_device_aml(Aml *scope, VirtMachineState *vms)
 {
 struct GPEXConfig cfg = {
 .mmio64.base = VIRT_PCI_MEM_BASE,
@@ -305,13 +305,13 @@ static void build_pci_device_aml(Aml *scope, LoongArchMachineState *lams)
 .ecam.base   = VIRT_PCI_CFG_BASE,
 .ecam.size   = VIRT_PCI_CFG_SIZE,
 .irq = VIRT_GSI_BASE + VIRT_DEVICE_IRQS,
-.bus = lams->pci_bus,
+.bus = vms->pci_bus,
 };
 
 acpi_dsdt_add_gpex(scope, &cfg);
 }
 
-static void build_flash_aml(Aml *scope, LoongArchMachineState *lams)
+static void build_flash_aml(Aml *scope, VirtMachineState *vms)
 {
 Aml *dev, *crs;
 MemoryRegion *flash_mem;
@@ -322,11 +322,11 @@ static void build_flash_aml(Aml *scope, LoongArchMachineState *lams)
 hwaddr flash1_base;
 hwaddr flash1_size;
 
-flash_mem = pflash_cfi01_get_memory(lams->flash[0]);
+flash_mem = pflash_cfi01_get_memory(vms->flash[0]);
 flash0_base = flash_mem->addr;
 flash0_size = memory_region_size(flash_mem);
 
-flash_mem = pflash_cfi01_get_memory(lams->flash[1]);
+flash_mem = pflash_cfi01_get_memory(vms->flash[1]);
 flash1_base = flash_mem->addr;
 flash1_size = memory_region_size(flash_mem);
 
@@ -352,7 +352,7 @@ static void build_flash_aml(Aml *scope, LoongArchMachineState *lams)
 }
 
 #ifdef CONFIG_TPM
-static void acpi_dsdt_add_tpm(Aml *scope, LoongArchMachineState *vms)
+static void acpi_dsdt_add_tpm(Aml *scope, VirtMachineState *vms)
 {
 PlatformBusDevice *pbus = PLATFORM_BUS_DEVICE(vms->platform_bus_dev);
 hwaddr pbus_base = VIRT_PLATFORM_BUS_BASEADDRESS;
@@ -391,18 +391,18 @@ static void
 build_dsdt(GArray *table_data, BIOSLinker *linker, 

RE: [PATCH v3 0/4] RISC-V: Modularize common match conditions for trigger

2024-02-26 Thread 張哲嘉
Hi Alistair,

> -Original Message-
> From: Alistair Francis 
> Sent: Tuesday, February 27, 2024 8:02 AM
> To: Alvin Che-Chia Chang(張哲嘉) 
> Cc: qemu-ri...@nongnu.org; qemu-devel@nongnu.org;
> alistair.fran...@wdc.com; bin.m...@windriver.com; liwei1...@gmail.com;
> dbarb...@ventanamicro.com; zhiwei_...@linux.alibaba.com
> Subject: Re: [PATCH v3 0/4] RISC-V: Modularize common match conditions for
> trigger
>
> On Mon, Feb 26, 2024 at 5:10 PM Alvin Chang via  wrote:
> >
> > According to latest RISC-V Debug specification version 1.0 [1], the
>
> The issue here is that we really only support the "debug" spec. That's the
> 0.13 version of the spec.
>
> We do also support bits of the 1.0 spec, but those should be changed to be
> hidden behind the new extension flags like "Sdtrig"
>
> I think this patch still applies to the 0.13 version though. Do you mind
> changing this to target the 0.13 version of the spec instead?

I have changed it to target 0.13
Please see patch v4, thanks!


Sincerely,
Alvin Chang

>
> Ideally we can then support the new Sdtrig extension in the future
>
> Alistair
>
> > enabled privilege levels of the trigger is common match conditions for
> > all the types of the trigger.
> >
> > This series modularize the code for checking the privilege levels of
> > type 2/3/6 triggers by implementing functions trigger_common_match()
> > and trigger_priv_match().
> >
> > Additional match conditions, such as CSR tcontrol and textra, can be
> > further implemented into trigger_common_match() in the future.
> >
> > [1]: https://github.com/riscv/riscv-debug-spec/releases/tag/1.0.0-rc1-asciidoc
> >
> > Changes from v2:
> > - Explicitly mention the targeting version of RISC-V Debug Spec.
> >
> > Changes from v1:
> > - Fix typo
> > - Add commit description for changing behavior of looping the triggers
> >   when we check type 2 triggers.
> >
> > Alvin Chang (4):
> >   target/riscv: Add functions for common matching conditions of trigger
> >   target/riscv: Apply modularized matching conditions for breakpoint
> >   target/riscv: Apply modularized matching conditions for watchpoint
> >   target/riscv: Apply modularized matching conditions for icount trigger
> >
> >  target/riscv/debug.c | 124 +--
> >  1 file changed, 83 insertions(+), 41 deletions(-)
> >
> > --
> > 2.34.1
> >
> >


[PATCH v4 1/4] target/riscv: Add functions for common matching conditions of trigger

2024-02-26 Thread Alvin Chang via
According to RISC-V Debug specification version 0.13 [1] (also applied
to version 1.0 [2] but it has not been ratified yet), there are several
common matching conditions before firing a trigger, including the
enabled privilege levels of the trigger.

This commit adds trigger_common_match() to prepare the common matching
conditions for the type 2/3/6 triggers. For now, we just implement
trigger_priv_match() to check if the enabled privilege levels of the
trigger match CPU's current privilege level.

[1]: https://github.com/riscv/riscv-debug-spec/releases/tag/task_group_vote
[2]: https://github.com/riscv/riscv-debug-spec/releases/tag/1.0.0-rc1-asciidoc

Signed-off-by: Alvin Chang 
---
 target/riscv/debug.c | 70 
 1 file changed, 70 insertions(+)

diff --git a/target/riscv/debug.c b/target/riscv/debug.c
index e30d99cc2f..3891236b82 100644
--- a/target/riscv/debug.c
+++ b/target/riscv/debug.c
@@ -241,6 +241,76 @@ static void do_trigger_action(CPURISCVState *env, target_ulong trigger_index)
 }
 }
 
+/*
+ * Check the privilege level of specific trigger matches CPU's current privilege
+ * level.
+ */
+static bool trigger_priv_match(CPURISCVState *env, trigger_type_t type,
+   int trigger_index)
+{
+target_ulong ctrl = env->tdata1[trigger_index];
+
+switch (type) {
+case TRIGGER_TYPE_AD_MATCH:
+/* type 2 trigger cannot be fired in VU/VS mode */
+if (env->virt_enabled) {
+return false;
+}
+/* check U/S/M bit against current privilege level */
+if ((ctrl >> 3) & BIT(env->priv)) {
+return true;
+}
+break;
+case TRIGGER_TYPE_AD_MATCH6:
+if (env->virt_enabled) {
+/* check VU/VS bit against current privilege level */
+if ((ctrl >> 23) & BIT(env->priv)) {
+return true;
+}
+} else {
+/* check U/S/M bit against current privilege level */
+if ((ctrl >> 3) & BIT(env->priv)) {
+return true;
+}
+}
+break;
+case TRIGGER_TYPE_INST_CNT:
+if (env->virt_enabled) {
+/* check VU/VS bit against current privilege level */
+if ((ctrl >> 25) & BIT(env->priv)) {
+return true;
+}
+} else {
+/* check U/S/M bit against current privilege level */
+if ((ctrl >> 6) & BIT(env->priv)) {
+return true;
+}
+}
+break;
+case TRIGGER_TYPE_INT:
+case TRIGGER_TYPE_EXCP:
+case TRIGGER_TYPE_EXT_SRC:
+qemu_log_mask(LOG_UNIMP, "trigger type: %d is not supported\n", type);
+break;
+case TRIGGER_TYPE_NO_EXIST:
+case TRIGGER_TYPE_UNAVAIL:
+qemu_log_mask(LOG_GUEST_ERROR, "trigger type: %d does not exist\n",
+  type);
+break;
+default:
+g_assert_not_reached();
+}
+
+return false;
+}
+
+/* Common matching conditions for all types of the triggers. */
+static bool trigger_common_match(CPURISCVState *env, trigger_type_t type,
+ int trigger_index)
+{
+return trigger_priv_match(env, type, trigger_index);
+}
+
 /* type 2 trigger */
 
 static uint32_t type2_breakpoint_size(CPURISCVState *env, target_ulong ctrl)
-- 
2.34.1
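
As an aside on the "(ctrl >> shift) & BIT(env->priv)" idiom used throughout
trigger_priv_match() above, the following is a minimal standalone sketch of
why it works. It is not QEMU code: the DEMO_* names and demo_priv_enabled()
are hypothetical, and the only layout assumed is the one implied by the
shift of 3 in the type 2 case, i.e. U, S and M enable bits at positions 3, 4
and 6 of the control word.

```c
#include <stdbool.h>
#include <stdint.h>

/* Privilege level encodings (U=0, S=1, M=3), as used for env->priv. */
enum { DEMO_PRV_U = 0, DEMO_PRV_S = 1, DEMO_PRV_M = 3 };

#define DEMO_BIT(n) (1ULL << (n))

/*
 * Shifting the control word so that its U enable bit lands at bit 0 lets
 * the numeric privilege level select the matching enable bit directly:
 * with u_shift == 3, level 0 tests bit 3, level 1 bit 4, level 3 bit 6.
 */
static bool demo_priv_enabled(uint64_t ctrl, unsigned u_shift, unsigned priv)
{
    return ((ctrl >> u_shift) & DEMO_BIT(priv)) != 0;
}
```

For example, a control word with only bits 3 and 6 set enables the trigger
in U and M mode but not in S mode. The same helper covers the other cases
in the patch by passing the shift used there (23 and 25 for the VU/VS bits,
6 for the icount U/S/M bits).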




[PATCH v4 3/4] target/riscv: Apply modularized matching conditions for watchpoint

2024-02-26 Thread Alvin Chang via
We have implemented trigger_common_match(), which checks if the enabled
privilege levels of the trigger match CPU's current privilege level.
Remove the related code in riscv_cpu_debug_check_watchpoint() and invoke
trigger_common_match() to check the privilege levels of the type 2 and
type 6 triggers for the watchpoints.

This commit also changes how the triggers are looped over. In the
previous implementation, if we had a type 2 trigger and
env->virt_enabled was true, we directly returned false and stopped the
loop. Now we keep looping over all the triggers until we find a match.

Only load/store bits and loaded/stored address should be further checked
in riscv_cpu_debug_check_watchpoint().

Signed-off-by: Alvin Chang 
---
 target/riscv/debug.c | 26 ++
 1 file changed, 6 insertions(+), 20 deletions(-)

diff --git a/target/riscv/debug.c b/target/riscv/debug.c
index b7b0fa8945..9f9f332019 100644
--- a/target/riscv/debug.c
+++ b/target/riscv/debug.c
@@ -899,13 +899,12 @@ bool riscv_cpu_debug_check_watchpoint(CPUState *cs, CPUWatchpoint *wp)
 for (i = 0; i < RV_MAX_TRIGGERS; i++) {
 trigger_type = get_trigger_type(env, i);
 
+if (!trigger_common_match(env, trigger_type, i)) {
+continue;
+}
+
 switch (trigger_type) {
 case TRIGGER_TYPE_AD_MATCH:
-/* type 2 trigger cannot be fired in VU/VS mode */
-if (env->virt_enabled) {
-return false;
-}
-
 ctrl = env->tdata1[i];
 addr = env->tdata2[i];
 flags = 0;
@@ -918,10 +917,7 @@ bool riscv_cpu_debug_check_watchpoint(CPUState *cs, CPUWatchpoint *wp)
 }
 
 if ((wp->flags & flags) && (wp->vaddr == addr)) {
-/* check U/S/M bit against current privilege level */
-if ((ctrl >> 3) & BIT(env->priv)) {
-return true;
-}
+return true;
 }
 break;
 case TRIGGER_TYPE_AD_MATCH6:
@@ -937,17 +933,7 @@ bool riscv_cpu_debug_check_watchpoint(CPUState *cs, CPUWatchpoint *wp)
 }
 
 if ((wp->flags & flags) && (wp->vaddr == addr)) {
-if (env->virt_enabled) {
-/* check VU/VS bit against current privilege level */
-if ((ctrl >> 23) & BIT(env->priv)) {
-return true;
-}
-} else {
-/* check U/S/M bit against current privilege level */
-if ((ctrl >> 3) & BIT(env->priv)) {
-return true;
-}
-}
+return true;
 }
 break;
 default:
-- 
2.34.1
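
The behavior change described in the commit message above (skip a
non-matching trigger instead of aborting the scan) can be illustrated with
a deliberately simplified sketch. Everything here is hypothetical:
DemoTrigger collapses the privilege check and the address/flag check into
two precomputed booleans, and the "old" variant bails out on any failed
common check, whereas the real old code only did so for a type 2 trigger
with virtualization enabled.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified trigger record; not QEMU's real state. */
typedef struct DemoTrigger {
    bool common_match;  /* result of the shared privilege check */
    bool addr_match;    /* result of the per-type address/flag check */
} DemoTrigger;

/* Old shape: the first trigger failing the shared check ends the scan. */
static bool demo_check_old(const DemoTrigger *t, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (!t[i].common_match) {
            return false;
        }
        if (t[i].addr_match) {
            return true;
        }
    }
    return false;
}

/* New shape: a failing trigger is skipped and later ones are still tried. */
static bool demo_check_new(const DemoTrigger *t, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (!t[i].common_match) {
            continue;
        }
        if (t[i].addr_match) {
            return true;
        }
    }
    return false;
}
```

With a first trigger that fails the common check and a second one that
passes both checks, the old shape reports no match while the new shape
reports a match, which is the difference the commit message calls out.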




[PATCH v4 0/4] RISC-V: Modularize common match conditions for trigger

2024-02-26 Thread Alvin Chang via
According to the ratified RISC-V Debug specification version 0.13 [1]
(this also applies to version 1.0 [2], which has not been ratified yet),
the enabled privilege levels of a trigger are a common match condition
for all trigger types.

This series modularizes the code for checking the privilege levels of
type 2/3/6 triggers by implementing the functions trigger_common_match()
and trigger_priv_match().

Additional match conditions, such as CSR tcontrol and textra, can be
further implemented into trigger_common_match() in the future.

[1]: https://github.com/riscv/riscv-debug-spec/releases/tag/task_group_vote
[2]: https://github.com/riscv/riscv-debug-spec/releases/tag/1.0.0-rc1-asciidoc

Changes from v3:
- Change this series to target Debug Spec. version 0.13

Changes from v2:
- Explicitly mention the targeting version of RISC-V Debug Spec.

Changes from v1:
- Fix typo
- Add commit description for changing behavior of looping the triggers
  when we check type 2 triggers.

Alvin Chang (4):
  target/riscv: Add functions for common matching conditions of trigger
  target/riscv: Apply modularized matching conditions for breakpoint
  target/riscv: Apply modularized matching conditions for watchpoint
  target/riscv: Apply modularized matching conditions for icount trigger

 target/riscv/debug.c | 124 +--
 1 file changed, 83 insertions(+), 41 deletions(-)

-- 
2.34.1




[PATCH v4 4/4] target/riscv: Apply modularized matching conditions for icount trigger

2024-02-26 Thread Alvin Chang via
We have implemented trigger_common_match(), which checks if the enabled
privilege levels of the trigger match CPU's current privilege level. We
can invoke trigger_common_match() to check the privilege levels of the
type 3 triggers.

Signed-off-by: Alvin Chang 
---
 target/riscv/debug.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/riscv/debug.c b/target/riscv/debug.c
index 9f9f332019..eb45e2c147 100644
--- a/target/riscv/debug.c
+++ b/target/riscv/debug.c
@@ -624,7 +624,7 @@ void helper_itrigger_match(CPURISCVState *env)
 if (get_trigger_type(env, i) != TRIGGER_TYPE_INST_CNT) {
 continue;
 }
-if (check_itrigger_priv(env, i)) {
+if (!trigger_common_match(env, TRIGGER_TYPE_INST_CNT, i)) {
 continue;
 }
 count = itrigger_get_count(env, i);
-- 
2.34.1




[PATCH v4 2/4] target/riscv: Apply modularized matching conditions for breakpoint

2024-02-26 Thread Alvin Chang via
We have implemented trigger_common_match(), which checks if the enabled
privilege levels of the trigger match CPU's current privilege level.
Remove the related code in riscv_cpu_debug_check_breakpoint() and invoke
trigger_common_match() to check the privilege levels of the type 2 and
type 6 triggers for the breakpoints.

This commit also changes how the triggers are looped over. In the
previous implementation, if we had a type 2 trigger and
env->virt_enabled was true, we directly returned false and stopped the
loop. Now we keep looping over all the triggers until we find a match.

Only the execution bit and the executed PC should be further checked in
riscv_cpu_debug_check_breakpoint().

Signed-off-by: Alvin Chang 
Reviewed-by: Alistair Francis 
---
 target/riscv/debug.c | 26 ++
 1 file changed, 6 insertions(+), 20 deletions(-)

diff --git a/target/riscv/debug.c b/target/riscv/debug.c
index 3891236b82..b7b0fa8945 100644
--- a/target/riscv/debug.c
+++ b/target/riscv/debug.c
@@ -855,21 +855,17 @@ bool riscv_cpu_debug_check_breakpoint(CPUState *cs)
 for (i = 0; i < RV_MAX_TRIGGERS; i++) {
 trigger_type = get_trigger_type(env, i);
 
+if (!trigger_common_match(env, trigger_type, i)) {
+continue;
+}
+
 switch (trigger_type) {
 case TRIGGER_TYPE_AD_MATCH:
-/* type 2 trigger cannot be fired in VU/VS mode */
-if (env->virt_enabled) {
-return false;
-}
-
 ctrl = env->tdata1[i];
 pc = env->tdata2[i];
 
 if ((ctrl & TYPE2_EXEC) && (bp->pc == pc)) {
-/* check U/S/M bit against current privilege level */
-if ((ctrl >> 3) & BIT(env->priv)) {
-return true;
-}
+return true;
 }
 break;
 case TRIGGER_TYPE_AD_MATCH6:
@@ -877,17 +873,7 @@ bool riscv_cpu_debug_check_breakpoint(CPUState *cs)
 pc = env->tdata2[i];
 
 if ((ctrl & TYPE6_EXEC) && (bp->pc == pc)) {
-if (env->virt_enabled) {
-/* check VU/VS bit against current privilege level */
-if ((ctrl >> 23) & BIT(env->priv)) {
-return true;
-}
-} else {
-/* check U/S/M bit against current privilege level */
-if ((ctrl >> 3) & BIT(env->priv)) {
-return true;
-}
-}
+return true;
 }
 break;
 default:
-- 
2.34.1




Re: [PATCH v4 08/10] hw/cxl/cxl-mailbox-utils: Add mailbox commands to support add/release dynamic capacity response

2024-02-26 Thread fan
On Mon, Feb 26, 2024 at 06:04:17PM +, Jonathan Cameron wrote:
> On Wed, 21 Feb 2024 10:16:01 -0800
> nifan@gmail.com wrote:
> 
> > From: Fan Ni 
> > 
> > Per CXL spec 3.1, two mailbox commands are implemented:
> > Add Dynamic Capacity Response (Opcode 4802h) 8.2.9.9.9.3, and
> > Release Dynamic Capacity (Opcode 4803h) 8.2.9.9.9.4.
> > 
> > Signed-off-by: Fan Ni 
> 
> Hi Fan, 
> 
> Comments on this are all about corner cases. If we can I think we need
> to cover a few more.  Linux won't hit them (I think) so it will be
> a bit of a pain to test but maybe raw commands enabled and some
> userspace code will let us exercise the corner cases?
> 
> Jonathan
> 
> 
> 
> > +
> > +/*
> > + * CXL r3.1 section 8.2.9.9.9.4: Release Dynamic Capacity (opcode 4803h)
> > + */
> > +static CXLRetCode cmd_dcd_release_dyn_cap(const struct cxl_cmd *cmd,
> > +  uint8_t *payload_in,
> > +  size_t len_in,
> > +  uint8_t *payload_out,
> > +  size_t *len_out,
> > +  CXLCCI *cci)
> > +{
> > +CXLUpdateDCExtentListInPl *in = (void *)payload_in;
> > +CXLType3Dev *ct3d = CXL_TYPE3(cci->d);
> > +CXLDCExtentList *extent_list = &ct3d->dc.extents;
> > +CXLDCExtent *ent;
> > +uint32_t i;
> > +uint64_t dpa, len;
> > +CXLRetCode ret;
> > +
> > +if (in->num_entries_updated == 0) {
> > +return CXL_MBOX_INVALID_INPUT;
> > +}
> > +
> > +ret = cxl_detect_malformed_extent_list(ct3d, in);
> > +if (ret != CXL_MBOX_SUCCESS) {
> > +return ret;
> > +}
> > +
> > +for (i = 0; i < in->num_entries_updated; i++) {
> > +bool found = false;
> > +
> > +dpa = in->updated_entries[i].start_dpa;
> > +len = in->updated_entries[i].len;
> > +
> > +QTAILQ_FOREACH(ent, extent_list, node) {
> > +if (ent->start_dpa <= dpa &&
> > +dpa + len <= ent->start_dpa + ent->len) {
> > +/*
> > + * If an incoming extent covers a portion of an extent
> > + * in the device extent list, remove only the overlapping
> > + * portion, meaning
> > + * 1. the portions that are not covered by the incoming
> > + *extent at both end of the original extent will become
> > + *new extents and inserted to the extent list; and
> > + * 2. the original extent is removed from the extent list;
> > + * 3. dc extent count is updated accordingly.
> > + */
> > +uint64_t ent_start_dpa = ent->start_dpa;
> > +uint64_t ent_len = ent->len;
> > +uint64_t len1 = dpa - ent_start_dpa;
> > +uint64_t len2 = ent_start_dpa + ent_len - dpa - len;
> > +
> > +found = true;
> > +cxl_remove_extent_from_extent_list(extent_list, ent);
> > +ct3d->dc.total_extent_count -= 1;
> > +
> > +if (len1) {
> > +cxl_insert_extent_to_extent_list(extent_list,
> > + ent_start_dpa, len1,
> > + NULL, 0);
> > +ct3d->dc.total_extent_count += 1;
> > +}
> > +if (len2) {
> > +cxl_insert_extent_to_extent_list(extent_list, dpa + len,
> > + len2, NULL, 0);
> > +ct3d->dc.total_extent_count += 1;
> 
> There is a non zero chance that we'll overflow however many extents we claim
> to support. So we need to check that and fail the remove if it happens.
> Could ignore this for now though as that value is (I think!) conservative
> to allow for complex extent list tracking implementations.  Succeeding
> when a naive solution would fail due to running out of extents that it can
> manage is not (I think) a bug.
> 
> > +}
> > +break;
> > +/*Currently we reject the attempt to remove a superset*/
> 
> Space after /* and before */
> 
> I think we need to fix this. Linux isn't going to do it any time soon, but
> I think it's allowed to allocate two extents next to each other then free them
> in one go.  Isn't this case easy to do or are there awkward corners?

If we use the bitmap (indicating each range is filled by valid extents)
in PATCH 10, it should not be that difficult to do.

Fan
> If it's sufficiently nasty (maybe because only part of extent provided
> exists?) then maybe we can leave it for now.
> 
> I worry about something like
> 
> |  EXTENT TO FREE|
> | Exists|   gap   | Exists   |
> Where we have to check for gap before removing anything?
> Does the spec 

Re: [PATCH v4 08/10] hw/cxl/cxl-mailbox-utils: Add mailbox commands to support add/release dynamic capacity response

2024-02-26 Thread fan
On Mon, Feb 26, 2024 at 06:04:17PM +, Jonathan Cameron wrote:
> On Wed, 21 Feb 2024 10:16:01 -0800
> nifan@gmail.com wrote:
> 
> > From: Fan Ni 
> > 
> > Per CXL spec 3.1, two mailbox commands are implemented:
> > Add Dynamic Capacity Response (Opcode 4802h) 8.2.9.9.9.3, and
> > Release Dynamic Capacity (Opcode 4803h) 8.2.9.9.9.4.
> > 
> > Signed-off-by: Fan Ni 
> 
> Hi Fan, 
> 
> Comments on this are all about corner cases. If we can I think we need
> to cover a few more.  Linux won't hit them (I think) so it will be
> a bit of a pain to test but maybe raw commands enabled and some
> userspace code will let us exercise the corner cases?
> 
> Jonathan
> 
> 
> 
> > +
> > +/*
> > + * CXL r3.1 section 8.2.9.9.9.4: Release Dynamic Capacity (opcode 4803h)
> > + */
> > +static CXLRetCode cmd_dcd_release_dyn_cap(const struct cxl_cmd *cmd,
> > +  uint8_t *payload_in,
> > +  size_t len_in,
> > +  uint8_t *payload_out,
> > +  size_t *len_out,
> > +  CXLCCI *cci)
> > +{
> > +CXLUpdateDCExtentListInPl *in = (void *)payload_in;
> > +CXLType3Dev *ct3d = CXL_TYPE3(cci->d);
> > +CXLDCExtentList *extent_list = &ct3d->dc.extents;
> > +CXLDCExtent *ent;
> > +uint32_t i;
> > +uint64_t dpa, len;
> > +CXLRetCode ret;
> > +
> > +if (in->num_entries_updated == 0) {
> > +return CXL_MBOX_INVALID_INPUT;
> > +}
> > +
> > +ret = cxl_detect_malformed_extent_list(ct3d, in);
> > +if (ret != CXL_MBOX_SUCCESS) {
> > +return ret;
> > +}
> > +
> > +for (i = 0; i < in->num_entries_updated; i++) {
> > +bool found = false;
> > +
> > +dpa = in->updated_entries[i].start_dpa;
> > +len = in->updated_entries[i].len;
> > +
> > +QTAILQ_FOREACH(ent, extent_list, node) {
> > +if (ent->start_dpa <= dpa &&
> > +dpa + len <= ent->start_dpa + ent->len) {
> > +/*
> > + * If an incoming extent covers a portion of an extent
> > + * in the device extent list, remove only the overlapping
> > + * portion, meaning
> > + * 1. the portions that are not covered by the incoming
> > + *extent at both end of the original extent will become
> > + *new extents and inserted to the extent list; and
> > + * 2. the original extent is removed from the extent list;
> > + * 3. dc extent count is updated accordingly.
> > + */
> > +uint64_t ent_start_dpa = ent->start_dpa;
> > +uint64_t ent_len = ent->len;
> > +uint64_t len1 = dpa - ent_start_dpa;
> > +uint64_t len2 = ent_start_dpa + ent_len - dpa - len;
> > +
> > +found = true;
> > +cxl_remove_extent_from_extent_list(extent_list, ent);
> > +ct3d->dc.total_extent_count -= 1;
> > +
> > +if (len1) {
> > +cxl_insert_extent_to_extent_list(extent_list,
> > + ent_start_dpa, len1,
> > + NULL, 0);
> > +ct3d->dc.total_extent_count += 1;
> > +}
> > +if (len2) {
> > +cxl_insert_extent_to_extent_list(extent_list, dpa + len,
> > + len2, NULL, 0);
> > +ct3d->dc.total_extent_count += 1;
> 
> There is a non zero chance that we'll overflow however many extents we claim
> to support. So we need to check that and fail the remove if it happens.
> Could ignore this for now though as that value is (I think!) conservative
> to allow for complex extent list tracking implementations.  Succeeding
> when a naive solution would fail due to running out of extents that it can
> manage is not (I think) a bug.

Yeah. Spec r3.1 mentions the overflow issue that add/release extent
requests can raise. We should fail the operation if we run out of
extents and report resource exhausted.
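
A rough sketch of what such a check could look like, kept independent of
the QEMU code: DemoSplit, demo_release_split() and the max_extents
parameter are hypothetical names, the arithmetic mirrors len1/len2 from the
quoted patch, and address overflow of dpa + len is assumed to have been
ruled out earlier.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct DemoSplit {
    uint64_t head_len;  /* leftover before the released range */
    uint64_t tail_len;  /* leftover after the released range */
} DemoSplit;

/*
 * Compute the leftovers when [dpa, dpa + len) is released from the extent
 * [ent_start, ent_start + ent_len). Returns false if the range is not fully
 * inside the extent, or if the extent count after the split would exceed
 * max_extents. *out is filled whenever the range check passes.
 * cur_extents counts the extent being split, so it must be at least 1.
 */
static bool demo_release_split(uint64_t ent_start, uint64_t ent_len,
                               uint64_t dpa, uint64_t len,
                               uint32_t cur_extents, uint32_t max_extents,
                               DemoSplit *out)
{
    uint32_t after;

    if (dpa < ent_start || dpa + len > ent_start + ent_len) {
        return false;
    }

    out->head_len = dpa - ent_start;
    out->tail_len = ent_start + ent_len - dpa - len;

    /* One extent is removed; each non-empty leftover adds one back. */
    after = cur_extents - 1 + (out->head_len ? 1 : 0)
            + (out->tail_len ? 1 : 0);
    return after <= max_extents;
}
```

Releasing the middle of an extent produces two leftovers, so the count
grows by one and the call fails when the list is already full; releasing a
whole extent always shrinks the count. Doing the check before touching the
list means a failed release leaves the extent list unchanged.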

> 
> > +}
> > +break;
> > +/*Currently we reject the attempt to remove a superset*/
> 
> Space after /* and before */
> 
> I think we need to fix this. Linux isn't going to do it any time soon, but
> I think it's allowed to allocate two extents next to each other then free them
> in one go.  Isn't this case easy to do or are there awkward corners?
> If it's sufficiently nasty (maybe because only part of extent provided
> exists?) then maybe we can leave it for now.
> 
> I worry about something like
> 
> |  EXTENT TO FREE|
> | Exists|   gap   | Exists   |
> Where we have to 

Re: [PATCH v3 0/4] RISC-V: Modularize common match conditions for trigger

2024-02-26 Thread Alistair Francis
On Mon, Feb 26, 2024 at 5:10 PM Alvin Chang via  wrote:
>
> According to latest RISC-V Debug specification version 1.0 [1], the

The issue here is that we really only support the "debug" spec. That's
the 0.13 version of the spec.

We do also support bits of the 1.0 spec, but those should be changed
to be hidden behind the new extension flags like "Sdtrig".

I think this patch still applies to the 0.13 version though. Do you
mind changing this to target the 0.13 version of the spec instead?

Ideally we can then support the new Sdtrig extension in the future.

Alistair

> enabled privilege levels of a trigger are a common match condition for
> all trigger types.
>
> This series modularizes the code for checking the privilege levels of
> type 2/3/6 triggers by implementing functions trigger_common_match()
> and trigger_priv_match().
>
> Additional match conditions, such as CSR tcontrol and textra, can be
> further implemented into trigger_common_match() in the future.
>
> [1]: https://github.com/riscv/riscv-debug-spec/releases/tag/1.0.0-rc1-asciidoc
>
> Changes from v2:
> - Explicitly mention the targeting version of RISC-V Debug Spec.
>
> Changes from v1:
> - Fix typo
> - Add commit description for changing behavior of looping the triggers
>   when we check type 2 triggers.
>
> Alvin Chang (4):
>   target/riscv: Add functions for common matching conditions of trigger
>   target/riscv: Apply modularized matching conditions for breakpoint
>   target/riscv: Apply modularized matching conditions for watchpoint
>   target/riscv: Apply modularized matching conditions for icount trigger
>
>  target/riscv/debug.c | 124 +--
>  1 file changed, 83 insertions(+), 41 deletions(-)
>
> --
> 2.34.1
>
>



Re: [PATCH v2 2/4] target/riscv: Apply modularized matching conditions for breakpoint

2024-02-26 Thread Alistair Francis
On Fri, Feb 23, 2024 at 12:22 PM Alvin Chang via  wrote:
>
> We have implemented trigger_common_match(), which checks if the enabled
> privilege levels of the trigger match CPU's current privilege level.
> Remove the related code in riscv_cpu_debug_check_breakpoint() and invoke
> trigger_common_match() to check the privilege levels of the type 2 and
> type 6 triggers for the breakpoints.
>
> This commit also changes the behavior of looping the triggers. In
> previous implementation, if we have a type 2 trigger and
> env->virt_enabled is true, we directly return false to stop the loop.
> Now we keep looping all the triggers until we find a matched trigger.
>
> Only the execution bit and the executed PC should be further checked in
> riscv_cpu_debug_check_breakpoint().
>
> Signed-off-by: Alvin Chang 

Reviewed-by: Alistair Francis 

Alistair

> ---
>  target/riscv/debug.c | 26 ++
>  1 file changed, 6 insertions(+), 20 deletions(-)
>
> diff --git a/target/riscv/debug.c b/target/riscv/debug.c
> index 3891236b82..b7b0fa8945 100644
> --- a/target/riscv/debug.c
> +++ b/target/riscv/debug.c
> @@ -855,21 +855,17 @@ bool riscv_cpu_debug_check_breakpoint(CPUState *cs)
>  for (i = 0; i < RV_MAX_TRIGGERS; i++) {
>  trigger_type = get_trigger_type(env, i);
>
> +if (!trigger_common_match(env, trigger_type, i)) {
> +continue;
> +}
> +
>  switch (trigger_type) {
>  case TRIGGER_TYPE_AD_MATCH:
> -/* type 2 trigger cannot be fired in VU/VS mode */
> -if (env->virt_enabled) {
> -return false;
> -}
> -
>  ctrl = env->tdata1[i];
>  pc = env->tdata2[i];
>
>  if ((ctrl & TYPE2_EXEC) && (bp->pc == pc)) {
> -/* check U/S/M bit against current privilege level */
> -if ((ctrl >> 3) & BIT(env->priv)) {
> -return true;
> -}
> +return true;
>  }
>  break;
>  case TRIGGER_TYPE_AD_MATCH6:
> @@ -877,17 +873,7 @@ bool riscv_cpu_debug_check_breakpoint(CPUState *cs)
>  pc = env->tdata2[i];
>
>  if ((ctrl & TYPE6_EXEC) && (bp->pc == pc)) {
> -if (env->virt_enabled) {
> -/* check VU/VS bit against current privilege level */
> -if ((ctrl >> 23) & BIT(env->priv)) {
> -return true;
> -}
> -} else {
> -/* check U/S/M bit against current privilege level */
> -if ((ctrl >> 3) & BIT(env->priv)) {
> -return true;
> -}
> -}
> +return true;
>  }
>  break;
>  default:
> --
> 2.34.1
>
>
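
For readers without patch 1/4 at hand, the privilege checks removed by the
hunks above can be folded into one helper roughly like this. This is only
a sketch reconstructed from the deleted lines; the names are invented and
the real trigger_common_match()/trigger_priv_match() may differ. Only
type 2 and type 6 are modelled, and returning false for other types is an
arbitrary choice here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BIT(n) (1ULL << (n))

/* Illustrative stand-ins for the trigger type constants. */
enum sketch_trigger_type {
    SKETCH_TYPE_AD_MATCH = 2,
    SKETCH_TYPE_AD_MATCH6 = 6,
};

/* Does the trigger's tdata1 (ctrl) enable the current privilege level? */
static bool sketch_priv_match(int type, uint64_t ctrl,
                              unsigned priv, bool virt_enabled)
{
    assert(priv <= 3);

    switch (type) {
    case SKETCH_TYPE_AD_MATCH:
        /* type 2 trigger cannot be fired in VU/VS mode */
        if (virt_enabled) {
            return false;
        }
        /* check U/S/M bit against current privilege level */
        return ((ctrl >> 3) & BIT(priv)) != 0;
    case SKETCH_TYPE_AD_MATCH6:
        if (virt_enabled) {
            /* check VU/VS bit against current privilege level */
            return ((ctrl >> 23) & BIT(priv)) != 0;
        }
        /* check U/S/M bit against current privilege level */
        return ((ctrl >> 3) & BIT(priv)) != 0;
    default:
        return false;
    }
}
```

With such a helper the loop can "continue" on a mismatch instead of
returning early, which is the behaviour change described in the commit
message.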



Re: [PATCH v4 22/34] migration/multifd: Prepare multifd sync for fixed-ram migration

2024-02-26 Thread Fabiano Rosas
Peter Xu  writes:

> On Tue, Feb 20, 2024 at 07:41:26PM -0300, Fabiano Rosas wrote:
>> The fixed-ram migration can be performed live or non-live, but it is
>> always asynchronous, i.e. the source machine and the destination
>> machine are not migrating at the same time. We only need some pieces
>> of the multifd sync operations.
>> 
>> multifd_send_sync_main()
>> 
>>   Issued by the ram migration code on the migration thread, causes the
>>   multifd send channels to synchronize with the migration thread and
>>   makes the sending side emit a packet with the MULTIFD_FLUSH flag.
>> 
>>   With fixed-ram we want to maintain the sync on the sending side
>>   because that provides ordering between the rounds of dirty pages when
>>   migrating live.
>> 
>> MULTIFD_FLUSH
>> -
>>   On the receiving side, the presence of the MULTIFD_FLUSH flag on a
>>   packet causes the receiving channels to start synchronizing with the
>>   main thread.
>> 
>>   We're not using packets with fixed-ram, so there's no MULTIFD_FLUSH
>>   flag and therefore no channel sync on the receiving side.
>> 
>> multifd_recv_sync_main()
>> 
>>   Issued by the migration thread when the ram migration flag
>>   RAM_SAVE_FLAG_MULTIFD_FLUSH is received, causes the migration thread
>>   on the receiving side to start synchronizing with the recv
>>   channels. Due to compatibility, this is also issued when
>>   RAM_SAVE_FLAG_EOS is received.
>> 
>>   For fixed-ram we only need to synchronize the channels at the end of
>>   migration to avoid doing cleanup before the channels have finished
>>   their IO.
>> 
>> Make sure the multifd syncs are only issued at the appropriate
>> times. Note that due to pre-existing backward compatibility issues, we
>> have the multifd_flush_after_each_section property that enables an
>> older behavior of synchronizing channels more frequently (and
>> inefficiently). Fixed-ram should always run with that property
>> disabled (default).
>
> What if the user enables multifd_flush_after_each_section=true?
>
> IMHO we don't necessarily need to attach the fixed-ram loading flush to any
> flag in the stream.  For fixed-ram IIUC all the loads will happen in one
> shot of ram_load() anyway when parsing the ramblock list, so.. how about we
> decouple the fixed-ram load flush from the stream by always do a sync in
> ram_load() unconditionally?

I would like to. But it's not possible because ram_load() is called once
per section. So once for each EOS flag on the stream. We'll have at
least two calls to ram_load(), once due to qemu_savevm_state_iterate()
and another due to qemu_savevm_state_complete_precopy().

The fact that fixed-ram can use just one load doesn't change the fact
that we perform more than one "save". So we'll need to use the FLUSH
flag in this case unfortunately.

>
> @@ -4368,6 +4367,15 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
>  ret = ram_load_precopy(f);
>  }
>  }
> +
> +/*
> + * Fixed-ram migration may queue load tasks to multifd threads; make
> + * sure they're all done.
> + */
> +if (migrate_fixed_ram() && migrate_multifd()) {
> +multifd_recv_sync_main();
> +}
> +
>  trace_ram_load_complete(ret, seq_iter);
>  
>  return ret;
>
> Then ram_load() always guarantees synchronous loading of pages, and
> fixed-ram will completely ignore multifd flushes (then we also skip it for
> the ram_save_complete() like what this patch does for the rest).
>
>> 
>> Signed-off-by: Fabiano Rosas 
>> ---
>>  migration/ram.c | 19 ---
>>  1 file changed, 16 insertions(+), 3 deletions(-)
>> 
>> diff --git a/migration/ram.c b/migration/ram.c
>> index 5932e1b8e1..c7050f6f68 100644
>> --- a/migration/ram.c
>> +++ b/migration/ram.c
>> @@ -1369,8 +1369,11 @@ static int find_dirty_block(RAMState *rs, PageSearchStatus *pss)
>>  if (ret < 0) {
>>  return ret;
>>  }
>> -qemu_put_be64(f, RAM_SAVE_FLAG_MULTIFD_FLUSH);
>> -qemu_fflush(f);
>> +
>> +if (!migrate_fixed_ram()) {
>> +qemu_put_be64(f, RAM_SAVE_FLAG_MULTIFD_FLUSH);
>> +qemu_fflush(f);
>> +}
>>  }
>>  /*
>>   * If memory migration starts over, we will meet a dirtied page
>> @@ -3112,7 +3115,8 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
>>  return ret;
>>  }
>>  
>> -if (migrate_multifd() && !migrate_multifd_flush_after_each_section()) {
>> +if (migrate_multifd() && !migrate_multifd_flush_after_each_section()
>> +&& !migrate_fixed_ram()) {
>>  qemu_put_be64(f, RAM_SAVE_FLAG_MULTIFD_FLUSH);
>>  }
>>  
>> @@ -4253,6 +4257,15 @@ static int ram_load_precopy(QEMUFile *f)
>>  break;
>>  case RAM_SAVE_FLAG_EOS:
>>  /* normal exit */
>> +if 

Re: [PATCH v2 10/15] hw/southbridge/ich9: Add the DMI-to-PCI bridge

2024-02-26 Thread Bernhard Beschow



On 26 February 2024 11:14:09 UTC, "Philippe Mathieu-Daudé" wrote:
>Instantiate TYPE_ICH_DMI_PCI_BRIDGE in TYPE_ICH9_SOUTHBRIDGE.
>
>Since the Q35 machine doesn't use it, add the 'd2p-enabled'
>property to disable it.
>
>Signed-off-by: Philippe Mathieu-Daudé 
>---
> include/hw/southbridge/ich9.h |  9 -
> hw/i386/pc_q35.c  |  1 +
> hw/southbridge/ich9.c | 27 +++
> hw/southbridge/Kconfig|  1 +
> 4 files changed, 29 insertions(+), 9 deletions(-)
>
>diff --git a/include/hw/southbridge/ich9.h b/include/hw/southbridge/ich9.h
>index 162ae3baa1..b9122d299d 100644
>--- a/include/hw/southbridge/ich9.h
>+++ b/include/hw/southbridge/ich9.h
>@@ -108,15 +108,6 @@ struct ICH9LPCState {
> #define ICH9_USB_UHCI1_DEV  29
> #define ICH9_USB_UHCI1_FUNC 0
> 
>-/* D30:F0 DMI-to-PCI bridge */
>-#define ICH9_D2P_BRIDGE "ICH9 D2P BRIDGE"
>-#define ICH9_D2P_BRIDGE_SAVEVM_VERSION  0
>-
>-#define ICH9_D2P_BRIDGE_DEV 30
>-#define ICH9_D2P_BRIDGE_FUNC0
>-
>-#define ICH9_D2P_SECONDARY_DEFAULT  (256 - 8)
>-
> /* D31:F0 LPC Processor Interface */
> #define ICH9_RST_CNT_IOPORT 0xCF9
> 
>diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
>index 8c8a2f65b8..f951cf1e3a 100644
>--- a/hw/i386/pc_q35.c
>+++ b/hw/i386/pc_q35.c
>@@ -226,6 +226,7 @@ static void pc_q35_init(MachineState *machine)
> object_property_add_child(OBJECT(machine), "ich9", OBJECT(ich9));
> object_property_set_link(OBJECT(ich9), "mch-pcie-bus",
>  OBJECT(pcms->pcibus), &error_abort);
>+qdev_prop_set_bit(ich9, "d2p-enabled", false);
> qdev_realize_and_unref(ich9, NULL, &error_fatal);
> 
> /* create ISA bus */
>diff --git a/hw/southbridge/ich9.c b/hw/southbridge/ich9.c
>index f3a9b932ab..8c4356ff74 100644
>--- a/hw/southbridge/ich9.c
>+++ b/hw/southbridge/ich9.c
>@@ -12,16 +12,23 @@
> #include "hw/qdev-properties.h"
> #include "hw/southbridge/ich9.h"
> #include "hw/pci/pci.h"
>+#include "hw/pci-bridge/ich9_dmi.h"
>+
>+#define ICH9_D2P_DEVFN  PCI_DEVFN(30, 0)

Something along the lines of ICH9_DMI_PCI_DEVFN seems more clear to me.

> 
> struct ICH9State {
> DeviceState parent_obj;
> 
>+I82801b11Bridge d2p;

Same here and essentially all identifiers and properties with "d2p" in their 
name.

Best regards,
Bernhard

>+
> PCIBus *pci_bus;
>+bool d2p_enabled;
> };
> 
> static Property ich9_props[] = {
> DEFINE_PROP_LINK("mch-pcie-bus", ICH9State, pci_bus,
>  TYPE_PCIE_BUS, PCIBus *),
>+DEFINE_PROP_BOOL("d2p-enabled", ICH9State, d2p_enabled, true),
> DEFINE_PROP_END_OF_LIST(),
> };
> 
>@@ -29,6 +36,22 @@ static void ich9_init(Object *obj)
> {
> }
> 
>+static bool ich9_realize_d2p(ICH9State *s, Error **errp)
>+{
>+if (!module_object_class_by_name(TYPE_ICH_DMI_PCI_BRIDGE)) {
>+error_setg(errp, "DMI-to-PCI function not available in this build");
>+return false;
>+}
>+object_initialize_child(OBJECT(s), "d2p", &s->d2p, TYPE_ICH_DMI_PCI_BRIDGE);
>+qdev_prop_set_int32(DEVICE(&s->d2p), "addr", ICH9_D2P_DEVFN);
>+if (!qdev_realize(DEVICE(&s->d2p), BUS(s->pci_bus), errp)) {
>+return false;
>+}
>+object_property_add_alias(OBJECT(s), "pci.0", OBJECT(&s->d2p), "pci.0");
>+
>+return true;
>+}
>+
> static void ich9_realize(DeviceState *dev, Error **errp)
> {
> ICH9State *s = ICH9_SOUTHBRIDGE(dev);
>@@ -37,6 +60,10 @@ static void ich9_realize(DeviceState *dev, Error **errp)
> error_setg(errp, "'pcie-bus' property must be set");
> return;
> }
>+
>+if (s->d2p_enabled && !ich9_realize_d2p(s, errp)) {
>+return;
>+}
> }
> 
> static void ich9_class_init(ObjectClass *klass, void *data)
>diff --git a/hw/southbridge/Kconfig b/hw/southbridge/Kconfig
>index 852b7f346f..db7259bf6f 100644
>--- a/hw/southbridge/Kconfig
>+++ b/hw/southbridge/Kconfig
>@@ -3,3 +3,4 @@
> config ICH9
> bool
> depends on PCI_EXPRESS
>+imply I82801B11



[RFC PATCH] tests/vm: update openbsd image to 7.4

2024-02-26 Thread Alex Bennée
The old links are dead so even if we have the ISO cached we can't
finish the install. Update to the current stable and tweak the install
strings.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2192
Signed-off-by: Alex Bennée 
---
 tests/vm/openbsd | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/tests/vm/openbsd b/tests/vm/openbsd
index 85c5bb3536c..85c98636332 100755
--- a/tests/vm/openbsd
+++ b/tests/vm/openbsd
@@ -22,8 +22,8 @@ class OpenBSDVM(basevm.BaseVM):
 name = "openbsd"
 arch = "x86_64"
 
-link = "https://cdn.openbsd.org/pub/OpenBSD/7.2/amd64/install72.iso"
-csum = "0369ef40a3329efcb978c578c7fdc7bda71e502aecec930a74b44160928c91d3"
+link = "https://cdn.openbsd.org/pub/OpenBSD/7.4/amd64/install74.iso"
+csum = "a1001736ed9fe2307965b5fcdb426ae11f9b80d26eb21e404a705144a0a224a0"
 size = "20G"
 pkgs = [
 # tools
@@ -99,10 +99,10 @@ class OpenBSDVM(basevm.BaseVM):
 self.console_wait_send("(I)nstall",   "i\n")
 self.console_wait_send("Terminal type",   "xterm\n")
 self.console_wait_send("System hostname", "openbsd\n")
-self.console_wait_send("Which network interface", "vio0\n")
+self.console_wait_send("Network interface to configure", "vio0\n")
 self.console_wait_send("IPv4 address","autoconf\n")
 self.console_wait_send("IPv6 address","none\n")
-self.console_wait_send("Which network interface", "done\n")
+self.console_wait_send("Network interface to configure", "done\n")
 self.console_wait("Password for root account")
 self.console_send("%s\n" % self._config["root_pass"])
 self.console_wait("Password for root account")
@@ -124,6 +124,7 @@ class OpenBSDVM(basevm.BaseVM):
 self.console_wait_send("Allow root ssh login","yes\n")
 self.console_wait_send("timezone","UTC\n")
 self.console_wait_send("root disk",   "\n")
+self.console_wait_send("Encrypt the root disk with a passphrase", "no\n")
 self.console_wait_send("(W)hole disk","\n")
 self.console_wait_send("(A)uto layout",   "c\n")
 
-- 
2.39.2




Re: [PATCH v2 00/15] hw/southbridge: Extract ICH9 QOM container model

2024-02-26 Thread Bernhard Beschow



On 26 February 2024 11:13:59 UTC, "Philippe Mathieu-Daudé" wrote:
>Since v1 [1]:
>- Rebased on top of Bernhard patches
>- Rename files with 'ich9_' prefix (Bernhard)
>
>Hi,
>
>I have some long-standing southbridge QOM rework branches. Since
>Bernhard is actively working on the PIIX, I'll try to refresh
>and post. This is also motivated by the Dynamic Machine work
>where we are trying to figure out the ideal DSL for QEMU, so having
>well-designed complex models helps.
>
>Here we introduce the ICH9 'southbridge' as a QOM container.
>Since the chipset comes as a whole, we shouldn't instantiate
>its components separately. However in order to maintain old
>code we expose some properties to configure the container and
>not introduce any change for the Q35 machine. There is no
>migration change, only QOM objects moved around.

I really like the simplicity of the machine code and that the ICH9 southbridge 
becomes a proper device rather than being scattered around in machine code. 
I've put some review comments in the form of a branch: 
https://github.com/shentok/qemu/commits/philmd/ich9_qom-v2/

>
>More work remains in the LPC function (more code to remove from
>Q35). Maybe worth doing in parallel with the PIIX to clean both
>PC machines.

Would be nice if the pattern could then also be applied to the VIA 
southbridges, otherwise this could break my via-apollo-pro-133t branch: 
https://github.com/shentok/qemu/tree/via-apollo-pro-133t

Best regards,
Bernhard

>
>Also we'd need to decouple the cpu_interrupt() calls between hw/
>and target/.
>
>Note that GSI is currently broken [2]. Once the LPC/ISA part is
>done, it might be easier to fix it.
>
>[1] https://lore.kernel.org/qemu-devel/20240219163855.87326-1-phi...@linaro.org/
>[2] https://lore.kernel.org/qemu-devel/cd0e13c6-c03d-411f-83a5-1d4d28ea4...@linaro.org/
>
>Philippe Mathieu-Daudé (15):
>  MAINTAINERS: Add 'ICH9 South Bridge' section
>  hw/i386/q35: Add local 'lpc_obj' variable
>  hw/acpi/ich9: Restrict definitions from 'hw/southbridge/ich9.h'
>  hw/acpi/ich9_tco: Include 'ich9' in names
>  hw/acpi/ich9_tco: Restrict ich9_generate_smi() declaration
>  hw/ide: Rename ich.c -> ich9_ahci.c
>  hw/i2c/smbus: Extract QOM ICH9 definitions to 'ich9_smbus.h'
>  hw/pci-bridge: Extract QOM ICH definitions to 'ich9_dmi.h'
>  hw/southbridge/ich9: Introduce TYPE_ICH9_SOUTHBRIDGE stub
>  hw/southbridge/ich9: Add the DMI-to-PCI bridge
>  hw/southbridge/ich9: Add a AHCI function
>  hw/southbridge/ich9: Add the SMBus function
>  hw/southbridge/ich9: Add the USB EHCI/UHCI functions
>  hw/southbridge/ich9: Extract LPC definitions to 'hw/isa/ich9_lpc.h'
>  hw/southbridge/ich9: Add the LPC / ISA bridge function
>
> MAINTAINERS   |  21 +-
> include/hw/acpi/ich9.h|  15 ++
> include/hw/acpi/ich9_tco.h|   6 +-
> include/hw/i2c/ich9_smbus.h   |  25 +++
> include/hw/isa/ich9_lpc.h | 166 +++
> include/hw/pci-bridge/ich9_dmi.h  |  20 ++
> include/hw/southbridge/ich9.h | 235 +-
> hw/acpi/ich9.c|   9 +-
> hw/acpi/ich9_tco.c|   5 +-
> hw/i2c/{smbus_ich9.c => ich9_smbus.c} |  36 +++-
> hw/i386/acpi-build.c  |   1 +
> hw/i386/pc_q35.c  | 126 +++-
> hw/ide/{ich.c => ich9_ahci.c} |   0
> hw/isa/{lpc_ich9.c => ich9_lpc.c} |  37 +++-
> hw/pci-bridge/{i82801b11.c => ich9_dmi.c} |  11 +-
> hw/southbridge/ich9.c | 213 
> tests/qtest/tco-test.c|   2 +-
> hw/Kconfig|   1 +
> hw/i2c/meson.build|   2 +-
> hw/i386/Kconfig   |   3 +-
> hw/ide/meson.build|   2 +-
> hw/isa/meson.build|   2 +-
> hw/meson.build|   1 +
> hw/pci-bridge/meson.build |   2 +-
> hw/southbridge/Kconfig|  11 +
> hw/southbridge/meson.build|   3 +
> 26 files changed, 587 insertions(+), 368 deletions(-)
> create mode 100644 include/hw/i2c/ich9_smbus.h
> create mode 100644 include/hw/isa/ich9_lpc.h
> create mode 100644 include/hw/pci-bridge/ich9_dmi.h
> rename hw/i2c/{smbus_ich9.c => ich9_smbus.c} (77%)
> rename hw/ide/{ich.c => ich9_ahci.c} (100%)
> rename hw/isa/{lpc_ich9.c => ich9_lpc.c} (95%)
> rename hw/pci-bridge/{i82801b11.c => ich9_dmi.c} (95%)
> create mode 100644 hw/southbridge/ich9.c
> create mode 100644 hw/southbridge/Kconfig
> create mode 100644 hw/southbridge/meson.build
>



[PATCH V2] migration: export fewer options

2024-02-26 Thread Steve Sistare
A small number of migration options are accessed by migration clients,
but to see them clients must include all of options.h, which is mostly
for migration core code.  migrate_mode() in particular will be needed by
multiple clients.

Refactor the option declarations so clients can see the necessary few via
misc.h, which already exports a portion of the client API.

Signed-off-by: Steve Sistare 
---
Changes in V2:
  * renamed options-pub.h to client-options.h
---
---
 hw/vfio/migration.c|  1 -
 hw/virtio/virtio-balloon.c |  1 -
 include/migration/client-options.h | 24 
 include/migration/misc.h   |  1 +
 migration/options.h|  6 +-
 system/dirtylimit.c|  1 -
 6 files changed, 26 insertions(+), 8 deletions(-)
 create mode 100644 include/migration/client-options.h

diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
index 50140ed..5d4a23c 100644
--- a/hw/vfio/migration.c
+++ b/hw/vfio/migration.c
@@ -18,7 +18,6 @@
 #include "sysemu/runstate.h"
 #include "hw/vfio/vfio-common.h"
 #include "migration/migration.h"
-#include "migration/options.h"
 #include "migration/savevm.h"
 #include "migration/vmstate.h"
 #include "migration/qemu-file.h"
diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
index 89f853f..a59ff17 100644
--- a/hw/virtio/virtio-balloon.c
+++ b/hw/virtio/virtio-balloon.c
@@ -32,7 +32,6 @@
 #include "qemu/error-report.h"
 #include "migration/misc.h"
 #include "migration/migration.h"
-#include "migration/options.h"
 
 #include "hw/virtio/virtio-bus.h"
 #include "hw/virtio/virtio-access.h"
diff --git a/include/migration/client-options.h b/include/migration/client-options.h
new file mode 100644
index 000..887fea1
--- /dev/null
+++ b/include/migration/client-options.h
@@ -0,0 +1,24 @@
+/*
+ * QEMU public migration capabilities
+ *
+ * Copyright (c) 2012-2023 Red Hat Inc
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#ifndef QEMU_MIGRATION_CLIENT_OPTIONS_H
+#define QEMU_MIGRATION_CLIENT_OPTIONS_H
+
+/* capabilities */
+
+bool migrate_background_snapshot(void);
+bool migrate_dirty_limit(void);
+bool migrate_postcopy_ram(void);
+bool migrate_switchover_ack(void);
+
+/* parameters */
+
+MigMode migrate_mode(void);
+
+#endif
diff --git a/include/migration/misc.h b/include/migration/misc.h
index 5d1aa59..4c226a4 100644
--- a/include/migration/misc.h
+++ b/include/migration/misc.h
@@ -17,6 +17,7 @@
 #include "qemu/notify.h"
 #include "qapi/qapi-types-migration.h"
 #include "qapi/qapi-types-net.h"
+#include "migration/client-options.h"
 
 /* migration/ram.c */
 
diff --git a/migration/options.h b/migration/options.h
index 246c160..964ebdd 100644
--- a/migration/options.h
+++ b/migration/options.h
@@ -16,6 +16,7 @@
 
 #include "hw/qdev-properties.h"
 #include "hw/qdev-properties-system.h"
+#include "migration/client-options.h"
 
 /* migration properties */
 
@@ -24,12 +25,10 @@ extern Property migration_properties[];
 /* capabilities */
 
 bool migrate_auto_converge(void);
-bool migrate_background_snapshot(void);
 bool migrate_block(void);
 bool migrate_colo(void);
 bool migrate_compress(void);
 bool migrate_dirty_bitmaps(void);
-bool migrate_dirty_limit(void);
 bool migrate_events(void);
 bool migrate_ignore_shared(void);
 bool migrate_late_block_activate(void);
@@ -37,11 +36,9 @@ bool migrate_multifd(void);
 bool migrate_pause_before_switchover(void);
 bool migrate_postcopy_blocktime(void);
 bool migrate_postcopy_preempt(void);
-bool migrate_postcopy_ram(void);
 bool migrate_rdma_pin_all(void);
 bool migrate_release_ram(void);
 bool migrate_return_path(void);
-bool migrate_switchover_ack(void);
 bool migrate_validate_uuid(void);
 bool migrate_xbzrle(void);
 bool migrate_zero_blocks(void);
@@ -83,7 +80,6 @@ uint8_t migrate_max_cpu_throttle(void);
 uint64_t migrate_max_bandwidth(void);
 uint64_t migrate_avail_switchover_bandwidth(void);
 uint64_t migrate_max_postcopy_bandwidth(void);
-MigMode migrate_mode(void);
 int migrate_multifd_channels(void);
 MultiFDCompression migrate_multifd_compression(void);
 int migrate_multifd_zlib_level(void);
diff --git a/system/dirtylimit.c b/system/dirtylimit.c
index b5607eb..774ff44 100644
--- a/system/dirtylimit.c
+++ b/system/dirtylimit.c
@@ -26,7 +26,6 @@
 #include "trace.h"
 #include "migration/misc.h"
 #include "migration/migration.h"
-#include "migration/options.h"
 
 /*
  * Dirtylimit stop working if dirty page rate error
-- 
1.8.3.1




Re: [PATCH V4 00/14] allow cpr-reboot for vfio

2024-02-26 Thread Steven Sistare
On 2/26/2024 3:21 PM, Steven Sistare wrote:
> On 2/26/2024 4:01 AM, Peter Xu wrote:
>> On Mon, Feb 26, 2024 at 09:49:46AM +0100, Cédric Le Goater wrote:
>>> Go ahead. It will help me for the changes I am doing on error reporting
>>> for VFIO migration. I will rebase on top.
>>
>> Thanks for confirming.  I queued the migration patches then, but leave the
>> two vfio one for further discussion.
> 
> Very good, thanks.  I am always happy to move the ball a few yards closer to
> the goal line :)

Peter, beware that patch 3 needs an edit before being queued.
This hunk snuck in and should be deleted:

[PATCH V4 03/14] migration: convert to NotifierWithReturn
diff --git a/roms/seabios-hppa b/roms/seabios-hppa
index 03774ed..e4eac85 16
--- a/roms/seabios-hppa
+++ b/roms/seabios-hppa
@@ -1 +1 @@
-Subproject commit 03774edaad3bfae090ac96ca5450353c641637d1
+Subproject commit e4eac85880e8677f96d8b9e94de9f2eec9c0751f


- Steve



[PATCH 0/2] Revert "hw/i386/pc: Confine system flash handling to pc_sysfw"

2024-02-26 Thread Bernhard Beschow
As reported by Volker [1], commit 6f6ad2b24582 "hw/i386/pc: Confine system
flash handling to pc_sysfw" causes a regression when specifying the property
`-M pflash0` in the PCI PC machines:
  qemu-system-x86_64: Property 'pc-q35-9.0-machine.pflash0' not found
Revert the commit for now until a solution is found.

Best regards,
Bernhard

[1] https://lore.kernel.org/qemu-devel/9e82a04b-f2c1-4e34-b4b6-46a0581b5...@t-online.de/

Bernhard Beschow (2):
  Revert "hw/i386/pc_sysfw: Inline pc_system_flash_create() and remove
it"
  Revert "hw/i386/pc: Confine system flash handling to pc_sysfw"

 include/hw/i386/pc.h |  2 ++
 hw/i386/pc.c |  1 +
 hw/i386/pc_piix.c|  1 +
 hw/i386/pc_sysfw.c   | 17 +
 4 files changed, 17 insertions(+), 4 deletions(-)

-- 
2.44.0




[PATCH 2/2] Revert "hw/i386/pc: Confine system flash handling to pc_sysfw"

2024-02-26 Thread Bernhard Beschow
Specifying the property `-M pflash0` results in a regression:
  qemu-system-x86_64: Property 'pc-q35-9.0-machine.pflash0' not found
Revert the change for now until a solution is found.

This reverts commit 6f6ad2b24582593d8feb00434ce2396840666227.

Reported-by: Volker Rümelin 
Signed-off-by: Bernhard Beschow 
---
 include/hw/i386/pc.h | 2 ++
 hw/i386/pc.c | 1 +
 hw/i386/pc_piix.c| 1 +
 hw/i386/pc_sysfw.c   | 6 ++
 4 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index e88468131a..0f9c1a45fc 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -191,6 +191,8 @@ void pc_i8259_create(ISABus *isa_bus, qemu_irq *i8259_irqs);
 #define TYPE_PORT92 "port92"
 
 /* pc_sysfw.c */
+void pc_system_flash_create(PCMachineState *pcms);
+void pc_system_flash_cleanup_unused(PCMachineState *pcms);
 void pc_system_firmware_init(PCMachineState *pcms, MemoryRegion *rom_memory);
 bool pc_system_ovmf_table_find(const char *entry, uint8_t **data,
int *data_len);
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index f8eb684a49..2ad8de5097 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -1733,6 +1733,7 @@ static void pc_machine_initfn(Object *obj)
 #endif
 pcms->default_bus_bypass_iommu = false;
 
+pc_system_flash_create(pcms);
 pcms->pcspk = isa_new(TYPE_PC_SPEAKER);
 object_property_add_alias(OBJECT(pcms), "pcspk-audiodev",
   OBJECT(pcms->pcspk), "audiodev");
diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index ec7c07b362..34203927e1 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -231,6 +231,7 @@ static void pc_init1(MachineState *machine,
 assert(machine->ram_size == x86ms->below_4g_mem_size +
 x86ms->above_4g_mem_size);
 
+pc_system_flash_cleanup_unused(pcms);
 if (machine->kernel_filename != NULL) {
 /* For xen HVM direct kernel boot, load linux here */
 xen_load_linux(pcms);
diff --git a/hw/i386/pc_sysfw.c b/hw/i386/pc_sysfw.c
index b9c1eb352d..3efabbbab2 100644
--- a/hw/i386/pc_sysfw.c
+++ b/hw/i386/pc_sysfw.c
@@ -91,7 +91,7 @@ static PFlashCFI01 *pc_pflash_create(PCMachineState *pcms,
 return PFLASH_CFI01(dev);
 }
 
-static void pc_system_flash_create(PCMachineState *pcms)
+void pc_system_flash_create(PCMachineState *pcms)
 {
 PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(pcms);
 
@@ -103,7 +103,7 @@ static void pc_system_flash_create(PCMachineState *pcms)
 }
 }
 
-static void pc_system_flash_cleanup_unused(PCMachineState *pcms)
+void pc_system_flash_cleanup_unused(PCMachineState *pcms)
 {
 char *prop_name;
 int i;
@@ -210,8 +210,6 @@ void pc_system_firmware_init(PCMachineState *pcms,
 return;
 }
 
-pc_system_flash_create(pcms);
-
 /* Map legacy -drive if=pflash to machine properties */
 for (i = 0; i < ARRAY_SIZE(pcms->flash); i++) {
 pflash_cfi01_legacy_drive(pcms->flash[i],
-- 
2.44.0




[PATCH 1/2] Revert "hw/i386/pc_sysfw: Inline pc_system_flash_create() and remove it"

2024-02-26 Thread Bernhard Beschow
Commit 6f6ad2b24582 "hw/i386/pc: Confine system flash handling to pc_sysfw"
causes a regression when specifying the property `-M pflash0` in the PCI PC
machines:
  qemu-system-x86_64: Property 'pc-q35-9.0-machine.pflash0' not found
In order to revert the commit, the commit below must be reverted first.

This reverts commit cb05cc16029bb0a61ac5279ab7b3b90dcf2aa69f.

Signed-off-by: Bernhard Beschow 
---
 hw/i386/pc_sysfw.c | 15 +--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/hw/i386/pc_sysfw.c b/hw/i386/pc_sysfw.c
index b02e285579..b9c1eb352d 100644
--- a/hw/i386/pc_sysfw.c
+++ b/hw/i386/pc_sysfw.c
@@ -91,6 +91,18 @@ static PFlashCFI01 *pc_pflash_create(PCMachineState *pcms,
 return PFLASH_CFI01(dev);
 }
 
+static void pc_system_flash_create(PCMachineState *pcms)
+{
+PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(pcms);
+
+if (pcmc->pci_enabled) {
+pcms->flash[0] = pc_pflash_create(pcms, "system.flash0",
+  "pflash0");
+pcms->flash[1] = pc_pflash_create(pcms, "system.flash1",
+  "pflash1");
+}
+}
+
 static void pc_system_flash_cleanup_unused(PCMachineState *pcms)
 {
 char *prop_name;
@@ -198,8 +210,7 @@ void pc_system_firmware_init(PCMachineState *pcms,
 return;
 }
 
-pcms->flash[0] = pc_pflash_create(pcms, "system.flash0", "pflash0");
-pcms->flash[1] = pc_pflash_create(pcms, "system.flash1", "pflash1");
+pc_system_flash_create(pcms);
 
 /* Map legacy -drive if=pflash to machine properties */
 for (i = 0; i < ARRAY_SIZE(pcms->flash); i++) {
-- 
2.44.0




Re: [PATCH v5 1/8] Implement STM32L4x5_RCC skeleton

2024-02-26 Thread Peter Maydell
On Mon, 26 Feb 2024 at 19:21, Arnaud Minier
 wrote:
> > From: "Peter Maydell"
> >> +static const MemoryRegionOps stm32l4x5_rcc_ops = {
> >> +.read = stm32l4x5_rcc_read,
> >> +.write = stm32l4x5_rcc_write,
> >> +.endianness = DEVICE_NATIVE_ENDIAN,
> >> +.valid = {
> >> +.max_access_size = 4,
> >> +.unaligned = false
> >> +},
> >> +};
> >
> > What's the .valid.min_access_size ?
> > Do we need to set the .impl max/min access size here too ?
>
> I honestly don't really understand the differences between .valid and .impl.
> However, since all the code assumes that 4-byte accesses are made,
> I think we can set all these values to 4 for now.

.valid is "what is the guest allowed to do?". If the guest tries
something not permitted by the .valid settings, it gets a transaction
failure instead, which then becomes a bus fault or whatever the
architectural equivalent is.

.impl is "what does my device code implement?". If the guest tries
something that is permitted by .valid but not within the bounds
specified by .impl, the core memory subsystem will try to synthesise
it (eg handling a guest word write by doing 4 byte writes to the
device write function, if a 4-byte write is permitted by .valid
but .impl only permits 1 byte writes).
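
To make the split concrete, here is a toy model of that dispatch. It is
not QEMU's MemoryRegionOps or its real access helpers: the struct, the
function names and the little-endian chunk order are all assumptions made
for illustration, and the read-modify-write case (access smaller than
.impl's minimum) is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for the two size ranges of a memory region. */
struct toy_ops {
    unsigned valid_min, valid_max;  /* what the guest is allowed to do */
    unsigned impl_min, impl_max;    /* what the device callback handles */
    void (*write)(void *opaque, uint64_t addr, uint64_t val, unsigned size);
};

/* Returns false for a transaction failure, true once the write is done. */
static bool toy_dispatch_write(const struct toy_ops *ops, void *opaque,
                               uint64_t addr, uint64_t val, unsigned size)
{
    if (size < ops->valid_min || size > ops->valid_max) {
        return false;               /* rejected by .valid */
    }
    assert(size >= ops->impl_min);  /* smaller accesses are not modelled */

    if (size <= ops->impl_max) {
        ops->write(opaque, addr, val, size);
        return true;
    }

    /* Synthesise the access from impl_max-sized pieces, low bytes first. */
    uint64_t mask = (1ULL << (ops->impl_max * 8)) - 1;
    for (unsigned off = 0; off < size; off += ops->impl_max) {
        ops->write(opaque, addr + off, (val >> (off * 8)) & mask,
                   ops->impl_max);
    }
    return true;
}

/* Example device that only implements byte writes. */
static uint8_t toy_regs[8];
static unsigned toy_calls;

static void toy_byte_write(void *opaque, uint64_t addr, uint64_t val,
                           unsigned size)
{
    (void)opaque;
    (void)size;
    toy_regs[addr] = (uint8_t)val;
    toy_calls++;
}
```

With .valid = 1..4 and .impl = 1..1, a 4-byte guest write reaches
toy_byte_write() four times, while an 8-byte write is refused outright.
For the RCC device discussed here, setting both ranges to 4 means neither
case arises.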

thanks
-- PMM



Re: [PATCH v5 6/8] Add write protections to CR register

2024-02-26 Thread Arnaud Minier
Thank you for the review and for the tips! It really helps.
I will address the problems you have highlighted and will send a new version
later this week.

Arnaud

- Original Message -
> From: "Peter Maydell" 
> To: "Arnaud Minier" 
> Cc: "qemu-devel" , "Thomas Huth" , 
> "Laurent Vivier" , "Inès
> Varhol" , "Samuel Tardieu" 
> , "qemu-arm"
> , "Alistair Francis" , "Paolo 
> Bonzini" 
> Sent: Friday, February 23, 2024 3:59:03 PM
> Subject: Re: [PATCH v5 6/8] Add write protections to CR register

> On Mon, 19 Feb 2024 at 20:16, Arnaud Minier
>  wrote:
>>
>> Add write protections for the fields in the CR register.
>> PLL configuration write protections (among others) have not
>> been handled yet. This is planned in a future patch set.
> 
> Can you make sure you include a suitable prefix (eg
> "hw/misc/stm32l4x5_rcc: ") at the front of patch subjects, please?

Sorry. This will be done for the next version.

> 
>>
>> Signed-off-by: Arnaud Minier 
>> Signed-off-by: Inès Varhol 
>> ---
>>  hw/misc/stm32l4x5_rcc.c | 164 
>>  1 file changed, 114 insertions(+), 50 deletions(-)
>>
>> diff --git a/hw/misc/stm32l4x5_rcc.c b/hw/misc/stm32l4x5_rcc.c
>> index a3b192e61b..198c6238b6 100644
>> --- a/hw/misc/stm32l4x5_rcc.c
>> +++ b/hw/misc/stm32l4x5_rcc.c
>> @@ -346,9 +346,47 @@ static void rcc_update_irq(Stm32l4x5RccState *s)
>>  }
>>  }
>>
>> -static void rcc_update_cr_register(Stm32l4x5RccState *s)
>> +static void rcc_update_msi(Stm32l4x5RccState *s, uint32_t previous_value)
>> +{
>> +uint32_t val;
>> +
>> +static const uint32_t msirange[] = {
>> +10, 20, 40, 80, 100, 200,
>> +400, 800, 1600, 2400, 3200, 4800
>> +};
>> +/* MSIRANGE and MSIRGSEL */
>> +val = extract32(s->cr, R_CR_MSIRGSEL_SHIFT, R_CR_MSIRGSEL_LENGTH);
> 
> registerfields.h provides macros for "extract a named field", so you
> can write this
>val = FIELD_EX32(s->cr, CR, MSIRGSEL);

That seems really convenient! I will use them.
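
For readers unfamiliar with the named-field helpers mentioned above, the idea can be modeled as a shift and a mask driven by a field table. This is only a Python sketch of the concept, not the registerfields.h implementation; the FIELDS table and the bit positions in it are assumptions chosen for illustration.

```python
def extract32(value, start, length):
    """Return `length` bits of a 32-bit value, starting at bit `start`."""
    assert 0 <= start < 32 and 0 < length <= 32 - start
    return (value >> start) & ((1 << length) - 1)


# Hypothetical (shift, length) table standing in for per-field definitions.
FIELDS = {
    ("CR", "MSIRGSEL"): (3, 1),
    ("CR", "MSIRANGE"): (4, 4),
}


def field_ex32(value, reg, field):
    """Extract a named field, so callers never spell out shift and length."""
    shift, length = FIELDS[(reg, field)]
    return extract32(value, shift, length)
```

The benefit is that the call site names the register and the field only once, which avoids mixing up the shift of one field with the length of another.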

> 
>> +if (val) {
>> +/* MSIRGSEL is set, use the MSIRANGE field */
>> +val = extract32(s->cr, R_CR_MSIRANGE_SHIFT, R_CR_MSIRANGE_LENGTH);
> 
> and these as val = FIELD_EX32(s->cr, CR, MSIRANGE)
> and so on.
> 
>> +} else {
>> +/* MSIRGSEL is not set, use the MSISRANGE field */
>> +val = extract32(s->csr, R_CSR_MSISRANGE_SHIFT, 
>> R_CSR_MSISRANGE_LENGTH);
>> +}
>> +
>> +if (val < ARRAY_SIZE(msirange)) {
>> +clock_update_hz(s->msi_rc, msirange[val]);
>> +} else {
>> +/*
>> + * There is a hardware write protection if the value is out of 
>> bound.
>> + * Restore the previous value.
>> + */
>> +s->cr = (s->cr & ~R_CSR_MSISRANGE_MASK) |
>> +(previous_value & R_CSR_MSISRANGE_MASK);
>> +}
>> +}
>> +
> 
>> -/* HSEON and update HSERDY */
>> +/*
>> + * HSEON and update HSERDY.
>> + * HSEON cannot be reset if the HSE oscillator is used directly or
>> + * indirectly as the system clock.
>> + */
>>  val = extract32(s->cr, R_CR_HSEON_SHIFT, R_CR_HSEON_LENGTH);
>> -s->cr = (s->cr & ~R_CR_HSERDY_MASK) |
>> -(val << R_CR_HSERDY_SHIFT);
>> -if (val) {
>> -clock_update_hz(s->hse, s->hse_frequency);
>> -if (s->cier & R_CIER_HSERDYIE_MASK) {
>> -s->cifr |= R_CIFR_HSERDYF_MASK;
>> +if (extract32(s->cfgr, R_CFGR_SWS_SHIFT, R_CFGR_SWS_LENGTH) != 0b10 &&
>> +current_pll_src != RCC_CLOCK_MUX_SRC_HSE) {
>> +s->cr = (s->cr & ~R_CR_HSERDY_MASK) |
>> +(val << R_CR_HSERDY_SHIFT);
>> +if (val) {
>> +clock_update_hz(s->hse, s->hse_frequency);
>> +if (s->cier & R_CIER_HSERDYIE_MASK) {
>> +s->cifr |= R_CIFR_HSERDYF_MASK;
>> +}
>> +} else {
>> +clock_update_hz(s->hse, 0);
> 
> As I mentioned earlier, please avoid clock_update_hz() for
> clock calculations if possible.

This will be changed to use clock_update.
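
To make the rounding concern concrete, here is a small Python sketch comparing a divide-by-3 done on the period with the same division done on a whole-Hz value. The helper names are made up for this example, and the constant is assumed to match the "units of 2^-32 ns" period representation; treat it as an illustration of the arithmetic rather than of the actual clock API.

```python
# One second expressed in units of 2^-32 ns (assumed period representation).
CLOCK_PERIOD_1SEC = 1_000_000_000 << 32


def period_from_hz(hz):
    """Convert a whole-Hz frequency to a period; 0 means a stopped clock."""
    return CLOCK_PERIOD_1SEC // hz if hz else 0


def period_to_hz(period):
    """Convert a period to whole Hz, truncating any fractional part."""
    return CLOCK_PERIOD_1SEC // period if period else 0


src_period = period_from_hz(32768)

# Dividing the frequency by 3 is exact when done on the period.
exact_period = src_period * 3

# Going through whole Hz truncates 10922.67 Hz down to 10922 Hz first.
via_hz_period = period_from_hz(period_to_hz(src_period) // 3)
```

Here exact_period and via_hz_period differ, because the Hz round trip discards the fractional part of the frequency before the period is recomputed.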

> 
> thanks
> -- PMM



Re: [PATCH v4 19/34] migration/multifd: Allow receiving pages without packets

2024-02-26 Thread Fabiano Rosas
Fabiano Rosas  writes:

> Peter Xu  writes:
>
>> On Tue, Feb 20, 2024 at 07:41:23PM -0300, Fabiano Rosas wrote:
>>> Currently multifd does not need to have knowledge of pages on the
>>> receiving side because all the information needed is within the
>>> packets that come in the stream.
>>> 
>>> We're about to add support to fixed-ram migration, which cannot use
>>> packets because it expects the ramblock section in the migration file
>>> to contain only the guest pages data.
>>> 
>>> Add a data structure to transfer pages between the ram migration code
>>> and the multifd receiving threads.
>>> 
>>> We don't want to reuse MultiFDPages_t for two reasons:
>>> 
>>> a) multifd threads don't really need to know about the data they're
>>>receiving.
>>> 
>>> b) the receiving side has to be stopped to load the pages, which means
>>>we can experiment with larger granularities than page size when
>>>transferring data.
>>> 
>>> Signed-off-by: Fabiano Rosas 
>>> ---
>>> @Peter: a 'quit' flag cannot be used instead of pending_job. The
>>> receiving thread needs to know there's no more data coming. If the
>>> migration thread sets a 'quit' flag, the multifd thread would see the
>>> flag right away and exit.
>>
>> Hmm.. isn't this exactly what we want?  I'll comment for this inline below.
>>
>>> The only way is to clear pending_job on the
>>> thread and spin once more.
>>> ---
>>>  migration/file.c|   1 +
>>>  migration/multifd.c | 122 +---
>>>  migration/multifd.h |  15 ++
>>>  3 files changed, 131 insertions(+), 7 deletions(-)
>>> 
>>> diff --git a/migration/file.c b/migration/file.c
>>> index 5d4975f43e..22d052a71f 100644
>>> --- a/migration/file.c
>>> +++ b/migration/file.c
>>> @@ -6,6 +6,7 @@
>>>   */
>>>  
>>>  #include "qemu/osdep.h"
>>> +#include "exec/ramblock.h"
>>>  #include "qemu/cutils.h"
>>>  #include "qapi/error.h"
>>>  #include "channel.h"
>>> diff --git a/migration/multifd.c b/migration/multifd.c
>>> index 0a5279314d..45a0c7aaa8 100644
>>> --- a/migration/multifd.c
>>> +++ b/migration/multifd.c
>>> @@ -81,9 +81,15 @@ struct {
>>>  
>>>  struct {
>>>  MultiFDRecvParams *params;
>>> +MultiFDRecvData *data;
>>>  /* number of created threads */
>>>  int count;
>>> -/* syncs main thread and channels */
>>> +/*
>>> + * For sockets: this is posted once for each MULTIFD_FLAG_SYNC flag.
>>> + *
>>> + * For files: this is only posted at the end of the file load to mark
>>> + *completion of the load process.
>>> + */
>>>  QemuSemaphore sem_sync;
>>>  /* global number of generated multifd packets */
>>>  uint64_t packet_num;
>>> @@ -1110,6 +1116,53 @@ bool multifd_send_setup(void)
>>>  return true;
>>>  }
>>>  
>>> +bool multifd_recv(void)
>>> +{
>>> +int i;
>>> +static int next_recv_channel;
>>> +MultiFDRecvParams *p = NULL;
>>> +MultiFDRecvData *data = multifd_recv_state->data;
>>
>> [1]
>>
>>> +
>>> +/*
>>> + * next_channel can remain from a previous migration that was
>>> + * using more channels, so ensure it doesn't overflow if the
>>> + * limit is lower now.
>>> + */
>>> +next_recv_channel %= migrate_multifd_channels();
>>> +for (i = next_recv_channel;; i = (i + 1) % migrate_multifd_channels()) 
>>> {
>>> +if (multifd_recv_should_exit()) {
>>> +return false;
>>> +}
>>> +
>>> +p = &multifd_recv_state->params[i];
>>> +
>>> +/*
>>> + * Safe to read atomically without a lock because the flag is
>>> + * only set by this function below. Reading an old value of
>>> + * true is not an issue because it would only send us looking
>>> + * for the next idle channel.
>>> + */
>>> +if (qatomic_read(&p->pending_job) == false) {
>>> +next_recv_channel = (i + 1) % migrate_multifd_channels();
>>> +break;
>>> +}
>>> +}
>>
>> IIUC you'll need an smp_mb_acquire() here.  The ordering of "reading
>> pending_job" and below must be guaranteed, similar to the sender side.
>>
>
> I've been thinking about this even on the sending side.
>
> We shouldn't need the barrier here because there's a control flow
> dependency on breaking the loop. I think pending_job *must* be read
> prior to here, otherwise the program is just wrong. Does that make
> sense?

Hm, nevermind actually. We need to order this against data->size update
on the other thread anyway.




Re: [PATCH v5 2/8] Add an internal clock multiplexer object

2024-02-26 Thread Arnaud Minier
- Original Message -
> From: "Peter Maydell" 
> To: "Arnaud Minier" 
> Cc: "qemu-devel" , "Thomas Huth" , 
> "Laurent Vivier" , "Inès
> Varhol" , "Samuel Tardieu" 
> , "qemu-arm"
> , "Alistair Francis" , "Paolo 
> Bonzini" , "Alistair
> Francis" 
> Sent: Friday, February 23, 2024 3:44:59 PM
> Subject: Re: [PATCH v5 2/8] Add an internal clock multiplexer object

> On Mon, 19 Feb 2024 at 20:12, Arnaud Minier
>  wrote:
>>
>> This object is used to represent every multiplexer in the clock tree as
>> well as every clock output, every prescaler, frequency multiplier, etc.
>> This allows using a generic approach for every component of the clock tree
>> (except the PLLs).
>>
>> I wasn't sure how to handle the reset and the migration, so I used the
>> same approach as the BCM2835 CPRMAN.
> 
> I think hw/misc/zynq_sclr.c is also probably a good model to look at.
> 
> AIUI the way it works is:
> * input Clock objects must be migrated
> * output Clock objects do not need to be migrated
> * your reset needs to be a three-phase one:
>   - in the 'enter' method you reset register values (including
> all the values that define oscillator frequencies, enable bits, etc)
>   - in the 'hold' method you compute the values for the output clocks
> as if the input clock is disabled, and propagate them
>   - in the 'exit' method you compute the values for the output clocks
> according to the value of the input clock, and propagate them
> 

Thanks for the pointer.
I have changed the reset handling to a three-phase one.
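
The three phases listed above can be sketched with a toy model. This is Python pseudo-device code, not the Resettable interface itself: the class name, the reset values, and the convention that a period of 0 means a disabled clock are assumptions made for illustration.

```python
class ClockMuxModel:
    """Toy clock mux with an explicit enter/hold/exit reset sequence."""

    def __init__(self):
        self.phases = []
        self.enabled = False
        self.multiplier = 1
        self.divider = 1
        self.out_period = 0

    def _compute(self, src_period):
        # A period of 0 stands for a disabled clock in this model.
        if not self.enabled or not self.multiplier or not src_period:
            return 0
        return src_period * self.divider // self.multiplier

    def reset_enter(self):
        # Phase 1: reset register values only (hypothetical defaults).
        self.phases.append("enter")
        self.enabled = True
        self.multiplier = 1
        self.divider = 1

    def reset_hold(self):
        # Phase 2: compute outputs as if the input clock were disabled.
        self.phases.append("hold")
        self.out_period = self._compute(0)

    def reset_exit(self, src_period):
        # Phase 3: compute outputs from the real input clock value.
        self.phases.append("exit")
        self.out_period = self._compute(src_period)


def run_reset(dev, src_period):
    """Run the three phases in order; return the output seen during hold."""
    dev.reset_enter()
    dev.reset_hold()
    held = dev.out_period
    dev.reset_exit(src_period)
    return held
```

Splitting the work this way means no output is derived from an input clock until the exit phase, by which point every device has already restored its own register state.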

> 
> 
>> Signed-off-by: Arnaud Minier 
>> Signed-off-by: Inès Varhol 
>> Acked-by: Alistair Francis 
>> ---
>>  hw/misc/stm32l4x5_rcc.c   | 158 ++
>>  hw/misc/trace-events  |   5 +
>>  include/hw/misc/stm32l4x5_rcc.h   | 119 
>>  include/hw/misc/stm32l4x5_rcc_internals.h |  29 
>>  4 files changed, 311 insertions(+)
>>
>> diff --git a/hw/misc/stm32l4x5_rcc.c b/hw/misc/stm32l4x5_rcc.c
>> index 38ca8aad7d..ed10832f88 100644
>> --- a/hw/misc/stm32l4x5_rcc.c
>> +++ b/hw/misc/stm32l4x5_rcc.c
>> @@ -36,6 +36,132 @@
>>  #define LSE_FRQ 32768ULL
>>  #define LSI_FRQ 32000ULL
>>
>> +static void clock_mux_update(RccClockMuxState *mux)
>> +{
>> +uint64_t src_freq, old_freq, freq;
>> +
>> +src_freq = clock_get_hz(mux->srcs[mux->src]);
>> +old_freq = clock_get_hz(mux->out);
> 
> You should try to avoid using clock_get_hz() and clock_update_hz()
> when doing clock calculations like this. There is inherently
> rounding involved if the clock isn't running at an exact number of Hz.
> It's best to use clock_get() and clock_set(), which work with
> the clock period specified in units of 2^-32ns.
> 
> 
>> +
>> +if (!mux->enabled || !mux->divider) {
>> +freq = 0;
>> +} else {
>> +freq = muldiv64(src_freq, mux->multiplier, mux->divider);
> 
> Consider whether you can use the Clock's builtin period
> multiplier/divider (clock_set_mul_div()).

I have changed it to use the period and the builtin clock_set_mul_div(), but I
had to drop the check below, which prevents a lot of log spam, because the
child's frequency is no longer available without calling muldiv64 again.
Any idea how to keep similar functionality?

> 
>> +}
>> +
>> +/* No change, early return to avoid log spam and useless propagation */
>> +if (old_freq == freq) {
>> +return;
>> +}
>> +
>> +clock_update_hz(mux->out, freq);
>> +trace_stm32l4x5_rcc_mux_update(mux->id, mux->src, src_freq, freq);
>> +}
>> +
>> +static void clock_mux_src_update(void *opaque, ClockEvent event)
>> +{
>> +RccClockMuxState **backref = opaque;
>> +RccClockMuxState *s = *backref;
>> +/*
>> + * The backref value is equal to:
>> + * s->backref + (sizeof(RccClockMuxState *) * update_src).
>> + * By subtracting we can get back the index of the updated clock.
>> + */
>> +const uint32_t update_src = backref - s->backref;
>> +/* Only update if the clock that was updated is the current source*/
>> +if (update_src == s->src) {
>> +clock_mux_update(s);
>> +}
>> +}
>> +
>> +static void clock_mux_init(Object *obj)
>> +{
>> +RccClockMuxState *s = RCC_CLOCK_MUX(obj);
>> +size_t i;
>> +
>> +for (i = 0; i < RCC_NUM_CLOCK_MUX_SRC; i++) {
>> +char *name = g_strdup_printf("srcs[%zu]", i);
>> +s->backref[i] = s;
>> +s->srcs[i] = qdev_init_clock_in(DEVICE(s), name,
>> +clock_mux_src_update,
>> +&s->backref[i],
>> +ClockUpdate);
>> +g_free(name);
>> +}
>> +
>> +s->out = qdev_init_clock_out(DEVICE(s), "out");
>> +}
>> +
>> +static void clock_mux_reset_hold(Object *obj)
>> +{ }
>> +
>> +static const VMStateDescription clock_mux_vmstate = {
>> +.name = TYPE_RCC_CLOCK_MUX,
>> +.version_id = 1,
>> +   

[PATCH 2/2] migration: Use migrate_has_error() in close_return_path_on_source()

2024-02-26 Thread Fabiano Rosas
From: Cédric Le Goater 

close_return_path_on_source() retrieves the migration error from the
QEMUFile '->to_dst_file' to know if a shutdown is required. This
shutdown is required to exit the return-path thread.

Avoid relying on '->to_dst_file' and use migrate_has_error() instead.

(using to_dst_file is a heuristic to infer whether
rp_state.from_dst_file might be stuck on a recvmsg(). Using a generic
method for detecting errors is more reliable. We also want to reduce
dependency on QEMUFile::last_error)

Suggested-by: Peter Xu 
Signed-off-by: Cédric Le Goater 
Reviewed-by: Peter Xu 
[added some words about the motivation for this patch]
Signed-off-by: Fabiano Rosas 
---
 migration/migration.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index 3161be7cde..5316bbe670 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -2380,8 +2380,7 @@ static bool close_return_path_on_source(MigrationState 
*ms)
  * cause it to unblock if it's stuck waiting for the destination.
  */
 WITH_QEMU_LOCK_GUARD(&ms->qemu_file_lock) {
-if (ms->to_dst_file && ms->rp_state.from_dst_file &&
-qemu_file_get_error(ms->to_dst_file)) {
+if (migrate_has_error(ms) && ms->rp_state.from_dst_file) {
 qemu_file_shutdown(ms->rp_state.from_dst_file);
 }
 }
-- 
2.35.3




[PATCH 0/2] migration: Fix RP shutdown order

2024-02-26 Thread Fabiano Rosas
These were extracted from:
[PATCH 00/14] migration: Improve error reporting
https://lore.kernel.org/r/20240207133347.1115903-1-...@redhat.com

We're currently relying on the presence of a QEMUFile error in the
to_dst_file to infer whether the return path file (rp_state.from_dst_file)
might be hanging at a recvmsg() system call.

This has always been buggy because we actually clear the to_dst_file
pointer and close the file before attempting any of the above.

Move the RP cleanup before the to_dst_file cleanup, at both the
migrate_fd_cleanup() and postcopy_pause() paths to make sure we
reference a valid and open to_dst_file.

Also replace the error checking to use s->error instead of
f->last_error. This removes one more dependency on
QEMUFile::last_error.
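
The ordering problem described above can be modeled in a few lines. This Python sketch is not the migration code: the File and State classes, the error value, and the rp_first switch are hypothetical stand-ins that only show why the check must run while to_dst_file is still set.

```python
class File:
    """Stand-in for a QEMUFile with an error code and a shutdown marker."""

    def __init__(self, error=0):
        self.error = error
        self.shutdown_called = False


class State:
    """Stand-in for the migration state with both channels open."""

    def __init__(self):
        self.to_dst_file = File(error=-5)  # outgoing channel already failed
        self.from_dst_file = File()        # return path, possibly blocked


def close_return_path(s):
    # Old-style heuristic: only shut down the return path if the
    # outgoing file exists and reports an error.
    if s.to_dst_file and s.from_dst_file and s.to_dst_file.error:
        s.from_dst_file.shutdown_called = True


def release_to_dst_file(s):
    s.to_dst_file = None


def cleanup(s, rp_first):
    """Return True if the return path was shut down during cleanup."""
    if rp_first:
        close_return_path(s)
        release_to_dst_file(s)
    else:
        release_to_dst_file(s)
        close_return_path(s)
    return s.from_dst_file.shutdown_called
```

With rp_first=False the error check never sees the failed file, so the shutdown is skipped and a blocked return path thread would not be released; with rp_first=True the shutdown is issued.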

CI run: https://gitlab.com/farosas/qemu/-/pipelines/1191131909

Cédric Le Goater (1):
  migration: Use migrate_has_error() in close_return_path_on_source()

Fabiano Rosas (1):
  migration: Join the return path thread before releasing to_dst_file

 migration/migration.c | 25 ++---
 1 file changed, 10 insertions(+), 15 deletions(-)

-- 
2.35.3




[PATCH 1/2] migration: Join the return path thread before releasing to_dst_file

2024-02-26 Thread Fabiano Rosas
The return path thread might hang at a blocking system call. Before
joining the thread we might need to issue a shutdown() on the socket
file descriptor to release it. To determine whether the shutdown() is
necessary we look at the QEMUFile error.

Make sure we only clean up the QEMUFile after the return path has been
waited for.

This fixes a hang when qemu_savevm_state_setup() produced an error
that was detected by migration_detect_error(). That skips
migration_completion() so close_return_path_on_source() would get
stuck waiting for the RP thread to terminate.

Reported-by: Cédric Le Goater 
Tested-by: Cédric Le Goater 
Signed-off-by: Fabiano Rosas 
---
 migration/migration.c | 22 +-
 1 file changed, 9 insertions(+), 13 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index ab21de2cad..3161be7cde 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1326,6 +1326,8 @@ static void migrate_fd_cleanup(MigrationState *s)
 
 qemu_savevm_state_cleanup();
 
+close_return_path_on_source(s);
+
 if (s->to_dst_file) {
 QEMUFile *tmp;
 
@@ -1350,12 +1352,6 @@ static void migrate_fd_cleanup(MigrationState *s)
 qemu_fclose(tmp);
 }
 
-/*
- * We already cleaned up to_dst_file, so errors from the return
- * path might be due to that, ignore them.
- */
-close_return_path_on_source(s);
-
 assert(!migration_is_active(s));
 
 if (s->state == MIGRATION_STATUS_CANCELLING) {
@@ -2874,6 +2870,13 @@ static MigThrError postcopy_pause(MigrationState *s)
 while (true) {
 QEMUFile *file;
 
+/*
+ * We're already pausing, so ignore any errors on the return
+ * path and just wait for the thread to finish. It will be
+ * re-created when we resume.
+ */
+close_return_path_on_source(s);
+
 /*
  * Current channel is possibly broken. Release it.  Note that this is
  * guaranteed even without lock because to_dst_file should only be
@@ -2893,13 +2896,6 @@ static MigThrError postcopy_pause(MigrationState *s)
 qemu_file_shutdown(file);
 qemu_fclose(file);
 
-/*
- * We're already pausing, so ignore any errors on the return
- * path and just wait for the thread to finish. It will be
- * re-created when we resume.
- */
-close_return_path_on_source(s);
-
 migrate_set_state(&s->state, s->state,
   MIGRATION_STATUS_POSTCOPY_PAUSED);
 
-- 
2.35.3




[PATCH v3 7/7] Update maintainer contact for migration multifd zero page checking acceleration.

2024-02-26 Thread Hao Xiang
Add myself to maintain multifd zero page checking acceleration function.

Signed-off-by: Hao Xiang 
---
 MAINTAINERS | 5 +
 1 file changed, 5 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 992799171f..4a4f8f24e0 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -3413,6 +3413,11 @@ F: tests/migration/
 F: util/userfaultfd.c
 X: migration/rdma*
 
+Migration multifd zero page checking acceleration
+M: Hao Xiang 
+S: Maintained
+F: migration/multifd-zero-page.c
+
 RDMA Migration
 R: Li Zhijian 
 R: Peter Xu 
-- 
2.30.2




Re: [PATCH v2 1/3] qtest: migration: Enhance qtest migration functions to support 'channels' argument

2024-02-26 Thread Het Gala


On 27/02/24 1:04 am, Het Gala wrote:



On 26/02/24 6:31 pm, Fabiano Rosas wrote:

Het Gala  writes:


On 24/02/24 1:42 am, Fabiano Rosas wrote:
This was the first approach that I attempted. It won't work because

The final 'migrate' QAPI with channels string would look like

{ "execute": "migrate", "arguments": { "channels": "[ { "channel-type":
"main", "addr": { "transport": "socket", "type": "inet", "host":
"10.117.29.84", "port": "4000" }, "multifd-channels": 2 } ]" } }

instead of

{ "execute": "migrate", "arguments": { "channels": [ { "channel-type":
"main", "addr": { "transport": "socket", "type": "inet", "host":
"10.117.29.84", "port": "4000" }, "multifd-channels": 2 } ] } }

It would complain, that channels should be an *array* and not a string.

So, that's the reason parsing was required in qtest too.
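
The difference between the two forms above can be reproduced outside qtest. The following Python sketch uses only the standard json module; the build_migrate helper and the shortened channel description are assumptions made for illustration and are not part of the qtest code.

```python
import json

# Channel description as it would arrive in a plain C string (simplified).
CHANNELS_TEXT = (
    '[{"channel-type": "main",'
    ' "addr": {"transport": "socket", "type": "inet",'
    ' "host": "10.117.29.84", "port": "4000"}}]'
)


def build_migrate(channels_text, parse):
    """Build a 'migrate' command, optionally parsing the channels text first."""
    channels = json.loads(channels_text) if parse else channels_text
    return {"execute": "migrate", "arguments": {"channels": channels}}


as_string = build_migrate(CHANNELS_TEXT, parse=False)
as_array = build_migrate(CHANNELS_TEXT, parse=True)
```

In as_string the "channels" member is a single string, which is the shape the server rejects; in as_array it is a list containing one nested dictionary, which is the shape the command expects.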

I would be glad to hear any ideas on how to convert the string into a
JSON object and add it inside the qdict along with the uri.


Isn't this what the various qobject_from_json do? How does it work with
the existing tests?

 qtest_qmp_assert_success(to, "{ 'execute': 'migrate-incoming',"
  "  'arguments': { "
  "  'channels': [ { 'channel-type': 'main',"
  "  'addr': { 'transport': 'socket',"
  "'type': 'inet',"
  "'host': '127.0.0.1',"
  "'port': '0' } } ] } }");

We can pass this^ string successfully to QMP somehow...


I think that here, in qtest_qmp_assert_success, we can actually pass the
whole QMP command, and it just asserts that the return key is present in
the response. However, I am not familiar enough with the qtest codebase
to verify how easily we can convert a string into an actual QObject.


[...]

I tried the qobject_from_json family of utility functions and got the
following error:


migration-test: /rpmbuild/SOURCES/qemu/include/qapi/qmp/qobject.h:126: 
qobject_type: Assertion `QTYPE_NONE < obj->base.type && obj->base.type < 
QTYPE__MAX' failed.


I suppose this happened because only a limited set of QType values is
available:


typedef enum QType {
QTYPE_NONE,
QTYPE_QNULL,
QTYPE_QNUM,
QTYPE_QSTRING,
QTYPE_QDICT,
QTYPE_QLIST,
QTYPE_QBOOL,
QTYPE__MAX,
} QType;

'channels' is a mixture of QDICT and QLIST, and hence it cannot easily
be converted from a string to JSON.


Any thoughts on this?


static void do_test_validate_uri_channel(MigrateCommon *args)
{
QTestState *from, *to;
g_autofree char *connect_uri = NULL;

if (test_migrate_start(&from, &to, args->listen_uri, &args->start)) {

return;
}



Regards,

Het Gala


Regards,

Het Gala



Re: [PATCH V4 00/14] allow cpr-reboot for vfio

2024-02-26 Thread Steven Sistare
On 2/26/2024 4:01 AM, Peter Xu wrote:
> On Mon, Feb 26, 2024 at 09:49:46AM +0100, Cédric Le Goater wrote:
>> Go ahead. It will help me for the changes I am doing on error reporting
>> for VFIO migration. I will rebase on top.
> 
> Thanks for confirming.  I queued the migration patches then, but leave the
> two vfio one for further discussion.

Very good, thanks.  I am always happy to move the ball a few yards closer to
the goal line :)

- Steve




[PATCH 5/9] Hexagon (tests/tcg/hexagon) Test HVX .new read from high half of pair

2024-02-26 Thread Taylor Simpson
Make sure the decoding of HVX .new is correctly handling this case

Signed-off-by: Taylor Simpson 
---
 tests/tcg/hexagon/hvx_misc.c | 16 +++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/tests/tcg/hexagon/hvx_misc.c b/tests/tcg/hexagon/hvx_misc.c
index b45170acd1..1fe14b5158 100644
--- a/tests/tcg/hexagon/hvx_misc.c
+++ b/tests/tcg/hexagon/hvx_misc.c
@@ -1,5 +1,5 @@
 /*
- *  Copyright(c) 2021-2023 Qualcomm Innovation Center, Inc. All Rights 
Reserved.
+ *  Copyright(c) 2021-2024 Qualcomm Innovation Center, Inc. All Rights 
Reserved.
  *
  *  This program is free software; you can redistribute it and/or modify
  *  it under the terms of the GNU General Public License as published by
@@ -231,6 +231,7 @@ static void test_masked_store(bool invert)
 static void test_new_value_store(void)
 {
 void *p0 = buffer0;
+void *p1 = buffer1;
 void *pout = output;
 
 asm("{\n\t"
@@ -242,6 +243,19 @@ static void test_new_value_store(void)
 expect[0] = buffer0[0];
 
 check_output_w(__LINE__, 1);
+
+/* Test the .new read from the high half of a pair */
+asm("v7 = vmem(%0 + #0)\n\t"
+"v12 = vmem(%1 + #0)\n\t"
+"{\n\t"
+"v5:4 = vcombine(v12, v7)\n\t"
+"vmem(%2 + #0) = v5.new\n\t"
+"}\n\t"
+: : "r"(p0), "r"(p1), "r"(pout) : "v4", "v5", "v7", "v12", "memory");
+
+expect[0] = buffer1[0];
+
+check_output_w(__LINE__, 1);
 }
 
 static void test_max_temps()
-- 
2.34.1




[PATCH 8/9] Hexagon (target/hexagon) Remove gen_shortcode.py

2024-02-26 Thread Taylor Simpson
This data structure is not used

Signed-off-by: Taylor Simpson 
---
 target/hexagon/opcodes.c|  7 
 target/hexagon/README   |  1 -
 target/hexagon/gen_shortcode.py | 63 -
 target/hexagon/meson.build  | 10 --
 4 files changed, 81 deletions(-)
 delete mode 100755 target/hexagon/gen_shortcode.py

diff --git a/target/hexagon/opcodes.c b/target/hexagon/opcodes.c
index 02ae9cf787..c8bde2f9e9 100644
--- a/target/hexagon/opcodes.c
+++ b/target/hexagon/opcodes.c
@@ -37,13 +37,6 @@ const char * const opcode_names[] = {
 };
 
 
-const char * const opcode_short_semantics[] = {
-#define DEF_SHORTCODE(TAG, SHORTCODE)  [TAG] = #SHORTCODE,
-#include "shortcode_generated.h.inc"
-#undef DEF_SHORTCODE
-NULL
-};
-
 DECLARE_BITMAP(opcode_attribs[XX_LAST_OPCODE], A_ZZ_LASTATTRIB);
 
 static void init_attribs(int tag, ...)
diff --git a/target/hexagon/README b/target/hexagon/README
index 065c05154d..65b4fcc0fa 100644
--- a/target/hexagon/README
+++ b/target/hexagon/README
@@ -46,7 +46,6 @@ header files in /target/hexagon
 gen_printinsn.py-> printinsn_generated.h.inc
 gen_op_attribs.py   -> op_attribs_generated.h.inc
 gen_helper_protos.py-> helper_protos_generated.h.inc
-gen_shortcode.py-> shortcode_generated.h.inc
 gen_tcg_funcs.py-> tcg_funcs_generated.c.inc
 gen_tcg_func_table.py   -> tcg_func_table_generated.c.inc
 gen_helper_funcs.py -> helper_funcs_generated.c.inc
diff --git a/target/hexagon/gen_shortcode.py b/target/hexagon/gen_shortcode.py
deleted file mode 100755
index deb94446c4..00
--- a/target/hexagon/gen_shortcode.py
+++ /dev/null
@@ -1,63 +0,0 @@
-#!/usr/bin/env python3
-
-##
-##  Copyright(c) 2019-2023 Qualcomm Innovation Center, Inc. All Rights 
Reserved.
-##
-##  This program is free software; you can redistribute it and/or modify
-##  it under the terms of the GNU General Public License as published by
-##  the Free Software Foundation; either version 2 of the License, or
-##  (at your option) any later version.
-##
-##  This program is distributed in the hope that it will be useful,
-##  but WITHOUT ANY WARRANTY; without even the implied warranty of
-##  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
-##  GNU General Public License for more details.
-##
-##  You should have received a copy of the GNU General Public License
-##  along with this program; if not, see .
-##
-
-import sys
-import re
-import string
-import hex_common
-
-
-def gen_shortcode(f, tag):
-f.write(f"DEF_SHORTCODE({tag}, {hex_common.semdict[tag]})\n")
-
-
-def main():
-hex_common.read_semantics_file(sys.argv[1])
-hex_common.read_attribs_file(sys.argv[2])
-hex_common.calculate_attribs()
-tagregs = hex_common.get_tagregs()
-tagimms = hex_common.get_tagimms()
-
-with open(sys.argv[3], "w") as f:
-f.write("#ifndef DEF_SHORTCODE\n")
-f.write("#define DEF_SHORTCODE(TAG,SHORTCODE)/* Nothing */\n")
-f.write("#endif\n")
-
-for tag in hex_common.tags:
-## Skip the priv instructions
-if "A_PRIV" in hex_common.attribdict[tag]:
-continue
-## Skip the guest instructions
-if "A_GUEST" in hex_common.attribdict[tag]:
-continue
-## Skip the diag instructions
-if tag == "Y6_diag":
-continue
-if tag == "Y6_diag0":
-continue
-if tag == "Y6_diag1":
-continue
-
-gen_shortcode(f, tag)
-
-f.write("#undef DEF_SHORTCODE\n")
-
-
-if __name__ == "__main__":
-main()
diff --git a/target/hexagon/meson.build b/target/hexagon/meson.build
index b3a0944d3b..988e7489ba 100644
--- a/target/hexagon/meson.build
+++ b/target/hexagon/meson.build
@@ -42,21 +42,11 @@ hexagon_ss.add(semantics_generated)
 #
 # Step 2
 # We use Python scripts to generate the following files
-# shortcode_generated.h.inc
 # tcg_func_table_generated.c.inc
 # printinsn_generated.h.inc
 # op_attribs_generated.h.inc
 # opcodes_def_generated.h.inc
 #
-shortcode_generated = custom_target(
-'shortcode_generated.h.inc',
-output: 'shortcode_generated.h.inc',
-depends: [semantics_generated],
-depend_files: [hex_common_py, attribs_def],
-command: [python, files('gen_shortcode.py'), semantics_generated, 
attribs_def, '@OUTPUT@'],
-)
-hexagon_ss.add(shortcode_generated)
-
 tcg_func_table_generated = custom_target(
 'tcg_func_table_generated.c.inc',
 output: 'tcg_func_table_generated.c.inc',
-- 
2.34.1




[PATCH 4/9] Hexagon (target/hexagon) Mark has_pred_dest in trans functions

2024-02-26 Thread Taylor Simpson
Check that the value matches opcode_wregs

Signed-off-by: Taylor Simpson 
---
 target/hexagon/insn.h | 1 +
 target/hexagon/decode.c   | 3 +++
 target/hexagon/gen_trans_funcs.py | 4 
 3 files changed, 8 insertions(+)

diff --git a/target/hexagon/insn.h b/target/hexagon/insn.h
index a770379958..24dcf7fe9f 100644
--- a/target/hexagon/insn.h
+++ b/target/hexagon/insn.h
@@ -41,6 +41,7 @@ struct Instruction {
 uint32_t new_value_producer_slot:4;
 int32_t new_read_idx;
 int32_t dest_idx;
+bool has_pred_dest;
 
 bool part1;  /*
   * cmp-jumps are split into two insns.
diff --git a/target/hexagon/decode.c b/target/hexagon/decode.c
index a4d8500fea..84a3899556 100644
--- a/target/hexagon/decode.c
+++ b/target/hexagon/decode.c
@@ -366,6 +366,9 @@ static void decode_shuffle_for_execution(Packet *packet)
 for (flag = false, i = 0; i < last_insn + 1; i++) {
 int opcode = packet->insn[i].opcode;
 
+g_assert(packet->insn[i].has_pred_dest ==
+ (strstr(opcode_wregs[opcode], "Pd4") ||
+  strstr(opcode_wregs[opcode], "Pe4")));
 if ((strstr(opcode_wregs[opcode], "Pd4") ||
  strstr(opcode_wregs[opcode], "Pe4")) &&
 GET_ATTRIB(opcode, A_STORE) == 0) {
diff --git a/target/hexagon/gen_trans_funcs.py 
b/target/hexagon/gen_trans_funcs.py
index 07292e0170..f1972fd2dd 100755
--- a/target/hexagon/gen_trans_funcs.py
+++ b/target/hexagon/gen_trans_funcs.py
@@ -86,6 +86,7 @@ def gen_trans_funcs(f):
 
 new_read_idx = -1
 dest_idx = -1
+has_pred_dest = "false"
 for regno, regstruct in enumerate(regs):
 reg_type, reg_id, _, _ = regstruct
 reg = hex_common.get_register(tag, reg_type, reg_id)
@@ -96,6 +97,8 @@ def gen_trans_funcs(f):
 new_read_idx = regno
 if reg.is_written() and dest_idx == -1:
 dest_idx = regno
+if reg_type == "P" and not reg.is_read():
+has_pred_dest = "true"
 
 if len(imms) != 0:
 mark_which_imm_extended(f, tag)
@@ -119,6 +122,7 @@ def gen_trans_funcs(f):
 f.write(code_fmt(f"""\
 insn->new_read_idx = {new_read_idx};
 insn->dest_idx = {dest_idx};
+insn->has_pred_dest = {has_pred_dest};
 """))
 f.write(textwrap.dedent(f"""\
 return true;
-- 
2.34.1




[PATCH 1/9] Hexagon (target/hexagon) Add is_old/is_new to Register class

2024-02-26 Thread Taylor Simpson
Signed-off-by: Taylor Simpson 
---
 target/hexagon/hex_common.py | 14 +-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/target/hexagon/hex_common.py b/target/hexagon/hex_common.py
index 195620c7ec..4bacef223f 100755
--- a/target/hexagon/hex_common.py
+++ b/target/hexagon/hex_common.py
@@ -1,7 +1,7 @@
 #!/usr/bin/env python3
 
 ##
-##  Copyright(c) 2019-2023 Qualcomm Innovation Center, Inc. All Rights 
Reserved.
+##  Copyright(c) 2019-2024 Qualcomm Innovation Center, Inc. All Rights 
Reserved.
 ##
 ##  This program is free software; you can redistribute it and/or modify
 ##  it under the terms of the GNU General Public License as published by
@@ -397,10 +397,18 @@ def is_readwrite(self):
 class OldSource(Source):
 def reg_tcg(self):
 return f"{self.regtype}{self.regid}V"
+def is_old(self):
+return True
+def is_new(self):
+return False
 
 class NewSource(Source):
 def reg_tcg(self):
 return f"{self.regtype}{self.regid}N"
+def is_old(self):
+return False
+def is_new(self):
+return True
 
 class ReadWrite:
 def reg_tcg(self):
@@ -413,6 +421,10 @@ def is_read(self):
 return True
 def is_readwrite(self):
 return True
+def is_old(self):
+return True
+def is_new(self):
+return False
 
 class GprDest(Register, Single, Dest):
 def decl_tcg(self, f, tag, regno):
-- 
2.34.1




[PATCH 3/9] Hexagon (target/hexagon) Mark dest_idx in trans functions

2024-02-26 Thread Taylor Simpson
Check that the value matches opcode_reginfo/opcode_wregs

Signed-off-by: Taylor Simpson 
---
 target/hexagon/insn.h   | 1 +
 target/hexagon/decode.c | 2 ++
 target/hexagon/mmvec/decode_ext_mmvec.c | 2 ++
 target/hexagon/gen_trans_funcs.py   | 4 
 4 files changed, 9 insertions(+)

diff --git a/target/hexagon/insn.h b/target/hexagon/insn.h
index 36502bf056..a770379958 100644
--- a/target/hexagon/insn.h
+++ b/target/hexagon/insn.h
@@ -40,6 +40,7 @@ struct Instruction {
 uint32_t which_extended:1;/* If has an extender, which immediate */
 uint32_t new_value_producer_slot:4;
 int32_t new_read_idx;
+int32_t dest_idx;
 
 bool part1;  /*
   * cmp-jumps are split into two insns.
diff --git a/target/hexagon/decode.c b/target/hexagon/decode.c
index 4595e30384..a4d8500fea 100644
--- a/target/hexagon/decode.c
+++ b/target/hexagon/decode.c
@@ -184,6 +184,8 @@ decode_fill_newvalue_regno(Packet *packet)
 
 /* Now patch up the consumer with the register number */
 dst_idx = dststr - opcode_reginfo[def_opcode];
+g_assert(packet->insn[def_idx].dest_idx != -1 &&
+ packet->insn[def_idx].dest_idx == dst_idx);
 packet->insn[i].regno[use_regidx] =
 packet->insn[def_idx].regno[dst_idx];
 /*
diff --git a/target/hexagon/mmvec/decode_ext_mmvec.c 
b/target/hexagon/mmvec/decode_ext_mmvec.c
index e9007f5d71..c1320406df 100644
--- a/target/hexagon/mmvec/decode_ext_mmvec.c
+++ b/target/hexagon/mmvec/decode_ext_mmvec.c
@@ -86,6 +86,8 @@ check_new_value(Packet *pkt)
 /* still not there, we have a bad packet */
 g_assert_not_reached();
 }
+g_assert(pkt->insn[def_idx].dest_idx != -1 &&
+ pkt->insn[def_idx].dest_idx == dststr - reginfo);
 int def_regnum = pkt->insn[def_idx].regno[dststr - reginfo];
 /* Now patch up the consumer with the register number */
 pkt->insn[i].regno[use_regidx] = def_regnum ^ def_oreg;
diff --git a/target/hexagon/gen_trans_funcs.py 
b/target/hexagon/gen_trans_funcs.py
index 79475b2946..07292e0170 100755
--- a/target/hexagon/gen_trans_funcs.py
+++ b/target/hexagon/gen_trans_funcs.py
@@ -85,6 +85,7 @@ def gen_trans_funcs(f):
 """))
 
 new_read_idx = -1
+dest_idx = -1
 for regno, regstruct in enumerate(regs):
 reg_type, reg_id, _, _ = regstruct
 reg = hex_common.get_register(tag, reg_type, reg_id)
@@ -93,6 +94,8 @@ def gen_trans_funcs(f):
 """))
 if reg.is_read() and reg.is_new():
 new_read_idx = regno
+if reg.is_written() and dest_idx == -1:
+dest_idx = regno
 
 if len(imms) != 0:
 mark_which_imm_extended(f, tag)
@@ -115,6 +118,7 @@ def gen_trans_funcs(f):
 
 f.write(code_fmt(f"""\
 insn->new_read_idx = {new_read_idx};
+insn->dest_idx = {dest_idx};
 """))
 f.write(textwrap.dedent(f"""\
 return true;
-- 
2.34.1




[PATCH 9/9] Hexagon (target/hexagon) Remove hex_common.read_attribs_file

2024-02-26 Thread Taylor Simpson
The attribinfo data structure is not used
Adjust the command-line arguments to the python scripts
Add hex_common.read_common_files for TCG/helper generation scripts

Signed-off-by: Taylor Simpson 
---
 target/hexagon/gen_analyze_funcs.py | 21 ++-
 target/hexagon/gen_helper_funcs.py  | 21 ++-
 target/hexagon/gen_helper_protos.py | 21 ++-
 target/hexagon/gen_idef_parser_funcs.py |  5 ++--
 target/hexagon/gen_op_attribs.py|  5 ++--
 target/hexagon/gen_opcodes_def.py   |  4 +--
 target/hexagon/gen_printinsn.py |  5 ++--
 target/hexagon/gen_tcg_func_table.py|  5 ++--
 target/hexagon/gen_tcg_funcs.py | 21 ++-
 target/hexagon/hex_common.py| 35 +++--
 target/hexagon/meson.build  | 31 +++---
 11 files changed, 54 insertions(+), 120 deletions(-)

diff --git a/target/hexagon/gen_analyze_funcs.py b/target/hexagon/gen_analyze_funcs.py
index a9af666cef..b73b4e2349 100755
--- a/target/hexagon/gen_analyze_funcs.py
+++ b/target/hexagon/gen_analyze_funcs.py
@@ -1,7 +1,7 @@
 #!/usr/bin/env python3
 
 ##
-##  Copyright(c) 2022-2023 Qualcomm Innovation Center, Inc. All Rights Reserved.
+##  Copyright(c) 2022-2024 Qualcomm Innovation Center, Inc. All Rights Reserved.
 ##
 ##  This program is free software; you can redistribute it and/or modify
 ##  it under the terms of the GNU General Public License as published by
@@ -67,24 +67,7 @@ def gen_analyze_func(f, tag, regs, imms):
 
 
 def main():
-hex_common.read_semantics_file(sys.argv[1])
-hex_common.read_attribs_file(sys.argv[2])
-hex_common.read_overrides_file(sys.argv[3])
-hex_common.read_overrides_file(sys.argv[4])
-## Whether or not idef-parser is enabled is
-## determined by the number of arguments to
-## this script:
-##
-##   5 args. -> not enabled,
-##   6 args. -> idef-parser enabled.
-##
-## The 6:th arg. then holds a list of the successfully
-## parsed instructions.
-is_idef_parser_enabled = len(sys.argv) > 6
-if is_idef_parser_enabled:
-hex_common.read_idef_parser_enabled_file(sys.argv[5])
-hex_common.calculate_attribs()
-hex_common.init_registers()
+hex_common.read_common_files()
 tagregs = hex_common.get_tagregs()
 tagimms = hex_common.get_tagimms()
 
diff --git a/target/hexagon/gen_helper_funcs.py b/target/hexagon/gen_helper_funcs.py
index 9cc3d69c49..e9685bff2f 100755
--- a/target/hexagon/gen_helper_funcs.py
+++ b/target/hexagon/gen_helper_funcs.py
@@ -1,7 +1,7 @@
 #!/usr/bin/env python3
 
 ##
-##  Copyright(c) 2019-2023 Qualcomm Innovation Center, Inc. All Rights Reserved.
+##  Copyright(c) 2019-2024 Qualcomm Innovation Center, Inc. All Rights Reserved.
 ##
 ##  This program is free software; you can redistribute it and/or modify
 ##  it under the terms of the GNU General Public License as published by
@@ -102,24 +102,7 @@ def gen_helper_function(f, tag, tagregs, tagimms):
 
 
 def main():
-hex_common.read_semantics_file(sys.argv[1])
-hex_common.read_attribs_file(sys.argv[2])
-hex_common.read_overrides_file(sys.argv[3])
-hex_common.read_overrides_file(sys.argv[4])
-## Whether or not idef-parser is enabled is
-## determined by the number of arguments to
-## this script:
-##
-##   5 args. -> not enabled,
-##   6 args. -> idef-parser enabled.
-##
-## The 6:th arg. then holds a list of the successfully
-## parsed instructions.
-is_idef_parser_enabled = len(sys.argv) > 6
-if is_idef_parser_enabled:
-hex_common.read_idef_parser_enabled_file(sys.argv[5])
-hex_common.calculate_attribs()
-hex_common.init_registers()
+hex_common.read_common_files()
 tagregs = hex_common.get_tagregs()
 tagimms = hex_common.get_tagimms()
 
diff --git a/target/hexagon/gen_helper_protos.py b/target/hexagon/gen_helper_protos.py
index c82b0f54e4..4cc72a1581 100755
--- a/target/hexagon/gen_helper_protos.py
+++ b/target/hexagon/gen_helper_protos.py
@@ -1,7 +1,7 @@
 #!/usr/bin/env python3
 
 ##
-##  Copyright(c) 2019-2023 Qualcomm Innovation Center, Inc. All Rights Reserved.
+##  Copyright(c) 2019-2024 Qualcomm Innovation Center, Inc. All Rights Reserved.
 ##
 ##  This program is free software; you can redistribute it and/or modify
 ##  it under the terms of the GNU General Public License as published by
@@ -44,24 +44,7 @@ def gen_helper_prototype(f, tag, tagregs, tagimms):
 
 
 def main():
-hex_common.read_semantics_file(sys.argv[1])
-hex_common.read_attribs_file(sys.argv[2])
-hex_common.read_overrides_file(sys.argv[3])
-hex_common.read_overrides_file(sys.argv[4])
-## Whether or not idef-parser is enabled is
-## determined by the number of arguments to
-## this script:
-##
-##   5 args. -> not enabled,
-##   6 args. -> idef-parser enabled.
-##
-## The 6:th arg. then holds a list of the successfully
-## parsed 

[PATCH 0/9] Hexagon (target/hexagon) Clean up .new decode and scripts

2024-02-26 Thread Taylor Simpson
During .new decode, there are several places where strchr is used.
We remove these by generating the values that are needed.

Once we have generated the proper values, we no longer need
op_regs_generated.h.inc.  We remove the script that generates it as
well as the code in meson.build

We also remove the script and meson.build code that creates
shortcode_generated.h.inc.  The data structure that includes it is
not used.

We remove hex_common.read_attribs_file.  The Python data structures built
during this step are not used.


Taylor Simpson (9):
  Hexagon (target/hexagon) Add is_old/is_new to Register class
  Hexagon (target/hexagon) Mark new_read_idx in trans functions
  Hexagon (target/hexagon) Mark dest_idx in trans functions
  Hexagon (target/hexagon) Mark has_pred_dest in trans functions
  Hexagon (tests/tcg/hexagon) Test HVX .new read from high half of pair
  Hexagon (target/hexagon) Remove uses of op_regs_generated.h.inc
  Hexagon (target/hexagon) Remove gen_op_regs.py
  Hexagon (target/hexagon) Remove gen_shortcode.py
  Hexagon (target/hexagon) Remove hex_common.read_attribs_file

 target/hexagon/insn.h   |   5 +-
 target/hexagon/opcodes.h|   4 -
 target/hexagon/decode.c |  50 ++
 target/hexagon/mmvec/decode_ext_mmvec.c |  30 ++
 target/hexagon/opcodes.c|  35 ---
 tests/tcg/hexagon/hvx_misc.c|  16 ++-
 target/hexagon/README   |   2 -
 target/hexagon/gen_analyze_funcs.py |  21 +---
 target/hexagon/gen_helper_funcs.py  |  21 +---
 target/hexagon/gen_helper_protos.py |  21 +---
 target/hexagon/gen_idef_parser_funcs.py |   5 +-
 target/hexagon/gen_op_attribs.py|   5 +-
 target/hexagon/gen_op_regs.py   | 125 
 target/hexagon/gen_opcodes_def.py   |   4 +-
 target/hexagon/gen_printinsn.py |   5 +-
 target/hexagon/gen_shortcode.py |  63 
 target/hexagon/gen_tcg_func_table.py|   5 +-
 target/hexagon/gen_tcg_funcs.py |  21 +---
 target/hexagon/gen_trans_funcs.py   |  23 -
 target/hexagon/hex_common.py|  49 +++---
 target/hexagon/meson.build  |  55 ---
 21 files changed, 119 insertions(+), 446 deletions(-)
 delete mode 100755 target/hexagon/gen_op_regs.py
 delete mode 100755 target/hexagon/gen_shortcode.py

-- 
2.34.1




[PATCH 7/9] Hexagon (target/hexagon) Remove gen_op_regs.py

2024-02-26 Thread Taylor Simpson
Signed-off-by: Taylor Simpson 
---
 target/hexagon/README |   1 -
 target/hexagon/gen_op_regs.py | 125 --
 target/hexagon/meson.build|  14 +---
 3 files changed, 2 insertions(+), 138 deletions(-)
 delete mode 100755 target/hexagon/gen_op_regs.py

diff --git a/target/hexagon/README b/target/hexagon/README
index 746ebec378..065c05154d 100644
--- a/target/hexagon/README
+++ b/target/hexagon/README
@@ -43,7 +43,6 @@ target/hexagon/gen_semantics.c.  This step produces
 That file is consumed by the following python scripts to produce the indicated
 header files in /target/hexagon
 gen_opcodes_def.py  -> opcodes_def_generated.h.inc
-gen_op_regs.py  -> op_regs_generated.h.inc
 gen_printinsn.py-> printinsn_generated.h.inc
 gen_op_attribs.py   -> op_attribs_generated.h.inc
 gen_helper_protos.py-> helper_protos_generated.h.inc
diff --git a/target/hexagon/gen_op_regs.py b/target/hexagon/gen_op_regs.py
deleted file mode 100755
index 7b7b33895a..00
--- a/target/hexagon/gen_op_regs.py
+++ /dev/null
@@ -1,125 +0,0 @@
-#!/usr/bin/env python3
-
-##
-##  Copyright(c) 2019-2023 Qualcomm Innovation Center, Inc. All Rights Reserved.
-##
-##  This program is free software; you can redistribute it and/or modify
-##  it under the terms of the GNU General Public License as published by
-##  the Free Software Foundation; either version 2 of the License, or
-##  (at your option) any later version.
-##
-##  This program is distributed in the hope that it will be useful,
-##  but WITHOUT ANY WARRANTY; without even the implied warranty of
-##  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
-##  GNU General Public License for more details.
-##
-##  You should have received a copy of the GNU General Public License
-##  along with this program; if not, see .
-##
-
-import sys
-import re
-import string
-import hex_common
-
-
-##
-## Generate the register and immediate operands for each instruction
-##
-def calculate_regid_reg(tag):
-def letter_inc(x):
-return chr(ord(x) + 1)
-
-ordered_implregs = ["SP", "FP", "LR"]
-srcdst_lett = "X"
-src_lett = "S"
-dst_lett = "D"
-retstr = ""
-mapdict = {}
-for reg in ordered_implregs:
-reg_rd = 0
-reg_wr = 0
-if ("A_IMPLICIT_WRITES_" + reg) in hex_common.attribdict[tag]:
-reg_wr = 1
-if reg_rd and reg_wr:
-retstr += srcdst_lett
-mapdict[srcdst_lett] = reg
-srcdst_lett = letter_inc(srcdst_lett)
-elif reg_rd:
-retstr += src_lett
-mapdict[src_lett] = reg
-src_lett = letter_inc(src_lett)
-elif reg_wr:
-retstr += dst_lett
-mapdict[dst_lett] = reg
-dst_lett = letter_inc(dst_lett)
-return retstr, mapdict
-
-
-def calculate_regid_letters(tag):
-retstr, mapdict = calculate_regid_reg(tag)
-return retstr
-
-
-def strip_reg_prefix(x):
-y = x.replace("UREG.", "")
-y = y.replace("MREG.", "")
-return y.replace("GREG.", "")
-
-
-def main():
-hex_common.read_semantics_file(sys.argv[1])
-hex_common.read_attribs_file(sys.argv[2])
-hex_common.init_registers()
-tagregs = hex_common.get_tagregs(full=True)
-tagimms = hex_common.get_tagimms()
-
-with open(sys.argv[3], "w") as f:
-for tag in hex_common.tags:
-regs = tagregs[tag]
-rregs = []
-wregs = []
-regids = ""
-for regtype, regid, _, numregs in regs:
-reg = hex_common.get_register(tag, regtype, regid)
-if reg.is_read():
-if regid[0] not in regids:
-regids += regid[0]
-rregs.append(regtype + regid + numregs)
-if reg.is_written():
-wregs.append(regtype + regid + numregs)
-if regid[0] not in regids:
-regids += regid[0]
-for attrib in hex_common.attribdict[tag]:
-if hex_common.attribinfo[attrib]["rreg"]:
-rregs.append(strip_reg_prefix(attribinfo[attrib]["rreg"]))
-if hex_common.attribinfo[attrib]["wreg"]:
-wregs.append(strip_reg_prefix(attribinfo[attrib]["wreg"]))
-regids += calculate_regid_letters(tag)
-f.write(
-f'REGINFO({tag},"{regids}",\t/*RD:*/\t"{",".join(rregs)}",'
-f'\t/*WR:*/\t"{",".join(wregs)}")\n'
-)
-
-for tag in hex_common.tags:
-imms = tagimms[tag]
-f.write(f"IMMINFO({tag}")
-if not imms:
-f.write(""",'u',0,0,'U',0,0""")
-for sign, size, shamt in imms:
-if sign == "r":
-sign = "s"
-if not 

[PATCH 6/9] Hexagon (target/hexagon) Remove uses of op_regs_generated.h.inc

2024-02-26 Thread Taylor Simpson
Signed-off-by: Taylor Simpson 
---
 target/hexagon/opcodes.h|  4 --
 target/hexagon/decode.c | 57 +++--
 target/hexagon/mmvec/decode_ext_mmvec.c | 34 +++
 target/hexagon/opcodes.c| 28 
 4 files changed, 13 insertions(+), 110 deletions(-)

diff --git a/target/hexagon/opcodes.h b/target/hexagon/opcodes.h
index fa7e321950..0ee11bd445 100644
--- a/target/hexagon/opcodes.h
+++ b/target/hexagon/opcodes.h
@@ -40,10 +40,6 @@ typedef enum {
 
 extern const char * const opcode_names[];
 
-extern const char * const opcode_reginfo[];
-extern const char * const opcode_rregs[];
-extern const char * const opcode_wregs[];
-
 typedef struct {
 const char * const encoding;
 const EncClass enc_class;
diff --git a/target/hexagon/decode.c b/target/hexagon/decode.c
index 84a3899556..23deba2426 100644
--- a/target/hexagon/decode.c
+++ b/target/hexagon/decode.c
@@ -115,24 +115,13 @@ static void
 decode_fill_newvalue_regno(Packet *packet)
 {
 int i, use_regidx, offset, def_idx, dst_idx;
-uint16_t def_opcode, use_opcode;
-char *dststr;
 
 for (i = 1; i < packet->num_insns; i++) {
 if (GET_ATTRIB(packet->insn[i].opcode, A_DOTNEWVALUE) &&
 !GET_ATTRIB(packet->insn[i].opcode, A_EXTENSION)) {
-use_opcode = packet->insn[i].opcode;
-
-/* It's a store, so we're adjusting the Nt field */
-if (GET_ATTRIB(use_opcode, A_STORE)) {
-use_regidx = strchr(opcode_reginfo[use_opcode], 't') -
-opcode_reginfo[use_opcode];
-} else {/* It's a Jump, so we're adjusting the Ns field */
-use_regidx = strchr(opcode_reginfo[use_opcode], 's') -
-opcode_reginfo[use_opcode];
-}
-g_assert(packet->insn[i].new_read_idx != -1 &&
- packet->insn[i].new_read_idx == use_regidx);
+
+g_assert(packet->insn[i].new_read_idx != -1);
+use_regidx = packet->insn[i].new_read_idx;
 
 /*
  * What's encoded at the N-field is the offset to who's producing
@@ -153,39 +142,9 @@ decode_fill_newvalue_regno(Packet *packet)
  */
 g_assert(!((def_idx < 0) || (def_idx > (packet->num_insns - 1))));
 
-/*
- * packet->insn[def_idx] is the producer
- * Figure out which type of destination it produces
- * and the corresponding index in the reginfo
- */
-def_opcode = packet->insn[def_idx].opcode;
-dststr = strstr(opcode_wregs[def_opcode], "Rd");
-if (dststr) {
-dststr = strchr(opcode_reginfo[def_opcode], 'd');
-} else {
-dststr = strstr(opcode_wregs[def_opcode], "Rx");
-if (dststr) {
-dststr = strchr(opcode_reginfo[def_opcode], 'x');
-} else {
-dststr = strstr(opcode_wregs[def_opcode], "Re");
-if (dststr) {
-dststr = strchr(opcode_reginfo[def_opcode], 'e');
-} else {
-dststr = strstr(opcode_wregs[def_opcode], "Ry");
-if (dststr) {
-dststr = strchr(opcode_reginfo[def_opcode], 'y');
-} else {
-g_assert_not_reached();
-}
-}
-}
-}
-g_assert(dststr != NULL);
-
 /* Now patch up the consumer with the register number */
-dst_idx = dststr - opcode_reginfo[def_opcode];
-g_assert(packet->insn[def_idx].dest_idx != -1 &&
- packet->insn[def_idx].dest_idx == dst_idx);
+g_assert(packet->insn[def_idx].dest_idx != -1);
+dst_idx = packet->insn[def_idx].dest_idx;
 packet->insn[i].regno[use_regidx] =
 packet->insn[def_idx].regno[dst_idx];
 /*
@@ -366,11 +325,7 @@ static void decode_shuffle_for_execution(Packet *packet)
 for (flag = false, i = 0; i < last_insn + 1; i++) {
 int opcode = packet->insn[i].opcode;
 
-g_assert(packet->insn[i].has_pred_dest ==
- (strstr(opcode_wregs[opcode], "Pd4") ||
-  strstr(opcode_wregs[opcode], "Pe4")));
-if ((strstr(opcode_wregs[opcode], "Pd4") ||
- strstr(opcode_wregs[opcode], "Pe4")) &&
+if (packet->insn[i].has_pred_dest &&
 GET_ATTRIB(opcode, A_STORE) == 0) {
 /* This should be a compare (not a store conditional) */
 if (flag) {
diff --git a/target/hexagon/mmvec/decode_ext_mmvec.c b/target/hexagon/mmvec/decode_ext_mmvec.c
index c1320406df..f850d0154d 100644
--- a/target/hexagon/mmvec/decode_ext_mmvec.c
+++ b/target/hexagon/mmvec/decode_ext_mmvec.c
@@ -28,21 

[PATCH 2/9] Hexagon (target/hexagon) Mark new_read_idx in trans functions

2024-02-26 Thread Taylor Simpson
Check that the value matches opcode_reginfo

Signed-off-by: Taylor Simpson 
---
 target/hexagon/insn.h   |  3 ++-
 target/hexagon/decode.c |  2 ++
 target/hexagon/mmvec/decode_ext_mmvec.c |  2 ++
 target/hexagon/gen_trans_funcs.py   | 15 ++-
 4 files changed, 16 insertions(+), 6 deletions(-)

diff --git a/target/hexagon/insn.h b/target/hexagon/insn.h
index 3e7a22c91e..36502bf056 100644
--- a/target/hexagon/insn.h
+++ b/target/hexagon/insn.h
@@ -1,5 +1,5 @@
 /*
- *  Copyright(c) 2019-2022 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *  Copyright(c) 2019-2024 Qualcomm Innovation Center, Inc. All Rights Reserved.
  *
  *  This program is free software; you can redistribute it and/or modify
  *  it under the terms of the GNU General Public License as published by
@@ -39,6 +39,7 @@ struct Instruction {
 uint32_t slot:3;
 uint32_t which_extended:1;/* If has an extender, which immediate */
 uint32_t new_value_producer_slot:4;
+int32_t new_read_idx;
 
 bool part1;  /*
   * cmp-jumps are split into two insns.
diff --git a/target/hexagon/decode.c b/target/hexagon/decode.c
index a40210ca1e..4595e30384 100644
--- a/target/hexagon/decode.c
+++ b/target/hexagon/decode.c
@@ -131,6 +131,8 @@ decode_fill_newvalue_regno(Packet *packet)
 use_regidx = strchr(opcode_reginfo[use_opcode], 's') -
 opcode_reginfo[use_opcode];
 }
+g_assert(packet->insn[i].new_read_idx != -1 &&
+ packet->insn[i].new_read_idx == use_regidx);
 
 /*
  * What's encoded at the N-field is the offset to who's producing
diff --git a/target/hexagon/mmvec/decode_ext_mmvec.c b/target/hexagon/mmvec/decode_ext_mmvec.c
index 202d84c7c0..e9007f5d71 100644
--- a/target/hexagon/mmvec/decode_ext_mmvec.c
+++ b/target/hexagon/mmvec/decode_ext_mmvec.c
@@ -41,6 +41,8 @@ check_new_value(Packet *pkt)
 GET_ATTRIB(use_opcode, A_STORE)) {
 int use_regidx = strchr(opcode_reginfo[use_opcode], 's') -
 opcode_reginfo[use_opcode];
+g_assert(pkt->insn[i].new_read_idx != -1 &&
+ pkt->insn[i].new_read_idx == use_regidx);
 /*
  * What's encoded at the N-field is the offset to who's producing
  * the value.
diff --git a/target/hexagon/gen_trans_funcs.py b/target/hexagon/gen_trans_funcs.py
index 53e844a44b..79475b2946 100755
--- a/target/hexagon/gen_trans_funcs.py
+++ b/target/hexagon/gen_trans_funcs.py
@@ -84,14 +84,15 @@ def gen_trans_funcs(f):
 insn->opcode = {tag};
 """))
 
-regno = 0
-for reg in regs:
-reg_type = reg[0]
-reg_id = reg[1]
+new_read_idx = -1
+for regno, regstruct in enumerate(regs):
+reg_type, reg_id, _, _ = regstruct
+reg = hex_common.get_register(tag, reg_type, reg_id)
 f.write(code_fmt(f"""\
 insn->regno[{regno}] = args->{reg_type}{reg_id};
 """))
-regno += 1
+if reg.is_read() and reg.is_new():
+new_read_idx = regno
 
 if len(imms) != 0:
 mark_which_imm_extended(f, tag)
@@ -112,6 +113,9 @@ def gen_trans_funcs(f):
 insn->immed[{immno}] = args->{imm_type}{imm_letter};
 """))
 
+f.write(code_fmt(f"""\
+insn->new_read_idx = {new_read_idx};
+"""))
 f.write(textwrap.dedent(f"""\
 return true;
 {close_curly}
@@ -120,5 +124,6 @@ def gen_trans_funcs(f):
 
 if __name__ == "__main__":
 hex_common.read_semantics_file(sys.argv[1])
+hex_common.init_registers()
 with open(sys.argv[2], "w") as f:
 gen_trans_funcs(f)
-- 
2.34.1




[PATCH v3 2/7] migration/multifd: Implement zero page transmission on the multifd thread.

2024-02-26 Thread Hao Xiang
1. Add a zero_pages field to MultiFDPacket_t.
2. Implement zero page detection and handling on the multifd
threads for the non-compression, zlib and zstd compression backends.
3. Add a new value 'multifd' to the ZeroPageDetection enumeration.
4. Handle QEMU 9.0 -> QEMU 8.2 migration compatibility.
5. Add zero page counters and update the multifd send/receive tracing
format to track the newly added counters.

Signed-off-by: Hao Xiang 
---
 hw/core/machine.c|  4 +-
 hw/core/qdev-properties-system.c |  2 +-
 migration/meson.build|  1 +
 migration/multifd-zero-page.c| 78 ++
 migration/multifd-zlib.c | 21 ++--
 migration/multifd-zstd.c | 20 ++--
 migration/multifd.c  | 83 +++-
 migration/multifd.h  | 24 -
 migration/ram.c  |  1 -
 migration/trace-events   |  8 +--
 qapi/migration.json  |  5 +-
 11 files changed, 214 insertions(+), 33 deletions(-)
 create mode 100644 migration/multifd-zero-page.c

diff --git a/hw/core/machine.c b/hw/core/machine.c
index fb5afdcae4..746da219a4 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -32,7 +32,9 @@
 #include "hw/virtio/virtio-net.h"
 #include "audio/audio.h"
 
-GlobalProperty hw_compat_8_2[] = {};
+GlobalProperty hw_compat_8_2[] = {
+{ "migration", "zero-page-detection", "legacy"},
+};
 const size_t hw_compat_8_2_len = G_N_ELEMENTS(hw_compat_8_2);
 
 GlobalProperty hw_compat_8_1[] = {
diff --git a/hw/core/qdev-properties-system.c b/hw/core/qdev-properties-system.c
index 228e685f52..6e6f68ae1b 100644
--- a/hw/core/qdev-properties-system.c
+++ b/hw/core/qdev-properties-system.c
@@ -682,7 +682,7 @@ const PropertyInfo qdev_prop_mig_mode = {
 const PropertyInfo qdev_prop_zero_page_detection = {
 .name = "ZeroPageDetection",
 .description = "zero_page_detection values, "
-   "none,legacy",
+   "none,legacy,multifd",
 .enum_table = &ZeroPageDetection_lookup,
 .get = qdev_propinfo_get_enum,
 .set = qdev_propinfo_set_enum,
diff --git a/migration/meson.build b/migration/meson.build
index 92b1cc4297..1eeb915ff6 100644
--- a/migration/meson.build
+++ b/migration/meson.build
@@ -22,6 +22,7 @@ system_ss.add(files(
   'migration.c',
   'multifd.c',
   'multifd-zlib.c',
+  'multifd-zero-page.c',
   'ram-compress.c',
   'options.c',
   'postcopy-ram.c',
diff --git a/migration/multifd-zero-page.c b/migration/multifd-zero-page.c
new file mode 100644
index 00..1650c41b26
--- /dev/null
+++ b/migration/multifd-zero-page.c
@@ -0,0 +1,78 @@
+/*
+ * Multifd zero page detection implementation.
+ *
+ * Copyright (c) 2024 Bytedance Inc
+ *
+ * Authors:
+ *  Hao Xiang 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/cutils.h"
+#include "exec/ramblock.h"
+#include "migration.h"
+#include "multifd.h"
+#include "options.h"
+#include "ram.h"
+
+static void swap_page_offset(ram_addr_t *pages_offset, int a, int b)
+{
+ram_addr_t temp;
+
+if (a == b) {
+return;
+}
+
+temp = pages_offset[a];
+pages_offset[a] = pages_offset[b];
+pages_offset[b] = temp;
+}
+
+/**
+ * multifd_zero_page_check_send: Perform zero page detection on all pages.
+ *
+ * Sort the page offset array by moving all normal pages to
+ * the left and all zero pages to the right of the array.
+ *
+ * @param p A pointer to the send params.
+ */
+void multifd_zero_page_check_send(MultiFDSendParams *p)
+{
+/*
+ * QEMU versions older than 9.0 do not understand zero pages
+ * on the multifd channel. This switch is required to
+ * maintain backward compatibility.
+ */
+bool use_multifd_zero_page =
+(migrate_zero_page_detection() == ZERO_PAGE_DETECTION_MULTIFD);
+MultiFDPages_t *pages = p->pages;
+RAMBlock *rb = pages->block;
+int index_normal = 0;
+int index_zero = pages->num - 1;
+
+while (index_normal <= index_zero) {
+uint64_t offset = pages->offset[index_normal];
+if (use_multifd_zero_page &&
+buffer_is_zero(rb->host + offset, p->page_size)) {
+swap_page_offset(pages->offset, index_normal, index_zero);
+index_zero--;
+ram_release_page(rb->idstr, offset);
+} else {
+index_normal++;
+pages->normal_num++;
+}
+}
+}
+
+void multifd_zero_page_check_recv(MultiFDRecvParams *p)
+{
+for (int i = 0; i < p->zero_num; i++) {
+void *page = p->host + p->zero[i];
+if (!buffer_is_zero(page, p->page_size)) {
+memset(page, 0, p->page_size);
+}
+}
+}
diff --git a/migration/multifd-zlib.c b/migration/multifd-zlib.c
index 012e3bdea1..a8b26bc5e4 100644
--- a/migration/multifd-zlib.c
+++ b/migration/multifd-zlib.c
@@ -123,13 +123,15 @@ static int zlib_send_prepare(MultiFDSendParams *p, Error 

[PATCH v3 0/7] Introduce multifd zero page checking.

2024-02-26 Thread Hao Xiang
v3 update:
* Changed "zero" to "zero-pages" and used type size for "zero-bytes".
* Fixed ZeroPageDetection interface description.
* Moved zero page unit tests to their own path.
* Removed some asserts.
* Added backward compatibility support for migration 9.0 -> 8.2.
* Removed fields "zero" and "normal" page address arrays from v2. Now
multifd_zero_page_check_send sorts normal/zero pages in the "offset" array.
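
As a rough model of that in-place sort (not the C code itself), the
two-index partition can be written as below. The function name
partition_zero_pages and the is_zero callback are made up for this sketch;
in the patch the check is buffer_is_zero() on the page contents.

```python
def partition_zero_pages(offsets, is_zero):
    """Reorder offsets in place: normal pages first, zero pages last.

    Returns the number of normal pages. The relative order of pages is
    not preserved, because zero pages are swapped toward the tail.
    """
    index_normal = 0
    index_zero = len(offsets) - 1
    normal_num = 0
    while index_normal <= index_zero:
        if is_zero(offsets[index_normal]):
            # Move the zero page to the tail; the element swapped in is
            # examined on the next iteration.
            offsets[index_normal], offsets[index_zero] = (
                offsets[index_zero], offsets[index_normal])
            index_zero -= 1
        else:
            index_normal += 1
            normal_num += 1
    return normal_num


# Hypothetical page offsets; two of them are pretend zero pages.
zero_set = {4096, 12288}
offsets = [0, 4096, 8192, 12288, 16384]
normal_num = partition_zero_pages(offsets, lambda off: off in zero_set)
```

After the call, offsets[:normal_num] holds the normal pages and
offsets[normal_num:] holds the zero pages, so no separate arrays are needed.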

v2 update:
* Implement zero-page-detection switch with enumeration "legacy",
"none" and "multifd".
* Move normal/zero pages from MultiFDSendParams to MultiFDPages_t.
* Add zeros and zero_bytes accounting.

This patchset is based on Juan Quintela's old series here
https://lore.kernel.org/all/20220802063907.18882-1-quint...@redhat.com/

In the multifd live migration model, there is a single migration main
thread scanning the page map, queuing the pages to multiple multifd
sender threads. The migration main thread runs zero page checking on
every page before queuing the page to the sender threads. Zero page
checking is a CPU intensive task and hence having a single thread doing
all that doesn't scale well. This change introduces a new function
to run the zero page checking on the multifd sender threads. This
patchset also lays the groundwork for future changes to offload the zero
page checking task to accelerator hardware.

Testing used two 4th-generation Intel Xeon servers.

Architecture:        x86_64
CPU(s):              192
Thread(s) per core:  2
Core(s) per socket:  48
Socket(s):           2
NUMA node(s):        2
Vendor ID:           GenuineIntel
CPU family:          6
Model:               143
Model name:          Intel(R) Xeon(R) Platinum 8457C
Stepping:            8
CPU MHz:             2538.624
CPU max MHz:         3800.
CPU min MHz:         800.

Perform multifd live migration with the setup below:
1. The VM has 100GB of memory. All pages in the VM are zero pages.
2. Use a TCP socket for live migration.
3. Use 4 multifd channels with zero page checking on the migration main thread.
4. Use 1/2/4 multifd channels with zero page checking on the multifd sender
threads.
5. Record the total migration time from the "info migrate" command on the
sender QEMU console.

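+--------------------+----------------+
| zero-page-checking | total-time(ms) |
+--------------------+----------------+
| main-thread        | 9629           |
+--------------------+----------------+
| multifd-1-threads  | 6182           |
+--------------------+----------------+
| multifd-2-threads  | 4643           |
+--------------------+----------------+
| multifd-4-threads  | 4143           |
+--------------------+----------------+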
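
To put the numbers above in relative terms, the speedup over the main-thread
baseline can be computed with a few lines of Python. This is just arithmetic
on the measured totals; the dictionary keys are the row labels from the
table.

```python
# Total migration times in milliseconds, copied from the table above.
times_ms = {
    "main-thread": 9629,
    "multifd-1-threads": 6182,
    "multifd-2-threads": 4643,
    "multifd-4-threads": 4143,
}

baseline = times_ms["main-thread"]

# Speedup relative to zero page checking on the migration main thread.
speedups = {name: baseline / t for name, t in times_ms.items()}

for name, factor in speedups.items():
    print(f"{name:20s} {times_ms[name]:6d} ms  {factor:.2f}x")
```

That works out to roughly 1.56x, 2.07x and 2.32x for 1, 2 and 4 multifd
sender threads respectively, so the gain flattens out between 2 and 4
threads in this all-zero-page test.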

Apply this patchset on top of commit
dd88d696ccecc0f3018568f8e281d3d526041e6f

Hao Xiang (7):
  migration/multifd: Add new migration option zero-page-detection.
  migration/multifd: Implement zero page transmission on the multifd
thread.
  migration/multifd: Implement ram_save_target_page_multifd to handle
multifd version of MigrationOps::ram_save_target_page.
  migration/multifd: Enable multifd zero page checking by default.
  migration/multifd: Add new migration test cases for legacy zero page
checking.
  migration/multifd: Add zero pages and zero bytes counter to migration
status interface.
  Update maintainer contact for migration multifd zero page checking
acceleration.

 MAINTAINERS |  5 ++
 hw/core/machine.c   |  4 +-
 hw/core/qdev-properties-system.c| 10 
 include/hw/qdev-properties-system.h |  4 ++
 migration/meson.build   |  1 +
 migration/migration-hmp-cmds.c  | 13 +
 migration/migration.c   |  2 +
 migration/multifd-zero-page.c   | 78 +++
 migration/multifd-zlib.c| 21 ++--
 migration/multifd-zstd.c| 20 +--
 migration/multifd.c | 83 -
 migration/multifd.h | 24 -
 migration/options.c | 21 
 migration/options.h |  1 +
 migration/ram.c | 40 ++
 migration/trace-events  |  8 +--
 qapi/migration.json | 48 +++--
 tests/migration/guestperf/engine.py |  2 +
 tests/qtest/migration-test.c| 52 ++
 19 files changed, 393 insertions(+), 44 deletions(-)
 create mode 100644 migration/multifd-zero-page.c

-- 
2.30.2




[PATCH v3 6/7] migration/multifd: Add zero pages and zero bytes counter to migration status interface.

2024-02-26 Thread Hao Xiang
This change extends the MigrationStatus interface to track the zero page
and zero byte counters.

Signed-off-by: Hao Xiang 
---
 migration/migration-hmp-cmds.c  |  4 
 migration/migration.c   |  2 ++
 qapi/migration.json | 15 ++-
 tests/migration/guestperf/engine.py |  2 ++
 4 files changed, 22 insertions(+), 1 deletion(-)

diff --git a/migration/migration-hmp-cmds.c b/migration/migration-hmp-cmds.c
index 7e96ae6ffd..a38ad0255d 100644
--- a/migration/migration-hmp-cmds.c
+++ b/migration/migration-hmp-cmds.c
@@ -111,6 +111,10 @@ void hmp_info_migrate(Monitor *mon, const QDict *qdict)
info->ram->normal);
 monitor_printf(mon, "normal bytes: %" PRIu64 " kbytes\n",
info->ram->normal_bytes >> 10);
+monitor_printf(mon, "zero pages: %" PRIu64 " pages\n",
+   info->ram->zero_pages);
+monitor_printf(mon, "zero bytes: %" PRIu64 " kbytes\n",
+   info->ram->zero_bytes >> 10);
 monitor_printf(mon, "dirty sync count: %" PRIu64 "\n",
info->ram->dirty_sync_count);
 monitor_printf(mon, "page size: %" PRIu64 " kbytes\n",
diff --git a/migration/migration.c b/migration/migration.c
index ab21de2cad..a99f86f273 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1112,6 +1112,8 @@ static void populate_ram_info(MigrationInfo *info, MigrationState *s)
 info->ram->skipped = 0;
 info->ram->normal = stat64_get(&mig_stats.normal_pages);
 info->ram->normal_bytes = info->ram->normal * page_size;
+info->ram->zero_pages = stat64_get(&mig_stats.zero_pages);
+info->ram->zero_bytes = info->ram->zero_pages * page_size;
 info->ram->mbps = s->mbps;
 info->ram->dirty_sync_count =
 stat64_get(&mig_stats.dirty_sync_count);
diff --git a/qapi/migration.json b/qapi/migration.json
index a0a85a0312..171734c07e 100644
--- a/qapi/migration.json
+++ b/qapi/migration.json
@@ -63,6 +63,10 @@
 # between 0 and @dirty-sync-count * @multifd-channels.  (since
 # 7.1)
 #
+# @zero-pages: number of zero pages (since 9.0)
+#
+# @zero-bytes: number of zero bytes sent (since 9.0)
+#
 # Features:
 #
 # @deprecated: Member @skipped is always zero since 1.5.3
@@ -81,7 +85,8 @@
'multifd-bytes': 'uint64', 'pages-per-second': 'uint64',
'precopy-bytes': 'uint64', 'downtime-bytes': 'uint64',
'postcopy-bytes': 'uint64',
-   'dirty-sync-missed-zero-copy': 'uint64' } }
+   'dirty-sync-missed-zero-copy': 'uint64',
+   'zero-pages': 'int', 'zero-bytes': 'size' } }
 
 ##
 # @XBZRLECacheStats:
@@ -332,6 +337,8 @@
 #   "duplicate":123,
 #   "normal":123,
 #   "normal-bytes":123456,
+#   "zero-pages":123,
+#   "zero-bytes":123456,
 #   "dirty-sync-count":15
 # }
 #  }
@@ -358,6 +365,8 @@
 # "duplicate":123,
 # "normal":123,
 # "normal-bytes":123456,
+# "zero-pages":123,
+# "zero-bytes":123456,
 # "dirty-sync-count":15
 #  }
 #   }
@@ -379,6 +388,8 @@
 # "duplicate":123,
 # "normal":123,
 # "normal-bytes":123456,
+# "zero-pages":123,
+# "zero-bytes":123456,
 # "dirty-sync-count":15
 #  },
 #  "disk":{
@@ -405,6 +416,8 @@
 # "duplicate":10,
 # "normal":,
 # "normal-bytes":3412992,
+# "zero-pages":,
+# "zero-bytes":3412992,
 # "dirty-sync-count":15
 #  },
 #  "xbzrle-cache":{
diff --git a/tests/migration/guestperf/engine.py b/tests/migration/guestperf/engine.py
index 608d7270f6..693e07c227 100644
--- a/tests/migration/guestperf/engine.py
+++ b/tests/migration/guestperf/engine.py
@@ -92,6 +92,8 @@ def _migrate_progress(self, vm):
 info["ram"].get("skipped", 0),
 info["ram"].get("normal", 0),
 info["ram"].get("normal-bytes", 0),
+info["ram"].get("zero-pages", 0),
+info["ram"].get("zero-bytes", 0),
 info["ram"].get("dirty-pages-rate", 0),
 info["ram"].get("mbps", 0),
 info["ram"].get("dirty-sync-count", 0)
-- 
2.30.2




[PATCH v3 3/7] migration/multifd: Implement ram_save_target_page_multifd to handle multifd version of MigrationOps::ram_save_target_page.

2024-02-26 Thread Hao Xiang
1. Add a dedicated handler for MigrationOps::ram_save_target_page in
multifd live migration.
2. Refactor ram_save_target_page_legacy so that the legacy and multifd
handlers don't have internal functions calling into each other.
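
A simplified model of the resulting control flow, for discussion only. The
Python names below are invented for this sketch and the return strings stand
in for the integer return codes of the C handlers; nothing here is the actual
migration code.

```python
from enum import Enum


class ZeroPageDetection(Enum):
    NONE = "none"
    LEGACY = "legacy"
    MULTIFD = "multifd"


def save_target_page_legacy(page, mode, is_zero, queue):
    """Legacy path: never touches the multifd queue."""
    if mode is not ZeroPageDetection.NONE and is_zero(page):
        return "zero page sent by main thread"
    return "page sent by main thread"


def save_target_page_multifd(page, mode, is_zero, queue):
    """Multifd path: main-thread zero check only in 'legacy' mode."""
    if mode is ZeroPageDetection.LEGACY and is_zero(page):
        return "zero page sent by main thread"
    queue.append(page)
    return "page queued to multifd"


def select_handler(multifd_enabled):
    """Pick the handler once at setup time, as ram_save_setup does."""
    if multifd_enabled:
        return save_target_page_multifd
    return save_target_page_legacy


queue = []
handler = select_handler(multifd_enabled=True)
result = handler(0x1000, ZeroPageDetection.MULTIFD, lambda p: True, queue)
```

The point of the split is that the choice is made once when the handler is
installed, so neither handler needs to test migrate_multifd() per page.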

Signed-off-by: Hao Xiang 
---
 migration/ram.c | 43 ++-
 1 file changed, 30 insertions(+), 13 deletions(-)

diff --git a/migration/ram.c b/migration/ram.c
index 414cd0d753..f60627e11a 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -1123,10 +1123,6 @@ static int save_zero_page(RAMState *rs, PageSearchStatus *pss,
 QEMUFile *file = pss->pss_channel;
 int len = 0;
 
-if (migrate_zero_page_detection() == ZERO_PAGE_DETECTION_NONE) {
-return 0;
-}
-
 if (!buffer_is_zero(p, TARGET_PAGE_SIZE)) {
 return 0;
 }
@@ -2046,7 +2042,6 @@ static bool save_compress_page(RAMState *rs, PageSearchStatus *pss,
  */
 static int ram_save_target_page_legacy(RAMState *rs, PageSearchStatus *pss)
 {
-RAMBlock *block = pss->block;
 ram_addr_t offset = ((ram_addr_t)pss->page) << TARGET_PAGE_BITS;
 int res;
 
@@ -2062,17 +2057,34 @@ static int ram_save_target_page_legacy(RAMState *rs, PageSearchStatus *pss)
 return 1;
 }
 
+return ram_save_page(rs, pss);
+}
+
+/**
+ * ram_save_target_page_multifd: send one target page to multifd workers
+ *
+ * Returns 1 if the page was queued, -1 otherwise.
+ *
+ * @rs: current RAM state
+ * @pss: data about the page we want to send
+ */
+static int ram_save_target_page_multifd(RAMState *rs, PageSearchStatus *pss)
+{
+RAMBlock *block = pss->block;
+ram_addr_t offset = ((ram_addr_t)pss->page) << TARGET_PAGE_BITS;
+
 /*
- * Do not use multifd in postcopy as one whole host page should be
- * placed.  Meanwhile postcopy requires atomic update of pages, so even
- * if host page size == guest page size the dest guest during run may
- * still see partially copied pages which is data corruption.
+ * Backward compatibility support. While using multifd live
+ * migration, we still need to handle zero page checking on the
+ * migration main thread.
  */
-if (migrate_multifd() && !migration_in_postcopy()) {
-return ram_save_multifd_page(block, offset);
+if (migrate_zero_page_detection() == ZERO_PAGE_DETECTION_LEGACY) {
+if (save_zero_page(rs, pss, offset)) {
+return 1;
+}
 }
 
-return ram_save_page(rs, pss);
+return ram_save_multifd_page(block, offset);
 }
 
 /* Should be called before sending a host page */
@@ -2984,7 +2996,12 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
 }
 
 migration_ops = g_malloc0(sizeof(MigrationOps));
-migration_ops->ram_save_target_page = ram_save_target_page_legacy;
+
+if (migrate_multifd()) {
+migration_ops->ram_save_target_page = ram_save_target_page_multifd;
+} else {
+migration_ops->ram_save_target_page = ram_save_target_page_legacy;
+}
 
 bql_unlock();
 ret = multifd_send_sync_main();
-- 
2.30.2
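
For readers following the control flow, the handler split in this patch can
be sketched roughly as below. This is only an illustrative Python model, not
the QEMU code: the function names, the log list and the return values are
assumptions made for the sketch, and the legacy handler body is reduced to a
stub.

```python
from enum import Enum


class ZeroPageDetection(Enum):
    NONE = "none"
    LEGACY = "legacy"
    MULTIFD = "multifd"


def save_page_legacy(page, detection, log):
    # Stand-in for ram_save_target_page_legacy(); its body is elided here.
    log.append("main-thread")
    return 1


def save_page_multifd(page, detection, log):
    # In "legacy" mode the main thread still checks for zero pages itself.
    if detection is ZeroPageDetection.LEGACY and not any(page):
        log.append("zero-on-main-thread")
        return 1
    # Every other page is handed to a multifd worker.
    log.append("queued-to-multifd")
    return 1


def select_handler(multifd_enabled):
    # Models the one-time choice made in ram_save_setup().
    return save_page_multifd if multifd_enabled else save_page_legacy
```

The point of the sketch is that the choice between the two handlers is made
once at setup time, so neither handler has to test migrate_multifd() on every
page.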




[PATCH v3 5/7] migration/multifd: Add new migration test cases for legacy zero page checking.

2024-02-26 Thread Hao Xiang
Zero page checking is now done on the multifd sender threads by
default, but we still provide an option for backward compatibility. This
change adds qtest migration test cases that set the zero-page-detection
option to "legacy" (zero page checking on the migration main thread) or
"none" (zero page checking disabled) and run multifd migration.

Signed-off-by: Hao Xiang 
---
 tests/qtest/migration-test.c | 52 
 1 file changed, 52 insertions(+)

diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
index 8a5bb1752e..65b531d871 100644
--- a/tests/qtest/migration-test.c
+++ b/tests/qtest/migration-test.c
@@ -2621,6 +2621,24 @@ test_migrate_precopy_tcp_multifd_start(QTestState *from,
 return test_migrate_precopy_tcp_multifd_start_common(from, to, "none");
 }
 
+static void *
+test_migrate_precopy_tcp_multifd_start_zero_page_legacy(QTestState *from,
+QTestState *to)
+{
+test_migrate_precopy_tcp_multifd_start_common(from, to, "none");
+migrate_set_parameter_str(from, "zero-page-detection", "legacy");
+return NULL;
+}
+
+static void *
+test_migration_precopy_tcp_multifd_start_no_zero_page(QTestState *from,
+  QTestState *to)
+{
+test_migrate_precopy_tcp_multifd_start_common(from, to, "none");
+migrate_set_parameter_str(from, "zero-page-detection", "none");
+return NULL;
+}
+
 static void *
 test_migrate_precopy_tcp_multifd_zlib_start(QTestState *from,
 QTestState *to)
@@ -2652,6 +2670,36 @@ static void test_multifd_tcp_none(void)
 test_precopy_common(&args);
 }
 
+static void test_multifd_tcp_zero_page_legacy(void)
+{
+MigrateCommon args = {
+.listen_uri = "defer",
+.start_hook = test_migrate_precopy_tcp_multifd_start_zero_page_legacy,
+/*
+ * Multifd is more complicated than most of the features, it
+ * directly takes guest page buffers when sending, make sure
+ * everything will work alright even if guest page is changing.
+ */
+.live = true,
+};
+test_precopy_common(&args);
+}
+
+static void test_multifd_tcp_no_zero_page(void)
+{
+MigrateCommon args = {
+.listen_uri = "defer",
+.start_hook = test_migration_precopy_tcp_multifd_start_no_zero_page,
+/*
+ * Multifd is more complicated than most of the features, it
+ * directly takes guest page buffers when sending, make sure
+ * everything will work alright even if guest page is changing.
+ */
+.live = true,
+};
+test_precopy_common(&args);
+}
+
 static void test_multifd_tcp_zlib(void)
 {
 MigrateCommon args = {
@@ -3550,6 +3598,10 @@ int main(int argc, char **argv)
 }
 migration_test_add("/migration/multifd/tcp/plain/none",
test_multifd_tcp_none);
+migration_test_add("/migration/multifd/tcp/plain/zero-page/legacy",
+   test_multifd_tcp_zero_page_legacy);
+migration_test_add("/migration/multifd/tcp/plain/zero-page/none",
+   test_multifd_tcp_no_zero_page);
 migration_test_add("/migration/multifd/tcp/plain/cancel",
test_multifd_tcp_cancel);
 migration_test_add("/migration/multifd/tcp/plain/zlib",
-- 
2.30.2




[PATCH v3 4/7] migration/multifd: Enable multifd zero page checking by default.

2024-02-26 Thread Hao Xiang
Set default "zero-page-detection" option to "multifd". Now zero page
checking can be done in the multifd threads and this becomes the
default configuration. We still provide backward compatibility
where zero page checking is done from the migration main thread.

Signed-off-by: Hao Xiang 
---
 migration/options.c | 2 +-
 qapi/migration.json | 6 +++---
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/migration/options.c b/migration/options.c
index 3c603391b0..3c79b6ccd4 100644
--- a/migration/options.c
+++ b/migration/options.c
@@ -181,7 +181,7 @@ Property migration_properties[] = {
   MIG_MODE_NORMAL),
 DEFINE_PROP_ZERO_PAGE_DETECTION("zero-page-detection", MigrationState,
parameters.zero_page_detection,
-   ZERO_PAGE_DETECTION_LEGACY),
+   ZERO_PAGE_DETECTION_MULTIFD),
 
 /* Migration capabilities */
 DEFINE_PROP_MIG_CAP("x-xbzrle", MIGRATION_CAPABILITY_XBZRLE),
diff --git a/qapi/migration.json b/qapi/migration.json
index 5a1bb8ad62..a0a85a0312 100644
--- a/qapi/migration.json
+++ b/qapi/migration.json
@@ -890,7 +890,7 @@
 #(Since 8.2)
 #
 # @zero-page-detection: Whether and how to detect zero pages. More details
-# see description in @ZeroPageDetection. Default is 'legacy'.  (since 9.0)
+# see description in @ZeroPageDetection. Default is 'multifd'.  (since 9.0)
 #
 # Features:
 #
@@ -1086,7 +1086,7 @@
 #(Since 8.2)
 #
 # @zero-page-detection: Whether and how to detect zero pages. More details
-# see description in @ZeroPageDetection. Default is 'legacy'.  (since 9.0)
+# see description in @ZeroPageDetection. Default is 'multifd'.  (since 9.0)
 #
 # Features:
 #
@@ -1318,7 +1318,7 @@
 #(Since 8.2)
 #
 # @zero-page-detection: Whether and how to detect zero pages. More details
-# see description in @ZeroPageDetection. Default is 'legacy'.  (since 9.0)
+# see description in @ZeroPageDetection. Default is 'multifd'.  (since 9.0)
 #
 # Features:
 #
-- 
2.30.2
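
As a usage illustration, a management client could switch back to the old
behaviour with the existing migrate-set-parameters QMP command. The sketch
below only builds the command as a Python dict; the helper name and the
validation are assumptions for illustration, and the three mode names are
taken from this series.

```python
import json

VALID_MODES = ("none", "legacy", "multifd")
DEFAULT_MODE = "multifd"  # the default after this patch


def zero_page_detection_cmd(mode=DEFAULT_MODE):
    """Build a QMP command that selects the zero page detection mode."""
    if mode not in VALID_MODES:
        raise ValueError("unknown zero-page-detection mode: %s" % mode)
    return {
        "execute": "migrate-set-parameters",
        "arguments": {"zero-page-detection": mode},
    }


if __name__ == "__main__":
    # Example: restore main-thread checking for backward compatibility.
    print(json.dumps(zero_page_detection_cmd("legacy")))
```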




[PATCH v3 1/7] migration/multifd: Add new migration option zero-page-detection.

2024-02-26 Thread Hao Xiang
This new parameter controls where the zero page checking is running.
1. If this parameter is set to 'legacy', zero page checking is
done in the migration main thread.
2. If this parameter is set to 'none', zero page checking is disabled.

Signed-off-by: Hao Xiang 
---
 hw/core/qdev-properties-system.c| 10 ++
 include/hw/qdev-properties-system.h |  4 
 migration/migration-hmp-cmds.c  |  9 +
 migration/options.c | 21 
 migration/options.h |  1 +
 migration/ram.c |  4 
 qapi/migration.json | 30 ++---
 7 files changed, 76 insertions(+), 3 deletions(-)

diff --git a/hw/core/qdev-properties-system.c b/hw/core/qdev-properties-system.c
index 1a396521d5..228e685f52 100644
--- a/hw/core/qdev-properties-system.c
+++ b/hw/core/qdev-properties-system.c
@@ -679,6 +679,16 @@ const PropertyInfo qdev_prop_mig_mode = {
 .set_default_value = qdev_propinfo_set_default_value_enum,
 };
 
+const PropertyInfo qdev_prop_zero_page_detection = {
+.name = "ZeroPageDetection",
+.description = "zero_page_detection values, "
+   "none,legacy",
+.enum_table = _lookup,
+.get = qdev_propinfo_get_enum,
+.set = qdev_propinfo_set_enum,
+.set_default_value = qdev_propinfo_set_default_value_enum,
+};
+
 /* --- Reserved Region --- */
 
 /*
diff --git a/include/hw/qdev-properties-system.h b/include/hw/qdev-properties-system.h
index 06c359c190..839b170235 100644
--- a/include/hw/qdev-properties-system.h
+++ b/include/hw/qdev-properties-system.h
@@ -8,6 +8,7 @@ extern const PropertyInfo qdev_prop_macaddr;
 extern const PropertyInfo qdev_prop_reserved_region;
 extern const PropertyInfo qdev_prop_multifd_compression;
 extern const PropertyInfo qdev_prop_mig_mode;
+extern const PropertyInfo qdev_prop_zero_page_detection;
 extern const PropertyInfo qdev_prop_losttickpolicy;
 extern const PropertyInfo qdev_prop_blockdev_on_error;
 extern const PropertyInfo qdev_prop_bios_chs_trans;
@@ -47,6 +48,9 @@ extern const PropertyInfo qdev_prop_iothread_vq_mapping_list;
 #define DEFINE_PROP_MIG_MODE(_n, _s, _f, _d) \
 DEFINE_PROP_SIGNED(_n, _s, _f, _d, qdev_prop_mig_mode, \
MigMode)
+#define DEFINE_PROP_ZERO_PAGE_DETECTION(_n, _s, _f, _d) \
+DEFINE_PROP_SIGNED(_n, _s, _f, _d, qdev_prop_zero_page_detection, \
+   ZeroPageDetection)
 #define DEFINE_PROP_LOSTTICKPOLICY(_n, _s, _f, _d) \
 DEFINE_PROP_SIGNED(_n, _s, _f, _d, qdev_prop_losttickpolicy, \
 LostTickPolicy)
diff --git a/migration/migration-hmp-cmds.c b/migration/migration-hmp-cmds.c
index 99b49df5dd..7e96ae6ffd 100644
--- a/migration/migration-hmp-cmds.c
+++ b/migration/migration-hmp-cmds.c
@@ -344,6 +344,11 @@ void hmp_info_migrate_parameters(Monitor *mon, const QDict *qdict)
 monitor_printf(mon, "%s: %s\n",
 MigrationParameter_str(MIGRATION_PARAMETER_MULTIFD_COMPRESSION),
 MultiFDCompression_str(params->multifd_compression));
+assert(params->has_zero_page_detection);
+monitor_printf(mon, "%s: %s\n",
+MigrationParameter_str(MIGRATION_PARAMETER_ZERO_PAGE_DETECTION),
+qapi_enum_lookup(&ZeroPageDetection_lookup,
+params->zero_page_detection));
 monitor_printf(mon, "%s: %" PRIu64 " bytes\n",
 MigrationParameter_str(MIGRATION_PARAMETER_XBZRLE_CACHE_SIZE),
 params->xbzrle_cache_size);
@@ -634,6 +639,10 @@ void hmp_migrate_set_parameter(Monitor *mon, const QDict *qdict)
 p->has_multifd_zstd_level = true;
 visit_type_uint8(v, param, &p->multifd_zstd_level, &err);
 break;
+case MIGRATION_PARAMETER_ZERO_PAGE_DETECTION:
+p->has_zero_page_detection = true;
+visit_type_ZeroPageDetection(v, param, &p->zero_page_detection, &err);
+break;
 case MIGRATION_PARAMETER_XBZRLE_CACHE_SIZE:
 p->has_xbzrle_cache_size = true;
 if (!visit_type_size(v, param, &cache_size, &err)) {
diff --git a/migration/options.c b/migration/options.c
index 3e3e0b93b4..3c603391b0 100644
--- a/migration/options.c
+++ b/migration/options.c
@@ -179,6 +179,9 @@ Property migration_properties[] = {
 DEFINE_PROP_MIG_MODE("mode", MigrationState,
   parameters.mode,
   MIG_MODE_NORMAL),
+DEFINE_PROP_ZERO_PAGE_DETECTION("zero-page-detection", MigrationState,
+   parameters.zero_page_detection,
+   ZERO_PAGE_DETECTION_LEGACY),
 
 /* Migration capabilities */
 DEFINE_PROP_MIG_CAP("x-xbzrle", MIGRATION_CAPABILITY_XBZRLE),
@@ -903,6 +906,13 @@ uint64_t migrate_xbzrle_cache_size(void)
 return s->parameters.xbzrle_cache_size;
 }
 
+ZeroPageDetection migrate_zero_page_detection(void)
+{
+MigrationState *s = migrate_get_current();
+
+return s->parameters.zero_page_detection;
+}
+
 /* parameter setters */
 
 void migrate_set_block_incremental(bool 

Re: 'make vm-build-openbsd' tries to download nonexistent 7.2 installer ISO

2024-02-26 Thread Alex Bennée
Peter Maydell  writes:

> 'make vm-build-openbsd' has stopped working -- I suspect that this
> line from the logs is probably relevant:
>
> http://cdn.openbsd.org/pub/OpenBSD/7.2/packages/amd64/: no such dir
>
> though we don't eventually fail until much later, in 'make check' with
> gmake --output-sync -j8 check V=1;
>
> Pseudo-terminal will not be allocated because stdin is not a terminal.
> Warning: Permanently added '[127.0.0.1]:33847' (ED25519) to the list
> of known hosts.
>
> ERROR: Python not found. Use --python=/path/to/python
>
>
> tests/vm/openbsd currently has:
> link = "https://cdn.openbsd.org/pub/OpenBSD/7.2/amd64/install72.iso"
> but the webserver doesn't have 7.2 any more.
>
> Could somebody look at what we need to do to update this to 7.4
> (most recent release), please?

Sadly the installer has changed some strings so we need to update the
expect sequence.

>
> I filed
> https://gitlab.com/qemu-project/qemu/-/issues/2192
> to track this.
>
> thanks
> -- PMM

-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro



[RFC PATCH v2] tests/vm: avoid re-building the VM images all the time

2024-02-26 Thread Alex Bennée
The main problem is that "check-venv" is a .PHONY target, so it will always
evaluate and trigger a full re-build of the VM images. While it's
tempting to drop it from the dependencies, that does introduce a
breakage on freshly configured builds.

Fortunately we do have the otherwise redundant --force flag for the
script which up until now was always on. If we make the usage of
--force conditional on dependencies other than check-venv triggering
the update we can avoid the costly rebuild and still run cleanly on a
fresh checkout.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2118
Signed-off-by: Alex Bennée 
---
 tests/vm/Makefile.include | 2 +-
 tests/vm/basevm.py| 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/tests/vm/Makefile.include b/tests/vm/Makefile.include
index bf12e0fa3c5..ac56824a87d 100644
--- a/tests/vm/Makefile.include
+++ b/tests/vm/Makefile.include
@@ -102,7 +102,7 @@ $(IMAGES_DIR)/%.img:$(SRC_PATH)/tests/vm/% \
$(if $(LOG_CONSOLE),--log-console) \
--source-path $(SRC_PATH) \
--image "$@" \
-   --force \
+   $(if $(filter-out check-venv, $?), --force) \
--build-image $@, \
"  VM-IMAGE $*")
 
diff --git a/tests/vm/basevm.py b/tests/vm/basevm.py
index c0d62c08031..f8fd751eb14 100644
--- a/tests/vm/basevm.py
+++ b/tests/vm/basevm.py
@@ -646,9 +646,9 @@ def main(vmcls, config=None):
 vm = vmcls(args, config=config)
 if args.build_image:
 if os.path.exists(args.image) and not args.force:
-sys.stderr.writelines(["Image file exists: %s\n" % args.image,
+sys.stderr.writelines(["Image file exists, skipping build: %s\n" % args.image,
   "Use --force option to overwrite\n"])
-return 1
+return 0
 return vm.build_image(args.image)
 if args.build_qemu:
 vm.add_source_dir(args.build_qemu)
-- 
2.39.2
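
The combined effect of the Makefile and basevm.py changes can be modelled
roughly as follows. This is a simplified sketch rather than the real build
logic: needs_force() stands in for the $(filter-out check-venv, $?)
expression, and build_image() for the --build-image branch of main(); both
names and the do_build callback are assumptions for illustration.

```python
def needs_force(changed_prereqs):
    """True if any prerequisite other than check-venv is newer than the image."""
    return any(p != "check-venv" for p in changed_prereqs)


def build_image(image_exists, force, do_build):
    """Return the exit status of a --build-image run."""
    if image_exists and not force:
        # Image is already there and nothing relevant changed: skip, succeed.
        return 0
    return do_build()


if __name__ == "__main__":
    # Only the phony target fired, so the cached image is kept.
    print(build_image(True, needs_force(["check-venv"]), lambda: 0))
```

On a fresh checkout the image does not exist yet, so the build still runs
even though --force is not passed.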




Re: [External] Re: [PATCH v2 1/7] migration/multifd: Add new migration option zero-page-detection.

2024-02-26 Thread Hao Xiang
On Sun, Feb 25, 2024 at 11:19 PM Wang, Lei  wrote:
>
> On 2/17/2024 6:39, Hao Xiang wrote:
> > This new parameter controls where the zero page checking is running.
> > 1. If this parameter is set to 'legacy', zero page checking is
> > done in the migration main thread.
> > 2. If this parameter is set to 'none', zero page checking is disabled.
> >
> > Signed-off-by: Hao Xiang 
> > ---
> >  hw/core/qdev-properties-system.c| 10 ++
> >  include/hw/qdev-properties-system.h |  4 
> >  migration/migration-hmp-cmds.c  |  9 +
> >  migration/options.c | 21 
> >  migration/options.h |  1 +
> >  migration/ram.c |  4 
> >  qapi/migration.json | 30 ++---
> >  7 files changed, 76 insertions(+), 3 deletions(-)
> >
> > diff --git a/hw/core/qdev-properties-system.c 
> > b/hw/core/qdev-properties-system.c
> > index 1a396521d5..63843f18b5 100644
> > --- a/hw/core/qdev-properties-system.c
> > +++ b/hw/core/qdev-properties-system.c
> > @@ -679,6 +679,16 @@ const PropertyInfo qdev_prop_mig_mode = {
> >  .set_default_value = qdev_propinfo_set_default_value_enum,
> >  };
> >
> > +const PropertyInfo qdev_prop_zero_page_detection = {
> > +.name = "ZeroPageDetection",
> > +.description = "zero_page_detection values, "
> > +   "multifd,legacy,none",
>
> Nit: Maybe multifd/legacy/none?

I changed it to

.description = "zero_page_detection values, "
"none,legacy,multifd",

Since both "," and "/" are used in existing code, I think it would be
fine either way.



Re: [PATCH v2 1/3] qtest: migration: Enhance qtest migration functions to support 'channels' argument

2024-02-26 Thread Het Gala


On 26/02/24 6:31 pm, Fabiano Rosas wrote:

Het Gala  writes:


On 24/02/24 1:42 am, Fabiano Rosas wrote:
this was the same first approach that I attempted. It won't work because

The final 'migrate' QAPI with channels string would look like

{ "execute": "migrate", "arguments": { "channels": "[ { "channel-type":
"main", "addr": { "transport": "socket", "type": "inet", "host":
"10.117.29.84", "port": "4000" }, "multifd-channels": 2 } ]" } }

instead of

{ "execute": "migrate", "arguments": { "channels": [ { "channel-type":
"main", "addr": { "transport": "socket", "type": "inet", "host":
"10.117.29.84", "port": "4000" }, "multifd-channels": 2 } ] } }

It would complain, that channels should be an *array* and not a string.

So, that's the reason parsing was required in qtest too.
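
The difference between the two forms above can be reproduced outside qtest
with a few lines of Python. This is only a sketch of the idea (parse the
string first, then embed the resulting array); build_migrate_cmd() is a
hypothetical helper, not part of the qtest code.

```python
import json

CHANNELS_JSON = (
    '[ { "channel-type": "main", "addr": { "transport": "socket", '
    '"type": "inet", "host": "10.117.29.84", "port": "4000" }, '
    '"multifd-channels": 2 } ]'
)


def build_migrate_cmd(channels_json):
    """Parse the channels string so it is sent as an array, not a string."""
    channels = json.loads(channels_json)
    if not isinstance(channels, list):
        raise TypeError("'channels' must decode to a JSON array")
    return {"execute": "migrate", "arguments": {"channels": channels}}


if __name__ == "__main__":
    print(json.dumps(build_migrate_cmd(CHANNELS_JSON)))
```

Passing CHANNELS_JSON straight into the arguments dict would instead
serialize it as a quoted string, which is the form QMP rejects.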

I would be glad to hear if there are any ideas to convert /string ->
json object -> add it inside qdict along with uri/ ?


Isn't this what the various qobject_from_json do? How does it work with
the existing tests?

 qtest_qmp_assert_success(to, "{ 'execute': 'migrate-incoming',"
  "  'arguments': { "
  "  'channels': [ { 'channel-type': 'main',"
  "  'addr': { 'transport': 'socket',"
  "'type': 'inet',"
  "'host': '127.0.0.1',"
  "'port': '0' } } ] } }");

We can pass this^ string successfully to QMP somehow...


I think, here in qtest_qmp_assert_success, we actually can pass the 
whole QMP command, and it just asserts that return key is present in the 
response, though I am not very much familiar with qtest codebase to 
verify how swiftly we can convert string into an actual QObject.


[...]


static void do_test_validate_uri_channel(MigrateCommon *args)
{
QTestState *from, *to;
g_autofree char *connect_uri = NULL;

if (test_migrate_start(&from, &to, args->listen_uri, &args->start)) {

return;
}



Regards,

Het Gala


Regards,

Het Gala


Re: [PATCH v5 1/8] Implement STM32L4x5_RCC skeleton

2024-02-26 Thread Arnaud Minier
Thanks Peter for the review,

- Original Message -
> From: "Peter Maydell" 
> To: "Arnaud Minier" 
> Cc: "qemu-devel" , "Thomas Huth" , "Laurent Vivier" , "Inès
> Varhol" , "Samuel Tardieu" , "qemu-arm"
> , "Alistair Francis" , "Paolo Bonzini" , "Alistair
> Francis" 
> Sent: Friday, February 23, 2024 3:26:46 PM
> Subject: Re: [PATCH v5 1/8] Implement STM32L4x5_RCC skeleton

> On Mon, 19 Feb 2024 at 20:11, Arnaud Minier
>  wrote:
>>
>> Add the necessary files to add a simple RCC implementation with just
>> reads from and writes to registers. Also instanciate the RCC in the
> 
> "instantiate"

Fixed.

> 
>> STM32L4x5_SoC. It is needed for accurate emulation of all the SoC
>> clocks and timers.
>>
>> Signed-off-by: Arnaud Minier 
>> Signed-off-by: Inès Varhol 
>> Acked-by: Alistair Francis 
>> ---
> 
> 
> 
>> +static const MemoryRegionOps stm32l4x5_rcc_ops = {
>> +.read = stm32l4x5_rcc_read,
>> +.write = stm32l4x5_rcc_write,
>> +.endianness = DEVICE_NATIVE_ENDIAN,
>> +.valid = {
>> +.max_access_size = 4,
>> +.unaligned = false
>> +},
>> +};
> 
> What's the .valid.min_access_size ?
> Do we need to set the .impl max/min access size here too ?

I honestly don't really understand the differences between .valid and .impl.
However, since all the code assumes that 4-byte accesses are made,
I think we can set all these values to 4 for now.

> 
> 
>> +
>> +static const ClockPortInitArray stm32l4x5_rcc_clocks = {
>> +QDEV_CLOCK_IN(Stm32l4x5RccState, hsi16_rc, NULL, 0),
>> +QDEV_CLOCK_IN(Stm32l4x5RccState, msi_rc, NULL, 0),
>> +QDEV_CLOCK_IN(Stm32l4x5RccState, hse, NULL, 0),
>> +QDEV_CLOCK_IN(Stm32l4x5RccState, lsi_rc, NULL, 0),
>> +QDEV_CLOCK_IN(Stm32l4x5RccState, lse_crystal, NULL, 0),
>> +QDEV_CLOCK_IN(Stm32l4x5RccState, sai1_extclk, NULL, 0),
>> +QDEV_CLOCK_IN(Stm32l4x5RccState, sai2_extclk, NULL, 0),
>> +QDEV_CLOCK_END
>> +};
> 
> These are input clocks, so they each need a VMSTATE_CLOCK()
> line in the VMStateDescription. (I think only input clocks
> need to be migrated.)

Sure, will add these in the VMStateDescription.

> 
>> +
>> +
>> +static void stm32l4x5_rcc_init(Object *obj)
>> +{
>> +Stm32l4x5RccState *s = STM32L4X5_RCC(obj);
>> +
>> +sysbus_init_irq(SYS_BUS_DEVICE(obj), &s->irq);
>> +
>> +memory_region_init_io(&s->mmio, obj, &stm32l4x5_rcc_ops, s,
>> +  TYPE_STM32L4X5_RCC, 0x400);
>> +sysbus_init_mmio(SYS_BUS_DEVICE(obj), &s->mmio);
>> +
>> +qdev_init_clocks(DEVICE(s), stm32l4x5_rcc_clocks);
>> +
>> +s->gnd = clock_new(obj, "gnd");
>> +}
> 
> Otherwise
> Reviewed-by: Peter Maydell 
> 
> thanks
> -- PMM



Re: [PATCH v4 19/34] migration/multifd: Allow receiving pages without packets

2024-02-26 Thread Fabiano Rosas
Peter Xu  writes:

> On Tue, Feb 20, 2024 at 07:41:23PM -0300, Fabiano Rosas wrote:
>> Currently multifd does not need to have knowledge of pages on the
>> receiving side because all the information needed is within the
>> packets that come in the stream.
>> 
>> We're about to add support to fixed-ram migration, which cannot use
>> packets because it expects the ramblock section in the migration file
>> to contain only the guest pages data.
>> 
>> Add a data structure to transfer pages between the ram migration code
>> and the multifd receiving threads.
>> 
>> We don't want to reuse MultiFDPages_t for two reasons:
>> 
>> a) multifd threads don't really need to know about the data they're
>>receiving.
>> 
>> b) the receiving side has to be stopped to load the pages, which means
>>we can experiment with larger granularities than page size when
>>transferring data.
>> 
>> Signed-off-by: Fabiano Rosas 
>> ---
>> @Peter: a 'quit' flag cannot be used instead of pending_job. The
>> receiving thread needs to know there's no more data coming. If the
>> migration thread sets a 'quit' flag, the multifd thread would see the
>> flag right away and exit.
>
> Hmm.. isn't this exactly what we want?  I'll comment for this inline below.
>
>> The only way is to clear pending_job on the
>> thread and spin once more.
>> ---
>>  migration/file.c|   1 +
>>  migration/multifd.c | 122 +---
>>  migration/multifd.h |  15 ++
>>  3 files changed, 131 insertions(+), 7 deletions(-)
>> 
>> diff --git a/migration/file.c b/migration/file.c
>> index 5d4975f43e..22d052a71f 100644
>> --- a/migration/file.c
>> +++ b/migration/file.c
>> @@ -6,6 +6,7 @@
>>   */
>>  
>>  #include "qemu/osdep.h"
>> +#include "exec/ramblock.h"
>>  #include "qemu/cutils.h"
>>  #include "qapi/error.h"
>>  #include "channel.h"
>> diff --git a/migration/multifd.c b/migration/multifd.c
>> index 0a5279314d..45a0c7aaa8 100644
>> --- a/migration/multifd.c
>> +++ b/migration/multifd.c
>> @@ -81,9 +81,15 @@ struct {
>>  
>>  struct {
>>  MultiFDRecvParams *params;
>> +MultiFDRecvData *data;
>>  /* number of created threads */
>>  int count;
>> -/* syncs main thread and channels */
>> +/*
>> + * For sockets: this is posted once for each MULTIFD_FLAG_SYNC flag.
>> + *
>> + * For files: this is only posted at the end of the file load to mark
>> + *completion of the load process.
>> + */
>>  QemuSemaphore sem_sync;
>>  /* global number of generated multifd packets */
>>  uint64_t packet_num;
>> @@ -1110,6 +1116,53 @@ bool multifd_send_setup(void)
>>  return true;
>>  }
>>  
>> +bool multifd_recv(void)
>> +{
>> +int i;
>> +static int next_recv_channel;
>> +MultiFDRecvParams *p = NULL;
>> +MultiFDRecvData *data = multifd_recv_state->data;
>
> [1]
>
>> +
>> +/*
>> + * next_channel can remain from a previous migration that was
>> + * using more channels, so ensure it doesn't overflow if the
>> + * limit is lower now.
>> + */
>> +next_recv_channel %= migrate_multifd_channels();
>> +for (i = next_recv_channel;; i = (i + 1) % migrate_multifd_channels()) {
>> +if (multifd_recv_should_exit()) {
>> +return false;
>> +}
>> +
>> +p = &multifd_recv_state->params[i];
>> +
>> +/*
>> + * Safe to read atomically without a lock because the flag is
>> + * only set by this function below. Reading an old value of
>> + * true is not an issue because it would only send us looking
>> + * for the next idle channel.
>> + */
>> +if (qatomic_read(&p->pending_job) == false) {
>> +next_recv_channel = (i + 1) % migrate_multifd_channels();
>> +break;
>> +}
>> +}
>
> IIUC you'll need an smp_mb_acquire() here.  The ordering of "reading
> pending_job" and below must be guaranteed, similar to the sender side.
>

I've been thinking about this even on the sending side.

We shouldn't need the barrier here because there's a control flow
dependency on breaking the loop. I think pending_job *must* be read
prior to here, otherwise the program is just wrong. Does that make
sense?
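
To make the loop under discussion easier to follow, here is a single-threaded
Python model of the round-robin search for an idle channel. It deliberately
says nothing about memory ordering, which is the actual question here, and
unlike the C loop it gives up after one full pass instead of spinning; the
function name and the list of flags are assumptions for illustration.

```python
def next_idle_channel(pending, start):
    """Return (channel, next_start), or (None, start) if all are busy.

    pending[i] models p->pending_job for channel i.
    """
    n = len(pending)
    i = start % n  # the saved index may be stale if the channel count shrank
    for _ in range(n):
        if not pending[i]:
            return i, (i + 1) % n
        i = (i + 1) % n
    return None, start % n


if __name__ == "__main__":
    print(next_idle_channel([True, False, True], 0))
```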

>> +
>> +assert(!p->data->size);
>> +multifd_recv_state->data = p->data;
>
> [2]
>
>> +p->data = data;
>> +
>> +qatomic_set(&p->pending_job, true);
>
> Then here:
>
>qatomic_store_release(&p->pending_job, true);

Ok.

>
> Please consider adding a comment above each acquire/release pair, as on the
> sender side.
>
>> +qemu_sem_post(&p->sem);
>> +
>> +return true;
>> +}
>> +
>> +MultiFDRecvData *multifd_get_recv_data(void)
>> +{
>> +return multifd_recv_state->data;
>> +}
>
> Can also use it above [1].
>
> I'm thinking maybe we can do something like:
>
> #define  MULTIFD_RECV_DATA_GLOBAL  (multifd_recv_state->data)
>
> Then we can also use it at [2], and replace multifd_get_recv_data()?
>

We need the helper because multifd_recv_state->data needs 

Re: [PATCH 28/28] qemu-img: extend cvtnum() and use it in more places

2024-02-26 Thread Michael Tokarev

22.02.2024 00:16, Michael Tokarev wrote:


-static int64_t cvtnum_full(const char *name, const char *value, int64_t min,
-   int64_t max)
+static int64_t cvtnum_full(const char *name, const char *value,
+   bool issize, int64_t min, int64_t max)
  {
  int err;
  uint64_t res;
  
-err = qemu_strtosz(value, NULL, &res);

+err = issize ? qemu_strtosz(value, NULL, &res) :
+   qemu_strtou64(value, NULL, 0, &res);
  if (err < 0 && err != -ERANGE) {
-error_report("Invalid %s specified. You may use "
- "k, M, G, T, P or E suffixes for", name);
-error_report("kilobytes, megabytes, gigabytes, terabytes, "
- "petabytes and exabytes.");
+if (issize) {
+error_report("Invalid %s specified. You may use "
+ "k, M, G, T, P or E suffixes for", name);
+error_report("kilobytes, megabytes, gigabytes, terabytes, "
+ "petabytes and exabytes.");
+} else {
+error_report("Invalid %s specified.", name);
+}


I've added actual value supplied to these error messages now.
And I think the list of possible suffixes makes little sense here.



@@ -5090,7 +5060,7 @@ static int img_bitmap(const img_cmd_t *ccmd, int argc, char **argv)
  src_fmt = optarg;
  break;
  case 'g':
-granularity = cvtnum("granularity", optarg);
+granularity = cvtnum("granularity", optarg, false);


Here, this is a size, so last arg should be true.  In the tests (190),
we already use -g 2M.  I didn't really know what a granularity is while
converting it.
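
A small Python model shows why the flag matters for -g 2M. It is a heavily
simplified stand-in for the qemu_strtosz()/qemu_strtou64() pair (no
fractions, no range checks, power-of-1024 suffixes only), and the cvtnum()
signature here is an assumption for illustration.

```python
SUFFIX_POWER = {"k": 1, "M": 2, "G": 3, "T": 4, "P": 5, "E": 6}


def cvtnum(value, is_size):
    """Parse a number; accept a size suffix only when is_size is true."""
    if is_size and value and value[-1] in SUFFIX_POWER:
        return int(value[:-1]) * 1024 ** SUFFIX_POWER[value[-1]]
    # Plain integer path: a trailing suffix is rejected with ValueError.
    return int(value, 0)


if __name__ == "__main__":
    print(cvtnum("2M", True))
```

With is_size false, "2M" is rejected, which is why the granularity call
needs true as its last argument.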

/mjt



Re: [PATCH v4 02/10] hw/cxl/cxl-mailbox-utils: Add dynamic capacity region representative and mailbox command support

2024-02-26 Thread fan
On Mon, Feb 26, 2024 at 05:33:17PM +, Jonathan Cameron wrote:
> On Wed, 21 Feb 2024 10:15:55 -0800
> nifan@gmail.com wrote:
> 
> > From: Fan Ni 
> > 
> > Per cxl spec r3.1, add dynamic capacity region representative based on
> > Table 8-165 and extend the cxl type3 device definition to include dc region
> > information. Also, based on info in 8.2.9.9.9.1, add 'Get Dynamic Capacity
> > Configuration' mailbox support.
> > 
> > Note: decode_len of a dc region is aligned to 256*MiB, divided by
> > 256 * MiB before returned to the host for "Get Dynamic Capacity 
> > Configuration"
> > mailbox command.
> > 
> > Signed-off-by: Fan Ni 
> Hi Fan,
> 
> A few comments inline.
> 
> Jonathan
> 
> > ---
> >  hw/cxl/cxl-mailbox-utils.c  | 110 
> >  include/hw/cxl/cxl_device.h |  16 ++
> >  2 files changed, 126 insertions(+)
> > 
> > diff --git a/hw/cxl/cxl-mailbox-utils.c b/hw/cxl/cxl-mailbox-utils.c
> > index ba1d9901df..88e3b733e3 100644
> > --- a/hw/cxl/cxl-mailbox-utils.c
> > +++ b/hw/cxl/cxl-mailbox-utils.c
> > @@ -22,6 +22,7 @@
> >  
> >  #define CXL_CAPACITY_MULTIPLIER   (256 * MiB)
> >  #define CXL_DC_EVENT_LOG_SIZE 8
> > +#define CXL_SPEC_AFTER_R30
> As below. Drop this.  Kernel code needs to be able to cope with newer specs
> than it understands anyway so should be fine with the larger records 
> (otherwise
> it's buggy and needs fixing!) 

Will remove it.
> 
> >  
> >  /*
> >   * How to add a new command, example. The command set FOO, with cmd BAR.
> > @@ -80,6 +81,8 @@ enum {
> >  #define GET_POISON_LIST0x0
> >  #define INJECT_POISON  0x1
> >  #define CLEAR_POISON   0x2
> > +DCD_CONFIG  = 0x48,
> > +#define GET_DC_CONFIG  0x0
> >  PHYSICAL_SWITCH = 0x51,
> >  #define IDENTIFY_SWITCH_DEVICE  0x0
> >  #define GET_PHYSICAL_PORT_STATE 0x1
> > @@ -1238,6 +1241,103 @@ static CXLRetCode cmd_media_clear_poison(const 
> > struct cxl_cmd *cmd,
> >  return CXL_MBOX_SUCCESS;
> >  }
> >  
> > +/*
> > + * CXL r3.1 section 8.2.9.9.9.1: Get Dynamic Capacity Configuration
> > + * (Opcode: 4800h)
> > + */
> > +static CXLRetCode cmd_dcd_get_dyn_cap_config(const struct cxl_cmd *cmd,
> > + uint8_t *payload_in,
> > + size_t len_in,
> > + uint8_t *payload_out,
> > + size_t *len_out,
> > + CXLCCI *cci)
> > +{
> > +CXLType3Dev *ct3d = CXL_TYPE3(cci->d);
> > +struct get_dyn_cap_config_in_pl {
> Type not needed - see below. 
> > +uint8_t region_cnt;
> > +uint8_t start_region_id;
> > +} QEMU_PACKED;
>} QEMU_PACKED *in;

Get your point, it seems I followed the wrong example like
cmd_media_clear_poison and cmd_inject_poison :-<

Fan
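
For reference, the note in the commit message about decode_len can be
illustrated with a short Python sketch that packs one region record. The
field order follows the struct quoted in this thread; the little-endian
layout, the helper name, and leaving the other fields unscaled are
assumptions of the sketch, not statements about the final patch.

```python
import struct

MIB = 1 << 20
CXL_CAPACITY_MULTIPLIER = 256 * MIB
# base, decode_len, region_len, block_size, dsmadhandle, flags, 3 reserved bytes
RECORD_FMT = "<QQQQIB3x"


def pack_region_record(base, decode_len, region_len, block_size,
                       dsmadhandle, flags):
    """Pack one DC region record, reporting decode_len in 256 MiB units."""
    if decode_len % CXL_CAPACITY_MULTIPLIER:
        raise ValueError("decode_len must be a multiple of 256 MiB")
    return struct.pack(RECORD_FMT, base,
                       decode_len // CXL_CAPACITY_MULTIPLIER,
                       region_len, block_size, dsmadhandle, flags)


if __name__ == "__main__":
    rec = pack_region_record(0, 512 * MIB, 512 * MIB, 2 * MIB, 0, 0)
    print(len(rec))
```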

> 
> > +
> > +struct get_dyn_cap_config_out_pl {
> Don't think giving this a type is necessary - see below.
> > +uint8_t num_regions;
> > +uint8_t regions_returned;
> > +uint8_t rsvd1[6];
> > +struct {
> > +uint64_t base;
> > +uint64_t decode_len;
> > +uint64_t region_len;
> > +uint64_t block_size;
> > +uint32_t dsmadhandle;
> > +uint8_t flags;
> > +uint8_t rsvd2[3];
> > +} QEMU_PACKED records[];
> > +/*
> > + * if cxl spec version >= 3.1, extra output payload as defined
> > + * in struct get_dyn_cap_config_out_pl_extra comes here.
> > + */
> > +} QEMU_PACKED;
> } QEMU_PACKED *out;
> > +
> > +struct get_dyn_cap_config_in_pl *in = (void *)payload_in;
> > +struct get_dyn_cap_config_out_pl *out = (void *)payload_out;
> 
> We've (mostly) used the (void *) casting where we haven't given the structures
> a type.  I think I'd prefer we kept to that style for consistency.
> 
> There is an argument we should have given all these types
> for readability reasons and to avoid casting via void * but
> we have gone this way now - with the exception of
> the poison list - oops.   
> 
> > +uint16_t record_count = 0;
> > +uint16_t i;
> > +uint16_t out_pl_len;
> > +uint8_t start_region_id = in->start_region_id;
> > +#ifdef CXL_SPEC_AFTER_R30
> 
> Handy for testing, but I'd drop the ifdef for the final
> version.  We don't need to support old specs.
> 
> > +struct get_dyn_cap_config_out_pl_extra {
> > +uint32_t num_extents_supported;
> > +uint32_t num_extents_available;
> > +uint32_t num_tags_supported;
> > +uint32_t num_tags_available;
> > +} QEMU_PACKED;
> > +struct get_dyn_cap_config_out_pl_extra *extra_out;
> As above, anonymous structure should work ok.
> > +#endif
> > +
> > +if (start_region_id >= ct3d->dc.num_regions) {
> > +return CXL_MBOX_INVALID_INPUT;
> > +}
> > +
> > +

Re: [RFC PATCH v3 04/21] target/arm: Implement ALLINT MSR (immediate)

2024-02-26 Thread Richard Henderson

On 2/25/24 16:22, Jinjie Ruan wrote:



On 2024/2/24 3:03, Richard Henderson wrote:

On 2/23/24 00:32, Jinjie Ruan via wrote:

Add ALLINT MSR (immediate) to decodetree. And the EL0 check is necessary
to ALLINT. Avoid the unconditional write to pc and use raise_exception_ra
to unwind.

Signed-off-by: Jinjie Ruan 
---
v3:
- Remove EL0 check in allint_check().
- Add TALLINT check for EL1 in allint_check().
- Remove unnecessarily arm_rebuild_hflags() in msr_i_allint helper.
---
   target/arm/tcg/a64.decode  |  1 +
   target/arm/tcg/helper-a64.c    | 24 
   target/arm/tcg/helper-a64.h    |  1 +
   target/arm/tcg/translate-a64.c | 10 ++
   4 files changed, 36 insertions(+)

diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
index 8a20dce3c8..3588080024 100644
--- a/target/arm/tcg/a64.decode
+++ b/target/arm/tcg/a64.decode
@@ -207,6 +207,7 @@ MSR_i_DIT       1101 0101 0000 0 011 0100 .... 010 11111 @msr_i
   MSR_i_TCO       1101 0101 0000 0 011 0100 .... 100 11111 @msr_i
   MSR_i_DAIFSET   1101 0101 0000 0 011 0100 .... 110 11111 @msr_i
   MSR_i_DAIFCLEAR 1101 0101 0000 0 011 0100 .... 111 11111 @msr_i
+MSR_i_ALLINT    1101 0101 0000 0 001 0100 .... 000 11111 @msr_i


Decode is incorrect either here, or in trans_MSR_i_ALLINT, because CRm
!= '000x' is UNDEFINED.

MSR_i_ALLINT    1101 0101 0000 0 001 0100 000 imm:1 000 11111

is perhaps the clearest implementation.


+static void allint_check(CPUARMState *env, uint32_t op,
+   uint32_t imm, uintptr_t ra)
+{
+    /* ALLINT update to PSTATE. */
+    if (arm_current_el(env) == 1 && arm_is_el2_enabled(env) &&
+    (arm_hcrx_el2_eff(env) & HCRX_TALLINT)) {
+    raise_exception_ra(env, EXCP_UDEF,
+   syn_aa64_sysregtrap(0, extract32(op, 0, 3),
+   extract32(op, 3, 3), 4,
+   imm, 0x1f, 0),
+   exception_target_el(env), ra);
+    }
+}
+
+void HELPER(msr_i_allint)(CPUARMState *env, uint32_t imm)
+{
+    allint_check(env, 0x8, imm, GETPC());


As previously noted, the check for MSR_i only applies to imm==1, not 0.


Sorry! The hardware manual I looked at didn't say this.


In DDI0487J.a, C6.2.229 MSR (immediate), it is present in the pseudocode

  when PSTATEField_ALLINT
    if (PSTATE.EL == EL1 && IsHCRXEL2Enabled()
          && HCRX_EL2.TALLINT == '1' && CRm<0> == '1') then
      AArch64.SystemAccessTrap(EL2, 0x18);
    PSTATE.ALLINT = CRm<0>;

In D19.2.49 HCRX_EL2, it is present as text for the description of TALLINT.
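
For illustration only, the trap condition in that pseudocode could be modelled
in standalone C roughly as follows. This is a sketch, not QEMU code: the
function name and its parameters are made up here and simply stand in for
PSTATE.EL, IsHCRXEL2Enabled(), HCRX_EL2.TALLINT and CRm<0>.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical model of the MSR ALLINT, #imm trap check.
 * el               stands in for PSTATE.EL (0..3)
 * hcrx_el2_enabled stands in for IsHCRXEL2Enabled()
 * tallint          stands in for HCRX_EL2.TALLINT
 * imm              stands in for CRm<0>
 * None of these names come from QEMU.
 */
static bool msr_i_allint_traps(int el, bool hcrx_el2_enabled,
                               bool tallint, unsigned imm)
{
    assert(el >= 0 && el <= 3);
    assert(imm <= 1);

    /* Only a write of 1 from EL1 is trapped; a write of 0 never is. */
    return el == 1 && hcrx_el2_enabled && tallint && imm == 1;
}
```

With this model, a write of 0 at EL1 with TALLINT set returns false, which is
the case the helper in the patch currently gets wrong.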


r~



Re: [PATCH 01/28] qemu-img: stop printing error twice in a few places

2024-02-26 Thread Michael Tokarev

26.02.2024 17:14, Daniel P. Berrangé :

On Thu, Feb 22, 2024 at 12:15:42AM +0300, Michael Tokarev wrote:

Currently we have:

   ./qemu-img resize none +10
   qemu-img: Could not open 'none': Could not open 'none': No such file or 
directory

stop printing the message twice, - local_err already has
all the info, no need to prepend additional text there.

There are a few other places like this, but I'm unsure
about these.

Signed-off-by: Michael Tokarev 
---
  qemu-img.c | 8 +++-
  1 file changed, 3 insertions(+), 5 deletions(-)


Reviewed-by: Daniel P. Berrangé 


Unfortunately I have to drop this one for now, - it requires
much more work.  For example, after this we have:

-qemu-img: TEST_DIR/t.IMGFMT: Extended L2 entries are only supported with 
cluster sizes of at least 16384 bytes
+qemu-img: Extended L2 entries are only supported with cluster sizes of at 
least 16384 bytes

-qemu-img: Could not open 'TEST_DIR/t.IMGFMT': L1 table is too small
+qemu-img: L1 table is too small

-qemu-img: Could not open 'TEST_DIR/t.IMGFMT': Could not open 'foo': No such 
file or directory
+qemu-img: Could not open 'foo': No such file or directory

and a few other interesting cases.

This whole thing needs a much bigger revisit.

/mjt



Re: [PATCH 1/7] qga/commands-posix: return fsinfo values directly as reported by statvfs

2024-02-26 Thread Konstantin Kostiuk
Best Regards,
Konstantin Kostiuk.


On Mon, Feb 26, 2024 at 7:02 PM Andrey Drobyshev <
andrey.drobys...@virtuozzo.com> wrote:

> Since the commit 25b5ff1a86 ("qga: add mountpoint usage info to
> GuestFilesystemInfo") we have 2 values reported in guest-get-fsinfo:
> used = (f_blocks - f_bfree), total = (f_blocks - f_bfree + f_bavail).
> These calculations might be obscure for the end user and require one to
> actually get into QGA source to understand how they're obtained. Let's
> just report the values f_blocks, f_bfree, f_bavail (in bytes) from
> statvfs() as they are, letting the user decide how to process them further.
>
> Originally-by: Yuri Pudgorodskiy 
> Signed-off-by: Andrey Drobyshev 
> ---
>  qga/commands-posix.c | 16 +++-
>  qga/qapi-schema.json | 11 +++
>  2 files changed, 14 insertions(+), 13 deletions(-)
>
> diff --git a/qga/commands-posix.c b/qga/commands-posix.c
> index 26008db497..752ef509d0 100644
> --- a/qga/commands-posix.c
> +++ b/qga/commands-posix.c
> @@ -1554,8 +1554,7 @@ static GuestFilesystemInfo
> *build_guest_fsinfo(struct FsMount *mount,
> Error **errp)
>  {
>  GuestFilesystemInfo *fs = g_malloc0(sizeof(*fs));
> -struct statvfs buf;
> -unsigned long used, nonroot_total, fr_size;
> +struct statvfs st;
>  char *devpath = g_strdup_printf("/sys/dev/block/%u:%u",
>  mount->devmajor, mount->devminor);
>
> @@ -1563,15 +1562,14 @@ static GuestFilesystemInfo
> *build_guest_fsinfo(struct FsMount *mount,
>  fs->type = g_strdup(mount->devtype);
>  build_guest_fsinfo_for_device(devpath, fs, errp);
>
> -if (statvfs(fs->mountpoint, &buf) == 0) {
> -fr_size = buf.f_frsize;
> -used = buf.f_blocks - buf.f_bfree;
> -nonroot_total = used + buf.f_bavail;
> -fs->used_bytes = used * fr_size;
> -fs->total_bytes = nonroot_total * fr_size;
> +if (statvfs(fs->mountpoint, &st) == 0) {
> +fs->total_bytes = st.f_blocks * st.f_frsize;
> +fs->free_bytes = st.f_bfree * st.f_frsize;
> +fs->avail_bytes = st.f_bavail * st.f_frsize;
>
>  fs->has_total_bytes = true;
> -fs->has_used_bytes = true;
> +fs->has_free_bytes = true;
> +fs->has_avail_bytes = true;
>  }
>
>  g_free(devpath);
> diff --git a/qga/qapi-schema.json b/qga/qapi-schema.json
> index b8efe31897..1cce3c1df5 100644
> --- a/qga/qapi-schema.json
> +++ b/qga/qapi-schema.json
> @@ -1030,9 +1030,12 @@
>  #
>  # @type: file system type string
>  #
> -# @used-bytes: file system used bytes (since 3.0)
> +# @total-bytes: total file system size in bytes (since 8.3)
>  #
> -# @total-bytes: non-root file system total bytes (since 3.0)
> +# @free-bytes: amount of free space in file system in bytes (since 8.3)
>

I don't agree with this as it breaks backward compatibility. If we want to
make these changes, we should release a new version with both the old and new
fields and mark the old ones as deprecated, to give everyone who uses this
API time to update their solutions.

A similar thing happened when replacing the 'blacklist' command line option:
https://gitlab.com/qemu-project/qemu/-/commit/582a098e6ca00dd42f317dad8affd13e5a20bc42
Currently, we support both the 'blacklist' and 'block-rpcs' command line
options, but the first one prints a warning.
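
As a rough sketch of what reporting both sets of values during a deprecation
period might look like, the arithmetic from the commit message can be written
out in standalone C. The struct and function names below are hypothetical and
are not taken from QGA; the four inputs stand for f_frsize, f_blocks, f_bfree
and f_bavail as returned by statvfs().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container; the field names only mirror this thread. */
struct fs_usage {
    uint64_t used_bytes;      /* old: (f_blocks - f_bfree) * f_frsize */
    uint64_t old_total_bytes; /* old: (used + f_bavail) * f_frsize */
    uint64_t total_bytes;     /* new: f_blocks * f_frsize */
    uint64_t free_bytes;      /* new: f_bfree * f_frsize */
    uint64_t avail_bytes;     /* new: f_bavail * f_frsize */
};

/* Derive the old and the new values from one set of statvfs() counts. */
static struct fs_usage fs_usage_from_counts(uint64_t frsize, uint64_t blocks,
                                            uint64_t bfree, uint64_t bavail)
{
    struct fs_usage u;
    uint64_t used;

    assert(bfree <= blocks);
    used = blocks - bfree;

    u.used_bytes = used * frsize;
    u.old_total_bytes = (used + bavail) * frsize;
    u.total_bytes = blocks * frsize;
    u.free_bytes = bfree * frsize;
    u.avail_bytes = bavail * frsize;
    return u;
}
```

Note that the old and new "total" values differ, so reusing the total-bytes
name with a new meaning is itself part of the compatibility problem.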

@Marc-André Lureau  @Philippe Mathieu-Daudé

What do you think about this?


> +#
> +# @avail-bytes: amount of free space in file system for unprivileged
> +# users in bytes (since 8.3)
>  #
>  # @disk: an array of disk hardware information that the volume lies
>  # on, which may be empty if the disk type is not supported
> @@ -1041,8 +1044,8 @@
>  ##
>  { 'struct': 'GuestFilesystemInfo',
>'data': {'name': 'str', 'mountpoint': 'str', 'type': 'str',
> -   '*used-bytes': 'uint64', '*total-bytes': 'uint64',
> -   'disk': ['GuestDiskAddress']} }
> +   '*total-bytes': 'uint64', '*free-bytes': 'uint64',
> +   '*avail-bytes': 'uint64', 'disk': ['GuestDiskAddress']} }
>
>  ##
>  # @guest-get-fsinfo:
> --
> 2.39.3
>
>
>


Re: [PATCH v5 2/3] virtio-iommu: Add a granule property

2024-02-26 Thread Philippe Mathieu-Daudé

On 26/2/24 19:11, Eric Auger wrote:

This allows to choose which granule will be used by
default by the virtio-iommu. Current page size mask
default is qemu_target_page_mask so this translates
into a 4K granule.

Signed-off-by: Eric Auger 

---
v4 -> v5:
- use -(n * KiB) (Phild)

v3 -> v4:
- granule_mode introduction moved to that patch
---
  include/hw/virtio/virtio-iommu.h |  2 ++
  hw/virtio/virtio-iommu.c | 28 +---
  qemu-options.hx  |  3 +++
  3 files changed, 30 insertions(+), 3 deletions(-)


Reviewed-by: Philippe Mathieu-Daudé 



Re: [RFC PATCH] tests/vm: avoid re-building the VM images all the time

2024-02-26 Thread Daniel P . Berrangé
On Mon, Feb 26, 2024 at 05:46:39PM +0000, Alex Bennée wrote:
> There are two problems.
> 
> The first is a .PHONY target will always evaluate which triggers a
> full re-build of the VM images. Drop the requirement knowing that this
> introduces a manual step on freshly configure build dirs.

For context, the background to this is:

   https://gitlab.com/qemu-project/qemu/-/issues/2118

dropping '$(VM_VENV)' is the fix for that bit, which is the real
killer bit.

> 
> The second is a minor unrelated tweak to the Makefile also triggers an
> expensive full re-build. Solve this be avoiding the dependency and
> putting a comment just above the bit that matters and hope developers
> notice the comment.
> 
> Signed-off-by: Alex Bennée 
> 
> ---
> 
> This is hacky and sub-optimal. There surely must be a way to have our cake
> and eat it?
> ---
>  tests/vm/Makefile.include | 7 ---
>  1 file changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/tests/vm/Makefile.include b/tests/vm/Makefile.include
> index bf12e0fa3c5..a109773c588 100644
> --- a/tests/vm/Makefile.include
> +++ b/tests/vm/Makefile.include
> @@ -88,10 +88,11 @@ vm-build-all: $(addprefix vm-build-, $(IMAGES))
>  vm-clean-all:
>   rm -f $(IMAGE_FILES)
>  
> +# Rebuilding the VMs every time this Makefile is tweaked is very
> +# expensive for most users. If you tweak the recipe bellow you will
> +# need to manually zap $(IMAGES_DIR)/%.img to rebuild.
>  $(IMAGES_DIR)/%.img: $(SRC_PATH)/tests/vm/% \
> - $(SRC_PATH)/tests/vm/basevm.py \
> - $(SRC_PATH)/tests/vm/Makefile.include \
> - $(VM_VENV)
> + $(SRC_PATH)/tests/vm/basevm.py
>   @mkdir -p $(IMAGES_DIR)
>   $(call quiet-command, \
>   $(VM_PYTHON) $< \
> -- 
> 2.39.2
> 
> 

With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [PATCH v12 00/10] ui/cocoa: Use NSWindow's ability to resize

2024-02-26 Thread Philippe Mathieu-Daudé

On 24/2/24 13:43, Akihiko Odaki wrote:


Akihiko Odaki (10):
   ui/cocoa: Split [-QemuCocoaView handleEventLocked:]
   ui/cocoa: Immediately call [-QemuCocoaView handleMouseEvent:buttons:]
   ui/cocoa: Release specific mouse buttons
   ui/cocoa: Scale with NSView instead of Core Graphics
   ui/cocoa: Fix pause label coordinates
   ui/cocoa: Let the platform toggle fullscreen
   ui/cocoa: Remove normalWindow
   ui/cocoa: Make window resizable
   ui/cocoa: Call console_select() with the BQL
   ui/cocoa: Remove stretch_video flag


Series queued, thanks for all the people involved over
the various iterations!



[PATCH v5 2/3] virtio-iommu: Add a granule property

2024-02-26 Thread Eric Auger
This allows to choose which granule will be used by
default by the virtio-iommu. Current page size mask
default is qemu_target_page_mask so this translates
into a 4K granule.

Signed-off-by: Eric Auger 

---
v4 -> v5:
- use -(n * KiB) (Phild)

v3 -> v4:
- granule_mode introduction moved to that patch
---
 include/hw/virtio/virtio-iommu.h |  2 ++
 hw/virtio/virtio-iommu.c | 28 +---
 qemu-options.hx  |  3 +++
 3 files changed, 30 insertions(+), 3 deletions(-)

diff --git a/include/hw/virtio/virtio-iommu.h b/include/hw/virtio/virtio-iommu.h
index 5fbe4677c2..f2785f7997 100644
--- a/include/hw/virtio/virtio-iommu.h
+++ b/include/hw/virtio/virtio-iommu.h
@@ -24,6 +24,7 @@
 #include "hw/virtio/virtio.h"
 #include "hw/pci/pci.h"
 #include "qom/object.h"
+#include "qapi/qapi-types-virtio.h"
 
 #define TYPE_VIRTIO_IOMMU "virtio-iommu-device"
 #define TYPE_VIRTIO_IOMMU_PCI "virtio-iommu-pci"
@@ -67,6 +68,7 @@ struct VirtIOIOMMU {
 Notifier machine_done;
 bool granule_frozen;
 uint8_t aw_bits;
+GranuleMode granule_mode;
 };
 
 #endif
diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
index 2ec5ef3cd1..33e0520bc8 100644
--- a/hw/virtio/virtio-iommu.c
+++ b/hw/virtio/virtio-iommu.c
@@ -29,6 +29,7 @@
 #include "sysemu/reset.h"
 #include "sysemu/sysemu.h"
 #include "qemu/reserved-region.h"
+#include "qemu/units.h"
 #include "qapi/error.h"
 #include "qemu/error-report.h"
 #include "trace.h"
@@ -1115,8 +1116,8 @@ static int 
virtio_iommu_notify_flag_changed(IOMMUMemoryRegion *iommu_mr,
 }
 
 /*
- * The default mask (TARGET_PAGE_MASK) is the smallest supported guest granule,
- * for example 0xfffffffffffff000. When an assigned device has page size
+ * The default mask depends on the "granule" property. For example, with
+ * 4K granule, it is -(4 * KiB). When an assigned device has page size
  * restrictions due to the hardware IOMMU configuration, apply this restriction
  * to the mask.
  */
@@ -1313,7 +1314,26 @@ static void virtio_iommu_device_realize(DeviceState 
*dev, Error **errp)
  * in vfio realize
  */
 s->config.bypass = s->boot_bypass;
-s->config.page_size_mask = qemu_target_page_mask();
+
+switch (s->granule_mode) {
+case GRANULE_MODE_4K:
+s->config.page_size_mask = -(4 * KiB);
+break;
+case GRANULE_MODE_8K:
+s->config.page_size_mask = -(8 * KiB);
+break;
+case GRANULE_MODE_16K:
+s->config.page_size_mask = -(16 * KiB);
+break;
+case GRANULE_MODE_64K:
+s->config.page_size_mask = -(64 * KiB);
+break;
+case GRANULE_MODE_HOST:
+s->config.page_size_mask = qemu_real_host_page_mask();
+break;
+default:
+error_setg(errp, "Unsupported granule mode");
+}
 if (s->aw_bits < 32 || s->aw_bits > 64) {
 error_setg(errp, "aw-bits must be within [32,64]");
 }
@@ -1527,6 +1547,8 @@ static Property virtio_iommu_properties[] = {
  TYPE_PCI_BUS, PCIBus *),
 DEFINE_PROP_BOOL("boot-bypass", VirtIOIOMMU, boot_bypass, true),
 DEFINE_PROP_UINT8("aw-bits", VirtIOIOMMU, aw_bits, 0),
+DEFINE_PROP_GRANULE_MODE("granule", VirtIOIOMMU, granule_mode,
+ GRANULE_MODE_4K),
 DEFINE_PROP_END_OF_LIST(),
 };
 
diff --git a/qemu-options.hx b/qemu-options.hx
index 3b670758b0..c7b43b67d5 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -1179,6 +1179,9 @@ SRST
 ``aw-bits=val`` (val between 32 and 64, default depends on machine)
 This decides the address width of IOVA address space. It defaults
 to 39 bits on q35 machines and 48 bits on ARM virt machines.
+``granule=val`` (possible values are 4K, 8K, 16K, 64K and host)
+This decides the default granule to be be exposed by the
+virtio-iommu. If host, the granule matches the host page size.
 
 ERST
 
-- 
2.41.0




[PATCH v5 3/3] virtio-iommu: Change the default granule to the host page size

2024-02-26 Thread Eric Auger
We used to set the default granule to 4KB but with VFIO assignment
it makes more sense to use the actual host page size.

Indeed when hotplugging a VFIO device protected by a virtio-iommu
on a 64kB/64kB host/guest config, we currently get a qemu crash:

"vfio: DMA mapping failed, unable to continue"

This is due to the hot-attached VFIO device calling
memory_region_iommu_set_page_size_mask() with 64kB granule
whereas the virtio-iommu granule was already frozen to 4KB on
machine init done.

Set the granule property to "host" and introduce a new compat.
The page size mask used before 9.0 was qemu_target_page_mask().
Since the virtio-iommu currently only supports x86_64 and aarch64,
this matched a 4KB granule.

Note that the new default will prevent 4kB guest on 64kB host
because the granule will be set to 64kB which would be larger
than the guest page size. In that situation, the virtio-iommu
driver fails on viommu_domain_finalise() with
"granule 0x10000 larger than system page size 0x1000".

In that case the workaround is to request 4K granule.

The current limitation of global granule in the virtio-iommu
should be removed and turned into per domain granule. But
until we get this upgraded, this new default is probably
better because I don't think anyone is currently interested in
running a 4KB page size guest with virtio-iommu on a 64KB host.
However supporting 64kB guest on 64kB host with virtio-iommu and
VFIO looks a more important feature.

Signed-off-by: Eric Auger 
Reviewed-by: Philippe Mathieu-Daudé 

---

v4 -> v5
- use lower case, mandated by the JSON QAPI
---
 hw/core/machine.c| 1 +
 hw/virtio/virtio-iommu.c | 2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/hw/core/machine.c b/hw/core/machine.c
index 70ac96954c..56f38b6579 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -35,6 +35,7 @@
 
 GlobalProperty hw_compat_8_2[] = {
 { TYPE_VIRTIO_IOMMU_PCI, "aw-bits", "64" },
+{ TYPE_VIRTIO_IOMMU_PCI, "granule", "4k" },
 };
 const size_t hw_compat_8_2_len = G_N_ELEMENTS(hw_compat_8_2);
 
diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
index 33e0520bc8..6831446e29 100644
--- a/hw/virtio/virtio-iommu.c
+++ b/hw/virtio/virtio-iommu.c
@@ -1548,7 +1548,7 @@ static Property virtio_iommu_properties[] = {
 DEFINE_PROP_BOOL("boot-bypass", VirtIOIOMMU, boot_bypass, true),
 DEFINE_PROP_UINT8("aw-bits", VirtIOIOMMU, aw_bits, 0),
 DEFINE_PROP_GRANULE_MODE("granule", VirtIOIOMMU, granule_mode,
- GRANULE_MODE_4K),
+ GRANULE_MODE_HOST),
 DEFINE_PROP_END_OF_LIST(),
 };
 
-- 
2.41.0




[PATCH v5 0/3] VIRTIO-IOMMU: Set default granule to host page size

2024-02-26 Thread Eric Auger
We used to set the default granule to 4kB but with VFIO assignment
it makes more sense to use the actual host page size.

Indeed when hotplugging a VFIO device protected by a virtio-iommu
on a 64kB/64kB host/guest config, we currently get a qemu crash:

"vfio: DMA mapping failed, unable to continue"

This is due to the hot-attached VFIO device calling
memory_region_iommu_set_page_size_mask() with 64kB granule
whereas the virtio-iommu granule was already frozen to 4kB on
machine init done.

Introduce a new granule property, set this latter to "host"
and introduce a new compat (that sets it to 4k for machine
types older than 9.0).

Note that the new default will prevent 4kB guest on 64kB host
because the granule will be set to 64kB which would be larger
than the guest page size. In that situation, the virtio-iommu
driver fails on viommu_domain_finalise() with
"granule 0x10000 larger than system page size 0x1000".

In that case 4K granule should be used.
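
To make the mask arithmetic concrete, here is a small standalone C sketch.
It is only a model under stated assumptions: KIB is a local stand-in for
QEMU's KiB, the function names are invented for this example, and the last
helper merely mimics the kind of check the guest driver is described as
performing above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define KIB 1024ULL /* local stand-in for QEMU's KiB */

/* Page size mask for a power-of-two granule given in KiB: -(n * KiB). */
static uint64_t granule_mask(uint64_t granule_kib)
{
    uint64_t bytes = granule_kib * KIB;

    assert(bytes != 0 && (bytes & (bytes - 1)) == 0);
    return -bytes;
}

/* Smallest granule permitted by a mask, i.e. its lowest set bit. */
static uint64_t mask_min_granule(uint64_t mask)
{
    return mask & -mask;
}

/* Hypothetical guest-side check: the granule must fit in a guest page. */
static bool granule_fits_guest(uint64_t mask, uint64_t guest_page_size)
{
    return mask_min_granule(mask) <= guest_page_size;
}
```

In this model a 64K granule with a 4K guest page size fails the check, while
a 4K granule works for both 4K and 64K guests.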

To summarize, before the series, the support matrix (credit
to Jean-Philippe Brucker) was:

 Host | Guest | virtio-net | IGB passthrough
  4k  | 4k    | Y          | Y
  64k | 64k   | Y          | N
  64k | 4k    | Y          | N
  4k  | 64k   | Y          | Y

After the series:

 Host | Guest | virtio-net | IGB passthrough
  4k  | 4k    | Y          | Y
  64k | 64k   | Y          | Y
  64k | 4k    | 4K         | N
  4k  | 64k   | Y          | Y

The current limitation of global granule in the virtio-iommu
should be removed and turned into per domain granule. But
until we get this upgraded, this new default is probably
better because I don't think anyone is currently interested in
running a 4KB page size guest with virtio-iommu on a 64KB host.
However supporting 64kB guest on 64kB host with virtio-iommu and
VFIO looks a more important feature.

This series can be found at:
https://github.com/eauger/qemu/tree/granule-v3

Applied on top of
[PATCH v5 0/4] VIRTIO-IOMMU: Introduce an aw-bits option
https://lore.kernel.org/all/20240215084315.863897-1-eric.au...@redhat.com/

History:
v4 -> v5:
- use -(n * KiB) (Philippe)
- remove code that can be automatically generated
  and add the new enum in qapi/virtio.json (Philippe).
- Improve commit msg on last patch and collected Philippe's R-b

v3 -> v4:
- Add 8K granule (Richard)

v2 -> v3
- introduce a dedicated granule option to handle the compat

Eric Auger (3):
  qdev: Add a granule_mode property
  virtio-iommu: Add a granule property
  virtio-iommu: Change the default granule to the host page size

 qapi/virtio.json| 18 ++
 include/hw/qdev-properties-system.h |  3 +++
 include/hw/virtio/virtio-iommu.h|  2 ++
 hw/core/machine.c   |  1 +
 hw/core/qdev-properties-system.c| 15 +++
 hw/virtio/virtio-iommu.c| 28 +---
 qemu-options.hx |  3 +++
 7 files changed, 67 insertions(+), 3 deletions(-)

-- 
2.41.0




[PATCH v5 1/3] qdev: Add a granule_mode property

2024-02-26 Thread Eric Auger
Introduce a new enum type property allowing to set an
IOMMU granule. Values are 4k, 8k, 16k, 64k and host.
This latter indicates the vIOMMU granule will match
the host page size.

A subsequent patch will add such a property to the
virtio-iommu device.

Signed-off-by: Eric Auger 
Signed-off-by: Philippe Mathieu-Daudé 

---
v4 -> v5
- remove code that can be automatically generated
  and add the new enum in qapi/virtio.json (Philippe).
  Added Phild's SOB. Lower case needs to be used due to
  the JSON generation.

v3 -> v4:
- Add 8K
---
 qapi/virtio.json| 18 ++
 include/hw/qdev-properties-system.h |  3 +++
 hw/core/qdev-properties-system.c| 15 +++
 3 files changed, 36 insertions(+)

diff --git a/qapi/virtio.json b/qapi/virtio.json
index a79013fe89..95745fdfd7 100644
--- a/qapi/virtio.json
+++ b/qapi/virtio.json
@@ -957,3 +957,21 @@
 
 { 'struct': 'DummyVirtioForceArrays',
   'data': { 'unused-iothread-vq-mapping': ['IOThreadVirtQueueMapping'] } }
+
+##
+# @GranuleMode:
+#
+# @4k: granule page size of 4KiB
+#
+# @8k: granule page size of 8KiB
+#
+# @16k: granule page size of 16KiB
+#
+# @64k: granule page size of 64KiB
+#
+# @host: granule matches the host page size
+#
+# Since: 9.0
+##
+{ 'enum': 'GranuleMode',
+  'data': [ '4k', '8k', '16k', '64k', 'host' ] }
diff --git a/include/hw/qdev-properties-system.h 
b/include/hw/qdev-properties-system.h
index 06c359c190..626be87dd3 100644
--- a/include/hw/qdev-properties-system.h
+++ b/include/hw/qdev-properties-system.h
@@ -8,6 +8,7 @@ extern const PropertyInfo qdev_prop_macaddr;
 extern const PropertyInfo qdev_prop_reserved_region;
 extern const PropertyInfo qdev_prop_multifd_compression;
 extern const PropertyInfo qdev_prop_mig_mode;
+extern const PropertyInfo qdev_prop_granule_mode;
 extern const PropertyInfo qdev_prop_losttickpolicy;
 extern const PropertyInfo qdev_prop_blockdev_on_error;
 extern const PropertyInfo qdev_prop_bios_chs_trans;
@@ -47,6 +48,8 @@ extern const PropertyInfo qdev_prop_iothread_vq_mapping_list;
 #define DEFINE_PROP_MIG_MODE(_n, _s, _f, _d) \
 DEFINE_PROP_SIGNED(_n, _s, _f, _d, qdev_prop_mig_mode, \
MigMode)
+#define DEFINE_PROP_GRANULE_MODE(_n, _s, _f, _d) \
+DEFINE_PROP_SIGNED(_n, _s, _f, _d, qdev_prop_granule_mode, GranuleMode)
 #define DEFINE_PROP_LOSTTICKPOLICY(_n, _s, _f, _d) \
 DEFINE_PROP_SIGNED(_n, _s, _f, _d, qdev_prop_losttickpolicy, \
 LostTickPolicy)
diff --git a/hw/core/qdev-properties-system.c b/hw/core/qdev-properties-system.c
index 1a396521d5..685cffd064 100644
--- a/hw/core/qdev-properties-system.c
+++ b/hw/core/qdev-properties-system.c
@@ -34,6 +34,7 @@
 #include "net/net.h"
 #include "hw/pci/pci.h"
 #include "hw/pci/pcie.h"
+#include "hw/virtio/virtio-iommu.h"
 #include "hw/i386/x86.h"
 #include "util/block-helpers.h"
 
@@ -679,6 +680,20 @@ const PropertyInfo qdev_prop_mig_mode = {
 .set_default_value = qdev_propinfo_set_default_value_enum,
 };
 
+/* --- GranuleMode --- */
+
+QEMU_BUILD_BUG_ON(sizeof(GranuleMode) != sizeof(int));
+
+const PropertyInfo qdev_prop_granule_mode = {
+.name = "GranuleMode",
+.description = "granule_mode values, "
+   "4k, 8k, 16k, 64k, host",
+.enum_table = &GranuleMode_lookup,
+.get = qdev_propinfo_get_enum,
+.set = qdev_propinfo_set_enum,
+.set_default_value = qdev_propinfo_set_default_value_enum,
+};
+
 /* --- Reserved Region --- */
 
 /*
-- 
2.41.0




Re: [RFC PATCH] tests/vm: avoid re-building the VM images all the time

2024-02-26 Thread Peter Maydell
On Mon, 26 Feb 2024 at 18:06, Alex Bennée  wrote:
>
> Alex Bennée  writes:
>
> > There are two problems.
> >
> > The first is a .PHONY target will always evaluate which triggers a
> > full re-build of the VM images. Drop the requirement knowing that this
> > introduces a manual step on freshly configure build dirs.
> >
> > The second is a minor unrelated tweak to the Makefile also triggers an
> > expensive full re-build. Solve this be avoiding the dependency and
> > putting a comment just above the bit that matters and hope developers
> > notice the comment.
> >
> > Signed-off-by: Alex Bennée 
> >
> > ---
> >
> > This is hacky and sub-optimal. There surely must be a way to have our cake
> > and eat it?
> > ---
> >  tests/vm/Makefile.include | 7 ---
> >  1 file changed, 4 insertions(+), 3 deletions(-)
> >
> > diff --git a/tests/vm/Makefile.include b/tests/vm/Makefile.include
> > index bf12e0fa3c5..a109773c588 100644
> > --- a/tests/vm/Makefile.include
> > +++ b/tests/vm/Makefile.include
> > @@ -88,10 +88,11 @@ vm-build-all: $(addprefix vm-build-, $(IMAGES))
> >  vm-clean-all:
> >   rm -f $(IMAGE_FILES)
> >
> > +# Rebuilding the VMs every time this Makefile is tweaked is very
> > +# expensive for most users. If you tweak the recipe bellow you will

"below".

But how many people edit tests/vm/Makefile.include ?
It had only 5 changes made to it last year. At that
frequency of changes I think I'd favour "always do the
right thing" over "require manual removal of the cached
image sometimes".

thanks
-- PMM



Re: [PATCH v4 09/10] hw/cxl/events: Add qmp interfaces to add/release dynamic capacity extents

2024-02-26 Thread Jonathan Cameron via
On Wed, 21 Feb 2024 10:16:02 -0800
nifan@gmail.com wrote:

> From: Fan Ni 
> 
> Since fabric manager emulation is not supported yet, the change implements
> the functions to add/release dynamic capacity extents as QMP interfaces.
> 
> Note: we skip any FM-issued extent release request if the exact extent
> does not exist in the extent list of the device. We will loosen the
> restriction later once we have partial release support in the kernel.
> 
> 1. Add dynamic capacity extents:
> 
> For example, the command to add two continuous extents (each 128MiB long)
> to region 0 (starting at DPA offset 0) looks like below:
> 
> { "execute": "qmp_capabilities" }
> 
> { "execute": "cxl-add-dynamic-capacity",
>   "arguments": {
>   "path": "/machine/peripheral/cxl-dcd0",
>   "region-id": 0,
>   "extents": [
>   {
>   "dpa": 0,
>   "len": 134217728
>   },
>   {
>   "dpa": 134217728,
>   "len": 134217728
>   }
>   ]
>   }
> }
> 
> 2. Release dynamic capacity extents:
> 
> For example, the command to release an extent of size 128MiB from region 0
> (DPA offset 128MiB) looks like below:
> 
> { "execute": "cxl-release-dynamic-capacity",
>   "arguments": {
>   "path": "/machine/peripheral/cxl-dcd0",
>   "region-id": 0,
>   "extents": [
>   {
>   "dpa": 134217728,
>   "len": 134217728
>   }
>   ]
>   }
> }
> 
> Signed-off-by: Fan Ni 
A few things inline. I don't understand one of the comments.

> ---


>  
> diff --git a/hw/mem/cxl_type3.c b/hw/mem/cxl_type3.c
> index f4edada303..b8c4273e99 100644
> --- a/hw/mem/cxl_type3.c
> +++ b/hw/mem/cxl_type3.c

> +/*
> + * Check whether the exact extent exists in the list
> + * Return value: the extent pointer in the list; else null
> + */
> +static CXLDCExtent *cxl_dc_extent_exists(CXLDCExtentList *list,
> +CXLDCExtentRaw *ext)
> +{
> +CXLDCExtent *ent;
> +
> +if (!ext || !list) {
> +return NULL;
> +}
> +
> +QTAILQ_FOREACH(ent, list, node) {
> +if (ent->start_dpa != ext->start_dpa) {
> +continue;
> +}
> +
> +/*Found exact extent*/
Spacing /* Found .. extent */

> +return ent->len == ext->len ? ent : NULL;
> +}
> +
> +return NULL;
> +}
> +
> +/*
> + * The main function to process dynamic capacity event. Currently DC extents
> + * add/release requests are processed.
> + */
> +static void qmp_cxl_process_dynamic_capacity(const char *path, CxlEventLog 
> log,
> + CXLDCEventType type, uint16_t 
> hid,
> + uint8_t rid,
> + CXLDCExtentRecordList *records,
> + Error **errp)
> +{
> +Object *obj;
> +CXLEventDynamicCapacity dCap = {};
> +CXLEventRecordHdr *hdr = &dCap.hdr;
> +CXLType3Dev *dcd;
> +uint8_t flags = 1 << CXL_EVENT_TYPE_INFO;
> +uint32_t num_extents = 0;
> +CXLDCExtentRecordList *list;
> +g_autofree CXLDCExtentRaw *extents = NULL;
> +uint8_t enc_log;
> +uint64_t offset, len, block_size;
> +int i;
> +int rc;
> +g_autofree unsigned long *blk_bitmap = NULL;
> +
> +obj = object_resolve_path(path, NULL);
> +if (!obj) {
> +error_setg(errp, "Unable to resolve path");
> +return;
> +}
> +if (!object_dynamic_cast(obj, TYPE_CXL_TYPE3)) {
> +error_setg(errp, "Path not point to a valid CXL type3 device");
> +return;
> +}
> +
> +dcd = CXL_TYPE3(obj);
> +if (!dcd->dc.num_regions) {
> +error_setg(errp, "No dynamic capacity support from the device");
> +return;
> +}
> +
> +rc = ct3d_qmp_cxl_event_log_enc(log);
> +if (rc < 0) {
> +error_setg(errp, "Unhandled error log type");
> +return;
> +}
> +enc_log = rc;
> +
> +if (rid >= dcd->dc.num_regions) {
> +error_setg(errp, "region id is too large");
> +return;
> +}
> +block_size = dcd->dc.regions[rid].block_size;
> +
> +/* Sanity check and count the extents */
> +list = records;
> +while (list) {
> +offset = list->value->offset;
> +len = list->value->len;
> +
> +if (len == 0) {
> +error_setg(errp, "extent with 0 length is not allowed");
> +return;
> +}
> +
> +if (offset % block_size || len % block_size) {
> +error_setg(errp, "dpa or len is not aligned to region block 
> size");
> +return;
> +}
> +
> +if (offset + len > dcd->dc.regions[rid].len) {
> +error_setg(errp, "extent range is beyond the region end");
> +return;
> +}
> +
> +num_extents++;
> +list = list->next;
> +}
> +if (num_extents == 0) {
> +error_setg(errp, "No extents found in the command");
> +return;
> +}
> +
> +blk_bitmap = 

Re: [PATCH v6 00/41] Raspberry Pi 4B machine

2024-02-26 Thread Peter Maydell
On Mon, 26 Feb 2024 at 00:04, Sergey Kambalin  wrote:
>
> Introducing Raspberry Pi 4B model.
> It contains new BCM2838 SoC, PCIE subsystem,
> RNG200, Thermal sensor and Genet network controller.
>
> It can work with recent Linux 6.x.x kernels.
> Two avocado tests were added to check that.
>
> Unit tests have been made as read/write operations
> via mailbox properties.
>
> Genet integration test is under development.
>
> Every single commit
> 1) builds without errors
> 2) passes regression tests
> 3) passes style check*
> *the only exception is bcm2838-mbox-property-test.c file
> containing heavy macros usage which cause a lot of
> false-positives of checkpatch.pl.
>
> I did my best to keep the commits less than 200 changes,
> but had to make some of them a bit more in order to
> keep their integrity.
>
>
> Sergey Kambalin (41):
>   Split out common part of BCM283X classes
>   Split out common part of peripherals
>   Split out raspi machine common part
>   Introduce BCM2838 SoC
>   Add GIC-400 to BCM2838 SoC
>   Add BCM2838 GPIO stub
>   Implement BCM2838 GPIO functionality

I've just noticed that the commit messages in this series
are missing the conventional prefix that indicates what part
of the codebase they apply to (hw/arm, hw/gpio, etc). I
propose to add those in on my end for the patches I'm taking
into target-arm.next.

I think the one question I have left is the name of the
board: currently it's "raspi4b-2g", but should we name
it just "raspi4b"? None of the names we use for the other
raspi boards we model have a suffix like the "-2g" here.
Philippe, do you have an opinion here ?

-- PMM



Re: [PATCH v2] test/qtest: Add API functions to capture IRQ toggling

2024-02-26 Thread Philippe Mathieu-Daudé

On 14/11/23 00:01, Gustavo Romero wrote:

Currently, the QTest API does not provide a function to capture when an
IRQ line is raised or lowered, although the QTest Protocol already
reports such IRQ transitions. As a consequence, it is also not possible
to capture when an IRQ line is toggled. Functions like qtest_get_irq()
only read the current state of the intercepted IRQ lines, which is
already high (or low) when the function is called if the IRQ line is
toggled. Therefore, these functions miss the IRQ line state transitions.

This commit introduces two new API functions:
qtest_get_irq_raised_counter() and qtest_get_irq_lowered_counter().
These functions allow capturing the number of times an observed IRQ line
transitioned from low to high state or from high to low state,
respectively.

When used together, these new API functions then allow checking if one
or more pulses were generated (indicating if the IRQ line was toggled).
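
The idea of counting transitions rather than sampling the level can be shown
with a small standalone C sketch. This is not the libqtest implementation:
the struct and function names are hypothetical and only model one intercepted
IRQ line.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-line bookkeeping; not the libqtest data structure. */
struct irq_line {
    bool level;       /* current level of the line */
    uint64_t raised;  /* number of low -> high transitions seen */
    uint64_t lowered; /* number of high -> low transitions seen */
};

/* Record a reported level; only real transitions bump a counter. */
static void irq_line_set(struct irq_line *irq, bool level)
{
    assert(irq != NULL);
    if (level == irq->level) {
        return;
    }
    if (level) {
        irq->raised++;
    } else {
        irq->lowered++;
    }
    irq->level = level;
}

/* Number of complete pulses: each needs one raise and one lower. */
static uint64_t irq_line_pulses(const struct irq_line *irq)
{
    assert(irq != NULL);
    return irq->raised < irq->lowered ? irq->raised : irq->lowered;
}
```

A line that goes high and back low before it is sampled still leaves
raised == 1 and lowered == 1 behind, which a level-only read would miss.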

Signed-off-by: Gustavo Romero 
---
  tests/qtest/libqtest.c | 24 
  tests/qtest/libqtest.h | 28 
  2 files changed, 52 insertions(+)


Sorry I totally forgot this patch :/

Reviewed-by: Philippe Mathieu-Daudé 
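
For readers following along, the counting idea in the commit message can be sketched roughly as below. This is only an illustration and not the code from the patch: the IrqEdgeTracker type and the irq_tracker_*() helpers are made-up names, and the sketch assumes the line starts low and that a repeated report of the same level is not an edge.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-line tracker; not a libqtest type. */
typedef struct IrqEdgeTracker {
    bool level;        /* last observed level of the line */
    uint64_t raised;   /* number of low -> high transitions seen */
    uint64_t lowered;  /* number of high -> low transitions seen */
} IrqEdgeTracker;

/* Record a newly reported level; only a real change counts as an edge. */
static void irq_tracker_update(IrqEdgeTracker *t, bool new_level)
{
    if (new_level == t->level) {
        return;
    }
    if (new_level) {
        t->raised++;
    } else {
        t->lowered++;
    }
    t->level = new_level;
}

/*
 * For a line that started low, every complete pulse contributes one
 * raise and one lower, so the smaller counter is the pulse count.
 */
static uint64_t irq_tracker_pulses(const IrqEdgeTracker *t)
{
    return t->raised < t->lowered ? t->raised : t->lowered;
}
```

A test would zero-initialise the tracker, feed it each reported level, and then compare the two counters, which is what reading the current level alone cannot provide.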




Re: [PATCH v4 08/10] hw/cxl/cxl-mailbox-utils: Add mailbox commands to support add/release dynamic capacity response

2024-02-26 Thread Jonathan Cameron via
On Wed, 21 Feb 2024 10:16:01 -0800
nifan@gmail.com wrote:

> From: Fan Ni 
> 
> Per CXL spec 3.1, two mailbox commands are implemented:
> Add Dynamic Capacity Response (Opcode 4802h) 8.2.9.9.9.3, and
> Release Dynamic Capacity (Opcode 4803h) 8.2.9.9.9.4.
> 
> Signed-off-by: Fan Ni 

Hi Fan, 

Comments on this are all about corner cases. If we can I think we need
to cover a few more.  Linux won't hit them (I think) so it will be
a bit of a pain to test but maybe raw commands enabled and some
userspace code will let us exercise the corner cases?

Jonathan



> +
> +/*
> + * CXL r3.1 section 8.2.9.9.9.4: Release Dynamic Capacity (opcode 4803h)
> + */
> +static CXLRetCode cmd_dcd_release_dyn_cap(const struct cxl_cmd *cmd,
> +  uint8_t *payload_in,
> +  size_t len_in,
> +  uint8_t *payload_out,
> +  size_t *len_out,
> +  CXLCCI *cci)
> +{
> +CXLUpdateDCExtentListInPl *in = (void *)payload_in;
> +CXLType3Dev *ct3d = CXL_TYPE3(cci->d);
> +CXLDCExtentList *extent_list = &ct3d->dc.extents;
> +CXLDCExtent *ent;
> +uint32_t i;
> +uint64_t dpa, len;
> +CXLRetCode ret;
> +
> +if (in->num_entries_updated == 0) {
> +return CXL_MBOX_INVALID_INPUT;
> +}
> +
> +ret = cxl_detect_malformed_extent_list(ct3d, in);
> +if (ret != CXL_MBOX_SUCCESS) {
> +return ret;
> +}
> +
> +for (i = 0; i < in->num_entries_updated; i++) {
> +bool found = false;
> +
> +dpa = in->updated_entries[i].start_dpa;
> +len = in->updated_entries[i].len;
> +
> +QTAILQ_FOREACH(ent, extent_list, node) {
> +if (ent->start_dpa <= dpa &&
> +dpa + len <= ent->start_dpa + ent->len) {
> +/*
> + * If an incoming extent covers a portion of an extent
> + * in the device extent list, remove only the overlapping
> + * portion, meaning
> + * 1. the portions that are not covered by the incoming
> + *extent at both end of the original extent will become
> + *new extents and inserted to the extent list; and
> + * 2. the original extent is removed from the extent list;
> + * 3. dc extent count is updated accordingly.
> + */
> +uint64_t ent_start_dpa = ent->start_dpa;
> +uint64_t ent_len = ent->len;
> +uint64_t len1 = dpa - ent_start_dpa;
> +uint64_t len2 = ent_start_dpa + ent_len - dpa - len;
> +
> +found = true;
> +cxl_remove_extent_from_extent_list(extent_list, ent);
> +ct3d->dc.total_extent_count -= 1;
> +
> +if (len1) {
> +cxl_insert_extent_to_extent_list(extent_list,
> + ent_start_dpa, len1,
> + NULL, 0);
> +ct3d->dc.total_extent_count += 1;
> +}
> +if (len2) {
> +cxl_insert_extent_to_extent_list(extent_list, dpa + len,
> + len2, NULL, 0);
> +ct3d->dc.total_extent_count += 1;

There is a non zero chance that we'll overflow however many extents we claim
to support. So we need to check that and fail the remove if it happens.
Could ignore this for now though as that value is (I think!) conservative
to allow for complex extent list tracking implementations.  Succeeding
when a naive solution would fail due to running out of extents that it can
manage is not (I think) a bug.
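
To make the overflow concern concrete, here is a rough standalone sketch of the split arithmetic plus a capacity check. It is not taken from the patch: ExtentSplit, extent_split() and extent_split_fits() are hypothetical names, max_count stands in for however many extents the device claims to support, and the sketch assumes dpa + len does not wrap and that cur_count is at least 1.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result of releasing a range from inside one extent. */
typedef struct ExtentSplit {
    uint64_t head_len;  /* bytes left before the released range */
    uint64_t tail_dpa;  /* start of the remainder after the range */
    uint64_t tail_len;  /* bytes left after the released range */
} ExtentSplit;

/*
 * Compute what is left of [ent_dpa, ent_dpa + ent_len) once
 * [dpa, dpa + len) is released. Returns false if the released range
 * is not fully contained in the extent.
 */
static bool extent_split(uint64_t ent_dpa, uint64_t ent_len,
                         uint64_t dpa, uint64_t len, ExtentSplit *out)
{
    if (dpa < ent_dpa || dpa + len > ent_dpa + ent_len) {
        return false;
    }
    out->head_len = dpa - ent_dpa;
    out->tail_dpa = dpa + len;
    out->tail_len = ent_dpa + ent_len - dpa - len;
    return true;
}

/*
 * One extent is removed and up to two are inserted, so the count can
 * grow by at most one. Check the result against the supported maximum
 * before touching the list.
 */
static bool extent_split_fits(const ExtentSplit *s, uint32_t cur_count,
                              uint32_t max_count)
{
    uint32_t after = cur_count - 1 + (s->head_len ? 1 : 0) +
                     (s->tail_len ? 1 : 0);

    return after <= max_count;
}
```

Doing the check before removing the original extent would let the command fail cleanly without having modified the list.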

> +}
> +break;
> +/*Currently we reject the attempt to remove a superset*/

Space after /* and before */

I think we need to fix this. Linux isn't going to do it any time soon, but
I think it's allowed to allocate two extents next to each other then free them
in one go.  Isn't this case easy to do or are there awkward corners?
If it's sufficiently nasty (maybe because only part of extent provided exists?)
then maybe we can leave it for now.

I worry about something like

|        EXTENT TO FREE       |
| Exists |   gap   | Exists   |
Where we have to check for gap before removing anything?
Does the spec address this? Not that I can find.
I think the implication is we have to do a validation pass, then a free
pass after we know whole of requested extent is valid.
Nasty to test if nothing else :(  Would look much like your check
on malformed extent lists.
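
Something like the following is roughly what I have in mind for the validation pass. It is a sketch only, under assumptions the patch does not guarantee: the extents are held in an array sorted by start address with no overlaps (the real code uses a QTAILQ), and Extent and range_fully_backed() are made-up names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flat view of one accepted extent. */
typedef struct Extent {
    uint64_t start;
    uint64_t len;
} Extent;

/*
 * Validation pass: return true only if every byte of
 * [dpa, dpa + len) is covered by the existing extents, i.e. there is
 * no gap. Assumes ext[] is sorted by start and non-overlapping.
 */
static bool range_fully_backed(const Extent *ext, size_t n,
                               uint64_t dpa, uint64_t len)
{
    uint64_t cursor = dpa;        /* first byte not yet proven covered */
    uint64_t end = dpa + len;

    for (size_t i = 0; i < n && cursor < end; i++) {
        uint64_t e_start = ext[i].start;
        uint64_t e_end = e_start + ext[i].len;

        if (e_end <= cursor) {
            continue;             /* extent lies entirely before cursor */
        }
        if (e_start > cursor) {
            return false;         /* gap between cursor and this extent */
        }
        cursor = e_end;           /* covered up to the end of this extent */
    }
    return cursor >= end;
}
```

Only once this returns true for every entry in the request would the free pass start removing or trimming extents, so a request that spans a gap leaves the list untouched.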


> +} else if ((dpa < ent->start_dpa + ent->len &&
> +dpa + len > ent->start_dpa + ent->len) ||
> +   (dpa < 

Re: [RFC PATCH] tests/vm: avoid re-building the VM images all the time

2024-02-26 Thread Alex Bennée
Alex Bennée  writes:

> There are two problems.
>
> The first is that a .PHONY target is always evaluated, which triggers a
> full re-build of the VM images. Drop the requirement knowing that this
> introduces a manual step on freshly configured build dirs.
>
> The second is that a minor unrelated tweak to the Makefile also triggers
> an expensive full re-build. Solve this by avoiding the dependency and
> putting a comment just above the bit that matters and hope developers
> notice the comment.
>
> Signed-off-by: Alex Bennée 
>
> ---
>
> This is hacky and sub-optimal. There surely must be a way to have our cake
> and eat it?
> ---
>  tests/vm/Makefile.include | 7 ---
>  1 file changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/tests/vm/Makefile.include b/tests/vm/Makefile.include
> index bf12e0fa3c5..a109773c588 100644
> --- a/tests/vm/Makefile.include
> +++ b/tests/vm/Makefile.include
> @@ -88,10 +88,11 @@ vm-build-all: $(addprefix vm-build-, $(IMAGES))
>  vm-clean-all:
>   rm -f $(IMAGE_FILES)
>  
> +# Rebuilding the VMs every time this Makefile is tweaked is very
> +# expensive for most users. If you tweak the recipe below you will
> +# need to manually zap $(IMAGES_DIR)/%.img to rebuild.
>  $(IMAGES_DIR)/%.img: $(SRC_PATH)/tests/vm/% \
> - $(SRC_PATH)/tests/vm/basevm.py \
> - $(SRC_PATH)/tests/vm/Makefile.include \
> - $(VM_VENV)
> + $(SRC_PATH)/tests/vm/basevm.py

Maybe:

 # need to manually zap $(IMAGES_DIR)/%.img to rebuild.
 $(IMAGES_DIR)/%.img:   $(SRC_PATH)/tests/vm/% \
$(SRC_PATH)/tests/vm/basevm.py
+   $(if $(VM_VENV), make $(VM_VENV))
@mkdir -p $(IMAGES_DIR)
$(call quiet-command, \

?


>   @mkdir -p $(IMAGES_DIR)
>   $(call quiet-command, \
>   $(VM_PYTHON) $< \

-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro



Re: [PATCH v4 07/10] hw/mem/cxl_type3: Add DC extent list representative and get DC extent list mailbox support

2024-02-26 Thread Jonathan Cameron via
On Wed, 21 Feb 2024 10:16:00 -0800
nifan@gmail.com wrote:

> From: Fan Ni 
> 
> Add a dynamic capacity extent list representation to the definition of
> CXLType3Dev and add the Get DC Extent List mailbox command per
> CXL r3.1 section 8.2.9.9.9.2.
> 
> Signed-off-by: Fan Ni 
Follow on from earlier comment on my preference for anonymous
structure types when we only use them in one place.


> +/*
> + * CXL r3.1 section 8.2.9.9.9.2:
> + * Get Dynamic Capacity Extent List (Opcode 4801h)
> + */
> +static CXLRetCode cmd_dcd_get_dyn_cap_ext_list(const struct cxl_cmd *cmd,
> +   uint8_t *payload_in,
> +   size_t len_in,
> +   uint8_t *payload_out,
> +   size_t *len_out,
> +   CXLCCI *cci)
> +{
> +CXLType3Dev *ct3d = CXL_TYPE3(cci->d);
> +struct get_dyn_cap_ext_list_in_pl {
> +uint32_t extent_cnt;
> +uint32_t start_extent_id;
> +} QEMU_PACKED;
> +
> +struct get_dyn_cap_ext_list_out_pl {
> +uint32_t count;
> +uint32_t total_extents;
> +uint32_t generation_num;
> +uint8_t rsvd[4];
> +CXLDCExtentRaw records[];
> +} QEMU_PACKED;
> +
> +struct get_dyn_cap_ext_list_in_pl *in = (void *)payload_in;
> +struct get_dyn_cap_ext_list_out_pl *out = (void *)payload_out;

As for earlier patches, I think anonymous struct types are fine for
these and lead to shorter code.
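
As a rough illustration of the shape I mean, with the struct type left unnamed because it is only used at this one spot. This is a standalone sketch rather than the QEMU code: parse_get_ext_list_in() is a made-up helper, __attribute__((packed)) assumes GCC or Clang and stands in for QEMU_PACKED, and it copies the payload with memcpy() to stay portable where the QEMU code would cast payload_in directly. No endianness conversion is done here.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: pull both input fields out of a raw payload
 * using an anonymous struct type declared at the point of use.
 */
static void parse_get_ext_list_in(const uint8_t *payload_in,
                                  uint32_t *extent_cnt,
                                  uint32_t *start_extent_id)
{
    struct {
        uint32_t extent_cnt;
        uint32_t start_extent_id;
    } __attribute__((packed)) in;

    /* Copy instead of casting so the sketch has no aliasing concerns. */
    memcpy(&in, payload_in, sizeof(in));

    *extent_cnt = in.extent_cnt;
    *start_extent_id = in.start_extent_id;
}
```

The layout is still visible right where the payload is decoded, and there is no extra type name to keep unique across the file.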

> +uint16_t record_count = 0, i = 0, record_done = 0;
> +CXLDCExtentList *extent_list = &ct3d->dc.extents;
> +CXLDCExtent *ent;
> +uint16_t out_pl_len;
> +uint32_t start_extent_id = in->start_extent_id;
> +
> +if (start_extent_id > ct3d->dc.total_extent_count) {
> +return CXL_MBOX_INVALID_INPUT;
> +}
> +
> +record_count = MIN(in->extent_cnt,
> +   ct3d->dc.total_extent_count - start_extent_id);
> +
> +out_pl_len = sizeof(*out) + record_count * sizeof(out->records[0]);
> +assert(out_pl_len <= CXL_MAILBOX_MAX_PAYLOAD_SIZE);
> +
> +stl_le_p(&out->count, record_count);
> +stl_le_p(&out->total_extents, ct3d->dc.total_extent_count);
> +stl_le_p(&out->generation_num, ct3d->dc.ext_list_gen_seq);
> +
> +if (record_count > 0) {
> +QTAILQ_FOREACH(ent, extent_list, node) {
> +if (i++ < start_extent_id) {
> +continue;
> +}
> +stq_le_p(&out->records[record_done].start_dpa, ent->start_dpa);
> +stq_le_p(&out->records[record_done].len, ent->len);
> +memcpy(&out->records[record_done].tag, ent->tag, 0x10);
> +stw_le_p(&out->records[record_done].shared_seq, ent->shared_seq);
> +record_done++;
> +if (record_done == record_count) {
> +break;
> +}
> +}
> +}
> +
> +*len_out = out_pl_len;
> +return CXL_MBOX_SUCCESS;
> +}
> +




[RFC PATCH] tests/vm: avoid re-building the VM images all the time

2024-02-26 Thread Alex Bennée
There are two problems.

The first is that a .PHONY target is always evaluated, which triggers a
full re-build of the VM images. Drop the requirement knowing that this
introduces a manual step on freshly configured build dirs.

The second is that a minor unrelated tweak to the Makefile also triggers
an expensive full re-build. Solve this by avoiding the dependency and
putting a comment just above the bit that matters and hope developers
notice the comment.

Signed-off-by: Alex Bennée 

---

This is hacky and sub-optimal. There surely must be a way to have our cake
and eat it?
---
 tests/vm/Makefile.include | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/tests/vm/Makefile.include b/tests/vm/Makefile.include
index bf12e0fa3c5..a109773c588 100644
--- a/tests/vm/Makefile.include
+++ b/tests/vm/Makefile.include
@@ -88,10 +88,11 @@ vm-build-all: $(addprefix vm-build-, $(IMAGES))
 vm-clean-all:
rm -f $(IMAGE_FILES)
 
+# Rebuilding the VMs every time this Makefile is tweaked is very
+# expensive for most users. If you tweak the recipe below you will
+# need to manually zap $(IMAGES_DIR)/%.img to rebuild.
 $(IMAGES_DIR)/%.img:   $(SRC_PATH)/tests/vm/% \
-   $(SRC_PATH)/tests/vm/basevm.py \
-   $(SRC_PATH)/tests/vm/Makefile.include \
-   $(VM_VENV)
+   $(SRC_PATH)/tests/vm/basevm.py
@mkdir -p $(IMAGES_DIR)
$(call quiet-command, \
$(VM_PYTHON) $< \
-- 
2.39.2




Re: [PATCH v4 06/10] hw/mem/cxl_type3: Add host backend and address space handling for DC regions

2024-02-26 Thread Jonathan Cameron via
On Wed, 21 Feb 2024 10:15:59 -0800
nifan@gmail.com wrote:

> From: Fan Ni 
> 
> Add (file/memory backed) host backend, all the dynamic capacity regions
> will share a single, large enough host backend. Set up address space for
> DC regions to support read/write operations to dynamic capacity for DCD.
> 
> With this change, the following support is added:
> 1. Add a new property to type3 device "volatile-dc-memdev" to point to host
>memory backend for dynamic capacity. Currently, all dc regions share
>one host backend.
> 2. Add namespace for dynamic capacity for read/write support;
> 3. Create cdat entries for each dynamic capacity region;
> 4. Fix dvsec range registers to include DC regions.
> 
> Signed-off-by: Fan Ni 

Only comment on this one from me is beware of FIXME wording for
features we haven't implemented yet.  Makes people think the code isn't
good to go, when in reality we may or may not care about implementing
that configurability in the future!

> diff --git a/hw/mem/cxl_type3.c b/hw/mem/cxl_type3.c
> index 6e5f908fb1..b966fa4f10 100644
> --- a/hw/mem/cxl_type3.c
> +++ b/hw/mem/cxl_type3.c

> +
> +if (dc_mr) {
> +int i;
> +uint64_t region_base = vmr_size + pmr_size;
> +
> +/* FIXME: Currently we assume the dynamic capacity to be volatile. */
As below.  TODO: Allow for non volatile dynamic capacity.

> +for (i = 0; i < ct3d->dc.num_regions; i++) {
> +ct3_build_cdat_entries_for_mr(&(table[cur_ent]),
> +dsmad_handle++,
> +ct3d->dc.regions[i].len,
> +false, true, region_base);
> +ct3d->dc.regions[i].dsmadhandle = dsmad_handle - 1;
> +
> +cur_ent += CT3_CDAT_NUM_ENTRIES;
> +region_base += ct3d->dc.regions[i].len;
> +}
> +}
> +



> +
> +/* FIXME: set dc as volatile for now */

Not sure it's a fixme, more of a TODO to add control of this later.
Fixme sounds broken, whereas it's a missing feature only.

> +memory_region_set_nonvolatile(dc_mr, false);
> +memory_region_set_enabled(dc_mr, true);
> +host_memory_backend_set_mapped(ct3d->dc.host_dc, true);
> +if (ds->id) {
> +dc_name = g_strdup_printf("cxl-dcd-dpa-dc-space:%s", ds->id);
> +} else {
> +dc_name = g_strdup("cxl-dcd-dpa-dc-space");
> +}
> +address_space_init(&ct3d->dc.host_dc_as, dc_mr, dc_name);
> +g_free(dc_name);
> +
> +if (!cxl_create_dc_regions(ct3d, errp)) {
> +error_setg(errp, "setup DC regions failed");
> +return false;
> +}
> 


