Re: [PATCH 1/1] Leaving Migration

2024-01-02 Thread Peter Xu
On Tue, Jan 02, 2024 at 09:19:07PM +0100, Juan Quintela wrote:
> I am leaving Red Hat, and as part of that I am leaving Migration
> maintainership.
> 
> You are left in good hands with Peter and Fabiano.
> 
> Thanks for all the fish.
> 
> Signed-off-by: Juan Quintela 

We lost two entries for migration in just a few days.  That's always sad.

Since I joined QEMU late, I don't know much about the past, but besides
all those migration patches I have already seen, you were also my first
room mate at my first KVM Forum visit.  I just need more English kung fu to
digest all the jokes, and I'm still trying to improve (even if not by
much. :-).

Thanks for all the work and help, and it's my honor to have worked with you
and also Dave. I wish QEMU had a CREDITS file, but I haven't found one
yet.  This will be in the next pull, no matter whether from Fabiano or myself.

Thanks,

-- 
Peter Xu




Re: [PATCH v6 01/11] linux-headers: Update to kernel headers to add venus capset

2024-01-02 Thread Huang Rui
On Tue, Jan 02, 2024 at 06:42:44PM +0800, Marc-André Lureau wrote:
> Hi
> 
> On Thu, Dec 21, 2023 at 10:55 AM Akihiko Odaki  
> wrote:
> >
> > On 2023/12/19 23:14, Peter Maydell wrote:
> > > On Tue, 19 Dec 2023 at 13:49, Huang Rui  wrote:
> > >>
> > >> On Tue, Dec 19, 2023 at 08:20:22PM +0800, Akihiko Odaki wrote:
> > >>> On 2023/12/19 16:53, Huang Rui wrote:
> >  Sync up the kernel headers to update the venus macros until they are
> >  merged into mainline.
> > >>>
> > >>> Thanks for sorting things out with the kernel and spec.
> > >>>
> > 
> >  Signed-off-by: Huang Rui 
> >  ---
> > 
> >  Changes in v6:
> >  - The venus capset has been applied in the kernel, so update it in QEMU
> >  for future use.
> > 
> >  https://lore.kernel.org/lkml/b79dcf75-c9e8-490e-644f-3b97d95f7...@collabora.com/
> >  https://cgit.freedesktop.org/drm-misc/commit/?id=216d86b9a430f3280e5b631c51e6fd1a7774cfa0
> > >>> Please include the link to the upstream commit in the commit message.
> > >>
> > >> So far, it's in the drm maintainers' branch, not in kernel mainline yet.
> > >> Do I need to wait for it to be merged into kernel mainline?
> > >
> > > For an RFC patchset, no. For patches to be merged into QEMU
> > > the headers change must be in the kernel mainline, and the
> > > QEMU commit that updates our copy of the headers must be a
> > > full-sync done with scripts/update-linux-headers.sh, not a
> > > manual edit.
> >
> > Apparently the kernel change is unlikely to be merged into mainline before
> > QEMU 9.0, so we need to come up with some idea to deal with this.

May I know when QEMU 9.0 will be released?

> >
> > The release of Linux 6.7 is near and the development of 6.8 will start
> > soon. So it would have been nice if the change had made it into 6.8, but
> > unfortunately it missed the *probably last* drm-misc tree pull request
> > for 6.8:
> > https://lore.kernel.org/all/aqpn5miejmkks7pbcfex7b6u63uwsruywxsnr3x5ljs45qatin@nbkkej2elk46/
> >
> > It will still get into linux-next, so we may retrieve headers from
> > linux-next. Or we may add the definition to
> > hw/display/virtio-gpu-virgl.c; the duplicate definition warning will
> > tell us when the definition is no longer necessary. I'm fine with
> > either option.
> 
> The second option seems better to me, as it can avoid pulling unwanted 
> changes.
> 

I will keep an eye on the dri-devel mailing list. And OK, I will add the
definition in virtio-gpu-virgl.c and remove it once the kernel patch is
merged into mainline.

Thanks,
Ray
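A minimal sketch of the local-definition fallback discussed above (the macro name and value here are assumptions taken from the pending kernel series, not the merged QEMU code; guarding with #ifndef keeps the build quiet once the synced headers provide the real definition):

```c
#include <assert.h>

/* Hypothetical fallback: carry the capset value locally until
 * scripts/update-linux-headers.sh picks it up from mainline.
 * The name and value are assumptions, not the in-tree code. */
#ifndef VIRTIO_GPU_CAPSET_VENUS
#define VIRTIO_GPU_CAPSET_VENUS 4
#endif

/* Example consumer of the fallback definition. */
static inline int capset_is_venus(unsigned int capset)
{
    return capset == VIRTIO_GPU_CAPSET_VENUS;
}
```

The alternative, defining the value unguarded, is what makes a later header sync flag the now-redundant local copy.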



Re: [PATCH for 9.0 08/12] vdpa: add vhost_vdpa_load_setup

2024-01-02 Thread Peter Xu
On Tue, Jan 02, 2024 at 12:28:48PM +0100, Eugenio Perez Martin wrote:
> On Tue, Jan 2, 2024 at 6:33 AM Peter Xu  wrote:
> >
> > Jason, Eugenio,
> >
> > Apologies for a late reply; just back from the long holiday.
> >
> > On Thu, Dec 21, 2023 at 09:20:40AM +0100, Eugenio Perez Martin wrote:
> > > Si-Wei did the actual profiling as he is the one with the 128G guests,
> > > but most of the time was spent in the memory pinning. Si-Wei, please
> > > correct me if I'm wrong.
> >
> > IIUC we're talking about the no-vIOMMU use case.  The pinning should
> > indeed take a lot of time if it's similar to what VFIO does.
> >
> > >
> > > I didn't check VFIO, but I think it just maps at the realize phase with
> > > vfio_realize -> vfio_attach_device -> vfio_connect_container(). In
> > > previous testing, this delayed VM initialization by a lot, as
> > > we're moving those 20s of blocking to every VM start.
> > >
> > > Investigating a way to do it only in the case of being the destination
> > > of a live migration, I think the right place is .load_setup migration
> > > handler. But I'm ok to move it for sure.
> >
> > If it's destined to map the 128G, it does sound sensible to me to do it
> > when VM starts, rather than anytime afterwards.
> >
> 
> Just for completeness: it is not 100% certain the driver will start the
> device, but it is very likely.

My understanding is that vDPA is still a quite special device, assumed to
target only advanced users, and it should not appear in a default config
for anyone.  It means the user should hopefully remove the device if the
guest is not using it, instead of worrying about a slow boot.

> 
> > Could anyone help explain what the problem is if vDPA maps 128G at VM
> > init just like VFIO does?
> >
> 
> The main problem was the delay of VM start. In the master branch, the
> pinning is done when the driver starts the device. While it takes the
> BQL, the rest of the vCPUs can move work forward while the host is
> pinning. So the impact of it is not so evident.
> 
> Moving it to initialization time made it very noticeable. To make
> things worse, QEMU did not respond to QMP commands and the like. That's
> why it was done only if the VM was the destination of a live migration.

Is that a major issue for us?  IIUC then VFIO shares the same condition.
If it's a real problem, do we want to have a solution that works for both
(or, is it possible)?

> 
> However, we've added the memory map thread in this version, so this
> might not be a problem anymore. We could move the spawn of the thread
> to initialization time.
> 
> But how to undo this pinning in the case the guest does not start the
> device? In this series, this is done at the destination with
> vhost_vdpa_load_cleanup. Or is it ok to just keep the memory mapped as
> long as QEMU has the vDPA device?

I think even if vDPA decides to use a thread, we should keep the same
behavior before/after the migration.  Having asymmetric behavior for DMA
from the assigned HWs might have unpredictable implications.

What I worry about is that we may over-optimize / over-engineer the case
where the user specifies the vDPA device but does not use it, as I
mentioned above.

For the long term, maybe there's a chance to optimize DMA pinning for both
vdpa/vfio use cases, then we can always pin them during VM start? Assuming
that issue only exists for large VMs, while they should normally be good
candidates for huge pages already.  Then, it means maybe one folio/page can
cover a large range (e.g. 1G on x86_64) in one pin, and physical contiguity
also opens the possibility of IOMMU large-page mappings.  I didn't check at
which stage we are for VFIO on this; Alex may know better. I'm copying Alex
anyway since the problem seems to be a common one already, so maybe he has
some thoughts.

Thanks,

-- 
Peter Xu




Re: [PATCH v6 01/11] linux-headers: Update to kernel headers to add venus capset

2024-01-02 Thread Huang Rui
On Tue, Dec 19, 2023 at 08:20:22PM +0800, Akihiko Odaki wrote:
> On 2023/12/19 16:53, Huang Rui wrote:
> > Sync up the kernel headers to update the venus macros until they are
> > merged into mainline.
> 
> Thanks for sorting things out with the kernel and spec.
> 
> > 
> > Signed-off-by: Huang Rui 
> > ---
> > 
> > Changes in v6:
> > - The venus capset has been applied in the kernel, so update it in QEMU for future use.
> > 
> > https://lore.kernel.org/lkml/b79dcf75-c9e8-490e-644f-3b97d95f7...@collabora.com/
> > https://cgit.freedesktop.org/drm-misc/commit/?id=216d86b9a430f3280e5b631c51e6fd1a7774cfa0
> Please include the link to the upstream commit in the commit message.

OK, I will add this info in qemu commit message.

Thanks,
Ray



Re: [PATCH v6 01/11] linux-headers: Update to kernel headers to add venus capset

2024-01-02 Thread Huang Rui
On Tue, Dec 19, 2023 at 10:14:28PM +0800, Peter Maydell wrote:
> On Tue, 19 Dec 2023 at 13:49, Huang Rui  wrote:
> >
> > On Tue, Dec 19, 2023 at 08:20:22PM +0800, Akihiko Odaki wrote:
> > > On 2023/12/19 16:53, Huang Rui wrote:
> > > > Sync up the kernel headers to update the venus macros until they are
> > > > merged into mainline.
> > >
> > > Thanks for sorting things out with the kernel and spec.
> > >
> > > >
> > > > Signed-off-by: Huang Rui 
> > > > ---
> > > >
> > > > Changes in v6:
> > > > - The venus capset has been applied in the kernel, so update it in QEMU
> > > > for future use.
> > > >
> > > > https://lore.kernel.org/lkml/b79dcf75-c9e8-490e-644f-3b97d95f7...@collabora.com/
> > > > https://cgit.freedesktop.org/drm-misc/commit/?id=216d86b9a430f3280e5b631c51e6fd1a7774cfa0
> > > Please include the link to the upstream commit in the commit message.
> >
> > So far, it's in the drm maintainers' branch, not in kernel mainline yet.
> > Do I need to wait for it to be merged into kernel mainline?
> 
> For an RFC patchset, no. For patches to be merged into QEMU
> the headers change must be in the kernel mainline, and the
> QEMU commit that updates our copy of the headers must be a
> full-sync done with scripts/update-linux-headers.sh, not a
> manual edit.
> 

Yes, following the comments on the previous series, I am using
update-linux-headers.sh to generate the patch. But here, the patch is not
merged into mainline yet.

Thanks,
Ray



Re: [PATCH v3 7/9] target/loongarch: Implement kvm_arch_handle_exit

2024-01-02 Thread gaosong

在 2023/12/28 下午4:40, Tianrui Zhao 写道:

Implement kvm_arch_handle_exit for loongarch. In this
function, the KVM_EXIT_LOONGARCH_IOCSR exit is handled:
we read or write the iocsr address space using the addr,
length and is_write arguments in kvm_run.

Signed-off-by: Tianrui Zhao 
Signed-off-by: xianglai li 
Reviewed-by: Richard Henderson 
---
  target/loongarch/kvm.c| 24 +++-
  target/loongarch/trace-events |  1 +
  2 files changed, 24 insertions(+), 1 deletion(-)

Reviewed-by: Song Gao 

Thanks.
Song Gao

diff --git a/target/loongarch/kvm.c b/target/loongarch/kvm.c
index 85e7aeb083..d2dab3fef4 100644
--- a/target/loongarch/kvm.c
+++ b/target/loongarch/kvm.c
@@ -723,7 +723,29 @@ bool kvm_arch_cpu_check_are_resettable(void)
  
  int kvm_arch_handle_exit(CPUState *cs, struct kvm_run *run)

  {
-return 0;
+int ret = 0;
+LoongArchCPU *cpu = LOONGARCH_CPU(cs);
+CPULoongArchState *env = &cpu->env;
+MemTxAttrs attrs = {};
+
+attrs.requester_id = env_cpu(env)->cpu_index;
+
+trace_kvm_arch_handle_exit(run->exit_reason);
+switch (run->exit_reason) {
+case KVM_EXIT_LOONGARCH_IOCSR:
+address_space_rw(&env->address_space_iocsr,
+ run->iocsr_io.phys_addr,
+ attrs,
+ run->iocsr_io.data,
+ run->iocsr_io.len,
+ run->iocsr_io.is_write);
+break;
+default:
+ret = -1;
+warn_report("KVM: unknown exit reason %d", run->exit_reason);
+break;
+}
+return ret;
  }
  
  void kvm_arch_accel_class_init(ObjectClass *oc)

diff --git a/target/loongarch/trace-events b/target/loongarch/trace-events
index 937c3c7c0c..021839880e 100644
--- a/target/loongarch/trace-events
+++ b/target/loongarch/trace-events
@@ -11,3 +11,4 @@ kvm_failed_get_counter(const char *msg) "Failed to get counter from KVM: %s"
  kvm_failed_put_counter(const char *msg) "Failed to put counter into KVM: %s"
  kvm_failed_get_cpucfg(const char *msg) "Failed to get cpucfg from KVM: %s"
  kvm_failed_put_cpucfg(const char *msg) "Failed to put cpucfg into KVM: %s"
+kvm_arch_handle_exit(int num) "kvm arch handle exit, the reason number: %d"





Re: [PATCH v3 4/9] target/loongarch: Implement kvm get/set registers

2024-01-02 Thread gaosong

在 2023/12/28 下午4:40, Tianrui Zhao 写道:

+static int kvm_loongarch_get_regs_fp(CPUState *cs)
+{
+int ret, i;
+struct kvm_fpu fpu;
+
+LoongArchCPU *cpu = LOONGARCH_CPU(cs);
+CPULoongArchState *env = &cpu->env;
+
+ret = kvm_vcpu_ioctl(cs, KVM_GET_FPU, &fpu);
+if (ret < 0) {
+trace_kvm_failed_get_fpu(strerror(errno));
+return ret;
+}
+
+env->fcsr0 = fpu.fcsr;
+for (i = 0; i < 32; i++) {
+env->fpr[i].vreg.UD[0] = fpu.fpr[i].val64[0];
+}
+for (i = 0; i < 8; i++) {
+env->cf[i] = fpu.fcc & 0xFF;
+fpu.fcc = fpu.fcc >> 8;
+}
+

Use write_fcc(env, fpu.fcc).

+return ret;
+}
+
+static int kvm_loongarch_put_regs_fp(CPUState *cs)
+{
+int ret, i;
+struct kvm_fpu fpu;
+
+LoongArchCPU *cpu = LOONGARCH_CPU(cs);
+CPULoongArchState *env = &cpu->env;
+
+fpu.fcsr = env->fcsr0;
+fpu.fcc = 0;
+for (i = 0; i < 32; i++) {
+fpu.fpr[i].val64[0] = env->fpr[i].vreg.UD[0];
+}
+
+for (i = 0; i < 8; i++) {
+fpu.fcc |= env->cf[i] << (8 * i);
+}
+

Use fpu.fcc = read_fcc(env).

+ret = kvm_vcpu_ioctl(cs, KVM_SET_FPU, &fpu);
+if (ret < 0) {
+trace_kvm_failed_put_fpu(strerror(errno));
+}
+
+return ret;
+}
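The suggested read_fcc()/write_fcc() helpers replace the open-coded loops above. A standalone sketch of the pack/unpack equivalence (names are illustrative, not the in-tree helpers, which operate on the env struct directly):

```c
#include <assert.h>
#include <stdint.h>

/* Pack the eight 1-byte condition flags into one u64, lowest flag in
 * the lowest byte, matching the put path's open-coded loop above. */
static uint64_t pack_fcc(const uint8_t cf[8])
{
    uint64_t fcc = 0;
    for (int i = 0; i < 8; i++) {
        fcc |= (uint64_t)cf[i] << (8 * i);
    }
    return fcc;
}

/* Inverse: split the u64 back into eight flags, as the get path does. */
static void unpack_fcc(uint64_t fcc, uint8_t cf[8])
{
    for (int i = 0; i < 8; i++) {
        cf[i] = fcc & 0xFF;
        fcc >>= 8;
    }
}
```

Centralizing this in read_fcc()/write_fcc() keeps the KVM and TCG paths byte-order consistent.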





Re: [PATCH v3 6/9] target/loongarch: Implement kvm_arch_init_vcpu

2024-01-02 Thread gaosong

在 2023/12/28 下午4:40, Tianrui Zhao 写道:

Implement the kvm_arch_init_vcpu interface for loongarch;
in this function, we register a VM change state handler.
When the VM state changes to running, the counter value
is put into KVM to stay consistent with KVM, and when the
state changes to stop, the counter value is refreshed
from KVM.

Signed-off-by: Tianrui Zhao 
Signed-off-by: xianglai li 
---
  target/loongarch/cpu.h|  2 ++
  target/loongarch/kvm.c| 23 +++
  target/loongarch/trace-events |  2 ++
  3 files changed, 27 insertions(+)

Reviewed-by: Song Gao 

Thanks.
Song Gao

diff --git a/target/loongarch/cpu.h b/target/loongarch/cpu.h
index f4a89bd626..8ebd6fa1a7 100644
--- a/target/loongarch/cpu.h
+++ b/target/loongarch/cpu.h
@@ -381,6 +381,8 @@ struct ArchCPU {
  
  /* 'compatible' string for this CPU for Linux device trees */

  const char *dtb_compatible;
+/* used by KVM_REG_LOONGARCH_COUNTER ioctl to access guest time counters */
+uint64_t kvm_state_counter;
  };
  
  /**

diff --git a/target/loongarch/kvm.c b/target/loongarch/kvm.c
index 29944b9ef8..85e7aeb083 100644
--- a/target/loongarch/kvm.c
+++ b/target/loongarch/kvm.c
@@ -617,8 +617,31 @@ int kvm_arch_put_registers(CPUState *cs, int level)
  return ret;
  }
  
+static void kvm_loongarch_vm_stage_change(void *opaque, bool running,

+  RunState state)
+{
+int ret;
+CPUState *cs = opaque;
+LoongArchCPU *cpu = LOONGARCH_CPU(cs);
+
+if (running) {
+ret = kvm_set_one_reg(cs, KVM_REG_LOONGARCH_COUNTER,
+  &cpu->kvm_state_counter);
+if (ret < 0) {
+trace_kvm_failed_put_counter(strerror(errno));
+}
+} else {
+ret = kvm_get_one_reg(cs, KVM_REG_LOONGARCH_COUNTER,
+  &cpu->kvm_state_counter);
+if (ret < 0) {
+trace_kvm_failed_get_counter(strerror(errno));
+}
+}
+}
+
  int kvm_arch_init_vcpu(CPUState *cs)
  {
+qemu_add_vm_change_state_handler(kvm_loongarch_vm_stage_change, cs);
  return 0;
  }
  
diff --git a/target/loongarch/trace-events b/target/loongarch/trace-events

index 6827ab566a..937c3c7c0c 100644
--- a/target/loongarch/trace-events
+++ b/target/loongarch/trace-events
@@ -7,5 +7,7 @@ kvm_failed_get_fpu(const char *msg) "Failed to get fpu from KVM: %s"
  kvm_failed_put_fpu(const char *msg) "Failed to put fpu into KVM: %s"
  kvm_failed_get_mpstate(const char *msg) "Failed to get mp_state from KVM: %s"
  kvm_failed_put_mpstate(const char *msg) "Failed to put mp_state into KVM: %s"
+kvm_failed_get_counter(const char *msg) "Failed to get counter from KVM: %s"
+kvm_failed_put_counter(const char *msg) "Failed to put counter into KVM: %s"
  kvm_failed_get_cpucfg(const char *msg) "Failed to get cpucfg from KVM: %s"
  kvm_failed_put_cpucfg(const char *msg) "Failed to put cpucfg into KVM: %s"





Re: [PATCH v3 5/9] target/loongarch: Implement kvm_arch_init function

2024-01-02 Thread gaosong

在 2023/12/28 下午4:40, Tianrui Zhao 写道:

Implement the kvm_arch_init function for loongarch; in this function,
the KVM_CAP_MP_STATE capability is checked with a kvm ioctl.

Signed-off-by: Tianrui Zhao 
Signed-off-by: xianglai li 
Reviewed-by: Richard Henderson 
---
  target/loongarch/kvm.c | 1 +
  1 file changed, 1 insertion(+)

Reviewed-by: Song Gao 

Thanks.
Song Gao

diff --git a/target/loongarch/kvm.c b/target/loongarch/kvm.c
index e7c9ef830c..29944b9ef8 100644
--- a/target/loongarch/kvm.c
+++ b/target/loongarch/kvm.c
@@ -665,6 +665,7 @@ int kvm_arch_get_default_type(MachineState *ms)
  
  int kvm_arch_init(MachineState *ms, KVMState *s)

  {
+cap_has_mp_state = kvm_check_extension(s, KVM_CAP_MP_STATE);
  return 0;
  }
  





Re: [PATCH v3 4/9] target/loongarch: Implement kvm get/set registers

2024-01-02 Thread gaosong

在 2023/12/28 下午4:40, Tianrui Zhao 写道:

Implement the kvm_arch_get/set_registers interfaces; many registers
can be read and written through these functions, such as core regs,
csr regs, fpu regs, mp_state, etc.

Signed-off-by: Tianrui Zhao 
Signed-off-by: xianglai li 
---
  meson.build   |   1 +
  target/loongarch/cpu.c|   3 +
  target/loongarch/cpu.h|   1 +
  target/loongarch/fpu_helper.c |   2 +
  target/loongarch/internals.h  |   5 +-
  target/loongarch/kvm.c| 580 +-
  target/loongarch/trace-events |  11 +
  target/loongarch/trace.h  |   1 +
  8 files changed, 601 insertions(+), 3 deletions(-)
  create mode 100644 target/loongarch/trace-events
  create mode 100644 target/loongarch/trace.h

Reviewed-by: Song Gao 

Thanks.
Song Gao

diff --git a/meson.build b/meson.build
index 6c77d9687d..445f2b7c2b 100644
--- a/meson.build
+++ b/meson.build
@@ -3358,6 +3358,7 @@ if have_system or have_user
  'target/hppa',
  'target/i386',
  'target/i386/kvm',
+'target/loongarch',
  'target/mips/tcg',
  'target/nios2',
  'target/ppc',
diff --git a/target/loongarch/cpu.c b/target/loongarch/cpu.c
index 4432a0081d..83899c673f 100644
--- a/target/loongarch/cpu.c
+++ b/target/loongarch/cpu.c
@@ -555,6 +555,9 @@ static void loongarch_cpu_reset_hold(Object *obj)
  #ifndef CONFIG_USER_ONLY
  env->pc = 0x1c00;
  memset(env->tlb, 0, sizeof(env->tlb));
+if (kvm_enabled()) {
+kvm_arch_reset_vcpu(env);
+}
  #endif
  
  restore_fp_status(env);

diff --git a/target/loongarch/cpu.h b/target/loongarch/cpu.h
index f6d5ef0852..f4a89bd626 100644
--- a/target/loongarch/cpu.h
+++ b/target/loongarch/cpu.h
@@ -360,6 +360,7 @@ typedef struct CPUArchState {
  MemoryRegion iocsr_mem;
  bool load_elf;
  uint64_t elf_address;
+uint32_t mp_state;
  /* Store ipistate to access from this struct */
  DeviceState *ipistate;
  #endif
diff --git a/target/loongarch/fpu_helper.c b/target/loongarch/fpu_helper.c
index f6753c5875..1b71ea46d3 100644
--- a/target/loongarch/fpu_helper.c
+++ b/target/loongarch/fpu_helper.c
@@ -9,7 +9,9 @@
  #include "cpu.h"
  #include "exec/helper-proto.h"
  #include "exec/exec-all.h"
+#ifdef CONFIG_TCG
  #include "exec/cpu_ldst.h"
+#endif
  #include "fpu/softfloat.h"
  #include "internals.h"
  
diff --git a/target/loongarch/internals.h b/target/loongarch/internals.h

index c492863cc5..0beb034748 100644
--- a/target/loongarch/internals.h
+++ b/target/loongarch/internals.h
@@ -31,8 +31,10 @@ void G_NORETURN do_raise_exception(CPULoongArchState *env,
  
  const char *loongarch_exception_name(int32_t exception);
  
+#ifdef CONFIG_TCG

  int ieee_ex_to_loongarch(int xcpt);
  void restore_fp_status(CPULoongArchState *env);
+#endif
  
  #ifndef CONFIG_USER_ONLY

  extern const VMStateDescription vmstate_loongarch_cpu;
@@ -44,12 +46,13 @@ uint64_t cpu_loongarch_get_constant_timer_counter(LoongArchCPU *cpu);
  uint64_t cpu_loongarch_get_constant_timer_ticks(LoongArchCPU *cpu);
  void cpu_loongarch_store_constant_timer_config(LoongArchCPU *cpu,
 uint64_t value);
-
+#ifdef CONFIG_TCG
  bool loongarch_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
  MMUAccessType access_type, int mmu_idx,
  bool probe, uintptr_t retaddr);
  
  hwaddr loongarch_cpu_get_phys_page_debug(CPUState *cpu, vaddr addr);

+#endif
  #endif /* !CONFIG_USER_ONLY */
  
  uint64_t read_fcc(CPULoongArchState *env);

diff --git a/target/loongarch/kvm.c b/target/loongarch/kvm.c
index 0d67322fd9..e7c9ef830c 100644
--- a/target/loongarch/kvm.c
+++ b/target/loongarch/kvm.c
@@ -26,19 +26,595 @@
  #include "sysemu/runstate.h"
  #include "cpu-csr.h"
  #include "kvm_loongarch.h"
+#include "trace.h"
  
  static bool cap_has_mp_state;

  const KVMCapabilityInfo kvm_arch_required_capabilities[] = {
  KVM_CAP_LAST_INFO
  };
  
+static int kvm_loongarch_get_regs_core(CPUState *cs)

+{
+int ret = 0;
+int i;
+struct kvm_regs regs;
+LoongArchCPU *cpu = LOONGARCH_CPU(cs);
+CPULoongArchState *env = &cpu->env;
+
+/* Get the current register set as KVM sees it */
+ret = kvm_vcpu_ioctl(cs, KVM_GET_REGS, &regs);
+if (ret < 0) {
+trace_kvm_failed_get_regs_core(strerror(errno));
+return ret;
+}
+/* gpr[0] value is always 0 */
+env->gpr[0] = 0;
+for (i = 1; i < 32; i++) {
+env->gpr[i] = regs.gpr[i];
+}
+
+env->pc = regs.pc;
+return ret;
+}
+
+static int kvm_loongarch_put_regs_core(CPUState *cs)
+{
+int ret = 0;
+int i;
+struct kvm_regs regs;
+LoongArchCPU *cpu = LOONGARCH_CPU(cs);
+CPULoongArchState *env = &cpu->env;
+
+/* Set the registers based on QEMU's view of things */
+for (i = 0; i < 32; i++) {
+regs.gpr[i] = env->gpr[i];
+}
+
+regs.pc = env->pc;
+ret = kvm_vcpu_ioctl(cs, KVM_SET_REGS, &regs);
+if (ret < 0) {
+

Re: [PATCH v3 2/9] target/loongarch: Define some kvm_arch interfaces

2024-01-02 Thread gaosong

在 2023/12/28 下午4:40, Tianrui Zhao 写道:

Define some functions in target/loongarch/kvm.c, such as
kvm_arch_put_registers, kvm_arch_get_registers and
kvm_arch_handle_exit, etc., which are needed by kvm/kvm-all.c.
For now most of the functions have no content; they will be
implemented in the next patches.

Signed-off-by: Tianrui Zhao 
Signed-off-by: xianglai li 
Reviewed-by: Richard Henderson 
---
  target/loongarch/kvm.c | 131 +
  1 file changed, 131 insertions(+)
  create mode 100644 target/loongarch/kvm.c

Reviewed-by: Song Gao 

Thanks.
Song Gao

diff --git a/target/loongarch/kvm.c b/target/loongarch/kvm.c
new file mode 100644
index 00..0d67322fd9
--- /dev/null
+++ b/target/loongarch/kvm.c
@@ -0,0 +1,131 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/*
+ * QEMU LoongArch KVM
+ *
+ * Copyright (c) 2023 Loongson Technology Corporation Limited
+ */
+
+#include "qemu/osdep.h"
+#include <sys/ioctl.h>
+#include <linux/kvm.h>
+
+#include "qemu/timer.h"
+#include "qemu/error-report.h"
+#include "qemu/main-loop.h"
+#include "sysemu/sysemu.h"
+#include "sysemu/kvm.h"
+#include "sysemu/kvm_int.h"
+#include "hw/pci/pci.h"
+#include "exec/memattrs.h"
+#include "exec/address-spaces.h"
+#include "hw/boards.h"
+#include "hw/irq.h"
+#include "qemu/log.h"
+#include "hw/loader.h"
+#include "migration/migration.h"
+#include "sysemu/runstate.h"
+#include "cpu-csr.h"
+#include "kvm_loongarch.h"
+
+static bool cap_has_mp_state;
+const KVMCapabilityInfo kvm_arch_required_capabilities[] = {
+KVM_CAP_LAST_INFO
+};
+
+int kvm_arch_get_registers(CPUState *cs)
+{
+return 0;
+}
+int kvm_arch_put_registers(CPUState *cs, int level)
+{
+return 0;
+}
+
+int kvm_arch_init_vcpu(CPUState *cs)
+{
+return 0;
+}
+
+int kvm_arch_destroy_vcpu(CPUState *cs)
+{
+return 0;
+}
+
+unsigned long kvm_arch_vcpu_id(CPUState *cs)
+{
+return cs->cpu_index;
+}
+
+int kvm_arch_release_virq_post(int virq)
+{
+return 0;
+}
+
+int kvm_arch_msi_data_to_gsi(uint32_t data)
+{
+abort();
+}
+
+int kvm_arch_fixup_msi_route(struct kvm_irq_routing_entry *route,
+ uint64_t address, uint32_t data, PCIDevice *dev)
+{
+return 0;
+}
+
+int kvm_arch_add_msi_route_post(struct kvm_irq_routing_entry *route,
+int vector, PCIDevice *dev)
+{
+return 0;
+}
+
+void kvm_arch_init_irq_routing(KVMState *s)
+{
+}
+
+int kvm_arch_get_default_type(MachineState *ms)
+{
+return 0;
+}
+
+int kvm_arch_init(MachineState *ms, KVMState *s)
+{
+return 0;
+}
+
+int kvm_arch_irqchip_create(KVMState *s)
+{
+return 0;
+}
+
+void kvm_arch_pre_run(CPUState *cs, struct kvm_run *run)
+{
+}
+
+MemTxAttrs kvm_arch_post_run(CPUState *cs, struct kvm_run *run)
+{
+return MEMTXATTRS_UNSPECIFIED;
+}
+
+int kvm_arch_process_async_events(CPUState *cs)
+{
+return cs->halted;
+}
+
+bool kvm_arch_stop_on_emulation_error(CPUState *cs)
+{
+return true;
+}
+
+bool kvm_arch_cpu_check_are_resettable(void)
+{
+return true;
+}
+
+int kvm_arch_handle_exit(CPUState *cs, struct kvm_run *run)
+{
+return 0;
+}
+
+void kvm_arch_accel_class_init(ObjectClass *oc)
+{
+}





Re: [PATCH v3 3/9] target/loongarch: Supplement vcpu env initial when vcpu reset

2024-01-02 Thread gaosong

在 2023/12/28 下午4:40, Tianrui Zhao 写道:

Supplement the vcpu env initialization at vcpu reset, including
initializing the vcpu CSR_CPUID and CSR_TID to cs->cpu_index. The two
regs will be used in the kvm_get/set_csr ioctls.

Signed-off-by: Tianrui Zhao 
Signed-off-by: xianglai li 
---
  target/loongarch/cpu.c | 2 ++
  target/loongarch/cpu.h | 2 +-
  2 files changed, 3 insertions(+), 1 deletion(-)

Reviewed-by: Song Gao 

Thanks.
Song Gao

diff --git a/target/loongarch/cpu.c b/target/loongarch/cpu.c
index b26187dfde..4432a0081d 100644
--- a/target/loongarch/cpu.c
+++ b/target/loongarch/cpu.c
@@ -533,10 +533,12 @@ static void loongarch_cpu_reset_hold(Object *obj)
  
  env->CSR_ESTAT = env->CSR_ESTAT & (~MAKE_64BIT_MASK(0, 2));

  env->CSR_RVACFG = FIELD_DP64(env->CSR_RVACFG, CSR_RVACFG, RBITS, 0);
+env->CSR_CPUID = cs->cpu_index;
  env->CSR_TCFG = FIELD_DP64(env->CSR_TCFG, CSR_TCFG, EN, 0);
  env->CSR_LLBCTL = FIELD_DP64(env->CSR_LLBCTL, CSR_LLBCTL, KLO, 0);
  env->CSR_TLBRERA = FIELD_DP64(env->CSR_TLBRERA, CSR_TLBRERA, ISTLBR, 0);
  env->CSR_MERRCTL = FIELD_DP64(env->CSR_MERRCTL, CSR_MERRCTL, ISMERR, 0);
+env->CSR_TID = cs->cpu_index;
  
  env->CSR_PRCFG3 = FIELD_DP64(env->CSR_PRCFG3, CSR_PRCFG3, TLB_TYPE, 2);

  env->CSR_PRCFG3 = FIELD_DP64(env->CSR_PRCFG3, CSR_PRCFG3, MTLB_ENTRY, 63);
diff --git a/target/loongarch/cpu.h b/target/loongarch/cpu.h
index 00d1fba597..f6d5ef0852 100644
--- a/target/loongarch/cpu.h
+++ b/target/loongarch/cpu.h
@@ -319,6 +319,7 @@ typedef struct CPUArchState {
  uint64_t CSR_PWCH;
  uint64_t CSR_STLBPS;
  uint64_t CSR_RVACFG;
+uint64_t CSR_CPUID;
  uint64_t CSR_PRCFG1;
  uint64_t CSR_PRCFG2;
  uint64_t CSR_PRCFG3;
@@ -350,7 +351,6 @@ typedef struct CPUArchState {
  uint64_t CSR_DBG;
  uint64_t CSR_DERA;
  uint64_t CSR_DSAVE;
-uint64_t CSR_CPUID;
  
  #ifndef CONFIG_USER_ONLY

  LoongArchTLB  tlb[LOONGARCH_TLB_MAX];





Re: [PATCH v3 1/9] linux-headers: Synchronize linux headers from linux v6.7.0-rc7

2024-01-02 Thread gaosong

在 2023/12/28 下午4:40, Tianrui Zhao 写道:

Use scripts/update-linux-headers.sh to synchronize the linux
headers from linux v6.7.0-rc7. We mainly want to add the
loongarch linux headers and then add loongarch kvm support
based on them.

Signed-off-by: Tianrui Zhao 
---
  include/standard-headers/drm/drm_fourcc.h |   2 +
  include/standard-headers/linux/fuse.h |  10 +-
  include/standard-headers/linux/pci_regs.h |  24 ++-
  include/standard-headers/linux/vhost_types.h  |   7 +
  .../standard-headers/linux/virtio_config.h|   5 +
  include/standard-headers/linux/virtio_pci.h   |  11 ++
  linux-headers/asm-arm64/kvm.h |  32 
  linux-headers/asm-generic/unistd.h|  14 +-
  linux-headers/asm-loongarch/bitsperlong.h |   1 +
  linux-headers/asm-loongarch/kvm.h | 108 +++
  linux-headers/asm-loongarch/mman.h|   1 +
  linux-headers/asm-loongarch/unistd.h  |   5 +
  linux-headers/asm-mips/unistd_n32.h   |   4 +
  linux-headers/asm-mips/unistd_n64.h   |   4 +
  linux-headers/asm-mips/unistd_o32.h   |   4 +
  linux-headers/asm-powerpc/unistd_32.h |   4 +
  linux-headers/asm-powerpc/unistd_64.h |   4 +
  linux-headers/asm-riscv/kvm.h |  12 ++
  linux-headers/asm-s390/unistd_32.h|   4 +
  linux-headers/asm-s390/unistd_64.h|   4 +
  linux-headers/asm-x86/unistd_32.h |   4 +
  linux-headers/asm-x86/unistd_64.h |   3 +
  linux-headers/asm-x86/unistd_x32.h|   3 +
  linux-headers/linux/iommufd.h | 180 +-
  linux-headers/linux/kvm.h |  11 ++
  linux-headers/linux/psp-sev.h |   1 +
  linux-headers/linux/stddef.h  |   9 +-
  linux-headers/linux/userfaultfd.h |   9 +-
  linux-headers/linux/vfio.h|  47 +++--
  linux-headers/linux/vhost.h   |   8 +
  30 files changed, 504 insertions(+), 31 deletions(-)
  create mode 100644 linux-headers/asm-loongarch/bitsperlong.h
  create mode 100644 linux-headers/asm-loongarch/kvm.h
  create mode 100644 linux-headers/asm-loongarch/mman.h
  create mode 100644 linux-headers/asm-loongarch/unistd.h

Acked-by: Song Gao 

Thanks.
Song Gao

diff --git a/include/standard-headers/drm/drm_fourcc.h 
b/include/standard-headers/drm/drm_fourcc.h
index 72279f4d25..3afb70160f 100644
--- a/include/standard-headers/drm/drm_fourcc.h
+++ b/include/standard-headers/drm/drm_fourcc.h
@@ -322,6 +322,8 @@ extern "C" {
   * index 1 = Cr:Cb plane, [39:0] Cr1:Cb1:Cr0:Cb0 little endian
   */
  #define DRM_FORMAT_NV15   fourcc_code('N', 'V', '1', '5') /* 2x2 subsampled Cr:Cb plane */
+#define DRM_FORMAT_NV20   fourcc_code('N', 'V', '2', '0') /* 2x1 subsampled Cr:Cb plane */
+#define DRM_FORMAT_NV30   fourcc_code('N', 'V', '3', '0') /* non-subsampled Cr:Cb plane */
  
  /*

   * 2 plane YCbCr MSB aligned
diff --git a/include/standard-headers/linux/fuse.h 
b/include/standard-headers/linux/fuse.h
index 6b9793842c..fc0dcd10ae 100644
--- a/include/standard-headers/linux/fuse.h
+++ b/include/standard-headers/linux/fuse.h
@@ -209,7 +209,7 @@
   *  - add FUSE_HAS_EXPIRE_ONLY
   *
   *  7.39
- *  - add FUSE_DIRECT_IO_RELAX
+ *  - add FUSE_DIRECT_IO_ALLOW_MMAP
   *  - add FUSE_STATX and related structures
   */
  
@@ -405,8 +405,7 @@ struct fuse_file_lock {

   * FUSE_CREATE_SUPP_GROUP: add supplementary group info to create, mkdir,
   *symlink and mknod (single group that matches parent)
   * FUSE_HAS_EXPIRE_ONLY: kernel supports expiry-only entry invalidation
- * FUSE_DIRECT_IO_RELAX: relax restrictions in FOPEN_DIRECT_IO mode, for now
- *   allow shared mmap
+ * FUSE_DIRECT_IO_ALLOW_MMAP: allow shared mmap in FOPEN_DIRECT_IO mode.
   */
  #define FUSE_ASYNC_READ   (1 << 0)
  #define FUSE_POSIX_LOCKS  (1 << 1)
@@ -445,7 +444,10 @@ struct fuse_file_lock {
  #define FUSE_HAS_INODE_DAX(1ULL << 33)
  #define FUSE_CREATE_SUPP_GROUP(1ULL << 34)
  #define FUSE_HAS_EXPIRE_ONLY  (1ULL << 35)
-#define FUSE_DIRECT_IO_RELAX   (1ULL << 36)
+#define FUSE_DIRECT_IO_ALLOW_MMAP (1ULL << 36)
+
+/* Obsolete alias for FUSE_DIRECT_IO_ALLOW_MMAP */
+#define FUSE_DIRECT_IO_RELAX   FUSE_DIRECT_IO_ALLOW_MMAP
  
  /**

   * CUSE INIT request/reply flags
diff --git a/include/standard-headers/linux/pci_regs.h 
b/include/standard-headers/linux/pci_regs.h
index e5f558d964..a39193213f 100644
--- a/include/standard-headers/linux/pci_regs.h
+++ b/include/standard-headers/linux/pci_regs.h
@@ -80,6 +80,7 @@
  #define  PCI_HEADER_TYPE_NORMAL   0
  #define  PCI_HEADER_TYPE_BRIDGE   1
  #define  PCI_HEADER_TYPE_CARDBUS  2
+#define  PCI_HEADER_TYPE_MFD   0x80   /* Multi-Function Device (possible) */
  
  #define PCI_BIST		0x0f	/* 8 bits */

  #define  PCI_BIST_CODE_MASK   0x0f/* Return 

Re: QEMU developers call

2024-01-02 Thread Peter Xu
On Tue, Jan 02, 2024 at 10:17:40PM +0100, Juan Quintela wrote:
> 
> Uv
> 
> Nf V nz yrnivat DRZH qrirybczrag, V pna'g unaqyr guvf pnyy.  Fubhyq
> nalbar gnxr pner bs vg?
> 
> V unir gnyxrq nobhg guvf jvgu Crgre, naq V guvax gung vs abobql fgrcf
> hc, ur pna "ibyhagrre" gb unaqyr vg.
> 
> Yngre, Whna.

If I made a list of good candidates, I'd probably be the last one out of
tens, especially considering my current timezone. :-D

Take care, Juan.

-- 
Peter Xu
# tr 'U-ZA-Tu-za-t' 'H-ZA-Gh-za-g'
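The tr invocation above is a ROT13 decoder; for completeness, an equivalent sketch in C (purely illustrative):

```c
#include <assert.h>
#include <string.h>

/* ROT13: rotate each ASCII letter 13 places in place, leaving
 * non-letters untouched -- the same mapping as the tr command above. */
static void rot13(char *s)
{
    for (; *s; s++) {
        if (*s >= 'a' && *s <= 'z') {
            *s = 'a' + (*s - 'a' + 13) % 26;
        } else if (*s >= 'A' && *s <= 'Z') {
            *s = 'A' + (*s - 'A' + 13) % 26;
        }
    }
}
```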




RE: [PATCH 1/2] backends/iommufd: Remove check on number of backend users

2024-01-02 Thread Duan, Zhenzhong
Hi Cédric,

>-Original Message-
>From: Cédric Le Goater 
>Sent: Tuesday, January 2, 2024 8:32 PM
>To: qemu-devel@nongnu.org
>Cc: Liu, Yi L ; Eric Auger ; Duan,
>Zhenzhong ; Cédric Le Goater
>
>Subject: [PATCH 1/2] backends/iommufd: Remove check on number of backend users
>
>QOM already has a ref count on objects and it will assert much
>earlier, when INT_MAX is reached.
>
>Signed-off-by: Cédric Le Goater 

IIUC, /dev/iommu is opened on demand, and be->users is used to catch underflow
or overflow due to mismatched iommufd_backend_connect/disconnect
pairs. It differs in purpose from the object ref count, but I agree that
correctly written code will never trigger such overflow/underflow.
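
For illustration, the connect/disconnect pairing discussed above can be
sketched with mock types (MockBackend, mock_connect and mock_disconnect are
illustrative names, not the real IOMMUFDBackend API): the fd is opened on the
first user and closed on the last, and an assert catches a mismatched pair
before the counter could ever wrap.

```c
#include <assert.h>
#include <stdint.h>

/* Mock stand-in for IOMMUFDBackend; illustrative only. */
typedef struct {
    int fd;          /* -1 when closed; stands in for /dev/iommu */
    uint32_t users;  /* matched connect/disconnect calls */
} MockBackend;

static int mock_connect(MockBackend *be)
{
    if (!be->users) {
        be->fd = 42; /* pretend open("/dev/iommu", O_RDWR) succeeded */
    }
    be->users++;
    return be->fd;
}

static void mock_disconnect(MockBackend *be)
{
    assert(be->users > 0); /* a mismatched pair would underflow here */
    if (!--be->users) {
        be->fd = -1; /* last user gone: pretend close(fd) */
    }
}
```

The removed UINT32_MAX check sat on top of this pattern; QOM's own reference
counting asserts long before the counter could reach that limit.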

So, for the series:
Reviewed-by: Zhenzhong Duan 

BRs.
Zhenzhong

>---
> backends/iommufd.c | 5 -
> 1 file changed, 5 deletions(-)
>
>diff --git a/backends/iommufd.c b/backends/iommufd.c
>index ba58a0eb0d0ba9aae625c987eb728547543dba66..393c0d9a3719e3de1a6b51a8ff2e75e184badc82 100644
>--- a/backends/iommufd.c
>+++ b/backends/iommufd.c
>@@ -80,11 +80,6 @@ int iommufd_backend_connect(IOMMUFDBackend *be, Error **errp)
> int fd, ret = 0;
>
> qemu_mutex_lock(&be->lock);
>-if (be->users == UINT32_MAX) {
>-error_setg(errp, "too many connections");
>-ret = -E2BIG;
>-goto out;
>-}
> if (be->owned && !be->users) {
> fd = qemu_open_old("/dev/iommu", O_RDWR);
> if (fd < 0) {
>--
>2.43.0



Re: [PATCH 1/2] qapi/virtio: Keep feature and status bits in the QMP output

2024-01-02 Thread Yong Huang
Thanks, I'll fix that in the next version.

On Fri, Dec 29, 2023 at 5:31 PM Philippe Mathieu-Daudé 
wrote:

> Hi,
>
> On 28/12/23 19:52, Hyman Huang wrote:
> > Maintain the feature and status bits in the x-query-virtio-status
> > output and, as usual, add human-readable output only in HMP.
> >
> > Applications may find it useful to compare features and status
> > information directly. An upper application, for example, could
> > use the QMP command x-query-virtio-status to retrieve vhost-user
> > net device features and the "ovs-vsctl list interface" command to
> > retrieve interface features (in number format) in order to verify
> > the correctness of the virtio negotiation between guest, QEMU,
> > and OVS-DPDK. The application could then compare the two features
> > directly, without the need for additional feature encoding.
> >
> > Signed-off-by: Hyman Huang 
> > ---
> >   hw/virtio/virtio-hmp-cmds.c |  25 +++--
> >   hw/virtio/virtio-qmp.c  |  23 ++---
> >   qapi/virtio.json| 192 
> >   3 files changed, 45 insertions(+), 195 deletions(-)
> >
> > diff --git a/hw/virtio/virtio-hmp-cmds.c b/hw/virtio/virtio-hmp-cmds.c
> > index 477c97dea2..721c630ab0 100644
> > --- a/hw/virtio/virtio-hmp-cmds.c
> > +++ b/hw/virtio/virtio-hmp-cmds.c
> > @@ -6,6 +6,7 @@
> >*/
> >
> >   #include "qemu/osdep.h"
> > +#include "virtio-qmp.h"
> >   #include "monitor/hmp.h"
> >   #include "monitor/monitor.h"
> >   #include "qapi/qapi-commands-virtio.h"
> > @@ -145,13 +146,17 @@ void hmp_virtio_status(Monitor *mon, const QDict
> *qdict)
> >   monitor_printf(mon, "  endianness:  %s\n",
> >  s->device_endian);
> >   monitor_printf(mon, "  status:\n");
> > -hmp_virtio_dump_status(mon, s->status);
> > +hmp_virtio_dump_status(mon,
> > +qmp_decode_status(s->status));
>
> Why not let the callee do this call?
>
> >   monitor_printf(mon, "  Guest features:\n");
> > -hmp_virtio_dump_features(mon, s->guest_features);
> > +hmp_virtio_dump_features(mon,
> > +qmp_decode_features(s->device_id, s->guest_features));
> >   monitor_printf(mon, "  Host features:\n");
> > -hmp_virtio_dump_features(mon, s->host_features);
> > +hmp_virtio_dump_features(mon,
> > +qmp_decode_features(s->device_id, s->host_features));
> >   monitor_printf(mon, "  Backend features:\n");
> > -hmp_virtio_dump_features(mon, s->backend_features);
> > +hmp_virtio_dump_features(mon,
> > +qmp_decode_features(s->device_id, s->backend_features));
> >
> >   if (s->vhost_dev) {
> >   monitor_printf(mon, "  VHost:\n");
> > @@ -172,13 +177,17 @@ void hmp_virtio_status(Monitor *mon, const QDict
> *qdict)
> >   monitor_printf(mon, "log_size:   %"PRId64"\n",
> >  s->vhost_dev->log_size);
> >   monitor_printf(mon, "Features:\n");
> > -hmp_virtio_dump_features(mon, s->vhost_dev->features);
> > +hmp_virtio_dump_features(mon,
> > +qmp_decode_features(s->device_id, s->vhost_dev->features));
>
> Ditto.
>
> >   monitor_printf(mon, "Acked features:\n");
> > -hmp_virtio_dump_features(mon, s->vhost_dev->acked_features);
> > +hmp_virtio_dump_features(mon,
> > +qmp_decode_features(s->device_id,
> s->vhost_dev->acked_features));
> >   monitor_printf(mon, "Backend features:\n");
> > -hmp_virtio_dump_features(mon, s->vhost_dev->backend_features);
> > +hmp_virtio_dump_features(mon,
> > +qmp_decode_features(s->device_id,
> s->vhost_dev->backend_features));
> >   monitor_printf(mon, "Protocol features:\n");
> > -hmp_virtio_dump_protocols(mon, s->vhost_dev->protocol_features);
> > +hmp_virtio_dump_protocols(mon,
> > +qmp_decode_protocols(s->vhost_dev->protocol_features));
> >   }
>
>

-- 
Best regards


Re: chacha20-s390 broken in 8.2.0 in TCG on s390x

2024-01-02 Thread Richard Henderson

On 12/22/23 01:51, Michael Tokarev wrote:

When running current kernel on s390x in tcg mode *on s390x hw*, the following
is generated when loading crypto selftest module (it gets loaded automatically):

[   10.546690] alg: skcipher: chacha20-s390 encryption test failed (wrong result) on test 
vector 1, cfg="in-place (one sglist)"

[   10.546914] alg: self-tests for chacha20 using chacha20-s390 failed (rc=-22)
[   10.546969] [ cut here ]
[   10.546998] alg: self-tests for chacha20 using chacha20-s390 failed (rc=-22)
[   10.547182] WARNING: CPU: 1 PID: 109 at crypto/testmgr.c:5936 
alg_test+0x55a/0x5b8
[   10.547510] Modules linked in: net_failover chacha_s390(+) libchacha virtio_blk(+) 
failover
[   10.547854] CPU: 1 PID: 109 Comm: cryptomgr_test Not tainted 6.5.0-5-s390x #1  Debian 
6.5.13-1

[   10.548002] Hardware name: QEMU 8561 QEMU (KVM/Linux)
[   10.548101] Krnl PSW : 0704c0018000 005df8fe 
(alg_test+0x55e/0x5b8)
[   10.548207]    R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 
RI:0 EA:3
[   10.548291] Krnl GPRS:  01286408 005df8fa 
01286408
[   10.548337]    0014bf14 001c6ba8 01838b3c 
0005
[   10.548475]    025a4880 025a4800 ffea 
ffea
[   10.548521]    3e649200  005df8fa 
03800016bcf8
[   10.549504] Krnl Code: 005df8ee: c020003b5828    larl    
%r2,00d4a93e
[   10.549504]    005df8f4: c0e5ffdb62d2    brasl    
%r14,0014be98
[   10.549504]   #005df8fa: af00    mc    0,0
[   10.549504]   >005df8fe: a7f4fee6    brc    
15,005df6ca
[   10.549504]    005df902: b9040042    lgr    %r4,%r2
[   10.549504]    005df906: b9040039    lgr    %r3,%r9
[   10.549504]    005df90a: c020003b57df    larl    
%r2,00d4a8c8
[   10.549504]    005df910: 18bd    lr    %r11,%r13
[   10.550004] Call Trace:
[   10.550375]  [<005df8fe>] alg_test+0x55e/0x5b8
[   10.550467] ([<005df8fa>] alg_test+0x55a/0x5b8)
[   10.550489]  [<005d9fbc>] cryptomgr_test+0x34/0x60
[   10.550514]  [<0017d004>] kthread+0x124/0x130
[   10.550539]  [<00103124>] __ret_from_fork+0x3c/0x50
[   10.550562]  [<00b1dfca>] ret_from_fork+0xa/0x30
[   10.550611] Last Breaking-Event-Address:
[   10.550626]  [<0014bf20>] __warn_printk+0x88/0x110
[   10.550723] ---[ end trace  ]---

git bisect points to this commit:

commit ab84dc398b3b702b0c692538b947ef65dbbdf52f
Author: Richard Henderson 
Date:   Wed Aug 23 23:04:24 2023 -0700

     tcg/optimize: Optimize env memory operations

So far, this seems to work on amd64 host, but fails on s390x host -
where this has been observed so far.  Maybe it also fails in some
other combinations too, I don't yet know.  Just finished bisecting
it on s390x.


I haven't been able to build a reproducer for this.
Have you an image or kernel you can share?


r~




[PATCH v2] target/i386: Fix CPUID encoding of Fn8000001E_ECX

2024-01-02 Thread Babu Moger
Observed the following failure while booting an SEV-SNP guest with the
SMP parameters "-smp 192,sockets=1,dies=12,cores=8,threads=2":

qemu-system-x86_64: sev_snp_launch_update: SNP_LAUNCH_UPDATE ret=-5 fw_error=22 
'Invalid parameter'
qemu-system-x86_64: SEV-SNP: CPUID validation failed for function 0x801e, 
index: 0x0.
provided: eax:0x, ebx: 0x0100, ecx: 0x0b00, edx: 0x
expected: eax:0x, ebx: 0x0100, ecx: 0x0300, edx: 0x
qemu-system-x86_64: SEV-SNP: failed update CPUID page

The reason for the failure is overflow of the bits used for "Nodes per
processor" in CPUID Fn8000001E_ECX. This field is 3 bits wide and can
hold a maximum value of 0x7. With dies=12 (0xB), it overflows and spills
over into the reserved bits. In the case of SEV-SNP, this causes CPUID
enforcement failure and the guest fails to boot.
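
The overflow can be sketched as follows (encode_ecx_legacy is an illustrative
name, not the actual QEMU code, which lives in target/i386/cpu.c): with
dies=12 the value 0xB is shifted into bits 10:8, and its top bit lands in
reserved bit 11.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hedged sketch of the legacy CPUID Fn8000001E_ECX encoding, not the
 * actual QEMU code: the "nodes per processor" field at bits 10:8 is
 * only 3 bits wide, so values above 7 spill into reserved bits 31:11.
 */
static uint32_t encode_ecx_legacy(uint32_t dies_per_pkg, uint32_t node_id)
{
    return ((dies_per_pkg - 1) << 8) | (node_id & 0xFF);
}
```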

The PPR documentation for CPUID_Fn8000001E_ECX [Node Identifiers]
=
BitsDescription
31:11   Reserved.

10:8NodesPerProcessor: Node per processor. Read-only.
ValidValues:
Value   Description
0h  1 node per processor.
7h-1h   Reserved.

7:0 NodeId: Node ID. Read-only. Reset: Fixed,XXh.
=

As in the spec, the valid value for "node per processor" is 0 and rest
are reserved.

Looking back at the history of the decoding of CPUID_Fn8000001E_ECX, I
noticed that there were cases where "nodes per processor" could be more
than 1. That is valid only for pre-F17h (pre-EPYC) architectures. For
EPYC or later CPUs, the Linux kernel does not use this information to
build the L3 topology.

Also note that the CPUID function 0x8000001E is available only when the
TOPOEXT feature is enabled. This feature is enabled only for EPYC (F17h)
or later processors, so previous generations of processors do not
enumerate the 0x8000001E leaf.

There could be some corner cases where older guests could enable the
TOPOEXT feature by running with -cpu host, in which case legacy guests
might notice the topology change. To address those cases, introduce a
new CPU property "legacy-multi-node". It will be true for older machine
types to maintain compatibility. By default, it will be false, so the
new decoding will be used going forward.

The documentation is taken from Preliminary Processor Programming
Reference (PPR) for AMD Family 19h Model 11h, Revision B1 Processors 55901
Rev 0.25 - Oct 6, 2022.

Cc: qemu-sta...@nongnu.org
Fixes: 31ada106d891 ("Simplify CPUID_8000_001E for AMD")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
Signed-off-by: Babu Moger 
Reviewed-by: Zhao Liu 
---
v2: Rebased to the latest tree.
Updated the pc_compat_8_2 for the new flag.
Added the comment for new property legacy_multi_node.
Added Reviwed-by from Zhao.
---
 hw/i386/pc.c  |  4 +++-
 target/i386/cpu.c | 18 ++
 target/i386/cpu.h |  6 ++
 3 files changed, 19 insertions(+), 9 deletions(-)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 496498df3a..a504e05e62 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -78,7 +78,9 @@
 { "qemu64-" TYPE_X86_CPU, "model-id", "QEMU Virtual CPU version " v, },\
 { "athlon-" TYPE_X86_CPU, "model-id", "QEMU Virtual CPU version " v, },
 
-GlobalProperty pc_compat_8_2[] = {};
+GlobalProperty pc_compat_8_2[] = {
+{ TYPE_X86_CPU, "legacy-multi-node", "on" },
+};
 const size_t pc_compat_8_2_len = G_N_ELEMENTS(pc_compat_8_2);
 
 GlobalProperty pc_compat_8_1[] = {};
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 95d5f16cd5..2cc84e8500 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -398,12 +398,9 @@ static void encode_topo_cpuid8000001e(X86CPU *cpu, X86CPUTopoInfo *topo_info,
  * 31:11 Reserved.
  * 10:8 NodesPerProcessor: Node per processor. Read-only. Reset: XXXb.
  *  ValidValues:
- *  Value Description
- *  000b  1 node per processor.
- *  001b  2 nodes per processor.
- *  010b Reserved.
- *  011b 4 nodes per processor.
- *  111b-100b Reserved.
+ *  Value   Description
+ *  0h  1 node per processor.
+ *  7h-1h   Reserved.
  *  7:0 NodeId: Node ID. Read-only. Reset: XXh.
  *
  * NOTE: Hardware reserves 3 bits for number of nodes per processor.
@@ -412,8 +409,12 @@ static void encode_topo_cpuid8000001e(X86CPU *cpu, X86CPUTopoInfo *topo_info,
  * NodeId is combination of node and socket_id which is already decoded
  * in apic_id. Just use it by shifting.
  */
-*ecx = ((topo_info->dies_per_pkg - 1) << 8) |
-   ((cpu->apic_id >> apicid_die_offset(topo_info)) & 0xFF);
+if (cpu->legacy_multi_node) {
+*ecx = ((topo_info->dies_per_pkg - 1) << 8) |
+   ((cpu->apic_id >> apicid_die_offset(topo_info)) & 0xFF);
+} else {
+*ecx = (cpu->apic_id >> 

Re: [RFC PATCH 2/5] qdev-properties: Add OptionalBool QAPI type

2024-01-02 Thread Richard Henderson

On 1/3/24 03:04, Philippe Mathieu-Daudé wrote:

To be able to distinguish whether a boolean qdev property
has been set or not, add the DEFINE_PROP_BOOL_NODEFAULT()
qdev macro based on the tri-state OptionalBool QAPI type.

Signed-off-by: Philippe Mathieu-Daudé 
---
  qapi/common.json | 16 
  include/hw/qdev-properties.h |  5 +
  hw/core/qdev-properties.c| 10 ++
  3 files changed, 31 insertions(+)


How is this different from OnOffAuto?


r~
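
For context, OnOffAuto — the type the question refers to — is QEMU's existing
tri-state in qapi/common.json, with values 'auto', 'on' and 'off', where
'auto' means "not explicitly chosen". A minimal sketch of the
resolve-against-default pattern both types answer, with illustrative names
only:

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Minimal tri-state sketch; TriState and tri_resolve are illustrative
 * names, not QEMU API. Both OnOffAuto and a NODEFAULT boolean answer
 * the same question: did the user set the value, or should a
 * machine-provided default apply?
 */
typedef enum {
    TRI_AUTO, /* not set by the user */
    TRI_ON,
    TRI_OFF,
} TriState;

static bool tri_resolve(TriState t, bool machine_default)
{
    return t == TRI_AUTO ? machine_default : t == TRI_ON;
}
```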



Re: [PATCH 1/1] Leaving Migration

2024-01-02 Thread Bin Meng
On Wed, Jan 3, 2024 at 4:20 AM Juan Quintela  wrote:
>
> I am leaving Red Hat, and as part of that I am leaving Migration
> maintenarship.

maintainership?

>
> You are left in good hands with Peter and Fabiano.
>
> Thanks for all the fish.

Best wishes!

>
> Signed-off-by: Juan Quintela 
> ---
>  MAINTAINERS | 3 ---
>  .mailmap| 1 +
>  2 files changed, 1 insertion(+), 3 deletions(-)
>

Reviewed-by: Bin Meng 



Re: [PATCH] docs/system: Document running Linux on amigaone machine

2024-01-02 Thread Bernhard Beschow



Am 16. Dezember 2023 21:15:54 UTC schrieb BALATON Zoltan :
>On Sat, 16 Dec 2023, Bernhard Beschow wrote:
>> Am 16. Dezember 2023 12:53:55 UTC schrieb BALATON Zoltan 
>> :
>>> On Sat, 16 Dec 2023, Bernhard Beschow wrote:
 Documentation on how to run Linux on the amigaone machine is currently 
 buried
 in the depths of the qemu-devel mailing list [1] and in the source code. 
 Let's
 collect the information in the QEMU handbook for a one stop solution.
>>> 
>>> Thanks for collecting this info and adding it as documentation.
>> 
>> You're welcome!
>> 
>>> A few small comments bellow.
>>> 
 [1] 
 https://lore.kernel.org/qemu-devel/dafc407d-3749-e6f4-3a66-750fde896...@eik.bme.hu/
>>> 
>>> Do we want to keep an URL in the commit log? kernel.org is quite stable but 
>>> not sure it would need to be in the commit message.
>> 
>> Let's drop it.
>> 
>>> 
 Co-authored-by: BALATON Zoltan 
 Signed-off-by: Bernhard Beschow 
 ---
 MAINTAINERS  |  1 +
 docs/system/ppc/amigaone.rst | 53 
 docs/system/target-ppc.rst   |  1 +
 hw/ppc/amigaone.c|  9 --
 4 files changed, 55 insertions(+), 9 deletions(-)
 create mode 100644 docs/system/ppc/amigaone.rst
 
 diff --git a/MAINTAINERS b/MAINTAINERS
 index 695e0bd34f..a2dd1407e2 100644
 --- a/MAINTAINERS
 +++ b/MAINTAINERS
 @@ -1560,6 +1560,7 @@ amigaone
 M: BALATON Zoltan 
 L: qemu-...@nongnu.org
 S: Maintained
 +F: docs/system/ppc/amigaone.rst
 F: hw/ppc/amigaone.c
 F: hw/pci-host/articia.c
 F: include/hw/pci-host/articia.h
 diff --git a/docs/system/ppc/amigaone.rst b/docs/system/ppc/amigaone.rst
 new file mode 100644
 index 00..c3f11a7bb2
 --- /dev/null
 +++ b/docs/system/ppc/amigaone.rst
>>> 
>>> Maybe call it amigang.rst so it can be a place for docs on other PPC 
>>> AmigaNG machines such as pegasos2 and sam460ex in the future to collect 
>>> them in one place.
>> 
>> Having everything in one place seems like creating a lot of complexity if 
>> one were to elaborate on the various pros and cons for each machine: 
>> AmigaOne needs a custom vgabios, the others do not.
>
>All of these need real mode VGA BIOS as the BIOS emulator in all three 
>machines choke on the gcc-compiled QEMU VGA BIOS so this isn't unique to 
>amigaone.
>
>> MorphOS can be run on the other machines but not on AmigaOne. Sometimes a 
>> bootloader is needed and sometimes not, the circumstances may vary.
>
>MorphOS does not support amigaone, boot loader is optional and alternative to 
>using firmware.
>
>Other docs seem to combine similar machines like powermac and embedded and 
>ppce500 in a single doc file so the convention seems to not have one file for 
>each machine but it's not a big deal.
>
>> I suggest to have a separate doc on each machine.
>
>I could also rename it later if more docs is added for other machines.
>
 @@ -0,0 +1,53 @@
 +Eyetech AmigaOne/Mai Logic Teron (``amigaone``)
 +===============================================
 +
 +The ``amigaone`` model emulates an AmigaOne XE mainboard developed by 
 Eyetech. Use
 +the executable ``qemu-system-ppc`` to simulate a complete system.
>>> 
>>> QEMU is not a simulator so even if that's repeating this should say 
>>> emulate. (Should this doc be formatted with 80 chars line too like sources?)
>> 
>> I took heavy inspiration from the cubieboard machine. Feel free to change.
>
>docs/system/arm/cubieboard.rst:
>"The ``cubieboard`` model emulates the Cubietech Cubieboard,"
>also says emulate not simulate.
>
 +
 +Emulated devices
 +----------------
 +
 + *  PowerPC 7457 v1.2 CPU
 + *  Articia S north bridge
 + *  VT82C686B south bridge
 + *  PCI VGA compatible card
 +
 +
 +Preparation
 +-----------
 +
 +A firmware binary is necessary for the boot process and is available at
 +https://www.hyperion-entertainment.com/index.php/downloads?view=files=28.
 +It needs to be extracted with the following command:
 +
 +.. code-block:: bash
 +
 +  $ tail -c 524288 updater.image > u-boot-amigaone.bin
 +
 +The firmware binary is unable to run QEMU‘s standard vgabios and
 +``VGABIOS-lgpl-latest.bin`` is needed instead. It can be downloaded from
 +http://www.nongnu.org/vgabios.
 +
 +
 +Running Linux
 +-------------
 +
 +There are some Linux images under the following link that work on the
 +``amigaone`` machine:
 +https://sourceforge.net/projects/amigaone-linux/files/debian-installer/. 
 To boot
 +the system run:
 +
 +.. code-block:: bash
 +
 +  $ qemu-system-ppc -M amigaone -bios u-boot-amigaone.bin \
 +-cdrom "A1 Linux Net Installer.iso" \
 +-device 
 ati-vga,model=rv100,romfile=VGABIOS-lgpl-latest.bin
 +
 +From the firmware 

Re: [PATCH v2 12/12] hw/isa/vt82c686: Implement relocation and toggling of SuperI/O functions

2024-01-02 Thread Bernhard Beschow



Am 24. Dezember 2023 00:51:53 UTC schrieb BALATON Zoltan :
>On Tue, 19 Dec 2023, Bernhard Beschow wrote:
>> Am 19. Dezember 2023 00:26:15 UTC schrieb BALATON Zoltan 
>> :
>>> On Mon, 18 Dec 2023, Bernhard Beschow wrote:
 The VIA south bridges are able to relocate and toggle (enable or disable) 
 their
 SuperI/O functions. So far this is hardcoded such that all functions are 
 always
 enabled and are located at fixed addresses.
 
 Some PC BIOSes seem to probe for I/O occupancy before activating such a 
 function
 and issue an error in case of a conflict. Since the functions are enabled 
 on
 reset, conflicts are always detected. Prevent that by implementing 
 relocation
 and toggling of the SuperI/O functions.
 
 Note that all SuperI/O functions are now deactivated upon reset (except for
 VT82C686B's serial ports where Fuloong 2e's rescue-yl seems to expect them 
 to be
 enabled by default). Rely on firmware -- or in case of pegasos2 on board 
 code if
 no -bios is given -- to configure the functions accordingly.
>>> 
>>> Pegasos2 emulates firmware when no -bios is given, this was explained in 
>>> previos commit so maybe not needed to be explained it here again so you 
>>> could drop the comment between -- -- but I don't mind.
>>> 
 Signed-off-by: Bernhard Beschow 
 ---
 hw/isa/vt82c686.c | 121 ++
 1 file changed, 90 insertions(+), 31 deletions(-)
 
 diff --git a/hw/isa/vt82c686.c b/hw/isa/vt82c686.c
 index 9c2333a277..be202d23cf 100644
 --- a/hw/isa/vt82c686.c
 +++ b/hw/isa/vt82c686.c
 @@ -15,6 +15,9 @@
 
 #include "qemu/osdep.h"
 #include "hw/isa/vt82c686.h"
 +#include "hw/block/fdc.h"
 +#include "hw/char/parallel-isa.h"
 +#include "hw/char/serial.h"
 #include "hw/pci/pci.h"
 #include "hw/qdev-properties.h"
 #include "hw/ide/pci.h"
 @@ -343,6 +346,35 @@ static const TypeInfo via_superio_info = {
 
 #define TYPE_VT82C686B_SUPERIO "vt82c686b-superio"
 
 +static void vt82c686b_superio_update(ViaSuperIOState *s)
 +{
 +isa_parallel_set_enabled(s->superio.parallel[0],
 + (s->regs[0xe2] & 0x3) != 3);
 +isa_serial_set_enabled(s->superio.serial[0], s->regs[0xe2] & BIT(2));
 +isa_serial_set_enabled(s->superio.serial[1], s->regs[0xe2] & BIT(3));
 +isa_fdc_set_enabled(s->superio.floppy, s->regs[0xe2] & BIT(4));
 +
 +isa_fdc_set_iobase(s->superio.floppy, (s->regs[0xe3] & 0xfc) << 2);
 +isa_parallel_set_iobase(s->superio.parallel[0], s->regs[0xe6] << 2);
 +isa_serial_set_iobase(s->superio.serial[0], (s->regs[0xe7] & 0xfe) << 
 2);
 +isa_serial_set_iobase(s->superio.serial[1], (s->regs[0xe8] & 0xfe) << 
 2);
 +}
>>> 
>>> I wonder if some code duplication could be saved by adding a shared 
>>> via_superio_update() for this further up in the abstract via-superio class 
>>> instead of this method and vt8231_superio_update() below. This common 
>>> method in abstract class would need to handle the differences which seem to 
>>> be reg addresses offset by 0x10 and VT8231 having only 1 serial port. These 
>>> could either be handled by adding function parameters or fields to 
>>> ViaSuperIOState for this that the subclasses can set and the method check. 
>>> (Such as reg base=0xe2 for vt82c686 and 0xf2 for vt8231 and num_serial or 
>>> similar for how many ports are there then can have a for loop for those 
>>> that would only run once for vt8231).
>> 
>> Only the enable bits and the parallel port base address line up, the serial 
>> port(s) and the floppy would need special treatment. Not worth it IMO.
>
>Missed this part in previous reply. The serial ports would be taken care of by 
>a loop for number of ports so only the floppy needs an if which could also use 
>the number of serial ports for lack of better way to distinguish these cips 
>easily. Number of ports are in the superio class which you could get with 
>ISA_SUPERIO_GET_CLASS (see via_superio_realize) then serial.count would be 2 
>for 686b and 1 for 8231.

I'm not very convinced about telling the device models apart by their number of 
sub devices. So let's omit this part for now.
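
A rough sketch of the shared-helper idea being discussed, with mock types and
illustrative names (the real patch calls isa_*_set_enabled() on
ViaSuperIOState): the register base — 0xe2 for the VT82C686B, 0xf2 for the
VT8231 — and the serial-port count become parameters instead of being
duplicated per chip.

```c
#include <assert.h>
#include <stdint.h>

/* Mock stand-in for ViaSuperIOState; illustrative only. */
typedef struct {
    uint8_t regs[256];
    int serial_enabled[2];
    int parallel_enabled;
} MockSuperIO;

/*
 * Shared decode of the function-enable register: bits 1:0 = 3 disables
 * the parallel port, bits 2..(2 + nserial - 1) enable the serial ports.
 */
static void mock_superio_update(MockSuperIO *s, uint8_t base, int nserial)
{
    uint8_t en = s->regs[base];

    s->parallel_enabled = (en & 0x3) != 3;
    for (int i = 0; i < nserial; i++) {
        s->serial_enabled[i] = !!(en & (1 << (2 + i)));
    }
}
```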

>
>But now I think another way may be better that is to drop the superio_update 
>function as this updates all functions on writing any register unnecessarily 
>and put the lines from it in the vt*_superio_cfg_write() functions under the 
>separate cases. This was the original intent, that's why the reset function 
>goes through that write function so it can enable/disable functions. That way 
>you could apply mask on write so via_superio_cfg_read() would return 0 bits as 
>0 (although the data sheet is not clear about what real chip does, just says 
>these must be 0 not that it's enforced but if we enforce that it's probably 
>better to return the 

Re: [PATCH v2 1/5] system/cpus: rename qemu_mutex_lock_iothread() to bql_lock()

2024-01-02 Thread Bernhard Beschow



Am 12. Dezember 2023 15:39:00 UTC schrieb Stefan Hajnoczi :
>The Big QEMU Lock (BQL) has many names and they are confusing. The
>actual QemuMutex variable is called qemu_global_mutex but it's commonly
>referred to as the BQL in discussions and some code comments. The
>locking APIs, however, are called qemu_mutex_lock_iothread() and
>qemu_mutex_unlock_iothread().
>
>The "iothread" name is historic and comes from when the main thread was
>split into into KVM vcpu threads and the "iothread" (now called the main

Duplicate "into" here.

>loop thread). I have contributed to the confusion myself by introducing
>a separate --object iothread, a separate concept unrelated to the BQL.
>
>The "iothread" name is no longer appropriate for the BQL. Rename the
>locking APIs to:
>- void bql_lock(void)
>- void bql_unlock(void)
>- bool bql_locked(void)
>
>There are more APIs with "iothread" in their names. Subsequent patches
>will rename them. There are also comments and documentation that will be
>updated in later patches.
>
>Signed-off-by: Stefan Hajnoczi 
>Reviewed-by: Paul Durrant 
>Acked-by: Fabiano Rosas 
>Acked-by: David Woodhouse 
>Reviewed-by: Cédric Le Goater 
>Acked-by: Peter Xu 
>Acked-by: Eric Farman 
>Reviewed-by: Harsh Prateek Bora 



QEMU developers call

2024-01-02 Thread Juan Quintela


Uv

Nf V nz yrnivat DRZH qrirybczrag, V pna'g unaqyr guvf pnyy.  Fubhyq
nalbar gnxr pner bs vg?

V unir gnyxrq nobhg guvf jvgu Crgre, naq V guvax gung vs abobql fgrcf
hc, ur pna "ibyhagrre" gb unaqyr vg.

Yngre, Whna.




Re: [PATCH] docs/devel: Document conventional file prefixes and suffixes

2024-01-02 Thread Stefan Hajnoczi
On Tue, Dec 26, 2023 at 04:04:41PM +0100, Philippe Mathieu-Daudé wrote:
> Some header and source file names use common prefix / suffix
> but we never really ruled a convention. Start doing so with
> the current patterns from the tree.
> 
> Suggested-by: Alex Bennée 
> Signed-off-by: Philippe Mathieu-Daudé 
> ---
>  docs/devel/style.rst | 49 
>  1 file changed, 49 insertions(+)
> 
> diff --git a/docs/devel/style.rst b/docs/devel/style.rst
> index 2f68b50079..4da50eb2ea 100644
> --- a/docs/devel/style.rst
> +++ b/docs/devel/style.rst
> @@ -162,6 +162,55 @@ pre-processor. Another common suffix is ``_impl``; it is 
> used for the
>  concrete implementation of a function that will not be called
>  directly, but rather through a macro or an inline function.
>  
> +File Naming Conventions
> +-----------------------
> +
> +Public headers
> +~~~~~~~~~~~~~~
> +
> +Headers expected to be access by multiple subsystems must reside in

s/access/accessed/

> +the ``include/`` folder. Headers local to a subsystem should reside in
> +the sysbsystem folder, if any (for example ``qobject/qobject-internal.h``

s/sysbsystem/subsystem/

> +can only be included by files within the ``qobject/`` folder).
> +
> +Header file prefix and suffix hints
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +When headers relate to common concept, it is useful to use a common

Either "common concepts" (plural) or "a common concept" (singular with
an indefinite article).

> +prefix or suffix.
> +
> +When headers relate to the same (guest) subsystem, the subsystem name is
> +often used as prefix. If headers are already in a folder named as the
> +subsystem, prefixing them is optional.

"named as the subsystem" sounds strange. I suggest something like:

"If headers are already in a folder with the subsystem in its name,
prefixing them is optional."

or

"Prefixing header files is optional if the folder name already contains
the subsystem name."

> +
> +For example, hardware models related to the Aspeed systems are named
> +using the ``aspeed_`` prefix.
> +
> +Headers related to the same (host) concept can also use a common prefix.

Is there a need to distinguish between "(guest)" above and "(host)" here
since we end up recommending the same thing for both?

> +For example OS specific headers use the ``-posix`` and ``-win32`` suffixes.

The previous sentence is about prefixes but this sentence focusses on
suffixes. That's a little confusing. I guess you mean "foo-posix" and
"foo-win32" have a common prefix. It may help to express it in terms of
the prefix instead of mentioning the suffix.

> +
> +Registered file suffixes
> +~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +* ``.inc``
> +
> +  Source files meant to be included by other source files as templates
> +  must use the ``.c.inc`` suffix. Similarly, headers meant to be included
> +  multiple times as template must use the ``.h.inc`` suffix.
> +
> +Recommended file prefixes / suffixes
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +* ``target`` and ``common`` suffixes
> +
> +  Files which are specific to a target should use the ``target`` suffix.
> +  Such ``target`` suffixed headers usually *taint* the files including them
> +  by making them target specific.

Is there any particular macro or pattern for enforcing this? I remember
there are #error preprocessor directives in some header files to prevent
including them from the wrong source file, but I'm not sure if you're
referring to anything specific here.

> +
> +  Files common to all targets should use the ``common`` suffix, to provide
> +  a hint that these files can be safely included from common code.

This statement is too general. For example, files in util/ can be used
from common code but don't have a suffix. I think target and common
suffixes are useful when something is split into target-specific and
common parts. Otherwise it's not necessary.




[PATCH 1/1] Leaving Migration

2024-01-02 Thread Juan Quintela
I am leaving Red Hat, and as part of that I am leaving Migration
maintenarship.

You are left in good hands with Peter and Fabiano.

Thanks for all the fish.

Signed-off-by: Juan Quintela 
---
 MAINTAINERS | 3 ---
 .mailmap| 1 +
 2 files changed, 1 insertion(+), 3 deletions(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index 395f26ba86..56464cd916 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -70,7 +70,6 @@ R: Daniel P. Berrangé 
 R: Thomas Huth 
 R: Markus Armbruster 
 R: Philippe Mathieu-Daudé 
-R: Juan Quintela 
 W: https://www.qemu.org/docs/master/devel/index.html
 S: Odd Fixes
 F: docs/devel/style.rst
@@ -3355,7 +3354,6 @@ S: Odd Fixes
 F: scripts/checkpatch.pl
 
 Migration
-M: Juan Quintela 
 M: Peter Xu 
 M: Fabiano Rosas 
 R: Leonardo Bras 
@@ -3375,7 +3373,6 @@ F: util/userfaultfd.c
 X: migration/rdma*
 
 RDMA Migration
-M: Juan Quintela 
 R: Li Zhijian 
 R: Peter Xu 
 R: Leonardo Bras 
diff --git a/.mailmap b/.mailmap
index e12e19f691..d94572af05 100644
--- a/.mailmap
+++ b/.mailmap
@@ -81,6 +81,7 @@ Greg Kurz  
 Huacai Chen  
 Huacai Chen  
 James Hogan  
+Juan Quintela  
 Leif Lindholm  
 Leif Lindholm  
 Luc Michel  
-- 
2.43.0




[PATCH 0/1] Leaving Red Hat

2024-01-02 Thread Juan Quintela
Hi

After so many years it has been a difficult decision to make, but I am
leaving Red Hat, and with it I am leaving QEMU development and
Migration Maintainership.

I leave you in good and capable hands with Peter and Fabiano.

I have really enjoyed working with all of you.  I have learned a lot
working with you.

Looking at git logs, it appears that my 1st commit is from:

commit 4f3a1d56e45bcd325f1e8a976290142bc8662bee
Author: Juan Quintela 
Date:   Thu Jun 25 00:07:59 2009 +0200

Rename OBJS to obj-y

So it has been almost 15 years.

As you can see from the commit, I changed the configure script to make
it more configurable, and the change was so bad that Paolo ended up
rewriting all of it in meson :-)

I am cc'ing my personal account because I lost access to this address
today.

Again, thanks for everything.

Later, Juan.

Juan Quintela (1):
  Leaving Migration

 MAINTAINERS | 3 ---
 .mailmap| 1 +
 2 files changed, 1 insertion(+), 3 deletions(-)

-- 
2.43.0




Re: [PATCH v2 08/16] target/riscv: move 'vlen' to riscv_cpu_properties[]

2024-01-02 Thread Daniel Henrique Barboza




On 1/2/24 09:04, Daniel Henrique Barboza wrote:



On 12/29/23 08:22, Vladimir Isaev wrote:

22.12.2023 15:22, Daniel Henrique Barboza wrote:


Turning 'vlen' into a class property will allow its default value to be
overwritten by cpu_init() later on, solving the issue we have now where
CPU specific settings are getting overwritten by the default.

For 'vlen', 'elen' and the blocksize options we need a way of tracking
if the user set a value for them. This is benign for TCG since the cost
of always validating these values are small, but for KVM we need syscalls
to read the host values to make the validations. Knowing whether the
user didn't touch the values makes a difference. We'll track user setting
for these properties using a hash, like we do in the TCG driver.

Common validation bits are moved from riscv_cpu_validate_v() to
prop_vlen_set() to be shared with KVM.

And, as done with every option we migrated to riscv_cpu_properties[],
vendor CPUs can't have their 'vlen' value changed.

Signed-off-by: Daniel Henrique Barboza 
---
  target/riscv/cpu.c | 63 +-
  target/riscv/tcg/tcg-cpu.c |  5 ---
  2 files changed, 62 insertions(+), 6 deletions(-)

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index d6625399a7..c2ff50bcab 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -29,6 +29,7 @@
  #include "qapi/visitor.h"
  #include "qemu/error-report.h"
  #include "hw/qdev-properties.h"
+#include "hw/core/qdev-prop-internal.h"
  #include "migration/vmstate.h"
  #include "fpu/softfloat-helpers.h"
  #include "sysemu/kvm.h"
@@ -53,6 +54,15 @@ const uint32_t misa_bits[] = {RVI, RVE, RVM, RVA, RVF, RVD, 
RVV,
  #define BYTE(x)   (x)
  #endif

+/* Hash that stores general user set numeric options */
+static GHashTable *general_user_opts;
+
+static void cpu_option_add_user_setting(const char *optname, uint32_t value)
+{
+    g_hash_table_insert(general_user_opts, (gpointer)optname,
+    GUINT_TO_POINTER(value));
+}
+
  #define ISA_EXT_DATA_ENTRY(_name, _min_ver, _prop) \
  {#_name, _min_ver, CPU_CFG_OFFSET(_prop)}

@@ -1244,6 +1254,8 @@ static void riscv_cpu_init(Object *obj)
    IRQ_LOCAL_MAX + IRQ_LOCAL_GUEST_MAX);
  #endif /* CONFIG_USER_ONLY */

+    general_user_opts = g_hash_table_new(g_str_hash, g_str_equal);
+
  /*
   * The timer and performance counters extensions were supported
   * in QEMU before they were added as discrete extensions in the
@@ -1664,8 +1676,54 @@ static const PropertyInfo prop_vext_spec = {
  .set = prop_vext_spec_set,
  };

+static void prop_vlen_set(Object *obj, Visitor *v, const char *name,
+ void *opaque, Error **errp)
+{
+    RISCVCPU *cpu = RISCV_CPU(obj);
+    uint16_t value;
+
+    if (!visit_type_uint16(v, name, &value, errp)) {
+    return;
+    }
+
+    if (!is_power_of_2(value)) {
+    error_setg(errp, "Vector extension VLEN must be power of 2");
+    return;
+    }
+
+    /* Always allow setting a default value */


What is the case for vlen equal to 0? Since in properties it is defined with 
default value set to 128.


The process of setting a default uses the default setter of the property. I.e.
when setting vlen default value to 128, this function will be called with
value = 128 when cpu->cfg.vlen is 0.

If we don't special case this scenario we'll fail the "vendor CPUs don't allow
changing vlen" check that comes right after.


I'll change this design. If we need to special case the default case for every
setter to not collide with the vendor CPU check, then we're better off adding the
defaults manually in riscv_cpu_init(). The defaults there will be overwritten by
any defaults set in the cpu_inits() of the CPUs, so in the end it's the same
effect but without the extra bloat in the setters().

In fact, reading the docs we see the following note on the 'set_default' 
Property
flag:

 * Property:
 * @set_default: true if the default value should be set from @defval,
 *in which case @info->set_default_value must not be NULL
 *(if false then no default value is set by the property system
 * and the field retains whatever value it was given by instance_init).

So I believe putting the defaults in riscv_cpu_init() is covered by the intended
design.


Thanks,

Daniel





Thanks,

Daniel




+    if (cpu->cfg.vlen == 0) {
+    cpu->cfg.vlen = value;
+    return;
+    }
+
+    if (value != cpu->cfg.vlen && riscv_cpu_is_vendor(obj)) {
+    cpu_set_prop_err(cpu, name, errp);
+    error_append_hint(errp, "Current '%s' val: %u\n",
+  name, cpu->cfg.vlen);
+    return;
+    }
+
+    cpu_option_add_user_setting(name, value);
+    cpu->cfg.vlen = value;
+}
+
+static void prop_vlen_get(Object *obj, Visitor *v, const char *name,
+ void *opaque, Error **errp)
+{
+    uint16_t value = RISCV_CPU(obj)->cfg.vlen;
+
+    visit_type_uint16(v, name, &value, errp);
+}
+
+static 

Re: [PATCH 10/11] migration: Remove unnecessary usage of local Error

2024-01-02 Thread Fabiano Rosas
Avihai Horon  writes:

> According to Error API, usage of ERRP_GUARD() or a local Error instead
> of errp is needed if errp is passed to void functions, where it is later
> dereferenced to see if an error occurred.
>
> There are several places in migration.c that use local Error although it
> is not needed. Change these places to use errp directly.
>
> Signed-off-by: Avihai Horon 

Reviewed-by: Fabiano Rosas 



Re: [PATCH 09/11] migration: Fix migration_channel_read_peek() error path

2024-01-02 Thread Fabiano Rosas
Avihai Horon  writes:

> migration_channel_read_peek() calls qio_channel_readv_full() and handles
> both cases of return value == 0 and return value < 0 the same way, by
> calling error_setg() with errp. However, if return value < 0, errp is
> already set, so calling error_setg() with errp will lead to an assert.
>
> Fix it by handling these cases separately, calling error_setg() with
> errp only in return value == 0 case.
>
> Fixes: 6720c2b32725 ("migration: check magic value for deciding the mapping of channels")
> Signed-off-by: Avihai Horon 

Reviewed-by: Fabiano Rosas 



Re: [PATCH 07/11] migration/multifd: Fix leaking of Error in TLS error flow

2024-01-02 Thread Fabiano Rosas
Avihai Horon  writes:

> If there is an error in multifd TLS handshake task,
> multifd_tls_outgoing_handshake() retrieves the error with
> qio_task_propagate_error() but never frees it.
>
> Fix it by freeing the obtained Error.
>
> In addition, the error is not reported at all, so report it with
> migrate_set_error().
>
> Fixes: 29647140157a ("migration/tls: add support for multifd tls-handshake")
> Signed-off-by: Avihai Horon 

Reviewed-by: Fabiano Rosas 



Re: [PATCH 08/11] migration/multifd: Remove error_setg() in migration_ioc_process_incoming()

2024-01-02 Thread Fabiano Rosas
Avihai Horon  writes:

> If multifd_load_setup() fails in migration_ioc_process_incoming(),
> error_setg() is called with errp. This will lead to an assert because in
> that case errp already contains an error.
>
> Fix it by removing the redundant error_setg().
>
> Fixes: 6720c2b32725 ("migration: check magic value for deciding the mapping of channels")
> Signed-off-by: Avihai Horon 

Reviewed-by: Fabiano Rosas 



Re: [PATCH v3 26/33] *-user: Deprecate and disable -p pagesize

2024-01-02 Thread Philippe Mathieu-Daudé

On 2/1/24 02:58, Richard Henderson wrote:

This option controls the host page size.  From the mis-usage in
our own testsuite, this is easily confused with guest page size.

The only thing that occurs when changing the host page size is
that stuff breaks, because one cannot actually change the host
page size.  Therefore reject all but the no-op setting as part
of the deprecation process.

Reviewed-by: Warner Losh 
Signed-off-by: Richard Henderson 
---
  docs/about/deprecated.rst |  7 +++
  docs/user/main.rst|  3 ---
  bsd-user/main.c   |  9 +
  linux-user/main.c | 11 ++-
  4 files changed, 18 insertions(+), 12 deletions(-)


Reviewed-by: Philippe Mathieu-Daudé 




Re: [PATCH v3 12/33] hw/tpm: Remove HOST_PAGE_ALIGN from tpm_ppi_init

2024-01-02 Thread Philippe Mathieu-Daudé

On 2/1/24 02:57, Richard Henderson wrote:

The size of the allocation need not match the alignment.

Signed-off-by: Richard Henderson 
---
  hw/tpm/tpm_ppi.c | 3 +--
  1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/hw/tpm/tpm_ppi.c b/hw/tpm/tpm_ppi.c
index 7f74e26ec6..91eeafd53a 100644
--- a/hw/tpm/tpm_ppi.c
+++ b/hw/tpm/tpm_ppi.c
@@ -47,8 +47,7 @@ void tpm_ppi_reset(TPMPPI *tpmppi)
  void tpm_ppi_init(TPMPPI *tpmppi, MemoryRegion *m,
hwaddr addr, Object *obj)
  {
-tpmppi->buf = qemu_memalign(qemu_real_host_page_size(),
-HOST_PAGE_ALIGN(TPM_PPI_ADDR_SIZE));
+tpmppi->buf = qemu_memalign(qemu_real_host_page_size(), TPM_PPI_ADDR_SIZE);
   memory_region_init_ram_device_ptr(&tpmppi->ram, obj, "tpm-ppi",
 TPM_PPI_ADDR_SIZE, tpmppi->buf);
   vmstate_register_ram(&tpmppi->ram, DEVICE(obj));


Reviewed-by: Philippe Mathieu-Daudé 




Re: [PATCH for 8.2.1] hw/net: cadence_gem: Fix MDIO_OP_xxx values

2024-01-02 Thread Philippe Mathieu-Daudé

On 2/1/24 19:32, Heinrich Schuchardt wrote:

On 1/2/24 16:08, Philippe Mathieu-Daudé wrote:

On 2/1/24 15:18, Bin Meng wrote:

Testing upstream U-Boot with 'sifive_u' machine we see:

   => dhcp
   ethernet@1009: PHY present at 0
   Could not get PHY for ethernet@1009: addr 0
   phy_connect failed

This has been working till QEMU 8.1 but broken since QEMU 8.2.


s/till/until/?


These are synonyms. Till is more informal. No need to change.


Not obvious for non-native/fluent informal English speakers.






Fixes: 1b09eeb122aa ("hw/net/cadence_gem: use FIELD to describe PHYMNTNC register fields")
Reported-by: Heinrich Schuchardt 
Signed-off-by: Bin Meng 

---

  hw/net/cadence_gem.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)





Re: [PATCH 05/11] migration/multifd: Fix error message in multifd_recv_initial_packet()

2024-01-02 Thread Fabiano Rosas
Avihai Horon  writes:

> In multifd_recv_initial_packet(), if MultiFDInit_t->id is greater than
> the configured number of multifd channels, an irrelevant error message
> about multifd version is printed.
>
> Change the error message to a relevant one about the channel id.
>
> Signed-off-by: Avihai Horon 

Reviewed-by: Fabiano Rosas 



Re: [PATCH 04/11] migration: Remove errp parameter in migration_fd_process_incoming()

2024-01-02 Thread Fabiano Rosas
Avihai Horon  writes:

> Errp parameter in migration_fd_process_incoming() is unused.
> Remove it.
>
> Signed-off-by: Avihai Horon 

Reviewed-by: Fabiano Rosas 



Re: [PATCH 03/11] migration: Refactor migration_incoming_setup()

2024-01-02 Thread Fabiano Rosas
Avihai Horon  writes:

> Commit 6720c2b32725 ("migration: check magic value for deciding the
> mapping of channels") extracted the only code that could fail in
> migration_incoming_setup().
>
> Now migration_incoming_setup() can't fail, so refactor it to return void
> and remove errp parameter.
>
> Signed-off-by: Avihai Horon 

Reviewed-by: Fabiano Rosas 



Re: [PATCH 02/11] migration: Remove nulling of hostname in migrate_init()

2024-01-02 Thread Fabiano Rosas
Avihai Horon  writes:

> MigrationState->hostname is set to NULL in migrate_init(). This is
> redundant because it is already freed and set to NULL in
> migrade_fd_cleanup(). Remove it.
>
> Signed-off-by: Avihai Horon 

Reviewed-by: Fabiano Rosas 



Re: [PATCH v3 2/4] tests/lcitool: Refresh generated files

2024-01-02 Thread Ilya Maximets
On 1/2/24 19:19, Philippe Mathieu-Daudé wrote:
> On 12/7/23 15:07, Daniel P. Berrangé wrote:
>> On Wed, Jul 12, 2023 at 02:55:10PM +0200, Philippe Mathieu-Daudé wrote:
>>> On 12/7/23 13:12, Daniel P. Berrangé wrote:
 On Wed, Jul 12, 2023 at 01:00:38PM +0200, Philippe Mathieu-Daudé wrote:
> On 11/7/23 19:58, Daniel P. Berrangé wrote:
>> On Tue, Jul 11, 2023 at 04:49:20PM +0200, Philippe Mathieu-Daudé wrote:
>>> Refresh the generated files by running:
>>>
>>>  $ make lcitool-refresh
>>>
>>> Signed-off-by: Philippe Mathieu-Daudé 
>>> ---
>>> tests/docker/dockerfiles/debian-amd64.docker |  2 -
>>> tests/docker/dockerfiles/ubuntu2004.docker   |  2 -
>>> tests/docker/dockerfiles/ubuntu2204.docker   |  2 -
>>
>> I don't know why this would be removing xen/pmem from these files. If
>> I run 'lcitool-refresh' on current git master that doesn't happen
> 
> FTR since commit cb039ef3d9 libxdp-devel is also being changed on my
> host, similarly to libpmem-devel, so I suppose it also has some host
> specific restriction.

Yeah, many distributions are not building libxdp for non-x86
architectures today.  There are no real reasons not to, but
they just do not, because the package is relatively new.  It
will likely become available in the future.

I see this thread is a bit old.  Was a solution found for the
pmem/xen?  i.e. do I need to do something specific for libxdp
at this point?

Best regards, Ilya Maximets.

> 
>> Do you have any other local changes on top ?
>>>
>>> (I just noticed manually updating the libvirt-ci submodule is
>>>   pointless because the 'lcitool-refresh' rule does that)
>>>
> diff --git a/tests/docker/dockerfiles/ubuntu2204.docker
> b/tests/docker/dockerfiles/ubuntu2204.docker
> index 1d442cdfe6..5162927016 100644
> --- a/tests/docker/dockerfiles/ubuntu2204.docker
> +++ b/tests/docker/dockerfiles/ubuntu2204.docker
> @@ -73 +72,0 @@ RUN export DEBIAN_FRONTEND=noninteractive && \
> -  libpmem-dev \
> @@ -99 +97,0 @@ RUN export DEBIAN_FRONTEND=noninteractive && \
> -  libxen-dev \

 What architecture are you running from ? I'm suspecting it is a non
x86_64 architecture as these packages are both arch conditionalized.
>>>
>>> My host is Darwin aarch64, but how is this relevant to what we
>>> generate for guests? Are we missing specifying the target arch
>>> somewhere? (This is not the '--cross-arch' argument, because we
>>> don't want to cross-build). These dockerfiles always targeted x86_64,
>>> in particular the debian-amd64.docker one...
>>
>> This is an unfortunate design choice of lcitool.
>>
>> If you give '--cross-arch' it will always spit out the dockerfile
>> suitable for cross-compiling to that arch.
>>
>> If you omit '--cross-arch' it will always spit out the dockerfile
>> suitable for compiling on the *current* host arch.
>>
>> When we're committing the dockerfiles to qemu (or any libvirt project
>> using lcitool), we've got an implicit assumption that the non-cross
>> dockerfiles were for x86_64, and hence an implicit assumption that
>> the person who ran 'lcitool' was also on x86_64.
>>
>> This is clearly a bad design choice mistake in retrospect as people
>> are increasingly likely to be using aarch64 for day-to-day dev work.
>>
>> So I guess we need to come up with a more explicit way to say give
>> me a native container for x86_64.
>>
>> With regards,
>> Daniel
> 




Re: [PATCH for 8.2.1] hw/net: cadence_gem: Fix MDIO_OP_xxx values

2024-01-02 Thread Heinrich Schuchardt

On 1/2/24 16:08, Philippe Mathieu-Daudé wrote:

On 2/1/24 15:18, Bin Meng wrote:

Testing upstream U-Boot with 'sifive_u' machine we see:

   => dhcp
   ethernet@1009: PHY present at 0
   Could not get PHY for ethernet@1009: addr 0
   phy_connect failed

This has been working till QEMU 8.1 but broken since QEMU 8.2.


s/till/until/?


These are synonyms. Till is more informal. No need to change.




Fixes: 1b09eeb122aa ("hw/net/cadence_gem: use FIELD to describe PHYMNTNC register fields")
Reported-by: Heinrich Schuchardt 
Signed-off-by: Bin Meng 

---

  hw/net/cadence_gem.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/net/cadence_gem.c b/hw/net/cadence_gem.c
index 296bba238e..472ce9c8cf 100644
--- a/hw/net/cadence_gem.c
+++ b/hw/net/cadence_gem.c
@@ -199,8 +199,8 @@ REG32(PHYMNTNC, 0x34) /* Phy Maintenance reg */
  FIELD(PHYMNTNC, PHY_ADDR, 23, 5)
  FIELD(PHYMNTNC, OP, 28, 2)
  FIELD(PHYMNTNC, ST, 30, 2)
-#define MDIO_OP_READ    0x3
-#define MDIO_OP_WRITE   0x2
+#define MDIO_OP_READ    0x2
+#define MDIO_OP_WRITE   0x1
  REG32(RXPAUSE, 0x38) /* RX Pause Time reg */
  REG32(TXPAUSE, 0x3c) /* TX Pause Time reg */


Reviewed-by: Philippe Mathieu-Daudé 



Thank you Bin for the fix.

With the fix I was able to download a file via TFTP to U-Boot
sifive_unleashed_defconfig on the emulated board. Cf.
docs/system/riscv/sifive_u.rst.

Tested-by: Heinrich Schuchardt 



Re: [PATCH v3 2/4] tests/lcitool: Refresh generated files

2024-01-02 Thread Philippe Mathieu-Daudé

On 12/7/23 15:07, Daniel P. Berrangé wrote:

On Wed, Jul 12, 2023 at 02:55:10PM +0200, Philippe Mathieu-Daudé wrote:

On 12/7/23 13:12, Daniel P. Berrangé wrote:

On Wed, Jul 12, 2023 at 01:00:38PM +0200, Philippe Mathieu-Daudé wrote:

On 11/7/23 19:58, Daniel P. Berrangé wrote:

On Tue, Jul 11, 2023 at 04:49:20PM +0200, Philippe Mathieu-Daudé wrote:

Refresh the generated files by running:

 $ make lcitool-refresh

Signed-off-by: Philippe Mathieu-Daudé 
---
tests/docker/dockerfiles/debian-amd64.docker |  2 -
tests/docker/dockerfiles/ubuntu2004.docker   |  2 -
tests/docker/dockerfiles/ubuntu2204.docker   |  2 -


I don't know why this would be removing xen/pmem from these files. If
I run 'lcitool-refresh' on current git master that doesn't happen


FTR since commit cb039ef3d9 libxdp-devel is also being changed on my
host, similarly to libpmem-devel, so I suppose it also has some host
specific restriction.


Do you have any other local changes on top ?


(I just noticed manually updating the libvirt-ci submodule is
  pointless because the 'lcitool-refresh' rule does that)


diff --git a/tests/docker/dockerfiles/ubuntu2204.docker
b/tests/docker/dockerfiles/ubuntu2204.docker
index 1d442cdfe6..5162927016 100644
--- a/tests/docker/dockerfiles/ubuntu2204.docker
+++ b/tests/docker/dockerfiles/ubuntu2204.docker
@@ -73 +72,0 @@ RUN export DEBIAN_FRONTEND=noninteractive && \
-  libpmem-dev \
@@ -99 +97,0 @@ RUN export DEBIAN_FRONTEND=noninteractive && \
-  libxen-dev \


What architecture are you running from ? I'm suspecting it is a non
x86_64 architecture as these packages are both arch conditionalized.


My host is Darwin aarch64, but how is this relevant to what we
generate for guests? Are we missing specifying the target arch
somewhere? (This is not the '--cross-arch' argument, because we
don't want to cross-build). These dockerfiles always targeted x86_64,
in particular the debian-amd64.docker one...


This is an unfortunate design choice of lcitool.

If you give '--cross-arch' it will always spit out the dockerfile
suitable for cross-compiling to that arch.

If you omit '--cross-arch' it will always spit out the dockerfile
suitable for compiling on the *current* host arch.

When we're committing the dockerfiles to qemu (or any libvirt project
using lcitool), we've got an implicit assumption that the non-cross
dockerfiles were for x86_64, and hence an implicit assumption that
the person who ran 'lcitool' was also on x86_64.

This is clearly a bad design choice mistake in retrospect as people
are increasingly likely to be using aarch64 for day-to-day dev work.

So I guess we need to come up with a more explicit way to say give
me a native container for x86_64.

With regards,
Daniel





Re: Some thoughts and questions about CXL & MCE

2024-01-02 Thread Jonathan Cameron via
On Sat, 23 Dec 2023 00:33:43 +0800
Shiyang Ruan  wrote:

> Hi guys,
> 
> I have some thoughts and questions about CXL & MCE mechanism.

+CC qemu-devel as quite bit of this is QEMU related .

> 
> CXL type-3 devices can be used as volatile or persistent memory, so a 
> poisoned page on them should also trigger a memory failure, to let OS 
> handle process using the page and let device driver recover the page.  I 
> am now investigating this.
> 
> Currently, CXL RAS is under development.  We can now inject POISON on a 
> CXL device by qemu (qmp commands), and then `cxl list -L` could show 
> those poisoned areas.  But the POISON injection is silent, I think we 
> need a signal here to notify the OS to handle those poisoned areas when 
> injecting. 

Agreed. The emulation is far from complete.  It should kick off
the relevant event log entry additions as well under at least some
circumstances (depends whether we think we are injecting poison to be
discovered later - which it won't be because we don't check for poison
when doing reads and writes - I've not yet figured out how to do that
in QEMU). If we are using the inject poison opcode from the host OS
then we are missing the bit in 8.2.9.9.4.2 (CXL r3.1)
"In addition , the device shall add an appropriate poison creation
event to it's internal informational event log, update the event
status register and if configured, interrupt the host".
So that should do a General Media Event of type 04h - host inject poison.

For the qmp interface we should add control of whether we are injecting
poison that is intended to trigger an error now (e.g. what would result
from a scrub detecting it) or poison for detection later - either by
triggering a media scan, or by a host read / write.

If it's a scrub poison detection that we are emulating then we
should issue an uncorrectable GMR Event record with Memory Event Type
of Scrub media or maybe a 00h (Media ECC error) if we think some
other reason might cause it and transaction type 05 Media patrol scrub.

Note IIRC you can manually inject these records which will result in
appropriate events being reported in Linux they just aren't currently
hooked up to the QEMU poison injection (qmp or host interface).

If you have time to look at filling these more complex flows in that
would be great as it would make the qemu side of things easier to use.
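
For reference, today's QMP poison injection looks roughly like the following (the command is defined in qapi/cxl.json; the device path and the start/length values here are made-up examples, so check them against your own topology):

```json
{ "execute": "cxl-inject-poison",
  "arguments": {
      "path": "/machine/peripheral/cxl-mem0",
      "start": 2048,
      "length": 256
  } }
```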

> According to CXL 3.0 spec Figure 12-5, there are 2 methods 
> to send the signal: FW-First and OS-First.
> My understanding about them is:
> - FW-First method:
>a. CXL device report POISON to Firmware
>b. GHES calls CXL driver handler[1], which will handle the POISON

I'm in two minds about how to emulate the firmware first paths.
In the short term I'll get some old code I have running again that
lets us do general CPER record injection. However, we might want to
actually push the record creation into EDK2.  Meh - lets do it in
qemu first and see how bad it looks.

>c. CXL driver handler translates DPA to HPA, construct a mce 
> instance, then call mce_log() to queue this MCE (? not sure)

Yes, the last step is missing currently, I think? (I'm losing track
of some of the RAS flows).

> - OS-First method:
>a. CXL device report POISON to OS by MSI
>b. CXL driver will handle the POISON
>c. same with the c. above
> 
> So, I think:
> Firstly, and obviously, we need to add a signal when injecting POISON in 
> qemu.  For example, call `cxl_event_insert()` after injection.
Yes - create the appropriate records and add them.  However we'll need
to enable adding different causes of poison so we know whether to do this
or to rely on later queries or not.

> 
> Secondly, implement a method in CXL driver to turn POISON to MCE and 
> push it into the mce_evt_pool for decode chain to process, then 
> mce_uc_nb on this chain will finally call memory_failure().
> 
> And a question:
> How to configure the CXL device to choose FW-First or OS-First signal 
> methods (methods for qemu and bare metal if possible)?

There is an _OSC for this.  We can hook that up in QEMU but it may be
controversial to do it there rather than in EDK2.

> 
> 
> I don't fully understand the CXL spec yet (it's difficult for me), so 
> the above ideas may be immature, but I really want to figure out how we 
> can make CXL & MCE work.  I'd really appreciate it if you could help me 
> on this!
> 
> [1] 
> https://lore.kernel.org/linux-cxl/20231220-cxl-cper-v5-0-1bb8a4ca2...@intel.com/T/#u

Great if you can look at filling in the details in this area.
There are still quite a few flows we haven't fully realized in emulation
or in the kernel.

Jonathan

> 
> 
> --
> Thanks,
> Ruan




Re: [PATCH 01/11] migration: Remove migrate_max_downtime() declaration

2024-01-02 Thread Fabiano Rosas
Avihai Horon  writes:

> migrate_max_downtime() has been removed long ago, but its declaration
> was mistakenly left. Remove it.
>
> Signed-off-by: Avihai Horon 

Reviewed-by: Fabiano Rosas 



[PATCH 0/2] gitlab: Add config for Loongarch64 custom runner

2024-01-02 Thread Philippe Mathieu-Daudé
Loongson Technology Corporation Limited offered to sponsor QEMU
with a Loongarch64 custom runner.

Philippe Mathieu-Daudé (2):
  gitlab: Introduce Loongarch64 runner
  gitlab: Add Loongarch64 KVM-only build

 docs/devel/ci-jobs.rst.inc|  6 +++
 .gitlab-ci.d/custom-runners.yml   |  1 +
 .../openeuler-22.03-loongarch64.yml   | 43 +++
 3 files changed, 50 insertions(+)
 create mode 100644 .gitlab-ci.d/custom-runners/openeuler-22.03-loongarch64.yml

-- 
2.41.0




[PATCH 1/2] gitlab: Introduce Loongarch64 runner

2024-01-02 Thread Philippe Mathieu-Daudé
Full build config to run CI tests on a Loongarch64 host.

Forks might enable this by setting LOONGARCH64_RUNNER_AVAILABLE
in their CI namespace settings, see:
https://www.qemu.org/docs/master/devel/ci.html#maintainer-controlled-job-variables

Signed-off-by: Philippe Mathieu-Daudé 
---
 docs/devel/ci-jobs.rst.inc|  6 ++
 .gitlab-ci.d/custom-runners.yml   |  1 +
 .../openeuler-22.03-loongarch64.yml   | 21 +++
 3 files changed, 28 insertions(+)
 create mode 100644 .gitlab-ci.d/custom-runners/openeuler-22.03-loongarch64.yml

diff --git a/docs/devel/ci-jobs.rst.inc b/docs/devel/ci-jobs.rst.inc
index 4c39cdb2d9..b821a33112 100644
--- a/docs/devel/ci-jobs.rst.inc
+++ b/docs/devel/ci-jobs.rst.inc
@@ -189,6 +189,12 @@ used as a gitlab-CI runner, you can set this variable to 
enable the
 tests that require this kind of host. The runner should be tagged with
 both "centos_stream_8" and "x86_64".
 
+LOONGARCH64_RUNNER_AVAILABLE
+
+If you've got access to a Loongarch64 host that can be used as a gitlab-CI
+runner, you can set this variable to enable the tests that require this
+kind of host. The runner should be tagged with "loongarch64".
+
 CCACHE_DISABLE
 ~~
 The jobs are configured to use "ccache" by default since this typically
diff --git a/.gitlab-ci.d/custom-runners.yml b/.gitlab-ci.d/custom-runners.yml
index 8e5b9500f4..152ace4492 100644
--- a/.gitlab-ci.d/custom-runners.yml
+++ b/.gitlab-ci.d/custom-runners.yml
@@ -32,3 +32,4 @@ include:
   - local: '/.gitlab-ci.d/custom-runners/ubuntu-22.04-aarch64.yml'
   - local: '/.gitlab-ci.d/custom-runners/ubuntu-22.04-aarch32.yml'
   - local: '/.gitlab-ci.d/custom-runners/centos-stream-8-x86_64.yml'
+  - local: '/.gitlab-ci.d/custom-runners/openeuler-22.03-loongarch64.yml'
diff --git a/.gitlab-ci.d/custom-runners/openeuler-22.03-loongarch64.yml b/.gitlab-ci.d/custom-runners/openeuler-22.03-loongarch64.yml
new file mode 100644
index 00..86d18f820e
--- /dev/null
+++ b/.gitlab-ci.d/custom-runners/openeuler-22.03-loongarch64.yml
@@ -0,0 +1,21 @@
+openeuler-22.03-loongarch64-all:
+ extends: .custom_runner_template
+ needs: []
+ stage: build
+ tags:
+ - oe2203
+ - loongarch64
+ rules:
+ - if: '$CI_PROJECT_NAMESPACE == "qemu-project" && $CI_COMMIT_BRANCH =~ /^staging/'
+   when: manual
+   allow_failure: true
+ - if: "$LOONGARCH64_RUNNER_AVAILABLE"
+   when: manual
+   allow_failure: true
+ script:
+ - mkdir build
+ - cd build
+ - ../configure
+   || { cat config.log meson-logs/meson-log.txt; exit 1; }
+ - make --output-sync -j`nproc --ignore=40`
+ - make --output-sync -j`nproc --ignore=40` check
-- 
2.41.0




[NOTFORMERGE PATCH 2/2] gitlab: Add Loongarch64 KVM-only build

2024-01-02 Thread Philippe Mathieu-Daudé
Signed-off-by: Philippe Mathieu-Daudé 
---
Used to test 
https://lore.kernel.org/qemu-devel/20231228084051.3235354-1-zhaotian...@loongson.cn/
---
 .../openeuler-22.03-loongarch64.yml   | 22 +++
 1 file changed, 22 insertions(+)

diff --git a/.gitlab-ci.d/custom-runners/openeuler-22.03-loongarch64.yml b/.gitlab-ci.d/custom-runners/openeuler-22.03-loongarch64.yml
index 86d18f820e..60674b8d0f 100644
--- a/.gitlab-ci.d/custom-runners/openeuler-22.03-loongarch64.yml
+++ b/.gitlab-ci.d/custom-runners/openeuler-22.03-loongarch64.yml
@@ -19,3 +19,25 @@ openeuler-22.03-loongarch64-all:
|| { cat config.log meson-logs/meson-log.txt; exit 1; }
  - make --output-sync -j`nproc --ignore=40`
  - make --output-sync -j`nproc --ignore=40` check
+
+openeuler-22.03-loongarch64-kvm:
+ extends: .custom_runner_template
+ needs: []
+ stage: build
+ tags:
+ - oe2203
+ - loongarch64
+ rules:
+ - if: '$CI_PROJECT_NAMESPACE == "qemu-project" && $CI_COMMIT_BRANCH =~ /^staging/'
+   when: manual
+   allow_failure: true
+ - if: "$LOONGARCH64_RUNNER_AVAILABLE"
+   when: manual
+   allow_failure: true
+ script:
+ - mkdir build
+ - cd build
+ - ../configure --enable-kvm --disable-tcg
+   || { cat config.log meson-logs/meson-log.txt; exit 1; }
+ - make --output-sync -j`nproc --ignore=40`
+ - make --output-sync -j`nproc --ignore=40` check
-- 
2.41.0




Re: [RFC 0/3] aio-posix: call ->poll_end() when removing AioHandler

2024-01-02 Thread Hanna Czenczek

On 02.01.24 16:53, Paolo Bonzini wrote:

On Tue, Jan 2, 2024 at 4:24 PM Hanna Czenczek  wrote:

I’ve attached the preliminary patch that I didn’t get to send (or test
much) last year.  Not sure if it has the same CPU-usage-spike issue
Fiona was seeing, the only functional difference is that I notify the vq
after attaching the notifiers instead of before.

I think the patch makes sense and cleaning up the logic of aio_poll
(which is one of those functions that grew and grew without much
clarity into who did what) can be done on top.

Just one small thing, the virtio_queue_notify_vq() call is required
because the virtqueue interrupt and eventfd are edge-triggered rather
than level-triggered; so perhaps it should be placed in the
function(s) that establish the handlers,
virtio_queue_aio_attach_host_notifier() and
virtio_queue_aio_attach_host_notifier_no_poll()? Neither
virtio_blk_drained_end() nor virtio_scsi_drained_end() are
particularly special, and the comment applies just as well:

 /*
  * We will have ignored notifications about new requests from the guest
  * while handlers were not attached, so "kick" the virt queue to process
  * those requests now.
  */


I wasn’t entirely sure whether we want to call notify_vq() if we have 
instated the handlers for the first time (in which case someone ought to 
look for existing unprocessed requests anyway), so I decided to limit it 
to drained_end.


But considering that it must be safe to call notify_vq() right after 
instating the handlers (virtio_queue_host_notifier_read() may then be 
called after all), we might as well do so every time, yes.


Hanna

Re: [PATCH 00/33] hw/cpu/arm: Remove one use of qemu_get_cpu() in A7/A15 MPCore priv

2024-01-02 Thread Cédric Le Goater

On 1/2/24 17:15, Philippe Mathieu-Daudé wrote:

Hi Cédric,

On 2/1/24 15:55, Cédric Le Goater wrote:

On 12/12/23 17:29, Philippe Mathieu-Daudé wrote:

Hi,

When an MPCore cluster is used, the Cortex-A cores belong to the
cluster container, not to the board/soc layer. This series moves
the creation of vCPUs to the MPCore private container.

Doing so we consolidate the QOM model, moving common code in a
central place (abstract MPCore parent).


Changing the QOM hierarchy has an impact on the state of the machine
and some fixups are then required to maintain migration compatibility.
This can become a real headache for KVM machines like virt for which
migration compatibility is a feature, less for emulated ones.


All changes are either moving properties (which are not migrated)
or moving non-migrated QOM members (i.e. pointers of ARMCPU, which
is still migrated elsewhere). So I don't see any obvious migration
problem, but I might be missing something, so I Cc'ed Juan :>

We broke migration compatibility by moving the IC object in the QOM
hierarchy of the pseries machines in the past. These changes might
be different. Here is the QOM tree of the ast2600 SoC.

before :

  /soc (ast2600-a3)
/a7mpcore (a15mpcore_priv)
  /a15mp-priv-container[0] (memory-region)
  /gic (arm_gic)
/gic_cpu[0] (memory-region)
/gic_cpu[1] (memory-region)
/gic_cpu[2] (memory-region)
/gic_dist[0] (memory-region)
/gic_vcpu[0] (memory-region)
/gic_viface[0] (memory-region)
/gic_viface[1] (memory-region)
/gic_viface[2] (memory-region)
/cpu[0] (cortex-a7-arm-cpu)
/cpu[1] (cortex-a7-arm-cpu)

after :

  /soc (ast2600-a3)
/a7mpcore (a7mpcore_priv)
  /cpu[0] (cortex-a7-arm-cpu)
  /cpu[1] (cortex-a7-arm-cpu)
  /gic (arm_gic)
/gic_cpu[0] (memory-region)
/gic_cpu[1] (memory-region)
/gic_cpu[2] (memory-region)
/gic_dist[0] (memory-region)
  /mpcore-priv-container[0] (memory-region)


Thanks,

C.






Re: [PATCH 00/33] hw/cpu/arm: Remove one use of qemu_get_cpu() in A7/A15 MPCore priv

2024-01-02 Thread Philippe Mathieu-Daudé

Hi Cédric,

On 2/1/24 15:55, Cédric Le Goater wrote:

On 12/12/23 17:29, Philippe Mathieu-Daudé wrote:

Hi,

When an MPCore cluster is used, the Cortex-A cores belong to the
cluster container, not to the board/soc layer. This series moves
the creation of vCPUs to the MPCore private container.

Doing so we consolidate the QOM model, moving common code in a
central place (abstract MPCore parent).


Changing the QOM hierarchy has an impact on the state of the machine
and some fixups are then required to maintain migration compatibility.
This can become a real headache for KVM machines like virt for which
migration compatibility is a feature, less for emulated ones.


All changes are either moving properties (which are not migrated)
or moving non-migrated QOM members (i.e. pointers of ARMCPU, which
is still migrated elsewhere). So I don't see any obvious migration
problem, but I might be missing something, so I Cc'ed Juan :>



I don't have a good solution to propose to overcome this problem :/

C.




This eventually allows removing one qemu_get_cpu() use, which we
want to eliminate in heterogeneous machines (machines using MPCore
are candidates for heterogeneous emulation).




  include/hw/arm/aspeed_soc.h    |   5 +-
  include/hw/arm/boot.h  |   4 +-
  include/hw/arm/exynos4210.h    |   6 +-
  include/hw/arm/fsl-imx6.h  |   6 +-
  include/hw/arm/fsl-imx6ul.h    |   8 +-
  include/hw/arm/fsl-imx7.h  |   8 +-
  include/hw/arm/npcm7xx.h   |   3 +-
  include/hw/cpu/a15mpcore.h |  44 ---
  include/hw/cpu/a9mpcore.h  |  39 ---
  include/hw/cpu/cortex_mpcore.h | 135 ++




[RFC PATCH 1/5] qdev-properties: Add qdev_property_del_static()

2024-01-02 Thread Philippe Mathieu-Daudé
We can add properties with qdev_property_add_static().
Add qdev_property_del_static() to delete them.

Signed-off-by: Philippe Mathieu-Daudé 
---
 include/hw/qdev-properties.h | 2 ++
 hw/core/qdev-properties.c| 7 +++
 2 files changed, 9 insertions(+)

diff --git a/include/hw/qdev-properties.h b/include/hw/qdev-properties.h
index 09aa04ca1e..0e1930177e 100644
--- a/include/hw/qdev-properties.h
+++ b/include/hw/qdev-properties.h
@@ -225,6 +225,8 @@ void error_set_from_qdev_prop_error(Error **errp, int ret, Object *obj,
  */
 void qdev_property_add_static(DeviceState *dev, Property *prop);
 
+void qdev_property_del_static(DeviceState *dev, Property *prop);
+
 /**
  * qdev_alias_all_properties: Create aliases on source for all target 
properties
  * @target: Device which has properties to be aliased
diff --git a/hw/core/qdev-properties.c b/hw/core/qdev-properties.c
index 7d6fa726fd..0c17a5de82 100644
--- a/hw/core/qdev-properties.c
+++ b/hw/core/qdev-properties.c
@@ -994,6 +994,13 @@ void qdev_property_add_static(DeviceState *dev, Property *prop)
 }
 }
 
+void qdev_property_del_static(DeviceState *dev, Property *prop)
+{
+Object *obj = OBJECT(dev);
+
+object_property_del(obj, prop->name);
+}
+
 static void qdev_class_add_property(DeviceClass *klass, const char *name,
 Property *prop)
 {
-- 
2.41.0




Re: [PATCH v3 5/5] Rename "QEMU global mutex" to "BQL" in comments and docs

2024-01-02 Thread Cédric Le Goater

On 1/2/24 16:35, Stefan Hajnoczi wrote:

The term "QEMU global mutex" is identical to the more widely used Big
QEMU Lock ("BQL"). Update the code comments and documentation to use
"BQL" instead of "QEMU global mutex".

Signed-off-by: Stefan Hajnoczi 
Acked-by: Markus Armbruster 
Reviewed-by: Philippe Mathieu-Daudé 



Reviewed-by: Cédric Le Goater 

Thanks,

C.





[RFC PATCH 5/5] hw/arm/armv7m: Do not expose 'vfp' property if ARM CPU doesn't have it

2024-01-02 Thread Philippe Mathieu-Daudé
Since the TYPE_ARMV7M object doesn't know its CPU type at the
time armv7m_instance_init() is called, we need to prepare to
forward any CPU properties there; we can then forward them in
armv7m_realize().

But then, when introspecting at runtime, if the requested CPU
doesn't expose such properties, we still see them exposed on the
container.

It is possible to remove a meaningless property with
qdev_property_del_static(). As an example, remove the 'vfp'
property when it is not relevant.

When running the musca-a board, the monitor output changes as:

  (qemu) info qtree
...
dev: armv7m, id ""
  gpio-in "NMI" 1
  gpio-out "SYSRESETREQ" 1
  gpio-in "" 96
  clock-in "cpuclk" freq_hz=40 MHz
  clock-in "refclk" freq_hz=0 Hz
  cpu-type = "cortex-m33-arm-cpu"
  init-svtor = 270532608 (0x10200000)
  init-nsvtor = 0 (0x0)
  enable-bitband = false
  start-powered-off = false
- vfp = "false"
  dsp = false
  mpu-ns-regions = 8 (0x8)
  mpu-s-regions = 8 (0x8)

Signed-off-by: Philippe Mathieu-Daudé 
---
 hw/arm/armv7m.c | 9 -
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/hw/arm/armv7m.c b/hw/arm/armv7m.c
index 12cdad09f9..f1f40353cb 100644
--- a/hw/arm/armv7m.c
+++ b/hw/arm/armv7m.c
@@ -244,6 +244,9 @@ static const MemoryRegionOps ppb_default_ops = {
 .valid.max_access_size = 8,
 };
 
+static Property optional_arm_cpu_vfp_property =
+DEFINE_PROP_BOOL_NODEFAULT("vfp", ARMv7MState, vfp);
+
 static void armv7m_instance_init(Object *obj)
 {
 ARMv7MState *s = ARMV7M(obj);
@@ -271,6 +274,9 @@ static void armv7m_instance_init(Object *obj)
 
 s->refclk = qdev_init_clock_in(DEVICE(obj), "refclk", NULL, NULL, 0);
 s->cpuclk = qdev_init_clock_in(DEVICE(obj), "cpuclk", NULL, NULL, 0);
+
+qdev_property_add_static(DEVICE(obj),
+ &optional_arm_cpu_vfp_property);
 }
 
 static void armv7m_realize(DeviceState *dev, Error **errp)
@@ -331,6 +337,8 @@ static void armv7m_realize(DeviceState *dev, Error **errp)
 } else if (s->vfp == OPTIONAL_BOOL_TRUE) {
 error_setg(errp, "'%s' does not support VFP", s->cpu_type);
 return;
+} else {
+qdev_property_del_static(dev, &optional_arm_cpu_vfp_property);
 }
 if (object_property_find(OBJECT(s->cpu), "dsp")) {
 if (!object_property_set_bool(OBJECT(s->cpu), "dsp", s->dsp, errp)) {
@@ -551,7 +559,6 @@ static Property armv7m_properties[] = {
 DEFINE_PROP_BOOL("enable-bitband", ARMv7MState, enable_bitband, false),
 DEFINE_PROP_BOOL("start-powered-off", ARMv7MState, start_powered_off,
  false),
-DEFINE_PROP_BOOL_NODEFAULT("vfp", ARMv7MState, vfp),
 DEFINE_PROP_BOOL("dsp", ARMv7MState, dsp, true),
 DEFINE_PROP_UINT32("mpu-ns-regions", ARMv7MState, mpu_ns_regions, 
UINT_MAX),
 DEFINE_PROP_UINT32("mpu-s-regions", ARMv7MState, mpu_s_regions, UINT_MAX),
-- 
2.41.0




[RFC PATCH 3/5] hw/arm/armv7m: Convert ARMv7MState::vfp from bool to OptionalBool

2024-01-02 Thread Philippe Mathieu-Daudé
We want to know if the 'vfp' property has been initialized.
Convert it from boolean to OptionalBool (which contains an
'unset' value).

Note the monitor output is changed as:

  (qemu) info qtree
...
dev: armv7m, id ""
  gpio-in "NMI" 1
  gpio-out "SYSRESETREQ" 1
  gpio-in "" 96
  clock-in "cpuclk" freq_hz=40 MHz
  clock-in "refclk" freq_hz=0 Hz
  cpu-type = "cortex-m33-arm-cpu"
  init-svtor = 270532608 (0x10200000)
  init-nsvtor = 0 (0x0)
  enable-bitband = false
  start-powered-off = false
- vfp = false
+ vfp = "false"
  dsp = false
  mpu-ns-regions = 8 (0x8)
  mpu-s-regions = 8 (0x8)

Signed-off-by: Philippe Mathieu-Daudé 
---
 include/hw/arm/armv7m.h | 2 +-
 hw/arm/armsse.c | 2 +-
 hw/arm/armv7m.c | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/include/hw/arm/armv7m.h b/include/hw/arm/armv7m.h
index e2cebbd15c..6c9e65b644 100644
--- a/include/hw/arm/armv7m.h
+++ b/include/hw/arm/armv7m.h
@@ -105,7 +105,7 @@ struct ARMv7MState {
 uint32_t mpu_s_regions;
 bool enable_bitband;
 bool start_powered_off;
-bool vfp;
+OptionalBool vfp;
 bool dsp;
 };
 
diff --git a/hw/arm/armsse.c b/hw/arm/armsse.c
index 31acbf7347..472b16fc30 100644
--- a/hw/arm/armsse.c
+++ b/hw/arm/armsse.c
@@ -1028,7 +1028,7 @@ static void armsse_realize(DeviceState *dev, Error **errp)
 }
 }
 if (!s->cpu_fpu[i]) {
-if (!object_property_set_bool(cpuobj, "vfp", false, errp)) {
+if (!object_property_set_str(cpuobj, "vfp", "false", errp)) {
 return;
 }
 }
diff --git a/hw/arm/armv7m.c b/hw/arm/armv7m.c
index d10abb36a8..3610f6f4a1 100644
--- a/hw/arm/armv7m.c
+++ b/hw/arm/armv7m.c
@@ -548,7 +548,7 @@ static Property armv7m_properties[] = {
 DEFINE_PROP_BOOL("enable-bitband", ARMv7MState, enable_bitband, false),
 DEFINE_PROP_BOOL("start-powered-off", ARMv7MState, start_powered_off,
  false),
-DEFINE_PROP_BOOL("vfp", ARMv7MState, vfp, true),
+DEFINE_PROP_BOOL_NODEFAULT("vfp", ARMv7MState, vfp),
 DEFINE_PROP_BOOL("dsp", ARMv7MState, dsp, true),
 DEFINE_PROP_UINT32("mpu-ns-regions", ARMv7MState, mpu_ns_regions, 
UINT_MAX),
 DEFINE_PROP_UINT32("mpu-s-regions", ARMv7MState, mpu_s_regions, UINT_MAX),
-- 
2.41.0




[RFC PATCH 4/5] hw/arm/armv7m: Error if trying to set unavailable ARMCPU::vfp property

2024-01-02 Thread Philippe Mathieu-Daudé
Do not ignore an impossible configuration requested by the user.
For example, when trying to enable VFP on a Cortex-M33 we now get:

  qemu-system-arm: 'cortex-m33-arm-cpu' does not support VFP

Signed-off-by: Philippe Mathieu-Daudé 
---
 hw/arm/armv7m.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/hw/arm/armv7m.c b/hw/arm/armv7m.c
index 3610f6f4a1..12cdad09f9 100644
--- a/hw/arm/armv7m.c
+++ b/hw/arm/armv7m.c
@@ -328,6 +328,9 @@ static void armv7m_realize(DeviceState *dev, Error **errp)
 if (!object_property_set_bool(OBJECT(s->cpu), "vfp", s->vfp, errp)) {
 return;
 }
+} else if (s->vfp == OPTIONAL_BOOL_TRUE) {
+error_setg(errp, "'%s' does not support VFP", s->cpu_type);
+return;
 }
 if (object_property_find(OBJECT(s->cpu), "dsp")) {
 if (!object_property_set_bool(OBJECT(s->cpu), "dsp", s->dsp, errp)) {
-- 
2.41.0




[RFC PATCH 2/5] qdev-properties: Add OptionalBool QAPI type

2024-01-02 Thread Philippe Mathieu-Daudé
To be able to distinguish whether a boolean qdev property
has been set or not, add the DEFINE_PROP_BOOL_NODEFAULT()
qdev macro based on the tri-state OptionalBool QAPI type.

Signed-off-by: Philippe Mathieu-Daudé 
---
 qapi/common.json | 16 
 include/hw/qdev-properties.h |  5 +
 hw/core/qdev-properties.c| 10 ++
 3 files changed, 31 insertions(+)

diff --git a/qapi/common.json b/qapi/common.json
index 6fed9cde1a..884c143e2a 100644
--- a/qapi/common.json
+++ b/qapi/common.json
@@ -207,3 +207,19 @@
 ##
 { 'struct': 'HumanReadableText',
   'data': { 'human-readable-text': 'str' } }
+
+##
+# @OptionalBool:
+#
+# An enumeration of three options: true, false, and unset
+#
+# @unset: Unset (default)
+#
+# @false: False
+#
+# @true: True
+#
+# Since: 9.0
+##
+{ 'enum': 'OptionalBool',
+  'data': [ 'false', 'true', 'unset' ] }
diff --git a/include/hw/qdev-properties.h b/include/hw/qdev-properties.h
index 0e1930177e..8cf95da2c3 100644
--- a/include/hw/qdev-properties.h
+++ b/include/hw/qdev-properties.h
@@ -49,6 +49,7 @@ struct PropertyInfo {
 extern const PropertyInfo qdev_prop_bit;
 extern const PropertyInfo qdev_prop_bit64;
 extern const PropertyInfo qdev_prop_bool;
+extern const PropertyInfo qdev_prop_bool_unset;
 extern const PropertyInfo qdev_prop_enum;
 extern const PropertyInfo qdev_prop_uint8;
 extern const PropertyInfo qdev_prop_uint16;
@@ -105,6 +106,10 @@ extern const PropertyInfo qdev_prop_link;
 .set_default = true, \
 .defval.u= (bool)_defval)
 
+#define DEFINE_PROP_BOOL_NODEFAULT(_name, _state, _field) \
+DEFINE_PROP_SIGNED(_name, _state, _field, OPTIONAL_BOOL_UNSET, \
+qdev_prop_bool_unset, OptionalBool)
+
 /**
  * The DEFINE_PROP_UINT64_CHECKMASK macro checks a user-supplied value
  * against corresponding bitmask, rejects the value if it violates.
diff --git a/hw/core/qdev-properties.c b/hw/core/qdev-properties.c
index 0c17a5de82..1bec8ee679 100644
--- a/hw/core/qdev-properties.c
+++ b/hw/core/qdev-properties.c
@@ -260,6 +260,16 @@ const PropertyInfo qdev_prop_bool = {
 .set_default_value = set_default_value_bool,
 };
 
+/* --- optional bool --- */
+
+const PropertyInfo qdev_prop_bool_unset = {
+.name  = "OptionalBool",
+.enum_table  = &OptionalBool_lookup,
+.get   = qdev_propinfo_get_enum,
+.set   = qdev_propinfo_set_enum,
+.set_default_value = qdev_propinfo_set_default_value_enum,
+};
+
 /* --- 8bit integer --- */
 
 static void get_uint8(Object *obj, Visitor *v, const char *name, void *opaque,
-- 
2.41.0




[RFC PATCH 0/5] qdev-properties: Try to improve use of dynamic property introspection

2024-01-02 Thread Philippe Mathieu-Daudé
Hi,

This RFC series tries to work over some limitations exposed in
https://lore.kernel.org/qemu-devel/9293e63b-8032-4ea0-b516-9db6949fb...@linaro.org/

Eventually all QDev objects would use static QOM properties,
but in some cases we cannot. ARMv7MState is one such example,
adding properties that might end up irrelevant.

This is just an example, but thinking long term (in particular
in the context of dynamic machines) I'm looking at how this
could be improved. Thus this series. I don't much like this
current approach (more boilerplate and complexity), however it
seems to DTRT for the user.

Philippe Mathieu-Daudé (5):
  qdev-properties: Add qdev_property_del_static()
  qdev-properties: Add OptionalBool QAPI type
  hw/arm/armv7m: Convert ARMv7MState::vfp from bool to OptionalBool
  hw/arm/armv7m: Error if trying to set unavailable ARMCPU::vfp property
  hw/arm/armv7m: Do not expose 'vfp' property if ARM CPU doesn't have it

 qapi/common.json | 16 
 include/hw/arm/armv7m.h  |  2 +-
 include/hw/qdev-properties.h |  7 +++
 hw/arm/armsse.c  |  2 +-
 hw/arm/armv7m.c  | 12 +++-
 hw/core/qdev-properties.c| 17 +
 6 files changed, 53 insertions(+), 3 deletions(-)

-- 
2.41.0




Re: [PATCH v3 4/5] Replace "iothread lock" with "BQL" in comments

2024-01-02 Thread Cédric Le Goater

On 1/2/24 16:35, Stefan Hajnoczi wrote:

The term "iothread lock" is obsolete. The APIs use Big QEMU Lock (BQL)
in their names. Update the code comments to use "BQL" instead of
"iothread lock".

Signed-off-by: Stefan Hajnoczi 
Reviewed-by: Philippe Mathieu-Daudé 



Reviewed-by: Cédric Le Goater 

Thanks,

C.






Re: [RFC 0/3] aio-posix: call ->poll_end() when removing AioHandler

2024-01-02 Thread Paolo Bonzini
On Tue, Jan 2, 2024 at 4:24 PM Hanna Czenczek  wrote:
> I’ve attached the preliminary patch that I didn’t get to send (or test
> much) last year.  Not sure if it has the same CPU-usage-spike issue
> Fiona was seeing, the only functional difference is that I notify the vq
> after attaching the notifiers instead of before.

I think the patch makes sense and cleaning up the logic of aio_poll
(which is one of those functions that grew and grew without much
clarity into who did what) can be done on top.

Just one small thing, the virtio_queue_notify_vq() call is required
because the virtqueue interrupt and eventfd are edge-triggered rather
than level-triggered; so perhaps it should be placed in the
function(s) that establish the handlers,
virtio_queue_aio_attach_host_notifier() and
virtio_queue_aio_attach_host_notifier_no_poll()? Neither
virtio_blk_drained_end() nor virtio_scsi_drained_end() is
particularly special, and the comment applies just as well:

/*
 * We will have ignored notifications about new requests from the guest
 * while handlers were not attached, so "kick" the virt queue to process
 * those requests now.
 */

Paolo




Re: Re: [PATCH v13 00/26] riscv: RVA22 profiles support

2024-01-02 Thread Andrew Jones
On Tue, Jan 02, 2024 at 08:40:48AM -0300, Daniel Henrique Barboza wrote:
> Hi,
> 
> Drew brought to my attention the following post on the tech-unprivileged 
> mailing
> list:
> 
> "Architecture Review Committee meeting minutes, 12/19/23"
> https://lists.riscv.org/g/tech-unprivileged/message/611
> 
> Second paragraph mentions:
> 
> "In response to some recent discussion in the Apps and Tools HC about how 
> profiles should
> be represented in GCC/LLVM, the ARC provides this answer: compilers should 
> use a single parameter
> for an ISA string.  An ISA string begins with either a base ISA name (e.g. 
> rv64i) or a profile name
> (e.g. rva23u64) and is optionally followed by additional extensions (e.g.  
> rv64imac_zicond or
> rva23u64_zfh_zicond).  If the ISA string begins with a profile name, it is 
> equivalent to
> replacing the profile name with its mandatory base ISA and its mandatory 
> extensions; any
> optional extensions in a profile must be explicitly named if their inclusion 
> is desired.
> ISAs are sets, and concatenating strings takes the union, so redundancy is 
> legal (e.g.
> rva23u64, rva23u64_zicsr, and rva23u64_zicsr_zicsr are all valid and 
> equivalent)."
> 
> The takeaways from it:
> 
> - this implementation is compliant with how profiles are interpreted, i.e. a 
> profile is
> considered a set of the mandatory base ISA and mandatory extensions, and any 
> additional/optional
> extensions must be explicitly named;

Yes, it's good QEMU's RISC-V CPU model command line will be consistent
with the above paragraph (and then presumably with RISC-V compiler
"ISA strings")

> 
> - our ISA string format is also compliant, since we use the base ISA name + 
> extensions format already.
> This series doesn't change/add anything in this regard.
> 
> 
> If we have enough demand for it, I can do a follow-up to add support for the 
> ISA string
> profile format. I.e. this:
> 
> $ build/qemu-system-riscv64 -M virt -cpu rva22s64 (...)
> 
> # cat /proc/device-tree/cpus/cpu@0/riscv,isa
> rv64imafdc_zicbom_zicbop_zicboz_zicntr_zicsr_zifencei_zihintpause_zihpm_zfhmin_zca_zcd_zba_zbb_zbs_zkt_svinval_svpbmt
> 
> Would become this:
> 
> # cat /proc/device-tree/cpus/cpu@0/riscv,isa
> rva22s64

We can't do that. The "ISA string" referred to in the above command line
isn't the ISA string specified in "ISA Extension Naming Conventions" of
the unpriv spec, it's the string given to the compiler to tell it which
extensions it may assume when generating instructions.

Thanks,
drew



[PATCH v3 3/5] qemu/main-loop: rename qemu_cond_wait_iothread() to qemu_cond_wait_bql()

2024-01-02 Thread Stefan Hajnoczi
The name "iothread" is overloaded. Use the term Big QEMU Lock (BQL)
instead; it is already widely used and unambiguous.

Signed-off-by: Stefan Hajnoczi 
Reviewed-by: Cédric Le Goater 
Reviewed-by: Philippe Mathieu-Daudé 
---
 include/qemu/main-loop.h  | 10 +-
 accel/tcg/tcg-accel-ops-rr.c  |  4 ++--
 hw/display/virtio-gpu.c   |  2 +-
 hw/ppc/spapr_events.c |  2 +-
 system/cpu-throttle.c |  2 +-
 system/cpus.c |  4 ++--
 target/i386/nvmm/nvmm-accel-ops.c |  2 +-
 target/i386/whpx/whpx-accel-ops.c |  2 +-
 8 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/include/qemu/main-loop.h b/include/qemu/main-loop.h
index c26ad2a029..5764db157c 100644
--- a/include/qemu/main-loop.h
+++ b/include/qemu/main-loop.h
@@ -371,17 +371,17 @@ G_DEFINE_AUTOPTR_CLEANUP_FUNC(BQLLockAuto, bql_auto_unlock)
 = bql_auto_lock(__FILE__, __LINE__)
 
 /*
- * qemu_cond_wait_iothread: Wait on condition for the main loop mutex
+ * qemu_cond_wait_bql: Wait on condition for the Big QEMU Lock (BQL)
  *
- * This function atomically releases the main loop mutex and causes
+ * This function atomically releases the Big QEMU Lock (BQL) and causes
  * the calling thread to block on the condition.
  */
-void qemu_cond_wait_iothread(QemuCond *cond);
+void qemu_cond_wait_bql(QemuCond *cond);
 
 /*
- * qemu_cond_timedwait_iothread: like the previous, but with timeout
+ * qemu_cond_timedwait_bql: like the previous, but with timeout
  */
-void qemu_cond_timedwait_iothread(QemuCond *cond, int ms);
+void qemu_cond_timedwait_bql(QemuCond *cond, int ms);
 
 /* internal interfaces */
 
diff --git a/accel/tcg/tcg-accel-ops-rr.c b/accel/tcg/tcg-accel-ops-rr.c
index c4ea372a3f..5794e5a9ce 100644
--- a/accel/tcg/tcg-accel-ops-rr.c
+++ b/accel/tcg/tcg-accel-ops-rr.c
@@ -111,7 +111,7 @@ static void rr_wait_io_event(void)
 
 while (all_cpu_threads_idle()) {
 rr_stop_kick_timer();
-qemu_cond_wait_iothread(first_cpu->halt_cond);
+qemu_cond_wait_bql(first_cpu->halt_cond);
 }
 
 rr_start_kick_timer();
@@ -198,7 +198,7 @@ static void *rr_cpu_thread_fn(void *arg)
 
 /* wait for initial kick-off after machine start */
 while (first_cpu->stopped) {
-qemu_cond_wait_iothread(first_cpu->halt_cond);
+qemu_cond_wait_bql(first_cpu->halt_cond);
 
 /* process any pending work */
 CPU_FOREACH(cpu) {
diff --git a/hw/display/virtio-gpu.c b/hw/display/virtio-gpu.c
index b016d3bac8..67c5be1a4e 100644
--- a/hw/display/virtio-gpu.c
+++ b/hw/display/virtio-gpu.c
@@ -1512,7 +1512,7 @@ void virtio_gpu_reset(VirtIODevice *vdev)
 g->reset_finished = false;
 qemu_bh_schedule(g->reset_bh);
 while (!g->reset_finished) {
-qemu_cond_wait_iothread(&g->reset_cond);
+qemu_cond_wait_bql(&g->reset_cond);
 }
 } else {
 virtio_gpu_reset_bh(g);
diff --git a/hw/ppc/spapr_events.c b/hw/ppc/spapr_events.c
index deb4641505..cb0587 100644
--- a/hw/ppc/spapr_events.c
+++ b/hw/ppc/spapr_events.c
@@ -899,7 +899,7 @@ void spapr_mce_req_event(PowerPCCPU *cpu, bool recovered)
 }
 return;
 }
-qemu_cond_wait_iothread(&spapr->fwnmi_machine_check_interlock_cond);
+qemu_cond_wait_bql(&spapr->fwnmi_machine_check_interlock_cond);
 if (spapr->fwnmi_machine_check_addr == -1) {
 /*
  * If the machine was reset while waiting for the interlock,
diff --git a/system/cpu-throttle.c b/system/cpu-throttle.c
index 786a9a5639..c951a6c65e 100644
--- a/system/cpu-throttle.c
+++ b/system/cpu-throttle.c
@@ -54,7 +54,7 @@ static void cpu_throttle_thread(CPUState *cpu, run_on_cpu_data opaque)
 endtime_ns = qemu_clock_get_ns(QEMU_CLOCK_REALTIME) + sleeptime_ns;
 while (sleeptime_ns > 0 && !cpu->stop) {
 if (sleeptime_ns > SCALE_MS) {
-qemu_cond_timedwait_iothread(cpu->halt_cond,
+qemu_cond_timedwait_bql(cpu->halt_cond,
  sleeptime_ns / SCALE_MS);
 } else {
 bql_unlock();
diff --git a/system/cpus.c b/system/cpus.c
index 9b68dc9c7c..c8e2772b5f 100644
--- a/system/cpus.c
+++ b/system/cpus.c
@@ -514,12 +514,12 @@ void bql_unlock(void)
 qemu_mutex_unlock(&bql);
 }
 
-void qemu_cond_wait_iothread(QemuCond *cond)
+void qemu_cond_wait_bql(QemuCond *cond)
 {
 qemu_cond_wait(cond, &bql);
 }
 
-void qemu_cond_timedwait_iothread(QemuCond *cond, int ms)
+void qemu_cond_timedwait_bql(QemuCond *cond, int ms)
 {
 qemu_cond_timedwait(cond, &bql, ms);
 }
diff --git a/target/i386/nvmm/nvmm-accel-ops.c b/target/i386/nvmm/nvmm-accel-ops.c
index f9d5e9a37a..6b2bfd9b9c 100644
--- a/target/i386/nvmm/nvmm-accel-ops.c
+++ b/target/i386/nvmm/nvmm-accel-ops.c
@@ -48,7 +48,7 @@ static void *qemu_nvmm_cpu_thread_fn(void *arg)
 }
 }
 while (cpu_thread_is_idle(cpu)) {
-qemu_cond_wait_iothread(cpu->halt_cond);
+

[PATCH v3 4/5] Replace "iothread lock" with "BQL" in comments

2024-01-02 Thread Stefan Hajnoczi
The term "iothread lock" is obsolete. The APIs use Big QEMU Lock (BQL)
in their names. Update the code comments to use "BQL" instead of
"iothread lock".

Signed-off-by: Stefan Hajnoczi 
Reviewed-by: Philippe Mathieu-Daudé 
---
 docs/devel/reset.rst |  2 +-
 hw/display/qxl.h |  2 +-
 include/exec/cpu-common.h|  2 +-
 include/exec/memory.h|  4 ++--
 include/exec/ramblock.h  |  2 +-
 include/migration/register.h |  8 
 target/arm/internals.h   |  4 ++--
 accel/tcg/cputlb.c   |  4 ++--
 accel/tcg/tcg-accel-ops-icount.c |  2 +-
 hw/remote/mpqemu-link.c  |  2 +-
 migration/block-dirty-bitmap.c   | 10 +-
 migration/block.c| 22 +++---
 migration/colo.c |  2 +-
 migration/migration.c|  2 +-
 migration/ram.c  |  4 ++--
 system/physmem.c |  6 +++---
 target/arm/helper.c  |  2 +-
 ui/spice-core.c  |  2 +-
 util/rcu.c   |  2 +-
 audio/coreaudio.m|  4 ++--
 ui/cocoa.m   |  6 +++---
 21 files changed, 47 insertions(+), 47 deletions(-)

diff --git a/docs/devel/reset.rst b/docs/devel/reset.rst
index 38ed1790f7..d4e79718ba 100644
--- a/docs/devel/reset.rst
+++ b/docs/devel/reset.rst
@@ -19,7 +19,7 @@ Triggering reset
 
 This section documents the APIs which "users" of a resettable object should use
 to control it. All resettable control functions must be called while holding
-the iothread lock.
+the BQL.
 
 You can apply a reset to an object using ``resettable_assert_reset()``. You 
need
 to call ``resettable_release_reset()`` to release the object from reset. To
diff --git a/hw/display/qxl.h b/hw/display/qxl.h
index fdac14edad..e0a85a5ca4 100644
--- a/hw/display/qxl.h
+++ b/hw/display/qxl.h
@@ -159,7 +159,7 @@ OBJECT_DECLARE_SIMPLE_TYPE(PCIQXLDevice, PCI_QXL)
  *
  * Use with care; by the time this function returns, the returned pointer is
  * not protected by RCU anymore.  If the caller is not within an RCU critical
- * section and does not hold the iothread lock, it must have other means of
+ * section and does not hold the BQL, it must have other means of
  * protecting the pointer, such as a reference to the region that includes
  * the incoming ram_addr_t.
  *
diff --git a/include/exec/cpu-common.h b/include/exec/cpu-common.h
index 41115d8919..fef3138d29 100644
--- a/include/exec/cpu-common.h
+++ b/include/exec/cpu-common.h
@@ -92,7 +92,7 @@ RAMBlock *qemu_ram_block_by_name(const char *name);
  *
  * By the time this function returns, the returned pointer is not protected
  * by RCU anymore.  If the caller is not within an RCU critical section and
- * does not hold the iothread lock, it must have other means of protecting the
+ * does not hold the BQL, it must have other means of protecting the
  * pointer, such as a reference to the memory region that owns the RAMBlock.
  */
 RAMBlock *qemu_ram_block_from_host(void *ptr, bool round_offset,
diff --git a/include/exec/memory.h b/include/exec/memory.h
index f172e82ac9..991ab8c6e8 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -1962,7 +1962,7 @@ int memory_region_get_fd(MemoryRegion *mr);
  *
  * Use with care; by the time this function returns, the returned pointer is
  * not protected by RCU anymore.  If the caller is not within an RCU critical
- * section and does not hold the iothread lock, it must have other means of
+ * section and does not hold the BQL, it must have other means of
  * protecting the pointer, such as a reference to the region that includes
  * the incoming ram_addr_t.
  *
@@ -1979,7 +1979,7 @@ MemoryRegion *memory_region_from_host(void *ptr, ram_addr_t *offset);
  *
  * Use with care; by the time this function returns, the returned pointer is
  * not protected by RCU anymore.  If the caller is not within an RCU critical
- * section and does not hold the iothread lock, it must have other means of
+ * section and does not hold the BQL, it must have other means of
  * protecting the pointer, such as a reference to the region that includes
  * the incoming ram_addr_t.
  *
diff --git a/include/exec/ramblock.h b/include/exec/ramblock.h
index 69c6a53902..3eb79723c6 100644
--- a/include/exec/ramblock.h
+++ b/include/exec/ramblock.h
@@ -34,7 +34,7 @@ struct RAMBlock {
 ram_addr_t max_length;
 void (*resized)(const char*, uint64_t length, void *host);
 uint32_t flags;
-/* Protected by iothread lock.  */
+/* Protected by the BQL.  */
 char idstr[256];
 /* RCU-enabled, writes protected by the ramlist lock */
 QLIST_ENTRY(RAMBlock) next;
diff --git a/include/migration/register.h b/include/migration/register.h
index fed1d04a3c..9ab1f79512 100644
--- a/include/migration/register.h
+++ b/include/migration/register.h
@@ -17,7 +17,7 @@
 #include "hw/vmstate-if.h"
 
 typedef struct SaveVMHandlers {
-/* This runs inside the iothread lock. 

[PATCH v3 1/5] system/cpus: rename qemu_mutex_lock_iothread() to bql_lock()

2024-01-02 Thread Stefan Hajnoczi
The Big QEMU Lock (BQL) has many names and they are confusing. The
actual QemuMutex variable is called qemu_global_mutex but it's commonly
referred to as the BQL in discussions and some code comments. The
locking APIs, however, are called qemu_mutex_lock_iothread() and
qemu_mutex_unlock_iothread().

The "iothread" name is historic and comes from when the main thread was
split into KVM vcpu threads and the "iothread" (now called the main
loop thread). I have contributed to the confusion myself by introducing
a separate --object iothread, a separate concept unrelated to the BQL.

The "iothread" name is no longer appropriate for the BQL. Rename the
locking APIs to:
- void bql_lock(void)
- void bql_unlock(void)
- bool bql_locked(void)

There are more APIs with "iothread" in their names. Subsequent patches
will rename them. There are also comments and documentation that will be
updated in later patches.

Signed-off-by: Stefan Hajnoczi 
Reviewed-by: Paul Durrant 
Acked-by: Fabiano Rosas 
Acked-by: David Woodhouse 
Reviewed-by: Cédric Le Goater 
Acked-by: Peter Xu 
Acked-by: Eric Farman 
Reviewed-by: Harsh Prateek Bora 
Acked-by: Hyman Huang 
---
 include/block/aio-wait.h |   2 +-
 include/qemu/main-loop.h |  39 +
 include/qemu/thread.h|   2 +-
 accel/accel-blocker.c|  10 +--
 accel/dummy-cpus.c   |   8 +-
 accel/hvf/hvf-accel-ops.c|   4 +-
 accel/kvm/kvm-accel-ops.c|   4 +-
 accel/kvm/kvm-all.c  |  22 ++---
 accel/tcg/cpu-exec.c |  26 +++---
 accel/tcg/cputlb.c   |  16 ++--
 accel/tcg/tcg-accel-ops-icount.c |   4 +-
 accel/tcg/tcg-accel-ops-mttcg.c  |  12 +--
 accel/tcg/tcg-accel-ops-rr.c |  14 ++--
 accel/tcg/tcg-accel-ops.c|   2 +-
 accel/tcg/translate-all.c|   2 +-
 cpu-common.c |   4 +-
 dump/dump.c  |   4 +-
 hw/core/cpu-common.c |   6 +-
 hw/i386/intel_iommu.c|   6 +-
 hw/i386/kvm/xen_evtchn.c |  16 ++--
 hw/i386/kvm/xen_overlay.c|   2 +-
 hw/i386/kvm/xen_xenstore.c   |   2 +-
 hw/intc/arm_gicv3_cpuif.c|   2 +-
 hw/intc/s390_flic.c  |  18 ++--
 hw/misc/edu.c|   4 +-
 hw/misc/imx6_src.c   |   2 +-
 hw/misc/imx7_src.c   |   2 +-
 hw/net/xen_nic.c |   8 +-
 hw/ppc/pegasos2.c|   2 +-
 hw/ppc/ppc.c |   4 +-
 hw/ppc/spapr.c   |   2 +-
 hw/ppc/spapr_rng.c   |   4 +-
 hw/ppc/spapr_softmmu.c   |   4 +-
 hw/remote/mpqemu-link.c  |  20 ++---
 hw/remote/vfio-user-obj.c|   2 +-
 hw/s390x/s390-skeys.c|   2 +-
 migration/block-dirty-bitmap.c   |   4 +-
 migration/block.c|  16 ++--
 migration/colo.c |  60 +++---
 migration/dirtyrate.c|  12 +--
 migration/migration.c|  52 ++--
 migration/ram.c  |  12 +--
 replay/replay-internal.c |   2 +-
 semihosting/console.c|   8 +-
 stubs/iothread-lock.c|   6 +-
 system/cpu-throttle.c|   4 +-
 system/cpus.c|  51 ++--
 system/dirtylimit.c  |   4 +-
 system/memory.c  |   2 +-
 system/physmem.c |   8 +-
 system/runstate.c|   2 +-
 system/watchpoint.c  |   4 +-
 target/arm/arm-powerctl.c|  14 ++--
 target/arm/helper.c  |   4 +-
 target/arm/hvf/hvf.c |   8 +-
 target/arm/kvm.c |   8 +-
 target/arm/ptw.c |   6 +-
 target/arm/tcg/helper-a64.c  |   8 +-
 target/arm/tcg/m_helper.c|   6 +-
 target/arm/tcg/op_helper.c   |  24 +++---
 target/arm/tcg/psci.c|   2 +-
 target/hppa/int_helper.c |   8 +-
 target/i386/hvf/hvf.c|   6 +-
 target/i386/kvm/hyperv.c |   4 +-
 target/i386/kvm/kvm.c|  28 +++
 target/i386/kvm/xen-emu.c|  14 ++--
 target/i386/nvmm/nvmm-accel-ops.c|   4 +-
 target/i386/nvmm/nvmm-all.c  |  20 ++---
 target/i386/tcg/sysemu/fpu_helper.c  |   6 +-
 target/i386/tcg/sysemu/misc_helper.c |   4 +-
 target/i386/whpx/whpx-accel-ops.c|   4 +-
 target/i386/whpx/whpx-all.c  |  24 +++---
 target/loongarch/csr_helper.c|   4 +-
 target/mips/kvm.c|   4 +-
 target/mips/tcg/sysemu/cp0_helper.c  |   4 +-
 target/openrisc/sys_helper.c |  16 ++--
 target/ppc/excp_helper.c |  12 +--
 target/ppc/kvm.c |   4 +-
 target/ppc/misc_helper.c |   8 +-
 target/ppc/timebase_helper.c

[PATCH v3 5/5] Rename "QEMU global mutex" to "BQL" in comments and docs

2024-01-02 Thread Stefan Hajnoczi
The term "QEMU global mutex" is identical to the more widely used Big
QEMU Lock ("BQL"). Update the code comments and documentation to use
"BQL" instead of "QEMU global mutex".

Signed-off-by: Stefan Hajnoczi 
Acked-by: Markus Armbruster 
Reviewed-by: Philippe Mathieu-Daudé 
---
 docs/devel/multi-thread-tcg.rst   |  7 +++
 docs/devel/qapi-code-gen.rst  |  2 +-
 docs/devel/replay.rst |  2 +-
 docs/devel/multiple-iothreads.txt | 14 +++---
 include/block/blockjob.h  |  6 +++---
 include/io/task.h |  2 +-
 include/qemu/coroutine-core.h |  2 +-
 include/qemu/coroutine.h  |  2 +-
 hw/block/dataplane/virtio-blk.c   |  8 
 hw/block/virtio-blk.c |  2 +-
 hw/scsi/virtio-scsi-dataplane.c   |  6 +++---
 net/tap.c |  2 +-
 12 files changed, 27 insertions(+), 28 deletions(-)

diff --git a/docs/devel/multi-thread-tcg.rst b/docs/devel/multi-thread-tcg.rst
index c9541a7b20..7302c3bf53 100644
--- a/docs/devel/multi-thread-tcg.rst
+++ b/docs/devel/multi-thread-tcg.rst
@@ -226,10 +226,9 @@ instruction. This could be a future optimisation.
 Emulated hardware state
 ---
 
-Currently thanks to KVM work any access to IO memory is automatically
-protected by the global iothread mutex, also known as the BQL (Big
-QEMU Lock). Any IO region that doesn't use global mutex is expected to
-do its own locking.
+Currently thanks to KVM work any access to IO memory is automatically protected
+by the BQL (Big QEMU Lock). Any IO region that doesn't use the BQL is expected
+to do its own locking.
 
 However IO memory isn't the only way emulated hardware state can be
 modified. Some architectures have model specific registers that
diff --git a/docs/devel/qapi-code-gen.rst b/docs/devel/qapi-code-gen.rst
index 7f78183cd4..ea8228518c 100644
--- a/docs/devel/qapi-code-gen.rst
+++ b/docs/devel/qapi-code-gen.rst
@@ -594,7 +594,7 @@ blocking the guest and other background operations.
 Coroutine safety can be hard to prove, similar to thread safety.  Common
 pitfalls are:
 
-- The global mutex isn't held across ``qemu_coroutine_yield()``, so
+- The BQL isn't held across ``qemu_coroutine_yield()``, so
   operations that used to assume that they execute atomically may have
   to be more careful to protect against changes in the global state.
 
diff --git a/docs/devel/replay.rst b/docs/devel/replay.rst
index 0244be8b9c..effd856f0c 100644
--- a/docs/devel/replay.rst
+++ b/docs/devel/replay.rst
@@ -184,7 +184,7 @@ modes.
 Reading and writing requests are created by CPU thread of QEMU. Later these
 requests proceed to block layer which creates "bottom halves". Bottom
 halves consist of callback and its parameters. They are processed when
-main loop locks the global mutex. These locks are not synchronized with
+main loop locks the BQL. These locks are not synchronized with
 replaying process because main loop also processes the events that do not
 affect the virtual machine state (like user interaction with monitor).
 
diff --git a/docs/devel/multiple-iothreads.txt b/docs/devel/multiple-iothreads.txt
index 4865196bde..de85767b12 100644
--- a/docs/devel/multiple-iothreads.txt
+++ b/docs/devel/multiple-iothreads.txt
@@ -5,7 +5,7 @@ the COPYING file in the top-level directory.
 
 
 This document explains the IOThread feature and how to write code that runs
-outside the QEMU global mutex.
+outside the BQL.
 
 The main loop and IOThreads
 ---------------------------
@@ -29,13 +29,13 @@ scalability bottleneck on hosts with many CPUs.  Work can be spread across
 several IOThreads instead of just one main loop.  When set up correctly this
 can improve I/O latency and reduce jitter seen by the guest.
 
-The main loop is also deeply associated with the QEMU global mutex, which is a
-scalability bottleneck in itself.  vCPU threads and the main loop use the QEMU
-global mutex to serialize execution of QEMU code.  This mutex is necessary
-because a lot of QEMU's code historically was not thread-safe.
+The main loop is also deeply associated with the BQL, which is a
+scalability bottleneck in itself.  vCPU threads and the main loop use the BQL
+to serialize execution of QEMU code.  This mutex is necessary because a lot of
+QEMU's code historically was not thread-safe.
 
 The fact that all I/O processing is done in a single main loop and that the
-QEMU global mutex is contended by all vCPU threads and the main loop explain
+BQL is contended by all vCPU threads and the main loop explain
 why it is desirable to place work into IOThreads.
 
 The experimental virtio-blk data-plane implementation has been benchmarked and
@@ -66,7 +66,7 @@ There are several old APIs that use the main loop AioContext:
 
 Since they implicitly work on the main loop they cannot be used in code that
 runs in an IOThread.  They might cause a crash or deadlock if called from an
-IOThread since the QEMU global mutex is not held.
+IOThread since the BQL is not held.
 
 

[PATCH v3 0/5] Make Big QEMU Lock naming consistent

2024-01-02 Thread Stefan Hajnoczi
v3:
- Rebase
- Define bql_lock() macro on a single line [Akihiko Odaki]
v2:
- Rename APIs bql_*() [PeterX]
- Spell out "Big QEMU Lock (BQL)" in doc comments [PeterX]
- Rename "iolock" variables in hw/remote/mpqemu-link.c [Harsh]
- Fix bql_auto_lock() indentation in Patch 2 [Ilya]
- "with BQL taken" -> "with the BQL taken" [Philippe]
- "under BQL" -> "under the BQL" [Philippe]

The Big QEMU Lock ("BQL") has two other names: "iothread lock" and "QEMU global
mutex". The term "iothread lock" is easily confused with the unrelated --object
iothread (iothread.c).

This series updates the code and documentation to consistently use "BQL". This
makes the code easier to understand.

Stefan Hajnoczi (5):
  system/cpus: rename qemu_mutex_lock_iothread() to bql_lock()
  qemu/main-loop: rename QEMU_IOTHREAD_LOCK_GUARD to BQL_LOCK_GUARD
  qemu/main-loop: rename qemu_cond_wait_iothread() to
qemu_cond_wait_bql()
  Replace "iothread lock" with "BQL" in comments
  Rename "QEMU global mutex" to "BQL" in comments and docs

 docs/devel/multi-thread-tcg.rst  |   7 +-
 docs/devel/qapi-code-gen.rst |   2 +-
 docs/devel/replay.rst|   2 +-
 docs/devel/reset.rst |   2 +-
 docs/devel/multiple-iothreads.txt|  14 ++--
 hw/display/qxl.h |   2 +-
 include/block/aio-wait.h |   2 +-
 include/block/blockjob.h |   6 +-
 include/exec/cpu-common.h|   2 +-
 include/exec/memory.h|   4 +-
 include/exec/ramblock.h  |   2 +-
 include/io/task.h|   2 +-
 include/migration/register.h |   8 +-
 include/qemu/coroutine-core.h|   2 +-
 include/qemu/coroutine.h |   2 +-
 include/qemu/main-loop.h |  68 ---
 include/qemu/thread.h|   2 +-
 target/arm/internals.h   |   4 +-
 accel/accel-blocker.c|  10 +--
 accel/dummy-cpus.c   |   8 +-
 accel/hvf/hvf-accel-ops.c|   4 +-
 accel/kvm/kvm-accel-ops.c|   4 +-
 accel/kvm/kvm-all.c  |  22 ++---
 accel/tcg/cpu-exec.c |  26 +++---
 accel/tcg/cputlb.c   |  20 ++---
 accel/tcg/tcg-accel-ops-icount.c |   6 +-
 accel/tcg/tcg-accel-ops-mttcg.c  |  12 +--
 accel/tcg/tcg-accel-ops-rr.c |  18 ++--
 accel/tcg/tcg-accel-ops.c|   2 +-
 accel/tcg/translate-all.c|   2 +-
 cpu-common.c |   4 +-
 dump/dump.c  |   4 +-
 hw/block/dataplane/virtio-blk.c  |   8 +-
 hw/block/virtio-blk.c|   2 +-
 hw/core/cpu-common.c |   6 +-
 hw/display/virtio-gpu.c  |   2 +-
 hw/i386/intel_iommu.c|   6 +-
 hw/i386/kvm/xen_evtchn.c |  30 +++
 hw/i386/kvm/xen_gnttab.c |   2 +-
 hw/i386/kvm/xen_overlay.c|   2 +-
 hw/i386/kvm/xen_xenstore.c   |   2 +-
 hw/intc/arm_gicv3_cpuif.c|   2 +-
 hw/intc/s390_flic.c  |  18 ++--
 hw/mips/mips_int.c   |   2 +-
 hw/misc/edu.c|   4 +-
 hw/misc/imx6_src.c   |   2 +-
 hw/misc/imx7_src.c   |   2 +-
 hw/net/xen_nic.c |   8 +-
 hw/ppc/pegasos2.c|   2 +-
 hw/ppc/ppc.c |   6 +-
 hw/ppc/spapr.c   |   2 +-
 hw/ppc/spapr_events.c|   2 +-
 hw/ppc/spapr_rng.c   |   4 +-
 hw/ppc/spapr_softmmu.c   |   4 +-
 hw/remote/mpqemu-link.c  |  22 ++---
 hw/remote/vfio-user-obj.c|   2 +-
 hw/s390x/s390-skeys.c|   2 +-
 hw/scsi/virtio-scsi-dataplane.c  |   6 +-
 migration/block-dirty-bitmap.c   |  14 ++--
 migration/block.c|  38 -
 migration/colo.c |  62 +++---
 migration/dirtyrate.c|  12 +--
 migration/migration.c|  54 ++--
 migration/ram.c  |  16 ++--
 net/tap.c|   2 +-
 replay/replay-internal.c |   2 +-
 semihosting/console.c|   8 +-
 stubs/iothread-lock.c|   6 +-
 system/cpu-throttle.c|   6 +-
 system/cpus.c|  55 +++--
 system/dirtylimit.c  |   4 +-
 system/memory.c  |   2 +-
 system/physmem.c |  14 ++--
 system/runstate.c|   2 +-
 system/watchpoint.c  |   4 +-
 target/arm/arm-powerctl.c|  14 ++--
 target/arm/helper.c  |   6 +-
 target/arm/hvf/hvf.c |   8 +-
 target/arm/kvm.c |   8 +-
 target/arm/ptw.c |   6 +-
 target/arm/tcg/helper-a64.c  |   8 +-
 target/arm/tcg/m_helper.c|   6 +-
 target/arm/tcg/op_helper.c 

[PATCH v3 2/5] qemu/main-loop: rename QEMU_IOTHREAD_LOCK_GUARD to BQL_LOCK_GUARD

2024-01-02 Thread Stefan Hajnoczi
The name "iothread" is overloaded. Use the term Big QEMU Lock (BQL)
instead; it is already widely used and unambiguous.

Signed-off-by: Stefan Hajnoczi 
Reviewed-by: Paul Durrant 
Acked-by: David Woodhouse 
Reviewed-by: Cédric Le Goater 
Acked-by: Ilya Leoshkevich 
---
 include/qemu/main-loop.h  | 19 +--
 hw/i386/kvm/xen_evtchn.c  | 14 +++---
 hw/i386/kvm/xen_gnttab.c  |  2 +-
 hw/mips/mips_int.c|  2 +-
 hw/ppc/ppc.c  |  2 +-
 target/i386/kvm/xen-emu.c |  2 +-
 target/ppc/excp_helper.c  |  2 +-
 target/ppc/helper_regs.c  |  2 +-
 target/riscv/cpu_helper.c |  4 ++--
 9 files changed, 24 insertions(+), 25 deletions(-)

diff --git a/include/qemu/main-loop.h b/include/qemu/main-loop.h
index 72ebc0cb3a..c26ad2a029 100644
--- a/include/qemu/main-loop.h
+++ b/include/qemu/main-loop.h
@@ -343,33 +343,32 @@ void bql_lock_impl(const char *file, int line);
 void bql_unlock(void);
 
 /**
- * QEMU_IOTHREAD_LOCK_GUARD
+ * BQL_LOCK_GUARD
  *
  * Wrap a block of code in a conditional bql_{lock,unlock}.
  */
-typedef struct IOThreadLockAuto IOThreadLockAuto;
+typedef struct BQLLockAuto BQLLockAuto;
 
-static inline IOThreadLockAuto *qemu_iothread_auto_lock(const char *file,
-int line)
+static inline BQLLockAuto *bql_auto_lock(const char *file, int line)
 {
 if (bql_locked()) {
 return NULL;
 }
 bql_lock_impl(file, line);
 /* Anything non-NULL causes the cleanup function to be called */
-return (IOThreadLockAuto *)(uintptr_t)1;
+return (BQLLockAuto *)(uintptr_t)1;
 }
 
-static inline void qemu_iothread_auto_unlock(IOThreadLockAuto *l)
+static inline void bql_auto_unlock(BQLLockAuto *l)
 {
 bql_unlock();
 }
 
-G_DEFINE_AUTOPTR_CLEANUP_FUNC(IOThreadLockAuto, qemu_iothread_auto_unlock)
+G_DEFINE_AUTOPTR_CLEANUP_FUNC(BQLLockAuto, bql_auto_unlock)
 
-#define QEMU_IOTHREAD_LOCK_GUARD() \
-g_autoptr(IOThreadLockAuto) _iothread_lock_auto __attribute__((unused)) \
-= qemu_iothread_auto_lock(__FILE__, __LINE__)
+#define BQL_LOCK_GUARD() \
+g_autoptr(BQLLockAuto) _bql_lock_auto __attribute__((unused)) \
+= bql_auto_lock(__FILE__, __LINE__)
 
 /*
  * qemu_cond_wait_iothread: Wait on condition for the main loop mutex
diff --git a/hw/i386/kvm/xen_evtchn.c b/hw/i386/kvm/xen_evtchn.c
index d7d15cfaf7..bd077eda6d 100644
--- a/hw/i386/kvm/xen_evtchn.c
+++ b/hw/i386/kvm/xen_evtchn.c
@@ -1127,7 +1127,7 @@ int xen_evtchn_reset_op(struct evtchn_reset *reset)
 return -ESRCH;
 }
 
-QEMU_IOTHREAD_LOCK_GUARD();
+BQL_LOCK_GUARD();
 return xen_evtchn_soft_reset();
 }
 
@@ -1145,7 +1145,7 @@ int xen_evtchn_close_op(struct evtchn_close *close)
 return -EINVAL;
 }
 
-QEMU_IOTHREAD_LOCK_GUARD();
+BQL_LOCK_GUARD();
 qemu_mutex_lock(&s->port_lock);
 
 ret = close_port(s, close->port, &flush_kvm_routes);
@@ -1272,7 +1272,7 @@ int xen_evtchn_bind_pirq_op(struct evtchn_bind_pirq *pirq)
 return -EINVAL;
 }
 
-QEMU_IOTHREAD_LOCK_GUARD();
+BQL_LOCK_GUARD();
 
 if (s->pirq[pirq->pirq].port) {
 return -EBUSY;
@@ -1824,7 +1824,7 @@ int xen_physdev_map_pirq(struct physdev_map_pirq *map)
 return -ENOTSUP;
 }
 
-QEMU_IOTHREAD_LOCK_GUARD();
+BQL_LOCK_GUARD();
 QEMU_LOCK_GUARD(&s->port_lock);
 
 if (map->domid != DOMID_SELF && map->domid != xen_domid) {
@@ -1884,7 +1884,7 @@ int xen_physdev_unmap_pirq(struct physdev_unmap_pirq *unmap)
 return -EINVAL;
 }
 
-QEMU_IOTHREAD_LOCK_GUARD();
+BQL_LOCK_GUARD();
 qemu_mutex_lock(&s->port_lock);
 
 if (!pirq_inuse(s, pirq)) {
@@ -1924,7 +1924,7 @@ int xen_physdev_eoi_pirq(struct physdev_eoi *eoi)
 return -ENOTSUP;
 }
 
-QEMU_IOTHREAD_LOCK_GUARD();
+BQL_LOCK_GUARD();
 QEMU_LOCK_GUARD(&s->port_lock);
 
 if (!pirq_inuse(s, pirq)) {
@@ -1956,7 +1956,7 @@ int xen_physdev_query_pirq(struct physdev_irq_status_query *query)
 return -ENOTSUP;
 }
 
-QEMU_IOTHREAD_LOCK_GUARD();
+BQL_LOCK_GUARD();
 QEMU_LOCK_GUARD(&s->port_lock);
 
 if (!pirq_inuse(s, pirq)) {
diff --git a/hw/i386/kvm/xen_gnttab.c b/hw/i386/kvm/xen_gnttab.c
index 0a24f53f20..d9477ae927 100644
--- a/hw/i386/kvm/xen_gnttab.c
+++ b/hw/i386/kvm/xen_gnttab.c
@@ -176,7 +176,7 @@ int xen_gnttab_map_page(uint64_t idx, uint64_t gfn)
 return -EINVAL;
 }
 
-QEMU_IOTHREAD_LOCK_GUARD();
+BQL_LOCK_GUARD();
 QEMU_LOCK_GUARD(&s->gnt_lock);
 
 xen_overlay_do_map_page(&s->gnt_aliases[idx], gpa);
diff --git a/hw/mips/mips_int.c b/hw/mips/mips_int.c
index 6c32e466a3..eef2fd2cd1 100644
--- a/hw/mips/mips_int.c
+++ b/hw/mips/mips_int.c
@@ -36,7 +36,7 @@ static void cpu_mips_irq_request(void *opaque, int irq, int level)
 return;
 }
 
-QEMU_IOTHREAD_LOCK_GUARD();
+BQL_LOCK_GUARD();
 
 if (level) {
 env->CP0_Cause |= 1 << (irq + CP0Ca_IP);
diff --git a/hw/ppc/ppc.c b/hw/ppc/ppc.c
index b6581c16fc..7387b5b677 100644

Re: [PATCH v2 1/5] system/cpus: rename qemu_mutex_lock_iothread() to bql_lock()

2024-01-02 Thread Stefan Hajnoczi
On Wed, Dec 13, 2023 at 03:37:00PM +0900, Akihiko Odaki wrote:
> On 2023/12/13 0:39, Stefan Hajnoczi wrote:
> > @@ -312,58 +312,58 @@ bool qemu_in_main_thread(void);
> >   } while (0)
> >   /**
> > - * qemu_mutex_lock_iothread: Lock the main loop mutex.
> > + * bql_lock: Lock the Big QEMU Lock (BQL).
> >*
> > - * This function locks the main loop mutex.  The mutex is taken by
> > + * This function locks the Big QEMU Lock (BQL).  The lock is taken by
> >* main() in vl.c and always taken except while waiting on
> > - * external events (such as with select).  The mutex should be taken
> > + * external events (such as with select).  The lock should be taken
> >* by threads other than the main loop thread when calling
> >* qemu_bh_new(), qemu_set_fd_handler() and basically all other
> >* functions documented in this file.
> >*
> > - * NOTE: tools currently are single-threaded and qemu_mutex_lock_iothread
> > + * NOTE: tools currently are single-threaded and bql_lock
> >* is a no-op there.
> >*/
> > -#define qemu_mutex_lock_iothread()  \
> > -qemu_mutex_lock_iothread_impl(__FILE__, __LINE__)
> > -void qemu_mutex_lock_iothread_impl(const char *file, int line);
> > +#define bql_lock()  \
> > +bql_lock_impl(__FILE__, __LINE__)
> 
> This line break is no longer necessary.

Will fix in v3.

Stefan


signature.asc
Description: PGP signature


Re: [PATCH 5/6] nbd/server: only traverse NBDExport->clients from main loop thread

2024-01-02 Thread Eric Blake
On Wed, Dec 20, 2023 at 08:49:02PM -0500, Stefan Hajnoczi wrote:
> The NBD clients list is currently accessed from both the export
> AioContext and the main loop thread. When the AioContext lock is removed
> there will be nothing protecting the clients list.
> 
> Adding a lock around the clients list is tricky because NBDClient
> structs are refcounted and may be freed from the export AioContext or
> the main loop thread. nbd_export_request_shutdown() -> client_close() ->
> nbd_client_put() is also tricky because the list lock would be held
> while indirectly dropping references to NDBClients.
> 
> A simpler approach is to only allow nbd_client_put() and client_close()
> calls from the main loop thread. Then the NBD clients list is only
> accessed from the main loop thread and no fancy locking is needed.
> 
> nbd_trip() just needs to reschedule itself in the main loop AioContext
> before calling nbd_client_put() and client_close(). This costs more CPU
> cycles per NBD request but is needed for thread-safety when the
> AioContext lock is removed.

Late review (now that this is already in), but this is a bit
misleading.  The CPU penalty is only incurred for NBD_CMD_DISC or
after detection of a protocol error (that is, only when the connection
is being shut down), and not on every NBD request.

Thanks for working on this!

> 
> Note that nbd_client_get() can still be called from either thread, so
> make NBDClient->refcount atomic.
> 
> Signed-off-by: Stefan Hajnoczi 
> ---
>  nbd/server.c | 23 ---
>  1 file changed, 20 insertions(+), 3 deletions(-)
 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization:  qemu.org | libguestfs.org




Re: [PATCH 1/8] vga: remove unused macros

2024-01-02 Thread Paolo Bonzini
On Sun, Dec 31, 2023 at 5:01 PM BALATON Zoltan  wrote:
>
> On Sun, 31 Dec 2023, Paolo Bonzini wrote:
> > Signed-off-by: Paolo Bonzini 
> > ---
> > hw/display/vga.c | 14 --
> > 1 file changed, 14 deletions(-)
> >
> > diff --git a/hw/display/vga.c b/hw/display/vga.c
> > index 37557c3442a..18d966ecd3e 100644
> > --- a/hw/display/vga.c
> > +++ b/hw/display/vga.c
> > @@ -103,12 +103,6 @@ const uint8_t gr_mask[16] = {
> > #define PAT(x) (x)
> > #endif
>
> While at it you could also move this definition down to where it's used so
> it's clear where it's needed.

In fact PAT() can be replaced with const_le32(), and GET_PLANE() can
use a byte swap, which is just as fast as tweaking the shift count these
days[1]:

#define GET_PLANE(data, p) ((cpu_to_le32(data) >> ((p) * 8)) & 0xff)

and it can even be inlined

Paolo

[1] see following instructions, only sparc is missing:

arm/aarch64 - rev
i386 - bswapl
loongarch - revb.2w
mips - wsbh + rotr
ppc - brw
riscv - rev8 ( + srli on riscv64)
s390 - lrvgr




Re: [RFC 0/3] aio-posix: call ->poll_end() when removing AioHandler

2024-01-02 Thread Hanna Czenczek

On 13.12.23 22:15, Stefan Hajnoczi wrote:

Hanna and Fiona encountered a bug in aio_set_fd_handler(): there is no matching
io_poll_end() call upon removing an AioHandler when io_poll_begin() was
previously called. The missing io_poll_end() call leaves virtqueue
notifications disabled and the virtqueue's ioeventfd will never become
readable anymore.

The details of how virtio-scsi devices using IOThreads can hang after
hotplug/unplug are covered here:
https://issues.redhat.com/browse/RHEL-3934

Hanna is currently away over the December holidays. I'm sending these RFC
patches in the meantime. They demonstrate running aio_set_fd_handler() in the
AioContext home thread and adding the missing io_poll_end() call.


I agree with Paolo that if the handlers are removed, the caller probably 
isn’t interested in notifications: In our specific case, we remove the 
handlers because the device is to be drained, so we won’t poll the 
virtqueue anyway.  Therefore, if io_poll_end() is to be called to 
complete the start-end pair, it shouldn’t be done when the handlers are 
removed, but instead when they are reinstated.


I believe that’s quite infeasible to do in generic code: We’d have to 
basically remember that we haven’t called a specific io_poll_end 
callback yet, and so once it is reinstated while we’re not in a polling 
phase, we would need to call it then.  That in itself is already hairy, 
but in addition, the callback consists of a function pointer and an 
opaque pointer, and it seems impossible to reliably establish identity 
between two opaque pointers to know whether a newly instated io_poll_end 
callback is the same one as one that had been removed before (pointer 
identity may work, but also may not).


That’s why I think the responsibility should fall on the caller.  I 
believe both virtio_queue_aio_attach_host_notifier*() functions should 
enable vq notifications before instating the handler – if we’re in 
polling mode, io_poll_start() will then immediately be called and 
disable vq notifications again.  Having them enabled briefly shouldn’t 
hurt anything but performance (which I don’t think is terrible in the 
drain case).  For completeness’ sake, we may even have 
virtio_queue_aio_detach_host_notifier() disable vq notifications, 
because otherwise it’s unknown whether notifications are enabled or 
disabled after removing the notifiers.  That isn’t terrible, but I think 
(A) if any of the two, we want them to be disabled, because we won’t 
check for requests anyway, and (B) this gives us a clearer state 
machine, where removing the notifiers will always leave vq notifications 
disabled, so when they are reinstated (i.e. when calling 
virtio_queue_aio_attach_host_notifier*()), it’s clear that we must poll 
once to check for new requests.


I’ve attached the preliminary patch that I didn’t get to send (or test 
much) last year.  Not sure if it has the same CPU-usage-spike issue 
Fiona was seeing, the only functional difference is that I notify the vq 
after attaching the notifiers instead of before.


Hanna

From 451aae74fc19a6ea5cd6381247cd9202571651e8 Mon Sep 17 00:00:00 2001
From: Hanna Czenczek 
Date: Wed, 6 Dec 2023 18:24:55 +0100
Subject: [PATCH] Keep notifications disabled during drain

Preliminary patch with a preliminary description: During drain, we do
not care about virtqueue notifications, which is why we remove the
handlers on it.  When removing those handlers, whether vq notifications
are enabled or not depends on whether we were in polling mode or not; if
not, they are enabled (by default); if so, they have been disabled by
the io_poll_start callback.

Ideally, we would rather have the vq notifications be disabled, because
we will not process requests during a drained section anyway.
Therefore, have virtio_queue_aio_detach_host_notifier() explicitly
disable them, and virtio_queue_aio_attach_host_notifier*() re-enable
them (if we are in a polling section, attaching them will invoke the
io_poll_start callback, which will re-disable them).

Because we will miss virtqueue updates in the drained section, we also
need to poll the virtqueue once in the drained_end functions after
attaching the notifiers.
---
 include/block/aio.h|  7 ++-
 include/hw/virtio/virtio.h |  1 +
 hw/block/virtio-blk.c  | 10 ++
 hw/scsi/virtio-scsi.c  | 10 ++
 hw/virtio/virtio.c | 30 +-
 5 files changed, 56 insertions(+), 2 deletions(-)

diff --git a/include/block/aio.h b/include/block/aio.h
index f08b358077..795a375ff2 100644
--- a/include/block/aio.h
+++ b/include/block/aio.h
@@ -497,9 +497,14 @@ void aio_set_event_notifier(AioContext *ctx,
 AioPollFn *io_poll,
 EventNotifierHandler *io_poll_ready);
 
-/* Set polling begin/end callbacks for an event notifier that has already been
+/*
+ * Set polling begin/end callbacks for an event notifier that has already been
  * registered with 

[PATCH] roms/opensbi: Upgrade from v1.3.1 to v1.4

2024-01-02 Thread Bin Meng
Upgrade OpenSBI from v1.3.1 to v1.4 and the pre-built bios images.

The v1.4 release includes the following commits:

1a398d9 lib: sbi: Add Zicntr as a HART ISA extension
669089c lib: sbi: Add Zihpm as a HART ISA extension
72b9c8f lib: sbi: Alphabetically sort HART ISA extensions
5359fc6 lib: sbi: Rename hart_pmu_get_allowed_bits() function
976895c lib: sbi: Fix Priv spec version for [m|s]counteren and mcountinhibit CSRs
6053917 lib: sbi: Fix how print gets flags
35ef182 lib: sbi: print not fill '0' when left-aligned
40dac06 lib: sbi: Add '+' flags for print
458fa74 lib: sbi: Add ' ' '\'' flags for print
05cbb6e lib: sbi: implifying the parameters of printi
fe08281 lib: sbi: print add 'o' type
c6ee5ae lib: sbi: Fix printi
3b6fcdd lib: sbi: Simplify prints
cc89fa7 lib: sbi: Fix printc
ff43168 lib: sbi: Fix timing of clearing tbuf
a73982d lib: sbi: Fix missing '\0' when buffer szie equal 1
ea6533a lib: utils/gpio: Fix RV32 compile error for designware GPIO driver
c3b98c6 include: sbi: Add macro definitions for mseccfg CSR
1c099c4 lib: sbi: Add functions to manipulate PMP entries
6c202c5 include: sbi: Add Smepmp specific access flags for PMP entries
cbcfc7b lib: sbi: Add smepmp in hart extensions
d72f5f1 lib: utils: Add detection of Smepmp from ISA string in FDT
4a42a23 lib: sbi: Grant SU R/W/X permissions to whole memory
f3fdd04 lib: sbi: Change the order of PMP initialization
5dd8db5 lib: sbi: Add support for Smepmp
6e44ef6 lib: sbi: Add functions to map/unmap shared memory
0ad8660 lib: sbi: Map/Unmap debug console shared memory buffers
057eb10 lib: utils/gpio: Fix RV32 compile error for designware GPIO driver
0e2111e libfdt: fix SPDX license identifiers
e05a9cf lib: sbi: Update system suspend to spec
5e20d25 include: sbi: fix CSR define of mseccfg
44c5151 include: sbi_utils: Remove driver pointer from struct i2c_adapter
14a35b0 lib: utils/regmap: Add generic regmap access library
8e97275 lib: utils/regmap: Add simple FDT based regmap framework
f21d8f7 lib: utils/regmap: Add simple FDT based syscon regmap driver
4a344a9 lib: utils/reset: Add syscon based reboot and poweroff
c2e6027 lib: utils/reset: Remove SiFive Test reset driver
f536e0b gitignore: allow gitignore to ignore most dot file
c744ed7 lib: sbi_pmu: Enable noncontigous hpm event and counters
6259b2e lib: utils/fdt: Fix fdt_parse_isa_extensions() implementation
f46a564 lib: sbi: Fix typo for finding fixed event counter
94197a8 fw_base.S: Fix assembler error with clang 16+
c104c60 lib: sbi: Add support for smcntrpmf
7aabeee Makefile: Fix grep warning
e7e73aa platform: generic: allwinner: correct mhpmevent count
ee1f83c lib: sbi_pmu: remove mhpm_count field in hart feature
a9cffd6 firmware: payload: test: Change to SBI v2.0 DBCN ecalls
b20bd47 lib: sbi: improve the definition of SBI_IPI_EVENT_MAX
664692f lib: sbi_pmu: ensure update hpm counter before starting counting
c9a296d platform: generic: allwinner: fix OF process for T-HEAD c9xx pmu
901d3d7 lib: sbi_pmu: keep overflow interrupt of stopped hpm counter disabled
cacfba3 platform: Allow platforms to specify the size of tlb fifo
5bd9694 lib: sbi: alloc tlb fifo by sbi_malloc
130e65d lib: sbi: Implement SET_FS_DIRTY() to make sure the mstatus FS dirty is set
d1e4dff lib: sbi: Introduce HART index in sbi_scratch
e6125c3 lib: sbi: Remove sbi_platform_hart_index/invalid() functions
296e70d lib: sbi: Extend sbi_hartmask to support both hartid and hartindex
e632cd7 lib: sbi: Use sbi_scratch_last_hartindex() in remote TLB managment
78c667b lib: sbi: Prefer hartindex over hartid in IPI framework
22d6ff8 lib: sbi: Remove sbi_scratch_last_hartid() macro
112daa2 lib: sbi: Maximize the use of HART index in sbi_domain
9560fb3 include: sbi: Remove sbi_hartmask_for_each_hart() macro
b8fb96e include: sbi_domain: Fix permission test macros
bff27c1 lib: sbi: Factor-out Smepmp configuration as separate function
5240d31 lib: sbi: Don't clear mseccfg.MML bit in sbi_hart_smepmp_configure()
2b51a9d lib: sbi: Fix pmp_flags for Smepmp read-only shared region
73aea28 lib: sbi: Populate M-only Smepmp entries before setting mseccfg.MML
e8bc162 lib: utils/serial: Add shared regions for serial drivers
b7e9d34 lib: utils/regmap: Mark syscon region as shared read-write
3669153 platform: generic: thead: fix stale TLB entries for th1520/sg2042
de525ac firmware: Remove ALIGN in .rela.dyn in linker script
2a6d725 firmware: Remove handling of R_RISCV_{32,64}
6ed125a Makefile: Add --exclude-libs ALL to avoid .dynsym
e21901d doc: Fix fw_payload.md
a125423 lib: utils/serial: Ensure proper allocation of PMP entries for uart8250
d36709f lib: utils: timer/ipi: Update memregion flags for PLMT and PLICSW
8197c2f lib: sbi: fix sbi_domain_get_assigned_hartmask()
9da30f6 lib: utils/fdt: simplify dt_parse_isa_extensions
942aca2 lib: utils: Simplify SET_ISA_EXT_MAP()
f831b93 lib: sbi_pmu: check for index overflows
d891cae gpio/starfive: redundant readl() call
e8114c6 docs: platform: update platform_requirements.md
3632f2b lib: sbi: Add support 

Re: [PATCH for 8.2.1] hw/net: cadence_gem: Fix MDIO_OP_xxx values

2024-01-02 Thread Philippe Mathieu-Daudé

On 2/1/24 15:18, Bin Meng wrote:

Testing upstream U-Boot with 'sifive_u' machine we see:

   => dhcp
   ethernet@1009: PHY present at 0
   Could not get PHY for ethernet@1009: addr 0
   phy_connect failed

This has been working till QEMU 8.1 but broken since QEMU 8.2.


s/till/until/?


Fixes: 1b09eeb122aa ("hw/net/cadence_gem: use FIELD to describe PHYMNTNC register fields")
Reported-by: Heinrich Schuchardt 
Signed-off-by: Bin Meng 

---

  hw/net/cadence_gem.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/net/cadence_gem.c b/hw/net/cadence_gem.c
index 296bba238e..472ce9c8cf 100644
--- a/hw/net/cadence_gem.c
+++ b/hw/net/cadence_gem.c
@@ -199,8 +199,8 @@ REG32(PHYMNTNC, 0x34) /* Phy Maintenance reg */
  FIELD(PHYMNTNC, PHY_ADDR, 23, 5)
  FIELD(PHYMNTNC, OP, 28, 2)
  FIELD(PHYMNTNC, ST, 30, 2)
-#define MDIO_OP_READ0x3
-#define MDIO_OP_WRITE   0x2
+#define MDIO_OP_READ0x2
+#define MDIO_OP_WRITE   0x1
  
  REG32(RXPAUSE, 0x38) /* RX Pause Time reg */

  REG32(TXPAUSE, 0x3c) /* TX Pause Time reg */


Reviewed-by: Philippe Mathieu-Daudé 




Re: [PATCH 00/33] hw/cpu/arm: Remove one use of qemu_get_cpu() in A7/A15 MPCore priv

2024-01-02 Thread Cédric Le Goater

On 12/12/23 17:29, Philippe Mathieu-Daudé wrote:

Hi,

When an MPCore cluster is used, the Cortex-A cores belong to the
cluster container, not to the board/SoC layer. This series moves
the creation of vCPUs to the MPCore private container.

Doing so consolidates the QOM model, moving common code into a
central place (the abstract MPCore parent).


Changing the QOM hierarchy has an impact on the state of the machine
and some fixups are then required to maintain migration compatibility.
This can become a real headache for KVM machines like virt for which
migration compatibility is a feature, less for emulated ones.

I don't have a good solution to propose to overcome this problem :/

C.
 



This eventually allows removing one qemu_get_cpu() use, which we
want to remove in heterogeneous machines (machines using MPCore
are candidates for heterogeneous emulation).

Maybe these hw/cpu/arm/ files belong to hw/arm/...

Regards,

Phil.

Philippe Mathieu-Daudé (33):
   hw/arm/boot: Propagate vCPU to arm_load_dtb()
   hw/arm/fsl-imx6: Add a local 'gic' variable
   hw/arm/fsl-imx6ul: Add a local 'gic' variable
   hw/arm/fsl-imx7: Add a local 'gic' variable
   hw/cpu: Remove dead Kconfig
   hw/cpu/arm: Rename 'busdev' -> 'gicsbd' in a15mp_priv_realize()
   hw/cpu/arm: Alias 'num-cpu' property on TYPE_REALVIEW_MPCORE
   hw/cpu/arm: Declare CPU QOM types using DEFINE_TYPES() macro
   hw/cpu/arm: Merge {a9mpcore.h, a15mpcore.h} as cortex_mpcore.h
   hw/cpu/arm: Introduce abstract CORTEX_MPCORE_PRIV QOM type
   hw/cpu/arm: Have A9MPCORE/A15MPCORE inheritate common
 CORTEX_MPCORE_PRIV
   hw/cpu/arm: Create MPCore container in QOM parent
   hw/cpu/arm: Handle 'num_cores' property once in MPCore parent
   hw/cpu/arm: Handle 'has_el2/3' properties once in MPCore parent
   hw/cpu/arm: Handle 'gic-irq' property once in MPCore parent
   hw/cpu/arm: Handle GIC once in MPCore parent
   hw/cpu/arm: Document more properties of CORTEX_MPCORE_PRIV QOM type
   hw/cpu/arm: Replace A15MPPrivState by CortexMPPrivState
   hw/cpu/arm: Introduce TYPE_A7MPCORE_PRIV for Cortex-A7 MPCore
   hw/cpu/arm: Consolidate check on max GIC spi supported
   hw/cpu/arm: Create CPUs once in MPCore parent
   hw/arm/aspeed_ast2600: Let the A7MPcore create/wire the CPU cores
   hw/arm/exynos4210: Let the A9MPcore create/wire the CPU cores
   hw/arm/fsl-imx6: Let the A9MPcore create/wire the CPU cores
   hw/arm/fsl-imx6ul: Let the A7MPcore create/wire the CPU cores
   hw/arm/fsl-imx7: Let the A7MPcore create/wire the CPU cores
   hw/arm/highbank: Let the A9/A15MPcore create/wire the CPU cores
   hw/arm/vexpress: Let the A9/A15MPcore create/wire the CPU cores
   hw/arm/xilinx_zynq: Let the A9MPcore create/wire the CPU cores
   hw/arm/npcm7xx: Let the A9MPcore create/wire the CPU cores
   hw/cpu/a9mpcore: Remove legacy code
   hw/cpu/arm: Remove 'num-cpu' property alias
   hw/cpu/arm: Remove use of qemu_get_cpu() in A7/A15 realize()

  MAINTAINERS|   3 +-
  include/hw/arm/aspeed_soc.h|   5 +-
  include/hw/arm/boot.h  |   4 +-
  include/hw/arm/exynos4210.h|   6 +-
  include/hw/arm/fsl-imx6.h  |   6 +-
  include/hw/arm/fsl-imx6ul.h|   8 +-
  include/hw/arm/fsl-imx7.h  |   8 +-
  include/hw/arm/npcm7xx.h   |   3 +-
  include/hw/cpu/a15mpcore.h |  44 ---
  include/hw/cpu/a9mpcore.h  |  39 ---
  include/hw/cpu/cortex_mpcore.h | 135 ++
  hw/arm/aspeed_ast2600.c|  61 --
  hw/arm/boot.c  |  11 +-
  hw/arm/exynos4210.c|  60 --
  hw/arm/exynos4_boards.c|   6 +-
  hw/arm/fsl-imx6.c  |  84 --
  hw/arm/fsl-imx6ul.c|  65 ---
  hw/arm/fsl-imx7.c  | 103 +
  hw/arm/highbank.c  |  56 ++---
  hw/arm/mcimx6ul-evk.c  |   3 +-
  hw/arm/mcimx7d-sabre.c |   3 +-
  hw/arm/npcm7xx.c   |  48 ++--
  hw/arm/realview.c  |   4 +-
  hw/arm/sabrelite.c |   4 +-
  hw/arm/vexpress.c  |  60 +++---
  hw/arm/virt.c  |   2 +-
  hw/arm/xilinx_zynq.c   |  30 ++---
  hw/cpu/a15mpcore.c | 179 +
  hw/cpu/a9mpcore.c  | 138 +-
  hw/cpu/arm11mpcore.c   |  23 ++--
  hw/cpu/cortex_mpcore.c | 202 +
  hw/cpu/realview_mpcore.c   |  30 ++---
  hw/arm/Kconfig |   8 +-
  hw/cpu/Kconfig |   8 --
  hw/cpu/meson.build |   1 +
  35 files changed, 689 insertions(+), 761 deletions(-)
  delete mode 100644 include/hw/cpu/a15mpcore.h
  delete mode 100644 include/hw/cpu/a9mpcore.h
  create mode 100644 include/hw/cpu/cortex_mpcore.h
  create mode 100644 hw/cpu/cortex_mpcore.c
  delete mode 100644 hw/cpu/Kconfig






Re: [PATCH for 9.0 00/12] Map memory at destination .load_setup in vDPA-net migration

2024-01-02 Thread Eugenio Perez Martin
On Mon, Dec 25, 2023 at 5:31 PM Michael S. Tsirkin  wrote:
>
> On Fri, Dec 15, 2023 at 06:28:18PM +0100, Eugenio Pérez wrote:
> > Current memory operations like pinning may take a lot of time at the
> > destination.  Currently they are done after the source of the migration is
> > stopped, and before the workload is resumed at the destination.  This is a
> > period where neither traffic can flow, nor the VM workload can continue
> > (downtime).
> >
> > We can do better as we know the memory layout of the guest RAM at the
> > destination from the moment the migration starts.  Moving that operation
> > earlier allows QEMU to communicate the maps to the kernel while the
> > workload is still running in the source, so Linux can start mapping them.
> >
> > Also, the migration of the guest memory may finish before the destination
> > QEMU maps all the memory.  In this case, the rest of the memory will be
> > mapped at the same time as before applying this series, when the device
> > is starting.  So we're only improving with this series.
> >
> > If the destination has the switchover_ack capability enabled, the
> > destination holds the migration until all the memory is mapped.
> >
> > This needs to be applied on top of [1]. That series performs some code
> > reorganization that allows mapping the guest memory without knowing the
> > queue layout the guest configures on the device.
> >
> > This series reduced the downtime in the stop-and-copy phase of the live
> > migration from 20s~30s to 5s, with a 128G mem guest and two mlx5_vdpa
> > devices, per [2].
>
> I think this is reasonable and could be applied - batching is good.
> Could you rebase on master and repost please?
>

New comments appeared in the meantime [1], but I'll rebase with the
needed changes after they converge.

Thanks!

[1] https://patchwork.kernel.org/comment/25653487/


> > Future directions on top of this series may include:
> > * Iterative migration of virtio-net devices, as it may reduce downtime
> >   per [3].  vhost-vdpa net can apply the configuration through CVQ in the
> >   destination while the source is still migrating.
> > * Move more things ahead of migration time, like DRIVER_OK.
> > * Check that the devices of the destination are valid, and cancel the
> >   migration in case it is not.
> >
> > v1 from RFC v2:
> > * Hold on migration if memory has not been mapped in full with
> >   switchover_ack.
> > * Revert map if the device is not started.
> >
> > RFC v2:
> > * Delegate map to another thread so it does not block QMP.
> > * Fix not allocating iova_tree if x-svq=on at the destination.
> > * Rebased on latest master.
> > * More cleanups of current code, that might be split from this series too.
> >
> > [1] https://lists.nongnu.org/archive/html/qemu-devel/2023-12/msg01986.html
> > [2] https://lists.nongnu.org/archive/html/qemu-devel/2023-12/msg00909.html
> > [3] https://lore.kernel.org/qemu-devel/6c8ebb97-d546-3f1c-4cdd-54e23a566...@nvidia.com/T/
> >
> > Eugenio Pérez (12):
> >   vdpa: do not set virtio status bits if unneeded
> >   vdpa: make batch_begin_once early return
> >   vdpa: merge _begin_batch into _batch_begin_once
> >   vdpa: extract out _dma_end_batch from _listener_commit
> >   vdpa: factor out stop path of vhost_vdpa_dev_start
> >   vdpa: check for iova tree initialized at net_client_start
> >   vdpa: set backend capabilities at vhost_vdpa_init
> >   vdpa: add vhost_vdpa_load_setup
> >   vdpa: approve switchover after memory map in the migration destination
> >   vdpa: add vhost_vdpa_net_load_setup NetClient callback
> >   vdpa: add vhost_vdpa_net_switchover_ack_needed
> >   virtio_net: register incremental migration handlers
> >
> >  include/hw/virtio/vhost-vdpa.h |  32 
> >  include/net/net.h  |   8 +
> >  hw/net/virtio-net.c|  48 ++
> >  hw/virtio/vhost-vdpa.c | 274 +++--
> >  net/vhost-vdpa.c   |  43 +-
> >  5 files changed, 357 insertions(+), 48 deletions(-)
> >
> > --
> > 2.39.3
> >
>
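For readers unfamiliar with switchover_ack, the gating described above ("hold the migration until all the memory is mapped") can be modeled in a few lines. This is a toy sketch with made-up region names, not QEMU code:

```python
import threading

class Destination:
    """Toy model of the switchover-ack idea: the destination only acks the
    switchover once every guest memory region has been mapped, so pinning
    overlaps with the source still running."""
    def __init__(self, regions):
        self.pending = set(regions)
        self.all_mapped = threading.Event()

    def map_region(self, r):
        # Done in the background during load_setup, while the source runs.
        self.pending.discard(r)
        if not self.pending:
            self.all_mapped.set()

    def switchover_ack(self, timeout=None):
        # Blocks (holds the migration) until all regions are mapped.
        return self.all_mapped.wait(timeout)

d = Destination({"ram0", "ram1"})
d.map_region("ram0")
assert not d.switchover_ack(timeout=0)   # still holding the migration
d.map_region("ram1")
assert d.switchover_ack(timeout=0)       # all mapped -> ack switchover
```

The point of the series is that the mapping calls start as soon as the migration begins, so the wait above is usually short by the time the source stops.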




Re: [PATCH 0/2] backends/iommufd: Remove mutex

2024-01-02 Thread Eric Auger
Hi Cédric,

On 1/2/24 13:32, Cédric Le Goater wrote:
> Hello !
>
> Coverity has some reports regarding the IOMMUFDBackend mutex. Since
> the IOMMUFDBackend routines are called from the QEMU main thread, this
> series simply suggests removing the mutex and rely on the BQL to
> handle concurrent access.
>
> Thanks,
>
> C.
>
> Cédric Le Goater (2):
>   backends/iommufd: Remove check on number of backend users
>   backends/iommufd: Remove mutex

for the series:
Reviewed-by: Eric Auger 

Thank you for the fix!

Eric
>
>  include/sysemu/iommufd.h |  2 --
>  backends/iommufd.c   | 12 
>  2 files changed, 14 deletions(-)
>




Re: [PATCH 11/33] hw/cpu/arm: Have A9MPCORE/A15MPCORE inheritate common CORTEX_MPCORE_PRIV

2024-01-02 Thread Cédric Le Goater

On 12/12/23 17:29, Philippe Mathieu-Daudé wrote:

TYPE_CORTEX_MPCORE_PRIV becomes the common parent to
TYPE_A9MPCORE_PRIV and TYPE_A15MPCORE_PRIV.

Signed-off-by: Philippe Mathieu-Daudé 



Reviewed-by: Cédric Le Goater 

Thanks,

C.






Re: [PATCH 12/33] hw/cpu/arm: Create MPCore container in QOM parent

2024-01-02 Thread Cédric Le Goater

On 12/12/23 17:29, Philippe Mathieu-Daudé wrote:

Move the memory region container creation to the abstract QOM
parent. Children set the region size via the class 'container_size'
field.

Signed-off-by: Philippe Mathieu-Daudé 



Reviewed-by: Cédric Le Goater 

Thanks,

C.






Re: [PATCH 10/33] hw/cpu/arm: Introduce abstract CORTEX_MPCORE_PRIV QOM type

2024-01-02 Thread Cédric Le Goater

On 12/12/23 17:29, Philippe Mathieu-Daudé wrote:

This type will be common to A9MPCORE/A15MPCORE devices.



Reviewed-by: Cédric Le Goater 

Thanks,

C.





[PATCH for 8.2.1] hw/net: cadence_gem: Fix MDIO_OP_xxx values

2024-01-02 Thread Bin Meng
Testing upstream U-Boot with 'sifive_u' machine we see:

  => dhcp
  ethernet@1009: PHY present at 0
  Could not get PHY for ethernet@1009: addr 0
  phy_connect failed

This worked up to QEMU 8.1 but has been broken since QEMU 8.2.

Fixes: 1b09eeb122aa ("hw/net/cadence_gem: use FIELD to describe PHYMNTNC register fields")
Reported-by: Heinrich Schuchardt 
Signed-off-by: Bin Meng 

---

 hw/net/cadence_gem.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/net/cadence_gem.c b/hw/net/cadence_gem.c
index 296bba238e..472ce9c8cf 100644
--- a/hw/net/cadence_gem.c
+++ b/hw/net/cadence_gem.c
@@ -199,8 +199,8 @@ REG32(PHYMNTNC, 0x34) /* Phy Maintenance reg */
 FIELD(PHYMNTNC, PHY_ADDR, 23, 5)
 FIELD(PHYMNTNC, OP, 28, 2)
 FIELD(PHYMNTNC, ST, 30, 2)
-#define MDIO_OP_READ0x3
-#define MDIO_OP_WRITE   0x2
+#define MDIO_OP_READ0x2
+#define MDIO_OP_WRITE   0x1
 
 REG32(RXPAUSE, 0x38) /* RX Pause Time reg */
 REG32(TXPAUSE, 0x3c) /* TX Pause Time reg */
-- 
2.34.1
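For context, the corrected values match the IEEE 802.3 clause-22 MDIO opcodes (read = 0b10 = 0x2, write = 0b01 = 0x1) once the OP field is extracted as a plain 2-bit value, which is what the FIELD conversion did. A small sketch of the extraction semantics, with an illustrative frame value:

```python
def field_ex32(reg, shift, length):
    """Mirror of QEMU's FIELD_EX32 semantics: extract a bit-field."""
    return (reg >> shift) & ((1 << length) - 1)

# PHYMNTNC layout per the patch: OP at bits 29:28, ST at bits 31:30.
MDIO_OP_READ = 0x2   # IEEE 802.3 clause-22 read opcode
MDIO_OP_WRITE = 0x1  # clause-22 write opcode

frame = (0x1 << 30) | (MDIO_OP_READ << 28)   # ST = 0b01, OP = read
assert field_ex32(frame, 28, 2) == MDIO_OP_READ
assert field_ex32(frame, 28, 2) != 0x3       # the old, incorrect constant
```

The old constants (0x3/0x2) only made sense when the OP bits were compared as part of a wider shifted value, so they never matched once the code switched to extracting the 2-bit field.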




Re: [PATCH 09/33] hw/cpu/arm: Merge {a9mpcore.h, a15mpcore.h} as cortex_mpcore.h

2024-01-02 Thread Cédric Le Goater

On 12/12/23 17:29, Philippe Mathieu-Daudé wrote:

Merge Cortex-A{9,15} MPCore devices in the same header.

Signed-off-by: Philippe Mathieu-Daudé 


Reviewed-by: Cédric Le Goater 

Thanks,

C.






Re: [PATCH 08/33] hw/cpu/arm: Declare CPU QOM types using DEFINE_TYPES() macro

2024-01-02 Thread Cédric Le Goater

On 12/12/23 17:29, Philippe Mathieu-Daudé wrote:

When multiple QOM types are registered in the same file,
it is simpler to use the DEFINE_TYPES() macro, in
particular because type arrays declared with such a macro
are easier to review.

Signed-off-by: Philippe Mathieu-Daudé 


Reviewed-by: Cédric Le Goater 

Thanks,

C.






Re: [PATCH 07/33] hw/cpu/arm: Alias 'num-cpu' property on TYPE_REALVIEW_MPCORE

2024-01-02 Thread Cédric Le Goater

On 12/12/23 17:29, Philippe Mathieu-Daudé wrote:

No need to duplicate and forward the 'num-cpu' property from
TYPE_ARM11MPCORE_PRIV to TYPE_REALVIEW_MPCORE, alias it with
QOM object_property_add_alias().

Signed-off-by: Philippe Mathieu-Daudé 



Reviewed-by: Cédric Le Goater 

Thanks,

C.






Re: [PATCH 06/33] hw/cpu/arm: Rename 'busdev' -> 'gicsbd' in a15mp_priv_realize()

2024-01-02 Thread Cédric Le Goater

On 12/12/23 17:29, Philippe Mathieu-Daudé wrote:

'busdev' is the internal GIC as SysBus device.
Since we already have a 'gicdev' variable for the GIC as QDev,
rename 'busdev' as 'gicsbd' to make it clear we access the IRQ
lines from the GIC.

Signed-off-by: Philippe Mathieu-Daudé 



Reviewed-by: Cédric Le Goater 

Thanks,

C.






Re: [PATCH 05/33] hw/cpu: Remove dead Kconfig

2024-01-02 Thread Cédric Le Goater

On 12/12/23 17:29, Philippe Mathieu-Daudé wrote:

ARM MPCore Kconfig are defined in hw/arm/Kconfig.
hw/cpu/Kconfig is never used, remove it.

Fixes: 82f5181777 ("kconfig: introduce kconfig files")
Signed-off-by: Philippe Mathieu-Daudé 



Reviewed-by: Cédric Le Goater 

Thanks,

C.






Re: [PATCH 04/33] hw/arm/fsl-imx7: Add a local 'gic' variable

2024-01-02 Thread Cédric Le Goater

On 12/12/23 17:29, Philippe Mathieu-Daudé wrote:

The A7MPCore forwards the IRQs from its internal GIC.
To make the code clearer, add a 'gic' variable.

Signed-off-by: Philippe Mathieu-Daudé 



Reviewed-by: Cédric Le Goater 

Thanks,

C.





Re: [PATCH 03/33] hw/arm/fsl-imx6ul: Add a local 'gic' variable

2024-01-02 Thread Cédric Le Goater

On 12/12/23 17:29, Philippe Mathieu-Daudé wrote:

The A7MPCore forwards the IRQs from its internal GIC.
To make the code clearer, add a 'gic' variable.

Signed-off-by: Philippe Mathieu-Daudé 



Reviewed-by: Cédric Le Goater 

Thanks,

C.






Re: [PATCH 02/33] hw/arm/fsl-imx6: Add a local 'gic' variable

2024-01-02 Thread Cédric Le Goater

On 12/12/23 17:29, Philippe Mathieu-Daudé wrote:

The A9MPCore forwards the IRQs from its internal GIC.
To make the code clearer, add a 'gic' variable.

Signed-off-by: Philippe Mathieu-Daudé 



Reviewed-by: Cédric Le Goater 

Thanks,

C.






Re: [PATCH 01/33] hw/arm/boot: Propagate vCPU to arm_load_dtb()

2024-01-02 Thread Cédric Le Goater

On 12/12/23 17:29, Philippe Mathieu-Daudé wrote:

In a heterogeneous setup the first vCPU might not be
the one expected; better to pass it explicitly.

Signed-off-by: Philippe Mathieu-Daudé 


Reviewed-by: Cédric Le Goater 

Thanks,

C.






Re: [PATCH v2 1/1] virgl: Implement resource_query_layout

2024-01-02 Thread Marc-André Lureau
Hi

On Thu, Dec 21, 2023 at 3:36 PM Julia Zhang  wrote:
>
> From: Daniel Stone 
>
> A new ioctl to shuttle information between host and guest about the
> actual buffer allocation, which can be used for interop between GL and
> Vulkan when supporting standard window systems.
>

The command hasn't been added to the kernel yet. The function is not
in the virgl library either.

> Signed-off-by: Daniel Stone 
> Co-developed-by: Julia Zhang 
> Signed-off-by: Julia Zhang 
> ---
>  hw/display/virtio-gpu-base.c|  4 +++
>  hw/display/virtio-gpu-virgl.c   | 40 +
>  include/hw/virtio/virtio-gpu-bswap.h|  7 
>  include/standard-headers/linux/virtio_gpu.h | 30 
>  meson.build |  4 +++
>  5 files changed, 85 insertions(+)
>
> diff --git a/hw/display/virtio-gpu-base.c b/hw/display/virtio-gpu-base.c
> index 6bcee3882f..09b37f015d 100644
> --- a/hw/display/virtio-gpu-base.c
> +++ b/hw/display/virtio-gpu-base.c
> @@ -240,6 +240,10 @@ virtio_gpu_base_get_features(VirtIODevice *vdev, 
> uint64_t features,
>  features |= (1 << VIRTIO_GPU_F_RESOURCE_UUID);
>  }
>
> +#ifdef HAVE_VIRGL_RESOURCE_QUERY_LAYOUT
> +features |= (1 << VIRTIO_GPU_F_RESOURCE_QUERY_LAYOUT);
> +#endif /* HAVE_VIRGL_RESOURCE_QUERY_LAYOUT */
> +
>  return features;
>  }
>
> diff --git a/hw/display/virtio-gpu-virgl.c b/hw/display/virtio-gpu-virgl.c
> index c1e7c6d0c6..b331232fee 100644
> --- a/hw/display/virtio-gpu-virgl.c
> +++ b/hw/display/virtio-gpu-virgl.c
> @@ -813,6 +813,40 @@ static void virgl_cmd_set_scanout_blob(VirtIOGPU *g,
>
>  #endif /* HAVE_VIRGL_RESOURCE_BLOB */
>
> +#ifdef HAVE_VIRGL_RESOURCE_QUERY_LAYOUT
> +static void virgl_cmd_resource_query_layout(VirtIOGPU *g,
> +   struct virtio_gpu_ctrl_command 
> *cmd)
> +{
> +struct virtio_gpu_resource_query_layout qlayout;
> +struct virtio_gpu_resp_resource_layout resp;
> +struct virgl_renderer_resource_layout rlayout;
> +int ret;
> +int i;
> +VIRTIO_GPU_FILL_CMD(qlayout);
> +virtio_gpu_resource_query_layout_bswap(&qlayout);
> +
> +ret = virgl_renderer_resource_query_layout(qlayout.resource_id, &rlayout,
> +  qlayout.width, qlayout.height,
> +  qlayout.format, qlayout.bind);
> +if (ret != 0) {
> +qemu_log_mask(LOG_GUEST_ERROR,
> +  "%s: resource %d is not externally-allocated\n",
> +  __func__, qlayout.resource_id);
> +cmd->error = VIRTIO_GPU_RESP_ERR_INVALID_RESOURCE_ID;
> +return;
> +}
> +
> +memset(&resp, 0, sizeof(resp));
> +resp.hdr.type = VIRTIO_GPU_RESP_OK_RESOURCE_LAYOUT;
> +resp.num_planes = rlayout.num_planes;
> +resp.modifier = rlayout.modifier;
> +for (i = 0; i < resp.num_planes; i++) {
> +resp.planes[i].offset = rlayout.planes[i].offset;
> +resp.planes[i].stride = rlayout.planes[i].stride;
> +}
> +virtio_gpu_ctrl_response(g, cmd, &resp.hdr, sizeof(resp));
> +}
> +#endif /* HAVE_VIRGL_RESOURCE_QUERY_LAYOUT */
> +
>  void virtio_gpu_virgl_process_cmd(VirtIOGPU *g,
>struct virtio_gpu_ctrl_command *cmd)
>  {
> @@ -896,6 +930,12 @@ void virtio_gpu_virgl_process_cmd(VirtIOGPU *g,
>  virgl_cmd_set_scanout_blob(g, cmd);
>  break;
>  #endif /* HAVE_VIRGL_RESOURCE_BLOB */
> +
> +#ifdef HAVE_VIRGL_RESOURCE_QUERY_LAYOUT
> +case VIRTIO_GPU_CMD_RESOURCE_QUERY_LAYOUT:
> +   virgl_cmd_resource_query_layout(g, cmd);
> +   break;
> +#endif /* HAVE_VIRGL_RESOURCE_QUERY_LAYOUT*/
>  default:
>  cmd->error = VIRTIO_GPU_RESP_ERR_UNSPEC;
>  break;
> diff --git a/include/hw/virtio/virtio-gpu-bswap.h 
> b/include/hw/virtio/virtio-gpu-bswap.h
> index dd1975e2d4..dea8cf6fd3 100644
> --- a/include/hw/virtio/virtio-gpu-bswap.h
> +++ b/include/hw/virtio/virtio-gpu-bswap.h
> @@ -92,4 +92,11 @@ virtio_gpu_scanout_blob_bswap(struct 
> virtio_gpu_set_scanout_blob *ssb)
>  le32_to_cpus(&ssb->offsets[3]);
>  }
>
> +static inline void
> +virtio_gpu_resource_query_layout_bswap(struct 
> virtio_gpu_resource_query_layout *rql)
> +{
> +virtio_gpu_ctrl_hdr_bswap(&rql->hdr);
> +le32_to_cpus(&rql->resource_id);
> +}
> +
>  #endif
> diff --git a/include/standard-headers/linux/virtio_gpu.h 
> b/include/standard-headers/linux/virtio_gpu.h
> index c621389f3d..c9a2f58237 100644
> --- a/include/standard-headers/linux/virtio_gpu.h
> +++ b/include/standard-headers/linux/virtio_gpu.h
> @@ -65,6 +65,11 @@
>   */
>  #define VIRTIO_GPU_F_CONTEXT_INIT4
>
> +/*
> + * VIRTIO_GPU_CMD_RESOURCE_QUERY_LAYOUT
> + */
> +#define VIRTIO_GPU_F_RESOURCE_QUERY_LAYOUT 5
> +
>  enum virtio_gpu_ctrl_type {
> VIRTIO_GPU_UNDEFINED = 0,
>
> @@ -95,6 +100,7 @@ enum virtio_gpu_ctrl_type {
> VIRTIO_GPU_CMD_SUBMIT_3D,
> VIRTIO_GPU_CMD_RESOURCE_MAP_BLOB,
> VIRTIO_GPU_CMD_RESOURCE_UNMAP_BLOB,
> +   

Re: [PATCH v6 10/11] virtio-gpu: Initialize Venus

2024-01-02 Thread Marc-André Lureau
Hi

On Tue, Dec 19, 2023 at 11:55 AM Huang Rui  wrote:
>
> From: Antonio Caggiano 
>
> Request Venus when initializing VirGL.
>
> Signed-off-by: Antonio Caggiano 
> Signed-off-by: Huang Rui 
> ---
>
> Changes in v6:
> - Remove the unstable API flags check because virglrenderer is already 1.0.
> - Squash the render server flag support into "Initialize Venus".
>
>  hw/display/virtio-gpu-virgl.c | 4 
>  1 file changed, 4 insertions(+)
>
> diff --git a/hw/display/virtio-gpu-virgl.c b/hw/display/virtio-gpu-virgl.c
> index f35a751824..c523a6717a 100644
> --- a/hw/display/virtio-gpu-virgl.c
> +++ b/hw/display/virtio-gpu-virgl.c
> @@ -964,6 +964,10 @@ int virtio_gpu_virgl_init(VirtIOGPU *g)
>  }
>  #endif
>
> +#ifdef VIRGL_RENDERER_VENUS
> +flags |= VIRGL_RENDERER_VENUS | VIRGL_RENDERER_RENDER_SERVER;
> +#endif
> +

I wonder if it's a good idea to initialize venus by default. It
doesn't seem to require vulkan during initialization, but this may
evolve. Make it optional?

>  ret = virgl_renderer_init(g, flags, &virtio_gpu_3d_cbs);
>  if (ret != 0) {
>  error_report("virgl could not be initialized: %d", ret);
> --
> 2.25.1
>


-- 
Marc-André Lureau



Re: [QEMU-devel][RFC PATCH 1/1] backends/hostmem: qapi/qom: Add an ObjectOption for memory-backend-* called HostMemType and its arg 'cxlram'

2024-01-02 Thread David Hildenbrand

On 01.01.24 08:53, Ho-Ren (Jack) Chuang wrote:

Introduce a new configuration option 'host-mem-type=' in the
'-object memory-backend-ram', allowing users to specify
which type of memory to allocate from.

Users can specify 'cxlram' as an argument, and QEMU will then
automatically locate CXL RAM NUMA nodes and use them as the backend memory.
For example:
-object memory-backend-ram,id=vmem0,size=19G,host-mem-type=cxlram \
-device pxb-cxl,bus_nr=12,bus=pcie.0,id=cxl.1 \
-device cxl-rp,port=0,bus=cxl.1,id=root_port13,chassis=0,slot=2 \
-device cxl-type3,bus=root_port13,volatile-memdev=vmem0,id=cxl-vmem0 \
-M 
cxl-fmw.0.targets.0=cxl.1,cxl-fmw.0.size=19G,cxl-fmw.0.interleave-granularity=8k
 \



You can achieve the exact same thing already simply by using memory 
policies and detecting the node(s) before calling QEMU, no?


There has to be a good reason to add such a shortcut into QEMU, and it 
should be spelled out here.
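David's alternative (detect the CXL NUMA nodes on the host, then bind QEMU's allocation with existing tooling such as numactl) might look roughly like this. The node IDs and the command-line builder are made up for illustration:

```python
# Toy sketch of the "detect nodes before calling QEMU" alternative:
# find the CXL-RAM NUMA node IDs on the host (e.g. via sysfs or daxctl),
# then bind QEMU's memory allocation to them with numactl, instead of
# teaching QEMU a 'host-mem-type' shortcut.

def build_cmdline(cxl_nodes, size="19G"):
    """Build a numactl-wrapped QEMU invocation bound to the given nodes."""
    nodes = ",".join(str(n) for n in sorted(cxl_nodes))
    return (f"numactl --membind={nodes} qemu-system-x86_64 "
            f"-object memory-backend-ram,id=vmem0,size={size}")

cmd = build_cmdline({3, 2})
assert "--membind=2,3" in cmd
```

The same effect can also be had per-backend with memory-backend-ram's existing host-nodes/policy properties once the node IDs are known, which is part of why the shortcut needs a stronger justification.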


--
Cheers,

David / dhildenb




Re: [PATCH v6 1/2] qom: new object to associate device to numa node

2024-01-02 Thread Jonathan Cameron via
On Mon, 25 Dec 2023 10:26:02 +0530
 wrote:

> From: Ankit Agrawal 
> 
> NVIDIA GPUs support the MIG (Multi-Instance GPU) feature [1], which allows
> partitioning of the GPU device resources (including device memory) into
> several (up to 8) isolated instances. Each partitioned memory range needs
> a dedicated NUMA node to operate. The partitions are not fixed and they
> can be created/deleted at runtime.
> 
> Unfortunately Linux OS does not provide a means to dynamically create/destroy
> NUMA nodes and such feature implementation is not expected to be trivial. The
> nodes that OS discovers at the boot time while parsing SRAT remains fixed. So
> we utilize the Generic Initiator Affinity structures that allows association
> between nodes and devices. Multiple GI structures per BDF is possible,
> allowing creation of multiple nodes by exposing unique PXM in each of these
> structures.
> 
> Introduce a new acpi-generic-initiator object to allow host admin provide the
> device and the corresponding NUMA nodes. QEMU maintains this association and
> use this object to build the requisite GI Affinity Structure. On a multi
> device system, each device supporting the features needs a unique
> acpi-generic-initiator object with its own set of NUMA nodes associated to it.
> 
> An admin can provide the range of nodes through a uint16 array host-nodes
> and link it to a device by providing its id. Currently, only PCI device is
> supported. The following sample creates 8 nodes per PCI device for a VM
> with 2 PCI devices and links them to the respective PCI device using
> acpi-generic-initiator objects:
> 
> -numa node,nodeid=2 -numa node,nodeid=3 -numa node,nodeid=4 \
> -numa node,nodeid=5 -numa node,nodeid=6 -numa node,nodeid=7 \
> -numa node,nodeid=8 -numa node,nodeid=9 \
> -device 
> vfio-pci-nohotplug,host=0009:01:00.0,bus=pcie.0,addr=04.0,rombar=0,id=dev0 \
> -object acpi-generic-initiator,id=gi0,pci-dev=dev0,host-nodes=2-9 \
> 
> -numa node,nodeid=10 -numa node,nodeid=11 -numa node,nodeid=12 \
> -numa node,nodeid=13 -numa node,nodeid=14 -numa node,nodeid=15 \
> -numa node,nodeid=16 -numa node,nodeid=17 \
> -device 
> vfio-pci-nohotplug,host=0009:01:01.0,bus=pcie.0,addr=05.0,rombar=0,id=dev1 \
> -object acpi-generic-initiator,id=gi1,pci-dev=dev1,host-nodes=10-17 \

Hi Ankit,

Whilst I'm still not particularly keen on this use of GI nodes, the
infrastructure is now generic enough that it covers more normal use cases
so I'm almost fine with it going into QEMU. If you want to use it for unusual
things that's up to you ;)  Note that the following is about QEMU allowing
you to potentially shoot yourself in the foot rather than necessarily saying
the interface shouldn't allow a PCI dev to map to multiple GI nodes.

As per reply to the cover letter I definitely want to see SRAT table dumps
in here though so we can easily see what this is actually building.

I worry that some OS might make the assumption that it's one GI node
per PCI device though. The language in the ACPI specification is:

"The Generic Initiator Affinity Structure provides the association between _a_
generic initiator and _the_ proximity domain to which the initiator belongs".

The use of _a_ and _the_ in there makes it pretty explicitly a N:1 relationship
(multiple devices can be in the same proximity domain, but a device may only be in one).
To avoid that confusion you will need an ACPI spec change.  I'd be happy to
support 

The reason you can get away with this in Linux today is that I only implemented
a very minimal support for GIs with the mappings being provided the other way
around (_PXM in a PCIe node in DSDT).  If we finish that support off I'd assume
the multiple mappings here will result in a firmware bug warning in at least
some cases.  Note the reason support for the mapping the other way isn't yet
in linux is that we never resolved the mess that a PCI re-enumeration will
cause (requires a pre enumeration pass of what is configured by fw and caching
of the path to all the PCIe devices that lets you access so we can reconstruct
the mapping post enumeration).

Also, this effectively creates a bunch of separate generic initiator nodes
and lumping that under one object seems to imply they are in general connected
to each other.

I'd be happier with a separate instance per GI node

  -object acpi-generic-initiator,id=gi1,pci-dev=dev1,nodeid=10
  -object acpi-generic-initiator,id=gi2,pci-dev=dev1,nodeid=11
etc with the proviso that anyone using this on a system that assumes a one
to one mapping for PCI

However, I'll leave it up to those more familiar with the QEMU numa
control interface design to comment on whether this approach is preferable
to making the gi part of the numa node entry or doing it like hmat.

-numa srat-gi,node-id=10,gi-pci-dev=dev1

etc

> 
> [1] https://www.nvidia.com/en-in/technologies/multi-instance-gpu
> 
> Signed-off-by: Ankit Agrawal 
> ---
>  hw/acpi/acpi-generic-initiator.c | 70 
>  

Re: [PATCH v6 08/11] virtio-gpu: Resource UUID

2024-01-02 Thread Marc-André Lureau
Hi

On Tue, Dec 19, 2023 at 11:55 AM Huang Rui  wrote:
>
> From: Antonio Caggiano 
>
> Enable resource UUID feature and implement command resource assign UUID.
> This is done by introducing a hash table to map resource IDs to their
> UUIDs.

I agree with Akihiko, what about putting QemuUUID in struct
virtio_gpu_simple_resource?

(I also doubt about the hash table usefulness, but I don't know
how/why the UUID is used)
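As a side note, the lazily populated id-to-UUID map the patch builds with g_hash_table boils down to the following. A toy Python sketch with made-up resource IDs, only to illustrate the lookup-or-generate behavior:

```python
import uuid

# Map from resource ID to its generated UUID, populated on first request,
# mirroring the g_hash_table usage in virtio_gpu_resource_assign_uuid().
resource_uuids = {}

def assign_uuid(resource_id):
    """Return a stable UUID for the resource, generating it on first use."""
    if resource_id not in resource_uuids:
        resource_uuids[resource_id] = uuid.uuid4()
    return resource_uuids[resource_id]

first = assign_uuid(5)
assert assign_uuid(5) == first   # stable across repeated ASSIGN_UUID commands
assert assign_uuid(6) != first   # distinct resources get distinct UUIDs
```

Storing the UUID directly in the per-resource struct, as suggested, would make this map unnecessary since the resource lookup already happens for every command.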

>
> Signed-off-by: Antonio Caggiano 
> Signed-off-by: Huang Rui 
> ---
>
> Changes in v6:
> - Set resource uuid as option.
> - Implement optional subsection of vmstate_virtio_gpu_resource_uuid_state
>   or virtio live migration.
> - Use g_int_hash/g_int_equal instead of the default.
> - Move virtio_vgpu_simple_resource initialization in the earlier new patch
>   "virtio-gpu: Introduce virgl_gpu_resource structure"
>
>  hw/display/trace-events|   1 +
>  hw/display/virtio-gpu-base.c   |   4 ++
>  hw/display/virtio-gpu-virgl.c  |   3 +
>  hw/display/virtio-gpu.c| 119 +
>  include/hw/virtio/virtio-gpu.h |   7 ++
>  5 files changed, 134 insertions(+)
>
> diff --git a/hw/display/trace-events b/hw/display/trace-events
> index 2336a0ca15..54d6894c59 100644
> --- a/hw/display/trace-events
> +++ b/hw/display/trace-events
> @@ -41,6 +41,7 @@ virtio_gpu_cmd_res_create_blob(uint32_t res, uint64_t size) 
> "res 0x%x, size %" P
>  virtio_gpu_cmd_res_unref(uint32_t res) "res 0x%x"
>  virtio_gpu_cmd_res_back_attach(uint32_t res) "res 0x%x"
>  virtio_gpu_cmd_res_back_detach(uint32_t res) "res 0x%x"
> +virtio_gpu_cmd_res_assign_uuid(uint32_t res) "res 0x%x"
>  virtio_gpu_cmd_res_xfer_toh_2d(uint32_t res) "res 0x%x"
>  virtio_gpu_cmd_res_xfer_toh_3d(uint32_t res) "res 0x%x"
>  virtio_gpu_cmd_res_xfer_fromh_3d(uint32_t res) "res 0x%x"
> diff --git a/hw/display/virtio-gpu-base.c b/hw/display/virtio-gpu-base.c
> index 37af256219..6bcee3882f 100644
> --- a/hw/display/virtio-gpu-base.c
> +++ b/hw/display/virtio-gpu-base.c
> @@ -236,6 +236,10 @@ virtio_gpu_base_get_features(VirtIODevice *vdev, 
> uint64_t features,
>  features |= (1 << VIRTIO_GPU_F_CONTEXT_INIT);
>  }
>
> +if (virtio_gpu_resource_uuid_enabled(g->conf)) {
> +features |= (1 << VIRTIO_GPU_F_RESOURCE_UUID);
> +}
> +
>  return features;
>  }
>
> diff --git a/hw/display/virtio-gpu-virgl.c b/hw/display/virtio-gpu-virgl.c
> index 5a3a292f79..be9da6e780 100644
> --- a/hw/display/virtio-gpu-virgl.c
> +++ b/hw/display/virtio-gpu-virgl.c
> @@ -777,6 +777,9 @@ void virtio_gpu_virgl_process_cmd(VirtIOGPU *g,
>  /* TODO add security */
>  virgl_cmd_ctx_detach_resource(g, cmd);
>  break;
> +case VIRTIO_GPU_CMD_RESOURCE_ASSIGN_UUID:
> +virtio_gpu_resource_assign_uuid(g, cmd);
> +break;
>  case VIRTIO_GPU_CMD_GET_CAPSET_INFO:
>  virgl_cmd_get_capset_info(g, cmd);
>  break;
> diff --git a/hw/display/virtio-gpu.c b/hw/display/virtio-gpu.c
> index 8189c392dc..466debb256 100644
> --- a/hw/display/virtio-gpu.c
> +++ b/hw/display/virtio-gpu.c
> @@ -958,6 +958,37 @@ virtio_gpu_resource_detach_backing(VirtIOGPU *g,
>  virtio_gpu_cleanup_mapping(g, res);
>  }
>
> +void virtio_gpu_resource_assign_uuid(VirtIOGPU *g,
> + struct virtio_gpu_ctrl_command *cmd)
> +{
> +struct virtio_gpu_simple_resource *res;
> +struct virtio_gpu_resource_assign_uuid assign;
> +struct virtio_gpu_resp_resource_uuid resp;
> +QemuUUID *uuid;
> +
> +VIRTIO_GPU_FILL_CMD(assign);
> +virtio_gpu_bswap_32(&assign, sizeof(assign));
> +trace_virtio_gpu_cmd_res_assign_uuid(assign.resource_id);
> +
> +res = virtio_gpu_find_check_resource(g, assign.resource_id, false,
> + __func__, &cmd->error);
> +if (!res) {
> +return;
> +}
> +
> +memset(&resp, 0, sizeof(resp));
> +resp.hdr.type = VIRTIO_GPU_RESP_OK_RESOURCE_UUID;
> +
> +uuid = g_hash_table_lookup(g->resource_uuids, &assign.resource_id);
> +if (!uuid) {
> +uuid = g_new(QemuUUID, 1);
> +qemu_uuid_generate(uuid);
> +g_hash_table_insert(g->resource_uuids, &res->resource_id, uuid);
> +}
> +
> +memcpy(resp.uuid, uuid, sizeof(QemuUUID));
> +virtio_gpu_ctrl_response(g, cmd, &resp.hdr, sizeof(resp));
> +}
> +
>  void virtio_gpu_simple_process_cmd(VirtIOGPU *g,
> struct virtio_gpu_ctrl_command *cmd)
>  {
> @@ -1006,6 +1037,9 @@ void virtio_gpu_simple_process_cmd(VirtIOGPU *g,
>  case VIRTIO_GPU_CMD_RESOURCE_DETACH_BACKING:
>  virtio_gpu_resource_detach_backing(g, cmd);
>  break;
> +case VIRTIO_GPU_CMD_RESOURCE_ASSIGN_UUID:
> +virtio_gpu_resource_assign_uuid(g, cmd);
> +break;
>  default:
>  cmd->error = VIRTIO_GPU_RESP_ERR_UNSPEC;
>  break;
> @@ -1400,6 +1434,57 @@ static int virtio_gpu_blob_load(QEMUFile *f, void 
> *opaque, size_t size,
>  return 0;
>  }
>
> +static int virtio_gpu_resource_uuid_save(QEMUFile *f, void *opaque, size_t 
> size,
> + 

Re: [PATCH v6 07/11] virtio-gpu: Handle resource blob commands

2024-01-02 Thread Marc-André Lureau
Hi

On Tue, Dec 19, 2023 at 11:55 AM Huang Rui  wrote:
>
> From: Antonio Caggiano 
>
> Support BLOB resources creation, mapping and unmapping by calling the
> new stable virglrenderer 0.10 interface. Only enabled when available and
> via the blob config. E.g. -device virtio-vga-gl,blob=true
>
> Signed-off-by: Antonio Caggiano 
> Signed-off-by: Dmitry Osipenko 
> Signed-off-by: Xenia Ragiadakou 
> Signed-off-by: Huang Rui 
> ---
>
> Changes in v6:
> - Use new struct virgl_gpu_resource.
> - Unmap, unref and destroy the resource only after the memory region
>   has been completely removed.
> - In unref check whether the resource is still mapped.
> - In unmap_blob check whether the resource has been already unmapped.
> - Fix coding style
>
>  hw/display/virtio-gpu-virgl.c | 274 +-
>  hw/display/virtio-gpu.c   |   4 +-
>  meson.build   |   4 +
>  3 files changed, 276 insertions(+), 6 deletions(-)

Could you split this patch to introduce the new resource
ref/get/put/destroy functions first, before adding the blob commands.
Please explain the rationale of the changes, why they are safe or
equivalent to current code. I'd also suggest documentation and better
naming for the functions, or inlined code as appropriate, as it's
confusing to understand together what should be used and when.

>
> diff --git a/hw/display/virtio-gpu-virgl.c b/hw/display/virtio-gpu-virgl.c
> index faab374336..5a3a292f79 100644
> --- a/hw/display/virtio-gpu-virgl.c
> +++ b/hw/display/virtio-gpu-virgl.c
> @@ -17,6 +17,7 @@
>  #include "trace.h"
>  #include "hw/virtio/virtio.h"
>  #include "hw/virtio/virtio-gpu.h"
> +#include "hw/virtio/virtio-gpu-bswap.h"
>
>  #include "ui/egl-helpers.h"
>
> @@ -24,8 +25,62 @@
>
>  struct virgl_gpu_resource {
>  struct virtio_gpu_simple_resource res;
> +uint32_t ref;
> +VirtIOGPU *g;
> +
> +#ifdef HAVE_VIRGL_RESOURCE_BLOB
> +/* only blob resource needs this region to be mapped as guest mmio */
> +MemoryRegion *region;
> +#endif
>  };
>
> +static void vres_get_ref(struct virgl_gpu_resource *vres)
> +{
> +uint32_t ref;
> +
> +ref = qatomic_fetch_inc(&vres->ref);
> +g_assert(ref < INT_MAX);
> +}
> +
> +static void virgl_resource_destroy(struct virgl_gpu_resource *vres)
> +{
> +struct virtio_gpu_simple_resource *res;
> +VirtIOGPU *g;
> +
> +if (!vres) {
> +return;
> +}
> +
> +g = vres->g;
> +res = &vres->res;
> +QTAILQ_REMOVE(&g->reslist, res, next);
> +virtio_gpu_cleanup_mapping(g, res);
> +g_free(vres);
> +}
> +
> +static void virgl_resource_unref(struct virgl_gpu_resource *vres)
> +{
> +struct virtio_gpu_simple_resource *res;
> +
> +if (!vres) {
> +return;
> +}
> +
> +res = &vres->res;
> +virgl_renderer_resource_detach_iov(res->resource_id, NULL, NULL);
> +virgl_renderer_resource_unref(res->resource_id);
> +}
> +
> +static void vres_put_ref(struct virgl_gpu_resource *vres)
> +{
> +g_assert(vres->ref > 0);
> +
> +if (qatomic_fetch_dec(&vres->ref) == 1) {
> +virgl_resource_unref(vres);
> +virgl_resource_destroy(vres);
> +}
> +}
> +
>  static struct virgl_gpu_resource *
>  virgl_gpu_find_resource(VirtIOGPU *g, uint32_t resource_id)
>  {
> @@ -59,6 +114,8 @@ static void virgl_cmd_create_resource_2d(VirtIOGPU *g,
> c2d.width, c2d.height);
>
>  vres = g_new0(struct virgl_gpu_resource, 1);
> +vres_get_ref(vres);
> +vres->g = g;
>  vres->res.width = c2d.width;
>  vres->res.height = c2d.height;
>  vres->res.format = c2d.format;
> @@ -91,6 +148,8 @@ static void virgl_cmd_create_resource_3d(VirtIOGPU *g,
> c3d.width, c3d.height, c3d.depth);
>
>  vres = g_new0(struct virgl_gpu_resource, 1);
> +vres_get_ref(vres);
> +vres->g = g;
>  vres->res.width = c3d.width;
>  vres->res.height = c3d.height;
>  vres->res.format = c3d.format;
> @@ -126,12 +185,21 @@ static void virgl_cmd_resource_unref(VirtIOGPU *g,
>  return;
>  }
>
> -virgl_renderer_resource_detach_iov(unref.resource_id, NULL, NULL);
> -virgl_renderer_resource_unref(unref.resource_id);
> +#ifdef HAVE_VIRGL_RESOURCE_BLOB
> +if (vres->region) {
> +VirtIOGPUBase *b = VIRTIO_GPU_BASE(g);
> +MemoryRegion *mr = vres->region;
> +
> +warn_report("%s: blob resource %d not unmapped",
> +__func__, unref.resource_id);
> +vres->region = NULL;
> +memory_region_set_enabled(mr, false);
> +memory_region_del_subregion(&b->hostmem, mr);
> +object_unparent(OBJECT(mr));
> +}
> +#endif /* HAVE_VIRGL_RESOURCE_BLOB */
>
> -QTAILQ_REMOVE(&g->reslist, &vres->res, next);
> -virtio_gpu_cleanup_mapping(g, &vres->res);
> -g_free(vres);
> +vres_put_ref(vres);
>  }
>
>  static void virgl_cmd_context_create(VirtIOGPU *g,
> @@ -470,6 +538,191 @@ static void virgl_cmd_get_capset(VirtIOGPU *g,
>  g_free(resp);
>  }
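The get/put pattern added by the patch can be sketched as follows. A toy model, not the QEMU code: the last put stands in for both virgl_resource_unref() and virgl_resource_destroy():

```python
class Resource:
    """Minimal sketch of the vres_get_ref/vres_put_ref pattern: the last
    put both unrefs the renderer object and frees the wrapper."""
    def __init__(self):
        self.ref = 0
        self.destroyed = False

    def get(self):
        self.ref += 1

    def put(self):
        assert self.ref > 0
        self.ref -= 1
        if self.ref == 0:
            self.destroyed = True   # stands in for unref + destroy

r = Resource()
r.get(); r.get()        # e.g. created, then mapped as a blob
r.put()
assert not r.destroyed  # still referenced elsewhere
r.put()
assert r.destroyed      # last put tears the resource down
```

This is why the review asks for the ref/get/put/destroy helpers to be introduced in their own patch: the ordering guarantees they provide (no teardown while a mapping exists) are easier to verify in isolation.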

[PATCH 2/2] backends/iommufd: Remove mutex

2024-01-02 Thread Cédric Le Goater
Coverity reports a concurrent data access violation because be->users
is being accessed in iommufd_backend_can_be_deleted() without holding
the mutex.

However, these routines are called from the QEMU main thread when a
device is created. In this case, the code paths should be protected by
the BQL lock and it should be safe to drop the IOMMUFD backend mutex.
Simply remove it.

Fixes: CID 1531550
Fixes: CID 1531549
Signed-off-by: Cédric Le Goater 
---
 include/sysemu/iommufd.h | 2 --
 backends/iommufd.c   | 7 ---
 2 files changed, 9 deletions(-)

diff --git a/include/sysemu/iommufd.h b/include/sysemu/iommufd.h
index 
9c5524b0ed15ef5f81be159415bc216572a283d8..9af27ebd6ccb78ca8e16aa3c62629aab9f7f31e4
 100644
--- a/include/sysemu/iommufd.h
+++ b/include/sysemu/iommufd.h
@@ -2,7 +2,6 @@
 #define SYSEMU_IOMMUFD_H
 
 #include "qom/object.h"
-#include "qemu/thread.h"
 #include "exec/hwaddr.h"
 #include "exec/cpu-common.h"
 
@@ -19,7 +18,6 @@ struct IOMMUFDBackend {
 /*< protected >*/
 int fd;/* /dev/iommu file descriptor */
 bool owned;/* is the /dev/iommu opened internally */
-QemuMutex lock;
 uint32_t users;
 
 /*< public >*/
diff --git a/backends/iommufd.c b/backends/iommufd.c
index 393c0d9a3719e3de1a6b51a8ff2e75e184badc82..1ef683c7b080e688af46c5b98e61eafa73e39895 100644
--- a/backends/iommufd.c
+++ b/backends/iommufd.c
@@ -29,7 +29,6 @@ static void iommufd_backend_init(Object *obj)
 be->fd = -1;
 be->users = 0;
 be->owned = true;
-qemu_mutex_init(&be->lock);
 }
 
 static void iommufd_backend_finalize(Object *obj)
@@ -52,10 +51,8 @@ static void iommufd_backend_set_fd(Object *obj, const char *str, Error **errp)
 error_prepend(errp, "Could not parse remote object fd %s:", str);
 return;
 }
-qemu_mutex_lock(&be->lock);
 be->fd = fd;
 be->owned = false;
-qemu_mutex_unlock(&be->lock);
 trace_iommu_backend_set_fd(be->fd);
 }
 
@@ -79,7 +76,6 @@ int iommufd_backend_connect(IOMMUFDBackend *be, Error **errp)
 {
 int fd, ret = 0;
 
-qemu_mutex_lock(&be->lock);
 if (be->owned && !be->users) {
 fd = qemu_open_old("/dev/iommu", O_RDWR);
 if (fd < 0) {
@@ -93,13 +89,11 @@ int iommufd_backend_connect(IOMMUFDBackend *be, Error **errp)
 out:
 trace_iommufd_backend_connect(be->fd, be->owned,
   be->users, ret);
-qemu_mutex_unlock(&be->lock);
 return ret;
 }
 
 void iommufd_backend_disconnect(IOMMUFDBackend *be)
 {
-qemu_mutex_lock(&be->lock);
 if (!be->users) {
 goto out;
 }
@@ -110,7 +104,6 @@ void iommufd_backend_disconnect(IOMMUFDBackend *be)
 }
 out:
 trace_iommufd_backend_disconnect(be->fd, be->users);
-qemu_mutex_unlock(&be->lock);
 }
 
 int iommufd_backend_alloc_ioas(IOMMUFDBackend *be, uint32_t *ioas_id,
-- 
2.43.0




[PATCH 0/2] backends/iommufd: Remove mutex

2024-01-02 Thread Cédric Le Goater
Hello !

Coverity has some reports regarding the IOMMUFDBackend mutex. Since
the IOMMUFDBackend routines are called from the QEMU main thread, this
series simply suggests removing the mutex and relying on the BQL to
handle concurrent access.

Thanks,

C.

Cédric Le Goater (2):
  backends/iommufd: Remove check on number of backend users
  backends/iommufd: Remove mutex

 include/sysemu/iommufd.h |  2 --
 backends/iommufd.c   | 12 
 2 files changed, 14 deletions(-)

-- 
2.43.0




[PATCH 1/2] backends/iommufd: Remove check on number of backend users

2024-01-02 Thread Cédric Le Goater
QOM already has a ref count on objects and it will assert much
earlier, when INT_MAX is reached.

Signed-off-by: Cédric Le Goater 
---
 backends/iommufd.c | 5 -
 1 file changed, 5 deletions(-)

diff --git a/backends/iommufd.c b/backends/iommufd.c
index ba58a0eb0d0ba9aae625c987eb728547543dba66..393c0d9a3719e3de1a6b51a8ff2e75e184badc82 100644
--- a/backends/iommufd.c
+++ b/backends/iommufd.c
@@ -80,11 +80,6 @@ int iommufd_backend_connect(IOMMUFDBackend *be, Error **errp)
 int fd, ret = 0;
 
 qemu_mutex_lock(&be->lock);
-if (be->users == UINT32_MAX) {
-error_setg(errp, "too many connections");
-ret = -E2BIG;
-goto out;
-}
 if (be->owned && !be->users) {
 fd = qemu_open_old("/dev/iommu", O_RDWR);
 if (fd < 0) {
-- 
2.43.0




Re: [PATCH v6 0/2] acpi: report numa nodes for device memory using GI

2024-01-02 Thread Jonathan Cameron via
On Mon, 25 Dec 2023 10:26:01 +0530
 wrote:

> From: Ankit Agrawal 
> 
> There are upcoming devices which allow the CPU to access their memory
> cache-coherently. It is sensible to expose such memory as NUMA nodes separate
> from the sysmem node to the OS. The ACPI spec provides a scheme in SRAT
> called Generic Initiator Affinity Structure [1] to allow an association
> between a Proximity Domain (PXM) and a Generic Initiator (GI) (e.g.
> heterogeneous processors and accelerators, GPUs, and I/O devices with
> integrated compute or DMA engines).
> 
> While a single node per device may cover several use cases, it is however
> insufficient for a full utilization of the NVIDIA GPUs MIG
> (Multi-Instance GPUs) [2] feature. The feature allows partitioning of the
> GPU device resources (including device memory) into several (up to 8)
> isolated instances. Each of the memory partitions requires a dedicated NUMA
> node to operate. The partitions are not fixed and they can be created/deleted
> at runtime.
> 
> Linux OS does not provide a means to dynamically create/destroy NUMA nodes
> and such feature implementation is expected to be non-trivial. The nodes
> that the OS discovers at boot time while parsing SRAT remain fixed. So we
> utilize the GI Affinity structures that allow association between nodes
> and devices. Multiple GI structures per device/BDF are possible, allowing
> creation of multiple nodes in the VM by exposing unique PXM in each of these
> structures.
> 
> Implement the mechanism to build the GI affinity structures, as QEMU currently
> does not. Introduce a new acpi-generic-initiator object that allows an
> association of a set of nodes with a device. During SRAT creation, all such
> objects are identified and used to add the GI Affinity Structures. Currently,
> only PCI devices are supported. On a multi-device system, each device supporting
> the features needs a unique acpi-generic-initiator object with its own set of
> NUMA nodes associated to it.
> 
> The admin will create a range of 8 nodes and associate that with the device
> using the acpi-generic-initiator object. While a configuration of less than
> 8 nodes per device is allowed, such configuration will prevent utilization of
> the feature to the fullest. This setting is applicable to all the Grace+Hopper
> systems. The following is an example of the QEMU command line arguments to
> create 8 nodes and link them to the device 'dev0':
> 
> -numa node,nodeid=2 -numa node,nodeid=3 -numa node,nodeid=4 \
> -numa node,nodeid=5 -numa node,nodeid=6 -numa node,nodeid=7 \
> -numa node,nodeid=8 -numa node,nodeid=9 \
> -device 
> vfio-pci-nohotplug,host=0009:01:00.0,bus=pcie.0,addr=04.0,rombar=0,id=dev0 \
> -object acpi-generic-initiator,id=gi0,pci-dev=dev0,host-nodes=2-9 \
> 

I'd find it helpful to see the resulting chunk of SRAT for these examples
(disassembled) in this cover letter and the patches (where there are more examples).




Re: [PATCH v2 10/16] target/riscv: create finalize_features() for KVM

2024-01-02 Thread Daniel Henrique Barboza




On 12/29/23 08:22, Vladimir Isaev wrote:

22.12.2023 15:22, Daniel Henrique Barboza wrote:

To turn cbom_blocksize and cboz_blocksize into class properties we need
KVM specific changes.

KVM is creating its own version of these options with a customized
setter() that prevents users from picking an invalid value during init()
time. This comes at the cost of duplicating each option that KVM
supports. This will keep happening for each new shared option KVM
implements in the future.

We can avoid that by using the same property TCG uses and adding
specific KVM handling during finalize() time, like TCG already does with
riscv_tcg_cpu_finalize_features(). To do that, the common CPU property
offers a way of knowing if an option was user set or not, sparing us
from doing unneeded syscalls.

riscv_kvm_cpu_finalize_features() is then created using the same
KVMScratch CPU we already use during init() time, since finalize() time
is still too early to use the official KVM CPU for it. cbom_blocksize
and cboz_blocksize are then handled during finalize() in the same way
they're handled by their KVM specific setter.

With this change we can proceed with the blocksize changes in the common
code without breaking the KVM driver.

Signed-off-by: Daniel Henrique Barboza 
---
  target/riscv/cpu.c   | 16 +++---
  target/riscv/cpu.h   |  1 +
  target/riscv/kvm/kvm-cpu.c   | 59 
  target/riscv/kvm/kvm_riscv.h |  1 +
  4 files changed, 72 insertions(+), 5 deletions(-)

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index 8be619b6f1..f49d31d753 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -63,6 +63,11 @@ static void cpu_option_add_user_setting(const char *optname, uint32_t value)
  GUINT_TO_POINTER(value));
  }

+bool riscv_cpu_option_set(const char *optname)
+{
+return g_hash_table_contains(general_user_opts, optname);
+}
+


This function may behave unexpectedly for future developers, since only a
restricted set of options can be checked with it (e.g. vlen/elen/cbom/cboz
size, but not vext_spec or pmu-num/mask).


Makes sense. I'll add all options in this hash to make it consistent.


Thanks,


Daniel




  #define ISA_EXT_DATA_ENTRY(_name, _min_ver, _prop) \
  {#_name, _min_ver, CPU_CFG_OFFSET(_prop)}

@@ -1056,17 +1061,18 @@ void riscv_cpu_finalize_features(RISCVCPU *cpu, Error **errp)
  {
  Error *local_err = NULL;

-/*
- * KVM accel does not have a specialized finalize()
- * callback because its extensions are validated
- * in the get()/set() callbacks of each property.
- */
  if (tcg_enabled()) {
   riscv_tcg_cpu_finalize_features(cpu, &local_err);
  if (local_err != NULL) {
  error_propagate(errp, local_err);
  return;
  }
+} else if (kvm_enabled()) {
+riscv_kvm_cpu_finalize_features(cpu, &local_err);
+if (local_err != NULL) {
+error_propagate(errp, local_err);
+return;
+}
  }

  #ifndef CONFIG_USER_ONLY
diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
index 53101b82c5..988471c7ba 100644
--- a/target/riscv/cpu.h
+++ b/target/riscv/cpu.h
@@ -495,6 +495,7 @@ bool riscv_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
  bool probe, uintptr_t retaddr);
  char *riscv_isa_string(RISCVCPU *cpu);
  void riscv_cpu_list(void);
+bool riscv_cpu_option_set(const char *optname);

  #define cpu_list riscv_cpu_list
  #define cpu_mmu_index riscv_cpu_mmu_index
diff --git a/target/riscv/kvm/kvm-cpu.c b/target/riscv/kvm/kvm-cpu.c
index 62a1e51f0a..70fb075846 100644
--- a/target/riscv/kvm/kvm-cpu.c
+++ b/target/riscv/kvm/kvm-cpu.c
@@ -1490,6 +1490,65 @@ static void kvm_cpu_instance_init(CPUState *cs)
  }
  }

+void riscv_kvm_cpu_finalize_features(RISCVCPU *cpu, Error **errp)
+{
+CPURISCVState *env = &cpu->env;
+KVMScratchCPU kvmcpu;
+struct kvm_one_reg reg;
+uint64_t val;
+int ret;
+
+/* short-circuit without spinning the scratch CPU */
+if (!cpu->cfg.ext_zicbom && !cpu->cfg.ext_zicboz) {
+return;
+}
+
+if (!kvm_riscv_create_scratch_vcpu(&kvmcpu)) {
+error_setg(errp, "Unable to create scratch KVM cpu");
+return;
+}
+
+if (cpu->cfg.ext_zicbom &&
+riscv_cpu_option_set(kvm_cbom_blocksize.name)) {
+
+reg.id = kvm_riscv_reg_id_ulong(env, KVM_REG_RISCV_CONFIG,
+kvm_cbom_blocksize.kvm_reg_id);
+reg.addr = (uint64_t)&val;
+ret = ioctl(kvmcpu.cpufd, KVM_GET_ONE_REG, &reg);
+if (ret != 0) {
+error_setg(errp, "Unable to read cbom_blocksize, error %d", errno);
+return;
+}
+
+if (cpu->cfg.cbom_blocksize != val) {
+error_setg(errp, "Unable to set cbom_blocksize to a different "
+   "value than the host (%lu)", val);
+return;
+}
+}
+
+if (cpu->cfg.ext_zicboz &&
+ 

Re: [RFC PATCH] hw/arm: Prefer arm_feature() over object_property_find()

2024-01-02 Thread Philippe Mathieu-Daudé

Hi,

On 18/12/23 10:48, Peter Maydell wrote:

On Mon, 18 Dec 2023 at 07:26, Markus Armbruster  wrote:


Peter Maydell  writes:


On Thu, 14 Dec 2023 at 17:14, Philippe Mathieu-Daudé  wrote:


QOM properties are added on the ARM vCPU object when a
feature is present. Rather than checking the property
is present, check the feature.

Suggested-by: Markus Armbruster 
Signed-off-by: Philippe Mathieu-Daudé 
---
RFC: If there is no objection on this patch, I can split
  as a per-feature series if necessary.

Based-on: <20231123143813.42632-1-phi...@linaro.org>
   "hw: Simplify accesses to CPUState::'start-powered-off' property"


I'm not a super-fan of board-level code looking inside
the QOM object with direct use of arm_feature() when
it doesn't have to. What's wrong with asking whether
the property exists before trying to set it?


I'm not a fan of using QOM instead of the native C interface.

The native C interface is CPUArmState method arm_feature().


But we don't in most of these cases really want to know "is this
a CPU with feature foo?". What we're asking is "does this
QOM property exist so it won't blow up if I set/get it?".


[More analysis on this topic.]

ARMV7M (hw/arm/armv7m.c) is an interesting QOM use case.

ARMV7M is an ARMCPU container, with a few more things. (We have
more complex cases with containers of array of vCPUs, so this
single-vCPU case is a good start).

We'd like to apply properties on ARMV7M which get forwarded
to the embedded ARMCPU.
Usually we create the ARMCPU in armv7m_instance_init(), call
object_property_add_alias() to alias some ARMCPU properties to ARMV7M,
so these properties can be set externally before ARMV7M is
realized, being directly forwarded to the embedded ARMCPU [*].

The problem with ARMV7M is that the ARMCPU QOM type is variable,
so we don't know it in armv7m_instance_init() but only later
in armv7m_realize(), thus we cannot call QOM _add_alias() to
alias them. One way to resolve this is to duplicate all possible
ARMCPU properties we want to set on ARMV7M, and set them in
armv7m_realize() after the ARMCPU is created and before it is
realized (the current implementation):

static void armv7m_realize(DeviceState *dev, Error **errp)
{
...
s->cpu = ARM_CPU(object_new_with_props(s->cpu_type,
   OBJECT(s), "cpu",
   &error_abort, NULL));
...

if (object_property_find(OBJECT(s->cpu), "vfp")) {
if (!object_property_set_bool(OBJECT(s->cpu), "vfp",
  s->vfp, errp)) {
return;
}
}
...

if (!qdev_realize(cpudev, NULL, errp)) {
return;
}
...
}

static Property armv7m_properties[] = {
DEFINE_PROP_STRING("cpu-type", ARMv7MState, cpu_type),
...
DEFINE_PROP_BOOL("vfp", ARMv7MState, vfp, true),
...
};

Note ARMV7M "vfp" is a /static/ QOM property, so it cannot be
unregistered if the ARMCPU doesn't expose it.

* If ARMCPU doesn't provide "vfp", ARMV7M properties introspection
  still shows 'vfp=true'.

* If ARMCPU doesn't provide "vfp", requesting 'vfp=true' on ARMV7M
  is silently ignored.

* If ARMCPU doesn't provide "vfp", even if we unregister ARMV7M "vfp"
  property in realize() for cleaner introspection, we can not check
  whether user requested an explicit value before realize().
  Possibly we could use a tri-state {unset/false/true} dynamic property
  to check that.

[*] object_property_add_alias() is a 1-1 static aliasing. In the
case of cluster of objects we don't have API to do a 1-N static
aliasing; the current way to do that is similar to dynamic
property setters iterating on the array (getters usually return
the container property, ignoring the cluster values).

Regards,

Phil.


