Re: [External] Re: [PATCH v2,1/1] memory: avoid updating ioeventfds for some address_space

2023-09-11 Thread hongmainquan




在 2023/9/12 1:55 上午, David Hildenbrand 写道:

On 11.09.23 18:28, Peter Xu wrote:

On Mon, Sep 04, 2023 at 08:51:43PM +0800, hongmainquan wrote:


Friendly ping...
Hello, this patch has already received a R-b from PeterXu. Could you 
please
help me review it as well and see if there are any issues? If 
everything is

fine, could you please consider merging it? Thank you!


Paolo, wanna pick this one up?

David, I know you're preparing a pull with a lot of memory changes, if 
you

like to pick this up it'll be good too.


I queued it to

https://github.com/davidhildenbrand/qemu.git mem-next

If nobody beats me to it (or requests me to drop it), I'll send it 
upstream next week.



Thanks for that! I hope it can be successfully merged.



[PATCH v2 06/10] Optimize loongarch_irq_init function implementation

2023-09-11 Thread xianglai li
Optimize loongarch_irq_init function implementation
and abstract the function loongarch_cpu_irq_init from it.

Cc: "Salil Mehta" 
Cc: Xiaojuan Yang 
Cc: Song Gao 
Cc: "Michael S. Tsirkin" 
Cc: Igor Mammedov 
Cc: Ani Sinha 
Cc: Paolo Bonzini 
Cc: Richard Henderson 
Cc: Eduardo Habkost 
Cc: Marcel Apfelbaum 
Cc: "Philippe Mathieu-Daudé" 
Cc: Yanan Wang 
Cc: "Daniel P. Berrangé" 
Cc: Peter Xu 
Cc: David Hildenbrand 
Cc: Bibo Mao 
Signed-off-by: xianglai li 
---
 hw/loongarch/virt.c | 105 
 include/hw/loongarch/virt.h |   5 +-
 2 files changed, 62 insertions(+), 48 deletions(-)

diff --git a/hw/loongarch/virt.c b/hw/loongarch/virt.c
index b8474e7b94..fb06b4ab4e 100644
--- a/hw/loongarch/virt.c
+++ b/hw/loongarch/virt.c
@@ -46,6 +46,8 @@
 #include "hw/block/flash.h"
 #include "qemu/error-report.h"
 
+static LoongArchCPU *loongarch_cpu_irq_init(MachineState *machine,
+LoongArchCPU *cpu, Error **errp);
 
 static void virt_flash_create(LoongArchMachineState *lams)
 {
@@ -573,16 +575,16 @@ static void loongarch_devices_init(DeviceState *pch_pic, 
LoongArchMachineState *
 static void loongarch_irq_init(LoongArchMachineState *lams)
 {
 MachineState *ms = MACHINE(lams);
-DeviceState *pch_pic, *pch_msi, *cpudev;
-DeviceState *ipi, *extioi;
+DeviceState *pch_pic, *pch_msi;
+DeviceState *extioi;
 SysBusDevice *d;
 LoongArchCPU *lacpu;
-CPULoongArchState *env;
 CPUState *cpu_state;
-int cpu, pin, i, start, num;
+int cpu, i, start, num;
 
 extioi = qdev_new(TYPE_LOONGARCH_EXTIOI);
 sysbus_realize_and_unref(SYS_BUS_DEVICE(extioi), _fatal);
+lams->extioi = extioi;
 
 /*
  * The connection of interrupts:
@@ -607,44 +609,8 @@ static void loongarch_irq_init(LoongArchMachineState *lams)
  */
 for (cpu = 0; cpu < ms->smp.cpus; cpu++) {
 cpu_state = qemu_get_cpu(cpu);
-cpudev = DEVICE(cpu_state);
 lacpu = LOONGARCH_CPU(cpu_state);
-env = &(lacpu->env);
-
-ipi = qdev_new(TYPE_LOONGARCH_IPI);
-sysbus_realize_and_unref(SYS_BUS_DEVICE(ipi), _fatal);
-
-/* connect ipi irq to cpu irq */
-qdev_connect_gpio_out(ipi, 0, qdev_get_gpio_in(cpudev, IRQ_IPI));
-/* IPI iocsr memory region */
-memory_region_add_subregion(>system_iocsr, SMP_IPI_MAILBOX,
-sysbus_mmio_get_region(SYS_BUS_DEVICE(ipi),
-0));
-memory_region_add_subregion(>system_iocsr, MAIL_SEND_ADDR,
-sysbus_mmio_get_region(SYS_BUS_DEVICE(ipi),
-1));
-/*
- * extioi iocsr memory region
- * only one extioi is added on loongarch virt machine
- * external device interrupt can only be routed to cpu 0-3
- */
-if (cpu < EXTIOI_CPUS)
-memory_region_add_subregion(>system_iocsr, APIC_BASE,
-sysbus_mmio_get_region(SYS_BUS_DEVICE(extioi),
-cpu));
-env->ipistate = ipi;
-}
-
-/*
- * connect ext irq to the cpu irq
- * cpu_pin[9:2] <= intc_pin[7:0]
- */
-for (cpu = 0; cpu < MIN(ms->smp.cpus, EXTIOI_CPUS); cpu++) {
-cpudev = DEVICE(qemu_get_cpu(cpu));
-for (pin = 0; pin < LS3A_INTC_IP; pin++) {
-qdev_connect_gpio_out(extioi, (cpu * 8 + pin),
-  qdev_get_gpio_in(cpudev, pin + 2));
-}
+loongarch_cpu_irq_init(ms, lacpu, _fatal);
 }
 
 pch_pic = qdev_new(TYPE_LOONGARCH_PCH_PIC);
@@ -927,11 +893,7 @@ static void loongarch_init(MachineState *machine)
 }
 }
 fdt_add_flash_node(lams);
-/* register reset function */
-for (i = 0; i < machine->smp.cpus; i++) {
-lacpu = LOONGARCH_CPU(qemu_get_cpu(i));
-qemu_register_reset(reset_load_elf, lacpu);
-}
+
 /* Initialize the IO interrupt subsystem */
 loongarch_irq_init(lams);
 fdt_add_irqchip_node(lams);
@@ -1091,6 +1053,57 @@ static void virt_mem_plug(HotplugHandler *hotplug_dev,
  dev, _abort);
 }
 
+static LoongArchCPU *loongarch_cpu_irq_init(MachineState *machine,
+LoongArchCPU *cpu, Error **errp)
+{
+LoongArchMachineState *lsms = LOONGARCH_MACHINE(machine);
+CPUState *cs = CPU(cpu);
+unsigned int cpu_index = cs->cpu_index;
+DeviceState *cpudev = DEVICE(cpu);
+DeviceState *extioi = lsms->extioi;
+CPULoongArchState *env = >env;
+DeviceState *ipi;
+int pin;
+
+qemu_register_reset(reset_load_elf, cpu);
+
+ipi = qdev_new(TYPE_LOONGARCH_IPI);
+sysbus_realize_and_unref(SYS_BUS_DEVICE(ipi), errp);
+
+/* connect ipi irq to cpu irq */
+qdev_connect_gpio_out(ipi, 0, qdev_get_gpio_in(cpudev, IRQ_IPI));
+/* IPI iocsr memory region */
+memory_region_add_subregion(>system_iocsr, SMP_IPI_MAILBOX,
+ 

[PATCH v2 09/10] Add generic event device for Loongarch

2023-09-11 Thread xianglai li
Create a new GED device type for Loongarch,
mount cpu_madt function to update the ACPI table.

Cc: "Salil Mehta" 
Cc: Xiaojuan Yang 
Cc: Song Gao 
Cc: "Michael S. Tsirkin" 
Cc: Igor Mammedov 
Cc: Ani Sinha 
Cc: Paolo Bonzini 
Cc: Richard Henderson 
Cc: Eduardo Habkost 
Cc: Marcel Apfelbaum 
Cc: "Philippe Mathieu-Daudé" 
Cc: Yanan Wang 
Cc: "Daniel P. Berrangé" 
Cc: Peter Xu 
Cc: David Hildenbrand 
Cc: Bibo Mao 
Signed-off-by: xianglai li 
---
 hw/loongarch/acpi-build.c | 17 +
 hw/loongarch/generic_event_device_loongarch.c | 36 +++
 hw/loongarch/meson.build  |  2 +-
 include/hw/acpi/generic_event_device.h|  1 +
 include/hw/loongarch/virt.h   |  4 +++
 5 files changed, 59 insertions(+), 1 deletion(-)
 create mode 100644 hw/loongarch/generic_event_device_loongarch.c

diff --git a/hw/loongarch/acpi-build.c b/hw/loongarch/acpi-build.c
index ae292fc543..66fad295cc 100644
--- a/hw/loongarch/acpi-build.c
+++ b/hw/loongarch/acpi-build.c
@@ -46,6 +46,23 @@
 #define ACPI_BUILD_DPRINTF(fmt, ...)
 #endif
 
+void virt_madt_cpu_entry(int uid,
+ const CPUArchIdList *apic_ids,
+ GArray *entry, bool force_enabled)
+{
+uint32_t apic_id = apic_ids->cpus[uid].arch_id;
+/* Flags – Local APIC Flags */
+uint32_t flags = apic_ids->cpus[uid].cpu != NULL || force_enabled ?
+ 1 /* Enabled */ : 0;
+
+/* Rev 1.0b, Table 5-13 Processor Local APIC Structure */
+build_append_int_noprefix(entry, 0, 1);   /* Type */
+build_append_int_noprefix(entry, 8, 1);   /* Length */
+build_append_int_noprefix(entry, uid, 1); /* ACPI Processor ID */
+build_append_int_noprefix(entry, apic_id, 1); /* APIC ID */
+build_append_int_noprefix(entry, flags, 4); /* Flags */
+}
+
 /* build FADT */
 static void init_common_fadt_data(AcpiFadtData *data)
 {
diff --git a/hw/loongarch/generic_event_device_loongarch.c 
b/hw/loongarch/generic_event_device_loongarch.c
new file mode 100644
index 00..1fe550239b
--- /dev/null
+++ b/hw/loongarch/generic_event_device_loongarch.c
@@ -0,0 +1,36 @@
+/*
+ * loongarch variant of the generic event device for hw reduced acpi
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2 or later, as published by the Free Software Foundation.
+ */
+
+#include "qemu/osdep.h"
+#include "hw/acpi/generic_event_device.h"
+#include "hw/loongarch/virt.h"
+
+static void acpi_ged_loongarch_class_init(ObjectClass *class, void *data)
+{
+AcpiDeviceIfClass *adevc = ACPI_DEVICE_IF_CLASS(class);
+
+adevc->madt_cpu = virt_madt_cpu_entry;
+}
+
+static const TypeInfo acpi_ged_loongarch_info = {
+.name  = TYPE_ACPI_GED_LOONGARCH,
+.parent= TYPE_ACPI_GED,
+.class_init= acpi_ged_loongarch_class_init,
+.interfaces = (InterfaceInfo[]) {
+{ TYPE_HOTPLUG_HANDLER },
+{ TYPE_ACPI_DEVICE_IF },
+{ }
+}
+};
+
+static void acpi_ged_loongarch_register_types(void)
+{
+type_register_static(_ged_loongarch_info);
+}
+
+type_init(acpi_ged_loongarch_register_types)
diff --git a/hw/loongarch/meson.build b/hw/loongarch/meson.build
index c0421502ab..8d21addee3 100644
--- a/hw/loongarch/meson.build
+++ b/hw/loongarch/meson.build
@@ -3,6 +3,6 @@ loongarch_ss.add(files(
 'fw_cfg.c',
 ))
 loongarch_ss.add(when: 'CONFIG_LOONGARCH_VIRT', if_true: [files('virt.c'), 
fdt])
-loongarch_ss.add(when: 'CONFIG_ACPI', if_true: files('acpi-build.c'))
+loongarch_ss.add(when: 'CONFIG_ACPI', if_true: files('acpi-build.c', 
'generic_event_device_loongarch.c'))
 
 hw_arch += {'loongarch': loongarch_ss}
diff --git a/include/hw/acpi/generic_event_device.h 
b/include/hw/acpi/generic_event_device.h
index d0a5a43abf..2923bd9d82 100644
--- a/include/hw/acpi/generic_event_device.h
+++ b/include/hw/acpi/generic_event_device.h
@@ -71,6 +71,7 @@
 OBJECT_DECLARE_SIMPLE_TYPE(AcpiGedState, ACPI_GED)
 
 #define TYPE_ACPI_GED_X86 "acpi-ged-x86"
+#define TYPE_ACPI_GED_LOONGARCH "acpi-ged-loongarch"
 
 #define ACPI_GED_EVT_SEL_OFFSET0x0
 #define ACPI_GED_EVT_SEL_LEN   0x4
diff --git a/include/hw/loongarch/virt.h b/include/hw/loongarch/virt.h
index 176dc43a93..f6c9495af2 100644
--- a/include/hw/loongarch/virt.h
+++ b/include/hw/loongarch/virt.h
@@ -60,4 +60,8 @@ struct LoongArchMachineState {
 OBJECT_DECLARE_SIMPLE_TYPE(LoongArchMachineState, LOONGARCH_MACHINE)
 bool loongarch_is_acpi_enabled(LoongArchMachineState *lams);
 void loongarch_acpi_setup(LoongArchMachineState *lams);
+void virt_madt_cpu_entry(int uid,
+ const CPUArchIdList *apic_ids, GArray *entry,
+ bool force_enabled);
+
 #endif
-- 
2.39.1




[PATCH v2 08/10] Add support of *unrealize* for Loongarch cpu

2023-09-11 Thread xianglai li
Add the unrealize function to the Loongarch CPU for cpu hot-(un)plug

Cc: "Salil Mehta" 
Cc: Xiaojuan Yang 
Cc: Song Gao 
Cc: "Michael S. Tsirkin" 
Cc: Igor Mammedov 
Cc: Ani Sinha 
Cc: Paolo Bonzini 
Cc: Richard Henderson 
Cc: Eduardo Habkost 
Cc: Marcel Apfelbaum 
Cc: "Philippe Mathieu-Daudé" 
Cc: Yanan Wang 
Cc: "Daniel P. Berrangé" 
Cc: Peter Xu 
Cc: David Hildenbrand 
Cc: Bibo Mao 
Signed-off-by: xianglai li 
---
 target/loongarch/cpu.c | 22 ++
 target/loongarch/cpu.h |  1 +
 2 files changed, 23 insertions(+)

diff --git a/target/loongarch/cpu.c b/target/loongarch/cpu.c
index a5153cb3bc..03134d7bac 100644
--- a/target/loongarch/cpu.c
+++ b/target/loongarch/cpu.c
@@ -572,6 +572,24 @@ static void loongarch_cpu_realizefn(DeviceState *dev, 
Error **errp)
 lacc->parent_realize(dev, errp);
 }
 
+static void loongarch_cpu_unrealizefn(DeviceState *dev)
+{
+LoongArchCPUClass *mcc = LOONGARCH_CPU_GET_CLASS(dev);
+
+#ifndef CONFIG_USER_ONLY
+CPUState *cs = CPU(dev);
+LoongArchCPU *cpu = LOONGARCH_CPU(dev);
+CPULoongArchState *env = >env;
+
+cpu_remove_sync(CPU(dev));
+cpu_address_space_destroy(cs, 0);
+address_space_destroy(>address_space_iocsr);
+memory_region_del_subregion(>system_iocsr, >iocsr_mem);
+#endif
+
+mcc->parent_unrealize(dev);
+}
+
 #ifndef CONFIG_USER_ONLY
 static void loongarch_qemu_write(void *opaque, hwaddr addr,
  uint64_t val, unsigned size)
@@ -752,6 +770,9 @@ static void loongarch_cpu_class_init(ObjectClass *c, void 
*data)
 device_class_set_props(dc, loongarch_cpu_properties);
 device_class_set_parent_realize(dc, loongarch_cpu_realizefn,
 >parent_realize);
+device_class_set_parent_unrealize(dc, loongarch_cpu_unrealizefn,
+  >parent_unrealize);
+
 resettable_class_set_parent_phases(rc, NULL, loongarch_cpu_reset_hold, 
NULL,
>parent_phases);
 
@@ -773,6 +794,7 @@ static void loongarch_cpu_class_init(ObjectClass *c, void 
*data)
 #ifdef CONFIG_TCG
 cc->tcg_ops = _tcg_ops;
 #endif
+dc->user_creatable = true;
 }
 
 static gchar *loongarch32_gdb_arch_name(CPUState *cs)
diff --git a/target/loongarch/cpu.h b/target/loongarch/cpu.h
index 058bc53bde..4143138b12 100644
--- a/target/loongarch/cpu.h
+++ b/target/loongarch/cpu.h
@@ -412,6 +412,7 @@ struct LoongArchCPUClass {
 /*< public >*/
 
 DeviceRealize parent_realize;
+DeviceUnrealize parent_unrealize;
 ResettablePhases parent_phases;
 };
 
-- 
2.39.1




[PATCH v2 02/10] Update CPUs AML with cpu-(ctrl)dev change

2023-09-11 Thread xianglai li
CPUs Control device(\\_SB.PCI0) register interface for the x86 arch
is based on PCI and is IO port based and hence existing cpus AML code
assumes _CRS objects would evaluate to a system resource which describes
IO Port address.
But on Loongarch arch CPUs control device(\\_SB.PRES) register interface
is memory-mapped hence _CRS object should evaluate to system resource
which describes memory-mapped base address.

This cpus AML code change updates the existing interface of the build cpus AML
function to accept both IO/MEMORY type regions and update the _CRS object
correspondingly.

Co-authored-by: "Salil Mehta" 
Cc: "Salil Mehta" 
Cc: Xiaojuan Yang 
Cc: Song Gao 
Cc: "Michael S. Tsirkin" 
Cc: Igor Mammedov 
Cc: Ani Sinha 
Cc: Paolo Bonzini 
Cc: Richard Henderson 
Cc: Eduardo Habkost 
Cc: Marcel Apfelbaum 
Cc: "Philippe Mathieu-Daudé" 
Cc: Yanan Wang 
Cc: "Daniel P. Berrangé" 
Cc: Peter Xu 
Cc: David Hildenbrand 
Cc: Bibo Mao 
Signed-off-by: xianglai li 
---
 hw/acpi/cpu.c | 20 +++-
 hw/i386/acpi-build.c  |  2 +-
 include/hw/acpi/cpu.h |  5 +++--
 3 files changed, 19 insertions(+), 8 deletions(-)

diff --git a/hw/acpi/cpu.c b/hw/acpi/cpu.c
index 6897c8789a..41fc157ac9 100644
--- a/hw/acpi/cpu.c
+++ b/hw/acpi/cpu.c
@@ -5,6 +5,7 @@
 #include "qapi/qapi-events-acpi.h"
 #include "trace.h"
 #include "sysemu/numa.h"
+#include "hw/acpi/cpu_hotplug.h"
 
 #define OVMF_CPUHP_SMI_CMD 4
 
@@ -331,9 +332,10 @@ const VMStateDescription vmstate_cpu_hotplug = {
 #define CPU_FW_EJECT_EVENT "CEJF"
 
 void build_cpus_aml(Aml *table, MachineState *machine, CPUHotplugFeatures opts,
-hwaddr io_base,
+hwaddr mmap_io_base,
 const char *res_root,
-const char *event_handler_method)
+const char *event_handler_method,
+AmlRegionSpace rs)
 {
 Aml *ifctx;
 Aml *field;
@@ -360,14 +362,22 @@ void build_cpus_aml(Aml *table, MachineState *machine, 
CPUHotplugFeatures opts,
 aml_append(cpu_ctrl_dev, aml_mutex(CPU_LOCK, 0));
 
 crs = aml_resource_template();
-aml_append(crs, aml_io(AML_DECODE16, io_base, io_base, 1,
+if (rs == AML_SYSTEM_IO) {
+aml_append(crs, aml_io(AML_DECODE16, mmap_io_base, mmap_io_base, 1,
ACPI_CPU_HOTPLUG_REG_LEN));
+} else {
+aml_append(crs, aml_memory32_fixed(mmap_io_base,
+   ACPI_CPU_HOTPLUG_REG_LEN, AML_READ_WRITE));
+}
+
 aml_append(cpu_ctrl_dev, aml_name_decl("_CRS", crs));
 
+g_assert(rs == AML_SYSTEM_IO || rs == AML_SYSTEM_MEMORY);
 /* declare CPU hotplug MMIO region with related access fields */
 aml_append(cpu_ctrl_dev,
-aml_operation_region("PRST", AML_SYSTEM_IO, aml_int(io_base),
- ACPI_CPU_HOTPLUG_REG_LEN));
+aml_operation_region("PRST", rs,
+ aml_int(mmap_io_base),
+ ACPI_CPU_HOTPLUG_REG_LEN));
 
 field = aml_field("PRST", AML_BYTE_ACC, AML_NOLOCK,
   AML_WRITE_AS_ZEROS);
diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index bb12b0ad43..560f108d38 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -1550,7 +1550,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
 .fw_unplugs_cpu = pm->smi_on_cpu_unplug,
 };
 build_cpus_aml(dsdt, machine, opts, pm->cpu_hp_io_base,
-   "\\_SB.PCI0", "\\_GPE._E02");
+   "\\_SB.PCI0", "\\_GPE._E02", AML_SYSTEM_IO);
 }
 
 if (pcms->memhp_io_base && nr_mem) {
diff --git a/include/hw/acpi/cpu.h b/include/hw/acpi/cpu.h
index 999caaf510..cddea78333 100644
--- a/include/hw/acpi/cpu.h
+++ b/include/hw/acpi/cpu.h
@@ -56,9 +56,10 @@ typedef struct CPUHotplugFeatures {
 } CPUHotplugFeatures;
 
 void build_cpus_aml(Aml *table, MachineState *machine, CPUHotplugFeatures opts,
-hwaddr io_base,
+hwaddr mmap_io_base,
 const char *res_root,
-const char *event_handler_method);
+const char *event_handler_method,
+AmlRegionSpace rs);
 
 void acpi_cpu_ospm_status(CPUHotplugState *cpu_st, ACPIOSTInfoList ***list);
 
-- 
2.39.1




[PATCH v2 05/10] Added CPU topology support for Loongarch

2023-09-11 Thread xianglai li
1.Add topological relationships for Loongarch VCPU
and initialize topology member variables.
2.Add a description of the calculation method of
the arch_id and the topological relationship of the CPU.

Cc: "Salil Mehta" 
Cc: Xiaojuan Yang 
Cc: Song Gao 
Cc: "Michael S. Tsirkin" 
Cc: Igor Mammedov 
Cc: Ani Sinha 
Cc: Paolo Bonzini 
Cc: Richard Henderson 
Cc: Eduardo Habkost 
Cc: Marcel Apfelbaum 
Cc: "Philippe Mathieu-Daudé" 
Cc: Yanan Wang 
Cc: "Daniel P. Berrangé" 
Cc: Peter Xu 
Cc: David Hildenbrand 
Cc: Bibo Mao 
Signed-off-by: xianglai li 
---
 docs/system/loongarch/virt.rst |  31 ++
 hw/loongarch/virt.c| 101 ++---
 target/loongarch/cpu.c |  13 -
 target/loongarch/cpu.h |  12 +++-
 4 files changed, 134 insertions(+), 23 deletions(-)

diff --git a/docs/system/loongarch/virt.rst b/docs/system/loongarch/virt.rst
index c37268b404..eaba9e2fd7 100644
--- a/docs/system/loongarch/virt.rst
+++ b/docs/system/loongarch/virt.rst
@@ -28,6 +28,37 @@ The ``qemu-system-loongarch64`` provides emulation for virt
 machine. You can specify the machine type ``virt`` and
 cpu type ``la464``.
 
+CPU Topology
+
+
+The ``LA464`` type CPUs have the concept of Socket Core and Thread.
+
+For example:
+
+``-smp 1,maxcpus=M,sockets=S,cores=C,threads=T``
+
+The above parameters indicate that the machine has a maximum of ``M`` vCPUs and
+``S`` sockets, each socket has ``C`` cores, each core has ``T`` threads,
+and each thread corresponds to a vCPU.
+
+Then ``M`` ``S`` ``C`` ``T`` has the following relationship:
+
+``M = S * C * T``
+
+In the CPU topology relationship, When we know the ``socket_id`` ``core_id``
+and ``thread_id`` of the CPU, we can calculate its ``arch_id``:
+
+``arch_id = (socket_id * S) + (core_id * C) + (thread_id * T)``
+
+Similarly, when we know the ``arch_id`` of the CPU,
+we can also get its ``socket_id`` ``core_id`` and ``thread_id``:
+
+``socket_id = arch_id / (C * T)``
+
+``core_id = (arch_id / T) % C``
+
+``thread_id = arch_id % T``
+
 Boot options
 
 
diff --git a/hw/loongarch/virt.c b/hw/loongarch/virt.c
index 2629128aed..b8474e7b94 100644
--- a/hw/loongarch/virt.c
+++ b/hw/loongarch/virt.c
@@ -624,11 +624,11 @@ static void loongarch_irq_init(LoongArchMachineState 
*lams)
 sysbus_mmio_get_region(SYS_BUS_DEVICE(ipi),
 1));
 /*
-* extioi iocsr memory region
-* only one extioi is added on loongarch virt machine
-* external device interrupt can only be routed to cpu 0-3
-*/
-   if (cpu < EXTIOI_CPUS)
+ * extioi iocsr memory region
+ * only one extioi is added on loongarch virt machine
+ * external device interrupt can only be routed to cpu 0-3
+ */
+if (cpu < EXTIOI_CPUS)
 memory_region_add_subregion(>system_iocsr, APIC_BASE,
 sysbus_mmio_get_region(SYS_BUS_DEVICE(extioi),
 cpu));
@@ -789,9 +789,7 @@ static void loongarch_init(MachineState *machine)
 NodeInfo *numa_info = machine->numa_state->nodes;
 int i;
 hwaddr fdt_base;
-const CPUArchIdList *possible_cpus;
 MachineClass *mc = MACHINE_GET_CLASS(machine);
-CPUState *cpu;
 char *ramName = NULL;
 
 if (!cpu_model) {
@@ -803,16 +801,41 @@ static void loongarch_init(MachineState *machine)
 exit(1);
 }
 create_fdt(lams);
-/* Init CPUs */
 
-possible_cpus = mc->possible_cpu_arch_ids(machine);
-for (i = 0; i < possible_cpus->len; i++) {
-cpu = cpu_create(machine->cpu_type);
-cpu->cpu_index = i;
-machine->possible_cpus->cpus[i].cpu = OBJECT(cpu);
-lacpu = LOONGARCH_CPU(cpu);
-lacpu->phy_id = machine->possible_cpus->cpus[i].arch_id;
+/* Init CPUs */
+mc->possible_cpu_arch_ids(machine);
+for (i = 0; i < machine->smp.cpus; i++) {
+Object *cpuobj;
+cpuobj = object_new(machine->cpu_type);
+lacpu = LOONGARCH_CPU(cpuobj);
+
+lacpu->arch_id = machine->possible_cpus->cpus[i].arch_id;
+object_property_set_int(cpuobj, "socket-id",
+
machine->possible_cpus->cpus[i].props.socket_id,
+NULL);
+object_property_set_int(cpuobj, "core-id",
+machine->possible_cpus->cpus[i].props.core_id,
+NULL);
+object_property_set_int(cpuobj, "thread-id",
+
machine->possible_cpus->cpus[i].props.thread_id,
+NULL);
+/*
+ * The CPU in place at the time of machine startup will also enter
+ * the CPU hot-plug process when it is created, but at this time,
+ * the GED device has not been created, resulting in exit in the CPU
+ * hot-plug process, which can avoid the incumbent CPU 

[PATCH v2 04/10] Introduce the CPU address space destruction function

2023-09-11 Thread xianglai li
Introduce new function to destroy CPU address space resources
for cpu hot-(un)plug.

Co-authored-by: "Salil Mehta" 
Cc: "Salil Mehta" 
Cc: Xiaojuan Yang 
Cc: Song Gao 
Cc: "Michael S. Tsirkin" 
Cc: Igor Mammedov 
Cc: Ani Sinha 
Cc: Paolo Bonzini 
Cc: Richard Henderson 
Cc: Eduardo Habkost 
Cc: Marcel Apfelbaum 
Cc: "Philippe Mathieu-Daudé" 
Cc: Yanan Wang 
Cc: "Daniel P. Berrangé" 
Cc: Peter Xu 
Cc: David Hildenbrand 
Cc: Bibo Mao 
Signed-off-by: xianglai li 
---
 include/exec/cpu-common.h |  8 
 include/hw/core/cpu.h |  1 +
 softmmu/physmem.c | 24 
 3 files changed, 33 insertions(+)

diff --git a/include/exec/cpu-common.h b/include/exec/cpu-common.h
index 41788c0bdd..eb56a228a2 100644
--- a/include/exec/cpu-common.h
+++ b/include/exec/cpu-common.h
@@ -120,6 +120,14 @@ size_t qemu_ram_pagesize_largest(void);
  */
 void cpu_address_space_init(CPUState *cpu, int asidx,
 const char *prefix, MemoryRegion *mr);
+/**
+ * cpu_address_space_destroy:
+ * @cpu: CPU for which address space needs to be destroyed
+ * @asidx: integer index of this address space
+ *
+ * Note that with KVM only one address space is supported.
+ */
+void cpu_address_space_destroy(CPUState *cpu, int asidx);
 
 void cpu_physical_memory_rw(hwaddr addr, void *buf,
 hwaddr len, bool is_write);
diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
index 92a4234439..c90cf3a162 100644
--- a/include/hw/core/cpu.h
+++ b/include/hw/core/cpu.h
@@ -366,6 +366,7 @@ struct CPUState {
 QSIMPLEQ_HEAD(, qemu_work_item) work_list;
 
 CPUAddressSpace *cpu_ases;
+int cpu_ases_ref_count;
 int num_ases;
 AddressSpace *as;
 MemoryRegion *memory;
diff --git a/softmmu/physmem.c b/softmmu/physmem.c
index 18277ddd67..c75e3e8042 100644
--- a/softmmu/physmem.c
+++ b/softmmu/physmem.c
@@ -761,6 +761,7 @@ void cpu_address_space_init(CPUState *cpu, int asidx,
 
 if (!cpu->cpu_ases) {
 cpu->cpu_ases = g_new0(CPUAddressSpace, cpu->num_ases);
+cpu->cpu_ases_ref_count = cpu->num_ases;
 }
 
 newas = >cpu_ases[asidx];
@@ -774,6 +775,29 @@ void cpu_address_space_init(CPUState *cpu, int asidx,
 }
 }
 
+void cpu_address_space_destroy(CPUState *cpu, int asidx)
+{
+CPUAddressSpace *cpuas;
+
+assert(asidx < cpu->num_ases);
+assert(asidx == 0 || !kvm_enabled());
+assert(cpu->cpu_ases);
+
+cpuas = >cpu_ases[asidx];
+if (tcg_enabled()) {
+memory_listener_unregister(>tcg_as_listener);
+}
+
+address_space_destroy(cpuas->as);
+
+cpu->cpu_ases_ref_count--;
+if (cpu->cpu_ases_ref_count == 0) {
+g_free(cpu->cpu_ases);
+cpu->cpu_ases = NULL;
+}
+
+}
+
 AddressSpace *cpu_get_address_space(CPUState *cpu, int asidx)
 {
 /* Return the AddressSpace corresponding to the specified index */
-- 
2.39.1




[PATCH v2 10/10] Update the ACPI table for the Loongarch CPU

2023-09-11 Thread xianglai li
Add new types of GED devices for Loongarch machines,
add CPU hot-(un)plug event response and address spaces,
and update the ACPI table.

Cc: "Salil Mehta" 
Cc: Xiaojuan Yang 
Cc: Song Gao 
Cc: "Michael S. Tsirkin" 
Cc: Igor Mammedov 
Cc: Ani Sinha 
Cc: Paolo Bonzini 
Cc: Richard Henderson 
Cc: Eduardo Habkost 
Cc: Marcel Apfelbaum 
Cc: "Philippe Mathieu-Daudé" 
Cc: Yanan Wang 
Cc: "Daniel P. Berrangé" 
Cc: Peter Xu 
Cc: David Hildenbrand 
Cc: Bibo Mao 
Signed-off-by: xianglai li 
---
 hw/acpi/acpi-cpu-hotplug-stub.c |  9 +
 hw/loongarch/acpi-build.c   | 16 +++-
 hw/loongarch/virt.c |  5 +++--
 include/hw/loongarch/virt.h |  1 +
 4 files changed, 28 insertions(+), 3 deletions(-)

diff --git a/hw/acpi/acpi-cpu-hotplug-stub.c b/hw/acpi/acpi-cpu-hotplug-stub.c
index 2aec90d968..af9fda2cf4 100644
--- a/hw/acpi/acpi-cpu-hotplug-stub.c
+++ b/hw/acpi/acpi-cpu-hotplug-stub.c
@@ -19,6 +19,15 @@ void legacy_acpi_cpu_hotplug_init(MemoryRegion *parent, 
Object *owner,
 return;
 }
 
+void build_cpus_aml(Aml *table, MachineState *machine, CPUHotplugFeatures opts,
+hwaddr mmap_io_base,
+const char *res_root,
+const char *event_handler_method,
+AmlRegionSpace rs)
+{
+return;
+}
+
 void acpi_cpu_ospm_status(CPUHotplugState *cpu_st, ACPIOSTInfoList ***list)
 {
 return;
diff --git a/hw/loongarch/acpi-build.c b/hw/loongarch/acpi-build.c
index 66fad295cc..312908fb2f 100644
--- a/hw/loongarch/acpi-build.c
+++ b/hw/loongarch/acpi-build.c
@@ -138,15 +138,18 @@ build_madt(GArray *table_data, BIOSLinker *linker, 
LoongArchMachineState *lams)
 build_append_int_noprefix(table_data, 1 /* PCAT_COMPAT */, 4); /* Flags */
 
 for (i = 0; i < arch_ids->len; i++) {
+uint32_t flags;
+
 /* Processor Core Interrupt Controller Structure */
 arch_id = arch_ids->cpus[i].arch_id;
+flags = arch_ids->cpus[i].cpu ? 1 : 0;
 
 build_append_int_noprefix(table_data, 17, 1);/* Type */
 build_append_int_noprefix(table_data, 15, 1);/* Length */
 build_append_int_noprefix(table_data, 1, 1); /* Version */
 build_append_int_noprefix(table_data, i, 4); /* ACPI Processor ID 
*/
 build_append_int_noprefix(table_data, arch_id, 4); /* Core ID */
-build_append_int_noprefix(table_data, 1, 4); /* Flags */
+build_append_int_noprefix(table_data, flags, 4);   /* Flags */
 }
 
 /* Extend I/O Interrupt Controller Structure */
@@ -309,6 +312,17 @@ build_la_ged_aml(Aml *dsdt, MachineState *machine)
  AML_SYSTEM_MEMORY,
  VIRT_GED_MEM_ADDR);
 }
+
+if (event & ACPI_GED_CPU_HOTPLUG_EVT) {
+CPUHotplugFeatures opts = {
+.acpi_1_compatible = false,
+.has_legacy_cphp = false
+};
+
+build_cpus_aml(dsdt, machine, opts, VIRT_GED_CPUHP_ADDR,
+   "\\_SB", "\\_GPE._E01", AML_SYSTEM_MEMORY);
+
+}
 acpi_dsdt_add_power_button(dsdt);
 }
 
diff --git a/hw/loongarch/virt.c b/hw/loongarch/virt.c
index aabe8aa814..c4ec9dd6a7 100644
--- a/hw/loongarch/virt.c
+++ b/hw/loongarch/virt.c
@@ -449,12 +449,12 @@ static DeviceState *create_acpi_ged(DeviceState *pch_pic, 
LoongArchMachineState
 {
 DeviceState *dev;
 MachineState *ms = MACHINE(lams);
-uint32_t event = ACPI_GED_PWR_DOWN_EVT;
+uint32_t event = ACPI_GED_PWR_DOWN_EVT | ACPI_GED_CPU_HOTPLUG_EVT;
 
 if (ms->ram_slots) {
 event |= ACPI_GED_MEM_HOTPLUG_EVT;
 }
-dev = qdev_new(TYPE_ACPI_GED);
+dev = qdev_new(TYPE_ACPI_GED_LOONGARCH);
 qdev_prop_set_uint32(dev, "ged-event", event);
 
 /* ged event */
@@ -463,6 +463,7 @@ static DeviceState *create_acpi_ged(DeviceState *pch_pic, 
LoongArchMachineState
 sysbus_mmio_map(SYS_BUS_DEVICE(dev), 1, VIRT_GED_MEM_ADDR);
 /* ged regs used for reset and power down */
 sysbus_mmio_map(SYS_BUS_DEVICE(dev), 2, VIRT_GED_REG_ADDR);
+sysbus_mmio_map(SYS_BUS_DEVICE(dev), 3, VIRT_GED_CPUHP_ADDR);
 
 sysbus_connect_irq(SYS_BUS_DEVICE(dev), 0,
qdev_get_gpio_in(pch_pic, VIRT_SCI_IRQ - 
VIRT_GSI_BASE));
diff --git a/include/hw/loongarch/virt.h b/include/hw/loongarch/virt.h
index f6c9495af2..82ac238f4f 100644
--- a/include/hw/loongarch/virt.h
+++ b/include/hw/loongarch/virt.h
@@ -31,6 +31,7 @@
 #define VIRT_GED_EVT_ADDR   0x100e
 #define VIRT_GED_MEM_ADDR   (VIRT_GED_EVT_ADDR + ACPI_GED_EVT_SEL_LEN)
 #define VIRT_GED_REG_ADDR   (VIRT_GED_MEM_ADDR + MEMORY_HOTPLUG_IO_LEN)
+#define VIRT_GED_CPUHP_ADDR (VIRT_GED_REG_ADDR + ACPI_CPU_HOTPLUG_REG_LEN)
 
 struct LoongArchMachineState {
 /*< private >*/
-- 
2.39.1




[PATCH v2 00/10] Adds CPU hot-plug support to Loongarch

2023-09-11 Thread xianglai li
Hello everyone, We refer to the implementation of ARM CPU
Hot-Plug to add GED-based CPU Hot-Plug support to Loongarch.

The first 4 patches are changes to the QEMU common code,
including adding GED support for CPU Hot-Plug, updating
the ACPI table creation process, and adding qdev_disconnect_gpio_out_named
and cpu_address_space_destroy interfaces to release resources
when CPU un-plug.

For the modification of the public part of the code, we refer to the
arm-related patch, and the link address of the corresponding patch is
as follows:
https://lore.kernel.org/all/20200613213629.21984-1-salil.me...@huawei.com/

In order to respect the work of "Salil Mehta", we will rebase the first
4 patches in the final patch, which are referenced here to ensure that
the loongarch cpu hotplug can work properly.

The last 6 patches are Loongarch architecture-related,
and the modifications include the definition of the hook
function related to the CPU Hot-(UN)Plug, the allocation
and release of CPU resources when CPU Hot-(UN)Plug,
the creation process of updating the ACPI table,
and finally the custom switch for the CPU Hot-Plug.

V2:
- Fix formatting and spelling errors
- Split large patches into smaller patches
  - Split the original patch
<> into
<>
<>
<>.
  - Split the original patch
<> into
<>
<>
- Added loongarch cpu topology calculation method.
- Change the position of the cpu topology patch.
- Change unreasonable variable and function names.



xianglai li (10):
  Update ACPI GED framework to support vcpu hot-(un)plug
  Update CPUs AML with cpu-(ctrl)dev change
  make qdev_disconnect_gpio_out_named() public
  Introduce the CPU address space destruction function
  Added CPU topology support for Loongarch
  Optimize loongarch_irq_init function implementation
  Add basic CPU hot-(un)plug support for Loongarch
  Add support of *unrealize* for Loongarch cpu
  Add generic event device for Loongarch
  Update the ACPI table for the Loongarch CPU

 .../devices/loongarch64-softmmu/default.mak   |   1 +
 docs/system/loongarch/virt.rst|  31 ++
 hw/acpi/acpi-cpu-hotplug-stub.c   |  15 +
 hw/acpi/cpu.c |  27 +-
 hw/acpi/generic_event_device.c|  33 ++
 hw/core/gpio.c|   4 +-
 hw/i386/acpi-build.c  |   2 +-
 hw/loongarch/acpi-build.c |  33 +-
 hw/loongarch/generic_event_device_loongarch.c |  36 ++
 hw/loongarch/meson.build  |   2 +-
 hw/loongarch/virt.c   | 424 +++---
 include/exec/cpu-common.h |   8 +
 include/hw/acpi/cpu.h |   5 +-
 include/hw/acpi/cpu_hotplug.h |  10 +
 include/hw/acpi/generic_event_device.h|   6 +
 include/hw/core/cpu.h |   1 +
 include/hw/loongarch/virt.h   |  10 +-
 include/hw/qdev-core.h|   2 +
 softmmu/physmem.c |  24 +
 target/loongarch/cpu.c|  35 +-
 target/loongarch/cpu.h|  13 +-
 21 files changed, 635 insertions(+), 87 deletions(-)
 create mode 100644 hw/loongarch/generic_event_device_loongarch.c

Cc: "Salil Mehta" 
Cc: Xiaojuan Yang 
Cc: Song Gao 
Cc: "Michael S. Tsirkin" 
Cc: Igor Mammedov 
Cc: Ani Sinha 
Cc: Paolo Bonzini 
Cc: Richard Henderson 
Cc: Eduardo Habkost 
Cc: Marcel Apfelbaum 
Cc: "Philippe Mathieu-Daudé" 
Cc: Yanan Wang 
Cc: "Daniel P. Berrangé" 
Cc: Peter Xu 
Cc: David Hildenbrand 
Cc: Bibo Mao 
Cc: Xianglai li 
--
2.39.1




[PATCH v2 03/10] make qdev_disconnect_gpio_out_named() public

2023-09-11 Thread xianglai li
It will be reused in loongarch/virt.c for unwiring
the vcpu<->exioi interrupts for the vcpu hot-(un)plug
cases.

Co-authored-by: "Salil Mehta" 
Cc: "Salil Mehta" 
Cc: Xiaojuan Yang 
Cc: Song Gao 
Cc: "Michael S. Tsirkin" 
Cc: Igor Mammedov 
Cc: Ani Sinha 
Cc: Paolo Bonzini 
Cc: Richard Henderson 
Cc: Eduardo Habkost 
Cc: Marcel Apfelbaum 
Cc: "Philippe Mathieu-Daudé" 
Cc: Yanan Wang 
Cc: "Daniel P. Berrangé" 
Cc: Peter Xu 
Cc: David Hildenbrand 
Cc: Bibo Mao 
Signed-off-by: xianglai li 
---
 hw/core/gpio.c | 4 ++--
 include/hw/qdev-core.h | 2 ++
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/hw/core/gpio.c b/hw/core/gpio.c
index 80d07a6ec9..4fc6409545 100644
--- a/hw/core/gpio.c
+++ b/hw/core/gpio.c
@@ -143,8 +143,8 @@ qemu_irq qdev_get_gpio_out_connector(DeviceState *dev, 
const char *name, int n)
 
 /* disconnect a GPIO output, returning the disconnected input (if any) */
 
-static qemu_irq qdev_disconnect_gpio_out_named(DeviceState *dev,
-   const char *name, int n)
+qemu_irq qdev_disconnect_gpio_out_named(DeviceState *dev,
+const char *name, int n)
 {
 char *propname = g_strdup_printf("%s[%d]",
  name ? name : "unnamed-gpio-out", n);
diff --git a/include/hw/qdev-core.h b/include/hw/qdev-core.h
index 151d968238..32bb54163e 100644
--- a/include/hw/qdev-core.h
+++ b/include/hw/qdev-core.h
@@ -739,6 +739,8 @@ qemu_irq qdev_get_gpio_out_connector(DeviceState *dev, 
const char *name, int n);
  */
 qemu_irq qdev_intercept_gpio_out(DeviceState *dev, qemu_irq icpt,
  const char *name, int n);
+qemu_irq qdev_disconnect_gpio_out_named(DeviceState *dev,
+const char *name, int n);
 
 BusState *qdev_get_child_bus(DeviceState *dev, const char *name);
 
-- 
2.39.1




[PATCH v2 07/10] Add basic CPU hot-(un)plug support for Loongarch

2023-09-11 Thread xianglai li
Add CPU hot-(un)plug related hook functions and
turn on the CPU hot-(un)plug custom switch.

Cc: "Salil Mehta" 
Cc: Xiaojuan Yang 
Cc: Song Gao 
Cc: "Michael S. Tsirkin" 
Cc: Igor Mammedov 
Cc: Ani Sinha 
Cc: Paolo Bonzini 
Cc: Richard Henderson 
Cc: Eduardo Habkost 
Cc: Marcel Apfelbaum 
Cc: "Philippe Mathieu-Daudé" 
Cc: Yanan Wang 
Cc: "Daniel P. Berrangé" 
Cc: Peter Xu 
Cc: David Hildenbrand 
Cc: Bibo Mao 
Signed-off-by: xianglai li 
---
 .../devices/loongarch64-softmmu/default.mak   |   1 +
 hw/loongarch/virt.c   | 223 ++
 2 files changed, 224 insertions(+)

diff --git a/configs/devices/loongarch64-softmmu/default.mak 
b/configs/devices/loongarch64-softmmu/default.mak
index 928bc117ef..e596706fab 100644
--- a/configs/devices/loongarch64-softmmu/default.mak
+++ b/configs/devices/loongarch64-softmmu/default.mak
@@ -1,3 +1,4 @@
 # Default configuration for loongarch64-softmmu
 
 CONFIG_LOONGARCH_VIRT=y
+CONFIG_ACPI_CPU_HOTPLUG=y
diff --git a/hw/loongarch/virt.c b/hw/loongarch/virt.c
index fb06b4ab4e..aabe8aa814 100644
--- a/hw/loongarch/virt.c
+++ b/hw/loongarch/virt.c
@@ -999,11 +999,93 @@ static void virt_get_cpu_topo_by_cpu_index(const 
MachineState *ms,
 cpu_topo->thread_id = cpu_index % ms->smp.threads;
 }
 
+/* find cpu slot in machine->possible_cpus by arch_id */
+static CPUArchId *loongarch_find_cpu_slot(MachineState *ms, int arch_id)
+{
+int n;
+for (n = 0; n < ms->possible_cpus->len; n++) {
+if (ms->possible_cpus->cpus[n].arch_id == arch_id) {
+return >possible_cpus->cpus[n];
+}
+}
+
+return NULL;
+}
+
+static void loongarch_cpu_pre_plug(HotplugHandler *hotplug_dev,
+DeviceState *dev, Error **errp)
+{
+MachineState *ms = MACHINE(OBJECT(hotplug_dev));
+MachineClass *mc = MACHINE_GET_CLASS(hotplug_dev);
+LoongArchCPU *cpu = LOONGARCH_CPU(dev);
+CPUState *cs = CPU(dev);
+CPUArchId *cpu_slot;
+Error *local_err = NULL;
+LoongArchCPUTopo cpu_topo;
+int arch_id;
+
+if (dev->hotplugged && !mc->has_hotpluggable_cpus) {
+error_setg(_err, "CPU hotplug not supported for this machine");
+goto out;
+}
+
+/* sanity check the cpu */
+if (!object_dynamic_cast(OBJECT(cpu), ms->cpu_type)) {
+error_setg(_err, "Invalid CPU type, expected cpu type: '%s'",
+   ms->cpu_type);
+goto out;
+}
+
+if ((cpu->thread_id < 0) || (cpu->thread_id >= ms->smp.threads)) {
+error_setg(_err,
+   "Invalid thread-id %u specified, must be in range 1:%u",
+   cpu->thread_id, ms->smp.threads - 1);
+goto out;
+}
+
+if ((cpu->core_id < 0) || (cpu->core_id >= ms->smp.cores)) {
+error_setg(_err,
+   "Invalid core-id %u specified, must be in range 1:%u",
+   cpu->core_id, ms->smp.cores);
+goto out;
+}
+
+if ((cpu->socket_id < 0) || (cpu->socket_id >= ms->smp.sockets)) {
+error_setg(_err,
+   "Invalid socket-id %u specified, must be in range 1:%u",
+   cpu->socket_id, ms->smp.sockets - 1);
+goto out;
+}
+
+cpu_topo.socket_id = cpu->socket_id;
+cpu_topo.core_id = cpu->core_id;
+cpu_topo.thread_id = cpu->thread_id;
+arch_id = virt_get_arch_id_from_cpu_topo(ms, _topo);
+
+cpu_slot = loongarch_find_cpu_slot(ms, arch_id);
+if (CPU(cpu_slot->cpu)) {
+error_setg(_err,
+   "cpu(id%d=%d:%d:%d) with arch-id %" PRIu64 " exists",
+   cs->cpu_index, cpu->socket_id, cpu->core_id,
+   cpu->thread_id, cpu_slot->arch_id);
+goto out;
+}
+cpu->arch_id = arch_id;
+
+numa_cpu_pre_plug(cpu_slot, dev, _err);
+
+return ;
+out:
+error_propagate(errp, local_err);
+}
+
 static void virt_machine_device_pre_plug(HotplugHandler *hotplug_dev,
 DeviceState *dev, Error **errp)
 {
 if (memhp_type_supported(dev)) {
 virt_mem_pre_plug(hotplug_dev, dev, errp);
+} else if (object_dynamic_cast(OBJECT(dev), TYPE_LOONGARCH_CPU)) {
+loongarch_cpu_pre_plug(hotplug_dev, dev, errp);
 }
 }
 
@@ -1017,11 +1099,45 @@ static void virt_mem_unplug_request(HotplugHandler 
*hotplug_dev,
errp);
 }
 
+static void loongarch_cpu_unplug_request(HotplugHandler *hotplug_dev,
+DeviceState *dev, Error **errp)
+{
+MachineState *machine = MACHINE(OBJECT(hotplug_dev));
+LoongArchMachineState *lsms = LOONGARCH_MACHINE(machine);
+Error *local_err = NULL;
+HotplugHandlerClass *hhc;
+LoongArchCPU *cpu = LOONGARCH_CPU(dev);
+CPUState *cs = CPU(dev);
+
+if (!lsms->acpi_ged) {
+error_setg(_err, "CPU hot unplug not supported without ACPI");
+goto out;
+}
+
+if (cs->cpu_index == 0) {
+error_setg(_err,
+   

[PATCH v2 01/10] Update ACPI GED framework to support vcpu hot-(un)plug

2023-09-11 Thread xianglai li
ACPI GED shall be used to convey to the guest kernel about any cpu hot-(un)plug
events. Therefore, existing ACPI GED framework inside QEMU needs to be enhanced
to support CPU hot-(un)plug state and events.

Co-authored-by: "Salil Mehta" 
Cc: "Salil Mehta" 
Cc: Xiaojuan Yang 
Cc: Song Gao 
Cc: "Michael S. Tsirkin" 
Cc: Igor Mammedov 
Cc: Ani Sinha 
Cc: Paolo Bonzini 
Cc: Richard Henderson 
Cc: Eduardo Habkost 
Cc: Marcel Apfelbaum 
Cc: "Philippe Mathieu-Daudé" 
Cc: Yanan Wang 
Cc: "Daniel P. Berrangé" 
Cc: Peter Xu 
Cc: David Hildenbrand 
Cc: Bibo Mao 
Signed-off-by: xianglai li 
---
 hw/acpi/acpi-cpu-hotplug-stub.c|  6 +
 hw/acpi/cpu.c  |  7 --
 hw/acpi/generic_event_device.c | 33 ++
 include/hw/acpi/cpu_hotplug.h  | 10 
 include/hw/acpi/generic_event_device.h |  5 
 5 files changed, 54 insertions(+), 7 deletions(-)

diff --git a/hw/acpi/acpi-cpu-hotplug-stub.c b/hw/acpi/acpi-cpu-hotplug-stub.c
index 3fc4b14c26..2aec90d968 100644
--- a/hw/acpi/acpi-cpu-hotplug-stub.c
+++ b/hw/acpi/acpi-cpu-hotplug-stub.c
@@ -24,6 +24,12 @@ void acpi_cpu_ospm_status(CPUHotplugState *cpu_st, 
ACPIOSTInfoList ***list)
 return;
 }
 
+void cpu_hotplug_hw_init(MemoryRegion *as, Object *owner,
+ CPUHotplugState *state, hwaddr base_addr)
+{
+return;
+}
+
 void acpi_cpu_plug_cb(HotplugHandler *hotplug_dev,
   CPUHotplugState *cpu_st, DeviceState *dev, Error **errp)
 {
diff --git a/hw/acpi/cpu.c b/hw/acpi/cpu.c
index 19c154d78f..6897c8789a 100644
--- a/hw/acpi/cpu.c
+++ b/hw/acpi/cpu.c
@@ -6,13 +6,6 @@
 #include "trace.h"
 #include "sysemu/numa.h"
 
-#define ACPI_CPU_HOTPLUG_REG_LEN 12
-#define ACPI_CPU_SELECTOR_OFFSET_WR 0
-#define ACPI_CPU_FLAGS_OFFSET_RW 4
-#define ACPI_CPU_CMD_OFFSET_WR 5
-#define ACPI_CPU_CMD_DATA_OFFSET_RW 8
-#define ACPI_CPU_CMD_DATA2_OFFSET_R 0
-
 #define OVMF_CPUHP_SMI_CMD 4
 
 enum {
diff --git a/hw/acpi/generic_event_device.c b/hw/acpi/generic_event_device.c
index a3d31631fe..c5a70957b4 100644
--- a/hw/acpi/generic_event_device.c
+++ b/hw/acpi/generic_event_device.c
@@ -12,6 +12,7 @@
 #include "qemu/osdep.h"
 #include "qapi/error.h"
 #include "hw/acpi/acpi.h"
+#include "hw/acpi/cpu.h"
 #include "hw/acpi/generic_event_device.h"
 #include "hw/irq.h"
 #include "hw/mem/pc-dimm.h"
@@ -25,6 +26,7 @@ static const uint32_t ged_supported_events[] = {
 ACPI_GED_MEM_HOTPLUG_EVT,
 ACPI_GED_PWR_DOWN_EVT,
 ACPI_GED_NVDIMM_HOTPLUG_EVT,
+ACPI_GED_CPU_HOTPLUG_EVT,
 };
 
 /*
@@ -117,6 +119,10 @@ void build_ged_aml(Aml *table, const char *name, 
HotplugHandler *hotplug_dev,
aml_notify(aml_name("\\_SB.NVDR"),
   aml_int(0x80)));
 break;
+case ACPI_GED_CPU_HOTPLUG_EVT:
+aml_append(if_ctx, aml_call0(ACPI_CPU_CONTAINER "."
+ ACPI_CPU_SCAN_METHOD));
+break;
 default:
 /*
  * Please make sure all the events in ged_supported_events[]
@@ -234,6 +240,8 @@ static void acpi_ged_device_plug_cb(HotplugHandler 
*hotplug_dev,
 } else {
 acpi_memory_plug_cb(hotplug_dev, >memhp_state, dev, errp);
 }
+} else if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
+acpi_cpu_plug_cb(hotplug_dev, >cpuhp_state, dev, errp);
 } else {
 error_setg(errp, "virt: device plug request for unsupported device"
" type: %s", object_get_typename(OBJECT(dev)));
@@ -248,6 +256,8 @@ static void acpi_ged_unplug_request_cb(HotplugHandler 
*hotplug_dev,
 if ((object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM) &&
!(object_dynamic_cast(OBJECT(dev), TYPE_NVDIMM {
 acpi_memory_unplug_request_cb(hotplug_dev, >memhp_state, dev, errp);
+} else if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
+acpi_cpu_unplug_request_cb(hotplug_dev, >cpuhp_state, dev, errp);
 } else {
 error_setg(errp, "acpi: device unplug request for unsupported device"
" type: %s", object_get_typename(OBJECT(dev)));
@@ -261,6 +271,8 @@ static void acpi_ged_unplug_cb(HotplugHandler *hotplug_dev,
 
 if (object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
 acpi_memory_unplug_cb(>memhp_state, dev, errp);
+} else if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
+acpi_cpu_unplug_cb(>cpuhp_state, dev, errp);
 } else {
 error_setg(errp, "acpi: device unplug for unsupported device"
" type: %s", object_get_typename(OBJECT(dev)));
@@ -272,6 +284,7 @@ static void acpi_ged_ospm_status(AcpiDeviceIf *adev, 
ACPIOSTInfoList ***list)
 AcpiGedState *s = ACPI_GED(adev);
 
 acpi_memory_ospm_status(>memhp_state, list);
+acpi_cpu_ospm_status(>cpuhp_state, list);
 }
 
 static void acpi_ged_send_event(AcpiDeviceIf *adev, AcpiEventStatusBits ev)
@@ 

Re: [PATCH 0/7] target/arm: hwcaps updates, FEAT_HBC

2023-09-11 Thread Richard Henderson

On 9/11/23 06:53, Peter Maydell wrote:

Peter Maydell (6):
   linux-user/elfload.c: Correct SME feature names reported in cpuinfo
   linux-user/elfload.c: Add missing arm and arm64 hwcap values
   linux-user/elfload.c: Report previously missing arm32 hwcaps
   target/arm: Update AArch64 ID register field definitions
   target/arm: Update user-mode ID reg mask values
   target/arm: Implement FEAT_HBC


Reviewed-by: Richard Henderson 

r~



Re: [PATCH 7/9] migration: Remember num of ramblocks to sync during recovery

2023-09-11 Thread Fabiano Rosas
Peter Xu  writes:

> Instead of only relying on the count of rp_sem, make the counter be part of
> RAMState so it can be used in both threads to synchronize on the process.
>
> rp_sem will be further reused as a way to kick the main thread, e.g., on
> recovery failures.
>
> Signed-off-by: Peter Xu 

Reviewed-by: Fabiano Rosas 



Re: [PATCH 8/9] migration: Add migration_rp_wait|kick()

2023-09-11 Thread Fabiano Rosas
Peter Xu  writes:

> It's just a simple wrapper for rp_sem on either wait() or kick(), make it
> even clearer on how it is used.  Prepared to be used even for other things.
>
> Signed-off-by: Peter Xu 
> ---
>  migration/migration.h | 15 +++
>  migration/migration.c |  4 ++--
>  migration/ram.c   | 16 +++-
>  3 files changed, 24 insertions(+), 11 deletions(-)
>
> diff --git a/migration/migration.h b/migration/migration.h
> index a5c95e4d43..b6de78dbdd 100644
> --- a/migration/migration.h
> +++ b/migration/migration.h
> @@ -304,6 +304,12 @@ struct MigrationState {
>   * be cleared in the rp_thread!
>   */
>  bool  rp_thread_created;
> +/*
> + * Used to synchonize between migration main thread and return path

synchronize

Reviewed-by: Fabiano Rosas 



Re: [PATCH 9/9] migration/postcopy: Allow network to fail even during recovery

2023-09-11 Thread Fabiano Rosas
Peter Xu  writes:

Hi, sorry it took me so long to get to this.

> Normally the postcopy recover phase should only exist for a super short
> period, that's the duration when QEMU is trying to recover from an
> interrupted postcopy migration, during which handshake will be carried out
> for continuing the procedure with state changes from PAUSED -> RECOVER ->
> POSTCOPY_ACTIVE again.
>
> Here RECOVER phase should be super small, that happens right after the
> admin specified a new but working network link for QEMU to reconnect to
> dest QEMU.
>
> However there can still be case where the channel is broken in this small
> RECOVER window.
>
> If it happens, with current code there's no way the src QEMU can got kicked
> out of RECOVER stage. No way either to retry the recover in another channel
> when established.
>
> This patch allows the RECOVER phase to fail itself too - we're mostly
> ready, just some small things missing, e.g. properly kick the main
> migration thread out when sleeping on rp_sem when we found that we're at
> RECOVER stage.  When this happens, it fails the RECOVER itself, and
> rollback to PAUSED stage.  Then the user can retry another round of
> recovery.
>
> To make it even stronger, teach QMP command migrate-pause to explicitly
> kick src/dst QEMU out when needed, so even if for some reason the migration
> thread didn't got kicked out already by a failing rethrn-path thread, the
> admin can also kick it out.
>
> This will be an super, super corner case, but still try to cover that.

It would be nice to have a test for this. Being such a corner case, it
will be hard to keep this scenario working.

I wrote two tests[1] that do the recovery each using a different URI:
1) fd: using a freshly opened file,
2) fd: using a socketpair that simply has nothing on the other end.

These might be too far from the original bug, but it seems to exercise
some of the same paths:

Scenario 1:
/x86_64/migration/postcopy/recovery/fail-twice

the stacks are:

Thread 8 (Thread 0x7fffd5ffe700 (LWP 30282) "live_migration"):
 qemu_sem_wait
 ram_dirty_bitmap_sync_all
 ram_resume_prepare
 qemu_savevm_state_resume_prepare
 postcopy_do_resume
 postcopy_pause
 migration_detect_error
 migration_thread

Thread 7 (Thread 0x7fffd67ff700 (LWP 30281) "return path"):
 qemu_sem_wait
 postcopy_pause_return_path_thread
 source_return_path_thread

This patch seems to fix it, although we cannot call qmp_migrate_recover
a second time because the mis state is now in RECOVER:

  "Migrate recover can only be run when postcopy is paused."

Do we maybe need to return the state to PAUSED, or allow
qmp_migrate_recover to run in RECOVER, like you did on the src side?


Scenario 2:
/x86_64/migration/postcopy/recovery/fail-twice/rp

Thread 8 (Thread 0x7fffd5ffe700 (LWP 30456) "live_migration"):
 qemu_sem_wait
 ram_dirty_bitmap_sync_all
 ram_resume_prepare
 qemu_savevm_state_resume_prepare
 postcopy_do_resume
 postcopy_pause
 migration_detect_error
 migration_thread

Thread 7 (Thread 0x7fffd67ff700 (LWP 30455) "return path"):
 recvmsg
 qio_channel_socket_readv
 qio_channel_readv_full
 qio_channel_read
 qemu_fill_buffer
 qemu_peek_byte
 qemu_get_byte
 qemu_get_be16
 source_return_path_thread

Here, with this patch the migration gets stuck unless we call
migrate_pause() one more time. After another round of migrate_pause +
recover, it finishes properly.


1- hacked-together test:
-->8--
>From a34685c8795799350665a880cd2bf53c5812 Mon Sep 17 00:00:00 2001
From: Fabiano Rosas 
Date: Mon, 11 Sep 2023 20:45:33 -0300
Subject: [PATCH] test patch

---
 tests/qtest/migration-test.c | 87 
 1 file changed, 87 insertions(+)

diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
index 1b43df5ca7..4d9d2209c1 100644
--- a/tests/qtest/migration-test.c
+++ b/tests/qtest/migration-test.c
@@ -695,6 +695,7 @@ typedef struct {
 /* Postcopy specific fields */
 void *postcopy_data;
 bool postcopy_preempt;
+int postcopy_recovery_method;
 } MigrateCommon;
 
 static int test_migrate_start(QTestState **from, QTestState **to,
@@ -1357,6 +1358,61 @@ static void test_postcopy_preempt_tls_psk(void)
 }
 #endif
 
+static void postcopy_recover_fail(QTestState *from, QTestState *to, int method)
+{
+int src, dst;
+
+if (method == 1) {
+/* give it some random fd to recover */
+g_autofree char *uri = g_strdup_printf("%s/noop", tmpfs);
+src = dst = open(uri, O_CREAT|O_RDWR);
+} else if (method == 2) {
+int ret, pair1[2], pair2[2];
+
+/* create two unrelated socketpairs */
+ret = qemu_socketpair(PF_LOCAL, SOCK_STREAM, 0, pair1);
+g_assert_cmpint(ret, ==, 0);
+
+ret = qemu_socketpair(PF_LOCAL, SOCK_STREAM, 0, pair2);
+g_assert_cmpint(ret, ==, 0);
+
+/* give the guests unpaired ends of the sockets */
+src = pair1[0];
+dst = pair2[0];
+}
+
+qtest_qmp_fds_assert_success(to, , 1,
+  

Re: [PATCH v3 23/23] bsd-user: Add stubs for vadvise(), sbrk() and sstk()

2023-09-11 Thread Richard Henderson

On 9/9/23 12:37, Karim Taha wrote:

From: Warner Losh

The above system calls are not supported by qemu.

Signed-off-by: Warner Losh
Signed-off-by: Karim Taha
---
  bsd-user/bsd-mem.h| 18 ++
  bsd-user/freebsd/os-syscall.c | 12 
  2 files changed, 30 insertions(+)


Reviewed-by: Richard Henderson 

r~



Re: [PATCH v2] tpm: fix crash when FD >= 1024

2023-09-11 Thread Stefan Berger



On 9/11/23 09:25, marcandre.lur...@redhat.com wrote:

From: Marc-Andr޸ Lureau 

Replace select() with poll() to fix a crash when QEMU has a large number
of FDs.

Fixes:
https://bugzilla.redhat.com/show_bug.cgi?id=2020133


For backporting I think we should also add this tag here:

Fixes:  ca64b08638 ("tpm: Move backend code under the 'backends/' 
directory")


Though RETRY_ON_EINTR was only introduced in 8.0.0-rc0. What's the right 
tag for backporting then?


   Stefan



Signed-off-by: Marc-Andr޸ Lureau 
Reviewed-by: Michael Tokarev 
Reviewed-by: Stefan Berger 
---
  backends/tpm/tpm_util.c | 11 ++-
  1 file changed, 2 insertions(+), 9 deletions(-)

diff --git a/backends/tpm/tpm_util.c b/backends/tpm/tpm_util.c
index a6e6d3e72f..1856589c3b 100644
--- a/backends/tpm/tpm_util.c
+++ b/backends/tpm/tpm_util.c
@@ -112,12 +112,8 @@ static int tpm_util_request(int fd,
  void *response,
  size_t responselen)
  {
-fd_set readfds;
+GPollFD fds[1] = { {.fd = fd, .events = G_IO_IN } };
  int n;
-struct timeval tv = {
-.tv_sec = 1,
-.tv_usec = 0,
-};

  n = write(fd, request, requestlen);
  if (n < 0) {
@@ -127,11 +123,8 @@ static int tpm_util_request(int fd,
  return -EFAULT;
  }

-FD_ZERO();
-FD_SET(fd, );
-
  /* wait for a second */
-n = select(fd + 1, , NULL, NULL, );
+n = RETRY_ON_EINTR(g_poll(fds, 1, 1000));
  if (n != 1) {
  return -errno;
  }




Re: [PATCH v3 22/23] bsd-user: Implement shmat(2) and shmdt(2)

2023-09-11 Thread Richard Henderson

On 9/9/23 12:37, Karim Taha wrote:

+static inline abi_long do_bsd_shmdt(abi_ulong shmaddr)
+{
+int i;
+
+for (i = 0; i < N_BSD_SHM_REGIONS; ++i) {
+if (bsd_shm_regions[i].start == shmaddr) {
+bsd_shm_regions[i].start = 0;
+page_set_flags(shmaddr,
+shmaddr + bsd_shm_regions[i].size, 0);
+break;
+}
+}
+
+return get_errno(shmdt(g2h_untagged(shmaddr)));
+}


On success, this needs to mmap_reserve the region for reserved_va.


r~



Re: [PATCH v3 19/23] bsd-user: Implement shm_open(2)

2023-09-11 Thread Richard Henderson

On 9/9/23 12:37, Karim Taha wrote:

From: Stacey Son

Co-authored-by: Kyle Evans

Signed-off-by: Stacey Son
Signed-off-by: Kyle Evans
Signed-off-by: Karim Taha
---
  bsd-user/bsd-mem.h| 25 +
  bsd-user/freebsd/os-syscall.c |  4 
  2 files changed, 29 insertions(+)


Reviewed-by: Richard Henderson 

r~



Re: [PATCH v3 18/23] bsd-user: Implement do_obreak function

2023-09-11 Thread Richard Henderson

On 9/9/23 12:36, Karim Taha wrote:

From: Stacey Son

Match linux-user, by manually applying the following commits, in order:

d28b3c90cfad1a7e211ae2bce36ecb9071086129   linux-user: Make sure initial brk(0) 
is page-aligned
15ad98536ad9410fb32ddf1ff09389b677643faa   linux-user: Fix qemu brk() to not 
zero bytes on current page
dfe49864afb06e7e452a4366051697bc4fcfc1a5   linux-user: Prohibit brk() to to 
shrink below initial heap address
eac78a4b0b7da4de2c0a297f4d528ca9cc6256a3   linux-user: Fix signed math overflow 
in brk() syscall
c6cc059eca18d9f6e4e26bb8b6d1135ddb35d81a   linux-user: Do not call get_errno() 
in do_brk()
e69e032d1a8ee8d754ca119009a3c2c997f8bb30   linux-user: Use MAP_FIXED_NOREPLACE 
for do_brk()
cb9d5d1fda0bc2312fc0c779b4ea1d7bf826f31f   linux-user: Do nothing if too small 
brk is specified
2aea137a425a87b930a33590177b04368fd7cc12   linux-user: Do not align brk with 
host page size

Signed-off-by: Stacey Son
Signed-off-by: Karim Taha
---
  bsd-user/bsd-mem.h| 45 +++
  bsd-user/freebsd/os-syscall.c |  7 ++
  2 files changed, 52 insertions(+)


Reviewed-by: Richard Henderson 

r~



Re: [PATCH v3 17/23] bsd-user: Implement mincore(2)

2023-09-11 Thread Richard Henderson

On 9/9/23 12:36, Karim Taha wrote:

From: Stacey Son 

Signed-off-by: Stacey Son 
Signed-off-by: Karim Taha 
---
  bsd-user/bsd-mem.h| 22 ++
  bsd-user/freebsd/os-syscall.c |  4 
  2 files changed, 26 insertions(+)

diff --git a/bsd-user/bsd-mem.h b/bsd-user/bsd-mem.h
index 0e16051418..1dabbe36e6 100644
--- a/bsd-user/bsd-mem.h
+++ b/bsd-user/bsd-mem.h
@@ -189,4 +189,26 @@ static inline abi_long do_bsd_minherit(abi_long addr, 
abi_long len,
  return get_errno(minherit(g2h_untagged(addr), len, inherit));
  }
  
+/* mincore(2) */

+static inline abi_long do_bsd_mincore(abi_ulong target_addr, abi_ulong len,
+abi_ulong target_vec)
+{
+abi_long ret;
+void *p;
+abi_ulong vec_len = DIV_ROUND_UP(len,TARGET_PAGE_SIZE);
+
+if (!guest_range_valid_untagged(target_addr,len) || 
!page_check_range(target_addr, len, PAGE_VALID)) {
+return -TARGET_EFAULT;
+}
+
+p = lock_user(VERIFY_WRITE, target_vec, vec_len, 0);
+if (p == NULL) {
+return -TARGET_EFAULT;
+}
+ret = get_errno(mincore(g2h_untagged(target_addr), len, p));
+unlock_user(p, target_vec, 0);


You don't need the lock/unlock_user at all.  It is wrongly checking for WRITE.


r~


+
+return ret;
+}
+
  #endif /* BSD_USER_BSD_MEM_H */
diff --git a/bsd-user/freebsd/os-syscall.c b/bsd-user/freebsd/os-syscall.c
index f5d60cf902..8d1cf3b35c 100644
--- a/bsd-user/freebsd/os-syscall.c
+++ b/bsd-user/freebsd/os-syscall.c
@@ -527,6 +527,10 @@ static abi_long freebsd_syscall(void *cpu_env, int num, 
abi_long arg1,
  ret = do_bsd_minherit(arg1, arg2, arg3);
  break;
  
+case TARGET_FREEBSD_NR_mincore: /* mincore(2) */

+ret = do_bsd_mincore(arg1, arg2, arg3);
+break;
+
  #if defined(__FreeBSD_version) && __FreeBSD_version >= 1300048
  case TARGET_FREEBSD_NR_shm_open2: /* shm_open2(2) */
  ret = do_freebsd_shm_open2(arg1, arg2, arg3, arg4, arg5);





Re: [PATCH v3 16/23] bsd-user: Implment madvise(2) to match the linux-user implementation.

2023-09-11 Thread Richard Henderson

On 9/9/23 12:36, Karim Taha wrote:

Signed-off-by: Signed-off-by: Karim Taha
---
  bsd-user/bsd-mem.h| 53 +++
  bsd-user/freebsd/os-syscall.c |  4 +++
  bsd-user/syscall_defs.h   |  2 ++
  3 files changed, 59 insertions(+)


Reviewed-by: Richard Henderson 



+switch (advice) {
+case MADV_DONTNEED:
+if (page_check_range(start, len, PAGE_PASSTHROUGH)) {
+ret = get_errno(madvise(g2h_untagged(start), len, advice));
+if ((advice == MADV_DONTNEED) && (ret == 0)) {


Duplicate check for MADV_DONTNEED.
Useless parenthesis.


r~



Re: [PATCH v3 10/23] bsd-user: Implement shmid_ds conversion between host and target.

2023-09-11 Thread Richard Henderson

On 9/9/23 12:36, Karim Taha wrote:

From: Stacey Son

Signed-off-by: Stacey Son
Signed-off-by: Karim Taha
---
  bsd-user/bsd-mem.c | 43 +++
  1 file changed, 43 insertions(+)


Reviewed-by: Richard Henderson 

r~



Re: [PATCH v3 05/23] bsd-user: Implement shm_open2(2) system call

2023-09-11 Thread Richard Henderson

On 9/9/23 12:36, Karim Taha wrote:

From: Kyle Evans

Signed-off-by: Kyle Evans
Signed-off-by: Karim Taha
---
  bsd-user/freebsd/os-misc.h| 42 +++
  bsd-user/freebsd/os-syscall.c | 13 +++
  2 files changed, 55 insertions(+)


Reviewed-by: Richard Henderson 

r~



Re: [PATCH RESEND v5 56/57] target/loongarch: Move simply DO_XX marcos togther

2023-09-11 Thread Richard Henderson

On 9/7/23 01:31, Song Gao wrote:

Signed-off-by: Song Gao
---
  target/loongarch/vec.h| 42 ++
  target/loongarch/vec_helper.c | 48 ---
  2 files changed, 42 insertions(+), 48 deletions(-)


Reviewed-by: Richard Henderson 

r~



Re: [PATCH RESEND v5 55/57] target/loongarch: Implement xvld xvst

2023-09-11 Thread Richard Henderson

On 9/7/23 01:31, Song Gao wrote:

This patch includes:
- XVLD[X], XVST[X];
- XVLDREPL.{B/H/W/D};
- XVSTELM.{B/H/W/D}.

Signed-off-by: Song Gao
---
  target/loongarch/insns.decode   |  18 +++
  target/loongarch/disas.c|  24 
  target/loongarch/insn_trans/trans_vec.c.inc | 143 ++--
  3 files changed, 175 insertions(+), 10 deletions(-)


Reviewed-by: Richard Henderson 

r~



Re: [PATCH RESEND v5 54/57] target/loongarch: Implement xvshuf xvperm{i} xvshuf4i

2023-09-11 Thread Richard Henderson

On 9/7/23 01:31, Song Gao wrote:

  void HELPER(vshuf_b)(void *vd, void *vj, void *vk, void *va, uint32_t desc)
  {
  int i, m;
-VReg temp;
+VReg temp = {};
  VReg *Vd = (VReg *)vd;
  VReg *Vj = (VReg *)vj;
  VReg *Vk = (VReg *)vk;
  VReg *Va = (VReg *)va;
+int oprsz = simd_oprsz(desc);
  
-m = LSX_LEN/8;

-for (i = 0; i < m ; i++) {
+m = LSX_LEN / 8;
+for (i = 0; i < m; i++) {
  uint64_t k = (uint8_t)Va->B(i) % (2 * m);
  temp.B(i) = k < m ? Vk->B(k) : Vj->B(k - m);
  }
+if (oprsz == 32) {
+for(i = m; i < 2 * m; i++) {
+uint64_t j = (uint8_t)Va->B(i) % (2 * m);
+temp.B(i) = j < m ? Vk->B(j + m) : Vj->B(j);
+}
+}


Loop, not a compare against oprsz.  Several instances.


+void HELPER(vpermi_q)(void *vd, void *vj, uint64_t imm, uint32_t desc)
 {
 VReg temp;
 VReg *Vd = (VReg *)vd;
 VReg *Vj = (VReg *)vj;
 
+temp.Q(0) = (imm & 0x3) > 1 ? Vd->Q((imm & 0x3) - 2) : Vj->Q(imm & 0x3);

+temp.Q(1) = ((imm >> 4) & 0x3) > 1 ? Vd->Q(((imm >> 4) & 0x3) - 2) :
+ Vj->Q((imm >> 4) & 0x3);



for (i = 0; i < 2; i++, imm >>= 4) {
   temp.Q(i) = (imm & 2 ? Vd : Vj)->Q(imm & 1);
}


r~



Re: [PATCH RESEND v5 52/57] target/loongarch: Implement xvreplve xvinsve0 xvpickve

2023-09-11 Thread Richard Henderson

On 9/7/23 01:31, Song Gao wrote:

+static bool gen_vreplve_vl(DisasContext *ctx, arg_vvr *a,
+   uint32_t oprsz, int vece, int bit,
+   void (*func)(TCGv_i64, TCGv_ptr, tcg_target_long))
  {
  TCGv_i64 t0 = tcg_temp_new_i64();
  TCGv_ptr t1 = tcg_temp_new_ptr();
  TCGv_i64 t2 = tcg_temp_new_i64();
  
+tcg_gen_andi_i64(t0, gpr_src(ctx, a->rk, EXT_NONE), (LSX_LEN / bit) - 1);

  tcg_gen_shli_i64(t0, t0, vece);
  if (HOST_BIG_ENDIAN) {
+tcg_gen_xori_i64(t0, t0, vece << ((LSX_LEN / bit) -1));
  }
  
  tcg_gen_trunc_i64_ptr(t1, t0);

  tcg_gen_add_ptr(t1, t1, cpu_env);
  func(t2, t1, vec_full_offset(a->vj));
+tcg_gen_gvec_dup_i64(vece, vec_full_offset(a->vd), 16, 16, t2);
+if (oprsz == 32) {
+func(t2, t1,  offsetof(CPULoongArchState, fpr[a->vj].vreg.Q(1)));
+tcg_gen_gvec_dup_i64(vece,
+ offsetof(CPULoongArchState, fpr[a->vd].vreg.Q(1)),
+ 16, 16, t2);
+}


This would be clearer as a loop:

for (i = 0; i < oprsz; i += 16) {
func(t2, t1, i);
tcg_gen_gvec_dup_i64(vece, vec_full_offset(a->vd) + i, 16, 16, t2);
}



+static bool trans_xvrepl128vei_b(DisasContext *ctx, arg_vv_i * a)
 {
+if (!avail_LASX(ctx)) {
 return false;
 }
 
+if (!check_vec(ctx, 32)) {

 return true;
 }
 
+tcg_gen_gvec_dup_mem(MO_8,

+ offsetof(CPULoongArchState, fpr[a->vd].vreg.B(0)),
+ offsetof(CPULoongArchState,
+  fpr[a->vj].vreg.B((a->imm))),
+ 16, 16);
+tcg_gen_gvec_dup_mem(MO_8,
+ offsetof(CPULoongArchState, fpr[a->vd].vreg.B(16)),
+ offsetof(CPULoongArchState,
+  fpr[a->vj].vreg.B((a->imm + 16))),
+ 16, 16);
+return true;
+}


Again, a loop.  Also, I think you can easily merge all 4 of these functions 
using VECE.


+#define XVREPLVE0(NAME, MOP)  \
+static bool trans_## NAME(DisasContext *ctx, arg_vv * a)  \
+{ \
+if (!avail_LASX(ctx)) {   \
+return false; \
+} \
+  \
+if (!check_vec(ctx, 32)) {\
+return true;  \
+} \
+  \
+tcg_gen_gvec_dup_mem(MOP, vec_full_offset(a->vd), vec_full_offset(a->vj), \
+ 32, 32); \
+return true;  \
+}
 
+XVREPLVE0(xvreplve0_b, MO_8)

+XVREPLVE0(xvreplve0_h, MO_16)
+XVREPLVE0(xvreplve0_w, MO_32)
+XVREPLVE0(xvreplve0_d, MO_64)
+XVREPLVE0(xvreplve0_q, MO_128)


Should use a helper function and TRANS().


+static bool do_vbsrl_v(DisasContext *ctx, arg_vv_i *a, uint32_t oprsz)
+{
+int ofs, i, max;
+TCGv_i64 desthigh[2], destlow[2], high[2], low[2];
+
+if (!check_vec(ctx, 32)) {
+return true;
+}
+
+max = (oprsz == 16) ? 1 : 2;
+
+for (i = 0; i < max; i++) {
+desthigh[i] = tcg_temp_new_i64();
+destlow[i] = tcg_temp_new_i64();
+high[i] = tcg_temp_new_i64();
+low[i] = tcg_temp_new_i64();
+get_vreg64(high[i], a->vj, 2 * i + 1);
+
+ofs = ((a->imm) & 0xf) * 8;
+if (ofs < 64) {
+get_vreg64(low[i], a->vj, 2 * i);
+tcg_gen_extract2_i64(destlow[i], low[i], high[i], ofs);
+tcg_gen_shri_i64(desthigh[i], high[i], ofs);
+} else {
+tcg_gen_shri_i64(destlow[i], high[i], ofs - 64);
+desthigh[i] = tcg_constant_i64(0);
+}
+set_vreg64(desthigh[i], a->vd, 2 * i + 1);
+set_vreg64(destlow[i], a->vd, 2 * i);
+}
 
 return true;

 }


Why are you using arrays?  They don't seem required.
This would seem clearer as

for (i = 0; i < oprsz / 16; i++) {
TCGv desthi = tcg_temp_new_i64();
...
}


r~



Re: [PATCH 06/32] bsd-user: Add bsd-proc.c to meson.build

2023-09-11 Thread Karim Taha
Richard Henderson  wrote:

>> +elf = cc.find_library('elf', required: true)
>> +procstat = cc.find_library('procstat', required: true)
>> +kvm = cc.find_library('kvm', required: true)
>> +bsd_user_ss.add(elf, procstat, kvm)
>
> What are these for?  Particularly kvm?
>
>
> r~

It's need to link with `libprocstat`, which is need for the 
`filestat` struct definition, and it's `proc_*` functions used
is `get_filename_from_fd` function, however the function is declared
static, which emits an `unused function warning`, but compiles
successfully.

The linker errors only when the `get_filename_from_fd` is used in
`freebsd_exec_common` function.

--
Karim Taha



Re: [PATCH 3/4] hw/cpu: Introduce CPUClass::cpu_resolving_type field

2023-09-11 Thread Gavin Shan



On 9/11/23 19:43, Philippe Mathieu-Daudé wrote:

On 11/9/23 01:28, Gavin Shan wrote:

On 9/8/23 21:22, Philippe Mathieu-Daudé wrote:

Add a field to return the QOM type name of a CPU class.

Signed-off-by: Philippe Mathieu-Daudé 
---
  include/hw/core/cpu.h   | 2 ++
  hw/core/cpu-common.c    | 2 +-
  target/alpha/cpu.c  | 1 +
  target/arm/cpu.c    | 1 +
  target/avr/cpu.c    | 1 +
  target/cris/cpu.c   | 1 +
  target/hexagon/cpu.c    | 1 +
  target/hppa/cpu.c   | 1 +
  target/i386/cpu.c   | 1 +
  target/loongarch/cpu.c  | 1 +
  target/m68k/cpu.c   | 1 +
  target/microblaze/cpu.c | 1 +
  target/mips/cpu.c   | 1 +
  target/nios2/cpu.c  | 1 +
  target/openrisc/cpu.c   | 1 +
  target/ppc/cpu_init.c   | 1 +
  target/riscv/cpu.c  | 1 +
  target/rx/cpu.c | 1 +
  target/s390x/cpu.c  | 1 +
  target/sh4/cpu.c    | 1 +
  target/sparc/cpu.c  | 1 +
  target/tricore/cpu.c    | 1 +
  target/xtensa/cpu.c | 1 +
  23 files changed, 24 insertions(+), 1 deletion(-)

diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
index 129d179937..e469efd409 100644
--- a/include/hw/core/cpu.h
+++ b/include/hw/core/cpu.h
@@ -100,6 +100,7 @@ struct SysemuCPUOps;
  /**
   * CPUClass:
+ * @cpu_resolving_type: CPU QOM type name
   * @class_by_name: Callback to map -cpu command line model name to an
   * instantiatable CPU type.
   * @parse_features: Callback to parse command line arguments.
@@ -148,6 +149,7 @@ struct CPUClass {
  DeviceClass parent_class;
  /*< public >*/
+    const char *cpu_resolving_type;
  ObjectClass *(*class_by_name)(const char *cpu_model);
  void (*parse_features)(const char *typename, char *str, Error **errp);


The question is why not use CPU_RESOLVING_TYPE directly? It seems 
CPU_RESOLVING_TYPE
is exactly what you want here.


Unfortunately CPU_RESOLVING_TYPE is target-specific, we want
hw/core/cpu-common.c to be target-agnostic (build once for all
targets). This is particularly important in the context of
heterogeneous QEMU, where a single binary will be able to create
CPUs from different targets.



CPU_RESOLVING_TYPE and CPUClass::cpu_resolving_type is duplicate to
each other. There are two options I can figure out to avoid the
duplication.

(a) move cpu_class_by_name() from hw/core/cpu-common.c to cpu.c, so that
CPU_RESOLVING_TYPE can be seen. cpu.c::list_cpus() is the example.

(b) remove hw/core/cpu-common.c::cpu_calss_by_name() and squeeze its
logic to cpu.c::parse_cpu_option() since there are not too much
users for it. target/arm and target/s390 needs some tweaks so that
hw/core/cpu-common.c::cpu_calss_by_name() can be removed.

[gshan@gshan q]$ git grep \ cpu_class_by_name\(
cpu.c:oc = cpu_class_by_name(CPU_RESOLVING_TYPE, model_pieces[0]);
target/arm/arm-qmp-cmds.c:oc = cpu_class_by_name(TYPE_ARM_CPU, 
model->name);
target/s390x/cpu_models_sysemu.c:oc = cpu_class_by_name(TYPE_S390_CPU, 
info->name);

When option (b) is taken, this series to have the checks against @oc
in hw/core/cpu-common.c::cpu_calss_by_name() becomes non-sense. Instead,
we need the same (and complete) checks in CPUClass::class_by_name() for
individual targets. Further more, an inline helper can be provided to do
the check in CPUClass::class_by_name() for individual targets.

   include/hw/core/cpu.h

   static inline bool cpu_class_is_valid(ObjectClass *oc, const char *parent)
   {
   if (!object_class_dynamic_cast(oc, parent) ||
   object_class_is_abstract(oc)) {
   return false;
   }

   return true;
   }

Thanks,
Gavin




Re: [PATCH RESEND v5 51/57] target/loongarch: Implement xvinsgr2vr xvpickve2gr

2023-09-11 Thread Richard Henderson

On 9/7/23 01:31, Song Gao wrote:

+static bool trans_xvinsgr2vr_w(DisasContext *ctx, arg_vr_i *a)
+{
+if (!avail_LASX(ctx)) {
+return false;
+}
+return trans_vinsgr2vr_w(ctx, a);
+}


Using the other translator doesn't help.


static bool trans_vinsgr2vr_w(DisasContext *ctx, arg_vr_i *a)
{
TCGv src = gpr_src(ctx, a->rj, EXT_NONE);

if (!avail_LSX(ctx)) {
return false;
}

CHECK_SXE;


This portion doesn't apply, and you miss the check_vec for the larger LASX.


tcg_gen_st32_i64(src, cpu_env,
 offsetof(CPULoongArchState, fpr[a->vd].vreg.W(a->imm)));
return true;
}


The only thing that is left is this one line, so I'm not sure it's worth splitting out a 
common helper function.



r~



[PATCH 1/4] target/ppc: Add new hflags to support BHRB

2023-09-11 Thread Glenn Miles
This commit is preparatory to the addition of Branch History
Rolling Buffer (BHRB) functionality, which is being provided
today starting with the P8 processor.

BHRB uses several SPR register fields to control whether or not
a branch instruction's address (and sometimes target address)
should be recorded.  Checking each of these fields with each
branch instruction using jitted code would lead to a significant
decrease in performance.

Therefore, it was decided that BHRB configuration bits that are
not expected to change frequently should have their state stored in
hflags so that the amount of checking done by jitted code can
be reduced.

This commit contains the changes for storing the state of the
following register fields as hflags:

MMCR0[FCP] - Determines if BHRB recording is frozen in the
 problem state

MMCR0[FCPC] - A modifier for MMCR0[FCP]

MMCRA[BHRBRD] - Disables all BHRB recording for a thread

Signed-off-by: Glenn Miles 
---
 target/ppc/cpu.h |  9 +
 target/ppc/cpu_init.c|  4 ++--
 target/ppc/helper.h  |  1 +
 target/ppc/helper_regs.c | 12 
 target/ppc/machine.c |  2 +-
 target/ppc/power8-pmu-regs.c.inc |  5 +
 target/ppc/power8-pmu.c  | 15 +++
 target/ppc/power8-pmu.h  |  4 ++--
 target/ppc/spr_common.h  |  1 +
 target/ppc/translate.c   |  6 ++
 10 files changed, 50 insertions(+), 9 deletions(-)

diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h
index 25fac9577a..20ae1466a5 100644
--- a/target/ppc/cpu.h
+++ b/target/ppc/cpu.h
@@ -439,6 +439,9 @@ FIELD(MSR, LE, MSR_LE, 1)
 #define MMCR0_FC56   PPC_BIT(59) /* PMC Freeze Counters 5-6 bit */
 #define MMCR0_PMC1CE PPC_BIT(48) /* MMCR0 PMC1 Condition Enabled */
 #define MMCR0_PMCjCE PPC_BIT(49) /* MMCR0 PMCj Condition Enabled */
+#define MMCR0_BHRBA  PPC_BIT_NR(42)  /* BHRB Available */
+#define MMCR0_FCPPPC_BIT(34) /* Freeze Counters/BHRB if PR=1 */
+#define MMCR0_FCPC   PPC_BIT(51) /* Condition for FCP bit */
 /* MMCR0 userspace r/w mask */
 #define MMCR0_UREG_MASK (MMCR0_FC | MMCR0_PMAO | MMCR0_PMAE)
 /* MMCR2 userspace r/w mask */
@@ -451,6 +454,9 @@ FIELD(MSR, LE, MSR_LE, 1)
 #define MMCR2_UREG_MASK (MMCR2_FC1P0 | MMCR2_FC2P0 | MMCR2_FC3P0 | \
  MMCR2_FC4P0 | MMCR2_FC5P0 | MMCR2_FC6P0)
 
+#define MMCRA_BHRBRDPPC_BIT(26)/* BHRB Recording Disable */
+
+
 #define MMCR1_EVT_SIZE 8
 /* extract64() does a right shift before extracting */
 #define MMCR1_PMC1SEL_START 32
@@ -703,6 +709,9 @@ enum {
 HFLAGS_PMCJCE = 17, /* MMCR0 PMCjCE bit */
 HFLAGS_PMC_OTHER = 18, /* PMC other than PMC5-6 is enabled */
 HFLAGS_INSN_CNT = 19, /* PMU instruction count enabled */
+HFLAGS_FCPC = 20,   /* MMCR0 FCPC bit */
+HFLAGS_FCP = 21,/* MMCR0 FCP bit */
+HFLAGS_BHRBRD = 22, /* MMCRA BHRBRD bit */
 HFLAGS_VSX = 23, /* MSR_VSX if cpu has VSX */
 HFLAGS_VR = 25,  /* MSR_VR if cpu has VRE */
 
diff --git a/target/ppc/cpu_init.c b/target/ppc/cpu_init.c
index 02b7aad9b0..568f9c3b88 100644
--- a/target/ppc/cpu_init.c
+++ b/target/ppc/cpu_init.c
@@ -5152,7 +5152,7 @@ static void register_book3s_pmu_sup_sprs(CPUPPCState *env)
  KVM_REG_PPC_MMCR1, 0x);
 spr_register_kvm(env, SPR_POWER_MMCRA, "MMCRA",
  SPR_NOACCESS, SPR_NOACCESS,
- _read_generic, _write_generic,
+ _read_generic, _write_MMCRA,
  KVM_REG_PPC_MMCRA, 0x);
 spr_register_kvm(env, SPR_POWER_PMC1, "PMC1",
  SPR_NOACCESS, SPR_NOACCESS,
@@ -7152,7 +7152,7 @@ static void ppc_cpu_reset_hold(Object *obj)
 if (env->mmu_model != POWERPC_MMU_REAL) {
 ppc_tlb_invalidate_all(env);
 }
-pmu_mmcr01_updated(env);
+pmu_mmcr01a_updated(env);
 }
 
 /* clean any pending stop state */
diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index abec6fe341..1a3d9a7e57 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -27,6 +27,7 @@ DEF_HELPER_2(store_lpcr, void, env, tl)
 DEF_HELPER_2(store_pcr, void, env, tl)
 DEF_HELPER_2(store_mmcr0, void, env, tl)
 DEF_HELPER_2(store_mmcr1, void, env, tl)
+DEF_HELPER_2(store_mmcrA, void, env, tl)
 DEF_HELPER_3(store_pmc, void, env, i32, i64)
 DEF_HELPER_2(read_pmc, tl, env, i32)
 DEF_HELPER_2(insns_inc, void, env, i32)
diff --git a/target/ppc/helper_regs.c b/target/ppc/helper_regs.c
index f380342d4d..4ff054063d 100644
--- a/target/ppc/helper_regs.c
+++ b/target/ppc/helper_regs.c
@@ -61,6 +61,15 @@ static uint32_t hreg_compute_pmu_hflags_value(CPUPPCState 
*env)
 if (env->spr[SPR_POWER_MMCR0] & MMCR0_PMCjCE) {
 hflags |= 1 << HFLAGS_PMCJCE;
 }
+if (env->spr[SPR_POWER_MMCR0] & MMCR0_FCP) {
+hflags |= 1 << HFLAGS_FCP;
+}
+if (env->spr[SPR_POWER_MMCR0] & MMCR0_FCPC) {
+  

Re: [PATCH RESEND v5 41/57] target/loongarch: Implement xvssrlrn xvssrarn

2023-09-11 Thread Richard Henderson

On 9/7/23 01:31, Song Gao wrote:

This patch includes:
- XVSSRLRN.{B.H/H.W/W.D};
- XVSSRARN.{B.H/H.W/W.D};
- XVSSRLRN.{BU.H/HU.W/WU.D};
- XVSSRARN.{BU.H/HU.W/WU.D};
- XVSSRLRNI.{B.H/H.W/W.D/D.Q};
- XVSSRARNI.{B.H/H.W/W.D/D.Q};
- XVSSRLRNI.{BU.H/HU.W/WU.D/DU.Q};
- XVSSRARNI.{BU.H/HU.W/WU.D/DU.Q}.

Signed-off-by: Song Gao
---
  target/loongarch/insns.decode   |  30 ++
  target/loongarch/disas.c|  30 ++
  target/loongarch/vec_helper.c   | 489 
  target/loongarch/insn_trans/trans_vec.c.inc |  28 ++
  4 files changed, 378 insertions(+), 199 deletions(-)


Reviewed-by: Richard Henderson 

r~



Re: [PATCH RESEND v5 40/57] target/loongarch: Implement xvssrln xvssran

2023-09-11 Thread Richard Henderson

On 9/7/23 01:31, Song Gao wrote:

This patch includes:
- XVSSRLN.{B.H/H.W/W.D};
- XVSSRAN.{B.H/H.W/W.D};
- XVSSRLN.{BU.H/HU.W/WU.D};
- XVSSRAN.{BU.H/HU.W/WU.D};
- XVSSRLNI.{B.H/H.W/W.D/D.Q};
- XVSSRANI.{B.H/H.W/W.D/D.Q};
- XVSSRLNI.{BU.H/HU.W/WU.D/DU.Q};
- XVSSRANI.{BU.H/HU.W/WU.D/DU.Q}.

Signed-off-by: Song Gao
---
  target/loongarch/insns.decode   |  30 ++
  target/loongarch/disas.c|  30 ++
  target/loongarch/vec_helper.c   | 456 
  target/loongarch/insn_trans/trans_vec.c.inc |  28 ++
  4 files changed, 353 insertions(+), 191 deletions(-)


Reviewed-by: Richard Henderson 

r~



[PATCH] vdpa: fix gcc cvq_isolated uninitialized variable warning

2023-09-11 Thread Stefan Hajnoczi
gcc 13.2.1 emits the following warning:

  net/vhost-vdpa.c: In function ‘net_vhost_vdpa_init.constprop’:
  net/vhost-vdpa.c:1394:25: error: ‘cvq_isolated’ may be used uninitialized 
[-Werror=maybe-uninitialized]
   1394 | s->cvq_isolated = cvq_isolated;
| ^~
  net/vhost-vdpa.c:1355:9: note: ‘cvq_isolated’ was declared here
   1355 | int cvq_isolated;
| ^~~~
  cc1: all warnings being treated as errors

Cc: Eugenio Pérez 
Cc: Michael S. Tsirkin 
Cc: Jason Wang 
Signed-off-by: Stefan Hajnoczi 
---
 net/vhost-vdpa.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index 34202ca009..7eaee841aa 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -1352,7 +1352,7 @@ static NetClientState *net_vhost_vdpa_init(NetClientState 
*peer,
 VhostVDPAState *s;
 int ret = 0;
 assert(name);
-int cvq_isolated;
+int cvq_isolated = 0;
 
 if (is_datapath) {
 nc = qemu_new_net_client(_vhost_vdpa_info, peer, device,
-- 
2.41.0




Re: [PULL v2 38/45] hw/riscv/virt.c: fix non-KVM --enable-debug build

2023-09-11 Thread Daniel Henrique Barboza




On 9/11/23 16:54, Michael Tokarev wrote:

11.09.2023 09:43, Alistair Francis:

From: Daniel Henrique Barboza 

A build with --enable-debug and without KVM will fail as follows:

/usr/bin/ld: libqemu-riscv64-softmmu.fa.p/hw_riscv_virt.c.o: in function 
`virt_machine_init':
./qemu/build/../hw/riscv/virt.c:1465: undefined reference to 
`kvm_riscv_aia_create'

This happens because the code block with "if virt_use_kvm_aia(s)" isn't
being ignored by the debug build, resulting in an undefined reference to
a KVM only function.

Add a 'kvm_enabled()' conditional together with virt_use_kvm_aia() will
make the compiler crop the kvm_riscv_aia_create() call entirely from a
non-KVM build. Note that adding the 'kvm_enabled()' conditional inside
virt_use_kvm_aia() won't fix the build because this function would need
to be inlined multiple times to make the compiler zero out the entire
block.

While we're at it, use kvm_enabled() in all instances where
virt_use_kvm_aia() is checked to allow the compiler to elide these other
kvm-only instances as well.

Suggested-by: Richard Henderson 
Fixes: dbdb99948e ("target/riscv: select KVM AIA in riscv virt machine")


This is probably commit 48c2c33c52, not dbdb99948e.

..

  /* KVM AIA only has one APLIC instance */
-    if (virt_use_kvm_aia(s)) {
+    if (kvm_enabled() && virt_use_kvm_aia(s)) {
  create_fdt_socket_aplic(s, memmap, 0,

...

As has been discovered earlier (see "target/i386: Restrict system-specific
features from user emulation" threads), this is not enough, - compiler does
not always eliminate if (0 && foo) { bar; } construct, it's too fragile and
does not actually work.  Either some #ifdef'fery is needed here or some other,
more explicit, way to eliminate such code, like introducing stub functions.

I know it's too late and this change is already in, but unfortunately that's
when I noticed this.  While the "Fixes:" tag can't be changed anymore, a more
proper fix for the actual problem (with the proper Fixes tag now) can still
be applied on top of this fix.


This works fine on current master on my machine:

$ ../configure --cc=clang 
--target-list=riscv64-softmmu,riscv64-linux-user,riscv32-softmmu,riscv32-linux-user
 --enable-debug
$ make -j

So I'm not sure what do you mean by 'actual problem'. If you refer to the 
non-existence
of stub functions, earlier today we had a review by Phil in this patch here 
where I was
adding stubs for all KVM functions:

https://lore.kernel.org/qemu-riscv/f30d8589-8b59-2fd7-c38c-3f79508a4...@linaro.org/

Phil mentioned that we don't need a function stub if the KVM only function is 
protected by
"kvm_enabled()". And this is fine, but then, from what you're saying, we can't 
rely on
(kvm_enabled() && foo) working all the time, so we're one conditional away from 
breaking
things it seems.

I think we need to make up our minds and standardize. Do we rely on the 
compiler to optimize
stuff out, or do we use stubs/ifdefs? I'm fine with both options, as long as we 
pick one and
stick with it.


Thanks,

Daniel



/mjt




Re: [PATCH RESEND v5 21/57] target/loongarch: Implement xavg/xvagr

2023-09-11 Thread Richard Henderson

On 9/7/23 01:31, Song Gao wrote:

This patch includes:
- XVAVG.{B/H/W/D/}[U];
- XVAVGR.{B/H/W/D}[U].

Signed-off-by: Song Gao
---
  target/loongarch/insns.decode   | 17 
  target/loongarch/disas.c| 17 
  target/loongarch/vec_helper.c   | 22 +++--
  target/loongarch/insn_trans/trans_vec.c.inc | 16 +++
  4 files changed, 62 insertions(+), 10 deletions(-)


Reviewed-by: Richard Henderson 

r~



Re: [PATCH RESEND v5 20/57] target/loongarch: Implement xvaddw/xvsubw

2023-09-11 Thread Richard Henderson

On 9/7/23 01:31, Song Gao wrote:

This patch includes:
- XVADDW{EV/OD}.{H.B/W.H/D.W/Q.D}[U];
- XVSUBW{EV/OD}.{H.B/W.H/D.W/Q.D}[U];
- XVADDW{EV/OD}.{H.BU.B/W.HU.H/D.WU.W/Q.DU.D}.

Signed-off-by: Song Gao
---
  target/loongarch/insns.decode   |  45 
  target/loongarch/disas.c|  43 +++
  target/loongarch/vec_helper.c   | 120 ++--
  target/loongarch/insn_trans/trans_vec.c.inc |  41 +++
  4 files changed, 215 insertions(+), 34 deletions(-)


Reviewed-by: Richard Henderson 

r~



Re: [PATCH RESEND v5 19/57] target/loongarch: Implement xvhaddw/xvhsubw

2023-09-11 Thread Richard Henderson

On 9/7/23 01:31, Song Gao wrote:

This patch includes:
- XVHADDW.{H.B/W.H/D.W/Q.D/HU.BU/WU.HU/DU.WU/QU.DU};
- XVHSUBW.{H.B/W.H/D.W/Q.D/HU.BU/WU.HU/DU.WU/QU.DU}.

Signed-off-by: Song Gao
---
  target/loongarch/insns.decode   | 18 +++
  target/loongarch/disas.c| 17 +++
  target/loongarch/vec_helper.c   | 34 -
  target/loongarch/insn_trans/trans_vec.c.inc | 26 
  4 files changed, 88 insertions(+), 7 deletions(-)


Reviewed-by: Richard Henderson 

r~



Re: [PATCH 11/14] target/arm: Implement the SETG* instructions

2023-09-11 Thread Richard Henderson

On 9/11/23 07:17, Peter Maydell wrote:

I think it would be a little better if set_tags was visible to the compiler, 
via inlining,
so that all of the conditions can be folded away.


Do you mean having a separate triplet of helper functions
for setg, which then call an inline function shared with
the normal setp/setm/sete to do the actual work, rather than
passing "is this setg" via the syndrome ?


Yes.


r~




[PATCH v4 1/3] target/i386: Check kvm_hyperv_expand_features() return value

2023-09-11 Thread Philippe Mathieu-Daudé
Move kvm_hyperv_expand_features() call earlier (this will simplify
reviewing the next commit) and check its return value, since it can
fail.

Signed-off-by: Philippe Mathieu-Daudé 
---
 target/i386/cpu.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 24ee67b42d..760428d4dc 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -7107,6 +7107,10 @@ void x86_cpu_expand_features(X86CPU *cpu, Error **errp)
 }
 }
 
+if (kvm_enabled() && !kvm_hyperv_expand_features(cpu, errp)) {
+return;
+}
+
 /* Set cpuid_*level* based on cpuid_min_*level, if not explicitly set */
 if (env->cpuid_level_func7 == UINT32_MAX) {
 env->cpuid_level_func7 = env->cpuid_min_level_func7;
@@ -7120,10 +7124,6 @@ void x86_cpu_expand_features(X86CPU *cpu, Error **errp)
 if (env->cpuid_xlevel2 == UINT32_MAX) {
 env->cpuid_xlevel2 = env->cpuid_min_xlevel2;
 }
-
-if (kvm_enabled()) {
-kvm_hyperv_expand_features(cpu, errp);
-}
 }
 
 /*
-- 
2.41.0




[PATCH v4 3/3] target/i386: Prohibit target specific KVM prototypes on user emulation

2023-09-11 Thread Philippe Mathieu-Daudé
None of these target-specific prototypes should be used
by user emulation. Remove their declaration there, so we
get a compile failure if ever used (instead of having to
deal with linker and its possible optimizations, such
dead code removal).

Suggested-by: Kevin Wolf 
Signed-off-by: Philippe Mathieu-Daudé 
---
 target/i386/kvm/kvm_i386.h | 4 
 target/i386/cpu.c  | 3 ++-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/target/i386/kvm/kvm_i386.h b/target/i386/kvm/kvm_i386.h
index 55d4e68c34..5ef73f0a1c 100644
--- a/target/i386/kvm/kvm_i386.h
+++ b/target/i386/kvm/kvm_i386.h
@@ -13,6 +13,10 @@
 
 #include "sysemu/kvm.h"
 
+#ifdef CONFIG_USER_ONLY
+#error Cannot include kvm_i386.h from user emulation
+#endif
+
 #ifdef CONFIG_KVM
 
 #define kvm_pit_in_kernel() \
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 8b57708604..6be59c7b2b 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -26,7 +26,7 @@
 #include "tcg/helper-tcg.h"
 #include "sysemu/reset.h"
 #include "sysemu/hvf.h"
-#include "kvm/kvm_i386.h"
+#include "sysemu/kvm.h"
 #include "sev.h"
 #include "qapi/error.h"
 #include "qemu/error-report.h"
@@ -40,6 +40,7 @@
 #include "exec/address-spaces.h"
 #include "hw/boards.h"
 #include "hw/i386/sgx-epc.h"
+#include "kvm/kvm_i386.h"
 #endif
 
 #include "disas/capstone.h"
-- 
2.41.0




[RFC PATCH v4 2/3] target/i386: Restrict system-specific features from user emulation

2023-09-11 Thread Philippe Mathieu-Daudé
Since commits 3adce820cf ("target/i386: Remove unused KVM
stubs") and ef1cf6890f ("target/i386: Allow elision of
kvm_hv_vpindex_settable()"), when building on a x86 host
configured as:

  $ ./configure --cc=clang \
--target-list=x86_64-linux-user,x86_64-softmmu \
--enable-debug

we get:

  [71/71] Linking target qemu-x86_64
  FAILED: qemu-x86_64
  /usr/bin/ld: libqemu-x86_64-linux-user.fa.p/target_i386_cpu.c.o: in function 
`cpu_x86_cpuid':
  cpu.c:(.text+0x1374): undefined reference to `kvm_arch_get_supported_cpuid'
  /usr/bin/ld: libqemu-x86_64-linux-user.fa.p/target_i386_cpu.c.o: in function 
`x86_cpu_filter_features':
  cpu.c:(.text+0x81c2): undefined reference to `kvm_arch_get_supported_cpuid'
  /usr/bin/ld: cpu.c:(.text+0x81da): undefined reference to 
`kvm_arch_get_supported_cpuid'
  /usr/bin/ld: cpu.c:(.text+0x81f2): undefined reference to 
`kvm_arch_get_supported_cpuid'
  /usr/bin/ld: cpu.c:(.text+0x820a): undefined reference to 
`kvm_arch_get_supported_cpuid'
  /usr/bin/ld: 
libqemu-x86_64-linux-user.fa.p/target_i386_cpu.c.o:cpu.c:(.text+0x8225): more 
undefined references to `kvm_arch_get_supported_cpuid' follow
  clang: error: linker command failed with exit code 1 (use -v to see 
invocation)
  ninja: build stopped: subcommand failed.

libqemu-x86_64-linux-user.fa is user emulation specific, so
having system emulation code called there is dubious.

'--enable-debug' disables optimizations (CFLAGS=-O0).

While at this (un)optimization level GCC eliminate the
following dead code (CPP output of mentioned build):

 static void x86_cpu_get_supported_cpuid(uint32_t func, uint32_t index,
 uint32_t *eax, uint32_t *ebx,
 uint32_t *ecx, uint32_t *edx)
 {
 if ((0)) {
 *eax = kvm_arch_get_supported_cpuid(kvm_state, func, index, R_EAX);
 *ebx = kvm_arch_get_supported_cpuid(kvm_state, func, index, R_EBX);
 *ecx = kvm_arch_get_supported_cpuid(kvm_state, func, index, R_ECX);
 *edx = kvm_arch_get_supported_cpuid(kvm_state, func, index, R_EDX);
 } else if (0) {
 *eax = 0;
 *ebx = 0;
 *ecx = 0;
 *edx = 0;
 } else {
 *eax = 0;
 *ebx = 0;
 *ecx = 0;
 *edx = 0;
 }
 }

Clang does not.

Instead of trying to deal with compiler specific checks around
__OPTIMIZE__ (see commit 2140cfa51d "i386: Fix build by providing
stub kvm_arch_get_supported_cpuid()"), simply restrict code
belonging to system emulation, easing user emulation linking.

Reported-by: Kevin Wolf 
Signed-off-by: Philippe Mathieu-Daudé 
---
RFC: I might have been overzealous in x86_cpu_expand_features(),
please double check.
---
 target/i386/cpu.c | 47 +--
 1 file changed, 45 insertions(+), 2 deletions(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 760428d4dc..8b57708604 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -1664,6 +1664,7 @@ static inline uint64_t 
x86_cpu_xsave_xss_components(X86CPU *cpu)
cpu->env.features[FEAT_XSAVE_XSS_LO];
 }
 
+#ifndef CONFIG_USER_ONLY
 /*
  * Returns the set of feature flags that are supported and migratable by
  * QEMU, for a given FeatureWord.
@@ -1686,6 +1687,7 @@ static uint64_t x86_cpu_get_migratable_flags(FeatureWord 
w)
 }
 return r;
 }
+#endif /* !CONFIG_USER_ONLY */
 
 void host_cpuid(uint32_t function, uint32_t count,
 uint32_t *eax, uint32_t *ebx, uint32_t *ecx, uint32_t *edx)
@@ -5677,8 +5679,6 @@ CpuDefinitionInfoList *qmp_query_cpu_definitions(Error 
**errp)
 return cpu_list;
 }
 
-#endif /* !CONFIG_USER_ONLY */
-
 uint64_t x86_cpu_get_supported_feature_word(FeatureWord w,
 bool migratable_only)
 {
@@ -5781,6 +5781,38 @@ static void x86_cpu_get_cache_cpuid(uint32_t func, 
uint32_t index,
 }
 }
 
+#else /* CONFIG_USER_ONLY */
+
+uint64_t x86_cpu_get_supported_feature_word(FeatureWord w,
+bool migratable_only)
+{
+FeatureWordInfo *wi = _word_info[w];
+
+return wi->tcg_features;
+}
+
+static void x86_cpu_get_supported_cpuid(uint32_t func, uint32_t index,
+uint32_t *eax, uint32_t *ebx,
+uint32_t *ecx, uint32_t *edx)
+{
+*eax = 0;
+*ebx = 0;
+*ecx = 0;
+*edx = 0;
+}
+
+static void x86_cpu_get_cache_cpuid(uint32_t func, uint32_t index,
+uint32_t *eax, uint32_t *ebx,
+uint32_t *ecx, uint32_t *edx)
+{
+*eax = 0;
+*ebx = 0;
+*ecx = 0;
+*edx = 0;
+}
+
+#endif /* !CONFIG_USER_ONLY */
+
 /*
  * Only for builtin_x86_defs models initialized with x86_register_cpudef_types.
  */
@@ -6163,6 +6195,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, 
uint32_t count,
 }
 *edx = env->features[FEAT_7_0_EDX]; /* Feature 

[PATCH v4 0/3] target/i386: Restrict system-specific features from user emulation

2023-09-11 Thread Philippe Mathieu-Daudé
Too many system-specific code (and in particular KVM related)
is pulled in user-only build. This led to adding unjustified
stubs as kludge to unagressive linker non-optimizations.

This series restrict x86 system-specific features to sysemu,
so we don't require any stub, and remove all x86 KVM declarations
from user emulation code (to trigger compile failure instead of
link one).

Philippe Mathieu-Daudé (3):
  target/i386: Check kvm_hyperv_expand_features() return value
  RFC target/i386: Restrict system-specific features from user emulation
  target/i386: Prohibit target specific KVM prototypes on user emulation

 target/i386/kvm/kvm_i386.h |  4 +++
 target/i386/cpu.c  | 58 +-
 2 files changed, 55 insertions(+), 7 deletions(-)

-- 
2.41.0




Re: [PATCH 04/32] bsd-user: Add freebsd_exec_common and do_freebsd_procctl to qemu.h.

2023-09-11 Thread Karim Taha
Richard Henderson  wrote:

> On 8/27/23 08:57, Karim Taha wrote:
>> From: Stacey Son 
>> 
>> Signed-off-by: Stacey Son 
>> Signed-off-by: Karim Taha 
>> ---
>>   bsd-user/main.c | 2 +-
>>   bsd-user/qemu.h | 7 +++
>>   2 files changed, 8 insertions(+), 1 deletion(-)
>> 
>> diff --git a/bsd-user/main.c b/bsd-user/main.c
>> index 381bb18df8..b94b2d34b6 100644
>> --- a/bsd-user/main.c
>> +++ b/bsd-user/main.c
>> @@ -88,7 +88,7 @@ unsigned long reserved_va = MAX_RESERVED_VA;
>>   unsigned long reserved_va;
>>   #endif
>>   
>> -static const char *interp_prefix = CONFIG_QEMU_INTERP_PREFIX;
>> +const char *interp_prefix = CONFIG_QEMU_INTERP_PREFIX;
>>   const char *qemu_uname_release;
>>   char qemu_proc_pathname[PATH_MAX];  /* full path to exeutable */
>>   
>
> Adding interp_prefix is unrelated.
>
> Without that,
> Reviewed-by: Richard Henderson 
>
>
> r~
>
I grepped for `interp_prefix`, it's later used in the
`freebsd_exec_common` function definition, so do you mean I should add
it with the relevant commit that defines the function?

--
Karim Taha

>> diff --git a/bsd-user/qemu.h b/bsd-user/qemu.h
>> index 6724bb9f0a..23bbdd3e0c 100644
>> --- a/bsd-user/qemu.h
>> +++ b/bsd-user/qemu.h
>> @@ -113,6 +113,7 @@ typedef struct TaskState {
>>   } __attribute__((aligned(16))) TaskState;
>>   
>>   void stop_all_tasks(void);
>> +extern const char *interp_prefix;
>>   extern const char *qemu_uname_release;
>>   
>>   /*
>> @@ -251,6 +252,12 @@ abi_long get_errno(abi_long ret);
>>   bool is_error(abi_long ret);
>>   int host_to_target_errno(int err);
>>   
>> +/* os-proc.c */
>> +abi_long freebsd_exec_common(abi_ulong path_or_fd, abi_ulong guest_argp,
>> +abi_ulong guest_envp, int do_fexec);
>> +abi_long do_freebsd_procctl(void *cpu_env, int idtype, abi_ulong arg2,
>> +abi_ulong arg3, abi_ulong arg4, abi_ulong arg5, abi_ulong arg6);
>> +
>>   /* os-sys.c */
>>   abi_long do_freebsd_sysctl(CPUArchState *env, abi_ulong namep, int32_t 
>> namelen,
>>   abi_ulong oldp, abi_ulong oldlenp, abi_ulong newp, abi_ulong 
>> newlen);



Re: [PATCH 01/11] qdev: Add qdev_prop_set_array()

2023-09-11 Thread Philippe Mathieu-Daudé

On 8/9/23 16:36, Kevin Wolf wrote:

Instead of exposing the ugly hack of how we represent arrays in qdev (a
static "foo-len" property and after it is set, dynamically created
"foo[i]" properties) to boards, add an interface that allows setting the
whole array at once.

Once all internal users of devices with array properties have been
converted to use this function, we can change the implementation to move
away from this hack.

Signed-off-by: Kevin Wolf 
---
  include/hw/qdev-properties.h |  3 +++
  hw/core/qdev-properties.c| 21 +
  2 files changed, 24 insertions(+)




+void qdev_prop_set_array(DeviceState *dev, const char *name, QList *values)
+{
+const QListEntry *entry;
+g_autofree char *prop_len = g_strdup_printf("len-%s", name);
+uint32_t i = 0;


"unsigned"? Anyway,

Reviewed-by: Philippe Mathieu-Daudé 


+
+object_property_set_int(OBJECT(dev), prop_len, qlist_size(values),
+_abort);
+
+QLIST_FOREACH_ENTRY(values, entry) {
+g_autofree char *prop_idx = g_strdup_printf("%s[%u]", name, i);
+object_property_set_qobject(OBJECT(dev), prop_idx, entry->value,
+_abort);
+i++;
+}
+
+qobject_unref(values);
+}





[PATCH v1 2/2] target/s390x: flag te and cte as deprecated

2023-09-11 Thread Collin Walling
Add the CONSTRAINT_TRANSACTIONAL_EXE (cte) and TRANSACTIONAL_EXE (te)
under the list of deprecated features.

Signed-off-by: Collin Walling 
---
 target/s390x/cpu_features.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/target/s390x/cpu_features.c b/target/s390x/cpu_features.c
index efafc9711c..cb4e2b8920 100644
--- a/target/s390x/cpu_features.c
+++ b/target/s390x/cpu_features.c
@@ -218,6 +218,9 @@ void s390_get_deprecated_features(S390FeatBitmap features)
  /* CSSKE is deprecated on newer generations */
  S390_FEAT_CONDITIONAL_SSKE,
  S390_FEAT_BPB,
+ /* Deprecated on z16 */
+ S390_FEAT_CONSTRAINT_TRANSACTIONAL_EXE,
+ S390_FEAT_TRANSACTIONAL_EXE
 };
 int i;
 
-- 
2.41.0




[PATCH v1 1/2] target/s390x: introduce "host-recommended" option for model expansion

2023-09-11 Thread Collin Walling
Newer S390 machines may drop support for features completely, rendering
guests operating with older CPU models incapable of running on said
machines. A manual effort to disable certain CPU features would be
required.

To alleviate this issue, a list of "deprecated" features are now
retained within QEMU, and a new "static-recommended" CPU model expansion
type has been created to allow a query of the host-model with deprecated
features explicitly disabled.

Signed-off-by: Collin Walling 
---
 qapi/machine-target.json |  8 +++-
 target/s390x/cpu_features.c  | 14 ++
 target/s390x/cpu_features.h  |  1 +
 target/s390x/cpu_models_sysemu.c | 26 +-
 4 files changed, 43 insertions(+), 6 deletions(-)

diff --git a/qapi/machine-target.json b/qapi/machine-target.json
index f0a6b72414..4dc891809d 100644
--- a/qapi/machine-target.json
+++ b/qapi/machine-target.json
@@ -42,6 +42,12 @@
 # to be migration-safe, but allows tooling to get an insight and
 # work with model details.
 #
+# @static-recommended: Expand to a static CPU model with property
+# changes suggested by the architecture. This is useful for
+# expanding a CPU model expected to operate in mixed
+# CPU-generation environments. The @static-recommended CPU
+# models are migration-safe.
+#
 # Note: When a non-migration-safe CPU model is expanded in static
 # mode, some features enabled by the CPU model may be omitted,
 # because they can't be implemented by a static CPU model
@@ -55,7 +61,7 @@
 # Since: 2.8
 ##
 { 'enum': 'CpuModelExpansionType',
-  'data': [ 'static', 'full' ] }
+  'data': [ 'static', 'full', 'static-recommended' ] }
 
 ##
 # @CpuModelCompareResult:
diff --git a/target/s390x/cpu_features.c b/target/s390x/cpu_features.c
index d28eb65845..efafc9711c 100644
--- a/target/s390x/cpu_features.c
+++ b/target/s390x/cpu_features.c
@@ -212,6 +212,20 @@ void s390_feat_bitmap_to_ascii(const S390FeatBitmap 
features, void *opaque,
 };
 }
 
+void s390_get_deprecated_features(S390FeatBitmap features)
+{
+static const int feats[] = {
+ /* CSSKE is deprecated on newer generations */
+ S390_FEAT_CONDITIONAL_SSKE,
+ S390_FEAT_BPB,
+};
+int i;
+
+for (i = 0; i < ARRAY_SIZE(feats); i++) {
+set_bit(feats[i], features);
+}
+}
+
 #define FEAT_GROUP_INIT(_name, _group, _desc)\
 {\
 .name = _name,   \
diff --git a/target/s390x/cpu_features.h b/target/s390x/cpu_features.h
index 87463f064d..5421762db5 100644
--- a/target/s390x/cpu_features.h
+++ b/target/s390x/cpu_features.h
@@ -68,6 +68,7 @@ void s390_add_from_feat_block(S390FeatBitmap features, 
S390FeatType type,
   uint8_t *data);
 void s390_feat_bitmap_to_ascii(const S390FeatBitmap features, void *opaque,
void (*fn)(const char *name, void *opaque));
+void s390_get_deprecated_features(S390FeatBitmap features);
 
 /* Definition of a CPU feature group */
 typedef struct {
diff --git a/target/s390x/cpu_models_sysemu.c b/target/s390x/cpu_models_sysemu.c
index 63981bf36b..1aa3d076b4 100644
--- a/target/s390x/cpu_models_sysemu.c
+++ b/target/s390x/cpu_models_sysemu.c
@@ -176,7 +176,7 @@ static void qdict_add_enabled_feat(const char *name, void 
*opaque)
 
 /* convert S390CPUDef into a static CpuModelInfo */
 static void cpu_info_from_model(CpuModelInfo *info, const S390CPUModel *model,
-bool delta_changes)
+bool delta_changes, bool disable_dep_feats)
 {
 QDict *qdict = qdict_new();
 S390FeatBitmap bitmap;
@@ -198,6 +198,13 @@ static void cpu_info_from_model(CpuModelInfo *info, const 
S390CPUModel *model,
 if (!bitmap_empty(bitmap, S390_FEAT_MAX)) {
 s390_feat_bitmap_to_ascii(bitmap, qdict, qdict_add_enabled_feat);
 }
+
+/* features flagged as deprecated */
+if (disable_dep_feats) {
+bitmap_zero(bitmap, S390_FEAT_MAX);
+s390_get_deprecated_features(bitmap);
+s390_feat_bitmap_to_ascii(bitmap, qdict, qdict_add_disabled_feat);
+}
 } else {
 /* expand all features */
 s390_feat_bitmap_to_ascii(model->features, qdict,
@@ -221,6 +228,7 @@ CpuModelExpansionInfo 
*qmp_query_cpu_model_expansion(CpuModelExpansionType type,
 CpuModelExpansionInfo *expansion_info = NULL;
 S390CPUModel s390_model;
 bool delta_changes = false;
+bool disable_dep_feats = false;
 
 /* convert it to our internal representation */
 cpu_model_from_info(_model, model, );
@@ -229,9 +237,16 @@ CpuModelExpansionInfo 
*qmp_query_cpu_model_expansion(CpuModelExpansionType type,
 return NULL;
 }
 
-if (type == CPU_MODEL_EXPANSION_TYPE_STATIC) {
+switch (type) {
+case CPU_MODEL_EXPANSION_TYPE_STATIC_RECOMMENDED:
+disable_dep_feats = true;
+

Re: [PATCH 09/11] hw/rx/rx62n: Use qdev_prop_set_array()

2023-09-11 Thread Philippe Mathieu-Daudé

On 8/9/23 16:37, Kevin Wolf wrote:

Instead of manually setting "foo-len" and "foo[i]" properties, build a
QList and use the new qdev_prop_set_array() helper to set the whole
array property with a single call.

Signed-off-by: Kevin Wolf 
---
  hw/rx/rx62n.c | 19 ++-
  1 file changed, 10 insertions(+), 9 deletions(-)


Reviewed-by: Philippe Mathieu-Daudé 




Re: [PATCH v6 08/10] migration/yank: Use channel features

2023-09-11 Thread Philippe Mathieu-Daudé

On 11/9/23 19:13, Fabiano Rosas wrote:

Stop using outside knowledge about the io channels when registering
yank functions. Query for features instead.

The yank method for all channels used with migration code currently is
to call the qio_channel_shutdown() function, so query for
QIO_CHANNEL_FEATURE_SHUTDOWN. We could add a separate feature in the
future for indicating whether a channel supports yanking, but that
seems overkill at the moment.

Signed-off-by: Fabiano Rosas 
---
  migration/yank_functions.c | 6 +-
  1 file changed, 1 insertion(+), 5 deletions(-)


Reviewed-by: Philippe Mathieu-Daudé 




Re: [PULL v2 42/45] target/riscv: Allocate itrigger timers only once

2023-09-11 Thread Michael Tokarev

11.09.2023 09:43, Alistair Francis пишет:

From: Akihiko Odaki 

riscv_trigger_init() had been called on reset events that can happen
several times for a CPU and it allocated timers for itrigger. If old
timers were present, they were simply overwritten by the new timers,
resulting in a memory leak.

Divide riscv_trigger_init() into two functions, namely
riscv_trigger_realize() and riscv_trigger_reset() and call them in
appropriate timing. The timer allocation will happen only once for a
CPU in riscv_trigger_realize().

Fixes: 5a4ae64cac ("target/riscv: Add itrigger support when icount is enabled")
Signed-off-by: Akihiko Odaki 
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: LIU Zhiwei 
Reviewed-by: Alistair Francis 
Message-ID: <20230818034059.9146-1-akihiko.od...@daynix.com>
Signed-off-by: Alistair Francis 


This smells like one more change from this series which should be
picked up for stable-8.1.  Picking this one up, please let me know
if I shuldn't.

Thanks,

/mjt



Re: [PULL v2 38/45] hw/riscv/virt.c: fix non-KVM --enable-debug build

2023-09-11 Thread Michael Tokarev

11.09.2023 09:43, Alistair Francis:

From: Daniel Henrique Barboza 

A build with --enable-debug and without KVM will fail as follows:

/usr/bin/ld: libqemu-riscv64-softmmu.fa.p/hw_riscv_virt.c.o: in function 
`virt_machine_init':
./qemu/build/../hw/riscv/virt.c:1465: undefined reference to 
`kvm_riscv_aia_create'

This happens because the code block with "if virt_use_kvm_aia(s)" isn't
being ignored by the debug build, resulting in an undefined reference to
a KVM only function.

Add a 'kvm_enabled()' conditional together with virt_use_kvm_aia() will
make the compiler crop the kvm_riscv_aia_create() call entirely from a
non-KVM build. Note that adding the 'kvm_enabled()' conditional inside
virt_use_kvm_aia() won't fix the build because this function would need
to be inlined multiple times to make the compiler zero out the entire
block.

While we're at it, use kvm_enabled() in all instances where
virt_use_kvm_aia() is checked to allow the compiler to elide these other
kvm-only instances as well.

Suggested-by: Richard Henderson 
Fixes: dbdb99948e ("target/riscv: select KVM AIA in riscv virt machine")


This is probably commit 48c2c33c52, not dbdb99948e.

..

  /* KVM AIA only has one APLIC instance */
-if (virt_use_kvm_aia(s)) {
+if (kvm_enabled() && virt_use_kvm_aia(s)) {
  create_fdt_socket_aplic(s, memmap, 0,

...

As has been discovered earlier (see "target/i386: Restrict system-specific
features from user emulation" threads), this is not enough, - compiler does
not always eliminate if (0 && foo) { bar; } construct, it's too fragile and
does not actually work.  Either some #ifdef'fery is needed here or some other,
more explicit, way to eliminate such code, like introducing stub functions.

I know it's too late and this change is already in, but unfortunately that's
when I noticed this.  While the "Fixes:" tag can't be changed anymore, a more
proper fix for the actual problem (with the proper Fixes tag now) can still
be applied on top of this fix.

/mjt



Re: [PATCH 1/1] migration: skip poisoned memory pages on "ram saving" phase

2023-09-11 Thread Peter Xu
On Sat, Sep 09, 2023 at 03:57:44PM +0100, Joao Martins wrote:
> > Should I continue to treat them as zero pages written with
> > save_zero_page_to_file ? 
> 
> MCE had already been forward to the guest, so guest is supposed to not be 
> using
> the page (nor rely on its contents). Hence destination ought to just see a 
> zero
> page. So what you said seems like the best course of action.
> 
> > Or should I consider the case of an ongoing compression
> > use and create a new code compressing an empty page with 
> > save_compress_page() ?
> > 
> The compress code looks to be a tentative compression (not guaranteed IIUC), 
> so
> I am not sure it needs any more logic that just adding at the top of
> ram_save_target_page_legacy() as Peter suggested?
> 
> > And what about an RDMA memory region impacted by a memory error ?
> > This is an important aspect.
> > Does anyone know how this situation is dealt with ? And how it should be 
> > handled
> > in Qemu ?
> > 
> 
> If you refer to guest RDMA MRs that is just guest RAM, not sure we are even
> aware of those from qemu. But if you refer to the RDMA transport that sits 
> below
> the Qemu file (or rather acts as an implementation of QemuFile), so handling 
> in
> ram_save_target_page_legacy() already seems to cover it.

I'm also not familiar enough with RDMA, but it looks tricky indeed. AFAIU
it's leveraging RDMA_CONTROL_COMPRESS for zero pages for now (with
RDMACompress.value==0), so it doesn't seem to be using generic migration
protocols.

If we want to fix all places well, one way to consider is to introduce
migration_buffer_is_zero(), which can be a wrapper for buffer_is_zero() by
default, but also returns true for poisoned pages before reading the
buffer.  Then we use it in all three places:

  - For compression, in do_compress_ram_page()
  - For RDMA, in qemu_rdma_write_one()
  - For generic migration, in save_zero_page_to_file() (your current patch)

I suppose then all cases will be fixed.  We need to make sure we'll always
use migration_buffer_is_zero() as the 1st thing to call when QEMU wants to
migrate a target page.  Maybe it'll worth a comment above that function.

Thanks,

-- 
Peter Xu




Re: [PATCH v1] vfio/common: Separate vfio-pci ranges

2023-09-11 Thread Joao Martins
On 11/09/2023 19:35, Alex Williamson wrote:
> On Mon, 11 Sep 2023 11:12:55 +0100
> Joao Martins  wrote:
> 
>> On 11/09/2023 10:48, Duan, Zhenzhong wrote:
 -Original Message-
 From: Joao Martins 
 Sent: Monday, September 11, 2023 5:07 PM
 Subject: Re: [PATCH v1] vfio/common: Separate vfio-pci ranges

 On 11/09/2023 09:57, Duan, Zhenzhong wrote:  
>> -Original Message-
>> From: qemu-devel-bounces+zhenzhong.duan=intel@nongnu.org > devel-bounces+zhenzhong.duan=intel@nongnu.org> On Behalf Of Joao
>> Martins
>> Sent: Friday, September 8, 2023 5:30 PM
>> Subject: [PATCH v1] vfio/common: Separate vfio-pci ranges
>>
>> QEMU computes the DMA logging ranges for two predefined ranges: 32-bit
>> and 64-bit. In the OVMF case, when the dynamic MMIO window is enabled,
>> QEMU includes in the 64-bit range the RAM regions at the lower part
>> and vfio-pci device RAM regions which are at the top of the address
>> space. This range contains a large gap and the size can be bigger than
>> the dirty tracking HW limits of some devices (MLX5 has a 2^42 limit).
>>
>> To avoid such large ranges, introduce a new PCI range covering the
>> vfio-pci device RAM regions, this only if the addresses are above 4GB
>> to avoid breaking potential SeaBIOS guests.
>>
>> Signed-off-by: Joao Martins 
>> [ clg: - wrote commit log
>>   - fixed overlapping 32-bit and PCI ranges when using SeaBIOS ]
>> Signed-off-by: Cédric Le Goater 
>> ---
>> v2:
>> * s/minpci/minpci64/
>> * s/maxpci/maxpci64/
>> * Expand comment to cover the pci-hole64 and why we don't do special
>>  handling of pci-hole32.
>> ---
>> hw/vfio/common.c | 71 +---
>> hw/vfio/trace-events |  2 +-
>> 2 files changed, 61 insertions(+), 12 deletions(-)
>>
>> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
>> index 237101d03844..134649226d43 100644
>> --- a/hw/vfio/common.c
>> +++ b/hw/vfio/common.c
>> @@ -27,6 +27,7 @@
>>
>> #include "hw/vfio/vfio-common.h"
>> #include "hw/vfio/vfio.h"
>> +#include "hw/vfio/pci.h"
>> #include "exec/address-spaces.h"
>> #include "exec/memory.h"
>> #include "exec/ram_addr.h"
>> @@ -1400,6 +1401,8 @@ typedef struct VFIODirtyRanges {
>> hwaddr max32;
>> hwaddr min64;
>> hwaddr max64;
>> +hwaddr minpci64;
>> +hwaddr maxpci64;
>> } VFIODirtyRanges;
>>
>> typedef struct VFIODirtyRangesListener {
>> @@ -1408,6 +1411,31 @@ typedef struct VFIODirtyRangesListener {
>> MemoryListener listener;
>> } VFIODirtyRangesListener;
>>
>> +static bool vfio_section_is_vfio_pci(MemoryRegionSection *section,
>> + VFIOContainer *container)
>> +{
>> +VFIOPCIDevice *pcidev;
>> +VFIODevice *vbasedev;
>> +VFIOGroup *group;
>> +Object *owner;
>> +
>> +owner = memory_region_owner(section->mr);
>> +
>> +QLIST_FOREACH(group, >group_list, container_next) {
>> +QLIST_FOREACH(vbasedev, >device_list, next) {
>> +if (vbasedev->type != VFIO_DEVICE_TYPE_PCI) {
>> +continue;
>> +}
>> +pcidev = container_of(vbasedev, VFIOPCIDevice, vbasedev);
>> +if (OBJECT(pcidev) == owner) {
>> +return true;
>> +}
>> +}
>> +}
>> +
>> +return false;
>> +}  
>
> What about simplify it with memory_region_is_ram_device()?
> This way vdpa device could also be included.
>  

 Note that the check is not interested in RAM (or any other kinda of memory 
 like
 VGA). That's covered in the 32-64 ranges. But rather in any PCI device RAM 
 that
 would fall in the 64-bit PCI hole. Would memory_region_is_ram_device() 
 really
 cover it? If so, I am all for the simplification.  
>>>
>>> Ram device is used not only by vfio pci bars but also host notifier of vdpa 
>>> and vhost-user.
>>>
>>> I have another question, previously I think vfio pci bars are device states 
>>> and
>>> save/restored through VFIO migration protocol, so we don't need to dirty
>>> tracking them. Do I understand wrong?  
>>
>> The general thinking of device dirty tracking is to track all addressable 
>> IOVAs.
>> But you do raise a good question. My understanding is that migrating the bars
>> *as is* might be device migration specific (not a guarantee?); the save file 
>> and
>> precopy interface are the only places we transfer from/to the data and it's 
>> just
>> opaque data, not bars or anything formatted specifically -- so if we migrate
>> bars it is hidden in what device f/w wants to move. Might be that BARs aren't
>> even needed as they are sort of scratch space from h/w side. Ultimately, 

Re: [PATCH 02/13] memory: Introduce memory_region_iommu_set_iova_ranges

2023-09-11 Thread Peter Xu
On Mon, Sep 04, 2023 at 10:03:45AM +0200, Eric Auger wrote:
> This helper will allow to convey information about valid
> IOVA ranges to virtual IOMMUS.
> 
> Signed-off-by: Eric Auger 

Reviewed-by: Peter Xu 

-- 
Peter Xu




Re: [PATCH 01/13] memory: Let ReservedRegion use Range

2023-09-11 Thread Peter Xu
On Mon, Sep 04, 2023 at 10:03:44AM +0200, Eric Auger wrote:
> A reserved region is a range tagged with a type. Let's directly use
> the Range type in the prospect to reuse some of the library helpers
> shipped with the Range type.
> 
> Signed-off-by: Eric Auger 

Reviewed-by: Peter Xu 

-- 
Peter Xu




Re: [PATCH v1] vfio/common: Separate vfio-pci ranges

2023-09-11 Thread Alex Williamson
On Mon, 11 Sep 2023 11:12:55 +0100
Joao Martins  wrote:

> On 11/09/2023 10:48, Duan, Zhenzhong wrote:
> >> -Original Message-
> >> From: Joao Martins 
> >> Sent: Monday, September 11, 2023 5:07 PM
> >> Subject: Re: [PATCH v1] vfio/common: Separate vfio-pci ranges
> >>
> >> On 11/09/2023 09:57, Duan, Zhenzhong wrote:  
>  -Original Message-
>  From: qemu-devel-bounces+zhenzhong.duan=intel@nongnu.org   devel-bounces+zhenzhong.duan=intel@nongnu.org> On Behalf Of Joao
>  Martins
>  Sent: Friday, September 8, 2023 5:30 PM
>  Subject: [PATCH v1] vfio/common: Separate vfio-pci ranges
> 
>  QEMU computes the DMA logging ranges for two predefined ranges: 32-bit
>  and 64-bit. In the OVMF case, when the dynamic MMIO window is enabled,
>  QEMU includes in the 64-bit range the RAM regions at the lower part
>  and vfio-pci device RAM regions which are at the top of the address
>  space. This range contains a large gap and the size can be bigger than
>  the dirty tracking HW limits of some devices (MLX5 has a 2^42 limit).
> 
>  To avoid such large ranges, introduce a new PCI range covering the
>  vfio-pci device RAM regions, this only if the addresses are above 4GB
>  to avoid breaking potential SeaBIOS guests.
> 
>  Signed-off-by: Joao Martins 
>  [ clg: - wrote commit log
>    - fixed overlapping 32-bit and PCI ranges when using SeaBIOS ]
>  Signed-off-by: Cédric Le Goater 
>  ---
>  v2:
>  * s/minpci/minpci64/
>  * s/maxpci/maxpci64/
>  * Expand comment to cover the pci-hole64 and why we don't do special
>   handling of pci-hole32.
>  ---
>  hw/vfio/common.c | 71 +---
>  hw/vfio/trace-events |  2 +-
>  2 files changed, 61 insertions(+), 12 deletions(-)
> 
>  diff --git a/hw/vfio/common.c b/hw/vfio/common.c
>  index 237101d03844..134649226d43 100644
>  --- a/hw/vfio/common.c
>  +++ b/hw/vfio/common.c
>  @@ -27,6 +27,7 @@
> 
>  #include "hw/vfio/vfio-common.h"
>  #include "hw/vfio/vfio.h"
>  +#include "hw/vfio/pci.h"
>  #include "exec/address-spaces.h"
>  #include "exec/memory.h"
>  #include "exec/ram_addr.h"
>  @@ -1400,6 +1401,8 @@ typedef struct VFIODirtyRanges {
>  hwaddr max32;
>  hwaddr min64;
>  hwaddr max64;
>  +hwaddr minpci64;
>  +hwaddr maxpci64;
>  } VFIODirtyRanges;
> 
>  typedef struct VFIODirtyRangesListener {
>  @@ -1408,6 +1411,31 @@ typedef struct VFIODirtyRangesListener {
>  MemoryListener listener;
>  } VFIODirtyRangesListener;
> 
>  +static bool vfio_section_is_vfio_pci(MemoryRegionSection *section,
>  + VFIOContainer *container)
>  +{
>  +VFIOPCIDevice *pcidev;
>  +VFIODevice *vbasedev;
>  +VFIOGroup *group;
>  +Object *owner;
>  +
>  +owner = memory_region_owner(section->mr);
>  +
>  +QLIST_FOREACH(group, >group_list, container_next) {
>  +QLIST_FOREACH(vbasedev, >device_list, next) {
>  +if (vbasedev->type != VFIO_DEVICE_TYPE_PCI) {
>  +continue;
>  +}
>  +pcidev = container_of(vbasedev, VFIOPCIDevice, vbasedev);
>  +if (OBJECT(pcidev) == owner) {
>  +return true;
>  +}
>  +}
>  +}
>  +
>  +return false;
>  +}  
> >>>
> >>> What about simplify it with memory_region_is_ram_device()?
> >>> This way vdpa device could also be included.
> >>>  
> >>
> >> Note that the check is not interested in RAM (or any other kinda of memory 
> >> like
> >> VGA). That's covered in the 32-64 ranges. But rather in any PCI device RAM 
> >> that
> >> would fall in the 64-bit PCI hole. Would memory_region_is_ram_device() 
> >> really
> >> cover it? If so, I am all for the simplification.  
> > 
> > Ram device is used not only by vfio pci bars but also host notifier of vdpa 
> > and vhost-user.
> > 
> > I have another question, previously I think vfio pci bars are device states 
> > and
> > save/restored through VFIO migration protocol, so we don't need to dirty
> > tracking them. Do I understand wrong?  
> 
> The general thinking of device dirty tracking is to track all addressable 
> IOVAs.
> But you do raise a good question. My understanding is that migrating the bars
> *as is* might be device migration specific (not a guarantee?); the save file 
> and
> precopy interface are the only places we transfer from/to the data and it's 
> just
> opaque data, not bars or anything formatted specifically -- so if we migrate
> bars it is hidden in what device f/w wants to move. Might be that BARs aren't
> even needed as they are sort of scratch space from h/w side. Ultimately, the
> dirty tracker is the one reporting the values, and the 

Re: [PATCH v2,1/1] memory: avoid updating ioeventfds for some address_space

2023-09-11 Thread David Hildenbrand

On 11.09.23 18:28, Peter Xu wrote:

On Mon, Sep 04, 2023 at 08:51:43PM +0800, hongmainquan wrote:


Friendly ping...
Hello, this patch has already received a R-b from PeterXu. Could you please
help me review it as well and see if there are any issues? If everything is
fine, could you please consider merging it? Thank you!


Paolo, wanna pick this one up?

David, I know you're preparing a pull with a lot of memory changes, if you
like to pick this up it'll be good too.


I queued it to

https://github.com/davidhildenbrand/qemu.git mem-next

If nobody beats me to it (or requests me to drop it), I'll send it 
upstream next week.


--
Cheers,

David / dhildenb




[PATCH v6 10/10] migration: Add a wrapper to cleanup migration files

2023-09-11 Thread Fabiano Rosas
We currently have a pattern for cleaning up a migration QEMUFile:

  qemu_mutex_lock(>qemu_file_lock);
  file = s->file_name;
  s->file_name = NULL;
  qemu_mutex_unlock(>qemu_file_lock);

  migration_ioc_unregister_yank_from_file(file);
  qemu_file_shutdown(file);
  qemu_fclose(file);

This sequence requires some consideration about locking to avoid
TOC/TOU bugs and avoid passing NULL into the functions that don't
expect it.

There's no need to call a shutdown() right before a close() and a
shutdown() in another thread being issued as a means to unblock a file
should not collide with this close().

Create a wrapper function to make sure the locking is being done
properly. Remove the extra shutdown().

Signed-off-by: Fabiano Rosas 
---
 migration/migration.c | 89 ---
 1 file changed, 25 insertions(+), 64 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index 7fec57ad7f..3e94521321 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -217,6 +217,26 @@ MigrationIncomingState 
*migration_incoming_get_current(void)
 return current_incoming;
 }
 
+static void migration_file_release(QEMUFile **file)
+{
+MigrationState *ms = migrate_get_current();
+QEMUFile *tmp;
+
+/*
+ * Reset the pointer before releasing it to avoid holding the lock
+ * for too long.
+ */
+WITH_QEMU_LOCK_GUARD(>qemu_file_lock) {
+tmp = *file;
+*file = NULL;
+}
+
+if (tmp) {
+migration_ioc_unregister_yank_from_file(tmp);
+qemu_fclose(tmp);
+}
+}
+
 void migration_incoming_transport_cleanup(MigrationIncomingState *mis)
 {
 if (mis->socket_address_list) {
@@ -1155,8 +1175,6 @@ static void migrate_fd_cleanup(MigrationState *s)
 qemu_savevm_state_cleanup();
 
 if (s->to_dst_file) {
-QEMUFile *tmp;
-
 trace_migrate_fd_cleanup();
 qemu_mutex_unlock_iothread();
 if (s->migration_thread_running) {
@@ -1166,16 +1184,7 @@ static void migrate_fd_cleanup(MigrationState *s)
 qemu_mutex_lock_iothread();
 
 multifd_save_cleanup();
-qemu_mutex_lock(>qemu_file_lock);
-tmp = s->to_dst_file;
-s->to_dst_file = NULL;
-qemu_mutex_unlock(>qemu_file_lock);
-/*
- * Close the file handle without the lock to make sure the
- * critical section won't block for long.
- */
-migration_ioc_unregister_yank_from_file(tmp);
-qemu_fclose(tmp);
+migration_file_release(>to_dst_file);
 }
 
 /*
@@ -1815,38 +1824,6 @@ static int migrate_handle_rp_resume_ack(MigrationState 
*s, uint32_t value)
 return 0;
 }
 
-/*
- * Release ms->rp_state.from_dst_file (and postcopy_qemufile_src if
- * existed) in a safe way.
- */
-static void migration_release_dst_files(MigrationState *ms)
-{
-QEMUFile *file;
-
-WITH_QEMU_LOCK_GUARD(>qemu_file_lock) {
-/*
- * Reset the from_dst_file pointer first before releasing it, as we
- * can't block within lock section
- */
-file = ms->rp_state.from_dst_file;
-ms->rp_state.from_dst_file = NULL;
-}
-
-/*
- * Do the same to postcopy fast path socket too if there is.  No
- * locking needed because this qemufile should only be managed by
- * return path thread.
- */
-if (ms->postcopy_qemufile_src) {
-migration_ioc_unregister_yank_from_file(ms->postcopy_qemufile_src);
-qemu_file_shutdown(ms->postcopy_qemufile_src);
-qemu_fclose(ms->postcopy_qemufile_src);
-ms->postcopy_qemufile_src = NULL;
-}
-
-qemu_fclose(file);
-}
-
 /*
  * Handles messages sent on the return path towards the source VM
  *
@@ -2046,7 +2023,8 @@ static int 
await_return_path_close_on_source(MigrationState *ms)
 ret = ms->rp_state.error;
 ms->rp_state.error = false;
 
-migration_release_dst_files(ms);
+migration_file_release(>rp_state.from_dst_file);
+migration_file_release(>postcopy_qemufile_src);
 
 trace_migration_return_path_end_after(ret);
 return ret;
@@ -2502,26 +2480,9 @@ static MigThrError postcopy_pause(MigrationState *s)
 assert(s->state == MIGRATION_STATUS_POSTCOPY_ACTIVE);
 
 while (true) {
-QEMUFile *file;
-
-/*
- * Current channel is possibly broken. Release it.  Note that this is
- * guaranteed even without lock because to_dst_file should only be
- * modified by the migration thread.  That also guarantees that the
- * unregister of yank is safe too without the lock.  It should be safe
- * even to be within the qemu_file_lock, but we didn't do that to avoid
- * taking more mutex (yank_lock) within qemu_file_lock.  TL;DR: we make
- * the qemu_file_lock critical section as small as possible.
- */
+/* Current channel is possibly broken. Release it. */
 assert(s->to_dst_file);
-

[PATCH v6 00/10] Fix segfault on migration return path

2023-09-11 Thread Fabiano Rosas
This version adds migration-specific tracking for the yank functions.

We've estabilished that using the ioc refcount is a layer violation
and that relaxing the abort() on yank.c is undesirable, so we're left
with making the migration code keep track of how many QEMUFiles are
still using the QIOChannel and (consequently) the yank function.

CI run: https://gitlab.com/farosas/qemu/-/pipelines/999810871

v5:
https://lore.kernel.org/r/20230831183916.13203-1-faro...@suse.de
v4:
https://lore.kernel.org/r/20230816142510.5637-1-faro...@suse.de
v3:
https://lore.kernel.org/r/20230811150836.2895-1-faro...@suse.de
v2:
https://lore.kernel.org/r/20230802143644.7534-1-faro...@suse.de
v1:
https://lore.kernel.org/r/20230728121516.16258-1-faro...@suse.de

Fabiano Rosas (10):
  migration: Fix possible race when setting rp_state.error
  migration: Fix possible races when shutting down the return path
  migration: Fix possible race when shutting down to_dst_file
  migration: Remove redundant cleanup of postcopy_qemufile_src
  migration: Consolidate return path closing code
  migration: Replace the return path retry logic
  migration: Move return path cleanup to main migration thread
  migration/yank: Use channel features
  migration/yank: Keep track of registered yank instances
  migration: Add a wrapper to cleanup migration files

 migration/migration.c  | 227 +
 migration/migration.h  |   1 -
 migration/yank_functions.c |  87 +++---
 migration/yank_functions.h |   8 ++
 4 files changed, 160 insertions(+), 163 deletions(-)

-- 
2.35.3




[PATCH v6 04/10] migration: Remove redundant cleanup of postcopy_qemufile_src

2023-09-11 Thread Fabiano Rosas
This file is owned by the return path thread which is already doing
cleanup.

Reviewed-by: Peter Xu 
Signed-off-by: Fabiano Rosas 
---
 migration/migration.c | 6 --
 1 file changed, 6 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index 5e6a766235..195726eb4a 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1177,12 +1177,6 @@ static void migrate_fd_cleanup(MigrationState *s)
 qemu_fclose(tmp);
 }
 
-if (s->postcopy_qemufile_src) {
-migration_ioc_unregister_yank_from_file(s->postcopy_qemufile_src);
-qemu_fclose(s->postcopy_qemufile_src);
-s->postcopy_qemufile_src = NULL;
-}
-
 assert(!migration_is_active(s));
 
 if (s->state == MIGRATION_STATUS_CANCELLING) {
-- 
2.35.3




[PATCH v6 05/10] migration: Consolidate return path closing code

2023-09-11 Thread Fabiano Rosas
We'll start calling the await_return_path_close_on_source() function
from other parts of the code, so move all of the related checks and
tracepoints into it.

Reviewed-by: Peter Xu 
Signed-off-by: Fabiano Rosas 
---
 migration/migration.c | 29 ++---
 1 file changed, 14 insertions(+), 15 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index 195726eb4a..4edbee3a5d 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -2049,6 +2049,14 @@ static int open_return_path_on_source(MigrationState *ms,
 /* Returns 0 if the RP was ok, otherwise there was an error on the RP */
 static int await_return_path_close_on_source(MigrationState *ms)
 {
+int ret;
+
+if (!ms->rp_state.rp_thread_created) {
+return 0;
+}
+
+trace_migration_return_path_end_before();
+
 /*
  * If this is a normal exit then the destination will send a SHUT
  * and the rp_thread will exit, however if there's an error we
@@ -2066,7 +2074,10 @@ static int 
await_return_path_close_on_source(MigrationState *ms)
 qemu_thread_join(>rp_state.rp_thread);
 ms->rp_state.rp_thread_created = false;
 trace_await_return_path_close_on_source_close();
-return ms->rp_state.error;
+
+ret = ms->rp_state.error;
+trace_migration_return_path_end_after(ret);
+return ret;
 }
 
 static inline void
@@ -2362,20 +2373,8 @@ static void migration_completion(MigrationState *s)
 goto fail;
 }
 
-/*
- * If rp was opened we must clean up the thread before
- * cleaning everything else up (since if there are no failures
- * it will wait for the destination to send it's status in
- * a SHUT command).
- */
-if (s->rp_state.rp_thread_created) {
-int rp_error;
-trace_migration_return_path_end_before();
-rp_error = await_return_path_close_on_source(s);
-trace_migration_return_path_end_after(rp_error);
-if (rp_error) {
-goto fail;
-}
+if (await_return_path_close_on_source(s)) {
+goto fail;
 }
 
 if (qemu_file_get_error(s->to_dst_file)) {
-- 
2.35.3




[PATCH v6 06/10] migration: Replace the return path retry logic

2023-09-11 Thread Fabiano Rosas
Replace the return path retry logic with finishing and restarting the
thread. This fixes a race when resuming the migration that leads to a
segfault.

Currently when doing postcopy we consider that an IO error on the
return path file could be due to a network intermittency. We then keep
the thread alive but have it do cleanup of the 'from_dst_file' and
wait on the 'postcopy_pause_rp' semaphore. When the user issues a
migrate resume, a new return path is opened and the thread is allowed
to continue.

There's a race condition in the above mechanism. It is possible for
the new return path file to be setup *before* the cleanup code in the
return path thread has had a chance to run, leading to the *new* file
being closed and the pointer set to NULL. When the thread is released
after the resume, it tries to dereference 'from_dst_file' and crashes:

Thread 7 "return path" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffd1dbf700 (LWP 9611)]
0x560e4893 in qemu_file_get_error_obj (f=0x0, errp=0x0) at 
../migration/qemu-file.c:154
154 return f->last_error;

(gdb) bt
 #0  0x560e4893 in qemu_file_get_error_obj (f=0x0, errp=0x0) at 
../migration/qemu-file.c:154
 #1  0x560e4983 in qemu_file_get_error (f=0x0) at 
../migration/qemu-file.c:206
 #2  0x55b9a1df in source_return_path_thread (opaque=0x56e06000) at 
../migration/migration.c:1876
 #3  0x5602e14f in qemu_thread_start (args=0x5782e780) at 
../util/qemu-thread-posix.c:541
 #4  0x738d76ea in start_thread (arg=0x7fffd1dbf700) at 
pthread_create.c:477
 #5  0x735efa6f in clone () at 
../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Here's the race (important bit is open_return_path happening before
migration_release_dst_files):

migration | qmp | return path
--+-+-
qmp_migrate_pause()
 shutdown(ms->to_dst_file)
  f->last_error = -EIO
migrate_detect_error()
 postcopy_pause()
  set_state(PAUSED)
  wait(postcopy_pause_sem)
qmp_migrate(resume)
migrate_fd_connect()
 resume = state == PAUSED
 open_return_path <-- TOO SOON!
 set_state(RECOVER)
 post(postcopy_pause_sem)
(incoming closes 
to_src_file)
res = 
qemu_file_get_error(rp)

migration_release_dst_files()

ms->rp_state.from_dst_file = NULL
  post(postcopy_pause_rp_sem)

postcopy_pause_return_path_thread()
  
wait(postcopy_pause_rp_sem)
rp = 
ms->rp_state.from_dst_file
goto retry
qemu_file_get_error(rp)
SIGSEGV
---

We can keep the retry logic without having the thread alive and
waiting. The only piece of data used by it is the 'from_dst_file' and
it is only allowed to proceed after a migrate resume is issued and the
semaphore released at migrate_fd_connect().

Move the retry logic to outside the thread by waiting for the thread
to finish before pausing the migration.

Reviewed-by: Peter Xu 
Signed-off-by: Fabiano Rosas 
---
 migration/migration.c | 60 ---
 migration/migration.h |  1 -
 2 files changed, 11 insertions(+), 50 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index 4edbee3a5d..7dfcbc3634 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1775,18 +1775,6 @@ static void migrate_handle_rp_req_pages(MigrationState 
*ms, const char* rbname,
 }
 }
 
-/* Return true to retry, false to quit */
-static bool postcopy_pause_return_path_thread(MigrationState *s)
-{
-trace_postcopy_pause_return_path();
-
-qemu_sem_wait(>postcopy_pause_rp_sem);
-
-trace_postcopy_pause_return_path_continued();
-
-return true;
-}
-
 static int migrate_handle_rp_recv_bitmap(MigrationState *s, char *block_name)
 {
 RAMBlock *block = qemu_ram_block_by_name(block_name);
@@ -1870,7 +1858,6 @@ static void *source_return_path_thread(void *opaque)
 trace_source_return_path_thread_entry();
 rcu_register_thread();
 
-retry:
 while (!ms->rp_state.error && !qemu_file_get_error(rp) &&
migration_is_setup_or_active(ms->state)) {
   

[PATCH v6 03/10] migration: Fix possible race when shutting down to_dst_file

2023-09-11 Thread Fabiano Rosas
It's not safe to call qemu_file_shutdown() on the to_dst_file without
first checking for the file's presence under the lock. The cleanup of
this file happens at postcopy_pause() and migrate_fd_cleanup() which
are not necessarily running in the same thread as migrate_fd_cancel().

Reviewed-by: Peter Xu 
Signed-off-by: Fabiano Rosas 
---
 migration/migration.c | 18 +++---
 1 file changed, 11 insertions(+), 7 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index 85c171f32c..5e6a766235 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1245,7 +1245,7 @@ static void migrate_fd_error(MigrationState *s, const 
Error *error)
 static void migrate_fd_cancel(MigrationState *s)
 {
 int old_state ;
-QEMUFile *f = migrate_get_current()->to_dst_file;
+
 trace_migrate_fd_cancel();
 
 WITH_QEMU_LOCK_GUARD(>qemu_file_lock) {
@@ -1271,11 +1271,13 @@ static void migrate_fd_cancel(MigrationState *s)
  * If we're unlucky the migration code might be stuck somewhere in a
  * send/write while the network has failed and is waiting to timeout;
  * if we've got shutdown(2) available then we can force it to quit.
- * The outgoing qemu file gets closed in migrate_fd_cleanup that is
- * called in a bh, so there is no race against this cancel.
  */
-if (s->state == MIGRATION_STATUS_CANCELLING && f) {
-qemu_file_shutdown(f);
+if (s->state == MIGRATION_STATUS_CANCELLING) {
+WITH_QEMU_LOCK_GUARD(>qemu_file_lock) {
+if (s->to_dst_file) {
+qemu_file_shutdown(s->to_dst_file);
+}
+}
 }
 if (s->state == MIGRATION_STATUS_CANCELLING && s->block_inactive) {
 Error *local_err = NULL;
@@ -1519,12 +1521,14 @@ void qmp_migrate_pause(Error **errp)
 {
 MigrationState *ms = migrate_get_current();
 MigrationIncomingState *mis = migration_incoming_get_current();
-int ret;
+int ret = 0;
 
 if (ms->state == MIGRATION_STATUS_POSTCOPY_ACTIVE) {
 /* Source side, during postcopy */
 qemu_mutex_lock(>qemu_file_lock);
-ret = qemu_file_shutdown(ms->to_dst_file);
+if (ms->to_dst_file) {
+ret = qemu_file_shutdown(ms->to_dst_file);
+}
 qemu_mutex_unlock(>qemu_file_lock);
 if (ret) {
 error_setg(errp, "Failed to pause source migration");
-- 
2.35.3




[PATCH v6 09/10] migration/yank: Keep track of registered yank instances

2023-09-11 Thread Fabiano Rosas
The core yank code is strict about balanced registering and
unregistering of yank functions.

This creates a difficulty because the migration code registers one
yank function per QIOChannel, but each QIOChannel can be referenced by
more than one QEMUFile. The yank function should not be removed until
all QEMUFiles have been closed.

Keep a reference count of how many QEMUFiles are using a QIOChannel
that has a yank function. Only unregister the yank function when all
QEMUFiles have been closed.

This improves the current code by removing the need for the programmer
to know which QEMUFile is the last one to be cleaned up and fixes the
theoretical issue of removing the yank function while another QEMUFile
could still be using the ioc and require a yank.

Signed-off-by: Fabiano Rosas 
---
 migration/yank_functions.c | 81 ++
 migration/yank_functions.h |  8 
 2 files changed, 81 insertions(+), 8 deletions(-)

diff --git a/migration/yank_functions.c b/migration/yank_functions.c
index 979e60c762..afdeef30ff 100644
--- a/migration/yank_functions.c
+++ b/migration/yank_functions.c
@@ -10,9 +10,32 @@
 #include "qemu/osdep.h"
 #include "io/channel.h"
 #include "yank_functions.h"
+#include "qemu/lockable.h"
 #include "qemu/yank.h"
 #include "qemu-file.h"
 
+static QemuMutex ioc_list_lock;
+static QLIST_HEAD(, Yankable) yankable_ioc_list
+= QLIST_HEAD_INITIALIZER(yankable_ioc_list);
+
+static void __attribute__((constructor)) ioc_list_lock_init(void)
+{
+qemu_mutex_init(_list_lock);
+}
+
+static void yankable_ref(Yankable *yankable)
+{
+assert(yankable->refcnt > 0);
+yankable->refcnt++;
+assert(yankable->refcnt < INT_MAX);
+}
+
+static void yankable_unref(Yankable *yankable)
+{
+assert(yankable->refcnt > 0);
+yankable->refcnt--;
+}
+
 void migration_yank_iochannel(void *opaque)
 {
 QIOChannel *ioc = QIO_CHANNEL(opaque);
@@ -28,20 +51,62 @@ static bool migration_ioc_yank_supported(QIOChannel *ioc)
 
 void migration_ioc_register_yank(QIOChannel *ioc)
 {
-if (migration_ioc_yank_supported(ioc)) {
-yank_register_function(MIGRATION_YANK_INSTANCE,
-   migration_yank_iochannel,
-   ioc);
+Yankable *new, *entry;
+
+if (!ioc || !migration_ioc_yank_supported(ioc)) {
+return;
 }
+
+WITH_QEMU_LOCK_GUARD(_list_lock) {
+QLIST_FOREACH(entry, _ioc_list, next) {
+if (entry->opaque == ioc) {
+yankable_ref(entry);
+return;
+}
+}
+
+new = g_new0(Yankable, 1);
+new->refcnt = 1;
+new->opaque = ioc;
+object_ref(ioc);
+
+QLIST_INSERT_HEAD(_ioc_list, new, next);
+}
+
+yank_register_function(MIGRATION_YANK_INSTANCE,
+   migration_yank_iochannel,
+   ioc);
 }
 
 void migration_ioc_unregister_yank(QIOChannel *ioc)
 {
-if (migration_ioc_yank_supported(ioc)) {
-yank_unregister_function(MIGRATION_YANK_INSTANCE,
- migration_yank_iochannel,
- ioc);
+Yankable *entry, *tmp;
+
+if (!ioc || !migration_ioc_yank_supported(ioc)) {
+return;
 }
+
+WITH_QEMU_LOCK_GUARD(_list_lock) {
+QLIST_FOREACH_SAFE(entry, _ioc_list, next, tmp) {
+if (entry->opaque == ioc) {
+yankable_unref(entry);
+
+if (!entry->refcnt) {
+QLIST_REMOVE(entry, next);
+g_free(entry);
+goto unreg;
+}
+}
+}
+}
+
+return;
+
+unreg:
+yank_unregister_function(MIGRATION_YANK_INSTANCE,
+ migration_yank_iochannel,
+ ioc);
+object_unref(ioc);
 }
 
 void migration_ioc_unregister_yank_from_file(QEMUFile *file)
diff --git a/migration/yank_functions.h b/migration/yank_functions.h
index a7577955ed..c928a46f68 100644
--- a/migration/yank_functions.h
+++ b/migration/yank_functions.h
@@ -7,6 +7,14 @@
  * See the COPYING file in the top-level directory.
  */
 
+struct Yankable {
+void *opaque;
+int refcnt;
+QLIST_ENTRY(Yankable) next;
+};
+
+typedef struct Yankable Yankable;
+
 /**
  * migration_yank_iochannel: yank function for iochannel
  *
-- 
2.35.3




[PATCH v6 01/10] migration: Fix possible race when setting rp_state.error

2023-09-11 Thread Fabiano Rosas
We don't need to set the rp_state.error right after a shutdown because
qemu_file_shutdown() always sets the QEMUFile error, so the return
path thread would have seen it and set the rp error itself.

Setting the error outside of the thread is also racy because the
thread could clear it after we set it.

Reviewed-by: Peter Xu 
Signed-off-by: Fabiano Rosas 
---
 migration/migration.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/migration/migration.c b/migration/migration.c
index 5528acb65e..f88c86079c 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -2062,7 +2062,6 @@ static int 
await_return_path_close_on_source(MigrationState *ms)
  * waiting for the destination.
  */
 qemu_file_shutdown(ms->rp_state.from_dst_file);
-mark_source_rp_bad(ms);
 }
 trace_await_return_path_close_on_source_joining();
 qemu_thread_join(>rp_state.rp_thread);
-- 
2.35.3




[PATCH v6 02/10] migration: Fix possible races when shutting down the return path

2023-09-11 Thread Fabiano Rosas
We cannot call qemu_file_shutdown() on the return path file without
taking the file lock. The return path thread could be running it's
cleanup code and have just cleared the from_dst_file pointer.

Checking ms->to_dst_file for errors could also race with
migrate_fd_cleanup() which clears the to_dst_file pointer.

Protect both accesses by taking the file lock.

This was caught by inspection, it should be rare, but the next patches
will start calling this code from other places, so let's do the
correct thing.

Reviewed-by: Peter Xu 
Signed-off-by: Fabiano Rosas 
---
 migration/migration.c | 19 ++-
 1 file changed, 10 insertions(+), 9 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index f88c86079c..85c171f32c 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -2052,17 +2052,18 @@ static int open_return_path_on_source(MigrationState 
*ms,
 static int await_return_path_close_on_source(MigrationState *ms)
 {
 /*
- * If this is a normal exit then the destination will send a SHUT and the
- * rp_thread will exit, however if there's an error we need to cause
- * it to exit.
+ * If this is a normal exit then the destination will send a SHUT
+ * and the rp_thread will exit, however if there's an error we
+ * need to cause it to exit. shutdown(2), if we have it, will
+ * cause it to unblock if it's stuck waiting for the destination.
  */
-if (qemu_file_get_error(ms->to_dst_file) && ms->rp_state.from_dst_file) {
-/*
- * shutdown(2), if we have it, will cause it to unblock if it's stuck
- * waiting for the destination.
- */
-qemu_file_shutdown(ms->rp_state.from_dst_file);
+WITH_QEMU_LOCK_GUARD(>qemu_file_lock) {
+if (ms->to_dst_file && ms->rp_state.from_dst_file &&
+qemu_file_get_error(ms->to_dst_file)) {
+qemu_file_shutdown(ms->rp_state.from_dst_file);
+}
 }
+
 trace_await_return_path_close_on_source_joining();
 qemu_thread_join(>rp_state.rp_thread);
 ms->rp_state.rp_thread_created = false;
-- 
2.35.3




[PATCH v6 08/10] migration/yank: Use channel features

2023-09-11 Thread Fabiano Rosas
Stop using outside knowledge about the io channels when registering
yank functions. Query for features instead.

The yank method for all channels used with migration code currently is
to call the qio_channel_shutdown() function, so query for
QIO_CHANNEL_FEATURE_SHUTDOWN. We could add a separate feature in the
future for indicating whether a channel supports yanking, but that
seems overkill at the moment.

Signed-off-by: Fabiano Rosas 
---
 migration/yank_functions.c | 6 +-
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/migration/yank_functions.c b/migration/yank_functions.c
index d5a710a3f2..979e60c762 100644
--- a/migration/yank_functions.c
+++ b/migration/yank_functions.c
@@ -8,12 +8,9 @@
  */
 
 #include "qemu/osdep.h"
-#include "qapi/error.h"
 #include "io/channel.h"
 #include "yank_functions.h"
 #include "qemu/yank.h"
-#include "io/channel-socket.h"
-#include "io/channel-tls.h"
 #include "qemu-file.h"
 
 void migration_yank_iochannel(void *opaque)
@@ -26,8 +23,7 @@ void migration_yank_iochannel(void *opaque)
 /* Return whether yank is supported on this ioc */
 static bool migration_ioc_yank_supported(QIOChannel *ioc)
 {
-return object_dynamic_cast(OBJECT(ioc), TYPE_QIO_CHANNEL_SOCKET) ||
-object_dynamic_cast(OBJECT(ioc), TYPE_QIO_CHANNEL_TLS);
+return qio_channel_has_feature(ioc, QIO_CHANNEL_FEATURE_SHUTDOWN);
 }
 
 void migration_ioc_register_yank(QIOChannel *ioc)
-- 
2.35.3




[PATCH v6 07/10] migration: Move return path cleanup to main migration thread

2023-09-11 Thread Fabiano Rosas
Now that the return path thread is allowed to finish during a paused
migration, we can move the cleanup of the QEMUFiles to the main
migration thread.

Reviewed-by: Peter Xu 
Signed-off-by: Fabiano Rosas 
---
 migration/migration.c | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/migration/migration.c b/migration/migration.c
index 7dfcbc3634..7fec57ad7f 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -98,6 +98,7 @@ static int migration_maybe_pause(MigrationState *s,
  int *current_active_state,
  int new_state);
 static void migrate_fd_cancel(MigrationState *s);
+static int await_return_path_close_on_source(MigrationState *s);
 
 static bool migration_needs_multiple_sockets(void)
 {
@@ -1177,6 +1178,12 @@ static void migrate_fd_cleanup(MigrationState *s)
 qemu_fclose(tmp);
 }
 
+/*
+ * We already cleaned up to_dst_file, so errors from the return
+ * path might be due to that, ignore them.
+ */
+await_return_path_close_on_source(s);
+
 assert(!migration_is_active(s));
 
 if (s->state == MIGRATION_STATUS_CANCELLING) {
@@ -1985,7 +1992,6 @@ out:
 }
 
 trace_source_return_path_thread_end();
-migration_release_dst_files(ms);
 rcu_unregister_thread();
 return NULL;
 }
@@ -2039,6 +2045,9 @@ static int 
await_return_path_close_on_source(MigrationState *ms)
 
 ret = ms->rp_state.error;
 ms->rp_state.error = false;
+
+migration_release_dst_files(ms);
+
 trace_migration_return_path_end_after(ret);
 return ret;
 }
-- 
2.35.3




Re: [virtio-dev] [RESEND VIRTIO GPU PATCH v3 1/1] virtio-gpu: Add new feature flag VIRTIO_GPU_F_FREEZE_S3

2023-09-11 Thread Mikhail Golubev-Ciuchea via
Hi Jiqian,

Thanks for the proposal.

Some time ago I was working on the same issue with suspending the gpu device
(on arm).  Additionally, I had troubles with virtio-video device as well, see
https://lore.kernel.org/lkml/20211215172739.ga77...@opensynergy.com/T/ for
details.

In your case, the
VIRTIO_GPU_FREEZE_MODE_FREEZE_S3/VIRTIO_GPU_FREEZE_MODE_UNFREEZE do influence
how reset is being handled by Qemu, is this correct?  Since multiple devices
can benefit from the same mechanism, would it be possible to: a) have a more
generic, non ACPI-based name, b) make the feature generic, applicable to other
devices as well?


Best wishes,
Mikhail Golubev-Ciuchea



--

Mikhail Golubev-Ciuchea


OpenSynergy GmbH

Rotherstr. 20, 10245 Berlin

Telefon: +49 (30) 60 98 54 0 - 903

EMail:   mikhail.golu...@opensynergy.com

www.opensynergy.com

Handelsregister/Commercial Registry: Amtsgericht Charlottenburg, HRB 108616B

Geschäftsführer/Managing Director: Regis Adjamah



From: virtio-...@lists.oasis-open.org  on 
behalf of Jiqian Chen 
Sent: Monday, September 11, 2023 12:04 PM
To: Gerd Hoffmann; Marc-André Lureau; Robert Beckett; 
virtio-comm...@lists.oasis-open.org; virtio-...@lists.oasis-open.org
Cc: qemu-devel@nongnu.org; linux-ker...@vger.kernel.org; Stefano Stabellini; 
Roger Pau Monné; Alex Deucher; Christian Koenig; Stewart Hildebrand; Xenia 
Ragiadakou; Honglei Huang; Julia Zhang; Huang Rui; Jiqian Chen
Subject: [virtio-dev] [RESEND VIRTIO GPU PATCH v3 1/1] virtio-gpu: Add new 
feature flag VIRTIO_GPU_F_FREEZE_S3

When we suspend/resume guest on Xen, the display can't come back.
This is because when guest suspended, it called into Qemu. Then
Qemu destroyed all resources which is used for display. So that
guest's display can't come back to the time when it was suspended.

To solve above problem, I added a new mechanism that when guest is
suspending, it will notify Qemu, and then Qemu will not destroy
resourcesi which are created by using commands
VIRTIO_GPU_CMD_RESOURCE_CREATE_*.

Due to that mechanism needs cooperation between guest and host,
I need to add a new feature flag, so that guest and host can
negotiate whenever freeze_S3 is supported or not.

Signed-off-by: Jiqian Chen 
---
 device-types/gpu/description.tex | 42 
 1 file changed, 42 insertions(+)

diff --git a/device-types/gpu/description.tex b/device-types/gpu/description.tex
index 4435248..1a137e7 100644
--- a/device-types/gpu/description.tex
+++ b/device-types/gpu/description.tex
@@ -37,6 +37,8 @@ \subsection{Feature bits}\label{sec:Device Types / GPU Device 
/ Feature bits}
   resources is supported.
 \item[VIRTIO_GPU_F_CONTEXT_INIT (4)] multiple context types and
   synchronization timelines supported.  Requires VIRTIO_GPU_F_VIRGL.
+\item[VIRTIO_GPU_F_FREEZE_S3 (5)] freezing virtio-gpu and keeping resources
+  alive is supported.
 \end{description}

 \subsection{Device configuration layout}\label{sec:Device Types / GPU Device / 
Device configuration layout}
@@ -228,6 +230,9 @@ \subsubsection{Device Operation: Request 
header}\label{sec:Device Types / GPU De
 VIRTIO_GPU_CMD_UPDATE_CURSOR = 0x0300,
 VIRTIO_GPU_CMD_MOVE_CURSOR,

+/* freeze mode */
+VIRTIO_GPU_CMD_SET_FREEZE_MODE = 0x0400,
+
 /* success responses */
 VIRTIO_GPU_RESP_OK_NODATA = 0x1100,
 VIRTIO_GPU_RESP_OK_DISPLAY_INFO,
@@ -838,6 +843,43 @@ \subsubsection{Device Operation: cursorq}\label{sec:Device 
Types / GPU Device /

 \end{description}

+\subsubsection{Device Operation: freeze_mode}\label{sec:Device Types / GPU 
Device / Device Operation / Device Operation: freeze_mode}
+
+\begin{lstlisting}
+typedef enum {
+  VIRTIO_GPU_FREEZE_MODE_UNFREEZE = 0,
+  VIRTIO_GPU_FREEZE_MODE_FREEZE_S3 = 3,
+} virtio_gpu_freeze_mode_t;
+
+struct virtio_gpu_set_freeze_mode {
+  struct virtio_gpu_ctrl_hdr hdr;
+  virtio_gpu_freeze_mode_t freeze_mode;
+};
+\end{lstlisting}
+
+\begin{description}
+
+\item[VIRTIO_GPU_CMD_SET_FREEZE_MODE]
+Notify freeze mode through controlq.
+Request data is \field{struct virtio_gpu_set_freeze_mode}.
+Response type is VIRTIO_GPU_RESP_OK_NODATA.
+
+This is added for S3 function in guest with virtio-gpu. When guest does
+S3, let it notify QEMU that virtio-gpu is in what freeze mode in
+\field{freeze_mode}. VIRTIO_GPU_FREEZE_MODE_FREEZE_S3 means guest is
+doing S3 and virtio-gpu will be freezed, VIRTIO_GPU_FREEZE_MODE_UNFREEZE
+means virtio-gpu can be used as usual. When virtio-gpu is freezed, QEMU
+will not destroy resources which are created by using commands
+VIRTIO_GPU_CMD_RESOURCE_CREATE_*, so that guest can use those resources
+to resume display.
+
+Note: this change is not enough to solve the problems of S4 function.
+QEMU may lose resources after hibernation. It needs more research and
+development. If S4 is supported in the future, it may need another
+feature flag here.
+
+\end{description}
+
 \subsection{VGA Compatibility}\label{sec:Device 

Re: [PATCH 02/11] hw/i386/pc: Use qdev_prop_set_array()

2023-09-11 Thread Kevin Wolf
Am 11.09.2023 um 17:42 hat Peter Maydell geschrieben:
> On Fri, 8 Sept 2023 at 15:37, Kevin Wolf  wrote:
> >
> > Instead of manually setting "foo-len" and "foo[i]" properties, build a
> > QList and use the new qdev_prop_set_array() helper to set the whole
> > array property with a single call.
> >
> > Signed-off-by: Kevin Wolf 
> > ---
> >  hw/i386/pc.c | 8 +---
> >  1 file changed, 5 insertions(+), 3 deletions(-)
> >
> > diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> > index 54838c0c41..0e84e454cb 100644
> > --- a/hw/i386/pc.c
> > +++ b/hw/i386/pc.c
> > @@ -81,6 +81,7 @@
> >  #include "qapi/error.h"
> >  #include "qapi/qapi-visit-common.h"
> >  #include "qapi/qapi-visit-machine.h"
> > +#include "qapi/qmp/qlist.h"
> >  #include "qapi/visitor.h"
> >  #include "hw/core/cpu.h"
> >  #include "hw/usb.h"
> > @@ -1508,9 +1509,10 @@ static void 
> > pc_machine_device_pre_plug_cb(HotplugHandler *hotplug_dev,
> >  char *resv_prop_str = g_strdup_printf("0xfee0:0xfeef:%d",
> >VIRTIO_IOMMU_RESV_MEM_T_MSI);
> >
> > -object_property_set_uint(OBJECT(dev), "len-reserved-regions", 1, 
> > errp);
> > -object_property_set_str(OBJECT(dev), "reserved-regions[0]",
> > -resv_prop_str, errp);
> > +QList *reserved_regions = qlist_new();
> > +qlist_append_str(reserved_regions, resv_prop_str);
> > +qdev_prop_set_array(dev, "reserved-regions", reserved_regions);
> > +
> 
> The variable declaration should be at the top of the block;

It is at the top of the block, the only thing before it is another
variable declaration. Would you prefer to have the empty line removed or
after the declaration to make this visually clearer?

Kevin




Re: [PATCH v3] target/i386: Restrict system-specific features from user emulation

2023-09-11 Thread Kevin Wolf
Am 11.09.2023 um 16:27 hat Philippe Mathieu-Daudé geschrieben:
> Since commits 3adce820cf ("target/i386: Remove unused KVM
> stubs") and ef1cf6890f ("target/i386: Allow elision of
> kvm_hv_vpindex_settable()"), when building on a x86 host
> configured as:
> 
>   $ ./configure --cc=clang \
> --target-list=x86_64-linux-user,x86_64-softmmu \
> --enable-debug
> 
> we get:
> 
>   [71/71] Linking target qemu-x86_64
>   FAILED: qemu-x86_64
>   /usr/bin/ld: libqemu-x86_64-linux-user.fa.p/target_i386_cpu.c.o: in 
> function `cpu_x86_cpuid':
>   cpu.c:(.text+0x1374): undefined reference to `kvm_arch_get_supported_cpuid'
>   /usr/bin/ld: libqemu-x86_64-linux-user.fa.p/target_i386_cpu.c.o: in 
> function `x86_cpu_filter_features':
>   cpu.c:(.text+0x81c2): undefined reference to `kvm_arch_get_supported_cpuid'
>   /usr/bin/ld: cpu.c:(.text+0x81da): undefined reference to 
> `kvm_arch_get_supported_cpuid'
>   /usr/bin/ld: cpu.c:(.text+0x81f2): undefined reference to 
> `kvm_arch_get_supported_cpuid'
>   /usr/bin/ld: cpu.c:(.text+0x820a): undefined reference to 
> `kvm_arch_get_supported_cpuid'
>   /usr/bin/ld: 
> libqemu-x86_64-linux-user.fa.p/target_i386_cpu.c.o:cpu.c:(.text+0x8225): more 
> undefined references to `kvm_arch_get_supported_cpuid' follow
>   clang: error: linker command failed with exit code 1 (use -v to see 
> invocation)
>   ninja: build stopped: subcommand failed.
> 
> libqemu-x86_64-linux-user.fa is user emulation specific, so
> having system emulation code called there is dubious.
> 
> '--enable-debug' disables optimizations (CFLAGS=-O0).
> 
> While at this (un)optimization level GCC eliminate the
> following dead code (CPP output of mentioned build):
> 
>  static void x86_cpu_get_supported_cpuid(uint32_t func, uint32_t index,
>  uint32_t *eax, uint32_t *ebx,
>  uint32_t *ecx, uint32_t *edx)
>  {
>  if ((0)) {
>  *eax = kvm_arch_get_supported_cpuid(kvm_state, func, index, R_EAX);
>  *ebx = kvm_arch_get_supported_cpuid(kvm_state, func, index, R_EBX);
>  *ecx = kvm_arch_get_supported_cpuid(kvm_state, func, index, R_ECX);
>  *edx = kvm_arch_get_supported_cpuid(kvm_state, func, index, R_EDX);
>  } else if (0) {
>  *eax = 0;
>  *ebx = 0;
>  *ecx = 0;
>  *edx = 0;
>  } else {
>  *eax = 0;
>  *ebx = 0;
>  *ecx = 0;
>  *edx = 0;
>  }
>  }
> 
> Clang does not.
> 
> Instead of trying to deal with compiler specific checks around
> __OPTIMIZE__ (see commit 2140cfa51d "i386: Fix build by providing
> stub kvm_arch_get_supported_cpuid()"), simply restrict code
> belonging to system emulation, easing user emulation linking.
> 
> Reported-by: Kevin Wolf 
> Signed-off-by: Philippe Mathieu-Daudé 

Can we make the function declarations in the header file for the
functions without stubs conditional on !CONFIG_USER_ONLY, too, so that
trying to call them would already fail during compilation (and also with
-O2), not only when linking without optimisations?

Kevin




Re: [PATCH v2,1/1] memory: avoid updating ioeventfds for some address_space

2023-09-11 Thread Peter Xu
On Mon, Sep 04, 2023 at 08:51:43PM +0800, hongmainquan wrote:
> 
> Friendly ping...
> Hello, this patch has already received a R-b from PeterXu. Could you please
> help me review it as well and see if there are any issues? If everything is
> fine, could you please consider merging it? Thank you!

Paolo, wanna pick this one up?

David, I know you're preparing a pull with a lot of memory changes, if you
like to pick this up it'll be good too.

If no one will pick it up, I can try to send a pull just to have this
patch, as long as it will work out.

Thanks,

-- 
Peter Xu




Re: [PATCH 1/7] bsd-user: spelling fixes

2023-09-11 Thread Warner Losh
On Mon, Sep 11, 2023, 10:12 AM Kyle Evans  wrote:

> On 9/11/23 03:39, Michael Tokarev wrote:
> > Warner, Kyle, can you take a look please?
> >
> >
> https://patchew.org/QEMU/20230909131258.354675-1-...@tls.msk.ru/20230909131258.354675-2-...@tls.msk.ru/
> >
>
> Hmm, the original for this doesn't seem to have landed in my inbox, but
> these all look OK to me.
>


Same. None of these will cause a conflict. I'll make sure any out of tree
files are updated... though cut and paste rarely flows from these files.


> > 09.09.2023 16:12, Michael Tokarev:
> >> Signed-off-by: Michael Tokarev 
> >> ---
> >>   bsd-user/errno_defs.h| 2 +-
> >>   bsd-user/freebsd/target_os_siginfo.h | 2 +-
> >>   bsd-user/freebsd/target_os_stack.h   | 4 ++--
> >>   bsd-user/freebsd/target_os_user.h| 2 +-
> >>   bsd-user/qemu.h  | 2 +-
> >>   bsd-user/signal-common.h | 4 ++--
> >>   bsd-user/signal.c| 6 +++---
> >>   7 files changed, 11 insertions(+), 11 deletions(-)
> >
>
> Reviewed-by: Kyle Evans 
>

Reviewed-by: Warner Losh 

Thanks,
>
> Kyle Evans
>
>


Re: [PATCH 1/7] bsd-user: spelling fixes

2023-09-11 Thread Kyle Evans

On 9/11/23 03:39, Michael Tokarev wrote:

Warner, Kyle, can you take a look please?

https://patchew.org/QEMU/20230909131258.354675-1-...@tls.msk.ru/20230909131258.354675-2-...@tls.msk.ru/



Hmm, the original for this doesn't seem to have landed in my inbox, but 
these all look OK to me.




09.09.2023 16:12, Michael Tokarev:

Signed-off-by: Michael Tokarev 
---
  bsd-user/errno_defs.h    | 2 +-
  bsd-user/freebsd/target_os_siginfo.h | 2 +-
  bsd-user/freebsd/target_os_stack.h   | 4 ++--
  bsd-user/freebsd/target_os_user.h    | 2 +-
  bsd-user/qemu.h  | 2 +-
  bsd-user/signal-common.h | 4 ++--
  bsd-user/signal.c    | 6 +++---
  7 files changed, 11 insertions(+), 11 deletions(-)




Reviewed-by: Kyle Evans 

Thanks,

Kyle Evans




Re: [PATCH 10/11] qom: Add object_property_set_default_list()

2023-09-11 Thread Peter Maydell
On Fri, 8 Sept 2023 at 15:37, Kevin Wolf  wrote:
>
> This function provides a default for properties that are accessed using
> the list visitor interface. The default is always an empty list.
>
> Signed-off-by: Kevin Wolf 

Reviewed-by: Peter Maydell 

thanks
-- PMM



Re: [PATCH 09/11] hw/rx/rx62n: Use qdev_prop_set_array()

2023-09-11 Thread Peter Maydell
On Fri, 8 Sept 2023 at 15:37, Kevin Wolf  wrote:
>
> Instead of manually setting "foo-len" and "foo[i]" properties, build a
> QList and use the new qdev_prop_set_array() helper to set the whole
> array property with a single call.
>
> Signed-off-by: Kevin Wolf 
> ---
>  hw/rx/rx62n.c | 19 ++-

Reviewed-by: Peter Maydell 

thanks
-- PMM



Re: [PATCH 08/11] hw/arm/xlnx-versal: Use qdev_prop_set_array()

2023-09-11 Thread Peter Maydell
On Fri, 8 Sept 2023 at 15:37, Kevin Wolf  wrote:
>
> Instead of manually setting "foo-len" and "foo[i]" properties, build a
> QList and use the new qdev_prop_set_array() helper to set the whole
> array property with a single call.
>
> Signed-off-by: Kevin Wolf 
> ---

Reviewed-by: Peter Maydell 

thanks
-- PMM



Re: [PATCH 07/11] hw/arm/virt: Use qdev_prop_set_array()

2023-09-11 Thread Peter Maydell
On Fri, 8 Sept 2023 at 15:39, Kevin Wolf  wrote:
>
> Instead of manually setting "foo-len" and "foo[i]" properties, build a
> QList and use the new qdev_prop_set_array() helper to set the whole
> array property with a single call.
>
> Signed-off-by: Kevin Wolf 
> ---

Reviewed-by: Peter Maydell 

thanks
-- PMM



Re: [PATCH 06/11] hw/arm/vexpress: Use qdev_prop_set_array()

2023-09-11 Thread Peter Maydell
On Fri, 8 Sept 2023 at 15:37, Kevin Wolf  wrote:
>
> Instead of manually setting "foo-len" and "foo[i]" properties, build a
> QList and use the new qdev_prop_set_array() helper to set the whole
> array property with a single call.
>
> Signed-off-by: Kevin Wolf 

Reviewed-by: Peter Maydell 

thanks
-- PMM



Re: [PATCH 05/11] hw/arm/sbsa-ref: Use qdev_prop_set_array()

2023-09-11 Thread Peter Maydell
On Fri, 8 Sept 2023 at 15:37, Kevin Wolf  wrote:
>
> Instead of manually setting "foo-len" and "foo[i]" properties, build a
> QList and use the new qdev_prop_set_array() helper to set the whole
> array property with a single call.
>
> Signed-off-by: Kevin Wolf 
> ---

Reviewed-by: Peter Maydell 

thanks
-- PMM



Re: [PATCH 04/11] hw/arm/mps2: Use qdev_prop_set_array()

2023-09-11 Thread Peter Maydell
On Fri, 8 Sept 2023 at 15:37, Kevin Wolf  wrote:
>
> Instead of manually setting "foo-len" and "foo[i]" properties, build a
> QList and use the new qdev_prop_set_array() helper to set the whole
> array property with a single call.
>
> Signed-off-by: Kevin Wolf 
> ---
>  hw/arm/mps2.c | 12 
>  1 file changed, 8 insertions(+), 4 deletions(-)

Reviewed-by: Peter Maydell 

thanks
-- PMM



Re: [PATCH 03/11] hw/arm/mps2-tz: Use qdev_prop_set_array()

2023-09-11 Thread Peter Maydell
On Fri, 8 Sept 2023 at 15:37, Kevin Wolf  wrote:
>
> Instead of manually setting "foo-len" and "foo[i]" properties, build a
> QList and use the new qdev_prop_set_array() helper to set the whole
> array property with a single call.
>
> Signed-off-by: Kevin Wolf 
> ---

Reviewed-by: Peter Maydell 

thanks
-- PMM



Re: [PATCH 02/11] hw/i386/pc: Use qdev_prop_set_array()

2023-09-11 Thread Peter Maydell
On Fri, 8 Sept 2023 at 15:37, Kevin Wolf  wrote:
>
> Instead of manually setting "foo-len" and "foo[i]" properties, build a
> QList and use the new qdev_prop_set_array() helper to set the whole
> array property with a single call.
>
> Signed-off-by: Kevin Wolf 
> ---
>  hw/i386/pc.c | 8 +---
>  1 file changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> index 54838c0c41..0e84e454cb 100644
> --- a/hw/i386/pc.c
> +++ b/hw/i386/pc.c
> @@ -81,6 +81,7 @@
>  #include "qapi/error.h"
>  #include "qapi/qapi-visit-common.h"
>  #include "qapi/qapi-visit-machine.h"
> +#include "qapi/qmp/qlist.h"
>  #include "qapi/visitor.h"
>  #include "hw/core/cpu.h"
>  #include "hw/usb.h"
> @@ -1508,9 +1509,10 @@ static void 
> pc_machine_device_pre_plug_cb(HotplugHandler *hotplug_dev,
>  char *resv_prop_str = g_strdup_printf("0xfee0:0xfeef:%d",
>VIRTIO_IOMMU_RESV_MEM_T_MSI);
>
> -object_property_set_uint(OBJECT(dev), "len-reserved-regions", 1, 
> errp);
> -object_property_set_str(OBJECT(dev), "reserved-regions[0]",
> -resv_prop_str, errp);
> +QList *reserved_regions = qlist_new();
> +qlist_append_str(reserved_regions, resv_prop_str);
> +qdev_prop_set_array(dev, "reserved-regions", reserved_regions);
> +

The variable declaration should be at the top of the block;
otherwise
Reviewed-by: Peter Maydell 

thanks
-- PMM



Re: [PATCH 01/11] qdev: Add qdev_prop_set_array()

2023-09-11 Thread Peter Maydell
On Fri, 8 Sept 2023 at 15:37, Kevin Wolf  wrote:
>
> Instead of exposing the ugly hack of how we represent arrays in qdev (a
> static "foo-len" property and after it is set, dynamically created
> "foo[i]" properties) to boards, add an interface that allows setting the
> whole array at once.
>
> Once all internal users of devices with array properties have been
> converted to use this function, we can change the implementation to move
> away from this hack.
>
> Signed-off-by: Kevin Wolf 

Reviewed-by: Peter Maydell 

thanks
-- PMM



Re: [PATCH 4/7] hw/net: spelling fixes

2023-09-11 Thread Peter Maydell
On Sat, 9 Sept 2023 at 14:15, Michael Tokarev  wrote:
>
> Signed-off-by: Michael Tokarev 
> ---

Reviewed-by: Peter Maydell 

PS: for a v2 patchset each patch email ought to have the
"PATCH v2" tag, not just the cover letter.

thanks
-- PMM



Re: [PATCH 5/7] hw/pci: spelling fixes

2023-09-11 Thread Peter Maydell
On Sat, 9 Sept 2023 at 14:18, Michael Tokarev  wrote:
>
> Signed-off-by: Michael Tokarev 


> @@ -503,7 +503,7 @@ static void designware_pcie_root_realize(PCIDevice *dev, 
> Error **errp)
>_pci_host_msi_ops,
>root, "pcie-msi", 0x4);
>  /*
> - * We initially place MSI interrupt I/O region a adress 0 and
> + * We initially place MSI interrupt I/O region a address 0 and

"at address 0"

>   * disable it. It'll be later moved to correct offset and enabled
>   * in designware_pcie_root_update_msi_mapping() as a part of
>   * initialization done by guest OS

> diff --git a/hw/pci-host/gpex-acpi.c b/hw/pci-host/gpex-acpi.c
> index 7c7316bc96..87ba074254 100644
> --- a/hw/pci-host/gpex-acpi.c
> +++ b/hw/pci-host/gpex-acpi.c
> @@ -177,7 +177,7 @@ void acpi_dsdt_add_gpex(Aml *scope, struct GPEXConfig 
> *cfg)
>  acpi_dsdt_add_pci_route_table(dev, cfg->irq);
>
>  /*
> - * Resources defined for PXBs are composed by the folling parts:
> + * Resources defined for PXBs are composed by the following 
> parts:

Should be "composed of", while we're editing the line.

>   * 1. The resources the pci-brige/pcie-root-port need.
>   * 2. The resources the devices behind pxb need.
>   */

otherwise
Reviewed-by: Peter Maydell 

thanks
-- PMM



Re: [PATCH 7/7] hw/other: spelling fixes

2023-09-11 Thread Peter Maydell
On Sat, 9 Sept 2023 at 14:16, Michael Tokarev  wrote:
>
> Signed-off-by: Michael Tokarev 


> diff --git a/hw/misc/trace-events b/hw/misc/trace-events
> index e8b2be14c0..bc87cd3670 100644
> --- a/hw/misc/trace-events
> +++ b/hw/misc/trace-events
> @@ -155,7 +155,7 @@ stm32f4xx_syscfg_read(uint64_t addr) "reg read: addr: 
> 0x%" PRIx64 " "
>  stm32f4xx_syscfg_write(uint64_t addr, uint64_t data) "reg write: addr: 0x%" 
> PRIx64 " val: 0x%" PRIx64 ""
>
>  # stm32f4xx_exti.c
> -stm32f4xx_exti_set_irq(int irq, int leve) "Set EXTI: %d to %d"
> +stm32f4xx_exti_set_irq(int irq, int level) "Set EXTI: %d to %d"
>  stm32f4xx_exti_read(uint64_t addr) "reg read: addr: 0x%" PRIx64 " "
>  stm32f4xx_exti_write(uint64_t addr, uint64_t data) "reg write: addr: 0x%" 
> PRIx64 " val: 0x%" PRIx64 ""

Unlike all the other changes, this is a code change, not a
comment change. But it's OK, so:

Reviewed-by: Peter Maydell 

thanks
-- PMM



Re: [PATCH 2/7] i386: spelling fixes

2023-09-11 Thread Peter Maydell
On Sat, 9 Sept 2023 at 14:17, Michael Tokarev  wrote:
>
> Signed-off-by: Michael Tokarev 
> ---
>  host/include/i386/host/cpuinfo.h | 2 +-
>  hw/i386/acpi-build.c | 4 ++--
>  hw/i386/amd_iommu.c  | 4 ++--
>  hw/i386/intel_iommu.c| 4 ++--
>  hw/i386/kvm/xen_xenstore.c   | 2 +-
>  hw/i386/kvm/xenstore_impl.c  | 2 +-
>  hw/i386/pc.c | 4 ++--
>  include/hw/i386/topology.h   | 2 +-
>  target/i386/cpu.c| 4 ++--
>  target/i386/cpu.h| 4 ++--
>  target/i386/kvm/kvm.c| 4 ++--
>  target/i386/kvm/xen-emu.c| 2 +-
>  target/i386/machine.c| 4 ++--
>  target/i386/tcg/translate.c  | 8 
>  tests/tcg/i386/system/boot.S | 2 +-
>  tests/tcg/i386/x86.csv   | 2 +-
>  16 files changed, 27 insertions(+), 27 deletions(-)

> diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
> index bb12b0ad43..1afcef5937 100644
> --- a/hw/i386/acpi-build.c
> +++ b/hw/i386/acpi-build.c
> @@ -779,7 +779,7 @@ static Aml *initialize_route(Aml *route, const char 
> *link_name,
>   *
>   * Returns an array of 128 routes, one for each device,
>   * based on device location.
> - * The main goal is to equaly distribute the interrupts
> + * The main goal is to equally distribute the interrupts
>   * over the 4 existing ACPI links (works only for i440fx).
>   * The hash function is  (slot + pin) & 3 -> "LNK[D|A|B|C]".
>   *
> @@ -2079,7 +2079,7 @@ build_srat(GArray *table_data, BIOSLinker *linker, 
> MachineState *machine)
>  }
>
>  /*
> - * Insert DMAR scope for PCI bridges and endpoint devcie
> + * Insert DMAR scope for PCI bridges and endpoint device

I suspect this should be "devices" plural.

>   */
>  static void
>  insert_scope(PCIBus *bus, PCIDevice *dev, void *opaque)

>  static void ioportF0_write(void *opaque, hwaddr addr, uint64_t data,
> unsigned size)
>  {

> --- a/target/i386/kvm/kvm.c
> +++ b/target/i386/kvm/kvm.c
> @@ -4729,7 +4729,7 @@ int kvm_arch_put_registers(CPUState *cpu, int level)
>  /*
>   * Put MSR_IA32_FEATURE_CONTROL first, this ensures the VM gets out of 
> VMX
>   * root operation upon vCPU reset. kvm_put_msr_feature_control() should 
> also
> - * preceed kvm_put_nested_state() when 'real' nested state is set.
> + * proceed kvm_put_nested_state() when 'real' nested state is set.

Should be "precede".

>   */
>  if (level >= KVM_PUT_RESET_STATE) {
>  ret = kvm_put_msr_feature_control(x86_cpu);

Otherwise
Reviewed-by: Peter Maydell 

thanks
-- PMM



Re: [PULL v2 00/15] Block layer patches

2023-09-11 Thread Stefan Hajnoczi
Applied, thanks.

Please update the changelog at https://wiki.qemu.org/ChangeLog/8.2 for any 
user-visible changes.


signature.asc
Description: PGP signature


Re: [PULL v2 00/45] riscv-to-apply queue

2023-09-11 Thread Stefan Hajnoczi
Applied, thanks.

Please update the changelog at https://wiki.qemu.org/ChangeLog/8.2 for any 
user-visible changes.


signature.asc
Description: PGP signature


Re: [PULL 00/13] vfio queue

2023-09-11 Thread Stefan Hajnoczi
Applied, thanks.

Please update the changelog at https://wiki.qemu.org/ChangeLog/8.2 for any 
user-visible changes.


signature.asc
Description: PGP signature


Re: [PULL 00/26] target-arm queue

2023-09-11 Thread Stefan Hajnoczi
Applied, thanks.

Please update the changelog at https://wiki.qemu.org/ChangeLog/8.2 for any 
user-visible changes.


signature.asc
Description: PGP signature


Re: [PULL 24/51] meson: compile bundled device trees

2023-09-11 Thread Peter Maydell
On Fri, 8 Sept 2023 at 21:08, BALATON Zoltan  wrote:
>
> On Fri, 8 Sep 2023, Michael Tokarev wrote:
> > 08.09.2023 22:21, BALATON Zoltan:
> >> I was about to ask, since when but probably nobody knows then. AFAIR I had
> >> no such errors for the canyonlands one when I've added it but that was
> >> quite some years ago and things in dtc for example could have changed so it
> >> now gives these warnings.
> >
> > I think it can be said based on debian build logs. Lemme see..
> >
> > https://buildd.debian.org/status/logs.php?pkg=qemu=all
> >
> > The first log entry there is from 2018-12-12, for qemu 3.1, dtc 1.4.7.
> > It has:
> >
> > dtc -o b/qemu/pc-bios/bamboo.dtb pc-bios/bamboo.dts
> > b/qemu/pc-bios/bamboo.dtb: Warning (unit_address_vs_reg): /memory: node has 
> > a
> > reg or ranges property, but no unit name
> > b/qemu/pc-bios/bamboo.dtb: Warning (unit_address_vs_reg): /plb/opb: node has
> > a reg or ranges property, but no unit name
> > b/qemu/pc-bios/bamboo.dtb: Warning (chosen_node_stdout_path):
> > /chosen:linux,stdout-path: Use 'stdout-path' instead
> > b/qemu/pc-bios/bamboo.dtb: Warning (interrupts_property): /plb/opb: Missing
> > interrupt-parent
> > b/qemu/pc-bios/bamboo.dtb: Warning (interrupts_property): /plb/opb/ebc:
> > Missing interrupt-parent
>
>
> OK so bamboo was likely always like that. Sam460ex (aka canyonlands which
> is the devel board it is based on) was added in February 2018 so that was
> OK back then but later dtc versions may have become pickier somewhere
> between 1.4.7 and 1.6.0.

Yeah, at some point dtc started warning about a lot of these
things, I think in the hope that people writing dt files would
fix them before the dt got into wide circulation.

> > next it was moved to one of the subpackages, and moved back to
> > arch-independent package in 6.2 (2022-01-09, dtc 1.6.0), which has:
> >
> > dtc -o b/misc/bamboo.dtb pc-bios/bamboo.dts
> > pc-bios/bamboo.dts:45.9-48.4: Warning (unit_address_vs_reg): /memory: node
> > has a reg or ranges property, but no unit name
> > pc-bios/bamboo.dts:87.13-154.5: Warning (unit_address_vs_reg): /plb/opb: 
> > node
> > has a reg or ranges property, but no unit name
> > pc-bios/bamboo.dts:198.3-50: Warning (chosen_node_stdout_path):
> > /chosen:linux,stdout-path: Use 'stdout-path' instead
> > pc-bios/bamboo.dts:87.13-154.5: Warning (interrupts_property): /plb/opb:
> > Missing interrupt-parent
> > pc-bios/bamboo.dts:100.14-108.6: Warning (interrupts_property): 
> > /plb/opb/ebc:
> > Missing interrupt-parent
> > dtc -o b/misc/canyonlands.dtb pc-bios/canyonlands.dts
> > pc-bios/canyonlands.dts:47.9-50.4: Warning (unit_address_vs_reg): /memory:
> > node has a reg or ranges property, but no unit name
> > pc-bios/canyonlands.dts:210.13-429.5: Warning (unit_address_vs_reg):
> > /plb/opb: node has a reg or ranges property, but no unit name
> > pc-bios/canyonlands.dts:464.26-504.5: Warning (pci_bridge):
> > /plb/pciex@d: node name is not "pci" or "pcie"
> > pc-bios/canyonlands.dts:506.26-546.5: Warning (pci_bridge):
> > /plb/pciex@d2000: node name is not "pci" or "pcie"
>
> Linux has this in arch/powerpc/boot/dts/canyonlands.dts and at least had a
> change of the pciex names to pcie that should fix some of these but if the
> u-boot still uses older names then could updating this result in different
> results between using -kernel and without that? I don't know how guests
> use the dtb so can't tell what to do but keeping it consistent with the
> older u-boot this board has seems like a safer option.

At least in theory, guest code looking at a device tree blob
should be searching it by 'compatible' property name, not
by the name of the node itself. So changing node names is
a fairly safe change, though as you say the absolute safest
thing would be to not change at all.

Did we get these dts file from some other "upstream" source?
If so then the best thing is probably either to just update
from that upstream source again, or else say "these aren't
files the QEMU project is writing, so there's no point in
having the compile process warning about them, turn all the
dtc warnings off using the '-q' option".

-- PMM



Re: [PATCH v2] target/i386: Re-introduce few KVM stubs for Clang debug builds

2023-09-11 Thread Philippe Mathieu-Daudé

On 11/9/23 15:15, Philippe Mathieu-Daudé wrote:

Since commits 3adce820cf..ef1cf6890f, When building on
a x86 host configured as:

   $ ./configure --cc=clang \
 --target-list=x86_64-linux-user,x86_64-softmmu \
 --enable-debug

we get:

   [71/71] Linking target qemu-x86_64
   FAILED: qemu-x86_64
   /usr/bin/ld: libqemu-x86_64-linux-user.fa.p/target_i386_cpu.c.o: in function 
`cpu_x86_cpuid':
   cpu.c:(.text+0x1374): undefined reference to `kvm_arch_get_supported_cpuid'
   /usr/bin/ld: libqemu-x86_64-linux-user.fa.p/target_i386_cpu.c.o: in function 
`x86_cpu_filter_features':
   cpu.c:(.text+0x81c2): undefined reference to `kvm_arch_get_supported_cpuid'
   /usr/bin/ld: cpu.c:(.text+0x81da): undefined reference to 
`kvm_arch_get_supported_cpuid'
   /usr/bin/ld: cpu.c:(.text+0x81f2): undefined reference to 
`kvm_arch_get_supported_cpuid'
   /usr/bin/ld: cpu.c:(.text+0x820a): undefined reference to 
`kvm_arch_get_supported_cpuid'
   /usr/bin/ld: 
libqemu-x86_64-linux-user.fa.p/target_i386_cpu.c.o:cpu.c:(.text+0x8225): more 
undefined references to `kvm_arch_get_supported_cpuid' follow
   clang: error: linker command failed with exit code 1 (use -v to see 
invocation)
   ninja: build stopped: subcommand failed.

'--enable-debug' disables optimizations (CFLAGS=-O0).

While at this (un)optimization level GCC eliminate the
following dead code:

   if (0 && foo()) {
   ...
   }

Clang does not. This was previously documented in commit 2140cfa51d
("i386: Fix build by providing stub kvm_arch_get_supported_cpuid()").

Fix by partially reverting those commits, restoring a pair of stubs
for the unoptimized Clang builds.

Reported-by: Kevin Wolf 
Suggested-by: Daniel P. Berrangé 
Fixes: 3adce820cf ("target/i386: Remove unused KVM stubs")
Fixes: ef1cf6890f ("target/i386: Allow elision of kvm_hv_vpindex_settable()")
Signed-off-by: Philippe Mathieu-Daudé 
---
  target/i386/kvm/kvm-stub.c  | 31 +++
  target/i386/kvm/meson.build |  2 ++
  2 files changed, 33 insertions(+)
  create mode 100644 target/i386/kvm/kvm-stub.c


Patch superseded, see v3:
https://lore.kernel.org/qemu-devel/20230911142729.25548-1-phi...@linaro.org/



Re: [PATCH 3/7] linux-user/elfload.c: Add missing arm and arm64 hwcap values

2023-09-11 Thread Peter Maydell
On Mon, 11 Sept 2023 at 15:59, Philippe Mathieu-Daudé  wrote:
>
> On 11/9/23 15:53, Peter Maydell wrote:
> > Our lists of Arm 32 and 64 bit hwcap values have lagged behind
> > the Linux kernel. Update them to include all the bits defined
> > as of upstream Linux git commit a48fa7efaf1161c1 (in the middle
> > of the kernel 6.6 dev cycle).
> >
> > For 64-bit, we don't yet implement any of the features reported via
> > these hwcap bits.  For 32-bit we do in fact already implement them
> > all; we'll add the code to set them in a subsequent commit.
> >
> > Signed-off-by: Peter Maydell 
> > ---
> >   linux-user/elfload.c | 44 
> >   1 file changed, 44 insertions(+)
>
> Reviewed-by: Philippe Mathieu-Daudé 
>
> Why can't we import asm/hwcap.h with update-linux-headers.sh?

Probably no inherent reason, but historically linux-user has
always defined its own versions of target architecture
kernel header values and structures.

-- PMM



Re: [PATCH] ui: fix crash when there are no active_console

2023-09-11 Thread Albert Esteve
On Mon, Sep 11, 2023 at 4:42 PM Albert Esteve  wrote:

>
>
> On Mon, Sep 11, 2023 at 4:08 PM  wrote:
>
>> From: Marc-André Lureau 
>>
>> Thread 1 "qemu-system-x86" received signal SIGSEGV, Segmentation fault.
>> 0x55888630 in dpy_ui_info_supported (con=0x0) at
>> ../ui/console.c:812
>> 812 return con->hw_ops->ui_info != NULL;
>> (gdb) bt
>> #0  0x55888630 in dpy_ui_info_supported (con=0x0) at
>> ../ui/console.c:812
>> #1  0x558a44b1 in protocol_client_msg (vs=0x578c76c0,
>> data=0x581e93f0 , len=24) at ../ui/vnc.c:2585
>> #2  0x558a19ac in vnc_client_read (vs=0x578c76c0) at
>> ../ui/vnc.c:1607
>> #3  0x558a1ac2 in vnc_client_io (ioc=0x581eb0e0,
>> condition=G_IO_IN, opaque=0x578c76c0) at ../ui/vnc.c:1635
>>
>> Fixes:
>> https://issues.redhat.com/browse/RHEL-2600
>>
>> Signed-off-by: Marc-André Lureau 
>> ---
>>  ui/console.c | 25 +
>>  1 file changed, 25 insertions(+)
>>
>> diff --git a/ui/console.c b/ui/console.c
>> index 90ae4be602..0f31ecece6 100644
>> --- a/ui/console.c
>> +++ b/ui/console.c
>> @@ -808,6 +808,9 @@ bool dpy_ui_info_supported(QemuConsole *con)
>>  if (con == NULL) {
>>  con = active_console;
>>  }
>> +if (con == NULL) {
>> +return false;
>> +}
>>
>>  return con->hw_ops->ui_info != NULL;
>>  }
>> @@ -817,6 +820,9 @@ const QemuUIInfo *dpy_get_ui_info(const QemuConsole
>> *con)
>>  if (con == NULL) {
>>  con = active_console;
>>  }
>> +if (con == NULL) {
>> +return NULL;
>> +}
>>
>>  return >ui_info;
>>  }
>> @@ -826,6 +832,9 @@ int dpy_set_ui_info(QemuConsole *con, QemuUIInfo
>> *info, bool delay)
>>  if (con == NULL) {
>>  con = active_console;
>>  }
>> +if (con == NULL) {
>> +return -1;
>> +}
>>
>>  if (!dpy_ui_info_supported(con)) {
>>  return -1;
>> @@ -1401,6 +1410,10 @@ QEMUCursor *qemu_console_get_cursor(QemuConsole
>> *con)
>>  if (con == NULL) {
>>  con = active_console;
>>  }
>> +if (con == NULL) {
>> +return NULL;
>> +}
>> +
>>  return QEMU_IS_GRAPHIC_CONSOLE(con) ?
>> QEMU_GRAPHIC_CONSOLE(con)->cursor : NULL;
>>  }
>>
>> @@ -1414,6 +1427,10 @@ bool qemu_console_is_graphic(QemuConsole *con)
>>  if (con == NULL) {
>>  con = active_console;
>>  }
>> +if (con == NULL) {
>> +return false;
>> +}
>> +
>>
>
I had miss this one before:
```
return QEMU_IS_GRAPHIC_CONSOLE(con);
```

Regards,
Albert


>  return con && QEMU_IS_GRAPHIC_CONSOLE(con);
>>  }
>>
>> @@ -1422,6 +1439,10 @@ bool qemu_console_is_fixedsize(QemuConsole *con)
>>  if (con == NULL) {
>>  con = active_console;
>>  }
>> +if (con == NULL) {
>> +return false;
>> +}
>> +
>>
>
> The "con" initialization condition is already checked in the line below
> (now unnecessarily).
> If the if block is preferred for consistency with other functions, we
> could remove the con check from
> the condition below:
> ```
> return QEMU_IS_GRAPHIC_CONSOLE(con) || QEMU_IS_FIXED_TEXT_CONSOLE(con);
> ```
>
>
>>  return con && (QEMU_IS_GRAPHIC_CONSOLE(con) ||
>> QEMU_IS_FIXED_TEXT_CONSOLE(con));
>>  }
>>
>> @@ -1493,6 +1514,10 @@ int qemu_console_get_index(QemuConsole *con)
>>  if (con == NULL) {
>>  con = active_console;
>>  }
>> +if (con == NULL) {
>> +return -1;
>> +}
>> +
>>
>
> Same as before, here we could simply "return con->index;"
>
>
>>  return con ? con->index : -1;
>
>  }
>>
>> --
>> 2.41.0
>>
>>
>>


Re: [PATCH 7/7] target/arm: Implement FEAT_HBC

2023-09-11 Thread Philippe Mathieu-Daudé

On 11/9/23 15:53, Peter Maydell wrote:

FEAT_HBC (Hinted conditional branches) provides a new instruction
BC.cond, which behaves exactly like the existing B.cond except
that it provides a hint to the branch predictor about the
likely behaviour of the branch.

Since QEMU does not implement branch prediction, we can treat
this identically to B.cond.

Signed-off-by: Peter Maydell 
---
  docs/system/arm/emulation.rst  | 1 +
  target/arm/cpu.h   | 5 +
  target/arm/tcg/a64.decode  | 3 ++-
  linux-user/elfload.c   | 1 +
  target/arm/tcg/cpu64.c | 4 
  target/arm/tcg/translate-a64.c | 4 
  6 files changed, 17 insertions(+), 1 deletion(-)


Reviewed-by: Philippe Mathieu-Daudé 




Re: [PATCH 3/7] linux-user/elfload.c: Add missing arm and arm64 hwcap values

2023-09-11 Thread Philippe Mathieu-Daudé

On 11/9/23 15:53, Peter Maydell wrote:

Our lists of Arm 32 and 64 bit hwcap values have lagged behind
the Linux kernel. Update them to include all the bits defined
as of upstream Linux git commit a48fa7efaf1161c1 (in the middle
of the kernel 6.6 dev cycle).

For 64-bit, we don't yet implement any of the features reported via
these hwcap bits.  For 32-bit we do in fact already implement them
all; we'll add the code to set them in a subsequent commit.

Signed-off-by: Peter Maydell 
---
  linux-user/elfload.c | 44 
  1 file changed, 44 insertions(+)


Reviewed-by: Philippe Mathieu-Daudé 

Why can't we import asm/hwcap.h with update-linux-headers.sh?



  1   2   3   4   >