[PULL 11/63] hw/virtio: move stubs out of stubs/

2024-04-23 Thread Paolo Bonzini
Since the virtio memory device stubs are needed exactly when the
Kconfig symbol is not enabled, they can be placed in hw/virtio/ and
conditionalized on CONFIG_VIRTIO_MD.

Signed-off-by: Paolo Bonzini 
Reviewed-by: Richard Henderson 
Message-ID: <20240408155330.522792-12-pbonz...@redhat.com>
Signed-off-by: Paolo Bonzini 
---
 stubs/virtio-md-pci.c => hw/virtio/virtio-md-stubs.c | 0
 hw/virtio/meson.build| 2 ++
 stubs/meson.build| 1 -
 3 files changed, 2 insertions(+), 1 deletion(-)
 rename stubs/virtio-md-pci.c => hw/virtio/virtio-md-stubs.c (100%)

diff --git a/stubs/virtio-md-pci.c b/hw/virtio/virtio-md-stubs.c
similarity index 100%
rename from stubs/virtio-md-pci.c
rename to hw/virtio/virtio-md-stubs.c
diff --git a/hw/virtio/meson.build b/hw/virtio/meson.build
index d7f18c96e60..621fc65454c 100644
--- a/hw/virtio/meson.build
+++ b/hw/virtio/meson.build
@@ -87,6 +87,8 @@ specific_virtio_ss.add_all(when: 'CONFIG_VIRTIO_PCI', 
if_true: virtio_pci_ss)
 system_ss.add_all(when: 'CONFIG_VIRTIO', if_true: system_virtio_ss)
 system_ss.add(when: 'CONFIG_VIRTIO', if_false: files('vhost-stub.c'))
 system_ss.add(when: 'CONFIG_VIRTIO', if_false: files('virtio-stub.c'))
+system_ss.add(when: 'CONFIG_VIRTIO_MD', if_false: files('virtio-md-stubs.c'))
+
 system_ss.add(files('virtio-hmp-cmds.c'))
 
 specific_ss.add_all(when: 'CONFIG_VIRTIO', if_true: specific_virtio_ss)
diff --git a/stubs/meson.build b/stubs/meson.build
index 45616afbfaa..60e32d363fa 100644
--- a/stubs/meson.build
+++ b/stubs/meson.build
@@ -57,7 +57,6 @@ if have_system
   stub_ss.add(files('fw_cfg.c'))
   stub_ss.add(files('semihost.c'))
   stub_ss.add(files('xen-hw-stub.c'))
-  stub_ss.add(files('virtio-md-pci.c'))
 else
   stub_ss.add(files('qdev.c'))
 endif
-- 
2.44.0





[PULL 45/63] i386/sev: Add 'legacy-vm-type' parameter for SEV guest objects

2024-04-23 Thread Paolo Bonzini
From: Michael Roth 

QEMU currently makes automatic use of the KVM_SEV_INIT2 API for
initializing SEV and SEV-ES guests, rather than the older
KVM_SEV_INIT/KVM_SEV_ES_INIT interfaces.

However, the older interfaces silently avoid sync'ing FPU/XSAVE
state to the VMSA prior to encryption, thus relying on behavior and
measurements that assume the related fields to be all zero.

With KVM_SEV_INIT2, this state is now synced into the VMSA, resulting in
measurement changes and, theoretically, behavioral changes, though the
latter are unlikely to be seen in practice.

To allow a smooth transition to the newer interface, while still
providing a mechanism to maintain backward compatibility with VMs
created using the older interfaces, provide a new command-line
parameter:

  -object sev-guest,legacy-vm-type=true,...

and have it default to false.

Signed-off-by: Michael Roth 
Message-ID: <20240409230743.962513-2-michael.r...@amd.com>
Signed-off-by: Paolo Bonzini 
---
 qapi/qom.json | 11 ++-
 target/i386/sev.c | 18 +-
 2 files changed, 27 insertions(+), 2 deletions(-)

diff --git a/qapi/qom.json b/qapi/qom.json
index 85e6b4f84a2..38dde6d785a 100644
--- a/qapi/qom.json
+++ b/qapi/qom.json
@@ -898,6 +898,14 @@
 # designated guest firmware page for measured boot with -kernel
 # (default: false) (since 6.2)
 #
+# @legacy-vm-type: Use legacy KVM_SEV_INIT KVM interface for creating the VM.
+#  The newer KVM_SEV_INIT2 interface syncs additional vCPU
+#  state when initializing the VMSA structures, which will
+#  result in a different guest measurement. Set this to
+#  maintain compatibility with older QEMU or kernel versions
+#  that rely on legacy KVM_SEV_INIT behavior.
+#  (default: false) (since 9.1)
+#
 # Since: 2.12
 ##
 { 'struct': 'SevGuestProperties',
@@ -908,7 +916,8 @@
 '*handle': 'uint32',
 '*cbitpos': 'uint32',
 'reduced-phys-bits': 'uint32',
-'*kernel-hashes': 'bool' } }
+'*kernel-hashes': 'bool',
+'*legacy-vm-type': 'bool' } }
 
 ##
 # @ThreadContextProperties:
diff --git a/target/i386/sev.c b/target/i386/sev.c
index 9dab4060b84..f4ee317cb03 100644
--- a/target/i386/sev.c
+++ b/target/i386/sev.c
@@ -67,6 +67,7 @@ struct SevGuestState {
 uint32_t cbitpos;
 uint32_t reduced_phys_bits;
 bool kernel_hashes;
+bool legacy_vm_type;
 
 /* runtime state */
 uint32_t handle;
@@ -356,6 +357,16 @@ static void sev_guest_set_kernel_hashes(Object *obj, bool 
value, Error **errp)
 sev->kernel_hashes = value;
 }
 
+static bool sev_guest_get_legacy_vm_type(Object *obj, Error **errp)
+{
+return SEV_GUEST(obj)->legacy_vm_type;
+}
+
+static void sev_guest_set_legacy_vm_type(Object *obj, bool value, Error **errp)
+{
+SEV_GUEST(obj)->legacy_vm_type = value;
+}
+
 bool
 sev_enabled(void)
 {
@@ -863,7 +874,7 @@ static int sev_kvm_type(X86ConfidentialGuest *cg)
 }
 
 kvm_type = (sev->policy & SEV_POLICY_ES) ? KVM_X86_SEV_ES_VM : 
KVM_X86_SEV_VM;
-if (kvm_is_vm_type_supported(kvm_type)) {
+if (kvm_is_vm_type_supported(kvm_type) && !sev->legacy_vm_type) {
 sev->kvm_type = kvm_type;
 } else {
 sev->kvm_type = KVM_X86_DEFAULT_VM;
@@ -1381,6 +1392,11 @@ sev_guest_class_init(ObjectClass *oc, void *data)
sev_guest_set_kernel_hashes);
 object_class_property_set_description(oc, "kernel-hashes",
 "add kernel hashes to guest firmware for measured Linux boot");
+object_class_property_add_bool(oc, "legacy-vm-type",
+   sev_guest_get_legacy_vm_type,
+   sev_guest_set_legacy_vm_type);
+object_class_property_set_description(oc, "legacy-vm-type",
+"use legacy VM type to maintain measurement compatibility with 
older QEMU or kernel versions.");
 }
 
 static void
-- 
2.44.0
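
The selection logic changed by the sev_kvm_type() hunk above can be sketched
in isolation as follows. This is only a minimal illustration: the enum
values, the policy bit and the function name are stand-ins chosen for the
sketch (assumptions, not taken from the KVM headers), and the kernel
capability check is reduced to a boolean argument.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative stand-ins; the real constants live in the KVM headers. */
enum sketch_vm_type {
    SKETCH_DEFAULT_VM = 0,
    SKETCH_SEV_VM     = 2,
    SKETCH_SEV_ES_VM  = 3,
};

#define SKETCH_POLICY_ES 0x4u   /* hypothetical "SEV-ES requested" policy bit */

/*
 * Use the dedicated SEV/SEV-ES VM type (and with it the newer init
 * interface) only when the kernel supports that type and the user did
 * not ask for legacy behaviour; otherwise fall back to the default VM
 * type, which keeps the old measurement.
 */
static int sketch_pick_vm_type(uint32_t policy, bool type_supported,
                               bool legacy_vm_type)
{
    int wanted = (policy & SKETCH_POLICY_ES) ? SKETCH_SEV_ES_VM
                                             : SKETCH_SEV_VM;

    if (type_supported && !legacy_vm_type) {
        return wanted;
    }
    return SKETCH_DEFAULT_VM;
}
```

With legacy-vm-type=true the function returns the default type even on a
kernel that supports the new VM types, which is the compatibility case the
commit message describes.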





[PULL 14/63] memory-device: move stubs out of stubs/

2024-04-23 Thread Paolo Bonzini
Since the memory-device stubs are needed exactly when the Kconfig symbols are
not enabled, move them to hw/mem/.

Signed-off-by: Paolo Bonzini 
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Richard Henderson 
Message-ID: <20240408155330.522792-15-pbonz...@redhat.com>
Signed-off-by: Paolo Bonzini 
---
 stubs/memory_device.c => hw/mem/memory-device-stubs.c | 0
 hw/mem/meson.build| 1 +
 stubs/meson.build | 1 -
 3 files changed, 1 insertion(+), 1 deletion(-)
 rename stubs/memory_device.c => hw/mem/memory-device-stubs.c (100%)

diff --git a/stubs/memory_device.c b/hw/mem/memory-device-stubs.c
similarity index 100%
rename from stubs/memory_device.c
rename to hw/mem/memory-device-stubs.c
diff --git a/hw/mem/meson.build b/hw/mem/meson.build
index faee1fe9360..1c1c6da24b5 100644
--- a/hw/mem/meson.build
+++ b/hw/mem/meson.build
@@ -6,6 +6,7 @@ mem_ss.add(when: 'CONFIG_NVDIMM', if_true: files('nvdimm.c'))
 mem_ss.add(when: 'CONFIG_CXL_MEM_DEVICE', if_true: files('cxl_type3.c'))
 system_ss.add(when: 'CONFIG_CXL_MEM_DEVICE', if_false: 
files('cxl_type3_stubs.c'))
 
+system_ss.add(when: 'CONFIG_MEM_DEVICE', if_false: 
files('memory-device-stubs.c'))
 system_ss.add_all(when: 'CONFIG_MEM_DEVICE', if_true: mem_ss)
 
 system_ss.add(when: 'CONFIG_SPARSE_MEM', if_true: files('sparse-mem.c'))
diff --git a/stubs/meson.build b/stubs/meson.build
index 92887660e41..a4404e765ab 100644
--- a/stubs/meson.build
+++ b/stubs/meson.build
@@ -31,7 +31,6 @@ stub_ss.add(files('monitor.c'))
 stub_ss.add(files('monitor-core.c'))
 stub_ss.add(files('physmem.c'))
 stub_ss.add(files('qemu-timer-notify-cb.c'))
-stub_ss.add(files('memory_device.c'))
 stub_ss.add(files('qmp-command-available.c'))
 stub_ss.add(files('qmp-quit.c'))
 stub_ss.add(files('qtest.c'))
-- 
2.44.0





[PULL 58/63] target/i386/host-cpu: Consolidate the use of warn_report_once()

2024-04-23 Thread Paolo Bonzini
From: Zhao Liu 

Use warn_report_once() to get rid of the static local variable "warned".

Signed-off-by: Zhao Liu 
Message-ID: <20240327103951.3853425-2-zhao1@linux.intel.com>
Signed-off-by: Paolo Bonzini 
---
 target/i386/host-cpu.c | 11 ---
 1 file changed, 4 insertions(+), 7 deletions(-)

diff --git a/target/i386/host-cpu.c b/target/i386/host-cpu.c
index 92ecb7254b8..280e427c017 100644
--- a/target/i386/host-cpu.c
+++ b/target/i386/host-cpu.c
@@ -55,18 +55,15 @@ static uint32_t host_cpu_adjust_phys_bits(X86CPU *cpu)
 {
 uint32_t host_phys_bits = host_cpu_phys_bits();
 uint32_t phys_bits = cpu->phys_bits;
-static bool warned;
 
 /*
  * Print a warning if the user set it to a value that's not the
  * host value.
  */
-if (phys_bits != host_phys_bits && phys_bits != 0 &&
-!warned) {
-warn_report("Host physical bits (%u)"
-" does not match phys-bits property (%u)",
-host_phys_bits, phys_bits);
-warned = true;
+if (phys_bits != host_phys_bits && phys_bits != 0) {
+warn_report_once("Host physical bits (%u)"
+ " does not match phys-bits property (%u)",
+ host_phys_bits, phys_bits);
 }
 
 if (cpu->host_phys_bits) {
-- 
2.44.0
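
The pattern the patch above relies on, a once-only warning whose "already
printed" flag is hidden inside the call site, can be sketched like this.
SKETCH_WARN_ONCE and the counter are hypothetical names made up for the
example; this is not QEMU's definition of warn_report_once(), only an
illustration of why the caller no longer needs its own static variable.

```c
#include <stdbool.h>
#include <stdio.h>

/* Counts emitted warnings so the behaviour can be observed in a test. */
static int sketch_warnings_printed;

static void sketch_warn(unsigned host_bits, unsigned prop_bits)
{
    fprintf(stderr, "warning: Host physical bits (%u) does not match "
                    "phys-bits property (%u)\n", host_bits, prop_bits);
    sketch_warnings_printed++;
}

/* Each expansion site gets its own static flag, so it warns at most once. */
#define SKETCH_WARN_ONCE(host_bits, prop_bits)      \
    do {                                            \
        static bool printed_;                       \
        if (!printed_) {                            \
            printed_ = true;                        \
            sketch_warn((host_bits), (prop_bits));  \
        }                                           \
    } while (0)

static void sketch_check_phys_bits(unsigned host_bits, unsigned phys_bits)
{
    /* Warn only if the user set a value that differs from the host's. */
    if (phys_bits != host_bits && phys_bits != 0) {
        SKETCH_WARN_ONCE(host_bits, phys_bits);
    }
}
```

Calling sketch_check_phys_bits() repeatedly with mismatching values emits a
single warning; the condition itself stays free of bookkeeping.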





[PULL 38/63] linux-headers: update to current kvm/next

2024-04-23 Thread Paolo Bonzini
Signed-off-by: Paolo Bonzini 
---
 include/standard-headers/asm-x86/bootparam.h  |  17 +-
 include/standard-headers/asm-x86/kvm_para.h   |   3 +-
 include/standard-headers/asm-x86/setup_data.h |  83 +++
 include/standard-headers/linux/ethtool.h  |  48 ++
 include/standard-headers/linux/fuse.h |  39 +-
 .../linux/input-event-codes.h |   1 +
 include/standard-headers/linux/virtio_gpu.h   |   2 +
 include/standard-headers/linux/virtio_pci.h   |  10 +-
 include/standard-headers/linux/virtio_snd.h   | 154 
 linux-headers/asm-arm64/kvm.h |  15 +-
 linux-headers/asm-arm64/sve_context.h |  11 +
 linux-headers/asm-generic/bitsperlong.h   |   4 +
 linux-headers/asm-loongarch/kvm.h |   2 -
 linux-headers/asm-mips/kvm.h  |   2 -
 linux-headers/asm-powerpc/kvm.h   |  45 +-
 linux-headers/asm-riscv/kvm.h |   3 +-
 linux-headers/asm-s390/kvm.h  | 315 +++-
 linux-headers/asm-x86/kvm.h   | 328 -
 linux-headers/linux/bits.h|  15 +
 linux-headers/linux/kvm.h | 689 +-
 linux-headers/linux/psp-sev.h |  59 ++
 linux-headers/linux/vhost.h   |   7 +
 hw/i386/x86.c |   8 -
 23 files changed, 1120 insertions(+), 740 deletions(-)
 create mode 100644 include/standard-headers/asm-x86/setup_data.h
 create mode 100644 linux-headers/linux/bits.h

diff --git a/include/standard-headers/asm-x86/bootparam.h 
b/include/standard-headers/asm-x86/bootparam.h
index 0b06d2bff1b..b582a105c08 100644
--- a/include/standard-headers/asm-x86/bootparam.h
+++ b/include/standard-headers/asm-x86/bootparam.h
@@ -2,21 +2,7 @@
 #ifndef _ASM_X86_BOOTPARAM_H
 #define _ASM_X86_BOOTPARAM_H
 
-/* setup_data/setup_indirect types */
-#define SETUP_NONE 0
-#define SETUP_E820_EXT 1
-#define SETUP_DTB  2
-#define SETUP_PCI  3
-#define SETUP_EFI  4
-#define SETUP_APPLE_PROPERTIES 5
-#define SETUP_JAILHOUSE6
-#define SETUP_CC_BLOB  7
-#define SETUP_IMA  8
-#define SETUP_RNG_SEED 9
-#define SETUP_ENUM_MAX SETUP_RNG_SEED
-
-#define SETUP_INDIRECT (1<<31)
-#define SETUP_TYPE_MAX (SETUP_ENUM_MAX | SETUP_INDIRECT)
+#include "standard-headers/asm-x86/setup_data.h"
 
 /* ram_size flags */
 #define RAMDISK_IMAGE_START_MASK   0x07FF
@@ -38,6 +24,7 @@
 #define XLF_EFI_KEXEC  (1<<4)
 #define XLF_5LEVEL (1<<5)
 #define XLF_5LEVEL_ENABLED (1<<6)
+#define XLF_MEM_ENCRYPTION (1<<7)
 
 
 #endif /* _ASM_X86_BOOTPARAM_H */
diff --git a/include/standard-headers/asm-x86/kvm_para.h 
b/include/standard-headers/asm-x86/kvm_para.h
index f0235e58a1d..9a011d20f01 100644
--- a/include/standard-headers/asm-x86/kvm_para.h
+++ b/include/standard-headers/asm-x86/kvm_para.h
@@ -92,7 +92,7 @@ struct kvm_clock_pairing {
 #define KVM_ASYNC_PF_DELIVERY_AS_INT   (1 << 3)
 
 /* MSR_KVM_ASYNC_PF_INT */
-#define KVM_ASYNC_PF_VEC_MASK  GENMASK(7, 0)
+#define KVM_ASYNC_PF_VEC_MASK  __GENMASK(7, 0)
 
 /* MSR_KVM_MIGRATION_CONTROL */
 #define KVM_MIGRATION_READY(1 << 0)
@@ -142,7 +142,6 @@ struct kvm_vcpu_pv_apf_data {
uint32_t token;
 
uint8_t pad[56];
-   uint32_t enabled;
 };
 
 #define KVM_PV_EOI_BIT 0
diff --git a/include/standard-headers/asm-x86/setup_data.h 
b/include/standard-headers/asm-x86/setup_data.h
new file mode 100644
index 000..09355f54c55
--- /dev/null
+++ b/include/standard-headers/asm-x86/setup_data.h
@@ -0,0 +1,83 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+#ifndef _ASM_X86_SETUP_DATA_H
+#define _ASM_X86_SETUP_DATA_H
+
+/* setup_data/setup_indirect types */
+#define SETUP_NONE 0
+#define SETUP_E820_EXT 1
+#define SETUP_DTB  2
+#define SETUP_PCI  3
+#define SETUP_EFI  4
+#define SETUP_APPLE_PROPERTIES 5
+#define SETUP_JAILHOUSE6
+#define SETUP_CC_BLOB  7
+#define SETUP_IMA  8
+#define SETUP_RNG_SEED 9
+#define SETUP_ENUM_MAX SETUP_RNG_SEED
+
+#define SETUP_INDIRECT (1<<31)
+#define SETUP_TYPE_MAX (SETUP_ENUM_MAX | SETUP_INDIRECT)
+
+#ifndef __ASSEMBLY__
+
+#include "standard-headers/linux/types.h"
+
+/* extensible setup data list node */
+struct setup_data {
+   uint64_t next;
+   uint32_t type;
+   uint32_t len;
+   uint8_t data[];
+};
+
+/* extensible setup indirect data node */
+struct setup_indirect {
+   uint32_

[PULL 53/63] RAMBlock: make guest_memfd require uncoordinated discard

2024-04-23 Thread Paolo Bonzini
Some subsystems like VFIO might disable ram block discard, but guest_memfd
uses discard operations to implement conversions between private and
shared memory.  Because of this, sequences like the following can result
in stale IOMMU mappings:

1. allocate shared page
2. convert page shared->private
3. discard shared page
4. convert page private->shared
5. allocate shared page
6. issue DMA operations against that shared page

This is not a use-after-free, because after step 3 VFIO is still pinning
the page.  However, DMA operations in step 6 will hit the old mapping
that was allocated in step 1.

Address this by taking ram_block_discard_is_enabled() into account when
deciding whether or not to discard pages.

Since kvm_convert_memory()/guest_memfd doesn't implement a
RamDiscardManager handler to convey and replay discard operations,
this is a case of uncoordinated discard, which is blocked/released
by ram_block_discard_require().  Interestingly, this function had
no use so far.

Alternative approaches would be to block discard of shared pages, but
this would cause guests to consume twice the memory if they use VFIO;
or to implement a RamDiscardManager and only block uncoordinated
discard, i.e. use ram_block_coordinated_discard_require().

[Commit message mostly by Michael Roth ]

Signed-off-by: Paolo Bonzini 
---
 system/physmem.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/system/physmem.c b/system/physmem.c
index f5dfa20e57e..5ebcf5be116 100644
--- a/system/physmem.c
+++ b/system/physmem.c
@@ -1846,6 +1846,13 @@ static void ram_block_add(RAMBlock *new_block, Error 
**errp)
 assert(kvm_enabled());
 assert(new_block->guest_memfd < 0);
 
+if (ram_block_discard_require(true) < 0) {
+error_setg_errno(errp, errno,
+ "cannot set up private guest memory: discard 
currently blocked");
+error_append_hint(errp, "Are you using assigned devices?\n");
+goto out_free;
+}
+
 new_block->guest_memfd = kvm_create_guest_memfd(new_block->max_length,
 0, errp);
 if (new_block->guest_memfd < 0) {
@@ -2109,6 +2116,7 @@ static void reclaim_ramblock(RAMBlock *block)
 
 if (block->guest_memfd >= 0) {
 close(block->guest_memfd);
+ram_block_discard_require(false);
 }
 
 g_free(block);
-- 
2.44.0
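
The mutual exclusion described in the commit message above (one side
requires working discard, the other side disables it) can be modelled with
two counters. The following is a simplified, single-threaded sketch under
that assumption; the names are hypothetical, locking is omitted, and the
coordinated/uncoordinated distinction made by the real API is not modelled.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

static unsigned int sketch_disabled_cnt; /* users that forbid discard */
static unsigned int sketch_required_cnt; /* users that depend on discard */

/* Called by a user (e.g. device assignment) that cannot tolerate discard. */
static int sketch_discard_disable(bool state)
{
    if (!state) {
        assert(sketch_disabled_cnt > 0);
        sketch_disabled_cnt--;
        return 0;
    }
    if (sketch_required_cnt > 0) {
        return -EBUSY;      /* somebody already relies on discard */
    }
    sketch_disabled_cnt++;
    return 0;
}

/* Called by a user (e.g. guest_memfd setup) that needs discard to work. */
static int sketch_discard_require(bool state)
{
    if (!state) {
        assert(sketch_required_cnt > 0);
        sketch_required_cnt--;
        return 0;
    }
    if (sketch_disabled_cnt > 0) {
        return -EBUSY;      /* somebody already disabled discard */
    }
    sketch_required_cnt++;
    return 0;
}
```

Whichever side registers first wins, and the other side fails with -EBUSY
until the first one releases its claim. This mirrors the pairing in the
patch: a require(true) when the guest_memfd is set up and a require(false)
when the block is reclaimed.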





[PULL 48/63] kvm: Introduce support for memory_attributes

2024-04-23 Thread Paolo Bonzini
From: Xiaoyao Li 

Introduce the helper functions to set the attributes of a range of
memory to private or shared.

This is necessary to notify KVM of the private/shared attribute of each GPA
range. KVM needs this information to decide whether a GPA is to be mapped to
HVA-based shared memory or to guest_memfd-based private memory.

Signed-off-by: Xiaoyao Li 
Message-ID: <20240320083945.991426-11-michael.r...@amd.com>
Signed-off-by: Paolo Bonzini 
---
 include/sysemu/kvm.h |  4 
 accel/kvm/kvm-all.c  | 32 
 2 files changed, 36 insertions(+)

diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
index bd247f3a239..594ae9b4605 100644
--- a/include/sysemu/kvm.h
+++ b/include/sysemu/kvm.h
@@ -536,4 +536,8 @@ void kvm_mark_guest_state_protected(void);
  * reported for the VM.
  */
 bool kvm_hwpoisoned_mem(void);
+
+int kvm_set_memory_attributes_private(hwaddr start, uint64_t size);
+int kvm_set_memory_attributes_shared(hwaddr start, uint64_t size);
+
 #endif
diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index ed50c80b1e4..db0b1a16edd 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -91,6 +91,7 @@ bool kvm_msi_use_devid;
 static bool kvm_has_guest_debug;
 static int kvm_sstep_flags;
 static bool kvm_immediate_exit;
+static uint64_t kvm_supported_memory_attributes;
 static hwaddr kvm_max_slot_size = ~0;
 
 static const KVMCapabilityInfo kvm_required_capabilites[] = {
@@ -1266,6 +1267,36 @@ void kvm_set_max_memslot_size(hwaddr max_slot_size)
 kvm_max_slot_size = max_slot_size;
 }
 
+static int kvm_set_memory_attributes(hwaddr start, uint64_t size, uint64_t 
attr)
+{
+struct kvm_memory_attributes attrs;
+int r;
+
+assert((attr & kvm_supported_memory_attributes) == attr);
+attrs.attributes = attr;
+attrs.address = start;
+attrs.size = size;
+attrs.flags = 0;
+
+r = kvm_vm_ioctl(kvm_state, KVM_SET_MEMORY_ATTRIBUTES, &attrs);
+if (r) {
+error_report("failed to set memory (0x%" HWADDR_PRIx "+0x%" PRIx64 ") "
+ "with attr 0x%" PRIx64 " error '%s'",
+ start, size, attr, strerror(errno));
+}
+return r;
+}
+
+int kvm_set_memory_attributes_private(hwaddr start, uint64_t size)
+{
+return kvm_set_memory_attributes(start, size, 
KVM_MEMORY_ATTRIBUTE_PRIVATE);
+}
+
+int kvm_set_memory_attributes_shared(hwaddr start, uint64_t size)
+{
+return kvm_set_memory_attributes(start, size, 0);
+}
+
 /* Called with KVMMemoryListener.slots_lock held */
 static void kvm_set_phys_mem(KVMMemoryListener *kml,
  MemoryRegionSection *section, bool add)
@@ -2387,6 +2418,7 @@ static int kvm_init(MachineState *ms)
 goto err;
 }
 
+kvm_supported_memory_attributes = kvm_check_extension(s, 
KVM_CAP_MEMORY_ATTRIBUTES);
 kvm_immediate_exit = kvm_check_extension(s, KVM_CAP_IMMEDIATE_EXIT);
 s->nr_slots = kvm_check_extension(s, KVM_CAP_NR_MEMSLOTS);
 
-- 
2.44.0
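
The request that the helper above builds can be sketched without any KVM
dependency. The struct below only mirrors the four fields the patch fills
in, and the attribute bit value is an assumption made for the example.
Unlike the patch, which asserts on an unsupported attribute, this sketch
returns an error so that the check can be exercised.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed value for the "private" attribute bit, for illustration only. */
#define SKETCH_ATTR_PRIVATE (1ULL << 3)

/* Stand-in for the request structure passed to the ioctl. */
struct sketch_memory_attributes {
    uint64_t address;
    uint64_t size;
    uint64_t attributes;
    uint64_t flags;
};

/*
 * Fill a request for [start, start + size).  An attribute set is accepted
 * only if every requested bit is in the supported mask; attr == 0
 * ("shared") therefore always passes the check.
 */
static int sketch_fill_attrs(struct sketch_memory_attributes *out,
                             uint64_t supported, uint64_t start,
                             uint64_t size, uint64_t attr)
{
    assert(out != NULL);
    if ((attr & supported) != attr) {
        return -EINVAL;
    }
    memset(out, 0, sizeof(*out));
    out->address = start;
    out->size = size;
    out->attributes = attr;
    out->flags = 0;
    return 0;
}
```

The private and shared wrappers in the patch then differ only in the
attribute value they pass: the private bit in one case, zero in the other.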





[PULL 12/63] semihosting: move stubs out of stubs/

2024-04-23 Thread Paolo Bonzini
Since the semihosting stubs are needed exactly when the Kconfig symbols
are not enabled, move them to semihosting/ and conditionalize them
on CONFIG_SEMIHOSTING and/or CONFIG_SYSTEM_ONLY.

Signed-off-by: Paolo Bonzini 
Message-ID: <20240408155330.522792-13-pbonz...@redhat.com>
Signed-off-by: Paolo Bonzini 
---
 stubs/semihost-all.c => semihosting/stubs-all.c | 0
 stubs/semihost.c => semihosting/stubs-system.c  | 0
 semihosting/meson.build | 3 +++
 stubs/meson.build   | 2 --
 4 files changed, 3 insertions(+), 2 deletions(-)
 rename stubs/semihost-all.c => semihosting/stubs-all.c (100%)
 rename stubs/semihost.c => semihosting/stubs-system.c (100%)

diff --git a/stubs/semihost-all.c b/semihosting/stubs-all.c
similarity index 100%
rename from stubs/semihost-all.c
rename to semihosting/stubs-all.c
diff --git a/stubs/semihost.c b/semihosting/stubs-system.c
similarity index 100%
rename from stubs/semihost.c
rename to semihosting/stubs-system.c
diff --git a/semihosting/meson.build b/semihosting/meson.build
index b07cbd980f2..34933e5a195 100644
--- a/semihosting/meson.build
+++ b/semihosting/meson.build
@@ -9,5 +9,8 @@ specific_ss.add(when: ['CONFIG_SEMIHOSTING', 
'CONFIG_SYSTEM_ONLY'], if_true: fil
   'uaccess.c',
 ))
 
+common_ss.add(when: ['CONFIG_SEMIHOSTING', 'CONFIG_SYSTEM_ONLY'], if_false: 
files('stubs-all.c'))
+system_ss.add(when: ['CONFIG_SEMIHOSTING'], if_false: files('stubs-system.c'))
+
 specific_ss.add(when: ['CONFIG_ARM_COMPATIBLE_SEMIHOSTING'],
if_true: files('arm-compat-semi.c'))
diff --git a/stubs/meson.build b/stubs/meson.build
index 60e32d363fa..84ecaa4daa1 100644
--- a/stubs/meson.build
+++ b/stubs/meson.build
@@ -55,9 +55,7 @@ if have_block or have_ga
 endif
 if have_system
   stub_ss.add(files('fw_cfg.c'))
-  stub_ss.add(files('semihost.c'))
   stub_ss.add(files('xen-hw-stub.c'))
 else
   stub_ss.add(files('qdev.c'))
 endif
-stub_ss.add(files('semihost-all.c'))
-- 
2.44.0





[PULL 08/63] hw: Include minimal source set in user emulation build

2024-04-23 Thread Paolo Bonzini
From: Philippe Mathieu-Daudé 

Only the files in hwcore_ss[] are required to link a user emulation
binary.

Have meson process the hw/ sub-directories if system emulation is
selected, otherwise directly process hw/core/ to get hwcore_ss[], which
is the only set required by user emulation.

This removes about 10% from the time needed to run
"../configure --disable-system --disable-tools --disable-guest-agent".

Signed-off-by: Philippe Mathieu-Daudé 
Reviewed-by: Richard Henderson 
Message-ID: <20240404194757.9343-8-phi...@linaro.org>
Signed-off-by: Paolo Bonzini 
Message-ID: <20240408155330.522792-9-pbonz...@redhat.com>
Signed-off-by: Paolo Bonzini 
---
 meson.build | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/meson.build b/meson.build
index 8c1271b8846..84e59dcbb49 100644
--- a/meson.build
+++ b/meson.build
@@ -3451,8 +3451,12 @@ subdir('qom')
 subdir('authz')
 subdir('crypto')
 subdir('ui')
-subdir('hw')
 subdir('gdbstub')
+if have_system
+  subdir('hw')
+else
+  subdir('hw/core')
+endif
 
 if enable_modules
   libmodulecommon = static_library('module-common', files('module-common.c') + 
genh, pic: true, c_args: '-DBUILD_DSO')
-- 
2.44.0





[PULL 51/63] kvm/memory: Make memory type private by default if it has guest memfd backend

2024-04-23 Thread Paolo Bonzini
From: Xiaoyao Li 

KVM leaves memory shared by default, which may incur the overhead of a
page conversion on the first access to each page, because pages are
expected to be private for VMs that require private memory (i.e. that
have a guest memfd).

Explicitly set the memory to private when the memory region has a valid
guest memfd backend.

Signed-off-by: Xiaoyao Li 
Signed-off-by: Michael Roth 
Message-ID: <20240320083945.991426-16-michael.r...@amd.com>
Signed-off-by: Paolo Bonzini 
---
 accel/kvm/kvm-all.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index 0386d4901fa..f49b2b95b54 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -1431,6 +1431,16 @@ static void kvm_set_phys_mem(KVMMemoryListener *kml,
 strerror(-err));
 abort();
 }
+
+if (memory_region_has_guest_memfd(mr)) {
+err = kvm_set_memory_attributes_private(start_addr, slot_size);
+if (err) {
+error_report("%s: failed to set memory attribute private: %s",
+ __func__, strerror(-err));
+exit(1);
+}
+}
+
 start_addr += slot_size;
 ram_start_offset += slot_size;
 ram += slot_size;
-- 
2.44.0





[PULL 34/63] ppc/pef: switch to use confidential_guest_kvm_init/reset()

2024-04-23 Thread Paolo Bonzini
From: Xiaoyao Li 

Use the unified interface to call confidential guest related kvm_init()
and kvm_reset(), to avoid exposing pef specific functions.

As a bonus, pef.h goes away since there is no direct call from sPAPR
board code to PEF code anymore.

Signed-off-by: Xiaoyao Li 
Signed-off-by: Paolo Bonzini 
---
 include/hw/ppc/pef.h | 17 -
 hw/ppc/pef.c |  9 ++---
 hw/ppc/spapr.c   | 10 +++---
 3 files changed, 13 insertions(+), 23 deletions(-)
 delete mode 100644 include/hw/ppc/pef.h

diff --git a/include/hw/ppc/pef.h b/include/hw/ppc/pef.h
deleted file mode 100644
index 707dbe524c4..000
--- a/include/hw/ppc/pef.h
+++ /dev/null
@@ -1,17 +0,0 @@
-/*
- * PEF (Protected Execution Facility) for POWER support
- *
- * Copyright Red Hat.
- *
- * This work is licensed under the terms of the GNU GPL, version 2 or later.
- * See the COPYING file in the top-level directory.
- *
- */
-
-#ifndef HW_PPC_PEF_H
-#define HW_PPC_PEF_H
-
-int pef_kvm_init(ConfidentialGuestSupport *cgs, Error **errp);
-int pef_kvm_reset(ConfidentialGuestSupport *cgs, Error **errp);
-
-#endif /* HW_PPC_PEF_H */
diff --git a/hw/ppc/pef.c b/hw/ppc/pef.c
index d28ed3ba733..47553348b1e 100644
--- a/hw/ppc/pef.c
+++ b/hw/ppc/pef.c
@@ -15,7 +15,6 @@
 #include "sysemu/kvm.h"
 #include "migration/blocker.h"
 #include "exec/confidential-guest-support.h"
-#include "hw/ppc/pef.h"
 
 #define TYPE_PEF_GUEST "pef-guest"
 OBJECT_DECLARE_SIMPLE_TYPE(PefGuest, PEF_GUEST)
@@ -93,7 +92,7 @@ static int kvmppc_svm_off(Error **errp)
 #endif
 }
 
-int pef_kvm_init(ConfidentialGuestSupport *cgs, Error **errp)
+static int pef_kvm_init(ConfidentialGuestSupport *cgs, Error **errp)
 {
 if (!object_dynamic_cast(OBJECT(cgs), TYPE_PEF_GUEST)) {
 return 0;
@@ -107,7 +106,7 @@ int pef_kvm_init(ConfidentialGuestSupport *cgs, Error 
**errp)
 return kvmppc_svm_init(cgs, errp);
 }
 
-int pef_kvm_reset(ConfidentialGuestSupport *cgs, Error **errp)
+static int pef_kvm_reset(ConfidentialGuestSupport *cgs, Error **errp)
 {
 if (!object_dynamic_cast(OBJECT(cgs), TYPE_PEF_GUEST)) {
 return 0;
@@ -131,6 +130,10 @@ OBJECT_DEFINE_TYPE_WITH_INTERFACES(PefGuest,
 
 static void pef_guest_class_init(ObjectClass *oc, void *data)
 {
+ConfidentialGuestSupportClass *klass = 
CONFIDENTIAL_GUEST_SUPPORT_CLASS(oc);
+
+klass->kvm_init = pef_kvm_init;
+klass->kvm_reset = pef_kvm_reset;
 }
 
 static void pef_guest_init(Object *obj)
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 36ada4d0baf..533ea0f9142 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -75,6 +75,7 @@
 #include "hw/virtio/vhost-scsi-common.h"
 
 #include "exec/ram_addr.h"
+#include "exec/confidential-guest-support.h"
 #include "hw/usb.h"
 #include "qemu/config-file.h"
 #include "qemu/error-report.h"
@@ -87,7 +88,6 @@
 #include "hw/ppc/spapr_tpm_proxy.h"
 #include "hw/ppc/spapr_nvdimm.h"
 #include "hw/ppc/spapr_numa.h"
-#include "hw/ppc/pef.h"
 
 #include "monitor/monitor.h"
 
@@ -1715,7 +1715,9 @@ static void spapr_machine_reset(MachineState *machine, 
ShutdownCause reason)
 qemu_guest_getrandom_nofail(spapr->fdt_rng_seed, 32);
 }
 
-pef_kvm_reset(machine->cgs, &error_fatal);
+if (machine->cgs) {
+confidential_guest_kvm_reset(machine->cgs, &error_fatal);
+}
 spapr_caps_apply(spapr);
 spapr_nested_reset(spapr);
 
@@ -2841,7 +2843,9 @@ static void spapr_machine_init(MachineState *machine)
 /*
  * if Secure VM (PEF) support is configured, then initialize it
  */
-pef_kvm_init(machine->cgs, &error_fatal);
+if (machine->cgs) {
+confidential_guest_kvm_init(machine->cgs, &error_fatal);
+}
 
 msi_nonbroken = true;
 
-- 
2.44.0
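
The shape of the change above, where board code calls a generic wrapper and
the wrapper dispatches through per-class hooks, can be sketched in plain C.
All names here are hypothetical and the QOM machinery (type casts, Error
objects) is left out; the sketch assumes that a missing hook is treated as
success.

```c
#include <assert.h>
#include <stddef.h>

struct sketch_cgs;

/* Per-class hooks; a backend fills in the ones it implements. */
struct sketch_cgs_class {
    int (*kvm_init)(struct sketch_cgs *cgs);
    int (*kvm_reset)(struct sketch_cgs *cgs);
};

struct sketch_cgs {
    const struct sketch_cgs_class *klass;
    int init_calls;     /* bookkeeping for the example only */
    int reset_calls;
};

/* Generic entry points used by board code: dispatch if a hook exists. */
static int sketch_cgs_kvm_init(struct sketch_cgs *cgs)
{
    assert(cgs != NULL && cgs->klass != NULL);
    return cgs->klass->kvm_init ? cgs->klass->kvm_init(cgs) : 0;
}

static int sketch_cgs_kvm_reset(struct sketch_cgs *cgs)
{
    assert(cgs != NULL && cgs->klass != NULL);
    return cgs->klass->kvm_reset ? cgs->klass->kvm_reset(cgs) : 0;
}

/* A backend keeps its functions static and only exports them via hooks. */
static int sketch_pef_init(struct sketch_cgs *cgs)
{
    cgs->init_calls++;
    return 0;
}

static int sketch_pef_reset(struct sketch_cgs *cgs)
{
    cgs->reset_calls++;
    return 0;
}

static const struct sketch_cgs_class sketch_pef_class = {
    .kvm_init = sketch_pef_init,
    .kvm_reset = sketch_pef_reset,
};

/* A class without hooks: the generic calls succeed and do nothing. */
static const struct sketch_cgs_class sketch_empty_class = { NULL, NULL };
```

Because the backend functions are reachable only through the class, the
board code needs no backend-specific header, which is the reason pef.h can
be deleted in the patch.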





[PULL 57/63] kvm/tdx: Ignore memory conversion to shared of unassigned region

2024-04-23 Thread Paolo Bonzini
From: Isaku Yamahata 

TDX requires vMMIO regions to be shared.  For KVM, an MMIO region is a region
to which no KVM memslot is assigned (except for in-kernel emulation).
QEMU has a memory region for vMMIO at the level of each device.

OVMF conservatively issues MapGPA(to-shared) on the 32-bit PCI MMIO region,
but QEMU does not find a corresponding vMMIO region, because the request comes
before PCI device allocation and memory_region_find() finds the device region,
not the PCI bus region.  It is safe to ignore MapGPA(to-shared), because when
the guest accesses those regions it uses GPAs with the shared bit set for
vMMIO.  Ignore requests to convert a non-assigned region to shared and return
success; otherwise OVMF gets confused and panics.

Signed-off-by: Isaku Yamahata 
Signed-off-by: Xiaoyao Li 
Message-ID: <20240229063726.610065-35-xiaoyao...@intel.com>
Signed-off-by: Paolo Bonzini 
---
 accel/kvm/kvm-all.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index 0911154bf8e..d7281b93f3b 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -2923,6 +2923,18 @@ int kvm_convert_memory(hwaddr start, hwaddr size, bool 
to_private)
 section = memory_region_find(get_system_memory(), start, size);
 mr = section.mr;
 if (!mr) {
+/*
+ * Ignore converting non-assigned region to shared.
+ *
+ * TDX requires vMMIO region to be shared to inject #VE to guest.
+ * OVMF issues conservatively MapGPA(shared) on 32bit PCI MMIO region,
+ * and vIO-APIC 0xFEC0 4K page.
+ * OVMF assigns 32bit PCI MMIO region to
+ * [top of low memory: typically 2GB=0xC00,  0xFC0)
+ */
+if (!to_private) {
+return 0;
+}
 return -1;
 }
 
-- 
2.44.0





[PULL 56/63] kvm/tdx: Don't complain when converting vMMIO region to shared

2024-04-23 Thread Paolo Bonzini
From: Isaku Yamahata 

Because a vMMIO region needs to be a shared region, a guest TD may explicitly
convert such a region from private to shared.  Don't complain about such a
conversion.

Signed-off-by: Isaku Yamahata 
Signed-off-by: Xiaoyao Li 
Message-ID: <20240229063726.610065-34-xiaoyao...@intel.com>
Signed-off-by: Paolo Bonzini 
---
 accel/kvm/kvm-all.c | 19 ---
 1 file changed, 16 insertions(+), 3 deletions(-)

diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index 9eef2c64003..0911154bf8e 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -2927,9 +2927,22 @@ int kvm_convert_memory(hwaddr start, hwaddr size, bool 
to_private)
 }
 
 if (!memory_region_has_guest_memfd(mr)) {
-error_report("Converting non guest_memfd backed memory region "
- "(0x%"HWADDR_PRIx" ,+ 0x%"HWADDR_PRIx") to %s",
- start, size, to_private ? "private" : "shared");
+/*
+ * Because vMMIO region must be shared, guest TD may convert vMMIO
+ * region to shared explicitly.  Don't complain such case.  See
+ * memory_region_type() for checking if the region is MMIO region.
+ */
+if (!to_private &&
+!memory_region_is_ram(mr) &&
+!memory_region_is_ram_device(mr) &&
+!memory_region_is_rom(mr) &&
+!memory_region_is_romd(mr)) {
+   ret = 0;
+} else {
+error_report("Convert non guest_memfd backed memory region "
+"(0x%"HWADDR_PRIx" ,+ 0x%"HWADDR_PRIx") to %s",
+start, size, to_private ? "private" : "shared");
+}
 goto out_unref;
 }
 
-- 
2.44.0
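
Taken together with the unassigned-region case from the preceding message,
the decisions made at the start of kvm_convert_memory() can be summarised
as a small classification function. This is a sketch under simplifying
assumptions: the region kinds and verdicts are hypothetical enums, and the
RAM/ROM/ROMD/RAM-device checks from the patch are collapsed into a single
"RAM-like without guest_memfd" kind.

```c
#include <stdbool.h>

enum sketch_region_kind {
    SKETCH_UNASSIGNED,      /* no memory region found for the range */
    SKETCH_MMIO,            /* region exists but is not RAM/ROM-like */
    SKETCH_RAM_NO_MEMFD,    /* RAM-like region without guest_memfd */
    SKETCH_RAM_MEMFD,       /* region backed by guest_memfd */
};

enum sketch_verdict {
    SKETCH_IGNORE_OK,       /* nothing to do, report success */
    SKETCH_REJECT,          /* report failure */
    SKETCH_CONVERT,         /* go on and perform the conversion */
};

static enum sketch_verdict sketch_classify(enum sketch_region_kind kind,
                                           bool to_private)
{
    switch (kind) {
    case SKETCH_UNASSIGNED:
    case SKETCH_MMIO:
        /* vMMIO must be shared anyway, so to-shared is a harmless no-op. */
        return to_private ? SKETCH_REJECT : SKETCH_IGNORE_OK;
    case SKETCH_RAM_NO_MEMFD:
        /* Such memory cannot be converted in either direction. */
        return SKETCH_REJECT;
    case SKETCH_RAM_MEMFD:
        return SKETCH_CONVERT;
    }
    return SKETCH_REJECT;
}
```

Only the to-shared direction is relaxed; a request to make an MMIO or
unassigned range private is still reported as a failure.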





[PULL 30/63] q35: Introduce smm_ranges property for q35-pci-host

2024-04-23 Thread Paolo Bonzini
From: Isaku Yamahata 

Add a q35 property to check whether or not SMM ranges (e.g. SMRAM, TSEG)
exist for the target platform.  TDX doesn't support SMM and doesn't
play nice with QEMU modifying related guest memory ranges.

Signed-off-by: Isaku Yamahata 
Co-developed-by: Sean Christopherson 
Signed-off-by: Sean Christopherson 
Signed-off-by: Xiaoyao Li 
Signed-off-by: Michael Roth 
Message-ID: <20240320083945.991426-19-michael.r...@amd.com>
Signed-off-by: Paolo Bonzini 
---
 include/hw/i386/pc.h  |  1 +
 include/hw/pci-host/q35.h |  1 +
 hw/i386/pc_q35.c  |  2 ++
 hw/pci-host/q35.c | 42 +++
 4 files changed, 33 insertions(+), 13 deletions(-)

diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index 349f79df086..e52290916cb 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -161,6 +161,7 @@ void pc_acpi_smi_interrupt(void *opaque, int irq, int 
level);
 #define PCI_HOST_PROP_PCI_HOLE64_SIZE  "pci-hole64-size"
 #define PCI_HOST_BELOW_4G_MEM_SIZE "below-4g-mem-size"
 #define PCI_HOST_ABOVE_4G_MEM_SIZE "above-4g-mem-size"
+#define PCI_HOST_PROP_SMM_RANGES   "smm-ranges"
 
 
 void pc_pci_as_mapping_init(MemoryRegion *system_memory,
diff --git a/include/hw/pci-host/q35.h b/include/hw/pci-host/q35.h
index bafcbe67521..22fadfa3ed7 100644
--- a/include/hw/pci-host/q35.h
+++ b/include/hw/pci-host/q35.h
@@ -50,6 +50,7 @@ struct MCHPCIState {
 MemoryRegion tseg_blackhole, tseg_window;
 MemoryRegion smbase_blackhole, smbase_window;
 bool has_smram_at_smbase;
+bool has_smm_ranges;
 Range pci_hole;
 uint64_t below_4g_mem_size;
 uint64_t above_4g_mem_size;
diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
index 6e1180d4b60..bb53a51ac18 100644
--- a/hw/i386/pc_q35.c
+++ b/hw/i386/pc_q35.c
@@ -219,6 +219,8 @@ static void pc_q35_init(MachineState *machine)
 x86ms->above_4g_mem_size, NULL);
 object_property_set_bool(phb, PCI_HOST_BYPASS_IOMMU,
  pcms->default_bus_bypass_iommu, NULL);
+object_property_set_bool(phb, PCI_HOST_PROP_SMM_RANGES,
+ x86_machine_is_smm_enabled(x86ms), NULL);
 sysbus_realize_and_unref(SYS_BUS_DEVICE(phb), &error_fatal);
 
 /* pci */
diff --git a/hw/pci-host/q35.c b/hw/pci-host/q35.c
index 98d4a7c253a..0b6cbaed7ed 100644
--- a/hw/pci-host/q35.c
+++ b/hw/pci-host/q35.c
@@ -179,6 +179,8 @@ static Property q35_host_props[] = {
  mch.below_4g_mem_size, 0),
 DEFINE_PROP_SIZE(PCI_HOST_ABOVE_4G_MEM_SIZE, Q35PCIHost,
  mch.above_4g_mem_size, 0),
+DEFINE_PROP_BOOL(PCI_HOST_PROP_SMM_RANGES, Q35PCIHost,
+ mch.has_smm_ranges, true),
 DEFINE_PROP_BOOL("x-pci-hole64-fix", Q35PCIHost, pci_hole64_fix, true),
 DEFINE_PROP_END_OF_LIST(),
 };
@@ -214,6 +216,7 @@ static void q35_host_initfn(Object *obj)
 /* mch's object_initialize resets the default value, set it again */
 qdev_prop_set_uint64(DEVICE(s), PCI_HOST_PROP_PCI_HOLE64_SIZE,
  Q35_PCI_HOST_HOLE64_SIZE_DEFAULT);
+
 object_property_add(obj, PCI_HOST_PROP_PCI_HOLE_START, "uint32",
 q35_host_get_pci_hole_start,
 NULL, NULL, NULL);
@@ -476,6 +479,10 @@ static void mch_write_config(PCIDevice *d,
 mch_update_pciexbar(mch);
 }
 
+if (!mch->has_smm_ranges) {
+return;
+}
+
 if (ranges_overlap(address, len, MCH_HOST_BRIDGE_SMRAM,
MCH_HOST_BRIDGE_SMRAM_SIZE)) {
 mch_update_smram(mch);
@@ -494,10 +501,13 @@ static void mch_write_config(PCIDevice *d,
 static void mch_update(MCHPCIState *mch)
 {
 mch_update_pciexbar(mch);
+
 mch_update_pam(mch);
-mch_update_smram(mch);
-mch_update_ext_tseg_mbytes(mch);
-mch_update_smbase_smram(mch);
+if (mch->has_smm_ranges) {
+mch_update_smram(mch);
+mch_update_ext_tseg_mbytes(mch);
+mch_update_smbase_smram(mch);
+}
 
 /*
  * pci hole goes from end-of-low-ram to io-apic.
@@ -538,19 +548,21 @@ static void mch_reset(DeviceState *qdev)
 pci_set_quad(d->config + MCH_HOST_BRIDGE_PCIEXBAR,
  MCH_HOST_BRIDGE_PCIEXBAR_DEFAULT);
 
-d->config[MCH_HOST_BRIDGE_SMRAM] = MCH_HOST_BRIDGE_SMRAM_DEFAULT;
-d->config[MCH_HOST_BRIDGE_ESMRAMC] = MCH_HOST_BRIDGE_ESMRAMC_DEFAULT;
-d->wmask[MCH_HOST_BRIDGE_SMRAM] = MCH_HOST_BRIDGE_SMRAM_WMASK;
-d->wmask[MCH_HOST_BRIDGE_ESMRAMC] = MCH_HOST_BRIDGE_ESMRAMC_WMASK;
+if (mch->has_smm_ranges) {
+d->config[MCH_HOST_BRIDGE_SMRAM] = MCH_HOST_BRIDGE_SMRAM_DEFAULT;
+d->config[MCH_HOST_BRIDGE_ESMRAMC] = MCH_HOST_BRIDGE_ESMRAMC_DEFAULT;
+d->wmask[MCH_HOST_BRIDGE_SMRAM] = MCH_HOST_BRIDGE_SMRAM_WMASK;
+d->wmask[MCH_HOST_BRIDGE_ESMRAMC] = MCH_HOST_BRIDGE_ESMRAMC_WMA

[PULL 50/63] kvm: Enable KVM_SET_USER_MEMORY_REGION2 for memslot

2024-04-23 Thread Paolo Bonzini
From: Chao Peng 

Switch to KVM_SET_USER_MEMORY_REGION2 when supported by KVM.

With KVM_SET_USER_MEMORY_REGION2, QEMU can set up a memory region that is
backed by both hva-based shared memory and guest memfd based private
memory.
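
The selection between the two ioctls described above can be sketched in
isolation. This is only an illustration under stated assumptions: the enum,
the struct and pick_memslot_request() are hypothetical stand-ins invented for
this sketch, not QEMU or kernel definitions, and the struct lists only the
fields that the patch touches.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the two ioctl requests; real code would use
 * the request numbers from <linux/kvm.h>. */
enum memslot_request {
    REQ_SET_USER_MEMORY_REGION,   /* legacy request: hva-backed memory only */
    REQ_SET_USER_MEMORY_REGION2,  /* extended request: adds guest_memfd */
};

/* Illustrative subset of the arguments a memslot update carries.  The last
 * two fields are only meaningful for the extended request. */
struct memslot_args {
    uint32_t slot;
    uint32_t flags;
    uint64_t guest_phys_addr;
    uint64_t memory_size;
    uint64_t userspace_addr;
    uint64_t guest_memfd_offset;
    int32_t guest_memfd;
};

/* Use the extended request only when the host reports guest_memfd support;
 * otherwise fall back to the legacy one. */
static enum memslot_request pick_memslot_request(bool guest_memfd_supported)
{
    return guest_memfd_supported ? REQ_SET_USER_MEMORY_REGION2
                                 : REQ_SET_USER_MEMORY_REGION;
}
```

Choosing the request at each call site, as the patch does, lets a single
argument structure serve both paths; when the legacy request is used the
extra guest_memfd fields are simply not consumed.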

Signed-off-by: Chao Peng 
Co-developed-by: Xiaoyao Li 
Signed-off-by: Xiaoyao Li 
Message-ID: <20240320083945.991426-10-michael.r...@amd.com>
Signed-off-by: Paolo Bonzini 
---
 include/sysemu/kvm_int.h |  2 ++
 accel/kvm/kvm-all.c  | 46 +---
 accel/kvm/trace-events   |  2 +-
 3 files changed, 41 insertions(+), 9 deletions(-)

diff --git a/include/sysemu/kvm_int.h b/include/sysemu/kvm_int.h
index 227b61fec3d..3f3d13f8166 100644
--- a/include/sysemu/kvm_int.h
+++ b/include/sysemu/kvm_int.h
@@ -30,6 +30,8 @@ typedef struct KVMSlot
 int as_id;
 /* Cache of the offset in ram address space */
 ram_addr_t ram_start_offset;
+int guest_memfd;
+hwaddr guest_memfd_offset;
 } KVMSlot;
 
 typedef struct KVMMemoryUpdate {
diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index 1b7bbd838c4..0386d4901fa 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -284,35 +284,58 @@ int kvm_physical_memory_addr_from_host(KVMState *s, void *ram,
 static int kvm_set_user_memory_region(KVMMemoryListener *kml, KVMSlot *slot, bool new)
 {
 KVMState *s = kvm_state;
-struct kvm_userspace_memory_region mem;
+struct kvm_userspace_memory_region2 mem;
 int ret;
 
 mem.slot = slot->slot | (kml->as_id << 16);
 mem.guest_phys_addr = slot->start_addr;
 mem.userspace_addr = (unsigned long)slot->ram;
 mem.flags = slot->flags;
+mem.guest_memfd = slot->guest_memfd;
+mem.guest_memfd_offset = slot->guest_memfd_offset;
 
 if (slot->memory_size && !new && (mem.flags ^ slot->old_flags) & KVM_MEM_READONLY) {
 /* Set the slot size to 0 before setting the slot to the desired
  * value. This is needed based on KVM commit 75d61fbc. */
 mem.memory_size = 0;
-ret = kvm_vm_ioctl(s, KVM_SET_USER_MEMORY_REGION, &mem);
+
+if (kvm_guest_memfd_supported) {
+ret = kvm_vm_ioctl(s, KVM_SET_USER_MEMORY_REGION2, &mem);
+} else {
+ret = kvm_vm_ioctl(s, KVM_SET_USER_MEMORY_REGION, &mem);
+}
 if (ret < 0) {
 goto err;
 }
 }
 mem.memory_size = slot->memory_size;
-ret = kvm_vm_ioctl(s, KVM_SET_USER_MEMORY_REGION, &mem);
+if (kvm_guest_memfd_supported) {
+ret = kvm_vm_ioctl(s, KVM_SET_USER_MEMORY_REGION2, &mem);
+} else {
+ret = kvm_vm_ioctl(s, KVM_SET_USER_MEMORY_REGION, &mem);
+}
 slot->old_flags = mem.flags;
 err:
 trace_kvm_set_user_memory(mem.slot >> 16, (uint16_t)mem.slot, mem.flags,
   mem.guest_phys_addr, mem.memory_size,
-  mem.userspace_addr, ret);
+  mem.userspace_addr, mem.guest_memfd,
+  mem.guest_memfd_offset, ret);
 if (ret < 0) {
-error_report("%s: KVM_SET_USER_MEMORY_REGION failed, slot=%d,"
- " start=0x%" PRIx64 ", size=0x%" PRIx64 ": %s",
- __func__, mem.slot, slot->start_addr,
- (uint64_t)mem.memory_size, strerror(errno));
+if (kvm_guest_memfd_supported) {
+error_report("%s: KVM_SET_USER_MEMORY_REGION2 failed, slot=%d,"
+" start=0x%" PRIx64 ", size=0x%" PRIx64 ","
+" flags=0x%" PRIx32 ", guest_memfd=%" PRId32 ","
+" guest_memfd_offset=0x%" PRIx64 ": %s",
+__func__, mem.slot, slot->start_addr,
+(uint64_t)mem.memory_size, mem.flags,
+mem.guest_memfd, (uint64_t)mem.guest_memfd_offset,
+strerror(errno));
+} else {
+error_report("%s: KVM_SET_USER_MEMORY_REGION failed, slot=%d,"
+" start=0x%" PRIx64 ", size=0x%" PRIx64 ": %s",
+__func__, mem.slot, slot->start_addr,
+(uint64_t)mem.memory_size, strerror(errno));
+}
 }
 return ret;
 }
@@ -467,6 +490,10 @@ static int kvm_mem_flags(MemoryRegion *mr)
 if (readonly && kvm_readonly_mem_allowed) {
 flags |= KVM_MEM_READONLY;
 }
+if (memory_region_has_guest_memfd(mr)) {
+assert(kvm_guest_memfd_supported);
+flags |= KVM_MEM_GUEST_MEMFD;
+}
 return flags;
 }
 
@@ -1394,6 +1421,9 @@ static void kvm_set_phys_mem(KVMMemoryListener *kml,
 mem->ram_start_offset = ram_start_offset;
 mem->ram = ram;
 mem->flags = kvm_mem_flags(mr);

[PULL 21/63] kvm: use configs/ definition to conditionalize debug support

2024-04-23 Thread Paolo Bonzini
If an architecture adds support for KVM_CAP_SET_GUEST_DEBUG but QEMU does not
have the necessary code, QEMU will fail to build after updating kernel headers.
Avoid this by using a #define in config-target.h instead of 
KVM_CAP_SET_GUEST_DEBUG.
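
The pattern can be sketched as follows. This is a minimal illustration, not
the QEMU code: demo_update_guest_debug() is a hypothetical name, the macro is
defined by hand here so that the fragment compiles on its own (in the patch it
comes from configs/targets/ through config-target.h), and the -EINVAL returned
by the fallback is an assumption chosen for the sketch.

```c
#include <assert.h>
#include <errno.h>

/* Assumption for this sketch: the build system defines this symbol only for
 * targets that actually have guest debug code.  It is defined by hand here. */
#define TARGET_KVM_HAVE_GUEST_DEBUG 1

#ifdef TARGET_KVM_HAVE_GUEST_DEBUG
/* Hypothetical real implementation, present only on supporting targets. */
static int demo_update_guest_debug(unsigned long reinject_trap)
{
    (void)reinject_trap;
    return 0;
}
#else
/* Fallback stub: callers still compile and link, but the request fails. */
static inline int demo_update_guest_debug(unsigned long reinject_trap)
{
    (void)reinject_trap;
    return -EINVAL;
}
#endif
```

Because the guard is a symbol that QEMU controls, a kernel header update that
adds KVM_CAP_SET_GUEST_DEBUG for a new architecture cannot switch on code
paths that QEMU does not yet implement.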

Signed-off-by: Paolo Bonzini 
---
 configs/targets/aarch64-softmmu.mak |  1 +
 configs/targets/i386-softmmu.mak|  1 +
 configs/targets/ppc-softmmu.mak |  1 +
 configs/targets/ppc64-softmmu.mak   |  1 +
 configs/targets/s390x-softmmu.mak   |  1 +
 configs/targets/x86_64-softmmu.mak  |  1 +
 include/sysemu/kvm.h|  2 +-
 include/sysemu/kvm_int.h|  2 +-
 accel/kvm/kvm-accel-ops.c   |  4 ++--
 accel/kvm/kvm-all.c | 10 +-
 10 files changed, 15 insertions(+), 9 deletions(-)

diff --git a/configs/targets/aarch64-softmmu.mak b/configs/targets/aarch64-softmmu.mak
index b4338e95680..83c22391a69 100644
--- a/configs/targets/aarch64-softmmu.mak
+++ b/configs/targets/aarch64-softmmu.mak
@@ -1,5 +1,6 @@
 TARGET_ARCH=aarch64
 TARGET_BASE_ARCH=arm
 TARGET_SUPPORTS_MTTCG=y
+TARGET_KVM_HAVE_GUEST_DEBUG=y
 TARGET_XML_FILES= gdb-xml/aarch64-core.xml gdb-xml/aarch64-fpu.xml gdb-xml/arm-core.xml gdb-xml/arm-vfp.xml gdb-xml/arm-vfp3.xml gdb-xml/arm-vfp-sysregs.xml gdb-xml/arm-neon.xml gdb-xml/arm-m-profile.xml gdb-xml/arm-m-profile-mve.xml gdb-xml/aarch64-pauth.xml
 TARGET_NEED_FDT=y
diff --git a/configs/targets/i386-softmmu.mak b/configs/targets/i386-softmmu.mak
index 6b3c99fc86c..d61b5076134 100644
--- a/configs/targets/i386-softmmu.mak
+++ b/configs/targets/i386-softmmu.mak
@@ -1,4 +1,5 @@
 TARGET_ARCH=i386
 TARGET_SUPPORTS_MTTCG=y
 TARGET_NEED_FDT=y
+TARGET_KVM_HAVE_GUEST_DEBUG=y
 TARGET_XML_FILES= gdb-xml/i386-32bit.xml
diff --git a/configs/targets/ppc-softmmu.mak b/configs/targets/ppc-softmmu.mak
index 774440108f7..f3ea9c98f75 100644
--- a/configs/targets/ppc-softmmu.mak
+++ b/configs/targets/ppc-softmmu.mak
@@ -1,4 +1,5 @@
 TARGET_ARCH=ppc
 TARGET_BIG_ENDIAN=y
+TARGET_KVM_HAVE_GUEST_DEBUG=y
 TARGET_XML_FILES= gdb-xml/power-core.xml gdb-xml/power-fpu.xml gdb-xml/power-altivec.xml gdb-xml/power-spe.xml
 TARGET_NEED_FDT=y
diff --git a/configs/targets/ppc64-softmmu.mak b/configs/targets/ppc64-softmmu.mak
index ddf0c39617f..1db8d8381d0 100644
--- a/configs/targets/ppc64-softmmu.mak
+++ b/configs/targets/ppc64-softmmu.mak
@@ -2,5 +2,6 @@ TARGET_ARCH=ppc64
 TARGET_BASE_ARCH=ppc
 TARGET_BIG_ENDIAN=y
 TARGET_SUPPORTS_MTTCG=y
+TARGET_KVM_HAVE_GUEST_DEBUG=y
 TARGET_XML_FILES= gdb-xml/power64-core.xml gdb-xml/power-fpu.xml gdb-xml/power-altivec.xml gdb-xml/power-spe.xml gdb-xml/power-vsx.xml
 TARGET_NEED_FDT=y
diff --git a/configs/targets/s390x-softmmu.mak b/configs/targets/s390x-softmmu.mak
index 70d2f9f0ba0..b22218aacc8 100644
--- a/configs/targets/s390x-softmmu.mak
+++ b/configs/targets/s390x-softmmu.mak
@@ -1,4 +1,5 @@
 TARGET_ARCH=s390x
 TARGET_BIG_ENDIAN=y
 TARGET_SUPPORTS_MTTCG=y
+TARGET_KVM_HAVE_GUEST_DEBUG=y
 TARGET_XML_FILES= gdb-xml/s390x-core64.xml gdb-xml/s390-acr.xml gdb-xml/s390-fpr.xml gdb-xml/s390-vx.xml gdb-xml/s390-cr.xml gdb-xml/s390-virt.xml gdb-xml/s390-virt-kvm.xml gdb-xml/s390-gs.xml
diff --git a/configs/targets/x86_64-softmmu.mak b/configs/targets/x86_64-softmmu.mak
index 197817c9434..c5f882e5ba1 100644
--- a/configs/targets/x86_64-softmmu.mak
+++ b/configs/targets/x86_64-softmmu.mak
@@ -2,4 +2,5 @@ TARGET_ARCH=x86_64
 TARGET_BASE_ARCH=i386
 TARGET_SUPPORTS_MTTCG=y
 TARGET_NEED_FDT=y
+TARGET_KVM_HAVE_GUEST_DEBUG=y
 TARGET_XML_FILES= gdb-xml/i386-64bit.xml
diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
index fad9a7e8ff3..2cba899270c 100644
--- a/include/sysemu/kvm.h
+++ b/include/sysemu/kvm.h
@@ -224,7 +224,7 @@ void kvm_flush_coalesced_mmio_buffer(void);
  * calling down to kvm_arch_update_guest_debug after the generic
  * fields have been set.
  */
-#ifdef KVM_CAP_SET_GUEST_DEBUG
+#ifdef TARGET_KVM_HAVE_GUEST_DEBUG
 int kvm_update_guest_debug(CPUState *cpu, unsigned long reinject_trap);
 #else
 static inline int kvm_update_guest_debug(CPUState *cpu, unsigned long reinject_trap)
diff --git a/include/sysemu/kvm_int.h b/include/sysemu/kvm_int.h
index 882e37e12c5..94488d2c1a2 100644
--- a/include/sysemu/kvm_int.h
+++ b/include/sysemu/kvm_int.h
@@ -78,7 +78,7 @@ struct KVMState
 struct kvm_coalesced_mmio_ring *coalesced_mmio_ring;
 bool coalesced_flush_in_progress;
 int vcpu_events;
-#ifdef KVM_CAP_SET_GUEST_DEBUG
+#ifdef TARGET_KVM_HAVE_GUEST_DEBUG
 QTAILQ_HEAD(, kvm_sw_breakpoint) kvm_sw_breakpoints;
 #endif
 int max_nested_state_len;
diff --git a/accel/kvm/kvm-accel-ops.c b/accel/kvm/kvm-accel-ops.c
index b3c946dc4b4..f5ac643fca3 100644
--- a/accel/kvm/kvm-accel-ops.c
+++ b/accel/kvm/kvm-accel-ops.c
@@ -85,7 +85,7 @@ static bool kvm_cpus_are_resettable(void)
 return !kvm_enabled() || kvm_cpu_check_are_resettable();
 }
 
-#ifdef KVM_CAP_SET_GUEST_DEBUG
+#ifdef TARGET_KVM_HAVE_GUEST_DEBUG
 static int kvm_update_guest_debug_ops(CPUState *cpu)
 {
 return

[PULL 04/63] tests/unit: match some unit tests to corresponding feature switches

2024-04-23 Thread Paolo Bonzini
Try not to test code that is not used by user mode emulation, or by the
block layer, unless they are being compiled; and fix test-timed-average
which was not compiled with --disable-system --enable-tools.

This is by no means complete; it only touches the more blatantly
wrong cases.

Signed-off-by: Paolo Bonzini 
Reviewed-by: Richard Henderson 
Message-ID: <20240408155330.522792-5-pbonz...@redhat.com>
Signed-off-by: Paolo Bonzini 
---
 tests/unit/meson.build | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/tests/unit/meson.build b/tests/unit/meson.build
index 228a21d03c2..26c109c968c 100644
--- a/tests/unit/meson.build
+++ b/tests/unit/meson.build
@@ -18,7 +18,6 @@ tests = {
   'test-forward-visitor': [testqapi],
   'test-string-input-visitor': [testqapi],
   'test-string-output-visitor': [testqapi],
-  'test-opts-visitor': [testqapi],
   'test-visitor-serialization': [testqapi],
   'test-bitmap': [],
   'test-resv-mem': [],
@@ -46,12 +45,8 @@ tests = {
   'test-qemu-opts': [],
   'test-keyval': [testqapi],
   'test-logging': [],
-  'test-uuid': [],
-  'ptimer-test': ['ptimer-test-stubs.c', meson.project_source_root() / 'hw/core/ptimer.c'],
   'test-qapi-util': [],
   'test-interval-tree': [],
-  'test-xs-node': [qom],
-  'test-virtio-dmabuf': [meson.project_source_root() / 'hw/display/virtio-dmabuf.c'],
 }
 
 if have_system or have_tools
@@ -97,6 +92,8 @@ if have_block
 'test-crypto-ivgen': [io],
 'test-crypto-afsplit': [io],
 'test-crypto-block': [io],
+'test-timed-average': [],
+'test-uuid': [],
   }
   if gnutls.found() and \
  tasn1.found() and \
@@ -131,10 +128,13 @@ endif
 
 if have_system
   tests += {
+'ptimer-test': ['ptimer-test-stubs.c', meson.project_source_root() / 'hw/core/ptimer.c'],
 'test-iov': [],
+'test-opts-visitor': [testqapi],
+'test-xs-node': [qom],
+'test-virtio-dmabuf': [meson.project_source_root() / 'hw/display/virtio-dmabuf.c'],
 'test-qmp-cmds': [testqapi],
 'test-xbzrle': [migration],
-'test-timed-average': [],
 'test-util-sockets': ['socket-helpers.c'],
 'test-base64': [],
 'test-bufferiszero': [],
-- 
2.44.0





[PULL 17/63] stubs: include stubs only if needed

2024-04-23 Thread Paolo Bonzini
Currently it is not documented anywhere why some functions need to
be stubbed.

Group the files in stubs/meson.build according to who needs them, both
to reduce the size of the compilation and to clarify the use of stubs.

Signed-off-by: Paolo Bonzini 
Message-ID: <20240408155330.522792-18-pbonz...@redhat.com>
Signed-off-by: Paolo Bonzini 
---
 stubs/{monitor.c => monitor-internal.c} |   0
 stubs/meson.build   | 122 +++-
 2 files changed, 75 insertions(+), 47 deletions(-)
 rename stubs/{monitor.c => monitor-internal.c} (100%)

diff --git a/stubs/monitor.c b/stubs/monitor-internal.c
similarity index 100%
rename from stubs/monitor.c
rename to stubs/monitor-internal.c
diff --git a/stubs/meson.build b/stubs/meson.build
index 4a524f5816b..8ee1fd57530 100644
--- a/stubs/meson.build
+++ b/stubs/meson.build
@@ -1,58 +1,86 @@
-stub_ss.add(files('bdrv-next-monitor-owned.c'))
-stub_ss.add(files('blk-commit-all.c'))
-stub_ss.add(files('blk-exp-close-all.c'))
-stub_ss.add(files('blockdev-close-all-bdrv-states.c'))
-stub_ss.add(files('change-state-handler.c'))
-stub_ss.add(files('cmos.c'))
+# If possible, add new files to other directories, by using "if_false".
+# If you need them here, try to add them under one of the if statements
+# below, so that it is clear who needs the stubbed functionality.
+
 stub_ss.add(files('cpu-get-clock.c'))
-stub_ss.add(files('cpus-get-virtual-clock.c'))
-stub_ss.add(files('qemu-timer-notify-cb.c'))
-stub_ss.add(files('icount.c'))
-stub_ss.add(files('dump.c'))
-stub_ss.add(files('error-printf.c'))
 stub_ss.add(files('fdset.c'))
-stub_ss.add(files('gdbstub.c'))
-stub_ss.add(files('get-vm-name.c'))
-stub_ss.add(files('graph-lock.c'))
-stub_ss.add(files('hotplug-stubs.c'))
-if linux_io_uring.found()
-  stub_ss.add(files('io_uring.c'))
-endif
 stub_ss.add(files('iothread-lock.c'))
-if have_block
-  stub_ss.add(files('iothread-lock-block.c'))
-endif
 stub_ss.add(files('is-daemonized.c'))
-if libaio.found()
-  stub_ss.add(files('linux-aio.c'))
-endif
-stub_ss.add(files('migr-blocker.c'))
-stub_ss.add(files('monitor.c'))
 stub_ss.add(files('monitor-core.c'))
-stub_ss.add(files('physmem.c'))
-stub_ss.add(files('qemu-timer-notify-cb.c'))
-stub_ss.add(files('qmp-command-available.c'))
-stub_ss.add(files('qmp-quit.c'))
-stub_ss.add(files('qtest.c'))
-stub_ss.add(files('ram-block.c'))
-stub_ss.add(files('replay.c'))
 stub_ss.add(files('replay-mode.c'))
-stub_ss.add(files('runstate-check.c'))
-stub_ss.add(files('sysbus.c'))
-stub_ss.add(files('target-get-monitor-def.c'))
-stub_ss.add(files('target-monitor-defs.c'))
 stub_ss.add(files('trace-control.c'))
-stub_ss.add(files('uuid.c'))
-stub_ss.add(files('vmstate.c'))
-stub_ss.add(files('vm-stop.c'))
-stub_ss.add(files('win32-kbd-hook.c'))
-stub_ss.add(files('cpu-synchronize-state.c'))
-if have_block or have_ga
+
+if have_block
+  stub_ss.add(files('bdrv-next-monitor-owned.c'))
+  stub_ss.add(files('blk-commit-all.c'))
+  stub_ss.add(files('blk-exp-close-all.c'))
+  stub_ss.add(files('blockdev-close-all-bdrv-states.c'))
+  stub_ss.add(files('change-state-handler.c'))
+  stub_ss.add(files('get-vm-name.c'))
+  stub_ss.add(files('iothread-lock-block.c'))
+  stub_ss.add(files('migr-blocker.c'))
+  stub_ss.add(files('physmem.c'))
+  stub_ss.add(files('ram-block.c'))
   stub_ss.add(files('replay-tools.c'))
+  stub_ss.add(files('runstate-check.c'))
+  stub_ss.add(files('uuid.c'))
 endif
-if have_system
-  stub_ss.add(files('fw_cfg.c'))
-  stub_ss.add(files('xen-hw-stub.c'))
-else
+
+if have_block or have_ga
+  # stubs for hooks in util/main-loop.c, util/async.c etc.
+  stub_ss.add(files('cpus-get-virtual-clock.c'))
+  stub_ss.add(files('icount.c'))
+  stub_ss.add(files('graph-lock.c'))
+  if linux_io_uring.found()
+stub_ss.add(files('io_uring.c'))
+  endif
+  if libaio.found()
+stub_ss.add(files('linux-aio.c'))
+  endif
+  stub_ss.add(files('qemu-timer-notify-cb.c'))
+
+  # stubs for monitor
+  stub_ss.add(files('monitor-internal.c'))
+  stub_ss.add(files('qmp-command-available.c'))
+  stub_ss.add(files('qmp-quit.c'))
+endif
+
+if have_block or have_user
+  stub_ss.add(files('qtest.c'))
+  stub_ss.add(files('vm-stop.c'))
+  stub_ss.add(files('vmstate.c'))
+
+  # more symbols provided by the monitor
+  stub_ss.add(files('error-printf.c'))
+endif
+
+if have_user
+  # Symbols that are used by hw/core.
+  stub_ss.add(files('cpu-synchronize-state.c'))
   stub_ss.add(files('qdev.c'))
 endif
+
+if have_system
+  # Symbols that are only needed in some configurations.  Try not
+  # adding more of these.  If the symbol is used in specific_ss,
+  # in particular, consider defining a preprocessor macro via
+  # Kconfig or configs/targets/.
+  stub_ss.add(files('dump.c'))
+  stub_ss.add(files('cmos.c'))
+  stub_ss.add(files('fw_cfg.c'))
+  stub_ss.add(files('target-get-monitor-def.c'))
+  stub_ss.add(files('target-monitor-defs.c'))
+  stub_ss.add(files('win32-kbd-hook.c'))
+  stub_ss.add(files('x

[PULL 10/63] hw/usb: move stubs out of stubs/

2024-04-23 Thread Paolo Bonzini
Since the USB stubs are needed exactly when the Kconfig symbols are not
enabled, they can be placed in hw/usb/ and conditionalized on CONFIG_USB.

Signed-off-by: Paolo Bonzini 
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Richard Henderson 
Message-ID: <20240408155330.522792-11-pbonz...@redhat.com>
Signed-off-by: Paolo Bonzini 
---
 stubs/usb-dev-stub.c => hw/usb/bus-stub.c | 0
 hw/usb/meson.build| 2 +-
 stubs/meson.build | 1 -
 3 files changed, 1 insertion(+), 2 deletions(-)
 rename stubs/usb-dev-stub.c => hw/usb/bus-stub.c (100%)

diff --git a/stubs/usb-dev-stub.c b/hw/usb/bus-stub.c
similarity index 100%
rename from stubs/usb-dev-stub.c
rename to hw/usb/bus-stub.c
diff --git a/hw/usb/meson.build b/hw/usb/meson.build
index aac3bb35f27..23f7f7acb50 100644
--- a/hw/usb/meson.build
+++ b/hw/usb/meson.build
@@ -9,7 +9,7 @@ system_ss.add(when: 'CONFIG_USB', if_true: files(
   'desc-msos.c',
   'libhw.c',
   'pcap.c',
-))
+), if_false: files('bus-stub.c'))
 
 # usb host adapters
 system_ss.add(when: 'CONFIG_USB_UHCI', if_true: files('hcd-uhci.c'))
diff --git a/stubs/meson.build b/stubs/meson.build
index aa7120f7110..45616afbfaa 100644
--- a/stubs/meson.build
+++ b/stubs/meson.build
@@ -56,7 +56,6 @@ endif
 if have_system
   stub_ss.add(files('fw_cfg.c'))
   stub_ss.add(files('semihost.c'))
-  stub_ss.add(files('usb-dev-stub.c'))
   stub_ss.add(files('xen-hw-stub.c'))
   stub_ss.add(files('virtio-md-pci.c'))
 else
-- 
2.44.0





[PULL 01/63] meson: do not link pixman automatically into all targets

2024-04-23 Thread Paolo Bonzini
The dependency on pixman is listed manually in all sourcesets that need it.
There is no need to bring it into libqemuutil, since there is nothing in
util/ that needs pixman either.

Reported-by: Michael Tokarev 
Signed-off-by: Paolo Bonzini 
Reviewed-by: Richard Henderson 
Message-ID: <20240408155330.522792-2-pbonz...@redhat.com>
Signed-off-by: Paolo Bonzini 
---
 meson.build | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/meson.build b/meson.build
index 91a0aa64c64..8c1271b8846 100644
--- a/meson.build
+++ b/meson.build
@@ -3481,7 +3481,7 @@ util_ss = util_ss.apply({})
 libqemuutil = static_library('qemuutil',
  build_by_default: false,
 sources: util_ss.sources() + stub_ss.sources() + genh,
- dependencies: [util_ss.dependencies(), libm, threads, glib, socket, malloc, pixman])
+ dependencies: [util_ss.dependencies(), libm, threads, glib, socket, malloc])
 qemuutil = declare_dependency(link_with: libqemuutil,
   sources: genh + version_res,
   dependencies: [event_loop_base])
-- 
2.44.0





[PULL 02/63] tests: only build plugins if TCG is enabled

2024-04-23 Thread Paolo Bonzini
There is no way to use them for testing, if all the available
accelerators use hardware virtualization.

Signed-off-by: Paolo Bonzini 
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Richard Henderson 
Message-ID: <20240408155330.522792-3-pbonz...@redhat.com>
Signed-off-by: Paolo Bonzini 
---
 tests/meson.build | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tests/meson.build b/tests/meson.build
index 0a6f96f8f84..acb6807094b 100644
--- a/tests/meson.build
+++ b/tests/meson.build
@@ -78,9 +78,9 @@ subdir('decode')
 
 if 'CONFIG_TCG' in config_all_accel
   subdir('fp')
+  subdir('plugin')
 endif
 
-subdir('plugin')
 subdir('unit')
 subdir('qapi-schema')
 subdir('qtest')
-- 
2.44.0





[PULL 00/63] First batch of i386 and build system patch for QEMU 9.1

2024-04-23 Thread Paolo Bonzini
The following changes since commit 62dbe54c24dbf77051bafe1039c31ddc8f37602d:

  Update version for v9.0.0-rc4 release (2024-04-16 18:06:15 +0100)

are available in the Git repository at:

  https://gitlab.com/bonzini/qemu.git tags/for-upstream

for you to fetch changes up to 254fade7854a6b3d5b7c54a4ca74c25bb928da14:

  target/i386/translate.c: always write 32-bits for SGDT and SIDT (2024-04-23 16:08:50 +0200)


* cleanups for stubs
* do not link pixman automatically into all targets
* optimize computation of VGA dirty memory region
* kvm: use configs/ definition to conditionalize debug support
* hw: Add compat machines for 9.1
* target/i386: add guest-phys-bits cpu property
* target/i386: Introduce Icelake-Server-v7 and SierraForest models
* target/i386: Export RFDS bit to guests
* q35: SMM ranges cleanups
* target/i386: basic support for confidential guests
* linux-headers: update headers
* target/i386: SEV: use KVM_SEV_INIT2 if possible
* kvm: Introduce support for memory_attributes
* RAMBlock: Add support of KVM private guest memfd
* Consolidate use of warn_report_once()
* pythondeps.toml: warn about updates needed to docs/requirements.txt
* target/i386: always write 32-bits for SGDT and SIDT


Chao Peng (2):
  kvm: Enable KVM_SET_USER_MEMORY_REGION2 for memslot
  kvm: handle KVM_EXIT_MEMORY_FAULT

Gerd Hoffmann (2):
  target/i386: add guest-phys-bits cpu property
  kvm: add support for guest physical bits

Isaku Yamahata (4):
  pci-host/q35: Move PAM initialization above SMRAM initialization
  q35: Introduce smm_ranges property for q35-pci-host
  kvm/tdx: Don't complain when converting vMMIO region to shared
  kvm/tdx: Ignore memory conversion to shared of unassigned region

Mark Cave-Ayland (1):
  target/i386/translate.c: always write 32-bits for SGDT and SIDT

Michael Roth (4):
  scripts/update-linux-headers: Add setup_data.h to import list
  scripts/update-linux-headers: Add bits.h to file imports
  i386/sev: Add 'legacy-vm-type' parameter for SEV guest objects
  hw/i386/sev: Use legacy SEV VM types for older machine types

Paolo Bonzini (28):
  meson: do not link pixman automatically into all targets
  tests: only build plugins if TCG is enabled
  tests/unit: match some unit tests to corresponding feature switches
  yank: only build if needed
  hw/core: Move system emulation files to system_ss
  stubs: remove obsolete stubs
  hw/usb: move stubs out of stubs/
  hw/virtio: move stubs out of stubs/
  semihosting: move stubs out of stubs/
  ramfb: move stubs out of stubs/
  memory-device: move stubs out of stubs/
  colo: move stubs out of stubs/
  stubs: split record/replay stubs further
  stubs: include stubs only if needed
  stubs: move monitor_fdsets_cleanup with other fdset stubs
  vga: optimize computation of dirty memory region
  vga: move dirty memory region code together
  kvm: use configs/ definition to conditionalize debug support
  hw: Add compat machines for 9.1
  linux-headers: update to current kvm/next
  runstate: skip initial CPU reset if reset is not actually possible
  KVM: track whether guest state is encrypted
  KVM: remove kvm_arch_cpu_check_are_resettable
  target/i386: introduce x86-confidential-guest
  target/i386: Implement mc->kvm_type() to get VM type
  target/i386: SEV: use KVM_SEV_INIT2 if possible
  RAMBlock: make guest_memfd require uncoordinated discard
  pythondeps.toml: warn about updates needed to docs/requirements.txt

Pawan Gupta (1):
  target/i386: Export RFDS bit to guests

Philippe Mathieu-Daudé (3):
  ebpf: Restrict to system emulation
  util/qemu-config: Extract QMP commands to qemu-config-qmp.c
  hw: Include minimal source set in user emulation build

Sean Christopherson (1):
  i386/kvm: Move architectural CPUID leaf generation to separate helper

Tao Su (1):
  target/i386: Add new CPU model SierraForest

Xiaoyao Li (11):
  hw/i386/acpi: Set PCAT_COMPAT bit only when pic is not disabled
  confidential guest support: Add kvm_init() and kvm_reset() in class
  i386/sev: Switch to use confidential_guest_kvm_init()
  ppc/pef: switch to use confidential_guest_kvm_init/reset()
  s390: Switch to use confidential_guest_kvm_init()
  trace/kvm: Split address space and slot id in trace_kvm_set_user_memory()
  kvm: Introduce support for memory_attributes
  RAMBlock: Add support of KVM private guest memfd
  kvm/memory: Make memory type private by default if it has guest memfd backend
  HostMem: Add mechanism to opt in kvm guest memfd via MachineState
  physmem: Introduce ram_block_discard_guest_memfd_range()

Zhao Liu (4):
  target/i386/host-cpu: Consolidate the use of warn_report_once()
  target/i386/

[PATCH 14/22] openrisc: switch boards to "default y"

2024-04-23 Thread Paolo Bonzini
Some targets use "default y" for boards to filter out those that require
TCG.  For consistency we are switching all other targets to do the same.
Continue with OpenRISC.

No changes to generated config-devices.mak file.

Signed-off-by: Paolo Bonzini 
---
 configs/devices/or1k-softmmu/default.mak | 5 ++---
 hw/openrisc/Kconfig  | 4 
 2 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/configs/devices/or1k-softmmu/default.mak b/configs/devices/or1k-softmmu/default.mak
index 3aecdf9d738..efe3bc278bc 100644
--- a/configs/devices/or1k-softmmu/default.mak
+++ b/configs/devices/or1k-softmmu/default.mak
@@ -5,6 +5,5 @@
 # CONFIG_TEST_DEVICES=n
 
 # Boards:
-#
-CONFIG_OR1K_SIM=y
-CONFIG_OR1K_VIRT=y
+# CONFIG_OR1K_SIM=n
+# CONFIG_OR1K_VIRT=n
diff --git a/hw/openrisc/Kconfig b/hw/openrisc/Kconfig
index 97af258b556..9c9015e0a5d 100644
--- a/hw/openrisc/Kconfig
+++ b/hw/openrisc/Kconfig
@@ -1,5 +1,7 @@
 config OR1K_SIM
 bool
+default y
+depends on OPENRISC
 select SERIAL
 select OPENCORES_ETH
 select OMPIC
@@ -7,6 +9,8 @@ config OR1K_SIM
 
 config OR1K_VIRT
 bool
+default y
+depends on OPENRISC
 imply PCI_DEVICES
 imply VIRTIO_VGA
 imply TEST_DEVICES
-- 
2.44.0




[PATCH 19/22] sh4: switch boards to "default y"

2024-04-23 Thread Paolo Bonzini
Some targets use "default y" for boards to filter out those that require
TCG.  For consistency we are switching all other targets to do the same.
Continue with SH.

No changes to generated config-devices.mak file.

Signed-off-by: Paolo Bonzini 
---
 configs/devices/sh4-softmmu/default.mak | 7 +++
 hw/sh4/Kconfig  | 4 
 2 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/configs/devices/sh4-softmmu/default.mak b/configs/devices/sh4-softmmu/default.mak
index 565e8b0b5df..c06a427053a 100644
--- a/configs/devices/sh4-softmmu/default.mak
+++ b/configs/devices/sh4-softmmu/default.mak
@@ -5,7 +5,6 @@
 #CONFIG_PCI_DEVICES=n
 #CONFIG_TEST_DEVICES=n
 
-# Boards:
-#
-CONFIG_R2D=y
-CONFIG_SHIX=y
+# Boards are selected by default, uncomment to keep out of the build.
+# CONFIG_R2D=n
+# CONFIG_SHIX=n
diff --git a/hw/sh4/Kconfig b/hw/sh4/Kconfig
index e0c4ecd1a53..99a76a94c3f 100644
--- a/hw/sh4/Kconfig
+++ b/hw/sh4/Kconfig
@@ -1,5 +1,7 @@
 config R2D
 bool
+default y
+depends on SH4
 imply PCI_DEVICES
 imply TEST_DEVICES
 imply RTL8139_PCI
@@ -13,6 +15,8 @@ config R2D
 
 config SHIX
 bool
+default y
+depends on SH4
 select SH7750
 select TC58128
 
-- 
2.44.0




[PATCH 05/22] cris: switch boards to "default y"

2024-04-23 Thread Paolo Bonzini
Some targets use "default y" for boards to filter out those that require
TCG.  For consistency we are switching all other targets to do the same.
Continue with CRIS.

No changes to generated config-devices.mak file.

Signed-off-by: Paolo Bonzini 
---
 configs/devices/cris-softmmu/default.mak | 5 ++---
 hw/cris/Kconfig  | 2 ++
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/configs/devices/cris-softmmu/default.mak b/configs/devices/cris-softmmu/default.mak
index 5932cf4d06f..ff73cd40847 100644
--- a/configs/devices/cris-softmmu/default.mak
+++ b/configs/devices/cris-softmmu/default.mak
@@ -1,5 +1,4 @@
 # Default configuration for cris-softmmu
 
-# Boards:
-#
-CONFIG_AXIS=y
+# Boards are selected by default, uncomment to keep out of the build.
+# CONFIG_AXIS=n
diff --git a/hw/cris/Kconfig b/hw/cris/Kconfig
index 884ad2cbc0d..26c7eef7437 100644
--- a/hw/cris/Kconfig
+++ b/hw/cris/Kconfig
@@ -1,5 +1,7 @@
 config AXIS
 bool
+default y
+depends on CRIS
 select ETRAXFS
 select PFLASH_CFI02
 select NAND
-- 
2.44.0




[PATCH 11/22] meson: make target endianness available to Kconfig

2024-04-23 Thread Paolo Bonzini
Some targets use "default y" for boards to filter out those that require
TCG.  For consistency we are switching all other targets to do the same.
MIPS boards may only be available for big-endian or only for
little-endian emulators; add a symbol so that this can be described
with a "depends on" clause.

Signed-off-by: Paolo Bonzini 
---
 meson.build| 12 +++-
 target/Kconfig |  3 +++
 2 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/meson.build b/meson.build
index 9af60550753..9c4fb027853 100644
--- a/meson.build
+++ b/meson.build
@@ -3037,7 +3037,7 @@ foreach target : target_dirs
 }
   endif
 
-  accel_kconfig = []
+  target_kconfig = []
   foreach sym: accelerators
 if sym == 'CONFIG_TCG' or target in accelerator_targets.get(sym, [])
   config_target += { sym: 'y' }
@@ -3047,10 +3047,10 @@ foreach target : target_dirs
   else
 config_target += { 'CONFIG_TCG_BUILTIN': 'y' }
   endif
-  accel_kconfig += [ sym + '=y' ]
+  target_kconfig += [ sym + '=y' ]
 endif
   endforeach
-  if accel_kconfig.length() == 0
+  if target_kconfig.length() == 0
 if default_targets
   continue
 endif
@@ -3110,6 +3110,9 @@ foreach target : target_dirs
 configuration: config_target_data)}
 
   if target.endswith('-softmmu')
+target_kconfig += 'CONFIG_' + config_target['TARGET_ARCH'].to_upper() + '=y'
+target_kconfig += 'CONFIG_TARGET_BIG_ENDIAN=' + config_target['TARGET_BIG_ENDIAN']
+
 config_input = meson.get_external_property(target, 'default')
 config_devices_mak = target + '-config-devices.mak'
 config_devices_mak = configure_file(
@@ -3120,8 +3123,7 @@ foreach target : target_dirs
   command: [minikconf,
 get_option('default_devices') ? '--defconfig' : '--allnoconfig',
 config_devices_mak, '@DEPFILE@', '@INPUT@',
-host_kconfig, accel_kconfig,
-'CONFIG_' + config_target['TARGET_ARCH'].to_upper() + '=y'])
+host_kconfig, target_kconfig])
 
 config_devices_data = configuration_data()
 config_devices = keyval.load(config_devices_mak)
diff --git a/target/Kconfig b/target/Kconfig
index 83da0bd2938..afc00dea30c 100644
--- a/target/Kconfig
+++ b/target/Kconfig
@@ -18,3 +18,6 @@ source sh4/Kconfig
 source sparc/Kconfig
 source tricore/Kconfig
 source xtensa/Kconfig
+
+config TARGET_BIG_ENDIAN
+bool
-- 
2.44.0




[PATCH 04/22] avr: switch boards to "default y"

2024-04-23 Thread Paolo Bonzini
Some targets use "default y" for boards to filter out those that require
TCG.  For consistency we are switching all other targets to do the same.
Continue with AVR.

No changes to generated config-devices.mak file.

Signed-off-by: Paolo Bonzini 
---
 configs/devices/avr-softmmu/default.mak | 5 ++---
 hw/avr/Kconfig  | 3 +++
 2 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/configs/devices/avr-softmmu/default.mak b/configs/devices/avr-softmmu/default.mak
index 80218add98c..4207e7b3ce2 100644
--- a/configs/devices/avr-softmmu/default.mak
+++ b/configs/devices/avr-softmmu/default.mak
@@ -1,5 +1,4 @@
 # Default configuration for avr-softmmu
 
-# Boards:
-#
-CONFIG_ARDUINO=y
+# Boards are selected by default, uncomment to keep out of the build.
+# CONFIG_ARDUINO=n
diff --git a/hw/avr/Kconfig b/hw/avr/Kconfig
index d31298c3cce..b29937be414 100644
--- a/hw/avr/Kconfig
+++ b/hw/avr/Kconfig
@@ -5,5 +5,8 @@ config AVR_ATMEGA_MCU
 select AVR_POWER
 
 config ARDUINO
+bool
+default y
+depends on AVR
 select AVR_ATMEGA_MCU
 select UNIMP
-- 
2.44.0




[PATCH 09/22] m68k: switch boards to "default y"

2024-04-23 Thread Paolo Bonzini
Some targets use "default y" for boards to filter out those that require
TCG.  For consistency we are switching all other targets to do the same.
Continue with m68k.

No changes to generated config-devices.mak file.

Signed-off-by: Paolo Bonzini 
---
 configs/devices/m68k-softmmu/default.mak | 13 ++---
 hw/m68k/Kconfig  | 10 ++
 2 files changed, 16 insertions(+), 7 deletions(-)

diff --git a/configs/devices/m68k-softmmu/default.mak b/configs/devices/m68k-softmmu/default.mak
index 8dcaa28ed38..3ceda6b041b 100644
--- a/configs/devices/m68k-softmmu/default.mak
+++ b/configs/devices/m68k-softmmu/default.mak
@@ -1,9 +1,8 @@
 # Default configuration for m68k-softmmu
 
-# Boards:
-#
-CONFIG_AN5206=y
-CONFIG_MCF5208=y
-CONFIG_NEXTCUBE=y
-CONFIG_Q800=y
-CONFIG_M68K_VIRT=y
+# Boards are selected by default, uncomment to keep out of the build.
+# CONFIG_AN5206=n
+# CONFIG_MCF5208=n
+# CONFIG_NEXTCUBE=n
+# CONFIG_Q800=n
+# CONFIG_M68K_VIRT=n
diff --git a/hw/m68k/Kconfig b/hw/m68k/Kconfig
index d88741ec9d1..0092cda4e9c 100644
--- a/hw/m68k/Kconfig
+++ b/hw/m68k/Kconfig
@@ -1,20 +1,28 @@
 config AN5206
 bool
+default y
+depends on M68K
 select COLDFIRE
 select PTIMER
 
 config MCF5208
 bool
+default y
+depends on M68K
 select COLDFIRE
 select PTIMER
 
 config NEXTCUBE
 bool
+default y
+depends on M68K
 select FRAMEBUFFER
 select ESCC
 
 config Q800
 bool
+default y
+depends on M68K
 select MAC_VIA
 select NUBUS
 select MACFB
@@ -29,6 +37,8 @@ config Q800
 
 config M68K_VIRT
 bool
+default y
+depends on M68K
 select M68K_IRQC
 select VIRT_CTRL
 select GOLDFISH_PIC
-- 
2.44.0




[PATCH 10/22] microblaze: switch boards to "default y"

2024-04-23 Thread Paolo Bonzini
Some targets use "default y" for boards to filter out those that require
TCG.  For consistency we are switching all other targets to do the same.
Continue with Microblaze.

No changes to generated config-devices.mak file.

Signed-off-by: Paolo Bonzini 
---
 configs/devices/microblaze-softmmu/default.mak | 9 -
 hw/microblaze/Kconfig  | 6 ++
 2 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/configs/devices/microblaze-softmmu/default.mak b/configs/devices/microblaze-softmmu/default.mak
index db8c6e4bba3..583e3959bb7 100644
--- a/configs/devices/microblaze-softmmu/default.mak
+++ b/configs/devices/microblaze-softmmu/default.mak
@@ -1,7 +1,6 @@
 # Default configuration for microblaze-softmmu
 
-# Boards:
-#
-CONFIG_PETALOGIX_S3ADSP1800=y
-CONFIG_PETALOGIX_ML605=y
-CONFIG_XLNX_ZYNQMP_PMU=y
+# Boards are selected by default, uncomment to keep out of the build.
+# CONFIG_PETALOGIX_S3ADSP1800=n
+# CONFIG_PETALOGIX_ML605=n
+# CONFIG_XLNX_ZYNQMP_PMU=n
diff --git a/hw/microblaze/Kconfig b/hw/microblaze/Kconfig
index e2697ced9cc..d78ba843fac 100644
--- a/hw/microblaze/Kconfig
+++ b/hw/microblaze/Kconfig
@@ -1,5 +1,7 @@
 config PETALOGIX_S3ADSP1800
 bool
+default y
+depends on MICROBLAZE
 select PFLASH_CFI01
 select XILINX
 select XILINX_AXI
@@ -8,6 +10,8 @@ config PETALOGIX_S3ADSP1800
 
 config PETALOGIX_ML605
 bool
+default y
+depends on MICROBLAZE
 select PFLASH_CFI01
 select SERIAL
 select SSI_M25P80
@@ -18,4 +22,6 @@ config PETALOGIX_ML605
 
 config XLNX_ZYNQMP_PMU
 bool
+default y
+depends on MICROBLAZE
 select XLNX_ZYNQMP
-- 
2.44.0




[PATCH 18/22] s390x: switch boards to "default y"

2024-04-23 Thread Paolo Bonzini
Some targets use "default y" for boards to filter out those that require
TCG.  For consistency we are switching all other targets to do the same.
Continue with s390.

No changes to generated config-devices.mak file.

Signed-off-by: Paolo Bonzini 
---
 configs/devices/s390x-softmmu/default.mak | 5 ++---
 hw/s390x/Kconfig  | 2 ++
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/configs/devices/s390x-softmmu/default.mak b/configs/devices/s390x-softmmu/default.mak
index 6d87bc8b4b0..340c1092922 100644
--- a/configs/devices/s390x-softmmu/default.mak
+++ b/configs/devices/s390x-softmmu/default.mak
@@ -9,6 +9,5 @@
 #CONFIG_WDT_DIAG288=n
 #CONFIG_PCIE_DEVICES=n
 
-# Boards:
-#
-CONFIG_S390_CCW_VIRTIO=y
+# Boards are selected by default, uncomment to keep out of the build.
+# CONFIG_S390_CCW_VIRTIO=n
diff --git a/hw/s390x/Kconfig b/hw/s390x/Kconfig
index 26ad1044858..3bbf4ae56e4 100644
--- a/hw/s390x/Kconfig
+++ b/hw/s390x/Kconfig
@@ -1,5 +1,7 @@
 config S390_CCW_VIRTIO
 bool
+default y
+depends on S390X
 imply VIRTIO_PCI
 imply TERMINAL3270
 imply VFIO_AP
-- 
2.44.0




[PATCH 07/22] i386: switch boards to "default y"

2024-04-23 Thread Paolo Bonzini
Some targets use "default y" for boards to filter out those that require
TCG.  For consistency we are switching all other targets to do the same.
Continue with i386.

No changes to generated config-devices.mak files, other than
adding CONFIG_I386 to the x86_64-softmmu target.

Signed-off-by: Paolo Bonzini 
---
 configs/devices/i386-softmmu/default.mak | 11 +--
 hw/i386/Kconfig  | 10 +-
 target/i386/Kconfig  |  1 +
 3 files changed, 15 insertions(+), 7 deletions(-)

diff --git a/configs/devices/i386-softmmu/default.mak b/configs/devices/i386-softmmu/default.mak
index 598c6646dfc..448e3e3b1ba 100644
--- a/configs/devices/i386-softmmu/default.mak
+++ b/configs/devices/i386-softmmu/default.mak
@@ -24,9 +24,8 @@
 #CONFIG_VTD=n
 #CONFIG_SGX=n
 
-# Boards:
-#
-CONFIG_ISAPC=y
-CONFIG_I440FX=y
-CONFIG_Q35=y
-CONFIG_MICROVM=y
+# Boards are selected by default, uncomment to keep out of the build.
+# CONFIG_ISAPC=n
+# CONFIG_I440FX=n
+# CONFIG_Q35=n
+# CONFIG_MICROVM=n
diff --git a/hw/i386/Kconfig b/hw/i386/Kconfig
index a6ee052f9a1..4362164962c 100644
--- a/hw/i386/Kconfig
+++ b/hw/i386/Kconfig
@@ -66,6 +66,8 @@ config PC_ACPI
 
 config I440FX
 bool
+default y
+depends on I386
 imply E1000_PCI
 imply VMPORT
 imply VMMOUSE
@@ -81,6 +83,8 @@ config I440FX
 
 config ISAPC
 bool
+default y
+depends on I386
 imply VGA_ISA
 select ISA_BUS
 select PC
@@ -91,6 +95,8 @@ config ISAPC
 
 config Q35
 bool
+default y
+depends on I386
 imply VTD
 imply AMD_IOMMU
 imply E1000E_PCI_EXPRESS
@@ -108,6 +114,8 @@ config Q35
 
 config MICROVM
 bool
+default y
+depends on I386
 select SERIAL_ISA # for serial_hds_isa_init()
 select ISA_BUS
 select APIC
@@ -142,4 +150,4 @@ config VMMOUSE
 config XEN_EMU
 bool
 default y
-depends on KVM && (I386 || X86_64)
+depends on KVM && I386
diff --git a/target/i386/Kconfig b/target/i386/Kconfig
index ce6968906ee..3e62fdc7064 100644
--- a/target/i386/Kconfig
+++ b/target/i386/Kconfig
@@ -3,3 +3,4 @@ config I386
 
 config X86_64
 bool
+select I386
-- 
2.44.0




[PATCH 15/22] ppc: switch boards to "default y"

2024-04-23 Thread Paolo Bonzini
Some targets use "default y" for boards to filter out those that require
TCG.  For consistency we are switching all other targets to do the same.
Continue with PowerPC/POWER.

No changes to generated config-devices.mak files, other than
adding CONFIG_PPC to the ppc64-softmmu target.

Signed-off-by: Paolo Bonzini 
---
 configs/devices/ppc-softmmu/default.mak   | 26 ---
 configs/devices/ppc64-softmmu/default.mak |  8 +++
 hw/ppc/Kconfig| 26 +++
 target/ppc/Kconfig|  1 +
 4 files changed, 44 insertions(+), 17 deletions(-)

diff --git a/configs/devices/ppc-softmmu/default.mak b/configs/devices/ppc-softmmu/default.mak
index 3061b26749a..460d15e676b 100644
--- a/configs/devices/ppc-softmmu/default.mak
+++ b/configs/devices/ppc-softmmu/default.mak
@@ -4,22 +4,24 @@
 # CONFIG_PCI_DEVICES=n
 # CONFIG_TEST_DEVICES=n
 
-# For embedded PPCs:
-CONFIG_E500PLAT=y
-CONFIG_MPC8544DS=y
-CONFIG_PPC405=y
-CONFIG_PPC440=y
-CONFIG_VIRTEX=y
+# Boards are selected by default, uncomment to keep out of the build.
+
+# Embedded PPCs:
+# CONFIG_E500PLAT=n
+# CONFIG_MPC8544DS=n
+# CONFIG_PPC405=n
+# CONFIG_PPC440=n
+# CONFIG_VIRTEX=n
 
 # For Sam460ex
-CONFIG_SAM460EX=y
+# CONFIG_SAM460EX=n
 
 # For Macs
-CONFIG_MAC_OLDWORLD=y
-CONFIG_MAC_NEWWORLD=y
+# CONFIG_MAC_OLDWORLD=n
+# CONFIG_MAC_NEWWORLD=n
 
-CONFIG_AMIGAONE=y
-CONFIG_PEGASOS2=y
+# CONFIG_AMIGAONE=n
+# CONFIG_PEGASOS2=n
 
 # For PReP
-CONFIG_PREP=y
+# CONFIG_PREP=n
diff --git a/configs/devices/ppc64-softmmu/default.mak b/configs/devices/ppc64-softmmu/default.mak
index b90e5bf4558..e8ad2603133 100644
--- a/configs/devices/ppc64-softmmu/default.mak
+++ b/configs/devices/ppc64-softmmu/default.mak
@@ -3,8 +3,6 @@
 # Include all 32-bit boards
 include ../ppc-softmmu/default.mak
 
-# For PowerNV
-CONFIG_POWERNV=y
-
-# For pSeries
-CONFIG_PSERIES=y
+# Boards are selected by default, uncomment to keep out of the build.
+# CONFIG_POWERNV=n
+# CONFIG_PSERIES=n
diff --git a/hw/ppc/Kconfig b/hw/ppc/Kconfig
index 37ccf9cdcaf..78f83e78ce5 100644
--- a/hw/ppc/Kconfig
+++ b/hw/ppc/Kconfig
@@ -1,5 +1,7 @@
 config PSERIES
 bool
+default y
+depends on PPC64
 imply USB_OHCI_PCI
 imply PCI_DEVICES
 imply TEST_DEVICES
@@ -23,6 +25,8 @@ config SPAPR_RNG
 
 config POWERNV
 bool
+default y
+depends on PPC64
 imply PCI_DEVICES
 imply TEST_DEVICES
 select ISA_IPMI_BT
@@ -38,6 +42,8 @@ config POWERNV
 
 config PPC405
 bool
+default y
+depends on PPC
 select M48T59
 select PFLASH_CFI02
 select PPC4XX
@@ -45,6 +51,8 @@ config PPC405
 
 config PPC440
 bool
+default y
+depends on PPC
 imply PCI_DEVICES
 imply TEST_DEVICES
 imply E1000_PCI
@@ -62,6 +70,8 @@ config PPC4XX
 
 config SAM460EX
 bool
+default y
+depends on PPC
 select PFLASH_CFI01
 select IDE_SII3112
 select M41T80
@@ -75,6 +85,8 @@ config SAM460EX
 
 config AMIGAONE
 bool
+default y
+depends on PPC
 imply ATI_VGA
 select ARTICIA
 select VT82C686
@@ -82,6 +94,8 @@ config AMIGAONE
 
 config PEGASOS2
 bool
+default y
+depends on PPC
 imply ATI_VGA
 select MV64361
 select VT82C686
@@ -90,6 +104,8 @@ config PEGASOS2
 
 config PREP
 bool
+default y
+depends on PPC
 imply PCI_DEVICES
 imply TEST_DEVICES
 select CS4231A
@@ -106,6 +122,8 @@ config RS6000_MC
 
 config MAC_OLDWORLD
 bool
+default y
+depends on PPC
 imply PCI_DEVICES
 imply SUNGEM
 imply TEST_DEVICES
@@ -117,6 +135,8 @@ config MAC_OLDWORLD
 
 config MAC_NEWWORLD
 bool
+default y
+depends on PPC
 imply PCI_DEVICES
 imply SUNGEM
 imply TEST_DEVICES
@@ -147,14 +167,20 @@ config E500
 
 config E500PLAT
 bool
+default y
+depends on PPC
 select E500
 
 config MPC8544DS
 bool
+default y
+depends on PPC
 select E500
 
 config VIRTEX
 bool
+default y
+depends on PPC
 select PPC4XX
 select PFLASH_CFI01
 select SERIAL
diff --git a/target/ppc/Kconfig b/target/ppc/Kconfig
index 3ff152051a3..0283711673e 100644
--- a/target/ppc/Kconfig
+++ b/target/ppc/Kconfig
@@ -3,3 +3,4 @@ config PPC
 
 config PPC64
 bool
+select PPC
-- 
2.44.0




[PATCH 21/22] tricore: switch boards to "default y"

2024-04-23 Thread Paolo Bonzini
Some targets use "default y" for boards to filter out those that require
TCG.  For consistency we are switching all other targets to do the same.
Continue with TriCore.

No changes to generated config-devices.mak file.

Signed-off-by: Paolo Bonzini 
---
 configs/devices/tricore-softmmu/default.mak | 7 +--
 hw/tricore/Kconfig  | 4 
 2 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/configs/devices/tricore-softmmu/default.mak b/configs/devices/tricore-softmmu/default.mak
index cb8fc286eb2..c7ab542244b 100644
--- a/configs/devices/tricore-softmmu/default.mak
+++ b/configs/devices/tricore-softmmu/default.mak
@@ -1,2 +1,5 @@
-CONFIG_TRICORE_TESTBOARD=y
-CONFIG_TRIBOARD=y
+# Default configuration for tricore-softmmu
+
+# Boards are selected by default, uncomment to keep out of the build.
+# CONFIG_TRICORE_TESTBOARD=n
+# CONFIG_TRIBOARD=n
diff --git a/hw/tricore/Kconfig b/hw/tricore/Kconfig
index 33c1e852c33..6c04f64949d 100644
--- a/hw/tricore/Kconfig
+++ b/hw/tricore/Kconfig
@@ -1,8 +1,12 @@
 config TRICORE_TESTBOARD
+default y
+depends on TRICORE
 bool
 
 config TRIBOARD
 bool
+default y
+depends on TRICORE
 select TC27X_SOC
 
 config TC27X_SOC
-- 
2.44.0




[PATCH 22/22] xtensa: switch boards to "default y"

2024-04-23 Thread Paolo Bonzini
Some targets use "default y" for boards to filter out those that require
TCG.  For consistency we are switching all other targets to do the same.
Continue with Xtensa.

No changes to generated config-devices.mak file.

Signed-off-by: Paolo Bonzini 
---
 configs/devices/xtensa-softmmu/default.mak | 9 -
 hw/xtensa/Kconfig  | 6 ++
 2 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/configs/devices/xtensa-softmmu/default.mak b/configs/devices/xtensa-softmmu/default.mak
index f650cad7609..fbc3079a943 100644
--- a/configs/devices/xtensa-softmmu/default.mak
+++ b/configs/devices/xtensa-softmmu/default.mak
@@ -4,8 +4,7 @@
 #
 #CONFIG_PCI_DEVICES=n
 
-# Boards:
-#
-CONFIG_XTENSA_SIM=y
-CONFIG_XTENSA_VIRT=y
-CONFIG_XTENSA_XTFPGA=y
+# Boards are selected by default, uncomment to keep out of the build.
+# CONFIG_XTENSA_SIM=n
+# CONFIG_XTENSA_VIRT=n
+# CONFIG_XTENSA_XTFPGA=n
diff --git a/hw/xtensa/Kconfig b/hw/xtensa/Kconfig
index 0740657ea58..443b415c2ba 100644
--- a/hw/xtensa/Kconfig
+++ b/hw/xtensa/Kconfig
@@ -1,14 +1,20 @@
 config XTENSA_SIM
+default y
+depends on XTENSA
 bool
 
 config XTENSA_VIRT
 bool
+default y
+depends on XTENSA
 select XTENSA_SIM
 select PCI_EXPRESS_GENERIC_BRIDGE
 select PCI_DEVICES
 
 config XTENSA_XTFPGA
 bool
+default y
+depends on XTENSA
 select OPENCORES_ETH
 select PFLASH_CFI01
 select SERIAL
-- 
2.44.0




[PATCH 16/22] riscv: switch boards to "default y"

2024-04-23 Thread Paolo Bonzini
Some targets use "default y" for boards to filter out those that require
TCG.  For consistency we are switching all other targets to do the same.
Continue with RISC-V.

No changes to generated config-devices.mak file.

Signed-off-by: Paolo Bonzini 
---
 configs/devices/riscv32-softmmu/default.mak | 13 ++---
 configs/devices/riscv64-softmmu/default.mak | 15 +++
 hw/riscv/Kconfig| 14 ++
 3 files changed, 27 insertions(+), 15 deletions(-)

diff --git a/configs/devices/riscv32-softmmu/default.mak b/configs/devices/riscv32-softmmu/default.mak
index 07e4fd26df3..c2cd86ce05f 100644
--- a/configs/devices/riscv32-softmmu/default.mak
+++ b/configs/devices/riscv32-softmmu/default.mak
@@ -4,10 +4,9 @@
 # CONFIG_PCI_DEVICES=n
 # CONFIG_TEST_DEVICES=n
 
-# Boards:
-#
-CONFIG_SPIKE=y
-CONFIG_SIFIVE_E=y
-CONFIG_SIFIVE_U=y
-CONFIG_RISCV_VIRT=y
-CONFIG_OPENTITAN=y
+# Boards are selected by default, uncomment to keep out of the build.
+# CONFIG_SPIKE=n
+# CONFIG_SIFIVE_E=n
+# CONFIG_SIFIVE_U=n
+# CONFIG_RISCV_VIRT=n
+# CONFIG_OPENTITAN=n
diff --git a/configs/devices/riscv64-softmmu/default.mak b/configs/devices/riscv64-softmmu/default.mak
index 221963d4c5c..39ed3a0061a 100644
--- a/configs/devices/riscv64-softmmu/default.mak
+++ b/configs/devices/riscv64-softmmu/default.mak
@@ -4,11 +4,10 @@
 # CONFIG_PCI_DEVICES=n
 # CONFIG_TEST_DEVICES=n
 
-# Boards:
-#
-CONFIG_SPIKE=y
-CONFIG_SIFIVE_E=y
-CONFIG_SIFIVE_U=y
-CONFIG_RISCV_VIRT=y
-CONFIG_MICROCHIP_PFSOC=y
-CONFIG_SHAKTI_C=y
+# Boards are selected by default, uncomment to keep out of the build.
+# CONFIG_SPIKE=n
+# CONFIG_SIFIVE_E=n
+# CONFIG_SIFIVE_U=n
+# CONFIG_RISCV_VIRT=n
+# CONFIG_MICROCHIP_PFSOC=n
+# CONFIG_SHAKTI_C=n
diff --git a/hw/riscv/Kconfig b/hw/riscv/Kconfig
index 5d644eb7b16..b2955a8ae77 100644
--- a/hw/riscv/Kconfig
+++ b/hw/riscv/Kconfig
@@ -8,6 +8,8 @@ config IBEX
 
 config MICROCHIP_PFSOC
 bool
+default y
+depends on RISCV64
 select CADENCE_SDHCI
 select MCHP_PFSOC_DMC
 select MCHP_PFSOC_IOSCB
@@ -20,12 +22,16 @@ config MICROCHIP_PFSOC
 
 config OPENTITAN
 bool
+default y
+depends on RISCV32
 select IBEX
 select SIFIVE_PLIC
 select UNIMP
 
 config RISCV_VIRT
 bool
+default y
+depends on RISCV32 || RISCV64
 imply PCI_DEVICES
 imply VIRTIO_VGA
 imply TEST_DEVICES
@@ -50,6 +56,8 @@ config RISCV_VIRT
 
 config SHAKTI_C
 bool
+default y
+depends on RISCV64
 select RISCV_ACLINT
 select SHAKTI_UART
 select SIFIVE_PLIC
@@ -57,6 +65,8 @@ config SHAKTI_C
 
 config SIFIVE_E
 bool
+default y
+depends on RISCV32 || RISCV64
 select RISCV_ACLINT
 select SIFIVE_GPIO
 select SIFIVE_PLIC
@@ -67,6 +77,8 @@ config SIFIVE_E
 
 config SIFIVE_U
 bool
+default y
+depends on RISCV32 || RISCV64
 select CADENCE
 select RISCV_ACLINT
 select SIFIVE_GPIO
@@ -83,6 +95,8 @@ config SIFIVE_U
 
 config SPIKE
 bool
+default y
+depends on RISCV32 || RISCV64
 select RISCV_NUMA
 select HTIF
 select RISCV_ACLINT
-- 
2.44.0




[PATCH 13/22] nios2: switch boards to "default y"

2024-04-23 Thread Paolo Bonzini
Some targets use "default y" for boards to filter out those that require
TCG.  For consistency we are switching all other targets to do the same.
Continue with Nios2.

No changes to generated config-devices.mak file.

Signed-off-by: Paolo Bonzini 
---
 configs/devices/nios2-softmmu/default.mak | 7 +++
 hw/nios2/Kconfig  | 9 -
 2 files changed, 7 insertions(+), 9 deletions(-)

diff --git a/configs/devices/nios2-softmmu/default.mak b/configs/devices/nios2-softmmu/default.mak
index e130d024e62..50a68d26b0f 100644
--- a/configs/devices/nios2-softmmu/default.mak
+++ b/configs/devices/nios2-softmmu/default.mak
@@ -1,6 +1,5 @@
 # Default configuration for nios2-softmmu
 
-# Boards:
-#
-CONFIG_NIOS2_10M50=y
-CONFIG_NIOS2_GENERIC_NOMMU=y
+# Boards are selected by default, uncomment to keep out of the build.
+# CONFIG_NIOS2_10M50=n
+# CONFIG_NIOS2_GENERIC_NOMMU=n
diff --git a/hw/nios2/Kconfig b/hw/nios2/Kconfig
index 4748ae27b67..ab7866a5358 100644
--- a/hw/nios2/Kconfig
+++ b/hw/nios2/Kconfig
@@ -1,13 +1,12 @@
 config NIOS2_10M50
 bool
-select NIOS2
+default y
+depends on NIOS2
 select SERIAL
 select ALTERA_TIMER
 select NIOS2_VIC
 
 config NIOS2_GENERIC_NOMMU
 bool
-select NIOS2
-
-config NIOS2
-bool
+default y
+depends on NIOS2
-- 
2.44.0




[PATCH 20/22] sparc: switch boards to "default y"

2024-04-23 Thread Paolo Bonzini
Some targets use "default y" for boards to filter out those that require
TCG.  For consistency we are switching all other targets to do the same.
Continue with SPARC and SPARC64.

No changes to generated config-devices.mak file.

Signed-off-by: Paolo Bonzini 
---
 configs/devices/sparc-softmmu/default.mak   | 7 +++
 configs/devices/sparc64-softmmu/default.mak | 7 +++
 hw/sparc/Kconfig| 4 
 hw/sparc64/Kconfig  | 4 
 4 files changed, 14 insertions(+), 8 deletions(-)

diff --git a/configs/devices/sparc-softmmu/default.mak b/configs/devices/sparc-softmmu/default.mak
index ee852181151..87668fda5ea 100644
--- a/configs/devices/sparc-softmmu/default.mak
+++ b/configs/devices/sparc-softmmu/default.mak
@@ -5,7 +5,6 @@
 #CONFIG_TCX=n
 #CONFIG_CG3=n
 
-# Boards:
-#
-CONFIG_SUN4M=y
-CONFIG_LEON3=y
+# Boards are selected by default, uncomment to keep out of the build.
+# CONFIG_SUN4M=n
+# CONFIG_LEON3=n
diff --git a/configs/devices/sparc64-softmmu/default.mak b/configs/devices/sparc64-softmmu/default.mak
index e50030a229c..fa82f39a200 100644
--- a/configs/devices/sparc64-softmmu/default.mak
+++ b/configs/devices/sparc64-softmmu/default.mak
@@ -6,7 +6,6 @@
 #CONFIG_SUNHME=n
 #CONFIG_TEST_DEVICES=n
 
-# Boards:
-#
-CONFIG_SUN4U=y
-CONFIG_NIAGARA=y
+# Boards are selected by default, uncomment to keep out of the build.
+# CONFIG_SUN4U=n
+# CONFIG_NIAGARA=n
diff --git a/hw/sparc/Kconfig b/hw/sparc/Kconfig
index 79d58beb7a6..3cc165dbfb7 100644
--- a/hw/sparc/Kconfig
+++ b/hw/sparc/Kconfig
@@ -1,5 +1,7 @@
 config SUN4M
 bool
+default y
+depends on SPARC && !SPARC64
 imply TCX
 imply CG3
 select CS4231
@@ -18,6 +20,8 @@ config SUN4M
 
 config LEON3
 bool
+default y
+depends on SPARC && !SPARC64
 select GRLIB
 
 config GRLIB
diff --git a/hw/sparc64/Kconfig b/hw/sparc64/Kconfig
index 7e557ad17b0..3b948a22907 100644
--- a/hw/sparc64/Kconfig
+++ b/hw/sparc64/Kconfig
@@ -1,5 +1,7 @@
 config SUN4U
 bool
+default y
+depends on SPARC64
 imply PCI_DEVICES
 imply SUNHME
 imply TEST_DEVICES
@@ -16,6 +18,8 @@ config SUN4U
 
 config NIAGARA
 bool
+default y
+depends on SPARC64
 select EMPTY_SLOT
 select SUN4V_RTC
 select UNIMP
-- 
2.44.0




[PATCH 08/22] loongarch: switch boards to "default y"

2024-04-23 Thread Paolo Bonzini
Some targets use "default y" for boards to filter out those that require
TCG.  For consistency we are switching all other targets to do the same.
Continue with LoongArch.

No changes to generated config-devices.mak file.

Signed-off-by: Paolo Bonzini 
---
 configs/devices/loongarch64-softmmu/default.mak | 3 ++-
 hw/loongarch/Kconfig| 2 ++
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/configs/devices/loongarch64-softmmu/default.mak b/configs/devices/loongarch64-softmmu/default.mak
index 0893112b81d..ffe705836fd 100644
--- a/configs/devices/loongarch64-softmmu/default.mak
+++ b/configs/devices/loongarch64-softmmu/default.mak
@@ -3,4 +3,5 @@
 # Uncomment the following lines to disable these optional devices:
 # CONFIG_PCI_DEVICES=n
 
-CONFIG_LOONGARCH_VIRT=y
+# Boards are selected by default, uncomment to keep out of the build.
+# CONFIG_LOONGARCH_VIRT=n
diff --git a/hw/loongarch/Kconfig b/hw/loongarch/Kconfig
index 5727efed6d8..78640505630 100644
--- a/hw/loongarch/Kconfig
+++ b/hw/loongarch/Kconfig
@@ -1,5 +1,7 @@
 config LOONGARCH_VIRT
 bool
+default y
+depends on LOONGARCH64
 select PCI
 select PCI_EXPRESS_GENERIC_BRIDGE
 imply VIRTIO_VGA
-- 
2.44.0




[PATCH 12/22] mips: switch boards to "default y"

2024-04-23 Thread Paolo Bonzini
Some targets use "default y" for boards to filter out those that require
TCG.  For consistency we are switching all other targets to do the same.
Continue with MIPS.

No changes to generated config-devices.mak file.

Signed-off-by: Paolo Bonzini 
---
 configs/devices/mips-softmmu/common.mak  |  5 +++--
 configs/devices/mips64-softmmu/default.mak   |  4 +++-
 configs/devices/mips64el-softmmu/default.mak | 10 ++
 hw/mips/Kconfig  | 12 
 4 files changed, 24 insertions(+), 7 deletions(-)

diff --git a/configs/devices/mips-softmmu/common.mak b/configs/devices/mips-softmmu/common.mak
index 416a5d353e8..b50107feafe 100644
--- a/configs/devices/mips-softmmu/common.mak
+++ b/configs/devices/mips-softmmu/common.mak
@@ -4,5 +4,6 @@
 # CONFIG_PCI_DEVICES=n
 # CONFIG_TEST_DEVICES=n
 
-CONFIG_MALTA=y
-CONFIG_MIPSSIM=y
+# Boards are selected by default, uncomment to keep out of the build.
+# CONFIG_MALTA=n
+# CONFIG_MIPSSIM=n
diff --git a/configs/devices/mips64-softmmu/default.mak b/configs/devices/mips64-softmmu/default.mak
index 566672f3c22..1b8d7ced1c6 100644
--- a/configs/devices/mips64-softmmu/default.mak
+++ b/configs/devices/mips64-softmmu/default.mak
@@ -1,4 +1,6 @@
 # Default configuration for mips64-softmmu
 
 include ../mips-softmmu/common.mak
-CONFIG_JAZZ=y
+
+# Boards are selected by default, uncomment to keep out of the build.
+# CONFIG_JAZZ=n
diff --git a/configs/devices/mips64el-softmmu/default.mak b/configs/devices/mips64el-softmmu/default.mak
index 88a37cf27f1..9dce346c4fb 100644
--- a/configs/devices/mips64el-softmmu/default.mak
+++ b/configs/devices/mips64el-softmmu/default.mak
@@ -1,7 +1,9 @@
 # Default configuration for mips64el-softmmu
 
 include ../mips-softmmu/common.mak
-CONFIG_FULOONG=y
-CONFIG_LOONGSON3V=y
-CONFIG_JAZZ=y
-CONFIG_MIPS_BOSTON=y
+
+# Boards are selected by default, uncomment to keep out of the build.
+# CONFIG_FULOONG=n
+# CONFIG_LOONGSON3V=n
+# CONFIG_JAZZ=n
+# CONFIG_MIPS_BOSTON=n
diff --git a/hw/mips/Kconfig b/hw/mips/Kconfig
index 5c83ef49cf6..9bccb363eb9 100644
--- a/hw/mips/Kconfig
+++ b/hw/mips/Kconfig
@@ -1,5 +1,7 @@
 config MALTA
 bool
+default y
+depends on MIPS
 imply PCNET_PCI
 imply PCI_DEVICES
 imply TEST_DEVICES
@@ -13,11 +15,15 @@ config MALTA
 
 config MIPSSIM
 bool
+default y
+depends on MIPS
 select SERIAL
 select MIPSNET
 
 config JAZZ
 bool
+default y
+depends on MIPS64
 select ISA_BUS
 select RC4030
 select I8259
@@ -38,6 +44,8 @@ config JAZZ
 
 config FULOONG
 bool
+default y
+depends on MIPS64 && !TARGET_BIG_ENDIAN
 imply PCI_DEVICES
 imply TEST_DEVICES
 imply ATI_VGA
@@ -48,6 +56,8 @@ config FULOONG
 
 config LOONGSON3V
 bool
+default y
+depends on MIPS64 && !TARGET_BIG_ENDIAN
 imply PCI_DEVICES
 imply TEST_DEVICES
 imply VIRTIO_PCI
@@ -69,6 +79,8 @@ config MIPS_CPS
 
 config MIPS_BOSTON
 bool
+default y
+depends on MIPS64 && !TARGET_BIG_ENDIAN
 imply PCI_DEVICES
 imply TEST_DEVICES
 select FITLOADER
-- 
2.44.0




[PATCH 17/22] rx: switch boards to "default y"

2024-04-23 Thread Paolo Bonzini
Some targets use "default y" for boards to filter out those that require
TCG.  For consistency we are switching all other targets to do the same.
Continue with RX.

No changes to generated config-devices.mak file.

Signed-off-by: Paolo Bonzini 
---
 configs/devices/rx-softmmu/default.mak | 3 ++-
 hw/rx/Kconfig  | 2 ++
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/configs/devices/rx-softmmu/default.mak b/configs/devices/rx-softmmu/default.mak
index df2b4e4f426..e7caebe1974 100644
--- a/configs/devices/rx-softmmu/default.mak
+++ b/configs/devices/rx-softmmu/default.mak
@@ -1,3 +1,4 @@
 # Default configuration for rx-softmmu
 
-CONFIG_RX_GDBSIM=y
+# Boards are selected by default, uncomment to keep out of the build.
+# CONFIG_RX_GDBSIM=n
diff --git a/hw/rx/Kconfig b/hw/rx/Kconfig
index 2b297c5a6a6..b2fa2b7eec3 100644
--- a/hw/rx/Kconfig
+++ b/hw/rx/Kconfig
@@ -7,4 +7,6 @@ config RX62N_MCU
 
 config RX_GDBSIM
 bool
+default y
+depends on RX
 select RX62N_MCU
-- 
2.44.0




[PATCH 01/22] configs: list "implied" device groups in the default configs

2024-04-23 Thread Paolo Bonzini
Match the optional device groups to what is actually included in
the config-devices.mak files.

Signed-off-by: Paolo Bonzini 
---
 configs/devices/arm-softmmu/default.mak | 2 ++
 configs/devices/loongarch64-softmmu/default.mak | 3 +++
 configs/devices/or1k-softmmu/default.mak| 4 
 configs/devices/ppc-softmmu/default.mak | 4 
 configs/devices/riscv32-softmmu/default.mak | 4 ++--
 configs/devices/riscv64-softmmu/default.mak | 4 ++--
 configs/devices/xtensa-softmmu/default.mak  | 4 
 7 files changed, 21 insertions(+), 4 deletions(-)

diff --git a/configs/devices/arm-softmmu/default.mak b/configs/devices/arm-softmmu/default.mak
index 6ee31bc1ab9..c1cfb3bcf75 100644
--- a/configs/devices/arm-softmmu/default.mak
+++ b/configs/devices/arm-softmmu/default.mak
@@ -1,5 +1,7 @@
 # Default configuration for arm-softmmu
 
+# Uncomment the following lines to disable these optional devices:
+# CONFIG_I2C_DEVICES=n
 # CONFIG_PCI_DEVICES=n
 # CONFIG_TEST_DEVICES=n
 
diff --git a/configs/devices/loongarch64-softmmu/default.mak b/configs/devices/loongarch64-softmmu/default.mak
index 928bc117ef7..0893112b81d 100644
--- a/configs/devices/loongarch64-softmmu/default.mak
+++ b/configs/devices/loongarch64-softmmu/default.mak
@@ -1,3 +1,6 @@
 # Default configuration for loongarch64-softmmu
 
+# Uncomment the following lines to disable these optional devices:
+# CONFIG_PCI_DEVICES=n
+
 CONFIG_LOONGARCH_VIRT=y
diff --git a/configs/devices/or1k-softmmu/default.mak b/configs/devices/or1k-softmmu/default.mak
index 89c39e31237..3aecdf9d738 100644
--- a/configs/devices/or1k-softmmu/default.mak
+++ b/configs/devices/or1k-softmmu/default.mak
@@ -1,5 +1,9 @@
 # Default configuration for or1k-softmmu
 
+# Uncomment the following lines to disable these optional devices:
+# CONFIG_PCI_DEVICES=n
+# CONFIG_TEST_DEVICES=n
+
 # Boards:
 #
 CONFIG_OR1K_SIM=y
diff --git a/configs/devices/ppc-softmmu/default.mak b/configs/devices/ppc-softmmu/default.mak
index b85fd2bcd71..3061b26749a 100644
--- a/configs/devices/ppc-softmmu/default.mak
+++ b/configs/devices/ppc-softmmu/default.mak
@@ -1,5 +1,9 @@
 # Default configuration for ppc-softmmu
 
+# Uncomment the following lines to disable these optional devices:
+# CONFIG_PCI_DEVICES=n
+# CONFIG_TEST_DEVICES=n
+
 # For embedded PPCs:
 CONFIG_E500PLAT=y
 CONFIG_MPC8544DS=y
diff --git a/configs/devices/riscv32-softmmu/default.mak b/configs/devices/riscv32-softmmu/default.mak
index 94a236c9c25..07e4fd26df3 100644
--- a/configs/devices/riscv32-softmmu/default.mak
+++ b/configs/devices/riscv32-softmmu/default.mak
@@ -1,8 +1,8 @@
 # Default configuration for riscv32-softmmu
 
 # Uncomment the following lines to disable these optional devices:
-#
-#CONFIG_PCI_DEVICES=n
+# CONFIG_PCI_DEVICES=n
+# CONFIG_TEST_DEVICES=n
 
 # Boards:
 #
diff --git a/configs/devices/riscv64-softmmu/default.mak b/configs/devices/riscv64-softmmu/default.mak
index 3f680594484..221963d4c5c 100644
--- a/configs/devices/riscv64-softmmu/default.mak
+++ b/configs/devices/riscv64-softmmu/default.mak
@@ -1,8 +1,8 @@
 # Default configuration for riscv64-softmmu
 
 # Uncomment the following lines to disable these optional devices:
-#
-#CONFIG_PCI_DEVICES=n
+# CONFIG_PCI_DEVICES=n
+# CONFIG_TEST_DEVICES=n
 
 # Boards:
 #
diff --git a/configs/devices/xtensa-softmmu/default.mak b/configs/devices/xtensa-softmmu/default.mak
index 49e4c9da88c..f650cad7609 100644
--- a/configs/devices/xtensa-softmmu/default.mak
+++ b/configs/devices/xtensa-softmmu/default.mak
@@ -1,5 +1,9 @@
 # Default configuration for Xtensa
 
+# Uncomment the following lines to disable these optional devices:
+#
+#CONFIG_PCI_DEVICES=n
+
 # Boards:
 #
 CONFIG_XTENSA_SIM=y
-- 
2.44.0




[PATCH 02/22] alpha: switch boards to "default y"

2024-04-23 Thread Paolo Bonzini
Some targets use "default y" for boards to filter out those that require
TCG.  For consistency we are switching all other targets to do the same.
Start with Alpha.

No changes to generated config-devices.mak file.

Signed-off-by: Paolo Bonzini 
---
 configs/devices/alpha-softmmu/default.mak | 5 ++---
 hw/alpha/Kconfig  | 2 ++
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/configs/devices/alpha-softmmu/default.mak b/configs/devices/alpha-softmmu/default.mak
index d186fe8e9b1..3de6a9f5779 100644
--- a/configs/devices/alpha-softmmu/default.mak
+++ b/configs/devices/alpha-softmmu/default.mak
@@ -5,6 +5,5 @@
 #CONFIG_PCI_DEVICES=n
 #CONFIG_TEST_DEVICES=n
 
-# Boards:
-#
-CONFIG_DP264=y
+# Boards are selected by default, uncomment to keep out of the build.
+# CONFIG_DP264=n
diff --git a/hw/alpha/Kconfig b/hw/alpha/Kconfig
index 9af650c94ec..7f3455ce1e1 100644
--- a/hw/alpha/Kconfig
+++ b/hw/alpha/Kconfig
@@ -1,5 +1,7 @@
 config DP264
 bool
+default y
+depends on ALPHA
 imply PCI_DEVICES
 imply TEST_DEVICES
 imply E1000_PCI
-- 
2.44.0




[PATCH 06/22] hppa: switch boards to "default y"

2024-04-23 Thread Paolo Bonzini
Some targets use "default y" for boards to filter out those that require
TCG.  For consistency we are switching all other targets to do the same.
Continue with PARISC.

No changes to generated config-devices.mak file.

Signed-off-by: Paolo Bonzini 
---
 configs/devices/hppa-softmmu/default.mak | 5 ++---
 hw/hppa/Kconfig  | 2 ++
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/configs/devices/hppa-softmmu/default.mak 
b/configs/devices/hppa-softmmu/default.mak
index b0364bb88f2..059510cdbb7 100644
--- a/configs/devices/hppa-softmmu/default.mak
+++ b/configs/devices/hppa-softmmu/default.mak
@@ -4,6 +4,5 @@
 #
 #CONFIG_PCI_DEVICES=n
 
-# Boards:
-#
-CONFIG_HPPA_B160L=y
+# Boards are selected by default, uncomment to keep out of the build.
+# CONFIG_HPPA_B160L=n
diff --git a/hw/hppa/Kconfig b/hw/hppa/Kconfig
index ee7ffd2bfb5..d4d457f4ab4 100644
--- a/hw/hppa/Kconfig
+++ b/hw/hppa/Kconfig
@@ -1,5 +1,7 @@
 config HPPA_B160L
 bool
+default y
+depends on HPPA
 imply PCI_DEVICES
 imply E1000_PCI
 imply USB_OHCI_PCI
-- 
2.44.0




[PATCH 00/22] configs: switch boards to "default y"

2024-04-23 Thread Paolo Bonzini
Some boards, notably ARM boards that use TCG, are already using
"default y".  This was done to remove TCG-only boards from
a KVM-only build in commit 29d9efca16 (2023-04-26).

This series converts all other boards to that, so that the requirements
of each board are clearer in the Kconfig files.

For now, the only such use is MIPS's 64-bit and endianness requirements.
In the future, it will be possible to enable/disable boards based
on the presence of required libraries, for example libfdt, or
their deprecation status.

There is an important difference in that Kconfig symbols for boards
have to be enabled in a --without-default-devices build, similar to
devices.

Paolo

Paolo Bonzini (22):
  configs: list "implied" device groups in the default configs
  alpha: switch boards to "default y"
  arm: switch boards to "default y"
  avr: switch boards to "default y"
  cris: switch boards to "default y"
  hppa: switch boards to "default y"
  i386: switch boards to "default y"
  loongarch: switch boards to "default y"
  m68k: switch boards to "default y"
  microblaze: switch boards to "default y"
  meson: make target endianneess available to Kconfig
  mips: switch boards to "default y"
  nios2: switch boards to "default y"
  openrisc: switch boards to "default y"
  ppc: switch boards to "default y"
  riscv: switch boards to "default y"
  rx: switch boards to "default y"
  s390x: switch boards to "default y"
  sh4: switch boards to "default y"
  sparc: switch boards to "default y"
  tricore: switch boards to "default y"
  xtensa: switch boards to "default y"

 configs/devices/alpha-softmmu/default.mak |  5 ++--
 configs/devices/arm-softmmu/default.mak   |  5 +++-
 configs/devices/avr-softmmu/default.mak   |  5 ++--
 configs/devices/cris-softmmu/default.mak  |  5 ++--
 configs/devices/hppa-softmmu/default.mak  |  5 ++--
 configs/devices/i386-softmmu/default.mak  | 11 ---
 .../devices/loongarch64-softmmu/default.mak   |  6 +++-
 configs/devices/m68k-softmmu/default.mak  | 13 
 .../devices/microblaze-softmmu/default.mak|  9 +++---
 configs/devices/mips-softmmu/common.mak   |  5 ++--
 configs/devices/mips64-softmmu/default.mak|  4 ++-
 configs/devices/mips64el-softmmu/default.mak  | 10 ---
 configs/devices/nios2-softmmu/default.mak |  7 ++---
 configs/devices/or1k-softmmu/default.mak  |  9 --
 configs/devices/ppc-softmmu/default.mak   | 30 +++
 configs/devices/ppc64-softmmu/default.mak |  8 ++---
 configs/devices/riscv32-softmmu/default.mak   | 17 +--
 configs/devices/riscv64-softmmu/default.mak   | 19 ++--
 configs/devices/rx-softmmu/default.mak|  3 +-
 configs/devices/s390x-softmmu/default.mak |  5 ++--
 configs/devices/sh4-softmmu/default.mak   |  7 ++---
 configs/devices/sparc-softmmu/default.mak |  7 ++---
 configs/devices/sparc64-softmmu/default.mak   |  7 ++---
 configs/devices/tricore-softmmu/default.mak   |  7 +++--
 configs/devices/xtensa-softmmu/default.mak| 11 ---
 meson.build   | 12 
 hw/alpha/Kconfig  |  2 ++
 hw/arm/Kconfig|  2 ++
 hw/avr/Kconfig|  3 ++
 hw/cris/Kconfig   |  2 ++
 hw/hppa/Kconfig   |  2 ++
 hw/i386/Kconfig   | 10 ++-
 hw/loongarch/Kconfig  |  2 ++
 hw/m68k/Kconfig   | 10 +++
 hw/microblaze/Kconfig |  6 
 hw/mips/Kconfig   | 12 
 hw/nios2/Kconfig  |  9 +++---
 hw/openrisc/Kconfig   |  4 +++
 hw/ppc/Kconfig| 26 
 hw/riscv/Kconfig  | 14 +
 hw/rx/Kconfig |  2 ++
 hw/s390x/Kconfig  |  2 ++
 hw/sh4/Kconfig|  4 +++
 hw/sparc/Kconfig  |  4 +++
 hw/sparc64/Kconfig|  4 +++
 hw/tricore/Kconfig|  4 +++
 hw/xtensa/Kconfig |  6 
 target/Kconfig|  3 ++
 target/i386/Kconfig   |  1 +
 target/ppc/Kconfig|  1 +
 50 files changed, 252 insertions(+), 115 deletions(-)

-- 
2.44.0




[PATCH 03/22] arm: switch boards to "default y"

2024-04-23 Thread Paolo Bonzini
For ARM targets, boards that require TCG are already using "default y".
Switch ARM_VIRT to the same selection mechanism.

No changes to generated config-devices.mak file.

Signed-off-by: Paolo Bonzini 
---
 configs/devices/arm-softmmu/default.mak | 3 ++-
 hw/arm/Kconfig  | 2 ++
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/configs/devices/arm-softmmu/default.mak 
b/configs/devices/arm-softmmu/default.mak
index c1cfb3bcf75..31f77c20269 100644
--- a/configs/devices/arm-softmmu/default.mak
+++ b/configs/devices/arm-softmmu/default.mak
@@ -5,7 +5,8 @@
 # CONFIG_PCI_DEVICES=n
 # CONFIG_TEST_DEVICES=n
 
-CONFIG_ARM_VIRT=y
+# Boards are selected by default, uncomment to keep out of the build.
+# CONFIG_ARM_VIRT=n
 
 # These are selected by default when TCG is enabled, uncomment them to
 # keep out of the build.
diff --git a/hw/arm/Kconfig b/hw/arm/Kconfig
index 893a7bff66b..1e7cd01087f 100644
--- a/hw/arm/Kconfig
+++ b/hw/arm/Kconfig
@@ -1,5 +1,7 @@
 config ARM_VIRT
 bool
+default y
+depends on ARM
 imply PCI_DEVICES
 imply TEST_DEVICES
 imply VFIO_AMD_XGBE
-- 
2.44.0




Re: [PATCH] target/i386/translate.c: always write 32-bits for SGDT and SIDT

2024-04-23 Thread Paolo Bonzini
On Mon, Apr 22, 2024 at 9:10 PM Volker Rümelin  wrote:
>
> Am 20.04.24 um 07:40 schrieb Mark Cave-Ayland:
> >> Current documentation agrees that all 32 bits are written, so I don't
> >> think you need this comment:
> >
> > Ah that's good to know the docs are now correct. I added the comment
> > as there was a lot of conflicting information around for older CPUs so
> > I thought it was worth an explicit mention.
>
> Quote from the Intel® 64 and IA-32 Architectures Software Developer’s
> Manual Volume 2B: Instruction Set Reference, M-U March 2024:
>
> IA-32 Architecture Compatibility
> The 16-bit form of SGDT is compatible with the Intel 286 processor if
> the upper 8 bits are not referenced. The Intel 286 processor fills these
> bits with 1s; processor generations later than the Intel 286 processor
> fill these bits with 0s.
>
> Intel still claims the upper 8 bits are filled with 0s, but the
> Operation pseudo code below is correct. The same is true for SIDT.

I think the claim is that it fills with 0s when the software is
compatible with the 286, i.e. never uses a 32-bit LIDT or LGDT
instruction. Software written to target specifically older processors
typically used the undocumented LOADALL instruction to exit protected
mode or to set 4GB segment limits, so it won't run on QEMU. You can
read about the usage here:

https://www.os2museum.com/wp/more-on-loadall-and-os2/ (286)
https://www.os2museum.com/wp/386-loadall/ (386)

and about how it worked here:

https://www.pcjs.org/documents/manuals/intel/80286/loadall/
https://www.pcjs.org/documents/manuals/intel/80386/loadall/

Interestingly, byte 3 of the GDTR or IDTR on the 286 is documented as
"should be zeroes" for LOADALL, not all ones.

Let's replace "Despite claims to the contrary" with "Despite a
confusing description".

Paolo




Re: [PATCH v5 0/3] Add support for the RAPL MSRs series

2024-04-19 Thread Paolo Bonzini
On Wed, Apr 17, 2024 at 7:58 PM Daniel P. Berrangé  wrote:
> > > However, one question remains unanswered pointing the issue with the
> > > location of "/var/local/run/qemu-vmsr-helper.sock", created by
> > > compute_default_paths(). QEMU is not allowed to reach the socket here.
> >
> > If I understand correctly the question, that is expected. This is a
> > privileged functionality and therefore it requires manual intervention
> > to change the owner of the socket and allow QEMU to access it.
>
> In the systemd case, it will set the owner and mode, but in the
> non-systemd case, I wonder if it is worth making this helper program
> have "--socket-owner" and "--socket-mode" args, so it can create
> the socket with the right mode/owner immediately, rather than
> expecting the admin to manually chmod+chown after starting the
> helper

I think a better idea would be to contribute them to
systemd-socket-activate, and just launch the helper that way. It's
mostly a testing tool, but tbh if you're not using systemd you're on
your own. If you write an init script for example, that would be the
place where you put the chmod/chown.

Paolo




Re: [PATCH] accel/tcg/icount-common: Consolidate the use of warn_report_once()

2024-04-18 Thread Paolo Bonzini
Queued, thanks.

Paolo




[PATCH] pythondeps.toml: warn about updates needed to docs/requirements.txt

2024-04-18 Thread Paolo Bonzini
docs/requirements.txt is expected by readthedocs and should be in sync
with pythondeps.toml.  Add a comment to both.

Signed-off-by: Paolo Bonzini 
---
 docs/requirements.txt | 3 +++
 pythondeps.toml   | 1 +
 2 files changed, 4 insertions(+)

diff --git a/docs/requirements.txt b/docs/requirements.txt
index 691e5218ec7..02583f209aa 100644
--- a/docs/requirements.txt
+++ b/docs/requirements.txt
@@ -1,2 +1,5 @@
+# Used by readthedocs.io
+# Should be in sync with the "installed" key of pythondeps.toml
+
 sphinx==5.3.0
 sphinx_rtd_theme==1.1.1
diff --git a/pythondeps.toml b/pythondeps.toml
index 0e884159993..9c16602d303 100644
--- a/pythondeps.toml
+++ b/pythondeps.toml
@@ -22,6 +22,7 @@
 meson = { accepted = ">=0.63.0", installed = "1.2.3", canary = "meson" }
 
 [docs]
+# Please keep the installed versions in sync with docs/requirements.txt
 sphinx = { accepted = ">=1.6", installed = "5.3.0", canary = "sphinx-build" }
 sphinx_rtd_theme = { accepted = ">=0.5", installed = "1.1.1" }
 
-- 
2.44.0




Re: [PATCH 0/3] target/i386/cpu: Misc cleanup for warning message

2024-04-17 Thread Paolo Bonzini
Queued, thanks.

Paolo




Re: [PATCH v2] hw/i386/acpi: Set PCAT_COMPAT bit only when pic is not disabled

2024-04-15 Thread Paolo Bonzini
Queued, thanks.

Paolo




Re: [PATCH v2] target/i386: Give IRQs a chance when resetting HF_INHIBIT_IRQ_MASK

2024-04-15 Thread Paolo Bonzini
On Mon, Apr 15, 2024 at 8:50 AM Ruihan Li  wrote:
>
> When emulated with QEMU, interrupts will never come in the following
> loop. However, if the NOP instruction is uncommented, interrupts will
> fire as normal.
>
> loop:
> cli
> call do_sti
> jmp loop
>
> do_sti:
> sti
> # nop
> ret
>
> This behavior is different from that of a real processor. For example,
> if KVM is enabled, interrupts will always fire regardless of whether the
> NOP instruction is commented or not. Also, the Intel Software Developer
> Manual states that after the STI instruction is executed, the interrupt
> inhibit should end as soon as the next instruction (e.g., the RET
> instruction if the NOP instruction is commented) is executed.

Thanks, interesting bug!

What do you think about writing this:

>  /* If several instructions disable interrupts, only the first does it.  
> */
>  if (inhibit && !(s->flags & HF_INHIBIT_IRQ_MASK)) {
>  gen_set_hflag(s, HF_INHIBIT_IRQ_MASK);
> -} else {
> +inhibit_reset = false;
> +} else if (!inhibit && (s->flags & HF_INHIBIT_IRQ_MASK)) {
>  gen_reset_hflag(s, HF_INHIBIT_IRQ_MASK);
> +inhibit_reset = true;
> +} else {
> +inhibit_reset = false;
>  }

in a slightly simpler manner:

inhibit_reset = false;
if (s->flags & HF_INHIBIT_IRQ_MASK) {
gen_reset_hflag(s, HF_INHIBIT_IRQ_MASK);
inhibit_reset = true;
} else if (inhibit) {
gen_set_hflag(s, HF_INHIBIT_IRQ_MASK);
}

No need to submit v3, I can do the change myself when applying.
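
For illustration only, here is a minimal standalone model of the simplified
control flow above. It is a sketch, not QEMU code: struct inhibit_state and
update_inhibit() are made-up names, and a single bool stands in for the
HF_INHIBIT_IRQ_MASK bit of s->flags.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the translator state: one bit of s->flags. */
struct inhibit_state {
    bool irq_inhibited;   /* models HF_INHIBIT_IRQ_MASK being set */
};

/*
 * Mirrors the simplified branch structure: updates the modeled flag and
 * returns the value that inhibit_reset would take.
 */
static bool update_inhibit(struct inhibit_state *st, bool inhibit)
{
    bool inhibit_reset = false;

    if (st->irq_inhibited) {
        /* Corresponds to gen_reset_hflag(s, HF_INHIBIT_IRQ_MASK). */
        st->irq_inhibited = false;
        inhibit_reset = true;
    } else if (inhibit) {
        /* Corresponds to gen_set_hflag(s, HF_INHIBIT_IRQ_MASK). */
        st->irq_inhibited = true;
    }
    return inhibit_reset;
}
```

In this model an instruction that finds the flag already set always clears it
and reports a reset, whether or not it asks for inhibition itself; only an
instruction that finds the flag clear can set it.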

Paolo




Re: [PATCH for-9.1 4/9] Bump minimum glib version to v2.66

2024-04-12 Thread Paolo Bonzini

On 4/12/24 12:58, Thomas Huth wrote:
> On 12/04/2024 12.16, Paolo Bonzini wrote:
>> On Thu, Mar 28, 2024 at 3:06 PM Thomas Huth  wrote:
>>>
>>> Now that we dropped support for CentOS 8 and Ubuntu 20.04, we can
>>> look into bumping the glib version to a new minimum for further
>>> clean-ups. According to repology.org, available versions are:
>>>
>>>   CentOS Stream 9:     2.66.7
>>>   Debian 11:           2.66.8
>>>   Fedora 38:           2.74.1
>>>   Freebsd:             2.78.4
>>>   Homebrew:            2.80.0
>>>   Openbsd:             2.78.4
>>>   OpenSuse leap 15.5:  2.70.5
>>>   pkgsrc_current:      2.78.4
>>>   Ubuntu 22.04:        2.72.1
>>>
>>> Thus it should be safe to bump the minimum glib version to 2.66 now.
>>> Version 2.66 comes with new functions for URI parsing which will
>>> allow further clean-ups in the following patches.
>>
>> Missing:
>>
>> diff --git a/qga/commands-posix-ssh.c b/qga/commands-posix-ssh.c
>> index b0e0b1d674f..cc1f5a708e4 100644
>> --- a/qga/commands-posix-ssh.c
>> +++ b/qga/commands-posix-ssh.c
>> @@ -288,7 +288,6 @@ qmp_guest_ssh_get_authorized_keys(
>>  }
>>
>>  #ifdef QGA_BUILD_UNIT_TEST
>> -#if GLIB_CHECK_VERSION(2, 60, 0)
>>  static const strList test_key2 = {
>>  .value = (char *)"algo key2 comments"
>>  };
>> @@ -484,11 +483,4 @@ int main(int argc, char *argv[])
>>
>>  return g_test_run();
>>  }
>> -#else
>> -int main(int argc, char *argv[])
>> -{
>> -    g_test_message("test skipped, needs glib >= 2.60");
>> -    return 0;
>> -}
>> -#endif /* GLIB_2_60 */
>>  #endif /* BUILD_UNIT_TEST */
>
> Indeed! And there seems to be another GLIB_CHECK_VERSION(2,62,0) check
> in util/error-report.c which we likely can clean up now, too!


Ok, I'll squash the above and

diff --git a/util/error-report.c b/util/error-report.c
index 6e44a557321..1b17c11de19 100644
--- a/util/error-report.c
+++ b/util/error-report.c
@@ -172,18 +172,8 @@ static void print_loc(void)
 static char *
 real_time_iso8601(void)
 {
-#if GLIB_CHECK_VERSION(2,62,0)
 g_autoptr(GDateTime) dt = g_date_time_new_now_utc();
-/* ignore deprecation warning, since GLIB_VERSION_MAX_ALLOWED is 2.56 */
-#pragma GCC diagnostic push
-#pragma GCC diagnostic ignored "-Wdeprecated-declarations"
 return g_date_time_format_iso8601(dt);
-#pragma GCC diagnostic pop
-#else
-GTimeVal tv;
-g_get_current_time(&tv);
-return g_time_val_to_iso8601(&tv);
-#endif
 }
 
 /*


then.  As an aside, we probably can also drop:

/*
 * gtk_widget_set_double_buffered() was deprecated in 3.14.
 * It is required for opengl rendering on X11 though.  A
 * proper replacement (native opengl support) is only
 * available in 3.16+.  Silence the warning if possible.
 */
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wdeprecated-declarations"
gtk_widget_set_double_buffered(vc->gfx.drawing_area, FALSE);
#pragma GCC diagnostic pop


and


#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wdeprecated-declarations"
/*
 * check if RBD image is a clone (= has a parent).
 *
 * rbd_get_parent_info is deprecated from Nautilus onwards, but the
 * replacement rbd_get_parent is not present in Luminous and Mimic.
 */
if (rbd_get_parent_info(s->image, NULL, 0, NULL, 0, NULL, 0) != -ENOENT) {
return status;
}
#pragma GCC diagnostic pop


(Nautilus is Ceph 14, it's in all of CentOS Stream 9, Ubuntu 20.04 and
Debian 11) but I have no idea what the replacement would be. :/

Paolo




Re: [PATCH] Makefile: preserve --jobserver-auth argument when calling ninja

2024-04-12 Thread Paolo Bonzini
On Fri, Apr 12, 2024 at 1:52 PM Fiona Ebner  wrote:
>
> Am 02.04.24 um 10:17 schrieb Martin Hundebøll:
> > Qemu wraps its call to ninja in a Makefile. Since ninja, as opposed to
> > make, utilizes all CPU cores by default, the qemu Makefile translates
> > the absence of a `-jN` argument into `-j1`. This breaks jobserver
> > functionality, so update the -jN mangling to take the --jobserver-auth
> > argument into consideration too.
> >
> > Signed-off-by: Martin Hundebøll 
> > ---
> >  Makefile | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/Makefile b/Makefile
> > index 8f36990335..183756018f 100644
> > --- a/Makefile
> > +++ b/Makefile
> > @@ -142,7 +142,7 @@ MAKE.k = $(findstring k,$(firstword $(filter-out 
> > --%,$(MAKEFLAGS))))
> >  MAKE.q = $(findstring q,$(firstword $(filter-out --%,$(MAKEFLAGS))))
> >  MAKE.nq = $(if $(word 2, $(MAKE.n) $(MAKE.q)),nq)
> >  NINJAFLAGS = $(if $V,-v) $(if $(MAKE.n), -n) $(if $(MAKE.k), -k0) \
> > -$(filter-out -j, $(lastword -j1 $(filter -l% -j%, $(MAKEFLAGS)))) \
> > +$(or $(filter -l% -j%, $(MAKEFLAGS)), $(if $(filter 
> > --jobserver-auth=%, $(MAKEFLAGS)),, -j1)) \
> >  -d keepdepfile
> >  ninja-cmd-goals = $(or $(MAKECMDGOALS), all)
> >  ninja-cmd-goals += $(foreach g, $(MAKECMDGOALS), $(.ninja-goals.$g))
>
> Hi,
>
> unfortunately, this patch breaks build when specifying just '-j' as a
> make flag (i.e. without a number), because it will now end up being
> passed to ninja:

Yep, I've sent a pull request with the fix.

Paolo




Re: [PULL 0/2] Final build system fixes for 9.0

2024-04-12 Thread Paolo Bonzini
> Since these 2 patches don't modify what we can build with v9.0.0-rc3,
> would it be acceptable to merge them without having to produce a
> v9.0.0-rc4 tag before the final release?

I didn't want to ask you about that, but I agree it would not be an issue.

Paolo




Re: [PATCH v5 1/3] qio: add support for SO_PEERCRED for socket channel

2024-04-12 Thread Paolo Bonzini
On Thu, Apr 11, 2024 at 2:14 PM Anthony Harivel  wrote:
>
> The function qio_channel_get_peercred() returns a pointer to the
> credentials of the peer process connected to this socket.
>
> This credentials structure is defined in <sys/socket.h> as follows:
>
> struct ucred {
> pid_t pid;/* Process ID of the sending process */
> uid_t uid;/* User ID of the sending process */
> gid_t gid;/* Group ID of the sending process */
> };
>
> The use of this function is possible only for connected AF_UNIX stream
> sockets and for AF_UNIX stream and datagram socket pairs.
>
> On platform other than Linux, the function return 0.
>
> Signed-off-by: Anthony Harivel 
> ---
>  include/io/channel.h | 21 +
>  io/channel-socket.c  | 28 
>  io/channel.c | 13 +
>  3 files changed, 62 insertions(+)
>
> diff --git a/include/io/channel.h b/include/io/channel.h
> index 7986c49c713a..bdf0bca92ae2 100644
> --- a/include/io/channel.h
> +++ b/include/io/channel.h
> @@ -160,6 +160,9 @@ struct QIOChannelClass {
>void *opaque);
>  int (*io_flush)(QIOChannel *ioc,
>  Error **errp);
> +int (*io_peerpid)(QIOChannel *ioc,
> +   unsigned int *pid,
> +   Error **errp);
>  };
>
>  /* General I/O handling functions */
> @@ -981,4 +984,22 @@ int coroutine_mixed_fn 
> qio_channel_writev_full_all(QIOChannel *ioc,
>  int qio_channel_flush(QIOChannel *ioc,
>Error **errp);
>
> +/**
> + * qio_channel_get_peercred:
> + * @ioc: the channel object
> + * @pid: pointer to pid
> + * @errp: pointer to a NULL-initialized error object
> + *
> + * Returns the pid of the peer process connected to this socket.
> + *
> + * The use of this function is possible only for connected
> + * AF_UNIX stream sockets and for AF_UNIX stream and datagram
> + * socket pairs on Linux.
> + * Return -1 on error with pid -1 for the non-Linux OS.

with pid -1 -> and set *pid to -1.

> + */
>  static const TypeInfo qio_channel_socket_info = {
> diff --git a/io/channel.c b/io/channel.c
> index a1f12f8e9096..e3f17c24a00f 100644
> --- a/io/channel.c
> +++ b/io/channel.c
> @@ -548,6 +548,19 @@ void qio_channel_set_cork(QIOChannel *ioc,
>  }
>  }
>
> +int qio_channel_get_peerpid(QIOChannel *ioc,
> + unsigned int *pid,
> + Error **errp)
> +{
> +QIOChannelClass *klass = QIO_CHANNEL_GET_CLASS(ioc);
> +
> +if (!klass->io_peerpid) {
> +error_setg(errp, "Channel does not support peer pid");

Missing for consistency:

+*pid = -1;

> +return -1;
> +}
> +klass->io_peerpid(ioc, pid, errp);
> +return 0;

The error from klass->io_peerpid is ignored:

-klass->io_peerpid(ioc, pid, errp);
-return 0;
+return klass->io_peerpid(ioc, pid, errp);
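
To make the two remarks above concrete, here is a standalone sketch of the
dispatch pattern I have in mind. It is not the QEMU code: struct channel,
struct channel_class and the two example callbacks are hypothetical
stand-ins, and the Error **errp argument is left out to keep it short.

```c
#include <stddef.h>

/* Hypothetical, stripped-down stand-ins for QIOChannel and its class. */
struct channel;

struct channel_class {
    /* Optional callback: returns 0 on success, -1 on failure. */
    int (*io_peerpid)(struct channel *ioc, unsigned int *pid);
};

struct channel {
    const struct channel_class *klass;
};

static int channel_get_peerpid(struct channel *ioc, unsigned int *pid)
{
    if (!ioc->klass->io_peerpid) {
        /* Keep the output well defined even when unsupported. */
        *pid = (unsigned int)-1;
        return -1;
    }
    /* Propagate the callback's result instead of always returning 0. */
    return ioc->klass->io_peerpid(ioc, pid);
}

/* Example callbacks used to exercise both outcomes. */
static int peerpid_ok(struct channel *ioc, unsigned int *pid)
{
    (void)ioc;
    *pid = 1234;
    return 0;
}

static int peerpid_fail(struct channel *ioc, unsigned int *pid)
{
    (void)ioc;
    *pid = (unsigned int)-1;
    return -1;
}
```

With this shape the caller only has to check the return value, and *pid is
never left uninitialized on the failure paths.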

Paolo




Re: [PATCH v5 0/3] Add support for the RAPL MSRs series

2024-04-12 Thread Paolo Bonzini
On Thu, Apr 11, 2024 at 2:14 PM Anthony Harivel  wrote:
>
> Dear maintainers,
>
> First of all, thank you very much for your review of my patch
> [1].
>
> In this version (v5), I have attempted to address all the problems
> addressed by Daniel during the last review. I've been more careful with
> all the remarks made.
>
> However, one question remains unanswered pointing the issue with the
> location of "/var/local/run/qemu-vmsr-helper.sock", created by
> compute_default_paths(). QEMU is not allowed to reach the socket here.

If I understand correctly the question, that is expected. This is a
privileged functionality and therefore it requires manual intervention
to change the owner of the socket and allow QEMU to access it.

Paolo

> Thank you again for your continued guidance.
>
> v4 -> v5
> 
>
> - correct qio_channel_get_peerpid: return pid = -1 in case of error
> - Vmsr_helper: compile only for x86
> - Vmsr_helper: use qio_channel_read/write_all
> - Vmsr_helper: abandon user/group
> - Vmsr_energy.c: correct all error_report
> - Vmsr thread: compute default socket path only once
> - Vmsr thread: open socket only once
> - Pass relevant QEMU CI
>
> v3 -> v4
> 
>
> - Correct memory leaks with AddressSanitizer
> - Add sanity check for QEMU and qemu-vmsr-helper for checking if host is
>   INTEL and if RAPL is activated.
> - Rename poor variables naming for easier comprehension
> - Move code that checks Host before creating the VMSR thread
> - Get rid of libnuma: create function that read sysfs for reading the
>   Host topology instead
>
> v2 -> v3
> 
>
> - Move all memory allocations from Clib to Glib
> - Compile on *BSD (working on Linux only)
> - No more limitation on the virtual package: each vCPU that belongs to
>   the same virtual package is giving the same results like expected on
>   a real CPU.
>   This has been tested topology like:
>  -smp 4,sockets=2
>  -smp 16,sockets=4,cores=2,threads=2
>
> v1 -> v2
> 
>
> - To overcome the CVE-2020-8694 a socket communication is created
>   to a priviliged helper
> - Add the priviliged helper (qemu-vmsr-helper)
> - Add SO_PEERCRED in qio channel socket
>
> RFC -> v1
> -
>
> - Add vmsr_* in front of all vmsr specific function
> - Change malloc()/calloc()... with all glib equivalent
> - Pre-allocate all dynamic memories when possible
> - Add a Documentation of implementation, limitation and usage
>
> Best regards,
> Anthony
>
> [1]: https://lists.gnu.org/archive/html/qemu-devel/2024-03/msg04417.html
>
> Anthony Harivel (3):
>   qio: add support for SO_PEERCRED for socket channel
>   tools: build qemu-vmsr-helper
>   Add support for RAPL MSRs in KVM/Qemu
>
>  accel/kvm/kvm-all.c  |  27 ++
>  contrib/systemd/qemu-vmsr-helper.service |  15 +
>  contrib/systemd/qemu-vmsr-helper.socket  |   9 +
>  docs/specs/index.rst |   1 +
>  docs/specs/rapl-msr.rst  | 155 +++
>  docs/tools/index.rst |   1 +
>  docs/tools/qemu-vmsr-helper.rst  |  89 
>  include/io/channel.h |  21 +
>  include/sysemu/kvm.h |   2 +
>  include/sysemu/kvm_int.h |  32 ++
>  io/channel-socket.c  |  28 ++
>  io/channel.c |  13 +
>  meson.build  |   7 +
>  target/i386/cpu.h|   8 +
>  target/i386/kvm/kvm-cpu.c|   9 +
>  target/i386/kvm/kvm.c| 428 ++
>  target/i386/kvm/meson.build  |   1 +
>  target/i386/kvm/vmsr_energy.c| 335 ++
>  target/i386/kvm/vmsr_energy.h|  99 +
>  tools/i386/qemu-vmsr-helper.c| 529 +++
>  tools/i386/rapl-msr-index.h  |  28 ++
>  21 files changed, 1837 insertions(+)
>  create mode 100644 contrib/systemd/qemu-vmsr-helper.service
>  create mode 100644 contrib/systemd/qemu-vmsr-helper.socket
>  create mode 100644 docs/specs/rapl-msr.rst
>  create mode 100644 docs/tools/qemu-vmsr-helper.rst
>  create mode 100644 target/i386/kvm/vmsr_energy.c
>  create mode 100644 target/i386/kvm/vmsr_energy.h
>  create mode 100644 tools/i386/qemu-vmsr-helper.c
>  create mode 100644 tools/i386/rapl-msr-index.h
>
> --
> 2.44.0
>




[PATCH] ci: move external build environment setups to CentOS Stream 9

2024-04-12 Thread Paolo Bonzini
RHEL 9 (and thus also its derivatives) has been available for two years
now, so according to QEMU's support policy, we can drop the active
support for the previous major version 8 now.

Thus upgrade our CentOS Stream build environment playbooks to major
version 9 now.

Signed-off-by: Paolo Bonzini 
---
 .../stream/{8 => 9}/build-environment.yml | 31 ++---
 .../stream/{8 => 9}/x86_64/configure  |  4 +-
 .../stream/{8 => 9}/x86_64/test-avocado   |  0
 scripts/ci/setup/build-environment.yml| 44 +++
 4 files changed, 34 insertions(+), 45 deletions(-)
 rename scripts/ci/org.centos/stream/{8 => 9}/build-environment.yml (75%)
 rename scripts/ci/org.centos/stream/{8 => 9}/x86_64/configure (98%)
 rename scripts/ci/org.centos/stream/{8 => 9}/x86_64/test-avocado (100%)

diff --git a/scripts/ci/org.centos/stream/8/build-environment.yml 
b/scripts/ci/org.centos/stream/9/build-environment.yml
similarity index 75%
rename from scripts/ci/org.centos/stream/8/build-environment.yml
rename to scripts/ci/org.centos/stream/9/build-environment.yml
index 1ead77e2cbf..cd29fe6f275 100644
--- a/scripts/ci/org.centos/stream/8/build-environment.yml
+++ b/scripts/ci/org.centos/stream/9/build-environment.yml
@@ -2,32 +2,32 @@
 - name: Installation of extra packages to build QEMU
   hosts: all
   tasks:
-- name: Extra check for CentOS Stream 8
+- name: Extra check for CentOS Stream 9
   lineinfile:
 path: /etc/redhat-release
-line: CentOS Stream release 8
+line: CentOS Stream release 9
 state: present
   check_mode: yes
-  register: centos_stream_8
+  register: centos_stream_9
 
-- name: Enable EPEL repo on CentOS Stream 8
+- name: Enable EPEL repo on CentOS Stream 9
   dnf:
 name:
   - epel-release
 state: present
   when:
-- centos_stream_8
+- centos_stream_9
 
-- name: Enable PowerTools repo on CentOS Stream 8
+- name: Enable CRB repo on CentOS Stream 9
   ini_file:
-path: /etc/yum.repos.d/CentOS-Stream-PowerTools.repo
-section: powertools
+path: /etc/yum.repos.d/centos.repo
+section: crb
 option: enabled
 value: "1"
   when:
-- centos_stream_8
+- centos_stream_9
 
-- name: Install basic packages to build QEMU on CentOS Stream 8
+- name: Install basic packages to build QEMU on CentOS Stream 9
   dnf:
 name:
   - bzip2
@@ -42,7 +42,6 @@
   - gettext
   - git
   - glib2-devel
-  - glusterfs-api-devel
   - gnutls-devel
   - libaio-devel
   - libcap-ng-devel
@@ -61,22 +60,24 @@
   - lzo-devel
   - make
   - mesa-libEGL-devel
+  - meson
   - nettle-devel
   - ninja-build
   - nmap-ncat
   - numactl-devel
   - pixman-devel
-  - python38
+  - python3
+  - python3-pip
   - python3-sphinx
+  - python3-sphinx_rtd_theme
+  - python3-tomli
   - rdma-core-devel
   - redhat-rpm-config
   - snappy-devel
-  - spice-glib-devel
-  - spice-server-devel
   - systemd-devel
   - systemtap-sdt-devel
   - tar
   - zlib-devel
 state: present
   when:
-- centos_stream_8
+- centos_stream_9
diff --git a/scripts/ci/org.centos/stream/8/x86_64/configure 
b/scripts/ci/org.centos/stream/9/x86_64/configure
similarity index 98%
rename from scripts/ci/org.centos/stream/8/x86_64/configure
rename to scripts/ci/org.centos/stream/9/x86_64/configure
index 76781f17f41..1b6f40fd785 100755
--- a/scripts/ci/org.centos/stream/8/x86_64/configure
+++ b/scripts/ci/org.centos/stream/9/x86_64/configure
@@ -16,7 +16,7 @@
 # that patches adding downstream specific devices are not available.
 #
 ../configure \
---python=/usr/bin/python3.8 \
+--python=/usr/bin/python3.9 \
 --prefix="/usr" \
 --libdir="/usr/lib64" \
 --datadir="/usr/share" \
@@ -157,7 +157,6 @@
 --enable-docs \
 --enable-fdt \
 --enable-gcrypt \
---enable-glusterfs \
 --enable-gnutls \
 --enable-guest-agent \
 --enable-iconv \
@@ -180,7 +179,6 @@
 --enable-seccomp \
 --enable-snappy \
 --enable-smartcard \
---enable-spice \
 --enable-system \
 --enable-tcg \
 --enable-tools \
diff --git a/scripts/ci/org.centos/stream/8/x86_64/test-avocado 
b/scripts/ci/org.centos/stream/9/x86_64/test-avocado
similarity index 100%
rename from scripts/ci/org.centos/stream/8/x86_64/test-avocado
rename to scripts/ci/org.centos/stream/9/x86_64/test-avocado
diff --git a/scripts/ci/setup/build-environment.yml 
b/scripts/ci/setup/build-environment.yml
index f344d1a8509..9b7d96c01b2 100644
--- a/scripts/ci/setup/build-environment.yml
+++ b/scripts/ci/setup/build-environment.yml
@@ -174,26 +174,26 @@
 - ansible_facts['distribution_version'] == '2

Re: [PATCH for-9.1 4/9] Bump minimum glib version to v2.66

2024-04-12 Thread Paolo Bonzini
On Thu, Mar 28, 2024 at 3:06 PM Thomas Huth  wrote:
>
> Now that we dropped support for CentOS 8 and Ubuntu 20.04, we can
> look into bumping the glib version to a new minimum for further
> clean-ups. According to repology.org, available versions are:
>
>  CentOS Stream 9:   2.66.7
>  Debian 11: 2.66.8
>  Fedora 38: 2.74.1
>  Freebsd:   2.78.4
>  Homebrew:  2.80.0
>  Openbsd:   2.78.4
>  OpenSuse leap 15.5:2.70.5
>  pkgsrc_current:2.78.4
>  Ubuntu 22.04:  2.72.1
>
> Thus it should be safe to bump the minimum glib version to 2.66 now.
> Version 2.66 comes with new functions for URI parsing which will
> allow further clean-ups in the following patches.

Missing:

diff --git a/qga/commands-posix-ssh.c b/qga/commands-posix-ssh.c
index b0e0b1d674f..cc1f5a708e4 100644
--- a/qga/commands-posix-ssh.c
+++ b/qga/commands-posix-ssh.c
@@ -288,7 +288,6 @@ qmp_guest_ssh_get_authorized_keys(
 }

 #ifdef QGA_BUILD_UNIT_TEST
-#if GLIB_CHECK_VERSION(2, 60, 0)
 static const strList test_key2 = {
 .value = (char *)"algo key2 comments"
 };
@@ -484,11 +483,4 @@ int main(int argc, char *argv[])

 return g_test_run();
 }
-#else
-int main(int argc, char *argv[])
-{
-g_test_message("test skipped, needs glib >= 2.60");
-return 0;
-}
-#endif /* GLIB_2_60 */
 #endif /* BUILD_UNIT_TEST */

Paolo




[PULL 2/2] meson.build: Disable -fzero-call-used-regs on OpenBSD

2024-04-12 Thread Paolo Bonzini
From: Thomas Huth 

QEMU currently does not work on OpenBSD since the -fzero-call-used-regs
option that we added to meson.build recently does not work with the
"retguard" extension from OpenBSD's Clang. Thus let's disable the
-fzero-call-used-regs here until there's a better solution available.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2278
Signed-off-by: Thomas Huth 
Reviewed-by: Philippe Mathieu-Daudé 
Message-ID: <20240411120819.56417-1-th...@redhat.com>
Signed-off-by: Paolo Bonzini 
---
 meson.build | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/meson.build b/meson.build
index c9c3217ba4b..91a0aa64c64 100644
--- a/meson.build
+++ b/meson.build
@@ -562,7 +562,11 @@ hardening_flags = [
 #
 # NB: Clang 17 is broken and SEGVs
 # https://github.com/llvm/llvm-project/issues/75168
-if cc.compiles('extern struct { void (*cb)(void); } s; void f(void) { s.cb(); 
}',
+#
+# NB2: This clashes with the "retguard" extension of OpenBSD's Clang
+# https://gitlab.com/qemu-project/qemu/-/issues/2278
+if host_os != 'openbsd' and \
+   cc.compiles('extern struct { void (*cb)(void); } s; void f(void) { s.cb(); 
}',
name: '-fzero-call-used-regs=used-gpr',
args: ['-O2', '-fzero-call-used-regs=used-gpr'])
 hardening_flags += '-fzero-call-used-regs=used-gpr'
-- 
2.44.0




[PULL 0/2] Final build system fixes for 9.0

2024-04-12 Thread Paolo Bonzini
The following changes since commit 02e16ab9f4f19c4bdd17c51952d70e2ded74c6bf:

  Update version for v9.0.0-rc3 release (2024-04-10 18:05:18 +0100)

are available in the Git repository at:

  https://gitlab.com/bonzini/qemu.git tags/for-upstream

for you to fetch changes up to 2d6d995709482cc8b6a76dbb5334a28001a14a9a:

  meson.build: Disable -fzero-call-used-regs on OpenBSD (2024-04-12 12:02:12 
+0200)


build system fixes


Matheus Tavares Bernardino (1):
  Makefile: fix use of -j without an argument

Thomas Huth (1):
  meson.build: Disable -fzero-call-used-regs on OpenBSD

 Makefile| 9 +++--
 meson.build | 6 +-
 2 files changed, 12 insertions(+), 3 deletions(-)
-- 
2.44.0




[PULL 1/2] Makefile: fix use of -j without an argument

2024-04-12 Thread Paolo Bonzini
From: Matheus Tavares Bernardino 

Our Makefile massages the given make arguments to invoke ninja
accordingly. One key difference is that ninja will parallelize by
default, whereas make only does so with -jN or a bare -j. The make man page
says that "if the -j option is given without an argument, make will not
limit the number of jobs that can run simultaneously". We used to support
that by replacing -j with "" (empty string) when calling ninja, so that
it would do its auto-parallelization based on the number of CPU cores.

This was accidentally broken at d1ce2cc95b (Makefile: preserve
--jobserver-auth argument when calling ninja, 2024-04-02),
causing `make -j` to fail:

$ make -j V=1
  /usr/bin/ninja -v   -j -d keepdepfile all | cat
  make  -C contrib/plugins/ V="1" TARGET_DIR="contrib/plugins/" all
  ninja: fatal: invalid -j parameter
  make: *** [Makefile:161: run-ninja] Error

Let's fix that and indent the touched code for better readability.
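
The flag translation described above can be modelled outside of make. The
following is only a rough sketch in C, not the Makefile's actual code: the
function names and the word-array representation of MAKEFLAGS are
hypothetical. It keeps -jN and -lN, drops a bare -j so that ninja picks its
own parallelism, and falls back to -j1 only when neither a job flag nor a
--jobserver-auth= argument is present.

```c
#include <assert.h>
#include <string.h>

/* Nonzero if word starts with prefix (models a make "prefix%" pattern). */
static int has_prefix(const char *word, const char *prefix)
{
    return strncmp(word, prefix, strlen(prefix)) == 0;
}

/* Append word to the space-separated list in out, truncating if needed. */
static void append_word(char *out, size_t outsz, const char *word)
{
    size_t used = strlen(out);

    if (used > 0 && used + 1 < outsz) {
        out[used++] = ' ';
        out[used] = '\0';
    }
    if (used + 1 < outsz) {
        strncat(out, word, outsz - used - 1);
    }
}

/*
 * Hypothetical model of the job-control part of NINJAFLAGS.
 * makeflags is an array of n already-split words; out receives the result.
 */
static void ninja_job_flags(const char **makeflags, size_t n,
                            char *out, size_t outsz)
{
    int have_lj = 0, have_jobserver = 0;
    size_t i;

    out[0] = '\0';
    for (i = 0; i < n; i++) {
        const char *w = makeflags[i];

        if (has_prefix(w, "-l") || has_prefix(w, "-j")) {
            have_lj = 1;
            /* Like $(filter-out -j, ...): a bare -j is dropped. */
            if (strcmp(w, "-j") != 0) {
                append_word(out, outsz, w);
            }
        } else if (has_prefix(w, "--jobserver-auth=")) {
            have_jobserver = 1;
        }
    }
    /* Serial build only when make asked for no parallelism at all. */
    if (!have_lj && !have_jobserver) {
        append_word(out, outsz, "-j1");
    }
}
```

With the single word "-j" as input the result is the empty string, which is
the behaviour the fix restores.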

Signed-off-by: Matheus Tavares Bernardino 
Fixes: d1ce2cc95b ("Makefile: preserve --jobserver-auth argument when calling 
ninja", 2024-04-02)
Signed-off-by: Paolo Bonzini 
---
 Makefile | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/Makefile b/Makefile
index 183756018ff..02a257584ba 100644
--- a/Makefile
+++ b/Makefile
@@ -141,8 +141,13 @@ MAKE.n = $(findstring n,$(firstword $(filter-out 
--%,$(MAKEFLAGS
 MAKE.k = $(findstring k,$(firstword $(filter-out --%,$(MAKEFLAGS
 MAKE.q = $(findstring q,$(firstword $(filter-out --%,$(MAKEFLAGS
 MAKE.nq = $(if $(word 2, $(MAKE.n) $(MAKE.q)),nq)
-NINJAFLAGS = $(if $V,-v) $(if $(MAKE.n), -n) $(if $(MAKE.k), -k0) \
-$(or $(filter -l% -j%, $(MAKEFLAGS)), $(if $(filter 
--jobserver-auth=%, $(MAKEFLAGS)),, -j1)) \
+NINJAFLAGS = \
+$(if $V,-v) \
+$(if $(MAKE.n), -n) \
+$(if $(MAKE.k), -k0) \
+$(filter-out -j, \
+  $(or $(filter -l% -j%, $(MAKEFLAGS)), \
+   $(if $(filter --jobserver-auth=%, $(MAKEFLAGS)),, -j1))) \
 -d keepdepfile
 ninja-cmd-goals = $(or $(MAKECMDGOALS), all)
 ninja-cmd-goals += $(foreach g, $(MAKECMDGOALS), $(.ninja-goals.$g))
-- 
2.44.0




Re: [PATCH for-9.0] meson.build: Disable -fzero-call-used-regs on OpenBSD

2024-04-12 Thread Paolo Bonzini
Queued, thanks.

Paolo




Re: [PATCH] Makefile: fix use of -j without an argument

2024-04-12 Thread Paolo Bonzini
On Thu, Apr 11, 2024 at 5:46 PM Matheus Tavares Bernardino
 wrote:
> +$(if $(filter -j, $(MAKEFLAGS)) \
> +,, \
> +$(or \
> + $(filter -l% -j%, $(MAKEFLAGS)), \
> + $(if $(filter --jobserver-auth=%, $(MAKEFLAGS)),, -j1)) \
> +) -d keepdepfile

This is more easily written as $(filter-out -j, $(or ...)).

I've sent a v2.

Paolo

>  ninja-cmd-goals = $(or $(MAKECMDGOALS), all)
>  ninja-cmd-goals += $(foreach g, $(MAKECMDGOALS), $(.ninja-goals.$g))
>
> --
> 2.37.2
>




[PATCH v2] Makefile: fix use of -j without an argument

2024-04-12 Thread Paolo Bonzini
From: Matheus Tavares Bernardino 

Our Makefile massages the given make arguments to invoke ninja
accordingly. One key difference is that ninja will parallelize by
default, whereas make only does so with -jN or a bare -j. The make man page
says that "if the -j option is given without an argument, make will not
limit the number of jobs that can run simultaneously". We used to support
that by replacing -j with "" (empty string) when calling ninja, so that
it would do its auto-parallelization based on the number of CPU cores.

This was accidentally broken at d1ce2cc95b (Makefile: preserve
--jobserver-auth argument when calling ninja, 2024-04-02),
causing `make -j` to fail:

$ make -j V=1
  /usr/bin/ninja -v   -j -d keepdepfile all | cat
  make  -C contrib/plugins/ V="1" TARGET_DIR="contrib/plugins/" all
  ninja: fatal: invalid -j parameter
  make: *** [Makefile:161: run-ninja] Error

Let's fix that and indent the touched code for better readability.

Signed-off-by: Matheus Tavares Bernardino 
Fixes: d1ce2cc95b ("Makefile: preserve --jobserver-auth argument when calling 
ninja", 2024-04-02)
Signed-off-by: Paolo Bonzini 
---
 Makefile | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/Makefile b/Makefile
index 183756018ff..02a257584ba 100644
--- a/Makefile
+++ b/Makefile
@@ -141,8 +141,13 @@ MAKE.n = $(findstring n,$(firstword $(filter-out 
--%,$(MAKEFLAGS
 MAKE.k = $(findstring k,$(firstword $(filter-out --%,$(MAKEFLAGS
 MAKE.q = $(findstring q,$(firstword $(filter-out --%,$(MAKEFLAGS
 MAKE.nq = $(if $(word 2, $(MAKE.n) $(MAKE.q)),nq)
-NINJAFLAGS = $(if $V,-v) $(if $(MAKE.n), -n) $(if $(MAKE.k), -k0) \
-$(or $(filter -l% -j%, $(MAKEFLAGS)), $(if $(filter 
--jobserver-auth=%, $(MAKEFLAGS)),, -j1)) \
+NINJAFLAGS = \
+$(if $V,-v) \
+$(if $(MAKE.n), -n) \
+$(if $(MAKE.k), -k0) \
+$(filter-out -j, \
+  $(or $(filter -l% -j%, $(MAKEFLAGS)), \
+   $(if $(filter --jobserver-auth=%, $(MAKEFLAGS)),, -j1))) \
 -d keepdepfile
 ninja-cmd-goals = $(or $(MAKECMDGOALS), all)
 ninja-cmd-goals += $(foreach g, $(MAKECMDGOALS), $(.ninja-goals.$g))
-- 
2.44.0




Re: [PATCH for-9.1 v1 0/3] Add SEV/SEV-ES machine compat options for KVM_SEV_INIT2

2024-04-11 Thread Paolo Bonzini
On Wed, Apr 10, 2024 at 1:08 AM Michael Roth  wrote:
>
> These patches are also available at:
>
>   https://github.com/amdese/qemu/commits/sev-init-legacy-v1
>
> and are based on top of Paolo's qemu-coco-queue branch containing the
> following patches:

A more complete version of patch 2 was already on the list, so I
queued 1 and 3 to qemu-coco-queue.

Thanks!

Paolo

>
>   [PATCH for-9.1 00/26] x86, kvm: common confidential computing subset
>   https://lore.kernel.org/all/20240322181116.1228416-1-pbonz...@redhat.com/T/
>
> Overview
> --------
>
> With the following patches applied from qemu-coco-queue:
>
>   https://lore.kernel.org/all/2024031914.1014247-1-pbonz...@redhat.com/
>
> QEMU version 9.1+ will begin automatically making use of the new
> KVM_SEV_INIT2 API for initializing SEV and SEV-ES (and eventually, SEV-SNP)
> guests versus the older KVM_SEV_INIT/KVM_SEV_ES_INIT interfaces.
>
> However, the older interfaces would silently avoid sync'ing FPU/XSAVE state
> set by QEMU to each vCPU's VMSA prior to encryption. With KVM_SEV_INIT2,
> this state will now be synced into the VMSA, resulting in measurement
> changes and, theoretically, behavioral changes, though the latter are
> unlikely to be seen in practice. The specific VMSA changes are documented
> in the section below for reference.
>
> This series implements machine compatibility options for SEV/SEV-ES so that
> only VMs created with QEMU 9.1+ will make use of KVM_SEV_INIT2 so that VMSA
> differences can be accounted for beforehand, and older machine types will
> continue using the older interfaces to avoid unexpected measurement
> changes.
>
> Specific VMSA changes
> ---------------------
>
> With KVM_SEV_INIT2, rather than 0, QEMU/KVM will instead begin setting the
> following fields in the VMSA before measurement/encryption:
>
>   VMSA byte offset [1032:1033] = 80 1f (MXCSR, Multimedia Control Status
> Register)
>   VMSA byte offset [1040:1041] = 7f 03 (FCW, FPU/x86 Control Word)
>
> Setting FCW (FPU/x86 Control Word) to 0x37f is consistent with 11.5.7 of
> APM Volume 2. MXCSR reset state is not defined for XSAVE, but QEMU's 0x1f80
> value is consistent with machine reset state documented in APM Volume 2
> 4.2.2. As such, it is reasonable to begin including these in the VMSA
> measurement calculations.
>
> NOTE: section 11.5.7 also documents that FTW should be all 1's, whereas
>   QEMU currently sets all zeroes. Should that be changed as part of
>   this, or are there other reasons for setting 0?
>
> Thanks,
>
> Mike
>
> 
> Michael Roth (3):
>   i386/sev: Add 'legacy-vm-type' parameter for SEV guest objects
>   hw/i386: Add 9.1 machine types for i440fx/q35
>   hw/i386/sev: Use legacy SEV VM types for older machine types
>
>  hw/i386/pc.c |  5 +
>  hw/i386/pc_piix.c| 13 -
>  hw/i386/pc_q35.c | 12 +++-
>  include/hw/i386/pc.h |  3 +++
>  qapi/qom.json| 11 ++-
>  target/i386/sev.c| 19 ++-
>  6 files changed, 59 insertions(+), 4 deletions(-)
>
>
>
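
The byte values quoted above follow from storing the register defaults in
little-endian order. A minimal sketch in C, assuming a 4096-byte VMSA buffer
and using only the two offsets given in the cover letter; the constant and
function names are hypothetical and are not taken from QEMU or KVM:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Offsets as quoted in the cover letter; the buffer size is an assumption. */
enum {
    VMSA_SIZE      = 4096,
    VMSA_OFF_MXCSR = 1032,
    VMSA_OFF_FCW   = 1040
};

/* Store a 16-bit value in little-endian byte order. */
static void put_le16(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v & 0xff);
    p[1] = (uint8_t)(v >> 8);
}

/* Fill in the two fields that KVM_SEV_INIT2 is described as syncing. */
static void vmsa_set_fpu_defaults(uint8_t *vmsa)
{
    put_le16(vmsa + VMSA_OFF_MXCSR, 0x1f80); /* MXCSR reset value */
    put_le16(vmsa + VMSA_OFF_FCW, 0x037f);   /* x87 control word  */
}
```

Starting from an all-zero buffer, bytes 1032..1033 become 80 1f and bytes
1040..1041 become 7f 03, which matches the listing in the quoted text.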




Re: [PATCH for-9.1 09/19] target/i386: move 60-BF opcodes to new decoder

2024-04-11 Thread Paolo Bonzini
On Thu, Apr 11, 2024 at 5:05 PM Zhao Liu  wrote:
>
> On Tue, Apr 09, 2024 at 06:43:13PM +0200, Paolo Bonzini wrote:
> > Date: Tue,  9 Apr 2024 18:43:13 +0200
> > From: Paolo Bonzini 
> > Subject: [PATCH for-9.1 09/19] target/i386: move 60-BF opcodes to new
> >  decoder
> > X-Mailer: git-send-email 2.44.0
> >
> > Compared to the old decoder, the main differences in translation
> > are for the little-used ARPL instruction.  IMUL is adjusted a bit
> > to share more code to produce flags, but is otherwise very similar.
> >
> > Signed-off-by: Paolo Bonzini 
> > ---
> >  target/i386/tcg/decode-new.h |   2 +
> >  target/i386/tcg/translate.c  |   9 +-
> >  target/i386/tcg/decode-new.c.inc | 171 +
> >  target/i386/tcg/emit.c.inc   | 317 +++
> >  4 files changed, 497 insertions(+), 2 deletions(-)
>
> Hmm, I hit a guest boot failure with this patch because ATA is not recognized.
> I haven't located the exact error yet, so let me post my log first.
> If there are other means I can use to dig further, I'd be happy to try
> that too.
>
> # Command (boot a ubuntu Guest via TCG)
>
> ./qemu/build/qemu-system-x86_64 \
> -smp 1 \
> -name ubuntu -m 4G \
> -cpu max -accel tcg \
> -hda ../img_qemu/test.qcow2 -nographic \
> -kernel ../img_qemu/kernel/vmlinuz-6.4.0-rc6+ \
> -initrd ../img_qemu/kernel/initrd.img-6.4.0-rc6+ \
> -append "root=/dev/sda ro console=ttyS0" \
> -qmp unix:/tmp/qmp-sock,server=on,wait=off

I did run a bunch of boot tests but I'll check this one too.

Thanks!

Paolo




Re: [PATCH for-9.1 09/19] target/i386: move 60-BF opcodes to new decoder

2024-04-11 Thread Paolo Bonzini
On Thu, Apr 11, 2024 at 9:47 AM Richard Henderson
 wrote:
> > +case MO_32:
> > +#ifdef TARGET_X86_64
> > +/*
> > + * This could also use the same algorithm as MO_16.  It produces 
> > fewer
> > + * TCG ops and better code if flags are needed, but it requires a 
> > 64-bit
> > + * multiply even if they are not (and thus the high part of the 
> > multiply
> > + * is dead).
> > + */
>
> Is 64-bit multiply ever slower these days?
> My intuition says "slow" multiply is at least a decade out of date.

I was thinking more about TCG_TARGET_REG_BITS == 32.

> > +tcg_gen_negsetcondi_i32(TCG_COND_LT, s->tmp2_i32, s->tmp2_i32, 0);
>
> This seems like something the optimizer should handle, but doesn't.

I wanted to avoid using TARGET_LONG_BITS - 1, but it's not a problem
to use extract. I've changed it.

At least the ppc and x86 backends do support it and convert it to SAR,
so I didn't notice in my test that it was the backend doing it and not
the optimizer!

Paolo
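
The equivalence discussed in this exchange, a negated "less than zero" test
versus a sign-filling shift, can be checked with plain integers. This is a
sketch for the 32-bit case only; the helper names are made up for
illustration and do not correspond to TCG functions:

```c
#include <assert.h>
#include <stdint.h>

/* Models negsetcond with LT against 0: all ones if x < 0, else zero. */
static uint32_t negsetcond_lt0(int32_t x)
{
    return x < 0 ? UINT32_MAX : 0;
}

/*
 * Models an arithmetic shift right by 31 (or a signed extract of the
 * top bit) without relying on implementation-defined >> of negatives.
 */
static uint32_t sign_fill(int32_t x)
{
    uint32_t top = (uint32_t)x >> 31;   /* 0 or 1 */

    return (uint32_t)(0u - top);        /* 0 or 0xffffffff */
}
```

Both helpers return the same value for every input, which is why a backend
can lower the first form to SAR.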




Re: [PATCH for-9.1 04/19] target/i386: do not use s->tmp0 and s->tmp4 to compute flags

2024-04-10 Thread Paolo Bonzini
On Wed, Apr 10, 2024, 08:35 Richard Henderson wrote:

> On 4/9/24 06:43, Paolo Bonzini wrote:
> > Create a new temporary whenever flags have to use one, instead of using
> > s->tmp0 or s->tmp4.  NULL can now be passed as the scratch register
> > to gen_prepare_*.
> >
> > Signed-off-by: Paolo Bonzini 
> > ---
> >   target/i386/tcg/translate.c | 54 +
> >   1 file changed, 31 insertions(+), 23 deletions(-)
> >
> > diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
> > index 197cccb6c96..debc1b27283 100644
> > --- a/target/i386/tcg/translate.c
> > +++ b/target/i386/tcg/translate.c
> > @@ -947,9 +947,9 @@ static CCPrepare gen_prepare_eflags_c(DisasContext
> *s, TCGv reg)
> >   case CC_OP_SUBB ... CC_OP_SUBQ:
> >   /* (DATA_TYPE)CC_SRCT < (DATA_TYPE)CC_SRC */
> >   size = s->cc_op - CC_OP_SUBB;
> > -t1 = gen_ext_tl(s->tmp0, cpu_cc_src, size, false);
> > -/* If no temporary was used, be careful not to alias t1 and
> t0.  */
> > -t0 = t1 == cpu_cc_src ? s->tmp0 : reg;
> > +/* Be careful not to alias t1 and t0.  */
> > +t1 = gen_ext_tl(NULL, cpu_cc_src, size, false);
> > +t0 = (reg == t1 || !reg) ? tcg_temp_new() : reg;
> >   tcg_gen_mov_tl(t0, s->cc_srcT);
> >   gen_extu(size, t0);
>
> The tcg_temp_new, mov, and extu can be had with gen_ext_tl...
>

There's actually a lot more that can be done now that I looked more closely
at gen_ext_tl. It is fine (modulo bugs elsewhere) to just extend cc_* in
place. In fact this lets the optimizer work better, even allows (rare)
cross tb optimization because it effectively bumps CC_OP_ADD* to
target_long size, and is just as effective in removing tmp0/tmp4.

Paolo


> >   goto add_sub;
> > @@ -957,8 +957,9 @@ static CCPrepare gen_prepare_eflags_c(DisasContext
> *s, TCGv reg)
> >   case CC_OP_ADDB ... CC_OP_ADDQ:
> >   /* (DATA_TYPE)CC_DST < (DATA_TYPE)CC_SRC */
> >   size = s->cc_op - CC_OP_ADDB;
> > -t1 = gen_ext_tl(s->tmp0, cpu_cc_src, size, false);
> > -t0 = gen_ext_tl(reg, cpu_cc_dst, size, false);
> > +/* Be careful not to alias t1 and t0.  */
> > +t1 = gen_ext_tl(NULL, cpu_cc_src, size, false);
> > +t0 = gen_ext_tl(reg == t1 ? NULL : reg, cpu_cc_dst, size,
> false);
>
> ... like this.
>
> It would be helpful to update the function comments (nothing is 'compute
> ... to reg' in
> these functions).  Future cleanup, perhaps rename 'reg' to 'scratch', or
> remove the
> argument entirely where applicable.
>
> > @@ -1109,11 +1113,13 @@ static CCPrepare gen_prepare_cc(DisasContext *s,
> int b, TCGv reg)
> >   size = s->cc_op - CC_OP_SUBB;
> >   switch (jcc_op) {
> >   case JCC_BE:
> > -tcg_gen_mov_tl(s->tmp4, s->cc_srcT);
> > -gen_extu(size, s->tmp4);
> > -t0 = gen_ext_tl(s->tmp0, cpu_cc_src, size, false);
> > -cc = (CCPrepare) { .cond = TCG_COND_LEU, .reg = s->tmp4,
> > -   .reg2 = t0, .use_reg2 = true };
> > +/* Be careful not to alias t1 and t0.  */
> > +t1 = gen_ext_tl(NULL, cpu_cc_src, size, false);
> > +t0 = (reg == t1 || !reg) ? tcg_temp_new() : reg;
> > +tcg_gen_mov_tl(t0, s->cc_srcT);
> > +gen_extu(size, t0);
>
> gen_ext_tl
>
> > +cc = (CCPrepare) { .cond = TCG_COND_LEU, .reg = t0,
> > +   .reg2 = t1, .use_reg2 = true };
> >   break;
> >
> >   case JCC_L:
> > @@ -1122,11 +1128,13 @@ static CCPrepare gen_prepare_cc(DisasContext *s,
> int b, TCGv reg)
> >   case JCC_LE:
> >   cond = TCG_COND_LE;
> >   fast_jcc_l:
> > -tcg_gen_mov_tl(s->tmp4, s->cc_srcT);
> > -gen_exts(size, s->tmp4);
> > -t0 = gen_ext_tl(s->tmp0, cpu_cc_src, size, true);
> > -cc = (CCPrepare) { .cond = cond, .reg = s->tmp4,
> > -   .reg2 = t0, .use_reg2 = true };
> > +/* Be careful not to alias t1 and t0.  */
> > +t1 = gen_ext_tl(NULL, cpu_cc_src, size, true);
> > +t0 = (reg == t1 || !reg) ? tcg_temp_new() : reg;
> > +tcg_gen_mov_tl(t0, s->cc_srcT);
> > +gen_exts(size, t0);
>
> gen_ext_tl
>
> With that,
> Reviewed-by: Richard Henderson 
>
>
> r~
>
>


[PATCH for-9.1 19/19] target/i386: remove duplicate prefix decoding

2024-04-09 Thread Paolo Bonzini
Now that the bulk of opcodes go through the new decoder, it is sensible
to do some cleanup.  Go immediately through disas_insn_new and only jump
back after parsing the prefixes.

disas_insn() now only contains the three sigsetjmp cases, and they
are more easily managed if they are inlined into i386_tr_translate_insn.
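
As an illustration of the kind of prefix handling that no longer needs to be
duplicated, the REX case of the removed loop can be sketched on its own. The
bit manipulation mirrors the lines deleted by this patch; the struct and
function names are hypothetical, and the check for 64-bit mode is left out:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Decoded REX fields; r, x and b are pre-shifted to bit 3. */
typedef struct RexBits {
    bool w;
    uint8_t r;
    uint8_t x;
    uint8_t b;
} RexBits;

/* Returns true and fills *out if byte is in the REX range 0x40..0x4f. */
static bool decode_rex(uint8_t byte, RexBits *out)
{
    if (byte < 0x40 || byte > 0x4f) {
        return false;
    }
    out->w = (byte >> 3) & 1;
    out->r = (uint8_t)((byte & 0x4) << 1);
    out->x = (uint8_t)((byte & 0x2) << 2);
    out->b = (uint8_t)((byte & 0x1) << 3);
    return true;
}
```

Keeping r, x and b pre-shifted means each one can be ORed directly into a
3-bit register number from ModRM or SIB to form a 4-bit index.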

Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/translate.c  | 259 +++
 target/i386/tcg/decode-new.c.inc |  60 +--
 2 files changed, 100 insertions(+), 219 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index d3c863c5d1d..93601abf994 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -2453,10 +2453,6 @@ static void gen_sty_env_A0(DisasContext *s, int offset, 
bool align)
 tcg_gen_qemu_st_i128(t, s->tmp0, mem_index, mop);
 }
 
-#include "decode-new.h"
-#include "emit.c.inc"
-#include "decode-new.c.inc"
-
 static void gen_cmpxchg8b(DisasContext *s, CPUX86State *env, int modrm)
 {
 TCGv_i64 cmp, val, old;
@@ -3119,183 +3115,6 @@ static bool disas_insn_x87(DisasContext *s, CPUState 
*cpu, int b)
 return true;
 }
 
-static void disas_insn_old(DisasContext *s, CPUState *cpu, int b);
-
-/* convert one instruction. s->base.is_jmp is set if the translation must
-   be stopped. Return the next pc value */
-static bool disas_insn(DisasContext *s, CPUState *cpu)
-{
-CPUX86State *env = cpu_env(cpu);
-int b, prefixes;
-MemOp aflag, dflag;
-bool orig_cc_op_dirty = s->cc_op_dirty;
-CCOp orig_cc_op = s->cc_op;
-target_ulong orig_pc_save = s->pc_save;
-
-s->pc = s->base.pc_next;
-s->override = -1;
-s->popl_esp_hack = 0;
-#ifdef TARGET_X86_64
-s->rex_r = 0;
-s->rex_x = 0;
-s->rex_b = 0;
-#endif
-s->rip_offset = 0; /* for relative ip address */
-s->vex_l = 0;
-s->vex_v = 0;
-s->vex_w = false;
-switch (sigsetjmp(s->jmpbuf, 0)) {
-case 0:
-break;
-case 1:
-gen_exception_gpf(s);
-return true;
-case 2:
-/* Restore state that may affect the next instruction. */
-s->pc = s->base.pc_next;
-/*
- * TODO: These save/restore can be removed after the table-based
- * decoder is complete; we will be decoding the insn completely
- * before any code generation that might affect these variables.
- */
-s->cc_op_dirty = orig_cc_op_dirty;
-s->cc_op = orig_cc_op;
-s->pc_save = orig_pc_save;
-/* END TODO */
-s->base.num_insns--;
-tcg_remove_ops_after(s->prev_insn_end);
-s->base.insn_start = s->prev_insn_start;
-s->base.is_jmp = DISAS_TOO_MANY;
-return false;
-default:
-g_assert_not_reached();
-}
-
-prefixes = 0;
-
- next_byte:
-s->prefix = prefixes;
-b = x86_ldub_code(env, s);
-/* Collect prefixes.  */
-switch (b) {
-case 0x0f:
-b = x86_ldub_code(env, s) + 0x100;
-break;
-case 0xf3:
-prefixes |= PREFIX_REPZ;
-prefixes &= ~PREFIX_REPNZ;
-goto next_byte;
-case 0xf2:
-prefixes |= PREFIX_REPNZ;
-prefixes &= ~PREFIX_REPZ;
-goto next_byte;
-case 0xf0:
-prefixes |= PREFIX_LOCK;
-goto next_byte;
-case 0x2e:
-s->override = R_CS;
-goto next_byte;
-case 0x36:
-s->override = R_SS;
-goto next_byte;
-case 0x3e:
-s->override = R_DS;
-goto next_byte;
-case 0x26:
-s->override = R_ES;
-goto next_byte;
-case 0x64:
-s->override = R_FS;
-goto next_byte;
-case 0x65:
-s->override = R_GS;
-goto next_byte;
-case 0x66:
-prefixes |= PREFIX_DATA;
-goto next_byte;
-case 0x67:
-prefixes |= PREFIX_ADR;
-goto next_byte;
-#ifdef TARGET_X86_64
-case 0x40 ... 0x4f:
-if (CODE64(s)) {
-/* REX prefix */
-prefixes |= PREFIX_REX;
-s->vex_w = (b >> 3) & 1;
-s->rex_r = (b & 0x4) << 1;
-s->rex_x = (b & 0x2) << 2;
-s->rex_b = (b & 0x1) << 3;
-goto next_byte;
-}
-break;
-#endif
-case 0xc5: /* 2-byte VEX */
-case 0xc4: /* 3-byte VEX */
-if (CODE32(s) && !VM86(s)) {
-int vex2 = x86_ldub_code(env, s);
-s->pc--; /* rewind the advance_pc() x86_ldub_code() did */
-
-if (!CODE64(s) && (vex2 & 0xc0) != 0xc0) {
-/* 4.1.4.6: In 32-bit mode, bits [7:6] must be 11b,
-   otherwise the instruction is LES or LDS.  */
-break;
-}
-disas_insn_new(s, cpu, b);
-return s->pc;
-}
-break;
- 

[PATCH for-9.1 10/19] target/i386: generalize gen_movl_seg_T0

2024-04-09 Thread Paolo Bonzini
In the new decoder it is sometimes easier to put the segment
in T1 instead of T0, usually because another operand was loaded
by common code in T0.  Generalize gen_movl_seg_T0 to allow
using any source.

Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/translate.c | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index de1ccb6ea7f..8a34e50c452 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -2531,12 +2531,12 @@ static void gen_op_movl_seg_real(DisasContext *s, 
X86Seg seg_reg, TCGv seg)
 tcg_gen_shli_tl(cpu_seg_base[seg_reg], selector, 4);
 }
 
-/* move T0 to seg_reg and compute if the CPU state may change. Never
+/* move SRC to seg_reg and compute if the CPU state may change. Never
call this function with seg_reg == R_CS */
-static void gen_movl_seg_T0(DisasContext *s, X86Seg seg_reg)
+static void gen_movl_seg(DisasContext *s, X86Seg seg_reg, TCGv src)
 {
 if (PE(s) && !VM86(s)) {
-tcg_gen_trunc_tl_i32(s->tmp2_i32, s->T0);
+tcg_gen_trunc_tl_i32(s->tmp2_i32, src);
 gen_helper_load_seg(tcg_env, tcg_constant_i32(seg_reg), s->tmp2_i32);
 /* abort translation because the addseg value may change or
because ss32 may change. For R_SS, translation must always
@@ -2548,7 +2548,7 @@ static void gen_movl_seg_T0(DisasContext *s, X86Seg 
seg_reg)
 s->base.is_jmp = DISAS_EOB_NEXT;
 }
 } else {
-gen_op_movl_seg_real(s, seg_reg, s->T0);
+gen_op_movl_seg_real(s, seg_reg, src);
 if (seg_reg == R_SS) {
 s->base.is_jmp = DISAS_EOB_INHIBIT_IRQ;
 }
@@ -4086,13 +4086,13 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 goto illegal_op;
 reg = b >> 3;
 ot = gen_pop_T0(s);
-gen_movl_seg_T0(s, reg);
+gen_movl_seg(s, reg, s->T0);
 gen_pop_update(s, ot);
 break;
 case 0x1a1: /* pop fs */
 case 0x1a9: /* pop gs */
 ot = gen_pop_T0(s);
-gen_movl_seg_T0(s, (b >> 3) & 7);
+gen_movl_seg(s, (b >> 3) & 7, s->T0);
 gen_pop_update(s, ot);
 break;
 
@@ -4139,7 +4139,7 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 if (reg >= 6 || reg == R_CS)
 goto illegal_op;
 gen_ldst_modrm(env, s, modrm, MO_16, OR_TMP0, 0);
-gen_movl_seg_T0(s, reg);
+gen_movl_seg(s, reg, s->T0);
 break;
 case 0x8c: /* mov Gv, seg */
 modrm = x86_ldub_code(env, s);
@@ -4325,7 +4325,7 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 gen_add_A0_im(s, 1 << ot);
 /* load the segment first to handle exceptions properly */
 gen_op_ld_v(s, MO_16, s->T0, s->A0);
-gen_movl_seg_T0(s, op);
+gen_movl_seg(s, op, s->T0);
 /* then put the data */
 gen_op_mov_reg_v(s, ot, reg, s->T1);
 break;
-- 
2.44.0




[PATCH for-9.1 09/19] target/i386: move 60-BF opcodes to new decoder

2024-04-09 Thread Paolo Bonzini
Compared to the old decoder, the main differences in translation
are for the little-used ARPL instruction.  IMUL is adjusted a bit
to share more code to produce flags, but is otherwise very similar.

Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/decode-new.h |   2 +
 target/i386/tcg/translate.c  |   9 +-
 target/i386/tcg/decode-new.c.inc | 171 +
 target/i386/tcg/emit.c.inc   | 317 +++
 4 files changed, 497 insertions(+), 2 deletions(-)

diff --git a/target/i386/tcg/decode-new.h b/target/i386/tcg/decode-new.h
index 8ffde8d1cd6..ca99a620ce9 100644
--- a/target/i386/tcg/decode-new.h
+++ b/target/i386/tcg/decode-new.h
@@ -165,6 +165,8 @@ typedef enum X86InsnSpecial {
 /* Always locked if it has a memory operand (XCHG) */
 X86_SPECIAL_Locked,
 
+/* Do not apply segment base to effective address */
+X86_SPECIAL_NoSeg,
 /*
  * Rd/Mb or Rd/Mw in the manual: register operand 0 is treated as 32 bits
  * (and writeback zero-extends it to 64 bits if applicable).  PREFIX_DATA
diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index c251fa21e6d..de1ccb6ea7f 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -1296,7 +1296,11 @@ static void gen_cmps(DisasContext *s, MemOp ot)
 gen_string_movl_A0_EDI(s);
 gen_op_ld_v(s, ot, s->T1, s->A0);
 gen_string_movl_A0_ESI(s);
-gen_op(s, OP_CMPL, ot, OR_TMP0);
+gen_op_ld_v(s, ot, s->T0, s->A0);
+tcg_gen_mov_tl(cpu_cc_src, s->T1);
+tcg_gen_mov_tl(s->cc_srcT, s->T0);
+tcg_gen_sub_tl(cpu_cc_dst, s->T0, s->T1);
+set_cc_op(s, CC_OP_SUBB + ot);
 
 dshift = gen_compute_Dshift(s, ot);
 gen_op_add_reg(s, s->aflag, R_ESI, dshift);
@@ -3124,6 +3128,7 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 
 s->pc = s->base.pc_next;
 s->override = -1;
+s->popl_esp_hack = 0;
 #ifdef TARGET_X86_64
 s->rex_r = 0;
 s->rex_x = 0;
@@ -3181,7 +3186,7 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 #ifndef CONFIG_USER_ONLY
 use_new &= b <= limit;
 #endif
-if (use_new && b <= 0x5f) {
+if (use_new && b <= 0xbf) {
 disas_insn_new(s, cpu, b);
 return true;
 }
diff --git a/target/i386/tcg/decode-new.c.inc b/target/i386/tcg/decode-new.c.inc
index c6fd7a053bd..f6d6873dd83 100644
--- a/target/i386/tcg/decode-new.c.inc
+++ b/target/i386/tcg/decode-new.c.inc
@@ -33,6 +33,13 @@
  * ("cannot encode 16-bit or 32-bit size in 64-bit mode") as modifiers of the
  * "v" or "z" sizes.  The decoder simply makes them separate operand sizes.
  *
+ * The manual lists immediate far destinations as Ap (technically an implicit
+ * argument).  The decoder splits them into two immediates, using "Ip" for
+ * the offset part (that comes first in the instruction stream) and "Iw" for
+ * the segment/selector part.  The size of the offset is given by s->dflag
+ * and the instructions are illegal in 64-bit mode, so the choice of "Ip"
+ * is somewhat arbitrary; "Iv" or "Iz" would work just as well.
+ *
  * Vector operands
  * ---
  *
@@ -151,6 +158,8 @@
  */
 #define X86_OP_ENTRYrr(op, op0, s0, op1, s1, ...) \
 X86_OP_ENTRY3(op, None, None, op0, s0, op1, s1, ## __VA_ARGS__)
+#define X86_OP_ENTRYwr(op, op0, s0, op1, s1, ...) \
+X86_OP_ENTRY3(op, op0, s0, None, None, op1, s1, ## __VA_ARGS__)
 #define X86_OP_ENTRY2(op, op0, s0, op1, s1, ...)  \
 X86_OP_ENTRY3(op, op0, s0, 2op, s0, op1, s1, ## __VA_ARGS__)
 #define X86_OP_ENTRYw(op, op0, s0, ...)   \
@@ -163,6 +172,7 @@
 X86_OP_ENTRY3(op, None, None, None, None, None, None, ## __VA_ARGS__)
 
 #define cpuid(feat) .cpuid = X86_FEAT_##feat,
+#define noseg .special = X86_SPECIAL_NoSeg,
 #define xchg .special = X86_SPECIAL_Locked,
 #define lock .special = X86_SPECIAL_HasLock,
 #define mmx .special = X86_SPECIAL_MMX,
@@ -209,6 +219,8 @@
 #define p_66_f3_f2.valid_prefix = P_66 | P_F3 | P_F2,
 #define p_00_66_f3_f2 .valid_prefix = P_00 | P_66 | P_F3 | P_F2,
 
+#define UNKNOWN_OPCODE ((X86OpEntry) {})
+
 static uint8_t get_modrm(DisasContext *s, CPUX86State *env)
 {
 if (!s->has_modrm) {
@@ -1108,6 +1120,51 @@ static void decode_0F(DisasContext *s, CPUX86State *env, 
X86OpEntry *entry, uint
 do_decode_0F(s, env, entry, b);
 }
 
+static void decode_63(DisasContext *s, CPUX86State *env, X86OpEntry *entry, 
uint8_t *b)
+{
+static const X86OpEntry arpl = X86_OP_ENTRY2(ARPL, E,w, G,w, chk(prot));
+static const X86OpEntry mov = X86_OP_ENTRY3(MOV, G,v, E,v, None, None);
+static const X86OpEntry movsxd = X86_OP_ENTRY3(MOV, G,v, E,d, None, None, 
sextT0);
+if (!CODE64(s)) {
+*entry = arpl;
+} else if (REX_W(s)) {
+*entry = movsxd;

[PATCH for-9.1 18/19] target/i386: split legacy decoder into a separate function

2024-04-09 Thread Paolo Bonzini
Split the bits that have some duplication with disas_insn_new, from
those that should be the main topic of the conversion.  This is the
first step towards removing duplicate decoding of prefixes between
disas_insn and disas_insn_new.

Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/translate.c | 58 +++--
 1 file changed, 37 insertions(+), 21 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index e7f51685ed8..d3c863c5d1d 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -3119,15 +3119,15 @@ static bool disas_insn_x87(DisasContext *s, CPUState 
*cpu, int b)
 return true;
 }
 
+static void disas_insn_old(DisasContext *s, CPUState *cpu, int b);
+
 /* convert one instruction. s->base.is_jmp is set if the translation must
be stopped. Return the next pc value */
 static bool disas_insn(DisasContext *s, CPUState *cpu)
 {
 CPUX86State *env = cpu_env(cpu);
 int b, prefixes;
-int shift;
-MemOp ot, aflag, dflag;
-int modrm, reg, rm, mod, op, opreg, val;
+MemOp aflag, dflag;
 bool orig_cc_op_dirty = s->cc_op_dirty;
 CCOp orig_cc_op = s->cc_op;
 target_ulong orig_pc_save = s->pc_save;
@@ -3273,6 +3273,38 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 s->aflag = aflag;
 s->dflag = dflag;
 
+switch (b) {
+case 0 ... 0xd7:
+case 0xe0  ... 0xff:
+case 0x10e ... 0x117:
+case 0x128 ... 0x12f:
+case 0x138 ... 0x19f:
+case 0x1a0 ... 0x1a1:
+case 0x1a8 ... 0x1a9:
+case 0x1af:
+case 0x1b2:
+case 0x1b4 ... 0x1b7:
+case 0x1be ... 0x1bf:
+case 0x1c2 ... 0x1c6:
+case 0x1c8 ... 0x1ff:
+disas_insn_new(s, cpu, b);
+break;
+default:
+disas_insn_old(s, cpu, b);
+break;
+}
+return true;
+}
+
+static void disas_insn_old(DisasContext *s, CPUState *cpu, int b)
+{
+CPUX86State *env = cpu_env(cpu);
+int prefixes = s->prefix;
+MemOp dflag = s->dflag;
+int shift;
+MemOp ot;
+int modrm, reg, rm, mod, op, opreg, val;
+
 /* now check op code */
 switch (b) {
 /**/
@@ -4726,31 +4758,15 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 
 set_cc_op(s, CC_OP_POPCNT);
 break;
-case 0 ... 0xd7:
-case 0xe0  ... 0xff:
-case 0x10e ... 0x117:
-case 0x128 ... 0x12f:
-case 0x138 ... 0x19f:
-case 0x1a0 ... 0x1a1:
-case 0x1a8 ... 0x1a9:
-case 0x1af:
-case 0x1b2:
-case 0x1b4 ... 0x1b7:
-case 0x1be ... 0x1bf:
-case 0x1c2 ... 0x1c6:
-case 0x1c8 ... 0x1ff:
-disas_insn_new(s, cpu, b);
-break;
 default:
 goto unknown_op;
 }
-return true;
+return;
  illegal_op:
 gen_illegal_opcode(s);
-return true;
+return;
  unknown_op:
 gen_unknown_opcode(env, s);
-return true;
 }
 
 void tcg_x86_init(void)
-- 
2.44.0




[PATCH for-9.1 05/19] target/i386: reintroduce debugging mechanism

2024-04-09 Thread Paolo Bonzini
Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/translate.c  | 27 +++
 target/i386/tcg/decode-new.c.inc |  3 +++
 2 files changed, 30 insertions(+)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index debc1b27283..2a372842db4 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -2971,6 +2971,9 @@ static void gen_sty_env_A0(DisasContext *s, int offset, 
bool align)
 tcg_gen_qemu_st_i128(t, s->tmp0, mem_index, mop);
 }
 
+static bool first = true;
+static unsigned long limit;
+
 #include "decode-new.h"
 #include "emit.c.inc"
 #include "decode-new.c.inc"
@@ -3126,15 +3129,39 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 
 prefixes = 0;
 
+if (first) {
+const char *limit_str = getenv("QEMU_I386_LIMIT");
+limit = limit_str ? atol(limit_str) : -1;
+first = false;
+}
+bool use_new = true;
+#ifdef CONFIG_USER_ONLY
+use_new &= limit > 0;
+#endif
+
  next_byte:
 s->prefix = prefixes;
 b = x86_ldub_code(env, s);
 /* Collect prefixes.  */
 switch (b) {
 default:
+#ifndef CONFIG_USER_ONLY
+use_new &= b <= limit;
+#endif
+if (use_new && 0) {
+disas_insn_new(s, cpu, b);
+return true;
+}
 break;
 case 0x0f:
 b = x86_ldub_code(env, s) + 0x100;
+#ifndef CONFIG_USER_ONLY
+use_new &= b <= limit;
+#endif
+if (use_new && 0) {
+disas_insn_new(s, cpu, b);
+return true;
+}
 break;
 case 0xf3:
 prefixes |= PREFIX_REPZ;
diff --git a/target/i386/tcg/decode-new.c.inc b/target/i386/tcg/decode-new.c.inc
index 426c4594120..3fc6485d74c 100644
--- a/target/i386/tcg/decode-new.c.inc
+++ b/target/i386/tcg/decode-new.c.inc
@@ -1689,6 +1689,9 @@ static void disas_insn_new(DisasContext *s, CPUState 
*cpu, int b)
 X86DecodeFunc decode_func = decode_root;
 uint8_t cc_live;
 
+#ifdef CONFIG_USER_ONLY
+if (limit) { --limit; }
+#endif
 s->has_modrm = false;
 
  next_byte:
-- 
2.44.0




[PATCH for-9.1 02/19] target/i386: use TSTEQ/TSTNE to check flags

2024-04-09 Thread Paolo Bonzini
The new conditions obviously come in handy when testing individual bits
of EFLAGS, and they make it possible to remove the .mask field of
CCPrepare.

Lowering to shift+and is done by the optimizer if necessary.
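
The change of representation can be sketched with ordinary integers. The
model below assumes that a prepared condition is evaluated as
"(reg & mask) compared with imm"; the type and function names are
hypothetical stand-ins rather than QEMU's CCPrepare, and only the EFLAGS bit
values (CC_C = 0x0001, CC_Z = 0x0040) are the architectural ones:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CC_C 0x0001u   /* carry flag bit in EFLAGS */
#define CC_Z 0x0040u   /* zero flag bit in EFLAGS  */

typedef enum { COND_NE, COND_TSTNE } Cond;

/* Simplified stand-in for a prepared condition. */
typedef struct {
    Cond cond;
    uint64_t mask;
    uint64_t imm;
} Prepare;

/* Evaluate the prepared condition against a flags value. */
static bool evaluate(Prepare p, uint64_t reg)
{
    uint64_t v = reg & p.mask;

    switch (p.cond) {
    case COND_NE:
        return v != p.imm;
    case COND_TSTNE:
        return (v & p.imm) != 0;
    }
    return false;
}

/* Old form: mask selects the bits, NE compares against zero. */
static Prepare old_style(uint64_t bits)
{
    return (Prepare){ COND_NE, bits, 0 };
}

/* New form: mask is all ones, TSTNE tests the bits directly. */
static Prepare new_style(uint64_t bits)
{
    return (Prepare){ COND_TSTNE, UINT64_MAX, bits };
}
```

Because the new form always uses an all-ones mask, the mask carries no
information any more, which is what allows the field to be dropped.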

Reviewed-by: Richard Henderson 
Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/translate.c | 32 
 1 file changed, 16 insertions(+), 16 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index b7117393961..4de5090846a 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -996,8 +996,8 @@ static CCPrepare gen_prepare_eflags_c(DisasContext *s, TCGv 
reg)
 case CC_OP_EFLAGS:
 case CC_OP_SARB ... CC_OP_SARQ:
 /* CC_SRC & 1 */
-return (CCPrepare) { .cond = TCG_COND_NE,
- .reg = cpu_cc_src, .mask = CC_C };
+return (CCPrepare) { .cond = TCG_COND_TSTNE,
+ .reg = cpu_cc_src, .mask = -1, .imm = CC_C };
 
 default:
/* The need to compute only C from CC_OP_DYNAMIC is important
@@ -1014,8 +1014,8 @@ static CCPrepare gen_prepare_eflags_c(DisasContext *s, 
TCGv reg)
 static CCPrepare gen_prepare_eflags_p(DisasContext *s, TCGv reg)
 {
 gen_compute_eflags(s);
-return (CCPrepare) { .cond = TCG_COND_NE, .reg = cpu_cc_src,
- .mask = CC_P };
+return (CCPrepare) { .cond = TCG_COND_TSTNE, .reg = cpu_cc_src,
+ .mask = -1, .imm = CC_P };
 }
 
 /* compute eflags.S to reg */
@@ -1029,8 +1029,8 @@ static CCPrepare gen_prepare_eflags_s(DisasContext *s, TCGv reg)
 case CC_OP_ADCX:
 case CC_OP_ADOX:
 case CC_OP_ADCOX:
-return (CCPrepare) { .cond = TCG_COND_NE, .reg = cpu_cc_src,
- .mask = CC_S };
+return (CCPrepare) { .cond = TCG_COND_TSTNE, .reg = cpu_cc_src,
+ .mask = -1, .imm = CC_S };
 case CC_OP_CLR:
 case CC_OP_POPCNT:
 return (CCPrepare) { .cond = TCG_COND_NEVER, .mask = -1 };
@@ -1058,8 +1058,8 @@ static CCPrepare gen_prepare_eflags_o(DisasContext *s, TCGv reg)
  .reg = cpu_cc_src, .mask = -1 };
 default:
 gen_compute_eflags(s);
-return (CCPrepare) { .cond = TCG_COND_NE, .reg = cpu_cc_src,
- .mask = CC_O };
+return (CCPrepare) { .cond = TCG_COND_TSTNE, .reg = cpu_cc_src,
+ .mask = -1, .imm = CC_O };
 }
 }
 
@@ -1074,8 +1074,8 @@ static CCPrepare gen_prepare_eflags_z(DisasContext *s, TCGv reg)
 case CC_OP_ADCX:
 case CC_OP_ADOX:
 case CC_OP_ADCOX:
-return (CCPrepare) { .cond = TCG_COND_NE, .reg = cpu_cc_src,
- .mask = CC_Z };
+return (CCPrepare) { .cond = TCG_COND_TSTNE, .reg = cpu_cc_src,
+ .mask = -1, .imm = CC_Z };
 case CC_OP_CLR:
 return (CCPrepare) { .cond = TCG_COND_ALWAYS, .mask = -1 };
 case CC_OP_POPCNT:
@@ -1153,8 +1153,8 @@ static CCPrepare gen_prepare_cc(DisasContext *s, int b, TCGv reg)
 break;
 case JCC_BE:
 gen_compute_eflags(s);
-cc = (CCPrepare) { .cond = TCG_COND_NE, .reg = cpu_cc_src,
-   .mask = CC_Z | CC_C };
+cc = (CCPrepare) { .cond = TCG_COND_TSTNE, .reg = cpu_cc_src,
+   .mask = -1, .imm = CC_Z | CC_C };
 break;
 case JCC_S:
 cc = gen_prepare_eflags_s(s, reg);
@@ -1168,8 +1168,8 @@ static CCPrepare gen_prepare_cc(DisasContext *s, int b, TCGv reg)
 reg = s->tmp0;
 }
 tcg_gen_addi_tl(reg, cpu_cc_src, CC_O - CC_S);
-cc = (CCPrepare) { .cond = TCG_COND_NE, .reg = reg,
-   .mask = CC_O };
+cc = (CCPrepare) { .cond = TCG_COND_TSTNE, .reg = reg,
+   .mask = -1, .imm = CC_O };
 break;
 default:
 case JCC_LE:
@@ -1178,8 +1178,8 @@ static CCPrepare gen_prepare_cc(DisasContext *s, int b, TCGv reg)
 reg = s->tmp0;
 }
 tcg_gen_addi_tl(reg, cpu_cc_src, CC_O - CC_S);
-cc = (CCPrepare) { .cond = TCG_COND_NE, .reg = reg,
-   .mask = CC_O | CC_Z };
+cc = (CCPrepare) { .cond = TCG_COND_TSTNE, .reg = reg,
+   .mask = -1, .imm = CC_O | CC_Z };
 break;
 }
 break;
-- 
2.44.0




[PATCH for-9.1 08/19] target/i386: allow instructions with more than one immediate

2024-04-09 Thread Paolo Bonzini
While keeping decode->immediate for convenience and for 4-operand instructions,
store the immediate in X86DecodedOp as well.  This enables instructions
with more than one immediate such as ENTER.  It can also be used for far
calls and jumps.
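
To make the idea concrete, here is a minimal standalone sketch of
per-operand immediates, using ENTER (opcode 0xC8, a 16-bit immediate
followed by an 8-bit one) as the example.  SketchOp, SketchInsn and
sketch_decode_enter() are hypothetical and much simpler than
X86DecodedOp/X86DecodedInsn; which operand slots the real decoder assigns
to ENTER, and zero-extending both immediates, are assumptions made here
for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified operand: only the immediate is modelled. */
typedef struct {
    int64_t imm;
} SketchOp;

/* Hypothetical, simplified decoded instruction. */
typedef struct {
    SketchOp op[3];
    int64_t immediate;   /* rightmost immediate, kept for convenience */
} SketchInsn;

/*
 * Decode the two immediates that follow the ENTER opcode byte.
 * Returns the number of immediate bytes consumed.
 */
static size_t sketch_decode_enter(const uint8_t *p, SketchInsn *insn)
{
    /* 16-bit little-endian frame size (slot 1 is an assumption). */
    insn->op[1].imm = (int64_t)(p[0] | (p[1] << 8));
    /* 8-bit nesting level (slot 2 is an assumption). */
    insn->op[2].imm = p[2];
    /* The rightmost immediate is mirrored in the per-insn field. */
    insn->immediate = insn->op[2].imm;
    return 3;
}
```

With a single decode->immediate field, the second immediate would
overwrite the first; storing one value per operand avoids that.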

Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/decode-new.h | 17 -
 target/i386/tcg/decode-new.c.inc |  2 +-
 target/i386/tcg/emit.c.inc   |  4 +++-
 3 files changed, 16 insertions(+), 7 deletions(-)

diff --git a/target/i386/tcg/decode-new.h b/target/i386/tcg/decode-new.h
index 15e6bfef4b1..8ffde8d1cd6 100644
--- a/target/i386/tcg/decode-new.h
+++ b/target/i386/tcg/decode-new.h
@@ -271,16 +271,23 @@ typedef struct X86DecodedOp {
 bool has_ea;
 int offset;   /* For MMX and SSE */
 
-/*
- * This field is used internally by macros OP0_PTR/OP1_PTR/OP2_PTR,
- * do not access directly!
- */
-TCGv_ptr v_ptr;
+union {
+   target_ulong imm;
+/*
+ * This field is used internally by macros OP0_PTR/OP1_PTR/OP2_PTR,
+ * do not access directly!
+ */
+TCGv_ptr v_ptr;
+};
 } X86DecodedOp;
 
 struct X86DecodedInsn {
 X86OpEntry e;
 X86DecodedOp op[3];
+/*
+ * Rightmost immediate, for convenience since most instructions have
+ * one (and also for 4-operand instructions).
+ */
 target_ulong immediate;
 AddressParts mem;
 
diff --git a/target/i386/tcg/decode-new.c.inc b/target/i386/tcg/decode-new.c.inc
index 1e792426ff5..c6fd7a053bd 100644
--- a/target/i386/tcg/decode-new.c.inc
+++ b/target/i386/tcg/decode-new.c.inc
@@ -1473,7 +1473,7 @@ static bool decode_op(DisasContext *s, CPUX86State *env, X86DecodedInsn *decode,
 case X86_TYPE_I:  /* Immediate */
 case X86_TYPE_J:  /* Relative offset for a jump */
 op->unit = X86_OP_IMM;
-decode->immediate = insn_get_signed(env, s, op->ot);
+decode->immediate = op->imm = insn_get_signed(env, s, op->ot);
 break;
 
 case X86_TYPE_L:  /* The upper 4 bits of the immediate select a 128-bit register */
diff --git a/target/i386/tcg/emit.c.inc b/target/i386/tcg/emit.c.inc
index a64186b8957..a27d3040e03 100644
--- a/target/i386/tcg/emit.c.inc
+++ b/target/i386/tcg/emit.c.inc
@@ -259,7 +259,7 @@ static void gen_load(DisasContext *s, X86DecodedInsn *decode, int opn, TCGv v)
 }
 break;
 case X86_OP_IMM:
-tcg_gen_movi_tl(v, decode->immediate);
+tcg_gen_movi_tl(v, op->imm);
 break;
 
 case X86_OP_MMX:
@@ -283,6 +283,8 @@ static void gen_load(DisasContext *s, X86DecodedInsn *decode, int opn, TCGv v)
 static TCGv_ptr op_ptr(X86DecodedInsn *decode, int opn)
 {
 X86DecodedOp *op = &decode->op[opn];
+
+assert (op->unit == X86_OP_MMX || op->unit == X86_OP_SSE);
 if (op->v_ptr) {
 return op->v_ptr;
 }
-- 
2.44.0




[PATCH for-9.1 13/19] target/i386: move remaining conditional operations to new decoder

2024-04-09 Thread Paolo Bonzini
Move long-displacement Jcc, SETcc and CMOVcc to the new decoder.
While filling in the tables makes the code seem longer, the new
emitters are all just one line of code.
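
All three instruction families share the same condition encoding in the
low four bits of the opcode, which is part of why the emitters can be so
short.  The sketch below evaluates that encoding on a concrete EFLAGS
value; sketch_eval_cc() is a hypothetical helper written for this note,
not a function from the patch, and it works on plain integers rather than
generating TCG ops.

```c
#include <stdbool.h>
#include <stdint.h>

/* Architectural x86 EFLAGS bits. */
#define CC_C 0x0001u
#define CC_P 0x0004u
#define CC_Z 0x0040u
#define CC_S 0x0080u
#define CC_O 0x0800u

/*
 * Evaluate the condition selected by the low nibble of a Jcc, SETcc
 * or CMOVcc opcode.  Bits 3..1 pick the base condition and bit 0
 * inverts it.
 */
static bool sketch_eval_cc(uint32_t eflags, unsigned opcode)
{
    bool sf = (eflags & CC_S) != 0;
    bool of = (eflags & CC_O) != 0;
    bool zf = (eflags & CC_Z) != 0;
    bool cf = (eflags & CC_C) != 0;
    bool r;

    switch ((opcode >> 1) & 7) {
    case 0:  r = of;                break;   /* O  */
    case 1:  r = cf;                break;   /* B  */
    case 2:  r = zf;                break;   /* Z  */
    case 3:  r = cf || zf;          break;   /* BE */
    case 4:  r = sf;                break;   /* S  */
    case 5:  r = (eflags & CC_P) != 0; break; /* P  */
    case 6:  r = sf != of;          break;   /* L  */
    default: r = zf || (sf != of);  break;   /* LE */
    }
    return (opcode & 1) ? !r : r;
}
```

For example, 0x84 (JZ), 0x94 (SETZ) and 0x44 (CMOVZ) all reduce to the
same test of CC_Z.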

Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/decode-new.h |  1 +
 target/i386/tcg/translate.c  |  2 +-
 target/i386/tcg/decode-new.c.inc | 56 
 target/i386/tcg/emit.c.inc   | 10 ++
 4 files changed, 68 insertions(+), 1 deletion(-)

diff --git a/target/i386/tcg/decode-new.h b/target/i386/tcg/decode-new.h
index 77bb31eb143..cd7ceca21e8 100644
--- a/target/i386/tcg/decode-new.h
+++ b/target/i386/tcg/decode-new.h
@@ -106,6 +106,7 @@ typedef enum X86CPUIDFeature {
 X86_FEAT_AVX2,
 X86_FEAT_BMI1,
 X86_FEAT_BMI2,
+X86_FEAT_CMOV,
 X86_FEAT_CMPCCXADD,
 X86_FEAT_F16C,
 X86_FEAT_FMA,
diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 26e4c7520db..f3c437aee88 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -3209,7 +3209,7 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 #ifndef CONFIG_USER_ONLY
 use_new &= b <= limit;
 #endif
-if (use_new && 0) {
+if (use_new && (b >= 0x138 && b <= 0x19f)) {
 disas_insn_new(s, cpu, b);
 return true;
 }
diff --git a/target/i386/tcg/decode-new.c.inc b/target/i386/tcg/decode-new.c.inc
index 87ae63faf9a..36eb53515af 100644
--- a/target/i386/tcg/decode-new.c.inc
+++ b/target/i386/tcg/decode-new.c.inc
@@ -993,6 +993,15 @@ static const X86OpEntry opcodes_0F[256] = {
 /* Incorrectly listed as Mq,Vq in the manual */
 [0x17] = X86_OP_ENTRY3(VMOVHPx_st,  M,q, None,None, V,dq, vex5 p_00_66),
 
+[0x40] = X86_OP_ENTRY2(CMOVcc, G,v, E,v, cpuid(CMOV)),
+[0x41] = X86_OP_ENTRY2(CMOVcc, G,v, E,v, cpuid(CMOV)),
+[0x42] = X86_OP_ENTRY2(CMOVcc, G,v, E,v, cpuid(CMOV)),
+[0x43] = X86_OP_ENTRY2(CMOVcc, G,v, E,v, cpuid(CMOV)),
+[0x44] = X86_OP_ENTRY2(CMOVcc, G,v, E,v, cpuid(CMOV)),
+[0x45] = X86_OP_ENTRY2(CMOVcc, G,v, E,v, cpuid(CMOV)),
+[0x46] = X86_OP_ENTRY2(CMOVcc, G,v, E,v, cpuid(CMOV)),
+[0x47] = X86_OP_ENTRY2(CMOVcc, G,v, E,v, cpuid(CMOV)),
+
 [0x50] = X86_OP_ENTRY3(MOVMSK, G,y, None,None, U,x, vex7 p_00_66),
 [0x51] = X86_OP_GROUP3(sse_unary,  V,x, H,x, W,x, vex2_rep3 p_00_66_f3_f2), /* sqrtps */
 [0x52] = X86_OP_GROUP3(sse_unary,  V,x, H,x, W,x, vex4_rep5 p_00_f3), /* rsqrtps */
@@ -1020,6 +1029,24 @@ static const X86OpEntry opcodes_0F[256] = {
 [0x76] = X86_OP_ENTRY3(PCMPEQD,V,x, H,x, W,x,  vex4 mmx avx2_256 p_00_66),
 [0x77] = X86_OP_GROUP0(0F77),
 
+[0x80] = X86_OP_ENTRYr(Jcc, J,z_f64),
+[0x81] = X86_OP_ENTRYr(Jcc, J,z_f64),
+[0x82] = X86_OP_ENTRYr(Jcc, J,z_f64),
+[0x83] = X86_OP_ENTRYr(Jcc, J,z_f64),
+[0x84] = X86_OP_ENTRYr(Jcc, J,z_f64),
+[0x85] = X86_OP_ENTRYr(Jcc, J,z_f64),
+[0x86] = X86_OP_ENTRYr(Jcc, J,z_f64),
+[0x87] = X86_OP_ENTRYr(Jcc, J,z_f64),
+
+[0x90] = X86_OP_ENTRYw(SETcc, E,b),
+[0x91] = X86_OP_ENTRYw(SETcc, E,b),
+[0x92] = X86_OP_ENTRYw(SETcc, E,b),
+[0x93] = X86_OP_ENTRYw(SETcc, E,b),
+[0x94] = X86_OP_ENTRYw(SETcc, E,b),
+[0x95] = X86_OP_ENTRYw(SETcc, E,b),
+[0x96] = X86_OP_ENTRYw(SETcc, E,b),
+[0x97] = X86_OP_ENTRYw(SETcc, E,b),
+
 [0x28] = X86_OP_ENTRY3(MOVDQ,  V,x,  None,None, W,x, vex1 p_00_66), /* MOVAPS */
 [0x29] = X86_OP_ENTRY3(MOVDQ,  W,x,  None,None, V,x, vex1 p_00_66), /* MOVAPS */
 [0x2A] = X86_OP_GROUP0(0F2A),
@@ -1032,6 +1059,15 @@ static const X86OpEntry opcodes_0F[256] = {
 [0x38] = X86_OP_GROUP0(0F38),
 [0x3a] = X86_OP_GROUP0(0F3A),
 
+[0x48] = X86_OP_ENTRY2(CMOVcc, G,v, E,v, cpuid(CMOV)),
+[0x49] = X86_OP_ENTRY2(CMOVcc, G,v, E,v, cpuid(CMOV)),
+[0x4a] = X86_OP_ENTRY2(CMOVcc, G,v, E,v, cpuid(CMOV)),
+[0x4b] = X86_OP_ENTRY2(CMOVcc, G,v, E,v, cpuid(CMOV)),
+[0x4c] = X86_OP_ENTRY2(CMOVcc, G,v, E,v, cpuid(CMOV)),
+[0x4d] = X86_OP_ENTRY2(CMOVcc, G,v, E,v, cpuid(CMOV)),
+[0x4e] = X86_OP_ENTRY2(CMOVcc, G,v, E,v, cpuid(CMOV)),
+[0x4f] = X86_OP_ENTRY2(CMOVcc, G,v, E,v, cpuid(CMOV)),
+
 [0x58] = X86_OP_ENTRY3(VADD,   V,x, H,x, W,x, vex2_rep3 p_00_66_f3_f2),
 [0x59] = X86_OP_ENTRY3(VMUL,   V,x, H,x, W,x, vex2_rep3 p_00_66_f3_f2),
 [0x5a] = X86_OP_GROUP0(0F5A),
@@ -1057,6 +1093,24 @@ static const X86OpEntry opcodes_0F[256] = {
 [0x7e] = X86_OP_GROUP0(0F7E),
 [0x7f] = X86_OP_GROUP0(0F7F),
 
+[0x88] = X86_OP_ENTRYr(Jcc, J,z_f64),
+[0x89] = X86_OP_ENTRYr(Jcc, J,z_f64),
+[0x8a] = X86_OP_ENTRYr(Jcc, J,z_f64),
+[0x8b] = X86_OP_ENTRYr(Jcc, J,z_f64),
+[0x8c] = X86_OP_ENTRYr(Jcc, J,z_f64),
+[0x8d] = X86_OP_ENTRYr(Jcc, J,z_f64),
+[0x8e] = X86_OP_ENTRYr(Jcc, J,z_f64),
+[0x8f] = X86_OP_ENTRYr(Jcc, J,z_f64),
+
+[0x98] = X86_OP_ENTRYw(SETcc, E,b),
+[0x99] = 

[PATCH for-9.1 06/19] target/i386: move 00-5F opcodes to new decoder

2024-04-09 Thread Paolo Bonzini
Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/translate.c  |   2 +-
 target/i386/tcg/decode-new.c.inc | 120 ++
 target/i386/tcg/emit.c.inc   | 202 +++
 3 files changed, 323 insertions(+), 1 deletion(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 2a372842db4..e501d4701b6 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -3148,7 +3148,7 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 #ifndef CONFIG_USER_ONLY
 use_new &= b <= limit;
 #endif
-if (use_new && 0) {
+if (use_new && b <= 0x5f) {
 disas_insn_new(s, cpu, b);
 return true;
 }
diff --git a/target/i386/tcg/decode-new.c.inc b/target/i386/tcg/decode-new.c.inc
index 3fc6485d74c..1e792426ff5 100644
--- a/target/i386/tcg/decode-new.c.inc
+++ b/target/i386/tcg/decode-new.c.inc
@@ -121,6 +121,8 @@
 
 #define X86_OP_GROUP2(op, op0, s0, op1, s1, ...)  \
 X86_OP_GROUP3(op, op0, s0, 2op, s0, op1, s1, ## __VA_ARGS__)
+#define X86_OP_GROUPw(op, op0, s0, ...)   \
+X86_OP_GROUP3(op, op0, s0, None, None, None, None, ## __VA_ARGS__)
 #define X86_OP_GROUP0(op, ...)\
 X86_OP_GROUP3(op, None, None, None, None, None, None, ## __VA_ARGS__)
 
@@ -140,12 +142,23 @@
 .op3 = X86_TYPE_I, .s3 = X86_SIZE_b,  \
 ## __VA_ARGS__)
 
+/*
+ * Short forms that are mostly useful for ALU opcodes and other
+ * one-byte opcodes.  For vector instructions it is usually
+ * clearer to write all three operands explicitly, because the
+ * corresponding gen_* function will use OP_PTRn rather than s->T0
+ * and s->T1.
+ */
+#define X86_OP_ENTRYrr(op, op0, s0, op1, s1, ...) \
+X86_OP_ENTRY3(op, None, None, op0, s0, op1, s1, ## __VA_ARGS__)
 #define X86_OP_ENTRY2(op, op0, s0, op1, s1, ...)  \
 X86_OP_ENTRY3(op, op0, s0, 2op, s0, op1, s1, ## __VA_ARGS__)
 #define X86_OP_ENTRYw(op, op0, s0, ...)   \
 X86_OP_ENTRY3(op, op0, s0, None, None, None, None, ## __VA_ARGS__)
 #define X86_OP_ENTRYr(op, op0, s0, ...)   \
 X86_OP_ENTRY3(op, None, None, None, None, op0, s0, ## __VA_ARGS__)
+#define X86_OP_ENTRY1(op, op0, s0, ...)   \
+X86_OP_ENTRY3(op, op0, s0, 2op, s0, None, None, ## __VA_ARGS__)
 #define X86_OP_ENTRY0(op, ...)\
 X86_OP_ENTRY3(op, None, None, None, None, None, None, ## __VA_ARGS__)
 
@@ -1096,7 +1109,114 @@ static void decode_0F(DisasContext *s, CPUX86State *env, X86OpEntry *entry, uint
 }
 
 static const X86OpEntry opcodes_root[256] = {
+[0x00] = X86_OP_ENTRY2(ADD, E,b, G,b, lock),
+[0x01] = X86_OP_ENTRY2(ADD, E,v, G,v, lock),
+[0x02] = X86_OP_ENTRY2(ADD, G,b, E,b, lock),
+[0x03] = X86_OP_ENTRY2(ADD, G,v, E,v, lock),
+[0x04] = X86_OP_ENTRY2(ADD, 0,b, I,b, lock),   /* AL, Ib */
+[0x05] = X86_OP_ENTRY2(ADD, 0,v, I,z, lock),   /* rAX, Iz */
+[0x06] = X86_OP_ENTRYr(PUSH, ES, w, chk(i64)),
+[0x07] = X86_OP_ENTRYw(POP, ES, w, chk(i64)),
+
+[0x10] = X86_OP_ENTRY2(ADC, E,b, G,b, lock),
+[0x11] = X86_OP_ENTRY2(ADC, E,v, G,v, lock),
+[0x12] = X86_OP_ENTRY2(ADC, G,b, E,b, lock),
+[0x13] = X86_OP_ENTRY2(ADC, G,v, E,v, lock),
+[0x14] = X86_OP_ENTRY2(ADC, 0,b, I,b, lock),   /* AL, Ib */
+[0x15] = X86_OP_ENTRY2(ADC, 0,v, I,z, lock),   /* rAX, Iz */
+[0x16] = X86_OP_ENTRYr(PUSH, SS, w, chk(i64)),
+[0x17] = X86_OP_ENTRYw(POP, SS, w, chk(i64)),
+
+[0x20] = X86_OP_ENTRY2(AND, E,b, G,b, lock),
+[0x21] = X86_OP_ENTRY2(AND, E,v, G,v, lock),
+[0x22] = X86_OP_ENTRY2(AND, G,b, E,b, lock),
+[0x23] = X86_OP_ENTRY2(AND, G,v, E,v, lock),
+[0x24] = X86_OP_ENTRY2(AND, 0,b, I,b, lock),   /* AL, Ib */
+[0x25] = X86_OP_ENTRY2(AND, 0,v, I,z, lock),   /* rAX, Iz */
+[0x26] = {},
+[0x27] = X86_OP_ENTRY0(DAA, chk(i64)),
+
+[0x30] = X86_OP_ENTRY2(XOR, E,b, G,b, lock),
+[0x31] = X86_OP_ENTRY2(XOR, E,v, G,v, lock),
+[0x32] = X86_OP_ENTRY2(XOR, G,b, E,b, lock),
+[0x33] = X86_OP_ENTRY2(XOR, G,v, E,v, lock),
+[0x34] = X86_OP_ENTRY2(XOR, 0,b, I,b, lock),   /* AL, Ib */
+[0x35] = X86_OP_ENTRY2(XOR, 0,v, I,z, lock),   /* rAX, Iz */
+[0x36] = {},
+[0x37] = X86_OP_ENTRY0(AAA, chk(i64)),
+
+[0x40] = X86_OP_ENTRY1(INC, 0,v, chk(i64)),
+[0x41] = X86_OP_ENTRY1(INC, 1,v, chk(i64)),
+[0x42] = X86_OP_ENTRY1(INC, 2,v, chk(i64)),
+[0x43] = X86_OP_ENTRY1(INC, 3,v, chk(i64)),
+[0x44] = X86_OP_ENTRY1(INC, 4,v, chk(i64)),
+[0x45] = X86_OP_ENTRY1(INC, 5,v, chk(i64)),
+[0x46] = X86_OP_ENTRY1(INC, 6,v, chk(i64)),
+[0x47] = X86_OP_ENTRY1(INC, 7,v, chk(i64)),
+
+[0x50] = X86_OP_ENTRYr(PUSH, LoBits,d64),
+[0x51] = X86_OP_ENTRYr(PUSH, LoBits,d64),
+[0x52] = X86_OP_ENTRYr(PUSH, LoBits,d64),
+[0x

[PATCH for-9.1 11/19] target/i386: move C0-FF opcodes to new decoder (except for x87)

2024-04-09 Thread Paolo Bonzini
The shift instructions are rewritten instead of reusing code from the old
decoder.  Rotates use CC_OP_ADCOX more extensively and generally rely
more on the optimizer, so that the code generators are shared between
the immediate-count and variable-count cases.

In particular, this makes gen_RCL and gen_RCR pretty efficient for the
count == 1 case, which becomes (apart from a few extra movs) something like:

  (compute_cc_all if needed)
  // save old value for OF calculation
  mov cc_src2, T0
  // the bulk of RCL is just this!
  deposit T0, cc_src, T0, 1, TARGET_LONG_BITS - 1
  // compute carry
  shr cc_dst, cc_src2, length - 1
  and cc_dst, cc_dst, 1
  // compute overflow
  xor cc_src2, cc_src2, T0
  extract cc_src2, cc_src2, length - 1, 1

32-bit MUL and IMUL are also slightly more efficient on 64-bit hosts.
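
The count == 1 RCL sequence above can be checked against a plain C model.
The sketch below is only a reference model for the 32-bit case, assuming
the usual x86 definition of RCL by one; SketchRcl and sketch_rcl1_32() are
hypothetical names, and the code mirrors the three steps of the listing
(deposit, carry, overflow) rather than the actual gen_RCL emitter.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result bundle: rotated value plus the two flags. */
typedef struct {
    uint32_t value;
    bool cf;
    bool of;
} SketchRcl;

/* Reference model of a 32-bit RCL by exactly one bit. */
static SketchRcl sketch_rcl1_32(uint32_t x, bool cf_in)
{
    SketchRcl r;

    /* "deposit": old value shifted left, old carry becomes bit 0. */
    r.value = (x << 1) | (cf_in ? 1u : 0u);
    /* New carry is the bit rotated out of the top. */
    r.cf = ((x >> 31) & 1) != 0;
    /* Overflow is the top bit of (old ^ new). */
    r.of = (((x ^ r.value) >> 31) & 1) != 0;
    return r;
}
```

Taking the top bit of old ^ new is equivalent to the architectural
"MSB of result XOR new CF", because the new CF is the old MSB.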

Signed-off-by: Paolo Bonzini 
---
 include/tcg/tcg.h|   6 +
 target/i386/tcg/decode-new.h |   2 +
 target/i386/tcg/translate.c  |  23 +-
 target/i386/tcg/decode-new.c.inc | 157 -
 target/i386/tcg/emit.c.inc   | 996 ++-
 5 files changed, 1176 insertions(+), 8 deletions(-)

diff --git a/include/tcg/tcg.h b/include/tcg/tcg.h
index 05a1912f8a3..88653c4f824 100644
--- a/include/tcg/tcg.h
+++ b/include/tcg/tcg.h
@@ -105,6 +105,12 @@ typedef uint64_t TCGRegSet;
 /* Turn some undef macros into true macros.  */
 #define TCG_TARGET_HAS_add2_i32 1
 #define TCG_TARGET_HAS_sub2_i32 1
+/* Define parameterized _tl macros.  */
+#define TCG_TARGET_deposit_tl_valid TCG_TARGET_deposit_i32_valid
+#define TCG_TARGET_extract_tl_valid TCG_TARGET_extract_i32_valid
+#else
+#define TCG_TARGET_deposit_tl_valid TCG_TARGET_deposit_i64_valid
+#define TCG_TARGET_extract_tl_valid TCG_TARGET_extract_i64_valid
 #endif
 
 #ifndef TCG_TARGET_deposit_i32_valid
diff --git a/target/i386/tcg/decode-new.h b/target/i386/tcg/decode-new.h
index ca99a620ce9..77bb31eb143 100644
--- a/target/i386/tcg/decode-new.h
+++ b/target/i386/tcg/decode-new.h
@@ -48,6 +48,7 @@ typedef enum X86OpType {
 
 /* Custom */
 X86_TYPE_WM, /* modrm byte selects an XMM/YMM memory operand */
+X86_TYPE_I_unsigned, /* Immediate, zero-extended */
 X86_TYPE_2op, /* 2-operand RMW instruction */
 X86_TYPE_LoBits, /* encoded in bits 0-2 of the operand + REX.B */
 X86_TYPE_0, /* Hard-coded GPRs (RAX..RDI) */
@@ -88,6 +89,7 @@ typedef enum X86OpSize {
 X86_SIZE_x,  /* 128/256-bit, based on operand size */
 X86_SIZE_y,  /* 32/64-bit, based on operand size */
 X86_SIZE_z,  /* 16-bit for 16-bit operand size, else 32-bit */
+X86_SIZE_z_f64,  /* 32-bit for 32-bit operand size or 64-bit mode, else 16-bit */
 
 /* Custom */
 X86_SIZE_d64,
diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 8a34e50c452..720668e023a 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -38,6 +38,9 @@
 #include "exec/helper-info.c.inc"
 #undef  HELPER_H
 
+/* Fixes for Windows namespace pollution.  */
+#undef IN
+#undef OUT
 
 #define PREFIX_REPZ   0x01
 #define PREFIX_REPNZ  0x02
@@ -2495,14 +2498,24 @@ static inline int insn_const_size(MemOp ot)
 }
 }
 
+static void gen_conditional_jump_labels(DisasContext *s, target_long diff,
+TCGLabel *not_taken, TCGLabel *taken)
+{
+if (not_taken) {
+gen_set_label(not_taken);
+}
+gen_jmp_rel_csize(s, 0, 1);
+
+gen_set_label(taken);
+gen_jmp_rel(s, s->dflag, diff, 0);
+}
+
 static void gen_jcc(DisasContext *s, int b, int diff)
 {
 TCGLabel *l1 = gen_new_label();
 
 gen_jcc1(s, b, l1);
-gen_jmp_rel_csize(s, 0, 1);
-gen_set_label(l1);
-gen_jmp_rel(s, s->dflag, diff, 0);
+gen_conditional_jump_labels(s, diff, NULL, l1);
 }
 
 static void gen_cmovcc1(DisasContext *s, int b, TCGv dest, TCGv src)
@@ -2759,7 +2772,7 @@ static void gen_unknown_opcode(CPUX86State *env, DisasContext *s)
 
 /* an interrupt is different from an exception because of the
privilege checks */
-static void gen_interrupt(DisasContext *s, int intno)
+static void gen_interrupt(DisasContext *s, uint8_t intno)
 {
 gen_update_cc_op(s);
 gen_update_eip_cur(s);
@@ -3186,7 +3199,7 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 #ifndef CONFIG_USER_ONLY
 use_new &= b <= limit;
 #endif
-if (use_new && b <= 0xbf) {
+if (use_new && (b < 0xd8 || b >= 0xe0)) {
 disas_insn_new(s, cpu, b);
 return true;
 }
diff --git a/target/i386/tcg/decode-new.c.inc b/target/i386/tcg/decode-new.c.inc
index f6d6873dd83..87ae63faf9a 100644
--- a/target/i386/tcg/decode-new.c.inc
+++ b/target/i386/tcg/decode-new.c.inc
@@ -40,6 +40,15 @@
  * and the instructions are illegal in 64-bit mode, so the choice of "Ip"
  * is somewhat arbitrary; "Iv" or "Iz" would work just as well.
  *
+

[PATCH for-9.1 16/19] target/i386: remove now-converted opcodes from old decoder

2024-04-09 Thread Paolo Bonzini
Send all converted opcodes to disas_insn_new() directly from the big
decoding switch statement; once more, the debugging/bisecting logic
disappears.

Signed-off-by: Paolo Bonzini 
---
 target/i386/helper.h|   11 -
 target/i386/tcg/shift_helper_template.h.inc |  108 -
 target/i386/tcg/int_helper.c|   34 -
 target/i386/tcg/translate.c | 2172 +--
 target/i386/tcg/decode-new.c.inc|3 -
 5 files changed, 11 insertions(+), 2317 deletions(-)
 delete mode 100644 target/i386/tcg/shift_helper_template.h.inc

diff --git a/target/i386/helper.h b/target/i386/helper.h
index ac2b04abd63..3c207ac62d6 100644
--- a/target/i386/helper.h
+++ b/target/i386/helper.h
@@ -207,15 +207,4 @@ DEF_HELPER_1(emms, void, env)
 #define SHIFT 2
 #include "tcg/ops_sse_header.h.inc"
 
-DEF_HELPER_3(rclb, tl, env, tl, tl)
-DEF_HELPER_3(rclw, tl, env, tl, tl)
-DEF_HELPER_3(rcll, tl, env, tl, tl)
-DEF_HELPER_3(rcrb, tl, env, tl, tl)
-DEF_HELPER_3(rcrw, tl, env, tl, tl)
-DEF_HELPER_3(rcrl, tl, env, tl, tl)
-#ifdef TARGET_X86_64
-DEF_HELPER_3(rclq, tl, env, tl, tl)
-DEF_HELPER_3(rcrq, tl, env, tl, tl)
-#endif
-
 DEF_HELPER_1(rdrand, tl, env)
diff --git a/target/i386/tcg/shift_helper_template.h.inc b/target/i386/tcg/shift_helper_template.h.inc
deleted file mode 100644
index 54f15d6e05c..00000000000
--- a/target/i386/tcg/shift_helper_template.h.inc
+++ /dev/null
@@ -1,108 +0,0 @@
-/*
- *  x86 shift helpers
- *
- *  Copyright (c) 2008 Fabrice Bellard
- *
- * This library is free software; you can redistribute it and/or
- * modify it under the terms of the GNU Lesser General Public
- * License as published by the Free Software Foundation; either
- * version 2.1 of the License, or (at your option) any later version.
- *
- * This library is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
- * Lesser General Public License for more details.
- *
- * You should have received a copy of the GNU Lesser General Public
- * License along with this library; if not, see <http://www.gnu.org/licenses/>.
- */
-
-#define DATA_BITS (1 << (3 + SHIFT))
-#define SHIFT_MASK (DATA_BITS - 1)
-#if DATA_BITS <= 32
-#define SHIFT1_MASK 0x1f
-#else
-#define SHIFT1_MASK 0x3f
-#endif
-
-#if DATA_BITS == 8
-#define SUFFIX b
-#define DATA_MASK 0xff
-#elif DATA_BITS == 16
-#define SUFFIX w
-#define DATA_MASK 0xffff
-#elif DATA_BITS == 32
-#define SUFFIX l
-#define DATA_MASK 0xffffffff
-#elif DATA_BITS == 64
-#define SUFFIX q
-#define DATA_MASK 0xffffffffffffffffULL
-#else
-#error unhandled operand size
-#endif
-
-target_ulong glue(helper_rcl, SUFFIX)(CPUX86State *env, target_ulong t0,
-  target_ulong t1)
-{
-int count, eflags;
-target_ulong src;
-target_long res;
-
-count = t1 & SHIFT1_MASK;
-#if DATA_BITS == 16
-count = rclw_table[count];
-#elif DATA_BITS == 8
-count = rclb_table[count];
-#endif
-if (count) {
-eflags = env->cc_src;
-t0 &= DATA_MASK;
-src = t0;
-res = (t0 << count) | ((target_ulong)(eflags & CC_C) << (count - 1));
-if (count > 1) {
-res |= t0 >> (DATA_BITS + 1 - count);
-}
-t0 = res;
-env->cc_src = (eflags & ~(CC_C | CC_O)) |
-(lshift(src ^ t0, 11 - (DATA_BITS - 1)) & CC_O) |
-((src >> (DATA_BITS - count)) & CC_C);
-}
-return t0;
-}
-
-target_ulong glue(helper_rcr, SUFFIX)(CPUX86State *env, target_ulong t0,
-  target_ulong t1)
-{
-int count, eflags;
-target_ulong src;
-target_long res;
-
-count = t1 & SHIFT1_MASK;
-#if DATA_BITS == 16
-count = rclw_table[count];
-#elif DATA_BITS == 8
-count = rclb_table[count];
-#endif
-if (count) {
-eflags = env->cc_src;
-t0 &= DATA_MASK;
-src = t0;
-res = (t0 >> count) |
-((target_ulong)(eflags & CC_C) << (DATA_BITS - count));
-if (count > 1) {
-res |= t0 << (DATA_BITS + 1 - count);
-}
-t0 = res;
-env->cc_src = (eflags & ~(CC_C | CC_O)) |
-(lshift(src ^ t0, 11 - (DATA_BITS - 1)) & CC_O) |
-((src >> (count - 1)) & CC_C);
-}
-return t0;
-}
-
-#undef DATA_BITS
-#undef SHIFT_MASK
-#undef SHIFT1_MASK
-#undef DATA_TYPE
-#undef DATA_MASK
-#undef SUFFIX
diff --git a/target/i386/tcg/int_helper.c b/target/i386/tcg/int_helper.c
index ab85dc55400..df16130f5df 100644
--- a/target/i386/tcg/int_helper.c
+++ b/target/i386/tcg/int_helper.c
@@ -29,22 +29,6 @@
 
 //#define DEBUG_MULDIV
 
-/* modulo 9 table */
-static const uint8_t rclb_table[32] = {
-0, 1, 2, 3, 4, 5, 6, 7,
-8, 0, 1, 2, 3, 4, 5, 6,
-7, 8, 0, 1, 2, 3, 4, 5,
-6

[PATCH for-9.1 14/19] target/i386: move BSWAP to new decoder

2024-04-09 Thread Paolo Bonzini
Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/translate.c  |  4 +++-
 target/i386/tcg/decode-new.c.inc |  9 +
 target/i386/tcg/emit.c.inc   | 11 +++
 3 files changed, 23 insertions(+), 1 deletion(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index f3c437aee88..a1e6e8ec7d9 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -3209,7 +3209,9 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 #ifndef CONFIG_USER_ONLY
 use_new &= b <= limit;
 #endif
-if (use_new && (b >= 0x138 && b <= 0x19f)) {
+if (use_new &&
+   ((b >= 0x138 && b <= 0x19f) ||
+ (b >= 0x1c8 && b <= 0x1cf))) {
 disas_insn_new(s, cpu, b);
 return true;
 }
diff --git a/target/i386/tcg/decode-new.c.inc b/target/i386/tcg/decode-new.c.inc
index 36eb53515af..2ee949b50e2 100644
--- a/target/i386/tcg/decode-new.c.inc
+++ b/target/i386/tcg/decode-new.c.inc
@@ -1118,6 +1118,15 @@ static const X86OpEntry opcodes_0F[256] = {
 [0xc5] = X86_OP_ENTRY3(PEXTRW, G,d, U,dq,I,b,   vex5 mmx p_00_66),
 [0xc6] = X86_OP_ENTRY4(VSHUF,  V,x, H,x, W,x,   vex4 p_00_66),
 
+[0xc8] = X86_OP_ENTRY1(BSWAP, LoBits,y),
+[0xc9] = X86_OP_ENTRY1(BSWAP, LoBits,y),
+[0xca] = X86_OP_ENTRY1(BSWAP, LoBits,y),
+[0xcb] = X86_OP_ENTRY1(BSWAP, LoBits,y),
+[0xcc] = X86_OP_ENTRY1(BSWAP, LoBits,y),
+[0xcd] = X86_OP_ENTRY1(BSWAP, LoBits,y),
+[0xce] = X86_OP_ENTRY1(BSWAP, LoBits,y),
+[0xcf] = X86_OP_ENTRY1(BSWAP, LoBits,y),
+
 [0xd0] = X86_OP_ENTRY3(VADDSUB,   V,x, H,x, W,x,vex2 cpuid(SSE3) p_66_f2),
 [0xd1] = X86_OP_ENTRY3(PSRLW_r,   V,x, H,x, W,x,vex4 mmx avx2_256 p_00_66),
 [0xd2] = X86_OP_ENTRY3(PSRLD_r,   V,x, H,x, W,x,vex4 mmx avx2_256 p_00_66),
diff --git a/target/i386/tcg/emit.c.inc b/target/i386/tcg/emit.c.inc
index dc5142be51f..1dc246f8c1e 100644
--- a/target/i386/tcg/emit.c.inc
+++ b/target/i386/tcg/emit.c.inc
@@ -1299,6 +1299,17 @@ static void gen_BOUND(DisasContext *s, CPUX86State *env, X86DecodedInsn *decode)
 }
 }
 
+static void gen_BSWAP(DisasContext *s, CPUX86State *env, X86DecodedInsn *decode)
+{
+#ifdef TARGET_X86_64
+if (s->dflag == MO_64) {
+tcg_gen_bswap64_i64(s->T0, s->T0);
+return;
+}
+#endif
+tcg_gen_bswap32_tl(s->T0, s->T0, TCG_BSWAP_OZ);
+}
+
 static void gen_BZHI(DisasContext *s, CPUX86State *env, X86DecodedInsn *decode)
 {
 MemOp ot = decode->op[0].ot;
-- 
2.44.0




[PATCH for-9.1 15/19] target/i386: port extensions of one-byte opcodes to new decoder

2024-04-09 Thread Paolo Bonzini
A few two-byte opcodes are simple extensions of existing one-byte opcodes;
they are easy to decode and need no change to emit.c.inc.  Port them to
the new decoder.
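
One detail of the dispatch test added to disas_insn() is worth spelling
out: (b & ~9) == 0x1a0 ignores bits 0 and 3 of the opcode, so a single
comparison covers 0x1a0, 0x1a1, 0x1a8 and 0x1a9, i.e. PUSH/POP of FS and
GS.  The helper below is a hypothetical standalone restatement of that
check, written only to illustrate the mask.

```c
#include <stdbool.h>

/*
 * True for the four two-byte opcodes 0F A0, 0F A1, 0F A8 and 0F A9
 * (PUSH FS, POP FS, PUSH GS, POP GS), using the 0x1xx numbering that
 * disas_insn() uses for the 0F map.
 */
static bool sketch_is_push_pop_fs_gs(int b)
{
    /* ~9 clears bit 0 (push vs. pop) and bit 3 (FS vs. GS). */
    return (b & ~9) == 0x1a0;
}
```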

Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/decode-new.h |  1 +
 target/i386/tcg/translate.c  |  4 
 target/i386/tcg/decode-new.c.inc | 27 +++
 target/i386/tcg/emit.c.inc   | 15 +++
 4 files changed, 47 insertions(+)

diff --git a/target/i386/tcg/decode-new.h b/target/i386/tcg/decode-new.h
index cd7ceca21e8..2ea06b44787 100644
--- a/target/i386/tcg/decode-new.h
+++ b/target/i386/tcg/decode-new.h
@@ -47,6 +47,7 @@ typedef enum X86OpType {
 X86_TYPE_Y, /* string destination */
 
 /* Custom */
+X86_TYPE_EM, /* modrm byte selects an ALU memory operand */
 X86_TYPE_WM, /* modrm byte selects an XMM/YMM memory operand */
 X86_TYPE_I_unsigned, /* Immediate, zero-extended */
 X86_TYPE_2op, /* 2-operand RMW instruction */
diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index a1e6e8ec7d9..e8352d43678 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -3211,6 +3211,10 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 #endif
 if (use_new &&
((b >= 0x138 && b <= 0x19f) ||
+ (b & ~9) == 0x1a0 ||
+ b == 0x1af || b == 0x1b2 ||
+ (b >= 0x1b4 && b <= 0x1b7) ||
+ b == 0x1be || b == 0x1bf || b == 0x1c3 ||
  (b >= 0x1c8 && b <= 0x1cf))) {
 disas_insn_new(s, cpu, b);
 return true;
diff --git a/target/i386/tcg/decode-new.c.inc b/target/i386/tcg/decode-new.c.inc
index 2ee949b50e2..2e27d28dc95 100644
--- a/target/i386/tcg/decode-new.c.inc
+++ b/target/i386/tcg/decode-new.c.inc
@@ -43,6 +43,12 @@
  * Operand types
  * -------------
  *
+ * For memory-only operands, if the emitter function wants to rely on
+ * generic load and writeback, the decoder needs to know the type of the
+ * operand.  Therefore, M is often replaced by the more specific EM and WM
+ * (respectively selecting an ALU operand, like the operand type E, or a
+ * vector operand like the operand type W).
+ *
  * Immediates are almost always signed or masked away in helpers.  Two
  * common exceptions are IN/OUT and absolute jumps.  For these, there is
  * an additional custom operand type "I_unsigned".  Alternatively, the
@@ -1047,6 +1053,9 @@ static const X86OpEntry opcodes_0F[256] = {
 [0x96] = X86_OP_ENTRYw(SETcc, E,b),
 [0x97] = X86_OP_ENTRYw(SETcc, E,b),
 
+[0xa0] = X86_OP_ENTRYr(PUSH, FS, w),
+[0xa1] = X86_OP_ENTRYw(POP, FS, w),
+
 [0x28] = X86_OP_ENTRY3(MOVDQ,  V,x,  None,None, W,x, vex1 p_00_66), /* MOVAPS */
 [0x29] = X86_OP_ENTRY3(MOVDQ,  W,x,  None,None, V,x, vex1 p_00_66), /* MOVAPS */
 [0x2A] = X86_OP_GROUP0(0F2A),
@@ -1111,9 +1120,22 @@ static const X86OpEntry opcodes_0F[256] = {
 [0x9e] = X86_OP_ENTRYw(SETcc, E,b),
 [0x9f] = X86_OP_ENTRYw(SETcc, E,b),
 
+[0xa8] = X86_OP_ENTRYr(PUSH, GS, w),
+[0xa9] = X86_OP_ENTRYw(POP, GS, w),
 [0xae] = X86_OP_GROUP0(group15),
+[0xaf] = X86_OP_ENTRY2(IMUL3,  G,v, E,v),
+
+[0xb2] = X86_OP_ENTRY3(LSS, G,v, M,p, None, None),
+[0xb4] = X86_OP_ENTRY3(LFS, G,v, M,p, None, None),
+[0xb5] = X86_OP_ENTRY3(LGS, G,v, M,p, None, None),
+[0xb6] = X86_OP_ENTRY3(MOV, G,v, E,b, None, None, zextT0), /* MOVZX */
+[0xb7] = X86_OP_ENTRY3(MOV, G,v, E,w, None, None, zextT0), /* MOVZX */
+
+[0xbe] = X86_OP_ENTRY3(MOV, G,v, E,b, None, None, sextT0), /* MOVSX */
+[0xbf] = X86_OP_ENTRY3(MOV, G,v, E,w, None, None, sextT0), /* MOVSX */
 
 [0xc2] = X86_OP_ENTRY4(VCMP,   V,x, H,x, W,x,   vex2_rep3 p_00_66_f3_f2),
+[0xc3] = X86_OP_ENTRY3(MOV,EM,y,G,y, None,None, cpuid(SSE2)), /* MOVNTI */
 [0xc4] = X86_OP_ENTRY4(PINSRW, V,dq,H,dq,E,w,   vex5 mmx p_00_66),
 [0xc5] = X86_OP_ENTRY3(PEXTRW, G,d, U,dq,I,b,   vex5 mmx p_00_66),
 [0xc6] = X86_OP_ENTRY4(VSHUF,  V,x, H,x, W,x,   vex4 p_00_66),
@@ -1814,8 +1836,13 @@ static bool decode_op(DisasContext *s, CPUX86State *env, X86DecodedInsn *decode,
 
 case X86_TYPE_WM:  /* modrm byte selects an XMM/YMM memory operand */
 op->unit = X86_OP_SSE;
+goto get_modrm_mem;
+
+case X86_TYPE_EM:  /* modrm byte selects an ALU memory operand */
+op->unit = X86_OP_INT;
 /* fall through */
 case X86_TYPE_M:  /* modrm byte selects a memory operand */
+get_modrm_mem:
 modrm = get_modrm(s, env);
 if ((modrm >> 6) == 3) {
 return false;
diff --git a/target/i386/tcg/emit.c.inc b/target/i386/tcg/emit.c.inc
index 1dc246f8c1e..35bb56c750e 100644
--- a/target/i386/tcg/emit.c.inc
+++ b/target/i386/tcg/emit.c.inc
@@ -1956,6 +1956,16 @@ static void gen_LES(DisasContext *s, CPUX86State *env, X86DecodedInsn *decode)
 

[PATCH for-9.1 12/19] target/i386: merge and enlarge a few ranges for call to disas_insn_new

2024-04-09 Thread Paolo Bonzini
Since new opcodes are not going to be added in translate.c, round the
case labels that call disas_insn_new() so that they cover whole sets of
eight opcodes when possible.

Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/translate.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 720668e023a..26e4c7520db 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -6866,9 +6866,8 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 break;
 case 0x10e ... 0x117:
 case 0x128 ... 0x12f:
-case 0x138 ... 0x13a:
-case 0x150 ... 0x179:
-case 0x17c ... 0x17f:
+case 0x138 ... 0x13f:
+case 0x150 ... 0x17f:
 case 0x1c2:
 case 0x1c4 ... 0x1c6:
 case 0x1d0 ... 0x1fe:
-- 
2.44.0




[PATCH for-9.1 17/19] target/i386: decode x87 instructions in a separate function

2024-04-09 Thread Paolo Bonzini
These are unlikely to be converted to the table-based decoding
soon (perhaps there could be generic ESC decoding in decode-new.c.inc
for the Mod/RM byte, but not operand decoding), so keep them separate
from the remaining legacy-decoded instructions.
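
For readers following the moved code, the sketch below isolates how
disas_insn_x87() derives its dispatch index from the opcode and the
Mod/RM byte.  SketchX87 and sketch_x87_fields() are hypothetical names;
the three expressions are the ones used at the top of the function.

```c
#include <stdint.h>

/* Hypothetical bundle of the fields extracted before the switch. */
typedef struct {
    int mod;   /* Mod/RM bits 7..6; 3 means a register operand */
    int rm;    /* Mod/RM bits 2..0 */
    int op;    /* low 3 bits of the opcode, then Mod/RM bits 5..3 */
} SketchX87;

/* Split an x87 escape opcode (0xd8..0xdf) and its Mod/RM byte. */
static SketchX87 sketch_x87_fields(int b, uint8_t modrm)
{
    SketchX87 f;

    f.mod = (modrm >> 6) & 3;
    f.rm = modrm & 7;
    f.op = ((b & 7) << 3) | ((modrm >> 3) & 7);
    return f;
}
```

For instance, opcode 0xd9 with Mod/RM 0x45 yields op == 0x08, which is
the "flds" case of the memory-operand switch.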

Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/translate.c | 1120 ++-
 1 file changed, 566 insertions(+), 554 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 81291da4132..e7f51685ed8 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -2555,6 +2555,570 @@ static void gen_cmpxchg16b(DisasContext *s, CPUX86State *env, int modrm)
 }
 #endif
 
+static bool disas_insn_x87(DisasContext *s, CPUState *cpu, int b)
+{
+CPUX86State *env = cpu_env(cpu);
+bool update_fip = true;
+int modrm, mod, rm, op;
+
+if (s->flags & (HF_EM_MASK | HF_TS_MASK)) {
+/* if CR0.EM or CR0.TS are set, generate an FPU exception */
+/* XXX: what to do if illegal op ? */
+gen_exception(s, EXCP07_PREX);
+return true;
+}
+modrm = x86_ldub_code(env, s);
+mod = (modrm >> 6) & 3;
+rm = modrm & 7;
+op = ((b & 7) << 3) | ((modrm >> 3) & 7);
+if (mod != 3) {
+/* memory op */
+AddressParts a = gen_lea_modrm_0(env, s, modrm);
+TCGv ea = gen_lea_modrm_1(s, a, false);
+TCGv last_addr = tcg_temp_new();
+bool update_fdp = true;
+
+tcg_gen_mov_tl(last_addr, ea);
+gen_lea_v_seg(s, s->aflag, ea, a.def_seg, s->override);
+
+switch (op) {
+case 0x00 ... 0x07: /* fxxxs */
+case 0x10 ... 0x17: /* fixxxl */
+case 0x20 ... 0x27: /* fxxxl */
+case 0x30 ... 0x37: /* fixxx */
+{
+int op1;
+op1 = op & 7;
+
+switch (op >> 4) {
+case 0:
+tcg_gen_qemu_ld_i32(s->tmp2_i32, s->A0,
+s->mem_index, MO_LEUL);
+gen_helper_flds_FT0(tcg_env, s->tmp2_i32);
+break;
+case 1:
+tcg_gen_qemu_ld_i32(s->tmp2_i32, s->A0,
+s->mem_index, MO_LEUL);
+gen_helper_fildl_FT0(tcg_env, s->tmp2_i32);
+break;
+case 2:
+tcg_gen_qemu_ld_i64(s->tmp1_i64, s->A0,
+s->mem_index, MO_LEUQ);
+gen_helper_fldl_FT0(tcg_env, s->tmp1_i64);
+break;
+case 3:
+default:
+tcg_gen_qemu_ld_i32(s->tmp2_i32, s->A0,
+s->mem_index, MO_LESW);
+gen_helper_fildl_FT0(tcg_env, s->tmp2_i32);
+break;
+}
+
+gen_helper_fp_arith_ST0_FT0(op1);
+if (op1 == 3) {
+/* fcomp needs pop */
+gen_helper_fpop(tcg_env);
+}
+}
+break;
+case 0x08: /* flds */
+case 0x0a: /* fsts */
+case 0x0b: /* fstps */
+case 0x18 ... 0x1b: /* fildl, fisttpl, fistl, fistpl */
+case 0x28 ... 0x2b: /* fldl, fisttpll, fstl, fstpl */
+case 0x38 ... 0x3b: /* filds, fisttps, fists, fistps */
+switch (op & 7) {
+case 0:
+switch (op >> 4) {
+case 0:
+tcg_gen_qemu_ld_i32(s->tmp2_i32, s->A0,
+s->mem_index, MO_LEUL);
+gen_helper_flds_ST0(tcg_env, s->tmp2_i32);
+break;
+case 1:
+tcg_gen_qemu_ld_i32(s->tmp2_i32, s->A0,
+s->mem_index, MO_LEUL);
+gen_helper_fildl_ST0(tcg_env, s->tmp2_i32);
+break;
+case 2:
+tcg_gen_qemu_ld_i64(s->tmp1_i64, s->A0,
+s->mem_index, MO_LEUQ);
+gen_helper_fldl_ST0(tcg_env, s->tmp1_i64);
+break;
+case 3:
+default:
+tcg_gen_qemu_ld_i32(s->tmp2_i32, s->A0,
+s->mem_index, MO_LESW);
+gen_helper_fildl_ST0(tcg_env, s->tmp2_i32);
+break;
+}
+break;
+case 1:
+/* XXX: the corresponding CPUID bit must be tested ! */
+switch (op >> 4) {
+case 1:
+gen_helper_fisttl_ST0(s->tmp2_i32, tcg_env);
+tcg_gen_qemu_st_i32(s->tmp2_i32, s->

[PATCH for-9.1 01/19] target/i386: use TSTEQ/TSTNE to test low bits

2024-04-09 Thread Paolo Bonzini
When testing the sign bit or equality to zero of a partial register, it
is useful to use a single TSTEQ or TSTNE operation.  It can also be used
to test the parity flag, using bit 0 of the population count.

Do not do this for target_ulong-sized values however; the optimizer would
produce a comparison against zero anyway, and it avoids shifts by 64
which are undefined behavior.
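
A minimal sketch of the masks involved, assuming a 64-bit register and the
size encoding used in the patch (0 = 8 bits, 1 = 16 bits, 2 = 32 bits).  The
helper names are made up for illustration; TSTNE is modelled as
"(reg & mask) != 0" and TSTEQ as "(reg & mask) == 0".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Mask selecting the sign bit of a partial register of the given size. */
static uint64_t sign_bit_mask(unsigned size)
{
    assert(size < 3);   /* size 3 is the full-register case, handled apart */
    return 1ull << ((8u << size) - 1);
}

/* Mask selecting all bits of a partial register of the given size. */
static uint64_t low_bits_mask(unsigned size)
{
    assert(size < 3);   /* size 3 would shift by 64, which is undefined */
    return (1ull << (8u << size)) - 1;
}

/* TSTNE-style test: the sign bit of the partial register is set. */
static bool sign_is_set(uint64_t reg, unsigned size)
{
    return (reg & sign_bit_mask(size)) != 0;
}

/* TSTEQ-style test: the partial register is zero, upper bits ignored. */
static bool partial_is_zero(uint64_t reg, unsigned size)
{
    return (reg & low_bits_mask(size)) == 0;
}

/* Even parity of the low byte: bit 0 of the population count is clear. */
static bool even_parity(uint8_t v)
{
    unsigned count = 0;

    for (; v; v &= v - 1) {
        count++;
    }
    return (count & 1) == 0;
}
```

The asserts mark the case the commit message excludes: for a full-width
value the mask would need a shift by 64, so a plain comparison against zero
is used there instead.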

Reviewed-by: Richard Henderson 
Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/translate.c | 28 
 target/i386/tcg/emit.c.inc  |  5 ++---
 2 files changed, 22 insertions(+), 11 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 76a42c679c7..b7117393961 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -928,11 +928,21 @@ typedef struct CCPrepare {
 bool no_setcond;
 } CCPrepare;
 
+static CCPrepare gen_prepare_sign_nz(TCGv src, MemOp size)
+{
+if (size == MO_TL) {
+return (CCPrepare) { .cond = TCG_COND_LT, .reg = src, .mask = -1 };
+} else {
+return (CCPrepare) { .cond = TCG_COND_TSTNE, .reg = src, .mask = -1,
+ .imm = 1ull << ((8 << size) - 1) };
+}
+}
+
 /* compute eflags.C to reg */
 static CCPrepare gen_prepare_eflags_c(DisasContext *s, TCGv reg)
 {
 TCGv t0, t1;
-int size, shift;
+MemOp size;
 
 switch (s->cc_op) {
 case CC_OP_SUBB ... CC_OP_SUBQ:
@@ -967,9 +977,7 @@ static CCPrepare gen_prepare_eflags_c(DisasContext *s, TCGv 
reg)
 case CC_OP_SHLB ... CC_OP_SHLQ:
 /* (CC_SRC >> (DATA_BITS - 1)) & 1 */
 size = s->cc_op - CC_OP_SHLB;
-shift = (8 << size) - 1;
-return (CCPrepare) { .cond = TCG_COND_NE, .reg = cpu_cc_src,
- .mask = (target_ulong)1 << shift };
+return gen_prepare_sign_nz(cpu_cc_src, size);
 
 case CC_OP_MULB ... CC_OP_MULQ:
 return (CCPrepare) { .cond = TCG_COND_NE,
@@ -1029,8 +1037,7 @@ static CCPrepare gen_prepare_eflags_s(DisasContext *s, 
TCGv reg)
 default:
 {
 MemOp size = (s->cc_op - CC_OP_ADDB) & 3;
-TCGv t0 = gen_ext_tl(reg, cpu_cc_dst, size, true);
-return (CCPrepare) { .cond = TCG_COND_LT, .reg = t0, .mask = -1 };
+return gen_prepare_sign_nz(cpu_cc_dst, size);
 }
 }
 }
@@ -1077,8 +1084,13 @@ static CCPrepare gen_prepare_eflags_z(DisasContext *s, 
TCGv reg)
 default:
 {
 MemOp size = (s->cc_op - CC_OP_ADDB) & 3;
-TCGv t0 = gen_ext_tl(reg, cpu_cc_dst, size, false);
-return (CCPrepare) { .cond = TCG_COND_EQ, .reg = t0, .mask = -1 };
+if (size == MO_TL) {
+return (CCPrepare) { .cond = TCG_COND_EQ, .reg = cpu_cc_dst,
+ .mask = -1 };
+} else {
+return (CCPrepare) { .cond = TCG_COND_TSTEQ, .reg = cpu_cc_dst,
+ .mask = -1, .imm = (1ull << (8 << size)) 
- 1 };
+}
 }
 }
 }
diff --git a/target/i386/tcg/emit.c.inc b/target/i386/tcg/emit.c.inc
index 6bcf88ecd71..0e00f6635dd 100644
--- a/target/i386/tcg/emit.c.inc
+++ b/target/i386/tcg/emit.c.inc
@@ -1209,7 +1209,7 @@ static void gen_CMPccXADD(DisasContext *s, CPUX86State 
*env, X86DecodedInsn *dec
 [JCC_Z] = TCG_COND_EQ,
 [JCC_BE] = TCG_COND_LEU,
 [JCC_S] = TCG_COND_LT,  /* test sign bit by comparing against 0 */
-[JCC_P] = TCG_COND_EQ,  /* even parity - tests low bit of popcount */
+[JCC_P] = TCG_COND_TSTEQ,  /* even parity - tests low bit of popcount 
*/
 [JCC_L] = TCG_COND_LT,
 [JCC_LE] = TCG_COND_LE,
 };
@@ -1260,8 +1260,7 @@ static void gen_CMPccXADD(DisasContext *s, CPUX86State 
*env, X86DecodedInsn *dec
 case JCC_P:
 tcg_gen_ext8u_tl(s->tmp0, s->T0);
 tcg_gen_ctpop_tl(s->tmp0, s->tmp0);
-tcg_gen_andi_tl(s->tmp0, s->tmp0, 1);
-cmp_lhs = s->tmp0, cmp_rhs = tcg_constant_tl(0);
+cmp_lhs = s->tmp0, cmp_rhs = tcg_constant_tl(1);
 break;
 
 case JCC_S:
-- 
2.44.0




[PATCH for-9.1 04/19] target/i386: do not use s->tmp0 and s->tmp4 to compute flags

2024-04-09 Thread Paolo Bonzini
Create a new temporary whenever flags have to use one, instead of using
s->tmp0 or s->tmp4.  NULL can now be passed as the scratch register
to gen_prepare_*.
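
The selection rule used throughout the diff below can be sketched with a toy
model.  Temp, temp_new() and pick_dest() are hypothetical stand-ins for TCGv,
tcg_temp_new() and the inline conditional in the patch; the pool allocator
exists only to keep the example self-contained.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a TCG temporary; not the real TCGv type. */
typedef struct Temp {
    long value;
} Temp;

static Temp temp_pool[16];
static size_t temp_used;

/* Hypothetical analogue of tcg_temp_new(): hand out a fresh temporary. */
static Temp *temp_new(void)
{
    assert(temp_used < sizeof(temp_pool) / sizeof(temp_pool[0]));
    return &temp_pool[temp_used++];
}

/*
 * Choose where to build the first operand.  Reuse the caller's scratch
 * register unless there is none (NULL) or it already holds the second
 * operand, in which case writing to it would clobber that operand.
 */
static Temp *pick_dest(Temp *scratch, const Temp *other)
{
    return (scratch == other || !scratch) ? temp_new() : scratch;
}
```

This mirrors the expression "(reg == t1 || !reg) ? tcg_temp_new() : reg" in
the hunks that follow.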

Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/translate.c | 54 +
 1 file changed, 31 insertions(+), 23 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 197cccb6c96..debc1b27283 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -947,9 +947,9 @@ static CCPrepare gen_prepare_eflags_c(DisasContext *s, TCGv 
reg)
 case CC_OP_SUBB ... CC_OP_SUBQ:
 /* (DATA_TYPE)CC_SRCT < (DATA_TYPE)CC_SRC */
 size = s->cc_op - CC_OP_SUBB;
-t1 = gen_ext_tl(s->tmp0, cpu_cc_src, size, false);
-/* If no temporary was used, be careful not to alias t1 and t0.  */
-t0 = t1 == cpu_cc_src ? s->tmp0 : reg;
+/* Be careful not to alias t1 and t0.  */
+t1 = gen_ext_tl(NULL, cpu_cc_src, size, false);
+t0 = (reg == t1 || !reg) ? tcg_temp_new() : reg;
 tcg_gen_mov_tl(t0, s->cc_srcT);
 gen_extu(size, t0);
 goto add_sub;
@@ -957,8 +957,9 @@ static CCPrepare gen_prepare_eflags_c(DisasContext *s, TCGv 
reg)
 case CC_OP_ADDB ... CC_OP_ADDQ:
 /* (DATA_TYPE)CC_DST < (DATA_TYPE)CC_SRC */
 size = s->cc_op - CC_OP_ADDB;
-t1 = gen_ext_tl(s->tmp0, cpu_cc_src, size, false);
-t0 = gen_ext_tl(reg, cpu_cc_dst, size, false);
+/* Be careful not to alias t1 and t0.  */
+t1 = gen_ext_tl(NULL, cpu_cc_src, size, false);
+t0 = gen_ext_tl(reg == t1 ? NULL : reg, cpu_cc_dst, size, false);
 add_sub:
 return (CCPrepare) { .cond = TCG_COND_LTU, .reg = t0,
  .reg2 = t1, .use_reg2 = true };
@@ -1002,6 +1003,9 @@ static CCPrepare gen_prepare_eflags_c(DisasContext *s, 
TCGv reg)
/* The need to compute only C from CC_OP_DYNAMIC is important
   in efficiently implementing e.g. INC at the start of a TB.  */
gen_update_cc_op(s);
+   if (!reg) {
+   reg = tcg_temp_new();
+   }
gen_helper_cc_compute_c(reg, cpu_cc_dst, cpu_cc_src,
cpu_cc_src2, cpu_cc_op);
return (CCPrepare) { .cond = TCG_COND_NE, .reg = reg,
@@ -1098,7 +1102,7 @@ static CCPrepare gen_prepare_cc(DisasContext *s, int b, 
TCGv reg)
 int inv, jcc_op, cond;
 MemOp size;
 CCPrepare cc;
-TCGv t0;
+TCGv t0, t1;
 
 inv = b & 1;
 jcc_op = (b >> 1) & 7;
@@ -1109,11 +1113,13 @@ static CCPrepare gen_prepare_cc(DisasContext *s, int b, 
TCGv reg)
 size = s->cc_op - CC_OP_SUBB;
 switch (jcc_op) {
 case JCC_BE:
-tcg_gen_mov_tl(s->tmp4, s->cc_srcT);
-gen_extu(size, s->tmp4);
-t0 = gen_ext_tl(s->tmp0, cpu_cc_src, size, false);
-cc = (CCPrepare) { .cond = TCG_COND_LEU, .reg = s->tmp4,
-   .reg2 = t0, .use_reg2 = true };
+/* Be careful not to alias t1 and t0.  */
+t1 = gen_ext_tl(NULL, cpu_cc_src, size, false);
+t0 = (reg == t1 || !reg) ? tcg_temp_new() : reg;
+tcg_gen_mov_tl(t0, s->cc_srcT);
+gen_extu(size, t0);
+cc = (CCPrepare) { .cond = TCG_COND_LEU, .reg = t0,
+   .reg2 = t1, .use_reg2 = true };
 break;
 
 case JCC_L:
@@ -1122,11 +1128,13 @@ static CCPrepare gen_prepare_cc(DisasContext *s, int b, 
TCGv reg)
 case JCC_LE:
 cond = TCG_COND_LE;
 fast_jcc_l:
-tcg_gen_mov_tl(s->tmp4, s->cc_srcT);
-gen_exts(size, s->tmp4);
-t0 = gen_ext_tl(s->tmp0, cpu_cc_src, size, true);
-cc = (CCPrepare) { .cond = cond, .reg = s->tmp4,
-   .reg2 = t0, .use_reg2 = true };
+/* Be careful not to alias t1 and t0.  */
+t1 = gen_ext_tl(NULL, cpu_cc_src, size, true);
+t0 = (reg == t1 || !reg) ? tcg_temp_new() : reg;
+tcg_gen_mov_tl(t0, s->cc_srcT);
+gen_exts(size, t0);
+cc = (CCPrepare) { .cond = cond, .reg = t0,
+   .reg2 = t1, .use_reg2 = true };
 break;
 
 default:
@@ -1160,8 +1168,8 @@ static CCPrepare gen_prepare_cc(DisasContext *s, int b, 
TCGv reg)
 break;
 case JCC_L:
 gen_compute_eflags(s);
-if (reg == cpu_cc_src) {
-reg = s->tmp0;
+if (reg == cpu_cc_src || !reg) {
+reg = tcg_temp_new();
 }
 tcg_gen_addi_tl(reg, cpu_cc_src, CC_O - CC_S);
 cc = (CCPrepare) { .cond = TCG_COND_TSTNE, .reg = reg,
@@ -1170,8 +1178,8 @@ static CCPrepare gen_prepare_cc(DisasContext *s, int b, 
TCGv reg)
 default:
 case JCC_LE:
 gen_c

[PATCH for-9.1 03/19] target/i386: remove mask from CCPrepare

2024-04-09 Thread Paolo Bonzini
With the introduction of TSTEQ and TSTNE the .mask field is always -1,
so remove all the now-unnecessary code.

Reviewed-by: Richard Henderson 
Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/translate.c | 81 +
 1 file changed, 27 insertions(+), 54 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 4de5090846a..197cccb6c96 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -923,7 +923,6 @@ typedef struct CCPrepare {
 TCGv reg;
 TCGv reg2;
 target_ulong imm;
-target_ulong mask;
 bool use_reg2;
 bool no_setcond;
 } CCPrepare;
@@ -931,9 +930,9 @@ typedef struct CCPrepare {
 static CCPrepare gen_prepare_sign_nz(TCGv src, MemOp size)
 {
 if (size == MO_TL) {
-return (CCPrepare) { .cond = TCG_COND_LT, .reg = src, .mask = -1 };
+return (CCPrepare) { .cond = TCG_COND_LT, .reg = src };
 } else {
-return (CCPrepare) { .cond = TCG_COND_TSTNE, .reg = src, .mask = -1,
+return (CCPrepare) { .cond = TCG_COND_TSTNE, .reg = src,
  .imm = 1ull << ((8 << size) - 1) };
 }
 }
@@ -962,17 +961,17 @@ static CCPrepare gen_prepare_eflags_c(DisasContext *s, 
TCGv reg)
 t0 = gen_ext_tl(reg, cpu_cc_dst, size, false);
 add_sub:
 return (CCPrepare) { .cond = TCG_COND_LTU, .reg = t0,
- .reg2 = t1, .mask = -1, .use_reg2 = true };
+ .reg2 = t1, .use_reg2 = true };
 
 case CC_OP_LOGICB ... CC_OP_LOGICQ:
 case CC_OP_CLR:
 case CC_OP_POPCNT:
-return (CCPrepare) { .cond = TCG_COND_NEVER, .mask = -1 };
+return (CCPrepare) { .cond = TCG_COND_NEVER };
 
 case CC_OP_INCB ... CC_OP_INCQ:
 case CC_OP_DECB ... CC_OP_DECQ:
 return (CCPrepare) { .cond = TCG_COND_NE, .reg = cpu_cc_src,
- .mask = -1, .no_setcond = true };
+ .no_setcond = true };
 
 case CC_OP_SHLB ... CC_OP_SHLQ:
 /* (CC_SRC >> (DATA_BITS - 1)) & 1 */
@@ -981,23 +980,23 @@ static CCPrepare gen_prepare_eflags_c(DisasContext *s, 
TCGv reg)
 
 case CC_OP_MULB ... CC_OP_MULQ:
 return (CCPrepare) { .cond = TCG_COND_NE,
- .reg = cpu_cc_src, .mask = -1 };
+ .reg = cpu_cc_src };
 
 case CC_OP_BMILGB ... CC_OP_BMILGQ:
 size = s->cc_op - CC_OP_BMILGB;
 t0 = gen_ext_tl(reg, cpu_cc_src, size, false);
-return (CCPrepare) { .cond = TCG_COND_EQ, .reg = t0, .mask = -1 };
+return (CCPrepare) { .cond = TCG_COND_EQ, .reg = t0 };
 
 case CC_OP_ADCX:
 case CC_OP_ADCOX:
 return (CCPrepare) { .cond = TCG_COND_NE, .reg = cpu_cc_dst,
- .mask = -1, .no_setcond = true };
+ .no_setcond = true };
 
 case CC_OP_EFLAGS:
 case CC_OP_SARB ... CC_OP_SARQ:
 /* CC_SRC & 1 */
 return (CCPrepare) { .cond = TCG_COND_TSTNE,
- .reg = cpu_cc_src, .mask = -1, .imm = CC_C };
+ .reg = cpu_cc_src, .imm = CC_C };
 
 default:
/* The need to compute only C from CC_OP_DYNAMIC is important
@@ -1006,7 +1005,7 @@ static CCPrepare gen_prepare_eflags_c(DisasContext *s, 
TCGv reg)
gen_helper_cc_compute_c(reg, cpu_cc_dst, cpu_cc_src,
cpu_cc_src2, cpu_cc_op);
return (CCPrepare) { .cond = TCG_COND_NE, .reg = reg,
-.mask = -1, .no_setcond = true };
+.no_setcond = true };
 }
 }
 
@@ -1015,7 +1014,7 @@ static CCPrepare gen_prepare_eflags_p(DisasContext *s, 
TCGv reg)
 {
 gen_compute_eflags(s);
 return (CCPrepare) { .cond = TCG_COND_TSTNE, .reg = cpu_cc_src,
- .mask = -1, .imm = CC_P };
+ .imm = CC_P };
 }
 
 /* compute eflags.S to reg */
@@ -1030,10 +1029,10 @@ static CCPrepare gen_prepare_eflags_s(DisasContext *s, 
TCGv reg)
 case CC_OP_ADOX:
 case CC_OP_ADCOX:
 return (CCPrepare) { .cond = TCG_COND_TSTNE, .reg = cpu_cc_src,
- .mask = -1, .imm = CC_S };
+ .imm = CC_S };
 case CC_OP_CLR:
 case CC_OP_POPCNT:
-return (CCPrepare) { .cond = TCG_COND_NEVER, .mask = -1 };
+return (CCPrepare) { .cond = TCG_COND_NEVER };
 default:
 {
 MemOp size = (s->cc_op - CC_OP_ADDB) & 3;
@@ -1049,17 +1048,16 @@ static CCPrepare gen_prepare_eflags_o(DisasContext *s, 
TCGv reg)
 case CC_OP_ADOX:
 case CC_OP_ADCOX:
 return (CCPrepare) { .cond = TCG_COND_NE, .reg = cpu_cc_src2,
- .mask = -1, .no_setcond = true };
+ .no_setcond = true };
 case CC_OP_CLR:
 case CC_OP_POPCNT:
-return (CCPrepare) { .cond = TCG_COND_NEVER, .mas

[PATCH for-9.1 07/19] target/i386: extract gen_far_call/jmp, reordering temporaries

2024-04-09 Thread Paolo Bonzini
Extract the code into new functions, and swap T0/T1 so that T0 corresponds
to the first immediate in the instruction stream.
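
For the real-mode and vm86 path, the effect of gen_op_movl_seg_real()
followed by the jump can be sketched in plain C.  After this patch T0 holds
the offset (the first immediate) and T1 the selector.  RealModeSeg and both
functions are hypothetical names for illustration, and address wrap-around
and A20 handling are deliberately ignored.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical view of the two segment fields the helper updates. */
typedef struct RealModeSeg {
    uint32_t selector;
    uint32_t base;
} RealModeSeg;

/* Real mode: keep the low 16 bits as selector, base is selector * 16. */
static RealModeSeg load_seg_real(uint32_t seg)
{
    RealModeSeg s;

    s.selector = seg & 0xffff;
    s.base = s.selector << 4;
    return s;
}

/* Linear address reached by a far jump to selector:offset. */
static uint32_t far_target(uint32_t offset, uint32_t seg)
{
    return load_seg_real(seg).base + offset;
}
```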

Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/translate.c | 93 +
 1 file changed, 53 insertions(+), 40 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index e501d4701b6..c251fa21e6d 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -2518,12 +2518,13 @@ static inline void gen_op_movl_T0_seg(DisasContext *s, 
X86Seg seg_reg)
  offsetof(CPUX86State,segs[seg_reg].selector));
 }
 
-static inline void gen_op_movl_seg_T0_vm(DisasContext *s, X86Seg seg_reg)
+static void gen_op_movl_seg_real(DisasContext *s, X86Seg seg_reg, TCGv seg)
 {
-tcg_gen_ext16u_tl(s->T0, s->T0);
-tcg_gen_st32_tl(s->T0, tcg_env,
+TCGv selector = tcg_temp_new();
+tcg_gen_ext16u_tl(selector, seg);
+tcg_gen_st32_tl(selector, tcg_env,
 offsetof(CPUX86State,segs[seg_reg].selector));
-tcg_gen_shli_tl(cpu_seg_base[seg_reg], s->T0, 4);
+tcg_gen_shli_tl(cpu_seg_base[seg_reg], selector, 4);
 }
 
 /* move T0 to seg_reg and compute if the CPU state may change. Never
@@ -2543,13 +2544,45 @@ static void gen_movl_seg_T0(DisasContext *s, X86Seg 
seg_reg)
 s->base.is_jmp = DISAS_EOB_NEXT;
 }
 } else {
-gen_op_movl_seg_T0_vm(s, seg_reg);
+gen_op_movl_seg_real(s, seg_reg, s->T0);
 if (seg_reg == R_SS) {
 s->base.is_jmp = DISAS_EOB_INHIBIT_IRQ;
 }
 }
 }
 
+static void gen_far_call(DisasContext *s)
+{
+TCGv_i32 new_cs = tcg_temp_new_i32();
+tcg_gen_trunc_tl_i32(new_cs, s->T1);
+if (PE(s) && !VM86(s)) {
+gen_helper_lcall_protected(tcg_env, new_cs, s->T0,
+   tcg_constant_i32(s->dflag - 1),
+   eip_next_tl(s));
+} else {
+TCGv_i32 new_eip = tcg_temp_new_i32();
+tcg_gen_trunc_tl_i32(new_eip, s->T0);
+gen_helper_lcall_real(tcg_env, new_cs, new_eip,
+  tcg_constant_i32(s->dflag - 1),
+  eip_next_i32(s));
+}
+s->base.is_jmp = DISAS_JUMP;
+}
+
+static void gen_far_jmp(DisasContext *s)
+{
+if (PE(s) && !VM86(s)) {
+TCGv_i32 new_cs = tcg_temp_new_i32();
+tcg_gen_trunc_tl_i32(new_cs, s->T1);
+gen_helper_ljmp_protected(tcg_env, new_cs, s->T0,
+  eip_next_tl(s));
+} else {
+gen_op_movl_seg_real(s, R_CS, s->T1);
+gen_op_jmp_v(s, s->T0);
+}
+s->base.is_jmp = DISAS_JUMP;
+}
+
 static void gen_svm_check_intercept(DisasContext *s, uint32_t type)
 {
 /* no SVM activated; fast case */
@@ -3656,23 +3689,10 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 if (mod == 3) {
 goto illegal_op;
 }
-gen_op_ld_v(s, ot, s->T1, s->A0);
+gen_op_ld_v(s, ot, s->T0, s->A0);
 gen_add_A0_im(s, 1 << ot);
-gen_op_ld_v(s, MO_16, s->T0, s->A0);
-do_lcall:
-if (PE(s) && !VM86(s)) {
-tcg_gen_trunc_tl_i32(s->tmp2_i32, s->T0);
-gen_helper_lcall_protected(tcg_env, s->tmp2_i32, s->T1,
-   tcg_constant_i32(dflag - 1),
-   eip_next_tl(s));
-} else {
-tcg_gen_trunc_tl_i32(s->tmp2_i32, s->T0);
-tcg_gen_trunc_tl_i32(s->tmp3_i32, s->T1);
-gen_helper_lcall_real(tcg_env, s->tmp2_i32, s->tmp3_i32,
-  tcg_constant_i32(dflag - 1),
-  eip_next_i32(s));
-}
-s->base.is_jmp = DISAS_JUMP;
+gen_op_ld_v(s, MO_16, s->T1, s->A0);
+gen_far_call(s);
 break;
 case 4: /* jmp Ev */
 if (dflag == MO_16) {
@@ -3686,19 +3706,10 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 if (mod == 3) {
 goto illegal_op;
 }
-gen_op_ld_v(s, ot, s->T1, s->A0);
+gen_op_ld_v(s, ot, s->T0, s->A0);
 gen_add_A0_im(s, 1 << ot);
-gen_op_ld_v(s, MO_16, s->T0, s->A0);
-do_ljmp:
-if (PE(s) && !VM86(s)) {
-tcg_gen_trunc_tl_i32(s->tmp2_i32, s->T0);
-gen_helper_ljmp_protected(tcg_env, s->tmp2_i32, s->T1,
-  eip_next_tl(s));
-} else {
-gen_op_movl_seg_T0_vm(s, R_CS);
-gen_op_jmp_v(s, s->T1);
-}
-s->base.is_jmp = DISAS_JUMP;
+gen_op_ld_v(s, MO_16,

[PATCH for-9.1 00/19] target/i386: convert 1-byte opcodes to new decoder

2024-04-09 Thread Paolo Bonzini
This series includes changes to the x86 TCG decoder that switch the
1-byte opcodes to the table-driven decoder (except for x87).  A few
easy 2-byte opcodes are also converted (BSWAP, SETcc, CMOVcc,
MOVZX/MOVSX and those that are extensions of 1-byte opcodes like PUSH/POP
FS/GS, LFS/LGS/LSS).

After optimization, the generated code is generally similar to what
is produced by the old decoder, with some differences for 32-bit
multiplications and rotate operations (RCL/RCR, and ROL/ROR less so).

This reaches a point where prefix decoding is done entirely in the new
decoder; when the opcode is loaded, if needed it will defer to
translate.c for the actual translation of the instruction.

Quite surprisingly, even without removing this duplicate code the
series removes more lines than it adds, even though the table-driven
translator is theoretically more verbose (1 line per entry in the tables
plus all the function declarations for group decoders and emitters).
This shows how operand decoding is spread all over the place in
translate.c.

These have been ready for a few months; now that it seems clearer that
issue 2092 is a generic problem with vhost-user, it is time to get
this upstream.

Paolo

Based-on: <20240406223248.502699-1-richard.hender...@linaro.org>


Paolo Bonzini (19):
  target/i386: use TSTEQ/TSTNE to test low bits
  target/i386: use TSTEQ/TSTNE to check flags
  target/i386: remove mask from CCPrepare
  target/i386: do not use s->tmp0 and s->tmp4 to compute flags
  target/i386: reintroduce debugging mechanism
  target/i386: move 00-5F opcodes to new decoder
  target/i386: extract gen_far_call/jmp, reordering temporaries
  target/i386: allow instructions with more than one immediate
  target/i386: move 60-BF opcodes to new decoder
  target/i386: generalize gen_movl_seg_T0
  target/i386: move C0-FF opcodes to new decoder (except for x87)
  target/i386: merge and enlarge a few ranges for call to disas_insn_new
  target/i386: move remaining conditional operations to new decoder
  target/i386: move BSWAP to new decoder
  target/i386: port extensions of one-byte opcodes to new decoder
  target/i386: remove now-converted opcodes from old decoder
  target/i386: decode x87 instructions in a separate function
  target/i386: split legacy decoder into a separate function
  target/i386: remove duplicate prefix decoding

 include/tcg/tcg.h   |6 +
 target/i386/helper.h|   11 -
 target/i386/tcg/decode-new.h|   23 +-
 target/i386/tcg/shift_helper_template.h.inc |  108 -
 target/i386/tcg/int_helper.c|   34 -
 target/i386/tcg/translate.c | 3717 ---
 target/i386/tcg/decode-new.c.inc|  602 ++-
 target/i386/tcg/emit.c.inc  | 1560 +++-
 8 files changed, 2914 insertions(+), 3147 deletions(-)
 delete mode 100644 target/i386/tcg/shift_helper_template.h.inc

-- 
2.44.0




Re: [PATCH] target/i386: fix direction of "32-bit MMU" test

2024-04-09 Thread Paolo Bonzini
On Tue, Apr 9, 2024 at 12:59 PM Zhao Liu  wrote:
>
> Hi Michael & Paolo,
>
> On Fri, Apr 05, 2024 at 08:30:43PM +0300, Michael Tokarev wrote:
> > Date: Fri, 5 Apr 2024 20:30:43 +0300
> > From: Michael Tokarev 
> > Subject: Re: [PATCH] target/i386: fix direction of "32-bit MMU" test
> >
> > 01.04.2024 09:02, Michael Tokarev:
> >
> > > Anyone can guess why this rather trivial and obviously correct patch 
> > > causes segfaults
> > > in a few tests in staging-7.2 - when run in tcg mode, namely:
> > >
> > >pxe-test
> > >migration-test
> > >boot-serial-test
> > >bios-tables-test
> > >vmgenid-test
> > >cdrom-test
> > >
> > > When reverting this single commit from staging-7.2, it all works fine 
> > > again.
> >
> > It sigsegvs in probe_access_internal():
> >
> >   CPUTLBEntry *entry = tlb_entry(env, mmu_idx, addr); -- this one returns 
> > NULL,
> >
> > and next there's a call
> >
> >   tlb_addr = tlb_read_ofs(entry, elt_ofs);
> >
> > which fails.
> >
> > #0  0x55c5de8a in tlb_read_ofs (ofs=8, entry=0x0) at 
> > 7.2/accel/tcg/cputlb.c:1455
> > #1  probe_access_internal
> > (env=0x56a862a0, addr=4294967280, fault_size=fault_size@entry=1,
> > access_type=access_type@entry=MMU_INST_FETCH, mmu_idx=5,
> > nonfault=nonfault@entry=false, phost=0x7fffea4d32a0, pfull=0x7fffea4d3298,
> > retaddr=0)
> > at 7.2/accel/tcg/cputlb.c:1555
> > #2  0x55c62aba in get_page_addr_code_hostp
> > (env=, addr=addr@entry=4294967280, hostp=hostp@entry=0x0)
> > at 7.2/accel/tcg/cputlb.c:1691
> > #3  0x55c52b54 in get_page_addr_code (addr=4294967280, 
> > env=)
> > at 7.2/include/exec/exec-all.h:714
> > #4  tb_htable_lookup
> > (cpu=cpu@entry=0x56a85530, pc=pc@entry=4294967280,
> > cs_base=cs_base@entry=4294901760, flags=flags@entry=64,
> > cflags=cflags@entry=4278190080) at 7.2/accel/tcg/cpu-exec.c:236
> > #5  0x55c53e8e in tb_lookup
> > (cflags=4278190080, flags=64, cs_base=4294901760, pc=4294967280, 
> > cpu=0x56a85530)
> > at 7.2/accel/tcg/cpu-exec.c:270
> > #6  cpu_exec (cpu=cpu@entry=0x56a85530) at 7.2/accel/tcg/cpu-exec.c:1001
> > #7  0x55c75d2f in tcg_cpus_exec (cpu=cpu@entry=0x56a85530)
> > at 7.2/accel/tcg/tcg-accel-ops.c:69
> > #8  0x55c75e80 in mttcg_cpu_thread_fn (arg=arg@entry=0x56a85530)
> > at 7.2/accel/tcg/tcg-accel-ops-mttcg.c:95
> > #9  0x55ded098 in qemu_thread_start (args=0x56adac40)
> > at 7.2/util/qemu-thread-posix.c:505
> > #10 0x75793134 in start_thread (arg=)
> > #11 0x758137dc in clone3 ()
> >
>
> I debugged it manually, and found the problem occurs in tlb_index() with
> mmu_idx=5.
>
> For v7.2, the maximum mmu index supported by i386 is 4 (since
> NB_MMU_MODES = 5 defined in target/i386/cpu-param.h).
>
> On Michael's 7.2-i386-mmu-idx tree, the commit 9fc3a7828d25 ("target/i386:
> use separate MMU indexes for 32-bit accesses") introduced more indexes
> without relaxing the NB_MMU_MODES for i386.
>
> Before this fix, probe_access_internal() just got the wrong mmu_idx as 4,
> and it's not out of bounds. After this fix, the right mmu_idx=5 is truly
> out of bounds.
>
> On the master branch, there's no such issue since the commits ffd824f3f32d
> ("include/exec: Set default NB_MMU_MODES to 16") and 6787318a5d86
> ("target/i386: Remove NB_MMU_MODES define") relaxed the upper limit of MMU
> index for i386.

Thanks Zhao! Alternatively, it's enough to set NB_MMU_MODES to 8 in
commit 9fc3a7828d25.

Paolo
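
The failure mode described in this thread comes down to an index check
against the number of per-mode tables.  The sketch below only illustrates
that arithmetic; the macro and function names are hypothetical, and the
values 5 and 8 are the ones mentioned in the messages above.

```c
#include <assert.h>
#include <stdbool.h>

/* Limit on the 7.2 branch, and the value suggested in the reply. */
#define TOY_NB_MMU_MODES_OLD 5
#define TOY_NB_MMU_MODES_NEW 8

/* True when mmu_idx selects one of the nb_modes per-mode tables. */
static bool mmu_idx_in_bounds(int mmu_idx, int nb_modes)
{
    return mmu_idx >= 0 && mmu_idx < nb_modes;
}
```

With a limit of 5 the highest valid index is 4, so index 5 falls outside the
table; with a limit of 8 it is valid.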




Re: [PATCH for-9.1 v2 00/28] linux-user/i386: Properly align signal frame

2024-04-09 Thread Paolo Bonzini

On 4/9/24 07:02, Richard Henderson wrote:

v1: 
https://lore.kernel.org/qemu-devel/20230524054647.1093758-1-richard.hender...@linaro.org/

But v1 isn't particularly complet or korrect.

Disconnect fpstate from sigframe, just like the kernel does.
Return the separate portions of the frame from get_sigframe.
Alter all of the target fpu routines to access memory that
has already been translated and sized.


With the exception of patch 22, and with small nits in patches 1/19/23:

Reviewed-by: Paolo Bonzini 



r~


Richard Henderson (28):
   target/i386: Add tcg/access.[ch]
   target/i386: Convert do_fldt, do_fstt to X86Access
   target/i386: Convert helper_{fbld,fbst}_ST0 to X86Access
   target/i386: Convert do_fldenv to X86Access
   target/i386: Convert do_fstenv to X86Access
   target/i386: Convert do_fsave, do_frstor to X86Access
   target/i386: Convert do_xsave_{fpu,mxcr,sse} to X86Access
   target/i386: Convert do_xrstor_{fpu,mxcr,sse} to X86Access
   target/i386: Convert do_fxsave, do_fxrstor to X86Access
   target/i386: Convert do_xsave_* to X86Access
   target/i386: Convert do_xrstor_* to X86Access
   target/i386: Split out do_xsave_chk
   target/i386: Add rbfm argument to cpu_x86_{xsave,xrstor}
   target/i386: Add {hw,sw}_reserved to X86LegacyXSaveArea
   linux-user/i386: Drop xfeatures_size from sigcontext arithmetic
   linux-user/i386: Remove xfeatures from target_fpstate_fxsave
   linux-user/i386: Replace target_fpstate_fxsave with X86LegacyXSaveArea
   linux-user/i386: Split out struct target_fregs_state
   linux-user/i386: Fix -mregparm=3 for signal delivery
   linux-user/i386: Return boolean success from restore_sigcontext
   linux-user/i386: Return boolean success from xrstor_sigcontext
   linux-user/i386: Fix allocation and alignment of fp state
   target/i386: Honor xfeatures in xrstor_sigcontext
   target/i386: Convert do_xsave to X86Access
   target/i386: Convert do_xrstor to X86Access
   target/i386: Pass host pointer and size to cpu_x86_{fsave,frstor}
   target/i386: Pass host pointer and size to cpu_x86_{fxsave,fxrstor}
   target/i386: Pass host pointer and size to cpu_x86_{xsave,xrstor}

  target/i386/cpu.h|  57 ++-
  target/i386/tcg/access.h |  40 ++
  linux-user/i386/signal.c | 669 ++-
  target/i386/tcg/access.c | 160 
  target/i386/tcg/fpu_helper.c | 561 --
  tests/tcg/x86_64/test-1648.c |  33 ++
  target/i386/tcg/meson.build  |   1 +
  tests/tcg/x86_64/Makefile.target |   1 +
  8 files changed, 1014 insertions(+), 508 deletions(-)
  create mode 100644 target/i386/tcg/access.h
  create mode 100644 target/i386/tcg/access.c
  create mode 100644 tests/tcg/x86_64/test-1648.c






Re: [PATCH v2 02/28] target/i386: Convert do_fldt, do_fstt to X86Access

2024-04-09 Thread Paolo Bonzini

On 4/9/24 07:02, Richard Henderson wrote:

Signed-off-by: Richard Henderson 
---
  target/i386/tcg/fpu_helper.c | 44 +---
  1 file changed, 31 insertions(+), 13 deletions(-)


Three incorrect GETPC()s that get fixed later in the series:

do_fsave:


@@ -2459,15 +2465,18 @@ void helper_fldenv(CPUX86State *env, target_ulong ptr, 
int data32)
  static void do_fsave(CPUX86State *env, target_ulong ptr, int data32,
   uintptr_t retaddr)
  {
+X86Access ac;
  floatx80 tmp;
  int i;
  
  do_fstenv(env, ptr, data32, retaddr);
  
  ptr += (target_ulong)14 << data32;

+access_prepare(&ac, env, ptr, 80, MMU_DATA_STORE, GETPC());
+


do_xsave_fpu:


@@ -2506,6 +2518,7 @@ static void do_xsave_fpu(CPUX86State *env, target_ulong 
ptr, uintptr_t ra)
  {
  int fpus, fptag, i;
  target_ulong addr;
+X86Access ac;
  
  fpus = (env->fpus & ~0x3800) | (env->fpstt & 0x7) << 11;

  fptag = 0;
@@ -2524,9 +2537,11 @@ static void do_xsave_fpu(CPUX86State *env, target_ulong 
ptr, uintptr_t ra)
  cpu_stq_data_ra(env, ptr + XO(legacy.fpdp), 0, ra); /* edp+sel; rdp */
  
  addr = ptr + XO(legacy.fpregs);

+access_prepare(&ac, env, addr, 8 * 16, MMU_DATA_STORE, GETPC());
+
  for (i = 0; i < 8; i++) {
  floatx80 tmp = ST(i);
-do_fstt(env, tmp, addr, ra);
+do_fstt(&ac, addr, tmp);
  addr += 16;
  }
  }


do_xrstor_fpu:


@@ -2699,6 +2714,7 @@ static void do_xrstor_fpu(CPUX86State *env, target_ulong 
ptr, uintptr_t ra)
  {
  int i, fpuc, fpus, fptag;
  target_ulong addr;
+X86Access ac;
  
  fpuc = cpu_lduw_data_ra(env, ptr + XO(legacy.fcw), ra);

  fpus = cpu_lduw_data_ra(env, ptr + XO(legacy.fsw), ra);
@@ -2711,8 +2727,10 @@ static void do_xrstor_fpu(CPUX86State *env, target_ulong 
ptr, uintptr_t ra)
  }
  
  addr = ptr + XO(legacy.fpregs);

+access_prepare(&ac, env, addr, 8 * 16, MMU_DATA_LOAD, GETPC());
+
  for (i = 0; i < 8; i++) {
-floatx80 tmp = do_fldt(env, addr, ra);
+floatx80 tmp = do_fldt(&ac, addr);
  ST(i) = tmp;
  addr += 16;
  }






Re: [PATCH v2 23/28] target/i386: Honor xfeatures in xrstor_sigcontext

2024-04-09 Thread Paolo Bonzini

On 4/9/24 07:02, Richard Henderson wrote:

Signed-off-by: Richard Henderson 
---
  linux-user/i386/signal.c | 19 ++-
  1 file changed, 18 insertions(+), 1 deletion(-)

diff --git a/linux-user/i386/signal.c b/linux-user/i386/signal.c
index d015fe520a..fd09c973d4 100644
--- a/linux-user/i386/signal.c
+++ b/linux-user/i386/signal.c
@@ -612,6 +612,7 @@ static bool xrstor_sigcontext(CPUX86State *env, FPStateKind 
fpkind,
  struct target_fpx_sw_bytes *sw = (void *)>sw_reserved;
  uint32_t magic1, magic2;
  uint32_t extended_size, xstate_size, min_size, max_size;
+uint64_t xfeatures;
  
  switch (fpkind) {

  case FPSTATE_XSAVE:
@@ -628,10 +629,25 @@ static bool xrstor_sigcontext(CPUX86State *env, 
FPStateKind fpkind,
  xstate_size > extended_size) {
  break;
  }
+
+/*
+ * Restore the features indicated in the frame, masked by
+ * those currently enabled.  Re-check the frame size.
+ * ??? It is not clear where the kernel does this, but it
+ * is not in check_xstate_in_sigframe, and so (probably)
+ * does not fall back to fxrstor.
+ */


I think you're referring to this in __fpu_restore_sig?

if (use_xsave()) {
/*
 * Remove all UABI feature bits not set in user_xfeatures
 * from the memory xstate header which makes the full
 * restore below bring them into init state. This works for
 * fx_only mode as well because that has only FP and SSE
 * set in user_xfeatures.
 *
 * Preserve supervisor states!
 */
u64 mask = user_xfeatures | xfeatures_mask_supervisor();

fpregs->xsave.header.xfeatures &= mask;
success = !os_xrstor_safe(fpu->fpstate,
  fpu_kernel_cfg.max_features);

It is not masking against the user process's xcr0, but qemu-user's xcr0
is effectively user_xfeatures (it's computed in x86_cpu_reset_hold() and
will never change afterwards since XSETBV is privileged).

Paolo
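
The masking discussed above can be sketched as follows.  The function name
is hypothetical; the bit positions are the architectural XCR0 assignments
for x87, SSE and AVX state.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_XFEATURE_FP  (1ull << 0)   /* x87 state */
#define TOY_XFEATURE_SSE (1ull << 1)   /* XMM registers and MXCSR */
#define TOY_XFEATURE_AVX (1ull << 2)   /* upper halves of YMM registers */

/* Restore only features that the frame requests and the CPU has enabled. */
static uint64_t features_to_restore(uint64_t frame_xfeatures, uint64_t xcr0)
{
    return frame_xfeatures & xcr0;
}
```

A frame that claims AVX state on a CPU whose xcr0 enables only x87 and SSE
would therefore restore just those two components.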


+xfeatures = tswap64(sw->xfeatures) & env->xcr0;
+min_size = xsave_area_size(xfeatures, false);
+if (xstate_size < min_size) {
+return false;
+}
+
  if (!access_ok(env_cpu(env), VERIFY_READ, fxstate_addr,
 xstate_size + TARGET_FP_XSTATE_MAGIC2_SIZE)) {
  return false;
  }
+
  /*
   * Check for the presence of second magic word at the end of memory
   * layout. This detects the case where the user just copied the legacy
@@ -644,7 +660,8 @@ static bool xrstor_sigcontext(CPUX86State *env, FPStateKind 
fpkind,
  if (magic2 != FP_XSTATE_MAGIC2) {
  break;
  }
-cpu_x86_xrstor(env, fxstate_addr, -1);
+
+cpu_x86_xrstor(env, fxstate_addr, xfeatures);
  return true;
  
  default:





Re: [PATCH v2 19/28] linux-user/i386: Fix -mregparm=3 for signal delivery

2024-04-09 Thread Paolo Bonzini

On 4/9/24 07:02, Richard Henderson wrote:

Since v2.6.19, the kernel has supported -mregparm=3.

Signed-off-by: Richard Henderson 
---
  linux-user/i386/signal.c | 20 +---
  1 file changed, 9 insertions(+), 11 deletions(-)

diff --git a/linux-user/i386/signal.c b/linux-user/i386/signal.c
index 559b63c25b..f8cc0cff07 100644
--- a/linux-user/i386/signal.c
+++ b/linux-user/i386/signal.c
@@ -427,6 +427,11 @@ void setup_frame(int sig, struct target_sigaction *ka,
  env->regs[R_ESP] = frame_addr;
  env->eip = ka->_sa_handler;
  
+/* Make -mregparm=3 work */

+env->regs[R_EAX] = sig;
+env->regs[R_EDX] = 0;
+env->regs[R_ECX] = 0;


Perhaps also move here the

__put_user(sig, &frame->sig);

from above, for consistency with setup_rt_frame?
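
To make the suggestion concrete, here is a minimal sketch of storing the handler arguments both in registers (where `-mregparm=3` code expects them: EAX, EDX, ECX) and in the frame (for the standard stack convention), side by side. `MiniFrame`, `MiniRegs` and `set_handler_args` are hypothetical stand-ins, not QEMU types, and the offsets passed in are arbitrary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the argument slots of a 32-bit signal frame. */
typedef struct {
    uint32_t sig;
    uint32_t pinfo;
    uint32_t puc;
} MiniFrame;

/* Hypothetical stand-in for the three registers used by -mregparm=3. */
typedef struct {
    uint32_t eax;
    uint32_t edx;
    uint32_t ecx;
} MiniRegs;

/* Write each argument to its register and to its frame slot together. */
static void set_handler_args(MiniRegs *regs, MiniFrame *frame,
                             uint32_t frame_addr, uint32_t sig,
                             uint32_t info_offset, uint32_t uc_offset)
{
    assert(regs != NULL && frame != NULL);

    regs->eax = sig;
    frame->sig = regs->eax;

    regs->edx = frame_addr + info_offset;
    frame->pinfo = regs->edx;

    regs->ecx = frame_addr + uc_offset;
    frame->puc = regs->ecx;
}
```

Keeping each register store next to the matching frame store makes it harder for the two calling conventions to drift apart.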

Paolo


  cpu_x86_load_seg(env, R_DS, __USER_DS);
  cpu_x86_load_seg(env, R_ES, __USER_DS);
  cpu_x86_load_seg(env, R_SS, __USER_DS);
@@ -448,9 +453,6 @@ void setup_rt_frame(int sig, struct target_sigaction *ka,
  target_sigset_t *set, CPUX86State *env)
  {
  abi_ulong frame_addr;
-#ifndef TARGET_X86_64
-abi_ulong addr;
-#endif
  struct rt_sigframe *frame;
  int i;
  
@@ -460,14 +462,6 @@ void setup_rt_frame(int sig, struct target_sigaction *ka,

  if (!lock_user_struct(VERIFY_WRITE, frame, frame_addr, 0))
  goto give_sigsegv;
  
-/* These fields are only in rt_sigframe on 32 bit */

-#ifndef TARGET_X86_64
-__put_user(sig, &frame->sig);
-addr = frame_addr + offsetof(struct rt_sigframe, info);
-__put_user(addr, &frame->pinfo);
-addr = frame_addr + offsetof(struct rt_sigframe, uc);
-__put_user(addr, &frame->puc);
-#endif
  if (ka->sa_flags & TARGET_SA_SIGINFO) {
  frame->info = *info;
  }
@@ -507,9 +501,13 @@ void setup_rt_frame(int sig, struct target_sigaction *ka,
  env->eip = ka->_sa_handler;
  
  #ifndef TARGET_X86_64

+/* Store arguments for both -mregparm=3 and standard. */
  env->regs[R_EAX] = sig;
+__put_user(sig, &frame->sig);
  env->regs[R_EDX] = frame_addr + offsetof(struct rt_sigframe, info);
+__put_user(env->regs[R_EDX], &frame->pinfo);
  env->regs[R_ECX] = frame_addr + offsetof(struct rt_sigframe, uc);
+__put_user(env->regs[R_ECX], &frame->puc);
  #else
  env->regs[R_EAX] = 0;
  env->regs[R_EDI] = sig;





Re: [PATCH v2 01/28] target/i386: Add tcg/access.[ch]

2024-04-09 Thread Paolo Bonzini

On 4/9/24 07:02, Richard Henderson wrote:

Provide a method to amortize page lookup across large blocks.

Signed-off-by: Richard Henderson 
---
  target/i386/tcg/access.h|  40 +
  target/i386/tcg/access.c| 160 
  target/i386/tcg/meson.build |   1 +
  3 files changed, 201 insertions(+)
  create mode 100644 target/i386/tcg/access.h
  create mode 100644 target/i386/tcg/access.c

diff --git a/target/i386/tcg/access.h b/target/i386/tcg/access.h
new file mode 100644
index 00..d70808a3a3
--- /dev/null
+++ b/target/i386/tcg/access.h
@@ -0,0 +1,40 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/* Access guest memory in blocks. */
+
+#ifndef X86_TCG_ACCESS_H
+#define X86_TCG_ACCESS_H
+
+/* An access covers at most sizeof(X86XSaveArea), at most 2 pages. */
+typedef struct X86Access {
+target_ulong vaddr;
+void *haddr1;
+void *haddr2;
+uint16_t size;
+uint16_t size1;
+/*
+ * If we can't access the host page directly, we'll have to do I/O access
+ * via ld/st helpers. These are internal details, so we store the rest
+ * to do the access here instead of passing it around in the helpers.
+ */
+int mmu_idx;
+CPUX86State *env;
+uintptr_t ra;
+} X86Access;
+
+void access_prepare_mmu(X86Access *ret, CPUX86State *env,
+vaddr vaddr, unsigned size,
+MMUAccessType type, int mmu_idx, uintptr_t ra);
+void access_prepare(X86Access *ret, CPUX86State *env, vaddr vaddr,
+unsigned size, MMUAccessType type, uintptr_t ra);
+
+uint8_t  access_ldb(X86Access *ac, vaddr addr);
+uint16_t access_ldw(X86Access *ac, vaddr addr);
+uint32_t access_ldl(X86Access *ac, vaddr addr);
+uint64_t access_ldq(X86Access *ac, vaddr addr);
+
+void access_stb(X86Access *ac, vaddr addr, uint8_t val);
+void access_stw(X86Access *ac, vaddr addr, uint16_t val);
+void access_stl(X86Access *ac, vaddr addr, uint32_t val);
+void access_stq(X86Access *ac, vaddr addr, uint64_t val);
+
+#endif
diff --git a/target/i386/tcg/access.c b/target/i386/tcg/access.c
new file mode 100644
index 00..8b70f3244b
--- /dev/null
+++ b/target/i386/tcg/access.c
@@ -0,0 +1,160 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/* Access guest memory in blocks. */
+
+#include "qemu/osdep.h"
+#include "cpu.h"
+#include "exec/cpu_ldst.h"
+#include "exec/exec-all.h"
+#include "access.h"
+
+
+void access_prepare_mmu(X86Access *ret, CPUX86State *env,
+vaddr vaddr, unsigned size,
+MMUAccessType type, int mmu_idx, uintptr_t ra)
+{
+int size1, size2;
+void *haddr1, *haddr2;
+
+assert(size > 0 && size <= TARGET_PAGE_SIZE);
+
+size1 = MIN(size, -(vaddr | TARGET_PAGE_MASK)),
+size2 = size - size1;
+
+memset(ret, 0, sizeof(*ret));
+ret->vaddr = vaddr;
+ret->size = size;
+ret->size1 = size1;
+ret->mmu_idx = mmu_idx;
+ret->env = env;
+ret->ra = ra;
+
+haddr1 = probe_access(env, vaddr, size1, type, mmu_idx, ra);
+ret->haddr1 = haddr1;
+
+if (unlikely(size2)) {
+haddr2 = probe_access(env, vaddr + size1, size2, type, mmu_idx, ra);
+if (haddr2 == haddr1 + size1) {
+ret->size1 = size;
+} else {
+ret->haddr2 = haddr2;
+}
+}


Should there be an assert(!ret->haddr2) here for the CONFIG_USER_ONLY 
case, or alternatively a g_assert_unreachable() in the "else" above?
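
For readers following the size computation in the hunk above, this small sketch shows how the expression `MIN(size, -(vaddr | TARGET_PAGE_MASK))` splits an access at a page boundary. It assumes a 4 KiB page; `SKETCH_PAGE_SIZE` and `split_at_page` are hypothetical names used only here.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption for this sketch: 4 KiB pages. */
#define SKETCH_PAGE_SIZE 4096u
#define SKETCH_PAGE_MASK (~(uint64_t)(SKETCH_PAGE_SIZE - 1))

/*
 * Split an access of `size` bytes at `vaddr` into the part that fits
 * in the first page (size1) and the remainder on the next page (size2).
 */
static void split_at_page(uint64_t vaddr, unsigned size,
                          unsigned *size1, unsigned *size2)
{
    /* Bytes from vaddr up to the end of its page (1..SKETCH_PAGE_SIZE). */
    uint64_t to_page_end = (uint64_t)0 - (vaddr | SKETCH_PAGE_MASK);

    assert(size > 0 && size <= SKETCH_PAGE_SIZE);

    *size1 = size < to_page_end ? size : (unsigned)to_page_end;
    *size2 = size - *size1;
}
```

A nonzero `size2` is the only case in which a second host pointer is looked up at all, which is the situation the question above is about.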



+}
+
+void access_prepare(X86Access *ret, CPUX86State *env, vaddr vaddr,
+unsigned size, MMUAccessType type, uintptr_t ra)
+{
+int mmu_idx = cpu_mmu_index(env_cpu(env), false);
+access_prepare_mmu(ret, env, vaddr, size, type, mmu_idx, ra);
+}
+
+static void *access_ptr(X86Access *ac, vaddr addr, unsigned len)
+{
+vaddr offset = addr - ac->vaddr;
+
+assert(addr >= ac->vaddr);
+
+#ifdef CONFIG_USER_ONLY
+assert(offset <= ac->size1 - len);
+return ac->haddr1 + offset;
+#else
+if (likely(offset <= ac->size1 - len)) {
+return ac->haddr1;
+}
+assert(offset <= ac->size - len);
+if (likely(offset >= ac->size1)) {
+return ac->haddr2;
+}


I think the returns should be (respectively) ac->haddr1 + offset and 
ac->haddr2 + (offset - ac->size1)?


Also I would add a comment above the second "if", like

/*
 * If the address is not naturally aligned, it might span
 * both pages.  Only return ac->haddr2 if the area is
 * entirely within the second page, otherwise fall back
 * to slow accesses.
 */
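
A self-contained sketch of the pointer selection with both suggestions applied might look like the following. `MiniAccess` and `mini_access_ptr` are hypothetical simplifications of the structure in the patch, and returning NULL here merely stands in for "fall back to slow accesses"; the real code may signal that differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced version of the access descriptor. */
typedef struct {
    uint64_t vaddr;   /* guest address of the start of the block */
    uint8_t *haddr1;  /* host pointer for the first page */
    uint8_t *haddr2;  /* host pointer for the second page, if any */
    uint16_t size;    /* total size of the block */
    uint16_t size1;   /* bytes of the block that lie in the first page */
} MiniAccess;

static void *mini_access_ptr(const MiniAccess *ac, uint64_t addr, unsigned len)
{
    uint64_t offset = addr - ac->vaddr;

    assert(addr >= ac->vaddr);
    assert(len <= ac->size && offset <= ac->size - len);

    /* Entirely within the first page. */
    if (len <= ac->size1 && offset <= ac->size1 - len) {
        return ac->haddr1 + offset;
    }
    /*
     * If the address is not naturally aligned, it might span both
     * pages.  Only use haddr2 if the area is entirely within the
     * second page, otherwise fall back to slow accesses.
     */
    if (offset >= ac->size1) {
        return ac->haddr2 + (offset - ac->size1);
    }
    return NULL;
}
```

The second return subtracts `size1` because `haddr2` already points at the first byte of the second page.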

Paolo


+uint8_t access_ldb(X86Access *ac, vaddr addr)
+{
+void *p = access_ptr(ac, addr, sizeof(uint8_t));
+
+if (test_ptr(p)) {
+return ldub_p(p);
+}
+return cpu_ldub_mmuidx_ra(ac->env, addr, ac->mmu_idx, ac->ra);
+}
+
+uint16_t access_ldw(X86Access *ac, vaddr addr)
+{
+void *p = access_ptr(ac, addr, sizeof(uint16_t));
+
+if (test_ptr(p)) {
+return lduw_le_p(p);
+}

Re: [PATCH] target/i386: fix direction of "32-bit MMU" test

2024-04-08 Thread Paolo Bonzini
On Fri, Apr 5, 2024, 19:30 Michael Tokarev  wrote:

> 01.04.2024 09:02, Michael Tokarev:
>
> > Anyone can guess why this rather trivial and obviously correct patch
> causes segfaults
> > in a few tests in staging-7.2 - when run in tcg mode, namely:
> >
> >pxe-test
> >migration-test
> >boot-serial-test
> >bios-tables-test
> >vmgenid-test
> >cdrom-test
> >
> > When reverting this single commit from staging-7.2, it all works fine
> again.
>
> It sigsegvs in probe_access_internal():
>
>CPUTLBEntry *entry = tlb_entry(env, mmu_idx, addr); -- this one returns
> NULL,
>
> and next there's a call
>
>tlb_addr = tlb_read_ofs(entry, elt_ofs);
>
> which fails.
>

I will take a look tomorrow.

Paolo


> #0  0x55c5de8a in tlb_read_ofs (ofs=8, entry=0x0) at
> 7.2/accel/tcg/cputlb.c:1455
> #1  probe_access_internal
>  (env=0x56a862a0, addr=4294967280, fault_size=fault_size@entry=1,
> access_type=access_type@entry=MMU_INST_FETCH, mmu_idx=5,
> nonfault=nonfault@entry=false, phost=0x7fffea4d32a0,
> pfull=0x7fffea4d3298, retaddr=0)
>  at 7.2/accel/tcg/cputlb.c:1555
> #2  0x55c62aba in get_page_addr_code_hostp
>  (env=, addr=addr@entry=4294967280, hostp=hostp@entry
> =0x0)
>  at 7.2/accel/tcg/cputlb.c:1691
> #3  0x55c52b54 in get_page_addr_code (addr=4294967280,
> env=)
>  at 7.2/include/exec/exec-all.h:714
> #4  tb_htable_lookup
>  (cpu=cpu@entry=0x56a85530, pc=pc@entry=4294967280,
> cs_base=cs_base@entry=4294901760, flags=flags@entry=64,
> cflags=cflags@entry=4278190080) at
> 7.2/accel/tcg/cpu-exec.c:236
> #5  0x55c53e8e in tb_lookup
>  (cflags=4278190080, flags=64, cs_base=4294901760, pc=4294967280,
> cpu=0x56a85530)
>  at 7.2/accel/tcg/cpu-exec.c:270
> #6  cpu_exec (cpu=cpu@entry=0x56a85530) at
> 7.2/accel/tcg/cpu-exec.c:1001
> #7  0x55c75d2f in tcg_cpus_exec (cpu=cpu@entry=0x56a85530)
>  at 7.2/accel/tcg/tcg-accel-ops.c:69
> #8  0x55c75e80 in mttcg_cpu_thread_fn (arg=arg@entry
> =0x56a85530)
>  at 7.2/accel/tcg/tcg-accel-ops-mttcg.c:95
> #9  0x55ded098 in qemu_thread_start (args=0x56adac40)
>  at 7.2/util/qemu-thread-posix.c:505
> #10 0x75793134 in start_thread (arg=)
> #11 0x758137dc in clone3 ()
>
>
> I'm removing this whole set from 7.2 for now:
>
>   2cc68629a6fc target/i386: fix direction of "32-bit MMU" test
>   90f641531c78 target/i386: use separate MMU indexes for 32-bit accesses
>   5f97afe2543f target/i386: introduce function to query MMU indices
>
> This leaves us with
>
>   b1661801c184 "target/i386: Fix physical address truncation"
>
> but without its fix, 2cc68629a6fc.
>
> It looks like I should revert b1661801c184 from 7.2 too, re-opening
> https://gitlab.com/qemu-project/qemu/-/issues/2040 - since to me it isn't
> clear if this change actually fixes this issue or not without the
> previous change, 90f641531c78, which is missing from 7.2.10.
>
> At the very least this will simplify possible another attempt to
> cherry-pick
> these changes to 7.2.
>
> Thanks,
>
> /mjt
>
>


Re: [PATCH] Revert "hw/virtio: Add support for VDPA network simulation devices"

2024-04-08 Thread Paolo Bonzini
On Mon, Apr 8, 2024, 12:18 Michael S. Tsirkin  wrote:

> On Mon, Apr 08, 2024 at 10:51:57AM +0100, Peter Maydell wrote:
> > On Mon, 8 Apr 2024 at 10:48, Michael S. Tsirkin  wrote:
> > >
> > > This reverts commit cd341fd1ffded978b2aa0b5309b00be7c42e347c.
> > >
> > > The patch adds non-upstream code in
> > > include/standard-headers/linux/virtio_pci.h
> > > which would make maintenance harder.
> > >
> > > Revert for now.
>

As long as it is part of the spec, why not just move the problematic parts
to a QEMU specific header? As far as I understand the kernel is never going
to consume these constants anyway.

Paolo

> > Suggested-by: Jason Wang 
> > > Signed-off-by: Michael S. Tsirkin 
> >
> > Are you intending to target this revert for 9.0 ?
> >
> > -- PMM
>
> Yes.
>
>


[PULL 0/3] 9.0 bugfixes for 2024-04-08

2024-04-08 Thread Paolo Bonzini
The following changes since commit ce64e6224affb8b4e4b019f76d2950270b391af5:

  Merge tag 'qemu-sparc-20240404' of https://github.com/mcayland/qemu into 
staging (2024-04-04 15:28:06 +0100)

are available in the Git repository at:

  https://gitlab.com/bonzini/qemu.git tags/for-upstream

for you to fetch changes up to e34f4d87e8d47b0a65cb663aaf7bef60c2112d36:

  kvm: error out of kvm_irqchip_add_msi_route() in case of full route table 
(2024-04-08 21:22:00 +0200)


* fall back to non-ioeventfd notification if KVM routing table is full
* support kitware ninja with jobserver support
* nanomips: fix warnings with GCC 14


Igor Mammedov (1):
  kvm: error out of kvm_irqchip_add_msi_route() in case of full route table

Martin Hundebøll (1):
  Makefile: preserve --jobserver-auth argument when calling ninja

Paolo Bonzini (1):
  nanomips: fix warnings with GCC 14

 Makefile|   2 +-
 accel/kvm/kvm-all.c |  15 ++--
 disas/nanomips.c| 194 ++--
 3 files changed, 108 insertions(+), 103 deletions(-)
-- 
2.44.0




[PULL 2/3] nanomips: fix warnings with GCC 14

2024-04-08 Thread Paolo Bonzini
GCC 14 shows -Wshadow=local warnings if an enum conflicts with a local
variable (including a parameter).  To avoid this, move the problematic
enum and all of its dependencies after the hundreds of functions that
have a parameter named "instruction".
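
The reordering described above can be pictured with a reduced example. This is only a sketch: the names ending in `_sketch` and `SketchEntryType` are hypothetical, and whether a given compiler version warns on the opposite ordering is taken from the description above rather than demonstrated here.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Defined first: at this point no enumerator named "instruction"
 * has been declared, so the parameter name conflicts with nothing.
 */
static uint64_t extract_code_18_to_0_sketch(uint64_t instruction)
{
    return instruction & 0x7ffff;   /* keep bits 18..0 */
}

/* Declared only after the functions that use "instruction" as a parameter. */
typedef enum {
    instruction,
    call_instruction,
    pool,
} SketchEntryType;

/* Code that needs the enumerators comes after the enum declaration. */
static int is_plain_instruction_sketch(SketchEntryType type)
{
    assert(type <= pool);
    return type == instruction;
}
```

The functions themselves are unchanged; only the position of the enum relative to them moves, which is why the diffstat shows equal insertions and deletions.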

Reviewed-by: Richard Henderson 
Signed-off-by: Paolo Bonzini 
---
 disas/nanomips.c | 194 +++
 1 file changed, 97 insertions(+), 97 deletions(-)

diff --git a/disas/nanomips.c b/disas/nanomips.c
index a0253598dd6..db0c297b8dc 100644
--- a/disas/nanomips.c
+++ b/disas/nanomips.c
@@ -36,35 +36,6 @@ typedef uint32_t uint32;
 typedef uint16_t uint16;
 typedef uint64_t img_address;
 
-typedef enum  {
-instruction,
-call_instruction,
-branch_instruction,
-return_instruction,
-reserved_block,
-pool,
-} TABLE_ENTRY_TYPE;
-
-typedef enum {
-MIPS64_= 0x0001,
-XNP_   = 0x0002,
-XMMS_  = 0x0004,
-EVA_   = 0x0008,
-DSP_   = 0x0010,
-MT_= 0x0020,
-EJTAG_ = 0x0040,
-TLBINV_= 0x0080,
-CP0_   = 0x0100,
-CP1_   = 0x0200,
-CP2_   = 0x0400,
-UDI_   = 0x0800,
-MCU_   = 0x1000,
-VZ_= 0x2000,
-TLB_   = 0x4000,
-MVH_   = 0x8000,
-ALL_ATTRIBUTES = 0xull,
-} TABLE_ATTRIBUTE_TYPE;
-
 typedef struct Dis_info {
   img_address m_pc;
   fprintf_function fprintf_func;
@@ -72,22 +43,6 @@ typedef struct Dis_info {
   sigjmp_buf buf;
 } Dis_info;
 
-typedef bool (*conditional_function)(uint64 instruction);
-typedef char * (*disassembly_function)(uint64 instruction,
-Dis_info *info);
-
-typedef struct Pool {
-TABLE_ENTRY_TYPE type;
-const struct Pool*next_table;
-int  next_table_size;
-int  instructions_size;
-uint64   mask;
-uint64   value;
-disassembly_function disassembly;
-conditional_function condition;
-uint64   attributes;
-} Pool;
-
 #define IMGASSERTONCE(test)
 
 
@@ -544,58 +499,6 @@ static uint64 extract_op_code_value(const uint16 *data, int size)
 }
 
 
-/*
- * Recurse through tables until the instruction is found then return
- * the string and size
- *
- * inputs:
- *  pointer to a word stream,
- *  disassember table and size
- * returns:
- *  instruction size- negative is error
- *  disassembly string  - on error will constain error string
- */
-static int Disassemble(const uint16 *data, char **dis,
- TABLE_ENTRY_TYPE *type, const Pool *table,
- int table_size, Dis_info *info)
-{
-for (int i = 0; i < table_size; i++) {
-uint64 op_code = extract_op_code_value(data,
- table[i].instructions_size);
-if ((op_code & table[i].mask) == table[i].value) {
-/* possible match */
-conditional_function cond = table[i].condition;
-if ((cond == NULL) || cond(op_code)) {
-if (table[i].type == pool) {
-return Disassemble(data, dis, type,
-   table[i].next_table,
-   table[i].next_table_size,
-   info);
-} else if ((table[i].type == instruction) ||
-   (table[i].type == call_instruction) ||
-   (table[i].type == branch_instruction) ||
-   (table[i].type == return_instruction)) {
-disassembly_function dis_fn = table[i].disassembly;
-if (dis_fn == 0) {
-*dis = g_strdup(
-"disassembler failure - bad table entry");
-return -6;
-}
-*type = table[i].type;
-*dis = dis_fn(op_code, info);
-return table[i].instructions_size;
-} else {
-*dis = g_strdup("reserved instruction");
-return -2;
-}
-}
-}
-}
-*dis = g_strdup("failed to disassemble");
-return -1;  /* failed to disassemble*/
-}
-
-
 static uint64 extract_code_18_to_0(uint64 instruction)
 {
 uint64 value = 0;
@@ -16213,6 +16116,51 @@ static char *YIELD(uint64 instruction, Dis_info *info)
  *
  */
 
+typedef enum  {
+instruction,
+call_instruction,
+branch_instruction,
+return_instruction,
+reserved_block,
+pool,
+} TABLE_ENTRY_TYPE;
+
+typedef enum {
+MIPS64_= 0x0001,
+XNP_   = 0x0002,
+XMMS_  = 0x0004,
+EVA_   = 0x0008,
+DSP_   = 0x0010,
+MT_= 0x0020,
+EJTAG_ = 0x0040,
+TLBINV

[PULL 3/3] kvm: error out of kvm_irqchip_add_msi_route() in case of full route table

2024-04-08 Thread Paolo Bonzini
From: Igor Mammedov 

kvm_irqchip_add_msi_route() calls kvm_add_routing_entry(), which simply extends
  KVMState::irq_routes::entries[]
but doesn't check whether the number of routes goes beyond the limit the
kernel is willing to accept. This later leads to the assertion failure

  qemu-kvm: ../accel/kvm/kvm-all.c:1833: kvm_irqchip_commit_routes: Assertion `ret == 0' failed

Typically this happens during guest boot for a large enough guest.

Reproduced with:
  ./qemu --enable-kvm -m 8G -smp 64 -machine pc \
 `for b in {1..2}; do echo -n "-device pci-bridge,id=pci$b,chassis_nr=$b ";
for i in {0..31}; do touch /tmp/vblk$b$i;
   echo -n "-drive file=/tmp/vblk$b$i,if=none,id=drive$b$i,format=raw
-device virtio-blk-pci,drive=drive$b$i,bus=pci$b ";
  done; done`

While a crash at boot time is bad, the same might happen at hotplug time,
which is unacceptable.
So instead of calling kvm_add_routing_entry() unconditionally, check first
that the number of routes won't exceed KVM_CAP_IRQ_ROUTING. This way the
virtio device, instead of killing qemu, will gracefully fail to initialize
as expected, with the following warnings on the console:
virtio-blk failed to set guest notifier (-28), ensure -accel kvm is set.
virtio_bus_start_ioeventfd: failed. Fallback to userspace (slower).
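
The check-before-add pattern described above can be sketched in isolation as follows. `SketchRoutes`, `sketch_alloc_virq` and `sketch_add_msi_route` are hypothetical and model only the bookkeeping, not the KVM ioctl interface. On Linux, ENOSPC is 28, which matches the `-28` in the warning.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_MAX_VIRQ 8

/* Hypothetical, reduced routing state. */
typedef struct {
    int nr;                           /* routes currently in the table */
    int limit;                        /* routes the "kernel" will accept */
    bool virq_used[SKETCH_MAX_VIRQ];  /* allocated virq numbers */
} SketchRoutes;

/* Reserve the lowest free virq, or fail if none is left. */
static int sketch_alloc_virq(SketchRoutes *s)
{
    for (int i = 0; i < SKETCH_MAX_VIRQ; i++) {
        if (!s->virq_used[i]) {
            s->virq_used[i] = true;
            return i;
        }
    }
    return -ENOSPC;
}

/* Add a route only if the table has room; otherwise undo and report. */
static int sketch_add_msi_route(SketchRoutes *s)
{
    int virq;

    assert(s != NULL);
    virq = sketch_alloc_virq(s);
    if (virq < 0) {
        return virq;
    }
    if (s->nr < s->limit) {
        s->nr++;
        return virq;
    }
    /* Table is full: release the virq instead of overflowing the table. */
    s->virq_used[virq] = false;
    return -ENOSPC;
}
```

The caller gets an ordinary error code it can handle, for example by falling back to userspace notification, rather than hitting an assertion later when the routes are committed.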

Signed-off-by: Igor Mammedov 
Message-ID: <20240408110956.451558-1-imamm...@redhat.com>
Signed-off-by: Paolo Bonzini 
---
 accel/kvm/kvm-all.c | 15 ++-
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index a8cecd040eb..931f74256e8 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -1999,12 +1999,17 @@ int kvm_irqchip_add_msi_route(KVMRouteChange *c, int vector, PCIDevice *dev)
 return -EINVAL;
 }
 
-trace_kvm_irqchip_add_msi_route(dev ? dev->name : (char *)"N/A",
-vector, virq);
+if (s->irq_routes->nr < s->gsi_count) {
+trace_kvm_irqchip_add_msi_route(dev ? dev->name : (char *)"N/A",
+vector, virq);
 
-kvm_add_routing_entry(s, );
-kvm_arch_add_msi_route_post(, vector, dev);
-c->changes++;
+kvm_add_routing_entry(s, );
+kvm_arch_add_msi_route_post(, vector, dev);
+c->changes++;
+} else {
+kvm_irqchip_release_virq(s, virq);
+return -ENOSPC;
+}
 
 return virq;
 }
-- 
2.44.0



