date:20240312

Re: [PATCH v2 00/13] Cleanup on SMP and its test

2024-03-12 Thread Zhao Liu

Hi Philippe,

On Sat, Mar 09, 2024 at 02:49:17PM +0100, Philippe Mathieu-Daudé wrote:
> Date: Sat, 9 Mar 2024 14:49:17 +0100
> From: Philippe Mathieu-Daudé 
> Subject: Re: [PATCH v2 00/13] Cleanup on SMP and its test
> 
> On 9/3/24 01:46, Zhao Liu wrote:
> > Hi Philippe,
> > 
> > > 
> > > Can you share your base commit please?
> > > 
> > > Applying: hw/core/machine-smp: Remove deprecated "parameter=0" SMP
> > > configurations
> > > Applying: hw/core/machine-smp: Deprecate unsupported "parameter=1" SMP
> > > configurations
> > > error: patch failed: docs/about/deprecated.rst:47
> > > error: docs/about/deprecated.rst: patch does not apply
> > > Patch failed at 0002 hw/core/machine-smp: Deprecate unsupported
> > > "parameter=1" SMP configurations
> > > 
> > 
> > The base commit is e1007b6bab5cf ("Merge tag 'pull-request-2024-03-01'
> > of https://gitlab.com/thuth/qemu into staging").
> > 
> > But I think this conflict is because of the first 4 patches of module
> > series you picked. Let me rebase this series on that module series and
> > refresh a v3.
> 
> Ah no, it is due to commit 01e449809b ("*-user: Deprecate and
> disable -p pagesize").
> 
> No need to respin this series, I queued it in favor of the 4 other
> patches.

In the commit 54c4ea8f3ae6 ("hw/core/machine-smp: Deprecate unsupported
'parameter=1' SMP configurations"), the smp related thing is put under
the section "User-mode emulator command line arguments" instead of "System
emulator command line arguments".

Is this not quite right...or does it need to be fixed? If so I can tweak
and clean it up with a minor patch. ;-)

Thanks,
Zhao

Re: [PATCH v5] pc: q35: Bump max_cpus to 4096 vcpus

2024-03-12 Thread Ani Sinha




> On 28-Feb-2024, at 20:03, Ani Sinha  wrote:
> 
> Since commit f10a570b093e6 ("KVM: x86: Add CONFIG_KVM_MAX_NR_VCPUS to allow
> up to 4096 vCPUs"), the Linux kernel can support up to 4096 vcpus when MAXSMP
> is enabled in the kernel. At present, QEMU has been tested to correctly boot a
> Linux guest with 4096 vcpus using the current edk2 upstream master branch that
> has the fixes corresponding to the following two PRs:
> 
> https://github.com/tianocore/edk2/pull/5410
> https://github.com/tianocore/edk2/pull/5418
> 
> The changes merged into edk2 with the above PRs will be in the upcoming
> 2024-05 release. With current seabios firmware, it boots fine with 4096 vcpus
> already. So bump up the value of max_cpus to 4096 for q35 machine versions 9
> and newer. Q35 machine versions 8.2 and older continue to support a maximum
> of 1024 vcpus as before for compatibility reasons.
> 
> If KVM is not able to support the specified number of vcpus, QEMU would
> return the following error messages:
> 
> $ ./qemu-system-x86_64 -cpu host -accel kvm -machine q35 -smp 1728
> qemu-system-x86_64: -accel kvm: warning: Number of SMP cpus requested (1728) exceeds the recommended cpus supported by KVM (12)
> qemu-system-x86_64: -accel kvm: warning: Number of hotpluggable cpus requested (1728) exceeds the recommended cpus supported by KVM (12)
> Number of SMP cpus requested (1728) exceeds the maximum cpus supported by KVM (1024)
> 
> Cc: Daniel P. Berrangé 
> Cc: Igor Mammedov 
> Cc: Michael S. Tsirkin 
> Cc: Julia Suvorova 
> Cc: kra...@redhat.com
> Reviewed-by: Daniel P. Berrangé 
> Reviewed-by: Igor Mammedov 
> Reviewed-by: Gerd Hoffmann 
> Signed-off-by: Ani Sinha 

Ping .. who is picking this up? The soft code freeze starts today?

> ---
> hw/i386/pc_q35.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
> 
> Changelog:
> v5: mention the PRs in the commit message. Add tags.
> v4: tweaked commit message as per suggestion from danpb explicitly
> stating that 4096 vcpus work with edk2 fixes that are going to be
> available in the coming edk2 release.
> v3: bump up to 4096 vcpus. It has now been tested to work with edk2.
> See RH Jira: https://issues.redhat.com/browse/RHEL-22202
> v2: bump up the vcpu number to 1856. Add failure messages from edk2 in
> the commit description.
> 
> diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
> index 45a4102e75..df63a92b78 100644
> --- a/hw/i386/pc_q35.c
> +++ b/hw/i386/pc_q35.c
> @@ -350,7 +350,7 @@ static void pc_q35_machine_options(MachineClass *m)
> m->default_nic = "e1000e";
> m->default_kernel_irqchip_split = false;
> m->no_floppy = 1;
> -m->max_cpus = 1024;
> +m->max_cpus = 4096;
> m->no_parallel = !module_object_class_by_name(TYPE_ISA_PARALLEL);
> machine_class_allow_dynamic_sysbus_dev(m, TYPE_AMD_IOMMU_DEVICE);
> machine_class_allow_dynamic_sysbus_dev(m, TYPE_INTEL_IOMMU_DEVICE);
> @@ -371,6 +371,7 @@ static void pc_q35_8_2_machine_options(MachineClass *m)
> {
> pc_q35_9_0_machine_options(m);
> m->alias = NULL;
> +m->max_cpus = 1024;
> compat_props_add(m->compat_props, hw_compat_8_2, hw_compat_8_2_len);
> compat_props_add(m->compat_props, pc_compat_8_2, pc_compat_8_2_len);
> }
> -- 
> 2.42.0
>

Re: [PATCH] error: Move ERRP_GUARD() to the beginning of the function

2024-03-12 Thread Markus Armbruster

Zhao Liu  writes:

> From: Zhao Liu 
>
> Since the commit 05e385d2a9 ("error: Move ERRP_GUARD() to the beginning
> of the function"), new code has been added that doesn't put ERRP_GUARD()
> at the beginning of the function.
>
> As stated in the commit 05e385d2a9: "include/qapi/error.h advises to put
> ERRP_GUARD() right at the beginning of the function, because only then
> can it guard the whole function.", so clean up the few spots
> disregarding the advice.
>
> Inspired-by: Markus Armbruster 
> Signed-off-by: Zhao Liu 

Reviewed-by: Markus Armbruster

[PATCH] docs/about/deprecated.rst: Move SMP configurations item to system emulator section

2024-03-12 Thread Zhao Liu

From: Zhao Liu 

In the commit 54c4ea8f3ae6 ("hw/core/machine-smp: Deprecate unsupported
'parameter=1' SMP configurations"), the SMP-related item was put under
the section "User-mode emulator command line arguments" instead of
"System emulator command line arguments".

-smp is a system emulator command line argument, so move the SMP
configurations item to the system emulator section.

Signed-off-by: Zhao Liu 
---
based on 7489f7f3f81d.

Note: the git diff understood my move of SMP item as the move of the
whole "User-mode emulator command line arguments" section, which may
cause confusion about the contents of this patch.
---
 docs/about/deprecated.rst | 20 ++--
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/docs/about/deprecated.rst b/docs/about/deprecated.rst
index dfd681cd024e..2f9277c9158c 100644
--- a/docs/about/deprecated.rst
+++ b/docs/about/deprecated.rst
@@ -47,16 +47,6 @@ as short-form boolean values, and passed to plugins as ``arg_name=on``.
 However, short-form booleans are deprecated and full explicit ``arg_name=on``
 form is preferred.
 
-User-mode emulator command line arguments
--
-
-``-p`` (since 9.0)
-''
-
-The ``-p`` option pretends to control the host page size.  However,
-it is not possible to change the host page size, and using the
-option only causes failures.
-
 ``-smp`` (Unsupported "parameter=1" SMP configurations) (since 9.0)
 '''
 
@@ -71,6 +61,16 @@ configurations (e.g. -smp drawers=1,books=1,clusters=1 for x86 PC machine) is
 marked deprecated since 9.0, users have to ensure that all the topology members
 described with -smp are supported by the target machine.
 
+User-mode emulator command line arguments
+-
+
+``-p`` (since 9.0)
+''
+
+The ``-p`` option pretends to control the host page size.  However,
+it is not possible to change the host page size, and using the
+option only causes failures.
+
 QEMU Machine Protocol (QMP) commands
 
 
-- 
2.34.1

[PATCH v2 1/1] memory tier: acpi/hmat: create CPUless memory tiers after obtaining HMAT info

2024-03-12 Thread Ho-Ren (Jack) Chuang

The current implementation treats emulated memory devices, such as
CXL1.1 type3 memory, as normal DRAM when they are emulated as normal memory
(E820_TYPE_RAM). However, these emulated devices have different
characteristics than traditional DRAM, making it important to
distinguish them. Thus, we modify the tiered memory initialization process
to introduce a delay specifically for CPUless NUMA nodes. This delay
ensures that the memory tier initialization for these nodes is deferred
until HMAT information is obtained during the boot process. Finally,
demotion tables are recalculated at the end.

* Abstract common functions into `find_alloc_memory_type()`
Since different memory devices require finding or allocating a memory type,
these common steps are abstracted into a single function,
`find_alloc_memory_type()`, enhancing code scalability and conciseness.

* Handle cases where there is no HMAT when creating memory tiers
There is a scenario where a CPUless node does not provide HMAT information.
If no HMAT is specified, it falls back to using the default DRAM tier.

* Change adist calculation code to use another new lock, mt_perf_lock.
In the current implementation, iterating through CPUless nodes requires
holding the `memory_tier_lock`. However, `mt_calc_adistance()` will end up
trying to acquire the same lock, leading to a potential deadlock.
Therefore, we propose introducing a standalone `mt_perf_lock` to protect
`default_dram_perf`. This approach not only avoids deadlock but also
prevents holding a large lock simultaneously.
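
The find-or-allocate step from the first bullet can be modelled outside the kernel. The sketch below is a plain Python stand-in, not kernel code: `MemoryDevType`, the list argument, and the adistance values 100 and 200 are illustrative assumptions. The real helper works on `struct memory_dev_type` and a list head, and the caller is expected to hold the lock that protects that list.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MemoryDevType:
    """Illustrative stand-in for the kernel's struct memory_dev_type."""
    adistance: int


def find_alloc_memory_type(adist: int,
                           memory_types: List[MemoryDevType]) -> MemoryDevType:
    """Return the entry whose adistance matches, allocating one if absent."""
    for mtype in memory_types:
        if mtype.adistance == adist:
            return mtype
    # No match: allocate a new type and remember it on the caller's list.
    mtype = MemoryDevType(adist)
    memory_types.append(mtype)
    return mtype


# Each caller (kmem, hmat) keeps its own list and passes it in.
kmem_types: List[MemoryDevType] = []
first = find_alloc_memory_type(100, kmem_types)
again = find_alloc_memory_type(100, kmem_types)
other = find_alloc_memory_type(200, kmem_types)
```

A second lookup with the same adistance returns the existing object, so each list holds at most one type per distinct adistance.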

Signed-off-by: Ho-Ren (Jack) Chuang 
Signed-off-by: Hao Xiang 
---
 drivers/acpi/numa/hmat.c | 11 ++
 drivers/dax/kmem.c   | 13 +--
 include/linux/acpi.h |  6 
 include/linux/memory-tiers.h |  8 +
 mm/memory-tiers.c| 70 +---
 5 files changed, 92 insertions(+), 16 deletions(-)

diff --git a/drivers/acpi/numa/hmat.c b/drivers/acpi/numa/hmat.c
index d6b85f0f6082..28812ec2c793 100644
--- a/drivers/acpi/numa/hmat.c
+++ b/drivers/acpi/numa/hmat.c
@@ -38,6 +38,8 @@ static LIST_HEAD(targets);
 static LIST_HEAD(initiators);
 static LIST_HEAD(localities);
 
+static LIST_HEAD(hmat_memory_types);
+
 static DEFINE_MUTEX(target_lock);
 
 /*
@@ -149,6 +151,12 @@ int acpi_get_genport_coordinates(u32 uid,
 }
 EXPORT_SYMBOL_NS_GPL(acpi_get_genport_coordinates, CXL);
 
+struct memory_dev_type *hmat_find_alloc_memory_type(int adist)
+{
+   return find_alloc_memory_type(adist, &hmat_memory_types);
+}
+EXPORT_SYMBOL_GPL(hmat_find_alloc_memory_type);
+
 static __init void alloc_memory_initiator(unsigned int cpu_pxm)
 {
struct memory_initiator *initiator;
@@ -1038,6 +1046,9 @@ static __init int hmat_init(void)
if (!hmat_set_default_dram_perf())
register_mt_adistance_algorithm(&hmat_adist_nb);
 
+   /* Post-create CPUless memory tiers after getting HMAT info */
+   memory_tier_late_init();
+
return 0;
 out_put:
hmat_free_structures();
diff --git a/drivers/dax/kmem.c b/drivers/dax/kmem.c
index 42ee360cf4e3..aee17ab59f4f 100644
--- a/drivers/dax/kmem.c
+++ b/drivers/dax/kmem.c
@@ -55,21 +55,10 @@ static LIST_HEAD(kmem_memory_types);
 
 static struct memory_dev_type *kmem_find_alloc_memory_type(int adist)
 {
-   bool found = false;
struct memory_dev_type *mtype;
 
mutex_lock(&kmem_memory_type_lock);
-   list_for_each_entry(mtype, &kmem_memory_types, list) {
-   if (mtype->adistance == adist) {
-   found = true;
-   break;
-   }
-   }
-   if (!found) {
-   mtype = alloc_memory_type(adist);
-   if (!IS_ERR(mtype))
-   list_add(&mtype->list, &kmem_memory_types);
-   }
+   mtype = find_alloc_memory_type(adist, &kmem_memory_types);
mutex_unlock(&kmem_memory_type_lock);
 
return mtype;
diff --git a/include/linux/acpi.h b/include/linux/acpi.h
index b7165e52b3c6..3f927ff01f02 100644
--- a/include/linux/acpi.h
+++ b/include/linux/acpi.h
@@ -434,12 +434,18 @@ int thermal_acpi_critical_trip_temp(struct acpi_device *adev, int *ret_temp);
 
 #ifdef CONFIG_ACPI_HMAT
 int acpi_get_genport_coordinates(u32 uid, struct access_coordinate *coord);
+struct memory_dev_type *hmat_find_alloc_memory_type(int adist);
 #else
 static inline int acpi_get_genport_coordinates(u32 uid,
   struct access_coordinate *coord)
 {
return -EOPNOTSUPP;
 }
+
+static inline struct memory_dev_type *hmat_find_alloc_memory_type(int adist)
+{
+   return NULL;
+}
 #endif
 
 #ifdef CONFIG_ACPI_NUMA
diff --git a/include/linux/memory-tiers.h b/include/linux/memory-tiers.h
index 69e781900082..4bc2596c5774 100644
--- a/include/linux/memory-tiers.h
+++ b/include/linux/memory-tiers.h
@@ -48,6 +48,9 @@ int mt_calc_adistance(int node, int *adist);
 int mt_set_default_dram_perf(int nid, struct access_coordinate *perf,
 const

[PATCH v2 0/1] Improved Memory Tier Creation for CPUless NUMA Nodes

2024-03-12 Thread Ho-Ren (Jack) Chuang

When a memory device, such as CXL1.1 type3 memory, is emulated as
normal memory (E820_TYPE_RAM), the memory device is indistinguishable
from normal DRAM in terms of memory tiering with the current implementation.
The current memory tiering assigns all detected normal memory nodes
to the same DRAM tier. As a result, normal memory devices with
different attributes cannot be assigned to the correct memory tier,
which makes it impossible to migrate pages between different types of memory.
https://lore.kernel.org/linux-mm/ph0pr08mb7955e9f08ccb64f23963b5c3a8...@ph0pr08mb7955.namprd08.prod.outlook.com/T/

This patchset automatically resolves the issues. It delays the initialization
of memory tiers for CPUless NUMA nodes until they obtain HMAT information
at boot time, eliminating the need for user intervention.
If no HMAT is specified, it falls back to using `default_dram_type`.

Example usecase:
We have CXL memory on the host, and we create VMs with a new system memory
device backed by host CXL memory. We inject CXL memory performance attributes
through QEMU, and the guest now sees memory nodes with performance attributes
in HMAT. With this change, we enable the guest kernel to construct
the correct memory tiering for the memory nodes.

-v2:
 Thanks to Ying's comments,
 * Rewrite cover letter & patch description
 * Rename functions, don't use _hmat
 * Abstract common functions into find_alloc_memory_type()
 * Use the expected way to use set_node_memory_tier instead of modifying it
-v1:
 * https://lore.kernel.org/linux-mm/20240301082248.3456086-1-horenchu...@bytedance.com/T/


Ho-Ren (Jack) Chuang (1):
  memory tier: acpi/hmat: create CPUless memory tiers after obtaining
HMAT info

 drivers/acpi/numa/hmat.c | 11 ++
 drivers/dax/kmem.c   | 13 +--
 include/linux/acpi.h |  6 
 include/linux/memory-tiers.h |  8 +
 mm/memory-tiers.c| 70 +---
 5 files changed, 92 insertions(+), 16 deletions(-)

-- 
Ho-Ren (Jack) Chuang

Re: [PATCH] spapr: avoid overhead of finding vhyp class in critical operations

2024-03-12 Thread Harsh Prateek Bora


Hi Nick,

One minor comment below:

On 2/24/24 13:03, Nicholas Piggin wrote:

PPC_VIRTUAL_HYPERVISOR_GET_CLASS is used in critical operations like
interrupts and TLB misses and is quite costly. Running the
kvm-unit-tests sieve program with radix MMU enabled thrashes the TCG
TLB and spends a lot of time in TLB and page table walking code. The
test takes 67 seconds to complete with a lot of time being spent in
code related to finding the vhyp class:

12.01%  [.] g_str_hash
 8.94%  [.] g_hash_table_lookup
 8.06%  [.] object_class_dynamic_cast
 6.21%  [.] address_space_ldq
 4.94%  [.] __strcmp_avx2
 4.28%  [.] tlb_set_page_full
 4.08%  [.] address_space_translate_internal
 3.17%  [.] object_class_dynamic_cast_assert
 2.84%  [.] ppc_radix64_xlate

Keep a pointer to the class and avoid this lookup. This reduces the
execution time to 40 seconds.

Signed-off-by: Nicholas Piggin 
---
This feels a bit ugly, but the performance problem of looking up the
class in fast paths can't be ignored. Is there a "nicer" way to get the
same result?

Thanks,
Nick

  target/ppc/cpu.h   |  3 ++-
  target/ppc/mmu-book3s-v3.h |  4 +---
  hw/ppc/pegasos2.c  |  1 +
  target/ppc/cpu_init.c  |  9 +++--
  target/ppc/excp_helper.c   | 16 
  target/ppc/kvm.c   |  4 +---
  target/ppc/mmu-hash64.c| 16 
  target/ppc/mmu-radix64.c   |  4 +---
  8 files changed, 17 insertions(+), 40 deletions(-)

diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h
index ec14574d14..eb85d9aa71 100644
--- a/target/ppc/cpu.h
+++ b/target/ppc/cpu.h
@@ -1437,6 +1437,7 @@ struct ArchCPU {
  int vcpu_id;
  uint32_t compat_pvr;
  PPCVirtualHypervisor *vhyp;
+PPCVirtualHypervisorClass *vhyp_class;
  void *machine_data;
  int32_t node_id; /* NUMA node this CPU belongs to */
  PPCHash64Options *hash64_opts;
@@ -1535,7 +1536,7 @@ DECLARE_OBJ_CHECKERS(PPCVirtualHypervisor, PPCVirtualHypervisorClass,
  
  static inline bool vhyp_cpu_in_nested(PowerPCCPU *cpu)

  {
-return PPC_VIRTUAL_HYPERVISOR_GET_CLASS(cpu->vhyp)->cpu_in_nested(cpu);
+return cpu->vhyp_class->cpu_in_nested(cpu);
  }
  #endif /* CONFIG_USER_ONLY */
  
diff --git a/target/ppc/mmu-book3s-v3.h b/target/ppc/mmu-book3s-v3.h

index 674377a19e..f3f7993958 100644
--- a/target/ppc/mmu-book3s-v3.h
+++ b/target/ppc/mmu-book3s-v3.h
@@ -108,9 +108,7 @@ static inline hwaddr ppc_hash64_hpt_mask(PowerPCCPU *cpu)
  uint64_t base;
  
  if (cpu->vhyp) {


All the checks for cpu->vhyp need to be changed to check for
cpu->vhyp_class now, in all such instances.


With that,

Reviewed-by: Harsh Prateek Bora 



-PPCVirtualHypervisorClass *vhc =
-PPC_VIRTUAL_HYPERVISOR_GET_CLASS(cpu->vhyp);
-return vhc->hpt_mask(cpu->vhyp);
+return cpu->vhyp_class->hpt_mask(cpu->vhyp);
  }
  if (cpu->env.mmu_model == POWERPC_MMU_3_00) {
  ppc_v3_pate_t pate;
diff --git a/hw/ppc/pegasos2.c b/hw/ppc/pegasos2.c
index 04d6decb2b..c22e8b336d 100644
--- a/hw/ppc/pegasos2.c
+++ b/hw/ppc/pegasos2.c
@@ -400,6 +400,7 @@ static void pegasos2_machine_reset(MachineState *machine, ShutdownCause reason)
  machine->fdt = fdt;
  
  pm->cpu->vhyp = PPC_VIRTUAL_HYPERVISOR(machine);

+pm->cpu->vhyp_class = PPC_VIRTUAL_HYPERVISOR_GET_CLASS(pm->cpu->vhyp);
  }
  
  enum pegasos2_rtas_tokens {

diff --git a/target/ppc/cpu_init.c b/target/ppc/cpu_init.c
index 9bccddb350..63d0094024 100644
--- a/target/ppc/cpu_init.c
+++ b/target/ppc/cpu_init.c
@@ -6631,6 +6631,7 @@ void cpu_ppc_set_vhyp(PowerPCCPU *cpu, PPCVirtualHypervisor *vhyp)
  CPUPPCState *env = &cpu->env;
  
  cpu->vhyp = vhyp;

+cpu->vhyp_class = PPC_VIRTUAL_HYPERVISOR_GET_CLASS(vhyp);
  
  /*

   * With a virtual hypervisor mode we never allow the CPU to go
@@ -7224,9 +7225,7 @@ static void ppc_cpu_exec_enter(CPUState *cs)
  PowerPCCPU *cpu = POWERPC_CPU(cs);
  
  if (cpu->vhyp) {

-PPCVirtualHypervisorClass *vhc =
-PPC_VIRTUAL_HYPERVISOR_GET_CLASS(cpu->vhyp);
-vhc->cpu_exec_enter(cpu->vhyp, cpu);
+cpu->vhyp_class->cpu_exec_enter(cpu->vhyp, cpu);
  }
  }
  
@@ -7235,9 +7234,7 @@ static void ppc_cpu_exec_exit(CPUState *cs)

  PowerPCCPU *cpu = POWERPC_CPU(cs);
  
  if (cpu->vhyp) {

-PPCVirtualHypervisorClass *vhc =
-PPC_VIRTUAL_HYPERVISOR_GET_CLASS(cpu->vhyp);
-vhc->cpu_exec_exit(cpu->vhyp, cpu);
+cpu->vhyp_class->cpu_exec_exit(cpu->vhyp, cpu);
  }
  }
  #endif /* CONFIG_TCG */
diff --git a/target/ppc/excp_helper.c b/target/ppc/excp_helper.c
index 98952de267..445350488c 100644
--- a/target/ppc/excp_helper.c
+++ b/target/ppc/excp_helper.c
@@ -840,9 +840,7 @@ static void powerpc_excp_7xx(PowerPCCPU *cpu, int excp)
   * HV mode, we need to keep hypercall support.
   */
  if (lev == 1 && cpu->vhyp) {
-PPCVirtualHypervisorClass *vhc =
-

Re: [PATCH v4 00/25] migration: Improve error reporting

2024-03-12 Thread Cédric Le Goater


On 3/11/24 21:24, Peter Xu wrote:

On Fri, Mar 08, 2024 at 04:15:08PM +0800, Peter Xu wrote:

On Wed, Mar 06, 2024 at 02:34:15PM +0100, Cédric Le Goater wrote:

* [1-4] already queued in migration-next.
   
   migration: Report error when shutdown fails

   migration: Remove SaveStateHandler and LoadStateHandler typedefs
   migration: Add documentation for SaveVMHandlers
   migration: Do not call PRECOPY_NOTIFY_SETUP notifiers in case of error
   
* [5-9] are prerequisite changes in other components related to the

   migration save_setup() handler. They make sure a failure is not
   returned without setting an error.
   
   s390/stattrib: Add Error** argument to set_migrationmode() handler

   vfio: Always report an error in vfio_save_setup()
   migration: Always report an error in block_save_setup()
   migration: Always report an error in ram_save_setup()
   migration: Add Error** argument to vmstate_save()

* [10-15] are the core changes in migration and memory components to
   propagate an error reported in a save_setup() handler.

   migration: Add Error** argument to qemu_savevm_state_setup()
   migration: Add Error** argument to .save_setup() handler
   migration: Add Error** argument to .load_setup() handler


Further queued 5-12 in migration-staging (until here), thanks.


Just to keep a record: due to the virtio failover test failure and the
other block migration uncertainty in patch 7 (in which case we may want to
have a fix on sectors==0 case), I unqueued this chunk for 9.0.


OK. I will ask the block folks for help to understand whether sectors==0
is also an error in the save_setup context. Maybe we can still
merge these in the 9.0 cycle.
 
Thanks,


C.

Re: [PATCH 4/5] plugins: conditional callbacks

2024-03-12 Thread Pierrick Bouvier


On 3/11/24 14:08, Alex Bennée wrote:

Pierrick Bouvier  writes:


Extend plugins API to support callback called with a given criteria
(evaluated inline).

Added functions:
- qemu_plugin_register_vcpu_tb_exec_cond_cb
- qemu_plugin_register_vcpu_insn_exec_cond_cb

They expect as parameter a condition, a qemu_plugin_u64_t (op1) and an
immediate (op2). Callback is called if op1 |cond| op2 is true.

Signed-off-by: Pierrick Bouvier 
---
  include/qemu/plugin.h|   7 ++
  include/qemu/qemu-plugin.h   |  76 +++
  plugins/plugin.h |   8 ++
  accel/tcg/plugin-gen.c   | 174 ++-
  plugins/api.c|  51 ++
  plugins/core.c   |  19 
  plugins/qemu-plugins.symbols |   2 +
  7 files changed, 334 insertions(+), 3 deletions(-)

diff --git a/include/qemu/plugin.h b/include/qemu/plugin.h
index d92d64744e6..056102b2361 100644
--- a/include/qemu/plugin.h
+++ b/include/qemu/plugin.h
@@ -74,6 +74,8 @@ enum plugin_dyn_cb_type {
  enum plugin_dyn_cb_subtype {
  PLUGIN_CB_REGULAR,
  PLUGIN_CB_REGULAR_R,
+PLUGIN_CB_COND,
+PLUGIN_CB_COND_R,
  PLUGIN_CB_INLINE_ADD_U64,
  PLUGIN_CB_INLINE_STORE_U64,
  PLUGIN_N_CB_SUBTYPES,
@@ -97,6 +99,11 @@ struct qemu_plugin_dyn_cb {
  enum qemu_plugin_op op;
  uint64_t imm;
  } inline_insn;
+struct {
+qemu_plugin_u64 entry;
+enum qemu_plugin_cond cond;
+uint64_t imm;
+} cond_cb;
  };
  };
  
diff --git a/include/qemu/qemu-plugin.h b/include/qemu/qemu-plugin.h

index c5cac897a0b..337de25ece7 100644
--- a/include/qemu/qemu-plugin.h
+++ b/include/qemu/qemu-plugin.h
@@ -262,6 +262,29 @@ enum qemu_plugin_mem_rw {
  QEMU_PLUGIN_MEM_RW,
  };
  
+/**

+ * enum qemu_plugin_cond - condition to enable callback
+ *
+ * @QEMU_PLUGIN_COND_NEVER: false
+ * @QEMU_PLUGIN_COND_ALWAYS: true
+ * @QEMU_PLUGIN_COND_EQ: is equal?
+ * @QEMU_PLUGIN_COND_NE: is not equal?
+ * @QEMU_PLUGIN_COND_LT: is less than?
+ * @QEMU_PLUGIN_COND_LE: is less than or equal?
+ * @QEMU_PLUGIN_COND_GT: is greater than?
+ * @QEMU_PLUGIN_COND_GE: is greater than or equal?
+ */
+enum qemu_plugin_cond {
+QEMU_PLUGIN_COND_NEVER,
+QEMU_PLUGIN_COND_ALWAYS,
+QEMU_PLUGIN_COND_EQ,
+QEMU_PLUGIN_COND_NE,
+QEMU_PLUGIN_COND_LT,
+QEMU_PLUGIN_COND_LE,
+QEMU_PLUGIN_COND_GT,
+QEMU_PLUGIN_COND_GE,
+};
+
  /**
   * typedef qemu_plugin_vcpu_tb_trans_cb_t - translation callback
   * @id: unique plugin id
@@ -301,6 +324,32 @@ void qemu_plugin_register_vcpu_tb_exec_cb(struct qemu_plugin_tb *tb,
enum qemu_plugin_cb_flags flags,
void *userdata);
  
+/**

+ * qemu_plugin_register_vcpu_tb_exec_cond_cb() - register conditional callback
+ * @tb: the opaque qemu_plugin_tb handle for the translation
+ * @cb: callback function
+ * @cond: condition to enable callback
+ * @entry: first operand for condition
+ * @imm: second operand for condition
+ * @flags: does the plugin read or write the CPU's registers?
+ * @userdata: any plugin data to pass to the @cb?
+ *
+ * The @cb function is called when a translated unit executes if
+ * entry @cond imm is true.
+ * If condition is QEMU_PLUGIN_COND_ALWAYS, condition is never interpreted and
+ * this function is equivalent to qemu_plugin_register_vcpu_tb_exec_cb.
+ * If condition QEMU_PLUGIN_COND_NEVER, condition is never interpreted and
+ * callback is never installed.
+ */
+QEMU_PLUGIN_API
+void qemu_plugin_register_vcpu_tb_exec_cond_cb(struct qemu_plugin_tb *tb,
+   qemu_plugin_vcpu_udata_cb_t cb,
+   enum qemu_plugin_cb_flags flags,
+   enum qemu_plugin_cond cond,
+   qemu_plugin_u64 entry,


Is this a fixed entry or part of a scoreboard?



entry is a scoreboard entry (automatically associated with each vcpu
using vcpu_index) and can be modified by any other inline op or
callback. @imm (the next parameter) is fixed, yes.

The callback will be called only if entry <cond> imm is true.
tests/plugin/inline.c has tests for this and for the store operation.
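
To make the semantics concrete, here is a small Python model of the check. It is only a sketch under stated assumptions, not the plugin API: `Cond`, `cond_holds` and `run_tb` are hypothetical names, the scoreboard is modelled as a plain list indexed by vcpu_index, and the ordering conditions are treated as unsigned 64-bit comparisons because the patch maps them to the unsigned TCG conditions.

```python
import operator
from enum import Enum, auto
from typing import Callable, List

U64_MASK = (1 << 64) - 1


class Cond(Enum):
    """Hypothetical mirror of enum qemu_plugin_cond."""
    NEVER = auto()
    ALWAYS = auto()
    EQ = auto()
    NE = auto()
    LT = auto()
    LE = auto()
    GT = auto()
    GE = auto()


_COMPARE = {
    Cond.EQ: operator.eq,
    Cond.NE: operator.ne,
    Cond.LT: operator.lt,
    Cond.LE: operator.le,
    Cond.GT: operator.gt,
    Cond.GE: operator.ge,
}


def cond_holds(cond: Cond, entry: int, imm: int) -> bool:
    """Evaluate 'entry <cond> imm' on unsigned 64-bit values."""
    if cond is Cond.ALWAYS:
        return True
    if cond is Cond.NEVER:
        return False
    return _COMPARE[cond](entry & U64_MASK, imm & U64_MASK)


def run_tb(scoreboard: List[int], vcpu_index: int, cond: Cond, imm: int,
           cb: Callable[[int], None]) -> None:
    """Model one TB execution: call cb only if this vcpu's entry matches."""
    if cond_holds(cond, scoreboard[vcpu_index], imm):
        cb(vcpu_index)


scoreboard = [0, 5]          # one entry per vcpu; imm stays fixed
hits: List[int] = []
run_tb(scoreboard, 0, Cond.EQ, 5, hits.append)   # 0 == 5 is false: no call
run_tb(scoreboard, 1, Cond.EQ, 5, hits.append)   # 5 == 5 is true: cb runs
```

Only the entry changes between executions (through inline ops or callbacks); the immediate is fixed at registration time.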


I'm trying to write a control flow plugin with a structure like:

   /* We use this to track the current execution state */
   typedef struct {
   /* address of start of block */
   uint64_t block_start;
   /* address of end of block */
   uint64_t block_end;
   /* address of last executed PC */
   uint64_t last_pc;
   } VCPUScoreBoard;



Seems ok.


And I want to check to see if last_pc (set by STORE_U64 for each insn) ==
block_end (set at start of TB with what we know).

Is this something I need to get with:

last_pc = qemu_plugin_scoreboard_u64_in_struct(state, VCPUScoreBoard, last_pc);



Yes.


?


+

Re: [PATCH 4/5] plugins: conditional callbacks

2024-03-12 Thread Pierrick Bouvier


On 3/11/24 19:43, Alex Bennée wrote:

Pierrick Bouvier  writes:


Extend plugins API to support callback called with a given criteria
(evaluated inline).

Added functions:
- qemu_plugin_register_vcpu_tb_exec_cond_cb
- qemu_plugin_register_vcpu_insn_exec_cond_cb

They expect as parameter a condition, a qemu_plugin_u64_t (op1) and an
immediate (op2). Callback is called if op1 |cond| op2 is true.

Signed-off-by: Pierrick Bouvier 


  
+static TCGCond plugin_cond_to_tcgcond(enum qemu_plugin_cond cond)

+{
+switch (cond) {
+case QEMU_PLUGIN_COND_EQ:
+return TCG_COND_EQ;
+case QEMU_PLUGIN_COND_NE:
+return TCG_COND_NE;
+case QEMU_PLUGIN_COND_LT:
+return TCG_COND_LTU;
+case QEMU_PLUGIN_COND_LE:
+return TCG_COND_LEU;
+case QEMU_PLUGIN_COND_GT:
+return TCG_COND_GTU;
+case QEMU_PLUGIN_COND_GE:
+return TCG_COND_GEU;
+default:
+/* ALWAYS and NEVER conditions should never reach */
+g_assert_not_reached();
+}
+}
+
+static TCGOp *append_cond_udata_cb(const struct qemu_plugin_dyn_cb *cb,
+   TCGOp *begin_op, TCGOp *op, int *cb_idx)
+{
+char *ptr = cb->cond_cb.entry.score->data->data;
+size_t elem_size = g_array_get_element_size(
+cb->cond_cb.entry.score->data);
+size_t offset = cb->cond_cb.entry.offset;
+/* Condition should be negated, as calling the cb is the "else" path */
+TCGCond cond = tcg_invert_cond(plugin_cond_to_tcgcond(cb->cond_cb.cond));
+
+op = copy_const_ptr(&begin_op, op, ptr);
+op = copy_ld_i32(&begin_op, op);
+op = copy_mul_i32(&begin_op, op, elem_size);
+op = copy_ext_i32_ptr(&begin_op, op);
+op = copy_const_ptr(&begin_op, op, ptr + offset);
+op = copy_add_ptr(&begin_op, op);
+op = copy_ld_i64(&begin_op, op);
+op = copy_brcondi_i64(&begin_op, op, cond, cb->cond_cb.imm);
+op = copy_call(&begin_op, op, cb->f.vcpu_udata, cb_idx);
+op = copy_set_label(&begin_op, op);
+return op;


I think we are missing something here to ensure that udata is set
correctly for the callback, see my RFC:

   Subject: [RFC PATCH] contrib/plugins: control flow plugin (WIP!)
   Date: Mon, 11 Mar 2024 15:34:32 +
   Message-Id: <20240311153432.1395190-1-alex.ben...@linaro.org>

which is seeing the same value every time in the callback.



I'm trying to reproduce and will answer on this thread.

Re: [PATCH v2 00/13] Cleanup on SMP and its test

2024-03-12 Thread Thomas Huth

On 12/03/2024 07.46, Zhao Liu wrote:

> Hi Philippe,
>
> On Sat, Mar 09, 2024 at 02:49:17PM +0100, Philippe Mathieu-Daudé wrote:
> > Date: Sat, 9 Mar 2024 14:49:17 +0100
> > From: Philippe Mathieu-Daudé
> > Subject: Re: [PATCH v2 00/13] Cleanup on SMP and its test
> >
> > On 9/3/24 01:46, Zhao Liu wrote:
> > > Hi Philippe,
> > >
> > > > Can you share your base commit please?
> > > >
> > > > Applying: hw/core/machine-smp: Remove deprecated "parameter=0" SMP
> > > > configurations
> > > > Applying: hw/core/machine-smp: Deprecate unsupported "parameter=1" SMP
> > > > configurations
> > > > error: patch failed: docs/about/deprecated.rst:47
> > > > error: docs/about/deprecated.rst: patch does not apply
> > > > Patch failed at 0002 hw/core/machine-smp: Deprecate unsupported
> > > > "parameter=1" SMP configurations
> > >
> > > The base commit is e1007b6bab5cf ("Merge tag 'pull-request-2024-03-01'
> > > of https://gitlab.com/thuth/qemu into staging").
> > >
> > > But I think this conflict is because of the first 4 patches of module
> > > series you picked. Let me rebase this series on that module series and
> > > refresh a v3.
> >
> > Ah no, it is due to commit 01e449809b ("*-user: Deprecate and
> > disable -p pagesize").
> >
> > No need to respin this series, I queued it in favor of the 4 other
> > patches.
>
> In the commit 54c4ea8f3ae6 ("hw/core/machine-smp: Deprecate unsupported
> 'parameter=1' SMP configurations"), the smp related thing is put under
> the section "User-mode emulator command line arguments" instead of "System
> emulator command line arguments".
>
> Is this not quite right...or does it need to be fixed? If so I can tweak
> and clean it up with a minor patch. ;-)

Yes, please send a patch to clean it up!

 Thanks
  Thomas

Re: [PATCH v3 08/20] qapi/schema: add type narrowing to lookup_type()

2024-03-12 Thread Markus Armbruster

John Snow  writes:

> On Mon, Mar 11, 2024 at 2:14 PM John Snow  wrote:
>>
>> On Tue, Feb 20, 2024 at 5:39 AM Markus Armbruster  wrote:
>> >
>> > John Snow  writes:
>> >
>> > > This function is a bit hard to type as-is; mypy needs some assertions to
>> > > assist with the type narrowing.
>> > >
>> > > Signed-off-by: John Snow 
>> > > ---
>> > >  scripts/qapi/schema.py | 4 +++-
>> > >  1 file changed, 3 insertions(+), 1 deletion(-)
>> > >
>> > > diff --git a/scripts/qapi/schema.py b/scripts/qapi/schema.py
>> > > index 043ee7556e6..e617abb03af 100644
>> > > --- a/scripts/qapi/schema.py
>> > > +++ b/scripts/qapi/schema.py
>> > > @@ -997,7 +997,9 @@ def lookup_entity(self, name, typ=None):
>> >def lookup_entity(self, name, typ=None):
>> >ent = self._entity_dict.get(name)
>> >if typ and not isinstance(ent, typ):
>> >return None
>> > >  return ent
>> > >
>> > >  def lookup_type(self, name):
>> > > -return self.lookup_entity(name, QAPISchemaType)
>> > > +typ = self.lookup_entity(name, QAPISchemaType)
>> > > +assert typ is None or isinstance(typ, QAPISchemaType)
>> > > +return typ
>> > >
>> > >  def resolve_type(self, name, info, what):
>> > >  typ = self.lookup_type(name)
>> >
>> > I figure the real trouble-maker is .lookup_entity().
>> >
>> > When not passed an optional type argument, it returns QAPISchemaEntity.
>> >
>> > When passed an optional type argument, it returns that type or None.
>> >
>> > Too cute for type hints to express, I guess.
>> >
>> > What if we drop .lookup_entity()'s optional argument?  There are just
>> > three callers:
>> >
>> > 1. .lookup_type(), visible above.
>> >
>> >def lookup_type(self, name):
>> >ent = self.lookup_entity(name)
>> >if isinstance(ent, QAPISchemaType):
>> >return ent
>> >return None
>> >
>> > This should permit typing it as -> Optional[QAPISchemaType] without
>> > further ado.
>> >
>> > 2. ._make_implicit_object_type() below
>> >
>> >Uses .lookup_type() to check whether the implicit object type already
>> >exists.  We figure we could
>> >
>> >typ = self.lookup_entity(name)
>> >if typ:
>> >assert(isinstance(typ, QAPISchemaObjectType))
>> ># The implicit object type has multiple users.  This can
>> >
>> > 3. QAPIDocDirective.run() doesn't pass a type argument, so no change.
>> >
>> > Thoughts?
>> >
>> > If you'd prefer not to rock the boat for this series, could it still
>> > make sense as a followup?
>>
>> It makes sense as a follow-up, I think. I had other patches in the
>> past that attempted to un-cuten these functions and make them more
>> statically solid, but the shifting sands kept making it easier to put
>> off until later.
>>
>> Lemme see if I can just tack this on to the end of the series and see
>> how it behaves...
>
> Oh, I see what you're doing. Well, I think it's fine if you want to,
> but it's also fine to keep this "stricter" method. There's also ways
> to type it using mypy's @overload which I've monkey'd with in the
> past. Dealer's choice, honestly, but I think I'm eager to just get to
> the "fully typed" baseline and then worry about changing more stuff.

That's okay.  However, a good part of the typing exercise's benefit is
the pinpointing of needlessly cute code, i.e. code that could just as
well be less cute.  To actually reap the benefit, we need to make it
less cute.  If we put it off, we risk forgetting.  Acceptable if we take
appropriate steps not to forget.
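
For illustration only, here is a minimal self-contained sketch of the
"less cute" shape discussed above.  The stub classes and the Schema
container are stand-ins invented for this example, not the real
scripts/qapi/schema.py code; only the bodies of lookup_entity() and
lookup_type() follow the proposal quoted above.

```python
from typing import Dict, Optional


class QAPISchemaEntity:
    """Hypothetical stub standing in for the real entity base class."""
    def __init__(self, name: str) -> None:
        self.name = name


class QAPISchemaType(QAPISchemaEntity):
    """Hypothetical stub standing in for the real type class."""


class Schema:
    """Hypothetical container; only the two lookups mirror the proposal."""
    def __init__(self) -> None:
        self._entity_dict: Dict[str, QAPISchemaEntity] = {}

    def add(self, ent: QAPISchemaEntity) -> None:
        self._entity_dict[ent.name] = ent

    def lookup_entity(self, name: str) -> Optional[QAPISchemaEntity]:
        # No optional 'typ' argument: the result is an entity or None.
        return self._entity_dict.get(name)

    def lookup_type(self, name: str) -> Optional[QAPISchemaType]:
        # isinstance() narrows the static type, so neither an assertion
        # nor a cast is needed to satisfy the annotation.
        ent = self.lookup_entity(name)
        if isinstance(ent, QAPISchemaType):
            return ent
        return None


schema = Schema()
schema.add(QAPISchemaType("str"))
schema.add(QAPISchemaEntity("query-foo"))
```

With this shape, lookup_type("query-foo") yields None because the entity
exists but is not a type, which is the case the optional argument used
to handle.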

Re: [PATCH v3 09/20] qapi/schema: assert resolve_type has 'info' and 'what' args on error

2024-03-12 Thread Markus Armbruster

John Snow  writes:

> On Tue, Feb 20, 2024 at 5:48 AM Markus Armbruster  wrote:
>>
>> John Snow  writes:
>>
>> > resolve_type() is generally used to resolve configuration-provided type
>> > names into type objects, and generally requires valid 'info' and 'what'
>> > parameters.
>> >
>> > In some cases, such as with QAPISchemaArrayType.check(), resolve_type
>> > may be used to resolve built-in types and as such will not have an
>> > 'info' argument, but also must not fail in this scenario.
>> >
>> > Use an assertion to sate mypy that we will indeed have 'info' and 'what'
>> > parameters for the error pathway in resolve_type.
>> >
>> > Note: there are only three callsites to resolve_type at present where
>> > "info" is perceived to be possibly None:
>>
>> Who is the perceiver?  mypy?
>
> Deep.
>
> Yes.

Recommend active voice: "where mypy perceives @info to be possibly None".

>>
>> >
>> > 1) QAPISchemaArrayType.check()
>> > 2) QAPISchemaObjectTypeMember.check()
>> > 3) QAPISchemaEvent.check()
>> >
>> > Of those three, only the first actually ever passes None; the other two
>> > are limited by their base class initializers which accept info=None, but
>> > neither subclass actually uses a None value in practice, currently.
>> >
>> > Signed-off-by: John Snow 
>> > ---
>> >  scripts/qapi/schema.py | 1 +
>> >  1 file changed, 1 insertion(+)
>> >
>> > diff --git a/scripts/qapi/schema.py b/scripts/qapi/schema.py
>> > index e617abb03af..573be7275a6 100644
>> > --- a/scripts/qapi/schema.py
>> > +++ b/scripts/qapi/schema.py
>> > @@ -1004,6 +1004,7 @@ def lookup_type(self, name):
>> >      def resolve_type(self, name, info, what):
>> >          typ = self.lookup_type(name)
>> >          if not typ:
>> > +            assert info and what  # built-in types must not fail lookup
>> >              if callable(what):
>> >                  what = what(info)
>> >              raise QAPISemError(
>> <
>>
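
As a rough illustration of why the assertion helps, the sketch below
reduces resolve_type() to a free function.  The dict-based type table,
the error message text and the QAPISemError stub are assumptions made
for the example; the point is only that "assert info and what" rules out
None on the error path, which is what a type checker needs to see.

```python
from typing import Callable, Dict, Optional, Union


class QAPISemError(Exception):
    """Hypothetical stand-in for the real error class."""


def resolve_type(types: Dict[str, object], name: str,
                 info: Optional[str],
                 what: Union[None, str, Callable[[str], str]]) -> object:
    typ = types.get(name)
    if typ is None:
        # Built-in types must not fail lookup, so this path is only
        # taken for user-supplied names, which always carry info/what.
        assert info and what
        if callable(what):
            what = what(info)
        raise QAPISemError("%s: %s uses unknown type '%s'"
                           % (info, what, name))
    return typ
```

A caller resolving a built-in may pass info=None safely as long as the
lookup succeeds; a failed lookup with info=None trips the assertion
instead of producing a malformed error.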

Re: [RFC PATCH] contrib/plugins: control flow plugin (WIP!)

2024-03-12 Thread Pierrick Bouvier


On 3/11/24 19:34, Alex Bennée wrote:

This is a simple control flow tracking plugin that uses the latest
inline and conditional operations to detect and track control flow
changes. It is currently an exercise at seeing how useful the changes
are.

Signed-off-by: Alex Bennée 
Based-on: 20240229055359.972151-1-pierrick.bouv...@linaro.org
Cc: Gustavo Romero 
Cc: Pierrick Bouvier 

---
This is a work in progress. It looks like I've found a bug in the
processing of udata (see fprintf) because I see:

vcpu_tb_trans: 0x41717c
vcpu_tb_branched_exec: 0x5620a598e8a0
vcpu_tb_trans: 0x417194
vcpu_tb_trans: 0x409af0
vcpu_tb_branched_exec: 0x5620a598e8a0
vcpu_tb_trans: 0x409afc
vcpu_tb_trans: 0x423920
vcpu_tb_branched_exec: 0x5620a598e8a0
collected 1429 destination nodes in the hash table
   addr: 0x4046a4 b.hs #0x4046c8
 branches 1
   to 0xa598e8a0 (0)
   addr: 0x4019c0 bl #0x400944
 branches 12
   to 0xa598e8a0 (11)
   addr: 0x445da8 b.eq #0x445df8

so it looks like udata is always junk.


This can be rebased on top of
20240312075428.244210-1-pierrick.bouv...@linaro.org, which fixes udata 
passing to conditional callback.


Thanks for reporting the issue :)


---
  contrib/plugins/cflow.c  | 344 +++
  contrib/plugins/Makefile |   1 +
  2 files changed, 345 insertions(+)
  create mode 100644 contrib/plugins/cflow.c

diff --git a/contrib/plugins/cflow.c b/contrib/plugins/cflow.c
new file mode 100644
index 0000000000..f3ad6fd20f
--- /dev/null
+++ b/contrib/plugins/cflow.c
@@ -0,0 +1,344 @@
+/*
+ * Control Flow plugin
+ *
+ * This plugin will track changes to control flow and detect where
+ * instructions fault.
+ *
+ * Copyright (c) 2024 Linaro Ltd
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+
+QEMU_PLUGIN_EXPORT int qemu_plugin_version = QEMU_PLUGIN_VERSION;
+
+/* Temp hack, works for Aarch64 */
+#define INSN_WIDTH 4
+
+typedef enum {
+SORT_HOTDEST,  /* hottest branch */
+SORT_EARLY,/* most early exits */
+SORT_POPDEST,  /* most destinations */
+} ReportType;
+
+ReportType report = SORT_HOTDEST;
+int topn = 10;
+
+typedef struct {
+uint64_t daddr;
+uint64_t dcount;
+} DestData;
+
+/* A node is an address where we can go to multiple places */
+typedef struct {
+GMutex lock;
+/* address of the branch point */
+uint64_t addr;
+/* array of DestData */
+GArray *dests;
+/* early exit count */
+uint64_t early_exit;
+/* jump destination count */
+uint64_t dest_count;
+/* instruction data */
+char *insn_disas;
+/* times translated as last in block? */
+int last_count;
+/* times translated in the middle of block? */
+int mid_count;
+} NodeData;
+
+/* We use this to track the current execution state */
+typedef struct {
+/* address of start of block */
+uint64_t block_start;
+/* address of end of block */
+uint64_t block_end;
+/* address of last executed PC */
+uint64_t last_pc;
+} VCPUScoreBoard;
+
+static GMutex node_lock;
+static GHashTable *nodes;
+struct qemu_plugin_scoreboard *state;
+
+/* SORT_HOTDEST */
+static gint hottest(gconstpointer a, gconstpointer b)
+{
+NodeData *na = (NodeData *) a;
+NodeData *nb = (NodeData *) b;
+
+return na->dest_count > nb->dest_count ? -1 :
+na->dest_count == nb->dest_count ? 0 : 1;
+}
+
+static gint early(gconstpointer a, gconstpointer b)
+{
+NodeData *na = (NodeData *) a;
+NodeData *nb = (NodeData *) b;
+
+return na->early_exit > nb->early_exit ? -1 :
+na->early_exit == nb->early_exit ? 0 : 1;
+}
+
+static gint popular(gconstpointer a, gconstpointer b)
+{
+NodeData *na = (NodeData *) a;
+NodeData *nb = (NodeData *) b;
+
+return na->dests->len > nb->dests->len ? -1 :
+na->dests->len == nb->dests->len ? 0 : 1;
+}
+
+static void plugin_exit(qemu_plugin_id_t id, void *p)
+{
+g_autoptr(GString) result = g_string_new("collected ");
+GList *data;
+GCompareFunc sort = &hottest;
+int n = 0;
+
+g_mutex_lock(&node_lock);
+g_string_append_printf(result, "%d destination nodes in the hash table\n",
+   g_hash_table_size(nodes));
+
+data = g_hash_table_get_values(nodes);
+
+switch (report) {
+case SORT_HOTDEST:
+sort = &hottest;
+break;
+case SORT_EARLY:
+sort = &early;
+break;
+case SORT_POPDEST:
+sort = &popular;
+break;
+}
+
+data = g_list_sort(data, sort);
+
+for (GList *l = data;
+ l != NULL && n < topn;
+ l = l->next, n++) {
+NodeData *n = l->data;
+g_string_append_printf(result, "  addr: 0x%"PRIx64 " %s\n",
+   n->addr, n->insn_disas);
+if (n->early_exit) {
+g_string_append_printf(result, "early exits %"PRId64"\n",
+   n->early_exit);
+}
+g_string_append_printf(result, "

Re: [PATCH 08/13] ppc/pnv: Set POWER9, POWER10 ibm,pa-features bits

2024-03-12 Thread Cédric Le Goater


On 3/11/24 19:51, Nicholas Piggin wrote:

Copy the pa-features arrays from spapr, adjusting slightly as
described in comments.

Cc: "Cédric Le Goater" 
Cc: "Frédéric Barrat" 
Signed-off-by: Nicholas Piggin 
---
  hw/ppc/pnv.c   | 67 --
  hw/ppc/spapr.c |  1 +
  2 files changed, 66 insertions(+), 2 deletions(-)

diff --git a/hw/ppc/pnv.c b/hw/ppc/pnv.c
index 52d964f77a..3e30c08420 100644
--- a/hw/ppc/pnv.c
+++ b/hw/ppc/pnv.c
@@ -332,6 +332,35 @@ static void pnv_chip_power8_dt_populate(PnvChip *chip, 
void *fdt)
  }
  }
  
+/*

+ * Same as spapr pa_features_300 except pnv always enables CI largepages bit.
+ */
+static const uint8_t pa_features_300[] = { 66, 0,
+/* 0: MMU|FPU|SLB|RUN|DABR|NX, 1: CILRG|fri[nzpm]|DABRX|SPRG3|SLB0|PP110 */
+/* 2: VPM|DS205|PPR|DS202|DS206, 3: LSD|URG, 5: LE|CFAR|EB|LSQ */
+0xf6, 0x3f, 0xc7, 0xc0, 0x00, 0xf0, /* 0 - 5 */
+/* 6: DS207 */
+0x80, 0x00, 0x00, 0x00, 0x00, 0x00, /* 6 - 11 */
+/* 16: Vector */
+0x00, 0x00, 0x00, 0x00, 0x80, 0x00, /* 12 - 17 */
+/* 18: Vec. Scalar, 20: Vec. XOR, 22: HTM */
+0x80, 0x00, 0x80, 0x00, 0x80, 0x00, /* 18 - 23 */
+/* 24: Ext. Dec, 26: 64 bit ftrs, 28: PM ftrs */
+0x80, 0x00, 0x80, 0x00, 0x80, 0x00, /* 24 - 29 */
+/* 32: LE atomic, 34: EBB + ext EBB */
+0x00, 0x00, 0x80, 0x00, 0xC0, 0x00, /* 30 - 35 */
+/* 40: Radix MMU */
+0x00, 0x00, 0x00, 0x00, 0x80, 0x00, /* 36 - 41 */
+/* 42: PM, 44: PC RA, 46: SC vec'd */
+0x80, 0x00, 0x80, 0x00, 0x80, 0x00, /* 42 - 47 */
+/* 48: SIMD, 50: QP BFP, 52: String */
+0x80, 0x00, 0x80, 0x00, 0x80, 0x00, /* 48 - 53 */
+/* 54: DecFP, 56: DecI, 58: SHA */
+0x80, 0x00, 0x80, 0x00, 0x80, 0x00, /* 54 - 59 */
+/* 60: NM atomic, 62: RNG */
+0x80, 0x00, 0x80, 0x00, 0x00, 0x00, /* 60 - 65 */
+};
+
  static void pnv_chip_power9_dt_populate(PnvChip *chip, void *fdt)
  {
  static const char compat[] = "ibm,power9-xscom\0ibm,xscom";
@@ -349,7 +378,7 @@ static void pnv_chip_power9_dt_populate(PnvChip *chip, void 
*fdt)
  offset = pnv_dt_core(chip, pnv_core, fdt);
  
  _FDT((fdt_setprop(fdt, offset, "ibm,pa-features",

-   pa_features_207, sizeof(pa_features_207))));
+   pa_features_300, sizeof(pa_features_300))));
  }
  
  if (chip->ram_size) {

@@ -359,6 +388,40 @@ static void pnv_chip_power9_dt_populate(PnvChip *chip, 
void *fdt)
  pnv_dt_lpc(chip, fdt, 0, PNV9_LPCM_BASE(chip), PNV9_LPCM_SIZE);
  }
  
+/*

+ * Same as spapr pa_features_31 except pnv always enables CI largepages bit,
+ * always disables copy/paste.
+ */
+static const uint8_t pa_features_31[] = { 74, 0,
+/* 0: MMU|FPU|SLB|RUN|DABR|NX, 1: CILRG|fri[nzpm]|DABRX|SPRG3|SLB0|PP110 */
+/* 2: VPM|DS205|PPR|DS202|DS206, 3: LSD|URG, 5: LE|CFAR|EB|LSQ */
+0xf6, 0x3f, 0xc7, 0xc0, 0x00, 0xf0, /* 0 - 5 */
+/* 6: DS207 */
+0x80, 0x00, 0x00, 0x00, 0x00, 0x00, /* 6 - 11 */
+/* 16: Vector */
+0x00, 0x00, 0x00, 0x00, 0x80, 0x00, /* 12 - 17 */
+/* 18: Vec. Scalar, 20: Vec. XOR */
+0x80, 0x00, 0x80, 0x00, 0x00, 0x00, /* 18 - 23 */
+/* 24: Ext. Dec, 26: 64 bit ftrs, 28: PM ftrs */
+0x80, 0x00, 0x80, 0x00, 0x80, 0x00, /* 24 - 29 */
+/* 32: LE atomic, 34: EBB + ext EBB */
+0x00, 0x00, 0x80, 0x00, 0xC0, 0x00, /* 30 - 35 */
+/* 40: Radix MMU */
+0x00, 0x00, 0x00, 0x00, 0x80, 0x00, /* 36 - 41 */
+/* 42: PM, 44: PC RA, 46: SC vec'd */
+0x80, 0x00, 0x80, 0x00, 0x80, 0x00, /* 42 - 47 */
+/* 48: SIMD, 50: QP BFP, 52: String */
+0x80, 0x00, 0x80, 0x00, 0x80, 0x00, /* 48 - 53 */
+/* 54: DecFP, 56: DecI, 58: SHA */
+0x80, 0x00, 0x80, 0x00, 0x80, 0x00, /* 54 - 59 */
+/* 60: NM atomic, 62: RNG */
+0x80, 0x00, 0x80, 0x00, 0x00, 0x00, /* 60 - 65 */
+/* 68: DEXCR[SBHE|IBRTPDUS|SRAPD|NPHIE|PHIE] */
+0x00, 0x00, 0xce, 0x00, 0x00, 0x00, /* 66 - 71 */
+/* 72: [P]HASHCHK */
+0x80, 0x00, /* 72 - 73 */
+};
+
  static void pnv_chip_power10_dt_populate(PnvChip *chip, void *fdt)
  {
  static const char compat[] = "ibm,power10-xscom\0ibm,xscom";
@@ -376,7 +439,7 @@ static void pnv_chip_power10_dt_populate(PnvChip *chip, 
void *fdt)
  offset = pnv_dt_core(chip, pnv_core, fdt);
  
  _FDT((fdt_setprop(fdt, offset, "ibm,pa-features",

-   pa_features_207, sizeof(pa_features_207))));
+   pa_features_31, sizeof(pa_features_31))));
  }
  
  if (chip->ram_size) {

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 128bfe11a8..b53c13e037 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -233,6 +233,7 @@ static void spapr_dt_pa_features(SpaprMachineState *spapr,
   PowerPCCPU *cpu,
   void *fdt, int offset)
  {
+/* These should be kept in sync with pnv */


yes. In that case, the array definition should be moved under

[PATCH v2 2/2] hw/arm/sbsa-ref: Add cpu-map to device tree

2024-03-12 Thread Xiong Yining

From: xiongyining1480 

Support CPU topology description through device tree.

Signed-off-by: Xiong Yining 
Signed-off-by: Chen Baozi 
Reviewed-by: Marcin Juszkiewicz 
---
 hw/arm/sbsa-ref.c | 34 ++
 1 file changed, 34 insertions(+)

diff --git a/hw/arm/sbsa-ref.c b/hw/arm/sbsa-ref.c
index fde7dd528f..5b2c32515d 100644
--- a/hw/arm/sbsa-ref.c
+++ b/hw/arm/sbsa-ref.c
@@ -264,9 +264,43 @@ static void create_fdt(SBSAMachineState *sms)
 ms->possible_cpus->cpus[cs->cpu_index].props.node_id);
 }
 
+qemu_fdt_setprop_cell(sms->fdt, nodename, "phandle",
+qemu_fdt_alloc_phandle(sms->fdt));
+
 g_free(nodename);
 }
 
+/*
+ * Add vCPU topology description through fdt node cpu-map.
+ * See fdt_add_cpu_nodes() on hw/arm/virt.c for longer description.
+ */
+qemu_fdt_add_subnode(sms->fdt, "/cpus/cpu-map");
+
+for (cpu = sms->smp_cpus - 1; cpu >= 0; cpu--) {
+char *cpu_path = g_strdup_printf("/cpus/cpu@%d", cpu);
+char *map_path;
+
+if (ms->smp.threads > 1) {
+map_path = g_strdup_printf(
+"/cpus/cpu-map/socket%d/cluster%d/core%d/thread%d",
+cpu / (ms->smp.clusters * ms->smp.cores * ms->smp.threads),
+(cpu / (ms->smp.cores * ms->smp.threads)) % ms->smp.clusters,
+(cpu / ms->smp.threads) % ms->smp.cores,
+cpu % ms->smp.threads);
+} else {
+map_path = g_strdup_printf(
+"/cpus/cpu-map/socket%d/cluster%d/core%d",
+cpu / (ms->smp.clusters * ms->smp.cores),
+(cpu / ms->smp.cores) % ms->smp.clusters,
+cpu % ms->smp.cores);
+}
+qemu_fdt_add_path(sms->fdt, map_path);
+qemu_fdt_setprop_phandle(sms->fdt, map_path, "cpu", cpu_path);
+
+g_free(map_path);
+g_free(cpu_path);
+}
+
 sbsa_fdt_add_gic_node(sms);
 }
 
-- 
2.34.1
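
For readers checking the index arithmetic in the patch above, here is a
small Python model of the path computation.  It is only a sketch: the
function name cpu_map_path() is made up for the example, and Python's
floor division stands in for C integer division, which gives the same
result for the non-negative values used here.

```python
def cpu_map_path(cpu: int, clusters: int, cores: int, threads: int) -> str:
    """Mirror the cpu-map path arithmetic from the create_fdt() loop."""
    if threads > 1:
        return "/cpus/cpu-map/socket%d/cluster%d/core%d/thread%d" % (
            cpu // (clusters * cores * threads),      # socket index
            (cpu // (cores * threads)) % clusters,    # cluster in socket
            (cpu // threads) % cores,                 # core in cluster
            cpu % threads)                            # thread in core
    return "/cpus/cpu-map/socket%d/cluster%d/core%d" % (
        cpu // (clusters * cores),
        (cpu // cores) % clusters,
        cpu % cores)


# Example topology: 2 clusters per socket, 2 cores per cluster.
single_threaded = cpu_map_path(5, clusters=2, cores=2, threads=1)
multi_threaded = cpu_map_path(13, clusters=2, cores=2, threads=2)
```

With these inputs, CPU 5 maps to socket1/cluster0/core1, and CPU 13 with
two threads per core maps to socket1/cluster1/core0/thread1.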

[PATCH v2 1/2] hw/arm/sbsa-ref:Enable CPU cluster on ARM sbsa machine

2024-03-12 Thread Xiong Yining

From: xiongyining1480 

Enable the CPU cluster on ARM sbsa machine, so user can configure the
cluster hierarchy.

Signed-off-by: Xiong Yining 
Signed-off-by: Chen Baozi 
Tested-by: Marcin Juszkiewicz 
---
 hw/arm/sbsa-ref.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/hw/arm/sbsa-ref.c b/hw/arm/sbsa-ref.c
index f5709d6c14..fde7dd528f 100644
--- a/hw/arm/sbsa-ref.c
+++ b/hw/arm/sbsa-ref.c
@@ -886,6 +886,7 @@ static void sbsa_ref_class_init(ObjectClass *oc, void *data)
 mc->default_ram_size = 1 * GiB;
 mc->default_ram_id = "sbsa-ref.ram";
 mc->default_cpus = 4;
+mc->smp_props.clusters_supported = true;
 mc->possible_cpu_arch_ids = sbsa_ref_possible_cpu_arch_ids;
 mc->cpu_index_to_instance_props = sbsa_ref_cpu_index_to_props;
 mc->get_default_cpu_node_id = sbsa_ref_get_default_cpu_node_id;
-- 
2.34.1

Re: [PATCH] spapr: avoid overhead of finding vhyp class in critical operations

2024-03-12 Thread Nicholas Piggin

On Tue Mar 12, 2024 at 4:38 PM AEST, Harsh Prateek Bora wrote:
> Hi Nick,
>
> One minor comment below:
>
> On 2/24/24 13:03, Nicholas Piggin wrote:
> > PPC_VIRTUAL_HYPERVISOR_GET_CLASS is used in critical operations like
> > interrupts and TLB misses and is quite costly. Running the
> > kvm-unit-tests sieve program with radix MMU enabled thrashes the TCG
> > TLB and spends a lot of time in TLB and page table walking code. The
> > test takes 67 seconds to complete with a lot of time being spent in
> > code related to finding the vhyp class:
> > 
> > 12.01%  [.] g_str_hash
> >  8.94%  [.] g_hash_table_lookup
> >  8.06%  [.] object_class_dynamic_cast
> >  6.21%  [.] address_space_ldq
> >  4.94%  [.] __strcmp_avx2
> >  4.28%  [.] tlb_set_page_full
> >  4.08%  [.] address_space_translate_internal
> >  3.17%  [.] object_class_dynamic_cast_assert
> >  2.84%  [.] ppc_radix64_xlate
> > 
> > Keep a pointer to the class and avoid this lookup. This reduces the
> > execution time to 40 seconds.
> > 
> > Signed-off-by: Nicholas Piggin 
> > ---
> > This feels a bit ugly, but the performance problem of looking up the
> > class in fast paths can't be ignored. Is there a "nicer" way to get the
> > same result?
> > 
> > Thanks,
> > Nick
> > 
> >   target/ppc/cpu.h   |  3 ++-
> >   target/ppc/mmu-book3s-v3.h |  4 +---
> >   hw/ppc/pegasos2.c  |  1 +
> >   target/ppc/cpu_init.c  |  9 +++--
> >   target/ppc/excp_helper.c   | 16 
> >   target/ppc/kvm.c   |  4 +---
> >   target/ppc/mmu-hash64.c| 16 
> >   target/ppc/mmu-radix64.c   |  4 +---
> >   8 files changed, 17 insertions(+), 40 deletions(-)
> > 
> > diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h
> > index ec14574d14..eb85d9aa71 100644
> > --- a/target/ppc/cpu.h
> > +++ b/target/ppc/cpu.h
> > @@ -1437,6 +1437,7 @@ struct ArchCPU {
> >   int vcpu_id;
> >   uint32_t compat_pvr;
> >   PPCVirtualHypervisor *vhyp;
> > +PPCVirtualHypervisorClass *vhyp_class;
> >   void *machine_data;
> >   int32_t node_id; /* NUMA node this CPU belongs to */
> >   PPCHash64Options *hash64_opts;
> > @@ -1535,7 +1536,7 @@ DECLARE_OBJ_CHECKERS(PPCVirtualHypervisor, 
> > PPCVirtualHypervisorClass,
> >   
> >   static inline bool vhyp_cpu_in_nested(PowerPCCPU *cpu)
> >   {
> > -return PPC_VIRTUAL_HYPERVISOR_GET_CLASS(cpu->vhyp)->cpu_in_nested(cpu);
> > +return cpu->vhyp_class->cpu_in_nested(cpu);
> >   }
> >   #endif /* CONFIG_USER_ONLY */
> >   
> > diff --git a/target/ppc/mmu-book3s-v3.h b/target/ppc/mmu-book3s-v3.h
> > index 674377a19e..f3f7993958 100644
> > --- a/target/ppc/mmu-book3s-v3.h
> > +++ b/target/ppc/mmu-book3s-v3.h
> > @@ -108,9 +108,7 @@ static inline hwaddr ppc_hash64_hpt_mask(PowerPCCPU 
> > *cpu)
> >   uint64_t base;
> >   
> >   if (cpu->vhyp) {
>
> All the checks for cpu->vhyp need to be changed to check for
> cpu->vhyp_class now, for all such instances.

It wasn't supposed to, because vhyp != NULL implies vhyp_class != NULL.
It's supposed to be an equivalent transformation just changing the
lookup function.

Okay to leave it as is?

Thanks,
Nick
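
To make the argument concrete, here is a toy Python model of the
pattern.  Everything in it is a stand-in invented for illustration
(Cpu, VhypClass, the dict used as a vhyp object, and get_class() in
place of the costly PPC_VIRTUAL_HYPERVISOR_GET_CLASS lookup); it is not
QEMU code.  It shows the class being resolved once when vhyp is set, and
the existing "vhyp is set" check guarding the cached pointer.

```python
class VhypClass:
    """Hypothetical stand-in for PPCVirtualHypervisorClass."""
    def hpt_mask(self, vhyp):
        return 0xFFFF


def get_class(vhyp):
    """Stand-in for the costly dynamic class lookup; counts its calls."""
    get_class.calls += 1
    return vhyp["class"]


get_class.calls = 0


class Cpu:
    def __init__(self):
        self.vhyp = None
        self.vhyp_class = None

    def set_vhyp(self, vhyp):
        # Both fields are assigned together, so a non-None vhyp
        # implies a non-None vhyp_class.
        self.vhyp = vhyp
        self.vhyp_class = get_class(vhyp)


def hpt_mask(cpu):
    # The guard still tests vhyp; the cached class is used for dispatch.
    if cpu.vhyp is not None:
        return cpu.vhyp_class.hpt_mask(cpu.vhyp)
    return None
```

The invariant only holds if every place that assigns vhyp also assigns
vhyp_class, which is why the pegasos2 and cpu_ppc_set_vhyp() hunks set
both.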

>
> With that,
>
> Reviewed-by: Harsh Prateek Bora 
>
>
> > -PPCVirtualHypervisorClass *vhc =
> > -PPC_VIRTUAL_HYPERVISOR_GET_CLASS(cpu->vhyp);
> > -return vhc->hpt_mask(cpu->vhyp);
> > +return cpu->vhyp_class->hpt_mask(cpu->vhyp);
> >   }
> >   if (cpu->env.mmu_model == POWERPC_MMU_3_00) {
> >   ppc_v3_pate_t pate;
> > diff --git a/hw/ppc/pegasos2.c b/hw/ppc/pegasos2.c
> > index 04d6decb2b..c22e8b336d 100644
> > --- a/hw/ppc/pegasos2.c
> > +++ b/hw/ppc/pegasos2.c
> > @@ -400,6 +400,7 @@ static void pegasos2_machine_reset(MachineState 
> > *machine, ShutdownCause reason)
> >   machine->fdt = fdt;
> >   
> >   pm->cpu->vhyp = PPC_VIRTUAL_HYPERVISOR(machine);
> > +pm->cpu->vhyp_class = PPC_VIRTUAL_HYPERVISOR_GET_CLASS(pm->cpu->vhyp);
> >   }
> >   
> >   enum pegasos2_rtas_tokens {
> > diff --git a/target/ppc/cpu_init.c b/target/ppc/cpu_init.c
> > index 9bccddb350..63d0094024 100644
> > --- a/target/ppc/cpu_init.c
> > +++ b/target/ppc/cpu_init.c
> > @@ -6631,6 +6631,7 @@ void cpu_ppc_set_vhyp(PowerPCCPU *cpu, 
> > PPCVirtualHypervisor *vhyp)
> >   CPUPPCState *env = &cpu->env;
> >   
> >   cpu->vhyp = vhyp;
> > +cpu->vhyp_class = PPC_VIRTUAL_HYPERVISOR_GET_CLASS(vhyp);
> >   
> >   /*
> >* With a virtual hypervisor mode we never allow the CPU to go
> > @@ -7224,9 +7225,7 @@ static void ppc_cpu_exec_enter(CPUState *cs)
> >   PowerPCCPU *cpu = POWERPC_CPU(cs);
> >   
> >   if (cpu->vhyp) {
> > -PPCVirtualHypervisorClass *vhc =
> > -PPC_VIRTUAL_HYPERVISOR_GET_CLASS(cpu->vhyp);
> > -vhc->cpu_exec_enter(cpu->vhyp, cpu);
> > +cpu->vhyp_class->cpu_exec_enter(cpu->vhyp, cpu);
> >   }
> >   }
> >   
> > @@ -7235,9 +7234,7 @@ static void ppc_cpu_exec_exit(CPUState *cs)
>

Re: [PATCH 04/13] ppc/spapr: Remove copy-paste from pa-features

2024-03-12 Thread Harsh Prateek Bora


Hi Nick,

On 3/12/24 00:21, Nicholas Piggin wrote:

TCG does not support copy/paste instructions. Remove it from
ibm,pa-features. This has never been implemented under TCG or


s/or/nor ?


practically usable under KVM, so it won't be missed.

Signed-off-by: Nicholas Piggin 
---
  hw/ppc/spapr.c | 8 ++--
  1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 5099f12cc6..7d7da30f60 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -254,8 +254,8 @@ static void spapr_dt_pa_features(SpaprMachineState *spapr,
  0x80, 0x00, 0x80, 0x00, 0x80, 0x00, /* 24 - 29 */
  /* 30: MMR, 32: LE atomic, 34: EBB + ext EBB */
  0x80, 0x00, 0x80, 0x00, 0xC0, 0x00, /* 30 - 35 */
-/* 36: SPR SO, 38: Copy/Paste, 40: Radix MMU */
-0x80, 0x00, 0x80, 0x00, 0x80, 0x00, /* 36 - 41 */
+/* 36: SPR SO, 40: Radix MMU */
+0x80, 0x00, 0x00, 0x00, 0x80, 0x00, /* 36 - 41 */
  /* 42: PM, 44: PC RA, 46: SC vec'd */
  0x80, 0x00, 0x80, 0x00, 0x80, 0x00, /* 42 - 47 */
  /* 48: SIMD, 50: QP BFP, 52: String */
@@ -288,6 +288,10 @@ static void spapr_dt_pa_features(SpaprMachineState *spapr,
   * SSO (SAO) ordering is supported on KVM and thread=single hosts,
   * but not MTTCG, so disable it. To advertise it, a cap would have
   * to be added, or support implemented for MTTCG.
+ *
+ * Copy/paste is not supported by TCG, so it is not advertised. KVM
+ * can execute them but it has no accelerator drivers which are usable,
+ * so there isn't much need for it anyway.
   */


If doing a re-spin, you may consider comments on prev patch applicable
above as well. Either way, with prev typo fixed:

Reviewed-by: Harsh Prateek Bora 

  
  if (ppc_hash64_has(cpu, PPC_HASH64_CI_LARGEPAGE)) {
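
For reference, the byte-level effect of the hunk above can be modelled
in a few lines of Python.  This is only an illustration: the helper name
clear_byte() is invented, and the row is just attribute bytes 36 to 41
as they appear in the diff, not the whole pa-features array.

```python
# Attribute bytes 36-41 of the spapr array before the patch (from the diff).
FIRST_BYTE = 36
before = [0x80, 0x00, 0x80, 0x00, 0x80, 0x00]


def clear_byte(row, first, index):
    """Return a copy of 'row' with attribute byte 'index' zeroed."""
    out = list(row)
    out[index - first] = 0x00
    return out


# Byte 38 carried the copy/paste bit; the patch stops advertising it.
after = clear_byte(before, FIRST_BYTE, 38)

# Bytes that still have their top bit set after the change.
still_set = [FIRST_BYTE + i for i, b in enumerate(after) if b & 0x80]
```

The remaining set bytes are 36 and 40, matching the updated comment
"36: SPR SO, 40: Radix MMU".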

Re: [PATCH 02/13] target/ppc: POWER10 does not have transactional memory

2024-03-12 Thread Nicholas Piggin

On Tue Mar 12, 2024 at 6:10 PM AEST, Harsh Prateek Bora wrote:
> Hi Nick,
>
> One query/comment below:
>
> On 3/12/24 00:21, Nicholas Piggin wrote:
> > POWER10 hardware implements a degenerate transactional memory facility
> > in POWER8/9 PCR compatibility modes to permit migration from older
> > CPUs, but POWER10 / ISA v3.1 mode does not support it so the CPU model
> > should not support it.
> > 
> > Signed-off-by: Nicholas Piggin 
> > ---
> >   target/ppc/cpu_init.c | 4 ++--
> >   1 file changed, 2 insertions(+), 2 deletions(-)
> > 
> > diff --git a/target/ppc/cpu_init.c b/target/ppc/cpu_init.c
> > index 572cbdf25f..d7e84a2f40 100644
> > --- a/target/ppc/cpu_init.c
> > +++ b/target/ppc/cpu_init.c
> > @@ -6573,7 +6573,7 @@ POWERPC_FAMILY(POWER10)(ObjectClass *oc, void *data)
> >   PPC2_FP_TST_ISA206 | PPC2_BCTAR_ISA207 |
> >   PPC2_LSQ_ISA207 | PPC2_ALTIVEC_207 |
> >   PPC2_ISA205 | PPC2_ISA207S | PPC2_FP_CVT_S64 |
> > -PPC2_TM | PPC2_ISA300 | PPC2_PRCNTL | PPC2_ISA310 |
> > +PPC2_ISA300 | PPC2_PRCNTL | PPC2_ISA310 |
> >   PPC2_MEM_LWSYNC | PPC2_BCDA_ISA206;
> >   pcc->msr_mask = (1ull << MSR_SF) |
> >   (1ull << MSR_HV) |
> > @@ -6617,7 +6617,7 @@ POWERPC_FAMILY(POWER10)(ObjectClass *oc, void *data)
> >   pcc->flags = POWERPC_FLAG_VRE | POWERPC_FLAG_SE |
> >POWERPC_FLAG_BE | POWERPC_FLAG_PMM |
> >POWERPC_FLAG_BUS_CLK | POWERPC_FLAG_CFAR |
> > - POWERPC_FLAG_VSX | POWERPC_FLAG_TM | POWERPC_FLAG_SCV;
> > + POWERPC_FLAG_VSX | POWERPC_FLAG_SCV;
> >   pcc->l1_dcache_size = 0x8000;
> >   pcc->l1_icache_size = 0x8000;
> >   }
>
> Shouldn't we also have below change included with this:
>
> diff --git a/target/ppc/cpu_init.c b/target/ppc/cpu_init.c
> index aac095e5fd..faefc0420e 100644
> --- a/target/ppc/cpu_init.c
> +++ b/target/ppc/cpu_init.c
> @@ -6641,7 +6641,6 @@ POWERPC_FAMILY(POWER10)(ObjectClass *oc, void *data)
>   PPC2_MEM_LWSYNC | PPC2_BCDA_ISA206 | PPC2_ATTN;
>   pcc->msr_mask = (1ull << MSR_SF) |
>   (1ull << MSR_HV) |
> -(1ull << MSR_TM) |
>   (1ull << MSR_VR) |
>   (1ull << MSR_VSX) |
>   (1ull << MSR_EE) |

I think you're probably right. I'll do some testing...

Thanks,
Nick

>
> Otherwise,
> Reviewed-by: Harsh Prateek Bora

Re: [PATCH 01/13] ppc: Drop support for POWER9 and POWER10 DD1 chips

2024-03-12 Thread Harsh Prateek Bora





On 3/12/24 14:29, Nicholas Piggin wrote:

On Tue Mar 12, 2024 at 2:55 PM AEST, Harsh Prateek Bora wrote:



On 3/12/24 10:20, Harsh Prateek Bora wrote:



On 3/12/24 00:21, Nicholas Piggin wrote:

The POWER9 DD1 and POWER10 DD1 chips are not public and are no longer of
any use in QEMU. Remove them.

Signed-off-by: Nicholas Piggin 
---
   hw/ppc/spapr_cpu_core.c |  2 --
   target/ppc/cpu-models.c |  4 
   target/ppc/cpu_init.c   |  7 ++-
   target/ppc/kvm.c    | 11 ---
   4 files changed, 2 insertions(+), 22 deletions(-)


Do we want to squash in removal of the macro as well?




Actually both, correcting diff:

diff --git a/target/ppc/cpu-models.h b/target/ppc/cpu-models.h
index 0229ef3a9a..7d89b41214 100644
--- a/target/ppc/cpu-models.h
+++ b/target/ppc/cpu-models.h
@@ -348,11 +348,9 @@ enum {
   CPU_POWERPC_POWER8NVL_BASE = 0x004C0000,
   CPU_POWERPC_POWER8NVL_v10  = 0x004C0100,
   CPU_POWERPC_POWER9_BASE= 0x004E0000,
-CPU_POWERPC_POWER9_DD1 = 0x004E1100,
   CPU_POWERPC_POWER9_DD20= 0x004E1200,
   CPU_POWERPC_POWER9_DD22= 0x004E1202,
   CPU_POWERPC_POWER10_BASE   = 0x00800000,
-CPU_POWERPC_POWER10_DD1= 0x00801100,
   CPU_POWERPC_POWER10_DD20   = 0x00801200,
   CPU_POWERPC_970_v22= 0x00390202,
   CPU_POWERPC_970FX_v10  = 0x00391100,


That would make sense, but we do seem to use this list as somewhat of a
reference or at least historic graveyard too (note all the other CPUs we
no longer support). So I was going to just leave them there.


Oh ok, in that case, it's fine.

regards,
Harsh


Thanks,
Nick

Re: [PATCH v3] docs/system/ppc: Document running Linux on AmigaNG machines

2024-03-12 Thread Bernhard Beschow




On 9 March 2024 11:34:56 UTC, BALATON Zoltan wrote:
>On Thu, 29 Feb 2024, BALATON Zoltan wrote:
>> On Wed, 21 Feb 2024, BALATON Zoltan wrote:
>>> Documentation on how to run Linux on the amigaone, pegasos2 and
>>> sam460ex machines is currently buried in the depths of the qemu-devel
>>> mailing list and in the source code. Let's collect the information in
>>> the QEMU handbook for a one stop solution.
>> 
>> Ping? (Just so it's not missed from next pull.)
>
>Ping for freeze.

Has this patch been tagged yet? It would really be a pity if it didn't make it 
into 9.0.

FWIW:

Reviewed-by: Bernhard Beschow 

>
>> Regards,
>> BALATON Zoltan
>> 
>>> Co-authored-by: Bernhard Beschow 
>>> Signed-off-by: BALATON Zoltan 
>>> Reviewed-by: Nicholas Piggin 
>>> Tested-by: Bernhard Beschow 
>>> ---
>>> v3: Apply changes and Tested-by tag from Bernhard
>>> v2: Move top level title one level up so subsections will be below it in TOC
>>> 
>>> MAINTAINERS |   1 +
>>> docs/system/ppc/amigang.rst | 161 
>>> docs/system/target-ppc.rst  |   1 +
>>> 3 files changed, 163 insertions(+)
>>> create mode 100644 docs/system/ppc/amigang.rst
>>> 
>>> diff --git a/MAINTAINERS b/MAINTAINERS
>>> index 7d61fb9319..0aef8cb2a6 100644
>>> --- a/MAINTAINERS
>>> +++ b/MAINTAINERS
>>> @@ -1562,6 +1562,7 @@ F: hw/rtc/m41t80.c
>>> F: pc-bios/canyonlands.dt[sb]
>>> F: pc-bios/u-boot-sam460ex-20100605.bin
>>> F: roms/u-boot-sam460ex
>>> +F: docs/system/ppc/amigang.rst
>>> 
>>> pegasos2
>>> M: BALATON Zoltan 
>>> diff --git a/docs/system/ppc/amigang.rst b/docs/system/ppc/amigang.rst
>>> new file mode 100644
>>> index 0000000000..ba1a3d80b9
>>> --- /dev/null
>>> +++ b/docs/system/ppc/amigang.rst
>>> @@ -0,0 +1,161 @@
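>>> +=========================================================
>>> +AmigaNG boards (``amigaone``, ``pegasos2``, ``sam460ex``)
>>> +=========================================================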
>>> +
>>> +These PowerPC machines emulate boards that are primarily used for
>>> +running Amiga like OSes (AmigaOS 4, MorphOS and AROS) but these can
>>> +also run Linux which is what this section documents.
>>> +
>>> +Eyetech AmigaOne/Mai Logic Teron (``amigaone``)
>>> +===============================================
>>> +
>>> +The ``amigaone`` machine emulates an AmigaOne XE mainboard by Eyetech
>>> +which is a rebranded Mai Logic Teron board with modified U-Boot
>>> +firmware to support AmigaOS 4.
>>> +
>>> +Emulated devices
>>> +----------------
>>> +
>>> + * PowerPC 7457 CPU (can also use ``-cpu g3, 750cxe, 750fx`` or ``750gx``)
>>> + * Articia S north bridge
>>> + * VIA VT82C686B south bridge
>>> + * PCI VGA compatible card (guests may need other card instead)
>>> + * PS/2 keyboard and mouse
>>> +
>>> +Firmware
>>> +
>>> +--------
>>> +A firmware binary is necessary for the boot process. It is a modified
>>> +U-Boot under GPL but its source is lost so it cannot be included in
>>> +QEMU. A binary is available at
>>> +https://www.hyperion-entertainment.com/index.php/downloads?view=files&parent=28.
>>> +The ROM image is in the last 512kB which can be extracted with the
>>> +following command:
>>> +
>>> +.. code-block:: bash
>>> +
>>> +  $ tail -c 524288 updater.image > u-boot-amigaone.bin
>>> +
>>> +The BIOS emulator in the firmware is unable to run QEMU's standard
>>> +vgabios so ``VGABIOS-lgpl-latest.bin`` is needed instead which can be
>>> +downloaded from http://www.nongnu.org/vgabios.
>>> +
>>> +Running Linux
>>> +-------------
>>> +
>>> +There are some Linux images under the following link that work on the
>>> +``amigaone`` machine:
>>> +https://sourceforge.net/projects/amigaone-linux/files/debian-installer/.
>>> +To boot the system run:
>>> +
>>> +.. code-block:: bash
>>> +
>>> +  $ qemu-system-ppc -machine amigaone -bios u-boot-amigaone.bin \
>>> +                    -cdrom "A1 Linux Net Installer.iso" \
>>> +                    -device ati-vga,model=rv100,romfile=VGABIOS-lgpl-latest.bin
>>> +
>>> +From the firmware menu that appears select ``Boot sequence`` →
>>> +``Amiga Multiboot Options`` and set ``Boot device 1`` to
>>> +``Onboard VIA IDE CDROM``. Then hit escape until the main screen appears again,
>>> +hit escape once more and from the exit menu that appears select either
>>> +``Save settings and exit`` or ``Use settings for this session only``. It may
>>> +take a long time loading the kernel into memory but eventually it boots and the
>>> +installer becomes visible. The ``ati-vga`` RV100 emulation is not
>>> +complete yet so only frame buffer works, DRM and 3D are not available.
>>> +
>>> +Genesi/bPlan Pegasos II (``pegasos2``)
>>> +======================================
>>> +
>>> +The ``pegasos2`` machine emulates the Pegasos II sold by Genesi and
>>> +designed by bPlan. Its schematics are available at
>>> +https://www.powerdeveloper.org/platforms/pegasos/schematics.
>>> +
>>> +Emulated devices
>>> +----------------
>>> +
>>> + * PowerPC 7457 CPU (can also use ``-cpu

Re: [PATCH v2 00/29] Cleanup up to fix missing ERRP_GUARD() for error_prepend()

2024-03-12 Thread Thomas Huth

On 12/03/2024 11.26, Zhao Liu wrote:

On Tue, Mar 12, 2024 at 09:50:25AM +0100, Thomas Huth wrote:

Date: Tue, 12 Mar 2024 09:50:25 +0100
From: Thomas Huth 
Subject: Re: [PATCH v2 00/29] Cleanup up to fix missing ERRP_GUARD() for
  error_prepend()

On 12/03/2024 09.43, Zhao Liu wrote:

Hi Thomas/Markus/Michael,

For the remaining patches, could you please help me merge them next?

Many thanks!

Yes, I'm currently reviewing the ones that don't have a Reviewed-by yet. I
can pick up the remaining patches if the other maintainers won't pick them
up for the softfreeze today.

Appreciate that you can help me get on the last train of releases.

If possible, could you please also help me pick up two other ERRP_GUARD()
related cleanups (total 8 patches, both got r/b)? ;-)

My cleanup is too fragmented, I'll try to centralize my work to make it easier
for maintainer to review and merge in the future!

[1]: 
https://lore.kernel.org/qemu-devel/20240223085653.1255438-1-zhao1@linux.intel.com/
[2]: 
https://lore.kernel.org/qemu-devel/20240312060337.3240965-1-zhao1@linux.intel.com/

I'll try to include them!

 Thomas

Re: [PATCH v3 00/29] hw, target: Prefer fast cpu_env() over slower CPU QOM cast macro

2024-03-12 Thread Thomas Huth


On 29/01/2024 17.44, Philippe Mathieu-Daudé wrote:

Patches missing review: 1, 2, 5, 6, 8, 11, 14, 15, 29

It will be simpler if I get the whole series via my hw-cpus
tree once fully reviewed.

Since v2:
- Rebased
- bsd/linux-user
- Preliminary clean cpu_reset_hold
- Add R-b

Since v1:
- Avoid CPU() cast (Paolo)
- Split per targets (Thomas)

Use cpu_env() -- which is fast path -- when possible.
Bulk conversion using Coccinelle spatch (script included).

Philippe Mathieu-Daudé (29):
   bulk: Access existing variables initialized to &S->F when available
   hw/core: Declare CPUArchId::cpu as CPUState instead of Object
   hw/acpi/cpu: Use CPUState typedef
   bulk: Call in place single use cpu_env()
   scripts/coccinelle: Add cpu_env.cocci script
   target: Replace CPU_GET_CLASS(cpu -> obj) in cpu_reset_hold() handler
   target/alpha: Prefer fast cpu_env() over slower CPU QOM cast macro
   target/arm: Prefer fast cpu_env() over slower CPU QOM cast macro
   target/avr: Prefer fast cpu_env() over slower CPU QOM cast macro
   target/cris: Prefer fast cpu_env() over slower CPU QOM cast macro
   target/hexagon: Prefer fast cpu_env() over slower CPU QOM cast macro
   target/hppa: Prefer fast cpu_env() over slower CPU QOM cast macro
   target/i386/hvf: Use CPUState typedef
   target/i386: Prefer fast cpu_env() over slower CPU QOM cast macro
   target/loongarch: Prefer fast cpu_env() over slower CPU QOM cast macro
   target/m68k: Prefer fast cpu_env() over slower CPU QOM cast macro
   target/microblaze: Prefer fast cpu_env() over slower CPU QOM cast macro
   target/mips: Prefer fast cpu_env() over slower CPU QOM cast macro
   target/nios2: Prefer fast cpu_env() over slower CPU QOM cast macro
   target/openrisc: Prefer fast cpu_env() over slower CPU QOM cast macro
   target/ppc: Prefer fast cpu_env() over slower CPU QOM cast macro
   target/riscv: Prefer fast cpu_env() over slower CPU QOM cast macro
   target/rx: Prefer fast cpu_env() over slower CPU QOM cast macro
   target/s390x: Prefer fast cpu_env() over slower CPU QOM cast macro
   target/sh4: Prefer fast cpu_env() over slower CPU QOM cast macro
   target/sparc: Prefer fast cpu_env() over slower CPU QOM cast macro
   target/tricore: Prefer fast cpu_env() over slower CPU QOM cast macro
   target/xtensa: Prefer fast cpu_env() over slower CPU QOM cast macro
   user: Prefer fast cpu_env() over slower CPU QOM cast macro


FYI, I'll try to queue those for my PR today except for:

 scripts/coccinelle: Add cpu_env.cocci script
 --> Still needs review and you mentioned a pending change

 target/arm: Prefer fast cpu_env() over slower CPU QOM cast macro
 --> Needs a rebase and review

 target/hppa: Prefer fast cpu_env() over slower CPU QOM cast macro
 --> Needs a rebase

 target/i386: Prefer fast cpu_env() over slower CPU QOM cast macro
 --> There were unaddressed review comments from Igor

 target/riscv: Prefer fast cpu_env() over slower CPU QOM cast macro
 --> Needs a rebase

 Thomas

[PATCH] tests: Raise timeouts for bufferiszero and crypto-tlscredsx509

2024-03-12 Thread Peter Maydell

On our gcov CI job, the bufferiszero and crypto-tlscredsx509
tests time out occasionally, making the job flaky. Double the
timeout on these two tests.

Cc: qemu-sta...@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2221
Signed-off-by: Peter Maydell 
---
cc stable just because it probably helps CI reliability there too
---
 tests/unit/meson.build | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/tests/unit/meson.build b/tests/unit/meson.build
index cae925c1325..30db3c418fa 100644
--- a/tests/unit/meson.build
+++ b/tests/unit/meson.build
@@ -173,8 +173,9 @@ test_env.set('G_TEST_BUILDDIR', meson.current_build_dir())
 
 slow_tests = {
   'test-aio-multithread' : 120,
+  'test-bufferiszero': 60,
   'test-crypto-block' : 300,
-  'test-crypto-tlscredsx509': 45,
+  'test-crypto-tlscredsx509': 90,
   'test-crypto-tlssession': 45,
   'test-replication': 60,
 }
-- 
2.34.1

Re: [PATCH v5 52/65] i386/tdx: Wire TDX_REPORT_FATAL_ERROR with GuestPanic facility

2024-03-12 Thread Xiaoyao Li


On 3/11/2024 3:29 PM, Markus Armbruster wrote:

Xiaoyao Li  writes:


On 3/7/2024 9:51 PM, Markus Armbruster wrote:

Xiaoyao Li  writes:


On 2/29/2024 4:51 PM, Markus Armbruster wrote:

Xiaoyao Li  writes:


Integrate TDX's TDX_REPORT_FATAL_ERROR into QEMU GuestPanic facility

Originated-from: Isaku Yamahata 
Signed-off-by: Xiaoyao Li 
---
Changes in v5:
- mention additional error information in gpa when it presents;
- refine the documentation; (Markus)

Changes in v4:
- refine the documentation; (Markus)

Changes in v3:
- Add documentation of new type and struct; (Daniel)
- refine the error message handling; (Daniel)
---
qapi/run-state.json   | 31 +--
system/runstate.c | 58 +++
target/i386/kvm/tdx.c | 24 +-
3 files changed, 110 insertions(+), 3 deletions(-)

diff --git a/qapi/run-state.json b/qapi/run-state.json
index dd0770b379e5..b71dd1884eb6 100644
--- a/qapi/run-state.json
+++ b/qapi/run-state.json


[...]


@@ -564,6 +567,30 @@
  'psw-addr': 'uint64',
  'reason': 'S390CrashReason'}}
+##
+# @GuestPanicInformationTdx:
+#
+# TDX Guest panic information specific to TDX, as specified in the
+# "Guest-Hypervisor Communication Interface (GHCI) Specification",
+# section TDG.VP.VMCALL.
+#
+# @error-code: TD-specific error code
+#
+# @message: Human-readable error message provided by the guest. Not
+# to be trusted.
+#
+# @gpa: guest-physical address of a page that contains more verbose
+# error information, as zero-terminated string.  Present when the
+# "GPA valid" bit (bit 63) is set in @error-code.


Uh, peeking at GHCI Spec section 3.4 TDG.VP.VMCALL, I
see operand R12 consists of

   bits    name                    description
   31:0    TD-specific error code  TD-specific error code
                                   Panic – 0x0.
                                   Values – 0x1 to 0x
                                   reserved.
   62:32   TD-specific extended    TD-specific extended error code.
           error code              TD software defined.
   63      GPA Valid               Set if the TD specified additional
                                   information in the GPA parameter
                                   (R13).

Is @error-code all of R12, or just bits 31:0?
If it's all of R12, description of @error-code as "TD-specific error
code" is misleading.


We pass all of R12 to @error_code.

Here we want "error_code" to be generic and cover the whole of R12. Do you have
a better description for it?


Sadly, the spec is of no help: it doesn't name the entire thing, only
the three sub-fields TD-specific error code, TD-specific extended error
code, GPA valid.

We could take the hint, and provide the sub-fields instead:

* @error-code contains the TD-specific error code (bits 31:0)

* @extended-error-code contains the TD-specific extended error code
(bits 62:32)

* we don't need @gpa-valid, because it's the same as "@gpa is present"

If we decide to keep the single member, we do need another name for it.
@error-codes (plural) doesn't exactly feel wonderful, but it gives at
least a subtle hint that it's not just *the* error code.


The reason we only defined one single member, is that the
extended-error-code is not used now, and I believe it won't be used in
the near future.


Aha!  Then I recommend

* @error-code contains the TD-specific error code (bits 31:0)

* Omit bits 62:32 from the reply; if we later find an actual use for
   them, we can add a suitable member

* Omit bit 63, because it's the same as "@gpa is present"


If no objection from others, I will use @error-codes (plural) in the
next version.


I recommend to keep the @error-code name, but narrow its value to the
actual error code, i.e. bits 31:0.


That works for me. I will go in this direction in the next version.


If it's just bits 31:0, then 'Present when the "GPA valid" bit (bit 63)
is set in @error-code' is wrong.  Could go with 'Only present when the
guest provides this information'.
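
As an illustration of the split recommended above, here is a minimal C sketch that separates a raw R12 value into the three sub-fields listed in the quoted GHCI table. The struct and function names are hypothetical and are not taken from the patch or from QEMU.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the three R12 sub-fields (illustration only). */
typedef struct {
    uint32_t error_code;      /* bits 31:0  - TD-specific error code */
    uint32_t ext_error_code;  /* bits 62:32 - TD-specific extended error code */
    bool gpa_valid;           /* bit 63     - GPA parameter (R13) is valid */
} TdxFatalErrorFields;

/* Split a raw R12 value into its sub-fields. */
static inline TdxFatalErrorFields tdx_split_r12(uint64_t r12)
{
    TdxFatalErrorFields f;

    f.error_code = (uint32_t)(r12 & 0xffffffffu);
    f.ext_error_code = (uint32_t)((r12 >> 32) & 0x7fffffffu);
    f.gpa_valid = ((r12 >> 63) & 1u) != 0;
    return f;
}
```

With such a split, only `error_code` would be reported as @error-code, and `gpa_valid` would decide whether @gpa is present at all.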


[...]

Re: [PATCH 02/13] target/ppc: POWER10 does not have transactional memory

2024-03-12 Thread Harsh Prateek Bora


Hi Nick,

One query/comment below:

On 3/12/24 00:21, Nicholas Piggin wrote:

POWER10 hardware implements a degenerate transactional memory facility
in POWER8/9 PCR compatibility modes to permit migration from older
CPUs, but POWER10 / ISA v3.1 mode does not support it so the CPU model
should not support it.

Signed-off-by: Nicholas Piggin 
---
  target/ppc/cpu_init.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/target/ppc/cpu_init.c b/target/ppc/cpu_init.c
index 572cbdf25f..d7e84a2f40 100644
--- a/target/ppc/cpu_init.c
+++ b/target/ppc/cpu_init.c
@@ -6573,7 +6573,7 @@ POWERPC_FAMILY(POWER10)(ObjectClass *oc, void *data)
  PPC2_FP_TST_ISA206 | PPC2_BCTAR_ISA207 |
  PPC2_LSQ_ISA207 | PPC2_ALTIVEC_207 |
  PPC2_ISA205 | PPC2_ISA207S | PPC2_FP_CVT_S64 |
-PPC2_TM | PPC2_ISA300 | PPC2_PRCNTL | PPC2_ISA310 |
+PPC2_ISA300 | PPC2_PRCNTL | PPC2_ISA310 |
  PPC2_MEM_LWSYNC | PPC2_BCDA_ISA206;
  pcc->msr_mask = (1ull << MSR_SF) |
  (1ull << MSR_HV) |
@@ -6617,7 +6617,7 @@ POWERPC_FAMILY(POWER10)(ObjectClass *oc, void *data)
  pcc->flags = POWERPC_FLAG_VRE | POWERPC_FLAG_SE |
   POWERPC_FLAG_BE | POWERPC_FLAG_PMM |
   POWERPC_FLAG_BUS_CLK | POWERPC_FLAG_CFAR |
- POWERPC_FLAG_VSX | POWERPC_FLAG_TM | POWERPC_FLAG_SCV;
+ POWERPC_FLAG_VSX | POWERPC_FLAG_SCV;
  pcc->l1_dcache_size = 0x8000;
  pcc->l1_icache_size = 0x8000;
  }


Shouldn't we also have below change included with this:

diff --git a/target/ppc/cpu_init.c b/target/ppc/cpu_init.c
index aac095e5fd..faefc0420e 100644
--- a/target/ppc/cpu_init.c
+++ b/target/ppc/cpu_init.c
@@ -6641,7 +6641,6 @@ POWERPC_FAMILY(POWER10)(ObjectClass *oc, void *data)
 PPC2_MEM_LWSYNC | PPC2_BCDA_ISA206 | PPC2_ATTN;
 pcc->msr_mask = (1ull << MSR_SF) |
 (1ull << MSR_HV) |
-(1ull << MSR_TM) |
 (1ull << MSR_VR) |
 (1ull << MSR_VSX) |
 (1ull << MSR_EE) |

Otherwise,
Reviewed-by: Harsh Prateek Bora

Re: [PATCH v3] hw: gpio: introduce pcf8574 driver

2024-03-12 Thread Philippe Mathieu-Daudé


On 11/3/24 10:58, Dmitriy Sharikhin wrote:

NXP PCF8574 and compatible ICs are simple I2C GPIO expanders.
PCF8574 incorporates quasi-bidirectional IO, and simple
communication protocol, when IO read is I2C byte read, and
IO write is I2C byte write. User can think of it as
open-drain port, when line high state is input and line low
state is output.

Signed-off-by: Dmitrii Sharikhin 
---
  MAINTAINERS   |   6 ++
  hw/gpio/Kconfig   |   4 +
  hw/gpio/meson.build   |   1 +
  hw/gpio/pcf8574.c | 162 ++
  include/hw/gpio/pcf8574.h |  15 
  5 files changed, 188 insertions(+)
  create mode 100644 hw/gpio/pcf8574.c
  create mode 100644 include/hw/gpio/pcf8574.h


Patch queued, thanks!
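
For readers unfamiliar with quasi-bidirectional I/O as described in the commit message, the behaviour can be sketched roughly as follows. This is a simplified, hypothetical model and not the code from hw/gpio/pcf8574.c: a line written as 1 is only weakly held high and can be pulled low externally (input), while a line written as 0 is driven low (output).

```c
#include <stdint.h>

/* Hypothetical model of a PCF8574-style port (illustration only). */
typedef struct {
    uint8_t latch;    /* last byte written over I2C; 1 = line released high */
    uint8_t ext_low;  /* bit set = an external device pulls that line low */
} Pcf8574Model;

/* An I2C byte write simply updates the output latch. */
static inline void pcf8574_model_write(Pcf8574Model *p, uint8_t byte)
{
    p->latch = byte;
}

/* An I2C byte read returns the line levels: low if driven low by either side. */
static inline uint8_t pcf8574_model_read(const Pcf8574Model *p)
{
    return (uint8_t)(p->latch & ~p->ext_low);
}
```

In this model a guest that wants to use a line as an input first writes 1 to it and then reads the port back.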

Re: [PATCH] meson.build: Always require an objc compiler on macos hosts

2024-03-12 Thread Philippe Mathieu-Daudé


On 11/3/24 14:33, Peter Maydell wrote:

We currently only insist that an ObjectiveC compiler is present on
macos hosts if we're building the Cocoa UI.  However, since then
we've added some other parts of QEMU which are also written in ObjC:
the coreaudio audio backend, and the vmnet net backend.  This means
that if you try to configure QEMU on macos with --disable-cocoa the
build will fail:

../meson.build:3741:13: ERROR: No host machine compiler for 'audio/coreaudio.m'

Since in practice any macos host will have an ObjC compiler
available, rather than trying to gate the compiler detection on an
increasingly complicated list of every bit of QEMU that uses ObjC,
just require it unconditionally on macos hosts.

Resolves https://gitlab.com/qemu-project/qemu/-/issues/2138
Signed-off-by: Peter Maydell 
---
Per the commit message, in theory we could allow a no-objc
build and disable coreaudio, vmnet, etc. But I didn't really see
a reason why that would be useful, and it's bound to keep
breaking unless we actively defend it in CI. So I preferred
to simply require ObjC on macos.

  meson.build | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)


Queued, thanks.

[PULL 09/13] hw/core: Cleanup unused included header in machine-qmp-cmds.c

2024-03-12 Thread Philippe Mathieu-Daudé

From: Zhao Liu 

Remove unused header (qemu/main-loop.h) in machine-qmp-cmds.c.

Tested by "./configure" and then "make".

Signed-off-by: Zhao Liu 
Reviewed-by: Philippe Mathieu-Daudé 
Message-ID: <20240311075621.3224684-3-zhao1@linux.intel.com>
Signed-off-by: Philippe Mathieu-Daudé 
---
 hw/core/machine-qmp-cmds.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/hw/core/machine-qmp-cmds.c b/hw/core/machine-qmp-cmds.c
index 3860a50c3b..4b72009cd3 100644
--- a/hw/core/machine-qmp-cmds.c
+++ b/hw/core/machine-qmp-cmds.c
@@ -19,7 +19,6 @@
 #include "qapi/qmp/qobject.h"
 #include "qapi/qobject-input-visitor.h"
 #include "qapi/type-helpers.h"
-#include "qemu/main-loop.h"
 #include "qemu/uuid.h"
 #include "qom/qom-qobject.h"
 #include "sysemu/hostmem.h"
-- 
2.41.0

Re: [PATCH v2 00/14] libvhost-user: support more memslots and cleanup memslot handling code

2024-03-12 Thread David Hildenbrand


On 11.03.24 21:03, Mario Casquero wrote:

This series has been successfully tested by QE. Start the
qemu-storage-daemon in the background with a rhel 9.5 image and
vhost-user-blk. After that, boot up a VM with virtio-mem and
vhost-user-blk-pci. Check with the HMP command 'info mtree' that
virtio-mem is making use of multiple memslots.

Tested-by: Mario Casquero 


Thanks Mario!

--
Cheers,

David / dhildenb

[PATCH v2 0/2] ARM Sbsa-ref: Enable CPU cluster topology

2024-03-12 Thread Xiong Yining

Enable CPU cluster support on SbsaQemu platform, so that users can
specify a 4-level CPU hierarchy sockets/clusters/cores/threads. And this
topology can be passed to the firmware through DT cpu-map.

Changes in v2:
- put this code before sbsa_fdt_add_gic_node().

xiongyining1480 (2):
  hw/arm/sbsa-ref:Enable CPU cluster on ARM sbsa machine
  hw/arm/sbsa-ref: Add cpu-map to device tree

 hw/arm/sbsa-ref.c | 35 +++
 1 file changed, 35 insertions(+)

-- 
2.34.1
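
To make the 4-level sockets/clusters/cores/threads hierarchy from the cover letter concrete, the sketch below decomposes a linear CPU index into per-level IDs. The type and function names are hypothetical, and the ordering (threads vary fastest, sockets slowest) is an assumption for illustration, not something stated in the series.

```c
#include <assert.h>

/* Hypothetical description of a 4-level topology (counts per parent). */
typedef struct {
    unsigned sockets, clusters, cores, threads;
} CpuTopo;

/* Hypothetical per-CPU position within that topology. */
typedef struct {
    unsigned socket, cluster, core, thread;
} CpuTopoId;

/* Total number of CPUs described by the topology. */
static inline unsigned topo_total_cpus(const CpuTopo *t)
{
    return t->sockets * t->clusters * t->cores * t->threads;
}

/* Map a linear CPU index to IDs, assuming threads vary fastest. */
static inline CpuTopoId topo_decompose(const CpuTopo *t, unsigned idx)
{
    CpuTopoId id;

    assert(idx < topo_total_cpus(t));
    id.thread = idx % t->threads;
    idx /= t->threads;
    id.core = idx % t->cores;
    idx /= t->cores;
    id.cluster = idx % t->clusters;
    idx /= t->clusters;
    id.socket = idx;
    return id;
}
```

A device tree cpu-map would then nest nodes in the same socket/cluster/core/thread order.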

Re: [RFC 0/2] Add RISC-V Server Platform Reference Board

2024-03-12 Thread Wu, Fei

On 3/8/2024 6:15 AM, Marcin Juszkiewicz wrote:
> W dniu 4.03.2024 o 11:25, Fei Wu pisze:
> 
>> The RISC-V Server Platform specification[1] defines a standardized
>> set of hardware and software capabilities, that portable system
>> software, such as OS and hypervisors can rely on being present in a
>> RISC-V server platform. This patchset provides a RISC-V Server
>> Platform (RVSP) reference implementation on qemu which is in
>> compliance with the spec as faithful as possible.
> 
> I am working on sbsa-ref which is AArch64 Standard Server Platform
> implementation. Will not go through details of rvsp-ref but give some
> potential hints from my work with our platform.
> 
Hi Marcin,

Thank you for sharing this.

> 
> 1. Consider versioning the platform.
> 
> We have 'platform_version'.'major/minor' exported in
> DeviceTree-formatted data. This allows for firmware to know which of
> non-discoverable hardware features exists and which not. We use it to
> disable XHCI controller on older platform version.
> 
Looks good, I will add it.

> 
> 2. If specification allows to have non-discoverable devices then add some.
> 
> This will require you to handle them in firmware in some way. Sooner or
> later some physical hardware will be in same situation so they can use
> your firmware code as reference. We have AHCI and XHCI on system bus
> (hardcoded in firmware).
> 
This RFC currently adds the devices like AHCI as PCI devices.

> 
> 3. You are going to use EDK2 with ACPI. Hide DT from code there with
> some hardware information library.
> 
> For sbsa-ref we created SbsaHardwareInfoLib in
> https://openfw.io/edk2-devel/20240306-no-dt-for-cpu-v6-0-acd8727a1...@linaro.org/
>  patchset.
> 
Looks good, I will ask my colleague working on FW part to take a look.

Thanks,
Fei.

Re: [PATCH v2 1/1] memory tier: acpi/hmat: create CPUless memory tiers after obtaining HMAT info

2024-03-12 Thread Huang, Ying

"Ho-Ren (Jack) Chuang"  writes:

> The current implementation treats emulated memory devices, such as
> CXL1.1 type3 memory, as normal DRAM when they are emulated as normal memory
> (E820_TYPE_RAM). However, these emulated devices have different
> characteristics than traditional DRAM, making it important to
> distinguish them. Thus, we modify the tiered memory initialization process
> to introduce a delay specifically for CPUless NUMA nodes. This delay
> ensures that the memory tier initialization for these nodes is deferred
> until HMAT information is obtained during the boot process. Finally,
> demotion tables are recalculated at the end.
>
> * Abstract common functions into `find_alloc_memory_type()`

We should move kmem_put_memory_types() (renamed to
mt_put_memory_types()?) too.  This can be put in a separate patch.

> Since different memory devices require finding or allocating a memory type,
> these common steps are abstracted into a single function,
> `find_alloc_memory_type()`, enhancing code scalability and conciseness.
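
The find-or-allocate pattern described here can be sketched in plain userspace C as follows. This is a simplified stand-in with hypothetical names: the kernel code uses struct list_head and alloc_memory_type(), and the caller is expected to hold the appropriate lock.

```c
#include <assert.h>
#include <stdlib.h>

/* Simplified stand-in for struct memory_dev_type (illustration only). */
struct mem_type {
    int adistance;
    struct mem_type *next;
};

/*
 * Return the entry on *head whose adistance matches, allocating and
 * linking a new one if none exists. Returns NULL on allocation failure.
 */
static inline struct mem_type *find_alloc_mem_type(int adist,
                                                   struct mem_type **head)
{
    struct mem_type *m;

    assert(head != NULL);
    for (m = *head; m != NULL; m = m->next) {
        if (m->adistance == adist) {
            return m;
        }
    }
    m = malloc(sizeof(*m));
    if (m == NULL) {
        return NULL;
    }
    m->adistance = adist;
    m->next = *head;
    *head = m;
    return m;
}
```

Passing the list head as a parameter is what would let several callers share the helper while keeping their own lists.
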
>
> * Handle cases where there is no HMAT when creating memory tiers
> There is a scenario where a CPUless node does not provide HMAT information.
> If no HMAT is specified, it falls back to using the default DRAM tier.
>
> * Change adist calculation code to use another new lock, mt_perf_lock.
> In the current implementation, iterating through CPUlist nodes requires
> holding the `memory_tier_lock`. However, `mt_calc_adistance()` will end up
> trying to acquire the same lock, leading to a potential deadlock.
> Therefore, we propose introducing a standalone `mt_perf_lock` to protect
> `default_dram_perf`. This approach not only avoids deadlock but also
> prevents holding a large lock simultaneously.
>
> Signed-off-by: Ho-Ren (Jack) Chuang 
> Signed-off-by: Hao Xiang 
> ---
>  drivers/acpi/numa/hmat.c | 11 ++
>  drivers/dax/kmem.c   | 13 +--
>  include/linux/acpi.h |  6 
>  include/linux/memory-tiers.h |  8 +
>  mm/memory-tiers.c| 70 +---
>  5 files changed, 92 insertions(+), 16 deletions(-)
>
> diff --git a/drivers/acpi/numa/hmat.c b/drivers/acpi/numa/hmat.c
> index d6b85f0f6082..28812ec2c793 100644
> --- a/drivers/acpi/numa/hmat.c
> +++ b/drivers/acpi/numa/hmat.c
> @@ -38,6 +38,8 @@ static LIST_HEAD(targets);
>  static LIST_HEAD(initiators);
>  static LIST_HEAD(localities);
>  
> +static LIST_HEAD(hmat_memory_types);
> +

HMAT isn't a device driver for some memory devices.  So I don't think we
should manage memory types in HMAT.  Instead, if the memory_type of a
node isn't set by the driver, we should manage it in memory-tier.c as
fallback.

>  static DEFINE_MUTEX(target_lock);
>  
>  /*
> @@ -149,6 +151,12 @@ int acpi_get_genport_coordinates(u32 uid,
>  }
>  EXPORT_SYMBOL_NS_GPL(acpi_get_genport_coordinates, CXL);
>  
> +struct memory_dev_type *hmat_find_alloc_memory_type(int adist)
> +{
> + return find_alloc_memory_type(adist, &hmat_memory_types);
> +}
> +EXPORT_SYMBOL_GPL(hmat_find_alloc_memory_type);
> +
>  static __init void alloc_memory_initiator(unsigned int cpu_pxm)
>  {
>   struct memory_initiator *initiator;
> @@ -1038,6 +1046,9 @@ static __init int hmat_init(void)
>   if (!hmat_set_default_dram_perf())
>   register_mt_adistance_algorithm(&hmat_adist_nb);
>  
> + /* Post-create CPUless memory tiers after getting HMAT info */
> + memory_tier_late_init();
> +

This should be called in memory-tier.c via

late_initcall(memory_tier_late_init);

Then, we don't need hmat to call it.

>   return 0;
>  out_put:
>   hmat_free_structures();
> diff --git a/drivers/dax/kmem.c b/drivers/dax/kmem.c
> index 42ee360cf4e3..aee17ab59f4f 100644
> --- a/drivers/dax/kmem.c
> +++ b/drivers/dax/kmem.c
> @@ -55,21 +55,10 @@ static LIST_HEAD(kmem_memory_types);
>  
>  static struct memory_dev_type *kmem_find_alloc_memory_type(int adist)
>  {
> - bool found = false;
>   struct memory_dev_type *mtype;
>  
>   mutex_lock(&kmem_memory_type_lock);
> - list_for_each_entry(mtype, &kmem_memory_types, list) {
> - if (mtype->adistance == adist) {
> - found = true;
> - break;
> - }
> - }
> - if (!found) {
> - mtype = alloc_memory_type(adist);
> - if (!IS_ERR(mtype))
> - list_add(&mtype->list, &kmem_memory_types);
> - }
> + mtype = find_alloc_memory_type(adist, &kmem_memory_types);
>   mutex_unlock(&kmem_memory_type_lock);
>  
>   return mtype;
> diff --git a/include/linux/acpi.h b/include/linux/acpi.h
> index b7165e52b3c6..3f927ff01f02 100644
> --- a/include/linux/acpi.h
> +++ b/include/linux/acpi.h
> @@ -434,12 +434,18 @@ int thermal_acpi_critical_trip_temp(struct acpi_device 
> *adev, int *ret_temp);
>  
>  #ifdef CONFIG_ACPI_HMAT
>  int acpi_get_genport_coordinates(u32 uid, struct access_coordinate *coord);
> +struct memory_dev_type *hmat_find_alloc_memory_type(int adist);
>  #else

Re: [PATCH v9 02/21] hw/core/machine: Support modules in -smp

2024-03-12 Thread Zhao Liu

> > @@ -51,6 +51,10 @@ static char *cpu_hierarchy_to_string(MachineState *ms)
> >   g_string_append_printf(s, " * clusters (%u)", ms->smp.clusters);
> >   }
> > +if (mc->smp_props.modules_supported) {
> > +g_string_append_printf(s, " * modules (%u)", ms->smp.clusters);
> > +}
> 
> smp.clusters -> smp.modules?
>

Good catch! Thanks!

-Zhao

Re: [PATCH 06/13] ppc/spapr: Add pa-features for POWER10 machines

2024-03-12 Thread BALATON Zoltan


On Tue, 12 Mar 2024, Nicholas Piggin wrote:

On Tue Mar 12, 2024 at 7:07 AM AEST, BALATON Zoltan wrote:

On Mon, 11 Mar 2024, Philippe Mathieu-Daudé wrote:

On 11/3/24 19:51, Nicholas Piggin wrote:

From: Benjamin Gray 

Add POWER10 pa-features entry.

Notably DEXCR and and [P]HASHST/[P]HASHCHK instruction support is
advertised. Each DEXCR aspect is allocated a bit in the device tree,
using the 68--71 byte range (inclusive). The functionality of the
[P]HASHST/[P]HASHCHK instructions is separately declared in byte 72,
bit 0 (BE).

Signed-off-by: Benjamin Gray 
[npiggin: reword title and changelog, adjust a few bits]
Signed-off-by: Nicholas Piggin 
---
  hw/ppc/spapr.c | 34 ++
  1 file changed, 34 insertions(+)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 247f920f07..128bfe11a8 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -265,6 +265,36 @@ static void spapr_dt_pa_features(SpaprMachineState
*spapr,
  /* 60: NM atomic, 62: RNG */
  0x80, 0x00, 0x80, 0x00, 0x00, 0x00, /* 60 - 65 */
  };
+/* 3.1 removes SAO, HTM support */
+uint8_t pa_features_31[] = { 74, 0,


Nitpicking because pre-existing, all these arrays could be static const.


If we are at it then maybe also s/0x00/   0/ because having a stream of
0x80 and 0x00 is not the most readable.


Eh, it's more readable because it aligns columns.


Not sure if you've noticed the 3 spaces before the 0 replacing 0x0 that
would keep alignment. But it's not something that needs to be changed; I just
commented on it as it came up, and I don't expect it to be done now on the
day of the freeze. It's more important to get the already reviewed and
queued patches in a pull request to not miss the release. So this comment
is just for the future.


Regards,
BALATON Zoltan


But probably better, more readable and less error prone, would be
something like:

   PA_FEATURE_SET(pa_features_31,  6, 0); /* DS207 */
   PA_FEATURE_SET(pa_features_31, 18, 0); /* Vector scalar */

I just didn't quite find something I like yet. I won't change style
before adding the missing bits either way, but certainly would be
good to clean it up after.

Thanks,
Nick
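
One possible shape for the PA_FEATURE_SET idea floated above is sketched below as a plain function. It is hypothetical and not part of the patch. It assumes the layout visible in the quoted array: two header bytes (74, 0) followed by attribute byte 0, with bit 0 being the most significant bit (BE numbering).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed from the quoted array: two header bytes precede attribute byte 0. */
#define PA_FEATURES_HDR 2

/* Set one attribute bit; bit 0 is the most significant bit of the byte. */
static inline void pa_feature_set(uint8_t *pa, size_t pa_len,
                                  unsigned byte, unsigned bit)
{
    assert(bit < 8);
    assert(PA_FEATURES_HDR + byte < pa_len);
    pa[PA_FEATURES_HDR + byte] |= (uint8_t)(0x80u >> bit);
}
```

Under these assumptions, setting byte 6 bit 0 reproduces the 0x80 shown for DS207, and setting bits 0 and 1 of byte 34 reproduces the 0xC0 shown for EBB plus extended EBB.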

RE: [PATCH v2 4/9] aspeed/smc: Add AST2700 support

2024-03-12 Thread Jamin Lin

> On 3/4/24 10:29, Jamin Lin wrote:
> > AST2700 fmc/spi controller's address decoding unit is 64KB and only
> > bits [31:16] are used for decoding. Introduce seg_to_reg and
> > reg_to_seg handlers for ast2700 fmc/spi controller.
> > In addition, adds ast2700 fmc, spi0, spi1, and spi2 class init handler.
> >
> > AST2700 support the maximum dram size is 8GiB.
> > Update dma_rw function and trace-event to support 64 bits dram
> > address. DMA length is from 1 byte to 32MB for AST2700, AST2600 and
> > AST10x0 and DMA length is from 4 bytes to 32MB for AST2500.
> >
> > In other words, if "R_DMA_LEN" is 0, it should move at least 1 byte
> > data for AST2700, AST2600 and AST10x0 and 4 bytes data for AST2500.
> > To support all ASPEED SOCs, adds dma_start_length parameter to store
> > the start length and update DMA_LENGTH mask to "1FF" to fix dma
> > moving incorrect data length issue.
> >
> > Currently, dma_rw function only supports length 4 bytes aligned.
> >
> > Signed-off-by: Troy Lee 
> > Signed-off-by: Jamin Lin 
> > ---
> >   hw/ssi/aspeed_smc.c | 326
> +---
> >   hw/ssi/trace-events |   2 +-
> >   include/hw/ssi/aspeed_smc.h |   1 +
> >   3 files changed, 309 insertions(+), 20 deletions(-)
> >
> > diff --git a/hw/ssi/aspeed_smc.c b/hw/ssi/aspeed_smc.c index
> > 3c93936fd1..73121edf2b 100644
> > --- a/hw/ssi/aspeed_smc.c
> > +++ b/hw/ssi/aspeed_smc.c
> > @@ -131,6 +131,9 @@
> >   #define   FMC_WDT2_CTRL_BOOT_SOURCE  BIT(4) /* O: primary
> 1: alternate */
> >   #define   FMC_WDT2_CTRL_EN   BIT(0)
> >
> > +/* DMA DRAM Side Address High Part (AST2700) */
> > +#define R_DMA_DRAM_ADDR_HIGH   (0x7c / 4)
> > +
> >   /* DMA Control/Status Register */
> >   #define R_DMA_CTRL(0x80 / 4)
> >   #define   DMA_CTRL_REQUEST  (1 << 31)
> > @@ -177,13 +180,18 @@
> >* DMA flash addresses should be 4 bytes aligned and the valid address
> >* range is 0x2000 - 0x2FFF.
> >*
> > - * DMA length is from 4 bytes to 32MB
> > + * DMA length is from 4 bytes to 32MB (AST2500)
> >*   0: 4 bytes
> >*   0x7F: 32M bytes
> > + *
> > + * DMA length is from 1 byte to 32MB (AST2600, AST10x0 and AST2700)
> > + *   0: 1 byte
> > + *   0x1FF: 32M bytes
> 
> OK. Then, we need to fix the model first before adding  AST2700 support.
Will add
> 
> >*/
> >   #define DMA_DRAM_ADDR(asc, val)   ((val) & (asc)->dma_dram_mask)
> > +#define DMA_DRAM_ADDR_HIGH(val)   ((val) & 0xf)
> >   #define DMA_FLASH_ADDR(asc, val)  ((val) & (asc)->dma_flash_mask)
> > -#define DMA_LENGTH(val) ((val) & 0x01FC)
> > +#define DMA_LENGTH(val) ((val) & 0x01FF)
> >
> >   /* Flash opcodes. */
> >   #define SPI_OP_READ   0x03/* Read data bytes (low
> frequency) */
> > @@ -202,6 +210,7 @@ static const AspeedSegments
> aspeed_2500_spi2_segments[];
> >   #define ASPEED_SMC_FEATURE_DMA   0x1
> >   #define ASPEED_SMC_FEATURE_DMA_GRANT 0x2
> >   #define ASPEED_SMC_FEATURE_WDT_CONTROL 0x4
> > +#define ASPEED_SMC_FEATURE_DMA_DRAM_ADDR_HIGH 0x08
> >
> >   static inline bool aspeed_smc_has_dma(const AspeedSMCClass *asc)
> >   {
> > @@ -213,6 +222,11 @@ static inline bool
> aspeed_smc_has_wdt_control(const AspeedSMCClass *asc)
> >   return !!(asc->features & ASPEED_SMC_FEATURE_WDT_CONTROL);
> >   }
> >
> > +static inline bool aspeed_smc_has_dma_dram_addr_high(const
> > +AspeedSMCClass *asc) {
> > +return !!(asc->features &
> ASPEED_SMC_FEATURE_DMA_DRAM_ADDR_HIGH);
> > +}
> > +
> >   #define aspeed_smc_error(fmt, ...)
> \
> >   qemu_log_mask(LOG_GUEST_ERROR, "%s: " fmt "\n", __func__, ##
> > __VA_ARGS__)
> >
> > @@ -655,7 +669,7 @@ static const MemoryRegionOps
> aspeed_smc_flash_ops = {
> >   .endianness = DEVICE_LITTLE_ENDIAN,
> >   .valid = {
> >   .min_access_size = 1,
> > -.max_access_size = 4,
> > +.max_access_size = 8,
> >   },
> >   };
> >
> > @@ -734,6 +748,9 @@ static uint64_t aspeed_smc_read(void *opaque,
> hwaddr addr, unsigned int size)
> >   (aspeed_smc_has_dma(asc) && addr == R_DMA_CTRL) ||
> >   (aspeed_smc_has_dma(asc) && addr == R_DMA_FLASH_ADDR)
> ||
> >   (aspeed_smc_has_dma(asc) && addr == R_DMA_DRAM_ADDR)
> ||
> > +(aspeed_smc_has_dma(asc) &&
> > + aspeed_smc_has_dma_dram_addr_high(asc) &&
> > + addr == R_DMA_DRAM_ADDR_HIGH) ||
> >   (aspeed_smc_has_dma(asc) && addr == R_DMA_LEN) ||
> >   (aspeed_smc_has_dma(asc) && addr == R_DMA_CHECKSUM)
> ||
> >   (addr >= R_SEG_ADDR0 &&
> > @@ -840,8 +857,11 @@ static bool
> aspeed_smc_inject_read_failure(AspeedSMCState *s)
> >*/
> >   static void aspeed_smc_dma_checksum(AspeedSMCState *s)
> >   {
> > +AspeedSMCClass *asc = ASPEED_SMC_GET_CLASS(s);
> >   MemTxResult result;
> > +uint32_t dma_len;
> >   uint32_t data;
> > +uint32_t extra;
> >
> >   if (s->regs[R_DMA_CTRL] & DMA_CTRL_WRITE) {
> >   aspeed_smc_error("invalid

Re: [PATCH v2 00/29] Cleanup up to fix missing ERRP_GUARD() for error_prepend()

2024-03-12 Thread Zhao Liu

On Tue, Mar 12, 2024 at 09:50:25AM +0100, Thomas Huth wrote:
> Date: Tue, 12 Mar 2024 09:50:25 +0100
> From: Thomas Huth 
> Subject: Re: [PATCH v2 00/29] Cleanup up to fix missing ERRP_GUARD() for
>  error_prepend()
> 
> On 12/03/2024 09.43, Zhao Liu wrote:
> > Hi Thomas/Markus/Michael,
> > 
> > For the remaining patches, could you please help me merge them next?
> > 
> > Many thanks!
> 
> Yes, I'm currently reviewing the ones that don't have a Reviewed-by yet. I
> can pick up the remaining patches if the other maintainers won't pick them
> up for the softfreeze today.
> 

Appreciate that you can help me get on the last train of releases.

If possible, could you please also help me pick up two other ERRP_GUARD()
related cleanups (total 8 patches, both got r/b)? ;-)

My cleanup is too fragmented, I'll try to centralize my work to make it easier
for maintainers to review and merge in the future!

[1]: 
https://lore.kernel.org/qemu-devel/20240223085653.1255438-1-zhao1@linux.intel.com/
[2]: 
https://lore.kernel.org/qemu-devel/20240312060337.3240965-1-zhao1@linux.intel.com/

Many thanks,
Zhao

Re: [PATCH 06/13] ppc/spapr: Add pa-features for POWER10 machines

2024-03-12 Thread Nicholas Piggin

On Tue Mar 12, 2024 at 7:34 PM AEST, Harsh Prateek Bora wrote:
>
>
> On 3/12/24 00:21, Nicholas Piggin wrote:
> > From: Benjamin Gray 
> > 
> > Add POWER10 pa-features entry.
> > 
> > Notably DEXCR and and [P]HASHST/[P]HASHCHK instruction support is
>
> s/and and/and
>
> > advertised. Each DEXCR aspect is allocated a bit in the device tree,
> > using the 68--71 byte range (inclusive). The functionality of the
> > [P]HASHST/[P]HASHCHK instructions is separately declared in byte 72,
> > bit 0 (BE).
> > 
> > Signed-off-by: Benjamin Gray 
> > [npiggin: reword title and changelog, adjust a few bits]
> > Signed-off-by: Nicholas Piggin 
> > ---
> >   hw/ppc/spapr.c | 34 ++
> >   1 file changed, 34 insertions(+)
> > 
> > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > index 247f920f07..128bfe11a8 100644
> > --- a/hw/ppc/spapr.c
> > +++ b/hw/ppc/spapr.c
> > @@ -265,6 +265,36 @@ static void spapr_dt_pa_features(SpaprMachineState 
> > *spapr,
> >   /* 60: NM atomic, 62: RNG */
> >   0x80, 0x00, 0x80, 0x00, 0x00, 0x00, /* 60 - 65 */
> >   };
> > +/* 3.1 removes SAO, HTM support */
> > +uint8_t pa_features_31[] = { 74, 0,
> > +/* 0: MMU|FPU|SLB|RUN|DABR|NX, 1: fri[nzpm]|DABRX|SPRG3|SLB0|PP110 
> > */
> > +/* 2: VPM|DS205|PPR|DS202|DS206, 3: LSD|URG, 5: LE|CFAR|EB|LSQ */
> > +0xf6, 0x1f, 0xc7, 0xc0, 0x00, 0xf0, /* 0 - 5 */
> > +/* 6: DS207 */
> > +0x80, 0x00, 0x00, 0x00, 0x00, 0x00, /* 6 - 11 */
> > +/* 16: Vector */
> > +0x00, 0x00, 0x00, 0x00, 0x80, 0x00, /* 12 - 17 */
> > +/* 18: Vec. Scalar, 20: Vec. XOR */
> > +0x80, 0x00, 0x80, 0x00, 0x00, 0x00, /* 18 - 23 */
> > +/* 24: Ext. Dec, 26: 64 bit ftrs, 28: PM ftrs */
> > +0x80, 0x00, 0x80, 0x00, 0x80, 0x00, /* 24 - 29 */
> > +/* 32: LE atomic, 34: EBB + ext EBB */
> > +0x00, 0x00, 0x80, 0x00, 0xC0, 0x00, /* 30 - 35 */
> > +/* 40: Radix MMU */
> > +0x00, 0x00, 0x00, 0x00, 0x80, 0x00, /* 36 - 41 */
> > +/* 42: PM, 44: PC RA, 46: SC vec'd */
> > +0x80, 0x00, 0x80, 0x00, 0x80, 0x00, /* 42 - 47 */
> > +/* 48: SIMD, 50: QP BFP, 52: String */
> > +0x80, 0x00, 0x80, 0x00, 0x80, 0x00, /* 48 - 53 */
> > +/* 54: DecFP, 56: DecI, 58: SHA */
> > +0x80, 0x00, 0x80, 0x00, 0x80, 0x00, /* 54 - 59 */
> > +/* 60: NM atomic, 62: RNG */
> > +0x80, 0x00, 0x80, 0x00, 0x00, 0x00, /* 60 - 65 */
> > +/* 68: DEXCR[SBHE|IBRTPDUS|SRAPD|NPHIE|PHIE] */
> > +0x00, 0x00, 0xce, 0x00, 0x00, 0x00, /* 66 - 71 */
> > +/* 72: [P]HASHCHK */
>
> Do we want to mention [P]HASHST as well in comment above ?

Sure. I'll do a quick respin.

Thanks,
Nick

>
> > +0x80, 0x00, /* 72 - 73 */
> > +};
> >   uint8_t *pa_features = NULL;
> >   size_t pa_size;
> >   
>
> In future, we may want to have helpers returning pointer to the
> pa_features array and corresponding size conditionally based on the
> required ISA support needed, instead of having local arrays bloat this
> routine.
>
> For now, with cosmetic fixes,
>
> Reviewed-by: Harsh Prateek Bora 
>
> > @@ -280,6 +310,10 @@ static void spapr_dt_pa_features(SpaprMachineState 
> > *spapr,
> >   pa_features = pa_features_300;
> >   pa_size = sizeof(pa_features_300);
> >   }
> > +if (ppc_check_compat(cpu, CPU_POWERPC_LOGICAL_3_10, 0, 
> > cpu->compat_pvr)) {
> > +pa_features = pa_features_31;
> > +pa_size = sizeof(pa_features_31);
> > +}
> >   if (!pa_features) {
> >   return;
> >   }

[PULL for 9.0 0/8] final maintainer updates (testing, gdbstub)

2024-03-12 Thread Alex Bennée

The following changes since commit 7489f7f3f81dcb776df8c1b9a9db281fc21bf05f:

  Merge tag 'hw-misc-20240309' of https://github.com/philmd/qemu into staging 
(2024-03-09 20:12:21 +)

are available in the Git repository at:

  https://gitlab.com/stsquad/qemu.git tags/pull-maintainer-final-120324-1

for you to fetch changes up to 0532045e8112c13a8a949e696576672e64c6fa14:

  gdbstub: Fix double close() of the follow-fork-mode socket (2024-03-12 
10:48:35 +)


final updates for 9.0 (testing, gdbstub):

  - avoid transferring pointless git data
  - fix the over rebuilding of test VMs
  - support Xfer:siginfo:read in gdbstub
  - fix double close() in gdbstub


Alex Bennée (2):
  gitlab: aggressively avoid extra GIT data
  tests/vm: ensure we build everything by default

Gustavo Romero (5):
  gdbstub: Rename back gdb_handlesig
  linux-user: Move tswap_siginfo out of target code
  gdbstub: Save target's siginfo
  gdbstub: Add Xfer:siginfo:read stub
  tests/tcg: Add multiarch test for Xfer:siginfo:read stub

Ilya Leoshkevich (1):
  gdbstub: Fix double close() of the follow-fork-mode socket

 gdbstub/internals.h|  1 +
 include/gdbstub/user.h | 19 +++--
 linux-user/signal-common.h |  2 -
 bsd-user/main.c|  2 +-
 bsd-user/signal.c  |  5 ++-
 gdbstub/gdbstub.c  |  8 
 gdbstub/user.c | 49 +++---
 linux-user/aarch64/signal.c|  2 +-
 linux-user/alpha/signal.c  |  2 +-
 linux-user/arm/signal.c|  2 +-
 linux-user/hexagon/signal.c|  2 +-
 linux-user/hppa/signal.c   |  2 +-
 linux-user/i386/signal.c   |  6 +--
 linux-user/loongarch64/signal.c|  2 +-
 linux-user/m68k/signal.c   |  4 +-
 linux-user/main.c  |  2 +-
 linux-user/microblaze/signal.c |  2 +-
 linux-user/mips/signal.c   |  4 +-
 linux-user/nios2/signal.c  |  2 +-
 linux-user/openrisc/signal.c   |  2 +-
 linux-user/ppc/signal.c|  4 +-
 linux-user/riscv/signal.c  |  2 +-
 linux-user/s390x/signal.c  |  2 +-
 linux-user/sh4/signal.c|  2 +-
 linux-user/signal.c| 15 +--
 linux-user/sparc/signal.c  |  2 +-
 linux-user/xtensa/signal.c |  2 +-
 tests/tcg/multiarch/segfault.c | 14 +++
 .gitlab-ci.d/base.yml  |  4 ++
 .gitlab-ci.d/buildtest-template.yml|  1 +
 .gitlab-ci.d/buildtest.yml |  2 +
 .gitlab-ci.d/windows.yml   |  2 +
 tests/tcg/multiarch/Makefile.target| 10 -
 .../multiarch/gdbstub/test-qxfer-siginfo-read.py   | 26 
 tests/vm/basevm.py |  2 +-
 35 files changed, 158 insertions(+), 52 deletions(-)
 create mode 100644 tests/tcg/multiarch/segfault.c
 create mode 100644 tests/tcg/multiarch/gdbstub/test-qxfer-siginfo-read.py

-- 
2.39.2

Re: [PATCH v3 14/29] target/i386: Prefer fast cpu_env() over slower CPU QOM cast macro

2024-03-12 Thread Thomas Huth


On 30/01/2024 14.01, Igor Mammedov wrote:

On Mon, 29 Jan 2024 17:44:56 +0100
Philippe Mathieu-Daudé  wrote:


Mechanical patch produced running the command documented
in scripts/coccinelle/cpu_env.cocci_template header.



Commenting here since I'm not an expert on coccinelle scripts.

On the negative side, we are permanently losing type checking in this area.


Not really that much. Have a look at cpu_env(), it has a comment saying:

 "We validate that CPUArchState follows CPUState in cpu-all.h"

So instead of run-time checking, the check should have already been done 
during compile time, i.e. when you have a valid CPUState pointer, it should 
be possible to derive a valid CPUArchState pointer from it without much 
further checking during runtime.
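
A minimal sketch of the idea, not QEMU's actual code: the Toy* types and helpers below are hypothetical stand-ins for CPUState, CPUArchState and ArchCPU. It only illustrates how a layout checked at build time lets the env pointer be derived from the CPU pointer without any run-time cast check.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for CPUState / CPUArchState / ArchCPU. */
typedef struct ToyCPUState {
    int cpu_index;
    void *opaque;
} ToyCPUState;

typedef struct ToyCPUArchState {
    uint64_t hflags2;
} ToyCPUArchState;

typedef struct ToyArchCPU {
    ToyCPUState parent_obj;   /* must be the first member */
    ToyCPUArchState env;      /* must directly follow parent_obj */
} ToyArchCPU;

/* Build-time layout checks: compilation fails if the layout changes. */
_Static_assert(offsetof(ToyArchCPU, parent_obj) == 0,
               "parent_obj must be the first member");
_Static_assert(offsetof(ToyArchCPU, env) == sizeof(ToyCPUState),
               "env must immediately follow parent_obj");

/* With the layout guaranteed, the env pointer is plain pointer arithmetic. */
static inline ToyCPUArchState *toy_cpu_env(ToyCPUState *cpu)
{
    assert(cpu != NULL);
    return (ToyCPUArchState *)((char *)cpu + sizeof(*cpu));
}

/* Same shape as the patched helper, but with a local variable. */
static inline void toy_clear_nmi_blocking(ToyCPUState *cpu, uint64_t nmi_mask)
{
    ToyCPUArchState *env = toy_cpu_env(cpu);

    env->hflags2 &= ~nmi_mask;
}
```

Calling toy_clear_nmi_blocking(&cpu.parent_obj, mask) on a ToyArchCPU clears the mask bits in cpu.env.hflags2. The pointer is only valid when the ToyCPUState really is embedded in a ToyArchCPU, which is the precondition the static asserts cannot check.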



Is it worth it, what gains do we get with this series?


It's a small optimization, but why not?


Side note,
QOM cast expenses you are replacing could be negated by disabling
CONFIG_QOM_CAST_DEBUG without killing type check code when it's enabled.
That way you will speed up not only cpuenv access but also all other casts
across the board.


Yes, but that checking is enabled by default and does not have such 
compile-time checks that could be used instead, so I think Philippe's series 
here is still a good idea.



Signed-off-by: Philippe Mathieu-Daudé 
---

...

  static inline void vmx_clear_nmi_blocking(CPUState *cpu)
  {
-X86CPU *x86_cpu = X86_CPU(cpu);
-CPUX86State *env = &x86_cpu->env;
-
-env->hflags2 &= ~HF2_NMI_MASK;



+cpu_env(cpu)->hflags2 &= ~HF2_NMI_MASK;


This style of dereferencing the return value of a macro/function
was discouraged in the past; the preferred way was 'Foo f = CAST(me); f->some_access'

(it's just imprint speaking, I don't recall where it comes from)


I agree. Though the new code is perfectly valid, it would look nicer if we
used a variable here instead.


 Thomas

Re: [PATCH] virtio-blk: iothread-vq-mapping coroutine pool sizing

2024-03-12 Thread Kevin Wolf

Am 11.03.2024 um 21:14 hat Stefan Hajnoczi geschrieben:
> It is possible to hit the sysctl vm.max_map_count limit when the
> coroutine pool size becomes large. Each coroutine requires two mappings
> (one for the stack and one for the guard page). QEMU can crash with
> "failed to set up stack guard page" or "failed to allocate memory for
> stack" when this happens.
> 
> Coroutine pool sizing is simple when there is only one AioContext: sum
> up all I/O requests across all virtqueues.
> 
> When the iothread-vq-mapping option is used we should calculate tighter
> bounds: take the maximum number of the number of I/O requests across all
> virtqueues. This number is lower than simply summing all virtqueues when
> only a subset of the virtqueues is handled by each AioContext.

The reasoning is that each thread has its own coroutine pool for which
the pool size applies individually, and it doesn't need to have space
for coroutines running in a different thread, right? I'd like to have
this recorded in the commit message.

Of course, this also makes me wonder if a global coroutine pool size
really makes sense or if it should be per thread. One thread could be
serving only one queue (maybe the main thread with a CD-ROM device) and
another thread 32 queues (the iothread with the interesting disks).
There is no reason for the first thread to have a coroutine pool as big
as the second one.

But before we make the size thread-local, maybe having thread-local
pools wasn't right to begin with because multiple threads can run main
context code and they should therefore share the same coroutine pool (we
already had the problem earlier that coroutines start on the vcpu thread
and terminate on the main thread and this plays havoc with coroutine
pools).

Maybe per-AioContext pools with per-AioContext sizes would make more
sense?

> This is not a solution to hitting vm.max_map_count, but it helps. A
> guest with 64 vCPUs (hence 64 virtqueues) across 4 IOThreads with one
> iothread-vq-mapping virtio-blk device and a root disk without goes from
> pool_max_size 16,448 to 10,304.
> 
> Reported-by: Sanjay Rao 
> Reported-by: Boaz Ben Shabat 
> Signed-off-by: Stefan Hajnoczi 

Either way, this should already strictly improve the situation, so I'm
happy to apply this change for now.

Kevin
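
A worked sketch of the arithmetic behind the quoted figures. The constants are assumptions, not values stated in this thread: a 256-entry virtqueue of which half counts toward the pool, and a baseline pool size of 64. They are chosen because they reproduce the 16,448 and 10,304 totals. All toy_* names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed constants (not taken from the thread). */
enum { TOY_POOL_BASE = 64, TOY_QUEUE_SIZE = 256 };

/* Requests a single virtqueue contributes to the pool (assumed: half). */
static unsigned toy_vq_reqs(void)
{
    return TOY_QUEUE_SIZE / 2;
}

/* Single AioContext: sum the requests of all virtqueues. */
static unsigned toy_pool_sum(unsigned num_queues)
{
    return num_queues * toy_vq_reqs();
}

/*
 * iothread-vq-mapping: vq_to_ctx[q] is the index of the AioContext that
 * handles virtqueue q. Only the busiest context bounds the pool size.
 */
static unsigned toy_pool_max_per_ctx(const unsigned *vq_to_ctx,
                                     unsigned num_queues, unsigned num_ctx)
{
    unsigned best = 0;

    assert(vq_to_ctx != NULL && num_ctx > 0);
    for (unsigned c = 0; c < num_ctx; c++) {
        unsigned total = 0;

        for (unsigned q = 0; q < num_queues; q++) {
            if (vq_to_ctx[q] == c) {
                total += toy_vq_reqs();
            }
        }
        if (total > best) {
            best = total;
        }
    }
    return best;
}
```

With 64 queues spread evenly over 4 IOThreads, each context gets 16 * 128 = 2048 requests. Two summed devices give 64 + 8192 + 8192 = 16448, while one summed root disk plus one mapped device gives 64 + 8192 + 2048 = 10304.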

Re: [PATCH v5 49/65] i386/tdx: handle TDG.VP.VMCALL

2024-03-12 Thread Xiaoyao Li


On 3/11/2024 5:27 PM, Daniel P. Berrangé wrote:

On Thu, Feb 29, 2024 at 01:37:10AM -0500, Xiaoyao Li wrote:

From: Isaku Yamahata 

Add property "quote-generation-socket" to tdx-guest, which is a property
of type SocketAddress to specify Quote Generation Service(QGS).

On a GetQuote request, it connects to the QGS socket, reads the request
data from shared guest memory, sends the request data to the QGS,
stores the response into shared guest memory, and finally notifies the
TD guest by interrupt.

command line example:
   qemu-system-x86_64 \
 -object '{"qom-type":"tdx-guest","id":"tdx0","quote-generation-socket":{"type": "vsock", 
"cid":"1","port":"1234"}}' \


Can you illustrate this with 'unix' sockets, not 'vsock'.


Are you suggesting only updating the commit message to use a unix socket 
example? Or do you want the code tested against a QGS that uses a unix socket?


(It seems the QGS I got for testing only supports vsock, because at the 
time it was developed it was supposed to communicate with drivers inside 
the TD guest directly, not via the VMM (KVM+QEMU). Anyway, I will talk to 
internal folks to see whether there is any plan to support unix sockets.)



It makes no conceptual sense for two processes on the
host to be using vsock to talk to each other. vsock is
only needed for the guest to talk to the host.


 -machine confidential-guest-support=tdx0

Note: the above example uses a vsock socket because the QGS we used
implements vsock. Other socket types, such as a UNIX socket, can be used,
depending on the QGS implementation.

To handle the case where the QGS server does not respond, set up a timer for
the transaction. On timeout, treat it as an error and interrupt the guest.
The timeout is currently 30s; it may be changed if that proves inappropriate.

Signed-off-by: Isaku Yamahata 
Codeveloped-by: Chenyi Qiang 
Signed-off-by: Chenyi Qiang 
Codeveloped-by: Xiaoyao Li 
Signed-off-by: Xiaoyao Li 
---
Changes in v5:
- add more description of quote-generation-socket property;

Changes in v4:
- merge next patch "i386/tdx: setup a timer for the qio channel";

Changes in v3:
- rename property "quote-generation-service" to "quote-generation-socket";
- change the type of "quote-generation-socket" from str to
   SocketAddress;
- squash next patch into this one;
---
  qapi/qom.json |   8 +-
  target/i386/kvm/meson.build   |   2 +-
  target/i386/kvm/tdx-quote-generator.c | 170 
  target/i386/kvm/tdx-quote-generator.h |  95 +++
  target/i386/kvm/tdx.c | 216 ++
  target/i386/kvm/tdx.h |   6 +
  6 files changed, 495 insertions(+), 2 deletions(-)
  create mode 100644 target/i386/kvm/tdx-quote-generator.c
  create mode 100644 target/i386/kvm/tdx-quote-generator.h



With regards,
Daniel

Re: [PATCH v2 11/29] block/vdi: Fix missing ERRP_GUARD() for error_prepend()

2024-03-12 Thread Thomas Huth


On 11/03/2024 04.38, Zhao Liu wrote:

From: Zhao Liu 

As the comment in qapi/error, passing @errp to error_prepend() requires
ERRP_GUARD():

* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
...
* - It should not be passed to error_prepend(), error_vprepend() or
*   error_append_hint(), because that doesn't work with &error_fatal.
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or &error_fatal.

ERRP_GUARD() avoids the case where @errp is &error_fatal and the user
can't see this additional information, because exit() happens in
error_setg() before the information is added [1].

The vdi_co_do_create() passes @errp to error_prepend() without
ERRP_GUARD(), and its @errp parameter is so widely sourced that it is
necessary to protect it with ERRP_GUARD().

To avoid the potential issues as [1] said, add missing ERRP_GUARD() at
the beginning of this function.

[1]: Issue description in the commit message of commit ae7c80a7bd73
  ("error: New macro ERRP_GUARD()").

Cc: Stefan Weil 
Cc: Kevin Wolf 
Cc: Hanna Reitz 
Cc: qemu-bl...@nongnu.org
Signed-off-by: Zhao Liu 
---
  block/vdi.c | 1 +
  1 file changed, 1 insertion(+)

diff --git a/block/vdi.c b/block/vdi.c
index 3b57becb9fe0..6363da08cee9 100644
--- a/block/vdi.c
+++ b/block/vdi.c
@@ -738,6 +738,7 @@ static int coroutine_fn GRAPH_UNLOCKED
  vdi_co_do_create(BlockdevCreateOptions *create_options, size_t block_size,
   Error **errp)
  {
+ERRP_GUARD();
  BlockdevCreateOptionsVdi *vdi_opts;
  int ret = 0;
  uint64_t bytes = 0;


Reviewed-by: Thomas Huth
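
A toy model of the problem this patch series addresses, not QEMU's Error API: every toy_* name below is hypothetical, and the "fatal" path records the message instead of calling exit() so the effect can be observed. The guarded variant spells out by hand roughly what ERRP_GUARD() arranges automatically: work on a local error, then propagate it.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

typedef struct ToyError {
    char msg[128];
} ToyError;

static ToyError *toy_error_fatal;     /* callers pass &toy_error_fatal */
static char toy_fatal_report[128];    /* what the user would see at exit */
static ToyError toy_storage;          /* single error object, for brevity */

static void toy_error_setg(ToyError **errp, const char *msg)
{
    assert(msg != NULL);
    if (errp == &toy_error_fatal) {
        /* Real code would report and exit() here; the toy records the text. */
        snprintf(toy_fatal_report, sizeof(toy_fatal_report), "%s", msg);
        return;
    }
    if (errp) {
        snprintf(toy_storage.msg, sizeof(toy_storage.msg), "%s", msg);
        *errp = &toy_storage;
    }
}

static void toy_error_prepend(ToyError **errp, const char *prefix)
{
    char tmp[128];

    if (!errp || !*errp) {
        return;                       /* nothing to decorate */
    }
    snprintf(tmp, sizeof(tmp), "%s%s", prefix, (*errp)->msg);
    snprintf((*errp)->msg, sizeof((*errp)->msg), "%s", tmp);
}

static void toy_error_propagate(ToyError **dst, ToyError *local)
{
    if (!local) {
        return;
    }
    if (dst == &toy_error_fatal) {
        snprintf(toy_fatal_report, sizeof(toy_fatal_report), "%s", local->msg);
    } else if (dst) {
        *dst = local;
    }
}

/* No guard: with the fatal sentinel, the prefix is never seen. */
static void toy_create_unguarded(ToyError **errp)
{
    toy_error_setg(errp, "open failed");
    toy_error_prepend(errp, "Could not create image: ");
}

/* Hand-written guard: decorate a local error, then propagate it. */
static void toy_create_guarded(ToyError **errp)
{
    ToyError *local = NULL;

    toy_error_setg(&local, "open failed");
    toy_error_prepend(&local, "Could not create image: ");
    toy_error_propagate(errp, local);
}
```

Passing &toy_error_fatal to the unguarded function reports only "open failed"; the guarded one reports the message with its prefix.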

Re: [PATCH] sun4u: remap ebus BAR0 to use unassigned_io_ops instead of alias to PCI IO space

2024-03-12 Thread Philippe Mathieu-Daudé


On 11/3/24 09:28, Philippe Mathieu-Daudé wrote:

On 11/3/24 07:43, Mark Cave-Ayland wrote:
During kernel startup OpenBSD accesses addresses mapped by BAR0 of the 
ebus device
but at offsets where no IO devices exist. Before commit 4aa07e8649 
("hw/sparc64/ebus:
Access memory regions via pci_address_space_io()") BAR0 was mapped to 
legacy IO
space which allows accesses to unmapped devices to succeed, but 
afterwards these
accesses to unmapped PCI IO space cause a memory fault which prevents 
OpenBSD from

booting.

Since no devices are mapped at the addresses accessed by OpenBSD, 
change ebus BAR0
from a PCI IO space alias to an IO memory region using 
unassigned_io_ops which allows

these accesses to succeed and so allows OpenBSD to boot once again.

Fixes: 4aa07e8649 ("hw/sparc64/ebus: Access memory regions via 
pci_address_space_io()")

Signed-off-by: Mark Cave-Ayland 


Reviewed-by: Philippe Mathieu-Daudé 


---

[MCA: I'd like to merge this for 9.0 since I've been carrying various 
local workarounds

to allow OpenBSD to boot on SPARC64 for some time.]


Sure.


Patch queued!

Re: [PATCH v2 00/29] Cleanup up to fix missing ERRP_GUARD() for error_prepend()

2024-03-12 Thread Philippe Mathieu-Daudé


On 11/3/24 04:37, Zhao Liu wrote:


---
Zhao Liu (29):



   hw/core/loader-fit: Fix missing ERRP_GUARD() for error_prepend()
   hw/core/qdev-properties-system: Fix missing ERRP_GUARD() for
 error_prepend()
   hw/misc/ivshmem: Fix missing ERRP_GUARD() for error_prepend()


I'm queuing these 3 patches, thanks!

[PULL 05/13] hw/core/qdev-properties-system: Fix missing ERRP_GUARD() for error_prepend()

2024-03-12 Thread Philippe Mathieu-Daudé

From: Zhao Liu 

As the comment in qapi/error, passing @errp to error_prepend() requires
ERRP_GUARD():

* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
...
* - It should not be passed to error_prepend(), error_vprepend() or
*   error_append_hint(), because that doesn't work with &error_fatal.
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or &error_fatal.

ERRP_GUARD() avoids the case where @errp is &error_fatal and the user
can't see this additional information, because exit() happens in
error_setg() before the information is added [1].

The set_chr() passes @errp to error_prepend() without ERRP_GUARD().

As a PropertyInfo.set method, there are too many possible callers to
check the impact of this defect; it may or may not be harmless. Thus it
is necessary to protect @errp with ERRP_GUARD().

To avoid the issue like [1] said, add missing ERRP_GUARD() at the
beginning of this function.

[1]: Issue description in the commit message of commit ae7c80a7bd73
 ("error: New macro ERRP_GUARD()").

Cc: Paolo Bonzini 
Cc: "Daniel P. Berrangé" 
Signed-off-by: Zhao Liu 
Reviewed-by: Markus Armbruster 
Message-ID: <20240311033822.3142585-16-zhao1@linux.intel.com>
Signed-off-by: Philippe Mathieu-Daudé 
---
 hw/core/qdev-properties-system.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/hw/core/qdev-properties-system.c b/hw/core/qdev-properties-system.c
index b45e90edb2..00c968f4f5 100644
--- a/hw/core/qdev-properties-system.c
+++ b/hw/core/qdev-properties-system.c
@@ -242,6 +242,7 @@ static void get_chr(Object *obj, Visitor *v, const char 
*name, void *opaque,
 static void set_chr(Object *obj, Visitor *v, const char *name, void *opaque,
 Error **errp)
 {
+ERRP_GUARD();
 Property *prop = opaque;
 CharBackend *be = object_field_prop_ptr(obj, prop);
 Chardev *s;
-- 
2.41.0

[PULL 03/13] hw/ppc/sam460ex: Support short options for adding drives

2024-03-12 Thread Philippe Mathieu-Daudé

From: BALATON Zoltan 

Having to use -drive if=none,... and -device ide-[cd,hd] is
inconvenient. Add support for shorter convenience options such as
-cdrom and -drive media=disk. Also adjust two nearby comments for code
style.

Signed-off-by: BALATON Zoltan 
Message-ID: <20240305225721.e9a404e6...@zero.eik.bme.hu>
Signed-off-by: Philippe Mathieu-Daudé 
---
 hw/ppc/sam460ex.c | 24 +++-
 1 file changed, 19 insertions(+), 5 deletions(-)

diff --git a/hw/ppc/sam460ex.c b/hw/ppc/sam460ex.c
index 7e34b6c5e0..d42b677898 100644
--- a/hw/ppc/sam460ex.c
+++ b/hw/ppc/sam460ex.c
@@ -33,6 +33,7 @@
 #include "hw/char/serial.h"
 #include "hw/i2c/ppc4xx_i2c.h"
 #include "hw/i2c/smbus_eeprom.h"
+#include "hw/ide/pci.h"
 #include "hw/usb/hcd-ehci.h"
 #include "hw/ppc/fdt.h"
 #include "hw/qdev-properties.h"
@@ -449,15 +450,27 @@ static void sam460ex_init(MachineState *machine)
 
 /* PCI devices */
 pci_create_simple(pci_bus, PCI_DEVFN(6, 0), "sm501");
-/* SoC has a single SATA port but we don't emulate that yet
+/*
+ * SoC has a single SATA port but we don't emulate that
  * However, firmware and usual clients have driver for SiI311x
- * so add one for convenience by default */
+ * PCI SATA card so add one for convenience by default
+ */
 if (defaults_enabled()) {
-pci_create_simple(pci_bus, -1, "sii3112");
+PCIIDEState *s = PCI_IDE(pci_create_simple(pci_bus, -1, "sii3112"));
+DriveInfo *di;
+
+di = drive_get_by_index(IF_IDE, 0);
+if (di) {
+ide_bus_create_drive(&s->bus[0], 0, di);
+}
+/* Use index 2 only if 1 does not exist, this allows -cdrom */
+di = drive_get_by_index(IF_IDE, 1) ?: drive_get_by_index(IF_IDE, 2);
+if (di) {
+ide_bus_create_drive(&s->bus[1], 0, di);
+}
 }
 
-/* SoC has 4 UARTs
- * but board has only one wired and two are present in fdt */
+/* SoC has 4 UARTs but board has only one wired and two described in fdt */
 if (serial_hd(0) != NULL) {
 serial_mm_init(get_system_memory(), 0x4ef600300, 0,
qdev_get_gpio_in(uic[1], 1),
@@ -531,6 +544,7 @@ static void sam460ex_machine_init(MachineClass *mc)
 {
 mc->desc = "aCube Sam460ex";
 mc->init = sam460ex_init;
+mc->block_default_type = IF_IDE;
 mc->default_cpu_type = POWERPC_CPU_TYPE_NAME("460exb");
 mc->default_ram_size = 512 * MiB;
 mc->default_ram_id = "ppc4xx.sdram";
-- 
2.41.0

[PULL 12/13] meson.build: Always require an objc compiler on macos hosts

2024-03-12 Thread Philippe Mathieu-Daudé

From: Peter Maydell 

We currently only insist that an ObjectiveC compiler is present on
macos hosts if we're building the Cocoa UI.  However, since then
we've added some other parts of QEMU which are also written in ObjC:
the coreaudio audio backend, and the vmnet net backend.  This means
that if you try to configure QEMU on macos with --disable-cocoa the
build will fail:

../meson.build:3741:13: ERROR: No host machine compiler for 'audio/coreaudio.m'

Since in practice any macos host will have an ObjC compiler
available, rather than trying to gate the compiler detection on an
increasingly complicated list of every bit of QEMU that uses ObjC,
just require it unconditionally on macos hosts.

Resolves https://gitlab.com/qemu-project/qemu/-/issues/2138

Signed-off-by: Peter Maydell 
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Daniel P. Berrangé 
Message-ID: <2024031114.3991537-1-peter.mayd...@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé 
---
 meson.build | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/meson.build b/meson.build
index f9dbe7634e..e3fab8ce9f 100644
--- a/meson.build
+++ b/meson.build
@@ -66,7 +66,7 @@ if host_os == 'windows' and add_languages('cpp', required: 
false, native: false)
   cxx = meson.get_compiler('cpp')
 endif
 if host_os == 'darwin' and \
-   add_languages('objc', required: get_option('cocoa'), native: false)
+   add_languages('objc', required: true, native: false)
   all_languages += ['objc']
   objc = meson.get_compiler('objc')
 endif
-- 
2.41.0

[PULL 01/13] hw/ide/ahci: Rename ahci_internal.h to ahci-internal.h

2024-03-12 Thread Philippe Mathieu-Daudé

From: BALATON Zoltan 

Other headers now use dash instead of underscore. Rename
ahci_internal.h accordingly for consistency.

Signed-off-by: BALATON Zoltan 
Reviewed-by: Markus Armbruster 
Reviewed-by: Thomas Huth 
Reviewed-by: Philippe Mathieu-Daudé 
Message-ID: <20240227131310.c24eb4e6...@zero.eik.bme.hu>
Signed-off-by: Philippe Mathieu-Daudé 
---
 hw/ide/{ahci_internal.h => ahci-internal.h} | 0
 hw/ide/ahci.c   | 2 +-
 hw/ide/ich.c| 2 +-
 3 files changed, 2 insertions(+), 2 deletions(-)
 rename hw/ide/{ahci_internal.h => ahci-internal.h} (100%)

diff --git a/hw/ide/ahci_internal.h b/hw/ide/ahci-internal.h
similarity index 100%
rename from hw/ide/ahci_internal.h
rename to hw/ide/ahci-internal.h
diff --git a/hw/ide/ahci.c b/hw/ide/ahci.c
index b8123bc73d..bfefad2965 100644
--- a/hw/ide/ahci.c
+++ b/hw/ide/ahci.c
@@ -37,7 +37,7 @@
 #include "hw/ide/pci.h"
 #include "hw/ide/ahci-pci.h"
 #include "hw/ide/ahci-sysbus.h"
-#include "ahci_internal.h"
+#include "ahci-internal.h"
 #include "ide-internal.h"
 
 #include "trace.h"
diff --git a/hw/ide/ich.c b/hw/ide/ich.c
index 3ea793d790..9b909c87f3 100644
--- a/hw/ide/ich.c
+++ b/hw/ide/ich.c
@@ -70,7 +70,7 @@
 #include "sysemu/dma.h"
 #include "hw/ide/pci.h"
 #include "hw/ide/ahci-pci.h"
-#include "ahci_internal.h"
+#include "ahci-internal.h"
 
 #define ICH9_MSI_CAP_OFFSET 0x80
 #define ICH9_SATA_CAP_OFFSET0xA8
-- 
2.41.0

[PULL 10/13] hw/core: Cleanup unused included headers in numa.c

2024-03-12 Thread Philippe Mathieu-Daudé

From: Zhao Liu 

Remove unused header in numa.c:
* qemu/bitmap.h
* migration/vmstate.h

Note: Though parse_numa_hmat_lb() has the variable named "bitmap_copy",
it doesn't use the normal bitmap ops so that it's safe to exclude
qemu/bitmap.h header.

Tested by "./configure" and then "make".

Signed-off-by: Zhao Liu 
Reviewed-by: Philippe Mathieu-Daudé 
Message-ID: <20240311075621.3224684-4-zhao1@linux.intel.com>
Signed-off-by: Philippe Mathieu-Daudé 
---
 hw/core/numa.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/hw/core/numa.c b/hw/core/numa.c
index f08956ddb0..81d2124349 100644
--- a/hw/core/numa.c
+++ b/hw/core/numa.c
@@ -28,7 +28,6 @@
 #include "sysemu/numa.h"
 #include "exec/cpu-common.h"
 #include "exec/ramlist.h"
-#include "qemu/bitmap.h"
 #include "qemu/error-report.h"
 #include "qapi/error.h"
 #include "qapi/opts-visitor.h"
@@ -36,7 +35,6 @@
 #include "sysemu/qtest.h"
 #include "hw/core/cpu.h"
 #include "hw/mem/pc-dimm.h"
-#include "migration/vmstate.h"
 #include "hw/boards.h"
 #include "hw/mem/memory-device.h"
 #include "qemu/option.h"
-- 
2.41.0

Re: [PATCH v2 00/29] Cleanup up to fix missing ERRP_GUARD() for error_prepend()

2024-03-12 Thread Zhao Liu

Hi Thomas/Markus/Michael,

For the remaining patches, could you please help me merge them next?

Many thanks!
Zhao

On Tue, Mar 12, 2024 at 09:17:30AM +0100, Philippe Mathieu-Daudé wrote:
> Date: Tue, 12 Mar 2024 09:17:30 +0100
> From: Philippe Mathieu-Daudé 
> Subject: Re: [PATCH v2 00/29] Cleanup up to fix missing ERRP_GUARD() for
>  error_prepend()
> 
> On 11/3/24 04:37, Zhao Liu wrote:
> 
> > ---
> > Zhao Liu (29):
> 
> >hw/core/loader-fit: Fix missing ERRP_GUARD() for error_prepend()
> >hw/core/qdev-properties-system: Fix missing ERRP_GUARD() for
> >  error_prepend()
> >hw/misc/ivshmem: Fix missing ERRP_GUARD() for error_prepend()
> 
> I'm queuing these 3 patches, thanks!

Re: [PATCH 03/13] ppc/spapr|pnv: Remove SAO from pa-features

2024-03-12 Thread Harsh Prateek Bora


Hi Nick,

One cosmetic comment, in case you are doing a re-spin:

On 3/12/24 00:21, Nicholas Piggin wrote:

SAO is a page table attribute that strengthens the memory ordering of
accesses. QEMU with MTTCG does not implement this, so clear it in
ibm,pa-features. This is an obscure feature that has been removed from
POWER10 ISA v3.1, so there isn't much concern with removing it.
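
A small sketch of how the changed byte can be read back. It assumes the layout implied by the index comments in the patch: a two-byte header (length, type) followed by attribute bytes, with bit 0 being the most significant bit of each byte. The toy_* names are hypothetical; the two arrays are the before and after values of pa_features_206 from the diff below.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed: attribute bytes start after a 2-byte (length, type) header. */
enum { TOY_PA_HEADER_LEN = 2 };

/* Return 1 if the given bit (0 = MSB) of attribute byte 'byte' is set. */
static int toy_pa_feature_test(const uint8_t *prop, size_t prop_len,
                               unsigned byte, unsigned bit)
{
    assert(prop != NULL && bit < 8);
    if ((size_t)TOY_PA_HEADER_LEN + byte >= prop_len) {
        return 0;                     /* byte not present in the property */
    }
    return (prop[TOY_PA_HEADER_LEN + byte] >> (7 - bit)) & 1;
}

/* pa_features_206 before and after the patch (values from the diff). */
static const uint8_t toy_pa_206_before[] = {
    6, 0, 0xf6, 0x1f, 0xc7, 0x00, 0x80, 0xc0 };
static const uint8_t toy_pa_206_after[] = {
    6, 0, 0xf6, 0x1f, 0xc7, 0x00, 0x00, 0xc0 };
```

Under these assumptions, the patch clears byte 4, bit 0, and leaves the neighbouring byte 5 untouched.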

Signed-off-by: Nicholas Piggin 
---
  hw/ppc/pnv.c   |  2 +-
  hw/ppc/spapr.c | 14 ++
  2 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/hw/ppc/pnv.c b/hw/ppc/pnv.c
index 0b47b92baa..aa9786e970 100644
--- a/hw/ppc/pnv.c
+++ b/hw/ppc/pnv.c
@@ -150,7 +150,7 @@ static void pnv_dt_core(PnvChip *chip, PnvCore *pc, void 
*fdt)
  uint32_t page_sizes_prop[64];
  size_t page_sizes_prop_size;
  const uint8_t pa_features[] = { 24, 0,
-0xf6, 0x3f, 0xc7, 0xc0, 0x80, 0xf0,
+0xf6, 0x3f, 0xc7, 0xc0, 0x00, 0xf0,
  0x80, 0x00, 0x00, 0x00, 0x00, 0x00,
  0x00, 0x00, 0x00, 0x00, 0x80, 0x00,
  0x80, 0x00, 0x80, 0x00, 0x80, 0x00 };
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 55263f0815..5099f12cc6 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -234,16 +234,16 @@ static void spapr_dt_pa_features(SpaprMachineState *spapr,
   void *fdt, int offset)
  {
  uint8_t pa_features_206[] = { 6, 0,
-0xf6, 0x1f, 0xc7, 0x00, 0x80, 0xc0 };
+0xf6, 0x1f, 0xc7, 0x00, 0x00, 0xc0 };
  uint8_t pa_features_207[] = { 24, 0,
-0xf6, 0x1f, 0xc7, 0xc0, 0x80, 0xf0,
+0xf6, 0x1f, 0xc7, 0xc0, 0x00, 0xf0,
  0x80, 0x00, 0x00, 0x00, 0x00, 0x00,
  0x00, 0x00, 0x00, 0x00, 0x80, 0x00,
  0x80, 0x00, 0x80, 0x00, 0x00, 0x00 };
  uint8_t pa_features_300[] = { 66, 0,
  /* 0: MMU|FPU|SLB|RUN|DABR|NX, 1: fri[nzpm]|DABRX|SPRG3|SLB0|PP110 */
-/* 2: VPM|DS205|PPR|DS202|DS206, 3: LSD|URG, SSO, 5: LE|CFAR|EB|LSQ */
-0xf6, 0x1f, 0xc7, 0xc0, 0x80, 0xf0, /* 0 - 5 */
+/* 2: VPM|DS205|PPR|DS202|DS206, 3: LSD|URG, 5: LE|CFAR|EB|LSQ */


Do we want to mention in comments SSO (disabled), also ..


+0xf6, 0x1f, 0xc7, 0xc0, 0x00, 0xf0, /* 0 - 5 */
  /* 6: DS207 */
  0x80, 0x00, 0x00, 0x00, 0x00, 0x00, /* 6 - 11 */
  /* 16: Vector */
@@ -284,6 +284,12 @@ static void spapr_dt_pa_features(SpaprMachineState *spapr,
  return;
  }
  
+/*

+ * SSO (SAO) ordering is supported on KVM and thread=single hosts,
+ * but not MTTCG, so disable it. To advertise it, a cap would have
+ * to be added, or support implemented for MTTCG.
+ */
+


This comment could go in the beginning where we are actually disabling it.

Otherwise,

Reviewed-by: Harsh Prateek Bora 



  if (ppc_hash64_has(cpu, PPC_HASH64_CI_LARGEPAGE)) {
  /*
   * Note: we keep CI large pages off by default because a 64K capable

Re: [PATCH v5] pc: q35: Bump max_cpus to 4096 vcpus

2024-03-12 Thread Zhao Liu

On Wed, Feb 28, 2024 at 08:03:51PM +0530, Ani Sinha wrote:
> Date: Wed, 28 Feb 2024 20:03:51 +0530
> From: Ani Sinha 
> Subject: [PATCH v5] pc: q35: Bump max_cpus to 4096 vcpus
> X-Mailer: git-send-email 2.42.0
> 
> Since commit f10a570b093e6 ("KVM: x86: Add CONFIG_KVM_MAX_NR_VCPUS to allow 
> up to 4096 vCPUs")
> Linux kernel can support up to a maximum number of 4096 vcpus when MAXSMP is
> enabled in the kernel. At present, QEMU has been tested to correctly boot a
> linux guest with 4096 vcpus using the current edk2 upstream master branch that
> has the fixes corresponding to the following two PRs:
> 
> https://github.com/tianocore/edk2/pull/5410
> https://github.com/tianocore/edk2/pull/5418
> 
> The changes merged into edk2 with the above PRs will be in the upcoming 
> 2024-05
> release. With current seabios firmware, it boots fine with 4096 vcpus already.
> So bump up the value max_cpus to 4096 for q35 machines versions 9 and newer.
> Q35 machines versions 8.2 and older continue to support 1024 maximum vcpus
> as before for compatibility reasons.
> 
> If KVM is not able to support the specified number of vcpus, QEMU would
> return the following error messages:
> 
> $ ./qemu-system-x86_64 -cpu host -accel kvm -machine q35 -smp 1728
> qemu-system-x86_64: -accel kvm: warning: Number of SMP cpus requested (1728) 
> exceeds the recommended cpus supported by KVM (12)
> qemu-system-x86_64: -accel kvm: warning: Number of hotpluggable cpus 
> requested (1728) exceeds the recommended cpus supported by KVM (12)
> Number of SMP cpus requested (1728) exceeds the maximum cpus supported by KVM 
> (1024)
> 
> Cc: Daniel P. Berrangé 
> Cc: Igor Mammedov 
> Cc: Michael S. Tsirkin 
> Cc: Julia Suvorova 
> Cc: kra...@redhat.com
> Reviewed-by: Daniel P. Berrangé 
> Reviewed-by: Igor Mammedov 
> Reviewed-by: Gerd Hoffmann 
> Signed-off-by: Ani Sinha 
> ---
>  hw/i386/pc_q35.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>

Reviewed-by: Zhao Liu

Re: [PATCH v2 25/29] hw/virtio/vhost-vsock: Fix missing ERRP_GUARD() for error_prepend()

2024-03-12 Thread Thomas Huth


On 11/03/2024 04.38, Zhao Liu wrote:

From: Zhao Liu 

As the comment in qapi/error, passing @errp to error_prepend() requires
ERRP_GUARD():

* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
...
* - It should not be passed to error_prepend(), error_vprepend() or
*   error_append_hint(), because that doesn't work with &error_fatal.
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or &error_fatal.

ERRP_GUARD() avoids the case where @errp is &error_fatal and the user
can't see this additional information, because exit() happens in
error_setg() before the information is added [1].

The vhost_vsock_device_realize() passes @errp to error_prepend(), and as
a VirtioDeviceClass.realize method, its @errp is from
DeviceClass.realize so that there is no guarantee that the @errp won't
point to @error_fatal.

To avoid the issue like [1] said, add missing ERRP_GUARD() at the
beginning of this function.

[1]: Issue description in the commit message of commit ae7c80a7bd73
  ("error: New macro ERRP_GUARD()").

Cc: "Michael S. Tsirkin" 
Signed-off-by: Zhao Liu 
---
  hw/virtio/vhost-vsock.c | 1 +
  1 file changed, 1 insertion(+)

diff --git a/hw/virtio/vhost-vsock.c b/hw/virtio/vhost-vsock.c
index d5ca0b5a1055..3d4a5a97f484 100644
--- a/hw/virtio/vhost-vsock.c
+++ b/hw/virtio/vhost-vsock.c
@@ -121,6 +121,7 @@ static const VMStateDescription vmstate_virtio_vhost_vsock 
= {
  
  static void vhost_vsock_device_realize(DeviceState *dev, Error **errp)

  {
+ERRP_GUARD();
  VHostVSockCommon *vvc = VHOST_VSOCK_COMMON(dev);
  VirtIODevice *vdev = VIRTIO_DEVICE(dev);
  VHostVSock *vsock = VHOST_VSOCK(dev);


Reviewed-by: Thomas Huth

Re: [PATCH] spapr: avoid overhead of finding vhyp class in critical operations

2024-03-12 Thread Harsh Prateek Bora





On 3/12/24 14:18, Nicholas Piggin wrote:

On Tue Mar 12, 2024 at 4:38 PM AEST, Harsh Prateek Bora wrote:

Hi Nick,

One minor comment below:

On 2/24/24 13:03, Nicholas Piggin wrote:

PPC_VIRTUAL_HYPERVISOR_GET_CLASS is used in critical operations like
interrupts and TLB misses and is quite costly. Running the
kvm-unit-tests sieve program with radix MMU enabled thrashes the TCG
TLB and spends a lot of time in TLB and page table walking code. The
test takes 67 seconds to complete with a lot of time being spent in
code related to finding the vhyp class:

 12.01%  [.] g_str_hash
  8.94%  [.] g_hash_table_lookup
  8.06%  [.] object_class_dynamic_cast
  6.21%  [.] address_space_ldq
  4.94%  [.] __strcmp_avx2
  4.28%  [.] tlb_set_page_full
  4.08%  [.] address_space_translate_internal
  3.17%  [.] object_class_dynamic_cast_assert
  2.84%  [.] ppc_radix64_xlate

Keep a pointer to the class and avoid this lookup. This reduces the
execution time to 40 seconds.

Signed-off-by: Nicholas Piggin 
---
This feels a bit ugly, but the performance problem of looking up the
class in fast paths can't be ignored. Is there a "nicer" way to get the
same result?

Thanks,
Nick

   target/ppc/cpu.h   |  3 ++-
   target/ppc/mmu-book3s-v3.h |  4 +---
   hw/ppc/pegasos2.c  |  1 +
   target/ppc/cpu_init.c  |  9 +++--
   target/ppc/excp_helper.c   | 16 
   target/ppc/kvm.c   |  4 +---
   target/ppc/mmu-hash64.c| 16 
   target/ppc/mmu-radix64.c   |  4 +---
   8 files changed, 17 insertions(+), 40 deletions(-)

diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h
index ec14574d14..eb85d9aa71 100644
--- a/target/ppc/cpu.h
+++ b/target/ppc/cpu.h
@@ -1437,6 +1437,7 @@ struct ArchCPU {
   int vcpu_id;
   uint32_t compat_pvr;
   PPCVirtualHypervisor *vhyp;
+PPCVirtualHypervisorClass *vhyp_class;
   void *machine_data;
   int32_t node_id; /* NUMA node this CPU belongs to */
   PPCHash64Options *hash64_opts;
@@ -1535,7 +1536,7 @@ DECLARE_OBJ_CHECKERS(PPCVirtualHypervisor, 
PPCVirtualHypervisorClass,
   
   static inline bool vhyp_cpu_in_nested(PowerPCCPU *cpu)

   {
-return PPC_VIRTUAL_HYPERVISOR_GET_CLASS(cpu->vhyp)->cpu_in_nested(cpu);
+return cpu->vhyp_class->cpu_in_nested(cpu);
   }
   #endif /* CONFIG_USER_ONLY */
   
diff --git a/target/ppc/mmu-book3s-v3.h b/target/ppc/mmu-book3s-v3.h

index 674377a19e..f3f7993958 100644
--- a/target/ppc/mmu-book3s-v3.h
+++ b/target/ppc/mmu-book3s-v3.h
@@ -108,9 +108,7 @@ static inline hwaddr ppc_hash64_hpt_mask(PowerPCCPU *cpu)
   uint64_t base;
   
   if (cpu->vhyp) {


All the checks for cpu->vhyp need to be changed to check for
cpu->vhyp_class now, in all such instances.


It wasn't supposed to, because vhyp != NULL implies vhyp_class != NULL.
It's supposed to be an equivalent transformation just changing the
lookup function.


I agree, but beyond it looking a bit odd, my only worry is that if a future
change causes vhyp_class to be NULL before control reaches here, this
check won't really serve its purpose. Anyway, it is not a mandatory
requirement for now, so I shall leave it to your choice.

regards,
Harsh



Okay to leave it as is?

Thanks,
Nick



With that,

Reviewed-by: Harsh Prateek Bora 
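
As an aside to the exchange above, the invariant "vhyp != NULL implies
vhyp_class != NULL" can be illustrated with a small standalone sketch. Every
name here (sketch_cpu, sketch_get_class, sketch_set_vhyp, and so on) is
hypothetical and is not QEMU code; the counter only stands in for the cost of
the dynamic class lookup seen in the profile.

```c
#include <stddef.h>

/* Hypothetical class with a single method. */
struct sketch_class {
    int (*cpu_in_nested)(void *obj);
};

struct sketch_cpu {
    void *vhyp;                            /* object pointer, may be NULL */
    const struct sketch_class *vhyp_class; /* cached when vhyp is set */
};

static int sketch_lookups; /* counts how often the slow lookup runs */

static int sketch_never_nested(void *obj)
{
    (void)obj;
    return 0;
}

static const struct sketch_class sketch_the_class = { sketch_never_nested };

/* Stand-in for an expensive, string-hash based class lookup. */
static const struct sketch_class *sketch_get_class(void *obj)
{
    (void)obj;
    sketch_lookups++;
    return &sketch_the_class;
}

/* One setter updates both fields, so vhyp != NULL implies class != NULL. */
static void sketch_set_vhyp(struct sketch_cpu *cpu, void *vhyp)
{
    cpu->vhyp = vhyp;
    cpu->vhyp_class = vhyp ? sketch_get_class(vhyp) : NULL;
}

/* Fast path: tests the object pointer, then uses the cached class. */
static int sketch_in_nested(struct sketch_cpu *cpu)
{
    return cpu->vhyp ? cpu->vhyp_class->cpu_in_nested(cpu->vhyp) : 0;
}
```

In this sketch the invariant holds only as long as both fields are written
through the one setter, which is the concern raised in the review.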



-PPCVirtualHypervisorClass *vhc =
-PPC_VIRTUAL_HYPERVISOR_GET_CLASS(cpu->vhyp);
-return vhc->hpt_mask(cpu->vhyp);
+return cpu->vhyp_class->hpt_mask(cpu->vhyp);
   }
   if (cpu->env.mmu_model == POWERPC_MMU_3_00) {
   ppc_v3_pate_t pate;
diff --git a/hw/ppc/pegasos2.c b/hw/ppc/pegasos2.c
index 04d6decb2b..c22e8b336d 100644
--- a/hw/ppc/pegasos2.c
+++ b/hw/ppc/pegasos2.c
@@ -400,6 +400,7 @@ static void pegasos2_machine_reset(MachineState *machine, 
ShutdownCause reason)
   machine->fdt = fdt;
   
   pm->cpu->vhyp = PPC_VIRTUAL_HYPERVISOR(machine);

+pm->cpu->vhyp_class = PPC_VIRTUAL_HYPERVISOR_GET_CLASS(pm->cpu->vhyp);
   }
   
   enum pegasos2_rtas_tokens {

diff --git a/target/ppc/cpu_init.c b/target/ppc/cpu_init.c
index 9bccddb350..63d0094024 100644
--- a/target/ppc/cpu_init.c
+++ b/target/ppc/cpu_init.c
@@ -6631,6 +6631,7 @@ void cpu_ppc_set_vhyp(PowerPCCPU *cpu, 
PPCVirtualHypervisor *vhyp)
   CPUPPCState *env = &cpu->env;
   
   cpu->vhyp = vhyp;

+cpu->vhyp_class = PPC_VIRTUAL_HYPERVISOR_GET_CLASS(vhyp);
   
   /*

* With a virtual hypervisor mode we never allow the CPU to go
@@ -7224,9 +7225,7 @@ static void ppc_cpu_exec_enter(CPUState *cs)
   PowerPCCPU *cpu = POWERPC_CPU(cs);
   
   if (cpu->vhyp) {

-PPCVirtualHypervisorClass *vhc =
-PPC_VIRTUAL_HYPERVISOR_GET_CLASS(cpu->vhyp);
-vhc->cpu_exec_enter(cpu->vhyp, cpu);
+cpu->vhyp_class->cpu_exec_enter(cpu->vhyp, cpu);
   }
   }
   
@@ -7235,9 +7234,7 @@ static void ppc_cpu_exec_exit(CPUState *cs)

   PowerPCCPU *cpu = POWERPC_CPU(cs);
   
   if (cpu->vhyp) {

-

Re: [PATCH 08/13] ppc/pnv: Set POWER9, POWER10 ibm,pa-features bits

2024-03-12 Thread Cédric Le Goater


On 3/12/24 09:54, Nicholas Piggin wrote:

On Tue Mar 12, 2024 at 6:06 PM AEST, Cédric Le Goater wrote:

On 3/11/24 19:51, Nicholas Piggin wrote:

Copy the pa-features arrays from spapr, adjusting slightly as
described in comments.

Cc: "Cédric Le Goater" 
Cc: "Frédéric Barrat" 
Signed-off-by: Nicholas Piggin 
---
   hw/ppc/pnv.c   | 67 --
   hw/ppc/spapr.c |  1 +
   2 files changed, 66 insertions(+), 2 deletions(-)

diff --git a/hw/ppc/pnv.c b/hw/ppc/pnv.c
index 52d964f77a..3e30c08420 100644
--- a/hw/ppc/pnv.c
+++ b/hw/ppc/pnv.c
@@ -332,6 +332,35 @@ static void pnv_chip_power8_dt_populate(PnvChip *chip, 
void *fdt)
   }
   }
   
+/*

+ * Same as spapr pa_features_300 except pnv always enables CI largepages bit.
+ */
+static const uint8_t pa_features_300[] = { 66, 0,
+/* 0: MMU|FPU|SLB|RUN|DABR|NX, 1: CILRG|fri[nzpm]|DABRX|SPRG3|SLB0|PP110 */
+/* 2: VPM|DS205|PPR|DS202|DS206, 3: LSD|URG, 5: LE|CFAR|EB|LSQ */
+0xf6, 0x3f, 0xc7, 0xc0, 0x00, 0xf0, /* 0 - 5 */
+/* 6: DS207 */
+0x80, 0x00, 0x00, 0x00, 0x00, 0x00, /* 6 - 11 */
+/* 16: Vector */
+0x00, 0x00, 0x00, 0x00, 0x80, 0x00, /* 12 - 17 */
+/* 18: Vec. Scalar, 20: Vec. XOR, 22: HTM */
+0x80, 0x00, 0x80, 0x00, 0x80, 0x00, /* 18 - 23 */
+/* 24: Ext. Dec, 26: 64 bit ftrs, 28: PM ftrs */
+0x80, 0x00, 0x80, 0x00, 0x80, 0x00, /* 24 - 29 */
+/* 32: LE atomic, 34: EBB + ext EBB */
+0x00, 0x00, 0x80, 0x00, 0xC0, 0x00, /* 30 - 35 */
+/* 40: Radix MMU */
+0x00, 0x00, 0x00, 0x00, 0x80, 0x00, /* 36 - 41 */
+/* 42: PM, 44: PC RA, 46: SC vec'd */
+0x80, 0x00, 0x80, 0x00, 0x80, 0x00, /* 42 - 47 */
+/* 48: SIMD, 50: QP BFP, 52: String */
+0x80, 0x00, 0x80, 0x00, 0x80, 0x00, /* 48 - 53 */
+/* 54: DecFP, 56: DecI, 58: SHA */
+0x80, 0x00, 0x80, 0x00, 0x80, 0x00, /* 54 - 59 */
+/* 60: NM atomic, 62: RNG */
+0x80, 0x00, 0x80, 0x00, 0x00, 0x00, /* 60 - 65 */
+};
+
   static void pnv_chip_power9_dt_populate(PnvChip *chip, void *fdt)
   {
   static const char compat[] = "ibm,power9-xscom\0ibm,xscom";
@@ -349,7 +378,7 @@ static void pnv_chip_power9_dt_populate(PnvChip *chip, void 
*fdt)
   offset = pnv_dt_core(chip, pnv_core, fdt);
   
   _FDT((fdt_setprop(fdt, offset, "ibm,pa-features",

-   pa_features_207, sizeof(pa_features_207))));
+   pa_features_300, sizeof(pa_features_300))));
   }
   
   if (chip->ram_size) {

@@ -359,6 +388,40 @@ static void pnv_chip_power9_dt_populate(PnvChip *chip, 
void *fdt)
   pnv_dt_lpc(chip, fdt, 0, PNV9_LPCM_BASE(chip), PNV9_LPCM_SIZE);
   }
   
+/*

+ * Same as spapr pa_features_31 except pnv always enables CI largepages bit,
+ * always disables copy/paste.
+ */
+static const uint8_t pa_features_31[] = { 74, 0,
+/* 0: MMU|FPU|SLB|RUN|DABR|NX, 1: CILRG|fri[nzpm]|DABRX|SPRG3|SLB0|PP110 */
+/* 2: VPM|DS205|PPR|DS202|DS206, 3: LSD|URG, 5: LE|CFAR|EB|LSQ */
+0xf6, 0x3f, 0xc7, 0xc0, 0x00, 0xf0, /* 0 - 5 */
+/* 6: DS207 */
+0x80, 0x00, 0x00, 0x00, 0x00, 0x00, /* 6 - 11 */
+/* 16: Vector */
+0x00, 0x00, 0x00, 0x00, 0x80, 0x00, /* 12 - 17 */
+/* 18: Vec. Scalar, 20: Vec. XOR */
+0x80, 0x00, 0x80, 0x00, 0x00, 0x00, /* 18 - 23 */
+/* 24: Ext. Dec, 26: 64 bit ftrs, 28: PM ftrs */
+0x80, 0x00, 0x80, 0x00, 0x80, 0x00, /* 24 - 29 */
+/* 32: LE atomic, 34: EBB + ext EBB */
+0x00, 0x00, 0x80, 0x00, 0xC0, 0x00, /* 30 - 35 */
+/* 40: Radix MMU */
+0x00, 0x00, 0x00, 0x00, 0x80, 0x00, /* 36 - 41 */
+/* 42: PM, 44: PC RA, 46: SC vec'd */
+0x80, 0x00, 0x80, 0x00, 0x80, 0x00, /* 42 - 47 */
+/* 48: SIMD, 50: QP BFP, 52: String */
+0x80, 0x00, 0x80, 0x00, 0x80, 0x00, /* 48 - 53 */
+/* 54: DecFP, 56: DecI, 58: SHA */
+0x80, 0x00, 0x80, 0x00, 0x80, 0x00, /* 54 - 59 */
+/* 60: NM atomic, 62: RNG */
+0x80, 0x00, 0x80, 0x00, 0x00, 0x00, /* 60 - 65 */
+/* 68: DEXCR[SBHE|IBRTPDUS|SRAPD|NPHIE|PHIE] */
+0x00, 0x00, 0xce, 0x00, 0x00, 0x00, /* 66 - 71 */
+/* 72: [P]HASHCHK */
+0x80, 0x00, /* 72 - 73 */
+};
+
   static void pnv_chip_power10_dt_populate(PnvChip *chip, void *fdt)
   {
   static const char compat[] = "ibm,power10-xscom\0ibm,xscom";
@@ -376,7 +439,7 @@ static void pnv_chip_power10_dt_populate(PnvChip *chip, 
void *fdt)
   offset = pnv_dt_core(chip, pnv_core, fdt);
   
   _FDT((fdt_setprop(fdt, offset, "ibm,pa-features",

-   pa_features_207, sizeof(pa_features_207))));
+   pa_features_31, sizeof(pa_features_31))));
   }
   
   if (chip->ram_size) {

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 128bfe11a8..b53c13e037 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -233,6 +233,7 @@ static void spapr_dt_pa_features(SpaprMachineState *spapr,
PowerPCCPU *cpu,
void *fdt,

Re: [PATCH 1/2] io: Introduce qio_channel_file_new_dupfd

2024-03-12 Thread Daniel P . Berrangé

On Mon, Mar 11, 2024 at 08:33:34PM -0300, Fabiano Rosas wrote:
> Add a new helper function for creating a QIOChannelFile channel with a
> duplicated file descriptor. This saves the calling code from having to
> do error checking on the dup() call.
> 
> Suggested-by: Daniel P. Berrangé 
> Signed-off-by: Fabiano Rosas 
> ---
>  include/io/channel-file.h | 18 ++
>  io/channel-file.c | 12 
>  2 files changed, 30 insertions(+)

Reviewed-by: Daniel P. Berrangé 

With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

[PULL 2/8] tests/vm: ensure we build everything by default

2024-03-12 Thread Alex Bennée

The "check" target by itself is not enough to ensure we build the user
mode binaries. While we can't test them with check-tcg we can at least
include them in the build.

Signed-off-by: Alex Bennée 
Reviewed-by: Thomas Huth 
Cc: Richard Henderson 
Cc: Gustavo Romero 

diff --git a/tests/vm/basevm.py b/tests/vm/basevm.py
index f8fd751eb1..4a1af04b9a 100644
--- a/tests/vm/basevm.py
+++ b/tests/vm/basevm.py
@@ -606,7 +606,7 @@ def get_default_jobs():
 parser.add_argument("--build-qemu",
 help="build QEMU from source in guest")
 parser.add_argument("--build-target",
-help="QEMU build target", default="check")
+help="QEMU build target", default="all check")
 parser.add_argument("--build-path", default=None,
 help="Path of build directory, "\
 "for using build tree QEMU binary. ")
-- 
2.39.2

[PULL 3/8] gdbstub: Rename back gdb_handlesig

2024-03-12 Thread Alex Bennée

From: Gustavo Romero 

Rename gdb_handlesig_reason back to gdb_handlesig. There is no need to
add a wrapper for gdb_handlesig and rename it when a new parameter is
added.

Signed-off-by: Gustavo Romero 
Reviewed-by: Alex Bennée 
Reviewed-by: Richard Henderson 
Message-Id: <20240309030901.1726211-2-gustavo.rom...@linaro.org>
Signed-off-by: Alex Bennée 

diff --git a/include/gdbstub/user.h b/include/gdbstub/user.h
index 4c4e5c4c58..4fc88f8a25 100644
--- a/include/gdbstub/user.h
+++ b/include/gdbstub/user.h
@@ -10,7 +10,7 @@
 #define GDBSTUB_USER_H
 
 /**
- * gdb_handlesig_reason() - yield control to gdb
+ * gdb_handlesig() - yield control to gdb
  * @cpu: CPU
  * @sig: if non-zero, the signal number which caused us to stop
  * @reason: stop reason for stop reply packet or NULL
@@ -25,18 +25,7 @@
  * or 0 if no signal should be delivered, ie the signal that caused
  * us to stop should be ignored.
  */
-int gdb_handlesig_reason(CPUState *, int, const char *);
-
-/**
- * gdb_handlesig() - yield control to gdb
- * @cpu CPU
- * @sig: if non-zero, the signal number which caused us to stop
- * @see gdb_handlesig_reason()
- */
-static inline int gdb_handlesig(CPUState *cpu, int sig)
-{
-return gdb_handlesig_reason(cpu, sig, NULL);
-}
+int gdb_handlesig(CPUState *, int, const char *);
 
 /**
  * gdb_signalled() - inform remote gdb of sig exit
diff --git a/gdbstub/user.c b/gdbstub/user.c
index 7f9f19a124..520987fddc 100644
--- a/gdbstub/user.c
+++ b/gdbstub/user.c
@@ -190,7 +190,7 @@ void gdb_qemu_exit(int code)
 exit(code);
 }
 
-int gdb_handlesig_reason(CPUState *cpu, int sig, const char *reason)
+int gdb_handlesig(CPUState *cpu, int sig, const char *reason)
 {
 char buf[256];
 int n;
@@ -746,7 +746,7 @@ void gdb_breakpoint_remove_all(CPUState *cs)
 void gdb_syscall_handling(const char *syscall_packet)
 {
 gdb_put_packet(syscall_packet);
-gdb_handlesig(gdbserver_state.c_cpu, 0);
+gdb_handlesig(gdbserver_state.c_cpu, 0, NULL);
 }
 
 static bool should_catch_syscall(int num)
@@ -764,7 +764,7 @@ void gdb_syscall_entry(CPUState *cs, int num)
 {
 if (should_catch_syscall(num)) {
 g_autofree char *reason = g_strdup_printf("syscall_entry:%x;", num);
-gdb_handlesig_reason(cs, gdb_target_sigtrap(), reason);
+gdb_handlesig(cs, gdb_target_sigtrap(), reason);
 }
 }
 
@@ -772,7 +772,7 @@ void gdb_syscall_return(CPUState *cs, int num)
 {
 if (should_catch_syscall(num)) {
 g_autofree char *reason = g_strdup_printf("syscall_return:%x;", num);
-gdb_handlesig_reason(cs, gdb_target_sigtrap(), reason);
+gdb_handlesig(cs, gdb_target_sigtrap(), reason);
 }
 }
 
diff --git a/linux-user/main.c b/linux-user/main.c
index 41caa77cb5..55aa11c9b4 100644
--- a/linux-user/main.c
+++ b/linux-user/main.c
@@ -1018,7 +1018,7 @@ int main(int argc, char **argv, char **envp)
 gdbstub);
 exit(EXIT_FAILURE);
 }
-gdb_handlesig(cpu, 0);
+gdb_handlesig(cpu, 0, NULL);
 }
 
 #ifdef CONFIG_SEMIHOSTING
diff --git a/linux-user/signal.c b/linux-user/signal.c
index cc7dd78e41..bca44c295d 100644
--- a/linux-user/signal.c
+++ b/linux-user/signal.c
@@ -1180,7 +1180,7 @@ static void handle_pending_signal(CPUArchState *cpu_env, 
int sig,
 /* dequeue signal */
 k->pending = 0;
 
-sig = gdb_handlesig(cpu, sig);
+sig = gdb_handlesig(cpu, sig, NULL);
 if (!sig) {
 sa = NULL;
 handler = TARGET_SIG_IGN;
-- 
2.39.2

[PULL 4/8] linux-user: Move tswap_siginfo out of target code

2024-03-12 Thread Alex Bennée

From: Gustavo Romero 

Move tswap_siginfo from target code to handle_pending_signal. This will
allow some cleanups and having the siginfo ready to be used in gdbstub.

Signed-off-by: Gustavo Romero 
Suggested-by: Richard Henderson 
Reviewed-by: Richard Henderson 
Message-Id: <20240309030901.1726211-3-gustavo.rom...@linaro.org>
Signed-off-by: Alex Bennée 

diff --git a/linux-user/signal-common.h b/linux-user/signal-common.h
index a7df12fc44..f4cbe6185e 100644
--- a/linux-user/signal-common.h
+++ b/linux-user/signal-common.h
@@ -43,8 +43,6 @@ void host_to_target_sigset_internal(target_sigset_t *d,
 const sigset_t *s);
 void target_to_host_sigset_internal(sigset_t *d,
 const target_sigset_t *s);
-void tswap_siginfo(target_siginfo_t *tinfo,
-   const target_siginfo_t *info);
 void set_sigmask(const sigset_t *set);
 void force_sig(int sig);
 void force_sigsegv(int oldsig);
diff --git a/linux-user/aarch64/signal.c b/linux-user/aarch64/signal.c
index a1e22d526d..bc7a13800d 100644
--- a/linux-user/aarch64/signal.c
+++ b/linux-user/aarch64/signal.c
@@ -670,7 +670,7 @@ static void target_setup_frame(int usig, struct 
target_sigaction *ka,
 aarch64_set_svcr(env, 0, R_SVCR_SM_MASK | R_SVCR_ZA_MASK);
 
 if (info) {
-tswap_siginfo(&frame->info, info);
+frame->info = *info;
 env->xregs[1] = frame_addr + offsetof(struct target_rt_sigframe, info);
 env->xregs[2] = frame_addr + offsetof(struct target_rt_sigframe, uc);
 }
diff --git a/linux-user/alpha/signal.c b/linux-user/alpha/signal.c
index 4ec42994d4..896c2c148a 100644
--- a/linux-user/alpha/signal.c
+++ b/linux-user/alpha/signal.c
@@ -173,7 +173,7 @@ void setup_rt_frame(int sig, struct target_sigaction *ka,
 goto give_sigsegv;
 }
 
-tswap_siginfo(&frame->info, info);
+frame->info = *info;
 
 __put_user(0, &frame->uc.tuc_flags);
 __put_user(0, &frame->uc.tuc_link);
diff --git a/linux-user/arm/signal.c b/linux-user/arm/signal.c
index 59806335f5..8db1c4b233 100644
--- a/linux-user/arm/signal.c
+++ b/linux-user/arm/signal.c
@@ -357,7 +357,7 @@ void setup_rt_frame(int usig, struct target_sigaction *ka,
 
 info_addr = frame_addr + offsetof(struct rt_sigframe, info);
 uc_addr = frame_addr + offsetof(struct rt_sigframe, sig.uc);
-tswap_siginfo(&frame->info, info);
+frame->info = *info;
 
 setup_sigframe(&frame->sig.uc, set, env);
 
diff --git a/linux-user/hexagon/signal.c b/linux-user/hexagon/signal.c
index 60fa7e1bce..492b51f155 100644
--- a/linux-user/hexagon/signal.c
+++ b/linux-user/hexagon/signal.c
@@ -162,7 +162,7 @@ void setup_rt_frame(int sig, struct target_sigaction *ka,
 }
 
 setup_ucontext(&frame->uc, env, set);
-tswap_siginfo(&frame->info, info);
+frame->info = *info;
 /*
  * The on-stack signal trampoline is no longer executed;
  * however, the libgcc signal frame unwinding code checks
diff --git a/linux-user/hppa/signal.c b/linux-user/hppa/signal.c
index c84557e906..682ba25922 100644
--- a/linux-user/hppa/signal.c
+++ b/linux-user/hppa/signal.c
@@ -127,7 +127,7 @@ void setup_rt_frame(int sig, struct target_sigaction *ka,
 goto give_sigsegv;
 }
 
-tswap_siginfo(&frame->info, info);
+frame->info = *info;
 frame->uc.tuc_flags = 0;
 frame->uc.tuc_link = 0;
 
diff --git a/linux-user/i386/signal.c b/linux-user/i386/signal.c
index bc5d45302e..cfe70fc5cf 100644
--- a/linux-user/i386/signal.c
+++ b/linux-user/i386/signal.c
@@ -430,7 +430,7 @@ void setup_frame(int sig, struct target_sigaction *ka,
 setup_sigcontext(&frame->sc, &frame->fpstate, env, set->sig[0],
 frame_addr + offsetof(struct sigframe, fpstate));
 
-for(i = 1; i < TARGET_NSIG_WORDS; i++) {
+for (i = 1; i < TARGET_NSIG_WORDS; i++) {
 __put_user(set->sig[i], &frame->extramask[i - 1]);
 }
 
@@ -490,7 +490,7 @@ void setup_rt_frame(int sig, struct target_sigaction *ka,
 __put_user(addr, &frame->puc);
 #endif
 if (ka->sa_flags & TARGET_SA_SIGINFO) {
-tswap_siginfo(&frame->info, info);
+frame->info = *info;
 }
 
 /* Create the ucontext.  */
@@ -504,7 +504,7 @@ void setup_rt_frame(int sig, struct target_sigaction *ka,
 setup_sigcontext(&frame->uc.tuc_mcontext, &frame->fpstate, env,
 set->sig[0], frame_addr + offsetof(struct rt_sigframe, fpstate));
 
-for(i = 0; i < TARGET_NSIG_WORDS; i++) {
+for (i = 0; i < TARGET_NSIG_WORDS; i++) {
 __put_user(set->sig[i], &frame->uc.tuc_sigmask.sig[i]);
 }
 
diff --git a/linux-user/loongarch64/signal.c b/linux-user/loongarch64/signal.c
index 39ea82c814..1a322f9697 100644
--- a/linux-user/loongarch64/signal.c
+++ b/linux-user/loongarch64/signal.c
@@ -376,7 +376,7 @@ void setup_rt_frame(int sig, struct target_sigaction *ka,
 extctx.end.haddr = (void *)frame + (extctx.end.gaddr - frame_addr);
 }
 
-tswap_siginfo(&frame->rs_info, info);
+frame->rs_info = *info;
 
 __put_user(0, &frame->rs_uc.tuc_flags);
 __put_user(0, &frame->rs_uc.tuc_link);
diff --git

[PULL 6/8] gdbstub: Add Xfer:siginfo:read stub

2024-03-12 Thread Alex Bennée

From: Gustavo Romero 

Add stub to handle Xfer:siginfo:read packet query that requests the
machine's siginfo data.

This is used when GDB user executes 'print $_siginfo' and when the
machine stops due to a signal, for instance, on SIGSEGV. The information
in siginfo allows GDB to determine further details on the signal, like
the fault address/insn when the SIGSEGV is caught.

Signed-off-by: Gustavo Romero 
Message-Id: <20240309030901.1726211-5-gustavo.rom...@linaro.org>
Signed-off-by: Alex Bennée 
Reviewed-by: Richard Henderson 

diff --git a/gdbstub/internals.h b/gdbstub/internals.h
index b472459838..e83b179920 100644
--- a/gdbstub/internals.h
+++ b/gdbstub/internals.h
@@ -190,6 +190,7 @@ typedef union GdbCmdVariant {
 void gdb_handle_query_rcmd(GArray *params, void *user_ctx); /* softmmu */
 void gdb_handle_query_offsets(GArray *params, void *user_ctx); /* user */
 void gdb_handle_query_xfer_auxv(GArray *params, void *user_ctx); /*user */
+void gdb_handle_query_xfer_siginfo(GArray *params, void *user_ctx); /*user */
 void gdb_handle_v_file_open(GArray *params, void *user_ctx); /* user */
 void gdb_handle_v_file_close(GArray *params, void *user_ctx); /* user */
 void gdb_handle_v_file_pread(GArray *params, void *user_ctx); /* user */
diff --git a/gdbstub/gdbstub.c b/gdbstub/gdbstub.c
index 17efcae0d0..9c23d44baf 100644
--- a/gdbstub/gdbstub.c
+++ b/gdbstub/gdbstub.c
@@ -1664,6 +1664,8 @@ static void handle_query_supported(GArray *params, void 
*user_ctx)
 g_string_append(gdbserver_state.str_buf, ";qXfer:auxv:read+");
 }
 g_string_append(gdbserver_state.str_buf, ";QCatchSyscalls+");
+
+g_string_append(gdbserver_state.str_buf, ";qXfer:siginfo:read+");
 #endif
 g_string_append(gdbserver_state.str_buf, ";qXfer:exec-file:read+");
 #endif
@@ -1818,6 +1820,12 @@ static const GdbCmdParseEntry gdb_gen_query_table[] = {
 .cmd_startswith = 1,
 .schema = "l,l0"
 },
+{
+.handler = gdb_handle_query_xfer_siginfo,
+.cmd = "Xfer:siginfo:read::",
+.cmd_startswith = 1,
+.schema = "l,l0"
+ },
 #endif
 {
 .handler = gdb_handle_query_xfer_exec_file,
diff --git a/gdbstub/user.c b/gdbstub/user.c
index cf693bfbc4..2005f3312b 100644
--- a/gdbstub/user.c
+++ b/gdbstub/user.c
@@ -852,3 +852,26 @@ void gdb_handle_set_catch_syscalls(GArray *params, void 
*user_ctx)
 err:
 gdb_put_packet("E00");
 }
+
+void gdb_handle_query_xfer_siginfo(GArray *params, void *user_ctx)
+{
+unsigned long offset, len;
+uint8_t *siginfo_offset;
+
+offset = get_param(params, 0)->val_ul;
+len = get_param(params, 1)->val_ul;
+
+if (offset + len > gdbserver_user_state.siginfo_len) {
+/* Invalid offset and/or requested length. */
+gdb_put_packet("E01");
+return;
+}
+
+siginfo_offset = (uint8_t *)gdbserver_user_state.siginfo + offset;
+
+/* Reply */
+g_string_assign(gdbserver_state.str_buf, "l");
+gdb_memtox(gdbserver_state.str_buf, (const char *)siginfo_offset, len);
+gdb_put_packet_binary(gdbserver_state.str_buf->str,
+  gdbserver_state.str_buf->len, true);
+}
-- 
2.39.2
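
For readers following the bounds check in gdb_handle_query_xfer_siginfo()
above: the patch rejects a request when offset + len exceeds the saved siginfo
length. The sketch below is not taken from the patch; it shows a variant of the
same range test that never forms the sum, so very large client-supplied values
cannot wrap around. The function name and parameters are hypothetical.

```c
#include <stdbool.h>

/*
 * Returns true when the byte range [offset, offset + len) lies inside a
 * buffer of buf_len bytes. The sum offset + len is never computed, so
 * unsigned wrap-around cannot turn an oversized request into a small one.
 */
static bool sketch_range_ok(unsigned long offset, unsigned long len,
                            unsigned long buf_len)
{
    return offset <= buf_len && len <= buf_len - offset;
}
```

With a 128-byte save area, a request for offset 0 and length 128 passes, while
offset 1 and length 128 is refused.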

[PULL 7/8] tests/tcg: Add multiarch test for Xfer:siginfo:read stub

2024-03-12 Thread Alex Bennée

From: Gustavo Romero 

Add multiarch test for testing if Xfer:siginfo:read query is properly
handled by gdbstub.

Signed-off-by: Gustavo Romero 
Reviewed-by: Richard Henderson 
Message-Id: <20240309030901.1726211-6-gustavo.rom...@linaro.org>
Signed-off-by: Alex Bennée 

diff --git a/tests/tcg/multiarch/segfault.c b/tests/tcg/multiarch/segfault.c
new file mode 100644
index 0000000000..e6c8ff31ca
--- /dev/null
+++ b/tests/tcg/multiarch/segfault.c
@@ -0,0 +1,14 @@
+#include <stdio.h>
+#include <string.h>
+
+/* Cause a segfault for testing purposes. */
+
+int main(int argc, char *argv[])
+{
+int *ptr = (void *)0xdeadbeef;
+
+if (argc == 2 && strcmp(argv[1], "-s") == 0) {
+/* Cause segfault. */
+printf("%d\n", *ptr);
+}
+}
diff --git a/tests/tcg/multiarch/Makefile.target 
b/tests/tcg/multiarch/Makefile.target
index 979a0dd1bc..5e3391ec9d 100644
--- a/tests/tcg/multiarch/Makefile.target
+++ b/tests/tcg/multiarch/Makefile.target
@@ -71,6 +71,13 @@ run-gdbstub-qxfer-auxv-read: sha1
--bin $< --test 
$(MULTIARCH_SRC)/gdbstub/test-qxfer-auxv-read.py, \
basic gdbstub qXfer:auxv:read support)
 
+run-gdbstub-qxfer-siginfo-read: segfault
+   $(call run-test, $@, $(GDB_SCRIPT) \
+   --gdb $(GDB) \
+   --qemu $(QEMU) --qargs "$(QEMU_OPTS)" \
+   --bin "$< -s" --test 
$(MULTIARCH_SRC)/gdbstub/test-qxfer-siginfo-read.py, \
+   basic gdbstub qXfer:siginfo:read support)
+
 run-gdbstub-proc-mappings: sha1
$(call run-test, $@, $(GDB_SCRIPT) \
--gdb $(GDB) \
@@ -128,7 +135,8 @@ EXTRA_RUNS += run-gdbstub-sha1 run-gdbstub-qxfer-auxv-read \
  run-gdbstub-proc-mappings run-gdbstub-thread-breakpoint \
  run-gdbstub-registers run-gdbstub-prot-none \
  run-gdbstub-catch-syscalls run-gdbstub-follow-fork-mode-child \
- run-gdbstub-follow-fork-mode-parent
+ run-gdbstub-follow-fork-mode-parent \
+ run-gdbstub-qxfer-siginfo-read
 
 # ARM Compatible Semi Hosting Tests
 #
diff --git a/tests/tcg/multiarch/gdbstub/test-qxfer-siginfo-read.py 
b/tests/tcg/multiarch/gdbstub/test-qxfer-siginfo-read.py
new file mode 100644
index 0000000000..862596b07a
--- /dev/null
+++ b/tests/tcg/multiarch/gdbstub/test-qxfer-siginfo-read.py
@@ -0,0 +1,26 @@
+from __future__ import print_function
+#
+# Test gdbstub Xfer:siginfo:read stub.
+#
+# The test runs a binary that causes a SIGSEGV and then looks for additional
+# info about the signal through printing GDB's '$_siginfo' special variable,
+# which sends a Xfer:siginfo:read query to the gdbstub.
+#
+# The binary causes a SIGSEGV at dereferencing a pointer with value 0xdeadbeef,
+# so the test looks for and checks if this address is correctly reported by the
+# gdbstub.
+#
+# This is launched via tests/guest-debug/run-test.py
+#
+
+import gdb
+from test_gdbstub import main, report
+
+def run_test():
+"Run through the test"
+
+gdb.execute("continue", False, True)
+resp = gdb.execute("print/x $_siginfo", False, True)
+report(resp.find("si_addr = 0xdeadbeef"), "Found fault address.")
+
+main(run_test)
-- 
2.39.2

[PULL 5/8] gdbstub: Save target's siginfo

2024-03-12 Thread Alex Bennée

From: Gustavo Romero 

Save target's siginfo into gdbserver_state so it can be used later, for
example, in any stub that requires the target's si_signo and si_code.

This change affects only linux-user mode.

Signed-off-by: Gustavo Romero 
Suggested-by: Richard Henderson 
Message-Id: <20240309030901.1726211-4-gustavo.rom...@linaro.org>
Signed-off-by: Alex Bennée 
Reviewed-by: Richard Henderson 

diff --git a/include/gdbstub/user.h b/include/gdbstub/user.h
index 4fc88f8a25..3b8358e3da 100644
--- a/include/gdbstub/user.h
+++ b/include/gdbstub/user.h
@@ -9,11 +9,15 @@
 #ifndef GDBSTUB_USER_H
 #define GDBSTUB_USER_H
 
+#define MAX_SIGINFO_LENGTH 128
+
 /**
  * gdb_handlesig() - yield control to gdb
  * @cpu: CPU
  * @sig: if non-zero, the signal number which caused us to stop
  * @reason: stop reason for stop reply packet or NULL
+ * @siginfo: target-specific siginfo struct
+ * @siginfo_len: target-specific siginfo struct length
  *
  * This function yields control to gdb, when a user-mode-only target
  * needs to stop execution. If @sig is non-zero, then we will send a
@@ -25,7 +29,7 @@
  * or 0 if no signal should be delivered, ie the signal that caused
  * us to stop should be ignored.
  */
-int gdb_handlesig(CPUState *, int, const char *);
+int gdb_handlesig(CPUState *, int, const char *, void *, int);
 
 /**
  * gdb_signalled() - inform remote gdb of sig exit
diff --git a/bsd-user/main.c b/bsd-user/main.c
index 3dc285e5b7..01b313756e 100644
--- a/bsd-user/main.c
+++ b/bsd-user/main.c
@@ -606,7 +606,7 @@ int main(int argc, char **argv)
 
 if (gdbstub) {
 gdbserver_start(gdbstub);
-gdb_handlesig(cpu, 0);
+gdb_handlesig(cpu, 0, NULL, NULL, 0);
 }
 cpu_loop(env);
 /* never exits */
diff --git a/bsd-user/signal.c b/bsd-user/signal.c
index e9f80a06d3..2936eeb7a8 100644
--- a/bsd-user/signal.c
+++ b/bsd-user/signal.c
@@ -27,6 +27,9 @@
 #include "hw/core/tcg-cpu-ops.h"
 #include "host-signal.h"
 
+/* target_siginfo_t must fit in gdbstub's siginfo save area. */
+QEMU_BUILD_BUG_ON(sizeof(target_siginfo_t) > MAX_SIGINFO_LENGTH);
+
 static struct target_sigaction sigact_table[TARGET_NSIG];
 static void host_signal_handler(int host_sig, siginfo_t *info, void *puc);
 static void target_to_host_sigset_internal(sigset_t *d,
@@ -890,7 +893,7 @@ static void handle_pending_signal(CPUArchState *env, int 
sig,
 
 k->pending = 0;
 
-sig = gdb_handlesig(cpu, sig);
+sig = gdb_handlesig(cpu, sig, NULL, &k->info, sizeof(k->info));
 if (!sig) {
 sa = NULL;
 handler = TARGET_SIG_IGN;
diff --git a/gdbstub/user.c b/gdbstub/user.c
index 520987fddc..cf693bfbc4 100644
--- a/gdbstub/user.c
+++ b/gdbstub/user.c
@@ -95,6 +95,8 @@ typedef struct {
 enum GDBForkState fork_state;
 int fork_sockets[2];
 pid_t fork_peer_pid, fork_peer_tid;
+uint8_t siginfo[MAX_SIGINFO_LENGTH];
+unsigned long siginfo_len;
 } GDBUserState;
 
 static GDBUserState gdbserver_user_state;
@@ -190,7 +192,8 @@ void gdb_qemu_exit(int code)
 exit(code);
 }
 
-int gdb_handlesig(CPUState *cpu, int sig, const char *reason)
+int gdb_handlesig(CPUState *cpu, int sig, const char *reason, void *siginfo,
+  int siginfo_len)
 {
 char buf[256];
 int n;
@@ -199,6 +202,18 @@ int gdb_handlesig(CPUState *cpu, int sig, const char 
*reason)
 return sig;
 }
 
+if (siginfo) {
+/*
+ * Save target-specific siginfo.
+ *
+ * siginfo size, i.e. siginfo_len, is asserted at compile-time to fit in
+ * gdbserver_user_state.siginfo, usually in the source file calling
+ * gdb_handlesig. See, for instance, {linux,bsd}-user/signal.c.
+ */
+memcpy(gdbserver_user_state.siginfo, siginfo, siginfo_len);
+gdbserver_user_state.siginfo_len = siginfo_len;
+}
+
 /* disable single step if it was enabled */
 cpu_single_step(cpu, 0);
 tb_flush(cpu);
@@ -746,7 +761,7 @@ void gdb_breakpoint_remove_all(CPUState *cs)
 void gdb_syscall_handling(const char *syscall_packet)
 {
 gdb_put_packet(syscall_packet);
-gdb_handlesig(gdbserver_state.c_cpu, 0, NULL);
+gdb_handlesig(gdbserver_state.c_cpu, 0, NULL, NULL, 0);
 }
 
 static bool should_catch_syscall(int num)
@@ -764,7 +779,7 @@ void gdb_syscall_entry(CPUState *cs, int num)
 {
 if (should_catch_syscall(num)) {
 g_autofree char *reason = g_strdup_printf("syscall_entry:%x;", num);
-gdb_handlesig(cs, gdb_target_sigtrap(), reason);
+gdb_handlesig(cs, gdb_target_sigtrap(), reason, NULL, 0);
 }
 }
 
@@ -772,7 +787,7 @@ void gdb_syscall_return(CPUState *cs, int num)
 {
 if (should_catch_syscall(num)) {
 g_autofree char *reason = g_strdup_printf("syscall_return:%x;", num);
-gdb_handlesig(cs, gdb_target_sigtrap(), reason);
+gdb_handlesig(cs, gdb_target_sigtrap(), reason, NULL, 0);
 }
 }
 
diff --git a/linux-user/main.c b/linux-user/main.c
index 55aa11c9b4..9277df2e9d 100644
---
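
The save-area pattern in the patch above (a fixed-size buffer, a compile-time
size check in the caller's source file, and a plain memcpy at run time) can be
sketched in isolation as follows. This is an illustration under assumptions,
not the QEMU code: the struct layout, the SKETCH_ names and the use of C11
_Static_assert in place of QEMU_BUILD_BUG_ON are all stand-ins.

```c
#include <string.h>

#define SKETCH_MAX_SIGINFO 128 /* hypothetical save-area size */

/* Hypothetical, simplified siginfo layout. */
struct sketch_siginfo {
    int signo;
    int code;
    void *addr;
};

/* The build fails if the struct ever outgrows the save area. */
_Static_assert(sizeof(struct sketch_siginfo) <= SKETCH_MAX_SIGINFO,
               "siginfo must fit in the save area");

static unsigned char sketch_area[SKETCH_MAX_SIGINFO];
static unsigned long sketch_area_len;

/* Copies the caller's siginfo; a NULL pointer leaves the old copy intact. */
static void sketch_save(const void *info, unsigned long len)
{
    if (info) {
        memcpy(sketch_area, info, len);
        sketch_area_len = len;
    }
}
```

Because the size is checked at build time, the run-time path needs no length
test, which is the trade-off the comment in gdb_handlesig() describes.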

[PULL 8/8] gdbstub: Fix double close() of the follow-fork-mode socket

2024-03-12 Thread Alex Bennée

From: Ilya Leoshkevich 

When the terminal GDB_FORK_ENABLED state is reached, the coordination
socket is not needed anymore and is therefore closed. However, if there
is a communication error between QEMU gdbstub and GDB, the generic
error handling code attempts to close it again.

Fix by closing it later - before returning - instead.

Fixes: Coverity CID 1539966
Fixes: d547e711a8a5 ("gdbstub: Implement follow-fork-mode child")
Signed-off-by: Ilya Leoshkevich 
Signed-off-by: Alex Bennée 
Message-Id: <20240312001813.13720-1-...@linux.ibm.com>

diff --git a/gdbstub/user.c b/gdbstub/user.c
index 2005f3312b..edeb72efeb 100644
--- a/gdbstub/user.c
+++ b/gdbstub/user.c
@@ -517,6 +517,7 @@ void gdbserver_fork_end(CPUState *cpu, pid_t pid)
 switch (gdbserver_user_state.fork_state) {
 case GDB_FORK_ENABLED:
 if (gdbserver_user_state.running_state) {
+close(fd);
 return;
 }
 QEMU_FALLTHROUGH;
@@ -542,7 +543,6 @@ void gdbserver_fork_end(CPUState *cpu, pid_t pid)
 gdbserver_user_state.fork_state = GDB_FORK_ACTIVE;
 break;
 case GDB_FORK_ENABLE:
-close(fd);
 gdbserver_user_state.fork_state = GDB_FORK_ENABLED;
 break;
 case GDB_FORK_DISABLE:
@@ -557,7 +557,6 @@ void gdbserver_fork_end(CPUState *cpu, pid_t pid)
 if (write(fd, , 1) != 1) {
 goto fail;
 }
-close(fd);
 gdbserver_user_state.fork_state = GDB_FORK_ENABLED;
 break;
 case GDB_FORK_DISABLING:
-- 
2.39.2
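
The general shape of this kind of fix, where a descriptor is closed at one
point shared by the success and error paths instead of in the middle of a
state transition, might look like the sketch below. It is not the gdbstub
code: the state names, the simulate_error flag and the function are
hypothetical and only illustrate "close exactly once".

```c
#include <unistd.h>

/* Hypothetical states, loosely modelled on the fork_state values above. */
enum sketch_state { SKETCH_ENABLED, SKETCH_FAILED };

/*
 * Writes one byte to fd, then closes fd at a single point that both the
 * success path and the error path go through.
 */
static enum sketch_state sketch_finish(int fd, int simulate_error)
{
    enum sketch_state next = SKETCH_ENABLED;
    char b = 0;

    if (simulate_error || write(fd, &b, 1) != 1) {
        next = SKETCH_FAILED;
    }
    close(fd); /* the only close() in this function */
    return next;
}
```

A caller can confirm the descriptor is gone afterwards: a second close() on
the same number fails, as long as nothing else opened a descriptor in between.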

[PULL 1/8] gitlab: aggressively avoid extra GIT data

2024-03-12 Thread Alex Bennée

This avoids fetching blobs and tree references for branches we are not
going to worry about. Also skip tag references which are similarly not
useful and keep the default --prune. This keeps the .git data to
around 100M rather than the ~400M even a shallow clone takes.

So that we can check the savings, we also run a quick du while setting up
the build.

We also have to have special settings of GIT_FETCH_EXTRA_FLAGS for the
Windows build (git too old?) and the migration legacy test where we
build an older QEMU alongside the main one.

Signed-off-by: Alex Bennée 

diff --git a/.gitlab-ci.d/base.yml b/.gitlab-ci.d/base.yml
index 2dd8a9b57c..bf3d8efab6 100644
--- a/.gitlab-ci.d/base.yml
+++ b/.gitlab-ci.d/base.yml
@@ -24,6 +24,10 @@ variables:
 # Each script line from will be in a collapsible section in the job output
 # and show the duration of each line.
 FF_SCRIPT_SECTIONS: 1
+# The project has a fairly fat GIT repo so we try and avoid bringing in things
+# we don't need. The --filter options avoid blobs and tree references we aren't going to use
+# and we also avoid fetching tags.
+GIT_FETCH_EXTRA_FLAGS: --filter=blob:none --filter=tree:0 --no-tags --prune --quiet
 
   interruptible: true
 
diff --git a/.gitlab-ci.d/buildtest-template.yml 
b/.gitlab-ci.d/buildtest-template.yml
index 4fbfeb6667..22045add80 100644
--- a/.gitlab-ci.d/buildtest-template.yml
+++ b/.gitlab-ci.d/buildtest-template.yml
@@ -14,6 +14,7 @@
 - export CCACHE_DIR="$CCACHE_BASEDIR/ccache"
 - export CCACHE_MAXSIZE="500M"
 - export PATH="$CCACHE_WRAPPERSDIR:$PATH"
+- du -sh .git
 - mkdir build
 - cd build
 - ccache --zero-stats
diff --git a/.gitlab-ci.d/buildtest.yml b/.gitlab-ci.d/buildtest.yml
index c7d92fc301..cfdff175c3 100644
--- a/.gitlab-ci.d/buildtest.yml
+++ b/.gitlab-ci.d/buildtest.yml
@@ -187,6 +187,8 @@ build-previous-qemu:
   variables:
 IMAGE: opensuse-leap
 TARGETS: x86_64-softmmu aarch64-softmmu
+# Override the default flags as we need more to grab the old version
+GIT_FETCH_EXTRA_FLAGS: --prune --quiet
   before_script:
 - export QEMU_PREV_VERSION="$(sed 's/\([0-9.]*\)\.[0-9]*/v\1.0/' VERSION)"
 - git remote add upstream https://gitlab.com/qemu-project/qemu
diff --git a/.gitlab-ci.d/windows.yml b/.gitlab-ci.d/windows.yml
index f116b8012d..94834269ec 100644
--- a/.gitlab-ci.d/windows.yml
+++ b/.gitlab-ci.d/windows.yml
@@ -28,6 +28,8 @@ msys2-64bit:
 # qTests don't run successfully with "--without-default-devices",
 # so let's exclude the qtests from CI for now.
 TEST_ARGS: --no-suite qtest
+# The Windows git is a bit older so override the default
+GIT_FETCH_EXTRA_FLAGS: --no-tags --prune --quiet
   artifacts:
 name: "$CI_JOB_NAME-$CI_COMMIT_REF_SLUG"
 expire_in: 7 days
-- 
2.39.2

Re: [PATCH 4/5] plugins: conditional callbacks

2024-03-12 Thread Pierrick Bouvier


On 3/12/24 10:03, Pierrick Bouvier wrote:

On 3/11/24 19:43, Alex Bennée wrote:

Pierrick Bouvier  writes:


Extend the plugins API to support callbacks that are called only when a
given criterion (evaluated inline) is met.

Added functions:
- qemu_plugin_register_vcpu_tb_exec_cond_cb
- qemu_plugin_register_vcpu_insn_exec_cond_cb

They expect as parameter a condition, a qemu_plugin_u64_t (op1) and an
immediate (op2). Callback is called if op1 |cond| op2 is true.

Signed-off-by: Pierrick Bouvier 


   
+static TCGCond plugin_cond_to_tcgcond(enum qemu_plugin_cond cond)

+{
+switch (cond) {
+case QEMU_PLUGIN_COND_EQ:
+return TCG_COND_EQ;
+case QEMU_PLUGIN_COND_NE:
+return TCG_COND_NE;
+case QEMU_PLUGIN_COND_LT:
+return TCG_COND_LTU;
+case QEMU_PLUGIN_COND_LE:
+return TCG_COND_LEU;
+case QEMU_PLUGIN_COND_GT:
+return TCG_COND_GTU;
+case QEMU_PLUGIN_COND_GE:
+return TCG_COND_GEU;
+default:
+/* ALWAYS and NEVER conditions should never reach */
+g_assert_not_reached();
+}
+}
+
+static TCGOp *append_cond_udata_cb(const struct qemu_plugin_dyn_cb *cb,
+   TCGOp *begin_op, TCGOp *op, int *cb_idx)
+{
+char *ptr = cb->cond_cb.entry.score->data->data;
+size_t elem_size = g_array_get_element_size(
+cb->cond_cb.entry.score->data);
+size_t offset = cb->cond_cb.entry.offset;
+/* Condition should be negated, as calling the cb is the "else" path */
+TCGCond cond = tcg_invert_cond(plugin_cond_to_tcgcond(cb->cond_cb.cond));
+
+op = copy_const_ptr(&begin_op, op, ptr);


This line was wrong, and cb->userp should be copied instead.
I'll fix this, add a test specifically checking udata for conditional 
callback and resend the series.



+op = copy_ld_i32(&begin_op, op);
+op = copy_mul_i32(&begin_op, op, elem_size);
+op = copy_ext_i32_ptr(&begin_op, op);
+op = copy_const_ptr(&begin_op, op, ptr + offset);
+op = copy_add_ptr(&begin_op, op);
+op = copy_ld_i64(&begin_op, op);
+op = copy_brcondi_i64(&begin_op, op, cond, cb->cond_cb.imm);
+op = copy_call(&begin_op, op, cb->f.vcpu_udata, cb_idx);
+op = copy_set_label(&begin_op, op);
+return op;


I think we are missing something here to ensure that udata is set
correctly for the callback, see my RFC:

Subject: [RFC PATCH] contrib/plugins: control flow plugin (WIP!)
Date: Mon, 11 Mar 2024 15:34:32 +
Message-Id: <20240311153432.1395190-1-alex.ben...@linaro.org>

which is seeing the same value every time in the callback.



I'm trying to reproduce and will answer on this thread.
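
To make the semantics discussed in this thread concrete, here is a minimal, self-contained C sketch of "call the callback if op1 |cond| op2 is true", including the inverted branch that skips the call. It is not QEMU code: sketch_cond, sketch_cond_holds(), sketch_invert() and sketch_callback_runs() are hypothetical names, and the unsigned comparisons are an assumption taken from the LTU/LEU/GTU/GEU mapping in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for enum qemu_plugin_cond; not the QEMU definition. */
enum sketch_cond {
    SKETCH_COND_EQ,
    SKETCH_COND_NE,
    SKETCH_COND_LT,
    SKETCH_COND_LE,
    SKETCH_COND_GT,
    SKETCH_COND_GE,
};

/* Evaluate "op1 cond op2" using unsigned 64-bit comparisons. */
bool sketch_cond_holds(enum sketch_cond cond, uint64_t op1, uint64_t op2)
{
    switch (cond) {
    case SKETCH_COND_EQ: return op1 == op2;
    case SKETCH_COND_NE: return op1 != op2;
    case SKETCH_COND_LT: return op1 < op2;
    case SKETCH_COND_LE: return op1 <= op2;
    case SKETCH_COND_GT: return op1 > op2;
    case SKETCH_COND_GE: return op1 >= op2;
    }
    assert(0 && "unknown condition");
    return false;
}

/* Logical negation of a condition, playing the role of tcg_invert_cond(). */
enum sketch_cond sketch_invert(enum sketch_cond cond)
{
    switch (cond) {
    case SKETCH_COND_EQ: return SKETCH_COND_NE;
    case SKETCH_COND_NE: return SKETCH_COND_EQ;
    case SKETCH_COND_LT: return SKETCH_COND_GE;
    case SKETCH_COND_LE: return SKETCH_COND_GT;
    case SKETCH_COND_GT: return SKETCH_COND_LE;
    case SKETCH_COND_GE: return SKETCH_COND_LT;
    }
    assert(0 && "unknown condition");
    return cond;
}

/*
 * Model of the generated code: branch over the call when the inverted
 * condition holds, so the call sits on the "else" path.
 * Returns true when the callback would be invoked.
 */
bool sketch_callback_runs(enum sketch_cond cond, uint64_t op1, uint64_t imm)
{
    if (sketch_cond_holds(sketch_invert(cond), op1, imm)) {
        return false; /* branch taken: the call is skipped */
    }
    return true;      /* fall through: the callback is called */
}
```

With this model, sketch_callback_runs(cond, op1, imm) always agrees with sketch_cond_holds(cond, op1, imm), which is the property the negated branch condition is meant to preserve.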

Re: [PATCH 07/13] ppc/pnv: Permit ibm,pa-features set per machine variant

2024-03-12 Thread Cédric Le Goater


On 3/11/24 19:51, Nicholas Piggin wrote:

This allows different pa-features for powernv8/9/10.

Cc: "Cédric Le Goater" 
Cc: "Frédéric Barrat" 
Signed-off-by: Nicholas Piggin 


The features could be a chip class attribute instead.

Thanks,

C.



---
  hw/ppc/pnv.c | 41 +
  1 file changed, 29 insertions(+), 12 deletions(-)

diff --git a/hw/ppc/pnv.c b/hw/ppc/pnv.c
index aa9786e970..52d964f77a 100644
--- a/hw/ppc/pnv.c
+++ b/hw/ppc/pnv.c
@@ -133,7 +133,7 @@ static int get_cpus_node(void *fdt)
   * device tree, used in XSCOM to address cores and in interrupt
   * servers.
   */
-static void pnv_dt_core(PnvChip *chip, PnvCore *pc, void *fdt)
+static int pnv_dt_core(PnvChip *chip, PnvCore *pc, void *fdt)
  {
  PowerPCCPU *cpu = pc->threads[0];
  CPUState *cs = CPU(cpu);
@@ -149,11 +149,6 @@ static void pnv_dt_core(PnvChip *chip, PnvCore *pc, void *fdt)
  uint32_t cpufreq = 10;
  uint32_t page_sizes_prop[64];
  size_t page_sizes_prop_size;
-const uint8_t pa_features[] = { 24, 0,
-0xf6, 0x3f, 0xc7, 0xc0, 0x00, 0xf0,
-0x80, 0x00, 0x00, 0x00, 0x00, 0x00,
-0x00, 0x00, 0x00, 0x00, 0x80, 0x00,
-0x80, 0x00, 0x80, 0x00, 0x80, 0x00 };
  int offset;
  char *nodename;
  int cpus_offset = get_cpus_node(fdt);
@@ -236,15 +231,14 @@ static void pnv_dt_core(PnvChip *chip, PnvCore *pc, void *fdt)
 page_sizes_prop, page_sizes_prop_size)));
  }
  
-_FDT((fdt_setprop(fdt, offset, "ibm,pa-features",
-   pa_features, sizeof(pa_features))));
-
  /* Build interrupt servers properties */
  for (i = 0; i < smt_threads; i++) {
  servers_prop[i] = cpu_to_be32(pc->pir + i);
  }
  _FDT((fdt_setprop(fdt, offset, "ibm,ppc-interrupt-server#s",
 servers_prop, sizeof(*servers_prop) * smt_threads)));
+
+return offset;
  }
  
  static void pnv_dt_icp(PnvChip *chip, void *fdt, uint32_t pir,

@@ -299,6 +293,17 @@ PnvChip *pnv_chip_add_phb(PnvChip *chip, PnvPHB *phb)
  return chip;
  }
  
+/*

+ * Same as spapr pa_features_207 except pnv always enables CI largepages bit.
+ * HTM is always enabled because TCG does implement HTM, it's just a
+ * degenerate implementation.
+ */
+static const uint8_t pa_features_207[] = { 24, 0,
+ 0xf6, 0x3f, 0xc7, 0xc0, 0x00, 0xf0,
+ 0x80, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x80, 0x00,
+ 0x80, 0x00, 0x80, 0x00, 0x80, 0x00 };
+
  static void pnv_chip_power8_dt_populate(PnvChip *chip, void *fdt)
  {
  static const char compat[] = "ibm,power8-xscom\0ibm,xscom";
@@ -311,8 +316,12 @@ static void pnv_chip_power8_dt_populate(PnvChip *chip, void *fdt)
  
  for (i = 0; i < chip->nr_cores; i++) {

  PnvCore *pnv_core = chip->cores[i];
+int offset;
+
+offset = pnv_dt_core(chip, pnv_core, fdt);
  
-pnv_dt_core(chip, pnv_core, fdt);

+_FDT((fdt_setprop(fdt, offset, "ibm,pa-features",
+   pa_features_207, sizeof(pa_features_207))));
  
  /* Interrupt Control Presenters (ICP). One per core. */

  pnv_dt_icp(chip, fdt, pnv_core->pir, CPU_CORE(pnv_core)->nr_threads);
@@ -335,8 +344,12 @@ static void pnv_chip_power9_dt_populate(PnvChip *chip, void *fdt)
  
  for (i = 0; i < chip->nr_cores; i++) {

  PnvCore *pnv_core = chip->cores[i];
+int offset;
  
-pnv_dt_core(chip, pnv_core, fdt);

+offset = pnv_dt_core(chip, pnv_core, fdt);
+
+_FDT((fdt_setprop(fdt, offset, "ibm,pa-features",
+   pa_features_207, sizeof(pa_features_207))));
  }
  
  if (chip->ram_size) {

@@ -358,8 +371,12 @@ static void pnv_chip_power10_dt_populate(PnvChip *chip, void *fdt)
  
  for (i = 0; i < chip->nr_cores; i++) {

  PnvCore *pnv_core = chip->cores[i];
+int offset;
+
+offset = pnv_dt_core(chip, pnv_core, fdt);
  
-pnv_dt_core(chip, pnv_core, fdt);

+_FDT((fdt_setprop(fdt, offset, "ibm,pa-features",
+   pa_features_207, sizeof(pa_features_207))));
  }
  
  if (chip->ram_size) {
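
The suggestion that the features "could be a chip class attribute instead" can be illustrated with a small, self-contained C sketch. Everything here is hypothetical and simplified: SketchChipClass, SketchProp and sketch_set_pa_features() are stand-ins invented for illustration, not the PnvChipClass layout or the libfdt API. Only the byte values are taken from pa_features_207 in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a per-variant chip class. */
typedef struct SketchChipClass {
    const char *name;
    const uint8_t *pa_features;   /* NULL when the variant sets no property */
    size_t pa_features_size;
} SketchChipClass;

/* Byte values copied from pa_features_207 in the patch. */
static const uint8_t sketch_pa_features_207[] = { 24, 0,
    0xf6, 0x3f, 0xc7, 0xc0, 0x00, 0xf0,
    0x80, 0x00, 0x00, 0x00, 0x00, 0x00,
    0x00, 0x00, 0x00, 0x00, 0x80, 0x00,
    0x80, 0x00, 0x80, 0x00, 0x80, 0x00 };

/* Each variant points its class at the table it wants. */
const SketchChipClass sketch_power8_class = {
    .name = "power8",
    .pa_features = sketch_pa_features_207,
    .pa_features_size = sizeof(sketch_pa_features_207),
};

/* A variant that advertises nothing, to exercise the NULL path. */
const SketchChipClass sketch_no_features_class = { .name = "none" };

/* Hypothetical stand-in for the property write; records what would be set. */
typedef struct SketchProp {
    const uint8_t *data;
    size_t size;
} SketchProp;

/* Shared code path: no per-variant copy of the property-setting call. */
int sketch_set_pa_features(const SketchChipClass *cc, SketchProp *prop)
{
    assert(cc && prop);
    if (!cc->pa_features) {
        return -1;
    }
    prop->data = cc->pa_features;
    prop->size = cc->pa_features_size;
    return 0;
}
```

Under this assumption the three dt_populate functions would not each need their own copy of the property-setting call; the common core code could read the table from the class.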

Re: [PATCH v2 10/29] block/snapshot: Fix missing ERRP_GUARD() for error_prepend()

2024-03-12 Thread Thomas Huth


On 11/03/2024 04.38, Zhao Liu wrote:

From: Zhao Liu 

As the comment in qapi/error, passing @errp to error_prepend() requires
ERRP_GUARD():

* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
...
* - It should not be passed to error_prepend(), error_vprepend() or
*   error_append_hint(), because that doesn't work with &error_fatal.
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or &error_fatal.

ERRP_GUARD() avoids the case where @errp is &error_fatal and the user
can't see this additional information, because exit() happens in
error_setg() before the information is added [1].

In block/snapshot.c, there are 2 functions passing @errp to
error_prepend() without ERRP_GUARD():
  - bdrv_all_delete_snapshot()
  - bdrv_all_goto_snapshot()

As APIs exposed in include/block/snapshot.h, they could be called
by other modules.

To avoid the potential issue described in [1], add the missing
ERRP_GUARD() at the beginning of these 2 functions.

[1]: Issue description in the commit message of commit ae7c80a7bd73
  ("error: New macro ERRP_GUARD()").

Cc: Kevin Wolf 
Cc: Hanna Reitz 
Cc: qemu-bl...@nongnu.org
Signed-off-by: Zhao Liu 
---
  block/snapshot.c | 2 ++
  1 file changed, 2 insertions(+)

diff --git a/block/snapshot.c b/block/snapshot.c
index 8694fc0a3eba..8242b4abac41 100644
--- a/block/snapshot.c
+++ b/block/snapshot.c
@@ -566,6 +566,7 @@ int bdrv_all_delete_snapshot(const char *name,
   bool has_devices, strList *devices,
   Error **errp)
  {
+ERRP_GUARD();
  g_autoptr(GList) bdrvs = NULL;
  GList *iterbdrvs;
  
@@ -605,6 +606,7 @@ int bdrv_all_goto_snapshot(const char *name,

 bool has_devices, strList *devices,
 Error **errp)
  {
+ERRP_GUARD();
  g_autoptr(GList) bdrvs = NULL;
  GList *iterbdrvs;
  int ret;


Reviewed-by: Thomas Huth
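
The ordering problem that ERRP_GUARD() addresses can be shown with a self-contained mock. This is a sketch under stated assumptions, not QEMU's error API: SketchError, sketch_error_fatal and all sketch_* functions are hypothetical, the "fatal" path records the message instead of calling exit(), and the guard is spelled out by hand as "use a local error, then propagate" rather than as a macro.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct SketchError {
    char msg[128];
} SketchError;

/* Sentinel playing the role of &error_fatal. */
static SketchError *sketch_error_fatal;
/* What the "fatal" path reported; real code would print and exit(1). */
static char sketch_fatal_report[128];

static void sketch_report_fatal(const char *msg)
{
    snprintf(sketch_fatal_report, sizeof(sketch_fatal_report), "%s", msg);
}

void sketch_error_set(SketchError **errp, const char *msg)
{
    if (errp == &sketch_error_fatal) {
        sketch_report_fatal(msg);   /* reported immediately, context is lost */
        return;
    }
    if (errp) {
        *errp = malloc(sizeof(**errp));
        assert(*errp);
        snprintf((*errp)->msg, sizeof((*errp)->msg), "%s", msg);
    }
}

void sketch_error_prepend(SketchError **errp, const char *prefix)
{
    char tmp[128];

    if (!errp || !*errp) {
        return;                     /* nothing stored, nothing to prepend to */
    }
    snprintf(tmp, sizeof(tmp), "%s%s", prefix, (*errp)->msg);
    memcpy((*errp)->msg, tmp, sizeof(tmp));
}

void sketch_error_propagate(SketchError **dst, SketchError *local)
{
    if (!local) {
        return;
    }
    if (dst == &sketch_error_fatal) {
        sketch_report_fatal(local->msg);
    } else if (dst) {
        *dst = local;               /* ownership moves to the caller */
        return;
    }
    free(local);
}

/* Without a guard: the prefix is added after the fatal report. */
int sketch_load_unguarded(SketchError **errp)
{
    sketch_error_set(errp, "read failed");
    sketch_error_prepend(errp, "loading kernel: ");
    return -1;
}

/* Guard pattern by hand: build the full message locally, then propagate. */
int sketch_load_guarded(SketchError **errp)
{
    SketchError *local = NULL;

    sketch_error_set(&local, "read failed");
    sketch_error_prepend(&local, "loading kernel: ");
    sketch_error_propagate(errp, local);
    return -1;
}
```

Calling sketch_load_unguarded(&sketch_error_fatal) leaves only the bare message in sketch_fatal_report, while sketch_load_guarded(&sketch_error_fatal) reports the message with its prefix.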

[PULL 07/13] sun4u: remap ebus BAR0 to use unassigned_io_ops instead of alias to PCI IO space

2024-03-12 Thread Philippe Mathieu-Daudé

From: Mark Cave-Ayland 

During kernel startup OpenBSD accesses addresses mapped by BAR0 of the ebus
device but at offsets where no IO devices exist. Before commit 4aa07e8649
("hw/sparc64/ebus: Access memory regions via pci_address_space_io()") BAR0 was
mapped to legacy IO space, which allows accesses to unmapped devices to
succeed, but afterwards these accesses to unmapped PCI IO space cause a memory
fault which prevents OpenBSD from booting.

Since no devices are mapped at the addresses accessed by OpenBSD, change ebus
BAR0 from a PCI IO space alias to an IO memory region using unassigned_io_ops,
which allows these accesses to succeed and so allows OpenBSD to boot once
again.

Fixes: 4aa07e8649 ("hw/sparc64/ebus: Access memory regions via pci_address_space_io()")
Signed-off-by: Mark Cave-Ayland 
Reviewed-by: Philippe Mathieu-Daudé 
Message-ID: <20240311064345.2531197-1-mark.cave-ayl...@ilande.co.uk>
Signed-off-by: Philippe Mathieu-Daudé 
---
 hw/sparc64/sun4u.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/hw/sparc64/sun4u.c b/hw/sparc64/sun4u.c
index eda9b58a21..cff6d5abaf 100644
--- a/hw/sparc64/sun4u.c
+++ b/hw/sparc64/sun4u.c
@@ -360,8 +360,13 @@ static void ebus_realize(PCIDevice *pci_dev, Error **errp)
 pci_dev->config[0x09] = 0x00; // programming i/f
 pci_dev->config[0x0D] = 0x0a; // latency_timer
 
-memory_region_init_alias(&s->bar0, OBJECT(s), "bar0",
- pci_address_space_io(pci_dev), 0, 0x100);
+/*
+ * BAR0 is accessed by OpenBSD but not for ebus device access: allow any
+ * memory access to this region to succeed which allows the OpenBSD kernel
+ * to boot.
+ */
+memory_region_init_io(&s->bar0, OBJECT(s), &unassigned_io_ops, s,
+  "bar0", 0x100);
 pci_register_bar(pci_dev, 0, PCI_BASE_ADDRESS_SPACE_MEMORY, &s->bar0);
 memory_region_init_alias(&s->bar1, OBJECT(s), "bar1",
  pci_address_space_io(pci_dev), 0, 0x8000);
-- 
2.41.0

[PULL 04/13] hw/core/loader-fit: Fix missing ERRP_GUARD() for error_prepend()

2024-03-12 Thread Philippe Mathieu-Daudé

From: Zhao Liu 

As the comment in qapi/error, passing @errp to error_prepend() requires
ERRP_GUARD():

* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
...
* - It should not be passed to error_prepend(), error_vprepend() or
*   error_append_hint(), because that doesn't work with &error_fatal.
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or &error_fatal.

ERRP_GUARD() avoids the case where @errp is &error_fatal and the user
can't see this additional information, because exit() happens in
error_setg() before the information is added [1].

In hw/core/loader-fit.c, there are 2 functions passing @errp to
error_prepend() without ERRP_GUARD():
 - fit_load_kernel()
 - fit_load_fdt()

Their @errp parameters both point to the local @err variable in
load_fit().

Though they don't cause the issue described in [1], add the missing
ERRP_GUARD() at their beginning to follow the requirement of @errp.

[1]: Issue description in the commit message of commit ae7c80a7bd73
 ("error: New macro ERRP_GUARD()").

Cc: Paul Burton 
Cc: Aleksandar Rikalo 
Signed-off-by: Zhao Liu 
Message-ID: <20240311033822.3142585-15-zhao1@linux.intel.com>
Signed-off-by: Philippe Mathieu-Daudé 
---
 hw/core/loader-fit.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/hw/core/loader-fit.c b/hw/core/loader-fit.c
index b7c7b3ba94..9f20007dbb 100644
--- a/hw/core/loader-fit.c
+++ b/hw/core/loader-fit.c
@@ -120,6 +120,7 @@ static int fit_load_kernel(const struct fit_loader *ldr, const void *itb,
int cfg, void *opaque, hwaddr *pend,
Error **errp)
 {
+ERRP_GUARD();
 const char *name;
 const void *data;
 const void *load_data;
@@ -178,6 +179,7 @@ static int fit_load_fdt(const struct fit_loader *ldr, const void *itb,
 int cfg, void *opaque, const void *match_data,
 hwaddr kernel_end, Error **errp)
 {
+ERRP_GUARD();
 Error *err = NULL;
 const char *name;
 const void *data;
-- 
2.41.0

[PULL 02/13] hw/pci: add some convenient trace-events for pcie and shpc hotplug

2024-03-12 Thread Philippe Mathieu-Daudé

From: Vladimir Sementsov-Ogievskiy 

Add trace-events that may help to debug problems with hotplugging.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Philippe Mathieu-Daudé 
Message-ID: <20240301154146.761531-2-vsement...@yandex-team.ru>
Signed-off-by: Philippe Mathieu-Daudé 
---
 hw/pci/pcie.c   | 56 +
 hw/pci/shpc.c   | 46 +
 hw/pci/trace-events |  6 +
 3 files changed, 108 insertions(+)

diff --git a/hw/pci/pcie.c b/hw/pci/pcie.c
index 6db0cf69cd..f56079acf5 100644
--- a/hw/pci/pcie.c
+++ b/hw/pci/pcie.c
@@ -28,6 +28,7 @@
 #include "hw/pci/pcie_regs.h"
 #include "hw/pci/pcie_port.h"
 #include "qemu/range.h"
+#include "trace.h"
 
 //#define DEBUG_PCIE
 #ifdef DEBUG_PCIE
@@ -45,6 +46,23 @@ static bool pcie_sltctl_powered_off(uint16_t sltctl)
 && (sltctl & PCI_EXP_SLTCTL_PIC) == PCI_EXP_SLTCTL_PWR_IND_OFF;
 }
 
+static const char *pcie_led_state_to_str(uint16_t value)
+{
+switch (value) {
+case PCI_EXP_SLTCTL_PWR_IND_ON:
+case PCI_EXP_SLTCTL_ATTN_IND_ON:
+return "on";
+case PCI_EXP_SLTCTL_PWR_IND_BLINK:
+case PCI_EXP_SLTCTL_ATTN_IND_BLINK:
+return "blink";
+case PCI_EXP_SLTCTL_PWR_IND_OFF:
+case PCI_EXP_SLTCTL_ATTN_IND_OFF:
+return "off";
+default:
+return "invalid";
+}
+}
+
 /***
  * pci express capability helper functions
  */
@@ -735,6 +753,28 @@ void pcie_cap_slot_get(PCIDevice *dev, uint16_t *slt_ctl, uint16_t *slt_sta)
 *slt_sta = pci_get_word(exp_cap + PCI_EXP_SLTSTA);
 }
 
+static void find_child_fn(PCIBus *bus, PCIDevice *dev, void *opaque)
+{
+PCIDevice **child = opaque;
+
+if (!*child) {
+*child = dev;
+}
+}
+
+/*
+ * Returns the plugged device or first function of multifunction plugged device
+ */
+static PCIDevice *pcie_cap_slot_find_child(PCIDevice *dev)
+{
+PCIBus *sec_bus = pci_bridge_get_sec_bus(PCI_BRIDGE(dev));
+PCIDevice *child = NULL;
+
+pci_for_each_device(sec_bus, pci_bus_num(sec_bus), find_child_fn, &child);
+
+return child;
+}
+
 void pcie_cap_slot_write_config(PCIDevice *dev,
 uint16_t old_slt_ctl, uint16_t old_slt_sta,
 uint32_t addr, uint32_t val, int len)
@@ -779,6 +819,22 @@ void pcie_cap_slot_write_config(PCIDevice *dev,
 sltsta);
 }
 
+if (trace_event_get_state_backends(TRACE_PCIE_CAP_SLOT_WRITE_CONFIG)) {
+DeviceState *parent = DEVICE(dev);
+DeviceState *child = DEVICE(pcie_cap_slot_find_child(dev));
+
+trace_pcie_cap_slot_write_config(
+parent->canonical_path,
+child ? child->canonical_path : "no-child",
+(sltsta & PCI_EXP_SLTSTA_PDS) ? "present" : "not present",
+pcie_led_state_to_str(old_slt_ctl & PCI_EXP_SLTCTL_PIC),
+pcie_led_state_to_str(val & PCI_EXP_SLTCTL_PIC),
+pcie_led_state_to_str(old_slt_ctl & PCI_EXP_SLTCTL_AIC),
+pcie_led_state_to_str(val & PCI_EXP_SLTCTL_AIC),
+(old_slt_ctl & PCI_EXP_SLTCTL_PWR_OFF) ? "off" : "on",
+(val & PCI_EXP_SLTCTL_PWR_OFF) ? "off" : "on");
+}
+
 /*
  * If the slot is populated, power indicator is off and power
  * controller is off, it is safe to detach the devices.
diff --git a/hw/pci/shpc.c b/hw/pci/shpc.c
index d2a5eea69e..aac6f2d034 100644
--- a/hw/pci/shpc.c
+++ b/hw/pci/shpc.c
@@ -8,6 +8,7 @@
 #include "hw/pci/pci.h"
 #include "hw/pci/pci_bus.h"
 #include "hw/pci/msi.h"
+#include "trace.h"
 
 /* TODO: model power only and disabled slot states. */
 /* TODO: handle SERR and wakeups */
@@ -123,6 +124,34 @@
 #define SHPC_PCI_TO_IDX(pci_slot) ((pci_slot) - 1)
 #define SHPC_IDX_TO_PHYSICAL(slot) ((slot) + 1)
 
+static const char *shpc_led_state_to_str(uint8_t value)
+{
+switch (value) {
+case SHPC_LED_ON:
+return "on";
+case SHPC_LED_BLINK:
+return "blink";
+case SHPC_LED_OFF:
+return "off";
+default:
+return "invalid";
+}
+}
+
+static const char *shpc_slot_state_to_str(uint8_t value)
+{
+switch (value) {
+case SHPC_STATE_PWRONLY:
+return "power-only";
+case SHPC_STATE_ENABLED:
+return "enabled";
+case SHPC_STATE_DISABLED:
+return "disabled";
+default:
+return "invalid";
+}
+}
+
 static uint8_t shpc_get_status(SHPCDevice *shpc, int slot, uint16_t msk)
 {
 uint8_t *status = shpc->config + SHPC_SLOT_STATUS(slot);
@@ -302,6 +331,23 @@ static void shpc_slot_command(PCIDevice *d, uint8_t target,
 shpc_set_status(shpc, slot, state, SHPC_SLOT_STATE_MASK);
 }
 
+if (trace_event_get_state_backends(TRACE_SHPC_SLOT_COMMAND)) {
+DeviceState *parent = DEVICE(d);
+int pci_slot = SHPC_IDX_TO_PCI(slot);
+DeviceState *child =
+

[PULL 08/13] hw/core: Cleanup unused included headers in cpu-common.c

2024-03-12 Thread Philippe Mathieu-Daudé

From: Zhao Liu 

Remove unused headers in cpu-common.c:
* qemu/notify.h
* exec/cpu-common.h
* qemu/error-report.h
* qemu/qemu-print.h

Tested by "./configure" and then "make".

Signed-off-by: Zhao Liu 
Reviewed-by: Philippe Mathieu-Daudé 
Message-ID: <20240311075621.3224684-2-zhao1@linux.intel.com>
Signed-off-by: Philippe Mathieu-Daudé 
---
 hw/core/cpu-common.c | 4 
 1 file changed, 4 deletions(-)

diff --git a/hw/core/cpu-common.c b/hw/core/cpu-common.c
index 0108fb11db..4bd9c70a83 100644
--- a/hw/core/cpu-common.c
+++ b/hw/core/cpu-common.c
@@ -22,14 +22,10 @@
 #include "qapi/error.h"
 #include "hw/core/cpu.h"
 #include "sysemu/hw_accel.h"
-#include "qemu/notify.h"
 #include "qemu/log.h"
 #include "qemu/main-loop.h"
 #include "exec/log.h"
-#include "exec/cpu-common.h"
 #include "exec/gdbstub.h"
-#include "qemu/error-report.h"
-#include "qemu/qemu-print.h"
 #include "sysemu/tcg.h"
 #include "hw/boards.h"
 #include "hw/qdev-properties.h"
-- 
2.41.0

Re: [PATCH v2 00/29] Cleanup up to fix missing ERRP_GUARD() for error_prepend()

2024-03-12 Thread Thomas Huth


On 12/03/2024 09.43, Zhao Liu wrote:

Hi Thomas/Markus/Michael,

For the remaining patches, could you please help me merge them next?

Many thanks!


Yes, I'm currently reviewing the ones that don't have a Reviewed-by yet. I 
can pick up the remaining patches if the other maintainers won't pick them 
up for the softfreeze today.


 Thomas



On Tue, Mar 12, 2024 at 09:17:30AM +0100, Philippe Mathieu-Daudé wrote:

Date: Tue, 12 Mar 2024 09:17:30 +0100
From: Philippe Mathieu-Daudé 
Subject: Re: [PATCH v2 00/29] Cleanup up to fix missing ERRP_GUARD() for
  error_prepend()

On 11/3/24 04:37, Zhao Liu wrote:


---
Zhao Liu (29):



hw/core/loader-fit: Fix missing ERRP_GUARD() for error_prepend()
hw/core/qdev-properties-system: Fix missing ERRP_GUARD() for
  error_prepend()
hw/misc/ivshmem: Fix missing ERRP_GUARD() for error_prepend()


I'm queuing these 3 patches, thanks!

Re: [PATCH v9 06/21] i386/cpu: Use APIC ID info to encode cache topo in CPUID[4]

2024-03-12 Thread Zhao Liu

On Mon, Mar 11, 2024 at 05:03:02PM +0800, Xiaoyao Li wrote:
> Date: Mon, 11 Mar 2024 17:03:02 +0800
> From: Xiaoyao Li 
> Subject: Re: [PATCH v9 06/21] i386/cpu: Use APIC ID info to encode cache
>  topo in CPUID[4]
> 
> On 3/10/2024 9:38 PM, Zhao Liu wrote:
> > Hi Xiaoyao,
> > 
> > > >case 3: /* L3 cache info */
> > > > -die_offset = apicid_die_offset(_info);
> > > >if (cpu->enable_l3_cache) {
> > > > +addressable_threads_width = 
> > > > apicid_die_offset(_info);
> > > 
> > > Please get rid of the local variable @addressable_threads_width.
> > > 
> > > It is truly confusing.
> > 
> > There're several reasons for this:
> > 
> > 1. This commit is trying to use APIC ID topology layout to decode 2
> > cache topology fields in CPUID[4], CPUID.04H:EAX[bits 25:14] and
> > CPUID.04H:EAX[bits 31:26]. When there's an addressable_cores_width to map
> > to CPUID.04H:EAX[bits 31:26], it's clearer to also map
> > CPUID.04H:EAX[bits 25:14] to another variable.
> 
> I don't dislike using a variable. I dislike the name of that variable since
> it's misleading

Names are hard to choose...

> 
> > 2. Both variables are temporary in this commit, and they will be
> > replaced by 2 helpers in a follow-up cleanup in this series.
> 
> you mean patch 20?
> 
> I don't see how removing the local variable @addressable_threads_width
> conflicts with patch 20. As a con, it introduces code churn.

Yes...I prefer to wrap it in variables in advance, then the meaning of
the fields is clearer I think.

> > 3. Similarly, to make it easier to clean up later with the helper and
> > for more people to review, it's neater to explicitly indicate the
> > CPUID.04H:EAX[bits 25:14] with a variable here.
> 
> If you do want to keep the variable, please add a comment above it to
> explain the meaning.
>

OK, I'll add comments for both 2 variables. Thanks!
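
For readers following the discussion, a minimal C sketch of how two bit widths derived from the APIC ID layout could be turned into the two CPUID.04H:EAX fields mentioned above. The function name sketch_cpuid4_eax_topo() and its parameters are hypothetical; the "(1 << width) - 1" encoding is an assumption based on both fields holding a "maximum number of addressable IDs minus one" value, and is not a copy of the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * threads_width: number of APIC ID bits below the level that shares the cache.
 * cores_width:   number of APIC ID bits that address cores in the package.
 */
uint32_t sketch_cpuid4_eax_topo(unsigned threads_width, unsigned cores_width)
{
    uint32_t eax = 0;

    /* The fields are 12 and 6 bits wide, so larger widths cannot be encoded. */
    assert(threads_width <= 12 && cores_width <= 6);

    /* CPUID.04H:EAX[bits 25:14]: addressable IDs sharing this cache, minus 1 */
    eax |= ((UINT32_C(1) << threads_width) - 1) << 14;
    /* CPUID.04H:EAX[bits 31:26]: addressable core IDs in the package, minus 1 */
    eax |= ((UINT32_C(1) << cores_width) - 1) << 26;

    return eax;
}
```

For example, a width of 3 for the threads field and 2 for the cores field places 7 in bits 25:14 and 3 in bits 31:26.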

Re: [PATCH v4 20/24] replay: simple auto-snapshot mode for record

2024-03-12 Thread Pavel Dovgalyuk


On 11.03.2024 20:40, Nicholas Piggin wrote:

record makes an initial snapshot when the machine is created, to enable
reverse-debugging. Often the issue being debugged appears near the end of
the trace, so it is important for performance to keep snapshots close to
the end.

This implements a periodic snapshot mode that keeps a rolling set of
recent snapshots. This could be done by the debugger or other program
that talks QMP, but for setting up simple scenarios and tests, this is
more convenient.

Signed-off-by: Nicholas Piggin 
---
  docs/system/replay.rst   |  5 
  include/sysemu/replay.h  | 11 
  replay/replay-snapshot.c | 57 
  replay/replay.c  | 27 +--
  system/vl.c  |  9 +++
  qemu-options.hx  |  9 +--
  6 files changed, 114 insertions(+), 4 deletions(-)

diff --git a/docs/system/replay.rst b/docs/system/replay.rst
index ca7c17c63d..1ae8614475 100644
--- a/docs/system/replay.rst
+++ b/docs/system/replay.rst
@@ -156,6 +156,11 @@ for storing VM snapshots. Here is the example of the command line for this:
  ``empty.qcow2`` drive does not connected to any virtual block device and used
  for VM snapshots only.
  
+``rrsnapmode`` can be used to select just an initial snapshot or periodic

+snapshots, with ``rrsnapcount`` specifying the number of periodic snapshots
+to maintain, and ``rrsnaptime`` the amount of run time in seconds between
+periodic snapshots.
+
  .. _network-label:
  
  Network devices

diff --git a/include/sysemu/replay.h b/include/sysemu/replay.h
index 8102fa54f0..92fa82842b 100644
--- a/include/sysemu/replay.h
+++ b/include/sysemu/replay.h
@@ -48,6 +48,17 @@ typedef enum ReplayCheckpoint ReplayCheckpoint;
  
  typedef struct ReplayNetState ReplayNetState;
  
+enum ReplaySnapshotMode {

+REPLAY_SNAPSHOT_MODE_INITIAL,
+REPLAY_SNAPSHOT_MODE_PERIODIC,
+};


This should be defined in replay-internal.h, because it is internal to replay.



+typedef enum ReplaySnapshotMode ReplaySnapshotMode;
+
+extern ReplaySnapshotMode replay_snapshot_mode;
+
+extern uint64_t replay_snapshot_periodic_delay;
+extern int replay_snapshot_periodic_nr_keep;


These ones are internal too.


+
  /* Name of the initial VM snapshot */
  extern char *replay_snapshot;
  
diff --git a/replay/replay-snapshot.c b/replay/replay-snapshot.c

index ccb4d89dda..762555feaa 100644
--- a/replay/replay-snapshot.c
+++ b/replay/replay-snapshot.c
@@ -70,6 +70,53 @@ void replay_vmstate_register(void)
  vmstate_register(NULL, 0, &vmstate_replay, &replay_state);
  }
  
+static QEMUTimer *replay_snapshot_timer;

+static int replay_snapshot_count;
+
+static void replay_snapshot_timer_cb(void *opaque)
+{
+Error *err = NULL;
+char *name;
+
+if (!replay_can_snapshot()) {
+/* Try again soon */
+timer_mod(replay_snapshot_timer,
+  qemu_clock_get_ms(QEMU_CLOCK_REALTIME) +
+  replay_snapshot_periodic_delay / 10);
+return;
+}
+
+name = g_strdup_printf("%s-%d", replay_snapshot, replay_snapshot_count);
+if (!save_snapshot(name,
+   true, NULL, false, NULL, &err)) {
+error_report_err(err);
+error_report("Could not create periodic snapshot "
+ "for icount record, disabling");
+g_free(name);
+return;
+}
+g_free(name);
+replay_snapshot_count++;
+
+if (replay_snapshot_periodic_nr_keep >= 1 &&
+replay_snapshot_count > replay_snapshot_periodic_nr_keep) {
+int del_nr;
+
+del_nr = replay_snapshot_count - replay_snapshot_periodic_nr_keep - 1;
+name = g_strdup_printf("%s-%d", replay_snapshot, del_nr);


Copy-paste of snapshot name format.


+if (!delete_snapshot(name, false, NULL, &err)) {
+error_report_err(err);
+error_report("Could not delete periodic snapshot "
+ "for icount record");
+}
+g_free(name);
+}
+
+timer_mod(replay_snapshot_timer,
+  qemu_clock_get_ms(QEMU_CLOCK_REALTIME) +
+  replay_snapshot_periodic_delay);
+}
+
  void replay_vmstate_init(void)
  {
  Error *err = NULL;
@@ -82,6 +129,16 @@ void replay_vmstate_init(void)
  error_report("Could not create snapshot for icount record");
  exit(1);
  }
+
+if (replay_snapshot_mode == REPLAY_SNAPSHOT_MODE_PERIODIC) {
+replay_snapshot_timer = timer_new_ms(QEMU_CLOCK_REALTIME,
+ replay_snapshot_timer_cb,
+ NULL);
+timer_mod(replay_snapshot_timer,
+  qemu_clock_get_ms(QEMU_CLOCK_REALTIME) +
+  replay_snapshot_periodic_delay);
+}
+
  } else if (replay_mode == REPLAY_MODE_PLAY) {
  if (!load_snapshot(replay_snapshot, NULL, false, NULL, &err)) {
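
The review comment about the copy-pasted snapshot name format, and the rolling-window arithmetic in the timer callback, can be sketched in plain C as follows. Both helpers are hypothetical (sketch_snapshot_name() and sketch_snapshot_to_delete() do not exist in QEMU); the format string and the index arithmetic are taken from the patch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Single place that knows the "<base>-<index>" naming scheme. */
void sketch_snapshot_name(char *buf, size_t len, const char *base, int index)
{
    assert(buf && len > 0 && base);
    snprintf(buf, len, "%s-%d", base, index);
}

/*
 * count is the number of periodic snapshots taken so far (already
 * incremented for the one just saved). Returns the index of the snapshot
 * to delete so that at most nr_keep remain, or -1 if none must be deleted.
 */
int sketch_snapshot_to_delete(int count, int nr_keep)
{
    if (nr_keep >= 1 && count > nr_keep) {
        return count - nr_keep - 1;
    }
    return -1;
}
```

With nr_keep set to 3, nothing is deleted after the third snapshot; after the fourth (indices 0 to 3) index 0 is deleted, leaving indices 1 to 3.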

Re: [PATCH v2 26/29] hw/virtio/vhost: Fix missing ERRP_GUARD() for error_prepend()

2024-03-12 Thread Thomas Huth


On 11/03/2024 04.38, Zhao Liu wrote:

From: Zhao Liu 

As the comment in qapi/error, passing @errp to error_prepend() requires
ERRP_GUARD():

* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
...
* - It should not be passed to error_prepend(), error_vprepend() or
*   error_append_hint(), because that doesn't work with &error_fatal.
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or &error_fatal.

ERRP_GUARD() avoids the case where @errp is &error_fatal and the user
can't see this additional information, because exit() happens in
error_setg() before the information is added [1].

In hw/virtio/vhost.c, there are 2 functions passing @errp to
error_prepend() without ERRP_GUARD():
- vhost_save_backend_state()
- vhost_load_backend_state()

Their @errp parameters both point to the callers' @local_err. However, as
APIs defined in include/hw/virtio/vhost.h, it is necessary to protect their
@errp with ERRP_GUARD().

To follow the requirement of @errp, add the missing ERRP_GUARD() at their
beginning.

[1]: Issue description in the commit message of commit ae7c80a7bd73
  ("error: New macro ERRP_GUARD()").

Cc: "Michael S. Tsirkin" 
Signed-off-by: Zhao Liu 
---
  hw/virtio/vhost.c | 2 ++
  1 file changed, 2 insertions(+)

diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 2c9ac794680e..2e4e040db87a 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -2199,6 +2199,7 @@ int vhost_check_device_state(struct vhost_dev *dev, Error **errp)
  
  int vhost_save_backend_state(struct vhost_dev *dev, QEMUFile *f, Error **errp)

  {
+ERRP_GUARD();
  /* Maximum chunk size in which to transfer the state */
  const size_t chunk_size = 1 * 1024 * 1024;
  g_autofree void *transfer_buf = NULL;
@@ -2291,6 +2292,7 @@ fail:
  
  int vhost_load_backend_state(struct vhost_dev *dev, QEMUFile *f, Error **errp)

  {
+ERRP_GUARD();
  size_t transfer_buf_size = 0;
  g_autofree void *transfer_buf = NULL;
  g_autoptr(GError) g_err = NULL;


Reviewed-by: Thomas Huth

Re: [PATCH 06/13] ppc/spapr: Add pa-features for POWER10 machines

2024-03-12 Thread Harsh Prateek Bora





On 3/12/24 00:21, Nicholas Piggin wrote:

From: Benjamin Gray 

Add POWER10 pa-features entry.

Notably DEXCR and and [P]HASHST/[P]HASHCHK instruction support is


s/and and/and


advertised. Each DEXCR aspect is allocated a bit in the device tree,
using the 68--71 byte range (inclusive). The functionality of the
[P]HASHST/[P]HASHCHK instructions is separately declared in byte 72,
bit 0 (BE).

Signed-off-by: Benjamin Gray 
[npiggin: reword title and changelog, adjust a few bits]
Signed-off-by: Nicholas Piggin 
---
  hw/ppc/spapr.c | 34 ++
  1 file changed, 34 insertions(+)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 247f920f07..128bfe11a8 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -265,6 +265,36 @@ static void spapr_dt_pa_features(SpaprMachineState *spapr,
  /* 60: NM atomic, 62: RNG */
  0x80, 0x00, 0x80, 0x00, 0x00, 0x00, /* 60 - 65 */
  };
+/* 3.1 removes SAO, HTM support */
+uint8_t pa_features_31[] = { 74, 0,
+/* 0: MMU|FPU|SLB|RUN|DABR|NX, 1: fri[nzpm]|DABRX|SPRG3|SLB0|PP110 */
+/* 2: VPM|DS205|PPR|DS202|DS206, 3: LSD|URG, 5: LE|CFAR|EB|LSQ */
+0xf6, 0x1f, 0xc7, 0xc0, 0x00, 0xf0, /* 0 - 5 */
+/* 6: DS207 */
+0x80, 0x00, 0x00, 0x00, 0x00, 0x00, /* 6 - 11 */
+/* 16: Vector */
+0x00, 0x00, 0x00, 0x00, 0x80, 0x00, /* 12 - 17 */
+/* 18: Vec. Scalar, 20: Vec. XOR */
+0x80, 0x00, 0x80, 0x00, 0x00, 0x00, /* 18 - 23 */
+/* 24: Ext. Dec, 26: 64 bit ftrs, 28: PM ftrs */
+0x80, 0x00, 0x80, 0x00, 0x80, 0x00, /* 24 - 29 */
+/* 32: LE atomic, 34: EBB + ext EBB */
+0x00, 0x00, 0x80, 0x00, 0xC0, 0x00, /* 30 - 35 */
+/* 40: Radix MMU */
+0x00, 0x00, 0x00, 0x00, 0x80, 0x00, /* 36 - 41 */
+/* 42: PM, 44: PC RA, 46: SC vec'd */
+0x80, 0x00, 0x80, 0x00, 0x80, 0x00, /* 42 - 47 */
+/* 48: SIMD, 50: QP BFP, 52: String */
+0x80, 0x00, 0x80, 0x00, 0x80, 0x00, /* 48 - 53 */
+/* 54: DecFP, 56: DecI, 58: SHA */
+0x80, 0x00, 0x80, 0x00, 0x80, 0x00, /* 54 - 59 */
+/* 60: NM atomic, 62: RNG */
+0x80, 0x00, 0x80, 0x00, 0x00, 0x00, /* 60 - 65 */
+/* 68: DEXCR[SBHE|IBRTPDUS|SRAPD|NPHIE|PHIE] */
+0x00, 0x00, 0xce, 0x00, 0x00, 0x00, /* 66 - 71 */
+/* 72: [P]HASHCHK */


Do we want to mention [P]HASHST as well in comment above ?


+0x80, 0x00, /* 72 - 73 */
+};
  uint8_t *pa_features = NULL;
  size_t pa_size;
  


In future, we may want to have helpers returning pointer to the
pa_features array and corresponding size conditionally based on the
required ISA support needed, instead of having local arrays bloat this
routine.

For now, with cosmetic fixes,

Reviewed-by: Harsh Prateek Bora 


@@ -280,6 +310,10 @@ static void spapr_dt_pa_features(SpaprMachineState *spapr,
  pa_features = pa_features_300;
  pa_size = sizeof(pa_features_300);
  }
+if (ppc_check_compat(cpu, CPU_POWERPC_LOGICAL_3_10, 0, cpu->compat_pvr)) {
+pa_features = pa_features_31;
+pa_size = sizeof(pa_features_31);
+}
  if (!pa_features) {
  return;
  }
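
The byte and bit numbering used in the commit message ("byte 72, bit 0 (BE)") can be checked with a short C sketch. It assumes big-endian bit numbering, where bit 0 is the most significant bit of a byte (mask 0x80). sketch_pa_tail and sketch_pa_bit() are hypothetical names; the byte values are the last two rows of pa_features_31 from the patch, covering attribute bytes 66 to 73 only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Attribute bytes 66..73 of pa_features_31, copied from the patch. */
static const uint8_t sketch_pa_tail[] = {
    0x00, 0x00, 0xce, 0x00, 0x00, 0x00, /* 66 - 71 */
    0x80, 0x00,                         /* 72 - 73 */
};

#define SKETCH_TAIL_FIRST_BYTE 66u

/* Test one feature bit; be_bit 0 is the most significant bit of the byte. */
bool sketch_pa_bit(unsigned byte, unsigned be_bit)
{
    assert(byte >= SKETCH_TAIL_FIRST_BYTE);
    assert(byte - SKETCH_TAIL_FIRST_BYTE < sizeof(sketch_pa_tail));
    assert(be_bit < 8);

    return (sketch_pa_tail[byte - SKETCH_TAIL_FIRST_BYTE] &
            (0x80u >> be_bit)) != 0;
}
```

In the full array from the patch, the two leading header bytes (74, 0) come first, so attribute byte N sits at array index N + 2; the sketch drops the header to keep the indexing obvious.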

Re: [PATCH 10/13] spapr: set MSR[ME] and MSR[FP] on client entry

2024-03-12 Thread Harsh Prateek Bora





On 3/12/24 00:21, Nicholas Piggin wrote:

The initial MSR state for PAPR specifies MSR[ME] and MSR[FP] are set.

Signed-off-by: Nicholas Piggin 


It would be good to mention PAPR section numbers suggesting the same.
Anyways,

Reviewed-by: Harsh Prateek Bora 


---
  hw/ppc/spapr_cpu_core.c | 2 ++
  1 file changed, 2 insertions(+)

diff --git a/hw/ppc/spapr_cpu_core.c b/hw/ppc/spapr_cpu_core.c
index 50523ead25..f3b01b0801 100644
--- a/hw/ppc/spapr_cpu_core.c
+++ b/hw/ppc/spapr_cpu_core.c
@@ -42,6 +42,8 @@ static void spapr_reset_vcpu(PowerPCCPU *cpu)
   * as 32bit (MSR_SF=0) in "8.2.1. Initial Register Values".
   */
  env->msr &= ~(1ULL << MSR_SF);
+env->msr |= (1ULL << MSR_ME) | (1ULL << MSR_FP);
+
  env->spr[SPR_HIOR] = 0;
  
  lpcr = env->spr[SPR_LPCR];
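
As a small illustration of the resulting register value, here is a self-contained C sketch of the two MSR updates in spapr_reset_vcpu(). The SKETCH_MSR_* bit numbers (counted from the least significant bit) are an assumption matching the usual Power ISA positions of SF, FP and ME, and sketch_client_entry_msr() is a hypothetical helper, not a QEMU function.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bit numbers, with bit 0 as the least significant bit. */
#define SKETCH_MSR_SF 63 /* 64-bit mode */
#define SKETCH_MSR_FP 13 /* floating point available */
#define SKETCH_MSR_ME 12 /* machine check enable */

/* Apply the client-entry adjustments to an MSR value. */
uint64_t sketch_client_entry_msr(uint64_t msr)
{
    msr &= ~(1ULL << SKETCH_MSR_SF);                          /* start in 32-bit mode */
    msr |= (1ULL << SKETCH_MSR_ME) | (1ULL << SKETCH_MSR_FP); /* set ME and FP */
    return msr;
}
```

Starting from an MSR of zero, or from one with only SF set, the result under these assumptions is 0x3000.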

Re: [PATCH v3] docs/system/ppc: Document running Linux on AmigaNG machines

2024-03-12 Thread Nicholas Piggin

On Tue Mar 12, 2024 at 7:28 PM AEST, Bernhard Beschow wrote:
>
>
> Am 9. März 2024 11:34:56 UTC schrieb BALATON Zoltan :
> >On Thu, 29 Feb 2024, BALATON Zoltan wrote:
> >> On Wed, 21 Feb 2024, BALATON Zoltan wrote:
> >>> Documentation on how to run Linux on the amigaone, pegasos2 and
> >>> sam460ex machines is currently buried in the depths of the qemu-devel
> >>> mailing list and in the source code. Let's collect the information in
> >>> the QEMU handbook for a one stop solution.
> >> 
> >> Ping? (Just so it's not missed from next pull.)
> >
> >Ping for freeze.
>
> Has this patch been tagged yet? It would really be a pity if it didn't make 
> it into 9.0.

Will send out a PR today and I'll include it.

>
> FWIW:
>
> Reviewed-by: Bernhard Beschow 

Thanks, always helpful.

Thanks,
Nick

>
> >
> >> Regards,
> >> BALATON Zoltan
> >> 
> >>> Co-authored-by: Bernhard Beschow 
> >>> Signed-off-by: BALATON Zoltan 
> >>> Reviewed-by: Nicholas Piggin 
> >>> Tested-by: Bernhard Beschow 
> >>> ---
> >>> v3: Apply changes and Tested-by tag from Bernhard
> >>> v2: Move top level title one level up so subsections will be below it in 
> >>> TOC
> >>> 
> >>> MAINTAINERS |   1 +
> >>> docs/system/ppc/amigang.rst | 161 
> >>> docs/system/target-ppc.rst  |   1 +
> >>> 3 files changed, 163 insertions(+)
> >>> create mode 100644 docs/system/ppc/amigang.rst
> >>> 
> >>> diff --git a/MAINTAINERS b/MAINTAINERS
> >>> index 7d61fb9319..0aef8cb2a6 100644
> >>> --- a/MAINTAINERS
> >>> +++ b/MAINTAINERS
> >>> @@ -1562,6 +1562,7 @@ F: hw/rtc/m41t80.c
> >>> F: pc-bios/canyonlands.dt[sb]
> >>> F: pc-bios/u-boot-sam460ex-20100605.bin
> >>> F: roms/u-boot-sam460ex
> >>> +F: docs/system/ppc/amigang.rst
> >>> 
> >>> pegasos2
> >>> M: BALATON Zoltan 
> >>> diff --git a/docs/system/ppc/amigang.rst b/docs/system/ppc/amigang.rst
> >>> new file mode 100644
> >>> index 00..ba1a3d80b9
> >>> --- /dev/null
> >>> +++ b/docs/system/ppc/amigang.rst
> >>> @@ -0,0 +1,161 @@
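> >>> +=========================================================
> >>> +AmigaNG boards (``amigaone``, ``pegasos2``, ``sam460ex``)
> >>> +=========================================================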
> >>> +
> >>> +These PowerPC machines emulate boards that are primarily used for
> >>> +running Amiga like OSes (AmigaOS 4, MorphOS and AROS) but these can
> >>> +also run Linux which is what this section documents.
> >>> +
> >>> +Eyetech AmigaOne/Mai Logic Teron (``amigaone``)
> >>> +===============================================
> >>> +
> >>> +The ``amigaone`` machine emulates an AmigaOne XE mainboard by Eyetech
> >>> +which is a rebranded Mai Logic Teron board with modified U-Boot
> >>> +firmware to support AmigaOS 4.
> >>> +
> >>> +Emulated devices
> >>> +----------------
> >>> +
> >>> + * PowerPC 7457 CPU (can also use ``-cpu g3, 750cxe, 750fx`` or ``750gx``)
> >>> + * Articia S north bridge
> >>> + * VIA VT82C686B south bridge
> >>> + * PCI VGA compatible card (guests may need other card instead)
> >>> + * PS/2 keyboard and mouse
> >>> +
> >>> +Firmware
> >>> +--------
> >>> +
> >>> +A firmware binary is necessary for the boot process. It is a modified
> >>> +U-Boot under GPL but its source is lost so it cannot be included in
> >>> +QEMU. A binary is available at
> >>> +https://www.hyperion-entertainment.com/index.php/downloads?view=files=28.
> >>> +The ROM image is in the last 512kB which can be extracted with the
> >>> +following command:
> >>> +
> >>> +.. code-block:: bash
> >>> +
> >>> +  $ tail -c 524288 updater.image > u-boot-amigaone.bin
> >>> +
> >>> +The BIOS emulator in the firmware is unable to run QEMU's standard
> >>> +vgabios so ``VGABIOS-lgpl-latest.bin`` is needed instead which can be
> >>> +downloaded from http://www.nongnu.org/vgabios.
> >>> +
> >>> +Running Linux
> >>> +-------------
> >>> +
> >>> +There are some Linux images under the following link that work on the
> >>> +``amigaone`` machine:
> >>> +https://sourceforge.net/projects/amigaone-linux/files/debian-installer/.
> >>> +To boot the system run:
> >>> +
> >>> +.. code-block:: bash
> >>> +
> >>> +  $ qemu-system-ppc -machine amigaone -bios u-boot-amigaone.bin \
> >>> +-cdrom "A1 Linux Net Installer.iso" \
> >>> +-device ati-vga,model=rv100,romfile=VGABIOS-lgpl-latest.bin
> >>> +
> >>> +From the firmware menu that appears select ``Boot sequence`` →
> >>> +``Amiga Multiboot Options`` and set ``Boot device 1`` to
> >>> +``Onboard VIA IDE CDROM``. Then hit escape until the main screen appears again,
> >>> +hit escape once more and from the exit menu that appears select either
> >>> +``Save settings and exit`` or ``Use settings for this session only``. It may
> >>> +take a long time loading the kernel into memory but eventually it boots and the
> >>> +installer becomes visible. The ``ati-vga`` RV100 emulation is not
> >>> +complete yet so only frame buffer works, DRM and 3D is not

Re: [PATCH] gdbstub: Fix double close() of the follow-fork-mode socket

2024-03-12 Thread Alex Bennée

Ilya Leoshkevich  writes:

> When the terminal GDB_FORK_ENABLED state is reached, the coordination
> socket is not needed anymore and is therefore closed. However, if there
> is a communication error between QEMU gdbstub and GDB, the generic
> error handling code attempts to close it again.
>
> Fix by closing it later - before returning - instead.
>
> Fixes: Coverity CID 1539966
> Fixes: d547e711a8a5 ("gdbstub: Implement follow-fork-mode child")
> Signed-off-by: Ilya Leoshkevich 

Queued to gdbstub/next, thanks.
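
As background on this class of bug, a common defensive idiom is to invalidate the descriptor right after closing it, so that a later generic cleanup path becomes a no-op. Note that this is not what the patch does (it moves the close to just before returning); the sketch below is a different technique with a hypothetical helper name.

```c
#include <assert.h>
#include <unistd.h>

/* Hypothetical helper: close *fd at most once, then mark it as closed. */
static int close_once(int *fd)
{
    int ret = 0;

    if (*fd >= 0) {
        ret = close(*fd);
        *fd = -1;   /* later calls see the sentinel and do nothing */
    }
    return ret;
}
```

Calling close_once() twice on the same variable closes the descriptor only the first time, which avoids accidentally closing an unrelated descriptor that happened to reuse the same number.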

-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro

Re: [PATCH v3 3/3] Add support for RAPL MSRs in KVM/Qemu

2024-03-12 Thread Anthony Harivel



Hi Daniel, Paolo,

Here are my last questions before wrapping up and sending v4, or maybe
calling off my attempt to add a RAPL interface to QEMU.


Daniel P. Berrangé, Jan 30, 2024 at 10:39:
> > +rcu_register_thread();
> > +
> > +/* Get QEMU PID*/
> > +pid = getpid();
> > +
> > +/* Nb of CPUS per packages */
> > +maxcpus = vmsr_get_maxcpus(0);
> > +
> > +/* Nb of Physical Packages on the system */
> > +maxpkgs = vmsr_get_max_physical_package(maxcpus);
>
> This function can fail so this needs to be checked & reported.
>
> > +
> > +/* Those MSR values should not change as well */
> > +vmsr->msr_unit  = vmsr_read_msr(MSR_RAPL_POWER_UNIT, 0, pid,
> > +s->msr_energy.socket_path);
> > +vmsr->msr_limit = vmsr_read_msr(MSR_PKG_POWER_LIMIT, 0, pid,
> > +s->msr_energy.socket_path);
> > +vmsr->msr_info  = vmsr_read_msr(MSR_PKG_POWER_INFO, 0, pid,
> > +s->msr_energy.socket_path);
>
> This function can fail for a variety of reasons, most especially if someone
> gave an incorrect socket path, or if the daemon is not running. This is not
> getting diagnosed, and even if we try to report it here, we're in a background
> thread at this point.
>
> I think we need to connect and report errors before even starting this
> thread, so that QEMU startup gets aborted upon configuration error.
>

Fair enough. Would it be OK to do the sanity check before
rcu_register_thread() and "return NULL;" in case of error, or would you
prefer me to check all of this before even calling
qemu_thread_create()?

> > +/* Populate all the thread stats */
> > +for (int i = 0; i < num_threads; i++) {
> > +thd_stat[i].utime = g_new0(unsigned long long, 2);
> > +thd_stat[i].stime = g_new0(unsigned long long, 2);
> > +thd_stat[i].thread_id = thread_ids[i];
> > +vmsr_read_thread_stat(&thd_stat[i], pid, 0);
>
> It is non-obvious that the 3rd parameter here is an index into
> the utime & stime array. This function would be saner to review
> if called as:
>
> vmsr_read_thread_stat(pid,
> thd_stat[i].thread_id,
> &thd_stat[i].utime[0],
> &thd_stat[i].stime[0],
> &thd_stat[i].cpu_id);
>
> so we see what are input parameters and what are output parameters.
>
> Also this method can fail, eg if the thread has exited already,
> so we need to take that into account and stop trying to get info
> for that thread in later code. eg by setting 'thread_id' to 0
> and then skipping any thread_id == 0 later.
>
>

Good point. I'll rework the function and set "thread_id" to 0 in
case of failure so that it can be tested later on.
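
To make sure we mean the same thing, here is a rough standalone sketch of that "invalidate and skip" pattern. The struct, the reader callback and the stub are simplified, hypothetical stand-ins, not the code from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the per-thread stats. */
typedef struct {
    unsigned int thread_id;      /* 0 marks a thread that could not be read */
    unsigned long long utime;
} ThreadStat;

/* Hypothetical reader: returns 0 on success, -1 if the thread is gone. */
typedef int (*read_stat_fn)(unsigned int tid, unsigned long long *utime);

/* Stub for illustration only: pretend odd thread IDs have already exited. */
static int fake_read_stat(unsigned int tid, unsigned long long *utime)
{
    if (tid % 2) {
        return -1;
    }
    *utime = tid * 10ULL;
    return 0;
}

/* First pass: read every thread and invalidate the ones that fail. */
static size_t populate_stats(ThreadStat *st, size_t n, read_stat_fn read_stat)
{
    size_t alive = 0;

    for (size_t i = 0; i < n; i++) {
        if (read_stat(st[i].thread_id, &st[i].utime) < 0) {
            st[i].thread_id = 0;   /* later passes skip this entry */
            continue;
        }
        alive++;
    }
    return alive;
}

/* Later pass: only consider entries that are still valid. */
static unsigned long long total_utime(const ThreadStat *st, size_t n)
{
    unsigned long long sum = 0;

    for (size_t i = 0; i < n; i++) {
        if (st[i].thread_id == 0) {
            continue;
        }
        sum += st[i].utime;
    }
    return sum;
}
```

This relies on 0 never being a valid thread ID for a QEMU thread, which I believe holds on Linux.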

> > +thd_stat[i].numa_node_id = 
> > numa_node_of_cpu(thd_stat[i].cpu_id);
> > +}
> > +
> > +/* Retrieve all packages power plane energy counter */
> > +for (int i = 0; i <= maxpkgs; i++) {
> > +for (int j = 0; j < num_threads; j++) {
> > +/*
> > + * Use the first thread we found that ran on the CPU
> > + * of the package to read the packages energy counter
> > + */
> > +if (thd_stat[j].numa_node_id == i) {
>
> 'i' is a CPU ID value, while 'numa_node_id' is a NUMA node ID value.
> I don't think it is semantically valid to compare them for equality.
>
> I'm not sure the NUMA node is even relevant, since IIUC from the docs
> earlier, the power values are scoped per package, which would mean per
> CPU socket.
>

'i' here is the package number on the host.
I'm using libnuma functions to populate maxpkgs for the host.
I tested this on several Intel CPUs with multiple packages and it has
always returned the correct number of packages. A false positive?

So here I'm checking whether the thread has run on package number 'i'.
I populate 'numa_node_id' with numa_node_of_cpu().

I did not want to reinvent the wheel, and the only library that talks
about "nodes" was libnuma.

Maybe I'm wrong to assume that a "node" (defined as an area where all
memory has the same speed as seen from a particular CPU) could lead me
to the package number?
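
If the node is indeed the wrong unit, one alternative I could look at is the package ID that Linux exposes per CPU in sysfs. A minimal sketch, assuming a Linux host with the usual topology directory; both helper names are hypothetical:

```c
#include <assert.h>
#include <stdio.h>

/* Read a single integer from a sysfs-style file; returns -1 on any failure. */
static int read_int_file(const char *path)
{
    FILE *f = fopen(path, "r");
    int value = -1;

    if (!f) {
        return -1;
    }
    if (fscanf(f, "%d", &value) != 1) {
        value = -1;
    }
    fclose(f);
    return value;
}

/* Package (socket) ID of a host CPU, or -1 if it cannot be determined. */
static int cpu_package_id(unsigned int cpu)
{
    char path[128];

    snprintf(path, sizeof(path),
             "/sys/devices/system/cpu/cpu%u/topology/physical_package_id",
             cpu);
    return read_int_file(path);
}
```

That would compare a package ID with a package ID rather than with a NUMA node ID, at the cost of being Linux-specific.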

And this is what I see you wrote below: 
"A numa node isn't a package AFAICT."


Regards,
Anthony

[PATCH v2 1/5] plugins: prepare introduction of new inline ops

2024-03-12 Thread Pierrick Bouvier

Until now, only add_u64 was available, and all functions assumed this or
were named uniquely.

Signed-off-by: Pierrick Bouvier 
---
 include/qemu/plugin.h  |  2 +-
 plugins/plugin.h   |  1 +
 accel/tcg/plugin-gen.c | 77 +-
 plugins/api.c  | 23 ++---
 plugins/core.c |  5 +--
 5 files changed, 61 insertions(+), 47 deletions(-)

diff --git a/include/qemu/plugin.h b/include/qemu/plugin.h
index 12a96cea2a4..33a7cbe910c 100644
--- a/include/qemu/plugin.h
+++ b/include/qemu/plugin.h
@@ -74,7 +74,7 @@ enum plugin_dyn_cb_type {
 enum plugin_dyn_cb_subtype {
 PLUGIN_CB_REGULAR,
 PLUGIN_CB_REGULAR_R,
-PLUGIN_CB_INLINE,
+PLUGIN_CB_INLINE_ADD_U64,
 PLUGIN_N_CB_SUBTYPES,
 };
 
diff --git a/plugins/plugin.h b/plugins/plugin.h
index 7c34f23cfcb..696b1fa38b0 100644
--- a/plugins/plugin.h
+++ b/plugins/plugin.h
@@ -70,6 +70,7 @@ struct qemu_plugin_ctx 
*plugin_id_to_ctx_locked(qemu_plugin_id_t id);
 
 void plugin_register_inline_op_on_entry(GArray **arr,
 enum qemu_plugin_mem_rw rw,
+enum plugin_dyn_cb_subtype type,
 enum qemu_plugin_op op,
 qemu_plugin_u64 entry,
 uint64_t imm);
diff --git a/accel/tcg/plugin-gen.c b/accel/tcg/plugin-gen.c
index 8028786c7bb..494467e0833 100644
--- a/accel/tcg/plugin-gen.c
+++ b/accel/tcg/plugin-gen.c
@@ -81,7 +81,7 @@ enum plugin_gen_from {
 enum plugin_gen_cb {
 PLUGIN_GEN_CB_UDATA,
 PLUGIN_GEN_CB_UDATA_R,
-PLUGIN_GEN_CB_INLINE,
+PLUGIN_GEN_CB_INLINE_ADD_U64,
 PLUGIN_GEN_CB_MEM,
 PLUGIN_GEN_ENABLE_MEM_HELPER,
 PLUGIN_GEN_DISABLE_MEM_HELPER,
@@ -127,11 +127,7 @@ static void gen_empty_udata_cb_no_rwg(void)
 gen_empty_udata_cb(gen_helper_plugin_vcpu_udata_cb_no_rwg);
 }
 
-/*
- * For now we only support addi_i64.
- * When we support more ops, we can generate one empty inline cb for each.
- */
-static void gen_empty_inline_cb(void)
+static void gen_empty_inline_cb_add_u64(void)
 {
 TCGv_i32 cpu_index = tcg_temp_ebb_new_i32();
 TCGv_ptr cpu_index_as_ptr = tcg_temp_ebb_new_ptr();
@@ -219,9 +215,11 @@ static void plugin_gen_empty_callback(enum plugin_gen_from 
from)
 gen_empty_mem_helper);
 /* fall through */
 case PLUGIN_GEN_FROM_TB:
+/* emit inline op before any callback */
+gen_wrapped(from, PLUGIN_GEN_CB_INLINE_ADD_U64,
+gen_empty_inline_cb_add_u64);
 gen_wrapped(from, PLUGIN_GEN_CB_UDATA, gen_empty_udata_cb_no_rwg);
 gen_wrapped(from, PLUGIN_GEN_CB_UDATA_R, gen_empty_udata_cb_no_wg);
-gen_wrapped(from, PLUGIN_GEN_CB_INLINE, gen_empty_inline_cb);
 break;
 default:
 g_assert_not_reached();
@@ -232,13 +230,14 @@ void plugin_gen_empty_mem_callback(TCGv_i64 addr, 
uint32_t info)
 {
 enum qemu_plugin_mem_rw rw = get_plugin_meminfo_rw(info);
 
+/* emit inline op before any callback */
+gen_plugin_cb_start(PLUGIN_GEN_FROM_MEM, PLUGIN_GEN_CB_INLINE_ADD_U64, rw);
+gen_empty_inline_cb_add_u64();
+tcg_gen_plugin_cb_end();
+
 gen_plugin_cb_start(PLUGIN_GEN_FROM_MEM, PLUGIN_GEN_CB_MEM, rw);
 gen_empty_mem_cb(addr, info);
 tcg_gen_plugin_cb_end();
-
-gen_plugin_cb_start(PLUGIN_GEN_FROM_MEM, PLUGIN_GEN_CB_INLINE, rw);
-gen_empty_inline_cb();
-tcg_gen_plugin_cb_end();
 }
 
 static TCGOp *find_op(TCGOp *op, TCGOpcode opc)
@@ -436,9 +435,9 @@ static TCGOp *append_udata_cb(const struct 
qemu_plugin_dyn_cb *cb,
 return op;
 }
 
-static TCGOp *append_inline_cb(const struct qemu_plugin_dyn_cb *cb,
-   TCGOp *begin_op, TCGOp *op,
-   int *unused)
+static TCGOp *append_inline_cb_add_u64(const struct qemu_plugin_dyn_cb *cb,
+   TCGOp *begin_op, TCGOp *op,
+   int *unused)
 {
 char *ptr = cb->inline_insn.entry.score->data->data;
 size_t elem_size = g_array_get_element_size(
@@ -538,9 +537,9 @@ inject_udata_cb(const GArray *cbs, TCGOp *begin_op)
 }
 
 static void
-inject_inline_cb(const GArray *cbs, TCGOp *begin_op, op_ok_fn ok)
+inject_inline_cb_add_u64(const GArray *cbs, TCGOp *begin_op, op_ok_fn ok)
 {
-inject_cb_type(cbs, begin_op, append_inline_cb, ok);
+inject_cb_type(cbs, begin_op, append_inline_cb_add_u64, ok);
 }
 
 static void
@@ -588,8 +587,9 @@ static void inject_mem_enable_helper(struct qemu_plugin_tb 
*ptb,
 GArray *arr;
 size_t n_cbs, i;
 
-cbs[0] = plugin_insn->cbs[PLUGIN_CB_MEM][PLUGIN_CB_REGULAR];
-cbs[1] = plugin_insn->cbs[PLUGIN_CB_MEM][PLUGIN_CB_INLINE];
+/* emit inline op before any callback */
+cbs[0] = plugin_insn->cbs[PLUGIN_CB_MEM][PLUGIN_CB_INLINE_ADD_U64];
+cbs[1] = plugin_insn->cbs[PLUGIN_CB_MEM][PLUGIN_CB_REGULAR];
 
 n_cbs = 0;
 for

[PATCH v2 5/5] tests/plugin/inline: add test for condition callback

2024-03-12 Thread Pierrick Bouvier

Count the number of TBs and instructions executed using a conditional
callback, and check that the callback was called the expected number of
times (per vCPU).

Signed-off-by: Pierrick Bouvier 
---
 tests/plugin/inline.c | 89 +--
 1 file changed, 86 insertions(+), 3 deletions(-)

diff --git a/tests/plugin/inline.c b/tests/plugin/inline.c
index 30acc7a1838..771c246094e 100644
--- a/tests/plugin/inline.c
+++ b/tests/plugin/inline.c
@@ -20,8 +20,14 @@ typedef struct {
 uint64_t count_insn_inline;
 uint64_t count_mem;
 uint64_t count_mem_inline;
+uint64_t tb_cond_num_trigger;
+uint64_t tb_cond_track_count;
+uint64_t insn_cond_num_trigger;
+uint64_t insn_cond_track_count;
 } CPUCount;
 
+static const uint64_t cond_trigger_limit = 100;
+
 typedef struct {
 uint64_t data_insn;
 uint64_t data_tb;
@@ -35,6 +41,10 @@ static qemu_plugin_u64 count_insn;
 static qemu_plugin_u64 count_insn_inline;
 static qemu_plugin_u64 count_mem;
 static qemu_plugin_u64 count_mem_inline;
+static qemu_plugin_u64 tb_cond_num_trigger;
+static qemu_plugin_u64 tb_cond_track_count;
+static qemu_plugin_u64 insn_cond_num_trigger;
+static qemu_plugin_u64 insn_cond_track_count;
 static struct qemu_plugin_scoreboard *data;
 static qemu_plugin_u64 data_insn;
 static qemu_plugin_u64 data_tb;
@@ -56,12 +66,19 @@ static void stats_insn(void)
 const uint64_t per_vcpu = qemu_plugin_u64_sum(count_insn);
 const uint64_t inl_per_vcpu =
 qemu_plugin_u64_sum(count_insn_inline);
+const uint64_t cond_num_trigger =
+qemu_plugin_u64_sum(insn_cond_num_trigger);
+const uint64_t cond_track_left = 
qemu_plugin_u64_sum(insn_cond_track_count);
+const uint64_t conditional =
+cond_num_trigger * cond_trigger_limit + cond_track_left;
 printf("insn: %" PRIu64 "\n", expected);
 printf("insn: %" PRIu64 " (per vcpu)\n", per_vcpu);
 printf("insn: %" PRIu64 " (per vcpu inline)\n", inl_per_vcpu);
+printf("insn: %" PRIu64 " (cond cb)\n", conditional);
 g_assert(expected > 0);
 g_assert(per_vcpu == expected);
 g_assert(inl_per_vcpu == expected);
+g_assert(conditional == expected);
 }
 
 static void stats_tb(void)
@@ -70,12 +87,18 @@ static void stats_tb(void)
 const uint64_t per_vcpu = qemu_plugin_u64_sum(count_tb);
 const uint64_t inl_per_vcpu =
 qemu_plugin_u64_sum(count_tb_inline);
+const uint64_t cond_num_trigger = qemu_plugin_u64_sum(tb_cond_num_trigger);
+const uint64_t cond_track_left = qemu_plugin_u64_sum(tb_cond_track_count);
+const uint64_t conditional =
+cond_num_trigger * cond_trigger_limit + cond_track_left;
 printf("tb: %" PRIu64 "\n", expected);
 printf("tb: %" PRIu64 " (per vcpu)\n", per_vcpu);
 printf("tb: %" PRIu64 " (per vcpu inline)\n", inl_per_vcpu);
+printf("tb: %" PRIu64 " (conditional cb)\n", conditional);
 g_assert(expected > 0);
 g_assert(per_vcpu == expected);
 g_assert(inl_per_vcpu == expected);
+g_assert(conditional == expected);
 }
 
 static void stats_mem(void)
@@ -104,14 +127,35 @@ static void plugin_exit(qemu_plugin_id_t id, void *udata)
 const uint64_t insn_inline = qemu_plugin_u64_get(count_insn_inline, i);
 const uint64_t mem = qemu_plugin_u64_get(count_mem, i);
 const uint64_t mem_inline = qemu_plugin_u64_get(count_mem_inline, i);
-printf("cpu %d: tb (%" PRIu64 ", %" PRIu64 ") | "
-   "insn (%" PRIu64 ", %" PRIu64 ") | "
+const uint64_t tb_cond_trigger =
+qemu_plugin_u64_get(tb_cond_num_trigger, i);
+const uint64_t tb_cond_left =
+qemu_plugin_u64_get(tb_cond_track_count, i);
+const uint64_t insn_cond_trigger =
+qemu_plugin_u64_get(insn_cond_num_trigger, i);
+const uint64_t insn_cond_left =
+qemu_plugin_u64_get(insn_cond_track_count, i);
+printf("cpu %d: tb (%" PRIu64 ", %" PRIu64
+   ", %" PRIu64 " * %" PRIu64 " + %" PRIu64
+   ") | "
+   "insn (%" PRIu64 ", %" PRIu64
+   ", %" PRIu64 " * %" PRIu64 " + %" PRIu64
+   ") | "
"mem (%" PRIu64 ", %" PRIu64 ")"
"\n",
-   i, tb, tb_inline, insn, insn_inline, mem, mem_inline);
+   i,
+   tb, tb_inline,
+   tb_cond_trigger, cond_trigger_limit, tb_cond_left,
+   insn, insn_inline,
+   insn_cond_trigger, cond_trigger_limit, insn_cond_left,
+   mem, mem_inline);
 g_assert(tb == tb_inline);
 g_assert(insn == insn_inline);
 g_assert(mem == mem_inline);
+g_assert(tb_cond_trigger == tb / cond_trigger_limit);
+g_assert(tb_cond_left == tb % cond_trigger_limit);
+g_assert(insn_cond_trigger == insn / cond_trigger_limit);
+g_assert(insn_cond_left == insn % cond_trigger_limit);
 }
 
 stats_tb();
@@ -132,6 +176,24 @@ static void vcpu_tb_exec(unsigned

[PATCH v2 3/5] tests/plugin/inline: add test for STORE_U64 inline op

2024-03-12 Thread Pierrick Bouvier

Signed-off-by: Pierrick Bouvier 
---
 tests/plugin/inline.c | 41 +
 1 file changed, 37 insertions(+), 4 deletions(-)

diff --git a/tests/plugin/inline.c b/tests/plugin/inline.c
index 0163e9b51c5..30acc7a1838 100644
--- a/tests/plugin/inline.c
+++ b/tests/plugin/inline.c
@@ -22,6 +22,12 @@ typedef struct {
 uint64_t count_mem_inline;
 } CPUCount;
 
+typedef struct {
+uint64_t data_insn;
+uint64_t data_tb;
+uint64_t data_mem;
+} CPUData;
+
 static struct qemu_plugin_scoreboard *counts;
 static qemu_plugin_u64 count_tb;
 static qemu_plugin_u64 count_tb_inline;
@@ -29,6 +35,10 @@ static qemu_plugin_u64 count_insn;
 static qemu_plugin_u64 count_insn_inline;
 static qemu_plugin_u64 count_mem;
 static qemu_plugin_u64 count_mem_inline;
+static struct qemu_plugin_scoreboard *data;
+static qemu_plugin_u64 data_insn;
+static qemu_plugin_u64 data_tb;
+static qemu_plugin_u64 data_mem;
 
 static uint64_t global_count_tb;
 static uint64_t global_count_insn;
@@ -109,11 +119,13 @@ static void plugin_exit(qemu_plugin_id_t id, void *udata)
 stats_mem();
 
 qemu_plugin_scoreboard_free(counts);
+qemu_plugin_scoreboard_free(data);
 }
 
 static void vcpu_tb_exec(unsigned int cpu_index, void *udata)
 {
 qemu_plugin_u64_add(count_tb, cpu_index, 1);
+g_assert(qemu_plugin_u64_get(data_tb, cpu_index) == (uintptr_t) udata);
 g_mutex_lock(_lock);
 max_cpu_index = MAX(max_cpu_index, cpu_index);
 global_count_tb++;
@@ -123,6 +135,7 @@ static void vcpu_tb_exec(unsigned int cpu_index, void 
*udata)
 static void vcpu_insn_exec(unsigned int cpu_index, void *udata)
 {
 qemu_plugin_u64_add(count_insn, cpu_index, 1);
+g_assert(qemu_plugin_u64_get(data_insn, cpu_index) == (uintptr_t) udata);
 g_mutex_lock(_lock);
 global_count_insn++;
 g_mutex_unlock(_lock);
@@ -131,9 +144,10 @@ static void vcpu_insn_exec(unsigned int cpu_index, void 
*udata)
 static void vcpu_mem_access(unsigned int cpu_index,
 qemu_plugin_meminfo_t info,
 uint64_t vaddr,
-void *userdata)
+void *udata)
 {
 qemu_plugin_u64_add(count_mem, cpu_index, 1);
+g_assert(qemu_plugin_u64_get(data_mem, cpu_index) == (uintptr_t) udata);
 g_mutex_lock(_lock);
 global_count_mem++;
 g_mutex_unlock(_lock);
@@ -141,20 +155,34 @@ static void vcpu_mem_access(unsigned int cpu_index,
 
 static void vcpu_tb_trans(qemu_plugin_id_t id, struct qemu_plugin_tb *tb)
 {
+void *tb_store = tb;
 qemu_plugin_register_vcpu_tb_exec_cb(
-tb, vcpu_tb_exec, QEMU_PLUGIN_CB_NO_REGS, 0);
+tb, vcpu_tb_exec, QEMU_PLUGIN_CB_NO_REGS, tb_store);
 qemu_plugin_register_vcpu_tb_exec_inline_per_vcpu(
 tb, QEMU_PLUGIN_INLINE_ADD_U64, count_tb_inline, 1);
+qemu_plugin_register_vcpu_tb_exec_inline_per_vcpu(
+tb, QEMU_PLUGIN_INLINE_STORE_U64, data_tb, (uintptr_t) tb_store);
 
 for (int idx = 0; idx < qemu_plugin_tb_n_insns(tb); ++idx) {
 struct qemu_plugin_insn *insn = qemu_plugin_tb_get_insn(tb, idx);
+void *insn_store = insn;
+void *mem_store = (char *)insn_store + 0xff;
+
 qemu_plugin_register_vcpu_insn_exec_cb(
-insn, vcpu_insn_exec, QEMU_PLUGIN_CB_NO_REGS, 0);
+insn, vcpu_insn_exec, QEMU_PLUGIN_CB_NO_REGS, insn_store);
+qemu_plugin_register_vcpu_insn_exec_inline_per_vcpu(
+insn, QEMU_PLUGIN_INLINE_STORE_U64, data_insn,
+(uintptr_t) insn_store);
 qemu_plugin_register_vcpu_insn_exec_inline_per_vcpu(
 insn, QEMU_PLUGIN_INLINE_ADD_U64, count_insn_inline, 1);
+
 qemu_plugin_register_vcpu_mem_cb(insn, &vcpu_mem_access,
  QEMU_PLUGIN_CB_NO_REGS,
- QEMU_PLUGIN_MEM_RW, 0);
+ QEMU_PLUGIN_MEM_RW, mem_store);
+qemu_plugin_register_vcpu_mem_inline_per_vcpu(
+insn, QEMU_PLUGIN_MEM_RW,
+QEMU_PLUGIN_INLINE_STORE_U64,
+data_mem, (uintptr_t) mem_store);
 qemu_plugin_register_vcpu_mem_inline_per_vcpu(
 insn, QEMU_PLUGIN_MEM_RW,
 QEMU_PLUGIN_INLINE_ADD_U64,
@@ -179,6 +207,11 @@ int qemu_plugin_install(qemu_plugin_id_t id, const 
qemu_info_t *info,
 counts, CPUCount, count_insn_inline);
 count_mem_inline = qemu_plugin_scoreboard_u64_in_struct(
 counts, CPUCount, count_mem_inline);
+data = qemu_plugin_scoreboard_new(sizeof(CPUData));
+data_insn = qemu_plugin_scoreboard_u64_in_struct(data, CPUData, data_insn);
+data_tb = qemu_plugin_scoreboard_u64_in_struct(data, CPUData, data_tb);
+data_mem = qemu_plugin_scoreboard_u64_in_struct(data, CPUData, data_mem);
+
 qemu_plugin_register_vcpu_tb_trans_cb(id, vcpu_tb_trans);
 qemu_plugin_register_atexit_cb(id, plugin_exit, NULL);
 
-- 
2.39.2

[PATCH v2 2/5] plugins: add new inline op STORE_U64

2024-03-12 Thread Pierrick Bouvier

Signed-off-by: Pierrick Bouvier 
---
 include/qemu/plugin.h  |   1 +
 include/qemu/qemu-plugin.h |   4 +-
 accel/tcg/plugin-gen.c | 114 -
 plugins/api.c  |   2 +
 plugins/core.c |   4 ++
 5 files changed, 120 insertions(+), 5 deletions(-)

diff --git a/include/qemu/plugin.h b/include/qemu/plugin.h
index 33a7cbe910c..d92d64744e6 100644
--- a/include/qemu/plugin.h
+++ b/include/qemu/plugin.h
@@ -75,6 +75,7 @@ enum plugin_dyn_cb_subtype {
 PLUGIN_CB_REGULAR,
 PLUGIN_CB_REGULAR_R,
 PLUGIN_CB_INLINE_ADD_U64,
+PLUGIN_CB_INLINE_STORE_U64,
 PLUGIN_N_CB_SUBTYPES,
 };
 
diff --git a/include/qemu/qemu-plugin.h b/include/qemu/qemu-plugin.h
index 4fc6c3739b2..c5cac897a0b 100644
--- a/include/qemu/qemu-plugin.h
+++ b/include/qemu/qemu-plugin.h
@@ -305,12 +305,12 @@ void qemu_plugin_register_vcpu_tb_exec_cb(struct 
qemu_plugin_tb *tb,
  * enum qemu_plugin_op - describes an inline op
  *
  * @QEMU_PLUGIN_INLINE_ADD_U64: add an immediate value uint64_t
- *
- * Note: currently only a single inline op is supported.
+ * @QEMU_PLUGIN_INLINE_STORE_U64: store an immediate value uint64_t
  */
 
 enum qemu_plugin_op {
 QEMU_PLUGIN_INLINE_ADD_U64,
+QEMU_PLUGIN_INLINE_STORE_U64,
 };
 
 /**
diff --git a/accel/tcg/plugin-gen.c b/accel/tcg/plugin-gen.c
index 494467e0833..02c894106e2 100644
--- a/accel/tcg/plugin-gen.c
+++ b/accel/tcg/plugin-gen.c
@@ -46,8 +46,9 @@
 #include "qemu/plugin.h"
 #include "cpu.h"
 #include "tcg/tcg.h"
-#include "tcg/tcg-temp-internal.h"
+#include "tcg/tcg-internal.h"
 #include "tcg/tcg-op.h"
+#include "tcg/tcg-temp-internal.h"
 #include "exec/exec-all.h"
 #include "exec/plugin-gen.h"
 #include "exec/translator.h"
@@ -82,6 +83,7 @@ enum plugin_gen_cb {
 PLUGIN_GEN_CB_UDATA,
 PLUGIN_GEN_CB_UDATA_R,
 PLUGIN_GEN_CB_INLINE_ADD_U64,
+PLUGIN_GEN_CB_INLINE_STORE_U64,
 PLUGIN_GEN_CB_MEM,
 PLUGIN_GEN_ENABLE_MEM_HELPER,
 PLUGIN_GEN_DISABLE_MEM_HELPER,
@@ -153,6 +155,30 @@ static void gen_empty_inline_cb_add_u64(void)
 tcg_temp_free_i32(cpu_index);
 }
 
+static void gen_empty_inline_cb_store_u64(void)
+{
+TCGv_i32 cpu_index = tcg_temp_ebb_new_i32();
+TCGv_ptr cpu_index_as_ptr = tcg_temp_ebb_new_ptr();
+TCGv_i64 val = tcg_temp_ebb_new_i64();
+TCGv_ptr ptr = tcg_temp_ebb_new_ptr();
+
+tcg_gen_ld_i32(cpu_index, tcg_env,
+   -offsetof(ArchCPU, env) + offsetof(CPUState, cpu_index));
+/* second operand will be replaced by immediate value */
+tcg_gen_mul_i32(cpu_index, cpu_index, cpu_index);
+tcg_gen_ext_i32_ptr(cpu_index_as_ptr, cpu_index);
+tcg_gen_movi_ptr(ptr, 0);
+tcg_gen_add_ptr(ptr, ptr, cpu_index_as_ptr);
+
+tcg_gen_movi_i64(val, 0);
+tcg_gen_st_i64(val, ptr, 0);
+
+tcg_temp_free_ptr(ptr);
+tcg_temp_free_i64(val);
+tcg_temp_free_ptr(cpu_index_as_ptr);
+tcg_temp_free_i32(cpu_index);
+}
+
 static void gen_empty_mem_cb(TCGv_i64 addr, uint32_t info)
 {
 TCGv_i32 cpu_index = tcg_temp_ebb_new_i32();
@@ -218,6 +244,8 @@ static void plugin_gen_empty_callback(enum plugin_gen_from 
from)
 /* emit inline op before any callback */
 gen_wrapped(from, PLUGIN_GEN_CB_INLINE_ADD_U64,
 gen_empty_inline_cb_add_u64);
+gen_wrapped(from, PLUGIN_GEN_CB_INLINE_STORE_U64,
+gen_empty_inline_cb_store_u64);
 gen_wrapped(from, PLUGIN_GEN_CB_UDATA, gen_empty_udata_cb_no_rwg);
 gen_wrapped(from, PLUGIN_GEN_CB_UDATA_R, gen_empty_udata_cb_no_wg);
 break;
@@ -235,6 +263,11 @@ void plugin_gen_empty_mem_callback(TCGv_i64 addr, uint32_t 
info)
 gen_empty_inline_cb_add_u64();
 tcg_gen_plugin_cb_end();
 
+gen_plugin_cb_start(PLUGIN_GEN_FROM_MEM,
+PLUGIN_GEN_CB_INLINE_STORE_U64, rw);
+gen_empty_inline_cb_store_u64();
+tcg_gen_plugin_cb_end();
+
 gen_plugin_cb_start(PLUGIN_GEN_FROM_MEM, PLUGIN_GEN_CB_MEM, rw);
 gen_empty_mem_cb(addr, info);
 tcg_gen_plugin_cb_end();
@@ -352,6 +385,20 @@ static TCGOp *copy_st_i64(TCGOp **begin_op, TCGOp *op)
 return op;
 }
 
+static TCGOp *copy_mov_i64(TCGOp **begin_op, TCGOp *op, uint64_t v)
+{
+if (TCG_TARGET_REG_BITS == 32) {
+op = copy_op(begin_op, op, INDEX_op_mov_i32);
+op->args[1] = tcgv_i32_arg(TCGV_LOW(tcg_constant_i64(v)));
+op = copy_op(begin_op, op, INDEX_op_mov_i32);
+op->args[1] = tcgv_i32_arg(TCGV_HIGH(tcg_constant_i64(v)));
+} else {
+op = copy_op(begin_op, op, INDEX_op_mov_i64);
+op->args[1] = tcgv_i64_arg(tcg_constant_i64(v));
+}
+return op;
+}
+
 static TCGOp *copy_add_i64(TCGOp **begin_op, TCGOp *op, uint64_t v)
 {
 if (TCG_TARGET_REG_BITS == 32) {
@@ -455,6 +502,24 @@ static TCGOp *append_inline_cb_add_u64(const struct 
qemu_plugin_dyn_cb *cb,
 return op;
 }
 
+static TCGOp *append_inline_cb_store_u64(const struct qemu_plugin_dyn_cb *cb,
+

[PATCH v2 4/5] plugins: conditional callbacks

2024-03-12 Thread Pierrick Bouvier

Extend the plugins API to support callbacks that are called only when a
given condition (evaluated inline) holds.

Added functions:
- qemu_plugin_register_vcpu_tb_exec_cond_cb
- qemu_plugin_register_vcpu_insn_exec_cond_cb

They take as parameters a condition, a qemu_plugin_u64 (op1) and an
immediate (op2). The callback is called if op1 |cond| op2 is true.
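
As an illustration of the "op1 |cond| op2" semantics only, the check can be modelled in plain C as below. The enum is copied locally so the sketch compiles on its own; the helper name is hypothetical, and treating both operands as unsigned 64-bit values is an assumption, not something this message states.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local copy of the enum added by this patch. */
enum qemu_plugin_cond {
    QEMU_PLUGIN_COND_NEVER,
    QEMU_PLUGIN_COND_ALWAYS,
    QEMU_PLUGIN_COND_EQ,
    QEMU_PLUGIN_COND_NE,
    QEMU_PLUGIN_COND_LT,
    QEMU_PLUGIN_COND_LE,
    QEMU_PLUGIN_COND_GT,
    QEMU_PLUGIN_COND_GE,
};

/* Hypothetical host-side model: does "op1 cond op2" hold? (unsigned compare) */
static bool cond_holds(enum qemu_plugin_cond cond, uint64_t op1, uint64_t op2)
{
    switch (cond) {
    case QEMU_PLUGIN_COND_NEVER:  return false;
    case QEMU_PLUGIN_COND_ALWAYS: return true;
    case QEMU_PLUGIN_COND_EQ:     return op1 == op2;
    case QEMU_PLUGIN_COND_NE:     return op1 != op2;
    case QEMU_PLUGIN_COND_LT:     return op1 < op2;
    case QEMU_PLUGIN_COND_LE:     return op1 <= op2;
    case QEMU_PLUGIN_COND_GT:     return op1 > op2;
    case QEMU_PLUGIN_COND_GE:     return op1 >= op2;
    }
    return false;   /* not reached for valid enum values */
}
```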

Signed-off-by: Pierrick Bouvier 
---
 include/qemu/plugin.h|   7 ++
 include/qemu/qemu-plugin.h   |  76 +++
 plugins/plugin.h |   8 ++
 accel/tcg/plugin-gen.c   | 174 ++-
 plugins/api.c|  51 ++
 plugins/core.c   |  19 
 plugins/qemu-plugins.symbols |   2 +
 7 files changed, 334 insertions(+), 3 deletions(-)

diff --git a/include/qemu/plugin.h b/include/qemu/plugin.h
index d92d64744e6..056102b2361 100644
--- a/include/qemu/plugin.h
+++ b/include/qemu/plugin.h
@@ -74,6 +74,8 @@ enum plugin_dyn_cb_type {
 enum plugin_dyn_cb_subtype {
 PLUGIN_CB_REGULAR,
 PLUGIN_CB_REGULAR_R,
+PLUGIN_CB_COND,
+PLUGIN_CB_COND_R,
 PLUGIN_CB_INLINE_ADD_U64,
 PLUGIN_CB_INLINE_STORE_U64,
 PLUGIN_N_CB_SUBTYPES,
@@ -97,6 +99,11 @@ struct qemu_plugin_dyn_cb {
 enum qemu_plugin_op op;
 uint64_t imm;
 } inline_insn;
+struct {
+qemu_plugin_u64 entry;
+enum qemu_plugin_cond cond;
+uint64_t imm;
+} cond_cb;
 };
 };
 
diff --git a/include/qemu/qemu-plugin.h b/include/qemu/qemu-plugin.h
index c5cac897a0b..337de25ece7 100644
--- a/include/qemu/qemu-plugin.h
+++ b/include/qemu/qemu-plugin.h
@@ -262,6 +262,29 @@ enum qemu_plugin_mem_rw {
 QEMU_PLUGIN_MEM_RW,
 };
 
+/**
+ * enum qemu_plugin_cond - condition to enable callback
+ *
+ * @QEMU_PLUGIN_COND_NEVER: false
+ * @QEMU_PLUGIN_COND_ALWAYS: true
+ * @QEMU_PLUGIN_COND_EQ: is equal?
+ * @QEMU_PLUGIN_COND_NE: is not equal?
+ * @QEMU_PLUGIN_COND_LT: is less than?
+ * @QEMU_PLUGIN_COND_LE: is less than or equal?
+ * @QEMU_PLUGIN_COND_GT: is greater than?
+ * @QEMU_PLUGIN_COND_GE: is greater than or equal?
+ */
+enum qemu_plugin_cond {
+QEMU_PLUGIN_COND_NEVER,
+QEMU_PLUGIN_COND_ALWAYS,
+QEMU_PLUGIN_COND_EQ,
+QEMU_PLUGIN_COND_NE,
+QEMU_PLUGIN_COND_LT,
+QEMU_PLUGIN_COND_LE,
+QEMU_PLUGIN_COND_GT,
+QEMU_PLUGIN_COND_GE,
+};
+
 /**
  * typedef qemu_plugin_vcpu_tb_trans_cb_t - translation callback
  * @id: unique plugin id
@@ -301,6 +324,32 @@ void qemu_plugin_register_vcpu_tb_exec_cb(struct 
qemu_plugin_tb *tb,
   enum qemu_plugin_cb_flags flags,
   void *userdata);
 
+/**
+ * qemu_plugin_register_vcpu_tb_exec_cond_cb() - register conditional callback
+ * @tb: the opaque qemu_plugin_tb handle for the translation
+ * @cb: callback function
+ * @cond: condition to enable callback
+ * @entry: first operand for condition
+ * @imm: second operand for condition
+ * @flags: does the plugin read or write the CPU's registers?
+ * @userdata: any plugin data to pass to the @cb?
+ *
+ * The @cb function is called when a translated unit executes if
+ * entry @cond imm is true.
+ * If condition is QEMU_PLUGIN_COND_ALWAYS, condition is never interpreted and
+ * this function is equivalent to qemu_plugin_register_vcpu_tb_exec_cb.
+ * If condition QEMU_PLUGIN_COND_NEVER, condition is never interpreted and
+ * callback is never installed.
+ */
+QEMU_PLUGIN_API
+void qemu_plugin_register_vcpu_tb_exec_cond_cb(struct qemu_plugin_tb *tb,
+   qemu_plugin_vcpu_udata_cb_t cb,
+   enum qemu_plugin_cb_flags flags,
+   enum qemu_plugin_cond cond,
+   qemu_plugin_u64 entry,
+   uint64_t imm,
+   void *userdata);
+
 /**
  * enum qemu_plugin_op - describes an inline op
  *
@@ -344,6 +393,33 @@ void qemu_plugin_register_vcpu_insn_exec_cb(struct 
qemu_plugin_insn *insn,
 enum qemu_plugin_cb_flags flags,
 void *userdata);
 
+/**
+ * qemu_plugin_register_vcpu_insn_exec_cond_cb() - conditional insn execution 
cb
+ * @insn: the opaque qemu_plugin_insn handle for an instruction
+ * @cb: callback function
+ * @flags: does the plugin read or write the CPU's registers?
+ * @cond: condition to enable callback
+ * @entry: first operand for condition
+ * @imm: second operand for condition
+ * @userdata: any plugin data to pass to the @cb?
+ *
+ * The @cb function is called when an instruction executes if
+ * entry @cond imm is true.
+ * If condition is QEMU_PLUGIN_COND_ALWAYS, condition is never interpreted and
+ * this function is equivalent to qemu_plugin_register_vcpu_insn_exec_cb.
+ * If condition QEMU_PLUGIN_COND_NEVER, condition is

[PATCH v2 0/5] TCG plugins new inline operations

2024-03-12 Thread Pierrick Bouvier

This series implements two new operations for plugins:
- Store inline writes a specific value to a scoreboard.
- Conditional callback executes a callback only when a given condition is true.
  The condition is evaluated inline.

It's possible to mix various inline operations (add, store) with conditional
callbacks, allowing efficient "trap" based counters.
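
To give an idea of what a "trap" based counter looks like, here is a plain-C model of the behaviour. It does not use the plugin API: the struct and function names are hypothetical, and the limit of 100 simply mirrors cond_trigger_limit from the test plugin.

```c
#include <assert.h>
#include <stdint.h>

#define TRAP_LIMIT 100   /* mirrors cond_trigger_limit in tests/plugin/inline.c */

/* Hypothetical per-vCPU state: one inline counter plus a trigger count. */
typedef struct {
    uint64_t track_count;   /* incremented inline on every execution */
    uint64_t num_trigger;   /* incremented only when the callback fires */
} TrapCounter;

/* Models the conditional callback: record the trap and reset the counter. */
static void trap_cb(TrapCounter *c)
{
    c->num_trigger++;
    c->track_count = 0;
}

/* Models one executed instruction (or TB). */
static void exec_one(TrapCounter *c)
{
    c->track_count += 1;                   /* inline add */
    if (c->track_count == TRAP_LIMIT) {    /* condition evaluated inline (EQ) */
        trap_cb(c);                        /* callback runs only on a hit */
    }
}

/* Total executions reconstructed from the two counters. */
static uint64_t executed_total(const TrapCounter *c)
{
    return c->num_trigger * TRAP_LIMIT + c->track_count;
}
```

In this model the callback runs once per 100 executions instead of on every execution, and the total is still recoverable as num_trigger * limit + track_count, which is the invariant the new test asserts.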

It builds on top of the new scoreboard API, introduced in the previous series.

v2
--

- fixed issue with udata not being passed to conditional callback
- added specific test for this in tests/plugin/inline.c (udata was NULL before).

Pierrick Bouvier (5):
  plugins: prepare introduction of new inline ops
  plugins: add new inline op STORE_U64
  tests/plugin/inline: add test for STORE_U64 inline op
  plugins: conditional callbacks
  tests/plugin/inline: add test for condition callback

 include/qemu/plugin.h|  10 +-
 include/qemu/qemu-plugin.h   |  80 +++-
 plugins/plugin.h |   9 +
 accel/tcg/plugin-gen.c   | 359 +++
 plugins/api.c|  76 +++-
 plugins/core.c   |  28 ++-
 tests/plugin/inline.c| 130 -
 plugins/qemu-plugins.symbols |   2 +
 8 files changed, 635 insertions(+), 59 deletions(-)

-- 
2.39.2

[PULL 11/13] hw/gpio: introduce pcf8574 driver

2024-03-12 Thread Philippe Mathieu-Daudé

From: Dmitriy Sharikhin 

NXP PCF8574 and compatible ICs are simple I2C GPIO expanders.
The PCF8574 has quasi-bidirectional IO and a simple
communication protocol: an IO read is an I2C byte read, and
an IO write is an I2C byte write. The user can think of it as
an open-drain port, where a high line state is input and a low
line state is output.
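
As a rough illustration of the open-drain behaviour described above, the
following sketch models one 8-bit port. It is a simplified model under my
own assumptions, not the driver code: struct pcf8574_model and its helpers
are hypothetical names, and interrupt handling is left out.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified model; type and field names are illustrative only. */
struct pcf8574_model {
    uint8_t input;   /* level imposed by external circuitry, 1 = high */
    uint8_t output;  /* last byte written: 1 = weak pull-up, 0 = drive low */
};

/* A line reads high only if neither the chip nor the outside pulls it low. */
static uint8_t pcf8574_model_read(const struct pcf8574_model *m)
{
    return m->input & m->output;
}

/* An I2C byte write simply replaces the output latch. */
static void pcf8574_model_write(struct pcf8574_model *m, uint8_t data)
{
    m->output = data;
}
```

Writing 1 to a bit releases the line so it can be used as an input; writing
0 drives it low regardless of the external level.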

Signed-off-by: Dmitrii Sharikhin 
Reviewed-by: Philippe Mathieu-Daudé 
Message-ID: 
Signed-off-by: Philippe Mathieu-Daudé 
---
 MAINTAINERS   |   6 ++
 include/hw/gpio/pcf8574.h |  15 
 hw/gpio/pcf8574.c | 162 ++
 hw/gpio/Kconfig   |   4 +
 hw/gpio/meson.build   |   1 +
 5 files changed, 188 insertions(+)
 create mode 100644 include/hw/gpio/pcf8574.h
 create mode 100644 hw/gpio/pcf8574.c

diff --git a/MAINTAINERS b/MAINTAINERS
index 4d96f855de..72c23e3682 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2503,6 +2503,12 @@ S: Maintained
 F: hw/i2c/i2c_mux_pca954x.c
 F: include/hw/i2c/i2c_mux_pca954x.h
 
+pcf8574
+M: Dmitrii Sharikhin 
+S: Maintained
+F: hw/gpio/pcf8574.c
+F: include/hw/gpio/pcf8574.h
+
 Generic Loader
 M: Alistair Francis 
 S: Maintained
diff --git a/include/hw/gpio/pcf8574.h b/include/hw/gpio/pcf8574.h
new file mode 100644
index 0000000000..3291d7dbbc
--- /dev/null
+++ b/include/hw/gpio/pcf8574.h
@@ -0,0 +1,15 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+
+/*
+ * NXP PCF8574 8-port I2C GPIO expansion chip.
+ *
+ * Copyright (c) 2024 KNS Group (YADRO).
+ * Written by Dmitrii Sharikhin 
+ */
+
+#ifndef _HW_GPIO_PCF8574
+#define _HW_GPIO_PCF8574
+
+#define TYPE_PCF8574 "pcf8574"
+
+#endif /* _HW_GPIO_PCF8574 */
diff --git a/hw/gpio/pcf8574.c b/hw/gpio/pcf8574.c
new file mode 100644
index 0000000000..d37909e2ad
--- /dev/null
+++ b/hw/gpio/pcf8574.c
@@ -0,0 +1,162 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+
+/*
+ * NXP PCF8574 8-port I2C GPIO expansion chip.
+ * Copyright (c) 2024 KNS Group (YADRO).
+ * Written by Dmitrii Sharikhin 
+ */
+
+#include "qemu/osdep.h"
+#include "hw/i2c/i2c.h"
+#include "hw/gpio/pcf8574.h"
+#include "hw/irq.h"
+#include "migration/vmstate.h"
+#include "qemu/log.h"
+#include "qemu/module.h"
+#include "qom/object.h"
+
+/*
+ * PCF8574 and compatible chips incorporate quasi-bidirectional
+ * IO. Electrically it means that device sustain pull-up to line
+ * unless IO port is configured as output _and_ driven low.
+ *
+ * IO access is implemented as simple I2C single-byte read
+ * or write operation. So, to configure line to input user write 1
+ * to corresponding bit. To configure line to output and drive it low
+ * user write 0 to corresponding bit.
+ *
+ * In essence, user can think of quasi-bidirectional IO as
+ * open-drain line, except presence of builtin rising edge acceleration
+ * embedded in PCF8574 IC
+ *
+ * PCF8574 has interrupt request line, which is being pulled down when
+ * port line state differs from last read. Port read operation clears
+ * state and INT line returns to high state via pullup.
+ */
+
+OBJECT_DECLARE_SIMPLE_TYPE(PCF8574State, PCF8574)
+
+#define PORTS_COUNT (8)
+
+struct PCF8574State {
+I2CSlave parent_obj;
+uint8_t  lastrq; /* Last requested state. If changed - assert irq */
+uint8_t  input;  /* external electrical line state */
+uint8_t  output; /* Pull-up (1) or drive low (0) on bit */
+qemu_irq handler[PORTS_COUNT];
+qemu_irq intrq;  /* External irq request */
+};
+
+static void pcf8574_reset(DeviceState *dev)
+{
+PCF8574State *s = PCF8574(dev);
+s->lastrq = MAKE_64BIT_MASK(0, PORTS_COUNT);
+s->input  = MAKE_64BIT_MASK(0, PORTS_COUNT);
+s->output = MAKE_64BIT_MASK(0, PORTS_COUNT);
+}
+
+static inline uint8_t pcf8574_line_state(PCF8574State *s)
+{
+/* we driving line low or external circuit does that */
+return s->input & s->output;
+}
+
+static uint8_t pcf8574_rx(I2CSlave *i2c)
+{
+PCF8574State *s = PCF8574(i2c);
+uint8_t linestate = pcf8574_line_state(s);
+if (s->lastrq != linestate) {
+s->lastrq = linestate;
+if (s->intrq) {
+qemu_set_irq(s->intrq, 1);
+}
+}
+return linestate;
+}
+
+static int pcf8574_tx(I2CSlave *i2c, uint8_t data)
+{
+PCF8574State *s = PCF8574(i2c);
+uint8_t prev;
+uint8_t diff;
+uint8_t actual;
+int line = 0;
+
+prev = pcf8574_line_state(s);
+s->output = data;
+actual = pcf8574_line_state(s);
+
+for (diff = (actual ^ prev); diff; diff &= ~(1 << line)) {
+line = ctz32(diff);
+if (s->handler[line]) {
+qemu_set_irq(s->handler[line], (actual >> line) & 1);
+}
+}
+
+if (s->intrq) {
+qemu_set_irq(s->intrq, actual == s->lastrq);
+}
+
+return 0;
+}
+
+static const VMStateDescription vmstate_pcf8574 = {
+.name   = "pcf8574",
+.version_id = 0,
+.minimum_version_id = 0,
+.fields = (VMStateField[]) {
+VMSTATE_I2C_SLAVE(parent_obj, PCF8574State),
+

[PULL 00/13] Misc HW patches for 2024-03-12

2024-03-12 Thread Philippe Mathieu-Daudé

The following changes since commit 7489f7f3f81dcb776df8c1b9a9db281fc21bf05f:

  Merge tag 'hw-misc-20240309' of https://github.com/philmd/qemu into staging (2024-03-09 20:12:21 +0000)

are available in the Git repository at:

  https://github.com/philmd/qemu.git tags/hw-misc-20240312

for you to fetch changes up to afc8b05cea14b2eea6f1eaa640f74b21486fca48:

  docs/about/deprecated.rst: Move SMP configurations item to system emulator 
section (2024-03-12 09:19:04 +0100)


Misc HW patch queue

- Rename hw/ide/ahci-internal.h for consistency (Zoltan)
- More convenient PCI hotplug trace events (Vladimir)
- Short CLI option to add drives for sam460ex machine (Zoltan)
- More missing ERRP_GUARD() macros (Zhao)
- Avoid faulting when unmapped I/O BAR is accessed on SPARC EBUS (Mark)
- Remove unused includes in hw/core/ (Zhao)
- New PCF8574 GPIO over I2C model (Dmitriy)
- Require ObjC on Darwin macOS by default (Peter)
- Corrected "-smp parameter=1" placement in docs/ (Zhao)



BALATON Zoltan (2):
  hw/ide/ahci: Rename ahci_internal.h to ahci-internal.h
  hw/ppc/sam460ex: Support short options for adding drives

Dmitriy Sharikhin (1):
  hw/gpio: introduce pcf8574 driver

Mark Cave-Ayland (1):
  sun4u: remap ebus BAR0 to use unassigned_io_ops instead of alias to
PCI IO space

Peter Maydell (1):
  meson.build: Always require an objc compiler on macos hosts

Vladimir Sementsov-Ogievskiy (1):
  hw/pci: add some convenient trace-events for pcie and shpc hotplug

Zhao Liu (7):
  hw/core/loader-fit: Fix missing ERRP_GUARD() for error_prepend()
  hw/core/qdev-properties-system: Fix missing ERRP_GUARD() for
error_prepend()
  hw/misc/ivshmem: Fix missing ERRP_GUARD() for error_prepend()
  hw/core: Cleanup unused included headers in cpu-common.c
  hw/core: Cleanup unused included header in machine-qmp-cmds.c
  hw/core: Cleanup unused included headers in numa.c
  docs/about/deprecated.rst: Move SMP configurations item to system
emulator section

 MAINTAINERS |   6 +
 docs/about/deprecated.rst   |  20 +--
 meson.build |   2 +-
 hw/ide/{ahci_internal.h => ahci-internal.h} |   0
 include/hw/gpio/pcf8574.h   |  15 ++
 hw/core/cpu-common.c|   4 -
 hw/core/loader-fit.c|   2 +
 hw/core/machine-qmp-cmds.c  |   1 -
 hw/core/numa.c  |   2 -
 hw/core/qdev-properties-system.c|   1 +
 hw/gpio/pcf8574.c   | 162 
 hw/ide/ahci.c   |   2 +-
 hw/ide/ich.c|   2 +-
 hw/misc/ivshmem.c   |   1 +
 hw/pci/pcie.c   |  56 +++
 hw/pci/shpc.c   |  46 ++
 hw/ppc/sam460ex.c   |  24 ++-
 hw/sparc64/sun4u.c  |   9 +-
 hw/gpio/Kconfig |   4 +
 hw/gpio/meson.build |   1 +
 hw/pci/trace-events |   6 +
 21 files changed, 339 insertions(+), 27 deletions(-)
 rename hw/ide/{ahci_internal.h => ahci-internal.h} (100%)
 create mode 100644 include/hw/gpio/pcf8574.h
 create mode 100644 hw/gpio/pcf8574.c

-- 
2.41.0

[PULL 06/13] hw/misc/ivshmem: Fix missing ERRP_GUARD() for error_prepend()

2024-03-12 Thread Philippe Mathieu-Daudé

From: Zhao Liu 

As the comment in qapi/error.h explains, passing @errp to error_prepend()
requires ERRP_GUARD():

* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
...
* - It should not be passed to error_prepend(), error_vprepend() or
*   error_append_hint(), because that doesn't work with &error_fatal.
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or &error_fatal.

ERRP_GUARD() avoids the case where @errp is &error_fatal and the user
never sees the additional information, because exit() happens in
error_setg() before the information is added [1].

The ivshmem_common_realize() passes @errp to error_prepend(), and as a
DeviceClass.realize method, there are too many possible callers to check
the impact of this defect; it may or may not be harmless. Thus it is
necessary to protect @errp with ERRP_GUARD().

To avoid the issue described in [1], add the missing ERRP_GUARD() at the
beginning of this function.
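
The ordering problem can be shown with a small toy model. Everything below
is a stand-in made up for illustration (ToyError, toy_fatal, toy_error_set()
and friends are not the qapi/error.h API), and exit() is modelled by setting
a flag so the behaviour can be observed. The "guarded" variant mimics what
ERRP_GUARD() arranges: work on a local error first, then propagate.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Toy stand-ins, not QEMU's qapi/error.h API. */
typedef struct { char msg[128]; } ToyError;

static ToyError toy_fatal;         /* sentinel playing the role of &error_fatal */
static char toy_last_report[128];  /* what the user would see before exit() */
static int toy_exited;             /* set where real code would call exit(1) */

/* Set an error; with the fatal sentinel, "report and exit" happens at once. */
static void toy_error_set(ToyError *errp, const char *msg)
{
    if (errp == &toy_fatal) {
        snprintf(toy_last_report, sizeof(toy_last_report), "%s", msg);
        toy_exited = 1;
        return;
    }
    snprintf(errp->msg, sizeof(errp->msg), "%s", msg);
}

/* Put a prefix in front of an already-set error message. */
static void toy_error_prepend(ToyError *errp, const char *prefix)
{
    char tmp[128];

    snprintf(tmp, sizeof(tmp), "%s%s", prefix, errp->msg);
    memcpy(errp->msg, tmp, sizeof(tmp));
}

/* Without a guard: the process is gone before the prefix can be added. */
static void realize_unguarded(ToyError *errp)
{
    toy_error_set(errp, "open failed");
    if (toy_exited) {
        return;                    /* models that exit() never returns */
    }
    toy_error_prepend(errp, "ivshmem: ");
}

/* With a guard: build the full message locally, then hand it over. */
static void realize_guarded(ToyError *errp)
{
    ToyError local = { "" };

    toy_error_set(&local, "open failed");
    toy_error_prepend(&local, "ivshmem: ");
    toy_error_set(errp, local.msg);   /* propagate to the caller's errp */
}
```

In the unguarded case the reported message lacks the prefix; in the guarded
case the prefix is part of what gets reported.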

[1]: Issue description in the commit message of commit ae7c80a7bd73
 ("error: New macro ERRP_GUARD()").

Cc: Juan Quintela 
Cc: Manos Pitsidianakis 
Cc: Michael Galaxy 
Cc: Steve Sistare 
Signed-off-by: Zhao Liu 
Message-ID: <20240311033822.3142585-17-zhao1@linux.intel.com>
Signed-off-by: Philippe Mathieu-Daudé 
---
 hw/misc/ivshmem.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/hw/misc/ivshmem.c b/hw/misc/ivshmem.c
index a2fd0bc365..de49d1b8a8 100644
--- a/hw/misc/ivshmem.c
+++ b/hw/misc/ivshmem.c
@@ -832,6 +832,7 @@ static void ivshmem_write_config(PCIDevice *pdev, uint32_t 
address,
 
 static void ivshmem_common_realize(PCIDevice *dev, Error **errp)
 {
+ERRP_GUARD();
 IVShmemState *s = IVSHMEM_COMMON(dev);
 Error *err = NULL;
 uint8_t *pci_conf;
-- 
2.41.0

[PULL 13/13] docs/about/deprecated.rst: Move SMP configurations item to system emulator section

2024-03-12 Thread Philippe Mathieu-Daudé

From: Zhao Liu 

In the commit 54c4ea8f3ae6 ("hw/core/machine-smp: Deprecate unsupported
'parameter=1' SMP configurations"), the SMP related item is put under
the section "User-mode emulator command line arguments" instead of
"System emulator command line arguments".

-smp is a system emulator command, so move SMP configurations item to
system emulator section.

Signed-off-by: Zhao Liu 
Reviewed-by: Thomas Huth 
Message-ID: <20240312071512.3283513-1-zhao1@linux.intel.com>
Signed-off-by: Philippe Mathieu-Daudé 
---
 docs/about/deprecated.rst | 20 ++--
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/docs/about/deprecated.rst b/docs/about/deprecated.rst
index dfd681cd02..2f9277c915 100644
--- a/docs/about/deprecated.rst
+++ b/docs/about/deprecated.rst
@@ -47,16 +47,6 @@ as short-form boolean values, and passed to plugins as 
``arg_name=on``.
 However, short-form booleans are deprecated and full explicit ``arg_name=on``
 form is preferred.
 
-User-mode emulator command line arguments
--
-
-``-p`` (since 9.0)
-''
-
-The ``-p`` option pretends to control the host page size.  However,
-it is not possible to change the host page size, and using the
-option only causes failures.
-
 ``-smp`` (Unsupported "parameter=1" SMP configurations) (since 9.0)
 '''
 
@@ -71,6 +61,16 @@ configurations (e.g. -smp drawers=1,books=1,clusters=1 for 
x86 PC machine) is
 marked deprecated since 9.0, users have to ensure that all the topology members
 described with -smp are supported by the target machine.
 
+User-mode emulator command line arguments
+-
+
+``-p`` (since 9.0)
+''
+
+The ``-p`` option pretends to control the host page size.  However,
+it is not possible to change the host page size, and using the
+option only causes failures.
+
 QEMU Machine Protocol (QMP) commands
 
 
-- 
2.41.0

Re: [PATCH 08/13] ppc/pnv: Set POWER9, POWER10 ibm,pa-features bits

2024-03-12 Thread Nicholas Piggin

On Tue Mar 12, 2024 at 6:06 PM AEST, Cédric Le Goater wrote:
> On 3/11/24 19:51, Nicholas Piggin wrote:
> > Copy the pa-features arrays from spapr, adjusting slightly as
> > described in comments.
> > 
> > Cc: "Cédric Le Goater" 
> > Cc: "Frédéric Barrat" 
> > Signed-off-by: Nicholas Piggin 
> > ---
> >   hw/ppc/pnv.c   | 67 --
> >   hw/ppc/spapr.c |  1 +
> >   2 files changed, 66 insertions(+), 2 deletions(-)
> > 
> > diff --git a/hw/ppc/pnv.c b/hw/ppc/pnv.c
> > index 52d964f77a..3e30c08420 100644
> > --- a/hw/ppc/pnv.c
> > +++ b/hw/ppc/pnv.c
> > @@ -332,6 +332,35 @@ static void pnv_chip_power8_dt_populate(PnvChip *chip, 
> > void *fdt)
> >   }
> >   }
> >   
> > +/*
> > + * Same as spapr pa_features_300 except pnv always enables CI largepages 
> > bit.
> > + */
> > +static const uint8_t pa_features_300[] = { 66, 0,
> > +/* 0: MMU|FPU|SLB|RUN|DABR|NX, 1: 
> > CILRG|fri[nzpm]|DABRX|SPRG3|SLB0|PP110 */
> > +/* 2: VPM|DS205|PPR|DS202|DS206, 3: LSD|URG, 5: LE|CFAR|EB|LSQ */
> > +0xf6, 0x3f, 0xc7, 0xc0, 0x00, 0xf0, /* 0 - 5 */
> > +/* 6: DS207 */
> > +0x80, 0x00, 0x00, 0x00, 0x00, 0x00, /* 6 - 11 */
> > +/* 16: Vector */
> > +0x00, 0x00, 0x00, 0x00, 0x80, 0x00, /* 12 - 17 */
> > +/* 18: Vec. Scalar, 20: Vec. XOR, 22: HTM */
> > +0x80, 0x00, 0x80, 0x00, 0x80, 0x00, /* 18 - 23 */
> > +/* 24: Ext. Dec, 26: 64 bit ftrs, 28: PM ftrs */
> > +0x80, 0x00, 0x80, 0x00, 0x80, 0x00, /* 24 - 29 */
> > +/* 32: LE atomic, 34: EBB + ext EBB */
> > +0x00, 0x00, 0x80, 0x00, 0xC0, 0x00, /* 30 - 35 */
> > +/* 40: Radix MMU */
> > +0x00, 0x00, 0x00, 0x00, 0x80, 0x00, /* 36 - 41 */
> > +/* 42: PM, 44: PC RA, 46: SC vec'd */
> > +0x80, 0x00, 0x80, 0x00, 0x80, 0x00, /* 42 - 47 */
> > +/* 48: SIMD, 50: QP BFP, 52: String */
> > +0x80, 0x00, 0x80, 0x00, 0x80, 0x00, /* 48 - 53 */
> > +/* 54: DecFP, 56: DecI, 58: SHA */
> > +0x80, 0x00, 0x80, 0x00, 0x80, 0x00, /* 54 - 59 */
> > +/* 60: NM atomic, 62: RNG */
> > +0x80, 0x00, 0x80, 0x00, 0x00, 0x00, /* 60 - 65 */
> > +};
> > +
> >   static void pnv_chip_power9_dt_populate(PnvChip *chip, void *fdt)
> >   {
> >   static const char compat[] = "ibm,power9-xscom\0ibm,xscom";
> > @@ -349,7 +378,7 @@ static void pnv_chip_power9_dt_populate(PnvChip *chip, 
> > void *fdt)
> >   offset = pnv_dt_core(chip, pnv_core, fdt);
> >   
> >   _FDT((fdt_setprop(fdt, offset, "ibm,pa-features",
> > -   pa_features_207, sizeof(pa_features_207))));
> > +   pa_features_300, sizeof(pa_features_300))));
> >   }
> >   
> >   if (chip->ram_size) {
> > @@ -359,6 +388,40 @@ static void pnv_chip_power9_dt_populate(PnvChip *chip, 
> > void *fdt)
> >   pnv_dt_lpc(chip, fdt, 0, PNV9_LPCM_BASE(chip), PNV9_LPCM_SIZE);
> >   }
> >   
> > +/*
> > + * Same as spapr pa_features_31 except pnv always enables CI largepages 
> > bit,
> > + * always disables copy/paste.
> > + */
> > +static const uint8_t pa_features_31[] = { 74, 0,
> > +/* 0: MMU|FPU|SLB|RUN|DABR|NX, 1: 
> > CILRG|fri[nzpm]|DABRX|SPRG3|SLB0|PP110 */
> > +/* 2: VPM|DS205|PPR|DS202|DS206, 3: LSD|URG, 5: LE|CFAR|EB|LSQ */
> > +0xf6, 0x3f, 0xc7, 0xc0, 0x00, 0xf0, /* 0 - 5 */
> > +/* 6: DS207 */
> > +0x80, 0x00, 0x00, 0x00, 0x00, 0x00, /* 6 - 11 */
> > +/* 16: Vector */
> > +0x00, 0x00, 0x00, 0x00, 0x80, 0x00, /* 12 - 17 */
> > +/* 18: Vec. Scalar, 20: Vec. XOR */
> > +0x80, 0x00, 0x80, 0x00, 0x00, 0x00, /* 18 - 23 */
> > +/* 24: Ext. Dec, 26: 64 bit ftrs, 28: PM ftrs */
> > +0x80, 0x00, 0x80, 0x00, 0x80, 0x00, /* 24 - 29 */
> > +/* 32: LE atomic, 34: EBB + ext EBB */
> > +0x00, 0x00, 0x80, 0x00, 0xC0, 0x00, /* 30 - 35 */
> > +/* 40: Radix MMU */
> > +0x00, 0x00, 0x00, 0x00, 0x80, 0x00, /* 36 - 41 */
> > +/* 42: PM, 44: PC RA, 46: SC vec'd */
> > +0x80, 0x00, 0x80, 0x00, 0x80, 0x00, /* 42 - 47 */
> > +/* 48: SIMD, 50: QP BFP, 52: String */
> > +0x80, 0x00, 0x80, 0x00, 0x80, 0x00, /* 48 - 53 */
> > +/* 54: DecFP, 56: DecI, 58: SHA */
> > +0x80, 0x00, 0x80, 0x00, 0x80, 0x00, /* 54 - 59 */
> > +/* 60: NM atomic, 62: RNG */
> > +0x80, 0x00, 0x80, 0x00, 0x00, 0x00, /* 60 - 65 */
> > +/* 68: DEXCR[SBHE|IBRTPDUS|SRAPD|NPHIE|PHIE] */
> > +0x00, 0x00, 0xce, 0x00, 0x00, 0x00, /* 66 - 71 */
> > +/* 72: [P]HASHCHK */
> > +0x80, 0x00, /* 72 - 73 */
> > +};
> > +
> >   static void pnv_chip_power10_dt_populate(PnvChip *chip, void *fdt)
> >   {
> >   static const char compat[] = "ibm,power10-xscom\0ibm,xscom";
> > @@ -376,7 +439,7 @@ static void pnv_chip_power10_dt_populate(PnvChip *chip, 
> > void *fdt)
> >   offset = pnv_dt_core(chip, pnv_core, fdt);
> >   
> >   _FDT((fdt_setprop(fdt, offset, "ibm,pa-features",
> > -   pa_features_207, sizeof(pa_features_207))));
> > +

Re: [PATCH v4 11/24] net: Use virtual time for net announce

2024-03-12 Thread Pavel Dovgalyuk


This won't work as needed. The announce timer can't be enabled, because
it is set in the post_load function. Therefore the announce callbacks
break replay when virtio-net is used with snapshots.

On 11.03.2024 20:40, Nicholas Piggin wrote:

Using virtual time for announce ensures that guest visible effects
are deterministic and don't break replay.

Signed-off-by: Nicholas Piggin 
---
  net/announce.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/announce.c b/net/announce.c
index 9e99044422..70b5d5e822 100644
--- a/net/announce.c
+++ b/net/announce.c
@@ -187,7 +187,7 @@ static void qemu_announce_self_once(void *opaque)
  
  void qemu_announce_self(AnnounceTimer *timer, AnnounceParameters *params)

  {
-qemu_announce_timer_reset(timer, params, QEMU_CLOCK_REALTIME,
+qemu_announce_timer_reset(timer, params, QEMU_CLOCK_VIRTUAL,
qemu_announce_self_once, timer);
  if (params->rounds) {
  qemu_announce_self_once(timer);

Re: [PATCH 2/2] migration: Fix error handling after dup in file migration

2024-03-12 Thread Daniel P . Berrangé

On Mon, Mar 11, 2024 at 08:33:35PM -0300, Fabiano Rosas wrote:
> The file migration code was allowing a possible -1 from a failed call
> to dup() to propagate into the new QIOFileChannel::fd before checking
> for validity. Coverity doesn't like that, possibly due to the
> lseek(-1, ...) call that would ensue before returning from the channel
> creation routine.
> 
> Use the newly introduced qio_channel_file_dupfd() to properly check
> the return of dup() before proceeding.
> 
> Fixes: CID 1539961
> Fixes: CID 1539965
> Fixes: CID 1539960
> Fixes: 2dd7ee7a51 ("migration/multifd: Add incoming QIOChannelFile support")
> Fixes: decdc76772 ("migration/multifd: Add mapped-ram support to fd: URI")
> Reported-by: Peter Maydell 
> Signed-off-by: Fabiano Rosas 
> ---
>  migration/fd.c   |  9 -
>  migration/file.c | 14 +++---
>  2 files changed, 11 insertions(+), 12 deletions(-)
> 
> diff --git a/migration/fd.c b/migration/fd.c
> index d4ae72d132..4e2a63a73d 100644
> --- a/migration/fd.c
> +++ b/migration/fd.c
> @@ -80,6 +80,7 @@ static gboolean fd_accept_incoming_migration(QIOChannel 
> *ioc,
>  void fd_start_incoming_migration(const char *fdname, Error **errp)
>  {
>  QIOChannel *ioc;
> +QIOChannelFile *fioc;
>  int fd = monitor_fd_param(monitor_cur(), fdname, errp);
>  if (fd == -1) {
>  return;
> @@ -103,15 +104,13 @@ void fd_start_incoming_migration(const char *fdname, 
> Error **errp)
>  int channels = migrate_multifd_channels();
>  
>  while (channels--) {
> -ioc = QIO_CHANNEL(qio_channel_file_new_fd(dup(fd)));
> -
> -if (QIO_CHANNEL_FILE(ioc)->fd == -1) {
> -error_setg(errp, "Failed to duplicate fd %d", fd);
> +fioc = qio_channel_file_new_dupfd(fd, errp);
> +if (!fioc) {
>  return;
>  }
>  
>  qio_channel_set_name(ioc, "migration-fd-incoming");
> -qio_channel_add_watch_full(ioc, G_IO_IN,
> +qio_channel_add_watch_full(QIO_CHANNEL(fioc), G_IO_IN,
> fd_accept_incoming_migration,
> NULL, NULL,
> g_main_context_get_thread_default());

Nothing is free'ing the already created channels, if this while()
loop fails on the 2nd or later iterations.
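
One common way to handle this is to unwind whatever was created before the
failing iteration. The sketch below shows that pattern in isolation; it is
not a proposed fix for this patch. struct toy_channel and the toy_* helpers
are hypothetical stand-ins rather than the QIOChannel API, and the fail_at
parameter exists only to force a failure for demonstration.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a channel object; not the QIOChannel API. */
struct toy_channel { int fd; };

static int toy_live_channels;   /* number of channels currently allocated */

/* Create a channel; fail_at lets the caller force a failure at one index. */
static struct toy_channel *toy_channel_new(int index, int fail_at)
{
    struct toy_channel *c;

    if (index == fail_at) {
        return NULL;
    }
    c = malloc(sizeof(*c));
    if (c) {
        c->fd = index;
        toy_live_channels++;
    }
    return c;
}

static void toy_channel_free(struct toy_channel *c)
{
    free(c);
    toy_live_channels--;
}

/* Returns 0 on success, or -1 on failure with nothing left allocated. */
static int toy_create_all(struct toy_channel **chans, int n, int fail_at)
{
    for (int i = 0; i < n; i++) {
        chans[i] = toy_channel_new(i, fail_at);
        if (!chans[i]) {
            /* Unwind: release every channel created in earlier iterations. */
            while (i-- > 0) {
                toy_channel_free(chans[i]);
                chans[i] = NULL;
            }
            return -1;
        }
    }
    return 0;
}
```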

> diff --git a/migration/file.c b/migration/file.c
> index 164b079966..d458f48269 100644
> --- a/migration/file.c
> +++ b/migration/file.c
> @@ -58,12 +58,13 @@ bool file_send_channel_create(gpointer opaque, Error 
> **errp)
>  int fd = fd_args_get_fd();
>  
>  if (fd && fd != -1) {
> -ioc = qio_channel_file_new_fd(dup(fd));
> +ioc = qio_channel_file_new_dupfd(fd, errp);
>  } else {
>  ioc = qio_channel_file_new_path(outgoing_args.fname, flags, 0, errp);
> -if (!ioc) {
> -goto out;
> -}
> +}
> +
> +if (!ioc) {
> +goto out;
>  }
>  
>  multifd_channel_connect(opaque, QIO_CHANNEL(ioc));
> @@ -147,10 +148,9 @@ void file_start_incoming_migration(FileMigrationArgs 
> *file_args, Error **errp)
> NULL, NULL,
> g_main_context_get_thread_default());
>  
> -fioc = qio_channel_file_new_fd(dup(fioc->fd));
> +fioc = qio_channel_file_new_dupfd(fioc->fd, errp);
>  
> -if (!fioc || fioc->fd == -1) {
> -error_setg(errp, "Error creating migration incoming channel");
> +if (!fioc) {
>  break;
>  }
>  } while (++i < channels);

Again, nothing frees the already created channels when the loop fails on
the 2nd or a later iteration.

So a weak

  Reviewed-by: Daniel P. Berrangé 

on the basis that it fixes the bugs that it claims to fix, but there
are more bugs that still need fixing here.

With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

Re: [PATCH v9 11/21] i386/cpu: Decouple CPUID[0x1F] subleaf with specific topology level

2024-03-12 Thread Zhao Liu

On Mon, Mar 11, 2024 at 04:45:41PM +0800, Xiaoyao Li wrote:
> Date: Mon, 11 Mar 2024 16:45:41 +0800
> From: Xiaoyao Li 
> Subject: Re: [PATCH v9 11/21] i386/cpu: Decouple CPUID[0x1F] subleaf with
>  specific topology level
> 
> On 2/27/2024 6:32 PM, Zhao Liu wrote:
> > From: Zhao Liu 
> > 
> > At present, the subleaf 0x02 of CPUID[0x1F] is bound to the "die" level.
> > 
> > In fact, the specific topology level exposed in 0x1F depends on the
> > platform's support for extension levels (module, tile and die).
> > 
> > To help expose "module" level in 0x1F, decouple CPUID[0x1F] subleaf
> > with specific topology level.
> > 
> > Tested-by: Yongwei Ma 
> > Signed-off-by: Zhao Liu 
> 
> Reviewed-by: Xiaoyao Li 

Thanks!

> Besides, some nits below.
>

[snip]

> > +static void encode_topo_cpuid1f(CPUX86State *env, uint32_t count,
> > +X86CPUTopoInfo *topo_info,
> > +uint32_t *eax, uint32_t *ebx,
> > +uint32_t *ecx, uint32_t *edx)
> > +{
> > +X86CPU *cpu = env_archcpu(env);
> > +unsigned long level;
> > +uint32_t num_threads_next_level, offset_next_level;
> > +
> > +assert(count + 1 < CPU_TOPO_LEVEL_MAX);
> > +
> > +/*
> > + * Find the No.count topology levels in avail_cpu_topo bitmap.
> > + * Start from bit 0 (CPU_TOPO_LEVEL_INVALID).
> 
> AFAICS, it starts from bit 1 (CPU_TOPO_LEVEL_SMT). Because the initial value
> of level is CPU_TOPO_LEVEL_INVALID, but the first round of the loop is to
> find the valid bit starting from (level + 1).

Yes, this description is much clearer.

> > + */
> > +level = CPU_TOPO_LEVEL_INVALID;
> > +for (int i = 0; i <= count; i++) {
> > +level = find_next_bit(env->avail_cpu_topo,
> > +  CPU_TOPO_LEVEL_PACKAGE,
> > +  level + 1);
> > +
> > +/*
> > + * CPUID[0x1f] doesn't explicitly encode the package level,
> > + * and it just encode the invalid level (all fields are 0)
> > + * into the last subleaf of 0x1f.
> > + */
> 
> QEMU will never set bit CPU_TOPO_LEVEL_PACKAGE in env->avail_cpu_topo.

In the patch 9 [1], I set the CPU_TOPO_LEVEL_PACKAGE in bitmap. This
level is a basic topology level in general, so it's worth being set.

Only Intel's 0x1F has no corresponding type for it, so there I use it
as a termination condition for 0x1F encoding (not an error case).

[1]: 
https://lore.kernel.org/qemu-devel/20240227103231.1556302-10-zhao1@linux.intel.com/
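
For readers following the loop logic, here is a small stand-alone sketch of
the "walk the bitmap, stop at package" idea. It is only a model: the enum
values and the toy_* names are assumptions for illustration, and
toy_find_next_bit() is a simplified single-word version of a
find_next_bit()-style helper that returns the size when no bit is found.

```c
#include <assert.h>

/* Illustrative level numbering; the real enum lives in QEMU and may differ. */
enum toy_level {
    TOY_LEVEL_INVALID = 0,
    TOY_LEVEL_SMT,
    TOY_LEVEL_CORE,
    TOY_LEVEL_MODULE,
    TOY_LEVEL_DIE,
    TOY_LEVEL_PACKAGE,
    TOY_LEVEL_MAX
};

/* Return the first set bit at or above 'start', or 'size' if none is set. */
static unsigned toy_find_next_bit(unsigned long map, unsigned size,
                                  unsigned start)
{
    for (unsigned bit = start; bit < size; bit++) {
        if (map & (1UL << bit)) {
            return bit;
        }
    }
    return size;
}

/* Level reported by subleaf 'count'; reaching package ends the enumeration. */
static unsigned toy_subleaf_level(unsigned long avail, unsigned count)
{
    unsigned level = TOY_LEVEL_INVALID;

    for (unsigned i = 0; i <= count; i++) {
        /* The first search starts at bit 1 (SMT), since level + 1 == 1. */
        level = toy_find_next_bit(avail, TOY_LEVEL_PACKAGE, level + 1);
        if (level == TOY_LEVEL_PACKAGE) {
            return TOY_LEVEL_INVALID;   /* termination, not an error */
        }
    }
    return level;
}
```

With SMT, core, die and package available, subleaves 0, 1 and 2 map to SMT,
core and die, and subleaf 3 is the all-zero terminating entry.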

> So I think we should assert() it instead of fixing it silently.
> 
> > +if (level == CPU_TOPO_LEVEL_PACKAGE) {
> > +level = CPU_TOPO_LEVEL_INVALID;
> > +break;
> > +}
> > +}
> > +
> > +if (level == CPU_TOPO_LEVEL_INVALID) {
> > +num_threads_next_level = 0;
> > +offset_next_level = 0;
> > +} else {
> > +unsigned long next_level;
> 
> please define it at the beginning of the function. e.g.,

Okay, I'll put the declaration of "next_level" at the beginning of this
function, alongside the existing variable "level".

> 
> > +next_level = find_next_bit(env->avail_cpu_topo,
> > +   CPU_TOPO_LEVEL_PACKAGE,
> > +   level + 1);
> > +num_threads_next_level = num_threads_by_topo_level(topo_info,
> > +   next_level);
> > +offset_next_level = apicid_offset_by_topo_level(topo_info,
> > +next_level);
> > +}
> > +
> > +*eax = offset_next_level;
> > +*ebx = num_threads_next_level;
> > +*ebx &= 0xffff; /* The count doesn't need to be reliable. */
> 
> we can combine them together. e.g.,
> 
> *ebx = num_threads_next_level & 0xffff; /* ... */
> 
> > +*ecx = count & 0xff;
> > +*ecx |= cpuid1f_topo_type(level) << 8;
> 
> Ditto,
> 
> *ecx = count & 0xff | cpuid1f_topo_type(level) << 8;

OK, will combine these.

> > +*edx = cpu->apic_id;
> > +
> > +assert(!(*eax & ~0x1f));
> > +}
> > +

Re: [PATCH v4 00/25] migration: Improve error reporting

2024-03-12 Thread Cédric Le Goater


On 3/12/24 08:16, Cédric Le Goater wrote:

On 3/11/24 21:24, Peter Xu wrote:

On Fri, Mar 08, 2024 at 04:15:08PM +0800, Peter Xu wrote:

On Wed, Mar 06, 2024 at 02:34:15PM +0100, Cédric Le Goater wrote:

* [1-4] already queued in migration-next.
   migration: Report error when shutdown fails
   migration: Remove SaveStateHandler and LoadStateHandler typedefs
   migration: Add documentation for SaveVMHandlers
   migration: Do not call PRECOPY_NOTIFY_SETUP notifiers in case of error
* [5-9] are prerequisite changes in other components related to the
   migration save_setup() handler. They make sure a failure is not
   returned without setting an error.
   s390/stattrib: Add Error** argument to set_migrationmode() handler
   vfio: Always report an error in vfio_save_setup()
   migration: Always report an error in block_save_setup()
   migration: Always report an error in ram_save_setup()
   migration: Add Error** argument to vmstate_save()

* [10-15] are the core changes in migration and memory components to
   propagate an error reported in a save_setup() handler.

   migration: Add Error** argument to qemu_savevm_state_setup()
   migration: Add Error** argument to .save_setup() handler
   migration: Add Error** argument to .load_setup() handler


Further queued 5-12 in migration-staging (until here), thanks.


Just to keep a record: due to the virtio failover test failure and the
other block migration uncertainty in patch 7 (in which case we may want to
have a fix on sectors==0 case), I unqueued this chunk for 9.0.


ok. I will ask the block folks for help to understand if sectors==0
is also an error in the save_setup context. Maybe we can still
merge these in the 9.0 cycle.


I discussed with Kevin and sectors==0 is not an error case, the loop
should simply continue. That said, commit 66db46ca83b8 ("migration:
Deprecate block migration") would let us remove all that code in
the next cycle which is even simpler.

Thanks,

C.

Re: [PATCH v4 20/24] replay: simple auto-snapshot mode for record

2024-03-12 Thread Nicholas Piggin

On Tue Mar 12, 2024 at 7:00 PM AEST, Pavel Dovgalyuk wrote:
> On 11.03.2024 20:40, Nicholas Piggin wrote:
> > record makes an initial snapshot when the machine is created, to enable
> > reverse-debugging. Often the issue being debugged appears near the end of
> > the trace, so it is important for performance to keep snapshots close to
> > the end.
> > 
> > This implements a periodic snapshot mode that keeps a rolling set of
> > recent snapshots. This could be done by the debugger or other program
> > that talks QMP, but for setting up simple scenarios and tests, this is
> > more convenient.
> > 
> > Signed-off-by: Nicholas Piggin 
> > ---
> >   docs/system/replay.rst   |  5 
> >   include/sysemu/replay.h  | 11 
> >   replay/replay-snapshot.c | 57 
> >   replay/replay.c  | 27 +--
> >   system/vl.c  |  9 +++
> >   qemu-options.hx  |  9 +--
> >   6 files changed, 114 insertions(+), 4 deletions(-)
> > 
> > diff --git a/docs/system/replay.rst b/docs/system/replay.rst
> > index ca7c17c63d..1ae8614475 100644
> > --- a/docs/system/replay.rst
> > +++ b/docs/system/replay.rst
> > @@ -156,6 +156,11 @@ for storing VM snapshots. Here is the example of the 
> > command line for this:
> >   ``empty.qcow2`` drive does not connected to any virtual block device and 
> > used
> >   for VM snapshots only.
> >   
> > +``rrsnapmode`` can be used to select just an initial snapshot or periodic
> > +snapshots, with ``rrsnapcount`` specifying the number of periodic snapshots
> > +to maintain, and ``rrsnaptime`` the amount of run time in seconds between
> > +periodic snapshots.
> > +
> >   .. _network-label:
> >   
> >   Network devices
> > diff --git a/include/sysemu/replay.h b/include/sysemu/replay.h
> > index 8102fa54f0..92fa82842b 100644
> > --- a/include/sysemu/replay.h
> > +++ b/include/sysemu/replay.h
> > @@ -48,6 +48,17 @@ typedef enum ReplayCheckpoint ReplayCheckpoint;
> >   
> >   typedef struct ReplayNetState ReplayNetState;
> >   
> > +enum ReplaySnapshotMode {
> > +REPLAY_SNAPSHOT_MODE_INITIAL,
> > +REPLAY_SNAPSHOT_MODE_PERIODIC,
> > +};
>
> This should be defined in replay-internal.h, because it is internal for 
> replay.
>
> > +typedef enum ReplaySnapshotMode ReplaySnapshotMode;
> > +
> > +extern ReplaySnapshotMode replay_snapshot_mode;
> > +
> > +extern uint64_t replay_snapshot_periodic_delay;
> > +extern int replay_snapshot_periodic_nr_keep;
>
> These ones are internal too.

Okay for both.

>
> > +
> >   /* Name of the initial VM snapshot */
> >   extern char *replay_snapshot;
> >   
> > diff --git a/replay/replay-snapshot.c b/replay/replay-snapshot.c
> > index ccb4d89dda..762555feaa 100644
> > --- a/replay/replay-snapshot.c
> > +++ b/replay/replay-snapshot.c
> > @@ -70,6 +70,53 @@ void replay_vmstate_register(void)
> >   vmstate_register(NULL, 0, &vmstate_replay, &replay_state);
> >   }
> >   
> > +static QEMUTimer *replay_snapshot_timer;
> > +static int replay_snapshot_count;
> > +
> > +static void replay_snapshot_timer_cb(void *opaque)
> > +{
> > +Error *err = NULL;
> > +char *name;
> > +
> > +if (!replay_can_snapshot()) {
> > +/* Try again soon */
> > +timer_mod(replay_snapshot_timer,
> > +  qemu_clock_get_ms(QEMU_CLOCK_REALTIME) +
> > +  replay_snapshot_periodic_delay / 10);
> > +return;
> > +}
> > +
> > +name = g_strdup_printf("%s-%d", replay_snapshot, 
> > replay_snapshot_count);
> > +if (!save_snapshot(name,
> > +   true, NULL, false, NULL, &err)) {
> > +error_report_err(err);
> > +error_report("Could not create periodic snapshot "
> > + "for icount record, disabling");
> > +g_free(name);
> > +return;
> > +}
> > +g_free(name);
> > +replay_snapshot_count++;
> > +
> > +if (replay_snapshot_periodic_nr_keep >= 1 &&
> > +replay_snapshot_count > replay_snapshot_periodic_nr_keep) {
> > +int del_nr;
> > +
> > +del_nr = replay_snapshot_count - replay_snapshot_periodic_nr_keep 
> > - 1;
> > +name = g_strdup_printf("%s-%d", replay_snapshot, del_nr);
>
> Copy-paste of snapshot name format.

Yes good catch.
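
Something along these lines might do, as a rough sketch only. The helper
names are hypothetical, and plain snprintf() is used here just to keep the
example self-contained (the patch itself uses g_strdup_printf()).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: format "<base>-<nr>" in exactly one place. */
static void toy_snapshot_name(char *buf, size_t len, const char *base, int nr)
{
    snprintf(buf, len, "%s-%d", base, nr);
}

/*
 * After 'count' snapshots have been taken (numbered 0..count-1), return the
 * number of the snapshot to delete so that the newest 'nr_keep' remain, or
 * -1 if nothing needs deleting.
 */
static int toy_snapshot_to_delete(int count, int nr_keep)
{
    if (nr_keep >= 1 && count > nr_keep) {
        return count - nr_keep - 1;
    }
    return -1;
}
```

Both the save and the delete path could then build the name through the
same helper, so the format string only exists once.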

>
> > +if (!delete_snapshot(name, false, NULL, &err)) {
> > +error_report_err(err);
> > +error_report("Could not delete periodic snapshot "
> > + "for icount record");
> > +}
> > +g_free(name);
> > +}
> > +
> > +timer_mod(replay_snapshot_timer,
> > +  qemu_clock_get_ms(QEMU_CLOCK_REALTIME) +
> > +  replay_snapshot_periodic_delay);
> > +}
> > +
> >   void replay_vmstate_init(void)
> >   {
> >   Error *err = NULL;
> > @@ -82,6 +129,16 @@ void replay_vmstate_init(void)
> >   error_report("Could not create snapshot for icount 
> > record");
> >   exit(1);
> >   }
> > +
> > +

Re: [PATCH] disas/riscv: Further correction to LUI disassembly

2024-03-12 Thread Andrew Jones

On Mon, Mar 11, 2024 at 11:56:42AM -0700, Richard Bagley wrote:
> I have realized that *the patch is indeed a fix*, not a workaround.
> 
> In fact, the argument to LUI and AUIPC in assembly *must* be a number
> between [0x0, 0xfffff].
> RISC-V Assembly Programmer's Manual : Load Upper Immediate's Immediate
> 

I think that's just documenting the current behavior, but the behavior
(not accepting a signed decimal number for a signed immediate) doesn't
appear to be justified, so I think my suggestion in [1] still stands.
That said, I don't really have much of a horse in this race so if
somebody comes along and closes that BZ with a simple justification of
"we, the people that work on this stuff, agreed we prefer the range
[0x0, 0xfffff]", then I won't argue.

[1] https://sourceware.org/bugzilla/show_bug.cgi?id=30746

Thanks,
drew

> Signed decimal numbers are programmed as their two's complement.
> 
> I checked: neither GCC nor LLVM will assemble
> 
> > lui x1, -4
> 
> The LLVM compiler models the arguments to LUI and AUIPC as UIMM (unsigned
> immediate) or UIMM20 (20 bit unsigned immediate).
> 
> I should have checked this from the start. I jumped to the conclusion that
> both formats (signed decimal, two's complement) for negative arguments
> should be supported, and that I was encountering a bug.
> I apologize to all for the unnecessary back-and-forth.
> 
> I don't yet see a reason why llvm and gcc could not support a signed number
> in decimal format, perhaps requiring a pseudo-instruction.
> This might be desirable, if only in support of assembly programming.
> On the other hand, it is easy to make the conversion to a two's-complement
> number.
> 
> Richard
> 
> On Sat, Mar 9, 2024 at 4:01 AM Andrew Jones  wrote:
> 
> > On Fri, Mar 08, 2024 at 08:22:01PM -0800, Richard Bagley wrote:
> > > post-nack, one further comment:
> > >
> > > One could argue that this change also aligns QEMU with supporting tools
> > > (as Andrew observed), and it makes sense to merge this change into QEMU
> > > until those tools update to supporting signed decimal numbers with
> > > immediates.
> > >
> > > As it is, both the GNU assembler and the LLVM integrated assembler (or
> > > llvm-mc) throw an error with examples such as
> > > auipc s0, -17
> > >
> > > On the other hand, I have only seen this problem with the output of the
> > > COLLECT plug-in, not (as yet) with QEMU execution proper.
> > > If the problem is confined to COLLECT, perhaps the argument for aligning
> > > with other tools is not as strong.
> > >
> > > In the meantime, I have adjusted my change locally to include AUIPC, and
> > > written a substantive, and I hope, clear commit description.
> > > If you would like me to resubmit a patch with this updated change, please
> > > let me know.
> >
> > Since the patch is ready for posting, then it might as well be posted
> > (even if it may not get merged right away). If the issue arises again,
> > then we can refer to the latest proposed patch, which will be preserved
> > in the mail archives.
> >
> > Thanks,
> > drew
> >
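
The conversion between the two notations discussed in this thread can be sketched as follows. This is an illustrative sketch, not code from QEMU, binutils, or LLVM: the helper names are hypothetical, and it assumes only that the LUI/AUIPC immediate is a 20-bit field, so a signed value in [-524288, 524287] corresponds to an unsigned value in [0x0, 0xfffff].

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: convert a signed 20-bit value to the unsigned
 * two's-complement form that the assemblers accept for LUI/AUIPC.
 * Returns false if the value does not fit in 20 signed bits.
 */
static bool lui_imm_from_signed(int32_t value, uint32_t *imm20)
{
    if (value < -(1 << 19) || value > (1 << 19) - 1) {
        return false;
    }
    /* Unsigned conversion is modular, so masking keeps the low 20 bits. */
    *imm20 = (uint32_t)value & 0xfffffu;
    return true;
}

/*
 * Hypothetical inverse: interpret a 20-bit field as a signed value,
 * which is what a disassembler printing signed decimals would do.
 */
static int32_t lui_imm_to_signed(uint32_t imm20)
{
    assert(imm20 <= 0xfffffu);
    return (imm20 & 0x80000u) ? (int32_t)imm20 - (1 << 20) : (int32_t)imm20;
}
```

Under these assumptions, the `lui x1, -4` example above would be written as `lui x1, 0xffffc`, and `auipc s0, -17` as `auipc s0, 0xfffef`.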

Re: [PATCH] tests: Raise timeouts for bufferiszero and crypto-tlscredsx509

2024-03-12 Thread Daniel P . Berrangé

On Tue, Mar 12, 2024 at 11:08:15AM +0000, Peter Maydell wrote:
> On our gcov CI job, the bufferiszero and crypto-tlscredsx509
> tests time out occasionally, making the job flaky. Double the
> timeout on these two tests.
> 
> Cc: qemu-sta...@nongnu.org
> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2221
> Signed-off-by: Peter Maydell 
> ---
> cc stable just because it probably helps CI reliability there too
> ---
>  tests/unit/meson.build | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/tests/unit/meson.build b/tests/unit/meson.build
> index cae925c1325..30db3c418fa 100644
> --- a/tests/unit/meson.build
> +++ b/tests/unit/meson.build
> @@ -173,8 +173,9 @@ test_env.set('G_TEST_BUILDDIR', meson.current_build_dir())
>  
>  slow_tests = {
>'test-aio-multithread' : 120,
> +  'test-bufferiszero': 60,
>'test-crypto-block' : 300,
> -  'test-crypto-tlscredsx509': 45,
> +  'test-crypto-tlscredsx509': 90,
>'test-crypto-tlssession': 45,

I'd probably suggest bumping this to 90 too, as it is of a similar order
of CPU burn complexity to the other TLS test - both of them create
a huge number of certs for testing many scenarios.

Either way,

Reviewed-by: Daniel P. Berrangé 

With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|
