Re: [PATCH v2 3/7] util: Introduce ThreadContext user-creatable object

2022-10-10 Thread Markus Armbruster
David Hildenbrand  writes:

> Setting the CPU affinity of QEMU threads is a bit problematic, because
> QEMU doesn't always have permissions to set the CPU affinity itself,
> for example, after seccomp has been initialized by QEMU with:
> -sandbox enable=on,resourcecontrol=deny
>
> General information about CPU affinities can be found in the man page of
> taskset:
> CPU affinity is a scheduler property that "bonds" a process to a given
> set of CPUs on the system. The Linux scheduler will honor the given CPU
> affinity and the process will not run on any other CPUs.
>
> While upper layers are already aware of how to handle CPU affinities for
> long-lived threads like iothreads or vcpu threads, especially short-lived
> threads, as used for memory-backend preallocation, are more involved to
> handle. These threads are created on demand and upper layers are not even
> able to identify and configure them.
>
> Introduce the concept of a ThreadContext, that is essentially a thread
> used for creating new threads. All threads created via that context
> thread inherit the configured CPU affinity. Consequently, it's
> sufficient to create a ThreadContext and configure it once, and have all
> threads created via that ThreadContext inherit the same CPU affinity.
>
> The CPU affinity of a ThreadContext can be configured two ways:
>
> (1) Obtaining the thread id via the "thread-id" property and setting the
> CPU affinity manually.
>
> (2) Setting the "cpu-affinity" property and letting QEMU try to set the
> CPU affinity itself. This will fail if QEMU doesn't have permissions
> to do so anymore after seccomp was initialized.
>
> A simple QEMU example to set the CPU affinity to CPU 0,1,6,7 would be:
> qemu-system-x86_64 -S \
>   -object thread-context,id=tc1,cpu-affinity=0-1,cpu-affinity=6-7
>
> And we can query it via HMP/QMP:
> (qemu) qom-get tc1 cpu-affinity
> [
> 0,
> 1,
> 6,
> 7
> ]
>
> But note that due to dynamic library loading this example will not work
> before we actually make use of thread_context_create_thread() in QEMU
> code, because the type will otherwise not get registered.

What do you mean exactly by "not work"?  It's not "CLI option or HMP
command fails":

$ upstream-qemu -S -display none -nodefaults -monitor stdio -object thread-context,id=tc1,cpu-affinity=0-1,cpu-affinity=6-7
QEMU 7.1.50 monitor - type 'help' for more information
(qemu) qom-get tc1 cpu-affinity
[
0,
1,
6,
7
]
(qemu) info cpus
* CPU #0: thread_id=1670613

Even though the affinities refer to nonexistent CPUs :)

> A ThreadContext can be reused, simply by reconfiguring the CPU affinity.

So, when a thread is created, its affinity comes from its thread context
(if any).  When I later change the context's affinity, it does *not*
affect existing threads, only future ones.  Correct?

> Reviewed-by: Michal Privoznik 
> Signed-off-by: David Hildenbrand 
> ---
>  include/qemu/thread-context.h |  57 +++
>  qapi/qom.json |  17 +++
>  util/meson.build  |   1 +
>  util/oslib-posix.c|   1 +
>  util/thread-context.c | 278 ++
>  5 files changed, 354 insertions(+)
>  create mode 100644 include/qemu/thread-context.h
>  create mode 100644 util/thread-context.c
>
> diff --git a/include/qemu/thread-context.h b/include/qemu/thread-context.h
> new file mode 100644
> index 00..2ebd6b7fe1
> --- /dev/null
> +++ b/include/qemu/thread-context.h
> @@ -0,0 +1,57 @@
> +/*
> + * QEMU Thread Context
> + *
> + * Copyright Red Hat Inc., 2022
> + *
> + * Authors:
> + *  David Hildenbrand 
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> + * See the COPYING file in the top-level directory.
> + */
> +
> +#ifndef SYSEMU_THREAD_CONTEXT_H
> +#define SYSEMU_THREAD_CONTEXT_H
> +
> +#include "qapi/qapi-types-machine.h"
> +#include "qemu/thread.h"
> +#include "qom/object.h"
> +
> +#define TYPE_THREAD_CONTEXT "thread-context"
> +OBJECT_DECLARE_TYPE(ThreadContext, ThreadContextClass,
> +THREAD_CONTEXT)
> +
> +struct ThreadContextClass {
> +ObjectClass parent_class;
> +};
> +
> +struct ThreadContext {
> +/* private */
> +Object parent;
> +
> +/* private */
> +unsigned int thread_id;
> +QemuThread thread;
> +
> +/* Semaphore to wait for context thread action. */
> +QemuSemaphore sem;
> +/* Semaphore to wait for action in context thread. */
> +QemuSemaphore sem_thread;
> +/* Mutex to synchronize requests. */
> +QemuMutex mutex;
> +
> +/* Commands for the thread to execute. */
> +int thread_cmd;
> +void *thread_cmd_data;
> +
> +/* CPU affinity bitmap used for initialization. */
> +unsigned long *init_cpu_bitmap;
> +int init_cpu_nbits;
> +};
> +
> +void thread_context_create_thread(ThreadContext *tc, QemuThread *thread,
> +

Re: [PATCH v3 2/2] hw/intc: sifive_plic: change interrupt priority register to WARL field

2022-10-10 Thread Alistair Francis
On Mon, Oct 3, 2022 at 5:07 PM Clément Chigot  wrote:
>
> On Mon, Oct 3, 2022 at 6:14 AM Jim Shu  wrote:
> >
> > The PLIC spec [1] requires the interrupt source priority registers to be
> > WARL fields and the number of supported priorities to be a power of 2,
> > to simplify SW discovery.
> >
> > Existing QEMU RISC-V machines (e.g. shakti_c) don't strictly follow the
> > PLIC spec: their number of supported priorities is not a power of 2. So
> > only make each bit of the interrupt priority register a WARL field when
> > the number of supported priorities is a power of 2.
> >
> > [1] https://github.com/riscv/riscv-plic-spec/blob/master/riscv-plic.adoc#interrupt-priorities
> >
> > Signed-off-by: Jim Shu 

Acked-by: Alistair Francis 

Alistair

> > ---
> >  hw/intc/sifive_plic.c | 21 +++--
> >  1 file changed, 19 insertions(+), 2 deletions(-)
> >
> > diff --git a/hw/intc/sifive_plic.c b/hw/intc/sifive_plic.c
> > index f864efa761..c2dfacf028 100644
> > --- a/hw/intc/sifive_plic.c
> > +++ b/hw/intc/sifive_plic.c
> > @@ -180,7 +180,15 @@ static void sifive_plic_write(void *opaque, hwaddr addr, uint64_t value,
> >  if (addr_between(addr, plic->priority_base, plic->num_sources << 2)) {
> >  uint32_t irq = ((addr - plic->priority_base) >> 2) + 1;
> >
> > -if (value <= plic->num_priorities) {
> > +if (((plic->num_priorities + 1) & plic->num_priorities) == 0) {
> > +/*
> > + * if "num_priorities + 1" is power-of-2, make each register bit of
> > + * interrupt priority WARL (Write-Any-Read-Legal). Just filter
> > + * out the access to unsupported priority bits.
> > + */
> > +plic->source_priority[irq] = value % (plic->num_priorities + 1);
> > +sifive_plic_update(plic);
> > +} else if (value <= plic->num_priorities) {
> >  plic->source_priority[irq] = value;
> >  sifive_plic_update(plic);
> >  }
> > @@ -207,7 +215,16 @@ static void sifive_plic_write(void *opaque, hwaddr addr, uint64_t value,
> >  uint32_t contextid = (addr & (plic->context_stride - 1));
> >
> >  if (contextid == 0) {
> > -if (value <= plic->num_priorities) {
> > +if (((plic->num_priorities + 1) & plic->num_priorities) == 0) {
> > +/*
> > + * if "num_priorities + 1" is power-of-2, each register bit of
> > + * interrupt priority is WARL (Write-Any-Read-Legal). Just
> > + * filter out the access to unsupported priority bits.
> > + */
> > +plic->target_priority[addrid] = value %
> > +(plic->num_priorities + 1);
> > +sifive_plic_update(plic);
> > +} else if (value <= plic->num_priorities) {
> >  plic->target_priority[addrid] = value;
> >  sifive_plic_update(plic);
> >  }
> > --
> > 2.17.1
>
> Reviewed-by: Clément Chigot 
>
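
For readers following the WARL discussion, the write rule in the patch above can be sketched in isolation. This is only an illustration under stated assumptions: the function names are hypothetical, the register is a plain variable rather than the device state, and the modulus is widened to 64 bits so the sketch stays defined for any num_priorities.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * True when "num_priorities + 1" is a power of two, i.e. num_priorities
 * has the form 1, 3, 7, 15, ... (hypothetical helper, not QEMU code).
 */
static bool sketch_priority_is_warl(uint32_t num_priorities)
{
    return (((uint64_t)num_priorities + 1) & num_priorities) == 0;
}

/*
 * Apply a guest write to a priority register.
 * Returns true if the register was updated.
 */
static bool sketch_write_priority(uint32_t num_priorities, uint32_t value,
                                  uint32_t *reg)
{
    if (sketch_priority_is_warl(num_priorities)) {
        /* WARL: accept any write, keep only the supported bits. */
        *reg = (uint32_t)(value % ((uint64_t)num_priorities + 1));
        return true;
    }
    if (value <= num_priorities) {
        /* Not a power of two: only in-range writes are stored. */
        *reg = value;
        return true;
    }
    return false; /* out-of-range write is ignored */
}
```

With num_priorities = 7, writing 9 stores 1; with num_priorities = 5, writing 6 leaves the register unchanged.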



Re: [PATCH v3 0/2] Enhance maximum priority support of PLIC

2022-10-10 Thread Jim Shu
Gentle ping.

This series fixes the PLIC and aligns it with the spec.


On Mon, Oct 3, 2022 at 12:14 PM Jim Shu  wrote:
>
> This patchset fixes the hard-coded maximum priority of the interrupt
> priority register and also changes this register to a WARL field to align
> with the PLIC spec.
>
> Changelog:
>
> v3:
>   * fix the inverted power-of-2 max priority checking expression.
>
> v2:
>   * change interrupt priority register to WARL field.
>
> Jim Shu (2):
>   hw/intc: sifive_plic: fix hard-coded max priority level
>   hw/intc: sifive_plic: change interrupt priority register to WARL field
>
>  hw/intc/sifive_plic.c | 25 ++---
>  1 file changed, 22 insertions(+), 3 deletions(-)
>
> --
> 2.17.1
>



Re: [RFC PATCH v2 2/4] acpi: fadt: support revision 6.0 of the ACPI specification

2022-10-10 Thread Ani Sinha
On Mon, Oct 10, 2022 at 6:53 PM Miguel Luis  wrote:
>
> Update the Fixed ACPI Description Table (FADT) to revision 6.0 of the ACPI
> specification adding the field "Hypervisor Vendor Identity" that was missing.
>
> This field's description states the following: "64-bit identifier of
> hypervisor vendor. All bytes in this field are considered part of the
> vendor identity.
> These identifiers are defined independently by the vendors themselves,
> usually following the name of the hypervisor product. Version information
> should NOT be included in this field - this shall simply denote the vendor's
> name or identifier. Version information can be communicated through a
> supplemental vendor-specific hypervisor API. Firmware implementers would
> place zero bytes into this field, denoting that no hypervisor is present in
> the actual firmware."
>
> Hereupon, what should a valid identifier of a Hypervisor Vendor ID be and
> where should QEMU provide that information?
>
> In v1 [1] of this RFC it was suggested to keep this information in sync
> with the current acceleration name. This also seems to imply that QEMU,
> which generates the FADT table, and the FADT consumer need to agree on
> the values of this field.
>
> This version follows Ani Sinha's suggestion [2] of using "QEMU" for the
> hypervisor vendor ID.
>
> [1]: https://lists.nongnu.org/archive/html/qemu-devel/2022-10/msg00911.html
> [2]: https://lists.nongnu.org/archive/html/qemu-devel/2022-10/msg00989.html
>
> Signed-off-by: Miguel Luis 

Reviewed-by: Ani Sinha 

> ---
>  hw/acpi/aml-build.c  | 13 ++---
>  hw/arm/virt-acpi-build.c | 10 +-
>  2 files changed, 15 insertions(+), 8 deletions(-)
>
> diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
> index e6bfac95c7..42feb4d4d7 100644
> --- a/hw/acpi/aml-build.c
> +++ b/hw/acpi/aml-build.c
> @@ -2070,7 +2070,7 @@ void build_pptt(GArray *table_data, BIOSLinker *linker, MachineState *ms,
>  acpi_table_end(linker, );
>  }
>
> -/* build rev1/rev3/rev5.1 FADT */
> +/* build rev1/rev3/rev5.1/rev6.0 FADT */
>  void build_fadt(GArray *tbl, BIOSLinker *linker, const AcpiFadtData *f,
>  const char *oem_id, const char *oem_table_id)
>  {
> @@ -2193,8 +2193,15 @@ void build_fadt(GArray *tbl, BIOSLinker *linker, const AcpiFadtData *f,
>  /* SLEEP_STATUS_REG */
>  build_append_gas_from_struct(tbl, >sleep_sts);
>
> -/* TODO: extra fields need to be added to support revisions above rev5 */
> -assert(f->rev == 5);
> +if (f->rev == 5) {
> +goto done;
> +}
> +
> +/* Hypervisor Vendor Identity */
> +build_append_padded_str(tbl, "QEMU", 8, '\0');
> +
> +/* TODO: extra fields need to be added to support revisions above rev6 */
> +assert(f->rev == 6);
>
>  done:
>  acpi_table_end(linker, );
> diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
> index 9b3aee01bf..72bb6f61a5 100644
> --- a/hw/arm/virt-acpi-build.c
> +++ b/hw/arm/virt-acpi-build.c
> @@ -809,13 +809,13 @@ build_madt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
>  }
>
>  /* FADT */
> -static void build_fadt_rev5(GArray *table_data, BIOSLinker *linker,
> +static void build_fadt_rev6(GArray *table_data, BIOSLinker *linker,
>  VirtMachineState *vms, unsigned dsdt_tbl_offset)
>  {
> -/* ACPI v5.1 */
> +/* ACPI v6.0 */
>  AcpiFadtData fadt = {
> -.rev = 5,
> -.minor_ver = 1,
> +.rev = 6,
> +.minor_ver = 0,
>  .flags = 1 << ACPI_FADT_F_HW_REDUCED_ACPI,
>  .xdsdt_tbl_offset = _tbl_offset,
>  };
> @@ -945,7 +945,7 @@ void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables *tables)
>
>  /* FADT MADT PPTT GTDT MCFG SPCR DBG2 pointed to by RSDT */
>  acpi_add_table(table_offsets, tables_blob);
> -build_fadt_rev5(tables_blob, tables->linker, vms, dsdt);
> +build_fadt_rev6(tables_blob, tables->linker, vms, dsdt);
>
>  acpi_add_table(table_offsets, tables_blob);
>  build_madt(tables_blob, tables->linker, vms);
> --
> 2.37.3
>
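
As an aside on the "Hypervisor Vendor Identity" field: the patch emits the 8-byte field as "QEMU" followed by zero padding. A minimal sketch of that byte layout, assuming a plain buffer instead of the GArray that the real table builder appends to (sketch_padded_str is a hypothetical helper, and truncating an over-long string is a choice made only for this sketch):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy str into dst and fill the remainder of the maxlen-byte field
 * with pad. A longer string is truncated to maxlen bytes here.
 */
static void sketch_padded_str(unsigned char *dst, const char *str,
                              size_t maxlen, char pad)
{
    size_t len = strlen(str);

    if (len > maxlen) {
        len = maxlen;
    }
    memcpy(dst, str, len);
    memset(dst + len, pad, maxlen - len);
}
```

Calling sketch_padded_str(id, "QEMU", 8, '\0') leaves the bytes 'Q' 'E' 'M' 'U' 0 0 0 0 in id, which is the 64-bit identifier a FADT consumer would read.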



Re: [PATCH v7 08/18] accel/tcg: Introduce tlb_set_page_full

2022-10-10 Thread Alistair Francis
On Wed, Oct 5, 2022 at 1:11 AM Richard Henderson
 wrote:
>
> Now that we have collected all of the page data into
> CPUTLBEntryFull, provide an interface to record that
> all in one go, instead of using 4 arguments.  This interface
> allows CPUTLBEntryFull to be extended without having to
> change the number of arguments.
>
> Reviewed-by: Alex Bennée 
> Reviewed-by: Peter Maydell 
> Reviewed-by: Philippe Mathieu-Daudé 
> Signed-off-by: Richard Henderson 
> ---
>  include/exec/cpu-defs.h | 14 +++
>  include/exec/exec-all.h | 22 ++
>  accel/tcg/cputlb.c  | 51 ++---
>  3 files changed, 69 insertions(+), 18 deletions(-)
>
> diff --git a/include/exec/cpu-defs.h b/include/exec/cpu-defs.h
> index f70f54d850..5e12cc1854 100644
> --- a/include/exec/cpu-defs.h
> +++ b/include/exec/cpu-defs.h
> @@ -148,7 +148,21 @@ typedef struct CPUTLBEntryFull {
>   * + the offset within the target MemoryRegion (otherwise)
>   */
>  hwaddr xlat_section;
> +
> +/*
> + * @phys_addr contains the physical address in the address space
> + * given by cpu_asidx_from_attrs(cpu, @attrs).
> + */
> +hwaddr phys_addr;
> +
> +/* @attrs contains the memory transaction attributes for the page. */
>  MemTxAttrs attrs;
> +
> +/* @prot contains the complete protections for the page. */
> +uint8_t prot;
> +
> +/* @lg_page_size contains the log2 of the page size. */
> +uint8_t lg_page_size;
>  } CPUTLBEntryFull;
>
>  /*
> diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
> index d255d69bc1..b1b920a713 100644
> --- a/include/exec/exec-all.h
> +++ b/include/exec/exec-all.h
> @@ -257,6 +257,28 @@ void tlb_flush_range_by_mmuidx_all_cpus_synced(CPUState *cpu,
> uint16_t idxmap,
> unsigned bits);
>
> +/**
> + * tlb_set_page_full:
> + * @cpu: CPU context
> + * @mmu_idx: mmu index of the tlb to modify
> + * @vaddr: virtual address of the entry to add
> + * @full: the details of the tlb entry
> + *
> + * Add an entry to @cpu tlb index @mmu_idx.  All of the fields of
> + * @full must be filled, except for xlat_section, and constitute
> + * the complete description of the translated page.
> + *
> + * This is generally called by the target tlb_fill function after
> + * having performed a successful page table walk to find the physical
> + * address and attributes for the translation.
> + *
> + * At most one entry for a given virtual address is permitted. Only a
> + * single TARGET_PAGE_SIZE region is mapped; @full->lg_page_size is only
> + * used by tlb_flush_page.
> + */
> +void tlb_set_page_full(CPUState *cpu, int mmu_idx, target_ulong vaddr,
> +   CPUTLBEntryFull *full);
> +
>  /**
>   * tlb_set_page_with_attrs:
>   * @cpu: CPU to add this TLB entry for
> diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
> index e3ee4260bd..361078471b 100644
> --- a/accel/tcg/cputlb.c
> +++ b/accel/tcg/cputlb.c
> @@ -1095,16 +1095,16 @@ static void tlb_add_large_page(CPUArchState *env, int mmu_idx,
>  env_tlb(env)->d[mmu_idx].large_page_mask = lp_mask;
>  }
>
> -/* Add a new TLB entry. At most one entry for a given virtual address
> +/*
> + * Add a new TLB entry. At most one entry for a given virtual address
>   * is permitted. Only a single TARGET_PAGE_SIZE region is mapped, the
>   * supplied size is only used by tlb_flush_page.
>   *
>   * Called from TCG-generated code, which is under an RCU read-side
>   * critical section.
>   */
> -void tlb_set_page_with_attrs(CPUState *cpu, target_ulong vaddr,
> - hwaddr paddr, MemTxAttrs attrs, int prot,
> - int mmu_idx, target_ulong size)
> +void tlb_set_page_full(CPUState *cpu, int mmu_idx,
> +   target_ulong vaddr, CPUTLBEntryFull *full)
>  {
>  CPUArchState *env = cpu->env_ptr;
>  CPUTLB *tlb = env_tlb(env);
> @@ -1117,35 +1117,36 @@ void tlb_set_page_with_attrs(CPUState *cpu, target_ulong vaddr,
>  CPUTLBEntry *te, tn;
>  hwaddr iotlb, xlat, sz, paddr_page;
>  target_ulong vaddr_page;
> -int asidx = cpu_asidx_from_attrs(cpu, attrs);
> -int wp_flags;
> +int asidx, wp_flags, prot;
>  bool is_ram, is_romd;
>
>  assert_cpu_is_self(cpu);
>
> -if (size <= TARGET_PAGE_SIZE) {
> +if (full->lg_page_size <= TARGET_PAGE_BITS) {
>  sz = TARGET_PAGE_SIZE;
>  } else {
> -tlb_add_large_page(env, mmu_idx, vaddr, size);
> -sz = size;
> +sz = (hwaddr)1 << full->lg_page_size;
> +tlb_add_large_page(env, mmu_idx, vaddr, sz);
>  }
>  vaddr_page = vaddr & TARGET_PAGE_MASK;
> -paddr_page = paddr & TARGET_PAGE_MASK;
> +paddr_page = full->phys_addr & TARGET_PAGE_MASK;
>
> +prot = full->prot;
> +asidx = cpu_asidx_from_attrs(cpu, full->attrs);
>  section = address_space_translate_for_iotlb(cpu, 
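
The size handling in the hunk above replaces a byte count with a log2 value carried in the entry. A small sketch of that conversion, assuming a 4 KiB target page and a cut-down struct (SketchTLBEntryFull and SKETCH_PAGE_BITS are hypothetical stand-ins for CPUTLBEntryFull and TARGET_PAGE_BITS, and lg_page_size is assumed to be below 64):

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_BITS 12 /* assumed stand-in for TARGET_PAGE_BITS */

/* Cut-down stand-in for the fields the caller fills in. */
typedef struct SketchTLBEntryFull {
    uint64_t phys_addr;
    uint8_t prot;
    uint8_t lg_page_size;
} SketchTLBEntryFull;

/* Size used for flush tracking: never smaller than one target page. */
static uint64_t sketch_mapping_size(const SketchTLBEntryFull *full)
{
    if (full->lg_page_size <= SKETCH_PAGE_BITS) {
        return (uint64_t)1 << SKETCH_PAGE_BITS;
    }
    return (uint64_t)1 << full->lg_page_size;
}

/* Physical address rounded down to the start of its target page. */
static uint64_t sketch_paddr_page(const SketchTLBEntryFull *full)
{
    return full->phys_addr & ~(((uint64_t)1 << SKETCH_PAGE_BITS) - 1);
}
```

A caller describing a 2 MiB page would set lg_page_size to 21 and get 0x200000 back; anything at or below 12 is treated as a single 4 KiB page.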

[PATCH v5 2/3] hw/intc: Remove unused extioi system memory region of LoongArch

2022-10-10 Thread Xiaojuan Yang
Remove the unused extioi system memory region; only the extioi
iocsr memory region is supported now.

Signed-off-by: Xiaojuan Yang 
---
 hw/intc/loongarch_extioi.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/hw/intc/loongarch_extioi.c b/hw/intc/loongarch_extioi.c
index a703dd30de..44fbe977ae 100644
--- a/hw/intc/loongarch_extioi.c
+++ b/hw/intc/loongarch_extioi.c
@@ -296,9 +296,6 @@ static void loongarch_extioi_instance_init(Object *obj)
 qdev_init_gpio_out(DEVICE(obj), >parent_irq[cpu][pin], 1);
 }
 }
-memory_region_init_io(>extioi_system_mem, OBJECT(s), _ops,
-  s, "extioi_system_mem", 0x900);
-sysbus_init_mmio(SYS_BUS_DEVICE(dev), >extioi_system_mem);
 }
 
 static void loongarch_extioi_class_init(ObjectClass *klass, void *data)
-- 
2.31.1




[PATCH v5 1/3] hw/intc: Fix LoongArch extioi function

2022-10-10 Thread Xiaojuan Yang
When a CPU reads or writes the extioi COREISR register, it should
access its own copy of the register, so the index into 's->coreisr'
is the current CPU number. Use MemTxAttrs' requester_type and
requester_id to get the CPU index.

Based-on: <20220927141504.3886314-1-alex.ben...@linaro.org>
Signed-off-by: Xiaojuan Yang 
---
 hw/intc/loongarch_extioi.c  | 50 -
 hw/intc/trace-events|  5 ++--
 target/loongarch/iocsr_helper.c | 16 +--
 3 files changed, 42 insertions(+), 29 deletions(-)

diff --git a/hw/intc/loongarch_extioi.c b/hw/intc/loongarch_extioi.c
index 22803969bc..a703dd30de 100644
--- a/hw/intc/loongarch_extioi.c
+++ b/hw/intc/loongarch_extioi.c
@@ -17,7 +17,6 @@
 #include "migration/vmstate.h"
 #include "trace.h"
 
-
 static void extioi_update_irq(LoongArchExtIOI *s, int irq, int level)
 {
 int ipnum, cpu, found, irq_index, irq_mask;
@@ -68,44 +67,50 @@ static void extioi_setirq(void *opaque, int irq, int level)
 extioi_update_irq(s, irq, level);
 }
 
-static uint64_t extioi_readw(void *opaque, hwaddr addr, unsigned size)
+static MemTxResult extioi_readw(void *opaque, hwaddr addr, uint64_t *data,
+unsigned size, MemTxAttrs attrs)
 {
 LoongArchExtIOI *s = LOONGARCH_EXTIOI(opaque);
 unsigned long offset = addr & 0x;
-uint32_t index, cpu, ret = 0;
+uint32_t index, cpu;
 
 switch (offset) {
 case EXTIOI_NODETYPE_START ... EXTIOI_NODETYPE_END - 1:
 index = (offset - EXTIOI_NODETYPE_START) >> 2;
-ret = s->nodetype[index];
+*data = s->nodetype[index];
 break;
 case EXTIOI_IPMAP_START ... EXTIOI_IPMAP_END - 1:
 index = (offset - EXTIOI_IPMAP_START) >> 2;
-ret = s->ipmap[index];
+*data = s->ipmap[index];
 break;
 case EXTIOI_ENABLE_START ... EXTIOI_ENABLE_END - 1:
 index = (offset - EXTIOI_ENABLE_START) >> 2;
-ret = s->enable[index];
+*data = s->enable[index];
 break;
 case EXTIOI_BOUNCE_START ... EXTIOI_BOUNCE_END - 1:
 index = (offset - EXTIOI_BOUNCE_START) >> 2;
-ret = s->bounce[index];
+*data = s->bounce[index];
 break;
 case EXTIOI_COREISR_START ... EXTIOI_COREISR_END - 1:
-index = ((offset - EXTIOI_COREISR_START) & 0x1f) >> 2;
-cpu = ((offset - EXTIOI_COREISR_START) >> 8) & 0x3;
-ret = s->coreisr[cpu][index];
+index = (offset - EXTIOI_COREISR_START) >> 2;
+/* using attrs to get current cpu index */
+if (attrs.requester_type != MTRT_CPU) {
+trace_loongarch_extioi_badreadw(addr);
+return MEMTX_ACCESS_ERROR;
+}
+cpu = attrs.requester_id;
+*data = s->coreisr[cpu][index];
 break;
 case EXTIOI_COREMAP_START ... EXTIOI_COREMAP_END - 1:
 index = (offset - EXTIOI_COREMAP_START) >> 2;
-ret = s->coremap[index];
+*data = s->coremap[index];
 break;
 default:
 break;
 }
 
-trace_loongarch_extioi_readw(addr, ret);
-return ret;
+trace_loongarch_extioi_readw(addr, *data);
+return MEMTX_OK;
 }
 
 static inline void extioi_enable_irq(LoongArchExtIOI *s, int index,\
@@ -127,8 +132,9 @@ static inline void extioi_enable_irq(LoongArchExtIOI *s, int index,\
 }
 }
 
-static void extioi_writew(void *opaque, hwaddr addr,
-  uint64_t val, unsigned size)
+static MemTxResult extioi_writew(void *opaque, hwaddr addr,
+  uint64_t val, unsigned size,
+  MemTxAttrs attrs)
 {
 LoongArchExtIOI *s = LOONGARCH_EXTIOI(opaque);
 int i, cpu, index, old_data, irq;
@@ -183,8 +189,13 @@ static void extioi_writew(void *opaque, hwaddr addr,
 s->bounce[index] = val;
 break;
 case EXTIOI_COREISR_START ... EXTIOI_COREISR_END - 1:
-index = ((offset - EXTIOI_COREISR_START) & 0x1f) >> 2;
-cpu = ((offset - EXTIOI_COREISR_START) >> 8) & 0x3;
+index = (offset - EXTIOI_COREISR_START) >> 2;
+/* using attrs to get current cpu index */
+if (attrs.requester_type != MTRT_CPU) {
+trace_loongarch_extioi_badwritew(addr, val);
+return MEMTX_ACCESS_ERROR;
+}
+cpu = attrs.requester_id;
 old_data = s->coreisr[cpu][index];
 s->coreisr[cpu][index] = old_data & ~val;
 /* write 1 to clear interrrupt */
@@ -231,11 +242,12 @@ static void extioi_writew(void *opaque, hwaddr addr,
 default:
 break;
 }
+return MEMTX_OK;
 }
 
 static const MemoryRegionOps extioi_ops = {
-.read = extioi_readw,
-.write = extioi_writew,
+.read_with_attrs = extioi_readw,
+.write_with_attrs = extioi_writew,
 .impl.min_access_size = 4,
 .impl.max_access_size = 4,
 .valid.min_access_size = 4,
diff --git a/hw/intc/trace-events b/hw/intc/trace-events
index 0a90c1cdec..e4392c6eab 100644
--- a/hw/intc/trace-events
+++ b/hw/intc/trace-events
@@ -306,6 
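
The per-CPU COREISR access in the patch above can be sketched on its own. Everything here is a simplified stand-in: the Sketch* types only mimic the shape of MemTxAttrs and MemTxResult as used in the diff, the array dimensions are made up, and the bounds checks are an addition for the sketch rather than something the patch does.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the requester attributes used in the diff. */
typedef enum {
    SKETCH_MTRT_UNSPECIFIED,
    SKETCH_MTRT_CPU,
    SKETCH_MTRT_OTHER
} SketchRequesterType;

typedef struct {
    SketchRequesterType requester_type;
    uint16_t requester_id;
} SketchAttrs;

typedef enum {
    SKETCH_MEMTX_OK = 0,
    SKETCH_MEMTX_ACCESS_ERROR
} SketchResult;

#define SKETCH_NR_CPUS 4      /* assumed size, for illustration only */
#define SKETCH_NR_ISR_WORDS 8 /* assumed size, for illustration only */

typedef struct {
    uint32_t coreisr[SKETCH_NR_CPUS][SKETCH_NR_ISR_WORDS];
} SketchExtIOI;

/* Read the requesting CPU's own COREISR word. */
static SketchResult sketch_coreisr_read(const SketchExtIOI *s, uint32_t index,
                                        SketchAttrs attrs, uint64_t *data)
{
    if (attrs.requester_type != SKETCH_MTRT_CPU ||
        attrs.requester_id >= SKETCH_NR_CPUS ||
        index >= SKETCH_NR_ISR_WORDS) {
        return SKETCH_MEMTX_ACCESS_ERROR;
    }
    *data = s->coreisr[attrs.requester_id][index];
    return SKETCH_MEMTX_OK;
}

/* Write-1-to-clear on the requesting CPU's own COREISR word. */
static SketchResult sketch_coreisr_write(SketchExtIOI *s, uint32_t index,
                                         SketchAttrs attrs, uint64_t val)
{
    if (attrs.requester_type != SKETCH_MTRT_CPU ||
        attrs.requester_id >= SKETCH_NR_CPUS ||
        index >= SKETCH_NR_ISR_WORDS) {
        return SKETCH_MEMTX_ACCESS_ERROR;
    }
    s->coreisr[attrs.requester_id][index] &= ~(uint32_t)val;
    return SKETCH_MEMTX_OK;
}
```

The point of the design is that the CPU number is no longer decoded from the offset: two CPUs touching the same address see different words, and a non-CPU requester gets an access error.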

[PATCH v5 0/3] Add memmap and fix bugs for LoongArch

2022-10-10 Thread Xiaojuan Yang
This series adds a memmap table and fixes the extioi and ipi device
emulation for the LoongArch virt machine.

The 'Fix LoongArch extioi function' patch is based on:
20220927141504.3886314-1-alex.ben...@linaro.org

Changes for v5:
These changes follow Philippe Mathieu-Daude's advice.
1. Add trace_bad_read/write function when MemTxAttrs type is
   not MTRT_CPU in extioi_read/write().
2. Separate 'remove unused extioi system memory region' to a
   single patch.

Changes for v4: 
Add 'reviewed-by' tag in fixing ipi patch, and other changes
are the same as v3. 
1. Remove the memmap table patch from this series; it
   will not be applied until we have more than one machinestate.
2. Using MemTxAttrs' requester_type and requester_id
   to get current cpu index in loongarch extioi regs
   emulation.
   This patch based on: 
   20220927141504.3886314-1-alex.ben...@linaro.org
3. Rewrite the commit message of fixing ipi patch, and 
   add reviewed by tag in the patch.

Changes for v3: 
1. Remove the memmap table patch from this series; it
   will not be applied until we have more than one machinestate.
2. Using MemTxAttrs' requester_type and requester_id
   to get current cpu index in loongarch extioi regs
   emulation.
   This patch based on: 
   20220927141504.3886314-1-alex.ben...@linaro.org
3. Rewrite the commit message of fixing ipi patch, and 
   this patch has been reviewed.

Changes for v2: 
1. Adjust the position of 'PLATFORM' element in memmap table

Changes for v1: 
1. Add memmap table for LoongArch virt machine
2. Fix LoongArch extioi function
3. Fix LoongArch ipi device emulation

Xiaojuan Yang (3):
  hw/intc: Fix LoongArch extioi function
  hw/intc: Remove unused extioi system memory region of LoongArch
  hw/intc: Fix LoongArch ipi device emulation

 hw/intc/loongarch_extioi.c  | 53 +++--
 hw/intc/loongarch_ipi.c |  1 -
 hw/intc/trace-events|  5 ++--
 target/loongarch/iocsr_helper.c | 16 +-
 4 files changed, 42 insertions(+), 33 deletions(-)

-- 
2.31.1




[PATCH v4 24/24] target/arm: Use the max page size in a 2-stage ptw

2022-10-10 Thread Richard Henderson
We had only been reporting the stage2 page size.  This causes
problems if stage1 is using a larger page size (16k, 2M, etc),
but stage2 is using a smaller page size, because cputlb does
not set large_page_{addr,mask} properly.

Fix by using the max of the two page sizes.

Reported-by: Marc Zyngier 
Reviewed-by: Peter Maydell 
Signed-off-by: Richard Henderson 
---
 target/arm/ptw.c | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/target/arm/ptw.c b/target/arm/ptw.c
index 0dbbb7d4d4..b8934765ec 100644
--- a/target/arm/ptw.c
+++ b/target/arm/ptw.c
@@ -2584,7 +2584,7 @@ static bool get_phys_addr_twostage(CPUARMState *env, S1Translate *ptw,
ARMMMUFaultInfo *fi)
 {
 hwaddr ipa;
-int s1_prot;
+int s1_prot, s1_lgpgsz;
 bool is_secure = ptw->in_secure;
 bool ret, ipa_secure, s2walk_secure;
 ARMCacheAttrs cacheattrs1;
@@ -2620,6 +2620,7 @@ static bool get_phys_addr_twostage(CPUARMState *env, S1Translate *ptw,
  * Save the stage1 results so that we may merge prot and cacheattrs later.
  */
 s1_prot = result->f.prot;
+s1_lgpgsz = result->f.lg_page_size;
 cacheattrs1 = result->cacheattrs;
 memset(result, 0, sizeof(*result));
 
@@ -2634,6 +2635,14 @@ static bool get_phys_addr_twostage(CPUARMState *env, S1Translate *ptw,
 return ret;
 }
 
+/*
+ * Use the maximum of the S1 & S2 page size, so that invalidation
+ * of pages > TARGET_PAGE_SIZE works correctly.
+ */
+if (result->f.lg_page_size < s1_lgpgsz) {
+result->f.lg_page_size = s1_lgpgsz;
+}
+
 /* Combine the S1 and S2 cache attributes. */
 hcr = arm_hcr_el2_eff_secstate(env, is_secure);
 if (hcr & HCR_DC) {
-- 
2.34.1




[PATCH v4 21/24] target/arm: Consider GP an attribute in get_phys_addr_lpae

2022-10-10 Thread Richard Henderson
Both GP and DBM are in the upper attribute block.
Extend the computation of attrs to include them,
then simplify the setting of guarded.

Reviewed-by: Peter Maydell 
Signed-off-by: Richard Henderson 
---
 target/arm/ptw.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/target/arm/ptw.c b/target/arm/ptw.c
index 2227d2a2fd..8db635ca98 100644
--- a/target/arm/ptw.c
+++ b/target/arm/ptw.c
@@ -1079,7 +1079,6 @@ static bool get_phys_addr_lpae(CPUARMState *env, S1Translate *ptw,
 uint32_t el = regime_el(env, mmu_idx);
 uint64_t descaddrmask;
 bool aarch64 = arm_el_is_aa64(env, el);
-bool guarded = false;
 uint64_t descriptor;
 bool nstable;
 
@@ -1338,7 +1337,7 @@ static bool get_phys_addr_lpae(CPUARMState *env, S1Translate *ptw,
 descaddr &= ~(hwaddr)(page_size - 1);
 descaddr |= (address & (page_size - 1));
 /* Extract attributes from the descriptor */
-attrs = descriptor & (MAKE_64BIT_MASK(2, 10) | MAKE_64BIT_MASK(52, 12));
+attrs = descriptor & (MAKE_64BIT_MASK(2, 10) | MAKE_64BIT_MASK(50, 14));
 
 if (mmu_idx == ARMMMUIdx_Stage2 || mmu_idx == ARMMMUIdx_Stage2_S) {
 /* Stage 2 table descriptors do not include any attribute fields */
@@ -1346,7 +1345,6 @@ static bool get_phys_addr_lpae(CPUARMState *env, S1Translate *ptw,
 }
 /* Merge in attributes from table descriptors */
 attrs |= nstable << 5; /* NS */
-guarded = extract64(descriptor, 50, 1);  /* GP */
 if (param.hpd) {
 /* HPD disables all the table attributes except NSTable.  */
 goto skip_attrs;
@@ -1399,7 +1397,7 @@ static bool get_phys_addr_lpae(CPUARMState *env, S1Translate *ptw,
 
 /* When in aarch64 mode, and BTI is enabled, remember GP in the TLB.  */
 if (aarch64 && cpu_isar_feature(aa64_bti, cpu)) {
-result->f.guarded = guarded;
+result->f.guarded = extract64(attrs, 50, 1); /* GP */
 }
 
 if (mmu_idx == ARMMMUIdx_Stage2 || mmu_idx == ARMMMUIdx_Stage2_S) {
-- 
2.34.1
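
To see why widening the upper mask from (52, 12) to (50, 14) matters for GP, here is a small standalone sketch. The macro and helper are local re-definitions written for this example (SKETCH_MASK64 and sketch_extract64 are hypothetical names modelled on the MAKE_64BIT_MASK and extract64 uses in the diff), so treat it as an illustration rather than the QEMU code.

```c
#include <assert.h>
#include <stdint.h>

/* Mask of 'length' bits starting at bit 'shift' (1 <= length <= 64). */
#define SKETCH_MASK64(shift, length) \
    (((~0ULL) >> (64 - (length))) << (shift))

/* Extract 'length' bits of 'value' starting at bit 'start'. */
static uint64_t sketch_extract64(uint64_t value, int start, int length)
{
    return (value >> start) & (~0ULL >> (64 - length));
}

/* Mask before the patch: upper attributes start at bit 52. */
static uint64_t sketch_attrs_before(uint64_t descriptor)
{
    return descriptor & (SKETCH_MASK64(2, 10) | SKETCH_MASK64(52, 12));
}

/* Mask after the patch: upper attributes start at bit 50 (GP, DBM). */
static uint64_t sketch_attrs_after(uint64_t descriptor)
{
    return descriptor & (SKETCH_MASK64(2, 10) | SKETCH_MASK64(50, 14));
}
```

With bit 50 set in the descriptor, sketch_extract64(sketch_attrs_after(d), 50, 1) is 1 while the older mask would have dropped it, which is why the separate guarded variable was needed before this change.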




[PATCH v4 22/24] target/arm: Implement FEAT_HAFDBS, access flag portion

2022-10-10 Thread Richard Henderson
Perform the atomic update for hardware management of the access flag.

Signed-off-by: Richard Henderson 
---
v4: Raise permission fault if pte read-only and atomic update reqd.
Split out dirty bit portion.
Prepare for a single update for AF + DB.
---
 docs/system/arm/emulation.rst |   1 +
 target/arm/cpu64.c|   1 +
 target/arm/ptw.c  | 147 +++---
 3 files changed, 138 insertions(+), 11 deletions(-)

diff --git a/docs/system/arm/emulation.rst b/docs/system/arm/emulation.rst
index cfb4b0768b..580e67b190 100644
--- a/docs/system/arm/emulation.rst
+++ b/docs/system/arm/emulation.rst
@@ -32,6 +32,7 @@ the following architecture extensions:
 - FEAT_FlagM (Flag manipulation instructions v2)
 - FEAT_FlagM2 (Enhancements to flag manipulation instructions)
 - FEAT_GTG (Guest translation granule size)
+- FEAT_HAFDBS (Hardware management of the access flag and dirty bit state)
 - FEAT_HCX (Support for the HCRX_EL2 register)
 - FEAT_HPDS (Hierarchical permission disables)
 - FEAT_I8MM (AArch64 Int8 matrix multiplication instructions)
diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
index 85e0d1daf1..fe1369fe96 100644
--- a/target/arm/cpu64.c
+++ b/target/arm/cpu64.c
@@ -1165,6 +1165,7 @@ static void aarch64_max_initfn(Object *obj)
 cpu->isar.id_aa64mmfr0 = t;
 
 t = cpu->isar.id_aa64mmfr1;
+t = FIELD_DP64(t, ID_AA64MMFR1, HAFDBS, 1);   /* FEAT_HAFDBS, AF only */
 t = FIELD_DP64(t, ID_AA64MMFR1, VMIDBITS, 2); /* FEAT_VMID16 */
 t = FIELD_DP64(t, ID_AA64MMFR1, VH, 1);   /* FEAT_VHE */
 t = FIELD_DP64(t, ID_AA64MMFR1, HPDS, 1); /* FEAT_HPDS */
diff --git a/target/arm/ptw.c b/target/arm/ptw.c
index 8db635ca98..82b6ab029e 100644
--- a/target/arm/ptw.c
+++ b/target/arm/ptw.c
@@ -21,7 +21,9 @@ typedef struct S1Translate {
 bool in_secure;
 bool in_debug;
 bool out_secure;
+bool out_rw;
 bool out_be;
+hwaddr out_virt;
 hwaddr out_phys;
 void *out_host;
 } S1Translate;
@@ -240,6 +242,8 @@ static bool S1_ptw_translate(CPUARMState *env, S1Translate *ptw,
 uint8_t pte_attrs;
 bool pte_secure;
 
+ptw->out_virt = addr;
+
 if (unlikely(ptw->in_debug)) {
 /*
  * From gdbstub, do not use softmmu so that we don't modify the
@@ -267,6 +271,7 @@ static bool S1_ptw_translate(CPUARMState *env, S1Translate *ptw,
 pte_secure = s2.f.attrs.secure;
 }
 ptw->out_host = NULL;
+ptw->out_rw = false;
 } else {
 CPUTLBEntryFull *full;
 int flags;
@@ -281,6 +286,7 @@ static bool S1_ptw_translate(CPUARMState *env, S1Translate *ptw,
 goto fail;
 }
 ptw->out_phys = full->phys_addr;
+ptw->out_rw = full->prot & PROT_WRITE;
 pte_attrs = full->pte_attrs;
 pte_secure = full->attrs.secure;
 }
@@ -324,14 +330,16 @@ static uint32_t arm_ldl_ptw(CPUARMState *env, S1Translate *ptw,
 ARMMMUFaultInfo *fi)
 {
 CPUState *cs = env_cpu(env);
+void *host = ptw->out_host;
 uint32_t data;
 
-if (likely(ptw->out_host)) {
+if (likely(host)) {
 /* Page tables are in RAM, and we have the host address. */
+data = qatomic_read((uint32_t *)host);
 if (ptw->out_be) {
-data = ldl_be_p(ptw->out_host);
+data = be32_to_cpu(data);
 } else {
-data = ldl_le_p(ptw->out_host);
+data = le32_to_cpu(data);
 }
 } else {
 /* Page tables are in MMIO. */
@@ -357,15 +365,25 @@ static uint64_t arm_ldq_ptw(CPUARMState *env, S1Translate *ptw,
 ARMMMUFaultInfo *fi)
 {
 CPUState *cs = env_cpu(env);
+void *host = ptw->out_host;
 uint64_t data;
 
-if (likely(ptw->out_host)) {
+if (likely(host)) {
 /* Page tables are in RAM, and we have the host address. */
+#ifdef CONFIG_ATOMIC64
+data = qatomic_read__nocheck((uint64_t *)host);
 if (ptw->out_be) {
-data = ldq_be_p(ptw->out_host);
+data = be64_to_cpu(data);
 } else {
-data = ldq_le_p(ptw->out_host);
+data = le64_to_cpu(data);
 }
+#else
+if (ptw->out_be) {
+data = ldq_be_p(host);
+} else {
+data = ldq_le_p(host);
+}
+#endif
 } else {
 /* Page tables are in MMIO. */
 MemTxAttrs attrs = { .secure = ptw->out_secure };
@@ -386,6 +404,91 @@ static uint64_t arm_ldq_ptw(CPUARMState *env, S1Translate *ptw,
 return data;
 }
 
+static uint64_t arm_casq_ptw(CPUARMState *env, uint64_t old_val,
+ uint64_t new_val, S1Translate *ptw,
+ ARMMMUFaultInfo *fi)
+{
+uint64_t cur_val;
+void *host = ptw->out_host;
+
+if (unlikely(!host)) {
+fi->type = ARMFault_UnsuppAtomicUpdate;
+fi->s1ptw = true;
+return 0;
+}
+
+/*
+ * Raising a stage2 Protection fault 

[PATCH v4 20/24] target/arm: Don't shift attrs in get_phys_addr_lpae

2022-10-10 Thread Richard Henderson
Leave the upper and lower attributes in the place they originate
from in the descriptor.  Shifting them around is confusing, since
one cannot read the bit numbers out of the manual.  Also, new
attributes have been added which would alter the shifts.

Reviewed-by: Peter Maydell 
Signed-off-by: Richard Henderson 
---
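As an illustration of the point about bit numbers, here is a minimal,
self-contained sketch (not the QEMU code).  The helpers field64, mask64,
desc_attrs and the attr_* accessors are hypothetical stand-ins for
extract64 and MAKE_64BIT_MASK; the bit positions are the ones that
appear in the diff.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for extract64(): bits [start, start+length). */
static inline uint64_t field64(uint64_t value, unsigned start, unsigned length)
{
    assert(length > 0 && length <= 64 - start);
    return (value >> start) & (~0ULL >> (64 - length));
}

/* Hypothetical stand-in for MAKE_64BIT_MASK(shift, length). */
static inline uint64_t mask64(unsigned shift, unsigned length)
{
    assert(length > 0 && length <= 64 - shift);
    return (~0ULL >> (64 - length)) << shift;
}

/* Keep lower attributes [11:2] and upper attributes [63:52] in place. */
static inline uint64_t desc_attrs(uint64_t descriptor)
{
    return descriptor & (mask64(2, 10) | mask64(52, 12));
}

/* Accessors use the same bit numbers as the descriptor itself. */
static inline unsigned attr_af(uint64_t attrs)  { return field64(attrs, 10, 1); }
static inline unsigned attr_sh(uint64_t attrs)  { return field64(attrs, 8, 2); }
static inline unsigned attr_ap(uint64_t attrs)  { return field64(attrs, 6, 2); }
static inline unsigned attr_xn(uint64_t attrs)  { return field64(attrs, 54, 1); }
static inline unsigned attr_pxn(uint64_t attrs) { return field64(attrs, 53, 1); }
```

With this layout no mental shift is needed when comparing the accessors
with the descriptor format in the manual.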
 target/arm/ptw.c | 31 +++
 1 file changed, 15 insertions(+), 16 deletions(-)

diff --git a/target/arm/ptw.c b/target/arm/ptw.c
index acbf09cce8..2227d2a2fd 100644
--- a/target/arm/ptw.c
+++ b/target/arm/ptw.c
@@ -1071,7 +1071,7 @@ static bool get_phys_addr_lpae(CPUARMState *env, S1Translate *ptw,
 hwaddr descaddr, indexmask, indexmask_grainsize;
 uint32_t tableattrs;
 target_ulong page_size;
-uint32_t attrs;
+uint64_t attrs;
 int32_t stride;
 int addrsize, inputsize, outputsize;
 uint64_t tcr = regime_tcr(env, mmu_idx);
@@ -1338,49 +1338,48 @@ static bool get_phys_addr_lpae(CPUARMState *env, S1Translate *ptw,
 descaddr &= ~(hwaddr)(page_size - 1);
 descaddr |= (address & (page_size - 1));
 /* Extract attributes from the descriptor */
-attrs = extract64(descriptor, 2, 10)
-| (extract64(descriptor, 52, 12) << 10);
+attrs = descriptor & (MAKE_64BIT_MASK(2, 10) | MAKE_64BIT_MASK(52, 12));
 
 if (mmu_idx == ARMMMUIdx_Stage2 || mmu_idx == ARMMMUIdx_Stage2_S) {
 /* Stage 2 table descriptors do not include any attribute fields */
 goto skip_attrs;
 }
 /* Merge in attributes from table descriptors */
-attrs |= nstable << 3; /* NS */
+attrs |= nstable << 5; /* NS */
 guarded = extract64(descriptor, 50, 1);  /* GP */
 if (param.hpd) {
 /* HPD disables all the table attributes except NSTable.  */
 goto skip_attrs;
 }
-attrs |= extract32(tableattrs, 0, 2) << 11; /* XN, PXN */
+attrs |= extract64(tableattrs, 0, 2) << 53; /* XN, PXN */
 /*
  * The sense of AP[1] vs APTable[0] is reversed, as APTable[0] == 1
  * means "force PL1 access only", which means forcing AP[1] to 0.
  */
-attrs &= ~(extract32(tableattrs, 2, 1) << 4);   /* !APT[0] => AP[1] */
-attrs |= extract32(tableattrs, 3, 1) << 5;  /* APT[1] => AP[2] */
+attrs &= ~(extract64(tableattrs, 2, 1) << 6);   /* !APT[0] => AP[1] */
+attrs |= extract32(tableattrs, 3, 1) << 7;  /* APT[1] => AP[2] */
  skip_attrs:
 
 /*
  * Here descaddr is the final physical address, and attributes
  * are all in attrs.
  */
-if ((attrs & (1 << 8)) == 0) {
+if ((attrs & (1 << 10)) == 0) {
 /* Access flag */
 fi->type = ARMFault_AccessFlag;
 goto do_fault;
 }
 
-ap = extract32(attrs, 4, 2);
+ap = extract32(attrs, 6, 2);
 
 if (mmu_idx == ARMMMUIdx_Stage2 || mmu_idx == ARMMMUIdx_Stage2_S) {
 ns = mmu_idx == ARMMMUIdx_Stage2;
-xn = extract32(attrs, 11, 2);
+xn = extract64(attrs, 53, 2);
 result->f.prot = get_S2prot(env, ap, xn, s1_is_el0);
 } else {
-ns = extract32(attrs, 3, 1);
-xn = extract32(attrs, 12, 1);
-pxn = extract32(attrs, 11, 1);
+ns = extract32(attrs, 5, 1);
+xn = extract64(attrs, 54, 1);
+pxn = extract64(attrs, 53, 1);
 result->f.prot = get_S1prot(env, mmu_idx, aarch64, ap, ns, xn, pxn);
 }
 
@@ -1405,10 +1404,10 @@ static bool get_phys_addr_lpae(CPUARMState *env, S1Translate *ptw,
 
 if (mmu_idx == ARMMMUIdx_Stage2 || mmu_idx == ARMMMUIdx_Stage2_S) {
 result->cacheattrs.is_s2_format = true;
-result->cacheattrs.attrs = extract32(attrs, 0, 4);
+result->cacheattrs.attrs = extract32(attrs, 2, 4);
 } else {
 /* Index into MAIR registers for cache attributes */
-uint8_t attrindx = extract32(attrs, 0, 3);
+uint8_t attrindx = extract32(attrs, 2, 3);
 uint64_t mair = env->cp15.mair_el[regime_el(env, mmu_idx)];
 assert(attrindx <= 7);
 result->cacheattrs.is_s2_format = false;
@@ -1423,7 +1422,7 @@ static bool get_phys_addr_lpae(CPUARMState *env, S1Translate *ptw,
 if (param.ds) {
 result->cacheattrs.shareability = param.sh;
 } else {
-result->cacheattrs.shareability = extract32(attrs, 6, 2);
+result->cacheattrs.shareability = extract32(attrs, 8, 2);
 }
 
 result->f.phys_addr = descaddr;
-- 
2.34.1




[PATCH v4 19/24] target/arm: Fix fault reporting in get_phys_addr_lpae

2022-10-10 Thread Richard Henderson
Always overriding fi->type was incorrect, as we would not properly
propagate the fault type from S1_ptw_translate or arm_ldq_ptw.
Simplify things by providing a new label for a translation fault.
For other faults, store into fi directly.

Reviewed-by: Peter Maydell 
Signed-off-by: Richard Henderson 
---
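A minimal sketch of the two-label pattern described above, using
hypothetical Toy* types rather than ARMMMUFaultInfo.  The checks and the
address mask are simplified assumptions; only the label structure mirrors
the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef enum {
    TOY_FAULT_NONE,
    TOY_FAULT_TRANSLATION,
    TOY_FAULT_ADDRESS_SIZE,
    TOY_FAULT_ACCESS_FLAG,
} ToyFaultType;

typedef struct {
    ToyFaultType type;
    int level;
} ToyFaultInfo;

/* Returns true on fault, false on success, like get_phys_addr_lpae. */
static bool toy_check_descriptor(uint64_t descriptor, unsigned outputsize,
                                 int level, ToyFaultInfo *fi)
{
    /* Assumed output address field, bits [47:12]. */
    uint64_t descaddr = descriptor & 0x0000fffffffff000ULL;

    assert(outputsize < 64);

    if (!(descriptor & 1)) {
        /* Translation faults share one label that sets the type. */
        goto do_translation_fault;
    }
    if (descaddr >> outputsize) {
        /* Other faults store the type directly, then skip that label. */
        fi->type = TOY_FAULT_ADDRESS_SIZE;
        goto do_fault;
    }
    if (!(descriptor & (1ULL << 10))) {
        fi->type = TOY_FAULT_ACCESS_FLAG;
        goto do_fault;
    }
    return false;

 do_translation_fault:
    fi->type = TOY_FAULT_TRANSLATION;
 do_fault:
    fi->level = level;
    return true;
}
```

Because do_fault no longer writes the type, a type already stored by a
callee survives the jump.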
 target/arm/ptw.c | 31 +--
 1 file changed, 13 insertions(+), 18 deletions(-)

diff --git a/target/arm/ptw.c b/target/arm/ptw.c
index 9b767f8236..acbf09cce8 100644
--- a/target/arm/ptw.c
+++ b/target/arm/ptw.c
@@ -1065,8 +1065,6 @@ static bool get_phys_addr_lpae(CPUARMState *env, S1Translate *ptw,
 ARMCPU *cpu = env_archcpu(env);
 ARMMMUIdx mmu_idx = ptw->in_mmu_idx;
 bool is_secure = ptw->in_secure;
-/* Read an LPAE long-descriptor translation table. */
-ARMFaultType fault_type = ARMFault_Translation;
 uint32_t level;
 ARMVAParameters param;
 uint64_t ttbr;
@@ -1103,8 +1101,7 @@ static bool get_phys_addr_lpae(CPUARMState *env, S1Translate *ptw,
  * so our choice is to always raise the fault.
  */
 if (param.tsz_oob) {
-fault_type = ARMFault_Translation;
-goto do_fault;
+goto do_translation_fault;
 }
 
 addrsize = 64 - 8 * param.tbi;
@@ -1141,8 +1138,7 @@ static bool get_phys_addr_lpae(CPUARMState *env, S1Translate *ptw,
addrsize - inputsize);
 if (-top_bits != param.select) {
 /* The gap between the two regions is a Translation fault */
-fault_type = ARMFault_Translation;
-goto do_fault;
+goto do_translation_fault;
 }
 }
 
@@ -1168,7 +1164,7 @@ static bool get_phys_addr_lpae(CPUARMState *env, S1Translate *ptw,
  * Translation table walk disabled => Translation fault on TLB miss
  * Note: This is always 0 on 64-bit EL2 and EL3.
  */
-goto do_fault;
+goto do_translation_fault;
 }
 
 if (mmu_idx != ARMMMUIdx_Stage2 && mmu_idx != ARMMMUIdx_Stage2_S) {
@@ -1199,8 +1195,7 @@ static bool get_phys_addr_lpae(CPUARMState *env, S1Translate *ptw,
 if (param.ds && stride == 9 && sl2) {
 if (sl0 != 0) {
 level = 0;
-fault_type = ARMFault_Translation;
-goto do_fault;
+goto do_translation_fault;
 }
 startlevel = -1;
 } else if (!aarch64 || stride == 9) {
@@ -1219,8 +1214,7 @@ static bool get_phys_addr_lpae(CPUARMState *env, S1Translate *ptw,
 ok = check_s2_mmu_setup(cpu, aarch64, startlevel,
 inputsize, stride, outputsize);
 if (!ok) {
-fault_type = ARMFault_Translation;
-goto do_fault;
+goto do_translation_fault;
 }
 level = startlevel;
 }
@@ -1242,7 +1236,7 @@ static bool get_phys_addr_lpae(CPUARMState *env, S1Translate *ptw,
 descaddr |= extract64(ttbr, 2, 4) << 48;
 } else if (descaddr >> outputsize) {
 level = 0;
-fault_type = ARMFault_AddressSize;
+fi->type = ARMFault_AddressSize;
 goto do_fault;
 }
 
@@ -1296,7 +1290,7 @@ static bool get_phys_addr_lpae(CPUARMState *env, S1Translate *ptw,
 
 if (!(descriptor & 1) || (!(descriptor & 2) && (level == 3))) {
 /* Invalid, or the Reserved level 3 encoding */
-goto do_fault;
+goto do_translation_fault;
 }
 
 descaddr = descriptor & descaddrmask;
@@ -1314,7 +1308,7 @@ static bool get_phys_addr_lpae(CPUARMState *env, S1Translate *ptw,
 descaddr |= extract64(descriptor, 12, 4) << 48;
 }
 } else if (descaddr >> outputsize) {
-fault_type = ARMFault_AddressSize;
+fi->type = ARMFault_AddressSize;
 goto do_fault;
 }
 
@@ -1371,9 +1365,9 @@ static bool get_phys_addr_lpae(CPUARMState *env, S1Translate *ptw,
  * Here descaddr is the final physical address, and attributes
  * are all in attrs.
  */
-fault_type = ARMFault_AccessFlag;
 if ((attrs & (1 << 8)) == 0) {
 /* Access flag */
+fi->type = ARMFault_AccessFlag;
 goto do_fault;
 }
 
@@ -1390,8 +1384,8 @@ static bool get_phys_addr_lpae(CPUARMState *env, S1Translate *ptw,
 result->f.prot = get_S1prot(env, mmu_idx, aarch64, ap, ns, xn, pxn);
 }
 
-fault_type = ARMFault_Permission;
 if (!(result->f.prot & (1 << access_type))) {
+fi->type = ARMFault_Permission;
 goto do_fault;
 }
 
@@ -1436,8 +1430,9 @@ static bool get_phys_addr_lpae(CPUARMState *env, S1Translate *ptw,
 result->f.lg_page_size = ctz64(page_size);
 return false;
 
-do_fault:
-fi->type = fault_type;
+ do_translation_fault:
+fi->type = ARMFault_Translation;
+ do_fault:
 fi->level = level;
 /* Tag the error as S2 for failed S1 PTW at S2 or ordinary S2.  */
 fi->stage2 = fi->s1ptw || (mmu_idx == 

[PATCH v4 23/24] target/arm: Implement FEAT_HAFDBS, dirty bit portion

2022-10-10 Thread Richard Henderson
Perform the atomic update for hardware management of the dirty bit.

Signed-off-by: Richard Henderson 
---
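A minimal sketch of the descriptor adjustment only, assuming the bit
positions shown in the diff (DBM at bit 51; bit 7 is AP[2] at stage 1 and
S2AP[1] at stage 2).  The name toy_dirty_update and the TOY_* macros are
hypothetical, and the atomic write-back itself is not modelled here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_DESC_DBM  (1ULL << 51)  /* dirty bit modifier */
#define TOY_DESC_BIT7 (1ULL << 7)   /* AP[2] (stage 1) or S2AP[1] (stage 2) */

/*
 * Compute the descriptor value that a store would want written back.
 * Returns the input unchanged when hardware dirty management does not apply.
 */
static uint64_t toy_dirty_update(uint64_t descriptor, bool hd,
                                 bool is_store, bool stage2)
{
    if (!hd || !is_store || !(descriptor & TOY_DESC_DBM)) {
        return descriptor;
    }
    if (stage2) {
        /* Stage 2: set S2AP[1] to grant write permission. */
        return descriptor | TOY_DESC_BIT7;
    }
    /* Stage 1: clear AP[2] to remove the read-only restriction. */
    return descriptor & ~TOY_DESC_BIT7;
}
```

If the result differs from the descriptor that was loaded, the caller
would then attempt the atomic update.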
 target/arm/cpu64.c |  2 +-
 target/arm/ptw.c   | 20 
 2 files changed, 21 insertions(+), 1 deletion(-)

diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
index fe1369fe96..0732796559 100644
--- a/target/arm/cpu64.c
+++ b/target/arm/cpu64.c
@@ -1165,7 +1165,7 @@ static void aarch64_max_initfn(Object *obj)
 cpu->isar.id_aa64mmfr0 = t;
 
 t = cpu->isar.id_aa64mmfr1;
-t = FIELD_DP64(t, ID_AA64MMFR1, HAFDBS, 1);   /* FEAT_HAFDBS, AF only */
+t = FIELD_DP64(t, ID_AA64MMFR1, HAFDBS, 2);   /* FEAT_HAFDBS */
 t = FIELD_DP64(t, ID_AA64MMFR1, VMIDBITS, 2); /* FEAT_VMID16 */
 t = FIELD_DP64(t, ID_AA64MMFR1, VH, 1);   /* FEAT_VHE */
 t = FIELD_DP64(t, ID_AA64MMFR1, HPDS, 1); /* FEAT_HPDS */
diff --git a/target/arm/ptw.c b/target/arm/ptw.c
index 82b6ab029e..0dbbb7d4d4 100644
--- a/target/arm/ptw.c
+++ b/target/arm/ptw.c
@@ -1484,10 +1484,30 @@ static bool get_phys_addr_lpae(CPUARMState *env, S1Translate *ptw,
 ap = extract32(attrs, 6, 2);
 
 if (mmu_idx == ARMMMUIdx_Stage2 || mmu_idx == ARMMMUIdx_Stage2_S) {
+if (param.hd
+&& extract64(attrs, 51, 1)  /* DBM */
+&& access_type == MMU_DATA_STORE) {
+/*
+ * Pre-emptively set S2AP[1], so that we compute EXEC properly.
+ * C.f. AArch64.S2ApplyOutputPerms, which does the same thing.
+ */
+ap |= 2;
+new_descriptor |= 1ull << 7;
+}
 ns = mmu_idx == ARMMMUIdx_Stage2;
 xn = extract64(attrs, 53, 2);
 result->f.prot = get_S2prot(env, ap, xn, s1_is_el0);
 } else {
+if (param.hd
+&& extract64(attrs, 51, 1)  /* DBM */
+&& access_type == MMU_DATA_STORE) {
+/*
+ * Pre-emptively clear AP[2], so that we compute EXEC properly.
+ * C.f. AArch64.S1ApplyOutputPerms, which does the same thing.
+ */
+ap &= ~2;
+new_descriptor &= ~(1ull << 7);
+}
 ns = extract32(attrs, 5, 1);
 xn = extract64(attrs, 54, 1);
 pxn = extract64(attrs, 53, 1);
-- 
2.34.1




[PATCH v4 18/24] target/arm: Remove loop from get_phys_addr_lpae

2022-10-10 Thread Richard Henderson
The unconditional loop was used both to iterate over levels
and to control parsing of attributes.  Use an explicit goto
in both cases.

While this appears less clean for iterating over levels, we
will need to jump back into the middle of this loop for
atomic updates, which is even uglier.

Reviewed-by: Peter Maydell 
Signed-off-by: Richard Henderson 
---
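A toy sketch of the control flow, not the real walker: three levels, two
address bits per level, tables stored in one flat array.  The layout, the
TOY_* macros and toy_walk are hypothetical; only the next_level label and
the valid/table tests follow the shape of the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_VALID 1u  /* descriptor bit 0 */
#define TOY_TABLE 2u  /* descriptor bit 1: table (or page at level 3) */

/*
 * mem[] holds 4-entry tables; descriptor bits [15:8] carry either the
 * index of the next table in mem[] or the output value.
 * Returns true and fills *out on success, false on a translation fault.
 */
static bool toy_walk(const uint64_t *mem, unsigned va, unsigned *out)
{
    unsigned level = 1;
    unsigned base = 0;   /* root table starts at mem[0] */
    uint64_t desc;

    assert(mem && out);

 next_level:
    desc = mem[base + ((va >> (2 * (3 - level))) & 3)];

    if (!(desc & TOY_VALID) || (!(desc & TOY_TABLE) && level == 3)) {
        /* Invalid, or the reserved level 3 encoding. */
        return false;
    }
    if ((desc & TOY_TABLE) && level < 3) {
        /* Table entry: descend, replacing the old "continue". */
        base = (desc >> 8) & 0xff;
        level++;
        goto next_level;
    }
    /* Block entry at level 1 or 2, or page entry at level 3. */
    *out = (desc >> 8) & 0xff;
    return true;
}
```

A label like this can also be the target of a jump from later code, which
a for (;;) body cannot offer without extra state.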
 target/arm/ptw.c | 182 +++
 1 file changed, 91 insertions(+), 91 deletions(-)

diff --git a/target/arm/ptw.c b/target/arm/ptw.c
index d54e6ca938..9b767f8236 100644
--- a/target/arm/ptw.c
+++ b/target/arm/ptw.c
@@ -1082,6 +1082,8 @@ static bool get_phys_addr_lpae(CPUARMState *env, S1Translate *ptw,
 uint64_t descaddrmask;
 bool aarch64 = arm_el_is_aa64(env, el);
 bool guarded = false;
+uint64_t descriptor;
+bool nstable;
 
 /* TODO: This code does not support shareability levels. */
 if (aarch64) {
@@ -1274,99 +1276,97 @@ static bool get_phys_addr_lpae(CPUARMState *env, S1Translate *ptw,
  * bits at each step.
  */
 tableattrs = is_secure ? 0 : (1 << 4);
-for (;;) {
-uint64_t descriptor;
-bool nstable;
 
-descaddr |= (address >> (stride * (4 - level))) & indexmask;
-descaddr &= ~7ULL;
-nstable = extract32(tableattrs, 4, 1);
-if (!nstable) {
-/* Stage2_S -> Stage2 or Phys_S -> Phys_NS */
-ptw->in_ptw_idx &= ~1;
-ptw->in_secure = false;
-}
-if (!S1_ptw_translate(env, ptw, descaddr, fi)) {
-goto do_fault;
-}
-descriptor = arm_ldq_ptw(env, ptw, fi);
-if (fi->type != ARMFault_None) {
-goto do_fault;
-}
-
-if (!(descriptor & 1) ||
-(!(descriptor & 2) && (level == 3))) {
-/* Invalid, or the Reserved level 3 encoding */
-goto do_fault;
-}
-
-descaddr = descriptor & descaddrmask;
-
-/*
- * For FEAT_LPA and PS=6, bits [51:48] of descaddr are in [15:12]
- * of descriptor.  For FEAT_LPA2 and effective DS, bits [51:50] of
- * descaddr are in [9:8].  Otherwise, if descaddr is out of range,
- * raise AddressSizeFault.
- */
-if (outputsize > 48) {
-if (param.ds) {
-descaddr |= extract64(descriptor, 8, 2) << 50;
-} else {
-descaddr |= extract64(descriptor, 12, 4) << 48;
-}
-} else if (descaddr >> outputsize) {
-fault_type = ARMFault_AddressSize;
-goto do_fault;
-}
-
-if ((descriptor & 2) && (level < 3)) {
-/*
- * Table entry. The top five bits are attributes which may
- * propagate down through lower levels of the table (and
- * which are all arranged so that 0 means "no effect", so
- * we can gather them up by ORing in the bits at each level).
- */
-tableattrs |= extract64(descriptor, 59, 5);
-level++;
-indexmask = indexmask_grainsize;
-continue;
-}
-/*
- * Block entry at level 1 or 2, or page entry at level 3.
- * These are basically the same thing, although the number
- * of bits we pull in from the vaddr varies. Note that although
- * descaddrmask masks enough of the low bits of the descriptor
- * to give a correct page or table address, the address field
- * in a block descriptor is smaller; so we need to explicitly
- * clear the lower bits here before ORing in the low vaddr bits.
- */
-page_size = (1ULL << ((stride * (4 - level)) + 3));
-descaddr &= ~(hwaddr)(page_size - 1);
-descaddr |= (address & (page_size - 1));
-/* Extract attributes from the descriptor */
-attrs = extract64(descriptor, 2, 10)
-| (extract64(descriptor, 52, 12) << 10);
-
-if (mmu_idx == ARMMMUIdx_Stage2 || mmu_idx == ARMMMUIdx_Stage2_S) {
-/* Stage 2 table descriptors do not include any attribute fields */
-break;
-}
-/* Merge in attributes from table descriptors */
-attrs |= nstable << 3; /* NS */
-guarded = extract64(descriptor, 50, 1);  /* GP */
-if (param.hpd) {
-/* HPD disables all the table attributes except NSTable.  */
-break;
-}
-attrs |= extract32(tableattrs, 0, 2) << 11; /* XN, PXN */
-/*
- * The sense of AP[1] vs APTable[0] is reversed, as APTable[0] == 1
- * means "force PL1 access only", which means forcing AP[1] to 0.
- */
-attrs &= ~(extract32(tableattrs, 2, 1) << 4);   /* !APT[0] => AP[1] */
-attrs |= extract32(tableattrs, 3, 1) << 5;  /* APT[1] => AP[2] */
-break;
+ next_level:
+descaddr |= (address >> (stride * (4 - level))) & indexmask;
+descaddr &= ~7ULL;
+nstable = 

[PATCH v4 17/24] target/arm: Add ARMFault_UnsuppAtomicUpdate

2022-10-10 Thread Richard Henderson
This fault type is to be used with FEAT_HAFDBS when
the guest enables hw updates, but places the tables
in memory where atomic updates are unsupported.

Reviewed-by: Peter Maydell 
Signed-off-by: Richard Henderson 
---
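A small sketch of how such a fault might be raised and encoded.  The
status codes 0x30, 0x31 and 0x34 are the ones in the diff; the Toy* enum,
toy_fault_to_fsc and toy_can_update are hypothetical, and the "no host
pointer" condition is an assumption about where the tables live.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef enum {
    TOY_FAULT_TLB_CONFLICT,
    TOY_FAULT_UNSUPP_ATOMIC_UPDATE,
    TOY_FAULT_LOCKDOWN,
} ToyFaultType;

/* Long-descriptor fault status codes, as listed in the diff. */
static uint32_t toy_fault_to_fsc(ToyFaultType type)
{
    switch (type) {
    case TOY_FAULT_TLB_CONFLICT:
        return 0x30;
    case TOY_FAULT_UNSUPP_ATOMIC_UPDATE:
        return 0x31;
    case TOY_FAULT_LOCKDOWN:
        return 0x34;
    }
    assert(!"unknown fault type");
    return 0;
}

/*
 * An atomic descriptor update needs a host pointer into RAM.
 * Without one, report the new fault instead of updating.
 */
static bool toy_can_update(const void *host, uint32_t *fsc)
{
    if (host == NULL) {
        *fsc = toy_fault_to_fsc(TOY_FAULT_UNSUPP_ATOMIC_UPDATE);
        return false;
    }
    return true;
}
```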
 target/arm/internals.h | 4 
 1 file changed, 4 insertions(+)

diff --git a/target/arm/internals.h b/target/arm/internals.h
index 76ec7ee8cc..e195d771e0 100644
--- a/target/arm/internals.h
+++ b/target/arm/internals.h
@@ -338,6 +338,7 @@ typedef enum ARMFaultType {
 ARMFault_AsyncExternal,
 ARMFault_Debug,
 ARMFault_TLBConflict,
+ARMFault_UnsuppAtomicUpdate,
 ARMFault_Lockdown,
 ARMFault_Exclusive,
 ARMFault_ICacheMaint,
@@ -524,6 +525,9 @@ static inline uint32_t arm_fi_to_lfsc(ARMMMUFaultInfo *fi)
 case ARMFault_TLBConflict:
 fsc = 0x30;
 break;
+case ARMFault_UnsuppAtomicUpdate:
+fsc = 0x31;
+break;
 case ARMFault_Lockdown:
 fsc = 0x34;
 break;
-- 
2.34.1




[PATCH v4 16/24] target/arm: Move S1_ptw_translate outside arm_ld[lq]_ptw

2022-10-10 Thread Richard Henderson
Separate the S1 translation from the actual lookup.
This will enable LPAE hardware updates.

Reviewed-by: Peter Maydell 
Signed-off-by: Richard Henderson 
---
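A minimal sketch of the split, using hypothetical names (ToyWalk,
toy_translate, toy_ldl) and a plain byte array in place of guest RAM.
The translate step records where the descriptor lives; the load step only
reads through that record.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct ToyWalk {
    bool out_be;              /* descriptor is big-endian */
    const uint8_t *out_host;  /* set by toy_translate, used by toy_ldl */
} ToyWalk;

/* Step 1: translate.  Returns false if a 4-byte load would not fit. */
static bool toy_translate(ToyWalk *w, const uint8_t *ram, size_t ram_size,
                          size_t addr)
{
    if (addr > ram_size || ram_size - addr < 4) {
        w->out_host = NULL;
        return false;
    }
    w->out_host = ram + addr;
    return true;
}

/* Step 2: load.  No address parameter; it relies on the prior step. */
static uint32_t toy_ldl(const ToyWalk *w)
{
    const uint8_t *p = w->out_host;

    assert(p != NULL);
    if (w->out_be) {
        return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16)
             | ((uint32_t)p[2] << 8) | (uint32_t)p[3];
    }
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}
```

Keeping the translation result around is what would let a later store go
to the same location without translating again.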
 target/arm/ptw.c | 41 ++---
 1 file changed, 22 insertions(+), 19 deletions(-)

diff --git a/target/arm/ptw.c b/target/arm/ptw.c
index b2bfcfde9a..d54e6ca938 100644
--- a/target/arm/ptw.c
+++ b/target/arm/ptw.c
@@ -320,18 +320,12 @@ static bool S1_ptw_translate(CPUARMState *env, S1Translate *ptw,
 }
 
 /* All loads done in the course of a page table walk go through here. */
-static uint32_t arm_ldl_ptw(CPUARMState *env, S1Translate *ptw, hwaddr addr,
+static uint32_t arm_ldl_ptw(CPUARMState *env, S1Translate *ptw,
 ARMMMUFaultInfo *fi)
 {
 CPUState *cs = env_cpu(env);
 uint32_t data;
 
-if (!S1_ptw_translate(env, ptw, addr, fi)) {
-/* Failure. */
-assert(fi->s1ptw);
-return 0;
-}
-
 if (likely(ptw->out_host)) {
 /* Page tables are in RAM, and we have the host address. */
 if (ptw->out_be) {
@@ -359,18 +353,12 @@ static uint32_t arm_ldl_ptw(CPUARMState *env, S1Translate *ptw, hwaddr addr,
 return data;
 }
 
-static uint64_t arm_ldq_ptw(CPUARMState *env, S1Translate *ptw, hwaddr addr,
+static uint64_t arm_ldq_ptw(CPUARMState *env, S1Translate *ptw,
 ARMMMUFaultInfo *fi)
 {
 CPUState *cs = env_cpu(env);
 uint64_t data;
 
-if (!S1_ptw_translate(env, ptw, addr, fi)) {
-/* Failure. */
-assert(fi->s1ptw);
-return 0;
-}
-
 if (likely(ptw->out_host)) {
 /* Page tables are in RAM, and we have the host address. */
 if (ptw->out_be) {
@@ -527,7 +515,10 @@ static bool get_phys_addr_v5(CPUARMState *env, S1Translate *ptw,
 fi->type = ARMFault_Translation;
 goto do_fault;
 }
-desc = arm_ldl_ptw(env, ptw, table, fi);
+if (!S1_ptw_translate(env, ptw, table, fi)) {
+goto do_fault;
+}
+desc = arm_ldl_ptw(env, ptw, fi);
 if (fi->type != ARMFault_None) {
 goto do_fault;
 }
@@ -565,7 +556,10 @@ static bool get_phys_addr_v5(CPUARMState *env, S1Translate *ptw,
 /* Fine pagetable.  */
 table = (desc & 0xf000) | ((address >> 8) & 0xffc);
 }
-desc = arm_ldl_ptw(env, ptw, table, fi);
+if (!S1_ptw_translate(env, ptw, table, fi)) {
+goto do_fault;
+}
+desc = arm_ldl_ptw(env, ptw, fi);
 if (fi->type != ARMFault_None) {
 goto do_fault;
 }
@@ -650,7 +644,10 @@ static bool get_phys_addr_v6(CPUARMState *env, S1Translate *ptw,
 fi->type = ARMFault_Translation;
 goto do_fault;
 }
-desc = arm_ldl_ptw(env, ptw, table, fi);
+if (!S1_ptw_translate(env, ptw, table, fi)) {
+goto do_fault;
+}
+desc = arm_ldl_ptw(env, ptw, fi);
 if (fi->type != ARMFault_None) {
 goto do_fault;
 }
@@ -703,7 +700,10 @@ static bool get_phys_addr_v6(CPUARMState *env, S1Translate *ptw,
 ns = extract32(desc, 3, 1);
 /* Lookup l2 entry.  */
 table = (desc & 0xfc00) | ((address >> 10) & 0x3fc);
-desc = arm_ldl_ptw(env, ptw, table, fi);
+if (!S1_ptw_translate(env, ptw, table, fi)) {
+goto do_fault;
+}
+desc = arm_ldl_ptw(env, ptw, fi);
 if (fi->type != ARMFault_None) {
 goto do_fault;
 }
@@ -1286,7 +1286,10 @@ static bool get_phys_addr_lpae(CPUARMState *env, S1Translate *ptw,
 ptw->in_ptw_idx &= ~1;
 ptw->in_secure = false;
 }
-descriptor = arm_ldq_ptw(env, ptw, descaddr, fi);
+if (!S1_ptw_translate(env, ptw, descaddr, fi)) {
+goto do_fault;
+}
+descriptor = arm_ldq_ptw(env, ptw, fi);
 if (fi->type != ARMFault_None) {
 goto do_fault;
 }
-- 
2.34.1




[PATCH v4 15/24] target/arm: Extract HA and HD in aa64_va_parameters

2022-10-10 Thread Richard Henderson
Reviewed-by: Peter Maydell 
Signed-off-by: Richard Henderson 
---
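A minimal sketch of the extraction.  The bit positions (21/22 and 39/40)
and the "hd requires ha" rule come from the diff; the two_ranges flag,
the feature booleans and toy_hw_update are hypothetical, and mapping
21/22 to the single-range layout is an assumption, since the diff does
not show the branch condition.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    bool ha;  /* hardware access flag update enabled */
    bool hd;  /* hardware dirty state update enabled */
} ToyHwUpdate;

static ToyHwUpdate toy_hw_update(uint64_t tcr, bool two_ranges,
                                 bool feat_hafs, bool feat_hdbs)
{
    /* Assumed: two-range regimes use bits 39/40, single-range 21/22. */
    unsigned ha_bit = two_ranges ? 39 : 21;
    unsigned hd_bit = two_ranges ? 40 : 22;
    bool ha = ((tcr >> ha_bit) & 1) && feat_hafs;
    bool hd = ((tcr >> hd_bit) & 1) && feat_hdbs;

    /* Dirty state management is only effective when HA is also on. */
    return (ToyHwUpdate){ .ha = ha, .hd = ha && hd };
}
```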
 target/arm/internals.h | 2 ++
 target/arm/helper.c| 8 +++-
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/target/arm/internals.h b/target/arm/internals.h
index c3c3920ded..76ec7ee8cc 100644
--- a/target/arm/internals.h
+++ b/target/arm/internals.h
@@ -1041,6 +1041,8 @@ typedef struct ARMVAParameters {
 bool hpd: 1;
 bool tsz_oob: 1;  /* tsz has been clamped to legal range */
 bool ds : 1;
+bool ha : 1;
+bool hd : 1;
 ARMGranuleSize gran : 2;
 } ARMVAParameters;
 
diff --git a/target/arm/helper.c b/target/arm/helper.c
index c672903f43..4487957e5d 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -10470,7 +10470,7 @@ ARMVAParameters aa64_va_parameters(CPUARMState *env, uint64_t va,
ARMMMUIdx mmu_idx, bool data)
 {
 uint64_t tcr = regime_tcr(env, mmu_idx);
-bool epd, hpd, tsz_oob, ds;
+bool epd, hpd, tsz_oob, ds, ha, hd;
 int select, tsz, tbi, max_tsz, min_tsz, ps, sh;
 ARMGranuleSize gran;
 ARMCPU *cpu = env_archcpu(env);
@@ -10489,6 +10489,8 @@ ARMVAParameters aa64_va_parameters(CPUARMState *env, uint64_t va,
 epd = false;
 sh = extract32(tcr, 12, 2);
 ps = extract32(tcr, 16, 3);
+ha = extract32(tcr, 21, 1) && cpu_isar_feature(aa64_hafs, cpu);
+hd = extract32(tcr, 22, 1) && cpu_isar_feature(aa64_hdbs, cpu);
 ds = extract64(tcr, 32, 1);
 } else {
 /*
@@ -10510,6 +10512,8 @@ ARMVAParameters aa64_va_parameters(CPUARMState *env, uint64_t va,
 hpd = extract64(tcr, 42, 1);
 }
 ps = extract64(tcr, 32, 3);
+ha = extract64(tcr, 39, 1) && cpu_isar_feature(aa64_hafs, cpu);
+hd = extract64(tcr, 40, 1) && cpu_isar_feature(aa64_hdbs, cpu);
 ds = extract64(tcr, 59, 1);
 }
 
@@ -10581,6 +10585,8 @@ ARMVAParameters aa64_va_parameters(CPUARMState *env, uint64_t va,
 .hpd = hpd,
 .tsz_oob = tsz_oob,
 .ds = ds,
+.ha = ha,
+.hd = ha && hd,
 .gran = gran,
 };
 }
-- 
2.34.1




[PATCH v4 12/24] target/arm: Use bool consistently for get_phys_addr subroutines

2022-10-10 Thread Richard Henderson
The return type of the functions is already bool, but in a few
instances we used an integer type with the return statement.

Reviewed-by: Peter Maydell 
Signed-off-by: Richard Henderson 
---
 target/arm/ptw.c | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/target/arm/ptw.c b/target/arm/ptw.c
index dd6556560a..6c5ed56a10 100644
--- a/target/arm/ptw.c
+++ b/target/arm/ptw.c
@@ -2432,7 +2432,7 @@ static bool get_phys_addr_disabled(CPUARMState *env, target_ulong address,
 result->f.lg_page_size = TARGET_PAGE_BITS;
 result->cacheattrs.shareability = shareability;
 result->cacheattrs.attrs = memattr;
-return 0;
+return false;
 }
 
 static bool get_phys_addr_twostage(CPUARMState *env, S1Translate *ptw,
@@ -2443,9 +2443,8 @@ static bool get_phys_addr_twostage(CPUARMState *env, S1Translate *ptw,
 {
 hwaddr ipa;
 int s1_prot;
-int ret;
 bool is_secure = ptw->in_secure;
-bool ipa_secure, s2walk_secure;
+bool ret, ipa_secure, s2walk_secure;
 ARMCacheAttrs cacheattrs1;
 bool is_el0;
 uint64_t hcr;
@@ -2520,7 +2519,7 @@ static bool get_phys_addr_twostage(CPUARMState *env, S1Translate *ptw,
  && (ipa_secure
  || !(env->cp15.vtcr_el2 & (VTCR_NSA | VTCR_NSW))));
 
-return 0;
+return false;
 }
 
 static bool get_phys_addr_with_struct(CPUARMState *env, S1Translate *ptw,
-- 
2.34.1




[PATCH v4 11/24] target/arm: Split out get_phys_addr_twostage

2022-10-10 Thread Richard Henderson
Reviewed-by: Peter Maydell 
Signed-off-by: Richard Henderson 
---
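A sketch of the two security decisions made by the new function, with
the register bits passed in as plain booleans.  The function names and
the boolean parameters are hypothetical; the boolean expressions follow
the code in the diff.

```c
#include <assert.h>
#include <stdbool.h>

/* Which stage 2 walk to use after a stage 1 translation. */
static bool toy_s2walk_secure(bool is_secure, bool ipa_secure,
                              bool vstcr_sw, bool vtcr_nsw)
{
    if (!is_secure) {
        return false;
    }
    /* Select the control bit based on the NS result of the S1 walk. */
    return !(ipa_secure ? vstcr_sw : vtcr_nsw);
}

/* Whether the final output address is in secure PA space. */
static bool toy_output_secure(bool is_secure, bool ipa_secure,
                              bool vstcr_sa, bool vstcr_sw,
                              bool vtcr_nsa, bool vtcr_nsw)
{
    return is_secure
        && !(vstcr_sa || vstcr_sw)
        && (ipa_secure || !(vtcr_nsa || vtcr_nsw));
}
```

Passing the bits as booleans keeps the sketch independent of the actual
register layout, which is not shown in this patch.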
 target/arm/ptw.c | 191 +--
 1 file changed, 100 insertions(+), 91 deletions(-)

diff --git a/target/arm/ptw.c b/target/arm/ptw.c
index 8f41d285b7..dd6556560a 100644
--- a/target/arm/ptw.c
+++ b/target/arm/ptw.c
@@ -31,6 +31,13 @@ static bool get_phys_addr_lpae(CPUARMState *env, S1Translate *ptw,
GetPhysAddrResult *result, ARMMMUFaultInfo *fi)
 __attribute__((nonnull));
 
+static bool get_phys_addr_with_struct(CPUARMState *env, S1Translate *ptw,
+  target_ulong address,
+  MMUAccessType access_type,
+  GetPhysAddrResult *result,
+  ARMMMUFaultInfo *fi)
+__attribute__((nonnull));
+
 /* This mapping is common between ID_AA64MMFR0.PARANGE and TCR_ELx.{I}PS. */
 static const uint8_t pamax_map[] = {
 [0] = 32,
@@ -2428,6 +2435,94 @@ static bool get_phys_addr_disabled(CPUARMState *env, target_ulong address,
 return 0;
 }
 
+static bool get_phys_addr_twostage(CPUARMState *env, S1Translate *ptw,
+   target_ulong address,
+   MMUAccessType access_type,
+   GetPhysAddrResult *result,
+   ARMMMUFaultInfo *fi)
+{
+hwaddr ipa;
+int s1_prot;
+int ret;
+bool is_secure = ptw->in_secure;
+bool ipa_secure, s2walk_secure;
+ARMCacheAttrs cacheattrs1;
+bool is_el0;
+uint64_t hcr;
+
+ret = get_phys_addr_with_struct(env, ptw, address, access_type, result, fi);
+
+/* If S1 fails or S2 is disabled, return early.  */
+if (ret || regime_translation_disabled(env, ARMMMUIdx_Stage2, is_secure)) {
+return ret;
+}
+
+ipa = result->f.phys_addr;
+ipa_secure = result->f.attrs.secure;
+if (is_secure) {
+/* Select TCR based on the NS bit from the S1 walk. */
+s2walk_secure = !(ipa_secure
+  ? env->cp15.vstcr_el2 & VSTCR_SW
+  : env->cp15.vtcr_el2 & VTCR_NSW);
+} else {
+assert(!ipa_secure);
+s2walk_secure = false;
+}
+
+is_el0 = ptw->in_mmu_idx == ARMMMUIdx_Stage1_E0;
+ptw->in_mmu_idx = s2walk_secure ? ARMMMUIdx_Stage2_S : ARMMMUIdx_Stage2;
+ptw->in_secure = s2walk_secure;
+
+/*
+ * S1 is done, now do S2 translation.
+ * Save the stage1 results so that we may merge prot and cacheattrs later.
+ */
+s1_prot = result->f.prot;
+cacheattrs1 = result->cacheattrs;
+memset(result, 0, sizeof(*result));
+
+ret = get_phys_addr_lpae(env, ptw, ipa, access_type, is_el0, result, fi);
+fi->s2addr = ipa;
+
+/* Combine the S1 and S2 perms.  */
+result->f.prot &= s1_prot;
+
+/* If S2 fails, return early.  */
+if (ret) {
+return ret;
+}
+
+/* Combine the S1 and S2 cache attributes. */
+hcr = arm_hcr_el2_eff_secstate(env, is_secure);
+if (hcr & HCR_DC) {
+/*
+ * HCR.DC forces the first stage attributes to
+ *  Normal Non-Shareable,
+ *  Inner Write-Back Read-Allocate Write-Allocate,
+ *  Outer Write-Back Read-Allocate Write-Allocate.
+ * Do not overwrite Tagged within attrs.
+ */
+if (cacheattrs1.attrs != 0xf0) {
+cacheattrs1.attrs = 0xff;
+}
+cacheattrs1.shareability = 0;
+}
+result->cacheattrs = combine_cacheattrs(hcr, cacheattrs1,
+result->cacheattrs);
+
+/*
+ * Check if IPA translates to secure or non-secure PA space.
+ * Note that VSTCR overrides VTCR and {N}SW overrides {N}SA.
+ */
+result->f.attrs.secure =
+(is_secure
+ && !(env->cp15.vstcr_el2 & (VSTCR_SA | VSTCR_SW))
+ && (ipa_secure
+ || !(env->cp15.vtcr_el2 & (VTCR_NSA | VTCR_NSW))));
+
+return 0;
+}
+
 static bool get_phys_addr_with_struct(CPUARMState *env, S1Translate *ptw,
   target_ulong address,
   MMUAccessType access_type,
@@ -2441,99 +2536,13 @@ static bool get_phys_addr_with_struct(CPUARMState *env, S1Translate *ptw,
 if (mmu_idx != s1_mmu_idx) {
 /*
  * Call ourselves recursively to do the stage 1 and then stage 2
- * translations if mmu_idx is a two-stage regime.
+ * translations if mmu_idx is a two-stage regime, and EL2 present.
+ * Otherwise, a stage1+stage2 translation is just stage 1.
  */
+ptw->in_mmu_idx = mmu_idx = s1_mmu_idx;
 if (arm_feature(env, ARM_FEATURE_EL2)) {
-hwaddr ipa;
-int s1_prot;
-int ret;
-bool ipa_secure, s2walk_secure;
-ARMCacheAttrs cacheattrs1;
-bool is_el0;
-uint64_t hcr;
-
-

[PATCH v4 08/24] target/arm: Plumb debug into S1Translate

2022-10-10 Thread Richard Henderson
Before using softmmu page tables for the ptw, plumb down
a debug parameter so that we can query page table entries
from gdbstub without modifying cpu state.

Reviewed-by: Peter Maydell 
Signed-off-by: Richard Henderson 
---
v4: Add debug to S1Translate, and plumb the S1Translate structure down
from the very outside.  It means that S1Translate is now perhaps
mis-named, but it also eliminates the "secure_debug" function name.
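
A small sketch of the plumbing idea, with hypothetical names
(ToyTranslate, toy_entry, toy_nested): the outermost caller builds the
structure once, and any nested walk copies the debug flag from it.

```c
#include <assert.h>
#include <stdbool.h>

typedef struct ToyTranslate {
    int in_mmu_idx;
    bool in_secure;
    bool in_debug;  /* query only; must not modify cpu state */
} ToyTranslate;

/* Entry point for normal accesses: in_debug is zero-initialized. */
static ToyTranslate toy_entry(int mmu_idx, bool is_secure)
{
    return (ToyTranslate){
        .in_mmu_idx = mmu_idx,
        .in_secure = is_secure,
    };
}

/* A nested stage 2 walk inherits security state and the debug flag. */
static ToyTranslate toy_nested(const ToyTranslate *outer, int s2_mmu_idx)
{
    assert(outer);
    return (ToyTranslate){
        .in_mmu_idx = s2_mmu_idx,
        .in_secure = outer->in_secure,
        .in_debug = outer->in_debug,
    };
}
```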
---
 target/arm/ptw.c | 55 
 1 file changed, 37 insertions(+), 18 deletions(-)

diff --git a/target/arm/ptw.c b/target/arm/ptw.c
index dee69ee743..8fa0088d98 100644
--- a/target/arm/ptw.c
+++ b/target/arm/ptw.c
@@ -17,6 +17,7 @@
 typedef struct S1Translate {
 ARMMMUIdx in_mmu_idx;
 bool in_secure;
+bool in_debug;
 bool out_secure;
 hwaddr out_phys;
 } S1Translate;
@@ -230,6 +231,7 @@ static bool S1_ptw_translate(CPUARMState *env, S1Translate *ptw,
 S1Translate s2ptw = {
 .in_mmu_idx = s2_mmu_idx,
 .in_secure = is_secure,
+.in_debug = ptw->in_debug,
 };
 uint64_t hcr;
 int ret;
@@ -2370,13 +2372,15 @@ static bool get_phys_addr_disabled(CPUARMState *env, target_ulong address,
 return 0;
 }
 
-bool get_phys_addr_with_secure(CPUARMState *env, target_ulong address,
-   MMUAccessType access_type, ARMMMUIdx mmu_idx,
-   bool is_secure, GetPhysAddrResult *result,
-   ARMMMUFaultInfo *fi)
+static bool get_phys_addr_with_struct(CPUARMState *env, S1Translate *ptw,
+  target_ulong address,
+  MMUAccessType access_type,
+  GetPhysAddrResult *result,
+  ARMMMUFaultInfo *fi)
 {
+ARMMMUIdx mmu_idx = ptw->in_mmu_idx;
 ARMMMUIdx s1_mmu_idx = stage_1_mmu_idx(mmu_idx);
-S1Translate ptw;
+bool is_secure = ptw->in_secure;
 
 if (mmu_idx != s1_mmu_idx) {
 /*
@@ -2392,8 +2396,9 @@ bool get_phys_addr_with_secure(CPUARMState *env, target_ulong address,
 bool is_el0;
 uint64_t hcr;
 
-ret = get_phys_addr_with_secure(env, address, access_type,
-s1_mmu_idx, is_secure, result, fi);
+ptw->in_mmu_idx = s1_mmu_idx;
+ret = get_phys_addr_with_struct(env, ptw, address, access_type,
+result, fi);
 
 /* If S1 fails or S2 is disabled, return early.  */
 if (ret || regime_translation_disabled(env, ARMMMUIdx_Stage2,
@@ -2413,9 +2418,9 @@ bool get_phys_addr_with_secure(CPUARMState *env, target_ulong address,
 s2walk_secure = false;
 }
 
-ptw.in_mmu_idx =
+ptw->in_mmu_idx =
 s2walk_secure ? ARMMMUIdx_Stage2_S : ARMMMUIdx_Stage2;
-ptw.in_secure = s2walk_secure;
+ptw->in_secure = s2walk_secure;
 is_el0 = mmu_idx == ARMMMUIdx_E10_0;
 
 /*
@@ -2427,7 +2432,7 @@ bool get_phys_addr_with_secure(CPUARMState *env, target_ulong address,
 cacheattrs1 = result->cacheattrs;
 memset(result, 0, sizeof(*result));
 
-ret = get_phys_addr_lpae(env, &ptw, ipa, access_type,
+ret = get_phys_addr_lpae(env, ptw, ipa, access_type,
  is_el0, result, fi);
 fi->s2addr = ipa;
 
@@ -2534,19 +2539,29 @@ bool get_phys_addr_with_secure(CPUARMState *env, target_ulong address,
   is_secure, result, fi);
 }
 
-ptw.in_mmu_idx = mmu_idx;
-ptw.in_secure = is_secure;
-
 if (regime_using_lpae_format(env, mmu_idx)) {
-return get_phys_addr_lpae(env, &ptw, address, access_type, false,
+return get_phys_addr_lpae(env, ptw, address, access_type, false,
   result, fi);
 } else if (regime_sctlr(env, mmu_idx) & SCTLR_XP) {
-return get_phys_addr_v6(env, &ptw, address, access_type, result, fi);
+return get_phys_addr_v6(env, ptw, address, access_type, result, fi);
 } else {
-return get_phys_addr_v5(env, &ptw, address, access_type, result, fi);
+return get_phys_addr_v5(env, ptw, address, access_type, result, fi);
 }
 }
 
+bool get_phys_addr_with_secure(CPUARMState *env, target_ulong address,
+   MMUAccessType access_type, ARMMMUIdx mmu_idx,
+   bool is_secure, GetPhysAddrResult *result,
+   ARMMMUFaultInfo *fi)
+{
+S1Translate ptw = {
+.in_mmu_idx = mmu_idx,
+.in_secure = is_secure,
+};
+return get_phys_addr_with_struct(env, &ptw, address, access_type,
+ result, fi);
+}
+
 bool get_phys_addr(CPUARMState *env, target_ulong address,
 

[PATCH v4 10/24] target/arm: Use softmmu tlbs for page table walking

2022-10-10 Thread Richard Henderson
So far, limit the change to S1_ptw_translate, arm_ldl_ptw, and
arm_ldq_ptw.  Use probe_access_full to find the host address
and, if there is one, use a host load.  If the probe fails, we've got our
fault info already.  On the off chance that page tables are not
in RAM, continue to use the address_space_ld* functions.

Reviewed-by: Peter Maydell 
Signed-off-by: Richard Henderson 
---
v4: Put the host address into S1Translate immediately.
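
A minimal sketch of the device-attribute test after the signature change,
with HCR_EL2.FWB passed as a boolean.  The masks 0x4 and 0xc are the ones
in the diff; toy_s2_attrs_are_device and the fwb parameter are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * attrs is the stage 2 memory attribute field of the descriptor.
 * With FWB set, only bit [2] of the field decides; otherwise a Device
 * encoding has both bits [3:2] clear.
 */
static bool toy_s2_attrs_are_device(bool fwb, uint8_t attrs)
{
    if (fwb) {
        return (attrs & 0x4) == 0;
    }
    return (attrs & 0xc) == 0;
}
```

Taking the raw attribute byte rather than a full cache-attributes
structure is what lets the caller use the value stored in the TLB entry.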
---
 target/arm/cpu.h|   5 +
 target/arm/ptw.c| 196 +---
 target/arm/tlb_helper.c |  17 +++-
 3 files changed, 144 insertions(+), 74 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index c94e289012..e9e77b7563 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -225,6 +225,8 @@ typedef struct CPUARMTBFlags {
 target_ulong flags2;
 } CPUARMTBFlags;
 
+typedef struct ARMMMUFaultInfo ARMMMUFaultInfo;
+
 typedef struct CPUArchState {
 /* Regs for current mode.  */
 uint32_t regs[16];
@@ -715,6 +717,9 @@ typedef struct CPUArchState {
 struct CPUBreakpoint *cpu_breakpoint[16];
 struct CPUWatchpoint *cpu_watchpoint[16];
 
+/* Optional fault info across tlb lookup. */
+ARMMMUFaultInfo *tlb_fi;
+
 /* Fields up to this point are cleared by a CPU reset */
 struct {} end_reset_fields;
 
diff --git a/target/arm/ptw.c b/target/arm/ptw.c
index c58788ac69..8f41d285b7 100644
--- a/target/arm/ptw.c
+++ b/target/arm/ptw.c
@@ -9,6 +9,7 @@
 #include "qemu/osdep.h"
 #include "qemu/log.h"
 #include "qemu/range.h"
+#include "exec/exec-all.h"
 #include "cpu.h"
 #include "internals.h"
 #include "idau.h"
@@ -21,6 +22,7 @@ typedef struct S1Translate {
 bool out_secure;
 bool out_be;
 hwaddr out_phys;
+void *out_host;
 } S1Translate;
 
 static bool get_phys_addr_lpae(CPUARMState *env, S1Translate *ptw,
@@ -200,7 +202,7 @@ static bool regime_translation_disabled(CPUARMState *env, ARMMMUIdx mmu_idx,
 return (regime_sctlr(env, mmu_idx) & SCTLR_M) == 0;
 }
 
-static bool ptw_attrs_are_device(uint64_t hcr, ARMCacheAttrs cacheattrs)
+static bool S2_attrs_are_device(uint64_t hcr, uint8_t attrs)
 {
 /*
  * For an S1 page table walk, the stage 1 attributes are always
@@ -211,11 +213,10 @@ static bool ptw_attrs_are_device(uint64_t hcr, ARMCacheAttrs cacheattrs)
  * With HCR_EL2.FWB == 1 this is when descriptor bit [4] is 0, ie
  * when cacheattrs.attrs bit [2] is 0.
  */
-assert(cacheattrs.is_s2_format);
 if (hcr & HCR_FWB) {
-return (cacheattrs.attrs & 0x4) == 0;
+return (attrs & 0x4) == 0;
 } else {
-return (cacheattrs.attrs & 0xc) == 0;
+return (attrs & 0xc) == 0;
 }
 }
 
@@ -224,32 +225,65 @@ static bool S1_ptw_translate(CPUARMState *env, S1Translate *ptw,
  hwaddr addr, ARMMMUFaultInfo *fi)
 {
 bool is_secure = ptw->in_secure;
+ARMMMUIdx mmu_idx = ptw->in_mmu_idx;
 ARMMMUIdx s2_mmu_idx = is_secure ? ARMMMUIdx_Stage2_S : ARMMMUIdx_Stage2;
+bool s2_phys = false;
+uint8_t pte_attrs;
+bool pte_secure;
 
-if (arm_mmu_idx_is_stage1_of_2(ptw->in_mmu_idx) &&
-!regime_translation_disabled(env, s2_mmu_idx, is_secure)) {
-GetPhysAddrResult s2 = {};
-S1Translate s2ptw = {
-.in_mmu_idx = s2_mmu_idx,
-.in_secure = is_secure,
-.in_debug = ptw->in_debug,
-};
-uint64_t hcr;
-int ret;
+if (!arm_mmu_idx_is_stage1_of_2(mmu_idx)
+|| regime_translation_disabled(env, s2_mmu_idx, is_secure)) {
+s2_mmu_idx = is_secure ? ARMMMUIdx_Phys_S : ARMMMUIdx_Phys_NS;
+s2_phys = true;
+}
 
-ret = get_phys_addr_lpae(env, &s2ptw, addr, MMU_DATA_LOAD,
- false, &s2, fi);
-if (ret) {
-assert(fi->type != ARMFault_None);
-fi->s2addr = addr;
-fi->stage2 = true;
-fi->s1ptw = true;
-fi->s1ns = !is_secure;
-return false;
+if (unlikely(ptw->in_debug)) {
+/*
+ * From gdbstub, do not use softmmu so that we don't modify the
+ * state of the cpu at all, including softmmu tlb contents.
+ */
+if (s2_phys) {
+ptw->out_phys = addr;
+pte_attrs = 0;
+pte_secure = is_secure;
+} else {
+S1Translate s2ptw = {
+.in_mmu_idx = s2_mmu_idx,
+.in_secure = is_secure,
+.in_debug = true,
+};
+GetPhysAddrResult s2 = { };
+if (!get_phys_addr_lpae(env, &s2ptw, addr, MMU_DATA_LOAD,
+false, &s2, fi)) {
+goto fail;
+}
+ptw->out_phys = s2.f.phys_addr;
+pte_attrs = s2.cacheattrs.attrs;
+pte_secure = s2.f.attrs.secure;
 }
+ptw->out_host = NULL;
+} else {
+CPUTLBEntryFull *full;
+int flags;
 
-hcr = 

[PATCH v4 13/24] target/arm: Add ptw_idx to S1Translate

2022-10-10 Thread Richard Henderson
Hoist the computation of the mmu_idx for the ptw up to
get_phys_addr_with_struct and get_phys_addr_twostage.
This removes the duplicate check for stage2 disabled
from the middle of the walk, performing it only once.

Signed-off-by: Richard Henderson 
---
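For readers following the index arithmetic used in this patch, here is a
minimal standalone sketch, not the QEMU code.  The SK_* names and both
helper functions are hypothetical; the values are the low bits of the
ARMMMUIdx_Phys_* and ARMMMUIdx_Stage2* enumerators from earlier patches in
this series, with the ARM_MMU_IDX_A type flag left out.  It shows why
clearing bit 0 maps each secure ptw index to its non-secure partner.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified index values: non-secure indexes are even, secure are odd. */
enum sketch_idx {
    SK_PHYS_NS = 8,
    SK_PHYS_S = 9,
    SK_STAGE2 = 10,
    SK_STAGE2_S = 11,
};

/* Map Stage2_S -> Stage2 or Phys_S -> Phys_NS by clearing bit 0. */
static inline int sketch_to_nonsecure(int idx)
{
    assert(idx >= SK_PHYS_NS && idx <= SK_STAGE2_S);
    return idx & ~1;
}

/* True when the walk reads descriptors directly from physical memory. */
static inline bool sketch_is_phys(int idx)
{
    return idx == SK_PHYS_S || idx == SK_PHYS_NS;
}
```

The trick only works while each secure index is its non-secure partner
plus one, so any renumbering of the enum would need to preserve that.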
 target/arm/ptw.c | 53 ++--
 1 file changed, 42 insertions(+), 11 deletions(-)

diff --git a/target/arm/ptw.c b/target/arm/ptw.c
index 6c5ed56a10..b2bfcfde9a 100644
--- a/target/arm/ptw.c
+++ b/target/arm/ptw.c
@@ -17,6 +17,7 @@
 
 typedef struct S1Translate {
 ARMMMUIdx in_mmu_idx;
+ARMMMUIdx in_ptw_idx;
 bool in_secure;
 bool in_debug;
 bool out_secure;
@@ -233,17 +234,12 @@ static bool S1_ptw_translate(CPUARMState *env, S1Translate *ptw,
 {
 bool is_secure = ptw->in_secure;
 ARMMMUIdx mmu_idx = ptw->in_mmu_idx;
-ARMMMUIdx s2_mmu_idx = is_secure ? ARMMMUIdx_Stage2_S : ARMMMUIdx_Stage2;
-bool s2_phys = false;
+ARMMMUIdx s2_mmu_idx = ptw->in_ptw_idx;
+bool s2_phys = s2_mmu_idx == ARMMMUIdx_Phys_S ||
+   s2_mmu_idx == ARMMMUIdx_Phys_NS;
 uint8_t pte_attrs;
 bool pte_secure;
 
-if (!arm_mmu_idx_is_stage1_of_2(mmu_idx)
-|| regime_translation_disabled(env, s2_mmu_idx, is_secure)) {
-s2_mmu_idx = is_secure ? ARMMMUIdx_Phys_S : ARMMMUIdx_Phys_NS;
-s2_phys = true;
-}
-
 if (unlikely(ptw->in_debug)) {
 /*
  * From gdbstub, do not use softmmu so that we don't modify the
@@ -256,10 +252,12 @@ static bool S1_ptw_translate(CPUARMState *env, S1Translate *ptw,
 } else {
 S1Translate s2ptw = {
 .in_mmu_idx = s2_mmu_idx,
+.in_ptw_idx = is_secure ? ARMMMUIdx_Phys_S : ARMMMUIdx_Phys_NS,
 .in_secure = is_secure,
 .in_debug = true,
 };
 GetPhysAddrResult s2 = { };
+
 if (!get_phys_addr_lpae(env, &s2ptw, addr, MMU_DATA_LOAD,
 false, &s2, fi)) {
 goto fail;
@@ -1283,7 +1281,11 @@ static bool get_phys_addr_lpae(CPUARMState *env, S1Translate *ptw,
 descaddr |= (address >> (stride * (4 - level))) & indexmask;
 descaddr &= ~7ULL;
 nstable = extract32(tableattrs, 4, 1);
-ptw->in_secure = !nstable;
+if (nstable) {
+/* Stage2_S -> Stage2 or Phys_S -> Phys_NS */
+ptw->in_ptw_idx &= ~1;
+ptw->in_secure = false;
+}
 descriptor = arm_ldq_ptw(env, ptw, descaddr, fi);
 if (fi->type != ARMFault_None) {
 goto do_fault;
@@ -2470,6 +2472,7 @@ static bool get_phys_addr_twostage(CPUARMState *env, S1Translate *ptw,
 
 is_el0 = ptw->in_mmu_idx == ARMMMUIdx_Stage1_E0;
 ptw->in_mmu_idx = s2walk_secure ? ARMMMUIdx_Stage2_S : ARMMMUIdx_Stage2;
+ptw->in_ptw_idx = s2walk_secure ? ARMMMUIdx_Phys_S : ARMMMUIdx_Phys_NS;
 ptw->in_secure = s2walk_secure;
 
 /*
@@ -2529,10 +2532,32 @@ static bool get_phys_addr_with_struct(CPUARMState *env, S1Translate *ptw,
   ARMMMUFaultInfo *fi)
 {
 ARMMMUIdx mmu_idx = ptw->in_mmu_idx;
-ARMMMUIdx s1_mmu_idx = stage_1_mmu_idx(mmu_idx);
 bool is_secure = ptw->in_secure;
+ARMMMUIdx s1_mmu_idx;
 
-if (mmu_idx != s1_mmu_idx) {
+switch (mmu_idx) {
+case ARMMMUIdx_Phys_S:
+case ARMMMUIdx_Phys_NS:
+/* Checking Phys early avoids special casing later vs regime_el. */
+return get_phys_addr_disabled(env, address, access_type, mmu_idx,
+  is_secure, result, fi);
+
+case ARMMMUIdx_Stage1_E0:
+case ARMMMUIdx_Stage1_E1:
+case ARMMMUIdx_Stage1_E1_PAN:
+/* First stage lookup uses second stage for ptw. */
+ptw->in_ptw_idx = is_secure ? ARMMMUIdx_Stage2_S : ARMMMUIdx_Stage2;
+break;
+
+case ARMMMUIdx_E10_0:
+s1_mmu_idx = ARMMMUIdx_Stage1_E0;
+goto do_twostage;
+case ARMMMUIdx_E10_1:
+s1_mmu_idx = ARMMMUIdx_Stage1_E1;
+goto do_twostage;
+case ARMMMUIdx_E10_1_PAN:
+s1_mmu_idx = ARMMMUIdx_Stage1_E1_PAN;
+do_twostage:
 /*
  * Call ourselves recursively to do the stage 1 and then stage 2
  * translations if mmu_idx is a two-stage regime, and EL2 present.
@@ -2543,6 +2568,12 @@ static bool get_phys_addr_with_struct(CPUARMState *env, S1Translate *ptw,
 return get_phys_addr_twostage(env, ptw, address, access_type,
   result, fi);
 }
+/* fall through */
+
+default:
+/* Single stage and second stage uses physical for ptw. */
+ptw->in_ptw_idx = is_secure ? ARMMMUIdx_Phys_S : ARMMMUIdx_Phys_NS;
+break;
 }
 
 /*
-- 
2.34.1




[PATCH v4 09/24] target/arm: Move be test for regime into S1TranslateResult

2022-10-10 Thread Richard Henderson
Hoist this test out of arm_ld[lq]_ptw into S1_ptw_translate.

Reviewed-by: Peter Maydell 
Signed-off-by: Richard Henderson 
---
 target/arm/ptw.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/target/arm/ptw.c b/target/arm/ptw.c
index 8fa0088d98..c58788ac69 100644
--- a/target/arm/ptw.c
+++ b/target/arm/ptw.c
@@ -19,6 +19,7 @@ typedef struct S1Translate {
 bool in_secure;
 bool in_debug;
 bool out_secure;
+bool out_be;
 hwaddr out_phys;
 } S1Translate;
 
@@ -277,6 +278,7 @@ static bool S1_ptw_translate(CPUARMState *env, S1Translate *ptw,
 
 ptw->out_secure = is_secure;
 ptw->out_phys = addr;
+ptw->out_be = regime_translation_big_endian(env, ptw->in_mmu_idx);
 return true;
 }
 
@@ -296,7 +298,7 @@ static uint32_t arm_ldl_ptw(CPUARMState *env, S1Translate *ptw, hwaddr addr,
 addr = ptw->out_phys;
 attrs.secure = ptw->out_secure;
 as = arm_addressspace(cs, attrs);
-if (regime_translation_big_endian(env, ptw->in_mmu_idx)) {
+if (ptw->out_be) {
 data = address_space_ldl_be(as, addr, attrs, &result);
 } else {
 data = address_space_ldl_le(as, addr, attrs, &result);
@@ -324,7 +326,7 @@ static uint64_t arm_ldq_ptw(CPUARMState *env, S1Translate *ptw, hwaddr addr,
 addr = ptw->out_phys;
 attrs.secure = ptw->out_secure;
 as = arm_addressspace(cs, attrs);
-if (regime_translation_big_endian(env, ptw->in_mmu_idx)) {
+if (ptw->out_be) {
 data = address_space_ldq_be(as, addr, attrs, &result);
 } else {
 data = address_space_ldq_le(as, addr, attrs, &result);
-- 
2.34.1




[PATCH v4 05/24] target/arm: Move ARMMMUIdx_Stage2 to a real tlb mmu_idx

2022-10-10 Thread Richard Henderson
We had been marking this ARM_MMU_IDX_NOTLB; move it to a real tlb.
Flush the tlb when invalidating stage 1+2 translations.  Re-use
alle1_tlbmask() for other instances of EL1&0 + Stage2.

Signed-off-by: Richard Henderson 
---
v4: Implement the IPAS2 and RIPAS2 tlb flushing insns;
Reuse alle1_tlbmask to fix aa32 and vttbr flushing.
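
As an illustration of how one mask can name several TLBs at once, here
is a minimal standalone sketch, not the QEMU code.  All sk_* and SK_*
names are hypothetical.  The indexes 10 and 11 follow the Stage2 values
added by this patch; the E10 values are placeholders and are not copied
from cpu.h.

```c
#include <assert.h>
#include <stdint.h>

/* Core TLB indexes for this sketch (E10 values are placeholders). */
enum {
    SK_IDX_E10_0 = 0,
    SK_IDX_E10_1 = 1,
    SK_IDX_E10_1_PAN = 2,
    SK_IDX_STAGE2 = 10,
    SK_IDX_STAGE2_S = 11,
};

/* One bit per core TLB index, so several TLBs fit in a single mask. */
static inline uint16_t sk_bit(int idx)
{
    assert(idx >= 0 && idx < 16);
    return (uint16_t)(1u << idx);
}

/* Mask for an "all EL1" flush: the EL1&0 regimes plus both stage 2 TLBs. */
static inline uint16_t sk_alle1_mask(void)
{
    return sk_bit(SK_IDX_E10_1) | sk_bit(SK_IDX_E10_1_PAN) |
           sk_bit(SK_IDX_E10_0) | sk_bit(SK_IDX_STAGE2) |
           sk_bit(SK_IDX_STAGE2_S);
}
```

Keeping the mask in one helper means each flush site picks up the
stage 2 bits without repeating the list.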
---
 target/arm/cpu-param.h |   2 +-
 target/arm/cpu.h   |  23 ---
 target/arm/helper.c| 151 ++---
 3 files changed, 127 insertions(+), 49 deletions(-)

diff --git a/target/arm/cpu-param.h b/target/arm/cpu-param.h
index a5b27db275..b7bde18986 100644
--- a/target/arm/cpu-param.h
+++ b/target/arm/cpu-param.h
@@ -45,6 +45,6 @@
 bool guarded;
 #endif
 
-#define NB_MMU_MODES 10
+#define NB_MMU_MODES 12
 
 #endif
diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index f93060e6d6..c94e289012 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -2906,8 +2906,9 @@ bool write_cpustate_to_list(ARMCPU *cpu, bool kvm_sync);
  * EL2 (aka NS PL2)
  * EL3 (aka S PL1)
  * Physical (NS & S)
+ * Stage2 (NS & S)
  *
- * for a total of 10 different mmu_idx.
+ * for a total of 12 different mmu_idx.
  *
  * R profile CPUs have an MPU, but can use the same set of MMU indexes
  * as A profile. They only need to distinguish EL0 and EL1 (and
@@ -2976,6 +2977,15 @@ typedef enum ARMMMUIdx {
 ARMMMUIdx_Phys_NS   = 8 | ARM_MMU_IDX_A,
 ARMMMUIdx_Phys_S= 9 | ARM_MMU_IDX_A,
 
+/*
+ * Used for second stage of an S12 page table walk, or for descriptor
+ * loads during first stage of an S1 page table walk.  Note that both
+ * are in use simultaneously for SecureEL2: the security state for
+ * the S2 ptw is selected by the NS bit from the S1 ptw.
+ */
+ARMMMUIdx_Stage2= 10 | ARM_MMU_IDX_A,
+ARMMMUIdx_Stage2_S  = 11 | ARM_MMU_IDX_A,
+
 /*
  * These are not allocated TLBs and are used only for AT system
  * instructions or for the first stage of an S12 page table walk.
@@ -2983,15 +2993,6 @@ typedef enum ARMMMUIdx {
 ARMMMUIdx_Stage1_E0 = 0 | ARM_MMU_IDX_NOTLB,
 ARMMMUIdx_Stage1_E1 = 1 | ARM_MMU_IDX_NOTLB,
 ARMMMUIdx_Stage1_E1_PAN = 2 | ARM_MMU_IDX_NOTLB,
-/*
- * Not allocated a TLB: used only for second stage of an S12 page
- * table walk, or for descriptor loads during first stage of an S1
- * page table walk. Note that if we ever want to have a TLB for this
- * then various TLB flush insns which currently are no-ops or flush
- * only stage 1 MMU indexes will need to change to flush stage 2.
- */
-ARMMMUIdx_Stage2 = 3 | ARM_MMU_IDX_NOTLB,
-ARMMMUIdx_Stage2_S   = 4 | ARM_MMU_IDX_NOTLB,
 
 /*
  * M-profile.
@@ -3022,6 +3023,8 @@ typedef enum ARMMMUIdxBit {
 TO_CORE_BIT(E20_2),
 TO_CORE_BIT(E20_2_PAN),
 TO_CORE_BIT(E3),
+TO_CORE_BIT(Stage2),
+TO_CORE_BIT(Stage2_S),
 
 TO_CORE_BIT(MUser),
 TO_CORE_BIT(MPriv),
diff --git a/target/arm/helper.c b/target/arm/helper.c
index dde64a487a..18c51bb777 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -399,6 +399,21 @@ static void contextidr_write(CPUARMState *env, const ARMCPRegInfo *ri,
 raw_write(env, ri, value);
 }
 
+static int alle1_tlbmask(CPUARMState *env)
+{
+/*
+ * Note that the 'ALL' scope must invalidate both stage 1 and
+ * stage 2 translations, whereas most other scopes only invalidate
+ * stage 1 translations.
+ */
+return (ARMMMUIdxBit_E10_1 |
+ARMMMUIdxBit_E10_1_PAN |
+ARMMMUIdxBit_E10_0 |
+ARMMMUIdxBit_Stage2 |
+ARMMMUIdxBit_Stage2_S);
+}
+
+
 /* IS variants of TLB operations must affect all cores */
 static void tlbiall_is_write(CPUARMState *env, const ARMCPRegInfo *ri,
  uint64_t value)
@@ -501,10 +516,7 @@ static void tlbiall_nsnh_write(CPUARMState *env, const ARMCPRegInfo *ri,
 {
 CPUState *cs = env_cpu(env);
 
-tlb_flush_by_mmuidx(cs,
-ARMMMUIdxBit_E10_1 |
-ARMMMUIdxBit_E10_1_PAN |
-ARMMMUIdxBit_E10_0);
+tlb_flush_by_mmuidx(cs, alle1_tlbmask(env));
 }
 
 static void tlbiall_nsnh_is_write(CPUARMState *env, const ARMCPRegInfo *ri,
@@ -512,10 +524,7 @@ static void tlbiall_nsnh_is_write(CPUARMState *env, const ARMCPRegInfo *ri,
 {
 CPUState *cs = env_cpu(env);
 
-tlb_flush_by_mmuidx_all_cpus_synced(cs,
-ARMMMUIdxBit_E10_1 |
-ARMMMUIdxBit_E10_1_PAN |
-ARMMMUIdxBit_E10_0);
+tlb_flush_by_mmuidx_all_cpus_synced(cs, alle1_tlbmask(env));
 }
 
 
@@ -554,6 +563,24 @@ static void tlbimva_hyp_is_write(CPUARMState *env, const ARMCPRegInfo *ri,
  ARMMMUIdxBit_E2);
 }
 
+static void tlbiipas2_hyp_write(CPUARMState *env, const ARMCPRegInfo *ri,
+uint64_t value)
+{
+

[PATCH v4 14/24] target/arm: Add isar predicates for FEAT_HAFDBS

2022-10-10 Thread Richard Henderson
The MMFR1 field may indicate support for hardware update of
access flag alone, or access flag and dirty bit.

Reviewed-by: Peter Maydell 
Signed-off-by: Richard Henderson 
---
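A minimal standalone sketch of the two predicates, for illustration
only.  The function names are hypothetical, and the sketch assumes
that HAFDBS is the 4-bit field at bits [3:0] of ID_AA64MMFR1_EL1; the
patch itself uses FIELD_EX64 rather than an open-coded mask.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: HAFDBS occupies bits [3:0] of ID_AA64MMFR1_EL1. */
static inline unsigned sketch_hafdbs(uint64_t id_aa64mmfr1)
{
    return (unsigned)(id_aa64mmfr1 & 0xf);
}

/* Hardware update of the access flag: any non-zero field value. */
static inline bool sketch_has_hafs(uint64_t id_aa64mmfr1)
{
    return sketch_hafdbs(id_aa64mmfr1) != 0;
}

/* Hardware update of the dirty bit as well: field value 2 or above. */
static inline bool sketch_has_hdbs(uint64_t id_aa64mmfr1)
{
    return sketch_hafdbs(id_aa64mmfr1) >= 2;
}
```

Because the dirty-bit level implies the access-flag level, the second
predicate being true means the first is true too.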
 target/arm/cpu.h | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index e9e77b7563..cde4e86db2 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -4139,6 +4139,16 @@ static inline bool isar_feature_aa64_lva(const ARMISARegisters *id)
 return FIELD_EX64(id->id_aa64mmfr2, ID_AA64MMFR2, VARANGE) != 0;
 }
 
+static inline bool isar_feature_aa64_hafs(const ARMISARegisters *id)
+{
+return FIELD_EX64(id->id_aa64mmfr1, ID_AA64MMFR1, HAFDBS) != 0;
+}
+
+static inline bool isar_feature_aa64_hdbs(const ARMISARegisters *id)
+{
+return FIELD_EX64(id->id_aa64mmfr1, ID_AA64MMFR1, HAFDBS) >= 2;
+}
+
 static inline bool isar_feature_aa64_tts2uxn(const ARMISARegisters *id)
 {
 return FIELD_EX64(id->id_aa64mmfr1, ID_AA64MMFR1, XNX) != 0;
-- 
2.34.1




[PATCH v4 07/24] target/arm: Split out S1Translate type

2022-10-10 Thread Richard Henderson
Consolidate most of the inputs and outputs of S1_ptw_translate
into a single structure.  Plumb this through arm_ld*_ptw from
the controlling get_phys_addr_* routine.

Signed-off-by: Richard Henderson 
---
v4: Replaces a different S1TranslateResult patch, and plumbs the
structure further out in the function call tree.
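
The shape of the change can be sketched in isolation as follows.  This
is an illustration under assumptions, not the QEMU code: SketchTranslate
and sketch_translate are hypothetical, the translation is an identity
map, and the alignment check merely stands in for a real fault
condition.  It shows inputs and outputs bundled in one struct, with a
bool return replacing the old ~0 sentinel address.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct SketchTranslate {
    int in_idx;          /* input: which translation regime to use */
    bool in_secure;      /* input: security state of the walk */
    bool out_secure;     /* output: security state of the result */
    uint64_t out_phys;   /* output: translated address */
} SketchTranslate;

/* Returns true on success and fills the out_* fields; on failure it
 * records a fault code and leaves the outputs untouched. */
static bool sketch_translate(SketchTranslate *t, uint64_t addr, int *fault)
{
    assert(t != NULL && fault != NULL);
    if (addr & 7) {
        *fault = 1;      /* stand-in fault: misaligned descriptor address */
        return false;
    }
    t->out_secure = t->in_secure;
    t->out_phys = addr;  /* identity mapping, for illustration only */
    return true;
}
```

With a bool result, every address value stays available as a valid
output, and callers no longer need to inspect the fault info to detect
failure.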
---
 target/arm/ptw.c | 140 ++-
 1 file changed, 79 insertions(+), 61 deletions(-)

diff --git a/target/arm/ptw.c b/target/arm/ptw.c
index a977d09c6d..dee69ee743 100644
--- a/target/arm/ptw.c
+++ b/target/arm/ptw.c
@@ -14,9 +14,16 @@
 #include "idau.h"
 
 
-static bool get_phys_addr_lpae(CPUARMState *env, uint64_t address,
-   MMUAccessType access_type, ARMMMUIdx mmu_idx,
-   bool is_secure, bool s1_is_el0,
+typedef struct S1Translate {
+ARMMMUIdx in_mmu_idx;
+bool in_secure;
+bool out_secure;
+hwaddr out_phys;
+} S1Translate;
+
+static bool get_phys_addr_lpae(CPUARMState *env, S1Translate *ptw,
+   uint64_t address,
+   MMUAccessType access_type, bool s1_is_el0,
GetPhysAddrResult *result, ARMMMUFaultInfo *fi)
 __attribute__((nonnull));
 
@@ -211,28 +218,31 @@ static bool ptw_attrs_are_device(uint64_t hcr, ARMCacheAttrs cacheattrs)
 }
 
 /* Translate a S1 pagetable walk through S2 if needed.  */
-static hwaddr S1_ptw_translate(CPUARMState *env, ARMMMUIdx mmu_idx,
-   hwaddr addr, bool *is_secure_ptr,
-   ARMMMUFaultInfo *fi)
+static bool S1_ptw_translate(CPUARMState *env, S1Translate *ptw,
+ hwaddr addr, ARMMMUFaultInfo *fi)
 {
-bool is_secure = *is_secure_ptr;
+bool is_secure = ptw->in_secure;
 ARMMMUIdx s2_mmu_idx = is_secure ? ARMMMUIdx_Stage2_S : ARMMMUIdx_Stage2;
 
-if (arm_mmu_idx_is_stage1_of_2(mmu_idx) &&
+if (arm_mmu_idx_is_stage1_of_2(ptw->in_mmu_idx) &&
 !regime_translation_disabled(env, s2_mmu_idx, is_secure)) {
 GetPhysAddrResult s2 = {};
+S1Translate s2ptw = {
+.in_mmu_idx = s2_mmu_idx,
+.in_secure = is_secure,
+};
 uint64_t hcr;
 int ret;
 
-ret = get_phys_addr_lpae(env, addr, MMU_DATA_LOAD, s2_mmu_idx,
- is_secure, false, &s2, fi);
+ret = get_phys_addr_lpae(env, &s2ptw, addr, MMU_DATA_LOAD,
+ false, &s2, fi);
 if (ret) {
 assert(fi->type != ARMFault_None);
 fi->s2addr = addr;
 fi->stage2 = true;
 fi->s1ptw = true;
 fi->s1ns = !is_secure;
-return ~0;
+return false;
 }
 
 hcr = arm_hcr_el2_eff_secstate(env, is_secure);
@@ -246,7 +256,7 @@ static hwaddr S1_ptw_translate(CPUARMState *env, ARMMMUIdx mmu_idx,
 fi->stage2 = true;
 fi->s1ptw = true;
 fi->s1ns = !is_secure;
-return ~0;
+return false;
 }
 
 if (arm_is_secure_below_el3(env)) {
@@ -256,19 +266,21 @@ static hwaddr S1_ptw_translate(CPUARMState *env, ARMMMUIdx mmu_idx,
 } else {
 is_secure = !(env->cp15.vtcr_el2 & VTCR_NSW);
 }
-*is_secure_ptr = is_secure;
 } else {
 assert(!is_secure);
 }
 
 addr = s2.f.phys_addr;
 }
-return addr;
+
+ptw->out_secure = is_secure;
+ptw->out_phys = addr;
+return true;
 }
 
 /* All loads done in the course of a page table walk go through here. */
-static uint32_t arm_ldl_ptw(CPUARMState *env, hwaddr addr, bool is_secure,
-ARMMMUIdx mmu_idx, ARMMMUFaultInfo *fi)
+static uint32_t arm_ldl_ptw(CPUARMState *env, S1Translate *ptw, hwaddr addr,
+ARMMMUFaultInfo *fi)
 {
 CPUState *cs = env_cpu(env);
 MemTxAttrs attrs = {};
@@ -276,13 +288,13 @@ static uint32_t arm_ldl_ptw(CPUARMState *env, hwaddr addr, bool is_secure,
 AddressSpace *as;
 uint32_t data;
 
-addr = S1_ptw_translate(env, mmu_idx, addr, &is_secure, fi);
-attrs.secure = is_secure;
-as = arm_addressspace(cs, attrs);
-if (fi->s1ptw) {
+if (!S1_ptw_translate(env, ptw, addr, fi)) {
 return 0;
 }
-if (regime_translation_big_endian(env, mmu_idx)) {
+addr = ptw->out_phys;
+attrs.secure = ptw->out_secure;
+as = arm_addressspace(cs, attrs);
+if (regime_translation_big_endian(env, ptw->in_mmu_idx)) {
 data = address_space_ldl_be(as, addr, attrs, &result);
 } else {
 data = address_space_ldl_le(as, addr, attrs, &result);
@@ -295,8 +307,8 @@ static uint32_t arm_ldl_ptw(CPUARMState *env, hwaddr addr, bool is_secure,
 return 0;
 }
 
-static uint64_t arm_ldq_ptw(CPUARMState *env, hwaddr addr, bool is_secure,
-ARMMMUIdx mmu_idx, ARMMMUFaultInfo *fi)
+static 

[PATCH v4 06/24] target/arm: Restrict tlb flush from vttbr_write to vmid change

2022-10-10 Thread Richard Henderson
Compare only the VMID field when considering whether we need to flush.

Signed-off-by: Richard Henderson 
---
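As an illustration of the comparison, here is a minimal standalone
sketch, not the QEMU code.  extract64_sketch and vmid_changed are
hypothetical names; the former is assumed to behave like
extract64(value, start, length), and the VMID is assumed to occupy
bits [63:48] of VTTBR, matching the arguments 48 and 16 used in the
diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for extract64(): return `length` bits of `value`
 * starting at bit `start`. */
static inline uint64_t extract64_sketch(uint64_t value, int start, int length)
{
    assert(start >= 0 && length > 0 && length <= 64 - start);
    return (value >> start) & (~0ULL >> (64 - length));
}

/* True when the old and new register values differ in the VMID field.
 * XOR leaves a 1 in every bit position that changed. */
static inline bool vmid_changed(uint64_t old_vttbr, uint64_t new_vttbr)
{
    return extract64_sketch(old_vttbr ^ new_vttbr, 48, 16) != 0;
}
```

A write that changes only the table base address leaves the upper 16
bits of the XOR at zero, so no flush is requested for it.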
 target/arm/helper.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index 18c51bb777..c672903f43 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -3815,10 +3815,10 @@ static void vttbr_write(CPUARMState *env, const ARMCPRegInfo *ri,
  * A change in VMID to the stage2 page table (Stage2) invalidates
  * the stage2 and combined stage 1&2 tlbs (EL10_1 and EL10_0).
  */
-if (raw_read(env, ri) != value) {
+if (extract64(raw_read(env, ri) ^ value, 48, 16) != 0) {
 tlb_flush_by_mmuidx(cs, alle1_tlbmask(env));
-raw_write(env, ri, value);
 }
+raw_write(env, ri, value);
 }
 
 static const ARMCPRegInfo vmsa_pmsa_cp_reginfo[] = {
-- 
2.34.1




[PATCH v4 04/24] target/arm: Add ARMMMUIdx_Phys_{S,NS}

2022-10-10 Thread Richard Henderson
Not yet used, but add mmu indexes for 1-1 mapping
to physical addresses.

Reviewed-by: Peter Maydell 
Signed-off-by: Richard Henderson 
---
 target/arm/cpu-param.h |  2 +-
 target/arm/cpu.h   |  7 ++-
 target/arm/ptw.c   | 19 +--
 3 files changed, 24 insertions(+), 4 deletions(-)

diff --git a/target/arm/cpu-param.h b/target/arm/cpu-param.h
index f4338fd10e..a5b27db275 100644
--- a/target/arm/cpu-param.h
+++ b/target/arm/cpu-param.h
@@ -45,6 +45,6 @@
 bool guarded;
 #endif
 
-#define NB_MMU_MODES 8
+#define NB_MMU_MODES 10
 
 #endif
diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index a34d496c5b..f93060e6d6 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -2905,8 +2905,9 @@ bool write_cpustate_to_list(ARMCPU *cpu, bool kvm_sync);
  * EL2 EL2&0 +PAN
  * EL2 (aka NS PL2)
  * EL3 (aka S PL1)
+ * Physical (NS & S)
  *
- * for a total of 8 different mmu_idx.
+ * for a total of 10 different mmu_idx.
  *
  * R profile CPUs have an MPU, but can use the same set of MMU indexes
  * as A profile. They only need to distinguish EL0 and EL1 (and
@@ -2971,6 +2972,10 @@ typedef enum ARMMMUIdx {
 ARMMMUIdx_E2= 6 | ARM_MMU_IDX_A,
 ARMMMUIdx_E3= 7 | ARM_MMU_IDX_A,
 
+/* TLBs with 1-1 mapping to the physical address spaces. */
+ARMMMUIdx_Phys_NS   = 8 | ARM_MMU_IDX_A,
+ARMMMUIdx_Phys_S= 9 | ARM_MMU_IDX_A,
+
 /*
  * These are not allocated TLBs and are used only for AT system
  * instructions or for the first stage of an S12 page table walk.
diff --git a/target/arm/ptw.c b/target/arm/ptw.c
index 2d182d62e5..a977d09c6d 100644
--- a/target/arm/ptw.c
+++ b/target/arm/ptw.c
@@ -179,6 +179,11 @@ static bool regime_translation_disabled(CPUARMState *env, ARMMMUIdx mmu_idx,
 case ARMMMUIdx_E3:
 break;
 
+case ARMMMUIdx_Phys_NS:
+case ARMMMUIdx_Phys_S:
+/* No translation for physical address spaces. */
+return true;
+
 default:
 g_assert_not_reached();
 }
@@ -2280,10 +2285,17 @@ static bool get_phys_addr_disabled(CPUARMState *env, target_ulong address,
 {
 uint8_t memattr = 0x00;/* Device nGnRnE */
 uint8_t shareability = 0;  /* non-sharable */
+int r_el;
 
-if (mmu_idx != ARMMMUIdx_Stage2 && mmu_idx != ARMMMUIdx_Stage2_S) {
-int r_el = regime_el(env, mmu_idx);
+switch (mmu_idx) {
+case ARMMMUIdx_Stage2:
+case ARMMMUIdx_Stage2_S:
+case ARMMMUIdx_Phys_NS:
+case ARMMMUIdx_Phys_S:
+break;
 
+default:
+r_el = regime_el(env, mmu_idx);
 if (arm_el_is_aa64(env, r_el)) {
 int pamax = arm_pamax(env_archcpu(env));
 uint64_t tcr = env->cp15.tcr_el[r_el];
@@ -2332,6 +2344,7 @@ static bool get_phys_addr_disabled(CPUARMState *env, target_ulong address,
 shareability = 2; /* outer sharable */
 }
 result->cacheattrs.is_s2_format = false;
+break;
 }
 
 result->f.phys_addr = address;
@@ -2536,6 +2549,7 @@ bool get_phys_addr(CPUARMState *env, target_ulong address,
 is_secure = arm_is_secure_below_el3(env);
 break;
 case ARMMMUIdx_Stage2:
+case ARMMMUIdx_Phys_NS:
 case ARMMMUIdx_MPrivNegPri:
 case ARMMMUIdx_MUserNegPri:
 case ARMMMUIdx_MPriv:
@@ -2544,6 +2558,7 @@ bool get_phys_addr(CPUARMState *env, target_ulong address,
 break;
 case ARMMMUIdx_E3:
 case ARMMMUIdx_Stage2_S:
+case ARMMMUIdx_Phys_S:
 case ARMMMUIdx_MSPrivNegPri:
 case ARMMMUIdx_MSUserNegPri:
 case ARMMMUIdx_MSPriv:
-- 
2.34.1




[PATCH v4 02/24] target/arm: Use probe_access_full for MTE

2022-10-10 Thread Richard Henderson
The CPUTLBEntryFull structure now stores the original pte attributes, as
well as the physical address.  Therefore, we no longer need a separate
bit in MemTxAttrs, nor do we need to walk the tree of memory regions.

Reviewed-by: Peter Maydell 
Signed-off-by: Richard Henderson 
---
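One detail of this patch is that values are copied out of the TLB
entry before a second lookup that may invalidate the pointer.  The
following is a minimal standalone sketch of that pattern, not the QEMU
code: the Sketch* types and functions are hypothetical, and a
realloc-grown array stands in for a TLB that can be resized.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef struct SketchEntry {
    uint64_t phys_addr;
    uint8_t pte_attrs;
} SketchEntry;

typedef struct SketchTable {
    SketchEntry *entries;
    size_t count;
} SketchTable;

/* Return entry i, growing the table when needed.  Growing may move the
 * array, so pointers returned by earlier calls become invalid. */
static SketchEntry *sketch_lookup(SketchTable *t, size_t i)
{
    assert(t != NULL);
    if (i >= t->count) {
        size_t count = i + 1;
        SketchEntry *p = realloc(t->entries, count * sizeof(*p));
        if (p == NULL) {
            return NULL;
        }
        memset(p + t->count, 0, (count - t->count) * sizeof(*p));
        t->entries = p;
        t->count = count;
    }
    return &t->entries[i];
}

/* Read the address of entry `first`, then look up `second`.  The value is
 * copied out and the pointer dropped before the second lookup runs. */
static int sketch_two_lookups(SketchTable *t, size_t first, size_t second,
                              uint64_t *phys_out)
{
    SketchEntry *full = sketch_lookup(t, first);
    if (full == NULL) {
        return -1;
    }
    *phys_out = full->phys_addr;
    full = NULL;   /* must not be dereferenced after the next lookup */
    if (sketch_lookup(t, second) == NULL) {
        return -1;
    }
    return 0;
}
```

Setting the pointer to NULL after copying makes any later misuse fail
immediately instead of reading stale memory.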
 target/arm/cpu.h   |  1 -
 target/arm/sve_ldst_internal.h |  1 +
 target/arm/mte_helper.c| 62 ++
 target/arm/sve_helper.c| 54 ++---
 target/arm/tlb_helper.c|  4 ---
 5 files changed, 36 insertions(+), 86 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index 1a909a1b43..f09ec8aa03 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -3400,7 +3400,6 @@ static inline MemTxAttrs *typecheck_memtxattrs(MemTxAttrs *x)
  * generic target bits directly.
  */
 #define arm_tlb_bti_gp(x) (typecheck_memtxattrs(x)->target_tlb_bit0)
-#define arm_tlb_mte_tagged(x) (typecheck_memtxattrs(x)->target_tlb_bit1)
 
 /*
  * AArch64 usage of the PAGE_TARGET_* bits for linux-user.
diff --git a/target/arm/sve_ldst_internal.h b/target/arm/sve_ldst_internal.h
index b5c473fc48..4f159ec4ad 100644
--- a/target/arm/sve_ldst_internal.h
+++ b/target/arm/sve_ldst_internal.h
@@ -134,6 +134,7 @@ typedef struct {
 void *host;
 int flags;
 MemTxAttrs attrs;
+bool tagged;
 } SVEHostPage;
 
 bool sve_probe_page(SVEHostPage *info, bool nofault, CPUARMState *env,
diff --git a/target/arm/mte_helper.c b/target/arm/mte_helper.c
index fdd23ab3f8..e85208339e 100644
--- a/target/arm/mte_helper.c
+++ b/target/arm/mte_helper.c
@@ -105,10 +105,9 @@ static uint8_t *allocation_tag_mem(CPUARMState *env, int ptr_mmu_idx,
   TARGET_PAGE_BITS - LOG2_TAG_GRANULE - 1);
 return tags + index;
 #else
-uintptr_t index;
 CPUTLBEntryFull *full;
+MemTxAttrs attrs;
 int in_page, flags;
-ram_addr_t ptr_ra;
 hwaddr ptr_paddr, tag_paddr, xlat;
 MemoryRegion *mr;
 ARMASIdx tag_asi;
@@ -124,30 +123,12 @@ static uint8_t *allocation_tag_mem(CPUARMState *env, int ptr_mmu_idx,
  * valid.  Indicate to probe_access_flags no-fault, then assert that
  * we received a valid page.
  */
-flags = probe_access_flags(env, ptr, ptr_access, ptr_mmu_idx,
-   ra == 0, &host, ra);
+flags = probe_access_full(env, ptr, ptr_access, ptr_mmu_idx,
+  ra == 0, &host, &full, ra);
 assert(!(flags & TLB_INVALID_MASK));
 
-/*
- * Find the CPUTLBEntryFull for ptr.  This *must* be present in the TLB
- * because we just found the mapping.
- * TODO: Perhaps there should be a cputlb helper that returns a
- * matching tlb entry + iotlb entry.
- */
-index = tlb_index(env, ptr_mmu_idx, ptr);
-# ifdef CONFIG_DEBUG_TCG
-{
-CPUTLBEntry *entry = tlb_entry(env, ptr_mmu_idx, ptr);
-target_ulong comparator = (ptr_access == MMU_DATA_LOAD
-   ? entry->addr_read
-   : tlb_addr_write(entry));
-g_assert(tlb_hit(comparator, ptr));
-}
-# endif
-full = &env_tlb(env)->d[ptr_mmu_idx].fulltlb[index];
-
 /* If the virtual page MemAttr != Tagged, access unchecked. */
-if (!arm_tlb_mte_tagged(&full->attrs)) {
+if (full->pte_attrs != 0xf0) {
 return NULL;
 }
 
@@ -162,6 +143,14 @@ static uint8_t *allocation_tag_mem(CPUARMState *env, int ptr_mmu_idx,
 return NULL;
 }
 
+/*
+ * Remember these values across the second lookup below,
+ * which may invalidate this pointer via tlb resize.
+ */
+ptr_paddr = full->phys_addr;
+attrs = full->attrs;
+full = NULL;
+
 /*
  * The Normal memory access can extend to the next page.  E.g. a single
  * 8-byte access to the last byte of a page will check only the last
@@ -170,9 +159,8 @@ static uint8_t *allocation_tag_mem(CPUARMState *env, int ptr_mmu_idx,
  */
 in_page = -(ptr | TARGET_PAGE_MASK);
 if (unlikely(ptr_size > in_page)) {
-void *ignore;
-flags |= probe_access_flags(env, ptr + in_page, ptr_access,
-ptr_mmu_idx, ra == 0, &ignore, ra);
+flags |= probe_access_full(env, ptr + in_page, ptr_access,
+   ptr_mmu_idx, ra == 0, &host, &full, ra);
 assert(!(flags & TLB_INVALID_MASK));
 }
 
@@ -180,33 +168,17 @@ static uint8_t *allocation_tag_mem(CPUARMState *env, int ptr_mmu_idx,
 if (unlikely(flags & TLB_WATCHPOINT)) {
 int wp = ptr_access == MMU_DATA_LOAD ? BP_MEM_READ : BP_MEM_WRITE;
 assert(ra != 0);
-cpu_check_watchpoint(env_cpu(env), ptr, ptr_size,
- full->attrs, wp, ra);
+cpu_check_watchpoint(env_cpu(env), ptr, ptr_size, attrs, wp, ra);
 }
 
-/*
- * Find the physical address within the normal mem space.
- * The memory region lookup must succeed because TLB_MMIO was
- * not set in the cputlb lookup above.

[PATCH v4 01/24] target/arm: Enable TARGET_PAGE_ENTRY_EXTRA

2022-10-10 Thread Richard Henderson
Copy attrs and shareability into the TLB.  This will eventually
be used by S1_ptw_translate to report stage1 translation failures,
and by do_ats_write to fill in PAR_EL1.

Reviewed-by: Peter Maydell 
Signed-off-by: Richard Henderson 
---
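The mechanism can be sketched in isolation as follows.  This is an
illustration under assumptions, not the QEMU headers: the SKETCH_* and
Sketch* names are hypothetical, and the generic struct is reduced to a
single field.  It shows a target-defined macro splicing extra members
into a generic per-page structure.

```c
#include <stdint.h>

/* Target-specific extra fields, spliced into the generic struct below. */
#define SKETCH_PAGE_ENTRY_EXTRA \
    uint8_t pte_attrs;          \
    uint8_t shareability;

/* Generic per-page entry; targets that define the macro get their own
 * fields appended without the generic code naming them. */
typedef struct SketchTLBEntryFull {
    uint64_t phys_addr;
#ifdef SKETCH_PAGE_ENTRY_EXTRA
    SKETCH_PAGE_ENTRY_EXTRA
#endif
} SketchTLBEntryFull;

/* Fill an entry the way a target's tlb_fill hook might, copying the
 * attributes computed by the page table walk. */
static inline SketchTLBEntryFull sketch_fill(uint64_t phys, uint8_t attrs,
                                             uint8_t shareability)
{
    SketchTLBEntryFull f = { .phys_addr = phys };
    f.pte_attrs = attrs;
    f.shareability = shareability;
    return f;
}
```

Only target code reads or writes the extra members; generic code just
carries them along with the rest of the entry.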
 target/arm/cpu-param.h  | 12 
 target/arm/tlb_helper.c |  3 +++
 2 files changed, 15 insertions(+)

diff --git a/target/arm/cpu-param.h b/target/arm/cpu-param.h
index 08681828ac..38347b0d20 100644
--- a/target/arm/cpu-param.h
+++ b/target/arm/cpu-param.h
@@ -30,6 +30,18 @@
  */
 # define TARGET_PAGE_BITS_VARY
 # define TARGET_PAGE_BITS_MIN  10
+
+/*
+ * Cache the attrs and shareability fields from the page table entry.
+ *
+ * For ARMMMUIdx_Stage2*, pte_attrs is the S2 descriptor bits [5:2].
+ * Otherwise, pte_attrs is the same as the MAIR_EL1 8-bit format.
+ * For shareability, as in the SH field of the VMSAv8-64 PTEs.
+ */
+# define TARGET_PAGE_ENTRY_EXTRA  \
+ uint8_t pte_attrs;   \
+ uint8_t shareability;
+
 #endif
 
 #define NB_MMU_MODES 8
diff --git a/target/arm/tlb_helper.c b/target/arm/tlb_helper.c
index 49601394ec..353edbeb1d 100644
--- a/target/arm/tlb_helper.c
+++ b/target/arm/tlb_helper.c
@@ -236,6 +236,9 @@ bool arm_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
 arm_tlb_mte_tagged(&res.f.attrs) = true;
 }
 
+res.f.pte_attrs = res.cacheattrs.attrs;
+res.f.shareability = res.cacheattrs.shareability;
+
 tlb_set_page_full(cs, mmu_idx, address, &res.f);
 return true;
 } else if (probe) {
-- 
2.34.1




[PATCH v4 00/24] target/arm: Implement FEAT_HAFDBS

2022-10-10 Thread Richard Henderson
Changes for v4:
  * Rebase on today's target-arm.next pull, including 21 patches.
  * Split AF and DB enablement into two patches.
  * Perform only one atomic update per PTE.
  * Raise Permission fault if an atomic update is required on a read-only PTE.
  * More use of S1Translate struct, which is perhaps now mis-named but
more generally useful/used; suggestions for better naming solicited.
  * Other minor updates per review.


r~


Based-on: 20221010142730.502083-1-peter.mayd...@linaro.org
("[PULL 00/28] target-arm queue")


Richard Henderson (24):
  target/arm: Enable TARGET_PAGE_ENTRY_EXTRA
  target/arm: Use probe_access_full for MTE
  target/arm: Use probe_access_full for BTI
  target/arm: Add ARMMMUIdx_Phys_{S,NS}
  target/arm: Move ARMMMUIdx_Stage2 to a real tlb mmu_idx
  target/arm: Restrict tlb flush from vttbr_write to vmid change
  target/arm: Split out S1Translate type
  target/arm: Plumb debug into S1Translate
  target/arm: Move be test for regime into S1TranslateResult
  target/arm: Use softmmu tlbs for page table walking
  target/arm: Split out get_phys_addr_twostage
  target/arm: Use bool consistently for get_phys_addr subroutines
  target/arm: Add ptw_idx to S1Translate
  target/arm: Add isar predicates for FEAT_HAFDBS
  target/arm: Extract HA and HD in aa64_va_parameters
  target/arm: Move S1_ptw_translate outside arm_ld[lq]_ptw
  target/arm: Add ARMFault_UnsuppAtomicUpdate
  target/arm: Remove loop from get_phys_addr_lpae
  target/arm: Fix fault reporting in get_phys_addr_lpae
  target/arm: Don't shift attrs in get_phys_addr_lpae
  target/arm: Consider GP an attribute in get_phys_addr_lpae
  target/arm: Implement FEAT_HAFDBS, access flag portion
  target/arm: Implement FEAT_HAFDBS, dirty bit portion
  target/arm: Use the max page size in a 2-stage ptw

 docs/system/arm/emulation.rst  |   1 +
 target/arm/cpu-param.h |  15 +-
 target/arm/cpu.h   |  57 +-
 target/arm/internals.h |   7 +
 target/arm/sve_ldst_internal.h |   1 +
 target/arm/cpu64.c |   1 +
 target/arm/helper.c| 163 --
 target/arm/mte_helper.c|  62 +--
 target/arm/ptw.c   | 945 ++---
 target/arm/sve_helper.c|  54 +-
 target/arm/tlb_helper.c|  24 +-
 target/arm/translate-a64.c |  21 +-
 12 files changed, 862 insertions(+), 489 deletions(-)

-- 
2.34.1




[PATCH v4 03/24] target/arm: Use probe_access_full for BTI

2022-10-10 Thread Richard Henderson
Add a field to TARGET_PAGE_ENTRY_EXTRA to hold the guarded bit.
In is_guarded_page, use probe_access_full instead of just guessing
that the tlb entry is still present.  Also handles the FIXME about
executing from device memory.

Reviewed-by: Peter Maydell 
Signed-off-by: Richard Henderson 
---
 target/arm/cpu-param.h |  9 +
 target/arm/cpu.h   | 13 -
 target/arm/internals.h |  1 +
 target/arm/ptw.c   |  7 ---
 target/arm/translate-a64.c | 21 ++---
 5 files changed, 20 insertions(+), 31 deletions(-)

diff --git a/target/arm/cpu-param.h b/target/arm/cpu-param.h
index 38347b0d20..f4338fd10e 100644
--- a/target/arm/cpu-param.h
+++ b/target/arm/cpu-param.h
@@ -36,12 +36,13 @@
  *
  * For ARMMMUIdx_Stage2*, pte_attrs is the S2 descriptor bits [5:2].
  * Otherwise, pte_attrs is the same as the MAIR_EL1 8-bit format.
- * For shareability, as in the SH field of the VMSAv8-64 PTEs.
+ * For shareability and guarded, as in the SH and GP fields respectively
+ * of the VMSAv8-64 PTEs.
  */
 # define TARGET_PAGE_ENTRY_EXTRA  \
- uint8_t pte_attrs;   \
- uint8_t shareability;
-
+uint8_t pte_attrs;\
+uint8_t shareability; \
+bool guarded;
 #endif
 
 #define NB_MMU_MODES 8
diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index f09ec8aa03..a34d496c5b 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -3388,19 +3388,6 @@ static inline uint64_t *aa64_vfp_qreg(CPUARMState *env, unsigned regno)
 /* Shared between translate-sve.c and sve_helper.c.  */
 extern const uint64_t pred_esz_masks[5];
 
-/* Helper for the macros below, validating the argument type. */
-static inline MemTxAttrs *typecheck_memtxattrs(MemTxAttrs *x)
-{
-return x;
-}
-
-/*
- * Lvalue macros for ARM TLB bits that we must cache in the TCG TLB.
- * Using these should be a bit more self-documenting than using the
- * generic target bits directly.
- */
-#define arm_tlb_bti_gp(x) (typecheck_memtxattrs(x)->target_tlb_bit0)
-
 /*
  * AArch64 usage of the PAGE_TARGET_* bits for linux-user.
  * Note that with the Linux kernel, PROT_MTE may not be cleared by mprotect
diff --git a/target/arm/internals.h b/target/arm/internals.h
index 9566364dca..c3c3920ded 100644
--- a/target/arm/internals.h
+++ b/target/arm/internals.h
@@ -1095,6 +1095,7 @@ typedef struct ARMCacheAttrs {
 unsigned int attrs:8;
 unsigned int shareability:2; /* as in the SH field of the VMSAv8-64 PTEs */
 bool is_s2_format:1;
+bool guarded:1;  /* guarded bit of the v8-64 PTE */
 } ARMCacheAttrs;
 
 /* Fields that are valid upon success. */
diff --git a/target/arm/ptw.c b/target/arm/ptw.c
index 23f16f4ff7..2d182d62e5 100644
--- a/target/arm/ptw.c
+++ b/target/arm/ptw.c
@@ -1313,9 +1313,10 @@ static bool get_phys_addr_lpae(CPUARMState *env, uint64_t address,
  */
 result->f.attrs.secure = false;
 }
-/* When in aarch64 mode, and BTI is enabled, remember GP in the IOTLB.  */
-if (aarch64 && guarded && cpu_isar_feature(aa64_bti, cpu)) {
-arm_tlb_bti_gp(&result->f.attrs) = true;
+
+/* When in aarch64 mode, and BTI is enabled, remember GP in the TLB.  */
+if (aarch64 && cpu_isar_feature(aa64_bti, cpu)) {
+result->f.guarded = guarded;
 }
 
 if (mmu_idx == ARMMMUIdx_Stage2 || mmu_idx == ARMMMUIdx_Stage2_S) {
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index 5b67375f4e..60ff753d81 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -14601,22 +14601,21 @@ static bool is_guarded_page(CPUARMState *env, DisasContext *s)
 #ifdef CONFIG_USER_ONLY
 return page_get_flags(addr) & PAGE_BTI;
 #else
+CPUTLBEntryFull *full;
+void *host;
 int mmu_idx = arm_to_core_mmu_idx(s->mmu_idx);
-unsigned int index = tlb_index(env, mmu_idx, addr);
-CPUTLBEntry *entry = tlb_entry(env, mmu_idx, addr);
+int flags;
 
 /*
  * We test this immediately after reading an insn, which means
- * that any normal page must be in the TLB.  The only exception
- * would be for executing from flash or device memory, which
- * does not retain the TLB entry.
- *
- * FIXME: Assume false for those, for now.  We could use
- * arm_cpu_get_phys_page_attrs_debug to re-read the page
- * table entry even for that case.
+ * that the TLB entry must be present and valid, and thus this
+ * access will never raise an exception.
  */
-return (tlb_hit(entry->addr_code, addr) &&
-arm_tlb_bti_gp(&env_tlb(env)->d[mmu_idx].fulltlb[index].attrs));
+flags = probe_access_full(env, addr, MMU_INST_FETCH, mmu_idx,
+  false, &host, &full, 0);
+assert(!(flags & TLB_INVALID_MASK));
+
+return full->guarded;
 #endif
 }
 
-- 
2.34.1




Re: [RISU PATCH 2/5] loongarch: Add LoongArch basic test support

2022-10-10 Thread gaosong


On 2022/10/10 23:34, Peter Maydell wrote:

+int get_risuop(struct reginfo *ri)
+{
+/* Return the risuop we have been asked to do
+ * (or -1 if this was a SIGILL for a non-risuop insn)
+ */
+uint32_t insn = ri->faulting_insn;
+uint32_t op = insn & 0xf;
+uint32_t key = insn & ~0xf;
+uint32_t risukey = 0x01f0;
+return (key != risukey) ? -1 : op;
+}

You'll probably find this needs tweaking when you rebase
on current risu git, because a recent refactor means this
function should now return a RisuOp, not an int. The changes
should be minor, though.

OK, I will correct it in v2.

Thanks.
Song Gao
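
As a rough illustration of the decoding discussed above, the sketch below splits a faulting instruction word into a key (all bits above the low four) and an operation number (the low four bits), and returns -1 when the key does not match. The constant `EXAMPLE_RISU_KEY` and the name `decode_risuop` are placeholders chosen for this example, not the values used by risu, and the return type is kept as `int` rather than the `RisuOp` type mentioned in the review.

```c
#include <stdint.h>

/* Hypothetical key value for illustration only; not the real risu encoding. */
#define EXAMPLE_RISU_KEY 0x000001f0u

/*
 * Return the operation number held in the low 4 bits of insn,
 * or -1 if the remaining bits do not match the key.
 */
static int decode_risuop(uint32_t insn)
{
    uint32_t op = insn & 0xfu;
    uint32_t key = insn & ~0xfu;

    return (key != EXAMPLE_RISU_KEY) ? -1 : (int)op;
}
```

Switching to an enum return type would mostly mean mapping the -1 case to whatever "not a risuop" value that enum provides.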



Re: [PATCH v2 09/11] vfio/migration: Reset device if setting recover state fails

2022-10-10 Thread liulongfang via
On 2022/5/31 1:07, Avihai Horon wrote:
> If vfio_migration_set_state() fails to set the device in the requested
> state it tries to put it in a recover state. If setting the device in
> the recover state fails as well, hw_error is triggered and the VM is
> aborted.
> 
> To improve user experience and avoid VM data loss, reset the device with
> VFIO_RESET_DEVICE instead of aborting the VM.
> 
> Signed-off-by: Avihai Horon 
> ---
>  hw/vfio/migration.c | 12 ++--
>  1 file changed, 10 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
> index 852759e6ca..6c34502611 100644
> --- a/hw/vfio/migration.c
> +++ b/hw/vfio/migration.c
> @@ -89,8 +89,16 @@ static int vfio_migration_set_state(VFIODevice *vbasedev,
>  /* Try to put the device in some good state */
>  mig_state->device_state = recover_state;
>  if (ioctl(vbasedev->fd, VFIO_DEVICE_FEATURE, feature)) {
> -hw_error("%s: Device in error state, can't recover",
> - vbasedev->name);
> +if (ioctl(vbasedev->fd, VFIO_DEVICE_RESET)) {
> +hw_error("%s: Device in error state, can't recover",
> + vbasedev->name);
> +}
> +
> +error_report(
> +"%s: Device was reset due to failure in changing device state to recover state %s",
> +vbasedev->name, mig_state_to_str(recover_state));
> +
> +return -1;
>  }
> 

When I used QEMU 7.1.50 built with this patch set, live migration failed
because the destination VM was disconnected during migration. When I then
exited the source QEMU, the following error appeared:

[100337.287047] BUG: Bad page state in process qemu-system-aar  pfn:82199518
[100337.295815] page:356de4da refcount:-2 mapcount:0 mapping:000
0 index:0x0 pfn:0x82199518
[100337.306403] flags: 0xbfff800(node=0|zone=2|lastcpupid=0x7fff)
[100337.314091] raw: 0bfff800 dead0100 dead0122 

[100337.322589] raw:   fffe 

[100337.330630] page dumped because: nonzero _refcount
[100337.335840] Modules linked in: hisi_acc_vfio_pci hisi_sec2 hisi_zip hisi_hpre hisi_qm uacce vfio_iommu_type1 vfio_pci vfio_pci_core vfio_virqfd vfio pv680_mii(O) [last unloaded: hisi_sec2]
[100337.354564] CPU: 1 PID: 786 Comm: qemu-system-aar Tainted: GB  O   6.0.0-rc4+ #1
[100337.377378] Call trace:
[100337.380382]  dump_backtrace.part.0+0xc4/0xd0
[100337.385791]  show_stack+0x24/0x40
[100337.389478]  dump_stack_lvl+0x68/0x84
[100337.394155]  dump_stack+0x18/0x34
[100337.398006]  bad_page+0xf0/0x120
[100337.401796]  check_free_page_bad+0x84/0x90
[100337.406404]  free_pcppages_bulk+0x1bc/0x2b0
[100337.411126]  free_unref_page_commit+0x120/0x15c
[100337.416935]  free_unref_page+0x15c/0x254
[100337.421436]  free_compound_page+0x6c/0x100
[100337.425868]  free_transhuge_page+0xd4/0x140
[100337.430535]  destroy_large_folio+0x30/0x40
[100337.434953]  release_pages+0x1bc/0x4d0
[100337.439268]  free_pages_and_swap_cache+0x68/0x80
[100337.444224]  tlb_batch_pages_flush+0x5c/0x94
[100337.448976]  tlb_flush_mmu+0x4c/0xd4
[100337.453062]  unmap_page_range+0x8d0/0xbd0
[100337.457432]  unmap_single_vma+0x90/0x12c
[100337.461673]  unmap_vmas+0x84/0xfc
[100337.465354]  exit_mmap+0x88/0x1b0
[100337.469008]  __mmput+0x48/0x134
[100337.472637]  mmput+0x44/0x50
[100337.475857]  do_exit+0x2b8/0x970
[100337.479641]  do_group_exit+0x40/0xac
[100337.484079]  get_signal+0x8c0/0x934
[100337.488215]  do_notify_resume+0x1d0/0x1570
[100337.492795]  el0_svc+0xa8/0xc0
[100337.496452]  el0t_64_sync_handler+0x1ac/0x1b0
[100337.501187]  el0t_64_sync+0x19c/0x1a0

Can anyone see what is causing this error?

>  error_report("%s: Failed changing device state to %s", vbasedev->name,
> 
Thanks
Longfang.



Re: [PATCH RFC] hw/cxl: type 3 devices can now present volatile or persistent memory

2022-10-10 Thread Gregory Price


I've pushed 5 new commits to this branch here (@Jonathan I've also made
a merge request to pull them into your branch).

https://gitlab.com/gourry.memverge/qemu/-/commits/cxl-2022-10-09

They're built on top of Jonathan's extensions for the CDAT, since the
CDAT has entries relevant to memory regions and trying to do this
separately would be unwise.

1/5: PCI_CLASS_MEMORY_CXL patch
2/5: CXL_CAPACITY_MULTIPLIER pullout patch (@Davidlohr request)
3/5: Generalizes CDATDsmas initialization ahead of multi-region
4/5: Multi-region support w/ backward compatibility
 * Requires extra eyes for CDAT and Read/Write Change Validation*
5/5: Test and documentation update


On Mon, Oct 10, 2022 at 11:25:31AM -0400, Gregory Price wrote:
> > 
> > https://gitlab.com/jic23/qemu/-/commits/cxl-2022-10-09
> > There are a few messy corners in that tree but it should work. I'll be
> > pushing out a new version in a few days.
> > 
> > I updated that in latest version to build the tables based on the
> > memdev provided.  We'll want to add the volatile support to that alongside
> > your patch.
> > 
> 
> I will rebase my --persistent-memdev and --volatile-memdev patch on your
> branch and send out the commits when I'm done.  I may also add the
> scaffolding for the partitionable-pmem field but not actually expose it.



Re: [PULL 29/55] Revert "intel_iommu: Fix irqchip / X2APIC configuration checks"

2022-10-10 Thread Peter Xu
On Mon, Oct 10, 2022 at 04:16:33PM -0700, David Woodhouse wrote:
> On Mon, 2022-10-10 at 15:08 -0400, Peter Xu wrote:
> > On Mon, Oct 10, 2022 at 10:39:52AM -0700, David Woodhouse wrote:
> > > On Mon, 2022-10-10 at 13:30 -0400, Michael S. Tsirkin wrote:
> > > > From: Peter Xu <
> > > > pet...@redhat.com
> > > > 
> > > > 
> > > > It's true that when vcpus<=255 we don't require the length of 32bit APIC
> > > > IDs.  However here since we already have EIM=ON it means the hypervisor
> > > > will declare the VM as x2apic supported (e.g. VT-d ECAP register will have
> > > > EIM bit 4 set), so the guest should assume the APIC IDs are 32bits width
> > > > even if vcpus<=255.  In short, commit 77250171bdc breaks any simple cmdline
> > > > that wants to boot a VM with >=9 but <=255 vcpus with:
> > > 
> > > I find that paragraph really hard to parse. What does it even mean that
> > > "guest should assume the APIC IDs are 32bits"? 
> > 
> > Quoting EIM definition:
> > 
> >  0: On Intel® 64 platforms, hardware supports only 8-bit APIC-IDs (xAPIC
> > Mode).
> > 
> >  1: On Intel® 64 platforms, hardware supports 32-bit APIC- IDs (x2APIC
> > mode).  Hardware implementation reporting Interrupt Remapping support
> > (IR) field as Clear also report this field as Clear.
> > 
> > I hope the statement was matching the spec.  Please let me know if you have
> > better way to reword it.
> 
> It needs to mention logical mode addressing. Because that, I presume,
> is why it broke only when you had more than 8 vCPUs. Because that's
> when the *logical* destination ID grew past 0xFF.

Agree.

> 
> > > In practice, all the EIM bit does is *allow* 32 bits of APIC ID in the
> > > tables. Which is perfectly fine if there are only 254 CPUs anyway, and
> > > we never need to use a higher value.
> > > 
> > > I *think* the actual problem here is when logical addressing is used,
> > > which puts the APIC cluster ID into higher bits? But it's kind of weird
> > > that the message doesn't mention that at all?
> > 
> > The commit message actually doesn't even need to contain a lot of
> > information in this case, IMO.
> 
> Well, it would be kind of useful if it said what the actual problem
> was, no?

Yes it'll be nice to have.

> 
> > Literally it can be seen as a revert of a commit which breaks guests with
> > more than 8 vcpus from boot.  I kept the other lines because they still
> > make sense, or it can be a full revert with "something broke with commit
> > xxx, revert it to fix" and anything else could be reworked.  AFAICT that's
> > how it normally works with QEMU or Linux.
> > 
> > I am not 100% familiar with the original purpose of the patch, would
> > eim=off work for you even after patch applied?  Anything severely wrong
> > with this patch?
> 
> I think the patch itself is fine; I'd just like the commit message to
> be clearer about what the problem was.

Thanks for confirming.

> 
> > > That's fixable by just setting the X2APIC_PHYSICAL bit in the ACPI
> > > FADT, isn't it? Then the only values that a guest may put into those
> > > fields — 32-bit fields or not — are lower than 0xff anyway.
> > 
> > It's still not clear to me why we need to make it inconsistent between the
> > EIM we declare to the guest and the KVM behavior on understanding EIM bit.
> > Even if enforced physical mode will work we lose the possibility of
> > cluster mode, and I also don't see what's the major benefit since EIM=off
> > will just work, afaiu, meanwhile make everything aligned.
> 
> Yeah, I think turning EIM off is absolutely fine.
> 
> > Are you fine if we proceed with this pull request first and revisit later?
> > Follow up patches will always be fine, and we're unbreaking something.  I
> > have copied you since the 1st patch I posted and the small patch was there
> > for weeks, it'll be appreciated if either you could comment earlier next
> > time, or even propose a better fix then we can discuss what's the best way
> > to fix.  Thanks.
> 
> Yeah, sorry for the delay. But that was partly because the commit
> message was confusing me and it took me a while to work out what was
> actually going on... which is really all I'm heckling now.

I see, that was totally fine, and it'll be definitely also fine to comment
anything even on the pull req.  It's just that as I tried to argue for this
specific case IMHO we should move on and revisit later so we shrink the
regression window, rather than redo a pull and let this fix wait for
another one.  It seems we reached a consensus on this, thanks for that.

In all cases (irrespective of the pull req), feel free to post any patch
either based on this one or as replacement.  I'll be happy to read and
rethink.  So far it still doesn't make sense to me to not enable kvm x2apic
with eim=on, but maybe I'm wrong, and I'd be happy to be corrected in that
case.

Thanks,

-- 
Peter Xu




Re: [PULL 29/55] Revert "intel_iommu: Fix irqchip / X2APIC configuration checks"

2022-10-10 Thread David Woodhouse
On Mon, 2022-10-10 at 15:08 -0400, Peter Xu wrote:
> On Mon, Oct 10, 2022 at 10:39:52AM -0700, David Woodhouse wrote:
> > On Mon, 2022-10-10 at 13:30 -0400, Michael S. Tsirkin wrote:
> > > From: Peter Xu <
> > > pet...@redhat.com
> > > 
> > > 
> > > It's true that when vcpus<=255 we don't require the length of 32bit APIC
> > > IDs.  However here since we already have EIM=ON it means the hypervisor
> > > will declare the VM as x2apic supported (e.g. VT-d ECAP register will have
> > > EIM bit 4 set), so the guest should assume the APIC IDs are 32bits width
> > > even if vcpus<=255.  In short, commit 77250171bdc breaks any simple cmdline
> > > that wants to boot a VM with >=9 but <=255 vcpus with:
> > 
> > I find that paragraph really hard to parse. What does it even mean that
> > "guest should assume the APIC IDs are 32bits"? 
> 
> Quoting EIM definition:
> 
>  0: On Intel® 64 platforms, hardware supports only 8-bit APIC-IDs (xAPIC
> Mode).
> 
>  1: On Intel® 64 platforms, hardware supports 32-bit APIC- IDs (x2APIC
> mode).  Hardware implementation reporting Interrupt Remapping support
> (IR) field as Clear also report this field as Clear.
> 
> I hope the statement was matching the spec.  Please let me know if you have
> better way to reword it.

It needs to mention logical mode addressing. Because that, I presume,
is why it broke only when you had more than 8 vCPUs. Because that's
when the *logical* destination ID grew past 0xFF.

> > In practice, all the EIM bit does is *allow* 32 bits of APIC ID in the
> > tables. Which is perfectly fine if there are only 254 CPUs anyway, and
> > we never need to use a higher value.
> > 
> > I *think* the actual problem here is when logical addressing is used,
> > which puts the APIC cluster ID into higher bits? But it's kind of weird
> > that the message doesn't mention that at all?
> 
> The commit message actually doesn't even need to contain a lot of
> information in this case, IMO.

Well, it would be kind of useful if it said what the actual problem
was, no?

> Literally it can be seen as a revert of a commit which breaks guests with
> more than 8 vcpus from boot.  I kept the other lines because they still make
> sense, or it can be a full revert with "something broke with commit xxx,
> revert it to fix" and anything else could be reworked.  AFAICT that's how it
> normally works with QEMU or Linux.
> 
> I am not 100% familiar with the original purpose of the patch, would
> eim=off work for you even after patch applied?  Anything severely wrong
> with this patch?

I think the patch itself is fine; I'd just like the commit message to
be clearer about what the problem was.

> > That's fixable by just setting the X2APIC_PHYSICAL bit in the ACPI
> > FADT, isn't it? Then the only values that a guest may put into those
> > fields — 32-bit fields or not — are lower than 0xff anyway.
> 
> It's still not clear to me why we need to make it inconsistent between the
> EIM we declare to the guest and the KVM behavior on understanding EIM bit.
> Even if enforced physical mode will work we lose the possibility of
> cluster mode, and I also don't see what's the major benefit since EIM=off
> will just work, afaiu, meanwhile make everything aligned.

Yeah, I think turning EIM off is absolutely fine.

> Are you fine if we proceed with this pull request first and revisit later?
> Follow up patches will always be fine, and we're unbreaking something.  I
> have copied you since the 1st patch I posted and the small patch was there
> for weeks, it'll be appreciated if either you could comment earlier next
> time, or even propose a better fix then we can discuss what's the best way
> to fix.  Thanks.

Yeah, sorry for the delay. But that was partly because the commit
message was confusing me and it took me a while to work out what was
actually going on... which is really all I'm heckling now.
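
The point about logical mode addressing can be made concrete with a small calculation. In x2APIC cluster mode the logical ID is derived from the APIC ID as (ID[19:4] << 16) | (1 << ID[3:0]), per the Intel SDM. The sketch below (helper names are made up for this example, and it is not QEMU or KVM code) shows that APIC ID 8 is the first one whose logical ID no longer fits in 8 bits, which would be consistent with the breakage appearing only with more than 8 vCPUs.

```c
#include <stdbool.h>
#include <stdint.h>

/* Logical x2APIC ID in cluster mode: cluster in bits 31:16, one-hot bit in 15:0. */
static uint32_t x2apic_logical_id(uint32_t apic_id)
{
    uint32_t cluster = (apic_id >> 4) & 0xffffu;
    uint32_t bit = 1u << (apic_id & 0xfu);

    return (cluster << 16) | bit;
}

/* True if the value fits in an 8-bit destination field. */
static bool fits_in_8_bits(uint32_t dest)
{
    return dest <= 0xffu;
}
```

For APIC IDs 0 through 7 the logical ID is a single bit within the low byte; APIC ID 8 yields 0x100, and any ID of 16 or above also sets cluster bits in the upper half.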





Re: [RFC PATCH 0/6] QEMU CXL Provide mock CXL events and irq support

2022-10-10 Thread Ira Weiny
On Mon, Oct 10, 2022 at 03:29:38PM -0700, Ira wrote:
> From: Ira Weiny 
> 
> CXL Event records inform the OS of various CXL device events.  Thus far CXL
> memory devices are emulated and therefore don't naturally have events which
> will occur.
> 
> Add mock events and a HMP trigger mechanism to facilitate guest OS testing of
> event support.
> 
> This support requires a follow on version of the event patch set.  The RFC was
> submitted and discussed here:
> 
>   
> https://lore.kernel.org/linux-cxl/20220813053243.757363-1-ira.we...@intel.com/
> 
> I'll post the lore link to the new version shortly.

Kernel support now posted here:


https://lore.kernel.org/all/20221010224131.1866246-1-ira.we...@intel.com/

Ira

> 
> Instructions for running this test.
> 
> Add qmp option to qemu:
> 
>    $ qemu-system-x86_64 ... -qmp unix:/tmp/run_qemu_qmp_0,server,nowait ...
> 
>   OR
> 
>$ run_qemu.sh ... --qmp ...
> 
> Enable tracing of events within the guest:
> 
>$ echo "" > /sys/kernel/tracing/trace
>$ echo 1 > /sys/kernel/tracing/events/cxl/enable
>$ echo 1 > /sys/kernel/tracing/tracing_on
> 
> Trigger event generation and interrupts in the host:
> 
>    $ echo "cxl_event_inject cxl-devX" | qmp-shell -H /tmp/run_qemu_qmp_0
> 
>   Where X == one of the memory devices; cxl-dev0 should work.
> 
> View events on the guest:
> 
>$ cat /sys/kernel/tracing/trace
> 
> 
> Ira Weiny (6):
>   qemu/bswap: Add const_le64()
>   qemu/uuid: Add UUID static initializer
>   hw/cxl/cxl-events: Add CXL mock events
>   hw/cxl/mailbox: Wire up get/clear event mailbox commands
>   hw/cxl/cxl-events: Add event interrupt support
>   hw/cxl/mailbox: Wire up Get/Set Event Interrupt policy
> 
>  hmp-commands.hx |  14 ++
>  hw/cxl/cxl-device-utils.c   |   1 +
>  hw/cxl/cxl-events.c | 330 
>  hw/cxl/cxl-host-stubs.c |   5 +
>  hw/cxl/cxl-mailbox-utils.c  | 224 +---
>  hw/cxl/meson.build  |   1 +
>  hw/mem/cxl_type3.c  |   7 +-
>  include/hw/cxl/cxl_device.h |  22 +++
>  include/hw/cxl/cxl_events.h | 194 +
>  include/qemu/bswap.h|  10 ++
>  include/qemu/uuid.h |  12 ++
>  include/sysemu/sysemu.h |   3 +
>  12 files changed, 802 insertions(+), 21 deletions(-)
>  create mode 100644 hw/cxl/cxl-events.c
>  create mode 100644 include/hw/cxl/cxl_events.h
> 
> 
> base-commit: 6f7f81898e4437ea544ee4ca24bef7ec543b1f06
> -- 
> 2.37.2
> 



[RFC PATCH 0/6] QEMU CXL Provide mock CXL events and irq support

2022-10-10 Thread ira . weiny
From: Ira Weiny 

CXL Event records inform the OS of various CXL device events.  Thus far CXL
memory devices are emulated and therefore don't naturally have events which
will occur.

Add mock events and a HMP trigger mechanism to facilitate guest OS testing of
event support.

This support requires a follow on version of the event patch set.  The RFC was
submitted and discussed here:


https://lore.kernel.org/linux-cxl/20220813053243.757363-1-ira.we...@intel.com/

I'll post the lore link to the new version shortly.

Instructions for running this test.

Add qmp option to qemu:

 $ qemu-system-x86_64 ... -qmp unix:/tmp/run_qemu_qmp_0,server,nowait ...

OR

 $ run_qemu.sh ... --qmp ...

Enable tracing of events within the guest:

 $ echo "" > /sys/kernel/tracing/trace
 $ echo 1 > /sys/kernel/tracing/events/cxl/enable
 $ echo 1 > /sys/kernel/tracing/tracing_on

Trigger event generation and interrupts in the host:

 $ echo "cxl_event_inject cxl-devX" | qmp-shell -H /tmp/run_qemu_qmp_0

Where X == one of the memory devices; cxl-dev0 should work.

View events on the guest:

 $ cat /sys/kernel/tracing/trace


Ira Weiny (6):
  qemu/bswap: Add const_le64()
  qemu/uuid: Add UUID static initializer
  hw/cxl/cxl-events: Add CXL mock events
  hw/cxl/mailbox: Wire up get/clear event mailbox commands
  hw/cxl/cxl-events: Add event interrupt support
  hw/cxl/mailbox: Wire up Get/Set Event Interrupt policy

 hmp-commands.hx |  14 ++
 hw/cxl/cxl-device-utils.c   |   1 +
 hw/cxl/cxl-events.c | 330 
 hw/cxl/cxl-host-stubs.c |   5 +
 hw/cxl/cxl-mailbox-utils.c  | 224 +---
 hw/cxl/meson.build  |   1 +
 hw/mem/cxl_type3.c  |   7 +-
 include/hw/cxl/cxl_device.h |  22 +++
 include/hw/cxl/cxl_events.h | 194 +
 include/qemu/bswap.h|  10 ++
 include/qemu/uuid.h |  12 ++
 include/sysemu/sysemu.h |   3 +
 12 files changed, 802 insertions(+), 21 deletions(-)
 create mode 100644 hw/cxl/cxl-events.c
 create mode 100644 include/hw/cxl/cxl_events.h


base-commit: 6f7f81898e4437ea544ee4ca24bef7ec543b1f06
-- 
2.37.2




[RFC PATCH 4/6] hw/cxl/mailbox: Wire up get/clear event mailbox commands

2022-10-10 Thread ira . weiny
From: Ira Weiny 

Replace the stubbed out CXL Get/Clear Event mailbox commands with
commands which return the mock event information.

Signed-off-by: Ira Weiny 
---
 hw/cxl/cxl-device-utils.c  |   1 +
 hw/cxl/cxl-mailbox-utils.c | 103 +++--
 2 files changed, 101 insertions(+), 3 deletions(-)

diff --git a/hw/cxl/cxl-device-utils.c b/hw/cxl/cxl-device-utils.c
index 687759b3017b..4bb41101882e 100644
--- a/hw/cxl/cxl-device-utils.c
+++ b/hw/cxl/cxl-device-utils.c
@@ -262,4 +262,5 @@ void cxl_device_register_init_common(CXLDeviceState *cxl_dstate)
 memdev_reg_init_common(cxl_dstate);
 
 assert(cxl_initialize_mailbox(cxl_dstate) == 0);
+cxl_mock_add_event_logs(cxl_dstate);
 }
diff --git a/hw/cxl/cxl-mailbox-utils.c b/hw/cxl/cxl-mailbox-utils.c
index bb66c765a538..df345f23a30c 100644
--- a/hw/cxl/cxl-mailbox-utils.c
+++ b/hw/cxl/cxl-mailbox-utils.c
@@ -9,6 +9,7 @@
 
 #include "qemu/osdep.h"
 #include "hw/cxl/cxl.h"
+#include "hw/cxl/cxl_events.h"
 #include "hw/pci/pci.h"
 #include "qemu/cutils.h"
 #include "qemu/log.h"
@@ -116,11 +117,107 @@ struct cxl_cmd {
 return CXL_MBOX_SUCCESS;  \
 }
 
-DEFINE_MAILBOX_HANDLER_ZEROED(events_get_records, 0x20);
-DEFINE_MAILBOX_HANDLER_NOP(events_clear_records);
 DEFINE_MAILBOX_HANDLER_ZEROED(events_get_interrupt_policy, 4);
 DEFINE_MAILBOX_HANDLER_NOP(events_set_interrupt_policy);
 
+static ret_code cmd_events_get_records(struct cxl_cmd *cmd,
+   CXLDeviceState *cxlds,
+   uint16_t *len)
+{
+struct cxl_get_event_payload *pl;
+struct cxl_event_log *log;
+uint8_t log_type;
+uint16_t nr_overflow;
+
+if (cmd->in < sizeof(log_type)) {
+return CXL_MBOX_INVALID_INPUT;
+}
+
+log_type = *((uint8_t *)cmd->payload);
+if (log_type >= CXL_EVENT_TYPE_MAX) {
+return CXL_MBOX_INVALID_INPUT;
+}
+
+pl = (struct cxl_get_event_payload *)cmd->payload;
+
+log = find_event_log(cxlds, log_type);
+if (!log || log_empty(log)) {
+goto no_data;
+}
+
+memset(pl, 0, sizeof(*pl));
+pl->record_count = const_le16(1);
+
+if (log_rec_left(log) > 1) {
+pl->flags |= CXL_GET_EVENT_FLAG_MORE_RECORDS;
+}
+
+nr_overflow = log_overflow(log);
+if (nr_overflow) {
+struct timespec ts;
+uint64_t ns;
+
+clock_gettime(CLOCK_REALTIME, &ts);
+
+ns = ((uint64_t)ts.tv_sec * 1000000000) + (uint64_t)ts.tv_nsec;
+
+pl->flags |= CXL_GET_EVENT_FLAG_OVERFLOW;
+pl->overflow_err_count = cpu_to_le16(nr_overflow);
+ns -= 5000000000; /* 5s ago */
+pl->first_overflow_timestamp = cpu_to_le64(ns);
+ns -= 1000000000; /* 1s ago */
+pl->last_overflow_timestamp = cpu_to_le64(ns);
+}
+
+memcpy(&pl->record, get_cur_event(log), sizeof(pl->record));
+pl->record.hdr.handle = get_cur_event_handle(log);
+*len = sizeof(pl->record);
+return CXL_MBOX_SUCCESS;
+
+no_data:
+*len = sizeof(*pl) - sizeof(pl->record);
+memset(pl, 0, *len);
+return CXL_MBOX_SUCCESS;
+}
+
+static ret_code cmd_events_clear_records(struct cxl_cmd *cmd,
+ CXLDeviceState *cxlds,
+ uint16_t *len)
+{
+struct cxl_mbox_clear_event_payload *pl;
+struct cxl_event_log *log;
+uint8_t log_type;
+
+pl = (struct cxl_mbox_clear_event_payload *)cmd->payload;
+log_type = pl->event_log;
+
+/* Don't handle more than 1 record at a time */
+if (pl->nr_recs != 1) {
+return CXL_MBOX_INVALID_INPUT;
+}
+
+if (log_type >= CXL_EVENT_TYPE_MAX) {
+return CXL_MBOX_INVALID_INPUT;
+}
+
+log = find_event_log(cxlds, log_type);
+if (!log) {
+return CXL_MBOX_SUCCESS;
+}
+
+/*
+ * The current code clears events as they are read.  Test that behavior
+ * only; don't support clearing from the middle of the log
+ */
+if (log->cur_event != le16_to_cpu(pl->handle)) {
+return CXL_MBOX_INVALID_INPUT;
+}
+
+log->cur_event++;
+*len = 0;
+return CXL_MBOX_SUCCESS;
+}
+
 /* 8.2.9.2.1 */
 static ret_code cmd_firmware_update_get_info(struct cxl_cmd *cmd,
  CXLDeviceState *cxl_dstate,
@@ -391,7 +488,7 @@ static struct cxl_cmd cxl_cmd_set[256][256] = {
 [EVENTS][GET_RECORDS] = { "EVENTS_GET_RECORDS",
 cmd_events_get_records, 1, 0 },
 [EVENTS][CLEAR_RECORDS] = { "EVENTS_CLEAR_RECORDS",
-cmd_events_clear_records, ~0, IMMEDIATE_LOG_CHANGE },
+cmd_events_clear_records, 8, IMMEDIATE_LOG_CHANGE },
 [EVENTS][GET_INTERRUPT_POLICY] = { "EVENTS_GET_INTERRUPT_POLICY",
 cmd_events_get_interrupt_policy, 0, 0 },
 [EVENTS][SET_INTERRUPT_POLICY] = { "EVENTS_SET_INTERRUPT_POLICY",
-- 
2.37.2
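
The handlers in this patch implement a read-one, clear-one protocol: Get Event Records returns the record at the log cursor, and Clear Event Records only accepts the handle of that same record before advancing the cursor. A simplified, standalone model of that cursor logic is sketched below. The `toy_` names and the fixed capacity are assumptions made for this example and do not appear in the patch; payload layout, endianness handling, and overflow reporting are left out. Unlike the patch as posted, the model also rejects a clear on an empty log.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_LOG_CAPACITY 8  /* arbitrary size for the example */

struct toy_event_log {
    uint16_t cur_event;   /* handle of the next record to hand out */
    uint16_t nr_events;   /* number of records stored */
    int events[TOY_LOG_CAPACITY];
};

static bool toy_log_empty(const struct toy_event_log *log)
{
    return log->cur_event == log->nr_events;
}

/*
 * Read the record at the cursor without consuming it.
 * Returns false if the log is empty; otherwise fills in the record,
 * its handle, and whether more records follow.
 */
static bool toy_get_record(const struct toy_event_log *log, int *record,
                           uint16_t *handle, bool *more)
{
    if (toy_log_empty(log)) {
        return false;
    }
    *record = log->events[log->cur_event];
    *handle = log->cur_event;
    *more = (log->nr_events - log->cur_event) > 1;
    return true;
}

/* Clear succeeds only for the record at the cursor, then advances it. */
static bool toy_clear_record(struct toy_event_log *log, uint16_t handle)
{
    if (toy_log_empty(log) || handle != log->cur_event) {
        return false;
    }
    log->cur_event++;
    return true;
}
```

A guest driver following this model would loop: get a record, process it, clear it by handle, and stop once the "more" indication is false.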




[RFC PATCH 2/6] qemu/uuid: Add UUID static initializer

2022-10-10 Thread ira . weiny
From: Ira Weiny 

UUIDs are defined as network byte order fields.  No static initializer
was available for UUIDs in their standard big endian format.

Define a big endian initializer for UUIDs.

Signed-off-by: Ira Weiny 
---
 include/qemu/uuid.h | 12 
 1 file changed, 12 insertions(+)

diff --git a/include/qemu/uuid.h b/include/qemu/uuid.h
index 9925febfa54d..dc40ee1fc998 100644
--- a/include/qemu/uuid.h
+++ b/include/qemu/uuid.h
@@ -61,6 +61,18 @@ typedef struct {
 (clock_seq_hi_and_reserved), (clock_seq_low), (node0), (node1), (node2),\
 (node3), (node4), (node5) }
 
+/* Normal (network byte order) UUID */
+#define UUID(time_low, time_mid, time_hi_and_version,\
+  clock_seq_hi_and_reserved, clock_seq_low, node0, node1, node2, \
+  node3, node4, node5)   \
+  { ((time_low) >> 24) & 0xff, ((time_low) >> 16) & 0xff,\
+((time_low) >> 8) & 0xff, (time_low) & 0xff, \
+((time_mid) >> 8) & 0xff, (time_mid) & 0xff, \
+((time_hi_and_version) >> 8) & 0xff, (time_hi_and_version) & 0xff,   \
+(clock_seq_hi_and_reserved), (clock_seq_low),\
+(node0), (node1), (node2), (node3), (node4), (node5) \
+  }
+
 #define UUID_FMT "%02hhx%02hhx%02hhx%02hhx-" \
  "%02hhx%02hhx-%02hhx%02hhx-" \
  "%02hhx%02hhx-" \
-- 
2.37.2
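
To see what a big-endian initializer of this shape produces, the sketch below re-creates the same byte layout in a standalone macro and applies it to the UUID that the series uses for the general media event record. `EXAMPLE_UUID_BE` and `struct example_uuid` are stand-ins written for this illustration, not QEMU definitions.

```c
#include <stdint.h>

struct example_uuid {
    uint8_t data[16];
};

/* Bytes are emitted most significant first (network byte order). */
#define EXAMPLE_UUID_BE(time_low, time_mid, time_hi_and_version,          \
                        clock_seq_hi, clock_seq_low,                      \
                        n0, n1, n2, n3, n4, n5)                           \
    { (((time_low) >> 24) & 0xff), (((time_low) >> 16) & 0xff),           \
      (((time_low) >> 8) & 0xff), ((time_low) & 0xff),                    \
      (((time_mid) >> 8) & 0xff), ((time_mid) & 0xff),                    \
      (((time_hi_and_version) >> 8) & 0xff),                              \
      ((time_hi_and_version) & 0xff),                                     \
      (clock_seq_hi), (clock_seq_low),                                    \
      (n0), (n1), (n2), (n3), (n4), (n5) }

/* Text form: fbcd0a77-c260-417f-85a9-088b1621eba6 */
static const struct example_uuid example_gen_media_uuid = {
    .data = EXAMPLE_UUID_BE(0xfbcd0a77, 0xc260, 0x417f,
                            0x85, 0xa9, 0x08, 0x8b, 0x16, 0x21, 0xeb, 0xa6),
};
```

With this layout, reading the sixteen bytes in order and printing them as hex gives the UUID in its usual textual order, without any byte swapping at run time.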




[RFC PATCH 3/6] hw/cxl/cxl-events: Add CXL mock events

2022-10-10 Thread ira . weiny
From: Ira Weiny 

To facilitate testing of guest software add mock events and code to
support iterating through the event logs.

Signed-off-by: Ira Weiny 
---
 hw/cxl/cxl-events.c | 248 
 hw/cxl/meson.build  |   1 +
 include/hw/cxl/cxl_device.h |  19 +++
 include/hw/cxl/cxl_events.h | 173 +
 4 files changed, 441 insertions(+)
 create mode 100644 hw/cxl/cxl-events.c
 create mode 100644 include/hw/cxl/cxl_events.h

diff --git a/hw/cxl/cxl-events.c b/hw/cxl/cxl-events.c
new file mode 100644
index ..c275280bcb64
--- /dev/null
+++ b/hw/cxl/cxl-events.c
@@ -0,0 +1,248 @@
+/*
+ * CXL Event processing
+ *
+ * Copyright(C) 2022 Intel Corporation.
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2. See the
+ * COPYING file in the top-level directory.
+ */
+
+#include 
+
+#include "qemu/osdep.h"
+#include "qemu/bswap.h"
+#include "qemu/typedefs.h"
+#include "hw/cxl/cxl.h"
+#include "hw/cxl/cxl_events.h"
+
+struct cxl_event_log *find_event_log(CXLDeviceState *cxlds, int log_type)
+{
+if (log_type >= CXL_EVENT_TYPE_MAX) {
+return NULL;
+}
+return &cxlds->event_logs[log_type];
+}
+
+struct cxl_event_record_raw *get_cur_event(struct cxl_event_log *log)
+{
+return log->events[log->cur_event];
+}
+
+uint16_t get_cur_event_handle(struct cxl_event_log *log)
+{
+return cpu_to_le16(log->cur_event);
+}
+
+bool log_empty(struct cxl_event_log *log)
+{
+return log->cur_event == log->nr_events;
+}
+
+int log_rec_left(struct cxl_event_log *log)
+{
+return log->nr_events - log->cur_event;
+}
+
+static void event_store_add_event(CXLDeviceState *cxlds,
+  enum cxl_event_log_type log_type,
+  struct cxl_event_record_raw *event)
+{
+struct cxl_event_log *log;
+
+assert(log_type < CXL_EVENT_TYPE_MAX);
+
+log = &cxlds->event_logs[log_type];
+assert(log->nr_events < CXL_TEST_EVENT_CNT_MAX);
+
+log->events[log->nr_events] = event;
+log->nr_events++;
+}
+
+uint16_t log_overflow(struct cxl_event_log *log)
+{
+int cnt = log_rec_left(log) - 5;
+
+if (cnt < 0) {
+return 0;
+}
+return cnt;
+}
+
+#define CXL_EVENT_RECORD_FLAG_PERMANENT BIT(2)
+#define CXL_EVENT_RECORD_FLAG_MAINT_NEEDED  BIT(3)
+#define CXL_EVENT_RECORD_FLAG_PERF_DEGRADED BIT(4)
+#define CXL_EVENT_RECORD_FLAG_HW_REPLACEBIT(5)
+
+struct cxl_event_record_raw maint_needed = {
+.hdr = {
+.id.data = UUID(0xDEADBEEF, 0xCAFE, 0xBABE,
+0xa5, 0x5a, 0xa5, 0x5a, 0xa5, 0xa5, 0x5a, 0xa5),
+.length = sizeof(struct cxl_event_record_raw),
+.flags[0] = CXL_EVENT_RECORD_FLAG_MAINT_NEEDED,
+/* .handle = Set dynamically */
+.related_handle = const_le16(0xa5b6),
+},
+.data = { 0xDE, 0xAD, 0xBE, 0xEF },
+};
+
+struct cxl_event_record_raw hardware_replace = {
+.hdr = {
+.id.data = UUID(0xBABECAFE, 0xBEEF, 0xDEAD,
+0xa5, 0x5a, 0xa5, 0x5a, 0xa5, 0xa5, 0x5a, 0xa5),
+.length = sizeof(struct cxl_event_record_raw),
+.flags[0] = CXL_EVENT_RECORD_FLAG_HW_REPLACE,
+/* .handle = Set dynamically */
+.related_handle = const_le16(0xb6a5),
+},
+.data = { 0xDE, 0xAD, 0xBE, 0xEF },
+};
+
+#define CXL_GMER_EVT_DESC_UNCORECTABLE_EVENTBIT(0)
+#define CXL_GMER_EVT_DESC_THRESHOLD_EVENT   BIT(1)
+#define CXL_GMER_EVT_DESC_POISON_LIST_OVERFLOW  BIT(2)
+
+#define CXL_GMER_MEM_EVT_TYPE_ECC_ERROR 0x00
+#define CXL_GMER_MEM_EVT_TYPE_INV_ADDR  0x01
+#define CXL_GMER_MEM_EVT_TYPE_DATA_PATH_ERROR   0x02
+
+#define CXL_GMER_TRANS_UNKNOWN  0x00
+#define CXL_GMER_TRANS_HOST_READ    0x01
+#define CXL_GMER_TRANS_HOST_WRITE   0x02
+#define CXL_GMER_TRANS_HOST_SCAN_MEDIA  0x03
+#define CXL_GMER_TRANS_HOST_INJECT_POISON   0x04
+#define CXL_GMER_TRANS_INTERNAL_MEDIA_SCRUB 0x05
+#define CXL_GMER_TRANS_INTERNAL_MEDIA_MANAGEMENT    0x06
+
+#define CXL_GMER_VALID_CHANNEL  BIT(0)
+#define CXL_GMER_VALID_RANK BIT(1)
+#define CXL_GMER_VALID_DEVICE   BIT(2)
+#define CXL_GMER_VALID_COMPONENT    BIT(3)
+
+struct cxl_event_gen_media gen_media = {
+.hdr = {
+.id.data = UUID(0xfbcd0a77, 0xc260, 0x417f,
+0x85, 0xa9, 0x08, 0x8b, 0x16, 0x21, 0xeb, 0xa6),
+.length = sizeof(struct cxl_event_gen_media),
+.flags[0] = CXL_EVENT_RECORD_FLAG_PERMANENT,
+/* .handle = Set dynamically */
+.related_handle = const_le16(0),
+},
+.phys_addr = const_le64(0x2000),
+.descriptor = CXL_GMER_EVT_DESC_UNCORECTABLE_EVENT,
+.type = CXL_GMER_MEM_EVT_TYPE_DATA_PATH_ERROR,
+.transaction_type = 

[RFC PATCH 6/6] hw/cxl/mailbox: Wire up Get/Set Event Interrupt policy

2022-10-10 Thread ira . weiny
From: Ira Weiny 

Replace the stubbed out CXL Get/Set Event interrupt policy mailbox
commands.  Enable those commands to control interrupts for each of the
event log types.
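
The per-log handling described above can be sketched as a table-driven loop. This is only an illustration, not the code from the patch: every `ex_`/`EX_` name is hypothetical, the bit layout (mode in bits 1:0, vector in bits 7:4) is an assumption because include/hw/cxl/cxl_events.h is not shown here, and the dynamic capacity log is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Assumed encodings (not shown in the patch): mode in bits 1:0, vector in bits 7:4. */
#define EX_INT_MODE_MASK    0x03
#define EX_INT_NONE         0x00
#define EX_INT_MSI_MSIX     0x01
#define EX_INT_SETTING(vec) ((uint8_t)((((vec) & 0x0f) << 4) | EX_INT_MSI_MSIX))

enum ex_log_type { EX_LOG_INFO, EX_LOG_WARN, EX_LOG_FAIL, EX_LOG_FATAL, EX_LOG_MAX };

struct ex_log {
    bool irq_enabled;
    uint8_t irq_vec;
};

struct ex_dev {
    struct ex_log logs[EX_LOG_MAX];
    uint8_t event_vector[EX_LOG_MAX];   /* vector the device assigns to each log */
};

/* Set policy: the guest chooses the mode, the device chooses the vector. */
static inline void ex_set_policy(struct ex_dev *dev, const uint8_t *settings)
{
    assert(dev != NULL && settings != NULL);
    for (int i = 0; i < EX_LOG_MAX; i++) {
        struct ex_log *log = &dev->logs[i];

        if ((settings[i] & EX_INT_MODE_MASK) == EX_INT_MSI_MSIX) {
            log->irq_enabled = true;
            log->irq_vec = dev->event_vector[i];
        } else {
            log->irq_enabled = false;
            log->irq_vec = 0;
        }
    }
}

/* Get policy: report mode and vector for enabled logs, zero otherwise. */
static inline void ex_get_policy(const struct ex_dev *dev, uint8_t *settings)
{
    assert(dev != NULL && settings != NULL);
    memset(settings, 0, EX_LOG_MAX);
    for (int i = 0; i < EX_LOG_MAX; i++) {
        if (dev->logs[i].irq_enabled) {
            settings[i] = EX_INT_SETTING(dev->logs[i].irq_vec);
        }
    }
}

/* Helper for experiments: set one log's policy, then read it back. */
static inline uint8_t ex_roundtrip(enum ex_log_type type, uint8_t requested,
                                   uint8_t dev_vec)
{
    struct ex_dev dev;
    uint8_t in[EX_LOG_MAX] = { 0 };
    uint8_t out[EX_LOG_MAX];

    memset(&dev, 0, sizeof(dev));
    dev.event_vector[type] = dev_vec;
    in[type] = requested;
    ex_set_policy(&dev, in);
    ex_get_policy(&dev, out);
    return out[type];
}
```

In this sketch any vector bits the guest writes are ignored on Set, which mirrors how the handlers in the patch take the vector from the device state rather than from the payload.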

Signed-off-by: Ira Weiny 
---
 hw/cxl/cxl-mailbox-utils.c  | 129 ++--
 include/hw/cxl/cxl_events.h |  21 ++
 2 files changed, 129 insertions(+), 21 deletions(-)

diff --git a/hw/cxl/cxl-mailbox-utils.c b/hw/cxl/cxl-mailbox-utils.c
index df345f23a30c..52e8804c24ed 100644
--- a/hw/cxl/cxl-mailbox-utils.c
+++ b/hw/cxl/cxl-mailbox-utils.c
@@ -101,25 +101,6 @@ struct cxl_cmd {
 uint8_t *payload;
 };
 
-#define DEFINE_MAILBOX_HANDLER_ZEROED(name, size) \
-uint16_t __zero##name = size; \
-static ret_code cmd_##name(struct cxl_cmd *cmd,   \
-   CXLDeviceState *cxl_dstate, uint16_t *len) \
-{ \
-*len = __zero##name;  \
-memset(cmd->payload, 0, *len);\
-return CXL_MBOX_SUCCESS;  \
-}
-#define DEFINE_MAILBOX_HANDLER_NOP(name)  \
-static ret_code cmd_##name(struct cxl_cmd *cmd,   \
-   CXLDeviceState *cxl_dstate, uint16_t *len) \
-{ \
-return CXL_MBOX_SUCCESS;  \
-}
-
-DEFINE_MAILBOX_HANDLER_ZEROED(events_get_interrupt_policy, 4);
-DEFINE_MAILBOX_HANDLER_NOP(events_set_interrupt_policy);
-
 static ret_code cmd_events_get_records(struct cxl_cmd *cmd,
CXLDeviceState *cxlds,
uint16_t *len)
@@ -218,6 +199,110 @@ static ret_code cmd_events_clear_records(struct cxl_cmd 
*cmd,
 return CXL_MBOX_SUCCESS;
 }
 
+static ret_code cmd_events_get_interrupt_policy(struct cxl_cmd *cmd,
+CXLDeviceState *cxl_dstate,
+uint16_t *len)
+{
+struct cxl_event_interrupt_policy *policy;
+struct cxl_event_log *log;
+
+policy = (struct cxl_event_interrupt_policy *)cmd->payload;
+memset(policy, 0, sizeof(*policy));
+
+log = find_event_log(cxl_dstate, CXL_EVENT_TYPE_INFO);
+if (log->irq_enabled) {
+policy->info_settings = CXL_EVENT_INT_SETTING(log->irq_vec);
+}
+
+log = find_event_log(cxl_dstate, CXL_EVENT_TYPE_WARN);
+if (log->irq_enabled) {
+policy->warn_settings = CXL_EVENT_INT_SETTING(log->irq_vec);
+}
+
+log = find_event_log(cxl_dstate, CXL_EVENT_TYPE_FAIL);
+if (log->irq_enabled) {
+policy->failure_settings = CXL_EVENT_INT_SETTING(log->irq_vec);
+}
+
+log = find_event_log(cxl_dstate, CXL_EVENT_TYPE_FATAL);
+if (log->irq_enabled) {
+policy->fatal_settings = CXL_EVENT_INT_SETTING(log->irq_vec);
+}
+
+log = find_event_log(cxl_dstate, CXL_EVENT_TYPE_DYNAMIC_CAP);
+if (log->irq_enabled) {
+/* Dynamic Capacity borrows the same vector as info */
+policy->dyn_cap_settings = CXL_INT_MSI_MSIX;
+}
+
+*len = sizeof(*policy);
+return CXL_MBOX_SUCCESS;
+}
+
+static ret_code cmd_events_set_interrupt_policy(struct cxl_cmd *cmd,
+CXLDeviceState *cxl_dstate,
+uint16_t *len)
+{
+struct cxl_event_interrupt_policy *policy;
+struct cxl_event_log *log;
+
+policy = (struct cxl_event_interrupt_policy *)cmd->payload;
+
+log = find_event_log(cxl_dstate, CXL_EVENT_TYPE_INFO);
+if ((policy->info_settings & CXL_EVENT_INT_MODE_MASK) ==
+CXL_INT_MSI_MSIX) {
+log->irq_enabled = true;
+log->irq_vec = cxl_dstate->event_vector[CXL_EVENT_TYPE_INFO];
+} else {
+log->irq_enabled = false;
+log->irq_vec = 0;
+}
+
+log = find_event_log(cxl_dstate, CXL_EVENT_TYPE_WARN);
+if ((policy->warn_settings & CXL_EVENT_INT_MODE_MASK) ==
+CXL_INT_MSI_MSIX) {
+log->irq_enabled = true;
+log->irq_vec = cxl_dstate->event_vector[CXL_EVENT_TYPE_WARN];
+} else {
+log->irq_enabled = false;
+log->irq_vec = 0;
+}
+
+log = find_event_log(cxl_dstate, CXL_EVENT_TYPE_FAIL);
+if ((policy->failure_settings & CXL_EVENT_INT_MODE_MASK) ==
+CXL_INT_MSI_MSIX) {
+log->irq_enabled = true;
+log->irq_vec = cxl_dstate->event_vector[CXL_EVENT_TYPE_FAIL];
+} else {
+log->irq_enabled = false;
+log->irq_vec = 0;
+}
+
+log = find_event_log(cxl_dstate, 

[RFC PATCH 1/6] qemu/bswap: Add const_le64()

2022-10-10 Thread ira . weiny
From: Ira Weiny 

Gcc requires constant versions of cpu_to_le* calls.

Add a 64 bit version.

Signed-off-by: Ira Weiny 
---
 include/qemu/bswap.h | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/include/qemu/bswap.h b/include/qemu/bswap.h
index 346d05f2aab3..08e607821102 100644
--- a/include/qemu/bswap.h
+++ b/include/qemu/bswap.h
@@ -192,10 +192,20 @@ CPU_CONVERT(le, 64, uint64_t)
      (((_x) & 0x0000ff00U) <<  8) |              \
      (((_x) & 0x00ff0000U) >>  8) |              \
      (((_x) & 0xff000000U) >> 24))
+# define const_le64(_x)                          \
+    ((((_x) & 0x00000000000000ffU) << 56) |      \
+     (((_x) & 0x000000000000ff00U) << 40) |      \
+     (((_x) & 0x0000000000ff0000U) << 24) |      \
+     (((_x) & 0x00000000ff000000U) <<  8) |      \
+     (((_x) & 0x000000ff00000000U) >>  8) |      \
+     (((_x) & 0x0000ff0000000000U) >> 24) |      \
+     (((_x) & 0x00ff000000000000U) >> 40) |      \
+     (((_x) & 0xff00000000000000U) >> 56))
 # define const_le16(_x)                          \
     ((((_x) & 0x00ff) << 8) |                    \
      (((_x) & 0xff00) >> 8))
 #else
+# define const_le64(_x) (_x)
 # define const_le32(_x) (_x)
 # define const_le16(_x) (_x)
 #endif
-- 
2.37.2
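
For readers who want to see why a constant version is needed, here is a minimal standalone sketch. It assumes the big-endian branch of the macro (on a little-endian host const_le64() is the identity); the `EXAMPLE_`/`example_` names are hypothetical and not part of QEMU.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the big-endian branch of const_le64(): a pure
 * constant expression, so it can appear in static initializers.
 */
#define EXAMPLE_CONST_BSWAP64(x)                  \
    ((((x) & 0x00000000000000ffULL) << 56) |      \
     (((x) & 0x000000000000ff00ULL) << 40) |      \
     (((x) & 0x0000000000ff0000ULL) << 24) |      \
     (((x) & 0x00000000ff000000ULL) <<  8) |      \
     (((x) & 0x000000ff00000000ULL) >>  8) |      \
     (((x) & 0x0000ff0000000000ULL) >> 24) |      \
     (((x) & 0x00ff000000000000ULL) >> 40) |      \
     (((x) & 0xff00000000000000ULL) >> 56))

/* The swap is evaluated by the compiler, so it can be checked at build time. */
static_assert(EXAMPLE_CONST_BSWAP64(0x2000ULL) == 0x0020000000000000ULL,
              "byte swap of 0x2000");

/* A file-scope initializer must be a constant expression; a function call is not. */
static const uint64_t example_phys_addr = EXAMPLE_CONST_BSWAP64(0x2000ULL);

static inline uint64_t example_get_phys_addr(void)
{
    return example_phys_addr;
}

/* Runtime reference swap, one byte per iteration, to cross-check the macro. */
static inline uint64_t example_bswap64_loop(uint64_t v)
{
    uint64_t r = 0;

    for (int i = 0; i < 8; i++) {
        r = (r << 8) | (v & 0xff);
        v >>= 8;
    }
    return r;
}
```

The static initializers in the event records (for example `.phys_addr = const_le64(0x2000)`) are the kind of use that a function-style cpu_to_le64() call cannot serve.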




[RFC PATCH 5/6] hw/cxl/cxl-events: Add event interrupt support

2022-10-10 Thread ira . weiny
From: Ira Weiny 

To facilitate testing of event interrupt support, add an HMP command
that resets the event logs and issues interrupts for the logs on which
the guest has enabled them.

Signed-off-by: Ira Weiny 
---
 hmp-commands.hx | 14 +++
 hw/cxl/cxl-events.c | 82 +
 hw/cxl/cxl-host-stubs.c |  5 +++
 hw/mem/cxl_type3.c  |  7 +++-
 include/hw/cxl/cxl_device.h |  3 ++
 include/sysemu/sysemu.h |  3 ++
 6 files changed, 113 insertions(+), 1 deletion(-)

diff --git a/hmp-commands.hx b/hmp-commands.hx
index 564f1de364df..c59a98097317 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -1266,6 +1266,20 @@ SRST
   Inject PCIe AER error
 ERST
 
+{
+.name   = "cxl_event_inject",
+.args_type  = "id:s",
+.params = "id ",
+.help   = "inject cxl events and interrupt\n\t\t\t"
+  " = qdev device id\n\t\t\t",
+.cmd= hmp_cxl_event_inject,
+},
+
+SRST
+``cxl_event_inject``
+  Inject CXL Events
+ERST
+
 {
 .name   = "netdev_add",
 .args_type  = "netdev:O",
diff --git a/hw/cxl/cxl-events.c b/hw/cxl/cxl-events.c
index c275280bcb64..6ece6f252462 100644
--- a/hw/cxl/cxl-events.c
+++ b/hw/cxl/cxl-events.c
@@ -10,8 +10,14 @@
 #include 
 
 #include "qemu/osdep.h"
+#include "sysemu/sysemu.h"
+#include "monitor/monitor.h"
 #include "qemu/bswap.h"
 #include "qemu/typedefs.h"
+#include "qapi/qmp/qdict.h"
+#include "hw/pci/pci.h"
+#include "hw/pci/msi.h"
+#include "hw/pci/msix.h"
 #include "hw/cxl/cxl.h"
 #include "hw/cxl/cxl_events.h"
 
@@ -68,6 +74,11 @@ uint16_t log_overflow(struct cxl_event_log *log)
 return cnt;
 }
 
+static void reset_log(struct cxl_event_log *log)
+{
+log->cur_event = 0;
+}
+
 #define CXL_EVENT_RECORD_FLAG_PERMANENT BIT(2)
 #define CXL_EVENT_RECORD_FLAG_MAINT_NEEDED  BIT(3)
 #define CXL_EVENT_RECORD_FLAG_PERF_DEGRADED BIT(4)
@@ -246,3 +257,74 @@ void cxl_mock_add_event_logs(CXLDeviceState *cxlds)
 event_store_add_event(cxlds, CXL_EVENT_TYPE_FATAL,
   (struct cxl_event_record_raw *));
 }
+
+static void cxl_reset_all_logs(CXLDeviceState *cxlds)
+{
+int i;
+
+for (i = 0; i < CXL_EVENT_TYPE_MAX; i++) {
+struct cxl_event_log *log = find_event_log(cxlds, i);
+
+if (!log) {
+continue;
+}
+
+reset_log(log);
+}
+}
+
+static void cxl_event_irq_assert(PCIDevice *pdev)
+{
+CXLType3Dev *ct3d = container_of(pdev, struct CXLType3Dev, parent_obj);
+CXLDeviceState *cxlds = &ct3d->cxl_dstate;
+int i;
+
+for (i = 0; i < CXL_EVENT_TYPE_MAX; i++) {
+struct cxl_event_log *log;
+
+log = find_event_log(cxlds, i);
+if (!log || !log->irq_enabled || log_empty(log)) {
+continue;
+}
+
+/* Notifies interrupt, legacy IRQ is not supported */
+if (msix_enabled(pdev)) {
+msix_notify(pdev, log->irq_vec);
+} else if (msi_enabled(pdev)) {
+msi_notify(pdev, log->irq_vec);
+}
+}
+}
+
+static int do_cxl_event_inject(Monitor *mon, const QDict *qdict)
+{
+const char *id = qdict_get_str(qdict, "id");
+CXLType3Dev *ct3d;
+PCIDevice *pdev;
+int ret;
+
+ret = pci_qdev_find_device(id, &pdev);
+if (ret < 0) {
+monitor_printf(mon,
+   "id or cxl device path is invalid or device not "
+   "found. %s\n", id);
+return ret;
+}
+
+ct3d = container_of(pdev, struct CXLType3Dev, parent_obj);
+cxl_reset_all_logs(&ct3d->cxl_dstate);
+
+cxl_event_irq_assert(pdev);
+return 0;
+}
+
+void hmp_cxl_event_inject(Monitor *mon, const QDict *qdict)
+{
+const char *id = qdict_get_str(qdict, "id");
+
+if (do_cxl_event_inject(mon, qdict) < 0) {
+return;
+}
+
+monitor_printf(mon, "OK id: %s\n", id);
+}
diff --git a/hw/cxl/cxl-host-stubs.c b/hw/cxl/cxl-host-stubs.c
index cae4afcdde26..61039263f25a 100644
--- a/hw/cxl/cxl-host-stubs.c
+++ b/hw/cxl/cxl-host-stubs.c
@@ -12,4 +12,9 @@ void cxl_fmws_link_targets(CXLState *stat, Error **errp) {};
 void cxl_machine_init(Object *obj, CXLState *state) {};
 void cxl_hook_up_pxb_registers(PCIBus *bus, CXLState *state, Error **errp) {};
 
+void hmp_cxl_event_inject(Monitor *mon, const QDict *qdict)
+{
+monitor_printf(mon, "CXL devices not supported\n");
+}
+
 const MemoryRegionOps cfmws_ops;
diff --git a/hw/mem/cxl_type3.c b/hw/mem/cxl_type3.c
index 2b13179d116d..b4a90136d190 100644
--- a/hw/mem/cxl_type3.c
+++ b/hw/mem/cxl_type3.c
@@ -459,7 +459,7 @@ static void ct3_realize(PCIDevice *pci_dev, Error **errp)
 ComponentRegisters *regs = &cxl_cstate->crb;
 MemoryRegion *mr = &regs->component_registers;
 uint8_t *pci_conf = pci_dev->config;
-unsigned short msix_num = 3;
+unsigned short msix_num = 7;
 int i;
 
 if (!cxl_setup_memory(ct3d, errp)) {
@@ -502,6 +502,11 @@ static void ct3_realize(PCIDevice *pci_dev, Error 

Re: [PATCH v2 05/12] target/ppc: Move VPRTYB[WDQ] to decodetree and use gvec

2022-10-10 Thread Richard Henderson

On 10/10/22 12:13, Lucas Mateus Castro(alqotel) wrote:

From: "Lucas Mateus Castro (alqotel)" 

Moved VPRTYBW and VPRTYBD to use gvec, and moved both of them, along
with VPRTYBQ, to decodetree. VPRTYBW and VPRTYBD now also use .fni4 and
.fni8, respectively.

vprtybw:
rept    loop    master      patch
8       12500   0,00991200  0,00626300 (-36.8%)
25      4000    0,01040600  0,00550600 (-47.1%)
100     1000    0,01084500  0,00601100 (-44.6%)
500     200     0,01490600  0,01394100 (-6.5%)
2500    40      0,03285100  0,05143000 (+56.6%)
8000    12      0,08971500  0,14662500 (+63.4%)

vprtybd:
rept    loop    master      patch
8       12500   0,00665800  0,00652800 (-2.0%)
25      4000    0,00589300  0,00670400 (+13.8%)
100     1000    0,00646800  0,00743900 (+15.0%)
500     200     0,01065800  0,01586400 (+48.8%)
2500    40      0,03497000  0,07180100 (+105.3%)
8000    12      0,09242200  0,21566600 (+133.3%)

vprtybq:
rept    loop    master      patch
8       12500   0,00656200  0,00665800 (+1.5%)
25      4000    0,00620500  0,00644900 (+3.9%)
100     1000    0,00707500  0,00764900 (+8.1%)
500     200     0,01203500  0,01349500 (+12.1%)
2500    40      0,03505700  0,04123100 (+17.6%)
8000    12      0,09590600  0,11586700 (+20.8%)

I wasn't expecting such a performance loss in both VPRTYBD and VPRTYBQ,
so I'm not sure it's worth moving those instructions. Comparing the
assembly of the helper with the TCG ops, they are pretty similar, so
I'm not sure why vprtybd took so much more time.
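
As a reference for what is being timed, the per-element computation of the removed helpers can be sketched in standalone C. The folding steps are the ones in helper_vprtybw and helper_vprtybd from this patch; the `example_` names and the byte-by-byte reference function are only illustrative.

```c
#include <assert.h>
#include <stdint.h>

/* Word element: XOR of the least-significant bit of each of the 4 bytes. */
static inline uint32_t example_prtyb_w(uint32_t b)
{
    uint32_t res = b ^ (b >> 16);   /* fold the upper half onto the lower half */

    res ^= res >> 8;                /* fold byte 1 onto byte 0 */
    return res & 1;
}

/* Doubleword element: same folding with one extra step for 8 bytes. */
static inline uint64_t example_prtyb_d(uint64_t b)
{
    uint64_t res = b ^ (b >> 32);

    res ^= res >> 16;
    res ^= res >> 8;
    return res & 1;
}

/* Reference: XOR bit 0 of every byte, one byte at a time. */
static inline uint64_t example_prtyb_ref(uint64_t b, int nbytes)
{
    uint64_t p = 0;

    assert(nbytes >= 0 && nbytes <= 8);
    for (int i = 0; i < nbytes; i++) {
        p ^= (b >> (8 * i)) & 1;
    }
    return p;
}
```

Only bit 0 of each byte contributes to the result; the folding merely brings those bits together in bit 0 before the final mask.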

Signed-off-by: Lucas Mateus Castro (alqotel) 
---
  target/ppc/helper.h |  4 +-
  target/ppc/insn32.decode|  4 ++
  target/ppc/int_helper.c | 25 +
  target/ppc/translate/vmx-impl.c.inc | 80 +++--
  target/ppc/translate/vmx-ops.c.inc  |  3 --
  5 files changed, 83 insertions(+), 33 deletions(-)

diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index b2e910b089..a06193bc67 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -193,9 +193,7 @@ DEF_HELPER_FLAGS_3(vslo, TCG_CALL_NO_RWG, void, avr, avr, 
avr)
  DEF_HELPER_FLAGS_3(vsro, TCG_CALL_NO_RWG, void, avr, avr, avr)
  DEF_HELPER_FLAGS_3(vsrv, TCG_CALL_NO_RWG, void, avr, avr, avr)
  DEF_HELPER_FLAGS_3(vslv, TCG_CALL_NO_RWG, void, avr, avr, avr)
-DEF_HELPER_FLAGS_2(vprtybw, TCG_CALL_NO_RWG, void, avr, avr)
-DEF_HELPER_FLAGS_2(vprtybd, TCG_CALL_NO_RWG, void, avr, avr)
-DEF_HELPER_FLAGS_2(vprtybq, TCG_CALL_NO_RWG, void, avr, avr)
+DEF_HELPER_FLAGS_3(VPRTYBQ, TCG_CALL_NO_RWG, void, avr, avr, i32)
  DEF_HELPER_FLAGS_5(vaddsbs, TCG_CALL_NO_RWG, void, avr, avr, avr, avr, i32)
  DEF_HELPER_FLAGS_5(vaddshs, TCG_CALL_NO_RWG, void, avr, avr, avr, avr, i32)
  DEF_HELPER_FLAGS_5(vaddsws, TCG_CALL_NO_RWG, void, avr, avr, avr, avr, i32)
diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index 2658dd3395..aa4968e6b9 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -529,6 +529,10 @@ VCTZDM          000100 ..... ..... ..... 11111000100    @VX
  VPDEPD          000100 ..... ..... ..... 10111001101    @VX
  VPEXTD          000100 ..... ..... ..... 10110001101    @VX
  
+VPRTYBD         000100 ..... 01001 ..... 11000000010    @VX_tb
+VPRTYBQ         000100 ..... 01010 ..... 11000000010    @VX_tb
+VPRTYBW         000100 ..... 01000 ..... 11000000010    @VX_tb
+
  ## Vector Permute and Formatting Instruction
  
  VEXTDUBVLX      000100 ..... ..... ..... ..... 011000   @VA

diff --git a/target/ppc/int_helper.c b/target/ppc/int_helper.c
index c7fd0d1faa..c6ce4665fa 100644
--- a/target/ppc/int_helper.c
+++ b/target/ppc/int_helper.c
@@ -492,31 +492,8 @@ static inline void set_vscr_sat(CPUPPCState *env)
  env->vscr_sat.u32[0] = 1;
  }
  
-/* vprtybw */

-void helper_vprtybw(ppc_avr_t *r, ppc_avr_t *b)
-{
-int i;
-for (i = 0; i < ARRAY_SIZE(r->u32); i++) {
-uint64_t res = b->u32[i] ^ (b->u32[i] >> 16);
-res ^= res >> 8;
-r->u32[i] = res & 1;
-}
-}
-
-/* vprtybd */
-void helper_vprtybd(ppc_avr_t *r, ppc_avr_t *b)
-{
-int i;
-for (i = 0; i < ARRAY_SIZE(r->u64); i++) {
-uint64_t res = b->u64[i] ^ (b->u64[i] >> 32);
-res ^= res >> 16;
-res ^= res >> 8;
-r->u64[i] = res & 1;
-}
-}
-
  /* vprtybq */
-void helper_vprtybq(ppc_avr_t *r, ppc_avr_t *b)
+void helper_VPRTYBQ(ppc_avr_t *r, ppc_avr_t *b, uint32_t v)
  {
  uint64_t res = b->u64[0] ^ b->u64[1];
  res ^= res >> 32;
diff --git a/target/ppc/translate/vmx-impl.c.inc 
b/target/ppc/translate/vmx-impl.c.inc
index b9a9e83ab3..23601942bc 100644
--- a/target/ppc/translate/vmx-impl.c.inc
+++ b/target/ppc/translate/vmx-impl.c.inc
@@ -1659,9 +1659,83 @@ GEN_VXFORM_NOA_ENV(vrfim, 5, 11);
  GEN_VXFORM_NOA_ENV(vrfin, 5, 8);
  GEN_VXFORM_NOA_ENV(vrfip, 5, 10);
  GEN_VXFORM_NOA_ENV(vrfiz, 5, 9);
-GEN_VXFORM_NOA(vprtybw, 1, 24);

Re: [PATCH 2/2] hw/cxl: Allow CXL type-3 devices to be persistent or volatile

2022-10-10 Thread Gregory Price


Hang tight, I'm whipping up a multi-region patch that will support a
vmem and pmem region and such.  Finally got oriented enough to figure
out the DPA decoding a bit.  I will probably need some help validating
the decoder logic and the CDAT table logic.

I will integrate the suggestions below into that patch set.

Jonathan, I'm building on top of your gitlab branch and will make a
branch available for review when done.

On Mon, Oct 10, 2022 at 12:36:54PM -0700, Davidlohr Bueso wrote:
> On Mon, 10 Oct 2022, Davidlohr Bueso wrote:
> 
> > This hides requirement details as to the necessary changes that are needed
> > for volatile support - for example, build_dvsecs(). Imo using two backends
> > (without breaking current configs, of course) should be the initial
> > version, not something to leave pending.
> 
> Minimally this is along the lines I was thinking of. I rebased some of my
> original patches on top of yours. It builds and passes
> tests/qtest/cxl-test, but certainly untested otherwise. The original code
> did show the volatile support as per cxl-list.
> 
> As such users can still use memdev which will map to the pmemdev. One thing
> which I had not explored was the lsa + vmem thing, so the below prevents
> this for the time being, fyi.
> 
> Thanks,
> Davidlohr
> 
> 8<
> 
> diff --git a/hw/mem/cxl_type3.c b/hw/mem/cxl_type3.c
> index e8341a818467..cd079dbddd9a 100644
> --- a/hw/mem/cxl_type3.c
> +++ b/hw/mem/cxl_type3.c
> @@ -18,14 +18,21 @@ static void build_dvsecs(CXLType3Dev *ct3d)
>  {
>  CXLComponentState *cxl_cstate = &ct3d->cxl_cstate;
>  uint8_t *dvsec;
> +uint64_t size = 0;
> +
> +if (ct3d->hostvmem) {
> +size += ct3d->hostvmem->size;
> +}
> +if (ct3d->hostpmem) {
> +size += ct3d->hostpmem->size;
> +}
> 
>  dvsec = (uint8_t *)&(CXLDVSECDevice){
> -.cap = 0x1e,
> +.cap = 0x1e, /* one HDM range */
>.ctrl = 0x2,
>.status2 = 0x2,
> -.range1_size_hi = ct3d->hostmem->size >> 32,
> -.range1_size_lo = (2 << 5) | (2 << 2) | 0x3 |
> -(ct3d->hostmem->size & 0xF0000000),
> +.range1_size_hi = size >> 32,
> +.range1_size_lo = (2 << 5) | (2 << 2) | 0x3 | (size & 0xF0000000),
>.range1_base_hi = 0,
>.range1_base_lo = 0,
>  };
> @@ -98,70 +105,60 @@ static void ct3d_reg_write(void *opaque, hwaddr offset, 
> uint64_t value,
>  static bool cxl_setup_memory(CXLType3Dev *ct3d, Error **errp)
>  {
>  DeviceState *ds = DEVICE(ct3d);
> -MemoryRegion *mr;
>  char *name;
> -bool is_pmem = false;
> 
> -/*
> - * FIXME: For now we only allow a single host memory region.
> - * Handle the deprecated memdev property usage cases
> - */
> -if (!ct3d->hostmem && !ct3d->host_vmem && !ct3d->host_pmem) {
> +if (!ct3d->hostvmem && !ct3d->hostpmem) {
>error_setg(errp, "at least one memdev property must be set");
>return false;
> -} else if (ct3d->hostmem && (ct3d->host_vmem || ct3d->host_pmem)) {
> -error_setg(errp, "deprecated [memdev] cannot be used with new "
> - "persistent and volatile memdev properties");
> -return false;
> -} else if (ct3d->hostmem) {
> -warn_report("memdev is deprecated and defaults to pmem. "
> -"Use (persistent|volatile)-memdev instead.");
> -is_pmem = true;
> -} else {
> -if (ct3d->host_vmem && ct3d->host_pmem) {
> -error_setg(errp, "Multiple memory devices not supported yet");
> -return false;
> -}
> -is_pmem = !!ct3d->host_pmem;
> -ct3d->hostmem = ct3d->host_pmem ? ct3d->host_pmem : ct3d->host_vmem;
>  }
> 
> -/*
> - * for now, since there is only one memdev, we can set the type
> - * based on whether this was a ram region or file region
> - */
> -mr = host_memory_backend_get_memory(ct3d->hostmem);
> -if (!mr) {
> -error_setg(errp, "memdev property must be set");
> +/* TODO: volatile devices may have LSA */
> +if (ct3d->hostvmem && ct3d->lsa) {
> +error_setg(errp, "lsa property must be set");
>return false;
>  }
> 
> -/*
> - * FIXME: This code should eventually enumerate each memory region and
> - * report vmem and pmem capacity separate, but for now just set to one
> - */
> -memory_region_set_nonvolatile(mr, is_pmem);
> -memory_region_set_enabled(mr, true);
> -host_memory_backend_set_mapped(ct3d->hostmem, true);
> -
>  if (ds->id) {
>name = g_strdup_printf("cxl-type3-dpa-space:%s", ds->id);
>  } else {
>name = g_strdup("cxl-type3-dpa-space");
>  }
> -address_space_init(&ct3d->hostmem_as, mr, name);
> -g_free(name);
> 
> -/* FIXME: When multiple regions are supported, this needs to aggregate */
> -ct3d->cxl_dstate.mem_size = ct3d->hostmem->size;
> -

[PATCH v2 12/12] target/ppc: Use gvec to decode XVTSTDC[DS]P

2022-10-10 Thread Lucas Mateus Castro(alqotel)
From: "Lucas Mateus Castro (alqotel)" 

Used gvec to translate XVTSTDCSP and XVTSTDCDP.

xvtstdcsp:
rept    loop    imm     prev version    current version
25      4000    0       0,047550        0,040820 (-14.2%)
25      4000    1       0,069520        0,053520 (-23.0%)
25      4000    3       0,078660        0,058470 (-25.7%)
25      4000    51      0,099280        0,190100 (+91.5%)
25      4000    127     0,129690        0,201750 (+55.6%)
8000    12      0       0,554625        0,391385 (-29.4%)
8000    12      1       2,675635        1,423656 (-46.8%)
8000    12      3       3,186823        1,756885 (-44.9%)
8000    12      51      4,284417        1,363698 (-68.2%)
8000    12      127     5,638000        1,305333 (-76.8%)

xvtstdcdp:
rept    loop    imm     prev version    current version
25      4000    0       0,047450        0,040590 (-14.5%)
25      4000    1       0,074130        0,053570 (-27.7%)
25      4000    3       0,084180        0,063020 (-25.1%)
25      4000    51      0,103340        0,127980 (+23.8%)
25      4000    127     0,134670        0,128660 (-4.5%)
8000    12      0       0,522427        0,391510 (-25.1%)
8000    12      1       2,884708        1,426802 (-50.5%)
8000    12      3       3,427625        1,972115 (-42.5%)
8000    12      51      4,450260        1,251865 (-71.9%)
8000    12      127     5,854479        1,250719 (-78.6%)

Overall, these instructions are the hardest ones to measure, as the
performance of the gvec implementation depends on the immediate. The
tables above cover 5 immediate scenarios and 2 rept/loop combinations.
The immediate scenarios are: all bits 0 (so the target register is
simply set to 0), 1 bit set, 2 bits set in a combination the new
implementation can handle with gvec, 4 bits set in a combination it
cannot handle with gvec, and all bits set. The rept/loop scenarios are
high loop with low rept (more time spent executing the code than
translating it) and high rept with low loop (more time spent translating
the code than executing it).
There was a gain in translation time, and in execution time for the
immediates the new implementation is configured to accept, but a loss in
execution time for more esoteric immediates.

Signed-off-by: Lucas Mateus Castro (alqotel) 
---
 target/ppc/fpu_helper.c |   7 +-
 target/ppc/helper.h |   4 +-
 target/ppc/translate/vsx-impl.c.inc | 188 ++--
 3 files changed, 184 insertions(+), 15 deletions(-)

diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c
index a66e16c212..6c94576575 100644
--- a/target/ppc/fpu_helper.c
+++ b/target/ppc/fpu_helper.c
@@ -22,6 +22,7 @@
 #include "exec/exec-all.h"
 #include "internal.h"
 #include "fpu/softfloat.h"
+#include "tcg/tcg-gvec-desc.h"
 
 static inline float128 float128_snan_to_qnan(float128 x)
 {
@@ -3263,17 +3264,19 @@ VSX_TSTDC(float64)
 VSX_TSTDC(float128)
 #undef VSX_TSTDC
 
-void helper_XVTSTDCDP(ppc_vsr_t *t, ppc_vsr_t *b, uint64_t dcmx, uint32_t v)
+void helper_XVTSTDCDP(ppc_vsr_t *t, ppc_vsr_t *b, uint32_t dcmx)
 {
 int i;
+dcmx = simd_data(dcmx);
 for (i = 0; i < 2; i++) {
 t->s64[i] = (int64_t)-float64_tstdc(b->f64[i], dcmx);
 }
 }
 
-void helper_XVTSTDCSP(ppc_vsr_t *t, ppc_vsr_t *b, uint64_t dcmx, uint32_t v)
+void helper_XVTSTDCSP(ppc_vsr_t *t, ppc_vsr_t *b, uint32_t dcmx)
 {
 int i;
+dcmx = simd_data(dcmx);
 for (i = 0; i < 4; i++) {
 t->s32[i] = (int32_t)-float32_tstdc(b->f32[i], dcmx);
 }
diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index 8344fe39c6..2851418acc 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -517,8 +517,8 @@ DEF_HELPER_3(xvcvsxdsp, void, env, vsr, vsr)
 DEF_HELPER_3(xvcvuxdsp, void, env, vsr, vsr)
 DEF_HELPER_3(xvcvsxwsp, void, env, vsr, vsr)
 DEF_HELPER_3(xvcvuxwsp, void, env, vsr, vsr)
-DEF_HELPER_FLAGS_4(XVTSTDCSP, TCG_CALL_NO_RWG, void, vsr, vsr, i64, i32)
-DEF_HELPER_FLAGS_4(XVTSTDCDP, TCG_CALL_NO_RWG, void, vsr, vsr, i64, i32)
+DEF_HELPER_FLAGS_3(XVTSTDCSP, TCG_CALL_NO_RWG, void, vsr, vsr, i32)
+DEF_HELPER_FLAGS_3(XVTSTDCDP, TCG_CALL_NO_RWG, void, vsr, vsr, i32)
 DEF_HELPER_3(xvrspi, void, env, vsr, vsr)
 DEF_HELPER_3(xvrspic, void, env, vsr, vsr)
 DEF_HELPER_3(xvrspim, void, env, vsr, vsr)
diff --git a/target/ppc/translate/vsx-impl.c.inc 
b/target/ppc/translate/vsx-impl.c.inc
index 4fdbc45ff4..26fc8c0b01 100644
--- a/target/ppc/translate/vsx-impl.c.inc
+++ b/target/ppc/translate/vsx-impl.c.inc
@@ -632,6 +632,8 @@ static void gen_mtvsrws(DisasContext *ctx)
 #define SGN_MASK_SP 0x8000000080000000ull
 #define EXP_MASK_DP  0x7FF0000000000000ull
 #define EXP_MASK_SP 0x7F8000007F800000ull
+#define FRC_MASK_DP (~(SGN_MASK_DP | EXP_MASK_DP))
+#define FRC_MASK_SP (~(SGN_MASK_SP | EXP_MASK_SP))
 
 

Re: [PATCH 2/2] hw/cxl: Allow CXL type-3 devices to be persistent or volatile

2022-10-10 Thread Davidlohr Bueso

On Mon, 10 Oct 2022, Davidlohr Bueso wrote:


This hides requirement details as to the necessary changes that are needed for
volatile support - for example, build_dvsecs(). Imo using two backends (without
breaking current configs, of course) should be the initial version, not
something to leave pending.


Minimally this is along the lines I was thinking of. I rebased some of my
original patches on top of yours. It builds and passes tests/qtest/cxl-test,
but certainly untested otherwise. The original code did show the volatile
support as per cxl-list.

As such users can still use memdev which will map to the pmemdev. One thing
which I had not explored was the lsa + vmem thing, so the below prevents this
for the time being, fyi.

Thanks,
Davidlohr
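
For orientation, the way build_dvsecs() in the patch below folds both backends into a single range size can be sketched in standalone C. The flag value and the 0xF0000000 mask are copied from the diff as I read it; the `ex_`/`EX_` names and the decode helper are hypothetical, and the sketch says nothing about what the individual flag bits mean.

```c
#include <assert.h>
#include <stdint.h>

/* Flag bits and size mask as they appear in the diff; names are illustrative. */
#define EX_RANGE_FLAGS        ((2u << 5) | (2u << 2) | 0x3u)
#define EX_RANGE_SIZE_LO_MASK 0xF0000000u

static_assert((EX_RANGE_FLAGS & EX_RANGE_SIZE_LO_MASK) == 0,
              "flag bits must not overlap the size bits");

struct ex_range {
    uint32_t size_hi;
    uint32_t size_lo;
};

/* Sum the volatile and persistent sizes and split the total into hi/lo. */
static inline struct ex_range ex_encode_range1(uint64_t vmem_size,
                                               uint64_t pmem_size)
{
    uint64_t size = vmem_size + pmem_size;
    struct ex_range r;

    r.size_hi = (uint32_t)(size >> 32);
    r.size_lo = EX_RANGE_FLAGS | (uint32_t)(size & EX_RANGE_SIZE_LO_MASK);
    return r;
}

/* Recover the size; anything below 256 MiB granularity was dropped by the mask. */
static inline uint64_t ex_decode_range1_size(struct ex_range r)
{
    return ((uint64_t)r.size_hi << 32) | (r.size_lo & EX_RANGE_SIZE_LO_MASK);
}
```

A total that is not a multiple of 256 MiB loses its low bits in this encoding, which is worth keeping in mind when picking backend sizes for a test setup.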

8<

diff --git a/hw/mem/cxl_type3.c b/hw/mem/cxl_type3.c
index e8341a818467..cd079dbddd9a 100644
--- a/hw/mem/cxl_type3.c
+++ b/hw/mem/cxl_type3.c
@@ -18,14 +18,21 @@ static void build_dvsecs(CXLType3Dev *ct3d)
 {
 CXLComponentState *cxl_cstate = &ct3d->cxl_cstate;
 uint8_t *dvsec;
+uint64_t size = 0;
+
+if (ct3d->hostvmem) {
+size += ct3d->hostvmem->size;
+}
+if (ct3d->hostpmem) {
+size += ct3d->hostpmem->size;
+}

 dvsec = (uint8_t *)&(CXLDVSECDevice){
-.cap = 0x1e,
+.cap = 0x1e, /* one HDM range */
 .ctrl = 0x2,
 .status2 = 0x2,
-.range1_size_hi = ct3d->hostmem->size >> 32,
-.range1_size_lo = (2 << 5) | (2 << 2) | 0x3 |
-(ct3d->hostmem->size & 0xF0000000),
+.range1_size_hi = size >> 32,
+.range1_size_lo = (2 << 5) | (2 << 2) | 0x3 | (size & 0xF0000000),
 .range1_base_hi = 0,
 .range1_base_lo = 0,
 };
@@ -98,70 +105,60 @@ static void ct3d_reg_write(void *opaque, hwaddr offset, 
uint64_t value,
 static bool cxl_setup_memory(CXLType3Dev *ct3d, Error **errp)
 {
 DeviceState *ds = DEVICE(ct3d);
-MemoryRegion *mr;
 char *name;
-bool is_pmem = false;

-/*
- * FIXME: For now we only allow a single host memory region.
- * Handle the deprecated memdev property usage cases
- */
-if (!ct3d->hostmem && !ct3d->host_vmem && !ct3d->host_pmem) {
+if (!ct3d->hostvmem && !ct3d->hostpmem) {
 error_setg(errp, "at least one memdev property must be set");
 return false;
-} else if (ct3d->hostmem && (ct3d->host_vmem || ct3d->host_pmem)) {
-error_setg(errp, "deprecated [memdev] cannot be used with new "
- "persistent and volatile memdev properties");
-return false;
-} else if (ct3d->hostmem) {
-warn_report("memdev is deprecated and defaults to pmem. "
-"Use (persistent|volatile)-memdev instead.");
-is_pmem = true;
-} else {
-if (ct3d->host_vmem && ct3d->host_pmem) {
-error_setg(errp, "Multiple memory devices not supported yet");
-return false;
-}
-is_pmem = !!ct3d->host_pmem;
-ct3d->hostmem = ct3d->host_pmem ? ct3d->host_pmem : ct3d->host_vmem;
 }

-/*
- * for now, since there is only one memdev, we can set the type
- * based on whether this was a ram region or file region
- */
-mr = host_memory_backend_get_memory(ct3d->hostmem);
-if (!mr) {
-error_setg(errp, "memdev property must be set");
+/* TODO: volatile devices may have LSA */
+if (ct3d->hostvmem && ct3d->lsa) {
+error_setg(errp, "lsa property must be set");
 return false;
 }

-/*
- * FIXME: This code should eventually enumerate each memory region and
- * report vmem and pmem capacity separate, but for now just set to one
- */
-memory_region_set_nonvolatile(mr, is_pmem);
-memory_region_set_enabled(mr, true);
-host_memory_backend_set_mapped(ct3d->hostmem, true);
-
 if (ds->id) {
 name = g_strdup_printf("cxl-type3-dpa-space:%s", ds->id);
 } else {
 name = g_strdup("cxl-type3-dpa-space");
 }
-address_space_init(&ct3d->hostmem_as, mr, name);
-g_free(name);

-/* FIXME: When multiple regions are supported, this needs to aggregate */
-ct3d->cxl_dstate.mem_size = ct3d->hostmem->size;
-ct3d->cxl_dstate.vmem_size = is_pmem ? 0 : ct3d->hostmem->size;
-ct3d->cxl_dstate.pmem_size = is_pmem ? ct3d->hostmem->size : 0;
+if (ct3d->hostvmem) {
+MemoryRegion *vmr;

-if (!ct3d->lsa) {
-error_setg(errp, "lsa property must be set");
-return false;
+vmr = host_memory_backend_get_memory(ct3d->hostvmem);
+if (!vmr) {
+error_setg(errp, "volatile-memdev property must be set");
+return false;
+}
+
+memory_region_set_nonvolatile(vmr, false);
+memory_region_set_enabled(vmr, true);
+host_memory_backend_set_mapped(ct3d->hostvmem, true);
+address_space_init(&ct3d->hostvmem_as, vmr, name);
+ct3d->cxl_dstate.vmem_size = 

[PATCH v2 11/12] target/ppc: Moved XSTSTDC[QDS]P to decodetree

2022-10-10 Thread Lucas Mateus Castro(alqotel)
From: "Lucas Mateus Castro (alqotel)" 

Moved XSTSTDCSP, XSTSTDCDP and XSTSTDCQP to decodetree and moved some of
their decoding out of the helper. Previously DCMX, XB and BF were
calculated in the helper with the help of cpu_env; that part now lives
in decodetree with the rest.

xvtstdcsp:
rept    loop    master       patch
8       12500    1,85393600   1,94683600 (+5.0%)
25      4000     1,78779800   1,92479000 (+7.7%)
100     1000     2,12775000   2,28895500 (+7.6%)
500     200      2,99655300   3,23102900 (+7.8%)
2500    40       6,89082200   7,44827500 (+8.1%)
8000    12      17,50585500  18,95152100 (+8.3%)

xvtstdcdp:
rept    loop    master       patch
8       12500    1,39043100   1,33539800 (-4.0%)
25      4000     1,35731800   1,37347800 (+1.2%)
100     1000     1,51514800   1,56053000 (+3.0%)
500     200      2,21014400   2,47906000 (+12.2%)
2500    40       5,39488200   6,68766700 (+24.0%)
8000    12      13,98623900  18,17661900 (+30.0%)

xvtstdcdp:
rept    loop    master       patch
8       12500    1,35123800   1,34455800 (-0.5%)
25      4000     1,36441200   1,36759600 (+0.2%)
100     1000     1,49763500   1,54138400 (+2.9%)
500     200      2,19020200   2,46196400 (+12.4%)
2500    40       5,39265700   6,68147900 (+23.9%)
8000    12      14,04163600  18,19669600 (+29.6%)

As some values are now decoded outside the helper and passed to it as
arguments, the helper takes more arguments and more TCG ops are needed
to load them. I suspect that's the reason for the slow-down in the tests
with a high REPT but low LOOP.
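
For context, the data-class test that the helper performs on each element can be sketched in standalone C for a single-precision bit pattern. The DCMX bit assignment follows the VSX_TEST_DC macro removed by this patch (bit 6 NaN, bits 5/4 +/-infinity, bits 3/2 +/-zero, bits 1/0 +/-denormal); the function name and the raw-bits interface are illustrative only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Return true when the float32 with raw encoding 'bits' belongs to one of
 * the classes selected by the 7-bit DCMX mask.
 */
static inline bool ex_tstdc_f32(uint32_t bits, uint32_t dcmx)
{
    uint32_t sign = bits >> 31;
    uint32_t exp  = (bits >> 23) & 0xff;
    uint32_t frac = bits & 0x007fffff;
    unsigned bit;

    assert(dcmx < 128);

    if (exp == 0xff && frac != 0) {
        bit = 6;                    /* NaN, either sign */
    } else if (exp == 0xff) {
        bit = 4 + !sign;            /* bit 5: +Inf, bit 4: -Inf */
    } else if (exp == 0 && frac == 0) {
        bit = 2 + !sign;            /* bit 3: +0, bit 2: -0 */
    } else if (exp == 0) {
        bit = 0 + !sign;            /* bit 1: +Denormal, bit 0: -Denormal */
    } else {
        return false;               /* normal numbers match no class */
    }
    return (dcmx >> bit) & 1;
}
```

The scalar forms additionally place the sign and the match result in CR and FPCC, which this sketch leaves out.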

Signed-off-by: Lucas Mateus Castro (alqotel) 
---
 target/ppc/fpu_helper.c | 114 +---
 target/ppc/helper.h |   6 +-
 target/ppc/insn32.decode|   6 ++
 target/ppc/translate/vsx-impl.c.inc |  20 -
 target/ppc/translate/vsx-ops.c.inc  |   4 -
 5 files changed, 60 insertions(+), 90 deletions(-)

diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c
index 960a76a8a5..a66e16c212 100644
--- a/target/ppc/fpu_helper.c
+++ b/target/ppc/fpu_helper.c
@@ -3241,63 +3241,6 @@ void helper_XVXSIGSP(ppc_vsr_t *xt, ppc_vsr_t *xb)
 *xt = t;
 }
 
-/*
- * VSX_TEST_DC - VSX floating point test data class
- *   op- instruction mnemonic
- *   nels  - number of elements (1, 2 or 4)
- *   xbn   - VSR register number
- *   tp- type (float32 or float64)
- *   fld   - vsr_t field (VsrD(*) or VsrW(*))
- *   tfld   - target vsr_t field (VsrD(*) or VsrW(*))
- *   fld_max - target field max
- *   scrf - set result in CR and FPCC
- */
-#define VSX_TEST_DC(op, nels, xbn, tp, fld, tfld, fld_max, scrf)  \
-void helper_##op(CPUPPCState *env, uint32_t opcode) \
-{   \
-ppc_vsr_t *xt = &env->vsr[xT(opcode)];  \
-ppc_vsr_t *xb = &env->vsr[xbn]; \
-ppc_vsr_t t = { };  \
-uint32_t i, sign, dcmx; \
-uint32_t cc, match = 0; \
-\
-if (!scrf) {\
-dcmx = DCMX_XV(opcode); \
-} else {\
-t = *xt;\
-dcmx = DCMX(opcode);\
-}   \
-\
-for (i = 0; i < nels; i++) {\
-sign = tp##_is_neg(xb->fld);\
-if (tp##_is_any_nan(xb->fld)) { \
-match = extract32(dcmx, 6, 1);  \
-} else if (tp##_is_infinity(xb->fld)) { \
-match = extract32(dcmx, 4 + !sign, 1);  \
-} else if (tp##_is_zero(xb->fld)) { \
-match = extract32(dcmx, 2 + !sign, 1);  \
-} else if (tp##_is_zero_or_denormal(xb->fld)) { \
-match = extract32(dcmx, 0 + !sign, 1);  \
-}   \
-\
-if (scrf) { \
-cc = sign << CRF_LT_BIT | match << CRF_EQ_BIT;  \
-env->fpscr &= ~FP_FPCC; \
-env->fpscr |= cc << FPSCR_FPCC; \
-env->crf[BF(opcode)] = cc;  \
-} else {\
-t.tfld = match ? fld_max : 0;   \
-}   \
-match = 0;  

Re: [PATCH v2 12/12] target/ppc: Use gvec to decode XVTSTDC[DS]P

2022-10-10 Thread Lucas Mateus Martins Araujo e Castro



On 10/10/2022 16:42, Richard Henderson wrote:
 
On 10/10/22 12:13, Lucas Mateus Castro(alqotel) wrote:

+/* test if +Inf or -Inf */
+static void gen_is_any_inf(unsigned vece, TCGv_vec t, TCGv_vec b)
+{
+    uint64_t exp_msk = (vece == MO_32) ? (uint32_t)EXP_MASK_SP : EXP_MASK_DP;
+    uint64_t sgn_msk = (vece == MO_32) ? (uint32_t)SGN_MASK_SP : SGN_MASK_DP;
+    tcg_gen_andc_vec(vece, b, b, tcg_constant_vec_matching(t, vece, exp_msk));
+    tcg_gen_cmp_vec(TCG_COND_EQ, vece, t, b,
+                    tcg_constant_vec_matching(t, vece, sgn_msk));
+}


Should be clearing sign and comparing exp, not the other way.

Yeah this was a mistake, I'll fix it in the next version.
Kind of weird that my tests didn't catch this; I should probably check
that the '.out' risu file I'm using actually tests every immediate value.
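
For reference, the ordering asked for in the review (clear the sign, then compare against the exponent mask) can be sketched on a single 32-bit lane in plain C. This is only an illustration and not the TCG code from the patch; the mask names SP_SGN_MASK and SP_EXP_MASK and the function sp_is_any_inf are made up for this example and hold the usual IEEE 754 binary32 field masks.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names: IEEE 754 binary32 sign and exponent field masks. */
#define SP_SGN_MASK 0x80000000u
#define SP_EXP_MASK 0x7F800000u

/*
 * True for +Inf or -Inf: drop the sign bit first, then compare what is
 * left with the all-ones exponent (zero mantissa). Doing it the other
 * way round (clearing the exponent, comparing with the sign mask) would
 * instead match some negative denormals and -0.
 */
static bool sp_is_any_inf(uint32_t bits)
{
    return (bits & ~SP_SGN_MASK) == SP_EXP_MASK;
}
```

A NaN keeps a non-zero mantissa after the sign is cleared, so it does not compare equal to the exponent mask and is correctly rejected.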



+    GVecGen2 op = {
+        .fno = (vece == MO_32) ? gen_helper_XVTSTDCSP : gen_helper_XVTSTDCDP,
+        .vece = vece,
+        .opt_opc = vecop_list
     };

     REQUIRE_VSX(ctx);

-    tcg_gen_gvec_2i(vsr_full_offset(a->xt), vsr_full_offset(a->xb),
-                    16, 16, (int32_t)(a->uim), &op[vece - MO_32]);
+    switch (a->uim) {
+    case 0:
+        set_cpu_vsr(a->xt, tcg_constant_i64(0), true);
+        set_cpu_vsr(a->xt, tcg_constant_i64(0), false);
+        break;
+    case ((1 << 0) | (1 << 1)):
+        /* test if +Denormal or -Denormal */
+        op.fniv = gen_is_any_denormal,
+        tcg_gen_gvec_2(vsr_full_offset(a->xt), vsr_full_offset(a->xb), 16, 16,
+                       &op);


This default setting of .fno doesn't work, because the helper requires simd_data set,
which this invocation via tcg_gen_gvec_2 will not provide.

You could fix this by using GVecGen2i and tcg_gen_gvec_2i, and ignoring the immediate
argument added to the functions above.  Which also means...

And I can remove the new #include from int_helper which was bothering me.


+    case (1 << 0):
+        /* test if -Denormal */
+        op.fniv = gen_is_neg_denormal,
+        tcg_gen_gvec_2(vsr_full_offset(a->xt), vsr_full_offset(a->xb), 16, 16,
+                       &op);
+        break;
+    case (1 << 1):
+        /* test if +Denormal */
+        op.fniv = gen_is_pos_denormal,
+        tcg_gen_gvec_2(vsr_full_offset(a->xt), vsr_full_offset(a->xb), 16, 16,
+                       &op);
+        break;
+    case ((1 << 2) | (1 << 3)):
+        /* test if +0 or -0 */
+        op.fniv = gen_is_any_zero,
+        tcg_gen_gvec_2(vsr_full_offset(a->xt), vsr_full_offset(a->xb), 16, 16,
+                       &op);
+        break;
+    case (1 << 2):
+        /* test if -0 */
+        op.fniv = gen_is_neg_zero,
+        tcg_gen_gvec_2(vsr_full_offset(a->xt), vsr_full_offset(a->xb), 16, 16,
+                       &op);
+        break;
+    case (1 << 3):
+        /* test if +0 */
+        op.fniv = gen_is_pos_zero,
+        tcg_gen_gvec_2(vsr_full_offset(a->xt), vsr_full_offset(a->xb), 16, 16,
+                       &op);
+        break;
+    case ((1 << 4) | (1 << 5)):
+        /* test if +Inf or -Inf */
+        op.fniv = gen_is_any_inf,
+        tcg_gen_gvec_2(vsr_full_offset(a->xt), vsr_full_offset(a->xb), 16, 16,
+                       &op);
+        break;
+    case (1 << 4):
+        /* test if -Inf */
+        op.fniv = gen_is_neg_inf,
+        tcg_gen_gvec_2(vsr_full_offset(a->xt), vsr_full_offset(a->xb), 16, 16,
+                       &op);
+        break;
+    case (1 << 5):
+        /* test if +Inf */
+        op.fniv = gen_is_pos_inf,
+        tcg_gen_gvec_2(vsr_full_offset(a->xt), vsr_full_offset(a->xb), 16, 16,
+                       &op);
+        break;
+    case (1 << 6):
+        /* test if NaN */
+        op.fniv = gen_is_nan,
+        tcg_gen_gvec_2(vsr_full_offset(a->xt), vsr_full_offset(a->xb), 16, 16,
+                       &op);
+        break;
+    default:
+        tcg_gen_gvec_2_ool(vsr_full_offset(a->xt), vsr_full_offset(a->xb), 16,
+                           16, (int32_t)(a->uim), op.fno);
+        break;


You can have only the store to op.fniv in the switch, remove the default, and have a
common call to tcg_gen_gvec_2i after the switch.


r~

I'll send a v3 with these changes.
--
Lucas Mateus M. Araujo e Castro
Instituto de Pesquisas ELDORADO

Departamento Computação Embarcada
Analista de Software Junior


[PATCH v2 10/12] target/ppc: Moved XVTSTDC[DS]P to decodetree

2022-10-10 Thread Lucas Mateus Castro(alqotel)
From: "Lucas Mateus Castro (alqotel)" 

Moved XVTSTDCSP and XVTSTDCDP to decodetree and restructured the helper
to be simpler and do all decoding in the decodetree (so XB, XT and DCMX
are all calculated outside the helper).

Obs: The tests in this one are slightly different: each time is the sum
over these instructions with every possible immediate, and those
instructions are repeated 10 times.

xvtstdcsp:
rept    loop    master       patch
8       12500   2,76402100   2,70699100 (-2.1%)
25      4000    2,64867100   2,67884100 (+1.1%)
100     1000    2,73806300   2,78701000 (+1.8%)
500     200     3,44666500   3,61027600 (+4.7%)
2500    40      5,85790200   6,47475500 (+10.5%)
8000    12      15,22102100  17,46062900 (+14.7%)

xvtstdcdp:
rept    loop    master       patch
8       12500   2,11818000   1,61065300 (-24.0%)
25      4000    2,04573400   1,60132200 (-21.7%)
100     1000    2,13834100   1,69988100 (-20.5%)
500     200     2,73977000   2,48631700 (-9.3%)
2500    40      5,05067000   5,25914100 (+4.1%)
8000    12      14,60507800  15,93704900 (+9.1%)
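
The per-element class test that the restructured helper performs can be sketched in portable C as follows. This is a rough model only: it uses fpclassify() and signbit() from <math.h> where the patch uses the softfloat predicates, and the names tstdc_f64 and tstdc_lane_f64 are invented for the example. The DCMX bit layout (bit 6 NaN, bits 5/4 +Inf/-Inf, bits 3/2 +0/-0, bits 1/0 +Denormal/-Denormal) follows the helper in the diff below.

```c
#include <math.h>
#include <stdbool.h>
#include <stdint.h>

/* Returns true when the class of b is selected by the 7-bit dcmx mask. */
static bool tstdc_f64(double b, uint32_t dcmx)
{
    unsigned neg = signbit(b) ? 1 : 0;

    switch (fpclassify(b)) {
    case FP_NAN:
        return (dcmx >> 6) & 1;
    case FP_INFINITE:
        return (dcmx >> (4 + !neg)) & 1;   /* bit 4: -Inf, bit 5: +Inf */
    case FP_ZERO:
        return (dcmx >> (2 + !neg)) & 1;   /* bit 2: -0,   bit 3: +0   */
    case FP_SUBNORMAL:
        return (dcmx >> (0 + !neg)) & 1;   /* bit 0: -Den, bit 1: +Den */
    default:
        return false;                      /* normal numbers never match */
    }
}

/* Vector lanes are written as all ones on a match and all zeros otherwise. */
static int64_t tstdc_lane_f64(double b, uint32_t dcmx)
{
    return -(int64_t)tstdc_f64(b, dcmx);
}
```

For example, tstdc_lane_f64(-INFINITY, 1u << 4) yields -1, while the same mask applied to +INFINITY yields 0.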

Signed-off-by: Lucas Mateus Castro (alqotel) 
Reviewed-by: Richard Henderson 
---
 target/ppc/fpu_helper.c | 39 +++--
 target/ppc/helper.h |  4 +--
 target/ppc/insn32.decode|  5 
 target/ppc/translate/vsx-impl.c.inc | 28 +++--
 target/ppc/translate/vsx-ops.c.inc  |  8 --
 5 files changed, 70 insertions(+), 14 deletions(-)

diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c
index ae25f32d6e..960a76a8a5 100644
--- a/target/ppc/fpu_helper.c
+++ b/target/ppc/fpu_helper.c
@@ -3295,11 +3295,46 @@ void helper_##op(CPUPPCState *env, uint32_t opcode) 
\
 }   \
 }
 
-VSX_TEST_DC(xvtstdcdp, 2, xB(opcode), float64, VsrD(i), VsrD(i), UINT64_MAX, 0)
-VSX_TEST_DC(xvtstdcsp, 4, xB(opcode), float32, VsrW(i), VsrW(i), UINT32_MAX, 0)
 VSX_TEST_DC(xststdcdp, 1, xB(opcode), float64, VsrD(0), VsrD(0), 0, 1)
 VSX_TEST_DC(xststdcqp, 1, (rB(opcode) + 32), float128, f128, VsrD(0), 0, 1)
 
+#define VSX_TSTDC(tp)   \
+static int32_t tp##_tstdc(tp b, uint32_t dcmx)  \
+{   \
+uint32_t match = 0; \
+uint32_t sign = tp##_is_neg(b); \
+if (tp##_is_any_nan(b)) {   \
+match = extract32(dcmx, 6, 1);  \
+} else if (tp##_is_infinity(b)) {   \
+match = extract32(dcmx, 4 + !sign, 1);  \
+} else if (tp##_is_zero(b)) {   \
+match = extract32(dcmx, 2 + !sign, 1);  \
+} else if (tp##_is_zero_or_denormal(b)) {   \
+match = extract32(dcmx, 0 + !sign, 1);  \
+}   \
+return (match != 0);\
+}
+
+VSX_TSTDC(float32)
+VSX_TSTDC(float64)
+#undef VSX_TSTDC
+
+void helper_XVTSTDCDP(ppc_vsr_t *t, ppc_vsr_t *b, uint64_t dcmx, uint32_t v)
+{
+    int i;
+    for (i = 0; i < 2; i++) {
+        t->s64[i] = (int64_t)-float64_tstdc(b->f64[i], dcmx);
+    }
+}
+
+void helper_XVTSTDCSP(ppc_vsr_t *t, ppc_vsr_t *b, uint64_t dcmx, uint32_t v)
+{
+    int i;
+    for (i = 0; i < 4; i++) {
+        t->s32[i] = (int32_t)-float32_tstdc(b->f32[i], dcmx);
+    }
+}
+
 void helper_xststdcsp(CPUPPCState *env, uint32_t opcode, ppc_vsr_t *xb)
 {
 uint32_t dcmx, sign, exp;
diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index fd8280dfa7..9e5d11939b 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -517,8 +517,8 @@ DEF_HELPER_3(xvcvsxdsp, void, env, vsr, vsr)
 DEF_HELPER_3(xvcvuxdsp, void, env, vsr, vsr)
 DEF_HELPER_3(xvcvsxwsp, void, env, vsr, vsr)
 DEF_HELPER_3(xvcvuxwsp, void, env, vsr, vsr)
-DEF_HELPER_2(xvtstdcsp, void, env, i32)
-DEF_HELPER_2(xvtstdcdp, void, env, i32)
+DEF_HELPER_FLAGS_4(XVTSTDCSP, TCG_CALL_NO_RWG, void, vsr, vsr, i64, i32)
+DEF_HELPER_FLAGS_4(XVTSTDCDP, TCG_CALL_NO_RWG, void, vsr, vsr, i64, i32)
 DEF_HELPER_3(xvrspi, void, env, vsr, vsr)
 DEF_HELPER_3(xvrspic, void, env, vsr, vsr)
 DEF_HELPER_3(xvrspim, void, env, vsr, vsr)
diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index 6549c4040e..c0a531be5c 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -199,6 +199,9 @@
 
 @XX2_uim4   .. . . uim:4 . . .. _uim 
xt=%xx_xt xb=%xx_xb
 
+%xx_uim7        6:1 2:1 16:5
+@XX2_uim7   .. . . .  . ... . .._uim 
xt=%xx_xt xb=%xx_xb uim=%xx_uim7
+
 _bf_xb  bf xb
 @XX2_bf_xb  .. bf:3 .. . . . . ._bf_xb 
xb=%xx_xb
 
@@ -848,6 +851,8 @@ XSCVSPDPN   00 . - . 

[PATCH v2 07/12] target/ppc: Move VABSDU[BHW] to decodetree and use gvec

2022-10-10 Thread Lucas Mateus Castro(alqotel)
From: "Lucas Mateus Castro (alqotel)" 

Moved VABSDUB, VABSDUH and VABSDUW to decodetree and use gvec to
translate them.

vabsdub:
rept    loop    master       patch
8       12500   0,03601600   0,00688500 (-80.9%)
25      4000    0,03651000   0,00532100 (-85.4%)
100     1000    0,03666900   0,00595300 (-83.8%)
500     200     0,04305800   0,01244600 (-71.1%)
2500    40      0,06893300   0,04273700 (-38.0%)
8000    12      0,14633200   0,12660300 (-13.5%)

vabsduh:
rept    loop    master       patch
8       12500   0,02172400   0,00687500 (-68.4%)
25      4000    0,02154100   0,00531500 (-75.3%)
100     1000    0,02235400   0,00596300 (-73.3%)
500     200     0,02827500   0,01245100 (-56.0%)
2500    40      0,05638400   0,04285500 (-24.0%)
8000    12      0,13166000   0,12641400 (-4.0%)

vabsduw:
rept    loop    master       patch
8       12500   0,01646400   0,00688300 (-58.2%)
25      4000    0,01454500   0,00475500 (-67.3%)
100     1000    0,01545800   0,00511800 (-66.9%)
500     200     0,02168200   0,01114300 (-48.6%)
2500    40      0,04571300   0,04138800 (-9.5%)
8000    12      0,12209500   0,12178500 (-0.3%)

As with VADDCUW and VSUBCUW, there is an overall performance gain, but the
new version uses more TCGops (4 before the patch, 6 after).

Signed-off-by: Lucas Mateus Castro (alqotel) 
Reviewed-by: Richard Henderson 
---
 target/ppc/helper.h |  6 ++--
 target/ppc/insn32.decode|  6 
 target/ppc/int_helper.c | 13 +++-
 target/ppc/translate/vmx-impl.c.inc | 49 +++--
 target/ppc/translate/vmx-ops.c.inc  |  3 --
 5 files changed, 60 insertions(+), 17 deletions(-)

diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index 71c22efc2e..fd8280dfa7 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -146,9 +146,9 @@ DEF_HELPER_FLAGS_1(ftsqrt, TCG_CALL_NO_RWG_SE, i32, i64)
 DEF_HELPER_FLAGS_4(VAVGUB, TCG_CALL_NO_RWG, void, avr, avr, avr, i32)
 DEF_HELPER_FLAGS_4(VAVGUH, TCG_CALL_NO_RWG, void, avr, avr, avr, i32)
 DEF_HELPER_FLAGS_4(VAVGUW, TCG_CALL_NO_RWG, void, avr, avr, avr, i32)
-DEF_HELPER_FLAGS_3(vabsdub, TCG_CALL_NO_RWG, void, avr, avr, avr)
-DEF_HELPER_FLAGS_3(vabsduh, TCG_CALL_NO_RWG, void, avr, avr, avr)
-DEF_HELPER_FLAGS_3(vabsduw, TCG_CALL_NO_RWG, void, avr, avr, avr)
+DEF_HELPER_FLAGS_4(VABSDUB, TCG_CALL_NO_RWG, void, avr, avr, avr, i32)
+DEF_HELPER_FLAGS_4(VABSDUH, TCG_CALL_NO_RWG, void, avr, avr, avr, i32)
+DEF_HELPER_FLAGS_4(VABSDUW, TCG_CALL_NO_RWG, void, avr, avr, avr, i32)
 DEF_HELPER_FLAGS_4(VAVGSB, TCG_CALL_NO_RWG, void, avr, avr, avr, i32)
 DEF_HELPER_FLAGS_4(VAVGSH, TCG_CALL_NO_RWG, void, avr, avr, avr, i32)
 DEF_HELPER_FLAGS_4(VAVGSW, TCG_CALL_NO_RWG, void, avr, avr, avr, i32)
diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index 38458c01de..ae151c4b62 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -528,6 +528,12 @@ VAVGUB  000100 . . . 110@VX
 VAVGUH  000100 . . . 1000110@VX
 VAVGUW  000100 . . . 1001010@VX
 
+## Vector Integer Absolute Difference Instructions
+
+VABSDUB 000100 . . . 111@VX
+VABSDUH 000100 . . . 1000111@VX
+VABSDUW 000100 . . . 1001011@VX
+
 ## Vector Bit Manipulation Instruction
 
 VGNB000100 . -- ... . 10011001100   @VX_n
diff --git a/target/ppc/int_helper.c b/target/ppc/int_helper.c
index bda76e54d4..d97a7f1f28 100644
--- a/target/ppc/int_helper.c
+++ b/target/ppc/int_helper.c
@@ -589,8 +589,8 @@ VAVG(VAVGSW, s32, int64_t)
 VAVG(VAVGUW, u32, uint64_t)
 #undef VAVG
 
-#define VABSDU_DO(name, element)\
-void helper_v##name(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)   \
+#define VABSDU(name, element)   \
+void helper_##name(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b, uint32_t v)\
 {   \
 int i;  \
 \
@@ -606,12 +606,9 @@ void helper_v##name(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t 
*b)   \
  *   name- instruction mnemonic suffix (b: byte, h: halfword, w: word)
  *   element - element type to access from vector
  */
-#define VABSDU(type, element)   \
-VABSDU_DO(absdu##type, element)
-VABSDU(b, u8)
-VABSDU(h, u16)
-VABSDU(w, u32)
-#undef VABSDU_DO
+VABSDU(VABSDUB, u8)
+VABSDU(VABSDUH, u16)
+VABSDU(VABSDUW, u32)
 #undef VABSDU
 
 #define VCF(suffix, cvt, element)   \
diff --git a/target/ppc/translate/vmx-impl.c.inc 
b/target/ppc/translate/vmx-impl.c.inc
index 1e3e099739..f46a354d31 100644
--- 

Re: [PATCH v2 11/12] target/ppc: Moved XSTSTDC[QDS]P to decodetree

2022-10-10 Thread Richard Henderson

On 10/10/22 12:13, Lucas Mateus Castro(alqotel) wrote:

From: "Lucas Mateus Castro (alqotel)"

Moved XSTSTDCSP, XSTSTDCDP and XSTSTDCQP to decodetree and moved some of
their decoding out of the helper. Previously the DCMX, XB and BF were
calculated in the helper with the help of cpu_env; now that part is done
in decodetree with the rest.

xvtstdcsp:
rept    loop    master       patch
8       12500   1,85393600   1,94683600 (+5.0%)
25      4000    1,78779800   1,92479000 (+7.7%)
100     1000    2,12775000   2,28895500 (+7.6%)
500     200     2,99655300   3,23102900 (+7.8%)
2500    40      6,89082200   7,44827500 (+8.1%)
8000    12      17,50585500  18,95152100 (+8.3%)

xvtstdcdp:
rept    loop    master       patch
8       12500   1,39043100   1,33539800 (-4.0%)
25      4000    1,35731800   1,37347800 (+1.2%)
100     1000    1,51514800   1,56053000 (+3.0%)
500     200     2,21014400   2,47906000 (+12.2%)
2500    40      5,39488200   6,68766700 (+24.0%)
8000    12      13,98623900  18,17661900 (+30.0%)

xvtstdcdp:
rept    loop    master       patch
8       12500   1,35123800   1,34455800 (-0.5%)
25      4000    1,36441200   1,36759600 (+0.2%)
100     1000    1,49763500   1,54138400 (+2.9%)
500     200     2,19020200   2,46196400 (+12.4%)
2500    40      5,39265700   6,68147900 (+23.9%)
8000    12      14,04163600  18,19669600 (+29.6%)

As some values are now decoded outside the helper and passed to it as
arguments, the helper takes more arguments, so more TCGops are needed to
load them. I suspect that's the reason for the slow-down in the tests
with a high REPT but low LOOP.

Signed-off-by: Lucas Mateus Castro (alqotel)
---
  target/ppc/fpu_helper.c | 114 +---
  target/ppc/helper.h |   6 +-
  target/ppc/insn32.decode|   6 ++
  target/ppc/translate/vsx-impl.c.inc |  20 -
  target/ppc/translate/vsx-ops.c.inc  |   4 -
  5 files changed, 60 insertions(+), 90 deletions(-)


Reviewed-by: Richard Henderson 

r~



[PATCH v2 06/12] target/ppc: Move VAVG[SU][BHW] to decodetree and use gvec

2022-10-10 Thread Lucas Mateus Castro(alqotel)
From: "Lucas Mateus Castro (alqotel)" 

Moved the instructions VAVGUB, VAVGUH, VAVGUW, VAVGSB, VAVGSH and VAVGSW
to decodetree and used gvec with them. For these, the right shift had to
be done before the sum to avoid an overflow, so 1 is added at the end if
either of the entries has 1 in its LSB, to replicate the "+ 1" before
the shift described by the ISA.
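
The shift-before-sum trick described above can be checked on a single unsigned byte lane in plain C. This is a scalar sketch only, not the gvec expansion from the patch, and avg_u8 is a name made up for the example.

```c
#include <stdint.h>

/*
 * Rounded-up average (a + b + 1) >> 1 computed without a wider type:
 * halve each operand first, then add 1 if either operand had its LSB
 * set. With a = 2x + p and b = 2y + q, (p + q + 1) >> 1 is 1 exactly
 * when p | q is 1, which is the correction term used here.
 */
static uint8_t avg_u8(uint8_t a, uint8_t b)
{
    uint8_t fixup = (a | b) & 1;
    return (uint8_t)((a >> 1) + (b >> 1) + fixup);
}
```

The result never exceeds 255, so the narrow addition cannot wrap; for instance avg_u8(255, 255) is 255 and avg_u8(1, 2) is 2.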

vavgub:
rept    loop    master       patch
8       12500   0,02616600   0,00754200 (-71.2%)
25      4000    0,0253       0,00637700 (-74.8%)
100     1000    0,02604600   0,00790100 (-69.7%)
500     200     0,03189300   0,01838400 (-42.4%)
2500    40      0,06006900   0,06851000 (+14.1%)
8000    12      0,13941000   0,20548500 (+47.4%)

vavguh:
rept    loop    master       patch
8       12500   0,01818200   0,00780600 (-57.1%)
25      4000    0,01789300   0,00641600 (-64.1%)
100     1000    0,01899100   0,00787200 (-58.5%)
500     200     0,02527200   0,01828400 (-27.7%)
2500    40      0,05361800   0,06773000 (+26.3%)
8000    12      0,12886600   0,20291400 (+57.5%)

vavguw:
rept    loop    master       patch
8       12500   0,01423100   0,00776600 (-45.4%)
25      4000    0,01780800   0,00638600 (-64.1%)
100     1000    0,02085500   0,00787000 (-62.3%)
500     200     0,02737100   0,01828800 (-33.2%)
2500    40      0,05572600   0,06774200 (+21.6%)
8000    12      0,13101700   0,20311600 (+55.0%)

vavgsb:
rept    loop    master       patch
8       12500   0,03006000   0,00788600 (-73.8%)
25      4000    0,02882200   0,00637800 (-77.9%)
100     1000    0,02958000   0,00791400 (-73.2%)
500     200     0,03548800   0,01860400 (-47.6%)
2500    40      0,0636       0,06850800 (+7.7%)
8000    12      0,13816500   0,20550300 (+48.7%)

vavgsh:
rept    loop    master       patch
8       12500   0,01965900   0,00776600 (-60.5%)
25      4000    0,01875400   0,00638700 (-65.9%)
100     1000    0,01952200   0,00786900 (-59.7%)
500     200     0,02562000   0,01760300 (-31.3%)
2500    40      0,05384300   0,06742800 (+25.2%)
8000    12      0,13240800   0,2033 (+53.5%)

vavgsw:
rept    loop    master       patch
8       12500   0,01407700   0,00775600 (-44.9%)
25      4000    0,01762300   0,0064 (-63.7%)
100     1000    0,02046500   0,00788500 (-61.5%)
500     200     0,02745600   0,01843000 (-32.9%)
2500    40      0,05375500   0,06820500 (+26.9%)
8000    12      0,13068300   0,20304900 (+55.4%)

These results seem to indicate that with gvec the translation is slower
but the execution is faster.

Signed-off-by: Lucas Mateus Castro (alqotel) 
Reviewed-by: Richard Henderson 
---
 target/ppc/helper.h |  12 ++--
 target/ppc/insn32.decode|   9 +++
 target/ppc/int_helper.c |  32 -
 target/ppc/translate/vmx-impl.c.inc | 106 
 target/ppc/translate/vmx-ops.c.inc  |   9 +--
 5 files changed, 127 insertions(+), 41 deletions(-)

diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index a06193bc67..71c22efc2e 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -143,15 +143,15 @@ DEF_HELPER_FLAGS_1(ftsqrt, TCG_CALL_NO_RWG_SE, i32, i64)
 #define dh_ctype_acc ppc_acc_t *
 #define dh_typecode_acc dh_typecode_ptr
 
-DEF_HELPER_FLAGS_3(vavgub, TCG_CALL_NO_RWG, void, avr, avr, avr)
-DEF_HELPER_FLAGS_3(vavguh, TCG_CALL_NO_RWG, void, avr, avr, avr)
-DEF_HELPER_FLAGS_3(vavguw, TCG_CALL_NO_RWG, void, avr, avr, avr)
+DEF_HELPER_FLAGS_4(VAVGUB, TCG_CALL_NO_RWG, void, avr, avr, avr, i32)
+DEF_HELPER_FLAGS_4(VAVGUH, TCG_CALL_NO_RWG, void, avr, avr, avr, i32)
+DEF_HELPER_FLAGS_4(VAVGUW, TCG_CALL_NO_RWG, void, avr, avr, avr, i32)
 DEF_HELPER_FLAGS_3(vabsdub, TCG_CALL_NO_RWG, void, avr, avr, avr)
 DEF_HELPER_FLAGS_3(vabsduh, TCG_CALL_NO_RWG, void, avr, avr, avr)
 DEF_HELPER_FLAGS_3(vabsduw, TCG_CALL_NO_RWG, void, avr, avr, avr)
-DEF_HELPER_FLAGS_3(vavgsb, TCG_CALL_NO_RWG, void, avr, avr, avr)
-DEF_HELPER_FLAGS_3(vavgsh, TCG_CALL_NO_RWG, void, avr, avr, avr)
-DEF_HELPER_FLAGS_3(vavgsw, TCG_CALL_NO_RWG, void, avr, avr, avr)
+DEF_HELPER_FLAGS_4(VAVGSB, TCG_CALL_NO_RWG, void, avr, avr, avr, i32)
+DEF_HELPER_FLAGS_4(VAVGSH, TCG_CALL_NO_RWG, void, avr, avr, avr, i32)
+DEF_HELPER_FLAGS_4(VAVGSW, TCG_CALL_NO_RWG, void, avr, avr, avr, i32)
 DEF_HELPER_4(vcmpeqfp, void, env, avr, avr, avr)
 DEF_HELPER_4(vcmpgefp, void, env, avr, avr, avr)
 DEF_HELPER_4(vcmpgtfp, void, env, avr, avr, avr)
diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index aa4968e6b9..38458c01de 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -519,6 +519,15 @@ VCMPNEZW000100 . . . . 011111   @VC
 VCMPSQ  000100 ... -- . . 0010101   @VX_bf
 VCMPUQ  000100 ... -- . . 0010001   @VX_bf

Re: [PATCH v2 12/12] target/ppc: Use gvec to decode XVTSTDC[DS]P

2022-10-10 Thread Richard Henderson

On 10/10/22 12:13, Lucas Mateus Castro(alqotel) wrote:

+/* test if +Inf or -Inf */
+static void gen_is_any_inf(unsigned vece, TCGv_vec t, TCGv_vec b)
+{
+    uint64_t exp_msk = (vece == MO_32) ? (uint32_t)EXP_MASK_SP : EXP_MASK_DP;
+    uint64_t sgn_msk = (vece == MO_32) ? (uint32_t)SGN_MASK_SP : SGN_MASK_DP;
+    tcg_gen_andc_vec(vece, b, b, tcg_constant_vec_matching(t, vece, exp_msk));
+    tcg_gen_cmp_vec(TCG_COND_EQ, vece, t, b,
+                    tcg_constant_vec_matching(t, vece, sgn_msk));
+}


Should be clearing sign and comparing exp, not the other way.


+    GVecGen2 op = {
+        .fno = (vece == MO_32) ? gen_helper_XVTSTDCSP : gen_helper_XVTSTDCDP,
+        .vece = vece,
+        .opt_opc = vecop_list
     };

     REQUIRE_VSX(ctx);

-    tcg_gen_gvec_2i(vsr_full_offset(a->xt), vsr_full_offset(a->xb),
-                    16, 16, (int32_t)(a->uim), &op[vece - MO_32]);
+    switch (a->uim) {
+    case 0:
+        set_cpu_vsr(a->xt, tcg_constant_i64(0), true);
+        set_cpu_vsr(a->xt, tcg_constant_i64(0), false);
+        break;
+    case ((1 << 0) | (1 << 1)):
+        /* test if +Denormal or -Denormal */
+        op.fniv = gen_is_any_denormal,
+        tcg_gen_gvec_2(vsr_full_offset(a->xt), vsr_full_offset(a->xb), 16, 16,
+                       &op);


This default setting of .fno doesn't work, because the helper requires simd_data set, 
which this invocation via tcg_gen_gvec_2 will not provide.


You could fix this by using GVecGen2i and tcg_gen_gvec_2i, and ignoring the immediate 
argument added to the functions above.  Which also means...



+    case (1 << 0):
+        /* test if -Denormal */
+        op.fniv = gen_is_neg_denormal,
+        tcg_gen_gvec_2(vsr_full_offset(a->xt), vsr_full_offset(a->xb), 16, 16,
+                       &op);
+        break;
+    case (1 << 1):
+        /* test if +Denormal */
+        op.fniv = gen_is_pos_denormal,
+        tcg_gen_gvec_2(vsr_full_offset(a->xt), vsr_full_offset(a->xb), 16, 16,
+                       &op);
+        break;
+    case ((1 << 2) | (1 << 3)):
+        /* test if +0 or -0 */
+        op.fniv = gen_is_any_zero,
+        tcg_gen_gvec_2(vsr_full_offset(a->xt), vsr_full_offset(a->xb), 16, 16,
+                       &op);
+        break;
+    case (1 << 2):
+        /* test if -0 */
+        op.fniv = gen_is_neg_zero,
+        tcg_gen_gvec_2(vsr_full_offset(a->xt), vsr_full_offset(a->xb), 16, 16,
+                       &op);
+        break;
+    case (1 << 3):
+        /* test if +0 */
+        op.fniv = gen_is_pos_zero,
+        tcg_gen_gvec_2(vsr_full_offset(a->xt), vsr_full_offset(a->xb), 16, 16,
+                       &op);
+        break;
+    case ((1 << 4) | (1 << 5)):
+        /* test if +Inf or -Inf */
+        op.fniv = gen_is_any_inf,
+        tcg_gen_gvec_2(vsr_full_offset(a->xt), vsr_full_offset(a->xb), 16, 16,
+                       &op);
+        break;
+    case (1 << 4):
+        /* test if -Inf */
+        op.fniv = gen_is_neg_inf,
+        tcg_gen_gvec_2(vsr_full_offset(a->xt), vsr_full_offset(a->xb), 16, 16,
+                       &op);
+        break;
+    case (1 << 5):
+        /* test if +Inf */
+        op.fniv = gen_is_pos_inf,
+        tcg_gen_gvec_2(vsr_full_offset(a->xt), vsr_full_offset(a->xb), 16, 16,
+                       &op);
+        break;
+    case (1 << 6):
+        /* test if NaN */
+        op.fniv = gen_is_nan,
+        tcg_gen_gvec_2(vsr_full_offset(a->xt), vsr_full_offset(a->xb), 16, 16,
+                       &op);
+        break;
+    default:
+        tcg_gen_gvec_2_ool(vsr_full_offset(a->xt), vsr_full_offset(a->xb), 16,
+                           16, (int32_t)(a->uim), op.fno);
+        break;


You can have only the store to op.fniv in the switch, remove the default, and have a 
common call to tcg_gen_gvec_2i after the switch.
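
The control-flow shape suggested here (the switch only selects a specialised expander, and one common call after it falls back to the generic helper) might look like the following stand-alone C model. Nothing in it is QEMU API: inline_fn, generic_tstdc and expand are hypothetical names, lanes are modelled as raw 64-bit double-precision bit patterns, and only two of the DCMX masks are specialised to keep the sketch short.

```c
#include <stddef.h>
#include <stdint.h>

#define DP_SGN 0x8000000000000000ull   /* binary64 sign bit */
#define DP_EXP 0x7FF0000000000000ull   /* binary64 exponent field */

/* Hypothetical stand-in for a specialised per-class expander (.fniv). */
typedef uint64_t (*inline_fn)(uint64_t b);

static uint64_t is_any_inf(uint64_t b)
{
    return (b & ~DP_SGN) == DP_EXP ? ~0ull : 0;
}

static uint64_t is_any_zero(uint64_t b)
{
    return (b & ~DP_SGN) == 0 ? ~0ull : 0;
}

/* Generic fallback that needs the immediate; covers Inf and zero only. */
static uint64_t generic_tstdc(uint64_t b, unsigned dcmx)
{
    unsigned neg = (unsigned)(b >> 63);
    uint64_t mag = b & ~DP_SGN;
    unsigned match = 0;

    if (mag == DP_EXP) {
        match = (dcmx >> (4 + !neg)) & 1;
    } else if (mag == 0) {
        match = (dcmx >> (2 + !neg)) & 1;
    }
    return match ? ~0ull : 0;
}

static uint64_t expand(unsigned dcmx, uint64_t b)
{
    inline_fn fniv = NULL;

    switch (dcmx) {
    case (1 << 2) | (1 << 3):
        fniv = is_any_zero;
        break;
    case (1 << 4) | (1 << 5):
        fniv = is_any_inf;
        break;
    /* no default: every other mask leaves fniv unset */
    }

    /* Single common exit: specialised path if chosen, helper otherwise. */
    return fniv ? fniv(b) : generic_tstdc(b, dcmx);
}
```

With this shape each case shrinks to one assignment, and adding a new specialised mask does not duplicate the call site.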



r~



[PATCH v2 00/12] VMX/VSX instructions with gvec

2022-10-10 Thread Lucas Mateus Castro(alqotel)
From: "Lucas Mateus Castro (alqotel)" 

Patches missing review: 3,5,9,11,12

v1 -> v2:
- Implemented instructions with fni4/fni8 and dropped the helper:
* VSUBCUW
* VADDCUW
* VPRTYBW
* VPRTYBD
- Reworked patch12 to only use gvec implementation with a few
  immediates.
- Used bitsel_vec on patch9
- Changed vec variables to tcg_constant_vec when possible

This patch series moves some instructions from legacy decoding to
decodetree and translates them with gvec. In some cases the gvec
implementation ended up bigger, more complex and slower, so those
instructions were only moved to decodetree.

Each patch includes a comparison of the execution time before and after
the patch is applied. Each result is the sum of 10 executions.

The program used to time the execution worked like this:

clock_t start = clock();
for (int i = 0; i < LOOP; i++) {
    asm (
         load values in registers, between 2 and 3 instructions
         ".rept REPT\n\t"
         "INSTRUCTION registers\n\t"
         ".endr\n\t"
         save result from register, 1 instruction
    );
}
clock_t end = clock();
printf("INSTRUCTION rept=REPT loop=LOOP, time taken: %.12lf\n",
       ((double)(end - start))/ CLOCKS_PER_SEC);

Here the rept column is the value used for .rept in the inline assembly
and the loop column is the value used for the for loop. All of these tests
were executed on a Power9. When comparing TCGops, the data was gathered
using '-d op' and '-d op_opt'.
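
A portable stand-in for the harness outlined above, for readers without a Power toolchain, could look like this. It is only a sketch under loud assumptions: op_under_test, time_block and sink are invented names, an ordinary C function replaces the inline-assembly instruction, and a run-time inner loop models what .rept does at assembly time, so the absolute numbers it produces are not comparable to the tables in this series.

```c
#include <stdint.h>
#include <time.h>

/* Keeps the result observable so the work is not discarded entirely. */
static volatile uint32_t sink;

/* Stand-in for "INSTRUCTION registers"; a real test uses inline asm. */
static inline uint32_t op_under_test(uint32_t x)
{
    return x + 1;
}

/* Returns the CPU time in seconds spent running loop blocks of rept ops. */
static double time_block(int rept, int loop)
{
    clock_t start = clock();

    for (int i = 0; i < loop; i++) {
        uint32_t r = (uint32_t)i;          /* "load values in registers" */
        for (int j = 0; j < rept; j++) {   /* models .rept REPT */
            r = op_under_test(r);
        }
        sink = r;                          /* "save result from register" */
    }

    clock_t end = clock();
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Note that the rept/loop pairs used in the result tables (8/12500, 25/4000, 100/1000, 500/200, 2500/40, 8000/12) keep the product at roughly 100,000 executed instructions, so moving down a table shifts the balance from re-executing a small translated block towards translating a large block that runs only a few times.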

Lucas Mateus Castro (alqotel) (12):
  target/ppc: Moved VMLADDUHM to decodetree and use gvec
  target/ppc: Move VMH[R]ADDSHS instruction to decodetree
  target/ppc: Move V(ADD|SUB)CUW to decodetree and use gvec
  target/ppc: Move VNEG[WD] to decodtree and use gvec
  target/ppc: Move VPRTYB[WDQ] to decodetree and use gvec
  target/ppc: Move VAVG[SU][BHW] to decodetree and use gvec
  target/ppc: Move VABSDU[BHW] to decodetree and use gvec
  target/ppc: Use gvec to decode XV[N]ABS[DS]P/XVNEG[DS]P
  target/ppc: Use gvec to decode XVCPSGN[SD]P
  target/ppc: Moved XVTSTDC[DS]P to decodetree
  target/ppc: Moved XSTSTDC[QDS]P to decodetree
  target/ppc: Use gvec to decode XVTSTDC[DS]P

 target/ppc/fpu_helper.c | 140 +-
 target/ppc/helper.h |  42 ++-
 target/ppc/insn32.decode|  50 
 target/ppc/int_helper.c | 107 ++--
 target/ppc/translate.c  |   1 -
 target/ppc/translate/vmx-impl.c.inc | 364 +
 target/ppc/translate/vmx-ops.c.inc  |  15 +-
 target/ppc/translate/vsx-impl.c.inc | 394 +++-
 target/ppc/translate/vsx-ops.c.inc  |  21 --
 9 files changed, 808 insertions(+), 326 deletions(-)

-- 
2.37.3




Re: [PATCH v2 03/12] target/ppc: Move V(ADD|SUB)CUW to decodetree and use gvec

2022-10-10 Thread Richard Henderson

On 10/10/22 12:13, Lucas Mateus Castro(alqotel) wrote:

From: "Lucas Mateus Castro (alqotel)"

This patch moves VADDCUW and VSUBCUW to decodetree with gvec, using an
implementation based on the helper. The main difference is that the -1
(i.e. all bits set to 1) result returned by cmp when true is changed
to +1. It also implements a .fni4 version of those instructions
and drops the helper.
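
As a scalar illustration of the -1 to +1 conversion mentioned above, the carry-out of one 32-bit lane can be modelled in C as below. The function names addc_u32 and subc_u32 are made up, and masking with 1 is just one possible way to turn an all-ones compare result into +1; the patch itself may do this differently.

```c
#include <stdint.h>

/* Carry out of a 32-bit unsigned add, returned as 0 or 1. */
static uint32_t addc_u32(uint32_t a, uint32_t b)
{
    uint32_t sum = a + b;                         /* wraps modulo 2^32 */
    uint32_t mask = (sum < a) ? UINT32_MAX : 0;   /* vector compare: -1 if true */
    return mask & 1;                              /* -1 becomes +1 */
}

/* Carry out of a - b, i.e. 1 when no borrow is needed (a >= b). */
static uint32_t subc_u32(uint32_t a, uint32_t b)
{
    uint32_t mask = (a >= b) ? UINT32_MAX : 0;
    return mask & 1;
}
```

For example, addc_u32(0xFFFFFFFF, 1) is 1 and subc_u32(4, 5) is 0.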

vaddcuw:
rept    loop    master       patch
8       12500   0,01008200   0,00612400 (-39.3%)
25      4000    0,01091500   0,00471600 (-56.8%)
100     1000    0,01332500   0,00593700 (-55.4%)
500     200     0,01998500   0,01275700 (-36.2%)
2500    40      0,04704300   0,04364300 (-7.2%)
8000    12      0,10748200   0,11241000 (+4.6%)

vsubcuw:
rept    loop    master       patch
8       12500   0,01226200   0,00571600 (-53.4%)
25      4000    0,01493500   0,00462100 (-69.1%)
100     1000    0,01522700   0,00455100 (-70.1%)
500     200     0,02384600   0,01133500 (-52.5%)
2500    40      0,04935200   0,03178100 (-35.6%)
8000    12      0,09039900   0,09440600 (+4.4%)

Overall there was a gain in performance, but the TCGop code was still
slightly bigger in the new version (it went from 4 to 5).

Signed-off-by: Lucas Mateus Castro (alqotel)
---
  target/ppc/helper.h |  2 -
  target/ppc/insn32.decode|  2 +
  target/ppc/int_helper.c | 18 -
  target/ppc/translate/vmx-impl.c.inc | 61 +++--
  target/ppc/translate/vmx-ops.c.inc  |  3 +-
  5 files changed, 60 insertions(+), 26 deletions(-)


Reviewed-by: Richard Henderson 

r~



[PATCH v2 04/12] target/ppc: Move VNEG[WD] to decodtree and use gvec

2022-10-10 Thread Lucas Mateus Castro(alqotel)
From: "Lucas Mateus Castro (alqotel)" 

Moved the instructions VNEGW and VNEGD to decodetree and used gvec to
translate them.

vnegw:
rept    loop    master       patch
8       12500   0,01053200   0,00548400 (-47.9%)
25      4000    0,01030500   0,0039 (-62.2%)
100     1000    0,01096300   0,00395400 (-63.9%)
500     200     0,01472000   0,00712300 (-51.6%)
2500    40      0,03809000   0,02147700 (-43.6%)
8000    12      0,09957100   0,06202100 (-37.7%)

vnegd:
rept    loop    master       patch
8       12500   0,00594600   0,00543800 (-8.5%)
25      4000    0,00575200   0,00396400 (-31.1%)
100     1000    0,00676100   0,00394800 (-41.6%)
500     200     0,01149300   0,00709400 (-38.3%)
2500    40      0,03441500   0,02169600 (-37.0%)
8000    12      0,09516900   0,06337000 (-33.4%)

Signed-off-by: Lucas Mateus Castro (alqotel) 
Reviewed-by: Richard Henderson 
---
 target/ppc/helper.h |  2 --
 target/ppc/insn32.decode|  3 +++
 target/ppc/int_helper.c | 12 
 target/ppc/translate/vmx-impl.c.inc | 15 +--
 target/ppc/translate/vmx-ops.c.inc  |  2 --
 5 files changed, 16 insertions(+), 18 deletions(-)

diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index f7047ed2aa..b2e910b089 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -229,8 +229,6 @@ DEF_HELPER_FLAGS_2(VSTRIBL, TCG_CALL_NO_RWG, i32, avr, avr)
 DEF_HELPER_FLAGS_2(VSTRIBR, TCG_CALL_NO_RWG, i32, avr, avr)
 DEF_HELPER_FLAGS_2(VSTRIHL, TCG_CALL_NO_RWG, i32, avr, avr)
 DEF_HELPER_FLAGS_2(VSTRIHR, TCG_CALL_NO_RWG, i32, avr, avr)
-DEF_HELPER_FLAGS_2(vnegw, TCG_CALL_NO_RWG, void, avr, avr)
-DEF_HELPER_FLAGS_2(vnegd, TCG_CALL_NO_RWG, void, avr, avr)
 DEF_HELPER_FLAGS_2(vupkhpx, TCG_CALL_NO_RWG, void, avr, avr)
 DEF_HELPER_FLAGS_2(vupklpx, TCG_CALL_NO_RWG, void, avr, avr)
 DEF_HELPER_FLAGS_2(vupkhsb, TCG_CALL_NO_RWG, void, avr, avr)
diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index aebc7b73c8..2658dd3395 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -629,6 +629,9 @@ VEXTSH2D000100 . 11001 . 1100010
@VX_tb
 VEXTSW2D000100 . 11010 . 1100010@VX_tb
 VEXTSD2Q000100 . 11011 . 1100010@VX_tb
 
+VNEGD   000100 . 00111 . 1100010@VX_tb
+VNEGW   000100 . 00110 . 1100010@VX_tb
+
 ## Vector Mask Manipulation Instructions
 
 MTVSRBM 000100 . 1 . 1100110@VX_tb
diff --git a/target/ppc/int_helper.c b/target/ppc/int_helper.c
index f8dd12e8ae..c7fd0d1faa 100644
--- a/target/ppc/int_helper.c
+++ b/target/ppc/int_helper.c
@@ -1928,18 +1928,6 @@ XXBLEND(W, 32)
 XXBLEND(D, 64)
 #undef XXBLEND
 
-#define VNEG(name, element) \
-void helper_##name(ppc_avr_t *r, ppc_avr_t *b)  \
-{   \
-int i;  \
-for (i = 0; i < ARRAY_SIZE(r->element); i++) {  \
-r->element[i] = -b->element[i]; \
-}   \
-}
-VNEG(vnegw, s32)
-VNEG(vnegd, s64)
-#undef VNEG
-
 void helper_vsro(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
 {
 int sh = (b->VsrB(0xf) >> 3) & 0xf;
diff --git a/target/ppc/translate/vmx-impl.c.inc 
b/target/ppc/translate/vmx-impl.c.inc
index f52485a5f1..b9a9e83ab3 100644
--- a/target/ppc/translate/vmx-impl.c.inc
+++ b/target/ppc/translate/vmx-impl.c.inc
@@ -2625,8 +2625,19 @@ GEN_VXFORM_NOA(vclzb, 1, 28)
 GEN_VXFORM_NOA(vclzh, 1, 29)
 GEN_VXFORM_TRANS(vclzw, 1, 30)
 GEN_VXFORM_TRANS(vclzd, 1, 31)
-GEN_VXFORM_NOA_2(vnegw, 1, 24, 6)
-GEN_VXFORM_NOA_2(vnegd, 1, 24, 7)
+
+static bool do_vneg(DisasContext *ctx, arg_VX_tb *a, unsigned vece)
+{
+    REQUIRE_INSNS_FLAGS2(ctx, ISA300);
+    REQUIRE_VECTOR(ctx);
+
+    tcg_gen_gvec_neg(vece, avr_full_offset(a->vrt), avr_full_offset(a->vrb),
+                     16, 16);
+    return true;
+}
+
+TRANS(VNEGW, do_vneg, MO_32)
+TRANS(VNEGD, do_vneg, MO_64)
 
 static void gen_vexts_i64(TCGv_i64 t, TCGv_i64 b, int64_t s)
 {
diff --git a/target/ppc/translate/vmx-ops.c.inc 
b/target/ppc/translate/vmx-ops.c.inc
index ded0234123..27908533dd 100644
--- a/target/ppc/translate/vmx-ops.c.inc
+++ b/target/ppc/translate/vmx-ops.c.inc
@@ -181,8 +181,6 @@ GEN_VXFORM_300_EXT(vextractd, 6, 11, 0x10),
 GEN_VXFORM(vspltisb, 6, 12),
 GEN_VXFORM(vspltish, 6, 13),
 GEN_VXFORM(vspltisw, 6, 14),
-GEN_VXFORM_300_EO(vnegw, 0x01, 0x18, 0x06),
-GEN_VXFORM_300_EO(vnegd, 0x01, 0x18, 0x07),
 GEN_VXFORM_300_EO(vctzb, 0x01, 0x18, 0x1C),
 GEN_VXFORM_300_EO(vctzh, 0x01, 0x18, 0x1D),
 GEN_VXFORM_300_EO(vctzw, 0x01, 0x18, 0x1E),
-- 
2.37.3




[PATCH v2 09/12] target/ppc: Use gvec to decode XVCPSGN[SD]P

2022-10-10 Thread Lucas Mateus Castro(alqotel)
From: "Lucas Mateus Castro (alqotel)" 

Moved XVCPSGNSP and XVCPSGNDP to decodetree and used gvec to translate
them.
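
For orientation, the copy-sign operation itself is a single bit-select per lane: the sign bit comes from the first source (XA) and everything else from the second (XB), as in the and/andc/or sequence of the macro removed below. The following single-precision C model is only a sketch; bitsel_u32 and cpsgn_f32 are names invented for the example and are not the gvec calls used in the patch.

```c
#include <stdint.h>
#include <string.h>

/* Takes the bits of x where mask is 1 and the bits of y elsewhere. */
static uint32_t bitsel_u32(uint32_t mask, uint32_t x, uint32_t y)
{
    return (x & mask) | (y & ~mask);
}

/* Sign from a, exponent and mantissa from b. */
static float cpsgn_f32(float a, float b)
{
    uint32_t ua, ub, ur;
    float r;

    memcpy(&ua, &a, sizeof(ua));   /* reinterpret without aliasing issues */
    memcpy(&ub, &b, sizeof(ub));
    ur = bitsel_u32(0x80000000u, ua, ub);
    memcpy(&r, &ur, sizeof(r));
    return r;
}
```

For example, cpsgn_f32(-1.0f, 2.5f) gives -2.5f and cpsgn_f32(3.0f, -0.5f) gives 0.5f.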

xvcpsgnsp:
rept    loop    master       patch
8       12500   0,00561400   0,00537900 (-4.2%)
25      4000    0,00562100   0,0040 (-28.8%)
100     1000    0,00696900   0,00416300 (-40.3%)
500     200     0,02211900   0,00840700 (-62.0%)
2500    40      0,09328600   0,02728300 (-70.8%)
8000    12      0,27295300   0,06867800 (-74.8%)

xvcpsgndp:
rept    loop    master       patch
8       12500   0,00556300   0,00584200 (+5.0%)
25      4000    0,00482700   0,00431700 (-10.6%)
100     1000    0,00585800   0,00464400 (-20.7%)
500     200     0,01565300   0,00839700 (-46.4%)
2500    40      0,05766500   0,02430600 (-57.8%)
8000    12      0,19875300   0,07947100 (-60.0%)

Like the previous instructions, there seemed to be an improvement in
translation time.

Signed-off-by: Lucas Mateus Castro (alqotel) 
---
 target/ppc/insn32.decode|   2 +
 target/ppc/translate/vsx-impl.c.inc | 109 ++--
 target/ppc/translate/vsx-ops.c.inc  |   3 -
 3 files changed, 55 insertions(+), 59 deletions(-)

diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index 5b687078be..6549c4040e 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -762,6 +762,8 @@ XVNABSDP00 . 0 . 01001 ..   @XX2
 XVNABSSP00 . 0 . 110101001 ..   @XX2
 XVNEGDP 00 . 0 . 11001 ..   @XX2
 XVNEGSP 00 . 0 . 110111001 ..   @XX2
+XVCPSGNDP   00 . . .  ...   @XX3
+XVCPSGNSP   00 . . . 1101 ...   @XX3
 
 ## VSX Scalar Multiply-Add Instructions
 
diff --git a/target/ppc/translate/vsx-impl.c.inc 
b/target/ppc/translate/vsx-impl.c.inc
index 3f9af811dc..4f17da514c 100644
--- a/target/ppc/translate/vsx-impl.c.inc
+++ b/target/ppc/translate/vsx-impl.c.inc
@@ -729,62 +729,6 @@ VSX_SCALAR_MOVE_QP(xsnabsqp, OP_NABS, SGN_MASK_DP)
 VSX_SCALAR_MOVE_QP(xsnegqp, OP_NEG, SGN_MASK_DP)
 VSX_SCALAR_MOVE_QP(xscpsgnqp, OP_CPSGN, SGN_MASK_DP)
 
-#define VSX_VECTOR_MOVE(name, op, sgn_mask)  \
-static void glue(gen_, name)(DisasContext *ctx)  \
-{\
-TCGv_i64 xbh, xbl, sgm;  \
-if (unlikely(!ctx->vsx_enabled)) {   \
-gen_exception(ctx, POWERPC_EXCP_VSXU);   \
-return;  \
-}\
-xbh = tcg_temp_new_i64();\
-xbl = tcg_temp_new_i64();\
-sgm = tcg_temp_new_i64();\
-get_cpu_vsr(xbh, xB(ctx->opcode), true); \
-get_cpu_vsr(xbl, xB(ctx->opcode), false);\
-tcg_gen_movi_i64(sgm, sgn_mask); \
-switch (op) {\
-case OP_ABS: {   \
-tcg_gen_andc_i64(xbh, xbh, sgm); \
-tcg_gen_andc_i64(xbl, xbl, sgm); \
-break;   \
-}\
-case OP_NABS: {  \
-tcg_gen_or_i64(xbh, xbh, sgm);   \
-tcg_gen_or_i64(xbl, xbl, sgm);   \
-break;   \
-}\
-case OP_NEG: {   \
-tcg_gen_xor_i64(xbh, xbh, sgm);  \
-tcg_gen_xor_i64(xbl, xbl, sgm);  \
-break;   \
-}\
-case OP_CPSGN: { \
-TCGv_i64 xah = tcg_temp_new_i64();   \
-TCGv_i64 xal = tcg_temp_new_i64();   \
-get_cpu_vsr(xah, xA(ctx->opcode), true); \
-get_cpu_vsr(xal, xA(ctx->opcode), false);\
-tcg_gen_and_i64(xah, xah, sgm);  \
-tcg_gen_and_i64(xal, xal, sgm);  \
-tcg_gen_andc_i64(xbh, xbh, sgm); \
-tcg_gen_andc_i64(xbl, xbl, sgm); \
-tcg_gen_or_i64(xbh, xbh, xah);   \
-tcg_gen_or_i64(xbl, xbl, xal);  

Re: [PULL 29/55] Revert "intel_iommu: Fix irqchip / X2APIC configuration checks"

2022-10-10 Thread Peter Xu
On Mon, Oct 10, 2022 at 10:39:52AM -0700, David Woodhouse wrote:
> On Mon, 2022-10-10 at 13:30 -0400, Michael S. Tsirkin wrote:
> > From: Peter Xu <pet...@redhat.com>
> > 
> > It's true that when vcpus<=255 we don't require the length of 32bit APIC
> > IDs.  However here since we already have EIM=ON it means the hypervisor
> > will declare the VM as x2apic supported (e.g. VT-d ECAP register will have
> > EIM bit 4 set), so the guest should assume the APIC IDs are 32bits width
> > even if vcpus<=255.  In short, commit 77250171bdc breaks any simple cmdline
> > that wants to boot a VM with >=9 but <=255 vcpus with:
> 
> I find that paragraph really hard to parse. What does it even mean that
> "guest should assume the APIC IDs are 32bits"? 

Quoting the EIM definition:

 0: On Intel® 64 platforms, hardware supports only 8-bit APIC-IDs (xAPIC
Mode).

 1: On Intel® 64 platforms, hardware supports 32-bit APIC-IDs (x2APIC
mode).  Hardware implementation reporting Interrupt Remapping support
(IR) field as Clear also report this field as Clear.

I hope the statement matches the spec.  Please let me know if you have a
better way to word it.

> 
> In practice, all the EIM bit does is *allow* 32 bits of APIC ID in the
> tables. Which is perfectly fine if there are only 254 CPUs anyway, and
> we never need to use a higher value.
> 
> I *think* the actual problem here is when logical addressing is used,
> which puts the APIC cluster ID into higher bits? But it's kind of weird
> that the message doesn't mention that at all?

The commit message actually doesn't even need to contain a lot of
information in this case, IMO.

Literally it can be seen as a revert of a commit that breaks guests with
more than 8 vcpus at boot.  I kept the other lines because they still make
sense; alternatively it can be a full revert with "something broke with
commit xxx, revert it to fix", and anything else could be reworked.
AFAICT that's how it normally works with QEMU or Linux.

I am not 100% familiar with the original purpose of the patch; would
eim=off work for you even after this patch is applied?  Is anything
severely wrong with this patch?

> 
> That's fixable by just setting the X2APIC_PHYSICAL bit in the ACPI
> FADT, isn't it? Then the only values that a guest may put into those
> fields — 32-bit fields or not — are lower than 0xff anyway.

It's still not clear to me why we need to make the EIM we declare to the
guest inconsistent with how KVM understands the EIM bit.  Even if enforced
physical mode works, we lose the possibility of cluster mode, and I also
don't see what the major benefit is, since EIM=off will just work, afaiu,
while keeping everything aligned.

Are you fine if we proceed with this pull request first and revisit later?
Follow-up patches will always be fine, and we're unbreaking something.  I
have copied you since the 1st patch I posted, and the small patch was there
for weeks; it would be appreciated if you could either comment earlier next
time or propose a better fix, so that we can discuss the best way to fix
it.  Thanks.

-- 
Peter Xu




[PATCH v2 08/12] target/ppc: Use gvec to decode XV[N]ABS[DS]P/XVNEG[DS]P

2022-10-10 Thread Lucas Mateus Castro(alqotel)
From: "Lucas Mateus Castro (alqotel)" 

Moved XVABSSP, XVABSDP, XVNABSSP, XVNABSDP, XVNEGSP and XVNEGDP to
decodetree and used gvec to translate them.

xvabssp:
rept    loop    master      patch
8       12500   0,00477900  0,00476000 (-0.4%)
25      4000    0,00442800  0,00353300 (-20.2%)
100     1000    0,00478700  0,00366100 (-23.5%)
500     200     0,00973200  0,00649400 (-33.3%)
2500    40      0,03165200  0,02226700 (-29.7%)
8000    12      0,09315900  0,06674900 (-28.3%)

xvabsdp:
rept    loop    master      patch
8       12500   0,00475000  0,00474400 (-0.1%)
25      4000    0,00355600  0,00367500 (+3.3%)
100     1000    0,00444200  0,00366000 (-17.6%)
500     200     0,00942700  0,00732400 (-22.3%)
2500    40      0,0299      0,02308500 (-22.8%)
8000    12      0,08770300  0,06683800 (-23.8%)

xvnabssp:
rept    loop    master      patch
8       12500   0,00494500  0,00492900 (-0.3%)
25      4000    0,00397700  0,00338600 (-14.9%)
100     1000    0,00421400  0,00353500 (-16.1%)
500     200     0,01048000  0,00707100 (-32.5%)
2500    40      0,03251500  0,02238300 (-31.2%)
8000    12      0,08889100  0,06469800 (-27.2%)

xvnabsdp:
rept    loop    master      patch
8       12500   0,00511000  0,00492700 (-3.6%)
25      4000    0,00398800  0,00381500 (-4.3%)
100     1000    0,00390500  0,00365900 (-6.3%)
500     200     0,00924800  0,00784600 (-15.2%)
2500    40      0,03138900  0,02391600 (-23.8%)
8000    12      0,09654200  0,05684600 (-41.1%)

xvnegsp:
rept    loop    master      patch
8       12500   0,00493900  0,00452800 (-8.3%)
25      4000    0,00369100  0,00366800 (-0.6%)
100     1000    0,00371100  0,0038     (+2.4%)
500     200     0,00991100  0,00652300 (-34.2%)
2500    40      0,03025800  0,02422300 (-19.9%)
8000    12      0,09251100  0,06457600 (-30.2%)

xvnegdp:
rept    loop    master      patch
8       12500   0,00474900  0,00454400 (-4.3%)
25      4000    0,00353100  0,00325600 (-7.8%)
100     1000    0,00398600  0,00366800 (-8.0%)
500     200     0,01032300  0,00702400 (-32.0%)
2500    40      0,03125000  0,02422400 (-22.5%)
8000    12      0,09475100  0,06173000 (-34.9%)

This one seemed to me the opposite of the previous instructions, as it
looks like there was an improvement in translation time (not a surprise
in itself, as the operations were done twice before, so twice as many
TCGops had to be translated).
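
As a plain-C illustration of what these six instructions do per 64-bit
chunk, the OP_ABS, OP_NABS and OP_NEG cases of the old VSX_VECTOR_MOVE
macro reduce to an andc, an or and a xor with the sign mask. The sketch
below is not the gvec implementation from this patch; the function names
are hypothetical and the mask values are assumed to match QEMU's
SGN_MASK_DP and SGN_MASK_SP.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values: one sign bit per double, two sign bits per pair of floats. */
#define SGN_MASK_DP 0x8000000000000000ull
#define SGN_MASK_SP 0x8000000080000000ull

/* abs: clear the sign bit(s) */
static inline uint64_t sign_abs(uint64_t b, uint64_t sgm)
{
    return b & ~sgm;
}

/* nabs: force the sign bit(s) on */
static inline uint64_t sign_nabs(uint64_t b, uint64_t sgm)
{
    return b | sgm;
}

/* neg: flip the sign bit(s) */
static inline uint64_t sign_neg(uint64_t b, uint64_t sgm)
{
    return b ^ sgm;
}

static inline void sign_ops_self_test(void)
{
    /* -1.0 as a double is 0xBFF0..., 1.0 is 0x3FF0... */
    assert(sign_abs(0xBFF0000000000000ull, SGN_MASK_DP) ==
           0x3FF0000000000000ull);
    assert(sign_neg(0x3FF0000000000000ull, SGN_MASK_DP) ==
           0xBFF0000000000000ull);
}
```

Because each operation is a single bitwise op against a constant, one
op per 64-bit chunk covers either one double or two floats.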

Signed-off-by: Lucas Mateus Castro (alqotel) 
Reviewed-by: Richard Henderson 
---
 target/ppc/insn32.decode|  9 
 target/ppc/translate/vsx-impl.c.inc | 73 ++---
 target/ppc/translate/vsx-ops.c.inc  |  6 ---
 3 files changed, 76 insertions(+), 12 deletions(-)

diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index ae151c4b62..5b687078be 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -754,6 +754,15 @@ STXVRHX 01 . . . 0010101101 .   
@X_TSX
 STXVRWX 01 . . . 0011001101 .   @X_TSX
 STXVRDX 01 . . . 0011101101 .   @X_TSX
 
+## VSX Vector Binary Floating-Point Sign Manipulation Instructions
+
+XVABSDP 00 . 0 . 111011001 ..   @XX2
+XVABSSP 00 . 0 . 110011001 ..   @XX2
+XVNABSDP00 . 0 . 01001 ..   @XX2
+XVNABSSP00 . 0 . 110101001 ..   @XX2
+XVNEGDP 00 . 0 . 11001 ..   @XX2
+XVNEGSP 00 . 0 . 110111001 ..   @XX2
+
 ## VSX Scalar Multiply-Add Instructions
 
 XSMADDADP   00 . . . 0011 . . . @XX3
diff --git a/target/ppc/translate/vsx-impl.c.inc 
b/target/ppc/translate/vsx-impl.c.inc
index 7acdbceec4..3f9af811dc 100644
--- a/target/ppc/translate/vsx-impl.c.inc
+++ b/target/ppc/translate/vsx-impl.c.inc
@@ -782,15 +782,76 @@ static void glue(gen_, name)(DisasContext *ctx)   
   \
 tcg_temp_free_i64(sgm);  \
 }
 
-VSX_VECTOR_MOVE(xvabsdp, OP_ABS, SGN_MASK_DP)
-VSX_VECTOR_MOVE(xvnabsdp, OP_NABS, SGN_MASK_DP)
-VSX_VECTOR_MOVE(xvnegdp, OP_NEG, SGN_MASK_DP)
 VSX_VECTOR_MOVE(xvcpsgndp, OP_CPSGN, SGN_MASK_DP)
-VSX_VECTOR_MOVE(xvabssp, OP_ABS, SGN_MASK_SP)
-VSX_VECTOR_MOVE(xvnabssp, OP_NABS, SGN_MASK_SP)
-VSX_VECTOR_MOVE(xvnegsp, OP_NEG, SGN_MASK_SP)
 VSX_VECTOR_MOVE(xvcpsgnsp, OP_CPSGN, SGN_MASK_SP)
 
+#define TCG_OP_IMM_i64(FUNC, OP, IMM)   \
+static void FUNC(TCGv_i64 t, TCGv_i64 b)\
+{   \
+OP(t, b, IMM);  \
+}
+
+TCG_OP_IMM_i64(do_xvabssp_i64, tcg_gen_andi_i64, 

Re: [PATCH v2 09/12] target/ppc: Use gvec to decode XVCPSGN[SD]P

2022-10-10 Thread Richard Henderson

On 10/10/22 12:13, Lucas Mateus Castro(alqotel) wrote:

From: "Lucas Mateus Castro (alqotel)"

Moved XVCPSGNSP and XVCPSGNDP to decodetree and used gvec to translate
them.

xvcpsgnsp:
rept    loop    master      patch
8       12500   0,00561400  0,00537900 (-4.2%)
25      4000    0,00562100  0,0040     (-28.8%)
100     1000    0,00696900  0,00416300 (-40.3%)
500     200     0,02211900  0,00840700 (-62.0%)
2500    40      0,09328600  0,02728300 (-70.8%)
8000    12      0,27295300  0,06867800 (-74.8%)

xvcpsgndp:
rept    loop    master      patch
8       12500   0,00556300  0,00584200 (+5.0%)
25      4000    0,00482700  0,00431700 (-10.6%)
100     1000    0,00585800  0,00464400 (-20.7%)
500     200     0,01565300  0,00839700 (-46.4%)
2500    40      0,05766500  0,02430600 (-57.8%)
8000    12      0,19875300  0,07947100 (-60.0%)

As with the previous instructions, there seems to be an improvement in
translation time.

Signed-off-by: Lucas Mateus Castro (alqotel)
---
  target/ppc/insn32.decode|   2 +
  target/ppc/translate/vsx-impl.c.inc | 109 ++--
  target/ppc/translate/vsx-ops.c.inc  |   3 -
  3 files changed, 55 insertions(+), 59 deletions(-)


Reviewed-by: Richard Henderson 

r~



[PATCH v2 2/2] vvfat: allow spaces in file names

2022-10-10 Thread Hervé Poussineau
In R/W mode, files with spaces in their names were never created on the
host side.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1176
Fixes: c79e243ed67683d6d06692bd7040f7394da178b0
Signed-off-by: Hervé Poussineau 
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Kevin Wolf 
---
 block/vvfat.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block/vvfat.c b/block/vvfat.c
index ae53f0d7283..392eab5168b 100644
--- a/block/vvfat.c
+++ b/block/vvfat.c
@@ -499,7 +499,7 @@ static bool valid_filename(const unsigned char *name)
   (c >= 'A' && c <= 'Z') ||
   (c >= 'a' && c <= 'z') ||
   c > 127 ||
-  strchr("$%'-_@~`!(){}^#&.+,;=[]", c) != NULL))
+  strchr(" $%'-_@~`!(){}^#&.+,;=[]", c) != NULL))
 {
 return false;
 }
-- 
2.36.2




[PATCH v2 0/2] Fix some problems with vvfat in R/W mode

2022-10-10 Thread Hervé Poussineau
Hi,

When testing vvfat in read-write mode, I came across some blocking
problems when using Windows guests.
This patchset is not here to fix all problems of vvfat, but only the
main ones I encountered.

The first patch allows setting/resetting the 'volume dirty' flag in the
bootsector, and the second one allows creating file names with spaces.

Hervé

Changes since v1:
- updated patch 1 with remarks (modify in-memory copy, add comment about
  FAT32)

Hervé Poussineau (2):
  vvfat: allow some writes to bootsector
  vvfat: allow spaces in file names

 block/vvfat.c | 28 ++--
 1 file changed, 26 insertions(+), 2 deletions(-)

-- 
2.36.2




[PATCH v2 05/12] target/ppc: Move VPRTYB[WDQ] to decodetree and use gvec

2022-10-10 Thread Lucas Mateus Castro(alqotel)
From: "Lucas Mateus Castro (alqotel)" 

Moved VPRTYBW and VPRTYBD to gvec, and moved both of them, along with
VPRTYBQ, to decodetree. VPRTYBW and VPRTYBD now also use .fni4 and .fni8,
respectively.
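
For reference, the per-element computation of the two helpers removed
below can be written as standalone C. This mirrors helper_vprtybw and
helper_vprtybd from the diff (the XOR of the least significant bit of
every byte in the element); the function names prtybw_elem and
prtybd_elem are hypothetical, and this is not the TCG code the patch
adds.

```c
#include <assert.h>
#include <stdint.h>

/* Parity of the low bit of each of the 4 bytes in a word. */
static inline uint32_t prtybw_elem(uint32_t b)
{
    uint32_t res = b ^ (b >> 16);   /* fold bytes 2,3 onto bytes 0,1 */
    res ^= res >> 8;                /* fold byte 1 onto byte 0 */
    return res & 1;
}

/* Parity of the low bit of each of the 8 bytes in a doubleword. */
static inline uint64_t prtybd_elem(uint64_t b)
{
    uint64_t res = b ^ (b >> 32);   /* fold the high word onto the low word */
    res ^= res >> 16;
    res ^= res >> 8;
    return res & 1;
}

static inline void prtyb_self_test(void)
{
    assert(prtybw_elem(0x01000000u) == 1);  /* one byte with LSB set */
    assert(prtybw_elem(0x01010101u) == 0);  /* four bytes with LSB set */
}
```

Each fold is a shift plus a xor, so the word variant needs two folds
and the doubleword variant three before the final mask.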

vprtybw:
rept    loop    master      patch
8       12500   0,00991200  0,00626300 (-36.8%)
25      4000    0,01040600  0,00550600 (-47.1%)
100     1000    0,01084500  0,00601100 (-44.6%)
500     200     0,01490600  0,01394100 (-6.5%)
2500    40      0,03285100  0,05143000 (+56.6%)
8000    12      0,08971500  0,14662500 (+63.4%)

vprtybd:
rept    loop    master      patch
8       12500   0,00665800  0,00652800 (-2.0%)
25      4000    0,00589300  0,00670400 (+13.8%)
100     1000    0,00646800  0,00743900 (+15.0%)
500     200     0,01065800  0,01586400 (+48.8%)
2500    40      0,03497000  0,07180100 (+105.3%)
8000    12      0,09242200  0,21566600 (+133.3%)

vprtybq:
rept    loop    master      patch
8       12500   0,00656200  0,00665800 (+1.5%)
25      4000    0,00620500  0,00644900 (+3.9%)
100     1000    0,00707500  0,00764900 (+8.1%)
500     200     0,01203500  0,01349500 (+12.1%)
2500    40      0,03505700  0,04123100 (+17.6%)
8000    12      0,09590600  0,11586700 (+20.8%)

I wasn't expecting such a performance loss in both VPRTYBD and VPRTYBQ,
and I'm not sure if it's worth moving those instructions. Comparing the
assembly of the helper with the TCGop, they are pretty similar, so
I'm not sure why vprtybd took so much more time.

Signed-off-by: Lucas Mateus Castro (alqotel) 
---
 target/ppc/helper.h |  4 +-
 target/ppc/insn32.decode|  4 ++
 target/ppc/int_helper.c | 25 +
 target/ppc/translate/vmx-impl.c.inc | 80 +++--
 target/ppc/translate/vmx-ops.c.inc  |  3 --
 5 files changed, 83 insertions(+), 33 deletions(-)

diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index b2e910b089..a06193bc67 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -193,9 +193,7 @@ DEF_HELPER_FLAGS_3(vslo, TCG_CALL_NO_RWG, void, avr, avr, 
avr)
 DEF_HELPER_FLAGS_3(vsro, TCG_CALL_NO_RWG, void, avr, avr, avr)
 DEF_HELPER_FLAGS_3(vsrv, TCG_CALL_NO_RWG, void, avr, avr, avr)
 DEF_HELPER_FLAGS_3(vslv, TCG_CALL_NO_RWG, void, avr, avr, avr)
-DEF_HELPER_FLAGS_2(vprtybw, TCG_CALL_NO_RWG, void, avr, avr)
-DEF_HELPER_FLAGS_2(vprtybd, TCG_CALL_NO_RWG, void, avr, avr)
-DEF_HELPER_FLAGS_2(vprtybq, TCG_CALL_NO_RWG, void, avr, avr)
+DEF_HELPER_FLAGS_3(VPRTYBQ, TCG_CALL_NO_RWG, void, avr, avr, i32)
 DEF_HELPER_FLAGS_5(vaddsbs, TCG_CALL_NO_RWG, void, avr, avr, avr, avr, i32)
 DEF_HELPER_FLAGS_5(vaddshs, TCG_CALL_NO_RWG, void, avr, avr, avr, avr, i32)
 DEF_HELPER_FLAGS_5(vaddsws, TCG_CALL_NO_RWG, void, avr, avr, avr, avr, i32)
diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index 2658dd3395..aa4968e6b9 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -529,6 +529,10 @@ VCTZDM  000100 . . . 1000100@VX
 VPDEPD  000100 . . . 10111001101@VX
 VPEXTD  000100 . . . 10110001101@VX
 
+VPRTYBD 000100 . 01001 . 1100010@VX_tb
+VPRTYBQ 000100 . 01010 . 1100010@VX_tb
+VPRTYBW 000100 . 01000 . 1100010@VX_tb
+
 ## Vector Permute and Formatting Instruction
 
 VEXTDUBVLX  000100 . . . . 011000   @VA
diff --git a/target/ppc/int_helper.c b/target/ppc/int_helper.c
index c7fd0d1faa..c6ce4665fa 100644
--- a/target/ppc/int_helper.c
+++ b/target/ppc/int_helper.c
@@ -492,31 +492,8 @@ static inline void set_vscr_sat(CPUPPCState *env)
 env->vscr_sat.u32[0] = 1;
 }
 
-/* vprtybw */
-void helper_vprtybw(ppc_avr_t *r, ppc_avr_t *b)
-{
-int i;
-for (i = 0; i < ARRAY_SIZE(r->u32); i++) {
-uint64_t res = b->u32[i] ^ (b->u32[i] >> 16);
-res ^= res >> 8;
-r->u32[i] = res & 1;
-}
-}
-
-/* vprtybd */
-void helper_vprtybd(ppc_avr_t *r, ppc_avr_t *b)
-{
-int i;
-for (i = 0; i < ARRAY_SIZE(r->u64); i++) {
-uint64_t res = b->u64[i] ^ (b->u64[i] >> 32);
-res ^= res >> 16;
-res ^= res >> 8;
-r->u64[i] = res & 1;
-}
-}
-
 /* vprtybq */
-void helper_vprtybq(ppc_avr_t *r, ppc_avr_t *b)
+void helper_VPRTYBQ(ppc_avr_t *r, ppc_avr_t *b, uint32_t v)
 {
 uint64_t res = b->u64[0] ^ b->u64[1];
 res ^= res >> 32;
diff --git a/target/ppc/translate/vmx-impl.c.inc 
b/target/ppc/translate/vmx-impl.c.inc
index b9a9e83ab3..23601942bc 100644
--- a/target/ppc/translate/vmx-impl.c.inc
+++ b/target/ppc/translate/vmx-impl.c.inc
@@ -1659,9 +1659,83 @@ GEN_VXFORM_NOA_ENV(vrfim, 5, 11);
 GEN_VXFORM_NOA_ENV(vrfin, 5, 8);
 GEN_VXFORM_NOA_ENV(vrfip, 5, 10);
 GEN_VXFORM_NOA_ENV(vrfiz, 5, 9);
-GEN_VXFORM_NOA(vprtybw, 1, 24);
-GEN_VXFORM_NOA(vprtybd, 1, 24);
-GEN_VXFORM_NOA(vprtybq, 1, 24);
+
+static void 

[PATCH v2 1/2] vvfat: allow some writes to bootsector

2022-10-10 Thread Hervé Poussineau
The 'reserved1' field in the bootsector is used to mark the volume as
dirty or in need of verification.
Allow writes to the bootsector that only change the 'reserved1' field.

This fixes I/O errors on Windows guests.

Resolves: https://bugs.launchpad.net/qemu/+bug/1889421
Signed-off-by: Hervé Poussineau 
---
 block/vvfat.c | 26 +-
 1 file changed, 25 insertions(+), 1 deletion(-)

diff --git a/block/vvfat.c b/block/vvfat.c
index d6dd919683d..ae53f0d7283 100644
--- a/block/vvfat.c
+++ b/block/vvfat.c
@@ -2993,11 +2993,35 @@ DLOG(checkpoint());
 
 vvfat_close_current_file(s);
 
+if (sector_num == s->offset_to_bootsector && nb_sectors == 1) {
+/*
+ * Write on bootsector. Allow only changing the reserved1 field,
+ * used to mark volume dirtiness
+ */
+unsigned char *bootsector = s->first_sectors
++ s->offset_to_bootsector * 0x200;
+/*
+ * LATER TODO: if FAT32, this is wrong (see init_directories(),
+ * which always creates a FAT16 bootsector)
+ */
+const int reserved1_offset = offsetof(bootsector_t, u.fat16.reserved1);
+
+for (i = 0; i < 0x200; i++) {
+if (i != reserved1_offset && bootsector[i] != buf[i]) {
+fprintf(stderr, "Tried to write to protected bootsector\n");
+return -1;
+}
+}
+
+/* Update bootsector with the only updatable byte, and return success 
*/
+bootsector[reserved1_offset] = buf[reserved1_offset];
+return 0;
+}
+
 /*
  * Some sanity checks:
  * - do not allow writing to the boot sector
  */
-
 if (sector_num < s->offset_to_fat)
 return -1;
 
-- 
2.36.2




[PATCH v2 03/12] target/ppc: Move V(ADD|SUB)CUW to decodetree and use gvec

2022-10-10 Thread Lucas Mateus Castro(alqotel)
From: "Lucas Mateus Castro (alqotel)" 

This patch moves VADDCUW and VSUBCUW to decodetree with gvec using an
implementation based on the helper, with the main difference being
changing the -1 (aka all bits set to 1) result returned by cmp when
true to +1. It also implemented a .fni4 version of those instructions
and dropped the helper.
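
To make the -1 versus +1 point concrete, here is a plain-C sketch of the
per-element logic. The first two functions mirror the removed
helper_vaddcuw and helper_vsubcuw from the diff below. The third shows
one possible way to turn an all-ones compare mask into 1; that step is
an assumption for illustration (the patch may do it differently), and
all three function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Carry out of a + b on 32 bits: the sum overflows iff b > ~a. */
static inline uint32_t addcuw_elem(uint32_t a, uint32_t b)
{
    return (uint32_t)~a < b;
}

/* Carry (that is, no borrow) out of a - b on 32 bits. */
static inline uint32_t subcuw_elem(uint32_t a, uint32_t b)
{
    return a >= b;
}

/*
 * A vector compare yields all ones (-1) for "true"; the instruction
 * needs 1.  Masking with 1 is one way to convert (assumed here).
 */
static inline uint32_t cmp_mask_to_one(uint32_t mask)
{
    return mask & 1u;
}

static inline void carry_self_test(void)
{
    assert(addcuw_elem(0xFFFFFFFFu, 1u) == 1);   /* wraps around */
    assert(subcuw_elem(4u, 5u) == 0);            /* needs a borrow */
    assert(cmp_mask_to_one(0xFFFFFFFFu) == 1);
}
```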

vaddcuw:
rept    loop    master      patch
8       12500   0,01008200  0,00612400 (-39.3%)
25      4000    0,01091500  0,00471600 (-56.8%)
100     1000    0,01332500  0,00593700 (-55.4%)
500     200     0,01998500  0,01275700 (-36.2%)
2500    40      0,04704300  0,04364300 (-7.2%)
8000    12      0,10748200  0,11241000 (+4.6%)

vsubcuw:
rept    loop    master      patch
8       12500   0,01226200  0,00571600 (-53.4%)
25      4000    0,01493500  0,00462100 (-69.1%)
100     1000    0,01522700  0,00455100 (-70.1%)
500     200     0,02384600  0,01133500 (-52.5%)
2500    40      0,04935200  0,03178100 (-35.6%)
8000    12      0,09039900  0,09440600 (+4.4%)

Overall there was a gain in performance, but the TCGop code was still
slightly bigger in the new version (it went from 4 to 5).

Signed-off-by: Lucas Mateus Castro (alqotel) 
---
 target/ppc/helper.h |  2 -
 target/ppc/insn32.decode|  2 +
 target/ppc/int_helper.c | 18 -
 target/ppc/translate/vmx-impl.c.inc | 61 +++--
 target/ppc/translate/vmx-ops.c.inc  |  3 +-
 5 files changed, 60 insertions(+), 26 deletions(-)

diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index f02a9497b7..f7047ed2aa 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -193,11 +193,9 @@ DEF_HELPER_FLAGS_3(vslo, TCG_CALL_NO_RWG, void, avr, avr, 
avr)
 DEF_HELPER_FLAGS_3(vsro, TCG_CALL_NO_RWG, void, avr, avr, avr)
 DEF_HELPER_FLAGS_3(vsrv, TCG_CALL_NO_RWG, void, avr, avr, avr)
 DEF_HELPER_FLAGS_3(vslv, TCG_CALL_NO_RWG, void, avr, avr, avr)
-DEF_HELPER_FLAGS_3(vaddcuw, TCG_CALL_NO_RWG, void, avr, avr, avr)
 DEF_HELPER_FLAGS_2(vprtybw, TCG_CALL_NO_RWG, void, avr, avr)
 DEF_HELPER_FLAGS_2(vprtybd, TCG_CALL_NO_RWG, void, avr, avr)
 DEF_HELPER_FLAGS_2(vprtybq, TCG_CALL_NO_RWG, void, avr, avr)
-DEF_HELPER_FLAGS_3(vsubcuw, TCG_CALL_NO_RWG, void, avr, avr, avr)
 DEF_HELPER_FLAGS_5(vaddsbs, TCG_CALL_NO_RWG, void, avr, avr, avr, avr, i32)
 DEF_HELPER_FLAGS_5(vaddshs, TCG_CALL_NO_RWG, void, avr, avr, avr, avr, i32)
 DEF_HELPER_FLAGS_5(vaddsws, TCG_CALL_NO_RWG, void, avr, avr, avr, avr, i32)
diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index 9a509e84df..aebc7b73c8 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -608,12 +608,14 @@ VRLQNM  000100 . . . 00101000101
@VX
 
 ## Vector Integer Arithmetic Instructions
 
+VADDCUW 000100 . . . 0011000@VX
 VADDCUQ 000100 . . . 0010100@VX
 VADDUQM 000100 . . . 001@VX
 
 VADDEUQM000100 . . . . 00   @VA
 VADDECUQ000100 . . . . 01   @VA
 
+VSUBCUW 000100 . . . 1011000@VX
 VSUBCUQ 000100 . . . 1010100@VX
 VSUBUQM 000100 . . . 101@VX
 
diff --git a/target/ppc/int_helper.c b/target/ppc/int_helper.c
index ae1ba8084d..f8dd12e8ae 100644
--- a/target/ppc/int_helper.c
+++ b/target/ppc/int_helper.c
@@ -492,15 +492,6 @@ static inline void set_vscr_sat(CPUPPCState *env)
 env->vscr_sat.u32[0] = 1;
 }
 
-void helper_vaddcuw(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
-{
-int i;
-
-for (i = 0; i < ARRAY_SIZE(r->u32); i++) {
-r->u32[i] = ~a->u32[i] < b->u32[i];
-}
-}
-
 /* vprtybw */
 void helper_vprtybw(ppc_avr_t *r, ppc_avr_t *b)
 {
@@ -1962,15 +1953,6 @@ void helper_vsro(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t 
*b)
 #endif
 }
 
-void helper_vsubcuw(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
-{
-int i;
-
-for (i = 0; i < ARRAY_SIZE(r->u32); i++) {
-r->u32[i] = a->u32[i] >= b->u32[i];
-}
-}
-
 void helper_vsumsws(CPUPPCState *env, ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
 {
 int64_t t;
diff --git a/target/ppc/translate/vmx-impl.c.inc 
b/target/ppc/translate/vmx-impl.c.inc
index 3acd585a2f..f52485a5f1 100644
--- a/target/ppc/translate/vmx-impl.c.inc
+++ b/target/ppc/translate/vmx-impl.c.inc
@@ -803,8 +803,6 @@ GEN_VXFORM(vsrv, 2, 28);
 GEN_VXFORM(vslv, 2, 29);
 GEN_VXFORM(vslo, 6, 16);
 GEN_VXFORM(vsro, 6, 17);
-GEN_VXFORM(vaddcuw, 0, 6);
-GEN_VXFORM(vsubcuw, 0, 22);
 
 static bool do_vector_gvec3_VX(DisasContext *ctx, arg_VX *a, int vece,
void (*gen_gvec)(unsigned, uint32_t, uint32_t,
@@ -2847,8 +2845,6 @@ static void gen_xpnd04_2(DisasContext *ctx)
 }
 
 
-GEN_VXFORM_DUAL(vsubcuw, PPC_ALTIVEC, PPC_NONE, \
-xpnd04_1, PPC_NONE, PPC2_ISA300)
 GEN_VXFORM_DUAL(vsubsws, PPC_ALTIVEC, 

Re: [PATCH REPOST] hw/i386/e820: remove legacy reserved entries for e820

2022-10-10 Thread Ani Sinha
On Wed, Sep 7, 2022 at 18:18 Ani Sinha  wrote:

> On Wed, Aug 31, 2022 at 10:23 AM Ani Sinha  wrote:
> >
> > e820 reserved entries were used before the dynamic entries with fw
> config files
> > were introduced. Please see the following change:
> > 7d67110f2d9a6("pc: add etc/e820 fw_cfg file")
> >
> > Identical support was introduced into seabios as well with the following
> commit:
> > ce39bd4031820 ("Add support for etc/e820 fw_cfg file")
> >
> > Both the above commits are now quite old. QEMU machines 1.7 and newer no
> longer
> > use the reserved entries. Seabios uses fw config files and
> > dynamic e820 entries by default and only falls back to using reserved
> entries
> > when it has to work with old qemu (versions earlier than 1.7). Please see
> > functions qemu_cfg_e820() and qemu_early_e820(). It is safe to remove
> legacy
> > FW_CFG_E820_TABLE and associated code now as QEMU 7.0 has deprecated
> i440fx
> > machines 1.7 and older. It would be incredibly rare to run the latest
> qemu
> > version with a very old version of seabios that did not support fw
> config files
> > for e820.
> >
> > As far as I could see, edk2/OVMF never supported reserved entries and
> uses fw
> > config files from the beginning. So there should be no incompatibilities
> with
> > OVMF as well.
> >
> > CC: Gerd Hoffmann 
> > Signed-off-by: Ani Sinha 
> > Acked-by: Gerd Hoffmann 
>
> michael, please pick this one as well for the next pull. thanks.


Michael, it seems you missed this one.


>
> > ---
> >  hw/i386/e820_memory_layout.c | 20 +---
> >  hw/i386/e820_memory_layout.h |  8 
> >  hw/i386/fw_cfg.c |  3 ---
> >  hw/i386/fw_cfg.h |  1 -
> >  hw/i386/microvm.c|  2 --
> >  5 files changed, 1 insertion(+), 33 deletions(-)
> >
> > Please see:
> >
> https://patchwork.ozlabs.org/project/qemu-devel/patch/20220420043904.1225153-1-...@anisinha.ca/
> > for the previous post. Now that we are in 7.2 devel cycle, time to push
> > this patch.
> >
> > diff --git a/hw/i386/e820_memory_layout.c b/hw/i386/e820_memory_layout.c
> > index bcf9eaf837..06970ac44a 100644
> > --- a/hw/i386/e820_memory_layout.c
> > +++ b/hw/i386/e820_memory_layout.c
> > @@ -11,29 +11,11 @@
> >  #include "e820_memory_layout.h"
> >
> >  static size_t e820_entries;
> > -struct e820_table e820_reserve;
> >  struct e820_entry *e820_table;
> >
> >  int e820_add_entry(uint64_t address, uint64_t length, uint32_t type)
> >  {
> > -int index = le32_to_cpu(e820_reserve.count);
> > -struct e820_entry *entry;
> > -
> > -if (type != E820_RAM) {
> > -/* old FW_CFG_E820_TABLE entry -- reservations only */
> > -if (index >= E820_NR_ENTRIES) {
> > -return -EBUSY;
> > -}
> > -entry = _reserve.entry[index++];
> > -
> > -entry->address = cpu_to_le64(address);
> > -entry->length = cpu_to_le64(length);
> > -entry->type = cpu_to_le32(type);
> > -
> > -e820_reserve.count = cpu_to_le32(index);
> > -}
> > -
> > -/* new "etc/e820" file -- include ram too */
> > +/* new "etc/e820" file -- include ram and reserved entries */
> >  e820_table = g_renew(struct e820_entry, e820_table, e820_entries +
> 1);
> >  e820_table[e820_entries].address = cpu_to_le64(address);
> >  e820_table[e820_entries].length = cpu_to_le64(length);
> > diff --git a/hw/i386/e820_memory_layout.h b/hw/i386/e820_memory_layout.h
> > index 04f93780f9..7c239aa033 100644
> > --- a/hw/i386/e820_memory_layout.h
> > +++ b/hw/i386/e820_memory_layout.h
> > @@ -16,20 +16,12 @@
> >  #define E820_NVS4
> >  #define E820_UNUSABLE   5
> >
> > -#define E820_NR_ENTRIES 16
> > -
> >  struct e820_entry {
> >  uint64_t address;
> >  uint64_t length;
> >  uint32_t type;
> >  } QEMU_PACKED __attribute((__aligned__(4)));
> >
> > -struct e820_table {
> > -uint32_t count;
> > -struct e820_entry entry[E820_NR_ENTRIES];
> > -} QEMU_PACKED __attribute((__aligned__(4)));
> > -
> > -extern struct e820_table e820_reserve;
> >  extern struct e820_entry *e820_table;
> >
> >  int e820_add_entry(uint64_t address, uint64_t length, uint32_t type);
> > diff --git a/hw/i386/fw_cfg.c b/hw/i386/fw_cfg.c
> > index a283785a8d..72a42f3c66 100644
> > --- a/hw/i386/fw_cfg.c
> > +++ b/hw/i386/fw_cfg.c
> > @@ -36,7 +36,6 @@ const char *fw_cfg_arch_key_name(uint16_t key)
> >  {FW_CFG_ACPI_TABLES, "acpi_tables"},
> >  {FW_CFG_SMBIOS_ENTRIES, "smbios_entries"},
> >  {FW_CFG_IRQ0_OVERRIDE, "irq0_override"},
> > -{FW_CFG_E820_TABLE, "e820_table"},
> >  {FW_CFG_HPET, "hpet"},
> >  };
> >
> > @@ -127,8 +126,6 @@ FWCfgState *fw_cfg_arch_create(MachineState *ms,
> >  #endif
> >  fw_cfg_add_i32(fw_cfg, FW_CFG_IRQ0_OVERRIDE, 1);
> >
> > -fw_cfg_add_bytes(fw_cfg, FW_CFG_E820_TABLE,
> > - _reserve, sizeof(e820_reserve));
> >  fw_cfg_add_file(fw_cfg, "etc/e820", e820_table,
> >  sizeof(struct 

[PATCH v2 02/12] target/ppc: Move VMH[R]ADDSHS instruction to decodetree

2022-10-10 Thread Lucas Mateus Castro(alqotel)
From: "Lucas Mateus Castro (alqotel)" 

This patch moves VMHADDSHS and VMHRADDSHS to decodetree; I couldn't find
a satisfactory inline TCG implementation, so the helpers are kept.

vmhaddshs:
rept    loop    master      patch
8       12500   0,02983400  0,02648500 (-11.2%)
25      4000    0,02946000  0,02518000 (-14.5%)
100     1000    0,03104300  0,02638000 (-15.0%)
500     200     0,04002000  0,03502500 (-12.5%)
2500    40      0,08090100  0,07562200 (-6.5%)
8000    12      0,19242600  0,18626800 (-3.2%)

vmhraddshs:
rept    loop    master      patch
8       12500   0,03078600  0,02851000 (-7.4%)
25      4000    0,02793200  0,02746900 (-1.7%)
100     1000    0,02886000  0,02839900 (-1.6%)
500     200     0,03714700  0,03799200 (+2.3%)
2500    40      0,07948000  0,07852200 (-1.2%)
8000    12      0,19049800  0,18813900 (-1.2%)

Signed-off-by: Lucas Mateus Castro (alqotel) 
Reviewed-by: Richard Henderson 
---
 target/ppc/helper.h | 4 ++--
 target/ppc/insn32.decode| 2 ++
 target/ppc/int_helper.c | 4 ++--
 target/ppc/translate/vmx-impl.c.inc | 5 +++--
 target/ppc/translate/vmx-ops.c.inc  | 1 -
 5 files changed, 9 insertions(+), 7 deletions(-)

diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index 9c562ab00e..f02a9497b7 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -258,8 +258,8 @@ DEF_HELPER_4(vpkuhum, void, env, avr, avr, avr)
 DEF_HELPER_4(vpkuwum, void, env, avr, avr, avr)
 DEF_HELPER_4(vpkudum, void, env, avr, avr, avr)
 DEF_HELPER_FLAGS_3(vpkpx, TCG_CALL_NO_RWG, void, avr, avr, avr)
-DEF_HELPER_5(vmhaddshs, void, env, avr, avr, avr, avr)
-DEF_HELPER_5(vmhraddshs, void, env, avr, avr, avr, avr)
+DEF_HELPER_5(VMHADDSHS, void, env, avr, avr, avr, avr)
+DEF_HELPER_5(VMHRADDSHS, void, env, avr, avr, avr, avr)
 DEF_HELPER_FLAGS_4(VMSUMUHM, TCG_CALL_NO_RWG, void, avr, avr, avr, avr)
 DEF_HELPER_5(VMSUMUHS, void, env, avr, avr, avr, avr)
 DEF_HELPER_FLAGS_4(VMSUMSHM, TCG_CALL_NO_RWG, void, avr, avr, avr, avr)
diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index 7445455a12..9a509e84df 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -694,6 +694,8 @@ VMSUMCUD000100 . . . . 010111   @VA
 VMSUMUDM000100 . . . . 100011   @VA
 
 VMLADDUHM   000100 . . . . 100010   @VA
+VMHADDSHS   000100 . . . . 10   @VA
+VMHRADDSHS  000100 . . . . 11   @VA
 
 ## Vector String Instructions
 
diff --git a/target/ppc/int_helper.c b/target/ppc/int_helper.c
index 0d25000b2a..ae1ba8084d 100644
--- a/target/ppc/int_helper.c
+++ b/target/ppc/int_helper.c
@@ -939,7 +939,7 @@ target_ulong helper_vctzlsbb(ppc_avr_t *r)
 return count;
 }
 
-void helper_vmhaddshs(CPUPPCState *env, ppc_avr_t *r, ppc_avr_t *a,
+void helper_VMHADDSHS(CPUPPCState *env, ppc_avr_t *r, ppc_avr_t *a,
   ppc_avr_t *b, ppc_avr_t *c)
 {
 int sat = 0;
@@ -957,7 +957,7 @@ void helper_vmhaddshs(CPUPPCState *env, ppc_avr_t *r, 
ppc_avr_t *a,
 }
 }
 
-void helper_vmhraddshs(CPUPPCState *env, ppc_avr_t *r, ppc_avr_t *a,
+void helper_VMHRADDSHS(CPUPPCState *env, ppc_avr_t *r, ppc_avr_t *a,
ppc_avr_t *b, ppc_avr_t *c)
 {
 int sat = 0;
diff --git a/target/ppc/translate/vmx-impl.c.inc 
b/target/ppc/translate/vmx-impl.c.inc
index 9f18c6d4f2..3acd585a2f 100644
--- a/target/ppc/translate/vmx-impl.c.inc
+++ b/target/ppc/translate/vmx-impl.c.inc
@@ -2521,7 +2521,7 @@ static void glue(gen_, name0##_##name1)(DisasContext 
*ctx)  \
 tcg_temp_free_ptr(rd);  \
 }
 
-GEN_VAFORM_PAIRED(vmhaddshs, vmhraddshs, 16)
+GEN_VAFORM_PAIRED(vmaddfp, vnmsubfp, 23)
 
 static bool do_va_helper(DisasContext *ctx, arg_VA *a,
 void (*gen_helper)(TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_ptr))
@@ -2620,7 +2620,8 @@ static bool do_va_env_helper(DisasContext *ctx, arg_VA *a,
 TRANS_FLAGS(ALTIVEC, VMSUMUHS, do_va_env_helper, gen_helper_VMSUMUHS)
 TRANS_FLAGS(ALTIVEC, VMSUMSHS, do_va_env_helper, gen_helper_VMSUMSHS)
 
-GEN_VAFORM_PAIRED(vmaddfp, vnmsubfp, 23)
+TRANS_FLAGS(ALTIVEC, VMHADDSHS, do_va_env_helper, gen_helper_VMHADDSHS)
+TRANS_FLAGS(ALTIVEC, VMHRADDSHS, do_va_env_helper, gen_helper_VMHRADDSHS)
 
 GEN_VXFORM_NOA(vclzb, 1, 28)
 GEN_VXFORM_NOA(vclzh, 1, 29)
diff --git a/target/ppc/translate/vmx-ops.c.inc 
b/target/ppc/translate/vmx-ops.c.inc
index a3a0fd0650..7cd9d40e06 100644
--- a/target/ppc/translate/vmx-ops.c.inc
+++ b/target/ppc/translate/vmx-ops.c.inc
@@ -219,7 +219,6 @@ GEN_VXFORM_UIMM(vctsxs, 5, 15),
 
 #define GEN_VAFORM_PAIRED(name0, name1, opc2)   \
 GEN_HANDLER(name0##_##name1, 0x04, opc2, 0xFF, 0x, PPC_ALTIVEC)
-GEN_VAFORM_PAIRED(vmhaddshs, vmhraddshs, 16),
 GEN_VAFORM_PAIRED(vmaddfp, vnmsubfp, 23),
 
 GEN_VXFORM_DUAL(vclzb, vpopcntb, 1, 28, 

[PATCH v2 01/12] target/ppc: Moved VMLADDUHM to decodetree and use gvec

2022-10-10 Thread Lucas Mateus Castro(alqotel)
From: "Lucas Mateus Castro (alqotel)" 

This patch moves VMLADDUHM to decodetree and creates a gvec implementation
using mul_vec and add_vec.

rept    loop    master      patch
8       12500   0,01810500  0,00903100 (-50.1%)
25      4000    0,01739400  0,00747700 (-57.0%)
100     1000    0,01843600  0,00901400 (-51.1%)
500     200     0,02574600  0,01971000 (-23.4%)
2500    40      0,05921600  0,07121800 (+20.3%)
8000    12      0,15326700  0,21725200 (+41.7%)

The significant difference in performance when REPT is low and LOOP is
high I think is due to the fact that the new implementation has a higher
translation time, as when using a helper only 5 TCGop are used but with
the patch a total of 10 TCGop are needed (Power lacks a direct mul_vec
equivalent so this instruction is implemented with the help of 5 others,
vmuleu, vmulou, vmrgh, vmrgl and vpkum).
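
As a reminder of what the instruction computes, VMLADDUHM is a modular
multiply-add on eight 16-bit elements: each result is the low 16 bits of
a*b + c. A plain-C sketch follows; it is only an illustration of the
semantics, not the gvec code in this patch, and the names
mladduhm_elem and mladduhm_vec are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* One halfword lane: (a * b + c) mod 2^16. */
static inline uint16_t mladduhm_elem(uint16_t a, uint16_t b, uint16_t c)
{
    /* Widen to 32-bit unsigned first so the product cannot overflow int. */
    uint32_t prod = (uint32_t)a * b;
    return (uint16_t)(prod + c);
}

/* All eight lanes of a 128-bit vector. */
static inline void mladduhm_vec(uint16_t r[8], const uint16_t a[8],
                                const uint16_t b[8], const uint16_t c[8])
{
    for (int i = 0; i < 8; i++) {
        r[i] = mladduhm_elem(a[i], b[i], c[i]);
    }
}

static inline void mladduhm_self_test(void)
{
    assert(mladduhm_elem(2, 3, 4) == 10);
    /* 0x100 * 0x100 = 0x10000, which wraps to 0 before adding 5 */
    assert(mladduhm_elem(0x100, 0x100, 5) == 5);
}
```

This is the lane-wise multiply followed by a lane-wise add that mul_vec
and add_vec express in the gvec version.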

Signed-off-by: Lucas Mateus Castro (alqotel) 
Reviewed-by: Richard Henderson 
---
 target/ppc/helper.h |  2 +-
 target/ppc/insn32.decode|  2 ++
 target/ppc/int_helper.c |  3 +-
 target/ppc/translate.c  |  1 -
 target/ppc/translate/vmx-impl.c.inc | 48 ++---
 5 files changed, 35 insertions(+), 21 deletions(-)

diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index 57eee07256..9c562ab00e 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -264,7 +264,7 @@ DEF_HELPER_FLAGS_4(VMSUMUHM, TCG_CALL_NO_RWG, void, avr, 
avr, avr, avr)
 DEF_HELPER_5(VMSUMUHS, void, env, avr, avr, avr, avr)
 DEF_HELPER_FLAGS_4(VMSUMSHM, TCG_CALL_NO_RWG, void, avr, avr, avr, avr)
 DEF_HELPER_5(VMSUMSHS, void, env, avr, avr, avr, avr)
-DEF_HELPER_FLAGS_4(vmladduhm, TCG_CALL_NO_RWG, void, avr, avr, avr, avr)
+DEF_HELPER_FLAGS_5(VMLADDUHM, TCG_CALL_NO_RWG, void, avr, avr, avr, avr, i32)
 DEF_HELPER_FLAGS_2(mtvscr, TCG_CALL_NO_RWG, void, env, i32)
 DEF_HELPER_FLAGS_1(mfvscr, TCG_CALL_NO_RWG, i32, env)
 DEF_HELPER_3(lvebx, void, env, avr, tl)
diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index a5249ee32c..7445455a12 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -693,6 +693,8 @@ VMSUMUHS000100 . . . . 100111   @VA
 VMSUMCUD000100 . . . . 010111   @VA
 VMSUMUDM000100 . . . . 100011   @VA
 
+VMLADDUHM   000100 . . . . 100010   @VA
+
 ## Vector String Instructions
 
 VSTRIBL 000100 . 0 . . 001101   @VX_tb_rc
diff --git a/target/ppc/int_helper.c b/target/ppc/int_helper.c
index 696096100b..0d25000b2a 100644
--- a/target/ppc/int_helper.c
+++ b/target/ppc/int_helper.c
@@ -974,7 +974,8 @@ void helper_vmhraddshs(CPUPPCState *env, ppc_avr_t *r, 
ppc_avr_t *a,
 }
 }
 
-void helper_vmladduhm(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b, ppc_avr_t *c)
+void helper_VMLADDUHM(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b, ppc_avr_t *c,
+  uint32_t v)
 {
 int i;
 
diff --git a/target/ppc/translate.c b/target/ppc/translate.c
index e810842925..11f729c60c 100644
--- a/target/ppc/translate.c
+++ b/target/ppc/translate.c
@@ -6921,7 +6921,6 @@ GEN_HANDLER(lvsl, 0x1f, 0x06, 0x00, 0x0001, 
PPC_ALTIVEC),
 GEN_HANDLER(lvsr, 0x1f, 0x06, 0x01, 0x0001, PPC_ALTIVEC),
 GEN_HANDLER(mfvscr, 0x04, 0x2, 0x18, 0x001ff800, PPC_ALTIVEC),
 GEN_HANDLER(mtvscr, 0x04, 0x2, 0x19, 0x03ff, PPC_ALTIVEC),
-GEN_HANDLER(vmladduhm, 0x04, 0x11, 0xFF, 0x00000000, PPC_ALTIVEC),
 #if defined(TARGET_PPC64)
 GEN_HANDLER_E(maddhd_maddhdu, 0x04, 0x18, 0xFF, 0x00000000, PPC_NONE,
   PPC2_ISA300),
diff --git a/target/ppc/translate/vmx-impl.c.inc 
b/target/ppc/translate/vmx-impl.c.inc
index e644ad3236..9f18c6d4f2 100644
--- a/target/ppc/translate/vmx-impl.c.inc
+++ b/target/ppc/translate/vmx-impl.c.inc
@@ -2523,24 +2523,6 @@ static void glue(gen_, name0##_##name1)(DisasContext 
*ctx)  \
 
 GEN_VAFORM_PAIRED(vmhaddshs, vmhraddshs, 16)
 
-static void gen_vmladduhm(DisasContext *ctx)
-{
-TCGv_ptr ra, rb, rc, rd;
-if (unlikely(!ctx->altivec_enabled)) {
-gen_exception(ctx, POWERPC_EXCP_VPU);
-return;
-}
-ra = gen_avr_ptr(rA(ctx->opcode));
-rb = gen_avr_ptr(rB(ctx->opcode));
-rc = gen_avr_ptr(rC(ctx->opcode));
-rd = gen_avr_ptr(rD(ctx->opcode));
-gen_helper_vmladduhm(rd, ra, rb, rc);
-tcg_temp_free_ptr(ra);
-tcg_temp_free_ptr(rb);
-tcg_temp_free_ptr(rc);
-tcg_temp_free_ptr(rd);
-}
-
 static bool do_va_helper(DisasContext *ctx, arg_VA *a,
 void (*gen_helper)(TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_ptr))
 {
@@ -2569,6 +2551,36 @@ TRANS_FLAGS2(ALTIVEC_207, VSUBECUQ, do_va_helper, 
gen_helper_VSUBECUQ)
 TRANS_FLAGS(ALTIVEC, VPERM, do_va_helper, gen_helper_VPERM)
 TRANS_FLAGS2(ISA300, VPERMR, do_va_helper, gen_helper_VPERMR)
 
+static void gen_vmladduhm_vec(unsigned vece, TCGv_vec t, TCGv_vec a, TCGv_vec 
b,
+  TCGv_vec c)

Re: [PATCH 2/2] hw/cxl: Allow CXL type-3 devices to be persistent or volatile

2022-10-10 Thread Davidlohr Bueso

On Thu, 06 Oct 2022, Gregory Price wrote:


diff --git a/hw/cxl/cxl-mailbox-utils.c b/hw/cxl/cxl-mailbox-utils.c
index bc1bb18844..dfec11a1b5 100644
--- a/hw/cxl/cxl-mailbox-utils.c
+++ b/hw/cxl/cxl-mailbox-utils.c
@@ -138,7 +138,7 @@ static ret_code cmd_firmware_update_get_info(struct cxl_cmd 
*cmd,
} QEMU_PACKED *fw_info;
QEMU_BUILD_BUG_ON(sizeof(*fw_info) != 0x50);

-if (cxl_dstate->pmem_size < (256 << 20)) {
+if (cxl_dstate->mem_size < (256 << 20)) {


Nit, but we probably want to abstract this out (in a pre-patch), just like on
the kernel side, i.e.:

#define CXL_CAPACITY_MULTIPLIER   0x10000000 /* SZ_256M */
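
A rough sketch of what such a pre-patch could provide, assuming the
multiplier is 256 MiB (SZ_256M). The two helper names below are invented
for illustration and do not exist in QEMU or in the posted patch:

```c
#include <stdbool.h>
#include <stdint.h>

/* 256 MiB, the granularity in which the capacities are reported. */
#define CXL_CAPACITY_MULTIPLIER 0x10000000ULL

/* Hypothetical helper: true if a size in bytes is a whole number of units. */
static inline bool cxl_capacity_is_aligned(uint64_t bytes)
{
    return (bytes % CXL_CAPACITY_MULTIPLIER) == 0;
}

/* Hypothetical helper: convert a size in bytes to multiplier units. */
static inline uint64_t cxl_capacity_units(uint64_t bytes)
{
    return bytes / CXL_CAPACITY_MULTIPLIER;
}
```

The open-coded "256 << 20" checks and divisions in the hunks would then
collapse into calls to helpers of this kind.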


return CXL_MBOX_INTERNAL_ERROR;
}

@@ -281,9 +281,10 @@ static ret_code cmd_identify_memory_device(struct cxl_cmd 
*cmd,

CXLType3Dev *ct3d = container_of(cxl_dstate, CXLType3Dev, cxl_dstate);
CXLType3Class *cvc = CXL_TYPE3_GET_CLASS(ct3d);
-uint64_t size = cxl_dstate->pmem_size;

-if (!QEMU_IS_ALIGNED(size, 256 << 20)) {
+if ((!QEMU_IS_ALIGNED(cxl_dstate->mem_size, 256 << 20)) ||


Is the full mem_size check here really needed?


+(!QEMU_IS_ALIGNED(cxl_dstate->vmem_size, 256 << 20)) ||
+(!QEMU_IS_ALIGNED(cxl_dstate->pmem_size, 256 << 20))) {
return CXL_MBOX_INTERNAL_ERROR;
}

@@ -293,8 +294,9 @@ static ret_code cmd_identify_memory_device(struct cxl_cmd 
*cmd,
/* PMEM only */


This comment should be removed.


snprintf(id->fw_revision, 0x10, "BWFW VERSION %02d", 0);

-id->total_capacity = size / (256 << 20);
-id->persistent_capacity = size / (256 << 20);
+id->total_capacity = cxl_dstate->mem_size / (256 << 20);
+id->persistent_capacity = cxl_dstate->pmem_size / (256 << 20);
+id->volatile_capacity = cxl_dstate->vmem_size / (256 << 20);
id->lsa_size = cvc->get_lsa_size(ct3d);

*len = sizeof(*id);
@@ -312,16 +314,16 @@ static ret_code cmd_ccls_get_partition_info(struct 
cxl_cmd *cmd,
uint64_t next_pmem;
} QEMU_PACKED *part_info = (void *)cmd->payload;
QEMU_BUILD_BUG_ON(sizeof(*part_info) != 0x20);
-uint64_t size = cxl_dstate->pmem_size;

-if (!QEMU_IS_ALIGNED(size, 256 << 20)) {
+if ((!QEMU_IS_ALIGNED(cxl_dstate->mem_size, 256 << 20)) ||
+(!QEMU_IS_ALIGNED(cxl_dstate->vmem_size, 256 << 20)) ||
+(!QEMU_IS_ALIGNED(cxl_dstate->pmem_size, 256 << 20))) {
return CXL_MBOX_INTERNAL_ERROR;
}

-/* PMEM only */
-part_info->active_vmem = 0;
+part_info->active_vmem = cxl_dstate->vmem_size / (256 << 20);
part_info->next_vmem = 0;
-part_info->active_pmem = size / (256 << 20);
+part_info->active_pmem = cxl_dstate->pmem_size / (256 << 20);
part_info->next_pmem = 0;

*len = sizeof(*part_info);
diff --git a/hw/mem/cxl_type3.c b/hw/mem/cxl_type3.c
index 1837c1c83a..998461dac1 100644
--- a/hw/mem/cxl_type3.c
+++ b/hw/mem/cxl_type3.c
@@ -100,18 +100,47 @@ static bool cxl_setup_memory(CXLType3Dev *ct3d, Error 
**errp)
DeviceState *ds = DEVICE(ct3d);
MemoryRegion *mr;
char *name;
+bool is_pmem = false;

-if (!ct3d->hostmem) {
-error_setg(errp, "memdev property must be set");
+/*
+ * FIXME: For now we only allow a single host memory region.
+ * Handle the deprecated memdev property usage cases
+ */
+if (!ct3d->hostmem && !ct3d->host_vmem && !ct3d->host_pmem) {
+error_setg(errp, "at least one memdev property must be set");
return false;
+} else if (ct3d->hostmem && (ct3d->host_vmem || ct3d->host_pmem)) {
+error_setg(errp, "deprecated [memdev] cannot be used with new "
+ "persistent and volatile memdev properties");
+return false;
+} else if (ct3d->hostmem) {
+warn_report("memdev is deprecated and defaults to pmem. "
+"Use (persistent|volatile)-memdev instead.");
+is_pmem = true;
+} else {
+if (ct3d->host_vmem && ct3d->host_pmem) {
+error_setg(errp, "Multiple memory devices not supported yet");
+return false;
+}
+is_pmem = !!ct3d->host_pmem;
+ct3d->hostmem = ct3d->host_pmem ? ct3d->host_pmem : ct3d->host_vmem;


This hides the details of the changes that volatile support requires, for
example in build_dvsecs(). IMO, using two backends (without breaking current
configs, of course) should be the initial version, not something to leave
pending.
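
For reference, the property handling in the hunk quoted above boils down to
the decision table sketched below. This is a standalone restatement for
discussion only: the enum and function names are made up, and the booleans
stand in for "the corresponding memdev property is set".

```c
#include <stdbool.h>

/* Outcome of the memdev property checks (hypothetical names). */
enum memdev_choice {
    MEMDEV_ERR_NONE_SET,         /* "at least one memdev property must be set" */
    MEMDEV_ERR_MIXED_DEPRECATED, /* deprecated memdev combined with new ones */
    MEMDEV_ERR_BOTH_NEW,         /* "Multiple memory devices not supported yet" */
    MEMDEV_LEGACY_PMEM,          /* deprecated memdev, treated as pmem */
    MEMDEV_PMEM,                 /* persistent memdev only */
    MEMDEV_VMEM,                 /* volatile memdev only */
};

static enum memdev_choice classify_memdev(bool has_memdev, bool has_vmem,
                                          bool has_pmem)
{
    if (!has_memdev && !has_vmem && !has_pmem) {
        return MEMDEV_ERR_NONE_SET;
    }
    if (has_memdev && (has_vmem || has_pmem)) {
        return MEMDEV_ERR_MIXED_DEPRECATED;
    }
    if (has_memdev) {
        return MEMDEV_LEGACY_PMEM;
    }
    if (has_vmem && has_pmem) {
        return MEMDEV_ERR_BOTH_NEW;
    }
    return has_pmem ? MEMDEV_PMEM : MEMDEV_VMEM;
}
```

Supporting two backends from the start would presumably turn the
MEMDEV_ERR_BOTH_NEW case into a valid configuration.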

Thanks,
Davidlohr



[PULL 53/55] x86: pci: acpi: reorder Device's _DSM method

2022-10-10 Thread Michael S. Tsirkin
From: Igor Mammedov 

Align the _DSM method in the empty slot descriptor with its
position in a populated slot descriptor.
Expected change:
  +Device (SE8)
  +{
  +Name (_ADR, 0x001D0000)  // _ADR: Address
  +Name (ASUN, 0x1D)
   Method (_DSM, 4, Serialized)  // _DSM: Device-Specific Method
   {
   Local0 = Package (0x02)
   {
   BSEL,
   ASUN
   }
   Return (PDSM (Arg0, Arg1, Arg2, Arg3, Local0))
   }
  -}

  -Device (SE8)
  -{
  -Name (_ADR, 0x001D0000)  // _ADR: Address
  -Name (ASUN, 0x1D)
   Name (_SUN, 0x1D)  // _SUN: Slot User Number
   Method (_EJ0, 1, NotSerialized)  // _EJx: Eject Device
   {
   PCEJ (BSEL, _SUN)
   }
  +}

i.e. put _DSM right after ASUN, with _SUN/_EJ0 following it.

That will eliminate contextual changes (which cause test failures)
when follow-up patches merge the code generating populated and empty
slot descriptors.
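
As background for the listing above: the device name, _ADR and ASUN/_SUN
values all derive from the PCI devfn. The sketch below mirrors the
"S%.02X" name format and the "slot << 16 | func" expression visible in
acpi-build.c, using the usual devfn split (slot in bits 7:3, function in
bits 2:0). The function names are made up for illustration:

```c
#include <stdint.h>
#include <stdio.h>

/* Usual PCI devfn split: slot in bits 7:3, function in bits 2:0. */
static unsigned pci_slot(unsigned devfn) { return (devfn >> 3) & 0x1f; }
static unsigned pci_func(unsigned devfn) { return devfn & 0x07; }

/* _ADR for a PCI device: slot in the upper 16 bits, function in the lower. */
static uint32_t acpi_pci_adr(unsigned devfn)
{
    return (uint32_t)pci_slot(devfn) << 16 | pci_func(devfn);
}

/* Device name as generated with the "S%.02X" format, e.g. "SE8". */
static void acpi_slot_name(unsigned devfn, char out[4])
{
    snprintf(out, 4, "S%.02X", devfn & 0xff);
}
```

For devfn 0xE8 this yields the name SE8, slot 0x1D (the ASUN/_SUN value)
and function 0, with the slot number in the upper 16 bits of _ADR.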

Signed-off-by: Igor Mammedov 
Message-Id: <20220701133515.137890-16-imamm...@redhat.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 hw/i386/acpi-build.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 6342467af4..fc23cb08c3 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -444,15 +444,13 @@ static void build_append_pci_bus_devices(Aml 
*parent_scope, PCIBus *bus,
 dev = aml_device("S%.02X", devfn);
 aml_append(dev, aml_name_decl("_ADR", aml_int(adr)));
 aml_append(dev, aml_name_decl("ASUN", aml_int(slot)));
+aml_append(dev, aml_pci_device_dsm());
 aml_append(dev, aml_name_decl("_SUN", aml_int(slot)));
 method = aml_method("_EJ0", 1, AML_NOTSERIALIZED);
 aml_append(method,
 aml_call2("PCEJ", aml_name("BSEL"), aml_name("_SUN"))
 );
 aml_append(dev, method);
-
-aml_append(dev, aml_pci_device_dsm());
-
 aml_append(parent_scope, dev);
 
 build_append_pcihp_notify_entry(notify_method, slot);
-- 
MST




Re: [PULL 29/55] Revert "intel_iommu: Fix irqchip / X2APIC configuration checks"

2022-10-10 Thread David Woodhouse
On Mon, 2022-10-10 at 13:30 -0400, Michael S. Tsirkin wrote:
> From: Peter Xu <pet...@redhat.com>
> 
> It's true that when vcpus<=255 we don't require the length of 32bit APIC
> IDs.  However here since we already have EIM=ON it means the hypervisor
> will declare the VM as x2apic supported (e.g. VT-d ECAP register will have
> EIM bit 4 set), so the guest should assume the APIC IDs are 32bits width
> even if vcpus<=255.  In short, commit 77250171bdc breaks any simple cmdline
> that wants to boot a VM with >=9 but <=255 vcpus with:

I find that paragraph really hard to parse. What does it even mean that
"guest should assume the APIC IDs are 32bits"? 

In practice, all the EIM bit does is *allow* 32 bits of APIC ID in the
tables. Which is perfectly fine if there are only 254 CPUs anyway, and
we never need to use a higher value.

I *think* the actual problem here is when logical addressing is used,
which puts the APIC cluster ID into higher bits? But it's kind of weird
that the message doesn't mention that at all?

That's fixable by just setting the X2APIC_PHYSICAL bit in the ACPI
FADT, isn't it? Then the only values that a guest may put into those
fields — 32-bit fields or not — are lower than 0xff anyway.
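
To illustrate the logical-addressing point: in x2APIC cluster mode the
logical ID is derived from the APIC ID as
(ID[19:4] << 16) | (1 << ID[3:0]). The sketch below is not QEMU code and
its function names are hypothetical. It shows that the logical ID stops
fitting in 8 bits at APIC ID 8, i.e. with the 9th vCPU, which would be
consistent with the ">=9" in the quoted commit message, although nothing in
this thread confirms that this is the actual cause:

```c
#include <stdbool.h>
#include <stdint.h>

/* x2APIC cluster-mode logical ID: cluster number in bits 31:16,
 * one-hot position within the cluster in bits 15:0. */
static uint32_t x2apic_logical_id(uint32_t apic_id)
{
    uint32_t cluster = (apic_id >> 4) & 0xffff;
    return (cluster << 16) | (1u << (apic_id & 0xf));
}

/* True if a destination value fits into an 8-bit destination field. */
static bool fits_in_8_bits(uint32_t dest)
{
    return dest <= 0xff;
}
```

With physical addressing the destination is the APIC ID itself, so with
fewer than 255 vCPUs it always fits, which is the case the reply describes.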





[PULL 46/55] tests: acpi: whitelist pc/q35 DSDT before switching _DSM to use ASUN

2022-10-10 Thread Michael S. Tsirkin
From: Igor Mammedov 

Signed-off-by: Igor Mammedov 
Message-Id: <20220701133515.137890-9-imamm...@redhat.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 tests/qtest/bios-tables-test-allowed-diff.h | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/tests/qtest/bios-tables-test-allowed-diff.h 
b/tests/qtest/bios-tables-test-allowed-diff.h
index dfb8523c8b..1983fa596b 100644
--- a/tests/qtest/bios-tables-test-allowed-diff.h
+++ b/tests/qtest/bios-tables-test-allowed-diff.h
@@ -1 +1,15 @@
 /* List of comma-separated changed AML files to ignore */
+"tests/data/acpi/pc/DSDT",
+"tests/data/acpi/pc/DSDT.acpierst",
+"tests/data/acpi/pc/DSDT.acpihmat",
+"tests/data/acpi/pc/DSDT.bridge",
+"tests/data/acpi/pc/DSDT.cphp",
+"tests/data/acpi/pc/DSDT.dimmpxm",
+"tests/data/acpi/pc/DSDT.hpbridge",
+"tests/data/acpi/pc/DSDT.ipmikcs",
+"tests/data/acpi/pc/DSDT.memhp",
+"tests/data/acpi/pc/DSDT.nohpet",
+"tests/data/acpi/pc/DSDT.numamem",
+"tests/data/acpi/pc/DSDT.roothp",
+"tests/data/acpi/q35/DSDT.bridge",
+"tests/data/acpi/q35/DSDT.multi-bridge",
-- 
MST




[PULL 52/55] tests: acpi: whitelist pc/q35 DSDT before moving _ADR field

2022-10-10 Thread Michael S. Tsirkin
From: Igor Mammedov 

Signed-off-by: Igor Mammedov 
Message-Id: <20220701133515.137890-15-imamm...@redhat.com>
---
 tests/qtest/bios-tables-test-allowed-diff.h | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/tests/qtest/bios-tables-test-allowed-diff.h 
b/tests/qtest/bios-tables-test-allowed-diff.h
index dfb8523c8b..1983fa596b 100644
--- a/tests/qtest/bios-tables-test-allowed-diff.h
+++ b/tests/qtest/bios-tables-test-allowed-diff.h
@@ -1 +1,15 @@
 /* List of comma-separated changed AML files to ignore */
+"tests/data/acpi/pc/DSDT",
+"tests/data/acpi/pc/DSDT.acpierst",
+"tests/data/acpi/pc/DSDT.acpihmat",
+"tests/data/acpi/pc/DSDT.bridge",
+"tests/data/acpi/pc/DSDT.cphp",
+"tests/data/acpi/pc/DSDT.dimmpxm",
+"tests/data/acpi/pc/DSDT.hpbridge",
+"tests/data/acpi/pc/DSDT.ipmikcs",
+"tests/data/acpi/pc/DSDT.memhp",
+"tests/data/acpi/pc/DSDT.nohpet",
+"tests/data/acpi/pc/DSDT.numamem",
+"tests/data/acpi/pc/DSDT.roothp",
+"tests/data/acpi/q35/DSDT.bridge",
+"tests/data/acpi/q35/DSDT.multi-bridge",
-- 
MST




[PULL 54/55] tests: acpi: update expected blobs

2022-10-10 Thread Michael S. Tsirkin
From: Igor Mammedov 

Expected change:
  +Device (SE8)
  +{
  +Name (_ADR, 0x001D0000)  // _ADR: Address
  +Name (ASUN, 0x1D)
   Method (_DSM, 4, Serialized)  // _DSM: Device-Specific Method
   {
   Local0 = Package (0x02)
   {
   BSEL,
   ASUN
   }
   Return (PDSM (Arg0, Arg1, Arg2, Arg3, Local0))
   }
  -}

  -Device (SE8)
  -{
  -Name (_ADR, 0x001D0000)  // _ADR: Address
  -Name (ASUN, 0x1D)
   Name (_SUN, 0x1D)  // _SUN: Slot User Number
   Method (_EJ0, 1, NotSerialized)  // _EJx: Eject Device
   {
   PCEJ (BSEL, _SUN)
   }
  +}

Signed-off-by: Igor Mammedov 
Message-Id: <20220701133515.137890-17-imamm...@redhat.com>
---
 tests/qtest/bios-tables-test-allowed-diff.h |  14 --
 tests/data/acpi/pc/DSDT | Bin 6422 -> 6422 bytes
 tests/data/acpi/pc/DSDT.acpierst| Bin 6382 -> 6382 bytes
 tests/data/acpi/pc/DSDT.acpihmat| Bin 7747 -> 7747 bytes
 tests/data/acpi/pc/DSDT.bridge  | Bin 9496 -> 9496 bytes
 tests/data/acpi/pc/DSDT.cphp| Bin 6886 -> 6886 bytes
 tests/data/acpi/pc/DSDT.dimmpxm | Bin 8076 -> 8076 bytes
 tests/data/acpi/pc/DSDT.hpbridge| Bin 6382 -> 6382 bytes
 tests/data/acpi/pc/DSDT.ipmikcs | Bin 6494 -> 6494 bytes
 tests/data/acpi/pc/DSDT.memhp   | Bin 7781 -> 7781 bytes
 tests/data/acpi/pc/DSDT.nohpet  | Bin 6280 -> 6280 bytes
 tests/data/acpi/pc/DSDT.numamem | Bin 6428 -> 6428 bytes
 tests/data/acpi/pc/DSDT.roothp  | Bin 6656 -> 6656 bytes
 tests/data/acpi/q35/DSDT.bridge | Bin 11449 -> 11449 bytes
 tests/data/acpi/q35/DSDT.multi-bridge   | Bin 8640 -> 8640 bytes
 15 files changed, 14 deletions(-)

diff --git a/tests/qtest/bios-tables-test-allowed-diff.h 
b/tests/qtest/bios-tables-test-allowed-diff.h
index 1983fa596b..dfb8523c8b 100644
--- a/tests/qtest/bios-tables-test-allowed-diff.h
+++ b/tests/qtest/bios-tables-test-allowed-diff.h
@@ -1,15 +1 @@
 /* List of comma-separated changed AML files to ignore */
-"tests/data/acpi/pc/DSDT",
-"tests/data/acpi/pc/DSDT.acpierst",
-"tests/data/acpi/pc/DSDT.acpihmat",
-"tests/data/acpi/pc/DSDT.bridge",
-"tests/data/acpi/pc/DSDT.cphp",
-"tests/data/acpi/pc/DSDT.dimmpxm",
-"tests/data/acpi/pc/DSDT.hpbridge",
-"tests/data/acpi/pc/DSDT.ipmikcs",
-"tests/data/acpi/pc/DSDT.memhp",
-"tests/data/acpi/pc/DSDT.nohpet",
-"tests/data/acpi/pc/DSDT.numamem",
-"tests/data/acpi/pc/DSDT.roothp",
-"tests/data/acpi/q35/DSDT.bridge",
-"tests/data/acpi/q35/DSDT.multi-bridge",
diff --git a/tests/data/acpi/pc/DSDT b/tests/data/acpi/pc/DSDT
index 
fd79a602a2aaac0f7d91d2ee2b1af8f2e6cdd4b3..da2a3e5c0551ac2d1d8a0a40b92d3235d5757475
 100644
GIT binary patch
delta 864
zcmY+?y-LGS90l-*wn@``wE1k(q;(Y(B-P?1MB7!t;^Yv006|wlL088>7hl5SE;tLi

[PULL 44/55] x86: acpi: _DSM: use Package to pass parameters

2022-10-10 Thread Michael S. Tsirkin
From: Igor Mammedov 

The number of arguments that can be passed to a method is limited
in ACPI. The following patches need to pass more parameters to the
PDSM method and would hit that limit.

Prepare for this by passing a structure (Package) to the method,
which lets us work around the argument limit.
Pass all standard _DSM arguments to PDSM as is, and
pack the custom parameters into a Package that is passed as
the last argument to PDSM.

Signed-off-by: Igor Mammedov 
Message-Id: <20220701133515.137890-7-imamm...@redhat.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 hw/i386/acpi-build.c | 40 +++-
 1 file changed, 27 insertions(+), 13 deletions(-)

diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 6d02eed12c..a19900c4e4 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -431,11 +431,17 @@ static void build_append_pci_bus_devices(Aml 
*parent_scope, PCIBus *bus,
 );
 aml_append(dev, method);
 method = aml_method("_DSM", 4, AML_SERIALIZED);
-aml_append(method,
-aml_return(aml_call6("PDSM", aml_arg(0), aml_arg(1),
- aml_arg(2), aml_arg(3),
- aml_name("BSEL"), aml_name("_SUN")))
-);
+{
+Aml *params = aml_local(0);
+Aml *pkg = aml_package(2);
+aml_append(pkg, aml_name("BSEL"));
+aml_append(pkg, aml_name("_SUN"));
+aml_append(method, aml_store(pkg, params));
+aml_append(method,
+aml_return(aml_call5("PDSM", aml_arg(0), aml_arg(1),
+ aml_arg(2), aml_arg(3), params))
+);
+}
 aml_append(dev, method);
 aml_append(parent_scope, dev);
 
@@ -480,10 +486,17 @@ static void build_append_pci_bus_devices(Aml 
*parent_scope, PCIBus *bus,
  */
 aml_append(dev, aml_name_decl("ASUN", aml_int(slot)));
 method = aml_method("_DSM", 4, AML_SERIALIZED);
-aml_append(method, aml_return(
-aml_call6("PDSM", aml_arg(0), aml_arg(1), aml_arg(2),
-  aml_arg(3), aml_name("BSEL"), aml_name("ASUN"))
-));
+{
+Aml *params = aml_local(0);
+Aml *pkg = aml_package(2);
+aml_append(pkg, aml_name("BSEL"));
+aml_append(pkg, aml_name("ASUN"));
+aml_append(method, aml_store(pkg, params));
+aml_append(method, aml_return(
+aml_call5("PDSM", aml_arg(0), aml_arg(1), aml_arg(2),
+  aml_arg(3), params)
+));
+}
 aml_append(dev, method);
 }
 
@@ -580,12 +593,13 @@ Aml *aml_pci_device_dsm(void)
 Aml *acpi_index = aml_local(2);
 Aml *zero = aml_int(0);
 Aml *one = aml_int(1);
-Aml *bnum = aml_arg(4);
 Aml *func = aml_arg(2);
 Aml *rev = aml_arg(1);
-Aml *sunum = aml_arg(5);
+Aml *params = aml_arg(4);
+Aml *bnum = aml_derefof(aml_index(params, aml_int(0)));
+Aml *sunum = aml_derefof(aml_index(params, aml_int(1)));
 
-method = aml_method("PDSM", 6, AML_SERIALIZED);
+method = aml_method("PDSM", 5, AML_SERIALIZED);
 
 /* get supported functions */
 ifctx = aml_if(aml_equal(func, zero));
@@ -662,10 +676,10 @@ Aml *aml_pci_device_dsm(void)
 * update acpi-index to actual value
 */
aml_append(ifctx, aml_store(acpi_index, aml_index(ret, zero)));
+   aml_append(ifctx, aml_return(ret));
 }
 
 aml_append(method, ifctx);
-aml_append(method, aml_return(ret));
 return method;
 }
 
-- 
MST




[PULL 55/55] x86: pci: acpi: consolidate PCI slots creation

2022-10-10 Thread Michael S. Tsirkin
From: Igor Mammedov 

No functional changes nor AML bytecode changes.
Consolidate code that generates empty and populated slot
descriptors. Besides eliminating duplication,
it helps consolidate conditions for generating
parts of Device{} descriptor in one place, which makes
code more compact and easier to read.

Signed-off-by: Igor Mammedov 
Message-Id: <20220701133515.137890-18-imamm...@redhat.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 hw/i386/acpi-build.c | 111 +--
 1 file changed, 54 insertions(+), 57 deletions(-)

diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index fc23cb08c3..4f54b61904 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -427,13 +427,41 @@ static void build_append_pci_bus_devices(Aml 
*parent_scope, PCIBus *bus,
 int func = PCI_FUNC(devfn);
 /* ACPI spec: 1.0b: Table 6-2 _ADR Object Bus Types, PCI type */
 int adr = slot << 16 | func;
-bool hotplug_enabled_dev;
-bool bridge_in_acpi;
-bool cold_plugged_bridge;
+bool hotpluggbale_slot = false;
+bool bridge_in_acpi = false;
+bool cold_plugged_bridge = false;
+bool is_vga = false;
+
+if (pdev) {
+pc = PCI_DEVICE_GET_CLASS(pdev);
+dc = DEVICE_GET_CLASS(pdev);
+
+if (pc->class_id == PCI_CLASS_BRIDGE_ISA) {
+continue;
+}
+
+is_vga = pc->class_id == PCI_CLASS_DISPLAY_VGA;
 
-if (!pdev) {
 /*
- * add hotplug slots for non present devices.
+ * Cold plugged bridges aren't themselves hot-pluggable.
+ * Hotplugged bridges *are* hot-pluggable.
+ */
+cold_plugged_bridge = pc->is_bridge && !DEVICE(pdev)->hotplugged;
+bridge_in_acpi =  cold_plugged_bridge && pcihp_bridge_en;
+
+hotpluggbale_slot = bsel && dc->hotpluggable &&
+!cold_plugged_bridge;
+
+/*
+ * allow describing coldplugged bridges in ACPI even if they are 
not
+ * on function 0, as they are not unpluggable, for all other 
devices
+ * generate description only for function 0 per slot
+ */
+if (func && !bridge_in_acpi) {
+continue;
+}
+} else {
+/*
  * hotplug is supported only for non-multifunction device
  * so generate device description only for function 0
  */
@@ -441,46 +469,11 @@ static void build_append_pci_bus_devices(Aml 
*parent_scope, PCIBus *bus,
 if (pci_bus_is_express(bus) && slot > 0) {
 break;
 }
-dev = aml_device("S%.02X", devfn);
-aml_append(dev, aml_name_decl("_ADR", aml_int(adr)));
-aml_append(dev, aml_name_decl("ASUN", aml_int(slot)));
-aml_append(dev, aml_pci_device_dsm());
-aml_append(dev, aml_name_decl("_SUN", aml_int(slot)));
-method = aml_method("_EJ0", 1, AML_NOTSERIALIZED);
-aml_append(method,
-aml_call2("PCEJ", aml_name("BSEL"), aml_name("_SUN"))
-);
-aml_append(dev, method);
-aml_append(parent_scope, dev);
-
-build_append_pcihp_notify_entry(notify_method, slot);
+/* mark it as empty hotpluggable slot */
+hotpluggbale_slot = true;
+} else {
+continue;
 }
-continue;
-}
-
-pc = PCI_DEVICE_GET_CLASS(pdev);
-dc = DEVICE_GET_CLASS(pdev);
-
-/*
- * Cold plugged bridges aren't themselves hot-pluggable.
- * Hotplugged bridges *are* hot-pluggable.
- */
-cold_plugged_bridge = pc->is_bridge && !DEVICE(pdev)->hotplugged;
-bridge_in_acpi =  cold_plugged_bridge && pcihp_bridge_en;
-
-hotplug_enabled_dev = bsel && dc->hotpluggable && !cold_plugged_bridge;
-
-if (pc->class_id == PCI_CLASS_BRIDGE_ISA) {
-continue;
-}
-
-/*
- * allow describing coldplugged bridges in ACPI even if they are not
- * on function 0, as they are not unpluggable, for all other devices
- * generate description only for function 0 per slot
- */
-if (func && !bridge_in_acpi) {
-continue;
 }
 
 /* start to compose PCI device descriptor */
@@ -496,7 +489,7 @@ static void build_append_pci_bus_devices(Aml *parent_scope, 
PCIBus *bus,
 aml_append(dev, aml_pci_device_dsm());
 }
 
-if (pc->class_id == PCI_CLASS_DISPLAY_VGA) {
+if (is_vga) {
 /* add VGA specific AML methods */
 int s3d;
 
@@ -517,19 +510,10 @@ static void build_append_pci_bus_devices(Aml 
*parent_scope, PCIBus *bus,
 method = 

[PULL 47/55] x86: acpi: cleanup PCI device _DSM duplication

2022-10-10 Thread Michael S. Tsirkin
From: Igor Mammedov 

Add an ASUN variable to hotpluggable slots and use it instead of
_SUN, which has the same value, so that the _DSM code can be
reused on both branches (hot- and non-hotpluggable).
No functional change.

Signed-off-by: Igor Mammedov 
Message-Id: <20220701133515.137890-10-imamm...@redhat.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 hw/i386/acpi-build.c | 56 +---
 1 file changed, 27 insertions(+), 29 deletions(-)

diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index a19900c4e4..eb92b05197 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -374,6 +374,25 @@ build_facs(GArray *table_data)
 g_array_append_vals(table_data, reserved, 40); /* Reserved */
 }
 
+Aml *aml_pci_device_dsm(void)
+{
+Aml *method;
+
+method = aml_method("_DSM", 4, AML_SERIALIZED);
+{
+Aml *params = aml_local(0);
+Aml *pkg = aml_package(2);
+aml_append(pkg, aml_name("BSEL"));
+aml_append(pkg, aml_name("ASUN"));
+aml_append(method, aml_store(pkg, params));
+aml_append(method,
+aml_return(aml_call5("PDSM", aml_arg(0), aml_arg(1),
+ aml_arg(2), aml_arg(3), params))
+);
+}
+return method;
+}
+
 static void build_append_pcihp_notify_entry(Aml *method, int slot)
 {
 Aml *if_ctx;
@@ -423,26 +442,17 @@ static void build_append_pci_bus_devices(Aml 
*parent_scope, PCIBus *bus,
 break;
 }
 dev = aml_device("S%.02X", devfn);
-aml_append(dev, aml_name_decl("_SUN", aml_int(slot)));
+aml_append(dev, aml_name_decl("ASUN", aml_int(slot)));
 aml_append(dev, aml_name_decl("_ADR", aml_int(adr)));
+aml_append(dev, aml_name_decl("_SUN", aml_int(slot)));
 method = aml_method("_EJ0", 1, AML_NOTSERIALIZED);
 aml_append(method,
 aml_call2("PCEJ", aml_name("BSEL"), aml_name("_SUN"))
 );
 aml_append(dev, method);
-method = aml_method("_DSM", 4, AML_SERIALIZED);
-{
-Aml *params = aml_local(0);
-Aml *pkg = aml_package(2);
-aml_append(pkg, aml_name("BSEL"));
-aml_append(pkg, aml_name("_SUN"));
-aml_append(method, aml_store(pkg, params));
-aml_append(method,
-aml_return(aml_call5("PDSM", aml_arg(0), aml_arg(1),
- aml_arg(2), aml_arg(3), params))
-);
-}
-aml_append(dev, method);
+
+aml_append(dev, aml_pci_device_dsm());
+
 aml_append(parent_scope, dev);
 
 build_append_pcihp_notify_entry(notify_method, slot);
@@ -485,19 +495,7 @@ static void build_append_pci_bus_devices(Aml 
*parent_scope, PCIBus *bus,
  * enumeration order in linux kernel, so use another variable for 
it
  */
 aml_append(dev, aml_name_decl("ASUN", aml_int(slot)));
-method = aml_method("_DSM", 4, AML_SERIALIZED);
-{
-Aml *params = aml_local(0);
-Aml *pkg = aml_package(2);
-aml_append(pkg, aml_name("BSEL"));
-aml_append(pkg, aml_name("ASUN"));
-aml_append(method, aml_store(pkg, params));
-aml_append(method, aml_return(
-aml_call5("PDSM", aml_arg(0), aml_arg(1), aml_arg(2),
-  aml_arg(3), params)
-));
-}
-aml_append(dev, method);
+aml_append(dev, aml_pci_device_dsm());
 }
 
 if (pc->class_id == PCI_CLASS_DISPLAY_VGA) {
@@ -585,7 +583,7 @@ static void build_append_pci_bus_devices(Aml *parent_scope, 
PCIBus *bus,
 qobject_unref(bsel);
 }
 
-Aml *aml_pci_device_dsm(void)
+static Aml *aml_pci_pdsm(void)
 {
 Aml *method, *UUID, *ifctx, *ifctx1;
 Aml *ret = aml_local(0);
@@ -1368,7 +1366,7 @@ static void build_x86_acpi_pci_hotplug(Aml *table, 
uint64_t pcihp_addr)
 aml_append(method, aml_return(aml_local(0)));
 aml_append(scope, method);
 
-aml_append(scope, aml_pci_device_dsm());
+aml_append(scope, aml_pci_pdsm());
 
 aml_append(table, scope);
 }
-- 
MST




[PULL 42/55] tests: acpi: whitelist pc/q35 DSDT due to HPET AML move

2022-10-10 Thread Michael S. Tsirkin
From: Igor Mammedov 

Signed-off-by: Igor Mammedov 
Message-Id: <20220701133515.137890-5-imamm...@redhat.com>
---
 tests/qtest/bios-tables-test-allowed-diff.h | 34 +
 1 file changed, 34 insertions(+)

diff --git a/tests/qtest/bios-tables-test-allowed-diff.h 
b/tests/qtest/bios-tables-test-allowed-diff.h
index dfb8523c8b..452145badd 100644
--- a/tests/qtest/bios-tables-test-allowed-diff.h
+++ b/tests/qtest/bios-tables-test-allowed-diff.h
@@ -1 +1,35 @@
 /* List of comma-separated changed AML files to ignore */
+"tests/data/acpi/pc/DSDT",
+"tests/data/acpi/pc/DSDT.acpierst",
+"tests/data/acpi/pc/DSDT.acpihmat",
+"tests/data/acpi/pc/DSDT.bridge",
+"tests/data/acpi/pc/DSDT.cphp",
+"tests/data/acpi/pc/DSDT.dimmpxm",
+"tests/data/acpi/pc/DSDT.hpbridge",
+"tests/data/acpi/pc/DSDT.hpbrroot",
+"tests/data/acpi/pc/DSDT.ipmikcs",
+"tests/data/acpi/pc/DSDT.memhp",
+"tests/data/acpi/pc/DSDT.numamem",
+"tests/data/acpi/pc/DSDT.roothp",
+"tests/data/acpi/q35/DSDT",
+"tests/data/acpi/q35/DSDT.acpierst",
+"tests/data/acpi/q35/DSDT.acpihmat",
+"tests/data/acpi/q35/DSDT.applesmc",
+"tests/data/acpi/q35/DSDT.bridge",
+"tests/data/acpi/q35/DSDT.cphp",
+"tests/data/acpi/q35/DSDT.cxl",
+"tests/data/acpi/q35/DSDT.dimmpxm",
+"tests/data/acpi/q35/DSDT.ipmibt",
+"tests/data/acpi/q35/DSDT.ipmismbus",
+"tests/data/acpi/q35/DSDT.ivrs",
+"tests/data/acpi/q35/DSDT.memhp",
+"tests/data/acpi/q35/DSDT.mmio64",
+"tests/data/acpi/q35/DSDT.multi-bridge",
+"tests/data/acpi/q35/DSDT.numamem",
+"tests/data/acpi/q35/DSDT.pvpanic-isa",
+"tests/data/acpi/q35/DSDT.tis.tpm12",
+"tests/data/acpi/q35/DSDT.tis.tpm2",
+"tests/data/acpi/q35/DSDT.viot",
+"tests/data/acpi/q35/DSDT.xapic",
+"tests/data/acpi/q35/DSDT.nohpet",
+"tests/data/acpi/pc/DSDT.nohpet",
-- 
MST




[PULL 50/55] x86: pci: acpi: reorder Device's _ADR and _SUN fields

2022-10-10 Thread Michael S. Tsirkin
From: Igor Mammedov 

no functional change, align order of fields in empty slot
descriptor with a populated slot ordering.
Expected diff:
  -Name (_SUN, 0x0X)  // _SUN: Slot User Number
   Name (_ADR, 0xY)  // _ADR: Address
  ...
  +Name (_SUN, 0xX)  // _SUN: Slot User Number

that will eliminate contextual changes (causing test failures)
when follow up patches merge code generating populated and empty
slots descriptors.

Put the mandatory _ADR as the 1st field, then ASUN, as it can be
present for both populated and empty slots, and only then _SUN,
which is present only when the slot is hotpluggable.

Signed-off-by: Igor Mammedov 
Message-Id: <20220701133515.137890-13-imamm...@redhat.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 hw/i386/acpi-build.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index eb92b05197..6342467af4 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -442,8 +442,8 @@ static void build_append_pci_bus_devices(Aml *parent_scope, 
PCIBus *bus,
 break;
 }
 dev = aml_device("S%.02X", devfn);
-aml_append(dev, aml_name_decl("ASUN", aml_int(slot)));
 aml_append(dev, aml_name_decl("_ADR", aml_int(adr)));
+aml_append(dev, aml_name_decl("ASUN", aml_int(slot)));
 aml_append(dev, aml_name_decl("_SUN", aml_int(slot)));
 method = aml_method("_EJ0", 1, AML_NOTSERIALIZED);
 aml_append(method,
-- 
MST




[PULL 40/55] acpi: x86: deduplicate HPET AML building

2022-10-10 Thread Michael S. Tsirkin
From: Igor Mammedov 

HPET AML doesn't depend on piix4 or q35; move the code building it
to common scope to avoid duplication.

Signed-off-by: Igor Mammedov 
Message-Id: <20220701133515.137890-3-imamm...@redhat.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 hw/i386/acpi-build.c | 10 --
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 0355bd3dda..67b532f5a5 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -1467,9 +1467,6 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
 aml_append(sb_scope, dev);
 aml_append(dsdt, sb_scope);
 
-if (misc->has_hpet) {
-build_hpet_aml(dsdt);
-}
 build_piix4_isa_bridge(dsdt);
 if (pm->pcihp_bridge_en || pm->pcihp_root_en) {
 build_x86_acpi_pci_hotplug(dsdt, pm->pcihp_io_base);
@@ -1515,9 +1512,6 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
 
 aml_append(dsdt, sb_scope);
 
-if (misc->has_hpet) {
-build_hpet_aml(dsdt);
-}
 build_q35_isa_bridge(dsdt);
 if (pm->pcihp_bridge_en) {
 build_x86_acpi_pci_hotplug(dsdt, pm->pcihp_io_base);
@@ -1528,6 +1522,10 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
 }
 }
 
+if (misc->has_hpet) {
+build_hpet_aml(dsdt);
+}
+
 if (vmbus_bridge) {
 sb_scope = aml_scope("_SB");
 aml_append(sb_scope, build_vmbus_device_aml(vmbus_bridge));
-- 
MST




Re: [PATCH RFC] hw/cxl: type 3 devices can now present volatile or persistent memory

2022-10-10 Thread Davidlohr Bueso

On Mon, 10 Oct 2022, Jonathan Cameron wrote:


I wonder if we care to emulate beyond 1 volatile and 1 persistent.
Sure devices might exist, but if we can exercise all the code paths
with a simpler configuration, perhaps we don't need to handle the
more complex ones?


Yes, I completely agree. 1 of each seems like the best balance between
exercising code paths vs complexity.

Thanks,
Davidlohr



[PULL 45/55] tests: acpi: update expected blobs

2022-10-10 Thread Michael S. Tsirkin
From: Igor Mammedov 

An intermediate blobs update to keep changes (last 2 patches)
reviewable.

Includes refactored PDSM that uses Package argument for custom
parameters.

 = PDSM taking package as arguments

 Return (Local0)
 }

-Method (PDSM, 6, Serialized)
+Method (PDSM, 5, Serialized)
 {
-If ((Arg0 == ToUUID ("e5c937d0-3553-4d7a-9117-ea4d19c3434d") /* Device Labeling Interface */))
+If ((Arg2 == Zero))
 {
-Local0 = AIDX (Arg4, Arg5)
-If ((Arg2 == Zero))
-{
-If ((Arg1 == 0x02))
+Local0 = Buffer (One)
 {
-If (!((Local0 == Zero) | (Local0 == 0x)))
-{
-Return (Buffer (One)
-{
- 0x81 // .
-})
-}
+ 0x00 // .
 }
+Local1 = Zero
+If ((Arg0 != ToUUID ("e5c937d0-3553-4d7a-9117-ea4d19c3434d") /* Device Labeling Interface */))
+{
+Return (Local0)
+}

-Return (Buffer (One)
-{
- 0x00 // .
-})
+If ((Arg1 < 0x02))
+{
+Return (Local0)
 }
-ElseIf ((Arg2 == 0x07))
+
+Local2 = AIDX (DerefOf (Arg4 [Zero]), DerefOf (Arg4 [One]
+))
+If (!((Local2 == Zero) | (Local2 == 0x)))
 {
-Local1 = Package (0x02)
-{
-Zero,
-""
-}
-Local1 [Zero] = Local0
-Return (Local1)
+Local1 |= One
+Local1 |= (One << 0x07)
 }
+
+Local0 [Zero] = Local1
+Return (Local0)
+}
+
+If ((Arg2 == 0x07))
+{
+Local0 = Package (0x02)
+{
+Zero,
+""
+}
+Local2 = AIDX (DerefOf (Arg4 [Zero]), DerefOf (Arg4 [One]
+))
+Local0 [Zero] = Local2
+Return (Local0)
 }
 }
 }

 =  PCI slot using Package to pass arguments to _DSM

 Name (ASUN, Zero)
 Method (_DSM, 4, Serialized)  // _DSM: Device-Specific Method
 {
-Return (PDSM (Arg0, Arg1, Arg2, Arg3, BSEL, ASUN))
+Local0 = Package (0x02)
+{
+BSEL,
+ASUN
+}
+Return (PDSM (Arg0, Arg1, Arg2, Arg3, Local0))
 }
 }

 = hotpluggable PCI slot using Package to pass arguments to _DSM

 Method (_DSM, 4, Serialized)  // _DSM: Device-Specific Method
 {
-Return (PDSM (Arg0, Arg1, Arg2, Arg3, BSEL, _SUN))
+Local0 = Package (0x02)
+{
+BSEL,
+_SUN
+}
+Return (PDSM (Arg0, Arg1, Arg2, Arg3, Local0))
 }
 }

Signed-off-by: Igor Mammedov 
Message-Id: <20220701133515.137890-8-imamm...@redhat.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 tests/qtest/bios-tables-test-allowed-diff.h |  34 
 tests/data/acpi/pc/DSDT | Bin 5987 -> 6219 bytes
 tests/data/acpi/pc/DSDT.acpierst| Bin 5954 -> 6186 bytes
 tests/data/acpi/pc/DSDT.acpihmat| Bin 7312 -> 7544 bytes
 tests/data/acpi/pc/DSDT.bridge  | Bin 8653 -> 9078 bytes
 tests/data/acpi/pc/DSDT.cphp| Bin 6451 -> 6683 bytes
 tests/data/acpi/pc/DSDT.dimmpxm | Bin 7641 -> 7873 bytes
 tests/data/acpi/pc/DSDT.hpbridge| Bin 5954 -> 6186 bytes
 tests/data/acpi/pc/DSDT.ipmikcs | Bin 6059 -> 6291 bytes
 tests/data/acpi/pc/DSDT.memhp   | Bin 7346 -> 7578 bytes
 tests/data/acpi/pc/DSDT.nohpet  | Bin 5845 -> 6077 bytes
 tests/data/acpi/pc/DSDT.numamem | Bin 5993 -> 6225 bytes
 tests/data/acpi/pc/DSDT.roothp  

[PULL 51/55] tests: acpi: update expected blobs

2022-10-10 Thread Michael S. Tsirkin
From: Igor Mammedov 

Expected change:
  -Name (_SUN, 0x0X)  // _SUN: Slot User Number
   Name (_ADR, 0xY)  // _ADR: Address
  ...
  +Name (_SUN, 0xX)  // _SUN: Slot User Number

Signed-off-by: Igor Mammedov 
Message-Id: <20220701133515.137890-14-imamm...@redhat.com>
---
 tests/qtest/bios-tables-test-allowed-diff.h |  14 --
 tests/data/acpi/pc/DSDT | Bin 6422 -> 6422 bytes
 tests/data/acpi/pc/DSDT.acpierst| Bin 6382 -> 6382 bytes
 tests/data/acpi/pc/DSDT.acpihmat| Bin 7747 -> 7747 bytes
 tests/data/acpi/pc/DSDT.bridge  | Bin 9496 -> 9496 bytes
 tests/data/acpi/pc/DSDT.cphp| Bin 6886 -> 6886 bytes
 tests/data/acpi/pc/DSDT.dimmpxm | Bin 8076 -> 8076 bytes
 tests/data/acpi/pc/DSDT.hpbridge| Bin 6382 -> 6382 bytes
 tests/data/acpi/pc/DSDT.ipmikcs | Bin 6494 -> 6494 bytes
 tests/data/acpi/pc/DSDT.memhp   | Bin 7781 -> 7781 bytes
 tests/data/acpi/pc/DSDT.nohpet  | Bin 6280 -> 6280 bytes
 tests/data/acpi/pc/DSDT.numamem | Bin 6428 -> 6428 bytes
 tests/data/acpi/pc/DSDT.roothp  | Bin 6656 -> 6656 bytes
 tests/data/acpi/q35/DSDT.bridge | Bin 11449 -> 11449 bytes
 tests/data/acpi/q35/DSDT.multi-bridge   | Bin 8640 -> 8640 bytes
 15 files changed, 14 deletions(-)

diff --git a/tests/qtest/bios-tables-test-allowed-diff.h b/tests/qtest/bios-tables-test-allowed-diff.h
index 1983fa596b..dfb8523c8b 100644
--- a/tests/qtest/bios-tables-test-allowed-diff.h
+++ b/tests/qtest/bios-tables-test-allowed-diff.h
@@ -1,15 +1 @@
 /* List of comma-separated changed AML files to ignore */
-"tests/data/acpi/pc/DSDT",
-"tests/data/acpi/pc/DSDT.acpierst",
-"tests/data/acpi/pc/DSDT.acpihmat",
-"tests/data/acpi/pc/DSDT.bridge",
-"tests/data/acpi/pc/DSDT.cphp",
-"tests/data/acpi/pc/DSDT.dimmpxm",
-"tests/data/acpi/pc/DSDT.hpbridge",
-"tests/data/acpi/pc/DSDT.ipmikcs",
-"tests/data/acpi/pc/DSDT.memhp",
-"tests/data/acpi/pc/DSDT.nohpet",
-"tests/data/acpi/pc/DSDT.numamem",
-"tests/data/acpi/pc/DSDT.roothp",
-"tests/data/acpi/q35/DSDT.bridge",
-"tests/data/acpi/q35/DSDT.multi-bridge",
diff --git a/tests/data/acpi/pc/DSDT b/tests/data/acpi/pc/DSDT
index 8282d449ceaefd914b419f507a13f3fb7b318aaf..fd79a602a2aaac0f7d91d2ee2b1af8f2e6cdd4b3 100644
GIT binary patch
delta 758
zcmYk(y-LGi6vpvN(>6`>{pO=-zHY!lJG6Ch2v8>4a26a~osMjK@aXy9keFc02_4YHa_A2
zaX?=ZhloQeJpJYfaYQ!~O+=GEB#sfsrU)F-qtah5MvUoKVuF~^ex%<_5mUOCI7OV~}8RCp?CFY1Zea8AX
s6^I3%9$m+WG}Z)FNyw56})1vy8W5#Yk19GZ~XB6IQ`lC1Mb_IlmGw#

delta 758
zcmYMxJ5Iw;5QX7EaT3R|uOHWsIDTz_0#YQQps)m#6d-D|0VHZ93R)`ANU2!@h#rX=
zi5{^TlIuC5{Xd^#d^d~D;x$)UdwF&44D7Bwxp0XFWOpX;gIei>E#f=X1||*>
zhs0lqBg7GL>C2a6#4+&=qK#-1e;`f}C|h8OzDArPPKn=wwm-u~bma_pd@
zuH_C7?umOl=cmx>_X@7WmD#Vb_u;;{Z+wma6u<-Vpm*#;cqkqkKf*qON8(ZM*lV~J
z*Zi#ipD6~%#E)A3UV@kq|01S{De-T6ouh{ocR5o_-&_vY={I2TZ@n-NFdfOfd=T}HWG;jXn+P-!Init3CMF#
ze7FCnGt$hQHmA+;F^GnK|K!bP#?0!UUmhHThM+*LUIx(NFma}&|DhVc}W
ze?eKlBc3vRJnqtkb2ocA`|X1DeJM)nTeVRzVXr}JI7%kFC5%HD%}>>j)9eYnr=
zYv0K}fCucs=xQhX5FXN3JUw587|}YJVvHEmo5Tb$p-+h^VoF^)%@v}eSBXo+C4ETD
z5HpI^^iMfr@4e61-xJ{MyEeD{Zk1q*-O3O%f5nF>=nDbQw^`#Ywh>4Z{Q7k
e!!G+4-m>$nU%mfk#^pcHH={rR

delta 772
zcmXxhyG;W@6ougnT6^E`@xFImfD!xo+4(!R%-)5&;;!D`VDG^_anJazobSVZabNom`v4w@2gb1v;h}h_
zeUE(vkHn+d)gJp89ut41@_Y$mLcGZ=8)Al-5q~1)h)%UmHU;#b5K;)?hK
zu|O<{QzidYB9?NGQs46q_pIQRcr`o!Dfy=wUW?ayf55(hH{y+Pyi*Hr#ar$7*mv+w
fyfcn{5AVf$?MLhf_#i!ER;cI(CbKs2L3

diff --git a/tests/data/acpi/pc/DSDT.acpihmat b/tests/data/acpi/pc/DSDT.acpihmat
index 33169838bed50710495d45e0d9486ac59b4e504c..973320cb25120818a45ddb3d8e3b3211f0c00adc 100644
GIT binary patch
delta 760
zcmYMxy-LGS7{>8PlQvC1@^z9lP1;<79|frpkQ8we2gSL$o^!n+9$u1*Cnp;yq+
zMQ|3p0T<1ApPqLL{C{}}=VV)N>&<1Ht(%LRp=EU}x4FH$_B!rbKj>AR7AHq*7jDCC
zcAI@^+<`mnj`oG@yYMc1w{^9UeGlHFXVnDwsEg>*x5PeTpX$WC=^=XbC~<%|pihZI
z#33D~W;+S@6v%e=F(dTpc`kYHSX8;e_gVt%2+53G657|S#Ka+g~kJux2
zxu+N*(xKsurX|KvKj>92Dp5UWGRx2xk)dh?i1c_`=YuHMz#t0-;m?r)rd-8PrEF42JORtJ7lXT|ApYil>)hPcuD
zTHJ)2;->Z$_7>a{w>noV?1%6n@r!DF{i8ObP5gs6LL3p-v3%(uI>b+iW5hA>7ov;k
z5-$^(PY@@>kBC#mDe*pO`gigWJ-Nfvcdl`#5BJ6W>}{XT#P;(^|uV;{mp@vwKi
zQv{F1Bkc|L3a-Re@7TxiSUlFg!9IZ};z{RfgMA85iQhAMz8T_-_zy8d%!sdZ`7%e$
uiC+;5#De%2u|zD1FAAB@5$D9uhzrC8@pmErQ-xT`9o7FGCm#>x=g|R&$ew!u

diff --git a/tests/data/acpi/pc/DSDT.bridge b/tests/data/acpi/pc/DSDT.bridge
index 407066e1ac46922751969b823000e4fdfa542662..9583da4e4f558cb0bf6912733fbe8db7c1ad255f 100644
GIT binary patch
delta 1438
zcmeIyOG?8~6vpu+O`9}*B~6>8P1;zT3T;t*fQX@v3OcX|?I}o}8
zHy}f*q5I2_2TiE3$CwlvXsI)|PUJ(qU6tLZkLuA}R8)lQa}2j=OG
zu>dSkFHC%l8$d%m-pIawi(k(~H$^wo{t)TQ@gj0jZtMtlP1J)>Z<+u>4MP#I`qO=2iEDQE5}`+O9!#U2CzZ5j7?yZ
zz8F2AN5eI_ZXf82efinCN4&5AJrF%eJJ~(C?-qJX^cFpNfA`%+ZquVDUpWMZ^vxIn
yBii@n$qukXH;i3imp|U_9>!huZSmt?pv_dtcqf>~HV??;`bsz4>VUUHt`{a3nzh

delta 1438
zcmeIwOG?8~6vpwBG;Pz{SJJde+N6!esn8b12Z$KzsGtLj;H+D41L9l|A7@gWxdWjq
za0BARks!{+J=i4Y!zDQBq?F(Pp5$`Kd@`RrpV#bodUitQEIvA2{wDwV-rfw4jp=ZA

[PULL 39/55] tests: acpi: whitelist pc/q35 DSDT due to HPET AML move

2022-10-10 Thread Michael S. Tsirkin
From: Igor Mammedov 

Signed-off-by: Igor Mammedov 
Message-Id: <20220701133515.137890-2-imamm...@redhat.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 tests/qtest/bios-tables-test-allowed-diff.h | 32 +
 1 file changed, 32 insertions(+)

diff --git a/tests/qtest/bios-tables-test-allowed-diff.h b/tests/qtest/bios-tables-test-allowed-diff.h
index dfb8523c8b..a7aa428fab 100644
--- a/tests/qtest/bios-tables-test-allowed-diff.h
+++ b/tests/qtest/bios-tables-test-allowed-diff.h
@@ -1 +1,33 @@
 /* List of comma-separated changed AML files to ignore */
+"tests/data/acpi/pc/DSDT",
+"tests/data/acpi/pc/DSDT.acpierst",
+"tests/data/acpi/pc/DSDT.acpihmat",
+"tests/data/acpi/pc/DSDT.bridge",
+"tests/data/acpi/pc/DSDT.cphp",
+"tests/data/acpi/pc/DSDT.dimmpxm",
+"tests/data/acpi/pc/DSDT.hpbridge",
+"tests/data/acpi/pc/DSDT.hpbrroot",
+"tests/data/acpi/pc/DSDT.ipmikcs",
+"tests/data/acpi/pc/DSDT.memhp",
+"tests/data/acpi/pc/DSDT.numamem",
+"tests/data/acpi/pc/DSDT.roothp",
+"tests/data/acpi/q35/DSDT",
+"tests/data/acpi/q35/DSDT.acpierst",
+"tests/data/acpi/q35/DSDT.acpihmat",
+"tests/data/acpi/q35/DSDT.applesmc",
+"tests/data/acpi/q35/DSDT.bridge",
+"tests/data/acpi/q35/DSDT.cphp",
+"tests/data/acpi/q35/DSDT.cxl",
+"tests/data/acpi/q35/DSDT.dimmpxm",
+"tests/data/acpi/q35/DSDT.ipmibt",
+"tests/data/acpi/q35/DSDT.ipmismbus",
+"tests/data/acpi/q35/DSDT.ivrs",
+"tests/data/acpi/q35/DSDT.memhp",
+"tests/data/acpi/q35/DSDT.mmio64",
+"tests/data/acpi/q35/DSDT.multi-bridge",
+"tests/data/acpi/q35/DSDT.numamem",
+"tests/data/acpi/q35/DSDT.pvpanic-isa",
+"tests/data/acpi/q35/DSDT.tis.tpm12",
+"tests/data/acpi/q35/DSDT.tis.tpm2",
+"tests/data/acpi/q35/DSDT.viot",
+"tests/data/acpi/q35/DSDT.xapic",
-- 
MST




[PULL 49/55] tests: acpi: whitelist pc/q35 DSDT before moving _ADR field

2022-10-10 Thread Michael S. Tsirkin
From: Igor Mammedov 

Signed-off-by: Igor Mammedov 
Message-Id: <20220701133515.137890-12-imamm...@redhat.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 tests/qtest/bios-tables-test-allowed-diff.h | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/tests/qtest/bios-tables-test-allowed-diff.h b/tests/qtest/bios-tables-test-allowed-diff.h
index dfb8523c8b..1983fa596b 100644
--- a/tests/qtest/bios-tables-test-allowed-diff.h
+++ b/tests/qtest/bios-tables-test-allowed-diff.h
@@ -1 +1,15 @@
 /* List of comma-separated changed AML files to ignore */
+"tests/data/acpi/pc/DSDT",
+"tests/data/acpi/pc/DSDT.acpierst",
+"tests/data/acpi/pc/DSDT.acpihmat",
+"tests/data/acpi/pc/DSDT.bridge",
+"tests/data/acpi/pc/DSDT.cphp",
+"tests/data/acpi/pc/DSDT.dimmpxm",
+"tests/data/acpi/pc/DSDT.hpbridge",
+"tests/data/acpi/pc/DSDT.ipmikcs",
+"tests/data/acpi/pc/DSDT.memhp",
+"tests/data/acpi/pc/DSDT.nohpet",
+"tests/data/acpi/pc/DSDT.numamem",
+"tests/data/acpi/pc/DSDT.roothp",
+"tests/data/acpi/q35/DSDT.bridge",
+"tests/data/acpi/q35/DSDT.multi-bridge",
-- 
MST




[PULL 37/55] pci: Sanity check mask argument to pci_set_*_by_mask()

2022-10-10 Thread Michael S. Tsirkin
From: Peter Maydell 

Coverity complains that in functions like pci_set_word_by_mask()
we might end up shifting by more than 31 bits. This is true,
but only if the caller passes in a zero mask. Help Coverity out
by asserting that the mask argument is valid.
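
As a minimal sketch of the failure mode (not the QEMU code itself): QEMU's
ctz32() returns 32 for a zero argument, so a zero mask would make the shift
count 32. The helper names below (ctz32_sketch, set_long_by_mask_sketch) are
made up for illustration, and the helper works on a plain integer instead of
PCI config space.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for QEMU's ctz32(): trailing zero count, 32 for a zero input. */
static inline int ctz32_sketch(uint32_t val)
{
    return val ? __builtin_ctz(val) : 32;
}

/*
 * Hypothetical value-based variant of pci_set_long_by_mask(): place 'reg'
 * into the field selected by 'mask' and return the updated word.
 */
static inline uint32_t set_long_by_mask_sketch(uint32_t val, uint32_t mask,
                                               uint32_t reg)
{
    uint32_t rval;

    /* A zero mask would give a shift count of 32, which is undefined. */
    assert(mask);
    rval = reg << ctz32_sketch(mask);
    return (~mask & val) | (mask & rval);
}
```

With mask 0x00000ff0 the field starts at bit 4, so writing 0x12 into
0xffffffff yields 0xfffff12f; a zero mask trips the assertion instead of
reaching the shift.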

Fixes: CID 1487168

Reviewed-by: Richard Henderson 
Signed-off-by: Peter Maydell 
Message-Id: <20220818135421.2515257-3-peter.mayd...@linaro.org>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
Reviewed-by: Paolo Bonzini 
---
 include/hw/pci/pci.h | 20 
 1 file changed, 16 insertions(+), 4 deletions(-)

diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
index c79144bc5e..97937cc922 100644
--- a/include/hw/pci/pci.h
+++ b/include/hw/pci/pci.h
@@ -688,7 +688,10 @@ static inline void
 pci_set_byte_by_mask(uint8_t *config, uint8_t mask, uint8_t reg)
 {
 uint8_t val = pci_get_byte(config);
-uint8_t rval = reg << ctz32(mask);
+uint8_t rval;
+
+assert(mask);
+rval = reg << ctz32(mask);
 pci_set_byte(config, (~mask & val) | (mask & rval));
 }
 
@@ -696,7 +699,10 @@ static inline void
 pci_set_word_by_mask(uint8_t *config, uint16_t mask, uint16_t reg)
 {
 uint16_t val = pci_get_word(config);
-uint16_t rval = reg << ctz32(mask);
+uint16_t rval;
+
+assert(mask);
+rval = reg << ctz32(mask);
 pci_set_word(config, (~mask & val) | (mask & rval));
 }
 
@@ -704,7 +710,10 @@ static inline void
 pci_set_long_by_mask(uint8_t *config, uint32_t mask, uint32_t reg)
 {
 uint32_t val = pci_get_long(config);
-uint32_t rval = reg << ctz32(mask);
+uint32_t rval;
+
+assert(mask);
+rval = reg << ctz32(mask);
 pci_set_long(config, (~mask & val) | (mask & rval));
 }
 
@@ -712,7 +721,10 @@ static inline void
 pci_set_quad_by_mask(uint8_t *config, uint64_t mask, uint64_t reg)
 {
 uint64_t val = pci_get_quad(config);
-uint64_t rval = reg << ctz32(mask);
+uint64_t rval;
+
+assert(mask);
+rval = reg << ctz32(mask);
 pci_set_quad(config, (~mask & val) | (mask & rval));
 }
 
-- 
MST




[PULL 48/55] tests: acpi: update expected blobs

2022-10-10 Thread Michael S. Tsirkin
From: Igor Mammedov 

It's expected that hotpluggable slots will get an ASUN variable
and use that instead of _SUN in their _DSM method.

For example:

  @@ -979,8 +979,9 @@ DefinitionBlock ("", "DSDT", 1, "BOCHS ", "BXPC", 0x0001)

   Device (S18)
   {
  -Name (_SUN, 0x03)  // _SUN: Slot User Number
  +Name (ASUN, 0x03)
   Name (_ADR, 0x0003)  // _ADR: Address
  +Name (_SUN, 0x03)  // _SUN: Slot User Number
   Method (_EJ0, 1, NotSerialized)  // _EJx: Eject Device
   {
   PCEJ (BSEL, _SUN)
  @@ -991,7 +992,7 @@ DefinitionBlock ("", "DSDT", 1, "BOCHS ", "BXPC", 0x0001)
   Local0 = Package (0x02)
   {
   BSEL,
  -_SUN
  +ASUN
   }
   Return (PDSM (Arg0, Arg1, Arg2, Arg3, Local0))
   }

Signed-off-by: Igor Mammedov 
Message-Id: <20220701133515.137890-11-imamm...@redhat.com>
---
 tests/qtest/bios-tables-test-allowed-diff.h |  14 --
 tests/data/acpi/pc/DSDT | Bin 6219 -> 6422 bytes
 tests/data/acpi/pc/DSDT.acpierst| Bin 6186 -> 6382 bytes
 tests/data/acpi/pc/DSDT.acpihmat| Bin 7544 -> 7747 bytes
 tests/data/acpi/pc/DSDT.bridge  | Bin 9078 -> 9496 bytes
 tests/data/acpi/pc/DSDT.cphp| Bin 6683 -> 6886 bytes
 tests/data/acpi/pc/DSDT.dimmpxm | Bin 7873 -> 8076 bytes
 tests/data/acpi/pc/DSDT.hpbridge| Bin 6186 -> 6382 bytes
 tests/data/acpi/pc/DSDT.ipmikcs | Bin 6291 -> 6494 bytes
 tests/data/acpi/pc/DSDT.memhp   | Bin 7578 -> 7781 bytes
 tests/data/acpi/pc/DSDT.nohpet  | Bin 6077 -> 6280 bytes
 tests/data/acpi/pc/DSDT.numamem | Bin 6225 -> 6428 bytes
 tests/data/acpi/pc/DSDT.roothp  | Bin 6434 -> 6656 bytes
 tests/data/acpi/q35/DSDT.bridge | Bin 11227 -> 11449 bytes
 tests/data/acpi/q35/DSDT.multi-bridge   | Bin 8628 -> 8640 bytes
 15 files changed, 14 deletions(-)

diff --git a/tests/qtest/bios-tables-test-allowed-diff.h b/tests/qtest/bios-tables-test-allowed-diff.h
index 1983fa596b..dfb8523c8b 100644
--- a/tests/qtest/bios-tables-test-allowed-diff.h
+++ b/tests/qtest/bios-tables-test-allowed-diff.h
@@ -1,15 +1 @@
 /* List of comma-separated changed AML files to ignore */
-"tests/data/acpi/pc/DSDT",
-"tests/data/acpi/pc/DSDT.acpierst",
-"tests/data/acpi/pc/DSDT.acpihmat",
-"tests/data/acpi/pc/DSDT.bridge",
-"tests/data/acpi/pc/DSDT.cphp",
-"tests/data/acpi/pc/DSDT.dimmpxm",
-"tests/data/acpi/pc/DSDT.hpbridge",
-"tests/data/acpi/pc/DSDT.ipmikcs",
-"tests/data/acpi/pc/DSDT.memhp",
-"tests/data/acpi/pc/DSDT.nohpet",
-"tests/data/acpi/pc/DSDT.numamem",
-"tests/data/acpi/pc/DSDT.roothp",
-"tests/data/acpi/q35/DSDT.bridge",
-"tests/data/acpi/q35/DSDT.multi-bridge",
diff --git a/tests/data/acpi/pc/DSDT b/tests/data/acpi/pc/DSDT
index cb718f1a8a591b27ca841f70f40762562c15c837..8282d449ceaefd914b419f507a13f3fb7b318aaf 100644
GIT binary patch
delta 1170
zcmY++OHRU207l^w3Y0==%cD>pC2<9aFC2o!nHsImOu7UUx8MRCkhlsHV;r#tSKw}p
z=0C~3(~~dfd-1mP^CG`bQpc{>f3<%si%l7A_q&(iHM?<(!|6i-&8!>XqyK
zuI)Hg$1DHC?lwHweFX6zTQ0<0m@DR5j9U#>6RVl*h+7?27pq&0TLab*Ynbeqn+Nm6
zJd1H_!kS`Dlbvw$VZNAeF>WnbORQzGQ*LcoTdZv{ZUHP13ru##tpn?bbu7j$goR?E
z$yVGVSR@u%j9Uzg#bT44bL+yoVqJ@I>%n?rJ(I1u^w4F>W*1Ol)Sd
YOKuq~6U!{dZ4R4@=j5mI3(bW3jsO4v

delta 1329
zcmaLPIZnd>6aZ0MaT3R|9cQ((OR1cIN|u5%00*Fm#0e;
za19nQ<^S#IX`UyaV<$?}P$c5-$$
zOHN(G$Qg@!f3)-)=PxHodUSPtb34Cy_`#F$KeoV@f;s2Gc^R*9E`SRP=3E39WxUSW
z2HOheTmqM5yusN4I|}Ap2A5^L$=Lj6qH3f67gX=P$ac+Pc3g+AdH)R}iZh>10<{W}U8Sik8z>$JEx4~^0C!9Orj)FON
k!Ce{ea_)h93g+Ag_hr1tc>o?LnDY=k{KIDUu@Zin-wNzyDF6Tf

diff --git a/tests/data/acpi/pc/DSDT.acpierst b/tests/data/acpi/pc/DSDT.acpierst
index aebb29c2a4ae67b732bef3eb8e72c5665bb3a7b3..9520f3b7303a43091e8c77b64d1f76407e85f1f4 100644
GIT binary patch
delta 1131
zcmY++OG?8~07l^y`bck+Cg~$hUn#hP`mV#))R_uKXJVG%B3wdu;Y38x6*zO^M!E(;
z{sZ?;4o^B_=NKiJXA2}5qR{6ZSxVuU-mC@|70cr

diff --git a/tests/data/acpi/pc/DSDT.acpihmat b/tests/data/acpi/pc/DSDT.acpihmat
index b7c5de46346d2777b33f7fc464d319bd762fda8d..33169838bed50710495d45e0d9486ac59b4e504c 100644
GIT binary patch
delta 1170
zcmY++yGp}g07vl@dr8x@x%Hkl1)spXRb1MpZYmhvgv>sTLsvl`L2>HrQwTnYqbUCa
z-?tuqoS*a03okG7yH(~mmB)AQIrgr$4`s0|!}WIa7C)nb2Mx{Koe+S_r~6l}=#L%jiB(}$v8u(m)nGNTn#m5i)nRq9y2ZFPU=6W`$?WFjvgA
z7`G;@Db_UEF*gt9iFp>|)`GRfS|=EHn3-(uX_u(nv+WT)IZu#Q;AV%!2)AQqTx
z*RLGT9lo7#54g7UPz{60yW&3vOLlSFCF>Zar8}tY@+%w?3>d*0
z0c;>PFxiUR5H=JWT8!HWHWC||Eax_cjm5?mLNHQC_aJ@
zk@Hc9JG%H<4kOpRe}4XUETzER8ew@M-?y-315vba|9nS$ckK
z=d5g;oH~Y)Gv*Ke!s2h7KAt4$(d_c-dU|pDgIAXRaUPskFy{ieAmdfeMQ~BUoJ-)6
zjMq3@U`xTA%iyw%*E!o@Tfv+w;EIelI6GiR!JMn$s*E=|*T6LebFPEyG9Gbu!LEWi
zH^2=UZ*lg(o`N|y!A%*DIs0H=!JJ#*mW*T0ZE#z`oC9zm<6X`ja7V$MyWp;j6V5$w

[PULL 43/55] acpi: x86: refactor PDSM method to reduce nesting

2022-10-10 Thread Michael S. Tsirkin
From: Igor Mammedov 

..., it will help with code readability and make it easier
to extend the method in follow-up patches.
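
The flattened control flow of the "get supported functions" branch can be
sketched in plain C as below. This is only a model of the early-return
structure, not the AML builder code: the function name and the
'unsupported_index' parameter are hypothetical, and the parameter stands in
for the "not supported" acpi-index value, which this sketch does not
hard-code.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Model of PDSM function 0 after the refactoring: bail out early on an
 * unsupported UUID or revision, then advertise function 7 only when the
 * device has a usable acpi-index.
 */
static uint8_t pdsm_supported_functions(bool uuid_matches, unsigned rev,
                                        uint32_t acpi_index,
                                        uint32_t unsupported_index)
{
    uint8_t caps = 0; /* nothing supported yet */

    if (!uuid_matches) {
        return caps; /* call is for unsupported UUID */
    }
    if (rev < 2) {
        return caps; /* call is for unsupported REV */
    }
    if (acpi_index != 0 && acpi_index != unsupported_index) {
        caps |= 1;      /* have supported functions */
        caps |= 1 << 7; /* support for function 7 */
    }
    return caps;
}
```

Each guard returns the same default value, which is what removes the nested
if-contexts of the previous version.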

Signed-off-by: Igor Mammedov 
Message-Id: <20220701133515.137890-6-imamm...@redhat.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 hw/i386/acpi-build.c | 139 ---
 1 file changed, 77 insertions(+), 62 deletions(-)

diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 67b532f5a5..6d02eed12c 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -574,9 +574,12 @@ static void build_append_pci_bus_devices(Aml *parent_scope, PCIBus *bus,
 
 Aml *aml_pci_device_dsm(void)
 {
-Aml *method, *UUID, *ifctx, *ifctx1, *ifctx2, *ifctx3, *elsectx;
-Aml *acpi_index = aml_local(0);
+Aml *method, *UUID, *ifctx, *ifctx1;
+Aml *ret = aml_local(0);
+Aml *caps = aml_local(1);
+Aml *acpi_index = aml_local(2);
 Aml *zero = aml_int(0);
+Aml *one = aml_int(1);
 Aml *bnum = aml_arg(4);
 Aml *func = aml_arg(2);
 Aml *rev = aml_arg(1);
@@ -584,73 +587,85 @@ Aml *aml_pci_device_dsm(void)
 
 method = aml_method("PDSM", 6, AML_SERIALIZED);
 
-/*
- * PCI Firmware Specification 3.1
- * 4.6.  _DSM Definitions for PCI
- */
-UUID = aml_touuid("E5C937D0-3553-4D7A-9117-EA4D19C3434D");
-ifctx = aml_if(aml_equal(aml_arg(0), UUID));
+/* get supported functions */
+ifctx = aml_if(aml_equal(func, zero));
 {
-aml_append(ifctx, aml_store(aml_call2("AIDX", bnum, sunum), acpi_index));
-ifctx1 = aml_if(aml_equal(func, zero));
+uint8_t byte_list[1] = { 0 }; /* nothing supported yet */
+aml_append(ifctx, aml_store(aml_buffer(1, byte_list), ret));
+aml_append(ifctx, aml_store(zero, caps));
+
+   /*
+* PCI Firmware Specification 3.1
+* 4.6.  _DSM Definitions for PCI
+*/
+UUID = aml_touuid("E5C937D0-3553-4D7A-9117-EA4D19C3434D");
+ifctx1 = aml_if(aml_lnot(aml_equal(aml_arg(0), UUID)));
 {
-uint8_t byte_list[1];
+/* call is for unsupported UUID, bail out */
+aml_append(ifctx1, aml_return(ret));
+}
+aml_append(ifctx, ifctx1);
 
-ifctx2 = aml_if(aml_equal(rev, aml_int(2)));
-{
-/*
- * advertise function 7 if device has acpi-index
- * acpi_index values:
- *0: not present (default value)
- * : not supported (old QEMU without PIDX reg)
- *other: device's acpi-index
- */
-ifctx3 = aml_if(aml_lnot(
-aml_or(aml_equal(acpi_index, zero),
-   aml_equal(acpi_index, aml_int(0x)), NULL)
-));
-{
-byte_list[0] =
-1 /* have supported functions */ |
-1 << 7 /* support for function 7 */
-;
-aml_append(ifctx3, aml_return(aml_buffer(1, byte_list)));
-}
-aml_append(ifctx2, ifctx3);
- }
- aml_append(ifctx1, ifctx2);
+ifctx1 = aml_if(aml_lless(rev, aml_int(2)));
+{
+/* call is for unsupported REV, bail out */
+aml_append(ifctx1, aml_return(ret));
+}
+aml_append(ifctx, ifctx1);
 
- byte_list[0] = 0; /* nothing supported */
- aml_append(ifctx1, aml_return(aml_buffer(1, byte_list)));
- }
- aml_append(ifctx, ifctx1);
- elsectx = aml_else();
- /*
-  * PCI Firmware Specification 3.1
-  * 4.6.7. _DSM for Naming a PCI or PCI Express Device Under
-  *Operating Systems
-  */
- ifctx1 = aml_if(aml_equal(func, aml_int(7)));
- {
- Aml *pkg = aml_package(2);
- Aml *ret = aml_local(1);
+aml_append(ifctx,
+aml_store(aml_call2("AIDX", bnum, sunum), acpi_index));
+/*
+ * advertise function 7 if device has acpi-index
+ * acpi_index values:
+ *0: not present (default value)
+ * : not supported (old QEMU without PIDX reg)
+ *other: device's acpi-index
+ */
+ifctx1 = aml_if(aml_lnot(
+ aml_or(aml_equal(acpi_index, zero),
+aml_equal(acpi_index, aml_int(0x)), NULL)
+ ));
+{
+/* have supported functions */
+aml_append(ifctx1, aml_or(caps, one, caps));
+/* support for function 7 */
+aml_append(ifctx1,
+aml_or(caps, aml_shiftleft(one, aml_int(7)), caps));
+}
+aml_append(ifctx, ifctx1);
 
- aml_append(pkg, zero);
- /*
-  * optional, if not impl. should return null 

[PULL 38/55] hw/smbios: support for type 8 (port connector)

2022-10-10 Thread Michael S. Tsirkin
From: Hal Martin 

PATCH v1: add support for SMBIOS type 8 to qemu
PATCH v2: incorporate patch v1 feedback and add smbios type=8 to qemu-options

internal_reference: internal reference designator
external_reference: external reference designator
connector_type: hex value for port connector type (see SMBIOS 7.9.2)
port_type: hex value for port type (see SMBIOS 7.9.3)

After studying various vendor implementations (Dell, Lenovo, MSI),
the value of internal connector type was hard-coded to 0x0 (None).

Example usage:
-smbios type=8,internal_reference=JUSB1,external_reference=USB1,connector_type=0x12,port_type=0x10 \
-smbios type=8,internal_reference=JAUD1,external_reference="Audio Jack",connector_type=0x1f,port_type=0x1d \
-smbios type=8,internal_reference=LAN,external_reference=Ethernet,connector_type=0x0b,port_type=0x1f \
-smbios type=8,internal_reference=PS2,external_reference=Mouse,connector_type=0x0f,port_type=0x0e \
-smbios type=8,internal_reference=PS2,external_reference=Keyboard,connector_type=0x0f,port_type=0x0d
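
For orientation, a rough sketch of how one such option could map onto the
type 8 record. The struct mirrors the fields added by this patch; the
4-byte header layout (type, length, handle) follows the SMBIOS structure
header. The names ending in _sketch and the helper itself are hypothetical,
and the string numbers 1 and 2 assume both reference strings are present.

```c
#include <stdint.h>

/* SMBIOS structure header: type, formatted-area length, handle. */
struct smbios_header_sketch {
    uint8_t type;
    uint8_t length;
    uint16_t handle;
} __attribute__((packed));

/* Same field order as struct smbios_type_8 in the patch. */
struct smbios_type_8_sketch {
    struct smbios_header_sketch header;
    uint8_t internal_reference_str;
    uint8_t internal_connector_type;
    uint8_t external_reference_str;
    uint8_t external_connector_type;
    uint8_t port_type;
} __attribute__((packed));

/* Fill a record from the connector_type and port_type option values. */
static struct smbios_type_8_sketch
make_type8_sketch(uint16_t handle, uint8_t connector_type, uint8_t port_type)
{
    struct smbios_type_8_sketch t = {
        .header = {
            .type = 8,
            .length = sizeof(struct smbios_type_8_sketch),
            .handle = handle,
        },
        .internal_reference_str = 1,     /* first string after the record */
        .internal_connector_type = 0x0,  /* None, as hard-coded by the patch */
        .external_reference_str = 2,     /* second string after the record */
        .external_connector_type = connector_type,
        .port_type = port_type,
    };
    return t;
}
```

For the first example line, connector_type=0x12 and port_type=0x10 end up
in external_connector_type and port_type, while internal_connector_type
stays 0x0.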

Signed-off-by: Hal Martin 

Message-Id: <20220812135153.17859-1-hal.mar...@gmail.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 include/hw/firmware/smbios.h | 10 ++
 hw/smbios/smbios.c   | 63 
 qemu-options.hx  |  2 ++
 3 files changed, 75 insertions(+)

diff --git a/include/hw/firmware/smbios.h b/include/hw/firmware/smbios.h
index 4b7ad77a44..e7d386f7c8 100644
--- a/include/hw/firmware/smbios.h
+++ b/include/hw/firmware/smbios.h
@@ -189,6 +189,16 @@ struct smbios_type_4 {
 uint16_t processor_family2;
 } QEMU_PACKED;
 
+/* SMBIOS type 8 - Port Connector Information */
+struct smbios_type_8 {
+struct smbios_structure_header header;
+uint8_t internal_reference_str;
+uint8_t internal_connector_type;
+uint8_t external_reference_str;
+uint8_t external_connector_type;
+uint8_t port_type;
+} QEMU_PACKED;
+
 /* SMBIOS type 11 - OEM strings */
 struct smbios_type_11 {
 struct smbios_structure_header header;
diff --git a/hw/smbios/smbios.c b/hw/smbios/smbios.c
index 4c9f664830..51437ca09f 100644
--- a/hw/smbios/smbios.c
+++ b/hw/smbios/smbios.c
@@ -111,6 +111,13 @@ static struct {
 .processor_id = 0,
 };
 
+struct type8_instance {
+const char *internal_reference, *external_reference;
+uint8_t connector_type, port_type;
+QTAILQ_ENTRY(type8_instance) next;
+};
+static QTAILQ_HEAD(, type8_instance) type8 = QTAILQ_HEAD_INITIALIZER(type8);
+
 static struct {
 size_t nvalues;
 char **values;
@@ -337,6 +344,29 @@ static const QemuOptDesc qemu_smbios_type4_opts[] = {
 { /* end of list */ }
 };
 
+static const QemuOptDesc qemu_smbios_type8_opts[] = {
+{
+.name = "internal_reference",
+.type = QEMU_OPT_STRING,
+.help = "internal reference designator",
+},
+{
+.name = "external_reference",
+.type = QEMU_OPT_STRING,
+.help = "external reference designator",
+},
+{
+.name = "connector_type",
+.type = QEMU_OPT_NUMBER,
+.help = "connector type",
+},
+{
+.name = "port_type",
+.type = QEMU_OPT_NUMBER,
+.help = "port type",
+},
+};
+
 static const QemuOptDesc qemu_smbios_type11_opts[] = {
 {
 .name = "value",
@@ -718,6 +748,26 @@ static void smbios_build_type_4_table(MachineState *ms, unsigned instance)
 smbios_type4_count++;
 }
 
+static void smbios_build_type_8_table(void)
+{
+unsigned instance = 0;
+struct type8_instance *t8;
+
+QTAILQ_FOREACH(t8, &type8, next) {
+SMBIOS_BUILD_TABLE_PRE(8, T0_BASE + instance, true);
+
+SMBIOS_TABLE_SET_STR(8, internal_reference_str, t8->internal_reference);
+SMBIOS_TABLE_SET_STR(8, external_reference_str, t8->external_reference);
+/* most vendors seem to set this to None */
+t->internal_connector_type = 0x0;
+t->external_connector_type = t8->connector_type;
+t->port_type = t8->port_type;
+
+SMBIOS_BUILD_TABLE_POST;
+instance++;
+}
+}
+
 static void smbios_build_type_11_table(void)
 {
 char count_str[128];
@@ -1030,6 +1080,7 @@ void smbios_get_tables(MachineState *ms,
 smbios_build_type_4_table(ms, i);
 }
 
+smbios_build_type_8_table();
 smbios_build_type_11_table();
 
 #define MAX_DIMM_SZ (16 * GiB)
@@ -1348,6 +1399,18 @@ void smbios_entry_add(QemuOpts *opts, Error **errp)
UINT16_MAX);
 }
 return;
+case 8:
+if (!qemu_opts_validate(opts, qemu_smbios_type8_opts, errp)) {
+return;
+}
+struct type8_instance *t;
+t = g_new0(struct type8_instance, 1);
+save_opt(&t->internal_reference, opts, "internal_reference");
+save_opt(&t->external_reference, opts, "external_reference");
+t->connector_type = 

[PULL 32/55] qmp: decode feature & status bits in virtio-status

2022-10-10 Thread Michael S. Tsirkin
From: Laurent Vivier 

Display feature names instead of bitmaps for host, guest, and
backend for VirtIODevices.

Display status names instead of bitmaps for VirtIODevices.

Display feature names instead of bitmaps for backend, protocol,
acked, and features (hdev->features) for vhost devices.

Decode features according to device ID. Decode statuses
according to configuration status bitmap (config_status_map).
Decode vhost user protocol features according to vhost user
protocol bitmap (vhost_user_protocol_map).

Transport features are on the first line. Undecoded bits (if
any) are stored in a separate field.
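
A minimal sketch of this kind of decoding, assuming a single lookup table:
every bit found in the table is reported by name and cleared, and whatever
remains is returned as the undecoded part. The table and function names are
hypothetical and only three transport bits are listed; the real maps in
hw/virtio/virtio.c are per-device and also carry descriptions.

```c
#include <stddef.h>
#include <stdint.h>

struct feature_name_sketch {
    unsigned bit;
    const char *name;
};

/* A small subset of transport feature bits, for illustration only. */
static const struct feature_name_sketch transport_map_sketch[] = {
    { 28, "VIRTIO_RING_F_INDIRECT_DESC" },
    { 29, "VIRTIO_RING_F_EVENT_IDX" },
    { 32, "VIRTIO_F_VERSION_1" },
};

/*
 * Store up to 'max' names of set bits in 'out', write how many were stored
 * to '*count', and return the bits that were not decoded.
 */
static uint64_t decode_features_sketch(uint64_t bitmap, const char **out,
                                       size_t max, size_t *count)
{
    size_t n = 0;
    size_t entries = sizeof(transport_map_sketch) /
                     sizeof(transport_map_sketch[0]);

    for (size_t i = 0; i < entries; i++) {
        uint64_t bit = UINT64_C(1) << transport_map_sketch[i].bit;

        if ((bitmap & bit) && n < max) {
            out[n++] = transport_map_sketch[i].name;
            bitmap &= ~bit; /* decoded, so drop it from the remainder */
        }
    }
    *count = n;
    return bitmap;
}
```

With this table, the old numeric guest-features value 5100273664 decodes to
three names with nothing left over, and setting bit 30 in addition leaves
1073741824 as the undecoded remainder.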

[Jonah: Several changes made to this patch from prev. version (v14):
 - Moved all device features mappings to hw/virtio/virtio.c
 - Renamed device features mappings (less generic)
 - Generalized @FEATURE_ENTRY macro for all device mappings
 - Virtio device feature map definitions include descriptions of
   feature bits
 - Moved @VHOST_USER_F_PROTOCOL_FEATURES feature bit from transport
   feature map to vhost-user-supported device feature mappings
   (blk, fs, i2c, rng, net, gpu, input, scsi, vsock)
 - New feature bit added for virtio-vsock: @VIRTIO_VSOCK_F_SEQPACKET
 - New feature bit added for virtio-iommu: @VIRTIO_IOMMU_F_BYPASS_CONFIG
 - New feature bit added for virtio-mem: @VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE
 - New virtio transport feature bit added: @VIRTIO_F_IN_ORDER
 - Added device feature map definition for virtio-rng
]

Signed-off-by: Laurent Vivier 
Signed-off-by: Jonah Palmer 
Message-Id: <1660220684-24909-4-git-send-email-jonah.pal...@oracle.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 qapi/virtio.json   | 251 +--
 include/hw/virtio/vhost.h  |   3 +
 include/hw/virtio/virtio.h |   5 +
 hw/virtio/virtio.c | 643 -
 4 files changed, 874 insertions(+), 28 deletions(-)

diff --git a/qapi/virtio.json b/qapi/virtio.json
index c86b3bc635..c9c8201e66 100644
--- a/qapi/virtio.json
+++ b/qapi/virtio.json
@@ -106,10 +106,10 @@
 'n-tmp-sections': 'int',
 'nvqs': 'uint32',
 'vq-index': 'int',
-'features': 'uint64',
-'acked-features': 'uint64',
-'backend-features': 'uint64',
-'protocol-features': 'uint64',
+'features': 'VirtioDeviceFeatures',
+'acked-features': 'VirtioDeviceFeatures',
+'backend-features': 'VirtioDeviceFeatures',
+'protocol-features': 'VhostDeviceProtocols',
 'max-queues': 'uint64',
 'backend-cap': 'uint64',
 'log-enabled': 'bool',
@@ -176,11 +176,11 @@
 'device-id': 'uint16',
 'vhost-started': 'bool',
 'device-endian': 'str',
-'guest-features': 'uint64',
-'host-features': 'uint64',
-'backend-features': 'uint64',
+'guest-features': 'VirtioDeviceFeatures',
+'host-features': 'VirtioDeviceFeatures',
+'backend-features': 'VirtioDeviceFeatures',
 'num-vqs': 'int',
-'status': 'uint8',
+'status': 'VirtioDeviceStatus',
 'isr': 'uint8',
 'queue-sel': 'uint16',
 'vm-running': 'bool',
@@ -222,14 +222,41 @@
 #  "name": "virtio-crypto",
 #  "started": true,
 #  "device-id": 20,
-#  "backend-features": 0,
+#  "backend-features": {
+#  "transports": [],
+#  "dev-features": []
+#  },
 #  "start-on-kick": false,
 #  "isr": 1,
 #  "broken": false,
-#  "status": 15,
+#  "status": {
+#  "statuses": [
+#  "VIRTIO_CONFIG_S_ACKNOWLEDGE: Valid virtio device found",
+#  "VIRTIO_CONFIG_S_DRIVER: Guest OS compatible with device",
+#  "VIRTIO_CONFIG_S_FEATURES_OK: Feature negotiation complete",
+#  "VIRTIO_CONFIG_S_DRIVER_OK: Driver setup and ready"
+#  ]
+#  },
 #  "num-vqs": 2,
-#  "guest-features": 5100273664,
-#  "host-features": 6325010432,
+#  "guest-features": {
+#  "dev-features": [],
+#  "transports": [
+#  "VIRTIO_RING_F_EVENT_IDX: Used & avail. event fields enabled",
+#  "VIRTIO_RING_F_INDIRECT_DESC: Indirect descriptors supported",
+#  "VIRTIO_F_VERSION_1: Device compliant for v1 spec (legacy)"
+#  ]
+#  },
+#  "host-features": {
+#  "unknown-dev-features": 1073741824,
+#  "dev-features": [],
+#  "transports": [
+#  "VIRTIO_RING_F_EVENT_IDX: Used & avail. event fields enabled",
+#  "VIRTIO_RING_F_INDIRECT_DESC: Indirect descriptors supported",
+#  "VIRTIO_F_VERSION_1: Device compliant for v1 spec (legacy)",
+#  

[PULL 35/55] hmp: add virtio commands

2022-10-10 Thread Michael S. Tsirkin
From: Laurent Vivier 

This patch implements the HMP versions of the virtio QMP commands.

[Jonah: Adjusted hmp monitor output format for features / statuses
with their descriptions.]

Signed-off-by: Laurent Vivier 
Signed-off-by: Jonah Palmer 
Message-Id: <1660220684-24909-7-git-send-email-jonah.pal...@oracle.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 include/monitor/hmp.h |   5 +
 monitor/hmp-cmds.c| 310 ++
 hmp-commands-info.hx  |  70 ++
 3 files changed, 385 insertions(+)

diff --git a/include/monitor/hmp.h b/include/monitor/hmp.h
index a618eb1e4e..a9cf064ee8 100644
--- a/include/monitor/hmp.h
+++ b/include/monitor/hmp.h
@@ -95,6 +95,11 @@ void hmp_qom_list(Monitor *mon, const QDict *qdict);
 void hmp_qom_get(Monitor *mon, const QDict *qdict);
 void hmp_qom_set(Monitor *mon, const QDict *qdict);
 void hmp_info_qom_tree(Monitor *mon, const QDict *dict);
+void hmp_virtio_query(Monitor *mon, const QDict *qdict);
+void hmp_virtio_status(Monitor *mon, const QDict *qdict);
+void hmp_virtio_queue_status(Monitor *mon, const QDict *qdict);
+void hmp_vhost_queue_status(Monitor *mon, const QDict *qdict);
+void hmp_virtio_queue_element(Monitor *mon, const QDict *qdict);
 void object_add_completion(ReadLineState *rs, int nb_args, const char *str);
 void object_del_completion(ReadLineState *rs, int nb_args, const char *str);
 void device_add_completion(ReadLineState *rs, int nb_args, const char *str);
diff --git a/monitor/hmp-cmds.c b/monitor/hmp-cmds.c
index f90eea8d01..bab86c5537 100644
--- a/monitor/hmp-cmds.c
+++ b/monitor/hmp-cmds.c
@@ -43,6 +43,8 @@
 #include "qapi/qapi-commands-stats.h"
 #include "qapi/qapi-commands-tpm.h"
 #include "qapi/qapi-commands-ui.h"
+#include "qapi/qapi-commands-virtio.h"
+#include "qapi/qapi-visit-virtio.h"
 #include "qapi/qapi-visit-net.h"
 #include "qapi/qapi-visit-migration.h"
 #include "qapi/qmp/qdict.h"
@@ -2472,3 +2474,311 @@ exit:
 exit_no_print:
 error_free(err);
 }
+
+static void hmp_virtio_dump_protocols(Monitor *mon,
+  VhostDeviceProtocols *pcol)
+{
+strList *pcol_list = pcol->protocols;
+while (pcol_list) {
+monitor_printf(mon, "\t%s", pcol_list->value);
+pcol_list = pcol_list->next;
+if (pcol_list != NULL) {
+monitor_printf(mon, ",\n");
+}
+}
+monitor_printf(mon, "\n");
+if (pcol->has_unknown_protocols) {
+monitor_printf(mon, "  unknown-protocols(0x%016"PRIx64")\n",
+   pcol->unknown_protocols);
+}
+}
+
+static void hmp_virtio_dump_status(Monitor *mon,
+   VirtioDeviceStatus *status)
+{
+strList *status_list = status->statuses;
+while (status_list) {
+monitor_printf(mon, "\t%s", status_list->value);
+status_list = status_list->next;
+if (status_list != NULL) {
+monitor_printf(mon, ",\n");
+}
+}
+monitor_printf(mon, "\n");
+if (status->has_unknown_statuses) {
+monitor_printf(mon, "  unknown-statuses(0x%016"PRIx32")\n",
+   status->unknown_statuses);
+}
+}
+
+static void hmp_virtio_dump_features(Monitor *mon,
+ VirtioDeviceFeatures *features)
+{
+strList *transport_list = features->transports;
+while (transport_list) {
+monitor_printf(mon, "\t%s", transport_list->value);
+transport_list = transport_list->next;
+if (transport_list != NULL) {
+monitor_printf(mon, ",\n");
+}
+}
+
+monitor_printf(mon, "\n");
+strList *list = features->dev_features;
+if (list) {
+while (list) {
+monitor_printf(mon, "\t%s", list->value);
+list = list->next;
+if (list != NULL) {
+monitor_printf(mon, ",\n");
+}
+}
+monitor_printf(mon, "\n");
+}
+
+if (features->has_unknown_dev_features) {
+monitor_printf(mon, "  unknown-features(0x%016"PRIx64")\n",
+   features->unknown_dev_features);
+}
+}
+
+void hmp_virtio_query(Monitor *mon, const QDict *qdict)
+{
+Error *err = NULL;
+VirtioInfoList *list = qmp_x_query_virtio(&err);
+VirtioInfoList *node;
+
+if (err != NULL) {
+hmp_handle_error(mon, err);
+return;
+}
+
+if (list == NULL) {
+monitor_printf(mon, "No VirtIO devices\n");
+return;
+}
+
+node = list;
+while (node) {
+monitor_printf(mon, "%s [%s]\n", node->value->path,
+   node->value->name);
+node = node->next;
+}
+qapi_free_VirtioInfoList(list);
+}
+
+void hmp_virtio_status(Monitor *mon, const QDict *qdict)
+{
+Error *err = NULL;
+const char *path = qdict_get_try_str(qdict, "path");
+VirtioStatus *s = qmp_x_query_virtio_status(path, &err);
+
+if (err != NULL) {
+hmp_handle_error(mon, err);

[PULL 33/55] qmp: add QMP commands for virtio/vhost queue-status

2022-10-10 Thread Michael S. Tsirkin
From: Laurent Vivier 

These new commands show the internal status of a VirtIODevice's
VirtQueue and a vhost device's vhost_virtqueue (if active).

Signed-off-by: Laurent Vivier 
Signed-off-by: Jonah Palmer 
Message-Id: <1660220684-24909-5-git-send-email-jonah.pal...@oracle.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 qapi/virtio.json| 256 
 hw/virtio/virtio-stub.c |  14 +++
 hw/virtio/virtio.c  | 103 
 3 files changed, 373 insertions(+)

diff --git a/qapi/virtio.json b/qapi/virtio.json
index c9c8201e66..d9050f3584 100644
--- a/qapi/virtio.json
+++ b/qapi/virtio.json
@@ -499,3 +499,259 @@
   'data': { 'transports': [ 'str' ],
 '*dev-features': [ 'str' ],
 '*unknown-dev-features': 'uint64' } }
+
+##
+# @VirtQueueStatus:
+#
+# Information of a VirtIODevice VirtQueue, including most members of
+# the VirtQueue data structure.
+#
+# @name: Name of the VirtIODevice that uses this VirtQueue
+#
+# @queue-index: VirtQueue queue_index
+#
+# @inuse: VirtQueue inuse
+#
+# @vring-num: VirtQueue vring.num
+#
+# @vring-num-default: VirtQueue vring.num_default
+#
+# @vring-align: VirtQueue vring.align
+#
+# @vring-desc: VirtQueue vring.desc (descriptor area)
+#
+# @vring-avail: VirtQueue vring.avail (driver area)
+#
+# @vring-used: VirtQueue vring.used (device area)
+#
+# @last-avail-idx: VirtQueue last_avail_idx or return of vhost_dev
+#  vhost_get_vring_base (if vhost active)
+#
+# @shadow-avail-idx: VirtQueue shadow_avail_idx
+#
+# @used-idx: VirtQueue used_idx
+#
+# @signalled-used: VirtQueue signalled_used
+#
+# @signalled-used-valid: VirtQueue signalled_used_valid flag
+#
+# Since: 7.1
+#
+##
+
+{ 'struct': 'VirtQueueStatus',
+  'data': { 'name': 'str',
+'queue-index': 'uint16',
+'inuse': 'uint32',
+'vring-num': 'uint32',
+'vring-num-default': 'uint32',
+'vring-align': 'uint32',
+'vring-desc': 'uint64',
+'vring-avail': 'uint64',
+'vring-used': 'uint64',
+'*last-avail-idx': 'uint16',
+'*shadow-avail-idx': 'uint16',
+'used-idx': 'uint16',
+'signalled-used': 'uint16',
+'signalled-used-valid': 'bool' } }
+
+##
+# @x-query-virtio-queue-status:
+#
+# Return the status of a given VirtIODevice's VirtQueue
+#
+# @path: VirtIODevice canonical QOM path
+#
+# @queue: VirtQueue index to examine
+#
+# Features:
+# @unstable: This command is meant for debugging.
+#
+# Returns: VirtQueueStatus of the VirtQueue
+#
+# Notes: last_avail_idx will not be displayed in the case where
+#the selected VirtIODevice has a running vhost device and
+#the VirtIODevice VirtQueue index (queue) does not exist for
+#the corresponding vhost device vhost_virtqueue. Also,
+#shadow_avail_idx will not be displayed in the case where
+#the selected VirtIODevice has a running vhost device.
+#
+# Since: 7.1
+#
+# Examples:
+#
+# 1. Get VirtQueueStatus for virtio-vsock (vhost-vsock running)
+#
+# -> { "execute": "x-query-virtio-queue-status",
+#  "arguments": { "path": "/machine/peripheral/vsock0/virtio-backend",
+# "queue": 1 }
+#}
+# <- { "return": {
+#  "signalled-used": 0,
+#  "inuse": 0,
+#  "name": "vhost-vsock",
+#  "vring-align": 4096,
+#  "vring-desc": 5217370112,
+#  "signalled-used-valid": false,
+#  "vring-num-default": 128,
+#  "vring-avail": 5217372160,
+#  "queue-index": 1,
+#  "last-avail-idx": 0,
+#  "vring-used": 5217372480,
+#  "used-idx": 0,
+#  "vring-num": 128
+#  }
+#}
+#
+# 2. Get VirtQueueStatus for virtio-serial (no vhost)
+#
+# -> { "execute": "x-query-virtio-queue-status",
+#  "arguments": { "path": "/machine/peripheral-anon/device[0]/virtio-backend",
+# "queue": 20 }
+#}
+# <- { "return": {
+#  "signalled-used": 0,
+#  "inuse": 0,
+#  "name": "virtio-serial",
+#  "vring-align": 4096,
+#  "vring-desc": 5182074880,
+#  "signalled-used-valid": false,
+#  "vring-num-default": 128,
+#  "vring-avail": 5182076928,
+#  "queue-index": 20,
+#  "last-avail-idx": 0,
+#  "vring-used": 5182077248,
+#  "used-idx": 0,
+#  "shadow-avail-idx": 0,
+#  "vring-num": 128
+#  }
+#}
+#
+##
+
+{ 'command': 'x-query-virtio-queue-status',
+  'data': { 'path': 'str', 'queue': 'uint16' },
+  'returns': 'VirtQueueStatus',
+  'features': [ 'unstable' ] }
+
+##
+# @VirtVhostQueueStatus:
+#
+# Information of a vhost device's vhost_virtqueue, including most
+# members of the vhost_dev vhost_virtqueue data structure.
+#
+# @name: Name of the VirtIODevice that uses this vhost_virtqueue
+#
+# @kick: vhost_virtqueue kick
+#
+# @call: vhost_virtqueue 

[PULL 41/55] tests: acpi: update expected blobs after HPET move

2022-10-10 Thread Michael S. Tsirkin
From: Igor Mammedov 

HPET AML moved after PCI host bridge description (no functional change)

diff example for PC machine:

@@ -54,47 +54,6 @@ DefinitionBlock ("", "DSDT", 1, "BOCHS ", "BXPC", 0x0001)
 }
 }

-Scope (_SB)
-{
-Device (HPET)
-{
-Name (_HID, EisaId ("PNP0103") /* HPET System Timer */)  // _HID: Hardware ID
-Name (_UID, Zero)  // _UID: Unique ID
-OperationRegion (HPTM, SystemMemory, 0xFED0, 0x0400)
-Field (HPTM, DWordAcc, Lock, Preserve)
-{
-VEND,   32,
-PRD,32
-}
-
-Method (_STA, 0, NotSerialized)  // _STA: Status
-{
-Local0 = VEND /* \_SB_.HPET.VEND */
-Local1 = PRD /* \_SB_.HPET.PRD_ */
-Local0 >>= 0x10
-If (((Local0 == Zero) || (Local0 == 0x)))
-{
-Return (Zero)
-}
-
-If (((Local1 == Zero) || (Local1 > 0x05F5E100)))
-{
-Return (Zero)
-}
-
-Return (0x0F)
-}
-
-Name (_CRS, ResourceTemplate ()  // _CRS: Current Resource Settings
-{
-Memory32Fixed (ReadOnly,
-0xFED0, // Address Base
-0x0400, // Address Length
-)
-})
-}
-}
-
 Scope (_SB.PCI0)
 {
 Device (ISA)
@@ -529,6 +488,47 @@ DefinitionBlock ("", "DSDT", 1, "BOCHS ", "BXPC", 0x0001)
 }
 }

+Scope (_SB)
+{
+Device (HPET)
+{
+Name (_HID, EisaId ("PNP0103") /* HPET System Timer */)  // _HID: Hardware ID
+Name (_UID, Zero)  // _UID: Unique ID
+OperationRegion (HPTM, SystemMemory, 0xFED0, 0x0400)
+Field (HPTM, DWordAcc, Lock, Preserve)
+{
+VEND,   32,
+PRD,32
+}
+
+Method (_STA, 0, NotSerialized)  // _STA: Status
+{
+Local0 = VEND /* \_SB_.HPET.VEND */
+Local1 = PRD /* \_SB_.HPET.PRD_ */
+Local0 >>= 0x10
+If (((Local0 == Zero) || (Local0 == 0x)))
+{
+Return (Zero)
+}
+
+If (((Local1 == Zero) || (Local1 > 0x05F5E100)))
+{
+Return (Zero)
+}
+
+Return (0x0F)
+}
+
+Name (_CRS, ResourceTemplate ()  // _CRS: Current Resource Settings
+{
+Memory32Fixed (ReadOnly,
+0xFED0, // Address Base
+0x0400, // Address Length
+)
+})
+}
+}
+
 Scope (_SB)
 {
 Device (\_SB.PCI0.PRES)

Signed-off-by: Igor Mammedov 
Message-Id: <20220701133515.137890-4-imamm...@redhat.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 tests/qtest/bios-tables-test-allowed-diff.h |  32 
 tests/data/acpi/pc/DSDT | Bin 5987 -> 5987 bytes
 tests/data/acpi/pc/DSDT.acpierst| Bin 5954 -> 5954 bytes
 tests/data/acpi/pc/DSDT.acpihmat| Bin 7312 -> 7312 bytes
 tests/data/acpi/pc/DSDT.bridge  | Bin 8653 -> 8653 bytes
 tests/data/acpi/pc/DSDT.cphp| Bin 6451 -> 6451 bytes
 tests/data/acpi/pc/DSDT.dimmpxm | Bin 7641 -> 7641 bytes
 tests/data/acpi/pc/DSDT.hpbridge| Bin 5954 -> 5954 bytes
 tests/data/acpi/pc/DSDT.hpbrroot| Bin 3069 -> 3069 bytes
 tests/data/acpi/pc/DSDT.ipmikcs | Bin 6059 -> 6059 bytes
 tests/data/acpi/pc/DSDT.memhp   | Bin 7346 -> 7346 bytes
 tests/data/acpi/pc/DSDT.numamem | Bin 5993 -> 5993 bytes
 tests/data/acpi/pc/DSDT.roothp  | Bin 6195 -> 6195 bytes
 tests/data/acpi/q35/DSDT| Bin 8274 -> 8274 bytes
 tests/data/acpi/q35/DSDT.acpierst   | Bin 8291 -> 8291 bytes
 tests/data/acpi/q35/DSDT.acpihmat   | Bin 9599 -> 9599 bytes
 tests/data/acpi/q35/DSDT.applesmc   | Bin 8320 -> 8320 bytes
 tests/data/acpi/q35/DSDT.bridge | Bin 10988 -> 10988 bytes
 tests/data/acpi/q35/DSDT.cphp   | Bin 8738 -> 8738 bytes
 tests/data/acpi/q35/DSDT.cxl| Bin 9600 -> 9600 bytes
 tests/data/acpi/q35/DSDT.dimmpxm| Bin 9928 -> 9928 bytes
 tests/data/acpi/q35/DSDT.ipmibt | Bin 8349 -> 8349 bytes
 tests/data/acpi/q35/DSDT.ipmismbus  | Bin 8363 -> 8363 bytes
 tests/data/acpi/q35/DSDT.ivrs   | Bin 8291 -> 8291 bytes
 tests/data/acpi/q35/DSDT.memhp  | Bin 9633 -> 9633 bytes
 tests/data/acpi/q35/DSDT.mmio64 | Bin 9404 -> 9404 bytes
 

[PULL 27/55] tests/acpi: virt: update ACPI GTDT binaries

2022-10-10 Thread Michael S. Tsirkin
From: Miguel Luis 

Step 6 & 7 of the bios-tables-test.c documented procedure.

Differences between disassembled ASL files for GTDT:

@@ -13,14 +13,14 @@
 [000h    4]Signature : "GTDT"    [Generic Timer Description Table]
 [004h 0004   4] Table Length : 0060
 [008h 0008   1] Revision : 02
-[009h 0009   1] Checksum : 8C
+[009h 0009   1] Checksum : 9C
 [00Ah 0010   6]   Oem ID : "BOCHS "
 [010h 0016   8] Oem Table ID : "BXPC"
 [018h 0024   4] Oem Revision : 0001
 [01Ch 0028   4]  Asl Compiler ID : "BXPC"
 [020h 0032   4]Asl Compiler Revision : 0001

-[024h 0036   8]Counter Block Address : 0000000000000000
+[024h 0036   8]Counter Block Address : FFFFFFFFFFFFFFFF
 [02Ch 0044   4] Reserved : 

 [030h 0048   4] Secure EL1 Interrupt : 001D
@@ -46,16 +46,16 @@
 Trigger Mode : 0
 Polarity : 0
Always On : 0
-[050h 0080   8]   Counter Read Block Address : 0000000000000000
+[050h 0080   8]   Counter Read Block Address : FFFFFFFFFFFFFFFF

 [058h 0088   4] Platform Timer Count : 
 [05Ch 0092   4]Platform Timer Offset : 

 Raw Table Data: Length 96 (0x60)

-0000: 47 54 44 54 60 00 00 00 02 8C 42 4F 43 48 53 20  // GTDT`.....BOCHS
+0000: 47 54 44 54 60 00 00 00 02 9C 42 4F 43 48 53 20  // GTDT`.....BOCHS
 0010: 42 58 50 43 20 20 20 20 01 00 00 00 42 58 50 43  // BXPC    ....BXPC
-0020: 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  // ................
+0020: 01 00 00 00 FF FF FF FF FF FF FF FF 00 00 00 00  // ................
 0030: 1D 00 00 00 00 00 00 00 1E 00 00 00 04 00 00 00  // ................
 0040: 1B 00 00 00 00 00 00 00 1A 00 00 00 00 00 00 00  // ................
-0050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  // ................
+0050: FF FF FF FF FF FF FF FF 00 00 00 00 00 00 00 00  // ................

Signed-off-by: Miguel Luis 
Message-Id: <20220920162137.75239-4-miguel.l...@oracle.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
Acked-by: Ani Sinha 
---
 tests/qtest/bios-tables-test-allowed-diff.h |   3 ---
 tests/data/acpi/virt/GTDT   | Bin 96 -> 96 bytes
 tests/data/acpi/virt/GTDT.memhp | Bin 96 -> 96 bytes
 tests/data/acpi/virt/GTDT.numamem   | Bin 96 -> 96 bytes
 4 files changed, 3 deletions(-)

diff --git a/tests/qtest/bios-tables-test-allowed-diff.h b/tests/qtest/bios-tables-test-allowed-diff.h
index 957bd1b4f6..dfb8523c8b 100644
--- a/tests/qtest/bios-tables-test-allowed-diff.h
+++ b/tests/qtest/bios-tables-test-allowed-diff.h
@@ -1,4 +1 @@
 /* List of comma-separated changed AML files to ignore */
-"tests/data/acpi/virt/GTDT",
-"tests/data/acpi/virt/GTDT.memhp",
-"tests/data/acpi/virt/GTDT.numamem",
diff --git a/tests/data/acpi/virt/GTDT b/tests/data/acpi/virt/GTDT
index 9408b71b59c0e0f2991c0053562280155b47bc0b..6f8cb9b8f30b55f4c93fe515982621e3db50feb2 100644
GIT binary patch
delta 45
kcmYdD;BpUf2}xjJU|^avkxPo>KNL*VQ4xT#fs$YV0LH=;ng9R*

delta 45
jcmYdD;BpUf2}xjJU|{N*$R))AWPrg$9Tfo>8%6^Foy!E8

diff --git a/tests/data/acpi/virt/GTDT.memhp b/tests/data/acpi/virt/GTDT.memhp
index 9408b71b59c0e0f2991c0053562280155b47bc0b..6f8cb9b8f30b55f4c93fe515982621e3db50feb2 100644
GIT binary patch
delta 45
kcmYdD;BpUf2}xjJU|^avkxPo>KNL*VQ4xT#fs$YV0LH=;ng9R*

delta 45
jcmYdD;BpUf2}xjJU|{N*$R))AWPrg$9Tfo>8%6^Foy!E8

diff --git a/tests/data/acpi/virt/GTDT.numamem b/tests/data/acpi/virt/GTDT.numamem
index 9408b71b59c0e0f2991c0053562280155b47bc0b..6f8cb9b8f30b55f4c93fe515982621e3db50feb2 100644
GIT binary patch
delta 45
kcmYdD;BpUf2}xjJU|^avkxPo>KNL*VQ4xT#fs$YV0LH=;ng9R*

delta 45
jcmYdD;BpUf2}xjJU|{N*$R))AWPrg$9Tfo>8%6^Foy!E8

-- 
MST
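
As a cross-check of the disassembly diff above: the checksum byte at offset 009h moves from 8C to 9C because sixteen bytes change from 00 to FF, and an ACPI table is valid only when all of its bytes sum to zero modulo 256. Below is a minimal C sketch that uses the 96 bytes of the new raw table data quoted above. The helper names are illustrative assumptions and are not part of QEMU or of bios-tables-test.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* New GTDT contents, copied from the "Raw Table Data" dump (96 bytes). */
static const uint8_t gtdt_new[96] = {
    0x47, 0x54, 0x44, 0x54, 0x60, 0x00, 0x00, 0x00,
    0x02, 0x9C, 0x42, 0x4F, 0x43, 0x48, 0x53, 0x20,
    0x42, 0x58, 0x50, 0x43, 0x20, 0x20, 0x20, 0x20,
    0x01, 0x00, 0x00, 0x00, 0x42, 0x58, 0x50, 0x43,
    0x01, 0x00, 0x00, 0x00, 0xFF, 0xFF, 0xFF, 0xFF,
    0xFF, 0xFF, 0xFF, 0xFF, 0x00, 0x00, 0x00, 0x00,
    0x1D, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
    0x1E, 0x00, 0x00, 0x00, 0x04, 0x00, 0x00, 0x00,
    0x1B, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
    0x1A, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
    0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
};

/* Offset of the checksum byte in the table header (009h in the dump). */
#define ACPI_CHECKSUM_OFFSET 9

/* Sum of all bytes modulo 256; a valid ACPI table sums to zero. */
static uint8_t acpi_byte_sum(const uint8_t *table, size_t len)
{
    uint8_t sum = 0;

    assert(table != NULL);
    for (size_t i = 0; i < len; i++) {
        sum = (uint8_t)(sum + table[i]);
    }
    return sum;
}

/* Checksum byte that would make the table sum to zero (hypothetical helper). */
static uint8_t acpi_expected_checksum(const uint8_t *table, size_t len)
{
    uint8_t others;

    assert(len > ACPI_CHECKSUM_OFFSET);
    /* Sum of every byte except the checksum byte currently stored. */
    others = (uint8_t)(acpi_byte_sum(table, len) - table[ACPI_CHECKSUM_OFFSET]);
    return (uint8_t)(0x100 - others);
}
```

For the new table, acpi_byte_sum(gtdt_new, sizeof(gtdt_new)) returns 0 and acpi_expected_checksum(gtdt_new, sizeof(gtdt_new)) returns 0x9C, matching the Checksum line in the diff.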




[PULL 30/55] qmp: add QMP command x-query-virtio

2022-10-10 Thread Michael S. Tsirkin
From: Laurent Vivier 

This new command lists all the instances of VirtIODevices with
their canonical QOM path and name.

[Jonah: @virtio_list duplicates information that already exists in
 the QOM composition tree. However, extracting necessary information
 from this tree seems to be a bit convoluted.

 Instead, we still create our own list of realized virtio devices
 but use @qmp_qom_get with the device's canonical QOM path to confirm
 that the device exists and is realized. If the device exists but
 is not actually realized, then we remove it from our list (to keep
 it in sync with the QOM composition tree).

 Also, the QMP command @x-query-virtio is redundant as @qom-list
 and @qom-get are sufficient to search '/machine/' for realized
 virtio devices. However, @x-query-virtio is much more convenient
 in listing realized virtio devices.]
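
The pruning step described in the note above can be sketched in isolation. This is a simplified illustration under stated assumptions, not the code in this patch: struct dev_entry, prune_unrealized() and demo_is_realized() are hypothetical names, a plain singly linked list stands in for the device list, and a callback stands in for the @qmp_qom_get lookup.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one entry in the private device list. */
struct dev_entry {
    const char *path;           /* canonical QOM path */
    struct dev_entry *next;
};

/* Hypothetical lookup: true if the path still names a realized device. */
typedef bool (*realized_fn)(const char *path);

/*
 * Unlink every entry whose path no longer resolves to a realized device.
 * Returns the number of entries removed. Entries are only unlinked here;
 * freeing them is left to the caller.
 */
static size_t prune_unrealized(struct dev_entry **head, realized_fn is_realized)
{
    size_t removed = 0;

    assert(head != NULL && is_realized != NULL);
    while (*head != NULL) {
        if (is_realized((*head)->path)) {
            head = &(*head)->next;      /* keep entry, advance */
        } else {
            *head = (*head)->next;      /* unlink stale entry */
            removed++;
        }
    }
    return removed;
}

/*
 * Demo predicate (assumption for illustration only): pretend that only
 * paths under /machine/peripheral/ are still realized.
 */
static bool demo_is_realized(const char *path)
{
    static const char prefix[] = "/machine/peripheral/";

    return strncmp(path, prefix, sizeof(prefix) - 1) == 0;
}
```

With a three-entry list whose middle path is under /machine/peripheral-anon/, prune_unrealized(&head, demo_is_realized) removes exactly that entry and leaves the other two linked in their original order.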

Signed-off-by: Laurent Vivier 
Signed-off-by: Jonah Palmer 
Message-Id: <1660220684-24909-2-git-send-email-jonah.pal...@oracle.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 qapi/qapi-schema.json  |  1 +
 qapi/virtio.json   | 68 ++
 include/hw/virtio/virtio.h |  1 +
 hw/virtio/virtio-stub.c| 14 
 hw/virtio/virtio.c | 44 
 tests/qtest/qmp-cmd-test.c |  1 +
 hw/virtio/meson.build  |  2 ++
 qapi/meson.build   |  1 +
 8 files changed, 132 insertions(+)
 create mode 100644 qapi/virtio.json
 create mode 100644 hw/virtio/virtio-stub.c

diff --git a/qapi/qapi-schema.json b/qapi/qapi-schema.json
index 92d7ecc52c..f000b90744 100644
--- a/qapi/qapi-schema.json
+++ b/qapi/qapi-schema.json
@@ -94,3 +94,4 @@
 { 'include': 'acpi.json' }
 { 'include': 'pci.json' }
 { 'include': 'stats.json' }
+{ 'include': 'virtio.json' }
diff --git a/qapi/virtio.json b/qapi/virtio.json
new file mode 100644
index 00..03896e423f
--- /dev/null
+++ b/qapi/virtio.json
@@ -0,0 +1,68 @@
+# -*- Mode: Python -*-
+# vim: filetype=python
+#
+
+##
+# = Virtio devices
+##
+
+##
+# @VirtioInfo:
+#
+# Basic information about a given VirtIODevice
+#
+# @path: The VirtIODevice's canonical QOM path
+#
+# @name: Name of the VirtIODevice
+#
+# Since: 7.1
+#
+##
+{ 'struct': 'VirtioInfo',
+  'data': { 'path': 'str',
+'name': 'str' } }
+
+##
+# @x-query-virtio:
+#
+# Returns a list of all realized VirtIODevices
+#
+# Features:
+# @unstable: This command is meant for debugging.
+#
+# Returns: List of gathered VirtIODevices
+#
+# Since: 7.1
+#
+# Example:
+#
+# -> { "execute": "x-query-virtio" }
+# <- { "return": [
+#  {
+#  "name": "virtio-input",
+#  "path": "/machine/peripheral-anon/device[4]/virtio-backend"
+#  },
+#  {
+#  "name": "virtio-crypto",
+#  "path": "/machine/peripheral/crypto0/virtio-backend"
+#  },
+#  {
+#  "name": "virtio-scsi",
+#  "path": "/machine/peripheral-anon/device[2]/virtio-backend"
+#  },
+#  {
+#  "name": "virtio-net",
+#  "path": "/machine/peripheral-anon/device[1]/virtio-backend"
+#  },
+#  {
+#  "name": "virtio-serial",
+#  "path": "/machine/peripheral-anon/device[0]/virtio-backend"
+#  }
+#  ]
+#}
+#
+##
+
+{ 'command': 'x-query-virtio',
+  'returns': [ 'VirtioInfo' ],
+  'features': [ 'unstable' ] }
diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
index cecfb7c552..9eeb958e39 100644
--- a/include/hw/virtio/virtio.h
+++ b/include/hw/virtio/virtio.h
@@ -122,6 +122,7 @@ struct VirtIODevice
 bool use_guest_notifier_mask;
 AddressSpace *dma_as;
 QLIST_HEAD(, VirtQueue) *vector_queues;
+QTAILQ_ENTRY(VirtIODevice) next;
 };
 
 struct VirtioDeviceClass {
diff --git a/hw/virtio/virtio-stub.c b/hw/virtio/virtio-stub.c
new file mode 100644
index 00..05a81edc92
--- /dev/null
+++ b/hw/virtio/virtio-stub.c
@@ -0,0 +1,14 @@
+#include "qemu/osdep.h"
+#include "qapi/error.h"
+#include "qapi/qapi-commands-virtio.h"
+
+static void *qmp_virtio_unsupported(Error **errp)
+{
+error_setg(errp, "Virtio is disabled");
+return NULL;
+}
+
+VirtioInfoList *qmp_x_query_virtio(Error **errp)
+{
+return qmp_virtio_unsupported(errp);
+}
diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index 2cc1d7d24a..4fc7c80d3f 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -13,12 +13,18 @@
 
 #include "qemu/osdep.h"
 #include "qapi/error.h"
+#include "qapi/qmp/qdict.h"
+#include "qapi/qapi-commands-virtio.h"
+#include "qapi/qapi-commands-qom.h"
+#include "qapi/qapi-visit-virtio.h"
+#include "qapi/qmp/qjson.h"
 #include "cpu.h"
 #include "trace.h"
 #include "qemu/error-report.h"
 #include "qemu/log.h"
 #include "qemu/main-loop.h"
 #include "qemu/module.h"
+#include "qom/object_interfaces.h"
 #include "hw/virtio/virtio.h"
 #include "migration/qemu-file-types.h"
 #include "qemu/atomic.h"
@@ -29,6 +35,9 @@
 

[PULL 31/55] qmp: add QMP command x-query-virtio-status

2022-10-10 Thread Michael S. Tsirkin
From: Laurent Vivier 

This new command shows the status of a VirtIODevice, including
its corresponding vhost device's status (if active).

The next patch will improve the output by decoding feature bits, including
the vhost device's feature bits (backend, protocol, acked, and features).
It will also decode the status bits of a VirtIODevice.

[Jonah: From patch v12; added a check to @virtio_device_find to ensure
 synchronicity between @virtio_list and the devices in the QOM
 composition tree.]

Signed-off-by: Laurent Vivier 
Signed-off-by: Jonah Palmer 
Message-Id: <1660220684-24909-3-git-send-email-jonah.pal...@oracle.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 qapi/virtio.json| 222 
 hw/virtio/virtio-stub.c |   5 +
 hw/virtio/virtio.c  | 104 +++
 3 files changed, 331 insertions(+)

diff --git a/qapi/virtio.json b/qapi/virtio.json
index 03896e423f..c86b3bc635 100644
--- a/qapi/virtio.json
+++ b/qapi/virtio.json
@@ -66,3 +66,225 @@
 { 'command': 'x-query-virtio',
   'returns': [ 'VirtioInfo' ],
   'features': [ 'unstable' ] }
+
+##
+# @VhostStatus:
+#
+# Information about a vhost device. This information will only be
+# displayed if the vhost device is active.
+#
+# @n-mem-sections: vhost_dev n_mem_sections
+#
+# @n-tmp-sections: vhost_dev n_tmp_sections
+#
+# @nvqs: vhost_dev nvqs (number of virtqueues being used)
+#
+# @vq-index: vhost_dev vq_index
+#
+# @features: vhost_dev features
+#
+# @acked-features: vhost_dev acked_features
+#
+# @backend-features: vhost_dev backend_features
+#
+# @protocol-features: vhost_dev protocol_features
+#
+# @max-queues: vhost_dev max_queues
+#
+# @backend-cap: vhost_dev backend_cap
+#
+# @log-enabled: vhost_dev log_enabled flag
+#
+# @log-size: vhost_dev log_size
+#
+# Since: 7.1
+#
+##
+
+{ 'struct': 'VhostStatus',
+  'data': { 'n-mem-sections': 'int',
+'n-tmp-sections': 'int',
+'nvqs': 'uint32',
+'vq-index': 'int',
+'features': 'uint64',
+'acked-features': 'uint64',
+'backend-features': 'uint64',
+'protocol-features': 'uint64',
+'max-queues': 'uint64',
+'backend-cap': 'uint64',
+'log-enabled': 'bool',
+'log-size': 'uint64' } }
+
+##
+# @VirtioStatus:
+#
+# Full status of the virtio device with most VirtIODevice members.
+# Also includes the full status of the corresponding vhost device
+# if the vhost device is active.
+#
+# @name: VirtIODevice name
+#
+# @device-id: VirtIODevice ID
+#
+# @vhost-started: VirtIODevice vhost_started flag
+#
+# @guest-features: VirtIODevice guest_features
+#
+# @host-features: VirtIODevice host_features
+#
+# @backend-features: VirtIODevice backend_features
+#
+# @device-endian: VirtIODevice device_endian
+#
+# @num-vqs: VirtIODevice virtqueue count. This is the number of active
+#   virtqueues being used by the VirtIODevice.
+#
+# @status: VirtIODevice configuration status (VirtioDeviceStatus)
+#
+# @isr: VirtIODevice ISR
+#
+# @queue-sel: VirtIODevice queue_sel
+#
+# @vm-running: VirtIODevice vm_running flag
+#
+# @broken: VirtIODevice broken flag
+#
+# @disabled: VirtIODevice disabled flag
+#
+# @use-started: VirtIODevice use_started flag
+#
+# @started: VirtIODevice started flag
+#
+# @start-on-kick: VirtIODevice start_on_kick flag
+#
+# @disable-legacy-check: VirtIODevice disabled_legacy_check flag
+#
+# @bus-name: VirtIODevice bus_name
+#
+# @use-guest-notifier-mask: VirtIODevice use_guest_notifier_mask flag
+#
+# @vhost-dev: Corresponding vhost device info for a given VirtIODevice.
+# Present if the given VirtIODevice has an active vhost
+# device.
+#
+# Since: 7.1
+#
+##
+
+{ 'struct': 'VirtioStatus',
+  'data': { 'name': 'str',
+'device-id': 'uint16',
+'vhost-started': 'bool',
+'device-endian': 'str',
+'guest-features': 'uint64',
+'host-features': 'uint64',
+'backend-features': 'uint64',
+'num-vqs': 'int',
+'status': 'uint8',
+'isr': 'uint8',
+'queue-sel': 'uint16',
+'vm-running': 'bool',
+'broken': 'bool',
+'disabled': 'bool',
+'use-started': 'bool',
+'started': 'bool',
+'start-on-kick': 'bool',
+'disable-legacy-check': 'bool',
+'bus-name': 'str',
+'use-guest-notifier-mask': 'bool',
+'*vhost-dev': 'VhostStatus' } }
+
+##
+# @x-query-virtio-status:
+#
+# Poll for a comprehensive status of a given virtio device
+#
+# @path: Canonical QOM path of the VirtIODevice
+#
+# Features:
+# @unstable: This command is meant for debugging.
+#
+# Returns: VirtioStatus of the virtio device
+#
+# Since: 7.1
+#
+# Examples:
+#
+# 1. Poll for the status of virtio-crypto (no vhost-crypto active)
+#
+# -> { "execute": "x-query-virtio-status",
+#  "arguments": { 
