Re: [PATCH v2] input-linux: Add option to not grab a device upon guest startup

2024-03-21 Thread Markus Armbruster
Justinien Bouron  writes:

> Depending on your use-case, it might be inconvenient to have qemu grab
> the input device from the host immediately upon starting the guest.
>
> Added a new bool option to input-linux: grab-on-startup. If true, the
> device is grabbed as soon as the guest is started, otherwise it is not
> grabbed until the toggle combination is entered. To avoid breaking
> existing setups, the default value of grab-on-startup is true, i.e. same
> behaviour as before this change.
>
> Signed-off-by: Justinien Bouron 

QAPI schema
Acked-by: Markus Armbruster 
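
The intended grab state machine can be sketched as follows (illustrative Python, not the QEMU implementation; only the option name and its default come from the patch description):

```python
class InputLinuxGrab:
    def __init__(self, grab_on_startup=True):
        # default True keeps the pre-patch behaviour: grab immediately
        self.grabbed = grab_on_startup

    def toggle(self):
        # entered via the grab-toggle key combination
        self.grabbed = not self.grabbed

assert InputLinuxGrab().grabbed                        # existing setups unchanged
assert not InputLinuxGrab(grab_on_startup=False).grabbed  # deferred grab
```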




Re: support for having both 32 and 64 bit RISC-V CPUs in one QEMU machine

2024-03-21 Thread Alistair Francis
On Fri, Feb 16, 2024 at 6:50 AM Igor Lesik  wrote:
>
> Hi,
>
> I have a situation where I need to use a third-party 32-bit RISC-V CPU while the
> rest are all 64-bit RISC-V CPUs. I have seen that some steps were already made in
> the direction to enable such configuration 
> (https://riscv.org/blog/2023/01/run-32-bit-applications-on-64-bit-linux-kernel-liu-zhiwei-guo-ren-t-head-division-of-alibaba-cloud/),
>  I am wondering if someone can shed more light on it.

I assume you want to model a number of 64-bit RISC-V CPUs and a 32-bit
RISC-V CPU (for power control or something like that) in QEMU?

That currently isn't possible. There are minimal efforts to support
creating and running a 32-bit RISC-V CPU with the 64-bit binary, but
that currently doesn't work. I don't think anyone is actively working
on it either. The next step of running both at the same time shouldn't
be too hard after that. It's already possible for 32/64-bit ARM CPUs.

Alistair

>
> Thanks,
> Igor
>



Re: [PATCH] hw/intc: Update APLIC IDC after claiming iforce register

2024-03-21 Thread Alistair Francis
On Thu, Mar 21, 2024 at 8:50 PM  wrote:
>
> From: Frank Chang 
>
> Currently, QEMU only sets the iforce register to 0 and returns early
> when claiming the iforce register. However, this may leave mip.meip
> stuck at 1 if a spurious external interrupt triggered by the iforce
> register is the only pending interrupt to be claimed, and the interrupt
> cannot be lowered as expected.
>
> This commit fixes this issue by calling riscv_aplic_idc_update() to
> update the IDC status after the iforce register is claimed.
>
> Signed-off-by: Frank Chang 
> Reviewed-by: Jim Shu 

Thanks!

Applied to riscv-to-apply.next

Alistair

> ---
>  hw/intc/riscv_aplic.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/hw/intc/riscv_aplic.c b/hw/intc/riscv_aplic.c
> index 6a7fbfa861..fc5df0d598 100644
> --- a/hw/intc/riscv_aplic.c
> +++ b/hw/intc/riscv_aplic.c
> @@ -488,6 +488,7 @@ static uint32_t riscv_aplic_idc_claimi(RISCVAPLICState 
> *aplic, uint32_t idc)
>
>  if (!topi) {
>  aplic->iforce[idc] = 0;
> +riscv_aplic_idc_update(aplic, idc);
>  return 0;
>  }
>
> --
> 2.43.2
>
>
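
The effect of the one-line fix can be sketched like so (illustrative Python, heavily simplified; class and field names are invented for the sketch, not QEMU's):

```python
class AplicIdc:
    def __init__(self):
        self.iforce = 0
        self.pending = set()   # pending wired interrupt sources
        self.meip = 0          # machine external interrupt pin (mip.meip)

    def update(self):
        # recompute the external-interrupt pin from current IDC state
        self.meip = 1 if (self.iforce or self.pending) else 0

    def claimi(self):
        if not self.pending:   # topi == 0: spurious interrupt via iforce
            self.iforce = 0
            self.update()      # the fix: refresh the IDC so meip can drop
            return 0
        irq = min(self.pending)
        self.pending.discard(irq)
        self.update()
        return irq

idc = AplicIdc()
idc.iforce = 1
idc.update()
assert idc.meip == 1           # spurious external interrupt raised
assert idc.claimi() == 0
assert idc.meip == 0           # without the update() call, meip would stay 1
```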



Re: [PATCH v4 2/2] vhost: Perform memory section dirty scans once per iteration

2024-03-21 Thread Jason Wang
On Fri, Mar 22, 2024 at 5:43 AM Si-Wei Liu  wrote:
>
>
>
> On 3/20/2024 8:56 PM, Jason Wang wrote:
> > On Thu, Mar 21, 2024 at 5:03 AM Si-Wei Liu  wrote:
> >>
> >>
> >> On 3/19/2024 8:27 PM, Jason Wang wrote:
> >>> On Tue, Mar 19, 2024 at 6:16 AM Si-Wei Liu  wrote:
> 
>  On 3/17/2024 8:22 PM, Jason Wang wrote:
> > On Sat, Mar 16, 2024 at 2:45 AM Si-Wei Liu  
> > wrote:
> >> On 3/14/2024 9:03 PM, Jason Wang wrote:
> >>> On Fri, Mar 15, 2024 at 5:39 AM Si-Wei Liu  
> >>> wrote:
>  On setups with one or more virtio-net devices with vhost on,
>  the cost of each dirty tracking iteration increases with the
>  number of queues set up; e.g. on idle guest migration the
>  following is observed with virtio-net with vhost=on:
> 
>  48 queues -> 78.11%  [.] vhost_dev_sync_region.isra.13
>  8 queues -> 40.50%   [.] vhost_dev_sync_region.isra.13
>  1 queue -> 6.89% [.] vhost_dev_sync_region.isra.13
>  2 devices, 1 queue -> 18.60%  [.] vhost_dev_sync_region.isra.14
> 
>  With high memory rates the symptom is lack of convergence as soon
>  as there is a vhost device with a sufficiently high number of
>  queues, or a sufficient number of vhost devices.
> 
>  On every migration iteration (every 100msecs) it will redundantly
>  query the *shared log* as many times as there are vhost queues
>  configured in the guest. For the virtqueue data, this is necessary,
>  but not for the memory sections, which are the same. So essentially
>  we end up scanning the dirty log too often.
> 
>  To fix that, select a vhost device responsible for scanning the
>  log with regards to memory sections dirty tracking. It is selected
>  when we enable the logger (during migration) and cleared when we
>  disable the logger. If the vhost logger device goes away for some
>  reason, the logger will be re-selected from the rest of vhost
>  devices.
> 
>  After making mem-section logger a singleton instance, constant cost
>  of 7%-9% (like the 1 queue report) will be seen, no matter how many
>  queues or how many vhost devices are configured:
> 
>  48 queues -> 8.71%[.] vhost_dev_sync_region.isra.13
>  2 devices, 8 queues -> 7.97%   [.] vhost_dev_sync_region.isra.14
> 
>  Co-developed-by: Joao Martins 
>  Signed-off-by: Joao Martins 
>  Signed-off-by: Si-Wei Liu 
> 
>  ---
>  v3 -> v4:
>    - add comment to clarify effect on cache locality and
>  performance
> 
>  v2 -> v3:
>    - add after-fix benchmark to commit log
>    - rename vhost_log_dev_enabled to vhost_dev_should_log
>    - remove unneeded comparisons for backend_type
>    - use QLIST array instead of single flat list to store vhost
>  logger devices
>    - simplify logger election logic
>  ---
>   hw/virtio/vhost.c | 67 
>  ++-
>   include/hw/virtio/vhost.h |  1 +
>   2 files changed, 62 insertions(+), 6 deletions(-)
> 
>  diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
>  index 612f4db..58522f1 100644
>  --- a/hw/virtio/vhost.c
>  +++ b/hw/virtio/vhost.c
>  @@ -45,6 +45,7 @@
> 
>   static struct vhost_log *vhost_log[VHOST_BACKEND_TYPE_MAX];
>   static struct vhost_log *vhost_log_shm[VHOST_BACKEND_TYPE_MAX];
>  +static QLIST_HEAD(, vhost_dev) 
>  vhost_log_devs[VHOST_BACKEND_TYPE_MAX];
> 
>   /* Memslots used by backends that support private memslots 
>  (without an fd). */
>   static unsigned int used_memslots;
>  @@ -149,6 +150,47 @@ bool vhost_dev_has_iommu(struct vhost_dev *dev)
>   }
>   }
> 
>  +static inline bool vhost_dev_should_log(struct vhost_dev *dev)
>  +{
>  +assert(dev->vhost_ops);
>  +assert(dev->vhost_ops->backend_type > VHOST_BACKEND_TYPE_NONE);
>  +assert(dev->vhost_ops->backend_type < VHOST_BACKEND_TYPE_MAX);
>  +
>  +return dev == 
>  QLIST_FIRST(&vhost_log_devs[dev->vhost_ops->backend_type]);
> >>> A dumb question, why not simple check
> >>>
> >>> dev->log == vhost_log_shm[dev->vhost_ops->backend_type]
> >> Because we are not sure if the logger comes from vhost_log_shm[] or
> >> vhost_log[]. Don't want to complicate the check here by calling into
> >> vhost_dev_log_is_shared() every time .log_sync() is called.
> > It has very low overhead, isn't it?
>  Whether this has low overhead will have to depend on 
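
The logger-election scheme described in the commit message can be sketched as (illustrative Python; a plain list stands in for QEMU's per-backend QLIST):

```python
loggers = []   # stand-in for the per-backend vhost_log_devs QLIST

def log_enable(dev):
    if dev not in loggers:
        loggers.append(dev)    # first device to enable logging becomes head

def log_disable(dev):
    if dev in loggers:
        loggers.remove(dev)    # removing the head implicitly re-elects

def should_log(dev):
    # only the head of the list scans the shared memory-section dirty log
    return bool(loggers) and dev == loggers[0]

log_enable("vhost0"); log_enable("vhost1"); log_enable("vhost2")
assert should_log("vhost0")
assert not should_log("vhost1") and not should_log("vhost2")
log_disable("vhost0")          # the logger device goes away...
assert should_log("vhost1")    # ...and the next one takes over
```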

Re: [RFC PATCH v9 06/23] target/arm: Add support for Non-maskable Interrupt

2024-03-21 Thread Jinjie Ruan via



On 2024/3/22 2:28, Peter Maydell wrote:
> On Thu, 21 Mar 2024 at 15:46, Peter Maydell  wrote:
>> Something somewhere needs to implement "if SCTLR_ELx.NMI is 0 then
>> we don't take EXCP_VINMI etc but instead (maybe) EXCP_VIRQ etc".
>> At the moment nothing does that:
>>  * arm_cpu_update_vinmi() doesn't look at the NMI bit before
>>deciding whether to set CPU_INTERRUPT_VINMI
>>  * in arm_excp_unmasked() if NMI is 0 then allIntMask takes its
>>default value of false and so arm_excp_unmasked() returns true,
>>so VINMI is not masked
>>  * arm_cpu_exec_interrupt() doesn't look at the NMI bit before
>>deciding whether to check the CPU_INTERRUPT_VINMI bit in interrupt_request
>>
>> So even if SCTLR_ELx.NMI is 0 we'll still try to take a VINMI
>> if it's set up in the HCR_EL2 bits.
>>
>> However we do this the required behaviour is that if NMI is 0
>> then it is as if the interrupt doesn't have superpriority and
>> it falls back to being handled as an ordinary IRQ, VIRQ, VFIQ etc.
>> I think the best place to do this is probably here in
>> arm_cpu_exec_interrupt() -- if SCTLR_ELx.NMI isn't set then
>> treat the VFNMI bit like VFIQ, the VINMI bit like VIRQ, and
>> the NMI bit like IRQ.
> 
> Folding in something like this I think will work:
> 
> diff --git a/target/arm/cpu.c b/target/arm/cpu.c
> index 91c2896de0f..797ae3eb805 100644
> --- a/target/arm/cpu.c
> +++ b/target/arm/cpu.c
> @@ -837,7 +837,8 @@ static bool arm_cpu_exec_interrupt(CPUState *cs,
> int interrupt_request)
> 
>  /* The prioritization of interrupts is IMPLEMENTATION DEFINED. */
> 
> -if (cpu_isar_feature(aa64_nmi, env_archcpu(env))) {
> +if (cpu_isar_feature(aa64_nmi, env_archcpu(env)) &&
> +(arm_sctlr(env, cur_el) & SCTLR_NMI)) {
>  if (interrupt_request & CPU_INTERRUPT_NMI) {
>  excp_idx = EXCP_NMI;
>  target_el = arm_phys_excp_target_el(cs, excp_idx, cur_el, 
> secure);
> @@ -862,7 +863,22 @@ static bool arm_cpu_exec_interrupt(CPUState *cs,
> int interrupt_request)
>  goto found;
>  }
>  }
> +} else {
> +/*
> + * NMI disabled: interrupts with superpriority are handled
> + * as if they didn't have it
> + */
> +if (interrupt_request & CPU_INTERRUPT_NMI) {
> +interrupt_request |= CPU_INTERRUPT_HARD;

Here CPU_INTERRUPT_NMI and CPU_INTERRUPT_HARD are set simultaneously;
should CPU_INTERRUPT_NMI be cleared?

> +}
> +if (interrupt_request & CPU_INTERRUPT_VINMI) {
> +interrupt_request |= CPU_INTERRUPT_VIRQ;
> +}
> +if (interrupt_request & CPU_INTERRUPT_VFNMI) {
> +interrupt_request |= CPU_INTERRUPT_VFIQ;
> +}
>  }
> +
>  if (interrupt_request & CPU_INTERRUPT_FIQ) {
>  excp_idx = EXCP_FIQ;
>  target_el = arm_phys_excp_target_el(cs, excp_idx, cur_el, secure);
> 
> 
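
The demotion logic in the hunk above can be condensed into a small sketch (illustrative Python; the bit values are invented for the sketch and are not QEMU's definitions):

```python
# Interrupt-request bits (illustrative values only)
NMI, VINMI, VFNMI = 1 << 0, 1 << 1, 1 << 2
HARD, VIRQ, VFIQ  = 1 << 3, 1 << 4, 1 << 5

def demote_superpriority(req, nmi_enabled):
    """When SCTLR_ELx.NMI is 0, interrupts with superpriority are
    handled as if they didn't have it."""
    if not nmi_enabled:
        if req & NMI:
            req |= HARD    # NMI falls back to an ordinary IRQ
        if req & VINMI:
            req |= VIRQ    # VINMI falls back to VIRQ
        if req & VFNMI:
            req |= VFIQ    # VFNMI falls back to VFIQ
    return req

assert demote_superpriority(VINMI, nmi_enabled=False) & VIRQ
assert not demote_superpriority(VINMI, nmi_enabled=True) & VIRQ
```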
>> What semantics do we intend for the VINMI/VFNMI bits in interrupt_request
>> and for the incoming IRQ, FIQ, NMI lines? The GIC spec suggests
>> (but doesn't mandate) that NMI could be signalled by asserting
>> both NMI and IRQ, and plain IRQ by asserting just IRQ (table 4-6
>> in the GIC spec). I think the GIC changes in this patchset assert
>> only the NMI line for an IRQNMI, and not both NMI and IRQ. That's OK
>> and I think makes more sense for QEMU than signalling both lines,
>> but it's not the same as what we wind up doing with the handling
>> of the HCR_EL2 bits in these functions, because you don't change
>> the existing arm_cpu_update_virq() so that it only sets the
>> CPU_INTERRUPT_VIRQ bit if this is a VIRQ and not a VIRQNMI.
>> So if the guest sets HCR_EL2.VI and HCRX_EL2.VINMI then
>> arm_cpu_update_virq() will say "this is a VIRQ" and also
>> arm_cpu_update_vinmi() will say "This is a VINMI" and so both bits
>> get set in the interrupt_request field.
>>
>> I think the fix for this is probably to have arm_cpu_update_virq()
>> and arm_cpu_update_vfiq() check that this is not a VINMI/VFNMI,
>> so we only set 1 bit in interrupt_request, not 2.
> 
> And for this a change like:
> 
> diff --git a/target/arm/cpu.c b/target/arm/cpu.c
> index 03a48a41366..91c2896de0f 100644
> --- a/target/arm/cpu.c
> +++ b/target/arm/cpu.c
> @@ -926,7 +926,8 @@ void arm_cpu_update_virq(ARMCPU *cpu)
>  CPUARMState *env = &cpu->env;
>  CPUState *cs = CPU(cpu);
> 
> -bool new_state = (env->cp15.hcr_el2 & HCR_VI) ||
> +bool new_state = ((arm_hcr_el2_eff(env) & HCR_VI) &&
> +  !(arm_hcrx_el2_eff(env) & HCRX_VINMI)) ||
>  (env->irq_line_state & CPU_INTERRUPT_VIRQ);
> 
>  if (new_state != ((cs->interrupt_request & CPU_INTERRUPT_VIRQ) != 0)) {
> @@ -947,7 +948,8 @@ void arm_cpu_update_vfiq(ARMCPU *cpu)
>  CPUARMState *env = &cpu->env;
>  CPUState *cs = CPU(cpu);
> 
> -bool new_state = (env->cp15.hcr_el2 & HCR_VF) ||
> +bool new_state = ((arm_hcr_el2_eff(env) & HCR_VF) &&
> +  

Re: [RFC v2 2/2] hw/riscv: Add server platform reference machine

2024-03-21 Thread Alistair Francis
On Tue, Mar 12, 2024 at 11:55 PM Fei Wu  wrote:
>
> The RISC-V Server Platform specification[1] defines a standardized set
> of hardware and software capabilities that portable system software,
> such as OSes and hypervisors, can rely on being present in a RISC-V
> server platform.
>
> A corresponding QEMU RISC-V server platform reference (rvsp-ref for
> short) machine type is added to provide an environment for firmware/OS
> development and testing. The main features included in rvsp-ref are:
>
>  - Based on riscv virt machine type
>  - A new memory map kept as close to the virt machine's as possible
>  - A new virt CPU type rvsp-ref-cpu for server platform compliance
>  - AIA
>  - PCIe AHCI
>  - PCIe NIC
>  - No virtio device
>  - No fw_cfg device
>  - No ACPI table provided
>  - Only minimal device tree nodes
>
> [1] https://github.com/riscv-non-isa/riscv-server-platform
>
> Signed-off-by: Fei Wu 
> ---
>  configs/devices/riscv64-softmmu/default.mak |1 +
>  hw/riscv/Kconfig|   12 +
>  hw/riscv/meson.build|1 +
>  hw/riscv/server_platform_ref.c  | 1276 +++
>  4 files changed, 1290 insertions(+)
>  create mode 100644 hw/riscv/server_platform_ref.c
>
> diff --git a/configs/devices/riscv64-softmmu/default.mak 
> b/configs/devices/riscv64-softmmu/default.mak
> index 3f68059448..a1d98e49ef 100644
> --- a/configs/devices/riscv64-softmmu/default.mak
> +++ b/configs/devices/riscv64-softmmu/default.mak
> @@ -10,5 +10,6 @@ CONFIG_SPIKE=y
>  CONFIG_SIFIVE_E=y
>  CONFIG_SIFIVE_U=y
>  CONFIG_RISCV_VIRT=y
> +CONFIG_SERVER_PLATFORM_REF=y
>  CONFIG_MICROCHIP_PFSOC=y
>  CONFIG_SHAKTI_C=y
> diff --git a/hw/riscv/Kconfig b/hw/riscv/Kconfig
> index 5d644eb7b1..5674589e66 100644
> --- a/hw/riscv/Kconfig
> +++ b/hw/riscv/Kconfig
> @@ -48,6 +48,18 @@ config RISCV_VIRT
>  select ACPI
>  select ACPI_PCI
>
> +config SERVER_PLATFORM_REF
> +bool
> +select RISCV_NUMA
> +select GOLDFISH_RTC
> +select PCI
> +select PCI_EXPRESS_GENERIC_BRIDGE
> +select PFLASH_CFI01
> +select SERIAL
> +select RISCV_ACLINT
> +select RISCV_APLIC
> +select RISCV_IMSIC
> +
>  config SHAKTI_C
>  bool
>  select RISCV_ACLINT
> diff --git a/hw/riscv/meson.build b/hw/riscv/meson.build
> index 2f7ee81be3..bb3aff91ea 100644
> --- a/hw/riscv/meson.build
> +++ b/hw/riscv/meson.build
> @@ -4,6 +4,7 @@ riscv_ss.add(when: 'CONFIG_RISCV_NUMA', if_true: 
> files('numa.c'))
>  riscv_ss.add(files('riscv_hart.c'))
>  riscv_ss.add(when: 'CONFIG_OPENTITAN', if_true: files('opentitan.c'))
>  riscv_ss.add(when: 'CONFIG_RISCV_VIRT', if_true: files('virt.c'))
> +riscv_ss.add(when: 'CONFIG_SERVER_PLATFORM_REF', if_true: 
> files('server_platform_ref.c'))
>  riscv_ss.add(when: 'CONFIG_SHAKTI_C', if_true: files('shakti_c.c'))
>  riscv_ss.add(when: 'CONFIG_SIFIVE_E', if_true: files('sifive_e.c'))
>  riscv_ss.add(when: 'CONFIG_SIFIVE_U', if_true: files('sifive_u.c'))
> diff --git a/hw/riscv/server_platform_ref.c b/hw/riscv/server_platform_ref.c
> new file mode 100644
> index 00..b552650265
> --- /dev/null
> +++ b/hw/riscv/server_platform_ref.c
> @@ -0,0 +1,1276 @@
> +/*
> + * QEMU RISC-V Server Platform (RVSP) Reference Board
> + *
> + * Copyright (c) 2024 Intel, Inc.
> + *
> + * This board is compliant with the RISC-V Server Platform specification
> + * and leverages a lot of the riscv virt machine code.
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms and conditions of the GNU General Public License,
> + * version 2 or later, as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope it will be useful, but WITHOUT
> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> + * more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program.  If not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#include "qemu/osdep.h"
> +#include "qemu/units.h"
> +#include "qemu/error-report.h"
> +#include "qemu/guest-random.h"
> +#include "qapi/error.h"
> +#include "qapi/qapi-visit-common.h"
> +#include "hw/boards.h"
> +#include "hw/loader.h"
> +#include "hw/sysbus.h"
> +#include "hw/qdev-properties.h"
> +#include "hw/char/serial.h"
> +#include "hw/block/flash.h"
> +#include "hw/ide/pci.h"
> +#include "hw/ide/ahci-pci.h"
> +#include "hw/pci/pci.h"
> +#include "hw/pci-host/gpex.h"
> +#include "hw/core/sysbus-fdt.h"
> +#include "hw/riscv/riscv_hart.h"
> +#include "hw/riscv/boot.h"
> +#include "hw/riscv/numa.h"
> +#include "hw/intc/riscv_aclint.h"
> +#include "hw/intc/riscv_aplic.h"
> +#include "hw/intc/riscv_imsic.h"
> +#include "chardev/char.h"
> +#include "sysemu/device_tree.h"
> +#include "sysemu/runstate.h"
> +#include "sysemu/sysemu.h"
> +#include "sysemu/tcg.h"
> +#include "target/riscv/cpu.h"
> +#include 

Re: [PULL 0/9] target/hppa fixes for 9.0

2024-03-21 Thread Michael Tokarev

21.03.2024 21:32, Helge Deller wrote:

On 3/21/24 19:25, Sven Schnelle wrote:

Michael Tokarev  writes:


20.03.2024 03:32, Richard Henderson wrote:


Richard Henderson (3):
    target/hppa: Fix assemble_16 insns for wide mode
    target/hppa: Fix assemble_11a insns for wide mode
    target/hppa: Fix assemble_12a insns for wide mode
Sven Schnelle (6):
    target/hppa: ldcw,s uses static shift of 3
    target/hppa: fix shrp for wide mode
    target/hppa: fix access_id check
    target/hppa: exit tb on flush cache instructions
    target/hppa: mask privilege bits in mfia
    target/hppa: fix do_stdby_e()


Is it all -stable material (when appropriate)?


I'd say yes.


Yes.


Picked all 9 for stable-8.2.

And none for stable-7.2.  There, just one of them applies.

I understand most of them can still be applied (it is just adding
new lines here and there; the same lines need to be added to 7.2,
but there the context is missing, so every patch needs manual
applying, which I'm not feeling confident doing).  If anything of
that is really good to have in 7.2 (which has de facto become an
LTS series), please re-spin it on top of the stable-7.2 branch and
send the result to qemu-stable@.

Thanks,

/mjt



Re: [PATCH v2 1/1] target/riscv/kvm: fix timebase-frequency when using KVM acceleration

2024-03-21 Thread Alistair Francis
On Thu, Mar 14, 2024 at 4:17 PM Yong-Xuan Wang  wrote:
>
> The timebase-frequency of the guest OS should be the same as the
> host machine's. The timebase-frequency value in the DTS should be
> obtained from the hypervisor when using KVM acceleration.
>
> Reviewed-by: Andrew Jones 
> Signed-off-by: Yong-Xuan Wang 

Thanks!

Applied to riscv-to-apply.next

Alistair

>
> ---
> Changelog
> v2:
> - update the function definition
> - restructure if-else statement
> ---
>  hw/riscv/virt.c  | 2 ++
>  target/riscv/kvm/kvm-cpu.c   | 9 +
>  target/riscv/kvm/kvm_riscv.h | 1 +
>  3 files changed, 12 insertions(+)
>
> diff --git a/hw/riscv/virt.c b/hw/riscv/virt.c
> index a094af97c32a..533b17799581 100644
> --- a/hw/riscv/virt.c
> +++ b/hw/riscv/virt.c
> @@ -711,6 +711,8 @@ static void create_fdt_sockets(RISCVVirtState *s, const 
> MemMapEntry *memmap,
>
>  qemu_fdt_add_subnode(ms->fdt, "/cpus");
>  qemu_fdt_setprop_cell(ms->fdt, "/cpus", "timebase-frequency",
> +  kvm_enabled() ?
> +  kvm_riscv_get_timebase_frequency(first_cpu) :
>RISCV_ACLINT_DEFAULT_TIMEBASE_FREQ);
>  qemu_fdt_setprop_cell(ms->fdt, "/cpus", "#size-cells", 0x0);
>  qemu_fdt_setprop_cell(ms->fdt, "/cpus", "#address-cells", 0x1);
> diff --git a/target/riscv/kvm/kvm-cpu.c b/target/riscv/kvm/kvm-cpu.c
> index c7afdb1e81b7..bbb115eaa867 100644
> --- a/target/riscv/kvm/kvm-cpu.c
> +++ b/target/riscv/kvm/kvm-cpu.c
> @@ -739,6 +739,15 @@ static void kvm_riscv_put_regs_timer(CPUState *cs)
>  env->kvm_timer_dirty = false;
>  }
>
> +uint64_t kvm_riscv_get_timebase_frequency(CPUState *cs)
> +{
> +uint64_t reg;
> +
> +KVM_RISCV_GET_TIMER(cs, frequency, reg);
> +
> +return reg;
> +}
> +
>  static int kvm_riscv_get_regs_vector(CPUState *cs)
>  {
>  RISCVCPU *cpu = RISCV_CPU(cs);
> diff --git a/target/riscv/kvm/kvm_riscv.h b/target/riscv/kvm/kvm_riscv.h
> index 4bd98fddc776..58518988681d 100644
> --- a/target/riscv/kvm/kvm_riscv.h
> +++ b/target/riscv/kvm/kvm_riscv.h
> @@ -28,5 +28,6 @@ void kvm_riscv_aia_create(MachineState *machine, uint64_t 
> group_shift,
>  void riscv_kvm_aplic_request(void *opaque, int irq, int level);
>  int kvm_riscv_sync_mpstate_to_kvm(RISCVCPU *cpu, int state);
>  void riscv_kvm_cpu_finalize_features(RISCVCPU *cpu, Error **errp);
> +uint64_t kvm_riscv_get_timebase_frequency(CPUState *cs);
>
>  #endif
> --
> 2.17.1
>
>
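
The selection made in the create_fdt_sockets() hunk boils down to a one-line decision, sketched here in Python (illustrative; the 10 MHz default mirrors RISCV_ACLINT_DEFAULT_TIMEBASE_FREQ, but treat the exact value as an assumption):

```python
RISCV_ACLINT_DEFAULT_TIMEBASE_FREQ = 10_000_000  # assumed default for the sketch

def timebase_frequency(kvm_enabled, host_freq):
    # With KVM the guest counts against the host timer, so the DT must
    # advertise the host's real frequency rather than the emulated default.
    return host_freq if kvm_enabled else RISCV_ACLINT_DEFAULT_TIMEBASE_FREQ

assert timebase_frequency(True, 25_000_000) == 25_000_000
assert timebase_frequency(False, 25_000_000) == 10_000_000
```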



Re: [PATCH v2 1/1] target/riscv/kvm: fix timebase-frequency when using KVM acceleration

2024-03-21 Thread Alistair Francis
On Thu, Mar 14, 2024 at 4:17 PM Yong-Xuan Wang  wrote:
>
> The timebase-frequency of the guest OS should be the same as the
> host machine's. The timebase-frequency value in the DTS should be
> obtained from the hypervisor when using KVM acceleration.
>
> Reviewed-by: Andrew Jones 
> Signed-off-by: Yong-Xuan Wang 

Reviewed-by: Alistair Francis 

Alistair

>
> ---
> Changelog
> v2:
> - update the function definition
> - restructure if-else statement
> ---
>  hw/riscv/virt.c  | 2 ++
>  target/riscv/kvm/kvm-cpu.c   | 9 +
>  target/riscv/kvm/kvm_riscv.h | 1 +
>  3 files changed, 12 insertions(+)
>
> diff --git a/hw/riscv/virt.c b/hw/riscv/virt.c
> index a094af97c32a..533b17799581 100644
> --- a/hw/riscv/virt.c
> +++ b/hw/riscv/virt.c
> @@ -711,6 +711,8 @@ static void create_fdt_sockets(RISCVVirtState *s, const 
> MemMapEntry *memmap,
>
>  qemu_fdt_add_subnode(ms->fdt, "/cpus");
>  qemu_fdt_setprop_cell(ms->fdt, "/cpus", "timebase-frequency",
> +  kvm_enabled() ?
> +  kvm_riscv_get_timebase_frequency(first_cpu) :
>RISCV_ACLINT_DEFAULT_TIMEBASE_FREQ);
>  qemu_fdt_setprop_cell(ms->fdt, "/cpus", "#size-cells", 0x0);
>  qemu_fdt_setprop_cell(ms->fdt, "/cpus", "#address-cells", 0x1);
> diff --git a/target/riscv/kvm/kvm-cpu.c b/target/riscv/kvm/kvm-cpu.c
> index c7afdb1e81b7..bbb115eaa867 100644
> --- a/target/riscv/kvm/kvm-cpu.c
> +++ b/target/riscv/kvm/kvm-cpu.c
> @@ -739,6 +739,15 @@ static void kvm_riscv_put_regs_timer(CPUState *cs)
>  env->kvm_timer_dirty = false;
>  }
>
> +uint64_t kvm_riscv_get_timebase_frequency(CPUState *cs)
> +{
> +uint64_t reg;
> +
> +KVM_RISCV_GET_TIMER(cs, frequency, reg);
> +
> +return reg;
> +}
> +
>  static int kvm_riscv_get_regs_vector(CPUState *cs)
>  {
>  RISCVCPU *cpu = RISCV_CPU(cs);
> diff --git a/target/riscv/kvm/kvm_riscv.h b/target/riscv/kvm/kvm_riscv.h
> index 4bd98fddc776..58518988681d 100644
> --- a/target/riscv/kvm/kvm_riscv.h
> +++ b/target/riscv/kvm/kvm_riscv.h
> @@ -28,5 +28,6 @@ void kvm_riscv_aia_create(MachineState *machine, uint64_t 
> group_shift,
>  void riscv_kvm_aplic_request(void *opaque, int irq, int level);
>  int kvm_riscv_sync_mpstate_to_kvm(RISCVCPU *cpu, int state);
>  void riscv_kvm_cpu_finalize_features(RISCVCPU *cpu, Error **errp);
> +uint64_t kvm_riscv_get_timebase_frequency(CPUState *cs);
>
>  #endif
> --
> 2.17.1
>
>



Re: [PATCH] target/riscv: Fix mode in riscv_tlb_fill

2024-03-21 Thread Alistair Francis
On Thu, Mar 21, 2024 at 3:29 AM Irina Ryapolova
 wrote:
>
> Need to convert mmu_idx to privilege mode for PMP function.
>
> Signed-off-by: Irina Ryapolova 

Thanks!

Applied to riscv-to-apply.next

Alistair

> ---
>  target/riscv/cpu_helper.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
> index ce7322011d..fc090d729a 100644
> --- a/target/riscv/cpu_helper.c
> +++ b/target/riscv/cpu_helper.c
> @@ -1315,7 +1315,7 @@ bool riscv_cpu_tlb_fill(CPUState *cs, vaddr address, 
> int size,
>  bool two_stage_lookup = mmuidx_2stage(mmu_idx);
>  bool two_stage_indirect_error = false;
>  int ret = TRANSLATE_FAIL;
> -int mode = mmu_idx;
> +int mode = mmuidx_priv(mmu_idx);
>  /* default TLB page size */
>  target_ulong tlb_size = TARGET_PAGE_SIZE;
>
> --
> 2.25.1
>
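
The reason the raw mmu_idx is not a privilege mode can be sketched as follows (illustrative Python with an assumed bit layout, not QEMU's exact definitions):

```python
# Assumed layout for the sketch: the low bits of mmu_idx hold the
# privilege level, higher bits carry flags such as two-stage translation.
PRIV_MASK = 0x3
TWO_STAGE = 1 << 2

def mmuidx_priv(mmu_idx):
    # strip the flag bits, keeping only the privilege level for PMP checks
    return mmu_idx & PRIV_MASK

mmu_idx = TWO_STAGE | 1            # S-mode lookup with the 2-stage flag set
assert mmu_idx != 1                # the raw index is NOT a privilege mode
assert mmuidx_priv(mmu_idx) == 1   # masking recovers S-mode for the PMP check
```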
>



Re: [PATCH] target/riscv: Fix mode in riscv_tlb_fill

2024-03-21 Thread Alistair Francis
On Thu, Mar 21, 2024 at 3:29 AM Irina Ryapolova
 wrote:
>
> Need to convert mmu_idx to privilege mode for PMP function.
>
> Signed-off-by: Irina Ryapolova 

Reviewed-by: Alistair Francis 

Alistair

> ---
>  target/riscv/cpu_helper.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
> index ce7322011d..fc090d729a 100644
> --- a/target/riscv/cpu_helper.c
> +++ b/target/riscv/cpu_helper.c
> @@ -1315,7 +1315,7 @@ bool riscv_cpu_tlb_fill(CPUState *cs, vaddr address, 
> int size,
>  bool two_stage_lookup = mmuidx_2stage(mmu_idx);
>  bool two_stage_indirect_error = false;
>  int ret = TRANSLATE_FAIL;
> -int mode = mmu_idx;
> +int mode = mmuidx_priv(mmu_idx);
>  /* default TLB page size */
>  target_ulong tlb_size = TARGET_PAGE_SIZE;
>
> --
> 2.25.1
>
>



Re: [PATCH] target/riscv: rvv: Remove the dependency of Zvfbfmin to Zfbfmin

2024-03-21 Thread Alistair Francis
On Fri, Mar 22, 2024 at 3:16 AM Max Chou  wrote:
>
> According to the Zvfbfmin definition in the RISC-V BF16 extensions spec,
> the Zvfbfmin extension only requires either the V extension or the
> Zve32f extension.
>
> Signed-off-by: Max Chou 

Thanks!

Applied to riscv-to-apply.next

Alistair

> ---
>  target/riscv/tcg/tcg-cpu.c | 5 -
>  1 file changed, 5 deletions(-)
>
> diff --git a/target/riscv/tcg/tcg-cpu.c b/target/riscv/tcg/tcg-cpu.c
> index 63192ef54f3..b5b95e052d2 100644
> --- a/target/riscv/tcg/tcg-cpu.c
> +++ b/target/riscv/tcg/tcg-cpu.c
> @@ -530,11 +530,6 @@ void riscv_cpu_validate_set_extensions(RISCVCPU *cpu, 
> Error **errp)
>  return;
>  }
>
> -if (cpu->cfg.ext_zvfbfmin && !cpu->cfg.ext_zfbfmin) {
> -error_setg(errp, "Zvfbfmin extension depends on Zfbfmin extension");
> -return;
> -}
> -
>  if (cpu->cfg.ext_zvfbfmin && !cpu->cfg.ext_zve32f) {
>  error_setg(errp, "Zvfbfmin extension depends on Zve32f extension");
>  return;
> --
> 2.34.1
>
>



Re: [PATCH] target/riscv: rvv: Remove the dependency of Zvfbfmin to Zfbfmin

2024-03-21 Thread Alistair Francis
On Fri, Mar 22, 2024 at 3:16 AM Max Chou  wrote:
>
> According to the Zvfbfmin definition in the RISC-V BF16 extensions spec,
> the Zvfbfmin extension only requires either the V extension or the
> Zve32f extension.

Yeah, the dependency has been removed

https://github.com/riscv/riscv-bfloat16/commit/86d7a74f4b928e981f79f6d84a4592e6e9e4c0e9#diff-f3084dfbeae77f242848fc2cd24a84514a9f01aff7ae8ad945e1af1c0a33988cR20

>
> Signed-off-by: Max Chou 

Reviewed-by: Alistair Francis 

Alistair

> ---
>  target/riscv/tcg/tcg-cpu.c | 5 -
>  1 file changed, 5 deletions(-)
>
> diff --git a/target/riscv/tcg/tcg-cpu.c b/target/riscv/tcg/tcg-cpu.c
> index 63192ef54f3..b5b95e052d2 100644
> --- a/target/riscv/tcg/tcg-cpu.c
> +++ b/target/riscv/tcg/tcg-cpu.c
> @@ -530,11 +530,6 @@ void riscv_cpu_validate_set_extensions(RISCVCPU *cpu, 
> Error **errp)
>  return;
>  }
>
> -if (cpu->cfg.ext_zvfbfmin && !cpu->cfg.ext_zfbfmin) {
> -error_setg(errp, "Zvfbfmin extension depends on Zfbfmin extension");
> -return;
> -}
> -
>  if (cpu->cfg.ext_zvfbfmin && !cpu->cfg.ext_zve32f) {
>  error_setg(errp, "Zvfbfmin extension depends on Zve32f extension");
>  return;
> --
> 2.34.1
>
>



Re: [PATCH] Fix fp16 checking in vector fp widen/narrow instructions

2024-03-21 Thread Alistair Francis
On Wed, Mar 20, 2024 at 5:28 PM Max Chou  wrote:
>
> When SEW is 16, we need to check whether the Zvfhmin extension is
> enabled for the single width operator for vector floating point
> widen/narrow instructions.
>
> The commits in this patchset fix the single width operator checking and
> remove the redundant SEW checking for vector floating point widen/narrow
> instructions.
>
> Max Chou (4):
>   target/riscv: rvv: Fix Zvfhmin checking for vfwcvt.f.f.v and
> vfncvt.f.f.w instructions
>   target/riscv: rvv: Check single width operator for vector fp widen
> instructions
>   target/riscv: rvv: Check single width operator for vfncvt.rod.f.f.w
>   target/riscv: rvv: Remove redundant SEW checking for vector fp
> narrow/widen instructions

I think something went wrong here. It looks like you meant to send
this as a series, but somehow that information was lost in the subject

Alistair

>
>  target/riscv/insn_trans/trans_rvv.c.inc | 42 -
>  1 file changed, 28 insertions(+), 14 deletions(-)
>
> --
> 2.31.1
>
>



Re: [PATCH] hw/intc: Update APLIC IDC after claiming iforce register

2024-03-21 Thread Alistair Francis
On Thu, Mar 21, 2024 at 8:50 PM  wrote:
>
> From: Frank Chang 
>
> Currently, QEMU only sets the iforce register to 0 and returns early
> when claiming the iforce register. However, this may leave mip.meip
> stuck at 1 if a spurious external interrupt triggered by the iforce
> register is the only pending interrupt to be claimed, and the interrupt
> cannot be lowered as expected.
>
> This commit fixes this issue by calling riscv_aplic_idc_update() to
> update the IDC status after the iforce register is claimed.
>
> Signed-off-by: Frank Chang 
> Reviewed-by: Jim Shu 

Reviewed-by: Alistair Francis 

Alistair

> ---
>  hw/intc/riscv_aplic.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/hw/intc/riscv_aplic.c b/hw/intc/riscv_aplic.c
> index 6a7fbfa861..fc5df0d598 100644
> --- a/hw/intc/riscv_aplic.c
> +++ b/hw/intc/riscv_aplic.c
> @@ -488,6 +488,7 @@ static uint32_t riscv_aplic_idc_claimi(RISCVAPLICState 
> *aplic, uint32_t idc)
>
>  if (!topi) {
>  aplic->iforce[idc] = 0;
> +riscv_aplic_idc_update(aplic, idc);
>  return 0;
>  }
>
> --
> 2.43.2
>
>



Re: [PATCH for-9.0] target/riscv/debug: set tval=pc in breakpoint exceptions

2024-03-21 Thread Alistair Francis
On Wed, Mar 20, 2024 at 7:33 PM Daniel Henrique Barboza
 wrote:
>
> We're not setting (s/m)tval when triggering breakpoints of type 2
> (mcontrol) and 6 (mcontrol6). According to the debug spec section
> 5.7.12, "Match Control Type 6":
>
> "The Privileged Spec says that breakpoint exceptions that occur on
> instruction fetches, loads, or stores update the tval CSR with either
> zero or the faulting virtual address. The faulting virtual address for
> an mcontrol6 trigger with action = 0 is the address being accessed and
> which caused that trigger to fire."

Setting zero is valid though. I agree it's probably worth setting a
value, but I don't think we need this for 9.0

Alistair

>
> A similar text is also found in the Debug spec section 5.7.11 w.r.t.
> mcontrol.
>
> Given that we always use action = 0, save the faulting address for the
> mcontrol and mcontrol6 trigger breakpoints into env->badaddr, which is
> used as a scratch area for traps with address information. 'tval' is
> then set during riscv_cpu_do_interrupt().
>
> Reported-by: Joseph Chan 
> Fixes: b5f6379d13 ("target/riscv: debug: Implement debug related TCGCPUOps")
> Fixes: c472c142a7 ("target/riscv: debug: Add initial support of type 6 
> trigger")
> Signed-off-by: Daniel Henrique Barboza 
> ---
>  target/riscv/cpu_helper.c | 1 +
>  target/riscv/debug.c  | 3 +++
>  2 files changed, 4 insertions(+)
>
> diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
> index ce7322011d..492ca63b1a 100644
> --- a/target/riscv/cpu_helper.c
> +++ b/target/riscv/cpu_helper.c
> @@ -1717,6 +1717,7 @@ void riscv_cpu_do_interrupt(CPUState *cs)
>  tval = env->bins;
>  break;
>  case RISCV_EXCP_BREAKPOINT:
> +tval = env->badaddr;
>  if (cs->watchpoint_hit) {
>  tval = cs->watchpoint_hit->hitaddr;
>  cs->watchpoint_hit = NULL;
> diff --git a/target/riscv/debug.c b/target/riscv/debug.c
> index e30d99cc2f..b110370ea6 100644
> --- a/target/riscv/debug.c
> +++ b/target/riscv/debug.c
> @@ -798,6 +798,7 @@ bool riscv_cpu_debug_check_breakpoint(CPUState *cs)
>  if ((ctrl & TYPE2_EXEC) && (bp->pc == pc)) {
>  /* check U/S/M bit against current privilege level */
>  if ((ctrl >> 3) & BIT(env->priv)) {
> +env->badaddr = pc;
>  return true;
>  }
>  }
> @@ -810,11 +811,13 @@ bool riscv_cpu_debug_check_breakpoint(CPUState *cs)
>  if (env->virt_enabled) {
>  /* check VU/VS bit against current privilege level */
>  if ((ctrl >> 23) & BIT(env->priv)) {
> +env->badaddr = pc;
>  return true;
>  }
>  } else {
>  /* check U/S/M bit against current privilege level */
>  if ((ctrl >> 3) & BIT(env->priv)) {
> +env->badaddr = pc;
>  return true;
>  }
>  }
> --
> 2.44.0
>
>
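
The behavioural change above can be modeled in isolation — a toy sketch with
stand-in types (not QEMU's actual CPU state), showing how recording the pc in
`badaddr` lets the trap path report a non-zero tval:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Minimal stand-in for the relevant slice of CPURISCVState. */
typedef struct { uint64_t badaddr; } CPUEnv;

/* Mirrors the patch: when a trigger matches, remember the faulting pc. */
static bool check_breakpoint(CPUEnv *env, uint64_t pc, uint64_t bp_pc)
{
    if (pc == bp_pc) {
        env->badaddr = pc;      /* scratch area later consumed as tval */
        return true;
    }
    return false;
}

/* Mirrors riscv_cpu_do_interrupt(): tval now comes from badaddr,
 * where previously it stayed hard-coded to zero. */
static uint64_t breakpoint_tval(const CPUEnv *env)
{
    return env->badaddr;
}
```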



Re: [RFC PATCH v9 06/23] target/arm: Add support for Non-maskable Interrupt

2024-03-21 Thread Jinjie Ruan via



On 2024/3/21 23:46, Peter Maydell wrote:
> On Thu, 21 Mar 2024 at 13:10, Jinjie Ruan  wrote:
>>
>> This only implements the external delivery method via the GICv3.
>>
>> Signed-off-by: Jinjie Ruan 
>> Reviewed-by: Richard Henderson 
>> ---
>> v9:
>> - Update the GPIOs passed in the arm_cpu_kvm_set_irq, and update the comment.
>> - Definitely not merge VINMI and VFNMI into EXCP_VNMI.
>> - Update VINMI and VFNMI when writing HCR_EL2 or HCRX_EL2.
>> v8:
>> - Fix the rcu stall after sending a VNMI in qemu VM.
>> v7:
>> - Add Reviewed-by.
>> v6:
>> - env->cp15.hcr_el2 -> arm_hcr_el2_eff().
>> - env->cp15.hcrx_el2 -> arm_hcrx_el2_eff().
>> - Not include VF && VFNMI in CPU_INTERRUPT_VNMI.
>> v4:
>> - Accept NMI unconditionally for arm_cpu_has_work() but add comment.
>> - Change from & to && for EXCP_IRQ or EXCP_FIQ.
>> - Refator nmi mask in arm_excp_unmasked().
>> - Also handle VNMI in arm_cpu_exec_interrupt() and arm_cpu_set_irq().
>> - Rename virtual to Virtual.
>> v3:
>> - Not include CPU_INTERRUPT_NMI when FEAT_NMI not enabled
>> - Add ARM_CPU_VNMI.
>> - Refator nmi mask in arm_excp_unmasked().
>> - Test SCTLR_ELx.NMI for ALLINT mask for NMI.
>> ---
>>  target/arm/cpu-qom.h   |   5 +-
>>  target/arm/cpu.c   | 124 ++---
>>  target/arm/cpu.h   |   6 ++
>>  target/arm/helper.c|  33 +--
>>  target/arm/internals.h |  18 ++
>>  5 files changed, 172 insertions(+), 14 deletions(-)
>>
>> diff --git a/target/arm/cpu-qom.h b/target/arm/cpu-qom.h
>> index 8e032691db..b497667d61 100644
>> --- a/target/arm/cpu-qom.h
>> +++ b/target/arm/cpu-qom.h
>> @@ -36,11 +36,14 @@ DECLARE_CLASS_CHECKERS(AArch64CPUClass, AARCH64_CPU,
>>  #define ARM_CPU_TYPE_SUFFIX "-" TYPE_ARM_CPU
>>  #define ARM_CPU_TYPE_NAME(name) (name ARM_CPU_TYPE_SUFFIX)
>>
>> -/* Meanings of the ARMCPU object's four inbound GPIO lines */
>> +/* Meanings of the ARMCPU object's seven inbound GPIO lines */
>>  #define ARM_CPU_IRQ 0
>>  #define ARM_CPU_FIQ 1
>>  #define ARM_CPU_VIRQ 2
>>  #define ARM_CPU_VFIQ 3
>> +#define ARM_CPU_NMI 4
>> +#define ARM_CPU_VINMI 5
>> +#define ARM_CPU_VFNMI 6
>>
>>  /* For M profile, some registers are banked secure vs non-secure;
>>   * these are represented as a 2-element array where the first element
>> diff --git a/target/arm/cpu.c b/target/arm/cpu.c
>> index ab8d007a86..f1e7ae0975 100644
>> --- a/target/arm/cpu.c
>> +++ b/target/arm/cpu.c
>> @@ -122,6 +122,13 @@ void arm_restore_state_to_opc(CPUState *cs,
>>  }
>>  #endif /* CONFIG_TCG */
>>
>> +/*
>> + * With SCTLR_ELx.NMI == 0, IRQ with Superpriority is masked identically 
>> with
>> + * IRQ without Superpriority. Moreover, if the GIC is configured so that
>> + * FEAT_GICv3_NMI is only set if FEAT_NMI is set, then we won't ever see
>> + * CPU_INTERRUPT_*NMI anyway. So we might as well accept NMI here
>> + * unconditionally.
>> + */
>>  static bool arm_cpu_has_work(CPUState *cs)
>>  {
>>  ARMCPU *cpu = ARM_CPU(cs);
>> @@ -129,6 +136,7 @@ static bool arm_cpu_has_work(CPUState *cs)
>>  return (cpu->power_state != PSCI_OFF)
>>  && cs->interrupt_request &
>>  (CPU_INTERRUPT_FIQ | CPU_INTERRUPT_HARD
>> + | CPU_INTERRUPT_NMI | CPU_INTERRUPT_VINMI | CPU_INTERRUPT_VFNMI
>>   | CPU_INTERRUPT_VFIQ | CPU_INTERRUPT_VIRQ | CPU_INTERRUPT_VSERR
>>   | CPU_INTERRUPT_EXITTB);
>>  }
>> @@ -668,6 +676,7 @@ static inline bool arm_excp_unmasked(CPUState *cs, 
>> unsigned int excp_idx,
>>  CPUARMState *env = cpu_env(cs);
>>  bool pstate_unmasked;
>>  bool unmasked = false;
>> +bool allIntMask = false;
>>
>>  /*
>>   * Don't take exceptions if they target a lower EL.
>> @@ -678,13 +687,36 @@ static inline bool arm_excp_unmasked(CPUState *cs, 
>> unsigned int excp_idx,
>>  return false;
>>  }
>>
>> +if (cpu_isar_feature(aa64_nmi, env_archcpu(env)) &&
>> +env->cp15.sctlr_el[target_el] & SCTLR_NMI && cur_el == target_el) {
>> +allIntMask = env->pstate & PSTATE_ALLINT ||
>> + ((env->cp15.sctlr_el[target_el] & SCTLR_SPINTMASK) &&
>> +  (env->pstate & PSTATE_SP));
>> +}
>> +
>>  switch (excp_idx) {
>> +case EXCP_NMI:
>> +pstate_unmasked = !allIntMask;
>> +break;
>> +
>> +case EXCP_VINMI:
>> +if (!(hcr_el2 & HCR_IMO) || (hcr_el2 & HCR_TGE)) {
>> +/* VINMIs are only taken when hypervized.  */
>> +return false;
>> +}
>> +return !allIntMask;
>> +case EXCP_VFNMI:
>> +if (!(hcr_el2 & HCR_FMO) || (hcr_el2 & HCR_TGE)) {
>> +/* VFNMIs are only taken when hypervized.  */
>> +return false;
>> +}
>> +return !allIntMask;
>>  case EXCP_FIQ:
>> -pstate_unmasked = !(env->daif & PSTATE_F);
>> +pstate_unmasked = (!(env->daif & PSTATE_F)) && (!allIntMask);
>>  break;
>>
>>  case EXCP_IRQ:
>> -pstate_unmasked = !(env->daif & PSTATE_I);
>> +

[PATCH v2] input-linux: Add option to not grab a device upon guest startup

2024-03-21 Thread Justinien Bouron
Depending on your use-case, it might be inconvenient to have qemu grab
the input device from the host immediately upon starting the guest.

Added a new bool option to input-linux: grab-on-startup. If true, the
device is grabbed as soon as the guest is started, otherwise it is not
grabbed until the toggle combination is entered. To avoid breaking
existing setups, the default value of grab-on-startup is true, i.e. same
behaviour as before this change.
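
For illustration, a sketch of the new option on the command line (the evdev
path is hypothetical):

```
qemu-system-x86_64 \
    -object input-linux,id=kbd0,evdev=/dev/input/eventN,grab-on-startup=false \
    ...
```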

Signed-off-by: Justinien Bouron 
---

Changes since v1:
- Revised commit message.
- Revised grab-on-startup description in qapi/qom.json to comply with
  conventions.

 qapi/qom.json| 14 +-
 ui/input-linux.c | 20 +++-
 2 files changed, 32 insertions(+), 2 deletions(-)

diff --git a/qapi/qom.json b/qapi/qom.json
index baae3a183f..4bcffd265c 100644
--- a/qapi/qom.json
+++ b/qapi/qom.json
@@ -508,13 +508,25 @@
 # @grab-toggle: the key or key combination that toggles device grab
 # (default: ctrl-ctrl)
 #
+# @grab-on-startup: if true, grab the device immediately upon starting
+# the guest.  Otherwise, don't grab the device until the
+# combination is entered.  This does not influence other devices
+# even if grab_all is true, i.e. in the unlikely scenario where
+# device1 has grab_all=true + grab-on-startup=true and device2 has
+# grab-on-startup=false, only device1 is grabbed on startup, then,
+# once the grab combination is entered, grabbing is toggled off
+# for both devices (because device1 enforces the grab_all
+# property) until the combination is entered again at which point
+# both devices will be grabbed.  (default: true).
+#
 # Since: 2.6
 ##
 { 'struct': 'InputLinuxProperties',
   'data': { 'evdev': 'str',
 '*grab_all': 'bool',
 '*repeat': 'bool',
-'*grab-toggle': 'GrabToggleKeys' } }
+'*grab-toggle': 'GrabToggleKeys',
+'*grab-on-startup': 'bool'} }
 
 ##
 # @EventLoopBaseProperties:
diff --git a/ui/input-linux.c b/ui/input-linux.c
index e572a2e905..68b5c6d485 100644
--- a/ui/input-linux.c
+++ b/ui/input-linux.c
@@ -44,6 +44,7 @@ struct InputLinux {
 boolgrab_request;
 boolgrab_active;
 boolgrab_all;
+boolgrab_on_startup;
 boolkeydown[KEY_CNT];
 int keycount;
 int wheel;
@@ -400,7 +401,7 @@ static void input_linux_complete(UserCreatable *uc, Error 
**errp)
 if (il->keycount) {
 /* delay grab until all keys are released */
 il->grab_request = true;
-} else {
+} else if (il->grab_on_startup) {
 input_linux_toggle_grab(il);
 }
 QTAILQ_INSERT_TAIL(, il, next);
@@ -491,6 +492,19 @@ static void input_linux_set_grab_toggle(Object *obj, int 
value,
 il->grab_toggle = value;
 }
 
+static bool input_linux_get_grab_on_startup(Object *obj, Error **errp)
+{
+InputLinux *il = INPUT_LINUX(obj);
+return il->grab_on_startup;
+}
+
+static void input_linux_set_grab_on_startup(Object *obj, bool value,
+Error **errp)
+{
+InputLinux *il = INPUT_LINUX(obj);
+il->grab_on_startup = value;
+}
+
 static void input_linux_instance_init(Object *obj)
 {
 }
@@ -498,6 +512,7 @@ static void input_linux_instance_init(Object *obj)
 static void input_linux_class_init(ObjectClass *oc, void *data)
 {
 UserCreatableClass *ucc = USER_CREATABLE_CLASS(oc);
+ObjectProperty *grab_on_startup_prop;
 
 ucc->complete = input_linux_complete;
 
@@ -514,6 +529,9 @@ static void input_linux_class_init(ObjectClass *oc, void 
*data)
_lookup,
input_linux_get_grab_toggle,
input_linux_set_grab_toggle);
+grab_on_startup_prop = object_class_property_add_bool(oc, 
"grab-on-startup",
+input_linux_get_grab_on_startup, input_linux_set_grab_on_startup);
+object_property_set_default_bool(grab_on_startup_prop, true);
 }
 
 static const TypeInfo input_linux_info = {
-- 
2.43.0




Re: [RFC PATCH v8 13/23] hw/intc/arm_gicv3: Add irq superpriority information

2024-03-21 Thread Jinjie Ruan via



On 2024/3/21 21:17, Peter Maydell wrote:
> On Mon, 18 Mar 2024 at 09:38, Jinjie Ruan  wrote:
>>
>> A SPI, PPI or SGI interrupt can have a superpriority property. So
>> maintain superpriority information in PendingIrq and GICR/GICD.
>>
>> Signed-off-by: Jinjie Ruan 
>> Acked-by: Richard Henderson 
>> ---
>> v3:
>> - Place this ahead of implement GICR_INMIR.
>> - Add Acked-by.
>> ---
>>  include/hw/intc/arm_gicv3_common.h | 4 
>>  1 file changed, 4 insertions(+)
>>
>> diff --git a/include/hw/intc/arm_gicv3_common.h 
>> b/include/hw/intc/arm_gicv3_common.h
>> index 7324c7d983..df4380141d 100644
>> --- a/include/hw/intc/arm_gicv3_common.h
>> +++ b/include/hw/intc/arm_gicv3_common.h
>> @@ -146,6 +146,7 @@ typedef struct {
>>  int irq;
>>  uint8_t prio;
>>  int grp;
>> +bool superprio;
>>  } PendingIrq;
>>
>>  struct GICv3CPUState {
>> @@ -172,6 +173,7 @@ struct GICv3CPUState {
>>  uint32_t gicr_ienabler0;
>>  uint32_t gicr_ipendr0;
>>  uint32_t gicr_iactiver0;
>> +uint32_t gicr_isuperprio;
> 
> This field stores the state that is in the GICR_INMIR0
> register, so please name it that way: gicr_inmir0.
> 
>>  uint32_t edge_trigger; /* ICFGR0 and ICFGR1 even bits */
>>  uint32_t gicr_igrpmodr0;
>>  uint32_t gicr_nsacr;
>> @@ -274,6 +276,7 @@ struct GICv3State {
>>  GIC_DECLARE_BITMAP(active);   /* GICD_ISACTIVER */
>>  GIC_DECLARE_BITMAP(level);/* Current level */
>>  GIC_DECLARE_BITMAP(edge_trigger); /* GICD_ICFGR even bits */
>> +GIC_DECLARE_BITMAP(superprio);/* GICD_INMIR */
>>  uint8_t gicd_ipriority[GICV3_MAXIRQ];
>>  uint64_t gicd_irouter[GICV3_MAXIRQ];
>>  /* Cached information: pointer to the cpu i/f for the CPUs specified
>> @@ -313,6 +316,7 @@ GICV3_BITMAP_ACCESSORS(pending)
>>  GICV3_BITMAP_ACCESSORS(active)
>>  GICV3_BITMAP_ACCESSORS(level)
>>  GICV3_BITMAP_ACCESSORS(edge_trigger)
>> +GICV3_BITMAP_ACCESSORS(superprio)
> 
> This is the state behind the GICD_INMIR registers, and
> the GIC spec calls the bits in those registers NMI,
> so I would call this bitmap nmi, not superprio.
> 
> This commit adds new device state, so it also needs to be migrated.
> You'll want to add a new subsection to vmstate_gicv3_cpu which
> is present if the GIC implements NMIs, and which has an entry
> for the gicr_inmir0 field. Similarly, you want a new subsection
> in vmstate_gicv3 which is present if NMIs are implemented and which
> has a field for the nmi array.

OK, I'll add it.

> 
> thanks
> -- PMM
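
For reference, the gating follows QEMU's usual ".needed" subsection pattern; a
self-contained miniature (mock types and hypothetical field names, not the real
vmstate API) of the idea:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Mock of the relevant slice of GICv3CPUState. */
typedef struct {
    bool nmi_support;       /* set when the GIC implements FEAT_GICv3_NMI */
    uint32_t gicr_inmir0;   /* state behind GICR_INMIR0 */
} MockGICv3CPUState;

/* A ".needed"-style predicate: the subsection (and so gicr_inmir0) is
 * only put on the wire when NMIs are actually implemented, keeping
 * migration streams compatible with non-NMI configurations. */
static bool nmi_subsection_needed(const MockGICv3CPUState *cs)
{
    return cs->nmi_support;
}
```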



Re: qemu fuzz crash in virtio_net_queue_reset()

2024-03-21 Thread Xuan Zhuo
On Wed, 20 Mar 2024 00:24:37 +0300, "Vladimir Sementsov-Ogievskiy" 
 wrote:
> Hi all!
>
>  From fuzzing I've got a fuzz-data, which produces the following crash:
>
> qemu-fuzz-x86_64: ../hw/net/virtio-net.c:134: void 
> flush_or_purge_queued_packets(NetClientState *): Assertion 
> `!virtio_net_get_subqueue(nc)->async_tx.elem' failed.
> ==2172308== ERROR: libFuzzer: deadly signal
>  #0 0x5bd8c748b5a1 in __sanitizer_print_stack_trace 
> (/home/vsementsov/work/src/qemu/yc7-fuzz/build/qemu-fuzz-x86_64+0x26f05a1) 
> (BuildId: b41827f440fd9feaa98c667dbdcc961abb2799ae)
>  #1 0x5bd8c73fde38 in fuzzer::PrintStackTrace() 
> (/home/vsementsov/work/src/qemu/yc7-fuzz/build/qemu-fuzz-x86_64+0x2662e38) 
> (BuildId: b41827f440fd9feaa98c667dbdcc961abb2799ae)
>  #2 0x5bd8c73e38b3 in fuzzer::Fuzzer::CrashCallback() 
> (/home/settlements/work/src/qemu/yc7-fuzz/build/qemu-fuzz-x86_64+0x26488b3) 
> (BuildId: b41827f440fd9feaa98c667dbdcc961abb2799ae)
>  #3 0x739eec84251f  (/lib/x86_64-linux-gnu/libc.so.6+0x4251f) (BuildId: 
> c289da5071a3399de893d2af81d6a30c62646e1e)
>  #4 0x739eec8969fb in __pthread_kill_implementation 
> nptl/./nptl/pthread_kill.c:43:17
>  #5 0x739eec8969fb in __pthread_kill_internal 
> nptl/./nptl/pthread_kill.c:78:10
>  #6 0x739eec8969fb in pthread_kill nptl/./nptl/pthread_kill.c:89:10
>  #7 0x739eec842475 in gsignal signal/../sysdeps/posix/raise.c:26:13
>  #8 0x739eec8287f2 in abort stdlib/./stdlib/abort.c:79:7
>  #9 0x739eec82871a in __assert_fail_base assert/./assert/assert.c:92:3
>  #10 0x739eec839e95 in __assert_fail assert/./assert/assert.c:101:3
>  #11 0x5bd8c995d9e2 in flush_or_purge_queued_packets 
> /home/vsementsov/work/src/qemu/yc7-fuzz/build/../hw/net/virtio-net.c:134:5
>  #12 0x5bd8c9918a5f in virtio_net_queue_reset 
> /home/vsementsov/work/src/qemu/yc7-fuzz/build/../hw/net/virtio-net.c:563:5
>  #13 0x5bd8c9b724e5 in virtio_queue_reset 
> /home/vsementsov/work/src/qemu/yc7-fuzz/build/../hw/virtio/virtio.c:2492:9
>  #14 0x5bd8c8bcfb7c in virtio_pci_common_write 
> /home/vsementsov/work/src/qemu/yc7-fuzz/build/../hw/virtio/virtio-pci.c:1372:13
>  #15 0x5bd8c9e19cf3 in memory_region_write_accessor 
> /home/vsementsov/work/src/qemu/yc7-fuzz/build/../softmmu/memory.c:492:5
>  #16 0x5bd8c9e19631 in access_with_adjusted_size 
> /home/vsementsov/work/src/qemu/yc7-fuzz/build/../softmmu/memory.c:554:18
>  #17 0x5bd8c9e17f3c in memory_region_dispatch_write 
> /home/vsementsov/work/src/qemu/yc7-fuzz/build/../softmmu/memory.c:1514:16
>  #18 0x5bd8c9ea3bbe in flatview_write_continue 
> /home/vsementsov/work/src/qemu/yc7-fuzz/build/../softmmu/physmem.c:2825:23
>  #19 0x5bd8c9e91aab in flatview_write 
> /home/vsementsov/work/src/qemu/yc7-fuzz/build/../softmmu/physmem.c:2867:12
>  #20 0x5bd8c9e91568 in address_space_write 
> /home/vsementsov/work/src/qemu/yc7-fuzz/build/../softmmu/physmem.c:2963:18
>  #21 0x5bd8c74c8a90 in __wrap_qtest_writeq 
> /home/vsementsov/work/src/qemu/yc7-fuzz/build/../tests/qtest/fuzz/qtest_wrappers.c:187:9
>  #22 0x5bd8c74dc4da in op_write 
> /home/vsementsov/work/src/qemu/yc7-fuzz/build/../tests/qtest/fuzz/generic_fuzz.c:487:13
>  #23 0x5bd8c74d942e in generic_fuzz 
> /home/vsementsov/work/src/qemu/yc7-fuzz/build/../tests/qtest/fuzz/generic_fuzz.c:714:17
>  #24 0x5bd8c74c016e in LLVMFuzzerTestOneInput 
> /home/vsementsov/work/src/qemu/yc7-fuzz/build/../tests/qtest/fuzz/fuzz.c:152:5
>  #25 0x5bd8c73e4e43 in fuzzer::Fuzzer::ExecuteCallback(unsigned char 
> const*, unsigned long) 
> (/home/vsementsov/work/src/qemu/yc7-fuzz/build/qemu-fuzz-x86_64+0x2649e43) 
> (BuildId: b41827f440fd9feaa98c667dbdcc961abb2799ae)
>  #26 0x5bd8c73cebbf in fuzzer::RunOneTest(fuzzer::Fuzzer*, char const*, 
> unsigned long) 
> (/home/vsementsov/work/src/qemu/yc7-fuzz/build/qemu-fuzz-x86_64+0x2633bbf) 
> (BuildId: b41827f440fd9feaa98c667dbdcc961abb2799ae)
>  #27 0x5bd8c73d4916 in fuzzer::FuzzerDriver(int*, char***, int 
> (*)(unsigned char const*, unsigned long)) 
> (/home/vsementsov/work/src/qemu/yc7-fuzz/build/qemu-fuzz-x86_64+0x2639916) 
> (BuildId: b41827f440fd9feaa98c667dbdcc961abb2799ae)
>  #28 0x5bd8c73fe732 in main 
> (/home/vsementsov/work/src/qemu/yc7-fuzz/build/qemu-fuzz-x86_64+0x2663732) 
> (BuildId: b41827f440fd9feaa98c667dbdcc961abb2799ae)
>  #29 0x739eec829d8f in __libc_start_call_main 
> csu/../sysdeps/nptl/libc_start_call_main.h:58:16
>  #30 0x739eec829e3f in __libc_start_main csu/../csu/libc-start.c:392:3
>  #31 0x5bd8c73c9484 in _start 
> (/home/vsementsov/work/src/qemu/yc7-fuzz/build/qemu-fuzz-x86_64+0x262e484) 
> (BuildId: b41827f440fd9feaa98c667dbdcc961abb2799ae)
>
>
>
> How to reproduce:
> ./configure --target-list=x86_64-softmmu --enable-debug --disable-docs 
> --cc=clang --cxx=clang++ --enable-fuzzing --enable-sanitizers --enable-slirp
> make -j20 qemu-fuzz-x86_64
> ./build/qemu-fuzz-x86_64 --fuzz-target=generic-fuzz-virtio-net-pci-slirp 
> 

Re: [PATCH] virtio_net: Do not send RSS key if it is not supported

2024-03-21 Thread Xuan Zhuo
On Thu, 21 Mar 2024 09:54:30 -0700, Breno Leitao  wrote:
> There is a bug when setting the RSS options in virtio_net that can break
> the whole machine, getting the kernel into an infinite loop.
>
> Running the following command in any QEMU virtual machine with virtio_net
> will reproduce this problem:
>
>   # ethtool -X eth0  hfunc toeplitz
>
> This is how the problem happens:
>
> 1) ethtool_set_rxfh() calls virtnet_set_rxfh()
>
> 2) virtnet_set_rxfh() calls virtnet_commit_rss_command()
>
> 3) virtnet_commit_rss_command() populates 4 entries for the rss
>scatter-gather
>
> 4) Since the command above does not have a key, the last
> scatter-gather entry will be zeroed, since rss_key_size == 0.
> sg_buf_size = vi->rss_key_size;



if (vi->has_rss || vi->has_rss_hash_report) {
vi->rss_indir_table_size =
virtio_cread16(vdev, offsetof(struct virtio_net_config,
rss_max_indirection_table_length));
vi->rss_key_size =
virtio_cread8(vdev, offsetof(struct virtio_net_config, 
rss_max_key_size));

vi->rss_hash_types_supported =
virtio_cread32(vdev, offsetof(struct virtio_net_config, 
supported_hash_types));
vi->rss_hash_types_supported &=
~(VIRTIO_NET_RSS_HASH_TYPE_IP_EX |
  VIRTIO_NET_RSS_HASH_TYPE_TCP_EX |
  VIRTIO_NET_RSS_HASH_TYPE_UDP_EX);

dev->hw_features |= NETIF_F_RXHASH;
}


vi->rss_key_size is initialized here; I wonder if there is something wrong?

Thanks.


>
> 5) This buffer is passed to qemu, but qemu is not happy with a buffer
>with zero length, and do the following in virtqueue_map_desc() (QEMU
>function):
>
>   if (!sz) {
>   virtio_error(vdev, "virtio: zero sized buffers are not allowed");
>
> 6) virtio_error() (also QEMU function) set the device as broken
>
>   vdev->broken = true;
>
> 7) QEMU bails out and does not respond to this crazy kernel.
>
> 8) The kernel is waiting for the response to come back (function
>virtnet_send_command())
>
> 9) The kernel is waiting doing the following :
>
>   while (!virtqueue_get_buf(vi->cvq, ) &&
>  !virtqueue_is_broken(vi->cvq))
>   cpu_relax();
>
> 10) Neither of the conditions above becomes true, thus the kernel
> loops here forever. Keep in mind that virtqueue_is_broken() does
> not look at the QEMU `vdev->broken`, so it never realizes that
> virtio is broken at the QEMU side.
>
> Fix it by not sending the key scatter-gather entry if it is not set.
>
> Fixes: c7114b1249fa ("drivers/net/virtio_net: Added basic RSS support.")
> Signed-off-by: Breno Leitao 
> Cc: sta...@vger.kernel.org
> Cc: qemu-devel@nongnu.org
> ---
>  drivers/net/virtio_net.c | 16 +---
>  1 file changed, 13 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index d7ce4a1011ea..5a7700b103f8 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -3041,11 +3041,16 @@ static int virtnet_set_ringparam(struct net_device 
> *dev,
>  static bool virtnet_commit_rss_command(struct virtnet_info *vi)
>  {
>   struct net_device *dev = vi->dev;
> + int has_key = vi->rss_key_size;
>   struct scatterlist sgs[4];
>   unsigned int sg_buf_size;
> + int nents = 3;
> +
> + if (has_key)
> + nents += 1;
>
>   /* prepare sgs */
> - sg_init_table(sgs, 4);
> + sg_init_table(sgs, nents);
>
>   sg_buf_size = offsetof(struct virtio_net_ctrl_rss, indirection_table);
>   sg_set_buf([0], >ctrl->rss, sg_buf_size);
> @@ -3057,8 +3062,13 @@ static bool virtnet_commit_rss_command(struct 
> virtnet_info *vi)
>   - offsetof(struct virtio_net_ctrl_rss, max_tx_vq);
>   sg_set_buf([2], >ctrl->rss.max_tx_vq, sg_buf_size);
>
> - sg_buf_size = vi->rss_key_size;
> - sg_set_buf([3], vi->ctrl->rss.key, sg_buf_size);
> + if (has_key) {
> + /* Only populate if key is available, otherwise
> +  * populating a buffer with zero size breaks virtio
> +  */
> + sg_buf_size = vi->rss_key_size;
> + sg_set_buf([3], vi->ctrl->rss.key, sg_buf_size);
> + }
>
>   if (!virtnet_send_command(vi, VIRTIO_NET_CTRL_MQ,
> vi->has_rss ? VIRTIO_NET_CTRL_MQ_RSS_CONFIG
> --
> 2.43.0
>
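
The essence of the fix can be modeled with a standalone sketch (simplified
stand-in types, not the kernel's): the scatter-gather table only gets a fourth
entry when a key is actually present, so no zero-length buffer is ever handed
to the device:

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the part of virtnet_info the fix looks at. */
struct rss_cmd { size_t key_size; };

/* How many sg entries the RSS command needs: three fixed ones
 * (header, indirection table, max_tx_vq) plus the key entry,
 * which is added only when a key exists. */
static int rss_sg_entries(const struct rss_cmd *c)
{
    int nents = 3;
    if (c->key_size)
        nents += 1;
    return nents;
}
```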



RE: [PATCH v5 5/7] migration/multifd: implement initialization of qpl compression

2024-03-21 Thread Liu, Yuan1
> -Original Message-
> From: Peter Xu 
> Sent: Thursday, March 21, 2024 11:28 PM
> To: Liu, Yuan1 
> Cc: Daniel P. Berrangé ; faro...@suse.de; qemu-
> de...@nongnu.org; hao.xi...@bytedance.com; bryan.zh...@bytedance.com; Zou,
> Nanhai 
> Subject: Re: [PATCH v5 5/7] migration/multifd: implement initialization of
> qpl compression
> 
> On Thu, Mar 21, 2024 at 01:37:36AM +, Liu, Yuan1 wrote:
> > > -Original Message-
> > > From: Peter Xu 
> > > Sent: Thursday, March 21, 2024 4:32 AM
> > > To: Liu, Yuan1 
> > > Cc: Daniel P. Berrangé ; faro...@suse.de; qemu-
> > > de...@nongnu.org; hao.xi...@bytedance.com; bryan.zh...@bytedance.com;
> Zou,
> > > Nanhai 
> > > Subject: Re: [PATCH v5 5/7] migration/multifd: implement
> initialization of
> > > qpl compression
> > >
> > > On Wed, Mar 20, 2024 at 04:23:01PM +, Liu, Yuan1 wrote:
> > > > let me explain here, during the decompression operation of IAA, the
> > > > decompressed data can be directly output to the virtual address of
> the
> > > > guest memory by IAA hardware.  It can avoid copying the decompressed
> > > data
> > > > to guest memory by CPU.
> > >
> > > I see.
> > >
> > > > Without -mem-prealloc, all the guest memory is not populated, and
> IAA
> > > > hardware needs to trigger I/O page fault first and then output the
> > > > decompressed data to the guest memory region.  Besides that, CPU
> page
> > > > faults will also trigger IOTLB flush operation when IAA devices use
> SVM.
> > >
> > > Oh so the IAA hardware already can use CPU pgtables?  Nice..
> > >
> > > Why IOTLB flush is needed?  AFAIU we're only installing new pages, the
> > > request can either come from a CPU access or a DMA.  In all cases
> there
> > > should have no tearing down of an old page.  Isn't an iotlb flush only
> > > needed if a tear down happens?
> >
> > As far as I know, IAA hardware uses SVM technology to use the CPU's page
> table
> > for address translation (IOMMU scalable mode directly accesses the CPU
> page table).
> > Therefore, when the CPU page table changes, the device's Invalidation
> operation needs
> > to be triggered to update the IOMMU and the device's cache.
> >
> > My current kernel version is mainline 6.2. The issue I see is as
> follows:
> > --Handle_mm_fault
> >  |
> >   -- wp_page_copy
> 
> This is the CoW path.  Not usual at all..
> 
> I assume this issue should only present on destination.  Then the guest
> pages should be the destination of such DMAs to happen, which means these
> should be write faults, and as we see here it is, otherwise it won't
> trigger a CoW.
> 
> However it's not clear to me why a pre-installed zero page existed.  It
> means someone read the guest pages first.
> 
> It might be interesting to know _why_ someone reads the guest pages, even
> if we know they're all zeros.  If we can avoid such reads then it'll be a
> hole rather than a prefaulted read on zero page, then invalidations are
> not
> needed, and I expect that should fix the iotlb storm issue.

The received pages are first read for the zero-page check. Although
these pages are zero pages and the IAA hardware will not access them,
the CoW happens and causes the following IOTLB flush operation. As far
as I know, the IOMMU could quickly detect whether an address range has
been used by the device and avoid invalidating addresses the device
does not use, but this has not yet been resolved in Linux kernel 6.2.
I will check the latest status for this.
void multifd_recv_zero_page_process(MultiFDRecvParams *p)
{
for (int i = 0; i < p->zero_num; i++) {
void *page = p->host + p->zero[i];
if (!buffer_is_zero(page, p->page_size)) {
memset(page, 0, p->page_size);
}
}
}


> It'll still be good we can fix this first to not make qpl special from
> this
> regard, so that the hope is migration submodule shouldn't rely on any
> pre-config (-mem-prealloc) on guest memory behaviors to work properly.

Even if the IOTLB problem can be avoided, the I/O page fault problem
remains (normal pages are written by the IAA device, and resolving
those page faults through the IOMMU performs poorly).

It is possible to let the decompressed data of the IAA device be output
to pre-populated memory instead of directly to the guest address, but
then each multifd thread needs two memory copies: one copy from the
network to the IAA input memory (pre-populated), and another copy from
the IAA output memory (pre-populated) to the guest address, which may
become a performance bottleneck at the destination during the live
migration process.

So I think it is still necessary to use the -mem-prealloc option

> > -- mmu_notifier_invalidate_range
> >   |
> >   -- intel_invalidate_rage
> > |
> > -- qi_flush_piotlb
> > -- qi_flush_dev_iotlb_pasid
> 
> --
> Peter Xu



Re: [PATCH for-9.1 v5 11/14] memory: Add Error** argument to the global_dirty_log routines

2024-03-21 Thread Yong Huang
On Wed, Mar 20, 2024 at 11:19 PM Peter Xu  wrote:

> On Wed, Mar 20, 2024 at 07:49:07AM +0100, Cédric Le Goater wrote:
> > Now that the log_global*() handlers take an Error** parameter and
> > return a bool, do the same for memory_global_dirty_log_start() and
> > memory_global_dirty_log_stop(). The error is reported in the callers
> > for now and it will be propagated in the call stack in the next
> > changes.
> >
> > To be noted a functional change in ram_init_bitmaps(), if the dirty
> > pages logger fails to start, there is no need to synchronize the dirty
> > pages bitmaps. colo_incoming_start_dirty_log() could be modified in a
> > similar way.
> >
> > Cc: Stefano Stabellini 
> > Cc: Anthony Perard 
> > Cc: Paul Durrant 
> > Cc: "Michael S. Tsirkin" 
> > Cc: Paolo Bonzini 
> > Cc: David Hildenbrand 
> > Cc: Hyman Huang 
> > Signed-off-by: Cédric Le Goater 
>
> Just to mention that for the rest users (dirtylimit/dirtyrate/colo) the
> code still just keeps going even if start logging failed even after this
> series applied as a whole.  Migration framework handles the failure
> gracefully, not yet the rest.
>
> That might be an issue for some, e.g., ideally we should be able to fail a
> calc-dirty-rate request, but it's not supported so far.  Adding that could
> add quite some burden to this series, so maybe that's fine to be done
>
Indeed, the GLOBAL_DIRTY_DIRTY_RATE and GLOBAL_DIRTY_LIMIT
dirty-tracking users should handle the failure path of logging start.
That work can be done once the current patchset is merged.


> later.  After all, having a VFIO device (that can fail a start_log()), plus
> any of those features should be even rarer, I think?
>
> It seems at least memory_global_dirty_log_sync() can be called even without
> start logging, so I expect nothing should crash immediately. I spot one in
> colo_incoming_start_dirty_log() already of such use.  My wild guess is it
> relies on all log_sync*() hooks to cope with it, e.g. KVM ioctl() should
> fail with -ENENT on most archs I think when it sees dirty log not ever
> started.
>
> For those bits, I'll wait and see whether Yong or Hailiang (cced) has any
> comments. From generic migration/memory side, nothing I see wrong:
>
> Acked-by: Peter Xu 
>
> Thanks,
>
> --
> Peter Xu
>
>

-- 
Best regards


Re: [PULL 3/3] target/loongarch: Fix qemu-loongarch64 hang when executing 'll.d $t0, $t0, 0'

2024-03-21 Thread gaosong

On 2024/3/22 at 01:13, Michael Tokarev wrote:

20.03.2024 05:40, Song Gao :

On gen_ll, if a->imm is zero, make_address_x return src1,
but the load to destination may clobber src1. We use a new
destination to fix this problem.

Fixes: c5af6628f4be (target/loongarch: Extract make_address_i() helper)
Reviewed-by: Richard Henderson 
Suggested-by: Richard Henderson 
Signed-off-by: Song Gao 
Message-Id: <20240320013955.1561311-1-gaos...@loongson.cn>


Is it a stable-8.2 material?


Yes.

Thanks.
Song Gao

Thanks,

/mjt





RE: [INCOMPLETE PATCH v3 3/3] ui/console: Introduce dpy_gl_create_dmabuf() helper

2024-03-21 Thread Kim, Dongwon
Hi Phil,

I submitted a revised series (v4) based on your v3 work. I tried to address the
2 or 3 major issues you mentioned there, plus some minor corrections.
Please check it. I appreciate your help and feedback.

[PATCH v4 0/3] ui/console: Introduce helpers for creating and

> -Original Message-
> From: Philippe Mathieu-Daudé 
> Sent: Thursday, March 21, 2024 3:07 AM
> To: Kim, Dongwon ; qemu-devel@nongnu.org
> Cc: Alex Williamson ; Cédric Le Goater
> ; Marc-André Lureau ;
> Kasireddy, Vivek ; Michael S. Tsirkin
> ; Gerd Hoffmann ; Philippe Mathieu-
> Daudé 
> Subject: [INCOMPLETE PATCH v3 3/3] ui/console: Introduce
> dpy_gl_create_dmabuf() helper
> 
> From: Dongwon Kim 
> 
> It is safer to create, initialize, and access all the parameters in QemuDmaBuf
> from a central location, ui/console, instead of hw/virtio-gpu or hw/vfio 
> modules.
> 
> Signed-off-by: Dongwon Kim 
> Message-Id: <20240320034229.3347130-1-dongwon@intel.com>
> [PMD: Split patch in 3, part 3/3]
> Signed-off-by: Philippe Mathieu-Daudé 
> ---
> Incomplete... VhostUserGPU doesn't allocate,
> vhost_user_gpu_handle_display() crashes.
> ---
>  include/hw/vfio/vfio-common.h   |  2 +-
>  include/hw/virtio/virtio-gpu.h  |  2 +-
>  include/ui/console.h|  7 +++
>  hw/display/virtio-gpu-udmabuf.c | 23 ---
>  hw/vfio/display.c   | 24 ++--
>  ui/console.c| 28 
>  6 files changed, 55 insertions(+), 31 deletions(-)
> 
> diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
> index b9da6c08ef..d66e27db02 100644
> --- a/include/hw/vfio/vfio-common.h
> +++ b/include/hw/vfio/vfio-common.h
> @@ -148,7 +148,7 @@ typedef struct VFIOGroup {  } VFIOGroup;
> 
>  typedef struct VFIODMABuf {
> -QemuDmaBuf buf;
> +QemuDmaBuf *buf;
>  uint32_t pos_x, pos_y, pos_updates;
>  uint32_t hot_x, hot_y, hot_updates;
>  int dmabuf_id;
> diff --git a/include/hw/virtio/virtio-gpu.h b/include/hw/virtio/virtio-gpu.h 
> index
> ed44cdad6b..010083e8e3 100644
> --- a/include/hw/virtio/virtio-gpu.h
> +++ b/include/hw/virtio/virtio-gpu.h
> @@ -169,7 +169,7 @@ struct VirtIOGPUBaseClass {
>  DEFINE_PROP_UINT32("yres", _state, _conf.yres, 800)
> 
>  typedef struct VGPUDMABuf {
> -QemuDmaBuf buf;
> +QemuDmaBuf *buf;
>  uint32_t scanout_id;
>  QTAILQ_ENTRY(VGPUDMABuf) next;
>  } VGPUDMABuf;
> diff --git a/include/ui/console.h b/include/ui/console.h index
> 1f3d025548..0b823efb2e 100644
> --- a/include/ui/console.h
> +++ b/include/ui/console.h
> @@ -279,6 +279,7 @@ typedef struct DisplayChangeListenerOps {
>  /* optional */
>  void (*dpy_gl_cursor_position)(DisplayChangeListener *dcl,
> uint32_t pos_x, uint32_t pos_y);
> +
>  /* optional */
>  void (*dpy_gl_release_dmabuf)(DisplayChangeListener *dcl,
>QemuDmaBuf *dmabuf); @@ -358,6 +359,12 @@ 
> void
> dpy_gl_cursor_dmabuf(QemuConsole *con, QemuDmaBuf *dmabuf,
>bool have_hot, uint32_t hot_x, uint32_t hot_y);  
> void
> dpy_gl_cursor_position(QemuConsole *con,
>  uint32_t pos_x, uint32_t pos_y);
> +QemuDmaBuf *dpy_gl_create_dmabuf(uint32_t width, uint32_t height,
> + uint32_t stride, uint32_t x,
> + uint32_t y, uint32_t backing_width,
> + uint32_t backing_height, uint32_t fourcc,
> + uint32_t modifier, uint32_t dmabuf_fd,
> + bool allow_fences);
>  uint32_t dpy_gl_dmabuf_get_width(QemuDmaBuf *dmabuf);  uint32_t
> dpy_gl_dmabuf_get_height(QemuDmaBuf *dmabuf);  int32_t
> dpy_gl_dmabuf_get_fd(QemuDmaBuf *dmabuf); diff --git a/hw/display/virtio-
> gpu-udmabuf.c b/hw/display/virtio-gpu-udmabuf.c index
> d680e871c1..dde6c8e9d9 100644
> --- a/hw/display/virtio-gpu-udmabuf.c
> +++ b/hw/display/virtio-gpu-udmabuf.c
> @@ -162,7 +162,7 @@ static void virtio_gpu_free_dmabuf(VirtIOGPU *g,
> VGPUDMABuf *dmabuf)
>  struct virtio_gpu_scanout *scanout;
> 
>  scanout = &g->parent_obj.scanout[dmabuf->scanout_id];
> 
> -dpy_gl_release_dmabuf(scanout->con, &dmabuf->buf);
>  QTAILQ_REMOVE(>dmabuf.bufs, dmabuf, next);
>  g_free(dmabuf);
>  }
> @@ -181,17 +181,10 @@ static VGPUDMABuf
>  }
> 
>  dmabuf = g_new0(VGPUDMABuf, 1);
> -dmabuf->buf.width = r->width;
> -dmabuf->buf.height = r->height;
> -dmabuf->buf.stride = fb->stride;
> -dmabuf->buf.x = r->x;
> -dmabuf->buf.y = r->y;
> -dmabuf->buf.backing_width = fb->width;
> -dmabuf->buf.backing_height = fb->height;
> -dmabuf->buf.fourcc = qemu_pixman_to_drm_format(fb->format);
> -dmabuf->buf.fd = res->dmabuf_fd;
> -dmabuf->buf.allow_fences = true;
> -dmabuf->buf.draw_submitted = false;
> +dmabuf->buf = 

[PATCH v4 1/3] ui/console: Introduce dpy_gl_dmabuf_get_height/width() helpers

2024-03-21 Thread dongwon . kim
From: Dongwon Kim 

dpy_gl_dmabuf_get_height() and dpy_gl_dmabuf_get_width() are helpers for
retrieving the width and height fields from a QemuDmaBuf struct.

Cc: Philippe Mathieu-Daudé 
Cc: Marc-André Lureau 
Cc: Vivek Kasireddy 
Signed-off-by: Dongwon Kim 
---
 include/ui/console.h|  2 ++
 hw/display/virtio-gpu-udmabuf.c |  7 ---
 hw/vfio/display.c   |  9 ++---
 ui/console.c| 18 ++
 4 files changed, 30 insertions(+), 6 deletions(-)

diff --git a/include/ui/console.h b/include/ui/console.h
index 0bc7a00ac0..6064487fc4 100644
--- a/include/ui/console.h
+++ b/include/ui/console.h
@@ -358,6 +358,8 @@ void dpy_gl_cursor_dmabuf(QemuConsole *con, QemuDmaBuf *dmabuf,
   bool have_hot, uint32_t hot_x, uint32_t hot_y);
 void dpy_gl_cursor_position(QemuConsole *con,
 uint32_t pos_x, uint32_t pos_y);
+uint32_t dpy_gl_dmabuf_get_width(QemuDmaBuf *dmabuf);
+uint32_t dpy_gl_dmabuf_get_height(QemuDmaBuf *dmabuf);
 void dpy_gl_release_dmabuf(QemuConsole *con,
QemuDmaBuf *dmabuf);
 void dpy_gl_update(QemuConsole *con,
diff --git a/hw/display/virtio-gpu-udmabuf.c b/hw/display/virtio-gpu-udmabuf.c
index d51184d658..a4ebf828ec 100644
--- a/hw/display/virtio-gpu-udmabuf.c
+++ b/hw/display/virtio-gpu-udmabuf.c
@@ -206,6 +206,7 @@ int virtio_gpu_update_dmabuf(VirtIOGPU *g,
 {
 struct virtio_gpu_scanout *scanout = &g->parent_obj.scanout[scanout_id];
 VGPUDMABuf *new_primary, *old_primary = NULL;
+uint32_t width, height;
 
 new_primary = virtio_gpu_create_dmabuf(g, scanout_id, res, fb, r);
 if (!new_primary) {
@@ -216,10 +217,10 @@ int virtio_gpu_update_dmabuf(VirtIOGPU *g,
 old_primary = g->dmabuf.primary[scanout_id];
 }
 
+width = dpy_gl_dmabuf_get_width(&new_primary->buf);
+height = dpy_gl_dmabuf_get_height(&new_primary->buf);
 g->dmabuf.primary[scanout_id] = new_primary;
-qemu_console_resize(scanout->con,
-new_primary->buf.width,
-new_primary->buf.height);
+qemu_console_resize(scanout->con, width, height);
 dpy_gl_scanout_dmabuf(scanout->con, &new_primary->buf);
 
 if (old_primary) {
diff --git a/hw/vfio/display.c b/hw/vfio/display.c
index 1aa440c663..c962e5f88f 100644
--- a/hw/vfio/display.c
+++ b/hw/vfio/display.c
@@ -286,6 +286,7 @@ static void vfio_display_dmabuf_update(void *opaque)
 VFIOPCIDevice *vdev = opaque;
 VFIODisplay *dpy = vdev->dpy;
 VFIODMABuf *primary, *cursor;
+uint32_t width, height;
 bool free_bufs = false, new_cursor = false;
 
 primary = vfio_display_get_dmabuf(vdev, DRM_PLANE_TYPE_PRIMARY);
@@ -296,10 +297,12 @@ static void vfio_display_dmabuf_update(void *opaque)
 return;
 }
 
+width = dpy_gl_dmabuf_get_width(&primary->buf);
+height = dpy_gl_dmabuf_get_height(&primary->buf);
+
 if (dpy->dmabuf.primary != primary) {
 dpy->dmabuf.primary = primary;
-qemu_console_resize(dpy->con,
-primary->buf.width, primary->buf.height);
+qemu_console_resize(dpy->con, width, height);
 dpy_gl_scanout_dmabuf(dpy->con, &primary->buf);
 free_bufs = true;
 }
@@ -328,7 +331,7 @@ static void vfio_display_dmabuf_update(void *opaque)
 cursor->pos_updates = 0;
 }
 
-dpy_gl_update(dpy->con, 0, 0, primary->buf.width, primary->buf.height);
+dpy_gl_update(dpy->con, 0, 0, width, height);
 
 if (free_bufs) {
 vfio_display_free_dmabufs(vdev);
diff --git a/ui/console.c b/ui/console.c
index 43226c5c14..1d0513a733 100644
--- a/ui/console.c
+++ b/ui/console.c
@@ -1132,6 +1132,24 @@ void dpy_gl_cursor_position(QemuConsole *con,
 }
 }
 
+uint32_t dpy_gl_dmabuf_get_width(QemuDmaBuf *dmabuf)
+{
+if (dmabuf) {
+return dmabuf->width;
+}
+
+return 0;
+}
+
+uint32_t dpy_gl_dmabuf_get_height(QemuDmaBuf *dmabuf)
+{
+if (dmabuf) {
+return dmabuf->height;
+}
+
+return 0;
+}
+
 void dpy_gl_release_dmabuf(QemuConsole *con,
   QemuDmaBuf *dmabuf)
 {
-- 
2.34.1




[PATCH v4 2/3] ui/console: Introduce dpy_gl_dmabuf_get_fd() helper

2024-03-21 Thread dongwon . kim
From: Dongwon Kim 

dpy_gl_dmabuf_get_fd() is a helper for retrieving the FD of the dmabuf
from a QemuDmaBuf struct.

Cc: Philippe Mathieu-Daudé 
Cc: Marc-André Lureau 
Cc: Vivek Kasireddy 
Signed-off-by: Dongwon Kim 
---
 include/ui/console.h | 1 +
 hw/vfio/display.c| 8 +++-
 ui/console.c | 9 +
 3 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/include/ui/console.h b/include/ui/console.h
index 6064487fc4..d5334a806c 100644
--- a/include/ui/console.h
+++ b/include/ui/console.h
@@ -360,6 +360,7 @@ void dpy_gl_cursor_position(QemuConsole *con,
 uint32_t pos_x, uint32_t pos_y);
 uint32_t dpy_gl_dmabuf_get_width(QemuDmaBuf *dmabuf);
 uint32_t dpy_gl_dmabuf_get_height(QemuDmaBuf *dmabuf);
+int32_t dpy_gl_dmabuf_get_fd(QemuDmaBuf *dmabuf);
 void dpy_gl_release_dmabuf(QemuConsole *con,
QemuDmaBuf *dmabuf);
 void dpy_gl_update(QemuConsole *con,
diff --git a/hw/vfio/display.c b/hw/vfio/display.c
index c962e5f88f..676b2fc5f3 100644
--- a/hw/vfio/display.c
+++ b/hw/vfio/display.c
@@ -259,9 +259,15 @@ static VFIODMABuf *vfio_display_get_dmabuf(VFIOPCIDevice *vdev,
 
 static void vfio_display_free_one_dmabuf(VFIODisplay *dpy, VFIODMABuf *dmabuf)
 {
+int fd;
+
 QTAILQ_REMOVE(&dpy->dmabuf.bufs, dmabuf, next);
+fd = dpy_gl_dmabuf_get_fd(&dmabuf->buf);
+if (fd > -1) {
+close(fd);
+}
+
 dpy_gl_release_dmabuf(dpy->con, &dmabuf->buf);
-close(dmabuf->buf.fd);
 g_free(dmabuf);
 }
 
diff --git a/ui/console.c b/ui/console.c
index 1d0513a733..69560aac7e 100644
--- a/ui/console.c
+++ b/ui/console.c
@@ -1504,6 +1504,15 @@ int qemu_console_get_height(QemuConsole *con, int fallback)
 }
 }
 
+int32_t dpy_gl_dmabuf_get_fd(QemuDmaBuf *dmabuf)
+{
+if (dmabuf) {
+return dmabuf->fd;
+}
+
+return -1;
+}
+
 int qemu_invalidate_text_consoles(void)
 {
 QemuConsole *s;
-- 
2.34.1




[PATCH v4 3/3] ui/console: Introduce dpy_gl_create_dmabuf() helper

2024-03-21 Thread dongwon . kim
From: Dongwon Kim 

dpy_gl_create_dmabuf() allocates a QemuDmaBuf and initializes its fields.
The hw/display modules, hw/vfio and ui/dbus-listener now use this method
to create a QemuDmaBuf instead of declaring and initializing it on their
own.

Cc: Philippe Mathieu-Daudé 
Cc: Marc-André Lureau 
Cc: Vivek Kasireddy 
Signed-off-by: Dongwon Kim 
---
 include/hw/vfio/vfio-common.h   |  2 +-
 include/hw/virtio/virtio-gpu.h  |  4 ++--
 include/ui/console.h|  6 ++
 hw/display/vhost-user-gpu.c | 33 ++---
 hw/display/virtio-gpu-udmabuf.c | 23 ---
 hw/vfio/display.c   | 26 +++---
 ui/console.c| 28 
 ui/dbus-listener.c  | 28 
 8 files changed, 86 insertions(+), 64 deletions(-)

diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index b9da6c08ef..d66e27db02 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -148,7 +148,7 @@ typedef struct VFIOGroup {
 } VFIOGroup;
 
 typedef struct VFIODMABuf {
-QemuDmaBuf buf;
+QemuDmaBuf *buf;
 uint32_t pos_x, pos_y, pos_updates;
 uint32_t hot_x, hot_y, hot_updates;
 int dmabuf_id;
diff --git a/include/hw/virtio/virtio-gpu.h b/include/hw/virtio/virtio-gpu.h
index ed44cdad6b..56d6e821bf 100644
--- a/include/hw/virtio/virtio-gpu.h
+++ b/include/hw/virtio/virtio-gpu.h
@@ -169,7 +169,7 @@ struct VirtIOGPUBaseClass {
 DEFINE_PROP_UINT32("yres", _state, _conf.yres, 800)
 
 typedef struct VGPUDMABuf {
-QemuDmaBuf buf;
+QemuDmaBuf *buf;
 uint32_t scanout_id;
 QTAILQ_ENTRY(VGPUDMABuf) next;
 } VGPUDMABuf;
@@ -238,7 +238,7 @@ struct VhostUserGPU {
 VhostUserBackend *vhost;
 int vhost_gpu_fd; /* closed by the chardev */
 CharBackend vhost_chr;
-QemuDmaBuf dmabuf[VIRTIO_GPU_MAX_SCANOUTS];
+QemuDmaBuf *dmabuf[VIRTIO_GPU_MAX_SCANOUTS];
 bool backend_blocked;
 };
 
diff --git a/include/ui/console.h b/include/ui/console.h
index d5334a806c..01e998264b 100644
--- a/include/ui/console.h
+++ b/include/ui/console.h
@@ -358,6 +358,12 @@ void dpy_gl_cursor_dmabuf(QemuConsole *con, QemuDmaBuf *dmabuf,
   bool have_hot, uint32_t hot_x, uint32_t hot_y);
 void dpy_gl_cursor_position(QemuConsole *con,
 uint32_t pos_x, uint32_t pos_y);
+QemuDmaBuf *dpy_gl_create_dmabuf(uint32_t width, uint32_t height,
+ uint32_t stride, uint32_t x,
+ uint32_t y, uint32_t backing_width,
+ uint32_t backing_height, uint32_t fourcc,
+ uint64_t modifier, uint32_t dmabuf_fd,
+ bool allow_fences, bool y0_top);
 uint32_t dpy_gl_dmabuf_get_width(QemuDmaBuf *dmabuf);
 uint32_t dpy_gl_dmabuf_get_height(QemuDmaBuf *dmabuf);
 int32_t dpy_gl_dmabuf_get_fd(QemuDmaBuf *dmabuf);
diff --git a/hw/display/vhost-user-gpu.c b/hw/display/vhost-user-gpu.c
index 709c8a02a1..0e49a934ed 100644
--- a/hw/display/vhost-user-gpu.c
+++ b/hw/display/vhost-user-gpu.c
@@ -249,6 +249,7 @@ vhost_user_gpu_handle_display(VhostUserGPU *g, VhostUserGpuMsg *msg)
 case VHOST_USER_GPU_DMABUF_SCANOUT: {
 VhostUserGpuDMABUFScanout *m = &msg->payload.dmabuf_scanout;
 int fd = qemu_chr_fe_get_msgfd(&g->vhost_chr);
+uint64_t modifier = 0;
 QemuDmaBuf *dmabuf;
 
 if (m->scanout_id >= g->parent_obj.conf.max_outputs) {
@@ -261,30 +262,32 @@ vhost_user_gpu_handle_display(VhostUserGPU *g, VhostUserGpuMsg *msg)
 
 g->parent_obj.enable = 1;
 con = g->parent_obj.scanout[m->scanout_id].con;
-dmabuf = &g->dmabuf[m->scanout_id];
-if (dmabuf->fd >= 0) {
-close(dmabuf->fd);
-dmabuf->fd = -1;
+dmabuf = g->dmabuf[m->scanout_id];
+if (dmabuf) {
+int dmabuf_fd = dpy_gl_dmabuf_get_fd(dmabuf);
+if (dmabuf_fd >= 0) {
+close(dmabuf_fd);
+}
+dpy_gl_release_dmabuf(con, dmabuf);
 }
-dpy_gl_release_dmabuf(con, dmabuf);
+
 if (fd == -1) {
 dpy_gl_scanout_disable(con);
 break;
 }
-*dmabuf = (QemuDmaBuf) {
-.fd = fd,
-.width = m->fd_width,
-.height = m->fd_height,
-.stride = m->fd_stride,
-.fourcc = m->fd_drm_fourcc,
-.y0_top = m->fd_flags & VIRTIO_GPU_RESOURCE_FLAG_Y_0_TOP,
-};
+
 if (msg->request == VHOST_USER_GPU_DMABUF_SCANOUT2) {
 VhostUserGpuDMABUFScanout2 *m2 = &msg->payload.dmabuf_scanout2;
-dmabuf->modifier = m2->modifier;
+modifier = m2->modifier;
 }
 
-dpy_gl_scanout_dmabuf(con, dmabuf);
+dmabuf = dpy_gl_create_dmabuf(m->fd_width, m->fd_height, m->fd_stride,
+  0, 0, 0, 0, 

[PATCH v4 0/3] ui/console: Introduce helpers for creating and

2024-03-21 Thread dongwon . kim
From: Dongwon Kim 

The QemuDmaBuf struct is defined and primarily used by ui/console/gl, so it
is better to handle its creation, initialization and access within ui/console
rather than within hw modules such as hw/display/virtio-gpu and
hw/display/vhost-user-gpu, in hw/vfio, and in ui/dbus-listener.

v4: refactored patches in the previous series

made ui/dbus-listener and hw/display/vhost-user-gpu use the new
helpers

Dongwon Kim (3):
  ui/console: Introduce dpy_gl_dmabuf_get_height/width() helpers
  ui/console: Introduce dpy_gl_dmabuf_get_fd() helper
  ui/console: Introduce dpy_gl_create_dmabuf() helper

 include/hw/vfio/vfio-common.h   |  2 +-
 include/hw/virtio/virtio-gpu.h  |  2 +-
 include/ui/console.h|  9 ++
 hw/display/vhost-user-gpu.c | 12 +++
 hw/display/virtio-gpu-udmabuf.c | 26 ++--
 hw/vfio/display.c   | 36 +++--
 ui/console.c| 55 +
 ui/dbus-listener.c  | 22 ++---
 8 files changed, 109 insertions(+), 55 deletions(-)

-- 
2.34.1




Re: [PATCH-for-9.1 03/21] target/i386: Move APIC related code to cpu-apic.c

2024-03-21 Thread Richard Henderson

On 3/21/24 05:48, Philippe Mathieu-Daudé wrote:

Move APIC related code split in cpu-sysemu.c and
monitor.c to cpu-apic.c.

Signed-off-by: Philippe Mathieu-Daudé
---
  target/i386/cpu-apic.c   | 112 +++
  target/i386/cpu-sysemu.c |  77 ---
  target/i386/monitor.c|  25 -
  target/i386/meson.build  |   1 +
  4 files changed, 113 insertions(+), 102 deletions(-)
  create mode 100644 target/i386/cpu-apic.c


Reviewed-by: Richard Henderson 

r~



Re: [PATCH-for-9.1 07/21] target/m68k: Move MMU monitor commands from helper.c to monitor.c

2024-03-21 Thread Richard Henderson

On 3/21/24 05:48, Philippe Mathieu-Daudé wrote:

Keep all HMP commands in monitor.c.

Signed-off-by: Philippe Mathieu-Daudé
---
  target/m68k/cpu.h |   3 +-
  target/m68k/helper.c  | 222 -
  target/m68k/monitor.c | 223 +-
  3 files changed, 223 insertions(+), 225 deletions(-)


Reviewed-by: Richard Henderson 

r~



Re: [PATCH-for-9.1 06/21] target/m68k: Have dump_ttr() take a @description argument

2024-03-21 Thread Richard Henderson

On 3/21/24 05:48, Philippe Mathieu-Daudé wrote:

Slightly simplify dump_mmu() by passing the description as an
argument to dump_ttr().

Signed-off-by: Philippe Mathieu-Daudé
---
  target/m68k/helper.c | 15 ++-
  1 file changed, 6 insertions(+), 9 deletions(-)


Reviewed-by: Richard Henderson 

r~



Re: [PATCH-for-9.1 05/21] target/m68k: Replace qemu_printf() by monitor_printf() in monitor

2024-03-21 Thread Richard Henderson

On 3/21/24 05:48, Philippe Mathieu-Daudé wrote:

Replace qemu_printf() by monitor_printf() / monitor_puts() in monitor.

Signed-off-by: Philippe Mathieu-Daudé
---
  target/m68k/cpu.h |   2 +-
  target/m68k/helper.c  | 126 +-
  target/m68k/monitor.c |   4 +-
  3 files changed, 67 insertions(+), 65 deletions(-)


Reviewed-by: Richard Henderson 


r~



Re: [PATCH-for-9.1 04/21] target/i386: Extract x86_dump_mmu() from hmp_info_tlb()

2024-03-21 Thread Richard Henderson

On 3/21/24 05:48, Philippe Mathieu-Daudé wrote:

hmp_info_tlb() is specific to tcg/system; move it to
target/i386/tcg/sysemu/hmp-cmds.c, along with the functions
it depends on (except addr_canonical(), which is exposed in
"cpu.h").

Signed-off-by: Philippe Mathieu-Daudé
---
  target/i386/cpu.h   |   7 ++
  target/i386/mmu.c   | 231 
  target/i386/monitor.c   | 215 -
  target/i386/meson.build |   1 +
  4 files changed, 239 insertions(+), 215 deletions(-)
  create mode 100644 target/i386/mmu.c


Patch commit message appears to be out of date wrt filename.
Otherwise,

Reviewed-by: Richard Henderson 


r~



[PATCH] virtio-snd: Skip invalid message sizes and null streams

2024-03-21 Thread Zheyu Ma
This update changes how virtio_snd_handle_tx_xfer handles message size
discrepancies and null streams. Instead of taking error-handling paths,
which led to unnecessary processing and potential null-pointer
dereferences, the function now continues to the next loop iteration.

ASAN log illustrating the issue addressed:

ERROR: AddressSanitizer: SEGV on unknown address 0x00b4 (pc 
0x57cea39967b8 bp 0x7ffce84d51b0 sp 0x7ffce84d5160 T0)
#0 0x57cea39967b8 in qemu_mutex_lock_impl qemu/util/qemu-thread-posix.c:92:5
#1 0x57cea128c462 in qemu_mutex_lock qemu/include/qemu/thread.h:122:5
#2 0x57cea128d72f in qemu_lockable_lock qemu/include/qemu/lockable.h:95:5
#3 0x57cea128c294 in qemu_lockable_auto_lock 
qemu/include/qemu/lockable.h:105:5
#4 0x57cea1285eb2 in virtio_snd_handle_rx_xfer 
qemu/hw/audio/virtio-snd.c:1026:9
#5 0x57cea2caebbc in virtio_queue_notify_vq qemu/hw/virtio/virtio.c:2268:9
#6 0x57cea2cae412 in virtio_queue_host_notifier_read 
qemu/hw/virtio/virtio.c:3671:9
#7 0x57cea39822f1 in aio_dispatch_handler qemu/util/aio-posix.c:372:9
#8 0x57cea3979385 in aio_dispatch_handlers qemu/util/aio-posix.c:414:20
#9 0x57cea3978eb1 in aio_dispatch qemu/util/aio-posix.c:424:5
#10 0x57cea3a1eede in aio_ctx_dispatch qemu/util/async.c:360:5

Signed-off-by: Zheyu Ma 
---
 hw/audio/virtio-snd.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/audio/virtio-snd.c b/hw/audio/virtio-snd.c
index e604d8f30c..d9e9f980f7 100644
--- a/hw/audio/virtio-snd.c
+++ b/hw/audio/virtio-snd.c
@@ -913,13 +913,13 @@ static void virtio_snd_handle_tx_xfer(VirtIODevice *vdev, VirtQueue *vq)
 &hdr,
 sizeof(virtio_snd_pcm_xfer));
 if (msg_sz != sizeof(virtio_snd_pcm_xfer)) {
-goto tx_err;
+continue;
 }
 stream_id = le32_to_cpu(hdr.stream_id);
 
 if (stream_id >= s->snd_conf.streams
 || s->pcm->streams[stream_id] == NULL) {
-goto tx_err;
+continue;
 }
 
 stream = s->pcm->streams[stream_id];
-- 
2.34.1




Re: [PATCH v4 2/2] vhost: Perform memory section dirty scans once per iteration

2024-03-21 Thread Si-Wei Liu




On 3/20/2024 8:56 PM, Jason Wang wrote:

On Thu, Mar 21, 2024 at 5:03 AM Si-Wei Liu  wrote:



On 3/19/2024 8:27 PM, Jason Wang wrote:

On Tue, Mar 19, 2024 at 6:16 AM Si-Wei Liu  wrote:


On 3/17/2024 8:22 PM, Jason Wang wrote:

On Sat, Mar 16, 2024 at 2:45 AM Si-Wei Liu  wrote:

On 3/14/2024 9:03 PM, Jason Wang wrote:

On Fri, Mar 15, 2024 at 5:39 AM Si-Wei Liu  wrote:

On setups with one or more virtio-net devices with vhost on,
the cost of each dirty tracking iteration increases with the number
of queues set up, e.g. on idle guest migration the
following is observed with virtio-net with vhost=on:

48 queues -> 78.11%  [.] vhost_dev_sync_region.isra.13
8 queues -> 40.50%   [.] vhost_dev_sync_region.isra.13
1 queue -> 6.89% [.] vhost_dev_sync_region.isra.13
2 devices, 1 queue -> 18.60%  [.] vhost_dev_sync_region.isra.14

With high memory update rates the symptom is lack of convergence as soon
as there is a vhost device with a sufficiently high number of queues,
or a sufficient number of vhost devices.

On every migration iteration (every 100 msecs) it will redundantly
query the *shared log* once per vhost queue configured
in the guest. For the virtqueue data, this is necessary,
but not for the memory sections, which are the same. So essentially
we end up scanning the dirty log too often.

To fix that, select a vhost device responsible for scanning the
log with regards to memory sections dirty tracking. It is selected
when we enable the logger (during migration) and cleared when we
disable the logger. If the vhost logger device goes away for some
reason, the logger will be re-selected from the rest of vhost
devices.

After making mem-section logger a singleton instance, constant cost
of 7%-9% (like the 1 queue report) will be seen, no matter how many
queues or how many vhost devices are configured:

48 queues -> 8.71%[.] vhost_dev_sync_region.isra.13
2 devices, 8 queues -> 7.97%   [.] vhost_dev_sync_region.isra.14

Co-developed-by: Joao Martins 
Signed-off-by: Joao Martins 
Signed-off-by: Si-Wei Liu 

---
v3 -> v4:
  - add comment to clarify effect on cache locality and
performance

v2 -> v3:
  - add after-fix benchmark to commit log
  - rename vhost_log_dev_enabled to vhost_dev_should_log
  - remove unneeded comparisons for backend_type
  - use QLIST array instead of single flat list to store vhost
logger devices
  - simplify logger election logic
---
 hw/virtio/vhost.c | 67 ++-
 include/hw/virtio/vhost.h |  1 +
 2 files changed, 62 insertions(+), 6 deletions(-)

diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 612f4db..58522f1 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -45,6 +45,7 @@

 static struct vhost_log *vhost_log[VHOST_BACKEND_TYPE_MAX];
 static struct vhost_log *vhost_log_shm[VHOST_BACKEND_TYPE_MAX];
+static QLIST_HEAD(, vhost_dev) vhost_log_devs[VHOST_BACKEND_TYPE_MAX];

 /* Memslots used by backends that support private memslots (without an fd). */
 static unsigned int used_memslots;
@@ -149,6 +150,47 @@ bool vhost_dev_has_iommu(struct vhost_dev *dev)
 }
 }

+static inline bool vhost_dev_should_log(struct vhost_dev *dev)
+{
+assert(dev->vhost_ops);
+assert(dev->vhost_ops->backend_type > VHOST_BACKEND_TYPE_NONE);
+assert(dev->vhost_ops->backend_type < VHOST_BACKEND_TYPE_MAX);
+
+return dev == QLIST_FIRST(&vhost_log_devs[dev->vhost_ops->backend_type]);

A dumb question: why not simply check

dev->log == vhost_log_shm[dev->vhost_ops->backend_type]

Because we are not sure if the logger comes from vhost_log_shm[] or
vhost_log[]. We don't want to complicate the check here by calling into
vhost_dev_log_is_shared() every time .log_sync() is called.

It has very low overhead, isn't it?

Whether this has low overhead depends on the specific
backend's implementation of .vhost_requires_shm_log(), which the common
vhost layer should not make assumptions about or rely on.


static bool vhost_dev_log_is_shared(struct vhost_dev *dev)
{
   return dev->vhost_ops->vhost_requires_shm_log &&
  dev->vhost_ops->vhost_requires_shm_log(dev);
}

For example, if I understand the code correctly, the log type won't be
changed during runtime, so we can endup with a boolean to record that
instead of a query ops?

Right now the log type won't change at runtime, but I am not sure if
this may prohibit a future revisit to allow changing it at runtime,

We can be bothered when we have such a request then.


then
there'll be complex code involved to maintain the state.

Other than this, I think it's insufficient to just check the shm log
v.s. normal log. The logger device requires to identify a leading logger
device that gets elected in vhost_dev_elect_mem_logger(), as all the
dev->log points to the same logger that is refenerce counted, that we
have to add extra field and complex logic 

Re: [RFC 0/8] virtio,vhost: Add VIRTIO_F_IN_ORDER support

2024-03-21 Thread Jonah Palmer




On 3/21/24 3:48 PM, Dongli Zhang wrote:

Hi Jonah,

Would you mind helping explain how VIRTIO_F_IN_ORDER improves
performance?

https://lore.kernel.org/all/20240321155717.1392787-1-jonah.pal...@oracle.com/#t

I tried to look for it from prior discussions but could not find why.

https://lore.kernel.org/all/byapr18mb2791df7e6c0f61e2d8698e8fa0...@byapr18mb2791.namprd18.prod.outlook.com/

Thank you very much!

Dongli Zhang



Hey Dongli,

So VIRTIO_F_IN_ORDER can theoretically improve performance under certain 
conditions. Whether it can improve performance today, I'm not sure.


But, if we can guarantee that all buffers are used by the device in the 
same order in which they're made available by the driver (enforcing a 
strict in-order processing and completion of requests), then we can 
leverage this to our advantage.


For example, we could simplify device and driver logic, such as not 
needing complex mechanisms to track the completion of out-of-order 
requests (reducing request management overhead). Though the need for 
complex mechanisms to force this data to be in-order kind of defeats 
this benefit.


It could also improve cache utilization since sequential access patterns 
are more cache-friendly compared to random access patterns.


Also, in-order processing is more predictable, making it easier to 
optimize device and driver performance. E.g. it can allow us to 
fine-tune things without having to account for the variability of 
out-of-order completions.


But again, the actual performance impact will vary depending on the use 
case and workload. In scenarios that require high levels of parallelism, or 
where out-of-order completions are efficiently managed, the flexibility 
of out-of-order processing can still be preferable.


Jonah


On 3/21/24 08:57, Jonah Palmer wrote:

The goal of these patches is to add support to a variety of virtio and
vhost devices for the VIRTIO_F_IN_ORDER transport feature. This feature
indicates that all buffers are used by the device in the same order in
which they were made available by the driver.

These patches attempt to implement a generalized, non-device-specific
solution to support this feature.

The core feature behind this solution is a buffer mechanism in the form
of GLib's GHashTable. The decision behind using a hash table was to
leverage their ability for quick lookup, insertion, and removal
operations. Given that our keys are simply numbers of an ordered
sequence, a hash table seemed like the best choice for a buffer
mechanism.

-

The strategy behind this implementation is as follows:

We know that buffers that are popped from the available ring and enqueued
for further processing will always be done in the same order in which they
were made available by the driver. Given this, we can note their order
by assigning the resulting VirtQueueElement a key. This key is a number
in a sequence that represents the order in which they were popped from
the available ring, relative to the other VirtQueueElements.

For example, given 3 "elements" that were popped from the available
ring, we assign a key value to them which represents their order (elem0
is popped first, then elem1, then lastly elem2):

  elem2   --  elem1   --  elem0   ---> Enqueue for processing
 (key: 2)(key: 1)(key: 0)

Then these elements are enqueued for further processing by the host.

While most devices will return these completed elements in the same
order in which they were enqueued, some devices may not (e.g.
virtio-blk). To guarantee that these elements are put on the used ring
in the same order in which they were enqueued, we can use a buffering
mechanism that keeps track of the next expected sequence number of an
element.

In other words, if the completed element does not have a key value that
matches the next expected sequence number, then we know this element is
not in-order and we must stash it away in a hash table until an order
can be made. The element's key value is used as the key for placing it
in the hash table.

If the completed element has a key value that matches the next expected
sequence number, then we know this element is in-order and we can push
it on the used ring. Then we increment the next expected sequence number
and check if the hash table contains an element at this key location.

If so, we retrieve this element, push it to the used ring, delete the
key-value pair from the hash table, increment the next expected sequence
number, and check the hash table again for an element at this new key
location. This process is repeated until we're unable to find an element
in the hash table to continue the order.

So, for example, say the 3 elements we enqueued were completed in the
following order: elem1, elem2, elem0. The next expected sequence number
is 0:

 exp-seq-num = 0:

  elem1   --> elem1.key == exp-seq-num ? --> No, stash it
 (key: 1) |

Re: [PATCH-for-9.1 02/21] hw/core: Remove check on NEED_CPU_H in tcg-cpu-ops.h

2024-03-21 Thread Richard Henderson

On 3/21/24 05:48, Philippe Mathieu-Daudé wrote:

Commit fd3f7d24d4 ("include/hw/core: Remove i386 conditional
on fake_user_interrupt") remove the need to check on NEED_CPU_H.

Signed-off-by: Philippe Mathieu-Daudé
---
  include/hw/core/tcg-cpu-ops.h | 2 --
  1 file changed, 2 deletions(-)


Reviewed-by: Richard Henderson 


r~



Re: [PATCH 1/3] target/hppa: add unit conditions for wide mode

2024-03-21 Thread Richard Henderson

On 3/21/24 10:10, Sven Schnelle wrote:

Richard Henderson  writes:


On 3/21/24 08:42, Sven Schnelle wrote:

Wide mode provides two more conditions, add them.
Signed-off-by: Sven Schnelle 
---
   target/hppa/translate.c | 25 +++--
   1 file changed, 23 insertions(+), 2 deletions(-)
diff --git a/target/hppa/translate.c b/target/hppa/translate.c
index 8a87996fc1..f493e207e1 100644
--- a/target/hppa/translate.c
+++ b/target/hppa/translate.c
@@ -963,11 +963,22 @@ static DisasCond do_unit_cond(unsigned cf, bool d, 
TCGv_i64 res,
 switch (cf >> 1) {
   case 0: /* never / TR */
-case 1: /* undefined */
-case 5: /* undefined */
   cond = cond_make_f();
   break;
   +case 1:


Wants a comment for /* SWZ / NWZ */


+case 5:


/* SWC / NWC */


Are you going to fix that up, or should I send a v2?


I'll fix it up.


r~



Re: [PATCH] migration/multifd: Fix clearing of mapped-ram zero pages

2024-03-21 Thread Peter Xu
On Thu, Mar 21, 2024 at 05:12:42PM -0300, Fabiano Rosas wrote:
> When the zero page detection is done in the multifd threads, we need
> to iterate the second part of the pages->offset array and clear the
> file bitmap for each zero page. The piece of code we merged to do that
> is wrong.
> 
> The reason this has passed all the tests is because the bitmap is
> initialized with zeroes already, so clearing the bits only really has
> an effect during live migration and when a data page goes from having
> data to no data.
> 
> Fixes: 303e6f54f9 ("migration/multifd: Implement zero page transmission on 
> the multifd thread.")
> Signed-off-by: Fabiano Rosas 
> ---
> CI run: https://gitlab.com/farosas/qemu/-/pipelines/1222882269
> ---
>  migration/multifd.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/migration/multifd.c b/migration/multifd.c
> index d2f0238f70..2802afe79d 100644
> --- a/migration/multifd.c
> +++ b/migration/multifd.c
> @@ -111,7 +111,6 @@ void multifd_send_channel_created(void)
>  static void multifd_set_file_bitmap(MultiFDSendParams *p)
>  {
>  MultiFDPages_t *pages = p->pages;
> -uint32_t zero_num = p->pages->num - p->pages->normal_num;
>  
>  assert(pages->block);
>  
> @@ -119,7 +118,7 @@ static void multifd_set_file_bitmap(MultiFDSendParams *p)
>  ramblock_set_file_bmap_atomic(pages->block, pages->offset[i], true);
>  }
>  
> -for (int i = p->pages->num; i < zero_num; i++) {
> +for (int i = p->pages->normal_num; i < p->pages->num; i++) {
>  ramblock_set_file_bmap_atomic(pages->block, pages->offset[i], false);
>  }
>  }

Hmm, a challenging one even if it reads obvious.. :)

queued for 9.0-rc1, thanks.

-- 
Peter Xu




Re: [PATCH RFC v3 00/49] Add AMD Secure Nested Paging (SEV-SNP) support

2024-03-21 Thread Michael Roth
On Wed, Mar 20, 2024 at 03:38:56AM -0500, Michael Roth wrote:
> 
> Testing
> ---
> 
> This series has been tested against the following host kernel tree, which
> is a snapshot of the latest WIP SNP hypervisor tree at the time of this
> posting. It will likely not be kept up to date afterward, so please keep an
> eye upstream or official AMDESE github if you are looking for the latest
> some time after this posting:
> 
>   https://github.com/mdroth/linux/commits/snp-host-v12-wip40/

I just noticed I had a necessary local change that wasn't included in the
initial push of this branch. I've updated the branch now, but just wanted
to post a heads-up in case anyone was having issues.

-Mike



Re: [PATCH v3 11/49] physmem: Introduce ram_block_discard_guest_memfd_range()

2024-03-21 Thread Michael Roth
On Wed, Mar 20, 2024 at 09:04:52PM +0100, David Hildenbrand wrote:
> On 20.03.24 18:38, Michael Roth wrote:
> > On Wed, Mar 20, 2024 at 10:37:14AM +0100, David Hildenbrand wrote:
> > > On 20.03.24 09:39, Michael Roth wrote:
> > > > From: Xiaoyao Li 
> > > > 
> > > > When memory page is converted from private to shared, the original
> > > > private memory is back'ed by guest_memfd. Introduce
> > > > ram_block_discard_guest_memfd_range() for discarding memory in
> > > > guest_memfd.
> > > > 
> > > > Originally-from: Isaku Yamahata 
> > > > Codeveloped-by: Xiaoyao Li 
> > > 
> > > "Co-developed-by"
> > > 
> > > > Signed-off-by: Xiaoyao Li 
> > > > Reviewed-by: David Hildenbrand 
> > > 
> > > Your SOB should go here.
> > > 
> > > > ---
> > > > Changes in v5:
> > > > - Collect Reviewed-by from David;
> > > > 
> > > > Changes in in v4:
> > > > - Drop ram_block_convert_range() and open code its implementation in the
> > > > next Patch.
> > > > 
> > > > Signed-off-by: Michael Roth 
> > > 
> > > I only received 3 patches from this series, and now I am confused: 
> > > changelog
> > > talks about v5 and this is "PATCH v3"
> > > 
> > > Please make sure to send at least the cover letter along (I might not need
> > > the other 46 patches :D ).
> > 
> > Sorry for the confusion, you got auto-Cc'd by git, which is good, but
> > not sure there's a good way to make sure everyone gets a copy of the
> > cover letter. I could see how it would help useful to potential
> > reviewers though. I'll try to come up with a script for it and take that
> > approach in the future.
> 
> A script shared with me in the past to achieve that in most cases:
> 
> $ cat cc-cmd.sh
> #!/bin/bash
> 
> if [[ $1 == *gitsendemail.msg* || $1 == *cover-letter* ]]; then
> grep ': .* <.*@.*>' -h *.patch | sed 's/^.*: //' | sort | uniq
> fi
> 
> 
> And attach to "git send-email ... *.patch": --cc-cmd=./cc-cmd.sh

That should do the trick nicely. Thanks!

-Mike

> 
> -- 
> Cheers,
> 
> David / dhildenb
> 



[PATCH] migration/multifd: Fix clearing of mapped-ram zero pages

2024-03-21 Thread Fabiano Rosas
When the zero page detection is done in the multifd threads, we need
to iterate the second part of the pages->offset array and clear the
file bitmap for each zero page. The piece of code we merged to do that
is wrong.

The reason this has passed all the tests is because the bitmap is
initialized with zeroes already, so clearing the bits only really has
an effect during live migration and when a data page goes from having
data to no data.

Fixes: 303e6f54f9 ("migration/multifd: Implement zero page transmission on the 
multifd thread.")
Signed-off-by: Fabiano Rosas 
---
CI run: https://gitlab.com/farosas/qemu/-/pipelines/1222882269
---
 migration/multifd.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/migration/multifd.c b/migration/multifd.c
index d2f0238f70..2802afe79d 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -111,7 +111,6 @@ void multifd_send_channel_created(void)
 static void multifd_set_file_bitmap(MultiFDSendParams *p)
 {
 MultiFDPages_t *pages = p->pages;
-uint32_t zero_num = p->pages->num - p->pages->normal_num;
 
 assert(pages->block);
 
@@ -119,7 +118,7 @@ static void multifd_set_file_bitmap(MultiFDSendParams *p)
 ramblock_set_file_bmap_atomic(pages->block, pages->offset[i], true);
 }
 
-for (int i = p->pages->num; i < zero_num; i++) {
+for (int i = p->pages->normal_num; i < p->pages->num; i++) {
 ramblock_set_file_bmap_atomic(pages->block, pages->offset[i], false);
 }
 }
-- 
2.35.3




Re: [PATCH 1/3] target/hppa: add unit conditions for wide mode

2024-03-21 Thread Sven Schnelle
Richard Henderson  writes:

> On 3/21/24 08:42, Sven Schnelle wrote:
>> Wide mode provides two more conditions, add them.
>> Signed-off-by: Sven Schnelle 
>> ---
>>   target/hppa/translate.c | 25 +++--
>>   1 file changed, 23 insertions(+), 2 deletions(-)
>> diff --git a/target/hppa/translate.c b/target/hppa/translate.c
>> index 8a87996fc1..f493e207e1 100644
>> --- a/target/hppa/translate.c
>> +++ b/target/hppa/translate.c
>> @@ -963,11 +963,22 @@ static DisasCond do_unit_cond(unsigned cf, bool d, 
>> TCGv_i64 res,
>> switch (cf >> 1) {
>>   case 0: /* never / TR */
>> -case 1: /* undefined */
>> -case 5: /* undefined */
>>   cond = cond_make_f();
>>   break;
>>   +case 1:
>
> Wants a comment for /* SWZ / NWZ */
>
>> +case 5:
>
> /* SWC / NWC */

Are you going to fix that up, or should I send a v2?



Re: [PATCH 2/3] target/hppa: sub: fix trap on overflow for narrow mode

2024-03-21 Thread Sven Schnelle
Richard Henderson  writes:

> On 3/21/24 08:42, Sven Schnelle wrote:
>> Signed-off-by: Sven Schnelle 
>> ---
>>   target/hppa/translate.c | 3 +++
>>   1 file changed, 3 insertions(+)
>> diff --git a/target/hppa/translate.c b/target/hppa/translate.c
>> index f493e207e1..4d2b96f876 100644
>> --- a/target/hppa/translate.c
>> +++ b/target/hppa/translate.c
>> @@ -1213,6 +1213,9 @@ static void do_sub(DisasContext *ctx, unsigned rt, 
>> TCGv_i64 in1,
>>   if (is_tsv || cond_need_sv(c)) {
>>   sv = do_sub_sv(ctx, dest, in1, in2);
>>   if (is_tsv) {
>> +if (!d) {
>> +tcg_gen_ext32s_i64(sv, sv);
>> +}
>>   gen_helper_tsv(tcg_env, sv);
>>   }
>>   }
>
> Difficult to pinpoint exactly which patch should have added this.  :-)

Yes, after missing the Fixes: tags on all of my patches in the last
patchset, I tried to add one but wasn't sure either. :-)

> Reviewed-by: Richard Henderson 

Thanks!



Re: [PATCH 1/3] target/hppa: add unit conditions for wide mode

2024-03-21 Thread Richard Henderson

On 3/21/24 08:42, Sven Schnelle wrote:

Wide mode provides two more conditions, add them.

Signed-off-by: Sven Schnelle 
---
  target/hppa/translate.c | 25 +++--
  1 file changed, 23 insertions(+), 2 deletions(-)

diff --git a/target/hppa/translate.c b/target/hppa/translate.c
index 8a87996fc1..f493e207e1 100644
--- a/target/hppa/translate.c
+++ b/target/hppa/translate.c
@@ -963,11 +963,22 @@ static DisasCond do_unit_cond(unsigned cf, bool d, 
TCGv_i64 res,
  
  switch (cf >> 1) {

  case 0: /* never / TR */
-case 1: /* undefined */
-case 5: /* undefined */
  cond = cond_make_f();
  break;
  
+case 1:


Wants a comment for /* SWZ / NWZ */


+case 5:


/* SWC / NWC */

Reviewed-by: Richard Henderson 


r~



Re: [PATCH 2/2] target/hppa: Fix B,GATE for wide mode

2024-03-21 Thread Richard Henderson

On 3/21/24 09:34, Philippe Mathieu-Daudé wrote:

On 21/3/24 20:28, Richard Henderson wrote:

Do not clobber the high bits of the address by using a 32-bit deposit.

Signed-off-by: Richard Henderson 
---
  target/hppa/translate.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/hppa/translate.c b/target/hppa/translate.c
index 1766a63001..f875d76a23 100644
--- a/target/hppa/translate.c
+++ b/target/hppa/translate.c
@@ -3880,7 +3880,7 @@ static bool trans_b_gate(DisasContext *ctx, arg_b_gate *a)
  }
  /* No change for non-gateway pages or for priv decrease.  */
  if (type >= 4 && type - 4 < ctx->privilege) {
-    dest = deposit32(dest, 0, 2, type - 4);
+    dest = deposit64(dest, 0, 2, type - 4);
  }
  } else {
  dest &= -4;  /* priv = 0 */


Fixes: 43e056522f ("target/hppa: Implement B,GATE insn")


Certainly not.  That predates hppa64 support by years.


r~



Re: [PATCH 3/3] target/hppa: add: fix trap on overflow for narrow mode

2024-03-21 Thread Richard Henderson

On 3/21/24 08:42, Sven Schnelle wrote:

Signed-off-by: Sven Schnelle 
---
  target/hppa/translate.c | 3 +++
  1 file changed, 3 insertions(+)

diff --git a/target/hppa/translate.c b/target/hppa/translate.c
index 4d2b96f876..74a9ea0cd8 100644
--- a/target/hppa/translate.c
+++ b/target/hppa/translate.c
@@ -1122,6 +1122,9 @@ static void do_add(DisasContext *ctx, unsigned rt, 
TCGv_i64 in1,
  if (is_tsv || cond_need_sv(c)) {
  sv = do_add_sv(ctx, dest, in1, in2);
  if (is_tsv) {
+if (!d) {
+tcg_gen_ext32s_i64(sv, sv);
+}
  /* ??? Need to include overflow from shift.  */
  gen_helper_tsv(tcg_env, sv);
  }


Reviewed-by: Richard Henderson 

r~



Re: [PATCH 2/3] target/hppa: sub: fix trap on overflow for narrow mode

2024-03-21 Thread Richard Henderson

On 3/21/24 08:42, Sven Schnelle wrote:

Signed-off-by: Sven Schnelle 
---
  target/hppa/translate.c | 3 +++
  1 file changed, 3 insertions(+)

diff --git a/target/hppa/translate.c b/target/hppa/translate.c
index f493e207e1..4d2b96f876 100644
--- a/target/hppa/translate.c
+++ b/target/hppa/translate.c
@@ -1213,6 +1213,9 @@ static void do_sub(DisasContext *ctx, unsigned rt, 
TCGv_i64 in1,
  if (is_tsv || cond_need_sv(c)) {
  sv = do_sub_sv(ctx, dest, in1, in2);
  if (is_tsv) {
+if (!d) {
+tcg_gen_ext32s_i64(sv, sv);
+}
  gen_helper_tsv(tcg_env, sv);
  }
  }


Difficult to pinpoint exactly which patch should have added this.  :-)

Reviewed-by: Richard Henderson 

r~



Re: [RFC 0/8] virtio,vhost: Add VIRTIO_F_IN_ORDER support

2024-03-21 Thread Dongli Zhang
Hi Jonah,

Would you mind helping explain how VIRTIO_F_IN_ORDER improves
performance?

https://lore.kernel.org/all/20240321155717.1392787-1-jonah.pal...@oracle.com/#t

I tried to look for it from prior discussions but could not find why.

https://lore.kernel.org/all/byapr18mb2791df7e6c0f61e2d8698e8fa0...@byapr18mb2791.namprd18.prod.outlook.com/

Thank you very much!

Dongli Zhang

On 3/21/24 08:57, Jonah Palmer wrote:
> The goal of these patches is to add support to a variety of virtio and
> vhost devices for the VIRTIO_F_IN_ORDER transport feature. This feature
> indicates that all buffers are used by the device in the same order in
> which they were made available by the driver.
> 
> These patches attempt to implement a generalized, non-device-specific
> solution to support this feature.
> 
> The core feature behind this solution is a buffer mechanism in the form
> of GLib's GHashTable. The decision behind using a hash table was to
> leverage their ability for quick lookup, insertion, and removal
> operations. Given that our keys are simply numbers of an ordered
> sequence, a hash table seemed like the best choice for a buffer
> mechanism.
> 
> -
> 
> The strategy behind this implementation is as follows:
> 
> We know that buffers that are popped from the available ring and enqueued
> for further processing will always be done in the same order in which they
> were made available by the driver. Given this, we can note their order
> by assigning the resulting VirtQueueElement a key. This key is a number
> in a sequence that represents the order in which they were popped from
> the available ring, relative to the other VirtQueueElements.
> 
> For example, given 3 "elements" that were popped from the available
> ring, we assign a key value to them which represents their order (elem0
> is popped first, then elem1, then lastly elem2):
> 
>  elem2   --  elem1   --  elem0   ---> Enqueue for processing
> (key: 2)(key: 1)(key: 0)
> 
> Then these elements are enqueued for further processing by the host.
> 
> While most devices will return these completed elements in the same
> order in which they were enqueued, some devices may not (e.g.
> virtio-blk). To guarantee that these elements are put on the used ring
> in the same order in which they were enqueued, we can use a buffering
> mechanism that keeps track of the next expected sequence number of an
> element.
> 
> In other words, if the completed element does not have a key value that
> matches the next expected sequence number, then we know this element is
> not in-order and we must stash it away in a hash table until an order
> can be made. The element's key value is used as the key for placing it
> in the hash table.
> 
> If the completed element has a key value that matches the next expected
> sequence number, then we know this element is in-order and we can push
> it on the used ring. Then we increment the next expected sequence number
> and check if the hash table contains an element at this key location.
> 
> If so, we retrieve this element, push it to the used ring, delete the
> key-value pair from the hash table, increment the next expected sequence
> number, and check the hash table again for an element at this new key
> location. This process is repeated until we're unable to find an element
> in the hash table to continue the order.
> 
> So, for example, say the 3 elements we enqueued were completed in the
> following order: elem1, elem2, elem0. The next expected sequence number
> is 0:
> 
> exp-seq-num = 0:
> 
>  elem1   --> elem1.key == exp-seq-num ? --> No, stash it
> (key: 1) |
>  |
>  v
>
>|key: 1 - elem1|
>
> -
> exp-seq-num = 0:
> 
>  elem2   --> elem2.key == exp-seq-num ? --> No, stash it
> (key: 2) |
>  |
>  v
>
>|key: 1 - elem1|
>|--|
>|key: 2 - elem2|
>
> -
> exp-seq-num = 0:
> 
>  elem0   --> elem0.key == exp-seq-num ? --> Yes, push to used ring
> (key: 0)
> 
> exp-seq-num = 1:
> 
> lookup(table, exp-seq-num) != NULL ? --> Yes, push to used ring,
>  remove elem from table
>  |
> 

Re: [PATCH 2/2] target/hppa: Fix B,GATE for wide mode

2024-03-21 Thread Philippe Mathieu-Daudé

On 21/3/24 20:28, Richard Henderson wrote:

Do not clobber the high bits of the address by using a 32-bit deposit.

Signed-off-by: Richard Henderson 
---
  target/hppa/translate.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/hppa/translate.c b/target/hppa/translate.c
index 1766a63001..f875d76a23 100644
--- a/target/hppa/translate.c
+++ b/target/hppa/translate.c
@@ -3880,7 +3880,7 @@ static bool trans_b_gate(DisasContext *ctx, arg_b_gate *a)
  }
  /* No change for non-gateway pages or for priv decrease.  */
  if (type >= 4 && type - 4 < ctx->privilege) {
-dest = deposit32(dest, 0, 2, type - 4);
+dest = deposit64(dest, 0, 2, type - 4);
  }
  } else {
  dest &= -4;  /* priv = 0 */


Fixes: 43e056522f ("target/hppa: Implement B,GATE insn")
Reviewed-by: Philippe Mathieu-Daudé 




Re: qemu fuzz crash in virtio_net_queue_reset()

2024-03-21 Thread Alexander Bulekov
On 240321 2208, Vladimir Sementsov-Ogievskiy wrote:
> On 21.03.24 18:01, Alexander Bulekov wrote:
> > On 240320 0024, Vladimir Sementsov-Ogievskiy wrote:
> > > Hi all!
> > > 
> > >  From fuzzing I've got a fuzz-data, which produces the following crash:
> > > 
> > > qemu-fuzz-x86_64: ../hw/net/virtio-net.c:134: void 
> > > flush_or_purge_queued_packets(NetClientState *): Assertion 
> > > `!virtio_net_get_subqueue(nc)->async_tx.elem' failed.
> > > ==2172308== ERROR: libFuzzer: deadly signal
> > >  #0 0x5bd8c748b5a1 in __sanitizer_print_stack_trace 
> > > (/home/vsementsov/work/src/qemu/yc7-fuzz/build/qemu-fuzz-x86_64+0x26f05a1)
> > >  (BuildId: b41827f440fd9feaa98c667dbdcc961abb2799ae)
> > >  #1 0x5bd8c73fde38 in fuzzer::PrintStackTrace() 
> > > (/home/vsementsov/work/src/qemu/yc7-fuzz/build/qemu-fuzz-x86_64+0x2662e38)
> > >  (BuildId: b41827f440fd9feaa98c667dbdcc961abb2799ae)
> > >  #2 0x5bd8c73e38b3 in fuzzer::Fuzzer::CrashCallback() 
 > > > (/home/vsementsov/work/src/qemu/yc7-fuzz/build/qemu-fuzz-x86_64+0x26488b3)
> > >  (BuildId: b41827f440fd9feaa98c667dbdcc961abb2799ae)
> > >  #3 0x739eec84251f  (/lib/x86_64-linux-gnu/libc.so.6+0x4251f) 
> > > (BuildId: c289da5071a3399de893d2af81d6a30c62646e1e)
> > >  #4 0x739eec8969fb in __pthread_kill_implementation 
> > > nptl/./nptl/pthread_kill.c:43:17
> > >  #5 0x739eec8969fb in __pthread_kill_internal 
> > > nptl/./nptl/pthread_kill.c:78:10
> > >  #6 0x739eec8969fb in pthread_kill nptl/./nptl/pthread_kill.c:89:10
> > >  #7 0x739eec842475 in gsignal signal/../sysdeps/posix/raise.c:26:13
> > >  #8 0x739eec8287f2 in abort stdlib/./stdlib/abort.c:79:7
> > >  #9 0x739eec82871a in __assert_fail_base assert/./assert/assert.c:92:3
> > >  #10 0x739eec839e95 in __assert_fail assert/./assert/assert.c:101:3
> > >  #11 0x5bd8c995d9e2 in flush_or_purge_queued_packets 
> > > /home/vsementsov/work/src/qemu/yc7-fuzz/build/../hw/net/virtio-net.c:134:5
> > >  #12 0x5bd8c9918a5f in virtio_net_queue_reset 
> > > /home/vsementsov/work/src/qemu/yc7-fuzz/build/../hw/net/virtio-net.c:563:5
> > >  #13 0x5bd8c9b724e5 in virtio_queue_reset 
> > > /home/vsementsov/work/src/qemu/yc7-fuzz/build/../hw/virtio/virtio.c:2492:9
> > >  #14 0x5bd8c8bcfb7c in virtio_pci_common_write 
> > > /home/vsementsov/work/src/qemu/yc7-fuzz/build/../hw/virtio/virtio-pci.c:1372:13
> > >  #15 0x5bd8c9e19cf3 in memory_region_write_accessor 
> > > /home/vsementsov/work/src/qemu/yc7-fuzz/build/../softmmu/memory.c:492:5
> > >  #16 0x5bd8c9e19631 in access_with_adjusted_size 
> > > /home/vsementsov/work/src/qemu/yc7-fuzz/build/../softmmu/memory.c:554:18
> > >  #17 0x5bd8c9e17f3c in memory_region_dispatch_write 
> > > /home/vsementsov/work/src/qemu/yc7-fuzz/build/../softmmu/memory.c:1514:16
> > >  #18 0x5bd8c9ea3bbe in flatview_write_continue 
> > > /home/vsementsov/work/src/qemu/yc7-fuzz/build/../softmmu/physmem.c:2825:23
> > >  #19 0x5bd8c9e91aab in flatview_write 
> > > /home/vsementsov/work/src/qemu/yc7-fuzz/build/../softmmu/physmem.c:2867:12
> > >  #20 0x5bd8c9e91568 in address_space_write 
> > > /home/vsementsov/work/src/qemu/yc7-fuzz/build/../softmmu/physmem.c:2963:18
> > >  #21 0x5bd8c74c8a90 in __wrap_qtest_writeq 
> > > /home/vsementsov/work/src/qemu/yc7-fuzz/build/../tests/qtest/fuzz/qtest_wrappers.c:187:9
> > >  #22 0x5bd8c74dc4da in op_write 
> > > /home/vsementsov/work/src/qemu/yc7-fuzz/build/../tests/qtest/fuzz/generic_fuzz.c:487:13
> > >  #23 0x5bd8c74d942e in generic_fuzz 
> > > /home/vsementsov/work/src/qemu/yc7-fuzz/build/../tests/qtest/fuzz/generic_fuzz.c:714:17
> > >  #24 0x5bd8c74c016e in LLVMFuzzerTestOneInput 
> > > /home/vsementsov/work/src/qemu/yc7-fuzz/build/../tests/qtest/fuzz/fuzz.c:152:5
> > >  #25 0x5bd8c73e4e43 in fuzzer::Fuzzer::ExecuteCallback(unsigned char 
> > > const*, unsigned long) 
> > > (/home/vsementsov/work/src/qemu/yc7-fuzz/build/qemu-fuzz-x86_64+0x2649e43)
> > >  (BuildId: b41827f440fd9feaa98c667dbdcc961abb2799ae)
> > >  #26 0x5bd8c73cebbf in fuzzer::RunOneTest(fuzzer::Fuzzer*, char 
> > > const*, unsigned long) 
> > > (/home/vsementsov/work/src/qemu/yc7-fuzz/build/qemu-fuzz-x86_64+0x2633bbf)
> > >  (BuildId: b41827f440fd9feaa98c667dbdcc961abb2799ae)
> > >  #27 0x5bd8c73d4916 in fuzzer::FuzzerDriver(int*, char***, int 
> > > (*)(unsigned char const*, unsigned long)) 
> > > (/home/vsementsov/work/src/qemu/yc7-fuzz/build/qemu-fuzz-x86_64+0x2639916)
> > >  (BuildId: b41827f440fd9feaa98c667dbdcc961abb2799ae)
> > >  #28 0x5bd8c73fe732 in main 
> > > (/home/vsementsov/work/src/qemu/yc7-fuzz/build/qemu-fuzz-x86_64+0x2663732)
> > >  (BuildId: b41827f440fd9feaa98c667dbdcc961abb2799ae)
> > >  #29 0x739eec829d8f in __libc_start_call_main 
> > > csu/../sysdeps/nptl/libc_start_call_main.h:58:16
> > >  #30 0x739eec829e3f in __libc_start_main csu/../csu/libc-start.c:392:3
> > >  #31 0x5bd8c73c9484 in _start 
> > > 

[PATCH 2/2] target/hppa: Fix B,GATE for wide mode

2024-03-21 Thread Richard Henderson
Do not clobber the high bits of the address by using a 32-bit deposit.

Signed-off-by: Richard Henderson 
---
 target/hppa/translate.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/hppa/translate.c b/target/hppa/translate.c
index 1766a63001..f875d76a23 100644
--- a/target/hppa/translate.c
+++ b/target/hppa/translate.c
@@ -3880,7 +3880,7 @@ static bool trans_b_gate(DisasContext *ctx, arg_b_gate *a)
 }
 /* No change for non-gateway pages or for priv decrease.  */
 if (type >= 4 && type - 4 < ctx->privilege) {
-dest = deposit32(dest, 0, 2, type - 4);
+dest = deposit64(dest, 0, 2, type - 4);
 }
 } else {
 dest &= -4;  /* priv = 0 */
-- 
2.34.1




[PATCH 1/2] target/hppa: Fix BE,L set of sr0

2024-03-21 Thread Richard Henderson
The return address comes from IA*Q_Next, and IASQ_Next
is always equal to IASQ_Back, not IASQ_Front.

Signed-off-by: Richard Henderson 
---
 target/hppa/translate.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/hppa/translate.c b/target/hppa/translate.c
index 19594f917e..1766a63001 100644
--- a/target/hppa/translate.c
+++ b/target/hppa/translate.c
@@ -3817,7 +3817,7 @@ static bool trans_be(DisasContext *ctx, arg_be *a)
 load_spr(ctx, new_spc, a->sp);
 if (a->l) {
 copy_iaoq_entry(ctx, cpu_gr[31], ctx->iaoq_n, ctx->iaoq_n_var);
-tcg_gen_mov_i64(cpu_sr[0], cpu_iasq_f);
+tcg_gen_mov_i64(cpu_sr[0], cpu_iasq_b);
 }
 if (a->n && use_nullify_skip(ctx)) {
 copy_iaoq_entry(ctx, cpu_iaoq_f, -1, tmp);
-- 
2.34.1




[PATCH for-9.0 0/2] target/hppa: two more simple fixes

2024-03-21 Thread Richard Henderson
Using the correct space for BE,L linkage might make the difference
for a cpu stress test.  I believe triggering this would require
something like

f=(seg,ofs), b=(seg,ofs+4)
be  0(sr1,r1)   f=(seg,ofs+4), b=(sr1,r1)
be,l,n  0(sr2,r2),sr0,r31   f=(sr1,r1), b=(sr2,r2)

and then validating the contents of sr0 on return.

Linux only places B,GATE in the zero page, so amusingly there were
never any high bits to clobber.  But I can imagine HP-UX making
more use of gateway pages, and certainly a cpu stress test would.


r~


Richard Henderson (2):
  target/hppa: Fix BE,L set of sr0
  target/hppa: Fix B,GATE for wide mode

 target/hppa/translate.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

-- 
2.34.1




Re: qemu fuzz crash in virtio_net_queue_reset()

2024-03-21 Thread Vladimir Sementsov-Ogievskiy

On 21.03.24 18:01, Alexander Bulekov wrote:

On 240320 0024, Vladimir Sementsov-Ogievskiy wrote:

Hi all!

 From fuzzing I've got a fuzz-data, which produces the following crash:

qemu-fuzz-x86_64: ../hw/net/virtio-net.c:134: void 
flush_or_purge_queued_packets(NetClientState *): Assertion 
`!virtio_net_get_subqueue(nc)->async_tx.elem' failed.
==2172308== ERROR: libFuzzer: deadly signal
 #0 0x5bd8c748b5a1 in __sanitizer_print_stack_trace 
(/home/vsementsov/work/src/qemu/yc7-fuzz/build/qemu-fuzz-x86_64+0x26f05a1) 
(BuildId: b41827f440fd9feaa98c667dbdcc961abb2799ae)
 #1 0x5bd8c73fde38 in fuzzer::PrintStackTrace() 
(/home/vsementsov/work/src/qemu/yc7-fuzz/build/qemu-fuzz-x86_64+0x2662e38) 
(BuildId: b41827f440fd9feaa98c667dbdcc961abb2799ae)
 #2 0x5bd8c73e38b3 in fuzzer::Fuzzer::CrashCallback() 
(/home/vsementsov/work/src/qemu/yc7-fuzz/build/qemu-fuzz-x86_64+0x26488b3) 
(BuildId: b41827f440fd9feaa98c667dbdcc961abb2799ae)
 #3 0x739eec84251f  (/lib/x86_64-linux-gnu/libc.so.6+0x4251f) (BuildId: 
c289da5071a3399de893d2af81d6a30c62646e1e)
 #4 0x739eec8969fb in __pthread_kill_implementation 
nptl/./nptl/pthread_kill.c:43:17
 #5 0x739eec8969fb in __pthread_kill_internal 
nptl/./nptl/pthread_kill.c:78:10
 #6 0x739eec8969fb in pthread_kill nptl/./nptl/pthread_kill.c:89:10
 #7 0x739eec842475 in gsignal signal/../sysdeps/posix/raise.c:26:13
 #8 0x739eec8287f2 in abort stdlib/./stdlib/abort.c:79:7
 #9 0x739eec82871a in __assert_fail_base assert/./assert/assert.c:92:3
 #10 0x739eec839e95 in __assert_fail assert/./assert/assert.c:101:3
 #11 0x5bd8c995d9e2 in flush_or_purge_queued_packets 
/home/vsementsov/work/src/qemu/yc7-fuzz/build/../hw/net/virtio-net.c:134:5
 #12 0x5bd8c9918a5f in virtio_net_queue_reset 
/home/vsementsov/work/src/qemu/yc7-fuzz/build/../hw/net/virtio-net.c:563:5
 #13 0x5bd8c9b724e5 in virtio_queue_reset 
/home/vsementsov/work/src/qemu/yc7-fuzz/build/../hw/virtio/virtio.c:2492:9
 #14 0x5bd8c8bcfb7c in virtio_pci_common_write 
/home/vsementsov/work/src/qemu/yc7-fuzz/build/../hw/virtio/virtio-pci.c:1372:13
 #15 0x5bd8c9e19cf3 in memory_region_write_accessor 
/home/vsementsov/work/src/qemu/yc7-fuzz/build/../softmmu/memory.c:492:5
 #16 0x5bd8c9e19631 in access_with_adjusted_size 
/home/vsementsov/work/src/qemu/yc7-fuzz/build/../softmmu/memory.c:554:18
 #17 0x5bd8c9e17f3c in memory_region_dispatch_write 
/home/vsementsov/work/src/qemu/yc7-fuzz/build/../softmmu/memory.c:1514:16
 #18 0x5bd8c9ea3bbe in flatview_write_continue 
/home/vsementsov/work/src/qemu/yc7-fuzz/build/../softmmu/physmem.c:2825:23
 #19 0x5bd8c9e91aab in flatview_write 
/home/vsementsov/work/src/qemu/yc7-fuzz/build/../softmmu/physmem.c:2867:12
 #20 0x5bd8c9e91568 in address_space_write 
/home/vsementsov/work/src/qemu/yc7-fuzz/build/../softmmu/physmem.c:2963:18
 #21 0x5bd8c74c8a90 in __wrap_qtest_writeq 
/home/vsementsov/work/src/qemu/yc7-fuzz/build/../tests/qtest/fuzz/qtest_wrappers.c:187:9
 #22 0x5bd8c74dc4da in op_write 
/home/vsementsov/work/src/qemu/yc7-fuzz/build/../tests/qtest/fuzz/generic_fuzz.c:487:13
 #23 0x5bd8c74d942e in generic_fuzz 
/home/vsementsov/work/src/qemu/yc7-fuzz/build/../tests/qtest/fuzz/generic_fuzz.c:714:17
 #24 0x5bd8c74c016e in LLVMFuzzerTestOneInput 
/home/vsementsov/work/src/qemu/yc7-fuzz/build/../tests/qtest/fuzz/fuzz.c:152:5
 #25 0x5bd8c73e4e43 in fuzzer::Fuzzer::ExecuteCallback(unsigned char 
const*, unsigned long) 
(/home/vsementsov/work/src/qemu/yc7-fuzz/build/qemu-fuzz-x86_64+0x2649e43) 
(BuildId: b41827f440fd9feaa98c667dbdcc961abb2799ae)
 #26 0x5bd8c73cebbf in fuzzer::RunOneTest(fuzzer::Fuzzer*, char const*, 
unsigned long) 
(/home/vsementsov/work/src/qemu/yc7-fuzz/build/qemu-fuzz-x86_64+0x2633bbf) 
(BuildId: b41827f440fd9feaa98c667dbdcc961abb2799ae)
 #27 0x5bd8c73d4916 in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned 
char const*, unsigned long)) 
(/home/vsementsov/work/src/qemu/yc7-fuzz/build/qemu-fuzz-x86_64+0x2639916) 
(BuildId: b41827f440fd9feaa98c667dbdcc961abb2799ae)
 #28 0x5bd8c73fe732 in main 
(/home/vsementsov/work/src/qemu/yc7-fuzz/build/qemu-fuzz-x86_64+0x2663732) 
(BuildId: b41827f440fd9feaa98c667dbdcc961abb2799ae)
 #29 0x739eec829d8f in __libc_start_call_main 
csu/../sysdeps/nptl/libc_start_call_main.h:58:16
 #30 0x739eec829e3f in __libc_start_main csu/../csu/libc-start.c:392:3
 #31 0x5bd8c73c9484 in _start 
(/home/vsementsov/work/src/qemu/yc7-fuzz/build/qemu-fuzz-x86_64+0x262e484) 
(BuildId: b41827f440fd9feaa98c667dbdcc961abb2799ae)





Hello Vladimir,
This looks like a similar crash.
https://gitlab.com/qemu-project/qemu/-/issues/1451

That issue has a qtest reproducer that does not require a fuzzer to
reproduce.


Right, looks very similar, thanks! 1 year ago and still no news.. It's not 
encouraging.



The fuzzer should run fine under gdb. e.g.
gdb ./qemu-fuzz-i386
r  --fuzz-target=generic-fuzz-virtio-net-pci-slirp 

[PATCH 3/3] target/hppa: add: fix trap on overflow for narrow mode

2024-03-21 Thread Sven Schnelle
Signed-off-by: Sven Schnelle 
---
 target/hppa/translate.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/target/hppa/translate.c b/target/hppa/translate.c
index 4d2b96f876..74a9ea0cd8 100644
--- a/target/hppa/translate.c
+++ b/target/hppa/translate.c
@@ -1122,6 +1122,9 @@ static void do_add(DisasContext *ctx, unsigned rt, 
TCGv_i64 in1,
 if (is_tsv || cond_need_sv(c)) {
 sv = do_add_sv(ctx, dest, in1, in2);
 if (is_tsv) {
+if (!d) {
+tcg_gen_ext32s_i64(sv, sv);
+}
 /* ??? Need to include overflow from shift.  */
 gen_helper_tsv(tcg_env, sv);
 }
-- 
2.43.2




[PATCH 1/3] target/hppa: add unit conditions for wide mode

2024-03-21 Thread Sven Schnelle
Wide mode provides two more conditions, add them.

Signed-off-by: Sven Schnelle 
---
 target/hppa/translate.c | 25 +++--
 1 file changed, 23 insertions(+), 2 deletions(-)

diff --git a/target/hppa/translate.c b/target/hppa/translate.c
index 8a87996fc1..f493e207e1 100644
--- a/target/hppa/translate.c
+++ b/target/hppa/translate.c
@@ -963,11 +963,22 @@ static DisasCond do_unit_cond(unsigned cf, bool d, 
TCGv_i64 res,
 
 switch (cf >> 1) {
 case 0: /* never / TR */
-case 1: /* undefined */
-case 5: /* undefined */
 cond = cond_make_f();
 break;
 
+case 1:
+if (d) {
+tmp = tcg_temp_new_i64();
+tcg_gen_subi_i64(tmp, res, d_repl * 0x0001u);
+tcg_gen_andc_i64(tmp, tmp, res);
+tcg_gen_andi_i64(tmp, tmp, d_repl * 0x8000u);
+cond = cond_make_0(TCG_COND_NE, tmp);
+} else {
+/* undefined */
+cond = cond_make_f();
+}
+break;
+
 case 2: /* SBZ / NBZ */
 /* See hasless(v,1) from
  * https://graphics.stanford.edu/~seander/bithacks.html#ZeroInWord
@@ -992,6 +1003,16 @@ static DisasCond do_unit_cond(unsigned cf, bool d, 
TCGv_i64 res,
 cond = cond_make_0(TCG_COND_NE, cb);
 break;
 
+case 5:
+if (d) {
+tcg_gen_andi_i64(cb, cb, d_repl * 0x8000u);
+cond = cond_make_0(TCG_COND_NE, cb);
+} else {
+/* undefined */
+cond = cond_make_f();
+}
+break;
+
 case 6: /* SBC / NBC */
 tcg_gen_andi_i64(cb, cb, d_repl * 0x80808080u);
 cond = cond_make_0(TCG_COND_NE, cb);
-- 
2.43.2




[PATCH 2/3] target/hppa: sub: fix trap on overflow for narrow mode

2024-03-21 Thread Sven Schnelle
Signed-off-by: Sven Schnelle 
---
 target/hppa/translate.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/target/hppa/translate.c b/target/hppa/translate.c
index f493e207e1..4d2b96f876 100644
--- a/target/hppa/translate.c
+++ b/target/hppa/translate.c
@@ -1213,6 +1213,9 @@ static void do_sub(DisasContext *ctx, unsigned rt, 
TCGv_i64 in1,
 if (is_tsv || cond_need_sv(c)) {
 sv = do_sub_sv(ctx, dest, in1, in2);
 if (is_tsv) {
+if (!d) {
+tcg_gen_ext32s_i64(sv, sv);
+}
 gen_helper_tsv(tcg_env, sv);
 }
 }
-- 
2.43.2




Re: TCG change broke MorphOS boot on sam460ex

2024-03-21 Thread BALATON Zoltan

On 27/2/24 17:47, BALATON Zoltan wrote:

Hello,

Commit 18a536f1f8 (accel/tcg: Always require can_do_io) broke booting 
MorphOS on sam460ex (this was before 8.2.0 and I thought I've verified 
it before that release but apparently missed it back then). It can be 
reproduced with https://www.morphos-team.net/morphos-3.18.iso and 
following command:


qemu-system-ppc -M sam460ex -serial stdio -d unimp,guest_errors \
   -drive if=none,id=cd,format=raw,file=morphos-3.18.iso \
   -device ide-cd,drive=cd,bus=ide.1


Although it breaks at the TCG change, it may also be related to tlbwe 
changes somehow but I don't really understand it. I've tried to get some 
more debug info in case somebody can tell what's happening. With 
18a536f1f8^ (the commit before the one it broke at and still works) I get:



IN:
ppcemb_tlb_check: TLB 0 address 00c01000 PID 0 <=> f000 f000 0 7f
mmubooke_check_tlb: TLB entry not found
ppcemb_tlb_check: TLB 1 address 00c01000 PID 0 <=> d000 f000 0 3b
mmubooke_check_tlb: TLB entry not found
ppcemb_tlb_check: TLB 2 address 00c01000 PID 0 <=> 8000 f000 0 3b
mmubooke_check_tlb: TLB entry not found
ppcemb_tlb_check: TLB 3 address 00c01000 PID 0 <=> 9000 f000 0 3b
mmubooke_check_tlb: TLB entry not found
ppcemb_tlb_check: TLB 4 address 00c01000 PID 0 <=> a000 f000 0 3b
mmubooke_check_tlb: TLB entry not found
ppcemb_tlb_check: TLB 5 address 00c01000 PID 0 <=> b000 f000 0 3b
mmubooke_check_tlb: TLB entry not found
ppcemb_tlb_check: TLB 6 address 00c01000 PID 0 <=> c000 f000 0 3b
mmubooke_check_tlb: TLB entry not found
ppcemb_tlb_check: TLB 7 address 00c01000 PID 0 <=> e000 ff00 0 3b
mmubooke_check_tlb: TLB entry not found
ppcemb_tlb_check: TLB 8 address 00c01000 PID 0 <=> e100 ff00 0 3b
mmubooke_check_tlb: TLB entry not found
ppcemb_tlb_check: TLB 9 address 00c01000 PID 0 <=> e300 fc00 0 3b
mmubooke_check_tlb: TLB entry not found
ppcemb_tlb_check: TLB 10 address 00c01000 PID 0 <=> e3001000 fc00 0 3b
mmubooke_check_tlb: TLB entry not found
ppcemb_tlb_check: TLB 11 address 00c01000 PID 0 <=> e400 c000 0 3b
mmubooke_check_tlb: TLB entry not found
ppcemb_tlb_check: TLB 12 address 00c01000 PID 0 <=> e500 fff0 0 7f
mmubooke_check_tlb: TLB entry not found
ppcemb_tlb_check: TLB 13 address 00c01000 PID 0 <=> ef00 ff00 0 7f
mmubooke_check_tlb: TLB entry not found
ppcemb_tlb_check: TLB 14 address 00c01000 PID 0 <=> e200 fff0 0 7f
mmubooke_check_tlb: TLB entry not found
ppcemb_tlb_check: TLB 15 address 00c01000 PID 0 <=>  f000 0 7f
mmubooke_check_tlb: good TLB!
mmubooke_get_physical_address: access granted 00c01000 => 00c01000 7 0
0x00c01354:  38c00040  li   r6, 0x40
0x00c01358:  38e10204  addi r7, r1, 0x204
0x00c0135c:  39010104  addi r8, r1, 0x104
0x00c01360:  39410004  addi r10, r1, 4
0x00c01364:  3920  li   r9, 0
0x00c01368:  7cc903a6  mtctrr6
0x00c0136c:  84c70004  lwzu r6, 4(r7)
0x00c01370:  7cc907a4  tlbwehi  r6, r9
0x00c01374:  84c80004  lwzu r6, 4(r8)
0x00c01378:  7cc90fa4  tlbwelo  r6, r9
0x00c0137c:  84ca0004  lwzu r6, 4(r10)
0x00c01380:  7cc917a4  tlbwehi  r6, r9
0x00c01384:  39290001  addi r9, r9, 1
0x00c01388:  4200ffe4  bdnz 0xc0136c

helper_440_tlbwe word 0 entry 0 value 0290
ppcemb_tlb_check: TLB 0 address 0df6bfb0 PID 0 <=>  f000 0 7f
mmubooke_check_tlb: good TLB!
mmubooke_get_physical_address: access granted 0df6bfb0 => 0004fdf6bfb0 7 0
Invalid read at addr 0x4FDF6BFB0, size 4, region '(null)', reason: rejected
helper_440_tlbwe word 1 entry 0 value 
ppcemb_tlb_check: TLB 0 address 0df6beb0 PID 0 <=>  f000 0 7f
mmubooke_check_tlb: good TLB!
mmubooke_get_physical_address: access granted 0df6beb0 => 0df6beb0 7 0
helper_440_tlbwe word 2 entry 0 value 003f
ppcemb_tlb_check: TLB 0 address 00c0136c PID 0 <=>  f000 0 7f
mmubooke_check_tlb: good TLB!
mmubooke_get_physical_address: access granted 00c0136c => 00c0136c 7 0


and with commit 18a536f1f8 this changes to


IN:
ppcemb_tlb_check: TLB 0 address 00c01000 PID 0 <=> f000 f000 0 7f
mmubooke_check_tlb: TLB entry not found
ppcemb_tlb_check: TLB 1 address 00c01000 PID 0 <=> d000 f000 0 3b
mmubooke_check_tlb: TLB entry not found
ppcemb_tlb_check: TLB 2 address 00c01000 PID 0 <=> 8000 f000 0 3b
mmubooke_check_tlb: TLB entry not found
ppcemb_tlb_check: TLB 3 address 00c01000 PID 0 <=> 9000 f000 0 3b
mmubooke_check_tlb: TLB entry not found
ppcemb_tlb_check: TLB 4 address 00c01000 PID 0 <=> a000 f000 0 3b
mmubooke_check_tlb: TLB entry not found
ppcemb_tlb_check: TLB 5 address 00c01000 PID 0 <=> b000 f000 0 3b
mmubooke_check_tlb: TLB entry not found
ppcemb_tlb_check: TLB 6 address 00c01000 PID 0 <=> c000 f000 0 3b
mmubooke_check_tlb: TLB entry not found
ppcemb_tlb_check: TLB 7 address 00c01000 PID 0 <=> 

Re: [PULL 0/9] target/hppa fixes for 9.0

2024-03-21 Thread Helge Deller

On 3/21/24 19:25, Sven Schnelle wrote:

Michael Tokarev  writes:


20.03.2024 03:32, Richard Henderson :


Richard Henderson (3):
target/hppa: Fix assemble_16 insns for wide mode
target/hppa: Fix assemble_11a insns for wide mode
target/hppa: Fix assemble_12a insns for wide mode
Sven Schnelle (6):
target/hppa: ldcw,s uses static shift of 3
target/hppa: fix shrp for wide mode
target/hppa: fix access_id check
target/hppa: exit tb on flush cache instructions
target/hppa: mask privilege bits in mfia
target/hppa: fix do_stdby_e()


Is it all -stable material (when appropriate)?


I'd say yes.


Yes.

Helge



Re: [RFC PATCH v9 06/23] target/arm: Add support for Non-maskable Interrupt

2024-03-21 Thread Peter Maydell
On Thu, 21 Mar 2024 at 15:46, Peter Maydell  wrote:
> Something somewhere needs to implement "if SCTLR_ELx.NMI is 0 then
> we don't take EXCP_VINMI etc but instead (maybe) EXCP_VIRQ etc".
> At the moment nothing does that:
>  * arm_cpu_update_vinmi() doesn't look at the NMI bit before
>deciding whether to set CPU_INTERRUPT_VINMI
>  * in arm_excp_unmasked() if NMI is 0 then allIntMask takes its
>default value of false and so arm_excp_unmasked() returns true,
>so VINMI is not masked
>  * arm_cpu_exec_interrupt() doesn't look at the NMI bit before
>deciding whether to check the CPU_INTERRUPT_VINMI bit in interrupt_request
>
> So even if SCTLR_ELx.NMI is 0 we'll still try to take a VINMI
> if it's set up in the HCR_EL2 bits.
>
> However we do this the required behaviour is that if NMI is 0
> then it is as if the interrupt doesn't have superpriority and
> it falls back to being handled as an ordinary IRQ, VIRQ, VFIQ etc.
> I think the best place to do this is probably here in
> arm_cpu_exec_interrupt() -- if SCTLR_ELx.NMI isn't set then
> treat the VFNMI bit like VFIQ, the VINMI bit like VIRQ, and
> the NMI bit like IRQ.

Folding in something like this I think will work:

diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index 91c2896de0f..797ae3eb805 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -837,7 +837,8 @@ static bool arm_cpu_exec_interrupt(CPUState *cs,
int interrupt_request)

 /* The prioritization of interrupts is IMPLEMENTATION DEFINED. */

-if (cpu_isar_feature(aa64_nmi, env_archcpu(env))) {
+if (cpu_isar_feature(aa64_nmi, env_archcpu(env)) &&
+(arm_sctlr(env, cur_el) & SCTLR_NMI)) {
 if (interrupt_request & CPU_INTERRUPT_NMI) {
 excp_idx = EXCP_NMI;
 target_el = arm_phys_excp_target_el(cs, excp_idx, cur_el, secure);
@@ -862,7 +863,22 @@ static bool arm_cpu_exec_interrupt(CPUState *cs,
int interrupt_request)
 goto found;
 }
 }
+} else {
+/*
+ * NMI disabled: interrupts with superpriority are handled
+ * as if they didn't have it
+ */
+if (interrupt_request & CPU_INTERRUPT_NMI) {
+interrupt_request |= CPU_INTERRUPT_HARD;
+}
+if (interrupt_request & CPU_INTERRUPT_VINMI) {
+interrupt_request |= CPU_INTERRUPT_VIRQ;
+}
+if (interrupt_request & CPU_INTERRUPT_VFNMI) {
+interrupt_request |= CPU_INTERRUPT_VFIQ;
+}
 }
+
 if (interrupt_request & CPU_INTERRUPT_FIQ) {
 excp_idx = EXCP_FIQ;
 target_el = arm_phys_excp_target_el(cs, excp_idx, cur_el, secure);


> What semantics do we intend for the VINMI/VFNMI bits in interrupt_request
> and for the incoming IRQ, FIQ, NMI lines? The GIC spec suggests
> (but doesn't mandate) that NMI could be signalled by asserting
> both NMI and IRQ, and plain IRQ by asserting just IRQ (table 4-6
> in the GIC spec). I think the GIC changes in this patchset assert
> only the NMI line for an IRQNMI, and not both NMI and IRQ. That's OK
> and I think makes more sense for QEMU than signalling both lines,
> but it's not the same as what we wind up doing with the handling
> of the HCR_EL2 bits in these functions, because you don't change
> the existing arm_cpu_update_virq() so that it only sets the
> CPU_INTERRUPT_VIRQ bit if this is a VIRQ and not a VIRQNMI.
> So if the guest sets HCR_EL2.VI and HCRX_EL2.VINMI then
> arm_cpu_update_virq() will say "this is a VIRQ" and also
> arm_cpu_update_vinmi() will say "This is a VINMI" and so both bits
> get set in the interrupt_request field.
>
> I think the fix for this is probably to have arm_cpu_update_virq()
> and arm_cpu_update_vfiq() check that this is not a VINMI/VFNMI,
> so we only set 1 bit in interrupt_request, not 2.

And for this a change like:

diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index 03a48a41366..91c2896de0f 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -926,7 +926,8 @@ void arm_cpu_update_virq(ARMCPU *cpu)
 CPUARMState *env = &cpu->env;
 CPUState *cs = CPU(cpu);

-bool new_state = (env->cp15.hcr_el2 & HCR_VI) ||
+bool new_state = ((arm_hcr_el2_eff(env) & HCR_VI) &&
+  !(arm_hcrx_el2_eff(env) & HCRX_VINMI)) ||
 (env->irq_line_state & CPU_INTERRUPT_VIRQ);

 if (new_state != ((cs->interrupt_request & CPU_INTERRUPT_VIRQ) != 0)) {
@@ -947,7 +948,8 @@ void arm_cpu_update_vfiq(ARMCPU *cpu)
 CPUARMState *env = &cpu->env;
 CPUState *cs = CPU(cpu);

-bool new_state = (env->cp15.hcr_el2 & HCR_VF) ||
+bool new_state = ((arm_hcr_el2_eff(env) & HCR_VF) &&
+  !(arm_hcrx_el2_eff(env) & HCRX_VFNMI)) ||
 (env->irq_line_state & CPU_INTERRUPT_VFIQ);

 if (new_state != ((cs->interrupt_request & CPU_INTERRUPT_VFIQ) != 0)) {


thanks
-- PMM



Re: [PULL 0/9] target/hppa fixes for 9.0

2024-03-21 Thread Sven Schnelle
Michael Tokarev  writes:

> 20.03.2024 03:32, Richard Henderson :
>
>> Richard Henderson (3):
>>target/hppa: Fix assemble_16 insns for wide mode
>>target/hppa: Fix assemble_11a insns for wide mode
>>target/hppa: Fix assemble_12a insns for wide mode
>> Sven Schnelle (6):
>>target/hppa: ldcw,s uses static shift of 3
>>target/hppa: fix shrp for wide mode
>>target/hppa: fix access_id check
>>target/hppa: exit tb on flush cache instructions
>>target/hppa: mask privilege bits in mfia
>>target/hppa: fix do_stdby_e()
>
> Is it all -stable material (when appropriate)?

I'd say yes.



Re: [RFC PATCH v9 04/23] target/arm: Implement ALLINT MSR (immediate)

2024-03-21 Thread Peter Maydell
On Thu, 21 Mar 2024 at 13:10, Jinjie Ruan  wrote:
>
> Add ALLINT MSR (immediate) to decodetree, in which the CRm is 0b000x. The
> EL0 check is necessary to ALLINT, and the EL1 check is necessary when
> imm == 1. So implement it inline for EL2/3, or EL1 with imm==0. Avoid the
> unconditional write to pc and use raise_exception_ra to unwind.

> +void HELPER(msr_set_allint_el1)(CPUARMState *env)
> +{
> +/* ALLINT update to PSTATE. */
> +if (arm_hcrx_el2_eff(env) & HCRX_TALLINT) {
> +raise_exception_ra(env, EXCP_UDEF,
> +   syn_aa64_sysregtrap(0, 1, 0, 4, 1, 0x1f, 0),
> +   exception_target_el(env), GETPC());
> +}
> +
> +env->pstate |= PSTATE_ALLINT;
> +}

This is a hypervisor trap from EL1 to EL2, so the target_el
argument to raise_exception_ra() should be "2", not
"exception_target_el(env)". Otherwise we will trap to EL1.

thanks
-- PMM



Re: change QARMA3 default for aarch64?

2024-03-21 Thread Peter Maydell
On Thu, 21 Mar 2024 at 17:18, Richard Henderson
 wrote:
>
> On 3/20/24 23:32, Michael Tokarev wrote:
> > Since commit v8.1.0-511-g399e5e7125 "target/arm: Implement FEAT_PACQARMA3",
> > pauth-qarma3 is the default pauth scheme.  However this one is very slow.
>
> That patch only introduced qarma3, it didn't make it the default:
>
>   static Property arm_cpu_pauth_property =
>   DEFINE_PROP_BOOL("pauth", ARMCPU, prop_pauth, true);
>   static Property arm_cpu_pauth_impdef_property =
>   DEFINE_PROP_BOOL("pauth-impdef", ARMCPU, prop_pauth_impdef, false);
> +static Property arm_cpu_pauth_qarma3_property =
> +DEFINE_PROP_BOOL("pauth-qarma3", ARMCPU, prop_pauth_qarma3, false);
>
> Per the first line, default is still qarma5 (which is the slowest, afaik).
>
> I have not done any benchmarking for qarma3 at all, but it still *looks* 
> significantly
> more complex than impdef.
>
> > When people run aarch64 code in qemu tcg, an immediate reaction is like,
> > "this seems to be a bug somewhere", since the code run insanely slower than
> > it was before.
> >
> > And this is very difficult to find as well, - the reason for that slowdown
> > is usually well hidden from an average soul.
> >
> > When the reason is actually discovered, people start changing settings in
> > various tools and configs to work around this issue.  Qemu itself has
> > overrides, pauth-impdef=on, in various tests, to make the test run at
> > saner speed.
> >
> > After seeing how many issues people are having in debian with that, I'm
> > about to switch the default in debian build of qemu, because impdef,
> > while makes certain arm64-specific protection feature less effective,
> > is actually significantly more practical.  I dislike changing the
> > defaults, but this is a situation when it needs to be done, imho.
> >
> > But before doing that, maybe it's better to change qemu default
> > instead?  What do you think?
>
> I think it might be worth having -cpu max default to impdef.

Yes, I think this is probably on net a good idea (it's a migration
compat break but I think that's OK for 'max', right?). But in
IRC discussion it turned out that the perf loss mjt's users are
seeing is between 7.2 and 8.2, whereas we've had pauth be QARMA5
for much longer than that. So it seems like a good idea to track
down exactly what's caused it to appear as a problem now, before
we start changing the default.

thanks
-- PMM



[PULL for-9.0 0/1] Block patches

2024-03-21 Thread Stefan Hajnoczi
The following changes since commit fea445e8fe9acea4f775a832815ee22bdf2b0222:

  Merge tag 'pull-maintainer-final-for-real-this-time-200324-1' of 
https://gitlab.com/stsquad/qemu into staging (2024-03-21 10:31:56 +)

are available in the Git repository at:

  https://gitlab.com/stefanha/qemu.git tags/block-pull-request

for you to fetch changes up to 9352f80cd926fe2dde7c89b93ee33bb0356ff40e:

  coroutine: reserve 5,000 mappings (2024-03-21 13:14:30 -0400)


Pull request

I was too quick in sending the coroutine pool sizing change for -rc0 and still
needed to address feedback from Daniel Berrangé.



Stefan Hajnoczi (1):
  coroutine: reserve 5,000 mappings

 util/qemu-coroutine.c | 15 ++-
 1 file changed, 10 insertions(+), 5 deletions(-)

-- 
2.44.0




[PULL for-9.0 1/1] coroutine: reserve 5,000 mappings

2024-03-21 Thread Stefan Hajnoczi
Daniel P. Berrangé  pointed out that the coroutine
pool size heuristic is very conservative. Instead of halving
max_map_count, he suggested reserving 5,000 mappings for non-coroutine
users based on observations of guests he has access to.

Fixes: 86a637e48104 ("coroutine: cap per-thread local pool size")
Signed-off-by: Stefan Hajnoczi 
Reviewed-by: Daniel P. Berrangé 
Message-id: 20240320181232.1464819-1-stefa...@redhat.com
Signed-off-by: Stefan Hajnoczi 
---
 util/qemu-coroutine.c | 15 ++-
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/util/qemu-coroutine.c b/util/qemu-coroutine.c
index 2790959eaf..eb4eebefdf 100644
--- a/util/qemu-coroutine.c
+++ b/util/qemu-coroutine.c
@@ -377,12 +377,17 @@ static unsigned int get_global_pool_hard_max_size(void)
 NULL) &&
 qemu_strtoi(contents, NULL, 10, &max_map_count) == 0) {
 /*
- * This is a conservative upper bound that avoids exceeding
- * max_map_count. Leave half for non-coroutine users like library
- * dependencies, vhost-user, etc. Each coroutine takes up 2 VMAs so
- * halve the amount again.
+ * This is an upper bound that avoids exceeding max_map_count. Leave a
+ * fixed amount for non-coroutine users like library dependencies,
+ * vhost-user, etc. Each coroutine takes up 2 VMAs so halve the
+ * remaining amount.
  */
-return max_map_count / 4;
+if (max_map_count > 5000) {
+return (max_map_count - 5000) / 2;
+} else {
+/* Disable the global pool but threads still have local pools */
+return 0;
+}
 }
 #endif
 
-- 
2.44.0




Re: change QARMA3 default for aarch64?

2024-03-21 Thread Richard Henderson

On 3/20/24 23:32, Michael Tokarev wrote:

Since commit v8.1.0-511-g399e5e7125 "target/arm: Implement FEAT_PACQARMA3",
pauth-qarma3 is the default pauth scheme.  However this one is very slow.


That patch only introduced qarma3, it didn't make it the default:

 static Property arm_cpu_pauth_property =
 DEFINE_PROP_BOOL("pauth", ARMCPU, prop_pauth, true);
 static Property arm_cpu_pauth_impdef_property =
 DEFINE_PROP_BOOL("pauth-impdef", ARMCPU, prop_pauth_impdef, false);
+static Property arm_cpu_pauth_qarma3_property =
+DEFINE_PROP_BOOL("pauth-qarma3", ARMCPU, prop_pauth_qarma3, false);

Per the first line, default is still qarma5 (which is the slowest, afaik).

I have not done any benchmarking for qarma3 at all, but it still *looks* significantly 
more complex than impdef.



When people run aarch64 code in qemu tcg, an immediate reaction is like,
"this seems to be a bug somewhere", since the code run insanely slower than
it was before.

And this is very difficult to find as well, - the reason for that slowdown
is usually well hidden from an average soul.

When the reason is actually discovered, people start changing settings in
various tools and configs to work around this issue.  Qemu itself has
overrides, pauth-impdef=on, in various tests, to make the test run at
saner speed.

After seeing how many issues people are having in debian with that, I'm
about to switch the default in debian build of qemu, because impdef,
while makes certain arm64-specific protection feature less effective,
is actually significantly more practical.  I dislike changing the
defaults, but this is a situation when it needs to be done, imho.

But before doing that, maybe it's better to change qemu default
instead?  What do you think?


I think it might be worth having -cpu max default to impdef.

Peter?


r~



Re: [PATCH v2] target/i386: Revert monitor_puts() in do_inject_x86_mce()

2024-03-21 Thread Michael Tokarev

20.03.2024 11:36, Tao Su :

monitor_puts() doesn't check the monitor pointer, but do_inject_x86_mce()
may have a parameter with NULL monitor pointer. Revert monitor_puts() in
do_inject_x86_mce() to fix, then the fact that we send the same message to
monitor and log is again more obvious.

Fixes: bf0c50d4aa85 (monitor: expose monitor_puts to rest of code)
Reviewed-by: Xiaoyao Li 
Reviewed-by: Markus Armbruster 
Signed-off-by: Tao Su 


Smells like a -stable material, is it not?

Thanks,

/mjt



Re: [PATCH] misc/pca9554: Fix check of pin range value in property accessors

2024-03-21 Thread Philippe Mathieu-Daudé

On 21/3/24 17:01, Cédric Le Goater wrote:

Coverity detected an "Integer handling" issue with the pin value :

   In expression "state >> pin", right shifting "state" by more than 7
   bits always yields zero.  The shift amount, "pin", is as much as 8.

In practice, this should not happen because the properties "pin8" and
above are not created. Nevertheless, fix the range to avoid this warning.

Fixes: CID 1534917
Fixes: de0c7d543bca ("misc: Add a pca9554 GPIO device model")
Cc: Glenn Miles 
Signed-off-by: Cédric Le Goater 
---
  hw/misc/pca9554.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)


Isn't it the one Peter fixed in
https://lore.kernel.org/qemu-devel/20240312183810.557768-5-peter.mayd...@linaro.org/?



[PATCH] target/riscv: rvv: Remove the dependency of Zvfbfmin to Zfbfmin

2024-03-21 Thread Max Chou
According to the Zvfbfmin definition in the RISC-V BF16 extensions spec,
the Zvfbfmin extension only requires either the V extension or the
Zve32f extension.

Signed-off-by: Max Chou 
---
 target/riscv/tcg/tcg-cpu.c | 5 -
 1 file changed, 5 deletions(-)

diff --git a/target/riscv/tcg/tcg-cpu.c b/target/riscv/tcg/tcg-cpu.c
index 63192ef54f3..b5b95e052d2 100644
--- a/target/riscv/tcg/tcg-cpu.c
+++ b/target/riscv/tcg/tcg-cpu.c
@@ -530,11 +530,6 @@ void riscv_cpu_validate_set_extensions(RISCVCPU *cpu, 
Error **errp)
 return;
 }
 
-if (cpu->cfg.ext_zvfbfmin && !cpu->cfg.ext_zfbfmin) {
-error_setg(errp, "Zvfbfmin extension depends on Zfbfmin extension");
-return;
-}
-
 if (cpu->cfg.ext_zvfbfmin && !cpu->cfg.ext_zve32f) {
 error_setg(errp, "Zvfbfmin extension depends on Zve32f extension");
 return;
-- 
2.34.1




Re: [PATCH] coroutine: reserve 5,000 mappings

2024-03-21 Thread Stefan Hajnoczi
On Wed, Mar 20, 2024 at 02:12:32PM -0400, Stefan Hajnoczi wrote:
> Daniel P. Berrangé  pointed out that the coroutine
> pool size heuristic is very conservative. Instead of halving
> max_map_count, he suggested reserving 5,000 mappings for non-coroutine
> users based on observations of guests he has access to.
> 
> Fixes: 86a637e48104 ("coroutine: cap per-thread local pool size")
> Signed-off-by: Stefan Hajnoczi 
> ---
>  util/qemu-coroutine.c | 15 ++-
>  1 file changed, 10 insertions(+), 5 deletions(-)

Discussed with Kevin and applied to my block tree:
https://gitlab.com/stefanha/qemu/commits/block

Stefan




Re: [PULL 3/3] target/loongarch: Fix qemu-loongarch64 hang when executing 'll.d $t0, $t0, 0'

2024-03-21 Thread Michael Tokarev

20.03.2024 05:40, Song Gao :

On gen_ll, if a->imm is zero, make_address_x return src1,
but the load to destination may clobber src1. We use a new
destination to fix this problem.

Fixes: c5af6628f4be (target/loongarch: Extract make_address_i() helper)
Reviewed-by: Richard Henderson 
Suggested-by: Richard Henderson 
Signed-off-by: Song Gao 
Message-Id: <20240320013955.1561311-1-gaos...@loongson.cn>


Is it a stable-8.2 material?

Thanks,

/mjt



Re: [PATCH-for-9.0? 01/21] host/atomic128: Include missing 'qemu/atomic.h' header

2024-03-21 Thread Richard Henderson

On 3/21/24 05:48, Philippe Mathieu-Daudé wrote:

qatomic_cmpxchg__nocheck(), qatomic_read__nocheck(),
qatomic_set__nocheck() are defined in "qemu/atomic.h".
Include it in order to avoid:

   In file included from include/exec/helper-proto.h:10:
   In file included from include/exec/helper-proto-common.h:10:
   In file included from include/qemu/atomic128.h:61:
   In file included from host/include/aarch64/host/atomic128-cas.h:16:
   host/include/generic/host/atomic128-cas.h:23:11: error: call to undeclared 
function 'qatomic_cmpxchg__nocheck'; ISO C99 and later do not support implicit 
function declarations [-Wimplicit-function-declaration]
 r.i = qatomic_cmpxchg__nocheck(ptr_align, c.i, n.i);



Nak.  We can rename these host/include/*/host/*atomic* as .h.inc if you need, but the 
top-level header is include/qemu/atomic128.h.



r~



Re: [PULL 2/3] target/loongarch: Fix tlb huge page loading issue

2024-03-21 Thread Michael Tokarev

20.03.2024 05:40, Song Gao :

From: Xianglai Li 




+if (unlikely((level == 0) || (level > 4))) {
+qemu_log_mask(LOG_GUEST_ERROR,
+  "Attepted LDDIR with level %"PRId64"\n", level);


Attempted.  FWIW, it's applied already.

/mjt



Re: [PULL 0/9] target/hppa fixes for 9.0

2024-03-21 Thread Michael Tokarev

20.03.2024 03:32, Richard Henderson :


Richard Henderson (3):
   target/hppa: Fix assemble_16 insns for wide mode
   target/hppa: Fix assemble_11a insns for wide mode
   target/hppa: Fix assemble_12a insns for wide mode

Sven Schnelle (6):
   target/hppa: ldcw,s uses static shift of 3
   target/hppa: fix shrp for wide mode
   target/hppa: fix access_id check
   target/hppa: exit tb on flush cache instructions
   target/hppa: mask privilege bits in mfia
   target/hppa: fix do_stdby_e()


Is it all -stable material (when appropriate)?

/mjt




Re: [PATCH] coroutine: cap per-thread local pool size

2024-03-21 Thread Stefan Hajnoczi
On Thu, 21 Mar 2024 at 08:22, Kevin Wolf  wrote:
>
> Am 20.03.2024 um 15:09 hat Daniel P. Berrangé geschrieben:
> > On Wed, Mar 20, 2024 at 09:35:39AM -0400, Stefan Hajnoczi wrote:
> > > On Tue, Mar 19, 2024 at 08:10:49PM +, Daniel P. Berrangé wrote:
> > > > On Tue, Mar 19, 2024 at 01:55:10PM -0400, Stefan Hajnoczi wrote:
> > > > > On Tue, Mar 19, 2024 at 01:43:32PM +, Daniel P. Berrangé wrote:
> > > > > > On Mon, Mar 18, 2024 at 02:34:29PM -0400, Stefan Hajnoczi wrote:
> > > > > > > diff --git a/util/qemu-coroutine.c b/util/qemu-coroutine.c
> > > > > > > index 5fd2dbaf8b..2790959eaf 100644
> > > > > > > --- a/util/qemu-coroutine.c
> > > > > > > +++ b/util/qemu-coroutine.c
> > > > > >
> > > > > > > +static unsigned int get_global_pool_hard_max_size(void)
> > > > > > > +{
> > > > > > > +#ifdef __linux__
> > > > > > > +g_autofree char *contents = NULL;
> > > > > > > +int max_map_count;
> > > > > > > +
> > > > > > > +/*
> > > > > > > + * Linux processes can have up to max_map_count virtual 
> > > > > > > memory areas
> > > > > > > + * (VMAs). mmap(2), mprotect(2), etc fail with ENOMEM beyond 
> > > > > > > this limit. We
> > > > > > > + * must limit the coroutine pool to a safe size to avoid 
> > > > > > > running out of
> > > > > > > + * VMAs.
> > > > > > > + */
> > > > > > > +if (g_file_get_contents("/proc/sys/vm/max_map_count", 
> > > > > > > &contents, NULL,
> > > > > > > +NULL) &&
> > > > > > > +qemu_strtoi(contents, NULL, 10, &max_map_count) == 0) {
> > > > > > > +/*
> > > > > > > + * This is a conservative upper bound that avoids 
> > > > > > > exceeding
> > > > > > > + * max_map_count. Leave half for non-coroutine users 
> > > > > > > like library
> > > > > > > + * dependencies, vhost-user, etc. Each coroutine takes 
> > > > > > > up 2 VMAs so
> > > > > > > + * halve the amount again.
> > > >
> > > > Leaving half for loaded libraries, etc is quite conservative
> > > > if max_map_count is the small-ish 64k default.
> > > >
> > > > That reservation could perhaps a fixed number like 5,000 ?
> > >
> > > While I don't want QEMU to abort, once this heuristic is in the code it
> > > will be scary to make it more optimistic and we may never change it. So
> > > now is the best time to try 5,000.
> > >
> > > I'll send a follow-up patch that reserves 5,000 mappings. If that turns
> > > out to be too optimistic we can increase the reservation.
> >
> > BTW, I suggested 5,000, because I looked at a few QEM processes I have
> > running on Fedora and saw just under 1,000 lines in /proc/$PID/maps,
> > of which only a subset is library mappings. So multiplying that x5 felt
> > like a fairly generous overhead for more complex build configurations.
>
> On my system, the boring desktop VM with no special hardware or other
> advanced configuration takes ~1500 mappings, most of which are
> libraries. I'm not concerned about the library mappings, it's unlikely
> that we'll double the number of libraries soon.
>
> But I'm not sure about dynamic mappings outside of coroutines, maybe
> when enabling features my simple desktop VM doesn't even use at all. If
> we're sure that nothing else uses any number worth mentioning, fine with
> me. But I couldn't tell.
>
> Staying the area we know reasonably well, how many libblkio bounce
> buffers could be in use at the same time? I think each one is an
> individual mmap(), right?

libblkio's mapping requirements are similar to vhost-user. There is
one general-purpose bounce buffer mapping plus a mapping for each QEMU
RAMBlock. That means the number of in-flight I/Os does not directly
influence the number of mappings.

Stefan



[PATCH] virtio_net: Do not send RSS key if it is not supported

2024-03-21 Thread Breno Leitao
There is a bug when setting the RSS options in virtio_net that can break
the whole machine, getting the kernel into an infinite loop.

Running the following command in any QEMU virtual machine with virtionet
will reproduce this problem:

# ethtool -X eth0  hfunc toeplitz

This is how the problem happens:

1) ethtool_set_rxfh() calls virtnet_set_rxfh()

2) virtnet_set_rxfh() calls virtnet_commit_rss_command()

3) virtnet_commit_rss_command() populates 4 entries for the rss
   scatter-gather

4) Since the command above does not have a key, then the last
   scatter-gather entry will be zeroed, since rss_key_size == 0.
sg_buf_size = vi->rss_key_size;

5) This buffer is passed to qemu, but qemu is not happy with a buffer
   with zero length, and do the following in virtqueue_map_desc() (QEMU
   function):

  if (!sz) {
  virtio_error(vdev, "virtio: zero sized buffers are not allowed");

6) virtio_error() (also QEMU function) set the device as broken

vdev->broken = true;

7) Qemu bails out, and does not respond to this crazy kernel.

8) The kernel is waiting for the response to come back (function
   virtnet_send_command())

9) The kernel is waiting doing the following :

  while (!virtqueue_get_buf(vi->cvq, &tmp) &&
 !virtqueue_is_broken(vi->cvq))
  cpu_relax();

10) None of the following functions above is true, thus, the kernel
loops here forever. Keeping in mind that virtqueue_is_broken() does
not look at the qemu `vdev->broken`, so it never realizes that the
virtio is broken at QEMU side.

Fix it by not sending the key scatter-gather entry if the key is not set.

Fixes: c7114b1249fa ("drivers/net/virtio_net: Added basic RSS support.")
Signed-off-by: Breno Leitao 
Cc: sta...@vger.kernel.org
Cc: qemu-devel@nongnu.org
---
 drivers/net/virtio_net.c | 16 +---
 1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index d7ce4a1011ea..5a7700b103f8 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -3041,11 +3041,16 @@ static int virtnet_set_ringparam(struct net_device *dev,
 static bool virtnet_commit_rss_command(struct virtnet_info *vi)
 {
struct net_device *dev = vi->dev;
+   int has_key = vi->rss_key_size;
struct scatterlist sgs[4];
unsigned int sg_buf_size;
+   int nents = 3;
+
+   if (has_key)
+   nents += 1;
 
/* prepare sgs */
-   sg_init_table(sgs, 4);
+   sg_init_table(sgs, nents);
 
sg_buf_size = offsetof(struct virtio_net_ctrl_rss, indirection_table);
	sg_set_buf(&sgs[0], &vi->ctrl->rss, sg_buf_size);
@@ -3057,8 +3062,13 @@ static bool virtnet_commit_rss_command(struct 
virtnet_info *vi)
- offsetof(struct virtio_net_ctrl_rss, max_tx_vq);
	sg_set_buf(&sgs[2], &vi->ctrl->rss.max_tx_vq, sg_buf_size);
 
-   sg_buf_size = vi->rss_key_size;
-   sg_set_buf(&sgs[3], vi->ctrl->rss.key, sg_buf_size);
+   if (has_key) {
+   /* Only populate if key is available, otherwise
+* populating a buffer with zero size breaks virtio
+*/
+   sg_buf_size = vi->rss_key_size;
+   sg_set_buf(&sgs[3], vi->ctrl->rss.key, sg_buf_size);
+   }
 
if (!virtnet_send_command(vi, VIRTIO_NET_CTRL_MQ,
  vi->has_rss ? VIRTIO_NET_CTRL_MQ_RSS_CONFIG
-- 
2.43.0




Re: [RFC PATCH v9 00/23] target/arm: Implement FEAT_NMI and FEAT_GICv3_NMI

2024-03-21 Thread Peter Maydell
On Thu, 21 Mar 2024 at 13:10, Jinjie Ruan  wrote:
>
> This patch set implements FEAT_NMI and FEAT_GICv3_NMI for armv8. These
> introduce support for a new category of interrupts in the architecture
> which we can use to provide NMI like functionality.
>
> There are two modes for using this FEAT_NMI. When PSTATE.ALLINT or
> PSTATE.SP & SCTLR_ELx.SCTLR_SPINTMASK is set, any entry to ELx causes all
> interrupts including those with superpriority to be masked on entry to ELn
> until the mask is explicitly removed by software or hardware. PSTATE.ALLINT
> can be managed by software using the new register control ALLINT.ALLINT.
> Independent controls are provided for this feature at each EL, usage at EL1
> should not disrupt EL2 or EL3.
>
> I have tested it with the following linux patches which try to support
> FEAT_NMI in linux kernel:
>
> 
> https://lore.kernel.org/linux-arm-kernel/Y4sH5qX5bK9xfEBp@lpieralisi/T/#mb4ba4a2c045bf72c10c2202c1dd1b82d3240dc88
>
> In the test, SGI, PPI and SPI interrupts can all be set to have super priority
> to be converted to a hardware NMI interrupt. The SGI is tested with kernel
> IPI as NMI framework, softlockup, hardlockup and kgdb test cases, and the PPI
> interrupt is tested with "perf top" command with hardware NMI enabled, and
> the SPI interrupt is tested with a custom test module, in which NMI interrupts
> can be received and sent normally.

It looks like your changes to the GIC have missed the handling
of NMIs in the active priority registers and the running
priority registers (both ICC and ICV versions). The way this
works is that the ICH_AP1R0 and ICC_AP1R0 registers get an
extra bit for NMI status in bit 63 (luckily we had the foresight
to make these struct fields be uint64_t). When we activate
an NMI IRQ, instead of setting the ICC_APR bit according to
its priority, we set the NMI bit instead. Similarly, on
deactivate we clear the NMI bit, not the priority-related bit.
The ICC_RPR register also has new bits in bit 63 and 62 for
whether there's an active NMI. On read of the RPR we figure
out the values for these bits based on the NMI bits in
ICC_AP1R_EL1S.NMI and ICC_AP1R_EL1NS.NMI (you might find the
pseudocode functions for ICC_RPR_EL1 in chapter 13 helpful to
look at here). The icc_highest_active_prio() likely also needs
changes to handle NMI. Similarly for all the ICV versions of
these registers and functions.

thanks
-- PMM



Re: [PATCH] migration/postcopy: Fix high frequency sync

2024-03-21 Thread Peter Xu
On Wed, Mar 20, 2024 at 05:44:53PM -0400, pet...@redhat.com wrote:
> From: Peter Xu 
> 
> On current code base I can observe extremely high sync count during
> precopy, as long as one enables postcopy-ram=on before switchover to
> postcopy.
> 
> To provide some context of when we decide to do a full sync: we check
> must_precopy (which implies "data must be sent during precopy phase"), and
> as long as it is lower than the threshold size we calculated (out of
> bandwidth and expected downtime) we will kick off the slow sync.
> 
> However, when postcopy is enabled (even if still during precopy phase), RAM
> only reports all pages as can_postcopy, and report must_precopy==0.  Then
> "must_precopy <= threshold_size" mostly always triggers and enforces a slow
> sync for every call to migration_iteration_run() when postcopy is enabled
> even if not used.  That is insane.
> 
> It turns out it was a regress bug introduced in the previous refactoring in
> QEMU 8.0 in late 2022. Fix this by checking the whole RAM size rather than
> must_precopy, like before.  Not copy stable yet as many things changed, and
> even if this should be a major performance regression, no functional change
> has observed (and that's also probably why nobody found it).  I only notice
> this when looking for another bug reported by Nina.
> 
> When at it, cleanup a little bit on the lines around.
> 
> Cc: Nina Schoetterl-Glausch 
> Fixes: c8df4a7aef ("migration: Split save_live_pending() into 
> state_pending_*")
> Signed-off-by: Peter Xu 

queued for 9.0-rc1.

-- 
Peter Xu




Re: [PATCH] misc/pca9554: Fix check of pin range value in property accessors

2024-03-21 Thread Cédric Le Goater

On 3/21/24 17:08, Miles Glenn wrote:

On Thu, 2024-03-21 at 17:01 +0100, Cédric Le Goater wrote:

Coverity detected an "Integer handling" issue with the pin value :

   In expression "state >> pin", right shifting "state" by more than 7
   bits always yields zero.  The shift amount, "pin", is as much as 8.

In practice, this should not happen because the properties "pin8" and
above are not created. Nevertheless, fix the range to avoid this
warning.

Fixes: CID 1534917
Fixes: de0c7d543bca ("misc: Add a pca9554 GPIO device model")
Cc: Glenn Miles 
Signed-off-by: Cédric Le Goater 
---
  hw/misc/pca9554.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/misc/pca9554.c b/hw/misc/pca9554.c
index
778b32e4430a8b618322c26b1b185ed3ead97cc4..5e31696797d9564666ece6fe177
37ee2a9733e96 100644
--- a/hw/misc/pca9554.c
+++ b/hw/misc/pca9554.c
@@ -160,7 +160,7 @@ static void pca9554_get_pin(Object *obj, Visitor
*v, const char *name,
  error_setg(errp, "%s: error reading %s", __func__, name);
  return;
  }
-if (pin < 0 || pin > PCA9554_PIN_COUNT) {
+if (pin < 0 || pin >= PCA9554_PIN_COUNT) {
  error_setg(errp, "%s invalid pin %s", __func__, name);
  return;
  }
@@ -187,7 +187,7 @@ static void pca9554_set_pin(Object *obj, Visitor
*v, const char *name,
  error_setg(errp, "%s: error reading %s", __func__, name);
  return;
  }
-if (pin < 0 || pin > PCA9554_PIN_COUNT) {
+if (pin < 0 || pin >= PCA9554_PIN_COUNT) {
  error_setg(errp, "%s invalid pin %s", __func__, name);
  return;
  }


Thanks, Cédric!  I guess I should be running coverity myself.


I don't myself. I get reports from :

  https://scan.coverity.com/projects/qemu

Thanks,

C.




[PATCH v2 2/2] Refactor common functions between POSIX and Windows implementation

2024-03-21 Thread aidan_leuck
From: aidaleuc 

Signed-off-by: aidaleuc 
---
 qga/commands-posix-ssh.c   | 47 +--
 qga/commands-ssh-core.c| 57 
 qga/commands-ssh-core.h|  8 +
 qga/commands-windows-ssh.c | 66 +-
 qga/meson.build|  3 +-
 5 files changed, 69 insertions(+), 112 deletions(-)
 create mode 100644 qga/commands-ssh-core.c
 create mode 100644 qga/commands-ssh-core.h

diff --git a/qga/commands-posix-ssh.c b/qga/commands-posix-ssh.c
index 236f80de44..9a71b109f9 100644
--- a/qga/commands-posix-ssh.c
+++ b/qga/commands-posix-ssh.c
@@ -9,6 +9,7 @@
 #include 
 #include 
 
+#include "commands-ssh-core.h"
 #include "qapi/error.h"
 #include "qga-qapi-commands.h"
 
@@ -80,37 +81,6 @@ mkdir_for_user(const char *path, const struct passwd *p,
 return true;
 }
 
-static bool
-check_openssh_pub_key(const char *key, Error **errp)
-{
-/* simple sanity-check, we may want more? */
-if (!key || key[0] == '#' || strchr(key, '\n')) {
-error_setg(errp, "invalid OpenSSH public key: '%s'", key);
-return false;
-}
-
-return true;
-}
-
-static bool
-check_openssh_pub_keys(strList *keys, size_t *nkeys, Error **errp)
-{
-size_t n = 0;
-strList *k;
-
-for (k = keys; k != NULL; k = k->next) {
-if (!check_openssh_pub_key(k->value, errp)) {
-return false;
-}
-n++;
-}
-
-if (nkeys) {
-*nkeys = n;
-}
-return true;
-}
-
 static bool
 write_authkeys(const char *path, const GStrv keys,
const struct passwd *p, Error **errp)
@@ -139,21 +109,6 @@ write_authkeys(const char *path, const GStrv keys,
 return true;
 }
 
-static GStrv
-read_authkeys(const char *path, Error **errp)
-{
-g_autoptr(GError) err = NULL;
-g_autofree char *contents = NULL;
-
-if (!g_file_get_contents(path, &contents, NULL, &err)) {
-error_setg(errp, "failed to read '%s': %s", path, err->message);
-return NULL;
-}
-
-return g_strsplit(contents, "\n", -1);
-
-}
-
 void
 qmp_guest_ssh_add_authorized_keys(const char *username, strList *keys,
   bool has_reset, bool reset,
diff --git a/qga/commands-ssh-core.c b/qga/commands-ssh-core.c
new file mode 100644
index 00..51353b396d
--- /dev/null
+++ b/qga/commands-ssh-core.c
@@ -0,0 +1,57 @@
+/*
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+#include 
+#include 
+#include "qapi/error.h"
+#include "commands-ssh-core.h"
+
+GStrv read_authkeys(const char *path, Error **errp)
+{
+g_autoptr(GError) err = NULL;
+g_autofree char *contents = NULL;
+
+if (!g_file_get_contents(path, &contents, NULL, &err))
+{
+error_setg(errp, "failed to read '%s': %s", path, err->message);
+return NULL;
+}
+
+return g_strsplit(contents, "\n", -1);
+}
+
+bool check_openssh_pub_keys(strList *keys, size_t *nkeys, Error **errp)
+{
+size_t n = 0;
+strList *k;
+
+for (k = keys; k != NULL; k = k->next)
+{
+if (!check_openssh_pub_key(k->value, errp))
+{
+return false;
+}
+n++;
+}
+
+if (nkeys)
+{
+*nkeys = n;
+}
+return true;
+}
+
+bool check_openssh_pub_key(const char *key, Error **errp)
+{
+/* simple sanity-check, we may want more? */
+if (!key || key[0] == '#' || strchr(key, '\n'))
+{
+error_setg(errp, "invalid OpenSSH public key: '%s'", key);
+return false;
+}
+
+return true;
+}
\ No newline at end of file
diff --git a/qga/commands-ssh-core.h b/qga/commands-ssh-core.h
new file mode 100644
index 00..f26d66c846
--- /dev/null
+++ b/qga/commands-ssh-core.h
@@ -0,0 +1,8 @@
+/*
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+GStrv read_authkeys(const char *path, Error **errp);
+bool check_openssh_pub_keys(strList *keys, size_t *nkeys, Error **errp);
+bool check_openssh_pub_key(const char *key, Error **errp);
\ No newline at end of file
diff --git a/qga/commands-windows-ssh.c b/qga/commands-windows-ssh.c
index 566266f465..e8cbcd0779 100644
--- a/qga/commands-windows-ssh.c
+++ b/qga/commands-windows-ssh.c
@@ -16,6 +16,7 @@
 #include 
 #include 
 
+#include "commands-ssh-core.h"
 #include "commands-windows-ssh.h"
 #include "guest-agent-core.h"
 #include "limits.h"
@@ -35,71 +36,6 @@
 #define ADMIN_SID "S-1-5-32-544"
 #define WORLD_SID "S-1-1-0"
 
-/*
- * Reads the authorized_keys file and returns an array of strings for each entry
- *
- * parameters:
- * path -> Path to the authorized_keys file
- * errp -> Error structure that will contain errors upon failure.
- * returns: Array of strings, where each entry is an authorized key.
- */
-static GStrv read_authkeys(const char *path, Error **errp)
-{
-  g_autoptr(GError) err = NULL;
-  

Re: [PATCH] misc/pca9554: Fix check of pin range value in property accessors

2024-03-21 Thread Miles Glenn
On Thu, 2024-03-21 at 17:01 +0100, Cédric Le Goater wrote:
> Coverity detected an "Integer handling" issue with the pin value :
> 
>   In expression "state >> pin", right shifting "state" by more than 7
>   bits always yields zero.  The shift amount, "pin", is as much as 8.
> 
> In practice, this should not happen because the properties "pin8" and
> above are not created. Nevertheless, fix the range to avoid this
> warning.
> 
> Fixes: CID 1534917
> Fixes: de0c7d543bca ("misc: Add a pca9554 GPIO device model")
> Cc: Glenn Miles 
> Signed-off-by: Cédric Le Goater 
> ---
>  hw/misc/pca9554.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/misc/pca9554.c b/hw/misc/pca9554.c
> index 778b32e4430a8b618322c26b1b185ed3ead97cc4..5e31696797d9564666ece6fe17737ee2a9733e96 100644
> --- a/hw/misc/pca9554.c
> +++ b/hw/misc/pca9554.c
> @@ -160,7 +160,7 @@ static void pca9554_get_pin(Object *obj, Visitor *v, const char *name,
>  error_setg(errp, "%s: error reading %s", __func__, name);
>  return;
>  }
> -if (pin < 0 || pin > PCA9554_PIN_COUNT) {
> +if (pin < 0 || pin >= PCA9554_PIN_COUNT) {
>  error_setg(errp, "%s invalid pin %s", __func__, name);
>  return;
>  }
> @@ -187,7 +187,7 @@ static void pca9554_set_pin(Object *obj, Visitor *v, const char *name,
>  error_setg(errp, "%s: error reading %s", __func__, name);
>  return;
>  }
> -if (pin < 0 || pin > PCA9554_PIN_COUNT) {
> +if (pin < 0 || pin >= PCA9554_PIN_COUNT) {
>  error_setg(errp, "%s invalid pin %s", __func__, name);
>  return;
>  }

Thanks, Cédric!  I guess I should be running coverity myself.

-Glenn

Reviewed-by: Glenn Miles 




[PATCH v2 1/2] Implement SSH commands in QEMU GA for Windows

2024-03-21 Thread aidan_leuck
From: aidaleuc 

Signed-off-by: aidaleuc 
---
 qga/commands-windows-ssh.c | 848 +
 qga/commands-windows-ssh.h |  26 ++
 qga/meson.build|   9 +-
 qga/qapi-schema.json   |  22 +-
 4 files changed, 892 insertions(+), 13 deletions(-)
 create mode 100644 qga/commands-windows-ssh.c
 create mode 100644 qga/commands-windows-ssh.h

diff --git a/qga/commands-windows-ssh.c b/qga/commands-windows-ssh.c
new file mode 100644
index 00..566266f465
--- /dev/null
+++ b/qga/commands-windows-ssh.c
@@ -0,0 +1,848 @@
+/*
+ * QEMU Guest Agent win32-specific command implementations for SSH keys.
+ * The implementation is opinionated and expects the SSH implementation to
+ * be OpenSSH.
+ *
+ * Copyright Schweitzer Engineering Laboratories. 2024
+ *
+ * Authors:
+ *  Aidan Leuck 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+#include 
+#include 
+
+#include "commands-windows-ssh.h"
+#include "guest-agent-core.h"
+#include "limits.h"
+#include "lmaccess.h"
+#include "lmapibuf.h"
+#include "lmerr.h"
+#include "qapi/error.h"
+
+#include "qga-qapi-commands.h"
+#include "sddl.h"
+#include "shlobj.h"
+#include "userenv.h"
+
+#define AUTHORIZED_KEY_FILE "authorized_keys"
+#define AUTHORIZED_KEY_FILE_ADMIN "administrators_authorized_keys"
+#define LOCAL_SYSTEM_SID "S-1-5-18"
+#define ADMIN_SID "S-1-5-32-544"
+#define WORLD_SID "S-1-1-0"
+
+/*
+ * Reads the authorized_keys file and returns an array of strings for each entry
+ *
+ * parameters:
+ * path -> Path to the authorized_keys file
+ * errp -> Error structure that will contain errors upon failure.
+ * returns: Array of strings, where each entry is an authorized key.
+ */
+static GStrv read_authkeys(const char *path, Error **errp)
+{
+  g_autoptr(GError) err = NULL;
+  g_autofree char *contents = NULL;
+
+  if (!g_file_get_contents(path, &contents, NULL, &err)) {
+error_setg(errp, "failed to read '%s': %s", path, err->message);
+return NULL;
+  }
+
+  return g_strsplit(contents, "\n", -1);
+}
+
+/*
+ * Checks if a OpenSSH key is valid
+ * parameters:
+ * key* Key to check for validity
+ * errp -> Error structure that will contain errors upon failure.
+ * returns: true if key is valid, false otherwise
+ */
+static bool check_openssh_pub_key(const char *key, Error **errp)
+{
+  /* simple sanity-check, we may want more? */
+  if (!key || key[0] == '#' || strchr(key, '\n')) {
+error_setg(errp, "invalid OpenSSH public key: '%s'", key);
+return false;
+  }
+
+  return true;
+}
+
+/*
+ * Checks if all openssh keys in the array are valid
+ *
+ * parameters:
+ * keys -> Array of keys to check
+ * errp -> Error structure that will contain errors upon failure.
+ * returns: true if all keys are valid, false otherwise
+ */
+static bool check_openssh_pub_keys(strList *keys, size_t *nkeys, Error **errp)
+{
+  size_t n = 0;
+  strList *k;
+
+  for (k = keys; k != NULL; k = k->next) {
+if (!check_openssh_pub_key(k->value, errp)) {
+  return false;
+}
+n++;
+  }
+
+  if (nkeys) {
+*nkeys = n;
+  }
+  return true;
+}
+
+/*
+ * Frees userInfo structure. This implements the g_auto cleanup
+ * for the structure.
+ */
+void free_userInfo(PWindowsUserInfo info)
+{
+  g_free(info->sshDirectory);
+  g_free(info->authorizedKeyFile);
+  LocalFree(info->SSID);
+  g_free(info->username);
+  g_free(info);
+}
+
+/*
+ * Gets the admin SSH folder for OpenSSH. OpenSSH does not store
+ * the authorized_key file in the users home directory for security reasons and
+ * instead stores it at %PROGRAMDATA%/ssh. This function returns the path to
+ * that directory on the users machine
+ *
+ * parameters:
+ * errp -> error structure to set when an error occurs
+ * returns: The path to the ssh folder in %PROGRAMDATA% or NULL if an error occurred.
+ */
+static char *get_admin_ssh_folder(Error **errp)
+{
+  // Allocate memory for the program data path
+  g_autofree char *programDataPath = NULL;
+  char *authkeys_path = NULL;
+  PWSTR pgDataW;
+  GError *gerr = NULL;
+
+  // Get the KnownFolderPath on the machine.
+  HRESULT folderResult =
+  SHGetKnownFolderPath(&FOLDERID_ProgramData, 0, NULL, &pgDataW);
+  if (folderResult != S_OK) {
+error_setg(errp, "Failed to retrieve ProgramData folder");
+goto error;
+  }
+
+  // Convert from a wide string back to a standard character string.
+  programDataPath = g_utf16_to_utf8(pgDataW, -1, NULL, NULL, );
+  if (!programDataPath) {
+goto error;
+  }
+
+  // Build the path to the file.
+  authkeys_path = g_build_filename(programDataPath, "ssh", NULL);
+  CoTaskMemFree(pgDataW);
+  return authkeys_path;
+
+error:
+  CoTaskMemFree(pgDataW);
+  
+  if (gerr) {
+error_setg(errp, "Failed to convert program data path from wide string to standard utf 8 string. %s", gerr->message);
+g_error_free(gerr);
+  }
+
+  return NULL;
+}
+
+/*
+ * Gets the path to the SSH folder for the specified 

[PATCH v2 0/2] Implement SSH commands in QEMU GA for Windows

2024-03-21 Thread aidan_leuck
From: aidaleuc 

This patch aims to implement guest-ssh-add-authorized-keys, 
guest-ssh-remove-authorized-keys, and guest-ssh-get-authorized-keys
for Windows. This PR is based on Microsoft's OpenSSH implementation 
https://github.com/PowerShell/Win32-OpenSSH. The guest agents 
will support Kubevirt and allow guest agent propagation to be used to 
dynamically inject SSH keys. 
https://kubevirt.io/user-guide/virtual_machines/accessing_virtual_machines/#dynamic-ssh-public-key-injection-via-qemu-guest-agent

Changes since v1
* Fixed styling errors
* Moved from wcstombs to g_utf functions
* Removed unnecessary if checks on calls to free
* Fixed copyright headers
* Refactored create_acl functions into base function, admin function and user 
function
* Removed unused user count function
* Split up refactor of existing code into a separate patch

aidaleuc (2):
  Implement SSH commands in QEMU GA for Windows
  Refactor common functions between POSIX and Windows implementation

 qga/commands-posix-ssh.c   |  47 +--
 qga/commands-ssh-core.c|  57 +++
 qga/commands-ssh-core.h|   8 +
 qga/commands-windows-ssh.c | 784 +
 qga/commands-windows-ssh.h |  26 ++
 qga/meson.build|   8 +-
 qga/qapi-schema.json   |  22 +-
 7 files changed, 894 insertions(+), 58 deletions(-)
 create mode 100644 qga/commands-ssh-core.c
 create mode 100644 qga/commands-ssh-core.h
 create mode 100644 qga/commands-windows-ssh.c
 create mode 100644 qga/commands-windows-ssh.h

-- 
2.44.0




[PATCH] misc/pca9554: Fix check of pin range value in property accessors

2024-03-21 Thread Cédric Le Goater
Coverity detected an "Integer handling" issue with the pin value :

  In expression "state >> pin", right shifting "state" by more than 7
  bits always yields zero.  The shift amount, "pin", is as much as 8.

In practice, this should not happen because the properties "pin8" and
above are not created. Nevertheless, fix the range to avoid this warning.

Fixes: CID 1534917
Fixes: de0c7d543bca ("misc: Add a pca9554 GPIO device model")
Cc: Glenn Miles 
Signed-off-by: Cédric Le Goater 
---
 hw/misc/pca9554.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/misc/pca9554.c b/hw/misc/pca9554.c
index 778b32e4430a8b618322c26b1b185ed3ead97cc4..5e31696797d9564666ece6fe17737ee2a9733e96 100644
--- a/hw/misc/pca9554.c
+++ b/hw/misc/pca9554.c
@@ -160,7 +160,7 @@ static void pca9554_get_pin(Object *obj, Visitor *v, const char *name,
 error_setg(errp, "%s: error reading %s", __func__, name);
 return;
 }
-if (pin < 0 || pin > PCA9554_PIN_COUNT) {
+if (pin < 0 || pin >= PCA9554_PIN_COUNT) {
 error_setg(errp, "%s invalid pin %s", __func__, name);
 return;
 }
@@ -187,7 +187,7 @@ static void pca9554_set_pin(Object *obj, Visitor *v, const char *name,
 error_setg(errp, "%s: error reading %s", __func__, name);
 return;
 }
-if (pin < 0 || pin > PCA9554_PIN_COUNT) {
+if (pin < 0 || pin >= PCA9554_PIN_COUNT) {
 error_setg(errp, "%s invalid pin %s", __func__, name);
 return;
 }
-- 
2.44.0




[RFC 1/8] virtio: Define InOrderVQElement

2024-03-21 Thread Jonah Palmer
Define the InOrderVQElement structure for the VIRTIO_F_IN_ORDER
transport feature implementation.

The InOrderVQElement structure is used to encapsulate out-of-order
VirtQueueElement data that was processed by the host. This data
includes:
 - The processed VirtQueueElement (elem)
 - Length of data (len)
 - VirtQueueElement array index (idx)
 - Number of processed VirtQueueElements (count)

InOrderVQElements will be stored in a buffering mechanism until an
order can be achieved.

Signed-off-by: Jonah Palmer 
---
 include/hw/virtio/virtio.h | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
index b3c74a1bca..c8aa435a5e 100644
--- a/include/hw/virtio/virtio.h
+++ b/include/hw/virtio/virtio.h
@@ -77,6 +77,13 @@ typedef struct VirtQueueElement
 struct iovec *out_sg;
 } VirtQueueElement;
 
+typedef struct InOrderVQElement {
+const VirtQueueElement *elem;
+unsigned int len;
+unsigned int idx;
+unsigned int count;
+} InOrderVQElement;
+
 #define VIRTIO_QUEUE_MAX 1024
 
 #define VIRTIO_NO_VECTOR 0x
-- 
2.39.3




[RFC 8/8] virtio: Add VIRTIO_F_IN_ORDER property definition

2024-03-21 Thread Jonah Palmer
Extend the virtio device property definitions to include the
VIRTIO_F_IN_ORDER feature.

The default state of this feature is disabled, allowing it to be
explicitly enabled where it's supported.

Signed-off-by: Jonah Palmer 
---
 include/hw/virtio/virtio.h | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
index eeeda397a9..ffd78830a3 100644
--- a/include/hw/virtio/virtio.h
+++ b/include/hw/virtio/virtio.h
@@ -400,7 +400,9 @@ typedef struct VirtIORNGConf VirtIORNGConf;
 DEFINE_PROP_BIT64("packed", _state, _field, \
   VIRTIO_F_RING_PACKED, false), \
 DEFINE_PROP_BIT64("queue_reset", _state, _field, \
-  VIRTIO_F_RING_RESET, true)
+  VIRTIO_F_RING_RESET, true), \
+DEFINE_PROP_BIT64("in_order", _state, _field, \
+  VIRTIO_F_IN_ORDER, false)
 
 hwaddr virtio_queue_get_desc_addr(VirtIODevice *vdev, int n);
 bool virtio_queue_enabled_legacy(VirtIODevice *vdev, int n);
-- 
2.39.3




[RFC 5/8] virtio-net: in-order handling

2024-03-21 Thread Jonah Palmer
Implements in-order handling for the virtio-net device.

Since virtio-net utilizes batching for its Rx VirtQueue, the device is
responsible for calling virtqueue_flush once it has completed its
batching operation.

Note:
-
It's unclear if this implementation is really necessary to "guarantee"
that used VirtQueueElements are put on the used ring in-order since, by
design, virtio-net already does this with its Rx VirtQueue.

Signed-off-by: Jonah Palmer 
---
 hw/net/virtio-net.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index 9959f1932b..b0375f7e5e 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -2069,7 +2069,11 @@ static ssize_t virtio_net_receive_rcu(NetClientState *nc, const uint8_t *buf,
 
 for (j = 0; j < i; j++) {
 /* signal other side */
-virtqueue_fill(q->rx_vq, elems[j], lens[j], j);
+if (virtio_vdev_has_feature(vdev, VIRTIO_F_IN_ORDER)) {
+virtqueue_order_element(q->rx_vq, elems[j], lens[j], j, 0);
+} else {
+virtqueue_fill(q->rx_vq, elems[j], lens[j], j);
+}
 g_free(elems[j]);
 }
 
-- 
2.39.3




[RFC 0/8] virtio,vhost: Add VIRTIO_F_IN_ORDER support

2024-03-21 Thread Jonah Palmer
The goal of these patches is to add support to a variety of virtio and
vhost devices for the VIRTIO_F_IN_ORDER transport feature. This feature
indicates that all buffers are used by the device in the same order in
which they were made available by the driver.

These patches attempt to implement a generalized, non-device-specific
solution to support this feature.

The core feature behind this solution is a buffer mechanism in the form
of GLib's GHashTable. The decision behind using a hash table was to
leverage their ability for quick lookup, insertion, and removal
operations. Given that our keys are simply numbers of an ordered
sequence, a hash table seemed like the best choice for a buffer
mechanism.

-

The strategy behind this implementation is as follows:

We know that buffers that are popped from the available ring and enqueued
for further processing will always be done in the same order in which they
were made available by the driver. Given this, we can note their order
by assigning the resulting VirtQueueElement a key. This key is a number
in a sequence that represents the order in which they were popped from
the available ring, relative to the other VirtQueueElements.

For example, given 3 "elements" that were popped from the available
ring, we assign a key value to them which represents their order (elem0
is popped first, then elem1, then lastly elem2):

 elem2   --  elem1   --  elem0   ---> Enqueue for processing
(key: 2)(key: 1)(key: 0)

Then these elements are enqueued for further processing by the host.

While most devices will return these completed elements in the same
order in which they were enqueued, some devices may not (e.g.
virtio-blk). To guarantee that these elements are put on the used ring
in the same order in which they were enqueued, we can use a buffering
mechanism that keeps track of the next expected sequence number of an
element.

In other words, if the completed element does not have a key value that
matches the next expected sequence number, then we know this element is
not in-order and we must stash it away in a hash table until an order
can be made. The element's key value is used as the key for placing it
in the hash table.

If the completed element has a key value that matches the next expected
sequence number, then we know this element is in-order and we can push
it on the used ring. Then we increment the next expected sequence number
and check if the hash table contains an element at this key location.

If so, we retrieve this element, push it to the used ring, delete the
key-value pair from the hash table, increment the next expected sequence
number, and check the hash table again for an element at this new key
location. This process is repeated until we're unable to find an element
in the hash table to continue the order.

So, for example, say the 3 elements we enqueued were completed in the
following order: elem1, elem2, elem0. The next expected sequence number
is 0:

exp-seq-num = 0:

 elem1   --> elem1.key == exp-seq-num ? --> No, stash it
(key: 1) |
 |
 v
   
   |key: 1 - elem1|
   
-
exp-seq-num = 0:

 elem2   --> elem2.key == exp-seq-num ? --> No, stash it
(key: 2) |
 |
 v
   
   |key: 1 - elem1|
   |--|
   |key: 2 - elem2|
   
-
exp-seq-num = 0:

 elem0   --> elem0.key == exp-seq-num ? --> Yes, push to used ring
(key: 0)

exp-seq-num = 1:

lookup(table, exp-seq-num) != NULL ? --> Yes, push to used ring,
 remove elem from table
 |
 v
   
   |key: 2 - elem2|
   

exp-seq-num = 2:

lookup(table, exp-seq-num) != NULL ? --> Yes, push to used ring,
 remove elem from table
 |
 v
   
  

[RFC 6/8] vhost-svq: in-order handling

2024-03-21 Thread Jonah Palmer
Implements in-order handling for vhost devices using shadow virtqueues.

Since vhost's shadow virtqueues utilize batching in their
vhost_svq_flush calls, the vhost device is responsible for calling
virtqueue_flush once it has completed its batching operation.

Note:
-
It's unclear if this implementation is really necessary to "guarantee"
in-order handling since, by design, the vhost_svq_flush function puts
used VirtQueueElements in-order already.

Signed-off-by: Jonah Palmer 
---
 hw/virtio/vhost-shadow-virtqueue.c | 15 ---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/hw/virtio/vhost-shadow-virtqueue.c b/hw/virtio/vhost-shadow-virtqueue.c
index fc5f408f77..3c42adee87 100644
--- a/hw/virtio/vhost-shadow-virtqueue.c
+++ b/hw/virtio/vhost-shadow-virtqueue.c
@@ -493,11 +493,20 @@ static void vhost_svq_flush(VhostShadowVirtqueue *svq,
 qemu_log_mask(LOG_GUEST_ERROR,
  "More than %u used buffers obtained in a %u size SVQ",
  i, svq->vring.num);
-virtqueue_fill(vq, elem, len, i);
-virtqueue_flush(vq, i);
+if (virtio_vdev_has_feature(svq->vdev, VIRTIO_F_IN_ORDER)) {
+virtqueue_order_element(vq, elem, len, i, i);
+} else {
+virtqueue_fill(vq, elem, len, i);
+virtqueue_flush(vq, i);
+}
 return;
 }
-virtqueue_fill(vq, elem, len, i++);
+
+if (virtio_vdev_has_feature(svq->vdev, VIRTIO_F_IN_ORDER)) {
+virtqueue_order_element(vq, elem, len, i++, 0);
+} else {
+virtqueue_fill(vq, elem, len, i++);
+}
 }
 
 virtqueue_flush(vq, i);
-- 
2.39.3




[RFC 3/8] virtio: Define order variables

2024-03-21 Thread Jonah Palmer
Define order variables for their use in a VirtQueue's in-order hash
table. Also initialize current_order variables to 0 when creating or
resetting a VirtQueue. These variables are used when the device has
negotiated the VIRTIO_F_IN_ORDER transport feature.

A VirtQueue's current_order_idx represents the next expected index in
the sequence of VirtQueueElements to be processed (put on the used
ring). The next VirtQueueElement to be processed must match this
sequence number before additional elements can be safely added to the
used ring.

A VirtQueue's current_order_key is essentially a counter whose value is
saved as a key in a VirtQueueElement. After the value has been assigned
to the VirtQueueElement, the counter is incremented. All
VirtQueueElements being used by the device are assigned a key value and
the sequence at which they're assigned must be preserved when the device
puts these elements on the used ring.

A VirtQueueElement's order_key is the value of a VirtQueue's
current_order_key at the time of the VirtQueueElement's creation. This
value must match with the VirtQueue's current_order_idx before it's able
to be put on the used ring by the device.

Signed-off-by: Jonah Palmer 
---
 hw/virtio/virtio.c | 6 ++
 include/hw/virtio/virtio.h | 1 +
 2 files changed, 7 insertions(+)

diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index d2afeeb59a..40124545d6 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -155,6 +155,8 @@ struct VirtQueue
 
 /* In-Order */
 GHashTable *in_order_ht;
+uint16_t current_order_idx;
+uint16_t current_order_key;
 };
 
 const char *virtio_device_names[] = {
@@ -2103,6 +2105,8 @@ static void __virtio_queue_reset(VirtIODevice *vdev, uint32_t i)
 if (vdev->vq[i].in_order_ht != NULL) {
 g_hash_table_remove_all(vdev->vq[i].in_order_ht);
 }
+vdev->vq[i].current_order_idx = 0;
+vdev->vq[i].current_order_key = 0;
virtio_virtqueue_reset_region_cache(&vdev->vq[i]);
 }
 
@@ -2357,6 +2361,8 @@ VirtQueue *virtio_add_queue(VirtIODevice *vdev, int queue_size,
 g_hash_table_new_full(g_direct_hash, g_direct_equal, NULL,
   free_in_order_vq_element);
 }
+vdev->vq[i].current_order_idx = 0;
+vdev->vq[i].current_order_key = 0;
 
return &vdev->vq[i];
 }
diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
index c8aa435a5e..f83d7e1fee 100644
--- a/include/hw/virtio/virtio.h
+++ b/include/hw/virtio/virtio.h
@@ -75,6 +75,7 @@ typedef struct VirtQueueElement
 hwaddr *out_addr;
 struct iovec *in_sg;
 struct iovec *out_sg;
+uint16_t order_key;
 } VirtQueueElement;
 
 typedef struct InOrderVQElement {
-- 
2.39.3




[RFC 2/8] virtio: Create/destroy/reset VirtQueue In-Order hash table

2024-03-21 Thread Jonah Palmer
Define a GLib hash table (GHashTable) member in a device's VirtQueue
and add its creation, destruction, and reset functions appropriately.
Also define a function to handle the deallocation of InOrderVQElement
values whenever they're removed from the hash table or the hash table
is destroyed. This hash table is to be used when the device is using
the VIRTIO_F_IN_ORDER transport feature.

A VirtQueue's in-order hash table will take in a uint16_t key with an
InOrderVQElement value as its key-value pair.

The hash table will be used as a buffer mechanism for completed,
out-of-order VirtQueueElements until they can be used in the same order
in which they were made available to the device.

Signed-off-by: Jonah Palmer 
---
 hw/virtio/virtio.c | 27 +++
 1 file changed, 27 insertions(+)

diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index fb6b4ccd83..d2afeeb59a 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -152,6 +152,9 @@ struct VirtQueue
 EventNotifier host_notifier;
 bool host_notifier_enabled;
 QLIST_ENTRY(VirtQueue) node;
+
+/* In-Order */
+GHashTable *in_order_ht;
 };
 
 const char *virtio_device_names[] = {
@@ -2070,6 +2073,16 @@ static enum virtio_device_endian virtio_current_cpu_endian(void)
 }
 }
 
+/* 
+ * Called when an element is removed from the hash table
+ * or when the hash table is destroyed.
+ */
+static void free_in_order_vq_element(gpointer data)
+{
+InOrderVQElement *elem = (InOrderVQElement *)data;
+g_free(elem);
+}
+
 static void __virtio_queue_reset(VirtIODevice *vdev, uint32_t i)
 {
 vdev->vq[i].vring.desc = 0;
@@ -2087,6 +2100,9 @@ static void __virtio_queue_reset(VirtIODevice *vdev, uint32_t i)
 vdev->vq[i].notification = true;
 vdev->vq[i].vring.num = vdev->vq[i].vring.num_default;
 vdev->vq[i].inuse = 0;
+if (vdev->vq[i].in_order_ht != NULL) {
+g_hash_table_remove_all(vdev->vq[i].in_order_ht);
+}
virtio_virtqueue_reset_region_cache(&vdev->vq[i]);
 }
 
@@ -2334,6 +2350,13 @@ VirtQueue *virtio_add_queue(VirtIODevice *vdev, int queue_size,
 vdev->vq[i].vring.align = VIRTIO_PCI_VRING_ALIGN;
 vdev->vq[i].handle_output = handle_output;
 vdev->vq[i].used_elems = g_new0(VirtQueueElement, queue_size);
+vdev->vq[i].in_order_ht = NULL;
+
+if (virtio_host_has_feature(vdev, VIRTIO_F_IN_ORDER)) {
+vdev->vq[i].in_order_ht =
+g_hash_table_new_full(g_direct_hash, g_direct_equal, NULL,
+  free_in_order_vq_element);
+}
 
return &vdev->vq[i];
 }
@@ -2345,6 +2368,10 @@ void virtio_delete_queue(VirtQueue *vq)
 vq->handle_output = NULL;
 g_free(vq->used_elems);
 vq->used_elems = NULL;
+if (virtio_host_has_feature(vq->vdev, VIRTIO_F_IN_ORDER)) {
+g_hash_table_destroy(vq->in_order_ht);
+vq->in_order_ht = NULL;
+}
 virtio_virtqueue_reset_region_cache(vq);
 }
 
-- 
2.39.3




[RFC 7/8] vhost/vhost-user: Add VIRTIO_F_IN_ORDER to vhost feature bits

2024-03-21 Thread Jonah Palmer
Add support for the VIRTIO_F_IN_ORDER feature across a variety of vhost
devices.

The inclusion of VIRTIO_F_IN_ORDER in the feature bits arrays for these
devices ensures that the backend is capable of offering and providing
support for this feature, and that it can be disabled if the backend
does not support it.

Signed-off-by: Jonah Palmer 
---
 hw/block/vhost-user-blk.c| 1 +
 hw/net/vhost_net.c   | 2 ++
 hw/scsi/vhost-scsi.c | 1 +
 hw/scsi/vhost-user-scsi.c| 1 +
 hw/virtio/vhost-user-fs.c| 1 +
 hw/virtio/vhost-user-vsock.c | 1 +
 net/vhost-vdpa.c | 1 +
 7 files changed, 8 insertions(+)

diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c
index 6a856ad51a..d176ed857e 100644
--- a/hw/block/vhost-user-blk.c
+++ b/hw/block/vhost-user-blk.c
@@ -51,6 +51,7 @@ static const int user_feature_bits[] = {
 VIRTIO_F_RING_PACKED,
 VIRTIO_F_IOMMU_PLATFORM,
 VIRTIO_F_RING_RESET,
+VIRTIO_F_IN_ORDER,
 VHOST_INVALID_FEATURE_BIT
 };
 
diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
index e8e1661646..33d1d4b9d3 100644
--- a/hw/net/vhost_net.c
+++ b/hw/net/vhost_net.c
@@ -48,6 +48,7 @@ static const int kernel_feature_bits[] = {
 VIRTIO_F_IOMMU_PLATFORM,
 VIRTIO_F_RING_PACKED,
 VIRTIO_F_RING_RESET,
+VIRTIO_F_IN_ORDER,
 VIRTIO_NET_F_HASH_REPORT,
 VHOST_INVALID_FEATURE_BIT
 };
@@ -76,6 +77,7 @@ static const int user_feature_bits[] = {
 VIRTIO_F_IOMMU_PLATFORM,
 VIRTIO_F_RING_PACKED,
 VIRTIO_F_RING_RESET,
+VIRTIO_F_IN_ORDER,
 VIRTIO_NET_F_RSS,
 VIRTIO_NET_F_HASH_REPORT,
 VIRTIO_NET_F_GUEST_USO4,
diff --git a/hw/scsi/vhost-scsi.c b/hw/scsi/vhost-scsi.c
index ae26bc19a4..40e7630191 100644
--- a/hw/scsi/vhost-scsi.c
+++ b/hw/scsi/vhost-scsi.c
@@ -38,6 +38,7 @@ static const int kernel_feature_bits[] = {
 VIRTIO_RING_F_EVENT_IDX,
 VIRTIO_SCSI_F_HOTPLUG,
 VIRTIO_F_RING_RESET,
+VIRTIO_F_IN_ORDER,
 VHOST_INVALID_FEATURE_BIT
 };
 
diff --git a/hw/scsi/vhost-user-scsi.c b/hw/scsi/vhost-user-scsi.c
index a63b1f4948..1d59951ab7 100644
--- a/hw/scsi/vhost-user-scsi.c
+++ b/hw/scsi/vhost-user-scsi.c
@@ -36,6 +36,7 @@ static const int user_feature_bits[] = {
 VIRTIO_RING_F_EVENT_IDX,
 VIRTIO_SCSI_F_HOTPLUG,
 VIRTIO_F_RING_RESET,
+VIRTIO_F_IN_ORDER,
 VHOST_INVALID_FEATURE_BIT
 };
 
diff --git a/hw/virtio/vhost-user-fs.c b/hw/virtio/vhost-user-fs.c
index cca2cd41be..9243dbb128 100644
--- a/hw/virtio/vhost-user-fs.c
+++ b/hw/virtio/vhost-user-fs.c
@@ -33,6 +33,7 @@ static const int user_feature_bits[] = {
 VIRTIO_F_RING_PACKED,
 VIRTIO_F_IOMMU_PLATFORM,
 VIRTIO_F_RING_RESET,
+VIRTIO_F_IN_ORDER,
 
 VHOST_INVALID_FEATURE_BIT
 };
diff --git a/hw/virtio/vhost-user-vsock.c b/hw/virtio/vhost-user-vsock.c
index 9431b9792c..cc7e4e47b4 100644
--- a/hw/virtio/vhost-user-vsock.c
+++ b/hw/virtio/vhost-user-vsock.c
@@ -21,6 +21,7 @@ static const int user_feature_bits[] = {
 VIRTIO_RING_F_INDIRECT_DESC,
 VIRTIO_RING_F_EVENT_IDX,
 VIRTIO_F_NOTIFY_ON_EMPTY,
+VIRTIO_F_IN_ORDER,
 VHOST_INVALID_FEATURE_BIT
 };
 
diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index 85e73dd6a7..ed3185acfa 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -62,6 +62,7 @@ const int vdpa_feature_bits[] = {
 VIRTIO_F_RING_PACKED,
 VIRTIO_F_RING_RESET,
 VIRTIO_F_VERSION_1,
+VIRTIO_F_IN_ORDER,
 VIRTIO_NET_F_CSUM,
 VIRTIO_NET_F_CTRL_GUEST_OFFLOADS,
 VIRTIO_NET_F_CTRL_MAC_ADDR,
-- 
2.39.3




[RFC 4/8] virtio: Implement in-order handling for virtio devices

2024-03-21 Thread Jonah Palmer
Implements in-order handling for most virtio devices using the
VIRTIO_F_IN_ORDER transport feature, specifically those that call
virtqueue_push to push their used elements onto the used ring.

The logic behind this implementation is as follows:

1.) virtqueue_pop always enqueues VirtQueueElements in-order.

virtqueue_pop always retrieves one or more buffer descriptors in-order
from the available ring and converts them into a VirtQueueElement. This
means that VirtQueueElements are, by default, enqueued in the same order
in which they were made available.

Consequently, as each VirtQueueElement is created, we can assign it a
sequential key value. This preserves the order of buffers that have been
made available to the device by the driver.

As VirtQueueElements are assigned a key value, the current sequence
number is incremented.

2.) Requests can be completed out-of-order.

While most devices complete requests in the same order that they were
enqueued, some devices don't (e.g. virtio-blk). The goal of this
out-of-order handling is to minimize the overhead for devices that
already process elements in-order, while still guaranteeing compliance
with the VIRTIO_F_IN_ORDER feature.

Below is the logic behind handling completed requests (which may or may
not be in-order).

3.) Does the incoming used VirtQueueElement preserve the correct order?

In other words, is the sequence number (key) assigned to the
VirtQueueElement the expected number that would preserve the original
order?

3a.)
If it does... immediately push the used element onto the used ring.
Then increment the next expected sequence number and check whether any
previously stashed out-of-order VirtQueueElements in the hash table
have a key that matches this next expected sequence number.

For each VirtQueueElement found on the hash table with a matching key:
push the element on the used ring, remove the key-value pair from the
hash table, and then increment the next expected sequence number. Repeat
this process until we're unable to find an element with a matching key.

Note that if the device uses batching (e.g. virtio-net), then we skip
the virtqueue_flush call and let the device call it themselves.

3b.)
If it does not... stash the VirtQueueElement, along with relevant data,
as an InOrderVQElement in the hash table. The key used is the order_key
that was assigned when the VirtQueueElement was created.

Signed-off-by: Jonah Palmer 
---
 hw/virtio/virtio.c | 70 --
 include/hw/virtio/virtio.h |  8 +
 2 files changed, 76 insertions(+), 2 deletions(-)

diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index 40124545d6..40e4377f1e 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -992,12 +992,56 @@ void virtqueue_flush(VirtQueue *vq, unsigned int count)
 }
 }
 
+void virtqueue_order_element(VirtQueue *vq, const VirtQueueElement *elem,
+ unsigned int len, unsigned int idx,
+ unsigned int count)
+{
+InOrderVQElement *in_order_elem;
+
+if (elem->order_key == vq->current_order_idx) {
+/* Element is in-order, push to used ring */
+virtqueue_fill(vq, elem, len, idx);
+
+/* Batching? Don't flush */
+if (count) {
+virtqueue_flush(vq, count);
+}
+
+/* Increment next expected order, search for more in-order elements */
+while ((in_order_elem = g_hash_table_lookup(vq->in_order_ht,
+GUINT_TO_POINTER(++vq->current_order_idx))) != NULL) {
+/* Found in-order element, push to used ring */
+virtqueue_fill(vq, in_order_elem->elem, in_order_elem->len,
+   in_order_elem->idx);
+
+/* Batching? Don't flush */
+if (count) {
+virtqueue_flush(vq, in_order_elem->count);
+}
+
+/* Remove key-value pair from hash table */
+g_hash_table_remove(vq->in_order_ht,
+GUINT_TO_POINTER(vq->current_order_idx));
+}
+} else {
+/* Element is out-of-order, stash in hash table */
+in_order_elem = virtqueue_alloc_in_order_element(elem, len, idx,
+ count);
+g_hash_table_insert(vq->in_order_ht, GUINT_TO_POINTER(elem->order_key),
+in_order_elem);
+}
+}
+
 void virtqueue_push(VirtQueue *vq, const VirtQueueElement *elem,
 unsigned int len)
 {
 RCU_READ_LOCK_GUARD();
-virtqueue_fill(vq, elem, len, 0);
-virtqueue_flush(vq, 1);
+if (virtio_vdev_has_feature(vq->vdev, VIRTIO_F_IN_ORDER)) {
+virtqueue_order_element(vq, elem, len, 0, 1);
+} else {
+virtqueue_fill(vq, elem, len, 0);
+virtqueue_flush(vq, 1);
+}
 }
 
 /* Called within rcu_read_lock().  */
@@ -1478,6 +1522,18 @@ void virtqueue_map(VirtIODevice *vdev, VirtQueueElement *elem)

[PATCH-for-9.1 21/21] target/xtensa: Replace qemu_printf() by monitor_printf() in monitor

2024-03-21 Thread Philippe Mathieu-Daudé
Replace qemu_printf() by monitor_printf() / monitor_puts() in monitor.

Signed-off-by: Philippe Mathieu-Daudé 
---
 target/xtensa/mmu.h |   2 +-
 target/xtensa/monitor.c | 117 
 2 files changed, 61 insertions(+), 58 deletions(-)

diff --git a/target/xtensa/mmu.h b/target/xtensa/mmu.h
index 3e1d2c03ea..ef7504e16e 100644
--- a/target/xtensa/mmu.h
+++ b/target/xtensa/mmu.h
@@ -90,6 +90,6 @@ int xtensa_get_physical_addr(CPUXtensaState *env, bool update_tlb,
  unsigned *access);
 
 void xtensa_reset_mmu(CPUXtensaState *env);
-void xtensa_dump_mmu(CPUXtensaState *env);
+void xtensa_dump_mmu(Monitor *mon, CPUXtensaState *env);
 
 #endif
diff --git a/target/xtensa/monitor.c b/target/xtensa/monitor.c
index 9ba068d624..1c3dc85ea1 100644
--- a/target/xtensa/monitor.c
+++ b/target/xtensa/monitor.c
@@ -22,7 +22,6 @@
  * THE SOFTWARE.
  */
 #include "qemu/osdep.h"
-#include "qemu/qemu-print.h"
 #include "qemu/units.h"
 #include "monitor/monitor.h"
 #include "monitor/hmp-target.h"
@@ -31,7 +30,7 @@
 #include "mmu.h"
 
 
-static void dump_tlb(CPUXtensaState *env, bool dtlb)
+static void dump_tlb(Monitor *mon, CPUXtensaState *env, bool dtlb)
 {
 unsigned wi, ei;
 const xtensa_tlb *conf =
@@ -40,7 +39,7 @@ static void dump_tlb(CPUXtensaState *env, bool dtlb)
 xtensa_option_enabled(env->config, XTENSA_OPTION_MMU) ?
 mmu_attr_to_access : region_attr_to_access;
 
-qemu_printf("%s:\n", dtlb ? "DTLB" : "IBLB");
+monitor_puts(mon, dtlb ? "DTLB\n" : "IBLB\n");
 for (wi = 0; wi < conf->nways; ++wi) {
 uint32_t sz = ~xtensa_tlb_get_addr_mask(env, dtlb, wi) + 1;
 const char *sz_text;
@@ -71,35 +70,39 @@ static void dump_tlb(CPUXtensaState *env, bool dtlb)
 
 if (print_header) {
 print_header = false;
-qemu_printf("Way %u (%d %s)\n", wi, sz, sz_text);
-qemu_printf("\tVaddr   Paddr   ASID  Attr RWX Cache\n"
-"\t--  --     --- ---\n");
+monitor_printf(mon,
+   "Way %u (%d %s)\n", wi, sz, sz_text);
+monitor_puts(mon,
+ "\tVaddr   Paddr   ASID  Attr RWX Cache\n"
+ "\t--  --     --- ---\n");
 }
-qemu_printf("\t0x%08x  0x%08x  0x%02x  0x%02x %c%c%c %s\n",
-entry->vaddr,
-entry->paddr,
-entry->asid,
-entry->attr,
-(access & PAGE_READ) ? 'R' : '-',
-(access & PAGE_WRITE) ? 'W' : '-',
-(access & PAGE_EXEC) ? 'X' : '-',
-cache_text[cache_idx] ?
-cache_text[cache_idx] : "Invalid");
+monitor_printf(mon,
+   "\t0x%08x  0x%08x  0x%02x  0x%02x %c%c%c %s\n",
+   entry->vaddr,
+   entry->paddr,
+   entry->asid,
+   entry->attr,
+   (access & PAGE_READ) ? 'R' : '-',
+   (access & PAGE_WRITE) ? 'W' : '-',
+   (access & PAGE_EXEC) ? 'X' : '-',
+   cache_text[cache_idx] ?
+   cache_text[cache_idx] : "Invalid");
 }
 }
 }
 }
 
-static void dump_mpu(CPUXtensaState *env, const char *map_desc,
+static void dump_mpu(Monitor *mon, CPUXtensaState *env, const char *map_desc,
  const xtensa_mpu_entry *entry, unsigned n)
 {
 unsigned i;
 
-qemu_printf("%s map:\n", map_desc);
-qemu_printf("\t%s  Vaddr   AttrRing0  Ring1  System Type CPU cache\n"
-"\t%s  --  --  -  -  -  -\n",
-env ? "En" : "  ",
-env ? "--" : "  ");
+monitor_printf(mon, "%s map:\n", map_desc);
+monitor_printf(mon,
+   "\t%s  Vaddr   AttrRing0  Ring1  System Type CPU cache\n"
+   "\t%s  --  --  -  -  -  -\n",
+   env ? "En" : "  ",
+   env ? "--" : "  ");
 
 for (i = 0; i < n; ++i) {
 uint32_t attr = entry[i].attr;
@@ -108,64 +111,64 @@ static void dump_mpu(CPUXtensaState *env, const char *map_desc,
 unsigned type = mpu_attr_to_type(attr);
 char cpu_cache = (type & XTENSA_MPU_TYPE_CPU_CACHE) ? '-' : ' ';
 
-qemu_printf("\t %c  0x%08x  0x%08x   %c%c%c%c%c%c   ",
-env ?
-((env->sregs[MPUENB] & (1u << i)) ? '+' : '-') : ' ',
- 

[PATCH-for-9.1 16/21] target/sparc: Replace qemu_printf() by monitor_printf() in monitor

2024-03-21 Thread Philippe Mathieu-Daudé
Replace qemu_printf() by monitor_printf() in monitor.c.
Rename dump_mmu() as sparc_dump_mmu().

Signed-off-by: Philippe Mathieu-Daudé 
---
 target/sparc/cpu.h |   2 +-
 target/sparc/ldst_helper.c |  18 +++
 target/sparc/mmu_helper.c  | 102 ++---
 target/sparc/monitor.c |   2 +-
 4 files changed, 62 insertions(+), 62 deletions(-)

diff --git a/target/sparc/cpu.h b/target/sparc/cpu.h
index f3cdd17c62..55589c8ae4 100644
--- a/target/sparc/cpu.h
+++ b/target/sparc/cpu.h
@@ -601,7 +601,7 @@ bool sparc_cpu_tlb_fill(CPUState *cs, vaddr address, int 
size,
 MMUAccessType access_type, int mmu_idx,
 bool probe, uintptr_t retaddr);
 target_ulong mmu_probe(CPUSPARCState *env, target_ulong address, int mmulev);
-void dump_mmu(CPUSPARCState *env);
+void sparc_dump_mmu(Monitor *mon, CPUSPARCState *env);
 
 #if !defined(TARGET_SPARC64) && !defined(CONFIG_USER_ONLY)
 int sparc_cpu_memory_rw_debug(CPUState *cpu, vaddr addr,
diff --git a/target/sparc/ldst_helper.c b/target/sparc/ldst_helper.c
index 064390d1d4..44f8b2bb7a 100644
--- a/target/sparc/ldst_helper.c
+++ b/target/sparc/ldst_helper.c
@@ -195,7 +195,7 @@ static void demap_tlb(SparcTLBEntry *tlb, target_ulong demap_addr,
replace_tlb_entry(&tlb[i], 0, 0, env1);
 #ifdef DEBUG_MMU
 DPRINTF_MMU("%s demap invalidated entry [%02u]\n", strmmu, i);
-dump_mmu(env1);
+sparc_dump_mmu(env1);
 #endif
 }
 }
@@ -257,7 +257,7 @@ static void replace_tlb_1bit_lru(SparcTLBEntry *tlb,
replace_tlb_entry(&tlb[i], tlb_tag, tlb_tte, env1);
 #ifdef DEBUG_MMU
 DPRINTF_MMU("%s lru replaced invalid entry [%i]\n", strmmu, i);
-dump_mmu(env1);
+sparc_dump_mmu(env1);
 #endif
 return;
 }
@@ -276,7 +276,7 @@ static void replace_tlb_1bit_lru(SparcTLBEntry *tlb,
 #ifdef DEBUG_MMU
 DPRINTF_MMU("%s lru replaced unlocked %s entry [%i]\n",
 strmmu, (replace_used ? "used" : "unused"), i);
-dump_mmu(env1);
+sparc_dump_mmu(env1);
 #endif
 return;
 }
@@ -995,7 +995,7 @@ void helper_st_asi(CPUSPARCState *env, target_ulong addr, uint64_t val,
 break;
 }
 #ifdef DEBUG_MMU
-dump_mmu(env);
+sparc_dump_mmu(env);
 #endif
 }
 break;
@@ -1050,7 +1050,7 @@ void helper_st_asi(CPUSPARCState *env, target_ulong addr, uint64_t val,
 reg, oldreg, env->mmuregs[reg]);
 }
 #ifdef DEBUG_MMU
-dump_mmu(env);
+sparc_dump_mmu(env);
 #endif
 }
 break;
@@ -1736,7 +1736,7 @@ void helper_st_asi(CPUSPARCState *env, target_ulong addr, target_ulong val,
 PRIx64 "\n", reg, oldreg, env->immu.mmuregs[reg]);
 }
 #ifdef DEBUG_MMU
-dump_mmu(env);
+sparc_dump_mmu(env);
 #endif
 return;
 }
@@ -1760,7 +1760,7 @@ void helper_st_asi(CPUSPARCState *env, target_ulong addr, target_ulong val,
 }
 #ifdef DEBUG_MMU
 DPRINTF_MMU("immu data access replaced entry [%i]\n", i);
-dump_mmu(env);
+sparc_dump_mmu(env);
 #endif
 return;
 }
@@ -1820,7 +1820,7 @@ void helper_st_asi(CPUSPARCState *env, target_ulong addr, target_ulong val,
 PRIx64 "\n", reg, oldreg, env->dmmu.mmuregs[reg]);
 }
 #ifdef DEBUG_MMU
-dump_mmu(env);
+sparc_dump_mmu(env);
 #endif
 return;
 }
@@ -1842,7 +1842,7 @@ void helper_st_asi(CPUSPARCState *env, target_ulong addr, target_ulong val,
 }
 #ifdef DEBUG_MMU
 DPRINTF_MMU("dmmu data access replaced entry [%i]\n", i);
-dump_mmu(env);
+sparc_dump_mmu(env);
 #endif
 return;
 }
diff --git a/target/sparc/mmu_helper.c b/target/sparc/mmu_helper.c
index ad1591d9fd..f325c9a4cc 100644
--- a/target/sparc/mmu_helper.c
+++ b/target/sparc/mmu_helper.c
@@ -21,7 +21,7 @@
 #include "qemu/log.h"
 #include "cpu.h"
 #include "exec/exec-all.h"
-#include "qemu/qemu-print.h"
+#include "monitor/monitor.h"
 #include "trace.h"
 
 /* Sparc MMU emulation */
@@ -344,7 +344,7 @@ target_ulong mmu_probe(CPUSPARCState *env, target_ulong address, int mmulev)
 return 0;
 }
 
-void dump_mmu(CPUSPARCState *env)
+void sparc_dump_mmu(Monitor *mon, CPUSPARCState *env)
 {
 CPUState *cs = env_cpu(env);
 target_ulong va, va1, va2;
@@ -352,29 +352,29 @@ void dump_mmu(CPUSPARCState *env)
 hwaddr pa;
 uint32_t pde;
 
-qemu_printf("Root ptr: " HWADDR_FMT_plx ", ctx: %d\n",
-(hwaddr)env->mmuregs[1] << 4, env->mmuregs[2]);
+monitor_printf(mon, "Root ptr: " HWADDR_FMT_plx ", ctx: %d\n",
+   (hwaddr)env->mmuregs[1] << 4, env->mmuregs[2]);
 for (n = 0, va = 0; n < 

[PATCH-for-9.1 14/21] target/sh4: Extract sh4_dump_mmu() from hmp_info_tlb()

2024-03-21 Thread Philippe Mathieu-Daudé
Extract sh4_dump_mmu() from hmp_info_tlb(), replacing
monitor_printf(FIXED_STRING_WITHOUT_FORMAT) by monitor_puts().

Signed-off-by: Philippe Mathieu-Daudé 
---
 target/sh4/cpu.h |  2 ++
 target/sh4/monitor.c | 22 +++---
 2 files changed, 17 insertions(+), 7 deletions(-)

diff --git a/target/sh4/cpu.h b/target/sh4/cpu.h
index 9211da6bde..4e2e9ffd66 100644
--- a/target/sh4/cpu.h
+++ b/target/sh4/cpu.h
@@ -385,4 +385,6 @@ static inline void cpu_get_tb_cpu_state(CPUSH4State *env, vaddr *pc,
 #endif
 }
 
+void sh4_dump_mmu(Monitor *mon, CPUSH4State *env);
+
 #endif /* SH4_CPU_H */
diff --git a/target/sh4/monitor.c b/target/sh4/monitor.c
index 2da6a5426e..1befb42b07 100644
--- a/target/sh4/monitor.c
+++ b/target/sh4/monitor.c
@@ -39,20 +39,28 @@ static void print_tlb(Monitor *mon, int idx, tlb_t *tlb)
tlb->d, tlb->wt);
 }
 
+void sh4_dump_mmu(Monitor *mon, CPUSH4State *env)
+{
+int i;
+
+monitor_puts(mon, "ITLB:\n");
+for (i = 0 ; i < ITLB_SIZE ; i++) {
+print_tlb (mon, i, &env->itlb[i]);
+}
+monitor_puts(mon, "UTLB:\n");
+for (i = 0 ; i < UTLB_SIZE ; i++) {
+print_tlb (mon, i, &env->utlb[i]);
+}
+}
+
 void hmp_info_tlb(Monitor *mon, const QDict *qdict)
 {
 CPUArchState *env = mon_get_cpu_env(mon);
-int i;
 
 if (!env) {
 monitor_printf(mon, "No CPU available\n");
 return;
 }
 
-monitor_printf (mon, "ITLB:\n");
-for (i = 0 ; i < ITLB_SIZE ; i++)
-print_tlb (mon, i, &env->itlb[i]);
-monitor_printf (mon, "UTLB:\n");
-for (i = 0 ; i < UTLB_SIZE ; i++)
-print_tlb (mon, i, &env->utlb[i]);
+sh4_dump_mmu(mon, env);
 }
-- 
2.41.0




[PATCH-for-9.1 18/21] target/xtensa: Extract MMU API to new mmu.c/mmu.h files

2024-03-21 Thread Philippe Mathieu-Daudé
Extract the MMU API and expose it via "mmu.h" so we can
reuse the methods in target/xtensa/ files.

Signed-off-by: Philippe Mathieu-Daudé 
---
 target/xtensa/cpu.h|  32 +-
 target/xtensa/mmu.h|  95 
 target/xtensa/mmu.c| 889 
 target/xtensa/mmu_helper.c | 892 +
 target/xtensa/meson.build  |   1 +
 5 files changed, 991 insertions(+), 918 deletions(-)
 create mode 100644 target/xtensa/mmu.h
 create mode 100644 target/xtensa/mmu.c

diff --git a/target/xtensa/cpu.h b/target/xtensa/cpu.h
index b2cfc78e9d..b67ee987f3 100644
--- a/target/xtensa/cpu.h
+++ b/target/xtensa/cpu.h
@@ -34,6 +34,10 @@
 #include "hw/clock.h"
 #include "xtensa-isa.h"
 
+typedef struct CPUArchState CPUXtensaState;
+
+#include "mmu.h"
+
 /* Xtensa processors have a weak memory model */
 #define TCG_GUEST_DEFAULT_MO  (0)
 
@@ -309,28 +313,6 @@ typedef enum {
 INTTYPE_MAX
 } interrupt_type;
 
-typedef struct CPUArchState CPUXtensaState;
-
-typedef struct xtensa_tlb_entry {
-uint32_t vaddr;
-uint32_t paddr;
-uint8_t asid;
-uint8_t attr;
-bool variable;
-} xtensa_tlb_entry;
-
-typedef struct xtensa_tlb {
-unsigned nways;
-const unsigned way_size[10];
-bool varway56;
-unsigned nrefillentries;
-} xtensa_tlb;
-
-typedef struct xtensa_mpu_entry {
-uint32_t vaddr;
-uint32_t attr;
-} xtensa_mpu_entry;
-
 typedef struct XtensaGdbReg {
 int targno;
 unsigned flags;
@@ -689,12 +671,6 @@ static inline int xtensa_get_cring(const CPUXtensaState *env)
 }
 
 #ifndef CONFIG_USER_ONLY
-int xtensa_get_physical_addr(CPUXtensaState *env, bool update_tlb,
-uint32_t vaddr, int is_write, int mmu_idx,
-uint32_t *paddr, uint32_t *page_size, unsigned *access);
-void xtensa_reset_mmu(CPUXtensaState *env);
-void xtensa_dump_mmu(CPUXtensaState *env);
-
 static inline MemoryRegion *xtensa_get_er_region(CPUXtensaState *env)
 {
 return env->system_er;
diff --git a/target/xtensa/mmu.h b/target/xtensa/mmu.h
new file mode 100644
index 00..3e1d2c03ea
--- /dev/null
+++ b/target/xtensa/mmu.h
@@ -0,0 +1,95 @@
+/*
+ * Xtensa MMU/MPU helpers
+ *
+ * SPDX-FileCopyrightText: 2011 - 2019, Max Filippov, Open Source and Linux Lab.
+ * SPDX-License-Identifier: BSD-3-Clause
+ */
+
+#ifndef TARGET_XTENSA_MMU_H
+#define TARGET_XTENSA_MMU_H
+
+#include "cpu.h"
+
+typedef struct xtensa_tlb_entry {
+uint32_t vaddr;
+uint32_t paddr;
+uint8_t asid;
+uint8_t attr;
+bool variable;
+} xtensa_tlb_entry;
+
+typedef struct xtensa_tlb {
+unsigned nways;
+const unsigned way_size[10];
+bool varway56;
+unsigned nrefillentries;
+} xtensa_tlb;
+
+typedef struct xtensa_mpu_entry {
+uint32_t vaddr;
+uint32_t attr;
+} xtensa_mpu_entry;
+
+#define XTENSA_MPU_SEGMENT_MASK 0x001f
+#define XTENSA_MPU_ACC_RIGHTS_MASK 0x0f00
+#define XTENSA_MPU_ACC_RIGHTS_SHIFT 8
+#define XTENSA_MPU_MEM_TYPE_MASK 0x001ff000
+#define XTENSA_MPU_MEM_TYPE_SHIFT 12
+#define XTENSA_MPU_ATTR_MASK 0x001fff00
+
+#define XTENSA_MPU_PROBE_B 0x4000
+#define XTENSA_MPU_PROBE_V 0x8000
+
+#define XTENSA_MPU_SYSTEM_TYPE_DEVICE 0x0001
+#define XTENSA_MPU_SYSTEM_TYPE_NC 0x0002
+#define XTENSA_MPU_SYSTEM_TYPE_C  0x0003
+#define XTENSA_MPU_SYSTEM_TYPE_MASK   0x0003
+
+#define XTENSA_MPU_TYPE_SYS_C 0x0010
+#define XTENSA_MPU_TYPE_SYS_W 0x0020
+#define XTENSA_MPU_TYPE_SYS_R 0x0040
+#define XTENSA_MPU_TYPE_CPU_C 0x0100
+#define XTENSA_MPU_TYPE_CPU_W 0x0200
+#define XTENSA_MPU_TYPE_CPU_R 0x0400
+#define XTENSA_MPU_TYPE_CPU_CACHE 0x0800
+#define XTENSA_MPU_TYPE_B 0x1000
+#define XTENSA_MPU_TYPE_INT   0x2000
+
+unsigned mmu_attr_to_access(uint32_t attr);
+unsigned mpu_attr_to_access(uint32_t attr, unsigned ring);
+unsigned mpu_attr_to_cpu_cache(uint32_t attr);
+unsigned mpu_attr_to_type(uint32_t attr);
+
+unsigned region_attr_to_access(uint32_t attr);
+unsigned cacheattr_attr_to_access(uint32_t attr);
+
+xtensa_tlb_entry *xtensa_get_tlb_entry(CPUXtensaState *env,
+   uint32_t v, bool dtlb, uint32_t *pwi);
+xtensa_tlb_entry *xtensa_tlb_get_entry(CPUXtensaState *env, bool dtlb,
+   unsigned wi, unsigned ei);
+void xtensa_tlb_set_entry(CPUXtensaState *env, bool dtlb,
+  unsigned wi, unsigned ei,
+  uint32_t vpn, uint32_t pte);
+
+uint32_t xtensa_tlb_get_addr_mask(const CPUXtensaState *env, bool dtlb,
+  uint32_t way);
+uint32_t xtensa_get_vpn_mask(const CPUXtensaState *env, bool dtlb,
+ uint32_t way);
+
+bool xtensa_split_tlb_entry_spec(CPUXtensaState *env, uint32_t v, bool dtlb,
+ uint32_t *vpn, uint32_t *wi, uint32_t *ei);
+
+int xtensa_tlb_lookup(const CPUXtensaState *env, uint32_t addr, bool dtlb,
+  uint32_t *pwi, uint32_t *pei, uint8_t 

[PATCH-for-9.1 11/21] target/nios2: Move monitor commands to monitor.c

2024-03-21 Thread Philippe Mathieu-Daudé
Move 'info tlb' monitor commands to monitor.c,
rename dump_mmu() as nios2_info_mmu().

Signed-off-by: Philippe Mathieu-Daudé 
---
 target/nios2/cpu.h |  2 +-
 target/nios2/mmu.c | 27 ---
 target/nios2/monitor.c | 28 +++-
 3 files changed, 28 insertions(+), 29 deletions(-)

diff --git a/target/nios2/cpu.h b/target/nios2/cpu.h
index 4164a3432e..27e835cf40 100644
--- a/target/nios2/cpu.h
+++ b/target/nios2/cpu.h
@@ -250,7 +250,7 @@ static inline void nios2_update_crs(CPUNios2State *env)
 
 void nios2_tcg_init(void);
 void nios2_cpu_do_interrupt(CPUState *cs);
-void dump_mmu(CPUNios2State *env);
+void nios2_info_mmu(Monitor *mon, CPUNios2State *env);
 void nios2_cpu_dump_state(CPUState *cpu, FILE *f, int flags);
 G_NORETURN void nios2_cpu_do_unaligned_access(CPUState *cpu, vaddr addr,
  MMUAccessType access_type, int mmu_idx,
diff --git a/target/nios2/mmu.c b/target/nios2/mmu.c
index 272bd9fc6a..278eba1b0a 100644
--- a/target/nios2/mmu.c
+++ b/target/nios2/mmu.c
@@ -19,7 +19,6 @@
  */
 
 #include "qemu/osdep.h"
-#include "qemu/qemu-print.h"
 #include "cpu.h"
 #include "exec/exec-all.h"
 #include "mmu.h"
@@ -187,29 +186,3 @@ void nios2_mmu_init(CPUNios2State *env)
 mmu->tlb_entry_mask = (cpu->tlb_num_entries / cpu->tlb_num_ways) - 1;
 mmu->tlb = g_new0(Nios2TLBEntry, cpu->tlb_num_entries);
 }
-
-void dump_mmu(CPUNios2State *env)
-{
-Nios2CPU *cpu = env_archcpu(env);
-int i;
-
-qemu_printf("MMU: ways %d, entries %d, pid bits %d\n",
-cpu->tlb_num_ways, cpu->tlb_num_entries,
-cpu->pid_num_bits);
-
-for (i = 0; i < cpu->tlb_num_entries; i++) {
-Nios2TLBEntry *entry = &env->mmu.tlb[i];
-qemu_printf("TLB[%d] = %08X %08X %c VPN %05X "
-"PID %02X %c PFN %05X %c%c%c%c\n",
-i, entry->tag, entry->data,
-(entry->tag & (1 << 10)) ? 'V' : '-',
-entry->tag >> 12,
-entry->tag & ((1 << cpu->pid_num_bits) - 1),
-(entry->tag & (1 << 11)) ? 'G' : '-',
-FIELD_EX32(entry->data, CR_TLBACC, PFN),
-(entry->data & CR_TLBACC_C) ? 'C' : '-',
-(entry->data & CR_TLBACC_R) ? 'R' : '-',
-(entry->data & CR_TLBACC_W) ? 'W' : '-',
-(entry->data & CR_TLBACC_X) ? 'X' : '-');
-}
-}
diff --git a/target/nios2/monitor.c b/target/nios2/monitor.c
index 0152dec3fa..c6043769e4 100644
--- a/target/nios2/monitor.c
+++ b/target/nios2/monitor.c
@@ -22,14 +22,40 @@
  * THE SOFTWARE.
  */
 #include "qemu/osdep.h"
+#include "qemu/qemu-print.h"
 #include "cpu.h"
 #include "monitor/monitor.h"
 #include "monitor/hmp-target.h"
 #include "monitor/hmp.h"
 
+void nios2_info_mmu(Monitor *mon, CPUNios2State *env)
+{
+Nios2CPU *cpu = env_archcpu(env);
+
+qemu_printf("MMU: ways %d, entries %d, pid bits %d\n",
+cpu->tlb_num_ways, cpu->tlb_num_entries,
+cpu->pid_num_bits);
+
+for (int i = 0; i < cpu->tlb_num_entries; i++) {
+Nios2TLBEntry *entry = &env->mmu.tlb[i];
+qemu_printf("TLB[%d] = %08X %08X %c VPN %05X "
+"PID %02X %c PFN %05X %c%c%c%c\n",
+i, entry->tag, entry->data,
+(entry->tag & (1 << 10)) ? 'V' : '-',
+entry->tag >> 12,
+entry->tag & ((1 << cpu->pid_num_bits) - 1),
+(entry->tag & (1 << 11)) ? 'G' : '-',
+FIELD_EX32(entry->data, CR_TLBACC, PFN),
+(entry->data & CR_TLBACC_C) ? 'C' : '-',
+(entry->data & CR_TLBACC_R) ? 'R' : '-',
+(entry->data & CR_TLBACC_W) ? 'W' : '-',
+(entry->data & CR_TLBACC_X) ? 'X' : '-');
+}
+}
+
 void hmp_info_tlb(Monitor *mon, const QDict *qdict)
 {
 CPUArchState *env1 = mon_get_cpu_env(mon);
 
-dump_mmu(env1);
+nios2_info_mmu(mon, env1);
 }
-- 
2.41.0



