On Wed, 15 Oct 2025 at 11:58, Salil Mehta <[email protected]> wrote:
>
> Hi Peter,
>
> > From: [email protected] <qemu-
> > [email protected]> On Behalf Of Peter
> > Maydell
> > Sent: Tuesday, October 14, 2025 11:25 AM
> > To: [email protected]
> > Cc: Salil Mehta <[email protected]>; Marc Zyngier <[email protected]>
> > Subject: [PATCH] hw/intc/arm_gicv3_kvm: Avoid reading ICC_CTLR_EL1 from
> > kernel in cpuif reset
> >
> > Currently in arm_gicv3_icc_reset() we read the kernel's value of
> > ICC_CTLR_EL1 as part of resetting the CPU interface. This mostly works, but
> > we're actually breaking an assumption the kernel makes that userspace only
> > accesses the in-kernel GIC data when the VM is totally paused, which may
> > not be the case if a single vCPU is being reset. The effect is that it's
> > possible
> > that the read attempt returns EBUSY.
> >
> > Avoid this by reading the kernel's value of the reset ICC_CTLR_EL1 once in
> > device realize. This brings ICC_CTLR_EL1 into line with the other cpuif
> > registers, where we assume we know what the kernel is resetting them to
> > and just update QEMU's data structures in arm_gicv3_icc_reset().
> >
> > Signed-off-by: Peter Maydell <[email protected]>
> > ---
> > I've only tested this fairly lightly, but it seems to work.
> > Salil, does this fix the EBUSY issues you were seeing ?
> >
> > include/hw/intc/arm_gicv3_common.h | 3 ++
> > hw/intc/arm_gicv3_kvm.c | 49 +++++++++++++++++++++---------
> > 2 files changed, 38 insertions(+), 14 deletions(-)
> >
> [...]
> > +
> > + /*
> > + * Now we can read the kernel's initial value of ICC_CTLR_EL1, which
> > + * we will need if a CPU interface is reset. If the kernel is ancient
> > + * and doesn't support writing the GIC state then we don't need to
> > + * care what reset does to QEMU's data structures.
> > + */
> > + if (!s->migration_blocker) {
> > + for (i = 0; i < s->num_cpu; i++) {
> > + GICv3CPUState *c = &s->cpu[i];
> > +
> > + kvm_device_access(s->dev_fd,
> > KVM_DEV_ARM_VGIC_GRP_CPU_SYSREGS,
> > + KVM_VGIC_ATTR(ICC_CTLR_EL1, c->gicr_typer),
> > + &c->kvm_reset_icc_ctlr_el1, false,
> > &error_abort);
> > + }
> > + }
> > }
>
>
> Above changes assume that the driver's configured value of the ICC_CTLR_EL1
> system register is the same as the default value. I've verified that this
> currently the case. However, it safe to assume that this will remain true
> in the future as well?
I don't understand what you mean here. We read the kernel's ICC_CTLR_EL1
at VM startup, when we know it will be the reset value, because we haven't
run any VCPUs yet.
> We can instead cache the value once during the first cpu_reset() for each
> CPU interface and reuse that cached value for all subsequent cpu_reset()
> operations. This would simplify the implementation while retaining
> major goals.
No, we should read the value at realize time, as this patch does.
Trying to do it at cpu_reset runs into the same problem that we
have now that the kernel doesn't expect us to be reading the
register when another vcpu might be running.
> Calling pause_vcpus_all() at VM initialization time--during the first
> cpu_reset()--should not cause any issues, as all secondary vCPUs are idle
> (i.e. their PCs are not initialized and kernel has not yet loaded). Pausing
> such vCPUs at this stage should not be a disruptive action.
We should not need to run pause_vcpus_all() at all here. CPU reset
should be fine done for one vcpu with the others running.
> Moreover, if this stage poses a problem for a KVM device IOCTL access, it
> similarly affect other components like GICv3 ITS, which also uses KVM
> device IOCTL during GICv3 reset hold. At this stage all the vCPUs should still
> be in a pre-boot or idle state
> static void kvm_arm_its_reset_hold(Object *obj, ResetType type)
> {
> [...]
>
> if (kvm_device_check_attr(s->dev_fd, KVM_DEV_ARM_VGIC_GRP_CTRL,
> KVM_DEV_ARM_ITS_CTRL_RESET)) {
> kvm_device_access(s->dev_fd, KVM_DEV_ARM_VGIC_GRP_CTRL,
> KVM_DEV_ARM_ITS_CTRL_RESET, NULL, true,
> &error_abort);
> return;
> }
> [...]
>
> }
>
> We are not even pausing vCPUs here, yet we expect above KVM device IOCTL
> to succeed on the first attempt. The operation also ends up acquiring the
> mutex for all vCPUs within the KVM.
ITS reset should only happen (a) at start of the run, before any
vcpus run or (b) as part of a full system reset. We should check that
the vcpus are paused when we're doing system reset, but I would expect
that to be the case because it would be likely to be buggy if not.
-- PMM