Hi Gavin,
On Wed, Oct 22, 2025 at 6:18 PM Salil Mehta <[email protected]> wrote:
>
> Hi Gavin,
>
> On Wed, Oct 22, 2025 at 10:37 AM Gavin Shan <[email protected]> wrote:
> >
> > Hi Salil,
> >
> > On 10/1/25 11:01 AM, [email protected] wrote:
> > > From: Salil Mehta <[email protected]>
> > >
> > > ARM CPU architecture does not allow CPUs to be plugged after system has
> > > initialized. This is a constraint. Hence, the Kernel must know all the
> > > CPUs being booted during its initialization. This applies to the Guest
> > > Kernel as well and therefore, the number of KVM vCPU descriptors in the
> > > host must be fixed at VM initialization time.
> > >
> > > Also, the GIC must know all the CPUs it is connected to during its
> > > initialization, and this cannot change afterward. This must also be
> > > ensured during the initialization of the VGIC in KVM. This is necessary
> > > because:
> > >
> > > 1. The association between GICR and MPIDR must be fixed at VM
> > >    initialization time. This is represented by the register
> > >    `GICR_TYPER(mp_affinity, proc_num)`.
> > > 2. Memory regions associated with GICR, etc., cannot be changed (added,
> > >    deleted, or modified) after the VM has been initialized. This is not
> > >    an ARM architectural constraint but rather invites a difficult and
> > >    messy change in VGIC data structures.
> > >
> > > To enable a hot-add–like model while preserving these constraints, the
> > > virt machine may enumerate more CPUs than are enabled at boot using
> > > `-smp disabledcpus=N`. Such CPUs are present but start offline (i.e.,
> > > administratively disabled at init). The topology remains fixed at VM
> > > creation time; only the online/offline status may change later.
> > >
> > > Administratively disabled vCPUs are not realized in QOM until first
> > > enabled, avoiding creation of unnecessary vCPU threads at boot. On large
> > > systems, this reduces startup time proportionally to the number of
> > > disabled vCPUs. Once a QOM vCPU is realized and its thread created,
> > > subsequent enable/disable actions do not unrealize it. This behaviour
> > > was adopted following review feedback and differs from earlier RFC
> > > versions.
> > >
> > > Co-developed-by: Keqian Zhu <[email protected]>
> > > Signed-off-by: Keqian Zhu <[email protected]>
> > > Signed-off-by: Salil Mehta <[email protected]>
> > > ---
> > > accel/kvm/kvm-all.c | 2 +-
> > > hw/arm/virt.c | 77 ++++++++++++++++++++++++++++++++++++++----
> > > hw/core/qdev.c | 17 ++++++++++
> > > include/hw/qdev-core.h | 19 +++++++++++
> > > include/system/kvm.h | 8 +++++
> > > target/arm/cpu.c | 2 ++
> > > target/arm/kvm.c | 40 +++++++++++++++++++++-
> > > target/arm/kvm_arm.h | 11 ++++++
> > > 8 files changed, 168 insertions(+), 8 deletions(-)
> > >
[...]
> > > +void kvm_arm_create_host_vcpu(ARMCPU *cpu)
> > > +{
> > > +    CPUState *cs = CPU(cpu);
> > > +    unsigned long vcpu_id = cs->cpu_index;
> > > +    int ret;
> > > +
> > > +    ret = kvm_create_vcpu(cs);
> > > +    if (ret < 0) {
> > > +        error_report("Failed to create host vcpu %ld", vcpu_id);
> > > +        abort();
> > > +    }
> > > +
> > > +    /*
> > > +     * Initialize the vCPU in the host. This will reset the sys regs
> > > +     * for this vCPU and related registers like MPIDR_EL1 etc. also
> > > +     * get programmed during this call to host. These are referenced
> > > +     * later while setting device attributes of the GICR during GICv3
> > > +     * reset.
> > > +     */
> > > +    ret = kvm_arch_init_vcpu(cs);
> > > +    if (ret < 0) {
> > > +        error_report("Failed to initialize host vcpu %ld", vcpu_id);
> > > +        abort();
> > > +    }
> > > +
> > > +    /*
> > > +     * park the created vCPU. shall be used during kvm_get_vcpu() when
> > > +     * threads are created during realization of ARM vCPUs.
> > > +     */
> > > +    kvm_park_vcpu(cs);
> > > +}
> > > +
> > > +
> >
> > I don't think we're able to simply call kvm_arch_init_vcpu() in the
> > lazily realized path. Otherwise, it can trigger a crash dump on my NVIDIA
> > Grace Hopper machine, where SVE is supported by default.
>
> Thanks for reporting this. That is not true. As long as we initialize KVM
> correctly and finalize the features like SVE we should be fine. In fact,
> this is precisely what we are doing right now.
>
> To understand the crash, I need a bit more info.
>
> 1. Is it happening because KVM_ARM_VCPU_INIT is failing? If yes, then can
>    you check within KVM whether it is because:
>    a. the features specified by QEMU do not match the defaults within KVM
>       (hint: check kvm_vcpu_init_check_features()), or
>    b. KVM is complaining about an init feature change
>       (kvm_vcpu_init_changed())?
> 2. Or is it happening while setting the vector length or finalizing the
>    features?
>
> int kvm_arch_init_vcpu(CPUState *cs)
> {
>     [...]
>     /* Do KVM_ARM_VCPU_INIT ioctl */
>     ret = kvm_arm_vcpu_init(cpu);  ---->[1]
>     if (ret) {
>         return ret;
>     }
>     if (cpu_isar_feature(aa64_sve, cpu)) {
>         ret = kvm_arm_sve_set_vls(cpu);  ---->[2]
>         if (ret) {
>             return ret;
>         }
>         ret = kvm_arm_vcpu_finalize(cpu, KVM_ARM_VCPU_SVE);  ---->[3]
>         if (ret) {
>             return ret;
>         }
>     }
>     [...]
> }
>
> I think it is happening because the vector length is left uninitialized.
> That initialization happens in the context of
> arm_cpu_finalize_features(), which I forgot to call before the KVM
> finalize.
>
> >
> > kvm_arch_init_vcpu() is supposed to be called in the realization path in
> > the current implementation (without this series) because the parameters
> > (features) to KVM_ARM_VCPU_INIT are populated at vCPU realization time.
>
> Not necessarily. It is just meant to initialize KVM. If we take care of
> the KVM requirements in the same way the realize path does, we should be
> fine. Can you try adding the patch below to your code and test whether it
> works?
>
> diff --git a/target/arm/kvm.c b/target/arm/kvm.c
> index c4b68a0b17..1091593478 100644
> --- a/target/arm/kvm.c
> +++ b/target/arm/kvm.c
> @@ -1068,6 +1068,9 @@ void kvm_arm_create_host_vcpu(ARMCPU *cpu)
>          abort();
>      }
>
> +    /* finalize the features like SVE, SME etc */
> +    arm_cpu_finalize_features(cpu, &error_abort);
> +
>      /*
>       * Initialize the vCPU in the host. This will reset the sys regs
>       * for this vCPU and related registers like MPIDR_EL1 etc. also
>
>
>
>
> >
> > $ home/gavin/sandbox/qemu.main/build/qemu-system-aarch64 \
> > --enable-kvm -machine virt,gic-version=3 -cpu host \
> > -smp cpus=4,disabledcpus=2 -m 1024M \
> > -kernel /home/gavin/sandbox/linux.guest/arch/arm64/boot/Image \
> > -initrd /home/gavin/sandbox/images/rootfs.cpio.xz -nographic
> > qemu-system-aarch64: Failed to initialize host vcpu 4
> > Aborted (core dumped)
> >
> > Backtrace
> > =========
> > (gdb) bt
> > #0 0x0000ffff9106bc80 in __pthread_kill_implementation () at
> > /lib64/libc.so.6
> > #1 0x0000ffff9101aa40 [PAC] in raise () at /lib64/libc.so.6
> > #2 0x0000ffff91005988 [PAC] in abort () at /lib64/libc.so.6
> > #3 0x0000aaaab1cc26b8 [PAC] in kvm_arm_create_host_vcpu
> > (cpu=0xaaaab9ab1bc0)
> > at ../target/arm/kvm.c:1081
> > #4 0x0000aaaab1cd0c94 in virt_setup_lazy_vcpu_realization
> > (cpuobj=0xaaaab9ab1bc0, vms=0xaaaab98870a0)
> > at ../hw/arm/virt.c:2483
> > #5 0x0000aaaab1cd180c in machvirt_init (machine=0xaaaab98870a0) at
> > ../hw/arm/virt.c:2777
> > #6 0x0000aaaab160f220 in machine_run_board_init
> > (machine=0xaaaab98870a0, mem_path=0x0, errp=0xfffffa86bdc8) at
> > ../hw/core/machine.c:1722
> > #7 0x0000aaaab1a25ef4 in qemu_init_board () at ../system/vl.c:2723
> > #8 0x0000aaaab1a2635c in qmp_x_exit_preconfig (errp=0xaaaab38a50f0
> > <error_fatal>)
> > at ../system/vl.c:2821
> > #9 0x0000aaaab1a28b08 in qemu_init (argc=15, argv=0xfffffa86c1f8) at
> > ../system/vl.c:3882
> > #10 0x0000aaaab221d9e4 in main (argc=15, argv=0xfffffa86c1f8) at
> > ../system/main.c:71
>
>
> Thank you for this. Please let me know if the above fix works and also
> the return values in case you encounter errors.
I've pushed the fix to the branch below for your convenience:
Branch: https://github.com/salil-mehta/qemu/commits/virt-cpuhp-armv8/rfc-v6.2
Fix:
https://github.com/salil-mehta/qemu/commit/1f1fbc0998ffb1fe26140df3c336bf2be2aa8669
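
For anyone following along, the ordering issue can be modelled outside
QEMU. The sketch below is only a toy and is not code from the series:
ToyCPU, toy_finalize_features(), toy_host_init() and
toy_create_host_vcpu() are made-up names, and the -EINVAL returned for an
empty vector-length set is an assumption standing in for whatever error
the host actually reported. It just shows why the feature finalization
has to run before the host-side init in the lazy path.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for ARMCPU; every name here is hypothetical, not QEMU's. */
typedef struct ToyCPU {
    bool has_sve;        /* CPU model advertises SVE */
    uint64_t sve_vq_map; /* enabled vector lengths; 0 until finalized */
    bool host_inited;    /* set once the host-side init has succeeded */
} ToyCPU;

/* Plays the role of arm_cpu_finalize_features(): choose vector lengths. */
static void toy_finalize_features(ToyCPU *cpu)
{
    assert(cpu != NULL);
    if (cpu->has_sve && cpu->sve_vq_map == 0) {
        cpu->sve_vq_map = 0x1; /* assumption: a single 128-bit length */
    }
}

/*
 * Plays the role of kvm_arch_init_vcpu(): with SVE present, an empty
 * vector-length set is rejected (assumed error code: -EINVAL).
 */
static int toy_host_init(ToyCPU *cpu)
{
    assert(cpu != NULL);
    if (cpu->has_sve && cpu->sve_vq_map == 0) {
        return -EINVAL;
    }
    cpu->host_inited = true;
    return 0;
}

/* Lazy path with the fix applied: finalize first, then initialize. */
static int toy_create_host_vcpu(ToyCPU *cpu)
{
    toy_finalize_features(cpu);
    return toy_host_init(cpu);
}
```

Calling toy_host_init() directly on an SVE-capable ToyCPU fails in this
model, while toy_create_host_vcpu() succeeds because the vector-length
set is populated first. A CPU without SVE takes neither branch and
initializes either way.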
Thanks
Salil.