Hi Salil,

On 10/23/25 10:35 AM, Salil Mehta wrote:
On Thu, Oct 23, 2025 at 12:14 AM Gavin Shan <[email protected]> wrote:
On 10/23/25 4:50 AM, Salil Mehta wrote:
On Wed, Oct 22, 2025 at 6:18 PM Salil Mehta <[email protected]> wrote:
On Wed, Oct 22, 2025 at 10:37 AM Gavin Shan <[email protected]> wrote:
On 10/1/25 11:01 AM, [email protected] wrote:
From: Salil Mehta <[email protected]>

[...]

+void kvm_arm_create_host_vcpu(ARMCPU *cpu)
+{
+    CPUState *cs = CPU(cpu);
+    unsigned long vcpu_id = cs->cpu_index;
+    int ret;
+
+    ret = kvm_create_vcpu(cs);
+    if (ret < 0) {
+        error_report("Failed to create host vcpu %ld", vcpu_id);
+        abort();
+    }
+
+    /*
+     * Initialize the vCPU in the host. This will reset the sys regs
+     * for this vCPU and related registers like MPIDR_EL1 etc. also
+     * get programmed during this call to host. These are referenced
+     * later while setting device attributes of the GICR during GICv3
+     * reset.
+     */
+    ret = kvm_arch_init_vcpu(cs);
+    if (ret < 0) {
+        error_report("Failed to initialize host vcpu %ld", vcpu_id);
+        abort();
+    }
+
+    /*
+     * park the created vCPU. shall be used during kvm_get_vcpu() when
+     * threads are created during realization of ARM vCPUs.
+     */
+    kvm_park_vcpu(cs);
+}
+

I don't think we can simply call kvm_arch_init_vcpu() in the lazily realized
path. Otherwise, it can trigger a crash dump on my NVIDIA Grace Hopper machine,
where SVE is supported by default.

Thanks for reporting this. That is not true. As long as we initialize KVM
correctly and finalize features like SVE, we should be fine. In fact, this is
precisely what we are doing right now.

To understand the crash, I need a bit more info.

1. Is it happening because KVM_ARM_VCPU_INIT is failing? If yes, can you check
   within KVM whether it is because
   a. the features specified by QEMU do not match the defaults within KVM
      (hint: check kvm_vcpu_init_check_features()), or
   b. KVM is complaining about a change in the init features (see
      kvm_vcpu_init_changed())?
2. Or is it happening while setting the vector length or finalizing features?

int kvm_arch_init_vcpu(CPUState *cs)
{
    [...]
    /* Do KVM_ARM_VCPU_INIT ioctl */
    ret = kvm_arm_vcpu_init(cpu);   ---->[1]
    if (ret) {
        return ret;
    }
    if (cpu_isar_feature(aa64_sve, cpu)) {
        ret = kvm_arm_sve_set_vls(cpu); ---->[2]
        if (ret) {
            return ret;
        }
        ret = kvm_arm_vcpu_finalize(cpu, KVM_ARM_VCPU_SVE); ---->[3]
        if (ret) {
            return ret;
        }
    }
    [...]
}

I think it's happening because the vector length is left uninitialized. That
initialization happens in arm_cpu_finalize_features(), which I forgot to call
before the KVM finalize.


In the current implementation (without this series), kvm_arch_init_vcpu() is
supposed to be called in the realization path, because the parameters
(features) passed to KVM_ARM_VCPU_INIT are populated at vCPU realization time.

Not necessarily. It is only meant to initialize the vCPU within KVM. If we take
care of the KVM requirements the same way the realize path does, we should be
fine. Can you try adding the patch below to your code and test whether it works?

diff --git a/target/arm/kvm.c b/target/arm/kvm.c
index c4b68a0b17..1091593478 100644
--- a/target/arm/kvm.c
+++ b/target/arm/kvm.c
@@ -1068,6 +1068,9 @@ void kvm_arm_create_host_vcpu(ARMCPU *cpu)
         abort();
     }

+    /* finalize the features like SVE, SME etc */
+    arm_cpu_finalize_features(cpu, &error_abort);
+
     /*
      * Initialize the vCPU in the host. This will reset the sys regs
      * for this vCPU and related registers like MPIDR_EL1 etc. also





$ /home/gavin/sandbox/qemu.main/build/qemu-system-aarch64          \
     --enable-kvm -machine virt,gic-version=3 -cpu host               \
     -smp cpus=4,disabledcpus=2 -m 1024M                              \
     -kernel /home/gavin/sandbox/linux.guest/arch/arm64/boot/Image    \
     -initrd /home/gavin/sandbox/images/rootfs.cpio.xz -nographic
qemu-system-aarch64: Failed to initialize host vcpu 4
Aborted (core dumped)

Backtrace
=========
(gdb) bt
#0  0x0000ffff9106bc80 in __pthread_kill_implementation () at /lib64/libc.so.6
#1  0x0000ffff9101aa40 [PAC] in raise () at /lib64/libc.so.6
#2  0x0000ffff91005988 [PAC] in abort () at /lib64/libc.so.6
#3  0x0000aaaab1cc26b8 [PAC] in kvm_arm_create_host_vcpu (cpu=0xaaaab9ab1bc0)
       at ../target/arm/kvm.c:1081
#4  0x0000aaaab1cd0c94 in virt_setup_lazy_vcpu_realization (cpuobj=0xaaaab9ab1bc0, vms=0xaaaab98870a0)
       at ../hw/arm/virt.c:2483
#5  0x0000aaaab1cd180c in machvirt_init (machine=0xaaaab98870a0) at ../hw/arm/virt.c:2777
#6  0x0000aaaab160f220 in machine_run_board_init
       (machine=0xaaaab98870a0, mem_path=0x0, errp=0xfffffa86bdc8) at ../hw/core/machine.c:1722
#7  0x0000aaaab1a25ef4 in qemu_init_board () at ../system/vl.c:2723
#8  0x0000aaaab1a2635c in qmp_x_exit_preconfig (errp=0xaaaab38a50f0 <error_fatal>)
       at ../system/vl.c:2821
#9  0x0000aaaab1a28b08 in qemu_init (argc=15, argv=0xfffffa86c1f8) at ../system/vl.c:3882
#10 0x0000aaaab221d9e4 in main (argc=15, argv=0xfffffa86c1f8) at ../system/main.c:71


Thank you for this. Please let me know whether the above fix works, and also
the return values in case you encounter errors.

I've pushed the fix to below branch for your convenience:

Branch: https://github.com/salil-mehta/qemu/commits/virt-cpuhp-armv8/rfc-v6.2
Fix: https://github.com/salil-mehta/qemu/commit/1f1fbc0998ffb1fe26140df3c336bf2be2aa8669


I guess the rfc-v6.2 branch isn't ready for testing, because it runs into
another crash dump, as shown below.


rfc-v6.2 is not crashing on Kunpeng920, where I tested. But this chip lacks
some ARM extensions like SVE, so unfortunately I can't test SVE/SME/PAuth
support.

Can you disable SVE and check whether it comes up, just to narrow down the
case?


Right, this crash dump shouldn't be encountered if SVE isn't supported. I
already had the workaround "-cpu host,sve=off" to keep my tests moving forward...


host$ /home/gavin/sandbox/qemu.main/build/qemu-system-aarch64                   \
        -accel kvm -machine virt,gic-version=host,nvdimm=on                     \
        -cpu host,sve=on                                                        \
        -smp maxcpus=4,cpus=2,disabledcpus=2,sockets=2,clusters=2,cores=1,threads=1 \
        -m 4096M,slots=16,maxmem=128G                                           \
        -object memory-backend-ram,id=mem0,size=2048M                           \
        -object memory-backend-ram,id=mem1,size=2048M                           \
        -numa node,nodeid=0,memdev=mem0,cpus=0-1                                \
        -numa node,nodeid=1,memdev=mem1,cpus=2-3                                \
        -L /home/gavin/sandbox/qemu.main/build/pc-bios                          \
        -monitor none -serial mon:stdio -nographic -gdb tcp::6666               \
        -qmp tcp:localhost:5555,server,wait=off                                 \
        -bios /home/gavin/sandbox/qemu.main/build/pc-bios/edk2-aarch64-code.fd  \
        -kernel /home/gavin/sandbox/linux.guest/arch/arm64/boot/Image           \
        -initrd /home/gavin/sandbox/images/rootfs.cpio.xz                       \
        -append memhp_default_state=online_movable
          :
          :
guest$ cd /sys/devices/system/cpu/
guest$ cat present enabled online
0-3
0-1
0-1
(qemu) device_set host-arm-cpu,socket-id=1,cluster-id=0,core-id=0,thread-id=0,admin-state=enable
qemu-system-aarch64: kvm_init_vcpu: kvm_arch_init_vcpu failed (2): Operation not permitted


Ah, I see. I think I understand the issue. It's complaining about finalize
being called twice. Could you check this, as I do not have a way to test it?


int kvm_arm_vcpu_finalize(struct kvm_vcpu *vcpu, int feature)
{
    switch (feature) {
    case KVM_ARM_VCPU_SVE:
        [...]
        if (kvm_arm_vcpu_sve_finalized(vcpu))
            return -EPERM; -----> is this where it is failing?
    [...]
}
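
The one-shot rule discussed here can be illustrated with a small user-space
model. This is only a sketch: struct toy_vcpu and the toy_* functions are
made-up names for illustration, not kernel or QEMU code, and the model keeps
nothing but the "already finalized" check from the excerpt above.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy stand-in for per-vCPU state; NOT the kernel's struct kvm_vcpu. */
struct toy_vcpu {
    bool sve_finalized;
};

/* Models the check above: the first finalize succeeds, later ones fail. */
static int toy_vcpu_finalize_sve(struct toy_vcpu *vcpu)
{
    if (vcpu->sve_finalized) {
        return -EPERM;
    }
    vcpu->sve_finalized = true;
    return 0;
}

/* Finalizes the same toy vCPU twice and checks both return values. */
static void toy_finalize_twice(void)
{
    struct toy_vcpu vcpu = { .sve_finalized = false };

    assert(toy_vcpu_finalize_sve(&vcpu) == 0);
    assert(toy_vcpu_finalize_sve(&vcpu) == -EPERM);
}
```

Calling toy_finalize_twice() from a main() exercises both cases. EPERM is the
errno that strerror() reports as "Operation not permitted" on Linux, which
matches the message in the log above.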


Right, I think that's the case: QEMU tries to finalize the SVE capability twice,
which is the real problem. I'm explaining what I found below, which should be
helpful for the forthcoming revisions.

machvirt_init
  virt_setup_lazy_vcpu_realization
    arm_cpu_finalize_features
    kvm_arm_create_host_vcpu
      kvm_create_vcpu                       // New fd is created
      kvm_arch_init_vcpu
        kvm_arm_vcpu_init
        kvm_arm_sve_set_vls
        kvm_arm_vcpu_finalize               // (A) SVE capability is finalized

device_set_admin_power_state
  device_pre_poweron
    virt_machine_device_pre_poweron
      virt_cpu_pre_poweron
        qdev_realize
          arm_cpu_realizefn
            cpu_exec_realizefn
            arm_cpu_finalize_features       // Called for the second time
            qemu_init_vcpu
              kvm_start_vcpu_thread
                kvm_vcpu_thread_fn
                  kvm_init_vcpu
                    kvm_create_vcpu         // Called for the second time
                    kvm_arch_init_vcpu      // Called for the second time
                      kvm_arm_vcpu_init
                      kvm_arm_sve_set_vls   // (B) Failed here
                      kvm_arm_vcpu_finalize

At (B) we try to finalize the SVE capability again, although it has already
been finalized at (A). Finalizing the SVE capability twice is disallowed by KVM
on the host side.
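
To make the two paths above concrete, here is a rough user-space model of
them. It is only a sketch under assumptions: every name in it (model_vcpu,
arch_inited, model_arch_init_once, ...) is hypothetical and exists neither in
QEMU nor in KVM, and the "skip if already initialized" guard is just one
possible direction, not what the series does.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model; none of these names exist in QEMU or KVM. */
struct model_vcpu {
    bool sve_finalized;   /* host-side state: SVE configuration is frozen */
    bool arch_inited;     /* user-space bookkeeping: arch init already ran */
};

/* Host side: the SVE configuration can be finalized only once per vCPU. */
static int model_host_finalize_sve(struct model_vcpu *v)
{
    if (v->sve_finalized) {
        return -EPERM;
    }
    v->sve_finalized = true;
    return 0;
}

/* Arch init that unconditionally runs the SVE finalize step. */
static int model_arch_init(struct model_vcpu *v)
{
    int ret = model_host_finalize_sve(v);

    if (ret == 0) {
        v->arch_inited = true;
    }
    return ret;
}

/* Same, but a no-op when the parked vCPU was already initialized. */
static int model_arch_init_once(struct model_vcpu *v)
{
    if (v->arch_inited) {
        return 0;
    }
    return model_arch_init(v);
}
```

In this model, calling model_arch_init() once for the machine-init path and
once more for the hot-add path returns -EPERM the second time, while
model_arch_init_once() returns 0 on both calls for the same vCPU.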



I picked the fix (the last patch in the rfc-v6.2 branch) onto the rfc-v6
branch; the same crash dump can be seen.

Are you getting previously reported abort or above new problem?


Previously, the VM couldn't be started. After your fix is applied, the VM is
able to start. The new problem is that a QEMU crash dump is seen on an attempt
to hot add a vCPU.

Thanks,
Gavin

