Hi Peter,
> From: [email protected] <qemu-
> [email protected]> On Behalf Of Peter
> Maydell
> Sent: Tuesday, October 14, 2025 4:44 PM
> To: Salil Mehta <[email protected]>
>
> On Tue, 14 Oct 2025 at 16:33, Salil Mehta <[email protected]> wrote:
> >
> > > From: Peter Maydell <[email protected]>
> > > Sent: Tuesday, October 14, 2025 4:24 PM
> > > To: Salil Mehta <[email protected]>
> > >
> > > On Tue, 14 Oct 2025 at 16:13, Salil Mehta <[email protected]>
> wrote:
> > > >
> > > > > From: Peter Maydell <[email protected]> In what situation
> > > > > do we ever start running a VCPU before the *GIC* has been
> > > > > realized? The GIC should get realized as part of creating the
> > > > > virt board, which must complete before we do anything like running a
> vcpu.
> > > >
> > > >
> > > > Just after realization of vCPU in the machvirt_init() you can see
> > > > the default power_state is PSCI CPU_ON, which means
> > > KVM_MP_STATE_RUNNABLE.
> > > > Since, the thread is up and not doing IO wait in userspace it gets
> > > > into
> > > > cpu_exec() loop and actually run KVM_RUN IOCTL. Inside the KVM it
> > > > momentarily takes the vCPU mutex but later exit and releases. This
> > > > keeps going on for all of the vCPU threads realized early.
> > >
> > > Yikes. We definitely should fix that : letting the vcpu run before
> > > we get to
> > > qemu_machine_creation_done() seems like it would be a massive source
> > > of race conditions.
> >
> > I've already proposed fix for this by parking such threads in
> > userspace. Please check functions virt_(un)park_cpu_in_userspace().
> > But need to check if we can use this trick can be used at the very early
> stages of the VM initialization.
>
> I had a look at this on x86, and we correctly don't try to KVM_RUN the vcpus
> early. What happens there is:
> * the vcpu thread calls qemu_process_cpu_events()
I cannot find this function in the Qemu mainline of 28th September ?
> * this causes it to go to sleep on the cpu->halt_cond: so
> it will not end up doing KVM_RUN yet
> * later, the main thread completes initialization of the board
> * qdev_machine_creation_done() calls cpu_synchronize_all_post_init()
> * for kvm, this causes us to call kvm_cpu_synchronize_post_init()
> for each vcpu
> * that will call run_on_cpu() which ends up calling qemu_cpu_kick()
> * qemu_cpu_kick() does a broadcast on cpu->halt_cond, which
> wakes up the vcpu thread and lets it go into kvm_cpu_exec()
> for the first time
>
> Why doesn't this mechanism work on Arm ?
It is a combination of things:
void qemu_wait_io_event(CPUState *cpu)
{
[...]
while (cpu_thread_is_idle(cpu)) {
[...]
qemu_cond_wait(cpu->halt_cond, &bql);
}
[...]
}
1. To block we should wait on 'halt_cond' as you rightly pointed.
2. but condition to wait is to check of the CPU is IDLE or not.
3. Various conditions in which CPU can be termed IDLE are:
3.1 STOPPED
3.2 HALTED
3.3 It does not have any queued work to process.
Because CPU never ran we can rule out 3.1 & 3.2. If CPU is in
'halted' condition i.e. 'CPUState::halted=1' then we can also assume
thread to be in IDLE state. On ARM, default value of this variable is 0.
This get only set in context to CPU common reset when the first time
all CPUs are reset in context to the system reset.
static void cpu_common_reset_hold(Object *obj, ResetType type)
{
[...]
cpu->halted = cpu->start_powered_off;
[...];
}
Maybe we can fix this default value?
Best regards
Salil.