On 10.01.2012, at 23:43, Scott Wood wrote:

> On 01/10/2012 11:52 AM, Jan Kiszka wrote:
>> On 2012-01-10 18:43, Scott Wood wrote:
>>> On 01/10/2012 03:38 AM, Jan Kiszka wrote:
>>>> On 2012-01-10 00:17, Scott Wood wrote:
>>>>> On 01/09/2012 04:39 PM, Alexander Graf wrote:
>>>>>>
>>>>>> On 09.01.2012, at 22:23, Scott Wood wrote:
>>>>>>> Alex, is there a better way to deal with the IRQ chip issue?
>>>>>>
>>>>>> To be honest, I'm not sure what the issue really is.
>>>>>
>>>>> If irqchip is enabled, env->halted won't result in a CPU being
>>>>> considered idle -- since QEMU won't see the interrupt that wakes the
>>>>> vcpu, and the idling is handled in the kernel. In this case we're
>>>>> waiting for MMIO rather than an interrupt, and it's the kernel that
>>>>> doesn't know what's going on.
>>>>>
>>>>> It seems wrong to use env->stopped, though, as a spin-table release
>>>>> should not override a user's explicit request to stop a CPU. It might
>>>>> be OK (though a bit ugly) if the only usage of env->stopped is through
>>>>> pause_all_vcpus(), and the boot thread is the first one to be kicked
>>>>> (though in theory the boot cpu could wake another cpu, and that could
>>>>> wake a cpu that comes before it, causing a race with pause_all_vcpus()).
>>>>>
>>>>> If it is OK to use env->stopped, is there any reason not to always use
>>>>> it (versus just with irqchip)?
>>>>
>>>> Why don't you wait in the kernel with in-kernel irqchip under all
>>>> conditions (except pausing VCPUs, of course) on PPC? Just like x86 does.
>>>
>>> We do for normal idling. This is a bit different, in that we're not
>>> waiting for an interrupt, but for an MMIO that releases the cpu at
>>> boot-time.
>>
>> Where is the state stored that declares a VCPU to wait for that event?
>> Where is it set, where removed?
>>
>> What about implementing MP_STATE on PPC, at least those states that make
>> sense? Don't you need that anyway for normal HALT<->RUNNABLE transitions?
>
> On ppc, normal halt/runnable transitions are handled entirely in the
> kernel, even without irqchip.
>
> So, the idea is that on secondary VCPU creation, QEMU sets MP_STATE to
> KVM_MP_STATE_UNINITIALIZED, and KVM will hold the thread idle until the
> MMIO is done and QEMU sets MP_STATE to KVM_MP_STATE_RUNNABLE? It seems
> excessive compared to QEMU being able to figure out for itself when it
> doesn't want to run a VCPU thread, when the decision is based entirely
> on things that are modeled in QEMU (which it will still need to do in
> the non-KVM case).

I agree, but the closer we can stick to how x86 models it today, the more
generic code we have, and the better.

Alex
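
For illustration only, the flow discussed above (secondary VCPUs parked until the spin-table MMIO releases them, without overriding a user's explicit stop) could be modeled roughly as below. This is a standalone sketch, not QEMU or KVM code: the enum, struct, and function names are hypothetical stand-ins, and the real MP states are the KVM_MP_STATE_* constants that would be set through the KVM_SET_MP_STATE ioctl.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for KVM_MP_STATE_UNINITIALIZED / KVM_MP_STATE_RUNNABLE. */
enum mp_state {
    MP_STATE_UNINITIALIZED,
    MP_STATE_RUNNABLE,
};

/* Hypothetical per-vcpu bookkeeping; not the real CPU state struct. */
struct vcpu {
    int index;
    enum mp_state mp_state; /* release state, driven by the spin table */
    bool stopped;           /* user's explicit stop request, kept separate */
};

/* The boot vcpu starts runnable; secondaries start parked. */
static void vcpu_init(struct vcpu *v, int index)
{
    assert(v != NULL);
    v->index = index;
    v->mp_state = (index == 0) ? MP_STATE_RUNNABLE : MP_STATE_UNINITIALIZED;
    v->stopped = false;
}

/* Would be called from the modeled spin-table MMIO write handler. */
static void spin_table_release(struct vcpu *v)
{
    assert(v != NULL);
    v->mp_state = MP_STATE_RUNNABLE; /* deliberately leaves 'stopped' alone */
}

/* A vcpu thread may enter the guest only if released and not user-stopped. */
static bool vcpu_can_run(const struct vcpu *v)
{
    assert(v != NULL);
    return v->mp_state == MP_STATE_RUNNABLE && !v->stopped;
}
```

Keeping the release state apart from the stop flag means a spin-table release cannot undo an explicit stop request, which is the concern raised about reusing env->stopped.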