On Tue, 2007-06-19 at 17:48 +0300, Avi Kivity wrote: > Gregory Haskins wrote: > > Halting in userspace requires a relatively cumbersome mechanism to signal > > the > > halted VCPU. Implementing halt in kernel should be relatively straight > > forward and it eliminates the need for the signaling > > > > > > Merging this one in, found some nits:
I believe you will find that this implementation works correctly and possibly even optimally for the situation. I crafted it after carefully balancing the design requirements with observed behavior through many rounds of testing. I do have a bit of experience in this area of linux, though I am not an out-and-out expert. Therefore its possible there are other/better ways to do this. Based on this, I will just detail why I did things this way and you guys can poke holes in it ;) > > > +/* > > * This function is invoked whenever we want to interrupt a vcpu that is > > * currently executing in guest-mode. It currently is a no-op because > > * the simple delivery of the IPI to execute this function accomplishes our > > @@ -2481,6 +2556,16 @@ static void kvm_vcpu_intr(struct kvm_irqsink *this, > > BUG_ON(direct_ipi == smp_processor_id()); > > ++vcpu->stat.guest_preempt; > > } > > + > > + /* > > + * If the CPU is halted it will be waiting for a wake-up > > + */ > > + if (waitqueue_active(&vcpu->irq.wq)) { > > > > Why do the check? The only reason I can see is to keep the stats > correct. Otherwise we can do the body of the if unconditionally. I agree the waitqueue_active() check is superfluous if you are only waking up a wait_queue and nothing else. However, one reason to have it this way is the stat update (as you pointed out). More importantly, another reason is that we cannot use a simple wake_up here (I believe) due to constraints of the different contexts that may invoke this function. Therefore we have to use wake_up_sync with a deferred reschedule. We could do these operations unconditionally, but having an unnecessary reschedule seems like it would be wasteful at best. > > > + wake_up_interruptible_sync(&vcpu->irq.wq); > > + set_tsk_need_resched(current); > > > > This is unneeded? I'd expect wake_up_interruptible_sync() to take care > of any rescheduling needed. IIUC, a regular wake_up() does two primary things: 1) it moves an item from a wait_queue to the run_queue, and 2) it calls schedule. wake_up_sync() only does (1) and expects that the caller will call schedule() later on its own. In our case, we cannot safely call schedule() directly anywhere in the interrupt code because there are multiple paths that can cause the apic to inject something, some of which are not conducive to sleeping (for example hrtimer callback driven LVTT interrupts). Therefore, we mark the task as needing rescheduling to defer scheduling before we return from this function, but to also induce scheduling ASAP afterwards. IIRC, if you remove the NEED_RESCHED, sometimes the LVTT will fail to kick the vcpu out of halt until the next natural schedule point happens orthogonal to this codepath. This results in unexpected behavior in the guest, and is what drove me to add it. Regards, -Greg ------------------------------------------------------------------------- This SF.net email is sponsored by DB2 Express Download DB2 Express C - the FREE version of DB2 express and take control of your XML. No limits. Just data. Click to get it now. http://sourceforge.net/powerbar/db2/ _______________________________________________ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel