Ryan Harper wrote:
> I've been digging into some of the instability we see when running
> larger numbers of guests at the same time. The test I'm currently using
> involves launching 64 1vcpu guests on an 8-way AMD box.

Note this is a Barcelona system and therefore should have a
fixed-frequency TSC.

> With the latest
> kvm-userspace git and kvm.git + Gerd's kvmclock fixes, I can launch all
> 64 of these 1 second apart,

BTW, what if you don't pace out the startups? Do we still have issues
with that?

> and only a handful (1 to 3) end up not
> making it up. In dmesg on the host, I get a couple messages:
>
> [321365.362534] vcpu not ready for apic_round_robin
>
> and
>
> [321503.023788] Unsupported delivery mode 7
>
> Now, the interesting bit for me was when I used numactl to pin the guest
> to a processor, all of the guests come up with no issues at all. As I
> looked into it, it means that we're not running any of the vcpu
> migration code which on svm is comprised of tsc_offset recalibration and
> apic migration, and on vmx, a little more per-vcpu work
>

Another data point is that -no-kvm-irqchip doesn't make the situation
better.

> I've convinced myself that svm.c's tsc offset calculation works and
> handles the migration from cpu to cpu quite well. I added the following
> snippet to trigger if we ever encountered the case where we migrated to
> a tsc that was behind:
>
>     rdtscll(tsc_this);
>     delta = vcpu->arch.host_tsc - tsc_this;
>     old_time = vcpu->arch.host_tsc + svm->vmcb->control.tsc_offset;
>     new_time = tsc_this + svm->vmcb->control.tsc_offset + delta;
>     if (new_time < old_time) {
>         printk(KERN_ERR "ACK! (CPU%d->CPU%d) time goes back %llu\n",
>                vcpu->cpu, cpu, old_time - new_time);
>     }
>     svm->vmcb->control.tsc_offset += delta;
>

Time will never go backwards, but what can happen is that the guest's
apparent TSC frequency will slow down. This is because upon VCPU
migration, we don't account for the time between vcpu_put on the old
processor and vcpu_load on the new processor. This time then disappears.
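
To make the disappearing time concrete, here is a minimal userspace model of the offset arithmetic in the quoted snippet. It is only a sketch: struct vcpu_model, model_vcpu_put(), model_vcpu_load() and cycles_lost() are hypothetical names made up for illustration, not kernel code, and it assumes the host TSCs are synchronized so the new cpu's TSC is never behind the saved one.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the two fields the quoted snippet works with. */
struct vcpu_model {
    uint64_t host_tsc;   /* host TSC sampled at vcpu_put on the old cpu */
    uint64_t tsc_offset; /* guest TSC = host TSC + tsc_offset (mod 2^64) */
};

/* What the guest reads when the host TSC is host_tsc_now. */
static uint64_t guest_tsc(const struct vcpu_model *v, uint64_t host_tsc_now)
{
    return host_tsc_now + v->tsc_offset;
}

/* Record the host TSC when the vcpu is descheduled. */
static void model_vcpu_put(struct vcpu_model *v, uint64_t host_tsc_now)
{
    v->host_tsc = host_tsc_now;
}

/* Same arithmetic as the quoted snippet: fold (saved - current) into the offset. */
static void model_vcpu_load(struct vcpu_model *v, uint64_t tsc_this)
{
    uint64_t delta = v->host_tsc - tsc_this; /* wraps, like the u64 math it models */

    v->tsc_offset += delta;
}

/*
 * Host cycles that elapse between put and load but never show up in the
 * guest's TSC. Assumes synchronized host TSCs (load_tsc >= put_tsc).
 */
static uint64_t cycles_lost(uint64_t put_tsc, uint64_t load_tsc)
{
    struct vcpu_model v = { .host_tsc = 0, .tsc_offset = 0 };
    uint64_t before, after;

    assert(load_tsc >= put_tsc);
    model_vcpu_put(&v, put_tsc);
    before = guest_tsc(&v, put_tsc);
    model_vcpu_load(&v, load_tsc);
    after = guest_tsc(&v, load_tsc);

    return (load_tsc - put_tsc) - (after - before);
}
```

In this model the guest resumes at exactly the TSC value it had at vcpu_put, so new_time always equals old_time and the check in the quoted snippet can never fire, while every cycle spent descheduled is dropped from the guest's view.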

A possible way to fix this (valid only on a processor with a
fixed-frequency TSC) is to take a high-res timestamp on vcpu_put, and
then on vcpu_load, compute the time elapsed since the old TSC was saved
and use the TSC frequency on the new pcpu to convert it to the number of
elapsed cycles.

Assuming a fixed-frequency TSC, and a calibrated TSC across all
processors, you could get the same effect by using the VT tsc delta
logic. Basically, it always uses the new CPU's TSC unless that would
cause the guest to move backwards in time. As long as you have a
stable, calibrated TSC, this would work out.

Can you try your old patch that did this and see if it fixes the problem?

> Noting that vcpu->arch.host_tsc is the tsc of the previous cpu the vcpu
> was running on (see svm_put_vcpu()). This allows me to check if we are
> in fact increasing the guest's view of the tsc. I've not been able to
> trigger this at all when the vcpus are migrating.
>
> As for the apic, the migrate code seems to be rather simple, but I've
> not yet dived in to see if we've got anything racy in there:
>
> lapic.c:
> void __kvm_migrate_apic_timer(struct kvm_vcpu *vcpu)
> {
>     struct kvm_lapic *apic = vcpu->arch.apic;
>     struct hrtimer *timer;
>
>     if (!apic)
>         return;
>
>     timer = &apic->timer.dev;
>     if (hrtimer_cancel(timer))
>         hrtimer_start(timer, timer->expires, HRTIMER_MODE_ABS);
> }
>

There's a big FIXME in the __apic_timer_fn() to make sure the timer runs
on the current "pCPU". As written, it's possible for the timer to fire
on a different pcpu than the current vcpu's, but it wasn't obvious to me
that it would cause problems.

Eddie, et al: Care to elaborate on what the TODO was trying to address?
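
For reference, the two TSC adjustments discussed earlier in this reply could be modeled roughly as follows. This is a userspace sketch under stated assumptions, not a patch: struct vcpu_model, its put_ns field, ns_to_cycles(), load_with_elapsed() and load_unless_backwards() are all hypothetical names, the TSC frequency is assumed to be known in kHz, and the second function is only a reading of "use the new CPU's TSC unless the guest would go backwards", not a copy of the vmx code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model; put_ns is a field invented for this sketch. */
struct vcpu_model {
    uint64_t host_tsc;   /* host TSC sampled at vcpu_put */
    uint64_t put_ns;     /* high-res timestamp (ns) taken at vcpu_put */
    uint64_t tsc_offset; /* guest TSC = host TSC + tsc_offset (mod 2^64) */
};

/* What the guest reads when the host TSC is host_tsc_now. */
static uint64_t guest_tsc(const struct vcpu_model *v, uint64_t host_tsc_now)
{
    return host_tsc_now + v->tsc_offset;
}

/* Convert nanoseconds to TSC cycles; split to avoid overflowing 64 bits. */
static uint64_t ns_to_cycles(uint64_t ns, uint64_t tsc_khz)
{
    assert(tsc_khz > 0);
    return (ns / 1000000) * tsc_khz + (ns % 1000000) * tsc_khz / 1000000;
}

/*
 * Option 1: resume the guest at its old TSC value plus the cycles that
 * elapsed while it was descheduled, computed from wall-clock time and
 * the new cpu's TSC frequency.
 */
static void load_with_elapsed(struct vcpu_model *v, uint64_t tsc_this,
                              uint64_t now_ns, uint64_t new_cpu_tsc_khz)
{
    uint64_t elapsed;

    assert(now_ns >= v->put_ns);
    elapsed = ns_to_cycles(now_ns - v->put_ns, new_cpu_tsc_khz);
    v->tsc_offset += (v->host_tsc - tsc_this) + elapsed;
}

/*
 * Option 2: keep the offset as-is, so the guest simply follows the new
 * cpu's TSC, and only adjust when that would make the guest go backwards.
 */
static void load_unless_backwards(struct vcpu_model *v, uint64_t tsc_this)
{
    if (tsc_this < v->host_tsc)
        v->tsc_offset += v->host_tsc - tsc_this;
}
```

With synchronized, fixed-frequency TSCs both variants leave the guest at the same value after a migration. They differ only when the new cpu's TSC is behind the saved one: option 1 still credits the elapsed time, while option 2 merely holds the guest at its previous value.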

Regards,

Anthony Liguori

> Ryan Harper
> Software Engineer; Linux Technology Center
> IBM Corp., Austin, Tx
> (512) 838-9253 T/L: 678-9253
> [EMAIL PROTECTED]
>
> -------------------------------------------------------------------------
> This SF.net email is sponsored by: Microsoft
> Defy all challenges. Microsoft(R) Visual Studio 2008.
> http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/
> _______________________________________________
> kvm-devel mailing list
> kvm-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/kvm-devel