Re: [PATCH] fix kvmclock bug
On 09/19/2010 02:15 AM, Zachary Amsden wrote:
> commit 1abe7e8806fd71ea802c6622ed3ce7821a18f271
> Author: Zachary Amsden <zams...@redhat.com>
> Date: Sat Sep 18 13:58:37 2010 -1000
>
>     Fix kvmclock bug

I think there's some redundancy here. Anyone who has tracked your kernel work knows you've done a lot of work on kvmclock, so "Fix bug" would be just as descriptive. Of course, if we fix something, it's because it was a bug, so "Fix" is all that's really necessary ("bug" would be reserved for commits that introduce bugs). In the case of kvmclock, it's hard to see how new features could be added, and it has a pretty bad history of bugs, so most readers would probably deduce that a commit fixes a bug. I don't really see why you wrote a subject line at all.

--
I have a truly marvellous patch that fixes the bug which this signature is too narrow to contain.

--
To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] fix kvmclock bug
On 27.09.2010 21:00, Zachary Amsden wrote:
> On 09/25/2010 11:54 PM, Jan Kiszka wrote:
>> That only leaves us with the likely wrong unstable declaration of the TSC after resume. And that raises the question for me if KVM is actually that much smarter than the Linux kernel in detecting TSC jumps. If something is missing, can't we improve the kernel's detection mechanism which already has suspend/resume support?
>
> Linux must make the conservative choice about TSC being declared unstable; if it is possible that it has become unstable, it is unstable. Unfortunately, this does not bode well for us, as most of the finer points of accuracy depend on having a stable TSC.
>
> There's a bunch of places that declare TSC unstable, and where in the suspend / resume cycle that happens would depend on your actual hardware.

It's absolutely clear where this happens: kvm_arch_vcpu_load. And it seems to happen as the TSC is reset due to suspend-to-RAM. Again: Linux recovers from this and continues to use the TSC. KVM is more picky, so my question is if this is really required.

Jan

--
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux
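To make the point about kvm_arch_vcpu_load concrete, here is a minimal user-space sketch of how a backwards-TSC check on VCPU load could latch the TSC as unstable after a suspend-to-RAM reset. It is an illustration only, not the kernel code: the assumption that the check is a plain "TSC went backwards since the last put" comparison, and every name in it (toy_host_state, toy_vcpu_put, toy_vcpu_load, toy_unstable_after), is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the state kept per VCPU. */
struct toy_host_state {
	uint64_t last_host_tsc; /* host TSC when the VCPU was last put; 0 = never */
	bool tsc_unstable;      /* sticky: once set, it is never cleared here */
};

/* Models descheduling the VCPU thread: remember the host TSC. */
static void toy_vcpu_put(struct toy_host_state *s, uint64_t tsc_now)
{
	s->last_host_tsc = tsc_now;
}

/* Models scheduling the VCPU thread back in. ASSUMPTION: the TSC is
 * declared unstable as soon as it is seen to have gone backwards. */
static void toy_vcpu_load(struct toy_host_state *s, uint64_t tsc_now)
{
	if (s->last_host_tsc != 0 && tsc_now < s->last_host_tsc)
		s->tsc_unstable = true;
}

/* Scenario helper: put the VCPU at put_tsc, load it again at load_tsc,
 * and report whether the TSC ended up marked unstable. */
bool toy_unstable_after(uint64_t put_tsc, uint64_t load_tsc)
{
	struct toy_host_state s = { 0, false };

	toy_vcpu_load(&s, put_tsc); /* first load: no history, nothing to compare */
	toy_vcpu_put(&s, put_tsc);
	toy_vcpu_load(&s, load_tsc);
	return s.tsc_unstable;
}
```

In this model, a counter that restarts near zero across suspend (for example toy_unstable_after(5000000, 1200)) trips the check, while a normal reschedule (toy_unstable_after(5000000, 5000100)) does not. Because the flag is sticky, a single reset is enough to leave the TSC marked unstable afterwards, which is the behaviour questioned above.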
Re: [PATCH] fix kvmclock bug
On 24.09.2010 09:28, Jan Kiszka wrote:
> On 19.09.2010 02:15, Zachary Amsden wrote:
>> For CPUs with unstable TSC, we null time offset between not just VCPU switches, but all preemptions of the kvm thread. This makes a bug much more likely where the kvmclock values are updated before a successful exit from virt, causing an underflow.
>>
>> The null offsetting was added at: bf0fb4a42ba7eb362f4013bd2e93209666793e66
>> The underflow happens with this additional patch: cf839f5da2b0779b9ec8b990f851fb4e7d681da0
>>
>> There is a secondary bug, which is that TSC fails to advance with real time on unstable TSC, but the fix is much more involved (it requires the TSC catchup code). For now, this patch is sufficient to get things working again for me.
>
> ...but not for me. I still face stuck (or infinitely slow) guests that want to use kvmclock once tsc_unstable gets set. Or is this patch addressing a different issue?

Commit bfb3f332 (TSC catchup mode) in kvm.git finally resolves the issue here.

That only leaves us with the likely wrong unstable declaration of the TSC after resume. And that raises the question for me if KVM is actually that much smarter than the Linux kernel in detecting TSC jumps. If something is missing, can't we improve the kernel's detection mechanism which already has suspend/resume support?

Jan
Re: [PATCH] fix kvmclock bug
On 19.09.2010 02:15, Zachary Amsden wrote:
> For CPUs with unstable TSC, we null time offset between not just VCPU switches, but all preemptions of the kvm thread. This makes a bug much more likely where the kvmclock values are updated before a successful exit from virt, causing an underflow.
>
> The null offsetting was added at: bf0fb4a42ba7eb362f4013bd2e93209666793e66
> The underflow happens with this additional patch: cf839f5da2b0779b9ec8b990f851fb4e7d681da0
>
> There is a secondary bug, which is that TSC fails to advance with real time on unstable TSC, but the fix is much more involved (it requires the TSC catchup code). For now, this patch is sufficient to get things working again for me.

...but not for me. I still face stuck (or infinitely slow) guests that want to use kvmclock once tsc_unstable gets set. Or is this patch addressing a different issue?

Jan
[PATCH] fix kvmclock bug
For CPUs with unstable TSC, we null time offset between not just VCPU switches, but all preemptions of the kvm thread. This makes a bug much more likely where the kvmclock values are updated before a successful exit from virt, causing an underflow.

The null offsetting was added at: bf0fb4a42ba7eb362f4013bd2e93209666793e66
The underflow happens with this additional patch: cf839f5da2b0779b9ec8b990f851fb4e7d681da0

There is a secondary bug, which is that TSC fails to advance with real time on unstable TSC, but the fix is much more involved (it requires the TSC catchup code). For now, this patch is sufficient to get things working again for me.

commit 1abe7e8806fd71ea802c6622ed3ce7821a18f271
Author: Zachary Amsden <zams...@redhat.com>
Date:   Sat Sep 18 13:58:37 2010 -1000

    Fix kvmclock bug

    If preempted after kvmclock values are updated, but before hardware
    virtualization is entered, the last tsc time as read by the guest is
    never set. It underflows the next time kvmclock is updated if there
    has not yet been a successful entry / exit into hardware virt.

    Fix this by simply setting last_tsc to the newly read tsc value so
    that any computed nsec advance of kvmclock is nulled.

    Signed-off-by: Zachary Amsden <zams...@redhat.com>

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 76db85a..09f468a 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1101,6 +1101,7 @@ static int kvm_guest_time_update(struct kvm_vcpu *v)
 	vcpu->hv_clock.tsc_timestamp = tsc_timestamp;
 	vcpu->hv_clock.system_time = kernel_ns + v->kvm->arch.kvmclock_offset;
 	vcpu->last_kernel_ns = kernel_ns;
+	vcpu->last_guest_tsc = tsc_timestamp;
 	vcpu->hv_clock.flags = 0;
 
 	/*
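
The underflow described in the commit message can be reproduced in a small user-space model. The following is a minimal sketch under stated assumptions, not the code in x86.c: it assumes the guest's TSC advance is computed as an unsigned difference between the TSC recorded at the last guest exit and the TSC recorded at the last kvmclock update. The struct and all toy_* function names are hypothetical; only the field names tsc_timestamp and last_guest_tsc are borrowed from the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the per-VCPU clock state. */
struct toy_vcpu {
	uint64_t tsc_timestamp;  /* TSC captured by the last kvmclock update */
	uint64_t last_guest_tsc; /* TSC captured at the last exit from the guest */
};

/* Models a successful entry into and exit from hardware virt. */
static void toy_guest_exit(struct toy_vcpu *v, uint64_t tsc_now)
{
	v->last_guest_tsc = tsc_now;
}

/* Models a kvmclock update; apply_fix mirrors the one-line change in the patch. */
static void toy_time_update(struct toy_vcpu *v, uint64_t tsc_now, bool apply_fix)
{
	v->tsc_timestamp = tsc_now;
	if (apply_fix)
		v->last_guest_tsc = tsc_now;
}

/* ASSUMPTION: TSC cycles the guest has seen since the last update.
 * The subtraction is unsigned, so a stale last_guest_tsc wraps around. */
static uint64_t toy_guest_tsc_advance(const struct toy_vcpu *v)
{
	return v->last_guest_tsc - v->tsc_timestamp;
}

/* Scenario: guest exit at TSC 1000, kvmclock update at TSC 2000, then
 * preemption before the guest is entered again. */
uint64_t toy_preempted_update_advance(bool apply_fix)
{
	struct toy_vcpu v = { 0, 0 };

	toy_guest_exit(&v, 1000);
	toy_time_update(&v, 2000, apply_fix);
	/* Without the fix, the exit stamp is now older than the update stamp. */
	assert(apply_fix || v.last_guest_tsc < v.tsc_timestamp);
	return toy_guest_tsc_advance(&v);
}
```

With these numbers, toy_preempted_update_advance(false) wraps around to 2^64 - 1000, while toy_preempted_update_advance(true) returns 0, so the computed advance is nulled as the commit message describes.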