Re: regression - 2.6.36 - 2.6.37 - kvm - 32bit SMP guests don't boot
On 03/05/2011 02:21 AM, Nikola Ciprich wrote:
>> Can you try this patch to see if it fixes the problem?
>
> You haven't read my replies, have you? ;-)
> kvm_request_guest_time_update seems to have been removed, and kvm_make_request(KVM_REQ_CLOCK_UPDATE, vcpu) seems to be used instead; adding it fixes the problem.
> That's what I was going to use in the patch... :)

I did read your mail, but I was working on an old tree... because of that transformation, this fix will unfortunately have to be back- and forward-ported by hand.

Did you try applying just that change directly on top of the patch (e48672fa25e879f7ae21785c7efd187738139593) implicated by bisect?

It would be great to know if that change alone fixes the problem; if so, the fix you propose is probably the right one for upstream.

Thanks,

Zach
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: regression - 2.6.36 - 2.6.37 - kvm - 32bit SMP guests don't boot
> I did read your mail, but I was working on an old tree... because of that transformation, this fix will unfortunately have to be back- and forward-ported by hand.

OK, sorry, I didn't mean to be adversarial...

> Did you try applying just that change directly on top of the patch (e48672fa25e879f7ae21785c7efd187738139593) implicated by bisect?

Yes. With the host running e48672fa25e879f7ae21785c7efd187738139593, the 32bit SMP guest doesn't boot; when I add kvm_request_guest_time_update(vcpu), it helps.

> It would be great to know if that change alone fixes the problem; if so, the fix you propose is probably the right one for upstream.

OK, so shall I submit a patch adding kvm_make_request(KVM_REQ_CLOCK_UPDATE, vcpu)? This fixes things for me on 2.6.37.

--
Ing. Nikola CIPRICH
LinuxBox.cz, s.r.o.
28. rijna 168, 709 01 Ostrava

tel.:   +420 596 603 142
fax:    +420 596 621 273
mobil:  +420 777 093 799
www.linuxbox.cz

mobil servis: +420 737 238 656
email servis: ser...@linuxbox.cz
Re: regression - 2.6.36 - 2.6.37 - kvm - 32bit SMP guests don't boot
On 03/03/2011 05:01 PM, Nikola Ciprich wrote:
>> That sounds like a kernel which will be vulnerable to broken KVM clock on 32-bit. There's a kernel side fix that is needed, but why the server side change triggers the problem needs more investigation.
>
> OK, it's important for me that I can fix this by kernel parameter, but if I can help somehow with debugging, please let me know.
> thanks for Your time!
> nik

You don't see any messages about TSC being unstable or switching clocksource after loading the KVM module? And you are not suspending the host or anything?

Can you try using processor.max_cstate=1 on the host as a kernel parameter and see if it makes a difference?
Re: regression - 2.6.36 - 2.6.37 - kvm - 32bit SMP guests don't boot
Hello Zachary,

> You don't see any messages about TSC being unstable or switching clocksource after loading the KVM module? And you are not suspending the host or anything?

no messages, no suspending, nothing.

> Can you try using processor.max_cstate=1 on the host as a kernel parameter and see if it makes a difference?

I tried it, no change..

n.
Re: regression - 2.6.36 - 2.6.37 - kvm - 32bit SMP guests don't boot
On Fri, 2011-03-04 at 19:27 +0100, Nikola Ciprich wrote:
> Hello Zachary,
>
>> You don't see any messages about TSC being unstable or switching clocksource after loading the KVM module? And you are not suspending the host or anything?
>
> no messages, no suspending, nothing.
>
>> Can you try using processor.max_cstate=1 on the host as a kernel parameter and see if it makes a difference?
>
> I tried it, no change..
> n.

Zach,

I don't understand 100% of the logic behind all your tsc changes. But kvm-clock-wise, most of the problems we had in the past were related to the difference in resolution between the tsc and the host clocksource (hpet, acpi_pm, etc), which in his case is a non-issue.

It does seem to me like some compensation logic kicked in, dismantling an otherwise good tsc. He does have nonstop_tsc, which means it can't get any better.

One thing I noticed when reading the culprit patch from the bisect is that in vcpu_load() there was previously a call to kvm_request_guest_time_update(vcpu) that was removed without a counterpart addition. Any idea why that was done?

Nikola, does adding that line back alleviate the problem for you?
Re: regression - 2.6.36 - 2.6.37 - kvm - 32bit SMP guests don't boot
> Zach,
>
> I don't understand 100 % the logic behind all your tsc changes. But kvm-clock-wise, most of the problems we had in the past were related to the difference in resolution between the tsc and the host clocksource (hpet, acpi_pm, etc), which in his case, it is a non-issue.
>
> It does seem to me like some compensation logic kicked in, dismantling an otherwise good tsc. He does have nonstop_tsc, which means it can't get any better.
>
> One thing I noticed when reading the culprit patch in bisect, is that in vcpu_load(), there were previously a call to kvm_request_guest_time_update(vcpu) that was removed without a counterpart addition. Any idea about why it was done?
>
> Nikola, does adding that line back alleviate the problem for you ?

Hello Glauber,

kvm_request_guest_time_update seems to have been renamed and then removed since then, but I've added kvm_make_request(KVM_REQ_CLOCK_UPDATE, vcpu); instead and now the guest boots!

So maybe missing clock update is really the culprit here? What do You guys think?

n.
Re: regression - 2.6.36 - 2.6.37 - kvm - 32bit SMP guests don't boot
On Fri, 2011-03-04 at 21:55 +0100, Nikola Ciprich wrote:
>> Zach,
>>
>> I don't understand 100 % the logic behind all your tsc changes. But kvm-clock-wise, most of the problems we had in the past were related to the difference in resolution between the tsc and the host clocksource (hpet, acpi_pm, etc), which in his case, it is a non-issue.
>>
>> It does seem to me like some compensation logic kicked in, dismantling an otherwise good tsc. He does have nonstop_tsc, which means it can't get any better.
>>
>> One thing I noticed when reading the culprit patch in bisect, is that in vcpu_load(), there were previously a call to kvm_request_guest_time_update(vcpu) that was removed without a counterpart addition. Any idea about why it was done?
>>
>> Nikola, does adding that line back alleviate the problem for you ?
>
> Hello Glauber,
> kvm_request_guest_time_update seems to have been renamed and then removed since then, but I've added kvm_make_request(KVM_REQ_CLOCK_UPDATE, vcpu); instead and now the guest boots!
> So maybe missing clock update is really the culprit here? What do You guys think?
> n.

I think that although the long term plan is to do this update just once in your case (stable tsc), this update is needed. Why don't you send a patch to re-include it?
Re: regression - 2.6.36 - 2.6.37 - kvm - 32bit SMP guests don't boot
> I think that although the long term plan is to do this update just once in your case (stable tsc), this update is needed. Why don't you send a patch to re-include it?

Yes, I'll gladly submit a patch. One question: is it OK to just add the call to kvm_make_request(KVM_REQ_CLOCK_UPDATE, vcpu) before the conditional (as I did in my test), or should it go into an else {..} section? It's called again inside the conditional, so it will be called twice in some cases. Is that OK?

n.
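On the question of the request being raised twice: as far as I can tell, kvm_make_request() only sets a bit in the vcpu's pending-request mask, so a second call before the request is serviced should change nothing. Below is a minimal user-space sketch of that idea in C. The names req_vcpu, toy_make_request, toy_check_request and the bit number are invented for illustration and are not the real KVM definitions; the kernel uses atomic bit operations, which this sketch leaves out.

```c
#include <stdbool.h>

/* Hypothetical request number, chosen arbitrarily for this sketch. */
enum { TOY_REQ_CLOCK_UPDATE = 8 };

/* Hypothetical stand-in for the vcpu: one bit per pending request. */
struct req_vcpu {
    unsigned long requests;
};

/* Models raising a request: set the bit. Setting it again is a no-op. */
static void toy_make_request(int req, struct req_vcpu *vcpu)
{
    vcpu->requests |= 1UL << req;
}

/* Models servicing a request: report whether it was pending, then clear it. */
static bool toy_check_request(int req, struct req_vcpu *vcpu)
{
    bool pending = (vcpu->requests >> req) & 1UL;

    vcpu->requests &= ~(1UL << req);
    return pending;
}
```

Under these assumptions, raising the request once before the conditional and once inside it still yields a single pending clock update the next time the vcpu services its requests.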
Re: regression - 2.6.36 - 2.6.37 - kvm - 32bit SMP guests don't boot
On 03/04/2011 02:09 PM, Glauber Costa wrote:
> On Fri, 2011-03-04 at 19:27 +0100, Nikola Ciprich wrote:
>> Hello Zachary,
>>
>>> You don't see any messages about TSC being unstable or switching clocksource after loading the KVM module? And you are not suspending the host or anything?
>>
>> no messages, no suspending, nothing.
>>
>>> Can you try using processor.max_cstate=1 on the host as a kernel parameter and see if it makes a difference?
>>
>> I tried it, no change..
>> n.
>
> Zach,
>
> I don't understand 100 % the logic behind all your tsc changes. But kvm-clock-wise, most of the problems we had in the past were related to the difference in resolution between the tsc and the host clocksource (hpet, acpi_pm, etc), which in his case, it is a non-issue.
>
> It does seem to me like some compensation logic kicked in, dismantling an otherwise good tsc. He does have nonstop_tsc, which means it can't get any better.
>
> One thing I noticed when reading the culprit patch in bisect, is that in vcpu_load(), there were previously a call to kvm_request_guest_time_update(vcpu) that was removed without a counterpart addition. Any idea about why it was done?

That's probably the source of the bug... I've been looking for that exact line, though, and I can't find it missing.
Re: regression - 2.6.36 - 2.6.37 - kvm - 32bit SMP guests don't boot
On 03/04/2011 05:36 PM, Nikola Ciprich wrote:
>> I think although the long term plan is to just do this update once in your case (stable tsc), this update is needed. Why don't you send a patch to re-include it ?
>
> Yes, I'll gladly submit patch, one question, is this OK to just add calling kvm_make_request(KVM_REQ_CLOCK_UPDATE, vcpu) before the conditional (as I did in my test), or should it go somewhere to else {..} section? it's called inside the conditional again, which will cause it to be called twice in some cases, is it OK?
> n.

Let me write a patch to fix this..
Re: regression - 2.6.36 - 2.6.37 - kvm - 32bit SMP guests don't boot
On 03/04/2011 05:36 PM, Nikola Ciprich wrote:
>> I think although the long term plan is to just do this update once in your case (stable tsc), this update is needed. Why don't you send a patch to re-include it ?
>
> Yes, I'll gladly submit patch, one question, is this OK to just add calling kvm_make_request(KVM_REQ_CLOCK_UPDATE, vcpu) before the conditional (as I did in my test), or should it go somewhere to else {..} section? it's called inside the conditional again, which will cause it to be called twice in some cases, is it OK?
> n.

Can you try this patch to see if it fixes the problem?

Thanks,

Zach

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 468fafa..ba05303 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1866,6 +1866,7 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
 	}
 
 	kvm_x86_ops->vcpu_load(vcpu, cpu);
+	kvm_request_guest_time_update(vcpu);
 	if (unlikely(vcpu->cpu != cpu)) {
 		/* Make sure TSC doesn't go backwards */
 		s64 tsc_delta = !vcpu->arch.last_host_tsc ? 0 :
Re: regression - 2.6.36 - 2.6.37 - kvm - 32bit SMP guests don't boot
> Can you try this patch to see if it fixes the problem?

You haven't read my replies, have you? ;-)
kvm_request_guest_time_update seems to have been removed, and kvm_make_request(KVM_REQ_CLOCK_UPDATE, vcpu) seems to be used instead; adding it fixes the problem.
That's what I was going to use in the patch... :)

> Thanks,
>
> Zach
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 468fafa..ba05303 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -1866,6 +1866,7 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
>  	}
>  
>  	kvm_x86_ops->vcpu_load(vcpu, cpu);
> +	kvm_request_guest_time_update(vcpu);
>  	if (unlikely(vcpu->cpu != cpu)) {
>  		/* Make sure TSC doesn't go backwards */
>  		s64 tsc_delta = !vcpu->arch.last_host_tsc ? 0 :
Re: regression - 2.6.36 - 2.6.37 - kvm - 32bit SMP guests don't boot
On 03/03/2011 02:06 AM, Nikola Ciprich wrote:
>> No worries. What mess?
>
> twice sending the same mail, nevermind :)
>
>> I have two things you can try: first is running a single VCPU guest, if you have not done so already.
>
> yup, UP guest is fine, just SMP doesn't work.
>
>> Second is adding the bootparameter clocksource=acpi_pm to your guest kernel.
>
> yes, this makes SMP work too!
>
> I just realized when You were asking about current clocksource, I told You only host source, not the guest. So I checked now, and (at least for UP, I guess for SMP it's the same), the clocksource is kvm-clock! So seems like it got broken with the TSC changes?

What is the exact kernel version you are using in the guest?

It appears that some earlier 32-bit versions of kvm-clock enabled kernels are still missing the required atomic check for backwards-time protection, which would be needed on SMP. This explains why 64-bit is fine and 32-bit is not.

Why this change triggers that problem is still a slight mystery; logically it should only affect the system if you have an unstable TSC.

Zach
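To illustrate the kind of "atomic check for backwards-time protection" mentioned above, here is a minimal user-space sketch in C11. It is not the guest kernel's code: the function name monotonic_clamp and the global last_value are assumptions made for this example. The idea is that every reader publishes its clock reading through a compare-and-swap on one shared value, so no CPU can return a time earlier than one another CPU has already returned.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Largest clock value handed out so far, shared by all CPUs (hypothetical). */
static _Atomic uint64_t last_value;

/*
 * Return a reading that never goes backwards across callers.
 * `now` is this CPU's raw clock reading, which may lag another CPU's.
 */
static uint64_t monotonic_clamp(uint64_t now)
{
    uint64_t last = atomic_load(&last_value);

    do {
        /* Someone already returned a later time: reuse it. */
        if (now < last)
            return last;
        /* On failure, `last` is refreshed with the current shared value. */
    } while (!atomic_compare_exchange_weak(&last_value, &last, now));

    return now;
}
```

The update has to be a single atomic 64-bit operation; on a 32-bit SMP system a plain 64-bit load or store can be torn into two halves, which is one plausible reason the protection needs explicit support there.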
Re: regression - 2.6.36 - 2.6.37 - kvm - 32bit SMP guests don't boot
> What is the exact kernel version you are using in the guest.

It's latest centos (2.6.18-194.32.1.el5), so I guess there are a lot of fixes, but it's possible the kvm-clock is broken in it. I can't influence what kernel is used there (at least not on customer's guests), but I guess asking for adding clocksource kernel parameter is not problem.
Re: regression - 2.6.36 - 2.6.37 - kvm - 32bit SMP guests don't boot
On 03/03/2011 04:06 PM, Nikola Ciprich wrote:
>> What is the exact kernel version you are using in the guest.
>
> It's latest centos (2.6.18-194.32.1.el5), so I guess there are a lot of fixes, but it's possible the kvm-clock is broken in it. I can't influence what kernel is used there (at least not on customer's guests), but I guess asking for adding clocksource kernel parameter is not problem.

That sounds like a kernel which will be vulnerable to broken KVM clock on 32-bit. There's a kernel side fix that is needed, but why the server side change triggers the problem needs more investigation.

Zach
Re: regression - 2.6.36 - 2.6.37 - kvm - 32bit SMP guests don't boot
> That sounds like a kernel which will be vulnerable to broken KVM clock on 32-bit. There's a kernel side fix that is needed, but why the server side change triggers the problem needs more investigation.

OK, it's important for me that I can fix this by kernel parameter, but if I can help somehow with debugging, please let me know.

thanks for Your time!

nik
Re: regression - 2.6.36 - 2.6.37 - kvm - 32bit SMP guests don't boot
> (resend, sorry for the mess)

No worries. What mess?

I have two things you can try: first is running a single VCPU guest, if you have not done so already. Second is adding the boot parameter clocksource=acpi_pm to your guest kernel.

If either of those fixes the problem, it may very well have to do with this change, and not that you may be missing later dependent patches.

This change should be nearly a 1-1 transformation, and if it is not, something is wrong. What branch are you bisecting on, the kvm branch or the kernel tree itself? It would be helpful to see the exact code in case any surrounding logic changed.

Thanks,

Zach
Re: regression - 2.6.36 - 2.6.37 - kvm - 32bit SMP guests don't boot
> No worries. What mess?

twice sending the same mail, nevermind :)

> I have two things you can try: first is running a single VCPU guest, if you have not done so already.

yup, UP guest is fine, just SMP doesn't work.

> Second is adding the bootparameter clocksource=acpi_pm to your guest kernel.

yes, this makes SMP work too!

I just realized when You were asking about current clocksource, I told You only host source, not the guest. So I checked now, and (at least for UP, I guess for SMP it's the same), the clocksource is kvm-clock! So seems like it got broken with the TSC changes?

> If either of those fixes the problem, it very well have to do with this change and not that you may be missing later dependent patches.
>
> This change should be nearly a 1-1 transformation, and if it is not, something is wrong. What branch are you bisecting on, the kvm branch or the kernel tree itself? It would be helpful to see the exact code in case any surrouding logic changed.

I was bisecting linus' linux-2.6.git main branch, between 2.6.36..2.6.37
Re: regression - 2.6.36 - 2.6.37 - kvm - 32bit SMP guests don't boot
On 02/27/2011 12:20 PM, Nikola Ciprich wrote:
>> I was not aware of the thread. Please cc me directly, or add a keyword I track - timekeeping, TSC..
>
> Hello Zachary, thanks for Your time looking at this!
>
>> That change alone may not bisect well; without further fixes on top of it, you may end up with a hang or stall, which is likely to manifest in a vendor-specific way.
>
> I'm not sure I really understand You here, but this change is exactly to what I got while bisecting. With later revisions, including this one, 32bit SMP guests don't boot, before it, they do..

Does the bug you are hitting manifest on both Intel and AMD platforms?

Further, do the systems you are hitting this on have stable or unstable TSCs?

Thanks,

Zach
Re: regression - 2.6.36 - 2.6.37 - kvm - 32bit SMP guests don't boot
> Does the bug you are hitting manifest on both Intel and AMD platforms?

I don't have any AMD box here, I'll try this out at my home box.

> Further, do the systems you are hitting this on have stable or unstable TSCs?

how do I find this out? I don't see any warning about TSC in guest, but I've just started it..

n.
Re: regression - 2.6.36 - 2.6.37 - kvm - 32bit SMP guests don't boot
On 02/28/2011 09:32 AM, Nikola Ciprich wrote:
>> Does the bug you are hitting manifest on both Intel and AMD platforms?
>
> I don't have any AMD box here, I'll try this out at my home box.
>
>> Further, do the systems you are hitting this on have stable or unstable TSCs?
>
> how do I find this out? I don't see any warning about TSC in guest, but I've just started it..
> n.

Before worrying about the guest, is the host TSC stable? What is the host clocksource?
Re: regression - 2.6.36 - 2.6.37 - kvm - 32bit SMP guests don't boot
On Mon, Feb 28, 2011 at 10:17:24AM -0500, Zachary Amsden wrote:
> On 02/28/2011 09:32 AM, Nikola Ciprich wrote:
>>> Does the bug you are hitting manifest on both Intel and AMD platforms?
>>
>> I don't have any AMD box here, I'll try this out at my home box.
>>
>>> Further, do the systems you are hitting this on have stable or unstable TSCs?
>>
>> how do I find this out? I don't see any warning about TSC in guest, but I've just started it..
>> n.
>
> Before worrying about the guest, is the host TSC stable? What is the host clocksource?

not sure, I'm not setting anything specifically, is this snippet of dmesg relevant:

[1.148829] HPET: 8 timers in total, 5 timers will be used for per-cpu timer
[1.148934] hpet0: at MMIO 0xfed0, IRQs 2, 8, 40, 41, 42, 43, 44, 0
[1.149331] hpet0: 8 comparators, 64-bit 14.318180 MHz counter
[1.151831] hpet: hpet2 irq 40 for MSI
[1.151962] hpet: hpet3 irq 41 for MSI
[1.155930] hpet: hpet4 irq 42 for MSI
[1.159937] hpet: hpet5 irq 43 for MSI
[1.163943] hpet: hpet6 irq 44 for MSI
[1.175955] Switching to clocksource tsc

so I guess I'm using hpet?

n.
Re: regression - 2.6.36 - 2.6.37 - kvm - 32bit SMP guests don't boot
> On Mon, Feb 28, 2011 at 10:17:24AM -0500, Zachary Amsden wrote:
>> On 02/28/2011 09:32 AM, Nikola Ciprich wrote:
>>>> Does the bug you are hitting manifest on both Intel and AMD platforms?
>>>
>>> I don't have any AMD box here, I'll try this out at my home box.
>>>
>>>> Further, do the systems you are hitting this on have stable or unstable TSCs?
>>>
>>> how do I find this out? I don't see any warning about TSC in guest, but I've just started it..
>>> n.
>>
>> Before worrying about the guest, is the host TSC stable? What is the host clocksource?
>
> not sure, I'm not setting anything specifically, is this snippet of dmesg relevant:
>
> [1.148829] HPET: 8 timers in total, 5 timers will be used for per-cpu timer
> [1.148934] hpet0: at MMIO 0xfed0, IRQs 2, 8, 40, 41, 42, 43, 44, 0
> [1.149331] hpet0: 8 comparators, 64-bit 14.318180 MHz counter
> [1.151831] hpet: hpet2 irq 40 for MSI
> [1.151962] hpet: hpet3 irq 41 for MSI
> [1.155930] hpet: hpet4 irq 42 for MSI
> [1.159937] hpet: hpet5 irq 43 for MSI
> [1.163943] hpet: hpet6 irq 44 for MSI
> [1.175955] Switching to clocksource tsc
>
> so I guess I'm using hpet?
> n.

Looks like you are using tsc, based on the last line.

Can you tell us please:

cat /proc/cpuinfo
cat /sys/devices/system/clocksource/clocksource0/current_clocksource

and grep -i dmesg for these keywords: TSC, clock, hpet, stable, khz, kvm
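As a rough aid for reading the /proc/cpuinfo output requested above, the sketch below checks a "flags" line for the two TSC-related flags that come up in this thread, constant_tsc and nonstop_tsc. The helper names has_cpu_flag and tsc_flags_look_stable are invented for this example. The flags are only a hint: the kernel can still mark the TSC unstable at runtime, so the dmesg output and current_clocksource remain the authoritative checks.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* True if `flag` occurs in `flags` as a whole whitespace-separated word. */
static bool has_cpu_flag(const char *flags, const char *flag)
{
    size_t len = strlen(flag);
    const char *p = flags;

    if (len == 0)
        return false;

    while ((p = strstr(p, flag)) != NULL) {
        bool starts = (p == flags) || p[-1] == ' ' || p[-1] == '\t';
        bool ends = p[len] == '\0' || p[len] == ' ' || p[len] == '\n';

        if (starts && ends)
            return true;
        p++;    /* keep scanning past this partial match */
    }
    return false;
}

/* Hint only: both flags present suggests a TSC usable as a clocksource. */
static bool tsc_flags_look_stable(const char *flags)
{
    return has_cpu_flag(flags, "constant_tsc") &&
           has_cpu_flag(flags, "nonstop_tsc");
}
```

The whole-word match matters because "tsc" also appears inside "constant_tsc", "nonstop_tsc" and "rdtscp".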
Re: regression - 2.6.36 - 2.6.37 - kvm - 32bit SMP guests don't boot
> cat /proc/cpuinfo

processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 30
model name      : Intel(R) Xeon(R) CPU X3440 @ 2.53GHz
stepping        : 5
cpu MHz         : 2533.185
cache size      : 8192 KB
physical id     : 0
siblings        : 8
core id         : 0
cpu cores       : 4
apicid          : 0
initial apicid  : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 11
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf pni dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm sse4_1 sse4_2 popcnt lahf_lm ida dts tpr_shadow vnmi flexpriority ept vpid
bogomips        : 5066.37
clflush size    : 64
cache_alignment : 64
address sizes   : 36 bits physical, 48 bits virtual
power management:
.
.
.
.
.
processor       : 7
vendor_id       : GenuineIntel
cpu family      : 6
model           : 30
model name      : Intel(R) Xeon(R) CPU X3440 @ 2.53GHz
stepping        : 5
cpu MHz         : 2533.185
cache size      : 8192 KB
physical id     : 0
siblings        : 8
core id         : 3
cpu cores       : 4
apicid          : 7
initial apicid  : 7
fpu             : yes
fpu_exception   : yes
cpuid level     : 11
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf pni dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm sse4_1 sse4_2 popcnt lahf_lm ida dts tpr_shadow vnmi flexpriority ept vpid
bogomips        : 5066.35
clflush size    : 64
cache_alignment : 64
address sizes   : 36 bits physical, 48 bits virtual
power management:

> cat /sys/devices/system/clocksource/clocksource0/current_clocksource

[root@vbox5 ~]# cat /sys/devices/system/clocksource/clocksource0/current_clocksource
tsc

> and grep -i dmesg for these keywords: TSC, clock, hpet, stable, khz, kvm

[root@vbox5 ~]# dmesg | grep -i tsc\|clock\|hpet\|stable\|stable\|khz\|kvm
[0.00] ACPI: HPET bf7aa5f0 00038 (v01 052710 OEMHPET 20100527 MSFT 0097)
[0.00] ACPI: HPET id: 0x8086a701 base: 0xfed0
[0.00] hpet clockevent registered
[0.00] Fast TSC calibration using PIT
[1.148829] HPET: 8 timers in total, 5 timers will be used for per-cpu timer
[1.148934] hpet0: at MMIO 0xfed0, IRQs 2, 8, 40, 41, 42, 43, 44, 0
[1.149331] hpet0: 8 comparators, 64-bit 14.318180 MHz counter
[1.151831] hpet: hpet2 irq 40 for MSI
[1.151962] hpet: hpet3 irq 41 for MSI
[1.155930] hpet: hpet4 irq 42 for MSI
[1.159937] hpet: hpet5 irq 43 for MSI
[1.163943] hpet: hpet6 irq 44 for MSI
[1.175955] Switching to clocksource tsc
[1.260015] CE: hpet3 increased min_delta_ns to 7500 nsec
[1.260117] CE: hpet3 increased min_delta_ns to 11250 nsec
[1.294150] Real Time Clock Driver v1.12b
[7.564355] CE: hpet4 increased min_delta_ns to 7500 nsec
[7.564367] CE: hpet4 increased min_delta_ns to 11250 nsec
[ 299.307242] CE: hpet2 increased min_delta_ns to 7500 nsec
[ 299.307251] CE: hpet2 increased min_delta_ns to 11250 nsec
[ 1414.616685] CE: hpet5 increased min_delta_ns to 7500 nsec
[ 1414.616694] CE: hpet5 increased min_delta_ns to 11250 nsec
[ 5241.474310] CE: hpet6 increased min_delta_ns to 7500 nsec
[ 5241.474321] CE: hpet6 increased min_delta_ns to 11250 nsec
Re: regression - 2.6.36 - 2.6.37 - kvm - 32bit SMP guests don't boot
> I was not aware of the thread. Please cc me directly, or add a keyword I track - timekeeping, TSC..

Hello Zachary, thanks for Your time looking at this!

> That change alone may not bisect well; without further fixes on top of it, you may end up with a hang or stall, which is likely to manifest in a vendor-specific way.

I'm not sure I really understand You here, but this change is exactly where I ended up while bisecting. With later revisions, including this one, 32bit SMP guests don't boot; before it, they do.

> Basically there were a few differences in the platform code about how TSC was dealt with on systems which did not have stable clocks, this brought the logic into one location, but there was a slight change to the logic here.
>
> Note very carefully, the logic on SVM is gated by a condition before this change:
>
>  	if (unlikely(cpu != vcpu->cpu)) {
> -		u64 delta;
> -
> -		if (check_tsc_unstable()) {
> -			/*
> -			 * Make sure that the guest sees a monotonically
> -			 * increasing TSC.
> -			 */
> -			delta = vcpu->arch.host_tsc - native_read_tsc();
> -			svm->vmcb->control.tsc_offset += delta;
> -			if (is_nested(svm))
> -				svm->nested.hsave->control.tsc_offset += delta;
> -		}
> -		vcpu->cpu = cpu;
> -		kvm_migrate_timers(vcpu);
>
> So this only happens with a system which reports TSC as unstable. After the change, KVM itself may report the TSC as unstable:
>
> +	if (unlikely(vcpu->cpu != cpu)) {
> +		/* Make sure TSC doesn't go backwards */
> +		s64 tsc_delta = !vcpu->arch.last_host_tsc ? 0 :
> +				native_read_tsc() - vcpu->arch.last_host_tsc;
> +		if (tsc_delta < 0)
> +			mark_tsc_unstable("KVM discovered backwards TSC");
> +		if (check_tsc_unstable())
> +			kvm_x86_ops->adjust_tsc_offset(vcpu, -tsc_delta);
> +		kvm_migrate_timers(vcpu);
> +		vcpu->cpu = cpu;
> +	}
>
> If the platform has very small TSC deltas across CPUs, but indicates the TSC is stable, this could result in KVM marking the TSC unstable. If that is the case, this compensation logic will kick in to avoid backwards TSCs.
>
> Note however, that the logic is not perfect; time which passes while not running on any CPU will be erased, as the delta compensation removes not just backwards, but any elapsed time from the TSC. In extreme cases, this could result in time appearing to stand still with guests failing to boot.
>
> This was addressed with a later change, which catches up the missing time:
>
> commit c285545f813d7b0ce989fd34e42ad1fe785dc65d

Yes, but this change is already included in 2.6.37, so maybe some other fix is needed? If You have some idea what could be changed, I'll gladly test whatever You recommend, but I'm afraid that's all I can do, since this is a bit of rocket science for me, sorry :(

nik

> Author: Zachary Amsden <zams...@redhat.com>
> Date:   Sat Sep 18 14:38:15 2010 -1000
>
>     KVM: x86: TSC catchup mode
>
>     Negate the effects of AN TYM spell while kvm thread is preempted by tracking conversion factor to the highest TSC rate and catching the TSC up when it has fallen behind the kernel view of time. Note that once triggered, we don't turn off catchup mode.
>
>     A slightly more clever version of this is possible, which only does catchup when TSC rate drops, and which specifically targets only CPUs with broken TSC, but since these all are considered unstable_tsc(), this patch covers all necessary cases.
>
>     Signed-off-by: Zachary Amsden <zams...@redhat.com>
>     Signed-off-by: Marcelo Tosatti <mtosa...@redhat.com>

--
Ing. Nikola CIPRICH
LinuxBox.cz, s.r.o.
Re: regression - 2.6.36 - 2.6.37 - kvm - 32bit SMP guests don't boot
(CC: Zachary)

Hello, Zachary,

in case You haven't noticed the thread, we're trying to find out why 32bit SMP guests stopped working in 2.6.37. Bisect shows this as the culprit:

e48672fa25e879f7ae21785c7efd187738139593 is first bad commit
commit e48672fa25e879f7ae21785c7efd187738139593
Author: Zachary Amsden <zams...@redhat.com>
Date:   Thu Aug 19 22:07:23 2010 -1000

    KVM: x86: Unify TSC logic

    Move the TSC control logic from the vendor backends into x86.c by adding adjust_tsc_offset to x86 ops. Now all TSC decisions can be done in one place.

    Signed-off-by: Zachary Amsden <zams...@redhat.com>
    Signed-off-by: Marcelo Tosatti <mtosa...@redhat.com>

Unfortunately I couldn't try 2.6.37 with just this one reverted, since other patches certainly rely on it, but hopefully I haven't screwed something up while bisecting... So what now?

n.

--
Ing. Nikola CIPRICH
LinuxBox.cz, s.r.o.
Re: regression - 2.6.36 - 2.6.37 - kvm - 32bit SMP guests don't boot
On 02/25/2011 05:48 AM, Nikola Ciprich wrote:
> (CC: Zachary)
>
> Hello, Zachary,
>
> in case You haven't noticed the thread, we're trying to find out the reason why 32bit SMP guests stopped working in 2.6.37. bisect shows this as the culprit:

I was not aware of the thread. Please cc me directly, or add a keyword I track - timekeeping, TSC..

> e48672fa25e879f7ae21785c7efd187738139593 is first bad commit
> commit e48672fa25e879f7ae21785c7efd187738139593
> Author: Zachary Amsden <zams...@redhat.com>
> Date:   Thu Aug 19 22:07:23 2010 -1000
>
>     KVM: x86: Unify TSC logic
>
>     Move the TSC control logic from the vendor backends into x86.c by adding adjust_tsc_offset to x86 ops. Now all TSC decisions can be done in one place.
>
>     Signed-off-by: Zachary Amsden <zams...@redhat.com>
>     Signed-off-by: Marcelo Tosatti <mtosa...@redhat.com>

That change alone may not bisect well; without further fixes on top of it, you may end up with a hang or stall, which is likely to manifest in a vendor-specific way.

Basically there were a few differences in the platform code about how TSC was dealt with on systems which did not have stable clocks. This change brought the logic into one location, but there was a slight change to the logic here.

Note very carefully, the logic on SVM is gated by a condition before this change:

 	if (unlikely(cpu != vcpu->cpu)) {
-		u64 delta;
-
-		if (check_tsc_unstable()) {
-			/*
-			 * Make sure that the guest sees a monotonically
-			 * increasing TSC.
-			 */
-			delta = vcpu->arch.host_tsc - native_read_tsc();
-			svm->vmcb->control.tsc_offset += delta;
-			if (is_nested(svm))
-				svm->nested.hsave->control.tsc_offset += delta;
-		}
-		vcpu->cpu = cpu;
-		kvm_migrate_timers(vcpu);

So this only happens with a system which reports TSC as unstable. After the change, KVM itself may report the TSC as unstable:

+	if (unlikely(vcpu->cpu != cpu)) {
+		/* Make sure TSC doesn't go backwards */
+		s64 tsc_delta = !vcpu->arch.last_host_tsc ? 0 :
+				native_read_tsc() - vcpu->arch.last_host_tsc;
+		if (tsc_delta < 0)
+			mark_tsc_unstable("KVM discovered backwards TSC");
+		if (check_tsc_unstable())
+			kvm_x86_ops->adjust_tsc_offset(vcpu, -tsc_delta);
+		kvm_migrate_timers(vcpu);
+		vcpu->cpu = cpu;
+	}

If the platform has very small TSC deltas across CPUs, but indicates the TSC is stable, this could result in KVM marking the TSC unstable. If that is the case, this compensation logic will kick in to avoid backwards TSCs.

Note however, that the logic is not perfect; time which passes while not running on any CPU will be erased, as the delta compensation removes not just backwards, but any elapsed time from the TSC. In extreme cases, this could result in time appearing to stand still with guests failing to boot.

This was addressed with a later change, which catches up the missing time:

commit c285545f813d7b0ce989fd34e42ad1fe785dc65d
Author: Zachary Amsden <zams...@redhat.com>
Date:   Sat Sep 18 14:38:15 2010 -1000

    KVM: x86: TSC catchup mode

    Negate the effects of AN TYM spell while kvm thread is preempted by tracking conversion factor to the highest TSC rate and catching the TSC up when it has fallen behind the kernel view of time. Note that once triggered, we don't turn off catchup mode.

    A slightly more clever version of this is possible, which only does catchup when TSC rate drops, and which specifically targets only CPUs with broken TSC, but since these all are considered unstable_tsc(), this patch covers all necessary cases.

    Signed-off-by: Zachary Amsden <zams...@redhat.com>
    Signed-off-by: Marcelo Tosatti <mtosa...@redhat.com>
Re: regression - 2.6.36 - 2.6.37 - kvm - 32bit SMP guests don't boot
On 02/24/2011 01:42 AM, Nikola Ciprich wrote:
> Hello Avi et al,
>
> it seems I've hit a regression in 2.6.37: 32bit SMP CentOS guests stopped booting, they just hang during the initrd phase (I haven't tried different distros). UP guests are OK. When I (forcibly) compiled kvm-kmod-2.6.36.2 and used it in 2.6.37, even the SMP guests booted fine. Does somebody have a tip on where the problem could be, or should I bisect this? I tried on 2 different machines; host is x86_64, qemu-kvm 0.13.0, 0.14.0. If I should provide more information (or bisect), please let me know.

Bisect is of course great, if laborious.

Meanwhile can you post 'info registers' for all cpus? Is the guest consuming cpu? kvm_stat output?

--
error compiling committee.c: too many arguments to function
Re: regression - 2.6.36 - 2.6.37 - kvm - 32bit SMP guests don't boot
On Thu, Feb 24, 2011 at 12:17:40PM +0200, Avi Kivity wrote:
> On 02/24/2011 01:42 AM, Nikola Ciprich wrote:
>> Hello Avi et al,
>>
>> it seems I've hit a regression in 2.6.37: 32bit SMP CentOS guests stopped booting, they just hang during the initrd phase (I haven't tried different distros). UP guests are OK. When I (forcibly) compiled kvm-kmod-2.6.36.2 and used it in 2.6.37, even the SMP guests booted fine. Does somebody have a tip on where the problem could be, or should I bisect this? I tried on 2 different machines; host is x86_64, qemu-kvm 0.13.0, 0.14.0. If I should provide more information (or bisect), please let me know.
>
> Bisect is of course great, if laborious.
>
> Meanwhile can you post 'info registers' for all cpus? Is the guest consuming cpu? kvm_stat output?

Yes, it's eating 100% of one CPU core.

kvm_stat for a few seconds (the hung guest is the only one running on the host):

kvm_entry                      29327   9091
kvm_exit                       29357   9090
kvm_inj_virq                   24588   7609
kvm_apic_accept_irq            17146   5310
kvm_emulate_insn               12682   3931
kvm_apic                       12530   3879
kvm_mmio                       12525   3879
kvm_exit(APIC_ACCESS)          12525   3879
kvm_exit(HLT)                  11262   3466
kvm_ioapic_set_irq              6532   2024
kvm_set_irq                     6538   2024
kvm_pic_set_irq                 6536   2024
kvm_exit(EXTERNAL_INTERRUPT)    4255   1300
kvm_ack_irq                     2442    756
kvm_exit(PENDING_INTERRUPT)     1030    335
kvm_exit(IO_INSTRUCTION)         313    104
kvm_pio                          312    104
kvm_age_page                      18      6
kvm_exit(EPT_VIOLATION)           14      4
kvm_page_fault                    12      4
kvm_exit(INVALID_STATE)            4      0
kvm_exit(VMLAUNCH)                 3      0
kvm_exit(CPUID)                    3      0
kvm_exit(DR_ACCESS)                2      0
kvm_exit(MSR_READ)                 2      0
kvm_exit(PAUSE_INSTRUCTION)        1      0

info registers:

EAX= EBX=6a00 ECX=000a EDX=000f41a8
ESI=000f41a8 EDI= EBP=c0690320 ESP=c0769f58
EIP=c042d137 EFL=0002 [---] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =007b 00c0f300 DPL=3 DS [-WA]
CS =0060 00c09b00 DPL=0 CS32 [-RA]
SS =0068 00c09300 DPL=0 DS [-WA]
DS =007b 00c0f300 DPL=3 DS [-WA]
FS =
GS =
LDT=0088 c0747020 0027 8200 DPL=0 LDT
TR =0080 c300f380 2073 8b00 DPL=0 TSS32-busy
GDT= c302b000 00ff
IDT= c06f7000 07ff
CR0=8005003b CR2=ffc46000 CR3=00743000 CR4=06d0
DR0= DR1= DR2= DR3=
DR6=0ff0 DR7=0400
EFER=
FCW=037f FSW= [ST=0] FTW=00 MXCSR=1f80
FPR0= FPR1= FPR2= FPR3=
FPR4= FPR5= FPR6=800bf600 4015 FPR7=
XMM00= XMM01= XMM02= XMM03=
XMM04= XMM05= XMM06= XMM07=

I'll wait a bit with the bisect to see whether You spot some obvious bug or not ;)

Thanks for Your time!

PS: I still owe You the kvm_stat comparison for the slow Windows chkdsk problem. I'm aware of it, I just had to postpone it due to more urgent matters :( but I'll get back to it sooner or later..

--
Ing. Nikola CIPRICH
LinuxBox.cz, s.r.o.
Re: regression - 2.6.36 - 2.6.37 - kvm - 32bit SMP guests don't boot
On 02/24/2011 12:48 PM, Nikola Ciprich wrote:
> On Thu, Feb 24, 2011 at 12:17:40PM +0200, Avi Kivity wrote:
>> On 02/24/2011 01:42 AM, Nikola Ciprich wrote:
>>> Hello Avi et al,
>>>
>>> it seems I've hit a regression in 2.6.37: 32bit SMP CentOS guests stopped booting, they just hang during the initrd phase (I haven't tried different distros). UP guests are OK. When I (forcibly) compiled kvm-kmod-2.6.36.2 and used it in 2.6.37, even the SMP guests booted fine. Does somebody have a tip on where the problem could be, or should I bisect this? I tried on 2 different machines; host is x86_64, qemu-kvm 0.13.0, 0.14.0. If I should provide more information (or bisect), please let me know.
>>
>> Bisect is of course great, if laborious.
>>
>> Meanwhile can you post 'info registers' for all cpus? Is the guest consuming cpu? kvm_stat output?
>
> Yes, it's eating 100% of one CPU core.
>
> kvm_stat for a few seconds (the hung guest is the only one running on the host):
>
> kvm_entry                      29327   9091
> kvm_exit                       29357   9090
> kvm_inj_virq                   24588   7609
> kvm_apic_accept_irq            17146   5310
> kvm_emulate_insn               12682   3931
> kvm_apic                       12530   3879
> kvm_mmio                       12525   3879
> kvm_exit(APIC_ACCESS)          12525   3879
> kvm_exit(HLT)                  11262   3466
> kvm_ioapic_set_irq              6532   2024
> kvm_set_irq                     6538   2024
> kvm_pic_set_irq                 6536   2024
> kvm_exit(EXTERNAL_INTERRUPT)    4255   1300
> kvm_ack_irq                     2442    756
> kvm_exit(PENDING_INTERRUPT)     1030    335
> kvm_exit(IO_INSTRUCTION)         313    104
> kvm_pio                          312    104
> kvm_age_page                      18      6
> kvm_exit(EPT_VIOLATION)           14      4
> kvm_page_fault                    12      4
> kvm_exit(INVALID_STATE)            4      0
> kvm_exit(VMLAUNCH)                 3      0
> kvm_exit(CPUID)                    3      0
> kvm_exit(DR_ACCESS)                2      0
> kvm_exit(MSR_READ)                 2      0
> kvm_exit(PAUSE_INSTRUCTION)        1      0

Guest is churning along.

> info registers:
>
> EAX= EBX=6a00 ECX=000a EDX=000f41a8
> ESI=000f41a8 EDI= EBP=c0690320 ESP=c0769f58
> EIP=c042d137 EFL=0002 [---] CPL=0 II=0 A20=1 SMM=0 HLT=0

Not very useful when the guest is making progress, I'm afraid.

> I'll wait a bit with the bisect to see whether You spot some obvious bug or not ;)
>
> Thanks for Your time!

Can you try a little trace-cmd -e kvm -b 2?

> PS: I still owe You the kvm_stat comparison for the slow Windows chkdsk problem. I'm aware of it, I just had to postpone it due to more urgent matters :( but I'll get back to it sooner or later..

Sure. Something similar that came up - sometimes Windows IDE drivers fall back to PIO mode. Are you using IDE? If so, please check whether it's using DMA or PIO.

--
error compiling committee.c: too many arguments to function
Re: regression - 2.6.36 - 2.6.37 - kvm - 32bit SMP guests don't boot
> Not very useful when the guest is making progress, I'm afraid.

Can perf report help here?

> Can you try a little trace-cmd -e kvm -b 2?

Ugh, I'm afraid I'll have some dumb questions here :-[
You mean this: git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/trace-cmd.git ?
And then re-execute qemu-kvm under it? Or am I totally wrong?

> Sure. Something similar that came up - sometimes Windows IDE drivers fall back to PIO mode. Are you using IDE? If so, please check whether it's using DMA or PIO.

I'll check, but this problem occurs only during the fsck phase; once the guest boots, it runs pretty fast. So maybe during boot it falls back to PIO, but I guess I won't have a chance to find that out from the guest. Can I somehow check it from the host?

--
Ing. Nikola CIPRICH
LinuxBox.cz, s.r.o.
Re: regression - 2.6.36 - 2.6.37 - kvm - 32bit SMP guests don't boot
On 02/24/2011 01:27 PM, Nikola Ciprich wrote:
>> Not very useful when the guest is making progress, I'm afraid.
>
> Can perf report help here?
>
>> Can you try a little trace-cmd -e kvm -b 2?
>
> Ugh, I'm afraid I'll have some dumb questions here :-[
> You mean this: git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/trace-cmd.git ?

Yes. If you have udis86 and udis86-devel installed when building it, it's even better.

> And then re-execute qemu-kvm under it? Or am I totally wrong?

You don't have to execute qemu-kvm under it. If you have a running instance, you can run trace-cmd in parallel and it will record whatever's happening.

>> Sure. Something similar that came up - sometimes Windows IDE drivers fall back to PIO mode. Are you using IDE? If so, please check whether it's using DMA or PIO.
>
> I'll check, but this problem occurs only during the fsck phase; once the guest boots, it runs pretty fast. So maybe during boot it falls back to PIO, but I guess I won't have a chance to find that out from the guest. Can I somehow check it from the host?

The trace-cmd output will show. Please run trace-cmd report afterwards and post the results somewhere.

--
error compiling committee.c: too many arguments to function
Re: regression - 2.6.36 - 2.6.37 - kvm - 32bit SMP guests don't boot
On 02/24/2011 02:41 PM, Nikola Ciprich wrote:
>> Yes. If you have udis86 and udis86-devel installed when building it, it's even better.
>
> yes, now I remember! I've done some tracing for You already..
>
>> You don't have to execute qemu-kvm under it. If you have a running instance, you can run trace-cmd in parallel and it will record whatever's happening.
>
> I've uploaded the report for You here: nelide.cz/downloads/nik/report.txt.xz

The only activity I can see is the timer interrupt, so I'm afraid a bisect is needed.

If you let git bisect just kvm, it'll be a bit faster:

$ git bisect start $BAD $GOOD -- virt/kvm arch/x86/kvm

--
error compiling committee.c: too many arguments to function
Re: regression - 2.6.36 - 2.6.37 - kvm - 32bit SMP guests don't boot
> The only activity I can see is the timer interrupt, so I'm afraid a bisect is needed.

OK, never mind, it's easy to reproduce, so I'll just bisect it and report.

n.

> If you let git bisect just kvm, it'll be a bit faster:
>
> $ git bisect start $BAD $GOOD -- virt/kvm arch/x86/kvm

--
Ing. Nikola CIPRICH
LinuxBox.cz, s.r.o.
Re: regression - 2.6.36 - 2.6.37 - kvm - 32bit SMP guests don't boot
> Yes. If you have udis86 and udis86-devel installed when building it, it's even better.

yes, now I remember! I've done some tracing for You already..

> You don't have to execute qemu-kvm under it, if you have a running instance you can run trace-cmd in parallel and it will record whatever's happening.

I've uploaded the report for You here: nelide.cz/downloads/nik/report.txt.xz

> The trace-cmd output will show. Please run trace-cmd report afterwards and post the results somewhere.

OK, I'll prepare a new Windows testing machine, try it and report..

--
Ing. Nikola CIPRICH
LinuxBox.cz, s.r.o.