Re: regression - 2.6.36 - 2.6.37 - kvm - 32bit SMP guests don't boot

2011-03-06 Thread Zachary Amsden

On 03/05/2011 02:21 AM, Nikola Ciprich wrote:
   

Can you try this patch to see if it fixes the problem?
 

You haven't read my replies, have you? ;-)
kvm_request_guest_time_update seems to have been
removed, and kvm_make_request(KVM_REQ_CLOCK_UPDATE, vcpu)
seems to be used instead, adding it fixes the problem.
That's what I was going to use in the patch... :)
   


I did read your mail, but I was working on an old tree... because of 
that transformation, this fix will unfortunately have to be back and 
forward ported by hand.


Did you try applying just that change directly on top of the patch 
(e48672fa25e879f7ae21785c7efd187738139593) implicated by bisect?


It would be great to know if that change alone fixes the problem; if so, 
the fix you propose is probably the right one for upstream.


Thanks,

Zach
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: regression - 2.6.36 - 2.6.37 - kvm - 32bit SMP guests don't boot

2011-03-06 Thread Nikola Ciprich
 I did read your mail, but I was working on an old tree... because of  
 that transformation, this fix will unfortunately have to be back and  
 forward ported by hand.
OK, sorry, I didn't mean to be adverse...


 Did you try just that change right applied on top of the patch  
 (e48672fa25e879f7ae21785c7efd187738139593) implicated by bisect?
yes, with host running e48672fa25e879f7ae21785c7efd187738139593,
32bit SMP guest doesn't boot, when I add kvm_request_guest_time_update(vcpu),
it helps.


 It will be great to know if that change alone fixes the problem, if so,  
 the fix you propose is probably the right one for upstream.
ok, so shall I submit a patch adding kvm_make_request(KVM_REQ_CLOCK_UPDATE, vcpu)?
this fixes things for me for 2.6.37.


 Thanks,

 Zach


-- 
-
Ing. Nikola CIPRICH
LinuxBox.cz, s.r.o.
28. rijna 168, 709 01 Ostrava

tel.:   +420 596 603 142
fax:    +420 596 621 273
mobil:  +420 777 093 799

www.linuxbox.cz

mobil servis: +420 737 238 656
email servis: ser...@linuxbox.cz
-


Re: regression - 2.6.36 - 2.6.37 - kvm - 32bit SMP guests don't boot

2011-03-04 Thread Zachary Amsden

On 03/03/2011 05:01 PM, Nikola Ciprich wrote:

That sounds like a kernel which will be vulnerable to broken KVM clock
on 32-bit.  There's a kernel side fix that is needed, but why the server
side change triggers the problem needs more investigation.
 

OK, it's important for me that I can fix this by kernel parameter,
but if I can help somehow with debugging, please let me know.
thanks for Your time!
nik
   


You don't see any messages about TSC being unstable or switching 
clocksource after loading the KVM module?  And you are not suspending 
the host or anything?


Can you try using processor.max_cstate=1 on the host as a kernel 
parameter and see if it makes a difference?



Re: regression - 2.6.36 - 2.6.37 - kvm - 32bit SMP guests don't boot

2011-03-04 Thread Nikola Ciprich
Hello Zachary,

 You don't see any messages about TSC being unstable or switching  
 clocksource after loading the KVM module?  And you are not suspending  
 the host or anything?
no messages, no suspending, nothing.


 Can you try using processor.max_cstate=1 on the host as a kernel  
 parameter and see if it makes a difference?
I tried it, no change..
n.




Re: regression - 2.6.36 - 2.6.37 - kvm - 32bit SMP guests don't boot

2011-03-04 Thread Glauber Costa
On Fri, 2011-03-04 at 19:27 +0100, Nikola Ciprich wrote:
 Hello Zachary,
 
  You don't see any messages about TSC being unstable or switching  
  clocksource after loading the KVM module?  And you are not suspending  
  the host or anything?
 no messages, no suspending, nothing.
 
 
  Can you try using processor.max_cstate=1 on the host as a kernel  
  parameter and see if it makes a difference?
 I tried it, no change..
 n.

Zach,

I don't understand 100 % the logic behind all your tsc changes.
But kvm-clock-wise, most of the problems we had in the past were related
to the difference in resolution between the tsc and the host clocksource
(hpet, acpi_pm, etc), which in his case is a non-issue.

It does seem to me like some compensation logic kicked in, dismantling
an otherwise good tsc. He does have nonstop_tsc, which means it can't
get any better.

One thing I noticed when reading the culprit patch in bisect is that in
vcpu_load(), there was previously a call to

 kvm_request_guest_time_update(vcpu)

that was removed without a counterpart addition. Any idea about why it
was done?

Nikola, does adding that line back alleviate the problem for you ?



Re: regression - 2.6.36 - 2.6.37 - kvm - 32bit SMP guests don't boot

2011-03-04 Thread Nikola Ciprich
 Zach,
 
 I don't understand 100 % the logic behind all your tsc changes.
 But kvm-clock-wise, most of the problems we had in the past were related
 to the difference in resolution between the tsc and the host clocksource
 (hpet, acpi_pm, etc), which in his case, it is a non-issue.
 
 It does seem to me like some compensation logic kicked in, dismantling
 an otherwise good tsc. He does have nonstop_tsc, which means it can't
 get any better.
 
 One thing I noticed when reading the culprit patch in bisect, is that in
 vcpu_load(), there were previously a call to  
 
  kvm_request_guest_time_update(vcpu)
 
 that was removed without a counterpart addition. Any idea about why it
 was done?
 
 Nikola, does adding that line back alleviate the problem for you ?
Hello Glauber,
kvm_request_guest_time_update seems to have been renamed and then
removed since then, but I've added 
kvm_make_request(KVM_REQ_CLOCK_UPDATE, vcpu);
instead and now the guest boots!
So maybe missing clock update is really the culprit here?
What do You guys think?
n.



 


Re: regression - 2.6.36 - 2.6.37 - kvm - 32bit SMP guests don't boot

2011-03-04 Thread Glauber Costa
On Fri, 2011-03-04 at 21:55 +0100, Nikola Ciprich wrote:
  Zach,
  
  I don't understand 100 % the logic behind all your tsc changes.
  But kvm-clock-wise, most of the problems we had in the past were related
  to the difference in resolution between the tsc and the host clocksource
  (hpet, acpi_pm, etc), which in his case, it is a non-issue.
  
  It does seem to me like some compensation logic kicked in, dismantling
  an otherwise good tsc. He does have nonstop_tsc, which means it can't
  get any better.
  
  One thing I noticed when reading the culprit patch in bisect, is that in
  vcpu_load(), there were previously a call to  
  
   kvm_request_guest_time_update(vcpu)
  
  that was removed without a counterpart addition. Any idea about why it
  was done?
  
  Nikola, does adding that line back alleviate the problem for you ?
 Hello Glauber,
 kvm_request_guest_time_update seems to have been renamed and then
 removed since then, but I've added 
 kvm_make_request(KVM_REQ_CLOCK_UPDATE, vcpu);
 instead and now the guest boots!
 So maybe missing clock update is really the culprit here?
 What do You guys think?
 n.

I think although the long term plan is to just do this update once in
your case (stable tsc), this update is needed. 

Why don't you send a patch to re-include it ?



Re: regression - 2.6.36 - 2.6.37 - kvm - 32bit SMP guests don't boot

2011-03-04 Thread Nikola Ciprich
 
 I think although the long term plan is to just do this update once in
 your case (stable tsc), this update is needed. 
 
 Why don't you send a patch to re-include it ?
 
Yes, I'll gladly submit a patch. One question: is it OK to just add a
call to kvm_make_request(KVM_REQ_CLOCK_UPDATE, vcpu) before the
conditional (as I did in my test), or should it go into an else {..}
branch? It is called again inside the conditional, so it will be called
twice in some cases. Is that OK?
n.
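
On the question of making the request twice, here is a minimal userspace sketch. It assumes that a request is recorded as a single bit in a per-vCPU mask, set when the request is made and cleared when it is serviced. That is an assumption about how kvm_make_request() behaves, not something this thread confirms; vcpu_model, make_request, check_request and the bit number are hypothetical names used only for illustration. Under that assumption a repeated request is harmless, because setting a bit that is already set changes nothing.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define REQ_CLOCK_UPDATE 8   /* hypothetical bit number for this sketch */

/* Stand-in for the per-vCPU state; not the kernel's struct kvm_vcpu. */
struct vcpu_model {
    atomic_ulong requests;   /* one bit per pending request */
};

/* Mark a request as pending; repeating the call leaves the mask unchanged. */
void make_request(struct vcpu_model *v, int req)
{
    atomic_fetch_or(&v->requests, 1UL << req);
}

/* Consume a request: returns true once, then false until it is made again. */
bool check_request(struct vcpu_model *v, int req)
{
    unsigned long bit = 1UL << req;

    /* Clear the bit and report whether it was set before the clear. */
    return (atomic_fetch_and(&v->requests, ~bit) & bit) != 0;
}
```

If the real mechanism works this way, two calls on the same path would result in a single clock update the next time the vCPU services its requests.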



Re: regression - 2.6.36 - 2.6.37 - kvm - 32bit SMP guests don't boot

2011-03-04 Thread Zachary Amsden

On 03/04/2011 02:09 PM, Glauber Costa wrote:

On Fri, 2011-03-04 at 19:27 +0100, Nikola Ciprich wrote:
   

Hello Zachary,

 

You don't see any messages about TSC being unstable or switching
clocksource after loading the KVM module?  And you are not suspending
the host or anything?
   

no messages, no suspending, nothing.


 

Can you try using processor.max_cstate=1 on the host as a kernel
parameter and see if it makes a difference?
   

I tried it, no change..
n.
 

Zach,

I don't understand 100 % the logic behind all your tsc changes.
But kvm-clock-wise, most of the problems we had in the past were related
to the difference in resolution between the tsc and the host clocksource
(hpet, acpi_pm, etc), which in his case, it is a non-issue.

It does seem to me like some compensation logic kicked in, dismantling
an otherwise good tsc. He does have nonstop_tsc, which means it can't
get any better.

One thing I noticed when reading the culprit patch in bisect, is that in
vcpu_load(), there were previously a call to

  kvm_request_guest_time_update(vcpu)

that was removed without a counterpart addition. Any idea about why it
was done?
   


That's probably the source of the bug... I've been looking for that 
exact line, though, and I can't find it missing.



Re: regression - 2.6.36 - 2.6.37 - kvm - 32bit SMP guests don't boot

2011-03-04 Thread Zachary Amsden

On 03/04/2011 05:36 PM, Nikola Ciprich wrote:

I think although the long term plan is to just do this update once in
your case (stable tsc), this update is needed.

Why don't you send a patch to re-include it ?

 

Yes, I'll gladly submit patch, one question, is this OK
to just add calling kvm_make_request(KVM_REQ_CLOCK_UPDATE, vcpu) before
the conditional (as I did in my test), or should it go somewhere to else {..}
section? it's called inside the conditional again, which will cause it
to be called twice in some cases, is it OK?
n.
   


Let me write a patch to fix this..


Re: regression - 2.6.36 - 2.6.37 - kvm - 32bit SMP guests don't boot

2011-03-04 Thread Zachary Amsden

On 03/04/2011 05:36 PM, Nikola Ciprich wrote:

I think although the long term plan is to just do this update once in
your case (stable tsc), this update is needed.

Why don't you send a patch to re-include it ?

 

Yes, I'll gladly submit patch, one question, is this OK
to just add calling kvm_make_request(KVM_REQ_CLOCK_UPDATE, vcpu) before
the conditional (as I did in my test), or should it go somewhere to else {..}
section? it's called inside the conditional again, which will cause it
to be called twice in some cases, is it OK?
n.

   


Can you try this patch to see if it fixes the problem?

Thanks,

Zach
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 468fafa..ba05303 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1866,6 +1866,7 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
 	}
 
 	kvm_x86_ops->vcpu_load(vcpu, cpu);
+	kvm_request_guest_time_update(vcpu);
 	if (unlikely(vcpu->cpu != cpu)) {
 		/* Make sure TSC doesn't go backwards */
 		s64 tsc_delta = !vcpu->arch.last_host_tsc ? 0 :


Re: regression - 2.6.36 - 2.6.37 - kvm - 32bit SMP guests don't boot

2011-03-04 Thread Nikola Ciprich


 Can you try this patch to see if it fixes the problem?
You haven't read my replies, have you? ;-)
kvm_request_guest_time_update seems to have been
removed, and kvm_make_request(KVM_REQ_CLOCK_UPDATE, vcpu)
seems to be used instead, adding it fixes the problem.
That's what I was going to use in the patch... :)


 Thanks,

 Zach

 diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
 index 468fafa..ba05303 100644
 --- a/arch/x86/kvm/x86.c
 +++ b/arch/x86/kvm/x86.c
 @@ -1866,6 +1866,7 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
   }
  
   kvm_x86_ops->vcpu_load(vcpu, cpu);
 + kvm_request_guest_time_update(vcpu);
   if (unlikely(vcpu->cpu != cpu)) {
   /* Make sure TSC doesn't go backwards */
   s64 tsc_delta = !vcpu->arch.last_host_tsc ? 0 :




Re: regression - 2.6.36 - 2.6.37 - kvm - 32bit SMP guests don't boot

2011-03-03 Thread Zachary Amsden

On 03/03/2011 02:06 AM, Nikola Ciprich wrote:

No worries.  What mess?
 

twice sending the same mail, nevermind :)

   

I have two things you can try:

first is running a single VCPU guest, if you have not done so already.
 

yup, UP guest is fine, just SMP doesn't work.

   

Second is adding the bootparameter clocksource=acpi_pm to your guest
kernel.
 

yes, this makes SMP work too! I just realized when You were asking about current
clocksource, I told You only host source, not the guest. So I checked now,
and (at least for UP, I guess for SMP it's the same), the clocksource is
kvm-clock! So seems like it got broken with the TSC changes?
   


What is the exact kernel version you are using in the guest?

It appears that some earlier 32-bit versions of kvm-clock enabled 
kernels are still missing the required atomic check for backwards-time 
protection which would be needed on SMP.  This explains why 64-bit is 
fine, 32-bit is not.
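
The backwards-time protection mentioned above can be illustrated with a small userspace sketch. It is not the guest kernel's actual code: last_value and clamp_monotonic are illustrative names, and C11 atomics stand in for whatever primitive the kernel uses. The idea is that every reader publishes the newest timestamp it has seen with a compare-and-swap, so a CPU whose own reading lags slightly never returns a time earlier than one already handed out on another CPU.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Newest timestamp returned to any caller (hypothetical global). */
static _Atomic uint64_t last_value;

/*
 * Return a timestamp that never goes backwards across callers.
 * `raw` stands in for a per-CPU clock reading that may be slightly
 * behind a value already returned on another CPU.
 */
uint64_t clamp_monotonic(uint64_t raw)
{
    uint64_t last = atomic_load(&last_value);

    do {
        if (raw < last)
            return last;   /* a later time was already handed out */
        /* Try to publish raw; on failure `last` is refreshed and we retry. */
    } while (!atomic_compare_exchange_weak(&last_value, &last, raw));

    return raw;
}
```

The whole 64-bit value has to be compared and swapped in one atomic step, which is why a missing or non-atomic check would only show up with more than one VCPU.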


Why this change triggers that problem is still a slight mystery; 
logically it should only affect the system if you have an unstable TSC.


Zach


Re: regression - 2.6.36 - 2.6.37 - kvm - 32bit SMP guests don't boot

2011-03-03 Thread Nikola Ciprich
 What is the exact kernel version you are using in the guest.
It's the latest CentOS (2.6.18-194.32.1.el5), so I guess there are a lot
of fixes, but it's possible the kvm-clock is broken in it.
I can't influence what kernel is used there (at least not on customer's
guests), but I guess asking for a clocksource kernel parameter to be
added is not a problem.




Re: regression - 2.6.36 - 2.6.37 - kvm - 32bit SMP guests don't boot

2011-03-03 Thread Zachary Amsden

On 03/03/2011 04:06 PM, Nikola Ciprich wrote:

What is the exact kernel version you are using in the guest.
 

It's latest centos (2.6.18-194.32.1.el5), so I guess there are a lot
of fixes, but it's possible the kvm-clock is broken in it.
I can't influence what kernel is used there (at least not on customer's
guests), but I guess asking for adding clocksource kernel parameter is
not problem.

   


That sounds like a kernel which will be vulnerable to broken KVM clock 
on 32-bit.  There's a kernel side fix that is needed, but why the server 
side change triggers the problem needs more investigation.


Zach


Re: regression - 2.6.36 - 2.6.37 - kvm - 32bit SMP guests don't boot

2011-03-03 Thread Nikola Ciprich
 That sounds like a kernel which will be vulnerable to broken KVM clock  
 on 32-bit.  There's a kernel side fix that is needed, but why the server  
 side change triggers the problem needs more investigation.
OK, it's important for me that I can fix this by kernel parameter,
but if I can help somehow with debugging, please let me know.
thanks for Your time!
nik


 Zach


Re: regression - 2.6.36 - 2.6.37 - kvm - 32bit SMP guests don't boot

2011-03-02 Thread Zachary Amsden


(resend, sorry for the mess)
   


No worries.  What mess?

I have two things you can try:

first is running a single VCPU guest, if you have not done so already.

Second is adding the bootparameter clocksource=acpi_pm to your guest 
kernel.


If either of those fixes the problem, it may very well have to do with 
this change and not that you may be missing later dependent patches.  This 
change should be nearly a 1-1 transformation, and if it is not, 
something is wrong.


What branch are you bisecting on, the kvm branch or the kernel tree 
itself?  It would be helpful to see the exact code in case any 
surrounding logic changed.


Thanks,

Zach


Re: regression - 2.6.36 - 2.6.37 - kvm - 32bit SMP guests don't boot

2011-03-02 Thread Nikola Ciprich
 No worries.  What mess?
twice sending the same mail, nevermind :)


 I have two things you can try:

 first is running a single VCPU guest, if you have not done so already.
yup, UP guest is fine, just SMP doesn't work.

 Second is adding the bootparameter clocksource=acpi_pm to your guest  
 kernel.
yes, this makes SMP work too! I just realized when You were asking about current
clocksource, I told You only host source, not the guest. So I checked now,
and (at least for UP, I guess for SMP it's the same), the clocksource is
kvm-clock! So seems like it got broken with the TSC changes?



 If either of those fixes the problem, it very well have to do with this  
 change and not that you may be missing later dependent patches.  This  
 change should be nearly a 1-1 transformation, and if it is not,  
 something is wrong.

 What branch are you bisecting on, the kvm branch or the kernel tree  
 itself?  It would be helpful to see the exact code in case any  
 surrouding logic changed.
I was bisecting linus' linux-2.6.git main branch, between 2.6.36..2.6.37


 Thanks,

 Zach


Re: regression - 2.6.36 - 2.6.37 - kvm - 32bit SMP guests don't boot

2011-02-28 Thread Zachary Amsden

On 02/27/2011 12:20 PM, Nikola Ciprich wrote:

I was not aware of the thread.  Please cc me directly, or add a keyword
I track - timekeeping, TSC..
 

Hello Zachary, thanks for Your time looking at this!
   

That change alone may not bisect well; without further fixes on top of
it, you may end up with a hang or stall, which is likely to manifest in
a vendor-specific way.
 

I'm not sure I really understand You here, but this change is exactly to
what I got while bisecting. With later revisions, including this one,
32bit SMP guests don't boot, before it, they do..
   


Does the bug you are hitting manifest on both Intel and AMD platforms?

Further, do the systems you are hitting this on have stable or unstable 
TSCs?


Thanks,

Zach


Re: regression - 2.6.36 - 2.6.37 - kvm - 32bit SMP guests don't boot

2011-02-28 Thread Nikola Ciprich
 Does the bug you are hitting manifest on both Intel and AMD platforms?
I don't have any AMD box here, I'll try this out at my home box.


 Further, do the systems you are hitting this on have stable or unstable  
 TSCs?
how do I find this out? I don't see any warning about TSC in guest, but I've
just started it..
n.




 Thanks,

 Zach


Re: regression - 2.6.36 - 2.6.37 - kvm - 32bit SMP guests don't boot

2011-02-28 Thread Zachary Amsden

On 02/28/2011 09:32 AM, Nikola Ciprich wrote:

Does the bug you are hitting manifest on both Intel and AMD platforms?
 

I don't have any AMD box here, I'll try this out at my home box.

   

Further, do the systems you are hitting this on have stable or unstable
TSCs?
 

how do I find this out? I don't see any warning about TSC in guest, but I've
just started it..
n.


Before worrying about the guest, is the host TSC stable?  What is the 
host clocksource?



Re: regression - 2.6.36 - 2.6.37 - kvm - 32bit SMP guests don't boot

2011-02-28 Thread Nikola Ciprich
On Mon, Feb 28, 2011 at 10:17:24AM -0500, Zachary Amsden wrote:
 On 02/28/2011 09:32 AM, Nikola Ciprich wrote:
 Does the bug you are hitting manifest on both Intel and AMD platforms?
  
 I don't have any AMD box here, I'll try this out at my home box.


 Further, do the systems you are hitting this on have stable or unstable
 TSCs?
  
 how do I find this out? I don't see any warning about TSC in guest, but I've
 just started it..
 n.

 Before worrying about the guest, is the host TSC stable?  What is the  
 host clocksource?
not sure, I'm not setting anything specifically, is this snippet of dmesg 
relevant:

[1.148829] HPET: 8 timers in total, 5 timers will be used for per-cpu timer
[1.148934] hpet0: at MMIO 0xfed0, IRQs 2, 8, 40, 41, 42, 43, 44, 0
[1.149331] hpet0: 8 comparators, 64-bit 14.318180 MHz counter
[1.151831] hpet: hpet2 irq 40 for MSI
[1.151962] hpet: hpet3 irq 41 for MSI
[1.155930] hpet: hpet4 irq 42 for MSI
[1.159937] hpet: hpet5 irq 43 for MSI
[1.163943] hpet: hpet6 irq 44 for MSI
[1.175955] Switching to clocksource tsc

so I guess I'm using hpet?
n.




Re: regression - 2.6.36 - 2.6.37 - kvm - 32bit SMP guests don't boot

2011-02-28 Thread Zachary Amsden


On Mon, Feb 28, 2011 at 10:17:24AM -0500, Zachary Amsden wrote:
   

On 02/28/2011 09:32 AM, Nikola Ciprich wrote:
 

Does the bug you are hitting manifest on both Intel and AMD platforms?

 

I don't have any AMD box here, I'll try this out at my home box.


   

Further, do the systems you are hitting this on have stable or unstable
TSCs?

 

how do I find this out? I don't see any warning about TSC in guest, but I've
just started it..
n.
   

Before worrying about the guest, is the host TSC stable?  What is the
host clocksource?
 

not sure, I'm not setting anything specifically, is this snippet of dmesg 
relevant:

[1.148829] HPET: 8 timers in total, 5 timers will be used for per-cpu timer
[1.148934] hpet0: at MMIO 0xfed0, IRQs 2, 8, 40, 41, 42, 43, 44, 0
[1.149331] hpet0: 8 comparators, 64-bit 14.318180 MHz counter
[1.151831] hpet: hpet2 irq 40 for MSI
[1.151962] hpet: hpet3 irq 41 for MSI
[1.155930] hpet: hpet4 irq 42 for MSI
[1.159937] hpet: hpet5 irq 43 for MSI
[1.163943] hpet: hpet6 irq 44 for MSI
[1.175955] Switching to clocksource tsc

so I guess I'm using hpet?
n.


   
Looks like you are using tsc, based on the last line.  Can you please 
send us the output of


cat /proc/cpuinfo
cat /sys/devices/system/clocksource/clocksource0/current_clocksource

and grep dmesg (case-insensitively) for these keywords: TSC, clock, hpet, stable, khz, kvm
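
One rough way to judge TSC stability from the /proc/cpuinfo output requested above is to look for the constant_tsc and nonstop_tsc entries on the flags line. The sketch below assumes the flags line has already been read into a string; has_flag and tsc_looks_invariant are hypothetical helper names, and treating the two flags together as a sign of a stable TSC is a heuristic assumption, not a guarantee.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* True if `name` appears as a whole whitespace-separated word in `flags`. */
bool has_flag(const char *flags, const char *name)
{
    size_t n = strlen(name);
    const char *p = flags;

    if (n == 0)
        return false;
    while ((p = strstr(p, name)) != NULL) {
        /* Reject substring hits such as "tsc" inside "constant_tsc". */
        bool starts = (p == flags) || p[-1] == ' ' || p[-1] == '\t';
        bool ends = p[n] == '\0' || p[n] == ' ' || p[n] == '\n';

        if (starts && ends)
            return true;
        p += 1;
    }
    return false;
}

/* Heuristic only: both flags present suggests a TSC that keeps a steady rate. */
bool tsc_looks_invariant(const char *flags)
{
    return has_flag(flags, "constant_tsc") && has_flag(flags, "nonstop_tsc");
}
```

Whole-word matching matters here because a plain substring search for tsc would also hit constant_tsc, nonstop_tsc and rdtscp.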


Re: regression - 2.6.36 - 2.6.37 - kvm - 32bit SMP guests don't boot

2011-02-28 Thread Nikola Ciprich
(resend, sorry for the mess)
 cat /proc/cpuinfo
processor   : 0
vendor_id   : GenuineIntel
cpu family  : 6
model : 30
model name: Intel(R) Xeon(R) CPU   X3440  @ 2.53GHz
stepping  : 5
cpu MHz : 2533.185
cache size  : 8192 KB
physical id : 0
siblings : 8
core id: 0
cpu cores  : 4
apicid   : 0
initial apicid : 0
fpu: yes
fpu_exception  : yes
cpuid level: 11
wp : yes
flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca 
cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx 
rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology 
nonstop_tsc aperfmperf pni dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 
xtpr pdcm sse4_1 sse4_2 popcnt lahf_lm ida dts tpr_shadow vnmi flexpriority ept 
vpid
bogomips : 5066.37
clflush size : 64
cache_alignment  : 64
address sizes: 36 bits physical, 48 bits virtual
power management:
.
.
.
.
processor   : 7
vendor_id   : GenuineIntel
cpu family  : 6
model : 30
model name: Intel(R) Xeon(R) CPU   X3440  @ 2.53GHz
stepping  : 5
cpu MHz : 2533.185
cache size  : 8192 KB
physical id : 0
siblings : 8
core id: 3
cpu cores  : 4
apicid   : 7
initial apicid : 7
fpu: yes
fpu_exception  : yes
cpuid level: 11
wp : yes
flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca 
cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx 
rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology 
nonstop_tsc aperfmperf pni dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 
xtpr pdcm sse4_1 sse4_2 popcnt lahf_lm ida dts tpr_shadow vnmi flexpriority ept 
vpid
bogomips : 5066.35
clflush size : 64
cache_alignment  : 64
address sizes: 36 bits physical, 48 bits virtual
power management:


 cat /sys/devices/system/clocksource/clocksource0/current_clocksource
[root@vbox5 ~]# cat /sys/devices/system/clocksource/clocksource0/current_clocksource
tsc



 and grep -i dmesg for these keywords: TSC, clock, hpet, stable, khz, kvm
[root@vbox5 ~]# dmesg | grep -i tsc\|clock\|hpet\|stable\|stable\|khz\|kvm
[0.00] ACPI: HPET bf7aa5f0 00038 (v01 052710 OEMHPET  20100527 MSFT 0097)
[0.00] ACPI: HPET id: 0x8086a701 base: 0xfed0
[0.00] hpet clockevent registered
[0.00] Fast TSC calibration using PIT
[1.148829] HPET: 8 timers in total, 5 timers will be used for per-cpu timer
[1.148934] hpet0: at MMIO 0xfed0, IRQs 2, 8, 40, 41, 42, 43, 44, 0
[1.149331] hpet0: 8 comparators, 64-bit 14.318180 MHz counter
[1.151831] hpet: hpet2 irq 40 for MSI
[1.151962] hpet: hpet3 irq 41 for MSI
[1.155930] hpet: hpet4 irq 42 for MSI
[1.159937] hpet: hpet5 irq 43 for MSI
[1.163943] hpet: hpet6 irq 44 for MSI
[1.175955] Switching to clocksource tsc
[1.260015] CE: hpet3 increased min_delta_ns to 7500 nsec
[1.260117] CE: hpet3 increased min_delta_ns to 11250 nsec
[1.294150] Real Time Clock Driver v1.12b
[7.564355] CE: hpet4 increased min_delta_ns to 7500 nsec
[7.564367] CE: hpet4 increased min_delta_ns to 11250 nsec
[  299.307242] CE: hpet2 increased min_delta_ns to 7500 nsec
[  299.307251] CE: hpet2 increased min_delta_ns to 11250 nsec
[ 1414.616685] CE: hpet5 increased min_delta_ns to 7500 nsec
[ 1414.616694] CE: hpet5 increased min_delta_ns to 11250 nsec
[ 5241.474310] CE: hpet6 increased min_delta_ns to 7500 nsec
[ 5241.474321] CE: hpet6 increased min_delta_ns to 11250 nsec


-- 
-
Ing. Nikola CIPRICH
LinuxBox.cz, s.r.o.
28. rijna 168, 709 01 Ostrava

tel.:   +420 596 603 142
fax:+420 596 621 273
mobil:  +420 777 093 799

www.linuxbox.cz

mobil servis: +420 737 238 656
email servis: ser...@linuxbox.cz
-


Re: regression - 2.6.36 - 2.6.37 - kvm - 32bit SMP guests don't boot

2011-02-28 Thread Nikola Ciprich
 cat /proc/cpuinfo
processor   : 0
vendor_id   : GenuineIntel
cpu family  : 6
model : 30
model name: Intel(R) Xeon(R) CPU   X3440  @ 2.53GHz
stepping  : 5
cpu MHz : 2533.185
cache size  : 8192 KB
physical id : 0
siblings : 8
core id: 0
cpu cores  : 4
apicid   : 0
initial apicid : 0
fpu: yes
fpu_exception  : yes
cpuid level: 11
wp : yes
flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca 
cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx 
rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology 
nonstop_tsc aperfmperf pni dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 
xtpr pdcm sse4_1 sse4_2 popcnt lahf_lm ida dts tpr_shadow vnmi flexpriority ept 
vpid
bogomips : 5066.37
clflush size : 64
cache_alignment  : 64
address sizes: 36 bits physical, 48 bits virtual
power management:
.
.
.
.
.
processor   : 7
vendor_id   : GenuineIntel
cpu family  : 6
model : 30
model name: Intel(R) Xeon(R) CPU   X3440  @ 2.53GHz
stepping  : 5
cpu MHz : 2533.185
cache size  : 8192 KB
physical id : 0
siblings : 8
core id: 3
cpu cores  : 4
apicid   : 7
initial apicid : 7
fpu: yes
fpu_exception  : yes
cpuid level: 11
wp : yes
flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca 
cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx 
rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology 
nonstop_tsc aperfmperf pni dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 
xtpr pdcm sse4_1 sse4_2 popcnt lahf_lm ida dts tpr_shadow vnmi flexpriority ept 
vpid
bogomips : 5066.35
clflush size : 64
cache_alignment  : 64
address sizes: 36 bits physical, 48 bits virtual
power management:


 cat /sys/devices/system/clocksource/clocksource0/current_clocksource
[root@vbox5 ~]# cat /sys/devices/system/clocksource/clocksource0/current_clocksource
tsc
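
For anyone who wants to script the same check, here is a minimal C sketch that reads a sysfs-style attribute and compares it with "tsc". The helper names read_first_line and clocksource_is_tsc are made up for this illustration; only the sysfs path comes from the command above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Read the first line of a small text file (such as a sysfs attribute)
 * into buf and strip the trailing newline.
 * Returns 0 on success, -1 if the file cannot be opened or read.
 */
static int read_first_line(const char *path, char *buf, size_t len)
{
	FILE *f;

	assert(path && buf && len > 0);
	f = fopen(path, "r");
	if (!f)
		return -1;
	if (!fgets(buf, (int)len, f)) {
		fclose(f);
		return -1;
	}
	fclose(f);
	buf[strcspn(buf, "\n")] = '\0';
	return 0;
}

/* Returns 1 if the file names "tsc", 0 if it names something else, -1 on error. */
static int clocksource_is_tsc(const char *path)
{
	char name[64];

	if (read_first_line(path, name, sizeof(name)) != 0)
		return -1;
	return strcmp(name, "tsc") == 0;
}
```

On the host, the path to pass would be /sys/devices/system/clocksource/clocksource0/current_clocksource.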



 and grep -i dmesg for these keywords: TSC, clock, hpet, stable, khz, kvm
[root@vbox5 ~]# dmesg | grep -i tsc\|clock\|hpet\|stable\|stable\|khz\|kvm
[0.00] ACPI: HPET bf7aa5f0 00038 (v01 052710 OEMHPET  20100527 MSFT 0097)
[0.00] ACPI: HPET id: 0x8086a701 base: 0xfed0
[0.00] hpet clockevent registered
[0.00] Fast TSC calibration using PIT
[1.148829] HPET: 8 timers in total, 5 timers will be used for per-cpu timer
[1.148934] hpet0: at MMIO 0xfed0, IRQs 2, 8, 40, 41, 42, 43, 44, 0
[1.149331] hpet0: 8 comparators, 64-bit 14.318180 MHz counter
[1.151831] hpet: hpet2 irq 40 for MSI
[1.151962] hpet: hpet3 irq 41 for MSI
[1.155930] hpet: hpet4 irq 42 for MSI
[1.159937] hpet: hpet5 irq 43 for MSI
[1.163943] hpet: hpet6 irq 44 for MSI
[1.175955] Switching to clocksource tsc
[1.260015] CE: hpet3 increased min_delta_ns to 7500 nsec
[1.260117] CE: hpet3 increased min_delta_ns to 11250 nsec
[1.294150] Real Time Clock Driver v1.12b
[7.564355] CE: hpet4 increased min_delta_ns to 7500 nsec
[7.564367] CE: hpet4 increased min_delta_ns to 11250 nsec
[  299.307242] CE: hpet2 increased min_delta_ns to 7500 nsec
[  299.307251] CE: hpet2 increased min_delta_ns to 11250 nsec
[ 1414.616685] CE: hpet5 increased min_delta_ns to 7500 nsec
[ 1414.616694] CE: hpet5 increased min_delta_ns to 11250 nsec
[ 5241.474310] CE: hpet6 increased min_delta_ns to 7500 nsec
[ 5241.474321] CE: hpet6 increased min_delta_ns to 11250 nsec




Re: regression - 2.6.36 - 2.6.37 - kvm - 32bit SMP guests don't boot

2011-02-27 Thread Nikola Ciprich
 I was not aware of the thread.  Please cc me directly, or add a keyword  
 I track - timekeeping, TSC..
Hello Zachary, thanks for Your time looking at this!
 That change alone may not bisect well; without further fixes on top of  
 it, you may end up with a hang or stall, which is likely to manifest in  
 a vendor-specific way.
I'm not sure I really understand you here, but this change is exactly where
I ended up while bisecting. With this revision and later ones, 32bit SMP
guests don't boot; before it, they do.

 Basically there were a few differences in the platform code about how  
 TSC was dealt with on systems which did not have stable clocks, this  
 brought the logic into one location, but there was a slight change to  
 the logic here.

 Note very carefully, the logic on SVM is gated by a condition before  
 this change:

     if (unlikely(cpu != vcpu->cpu)) {
 -       u64 delta;
 -
 -       if (check_tsc_unstable()) {
 -           /*
 -            * Make sure that the guest sees a monotonically
 -            * increasing TSC.
 -            */
 -           delta = vcpu->arch.host_tsc - native_read_tsc();
 -           svm->vmcb->control.tsc_offset += delta;
 -           if (is_nested(svm))
 -               svm->nested.hsave->control.tsc_offset += delta;
 -       }
 -       vcpu->cpu = cpu;
 -       kvm_migrate_timers(vcpu);


 So this only happens with a system which reports TSC as unstable.  After  
 the change, KVM itself may report the TSC as unstable:

 +   if (unlikely(vcpu->cpu != cpu)) {
 +       /* Make sure TSC doesn't go backwards */
 +       s64 tsc_delta = !vcpu->arch.last_host_tsc ? 0 :
 +               native_read_tsc() - vcpu->arch.last_host_tsc;
 +       if (tsc_delta < 0)
 +           mark_tsc_unstable("KVM discovered backwards TSC");
 +       if (check_tsc_unstable())
 +           kvm_x86_ops->adjust_tsc_offset(vcpu, -tsc_delta);
 +       kvm_migrate_timers(vcpu);
 +       vcpu->cpu = cpu;
 +   }

 If the platform has very small TSC deltas across CPUs, but indicates the  
 TSC is stable, this could result in KVM marking the TSC unstable.  If  
 that is the case, this compensation logic will kick in to avoid  
 backwards TSCs.

 Note however, that the logic is not perfect; time which passes while not  
 running on any CPU will be erased, as the delta compensation removes not  
 just backwards, but any elapsed time from the TSC.  In extreme cases,  
 this could result in time appearing to stand still with guests  
 failing to boot.

 This was addressed with a later change, which catches up the missing time:

 commit c285545f813d7b0ce989fd34e42ad1fe785dc65d
yes, but this change is already included in 2.6.37, so maybe some other fix
is needed?
If you have an idea of what could be changed, I'll gladly test whatever you
recommend, but I'm afraid that's all I can do, since this is a bit of rocket
science for me, sorry :(
nik




 Author: Zachary Amsden zams...@redhat.com
 Date:   Sat Sep 18 14:38:15 2010 -1000

 KVM: x86: TSC catchup mode

 Negate the effects of AN TYM spell while kvm thread is preempted by tracking
 conversion factor to the highest TSC rate and catching the TSC up when it has
 fallen behind the kernel view of time.  Note that once triggered, we don't
 turn off catchup mode.

 A slightly more clever version of this is possible, which only does catchup
 when TSC rate drops, and which specifically targets only CPUs with broken
 TSC, but since these all are considered unstable_tsc(), this patch covers
 all necessary cases.

 Signed-off-by: Zachary Amsden zams...@redhat.com
 Signed-off-by: Marcelo Tosatti mtosa...@redhat.com





Re: regression - 2.6.36 - 2.6.37 - kvm - 32bit SMP guests don't boot

2011-02-25 Thread Nikola Ciprich
(CC: Zachary)

Hello,
Zachary, in case You haven't noticed the thread, we're trying
to find out the reason why 32bit SMP guests stopped working
in 2.6.37. 
bisect shows this as the culprit:

e48672fa25e879f7ae21785c7efd187738139593 is first bad commit
commit e48672fa25e879f7ae21785c7efd187738139593
Author: Zachary Amsden zams...@redhat.com
Date:   Thu Aug 19 22:07:23 2010 -1000

KVM: x86: Unify TSC logic

Move the TSC control logic from the vendor backends into x86.c
by adding adjust_tsc_offset to x86 ops.  Now all TSC decisions
can be done in one place.

Signed-off-by: Zachary Amsden zams...@redhat.com
Signed-off-by: Marcelo Tosatti mtosa...@redhat.com

Unfortunately I couldn't try 2.6.37 with just this one commit reverted, since
other patches certainly rely on it, but hopefully I haven't screwed something
up while bisecting...

so what now?
n.



Re: regression - 2.6.36 - 2.6.37 - kvm - 32bit SMP guests don't boot

2011-02-25 Thread Zachary Amsden

On 02/25/2011 05:48 AM, Nikola Ciprich wrote:

(CC: Zachary)

Hello,
Zachary, in case You haven't noticed the thread, we're trying
to find out the reason why 32bit SMP guests stopped working
in 2.6.37.
bisect shows this as the culprit:
   


I was not aware of the thread.  Please cc me directly, or add a keyword 
I track - timekeeping, TSC..



e48672fa25e879f7ae21785c7efd187738139593 is first bad commit
commit e48672fa25e879f7ae21785c7efd187738139593
Author: Zachary Amsden zams...@redhat.com
Date:   Thu Aug 19 22:07:23 2010 -1000

 KVM: x86: Unify TSC logic

 Move the TSC control logic from the vendor backends into x86.c
 by adding adjust_tsc_offset to x86 ops.  Now all TSC decisions
 can be done in one place.

 Signed-off-by: Zachary Amsden zams...@redhat.com
 Signed-off-by: Marcelo Tosatti mtosa...@redhat.com
   


That change alone may not bisect well; without further fixes on top of 
it, you may end up with a hang or stall, which is likely to manifest in 
a vendor-specific way.


Basically there were a few differences in the platform code about how
TSC was dealt with on systems which did not have stable clocks.  This change
brought the logic into one location, but there was a slight change to
the logic in the process.


Note very carefully, the logic on SVM is gated by a condition before 
this change:


    if (unlikely(cpu != vcpu->cpu)) {
-       u64 delta;
-
-       if (check_tsc_unstable()) {
-           /*
-            * Make sure that the guest sees a monotonically
-            * increasing TSC.
-            */
-           delta = vcpu->arch.host_tsc - native_read_tsc();
-           svm->vmcb->control.tsc_offset += delta;
-           if (is_nested(svm))
-               svm->nested.hsave->control.tsc_offset += delta;
-       }
-       vcpu->cpu = cpu;
-       kvm_migrate_timers(vcpu);


So this only happens with a system which reports TSC as unstable.  After 
the change, KVM itself may report the TSC as unstable:


+   if (unlikely(vcpu->cpu != cpu)) {
+       /* Make sure TSC doesn't go backwards */
+       s64 tsc_delta = !vcpu->arch.last_host_tsc ? 0 :
+               native_read_tsc() - vcpu->arch.last_host_tsc;
+       if (tsc_delta < 0)
+           mark_tsc_unstable("KVM discovered backwards TSC");
+       if (check_tsc_unstable())
+           kvm_x86_ops->adjust_tsc_offset(vcpu, -tsc_delta);
+       kvm_migrate_timers(vcpu);
+       vcpu->cpu = cpu;
+   }

If the platform has very small TSC deltas across CPUs, but indicates the 
TSC is stable, this could result in KVM marking the TSC unstable.  If 
that is the case, this compensation logic will kick in to avoid 
backwards TSCs.


Note however, that the logic is not perfect; time which passes while not 
running on any CPU will be erased, as the delta compensation removes not 
just backwards, but any elapsed time from the TSC.  In extreme cases, 
this could result in time appearing to stand still with guests 
failing to boot.
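
To make the erased-time effect concrete, here is a minimal user-space sketch in C. It is not kernel code: struct vcpu_model, model_put, model_load and model_guest_tsc are hypothetical names that only imitate the arithmetic of the "+" hunk above, assuming guest TSC = host TSC + offset and that the host TSC is sampled whenever the vCPU stops running.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-vCPU state discussed above. */
struct vcpu_model {
	int cpu;                /* physical CPU the vCPU last ran on */
	uint64_t last_host_tsc; /* host TSC sampled when the vCPU stopped running */
	int64_t tsc_offset;     /* guest TSC = host TSC + tsc_offset */
};

/* Assumption: the host TSC is recorded whenever the vCPU is descheduled. */
static void model_put(struct vcpu_model *v, uint64_t host_tsc)
{
	v->last_host_tsc = host_tsc;
}

/*
 * Imitates the unified load path: on a CPU change, a negative delta marks
 * the TSC unstable, and with an unstable TSC the whole delta since the last
 * put is subtracted from the offset, forwards as well as backwards.
 */
static void model_load(struct vcpu_model *v, int cpu, uint64_t host_tsc,
		       int *tsc_unstable)
{
	assert(cpu >= 0 && tsc_unstable);
	if (v->cpu != cpu) {
		int64_t tsc_delta = !v->last_host_tsc ? 0 :
			(int64_t)(host_tsc - v->last_host_tsc);

		if (tsc_delta < 0)
			*tsc_unstable = 1; /* plays the role of mark_tsc_unstable() */
		if (*tsc_unstable)
			v->tsc_offset -= tsc_delta;
		v->cpu = cpu;
	}
}

/* Guest-visible TSC for a given host TSC reading. */
static uint64_t model_guest_tsc(const struct vcpu_model *v, uint64_t host_tsc)
{
	return host_tsc + (uint64_t)v->tsc_offset;
}
```

In this model, a vCPU descheduled at host TSC 1000 and rescheduled on another CPU at host TSC 6000 with the TSC marked unstable still reads 1000 as its guest TSC: the 5000 ticks it spent off-CPU are gone, which is the "time standing still" case described above.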


This was addressed with a later change, which catches up the missing time:

commit c285545f813d7b0ce989fd34e42ad1fe785dc65d
Author: Zachary Amsden zams...@redhat.com
Date:   Sat Sep 18 14:38:15 2010 -1000

KVM: x86: TSC catchup mode

Negate the effects of AN TYM spell while kvm thread is preempted by tracking
conversion factor to the highest TSC rate and catching the TSC up when it has
fallen behind the kernel view of time.  Note that once triggered, we don't
turn off catchup mode.

A slightly more clever version of this is possible, which only does catchup
when TSC rate drops, and which specifically targets only CPUs with broken
TSC, but since these all are considered unstable_tsc(), this patch covers
all necessary cases.

Signed-off-by: Zachary Amsden zams...@redhat.com
Signed-off-by: Marcelo Tosatti mtosa...@redhat.com
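
The catchup idea in that commit message can be pictured with a small user-space model in C. This is only a sketch under assumptions, not the code from the commit: struct catchup_model and both functions are hypothetical names, the reference point (base_tsc, base_ns) is assumed to be known, and the nanosecond-to-cycle conversion is done naively.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model state; the names are illustrative, not kernel fields. */
struct catchup_model {
	uint64_t base_tsc;    /* guest TSC value at the reference point */
	uint64_t base_ns;     /* kernel time in ns at the reference point */
	uint32_t max_tsc_khz; /* highest TSC rate, in kHz */
	int64_t tsc_offset;   /* guest TSC = host TSC + tsc_offset */
};

/*
 * Guest TSC value that matches the kernel view of time at now_ns.
 * cycles = ns * kHz / 10^6; the multiplication can overflow for long
 * intervals, which this sketch deliberately ignores.
 */
static uint64_t expected_guest_tsc(const struct catchup_model *m,
				   uint64_t now_ns)
{
	assert(now_ns >= m->base_ns);
	return m->base_tsc +
	       (now_ns - m->base_ns) * m->max_tsc_khz / 1000000ULL;
}

/*
 * If the guest TSC has fallen behind the expected value, advance the
 * offset to close the gap. Returns the number of cycles added, or 0
 * when the guest is not behind.
 */
static uint64_t catch_up(struct catchup_model *m, uint64_t host_tsc,
			 uint64_t now_ns)
{
	uint64_t guest = host_tsc + (uint64_t)m->tsc_offset;
	uint64_t want = expected_guest_tsc(m, now_ns);

	if (want <= guest)
		return 0;
	m->tsc_offset += (int64_t)(want - guest);
	return want - guest;
}
```

The point of the design is that the offset only ever moves forward: a guest TSC that is already at or ahead of the kernel view of time is left alone.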



Re: regression - 2.6.36 - 2.6.37 - kvm - 32bit SMP guests don't boot

2011-02-24 Thread Avi Kivity

On 02/24/2011 01:42 AM, Nikola Ciprich wrote:

Hello Avi et al,
it seems I've hit a regression in 2.6.37:
32bit SMP CentOS guests stopped booting; they just hang during the initrd
phase (I haven't tried different distros).
UP guests are OK.
When I (forcibly) compiled kvm-kmod-2.6.36.2 and used it with 2.6.37, even
the SMP guests booted fine.
Does somebody have a tip on where the problem could be, or should I bisect this?
I tried on 2 different machines; the host is x86_64, with qemu-kvm 0.13.0 and 0.14.0.
If I should provide more information (or bisect), please let me know.


Bisect is of course great, if laborious.  Meanwhile can you post 'info 
registers' for all cpus?  Is the guest consuming cpu?  kvm_stat output?


--
error compiling committee.c: too many arguments to function



Re: regression - 2.6.36 - 2.6.37 - kvm - 32bit SMP guests don't boot

2011-02-24 Thread Nikola Ciprich
On Thu, Feb 24, 2011 at 12:17:40PM +0200, Avi Kivity wrote:
 On 02/24/2011 01:42 AM, Nikola Ciprich wrote:
 Hello Avi et al,
 it seems I've hit a regression in 2.6.37:
 32bit SMP CentOS guests stopped booting; they just hang during the initrd
 phase (I haven't tried different distros).
 UP guests are OK.
 When I (forcibly) compiled kvm-kmod-2.6.36.2 and used it with 2.6.37, even
 the SMP guests booted fine.
 Does somebody have a tip on where the problem could be, or should I bisect this?
 I tried on 2 different machines; the host is x86_64, with qemu-kvm 0.13.0 and 0.14.0.
 If I should provide more information (or bisect), please let me know.

 Bisect is of course great, if laborious.  Meanwhile can you post 'info  
 registers' for all cpus?  Is the guest consuming cpu?  kvm_stat output?
yes, it's eating 100% of one CPU core.

kvm_stat over a few seconds (the hung guest is the only one running on the host):

 kvm_entry                      29327    9091
 kvm_exit                       29357    9090
 kvm_inj_virq                   24588    7609
 kvm_apic_accept_irq            17146    5310
 kvm_emulate_insn               12682    3931
 kvm_apic                       12530    3879
 kvm_mmio                       12525    3879
 kvm_exit(APIC_ACCESS)          12525    3879
 kvm_exit(HLT)                  11262    3466
 kvm_ioapic_set_irq              6532    2024
 kvm_set_irq                     6538    2024
 kvm_pic_set_irq                 6536    2024
 kvm_exit(EXTERNAL_INTERRUPT)    4255    1300
 kvm_ack_irq                     2442     756
 kvm_exit(PENDING_INTERRUPT)     1030     335
 kvm_exit(IO_INSTRUCTION)         313     104
 kvm_pio                          312     104
 kvm_age_page                      18       6
 kvm_exit(EPT_VIOLATION)           14       4
 kvm_page_fault                    12       4
 kvm_exit(INVALID_STATE)            4       0
 kvm_exit(VMLAUNCH)                 3       0
 kvm_exit(CPUID)                    3       0
 kvm_exit(DR_ACCESS)                2       0
 kvm_exit(MSR_READ)                 2       0
 kvm_exit(PAUSE_INSTRUCTION)        1       0

info registers:
EAX= EBX=6a00 ECX=000a EDX=000f41a8
ESI=000f41a8 EDI= EBP=c0690320 ESP=c0769f58
EIP=c042d137 EFL=0002 [---] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =007b   00c0f300 DPL=3 DS   [-WA]
CS =0060   00c09b00 DPL=0 CS32 [-RA]
SS =0068   00c09300 DPL=0 DS   [-WA]
DS =007b   00c0f300 DPL=3 DS   [-WA]
FS =   
GS =   
LDT=0088 c0747020 0027 8200 DPL=0 LDT
TR =0080 c300f380 2073 8b00 DPL=0 TSS32-busy
GDT= c302b000 00ff
IDT= c06f7000 07ff
CR0=8005003b CR2=ffc46000 CR3=00743000 CR4=06d0
DR0= DR1= DR2= 
DR3= 
DR6=0ff0 DR7=0400
EFER=
FCW=037f FSW= [ST=0] FTW=00 MXCSR=1f80
FPR0=  FPR1= 
FPR2=  FPR3= 
FPR4=  FPR5= 
FPR6=800bf600 4015 FPR7= 
XMM00= XMM01=
XMM02= XMM03=
XMM04= XMM05=
XMM06= XMM07=

I'll hold off on the bisect for a bit, in case you spot some obvious bug ;)
thanks for Your time!

PS: I still owe you the kvm_stat comparison for the slow Windows chkdsk problem.
I'm aware of it; I just had to postpone it due to more urgent matters :(
but I'll get back to it sooner or later..


 -- 
 error compiling committee.c: too many arguments to function





Re: regression - 2.6.36 - 2.6.37 - kvm - 32bit SMP guests don't boot

2011-02-24 Thread Avi Kivity

On 02/24/2011 12:48 PM, Nikola Ciprich wrote:

On Thu, Feb 24, 2011 at 12:17:40PM +0200, Avi Kivity wrote:
  On 02/24/2011 01:42 AM, Nikola Ciprich wrote:
  Hello Avi et al,
  it seems I've hit a regression in 2.6.37:
  32bit SMP CentOS guests stopped booting; they just hang during the initrd
  phase (I haven't tried different distros).
  UP guests are OK.
  When I (forcibly) compiled kvm-kmod-2.6.36.2 and used it with 2.6.37, even
  the SMP guests booted fine.
  Does somebody have a tip on where the problem could be, or should I bisect this?
  I tried on 2 different machines; the host is x86_64, with qemu-kvm 0.13.0 and 0.14.0.
  If I should provide more information (or bisect), please let me know.

  Bisect is of course great, if laborious.  Meanwhile can you post 'info
  registers' for all cpus?  Is the guest consuming cpu?  kvm_stat output?
yes, it's eating 100% of one CPU core.

kvm_stat over a few seconds (the hung guest is the only one running on the host):

  kvm_entry                      29327    9091
  kvm_exit                       29357    9090
  kvm_inj_virq                   24588    7609
  kvm_apic_accept_irq            17146    5310
  kvm_emulate_insn               12682    3931
  kvm_apic                       12530    3879
  kvm_mmio                       12525    3879
  kvm_exit(APIC_ACCESS)          12525    3879
  kvm_exit(HLT)                  11262    3466
  kvm_ioapic_set_irq              6532    2024
  kvm_set_irq                     6538    2024
  kvm_pic_set_irq                 6536    2024
  kvm_exit(EXTERNAL_INTERRUPT)    4255    1300
  kvm_ack_irq                     2442     756
  kvm_exit(PENDING_INTERRUPT)     1030     335
  kvm_exit(IO_INSTRUCTION)         313     104
  kvm_pio                          312     104
  kvm_age_page                      18       6
  kvm_exit(EPT_VIOLATION)           14       4
  kvm_page_fault                    12       4
  kvm_exit(INVALID_STATE)            4       0
  kvm_exit(VMLAUNCH)                 3       0
  kvm_exit(CPUID)                    3       0
  kvm_exit(DR_ACCESS)                2       0
  kvm_exit(MSR_READ)                 2       0
  kvm_exit(PAUSE_INSTRUCTION)        1       0



Guest is churning along.


info registers:
EAX= EBX=6a00 ECX=000a EDX=000f41a8
ESI=000f41a8 EDI= EBP=c0690320 ESP=c0769f58
EIP=c042d137 EFL=0002 [---] CPL=0 II=0 A20=1 SMM=0 HLT=0



Not very useful when the guest is making progress, I'm afraid.


I'll hold off on the bisect for a bit, in case you spot some obvious bug ;)
thanks for Your time!


Can you try a little trace-cmd -e kvm -b 2?


PS: I still owe you the kvm_stat comparison for the slow Windows chkdsk problem.
I'm aware of it; I just had to postpone it due to more urgent matters :(
but I'll get back to it sooner or later..


Sure.  Something similar that came up - sometimes Windows IDE drivers 
fall back to PIO mode.  Are you using IDE?  If so, please check whether 
it's using DMA or PIO.


--
error compiling committee.c: too many arguments to function



Re: regression - 2.6.36 - 2.6.37 - kvm - 32bit SMP guests don't boot

2011-02-24 Thread Nikola Ciprich
 Not very useful when the guest is making progress, I'm afraid.
can perf report help here?

 Can you try a little trace-cmd -e kvm -b 2?
ugh, I'm afraid I'll have some dumb questions here :-[
You mean this:
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/trace-cmd.git ?
and then re-execute qemu-kvm under it? or am I totally wrong?

 Sure.  Something similar that came up - sometimes Windows IDE drivers  
 fall back to PIO mode.  Are you using IDE?  If so, please check whether  
 it's using DMA or PIO.
I'll check, but the problem occurs only during the fsck phase; once the guest
has booted, it runs pretty fast..
So maybe it falls back to PIO during boot, but I guess I won't have a chance
to find that out from the guest.. can I somehow check it from the host?

 -- 
 error compiling committee.c: too many arguments to function





Re: regression - 2.6.36 - 2.6.37 - kvm - 32bit SMP guests don't boot

2011-02-24 Thread Avi Kivity

On 02/24/2011 01:27 PM, Nikola Ciprich wrote:

  Not very useful when the guest is making progress, I'm afraid.
can perf report help here?

  Can you try a little trace-cmd -e kvm -b 2?
ugh, I'm afraid I'll have some dumb questions here :-[
You mean this: 
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/trace-cmd.git ?


Yes.  If you have udis86 and udis86-devel installed when building it, 
it's even better.



and then re-execute qemu-kvm under it? or am I totally wrong?


You don't have to execute qemu-kvm under it, if you have a running 
instance you can run trace-cmd in parallel and it will record whatever's 
happening.



  Sure.  Something similar that came up - sometimes Windows IDE drivers
  fall back to PIO mode.  Are you using IDE?  If so, please check whether
  it's using DMA or PIO.
I'll check, but the problem occurs only during the fsck phase; once the guest
has booted, it runs pretty fast..
So maybe it falls back to PIO during boot, but I guess I won't have a chance
to find that out from the guest.. can I somehow check it from the host?


The trace-cmd output will show.  Please run trace-cmd report afterwards 
and post the results somewhere.



--
error compiling committee.c: too many arguments to function



Re: regression - 2.6.36 - 2.6.37 - kvm - 32bit SMP guests don't boot

2011-02-24 Thread Avi Kivity

On 02/24/2011 02:41 PM, Nikola Ciprich wrote:

  Yes.  If you have udis86 and udis86-devel installed when building it,
  it's even better.
yes, now I remember! I've done some tracing for You already..

  You don't have to execute qemu-kvm under it, if you have a running
  instance you can run trace-cmd in parallel and it will record whatever's
  happening.
I've uploaded the report for You here:
nelide.cz/downloads/nik/report.txt.xz



The only activity I can see is the timer interrupt, so I'm afraid a 
bisect is needed.


If you let git bisect just kvm, it'll be a bit faster:

$ git bisect start $BAD $GOOD -- virt/kvm arch/x86/kvm

--
error compiling committee.c: too many arguments to function



Re: regression - 2.6.36 - 2.6.37 - kvm - 32bit SMP guests don't boot

2011-02-24 Thread Nikola Ciprich
 The only activity I can see is the timer interrupt, so I'm afraid a  
 bisect is needed.
OK, never mind, it's easy to reproduce, so I'll just bisect it and report back.
n.



 If you let git bisect just kvm, it'll be a bit faster:

 $ git bisect start $BAD $GOOD -- virt/kvm arch/x86/kvm

 -- 
 error compiling committee.c: too many arguments to function





Re: regression - 2.6.36 - 2.6.37 - kvm - 32bit SMP guests don't boot

2011-02-24 Thread Nikola Ciprich
 Yes.  If you have udis86 and udis86-devel installed when building it,  
 it's even better.
yes, now I remember! I've done some tracing for You already..

 You don't have to execute qemu-kvm under it, if you have a running  
 instance you can run trace-cmd in parallel and it will record whatever's  
 happening.
I've uploaded the report for You here:
nelide.cz/downloads/nik/report.txt.xz

 The trace-cmd output will show.  Please run trace-cmd report afterwards  
 and post the results somewhere.
OK, I'll prepare a new Windows test machine, try it, and report back..
