Re: [PATCH 2/2] KVM: x86: optimize delivery of TSC deadline timer interrupt

2015-02-10 Thread Marcelo Tosatti
On Sat, Feb 07, 2015 at 09:33:09PM +0100, Paolo Bonzini wrote:
  Can remove another 300 cycles from do_div when programming LAPIC
  tscdeadline timer.
 
 Do you mean using something like lib/reciprocal_div.c?  

Yes.

 Good idea,
 though that's not latency, it's just being slow. :)

It adds to latency, yes (the guest will reprogram the
APIC timer before resuming userspace execution).
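
For readers unfamiliar with the lib/reciprocal_div.c idea mentioned above, here is a minimal userspace sketch of the multiply-and-shift technique. It is only an illustration: the names recip, recip_init, recip_div and fls32 are hypothetical, it handles 32-bit operands only, and it is not the change that would be needed in the LAPIC timer code (where do_div takes a 64-bit dividend). The point is that the real division happens once per divisor, and every later division becomes a multiply plus shifts.

```c
#include <stdint.h>

/* Hypothetical names for this sketch; not the kernel API. */
struct recip {
	uint32_t m;        /* precomputed "magic" multiplier */
	uint8_t sh1, sh2;  /* shifts applied after the multiply */
};

/* 1-based position of the highest set bit; 0 when x == 0. */
static int fls32(uint32_t x)
{
	int n = 0;

	while (x) {
		n++;
		x >>= 1;
	}
	return n;
}

/* Slow path: one real division, done once per divisor (d must be non-zero). */
static struct recip recip_init(uint32_t d)
{
	struct recip r;
	int l = fls32(d - 1);	/* ceil(log2(d)) */
	uint64_t m = ((1ULL << 32) * ((1ULL << l) - d)) / d + 1;

	r.m = (uint32_t)m;
	r.sh1 = l > 1 ? 1 : l;		/* min(l, 1) */
	r.sh2 = l > 1 ? l - 1 : 0;	/* max(l - 1, 0) */
	return r;
}

/* Fast path: computes a / d with one multiply, one add and two shifts. */
static uint32_t recip_div(uint32_t a, struct recip r)
{
	uint32_t t = (uint32_t)(((uint64_t)a * r.m) >> 32);

	return (t + ((a - t) >> r.sh1)) >> r.sh2;
}
```

A caller would run recip_init() whenever the divisor changes (in this thread's setting, presumably when the TSC frequency is known) and recip_div() on every timer reprogram. For example, recip_div(100, recip_init(7)) evaluates to 14.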

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/2] KVM: x86: optimize delivery of TSC deadline timer interrupt

2015-02-07 Thread Paolo Bonzini


On 06/02/2015 21:51, Marcelo Tosatti wrote:
 On Fri, Feb 06, 2015 at 01:16:59PM +0100, Paolo Bonzini wrote:
 The newly-added tracepoint shows the following results on
 the tscdeadline_latency test:

 qemu-kvm-8387  [002]  6425.558974: kvm_vcpu_wakeup:  poll time 10407 ns
 qemu-kvm-8387  [002]  6425.558984: kvm_vcpu_wakeup:  poll time 0 ns
 qemu-kvm-8387  [002]  6425.561242: kvm_vcpu_wakeup:  poll time 10477 ns
 qemu-kvm-8387  [002]  6425.561251: kvm_vcpu_wakeup:  poll time 0 ns

 and so on.  This is because we need to go through kvm_vcpu_block again
 after the timer IRQ is injected.  Avoid it by polling once before
 entering kvm_vcpu_block.

 On my machine (Xeon E5 Sandy Bridge) this removes about 500 cycles (7%)
 from the latency of the TSC deadline timer.

 Signed-off-by: Paolo Bonzini pbonz...@redhat.com
 ---
  arch/x86/kvm/x86.c | 14 +++++++++-----
  1 file changed, 9 insertions(+), 5 deletions(-)

 diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
 index 0b8dd13676ef..1e766033ebff 100644
 --- a/arch/x86/kvm/x86.c
 +++ b/arch/x86/kvm/x86.c
 @@ -6389,11 +6389,15 @@ static inline int __vcpu_run(struct kvm *kvm, struct kvm_vcpu *vcpu)
  		    !vcpu->arch.apf.halted)
  			return vcpu_enter_guest(vcpu);
  
 -	srcu_read_unlock(&kvm->srcu, vcpu->srcu_idx);
 -	kvm_vcpu_block(vcpu);
 -	vcpu->srcu_idx = srcu_read_lock(&kvm->srcu);
 -	if (!kvm_check_request(KVM_REQ_UNHALT, vcpu))
 -		return 1;
 +	if (kvm_arch_vcpu_runnable(vcpu))
 +		clear_bit(KVM_REQ_UNHALT, &vcpu->requests);
 +	else {
 
 Why the clear_bit? Since only kvm_vcpu_block in the below section
 sets it, and that section clears it as well.

You're right.
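
The point about clear_bit can be illustrated with a toy model of a test-and-clear request bit. This is a sketch under assumptions: toy_vcpu, toy_make_request and toy_check_request are hypothetical stand-ins, not the KVM helpers, and REQ_UNHALT is an arbitrary bit number. If the only place that sets the bit is followed by a check that consumes it, the bit is already clear on any other path, so clearing it again there has no effect.

```c
#include <stdbool.h>

#define REQ_UNHALT 0	/* hypothetical bit number for this sketch */

struct toy_vcpu {
	unsigned long requests;
};

/* Stand-in for the blocking path setting the request on wakeup. */
static void toy_make_request(int req, struct toy_vcpu *v)
{
	v->requests |= 1UL << req;
}

/* Stand-in for a test-and-clear check: report the bit, then clear it. */
static bool toy_check_request(int req, struct toy_vcpu *v)
{
	bool was_set = (v->requests & (1UL << req)) != 0;

	v->requests &= ~(1UL << req);
	return was_set;
}
```

In this model, a toy_make_request() followed by toy_check_request() always leaves requests at zero, which is why an extra clear on the runnable path would be redundant.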

 Can remove another 300 cycles from do_div when programming LAPIC
 tscdeadline timer.

Do you mean using something like lib/reciprocal_div.c?  Good idea,
though that's not latency, it's just being slow. :)

Paolo


Re: [PATCH 2/2] KVM: x86: optimize delivery of TSC deadline timer interrupt

2015-02-06 Thread Marcelo Tosatti
On Fri, Feb 06, 2015 at 01:16:59PM +0100, Paolo Bonzini wrote:
 The newly-added tracepoint shows the following results on
 the tscdeadline_latency test:
 
 qemu-kvm-8387  [002]  6425.558974: kvm_vcpu_wakeup:  poll time 10407 ns
 qemu-kvm-8387  [002]  6425.558984: kvm_vcpu_wakeup:  poll time 0 ns
 qemu-kvm-8387  [002]  6425.561242: kvm_vcpu_wakeup:  poll time 10477 ns
 qemu-kvm-8387  [002]  6425.561251: kvm_vcpu_wakeup:  poll time 0 ns
 
 and so on.  This is because we need to go through kvm_vcpu_block again
 after the timer IRQ is injected.  Avoid it by polling once before
 entering kvm_vcpu_block.
 
 On my machine (Xeon E5 Sandy Bridge) this removes about 500 cycles (7%)
 from the latency of the TSC deadline timer.
 
 Signed-off-by: Paolo Bonzini pbonz...@redhat.com
 ---
  arch/x86/kvm/x86.c | 14 +++++++++-----
  1 file changed, 9 insertions(+), 5 deletions(-)
 
 diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
 index 0b8dd13676ef..1e766033ebff 100644
 --- a/arch/x86/kvm/x86.c
 +++ b/arch/x86/kvm/x86.c
 @@ -6389,11 +6389,15 @@ static inline int __vcpu_run(struct kvm *kvm, struct kvm_vcpu *vcpu)
  		    !vcpu->arch.apf.halted)
  			return vcpu_enter_guest(vcpu);
  
 -	srcu_read_unlock(&kvm->srcu, vcpu->srcu_idx);
 -	kvm_vcpu_block(vcpu);
 -	vcpu->srcu_idx = srcu_read_lock(&kvm->srcu);
 -	if (!kvm_check_request(KVM_REQ_UNHALT, vcpu))
 -		return 1;
 +	if (kvm_arch_vcpu_runnable(vcpu))
 +		clear_bit(KVM_REQ_UNHALT, &vcpu->requests);
 +	else {

Why the clear_bit? Since only kvm_vcpu_block in the below section
sets it, and that section clears it as well.

Can remove another 300 cycles from do_div when programming LAPIC
tscdeadline timer.

 +		srcu_read_unlock(&kvm->srcu, vcpu->srcu_idx);
 +		kvm_vcpu_block(vcpu);
 +		vcpu->srcu_idx = srcu_read_lock(&kvm->srcu);
 +		if (!kvm_check_request(KVM_REQ_UNHALT, vcpu))
 +			return 1;
 +	}
  
  	kvm_apic_accept_events(vcpu);
  	switch(vcpu->arch.mp_state) {
 -- 
 1.8.3.1
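
The control flow the patch introduces (check whether the vCPU is already runnable before taking the blocking path) can be modeled in a few lines of userspace C. This is a simplified sketch, not the kernel code: sim_vcpu, sim_block and sim_halt_step are hypothetical names, and blocking is reduced to a counter so the skipped call is visible.

```c
#include <stdbool.h>

struct sim_vcpu {
	bool runnable;           /* e.g. a timer interrupt is already pending */
	bool irq_while_blocked;  /* whether a wakeup arrives during the block */
	bool unhalt_req;         /* stands in for KVM_REQ_UNHALT */
	int block_calls;         /* how often the slow path was taken */
};

/* Stand-in for the blocking call; sets the unhalt request on wakeup. */
static void sim_block(struct sim_vcpu *v)
{
	v->block_calls++;
	if (v->irq_while_blocked) {
		v->runnable = true;
		v->unhalt_req = true;
	}
}

/* Patched flow: returns true if the vCPU goes on to accept events. */
static bool sim_halt_step(struct sim_vcpu *v)
{
	if (!v->runnable) {
		sim_block(v);
		if (!v->unhalt_req)
			return false;	/* woken for another reason, retry later */
		v->unhalt_req = false;	/* the check consumes the request */
	}
	return true;
}
```

With runnable already set, sim_halt_step() returns true and block_calls stays at zero, which corresponds to the second trip through kvm_vcpu_block that the patch avoids.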