Re: [PATCH 2/2] KVM: x86: optimize delivery of TSC deadline timer interrupt
On Sat, Feb 07, 2015 at 09:33:09PM +0100, Paolo Bonzini wrote:
> > Can remove another 300 cycles from do_div when programming LAPIC
> > tscdeadline timer.
>
> Do you mean using something like lib/reciprocal_div.c?

Yes.

> Good idea, though that's not latency, it's just being slow. :)

It adds to latency, yes (the guest will reprogram the APIC timer before
resuming userspace execution).
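For readers unfamiliar with the idea behind lib/reciprocal_div.c, here is a minimal userspace sketch of division by a precomputed reciprocal: one real division at setup time, then only a multiply and shifts per use. The names `struct recip`, `fls32`, `recip_init` and `recip_divide` are hypothetical and are not the kernel's API. The sketch handles 32-bit operands only; the thread does not say what operand widths the LAPIC path needs, so it illustrates the technique and is not a drop-in replacement for do_div.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical type: a divisor turned into a multiplier plus two shifts. */
struct recip {
	uint32_t m;        /* precomputed multiplier */
	uint8_t sh1, sh2;  /* shifts applied after the multiply */
};

/* 1-based position of the highest set bit; 0 when x == 0. */
static int fls32(uint32_t x)
{
	int n = 0;

	while (x) {
		n++;
		x >>= 1;
	}
	return n;
}

/* One-time setup for a fixed divisor d; this still does one real division. */
static struct recip recip_init(uint32_t d)
{
	struct recip r;
	uint64_t m;
	int l;

	assert(d != 0);
	l = fls32(d - 1);                                /* ceil(log2(d)) */
	m = ((1ULL << 32) * ((1ULL << l) - d)) / d + 1;
	r.m = (uint32_t)m;
	r.sh1 = (uint8_t)(l < 1 ? l : 1);
	r.sh2 = (uint8_t)(l > 1 ? l - 1 : 0);
	return r;
}

/* Hot path: computes a / d with one multiply, one add and two shifts. */
static uint32_t recip_divide(uint32_t a, struct recip r)
{
	uint32_t t = (uint32_t)(((uint64_t)a * r.m) >> 32);

	return (t + ((a - t) >> r.sh1)) >> r.sh2;
}
```

The benefit depends on the divisor changing rarely, so that `recip_init` can run once and `recip_divide` can be reused on every timer programming.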
Re: [PATCH 2/2] KVM: x86: optimize delivery of TSC deadline timer interrupt
On 06/02/2015 21:51, Marcelo Tosatti wrote:
> On Fri, Feb 06, 2015 at 01:16:59PM +0100, Paolo Bonzini wrote:
>> The newly-added tracepoint shows the following results on
>> the tscdeadline_latency test:
>>
>>     qemu-kvm-8387 [002] 6425.558974: kvm_vcpu_wakeup: poll time 10407 ns
>>     qemu-kvm-8387 [002] 6425.558984: kvm_vcpu_wakeup: poll time 0 ns
>>     qemu-kvm-8387 [002] 6425.561242: kvm_vcpu_wakeup: poll time 10477 ns
>>     qemu-kvm-8387 [002] 6425.561251: kvm_vcpu_wakeup: poll time 0 ns
>>
>> and so on. This is because we need to go through kvm_vcpu_block again
>> after the timer IRQ is injected. Avoid it by polling once before
>> entering kvm_vcpu_block.
>>
>> On my machine (Xeon E5 Sandy Bridge) this removes about 500 cycles (7%)
>> from the latency of the TSC deadline timer.
>>
>> Signed-off-by: Paolo Bonzini <pbonz...@redhat.com>
>> ---
>>  arch/x86/kvm/x86.c | 14 +++++++++-----
>>  1 file changed, 9 insertions(+), 5 deletions(-)
>>
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index 0b8dd13676ef..1e766033ebff 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -6389,11 +6389,15 @@ static inline int __vcpu_run(struct kvm *kvm, struct kvm_vcpu *vcpu)
>>  	    !vcpu->arch.apf.halted)
>>  		return vcpu_enter_guest(vcpu);
>>
>> -	srcu_read_unlock(&kvm->srcu, vcpu->srcu_idx);
>> -	kvm_vcpu_block(vcpu);
>> -	vcpu->srcu_idx = srcu_read_lock(&kvm->srcu);
>> -	if (!kvm_check_request(KVM_REQ_UNHALT, vcpu))
>> -		return 1;
>> +	if (kvm_arch_vcpu_runnable(vcpu))
>> +		clear_bit(KVM_REQ_UNHALT, &vcpu->requests);
>> +	else {
>
> Why the clear_bit? Since only kvm_vcpu_block in the below section
> sets it, and that section clears it as well.

You're right.

> Can remove another 300 cycles from do_div when programming LAPIC
> tscdeadline timer.

Do you mean using something like lib/reciprocal_div.c? Good idea,
though that's not latency, it's just being slow. :)

Paolo
Re: [PATCH 2/2] KVM: x86: optimize delivery of TSC deadline timer interrupt
On Fri, Feb 06, 2015 at 01:16:59PM +0100, Paolo Bonzini wrote:
> The newly-added tracepoint shows the following results on
> the tscdeadline_latency test:
>
>     qemu-kvm-8387 [002] 6425.558974: kvm_vcpu_wakeup: poll time 10407 ns
>     qemu-kvm-8387 [002] 6425.558984: kvm_vcpu_wakeup: poll time 0 ns
>     qemu-kvm-8387 [002] 6425.561242: kvm_vcpu_wakeup: poll time 10477 ns
>     qemu-kvm-8387 [002] 6425.561251: kvm_vcpu_wakeup: poll time 0 ns
>
> and so on. This is because we need to go through kvm_vcpu_block again
> after the timer IRQ is injected. Avoid it by polling once before
> entering kvm_vcpu_block.
>
> On my machine (Xeon E5 Sandy Bridge) this removes about 500 cycles (7%)
> from the latency of the TSC deadline timer.
>
> Signed-off-by: Paolo Bonzini <pbonz...@redhat.com>
> ---
>  arch/x86/kvm/x86.c | 14 +++++++++-----
>  1 file changed, 9 insertions(+), 5 deletions(-)
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 0b8dd13676ef..1e766033ebff 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -6389,11 +6389,15 @@ static inline int __vcpu_run(struct kvm *kvm, struct kvm_vcpu *vcpu)
>  	    !vcpu->arch.apf.halted)
>  		return vcpu_enter_guest(vcpu);
>
> -	srcu_read_unlock(&kvm->srcu, vcpu->srcu_idx);
> -	kvm_vcpu_block(vcpu);
> -	vcpu->srcu_idx = srcu_read_lock(&kvm->srcu);
> -	if (!kvm_check_request(KVM_REQ_UNHALT, vcpu))
> -		return 1;
> +	if (kvm_arch_vcpu_runnable(vcpu))
> +		clear_bit(KVM_REQ_UNHALT, &vcpu->requests);
> +	else {

Why the clear_bit? Since only kvm_vcpu_block in the below section
sets it, and that section clears it as well.

Can remove another 300 cycles from do_div when programming LAPIC
tscdeadline timer.

> +		srcu_read_unlock(&kvm->srcu, vcpu->srcu_idx);
> +		kvm_vcpu_block(vcpu);
> +		vcpu->srcu_idx = srcu_read_lock(&kvm->srcu);
> +		if (!kvm_check_request(KVM_REQ_UNHALT, vcpu))
> +			return 1;
> +	}
>
>  	kvm_apic_accept_events(vcpu);
>  	switch(vcpu->arch.mp_state) {
> --
> 1.8.3.1
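The control-flow change in the hunk above can be modelled in userspace as a rough sketch. Everything here is a hypothetical stand-in, not kernel code: `struct model_vcpu`, `model_block`, `model_check_unhalt`, `halt_path_old` and `halt_path_new` are invented for illustration. The model assumes, as the review comment states, that only the blocking call sets the unhalt request, and it assumes the request check is a test-and-clear.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the vCPU state the hunk touches; not a kernel type. */
struct model_vcpu {
	bool runnable;    /* what kvm_arch_vcpu_runnable() would report */
	bool req_unhalt;  /* stands in for the KVM_REQ_UNHALT request bit */
	int block_calls;  /* number of trips through the blocking path */
};

/* Models kvm_vcpu_block(): counts the call, requests unhalt if runnable. */
static void model_block(struct model_vcpu *v)
{
	assert(v);
	v->block_calls++;
	if (v->runnable)
		v->req_unhalt = true;
}

/* Models the request check as a test-and-clear (an assumption). */
static bool model_check_unhalt(struct model_vcpu *v)
{
	bool was_set = v->req_unhalt;

	v->req_unhalt = false;
	return was_set;
}

/* Before the patch: always block; 1 mirrors the "return 1" early exit. */
static int halt_path_old(struct model_vcpu *v)
{
	model_block(v);
	if (!model_check_unhalt(v))
		return 1;
	return 0;
}

/* After the patch: poll once and block only if the vCPU is not runnable. */
static int halt_path_new(struct model_vcpu *v)
{
	if (v->runnable) {
		v->req_unhalt = false; /* the clear_bit questioned in the review */
	} else {
		model_block(v);
		if (!model_check_unhalt(v))
			return 1;
	}
	return 0;
}
```

With `runnable` already true, `halt_path_old` makes one trip through the blocking path while `halt_path_new` makes none, which is the saving the patch description refers to. In this model `req_unhalt` is already false whenever the runnable branch is taken, which is consistent with the observation that the clear_bit is unnecessary.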