[PATCH v3 2/6] KVM: X86: Implement PV IPIs in linux guest

2018-07-03 Thread Wanpeng Li
From: Wanpeng Li Implement paravirtual apic hooks to enable PV IPIs: apic->send_IPI_mask, apic->send_IPI_mask_allbutself, apic->send_IPI_allbutself, apic->send_IPI_all. PV IPIs support VMs with at most 128 vCPUs, which is large enough for current cloud environments; supporting more

[PATCH v3 3/6] KVM: X86: Fallback to original apic hooks when bad happens

2018-07-03 Thread Wanpeng Li
From: Wanpeng Li Fallback to original apic hooks when apic id is sparse and larger than 128 or kvm fails to add the pending IRQ to lapic. Cc: Paolo Bonzini Cc: Radim Krčmář Cc: Vitaly Kuznetsov Signed-off-by: Wanpeng Li --- arch/x86/kernel/kvm.c | 26 +++--- 1 file
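The 128-vCPU limit and the fallback described in the two entries above can be illustrated with a small sketch. It is not the patch's code: the struct, the function name and the `PV_IPI_MAX_APIC_ID` constant are hypothetical, and it only assumes what the snippets state, namely a low and a high 64-bit bitmap and a fallback when an APIC ID does not fit.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical limit: two 64-bit words give room for 128 destinations. */
#define PV_IPI_MAX_APIC_ID 128

struct pv_ipi_bitmap {
    uint64_t low;   /* APIC IDs 0..63   */
    uint64_t high;  /* APIC IDs 64..127 */
};

/*
 * Pack destination APIC IDs into the two-word bitmap.
 * Returns false if any ID does not fit; a caller would then fall back
 * to the original (non-PV) IPI path instead of issuing the hypercall.
 */
static bool pv_ipi_pack(const uint32_t *apic_ids, size_t n,
                        struct pv_ipi_bitmap *out)
{
    out->low = 0;
    out->high = 0;

    for (size_t i = 0; i < n; i++) {
        uint32_t id = apic_ids[i];

        if (id >= PV_IPI_MAX_APIC_ID)
            return false;
        if (id < 64)
            out->low |= UINT64_C(1) << id;
        else
            out->high |= UINT64_C(1) << (id - 64);
    }
    return true;
}
```

With a fixed two-word bitmap, one call can describe every destination, which is what allows a single vmexit; the cost is the hard upper bound on APIC IDs.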

[PATCH v3 4/6] KVM: X86: Implement PV IPIs send hypercall

2018-07-03 Thread Wanpeng Li
From: Wanpeng Li Use a hypercall to send IPIs with one vmexit instead of one vmexit per IPI for xAPIC/x2APIC physical mode, and one vmexit per cluster for x2APIC cluster mode. An Intel guest can enter x2APIC cluster mode when interrupt remapping is enabled in QEMU; however, the latest AMD EPYC still just

[PATCH v3 5/6] KVM: X86: Add NMI support to PV IPIs

2018-07-03 Thread Wanpeng Li
From: Wanpeng Li The NMI delivery mode of ICR is used to deliver an NMI to the processor, and the vector information is ignored. Cc: Paolo Bonzini Cc: Radim Krčmář Cc: Vitaly Kuznetsov Signed-off-by: Wanpeng Li --- arch/x86/kernel/kvm.c | 13 +++-- arch/x86/kvm/x86.c| 16

[PATCH v3 0/6] KVM: X86: Implement PV IPIs support

2018-07-03 Thread Wanpeng Li
ndles 128 vCPUs * fix KVM_FEATURE_PV_SEND_IPI doc * document hypercall * fix NMI selftest fails * fix build errors reported by 0day Cc: Paolo Bonzini Cc: Radim Krčmář Cc: Vitaly Kuznetsov Wanpeng Li (6): KVM: X86: Add kvm hypervisor init time platform setup callback KVM: X86: Implement P

[PATCH v3 6/6] KVM: X86: Expose PV_SEND_IPI CPUID feature bit to guest

2018-07-03 Thread Wanpeng Li
From: Wanpeng Li Expose PV_SEND_IPI feature bit to guest, the guest can check this feature bit before using paravirtualized send IPIs. Cc: Paolo Bonzini Cc: Radim Krčmář Cc: Vitaly Kuznetsov Signed-off-by: Wanpeng Li --- Documentation/virtual/kvm/cpuid.txt | 4 arch/x86/kvm/cpuid.c

Re: [PATCH v2 1/2] KVM: X86: Implement PV IPI in linux guest

2018-07-02 Thread Wanpeng Li
On Mon, 2 Jul 2018 at 19:32, Paolo Bonzini wrote: > > On 02/07/2018 12:14, Wanpeng Li wrote: > > + unsigned long flags, ipi_bitmap_low = 0, ipi_bitmap_high = 0, icr = 0; > > + int cpu, apic_id, ret = 1; > > + > > + if (cpumask_empty(mask))

Re: [PATCH v2 1/2] KVM: X86: Implement PV IPI in linux guest

2018-07-02 Thread Wanpeng Li
On Mon, 2 Jul 2018 at 18:26, Vitaly Kuznetsov wrote: > > Wanpeng Li writes: > > > From: Wanpeng Li > > > > Implement PV IPIs in guest kernel. > > > > Cc: Paolo Bonzini > > Cc: Radim Krčmář > > Cc: Vitaly Kuznetsov > > Signed-off-by: Wa

[PATCH v2 1/2] KVM: X86: Implement PV IPI in linux guest

2018-07-02 Thread Wanpeng Li
From: Wanpeng Li Implement PV IPIs in guest kernel. Cc: Paolo Bonzini Cc: Radim Krčmář Cc: Vitaly Kuznetsov Signed-off-by: Wanpeng Li --- arch/x86/kernel/kvm.c | 99 +++ include/uapi/linux/kvm_para.h | 1 + 2 files changed, 100 insertions

[PATCH v2 0/2] KVM: x86: Add PV IPIs support

2018-07-02 Thread Wanpeng Li
* fix build errors reported by 0day Wanpeng Li (2): KVM: X86: Implement PV IPI in linux guest KVM: X86: Implement PV send IPI support Documentation/virtual/kvm/cpuid.txt | 4 ++ Documentation/virtual/kvm/hypercalls.txt | 6 ++ arch/x86/include/uapi/asm/kvm_para.h | 1 + arch/x86/

[PATCH v2 2/2] KVM: X86: Implement PV send IPI support

2018-07-02 Thread Wanpeng Li
From: Wanpeng Li Use a hypercall to send IPIs with one vmexit instead of one vmexit per IPI for xAPIC/x2APIC physical mode, and one vmexit per cluster for x2APIC cluster mode. Even with QEMU interrupt remapping and PV TLB shootdown enabled, I can still observe a ~14% performance boost with the ebizzy benchmark

Re: [PATCH 2/2] KVM: X86: Implement PV send IPI support

2018-06-30 Thread Wanpeng Li
andles 128 vCPUs. > > to remove or limit the need for the more generic hypercall. Have already done 2) in v2, will send out later. Regards, Wanpeng Li

Re: [PATCH 2/2] KVM: X86: Implement PV send IPI support

2018-06-30 Thread Wanpeng Li
On Fri, 29 Jun 2018 at 18:45, Paolo Bonzini wrote: > > On 29/06/2018 11:51, Wanpeng Li wrote: > > From: Wanpeng Li > > > > Using hypercall to send IPIs by one vmexit instead of one by one for > > xAPIC/x2APIC physical mode and one vmexit per-cluster for x2APIC

Re: [PATCH 1/2] KVM: X86: Implement PV IPI in linux guest

2018-06-30 Thread Wanpeng Li
On Fri, 29 Jun 2018 at 18:10, Vitaly Kuznetsov wrote: > > Wanpeng Li writes: > > > From: Wanpeng Li > > > > Implement PV IPIs in guest kernel. > > > > Cc: Paolo Bonzini > > Cc: Radim Krčmář > > Cc: Vitaly Kuznetsov > > Signed-o

[PATCH 2/2] KVM: X86: Implement PV send IPI support

2018-06-29 Thread Wanpeng Li
From: Wanpeng Li Use a hypercall to send IPIs with one vmexit instead of one vmexit per IPI for xAPIC/x2APIC physical mode, and one vmexit per cluster for x2APIC cluster mode. Even with QEMU interrupt remapping and PV TLB shootdown enabled, I can still observe a ~14% performance boost with the ebizzy benchmark

[PATCH 0/2] KVM: x86: Add PV IPIs support

2018-06-29 Thread Wanpeng Li
onzini Cc: Radim Krčmář Cc: Vitaly Kuznetsov Wanpeng Li (2): KVM: X86: Implement PV IPI in linux guest KVM: X86: Implement PV send IPI support Documentation/virtual/kvm/cpuid.txt | 4 +++ arch/x86/include/uapi/asm/kvm_para.h | 1 + arch/x86/kernel/kvm.c

[PATCH 1/2] KVM: X86: Implement PV IPI in linux guest

2018-06-29 Thread Wanpeng Li
From: Wanpeng Li Implement PV IPIs in guest kernel. Cc: Paolo Bonzini Cc: Radim Krčmář Cc: Vitaly Kuznetsov Signed-off-by: Wanpeng Li --- arch/x86/include/uapi/asm/kvm_para.h | 1 + arch/x86/kernel/kvm.c| 63 2 files changed, 64

Re: [PATCH 3/3] KVM: x86: hyperv: implement PV IPI send hypercalls

2018-06-28 Thread Wanpeng Li
ccess:
> > +	return HV_STATUS_SUCCESS;
> > +}
> > +
> >  bool kvm_hv_hypercall_enabled(struct kvm *kvm)
> >  {
> >  	return READ_ONCE(kvm->arch.hyperv.hv_hypercall) &
> > HV_X64_MSR_HYPERCALL_ENABLE;
> > @@ -1526,6 +1628,20 @@ int kvm_hv_hypercall(struct kvm_vcpu *vcpu)
> >  		}
> >  		ret = kvm_hv_flush_tlb(vcpu, ingpa, rep_cnt, true);
> >  		break;
> > +	case HVCALL_SEND_IPI:
> > +		if (unlikely(rep)) {
> > +			ret = HV_STATUS_INVALID_HYPERCALL_INPUT;
> > +			break;
> > +		}
> > +		ret = kvm_hv_send_ipi(vcpu, ingpa, outgpa, false, fast);
> > +		break;
> > +	case HVCALL_SEND_IPI_EX:

Hi Paolo and Radim,
I have already completed the patches for Linux guest/KVM/QEMU with vCPUs <= 64. However, supporting vCPUs > 64 would need extra complexity, like the "ex" variant in Hyper-V. Do you think vCPUs <= 64 is enough for a Linux guest, or should I introduce two hypercalls as Hyper-V does with its "ex" logic?
Regards, Wanpeng Li

Re: [PATCH 0/4] x86/hyper-v: optimize PV IPIs

2018-06-27 Thread Wanpeng Li
On Wed, 27 Jun 2018 at 17:25, Vitaly Kuznetsov wrote: > > Wanpeng Li writes: > > > Hi Vitaly, (fix my reply mess this time) > > On Sat, 23 Jun 2018 at 01:09, Vitaly Kuznetsov wrote: > >> > >> When reviewing my "x86/hyper-v: use cheaper HVCALL_FLUSH_V

Re: [PATCH 0/4] x86/hyper-v: optimize PV IPIs

2018-06-26 Thread Wanpeng Li
s guest can switch to cluster mode after entering x2APIC, so sending IPI per cluster. In addition, you can also post the benchmark result for this PV IPI optimization, although it also fixes the bug which you mentioned above. I can post one variant for Linux guest PV IPI if it also makes sense. :) Regards, Wanpeng Li

Re: [GIT PULL] KVM changes for 4.18 merge window

2018-06-13 Thread Wanpeng Li
r country(China), and looking forward to see you at Linux Conf. :) Regards, Wanpeng Li

Re: [GIT PULL] KVM changes for 4.18 merge window

2018-06-12 Thread Wanpeng Li
t happens before the KVM GIT PULL on Linus's tree, you will see the warning above unless you launch the guest as the root user; maybe other subsystems' modifications break something. Regards, Wanpeng Li

Re: [PATCH 2/2] sched/core: Consider afffinity constrain when yield to a task

2018-06-11 Thread Wanpeng Li
ll drop the two patches. Regards, Wanpeng Li

[PATCH 1/2] KVM: Fix lock holder candidate yield

2018-06-11 Thread Wanpeng Li
From: Wanpeng Li After detecting a pause loop executed by a Lock Waiter in the guest, the pCPU is yielded to a Lock Holder candidate. The Lock Holder candidate may have its own task affinity constraint; however, the current yield logic yields to the Lock Holder candidate unconditionally

[PATCH 2/2] sched/core: Consider afffinity constrain when yield to a task

2018-06-11 Thread Wanpeng Li
From: Wanpeng Li Consider the task affinity constraint when yielding to a task. Cc: Paolo Bonzini Cc: Radim Krčmář Cc: Peter Zijlstra Cc: Ingo Molnar Signed-off-by: Wanpeng Li --- kernel/sched/core.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/kernel/sched/core.c b/kernel/sched

Re: [PATCH] KVM: VMX: Optimize tscdeadline timer latency

2018-06-01 Thread Wanpeng Li
On Wed, 30 May 2018 at 01:08, Paolo Bonzini wrote: > > On 29/05/2018 16:31, Radim Krčmář wrote: > > 2018-05-29 16:23+0200, Radim Krčmář: > >> 2018-05-29 14:53+0800, Wanpeng Li: > >>> From: Wanpeng Li > >>> > >>> 'Commit d0659d946be0 ("

Re: [PATCH] KVM: VMX: Optimize tscdeadline timer latency

2018-05-29 Thread Wanpeng Li
On Wed, 30 May 2018 at 01:08, Paolo Bonzini wrote: > On 29/05/2018 16:31, Radim Krčmář wrote: > > 2018-05-29 16:23+0200, Radim Krčmář: > >> 2018-05-29 14:53+0800, Wanpeng Li: > >>> From: Wanpeng Li > >>> > >>> 'Commit d0659d946be0 ("

[PATCH] KVM: VMX: Optimize tscdeadline timer latency

2018-05-29 Thread Wanpeng Li
From: Wanpeng Li 'Commit d0659d946be0 ("KVM: x86: add option to advance tscdeadline hrtimer expiration")' advances the tscdeadline (the timer is emulated by hrtimer) expiration in order that the latency which is incurred by hypervisor (apic_timer_fn -> vmentry) can be avoide

Re: [PATCH v2 0/3] Provides userspace with per-VM capability to improve latency

2018-05-23 Thread Wanpeng Li
2018-03-12 19:53 GMT+08:00 Wanpeng Li <kernel...@gmail.com>: > Provides userspace with per-VM capability(KVM_CAP_X86_DISABLE_EXITS) to > not intercept MWAIT/HLT/PAUSE in order that to improve latency in some > workloads. When running cyclictest in the guest w/ vCPU pin on host and

Re: [PATCH v2 0/3] Provides userspace with per-VM capability to improve latency

2018-05-23 Thread Wanpeng Li
2018-03-12 19:53 GMT+08:00 Wanpeng Li : > Provides userspace with per-VM capability(KVM_CAP_X86_DISABLE_EXITS) to > not intercept MWAIT/HLT/PAUSE in order that to improve latency in some > workloads. When running cyclictest in the guest w/ vCPU pin on host and cyclictest pin in guest

Re: [PATCH v4 0/8] KVM: x86: hyperv: PV TLB flush for Windows guests

2018-05-18 Thread Wanpeng Li
2018-05-18 20:42 GMT+08:00 Vitaly Kuznetsov <vkuzn...@redhat.com>: > Wanpeng Li <kernel...@gmail.com> writes: > >> 2018-05-18 19:19 GMT+08:00 Vitaly Kuznetsov <vkuzn...@redhat.com>: >>> Vitaly Kuznetsov <vkuzn...@redhat.com> writes: >>> >>

Re: [PATCH v4 0/8] KVM: x86: hyperv: PV TLB flush for Windows guests

2018-05-18 Thread Wanpeng Li
2018-05-18 20:42 GMT+08:00 Vitaly Kuznetsov : > Wanpeng Li writes: > >> 2018-05-18 19:19 GMT+08:00 Vitaly Kuznetsov : >>> Vitaly Kuznetsov writes: >>> >>>> Wanpeng Li writes: >>>> >>>>> Hi Vitaly, >>>>>

Re: [PATCH v4 0/8] KVM: x86: hyperv: PV TLB flush for Windows guests

2018-05-18 Thread Wanpeng Li
2018-05-18 19:19 GMT+08:00 Vitaly Kuznetsov <vkuzn...@redhat.com>: > Vitaly Kuznetsov <vkuzn...@redhat.com> writes: > >> Wanpeng Li <kernel...@gmail.com> writes: >> >>> Hi Vitaly, >>> 2018-05-16 23:21 GMT+08:00 Vitaly Kuznetsov <

Re: [PATCH v4 0/8] KVM: x86: hyperv: PV TLB flush for Windows guests

2018-05-18 Thread Wanpeng Li
2018-05-18 19:19 GMT+08:00 Vitaly Kuznetsov : > Vitaly Kuznetsov writes: > >> Wanpeng Li writes: >> >>> Hi Vitaly, >>> 2018-05-16 23:21 GMT+08:00 Vitaly Kuznetsov : >>>> Changes since v3 [Radim Krcmar]: >>>> - PATCH2 fixing 'HV_

Re: [PATCH v4 0/8] KVM: x86: hyperv: PV TLB flush for Windows guests

2018-05-18 Thread Wanpeng Li
68ed/359b18ab-05bb-460d-9b53-89505bca68ed_vda_1.qcow2 -net nic,model=virtio -net user -monitor stdio -usb -usbdevice tablet --enable-kvm --cpu host -vnc 0.0.0.0:2 Regards, Wanpeng Li

Re: [PATCH v4 0/8] KVM: x86: hyperv: PV TLB flush for Windows guests

2018-05-18 Thread Wanpeng Li
-9b53-89505bca68ed_vda_1.qcow2 -net nic,model=virtio -net user -monitor stdio -usb -usbdevice tablet --enable-kvm --cpu host -vnc 0.0.0.0:2 Regards, Wanpeng Li

Re: [PATCH 2/3] KVM: X86: Allow userspace to disable ioport intercept

2018-05-15 Thread Wanpeng Li
2018-05-16 5:56 GMT+08:00 Jim Mattson <jmatt...@google.com>: > I'd prefer to let the kvm module have the final say as to whether or > not to allow userspace to do this. I had an IRC discussion with Paolo; it seems that he didn't like the original idea of the patchset. Regards

Re: [PATCH 2/3] KVM: X86: Allow userspace to disable ioport intercept

2018-05-15 Thread Wanpeng Li
2018-05-16 5:56 GMT+08:00 Jim Mattson : > I'd prefer to let the kvm module have the final say as to whether or > not to allow userspace to do this. I had an IRC discussion with Paolo; it seems that he didn't like the original idea of the patchset. Regards, Wanpeng Li > > On Fri,

Re: [PATCH v3] KVM: X86: Lower the default timer frequency limit to 200us

2018-05-14 Thread Wanpeng Li
ping, 2018-05-05 19:02 GMT+08:00 Wanpeng Li <kernel...@gmail.com>: > From: Wanpeng Li <wanpen...@tencent.com> > > Anthoine reported: > The period used by Windows change over time but it can be 1 > milliseconds or less. I saw the limit_periodic_timer_frequency >

Re: [PATCH v3] KVM: X86: Lower the default timer frequency limit to 200us

2018-05-14 Thread Wanpeng Li
ping, 2018-05-05 19:02 GMT+08:00 Wanpeng Li : > From: Wanpeng Li > > Anthoine reported: > The period used by Windows change over time but it can be 1 > milliseconds or less. I saw the limit_periodic_timer_frequency > print so 500 microseconds is sometimes reached. > >

[PATCH v2] KVM: X86: Fix CR3 reserve bits

2018-05-13 Thread Wanpeng Li
From: Wanpeng Li <wanpen...@tencent.com> MSB of CR3 is a reserved bit if the PCIDE bit is not set in CR4. It should be checked when PCIDE bit is not set, however commit 'd1cd3ce900441 ("KVM: MMU: check guest CR3 reserved bits based on its physical address width")' removes the

[PATCH v2] KVM: X86: Fix CR3 reserve bits

2018-05-13 Thread Wanpeng Li
From: Wanpeng Li MSB of CR3 is a reserved bit if the PCIDE bit is not set in CR4. It should be checked when PCIDE bit is not set, however commit 'd1cd3ce900441 ("KVM: MMU: check guest CR3 reserved bits based on its physical address width")' removes the bit 63 checking uncon
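The rule stated in this patch description (bit 63 of CR3 is reserved unless CR4.PCIDE is set) can be written as a small standalone check. This is an illustrative sketch, not KVM's code: the function name and the `maxphyaddr` parameter are hypothetical, and the sketch assumes 64-bit mode with a physical address width between 32 and 52 bits.

```c
#include <stdbool.h>
#include <stdint.h>

#define X86_CR4_PCIDE  (UINT64_C(1) << 17)  /* CR4.PCIDE */
#define CR3_BIT_63     (UINT64_C(1) << 63)  /* only meaningful when PCIDE=1 */

/*
 * maxphyaddr: guest physical address width in bits (assumed 32..52).
 * Returns true if the value has no reserved bits set for a MOV to CR3.
 */
static bool cr3_value_ok(uint64_t cr3, uint64_t cr4, unsigned int maxphyaddr)
{
    /* Bits from maxphyaddr up to bit 62 are always reserved. */
    uint64_t rsvd = ~((UINT64_C(1) << maxphyaddr) - 1) & ~CR3_BIT_63;

    /* Without CR4.PCIDE, bit 63 is reserved as well. */
    if (!(cr4 & X86_CR4_PCIDE))
        rsvd |= CR3_BIT_63;

    return (cr3 & rsvd) == 0;
}
```

Keeping the bit-63 decision separate from the address-width mask makes the PCIDE dependency explicit, which is the distinction the patch description says was lost.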

Re: [PATCH 1/2] KVM: X86: Fix CR3 reserve bits

2018-05-13 Thread Wanpeng Li
>> 2018-05-13 15:53 GMT+08:00 Liran Alon <liran.a...@oracle.com>: >> >> > >> >> > - kernel...@gmail.com wrote: >> >> > >> >> >> From: Wanpeng Li <wanpen...@tencent.com> >> >> >> >> >> >> MSB of CR3

Re: [PATCH 1/2] KVM: X86: Fix CR3 reserve bits

2018-05-13 Thread Wanpeng Li
2018-05-13 17:09 GMT+08:00 Liran Alon : > > - kernel...@gmail.com wrote: > >> 2018-05-13 16:28 GMT+08:00 Liran Alon : >> > >> > - kernel...@gmail.com wrote: >> > >> >> 2018-05-13 15:53 GMT+08:00 Liran Alon : >> >> > >>

Re: [PATCH 1/2] KVM: X86: Fix CR3 reserve bits

2018-05-13 Thread Wanpeng Li
2018-05-13 16:28 GMT+08:00 Liran Alon <liran.a...@oracle.com>: > > - kernel...@gmail.com wrote: > >> 2018-05-13 15:53 GMT+08:00 Liran Alon <liran.a...@oracle.com>: >> > >> > - kernel...@gmail.com wrote: >> > >> >>

Re: [PATCH 1/2] KVM: X86: Fix CR3 reserve bits

2018-05-13 Thread Wanpeng Li
2018-05-13 16:28 GMT+08:00 Liran Alon : > > - kernel...@gmail.com wrote: > >> 2018-05-13 15:53 GMT+08:00 Liran Alon : >> > >> > - kernel...@gmail.com wrote: >> > >> >> From: Wanpeng Li >> >> >> >> MSB of CR3 is

Re: [PATCH 2/2] KVM: X86: Fix loss of CR3_PCID_INVD bit when guest writes CR3

2018-05-13 Thread Wanpeng Li
2018-05-13 16:03 GMT+08:00 Liran Alon <liran.a...@oracle.com>: > > - kernel...@gmail.com wrote: > >> From: Wanpeng Li <wanpen...@tencent.com> >> >> SDM volume 3, section 4.10.4: >> >> * MOV to CR3. The behavior of the instruction depends

Re: [PATCH 2/2] KVM: X86: Fix loss of CR3_PCID_INVD bit when guest writes CR3

2018-05-13 Thread Wanpeng Li
2018-05-13 16:03 GMT+08:00 Liran Alon : > > - kernel...@gmail.com wrote: > >> From: Wanpeng Li >> >> SDM volume 3, section 4.10.4: >> >> * MOV to CR3. The behavior of the instruction depends on the value of >> CR4.PCIDE: >> — If CR4.PCIDE

Re: [PATCH 1/2] KVM: X86: Fix CR3 reserve bits

2018-05-13 Thread Wanpeng Li
2018-05-13 15:53 GMT+08:00 Liran Alon <liran.a...@oracle.com>: > > - kernel...@gmail.com wrote: > >> From: Wanpeng Li <wanpen...@tencent.com> >> >> MSB of CR3 is a reserved bit if the PCIDE bit is not set in CR4. >> It should be chec

Re: [PATCH 1/2] KVM: X86: Fix CR3 reserve bits

2018-05-13 Thread Wanpeng Li
2018-05-13 15:53 GMT+08:00 Liran Alon : > > - kernel...@gmail.com wrote: > >> From: Wanpeng Li >> >> MSB of CR3 is a reserved bit if the PCIDE bit is not set in CR4. >> It should be checked when PCIDE bit is not set, however commit >> 'd1cd3ce900441 ("

[PATCH 2/2] KVM: X86: Fix loss of CR3_PCID_INVD bit when guest writes CR3

2018-05-12 Thread Wanpeng Li
From: Wanpeng Li <wanpen...@tencent.com> SDM volume 3, section 4.10.4: * MOV to CR3. The behavior of the instruction depends on the value of CR4.PCIDE: — If CR4.PCIDE = 1 and bit 63 of the instruction’s source operand is 1, the instruction is not required to invalidate any TLB e

[PATCH 2/2] KVM: X86: Fix loss of CR3_PCID_INVD bit when guest writes CR3

2018-05-12 Thread Wanpeng Li
From: Wanpeng Li SDM volume 3, section 4.10.4: * MOV to CR3. The behavior of the instruction depends on the value of CR4.PCIDE: — If CR4.PCIDE = 1 and bit 63 of the instruction’s source operand is 1, the instruction is not required to invalidate any TLB entries or entries in paging

[PATCH 1/2] KVM: X86: Fix CR3 reserve bits

2018-05-12 Thread Wanpeng Li
From: Wanpeng Li <wanpen...@tencent.com> MSB of CR3 is a reserved bit if the PCIDE bit is not set in CR4. It should be checked when PCIDE bit is not set, however commit 'd1cd3ce900441 ("KVM: MMU: check guest CR3 reserved bits based on its physical address width")' removes the

[PATCH 1/2] KVM: X86: Fix CR3 reserve bits

2018-05-12 Thread Wanpeng Li
From: Wanpeng Li MSB of CR3 is a reserved bit if the PCIDE bit is not set in CR4. It should be checked when PCIDE bit is not set, however commit 'd1cd3ce900441 ("KVM: MMU: check guest CR3 reserved bits based on its physical address width")' removes the bit 63 checking uncon

Re: [PATCH 0/3] KVM: VMX: Allow to disable ioport intercept per-VM by userspace

2018-05-11 Thread Wanpeng Li
2018-05-11 23:40 GMT+08:00 Konrad Rzeszutek Wilk <konrad.w...@oracle.com>: > On Mon, Apr 16, 2018 at 10:45:59PM -0700, Wanpeng Li wrote: >> Tim Shearer reported that "There is a guest which is running a packet >> forwarding app based on the DPDK (dpdk.org). The packet

Re: [PATCH 0/3] KVM: VMX: Allow to disable ioport intercept per-VM by userspace

2018-05-11 Thread Wanpeng Li
2018-05-11 23:40 GMT+08:00 Konrad Rzeszutek Wilk : > On Mon, Apr 16, 2018 at 10:45:59PM -0700, Wanpeng Li wrote: >> Tim Shearer reported that "There is a guest which is running a packet >> forwarding app based on the DPDK (dpdk.org). The packet receive routine >> writ

Re: [PATCH 0/3] KVM: VMX: Allow to disable ioport intercept per-VM by userspace

2018-05-08 Thread Wanpeng Li
2018-04-17 13:45 GMT+08:00 Wanpeng Li <kernel...@gmail.com>: > Tim Shearer reported that "There is a guest which is running a packet > forwarding app based on the DPDK (dpdk.org). The packet receive routine > writes to 0xc070 using glibc's "outw_p" function which doe

Re: [PATCH 0/3] KVM: VMX: Allow to disable ioport intercept per-VM by userspace

2018-05-08 Thread Wanpeng Li
2018-04-17 13:45 GMT+08:00 Wanpeng Li : > Tim Shearer reported that "There is a guest which is running a packet > forwarding app based on the DPDK (dpdk.org). The packet receive routine > writes to 0xc070 using glibc's "outw_p" function which does an additional > writ

Re: [PATCH v2] KVM: X86: Limit timer frequency to 200ms

2018-05-05 Thread Wanpeng Li
2018-05-05 16:36 GMT+08:00 Jan Kiszka <jan.kis...@web.de>: > On 2018-05-05 01:44, Wanpeng Li wrote: >> ping, >> 2018-05-01 7:35 GMT+08:00 Wanpeng Li <kernel...@gmail.com>: >>> From: Wanpeng Li <wanpen...@tencent.com> >>> >>> Anthoin

Re: [PATCH v2] KVM: X86: Limit timer frequency to 200ms

2018-05-05 Thread Wanpeng Li
2018-05-05 16:36 GMT+08:00 Jan Kiszka : > On 2018-05-05 01:44, Wanpeng Li wrote: >> ping, >> 2018-05-01 7:35 GMT+08:00 Wanpeng Li : >>> From: Wanpeng Li >>> >>> Anthoine reported: >>> The period used by Windows change over ti

[PATCH v3] KVM: X86: Lower the default timer frequency limit to 200us

2018-05-05 Thread Wanpeng Li
From: Wanpeng Li <wanpen...@tencent.com> Anthoine reported: The period used by Windows change over time but it can be 1 milliseconds or less. I saw the limit_periodic_timer_frequency print so 500 microseconds is sometimes reached. As suggested by Paolo, lower the default timer fre

[PATCH v3] KVM: X86: Lower the default timer frequency limit to 200us

2018-05-05 Thread Wanpeng Li
From: Wanpeng Li Anthoine reported: The period used by Windows change over time but it can be 1 milliseconds or less. I saw the limit_periodic_timer_frequency print so 500 microseconds is sometimes reached. As suggested by Paolo, lower the default timer frequency limit to a smaller
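The effect of a minimum timer period, as discussed in this patch, can be sketched as a simple clamp. The names below are hypothetical and the 200 microsecond default is taken only from the subject line; the real limit lives in KVM and may be configurable.

```c
#include <stdint.h>

/* Assumed default from the subject line: 200 microseconds. */
#define MIN_TIMER_PERIOD_US 200

/* Return the period that would actually be programmed, in nanoseconds. */
static int64_t clamp_timer_period_ns(int64_t requested_ns)
{
    const int64_t min_ns = MIN_TIMER_PERIOD_US * INT64_C(1000);

    /* A zero period means the timer is not periodic; leave it alone. */
    if (requested_ns > 0 && requested_ns < min_ns)
        return min_ns;
    return requested_ns;
}
```

Under this sketch a guest asking for a 500 microsecond period, the case Anthoine reports for Windows, is left untouched, while anything shorter than 200 microseconds is stretched to the minimum.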

Re: [PATCH v2] KVM: X86: Limit timer frequency to 200ms

2018-05-04 Thread Wanpeng Li
ping, 2018-05-01 7:35 GMT+08:00 Wanpeng Li <kernel...@gmail.com>: > From: Wanpeng Li <wanpen...@tencent.com> > > Anthoine reported: > The period used by Windows change over time but it can be 1 milliseconds > or less. I saw the limit_periodic_timer_frequency p

Re: [PATCH v2] KVM: X86: Limit timer frequency to 200ms

2018-05-04 Thread Wanpeng Li
ping, 2018-05-01 7:35 GMT+08:00 Wanpeng Li : > From: Wanpeng Li > > Anthoine reported: > The period used by Windows change over time but it can be 1 milliseconds > or less. I saw the limit_periodic_timer_frequency print so 500 > microseconds is sometimes reached. > >

Re: [PATCH v3] KVM: Extend MAX_IRQ_ROUTES to 4096 for all archs

2018-05-04 Thread Wanpeng Li
ping, 2018-04-27 8:55 GMT+08:00 Wanpeng Li <kernel...@gmail.com>: > From: Wanpeng Li <wanpen...@tencent.com> > > Our virtual machines make use of device assignment by configuring > 12 NVMe disks for high I/O performance. Each NVMe device has 129 > MSI-X Table entries:

Re: [PATCH v3] KVM: Extend MAX_IRQ_ROUTES to 4096 for all archs

2018-05-04 Thread Wanpeng Li
ping, 2018-04-27 8:55 GMT+08:00 Wanpeng Li : > From: Wanpeng Li > > Our virtual machines make use of device assignment by configuring > 12 NVMe disks for high I/O performance. Each NVMe device has 129 > MSI-X Table entries: > Capabilities: [50] MSI-X: Enable+ Count=129 Masked-V

Re: [PATCH] KVM: X86: Limit timer frequency with more smaller interval

2018-05-04 Thread Wanpeng Li
ping, 2018-04-30 10:38 GMT+08:00 Wanpeng Li <kernel...@gmail.com>: > From: Wanpeng Li <wanpen...@tencent.com> > > Anthoine reported: > The period used by Windows change over time but it can be 1 milliseconds > or less. I saw the limit_periodic_timer_frequency p

Re: [PATCH] KVM: X86: Limit timer frequency with more smaller interval

2018-05-04 Thread Wanpeng Li
ping, 2018-04-30 10:38 GMT+08:00 Wanpeng Li : > From: Wanpeng Li > > Anthoine reported: > The period used by Windows change over time but it can be 1 milliseconds > or less. I saw the limit_periodic_timer_frequency print so 500 > microseconds is sometimes reached. > >

Re: [PATCH v6 2/5] KVM: x86: Add IBPB support

2018-05-03 Thread Wanpeng Li
2018-05-03 17:19 GMT+08:00 Paolo Bonzini <pbonz...@redhat.com>: > On 03/05/2018 03:27, Wanpeng Li wrote: >> So for 1) guest->guest attacks 2) guest/ring3->host/ring3 attacks 3) >> guest/ring0->host/ring0 attacks, if IBPB is enough to protect these >> three s

Re: [PATCH v6 2/5] KVM: x86: Add IBPB support

2018-05-03 Thread Wanpeng Li
2018-05-03 17:19 GMT+08:00 Paolo Bonzini : > On 03/05/2018 03:27, Wanpeng Li wrote: >> So for 1) guest->guest attacks 2) guest/ring3->host/ring3 attacks 3) >> guest/ring0->host/ring0 attacks, if IBPB is enough to protect these >> three scenarios and retpoline is

Re: [PATCH v6 2/5] KVM: x86: Add IBPB support

2018-05-02 Thread Wanpeng Li
host/ring0 attacks. > When host kernel is using retpoline it is safe against these attacks. > If host kernel isn't using retpoline we might need to do a IBPB flush on > every VMEXIT. > So for 1) guest->guest attacks 2) guest/ring3->host/ring3 attacks 3) guest/ring0->host/ri

Re: [PATCH v6 2/5] KVM: x86: Add IBPB support

2018-05-02 Thread Wanpeng Li
tpoline it is safe against these attacks. > If host kernel isn't using retpoline we might need to do a IBPB flush on > every VMEXIT. > So for 1) guest->guest attacks 2) guest/ring3->host/ring3 attacks 3) guest/ring0->host/ring0 attacks, if IBPB is enough to protect these

[PATCH v2] KVM: X86: Limit timer frequency to 200ms

2018-04-30 Thread Wanpeng Li
From: Wanpeng Li <wanpen...@tencent.com> Anthoine reported: The period used by Windows change over time but it can be 1 milliseconds or less. I saw the limit_periodic_timer_frequency print so 500 microseconds is sometimes reached. As suggested by Paolo, lower the timer frequency

[PATCH v2] KVM: X86: Limit timer frequency to 200ms

2018-04-30 Thread Wanpeng Li
From: Wanpeng Li Anthoine reported: The period used by Windows change over time but it can be 1 milliseconds or less. I saw the limit_periodic_timer_frequency print so 500 microseconds is sometimes reached. As suggested by Paolo, lower the timer frequency limit to a smaller interval of 200
